Why does MySQL allow "group by" queries WITHOUT aggregate functions?

MySQL, SQL, Standards Compliance, ANSI SQL

Mysql Problem Overview


Surprise -- this is a perfectly valid query in MySQL:

select X, Y from someTable group by X

If you tried this query in Oracle or SQL Server, you’d get an error. SQL Server, for example, reports:

Column 'Y' is invalid in the select list because it is not contained in 
either an aggregate function or the GROUP BY clause.

So how does MySQL determine which Y to show for each X? It just picks one. From what I can tell, it just picks the first Y it finds. The rationale being, if Y is neither an aggregate function nor in the group by clause, then specifying “select Y” in your query makes no sense to begin with. Therefore, I as the database engine will return whatever I want, and you’ll like it.

There’s even a MySQL configuration parameter to turn off this “looseness”. http://dev.mysql.com/doc/refman/5.7/en/sql-mode.html#sqlmode_only_full_group_by
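As a rough illustration of that setting, the statements below sketch how the mode could be switched on for the current session, so that the query from the question is rejected instead of silently picking a Y. someTable, X and Y are the placeholders from the question, and building the new mode list with CONCAT_WS is just one way to do it, not the only one. Note that since MySQL 5.7.5, ONLY_FULL_GROUP_BY is already part of the default sql_mode.

```sql
-- Show the modes currently active for this session.
SELECT @@SESSION.sql_mode;

-- Append ONLY_FULL_GROUP_BY to the modes already set (this session only).
-- NULLIF turns an empty mode list into NULL, which CONCAT_WS then skips.
SET SESSION sql_mode =
  CONCAT_WS(',', NULLIF(@@SESSION.sql_mode, ''), 'ONLY_FULL_GROUP_BY');

-- With the mode on, this statement is rejected with an error.
-- Exception (MySQL 5.7.5 and later): it is still accepted when Y is
-- functionally dependent on X, e.g. when X is the primary key of someTable.
SELECT X, Y FROM someTable GROUP BY X;
```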

This article even mentions how MySQL has been criticized for being ANSI-SQL non-compliant in this regard. http://www.oreillynet.com/databases/blog/2007/05/debunking_group_by_myths.html

My question is: Why was MySQL designed this way? What was their rationale for breaking with ANSI-SQL?

Mysql Solutions


Solution 1 - Mysql

According to this page (the 5.0 online manual), it's for better performance and user convenience.

Solution 2 - Mysql

I believe that it was to handle the case where grouping by one field would imply other fields are also being grouped:

-- Count each user's posts; users without posts get 0 because COUNT skips NULLs
SELECT user.id, user.name, COUNT(post.owner_id) AS posts
FROM user 
  LEFT OUTER JOIN post ON post.owner_id=user.id 
GROUP BY user.id

In this case user.name will always be unique per user.id, so there is convenience in not requiring user.name in the GROUP BY clause (although, as you say, there is definite scope for problems).
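For comparison, here is a sketch of two ways the same result could be written without relying on the loose behaviour. It reuses the user and post tables from the query above and assumes user.id is the primary key of user. COUNT(post.owner_id) is used so that only matched posts are counted. ANY_VALUE() is available from MySQL 5.7 onwards, so the second form is MySQL-specific.

```sql
-- Portable form: name every nonaggregated column in GROUP BY.
SELECT user.id, user.name, COUNT(post.owner_id) AS posts
FROM user
  LEFT OUTER JOIN post ON post.owner_id = user.id
GROUP BY user.id, user.name;

-- MySQL 5.7+ form: ANY_VALUE() says explicitly that any name in the group
-- is acceptable, which is safe here because each id has exactly one name.
SELECT user.id, ANY_VALUE(user.name) AS name, COUNT(post.owner_id) AS posts
FROM user
  LEFT OUTER JOIN post ON post.owner_id = user.id
GROUP BY user.id;
```

From MySQL 5.7.5 on, the original GROUP BY user.id form is also accepted under ONLY_FULL_GROUP_BY when user.id is the primary key, because the server can then tell that user.name is functionally dependent on it.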

Solution 3 - Mysql

Unfortunately almost all the SQL varieties have situations where they break ANSI and have unpredictable results.

It sounds to me like they intended it to be treated like the "FIRST(Y)" function that many other systems have.

More than likely, this construct is something that the MySQL team regret, but don't want to stop supporting because of the number of applications that would break.

Solution 4 - Mysql

MySQL treats this as a single-column DISTINCT when you use GROUP BY without an aggregate function. With the other options you either have the whole result be distinct or have to use subqueries, etc. The question is whether the results are truly predictable.

Also, good info is in this thread.

Solution 5 - Mysql

From what I have read in the mysql reference page, it says: "You can use this feature to get better performance by avoiding unnecessary column sorting and grouping. However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group."

I suggest you read this page of the MySQL reference manual: http://dev.mysql.com/doc/refman/5.5/en//group-by-extensions.html

Solution 6 - Mysql

It's actually a very useful tool that the other fields don't all have to be in an aggregate function when you group by a field. You can control which row is returned by ordering the rows first and grouping them afterwards. For instance, if I wanted a user's login information and the last time that user logged in, I would do this.

Tables

USERS
user_id | name

USER_LOGIN_HISTORY 
user_id | date_logged_in

USER_LOGIN_HISTORY has multiple rows for one user, so joining users to it would return many rows. As I am only interested in the last entry, I would do this:

select 
  user_id,
  name,
  date_logged_in

from(

  select 
    u.user_id, 
    u.name, 
    ulh.date_logged_in

  from users as u

    join user_login_history as ulh
      on u.user_id = ulh.user_id

  where u.user_id = 1234

  order by ulh.date_logged_in desc 

)as table1

group by user_id

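Where this trick works, it returns one row with the name of the user and the last time that user logged in. MySQL does not guarantee which row of a group supplies the nonaggregated columns, though, and the ORDER BY inside the subquery may be ignored, so the result should not be relied on.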
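A more predictable way to get the same answer is to ask for the latest date with an aggregate rather than depend on row order, since the MySQL manual states that the value chosen from each group cannot be influenced by adding an ORDER BY clause. The sketch below assumes the users and user_login_history tables described above; the last_logged_in alias is only illustrative.

```sql
-- Latest login per user, computed with MAX() instead of ORDER BY + GROUP BY.
SELECT
  u.user_id,
  u.name,
  MAX(ulh.date_logged_in) AS last_logged_in
FROM users AS u
  JOIN user_login_history AS ulh
    ON u.user_id = ulh.user_id
WHERE u.user_id = 1234
GROUP BY u.user_id, u.name;
```

Because every selected column is either aggregated or listed in GROUP BY, this form also runs with ONLY_FULL_GROUP_BY enabled and on other database engines.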

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Aaron Fi | View Question on Stackoverflow
Solution 1 - Mysql | Miroslav Genev | View Answer on Stackoverflow
Solution 2 - Mysql | Cebjyre | View Answer on Stackoverflow
Solution 3 - Mysql | Rob Farley | View Answer on Stackoverflow
Solution 4 - Mysql | GL_Stephen | View Answer on Stackoverflow
Solution 5 - Mysql | Giancarlo Nebiolo Navidad | View Answer on Stackoverflow
Solution 6 - Mysql | Nick Dennies | View Answer on Stackoverflow