Which is fastest? SELECT SQL_CALC_FOUND_ROWS FROM `table`, or SELECT COUNT(*)

MysqlOptimization

Mysql Problem Overview


When you limit the number of rows to be returned by a SQL query, usually used in paging, there are two methods to determine the total number of records:

###Method 1 Include the SQL_CALC_FOUND_ROWS option in the original SELECT, and then get the total number of rows by running SELECT FOUND_ROWS():

SELECT SQL_CALC_FOUND_ROWS * FROM table WHERE id > 100 LIMIT 10;
SELECT FOUND_ROWS();  

###Method 2 Run the query normally, and then get the total number of rows by running SELECT COUNT(*)

SELECT * FROM table WHERE id > 100 LIMIT 10;
SELECT COUNT(*) FROM table WHERE id > 100;  

Which method is the best / fastest?

Mysql Solutions


Solution 1 - Mysql

It depends. See the MySQL Performance Blog post on this subject: To SQL_CALC_FOUND_ROWS or not to SQL_CALC_FOUND_ROWS?

Just a quick summary: Peter says that it depends on your indexes and other factors. Many of the comments to the post seem to say that SQL_CALC_FOUND_ROWS is almost always slower - sometimes up to 10x slower - than running two queries.

Solution 2 - Mysql

MySQL has started deprecating SQL_CALC_FOUND_ROWS functionality with version 8.0.17 onwards.

So, it is always preferred to consider executing your query with LIMIT, and then a second query with COUNT(*) and without LIMIT to determine whether there are additional rows.

From docs:

> The SQL_CALC_FOUND_ROWS query modifier and accompanying FOUND_ROWS() > function are deprecated as of MySQL 8.0.17 and will be removed in a > future MySQL version. > > COUNT() is subject to certain optimizations. SQL_CALC_FOUND_ROWS > causes some optimizations to be disabled. > > Use these queries instead: > > SELECT * FROM tbl_name WHERE id > 100 LIMIT 10; > SELECT COUNT() WHERE id > 100;


Also, SQL_CALC_FOUND_ROWS has been observed to having more issues generally, as explained in the MySQL WL# 12615 :

> SQL_CALC_FOUND_ROWS has a number of problems. First of all, it's slow. > Frequently, it would be cheaper to run the query with LIMIT and then a > separate SELECT COUNT() for the same query, since COUNT() can make > use of optimizations that can't be done when searching for the entire > result set (e.g. filesort can be skipped for COUNT(*), whereas with > CALC_FOUND_ROWS, we must disable some filesort optimizations to > guarantee the right result) > > More importantly, it has very unclear semantics in a number of > situations. In particular, when a query has multiple query blocks > (e.g. with UNION), there's simply no way to calculate the number of > “would-have-been” rows at the same time as producing a valid query. As > the iterator executor is progressing towards these kinds of queries, > it is genuinely difficult to try to retain the same semantics. > Furthermore, if there are multiple LIMITs in the query (e.g. for > derived tables), it's not necessarily clear to which of them > SQL_CALC_FOUND_ROWS should refer to. Thus, such nontrivial queries > will necessarily get different semantics in the iterator executor > compared to what they had before. > > Finally, most of the use cases where SQL_CALC_FOUND_ROWS would seem > useful should simply be solved by other mechanisms than LIMIT/OFFSET. > E.g., a phone book should be paginated by letter (both in terms of UX > and in terms of index use), not by record number. Discussions are > increasingly infinite-scroll ordered by date (again allowing index > use), not by paginated by post number. And so on.

Solution 3 - Mysql

When choosing the "best" approach, a more important consideration than speed might be the maintainability and correctness of your code. If so, SQL_CALC_FOUND_ROWS is preferable because you only need to maintain a single query. Using a single query completely precludes the possibility of a subtle difference between the main and count queries, which may lead to an inaccurate COUNT.

Solution 4 - Mysql

According to the following article: https://www.percona.com/blog/2007/08/28/to-sql_calc_found_rows-or-not-to-sql_calc_found_rows/

If you have an INDEX on your where clause (if id is indexed in your case), then it is better not to use SQL_CALC_FOUND_ROWS and use 2 queries instead, but if you don't have an index on what you put in your where clause (id in your case) then using SQL_CALC_FOUND_ROWS is more efficient.

Solution 5 - Mysql

IMHO, the reason why 2 queries

SELECT * FROM count_test WHERE b = 666 ORDER BY c LIMIT 5;
SELECT count(*) FROM count_test WHERE b = 666;

are faster than using SQL_CALC_FOUND_ROWS

SELECT SQL_CALC_FOUND_ROWS * FROM count_test WHERE b = 555 ORDER BY c LIMIT 5;

has to be seen as a particular case.

It in facts depends on the selectivity of the WHERE clause compared to the selectivity of the implicit one equivalent to the ORDER + LIMIT.

As Arvids told in comment (http://www.mysqlperformanceblog.com/2007/08/28/to-sql_calc_found_rows-or-not-to-sql_calc_found_rows/#comment-1174394), the fact that the EXPLAIN use, or not, a temporay table, should be a good base for knowing if SCFR will be faster or not.

But, as I added (http://www.mysqlperformanceblog.com/2007/08/28/to-sql_calc_found_rows-or-not-to-sql_calc_found_rows/#comment-8166482), the result really, really depends on the case. For a particular paginator, you could get to the conclusion that “for the 3 first pages, use 2 queries; for the following pages, use a SCFR” !

Solution 6 - Mysql

Removing some unnecessary SQL and then COUNT(*) will be faster than SQL_CALC_FOUND_ROWS. Example:

SELECT Person.Id, Person.Name, Job.Description, Card.Number
FROM Person
JOIN Job ON Job.Id = Person.Job_Id
LEFT JOIN Card ON Card.Person_Id = Person.Id
WHERE Job.Name = 'WEB Developer'
ORDER BY Person.Name

Then count without unnecessary part:

SELECT COUNT(*)
FROM Person
JOIN Job ON Job.Id = Person.Job_Id
WHERE Job.Name = 'WEB Developer'

Solution 7 - Mysql

There are other options for you to benchmark:

1.) A window function will return the actual size directly (tested in MariaDB):

SELECT 
  `mytable`.*,
  COUNT(*) OVER() AS `total_count`
FROM `mytable`
ORDER BY `mycol`
LIMIT 10, 20

2.) Thinking out of the box, most of the time users don't need to know the EXACT size of the table, an approximate is often good enough.

SELECT `TABLE_ROWS` AS `rows_approx`
FROM `INFORMATION_SCHEMA`.`TABLES`
WHERE `TABLE_SCHEMA` = DATABASE()
  AND `TABLE_TYPE` = "BASE TABLE"
  AND `TABLE_NAME` = ?

Solution 8 - Mysql

Simple example on table with 2.000.000 rows and query like this :

select fieldname 
from table_add 
where 
descryption_per like '%marihuana%' 
or addiction_per like '%alkohol%';

it is a full table scan every query - so it take time x 2. I mean "select count(*) from .....

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionJrgnsView Question on Stackoverflow
Solution 1 - MysqlnathanView Answer on Stackoverflow
Solution 2 - MysqlMadhur BhaiyaView Answer on Stackoverflow
Solution 3 - MysqlJeff ClemensView Answer on Stackoverflow
Solution 4 - Mysqlpatapouf_aiView Answer on Stackoverflow
Solution 5 - MysqlPierre-Olivier VaresView Answer on Stackoverflow
Solution 6 - MysqlJessé CatrinckView Answer on Stackoverflow
Solution 7 - MysqlCode4R7View Answer on Stackoverflow
Solution 8 - MysqlJa LocView Answer on Stackoverflow