Why do people hate SQL cursors so much?
SqlDatabase CursorSql Problem Overview
I can understand wanting to avoid having to use a cursor due to the overhead and inconvenience, but it looks like there's some serious cursor-phobia-mania going on where people are going to great lengths to avoid having to use one.
For example, one question asked how to do something obviously trivial with a cursor and the accepted answer proposed using a common table expression (CTE) recursive query with a recursive custom function, even though this limits the number of rows that could be processed to 32 (due to recursive function call limit in sql server). This strikes me as a terrible solution for system longevity, not to mention a tremendous effort just to avoid using a simple cursor.
What is the reason for this level of insane hatred? Has some 'noted authority' issued a fatwa against cursors? Does some unspeakable evil lurk in the heart of cursors that corrupts the morals of children or something?
Wiki question, more interested in the answer than the rep.
Related Info:
https://stackoverflow.com/questions/37029/sql-server-fast-forward-cursors
EDIT: let me be more precise: I understand that cursors should not be used instead of normal relational operations; that is a no-brainer. What I don't understand is people going waaaaay out of their way to avoid cursors like they have cooties or something, even when a cursor is a simpler and/or more efficient solution. It's the irrational hatred that baffles me, not the obvious technical efficiencies.
Sql Solutions
Solution 1 - Sql
The "overhead" with cursors is merely part of the API. Cursors are how parts of the RDBMS work under the hood. Often CREATE TABLE
and INSERT
have SELECT
statements, and the implementation is the obvious internal cursor implementation.
Using higher-level "set-based operators" bundles the cursor results into a single result set, meaning less API back-and-forth.
Cursors predate modern languages that provide first-class collections. Old C, COBOL, Fortran, etc., had to process rows one at a time because there was no notion of "collection" that could be used widely. Java, C#, Python, etc., have first-class list structures to contain result sets.
The Slow Issue
In some circles, the relational joins are a mystery, and folks will write nested cursors rather than a simple join. I've seen truly epic nested loop operations written out as lots and lots of cursors. Defeating an RDBMS optimization. And running really slowly.
Simple SQL rewrites to replace nested cursor loops with joins and a single, flat cursor loop can make programs run in 100th the time. [They thought I was the god of optimization. All I did was replace nested loops with joins. Still used cursors.]
This confusion often leads to an indictment of cursors. However, it isn't the cursor, it's the misuse of the cursor that's the problem.
The Size Issue
For really epic result sets (i.e., dumping a table to a file), cursors are essential. The set-based operations can't materialize really large result sets as a single collection in memory.
Alternatives
I try to use an ORM layer as much as possible. But that has two purposes. First, the cursors are managed by the ORM component. Second, the SQL is separated from the application into a configuration file. It's not that the cursors are bad. It's that coding all those opens, closes and fetches is not value-add programming.
Solution 2 - Sql
Cursors make people overly apply a procedural mindset to a set-based environment.
And they are SLOW!!!
From SQLTeam:
> Please note that cursors are the > SLOWEST way to access data inside SQL > Server. The should only be used when > you truly need to access one row at a > time. The only reason I can think of > for that is to call a stored procedure > on each row. In the Cursor > Performance article I discovered > that cursors are over thirty times > slower than set based alternatives.
Solution 3 - Sql
There's an answer above which says "cursors are the SLOWEST way to access data inside SQL Server... cursors are over thirty times slower than set based alternatives."
This statement may be true under many circumstances, but as a blanket statement it's problematic. For example, I've made good use of cursors in situations where I want to perform an update or delete operation affecting many rows of a large table which is receiving constant production reads. Running a stored procedure which does these updates one row at a time ends up being faster than set-based operations, because the set-based operation conflicts with the read operation and ends up causing horrific locking problems (and may kill the production system entirely, in extreme cases).
In the absence of other database activity, set-based operations are universally faster. In production systems, it depends.
Solution 4 - Sql
Cursors tend to be used by beginning SQL developers in places where set-based operations would be better. Particularly when people learn SQL after learning a traditional programming language, the "iterate over these records" mentality tends to lead people to use cursors inappropriately.
Most serious SQL books include a chapter enjoining the use of cursors; well-written ones make it clear that cursors have their place but shouldn't be used for set-based operations.
There are obviously situations where cursors are the correct choice, or at least A correct choice.
Solution 5 - Sql
The optimizer often cannot use the relational algebra to transform the problem when a cursor method is used. Often a cursor is a great way to solve a problem, but SQL is a declarative language, and there is a lot of information in the database, from constraints, to statistics and indexes which mean that the optimizer has a lot of options to solve the problem, whereas a cursor pretty much explicitly directs the solution.
Solution 6 - Sql
In Oracle PL/SQL cursors will not result in table locks and it is possible to use bulk-collecting/bulk-fetching.
In Oracle 10 the often used implicit cursor
for x in (select ....) loop
--do something
end loop;
fetches implicitly 100 rows at a time. Explicit bulk-collecting/bulk-fetching is also possible.
However PL/SQL cursors are something of a last resort, use them when you are unable to solve a problem with set-based SQL.
Another reason is parallelization, it is easier for the database to parallelize big set-based statements than row-by-row imperative code. It is the same reason why functional programming becomes more and more popular (Haskell, F#, Lisp, C# LINQ, MapReduce ...), functional programming makes parallelization easier. The number CPUs per computer is rising so parallelization becomes more and more an issue.
Solution 7 - Sql
In general, because on a relational database, the performance of code using cursors is an order of magnitude worse than set-based operations.
Solution 8 - Sql
The answers above have not emphasized enough the importance of locking. I'm not a big fan of cursors because they often result in table level locks.
Solution 9 - Sql
For what it's worth I have read that the "one" place a cursor will out perform its set-based counterpart is in a running total. Over a small table the speed of summing up the rows over the order by columns favors the set-based operation but as the table increases in row size the cursor will become faster because it can simply carry the running total value to the next pass of the loop. Now where you should do a running total is a different argument...
Solution 10 - Sql
Outside of the performance (non)issues, I think the biggest failing of cursors is they are painful to debug. Especially compared to code in most client applications where debugging tends to be comparatively easy and language features tend to be much easier. In fact, I contend that nearly anything one is doing in SQL with a cursor should probably be happening in the client app in the first place.
Solution 11 - Sql
Can you post that cursor example or link to the question? There's probably an even better way than a recursive CTE.
In addition to other comments, cursors when used improperly (which is often) cause unnecessary page/row locks.
Solution 12 - Sql
You could have probably concluded your question after the second paragraph, rather than calling people "insane" simply because they have a different viewpoint than you do and otherwise trying to mock professionals who may have a very good reason for feeling the way that they do.
As to your question, while there are certainly situations where a cursor may be called for, in my experience developers decide that a cursor "must" be used FAR more often than is actually the case. The chance of someone erring on the side of too much use of cursors vs. not using them when they should is MUCH higher in my opinion.
Solution 13 - Sql
basicaly 2 blocks of code that do the same thing. maybe it's a bit weird example but it proves the point. SQL Server 2005:
SELECT * INTO #temp FROM master..spt_values
DECLARE @startTime DATETIME
BEGIN TRAN
SELECT @startTime = GETDATE()
UPDATE #temp
SET number = 0
select DATEDIFF(ms, @startTime, GETDATE())
ROLLBACK
BEGIN TRAN
DECLARE @name VARCHAR
DECLARE tempCursor CURSOR
FOR SELECT name FROM #temp
OPEN tempCursor
FETCH NEXT FROM tempCursor
INTO @name
SELECT @startTime = GETDATE()
WHILE @@FETCH_STATUS = 0
BEGIN
UPDATE #temp SET number = 0 WHERE NAME = @name
FETCH NEXT FROM tempCursor
INTO @name
END
select DATEDIFF(ms, @startTime, GETDATE())
CLOSE tempCursor
DEALLOCATE tempCursor
ROLLBACK
DROP TABLE #temp
the single update takes 156 ms while the cursor takes 2016 ms.