Fastest way to delete all the data in a large table

Sql ServerTsql

Sql Server Problem Overview


I had to delete all the rows from a log table that contained about 5 million rows. My initial try was to issue the following command in query analyzer:

delete from client_log

which took a very long time.

Sql Server Solutions


Solution 1 - Sql Server

Check out truncate table which is a lot faster.

Solution 2 - Sql Server

I discovered the TRUNCATE TABLE in the msdn transact-SQL reference. For all interested here are the remarks:

TRUNCATE TABLE is functionally identical to DELETE statement with no WHERE clause: both remove all rows in the table. But TRUNCATE TABLE is faster and uses fewer system and transaction log resources than DELETE.

The DELETE statement removes rows one at a time and records an entry in the transaction log for each deleted row. TRUNCATE TABLE removes the data by deallocating the data pages used to store the table's data, and only the page deallocations are recorded in the transaction log.

TRUNCATE TABLE removes all rows from a table, but the table structure and its columns, constraints, indexes and so on remain. The counter used by an identity for new rows is reset to the seed for the column. If you want to retain the identity counter, use DELETE instead. If you want to remove table definition and its data, use the DROP TABLE statement.

You cannot use TRUNCATE TABLE on a table referenced by a FOREIGN KEY constraint; instead, use DELETE statement without a WHERE clause. Because TRUNCATE TABLE is not logged, it cannot activate a trigger.

TRUNCATE TABLE may not be used on tables participating in an indexed view.

Solution 3 - Sql Server

There is a common myth that TRUNCATE somehow skips transaction log.

This is misunderstanding, and is clearly mentioned in MSDN.

This myth is invoked in several comments here. Let's eradicate it together ;)

Solution 4 - Sql Server

For reference TRUNCATE TABLE also works on MySQL

Solution 5 - Sql Server

I use the following method to zero out tables, with the added bonus that it leaves me with an archive copy of the table.

CREATE TABLE `new_table` LIKE `table`;
RENAME TABLE `table` TO `old_table`, `new_table` TO `table`;

Solution 6 - Sql Server

forget truncate and delete. maintain your table definitions (in case you want to recreate it) and just use drop table.

Solution 7 - Sql Server

truncate table client_log

is your best bet, truncate kills all content in the table and indices and resets any seeds you've got too.

Solution 8 - Sql Server

On SQL Server you can use the Truncate Table command which is faster than a regular delete and also uses less resources. It will reset any identity fields back to the seed value as well.

The drawbacks of truncate are that it can't be used on tables that are referenced by foreign keys and it won't fire any triggers. Also you won't be able to rollback the data if anything goes wrong.

Solution 9 - Sql Server

truncate table is not SQL-platform independent. If you suspect that you might ever change database providers, you might be wary of using it.

Solution 10 - Sql Server

Note that TRUNCATE will also reset any auto incrementing keys, if you are using those.

If you do not wish to lose your auto incrementing keys, you can speed up the delete by deleting in sets (e.g., DELETE FROM table WHERE id > 1 AND id < 10000). It will speed it up significantly and in some cases prevent data from being locked up.

Solution 11 - Sql Server

Yes, well, deleting 5 million rows is probably going to take a long time. The only potentially faster way I can think of would be to drop the table, and re-create it. That only works, of course, if you want to delete ALL data in the table.

Solution 12 - Sql Server

The suggestion of "Drop and recreate the table" is probably not a good one because that goofs up your foreign keys.

You ARE using foreign keys, right?

Solution 13 - Sql Server

I am revising my earlier statement:

> You should understand that by using > TRUNCATE the data will be cleared but > nothing will be logged to the > transaction log. Writing to the log > is why DELETE will take forever on 5 > million rows. I use TRUNCATE often > during development, but you should be > wary about using it on a production > database because you will not be able > to roll back your changes. You should > immediately make a full database > backup after doing a TRUNCATE to > establish a new basis for restoration.

The above statement was intended to prompt you to be sure that you understand there is difference between the two. Unfortunately, it is poorly written and makes unsupported statements as I have not actually done any testing myself between the two. It is based on statements that I have heard from others.

From MSDN:

> The DELETE statement removes rows one > at a time and records an entry in the > transaction log for each deleted row. > TRUNCATE TABLE removes the data by > deallocating the data pages used to > store the table's data, and only the > page deallocations are recorded in the > transaction log.

I just wanted to say that there is a fundamental difference between the two and because there is a difference, there will be applications where one or the other may be inappropriate.

Solution 14 - Sql Server

If you cannot use TRUNCATE TABLE because of foreign keys and/or triggers, you can consider to:

  • drop all indexes;
  • do the usual DELETE;
  • re-create all indexes.

This may speed up DELETE somewhat.

Solution 15 - Sql Server

DELETE * FROM table_name;

Premature optimization may be dangerous. Optimizing may mean doing something weird, but if it works you may want to take advantage of it.

SELECT DbVendor_SuperFastDeleteAllFunction(tablename, BOZO_BIT) FROM dummy;

For speed I think it depends on...

  • The underlying database: Oracle, Microsoft, MySQL, PostgreSQL, others, custom...

  • The table, it's content, and related tables:

There may be deletion rules. Is there an existing procedure to delete all content in the table? Can this be optimized for the specific underlying database engine? How much do we care about breaking things / related data? Performing a DELETE may be the 'safest' way assuming that other related tables do not depend on this table. Are there other tables and queries that are related / depend on the data within this table? If we don't care much about this table being around, using DROP might be a fast method, again depending on the underlying database.

DROP TABLE table_name;

How many rows are being deleted? Is there other information that is quickly gleaned that will optimize the deletion? For example, can we tell if the table is already empty? Can we tell if there are hundreds, thousands, millions, billions of rows?

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionRon SkufcaView Question on Stackoverflow
Solution 1 - Sql ServerRob WalkerView Answer on Stackoverflow
Solution 2 - Sql ServerRon SkufcaView Answer on Stackoverflow
Solution 3 - Sql ServersquadetteView Answer on Stackoverflow
Solution 4 - Sql ServerUnkwnTechView Answer on Stackoverflow
Solution 5 - Sql Serverdar7ylView Answer on Stackoverflow
Solution 6 - Sql ServersicularsView Answer on Stackoverflow
Solution 7 - Sql ServerRonald HobbsView Answer on Stackoverflow
Solution 8 - Sql ServerMartynnwView Answer on Stackoverflow
Solution 9 - Sql ServerJames A. RosenView Answer on Stackoverflow
Solution 10 - Sql ServerBrian D.View Answer on Stackoverflow
Solution 11 - Sql ServerTheSmurfView Answer on Stackoverflow
Solution 12 - Sql ServerAndy LesterView Answer on Stackoverflow
Solution 13 - Sql ServerAndy FriedersView Answer on Stackoverflow
Solution 14 - Sql ServersquadetteView Answer on Stackoverflow
Solution 15 - Sql ServerMark StockView Answer on Stackoverflow