How to use SQLite in a multi-threaded application?

C++MultithreadingSqlite

C++ Problem Overview


I'm developing an application with SQLite as the database, and am having a little trouble understanding how to go about using it in multiple threads (none of the other Stack Overflow questions really helped me, unfortunately).

My use case: The database has one table, let's call it "A", which has different groups of rows (based on one of their columns). I have the "main thread" of the application which reads the contents from table A. In addition, I decide, once in a while, to update a certain group of rows. To do this, I want to spawn a new thread, delete all the rows of the group, and re-insert them (that's the only way to do it in the context of my app). This might happen to different groups at the same time, so I might have 2+ threads trying to update the database.

I'm using different transactions from each thread, I.E. at the start of every thread's update cycle, I have a begin. In fact, what each thread actually does is call "BEGIN", delete from the database all the rows it needs to "update", and inserts them again with the new values (this is the way it must be done in the context of my application).

Now, I'm trying to understand how I go about implementing this. I've tried reading around (other answers on Stack Overflow, the SQLite site) but I haven't found all the answers. Here are some things I'm wondering about:

  1. Do I need to call "open" and create a new sqlite structure from each thread?
  2. Do I need to add any special code for all of this, or is it enough to spawn different threads, update the rows, and that's fine (since I'm using different transactions)?
  3. I saw something talking about the different lock types there are, and the fact that I might receive "SQLite busy" from calling certain APIs, but honestly I didn't see any reference that completely explained when I need to take all this into account. Do I need to?

If anyone can answer the questions/point me in the direction of a good resource, I'd be very grateful.

UPDATE 1: From all that I've read so far, it seems like you can't have two threads who are going to write to a database file anyway.

See: http://www.sqlite.org/lockingv3.html. In section 3.0: A RESERVED lock means that the process is planning on writing to the database file at some point in the future but that it is currently just reading from the file. Only a single RESERVED lock may be active at one time, though multiple SHARED locks can coexist with a single RESERVED lock.

Does this mean that I may as well only spawn off a single thread to update a group of rows each time? I.e., have some kind of poller thread which decides that I need to update some of the rows, and then creates a new thread to do it, but never more than one at a time? Since it looks like any other thread I create will just get SQLITE_BUSY until the first thread finishes, anyway.

Have I understood things correctly?

BTW, thanks for the answers so far, they've helped a lot.

C++ Solutions


Solution 1 - C++

Some steps when starting out with SQLlite for multithreaded use:

  1. Make sure sqlite is compiled with the multi threaded flag.
  2. You must call open on your sqlite file to create a connection on each thread, don't share connections between threads.
  3. SQLite has a very conservative threading model, when you do a write operation, which includes opening transactions that are about to do an INSERT/UPDATE/DELETE, other threads will be blocked until this operation completes.
  4. If you don't use a transaction, then transactions are implicit, so if you start a INSERT/DELETE/UPDATE, sqlite will try to acquire an exclusive lock, and complete the operation before releasing it.
  5. If you do a BEGIN EXCLUSIVE statement, it will acquire an exclusive lock before doing operations in that transaction. A COMMIT or ROLLBACK will release the lock.
  6. Your sqlite3_step, sqlite3_prepare and some other calls may return SQLITE_BUSY or SQLITE_LOCKED. SQLITE_BUSY usually means that sqlite needs to acquire the lock. The biggest difference between the two return values:
  • SQLITE_LOCKED: if you get this from a sqlite3_step statement, you MUST call sqlite3_reset on the statement handle. You should only get this on the first call to sqlite3_step, so once reset is called you can actually "retry" your sqlite3_step call. On other operations, it's the same as SQLITE_BUSY
  • SQLITE_BUSY : There is no need to call sqlite3_reset, just retry your operation after waiting a bit for the lock to be released.

Solution 2 - C++

Check out this link. The easiest way is to do the locking yourself, and to avoid sharing the connection between threads. Another good resource can be found here, and it concludes with:

  1. Make sure you're compiling SQLite with -DTHREADSAFE=1.

  2. Make sure that each thread opens the database file and keeps its own sqlite structure.

  3. Make sure you handle the likely possibility that one or more threads collide when they access the db file at the same time: handle SQLITE_BUSY appropriately.

  4. Make sure you enclose within transactions the commands that modify the database file, like INSERT, UPDATE, DELETE, and others.

Solution 3 - C++

I realize this is an old thread and the responses are good but I've been looking into this recently and came across an interesting analysis of some different implementations. Mainly it goes over the strengths and weaknesses of connection sharing, message passing, thread-local connections and connection pooling. Take a look at it here: http://dev.yorhel.nl/doc/sqlaccess

Solution 4 - C++

Modern versions of SQLite has thread safety enabled by default. SQLITE_THREADSAFE compilation flag controls whether or not code is included in SQLite to enable it to operate safely in a multithreaded environment. Default value is SQLITE_THREADSAFE=1. It means Serialized mode. In this mode:

> In this mode (which is the default when SQLite is compiled with SQLITE_THREADSAFE=1) the SQLite library will itself serialize access to database connections and prepared statements so that the application is free to use the same database connection or the same prepared statement in different threads at the same time.

Use sqlite3_threadsafe() function to check Sqlite library SQLITE_THREADSAFE compilation flag.

Default library thread safety behavior can be changed via sqlite3_config(). Use SQLITE_OPEN_NOMUTEX and SQLITE_OPEN_FULLMUTEX flags at sqlite3_open_v2() to adjust the threading mode of individual database connections.

Solution 5 - C++

Check this code from the SQLite wiki.

I have done something similar with C and I uploaded the code here.

I hope it's useful.

Solution 6 - C++

Summary

Transactions in SQLite are SERIALIZABLE.

Changes made in one database connection are invisible to all other database connections prior to commit.

A query sees all changes that are completed on the same database connection prior to the start of the query, regardless of whether or not those changes have been committed.

If changes occur on the same database connection after a query starts running but before the query completes, then it is undefined whether or not the query will see those changes.

If changes occur on the same database connection after a query starts running but before the query completes, then the query might return a changed row more than once, or it might return a row that was previously deleted.

For the purposes of the previous four items, two database connections that use the same shared cache and which enable PRAGMA read_uncommitted are considered to be the same database connection, not separate database connections.


In addition to the above information on multi-threaded access, it might be worth taking a look at this page on isolation, as many things have changed since this original question and the introduction of the write-ahead log (WAL).

It seems a hybrid approach of having several connections open to the database provides adequate concurrency guarantees, trading off the expense of opening a new connection with the benefit of allowing multi-threaded write transactions.

Solution 7 - C++

If you use connection pooling, like in Java EE, web application, set the connection pool max. size to 1. Access will be serialized.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionEdan MaorView Question on Stackoverflow
Solution 1 - C++SnazzerView Answer on Stackoverflow
Solution 2 - C++KristianView Answer on Stackoverflow
Solution 3 - C++adechiaroView Answer on Stackoverflow
Solution 4 - C++NikitaView Answer on Stackoverflow
Solution 5 - C++MacarseView Answer on Stackoverflow
Solution 6 - C++Adrian_HView Answer on Stackoverflow
Solution 7 - C++TuaimiAAView Answer on Stackoverflow