JDBC batch insert performance

MysqlPerformanceJdbcBatch File

Mysql Problem Overview


I need to insert a couple hundreds of millions of records into the mysql db. I'm batch inserting it 1 million at a time. Please see my code below. It seems to be slow. Is there any way to optimize it?

try {
        // Disable auto-commit
        connection.setAutoCommit(false);

        // Create a prepared statement
        String sql = "INSERT INTO mytable (xxx), VALUES(?)";
        PreparedStatement pstmt = connection.prepareStatement(sql);

        Object[] vals=set.toArray();
        for (int i=0; i<vals.length; i++) {
            pstmt.setString(1, vals[i].toString());
            pstmt.addBatch();
        }

        // Execute the batch
        int [] updateCounts = pstmt.executeBatch();
        System.out.append("inserted "+updateCounts.length);

Mysql Solutions


Solution 1 - Mysql

I had a similar performance issue with mysql and solved it by setting the useServerPrepStmts and the rewriteBatchedStatements properties in the connection url.

Connection c = DriverManager.getConnection("jdbc:mysql://host:3306/db?useServerPrepStmts=false&rewriteBatchedStatements=true", "username", "password");
    

Solution 2 - Mysql

I'd like to expand on Bertil's answer, as I've been experimenting with the connection URL parameters.

rewriteBatchedStatements=true is the important parameter. useServerPrepStmts is already false by default, and even changing it to true doesn't make much difference in terms of batch insert performance.

Now I think is the time to write how rewriteBatchedStatements=true improves the performance so dramatically. It does so by rewriting of prepared statements for INSERT into multi-value inserts when executeBatch() (Source). That means that instead of sending the following n INSERT statements to the mysql server each time executeBatch() is called :

INSERT INTO X VALUES (A1,B1,C1)
INSERT INTO X VALUES (A2,B2,C2)
...
INSERT INTO X VALUES (An,Bn,Cn)

It would send a single INSERT statement :

INSERT INTO X VALUES (A1,B1,C1),(A2,B2,C2),...,(An,Bn,Cn)

You can observe it by toggling on the mysql logging (by SET global general_log = 1) which would log into a file each statement sent to the mysql server.

Solution 3 - Mysql

You can insert multiple rows with one insert statement, doing a few thousands at a time can greatly speed things up, that is, instead of doing e.g. 3 inserts of the form INSERT INTO tbl_name (a,b,c) VALUES(1,2,3); , you do INSERT INTO tbl_name (a,b,c) VALUES(1,2,3),(1,2,3),(1,2,3); (It might be JDBC .addBatch() does similar optimization now - though the mysql addBatch used to be entierly un-optimized and just issuing individual queries anyhow - I don't know if that's still the case with recent drivers)

If you really need speed, load your data from a comma separated file with LOAD DATA INFILE , we get around 7-8 times speedup doing that vs doing tens of millions of inserts.

Solution 4 - Mysql

If:

  1. It's a new table, or the amount to be inserted is greater then the already inserted data
  2. There are indexes on the table
  3. You do not need other access to the table during the insert

Then ALTER TABLE tbl_name DISABLE KEYS can greatly improve the speed of your inserts. When you're done, run ALTER TABLE tbl_name ENABLE KEYS to start building the indexes, which can take a while, but not nearly as long as doing it for every insert.

Solution 5 - Mysql

You may try using DDBulkLoad object.

// Get a DDBulkLoad object
DDBulkLoad bulkLoad = DDBulkLoadFactory.getInstance(connection);
bulkLoad.setTableName(“mytable”);
bulkLoad.load(“data.csv”);

Solution 6 - Mysql

try {
        // Disable auto-commit
        connection.setAutoCommit(false);
        int maxInsertBatch = 10000;     
        // Create a prepared statement
        String sql = "INSERT INTO mytable (xxx), VALUES(?)";
        PreparedStatement pstmt = connection.prepareStatement(sql);

        Object[] vals=set.toArray();
        int count = 1;
        for (int i=0; i<vals.length; i++) {
            pstmt.setString(1, vals[i].toString());
            pstmt.addBatch();
            if(count%maxInsertBatch == 0){
                 pstmt.executeBatch();
            }
            count++;
        }

        // Execute the batch
        pstmt.executeBatch();
        System.out.append("inserted "+count);

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
Questionuser157195View Question on Stackoverflow
Solution 1 - MysqlBertil ChapuisView Answer on Stackoverflow
Solution 2 - MysqlEranView Answer on Stackoverflow
Solution 3 - MysqlnosView Answer on Stackoverflow
Solution 4 - MysqlWrikkenView Answer on Stackoverflow
Solution 5 - MysqlLalithView Answer on Stackoverflow
Solution 6 - MysqlHieu HoangChiView Answer on Stackoverflow