ResultSet: Retrieving column values by index versus retrieving by label

JavaOptimizationJdbcResultsetMaintenance

Java Problem Overview


When using JDBC, I often come across constructs like

ResultSet rs = ps.executeQuery();
while (rs.next()) {
    int id = rs.getInt(1);
    // Some other actions
}

I asked myself (and authors of code too) why not to use labels for retrieving column values:

int id = rs.getInt("CUSTOMER_ID");

The best explanation I've heard is something concerning performance. But actually, does it make processing extremely fast? I don't believe so, though I have never performed measurements. Even if retrieving by label would be a bit slower, nevertheless, it provide better readability and flexibility, in my opinion.
So could someone give me good explanation of avoiding to retrieve column values by column index instead of column label? What are pros and cons of both approaches (maybe, concerning certain DBMS)?

Java Solutions


Solution 1 - Java

Warning: I'm going to get bombastic here, because this drives me crazy.

99%* of the time, it's a ridiculous micro-optimization that people have some vague idea makes things 'better'. This completely ignores the fact that, unless you're in an extremely tight and busy loop over millions of SQL results all the time, which is hopefully rare, you'll never notice it. For everyone who's not doing that, the developer time cost of maintaing, updating, and fixing bugs in the column indexing are far greater than the incremental cost of hardware for your infinitesimally-worse-performing application.

Don't code optimizations like this in. Code for the person maintaining it. Then observe, measure, analyse, and optimize. Observe again, measure again, analyse again, and optimize again.

Optimization is pretty much the last step in development, not the first.

* Figure is made up.

Solution 2 - Java

You should use string labels by default.

Pros:

  • Independence of column order
  • Better readability/maintainability

Cons:

  • You have no control over the column names (access via stored procedures)

Which would you prefer?

ints?

int i = 1;  
customerId = resultSet.getInt(i++);  
customerName = resultSet.getString(i++);  
customerAddress = resultSet.getString(i++);

or Strings?

customerId = resultSet.getInt("customer_id");  
customerName = resultSet.getString("customer_name");  
customerAddress = resultSet.getString("customer_address");

And what if there is a new column inserted at position 1? Which code would you prefer? Or if the order of the columns is changed, which code version would you need to change at all?

That's why you should use string labels by default.

Solution 3 - Java

The answer has been accepted, none-the-less, here is some additional information and personal experience that I have not seen put forward yet.

Use column names (constants and not literals is preferred) in general and if possible. This is both clearer, is easier to maintain, and future changes are less likely to break the code.

There is, however, a use for column indexes. In some cases these are faster, but not sufficiently that this should override the above reasons for names*. These are very valuable when developing tools and general methods dealing with ResultSets. Finally, an index may be required because the column does not have a name (such as an unnamed aggregate) or there are duplicate names so there is no easy way to reference both.

*Note that I have written some JDBC drivers and looked inside some open sources one and internally these use column indexes to reference the result columns. In all cases I have worked with, the internal driver first maps a column name to an index. Thus, you can easily see that the column name, in all those cases, would always take longer. This may not be true for all drivers though.

Solution 4 - Java

From the java documentation:

> The ResultSet interface provides getter methods (getBoolean, getLong, and so on) for retrieving column values from the current row. Values can be retrieved using either the index number of the column or the name of the column. In general, using the column index will be more efficient. Columns are numbered from 1. For maximum portability, result set columns within each row should be read in left-to-right order, and each column should be read only once.

Of course each method (named or indexed) has its place. I agree that named columns should be the default. However, in cases where a huge number of loops are required, and where the SELECT statement is defined and maintained in the same section of code (or class), indexes should be ok - it is advisable to list the columns being selected, not just "SELECT * FROM...", since any table change will break the code.

Solution 5 - Java

I did some performance profiling on this exact subject on an Oracle database. In our code we have a ResultSet with numerous colums and a huge number of rows. Of the 20 seconds (!) the request takes to execute method oracle.jdbc.driver.ScrollableResultSet.findColumn(String name) takes about 4 seconds.

Obviously there's something wrong with the overall design, but using indexes instead of the column names would probably take this 4 seconds away.

Solution 6 - Java

Sure, using column names increases readability and makes maintenance easy. But using column names has a flipside. As you know, SQL allows multiple column names with same name, there's no guarantee that the column name you typed in the getter method of resultSet actually points to the column name you intend to access. In theory, using index numbers instead of column names is preferred, but it reduces the readability.

Solution 7 - Java

You can have the best of both! The speed of using indexes with the maintainability and security of using column names.

First - unless you are looping thru a result set just use column names.

  1. Define a set of integer variables, one for each column you will access. The names of the variables can include the name of the column: e.g. iLast_Name.

  2. Before the result set loop iterate thru the column metadata and set the value of each integer variable to the column index of the corresponding column name. If the index of the 'Last_Name' column is 3 then set the value of 'iLast_Name' to 3.

  3. In the result set loop use the integer variable names in the GET/SET methods. The variable name is a visual clue to the developer/maintainer as to the actual column name being accessed but the value is the column index and will give the best performance.

NOTE: the initial mapping (i.e. column name to index mapping) is only done once before the loop rather than for every record and column in the loop.

Solution 8 - Java

I don't think using the labels impacts performance by much. But there is another reason not to use Strings. Or ints, for that matter.

Consider using constants. Using an int constant makes the code more readably, but also less likely to have errors.

Besides being more readable, the constant also prevents you from making typo's in the label names - the compiler will throw an error if you do. And any IDE worth anything will pick it up. This is not the case if you use Strings or ints.

Solution 9 - Java

The JDBC driver takes care for the column to index look-up. So if you extract values by column name each time the driver makes a look-up (usually in hash map) to check the corresponding index for the column name.

Solution 10 - Java

I agree with previous answers that performance is not something that can force us to select either of the approaches. It would be good to consider the following things instead:

  • Code readability: for every developer reading your code labels have much more sense than indexes.
  • Maintenance: think of the SQL query and the way it is maintained. What is more likely to happen in your case after fixing/improving/refactoring SQL query: changing the order of the columns extracted or changing result column names. It seems for me that changing the order of the columns extracted (as the results of adding/deleting new columns in result set) has greater probability to happen.
  • Encapsulation: in spite of the way you choose try to isolate the code where you run SQL query and parse result set in the same component and make only this component aware about the column names and their mapping to the indexes (if you decided to use them).

Solution 11 - Java

Using the index is an attempt at optimization.

The time saved by this is wasted by the extra effort it takes the developer to look up the necessary data to check if their code will work properly after the changes.

I think it's our built-in instinct to use numbers instead of text.

Solution 12 - Java

Besides the look up in Map for labels it also leads to an extra String creation. Though it will happens on stack but still it caries a cost with it.

It all depends on the individual choice and till date I have used only indexes :-)

Solution 13 - Java

As it is pointed out by other posters, I would stick to column names unless you have a really powerful reason not to do so. The impact in performance is negligible compared to, for example, query optimization. In this case, maintenance is much more important than an small optmization.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionRorickView Question on Stackoverflow
Solution 1 - JavaCowanView Answer on Stackoverflow
Solution 2 - JavaMartin KlinkeView Answer on Stackoverflow
Solution 3 - JavaKevin BrockView Answer on Stackoverflow
Solution 4 - JavaJasonView Answer on Stackoverflow
Solution 5 - JavaMarko UllgrenView Answer on Stackoverflow
Solution 6 - Javauser228462View Answer on Stackoverflow
Solution 7 - JavaRick PostView Answer on Stackoverflow
Solution 8 - JavaSietseView Answer on Stackoverflow
Solution 9 - JavazlosterView Answer on Stackoverflow
Solution 10 - JavaCha2lengerView Answer on Stackoverflow
Solution 11 - JavadatabyssView Answer on Stackoverflow
Solution 12 - JavaVinod SinghView Answer on Stackoverflow
Solution 13 - JavaRober2D2View Answer on Stackoverflow