Why does std::pair expose member variables?

C++StlEncapsulation

C++ Problem Overview


From http://www.cplusplus.com/reference/utility/pair/, we know that std::pair has two member variables, first and second.

Why did the STL designers decide to expose two member variables, first and second, instead of offering a getFirst() and a getSecond()?

C++ Solutions


Solution 1 - C++

For the original C++03 std::pair, functions to access the members would serve no useful purpose.

As of C++11 and later (we're now at C++17, with C++20 coming up fast) std::pair is a special case of std::tuple, where std::tuple can have any number of items. As such it makes sense to have a parameterized getter, since it would be impractical to invent and standardize an arbitrary number of item names. Thus you can use std::get also for a std::pair.

So, the reasons for the design are historical, that the current std::pair is the end result of an evolution towards more generality.


In other news:

regarding

> As far as I know, it will be better if encapsulating two member variables above and give a getFirst(); and getSecond()

no, that's rubbish.

That's like saying a hammer is always better, whether you're driving in nails, fastening with screws, or trimming a piece of wood. Especially in the last case a hammer is just not a useful tool. Hammers can be very useful, but that doesn't mean that they're “better” in general: that's just nonsense.

Solution 2 - C++

Getters and setters are usually useful if one thinks that getting or setting the value requires extra logic (changing some internal state). This can then be easily added into the method. In this case std::pair is only used to provide 2 data values. Nothing more, nothing less. And thus, adding the verbosity of a getter and setter would be pointless.

Solution 3 - C++

The reason is that no real invariant needs to be imposed on the data structure, as std::pair models a general-purpose container for two elements. In other words, an object of type std::pair<T, U> is assumed to be valid for any possible first and second element of type T and U, respectively. Similarly, subsequent mutations in the value of its elements cannot really affect the validity of the std::pair per se.

Alex Stepanov (the author of the STL) explicitly presents this general design principle during his course Efficient Programming with Components, when commenting on the singleton container (i.e., a container of one element).

Thus, albeit the principle in itself can be a source of debate, this is the reason behind the shape of std::pair.

Solution 4 - C++

Getters and setters are useful if one believes that abstraction is warranted to insulate users from design choices and changes in those choices, now or in the future.

The typical example for "now" is that the setter/getter might have logic to validate and/or calculate the value - e.g., use a setter for a phone number, instead of directly exposing the field, so that you can check the format; use a getter for a collection so that the getter can provide a read-only view of the member's value (a collection) to the caller.

The canonical (though bad) example for "changes in the future" is Point - should you expose x and y or getX() and getY()? The usual answer is to use getters/setters because at some time in the future you might want to change the internal representation from Cartesian to polar and you don't want your users to be impacted (or to have them depend on that design decision).

In the case of std::pair - it is the intent that this class now and forever represent two and exactly two values (of arbitrary type) directly, and provide their values on demand. That's it. And that's why the design uses direct member access, rather than go through a getter/setter.

Solution 5 - C++

It could be argued that std::pair would be better off having accessor functions to access its members! Notably for degenerated cases of std::pair there could be an advantage. For example, when at least one of the types is an empty, non-final class, the objects could be smaller (the empty base could be made a base which wouldn't need to get its own address).

At the time std::pair was invented these special cases were not considered (and I'm not sure if the empty base optimization was allowed in the draft working paper at that time). From a semantic point there isn't much reason to have accessor functions, though: clearly, the accessors would need to return a mutable reference for non-const objects. As a result the accessor does not provide any form of encapsulation.

On the other hand, it makes it [slightly] harder on the optimizer to see what's going on when accessor functions are used e.g. because additional sequence points are introduced. I could imagine that Meng Lee and Alexander Stepanov even measured whether there is a difference (nor did I). Even if they didn't, providing access to the members directly is certainly not slower than going through an accessor function while the reverse is not necessarily true.

I wasn't part of the decision and the C++ standard doesn't have a rationale but I guess it was a deliberate decision to make the members public data members.

Solution 6 - C++

The primary purpose of getters and setters is to gain control over access. That is to say, if you expose "first" as a variable, any class can read and write (if not const) it without telling the class it is a part of. In a number of cases, that can pose serious problems.

For example, say you have a class that represents the number of passengers on a boat. You store the number of passengers as an integer. If you expose that number as a bare variable, it would be possible for external functions to write to it. That could leave you in a case where there are actually 10 passengers, but someone changed the variable (perhaps accidentally) to be 50. This is a case for a getter on the number of passengers (but not a setter, which would present the same problem).

An example for getters and setters would be a class which represents a mathematical vector in which you want to cache certain information about the vector. Say you want to store the length. In this case, changing vec.x would probably change the length/magnitude. So, not only do you need to make x wrapped in a getter, you must provide a setter for x, which knows to update the vector's cached length. (Of course, most actual math libraries do not cache these values, and thus expose the variables.)

So the question you ought to ask yourself in the context of using them is: is this class ever conceivably going to need to control or be alerted to changes to this variable?

The answer in something like std::pair is a flat "no". There is no case for controlling access to members in a class whose sole purpose is to contain those members. There certainly is no need for pair to know if those variables have been touched, considering those are its only two members, and thus it has no state to update should either change. pair is ignorant of what it actually contains and its meaning, so tracking what it contains is not worth the effort.

Depending on the compiler and how it is configured, getters and setters can introduce overhead. That's probably not important in most cases, but if you were to put them on something fundamental like std::pair, it would be a non-trivial concern. As such, their addition would need justified - which as I just explained, it cannot be.

Solution 7 - C++

I was appalled by the number of comments that show no basic understanding of object-oriented design (does that prove c++ is not an OO-language?). Yes the design of std::pair has some historical traits, but that does not make a bad design good; nor should it be used as an excuse to deny the fact. Before I rant on it, let me answer some of the questions in the comments:

> Don't you think int should also have a setter and getter

Yes, from a design point of view we should use accessors because by doing so we lose nothing but gain additional flexibility. Some newer algorithms may want to pack additional bits into the key/values, and you cannot encode/decode them without accessors.

> Why wrap something in a getter if there is no logic in the getter?

How do you know there would be no logic in the getter/setter? A good design should not limit the possibility of implementation based on guess. It should offer as much flexibility as possible. Remember the design of std:pair also decides the design of iterator, and by requiring users to directly access member variables, the iterator has to return structures that actually store key/values together. That turns out to be a big limitation. There are algorithms that need to keep them separate. There are algorithms that don't store key/values explicitly at all. Now they have to copy the data during iteration.

> Contrary to popular belief, having objects that do nothing but store member variables with getters and setters is not "the way things should be done"

Another wild guess.

OK, I would stop here.

To answer the original question: std::pair chose to expose member variables because whoever designed it did not recognize and/or prioritize the importance of a flexible contract. They obviously had a very narrow idea/vision about how key-value pairs in a map/hashtable should be implemented, and to make it worse, they let such a narrow view on implementation spill over the top to compromise the design. For example, what if I want to implement a replacement of std:unordered_map that stores key and values in separate arrays based on an open addressing scheme with linear probing? This can greatly boost cache performance for pairs with small keys and large values, as you don't need long-jump across the spaces occupied by values to probe the keys. Had std::pair chosen accessors, it would be trivial to write an STL-style iterator for this. But now it is simply impossible to achieve this without eliciting additional data copying.

I noticed that they also mandate the use of open hashing (i.e., closed chaining) for the implementation of std::unordered_map. This is not only strange from a design point of view (why you want to restrict how things are implemented???), but also pretty dumb in terms of implementation - chained hashtables using linked list is perhaps the slowest of all categories. Go google the web, we can easily find that std:unordered_map is often the doormat of a hashtable benchmark. It even tends to be slower than Java's HashMap (I don't know how they managed to lag behind in this case, as HashMap is also a chained hashtable). An old excuse is that chained hashtable tend to perform better when the load_factor approaches 1, which is totally invalid because 1) there are plenty of techniques in open addressing family to deal with this problem - ever heard of hopscotching or robin-hood hashing, and the latter has actually been there for 30 freakish years; 2) a chained hashtable adds the overhead of a pointer (a good 8 bytes on 64 bit machines) for each entry, so when we say the load_factor of an unordered_map approaches 1, it is not 100% memory usage! We should take that into consideration and compare the performance of unordered_map with alternatives with same memory usage. And it turns out that alternatives like Google Dense HashMap is 3-4 times faster than std::unordered_map.

Why these are relevant? Because interestingly, mandating open hashing does make the design of std::pair look less bad, now that we do not need the flexibility of an alternative storage structure. Moreover, the presence of std::pair makes it almost impossible to adopt newer/better algorithms to write a drop-in replacement of std::unordered_map. Sometimes you wonder whether they did that intentionally so that the poor design of std::pair and the pedestrian implementation of std::unordered_map can survive longer together. Of course I am kidding, so whoever wrote those, don't get offended. In fact people using Java or Python (OK, I admit Python's a stretch) would want to thank you for making them feel good about being "as fast as C++".

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionguoView Question on Stackoverflow
Solution 1 - C++Cheers and hth. - AlfView Answer on Stackoverflow
Solution 2 - C++Hatted RoosterView Answer on Stackoverflow
Solution 3 - C++Ilio CatalloView Answer on Stackoverflow
Solution 4 - C++davidbakView Answer on Stackoverflow
Solution 5 - C++Dietmar KühlView Answer on Stackoverflow
Solution 6 - C++user3995702View Answer on Stackoverflow
Solution 7 - C++jzl106View Answer on Stackoverflow