Who architected / designed C++'s IOStreams, and would it still be considered well-designed by today's standards?


C++ Problem Overview


First off, it may seem that I'm asking for subjective opinions, but that's not what I'm after. I'd love to hear some well-grounded arguments on this topic.


In the hope of getting some insight into how a modern streams / serialization framework ought to be designed, I recently got myself a copy of the book Standard C++ IOStreams and Locales by Angelika Langer and Klaus Kreft. I figured that if IOStreams wasn't well-designed, it wouldn't have made it into the C++ standard library in the first place.

After having read various parts of this book, I am starting to have doubts if IOStreams can compare to e.g. the STL from an overall architectural point-of-view. Read e.g. this interview with Alexander Stepanov (the STL's "inventor") to learn about some design decisions that went into the STL.

What surprises me in particular:

  • It seems to be unknown who was responsible for IOStreams' overall design (I'd love to read some background information about this — does anyone know good resources?);

  • Once you delve beneath the immediate surface of IOStreams, e.g. if you want to extend IOStreams with your own classes, you get to an interface with fairly cryptic and confusing member function names, e.g. getloc/imbue, uflow/underflow, snextc/sbumpc/sgetc/sgetn, pbase/pptr/epptr (and there are probably even worse examples). This makes it so much harder to understand the overall design and how the individual parts co-operate. Even the book I mentioned above doesn't help that much (IMHO).


Thus my question:

If you had to judge by today's software engineering standards (if there actually is any general agreement on these), would C++'s IOStreams still be considered well-designed? (I wouldn't want to improve my software design skills from something that's generally considered outdated.)

C++ Solutions


Solution 1 - C++

Regarding who designed them, the original library was (not surprisingly) created by Bjarne Stroustrup, and then reimplemented by Dave Presotto. This was then redesigned and reimplemented yet again by Jerry Schwarz for Cfront 2.0, using the idea of manipulators from Andrew Koenig. The standard version of the library is based on this implementation.

Source: "The Design & Evolution of C++", section 8.3.1.

Solution 2 - C++

Several ill-conceived ideas found their way into the standard: auto_ptr, vector<bool>, valarray and export, just to name a few. So I wouldn't take the presence of IOStreams necessarily as a sign of quality design.

IOStreams have a checkered history. They are actually a reworking of an earlier streams library, but were authored at a time when many of today's C++ idioms didn't exist, so the designers didn't have the benefit of hindsight. One issue that only became apparent over time was that it is almost impossible to implement IOStreams as efficiently as C's stdio, due to the copious use of virtual functions and forwarding to internal buffer objects at even the finest granularity, and also thanks to some inscrutable strangeness in the way locales are defined and implemented. My memory of this is quite fuzzy, I'll admit; I remember it being the subject of intense debate some years ago, on comp.lang.c++.moderated.

Solution 3 - C++

> If you had to judge by today's software engineering standards (if there actually is any general agreement on these), would C++'s IOStreams still be considered well-designed? (I wouldn't want to improve my software design skills from something that's generally considered outdated.)

I would say NO, for several reasons:

Poor error handling

Error conditions should be reported with exceptions, not with operator void*.

The "zombie object" anti-pattern is what causes bugs like these.
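A minimal sketch of the two error-reporting modes being contrasted here. The function names parse_silently and parse_with_exceptions are made up for illustration; exceptions() and std::ios_base::failure are the standard facilities.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Default behavior: a failed extraction only sets failbit. The stream object
// stays alive and accepts further calls, but it has to be tested explicitly.
bool parse_silently(const std::string& text, int& out) {
    std::istringstream in(text);
    in >> out;
    return static_cast<bool>(in);   // false once failbit or badbit is set
}

// Opt-in behavior: ask the stream to throw when failbit or badbit gets set.
bool parse_with_exceptions(const std::string& text, int& out) {
    std::istringstream in(text);
    in.exceptions(std::ios_base::failbit | std::ios_base::badbit);
    try {
        in >> out;                  // throws std::ios_base::failure on bad input
        return true;
    } catch (const std::ios_base::failure&) {
        return false;
    }
}
```

Exceptions are off by default, so a caller who forgets the explicit test keeps working with a stream that is already in a failed state.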

Poor separation between formatting and I/O

This makes stream objects unnecessarily complex, as they have to contain extra state information for formatting, whether you need it or not.

It also increases the odds of writing bugs like:

using namespace std; // I'm lazy.
cout << hex << setw(8) << setfill('0') << x << endl;
// Oops!  Forgot to set the stream back to decimal mode.

If instead, you wrote something like:

cout << pad(to_hex(x), 8, '0') << endl;

There would be no formatting-related state bits, and no problem.
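To make the contrast concrete, here is a rough sketch. pad and to_hex are not standard functions; the versions below are hypothetical helpers written only to show the idea, and the two demo functions are likewise illustrative.

```cpp
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: hexadecimal text for x, built on a private stream so
// that no caller-visible format flags change.
std::string to_hex(unsigned long x) {
    std::ostringstream tmp;
    tmp << std::hex << x;
    return tmp.str();
}

// Hypothetical helper: left-pad s with fill up to width characters.
std::string pad(const std::string& s, std::size_t width, char fill) {
    if (s.size() >= width) return s;
    return std::string(width - s.size(), fill) + s;
}

// Sticky state: std::hex and std::setfill persist after the first insertion,
// while std::setw applies only to the next item.
std::string sticky_demo() {
    std::ostringstream out;
    out << std::hex << std::setw(8) << std::setfill('0') << 255 << ' ';
    out << 255;   // still hexadecimal: "ff"
    return out.str();
}

// Stateless formatting: the stream's flags are never touched.
std::string stateless_demo() {
    std::ostringstream out;
    out << pad(to_hex(255), 8, '0') << ' ';
    out << 255;   // decimal: "255"
    return out.str();
}
```

sticky_demo() yields "000000ff ff", whereas stateless_demo() yields "000000ff 255".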

Note that in "modern" languages like Java, C#, and Python, all objects have a toString/ToString/__str__ function that is called by the I/O routines. AFAIK, only C++ does it the other way around by using stringstream as the standard way of converting to a string.

Poor support for i18n

Iostream-based output splits string literals into pieces.

cout << "My name is " << name << " and I am " << occupation << " from " << hometown << endl;

Format strings put whole sentences into string literals.

printf("My name is %s and I am %s from %s.\n", name, occupation, hometown);

The latter approach is easier to adapt to internationalization libraries like GNU gettext, because the use of whole sentences provides more context for the translators. If your string formatting routine supports re-ordering (like the POSIX $ printf parameters), then it also better handles differences in word order between languages.
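A small sketch of the re-ordering point, assuming a printf implementation that supports the POSIX positional syntax (%1$s, %2$s, ...). That syntax is a POSIX extension rather than ISO C, so it is not available everywhere. The greet function and the sample format strings are made up for illustration; in a real program the format string would come from the translation catalogue.

```cpp
#include <cstdio>
#include <string>

// Formats one sentence from a format string chosen at run time.
// The format decides the order in which the three arguments appear.
std::string greet(const char* fmt, const char* name,
                  const char* occupation, const char* hometown) {
    char buf[256];
    std::snprintf(buf, sizeof buf, fmt, name, occupation, hometown);
    return buf;
}
```

With the format "My name is %1$s and I am %2$s from %3$s." the arguments appear in call order. With "%3$s: %1$s, %2$s" the hometown comes first, and the calling code does not change.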

Solution 4 - C++

I'm posting this as a separate answer because it is pure opinion.

Performing input & output (particularly input) is a very, very hard problem, so not surprisingly the iostreams library is full of bodges and things that with perfect hindsight could have been done better. But it seems to me that all I/O libraries, in whatever language, are like this. I've never used a programming language where the I/O system was a thing of beauty that made me stand in awe of its designer. The iostreams library does have advantages, particularly over the C I/O library (extensibility, type-safety etc.), but I don't think anyone is holding it up as an example of great OO or generic design.

Solution 5 - C++

My opinion of C++ iostreams has improved substantially over time, particularly after I started to actually extend them by implementing my own stream classes. I began to appreciate the extensibility and overall design, despite the ridiculously poor member function names like xsputn or whatever. Regardless, I think I/O streams are a massive improvement over C stdio.h, which has no type safety and is riddled with major security flaws.
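For readers who have not extended the library themselves, the following is a minimal sketch of what such an extension can look like. The class upper_buf and the function shout are hypothetical names invented for this example; only std::streambuf, its overflow() hook and sputc() are standard.

```cpp
#include <cctype>
#include <ostream>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical filter: upper-cases every character and forwards it to another
// stream buffer. No put area is set up, so each character written through the
// stream arrives here via overflow().
class upper_buf : public std::streambuf {
public:
    explicit upper_buf(std::streambuf* dest) : dest_(dest) {}

protected:
    int_type overflow(int_type ch) override {
        if (traits_type::eq_int_type(ch, traits_type::eof()))
            return traits_type::not_eof(ch);   // nothing to write
        const unsigned char c =
            static_cast<unsigned char>(traits_type::to_char_type(ch));
        return dest_->sputc(static_cast<char>(std::toupper(c)));
    }

private:
    std::streambuf* dest_;   // not owned
};

// All the usual formatting operators work unchanged on top of the new buffer.
std::string shout(const std::string& text) {
    std::ostringstream sink;
    upper_buf buf(sink.rdbuf());
    std::ostream out(&buf);
    out << text << ' ' << 42;
    return sink.str();
}
```

shout("hello") returns "HELLO 42". A production filter would usually add buffering and override xsputn and sync as well; this sketch leaves those at their defaults.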

I think the main problem with IO streams is that they conflate two related but somewhat orthogonal concepts: textual formatting and serialization. On the one hand, IO streams are designed to produce a human-readable, formatted textual representation of an object, and on the other hand, to serialize an object into a portable format. Sometimes these two goals are one and the same, but other times this results in some seriously annoying incongruities. For example:

std::stringstream ss;
std::string output_string = "Hello world";
ss << output_string;

...

std::string input_string;
ss >> input_string;
std::cout << input_string;

Here, what we read back is not what we originally wrote to the stream. This is because the << operator outputs the entire string, whereas the >> operator only reads from the stream until it encounters a whitespace character, since no length information is stored in the stream. So even though we output a string object containing "Hello world", we only read back a string object containing "Hello". So while the stream has served its purpose as a formatting facility, it has failed to properly serialize and then deserialize the object.
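The behavior can be checked with a short sketch. The function names are made up for illustration. The second function uses std::quoted, which was added in C++14 (after this answer was written) and is shown only as one possible way to make the round trip work.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// operator>> for std::string stops at the first whitespace character.
std::string naive_round_trip(const std::string& s) {
    std::stringstream ss;
    ss << s;
    std::string back;
    ss >> back;
    return back;
}

// std::quoted writes delimiters and escapes on output and removes them on
// input, so the extractor can tell where the string ends.
std::string quoted_round_trip(const std::string& s) {
    std::stringstream ss;
    ss << std::quoted(s);
    std::string back;
    ss >> std::quoted(back);
    return back;
}
```

naive_round_trip("Hello world") returns "Hello", while quoted_round_trip("Hello world") returns "Hello world". The price is that the stream now holds the quoted form, which is no longer the plain human-readable text.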

You might say that IO streams weren't designed to be serialization facilities, but if that's the case, what are input streams really for? Besides, in practice I/O streams are often used to serialize objects, because there are no other standard serialization facilities. Consider boost::date_time or boost::numeric::ublas::matrix, where if you output a matrix object with the << operator, you'll get the same exact matrix when you input it using the >> operator. But in order to accomplish this, the Boost designers had to store column count and row count information as textual data in the output, which compromises the actual human-readable display. Again, an awkward combination of textual formatting facilities and serialization.

Note how most other languages separate these two facilities. In Java, for example, formatting is accomplished through the toString() method, while serialization is accomplished through the Serializable interface.

In my opinion, the best solution would have been to introduce byte based streams, alongside the standard character based streams. These streams would operate on binary data, with no concern for human-readable formatting/display. They could be used solely as serialization/deserialization facilities, to translate C++ objects into portable byte sequences.

Solution 6 - C++

I always found C++ IOStreams ill-designed: their implementation makes it very difficult to properly define a new type of stream. They also mix I/O features and formatting features (think about manipulators).

Personally, the best stream design and implementation I have ever found lies in the Ada programming language. It is a model of decoupling, it is a joy to create new types of streams, and output functions always work regardless of the stream used. This is thanks to a least common denominator: you output bytes to a stream and that's it. Stream functions take care of putting the bytes into the stream; it is not their job to e.g. format an integer into hexadecimal (of course, there is a set of type attributes, equivalent to a class member, defined for handling formatting).

I wish C++ was as simple with regard to streams...

Solution 7 - C++

I think IOStreams design is brilliant in terms of extendability and usefulness.

  1. Stream buffers: take a look at the Boost.Iostreams extensions: create gzip, tee, copy streams in a few lines, create special filters and so on. It would not be possible without it.

  2. Localization integration and formatting integration. See what can be done:

     std::cout << as::spellout << 100 << std::endl;
    

    Can print: "one hundred" or even:

     std::cout << translate("Good morning")  << std::endl;
    

    Can print "Bonjour" or "בוקר טוב" according to the locale imbued to std::cout!

    Such things can be done just because iostreams are very flexible.

Could it be done better?

Of course it could! In fact there are many things that could be improved...

Today it is quite painful to derive correctly from std::streambuf, and it is quite non-trivial (but possible) to add additional formatting information to a stream.

But looking back many years, I still think the library design was good enough to be able to bring many goodies.

You can't always see the big picture, but if you leave points for extension, you gain abilities even in areas you didn't think about.

Solution 8 - C++

(This answer is just based on my opinion)

I think that IOStreams are much more complex than their function equivalents. When I write in C++, I still use the cstdio headers for "old-style" I/O, which I find much more predictable. On a side note, (though it isn't really important; the absolute time difference is negligible) IOStreams have been proven on numerous occasions to be slower than C I/O.

Solution 9 - C++

I always run into surprises when using the IOStream.

The library seems text oriented and not binary oriented. That may be the first surprise: using the binary flag in file streams is not sufficient to get binary behavior. User Charles Salvia above has observed it correctly: IOStreams mix formatting aspects (where you want pretty output, e.g. limited digits for floats) with serialization aspects (where you do not want information loss). Probably it would be good to separate these aspects. Boost.Serialization does this halfway. You have a serialize function which routes to the inserters and extractors if you want. There you already have the tension between both aspects.

Many functions also have confusing semantics (e.g. get, getline, ignore and read: some extract the delimiter, some don't; some also set eof). Furthermore, others have mentioned the weird function names you meet when implementing a stream (e.g. xsputn, uflow, underflow). Things get even worse when one uses the wchar_t variants. The wifstream does a translation to multibyte while wstringstream does not. Binary I/O does not work out of the box with wchar_t: you have to override the codecvt facet.
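One of the delimiter differences can be shown in a few lines. This is a sketch with made-up function names; it covers only the character-array overloads of get and getline.

```cpp
#include <sstream>
#include <string>

// Reads the first line with get() and reports the next character in the stream.
int next_after_get(const std::string& text) {
    std::istringstream in(text);
    char buf[64];
    in.get(buf, sizeof buf);       // stops before '\n' and leaves it in the stream
    return in.peek();
}

// Reads the first line with getline() and reports the next character.
int next_after_getline(const std::string& text) {
    std::istringstream in(text);
    char buf[64];
    in.getline(buf, sizeof buf);   // extracts and discards '\n'
    return in.peek();
}
```

For the input "ab\ncd", next_after_get returns '\n' and next_after_getline returns 'c', so code that mixes the two calls has to remember which one leaves the delimiter behind.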

The C buffered I/O (i.e. FILE) is not as powerful as its C++ counterpart, but it is more transparent and has much less counterintuitive behavior.

Still, every time I stumble upon IOStreams, I get attracted to them like a moth to fire. Probably it would be a good thing if some really clever person took a good look at the overall architecture.

Solution 10 - C++

I cannot help with answering the first part of the question (Who did that?), but it was answered in other posts.

As to the second part of the question (Well designed?), my answer is a resounding "No!". Here is a little example that has made me shake my head in disbelief for years:

#include <stdint.h>
#include <iostream>
#include <vector>

// A small attempt at generic programming ;)
// (The parameter is named T because identifiers such as _T are reserved.)
template <class T>
void ShowVector( const char *title, const std::vector<T> &v )
{
    // typename is required here: const_iterator is a dependent type.
    typename std::vector<T>::const_iterator iter;
    std::cout << title << " (" << v.size() << " elements): ";
    for( iter = v.begin(); iter != v.end(); ++iter )
    {
        std::cout << (*iter) << " ";
    }
    std::cout << std::endl;
}
int main( int argc, const char * argv[] )
{
    std::vector<uint8_t> byteVector;
    std::vector<uint16_t> wordVector;
    byteVector.push_back( 42 );
    wordVector.push_back( 42 );
    ShowVector( "Garbled bytes as characters output o.O", byteVector );
    ShowVector( "With words, the numbers show as numbers.", wordVector );
    return 0;
}

The above code produces nonsense due to iostream design. For reasons beyond my grasp, they treat uint8_t bytes as characters, while larger integral types are treated like numbers. Q.e.d. Bad design.

There is also no way I can think of to fix this. The type could as well be a float or a double instead... so a cast to 'int' to make silly iostream understand that numbers not chars are the topic will not help.
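For completeness, one idiom that is sometimes used in this situation is the unary plus operator, which applies integral promotion to char-sized integers and leaves float and double values as they are. The sketch below is not part of the original answer. The function name join_numeric is made up, and the approach only works for built-in arithmetic types (or types that define unary plus).

```cpp
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: prints every element as a number, followed by a space.
template <class T>
std::string join_numeric(const std::vector<T>& v) {
    std::ostringstream out;
    for (const T& item : v)
        out << +item << ' ';   // +item promotes uint8_t to int; double stays double
    return out.str();
}
```

A std::vector<std::uint8_t> holding 42 is then printed as "42 " rather than as the character '*'. The need for such a trick arguably supports the answer's complaint rather than refuting it.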

After receiving a down-vote to my reply, maybe a few more words of explanation... IOStream design is flawed as it does not give the programmer a means to state HOW an item is treated. The IOStream implementation makes arbitrary decisions (such as treating uint8_t as a char, not a byte number). This IS a flaw of the IOStream design, as they try to achieve the unachievable.

C++ does not allow to classify a type - the language does not have the facility. There is no such thing as is_number_type() or is_character_type() IOStream could use to make a reasonable automatic choice. Ignoring that and trying to get away with guessing IS a design flaw of a library.

Admittedly, printf() would equally fail to work in a generic "ShowVector()" implementation. But that is no excuse for iostream behavior. It is very likely that in the printf() case, ShowVector() would be defined like this:

template <class T>
void ShowVector( const char *formatString, const char *title, const std::vector<T> &v );

Solution 11 - C++

C++ iostreams have a lot of flaws, as noted in the other responses, but I'd like to note something in its defense.

C++ is virtually unique among languages in serious use in that it makes variable input and output straightforward for beginners. In other languages, user input tends to involve type coercion or string formatters, while C++ makes the compiler do all the work. The same is largely true for output, although C++ isn't as unique in this regard. Still, you can do formatted I/O pretty well in C++ without having to understand classes and object-oriented concepts, which is pedagogically useful, and without having to understand format syntax. Again, if you're teaching beginners, that's a big plus.

This simplicity for beginners comes at a price, which can make it a headache for dealing with I/O in more complex situations, but hopefully by that point the programmer has learned enough to be able to deal with them, or at least turned old enough to drink.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | stakx - no longer contributing | View Question on Stackoverflow
Solution 1 - C++ | anon | View Answer on Stackoverflow
Solution 2 - C++ | Marcelo Cantos | View Answer on Stackoverflow
Solution 3 - C++ | dan04 | View Answer on Stackoverflow
Solution 4 - C++ | anon | View Answer on Stackoverflow
Solution 5 - C++ | Charles Salvia | View Answer on Stackoverflow
Solution 6 - C++ | Adrien Plisson | View Answer on Stackoverflow
Solution 7 - C++ | Artyom | View Answer on Stackoverflow
Solution 8 - C++ | Delan Azabani | View Answer on Stackoverflow
Solution 9 - C++ | gast128 | View Answer on Stackoverflow
Solution 10 - C++ | BitTickler | View Answer on Stackoverflow
Solution 11 - C++ | user2310967 | View Answer on Stackoverflow