Is it good practice to use std::vector as a simple buffer?

C++ Problem Overview


I have an application that is performing some processing on some images.

Given that I know the width/height/format etc. (I do), and thinking just about defining a buffer to store the pixel data:

Then, rather than using new and delete [] on an unsigned char* and keeping a separate note of the buffer size, I'm thinking of simplifying things by using a std::vector.

So I would declare my class something like this:

#include <vector>

class MyClass
{
    // ... etc. ...

public:
    virtual void OnImageReceived(unsigned char *pPixels, 
        unsigned int uPixelCount);

private:
    std::vector<unsigned char> m_pImageBuffer;    // buffer for 8-bit pixels

    // ... etc. ...
};

Then, when I receive a new image (of some variable size, but don't worry about those details here), I can just resize the vector (if necessary) and copy the pixels:

void MyClass::OnImageReceived(unsigned char *pPixels, unsigned int uPixelCount)
{
    // called when a new image is available
    if (m_pImageBuffer.size() != uPixelCount)
    {
        // resize image buffer
        m_pImageBuffer.reserve(uPixelCount);
        m_pImageBuffer.resize(uPixelCount, 0);
    }

    // copy frame to local buffer
    memcpy_s(&m_pImageBuffer[0], m_pImageBuffer.size(), pPixels, uPixelCount);

    // ... process image etc. ...
}

This seems fine to me, and I like the fact that I don't have to worry about the memory management, but it raises some questions:

  1. Is this a valid application of std::vector or is there a more suitable container?
  2. Am I doing the right thing performance-wise by calling reserve and resize?
  3. Will it always be the case that the underlying memory is consecutive so I can use memcpy_s as shown?

Any additional comment, criticism or advice would be very welcome.

C++ Solutions


Solution 1 - C++

  1. Sure, this'll work fine. The one thing you need to worry about is ensuring that the buffer is correctly aligned, if your class relies on a particular alignment; in this case you may want to use a vector of the datatype itself (like float).
  2. No, reserve is not necessary here; resize will automatically grow the capacity as necessary, in exactly the same way.
  3. Before C++03, technically not (but in practice yes). Since C++03, yes.
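
Points 2 and 3 can be checked with a minimal sketch. The helper name resizeAndFill is hypothetical (it is not from the question); it calls resize without a preceding reserve and then writes through a raw pointer, relying on the storage being contiguous:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: resizes the buffer without calling reserve first,
// then fills it through a raw pointer to the contiguous storage.
inline void resizeAndFill(std::vector<unsigned char>& buffer,
                          std::size_t count, unsigned char value) {
    buffer.resize(count);                 // grows the capacity by itself if needed
    assert(buffer.capacity() >= count);   // holds without any reserve call
    if (count != 0) {
        // &buffer[0] + i addresses element i because the storage is contiguous.
        std::memset(&buffer[0], value, count);
    }
}
```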

Incidentally, though, memcpy_s isn't the idiomatic approach here. Use std::copy instead. Keep in mind that a pointer is an iterator.
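
As a rough sketch of what that could look like, assuming a free function named copyFrame (a hypothetical stand-in for the copy step in OnImageReceived):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical free-function version of the copy step in OnImageReceived.
inline void copyFrame(std::vector<unsigned char>& buffer,
                      const unsigned char* pPixels, std::size_t pixelCount) {
    assert(pPixels != nullptr || pixelCount == 0);
    buffer.resize(pixelCount);
    // Raw pointers satisfy the iterator requirements, so they can be passed
    // to std::copy directly as the source range.
    std::copy(pPixels, pPixels + pixelCount, buffer.begin());
}
```

For trivially copyable element types such as unsigned char, standard library implementations typically lower this call to a memmove, so it should not cost more than the memcpy_s version.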

Starting in C++17, std::byte is the idiomatic unit of opaquely typed storage such as you are using here. char will still work, of course, but allows unsafe usages (as char!) which byte does not.

Solution 2 - C++

Besides what other answers mention, I would recommend using std::vector::assign rather than std::vector::resize and memcpy:

void MyClass::OnImageReceived(unsigned char *pPixels, unsigned int uPixelCount)
{
    m_pImageBuffer.assign(pPixels, pPixels + uPixelCount);
}

That will resize if necessary, and you would be avoiding the unnecessary 0 initialization of the buffer caused by std::vector::resize.

Solution 3 - C++

Using a vector in this case is fine. In C++ the storage is guaranteed to be contiguous.

I would not both resize and reserve, nor would I use memcpy to copy the data in. Instead, all you need to do is reserve to make sure you don't have to reallocate many times, then empty the vector using clear. If you resize, it will go through and set the values of every element to their defaults; this is unnecessary here because you're just going to overwrite them anyway.

When you're ready to copy the data in, don't use memcpy. Use copy in conjunction with back_inserter into an empty vector:

std::copy (pPixels, pPixels + uPixelCount, std::back_inserter(m_pImageBuffer));

I would consider this idiom to be much closer to canonical than the memcpy method you are employing. There might be faster or more efficient methods, but unless you can prove that this is a bottleneck in your code (which it likely won't be; you'll have much bigger fish to fry elsewhere) I would stick with idiomatic methods and leave the premature micro-optimizations to someone else.
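
Put together, the clear, reserve and back_inserter steps might look like the sketch below. The function name appendFrame is hypothetical, and it takes the vector as a parameter only so that it can be compiled on its own:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical helper showing the clear + reserve + back_inserter idiom.
inline void appendFrame(std::vector<unsigned char>& buffer,
                        const unsigned char* pPixels, std::size_t pixelCount) {
    assert(pPixels != nullptr || pixelCount == 0);
    buffer.clear();               // size becomes 0, the capacity is kept
    buffer.reserve(pixelCount);   // at most one allocation for this frame
    // back_inserter calls push_back for each pixel, so no element is
    // default-initialized first and then overwritten.
    std::copy(pPixels, pPixels + pixelCount, std::back_inserter(buffer));
}
```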

Solution 4 - C++

I would avoid std::vector as a container for storing an unstructured buffer, as std::vector is profoundly slow when used as a buffer.

Consider this (C++14) example (for C++11, you can use shared instead of unique ptrs, but you'll notice a slight performance hit in the array example that you don't get from the vectors when running at -O3 or -O2):

#include <array>
#include <chrono>
#include <ctime>
#include <iostream>
#include <memory>
#include <vector>

namespace {
std::unique_ptr<std::array<unsigned char, 4000000>> allocateWithPtr() {
  return std::make_unique<std::array<unsigned char, 4000000>>();
}

std::vector<unsigned char> allocateWithVector() {
  return std::vector<unsigned char>(4000000);
}
} // namespace

int main() {
  auto start = std::chrono::system_clock::now();

  for (long i = 0; i < 1000; i++) {
    auto myBuff = allocateWithPtr();
  }
  auto ptr_end = std::chrono::system_clock::now();

  for (long i = 0; i < 1000; i++) {
    auto myBuff = allocateWithVector();
  }
  auto vector_end = std::chrono::system_clock::now();

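  // Convert explicitly to milliseconds; the clock's native tick length is
  // implementation-defined.
  using ms = std::chrono::duration<double, std::milli>;
  std::cout << "std::unique_ptr = " << ms(ptr_end - start).count()
            << " ms." << std::endl;
  std::cout << "std::vector = " << ms(vector_end - ptr_end).count()
            << " ms." << std::endl;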
}

Output:

bash % clang++ -O3 -std=gnu++14 test.cpp && ./a.out
std::unique_ptr = 0 ms.
std::vector = 0 ms.

bash % clang++ -O2 -std=gnu++14 test.cpp && ./a.out
std::unique_ptr = 0 ms.
std::vector = 0 ms.

bash % clang++ -O1 -std=gnu++14 test.cpp && ./a.out
std::unique_ptr = 89.945 ms.
std::vector = 14135.3 ms.

bash % clang++ -O0 -std=gnu++14 test.cpp && ./a.out
std::unique_ptr = 80.945 ms.
std::vector = 67521.1 ms.

Even with no writes or reallocations, std::vector is over 800 times slower than a plain new wrapped in a unique_ptr at -O0, and over 150 times slower at -O1. What's going on here?

As @MartinSchlott points out, it is not designed for this task. A vector is for holding a set of object instances, not an unstructured (from an array standpoint) buffer. Objects have constructors and destructors. When a vector is destroyed, it calls the destructor of each element it holds; even for a vector of char, an unoptimized build will run that per-element step for every char. At -O2 and above the optimizer can eliminate this work, which matches the timings above.

You can see how much time it takes just to "destroy" the unsigned chars in this vector with this example:

#include <chrono>
#include <ctime>
#include <iostream>
#include <memory>
#include <vector>

std::vector<unsigned char> allocateWithVector() {
    return std::vector<unsigned char>(4000000);
}

int main() {
    auto start = std::chrono::system_clock::now();

    for (long i = 0; i < 100; i++) {
        auto leakThis = new std::vector<unsigned char>(allocateWithVector());
    }
    auto leak_end = std::chrono::system_clock::now();

    for (long i = 0; i < 100; i++) {
        auto myBuff = allocateWithVector();
    }
    auto vector_end = std::chrono::system_clock::now();

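    // Convert explicitly to milliseconds; the clock's native tick length is
    // implementation-defined.
    using ms = std::chrono::duration<double, std::milli>;
    std::cout << "leaking vectors: = "
              << ms(leak_end - start).count() << " ms." << std::endl;
    std::cout << "destroying vectors = "
              << ms(vector_end - leak_end).count() << " ms." << std::endl;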
}

Output:

leaking vectors: = 2058.2 ms.
destroying vectors = 3473.72 ms.

real	0m5.579s
user	0m5.427s
sys	0m0.135s

Even without destroying the vectors, it still takes about 2 seconds just to construct 100 of them.

If you don't need dynamic resizing, or construction & destruction of the elements making up your buffer, don't use std::vector.
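
For the fixed-size case, one possible alternative is sketched below. The class name RawBuffer and its members are hypothetical; the point is only that new unsigned char[size] default-initializes the bytes, so nothing is zeroed or constructed per element:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

// Hypothetical fixed-size buffer: owns raw bytes and remembers their count.
class RawBuffer {
public:
    explicit RawBuffer(std::size_t size)
        // new unsigned char[size] default-initializes: the bytes are not zeroed.
        : m_data(new unsigned char[size]), m_size(size) {}

    unsigned char* data() { return m_data.get(); }
    std::size_t size() const { return m_size; }

    // Copies count bytes from src; the caller must ensure count <= size().
    void fill(const unsigned char* src, std::size_t count) {
        assert(count <= m_size);
        if (count != 0) {
            std::memcpy(m_data.get(), src, count);
        }
    }

private:
    std::unique_ptr<unsigned char[]> m_data;
    std::size_t m_size;
};
```

The bytes are indeterminate until fill has written them, so this only suits code that always overwrites the buffer before reading it.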

Solution 5 - C++

std::vector was MADE to be used in such cases. So, yes.

  1. Yes, it is.

  2. reserve is unnecessary in your case.

  3. Yes, it will.

Solution 6 - C++

In addition - to ensure a minimum of allocated memory:

void MyClass::OnImageReceived(unsigned char *pPixels, unsigned int uPixelCount)
{
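    // swap is called on the temporary, because a temporary cannot bind to
    // the non-const reference parameter of swap
    std::vector<unsigned char>(pPixels, pPixels + uPixelCount)
        .swap(m_pImageBuffer);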
    // ... process image etc. ...
}

vector::assign does not change the amount of memory allocated, if the capacity is bigger than the amount needed:

> Effects:
> erase(begin(), end());
> insert(begin(), first, last);
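
The difference can be observed with a small experiment. This is a sketch under assumptions: the names Capacities and compareCapacities are hypothetical, and the exact capacities are implementation-specific, so only lower bounds are meaningful:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Capacities {
    std::size_t afterAssign;  // capacity after vector::assign
    std::size_t afterSwap;    // capacity after swapping with a fresh vector
};

// Hypothetical experiment: start from a buffer of oldCount bytes, then store
// a smaller frame of newCount bytes in two different ways.
inline Capacities compareCapacities(std::size_t oldCount, std::size_t newCount) {
    assert(newCount <= oldCount);
    const std::vector<unsigned char> frame(newCount, 0x7F);

    std::vector<unsigned char> assigned(oldCount);
    assigned.assign(frame.begin(), frame.end());   // the old allocation is reused

    std::vector<unsigned char> swapped(oldCount);
    // The temporary is sized for the frame; swap hands its storage to swapped
    // and the old, larger allocation is freed with the temporary.
    std::vector<unsigned char>(frame.begin(), frame.end()).swap(swapped);

    Capacities result = { assigned.capacity(), swapped.capacity() };
    return result;
}
```

On the mainstream implementations, compareCapacities(1000, 10) should report an afterAssign of at least 1000 and a much smaller afterSwap.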

Solution 7 - C++

Please, consider this:

void MyClass::OnImageReceived(unsigned char *pPixels, unsigned int uPixelCount)
{
    // called when a new image is available
    if (m_pImageBuffer.size() != uPixelCount) // maybe just <  ??
    {
        std::vector<unsigned char> temp;
        temp.reserve(uPixelCount);        // no initialize
        m_pImageBuffer.swap(temp) ;       // no copy old data
    }

    m_pImageBuffer.assign(pPixels, pPixels + uPixelCount);  // no reallocate

    // ... process image etc. ...
}

My point is that if you have a big picture and need a slightly bigger one, reserve and/or resize will copy the old picture into the newly allocated memory, initialize the excess memory, and then everything is overwritten with the new picture. You could assign directly, but then you would not be able to use what you know about the new size to avoid possible reallocations (though maybe the implementation of assign is already optimized for this simple case?).

Solution 8 - C++

It depends. If you access the data only through iterators and the [] operator, then it's okay to use a vector.

If you have to pass a pointer to functions that expect a buffer of, e.g., bytes, then in my opinion it is not. In that case you should use something like

std::unique_ptr<unsigned char[]> buf(new unsigned char[size]);

It is as safe as a vector, but unlike a vector it gives you maximum control of the buffer. A vector may reallocate its buffer, or during a method/function call you may unintentionally make a copy of your whole vector. That is an easily made mistake.

The rule (for me) is: if you have a vector, use it like a vector. If you need a memory buffer, use a memory buffer.

As pointed out in a comment, the vector has a data method. This is C++. The freedom to use a vector as a raw buffer does not mean that you should use it as a raw buffer. In my humble opinion, the intention of a vector was to have a type-safe buffer with a type-safe access system. For compatibility you can use the internal buffer for calls. The intention was not to use the vector as a smart-pointer buffer container. For that, I use the pointer templates, signaling to other users of my code that I use this buffer in a raw way. If I use vectors, I use them in the way they are intended, not in all the ways they make possible.

As I got some blame here for my opinion (not recommendation), I want to add some words about the actual problem the OP described.

If he always expects the same picture size, he should, in my opinion, use a unique_ptr, because that is how he is using the buffer. Using

 m_pImageBuffer.resize(uPixelCount, 0);

zeros the buffer before he copies pPixels into it, an unnecessary time penalty.

If the pictures he is expecting are of different sizes, he should, in my opinion, not use a vector, for the following reason. Especially in his code:

// called when a new image is available
if (m_pImageBuffer.size() != uPixelCount)
{
    // resize image buffer
    m_pImageBuffer.reserve(uPixelCount);
    m_pImageBuffer.resize(uPixelCount, 0);
}

he will resize the vector, which in fact means a malloc and a copy for as long as the images keep getting bigger. In my experience, a realloc always leads to a malloc and a copy.

That is the reason why, especially in this situation, I recommend the use of a unique_ptr instead of a vector.
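
A minimal sketch of that approach for frames of varying size follows. The class FrameBuffer and its member names are hypothetical; it allocates a new block only when a frame is larger than anything seen so far, and it never copies the old frame or zeroes the new block:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

// Hypothetical frame buffer built on unique_ptr<unsigned char[]>.
class FrameBuffer {
public:
    FrameBuffer() : m_capacity(0), m_size(0) {}

    void store(const unsigned char* pPixels, std::size_t pixelCount) {
        assert(pPixels != nullptr || pixelCount == 0);
        if (pixelCount > m_capacity) {
            // Allocate a new block; the old frame is dropped, never copied,
            // and the new bytes are not zeroed before being overwritten.
            m_data.reset(new unsigned char[pixelCount]);
            m_capacity = pixelCount;
        }
        if (pixelCount != 0) {
            std::memcpy(m_data.get(), pPixels, pixelCount);
        }
        m_size = pixelCount;
    }

    const unsigned char* data() const { return m_data.get(); }
    std::size_t size() const { return m_size; }
    std::size_t capacity() const { return m_capacity; }

private:
    std::unique_ptr<unsigned char[]> m_data;
    std::size_t m_capacity;
    std::size_t m_size;
};
```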

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Roger Rowland | View Question on Stackoverflow
Solution 1 - C++ | Sneftel | View Answer on Stackoverflow
Solution 2 - C++ | mfontanini | View Answer on Stackoverflow
Solution 3 - C++ | John Dibling | View Answer on Stackoverflow
Solution 4 - C++ | Steve Broberg | View Answer on Stackoverflow
Solution 5 - C++ | Ivan Ishchenko | View Answer on Stackoverflow
Solution 6 - C++ | user2249683 | View Answer on Stackoverflow
Solution 7 - C++ | qPCR4vir | View Answer on Stackoverflow
Solution 8 - C++ | Martin Schlott | View Answer on Stackoverflow