How is vector implemented in C++

C++Data Structures

C++ Problem Overview


I am thinking of how I can implement std::vector from the ground up.

How does it resize the vector?

realloc only seems to work for plain old stucts, or am I wrong?

C++ Solutions


Solution 1 - C++

it is a simple templated class which wraps a native array. It does not use malloc/realloc. Instead, it uses the passed allocator (which by default is std::allocator).

Resizing is done by allocating a new array and copy constructing each element in the new array from the old one (this way it is safe for non-POD objects). To avoid frequent allocations, often they follow a non-linear growth pattern.

UPDATE: in C++11, the elements will be moved instead of copy constructed if it is possible for the stored type.

In addition to this, it will need to store the current "size" and "capacity". Size is how many elements are actually in the vector. Capacity is how many could be in the vector.

So as a starting point a vector will need to look somewhat like this:

template <class T, class A = std::allocator<T> >
class vector {
public:
    // public member functions
private:
    T*                    data_;
    typename A::size_type capacity_;
    typename A::size_type size_;
    A                     allocator_;
};

The other common implementation is to store pointers to the different parts of the array. This cheapens the cost of end() (which no longer needs an addition) ever so slightly at the expense of a marginally more expensive size() call (which now needs a subtraction). In which case it could look like this:

template <class T, class A = std::allocator<T> >
class vector {
public:
    // public member functions
private:
    T* data_;         // points to first element
    T* end_capacity_; // points to one past internal storage
    T* end_;          // points to one past last element
    A  allocator_;
};

I believe gcc's libstdc++ uses the latter approach, but both approaches are equally valid and conforming.

NOTE: This is ignoring a common optimization where the empty base class optimization is used for the allocator. I think that is a quality of implementation detail, and not a matter of correctness.

Solution 2 - C++

From Wikipedia, as good an answer as any.

> A typical vector implementation consists, internally, of a pointer to > a dynamically allocated array,[2] and possibly data members holding > the capacity and size of the vector. The size of the vector refers to > the actual number of elements, while the capacity refers to the size > of the internal array. When new elements are inserted, if the new size > of the vector becomes larger than its capacity, reallocation > occurs.[2][4] This typically causes the vector to allocate a new > region of storage, move the previously held elements to the new region > of storage, and free the old region. Because the addresses of the > elements change during this process, any references or iterators to > elements in the vector become invalidated.[5] Using an invalidated > reference causes undefined behaviour

Solution 3 - C++

Resizing the vector requires allocating a new chunk of space, and copying the existing data to the new space (thus, the requirement that items placed into a vector can be copied).

Note that it does not use new [] either -- it uses the allocator that's passed, but that's required to allocate raw memory, not an array of objects like new [] does. You then need to use placement new to construct objects in place. [Edit: well, you could technically use new char[size], and use that as raw memory, but I can't quite imagine anybody writing an allocator like that.]

When the current allocation is exhausted and a new block of memory needs to be allocated, the size must be increased by a constant factor compared to the old size to meet the requirement for amortized constant complexity for push_back. Though many web sites (and such) call this doubling the size, a factor around 1.5 to 1.6 usually works better. In particular, this generally improves chances of re-using freed blocks for future allocations.

Solution 4 - C++

It allocates a new array and copies everything over. So, expanding it is quite inefficient if you have to do it often. Use reserve() if you have to use push_back().

Solution 5 - C++

Solution 6 - C++

You'd need to define what you mean by "plain old structs."

realloc by itself only creates a block of uninitialized memory. It does no object allocation. For C structs, this suffices, but for C++ it does not.

That's not to say you couldn't use realloc. But if you were to use it (note you wouldn't be reimplementing std::vector exactly in this case!), you'd need to:

  1. Make sure you're consistently using malloc/realloc/free throughout your class.
  2. Use "placement new" to initialize objects in your memory chunk.
  3. Explicitly call destructors to clean up objects before freeing your memory chunk.

This is actually pretty close to what vector does in my implementation (GCC/glib), except it uses the C++ low-level routines ::operator new and ::operator delete to do the raw memory management instead of malloc and free, rewrites the realloc routine using these primitives, and delegates all of this behavior to an allocator object that can be replaced with a custom implementation.

Since vector is a template, you actually should have its source to look at if you want a reference – if you can get past the preponderance of underscores, it shouldn't be too hard to read. If you're on a Unix box using GCC, try looking for /usr/include/c++/version/vector or thereabouts.

Solution 7 - C++

You can implement them with resizing array implementation. When the array becomes full, create an array with twice as much the size and copy all the content to the new array. Do not forget to delete the old array.

As for deleting the elements from vector, do resizing when your array becomes a quarter full. This strategy makes prevents any performance glitches when one might try repeated insertion and deletion at half the array size.

It can be mathematically proved that the amortized time (Average time) for insertions is still linear for n insertions which is asymptotically the same as you will get with a normal static array.

Solution 8 - C++

	///Implement Vector class
	class MyVector {
		int *int_arr;
		int capacity;
		int current;
	public:
		MyVector() {
			int_arr = new int[1];
			capacity = 1;
			current = 0;
		}
		void Push(int nData);
		void PushData(int nData, int index);
		void PopData();
		int  GetData(int index);
		int  GetSize();
		void Print();
	};

	void MyVector::Push(int data)
	{
		if (current == capacity){
			int *temp = new int[2 * capacity];
			for (int i = 0; i < capacity; i++)
			{
				temp[i] = int_arr[i];
			}

			delete[] int_arr;
			capacity *= 2;

			int_arr = temp;
		}
		int_arr[current] = data;
		current++;
	}
	void MyVector::PushData(int data, int index)
	{
		if (index == capacity){
			Push(index);
		}
		else
			int_arr[index] = data;
	}
	void MyVector::PopData(){
		current--;
	}

	int MyVector::GetData(int index)
	{
		if (index < current){
			return int_arr[index];
		}
	}

	int MyVector::GetSize()
	{
		return current;
	}

	void MyVector::Print()
	{
		for (int i = 0; i < current; i++) {
			cout << int_arr[i] << " ";
		}
		cout << endl;
	}
	
	int main()
	{
		MyVector vect;
		vect.Push(10);
		vect.Push(20);
		vect.Push(30);
		vect.Push(40);

		vect.Print();

		std::cout << "\nTop item is "
			<< vect.GetData(3) << std::endl;

		vect.PopData();
		vect.Print();

		cout << "\nTop item is "
			<< vect.GetData(1) << endl;
		return 0;
	}

Solution 9 - C++

realloc only works on heap memory. In C++ you usually want to use the free store.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionJohn SmithView Question on Stackoverflow
Solution 1 - C++Evan TeranView Answer on Stackoverflow
Solution 2 - C++SerapthView Answer on Stackoverflow
Solution 3 - C++Jerry CoffinView Answer on Stackoverflow
Solution 4 - C++MackeView Answer on Stackoverflow
Solution 5 - C++log0View Answer on Stackoverflow
Solution 6 - C++Owen S.View Answer on Stackoverflow
Solution 7 - C++Manish Chandra JoshiView Answer on Stackoverflow
Solution 8 - C++Dattatraya MengudaleView Answer on Stackoverflow
Solution 9 - C++Edward StrangeView Answer on Stackoverflow