Why is it OK to return a 'vector' from a function?

C++VectorStlScopeStandard Library

C++ Problem Overview


Please consider this code. I have seen this type of code several times. words is a local vector. How is it possible to return it from a function?

Can we guarantee it will not die?

 std::vector<std::string> read_file(const std::string& path)
 {
    std::ifstream file("E:\\names.txt");

    if (!file.is_open())
    {
        std::cerr << "Unable to open file" << "\n";
        std::exit(-1);
    }

    std::vector<string> words;//this vector will be returned
    std::string token;

    while (std::getline(file, token, ','))
    {
        words.push_back(token);
    }

    return words;
}

C++ Solutions


Solution 1 - C++

Pre C++11:

The function will not return the local variable, but rather a copy of it. Your compiler might however perform an optimization where no actual copy action is made.

See this question & answer for further details.

C++11:

The function will move the value. See this answer for further details.

Solution 2 - C++

> Can we guarantee it will not die?

As long there is no reference returned, it's perfectly fine to do so. words will be moved to the variable receiving the result.

The local variable will go out of scope. after it was moved (or copied).

Solution 3 - C++

I think you are referring to the problem in C (and C++) that returning an array from a function isn't allowed (or at least won't work as expected) - this is because the array return will (if you write it in the simple form) return a pointer to the actual array on the stack, which is then promptly removed when the function returns.

But in this case, it works, because the std::vector is a class, and classes, like structs, can (and will) be copied to the callers context. [Actually, most compilers will optimise out this particular type of copy using something called "Return Value Optimisation", specifically introduced to avoid copying large objects when they are returned from a function, but that's an optimisation, and from a programmers perspective, it will behave as if the assignment constructor was called for the object]

As long as you don't return a pointer or a reference to something that is within the function returning, you are fine.

Solution 4 - C++

To well understand the behaviour, you can run this code:

#include <iostream>

class MyClass
{
  public:
    MyClass() { std::cout << "run constructor MyClass::MyClass()" << std::endl; }
    ~MyClass() { std::cout << "run destructor MyClass::~MyClass()" << std::endl; }
    MyClass(const MyClass& x) { std::cout << "run copy constructor MyClass::MyClass(const MyClass&)" << std::endl; }
    MyClass& operator = (const MyClass& x) { std::cout << "run assignation MyClass::operator=(const MyClass&)" << std::endl; }
};

MyClass my_function()
{
  std::cout << "run my_function()" << std::endl;
  MyClass a;
  std::cout << "my_function is going to return a..." << std::endl;
  return a;
}

int main(int argc, char** argv)
{
  MyClass b = my_function();

  MyClass c;
  c = my_function();

  return 0;
}

The output is the following:

run my_function()
run constructor MyClass::MyClass()
my_function is going to return a...
run constructor MyClass::MyClass()
run my_function()
run constructor MyClass::MyClass()
my_function is going to return a...
run assignation MyClass::operator=(const MyClass&)
run destructor MyClass::~MyClass()
run destructor MyClass::~MyClass()
run destructor MyClass::~MyClass()

Note that this example was provided in C++03 context, it could be improved for C++ >= 11

Solution 5 - C++

I do not agree and do not recommend to return a vector:

vector <double> vectorial(vector <double> a, vector <double> b)
{
	vector <double> c{ a[1] * b[2] - b[1] * a[2], -a[0] * b[2] + b[0] * a[2], a[0] * b[1] - b[0] * a[1] };
	return c;
}

This is much faster:

void vectorial(vector <double> a, vector <double> b, vector <double> &c)
{
	c[0] = a[1] * b[2] - b[1] * a[2]; c[1] = -a[0] * b[2] + b[0] * a[2]; c[2] = a[0] * b[1] - b[0] * a[1];
}

I tested on Visual Studio 2017 with the following results in release mode:

8.01 MOPs by reference
5.09 MOPs returning vector

In debug mode, things are much worse:

0.053 MOPS by reference
0.034 MOPs by return vector

Solution 6 - C++

This is actually a failure of design. You shouldn't be using a return value for anything not a primitive for anything that is not relatively trivial.

The ideal solution should be implemented through a return parameter with a decision on reference/pointer and the proper use of a "const'y'ness" as a descriptor.

On top of this, you should realise that the label on an array in C and C++ is effectively a pointer and its subscription are effectively an offset or an addition symbol.

So the label or ptr array_ptr === array label thus returning foo[offset] is really saying return element at memory pointer location foo + offset of type return type.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionPranit KothariView Question on Stackoverflow
Solution 1 - C++Tim MeyerView Answer on Stackoverflow
Solution 2 - C++πάντα ῥεῖView Answer on Stackoverflow
Solution 3 - C++Mats PeterssonView Answer on Stackoverflow
Solution 4 - C++CaduchonView Answer on Stackoverflow
Solution 5 - C++mathengineerView Answer on Stackoverflow
Solution 6 - C++NewbstarrView Answer on Stackoverflow