Refactoring with C++ 11
C++C++11RefactoringC++ Problem Overview
Given the new toolset provided by c++ lots of programmers, aiming at code simplification, expressiveness, efficiency, skim through their old code and make tweaks (some pointless, some successful) to achieve their goals. Whilst trying not to loose too much time on such labors and just make non intrusive and self contained changes, what are the best practices?
Let me cross out the obvious :
-
Use auto to run iterator based loops :
for (std::vector<foo>::const_iterator it(lala.begin()), ite(lala.end()); it != ite; ++it); // becomes for (auto it(lala.cbegin()), ite(lala.cend()); it != ite; ++it);
-
Use tie for multiple assignments that just produce C-style rows of code ( https://stackoverflow.com/questions/9269825/how-to-assign-multiple-values-into-a-struct-at-once/21513616#21513616 )
a = 1; b = 2; c = 3; d = 4; e = 5; // becomes std::tie(a, b, c, d, e) = std::make_tuple(1, 2, 3, 4, 5);
-
To make a class non inheritable just declare it as "final" and delete the code that achieved such a behavior http://www.parashift.com/c++-faq/final-classes.html
-
Use the delete keyword to explicitly hide constructors/destructors instead of declaring them private (eg code to create heap based objects, non copyable objects etc)
-
Turn trivial functors created just to facillitate the execution of a single STL algorithm into lambda functions (apart from reducing code cluttering you'll have guaranteed inlined calls)
-
Simplify RAII wrapping of an object by just using a smart pointer
-
Get rid of bind1st, bind2nd and just use bind
-
Replace hand written code for type traits (Is_ptr_but_dont_call_for_const_ptrs<> and such :) ) with standard code provided by < type_traits >
-
Stop including boost headers for functionallity now implented in STL (BOOST_STATIC_ASSERT vs static_assert)
-
Provide move semantics to classes (although this wouldn't qualify as a dirty/quick/easy change)
-
Use nullptr where possible instead of the NULL macro and get rid of the code that filled containers of pointers with 0's casted to object type
std::vector<foo*> f(23); for (std::size_t i(0); i < 23; ++i) { f[i] = static_cast<foo*>(0); } // becomes std::vector<foo*> f(23, nullptr);
-
Clear the vector data accessing syntax
std::vector<int> vec; &vec[0]; // access data as a C-style array vec.data(); // new way of saying the above
-
Replace throw() with noexcept (apart from avoiding the deprecated exception specifiation you get some speed benefits http://channel9.msdn.com/Events/GoingNative/2013/An-Effective-Cpp11-14-Sampler @ 00.29.42)
void some_func() noexcept; // more optimization options void some_func() throw(); // fewer optimization options void some_func() ; // fewer optimization options
-
Replace code where you'd push a tempory in a container and hoped that the optimizer would ellide the copy away, with an "emplace" function where available, in order to perfectly forward the argument and construct directly an object into a container without temporary at all.
vecOfPoints.push_back(Point(x,y,z)); // so '03 vecOfPoints.emplace_back(x, y, z); // no copy or move operations performed
UPDATE
The answer by Shafik Yaghmour was rightfully awarded the bounty for having the greatest acceptance by the audience.
The answer by R Sahu was my accepted one, because the combination of features it proposes captures the spirit of refactoring : making code clearer and cleaner and simpler and elegant.
C++ Solutions
Solution 1 - C++
1. Replacing rand
One of the big gains in C++11 has to be replacing the use of rand()
with all the options available in the random header. Replacing rand()
in many cases should be straight forward.
Stephan T. Lavavej probably made this point the strongest with his presentation rand() Considered Harmful. The examples show a uniform integer distribution from [0,10]
using rand()
:
#include <cstdlib>
#include <iostream>
#include <ctime>
int main()
{
srand(time(0)) ;
for (int n = 0; n < 10; ++n)
{
std::cout << (rand() / (RAND_MAX / (10 + 1) + 1)) << ", " ;
}
std::cout << std::endl ;
}
and using std::uniform_int_distrubution:
#include <iostream>
#include <random>
int main()
{
std::random_device rd;
std::mt19937 e2(rd());
std::uniform_int_distribution<> dist(0, 10);
for (int n = 0; n < 10; ++n) {
std::cout << dist(e2) << ", " ;
}
std::cout << std::endl ;
}
Along with this should be moving from std::random_shuffle to std::shuffle which comes out of the effort to Deprecate rand and Friends. This was recently covered in the SO question Why are std::shuffle methods being deprecated in C++14?.
Note that the distributions are not guaranteed to be consistent across platforms.
2. Using std::to_string instead of std::ostringstream or sprintf
C++11 provides std::to_string which can be used to convert numerics to std::string it would produce the content as the equivalent std::sprintf. Most likely this would be used in place of either std::ostringstream or snprintf
. This is more of a convenience, there is probably not much of performance difference and we can see from the Fast integer to string conversion in C++ article there are probably much faster alternatives available if performance is the main concern:
#include <iostream>
#include <sstream>
#include <string>
int main()
{
std::ostringstream mystream;
mystream << 100 ;
std::string s = mystream.str();
std::cout << s << std::endl ;
char buff[12] = {0};
sprintf(buff, "%d", 100);
std::string s2( buff ) ;
std::cout << s2 << std::endl ;
std::cout << std::to_string( 100 ) << std::endl ;
}
3. Using constexpr in place of template meta-programming
If you are dealing with literals there may be cases where using constexpr functions over template meta-programming may produce code that is more clear and possibly compiles faster. The article Want speed? Use constexpr meta-programming! provides an example of prime number determination using template meta-programming:
struct false_type
{
typedef false_type type;
enum { value = 0 };
};
struct true_type
{
typedef true_type type;
enum { value = 1 };
};
template<bool condition, class T, class U>
struct if_
{
typedef U type;
};
template <class T, class U>
struct if_<true, T, U>
{
typedef T type;
};
template<size_t N, size_t c>
struct is_prime_impl
{
typedef typename if_<(c*c > N),
true_type,
typename if_<(N % c == 0),
false_type,
is_prime_impl<N, c+1> >::type >::type type;
enum { value = type::value };
};
template<size_t N>
struct is_prime
{
enum { value = is_prime_impl<N, 2>::type::value };
};
template <>
struct is_prime<0>
{
enum { value = 0 };
};
template <>
struct is_prime<1>
{
enum { value = 0 };
};
and using constexpr functions:
constexpr bool is_prime_recursive(size_t number, size_t c)
{
return (c*c > number) ? true :
(number % c == 0) ? false :
is_prime_recursive(number, c+1);
}
constexpr bool is_prime_func(size_t number)
{
return (number <= 1) ? false : is_prime_recursive(number, 2);
}
The constexpr version is much shorter, easier to understand and apparently performs much better than the template meta-programming implementation.
4. Using class member initialization to provide default values
As was recently covered in Has the new C++11 member initialization feature at declaration made initialization lists obsolete? class member initialization can be used to provide default values and can simplify cases where a class has multiple constructors.
Bjarne Stroustrup provides a good example in the C++11 FAQ, he says:
>This saves a bit of typing, but the real benefits come in classes with multiple constructors. Often, all constructors use a common initializer for a member:
and provides an example of members which have a common initializer:
class A {
public:
A(): a(7), b(5), hash_algorithm("MD5"), s("Constructor run") {}
A(int a_val) : a(a_val), b(5), hash_algorithm("MD5"), s("Constructor run") {}
A(D d) : a(7), b(g(d)), hash_algorithm("MD5"), s("Constructor run") {}
int a, b;
private:
HashingFunction hash_algorithm; // Cryptographic hash to be applied to all A instances
std::string s; // String indicating state in object lifecycle
};
and says:
>The fact that hash_algorithm and s each has a single default is lost in the mess of code and could easily become a problem during maintenance. Instead, we can factor out the initialization of the data members:
class A {
public:
A(): a(7), b(5) {}
A(int a_val) : a(a_val), b(5) {}
A(D d) : a(7), b(g(d)) {}
int a, b;
private:
HashingFunction hash_algorithm{"MD5"}; // Cryptographic hash to be applied to all A instances
std::string s{"Constructor run"}; // String indicating state in object lifecycle
};
Note, that in C++11 a class using in class member initializers is no longer an aggregate although this restriction is removed in C++14.
5. Use fixed width integer types from cstdint instead of hand rolled typedefs
Since the C++11 standard uses C99 as a normative reference we get fixed width integer types, as well. For example:
int8_t
int16_t
int32_t
int64_t
intptr_t
Although several of them an optional, for the exact width integer types the following from C99 section 7.18.1.1
applies:
> These types are optional. However, if an implementation provides integer types with widths of 8, > 16, 32, or 64 bits, no padding bits, and (for the signed types) that > have a two’s complement representation, it shall define the > corresponding typedef names.
Solution 2 - C++
I would add delegating constructors and in-class member initializers to the list.
Simplification By Using Delegating Constructors and In-Class Initialization
With C++03:
class A
{
public:
// The default constructor as well as the copy constructor need to
// initialize some of the members almost the same and call init() to
// finish construction.
A(double data) : id_(0), name_(), data_(data) {init();}
A(A const& copy) : id_(0), name_(), data_(copy.data_) {init();}
void init()
{
id_ = getNextID();
name_ = getDefaultName();
}
int id_;
string name_;
double data_;
};
With C++11:
class A
{
public:
// With delegating constructor, the copy constructor can
// reuse this constructor and avoid repetitive code.
// In-line initialization takes care of initializing the members.
A(double data) : data_(data) {}
A(A const& copy) : A(copy.data_) {}
int id_ = getNextID();
string name_ = getDefaultName();
double data_;
};
Solution 3 - C++
For-each syntax:
std::vector<int> container;
for (auto const & i : container)
std::cout << i << std::endl;
Solution 4 - C++
-
Changing
std::map
tostd::unordered_map
andstd::set
tostd::unordered_set
where ever order of container's elements is irrelevant, enhances significantly the performance. -
Using
std::map::at
instead of using square bracket syntax insertion, when you want to avoid involuntary insertions. -
Use alias templates when you want to
typedef
templates. -
Use of initialization lists instead of for loops to initialize STL containers.
-
Replace fixed size C arrays with std::array.
Solution 5 - C++
Use the uniform initialization syntax for variable initialization
widget w(x); // old
widget w{x}; // new
to avoid problems like c++'s most vexing parse (the rest of the reasons why the new way is superior are explained in the linked article by Herb Sutter)
Solution 6 - C++
This blog post proposes the Rule of Zero if all ownerships into a class follow the RAII principle, allowing to get rid of the Rule of Three/Four/Five in C++11.
However, Scott Meyers shows here that not writing explicitly the destructor, copy/move constructors and assignment operators can induce subtle problems if you slightly change your code (say, for debugging). He then recommends to explicitly declare default (C++11 feature) these functions:
~MyClass() = default;
MyClass( const MyClass& ) = default;
MyClass( MyClass&& ) = default;
MyClass& operator=( const MyClass& ) = default;
MyClass& operator=( MyClass&& ) = default;
Solution 7 - C++
Feature: std::move
"Express clear difference between the copying and moving the resources"
std::string tmp("move");
std::vector<std::string> v;
v.push_back(std::move(tmp));
//At this point tmp still be the valid object but in unspecified state as
// its resources has been moved and now stored in vector container.
Solution 8 - C++
- Prefer scoped enums to unscoped enums
-
In C++98 the enums, there is no scoped for enums like the following code snippet. The names of such enumerators belong to the scope containing enum, namely nothing else in that scope may have the same name.
enum Color{ blue, green, yellow }; bool blue = false; // error: 'blue' redefinition
However, in C++11, the
scoped enums
can fix this issue.scoped enum
are declared varenum class
.enum class Color{ blue, green, yellow }; bool blue = false; // fine, no other `blue` in scope Color cc = blue; // error! no enumerator `blue` in this scope Color cc = Color::blue; // fine auto c = Color::blue; // fine
-
The enumerators of
scope enums
are more strongly typed. But, the enumerators ofunscoped enums
implicitly convert to other typesenum Color{ blue, green, yellow }; std::vector<std::size_t> getVector(std::size_t x); Color c = blue; if (c < 10.1) { // compare Color with double !! auto vec = getVector(c); // could be fine !! }
However,
scoped enums
will be failed in this case.enum class Color{ blue, green, yellow }; std::vector<std::size_t> getVector(std::size_t x); Color c = Color::blue; if (c < 10.1) { // error ! auto vec = getVector(c); // error !! }
Fix it through
static_cast
if (static_cast<double>(c) < 10.1) { auto vec = getVector(static_cast<std::size_t>(c)); }
-
unscoped enums
may be forward-declared.enum Color; // error!! enum class Color; // fine
-
Both
scoped
andunscoped
enums support specification of the underlying type. The default underlying type forscoped enums
isint
.Unscoped enums
have no default underlying type.
- Using Concurrency API
-
Prefer task-based to thread-based
If you want to run a function
doAsyncWork
asynchronously, you have two basic choices. One is thread-basedint doAsyncWork(); std::thread t(doAsyncWork);
The other is task-based.
auto fut = std::async(doAsyncWork);
Obviously, we can get the return value of
doAsyncWork
through task-based more easily than thread-based. With thetask-based
approach, it’s easy, because the future returned fromstd::async
offers the get function. Theget
function is even more important ifdoAsyncWork
emits an exception, becauseget
provides access to that, too. -
Thread-based
calls for manual management of thread exhaustion, oversubscription, load balancing, and adaptation to new platforms. ButTask-based
viastd::async
with the default launch policy suffers from none of these drawbacks.
Here are several links:
C/C++ Programming Abstractions for Parallelism and Concurrency
Solution 9 - C++
Optimize simple mathematical functions with constexpr, especially if they are called inside inner loops. This would allow the compiler to calculate them at compilation saving you time
Example
constexpr int fibonacci(int i) {
return i==0 ? 0 : (i==1 ? 1 : fibonacci(i-1) + fibonacci(i-2));
}
Another example is to use std::enable_if
to limit the allowed template parameters types in a particular template function/class. This would make your code safer (in case you haven't use SFINAE to constrains the possible template arguments in your old code) when you implicit assume some property about the template types and it is just one extra line of code
example:
template
<
typename T,
std::enable_if< std::is_abstract<T>::value == false, bool>::type = false // extra line
>
void f(T t)
{
// do something that depends on the fact that std::is_abstract<T>::value == false
}
Update 1: If you have a small array where the size is known at compile time and you you want to avoid the overhead of the heap allocation in std::vector (meaning: you want the array on the stack), you only choice in C++03 was to use c-style arrays. Change that to std::array
. It is a simple change that provides you a lot of the functionally present in std::vector + stack allocation (much faster than heap allocation as I said before).
Solution 10 - C++
Use smart pointers. Notice that there are still good reason to have naked pointers in some cases, the best way to check if a pointer should be smart is to look for the uses of delete
on it.
There should be no reason to use new
either. Replace every new
with make_shared
or make_unique
.
Unfortunately make_unique
didn't make it in the C++11 standard, the best solution IMO is to implement it yourself(see previous link), and put some macros to check for the __cplusplus
version (make_unique
is available in C++14).
Using make_unique
and make_shared
is really important in order to make your code exception safe.
Solution 11 - C++
Use of override keyword
Mark virtual functions in derived classes as override (if they indeed override of course). This can protect against introducing bugs in the future, e.g. by changing the signature of a virtual function in a base class and forgetting to change the signature in all derived classes accordingly.