Calculate rolling / moving average in C++

C++BoostMoving Average

C++ Problem Overview


I know this is achievable with boost as per:

https://stackoverflow.com/questions/5195990/using-boostaccumulators-how-can-i-reset-a-rolling-window-size-does-it-keep-e

But I really would like to avoid using boost. I have googled and not found any suitable or readable examples.

Basically I want to track the moving average of an ongoing stream of a stream of floating point numbers using the most recent 1000 numbers as a data sample.

What is the easiest way to achieve this?


I experimented with using a circular array, exponential moving average and a more simple moving average and found that the results from the circular array suited my needs best.

C++ Solutions


Solution 1 - C++

If your needs are simple, you might just try using an exponential moving average.

http://en.wikipedia.org/wiki/Moving_average#Exponential_moving_average

Put simply, you make an accumulator variable, and as your code looks at each sample, the code updates the accumulator with the new value. You pick a constant "alpha" that is between 0 and 1, and compute this:

accumulator = (alpha * new_value) + (1.0 - alpha) * accumulator

You just need to find a value of "alpha" where the effect of a given sample only lasts for about 1000 samples.

Hmm, I'm not actually sure this is suitable for you, now that I've put it here. The problem is that 1000 is a pretty long window for an exponential moving average; I'm not sure there is an alpha that would spread the average over the last 1000 numbers, without underflow in the floating point calculation. But if you wanted a smaller average, like 30 numbers or so, this is a very easy and fast way to do it.

Solution 2 - C++

You simply need a circular array (circular buffer) of 1000 elements, where you add the element to the previous element and store it.

It becomes an increasing sum, where you can always get the sum between any two pairs of elements, and divide by the number of elements between them, to yield the average.

Solution 3 - C++

> Basically I want to track the moving average of an ongoing stream of a stream of floating point numbers using the most recent 1000 numbers as a data sample.

Note that the below updates the total_ as elements as added/replaced, avoiding costly O(N) traversal to calculate the sum - needed for the average - on demand.

template <typename T, typename Total, size_t N>
class Moving_Average
{
  public:
    Moving_Average& operator()(T sample)
    {
        total_ += sample;
        if (num_samples_ < N)
            samples_[num_samples_++] = sample;
        else
        {
            T& oldest = samples_[num_samples_++ % N];
            total_ -= oldest;
            oldest = sample;
        }
        return *this;
    }

    operator double() const { return total_ / std::min(num_samples_, N); }

  private:
    T samples_[N];
    size_t num_samples_{0};
    Total total_{0};
};

Examples:

// average of last 3 (from 4) samples...
std::cout << Moving_Average<double, double, 3>{}(4)(7)(2)(6) << '\n';
    // "5\n"

// average of last 3 squares...
Moving_Average<double, double, 3> ma;
for (int i = 0; i < 10; ++i)
    std::cout << (i * i) << ':' << ma(i * i) << ' ';
std::cout << '\n';
    // 0:0 1:0.5 4:1.66667 9:4.66667 16:9.66667 25:16.6667 36:25.6667 49:36.6667 64:49.6667 81:64.6667

Total is made a different parameter from T to support e.g. using a long long when totalling 1000 longs, an int for chars, or a double to total floats.

Issues

This is a bit flawed in that num_samples_ could conceptually wrap back to 0, but it's hard to imagine anyone having 2^64 samples: if concerned, use an extra bool data member to record when the container is first filled while cycling num_samples_ around the array (best then renamed something innocuous like "pos").

Another issue is inherent with floating point precision, and can be illustrated with a simple scenario for T=double, N=2: we start with total_ = 0, then inject samples {1E17, 1, 2}...

  • 1E17, we execute total_ += 1E17, so total_ == 1E17, then inject

  • 1, we execute total += 1, but total_ == 1E17 still, as the "1" is too insignificant to change the 64-bit double representation of a number as large as 1E17, then we inject

  • 2, we execute total += 2 - 1E17, in which 2 - 1E17 is evaluated first and yields -1E17 as the 2 is lost to imprecision/insignificance, so to our total of 1E17 we add -1E17 and total_ becomes 0, despite current samples of 1 and 2 for which we'd want total_ to be 3. Our moving average will calculate 0 instead of 1.5. As we add another sample, we'll subtract the "oldest" 1 from total_ despite it never having been properly incorporated therein; our total_ and moving averages are likely to remain wrong.

You could add code that stores the highest recent total_ and if the current total_ is too small a fraction of that (a template parameter could provide a multiplicative threshold), you recalculate the total_ from all the samples in the samples_ array (and set highest_recent_total_ to the new total_), but I'll leave that to the reader who cares sufficiently.

Solution 4 - C++

You can approximate a rolling average by applying a weighted average on your input stream.

template <unsigned N>
double approxRollingAverage (double avg, double input) {
    avg -= avg/N;
    avg += input/N;
    return avg;
}

This way, you don't need to maintain 1000 buckets. However, it is an approximation, so it's value will not match exactly with a true rolling average.

Edit: Just noticed @steveha's post. This is equivalent to the exponential moving average, with the alpha being 1/N (I was taking N to be 1000 in this case to simulate 1000 buckets).

Solution 5 - C++

Simple class to calculate rolling average and also rolling standard deviation:

#define _stdev(cnt, sum, ssq) sqrt((((double)(cnt))*ssq-pow((double)(sum),2)) / ((double)(cnt)*((double)(cnt)-1)))

class moving_average {
private:
    boost::circular_buffer<int> *q;
    double sum;
    double ssq;
public:
    moving_average(int n)  {
        sum=0;
        ssq=0;
        q = new boost::circular_buffer<int>(n);
    }
    ~moving_average() {
        delete q;
    }
    void push(double v) {
        if (q->size() == q->capacity()) {
            double t=q->front();
            sum-=t;
            ssq-=t*t;
            q->pop_front();
        }
        q->push_back(v);
        sum+=v;
        ssq+=v*v;
    }
    double size() {
        return q->size();
    }
    double mean() {
        return sum/size();
    }
    double stdev() {
        return _stdev(size(), sum, ssq);
    }

};

Solution 6 - C++

One way can be to circularly store the values in the buffer array. and calculate average this way.

int j = (int) (counter % size);
buffer[j] = mostrecentvalue;
avg = (avg * size - buffer[j - 1 == -1 ? size - 1 : j - 1] + buffer[j]) / size;

counter++;

// buffer[j - 1 == -1 ? size - 1 : j - 1] is the oldest value stored

The whole thing runs in a loop where most recent value is dynamic.

Solution 7 - C++

I use this quite often in hard realtime systems that have fairly insane update rates (50kilosamples/sec) As a result I typically precompute the scalars.

To compute a moving average of N samples: scalar1 = 1/N; scalar2 = 1 - scalar1; // or (1 - 1/N) then:

Average = currentSample*scalar1 + Average*scalar2;

Example: Sliding average of 10 elements

double scalar1 = 1.0/10.0;  // 0.1
double scalar2 = 1.0 - scalar1; // 0.9
bool first_sample = true;
double average=0.0;
while(someCondition)
{
   double newSample = getSample();
   if(first_sample)
   {
    // everybody forgets the initial condition *sigh*
      average = newSample;
      first_sample = false;
   }
   else
   {
      average = (sample*scalar1) + (average*scalar2);
   }
 }
      

Note: this is just a practical implementation of the answer given by steveha above. Sometimes it's easier to understand a concrete example.

Solution 8 - C++

You could implement a ring buffer. Make an array of 1000 elements, and some fields to store the start and end indexes and total size. Then just store the last 1000 elements in the ring buffer, and recalculate the average as needed.

Solution 9 - C++

Incrementing on @Nilesh's answer (credit goes to him), you can:

  • keep track of the sum, no need to divide and then multiply every time, generating error
  • avoid if conditions using % operator

This is UNTESTED sample code to show the idea, it could also be wrapped into a class:

const unsigned int size=10; // ten elements buffer

unsigned int counterPosition=0;
unsigned int counterNum=0;

int buffer[size];
long sum=0;

void reset() {
    for(int i=0;i<size;i++) {
        buffer[i]=0;
    }
}

float addValue(int value) {
    unsigned  int oldPos = ((counterPosition + 1) % size);

    buffer[counterPosition] = value;
    sum = (sum - buffer[oldPos] + value); 

    counterPosition=(counterPosition+1) % size;
    if(counterNum<size) counterNum++;

    return ((float)sum)/(float)counterNum;
}

float removeValue() {
    unsigned  int oldPos =((counterPosition + 1) % size);

    buffer[counterPosition] = 0;
    sum = (sum - buffer[oldPos]); 

    if(counterNum>1) { // leave one last item at the end, forever
        counterPosition=(counterPosition+1) % size;
        counterNum--; // here the two counters are different
    }
    return ((float)sum)/(float)counterNum;
}

It should be noted that, if the buffer is reset to all zeroes, this method works fine while receiving the first values in as - buffer[oldPos] is zero and counter grows. First output is the first number received. Second output is the average of only the first two, and so on, fading in the values while they arrive until size items are reached.

It is also worth considering that this method, like any other for rolling average, is asymmetrical, if you stop at the end of the input array, because the same fading does not happen at the end (it can happen after the end of data, with the right calculations).

That is correct. The rolling average of 100 elements with a buffer of 10 gives different results: 10 fading in, 90 perfectly rolling 10 elements, and finally 10 fading out, giving a total of 110 results for 100 numbers fed in! It's your choice to decide which ones to show (and if it's better going the straight way, old to recent, or backwards, recent to old).

To fade out correctly after the end, you can go on adding zeroes one by one and reducing the count of items by one every time until you have reached size elements (still keeping track of correct position of old values).

Usage is like this:

int avg=0;
reset();

avg=addValue(2); // Rpeat for 100 times
avg=addValue(3); // Use avg value

...

avg=addValue(-4);
avg=addValue(12); // last numer, 100th input 

// If you want to fade out repeat 10 times after the end of data:

avg=removeValue(); // Rpeat for last 10 times after data has finished
avg=removeValue(); // Use avg value
...
avg=removeValue();
avg=removeValue();

Solution 10 - C++

I used a deque... seems to work for me. This example has a vector, but you could skip that aspect and simply add them to deque.

#include <deque>

template <typename T>
double mov_avg(vector<T> vec, int len){
  deque<T> dq = {};
  for(auto i = 0;i < vec.size();i++){
    if(i < len){
      dq.push_back(vec[i]);
    }
    else {
      dq.pop_front();
      dq.push_back(vec[i]);
    }
  }
  double cs = 0;
  for(auto i : dq){
    cs += i;
  }
  return cs / len;
}



//Skip the vector portion, track the input number (or size of deque), and the value.


  double len = 10;
  double val; //Accept as input
  double instance; //Increment each time input accepted.

  deque<double> dq;
  if(instance < len){
      dq.push_back(val);
  }
  else {
      dq.pop_front();
      dq.push_back(val);
    }
  }
  double cs = 0;
  for(auto i : dq){
    cs += i;
  }
  double rolling_avg = cs / len;

//To simplify further -- add values to this, then simply average the deque.

 int MAX_DQ = 3;
 void add_to_dq(deque<double> &dq, double value){
    if(dq.size() < MAX_DQ){
      dq.push_back(value);
    }else {
      dq.pop_front();
      dq.push_back(value);
    }
  }
 

Another sort of hack I use occasionally is using mod to overwrite values in a vector.

  vector<int> test_mod = {0,0,0,0,0};
  int write = 0;
  int LEN = 5;
  
  int instance = 0; //Filler for N -- of Nth Number added.
  int value = 0; //Filler for new number.

  write = instance % LEN;
  test_mod[write] = value;
  //Will write to 0, 1, 2, 3, 4, 0, 1, 2, 3, ...
  //Then average it for MA.

  //To test it...
  int write_idx = 0;
  int len = 5;
  int new_value;
  for(auto i=0;i<100;i++){
      cin >> new_value;
      write_idx = i % len;
      test_mod[write_idx] = new_value;

This last (hack) has no buckets, buffers, loops, nothing. Simply a vector that's overwritten. And it's 100% accurate (for avg / values in vector). Proper order is rarely maintained, as it starts rewriting backwards (at 0), so 5th index would be at 0 in example {5,1,2,3,4}, etc.

Solution 11 - C++

a simple moving average for 10 items, using a list:

#include <list>

std::list<float> listDeltaMA;

float getDeltaMovingAverage(float delta)
{
	listDeltaMA.push_back(delta);
	if (listDeltaMA.size() > 10) listDeltaMA.pop_front();
	float sum = 0;
	for (std::list<float>::iterator p = listDeltaMA.begin(); p != listDeltaMA.end(); ++p)
		sum += (float)*p;
	return sum / listDeltaMA.size();
}

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestiongojiView Question on Stackoverflow
Solution 1 - C++stevehaView Answer on Stackoverflow
Solution 2 - C++Karthik Kumar ViswanathanView Answer on Stackoverflow
Solution 3 - C++Tony DelroyView Answer on Stackoverflow
Solution 4 - C++jxhView Answer on Stackoverflow
Solution 5 - C++Erik AronestyView Answer on Stackoverflow
Solution 6 - C++NKJView Answer on Stackoverflow
Solution 7 - C++baumannView Answer on Stackoverflow
Solution 8 - C++TimView Answer on Stackoverflow
Solution 9 - C++FrancescoMMView Answer on Stackoverflow
Solution 10 - C++Zach OakesView Answer on Stackoverflow
Solution 11 - C++Pedro SoaresView Answer on Stackoverflow