Endless sine generation in C

C Performance Time Precision Trigonometry

C Problem Overview

I am working on a project which incorporates computing a sine wave as input for a control loop.

The sine wave has a frequency of 280 Hz, and the control loop runs every 30 µs and everything is written in C for an Arm Cortex-M7.

At the moment we are simply doing:

double time;
void control_loop() {
    time += 30e-6;
    double sine = sin(2 * M_PI * 280 * time);
    ...
}

Two problems/questions arise:

When running for a long time, time becomes bigger. Suddenly there is a point where the computation time for the sine function increases drastically (see image). Why is this? How are these functions usually implemented? Is there a way to circumvent this (without noticeable precision loss) as speed is a huge factor for us? We are using sin from math.h (Arm GCC).
How can I deal with time in general? When running for a long time, the variable time will inevitably reach the limits of double precision. Even using a counter time = counter++ * 30e-6; only improves this, but it does not solve it. As I am certainly not the first person who wants to generate a sine wave for a long time, there must be some ideas/papers/... on how to implement this fast and precise.

C Solutions

Solution 1 - C

Instead of calculating sine as a function of time, maintain a sine/cosine pair and advance it through complex number multiplication. This doesn't require any trigonometric functions or lookup tables; only four multiplies and an occasional re-normalization:

static const double a = 2 * M_PI * 280 * 30e-6;
static const double dx = cos(a);
static const double dy = sin(a);
double x = 1, y = 0; // complex x + iy
int counter = 0;

void control_loop() {
    double xx = dx*x - dy*y;
    double yy = dx*y + dy*x;
    x = xx, y = yy;

    // renormalize once in a while, based on
    // https://www.gamedev.net/forums/topic.asp?topic_id=278849
    if((counter++ & 0xff) == 0) {
        double d = 1 - (x*x + y*y - 1)/2;
        x *= d, y *= d;
    }

    double sine = y; // this is your sine
}

The frequency can be adjusted, if needed, by recomputing dx, dy.

Additionally, all the operations here can be done, rather easily, in fixed point.

Rationality

As @user3386109 points out below (+1), the 280 * 30e-6 = 21 / 2500 is a rational number, thus the sine should loop around after 2500 samples exactly. We can combine this method with theirs by resetting our generator (x=1,y=0) every 2500 iterations (or 5000, or 10000, etc...). This would eliminate the need for renormalization, as well as get rid of any long-term phase inaccuracies.

(Technically any floating point number is a diadic rational. However 280 * 30e-6 doesn't have an exact representation in binary. Yet, by resetting the generator as suggested, we'll get an exactly periodic sine as intended.)

Explanation

Some requested an explanation down in the comments of why this works. The simplest explanation is to use the angle sum trigonometric identities:

xx = cos((n+1)*a) = cos(n*a)*cos(a) - sin(n*a)*sin(a) = x*dx - y*dy
yy = sin((n+1)*a) = sin(n*a)*cos(a) + cos(n*a)*sin(a) = y*dx + x*dy

and the correctness follows by induction.

This is essentially the De Moivre's formula if we view those sine/cosine pairs as complex numbers, in accordance to Euler's formula.

A more insightful way might be to look at it geometrically. Complex multiplication by exp(ia) is equivalent to rotation by a radians. Therefore, by repeatedly multiplying by dx + idy = exp(ia), we incrementally rotate our starting point 1 + 0i along the unit circle. The y coordinate, according to Euler's formula again, is the sine of the current phase.

Normalization

While the phase continues to advance with each iteration, the magnitude (aka norm) of x + iy drifts away from 1 due to round-off errors. However we're interested in generating a sine of amplitude 1, thus we need to normalize x + iy to compensate for numeric drift. The straight forward way is, of course, to divide it by its own norm:

double d = 1/sqrt(x*x + y*y);
x *= d, y *= d;

This requires a calculation of a reciprocal square root. Even though we normalize only once every X iterations, it'd still be cool to avoid it. Fortunately |x + iy| is already close to 1, thus we only need a slight correction to keep it at bay. Expanding the expression for d around 1 (first order Taylor approximation), we get the formula that's in the code:

d = 1 - (x*x + y*y - 1)/2

TODO: to fully understand the validity of this approximation one needs to prove that it compensates for round-off errors faster than they accumulate -- and thus get a bound on how often it needs to be applied.

Solution 2 - C

The function can be rewritten as

double n;
void control_loop() {
    n += 1;
    double sine = sin(2 * M_PI * 280 * 30e-6 * n);
    ...
}

That does exactly the same thing as the code in the question, with exactly the same problems. But it can now be simplified:

280 * 30e-6 = 280 * 30 / 1000000 = 21 / 2500 = 8.4e-3

Which means that when n reaches 2500, you've output exactly 21 cycles of the sine wave. Which means that you can set n back to 0. The resulting code is:

int n;
void control_loop() {
    n += 1;
    if (n == 2500)
    	n = 0;
    double sine = sin(2 * M_PI * 8.4e-3 * n);
    ...
}

As long as your code can run for 21 cycles without problems, it'll run forever without problems.

Solution 3 - C

I'm rather shocked at the existing answers. The first problem you detect is easily solved, and the next problem magically disappears when you solve the first problem.

You need a basic understanding of math to see how it works. Recall, sin(x+2pi) is just sin(x), mathematically. The large increase in time you see happens when your sin(float) implementation switches to another algorithm, and you really want to avoid that.

Remember that float has only 6 significant digits. 100000.0f*M_PI+x uses those 6 digits for 100000.0f*M_PI, so there's nothing left for x.

So, the easiest solution is to keep track of x yourself. At t=0 you initialize x to 0.0f. Every 30 us, you increment x+= M_PI * 280 * 30e-06;. The time does not appear in this formula! Finally, if x>2*M_PI, you decrement x-=2*M_PI; (Since sin(x)==sin(x-2*pi)

You now have an x that stays nicely in the range 0 to 6.2834, where sin is fast and the 6 digits of precision are all useful.

Solution 4 - C

How to generate a lovely sine.

DAC is 12bits so you have only 4096 levels. It makes no sense to send more than 4096 samples per period. In real life you will need much less samples to generate a good quality waveform.

Create C file with the lookup table (using your PC). Redirect the output to the file (https://helpdeskgeek.com/how-to/redirect-output-from-command-line-to-text-file/).

#define STEP   ((2*M_PI) / 4096.0)

int main(void)
{
    double alpha = 0;

    printf("#include <stdint.h>\nconst uint16_t sine[4096] = {\n");
    for(int x = 0; x < 4096 / 16; x++)
    {
        for(int y = 0; y < 16; y++)
        {
            printf("%d, ", (int)(4095 * (sin(alpha) + 1.0) / 2.0));
            alpha += STEP;
        }
        printf("\n");
    }
    printf("};\n");
}

https://godbolt.org/z/e899d98oW

Configure the timer to trigger the overflow 4096*280=1146880 times per second. Set the timer to generate the DAC trigger event. For 180MHz timer clock it will not be precise and the frequency will be 279.906449045Hz. If you need better precision change the number of samples to match your timer frequency or/and change the timer clock frequency (H7 timers can run up to 480MHz)
Configure DAC to use DMA and transfer the value from the lookup table created in the step 1 to the DAC on the trigger event.
Enjoy beautiful sine wave using your oscilloscope. Note that your microcontroller core will not be loaded at all. You will have it for other tasks. If you want to change the period simple reconfigure the timer. You can do it as many times per second as you wish. To reconfigure the timer use timer DMA burst mode - which will reload PSC & ARR registers on the upddate event automatically not disturbing the generated waveform.

I know it is advanced STM32 programming and it will require register level programming. I use it to generate complex waveforms in our devices.

It is the correct way of doing it. No control loops, no calculations, no core load.

Solution 5 - C

I'd like to address the embedded programming issues in your code directly - @0___________'s answer is the correct way to do this on a microcontroller and I won't retread the same ground.

Variables representing time should never be floating point. If your increment is not a power of two, errors will always accumulate. Even if it is, eventually your increment will be smaller than the smallest increment and the timer will stop. Always use integers for time. You can pick an integer size big enough to ignore roll over - an unsigned 32 bit integer representing milliseconds will take 50 days to roll over, while an unsigned 64 bit integer will take over 500 million years.
Generating any periodic signal where you do not care about the signal's phase does not require a time variable. Instead, you can keep an internal counter which resets to 0 at the end of a period. (When you use DMA with a look-up table, that's exactly what you're doing - the counter is the DMA controller's next-read pointer.)
Whenever you use a transcendental function such as sine in a microcontroller, your first thought should be "can I use a look-up table for this?" You don't have access to the luxury of a modern operating system optimally shuffling your load around on a 4 GHz+ multi-core processor. You're often dealing with a single thread that will stall waiting for your 200 MHz microcontroller to bring the FPU out of standby and perform the approximation algorithm. There is a significant cost to transcendental functions. There's a cost to LUTs too, but if you're hitting the function constantly, there's a good chance you'll like the tradeoffs of the LUT a lot better.

Solution 6 - C

As noted in some of the comments, the time value is continually growing with time. This poses two problems:

The sin function likely has to perform a modulus internally to get the internal value into a supported range.
The resolution of time will become worse and worse as the value increases, due to adding on higher digits.

Making the following changes should improve the performance:

double time;
void control_loop() {
    time += 30.0e-6;
    if((1.0/280.0) < time)
    {
        time -= 1.0/280.0;
    }
    
    double sine = sin(2 * M_PI * 280 * time);
    ...
}

Note that once this change is made, you will no longer have a time variable.

Solution 7 - C

Use a look-up table. Your comment in the discussion with Eugene Sh.:

> A small deviation from the sine frequency (like 280.1Hz) would be ok.

In that case, with a control interval of 30 µs, if you have a table of 119 samples that you repeat over and over, you will get a sine wave of 280.112 Hz. Since you have a 12-bit DAC, you only need 119 * 2 = 238 bytes to store this if you would output it directly to the DAC. If you use it as input for further calculations like you mention in the comments, you can store it as float or double as desired. On an MCU with embedded static RAM, it only takes a few cycles at most to load from memory.

Solution 8 - C

If you have a few kilobytes of memory available, you can eliminate this problem completely with a lookup table.

With a sampling period of 30 µs, 2500 samples will have a total duration of 75 ms. This is exactly equal to the duration of 21 cycles at 280 Hz.

I haven't tested or compiled the following code, but it should at least demonstrate the approach:

double sin2500() {
  static double *table = NULL;
  static int n = 2499;
  if (!table) {
    table = malloc(2500 * sizeof(double));
    for (int i=0; i<2500; i++) table[i] = sin(2 * M_PI * 280 * i * 30e-06);
  }
  n = (n+1) % 2500;
  return table[n];
}

Solution 9 - C

How about a variant of others' modulo-based concept:

int t = 0;
int divisor = 1000000;
void control_loop() {
    t += 30 * 280;
    if (t > divisor) t -= divisor;
    double sine = sin(2 * M_PI * t / (double)divisor));
    ...
}

It calculates the modulo in integer then causes no roundoff errors.

Solution 10 - C

There is an alternative approach to calculating a series of values of sine (and cosine) for angles that increase by some very small amount. It essentially devolves down to calculating the X and Y coordinates of a circle, and then dividing the Y value by some constant to produce the sine, and dividing the X value by the same constant to produce the cosine.

If you are content to generate a "very round ellipse", you can use a following hack, which is attributed to Marvin Minsky in the 1960s. It's much faster than calculating sines and cosines, although it introduces a very small error into the series. Here is an extract from the Hakmem Document, Item 149. The Minsky circle algorithm is outlined.

ITEM 149 (Minsky): CIRCLE ALGORITHM Here is an elegant way to draw almost circles on a point-plotting display:

NEW X = OLD X - epsilon * OLD Y
NEW Y = OLD Y + epsilon * NEW(!) X

This makes a very round ellipse centered at the origin with its size determined by the initial point. epsilon determines the angular velocity of the circulating point, and slightly affects the eccentricity. If epsilon is a power of 2, then we don't even need multiplication, let alone square roots, sines, and cosines! The "circle" will be perfectly stable because the points soon become periodic.

The circle algorithm was invented by mistake when I tried to save one register in a display hack! Ben Gurley had an amazing display hack using only about six or seven instructions, and it was a great wonder. But it was basically line-oriented. It occurred to me that it would be exciting to have curves, and I was trying to get a curve display hack with minimal instructions.

Here is a link to the hakmem: http://inwap.com/pdp10/hbaker/hakmem/hacks.html

Solution 11 - C

I think it would be possible to use a modulo because sin() is periodic.

Then you don’t have to worry about the problems.

double time = 0;
long unsigned int timesteps = 0;
double sine;
void controll_loop()
{
  timesteps++;
  time += 30e-6;
  if( time > 1 )
  {
    time -= 1;
  }
  sine = sin( 2 * M_PI * 280 * time );
  ...
}

Solution 12 - C

Fascinating thread. Minsky's algorithm mentioned in Walter Mitty's answer reminded me of a method for drawing circles that was published in Electronics & Wireless World and that I kept. (Credit: https://www.electronicsworld.co.uk/magazines/). I'm attaching it here for interest.

However, for my own similar projects (for audio synthesis) I use a lookup table, with enough points that linear interpolation is accurate enough (do the math(s)!)

Content Type	Original Author	Original Content on Stackoverflow
Question	energetic	View Question on Stackoverflow
Solution 1 - C	Yakov Galka	View Answer on Stackoverflow
Solution 2 - C	user3386109	View Answer on Stackoverflow
Solution 3 - C	MSalters	View Answer on Stackoverflow
Solution 4 - C	0___________	View Answer on Stackoverflow
Solution 5 - C	John	View Answer on Stackoverflow
Solution 6 - C	Jonathon S.	View Answer on Stackoverflow
Solution 7 - C	G. Sliepen	View Answer on Stackoverflow
Solution 8 - C	r3mainer	View Answer on Stackoverflow
Solution 9 - C	tshiono	View Answer on Stackoverflow
Solution 10 - C	Walter Mitty	View Answer on Stackoverflow
Solution 11 - C	Robin	View Answer on Stackoverflow
Solution 12 - C	Tim V	View Answer on Stackoverflow