Do I need to explicitly handle negative numbers or zero when summing squared digits?

C

C Problem Overview


I recently had a test in my class. One of the problems was the following:

> Given a number n, write a function in C/C++ that returns the sum of the digits of the number squared. (The following is important). The range of n is [ -(10^7), 10^7 ]. Example: If n = 123, your function should return 14 (1^2 + 2^2 + 3^2 = 14).

This is the function that I wrote:

int sum_of_digits_squared(int n) 
{
    int s = 0, c;

    while (n) {
        c = n % 10;
        s += (c * c);
        n /= 10;
    }

    return s;
}

Looked right to me. So now the test came back and I found that the teacher didn't give me all the points for a reason that I do not understand. According to him, for my function to be complete, I should've have added the following detail:

int sum_of_digits_squared(int n) 
 {
    int s = 0, c;

    if (n == 0) {      //
        return 0;      //
    }                  //
                       // THIS APPARENTLY SHOULD'VE 
    if (n < 0) {       // BEEN IN THE FUNCTION FOR IT
        n = n * (-1);  // TO BE CORRECT
    }                  //
                      
    while (n) {
        c = n % 10;
        s += (c * c);
        n /= 10;
    }

    return s;
}

The argument for this is that the number n is in the range [-(10^7), 10^7], so it can be a negative number. But I don't see where my own version of the function fails. If I understand correctly, the meaning of while(n) is while(n != 0), not while (n > 0), so in my version of the function the number n wouldn't fail to enter the loop. It would work just the same.

Then, I tried both versions of the function on my computer at home and I got exactly the same answers for all the examples that I tried. So, sum_of_digits_squared(-123) is equal to sum_of_digits_squared(123) (which again, is equal to 14) (even without the detail that I apparently should've added). Indeed, if I try to print on the screen the digits of the number (from least to greatest in importance), in the 123 case I get 3 2 1 and in the -123 case I get -3 -2 -1 (which is actually kind of interesting). But in this problem it wouldn't matter since we square the digits.

So, who's wrong?

EDIT: My bad, I forgot to specify and didn't know it was important. The version of C used in our class and tests has to be C99 or newer. So I guess (by reading the comments) that my version would get the correct answer in any way.

C Solutions


Solution 1 - C

Summarizing a discussion that's been percolating in the comments:

  • There is no good reason to test in advance for n == 0. The while(n) test will handle that case perfectly.
  • It's likely your teacher is still used to earlier times, when the result of % with negative operands was differently defined. On some old systems (including, notably, early Unix on a PDP-11, where Dennis Ritchie originally developed C), the result of a % b was always in the range [0 .. b-1], meaning that -123 % 10 was 7. On such a system, the test in advance for n < 0 would be necessary.

But the second bullet applies only to earlier times. In the current versions of both the C and C++ standards, integer division is defined to truncate towards 0, so it turns out that n % 10 is guaranteed to give you the (possibly negative) last digit of n even when n is negative.

So the answer to the question "What is the meaning of while(n)?" is "Exactly the same as while(n != 0)", and the answer to "Will this code work properly for negative as well as positive n?" is "Yes, under any modern, Standards-conforming compiler." The answer to the question "Then why did the instructor mark it down?" is probably that they're not aware of a significant language redefinition that happened to C in 1999 and to C++ in 2010 or so.

Solution 2 - C

Your code is perfectly fine

You are absolutely correct and your teacher is wrong. There is absolutely no reason at all to add that extra complexity, since it does not affect the result at all. It even introduces a bug. (See below)

First, the separate check if n is zero is obviously completely unnecessary and this is very easy to realize. To be honest, I actually question your teachers competence if he has objections about this. But I guess everybody can have a brain fart from time to time. However, I DO think that while(n) should be changed to while(n != 0) because it adds a little bit extra clarity without even costing an extra line. It's a minor thing though.

The second one is a bit more understandable, but he is still wrong.

This is what the C11 standard 6.5.5.p6 says:

> If the quotient a/b is representable, the expression (a/b)*b + a%b shall equal a; otherwise, the behavior of both a/b and a%b is undefined.

The footnote says this:

> This is often called "truncation toward zero".

Truncation toward zero means that the absolute value for a/b is equal to the absolute value for (-a)/b for all a and b, which in turn means that your code is perfectly fine.

Modulo is easy math, but may be counterintuitive

However, your teacher does have a point that you should be careful, because the fact that you're squaring the result is actually crucial here. Calculating a%b according to above definition is easy math, but it might go against your intuition. For multiplication and division, the result is positive if the operands have equal sign. But when it comes to modulo, the result has the same sign as the first operand. The second operand does not affect the sign at all. For instance, 7%3==1 but (-7)%(-3)==(-1).

Here is a snippet demonstrating it:

$ cat > main.c 
#include <stdio.h>

void f(int a, int b) 
{
    printf("a: %2d b: %2d a/b: %2d a\%b: %2d (a%b)^2: %2d (a/b)*b+a%b==a: %5s\n",
           a, b ,a/b, a%b, (a%b)*(a%b), (a/b)*b+a%b == a ? "true" : "false");
}

int main(void)
{
    int a=7, b=3;
    f(a,b);
    f(-a,b);
    f(a,-b);
    f(-a,-b);
}

$ gcc main.c -Wall -Wextra -pedantic -std=c99

$ ./a.out
a:  7 b:  3 a/b:  2 a%b:  1 (a%b)^2:  1 (a/b)*b+a%b==a:  true
a: -7 b:  3 a/b: -2 a%b: -1 (a%b)^2:  1 (a/b)*b+a%b==a:  true
a:  7 b: -3 a/b: -2 a%b:  1 (a%b)^2:  1 (a/b)*b+a%b==a:  true
a: -7 b: -3 a/b:  2 a%b: -1 (a%b)^2:  1 (a/b)*b+a%b==a:  true

So, ironically, your teacher proved his point by being wrong.

Your teacher's code is flawed

Yes, it actually is. If the input is INT_MIN AND the architecture is two's complement AND the bit pattern where the sign bit is 1 and all value bits are 0 is NOT a trap value (using two's complement without trap values is very common) then your teacher's code will yield undefined behavior on the line n = n * (-1). Your code is - if ever so slightly - better than his. And considering introducing a small bug by making the code unnecessary complex and gaining absolutely zero value, I'd say that your code is MUCH better.

In other words, in compilations where INT_MIN = -32768 (even though the resulting function cannot receive an input that is < -32768 or > 32767), the valid input of -32768 causes undefined behavior, because the result of -(-32768i16) cannot be expressed as a 16-bit integer. (Actually, -32768 probably would not cause an incorrect result, because -(-32768i16) usually evaluates to -32768i16, and your program handles negative numbers correctly.) (SHRT_MIN could be -32768 or -32767, depending on the compiler.)

But your teacher explicitly stated that n can be in the range [-10^7; 10^7]. A 16-bit integer is too small; you would have to use [at least] a 32-bit integer. Using int might seem to make his code safe, except that int is not necessarily a 32-bit integer. If you compile for a 16-bit architecture, both of your code snippets are flawed. But your code is still much better because this scenario reintroduces the bug with INT_MIN mentioned above with his version. To avoid this, you can write long instead of int, which is a 32-bit integer on either architecture. A long is guaranteed to be able to hold any value in the range [-2147483647; 2147483647]. C11 Standard 5.2.4.2.1 LONG_MIN is often -2147483648 but the maximum (yes, maximum, it's a negative number) allowed value for LONG_MIN is -2147483647.

What changes would I make to your code?

Your code is fine as it is, so these are not really complaints. It's more like that if I really, really need to say anything about your code, there are some small things that could make it just a tiny bit clearer.

  • The names of the variables could be a little bit better, but it is a short function that is easy to understand, so it's not a big deal.
  • You could change the condition from n to n!=0. Semantically, it's 100% equivalent, but it makes it a little bit clearer.
  • Move declaration of c (which I renamed to digit) to inside the while loop since it's only used there.
  • Change argument type to long to ensure it can handle the whole input set.
int sum_of_digits_squared(long n) 
{
    long sum = 0;

    while (n != 0) {
        int digit = n % 10;
        sum += (digit * digit);
        n /= 10;
    }

    return sum;
}

Actually, this can be a little bit misleading because - as mentioned above - the variable digit can get a negative value, but a digit is in itself never either positive or negative. There are a few ways around this, but this is REALLY nitpicking, and I would not care for such small details. Especially the separate function for last digit is taking it too far. Ironically, this is one of the things that your teachers code actually solves.

  • Change sum += (digit * digit) to sum += ((n%10)*(n%10)) and skip the variable digit completely.
  • Change the sign of digit if negative. But I would strongly advice against making the code more complex just to make a variable name make sense. That's a VERY strong code smell.
  • Create a separate function that extracts the last digit. int last_digit(long n) { int digit=n%10; if (digit>=0) return digit; else return -digit; } This is useful if you want to use that function somewhere else.
  • Just name it c as you originally do. That variable name does not give any useful information, but on the other hand, it's not misleading either.

But to be honest, at this point you should move on to more important work. :)

Solution 3 - C

I don't completely like either your version or your teacher's. Your teacher's version does the extra tests that you correctly point out are unnecessary. C's mod operator is not a proper mathematical mod: a negative number mod 10 will produce a negative result (proper mathematical modulus is always non-negative). But since you're squaring it anyway, no difference.

But this is far from obvious, so I would add to your code not the checks of your teacher, but a big comment that explains why it works. E.g.:

/* NOTE: This works for negative values, because the modulus gets squared */

Solution 4 - C

NOTE: AS I was writing this answer, you did clarify that you are using C. The majority of my answer is about C++. However, since your title still has C++ and the question is still tagged C++, I have chosen to answer anyway in case this is still useful to other people, especially since most of the answers I've seen till now are mostly unsatisfactory.

In modern C++ (Note: I don't really know where C stands on this), your professor seems to be wrong on both counts.

First is this part right here:

if (n == 0) {
        return 0;
}

In C++, this is basically the same thing as:

if (!n) {
        return 0;
}

That means your while is equivalent to something like this:

while(n != 0) {
    // some implementation
}

That means since you are merely exiting in your if when the while wouldn't execute anyway, there really isn't a reason to put this if here, since what you are doing after the loop and in the if are equivalent anyway. Although I should say that is for some reason these were different, you'd need to have this if.

So really, this if statement isn't particularly useful unless I'm mistaken.

The second part is where things get hairy:

if (n < 0) {
    n = n * (-1);
}  

The heart of the issue is is what the output of the modulus of a negative number outputs.

In modern C++, this seems to be mostly well defined: >>The binary / operator yields the quotient, and the binary % operator yields the remainder from the division of the first expression by the second. If the second operand of / or % is zero the behavior is undefined. For integral operands the / operator yields the algebraic quotient with any fractional part discarded; if the quotient a/b is representable in the type of the result, (a/b)*b + a%b is equal to a.

And later:

>>If both operands are nonnegative then the remainder is nonnegative; if not, the sign of the remainder is implementation-defined.

As the poster of the quoted answer correctly points out, the important part of this equation right here:

>(a/b)*b + a%b

Taking an example of your case, you'd get something like this:

-13/ 10 = -1 (integer truncation)
-1 * 10 = -10
-13 - (-10) = -13 + 10 = -3 

The only catch is that last line:

>If both operands are nonnegative then the remainder is nonnegative; if not, the sign of the remainder is implementation-defined.

That means that in a case like this, only the sign seems to be implementation-defined. That shouldn't be a problem in your case because, because you are squaring this value anyway.

That said, keep in mind that this doesn't necessarily apply to earlier versions of C++, or C99. If that is what your professor is using, that could be why.


EDIT: Nope, I'm wrong. This seems to be the case for C99 or later as well:

>C99 requires that when a/b is representable: > >(a/b) * b + a%b shall equal a

And another place: >>When integers are divided and the division is inexact, if both operands are positive the result of the / operator is the largest integer less than the algebraic quotient and the result of the % operator is positive. If either operand is negative, whether the result of the / operator is the largest integer less than the algebraic quotient or the smallest integer greater than the algebraic quotient is implementation-defined, as is the sign of the result of the % operator. If the quotient a/b is representable, the expression (a/b)*b + a%b shall equal a. > >Does either ANSI C or ISO C specify what -5 % 10 should be?

So, yeah. Even in C99, this doesn't seem to affect you. The equation is the same.

Solution 5 - C

As others have pointed out, the special treatment for n==0 is nonsense, since for every serious C programmer it is obvious that "while(n)" does the job.

The behaviour for n<0 is not that obvious, that's why I would prefer to see those 2 lines of code:

if (n < 0) 
    n = -n;

or at least a comment:

// don't worry, works for n < 0 as well

Honestly, at what time did you start considering that n might be negative? When writing the code or when reading your teacher's remarks?

Solution 6 - C

This reminds me of an assignment that I failed

Way back in the 90's. The lecturer had been sprouting on about loops and, long story short, our assignment was to write a function that would return the number of digits for any given integer > 0.

So, for example, the number of digits in 321 would be 3.

Although the assignment simply said to write a function that returned the number of digits, the expectation was that we would use a loop that divides by 10 until... you get it, as covered by the lecture.

But using loops was not explicitly stated so I: took the log, stripped away the decimals, added 1 and was subsequently lambasted in front of the whole class.

Point is, the purpose of the assignment was to test our understanding of what we had learned during lectures. From the lecture I received I learned the computer teacher was a bit of a jerk (but perhaps a jerk with a plan?)


In your situation:

> write a function in C/C++ that returns the sum of the digits of the number squared

I would definitely have provided two answers:

  • the correct answer (squaring the number first), and
  • the incorrect answer in keeping with the example, just to keep him happy ;-)

Solution 7 - C

Generally in assignments not all the marks are given just because the code works. You also get marks for making a solution easy to read, efficient and elegant. These things are not always mutually exclusive.

One I can't strees enough is "use meaningful variable names".

In your example it does not make much difference, but if you're working on a project with a milion lines of code readability becomes very important.

Another thing I tend to see with C code is people trying to look clever. Rather than using while(n != 0) I'll show everyone how clever I am by writing while(n) because it means the same thing. Well it does in the compiler you have but as you suggested you teacher's older version has not implemented it the same way.

A common example is referencing an index in an array while incrementing it at the same time ; Numbers[i++] = iPrime;

Now, the next programmer who works on the code has to know if i gets incremented before or after the assignment, just so someone could show off.

A megabyte of disk space is cheaper that a roll of toilet paper, go for clarity rather than trying to save space, your fellow programmers will be happier.

Solution 8 - C

I wouldn't argue about whether the original or the modern definition of '%' is better but anyone who writes two return statement into such a short function shouldn't teach C programming at all. Extra return is a goto statement and we don't use goto in C. Furthermore the code without the zero check would have the same result, extra return made it harder to read.

Solution 9 - C

The problem statement is confusing, but the numerical example clarifies the meaning of the sum of the digits of the number squared. Here is an improved version:

> Write a function in the common subset of C and C++ that takes an integer n in the range [-107, 107] and returns the sum of the squares of the digits of its representation in base 10. Example: if n is 123, your function should return 14 (12 + 22 + 32 = 14).

The function that you wrote is fine except for 2 details:

  • The argument should have type long to accommodate for all values in the specified range as type long is guaranteed by the C Standard to have at least 31 value bits, hence a range sufficient to represent all values in [-107, 107]. (Note that type int is sufficient for the return type, whose maximum value is 568.)
  • The behavior of % for negative operands is non-intuitive and its specification varied between the C99 Standard and previous editions. You should document why your approach is valid even for negative inputs.

Here is a modified version:

int sum_of_digits_squared(long n) {
    int s = 0;

    while (n != 0) {
        /* Since integer division is defined to truncate toward 0 in C99 and C++98 and later,
           the remainder of this division is positive for positive `n`
           and negative for negative `n`, and its absolute value is the last digit
           of the representation of `n` in base 10.
           Squaring this value yields the expected result for both positive and negative `c`.
           dividing `n` by 10 effectively drops the last digit in both cases.
           The loop will not be entered for `n == 0`, producing the correct result `s = 0`.
         */
        int c = n % 10;
        s += c * c;
        n /= 10;
    }
    return s;
}

The teacher's answer has multiple flaws:

  • type int may have an insufficient range of values.
  • there is no need to special case the value 0.
  • negating negative values is unnecessary and may have undefined behavior for n = INT_MIN.

Given the extra constraints in the problem statement (C99 and range of values for n), only the first flaw is an issue. The extra code still produces the correct answers.

You should get a good mark in this test, but the explanation is required in a written test to show your understanding of the issues for negative n, otherwise the teacher may assume that you were unaware and just got lucky. In an oral exam, you would have gotten a question and your answer would have nailed it.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
Questionuser010517720View Question on Stackoverflow
Solution 1 - CSteve SummitView Answer on Stackoverflow
Solution 2 - CkluttView Answer on Stackoverflow
Solution 3 - CLee Daniel CrockerView Answer on Stackoverflow
Solution 4 - Cuser10957435View Answer on Stackoverflow
Solution 5 - CC.B.View Answer on Stackoverflow
Solution 6 - CSlowLearnerView Answer on Stackoverflow
Solution 7 - CPaul McCarthyView Answer on Stackoverflow
Solution 8 - CPeter KrassoiView Answer on Stackoverflow
Solution 9 - CchqrlieView Answer on Stackoverflow