Scalar vs. primitive data type - are they the same thing?


Programming Languages Problem Overview


In various articles I have read, there are sometimes references to primitive data types and sometimes there are references to scalars.

My understanding of each is that they are data types of something simple like an int, boolean, char, etc.

Is there something I am missing that means you should use particular terminology, or are the terms simply interchangeable? The Wikipedia pages for each one don't show anything obvious.

If the terms are simply interchangeable, which is the preferred one?

Programming Languages Solutions


Solution 1 - Programming Languages

I don't think they're interchangeable. They are frequently similar, but differences do exist, and they seem to lie mainly in what each term is contrasted with and what is relevant in context.

Scalars are typically contrasted with compounds, such as arrays, maps, sets, structs, etc. A scalar is a "single" value - integer, boolean, perhaps a string - while a compound is made up of multiple scalars (and possibly references to other compounds). "Scalar" is used in contexts where the relevant distinction is between single/simple/atomic values and compound values.

Primitive types, however, are contrasted with e.g. reference types, and are used when the relevant distinction is "Is this directly a value, or is it a reference to something that contains the real value?", as in Java's primitive types vs. references. I see this as a somewhat lower-level distinction than scalar/compound, but not quite.

It really depends on context (and frequently what language family is being discussed). To take one, possibly pathological, example: strings. In C, a string is a compound (an array of characters), while in Perl, a string is a scalar. In Java, a string is an object (or reference type). In Python, everything is (conceptually) an object/reference type, including strings (and numbers).
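The Python claim above is easy to check. The following is a minimal sketch (the variable names are illustrative, not from the answer) showing that numbers and strings are objects with methods, and that a name bound to a list holds a reference rather than a copy.

```python
# Every value in Python is an object, including numbers and strings.
n = 42
s = "hello"
is_int_object = isinstance(n, object)
is_str_object = isinstance(s, object)
bits = n.bit_length()        # ints carry methods; 42 is 0b101010, so 6 bits

# Names hold references: two names can refer to the same mutable list.
a = [1, 2, 3]
b = a                        # no copy is made; b refers to the same list
b.append(4)                  # the change is visible through both names
same_object = a is b
```

A Java `int`, by contrast, would be copied on assignment, which is the value/reference distinction the answer describes.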

Solution 2 - Programming Languages

There's a lot of confusion and misuse of these terms. Often one is used to mean another. Here is what those terms actually mean.

"Native" refers to types that are built into the language, as opposed to being provided by a library (even a standard library), regardless of how they're implemented. Perl strings are part of the Perl language, so they are native in Perl. C provides string semantics over pointers to chars using a library, so pointer to char is native, but strings are not.

"Atomic" refers to a type that can no longer be decomposed. It is the opposite of "composite". Composites can be decomposed into a combination of atomic values or other composites. Native integers and floating point numbers are atomic. Fractions, complex numbers, containers/collections, and strings are composite.

"Scalar" -- and this is the one that confuses most people -- refers to values that can express scale (hence the name), such as size, volume, counts, etc. Integers, floating point numbers, and fractions are scalars. Complex numbers, booleans, and strings are NOT scalars. Something that is atomic is not necessarily scalar and something that is scalar is not necessarily atomic. Scalars can be native or provided by libraries.

Some types have odd classifications. BigNumber types, usually implemented as an array of digits or integers, are scalars, but they're technically not atomic. They can appear to be atomic if the implementation is hidden and you can't access the internal components. But the components are only hidden, so the atomicity is an illusion. They're almost invariably provided in libraries, so they're not native, but they could be. In the Mathematica programming language, for example, big numbers are native and, since there's no way for a Mathematica program to decompose them into their building blocks, they're also atomic in that context, despite the fact that they're composites under the covers (where you're no longer in the world of the Mathematica language).
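As a rough illustration of these categories, here is a sketch in Python, chosen only because it ships fractions, complex numbers, and arbitrary-precision integers. The classification in the comments follows this answer's definitions, not any official Python terminology, and the variable names are made up for the example.

```python
from fractions import Fraction

# Scalar but composite: a fraction expresses magnitude, yet it decomposes
# into a numerator and a denominator.
ratio = Fraction(6, 8)                      # normalized to 3/4
ratio_parts = (ratio.numerator, ratio.denominator)

# Composite and, by this answer's definition, not scalar: a complex number
# decomposes into real and imaginary parts.
z = complex(1.5, -2.0)
z_parts = (z.real, z.imag)

# Big numbers: Python's built-in int is arbitrary precision, so it is
# "native" here. The language presents it as a single value, although
# CPython stores it internally as an array of smaller digits.
big = 2 ** 200
big_is_plain_int = type(big) is int
big_bits = big.bit_length()                 # wider than any 64-bit word
big_orders = big > 2 ** 199                 # still comparable by magnitude
```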

These definitions are independent of the language being used.

Solution 3 - Programming Languages

Put simply, it would appear that a 'scalar' type refers to a single item, as opposed to a composite or collection. So scalars include both primitive values as well as things like an enum value.

<http://ee.hawaii.edu/~tep/EE160/Book/chap5/section2.1.3.html>

Perhaps the 'scalar' term may be a throwback to C:

>where scalars are primitive objects which contain a single value and are not composed of other C++ objects

<http://www.open-std.org/jtc1/sc22/wg21/docs/papers/1995/N0774.pdf>

I'm curious about whether this refers to whether these items would have a value of 'scale'? - Such as counting numbers.

Solution 4 - Programming Languages

I like Scott Langeberg's answer because it is concise and backed by authoritative links. I would up-vote Scott's answer if I could.

I suppose that a "primitive" data type could be considered a primary data type, so that secondary data types are derived from primary data types. The derivation is through combining, as with a C++ struct. A struct can be used to combine data types (such as an int and a char) to get a secondary data type. The struct-defined data type is always a secondary data type. Primary data types are not derived from anything; rather, they are a given in the programming language.
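The struct idea can be sketched in Python, with a dataclass standing in for a C++ struct. The `Reading` type and its fields are hypothetical, and since Python has no separate char type, a one-character string plays that role.

```python
from dataclasses import dataclass, fields

@dataclass
class Reading:            # hypothetical "secondary" type built by combining
    count: int            # a type supplied by the language itself
    grade: str            # one-character string standing in for a C++ char

r = Reading(count=3, grade="A")

# The combined type can be taken apart again into its named members.
member_names = [f.name for f in fields(Reading)]
```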

I have a parallel to primitive being the nomenclature meaning primary. That parallel is "regular expression". I think the nomenclature "regular" can be understood as "regulating". Thus you have an expression that regulates the search.

Scalar etymology (http://www.etymonline.com/index.php?allowed_in_frame=0&search=scalar&searchmode=none) means ladder-like. I think the way this relates to programming is that a ladder has only one dimension: How many rungs from the end of the ladder. A scalar data type has only one dimension, thus represented by a single value.

I think in usage, primitive and scalar are interchangeable. Is there any example of a primitive that is not scalar, or of a scalar that is not primitive?

Although interchangeable, primitive refers to the data-type being a basic building block of other data types, and a primitive is not composed of other data types.

Scalar refers to its having a single value. Scalar contrasts with the mathematical vector. A vector is not represented by a single value because (using one kind of vector as an example) one value is needed to represent the vector's direction and another value needed to represent the vector's magnitude.
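A small numeric sketch of that contrast, assuming a two-dimensional vector with hypothetical x and y components: the scalar is one number, while the vector needs two numbers however it is written (components, or magnitude plus direction).

```python
import math

speed = 5.0                       # scalar: a single value
velocity = (3.0, 4.0)             # vector written as (x, y) components

# The same vector written as magnitude plus direction still takes two values.
magnitude = math.hypot(velocity[0], velocity[1])
direction = math.atan2(velocity[1], velocity[0])    # angle in radians

# Both values are needed to recover the original components.
rebuilt = (magnitude * math.cos(direction), magnitude * math.sin(direction))
```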

Reference links: http://whatis.techtarget.com/definition/primitive http://en.wikipedia.org/wiki/Primitive_data_type

Solution 5 - Programming Languages

In C, enumeration types, characters, and the various representations of integers form a more general type class called scalar types. Hence, the operations you can perform on values of any scalar type are the same as those for integers.
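The same idea can be approximated outside C. The sketch below uses Python's `enum.IntEnum` as a loose analogue of a C enumeration (the `Color` type is hypothetical). It does not reproduce C semantics; it only illustrates that enumeration members and character codes accept integer operations.

```python
from enum import IntEnum

class Color(IntEnum):                 # hypothetical enumeration
    RED = 0
    GREEN = 1
    BLUE = 2

next_value = Color.RED + 1            # integer arithmetic yields a plain int
is_ordered = Color.GREEN < Color.BLUE # members compare like integers
next_char = chr(ord("a") + 1)         # character codes are integers too
```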

Solution 6 - Programming Languages

The null type is the only thing that realistically conforms to the definition of a "scalar type". Even the serialization of 'None' as 'N.', which fits into a 16-bit word (traditionally considered scalar), or even a single bit, which has multiple possible values, isn't a "single datum".

Solution 7 - Programming Languages

Every primitive is scalar, but not vice versa. DateTime is scalar, but not primitive.
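The answer does not say which language's DateTime it means. Taking Python's `datetime.datetime` as a stand-in, the sketch below shows a value that orders like a single point on a timeline while being a library class rather than a built-in type.

```python
from datetime import datetime

earlier = datetime(2020, 1, 1)
later = datetime(2021, 1, 1)

is_before = earlier < later                 # compares as one point in time
gap_days = (later - earlier).days           # 2020 is a leap year

# The type comes from the datetime module, not from the built-in namespace.
is_builtin = type(earlier).__module__ == "builtins"
```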

Solution 8 - Programming Languages

Being scalar has nothing to do with the language, whereas being primitive is all dependent on the language. The two have nothing to do with each other.

A scalar data type is something that has a finite set of possible values, following some scale, i.e. each value can be compared to any other value as either equal, greater or less. Numeric values (floating point and integer) are the obvious examples, while discrete/enumerated values can also be considered scalar. In this regard, boolean is a scalar with 2 discrete possible values, and normally it makes sense that true > false. Strings, regardless of programming language, are technically not scalars.
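Python happens to agree on the boolean example, since its `bool` is a subclass of `int`. A brief check, offered as one language's behavior rather than a general rule:

```python
bool_ordered = True > False                   # booleans follow an order
bool_as_int = (int(False), int(True))         # they map onto 0 and 1
numbers_sorted = sorted([3, 1.5, 2])          # ints and floats share one scale
```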

Now what is primitive depends on the language. Every language classifies what its "basic types" are, and these are designated as its primitives. In JavaScript, string is primitive, despite it not being a scalar in the general sense. But in some languages a string is not primitive. To be a primitive type, the language must be able to treat it as immutable, and for this reason referential types such as objects, arrays, collections, cannot be primitive in most, if not all, languages.
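Python does not formally label its types as primitive, but it can illustrate the immutability point made above. In this sketch (names are illustrative), a string rejects in-place modification while a list, a referential container, accepts it.

```python
text = "scalar"
try:
    text[0] = "S"                 # strings do not support item assignment
    string_is_mutable = True
except TypeError:
    string_is_mutable = False

items = ["scalar"]
items[0] = "primitive"            # lists can be changed in place
```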

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

| Content Type | Original Author | Original Content on Stackoverflow |
| --- | --- | --- |
| Question | Ben Pearson | View Question on Stackoverflow |
| Solution 1 - Programming Languages | Michael Ekstrand | View Answer on Stackoverflow |
| Solution 2 - Programming Languages | Claudio Puviani | View Answer on Stackoverflow |
| Solution 3 - Programming Languages | Scott Langeberg | View Answer on Stackoverflow |
| Solution 4 - Programming Languages | Indinfer | View Answer on Stackoverflow |
| Solution 5 - Programming Languages | Onkell Wang | View Answer on Stackoverflow |
| Solution 6 - Programming Languages | Commenter | View Answer on Stackoverflow |
| Solution 7 - Programming Languages | MarkZ | View Answer on Stackoverflow |
| Solution 8 - Programming Languages | Arnel Enero | View Answer on Stackoverflow |