Pointer arithmetics with two different buffers

C++PointersLanguage LawyerPointer Arithmetic

C++ Problem Overview


Consider the following code:

int* p1 = new int[100];
int* p2 = new int[100];
const ptrdiff_t ptrDiff = p1 - p2;

int* p1_42 = &(p1[42]);
int* p2_42 = p1_42 + ptrDiff;

Now, does the Standard guarantee that p2_42 points to p2[42]? If not, is it always true on Windows, Linux or webassembly heap?

C++ Solutions


Solution 1 - C++

To add the standard quote:

> expr.add#5

> When two pointer expressions P and Q are subtracted, the type of the result is an implementation-defined signed integral type; this type shall be the same type that is defined as std::ptrdiff_­t in the <cstddef> header ([support.types]).

> - (5.1) If P and Q both evaluate to null pointer values, the result is 0.

> - (5.2) Otherwise, if P and Q point to, respectively, elements x[i] and x[j] of the same array object x, the expression P - Q has the value i−j.

> - (5.3) Otherwise, the behavior is undefined. [ Note: If the value i−j is not in the range of representable values of type std::ptrdiff_­t, the behavior is undefined. — end note  ]

(5.1) does not apply as the pointers are not nullptrs. (5.2) does not apply because the pointers are not into the same array. So, we are left with (5.3) - UB.

Solution 2 - C++

const ptrdiff_t ptrDiff = p1 - p2;

This is undefined behavior. Subtraction between two pointers is well defined only if they point to elements in the same array. ([expr.add] ¶5.3).

> When two pointer expressions P and Q are subtracted, the type of the result is an implementation-defined signed integral type; this type shall be the same type that is defined as std::ptrdiff_­t in the <cstddef> header ([support.types]). > > - If P and Q both evaluate to null pointer values, the result is 0. > - Otherwise, if P and Q point to, respectively, elements x[i] and x[j] of the same array object x, the expression P - Q has the value i−j. > - Otherwise, the behavior is undefined

And even if there was some hypothetical way to obtain this value in a legal way, even that summation is illegal, as even a pointer+integer summation is restricted to stay inside the boundaries of the array ([expr.add] ¶4.2)

> When an expression J that has integral type is added to or subtracted from an expression P of pointer type, the result has the type of P. > > - If P evaluates to a null pointer value and J evaluates to 0, the result is a null pointer value. > - Otherwise, if P points to element x[i] of an array object x with n elements,81 the expressions P + J and J + P (where J has the value j) point to the (possibly-hypothetical) element x[i+j] if 0≤i+j≤n and the expression P - J points to the (possibly-hypothetical) element x[i−j] if 0≤i−j≤n. > - Otherwise, the behavior is undefined.

Solution 3 - C++

The third line is Undefined Behavior, so the Standard allows anything after that.

It's only legal to subtract two pointers pointing to (or after) the same array.

Windows or Linux aren't really relevant; compilers and especially their optimizers are what breaks your program. For instance, an optimizer might recognize that p1 and p2 both point to the begin of an int[100] so p1-p2 has to be 0.

Solution 4 - C++

The Standard allows for implementations on platforms where memory is divided into discrete regions which cannot be reached from each other using pointer arithmetic. As a simple example, some platforms use 24-bit addresses that consist of an 8-bit bank number and a 16-bit address within a bank. Adding one to an address that identifies the last byte of a bank will yield a pointer to the first byte of that same bank, rather than the first byte of the next bank. This approach allows address arithmetic and offsets to be computed using 16-bit math rather than 24-bit math, but requires that no object span a bank boundary. Such a design would impose some extra complexity on malloc, and would likely result in more memory fragmentation than would otherwise occur, but user code wouldn't generally need to care about the partitioning of memory into banks.

Many platforms do not have such architectural restrictions, and some compilers which are designed for low-level programming on such platforms will allow address arithmetic to be performed between arbitrary pointers. The Standard notes that a common way of treating Undefined Behavior is "behaving during translation or program execution in a documented manner characteristic of the environment", and support for generalized pointer arithmetic in environments that support it would fit nicely under that category. Unfortunately, the Standard fails to provide any means of distinguishing implementations that behave in such useful fashion and those which don't.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionSergeyView Question on Stackoverflow
Solution 1 - C++Max LanghofView Answer on Stackoverflow
Solution 2 - C++Matteo ItaliaView Answer on Stackoverflow
Solution 3 - C++MSaltersView Answer on Stackoverflow
Solution 4 - C++supercatView Answer on Stackoverflow