This isn't quite "undefined behaviour", just weird syntax and one of those moments when you ought to know operator precedence and evaluation order, which is pretty much the same in every language (in some languages with dialects or multiple compilers it may just be more apparent).
Undefined behaviour would be something along the lines of:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <err.h>
int main(void)
{
    const size_t size = 1024 * 1024;
    char *data = malloc(size);
    if (!data) {
        err(1, "malloc"); // replace with assert if you don't have err.h from libbsd
    }
    memset(data, 0, size); // write zeroes
    free(data);
    memset(data, 0xff, size); // write ones
    return 0;
}
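For contrast, here is a minimal sketch of the same idea with the use-after-free taken out: both writes happen while the allocation is still alive, and the pointer is not touched again after free(). The helper name `fill_twice` is made up for illustration, and it returns a status instead of calling err() so it does not depend on libbsd.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not from the original post.
 * Writes zeroes, then ones, into a fresh buffer of `size` bytes
 * (size must be non-zero). Returns the last byte written (0xff)
 * on success, or -1 if malloc fails. */
int fill_twice(size_t size)
{
    unsigned char *data = malloc(size);
    if (!data) {
        return -1;
    }
    memset(data, 0, size);      /* write zeroes */
    memset(data, 0xff, size);   /* write ones, still before free() */
    int last = data[size - 1];  /* read while the object is alive */
    free(data);                 /* data is not used after this point */
    return last;
}
```

Calling `fill_twice(1024 * 1024)` mirrors the size used in the example above.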
Interesting. I didn't think it was either, but as far as I could read up: in most cases the compiler will handle it as you expect, but it doesn't have to according to the spec, which is why it is undefined?
There is no guarantee in the C specification that the increment of i will have been done by the time you use it as the third argument to printf(). So you could reasonably get 1, 1? I may well have misunderstood though!
I think you're imagining that the operations occur in an unspecified order, as would be the case for
foo(a(), b());
There is a sequence point when a call is executed, so a() and b() occur in some distinct, if unspecified, order.
The program will not have undefined behavior. It may have unspecified behavior (if its result depends on the order of those calls), but in either case we can continue to reason about it in terms of the C Abstract Machine.
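A small sketch of what "unspecified but still reasonable" looks like in practice. The call log and the helper names `call_in_some_order` and `last_call_log` are assumptions added for illustration; only the shape `foo(a(), b())` comes from the comment above.

```c
#include <stddef.h>

/* Hypothetical call log, only here to make the order observable. */
static char call_log[3];
static size_t call_count;

static int a(void) { call_log[call_count++] = 'a'; return 1; }
static int b(void) { call_log[call_count++] = 'b'; return 2; }

static int foo(int x, int y) { return x * 10 + y; }

/* Returns foo(a(), b()). The result is always 12; only the order in
 * which a() and b() run (and so the log contents) is unspecified. */
int call_in_some_order(void)
{
    call_count = 0;
    call_log[0] = call_log[1] = call_log[2] = '\0';
    return foo(a(), b());
}

/* Returns the order recorded by the last call: "ab" or "ba". */
const char *last_call_log(void) { return call_log; }
```

Whichever order the compiler picks, the two calls do not overlap, so the log holds exactly one of two possible strings and the return value is the same either way.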
foo(i, i++);
There is no sequence point between i and i++, so the read of i and the modification by i++ are not ordered with respect to each other. That violates the rules of the C Abstract Machine and produces undefined behavior.
We cannot reason about the program from this point onward.
It's the case of undefined behavior that the standard describes as "Between two sequence points, an object is modified more than once, or is modified and the prior value is read other than to determine the value to be stored."
This happens because there is no sequence point between the i++ and the i.
Precedence doesn't come into this.
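One way to sidestep the problem, sketched under the assumption that the caller wants both arguments to see the old value of i: do the read and the modification in separate statements, each of which ends in a sequence point. `foo_pair` and `pass_then_increment` are hypothetical names standing in for the foo() above.

```c
/* Hypothetical stand-in for foo(): packs both arguments into one
 * value so the caller can check what was passed. */
static int foo_pair(int first, int second) { return first * 100 + second; }

/* Passes the old value of *i for both arguments, then increments *i.
 * The read and the modification live in separate full statements,
 * so there is a sequence point between them. */
int pass_then_increment(int *i)
{
    int old = *i;                     /* read */
    int result = foo_pair(old, old);  /* use the saved value only */
    *i = old + 1;                     /* modify, in its own statement */
    return result;
}
```

If the intent was instead for the first argument to see the incremented value, the same pattern works with `old + 1` passed explicitly; the point is that the code has to say which one it means.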
Here's a more interesting variation on your example.
Can you spot the undefined behavior here? :)
#include <stdlib.h>

int main(void) {
    char *data = malloc(1);
    if (data) {
        free(data++);
        data++;
    }
}
Ah, I see.
So C literally doesn't define any order on those instructions and it's up to the compiler?
Wouldn't have expected that, though I've seen the example a few times.
Please excuse my hasty assumption, then.
First off, I'd really appreciate it if you specified the syntax in the code blocks so syntax highlighting kicks in ;-)
Something along these lines (without the backslashes; this markdown dialect doesn't allow nested fences):
\`\`\`c
int main(void)
{
    return 0;
}
\`\`\`
Actually no I can't see the undefined behaviour in that example.
If I see correctly, in all cases you're only manipulating the pointer, and since free takes the pointer by value and not by reference, you'd end up with a copy of data from before the increment in the call, and move the pointer along twice afterwards, but in either case the pointer is invalid.
What am I missing?
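The by-value part of that reading can be checked with a small sketch. `receive` is a hypothetical stand-in for free() that just hands back the pointer it was given, and the buffer is a plain array so nothing is freed here.

```c
#include <stddef.h>

/* Hypothetical stand-in for free(): reports the pointer it received. */
static const char *receive(const char *p) { return p; }

/* Returns how far the caller's pointer has moved past the value the
 * callee received: data++ yields the old value, and data itself has
 * been incremented by the time the call happens. */
ptrdiff_t offset_after_postincrement(void)
{
    static const char buffer[2] = { 'x', 'y' };
    const char *data = buffer;
    const char *seen = receive(data++);  /* callee sees &buffer[0] */
    return data - seen;                  /* data is now &buffer[1] */
}
```

So the callee does get the pre-increment value, and the increment is already done before the call runs. That part of the example is fine; the question is what happens to data afterwards.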
Pointers are only well defined for null pointer values or when pointing into or one past the end of an allocated array.
The first increment satisfies this, since it happens before the free occurs.
After the free, the pointer value is indeterminate, and so the second increment, which has to read that value, has undefined behavior.
But you're not actually using that pointer in the code, so I fail to see how that's undefined behaviour.
An invalid pointer which isn't used still doesn't cause any runtime issues, or is there something about that too in the standards?
The last increment of the pointer happens when it has an indeterminate value, and reading that value is what produces the undefined behavior.
For example, it might behave like a trap representation.
Regardless, the program cannot be reasoned about after this point. :)
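For comparison, a sketch of a variant that stays within those rules, assuming the lifetime rules as described above: the only increment happens while the allocation is alive, and after free() the pointers are only assigned to, never read. The function name `one_past_then_free` is made up for illustration.

```c
#include <stdlib.h>

/* Hypothetical demonstration, not from the original thread.
 * Returns 1 if the demonstration ran as expected, 0 if malloc failed. */
int one_past_then_free(void)
{
    char *data = malloc(1);
    if (!data) {
        return 0;
    }
    char *base = data;  /* keep the original value for free() */
    data++;             /* one past the end of the 1-byte object: allowed */
    int ok = (data - base == 1);
    free(base);         /* base and data are both indeterminate now */
    data = NULL;        /* storing a new value is fine; reading the old one is not */
    base = NULL;
    return ok && data == NULL && base == NULL;
}
```

The difference from the puzzle above is just the ordering: all pointer arithmetic is finished before the object's lifetime ends.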