Is a byte with all bits zero equivalent to the integer value zero (signed or unsigned), in terms of portability?
That is, is memset(&i, 0, sizeof(int)); equivalent to i = 0;?
Name: Anonymous 2017-04-15 0:00
unsigned char uses ``a pure binary notation''[1], which is defined as you'd expect: every bit represents a unique power of two between 1 and 2^(CHAR_BIT-1), and the value is the sum of the powers of two whose corresponding bits are set to 1. From this it follows that an unsigned char with the value 0 consists of CHAR_BIT zero bits. All objects other than bit-fields have an abstract object representation (OR) consisting of a contiguous sequence of one or more bytes[2]. C distinguishes between object representations and values. I don't want to dive too far into the irrelevant details of ORs, but the thing to keep in mind is that not every OR must correspond to a value, and a single value may have multiple ORs.
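To make the two ideas above concrete, here is a minimal C sketch. The function names value_from_bits and is_all_zero_bytes are made up for illustration; the only thing taken from the standard is that any object may be inspected through an unsigned char pointer.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Pure binary notation: the value of an unsigned char is the sum of
   the powers of two whose corresponding bit is set. */
unsigned value_from_bits(unsigned char c)
{
    unsigned sum = 0;
    for (int bit = 0; bit < CHAR_BIT; bit++) {
        if ((c >> bit) & 1u)
            sum += 1u << bit;
    }
    return sum;
}

/* Walks the object representation of any object byte by byte and
   returns 1 if every byte is zero, 0 otherwise. */
int is_all_zero_bytes(const void *obj, size_t size)
{
    const unsigned char *p = obj;
    assert(obj != NULL);
    for (size_t i = 0; i < size; i++) {
        if (p[i] != 0)
            return 0;
    }
    return 1;
}
```

For any unsigned char c, value_from_bits(c) == c, and is_all_zero_bytes(&x, sizeof x) tells you whether the OR of x is all-zeroes without ever reading x as a value of its own type.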
memset(dest, val, len), by definition, sets each of the first len bytes of the pointed-to object's OR to (unsigned char)val. Therefore, your question is equivalent to ``Is all-zeroes a valid OR for 0 for all integer types?''. And the answer is no. Integer types are allowed to have padding bits, which allow for so-called trap representations: representations that do not correspond to a value. You could use these to e.g. signal signed overflow or store parity bits.[3] Use a scheme in which the value 0 requires some checksum bits to be set, and the memset trick doesn't work.
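The byte-wise behaviour of memset can be checked directly. This is only a sketch; memset_fills_each_byte is a hypothetical helper name, and the buffer it writes to is whatever object the caller passes in.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Fills the object with memset and then verifies that every byte of
   its object representation equals (unsigned char)val.
   Returns 1 on success, 0 otherwise. */
int memset_fills_each_byte(void *obj, size_t size, int val)
{
    const unsigned char *bytes = obj;
    assert(obj != NULL);
    memset(obj, val, size);
    for (size_t i = 0; i < size; i++) {
        if (bytes[i] != (unsigned char)val)
            return 0;
    }
    return 1;
}
```

Note that this only says something about the bytes. Whether the resulting OR is a valid value of the object's own type is exactly the question discussed here.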
However, C implementations with padded integers are extremely rare. I know of one architecture where padded integers make sense (a 48-bit Burroughs mainframe that used 40 bits for integer operations and used the remaining 8 bits only for floating-point arithmetic), and I doubt it ever had a compliant C compiler in the first place. Most of the time, it's more effective to just use all bits for a larger integer range.
With that in mind, we could restrict ourselves to machines without padding bits and ask: Is all-zeroes a valid OR for 0 for all integer types if the integer types don't use padding bits? It turns out that the answer is yes in that case! Unsigned integer types may only have value bits and padding bits, and their value bits must use a pure binary representation similar to the one unsigned char uses.[4] Therefore, without padding bits, they cannot have trap representations, and 0 is represented as all-zeroes. The only difference between unsigned and signed types is that signed types have one bit which is repurposed as a sign bit. Signed types may have trap representations even in the absence of padding[5], but the value of a signed integer is defined as the value induced by the value bits as if you were dealing with an unsigned integer, fed into one of three operations[6] if the sign bit is one. With an all-zeroes OR the sign bit is zero, so none of these operations applies. Therefore, an all-zeroes signed integer without padding bits represents the value 0.
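If you want to know whether a given implementation pads its integers, you can compare the number of value bits implied by the limits in <limits.h> with the size of the type. This is a sketch under the assumption that checking int and unsigned int is enough for your purposes; the helper names are made up.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Number of value bits needed for a maximum of the form 2^N - 1. */
static unsigned value_bits(unsigned long long max)
{
    unsigned n = 0;
    assert((max & (max + 1)) == 0); /* max must be 2^N - 1 */
    while (max != 0) {
        n++;
        max >>= 1;
    }
    return n;
}

/* unsigned int: every bit that is not a value bit is a padding bit. */
int unsigned_int_has_padding(void)
{
    return value_bits(UINT_MAX) != sizeof(unsigned int) * CHAR_BIT;
}

/* int: value bits plus exactly one sign bit; the rest is padding. */
int int_has_padding(void)
{
    return value_bits(INT_MAX) + 1 != sizeof(int) * CHAR_BIT;
}
```

On the usual machines both functions return 0, which is the case in which the memset trick is fine.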
So if you restrict yourself to implementations without integer padding bits, int i; memset(&i, 0, sizeof(i)); is in fact equivalent to i = 0;.
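A quick way to see the equivalence on such an implementation is to compare a memset-zeroed object with one that was assigned 0, both by value and byte by byte. This sketch assumes there are no padding bits (on a padded implementation, reading the memset-zeroed object could hit a trap representation); the function names are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if a memset-zeroed int equals 0 and has the same object
   representation as an int that was assigned 0. */
int memset_matches_assignment_int(void)
{
    int a, b = 0;
    memset(&a, 0, sizeof a);
    return a == b && memcmp(&a, &b, sizeof a) == 0;
}

/* Same check for long. */
int memset_matches_assignment_long(void)
{
    long a, b = 0;
    memset(&a, 0, sizeof a);
    return a == b && memcmp(&a, &b, sizeof a) == 0;
}

/* Aborts if either check fails on this implementation. */
void check_memset_zeroing(void)
{
    assert(memset_matches_assignment_int());
    assert(memset_matches_assignment_long());
}
```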
[1] Footnote 40. All references are according to ISO/IEC 9899:1999. I don't know if the numbering changed later and I don't care either because C11 is badly supported crap.
[2] §3.6 defines a byte as ``addressable unit of data storage large enough to hold any member of the basic character set of the execution environment'', which need not correspond to a ``real byte'' in hardware: since CHAR_BIT must be at least 8, an implementation on a 6-bit machine could provide an abstract ``C byte'' (the char) that consists of two ``real bytes'' and set CHAR_BIT to 12. unsigned char would then work like an unsigned 12-bit number.
[3] Footnote 44.
[4] §6.2.6.2.1
[5] Negative zero in a sign-magnitude system is explicitly mentioned in §6.2.6.2.2 as an example of this.
[6] Corresponding to sign-magnitude, two's complement and one's complement, §6.2.6.2.2.