Time for another “ripped from the headlines” post about a frequently misunderstood Fortran feature! Today I’m going to talk about non-decimal constants, often referred to as BOZ constants – Binary (base 2), Octal (base 8) and Hexadecimal (base 16). (Yes, I know Hexadecimal doesn’t start with Z, but what can you do?)
Non-decimal constants were widely implemented in various FORTRAN compilers of the 1970s, but they were not part of the FORTRAN 77 standard and there were quite a few variations in syntax and semantics. Typically, octal and hexadecimal constants were offered but binary constants were less common. Some examples of syntax at the time:
That last one may surprise you – it was a syntax sometimes used for octal constants at least back into the early 1970s. Octal was much more common in the era of 12-bit and 16-bit processors; hexadecimal didn’t really take off until the 32-bit processors of the late 1970s, most notably the DEC VAX line. (Intel Fortran today accepts all but the last one.)
The first attempt to standardize non-decimal constants in Fortran was MIL-STD-1753 in 1978. It specified the O'nnnn' and Z'nnnn' forms and allowed these only in DATA statements as initialization values for integer data items. Fortran 90 (1991) formalized the syntax from MIL-STD-1753, permitted use of quotes (") as an alternative to apostrophes ('), and added B'nnnn' for binary constants, but still allowed BOZ constants only in DATA statements for integer variables. However, Fortran 90, like MIL-STD-1753, just waved its hands about what this meant, leaving it as an exercise for the reader. Of course, everyone knew (or thought they knew) what it meant. Well, almost everyone…
Fortran 95 (1997) added some words that attempted to clarify the semantics:
“A data-stmt-constant that is a boz-literal-constant is treated as if the constant were an int-literal-constant with a kind-param that specifies the representation method with the largest decimal exponent range supported by the processor.” – Fortran 95 5.2.10 (DATA statement)
This is a little better, but, to my mind anyway, has some ambiguity because BOZ constants are defined earlier as “unsigned”, even though Fortran has no unsigned integer type.
“In a DATA statement (5.2.10), an unsigned binary, octal, or hexadecimal literal constant shall correspond to an integer scalar variable.” – Fortran 95 4.3.1.1 (Integer type)
Let’s assume that the default integer kind is a 32-bit 2s-complement integer. Consider:
    integer :: i
    data i /Z'FFFFFFFF'/
    print *, i
What does it print? If we take the text literally that Z'FFFFFFFF' is unsigned, it represents 4,294,967,295, which is not representable in a signed 32-bit 2s-complement integer. Most people would expect the output to be -1. Hmmm….
Fortran 2003 (2004) expanded the use of BOZ constants, allowing them as actual arguments to the DBLE, REAL, INT and CMPLX intrinsic functions. Yet again, maddeningly, it never said how BOZ constants should be interpreted in those contexts. The assumption, of course, is that it is as if you did a TRANSFER of the bit pattern to the destination type, but the vagueness bothers me.
Fortran 2008 (2010) expanded the number of intrinsic functions where BOZ constants could be used, and reworded the definition of BOZ constants to something that was more reasonable:
“A binary, octal, or hexadecimal constant (boz-literal-constant) is a sequence of digits that represents an ordered sequence of bits. Such a constant has no type.” – Fortran 2008 4.7p1 (Binary, octal, and hexadecimal literal constants)
Finally! BOZ constants were separated out from the section on integer type and now had a definition that made sense for the way they were used. Furthermore, the description of their use in DATA statements, while still restricted to integers, was clearer:
“If a data-stmt-constant is a boz-literal-constant, the corresponding variable shall be of type integer. The boz-literal-constant is treated as if it were converted by the intrinsic function INT (13.7.81) to type integer with the kind type parameter of the variable.” – Fortran 2008 5.4.7p11 (DATA statement)
Of course, we then need to go look at what the INT function says (here, A is the first argument to INT):
“If A is a boz-literal-constant, the value of the result is the value whose bit sequence according to the model in 13.3 is the same as that of A as modified by padding or truncation according to 13.3.3. The interpretation of a bit sequence whose most significant bit is 1 is processor dependent.” – Fortran 2008 13.7.81p5 (INT)
Now we’re getting somewhere! But note the caveat regarding a sequence with an MSB of 1 – the standard says:
“The interpretation of a negative integer as a sequence of bits is processor dependent.” – Fortran 2008 13.3.1p3 (Bit model)
Why is this there? Fortran tries hard to be architecture-neutral, and there have been processor architectures that represent integers in other ways, such as 1s-complement and signed-magnitude.
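To make the caveat concrete, here is a minimal sketch of the same constant converted by INT to two different kinds. The program and variable names are mine, and it assumes a compiler that follows the Fortran 2008 rules quoted above and provides the `int32` and `int64` kinds; some compilers, or older versions of them, may instead warn about or reject the 32-bit conversion.

```fortran
program boz_int_demo
  use, intrinsic :: iso_fortran_env, only: int32, int64
  implicit none
  integer(int32) :: narrow
  integer(int64) :: wide

  ! Converted to 64 bits, the constant is padded on the left with zero
  ! bits, so the most significant bit is 0 and the value is 4294967295.
  wide = int(Z'FFFFFFFF', int64)

  ! Converted to 32 bits, the most significant bit is 1, so the standard
  ! leaves the interpretation to the processor.
  narrow = int(Z'FFFFFFFF', int32)

  print *, 'as a 64-bit integer:', wide
  print *, 'as a 32-bit integer:', narrow
end program boz_int_demo
```

On a 2s-complement processor the second value would be expected to print as -1, but that is a property of the processor and not something the standard promises.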
Fortran 2008 also nailed down what it meant for a BOZ constant to be an argument to the other intrinsic functions where they were supported. For example, here’s REAL:
“If A is a boz-literal-constant, the value of the result is the value whose internal representation as a bit sequence is the same as that of A as modified by padding or truncation according to 13.3.3. The interpretation of the bit sequence is processor dependent.” – Fortran 2008 13.7.138p5 (REAL)
13.3.3 then added text explaining how BOZ constants work for arguments to INT and REAL when they are shorter or longer than the relevant type/kind, specifying that they are padded with zero bits on the left or truncated from the left, accordingly.
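As a small illustration of padding and of REAL with a BOZ argument, consider the sketch below. The names are hypothetical, and the REAL line assumes that `real32` is IEEE 754 single precision, where the bit pattern 3F800000 encodes 1.0; on a processor with a different real representation the result would differ.

```fortran
program boz_real_demo
  use, intrinsic :: iso_fortran_env, only: int32, real32
  implicit none
  integer(int32) :: padded
  real(real32)   :: one

  ! Z'FF' supplies only 8 bits; it is padded on the left with zero
  ! bits to fill the 32-bit integer, giving 255.
  padded = int(Z'FF', int32)

  ! ASSUMPTION: real32 is IEEE 754 binary32, in which the bits
  ! 3F800000 (sign 0, biased exponent 127, fraction 0) represent 1.0.
  one = real(Z'3F800000', real32)

  print *, padded
  print *, one
end program boz_real_demo
```

Note that REAL here does not convert the integer value 1065353216 to real; it reinterprets the bits, which is the TRANSFER-like behavior everyone had assumed all along.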
The additional intrinsic functions where BOZ constants could be used in Fortran 2008 are: BGE, BGT, BLE, BLT, DSHIFTL (I and J), DSHIFTR (I and J), IAND, IEOR, IOR, and MERGE_BITS (I and J). The standard explained, for each of these, exactly how a BOZ constant was to be interpreted.
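A short sketch of BOZ constants as arguments to a few of these functions follows. The variable names are made up for illustration, and it assumes a compiler that implements the Fortran 2008 bit intrinsics and accepts BOZ arguments to them.

```fortran
program boz_bits_demo
  use, intrinsic :: iso_fortran_env, only: int32
  implicit none
  integer(int32) :: word, field_mask, low_byte, merged

  word       = int(Z'12345678', int32)
  field_mask = int(Z'00FF0000', int32)

  ! IAND with a BOZ mask: the constant is converted as if by INT to
  ! the kind of the integer argument. Result is hexadecimal 78 (120).
  low_byte = iand(word, Z'000000FF')

  ! MERGE_BITS(I, J, MASK) takes bits from I where MASK is 1 and from
  ! J elsewhere; here J is a BOZ constant. Result is hexadecimal 0034FFFF.
  merged = merge_bits(word, Z'0000FFFF', field_mask)

  print '(Z8.8)', low_byte
  print '(Z8.8)', merged

  ! BGE compares its arguments bitwise, as sequences of bits.
  print *, bge(word, Z'10')
end program boz_bits_demo
```

The values used here all have a most significant bit of 0, which keeps the example clear of the processor-dependent cases.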
Fortran 2018 and Fortran 202X
Now we get to Fortran 2018, which made… no changes at all regarding BOZ constants. But substantial changes are afoot for the next revision of the standard, which for now we’re calling Fortran 202X. First, a bit of a rewind…
There was a proposal for Fortran 2008 called BITS. The summary read:
A new data type, BITS, is added. Variables and constants of type BITS are ordered sequences of bits. They simplify and enhance the use of Fortran for several types of non-numeric problems, such as pattern matching, searching and sorting, and low level bit manipulation, as well as allowing for more clarity in the text of the standard. A BITS intrinsic data type also provides a way to standardize several common Fortran language extensions, and provides a rational method for dealing with BOZ constants. A method for declaring objects of type BITS is provided as well as rules on how such objects interact with existing Fortran objects. Associated intrinsic procedures are also provided. The BITS proposal introduces two incompatibilities with the Fortran 2003 standard. The new type name could conflict with the name of an existing user defined type, and the new operator, .xor., with a priority just above that of a user defined operator, could conflict with an existing user defined operator.
This was dropped from Fortran 2008, but was resurrected as a proposal for Fortran 202X – you can read about it in J3 paper 19-159. What we decided, though, was to enhance BOZ constants instead, as it was felt that this, in combination with the existing bit functions, was sufficient and less of a disruption to compilers. It helped that many compilers already supported at least some of the proposed enhancements.
J3 paper 19-212r1 laid out specifications for the usages to be extended. It also contained a program to test common extensions across six current compilers to see what the current situation was. Three of the tested compilers already supported all the proposed features, though some required a compile option. The additional usages proposed are:
- BOZ as an initialization in the definition of an integer named constant.
- BOZ as an initialization in the definition of a real named constant.
- BOZ as the expr of an intrinsic assignment to a variable of type INTEGER.
- BOZ as the expr of an intrinsic assignment to a variable of type REAL.
- BOZ constants that each have the same number of bits as ac-values [array constructor values] with a type-spec of INTEGER.
- BOZ constants that each have the same number of bits as ac-value with a type-spec of REAL. Each ac-value must have a bit sequence that is a valid representation for a value of the specified KIND of REAL.
- BOZ constant as an output-item in a WRITE statement corresponding to a B, O, or Z format edit descriptor.
The proposal was accepted, the “edits” completed in J3 paper 19-256r2, and it will standardize a lot of popular existing practice.
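Until a compiler supports the extended usages, most of them can be spelled in Fortran 2008 by wrapping the BOZ constant in INT or REAL. The sketch below is my own illustration of that approach, not text from the J3 papers; the names are hypothetical and the REAL constant again assumes IEEE 754 single precision.

```fortran
program boz_portable_demo
  use, intrinsic :: iso_fortran_env, only: int32, real32
  implicit none

  ! Named constants: convert explicitly instead of writing the bare
  ! BOZ constant as the initialization.
  integer(int32), parameter :: mask = int(Z'0000FFFF', int32)
  real(real32),   parameter :: one  = real(Z'3F800000', real32)  ! ASSUMES IEEE binary32

  integer(int32) :: flags(2)

  ! Intrinsic assignment and array constructor values, again via INT.
  flags = [integer(int32) :: int(B'0101', int32), int(O'17', int32)]

  ! Output under a Z edit descriptor, with the BOZ converted by INT.
  write (*, '(Z8.8)') int(Z'00C0FFEE', int32)

  print *, mask, one, flags
end program boz_portable_demo
```

The explicit conversions are wordier than the proposed forms, but they make the destination type and kind visible at every use.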
I do want to leave you with an important note – BOZ constants are not integers! They are, as Fortran 2008 finally nailed down, a sequence of bits whose interpretation depends on their context.
As usual, if you have questions about any of this or suggestions for a future Doctor Fortran post, add your comment below or use the contact form.
Thanks for this, Steve, this was a great article.
A lot of us are probably getting a lesson on this due to the recent change to gfortran!