Doctor Fortran in "I Can C Clearly Now, Part I"

Spend any time in the comp.lang.fortran newsgroup, or other places where programming languages are discussed, and you’ll soon see a new “Which is better, Fortran or C?” thread show up. These never fail to produce heated comments from people who should know better. My answer is that neither is “better” – each has its strengths and weaknesses.

For decades, smart programmers have used both in their applications, using C where it made sense and Fortran where that made sense. This was made easier by vendor-specific extensions to Fortran that dealt with things such as case-sensitive names and pass-by-value. Extensions such as %VAL and LOC have become so ingrained in Fortran culture that many are astonished to find that they are non-standard.

Fortran 2003 added a whole class of features for “C interoperability” to the standard, finally enabling mixed-language programming in a reasonably portable manner. I am not aware of any other major programming language standard that has extended a hand in this manner. While many Fortran programmers have warmly embraced the new features, there’s still a lot of confusion about them, and I thought it was time to try to explain the new landscape.

This is a big topic, so I am going to split it up across several posts.

Interoper-what?

First, some definitions. The Fortran standard talks about interoperability with a “companion C processor”. (In Fortran-speak, “processor” means something that understands and runs code written in the language. For the most part, you can substitute “compiler”, but keep in mind that the compiler operates in an OS and CPU environment that may affect its behavior.) Each Fortran implementation is free to choose which C is its “companion”. For Intel Fortran, that is Microsoft Visual C++ on Windows, and gcc on Linux and OS X. What about Intel C++? That is also compatible with Visual C++ on Windows and gcc on Linux and OS X, so Intel Fortran will also interoperate with Intel C++.

Note that the standard says “companion C processor”, not “companion C++ processor”. In particular, the standard references the C99 standard, or ISO/IEC 9899:1999 to be specific. The companion processors may also build C++ code, but standard interoperability assumes C. You can use C++, but must stick to what is compatible with C when interoperating with Fortran.

What is meant by “interoperability” here? F2008 puts it this way: “Fortran provides a means of referencing procedures that are defined by means of the C programming language or procedures that can be described by C prototypes…, even if they are not actually defined by means of C. Conversely, there is a means of specifying that a procedure defined by a Fortran subprogram can be referenced by a function defined by means of C. In addition, there is a means of defining global variables that are associated with C variables whose names have external linkage.” To this, I will add that there are also means to declare Fortran variables, data structures and enumerations that correspond to similar declarations in C.

Fortran provides four major “tools” for enabling interoperability with C. These are:

Restrictions on which Fortran types are considered interoperable
The BIND(C) language-binding-spec
The ISO_C_BINDING intrinsic module
The VALUE attribute

I frequently see people refer to all of the interoperability tools as “ISO_C_BINDING”, but this is not correct; one can use the interoperability features without using the module.

Interoperable data types

The core concept of interoperability is that something should work the same way in Fortran as it does in C. While Fortran and C each support many of the same basic data types, not everything translates cleanly.

One difference is that Fortran has the concept of “kinds” of a single type, whereas C treats each variant as a distinct type. For example, consider the Fortran INTEGER type. C has numerous integer types, from short int to long long int, and some specialty types such as intptr_t. These may or may not have corresponding kinds in Fortran. For each of the C integer types that might be interoperable, ISO_C_BINDING declares a named constant (PARAMETER) giving the kind number for the implementation’s equivalent INTEGER kind.

For example, there’s the simple C int type. This corresponds to INTEGER(C_INT), where C_INT is defined in ISO_C_BINDING. In Intel Fortran, the value is always 4, as a C int corresponds with Fortran INTEGER(4), but other Fortran implementations may use different kind numbers. Using the named constant ensures portability.
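
As a minimal sketch of what that looks like in source, the program below declares a variable with the C_INT kind and prints a few of the integer kind constants. The program and variable names are made up for illustration, and the kind numbers printed will depend on your compiler.

```fortran
program c_int_demo
  use, intrinsic :: iso_c_binding, only: c_int, c_short, c_long, c_long_long
  implicit none

  ! Declared with the named constant, so this matches C's int on any
  ! conforming compiler, whatever kind number that compiler uses.
  integer(c_int) :: n_items

  n_items = 42_c_int
  print *, 'n_items              =', n_items

  ! The kind numbers themselves are implementation-specific.
  print *, 'kind for C int       =', c_int
  print *, 'kind for C short     =', c_short
  print *, 'kind for C long      =', c_long
  print *, 'kind for C long long =', c_long_long
end program c_int_demo
```

Writing INTEGER(4) instead would happen to work with Intel Fortran, but it ties the code to one compiler's kind numbering.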

More interesting is the C intptr_t type. This is an integer that is large enough to hold a pointer (address). In Intel Fortran, this would be INTEGER(4) when building a 32-bit application and INTEGER(8) for a 64-bit application. Intel Fortran provides different copies of ISO_C_BINDING for various platforms so you always get the right one.
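
Here is a small sketch that shows the pointer-sized kind in action. It assumes the compiler supports C_INTPTR_T (otherwise the declaration will not compile), and it uses the common TRANSFER idiom to copy an address into the integer; the names item and address_value are invented for the example.

```fortran
program intptr_demo
  use, intrinsic :: iso_c_binding, only: c_int, c_intptr_t, c_loc
  implicit none

  integer(c_int), target :: item
  integer(c_intptr_t)    :: address_value

  item = 7

  ! Copy the bits of the C address of item into the pointer-sized integer.
  address_value = transfer(c_loc(item), address_value)

  print *, 'kind for C intptr_t =', c_intptr_t
  print *, 'bits in that kind   =', bit_size(address_value)
  print *, 'address as integer  =', address_value
end program intptr_demo
```

Build it once as a 32-bit application and once as a 64-bit application and the first two lines of output should change accordingly.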

Note that Fortran has no unsigned integer types, so there are no constants for C’s unsigned types. Such types are not interoperable.

You might wonder what happens if there is a “kind” of C type not supported by the Fortran implementation. The answer is that the named constant for that type is defined as a negative value (typically -1), so you’ll get a compile-time error if you try to use it as a kind. We’ll see a use of this shortly.

Similarly, there are constants defined for REAL, COMPLEX, LOGICAL and CHARACTER. For REAL, the standard offers the possibility of a C long double type. This is implemented in different ways by various C compilers on various platforms supported by Intel Fortran. In gcc on 32-bit Linux, long double is an 80-bit floating type, as supported by the X87 instruction set. Intel Fortran doesn’t support this, so there, C_LONG_DOUBLE is -1. gcc on OS X, however, defines it as a 128-bit type that is the same as Intel Fortran’s REAL(16), so C_LONG_DOUBLE is 16 there. And on 64-bit Linux, or on Windows, long double is treated the same as double, so C_LONG_DOUBLE is 8. As long as you use the constants for kind values and the corresponding types in C, you’ll match.
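
One way to cope with a possibly unsupported long double is to test the constant at compile time. The sketch below is only an illustration: the constant name wide is hypothetical, and falling back to C_DOUBLE is an assumption about what your application would want.

```fortran
program long_double_demo
  use, intrinsic :: iso_c_binding, only: c_double, c_long_double
  implicit none

  ! Hypothetical helper constant: use the long double kind when the
  ! compiler provides one, otherwise fall back to the kind for C double.
  integer, parameter :: wide = merge(c_long_double, c_double, c_long_double > 0)

  real(wide) :: x

  x = 1.0_wide / 3.0_wide

  if (c_long_double > 0) then
     print *, 'C long double is interoperable, kind =', c_long_double
  else
     print *, 'No interoperable kind for C long double, value =', c_long_double
  end if

  print *, 'working kind =', wide, ' decimal digits =', precision(x)
  print *, 'one third    =', x
end program long_double_demo
```

Keep in mind that if the fallback is taken, the C side has to use double rather than long double for the two to match.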

LOGICAL and CHARACTER need special treatment when it comes to interoperability. The Fortran standard says that LOGICAL corresponds to C’s _Bool type, and defines a single kind value C_BOOL, which is 1 in Intel Fortran. But Intel Fortran, by default, tests LOGICALs for true/false differently than C does. Where C uses zero for false and not-zero for true, Intel Fortran defaults to treating even values as false and odd values as true. If you are going to use LOGICAL types to interoperate with C, be sure to specify the -fpscomp logicals (/fpscomp:logicals) option, which changes the interpretation to be C-like. This is included if you use -standard-semantics (/standard-semantics) – I recommend using this option any time you use Fortran 2003 (or later) features.
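
On the Fortran side, the declaration might look like the sketch below (the variable names are invented). It shows only the kind and the conversion to and from default LOGICAL. It cannot demonstrate the even/odd issue, which appears only when a value actually crosses over from C.

```fortran
program c_bool_demo
  use, intrinsic :: iso_c_binding, only: c_bool
  implicit none

  logical(c_bool) :: flag          ! corresponds to C's _Bool
  logical         :: default_flag  ! default LOGICAL, usually a different kind

  flag = .true._c_bool
  default_flag = flag              ! assignment converts between kinds

  print *, 'kind for C _Bool        =', c_bool
  print *, 'kind of default LOGICAL =', kind(default_flag)
  print *, 'flag                    =', flag
  print *, 'default_flag            =', default_flag
end program c_bool_demo
```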

Now we come to CHARACTER. C does not have character strings, at least not in the way Fortran does. Really. It has arrays of single characters, so this is how you must represent things in Fortran. There is a kind value defined, C_CHAR, corresponding to the C char type. But only length 1 character variables are interoperable. I’ll talk more about that when I come to procedure arguments, but just know that it is not as dire a situation as you might think.
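
To make the “array of single characters” idea concrete, here is a sketch that builds such an array from a Fortran string constant. The names text and c_string are made up. The trailing C_NULL_CHAR (also from ISO_C_BINDING) is there because C routines conventionally expect a NUL terminator.

```fortran
program c_char_demo
  use, intrinsic :: iso_c_binding, only: c_char, c_null_char
  implicit none

  character(kind=c_char, len=*), parameter :: text = c_char_'Hello'

  ! An array of length-1 characters, with one extra element for the
  ! terminating NUL that C string functions look for.
  character(kind=c_char, len=1) :: c_string(len(text) + 1)
  integer :: i

  do i = 1, len(text)
     c_string(i) = text(i:i)
  end do
  c_string(len(text) + 1) = c_null_char

  print *, 'number of elements (including NUL):', size(c_string)
  print '(a, 100a1)', ' characters before the NUL: ', c_string(1:len(text))
end program c_char_demo
```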

Derived types can also be interoperable, and that will be discussed next time when I talk about BIND(C).

There are other restrictions on interoperable variables. A scalar variable is interoperable only if its type parameters (kind and length) are interoperable (see above), it is not a coarray, it does not have the POINTER or ALLOCATABLE attribute (this may change in the future, I’ll talk about that in another post), and, if it is of type CHARACTER, its length is neither assumed nor defined by a non-constant expression. (Wait, I thought you said only length 1 was interoperable! Patience, grasshopper…)

Arrays are interoperable if the base type meets the scalar variable requirements above, the array is explicit-shape or assumed-size, and it is not zero-sized. Furthermore, assumed-size arrays are interoperable only with C arrays that have no size specified. There are some additional rules on rank: C has no true multidimensional arrays, only “arrays of arrays”, so a Fortran array of rank greater than 1 corresponds to a C array of arrays with the extents listed in the reverse order.
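
For the rank-1 case, a sketch might look like the following. It borrows BIND(C) and VALUE, which are not covered in this post, purely to give the arrays somewhere to go. The routine name sum_values and the C prototype in the comment are hypothetical.

```fortran
module array_demo
  use, intrinsic :: iso_c_binding, only: c_int, c_double
  implicit none
contains
  ! Hypothetical routine. On the C side this would match a prototype like
  !   double sum_values(const double values[], int n);
  function sum_values(values, n) result(total) bind(c, name='sum_values')
    real(c_double), intent(in) :: values(*)  ! assumed-size: C array with no size given
    integer(c_int), value      :: n          ! element count, passed by value
    real(c_double)             :: total

    total = sum(values(1:n))
  end function sum_values
end module array_demo

program use_array_demo
  use, intrinsic :: iso_c_binding, only: c_int, c_double
  use array_demo, only: sum_values
  implicit none

  ! Explicit-shape, rank 1, nonzero size, interoperable element type.
  real(c_double) :: samples(4)

  samples = [1.0_c_double, 2.0_c_double, 3.0_c_double, 4.0_c_double]
  print *, 'sum =', sum_values(samples, size(samples, kind=c_int))
end program use_array_demo
```

Here the function is called from Fortran so that the example stands alone, but the same interface is what a C caller would see.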

To be continued…

The next post will be dedicated to BIND(C), in all its manifestations. “C” you then! [I never got around to the second part, but I’ll get there someday…]

(Originally posted at Intel Developer Zone, copied with permission)

Doctor Fortran