Doctor, it hurts when I do this!

It is often said that you can write bad code in any language, and I certainly can’t argue with that. I do find, though, that the worst-looking code comes from programmers who are more familiar with another programming language. One can often tell that a C programmer wrote Fortran code, or that a Fortran programmer wrote C code (my C code probably looks like the latter).

One certainly can write clear and understandable code in Fortran, and many do. I see a lot of code from different people cross my desk each day, and I’ve learned to recognize certain “idioms” in Fortran code which are there to help make the code easier to understand: meaningful variable names, mixed case, free-form source, and so on. Sometimes, though, a helpful coding practice can bite you. Here’s one I ran across the other day…

One of the big strengths of modern Fortran is its wealth of array-oriented features. Few languages offer the whole-array and array slice operations that Fortran does, and often you can do an array operation without a traditional DO loop. For example, you can add two arrays with:

A = B + C

You can even have functions that return arrays (explicit interface required!), which freaks out some who grew up on FORTRAN 77 or even FORTRAN IV. Some programmers, though, like to remind readers that an array is an array, so they’ll write code that looks like this:

A(:) = func(B(:), C(:))

with the (:) alerting the reader that the variable is an array (sort of like the Dave Barry joke that an apostrophe serves to warn you that the letter “s” is coming up in grocery store signs). In Fortran syntax, (:) indicates an array section that starts at the first element and ends at the last element: the whole array, in other words.
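To make the two spellings concrete, here is a minimal sketch. The module name demo_funcs and the body of func (a simple element-wise sum) are assumptions made for illustration; the post does not say what the real function computes.

```fortran
module demo_funcs
  implicit none
contains
  ! Placed in a module so that callers see the explicit interface
  ! an array-valued function requires.
  ! Hypothetical body: the original post does not show func.
  function func(x, y) result(r)
    real, intent(in) :: x(:), y(:)
    real :: r(size(x))
    r = x + y
  end function func
end module demo_funcs

program section_demo
  use demo_funcs
  implicit none
  real :: a(4), b(4), c(4)

  b = [1.0, 2.0, 3.0, 4.0]
  c = 10.0                    ! scalar assigned to every element

  a = func(b, c)              ! whole-array names
  print *, a

  a(:) = func(b(:), c(:))     ! same values, written as array sections
  print *, a
end program section_demo
```

Because a is an ordinary fixed-size array here, both assignments store the same values; the difference only starts to matter once allocatable arrays are involved.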

Whenever I see this usage, I cringe, because I know that the compiler has to work extra hard to recognize that the programmer really meant “the whole array” and not a piece of it. In the past, unnecessary use of (:) would often prevent optimizations. Nowadays this is less often the case, thanks to hard work by the compiler developers, but sometimes it still happens. Still, most of the time, the (:) is “harmless” in that it does not change the overall meaning of the program, since an array section is “definable” (can be stored into), as long as you don’t use a vector subscript such as A([1,3,5,7]).

In the recent case, the customer had written something like:

A(:) = func(B)

where A was an ALLOCATABLE array and function “func” returned an array. In Fortran 90 and 95, the language required that A already be allocated and have the same shape (dimensions) as the function return value.

Fortran 2003, however, added a new twist. In F2003, if the allocatable array on the left of the assignment is not currently allocated with a shape matching the right side, it is automatically deallocated (if needed) and then allocated with the correct shape. This can make code a lot cleaner, as you don’t have to worry about knowing the shape in advance.
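A small sketch of that behavior, assuming a compiler with the F2003 reallocation semantics in effect (the program and variable names are invented for this example):

```fortran
program realloc_demo
  implicit none
  real, allocatable :: a(:)

  ! a starts out unallocated; the assignment allocates it
  ! with three elements.
  a = [1.0, 2.0, 3.0]
  print *, allocated(a), size(a)

  ! The right side now has a different shape, so a is
  ! deallocated and reallocated with five elements.
  a = [4.0, 5.0, 6.0, 7.0, 8.0]
  print *, size(a)
end program realloc_demo
```

Under Fortran 90/95 rules, both assignments would have been errors: the first because a is not allocated, the second because the shapes differ.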

The downside, though, is that the checking required to support this adds a lot of extra code, and applications where the array is known to be already allocated to the correct shape don’t need the check, which would only slow them down. This F2003 feature is enabled by default beginning in the Intel Fortran Compiler 17.0 release. In earlier releases it is not enabled by default; you have to ask for it with the option /assume:realloc_lhs, or for Linux and Mac users, -assume realloc_lhs. (“lhs” here means “left hand side”.)

The programmer who wrote the above code had in fact used this feature and depended on it, with array A starting out unallocated. He was therefore surprised to find that his program, when it got to the above assignment, complained that array bounds were violated!

It turned out that it was the use of the “helpful” (:) that caused the problem. This changed the meaning of the left-hand-side from an “allocatable variable” to an array section, and as such, it did not qualify for automatic reallocation. Consider, for example, if the code had read:

A(2:5) = func(B)

There is no way to reallocate just part of an array!
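The difference between the two forms can be sketched as follows. This assumes the F2003 reallocation semantics are in effect; the module name pitfall_funcs and the body of func (doubling its argument) are made up for illustration, since the customer’s actual function is not shown.

```fortran
module pitfall_funcs
  implicit none
contains
  ! Hypothetical stand-in for the customer's array-valued function.
  function func(x) result(r)
    real, intent(in) :: x(:)
    real :: r(size(x))
    r = 2.0 * x
  end function func
end module pitfall_funcs

program pitfall_demo
  use pitfall_funcs
  implicit none
  real, allocatable :: a(:)
  real :: b(5)

  b = [1.0, 2.0, 3.0, 4.0, 5.0]

  ! a(:) = func(b)   ! invalid at this point: a is unallocated, and
  !                  ! an array section is never allocated for you

  a = func(b)        ! whole allocatable variable: allocated to match
  print *, size(a)

  a(:) = func(b)     ! fine now: a is allocated and the shapes conform
  print *, a
end program pitfall_demo
```

The commented-out line is the customer’s situation. It is left as a comment because a program that executes it is not valid Fortran, and what happens then depends on the compiler and on whether bounds checking is enabled.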

The solution in this case was to remove the superfluous (:), in which case the program ran as expected. On my advice, the customer also removed unnecessary (:) in other places as they could impede optimization.

So, Doctor Fortran’s advice if you put (:) on your arrays? Don’t do that!

Got any other examples of Fortran coding practices that can hurt you?

(Originally posted at Intel Developer Zone, copied with permission)

Comments

Thank you for posting this. From what I understand, use of `(:)` increases compilation time, whereas not using it increases run time (owing to the extra checks required to verify whether the allocated array is of the same shape/size). Therefore, performance-wise, shouldn’t the use of `(:)` be the better option, at the expense of increased compilation time?

It depends on what you want. If, as in the post, the programmer intends for the F2003 auto-reallocation semantics to be available, then you don’t want the (:). If you know you don’t want it, then sure, use the (:) to avoid the check if you believe it will hurt performance (I don’t). When I wrote this post, auto-reallocation was not the default in the Intel compiler, but it is now.

I see. Thank you once again for clarifying my doubt. I guess I will keep the `(:)` in my existing codes then. In future codes, I might try to check how much of a difference it makes.
