When coupled with arrays beginning at index 1, and the language does not document either fact well, it only ends in tears. And array-out-of-bound errors.
From an abstract language design perspective, I think 1-based indexing is rather more defensible than column-major index order, although both are clearly the Wrong Thing; however, it seems to be the case that I make more errors on account of 1-based indexing.
If you have a two-dimensional array, there are two ways to interpret position (i,j) within the array: as row i, column j, which is called row-major order, and as row j, column i, which is called column-major order.
Row-major order is the only correct choice, both because it's much more natural to think about it that way, and because iterating over all entries in the matrix has much more sensible memory behavior if you do it that way. The designers of Fortran could have known that column-major was wrong, but I'm prepared to forgive them for not realizing it. Nobody whose language is more recent has any excuse.
... I guess you perfectly well could lay out the matrix in memory so that column-major indexing had sane memory behavior, but as far as I know no language does that, even the ones that use that kind of indexing.
So, let me get this straight: row-major indexing makes the indices appear in correct numerical order in memory [e.g. (1,1), (1,2), (2,1), (2,2)] while in column-major indexing it's sideways [(1,1), (2,1), (1,2), (2,2)].
...I fear I am not enough of a computer programmer to understand the difference. Is it, as my father suggests, something about convenience of contiguous blocks of memory?
Yeah. You really really want successive iterations of the innermost loop to access consecutive bytes in memory, so the memory controller can anticipate the loads. And it's most natural to write loops from left index to right, so this just works with row-major order, but is all screwed up with column-major. Fortran compilers go to great length to flip the loops around to get sane memory access patterns.
Zack, in Fortran data is stored contiguously in columns. I really cannot find disadvantages to have column-major other than having to adapt data layouts of algorithms that were specified as row-major. Preferring row-major is just a sign of the overuse of C-like languages.
Compilers try to correct programmers that don't know languages semantics. A hilarious example is the 171.swim benchmark in the SPEC Cpu2000, where you have the comment and code of those who modified the benchmark, and who obviously didn't understood Fortran's memory layout.
C C *** modified for SPEC results verification C *** We want to ensure that all calculations were done C *** so we "use" all of the computed results. C *** C *** Since all of the comparisons of the individual diagnal terms C *** often differ in the smaller values, the check we have selected C *** is to add the absolute values of all terms and print these results C WRITE(7,350) NCYCLE,PTIME 350 FORMAT(/' CYCLE NUMBER',I5,' MODEL TIME IN HOURS', F6.2) WRITE(7,360) (UNEW(I,I),I=1,MNMIN,10) 360 FORMAT(/' DIAGONAL ELEMENTS OF U ', //(8E15.7)) C *** PCHECK = 0.0D0 UCHECK = 0.0D0 VCHECK = 0.0D0
370 CONTINUE C TEST FOR END OF RUN IF(NCYCLE .GE. ITMAX) THEN STOP ENDIF
As for your original question, "Column major array indexing: threat or menace?", I would say: it depends on your programmers knowledge of the languages: you can also have bad programmers for C-like languages. I'm interpreting your question as wanting a unique semantics for array indexing, and you're inclined to keep the C-like semantics. I find no reason to discard the other memory layout, and in the end, the question superfluous.
no subject
Date: 2007-11-10 07:48 am (UTC)no subject
Date: 2007-11-11 03:48 am (UTC)From an abstract language design perspective, I think 1-based indexing is rather more defensible than column-major index order, although both are clearly the Wrong Thing; however, it seems to be the case that I make more errors on account of 1-based indexing.
no subject
Date: 2007-11-11 02:02 am (UTC)no subject
Date: 2007-11-11 03:52 am (UTC)Row-major order is the only correct choice, both because it's much more natural to think about it that way, and because iterating over all entries in the matrix has much more sensible memory behavior if you do it that way. The designers of Fortran could have known that column-major was wrong, but I'm prepared to forgive them for not realizing it. Nobody whose language is more recent has any excuse.
no subject
Date: 2007-11-11 04:04 am (UTC)no subject
Date: 2007-11-11 02:07 pm (UTC)...I fear I am not enough of a computer programmer to understand the difference. Is it, as my father suggests, something about convenience of contiguous blocks of memory?
no subject
Date: 2007-11-12 06:08 am (UTC)no subject
Date: 2007-12-30 06:19 pm (UTC)cannot find disadvantages to have column-major other than having to
adapt data layouts of algorithms that were specified as row-major.
Preferring row-major is just a sign of the overuse of C-like languages.
Compilers try to correct programmers that don't know languages
semantics. A hilarious example is the 171.swim benchmark in the SPEC
Cpu2000, where you have the comment and code of those who modified the
benchmark, and who obviously didn't understood Fortran's memory layout.
C
C *** modified for SPEC results verification
C *** We want to ensure that all calculations were done
C *** so we "use" all of the computed results.
C ***
C *** Since all of the comparisons of the individual diagnal terms
C *** often differ in the smaller values, the check we have selected
C *** is to add the absolute values of all terms and print these results
C
WRITE(7,350) NCYCLE,PTIME
350 FORMAT(/' CYCLE NUMBER',I5,' MODEL TIME IN HOURS', F6.2)
WRITE(7,360) (UNEW(I,I),I=1,MNMIN,10)
360 FORMAT(/' DIAGONAL ELEMENTS OF U ', //(8E15.7))
C ***
PCHECK = 0.0D0
UCHECK = 0.0D0
VCHECK = 0.0D0
DO 3500 ICHECK = 1, MNMIN
DO 4500 JCHECK = 1, MNMIN
PCHECK = PCHECK + ABS(PNEW(ICHECK,JCHECK))
UCHECK = UCHECK + ABS(UNEW(ICHECK,JCHECK))
VCHECK = VCHECK + ABS(VNEW(ICHECK,JCHECK))
4500 CONTINUE
UNEW(ICHECK,ICHECK) = UNEW(ICHECK,ICHECK)
1 * ( MOD (ICHECK, 100) /100.)
3500 CONTINUE
C ***
C ***
WRITE(6,366) PCHECK, UCHECK, VCHECK
366 FORMAT(/,
* ' Pcheck = ',E12.4,/,
* ' Ucheck = ',E12.4,/,
* ' Vcheck = ',E12.4,/)
370 CONTINUE
C TEST FOR END OF RUN
IF(NCYCLE .GE. ITMAX) THEN
STOP
ENDIF
As for your original question, "Column major array indexing: threat or
menace?", I would say: it depends on your programmers knowledge of the
languages: you can also have bad programmers for C-like languages.
I'm interpreting your question as wanting a unique semantics for array
indexing, and you're inclined to keep the C-like semantics. I find no reason
to discard the other memory layout, and in the end, the question superfluous.
Sebastian