zwol: stylized sketch of a face in profile (Default)
[personal profile] zwol
Column major array indexing: threat or menace? Discuss.

Date: 2007-11-10 07:48 am (UTC)
From: [identity profile] aldren.livejournal.com
When coupled with arrays beginning at index 1, and the language does not document either fact well, it only ends in tears. And array-out-of-bound errors.

Date: 2007-11-11 03:48 am (UTC)
From: [identity profile] zwol.livejournal.com
Yes.

From an abstract language design perspective, I think 1-based indexing is rather more defensible than column-major index order, although both are clearly the Wrong Thing; however, it seems to be the case that I make more errors on account of 1-based indexing.

Date: 2007-11-11 02:02 am (UTC)
From: [identity profile] packbat.livejournal.com
Speaking as a humble Mech. E. - huh?

Date: 2007-11-11 03:52 am (UTC)
From: [identity profile] zwol.livejournal.com
If you have a two-dimensional array, there are two ways to interpret position (i,j) within the array: as row i, column j, which is called row-major order, and as row j, column i, which is called column-major order.

Row-major order is the only correct choice, both because it's much more natural to think about it that way, and because iterating over all entries in the matrix has much more sensible memory behavior if you do it that way. The designers of Fortran could have known that column-major was wrong, but I'm prepared to forgive them for not realizing it. Nobody whose language is more recent has any excuse.

Date: 2007-11-11 04:04 am (UTC)
From: [identity profile] zwol.livejournal.com
... I guess you perfectly well could lay out the matrix in memory so that column-major indexing had sane memory behavior, but as far as I know no language does that, even the ones that use that kind of indexing.

Date: 2007-11-11 02:07 pm (UTC)
From: [identity profile] packbat.livejournal.com
So, let me get this straight: row-major indexing makes the indices appear in correct numerical order in memory [e.g. (1,1), (1,2), (2,1), (2,2)] while in column-major indexing it's sideways [(1,1), (2,1), (1,2), (2,2)].

...I fear I am not enough of a computer programmer to understand the difference. Is it, as my father suggests, something about convenience of contiguous blocks of memory?

Date: 2007-11-12 06:08 am (UTC)
From: [identity profile] zwol.livejournal.com
Yeah. You really really want successive iterations of the innermost loop to access consecutive bytes in memory, so the memory controller can anticipate the loads. And it's most natural to write loops from left index to right, so this just works with row-major order, but is all screwed up with column-major. Fortran compilers go to great length to flip the loops around to get sane memory access patterns.

Date: 2007-12-30 06:19 pm (UTC)
From: [identity profile] sebpop.livejournal.com
Zack, in Fortran data is stored contiguously in columns. I really
cannot find disadvantages to have column-major other than having to
adapt data layouts of algorithms that were specified as row-major.
Preferring row-major is just a sign of the overuse of C-like languages.

Compilers try to correct programmers that don't know languages
semantics. A hilarious example is the 171.swim benchmark in the SPEC
Cpu2000, where you have the comment and code of those who modified the
benchmark, and who obviously didn't understood Fortran's memory layout.

C
C *** modified for SPEC results verification
C *** We want to ensure that all calculations were done
C *** so we "use" all of the computed results.
C ***
C *** Since all of the comparisons of the individual diagnal terms
C *** often differ in the smaller values, the check we have selected
C *** is to add the absolute values of all terms and print these results
C
WRITE(7,350) NCYCLE,PTIME
350 FORMAT(/' CYCLE NUMBER',I5,' MODEL TIME IN HOURS', F6.2)
WRITE(7,360) (UNEW(I,I),I=1,MNMIN,10)
360 FORMAT(/' DIAGONAL ELEMENTS OF U ', //(8E15.7))
C ***
PCHECK = 0.0D0
UCHECK = 0.0D0
VCHECK = 0.0D0

DO 3500 ICHECK = 1, MNMIN
DO 4500 JCHECK = 1, MNMIN
PCHECK = PCHECK + ABS(PNEW(ICHECK,JCHECK))
UCHECK = UCHECK + ABS(UNEW(ICHECK,JCHECK))
VCHECK = VCHECK + ABS(VNEW(ICHECK,JCHECK))
4500 CONTINUE
UNEW(ICHECK,ICHECK) = UNEW(ICHECK,ICHECK)
1 * ( MOD (ICHECK, 100) /100.)
3500 CONTINUE
C ***
C ***
WRITE(6,366) PCHECK, UCHECK, VCHECK
366 FORMAT(/,
* ' Pcheck = ',E12.4,/,
* ' Ucheck = ',E12.4,/,
* ' Vcheck = ',E12.4,/)

370 CONTINUE
C TEST FOR END OF RUN
IF(NCYCLE .GE. ITMAX) THEN
STOP
ENDIF


As for your original question, "Column major array indexing: threat or
menace?", I would say: it depends on your programmers knowledge of the
languages: you can also have bad programmers for C-like languages.
I'm interpreting your question as wanting a unique semantics for array
indexing, and you're inclined to keep the C-like semantics. I find no reason
to discard the other memory layout, and in the end, the question superfluous.

Sebastian

April 2017

S M T W T F S
      1
2345678
9101112131415
16171819 202122
23242526272829
30      

Style Credit

Expand Cut Tags

No cut tags
Page generated Jan. 9th, 2026 01:51 am
Powered by Dreamwidth Studios