
- *To*: jeff@dark-techno.org
- *Subject*: Re: simple question about ATL_mmJIK.c
- *From*: R Clint Whaley <rwhaley@cs.utk.edu>
- *Date*: Thu, 26 Jul 2001 19:46:34 -0400 (EDT)
- *Cc*: atlas-comm@cs.utk.edu

Jeff,

>....
>NBmm0(MB, NB, KB, ATL_rone, pA, KB, pB, KB, beta, pC, ldpc);
>pA += NBNB;
>pB += NBNB;
>....
>
>I assume that what's going on here is that pA and pB are pointers to some
>starting offset within the matrices A and B. When NBmm0 is
>called, the kernel will be executed, and the sub region of size NBxNB of
>matrices A and B will be in the L1 cache, which the kernel then operates
>on at a higher speed than if the data weren't in cache. After returning,
>the pointers are advanced by NB*NB entries, and the process will repeat
>with another chunk of a matrix. So, my simple questions are:

Actually, they are not already in the cache, but A, at least, will be kept
there for this operation. ATLAS/doc/atlas_over.ps describes this in more
detail.

>
>1. NBNB I assume is the number of *entries* in the sub matrix, so NB
>squared. If I wanted to get the number of bytes this takes up, then this
>would be NB*NB*sizeof(TYPE). Is this correct?

Yes.

>2. It seems like instead of processing the sub matrices like you normally
>handle a 2D array, that ATLAS is mapping the 2D space into just a strictly
>linear space. Is this true?

Yes. This is the block-major format discussed on pages 7-8 of
ATLAS/doc/atlas_contrib.ps. This note, along with the provided
ATLAS/doc/atlas_over.ps, exists to explain at least some of these ideas.

Cheers,
Clint
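
To illustrate the block-major idea confirmed above, here is a minimal sketch in C. It is not ATLAS's actual copy routine, and the real layout may differ in details such as block ordering or transposition. The names `to_block_major` and `block_bytes`, the choice `NB = 4`, and `TYPE` as `double` are all assumptions made for this example. The sketch copies a column-major matrix into a linear buffer in which each NB x NB block is contiguous, so advancing a pointer by `NBNB` entries lands on the start of the next block.

```c
#include <stddef.h>

#define NB   4            /* hypothetical blocking factor, not ATLAS's tuned value */
#define NBNB (NB * NB)    /* number of entries in one NB x NB block */
typedef double TYPE;      /* stand-in for ATLAS's TYPE macro (assumption) */

/*
 * Copy an M x N column-major matrix A (leading dimension lda) into
 * block-major storage: NB x NB blocks stored one after another, each
 * block itself column-major with leading dimension NB.
 * Assumes M and N are multiples of NB. Blocks are emitted going down
 * each block column before moving to the next block column.
 */
static void to_block_major(size_t M, size_t N,
                           const TYPE *A, size_t lda, TYPE *out)
{
    for (size_t jb = 0; jb < N; jb += NB)          /* block column */
        for (size_t ib = 0; ib < M; ib += NB)      /* block row */
            for (size_t j = 0; j < NB; j++)        /* column within block */
                for (size_t i = 0; i < NB; i++)    /* row within block */
                    *out++ = A[(jb + j) * lda + (ib + i)];
}

/* Size of one block in bytes: NB*NB entries times sizeof(TYPE). */
static size_t block_bytes(void)
{
    return NBNB * sizeof(TYPE);
}
```

With a buffer `pA` filled this way, `pA += NBNB` steps from one contiguous block to the next, which is the pointer arithmetic seen in the quoted loop.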
