If the
matrix
is
stored in CDS format, it is still possible to
perform a matrix-vector product
by either rows or columns, but this
does not take advantage of the CDS format. The idea
is to make a change in coordinates in the doubly nested loop. Replacing
we get
The algorithm will now have a doubly nested loop with the outer loop
enumerating the diagonals diag=-p,q with
and
the
(nonnegative) numbers of diagonals to the left and right of the main
diagonal. The bounds for the inner loop follow from the requirement
that
for i = 1, n
y(i) = 0
end;
for diag = -diag_left, diag_right
for loc = max(1,1-diag), min(n,n-diag)
y(loc) = y(loc) + val(loc,diag) * x(loc+diag)
end;
end;
The transpose matrix-vector product
is a minor variation of the
algorithm above. Using the update formula
for i = 1, n
y(i) = 0
end;
for diag = -diag_right, diag_left
for loc = max(1,1-diag), min(n,n-diag)
y(loc) = y(loc) + val(loc+diag, -diag) * x(loc+diag)
end;
end;
The memory access for the CDS-based matrix-vector product