Next: Conclusion
Up: Optimal mapping and scheduling
 Previous: Proof
 
Let proc(I) be the processor that executes tile I. We have three 
cases to consider, depending upon whether proc(I) also executes
both successors 
I' and I'', or exactly one of them, or none of them: 
- both successors:
 -  proc(I)=proc(I')=proc(I'') 
 
The same processor executes both successors. They are executed sequentially and the last one being executed cannot begin execution before time-step 
.
As 
 the result is proven.
 - one successor:
 -  proc(I) = proc(I') and 
 
 
(respectively proc(I) = proc(I'') and 
).
A communication is needed between I and I'' (respectively I and I'),
hence 
 (respectively
)
 - no successor:
 -  
 and 
 
This case is similar to the previous one. 
 
Back to the proof of the theorem, let
 the total execution time using P processors. Let Idle be the cumulated idle time of all processors during execution. Finally, let 
 be the sequential execution time. Clearly,

Hence, to show that 
, we need to show that

The structure of the dependence graph does impose that some processors are idle 
at the beginning of the computation, which will lead to 
a lower bound for Idle. For instance, during the execution of tile (0,0),
there are necessarily P-1 idle processors. To go on, we 
recursively define 
 as follows (see Figure 5):
-   
, and
 -  for 
, 
 is the one of the two successors of 
 which is executed last: if 
, let I'=(i+1,j) and I''=(i,j+1) be the successors of tile I:
-  If 
, then 
, and we define S(k) as the remaining tiles in column j: 
),
 -  If 
, then 
, and we define S(k) as the remaining tiles in row i: 
 
,
 
 
We know from Lemma 1 that for all 
, 

We prove by induction that for 
, at least P-k processors are kept idle between the beginning of the execution of 
 and that of 
. This will lead to be result that

This will prove the desired result, because the same amount of idleness, so to speak, will be spent at the end
of the computation (by symmetry of the dependence graph).
Now, for the induction:
-  Let k=1: 
 is either (0,1) or (1,0). See Figure 5 where 
 and 
. Between the
the beginning of  the execution of 
 and that
of 
, the only
successors of 
 that can be executed are in S(1). But all tasks in S(1) must be executed sequentially, hence between the beginning of the execution of 
 and that of 
, at least
(P-1) processors are kept idle.
 -  Assume that the hypothesis is true until step k.
Between the beginning of the execution of 
 and that of 
, at most one processor can be active in S(1), at most another one in S(2), 
, and at most one processor in S(k+1), so that at most
k+1 processors can be active, or equivalently, at least P-(k+1) processors remain idle.
 
  
Figure 5: A schedule when 
 and 
.
Pivot tiles are labeled, and sets S(k) 
are framed.
 
 
   
 Next: Conclusion
Up: Optimal mapping and scheduling
 Previous: Proof
Jack Dongarra 
Sat Feb  8 08:17:58 EST 1997