next up previous
Next: Conclusion Up: Optimal mapping and scheduling Previous: Proof

Proof

Let proc(I) be the processor that executes tile I. We have three cases to consider, depending upon whether proc(I) also executes both successors I' and I'', or exactly one of them, or none of them:

both successors:
proc(I)=proc(I')=proc(I'')
The same processor executes both successors. They are executed sequentially and the last one being executed cannot begin execution before time-step tex2html_wrap_inline1216. As tex2html_wrap_inline1218 the result is proven.
one successor:
proc(I) = proc(I') and tex2html_wrap_inline1222
(respectively proc(I) = proc(I'') and tex2html_wrap_inline1226). A communication is needed between I and I'' (respectively I and I'), hence tex2html_wrap_inline1236 (respectively tex2html_wrap_inline1238)
no successor:
tex2html_wrap_inline1226 and tex2html_wrap_inline1222
This case is similar to the previous one.

Back to the proof of the theorem, let tex2html_wrap_inline1244 the total execution time using P processors. Let Idle be the cumulated idle time of all processors during execution. Finally, let tex2html_wrap_inline1250 be the sequential execution time. Clearly,
displaymath1252
Hence, to show that tex2html_wrap_inline1254, we need to show that
displaymath1256

The structure of the dependence graph does impose that some processors are idle at the beginning of the computation, which will lead to a lower bound for Idle. For instance, during the execution of tile (0,0), there are necessarily P-1 idle processors. To go on, we recursively define tex2html_wrap_inline1264 as follows (see Figure 5):

We know from Lemma 1 that for all tex2html_wrap_inline1268,
displaymath1304

We prove by induction that for tex2html_wrap_inline1306, at least P-k processors are kept idle between the beginning of the execution of tex2html_wrap_inline1272 and that of tex2html_wrap_inline1264. This will lead to be result that
displaymath1314
This will prove the desired result, because the same amount of idleness, so to speak, will be spent at the end of the computation (by symmetry of the dependence graph). Now, for the induction:


  figure422
Figure 5: A schedule when tex2html_wrap_inline928 and tex2html_wrap_inline1364. Pivot tiles are labeled, and sets S(k) are framed.


next up previous
Next: Conclusion Up: Optimal mapping and scheduling Previous: Proof

Jack Dongarra
Sat Feb 8 08:17:58 EST 1997