IBM PE Debugging Tips
Here are some debugging tips that apply only to IBM MPI (PE):
- Avoid unwanted timeouts
You can cause undesired timeouts if you place breakpoints that stop other processes too soon after they call MPI_Init() or MPL_Init(). If you create "stop all" breakpoints, the first process that reaches the breakpoint stops all the other parallel processes that have not yet arrived at it. This may cause a timeout.
To turn this behavior off for a breakpoint, select the line with the stop symbol, and then select the Process Window's Action Point > Properties command. When the Properties dialog box appears, deselect the Plant in share group check box.
- Control the poe process
Even though the poe process continues under TotalView control, you should not attempt to start, stop, or otherwise interact with it. Your parallel tasks require that poe continue to run. For this reason, if poe is stopped, TotalView automatically continues it when you continue any parallel task.
- Avoid slow processes due to node saturation
If you try to debug a PE program in which more than three parallel tasks run on a single node, the parallel tasks on each node may run noticeably slower than they would run if you were not debugging them.
The slowdown becomes more noticeable as the number of tasks increases, and, in some cases, the parallel tasks may make hardly any progress. This is because PE uses the SIGALRM signal to implement communications operations, and AIX requires debuggers to intercept all signals. As the number of parallel tasks on a node increases, TotalView becomes saturated and cannot keep up with the SIGALRMs being sent, which slows down the tasks.
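The following Python sketch is only an illustration of why the signal load grows with the task count; it is not PE or TotalView code. It arms a periodic interval timer so that one process receives a steady stream of SIGALRMs, counts them, and then estimates the combined rate for several such tasks on a node. The function names (count_alarms, signals_per_second) and the 10 ms interval are assumptions chosen for the example; the source does not state the timer interval PE actually uses.

```python
import signal
import time


def count_alarms(interval_s=0.01, duration_s=0.2):
    """Count SIGALRM deliveries from a periodic timer in this process.

    The interval is a hypothetical value, not the one PE uses.
    Must be called from the main thread on a Unix-like system.
    """
    count = 0

    def handler(signum, frame):
        nonlocal count
        count += 1

    previous = signal.signal(signal.SIGALRM, handler)
    # Fire the first SIGALRM after interval_s, then repeat every interval_s.
    signal.setitimer(signal.ITIMER_REAL, interval_s, interval_s)
    try:
        deadline = time.monotonic() + duration_s
        while time.monotonic() < deadline:
            pass  # busy loop standing in for application work
    finally:
        # Disarm the timer and restore the previous handler.
        signal.setitimer(signal.ITIMER_REAL, 0)
        signal.signal(signal.SIGALRM, previous)
    return count


def signals_per_second(tasks, interval_s):
    """Estimate the combined SIGALRM rate for `tasks` tasks on one node,
    assuming each task's timer fires once every interval_s seconds."""
    return tasks / interval_s


if __name__ == "__main__":
    print("alarms seen by one task:", count_alarms())
    for tasks in (1, 4, 8):
        print(tasks, "tasks:", signals_per_second(tasks, 0.01), "signals/s")
```

Under these assumptions the estimated rate grows linearly with the number of tasks on the node, and a debugger that must intercept every signal has to handle the sum of all of them.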