Re: [ptp-dev] SDM scalability issue

Thanks, Greg!

Leonardo Garcia
IDE Software Engineer - Linux on Cell
Linux Technology Center Brazil
Phone: +55-19-2132-2068 (T/L: 839-2068)

Greg Watson wrote:

I just found the bug causing this problem and have checked in a fix. See bug #255524. I'll do a build tonight so you can test.


On Nov 14, 2008, at 9:35 PM, Leonardo Augusto Guimarães Garcia wrote:


I am having a weird problem while debugging with PTP 2.1.0.

I have a cluster of 10 Power machines running Fedora 9 Linux and Open MPI 1.3. I can run MPI applications from the command line and from within PTP on this cluster without problems, even when I use 10 processes distributed across the 10 nodes I have.

However, when I debug the application (I am using the "Open MPI Pi C Project" that comes with PLDT), everything works with any number of processes from 1 to 7. When I switch to 8 or more processes, the debug session is started and connected and SDM is started, but the processes don't suspend execution at the beginning of the debug session. After a while some processes may get suspended, but in all cases one of two things happens: either none of the processes gets suspended, or some of the processes (usually two of them) have their debug session broken and the debug session enters an inconsistent state that I can only get out of by canceling the debug session completely.

I've already tried to launch the 8 processes using different combinations of the machines in the cluster. I always get the same problem. On the other hand, if I try different combinations of machines with 7 processes, it always works fine.
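
To make the "different combinations of machines" test repeatable, here is a minimal sketch that enumerates every choice of hosts for a given process count and renders each one as an Open MPI hostfile (one slot per host). The host names node01..node10 and the helper name hostfile_variants are hypothetical, since the machines are not named in this thread, and this is not part of PTP or SDM.

```python
from itertools import combinations

# Hypothetical host names; the original post does not name the machines.
HOSTS = [f"node{i:02d}" for i in range(1, 11)]


def hostfile_variants(hosts, nprocs):
    """Yield hostfile text for every way of choosing nprocs distinct hosts.

    Each variant gives one process slot per chosen host, using the
    "<host> slots=<n>" line format of Open MPI hostfiles.
    """
    for chosen in combinations(hosts, nprocs):
        yield "".join(f"{host} slots=1\n" for host in chosen)


if __name__ == "__main__":
    # Compare the working case (7 processes) with the failing one (8).
    for nprocs in (7, 8):
        variants = list(hostfile_variants(HOSTS, nprocs))
        print(f"{nprocs} processes: {len(variants)} host combinations")
```

Each generated text could be written to a file and used as the hostfile for one run or debug launch, so that a failure can be tied to a specific set of nodes rather than to the process count alone.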

Has anyone faced something similar?

Best regards,

Leonardo Garcia

ptp-dev mailing list
