Re: [ptp-dev] Problem with invoking SDM debugger on pSeries Linux


I'm not sure exactly why this is happening. I started seeing this problem myself after I installed last night's nightly build (the 10/23 build) and thought it was because something in the SDM code was failing since I wasn't passing the flags you mentioned. I think I was still seeing it this morning after I had fixed up my code.

I looked at the system headers and it looks like 'Bad address' could be the result of strerror(errno) where errno is 14 (EFAULT). This page http://www.wlug.org.nz/EFAULT hints that it could be caused by a bad parameter list passed to one of the exec() family of system calls, for instance not putting a NULL at the end of the argument or environment variable arrays.
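
As a quick way to double-check the errno mapping without digging through headers, here is a minimal sketch in Python. The helper name describe_errno is made up for illustration; errno.errorcode and os.strerror are standard library calls, and the exact message text depends on the platform's C library.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, strerror text) for a numeric errno value."""
    # errno.errorcode maps numbers to names such as 'EFAULT' on this platform
    name = errno.errorcode.get(code, "UNKNOWN")
    # os.strerror returns the same text the C strerror() would produce
    return name, os.strerror(code)


if __name__ == "__main__":
    name, text = describe_errno(14)
    print("errno 14 = %s: %s" % (name, text))
```

On the Linux systems I have looked at, 14 maps to EFAULT, which matches the 'Bad address' status in the log.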

I'll look at this a bit more tonight and let you know if I find anything.
Dave


Greg Watson <g.watson@xxxxxxxxxxxx>
Sent by: ptp-dev-bounces@xxxxxxxxxxx
10/24/2008 02:12 PM
Please respond to: Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
To: Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
cc:
Subject: Re: [ptp-dev] Problem with invoking SDM debugger on pSeries Linux

Dave,

I've committed a fix for the command line options to the debugger.  
However there is something else strange going on because it's failing  
at line 1773 in ptp_ibmpe_proxy with the error:

PE@k17sf2p03: 10/24 14:06:38 T(256) Error: sdm failed to execute, status Bad address

The only way I could get it to work was to pass NULL rather than the
environment pointer to the execve call. Do you have any idea what
might be causing this?
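
For comparison, here is a rough sketch of launching a child with an explicit environment and reporting an exec failure in the same "failed to execute, status ..." style as the log line above. It is Python rather than the proxy's C, so the interpreter builds the NULL-terminated argv and envp arrays itself; the launch helper and the SDM_DEMO variable are hypothetical names used only for this illustration.

```python
import os
import subprocess
import sys


def launch(argv, env):
    """Start a child with an explicit environment; return (ok, detail)."""
    # Force every key and value to a string so the environment handed to
    # the underlying exec call is well formed.
    clean_env = {str(k): str(v) for k, v in env.items()}
    try:
        done = subprocess.run(argv, env=clean_env,
                              capture_output=True, text=True)
    except OSError as exc:
        # The exec itself failed: report the errno text, as strerror() would
        reason = os.strerror(exc.errno) if exc.errno else str(exc)
        return False, "%s failed to execute, status %s" % (argv[0], reason)
    return True, done.stdout.strip()


if __name__ == "__main__":
    child_env = dict(os.environ, SDM_DEMO="1")
    print(launch([sys.executable, "-c",
                  "import os; print(os.environ['SDM_DEMO'])"], child_env))
```

In C the equivalent check would be that both arrays passed to execve end with a NULL entry, which is the condition the EFAULT page hints at.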

Greg

On Oct 23, 2008, at 11:44 PM, Greg Watson wrote:

> Dave,
>
> I've found a few problems. First, there was a bug in the handling of
> thread stack frames that was only apparent on linux_ppc64; I've now
> fixed it. Second, the version of gdb on the machine does not
> handle threads correctly. If you attach gdb to a poe process and
> select a thread, the -stack-info-depth command thinks the stack
> frame is corrupted. I tried the same thing with gdb 6.8 (the installed
> version is 6.5) and the problem goes away. See the traces below.
> Finally, you're not passing the correct parameters to the sdm master
> and child processes. The --debugger and --debugger_path arguments
> should only be passed to the child processes. Currently you're
> passing both to the master, which doesn't use them, and not passing
> --debugger_path to the children.
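
To make the argument rule above concrete, here is a minimal sketch of how a launcher might assemble the two argument lists. Only the flag names --debugger and --debugger_path come from the message; the build_sdm_args helper, the "--flag=value" spelling, and the sample values are assumptions for illustration and are not taken from the proxy source.

```python
def build_sdm_args(common_args, debugger, debugger_path, is_master):
    """Return the argument list for one SDM process.

    common_args   -- arguments shared by master and children (caller supplied)
    debugger      -- debugger backend name (hypothetical value)
    debugger_path -- path to the debugger executable (hypothetical value)
    is_master     -- True for the master SDM, False for a child SDM
    """
    args = list(common_args)
    if not is_master:
        # Only the children drive a debugger, so only they get these flags
        args.append("--debugger=%s" % debugger)
        args.append("--debugger_path=%s" % debugger_path)
    return args


if __name__ == "__main__":
    print(build_sdm_args([], "gdb-mi", "/usr/bin/gdb", is_master=True))
    print(build_sdm_args([], "gdb-mi", "/usr/bin/gdb", is_master=False))
```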
>
> Greg
>
> **** GDB 6.5-25.el5rh ****
>
> -bash-3.1$ gdb -i mi x
> ~"GNU gdb Red Hat Linux (6.5-25.el5rh)\n"
> ~"Copyright (C) 2006 Free Software Foundation, Inc.\n"
> ~"GDB is free software, covered by the GNU General Public License, and you are\n"
> ~"welcome to change it and/or distribute copies of it under certain conditions.\n"
> ~"Type \"show copying\" to see the conditions.\n"
> ~"There is absolutely no warranty for GDB.  Type \"show warranty\" for details.\n"
> ~"This GDB was configured as \"ppc64-redhat-linux-gnu\"..."
> ~"Using host libthread_db library \"/lib64/libthread_db.so.1\".\n"
> ~"\n"
> (gdb)
> attach 22461
> &"attach 22461\n"
> ~"Attaching to program: /home/greg/x, process 22461\n"
> ~"Reading symbols from /usr/lib/libmpi_ibm.so..."
> ~"done.\n"
> ~"Loaded symbols for /usr/lib/libmpi_ibm.so\n"
> ~"Reading symbols from /usr/lib/libpoe.so..."
> ~"done.\n"
> ~"Loaded symbols for /usr/lib/libpoe.so\n"
> ~"Reading symbols from /usr/lib/liblapi.so..."
> ~"done.\n"
> ~"Loaded symbols for /usr/lib/liblapi.so\n"
> ~"Reading symbols from /lib/libpthread.so.0..."
> ~"done.\n"
> ~"[Thread debugging using libthread_db enabled]\n"
> ~"[New Thread 268383472 (LWP 22461)]\n"
> ~"[New Thread 1222243504 (LWP 22473)]\n"
> ~"[New Thread 1218049200 (LWP 22472)]\n"
> ~"[New Thread 1205073072 (LWP 22465)]\n"
> ~"[New Thread 1200878768 (LWP 22464)]\n"
> ~"[New Thread 1083241648 (LWP 22463)]\n"
> ~"Loaded symbols for /lib/libpthread.so.0\n"
> ~"Reading symbols from /lib/libm.so.6..."
> ~"done.\n"
> ~"Loaded symbols for /lib/libm.so.6\n"
> ~"Reading symbols from /lib/libc.so.6..."
> ~"done.\n"
> ~"Loaded symbols for /lib/libc.so.6\n"
> ~"Reading symbols from /usr/lib/libstdc++.so.6..."
> ~"done.\n"
> ~"Loaded symbols for /usr/lib/libstdc++.so.6\n"
> ~"Reading symbols from /lib/libgcc_s.so.1..."
> ~"done.\n"
> ~"Loaded symbols for /lib/libgcc_s.so.1\n"
> ~"Reading symbols from /lib/libdl.so.2..."
> ~"done.\n"
> ~"Loaded symbols for /lib/libdl.so.2\n"
> ~"Reading symbols from /lib/ld.so.1..."
> ~"done.\n"
> ~"Loaded symbols for /lib/ld.so.1\n"
> ~"Reading symbols from /usr/lib/libpnsd.so..."
> ~"done.\n"
> ~"Loaded symbols for /usr/lib/libpnsd.so\n"
> ~"Reading symbols from /usr/lib/liblapiudp.so..."
> ~"done.\n"
> ~"Loaded symbols for /usr/lib/liblapiudp.so\n"
> ~"0x401fc2fc in nanosleep () from /lib/libc.so.6\n"
> ^done
> (gdb)
> info threads
> &"info threads\n"
> ~"  6 Thread 1083241648 (LWP 22463)  0x4004082c in do_sigwait ()\n"
> ~"   from /lib/libpthread.so.0\n"
> ~"  5 Thread 1200878768 (LWP 22464)  0x4003b014 in pthread_cond_wait@@GLIBC_2.3.2\n"
> ~"    () from /lib/libpthread.so.0\n"
> ~"  4 Thread 1205073072 (LWP 22465)  0x4003b014 in pthread_cond_wait@@GLIBC_2.3.2\n"
> ~"    () from /lib/libpthread.so.0\n"
> ~"  3 Thread 1218049200 (LWP 22472)  0x4003b014 in pthread_cond_wait@@GLIBC_2.3.2\n"
> ~"    () from /lib/libpthread.so.0\n"
> ~"  2 Thread 1222243504 (LWP 22473)  0x4003b5c4 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/libpthread.so.0\n"
> ~"  1 Thread 268383472 (LWP 22461)  0x401fc2fc in nanosleep ()\n"
> ~"   from /lib/libc.so.6\n"
> ^done
> (gdb)
> -stack-info-depth
> ^done,depth="3"
> (gdb)
> -thread-select 2
> ^done,new-thread-id="2",frame={level="0",addr="0x4003b5c4",func="pthread_cond_timedwait@@GLIBC_2.3.2",args=[],from="/lib/libpthread.so.0"}
> (gdb)
> -stack-info-depth
> &"Previous frame inner to this frame (corrupt stack?)\n"
> ^error,msg="Previous frame inner to this frame (corrupt stack?)"
> (gdb)
>
> *** GDB 6.8 ****
>
> -bash-3.1$ /home/greg/gdb-6.8/gdb/gdb -i mi x
> ~"GNU gdb 6.8\n"
> ~"Copyright (C) 2008 Free Software Foundation, Inc.\n"
> ~"License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>\n"
> ~"This is free software: you are free to change and redistribute it.\n"
> ~"There is NO WARRANTY, to the extent permitted by law.  Type \"show copying\"\n"
> ~"and \"show warranty\" for details.\n"
> ~"This GDB was configured as \"powerpc64-unknown-linux-gnu\"...\n"
> (gdb)
> attach 22461
> &"attach 22461\n"
> ~"Attaching to program: /home/greg/x, process 22461\n"
> ~"Reading symbols from /usr/lib/libmpi_ibm.so..."
> ~"done.\n"
> ~"Loaded symbols for /usr/lib/libmpi_ibm.so\n"
> ~"Reading symbols from /usr/lib/libpoe.so..."
> ~"done.\n"
> ~"Loaded symbols for /usr/lib/libpoe.so\n"
> ~"Reading symbols from /usr/lib/liblapi.so..."
> ~"done.\n"
> ~"Loaded symbols for /usr/lib/liblapi.so\n"
> ~"Reading symbols from /lib/libpthread.so.0..."
> ~"done.\n"
> ~"[Thread debugging using libthread_db enabled]\n"
> ~"[New Thread 0xfff34f0 (LWP 22461)]\n"
> ~"[New Thread 0x48d9f4b0 (LWP 22473)]\n"
> ~"[New Thread 0x4899f4b0 (LWP 22472)]\n"
> ~"[New Thread 0x47d3f4b0 (LWP 22465)]\n"
> ~"[New Thread 0x4793f4b0 (LWP 22464)]\n"
> ~"[New Thread 0x4090f4b0 (LWP 22463)]\n"
> ~"Loaded symbols for /lib/libpthread.so.0\n"
> ~"Reading symbols from /lib/libm.so.6..."
> ~"done.\n"
> ~"Loaded symbols for /lib/libm.so.6\n"
> ~"Reading symbols from /lib/libc.so.6..."
> ~"done.\n"
> ~"Loaded symbols for /lib/libc.so.6\n"
> ~"Reading symbols from /usr/lib/libstdc++.so.6..."
> ~"done.\n"
> ~"Loaded symbols for /usr/lib/libstdc++.so.6\n"
> ~"Reading symbols from /lib/libgcc_s.so.1..."
> ~"done.\n"
> ~"Loaded symbols for /lib/libgcc_s.so.1\n"
> ~"Reading symbols from /lib/libdl.so.2..."
> ~"done.\n"
> ~"Loaded symbols for /lib/libdl.so.2\n"
> ~"Reading symbols from /lib/ld.so.1..."
> ~"done.\n"
> ~"Loaded symbols for /lib/ld.so.1\n"
> ~"Reading symbols from /usr/lib/libpnsd.so..."
> ~"done.\n"
> ~"Loaded symbols for /usr/lib/libpnsd.so\n"
> ~"Reading symbols from /usr/lib/liblapiudp.so..."
> ~"done.\n"
> ~"Loaded symbols for /usr/lib/liblapiudp.so\n"
> ~"0x401fc2fc in nanosleep () from /lib/libc.so.6\n"
> ^done
> (gdb)
> info threads
> &"info threads\n"
> ~"  6 Thread 0x4090f4b0 (LWP 22463)  0x4004082c in do_sigwait ()\n"
> ~"   from /lib/libpthread.so.0\n"
> ~"  5 Thread 0x4793f4b0 (LWP 22464)  0x4003b014 in pthread_cond_wait@@GLIBC_2.3.2\n"
> ~"    () from /lib/libpthread.so.0\n"
> ~"  4 Thread 0x47d3f4b0 (LWP 22465)  0x4003b014 in pthread_cond_wait@@GLIBC_2.3.2\n"
> ~"    () from /lib/libpthread.so.0\n"
> ~"  3 Thread 0x4899f4b0 (LWP 22472)  0x4003b014 in pthread_cond_wait@@GLIBC_2.3.2\n"
> ~"    () from /lib/libpthread.so.0\n"
> ~"  2 Thread 0x48d9f4b0 (LWP 22473)  0x4003b5c4 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/libpthread.so.0\n"
> ~"  1 Thread 0xfff34f0 (LWP 22461)  0x401fc2fc in nanosleep ()\n"
> ~"   from /lib/libc.so.6\n"
> ^done
> (gdb)
> -stack-info-depth
> ^done,depth="3"
> (gdb)
> -thread-select 2
> ^done,new-thread-id="2",frame={level="0",addr="0x4003b5c4",func="pthread_cond_timedwait@@GLIBC_2.3.2",args=[],from="/lib/libpthread.so.0"}
> (gdb)
> -stack-info-depth
> ^done,depth="5"
> (gdb)
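
The difference between the two traces comes down to the result record that follows -stack-info-depth: ^done,depth="..." on 6.8 versus ^error,msg="..." on 6.5. Here is a rough sketch of how a client could tell them apart. The function names are made up, and the parser only picks up flat name="value" pairs (fields nested inside a tuple such as frame={...} get flattened into the same dictionary), so treat it as an illustration rather than a full GDB/MI parser.

```python
import re

# Result records start with '^' followed by a result class
_RESULT = re.compile(r'^\^(done|running|connected|error|exit)(?:,(.*))?$')
# Flat name="value" pairs; backslash escapes inside the value are skipped over
_FIELD = re.compile(r'(\w[\w-]*)="((?:[^"\\]|\\.)*)"')


def parse_result(line):
    """Return (result_class, fields) for a result record, or None otherwise."""
    match = _RESULT.match(line.strip())
    if match is None:
        return None
    rest = match.group(2) or ""
    return match.group(1), dict(_FIELD.findall(rest))


def stack_depth(line):
    """Extract the depth from a -stack-info-depth reply, raising on ^error."""
    parsed = parse_result(line)
    if parsed is None:
        raise ValueError("not a result record: %r" % line)
    result_class, fields = parsed
    if result_class == "error":
        raise RuntimeError(fields.get("msg", "unknown gdb error"))
    return int(fields["depth"])


if __name__ == "__main__":
    print(stack_depth('^done,depth="5"'))
```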
>
> On Oct 21, 2008, at 5:48 PM, Greg Watson wrote:
>
>> Dave,
>>
>> I think the ability to see global variables is a function of the  
>> gdb version, so it might not be supported on this platform.
>>
>> How do you have your project set up? Do you have a local copy of  
>> the project in your workspace, or are you using RDT?
>>
>> Greg
>>
>> On Oct 21, 2008, at 4:07 PM, Dave Wootton wrote:
>>
>>> I forgot to mention: when the processes suspend at main, I can see
>>> local variables in main() and their values in the variables view, and
>>> gdb status for signals in the signals view. I don't see global
>>> variables. The 'add global variables' icon is grayed out.
>>>
>>> Eclipse tries to open a source window, but fails with the message
>>> 'Could not open the editor: Editor could not be initialized.' and the
>>> following exception.
>>> I don't know if this exception is due to running Eclipse remotely
>>> from the application without remote files support hooked up, or
>>> because of some other problem.
>>>
>>> java.lang.NullPointerException
>>>      at org.eclipse.cdt.internal.ui.editor.CEditor.updateScalabilityMode(CEditor.java:1347)
>>>      at org.eclipse.cdt.internal.ui.editor.CEditor.doSetInput(CEditor.java:1294)
>>>      at org.eclipse.ui.texteditor.AbstractTextEditor$19.run(AbstractTextEditor.java:3025)
>>>      at org.eclipse.jface.operation.ModalContext.runInCurrentThread(ModalContext.java:446)
>>>      at org.eclipse.jface.operation.ModalContext.run(ModalContext.java:354)
>>>      at org.eclipse.jface.window.ApplicationWindow$1.run(ApplicationWindow.java:758)
>>>      at org.eclipse.swt.custom.BusyIndicator.showWhile(BusyIndicator.java:70)
>>>      at org.eclipse.jface.window.ApplicationWindow.run(ApplicationWindow.java:755)
>>>      at org.eclipse.ui.internal.WorkbenchWindow.run(WorkbenchWindow.java:2483)
>>>      at org.eclipse.ui.texteditor.AbstractTextEditor.internalInit(AbstractTextEditor.java:3043)
>>>      at org.eclipse.ui.texteditor.AbstractTextEditor.init(AbstractTextEditor.java:3070)
>>>      at org.eclipse.ui.internal.EditorManager.createSite(EditorManager.java:799)
>>>      at org.eclipse.ui.internal.EditorReference.createPartHelper(EditorReference.java:643)
>>>      at org.eclipse.ui.internal.EditorReference.createPart(EditorReference.java:428)
>>>      at org.eclipse.ui.internal.WorkbenchPartReference.getPart(WorkbenchPartReference.java:594)
>>>      at org.eclipse.ui.internal.PartPane.setVisible(PartPane.java:306)
>>>      at org.eclipse.ui.internal.presentations.PresentablePart.setVisible(PresentablePart.java:180)
>>>      at org.eclipse.ui.internal.presentations.util.PresentablePartFolder.select(PresentablePartFolder.java:270)
>>>      at org.eclipse.ui.internal.presentations.util.LeftToRightTabOrder.select(LeftToRightTabOrder.java:65)
>>>      at org.eclipse.ui.internal.presentations.util.TabbedStackPresentation.selectPart(TabbedStackPresentation.java:473)
>>>      at org.eclipse.ui.internal.PartStack.refreshPresentationSelection(PartStack.java:1256)
>>>      at org.eclipse.ui.internal.PartStack.setSelection(PartStack.java:1209)
>>>      at org.eclipse.ui.internal.PartStack.showPart(PartStack.java:1608)
>>>      at org.eclipse.ui.internal.PartStack.add(PartStack.java:499)
>>>      at org.eclipse.ui.internal.EditorStack.add(EditorStack.java:103)
>>>      at org.eclipse.ui.internal.PartStack.add(PartStack.java:485)
>>>      at org.eclipse.ui.internal.EditorStack.add(EditorStack.java:112)
>>>      at org.eclipse.ui.internal.EditorSashContainer.addEditor(EditorSashContainer.java:63)
>>>      at org.eclipse.ui.internal.EditorAreaHelper.addToLayout(EditorAreaHelper.java:217)
>>>      at org.eclipse.ui.internal.EditorAreaHelper.addEditor(EditorAreaHelper.java:207)
>>>      at org.eclipse.ui.internal.EditorManager.createEditorTab(EditorManager.java:779)
>>>      at org.eclipse.ui.internal.EditorManager.openEditorFromDescriptor(EditorManager.java:678)
>>>      at org.eclipse.ui.internal.EditorManager.openEditor(EditorManager.java:639)
>>>      at org.eclipse.ui.internal.WorkbenchPage.busyOpenEditorBatched(WorkbenchPage.java:2817)
>>>      at org.eclipse.ui.internal.WorkbenchPage.busyOpenEditor(WorkbenchPage.java:2729)
>>>      at org.eclipse.ui.internal.WorkbenchPage.access$11(WorkbenchPage.java:2721)
>>>      at org.eclipse.ui.internal.WorkbenchPage$10.run(WorkbenchPage.java:2673)
>>>      at org.eclipse.swt.custom.BusyIndicator.showWhile(BusyIndicator.java:70)
>>>      at org.eclipse.ui.internal.WorkbenchPage.openEditor(WorkbenchPage.java:2668)
>>>      at org.eclipse.ui.internal.WorkbenchPage.openEditor(WorkbenchPage.java:2652)
>>>      at org.eclipse.debug.internal.ui.sourcelookup.SourceLookupFacility$1.run(SourceLookupFacility.java:355)
>>>      at org.eclipse.swt.custom.BusyIndicator.showWhile(BusyIndicator.java:70)
>>>      at org.eclipse.debug.internal.ui.sourcelookup.SourceLookupFacility.openEditor(SourceLookupFacility.java:365)
>>>      at org.eclipse.debug.internal.ui.sourcelookup.SourceLookupFacility.openEditor(SourceLookupFacility.java:274)
>>>      at org.eclipse.debug.internal.ui.sourcelookup.SourceLookupFacility.display(SourceLookupFacility.java:218)
>>>      at org.eclipse.debug.ui.DebugUITools.displaySource(DebugUITools.java:776)
>>>      at org.eclipse.debug.internal.ui.elements.adapters.StackFrameSourceDisplayAdapter$SourceDisplayJob.runInUIThread(StackFrameSourceDisplayAdapter.java:167)
>>>      at org.eclipse.ui.progress.UIJob$1.run(UIJob.java:94)
>>>      at org.eclipse.swt.widgets.RunnableLock.run(RunnableLock.java:35)
>>>      at org.eclipse.swt.widgets.Synchronizer.runAsyncMessages(Synchronizer.java:133)
>>>      at org.eclipse.swt.widgets.Display.runAsyncMessages(Display.java:3800)
>>>      at org.eclipse.swt.widgets.Display.readAndDispatch(Display.java:3425)
>>>      at org.eclipse.ui.internal.Workbench.runEventLoop(Workbench.java:2382)
>>>      at org.eclipse.ui.internal.Workbench.runUI(Workbench.java:2346)
>>>      at org.eclipse.ui.internal.Workbench.access$4(Workbench.java:2198)
>>>      at org.eclipse.ui.internal.Workbench$5.run(Workbench.java:493)
>>>      at org.eclipse.core.databinding.observable.Realm.runWithDefault(Realm.java:288)
>>>      at org.eclipse.ui.internal.Workbench.createAndRunWorkbench(Workbench.java:488)
>>>      at org.eclipse.ui.PlatformUI.createAndRunWorkbench(PlatformUI.java:149)
>>>      at org.eclipse.ui.internal.ide.application.IDEApplication.start(IDEApplication.java:113)
>>>      at org.eclipse.equinox.internal.app.EclipseAppHandle.run(EclipseAppHandle.java:193)
>>>      at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.runApplication(EclipseAppLauncher.java:110)
>>>      at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.start(EclipseAppLauncher.java:79)
>>>      at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:382)
>>>      at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:179)
>>>      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>      at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>>>      at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>>>      at java.lang.reflect.Method.invoke(Unknown Source)
>>>      at org.eclipse.equinox.launcher.Main.invokeFramework(Main.java:549)
>>>      at org.eclipse.equinox.launcher.Main.basicRun(Main.java:504)
>>>      at org.eclipse.equinox.launcher.Main.run(Main.java:1236)
>>>      at org.eclipse.equinox.launcher.Main.main(Main.java:1212)
>>> Dave
>>>
>>>
>>>
>>> Dave Wootton/Poughkeepsie/IBM@IBMUS
>>> Sent by: ptp-dev-bounces@xxxxxxxxxxx
>>> 10/21/2008 03:47 PM
>>> Please respond to: Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
>>> To: Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
>>> cc: Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>, ptp-dev-bounces@xxxxxxxxxxx
>>> Subject: Re: [ptp-dev] Problem with invoking SDM debugger on pSeries Linux
>>>
>>> I ran the debugger test again this afternoon and have a debug log
>>> from the initial attempt, which was partially successful. I started
>>> my proxy and invoked a 2-task application (on 1 node). It got to the
>>> point where the PTP debug perspective opened and the debug view showed
>>> a partially expanded tree with a node for process 0 and threads 1 and
>>> 2 as children of process 0. Thread 1 is expanded and shows as
>>> suspended at main(). Thread 2 is collapsed and shows as suspended but
>>> with no location. If I expand the thread 2 node then I get an
>>> ArrayIndexOutOfBounds exception as noted in the attached console log.
>>> Before expanding thread 2, I issued 'ps' on my proxy node and saw
>>> 1 proxy, 3 SDM processes, 2 active gdb processes, two defunct gdb
>>> processes and the two application processes, all of which, with the
>>> exception of the defunct gdb processes, look right.
>>>
>>> This seems to be consistently repeatable: I get the same results
>>> each time I start from a fresh instance of Eclipse and my proxy.
>>>
>>> I think I have my proxy starting the SDMs properly at this time. I
>>> need to spend some time tomorrow looking at exactly when I start the
>>> master SDM, since I think I want to start it only after my attach.cfg
>>> file is created instead of after the arbitrary delay that I have now.
>>> Once I sort that out I expect I will have another patch for you to
>>> commit to the PTP 2.1 branch, in addition to the PE and LoadLeveler
>>> patches I've already sent in the last week.
>>> Dave
>>>
>>>
>>>
>>>
>>>
>>> Dave Wootton/Poughkeepsie/IBM@IBMUS
>>> Sent by: ptp-dev-bounces@xxxxxxxxxxx
>>> 10/17/2008 07:19 PM
>>> Please respond to: Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
>>> To: Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
>>> cc:
>>> Subject: Re: [ptp-dev] Problem with invoking SDM debugger on pSeries Linux
>>>
>>> Greg
>>> It occurred to me that my problem might be due to my test program not
>>> being compiled with -g. I recompiled and it seemed like I got a bit
>>> farther along. The GUI got as far as trying to display a stack trace
>>> for task 0 (of a 2-task application on a single node) suspended at
>>> main() and showing the line # for main(), then tried to open an editor
>>> window and crashed with a subscript out of range exception.
>>> Unfortunately I did not capture the stack trace since I thought I
>>> could recreate it and get a better trace, then couldn't get the
>>> debugger to run any more. The editor window failed to open the source
>>> file (maybe because I am running remotely on a Windows XP system).
>>>
>>> My network connection seems exceptionally sluggish for some reason,
>>> which seems to have caused a second problem, where I was getting a
>>> segmentation violation at line 324 of
>>> src/impl/sdm_routing_table_file.c. This was a call to
>>> fclose(*routing_file). I'm not sure what should be happening here.
>>>
>>> I commented out the fclose() and that got me past the sigsegv, but
>>> with an intermittent message about too many open files. If I got past
>>> the 'too many open files' message then I got to the point where the
>>> debugger tried to show a stack trace.
>>>
>>> I'll look at this more next week.
>>> Dave
>>>
>>>
>>>
>>> Dave Wootton/Poughkeepsie/IBM@IBMUS
>>> Sent by: ptp-dev-bounces@xxxxxxxxxxx
>>> 10/17/2008 01:25 PM
>>> Please respond to: Parallel Tools Platform general developers <ptp-dev@xxxxxxxxxxx>
>>> To: <ptp-dev@xxxxxxxxxxx>
>>> cc:
>>> Subject: [ptp-dev] Problem with invoking SDM debugger on pSeries Linux
>>>
>>> Greg
>>> I tried invoking the SDM debugger on my RedHat 5 system (the k17sf2p03
>>> system you have access to, and which is up now), and had two problems.
>>>
>>> The first is that the code which waits for the routing file has a
>>> timeout of 10 seconds, which is apparently too short, since I get a
>>> message that SDM timed out waiting for the routing file. I changed
>>> both calls that wait for the routing file to wait for 1000 seconds,
>>> which fixed that problem.
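
The wait described above can be sketched as a polling loop with a caller-supplied timeout instead of a hard-coded 10 seconds. This is only an illustration in Python, not the SDM code: the wait_for_file name, its parameters, and the "routing_file" path in the demo are all hypothetical.

```python
import os
import time


def wait_for_file(path, timeout_s=10.0, poll_s=0.1):
    """Poll until path exists; return True if it appeared, False on timeout."""
    # A monotonic clock keeps the deadline safe from wall-clock adjustments
    deadline = time.monotonic() + timeout_s
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)


if __name__ == "__main__":
    # Hypothetical file name; a short timeout keeps the demo quick
    if not wait_for_file("routing_file", timeout_s=1.0):
        print("timed out waiting for the routing file")
```

Making the timeout a parameter means a slow system can be given a longer wait without editing both call sites.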
>>>
>>> The second problem is that I get some sort of error message that I
>>> think is coming from gdb. I'm attaching the logs for both the child
>>> SDMs and the master SDMs.
>>>
>>> The good news is that I'm making it much farther in SDM than I was a
>>> month ago when I last looked at this.
>>>
>>>
>>> Dave
>>>
>>> _______________________________________________
>>> ptp-dev mailing list
>>> ptp-dev@xxxxxxxxxxx
>>> https://dev.eclipse.org/mailman/listinfo/ptp-dev
>>>
>>> [attachment "sdm_child.txt" deleted by Dave Wootton/Poughkeepsie/
>>> IBM]
>>> [attachment "sdm_master.txt" deleted by Dave Wootton/Poughkeepsie/
>>> IBM]
>>> [attachment "debug_1021_log.txt" deleted by Dave Wootton/
>>> Poughkeepsie/IBM]
