[udig-devel] [jira] Created: (UDIG-1727) "Full Extent" deadlocks in Tile.disposeSWTImage() due to concurrent locks on org.eclipse.swt.internal.gtk.OS

"Full Extent" deadlocks in Tile.disposeSWTImage() due to concurrent locks on org.eclipse.swt.internal.gtk.OS
------------------------------------------------------------------------------------------------------------

                 Key: UDIG-1727
                 URL: http://jira.codehaus.org/browse/UDIG-1727
             Project: uDIG
          Issue Type: Bug
          Components: application
    Affects Versions: UDIG 1.2.0
         Environment: eclipse=3.6 
java.version=1.6.0_20 
java.vendor=Sun Microsystems Inc. 
BootLoader constants: OS=linux, ARCH=x86_64, WS=gtk, NL=nb_NO 
Framework arguments: -product net.refractions.udig.product
            Reporter: Kenneth Gulbrandsøy
         Attachments: UDIG-1727-net.refractions.udig.project.02092010.patch

Do the following to reproduce deadlock:
 
   1. Run udig.product 
   2. Add any layer to a map 
   3. Ensure that the preference "Use Tiled Rendering System" is checked 
   4. Press "Full Extent" command rapidly until uDig freezes 

The root cause of this bug is the same as in UDIG-1726: the tiled rendering system does not always honor the SWT multi-threading policy. The Tile.getSWTImage() and Tile.disposeSWTImage() methods in the net.refractions.udig.project.render.Tile class access the SWT subsystem by invoking methods on the SWT image, which in turn call methods in the org.eclipse.swt.internal.gtk.OS class. If the user-interface (main) thread concurrently calls a method in the OS class that requires a lock, a deadlock will occur once a synchronized statement is reached in the Tile class. This is hard to explain concisely, so the easiest way to see it is to reproduce the bug and inspect the stack traces in debug mode. Both deadlocked threads have method(s) in their call stacks that lock in the OS class, and one of the two threads should be the user-interface (main) thread.

I believe that the root cause described above is confirmed by the patch described next. The patch ensures that the SWT image held by a Tile instance is only accessed asynchronously on the user-interface thread, by wrapping each access in Display.asyncExec(). This requires a Display instance to be readily available, which it is not. Since the Tile instance has no information about the user-interface thread, the following "hack" is used to get it:

   iterate over all threads until Display.findDisplay(Thread) returns a non-null reference to a Display instance 
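The lookup described above could be sketched roughly as follows. This is not the code from the attached patch. The class name UiLookup and its generic finder parameter are hypothetical, introduced only so the sketch compiles without SWT on the classpath; in uDig the finder would presumably be Display::findDisplay.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Function;

/** Hypothetical helper, not taken from the patch. D stands in for SWT's Display. */
final class UiLookup<D> {
    // Assumption: in SWT this would be Display::findDisplay, which returns
    // null for threads that do not own a Display.
    private final Function<Thread, D> finder;
    // Cached result, so the thread scan runs only until the first hit.
    private final AtomicReference<D> cached = new AtomicReference<>();

    UiLookup(Function<Thread, D> finder) {
        this.finder = finder;
    }

    /** Returns the first non-null result over the caller's thread group, or null. */
    D find() {
        D hit = cached.get();
        if (hit != null) {
            return hit;
        }
        ThreadGroup group = Thread.currentThread().getThreadGroup();
        // activeCount() is only an estimate, so leave some headroom.
        Thread[] threads = new Thread[group.activeCount() + 8];
        int n = group.enumerate(threads); // also includes threads in subgroups
        for (int i = 0; i < n; i++) {
            D d = finder.apply(threads[i]);
            if (d != null) {
                cached.compareAndSet(null, d);
                return d;
            }
        }
        return null; // no user-interface thread found in this group
    }
}
```

With SWT available, usage would look something like new UiLookup<Display>(Display::findDisplay).find().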

This method has some shortcomings, though. I believe that the tiled rendering system should always honor the SWT multi-threading policy from the top down, by ensuring that every call that might invoke a lock-requiring method in the OS class, as the Tile.getSWTImage() and Tile.disposeSWTImage() methods do, is run on the user-interface thread using the Display.asyncExec() method. The hack I have used is based on the following assumptions:

   1. Only one user-interface thread exists in the thread group of the invoking thread (reasonable, I think).
   2. If found, the thread is assumed to be alive as long as the application is running, so once found, the instance is returned directly thereafter.
   3. If not found, it is safe to invoke the Tile.getSWTImage() and Tile.disposeSWTImage() methods from the calling thread.

I hope that somebody can confirm that this issue exists on platforms other than Linux, and that my hack actually solves the problem.
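As a rough illustration of how the asyncExec wrapping and the fallback to the calling thread when no user-interface thread is found fit together, here is a minimal sketch. It is not the patch itself. UiDispatch is a hypothetical name, and java.util.concurrent.Executor is used as a stand-in for the Display so the sketch runs without SWT; with SWT, a method reference such as display::asyncExec fits that interface.

```java
import java.util.concurrent.Executor;

/** Hypothetical helper, not taken from the patch. */
final class UiDispatch {
    /**
     * Runs work through the UI executor when one is known; otherwise runs it
     * directly on the calling thread, which is only safe under the assumption
     * that no user-interface thread exists.
     */
    static void run(Executor uiOrNull, Runnable work) {
        if (uiOrNull != null) {
            // Assumption: with SWT this would be display::asyncExec, which
            // queues the work for the user-interface thread and returns
            // without waiting for it.
            uiOrNull.execute(work);
        } else {
            work.run();
        }
    }
}
```

Because the work is queued rather than awaited, the calling thread never blocks on the user-interface thread, which is the property the asynchronous wrapping relies on.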

The hack above is attached as a patch.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://jira.codehaus.org/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira



