Skip to main content

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [List Home]
[jdt-ui-dev] Proposal: Refactoring "Remove unused members"

Hi,

  the idea is to remove members (classes, methods and class/instance 
variables) which are not referenced (anymore). The collected candidates for 
removal are to be displayed to let the user pick the methods s/he wants to 
remove.
  
  Hopefully I was able to express myself clearly. If not, please ask. I am 
more than happy to try a different approach of explaining or thinking about 
it.

  I believe at least the following Issues have to be taken care of...

[ Simple Example ]

Let's start with one of the simplest cases.

      public class A {
        private void a() {}        // referenced from inside this class
        public  void b() { a();} // referenced from the outside?
        private void c() {};       // never referenced
      }
      
This would result in the following collection of items to analyze: 

Name           Containing Type             Access
of Item          and Supertypes              Modifier
                      declaring the item
------------------------------------------------------------
a()                    A                                     private
b()                    A                                     public
c()                    A                                     private

The results of this case are pretty obvious:

A.b() is not referenced from inside A. According to the public access modifier 
of A.b() the whole workspace has to be searched for items referring to this 
method. If none is found it is a candidate for removal. 
A.a() is referenced from A.b(). Assuming A.b() is referenced from the outside 
of this class, it will _not_ be a removed.  Hence A.a() will _not_ be a 
candidate for removal. 
A.c() is not referenced from inside A, there is no named inner class and as 
A.c() is declared private the search ends here. Hence it _is_ a candidate for 
removal. 


[ Taking Supertypes into Account ]

An example with three super types:

      interface I {
        void i();
      }
      public class SuperSuperA {
        public void ssaa() {}
        public void ssab() {}
      }

      abstract class SuperA extends SuperSuperA
                          implements I {
        void i() {}
      }

      public class A extends SuperA {
        public void i() {}
        public void ssab() {}
      }
      
Name     Containing Type  Access    Remark
of Item    and Supertypes   Modifier
                declaring 
                the item             
-------------------------------------------------------------------------------------
i()            A, SuperA, I          public      The  implementation of 
                                                                A.i() could be 
                                                                used  when 
                                                                referring to 
                                                                SuperA or I.
ssab()    A, SuperA,             public      The implementation of 
                                                                A.ssab() could 
                                                                be SuperSuperA                                     
                                                                used when 
                                                                referring to 
                                                                SuperA or                                                                                 
                                                                SuperSuperA.

Note that SuperSuper.ssaa() didn't make it into the collection of items as it 
is not overridden in A. 

Resulting in: 
As A.i() is public the whole workspace has to be searched for references to 
A.i(), SuperA.i() and I.i(). 
This is also true for A.ssab(). A.ssab(), SuperA.ssab() and SuperSuperA.ssab() 
have to be searched in whole workspace. 


[ Recursive Dependencies  ]

An example with recursive dependencies.

      public class A  {
        private void a() { b(); }
        private void b() {}
      }

A.b() can not be a candidate for removal as it is referenced by A.a(). A.a() 
can be removed though, but after doing so, A.a() is also a candidate for 
removal. As the user may not select to delete A.a(), A.b() cannot be removed 
in every case. (any case? my English su... I mean sometimes it can, sometimes 
it can't).

Therefore it is necessary to record the candidates differently, including the 
dependencies.

Name          Containing Type         Access            Depends on
of Item         and Supertypes          Modifier           the removal of
                     declaring the item
--------------------------------------------------------------------------------------------
a()                     A                               private         -
b()                     A                               private         A.a()

The collection of removal candidates is to be presented to the user in a way 
reflecting the dependencies.
One possible implementation would be to show the candidates in a hierarchical 
view. The independent candidates are shown on the first level and the 
respective depending items are below the candidate they depend on. If a 
containing candidate is deselected all depending candidates and candidates 
depending on the latter are also deselected and cannot be selected without 
selecting the candidate they depend on. 

A typical example for recursive dependencies might be accessors and the 
respective instance variable or (inner) classes which are only referenced by 
a Widget using them as XXListeners and can be removed when the button reaches 
its end of life.


[ When to stop searching for References ]

Please consider the following code. 
      
      public class A  {
        public static void aa() { }
        public static void ab() { A.aa(); }
                // main() and all methods
                // of Object are blacklisted
                // and will not be used as
                // candidates.
        public static void main(String[] args) {A.ab(); } 
      }

      public class B {
        public static void ba() { A.aa();}
      }

      public class C {
        public static void ca() { A.aa();}
      }

Assumption: As A.aa() is referenced by A.ab(), B.ba() and C.ca() it is not a 
candidate for removal.
To get this result the whole workspace must be searched as A.aa() is declared 
public, but this would mean a serious performance penalty.
The first possible optimization would be to stop looking for more references 
after the first is found. In the am scenario that would mean to stop to after 
A.ab() is found. As a consequence A.aa() would be in a unknown state when 
A.ab() (another candidate) will not be confirmed by the user, because when 
A.ab() is not removed it is unknown if A.aa() is referenced by more callers 
than A.ab(), as the search has been stopped prematurely.
Therefore the search can only be stopped, after the first caller is found 
which is not also a candidate.
Hence the callers, which are also candidates, have not to be recorded in case 
a caller outside the candidate list is found. 


[ Performance ]

I believe that displaying the search results and executing the removal are 
relatively cheap compared to the search, so I will just focus on the search 
for now.

- general search scope - 
For non-static private members only the class itself and the named, 
non-static, inner classes have to be searched. 
The search for members declared with the package modifier will start like a 
search for private members, but will also the search the declaring package. 
Regarding protected members the private and package search has to be performed 
and furthermore all subclasses, which are not in the same package as the 
member itself.
Public members are the worst. They can be referenced by everything. So as 
worst case the whole workspace has to be searched.
This is just a rough outline to build on for the below mentioned issues.

It would be important to show the user a tip helping him to understand that it 
is important to define an appropriate workingset, which includes only his/her 
own code, not all the libraries etc.

Also the default settings should already narrow down the search, e.g. "public" 
as a search criteria should be deselected.

- batching the search - 
The two factors having the most impact on the performance will likely be disk 
io and parsing, therefore an ideal search would touch each ressource only 
once and also parse it only once. 

- stopping early / redefining search scope - 
That would mean that search scope needs to reflect the maximum scope universe 
of all members to search for. If there are 10 members to search for declared 
private and one public this could mean to search the whole workspace. Still 
for this particular public member it is the only choice there is.
Therefore it is necessary to suspend the search after the first reference to 
the public member is found (where the calling member is not a candidate 
itself) _and_ recalculate the maximum search scope for the remaining 
candidates (in the a.m. case it should be none, as private members should be 
searched first, but there are other cases, like having more than one public 
member to search for).
To make a long story short, the search scope needs to be redefined during the 
search and it also must be possible for the caller to stop the search for the 
case that relevant references are found for all candidates before having 
searched the remaining scope completely.


[ Performance :: Implementing the search using the Java Search Engine ]

I had a quick peek at the JavaSearchEngine and although I am not through yet, 
I have found the OrPattern to build the maximum search scope, but haven't 
found anything yet to suspend/stop a running search and to reduce the scope 
of it.



Back to the top