1. Overlapping SHT patch (pushed April 13th)
The initial purpose of this patch was to fix
Bug 460261 - Traces with many threads causes
tracecompass to slow down and views to no longer work which
had 10k threads and made the SHT degenerate into an empty
comb.
The SHT would degenerate into an empty comb because nodes have
limited capacity (2k - 3k intervals), neighbour nodes cannot
overlap, and we store an interval for every attribute from the
tree's start time to the first relevant state change for this
attribute. Therefore the SHT would only be able to fit so many
intervals by becoming much deeper (depth being proportional to
the number of attributes BTW).
The suggested alternative removes the consecutive node
constraint, thus allowing nodes to overlap and bringing tree
height down to "reasonable" levels. To do this, I have added
end times to node headers, new Branches start at the start
time of the first interval inserted and queries now search on
sub-trees instead of branches.
Results:
Node usage is up : 96% past 10 nodes.
Build Times are also faster because less work is done on empty
nodes.
Full State Query performance is quasi-unchanged
Single State Query performance is twofold on many attribute
traces, especially by using extra header information on quark
bounds.
An additional optimization that is not part of this patch is
getting the best node start times to respect interval and node
length distribution. New node start time would we tree's
current end time minus the average of maximum interval lengths
on tree level.
Another optimization I experimented with is adding quark
bounds to node headers which helps narrow down nodes which
need to be searched for a single query a lot on SS with many
quarks.
2. Applying R-Tree properties to SHT for attribute scalability
R-Trees have amazing properties on multi-dimensional data
sets, the SHT is multi dimensional too : time x quark.
Suggestion is to reorganize intervals in a sub-tree buffer
before writing them to a standard SHT.
My benchmarks show much faster performance on stabbing queries
(x10 )and range queries (x1000) on traces with many
attributes.
3. ITmfStateSystem API changes
After reviewing use cases for the State System, I suggest
adding the following queries to the ITmfStateSystem API:
Range Queries - retrieving intervals for an attribute for a
complete time range (from t1 to t2).
They are currently implemented manually or via the
StateSystemUtils, which performs a stabbing query for t1, uses
end time te of returned interval to repeat stabbing queries on
te + 1 while te < t2.
This could be more efficient if all the intervals were
retrieved in a single pass of the state history tree, instead
of doing as many stabbing queries as there are intervals in
the time range.
Multi-attribute queries: many views/analyses work on all the
intervals for CPUs or threads for example. However using a
full query and filtering after is inefficient because we need
to search too many nodes and return an oversized object, using
single queries for every attribute isn't more efficient
because one single query is hardly twice as fast as a full
query, and forces the SS to search the SHT as many times as
there are attributes queried.
This could be more efficient if the ITmfStateSystem API
allowed access to the SHT method doQuery and searched for the
relevant intervals.
Obviously, it would be interesting to combine all
possibilities in the API:
- querySingleState(quark,
time), querySingleState(quark, rangeStart, rangeEnd)
- queryMultiState(quarkList,
time), queryMultiState(quarkList, rangeStart, rangeEnd)
- queryFullState(time)
- queryFullStateRange not
included to dissuade devs from querying too much data.
4. StateSystem usage throughout Trace Compass
I have also looked at how Trace Compass queries data on the
state system.
There are a number of inefficiencies:
- many single queries can be
replaced by full queries.
- some redundant queries
- if multi-quark and range
queries (see above) are approved, they should replace a
number of other queries
- some inefficiencies are
harder to spot, what looks like a standalone single
query is sometimes deeper in the call graph of a method
that is applied many times to threads for example
(KernelThreadInformationProvider).
I can work on this after API dependencies have been resolved
(not worth fixing twice, for each StateSystem API :) ).
5. Unused quarks
In KernelAnalysis for example, Thread nodes (Threads/*) do not
store any information, nonetheless, an interval containing a
nullStateValue , ranging from the beginning of the tree to the
end of the tree is still created. These intervals are very
long and decrease SHT performance a lot.
I suggest using the core attribute for "Status" in the
kernelanalysis for example.
The patches have been submitted :
But fail review by Hudson because a number of unit test have
hardcoded quarks.
However the custom state values suggested by Alex are a
more elegant/abstract way to solve this issue.