Hi all,
In the context of https://wiki.eclipse.org/E4/UI/Smart_Import , the
main operation that crawls the filesystem to identify patterns and
then configure Eclipse projects.
I believe that, when on a directory, crawling and configuring direct
children can be treated in parallel. I was thinking of using
JobGroups to enable parallelism for this task, and more
specifically, I was thinking that I could simply make this loop
parallel:
http://git.eclipse.org/c/e4/org.eclipse.e4.ui.git/tree/bundles/org.eclipse.e4.ui.importer/src/org/eclipse/ui/internal/wizards/datatransfer/OpenFolderCommand.java#n158
. It's the one that iterates on children.
However, since it's in a recursive operation, I don't believe I can
assign 1 job for each iteration, or it will assign all thread to
"parent" tasks, without leaving some for children. Example:
Imagine this filesystem structure:
* P
** P1
*** P1a
*** P1b
** P2
*** P2a
*** P2b
** P3
*** P3a
....
And that I use 2 threads to throttle my jobs. Then those 2 threads
will be assigned to the job responsible of crawling P1 and P2, and
when trying to work on P3, P1a, P1b, P2a or P2b, the new job for
this branch will wait for a new thread forever.
I'm not a parallelism expert, so that may be a dumb question, but
how can I take benefit from throttling in a recursive algorithm?
A naive implementation would be to only parallelize the execution
for the 1st-level branches, so that P1 // P2 // P3, and keep the
subdirectories on the same thread, but this is obviously sub-optimal
in case P1 and P2 finish quite fast, sub-jobs of P3 should be
allowed to use the additional thread.
Cheers,
|