Re: [omr-dev] Question regarding TreeTops

Hi Mark,

On 11 June 2018 at 15:02, Mark Stoodley <mstoodle@xxxxxxxxxx> wrote:
> A treetop can have at most one side effect, but a treetop does not have to
> contain a side effect. In fact, the initial IL generation from JitBuilder
> put every node underneath a treetop and relied upon simplifying
> optimizations to eliminate treetops that did not contribute to the program
> order of side effects.

I see - does JitBuilder still do this or is there a better way to
decide when to anchor nodes in TreeTops?

>
>> A 'return' instruction is a TreeTop, I believe? If the IL returns an
>> integer constant, how is it related to a symbol?
>
> In your example of a return of a constant, e.g.:
>         ireturn
>            iconst 42
>
> There is no symbol needed in this case because there is no memory side
> effect. But there is a flow of control, which is a side effect, so the
> ireturn needs to be its own treetop.

Okay thanks.
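
To check my understanding of the ireturn example, here is a toy model of it. This is only a sketch and not OMR code: Node, TreeTop, makeNode, countSideEffects and returnConstant are hypothetical names invented for illustration, and the sideEffect flag stands in for whatever the real opcode properties say.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy model only: these types are hypothetical, not the OMR classes.
struct Node {
    std::string opcode;                           // e.g. "ireturn" or "iconst 42"
    bool sideEffect;                              // assumption: flag set by hand
    std::vector<std::shared_ptr<Node>> children;  // operands of this node
};

// A treetop anchors one tree of nodes in program order.
struct TreeTop {
    std::shared_ptr<Node> root;
};

std::shared_ptr<Node> makeNode(std::string opcode, bool sideEffect,
                               std::vector<std::shared_ptr<Node>> children = {}) {
    return std::make_shared<Node>(Node{std::move(opcode), sideEffect, std::move(children)});
}

// Counts side-effecting nodes reachable from n. A node shared by two
// parents would be counted once per path in this simple version.
std::size_t countSideEffects(const Node &n) {
    std::size_t count = n.sideEffect ? 1 : 0;
    for (const auto &child : n.children) {
        count += countSideEffects(*child);
    }
    return count;
}

// Builds the tree from the example: ireturn with an iconst 42 child.
TreeTop returnConstant() {
    auto constant = makeNode("iconst 42", false);      // no side effect
    auto ret = makeNode("ireturn", true, {constant});  // control flow is the side effect
    return TreeTop{ret};
}
```

In this model the treetop built by returnConstant() carries exactly one side effect (the return itself), and no symbol appears anywhere because nothing reads or writes memory.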

>
> If you load or store to an auto or a global, there needs to be a symbol
> associated with that location. If you call somewhere, there needs to be a
> symbol for the target which will be associated (via aliasing) with the set of
> other symbols that could be read or modified while executing the target.
> There are a few other scenarios (e.g. exceptions), but I won't be
> exhaustive.

So the way I understand this - a symbol refers to a unique location in
memory? Are temporary primitive values mapped to symbols too?

>
> Each IL Node that "has" a symbol, however, actually points to a "symbol
> reference" which then points to the symbol.
>
> Why symbols and symbol references? It's part of the Java heritage of the
> compiler, but it may become useful in other scenarios too. In Java, two
> classes A and B can each contain a constant pool entry that will ultimately
> resolve to (say) some other class C. Java's resolution rules allow the
> constant pool entry in A's class to be resolved while B's constant pool
> entry has not yet been resolved. The class is only loaded once, but each
> class has its own notion of whether it has resolved what each constant pool
> entry actually points at.
>
> How does that impact the compiler? Imagine you're compiling A.foo() and it
> inlines B.bar(). A reference to class C in the code for A.foo() will
> directly refer to A's resolved constant pool entry, while a reference to C
> in B.bar() will directly refer to an unresolved constant pool entry (from
> class B). Of course, the JIT can figure out they are the same and that they
> will refer to the same class by inspection (if they will, in fact, resolve
> to the same class). It gets even stranger, which I won't bore you with, but
> the compiler getting things right means capturing the different constant
> pool entry origins (as symbol references) even though they refer to the same
> memory (symbol).
>
> Outside of Java, you don't tend to see multiple identical symbol references
> pointing at the same symbol so much, although obviously it can happen
> indirectly via aliasing.
>
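
To make sure I follow the symbol versus symbol reference distinction, here is a toy model of the A/B/C constant pool example. It is a sketch under my own assumptions, not the OMR data structures: Symbol, SymbolReference, makeRef and sameSymbol are hypothetical names, and the owner and resolved fields are simplifications of what a real symbol reference records.

```cpp
#include <memory>
#include <string>
#include <utility>

// Toy model only: these types are hypothetical, not the OMR classes.
struct Symbol {
    std::string name;  // the underlying entity, e.g. class C
};

struct SymbolReference {
    std::shared_ptr<Symbol> symbol;  // the symbol this reference points to
    std::string owner;               // whose constant pool entry it came from
    bool resolved;                   // resolution state is kept per reference
};

SymbolReference makeRef(std::shared_ptr<Symbol> symbol, std::string owner, bool resolved) {
    return SymbolReference{std::move(symbol), std::move(owner), resolved};
}

// Two references with different origins and resolution states can
// still denote the same symbol.
bool sameSymbol(const SymbolReference &a, const SymbolReference &b) {
    return a.symbol == b.symbol;
}
```

With one Symbol for class C, makeRef(c, "A", true) and makeRef(c, "B", false) give two distinct references for which sameSymbol returns true, which is how I read the case of A.foo() inlining B.bar().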

Thank you. I also saw the detailed write-up on symbols and symbol
references in the docs. I confess I do not fully understand it all
yet. For my immediate needs I need to be able to generate IL that will
be amenable to optimization. I saw that the array access is via a
ShadowSymbolRef, i.e.:

  TR::SymbolReference *symRef =
      injector->symRefTab()->findOrCreateArrayShadowSymbolRef(type, base);
  TR::Node *load = TR::Node::createWithSymRef(
      TR::ILOpCode::indirectLoadOpCode(type), 1,
      get_array_element_address(injector, type, base, index), 0, symRef);


Presumably this tells the compiler that the element access is
referring to part of the object at 'base'.


> Hope that helps!


Indeed it does! Thank you.

Regards
Dibyendu

