[rdf4j-dev] Side notes -> Re: planning ahead: 3.7 vs 4.0
  • From: jerven Bolleman <jerven.bolleman@sib.swiss>
  • Date: Thu, 11 Feb 2021 23:12:19 +0100

Hi Jeen,

I will create the issues.

A triple store that can use faster evaluation (much more than just better query planning) is my https://github.com/JervenBolleman/sapfhir.

I have other, similar read-optimized stores in various states of not working. In each case they are heavily impacted by the overhead
of getting things from disk only to immediately throw them out again.

This kind of thing is highly specialist bioinformatics experimentation:
experiments on top of prototypes, really :)

Regards.
Jerven

On 11/02/2021 23:05, Jeen Broekstra wrote:
On Fri, 12 Feb 2021, at 06:42, jerven Bolleman wrote:

Hi Jeen, All,

Just a thought that I was having while writing this e-mail.
How about a virtual developer get-together sometime, including
downstream projects/companies like GraphDB and Ontop?

Good idea. I'll see about organizing something along those lines. Me being in Oz makes finding a decent time slot a little tricky, but we can figure something out I'm sure.


Now back to 3.x vs 4.0

I am fine either way. For those who are getting paid, I suspect
commercial planning is a major concern. Yet I feel that in practice
upgrading a JVM version is not a major stumbling block. Either
everything gets updated, more or less, or nothing gets updated at all.
For what it is worth, in our org we have all been on Java 11 for more than a
year. Also, a commercial party can backport and maintain the 3 series
even if most development happens on a cleaned-up 4 branch.

We had a conversation about this about a year ago, when several people voiced some concern. We're a year on though, perhaps the situation's changed. See https://github.com/eclipse/rdf4j/issues/2046 .


For major changes:

There are two things I would like to see for 4.0.
For the getStatements method on a SailConnection, I would like to expand
the acceptable types. This is to enable what is called "predicate" or
"filter" pushdown in the literature.

Currently we accept either a null or a Value. What I would like to have
is a Value or a "VariableDescription".

Imagine the following query:

SELECT ?p
WHERE {
    ?z ?y ?p . # iteration 1
    ?q ?p ?x .
}

From the query we could infer that ?p must be an IRI, because it is also
used in the predicate position. Yet when we call the getStatements method
we have no way to pass in that knowledge. This means that each Literal in
the store is pulled out of storage and into memory, only to be thrown out
immediately afterwards.
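
A minimal sketch of the idea in plain Java, not RDF4J code. All type names here (Pattern, Term, VariableDescription, ToyStore) are hypothetical stand-ins; the only thing taken from the proposal is that a position can hold either a concrete value or a description of what an unbound variable may be. The toy store checks the kind of a term before it hands out a result, which is where a real store could skip reading literals from disk.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these types exist in RDF4J.
final class PushdownSketch {

    enum Kind { IRI, BNODE, LITERAL }

    /** Assumed common supertype: a concrete term or a description of a variable. */
    interface Pattern {
        boolean matches(Term candidate);
    }

    /** Simplified stand-in for an RDF value. */
    static final class Term implements Pattern {
        final Kind kind;
        final String label;

        Term(Kind kind, String label) {
            this.kind = kind;
            this.label = label;
        }

        @Override
        public boolean matches(Term candidate) {
            return kind == candidate.kind && label.equals(candidate.label);
        }
    }

    /** Stand-in for the proposed "VariableDescription": unbound, but of a known kind. */
    static final class VariableDescription implements Pattern {
        final Kind requiredKind;

        VariableDescription(Kind requiredKind) {
            this.requiredKind = requiredKind;
        }

        @Override
        public boolean matches(Term candidate) {
            return candidate.kind == requiredKind;
        }
    }

    static final class Triple {
        final Term s;
        final Term p;
        final Term o;

        Triple(Term s, Term p, Term o) {
            this.s = s;
            this.p = p;
            this.o = o;
        }
    }

    /** Toy in-memory store; null means "no constraint on this position". */
    static final class ToyStore {
        private final List<Triple> triples = new ArrayList<>();

        void add(Triple t) {
            triples.add(t);
        }

        List<Triple> getStatements(Pattern s, Pattern p, Pattern o) {
            List<Triple> result = new ArrayList<>();
            for (Triple t : triples) {
                // The kind check runs inside the store, before a result is handed out.
                if (accepts(s, t.s) && accepts(p, t.p) && accepts(o, t.o)) {
                    result.add(t);
                }
            }
            return result;
        }

        private static boolean accepts(Pattern pattern, Term term) {
            return pattern == null || pattern.matches(term);
        }
    }

    /** Three triples: one with an IRI object, two with literal objects. */
    static ToyStore demoStore() {
        Term a = new Term(Kind.IRI, "urn:a");
        Term b = new Term(Kind.IRI, "urn:b");
        Term knows = new Term(Kind.IRI, "urn:knows");
        Term name = new Term(Kind.IRI, "urn:name");
        ToyStore store = new ToyStore();
        store.add(new Triple(a, knows, b));
        store.add(new Triple(a, name, new Term(Kind.LITERAL, "Alice")));
        store.add(new Triple(b, name, new Term(Kind.LITERAL, "Bob")));
        return store;
    }

    public static void main(String[] args) {
        ToyStore store = demoStore();
        // Unconstrained scan versus "the object must be an IRI".
        System.out.println(store.getStatements(null, null, null).size());
        System.out.println(
            store.getStatements(null, null, new VariableDescription(Kind.IRI)).size());
    }
}
```

For the first pattern of the query above, the object position would get the "must be an IRI" description, so the two literal-valued triples never leave the toy store.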

It's an interesting idea. This is the kind of thing where I'd really appreciate input from Ontotext (and other Sail implementors, e.g. the Halyard people) as well.

Could you raise a ticket for this idea, so we don't lose track of it, and keep in-depth conversations on the topic trackable?

[snip]

I don't think that we would have many of these cases for the first step,
but the API change would be significant, although it could be done in a
completely backwards-compatible manner.

You'd have to introduce a new superclass or interface that covers both Values and "VariableDescription" though, so it would not be binary compatible. But yeah, we can make it relatively easy.

The second change is that I would like to see getStatements return
a CloseableIteration<List<? extends Statement>, SailException> instead
of the current one-statement-at-a-time iterator. Being able to return
blocks of values would really allow for some significant speed-ups.
Again, this is easily done with a default method, so it doesn't need
a major version.
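
A rough sketch of how such a default method could look, in plain Java rather than against the real SailConnection. StatementSource and getStatementBatches are hypothetical names, and java.util.Iterator stands in for CloseableIteration. The default implementation just groups the existing one-at-a-time results into lists, so existing implementations keep working unchanged, while a store that can read blocks natively would override it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical sketch: StatementSource is not an RDF4J interface.
final class BatchSketch {

    /** S stands in for Statement; Iterator stands in for CloseableIteration. */
    interface StatementSource<S> {

        /** The existing style: one statement at a time. */
        Iterator<S> getStatements();

        /** Default fallback: wrap the single-statement iterator into blocks. */
        default Iterator<List<S>> getStatementBatches(int batchSize) {
            if (batchSize < 1) {
                throw new IllegalArgumentException("batchSize must be at least 1");
            }
            Iterator<S> single = getStatements();
            return new Iterator<List<S>>() {
                @Override
                public boolean hasNext() {
                    return single.hasNext();
                }

                @Override
                public List<S> next() {
                    if (!single.hasNext()) {
                        throw new NoSuchElementException();
                    }
                    List<S> batch = new ArrayList<>(batchSize);
                    // Fill the block until it is full or the source is exhausted.
                    while (batch.size() < batchSize && single.hasNext()) {
                        batch.add(single.next());
                    }
                    return batch;
                }
            };
        }
    }

    public static void main(String[] args) {
        // An "old" implementation that only knows how to return single statements.
        StatementSource<String> source =
            () -> Arrays.asList("s1", "s2", "s3", "s4", "s5").iterator();
        Iterator<List<String>> batches = source.getStatementBatches(2);
        while (batches.hasNext()) {
            System.out.println(batches.next());
        }
    }
}
```

The fallback gives no speed-up by itself; it only keeps callers and old implementations compatible. Any gain would come from stores that override the batch method.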

Can I ask, with this, and also the other ideas (parallelizing query execution): do you have a specific triplestore implementation in mind that would make use of this? I mean, for example, our own NativeStore is not really set up to leverage these kinds of optimizations, I think.

Cheers,

Jeen



--
	*Jerven Tjalling Bolleman*
Principal Software Developer
*SIB | Swiss Institute of Bioinformatics*
1, rue Michel Servet - CH 1211 Geneva 4 - Switzerland
t +41 22 379 58 85
Jerven.Bolleman@sib.swiss - www.sib.swiss
