Re: [rdf4j-dev] Contribution of FedX (SPARQL Federation): Next steps

Hi Jeen, all,

just a quick status update:

The initial steps are now done on a branch on my local fork: https://github.com/aschwarte10/rdf4j/commits/develop-contribute-fedx

Performed steps:

- copied src and test from the original FedX repository
- refactored the code into the standard Maven directory structure
- added a basic (standalone) Maven build definition
- replaced the license headers with the RDF4J template (thanks, Jeen, for the hint regarding the Eclipse tool)
- adjusted the packages to org.eclipse.rdf4j.federated.*

The build is currently basically a 1:1 translation of the Gradle script. It took me a while (as I am new to Maven), but it seems to be working now. The obvious next steps are to integrate this into the hierarchical RDF4J build and, in particular, to re-use its dependency definitions. I will try to do this over the next few days, but I will probably need some help here. I will come back with concrete questions or ask for help when I get stuck.

Open questions:
- naming of the main repository class: can we stick to the name FedX, or do we have to use "Federated"?
- do we have to provide some kind of readme with historical information?
- I used the FedX project to evaluate a migration from JUnit 4 to JUnit 5 (also to get a general feeling for other projects), so FedX currently uses JUnit 5. Do you think it's possible to keep this in parallel to JUnit 4 (as used in RDF4J), or do I have to migrate back?
- do we want to deprecate / remove the CLI? I would vote yes.
- should I re-format the entire code? I already always use the Eclipse built-in formatter, so the code style should be pretty close.

A bit later, after the initial steps:
- evaluate FedX as a potential drop-in replacement for the current federation
- documentation

Anything I forgot?

If you have the chance, please have a look at the branch to get a high-level impression of the direction this change is going. I am happy to receive any feedback.

Thanks and best regards,
 Andreas

On Sat, 7 Sep 2019 at 13:01, Andreas Schwarte <aschwarte10@xxxxxxxxx> wrote:
Hi Jeen,

your summary and proposal sound good. I will start migrating the code to my fork of the RDF4J repository during the next few days and will keep you posted when I have something to share.

The initial steps are pretty straightforward. Regarding the Maven build, I might need some help later, but I'll try to provide the basic setup first.

I am also not sure whether we can keep the name. Ideally we would be able to keep it (as it has some history and publications); however, I am also fine with your proposal. I would do the renaming of classes as a separate step, and in the meantime find out whether changing the name is a strict requirement for the contribution.

One question we should answer before we do the final integration: do we want to support the CLI for FedX, i.e. a simple command line for running queries against an ad-hoc federation? I personally think that this is no longer required with the integration into RDF4J. What do you think?

We need to check to what extent we can use FedX as a backwards-compatible replacement for the federation sail. I will have a look at this later, when the first steps are done.

I will keep you posted when there is some reasonable progress.

Thanks again for all the help,
 Andreas


On Fri, 6 Sep 2019 at 01:00, Jeen Broekstra <jeen.broekstra@xxxxxxxxx> wrote:
Hi Andreas,

Thanks for getting this discussion going! Sorry for being a bit slow, my mind has been on other things. 

On Thu, Sep 5, 2019 at 9:37 PM Andreas Schwarte <aschwarte10@xxxxxxxxx> wrote:
The next questions to be clarified are of technical nature:

* Would FedX become its own module / artifact within the RDF4J project. Suggestion: yes

This seems the most natural choice, at least for the first migration. However I would suggest that it might make sense to refactor/break it up later on. As an example: it seems logical to me to take the FedXRepository implementation and make it its own module alongside SailRepository and other Repository API implementations. But as said let's bring the code into the project as one big module first, and then consider further.

* Where (which repository) would we check-in the code. We might need a fresh repository

Unless you have strong objections I would be in favour of making it part of the main rdf4j code repository. There are significant advantages to having the entire code base in a single repo. Initially, the best place is probably in the tools/ directory, as a module alongside the console, the workbench, and the rdf4j server. 
 
* Regarding code structure: we will probably need to change all package names, as well as license headers. Are there any guidelines for this?

There are some simple guidelines, like the root of the package name should be org.eclipse.rdf4j. Where we put it below that is up for grabs. I don't know if we want to stick with the name FedX (or even if we can). Perhaps for now we simply go with org.eclipse.rdf4j.federated as a package name? That also means some class names get changed, e.g. FedXRepository would become FederatedRepository.

The license header should be same one we use for everything else (see https://github.com/eclipse/rdf4j/blob/master/.github/CONTRIBUTING.md#code-formatting) . If you use Eclipse IDE, there is a plugin called Releng that you can use to do a mass refactor just for replacing license headers on a code base. See https://www.codejava.net/ides/eclipse/how-to-add-copyright-license-header-for-java-source-files-in-eclipse .

Finally: I've noticed FedX is set up as a gradle project. If we make it part of rdf4j we will have to migrate it to maven. Happy to help with the setup for that of course.

Do you have any comments or suggestions?
 
I think we also have to have a think about what to do with the existing FederationSail. As I understand it, it's pretty much made obsolete by the more advanced federation capabilities of FedX. Should we mark it deprecated? Or can we even just completely replace the existing FederationSail with the more advanced FedX implementation in a backward-compatible way?

Of course I'm happy to help with the work where necessary! I suggest the easiest practical way forward is that you start refactoring on a branch in your own repo: do the package renaming and license replacements there. Once it looks good, copy it over to the rdf4j repo (on a feature branch), in a directory under tools (tools/federated will do at first, we can always rename). Don't worry too much about preserving commit history. Push and put up a PR (against the develop branch), and then let's see what we've got :) 

Cheers,

Jeen
