Re: [january-dev] IDataset <-> python

At the moment py4j does not support IDatasets and cannot be extended to
handle them without direct hacking. All transfers of IDatasets between
Python and Java have been done with AnalysisRpc, using
NumPyFileSaver/Loader for serialization on the Java side and
numpy.save/load [1,2] on the Python side.
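
As a rough illustration of what is exchanged, here is a minimal pure-Python
sketch of the version 1.0 .npy layout that numpy.save writes: a magic string,
a version, a padded text header, then the raw data. The helper names
write_npy and read_npy are invented for this example; they are not part of
NumPy or of NumPyFileSaver/Loader, and only little-endian float64 data in C
order is handled.

```python
import ast
import struct

MAGIC = b"\x93NUMPY"  # 6-byte magic string that starts every .npy file


def write_npy(shape, values):
    """Serialize a flat list of floats as a version 1.0 .npy payload."""
    shape = tuple(shape)
    count = 1
    for dim in shape:
        count *= dim
    if count != len(values):
        raise ValueError("shape does not match the number of values")
    # The header is a Python dict literal describing dtype, order and shape.
    header = "{'descr': '<f8', 'fortran_order': False, 'shape': %r, }" % (shape,)
    # Pad with spaces and end with a newline so the data starts on a
    # 64-byte boundary (magic + 2 version bytes + 2 length bytes + header).
    unpadded = len(MAGIC) + 2 + 2 + len(header) + 1
    header += " " * (-unpadded % 64) + "\n"
    head = header.encode("latin1")
    return (MAGIC + bytes([1, 0]) + struct.pack("<H", len(head))
            + head + struct.pack("<%dd" % count, *values))


def read_npy(payload):
    """Parse a payload produced by write_npy back into (shape, values)."""
    if payload[:6] != MAGIC or payload[6:8] != bytes([1, 0]):
        raise ValueError("not a version 1.0 .npy payload")
    (hlen,) = struct.unpack("<H", payload[8:10])
    meta = ast.literal_eval(payload[10:10 + hlen].decode("latin1").strip())
    if meta["descr"] != "<f8" or meta["fortran_order"]:
        raise ValueError("this sketch only reads little-endian float64, C order")
    shape = tuple(meta["shape"])
    count = 1
    for dim in shape:
        count *= dim
    return shape, list(struct.unpack("<%dd" % count, payload[10 + hlen:]))
```

In practice the Python side would simply call numpy.load on the file; the
sketch only shows why a Java writer can produce the same bytes without NumPy.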

However, py4j 0.11 (in development [3]) is adding an extension
mechanism, with the eventual goal of merging AnalysisRpc and
Py4J into one communication mechanism.

Jonah


[1] https://docs.scipy.org/doc/numpy/reference/generated/numpy.save.html
[2] https://docs.scipy.org/doc/numpy/reference/generated/numpy.load.html
[3] https://github.com/bartdag/py4j/issues/249


~~~
Jonah Graham
Kichwa Coders Ltd.
www.kichwacoders.com


On 31 January 2017 at 09:20,  <Matt.Gerring@xxxxxxxxxxxxx> wrote:
> ndarray!
>
>
>
> From: january-dev-bounces@xxxxxxxxxxx
> [mailto:january-dev-bounces@xxxxxxxxxxx] On Behalf Of
> Matt.Gerring@xxxxxxxxxxxxx
> Sent: 31 January 2017 09:07
>
>
> To: january-dev@xxxxxxxxxxx
> Subject: Re: [january-dev] IDataset <-> python
>
>
>
> Hi Scott,
>
>
>
> I suggest the ndarray file format. This class is an example of
> Java writing an ndarray to file; we could adapt it:
>
> https://github.com/DawnScience/scisoft-core/blob/master/uk.ac.diamond.scisoft.analysis/src/uk/ac/diamond/scisoft/analysis/io/NumPyFileSaver.java
>
>
>
> Maybe we could stream it between Java and Python using the port approach
> py4j uses? I seem to remember Py4j works by encoding certain types; when it
> sees an IDataset it could use a class like this to serialize the IDataset to
> an ndarray binary stream. This would be readable by the py4j Python code on
> the other side, assuming that numpy is installed in that Python.
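
A minimal sketch of the streaming idea, assuming a simple length-prefixed
framing over a socket. This is not how py4j frames its messages; the function
names are hypothetical, both ends are Python here purely so the example runs
on its own, and the payload is raw float64 bytes rather than a full ndarray
stream.

```python
import socket
import struct


def send_frame(sock, payload):
    """Send one message: a 4-byte big-endian length, then the bytes."""
    sock.sendall(struct.pack(">I", len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes, looping because recv may return fewer."""
    chunks = []
    remaining = n
    while remaining:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("socket closed mid-frame")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def recv_frame(sock):
    """Receive one message written by send_frame."""
    (length,) = struct.unpack(">I", recv_exact(sock, 4))
    return recv_exact(sock, length)


def roundtrip(values):
    """Push a small list of floats through a local socket pair and back."""
    sender, receiver = socket.socketpair()
    try:
        send_frame(sender, struct.pack("<%dd" % len(values), *values))
        raw = recv_frame(receiver)
        return list(struct.unpack("<%dd" % (len(raw) // 8), raw))
    finally:
        sender.close()
        receiver.close()
```

The length prefix is what lets the receiver know where one dataset ends and
the next begins on a long-lived connection.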
>
>
>
> Cheerio,
>
>
>
> Matt
>
>
>
> From: january-dev-bounces@xxxxxxxxxxx
> [mailto:january-dev-bounces@xxxxxxxxxxx] On Behalf Of Scott Lewis
> Sent: 30 January 2017 22:59
> To: january-dev@xxxxxxxxxxx
> Subject: Re: [january-dev] IDataset <-> python
>
>
>
> Hi All,
>
> First, thanks for the replies.  I have some thoughts and comments, based
> upon our use case.
>
> I understand about the xml-rpc and/or py4j usage.   My suggestion would be
> to separate the logic for dataset serialization (i.e. xml-rpc, py4j,
> something else) from the logic for conversion between the java IDataset
> types and the python Dataset instances...in both java and python.   The
> reason this would be valuable to us is that we would like to be able to use
> remote services for communicating between processes...e.g. java<->python or
> java<->java...starting with py4j for java<->python most likely, and possibly
> using other serialization formats and transports...e.g. xmlrpc+socket,
> xmlrpc over messaging bus, object serialization, etc.   ECF's remote
> services allows this sort of 'pluggability of transports/serialization
> formats' without changing the exchange/communication service itself.
>
> But even with such a service in place, there obviously has to be some
> conversion between IDataset (and the various types of IDataset) and the
> python Dataset class (or extension/subclass, etc).   It would be nice from
> our point of view if this conversion were also available from
> January...using whatever code exists and fits that bill.    I frankly don't
> have a clear idea of how much work the conversion is, but I imagine what you
> have been discussing on this thread will likely do much of what's needed
> (e.g. flattening) and of course all of you do know what's involved.
>
> This sort of separation...i.e. between java<->python exchange of types and
> the conversion between IDataset (java types) and Dataset python type/s would
> be helpful to us as it would allow substitution of various transports,
> without having the conversion be bound to a particular communications
> transport (xml-rpc or py4j or whatever).
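
The separation described above can be sketched roughly as follows. Everything
here is hypothetical: the flatten/unflatten pair stands in for the
IDataset <-> Python conversion, the codec classes stand in for
interchangeable serialization formats, and none of the names come from
January, ECF or py4j.

```python
import json
import pickle


def flatten(shape, values):
    """Conversion step: turn a dataset into a transport-neutral dict."""
    return {"dtype": "float64", "shape": list(shape), "data": list(values)}


def unflatten(flat):
    """Inverse conversion: rebuild (shape, values) from the neutral dict."""
    return tuple(flat["shape"]), list(flat["data"])


class JsonCodec:
    """One possible wire format; knows nothing about datasets."""

    def encode(self, flat):
        return json.dumps(flat).encode("utf-8")

    def decode(self, raw):
        return json.loads(raw.decode("utf-8"))


class PickleCodec:
    """A second wire format with the same two-method interface."""

    def encode(self, flat):
        return pickle.dumps(flat)

    def decode(self, raw):
        return pickle.loads(raw)


def exchange(codec, shape, values):
    """Send a dataset through any codec without touching the conversion."""
    wire = codec.encode(flatten(shape, values))
    return unflatten(codec.decode(wire))
```

Swapping JsonCodec for PickleCodec changes only the bytes on the wire; the
conversion functions stay the same, which is the property being asked for.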
>
> And FWIW, I would be happy to help particularly with the java<->python
> exchange of types, as I've already been working for some time with using
> OSGi remote services over py4j to communicate between java and python [1]
> and would certainly be happy to contribute this work to such an effort.
>
> Thanks,
>
> Scott
>
> [1] https://github.com/ECF/Py4j-RemoteServicesProvider
>
> On 1/30/2017 5:32 AM, Matt.Gerring@xxxxxxxxxxxxx wrote:
>
> Hi Erwin,
>
>
>
> Xtext has several repos:  https://github.com/eclipse/xtext
>
>
>
> We could do the same for january:
>
> https://github.com/eclipse/january           (front page explaining collaboration)
> https://github.com/eclipse/january-dataset   (dataset)
> https://github.com/eclipse/january-forms     (forms)
> https://github.com/eclipse/january-ndarray   (ndarray<->dataset; flattening and python for xmlrpc later)
>
>
>
>
>
> Matt
>
>
>
> From: Erwin De Ley [mailto:erwin.de.ley@xxxxxxxxxx]
> Sent: 30 January 2017 13:25
> To: Gerring, Matt (DLSLtd,RAL,LSCI); january-dev@xxxxxxxxxxx
> Cc: cxh@xxxxxxxxxxxx
> Subject: Re: [january-dev] IDataset <-> python
>
>
>
> That would be great.
>
> (Although you'll need to explain to me what you mean by 2 and 3 January
> repos, when the time comes)
>
> Having form-related data structures and being able to pass them around is
> also of interest for Triquetrum, when we start implementing interactive
> workflows.
>
> regards
>
> erwin
>
>
>
> On 1/30/2017 at 1:52 PM, Matt.Gerring@xxxxxxxxxxxxx wrote:
>
> Erwin and Scott,
>
>
>
> Perhaps a possible way forward might be to add IDataset to the Triquetrum
> repo, as this is the cleanest version? Then move that to a third January repo
> (we would like to have 2, one for dataset and one for forms)? I have sent the
> dataset/ndarray flatteners through (below as well).
>
>
>
> I do not mind helping in this process as it just seems to be moving around
> code. What do you think?
>
>
>
> Matt
>
>
>
> https://github.com/DawnScience/scisoft-core/blob/master/uk.ac.diamond.scisoft.analysis.xmlrpc/src/uk/ac/diamond/scisoft/analysis/rpc/flattening/helpers/DatasetHelper.java
>
>
>
> https://github.com/DawnScience/scisoft-core/blob/master/uk.ac.diamond.scisoft.python/src/scisoftpy/python/pyflatten.py
>
>
>
>
>
> From: Erwin De Ley [mailto:erwin.de.ley@xxxxxxxxxx]
> Sent: 30 January 2017 12:38
> To: january-dev@xxxxxxxxxxx
> Cc: Gerring, Matt (DLSLtd,RAL,LSCI); cxh@xxxxxxxxxxxx
> Subject: Re: [january-dev] IDataset <-> python
>
>
>
> Hi Scott,
>
> As Matt mentioned, the Java-Python RPC-based integration developed for DAWN
> has been migrated/duplicated in Triquetrum.
> This was done in collaboration with Peter & Matt together with/by Jonah (who
> also developed the original package for DAWN).
>
> The goals of getting this in Triquetrum were/are :
>
> - to extract a working version with minimal dependencies
>   (e.g. without dependencies on DAWN specifics or Triquetrum specifics)
> - to have eclipse-approved sources and binaries, i.e. be part of an official
>   project
> - to be easy to reuse and integrate (also outside of Triquetrum)
>
> On https://github.com/eclipse/triquetrum , the 2 bundles involved are :
>
> - org.eclipse.triquetrum.scisoft.analysis.rpc : the technical layer to manage
>   connections and flatten/unflatten ; a new incarnation of
>   uk.ac.diamond.scisoft.analysis.xmlrpc
> - org.eclipse.triquetrum.python.service : mainly the PythonService utility
>   class (not yet a regular OSGi service approach)
>
> Test code for analysis.rpc is also available & some example code on how to
> use the PythonService class (this last part is in the examples folder).
> The main dependencies are on Apache XML-RPC, which was added to Orbit
> during the integration in Triquetrum.
>
>
>
> From your request it's not yet clear to me if it would fit your needs in its
> current state.
>
> In particular, the version in Triquetrum does not contain anything (yet?)
> related to January IDataset.
> It currently has flatteners for simple Java types, and maps them to "input"
> arguments of your Python scripts, which should have a "run()" function as a
> starting point.
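
To make the "run()" convention concrete, here is a small sketch of a script
with a run function and a dispatcher that feeds it simple-typed inputs. The
dispatcher call_script and the argument names are assumptions made for
illustration; how PythonService actually locates and calls run() is not shown
in this thread.

```python
def run(scale, samples):
    """Example script entry point: takes simple types, returns simple types."""
    return {"scaled": [scale * s for s in samples]}


def call_script(namespace, inputs):
    """Hypothetical dispatcher: look up run() and pass inputs as keywords."""
    entry = namespace.get("run")
    if not callable(entry):
        raise LookupError("script does not define a run() function")
    # The inputs are assumed to be already unflattened into plain Python
    # values (numbers, strings, lists, dicts) by the RPC layer.
    return entry(**inputs)
```

For example, call_script({"run": run}, {"scale": 2, "samples": [1, 2, 3]})
returns a dict whose "scaled" entry is the doubled list.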
>
> Integration with January is planned for Triquetrum, so I would be in favor
> of adding IDataset support in the Python service as well.
> But we should discuss planning and approach between the science projects for
> this.
> (also taking into account Matt's remark about defining what the best place
> is for the long-term maintenance of such an integration layer)
>
>
>
> If your project has an interest in moving this forward (and might even be
> able to contribute to or fund some of this?) that would of
> course be great!
>
> regards
>
> erwin
>
>
>
> On 1/30/2017 at 10:33 AM, Matt.Gerring@xxxxxxxxxxxxx wrote:
>
> Hi Scott,
>
>
>
> It is exciting that you are using it for that! This is a great new usage,
> exactly the kind of thing we were hoping for in making the work open source,
> thanks for picking it up.
>
>
>
> I introduced py4j to dawnsci and in that project we send around
> ndarray<->IDataset from a py4j connection to do things like plotting; so you
> are not alone in doing this kind of thing. There is something called
> AnalysisRpc which flattens ndarray and unflattens it in Java as IDataset. This
> was commissioned by Diamond Light Source as a work package with Kichwa
> Coders, and Jonah from Kichwa is one of the project leads of January...
>
>
>
> To expand on what Peter said, the flattening service, which is the heart of
> integrating ndarray with IDataset, is currently here:
>
> https://github.com/DawnScience/scisoft-core/tree/master/uk.ac.diamond.scisoft.analysis.xmlrpc
>
>
>
> We have spoken often and done little (AFAIK) about making a separately
> compiling/building/testing github project for this work. However my current
> understanding is that this code needs to be reworked to make it usable by
> any project. I did some work to put it into its current low dependency form
> but it would need more I think to be useful to you. There are still a few
> dependencies there:
>
> https://github.com/DawnScience/scisoft-core/blob/master/uk.ac.diamond.scisoft.analysis.xmlrpc/META-INF/MANIFEST.MF
>
>
>
> Eclipse Triquetrum also looked at some of this stuff and may have another
> version. I have copied in Erwin and Christopher just in case.
>
>
>
> In terms of releases, the plan is to follow eclipse release trains but as
> Peter indicates there may be need for more. This is because Diamond Light
> Source funds changes in it that sometimes require something at short notice
> (well it could be anyone doing the funding but DLS tends to so far...)
>
>
>
> Cheerio,
>
>
>
> Matt
>
>
>
> -----Original Message-----
> From: january-dev-bounces@xxxxxxxxxxx
> [mailto:january-dev-bounces@xxxxxxxxxxx] On Behalf Of
> Peter.Chang@xxxxxxxxxxxxx
> Sent: 30 January 2017 09:28
> To: january-dev@xxxxxxxxxxx
> Subject: Re: [january-dev] IDataset <-> python
>
>
>
>
>
> Hi Scott,
>
>
>
> We do have and use two mechanisms for communicating with Python: xml-rpc and
> Py4J. At the moment, they are part of Dawn Science
> (https://github.com/DawnScience/scisoft-core) and are based on saving the
> data to a filesystem that is accessible to both sides.
>
>
>
> The next version of January will be a major one. Although there has not been
> any discussion, I am looking at a March release.
>
>
>
> Best regards,
>
>  Peter
>
>
>
>
>
>
>
>
>
>
>
> -----Original Message-----
> From: january-dev-bounces@xxxxxxxxxxx
> [mailto:january-dev-bounces@xxxxxxxxxxx] On Behalf Of Scott Lewis
> Sent: 27 January 2017 19:46
> To: january-dev@xxxxxxxxxxx
> Subject: [january-dev] IDataset <-> python
>
>
>
> Hi Folks,
>
>
>
> Is there any existing work on going between an IDataset instance (in
> java) and a Dataset instance in Python using Py4j?   We are using
> January to build an IoT data analytics framework that is built on
> java/OSGi/kura/our stuff.   It includes a way to dynamically
> create/add/remove osgi services that pass IDataset instances between
> components to perform various analytics.
>
>
>
> One thing that we might be interested in doing is having these osgi services
> implemented via Python...which would necessitate exchanging IDataset <->
> Python dataset instances.
>
>
>
> Another question:   When is the next expected release of January?
> Will it be a major, minor, maintenance?
>
>
>
> Thanks in advance for any information.
>
>
>
> Scott
>
>
>
>
>
> _______________________________________________
>
> january-dev mailing list
>
> january-dev@xxxxxxxxxxx
>
> To change your delivery options, retrieve your password, or unsubscribe from
> this list, visit https://dev.eclipse.org/mailman/listinfo/january-dev
>
>
>
> --
>
> This e-mail and any attachments may contain confidential, copyright and or
> privileged material, and are for the use of the intended addressee only. If
> you are not the intended addressee or an authorised recipient of the
> addressee please notify us of receipt by returning the e-mail and do not
> use, copy, retain, distribute or disclose the information in or attached to
> the e-mail.
>
> Any opinions expressed within this e-mail are those of the individual and
> not necessarily of Diamond Light Source Ltd.
>
> Diamond Light Source Ltd. cannot guarantee that this e-mail or any
> attachments are free from viruses and we cannot accept liability for any
> damage which you may sustain as a result of software viruses which may be
> transmitted in or with the message.
>
> Diamond Light Source Limited (company no. 4375679). Registered in England
> and Wales with its registered office at Diamond House, Harwell Science and
> Innovation Campus, Didcot, Oxfordshire, OX11 0DE, United Kingdom
>
>
>

