Re: [science-iwg] Data Structures


Re: [science-iwg] Data Structures - part deux

From: <Matt.Gerring@xxxxxxxxxxxxx>
Date: Wed, 27 Jan 2016 15:51:37 +0000

She’s the boss of one project leader… ;-)

I think I half understand what this is about. Maybe we could discuss it further some time.

Matt

From: science-iwg-bounces@xxxxxxxxxxx [mailto:science-iwg-bounces@xxxxxxxxxxx] On Behalf Of Jay Jay Billings
Sent: 27 January 2016 15:02
To: Science Industry Working Group
Subject: Re: [science-iwg] Data Structures - part deux

Tracy,

Also, if you are the "self-appointed project manager" and will be contributing to the work, then you should be listed as a committer on the proposal.

Jay

On Wed, Jan 27, 2016 at 9:56 AM, Jay Jay Billings <jayjaybillings@xxxxxxxxx> wrote:

Tracy,

Thanks for getting this going. First, let me say that if we want January to be just about 'numpy for Java,' that is completely OK with me. We should just make that clear in the scope. In that case, we would be looking more at ICE and EAVP using January instead of the data structures from ICE and EAVP being moved into January.

I just shared a description of our data structures with Matt on the other thread. I have expanded it and share it below.

Jay

-----

Here's the code:

https://github.com/eclipse/ice/tree/master/org.eclipse.ice.datastructures

The goal of this package is to create general purpose data classes, structures and pattern realizations that can be mapped to a wide range of scientific problems while also maintaining metadata about that information. They are also all bound with JAXB so that they can be persisted to XML. Their design is verbose so that developers can almost immediately know how to pack their data into the classes.

They are, in a sense, the exact opposite of IDataSet because they are designed to store "higher-level" quantities meant for direct consumption by users (as opposed to reduction into a plot, etc.). We store all raw, n-dimensional data in files and link to those files through our ResourceComponent.

Our long-term goals are to switch this package to an EMF model, optimize the way metadata is stored, use IDataSet to back structures like MatrixComponent and ResourceComponent (ILazyDataSet in this case), and allow developers to create their own Component implementations simply through annotations.

Consider, for example, a battery. If the state of that battery is represented on disk by five quantities - say a string, two integers and two floats - and each of those quantities has associated metadata such as descriptions, ids, names, etc., then we could map them as follows:

Battery --> 1 instance of DataComponent

Quantities 1-5 --> 5 instances of Entry
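The mapping above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the `Entry` and `DataComponent` classes below are simplified, hypothetical stand-ins written for this example, not the real ICE classes or their signatures, and the battery values are made up.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-ins for illustration; NOT the real ICE API.
final class BatterySketch {

    /** One quantity plus its metadata (simplified, assumed shape of an Entry). */
    static final class Entry<T> {
        final int id;
        final String name;
        final String description;
        final T value;

        Entry(int id, String name, String description, T value) {
            this.id = id;
            this.name = name;
            this.description = description;
            this.value = value;
        }
    }

    /** A named group of entries (simplified, assumed shape of a DataComponent). */
    static final class DataComponent {
        final String name;
        private final List<Entry<?>> entries = new ArrayList<>();

        DataComponent(String name) {
            this.name = name;
        }

        DataComponent add(Entry<?> entry) {
            entries.add(entry);
            return this; // allow chained calls
        }

        List<Entry<?>> entries() {
            return Collections.unmodifiableList(entries);
        }
    }

    /** One DataComponent for the battery, five Entry instances for its state. */
    static DataComponent buildBattery() {
        return new DataComponent("Battery")
                .add(new Entry<>(1, "chemistry", "Cell chemistry", "Li-ion"))
                .add(new Entry<>(2, "cellCount", "Number of cells", 6))
                .add(new Entry<>(3, "cycleCount", "Charge cycles so far", 120))
                .add(new Entry<>(4, "voltage", "Terminal voltage in volts", 3.7f))
                .add(new Entry<>(5, "capacity", "Capacity in amp-hours", 2.5f));
    }

    public static void main(String[] args) {
        for (Entry<?> e : buildBattery().entries()) {
            System.out.println(e.id + " " + e.name + " = " + e.value
                    + " (" + e.description + ")");
        }
    }
}
```

The point of the sketch is that each quantity travels with its own id, name and description, so a UI or workflow engine can present it without extra lookups.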

Let's consider another example: a 3D geometry. In this case, the developer would use a GeometryComponent and the associated CSG tree (which is moving to EAVP) to create a 3D geometry constructed from shapes and boolean operations on those shapes. Alternatively, they could construct that geometry purely from a mesh using a MeshComponent and Edges, Vertices, etc.
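As a rough sketch of what "shapes and boolean operations on those shapes" means, the Java below composes two primitive solids with union, intersection and difference. Everything here (`Solid`, `sphere`, `box`, and the combinators) is a hypothetical name invented for this example and is not taken from GeometryComponent or EAVP; the nested calls form the CSG tree implicitly.

```java
// Hypothetical CSG sketch; NOT the ICE/EAVP geometry API.
final class CsgSketch {

    /** A solid is anything that can answer "is this point inside me?". */
    interface Solid {
        boolean contains(double x, double y, double z);
    }

    /** Primitive: sphere with center (cx, cy, cz) and radius r. */
    static Solid sphere(double cx, double cy, double cz, double r) {
        return (x, y, z) -> {
            double dx = x - cx, dy = y - cy, dz = z - cz;
            return dx * dx + dy * dy + dz * dz <= r * r;
        };
    }

    /** Primitive: axis-aligned box between two corner points. */
    static Solid box(double minX, double minY, double minZ,
                     double maxX, double maxY, double maxZ) {
        return (x, y, z) -> x >= minX && x <= maxX
                && y >= minY && y <= maxY
                && z >= minZ && z <= maxZ;
    }

    /** Boolean operations: each one is an interior node of the CSG tree. */
    static Solid union(Solid a, Solid b) {
        return (x, y, z) -> a.contains(x, y, z) || b.contains(x, y, z);
    }

    static Solid intersection(Solid a, Solid b) {
        return (x, y, z) -> a.contains(x, y, z) && b.contains(x, y, z);
    }

    static Solid difference(Solid a, Solid b) {
        return (x, y, z) -> a.contains(x, y, z) && !b.contains(x, y, z);
    }

    public static void main(String[] args) {
        Solid ball = sphere(0, 0, 0, 1);
        Solid cube = box(0, 0, 0, 2, 2, 2);
        Solid carved = difference(ball, cube); // ball with one octant removed
        System.out.println(carved.contains(-0.5, 0, 0));
        System.out.println(carved.contains(0.5, 0.5, 0.5));
    }
}
```

A mesh-based description, by contrast, would store the vertices and edges explicitly rather than deriving the shape from operations.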

Other classes, such as ListComponent, offer generic solutions for storing whatever data structure a user can come up with, so long as they provide JAXB bindings on that class so that it can be written to disk.

After that, any collection of Components is stored in a root class called Form, which is processed by the workflow engine and the UI. All of this creates a single gigantic tree structure that can be walked in O(N) time by smartly implementing the IComponentVisitor interface.
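The tree walk can be illustrated with a minimal visitor sketch. The `Visitor`, `Node`, `Leaf` and `Group` types below are hypothetical simplifications made up for this example; the real IComponentVisitor interface has its own methods, which are not reproduced here. The walk is O(N) because every node is visited exactly once.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical visitor sketch; NOT the real IComponentVisitor interface.
final class VisitorSketch {

    /** One visit method per node type (double dispatch). */
    interface Visitor {
        void visit(Leaf leaf);
        void visit(Group group);
    }

    interface Node {
        void accept(Visitor visitor);
    }

    /** A node with no children, e.g. a single quantity. */
    static final class Leaf implements Node {
        final String name;

        Leaf(String name) {
            this.name = name;
        }

        @Override
        public void accept(Visitor visitor) {
            visitor.visit(this);
        }
    }

    /** A node that holds other nodes, e.g. a form or a component. */
    static final class Group implements Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Group(String name, Node... kids) {
            this.name = name;
            children.addAll(Arrays.asList(kids));
        }

        @Override
        public void accept(Visitor visitor) {
            visitor.visit(this);
        }
    }

    /** Counts nodes; each node is touched once, so the walk is linear. */
    static final class CountingVisitor implements Visitor {
        int leaves;
        int groups;

        @Override
        public void visit(Leaf leaf) {
            leaves++;
        }

        @Override
        public void visit(Group group) {
            groups++;
            for (Node child : group.children) {
                child.accept(this); // recurse into the subtree
            }
        }
    }

    /** A tiny made-up tree: a root form with one component and one loose leaf. */
    static Group sampleForm() {
        return new Group("Form",
                new Group("Battery", new Leaf("voltage"), new Leaf("capacity")),
                new Leaf("notes"));
    }

    public static void main(String[] args) {
        CountingVisitor counter = new CountingVisitor();
        sampleForm().accept(counter);
        System.out.println("groups=" + counter.groups + " leaves=" + counter.leaves);
    }
}
```

The design choice worth noting is that the node classes never need `instanceof` checks: each type calls the matching `visit` overload itself.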

On Wed, Jan 27, 2016 at 8:34 AM, Tracy Miranda <tracy@xxxxxxxxxxxxxxxx> wrote:

Hi all,

Following on from feedback on the January project proposal, this is a thread for clarifying the scope and what the project should encompass.

As a sort-of self-appointed product manager, I'm looking at it from the user perspective, trying to answer these questions:

- What is it all about?

- What problems does it solve?

- Who really gives a damn?

For the initial proposal, touted as a 'numpy for Java', I have good answers for all those questions (mainly from the proposal itself, and work on python integration with Java).

When it comes to expanding the scope, I'm guilty of getting excited about integrating all the tools and not necessarily understanding what the structures are or are good for and how they all fit together.

I am certainly aware of specific use-cases beyond the current nd-array implementation, especially for the Triquetrum project, but it's pretty limited.

So maybe it is best to start with both the ICE and EAVP data structures first - do some good knowledge transfer from Jay on the types of structures we are talking about and the use cases for these...

Tracy


Jay Jay Billings

Oak Ridge National Laboratory

Twitter Handle: @jayjaybillings
