Hey Mark. Comments inline.
-----Original Message-----
From: cosmos-dev-bounces@xxxxxxxxxxx [mailto:cosmos-dev-bounces@xxxxxxxxxxx] On Behalf Of Mark D Weitzel
Sent: Tuesday, April 03, 2007 1:18 PM
To: cosmos-dev@xxxxxxxxxxx
Subject: [cosmos-dev] Using JPA in COSMOS
Joel/Don et al,
I've been trolling around the web looking for more information on JPA. As we
are starting to understand this a bit more, it will be important to make sure
we articulate where/how we intend to use this in COSMOS, esp. in regard to the
use cases it satisfies. With that in mind, I wanted to try to get clarity on a
few initial areas.
From the point of view of the persistence layer, I'm assuming the contract
between the Data Filter & Transformation layer and the Data Sink is a
collection of POJOs.
Will these POJOs require JPA annotations?
No.
Does this require the Data Sink to be a JPA implementation?
No.
Many of our scenarios account for the incorporation of legacy data stores that
will not cleanly map into these new POJOs.
Then make new POJOs. I’m not sure which set “these” new POJOs are drawn from…
This is actually the default case since COSMOS will not ship its own data
store, but will rely on one provided by TPTP. How does JPA help us in this
scenario?
My recollection is that TPTP wants to move away from its existing data store,
hence they can choose to implement using JPA if they want.
In some scenarios, we may not have an RDB, e.g. the repository for SML and the
resource information may be a file system. What advantages does JPA provide in
this situation?
None. Again, JPA is an optional implementation, and not part of the component
contract.
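To make the "optional implementation" point concrete, here is a minimal sketch of a sink contract that mentions no JPA types. The names DataSink, ResourceRecord and InMemoryDataSink are made up for illustration and are not existing COSMOS or TPTP classes. A file-system or JPA-backed sink would implement the same interface.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical contract between the Data Filter & Transformation layer and a
// Data Sink: a collection of plain objects, with no JPA types in the signature.
interface DataSink {
    void write(Collection<?> pojos);
}

// A plain POJO with no persistence annotations (illustrative only).
class ResourceRecord {
    private final String id;
    private final String description;

    ResourceRecord(String id, String description) {
        this.id = id;
        this.description = description;
    }

    String getId() { return id; }
    String getDescription() { return description; }
}

// One possible sink that simply keeps records in memory. A file-system or
// JPA-backed sink would implement the same interface without changing callers.
class InMemoryDataSink implements DataSink {
    private final List<Object> stored = new ArrayList<>();

    @Override
    public void write(Collection<?> pojos) {
        stored.addAll(pojos);
    }

    List<Object> contents() {
        return Collections.unmodifiableList(stored);
    }
}

class DataSinkDemo {
    public static void main(String[] args) {
        InMemoryDataSink sink = new InMemoryDataSink();
        sink.write(Arrays.asList(new ResourceRecord("r1", "web server")));
        System.out.println("records stored: " + sink.contents().size());
    }
}
```

The caller only ever sees DataSink, so whether a given sink binds to its store through JPA stays a private decision of that sink.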
When we met in Toronto we spent some time talking about lessons learned from
TPTP, esp. regarding the use of EMF in the APIs. No one seemed to like this at
all. As I was looking at the JPA examples, they do not appear to be much
different--both include metadata in the Java classes about the underlying
model. The JPA annotations work in the same way. Here's an example...
@ManyToOne()
@JoinColumn(name="CUSTOMER_ID")
public Customer getCustomer() {
    return customer;
}
This little bit of code has several implications, for example, it reveals the
structure of the RDB behind the scenes the same way that EMF did.
Only at the level of the direct binding to the underlying datastore. From the
point of view of an in-process consumer, only the POJO matters, not the
annotations. Also, if you want to completely strip off the annotations, then
exposure as an interface rather than an object would do that.
Also, once the POJO has been serialized (for a remote consumer), then there is
no remnant of the RDB left at all.
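A small sketch of the interface idea, under some assumptions: JoinColumnStandIn is a local stand-in for the real JPA @JoinColumn annotation (used only so the snippet compiles without a JPA jar), and Order / OrderEntity are hypothetical names. It uses reflection to compare what a consumer holding the interface can see with what the annotated implementation carries.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Local stand-in for the JPA @JoinColumn annotation so this sketch compiles
// without a JPA jar. It is NOT the real annotation.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface JoinColumnStandIn {
    String name();
}

class Customer {
    private final String name;
    Customer(String name) { this.name = name; }
    String getName() { return name; }
}

// What an in-process consumer would be handed: no persistence metadata here.
interface Order {
    Customer getCustomer();
}

// The sink-side implementation carries the mapping metadata.
class OrderEntity implements Order {
    private final Customer customer;
    OrderEntity(Customer customer) { this.customer = customer; }

    @Override
    @JoinColumnStandIn(name = "CUSTOMER_ID")
    public Customer getCustomer() { return customer; }
}

class AnnotationVisibilityDemo {
    // Counts the runtime-visible annotations on getCustomer() as seen through
    // the given type.
    static int annotationCount(Class<?> type) {
        try {
            return type.getMethod("getCustomer").getDeclaredAnnotations().length;
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("via interface: " + annotationCount(Order.class));
        System.out.println("via entity:    " + annotationCount(OrderEntity.class));
    }
}
```

Method annotations are not inherited upward, so code written against Order never encounters the column mapping; only code that reaches for OrderEntity does.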
Also, it tightly couples the API to the database schema--something we did not
want to do. In addition to the db schema, it also reveals very specific
database structure (the exposure of CUSTOMER_ID as a foreign key). Are we
repeating the same API mistake as TPTP? What is the real value of JPA given
these trade-offs?
I see JPA doing two things: providing a quick way to integrate RDB-based
datastores, and providing an API to emulate.
The above issues are from the standpoint of writing data to the database.
Here are some thoughts on querying...
Does the use of JPA require/force a set of APIs (EntityManager) on the Data
Sinks, e.g. do they need to implement or be JPA "Entity Managers"?
No, only on those Data Sinks that choose to bind to the underlying store using
JPA.
Here's another example:
// Create new EntityManager
EntityManager em = emf.createEntityManager();
Query q = em.createQuery("select c from Customer c where c.name = :name");
q.setParameter("name", "Joe Smith");
We need to determine where this falls in the COSMOS framework, but I'm hoping
this would be buried deep within the DC layer b/c we did not want to expose
SQL. I'm assuming that this would be fronted by some other service-like API
that would be the external API COSMOS would offer, e.g.
customerService.getCustomersByName("Joe Smith").
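One way to picture that kind of service-style front, as a sketch only: the CustomerService interface below exposes the lookup by name and nothing about the query language. InMemoryCustomerService is a hypothetical stand-in used to keep the example runnable; a JPA-backed implementation would build its query string inside the same method, out of callers' sight.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class Customer {
    private final String name;
    Customer(String name) { this.name = name; }
    String getName() { return name; }
}

// External, query-language-free API (method name taken from the example above).
interface CustomerService {
    List<Customer> getCustomersByName(String name);
}

// Hypothetical stand-in for whatever sits in the data collection layer.
// It filters an in-memory list instead of running a query.
class InMemoryCustomerService implements CustomerService {
    private final List<Customer> customers;

    InMemoryCustomerService(List<Customer> customers) {
        this.customers = new ArrayList<>(customers);
    }

    @Override
    public List<Customer> getCustomersByName(String name) {
        List<Customer> matches = new ArrayList<>();
        for (Customer c : customers) {
            if (c.getName().equals(name)) {
                matches.add(c);
            }
        }
        return matches;
    }
}

class CustomerServiceDemo {
    public static void main(String[] args) {
        CustomerService customerService = new InMemoryCustomerService(Arrays.asList(
                new Customer("Joe Smith"), new Customer("Ann Lee"), new Customer("Joe Smith")));
        System.out.println("matches: " + customerService.getCustomersByName("Joe Smith").size());
    }
}
```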
I need to look at the query interfaces a bit more, but right now, I'm not sure
how we handle security or large data sets. What if we get 10,000 "Joe Smiths"?
The JPA query API lets you specify a max result set size and a starting index
(to enable paging through a dataset). We just need to add those parameters to
our query interface.
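As a rough sketch of what those two extra parameters could look like on our side, assuming a hypothetical Paging helper (not an existing COSMOS class): it applies a starting index and a maximum count to an already-ordered result list. In a JPA-backed sink the same two values would instead be passed to Query.setFirstResult(int) and Query.setMaxResults(int), so the slicing happens in the store.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper that applies a starting index and a maximum result
// count to an ordered result list, slicing in memory for illustration.
final class Paging {
    private Paging() {}

    static <T> List<T> page(List<T> ordered, int firstResult, int maxResults) {
        if (firstResult < 0 || maxResults < 0) {
            throw new IllegalArgumentException(
                    "firstResult and maxResults must be non-negative");
        }
        if (firstResult >= ordered.size()) {
            return Collections.emptyList();
        }
        // Use long arithmetic so a very large maxResults cannot overflow.
        int end = (int) Math.min((long) firstResult + maxResults, ordered.size());
        return new ArrayList<>(ordered.subList(firstResult, end));
    }
}

class PagingDemo {
    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < 10000; i++) {
            names.add("Joe Smith #" + i);
        }
        // Third page of 100 results: start at index 200, return at most 100.
        List<String> page = Paging.page(names, 200, 100);
        System.out.println(page.size() + " results, first = " + page.get(0));
    }
}
```

With this shape, 10,000 matches never have to cross the API in one call; the consumer walks through them a page at a time.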
Please let me know your thoughts. I'm interested in what other issues we need
to consider, including non-functional requirements, e.g. does the use of JPA
foster/inhibit adoption, etc.
-mw
Mark Weitzel | STSM | IBM Software Group | Tivoli | Autonomic Computing | (919) 543 0625 | weitzelm@xxxxxxxxxx