Any tools for reducing a schema? [message #37951]
Fri, 05 March 2004 11:29
Eclipse User
Originally posted by: adam.NOSPAMsoftfab.com
Dear All,
I am currently working on a big health system in the UK. We get these huge
schemas for a thing called "HL7".
The std has nearly everything as 0..* & other massive generalizations such
that it can be all things to all people. However as it is backended by
real objects & a real db I have to reduce the scope of the schema down to
some "specific schemas" e.g. only 5 lines of "address" & not 0 to infinity
of them.
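The kind of narrowing described above (capping an unbounded repeat) can be sketched roughly as follows. This is only an illustration in Python using the standard library, not the Eclipse XSD API discussed in this thread; the `Patient` type, the `addr` element and the helper `cap_occurrences` are made-up names, and the sketch assumes the new cap is not below the declaration's `minOccurs`.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical stand-in for one fragment of a generic schema.
GENERIC = f"""
<xs:schema xmlns:xs="{XS}">
  <xs:complexType name="Patient">
    <xs:sequence>
      <xs:element name="addr" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element name="name" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
"""

def cap_occurrences(schema_root, element_name, max_occurs):
    """Lower maxOccurs on every element declaration with the given name.

    Returns the number of declarations changed. A bound is only ever
    tightened, never loosened. Assumes max_occurs >= the declaration's
    minOccurs; that is not checked here.
    """
    changed = 0
    for decl in schema_root.iter(f"{{{XS}}}element"):
        if decl.get("name") != element_name:
            continue
        current = decl.get("maxOccurs", "1")  # XSD default when absent is 1
        if current == "unbounded" or int(current) > max_occurs:
            decl.set("maxOccurs", str(max_occurs))
            changed += 1
    return changed

root = ET.fromstring(GENERIC)
changed = cap_occurrences(root, "addr", 5)
addr = root.find(f".//{{{XS}}}element[@name='addr']")
print(changed, addr.get("maxOccurs"))
```

Because the bound is only tightened, the narrowed declaration accepts a subset of what the generic one accepts, which is the relationship the post is after.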
i.e. a document I create in accordance with my mini-schemas will
validate against the bigger one, but the bigger one can be used to create
documents that will not validate against my smaller one.
Does anyone know of any tools that can help to create more specific
schemas from massively generic ones?
I have done some Eclipse programming & if need be may create a schema
based on this xsd branch/project but I don't want to re-invent the wheel.
Any ideas for a tool that can be used to do this?
I'd like to be able to open up an existing schema & then have a new
"types" schema be created where instead of an "id" type which can be
anything, I can create a specific id type complete with type, length,
possibly a range of values, a pattern etc. so as to comply with what the
backend system will actually be expecting.
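As a rough illustration of the "specific id type" idea, a restricted simple type can be generated programmatically. The sketch below uses Python's standard library rather than the Eclipse XSD API; `restricted_simple_type`, the type name `PersonId` and the pattern are placeholders invented for the example and are not taken from HL7 or any NHS specification.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)  # serialize with the conventional xs: prefix

def restricted_simple_type(name, base, **facets):
    """Build an xs:simpleType that restricts `base` with the given facets.

    Keyword names are XSD facet names (length, maxLength, pattern,
    minInclusive, ...). No check is made that a facet suits the base type.
    """
    simple = ET.Element(f"{{{XS}}}simpleType", {"name": name})
    restriction = ET.SubElement(simple, f"{{{XS}}}restriction", {"base": base})
    for facet, value in facets.items():
        ET.SubElement(restriction, f"{{{XS}}}{facet}", {"value": str(value)})
    return simple

# Hypothetical example: a fixed-length, digits-only identifier.
person_id = restricted_simple_type(
    "PersonId", "xs:string", length=10, pattern="[0-9]{10}"
)
print(ET.tostring(person_id, encoding="unicode"))
```

Each kind of "id" (message id, person id and so on) would get its own call, and the resulting definitions could be collected into a separate types schema.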
TIA
Adam

Re: Any tools for reducing a schema? [message #38847 is a reply to message #38472]
Tue, 16 March 2004 07:14
Eclipse User
Originally posted by: adam.NOSPAMsoftfab.com
Hayden Marchant wrote:
> Adam
> I work at Unicorn Solutions and we recently completed a similar project in
> which we subsetted a large industry standard similar in size to HL7. I did
> this using the Eclipse XSD API (with a lot of help from Ed) which made the
> job of subsetting so easy.
> There were quite a few trials and tribulations on the way which I'd be
> happy to share with you. Please feel free to contact me at
> hayden.marchant@unicorn.com
> Thanks,
> Hayden
Thanks. The problem is, if anything, slightly worse than I first thought, in
that the Health people are using an automated tool which assumes that
there are a couple of basic schemas (a datatypes schema & a vocabulary one)
& that's it. As such, almost every message/xsd contains complex types that
duplicate those in other schemas, i.e. if the complextype isn't in the
datatypes schema then it gets put into the message schema regardless of
whether it's used in 1, 2 or 5 other schemas.
What this means is that if you include a couple of the message schemas in
say a WSDL, you get a huge raft of namespace collisions as you might have
the exact same "person" structure/complex type in 5 different message
schemas. i.e. what I need to be able to do is to distill out all the
"common" complextypes which aren't in the "datatypes" xsd. At the moment
I'm having to do this by hand which is (a) tedious beyond belief & (b)
error prone.
Arrrrggg....
i.e. I need to be able to say:
A) Run through this directory full of schemas, create a new schema (e.g.
called "Commontypes.xsd") & then
B) Dump the common complextypes into that schema &
C) Re-reference those types from the "local" (i.e. within the message)
version to the "new" (i.e. in Commontypes version) within each message.xsd
& then
D) Strip out the now de-referenced complextypes from those schemas.
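The detection half of steps A) and B) might look something like the sketch below. It is written in Python with the standard library only, not with the Eclipse XSD API, and all file and type names are invented. It compares definitions textually after parsing, so it assumes every schema binds the same prefixes (e.g. `xs:`) to the same namespaces. Re-referencing the types and stripping the local copies (steps C and D) are not shown.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

XS = "http://www.w3.org/2001/XMLSchema"

def fingerprint(elem):
    """Canonical, hashable form of a definition: tag, sorted attributes and
    the fingerprints of its children. Text and whitespace are ignored."""
    children = tuple(fingerprint(child) for child in elem)
    return (elem.tag, tuple(sorted(elem.attrib.items())), children)

def common_complex_types(schemas):
    """Return sorted (type name, [schema names]) pairs for top-level
    complexTypes whose definition is identical in two or more schemas."""
    seen = defaultdict(set)
    for schema_name, root in schemas.items():
        for ctype in root.findall(f"{{{XS}}}complexType"):
            seen[(ctype.get("name"), fingerprint(ctype))].add(schema_name)
    return sorted(
        (name, sorted(owners))
        for (name, _), owners in seen.items()
        if len(owners) > 1
    )

def schema(body):
    # Helper for the example only: wrap a fragment in an xs:schema element.
    return ET.fromstring(f'<xs:schema xmlns:xs="{XS}">{body}</xs:schema>')

PERSON = (
    '<xs:complexType name="Person"><xs:sequence>'
    '<xs:element name="id" type="xs:string"/>'
    '</xs:sequence></xs:complexType>'
)
ORDER = (
    '<xs:complexType name="Order">'
    '<xs:attribute name="code" type="xs:string"/>'
    '</xs:complexType>'
)

schemas = {
    "msgA.xsd": schema(PERSON + ORDER),
    "msgB.xsd": schema(PERSON),
    # Same name, different content: must not be treated as common.
    "msgC.xsd": schema(PERSON.replace("xs:string", "xs:token")),
}
common = common_complex_types(schemas)
print(common)
```

Types that share a name but differ in content are deliberately left alone, since moving them to a common schema would change what at least one message accepts.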
& then..... I would need to:
Have a "Simpletypes" schema consisting just of simple types (usually
restricted by content, e.g. length, pattern or enumeration) such that I
could go through the various complextypes setting their simpletypes to
these (e.g. one "id" might be a message id, the next might be a "person" id
& the next might be an "NHS_id", etc.).
I can happily construct the simpletypes by hand (& would do so anyway).
At the moment I'm having to do all the above by hand & after a while I go
"XML-blind" & I'm sure errors will creep in. Not to mention what happens
if they do an update of the schemas etc.etc.
Adam

Re: Any tools for reducing a schema? [message #38906 is a reply to message #38847]
Tue, 16 March 2004 07:20
Eclipse User
Originally posted by: merks.ca.ibm.com
Adam,
Wow, this is really a very involved problem. There are so many tricky things that
could go wrong. For example different schemas can have different settings for
attributeFormDefault and elementFormDefault that affect the nested
element/attribute content. Things like blockDefault and finalDefault could be set
to prevent defining derived types. Certainly moving components from one schema to
another will change their namespace, so the resulting schema won't be able to
accept instances valid according to the original schema.
This definitely sounds way too complicated to do correctly by hand. You really
should look at how to automate the steps using XSD.
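Some of the pitfalls listed above can be checked mechanically before any types are moved. The following is a minimal sketch in Python (standard library only, not the Eclipse XSD API) that reports schema-level settings which differ between files. The file names and the `urn:example:msg` namespace are invented, and values such as `blockDefault` are compared as plain strings, so the same tokens in a different order would be reported as a difference.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Schema-level attributes that affect nested declarations, with the value
# XSD 1.0 assumes when the attribute is absent.
SCHEMA_DEFAULTS = {
    "targetNamespace": None,
    "elementFormDefault": "unqualified",
    "attributeFormDefault": "unqualified",
    "blockDefault": "",
    "finalDefault": "",
}

def schema_settings(root):
    """Effective value of each schema-level setting for one xs:schema."""
    return {attr: root.get(attr, default) for attr, default in SCHEMA_DEFAULTS.items()}

def conflicting_settings(schemas):
    """Return {attribute: {schema name: value}} for every setting whose
    value differs between at least two of the given schemas."""
    per_schema = {name: schema_settings(root) for name, root in schemas.items()}
    conflicts = {}
    for attr in SCHEMA_DEFAULTS:
        values = {name: settings[attr] for name, settings in per_schema.items()}
        if len(set(values.values())) > 1:
            conflicts[attr] = values
    return conflicts

a = ET.fromstring(
    f'<xs:schema xmlns:xs="{XS}" targetNamespace="urn:example:msg" '
    f'elementFormDefault="qualified"/>'
)
b = ET.fromstring(f'<xs:schema xmlns:xs="{XS}" targetNamespace="urn:example:msg"/>')

conflicts = conflicting_settings({"msgA.xsd": a, "msgB.xsd": b})
print(conflicts)
```

An empty result does not mean a move is safe; it only rules out this one class of mismatch.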

Re: Any tools for reducing a schema? [message #583786 is a reply to message #37951]
Fri, 05 March 2004 11:40
Eclipse User
Adam,
There's no existing tool to do this, but you could definitely use XSD to
implement this type of thing.

Re: Any tools for reducing a schema? [message #584030 is a reply to message #37951]
Tue, 09 March 2004 09:28
Eclipse User
Adam
I work at Unicorn Solutions and we recently completed a similar project in
which we subsetted a large industry standard similar in size to HL7. I did
this using the Eclipse XSD API (with a lot of help from Ed) which made the
job of subsetting so easy.
There were quite a few trials and tribulations on the way which I'd be
happy to share with you. Please feel free to contact me at
hayden.marchant@unicorn.com
Thanks,
Hayden

Re: Any tools for reducing a schema? [message #584199 is a reply to message #37983]
Tue, 16 March 2004 07:15
Eclipse User
Ed Merks wrote:
> Adam,
> There's no existing tool to do this, but you could definitely use XSD to
> implement this type of thing.
Thanks. Could you look at my reply post to Hayden & possibly comment?
Adam