Namespace For DITA?

Eliot Kimber <ekimber@...>
 

I am writing a little DITA importer for my toy content management system XIRUSS-T (xiruss-t.sourceforge.net). XIRUSS only supports schemas (and even if I supported DTDs, I wouldn't use DTD external identifiers as document type identifiers, as a matter of principle).

My importer framework uses namespaces to associate XML documents with the appropriate processing, so to be recognized automatically as DITA, a document has to be in the DITA namespace. In addition, if a document references a schema, the importer will automatically import the schema document if it's not already in the repository.
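
A minimal sketch of the kind of namespace-based recognition described above, in Python rather than anything XIRUSS-T actually uses. The namespace URI, the importer names, and the helper functions are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI, invented for this example; not a normative DITA namespace.
DITA_NS = "http://example.org/ns/dita-test"


def root_namespace(xml_text):
    """Return the namespace URI of the document element, or None if it has none."""
    tag = ET.fromstring(xml_text).tag
    # ElementTree spells a namespace-qualified name as "{uri}local".
    if tag.startswith("{"):
        return tag[1:].split("}", 1)[0]
    return None


def pick_importer(xml_text, importers):
    """Choose an importer name from the root element's namespace (fallback: generic)."""
    return importers.get(root_namespace(xml_text), "generic-xml")


importers = {DITA_NS: "dita-importer"}
namespaced = '<topic xmlns="%s" id="t1"><title>T</title></topic>' % DITA_NS
plain = '<topic id="t1"><title>T</title></topic>'

print(pick_importer(namespaced, importers))
print(pick_importer(plain, importers))
```

The second document has the same element names but no namespace, so nothing in the document itself distinguishes it from any other vocabulary that happens to use a "topic" element.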

However, the schemas provided in the IBM DITA 1.3 distribution and the initial OASIS submission don't declare a target namespace.

Is there a normative namespace for IBM DITA? I know that we have not defined one yet for OASIS DITA.

If there's not one then I'll just make one up for my testing purposes.

Thanks,

Eliot
--
W. Eliot Kimber
Professional Services
Innodata Isogen
9030 Research Blvd, #410
Austin, TX 78758
(512) 372-8122

eliot@...
www.innodata-isogen.com

Don Day <dond@...>
 

Eliot Kimber <ekimber@...> wrote on 06/22/2004 03:30:13 PM:

Is there a normative namespace for IBM DITA? I know that we have not
defined one yet for OASIS DITA.

If there's not one then I'll just make one up for my testing purposes.

There has been no formal work to define one for IBM DITA, Eliot, in part
because specialization tends to put off the need for it except for cases
such as yours. So, go for it... just document somewhere that whatever you
end up with is just a working version, not a formally declared one.

Would you mind pointing the OASIS TC to info on how to formally define
and/or register a namespace for a given DTD? Obviously the TC needs to
wrestle with how a namespace would behave in a specialized document, how a
topic with namespaced content would generalize, etc. It seems to me to be
a "one DTD, one namespace" framework for now, which doesn't look all that
useful for DITA content, although DITA topics could certainly include
standard namespaced fragments such as SVG or MathML.

Regards,
--
Don Day <dond@...>
Chair, OASIS DITA Technical Committee
IBM Lead DITA Architect
11501 Burnet Rd., MS 9037D018, Austin TX 78758
Ph. 512-838-8550 (T/L 678-8550)

"Where is the wisdom we have lost in knowledge?
Where is the knowledge we have lost in information?"
--T.S. Eliot

W. Eliot Kimber <ekimber@...>
 

--- In dita-users@..., Don Day <dond@u...> wrote:

Eliot Kimber <ekimber@i...> wrote on 06/22/2004 03:30:13 PM:

Is there a normative namespace for IBM DITA? I know that we have not
defined one yet for OASIS DITA.

If there's not one then I'll just make one up for my testing purposes.

There has been no formal work to define one for IBM DITA, Eliot, in part
because specialization tends to put off the need for it except for cases
such as yours. So, go for it... just document somewhere that whatever you
end up with is just a working version, not a formally declared one.

Would you mind pointing the OASIS TC to info on how to formally define
and/or register a namespace for a given DTD?

A namespace is just a URI, so it would be up to OASIS to define its
rules for constructing URIs, presumably with the oasis-open.org
domain. The namespace would be "registered" by publishing a
specification that says something like "the namespace 'blah' is the
namespace for the vocabulary 'blarg' as defined in this
specification", as well as publishing a normative XSD schema that
specifies a targetNamespace="blah" attribute.

That is, as far as I know, there is no registrar of namespaces, nor
does there need to be one, since the name space of namespace names is
already a managed name space (namely, the name space of URIs). As with
SGML public IDs, it's up to the owner of the domain to manage how
URIs within that domain are associated with tag vocabularies and
document types.

Note that the W3C namespace spec itself says nothing about how
namespaces might be formally associated with document types--any such
association must be from the direction of the document type itself.
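
As a rough illustration of the schema side of that association, here is a small Python sketch that reads the targetNamespace a schema document declares. The two schema strings are invented for the example, and 'blah' is the placeholder name from the text above, not a real namespace.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"


def target_namespace(xsd_text):
    """Return the targetNamespace declared by an XSD document, or None if absent."""
    root = ET.fromstring(xsd_text)
    if root.tag != "{%s}schema" % XSD_NS:
        raise ValueError("not an XSD schema document")
    return root.get("targetNamespace")


# Illustrative schema documents only.
with_ns = '<xs:schema xmlns:xs="%s" targetNamespace="blah"/>' % XSD_NS
no_ns = '<xs:schema xmlns:xs="%s"/>' % XSD_NS

print(target_namespace(with_ns))
print(target_namespace(no_ns))
```

A schema without the attribute yields None here, which corresponds to the "no namespace" case: the schema says nothing about which namespace it governs.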

Obviously the TC needs to wrestle with how a namespace would behave in a
specialized document, how a topic with namespaced content would
generalize, etc. It seems to me to be a "one DTD, one namespace"
framework for now, which doesn't look all that useful for DITA content,
although DITA topics could certainly include standard namespaced
fragments such as SVG or MathML.

It's useful in the sense that we have to have a way to formally and
unambiguously identify a given document as being a DITA document, and
namespaces clearly mapped to schemas (and by extension the abstract
document types they reflect) are the only standard-defined mechanism
we have in the XML world (it could also be done with HyTime
architectures, but I'm presuming that HyTime is essentially a dead
standard in this context).

Whether DITA needs to define a single namespace or
several is definitely an open question that will have to be taken up
by the DITA TC, at least in the DITA 1.1 time frame, if not before.
(It is probably sufficient to define a single namespace for 1.0, as
that reflects what the current DTDs do and doesn't require any change
to the current schemas beyond adding the targetNamespace= attribute
to each schema component involved.)

This is probably not a question to discuss in this forum, which is
for users.

Cheers,

Eliot
Innodata Isogen

W. Eliot Kimber <ekimber@...>
 

--- In dita-users@..., Eliot Kimber <ekimber@i...> wrote:

However, the schemas provided in the IBM DITA 1.3 distribution and the
initial OASIS submission don't declare a target namespace.

Is there a normative namespace for IBM DITA? I know that we have not
defined one yet for OASIS DITA.

I subsequently realized that I had also not properly appreciated
the "no namespace" feature of XSD schemas, which the current DITA
schemas use.

However, for my purposes (building an automatic DITA importer within
a content management system) that is not particularly useful, since
without a namespace I have no way to know that a given XSD schema
instance is in fact a DITA schema, and therefore no way to
know that a given document is in fact a DITA document, without the
importing human providing some out-of-band indication that it
is one (and even then I wouldn't be able
to check their assertion).

Therefore, for my testing purposes I just made up a namespace as Don
suggested.

But as a matter of principle, I suggest that nobody should ever
create a schema for use with more than one document instance that
does not also declare the namespace it governs.

Cheers,

Eliot
Innodata Isogen

Tsao, Scott <scott.tsao@...>
 

--- In dita-users@..., "W. Eliot Kimber" <ekimber@i...> wrote:

A namespace is just a URI, so it would be up to OASIS to define its
rules for constructing URIs, presumably with the oasis-open.org
domain. The namespace would be "registered" by publishing a
specification that says something like "the namespace 'blah' is the
namespace for the vocabulary 'blarg' as defined in this
specification", as well as publishing a normative XSD schema that
specifies a targetNamespace="blah" attribute.

Perhaps a good example is the target namespaces for UBL, see:

http://xml.coverpages.org/ni2004-04-30-b.html#codeListSchemas

e.g.,

targetNamespace="urn:oasis:names:tc:ubl:codelist:AcknowledgementResponseCode:1:0"

Note that they chose to use URN (instead of URL) for the namespace names.

Regards,

Scott

Eric Sirois
 


From: "W. Eliot Kimber" <ekimber@...>
Date: 2004/06/23 Wed PM 04:10:20 EDT
To: dita-users@...
Subject: [dita-users] Re: Namespace For DITA?

Hi Eliot,

No rework of the schema definitions (*.mod) is needed to support namespaces. We would just define the namespace and targetNamespace in the head schemas (ditabase.xsd, topic.xsd, etc.) to support namespaces. Because the XML Schemas depend heavily on substitution groups, we can't add namespaces in the *.mod files: substitution groups only work when the elements are in the null namespace or the same namespace.

By simply defining a namespace in the head schema, the module would assume that namespace (chameleon effect).
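
A rough Python sketch of the chameleon effect described above. It does no real schema processing; it only applies the xs:include rule that an included schema with no targetNamespace takes on the namespace of the schema that includes it. The namespace URI, the module name topic.mod, and the schema text are illustrative assumptions, not the actual DITA schemas.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Illustrative head schema: declares a (made-up) namespace and includes a module.
HEAD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://example.org/ns/dita-test">
  <xs:include schemaLocation="topic.mod"/>
</xs:schema>"""

# Illustrative module: no targetNamespace at all.
MODULE = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="topic"/>
</xs:schema>"""


def included_locations(head_text):
    """List the schemaLocation values of the xs:include children of a schema."""
    root = ET.fromstring(head_text)
    return [inc.get("schemaLocation") for inc in root.findall(XS + "include")]


def effective_namespace(head_text, module_text):
    """Namespace the module's declarations end up in when included by the head."""
    head_ns = ET.fromstring(head_text).get("targetNamespace")
    module_ns = ET.fromstring(module_text).get("targetNamespace")
    if module_ns is None:
        return head_ns  # chameleon include: the module adopts the head's namespace
    if module_ns != head_ns:
        raise ValueError("xs:include needs the same targetNamespace or none")
    return module_ns


print(included_locations(HEAD))
print(effective_namespace(HEAD, MODULE))
```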

The two main reasons for *not* including namespaces in the OASIS submission are:
1) XSLT tools - The public package of DITA at developerWorks includes a number of XSL stylesheets to process DITA documents into HTML and PDF files. If the XML Schemas had included a namespace, all of those tools would have been useless. A complete duplicate, namespace-aware set of tools would have had to be created and maintained for the XML Schemas.

2) Backwards compatibility - Rather than modifying all instance documents that currently use DTDs for validation, JAXP (Java API for XML Processing) provides functionality to programmatically ignore the DOCTYPE declaration and validate against a schema. If a schema had a namespace, then you could always turn off namespace awareness on the parser and get the same results, but you would also lose any information in the xml namespace (xml:lang, xml:space, or xml:base).

Better support for local name template matching is supposed to be part of XSLT 2.0.
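
To show what matching on local names buys you, here is a small sketch in Python (not XSLT) that finds elements by local name whether or not the document uses a namespace. The namespace URI, the sample documents, and the helper names are invented for the example.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any "{uri}" prefix that ElementTree puts on a qualified tag name."""
    return tag.rsplit("}", 1)[-1]


def texts_by_local_name(xml_text, name):
    """Collect the text of every element whose local name matches, in any namespace."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iter() if local_name(el.tag) == name]


plain = "<topic><title>Hello</title></topic>"
# Hypothetical namespace URI, for illustration only.
namespaced = '<topic xmlns="http://example.org/ns/dita-test"><title>Hello</title></topic>'

print(texts_by_local_name(plain, "title"))
print(texts_by_local_name(namespaced, "title"))
```

Processing written this way treats both documents alike, which is the property the existing non-namespace-aware stylesheets would need in order to keep working.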

General XML Editor woes
Stylus Studio (SS) products use Xerces as their main XML parser. In SS v5.1 you can specify other parsers to validate XML Schemas/DTDs.

The errors reported would be like this:
[Error] task.mod:157:37: rcase-Recurse.2: There is not a complete functional mapping between the particles.
[Error] task.mod:157:37: derivation-ok-restriction.5.3.2: Error for type 'step.class'. The particle of the type is not a valid restriction of the particle of the base.

[Error] task.mod:96:41: rcase-MapAndSum.1: There is not a complete functional mapping between the particles.
[Error] task.mod:96:41: derivation-ok-restriction.5.3.2: Error for type 'taskbody.class'. The particle of the type is not a valid restriction of the particle of the base.

This is documented in the developerWorks package. In general, parsers will have problems with the schemas. It's not a problem with the schemas per se, but rather with the algorithms defined in the XML Schema (XS) 1.0 spec. We are working with our XML Schema WG rep (IBM) on getting this 'fix' into the XS 1.1 spec.

There is a way to fix the schemas to avoid these errors, but that would mean not being able to take advantage of the XS inheritance architecture. This will affect any application using Xerces or MSXML v4.0 as its parser. Some applications that use their own proprietary parser, such as XMLSpy and Syntext Serna, will not experience these errors. The errors affect validation of the schema, not validation of the instance documents. With Xerces, you can validate the instance document without having to validate the schemas. Unfortunately, this is not true for the MSXML parser.

Here is a snippet of an explanation I got from Henry Thompson (from XML Schema WG) regarding the content model errors.

<snip>
I think I now see your problem.

As written, the W3C XML Schema REC tries to define _syntactic_
properties of content models which guarantee a _validation_ semantics
of type subsumption, that is, members of a restricted type must all be
members of its base type. To do this, it basically insists (in
section 3.9.6 [1]) that it be possible to set the components of the
restricted model in a one-to-one correspondence with those in the base
model, with gaps allowed a) where there's a choice and b) where
there's optionality. What it _doesn't_ do is collapse real choices.

Your problem is that if you expand out all the group references,
you've got, in DTD notation

((a|b)|e|(c|d))* as the base,
and
(a|b)* as the restriction

(ignoring the substitution group stuff).

Although it's true that every member of the restriction will be a
member of the base, the rules in REC don't have enough sophistication
to figure that out -- the rules will try to match the restriction's
*ed choice with two branches against the base's one with three
branches, but it's not smart enough to use the _same_ branch from the
base both times, which is what it needs to do to win.

We hope to improve the REC in this area at the next revision, I'm
sorry it's bitten you here.

ht
</snip>
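
The claim in the snippet that every member of the restriction is also a member of the base can be spot-checked by brute force. The sketch below is only a bounded check, not a proof and not the REC's particle-matching algorithm: it treats each element name as a single character, writes the two content models as regular expressions, and tests every sequence up to a small length. The function name and the length bound are arbitrary choices for the example.

```python
import itertools
import re

# The two content models from the snippet, with each element as one character.
BASE = re.compile(r"((a|b)|e|(c|d))*")
RESTRICTION = re.compile(r"(a|b)*")


def restriction_is_subset(max_len=4, alphabet="abcde"):
    """True if every sequence up to max_len accepted by RESTRICTION is accepted by BASE."""
    for n in range(max_len + 1):
        for symbols in itertools.product(alphabet, repeat=n):
            seq = "".join(symbols)
            if RESTRICTION.fullmatch(seq) and not BASE.fullmatch(seq):
                return False
    return True


print(restriction_is_subset())
```

The check passes because the test works on what each model accepts, whereas the rules in the REC compare the structure of the two models particle by particle, which is where the reported errors come from.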

I hope this helps.

Kind regards,
Eric
IBM Canada Ltd.

W. Eliot Kimber <ekimber@...>
 

--- In dita-users@..., <easirois@r...> wrote:

From: "W. Eliot Kimber" <ekimber@i...>
Date: 2004/06/23 Wed PM 04:10:20 EDT
To: dita-users@...
Subject: [dita-users] Re: Namespace For DITA?

Hi Eliot,

No rework of the schema definitions (*.mod) is needed to support
namespaces. We would just define the namespace and targetNamespace in
the head schemas (ditabase.xsd, topic.xsd, etc.) to support
namespaces. Because the XML Schemas depend heavily on substitution
groups, we can't add namespaces in the *.mod files: substitution
groups only work when the elements are in the null namespace or the
same namespace.

By simply defining a namespace in the head schema, the module would
assume that namespace (chameleon effect).

I didn't realize this would work--I was probably thrown off by the
bogus error reports I got from Stylus Studio. This would certainly
simplify things.

The two main reasons for *not* including namespaces in the OASIS
submission are:
1) XSLT tools - The public package of DITA at developerWorks
includes a number of XSL stylesheets to process DITA documents into
HTML and PDF files. If the XML Schemas had included a namespace, all
of those tools would have been useless. A complete duplicate,
namespace-aware set of tools would have had to be created and
maintained for the XML Schemas.

This discussion should really be moved to the DITA TC list, but I
imagine that we will need to issue a new set of tools for OASIS DITA
1.0 in any case, so namespace awareness could be added to that
code as part of its normal development.

2) Backwards compatibility - Rather than modifying all instance
documents that currently use DTDs for validation, JAXP
(Java API for XML Processing) provides functionality to
programmatically ignore the DOCTYPE declaration and validate against a
schema. If a schema had a namespace, then you could always turn off
namespace awareness on the parser and get the same results, but you
would also lose any information in the xml namespace (xml:lang,
xml:space, or xml:base).

Again, I don't personally think that backward compatibility with the
IBM DITA distribution is a requirement--obviously we'd want to be as
similar as possible, but at least for this issue I would argue that
the value of having a namespace for DITA would far outweigh any
compatibility concerns. For any existing documents that currently use
DTDs that you wanted to move to schemas you'd have to at least add a
targetNamespace= attribute anyway, so also adding a namespace
declaration wouldn't really add any cost. If DITA uses exactly one
namespace then you can just define the DITA namespace as the root
namespace and none of the other tags in instance documents have to be
changed (assuming this was the only change from the IBM distribution).
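
A quick Python sketch of that last point: a single default namespace declaration on the root element puts every unprefixed element in the document into that namespace, with no change to the other tags. The namespace URI and the sample topic are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI, invented for illustration; not a normative DITA namespace.
DITA_NS = "http://example.org/ns/dita-test"

# Only the root start tag differs from a no-namespace version of the same document.
doc = (
    '<topic xmlns="%s" id="t1">'
    "<title>Hello</title><body><p>Text</p></body>"
    "</topic>" % DITA_NS
)

root = ET.fromstring(doc)
prefix = "{%s}" % DITA_NS

# Every element inherits the default namespace declared on the root.
tags = [el.tag for el in root.iter()]
all_in_namespace = all(tag.startswith(prefix) for tag in tags)

print(all_in_namespace)
print(root.get("id"))
```

Unprefixed attributes such as id are not affected by the default namespace, so attribute lookups by plain name keep working.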

Cheers,

Eliot
Innodata Isogen