XML and Databases

Ronald Bourret

Namespace Myths Exploded

Copyright 2000, by Ronald Bourret

The XML namespaces recommendation is tantalizingly vague about, or omits altogether, a number of apparently important points. In practice, this is not a problem -- the points are not actually important and the recommendation does what it was designed to do: provide a two-part naming system for element types and attributes. Thus, as long as you don't look too deeply, XML namespaces do their job and do it reasonably well.

Of course, many people have looked too deeply. Programmers are curious people and the close link between traditional namespaces and identifiers, as well as the perceived link between XML namespaces and schemas, has naturally invited closer inspection. So too has the fact that the XML namespaces recommendation introduces a new naming system, but does not discuss validation or how to declare XML namespaces in DTDs. Equally inviting is the fact that it discusses the structure of XML namespaces, but only in a non-normative appendix. The result has been confusion and controversy.

This article discusses a number of myths that have arisen around XML namespaces, examining possible sources, clarifying what the recommendation says about them, and pointing out ways to resolve the issues they raise. It is hoped that this will help clear up some of the confusion about XML namespaces and reinforce the point that most of that confusion revolves around things not required to use XML namespaces as they were designed -- that is, as a two-part naming system for element types and attributes.

Note: The discussions of these myths often refer to the "traditional namespaces used by an XML document." These are the traditional namespaces that hold element type and attribute names. There is one traditional namespace for element type names and, for each element type, one traditional namespace for attribute names.

Myth #1: XML namespaces exist

Preposterous as this myth might sound, it is at the root of much of the confusion about XML namespaces. To make it a little easier to understand, we can restate it more lucidly as, "Myth #1: The XML namespaces recommendation requires an XML namespace to exist as a physical or conceptual entity."

At a superficial level, this myth is true. In its definition of XML namespaces, the recommendation states the following: "An XML namespace is a collection of names, identified by a URI reference [RFC2396], which are used in XML documents as element type and attribute names. XML namespaces differ from the 'namespaces' conventionally used in computing disciplines in that the XML version has internal structure and is not, mathematically speaking, a set." Thus, an XML namespace exists at least at the conceptual level.

At a deeper level, the myth really is a myth, the above statement notwithstanding. Except for this statement, there is nothing in the recommendation that requires an XML namespace to exist as a physical or conceptual entity. What the recommendation requires is the existence of a qualifier -- the XML namespace URI -- that, in combination with an element type or attribute name, creates a universal (and universally unique) name. An XML namespace is simply a convenient (and unnecessary) stop on the road from prefixed name to universal name. Thus, while many readers spend their time looking for clues about XML namespaces -- their structure, what names are in them, how to instantiate them, and so on -- the recommendation spends its time discussing XML namespace URIs -- how to declare them, how to associate them with element type and attribute names, and how to construct universal names with them. Small wonder that so many people are confused.

If you think in terms of universal names rather than in terms of XML namespaces, an XML namespace is nothing more than a collection of all the local parts of universal names that share the same XML namespace URI. Although such a collection is a short and logical step from universal names, it is a step in the wrong direction, as it points you down a road you never need to travel. Thus, it is better to ignore the definition (not to mention existence) of an XML namespace as a physical or conceptual entity and instead think about XML namespace URIs and universal names. In other words, to think of the XML namespaces recommendation not as "Namespaces in XML," but as "Universal Names in XML."
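Thinking in universal names can be sketched in a few lines of Java. This is only an illustration, not something the recommendation prescribes: the class and method names are hypothetical, and the {URI}local notation is the one this article uses. The sketch relies on the standard javax.xml.namespace.QName class, whose equality considers only the namespace URI and the local part.

```java
import javax.xml.namespace.QName;

// Hypothetical helper class; the names are illustrative only.
final class UniversalNameDemo {

    // Writes a universal name in the {URI}local notation used in this article.
    // A name that is not mapped to any namespace URI is written as the bare local name.
    static String universalName(String namespaceUri, String localName) {
        if (namespaceUri == null || namespaceUri.isEmpty()) {
            return localName;
        }
        return "{" + namespaceUri + "}" + localName;
    }

    // Two qualified names with different prefixes denote the same universal name
    // when their namespace URIs and local parts match; QName.equals ignores the prefix.
    static boolean equalDespitePrefixes(String namespaceUri, String localName,
                                        String prefixA, String prefixB) {
        QName a = new QName(namespaceUri, localName, prefixA);
        QName b = new QName(namespaceUri, localName, prefixB);
        return a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println(universalName("http://www.foo.org/", "A"));
        System.out.println(equalDespitePrefixes("http://www.foo.org/", "A", "foo", "bar"));
    }
}
```

Nothing in the sketch needs a namespace object: the URI is used purely as a qualifier, which is all the recommendation asks of it.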

Myth #2: Universal names uniquely identify element types and attributes.

OK, this myth is a bit of a cheap shot, but it points out one of the reasons people want to know about the structure of an XML namespace, as well as reinforcing the potential ambiguities of invalid documents.

When a traditional namespace provides names to identify a set of objects, it usually provides names only to that set of objects and not to any other sets of objects. As a result, given a traditional namespace and a name from that namespace, it is usually possible to uniquely identify an object. This is not the case with XML namespaces, which provide names to both element types and attributes. That is, given the URI of an XML namespace and a name from that namespace -- for example, {http://www.foo.org}bar -- it is not possible to say whether the name identifies an element type or an attribute. This has led to much speculation about the structure of XML namespaces (see myth #6), since separating the names inside an XML namespace into separate sets of element type names and attribute names is one way to solve this problem. In practice, the problem is usually solved by the application, which knows whether it is looking at an element type or attribute name.

Technically, even the knowledge that a universal name applies to an element type or attribute is not quite enough for that name to uniquely identify a specific element type or attribute, even at the syntactic level. For example, consider the following document:


   <?xml version="1.0" ?>
   <A xmlns="http://www.foo.org/">
      <A>abcd</A>
   </A>

This document contains two elements with the universal name {http://www.foo.org/}A. Do these share a single element type or do they have two different element types? This ambiguity is nothing new, as it potentially exists for any document lacking a DTD. Fortunately, the ambiguity can be resolved through validation, just as it can when XML namespaces are not used. However, to guarantee that validation does resolve this ambiguity, the document must be constructed according to the following guidelines:

  • Declare all xmlns attributes in the DTD.

  • Use the same qualified names in the DTD and the body of the document.

  • Use one prefix per XML namespace.

  • Do not use the same prefix for more than one XML namespace.

  • Use at most one default XML namespace.

The first two guidelines are necessary to ensure that all names used in the body of the document are declared in the DTD. The latter three guidelines guarantee that a universal name always has the same qualified name, meaning that qualified names in the DTD are equivalent to universal names.

Myth #3: An XML namespace is an object; it has an interface

Because an XML namespace is "a collection of names," has "internal structure," and "is not, mathematically speaking, a set," there is a reasonable expectation that it has some sort of solidity -- that like a DOM tree, you can instantiate it, ask it questions, and pass it to other applications. In other words, that it is an object (or collection of objects) and has an interface.

This would be a useful thing. It would mean that a few vendors could implement XML namespace objects and the rest of us could use them. After instantiating the object and initializing it with our own names, we could ask useful name-related questions, such as, "Does the element type name foo already exist in this namespace? If not, please add it," and "What integer is associated with the element type name bar? I'd like to use it to switch to the code to process a bar element," and "Could you give me the definition of a foobar attribute?"

However, this is not the case. Like the XML 1.0 recommendation, the XML namespaces recommendation specifies only a syntax. Any "XML namespace objects" would have to be layered on top of this syntax, just as the DOM is layered on top of the XML 1.0 syntax. Thus, this myth really is a myth: an XML namespace is not an object and it does not have an interface.

Myth #4: Unprefixed attribute names are in XML namespaces

There are two probable origins of this myth. The first is a simple wish for completeness -- that is, that an XML namespace should include all related element type and attribute names. The second is the idea that an XML namespace is an object (it is not -- see myth #3), which raises the question, "What are the member elements of the object? Do they include unprefixed attribute names?"

In fact, the XML namespaces recommendation never resolves this myth because it never formally defines membership in an XML namespace. Instead, it defines a mapping from certain names in an XML document to an XML namespace and then states how to construct universal names from the mapped names. What is clear is that no mapping is defined for unprefixed attribute names -- this is the meaning of the statement, "...default namespaces do not apply directly to attributes..." -- and therefore that there is no way to construct universal names from unprefixed attribute names. The importance of this is that applications: (a) must not attempt to map unprefixed attribute names to the default XML namespace, and (b) must identify attributes with unprefixed names through their associated element type, just as they do in the absence of XML namespaces.
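The behavior described in (a) can be observed with a namespace-aware DOM parser. The following is a minimal sketch, assuming the JAXP parser bundled with the JDK; the class name, the method names, and the one-line sample document are hypothetical and not taken from the recommendation.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class; names are illustrative only.
final class UnprefixedAttributeDemo {

    // Sample document: a default namespace declaration plus an unprefixed attribute.
    static final String EXAMPLE = "<A xmlns=\"http://www.foo.org/\" b=\"1\"/>";

    // Parses the document with namespace processing switched on and returns its root element.
    static Element parseRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse example document", e);
        }
    }

    // Namespace URI the parser assigns to the root element's name.
    static String elementNamespace(String xml) {
        return parseRoot(xml).getNamespaceURI();
    }

    // Namespace URI the parser assigns to the named attribute of the root element.
    static String attributeNamespace(String xml, String attributeName) {
        Attr attr = parseRoot(xml).getAttributeNode(attributeName);
        if (attr == null) {
            throw new IllegalArgumentException("no such attribute: " + attributeName);
        }
        return attr.getNamespaceURI();
    }

    public static void main(String[] args) {
        System.out.println(elementNamespace(EXAMPLE));        // the default namespace URI
        System.out.println(attributeNamespace(EXAMPLE, "b")); // null: no mapping is defined
    }
}
```

The element type name A picks up the default XML namespace, while the attribute b is reported with no namespace URI at all and can only be identified through its element type.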

From a purely philosophical standpoint, it can be argued that unprefixed attribute names are not "in" an XML namespace. This is because, unlike prefixed names and unprefixed element type names, unprefixed attribute names are never mapped to an XML namespace and therefore are not members. It is also possible to argue the converse. The first argument is the non-normative Appendix A, which explicitly states that unprefixed attribute names are "in" an XML namespace. The second argument is the use of the word "directly" in the above quote, which implies that the default XML namespace applies indirectly, such as through unprefixed element type names. Fortunately, the resolution of this myth is unimportant, as it does not affect how documents that use XML namespaces are written or how namespace-aware applications process unprefixed attribute names.

Note that elsewhere the Namespaces FAQ states that unprefixed attribute names are not in any XML namespace. This is done for simplicity, as the presence or absence of XML namespaces in a document does not affect how unprefixed attribute names are written or processed.

Myth #5: The XML namespaces recommendation introduces "global attributes"

The XML namespaces recommendation discusses global attributes in Appendix A. This appendix is non-normative and global attributes are never mentioned in a normative part of the recommendation. Thus, the recommendation never formally introduces or defines the concept of global attributes. However, the concept of global attributes is necessary to understand one of the reasons why attribute names can be prefixed and thus mapped to universal names.

"Global attributes" are an idea that pre-dates the XML namespaces recommendation and has never been clearly defined. One variation is that they are simply a declarative shorthand -- that is, you declare a global attribute in a DTD (there is no syntax to actually do this) or schema and it applies to all element types in that DTD or schema. This would be useful, for example, if you wanted all element types to have a certain attribute.

The variation in Appendix A appears to be more flexible, allowing global attributes to "float" from one DTD or schema to another. For example, they might be defined in one schema and used on an element type in another schema. The XML namespaces recommendation specifically supports such attributes through qualified attribute names, which resolve to universal names. This allows these attributes to be recognized wherever they appear.

An example of this kind of global attribute is the version attribute in XSL Transformations (XSLT). XSLT supports a simplified syntax for its style sheets in which many of the XSLT elements are eliminated. The root element of such style sheets, which is not defined in XSLT, must use XSLT's version attribute to declare which version of XSLT it is using. In the following, xslt:version is a global attribute whose name belongs to the http://www.w3.org/1999/XSL/Transform namespace. This is different from the version attribute of the bar element type.

   <foo:bar xmlns:foo="http://www.foo.org/"
            xmlns:xslt="http://www.w3.org/1999/XSL/Transform"
            xslt:version="1.0"
            version="2.x">
      ...
   </foo:bar>
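A namespace-aware application can tell the two version attributes in this example apart by universal name. The sketch below assumes the DOM parser bundled with the JDK and uses an empty-element form of the example; the class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class; names are illustrative only.
final class GlobalAttributeDemo {

    static final String XSLT_NS = "http://www.w3.org/1999/XSL/Transform";

    // The example element from the text, written as an empty element.
    static final String EXAMPLE =
            "<foo:bar xmlns:foo=\"http://www.foo.org/\""
          + " xmlns:xslt=\"" + XSLT_NS + "\""
          + " xslt:version=\"1.0\""
          + " version=\"2.x\"/>";

    // The same element with a different prefix bound to the XSLT namespace URI.
    static final String OTHER_PREFIX =
            "<foo:bar xmlns:foo=\"http://www.foo.org/\""
          + " xmlns:t=\"" + XSLT_NS + "\""
          + " t:version=\"1.0\""
          + " version=\"2.x\"/>";

    static Element parseRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse example document", e);
        }
    }

    // Looks up the global attribute by universal name: XSLT namespace URI plus local name.
    static String globalVersion(String xml) {
        return parseRoot(xml).getAttributeNS(XSLT_NS, "version");
    }

    // Looks up the unprefixed attribute, which has no namespace URI.
    static String localVersion(String xml) {
        return parseRoot(xml).getAttributeNS(null, "version");
    }

    public static void main(String[] args) {
        System.out.println(globalVersion(EXAMPLE));      // 1.0
        System.out.println(localVersion(EXAMPLE));       // 2.x
        System.out.println(globalVersion(OTHER_PREFIX)); // 1.0
    }
}
```

Because the lookup uses the namespace URI rather than the prefix, the global attribute is recognized whichever prefix the document author happens to choose.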

Although it is never explicitly stated in Appendix A, global attributes should always have the same type and semantics, regardless of the element type on which they occur. This is a restriction relative to the XML 1.0 recommendation, which allows attributes on two different element types to have the same name but different types and semantics. However, it is the only way for global attributes to fulfill the stated motivation of XML namespaces as a way to support and reuse modular software. If an attribute identified by a universal name has a different type and semantics on each element type on which it occurs, it is impossible to write modular, reusable software to process it.

Myth #6: An XML namespace has an internal structure

There are a number of origins of this myth:

  • The definition of an XML namespace states that it is a "collection of names" that "has internal structure."

  • Appendix A, which is non-normative, describes a possible structure for XML namespaces.

  • The idea that an XML namespace is an object (it is not -- see myth #3) leads directly to questions about its member elements, which are organized within its structure.

  • Structure is one way to resolve the problem that a universal name does not identify a particular element type or attribute unless it is known beforehand whether the name applies to an element type or attribute (see myth #2).

  • Validity imposes a set of traditional namespaces on an XML document. It is therefore tempting to think that the structure of an XML namespace will mimic or approximate this structure.

As was shown in myth #1, there is no reason for XML namespaces (as opposed to XML namespace URIs) to even exist. Thus, their structure is unimportant -- it is never used in practice -- and it is unfortunate that it is ever discussed. Nevertheless, it is interesting to explore the possible structures of an XML namespace, if only to satisfy the curiosity needlessly raised by the recommendation.

In discussing possible structures, it is necessary to note that:

  • The structure proposed in Appendix A is non-normative and therefore only one possible structure, not the structure.

  • Inherent in the notion of structure is the idea that an XML namespace has certain name-related capabilities, such as the ability to determine whether a particular name applies to an element type or attribute. For example, keeping element type names separate from attribute names implies the ability to distinguish between these. Except for specifying that a particular element type or attribute name is in a particular XML namespace, the XML namespaces recommendation does not require or specify such capabilities.

  • The names stored in an XML namespace are local names. Attempts to circumvent the need for structure by storing other forms of names, such as universal names, are simply adding structure by other means.

And now, on with the show. Here are five possible internal structures for an XML namespace:

  1. A set. Although the normative part of the XML namespaces recommendation states that an XML namespace "is a collection of names" and "is not, mathematically speaking, a set," there is nothing other than the latter statement that prevents it from being a set. In particular, the recommendation describes how to map names in an XML document to XML namespaces -- in other words, to specify their membership in an XML namespace. Because the recommendation does not specify any other behaviors, a set is sufficient to represent an XML namespace. There are two obvious drawbacks to this structure. First, it does not provide enough information to support most name-related capabilities, including determining whether a name applies to an element type, an attribute, or both. Second, it cannot contain unprefixed attribute names (see myth #4), as these cannot be distinguished from the local part of universal names.

  2. Two traditional namespaces -- one for element type names and one for prefixed attribute names. In some senses, this is the minimal "intelligent" structure of an XML namespace, although the lack of unprefixed attribute names means that the namespace lacks a number of interesting name-related capabilities.

  3. The structure in Appendix A. This structure consists of one traditional namespace for element type names, one traditional namespace for global (prefixed) attribute names (see myth #5), and, for each element type, one traditional namespace for unprefixed attribute names. It provides the most complete name-related capabilities and (not surprisingly), except for the traditional namespace for global attributes, matches the structure of traditional namespaces imposed on an XML document by validity.

  4. Similar to (3), but without special treatment of global attributes. Thus, rather than having a single traditional namespace for global attribute names, each element type has two traditional namespaces for attribute names: one for prefixed names and one for unprefixed names. This structure assumes that the type and semantics of global attributes can vary depending on the element type they are associated with. While this is allowable under the XML 1.0 recommendation, it defeats the purpose of being able to assign universal names to attributes (see myth #5) and is probably best ignored.

  5. Same as (4), but without per-element type traditional namespaces for unprefixed attribute names. This structure takes an even more pessimistic view than that in (4), assuming that unprefixed attribute names are not in an XML namespace (see myth #4). Like (4), it is best ignored.
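For the curious, structure (3) could be modeled roughly as follows. This is purely a hypothetical sketch: neither the recommendation nor Appendix A defines any such class or interface, and every name in it is invented for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the Appendix A structure; not defined by the recommendation.
final class AppendixANamespaceSketch {

    private final String uri;

    // One traditional namespace for element type names.
    private final Set<String> elementTypeNames = new HashSet<>();

    // One traditional namespace for global (prefixed) attribute names.
    private final Set<String> globalAttributeNames = new HashSet<>();

    // For each element type, one traditional namespace for unprefixed attribute names.
    private final Map<String, Set<String>> perElementAttributeNames = new HashMap<>();

    AppendixANamespaceSketch(String uri) {
        this.uri = uri;
    }

    String uri() {
        return uri;
    }

    void addElementType(String localName) {
        elementTypeNames.add(localName);
    }

    void addGlobalAttribute(String localName) {
        globalAttributeNames.add(localName);
    }

    // Registers an unprefixed attribute name under its element type.
    void addElementAttribute(String elementType, String attributeName) {
        elementTypeNames.add(elementType);
        perElementAttributeNames
                .computeIfAbsent(elementType, key -> new HashSet<>())
                .add(attributeName);
    }

    boolean isElementType(String localName) {
        return elementTypeNames.contains(localName);
    }

    boolean isGlobalAttribute(String localName) {
        return globalAttributeNames.contains(localName);
    }

    boolean isAttributeOf(String elementType, String attributeName) {
        Set<String> names = perElementAttributeNames.get(elementType);
        return names != null && names.contains(attributeName);
    }

    public static void main(String[] args) {
        AppendixANamespaceSketch ns = new AppendixANamespaceSketch("http://www.foo.org/");
        ns.addElementType("bar");
        ns.addGlobalAttribute("version");
        ns.addElementAttribute("bar", "version");
        System.out.println(ns.isGlobalAttribute("version"));
        System.out.println(ns.isAttributeOf("bar", "version"));
    }
}
```

The same local name can appear in all three partitions without conflict, which is exactly the extra information a plain set of names cannot carry.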

As can be seen, the structure of an XML namespace has many possible forms, depending on what one is trying to achieve. What is important to remember is that all of these forms are irrelevant to the actual use of universal names, which are the sole purpose of XML namespaces.

Myth #7: There is a "null" XML namespace that contains all element type and attribute names not declared to be in a specific XML namespace

There are two probable origins of this myth. The first is again a wish for completeness -- that all element type and attribute names are in some XML namespace. The second is that some namespace-aware applications use a null or empty (zero-length) URI to process names that are not specifically declared to be in an XML namespace. For example:

   if (uri == null || uri.isEmpty())
   {
      // Process names not in any XML namespace
   }
   else if (uri.equals("http://www.foo.org/"))
   {
      // Process names in the http://www.foo.org/ namespace
   }
   else if (uri.equals("http://www.bar.org/"))
   {
      // Process names in the http://www.bar.org/ namespace
   }

In fact, a "null" XML namespace does not exist and the use of null or empty URIs in applications is an implementation detail. The reason is that, if such a namespace did exist, its names could be mapped to more than one element type or attribute, each with its own type and semantics. Such names are not universal and violate the requirement that "...document constructs [element types or attributes] should have universal names, whose scope extends beyond their containing document."

Myth #8: XML namespaces are strongly related to schemas

Because XML namespaces contain element type and attribute names and schemas contain element type and attribute definitions, there is a strong temptation to think that there is a special relation between the two. In fact, the relationship between the set of names in an XML namespace and the set of names in a schema cannot be predicted any more than it can for any two other sets picked at random. That is, all of the following are possible:

  • The names in the XML namespace are the same as the names in the schema. This is generally the case for schemas defined today.

  • The names in the XML namespace are a strict subset of the names in the schema. This will be a common case in the future, when schemas are built in modular fashion from other schemas. In this case, the schema draws its names from more than one XML namespace.

  • The names in the XML namespace intersect, but are not a subset of, the names in the schema. This will also be a common case in the future, in which only part of one schema is used to build another schema. Again, the schema draws its names from more than one XML namespace.

  • The names in the XML namespace do not intersect the names in the schema. This is a trivial case and is mentioned only for completeness.

Myth #9: The XML namespaces recommendation redefines validity

Because the XML namespaces recommendation defines how to construct universal element type and attribute names, and because validity has rules concerning names (no two element types may have the same name, no two attributes of the same element type may have the same name, and so on), it is reasonable to assume that the XML namespaces recommendation redefines validity in terms of universal names. For example, no two element types may have the same universal name and no two element types whose names are not in an XML namespace may have the same name; no two attributes of the same element type may have the same universal name or the same unprefixed name; and so on.

In fact, the XML namespaces recommendation does not mention validity at all. The reason for this is most likely that there is no way to declare an XML namespace in a DTD; hence, there is no way to discuss universal names in DTDs, a necessary prerequisite to redefining validity in terms of universal names.

Furthermore, it is possible to create documents using XML namespaces that many people would think are valid (and which are certainly valid in spirit) but which are not valid as validity is defined in the XML 1.0 recommendation. For example, the following is not valid because the root element type (A) does not match the name in the document type declaration (foo:A), in spite of the fact that both could reasonably be interpreted to be the same element type ({http://www.foo.org/}A).

   <!DOCTYPE foo:A [
      <!ELEMENT foo:A EMPTY>
      <!ATTLIST foo:A
                xmlns CDATA #FIXED "http://www.foo.org/"
                xmlns:foo CDATA #FIXED "http://www.foo.org/">
   ]>
   <A/>

Because of this, it is likely that at some point in the future validity constraints on element type and attribute names will be redefined in terms of universal names. (They are currently defined in terms of qualified names.) Whether and when this will happen is not known, although a reasonable time would be in conjunction with XML Schemas, as these circumvent the problems of declaring XML namespaces (and thus universal names) in a DTD. In the meantime, it seems safe to say that any schema-driven software that interprets validity today in terms of universal names instead of qualified names is unlikely to meet much resistance from the XML community.
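The invalidity of the example document under XML 1.0 rules can be checked with a validating parser. The following is a minimal sketch, assuming the JAXP DOM parser bundled with the JDK; the class and method names are hypothetical, and the wording of the reported errors depends on the parser.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical demo class; names are illustrative only.
final class DtdValidityDemo {

    // The example document from the text: the DTD declares foo:A, the root element is A.
    static final String EXAMPLE =
            "<!DOCTYPE foo:A [\n"
          + "   <!ELEMENT foo:A EMPTY>\n"
          + "   <!ATTLIST foo:A\n"
          + "             xmlns CDATA #FIXED \"http://www.foo.org/\"\n"
          + "             xmlns:foo CDATA #FIXED \"http://www.foo.org/\">\n"
          + "]>\n"
          + "<A/>";

    // Parses with DTD validation switched on and collects the validity errors reported.
    static List<String> validityErrors(String xml) {
        List<String> errors = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setValidating(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                @Override
                public void warning(SAXParseException e) {
                    // Warnings are not validity errors; ignore them.
                }

                @Override
                public void error(SAXParseException e) {
                    // Validity errors are recoverable and are reported here.
                    errors.add(e.getMessage());
                }

                @Override
                public void fatalError(SAXParseException e) throws SAXParseException {
                    // Well-formedness errors stop the parse.
                    throw e;
                }
            });
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse example document", e);
        }
        return errors;
    }

    public static void main(String[] args) {
        for (String message : validityErrors(EXAMPLE)) {
            System.out.println(message);
        }
    }
}
```

The parser compares the qualified names A and foo:A as plain strings, so it reports the document as invalid even though a namespace-aware reader might regard both as the same element type.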

Myth #10: XML namespaces are related to the traditional namespaces required by validity

This myth is both a myth and not a myth. At a conceptual level it has some validity. Although it is never stated in the XML namespaces recommendation, XML namespaces can be thought of as partitioning the traditional namespace for element type names into multiple traditional namespaces -- one for each XML namespace, plus an additional traditional namespace for element type names not in any XML namespace. Similarly, XML namespaces can be thought of as partitioning the per-element type traditional namespaces for attributes: one traditional namespace for each XML namespace, plus an additional traditional namespace for unprefixed attribute names.

(One consequence of partitioning the per-element type traditional namespaces for attributes is that a global attribute name can appear in more than one traditional namespace -- in particular, it appears once for each element type on which it occurs. Although this appears to reduce the globalness of global attribute names, it is needed to enforce a namespace-aware definition of validity, which requires that no two attribute names have the same universal name or the same unprefixed name. For more information, see myths #5 and #9.)

On a technical level, the myth is a myth: the XML namespaces used by an XML document are not related to the traditional namespaces used by that document. In particular, the names in the XML namespaces are different from those in the traditional namespaces and cannot be directly compared. For example, consider the name of the type of the following element:

   <foo:A xmlns:foo="http://www.foo.org/">

In the traditional namespace for element type names, the name of this element type is foo:A; this is because the traditional namespace does not recognize foo as a prefix identifying an XML namespace and so does not process it as one. In the http://www.foo.org/ namespace, the name of this element type is A; its corresponding universal name is {http://www.foo.org/}A. Both names are different from foo:A.

Because the names are different, you should be very careful about how you use XML namespaces if you expect your document to be used by both namespace-aware and namespace-unaware applications. For example, the following elements have the same element type name according to a namespace-aware application and different names according to a namespace-unaware application; similar difficulties arise when using default XML namespaces.

   <foo:A xmlns:foo="http://www.foo.org/">123.45.67.8</foo:A>
   <bar:A xmlns:bar="http://www.foo.org/">123.45.67.8</bar:A>
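The difference between the two views can be reproduced by parsing the elements above twice, once with namespace processing switched on and once with it switched off. This is a sketch assuming the JAXP DOM parser bundled with the JDK; the class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class; names are illustrative only.
final class AwareVersusUnawareDemo {

    static final String FOO_ELEMENT =
            "<foo:A xmlns:foo=\"http://www.foo.org/\">123.45.67.8</foo:A>";
    static final String BAR_ELEMENT =
            "<bar:A xmlns:bar=\"http://www.foo.org/\">123.45.67.8</bar:A>";

    static Element parseRoot(String xml, boolean namespaceAware) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(namespaceAware);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse example element", e);
        }
    }

    // Returns the element type name as each kind of application sees it:
    // a universal name when namespace-aware, the raw qualified name otherwise.
    static String nameSeenBy(String xml, boolean namespaceAware) {
        Element root = parseRoot(xml, namespaceAware);
        if (!namespaceAware) {
            return root.getNodeName();
        }
        String uri = root.getNamespaceURI();
        return (uri == null) ? root.getLocalName() : "{" + uri + "}" + root.getLocalName();
    }

    public static void main(String[] args) {
        System.out.println(nameSeenBy(FOO_ELEMENT, true));  // {http://www.foo.org/}A
        System.out.println(nameSeenBy(BAR_ELEMENT, true));  // {http://www.foo.org/}A
        System.out.println(nameSeenBy(FOO_ELEMENT, false)); // foo:A
        System.out.println(nameSeenBy(BAR_ELEMENT, false)); // bar:A
    }
}
```

Documents meant for both kinds of application are therefore safest when each XML namespace is always written with the same prefix.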

Myth #11: XML namespaces are part of XML 1.0

The XML 1.0 recommendation specifically warns users against using colons in names, stating that these are likely to be used in the future for XML namespaces. In spite of this, XML namespaces are a technology layered on top of XML 1.0, not part of XML 1.0.

This has several consequences. First, XML 1.0 documents are not required to use XML namespaces and XML 1.0 parsers and applications are not required to recognize or process them. Second, some early XML technologies, such as SAX and the DOM, do not directly support XML namespaces because they were developed before XML namespaces were developed. Third, DTDs do not directly support XML namespaces and concepts such as validity are not defined with respect to XML namespaces.

Fortunately, XML 1.0 was designed with XML namespaces in mind. In particular, element type and attribute names can contain colons specifically so that colons can be used to separate an XML namespace prefix from a local name. Thus, a name such as foo:Bar is a legal name in XML 1.0 and can be processed by namespace-unaware software. Such software simply doesn't recognize foo as an XML namespace prefix; instead, it treats it, along with the colon, as part of the name.


Copyright (c) 2009, Ronald Bourret