Q: How do I handle binary data?

Applies to: 2.0

Binary data is generally represented in XML documents in one of three ways: as Base64, as hexBinary, or as an unparsed entity. XML-DBMS can handle the first two (Base64 and hexBinary) with a bit of work, but cannot handle the third (unparsed entities) without significant work.

Base64 and hexBinary are character representations of binary data. Base64 seems to be the more commonly used of the two in XML. For more information about them, see XML Schema Part 2: Datatypes.
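
To make the difference between the two representations concrete, here is a minimal sketch that encodes the same bytes both ways. It is not part of XML-DBMS: the class name BinaryTextDemo is made up for illustration, and it assumes Java 8 or later for java.util.Base64.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical demo class; not part of XML-DBMS.
final class BinaryTextDemo {
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    // Base64: every 3 bytes become 4 characters, padded with '='.
    static String toBase64(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // hexBinary: every byte becomes 2 hexadecimal digits.
    static String toHexBinary(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            sb.append(HEX[(b >> 4) & 0x0F]).append(HEX[b & 0x0F]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] data = "Hello".getBytes(StandardCharsets.US_ASCII);
        System.out.println(toBase64(data));    // SGVsbG8=
        System.out.println(toHexBinary(data)); // 48656C6C6F
    }
}
```

Base64 needs roughly four characters per three bytes and hexBinary needs two characters per byte, which may be one reason Base64 is the more common choice for large values.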

What you need to do

This procedure describes how to map a binary column to an XML document using Base64. If you want to use hexBinary, modify steps (1) and (3) accordingly.

Please note that I have not actually tested this procedure, so let me know if there are any problems.

  1. In your map document, specify that the formatter for the binary data types is Base64Formatter:

       <Options>
          ...
          <FormatClass
             Class="org.xmlmiddleware.conversions.formatters.external.Base64Formatter"
             DefaultForTypes="BINARY VARBINARY LONGVARBINARY" />
          ...
       </Options>
    
  2. In your map document, map the binary column using the BINARY, VARBINARY, or LONGVARBINARY data type. For BLOB columns, use LONGVARBINARY. For example:

       <Table Name="MyTable">
          ...
          <Column Name="MyBLOBColumn" DataType="LONGVARBINARY" />
          ...
       </Table>
    
  3. Implement the parse and format methods in org.xmlmiddleware.conversions.formatters.external.Base64Formatter. The easiest way to do this is to use an existing Base64 encoder/decoder. For example, you might use one of the Base64 libraries available on SourceForge.

    Note that you will have to use the ByteArray class defined by XML-DBMS to represent binary data. This is simply an object that wraps a byte[].

  4. Modify the code in org.xmlmiddleware.xmldbms.Row to use ByteArray, as explained in message 3418.

    Note that this code will cache the entire BLOB value in memory, which may cause problems for very large values. Unfortunately, there does not seem to be an easy way to fix this. You could modify this code and the code in Base64Formatter to use a file instead of memory, but the data value would still need to be held in memory at some point, as XML-DBMS uses DOM trees to represent XML documents.
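
The parse and format logic described in step (3) might look roughly like the sketch below. Like the procedure itself, it has not been tested against XML-DBMS. Everything in it is an assumption: ByteArrayStandIn is a made-up stand-in for the XML-DBMS ByteArray class, the method signatures are illustrative rather than the ones the real Base64Formatter has to implement, and java.util.Base64 (Java 8 or later) is used in place of a third-party encoder/decoder.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical stand-in for the XML-DBMS ByteArray wrapper; the real class may differ.
final class ByteArrayStandIn {
    private final byte[] bytes;

    ByteArrayStandIn(byte[] bytes) {
        this.bytes = bytes.clone();
    }

    byte[] getBytes() {
        return bytes.clone();
    }
}

// Sketch of the two conversions only; signatures are assumptions, not the XML-DBMS API.
final class Base64FormatterSketch {
    // XML text -> binary value. The MIME decoder ignores line breaks,
    // which Base64 content in XML documents often contains.
    static ByteArrayStandIn parse(String text) {
        return new ByteArrayStandIn(Base64.getMimeDecoder().decode(text));
    }

    // Binary value -> XML text, emitted as a single unwrapped line.
    static String format(ByteArrayStandIn value) {
        return Base64.getEncoder().encodeToString(value.getBytes());
    }

    public static void main(String[] args) {
        ByteArrayStandIn value = parse("SGVs\nbG8=");
        System.out.println(new String(value.getBytes(), StandardCharsets.US_ASCII)); // Hello
        System.out.println(format(value)); // SGVsbG8=
    }
}
```

Both methods work on complete in-memory arrays, so this sketch has the same limitation for very large values that is noted in step (4).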
