XML and Java technologies: Data Binding Part 4: JiBX Usage

JiBX lead developer Dennis Sosnoski shows you how to work with his new framework for XML data binding in Java applications. With the binding definitions used by JiBX, you control virtually all aspects of marshalling and unmarshalling, including handling structural differences between your XML documents and the corresponding Java language objects. Want to refactor your code without changing the XML representation? With JiBX you can...

Part 3 of this series gave you an introduction to the architecture of the JiBX data binding framework. That included a quick overview of JiBX's Java-centric approach to data binding, as contrasted with the XML-centric approach used by most other data binding frameworks. Now in Part 4, you'll find out how to use the power of this Java-centric approach to data binding in your applications.

Most other data binding frameworks for the Java language force you to supply a DTD or W3C XML Schema grammar for your documents, then generate a collection of classes from that grammar. You need to work with these generated classes to use the frameworks, but in most cases you have little or no control over the classes -- they're basically JavaBean-type wrappers around simple data structures along with some added framework code. The whole point of these generated classes is to provide an interface for working with data from your XML documents.

The JavaBean wrapper approach is sometimes presented as object-oriented because of the use of get/set methods to access data. In reality, it's about as far from true object-oriented development as you can get, because of the lack of extensibility in the data objects. True object-oriented programming means that objects hide their internal state and implement their own behaviors for working with that state information. This is typically not possible with the generated code approaches.

With JiBX, binding to XML is treated as an aspect that applies to your classes, not as the primary purpose of those classes. Thus, you use object-oriented designs that are appropriate to your application. It also gives you the freedom to refactor your classes without needing to change the bound XML document structure. This aspect-oriented approach even lets you define multiple bindings to be used with the same classes, so that you can work with multiple input or output XML document structures without having to change your code.

Of XML bondage
The core of JiBX's aspect-oriented approach to binding is the use of binding definitions to control how your Java objects are converted to and from XML documents. To see how this works, think of XML documents as tree structures, where the nesting of elements defines branches of the tree. Data binding converts these XML trees to and from trees of objects (or sometimes graphs of objects, with links up or across the tree structure). A JiBX binding definition is a third tree that represents a merger of the structure of the XML and object trees (which, in JiBX, can be different). This merged structure tells JiBX how to convert the XML tree to and from the object tree.

Figure 1 gives a simple example of how the binding definition is used in the JiBX framework. In this case the XML document and the bound classes use the same structure -- a customer has a name as a child, and the name has a pair of simple text values as children. The binding definition simply duplicates this structure, supplying the necessary information at each level to relate XML elements to the corresponding object properties.

Figure 1. Binding definition role

Binding components
Several types of elements are used within a binding definition. The purposes of these elements and the types of child elements they can contain are shown in Table 1.

Table 1. Elements used in a binding definition

Element Purpose

binding
The root element of the binding definition, with optional attributes for binding name and global settings.

Children: namespace, format, mapping (at least 1 mapping required)

namespace
Namespace declaration that defines a namespace URI and associated prefix (with the prefix used for marshalling).

Children: none

format
Format definition for converting simple values to and from text. This is needed only if you want to use nonstandard conversions.

Children: none

mapping
Defines how objects of a particular class are converted to and from XML. Each mapping is a reusable component that can be referenced wherever an object of that type needs to be handled within the binding definition. Mapping definitions that are children of the binding element are called global mappings.

Children: namespace, format, mapping, followed by any combination of value, structure, and collection elements

value
Gives the conversion handling for a simple value (a primitive, or an object type with a format supplied) to convert it to and from text. The XML representation can be an attribute, a simple element, or in some cases a plain text or CDATA node.

Children: none

structure
Structure component of binding, which can represent a Java object, an XML element, or both. Usually, this represents both a Java object and an XML element linked to that object. A structure mapping is defined when either the object or the XML element is missing from the definition.

Children: any combination of value, structure, and collection elements

collection
Similar to a structure element, but specifically for representing Java collection objects (added in JiBX Beta 2).

Children: any combination of value, structure, and collection elements

The binding element is always the root element of a binding definition. As children it can have namespace, format, and mapping elements, which must be in that order (with the first two types optional). Each mapping element can in turn have these same types of elements as children for nested definitions, followed by a mixture of value, structure, and collection elements that define the details of the relationship between XML and a Java class.

The value elements represent simple value components of the XML document, which can be attributes, simple child elements (with only text content), text, or CDATA. The structure elements are more involved. In the most common case (as in the Figure 1 example), a structure element relates a child element with complex content (the name element, in the example) to an object-value property of a Java object (the name field of the Customer object). Both sides of the relationship are optional, though. This allows the structure element to define an XML element with no corresponding object, or an object with no corresponding element. I'll show how this works in the following examples.

A simple binding
In Part 3, I gave some examples of the flexibility JiBX provides with structure mapping. I'll go through the actual binding definitions here. Figure 2 shows the first example, with a direct correspondence between the structure of the XML documents and the Java objects. This is just the full version of the same document and class structure used in Figure 1. Listing 1 gives a full binding definition for this correspondence.

Figure 2. Direct correspondence to XML

Listing 1. Binding definition for direct correspondence


<binding>
  <mapping name="customer" class="Customer">
    <structure name="name" field="name">
      <value name="first-name" field="firstName"/>
      <value name="last-name" field="lastName"/>
    </structure>
    <value name="street1" field="street1"/>
    <value name="city" field="city"/>
    <value name="state" field="state"/>
    <value name="zip" field="zip"/>
    <value name="phone" field="phone"/>
  </mapping>
</binding>
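
For reference, classes that fit the Listing 1 binding could look like the sketch below. The field names are taken from the field attributes in the binding definition. The String types and the lack of any methods are assumptions made for illustration, not something the binding dictates.

```java
// Sketch of classes the Listing 1 binding could apply to.
// Field names mirror the binding's field attributes; the String
// types are assumed for illustration.
class Name {
    String firstName;   // bound to the <first-name> element
    String lastName;    // bound to the <last-name> element
}

class Customer {
    Name name;          // bound to the <name> element through the structure
    String street1;
    String city;
    String state;
    String zip;
    String phone;
}
```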

Compacting the definition
Listing 1 gives the full form of the binding definition. This isn't the only way of specifying a binding, though. If requested, JiBX (starting with Beta 2) will map unspecified simple properties of Java objects automatically. The properties to be mapped may take the form of either fields or JavaBean-style get/set methods. Taking advantage of this default mapping allows Listing 2 to be used as a (much shorter) alternative to the Listing 1 definition.

Listing 2. Compact version of binding definition


<binding auto-link="fields">
  <mapping name="customer" class="Customer">
    <structure name="name" field="name"/>
  </mapping>
</binding>

This compact approach does have some limitations. The automatically generated property bindings will always follow any definitions given explicitly, and will occur in the order they're defined. In the case of the Figure 1 binding this is just what I want -- the name element is the first child of the customer element, and the field definitions within both the Name and Customer classes use the same order as the corresponding XML child elements. When the Java data matches the XML structure as closely as in this case, automatically generating bindings can make the binding definition very simple.

JiBX does provide some specialized options for customizing the automatic property binding generation. These let you give a prefix and suffix to be stripped from field or JavaBean property names when generating the corresponding XML element or attribute names, control the style of XML names, and set the access level for fields or methods included in the automatic generation. You can even list field or property names to be specifically included or excluded in the automatic generation. However, for the rest of the examples in this article I'll just stay with the full binding definition format, in order to clearly show exactly what values are being bound.

Flattening the tree
The simple binding example doesn't really do justice to the flexibility of JiBX. In Part 3 I also showed a pair of examples of structure mapping, which handles structural differences between the XML document and the bound Java classes. Figure 3 shows the first example of this type, with the same XML document structure bound to a single class rather than the pair of classes used previously.

Figure 3. Binding to single class

In the Figure 3 example, the Java class structure is a flattened version of the XML document. Rather than using a separate class for the values within the XML name element, this just includes the values directly within the class that corresponds to the parent customer element. Listing 3 gives a full binding definition for this structure mapping.

Listing 3. Structure mapping to single class


<binding>
  <mapping name="customer" class="Customer">
    <structure name="name">
      <value name="first-name" field="firstName"/>
      <value name="last-name" field="lastName"/>
    </structure>
    <value name="street1" field="street1"/>
    <value name="city" field="city"/>
    <value name="state" field="state"/>
    <value name="zip" field="zip"/>
    <value name="phone" field="phone"/>
  </mapping>
</binding>

If you compare Listing 1 and Listing 3 you'll see that the change to the binding definition for this flattened mapping is trivial. Only one line of the binding definition is different -- the field attribute has been removed from the structure element. This tells JiBX that the structure element of the binding definition defines an element in the XML (as shown by the name attribute) that maps to some properties of the current object.
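
On the Java side, the flattened structure amounts to a single class that holds the name values directly. The following is a minimal sketch of a class the Listing 3 binding could apply to, with field names taken from the binding definition and String types assumed.

```java
// Sketch of the single class bound by Listing 3 (field types assumed).
// The name values live directly in Customer, even though the XML
// wraps them in a <name> element.
class Customer {
    String firstName;   // from <name>/<first-name>
    String lastName;    // from <name>/<last-name>
    String street1;
    String city;
    String state;
    String zip;
    String phone;
}
```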

Warping the tree
Figure 4 gives a second example of structure mapping. This time the Java class structure uses a pair of classes, but the breakdown of data values doesn't match the structure of the XML document -- data values from the customer element of the XML document are split between the two classes, and the values from the name child element are included directly in the class corresponding to its parent element.

Figure 4. Binding to split classes

Listing 4 gives the full form of a binding definition for the Figure 4 binding. The only difference from the Listing 3 binding definition is that I've added a structure element corresponding to the new Address class. This structure element includes a field attribute but no name attribute, telling JiBX that the structure element is defining an object property with no corresponding element in the XML document.

Listing 4. Structure mapping to split classes


<binding>
  <mapping name="customer" class="Customer">
    <structure name="name">
      <value name="first-name" field="firstName"/>
      <value name="last-name" field="lastName"/>
    </structure>
    <structure field="address">
      <value name="street1" field="street1"/>
      <value name="city" field="city"/>
      <value name="state" field="state"/>
      <value name="zip" field="zip"/>
    </structure>
    <value name="phone" field="phone"/>
  </mapping>
</binding>
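
A pair of classes matching the Listing 4 binding might look like this sketch. The address field name and the other field names come from the binding definition. The Address class name follows the text above, and the String types are assumed.

```java
// Sketch of the split classes bound by Listing 4 (field types assumed).
// Address has no element of its own in the XML; its values appear as
// direct children of <customer>.
class Address {
    String street1;
    String city;
    String state;
    String zip;
}

class Customer {
    String firstName;   // from <name>/<first-name>
    String lastName;    // from <name>/<last-name>
    Address address;    // object property with no corresponding XML element
    String phone;
}
```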

Tying up loose ends
JiBX binding definitions offer many options beyond what I've covered here. Some of these are hinted at in the list of binding definition elements, such as custom serialization and deserialization using the format element, and easy namespace handling with the namespace element. Other options are controlled by attributes of the binding definition elements. These include defaults for optional values, methods to be called before marshalling or after unmarshalling objects, and identifier values for referencing objects.

JiBX also includes a general extension hook in the form of custom marshal and unmarshal method definitions. These let you take over complete control of the marshalling and unmarshalling process, working directly with the low-level methods defined by the JiBX context classes. This type of low-level operation is not intended for general usage. It does provide some interesting possibilities for JiBX add-ons, though. One potential use is to allow portions of a document to be mapped to and from a document model (such as DOM, JDOM, or dom4j). This would provide easy handling of special cases, such as XHTML fragments embedded within XML documents.

Compiling the binding
Once you've got your binding definition, you need to actually compile it into the class files. JiBX supplies a binding compiler for this purpose. To use it, you just set up the Java class path so both the jar files included in the JiBX distribution and your own classes are accessible to the JVM, then run the org.jibx.binding.Compile program with one or more binding definition file paths as arguments.
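
If you want to drive this step from Java code (a build script, for instance), the invocation described above can be sketched as a small helper that assembles the command line. BindingCompilerCommand is a hypothetical class, not part of JiBX. The class path string and binding file names are placeholders for whatever your build uses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of JiBX): assembles the command line for
// running the binding compiler in a separate JVM.
class BindingCompilerCommand {

    // classPath must cover both the JiBX jars and your own compiled classes;
    // bindingFiles are the paths of one or more binding definitions.
    static List<String> build(String classPath, String... bindingFiles) {
        List<String> command = new ArrayList<>();
        command.add("java");
        command.add("-cp");
        command.add(classPath);
        command.add("org.jibx.binding.Compile");
        command.addAll(Arrays.asList(bindingFiles));
        return command;
    }
}
```

Passing the resulting list to new ProcessBuilder(command).inheritIO().start() would launch the compiler and show its output on the console.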

The binding compiler adds JiBX binding code to your class files, preparing them for use with the JiBX runtime. It's smart about how it does this: If the same added code is needed by more than one binding, the binding compiler only generates the code once. Likewise, if you rerun the compiler with a modified binding definition it replaces the methods added for the old binding rather than just adding new ones. The compiler even removes methods and classes that were previously added for a binding you're no longer using. Finally, it only writes to class files that have actually been changed. This makes it safe to rerun the binding compiler after changing (and compiling) some of your Java source code files, without needing to recompile all your Java source files.

Running the binding
Once the binding compiler has modified your Java class files, you're ready to use the JiBX runtime for marshalling and unmarshalling documents. There's just one hitch, though: The actual binding code isn't added until after your Java source code is compiled to class files and run through the JiBX binding compiler, so you can't access this binding code directly in your source code. Instead, you need to work through a portion of the JiBX runtime that tracks the binding definitions you're using and connects you to the proper code at runtime.

This uses the org.jibx.runtime.BindingDirectory class that's included in the JiBX runtime jar, along with a class that JiBX generates in the same package as your code (or as the first class file it modifies, if your code is spread across multiple packages). You don't need to worry about the details of getting at this generated class, though. Instead, you access it by passing the BindingDirectory one of the classes defined by a global mapping in your binding (a mapping that's a child of the root binding element). If you've compiled more than one binding into the code, you'll also need to pass the name of the binding you want to use. The code is simple:

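    // BindingDirectory and IBindingFactory are in org.jibx.runtime;
    // getFactory throws JiBXException if no binding is found for the class
    IBindingFactory bfact =
        BindingDirectory.getFactory(Customer.class);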

Here, Customer is the name of a class with a global mapping in the binding. The org.jibx.runtime.IBindingFactory interface that gets returned provides methods to construct marshalling and unmarshalling contexts, which in turn allow you to do the actual marshal and unmarshal operations. Here's an unmarshal example:

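    // Both calls can throw org.jibx.runtime.JiBXException; opening the
    // stream can throw java.io.FileNotFoundException
    IUnmarshallingContext uctx = bfact.createUnmarshallingContext();
    Object obj = uctx.unmarshalDocument
        (new FileInputStream("filename.xml"), null);  // null: no explicit encoding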

This is just one of several variations of an unmarshal call -- in this case to unmarshal an XML document in the file filename.xml. You can pass a reader instead of a stream as the source of the document data, and you can also specify an encoding for the document -- see the JavaDocs on the JiBX site for details. The returned object is an instance of one of your classes defined with a global mapping in the binding -- you can either check the type with instanceof or cast directly to your object type, if you know what it is.
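
When a binding has more than one global mapping, the instanceof check mentioned above can be sketched as follows. Customer and Supplier are hypothetical stand-ins for your own bound classes, and UnmarshalResults is an illustrative helper rather than anything supplied by JiBX.

```java
// Hypothetical bound classes, standing in for classes with global mappings.
class Customer { String phone; }
class Supplier { String phone; }

class UnmarshalResults {

    // Routes an unmarshalled root object according to its runtime type.
    static String describe(Object root) {
        if (root instanceof Customer) {
            Customer customer = (Customer) root;   // safe after the instanceof check
            return "customer, phone " + customer.phone;
        } else if (root instanceof Supplier) {
            return "supplier";
        }
        String type = (root == null) ? "null" : root.getClass().getName();
        throw new IllegalArgumentException("Unexpected root type: " + type);
    }
}
```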

Marshalling is just as easy. Here's an example:

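    IMarshallingContext mctx = bfact.createMarshallingContext();
    mctx.setIndent(4);    // 4 spaces per nesting level
    // Arguments: root object, encoding, standalone flag (null here), output stream
    mctx.marshalDocument(obj, "UTF-8", null,
        new FileOutputStream("filename.xml"));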

As with the unmarshal example, this is just one of several variations that can be used for the marshal call. It first sets the indentation of the output XML to 4 spaces per nesting level, then marshals the object to an XML document written to the file filename.xml with UTF-8 character encoding (the most common choice for XML). You can pass a writer instead of a stream, as well as some other variations -- again, see the JavaDocs on the JiBX site for details. The object to be marshalled must be an instance of a class that's defined with a global mapping in the binding.

Future directions
JiBX provides a number of advantages over other available XML data binding frameworks for Java applications. These include very fast operation, a compact runtime distribution, and greater isolation between XML document formats and Java language object structures. As JiBX nears initial production release, it's looking like a great alternative for many applications.

JiBX does still have some areas of weakness. One is the current lack of support for code generation from an XML grammar. That may seem a surprising comment after my earlier remarks on the limitations of the XML-centric code generation approaches, but JiBX actually offers the means to avoid many of these limitations. If an XML grammar could be used for generating an initial set of classes and a corresponding binding definition, this would provide the benefit of getting working code quickly. At the same time, users would still have the long-term flexibility to independently refactor either the code or the grammar while modifying the binding definition to keep everything working in harmony.

Another very useful feature would be a tool to verify a binding definition against an XML grammar. A grammar provides most of the information necessary to say whether a binding definition will actually handle the intended documents properly. A tool to actually check the combination for compatibility would help prevent potential surprises in testing or deployment.

The current byte code enhancement approach that adds binding framework methods to your compiled classes is also an area where more flexibility would be useful. Byte code enhancement offers the advantage of keeping your source code clean, but at the costs of an added step in the build process and potential confusion in tracking problems in your code accessed during marshalling or unmarshalling. It'd be great to offer an alternative for cases where these costs outweigh the benefits.

I think there's a relatively simple solution to this issue. It should be fairly easy to decompile the code added by JiBX back to source code, and merge this into the original Java language source files. Once this is done the methods needed by JiBX will be compiled-in automatically, and as long as the user doesn't tinker with the code added by JiBX there should be no need to rerun the binding compiler until the binding definitions change. I'm currently investigating adding support for this type of operation to JiBX, though it probably won't be until after the initial production release.

As a final note on its limitations, JiBX currently offers relatively weak validation support compared to many of the other data binding frameworks. It's possible that this will change in the future. JiBX does include support for methods to be called before an object is marshalled or after an object is unmarshalled, and these methods could be used to handle most forms of validation. For many XML applications, full validation support is secondary to the main goal of fast and convenient access to data from XML documents, and for this goal JiBX already offers a great alternative.
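
As a rough illustration of validation through such a method, the sketch below shows a check that could run after an object is unmarshalled. The validate method name and the specific rules are hypothetical. How the method is registered in the binding definition is not covered in this article, so confirm the wiring against the JiBX documentation.

```java
// Sketch of a bound class with a validation method (field types assumed).
class Customer {
    String state;
    String zip;

    // Hypothetical hook: the binding would name this method so that it is
    // called once the object's values have been unmarshalled.
    void validate() {
        if (state == null || state.length() != 2) {
            throw new IllegalStateException("state must be a two-letter code: " + state);
        }
        if (zip == null || !zip.matches("\\d{5}")) {
            throw new IllegalStateException("zip must be five digits: " + zip);
        }
    }
}
```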

Conclusion
In this article I've shown you the basics of working with the JiBX data binding framework. I personally feel JiBX has a lot to offer (which isn't too surprising, since I am the author of JiBX!). It's especially useful for applications that need to adapt existing object structures to XML, and for any applications where you want to decouple your code from the actual XML structure. But JiBX is definitely not a solution to all XML requirements for Java applications.

XML is used for many different purposes, and the ever-increasing number of tools for working with XML in Java applications reflects the fact that different purposes require different tools. The toughest part of working with XML is often just knowing which tool is best for a specific application. JiBX is designed to suit applications that need to interpret XML documents as data and work with that data in memory, where the focus is more on the use of that data by the application than on the XML documents themselves.

If JiBX sounds like a match for your needs I encourage you to download the current distribution and give it a try. Since it's an open source project with a BSD-style license, you're free to modify the code to suit your requirements -- and you don't even need to make your modifications public. I naturally hope that many people find it useful, and that they do contribute extensions and added tools to help the project grow in the future.

In my next article, I'll close out this series on data binding with a look at the recently released JAXB data binding standard and reference implementation. This will include trying out some of the customization options JAXB provides. JAXB is strongest in the very areas where JiBX is weakest, so this final article will be a nice wrap-up to a series that's covered a full range of tools and techniques.

Resources

Learn more about the new JiBX framework for mapped bindings.
Part 1 of this series on data binding provides background on why you'd want to use data binding for XML, along with an overview of the available Java frameworks for data binding. Part 2 gives performance comparisons between the data binding frameworks, including the new JiBX framework (developerWorks, January 2003). Part 3 introduces the JiBX architecture and discusses the reasons behind the choices made in JiBX (developerWorks, April 2003).
Check out the author's prior article on Data Binding with Castor, which covers the mapped data binding technique with Castor (developerWorks, April 2002).
If you need background on XML, try the developerWorks Introduction to XML tutorial (August 2002).
Review the author's previous developerWorks articles covering performance (September 2001) and usage (February 2002) comparisons for Java XML document models.

Find out more about the Java Architecture for XML Binding (JAXB), the evolving standard for Java Platform data binding.
Take a closer look at the Castor framework, which supports both mapped and generated bindings.

Read more about the interplay of Java Technology and XML.
Reference JSR 31 - the XML Data Binding Specification.
Find more information on the technologies covered in this article at the developerWorks XML and Java technology zones.
IBM WebSphere Studio provides a suite of tools that automate XML development, both in Java and in other languages. It is closely integrated with the WebSphere Application Server, but can also be used with other J2EE servers.
Find out how you can become an IBM Certified Developer in XML and related technologies.

About the author

Dennis Sosnoski (dms@sosnoski.com) is the founder and lead consultant of Seattle-area Java consulting company Sosnoski Software Solutions, Inc., specialists in J2EE, XML, and Web services support. Dennis's professional software development experience spans over 30 years, with the last several years focused on server-side Java technologies. He's a frequent speaker on XML in Java and J2EE technologies at conferences nationwide, and chairs the Seattle Java-XML SIG.
