J2EE pathfinder: Persistent data management, Part 1

The J2EE platform provides a rich set of options for managing enterprise data persistence, but how do you choose the one that's right for your architecture? In the next two installments of J2EE pathfinder, Kyle Gabhart introduces J2EE's top data persistence technologies -- entity beans, JDBC, and JDO -- and compares them in several different environments. This month: JDBC versus entity beans.

Data persistence is one of the trickiest aspects of enterprise development. An enterprise data persistence solution must provide speedy client transactions, ensure data integrity over time, and be able to persist data through such everyday catastrophes as system crashes and network failures. For the next two installments of the J2EE pathfinder series, we'll focus on the J2EE technologies that can help you create sound data persistence solutions for your enterprise architecture. We'll launch the topic with a brief introduction to data persistence in enterprise applications, then move on to a more specific discussion of the various technology options. In this installment, we'll compare the single-stop solution of entity beans to the more complex -- but also more robust -- combination of session beans and Java Database Connectivity (JDBC) code. In the next installment, we'll talk about how Java Data Objects (JDO) stack up against entity beans.

What is data persistence?
Data is the most important asset of any computer application. The entire point of a computer application is to enable a person or another computer system to access its data. In an enterprise context, data must not only be accessible (that is, attached to a user interface and managed by a series of business rules), it must also be persistent. A persistent datastore is one that will survive even in the event of a server crash.

Persistent data exists outside of an application's active memory, typically in a database or flat file system. Although persistent data is read into transient memory for the purpose of use or modification, it is always written out to an external datastore for long-term storage. The United States National Institute of Standards and Technology (see Resources) defines three levels of persistent data:

Partially persistent data is a persistent data structure that allows updates to the latest version only.
Persistent data is a data structure that preserves its old versions; that is, both previous and current versions may be queried.
Fully persistent data is a persistent data structure that both maintains and allows updates to all versions of its data.
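
The three levels above can be made concrete with a small sketch. The VersionedStore class and its method names are hypothetical, invented purely for illustration; copying the whole map for each version is deliberately naive and only meant to show which operations each level permits.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical key-value store that keeps every version as an immutable snapshot. */
class VersionedStore {
    private final List<Map<String, String>> versions = new ArrayList<>();

    VersionedStore() {
        versions.add(Collections.<String, String>emptyMap()); // version 0: empty
    }

    int latest() {
        return versions.size() - 1;
    }

    /** Persistent: previous versions can be queried as well as the current one. */
    String get(int version, String key) {
        return versions.get(version).get(key);
    }

    /** Partially persistent: the update is applied to the latest version only. */
    int put(String key, String value) {
        return putOnto(latest(), key, value);
    }

    /** Fully persistent: any version can be updated, which yields a new version. */
    int putOnto(int baseVersion, String key, String value) {
        Map<String, String> next = new HashMap<>(versions.get(baseVersion));
        next.put(key, value);
        versions.add(Collections.unmodifiableMap(next));
        return latest();
    }

    public static void main(String[] args) {
        VersionedStore store = new VersionedStore();
        int v1 = store.put("status", "draft");
        int v2 = store.put("status", "final");
        int v3 = store.putOnto(v1, "owner", "kyle"); // branches from v1, not v2
        System.out.println(store.get(v1, "status")); // draft
        System.out.println(store.get(v2, "status")); // final
        System.out.println(store.get(v3, "status")); // draft
    }
}
```

In this sketch, a store that exposed only put would be partially persistent, adding get for older versions makes it persistent, and putOnto is what makes it fully persistent.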

Most business applications provide at least partially persistent data. This type of persistence is vulnerable to mid-transaction or even mid-request system failures, which can result in incomplete and often corrupt data. In a persistent data implementation, on the other hand, a system interruption or failure is countered by a "rollback," where the state of the data is rolled back to the last known good configuration. Persistent data implementations are common in enterprise architectures and database management systems (DBMS). Fully persistent data implementations are very rare. Among the few examples of fully persistent data implementations are journaling file systems, VMS file systems (like VAX and Mac OS X), and version control systems such as the Concurrent Versions System (CVS).

Persistence in J2EE
The information age has put tremendous emphasis on the use of distributed enterprise computing platforms. On such platforms, data must be protected at all cost and must persist indefinitely, even in the face of network failures, memory leaks, and server crashes. To maintain this type of persistence, application components must be capable of handling concurrency, connection management, data integrity, and synchronization. All three of J2EE's data management technologies handle these functions for the developer, although each one handles them somewhat differently.

Entity beans provide robust data persistence. The bean container handles most of the data integrity, resource management, and concurrency functions, letting developers focus on business logic and data processing rather than these low-level details. With Bean Managed Persistence (BMP) entity beans, the developer writes the persistence code but the container determines when to execute that code. With Container Managed Persistence (CMP) entity beans, the container generates the persistence code as well as managing the persistence logic.
JDBC, when combined with session beans, provides the ease of EJB development and platform-neutral deployment, without the resource usage and memory overhead that is common with EJB technology. Like BMP entity beans, this solution requires the developer to write the persistence code. Unlike BMP beans, it also requires the developer to write the persistence logic. Thus, the developer is responsible for determining when to persist data to and load from the datastore.
Java Data Objects (JDO) is the newest persistence mechanism. JDO provides an object-oriented persistent datastore. Developers use POJOs (plain old Java objects) to load and store persistent data.
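
The distinction drawn above between persistence code (how state is saved) and persistence logic (when it is saved) can be sketched in plain Java. Everything here is a hypothetical stand-in: ToyContainer is not an EJB container, the load and store callbacks are only loosely modeled on the ejbLoad and ejbStore methods of a BMP bean, and a HashMap plays the role of the datastore.

```java
import java.util.HashMap;
import java.util.Map;

class PersistenceTimingDemo {
    /** Persistence code: written by the developer in both approaches. */
    interface PersistenceCallbacks {
        void load(Map<String, Integer> datastore);
        void store(Map<String, Integer> datastore);
    }

    static class CounterBean implements PersistenceCallbacks {
        static final String KEY = "counter";
        int value;

        void increment() { // business logic only, no persistence decisions
            value++;
        }

        public void load(Map<String, Integer> datastore) {
            value = datastore.getOrDefault(KEY, 0);
        }

        public void store(Map<String, Integer> datastore) {
            datastore.put(KEY, value);
        }
    }

    /** Toy stand-in for a container: it owns the persistence logic (the "when"). */
    static class ToyContainer {
        final Map<String, Integer> datastore = new HashMap<>();

        void invoke(CounterBean bean, Runnable businessMethod) {
            bean.load(datastore);   // container decides to refresh state first
            businessMethod.run();
            bean.store(datastore);  // container decides to save state afterward
        }
    }

    public static void main(String[] args) {
        // Entity-bean style: the caller never asks for state to be saved.
        ToyContainer container = new ToyContainer();
        CounterBean managed = new CounterBean();
        container.invoke(managed, managed::increment);
        container.invoke(managed, managed::increment);
        System.out.println(container.datastore.get(CounterBean.KEY)); // 2

        // Session bean/JDBC style: the developer decides when to load and store.
        Map<String, Integer> datastore = new HashMap<>();
        CounterBean manual = new CounterBean();
        manual.load(datastore);
        manual.increment();
        System.out.println(datastore.containsKey(CounterBean.KEY)); // false
        manual.store(datastore);
        System.out.println(datastore.get(CounterBean.KEY)); // 1
    }
}
```

The second half of main shows the trade-off in miniature: nothing reaches the datastore until the developer calls store, which is both the extra control and the extra responsibility of the session bean approach.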

We'll spend the rest of the article discussing the pros and cons of entity beans versus the combination of session beans and JDBC.

Entity bean advantages
When it comes to enterprise-level data persistence, entity beans offer the following advantages:

Standardization. The EJB specification defines a set of vendor-neutral interfaces that J2EE vendors can implement to support entity beans. This standardization allows for the development of best practices and reduces the ramp-up time when a new developer is hired. Because the basic component architecture and design patterns are common knowledge, it's fairly easy to find qualified talent to implement them.
Container-managed services. As we discussed in the previous two articles in this series, EJB container-managed services provide tremendous benefits for handling such enterprise functions as security, transaction handling, connection pooling, and resource management.
Transparent persistence. The idea of container-managed services is taken even further in the case of CMP entity beans. Here, the container also manages the persistence semantics automatically. With BMP entity beans, developers must write the persistence code, but the container determines when to call the methods defined by the developer. With both CMP and BMP entity beans, the container calls the shots as to when to persist a bean's state and how to ensure data integrity and concurrency with the underlying datastore.
Transaction support. Developers have coarse-grained control over CMP transactions (isolation levels, transaction requirements, and inclusion/exclusion of methods), and fine-grained control over BMP transactions, by handling the transaction semantics programmatically in the bean code. In both cases, the container manages the transactions and determines whether or not a given transaction should be committed.
Component-based design. Entity beans are designed to be self-contained components that are configured with a deployment descriptor and can be deployed into any J2EE application server without any code changes.

The rise and fall of entity beans
In 1999, when the J2EE and EJB specifications were first introduced to the world, entity beans were touted as a brilliant enterprise component that would revolutionize enterprise application development, maintenance, and portability. The industry got excited about what entity beans represented as a sort of no-fuss, automatic persistence mechanism.
As architects, developers, and consultants began to work with entity beans, however, the charm and fascination quickly began to wear off. By the time the EJB 1.1 specification became widely implemented, it was evident that entity beans had to be used with great caution and respect. Although they still represented a powerful component-model architecture for data persistence, they were notorious for consuming far more than their fair share of server resources. The EJB 2.0 specification has alleviated some of these concerns. Although many people are still licking their wounds from the EJB 1.1 entity bean days, entity beans are now more trustworthy and a more viable solution than in days gone by. With the advent of local interfaces, enhanced CMP capabilities, and J2EE vendors with more experience implementing the EJB specification, entity beans have again become a viable data persistence mechanism for the industry at large.

In summary, entity beans benefit from standardization and industry best practices, ease some of the complexities of enterprise development, and provide a compelling component-based design.

Entity bean disadvantages
While entity beans do have an impressive list of features to recommend them, we must also consider their disadvantages, which include the following:

Design complexity. Container-managed services and automatic, transparent persistence come at a heavy price. They introduce complexity to your application design at several levels. First, to avoid network overhead and force adherence to business rules, entity beans are almost always accessed through a session bean. Thus, each transaction involves at least two enterprise beans and often many more. As more components become involved, the architecture becomes more complicated to design, code, and maintain. Second, there is the cost of automation. The container is something of a magical black box. It invokes bean callback methods whenever it deems appropriate, creates and destroys bean instances, activates and passivates beans, and stores and loads their state to a persistent datastore whenever it chooses. The application code has no control over how or when these things happen. On the positive side, the container's function reduces the number of issues to be considered when writing business logic. On the negative side, the container's response to load conditions and data request patterns is unpredictable, so extensive scenario-based load testing must be added to the development process.
Long build cycles. Because of the complexity of enterprise beans and entity beans in particular, a single iteration (design/build/test/integrate/test/deploy) can take two to three times longer than comparable Java persistence solutions.
Response time. Depending upon the load placed on the server and the relative size of the entity bean that has been requested, queries on entity beans can have sub-par response times. Entity beans by their very nature are limited to the granularity of the bean instance. Either the entire bean must be loaded or the bean cannot be loaded at all. This granularity can further complicate an architecture, as the only options are to keep the beans as is with the poor response times, or to break the data up into smaller entities, further complicating the system's architecture.
Resource usage. All enterprise beans are resource hogs. Entity beans are some of the worst offenders. While this will vary depending upon what application design patterns are used and how efficiently the vendor has designed its entity bean implementation, entity beans still have a tendency to consume massive quantities of server resources.

In summary, entity beans suffer from complexity and long build cycles, making the design and development of systems that include them more difficult. In production, entity beans are notorious for hogging resources and responding slowly to concurrent requests for large entities.

Session beans and JDBC
Unlike entity beans, stateless session beans have not ridden a roller coaster of popularity (see sidebar). In fact, stateless session beans have remained steady and reliable in terms of both popularity and functionality since 1999, when the EJB specification was released. They produce excellent performance results and efficient resource pooling, and are an important worker component of the EJB family. The stability and predictability of stateless session beans make them an excellent candidate for managing persistent enterprise data.

Stateless session beans and JDBC are often combined to create a solid persistent data management solution. In the next couple of sections, we'll weigh the pros and cons of that solution, but we won't go into much detail about either technology on its own. If you need to learn more about stateless session beans or JDBC, see Resources.

How it works
Because session beans don't possess an inherent data access mechanism, they must use a resource manager connection factory. A resource manager is a component of a J2EE container that manages the entire life cycle of a particular type of resource, including connection pooling, transaction support, and any necessary network protocols that make the actual connection possible. A connection factory is an object that is used to create connections to a resource manager. The EJB specification defines resource managers for JDBC, JMS, JavaMail, and JCA resources.

In a persistence architecture based on session beans and JDBC, a session bean delegates all data access commands to the JDBC layer. On receiving a call, the session bean uses JNDI to look up an object that implements the javax.sql.DataSource interface. The object returned then serves as a resource manager connection factory for java.sql.Connection objects (defined by the JDBC API) that implement connections to a database management system. Once a Connection object has been obtained, the remainder of the persistence code and business logic (queries, updates, stored procedure calls, result set navigation, transaction commit/rollback, and so on) is pure JDBC.
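
The flow just described might look roughly like the following sketch. The AccountDao class, the account table, and the JNDI name in the comment are all assumptions made for illustration. The StubJdbc class is not part of the pattern at all: it is a stand-in for a real driver, built with java.lang.reflect.Proxy, so that the sketch can run without a database or an application server.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.sql.DataSource;

/** Data access logic a stateless session bean might delegate to (hypothetical names). */
class AccountDao {
    private final DataSource dataSource;

    AccountDao(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    /** Inside a container, the connection factory comes from JNDI; the name is an assumption. */
    static DataSource lookup(String jndiName) throws NamingException {
        return (DataSource) new InitialContext().lookup(jndiName); // e.g. "java:comp/env/jdbc/AccountDB"
    }

    /** Returns the balance in cents, or -1 if the account does not exist. */
    long findBalance(String accountId) {
        String sql = "SELECT balance FROM account WHERE id = ?"; // hypothetical schema
        try (Connection con = dataSource.getConnection();
             PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setString(1, accountId);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getLong(1) : -1L;
            }
        } catch (SQLException e) {
            // A real session bean would typically wrap this in javax.ejb.EJBException.
            throw new IllegalStateException("balance lookup failed", e);
        }
    }

    public static void main(String[] args) {
        DataSource oneRow = StubJdbc.stub(DataSource.class, 4200L, new boolean[] {true});
        System.out.println(new AccountDao(oneRow).findBalance("A-1")); // 4200
    }
}

/** Stand-in for a JDBC driver so the sketch runs without a database; illustration only. */
class StubJdbc {
    static <T> T stub(Class<T> type, long balance, boolean[] rowPending) {
        InvocationHandler handler = (proxy, method, args) -> {
            Class<?> returnType = method.getReturnType();
            if (returnType == Connection.class || returnType == PreparedStatement.class
                    || returnType == ResultSet.class) {
                return stub(returnType, balance, rowPending);
            }
            if (method.getName().equals("next")) {
                boolean pending = rowPending[0];
                rowPending[0] = false; // the single canned row is consumed
                return pending;
            }
            return returnType == long.class ? (Object) balance : null; // null suits void calls
        };
        return type.cast(Proxy.newProxyInstance(
                StubJdbc.class.getClassLoader(), new Class<?>[] {type}, handler));
    }
}
```

In a deployed bean, the DataSource would come from lookup rather than from the stub. The try-with-resources form used here requires a newer Java release than the J2EE 1.3 era; on older platforms the same cleanup is written with explicit finally blocks.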

Session bean/JDBC advantages
Session beans and JDBC make an excellent team for handling enterprise data persistence. The most commonly recognized advantages to this combination are as follows:

Design simplicity. From an architectural design standpoint, handling data management directly through session beans is much simpler than using entity beans.
Fine-grained control. Because session beans are generic worker components, they allow the developer complete control over the entire persistence process, including caching, persistence, concurrency, synchronization, and more. In contrast, CMP entity beans allow the developer no control over the persistence mechanism, and BMP entity beans only enable the developer to define what should happen, not when or under what circumstances.
Maturity. JDBC is approximately seven years old! Entity beans, by contrast, are just over three years old. The reliability and established best practices of JDBC are a tremendous asset to the development of a J2EE persistence mechanism.
Speed. Because developers have complete control over the data access mechanism used within a session bean, data access and persistence logic can be optimized for certain tasks. This can result in incredibly quick response time due to direct, purposeful actions.

In summary, the combination of session beans and JDBC gives the developer fine-grained control over data management semantics, leverages a robust and mature data management technology, enables functional optimization, and packages it all into a relatively simple component architecture.

Session bean/JDBC disadvantages
Sounds good so far, but there are a few downsides to the combination of session beans and JDBC. They are as follows:

Implementation complexity. While the architectural design of such a system is fairly simple, the actual session bean implementation is often quite complicated. Managing database connections, ensuring data integrity, and properly handling transaction semantics are crucial tasks that can be overwhelming to implement if the application's data needs are fairly sophisticated. To achieve optimum performance, developers are often required to implement some type of caching as well. The construction of such a caching mechanism further complicates the development and maintenance of the system.
Not inherently transactional. Entity beans are inherently transactional components with configurable transaction semantics; session beans are not. When coding the transaction semantics directly into the application code, developers must take every precaution to ensure that the business rules, flow control, and transactional integrity of each function are preserved and fault-tolerant. These details are handled by the container in entity-bean development.
Persistence isn't automatic or guaranteed. In entity bean operation, the container handles the persistence of a bean's state, ensuring that such data has been preserved for later use. With session beans, the responsibility of persisting data to a secure, long-term datastore is on the developer's shoulders.
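
To give a sense of what "coding the transaction semantics directly into the application code" can involve, here is a minimal sketch of manual transaction demarcation with plain JDBC. It assumes the bean manages a local transaction on a single connection; the TransferService class and the account table are hypothetical. RecordingJdbc is only a fake connection that logs the calls it receives so the sketch can run without a database.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

class TransferService {
    /** Moves an amount between two accounts as one unit of work; returns true if committed. */
    static boolean transfer(Connection con, String from, String to, long cents) {
        String debitSql = "UPDATE account SET balance = balance - ? WHERE id = ?";  // hypothetical schema
        String creditSql = "UPDATE account SET balance = balance + ? WHERE id = ?";
        try {
            con.setAutoCommit(false); // the developer starts the transaction
            try (PreparedStatement debit = con.prepareStatement(debitSql);
                 PreparedStatement credit = con.prepareStatement(creditSql)) {
                debit.setLong(1, cents);
                debit.setString(2, from);
                credit.setLong(1, cents);
                credit.setString(2, to);
                if (debit.executeUpdate() != 1 || credit.executeUpdate() != 1) {
                    throw new SQLException("account not found");
                }
                con.commit(); // the developer decides the work is complete
                return true;
            } catch (SQLException e) {
                con.rollback(); // the developer must undo partial work
                return false;
            }
        } catch (SQLException e) {
            throw new IllegalStateException("could not start or roll back the transaction", e);
        }
    }

    public static void main(String[] args) {
        List<String> calls = new ArrayList<>();
        boolean committed = transfer(RecordingJdbc.connection(calls, 1), "A-1", "B-2", 500L);
        System.out.println(committed + " " + calls);
    }
}

/** Fake connection that records method names; every executeUpdate reports rowsPerUpdate rows. */
class RecordingJdbc {
    static Connection connection(List<String> log, int rowsPerUpdate) {
        return stub(Connection.class, log, rowsPerUpdate);
    }

    private static <T> T stub(Class<T> type, List<String> log, int rows) {
        InvocationHandler handler = (proxy, method, args) -> {
            log.add(method.getName());
            Class<?> returnType = method.getReturnType();
            if (returnType == PreparedStatement.class) {
                return stub(returnType, log, rows);
            }
            return returnType == int.class ? (Object) rows : null; // null suits void calls
        };
        return type.cast(Proxy.newProxyInstance(
                RecordingJdbc.class.getClassLoader(), new Class<?>[] {type}, handler));
    }
}
```

Every branch here (commit, rollback, failure of the rollback itself) is the developer's responsibility, which is the work the container takes over in entity bean development. A production version would also need to restore the connection's auto-commit setting before returning it to a pool.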

In summary, session beans combined with JDBC suffer from three key problems: the implementation of the bean is often complex; session beans aren't inherently transactional; and the persistence mechanism isn't automatic or guaranteed.

Making the call
Despite the drawbacks, J2EE architects have begun to claim plain vanilla stateless session beans with raw JDBC calls as the safest and most commonly recommended data persistence mechanism. This isn't so much because the combination is a superior solution to entity beans (both have their merits) as it is a matter of momentum. Whereas entity beans rose quickly to prominence and then fell just as quickly out of favor, the popular acceptance of session beans and JDBC has been accumulating slowly and steadily over time.

Despite the current trend, it is worthwhile to carefully weigh the advantages of entity beans versus session beans with JDBC. The following list identifies four key areas in which to compare the two data persistence solutions:

Read/write needs. Data that is read often and changed rarely or never is best handled by session beans with JDBC. The development should be simple and straightforward, and should result in excellent response times. If the data needs to be frequently updated and support many concurrent requests (and thus many concurrent changes), then entity beans are the clear choice. The complexity involved in building a mechanism to ensure data integrity, synchronization, and frequent persistence in the face of concurrent requests for data would simply be too overwhelming and not worth the time and effort involved in creating it.
Transactional support. CMP entity beans shield developers from being concerned with transaction contexts. All the transactional details are declared within the bean's deployment descriptor. If this level of control is acceptable, then CMP entity beans clearly provide the easiest solution. If more control is needed, BMP beans allow developers to define what actions should be taken without being concerned with writing business rules for when such actions should be triggered. For the maximum level of control, a session bean should be used. That session bean could manage a complicated transaction involving CMP and BMP entity beans, as well as a handful of JDBC calls that directly hit the database.
Time to market. CMP entity beans easily represent the single fastest time to market of any J2EE persistence mechanism. Data types and names are declared, deployment settings are defined, and the application server and vendor tools take care of the rest. It's hard to say whether BMP entity beans or session beans with JDBC would rank as the second fastest solution. On one hand, BMP would be faster because the container is providing so many life-cycle services on behalf of the bean. On the other hand, session beans would come in ahead, as they have a much less complicated and thus shorter build/test/deploy cycle. Ultimately, ranking these three as they relate to your particular project is only part of the picture. This ranking must then be weighed against the next category: resource usage.
Resource usage. Entity beans are notorious for consuming a large quantity of resources, especially when concurrent requests are made on especially large entities. In comparison, session beans and JDBC datasource connections are very lightweight and require only a small amount of server resources. For more information on this, read the description of the stateless session EJB instance-pooling model outlined in the first article of this series, "J2EE technologies for the stateless network" (see the J2EE Pathfinder series in Resources).

Conclusion
In this third installment of the J2EE pathfinder series, we have compared and contrasted entity beans with session beans and JDBC for data persistence. The scenarios discussed here don't cover every situation, but they are representative of some of the most common uses for entity EJB components and session EJB components.

Next month we'll continue our exploration of J2EE data persistence mechanisms as we compare entity beans with Java Data Objects. Until then, happy pathfinding!

Resources

See the complete J2EE pathfinder series by Kyle Gabhart.
The J2EE home page is the place to start if you want to learn more about the Java 2 platform, Enterprise Edition and related technologies.
The National Institute of Standards and Technology offers a complete listing of the types of standard data persistence.
The tutorial "Getting started with Enterprise JavaBeans technology" (developerWorks, April 2003) is a comprehensive introduction to EJB technology.
Brett McLaughlin's EJB best practices series on developerWorks introduces some of the basic patterns and uses associated with Enterprise JavaBeans.
Rick Hightower details container-managed persistence in this four-part tutorial series (developerWorks, March 2002 - July 2002):
- Part 1 explains all about CMP/CMR.
- Part 2 describes the three varieties of container-managed relationships: one-to-one, many-to-many, one-to-many.
- Part 3 introduces EJB QL.
- Part 4 focuses on advanced finder methods and more complex EJB-QL queries.
Srikanth Shenoy offers best practices for EJB exception handling (developerWorks, May 2002).
Kyle Brown's "Choosing the Right EJB Type: Some Design Criteria" (WebSphere Developer Domain, August 2000) is a careful comparison of EJB types.
The J2EEOlympus.com portal is an excellent repository of J2EE information (particularly the EJB pages).
developerWorks offers two tutorials by Robert Brunner that detail JDBC.
- In "Building Web-based applications with JDBC" (December 2001) you'll learn the fundamentals of Web application programming using three separate techniques: a servlet approach, a JSP approach, and a combined JSP, JavaBeans, and servlet approach (also known as Model 2).
- In "Advanced database operations with JDBC" (November 2001) you'll learn several advanced database operations, including stored procedures and advanced datatypes, that can be performed by a Java application using JDBC.
"What's new in JDBC 3.0?" by Josh Heidebrecht (developerWorks, July 2001) outlines the key features of JDBC 3.0, the current version of the JDBC specification.
See the developerWorks Java technology tutorials page for a complete listing of more free tutorials from developerWorks.
You'll find hundreds of articles about every aspect of Java programming in the developerWorks Java technology zone.

About the author
Kyle Gabhart is an independent consultant and subject matter expert with J2EE, XML, and Web services technologies. Kyle is a popular public speaker, recognized for his enthusiasm and dynamic analysis and presentation of emerging technologies. For information on his recent and upcoming presentations or industry publications, visit Gabhart.com. Kyle can be reached at kyle@gabhart.com.
