Prevent mistypings to the String class

Take full advantage of the Java language's polymorphism

Fernando Ribeiro (fribeiro@bol.com.br), Consultant
August 2002

The conversion of objects to strings (or
stringification) can cause problems in Java programming unless
you remember that string representations are rarely used in solid
object-oriented applications. In this article, systems analyst and
programmer Fernando Ribeiro builds on Eric Allen's bug pattern concept
and explains how mis-stringification can be a bug pattern; he discusses
the diagnostics of this elusive pitfall and expounds on the benefits of
type safety.
Stringification is the conversion of an object to a string. For the
purposes of this article, mis-stringification means mistyping an element
as String, that is, declaring it as a String when a more specific type
would match it better. The examples in this article will show that a
product code, for example, is rarely just a string, yet many developers
type it as String and thereby give up much of the usefulness of
polymorphism in object-oriented programming.
Although it may seem simply a matter of style (since an insidious
attribute of the mis-stringification "bug pattern" is that it causes no
errors at any time, not even at test time), avoiding mistyping to the
String class allows you to take full advantage of the Java
language's inherent feature of polymorphism. In practical terms, avoiding
this pattern is the best way to combat it, and the best way to avoid it is
to define a specific type for most elements in your code. By doing so, you
will ensure the reliability of your system by making sure that each class
type is appropriate for its job. This solution may add some overhead to
your system's performance, but the trade-off is a much more reliable
system.
We'll be discussing this pattern in the context of an enterprise
system, and in this article we'll examine one way to detect this bug: the
mis-overloading of a method. (We don't discuss repairing the bug much in
this article because simply avoiding string representations is the best,
and most common, way of solving the problem.)

Zooming in on String mistypes

I like to look at this problem of mistyping objects
to the String class much as I would a bug pattern. So,
let's refer to this misadventure as the mis-stringification bug pattern.
(For more on bug patterns, see Eric Allen's Diagnosing Java code
column available in Resources.)

A few definitions

For those of you who aren't acquainted with all of the terms used in this
article, these definitions should help get you up to speed:
- UML (Unified Modeling Language): A language for specifying,
visualizing, constructing, and documenting software-system artifacts; it
simplifies the software-design process by providing a plan, or
"blueprint," for construction.
- OCL (Object Constraint Language): The expression language
for the Unified Modeling Language (UML); it has the characteristics
of a pure expression language (cannot change anything in the model),
a modeling language (all implementation issues are out of scope and
cannot be expressed), and a formal language (all constructs have a
formally defined meaning).
- Type-safe, type safety: A UML model element (such as a
field or operation) assigned a type whose structure and behavior
most closely match the specification of the element.
- Stringification: Converting an object to a string.
- Polymorphism: In object-oriented programming, a
programming language's ability to process objects differently
depending on their type.
- Method overloading: The ability to define several methods that share
the same name, in one class or along a class hierarchy, but differ in the
number or types of their parameters.

Before we continue, permit me a quick word on the concept of type
safety. When a UML model element is type-safe, its structure and
behavior closely match its specification or, in other words, it has been
developed specifically for its purpose. An example may help: a "key"
parameter to an operation that searches an indexed list is not a string
but only an object that, like any other Java object, may be stringified by
calling the toString() : String method. The difference is
that strings may be substringed, concatenated, and so on; keys cannot.
They are keys, not strings.
In a type-safe application, the type of the color field, the return
type of the getColor() method, and the type of the color parameter of the
setColor() method are all Color, not String: getColor() returns the color
of the vehicle, not its string representation. Listing 1 shows the
mis-stringified version, in which all three are typed as String.
In the code examples in this article, we will be using a fictitious
enterprise system that encompasses the delivery and tracking functions of
the products of the automotive industry. We will be defining classes for
this system, including a Vehicle class (when we discuss
individual vehicle details) and a more generic Product class
(as an example of a generic enterprise product catalog).

Listing 1. The mis-stringified vehicle
/**
 * The vehicle
 **/
public class Vehicle {
    /**
     * Construct a vehicle
     **/
    public Vehicle() {
    }

    /**
     * The color of a vehicle
     **/
    private String color;

    /**
     * Get the color of a vehicle
     * @return The color of a vehicle
     **/
    public String getColor() {
        return this.color;
    }

    /**
     * Set the color of a vehicle
     * @param color A color
     **/
    public void setColor(String color) {
        this.color = color;
    }
}

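For comparison, a type-safe counterpart of the vehicle might look like the
following minimal sketch. The Color class is hypothetical: the article
names the type but does not define it, so it is reduced here to a single
name passed through a constructor, purely to keep the example
self-contained.

```java
// Hypothetical Color type: the article names it but does not define it.
final class Color {
    private final String name;

    Color(String name) {
        if (name == null) {
            throw new IllegalArgumentException("name must not be null");
        }
        this.name = name;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof Color && name.equals(((Color) other).name);
    }

    @Override
    public int hashCode() {
        return name.hashCode();
    }

    // The string representation is the only String this type hands out.
    @Override
    public String toString() {
        return name;
    }
}

// Type-safe variant of the Vehicle in Listing 1: the field, getter, and
// setter are typed as Color rather than String.
class Vehicle {
    private Color color;

    Color getColor() {
        return color;
    }

    void setColor(Color color) {
        this.color = color;
    }
}

class VehicleColorDemo {
    public static void main(String[] args) {
        Vehicle vehicle = new Vehicle();
        vehicle.setColor(new Color("red"));
        // vehicle.setColor("red"); // would not compile: a String is not a Color
        System.out.println(vehicle.getColor()); // prints "red"
    }
}
```

With this shape, a caller can no longer pass an arbitrary string where a
color is expected; the mistake is caught by the compiler.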
This bug pattern is found in many enterprise systems, including product
catalogs. For an example, look at the following code, which defines a
Product class:

Listing 2. The mis-stringified product
/**
 * The product
 **/
public class Product {
    /**
     * Construct a product
     **/
    public Product() {
    }

    /**
     * Construct a product
     * @param code A code
     **/
    public Product(String code) {
        this.setCode(code);
    }

    /**
     * The code of a product
     **/
    private String code;

    // Products are equal when their codes are equal.
    // Note: this assumes the code of this product has been set.
    public boolean equals(Object b) {
        if (!(b instanceof Product))
            return false;
        return this.getCode().equals(((Product)b).getCode());
    }

    protected void finalize() {
        this.setCode(null);
    }

    /**
     * Get the code of a product
     * @return The code of a product
     **/
    public String getCode() {
        return this.code;
    }

    public int hashCode() {
        String code = this.getCode(); // local reference; no copy is made
        if (code == null)
            return 0;
        return code.hashCode();
    }

    /**
     * Set the code of a product
     * @param code A code
     **/
    public void setCode(String code) {
        this.code = code;
    }

    // The string representation of a product is an empty string.
    public String toString() {
        return new String();
    }
}

A few comments about the design of the Product class in
the above code:
- The first constructor is empty and takes no arguments.
- The second constructor takes a code.
- The code is a part of the product (composition).
- The string representation of a product is an empty string.
- Products are compared for equality by their codes.
- The hash code of a product is the hash code of its code.
Let's take a look at some code examples for the last two items.

Products are compared for equality by their codes

Here is an OCL constraint to illustrate this point:

context Product::equals(b : Object) : boolean
    pre: b.oclIsKindOf(Product)
    post: result = self.getCode().equals(b.oclAsType(Product).getCode())

The hash code of a product is the hash code of its code

Here is an OCL constraint to illustrate this point:

context Product::hashCode() : int
    post:
        let code : String = self.getCode() in
            if code.oclIsUndefined() then
                result = 0
            else
                result = code.hashCode()
            endif

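Both constraints can be exercised from Java. The sketch below is not part
of the original listings: SimpleProduct is a hypothetical, trimmed-down
stand-in for the Product of Listing 2, and the code value "A-100" is
invented. It checks that products with equal codes are equal, share a
hash code, and therefore collapse to a single entry in a HashSet.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical, trimmed-down stand-in for the Product of Listing 2.
class SimpleProduct {
    private final String code;

    SimpleProduct(String code) {
        this.code = code;
    }

    String getCode() {
        return code;
    }

    @Override
    public boolean equals(Object b) {
        if (!(b instanceof SimpleProduct)) {
            return false;
        }
        // Same rule as the first constraint: compare by code.
        // Like Listing 2, this assumes the code is not null.
        return getCode().equals(((SimpleProduct) b).getCode());
    }

    @Override
    public int hashCode() {
        // Same rule as the second constraint: 0 for an undefined code.
        String code = getCode();
        return code == null ? 0 : code.hashCode();
    }
}

class ProductContractDemo {
    public static void main(String[] args) {
        SimpleProduct a = new SimpleProduct("A-100");
        SimpleProduct b = new SimpleProduct("A-100");
        Set<SimpleProduct> catalog = new HashSet<>();
        catalog.add(a);
        catalog.add(b);
        System.out.println(a.equals(b));                  // prints "true"
        System.out.println(a.hashCode() == b.hashCode()); // prints "true"
        System.out.println(catalog.size());               // prints "1"
    }
}
```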
Here's why an occurrence of the mis-stringification bug pattern can
hamstring your ability to produce good code -- product code is not a
string because it may require structure and behavior beyond what is
available in the String class.
(The OCL constraints in this article are based on this OCL 2.0
submission -- for example, "oclIsNew" doesn't exist in OCL 1.4. For more
on OCL, see Resources.)
Also, a product code may require specializations (such as a sales or an
engineering product code). And some products may be coded many times:
the engineering code may be used by the logistics systems; the logistics
code may be used by the sales systems; the engineering, logistics, and
sales codes may be used by the e-business systems. These usage
requirements are red flags to developers: they point toward developing a
new, specific type for each kind of product code.
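One way to read these requirements is as a small type hierarchy. The
following is a minimal sketch under assumptions that go beyond the
article: the names EngineeringProductCode, SalesProductCode, and
LogisticsSystem are hypothetical, the code values are invented, and the
base class takes its text through a constructor purely for brevity.

```java
// Hypothetical base type for every kind of product code.
abstract class ProductCode {
    private final String text;

    protected ProductCode(String text) {
        this.text = text;
    }

    @Override
    public String toString() {
        return text;
    }
}

final class EngineeringProductCode extends ProductCode {
    EngineeringProductCode(String text) {
        super(text);
    }
}

final class SalesProductCode extends ProductCode {
    SalesProductCode(String text) {
        super(text);
    }
}

// Hypothetical consumer that works with engineering codes only.
class LogisticsSystem {
    String track(EngineeringProductCode code) {
        return "tracking " + code;
    }
}

class ProductCodeKindsDemo {
    public static void main(String[] args) {
        LogisticsSystem logistics = new LogisticsSystem();
        // prints "tracking ENG-42"
        System.out.println(logistics.track(new EngineeringProductCode("ENG-42")));
        // logistics.track(new SalesProductCode("S-7")); // would not compile
        // logistics.track("ENG-42");                    // would not compile
    }
}
```

With String-typed codes, both commented-out calls would compile and the
mix-up would surface, if at all, only at run time.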
So why does this problem occur? And how do we fix or avoid it?

The problem and some solutions

The problem occurs because most programmers don't
employ type safety in object-oriented applications. (Remember, we think it
is worth the extra effort to define a new type specific to the
requirements rather than rely on existing types that may not match closely
enough and may cause problems.) The benefits of type-safe applications are
illustrated by the following vehicle problem, in which vehicles (cars and
trucks) are delivered by different ships. Take a look at the following
code:

/**
 * Deliver a vehicle
 * @param vehicle A vehicle
 **/
public void deliver(String vehicle) {
    // is it a car or a truck?
}

The deliver(vehicle : String) : void method implements the
delivery of strings (sad but true) instead of vehicles because any string
is assignable to the vehicle parameter. This really isn't a
solution to the problem.
The Vehicle type, like the one used in the next code
block, is a much better match to what we want the invoker to pass to this
method.

/**
 * Deliver a vehicle
 * @param vehicle A vehicle
 **/
public void deliver(Vehicle vehicle) {
    // who delivers a vehicle?
}

The deliver(vehicle:Vehicle) : void method implements the
delivery of vehicles but, because cars and trucks -- all the vehicles in
this context -- are not delivered by the same ship, it also isn't a
complete solution to the problem.
Look at this next bit of code:

/**
 * Deliver a car
 * @param car A car
 **/
public void deliverCar(String car) {
    // delivered by the first ship
}

/**
 * Deliver a truck
 * @param truck A truck
 **/
public void deliverTruck(String truck) {
    // delivered by the second ship
}

This isn't a good solution either, because the invokers of the
deliverCar(car : String) : void and deliverTruck(truck : String) : void
methods are forced to differentiate cars and trucks themselves and to
pick the right method name for each.
Finally, take a look at the following code:

/**
 * Deliver a car
 * @param car A car
 **/
public void deliver(Car car) {
    // delivered by the first ship
}

/**
 * Deliver a truck
 * @param truck A truck
 **/
public void deliver(Truck truck) {
    // delivered by the second ship
}

The invokers of the deliver(car : Car) : void and
deliver(truck : Truck) : void methods don't have to differentiate cars
and trucks themselves: method overloading allows a developer to offer the
same operation for several argument lists, and the compiler selects the
matching method from the declared type of the argument. This
approach is appropriate in object-oriented applications.
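To see the selection at work, here is a minimal, runnable sketch. The
Dock class and the returned ship names are hypothetical additions made
only so that the chosen overload becomes visible; the article's methods
return void.

```java
class Vehicle { }
class Car extends Vehicle { }
class Truck extends Vehicle { }

// Hypothetical class holding the overloaded deliver methods; the return
// values are invented so the chosen overload can be observed.
class Dock {
    String deliver(Car car) {
        return "first ship";
    }

    String deliver(Truck truck) {
        return "second ship";
    }
}

class OverloadDemo {
    public static void main(String[] args) {
        Dock dock = new Dock();
        System.out.println(dock.deliver(new Car()));   // prints "first ship"
        System.out.println(dock.deliver(new Truck())); // prints "second ship"

        Vehicle unknown = new Car();
        // dock.deliver(unknown); // would not compile: the overload is chosen
        //                        // from the declared type (Vehicle), and this
        //                        // sketch has no deliver(Vehicle) method
    }
}
```

Note that the choice is made at compile time from the declared type of
the argument, not from the run-time class of the object it refers to.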
So far, the code examples we've covered have used method overloading in
conjunction with a feature of the Java compiler -- method narrowing --
that searches for the best match (in the words of the Java Language
Specification, the most specific method) for an operation requested by
the invoker. This search is based not only on the name of the method but
also on the types of its parameters and the number of parameters. (For
more on method narrowing, see Resources.)
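The deliver(vehicle : Vehicle) : void method can replace both
the deliver(car : Car) : void and deliver(truck : Truck) : void methods
when cars and trucks are delivered by the same ship. The invokers of
these two methods need no source changes, because a Car or a Truck is
still accepted wherever a Vehicle is expected. They do need to be
recompiled, though: a compiled call site records the exact signature of
the method it was resolved against, and under the binary compatibility
rules of the Java Language Specification, deleting a method breaks
pre-existing binaries that reference it. This is the power of
polymorphism to be unleashed by Java applications.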
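A minimal sketch of the single-ship case follows. SharedDock and the
returned ship name are hypothetical, added only to make the behavior
observable; the point is that the calling source code is textually the
same as it would be with the two overloads.

```java
class Vehicle { }
class Car extends Vehicle { }
class Truck extends Vehicle { }

// Hypothetical class: one method now serves every kind of vehicle.
class SharedDock {
    String deliver(Vehicle vehicle) {
        return "shared ship";
    }
}

class SingleShipDemo {
    public static void main(String[] args) {
        SharedDock dock = new SharedDock();
        // A Car or a Truck is accepted wherever a Vehicle is expected.
        System.out.println(dock.deliver(new Car()));   // prints "shared ship"
        System.out.println(dock.deliver(new Truck())); // prints "shared ship"
    }
}
```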

Prevention methods

The "golden rule" for avoiding problems with stringification is this:
String representations of objects should be the only strings in
type-safe applications.
The following code and the UML class diagram illustrate a clear design
of a type-safe object-oriented application.

A look at a type-safe product

The following block is a well-designed, type-safe product.

Listing 3. The type-safe product
/**
 * The product
 **/
public class Product {
    /**
     * Construct a product
     **/
    public Product() {
    }

    /**
     * Construct a product
     * @param code A code
     **/
    public Product(ProductCode code) {
        this.setCode(code);
    }

    /**
     * The code of a product
     **/
    private ProductCode code;

    // Products are equal when their codes are equal.
    // Note: this assumes the code of this product has been set.
    public boolean equals(Object b) {
        if (!(b instanceof Product))
            return false;
        return this.getCode().equals(((Product)b).getCode());
    }

    protected void finalize() {
        this.setCode(null);
    }

    /**
     * Get the code of a product
     * @return The code of a product
     **/
    public ProductCode getCode() {
        return this.code;
    }

    public int hashCode() {
        ProductCode code = this.getCode(); // local reference; no copy is made
        if (code == null)
            return 0;
        return code.hashCode();
    }

    /**
     * Set the code of a product
     * @param code A code
     **/
    public void setCode(ProductCode code) {
        this.code = code;
    }

    // The string representation of a product is an empty string.
    public String toString() {
        return new String();
    }
}

Figure 1. The UML class diagram of the type-safe product

A look at the type-safe product code

In this section, we'll examine the product code and the
ProductCode class.

Listing 4. The product code
/**
 * The product code
 **/
public class ProductCode {
    /**
     * Construct a product code
     **/
    public ProductCode() {
    }

    public boolean equals(Object b) {
        if (!(b instanceof ProductCode))
            return false;
        return this.toString().equals(b.toString());
    }

    public int hashCode() {
        return this.toString().hashCode();
    }

    public String toString() {
        return new String();
    }
}

A quick note: At this point, some developers would question the
wisdom of having toString() return a new String with every call. I've
checked this approach with other developers, including Joshua Bloch, the
author of Effective Java Programming Language Guide, and it seems to be
the best solution at this time. Calling intern() to return the pooled
instance instead would be awkward, and because the variable that holds
the result of this method tends to be short-lived, performance is assumed
not to be an issue.
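The difference between a new String and the pooled instance can be
observed directly. This small sketch is an illustration added here, not
code from the article; it uses only java.lang.String.

```java
class StringIdentityDemo {
    public static void main(String[] args) {
        String first = new String();
        String second = new String();

        // Both strings are empty, so they are equal by content.
        System.out.println(first.equals(second));   // prints "true"
        // Each "new String()" creates a distinct object.
        System.out.println(first == second);        // prints "false"
        // intern() returns the pooled instance, which the literal "" also uses.
        System.out.println(first.intern() == "");   // prints "true"
    }
}
```

Callers that compare string representations with equals(), as the
ProductCode class does, are unaffected by which instance they receive.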
A few comments about the design of the ProductCode
class:
- The constructor is empty and takes no parameters.
- Product codes are compared for equality by their string
representations.
- The hash code of a product code is the hash code of its
string representation.
- The string representation of a product code is an empty string.
Let's look a bit closer at the last three items.

Product codes are compared for equality by their string representations

Here is an OCL constraint to illustrate this point:

context ProductCode::equals(b : Object) : boolean
    pre: b.oclIsKindOf(ProductCode)
    post: result = self.toString().equals(b.toString())

The hash code of a product code is the hash code of its string
representation

Here is an OCL constraint to illustrate this point:

context ProductCode::hashCode() : int
    post: result = self.toString().hashCode()

Product code string representations are empty strings

Here is an OCL constraint to illustrate this point:

context ProductCode::toString() : String
    post: result.oclIsNew()


Examining interface implementation

Some interfaces may be easily implemented by
subclasses of the ProductCode class:
- Cloneable
- Comparable
- Serializable
Let's illuminate these subclass interface implementations with code
examples. We'll begin with Cloneable:

public Object clone() throws CloneNotSupportedException {
    return super.clone();
}

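Placed in a complete subclass, the method above might be used as in the
following sketch. CloneableProductCode is a hypothetical name, and the
ProductCode shown is a minimal stand-in for the class in Listing 4.

```java
// Minimal stand-in for the ProductCode of Listing 4.
class ProductCode {
    @Override
    public String toString() {
        return new String();
    }
}

// Hypothetical subclass that opts in to cloning.
class CloneableProductCode extends ProductCode implements Cloneable {
    @Override
    public Object clone() throws CloneNotSupportedException {
        // Object.clone() makes a field-by-field copy; it succeeds only
        // because this class implements Cloneable.
        return super.clone();
    }
}

class CloneDemo {
    public static void main(String[] args) throws CloneNotSupportedException {
        CloneableProductCode original = new CloneableProductCode();
        Object copy = original.clone();
        System.out.println(copy != original);                       // prints "true"
        System.out.println(copy.getClass() == original.getClass()); // prints "true"
    }
}
```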
And here's an example of an interface implementation of
Comparable :

public int compareTo(Object b) {
    if (!(b instanceof ProductCode))
        throw new ClassCastException();
    return toString().compareTo(b.toString());
}

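Ordering by string representation is what makes product codes sortable.
The sketch below shows this with the generic form Comparable<ProductCode>,
which postdates the article (an assumption of a Java 5 or later compiler);
the constructor taking the text and the sample codes are hypothetical
simplifications.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Stand-in for ProductCode that takes its text through a constructor
// (an assumption made to keep the sketch short).
class ProductCode implements Comparable<ProductCode> {
    private final String text;

    ProductCode(String text) {
        this.text = text;
    }

    @Override
    public int compareTo(ProductCode other) {
        // Same rule as the article's version: order by string representation.
        return toString().compareTo(other.toString());
    }

    @Override
    public String toString() {
        return text;
    }
}

class SortDemo {
    public static void main(String[] args) {
        List<ProductCode> codes = new ArrayList<>();
        codes.add(new ProductCode("B-2"));
        codes.add(new ProductCode("A-1"));
        codes.add(new ProductCode("C-3"));
        Collections.sort(codes);
        System.out.println(codes); // prints "[A-1, B-2, C-3]"
    }
}
```

With the generic interface, the compiler rejects a comparison against a
non-ProductCode argument, so the instanceof check and the explicit
ClassCastException are no longer needed.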
Following is a demonstration of the string representation of the
product codes being changed by a subclass of the ProductCode
class:

ProductCode pc = new ProductCode() {
    public String toString() {
        return "9BGRD08Z01G167984";
    }
};

Notice that the syntax isn't particularly beautiful in the last
example.
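A named subclass is one possible alternative to the anonymous class. The
sketch below is an assumption, not the article's design: VehicleCode and
its constructor are hypothetical names, and ProductCode is a minimal
stand-in for the class in Listing 4.

```java
// Minimal stand-in for the ProductCode of Listing 4.
class ProductCode {
    @Override
    public boolean equals(Object b) {
        if (!(b instanceof ProductCode)) {
            return false;
        }
        return toString().equals(b.toString());
    }

    @Override
    public int hashCode() {
        return toString().hashCode();
    }

    @Override
    public String toString() {
        return new String();
    }
}

// Hypothetical named subclass; the class name and constructor are
// illustrative and not part of the article.
class VehicleCode extends ProductCode {
    private final String text;

    VehicleCode(String text) {
        this.text = text;
    }

    @Override
    public String toString() {
        return text;
    }
}

class NamedSubclassDemo {
    public static void main(String[] args) {
        ProductCode pc = new VehicleCode("9BGRD08Z01G167984");
        System.out.println(pc); // prints "9BGRD08Z01G167984"
        System.out.println(pc.equals(new VehicleCode("9BGRD08Z01G167984"))); // prints "true"
    }
}
```

The inherited equals() and hashCode() keep working because both are
defined in terms of toString().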

At the end of your string

The String class is a final class. It
may not be extended for a very good reason: the class itself already
provides all the behavior used by Java applications. Inheriting the
ProductCode class from String (as some
developers would like to do) would be as awkward as using the string
representation of a product code instead of the product code itself to
compose a product.
Employing type safety to avoid the mis-stringification bug pattern will
take extra time (to create new, more specific types), will likely
not increase your system's performance, but will always
increase your system's reliability.
The benefits of polymorphism go hand in hand with the practice of type
safety, and mis-stringification is one more reason to care about type
safety and to understand that it is not just a matter of style.
I'd like to thank the authors of the OCL spec, Jos Warmer and Anneke
Kleppe of Klasse Objecten, for their comments and the support they offered
in crafting this article.
Resources
- For more on bug patterns, see the Diagnosing
Java code columns by Eric Allen.
- A good source of information on the Unified Modeling Language and
its expression language, OCL, is Granville Miller's column, Java
modeling .
- Two more excellent resources on OCL and UML are the UML 1.4
specification and the OCL 2.0 submission from Boldsoft, Rational
Software Corporation, IONA, and Adaptive Ltd.
- A great roundup on OCL -- definition, history of development, links
to other resources -- can be found on the IBM OCL
page.
- A useful tool to understand OCL is the IBM
OCL Parser 0.3.
- A good resource on the useful Java feature method narrowing
can be found in this Sun
technical article.
- Find other Java-related resources on the developerWorks
Java technology zone.

About the author

Fernando Ribeiro is a senior systems analyst and
programmer in Brazil. Fernando has been using C++, Java, and UML for
six years in several industries, and recently was a member of a JCP
expert group engaged in internationalizing J2EE applications for a
major global IT service company. You can contact him at fribeiro@bol.com.br.