Java generics without the pain, Part 2
Some limitations of generics in the JSR-14 prototype compiler
Eric E. Allen (eallen@cs.rice.edu), Ph.D. candidate, Java programming languages team, Rice University
11 March 2003
This month Eric Allen continues the
discussion of generic types in JSR-14 and Tiger. He outlines several
limitations imposed in those Java extensions and explains how the
limitations are necessitated by the implementation strategy used by the
compilers of these extended languages.
J2SE 1.5 -- code-named "Tiger" -- is scheduled for release near the end
of 2003 and will include generic types (as previewed in the JSR-14
prototype compiler, available for download right now). In Part
1, we discussed the basics of generic types and why they will be an
important and much needed addition to the Java language. We also touched
upon how the incarnation of generic types scheduled for Tiger includes
several "kinks" that limit the contexts in which generic types can be
used.
To help new programmers in their efforts to use generics effectively,
I'll elaborate on exactly which usages of generic types are prohibited in
Tiger and JSR-14, and I'll explain why the limitations are a necessary
consequence of the implementation strategy used by JSR-14 (and
consequently, Tiger) to compatibly implement generic types on the JVM.
Limitations on generic types
Let's start by reviewing the limitations on the use of
generic types in Tiger and JSR-14:
- Enclosing type parameters should not be referred to inside static members.
- Generic type parameters can't be instantiated with primitive types.
- "Naked" type parameters can't be used in casts or instanceof operations.
- "Naked" type parameters can't be used in new operations.
- "Naked" type parameters can't be used in the implements or extends clauses of class definitions.
Why do these limitations exist? Because of the mechanism used by Tiger
and JSR-14 to implement generic types on the JVM. Because the JVM doesn't
provide any support for generic types, these compilers perform a "trick"
to make it seem like support for generic types exists -- they type check
all the code with the generic type information, but then "erase" all
generic types and produce class files that include nothing but ordinary
types.
For example, a generic type such as List<T> is
erased to simply List. "Naked" type parameters -- type
parameters that appear alone rather than inside of a type, such as type
parameter T in class List<T> -- are simply
erased to their upper bounds (in the case of T, that would be
Object).
This technique is extremely powerful; we get almost all of the
increased precision of generic types, but we maintain compatibility with
the JVM. In fact, we can even use non-generic legacy classes such as
List with their generic counterparts
(List<T>) interchangeably; both look the same at
runtime.
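
To make the effect of erasure concrete, here is a minimal sketch written for a released Java compiler (Java 5 or later) rather than the JSR-14 prototype. The class name ErasureDemo and its two helper methods are hypothetical names chosen for illustration. It checks that two different instantiations share one runtime class, and that a raw, legacy-style reference can be used on the same object as its generic counterpart.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // Returns true when both instantiations share a single runtime class.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<String>();
        List<Integer> numbers = new ArrayList<Integer>();
        return strings.getClass() == numbers.getClass();
    }

    // A raw (legacy) reference and a generic reference can point at the same object.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static int sizeThroughRawReference() {
        List<String> strings = new ArrayList<String>();
        strings.add("test");
        List legacy = strings;   // raw type: no type argument
        legacy.add("more");      // unchecked call, but harmless here
        return strings.size();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());        // prints true
        System.out.println(sizeThroughRawReference()); // prints 2
    }
}
```

The raw reference compiles only with an "unchecked" warning (suppressed here), which is the compiler's way of saying that it can no longer vouch for the element types.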
Unfortunately, as the above limitations show, there is a price for this
power. Erasing in this manner introduces holes in the type system that
limit how we can safely use generic types.
To help clarify each limitation, we'll review an example of where it
can occur. In this article, we'll discuss the first three limitations. The
issues with the last two are so intricate that they need a more in-depth
treatment, which we'll save for the next article.
Enclosing type parameters in static members
Referring to enclosing type parameters inside static
methods and static nested classes is prohibited outright by the compiler.
So, for instance, the following code is illegal in Tiger:
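    class C<T> {
        static void m() {
            T t;        // error: T cannot be referenced from a static method
        }
        static class D {
            C<T> t;     // error: T cannot be referenced from a static class
        }
    }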
When this code is compiled, it generates two errors:
- An error for the illegal reference to T inside static method m
- An error for the illegal reference to T inside static class D
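
For contrast, a static method may declare a type parameter of its own, since that parameter does not depend on any instantiation of the enclosing class. The following is a minimal sketch in released Java syntax; Holder and its factory method of are hypothetical names, not part of the article's example.

```java
class Holder<T> {
    private final T value;

    Holder(T value) { this.value = value; }

    T get() { return value; }

    // Legal: U is declared by the method itself, so it does not refer
    // to the enclosing class's type parameter T.
    static <U> Holder<U> of(U value) {
        return new Holder<U>(value);
    }

    public static void main(String[] args) {
        Holder<String> h = Holder.of("test");
        System.out.println(h.get());   // prints test
    }
}
```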
When defining static fields, things get more complicated. In both
JSR-14 and Tiger, static fields in a generic class are shared across all
instantiations of the class. In versions 1.0 and 1.2 of the JSR-14 compiler, if
you refer to a type parameter in a static field declaration, the compiler
doesn't complain, although it should. Because the field is shared, such a declaration can
easily lead to weird errors at runtime, such as a
ClassCastException in code that doesn't include a cast.
For example, the following program will compile without warning under
these versions of JSR-14:
    class C<T> {
        static T member;                        // shared by every instantiation of C
        C(T t) { member = t; }
        T getMember() { return member; }
        public static void main(String[] args) {
            C<String> c = new C<String>("test");
            System.out.println(c.getMember().toString());
            new C<Integer>(new Integer(1));     // overwrites the shared field
            System.out.println(c.getMember().toString());
        }
    }
Notice that every time an instance of class C is
allocated, the static field member is reset. What's more, the
type of the object it is set to is dependent on the type of the
instantiation of C! In the main method provided,
the first instance, c, is of type
C<String>. But the second is of type
C<Integer>. Whenever the shared static field
member is accessed from c, it is assumed that
the type of member is String. However, after the
second instance of type C<Integer> is allocated,
member is of type Integer.
The result of running C's main method might
surprise you -- it'll issue a ClassCastException! How can
that be, since the source code doesn't include any casts? It turns out
that the compiler actually inserts casts into the code during compilation
to account for the fact that type erasure reduces the precision of the
types of certain expressions. These casts are supposed to succeed,
but in this case they don't.
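
One way to picture the inserted casts is to erase class C by hand. The sketch below is our own approximation in plain, non-generic Java, not actual compiler output; ErasedC, readAsString, and secondReadFails are hypothetical names. The type parameter T becomes Object, and the cast the compiler would add at the call site of a C<String> is written out explicitly.

```java
class ErasedC {
    static Object member;                 // T erased to its bound, Object

    ErasedC(Object t) { member = t; }

    Object getMember() { return member; }

    // Stands in for c.getMember().toString() where c has static type C<String>:
    // the cast to String is the one the compiler would insert.
    static String readAsString(ErasedC c) {
        return ((String) c.getMember()).toString();
    }

    // Returns true if the second read fails with a ClassCastException.
    static boolean secondReadFails() {
        ErasedC c = new ErasedC("test");
        readAsString(c);                  // fine: member holds a String
        new ErasedC(Integer.valueOf(1));  // overwrites the shared static field
        try {
            readAsString(c);              // member now holds an Integer
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(secondReadFails());   // prints true
    }
}
```

Seen this way, the exception is no longer mysterious: the cast exists, it just doesn't appear in the generic source.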
This particular "feature" of JSR-14 1.0 and 1.2 should be considered a
bug. It breaks the soundness of the type system; in other words, it breaks the
fundamental contract that a type system should uphold with the programmer.
It would be much better to simply prevent the programmer from referring to
generic types in static fields, as is done in the case of static methods
and classes.
Note that the problem with allowing such potentially explosive code is
not that programmers could intentionally override the type system
in their own code. The problem is that programmers could accidentally
write such code (say by mistakenly including a static modifier in a field
declaration, due to copy-and-pasting).
The type checker is supposed to help a programmer recover from exactly
these sorts of mistakes, but in the case of static fields, the type system
could actually help to confuse the programmer. How are we supposed to
diagnose a bug like this one when the only error signaled is a
ClassCastException in code that makes no use of casts? The
situation is worse for a programmer who isn't aware of the implementation
scheme used for generic types in Tiger and just assumes that the type
system acts reasonably. In this case, it doesn't.
Luckily, the latest version of JSR-14 (1.3) outlaws the use of type
parameters in static fields. Therefore, we can reasonably expect that they
will be outlawed in static fields in Tiger as well.
Generic type parameters and primitive types
This restriction doesn't have the same
potential pitfalls as what we just discussed, but it can make your code
pretty wordy. For example, in the generic version of
java.util.Hashtable there are two type parameters: one for
the type of Keys and one for the type of Values.
So, if we want a Hashtable mapping Strings to
Strings, we can specify the new instance with the expression
new Hashtable<String, String>(). However, if we want a
Hashtable that maps Strings to
ints, we have no choice but to create an instance of
Hashtable<String, Integer> and wrap all
int values in Integers.
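
The wrapping described above might look like the following sketch. WordCounts and its methods are hypothetical names; the wrapping and unwrapping are written out by hand with Integer.valueOf and intValue to show the extra verbosity.

```java
import java.util.Hashtable;

class WordCounts {
    // Hashtable<String, int> is not allowed, so the values are Integer objects.
    private final Hashtable<String, Integer> counts =
        new Hashtable<String, Integer>();

    void increment(String word) {
        Integer boxed = counts.get(word);
        int current = (boxed == null) ? 0 : boxed.intValue();  // unwrap by hand
        counts.put(word, Integer.valueOf(current + 1));         // wrap by hand
    }

    int count(String word) {
        Integer boxed = counts.get(word);
        return (boxed == null) ? 0 : boxed.intValue();
    }

    public static void main(String[] args) {
        WordCounts wc = new WordCounts();
        wc.increment("tiger");
        wc.increment("tiger");
        System.out.println(wc.count("tiger"));   // prints 2
    }
}
```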
Again, this aspect of Tiger follows naturally from the implementation
scheme used. Since type parameters are erased to their bounds and the
bounds can't be primitive types, there is no way that an instantiation
with primitive types would make sense once the types are erased.
"Naked" parameters in casts or instanceof operations
Recall that by "naked" type
parameters, we mean type parameters that lexically occur alone, not as a
syntactic subcomponent of a larger type. For instance,
C<T> is not a naked type parameter, but (in the body of
C), T is.
If you use casts or instanceof operations on naked type
parameters inside your code, the compiler will issue what is called an
"unchecked" warning. For example, the following code will generate the
warning "Warning: unchecked cast to type T":
    import java.util.Hashtable;

    interface Registry {
        public void register(Object o);
    }

    class C<T> implements Registry {
        int counter = 0;
        Hashtable<Integer, T> values;
        public C() {
            values = new Hashtable<Integer, T>();
        }
        public void register(Object o) {
            values.put(new Integer(counter), (T)o);   // unchecked cast to type T
            counter++;
        }
    }
You should take such warnings seriously because they indicate that your
code could behave very strangely at runtime. In fact, they can make it
extraordinarily difficult to diagnose bugs. In the previous code, we'd
expect that if register("test") were called on an instance of
C<JFrame>, a ClassCastException would be
signaled. But it won't be; the computation will continue as if the cast
had succeeded, signaling an error further into the computation or worse,
completing with corrupt data but no outward signs of trouble. Similarly,
instanceof checks on naked type parameters will result in an
"unchecked" warning at compile time and the check will not occur as
expected at runtime.
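
The delayed failure can be reproduced with a small variation on the class above. This is a sketch under a few assumptions: it is written in released Java syntax, it uses Integer instead of JFrame so that it runs without a display, and UncheckedRegistry, get, and demo are hypothetical names added so that the stored value can be read back.

```java
import java.util.Hashtable;

class UncheckedRegistry<T> {
    private int counter = 0;
    private final Hashtable<Integer, T> values = new Hashtable<Integer, T>();

    @SuppressWarnings("unchecked")
    void register(Object o) {
        values.put(Integer.valueOf(counter), (T) o);  // unchecked: no runtime test against T
        counter++;
    }

    T get(int index) {
        return values.get(Integer.valueOf(index));
    }

    // Reports where the ClassCastException actually surfaces.
    static String demo() {
        UncheckedRegistry<Integer> registry = new UncheckedRegistry<Integer>();
        registry.register("test");            // wrong type, yet nothing is thrown here
        try {
            Integer first = registry.get(0);  // the compiler-inserted cast fails here
            return "no exception, got " + first;
        } catch (ClassCastException e) {
            return "ClassCastException at the read, not at register";
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In a larger program, the read could sit in an entirely different class from the faulty register call, which is what makes this kind of bug so hard to trace.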
A double-edged sword
So, what's going on here? Because Tiger relies on type
erasure, the naked type parameters in casts and instanceof
tests are "erased" to their upper bounds (in the earlier case, that'll be
type Object). So, casts to type parameters will turn into
casts to the upper bound of the parameter.
Similarly, instanceof will check that the operand is an
instance of the bound of the parameter. That's not what we
intended at all, and if it were, we would have simply cast to the bound
explicitly. So, in general, avoid using casts and instanceof
checks on type parameters.
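
The effect of erasing a cast to the parameter's bound is easier to see when the bound is something other than Object. In this sketch (released Java syntax; BoundedCell and its methods are hypothetical names), T is bounded by Number, so the cast (T) o behaves at runtime like a cast to Number: it lets a Double into a cell declared to hold Integers, and rejects only values that aren't Numbers at all.

```java
class BoundedCell<T extends Number> {
    private T value;

    @SuppressWarnings("unchecked")
    void store(Object o) {
        value = (T) o;   // at runtime this is a cast to Number, the bound of T
    }

    // A Double slips into a BoundedCell<Integer> because a Double is a Number.
    static boolean acceptsWrongNumberType() {
        BoundedCell<Integer> cell = new BoundedCell<Integer>();
        try {
            cell.store(Double.valueOf(1.5));
            return true;
        } catch (ClassCastException e) {
            return false;
        }
    }

    // A String is rejected, but only because it is not a Number.
    static boolean rejectsNonNumber() {
        BoundedCell<Integer> cell = new BoundedCell<Integer>();
        try {
            cell.store("test");
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(acceptsWrongNumberType());   // prints true
        System.out.println(rejectsNonNumber());         // prints true
    }
}
```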
Nevertheless, you will sometimes have to rely on casts to type
parameters in order to get your code to compile. When that happens, just
remember that, in that part of the code, there is no safety from type
checking -- you're on your own.
Although generic types can be a powerful weapon for producing robust
code, we've shown how their misuse can lead to code that is not only less
robust but also extraordinarily hard to diagnose and fix. Next time, we'll
cover the last two limitations of generic types in Tiger and discuss some
of the issues that necessarily come up in any attempt to include them in a
generic Java type system.
Resources
- Get a jump on generics in Java by downloading the JSR-14
prototype compiler; it includes the sources for a prototype compiler
written in the extended language, a jar file containing the class files
for running and bootstrapping the compiler, and a jar file containing
stubs for the collection classes.
- Eric Allen has a new book on the subject of bug patterns, Bug Patterns
in Java, which presents a methodology for diagnosing and
debugging computer programs by focusing on bug patterns, Extreme
Programming methods, and ways to craft powerful testable and extensible
software.
- Martin Fowler's Web site
contains much useful information about effective refactoring.
- Examine seven principles to build a base for code design with
testing in mind in this article, "Diagnosing
Java code: Designing 'testable' applications"
(developerWorks, September 2001).
- Explore the developerWorks repository of Eric Allen's columns --
from bug patterns to testability to design strategies -- in the Diagnosing
Java code columns roundup.
- Follow the discussion of adding generic types to Java by reading the
Java Community Process proposal, JSR-14.
- Keith Turner offers another look at this topic with his article "Catching
more errors at compile time with Generic Java"
(developerWorks, March 2001).
- "Automatic
Code Generation from Design Patterns" from IBM Research describes
the architecture and implementation of a tool that automates the
implementation of design patterns.
- Also, these two Diagnosing Java code articles can bolster
your knowledge of generic types and the Java type system: "Killer
combo -- Mixins, Jam, and unit testing" (December 2002) and "The
case for static types" (June 2002).
- Find hundreds more Java resources on the developerWorks
Java technology zone.
About the author
Eric Allen possesses a broad range of hands-on
knowledge of technology and the computer industry. With a B.S. in
computer science and mathematics from Cornell University and an M.S.
in computer science from Rice University, Eric is currently a Ph.D.
candidate in the Java programming languages team at Rice. Eric's
research concerns the development of semantic models and static
analysis tools for the Java language at the source and bytecode
levels. He is also concerned with the verification of security
protocols through semantic formalisms and type checking. Eric is
a project manager for and a founding member of the DrJava project,
an open-source Java IDE designed for beginners; he is also the lead
developer of the university's experimental compiler for the NextGen
programming language, an extension of the Java language with added
experimental features. Eric has moderated several Java forums for
the online magazine JavaWorld. In addition to these
activities, Eric teaches software engineering to Rice University's
computer science undergraduates. You can contact Eric at eallen@cs.rice.edu.