The decision to require that classes implement the java.io.Serializable interface was not made lightly. The design called for a balance between the needs of developers and the needs of the system to be able to provide a predictable and safe mechanism. The most difficult design constraint to satisfy was the safety and security of Java classes.
If classes were to be marked as being serializable, the design team worried that a developer, out of forgetfulness, laziness, or ignorance, might not declare a class as being Serializable and thereby make that class useless for RMI or for purposes of persistence. We worried that the requirement would place on a developer the burden of knowing how a class was to be used by others in the future, an essentially unknowable condition. Indeed, our preliminary design, as reflected in the alpha API, concluded that the default case for a class ought to be that the objects in the class be serializable. We changed our design only after considerations of security and correctness convinced us that the default had to be that an object not be serialized.
Security restrictions
The first consideration that caused us to change the default behavior of objects had to do with security, and in particular with the privacy of fields declared to be private, package protected, or protected. The Java runtime restricts access to such fields for either read or write to a subset of the objects within the runtime.
No such restriction can be made on an object once it has been serialized; the stream of bytes that is the result of object serialization can be read and altered by any object that has access to that stream. This allows any object access to the state of a serialized object, which can violate the privacy guarantees users of the language expect. Further, the bytes in the stream can be altered in arbitrary ways, allowing the reconstruction of an object that was never created within the protections of a Java environment. There are cases in which the recreation of such an object could compromise not only the privacy guarantees expected by users of the Java environment, but the integrity of the environment itself.
These violations cannot be guarded against, since the whole idea of serialization is to allow an object to be converted into a form that can be moved outside of the Java environment (and therefore outside of the privacy and integrity guarantees of that environment) and then be brought back into the environment. Requiring objects to be declared serializable does mean that the class designer must make an active decision to allow the possibility of such a breach in privacy or integrity. A developer who does not know about serialization should not be open to compromise because of this lack of knowledge. In addition, we would hope that the developer who declares a class to be Serializable does so after some thought about the possible consequences of that declaration.
Note that this sort of security problem is not one that can be dealt with by the mechanism of a security manager. Since serialization is intended to allow the transport of an object from one virtual machine to some other (either over space, as it is used in RMI, or over time, as when the stream is saved to a file), the mechanisms used for security need to be independent of the runtime environment of any particular virtual machine. We wanted to avoid as much as possible the problem of being able to serialize an object in one virtual machine and not being able to deserialize that object in some other virtual machine. Since the security manager is part of the runtime environment, using the security manager for serialization would have violated this requirement.
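The loss of field privacy described above can be seen in a few lines. The following is a minimal sketch, not part of the original design discussion; the Account class, its secret field, and the PrivateFieldLeak helper are hypothetical names chosen for illustration. It serializes an object to a byte array and searches the raw bytes for the value of a private field.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;

// Hypothetical class: inside the VM, only Account can read 'secret'.
class Account implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String secret;

    Account(String secret) {
        this.secret = secret;
    }
}

class PrivateFieldLeak {
    // Serializes obj and reports whether 'text' appears in the raw stream bytes.
    static boolean streamContains(Object obj, String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        // ISO-8859-1 maps each byte to exactly one char, so an ASCII search is safe.
        String raw = new String(bytes.toByteArray(), StandardCharsets.ISO_8859_1);
        return raw.contains(text);
    }

    public static void main(String[] args) throws IOException {
        // Anyone holding the stream can see the private value.
        System.out.println(streamContains(new Account("hunter2"), "hunter2")); // prints "true"
    }
}
```

Any code that can read the stream can recover the value, and code that can write the stream can alter it before the object is reconstructed.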
Forcing a conscious decision
While security concerns were the first reason for considering the design change, a reason that we feel is at least as convincing is that serialization should only be added to a class after some design consideration. It is far too easy to design a class that falls apart under serialization and reconstruction. By requiring a class designer to declare support for the Serializable interface, we hoped that the designer would also give some thought to the process of serializing that class.
Examples are easy to cite. Many classes deal with information that only makes sense in the context of the runtime in which the particular object exists; examples of such information include file handles, open socket connections, security information, etc. Such data can be dealt with easily by simply declaring the fields as transient, but such a declaration is only necessary if the object is going to be serialized. A novice (or forgetful, or hurried) programmer might neglect to mark fields as transient in much the same way he or she might neglect to mark the class as implementing the Serializable interface. Such a case should not lead to incorrect behavior; the way to avoid this is to not serialize objects not marked as implementing Serializable.
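As an illustration of the transient declaration mentioned above, here is a minimal sketch. The LogSession class and its fields are hypothetical; the open stream stands in for any runtime-only resource such as a file handle or socket.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Serializable;

// Hypothetical class mixing plain data with a runtime-only resource.
class LogSession implements Serializable {
    private static final long serialVersionUID = 1L;

    final String path;              // plain data, written to the stream
    transient OutputStream handle;  // only meaningful in this VM, so it is skipped

    LogSession(String path, OutputStream handle) {
        this.path = path;
        this.handle = handle;
    }
}

class TransientDemo {
    // Serializes the session to memory and reads a copy back.
    static LogSession roundTrip(LogSession session) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(session);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (LogSession) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        LogSession copy = roundTrip(new LogSession("app.log", new ByteArrayOutputStream()));
        // The transient field comes back as null and must be reopened by the application.
        System.out.println(copy.path + " handle=" + copy.handle);
    }
}
```

Without the transient modifier, writing this object would fail, because the stream held in handle is not itself Serializable.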
Another example of this sort is the "simple" object that is the root of a graph that spans a large number of objects. Serializing such an object could result in serializing lots of others, since serialization works over an entire graph. Doing something like this should be a conscious decision, not one that happens by default.
The need for this sort of thought was brought home to us in the group when we were going through the base Java class libraries marking the system classes as Serializable (where appropriate). We had originally thought that this would be a fairly simple process, and that most of the system classes could just be marked as implementing Serializable and then use the default implementation with no other changes. What we found was that this was far less often the case than we had suspected. In a large number of the classes, careful thought had to be given to whether or not a field should be marked as transient or whether it made sense to serialize the class at all.
Of course, there is no way to guarantee that a programmer or class designer is actually going to think about these issues when marking a class as Serializable. However, by requiring the class to declare itself as implementing the Serializable interface we do require that some thought be given by the programmer. Having serialization be the default state of an object would mean that lack of thought could cause bad effects in a program, something that the overall design of Java has attempted to avoid.
The ObjectOutputStream class keeps track of each object it serializes and sends only the handle if the object is written into the stream a subsequent time. This is the way it deals with graphs of objects. The corresponding ObjectInputStream keeps track of all of the objects it has created and their handles so when the handle is seen again it can return the same object. Both output and input streams keep this state until they are freed.
Alternatively, the ObjectOutputStream class implements a reset method that discards the memory of having sent an object, so sending an object again will make a copy.
The ObjectOutputStream maintains a table mapping objects written into
the stream to a handle. The first time an object is written to a stream,
its contents are written into the stream; subsequent writes of the object
result in a handle to the object being written into the stream. This table
maintains references to objects that might otherwise be unreachable by
an application, which can result in the application unexpectedly running out
of memory. A call to the ObjectOutputStream.reset() method resets the object/handle
table to its initial state, allowing all previously written objects to
be eligible for garbage collection.
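The effect of the handle table and of reset() can be observed from the reading side. The sketch below is illustrative only; the ResetDemo class name and its method are assumptions. It writes the same array twice, with and without a reset in between, and checks whether the reader gets back one shared object or two copies.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

class ResetDemo {
    // Writes one array twice and reports whether both reads yield the same object.
    static boolean sameObjectAfterReadBack(boolean resetBetweenWrites)
            throws IOException, ClassNotFoundException {
        int[] data = {1, 2, 3};
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(data);      // first write: full contents
            if (resetBetweenWrites) {
                out.reset();            // forget everything written so far
            }
            out.writeObject(data);      // handle only, unless reset() was called
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            Object first = in.readObject();
            Object second = in.readObject();
            return first == second;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sameObjectAfterReadBack(false)); // prints "true"
        System.out.println(sameObjectAfterReadBack(true));  // prints "false"
    }
}
```

A long-lived stream that writes many distinct objects would call reset() periodically for the memory reason given above, at the cost of losing sharing across the reset point.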
Object Serialization does not contain any encryption/decryption in itself. It writes to and reads from Java Streams, so it can be coupled with any available encryption technology. Object serialization can be used in many different ways, from simple persistence (writing to and reading from files) to RMI communication across hosts.
RMI's use of serialization leaves encryption and decryption to the lower network transport. We expect that when a secure channel is needed the network connections will be made using SSL or the like.
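One way to couple serialization with encryption is to wrap the underlying stream in a cipher stream. The following is a sketch under stated assumptions: it uses the javax.crypto stream classes with AES in CBC mode, a choice made here for illustration rather than anything prescribed by serialization or RMI, and the EncryptedObjectDemo class is a hypothetical name. CBC on its own gives no integrity protection, so real use would also need authentication and key management.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.CipherOutputStream;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

class EncryptedObjectDemo {
    // Assumption for this sketch: AES/CBC with PKCS5 padding.
    static final String TRANSFORMATION = "AES/CBC/PKCS5Padding";

    // Serializes obj through a cipher stream, so only ciphertext reaches the byte array.
    static byte[] encrypt(Serializable obj, SecretKey key, byte[] iv) throws Exception {
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new CipherOutputStream(bytes, cipher))) {
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    // Decrypts the bytes on the fly while ObjectInputStream reads the object.
    static Object decrypt(byte[] data, SecretKey key, byte[] iv) throws Exception {
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(iv));
        try (ObjectInputStream in = new ObjectInputStream(
                 new CipherInputStream(new ByteArrayInputStream(data), cipher))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(128);
        SecretKey key = generator.generateKey();
        byte[] iv = new byte[16];
        new SecureRandom().nextBytes(iv);

        byte[] sealed = encrypt("secret message", key, iv);
        System.out.println(decrypt(sealed, key, iv));
    }
}
```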
Currently there is no direct way to write objects to a random access file.
You can use the ByteArray I/O streams as an intermediate place to write and read bytes to/from the random access file and create Object I/O streams from the byte streams to write/read the objects. You just have to make sure that you have the entire object in the byte stream or reading/writing the object will fail.
For example, java.io.ByteArrayOutputStream can be used to receive the
bytes of ObjectOutputStream. From it you can get a byte[] of the result.
That in turn can be used with ByteArrayInputStream as input to ObjectInputStream.
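The byte-array approach for random access files might look like the sketch below. The RandomAccessObjects class and the length-prefixed record layout are assumptions made for this example; the prefix is one simple way to make sure the entire object is read back, not something the serialization API defines.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.RandomAccessFile;

class RandomAccessObjects {
    // Serializes one object into its own complete byte array.
    static byte[] toBytes(Object obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    // Appends a length-prefixed record and returns the offset where it starts.
    static long append(RandomAccessFile file, Object obj) throws IOException {
        byte[] record = toBytes(obj);
        long offset = file.length();
        file.seek(offset);
        file.writeInt(record.length);
        file.write(record);
        return offset;
    }

    // Reads the whole record at 'offset' into memory, then deserializes it.
    static Object readAt(RandomAccessFile file, long offset)
            throws IOException, ClassNotFoundException {
        file.seek(offset);
        byte[] record = new byte[file.readInt()];
        file.readFully(record);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(record))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        File tmp = File.createTempFile("objects", ".bin");
        tmp.deleteOnExit();
        try (RandomAccessFile file = new RandomAccessFile(tmp, "rw")) {
            long first = append(file, "first record");
            long second = append(file, Integer.valueOf(42));
            // Records can be read back in any order.
            System.out.println(readAt(file, second));
            System.out.println(readAt(file, first));
        }
    }
}
```

Because each record is written with its own ObjectOutputStream, records are independent of one another; objects shared between two records are stored twice.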
ObjectOutputStream and ObjectInputStream work to/from any stream object.
You could use a ByteArrayOutputStream and then get the array and insert
it into a ByteArrayInputStream. You could also use the piped stream classes
as well. Any java.io class that extends the OutputStream and InputStream
classes can be used.
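As a sketch of the piped-stream option, the example below sends an object from one thread to another inside a single VM. The PipedObjectDemo class and its method name are hypothetical. A separate writer thread is used because a pipe has a bounded buffer, and reading and writing from the same thread can block.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class PipedObjectDemo {
    // Writes 'message' on a helper thread and reads it back on the calling thread.
    static Object sendThroughPipe(Serializable message) throws Exception {
        PipedOutputStream pipeOut = new PipedOutputStream();
        PipedInputStream pipeIn = new PipedInputStream(pipeOut);

        Thread writer = new Thread(() -> {
            try (ObjectOutputStream out = new ObjectOutputStream(pipeOut)) {
                out.writeObject(message);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
        writer.start();

        // The ObjectInputStream constructor blocks until the stream header arrives.
        try (ObjectInputStream in = new ObjectInputStream(pipeIn)) {
            Object received = in.readObject();
            writer.join();
            return received;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sendThroughPipe("hello over a pipe"));
    }
}
```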
The diff will produce the same stream each time the same object is serialized. You will need to create a new ObjectOutputStream to serialize each object.
ObjectOutputStream writes to an OutputStream. If your zip object extends the OutputStream class, there is no problem compressing it.
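For instance, the GZIP stream classes in java.util.zip extend the standard stream classes and can be placed directly under the object streams. This is a minimal sketch; the CompressedObjectDemo class and its method names are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class CompressedObjectDemo {
    // Serializes obj through a GZIP stream; closing the object stream finishes the GZIP data.
    static byte[] writeCompressed(Serializable obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(new GZIPOutputStream(bytes))) {
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    // Decompresses and deserializes in one pass.
    static Object readCompressed(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new ObjectInputStream(new GZIPInputStream(new ByteArrayInputStream(data)))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        int[] zeros = new int[10_000];  // highly repetitive, so it compresses well
        byte[] packed = writeCompressed(zeros);
        int[] restored = (int[]) readCompressed(packed);
        System.out.println(packed.length + " bytes for " + restored.length + " ints");
    }
}
```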
This is not really viable for arbitrary objects because of the encoding of objects. For a particular object (such as String) you can compare the resulting bit streams. The encoding is stable, in that every time the same object is encoded it is encoded to the same set of bits.
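For the String case, the comparison can be sketched as follows. The StableEncodingDemo class is a hypothetical name. Each value is written with a fresh ObjectOutputStream so that no handle state carries over from one encoding to the next.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.util.Arrays;

class StableEncodingDemo {
    // Encodes one object with a brand-new stream and returns the resulting bytes.
    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Same value, two separate streams: identical bytes.
        System.out.println(Arrays.equals(serialize("hello"), serialize("hello"))); // prints "true"
        // Different values: the bytes differ.
        System.out.println(Arrays.equals(serialize("hello"), serialize("world"))); // prints "false"
    }
}
```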
Here's a brief example that shows how to serialize a tree of objects.
    import java.io.*;

    class tree implements java.io.Serializable {
        public tree left;
        public tree right;
        public int id;
        public int level;

        private static int count = 0;

        public tree(int l) {
            id = count++;
            level = l;
            if (l > 0) {
                left = new tree(l-1);
                right = new tree(l-1);
            }
        }

        public void print(int levels) {
            for (int i = 0; i < level; i++)
                System.out.print(" ");
            System.out.println("node " + id);

            if (level <= levels && left != null)
                left.print(levels);

            if (level <= levels && right != null)
                right.print(levels);
        }

        public static void main (String argv[]) {
            try {
                /* Create a file to write the serialized tree to. */
                FileOutputStream ostream = new FileOutputStream("tree.tmp");
                /* Create the output stream */
                ObjectOutputStream p = new ObjectOutputStream(ostream);

                /* Create a tree with three levels. */
                tree base = new tree(3);

                p.writeObject(base); // Write the tree to the stream.
                p.flush();
                ostream.close();     // close the file.

                /* Open the file and set to read objects from it. */
                FileInputStream istream = new FileInputStream("tree.tmp");
                ObjectInputStream q = new ObjectInputStream(istream);

                /* Read a tree object, and all the subtrees */
                tree new_tree = (tree)q.readObject();

                new_tree.print(3);   // Print out the top 3 levels of the tree
            } catch (Exception ex) {
                ex.printStackTrace();
            }
        }
    }
Only the fields of Serializable objects are written out and restored.
The object may be restored only if its nearest non-serializable supertype
has an accessible no-arg constructor that will initialize the fields of the
non-serializable supertypes. If the subclass has access to the state of the
superclass, it can implement writeObject and readObject to save and restore
that state.
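A minimal sketch of this arrangement follows. The Shape and Circle classes are hypothetical. Shape is not Serializable, so its no-arg constructor runs during deserialization, and Circle saves and restores the inherited label field itself.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical non-serializable superclass.
class Shape {
    protected String label;

    Shape() {                  // invoked when a Circle is deserialized
        label = "unnamed";
    }

    Shape(String label) {
        this.label = label;
    }
}

class Circle extends Shape implements Serializable {
    private static final long serialVersionUID = 1L;
    double radius;

    Circle(String label, double radius) {
        super(label);
        this.radius = radius;
    }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();  // writes Circle's own fields (radius)
        out.writeObject(label);    // saves the superclass state by hand
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();            // restores radius
        label = (String) in.readObject();  // overwrites the "unnamed" default
    }
}

class SuperclassStateDemo {
    static Circle roundTrip(Circle circle) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(circle);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Circle) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Circle copy = roundTrip(new Circle("unit", 1.0));
        System.out.println(copy.label + " " + copy.radius);
    }
}
```

Without the two private methods, the copy would come back with the label set by Shape's no-arg constructor rather than the original value.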
The bytecodes for a local object's methods are not passed directly in the ObjectOutputStream, but the object's class may need to be loaded by the receiver if the class is not already available locally. (The class files themselves are not serialized, just the names of the classes.) All classes must be able to be loaded during deserialization using the normal class loading mechanisms. For applets this means they are loaded by the AppletClassLoader.
There are no coherency guarantees for local objects passed to a remote
VM since such objects are passed by copying their contents (a true pass-by-value).
Here's an initial list of the classes that are marked Serializable. Note that classes that extend these classes are also serializable:
AWT has not yet been modified to work well with Serialization. When you serialize AWT widgets, the Peer objects that map the AWT functions to the local window system are serialized as well. When you deserialize (reconstitute) the AWT widgets, the old Peers are recreated, but they are out of date. Peers are native to the local window system and contain pointers to data structures in the local address space, and therefore cannot be moved.
As a workaround, you should first remove the top level widget from its container (so the widgets are no longer live). The peers are discarded at this point and you will save only the AWT widget state. When you later deserialize and read the widgets back in, add the top level widget to the frame to make the AWT widgets appear. You may need to add a show call.
For JDK 1.1, AWT widgets will be serializable. However, they will not be interoperable with JDK 1.0.2 widgets.
In JDK 1.1, Threads will NOT be serializable. In the present implementation, if you attempt to serialize and then deserialize a thread, there is NO explicit allocation of a new native thread or stack; all that happens is that the Java object is allocated with none of the native implementation. In short, it just won't work and will fail in unpredictable ways.
The difficulty with threads is that they have so much state which is intricately tied into the virtual machine that it is difficult or impossible to re-establish the context somewhere else. For example, saving the Java call stack is insufficient because if there were native methods that had called C procedures that in turn called Java, there would be an incredible mix of Java constructs and C pointers to deal with. Also, serializing the stack would imply serializing any object reachable from any stack variable.
If a thread were resumed in the same VM, it would be sharing a lot of
state with the original thread, and would therefore fail in unpredictable
ways if both threads were running at once, just like two C threads trying
to share a stack. When deserialized in a separate VM, it's hard to tell
what might happen.
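In the released class libraries, java.lang.Thread does not implement Serializable, so an attempt to write one is rejected rather than producing a broken object. The sketch below shows this; the ThreadNotSerializable class and its rejects method are names invented for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;

class ThreadNotSerializable {
    // Returns true if serialization refuses the object because its class is not Serializable.
    static boolean rejects(Object obj) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return false;
        } catch (NotSerializableException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(rejects(new Thread()));  // prints "true"
        System.out.println(rejects("plain text"));  // prints "false"
    }
}
```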
AWT does not yet work well with serialization and you will therefore have trouble trying to pass fonts and images. This is because each contains memory pointers that are valid only in the originating VM, which will cause a segmentation violation when passed to a new VM.
These problems should be corrected by the time JDK 1.1 releases. As a workaround for fonts, you will need to pass the information necessary to recreate a new font object that duplicates the characteristics of the font object in the originating VM. There is no current workaround to allow images to be passed correctly.
Talk with RMI developers via the mailing list rmi-users@javasoft.com.
To subscribe, send "subscribe rmi-users" to listserv@javasoft.com.
Copyright
© 1995-97 Sun Microsystems, Inc.
All Rights Reserved.