JavaSoft, a Sun Microsystems Business
 

Frequently Asked Questions 
Object Serialization



  • General - Questions about the Serialization subsystem.
    1. Why must classes be marked serializable in order to be written to an ObjectOutputStream?
    2. A Serializable object is written with writeObject, modified, and written a second time, but the modification is missing when the stream is deserialized.
    3. OutOfMemoryError thrown after writing a large number of objects into an ObjectOutputStream.
    4. How do I serialize a tree of objects?
    5. If class A does not implement Serializable but a subclass B implements Serializable, will the fields of class A be serialized when B is serialized?
    6. Does object serialization support encryption?
    7. The object serialization classes are stream oriented. How do I write objects to a random access file?
    8. How can I create an ObjectInputStream from an ObjectOutputStream without a file in between?
    9. Can I compute diff(serial(x),serial(y))?
    10. Can I compress the serial representation of my objects using my own zip/unzip methods?
    11. Can I execute methods on compressed versions of my objects, for example isempty(zip(serial(x)))?
  • Usage within the JDK.
    1. When a local object is serialized and passed as a parameter in an RMI call, are the byte codes for the local object's methods also passed? What about object coherency, if the remote VM application "keeps" the object handle?
    2. Which JDK 1.1 system classes will be marked serializable?
    3. Are there any plans to support the serialization of thread objects?
    4. I am having problems deserializing AWT components. How can I make this work?
    5. If I try to serialize a font or image object and reconstitute it in a different VM, my application dies. Why?

    Object Serialization

    1. Why must classes be marked serializable in order to be written to an ObjectOutputStream?
    2. The decision to require that classes implement the java.io.Serializable interface was not made lightly. The design called for a balance between the needs of developers and the needs of the system to be able to provide a predictable and safe mechanism. The most difficult design constraint to satisfy was the safety and security of Java classes.

      If classes had to be marked as serializable, the design team worried that a developer, whether out of forgetfulness, laziness, or ignorance, might not declare a class as Serializable and thereby make that class useless for RMI or for purposes of persistence. We worried that the requirement would place on a developer the burden of knowing how a class was to be used by others in the future, an essentially unknowable condition. Indeed, our preliminary design, as reflected in the alpha API, concluded that the default case for a class ought to be that the objects in the class be serializable. We changed our design only after considerations of security and correctness convinced us that the default had to be that an object not be serialized.

      Security restrictions

      The first consideration that caused us to change the default behavior of objects had to do with security, and in particular with the privacy of fields declared to be private, package protected, or protected. The Java runtime restricts read and write access to such fields to a subset of the objects within the runtime.

      No such restriction can be made on an object once it has been serialized; the stream of bytes that results from object serialization can be read and altered by any object that has access to that stream. This allows any object access to the state of a serialized object, which can violate the privacy guarantees users of the language expect. Further, the bytes in the stream can be altered in arbitrary ways, allowing the reconstruction of an object that was never created within the protections of a Java environment. There are cases in which the recreation of such an object could compromise not only the privacy guarantees expected by users of the Java environment, but the integrity of the environment itself.

      These violations cannot be guarded against, since the whole idea of serialization is to allow an object to be converted into a form that can be moved outside of the Java environment (and therefore outside of the privacy and integrity guarantees of that environment) and then be brought back into the environment. Requiring objects to be declared serializable does mean that the class designer must make an active decision to allow the possibility of such a breach in privacy or integrity. A developer who does not know about serialization should not be open to compromise because of this lack of knowledge. In addition, we would hope that the developer who declares a class to be Serializable does so after some thought about the possible consequences of that declaration.

      Note that this sort of security problem is not one that can be dealt with by the mechanism of a security manager. Since serialization is intended to allow the transport of an object from one virtual machine to some other (either over space, as it is used in RMI, or over time, as when the stream is saved to a file), the mechanisms used for security need to be independent of the runtime environment of any particular virtual machine. We wanted to avoid as much as possible the problem of being able to serialize an object in one virtual machine and not being able to deserialize that object in some other virtual machine. Since the security manager is part of the runtime environment, using the security manager for serialization would have violated this requirement.

      Forcing a conscious decision

      While security concerns were the first reason for considering the design change, a reason that we feel is at least as convincing is that serialization should only be added to a class after some design consideration. It is far too easy to design a class that falls apart under serialization and reconstruction. By requiring a class designer to declare support for the Serializable interface, we hoped that the designer would also give some thought to the process of serializing that class.

      Examples are easy to cite. Many classes deal with information that only makes sense in the context of the runtime in which the particular object exists; examples of such information include file handles, open socket connections, security information, etc. Such data can be dealt with easily by simply declaring the fields as transient, but such a declaration is only necessary if the object is going to be serialized. A novice (or forgetful, or hurried) programmer might neglect to mark fields as transient in much the same way he or she might neglect to mark the class as implementing the Serializable interface. Such a case should not lead to incorrect behavior; the way to avoid this is to not serialize objects not marked as implementing Serializable.
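
      The transient declaration described above can be sketched as follows. This is a minimal illustration, not code from the JDK: the LogSession class, its fields, and the roundTrip helper are hypothetical names, and an in-memory stream stands in for a real file handle or socket.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Serializable;

/** Hypothetical class; the name and fields are illustrative only. */
class LogSession implements Serializable {
    private static final long serialVersionUID = 1L;

    // Plain data: written to the stream by default serialization.
    private final String path;

    // Stand-in for a runtime-only resource such as a file handle or socket.
    // Marked transient, so it is skipped when the object is serialized.
    private transient OutputStream sink = new ByteArrayOutputStream();

    LogSession(String path) {
        this.path = path;
    }

    String path() {
        return path;
    }

    boolean hasOpenSink() {
        return sink != null;
    }

    /** Serializes the session to memory and reads a copy back. */
    static LogSession roundTrip(LogSession session) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(session);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (LogSession) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

      After the round trip the copy keeps its path, but its transient field holds the default value null, so the class must be prepared to reopen the resource itself.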

      Another example of this sort is the "simple" object that is the root of a graph that spans a large number of objects. Serializing such an object could result in serializing lots of others, since serialization works over an entire graph. Doing something like this should be a conscious decision, not one that happens by default.

      The need for this sort of thought was brought home to us in the group when we were going through the base Java class libraries marking the system classes as Serializable (where appropriate). We had originally thought that this would be a fairly simple process, and that most of the system classes could just be marked as implementing Serializable and then use the default implementation with no other changes. What we found was that this was far less often the case than we had suspected. In a large number of the classes, careful thought had to be given to whether or not a field should be marked as transient or whether it made sense to serialize the class at all.

      Of course, there is no way to guarantee that a programmer or class designer is actually going to think about these issues when marking a class as Serializable. However, by requiring the class to declare itself as implementing the Serializable interface we do require that some thought be given by the programmer. Having serialization be the default state of an object would mean that lack of thought could cause bad effects in a program, something that the overall design of Java has attempted to avoid.

    3. A Serializable object is written with writeObject, modified, and written a second time, but the modification is missing when the stream is deserialized.
    4. The ObjectOutputStream class keeps track of each object it serializes and sends only the handle if the object is written into the stream a subsequent time. This is the way it deals with graphs of objects. The corresponding ObjectInputStream keeps track of all of the objects it has created and their handles so when the handle is seen again it can return the same object. Both output and input streams keep this state until they are freed.

      Alternatively, the ObjectOutputStream class implements a reset method that discards the memory of having sent an object, so sending an object again will make a copy.
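
      The effect of the handle table and of reset can be seen in a small sketch. The ResetDemo class and its writeTwice method are hypothetical names chosen for this illustration; only the java.io calls are real API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

class ResetDemo {
    /**
     * Writes the same array twice, modifying it in between, and returns
     * the first element of each array read back from the stream.
     */
    static int[] writeTwice(boolean resetBetweenWrites) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            int[] counter = { 1 };
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(counter);
                counter[0] = 2;              // modify after the first write
                if (resetBetweenWrites) {
                    out.reset();             // forget previously written objects
                }
                out.writeObject(counter);    // without reset, only a handle is sent
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                int[] first = (int[]) in.readObject();
                int[] second = (int[]) in.readObject();
                return new int[] { first[0], second[0] };
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

      Without the reset call both reads return the same array holding the original value; with it, the second read returns a fresh copy holding the modified value.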

    5. OutOfMemoryError thrown after writing a large number of objects into an ObjectOutputStream
    6. The ObjectOutputStream maintains a table mapping each object written into the stream to a handle. The first time an object is written to a stream, its contents are written into the stream; subsequent writes of the object result in only a handle to the object being written. This table holds references to objects that might otherwise be unreachable by the application, which can result in unexpectedly running out of memory. A call to the ObjectOutputStream.reset() method resets the object/handle table to its initial state, allowing all previously written objects to become eligible for garbage collection.
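
      One way to apply this is to call reset at regular intervals while writing. The sketch below assumes the records are independent of each other, so no shared references are lost by resetting; the BulkWriter class and the resetEvery parameter are hypothetical, and a real application would write to a file or socket instead of a memory buffer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

class BulkWriter {
    /**
     * Writes count independent records, calling reset() after every
     * resetEvery writes, then reads them back and returns how many
     * records came back with the expected value.
     */
    static int writeAndCount(int count, int resetEvery) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                for (int i = 0; i < count; i++) {
                    out.writeObject(new int[] { i });  // a fresh object per record
                    if ((i + 1) % resetEvery == 0) {
                        out.reset();   // clear the object/handle table
                    }
                }
            }
            int matched = 0;
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                for (int i = 0; i < count; i++) {
                    int[] record = (int[]) in.readObject();
                    if (record[0] == i) {
                        matched++;
                    }
                }
            }
            return matched;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

      The reader needs no matching call: the reset marker travels in the stream, and the ObjectInputStream clears its own table when it encounters it.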
       

    7. Does object serialization support encryption?
    8. Object Serialization does not contain any encryption/decryption in itself. It writes to and reads from Java streams, so it can be coupled with any available encryption technology. Object serialization can be used in many different ways, from simple persistence (writing to and reading from files) to RMI communication across hosts.

      RMI's use of serialization leaves encryption and decryption to the lower network transport. We expect that when a secure channel is needed the network connections will be made using SSL or the like.
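
      As one possible illustration of coupling serialization with an encrypting stream, the sketch below wraps the object streams in the javax.crypto cipher streams. The choice of javax.crypto and of AES in CBC mode is an assumption of this example, not something this FAQ prescribes, and the EncryptedObjects class and its method names are hypothetical. CBC provides confidentiality only; it does not detect tampering with the stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.CipherOutputStream;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

class EncryptedObjects {
    // Assumed transformation for this sketch.
    private static final String TRANSFORMATION = "AES/CBC/PKCS5Padding";

    /** Generates a random 128-bit AES key. */
    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Generates a random 16-byte initialization vector. */
    static byte[] newIv() {
        byte[] iv = new byte[16];
        new SecureRandom().nextBytes(iv);
        return iv;
    }

    /** Serializes value through an encrypting stream. */
    static byte[] seal(Serializable value, SecretKey key, byte[] iv) {
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            // Closing the object stream also closes the cipher stream,
            // which writes the final padded block.
            try (ObjectOutputStream out = new ObjectOutputStream(
                    new CipherOutputStream(bytes, cipher))) {
                out.writeObject(value);
            }
            return bytes.toByteArray();
        } catch (IOException | GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Decrypts and deserializes an object produced by seal. */
    static Object open(byte[] sealed, SecretKey key, byte[] iv) {
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(iv));
            try (ObjectInputStream in = new ObjectInputStream(
                    new CipherInputStream(new ByteArrayInputStream(sealed), cipher))) {
                return in.readObject();
            }
        } catch (IOException | GeneralSecurityException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```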

    9. The object serialization classes are stream oriented. How do I write objects to a random access file?
    10. Currently there is no direct way to write objects to a random access file.

      You can use the ByteArray I/O streams as an intermediate place to write and read bytes to/from the random access file and create Object I/O streams from the byte streams to write/read the objects. You just have to make sure that you have the entire object in the byte stream or reading/writing the object will fail.

      For example, java.io.ByteArrayOutputStream can be used to receive the bytes of ObjectOutputStream. From it you can get a byte[] of the result. That in turn can be used with ByteArrayInputStream as input to ObjectInput
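
      The byte array approach can be sketched as follows. The record layout (a four-byte length followed by the serialized bytes) is an assumption made for this example so that a complete object can always be read back; the ObjectRecordFile class and its methods are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.RandomAccessFile;
import java.io.Serializable;
import java.io.UncheckedIOException;

class ObjectRecordFile {
    /** Appends one object as a length-prefixed record; returns its offset. */
    static long append(RandomAccessFile file, Serializable value) {
        try {
            // Serialize into memory first so the whole object is available.
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(value);
            }
            byte[] record = buffer.toByteArray();
            long offset = file.length();
            file.seek(offset);
            file.writeInt(record.length);   // assumed layout: length, then bytes
            file.write(record);
            return offset;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads back the object whose record starts at offset. */
    static Object readAt(RandomAccessFile file, long offset) {
        try {
            file.seek(offset);
            byte[] record = new byte[file.readInt()];
            file.readFully(record);
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(record))) {
                return in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Writes two strings to a temporary file and reads them in reverse order. */
    static String demo() {
        try {
            File temp = File.createTempFile("objects", ".dat");
            temp.deleteOnExit();
            try (RandomAccessFile file = new RandomAccessFile(temp, "rw")) {
                long first = append(file, "first");
                long second = append(file, "second");
                return readAt(file, second) + "," + readAt(file, first);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

      Each record uses its own ObjectOutputStream, so every record carries its own stream header and can be read independently of the others.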
       

    11. How can I create an ObjectInputStream from an ObjectOutputStream without a file in between?
    12. ObjectOutputStream and ObjectInputStream work to/from any stream object. You could use a ByteArrayOutputStream, get the byte array from it, and pass that array to a ByteArrayInputStream. You could also use the piped stream classes. Any java.io class that extends OutputStream or InputStream can be used.
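
      A minimal sketch of connecting an ObjectOutputStream to an ObjectInputStream through a byte array, with no file involved. The InMemoryCopy class and its copy method are hypothetical names for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class InMemoryCopy {
    /** Returns a copy of original made by serializing it through memory. */
    static Object copy(Serializable original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            // The array produced by the output side feeds the input side.
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```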
       

    13. Can I compute diff(serial(x),serial(y))?
    14. Serialization will produce the same stream each time the same object is serialized, so the serialized forms can be compared. You will need to create a new ObjectOutputStream to serialize each object.
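
      A sketch of such a comparison, using a new ObjectOutputStream for each object. The SerialDiff class is hypothetical, and firstDifference is a deliberately simple stand-in for a real diff: it only reports the position of the first differing byte.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class SerialDiff {
    /** Serializes value with a fresh ObjectOutputStream. */
    static byte[] serial(Serializable value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the index of the first differing byte, or -1 if identical. */
    static int firstDifference(byte[] a, byte[] b) {
        int shared = Math.min(a.length, b.length);
        for (int i = 0; i < shared; i++) {
            if (a[i] != b[i]) {
                return i;
            }
        }
        // Equal up to the shorter length: they differ only if lengths differ.
        return a.length == b.length ? -1 : shared;
    }
}
```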

    15. Can I compress the serial representation of my objects using my own zip/unzip methods?
    16. ObjectOutputStream writes to an OutputStream. If your zip object extends the OutputStream class, there is no problem compressing it.
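
      For illustration, the sketch below uses the java.util.zip GZIP streams in place of a custom zip class; any compressor that extends OutputStream could be substituted in the same position. The CompressedObjects class and its method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class CompressedObjects {
    /** Serializes value through a compressing stream. */
    static byte[] zip(Serializable value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            // Closing the object stream closes the GZIP stream as well,
            // which finishes the compressed data.
            try (ObjectOutputStream out = new ObjectOutputStream(
                    new GZIPOutputStream(bytes))) {
                out.writeObject(value);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Decompresses and deserializes an object produced by zip. */
    static Object unzip(byte[] zipped) {
        try (ObjectInputStream in = new ObjectInputStream(
                new GZIPInputStream(new ByteArrayInputStream(zipped)))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```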

    17. Can I execute methods on compressed versions of my objects, for example isempty(zip(serial(x)))?
    18. This is not really viable for arbitrary objects because of the encoding of objects. For a particular object (such as String) you can compare the resulting bit streams. The encoding is stable, in that every time the same object is encoded it is encoded to the same set of bits.

    19. How do I serialize a tree of objects?
    20. Here's a brief example that shows how to serialize a tree of objects.

      import java.io.*;
      
      class tree implements java.io.Serializable {
          public tree left;
          public tree right;
          public int id;
          public int level;
      
          private static int count = 0;
          public tree(int l) {
              id = count++;
              level = l;
              if (l > 0) {
                  left = new tree(l-1);
                  right = new tree(l-1);
              }
          }
          public void print(int levels) {
              for (int i = 0; i < level; i++)
                  System.out.print("  ");
              System.out.println("node " + id);
      
              if (level <= levels && left != null)
                  left.print(levels);
      
              if (level <= levels && right != null)
                  right.print(levels);
          }
      
      
          public static void main (String argv[]) {
      
              try {
                  /* Create a file to write the serialized tree to. */
                  FileOutputStream ostream = new FileOutputStream("tree.tmp");
                  /* Create the output stream */
                  ObjectOutputStream p = new ObjectOutputStream(ostream);
      
                  /* Create a tree with three levels. */
                  tree base = new tree(3);
      
                  p.writeObject(base); // Write the tree to the stream.
                  p.flush();
                  ostream.close();    // close the file.
                  
                  /* Open the file and set to read objects from it. */
                  FileInputStream istream = new FileInputStream("tree.tmp");
                  ObjectInputStream q = new ObjectInputStream(istream);
                  
                  /* Read a tree object, and all the subtrees */
                  tree new_tree = (tree)q.readObject();
                  q.close();          // close the object stream and the file.
      
                  new_tree.print(3);  // Print out the top 3 levels of the tree
              } catch (Exception ex) {
                  ex.printStackTrace();
              }
          }
      }
    21. If class A does not implement Serializable but a subclass B implements Serializable, will the fields of class A be serialized when B is serialized?
    22. Only the fields of Serializable classes are written out and restored. The object may be restored only if its closest non-serializable superclass has an accessible no-arg constructor, which is run to initialize the fields of the non-serializable supertypes. If the subclass has access to the state of the superclass, it can implement writeObject and readObject to save and restore that state.
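
      The writeObject/readObject approach can be sketched as follows. Account and SavingsAccount are hypothetical classes standing in for A and B; the sketch assumes the superclass field is accessible to the subclass.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical non-serializable superclass (class A in the question). */
class Account {
    protected String owner;

    // Run during deserialization of serializable subclasses.
    Account() {
        owner = "unknown";
    }

    Account(String owner) {
        this.owner = owner;
    }
}

/** Hypothetical serializable subclass (class B in the question). */
class SavingsAccount extends Account implements Serializable {
    private static final long serialVersionUID = 1L;

    private int balance;

    SavingsAccount(String owner, int balance) {
        super(owner);
        this.balance = balance;
    }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();   // writes this class's own fields (balance)
        out.writeObject(owner);     // saves the superclass state by hand
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();             // restores balance
        owner = (String) in.readObject();   // restores the superclass state
    }

    String describe() {
        return owner + ":" + balance;
    }

    /** Serializes the account to memory and reads a copy back. */
    static SavingsAccount roundTrip(SavingsAccount account) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(account);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (SavingsAccount) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

      Without the two private methods, the copy's owner would be whatever the no-arg Account constructor assigns, because the superclass field is never written to the stream.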
       

    23. When a local object is serialized and passed as a parameter in an RMI call, are the byte codes for the local object's methods also passed? What about object coherency, if the remote VM application "keeps" the object handle?
    24. The bytecodes for a local object's methods are not passed directly in the ObjectOutputStream, but the object's class may need to be loaded by the receiver if the class is not already available locally. (The class files themselves are not serialized, just the names of the classes.) All classes must be able to be loaded during deserialization using the normal class loading mechanisms. For applets this means they are loaded by the AppletClassLoader.

      There are no coherency guarantees for local objects passed to a remote VM since such objects are passed by copying their contents (a true pass-by-value).
       

    25. Which JDK 1.1 system classes will be marked serializable?
    26. Here's an initial list of the classes that are marked Serializable. Note that classes that extend these classes are also serializable:

      There are many classes for which serialization makes no sense, such as those representing the state of something in the current VM (e.g. java.io.FileInputStream), or for which it is exceedingly hard to do correctly (e.g. java.lang.Thread).
    27. I am having problems deserializing AWT components. How can I make this work?
    28. AWT has not yet been modified to work well with Serialization. When you serialize AWT widgets, the Peer objects that map the AWT functions to the local window system are serialized as well. When you deserialize (reconstitute) the AWT widgets, the old Peers are recreated, but they are out of date. Peers are native to the local window system and contain pointers to data structures in the local address space, and therefore cannot be moved.

      As a workaround, you should first remove the top level widget from its container (so the widgets are no longer live). The peers are discarded at this point and you will save only the AWT widget state. When you later deserialize and read the widgets back in, add the top level widget to the frame to make the AWT widgets appear. You may need to add a show call.

      For JDK 1.1, AWT widgets will be serializable. However, they will not be interoperable with JDK 1.0.2 widgets.

    29. Are there any plans to support the serialization of thread objects?
    30. In JDK 1.1, Threads will NOT be serializable. In the present implementation, if you attempt to serialize and then deserialize a thread, there is NO explicit allocation of a new native thread or stack; all that happens is that the Java object is allocated with none of the native implementation. In short, it just won't work and will fail in unpredictable ways.

      The difficulty with threads is that they have so much state which is intricately tied into the virtual machine that it is difficult or impossible to re-establish the context somewhere else. For example, saving the Java call stack is insufficient because if there were native methods that had called C procedures that in turn called Java, there would be an incredible mix of Java constructs and C pointers to deal with. Also, serializing the stack would imply serializing any object reachable from any stack variable.

      If a thread were resumed in the same VM, it would be sharing a lot of state with the original thread, and would therefore fail in unpredictable ways if both threads were running at once, just like two C threads trying to share a stack. When deserialized in a separate VM, it's hard to tell what might happen.
       

    31. If I try to serialize a font or image object and reconstitute it in a different VM, my application dies. Why?
    32. AWT does not yet work well with serialization and you will therefore have trouble trying to pass fonts and images. This is because each contains memory pointers that are valid only in the originating VM, which will cause a segmentation violation when passed to a new VM.

      These problems should be corrected by the time JDK 1.1 releases. As a workaround for fonts, you will need to pass the information necessary to recreate a new font object that duplicates the characteristics of the font object in the originating VM. There is no current workaround to allow images to be passed correctly.


    Talk with RMI developers via the mailing list rmi-users@javasoft.com
    To subscribe, send subscribe rmi-users to listserv@javasoft.com
    For information on technical support, refer to JavaSoft E-Mail Addresses
    Send questions or comments about this site to webmaster@jse.east.sun.com

    Copyright © 1995-97 Sun Microsystems, Inc. All Rights Reserved. 
    Please send comments to: java-io@java.sun.com 