Serialization is the process of converting a class instance into a serial sequence of bits. DeSerialization is taking those bits and reconstructing the original object without losing anything of value (in practice the only things not saved are cached values or values calculated as they are needed). These bits could be in just about any format. Because of its ubiquity we chose to serialize to xml (or something so similar as to be indistinguishable from standard xml). This allows a wide variety of tools to be used when working with and verifying these serialized classes. Additionally, xml is easy to transmit and store: the xml text can be sent down any stream, put in any file, compressed, or queried. You should see the Mezzanine::xml Manual for information about the xml system itself.
Topics:
The process of serializing doesn't just convert from class instance to text. Since our end goal is to convert live objects to XML, it only makes sense to integrate closely with the Mezzanine::xml portion of the engine. If you plan on writing serialization and deserialization code you should read the following parts of the Mezzanine::xml Manual at a minimum:
The central object that will carry information during this process is the Mezzanine::XML::Node. The Mezzanine::XML::Node is an excellent tool for converting and storing data in a single unified hierarchy.
C++ and most object oriented languages heavily imply that class inheritance should be structured as hierarchies. Hierarchies are also implied when complex classes have other complex classes or datatypes as members. Both of these structures map cleanly onto the kind of hierarchy that a well formed xml document provides.
There are some relationships in video game objects that cross normal hierarchical boundaries. For example, a constraint references two actors and defines a relationship between them. When serialized, the constraint simply stores the name of each actor and looks it up in the actor manager during deserialization. This implies that the actors exist already, or that there is some mechanism to create the constraint and fill in the actors later. Several mechanisms were discussed to accomplish this, including: two passes of processing, where constraints would be done in the second pass; a work queue that would store objects that couldn't be deserialized yet; and a prefetcher that would dig through the xml to find the required object.
Those methods all likely could have been made to work. However, they are not optimal for a variety of reasons. All of them have a set of preconditions and require more computing resources and could potentially delay loading or transmission times. Some of them heavily imply that all of the items to deserialize must be stored in the same xml source. Some demand access to xml that may not have been transmitted yet.
The simplest, most performant way to work around the issues that cross-hierarchical relationships presented was to ignore them. More specifically, an exception is thrown if an object referenced during deserialization is not present. We then ask that programmers who write code that must store, transmit and reconstruct class instances be aware of the following preconditions so they can produce their own solutions:
The easiest way to meet these conditions without consuming an inordinate amount of computing resources is to pay attention to the order in which items are serialized. If a program serializes the worldnodes, then the actors, then everything else, it will have relatively little trouble making it work.
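The name lookup and the exception described above can be sketched as follows. This is only an illustration under stated assumptions: `Actor`, `ActorManager`, `Constraint` and `ResolveActors` are hypothetical stand-ins written for this example, not the engine's classes, and `std::runtime_error` stands in for the engine's exception type.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins; none of these are the Mezzanine classes.
struct Actor
{
    std::string Name;
};

struct ActorManager
{
    std::map<std::string, Actor> Actors;

    // Returns the named actor, or throws if it has not been deserialized yet.
    Actor& GetActor(const std::string& Name)
    {
        std::map<std::string, Actor>::iterator Found = Actors.find(Name);
        if (Found == Actors.end())
            throw std::runtime_error("Could not find Actor '" + Name + "' during Constraint DeSerialization.");
        return Found->second;
    }
};

struct Constraint
{
    Actor* First = nullptr;
    Actor* Second = nullptr;

    // Only the actor names are stored in the xml, so they must resolve here.
    void ResolveActors(ActorManager& Manager, const std::string& FirstName, const std::string& SecondName)
    {
        First = &Manager.GetActor(FirstName);
        Second = &Manager.GetActor(SecondName);
    }
};
```

If the constraint is deserialized before its actors, `ResolveActors` throws; once both actors are in the manager the same call succeeds, which is why the order of serialization matters.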
There are several ways to interact with the current serialization system. One might have to create a class that can be serialized or deserialized. There may be situations where another system is emitting xml and it must be integrated into an existing game. It may be desirable to create a 'factory' that produces objects from an xml source, or to create a sink to put objects into so they can be serialized. Here we will discuss some of the ways that the serialization system can be extended and what kind of assumptions it makes, so that anyone can write software that interacts with it cleanly.
Creating a class that can be serialized is easy. There is just one function that it must implement. If a class implements this, it is said to be Serializable:
The member ProtoSerialize(XML::Node&) is expected to accept a Mezzanine::XML::Node and attach exactly one Node to it. This new Serialized node should contain all the data stored in the current state of the object being serialized. Storing data outside of this one node could cause undefined behavior.
The exact layout of the data in the Serialized Node is not pre-determined. The creator of that function need only take into account any difficulties DeSerializing when creating this. Because of this concern it is advisable to name the Serialized node something unique and appropriate and to include a 'Version' attribute on it. If the class changes, the DeSerialization function will only need to check the 'Version' attribute to know if and how it can handle it.
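A minimal sketch of a ProtoSerialize member that follows this advice is shown here. The `Node` struct is a hypothetical stand-in for Mezzanine::XML::Node with just enough behavior for the example, and `Point3` is an invented class. The real engine types and their member names may differ.

```cpp
#include <cassert>
#include <list>
#include <map>
#include <string>

// Hypothetical stand-in for Mezzanine::XML::Node; not the real API.
struct Node
{
    std::string Name;
    std::map<std::string, std::string> Attributes;
    std::list<Node> Children;

    // Adds a child with the given name and returns a reference to it.
    Node& AppendChild(const std::string& ChildName)
    {
        Children.emplace_back();
        Children.back().Name = ChildName;
        return Children.back();
    }
};

// Invented example class; it attaches exactly one node to the node it is given.
struct Point3
{
    float X, Y, Z;

    void ProtoSerialize(Node& CurrentRoot) const
    {
        Node& Self = CurrentRoot.AppendChild("Point3"); // unique, appropriate name
        Self.Attributes["Version"] = "1";               // lets DeSerialization detect layout changes
        Self.Attributes["X"] = std::to_string(X);
        Self.Attributes["Y"] = std::to_string(Y);
        Self.Attributes["Z"] = std::to_string(Z);
    }
};
```

All of the object's state lives inside the single "Point3" node, and nothing is written to the parent itself.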
Integrating with the DeSerialization code is pretty easy too. There are two functions you are expected to implement to create a DeSerializable:
The SerializableName() function is expected to simply return the name of the xml elements this class will DeSerialize. For example, a Mezzanine::Vector3 returns "Vector3", and a Mezzanine::ActorRigid returns "ActorRigid". If a class is both DeSerializable and Serializable it makes sense to call this function when assigning the name to the Serialized Node it creates.
ProtoDeSerialize(const XML::Node&) accepts a Mezzanine::XML::Node. The Node passed to it would correspond to the Serialized Node created by the ProtoSerialize(XML::Node&) function listed above. If the xml was created by something else, the calling code still expects this function to be the correct deserialization function for it. It is advisable, but not required, to verify that the name of the xml node matches what is expected and that the 'Version' is something this code can handle. It is also advisable that every piece of data pulled out is verified as well as it can be. If exceptions are thrown for every discrepancy, then programmers using this will create xml and code that produce no discrepancies.
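The checks recommended above might look something like the following sketch. `Node`, `Point3` and `RequireAttribute` are hypothetical names made up for this example, and `std::runtime_error` stands in for the engine's exception type.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for Mezzanine::XML::Node; not the real API.
struct Node
{
    std::string Name;
    std::map<std::string, std::string> Attributes;
};

// Hypothetical helper: returns the attribute value or throws if it is missing.
const std::string& RequireAttribute(const Node& OneNode, const std::string& Key)
{
    std::map<std::string, std::string>::const_iterator Found = OneNode.Attributes.find(Key);
    if (Found == OneNode.Attributes.end())
        throw std::runtime_error("Could not find attribute '" + Key + "' during " + OneNode.Name + " DeSerialization.");
    return Found->second;
}

struct Point3
{
    float X = 0.0f, Y = 0.0f, Z = 0.0f;

    static std::string SerializableName() { return "Point3"; }

    void ProtoDeSerialize(const Node& OneNode)
    {
        if (OneNode.Name != SerializableName())
            throw std::runtime_error("Could not match node name during Point3 DeSerialization.");
        if (RequireAttribute(OneNode, "Version") != "1")
            throw std::runtime_error("Could not handle Version during Point3 DeSerialization.");
        // Parse everything first (std::stof throws on malformed numbers),
        // then assign, so a failure leaves the object unchanged.
        const float NewX = std::stof(RequireAttribute(OneNode, "X"));
        const float NewY = std::stof(RequireAttribute(OneNode, "Y"));
        const float NewZ = std::stof(RequireAttribute(OneNode, "Z"));
        X = NewX; Y = NewY; Z = NewZ;
    }
};
```

Parsing into temporaries before assigning is a design choice of this sketch, not something the text above requires.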
The following templates make use of only the 3 functions described above to Serialize or DeSerialize class instances:
The functions make calls on the Mezzanine::xml system and expect a fairly basic set of conditions to be met before they are used. Serialize accepts an output stream and the class instance to be Serialized. It will create an XML::Document, populate it with data from the class provided, and then emit that into the stream. DeSerialize accepts an input stream and the object to be populated. It expects the next xml element in the stream to be a serialized version of the passed object and will then overwrite as many of the values of the passed object as possible with the serialized values. For small items DeSerialize is fine. Where possible, it is better to have the XML::Document open the file or stream itself, which avoids the extra pass DeSerialize makes through the stream to find exactly one xml element.
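To illustrate the Serialize half of this description, here is a rough sketch that builds a temporary document root, lets the object attach its node, and emits the result into a stream. The `Node` struct (which doubles as the document), its `Print` member and `Point3` are assumptions made for this example and are not the engine's implementation. DeSerialize is left out because it needs a real xml parser.

```cpp
#include <cassert>
#include <list>
#include <map>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for XML::Node and XML::Document; not the real API.
struct Node
{
    std::string Name;
    std::map<std::string, std::string> Attributes;
    std::list<Node> Children;

    Node& AppendChild(const std::string& ChildName)
    {
        Children.emplace_back();
        Children.back().Name = ChildName;
        return Children.back();
    }

    // Emits this node and its children. Attribute values are not escaped here.
    void Print(std::ostream& Stream) const
    {
        Stream << '<' << Name;
        for (const auto& Attribute : Attributes)
            Stream << ' ' << Attribute.first << "=\"" << Attribute.second << '"';
        if (Children.empty()) { Stream << " />"; return; }
        Stream << '>';
        for (const Node& Child : Children)
            Child.Print(Stream);
        Stream << "</" << Name << '>';
    }
};

// Sketch of a Serialize template: it relies only on T::ProtoSerialize.
template <class T>
std::ostream& Serialize(std::ostream& Stream, const T& Converted)
{
    Node Document;                        // temporary root that owns the serialized node
    Converted.ProtoSerialize(Document);
    for (const Node& Child : Document.Children)
        Child.Print(Stream);
    return Stream;
}

// Invented Serializable used only to exercise the template.
struct Point3
{
    int X, Y, Z;

    void ProtoSerialize(Node& CurrentRoot) const
    {
        Node& Self = CurrentRoot.AppendChild("Point3");
        Self.Attributes["Version"] = "1";
        Self.Attributes["X"] = std::to_string(X);
        Self.Attributes["Y"] = std::to_string(Y);
        Self.Attributes["Z"] = std::to_string(Z);
    }
};
```

Any type with a compatible ProtoSerialize member can be passed to this template without further changes.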
In some cases, there are some pieces of information that cannot be supplied or entered by the class itself. This data must be provided by another class or upon creation of the class. This other class can implement the Serializer, DeSerializer, or both interfaces to make working with large amounts of serialization easier.
For example, actors can only accept a mesh upon construction, so completely overwriting an existing actor is impossible. Deserialization is expected to be partially implemented, to the extent possible, in the class members. But if you need to create Actors on the fly from data stored in files, it makes sense to have a dedicated class or interface that can create these. Here is what goes into a Serializer:
Serializer::ProtoSerialize(), when implemented, should take the required steps to attach a Serialized Node, representing the Serialization target, to the passed XML::Node. It is expected to get the extra information that the target cannot provide from somewhere else. Ideally the Serializer can be, or be associated with, a manager or container of some kind. There is no default implementation of this.
Serializer::Serialize() goes one step further than Serializer::ProtoSerialize() and also sends the result down a stream. The default implementation uses Serializer::ProtoSerialize().
Serializer::ProtoSerializeAll() performs a similar role to Serializer::ProtoSerialize(), but again, it goes one step further. Rather than accepting a single Target to serialize, the Serializer is expected to go to the source of the Targets and serialize all of them that are available. All of the Targets should be contained in one Node attached to the Node the function accepts. This is not implemented by default because the logic is too specific to the items being serialized.
Serializer::SerializeAll() uses Serializer::ProtoSerializeAll() to send all of the available Targets, in serialized form, down a stream.
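The two Proto functions can be sketched with a container that knows something its targets do not. Everything here is hypothetical and written only for illustration: the `Node` stand-in, the `Serializer` template, the `Actor` struct and the `ActorStore` class with its mesh file names. The stream-based Serialize() and SerializeAll() are omitted to keep the sketch short.

```cpp
#include <cassert>
#include <list>
#include <map>
#include <string>

// Hypothetical stand-in for Mezzanine::XML::Node; not the real API.
struct Node
{
    std::string Name;
    std::map<std::string, std::string> Attributes;
    std::list<Node> Children;

    Node& AppendChild(const std::string& ChildName)
    {
        Children.emplace_back();
        Children.back().Name = ChildName;
        return Children.back();
    }
};

// Invented target type: it knows its name but cannot report its mesh.
struct Actor
{
    std::string Name;
};

// Sketch of the interface; the engine's real signatures may differ.
template <class T>
class Serializer
{
public:
    virtual ~Serializer() = default;
    virtual void ProtoSerialize(const T& Target, Node& CurrentRoot) const = 0;
    virtual void ProtoSerializeAll(Node& CurrentRoot) const = 0;
};

// A container that owns the actors and remembers the mesh each was built with.
class ActorStore : public Serializer<Actor>
{
public:
    void Add(const std::string& Name, const std::string& MeshFile)
    {
        Actors.push_back(Actor{Name});
        Meshes[Name] = MeshFile;
    }

    void ProtoSerialize(const Actor& Target, Node& CurrentRoot) const override
    {
        Node& Self = CurrentRoot.AppendChild("Actor");
        Self.Attributes["Version"] = "1";
        Self.Attributes["Name"] = Target.Name;
        Self.Attributes["Mesh"] = Meshes.at(Target.Name); // extra data the Actor lacks
    }

    void ProtoSerializeAll(Node& CurrentRoot) const override
    {
        Node& Container = CurrentRoot.AppendChild("Actors"); // one node holds every target
        for (const Actor& Each : Actors)
            ProtoSerialize(Each, Container);
    }

private:
    std::list<Actor> Actors;
    std::map<std::string, std::string> Meshes;
};
```

Calling `ProtoSerializeAll` on a store with two actors attaches a single "Actors" node holding two "Actor" children, each carrying the mesh name supplied by the store.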
The logic behind a DeSerializer is similar to that of a Serializer. It has the same types of methods, and even similar implementations where a function is implemented by default. Like the ProtoDeSerialize() that individual DeSerializables implement, the functions on a DeSerializer will be passed the nodes that correspond to those created by their counterparts on the Serializer. Here are the contents of a DeSerializer:
The function ContainerName() should be used when creating and verifying the xml element that is parent to the items DeSerialized by ProtoDeSerializeAll(). The default implementation of DeSerializeAll() will use ContainerName() to verify it has extracted the correct Node.
There is no technical reason why a class cannot be both a serializer and a deserializer, or even multiple kinds of Serializers or DeSerializers. To keep things simple the Managers provided by the Mezzanine engine will store a pointer to the appropriate Serializer when one is required.
Sometimes you will be forced to work with a system that produces xml that is not structured in a similar way to this system. Sometimes it may be too costly or not possible to modify the code to integrate it. For these cases the following function exists:
This function will make a call on the stream insertion operator of the class passed in. If one doesn't exist, it is easy to add one in your code without changing the original source. If one does exist, then you should probably copy/paste the whole function and re-implement it, calling the functions that emit the XML string or stream. If you want to implement a stream insertion operator, the function prototype should be similar to the stream insertion operator in the Serialization Operators section.
The stream insertion (<<) and stream extraction (>>) operators can be used for serializing and deserializing most items in the Mezzanine engine.
Unfortunately, due to conflicts with the stream insertion operators provided with the iostreams library, these couldn't be made into a template. That doesn't mean that they are difficult to implement. Here is a typical implementation of stream insertion operators for XML serialization:
You will want to implement these functions with the appropriate type. The type Mezzanine::ActorRigid is used purely as an example. Though this is actual working code and was in the engine at one point, the current code is more sophisticated.
The function operator<< simply calls Serialize and returns the stream, so it has all the pre- and postconditions of the Serialize function listed in the Make a Serializable or a DeSerializable section.
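The shape of such an operator can be sketched for an invented type. This is not the engine's ActorRigid code: `Point3` is hypothetical, and the `Serialize` function here is a hand-written stand-in for the Serialize template so that the example compiles on its own.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Invented type used only for this sketch.
struct Point3
{
    int X, Y, Z;
};

// Stand-in for the Serialize template: emits one xml element for the object.
std::ostream& Serialize(std::ostream& Stream, const Point3& Converted)
{
    return Stream << "<Point3 Version=\"1\" X=\"" << Converted.X
                  << "\" Y=\"" << Converted.Y
                  << "\" Z=\"" << Converted.Z << "\" />";
}

// A per-type overload; it only forwards to Serialize and returns the stream.
std::ostream& operator<<(std::ostream& Stream, const Point3& Converted)
{
    return Serialize(Stream, Converted);
}
```

Because the operator returns the stream, insertions can be chained in the usual iostreams style, for example `Out << First << Second;`.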
The stream extraction operators are a little bit more interesting. The operator>>(istream,YourType), by virtue of calling DeSerialize, will wind up taking two passes over the XML: one looking for the ending tag that matches the first tag (it gets all the children of that tag too) and one performing the actual parsing. The operator>>(XML::Node,YourType) will work only with completely parsed objects in memory. With the combination of these two, all the heavy lifting of parsing is done up front, and the rest of the deserialization is just a bunch of pointer and string manipulation. Another possibility with your stream extraction operator: if you know that the stream holds exactly one parent xml node, you can create the document without that first pass for improved performance.
To simplify and standardize errors thrown, the following functions exist:
Both of these functions throw a Mezzanine::Exception with the descriptive text of "Could not {FailedTo} during {ClassName} [De]Serialization." If SOrD (Serialize Or Deserialize) is true the "De" is not printed.
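The message format described above can be sketched as follows. The names `SerializationErrorText`, `SerializeError` and `DeSerializeError` are assumptions made for this example, since the original listing is not reproduced here, and `std::runtime_error` stands in for Mezzanine::Exception.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical helper that builds the descriptive text.
// When SOrD (Serialize Or Deserialize) is true the "De" is not printed.
std::string SerializationErrorText(const std::string& FailedTo, const std::string& ClassName, bool SOrD)
{
    std::string Text = "Could not " + FailedTo + " during " + ClassName + " ";
    if (!SOrD)
        Text += "De";
    Text += "Serialization.";
    return Text;
}

// Assumed names and defaults; std::runtime_error stands in for Mezzanine::Exception.
void SerializeError(const std::string& FailedTo, const std::string& ClassName, bool SOrD = true)
{
    throw std::runtime_error(SerializationErrorText(FailedTo, ClassName, SOrD));
}

void DeSerializeError(const std::string& FailedTo, const std::string& ClassName, bool SOrD = false)
{
    throw std::runtime_error(SerializationErrorText(FailedTo, ClassName, SOrD));
}
```

Routing every failure through a pair of functions like these keeps the wording of error messages consistent across all classes.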