DOTS 0.50 and Serialization

Hello! I recently upgraded my project to the new DOTS 0.50 release, which I was really excited for (thank you, Unity team!). Unfortunately, the upgrade seems to have broken the method I use to serialize my world and save it to disk.

public void SaveMethod()
{
    var querySaveTag = new EntityQueryDesc
    {
        All = new ComponentType[] { typeof(SaveTag) },
        None = new ComponentType[] { typeof(RequestSceneLoaded) }
    };
    var queryPrefabTag = new EntityQueryDesc
    {
        All = new ComponentType[] { typeof(Prefab) },
        None = new ComponentType[] { typeof(RequestSceneLoaded) }
    };

    using var saveQuery = EntityManager.CreateEntityQuery(querySaveTag, queryPrefabTag);
    using var saveWorld = new World("saveWorld", WorldFlags.Staging);
    using var entityArray = saveQuery.ToEntityArray(Allocator.Temp);
    using var entityRemap = new NativeArray<EntityRemapUtility.EntityRemapInfo>(EntityManager.EntityCapacity, Allocator.Temp);

    saveWorld.EntityManager.CopyEntitiesFrom(EntityManager, entityArray);

    var filePathStr = $"{Application.dataPath}/saveData.test";

    // This line no longer compiles after the upgrade.
    using var streamWriter = new StreamBinaryWriter(filePathStr);

    SerializeUtility.SerializeWorld(saveWorld.EntityManager, streamWriter, out var objectOutputArr, entityRemap);
}

The issue with the code above is that the StreamBinaryWriter class is now “inaccessible due to its protection level”. I am not sure how to get this working again and would greatly appreciate some guidance. I am also getting the same error with the StreamBinaryReader class, which is used in my load method.

1 Like

StreamBinaryReader and StreamBinaryWriter were deprecated in 0.18 and then removed in 0.50. Since 0.18 was never released, it was essentially just removed in a single update without any warning. I also didn’t see any mention of it in the upgrade guide. The source for those classes is still available in Entities/Unity.Entities/Serialization/BinarySerialization.cs so you could just copy it into your project. Not sure if they still work correctly, but they appear to still be used in some tests.
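If you do copy it over, the removed writer is fairly small. A minimal stand-in looks roughly like the sketch below. This is untested, the class name is mine, and the exact members of the `BinaryWriter` interface (e.g. whether `Position` exists) may differ slightly between Entities versions, so check against the copied source.

```csharp
using System;
using System.IO;
using Unity.Collections.LowLevel.Unsafe;

// Minimal stand-in for the removed StreamBinaryWriter. Untested sketch,
// adapted from the general shape of the class in BinarySerialization.cs.
public unsafe class MyStreamBinaryWriter : Unity.Entities.Serialization.BinaryWriter
{
    private readonly Stream stream;
    private readonly byte[] buffer;

    public MyStreamBinaryWriter(string path, int bufferSize = 64 * 1024)
    {
        this.stream = File.Open(path, FileMode.Create, FileAccess.Write);
        this.buffer = new byte[bufferSize];
    }

    public long Position
    {
        get => this.stream.Position;
        set => this.stream.Position = value;
    }

    public void WriteBytes(void* data, int count)
    {
        // Copy from the unmanaged source into a managed buffer in chunks,
        // then push each chunk out through the stream.
        int written = 0;
        fixed (byte* fixedBuffer = this.buffer)
        {
            while (written < count)
            {
                int toWrite = Math.Min(count - written, this.buffer.Length);
                UnsafeUtility.MemCpy(fixedBuffer, (byte*)data + written, toWrite);
                this.stream.Write(this.buffer, 0, toWrite);
                written += toWrite;
            }
        }
    }

    public void Dispose() => this.stream.Dispose();
}
```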

From Changelog | Entities | 0.50.1-preview.2

1 Like

I had not thought of that. I will try to make my own Binary Reader/Writer using their code. Thank you for the tip!

I really badly want a serialization solution for DOTS.
It’s not the sort of thing I want to implement myself; it’s so easy to mess up, and the bugs when you do are nasty.

2 Likes

I tried copying over the stream reader/writer, and I think it works…
Be careful though: you will get errors if you try to read or write a world without any chunks.
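One way to guard against that, sketched against the `SaveMethod` from the first post (an assumption on my part, not a guaranteed fix for every case):

```csharp
// Bail out before serializing when the query matched nothing, since
// serializing a world with no chunks can throw.
if (saveQuery.CalculateEntityCount() == 0)
{
    Debug.LogWarning("Nothing to save; skipping world serialization.");
    return;
}
```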

It really is tragic. I have a factorio-like game that’s approaching 100k lines of code, with insanely complicated state to save and load. SerializeUtility.SerializeWorld never worked for me despite my spending hours trying to tame it. I ended up writing my own solution, but I know it’s much slower than it could be, and it’s very delicate code, subject to an incredible amount of entropy as my game continues to grow.

1 Like

You do not want to use SerializeWorld for saving.

If you ever change a single component in the future, you will break all of your users’ saves. It basically prevents you from ever updating your game once it’s released. It’s not designed for saving.

7 Likes

The simplest solution is to use a serialization library that can automatically serialize classes with reflection, and to serialize/deserialize manually by turning entities into a big list of JSON/XML/whatever objects, with most of the hard work taken care of by reflection.
This is slow, but it’s easy, and in many libraries it can even be optimized if you want to get into that.

1 Like

So what is it for?

This was our solution and it’s pretty solid. I have written about this.

1 Like

I probably shouldn’t mention this as I’ve only been working on it for a little under a week, but I’m kind of excited by how it’s turned out. I’ve taken my third shot at writing a serialization library for Entities. I’ve worked on two completely opposite solutions (including a shipped product that requires it not to break between versions) over the past couple of years, and using what I’ve learnt, merged them together into a very versatile solution that avoids the annoyances, tedious maintenance and migration problems I’ve seen.

I’ve got it to the point where you just add the attribute [Save] to any component you want saved and that’s all you really need to do. No code-gen or reflection required.

    [Save]
    [GenerateAuthoringComponent]
    public struct TestRemapping : IComponentData
    {
        public Entity Entity;
    }

    [Save]
    public struct TestBufferRemapping : IBufferElementData
    {
        public Entity Entity;
    }

It supports remapping entities and has a reasonably simple migration process.
It’s extremely fast, especially to serialize. I intend to look at releasing this after I flesh out some features (subscene entity saving, release validation, etc.)

17 Likes

This looks great, excited to see what you share! I’m currently working on my own entity serialization so I’m curious about your approach.

Do you have a solution for specifying the set of entities to save? For example, let’s say I want to save all of the MyTestComponents, but only on entities that don’t have a DontSaveTag. Would that be possible?

If you wanted to save a built-in component, such as Parent or Translation, how would the user declare this? Is there some other way to register types?

I think you mentioned on discord that you are instantiating prefabs during deserialization. If so, will this mapping to prefabs be customizable by the user? For example, currently, all my entities have a FixedString32 in a component that can be used to fetch the corresponding prefab, but I might change it to use a uuid at some point. Is there a way to control what prefab key gets serialized and how it maps to a prefab entity?

I have another very specific problem that I’ve run into recently. I want to serialize the LinkedEntityGroup buffer on entities which contains Entity refs. In some cases, this buffer contains child entities that are purely visual. They don’t have any gameplay components on them, so they don’t need to be serialized. In fact, I don’t want to serialize them because they are part of the parent’s corresponding prefab, so they will get instantiated during deserialization anyways. Is there some way to specify which elements of the LinkedEntityGroup buffer get serialized? And is there a way to declare the buffer deserialization as additive instead of replacement?

My serialization library is also still a work-in-progress, but it currently looks something like this:

        // Describes the set of entities I want to save.
        var desc = new EntityQueryDesc
        {
            All = new[] { ComponentType.ReadOnly<EntityType>() },
            None = new[] { ComponentType.ReadOnly<HologramTag>(), ComponentType.ReadOnly<PreviewTag>() },
        };
        var world = World.DefaultGameObjectInjectionWorld;
        EntityManager entityManager = world.EntityManager;

        // Creates the serializer. The PrefabEntityMapper tells the serializer how to remap entity references into
        // prefab keys, strings in this case.
        var serializer = new EntitySerializer(entityManager.CreateEntityQuery(desc), entityManager, new PrefabEntityMapper(world));

        // Specify the component and buffer types to serialize on the entities.
        serializer.AddComponent<HexPosition>("Position");
        serializer.AddComponent<Facing>("Facing");
        serializer.AddComponent<Printer>("Printer");
        serializer.AddBuffer<LinkedEntityGroup>("Group");

        string path = "path/to/my.save";

        // Save
        using var sw = new StreamWriter(path);
        using JsonWriter writer = new JsonTextWriter(sw);
        serializer.Serialize(writer);

        // Load
        using var sr = new StreamReader(path);
        using JsonReader reader = new JsonTextReader(sr);
        serializer.Deserialize(reader);
      
        // {
        //     "Entities": [
        //     {
        //         "Entity": {
        //             "EntityIndex": 1,
        //             "Key": "belt-straight"
        //         },
        //         "Components": {
        //             "Position": {
        //                 "Value": {
        //                     "x": -5,
        //                     "y": 6
        //                 }
        //             },
        //             "Facing": {
        //                 "Value": 0
        //             },
        //             "Group": [
        //             {
        //                 "Value": {
        //                     "EntityIndex": 1,
        //                     "Key": "belt-straight"
        //                 }
        //             }
        //             ]
        //         }
        //     },
        //     ...
        //     ]
        // }
1 Like

Our approach has been to use this for saving, which I think has worked pretty well. The way we make it work is by copying everything over to new “serialized” components that we just never change. And if we stop using one, we need to at least keep it in the project to not break saves, just as you say.

So while it might not be for everyone, it works pretty well for us, and having different components for runtime and serialization often makes sense anyway. Take health: it’s an absolute value during runtime, but we save a percentage, so that when we increase the max health it isn’t deserialized with less health.
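A hedged sketch of that runtime/serialized split (the component and field names here are illustrative, not from our actual project):

```csharp
using Unity.Entities;

// Runtime component: absolute values used by gameplay systems.
public struct Health : IComponentData
{
    public float Value;
    public float Max;
}

// Serialized component: never changes once shipped; stores a ratio
// so that raising Max in a later patch doesn't shrink loaded health.
public struct SerializedHealth : IComponentData
{
    public float Percentage; // Value / Max at save time
}

// On save:  serialized.Percentage = health.Value / health.Max;
// On load:  health.Value = serialized.Percentage * health.Max;
```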

1 Like

Is this really such a big problem? It’s just loading blocks of memory/data to/from the file system.

I would call saving and loading a world reliably while avoiding breakage whenever a single bit changes in any component a big problem, yes.

My solution for our game was to copy all entities except ISystemStateComponents, managed components and prefabs into a temporary world, use MakeGenericMethod() with GetBuffer() and GetComponentData() to convert them into a serializable SavedEntity class (containing a list of component data, a list of buffers with their type information and their IBufferElementData elements, plus the entity index and version), and serialize that.
On load we deserialize, recreate the entities with their components and buffers in a temporary world by using MakeGenericMethod() with AddComponent/AddBuffer, remap entity references, do some postprocessing and move everything into the game world.
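The MakeGenericMethod() trick looks roughly like this (a sketch; `GetComponentBoxed` is a name I made up for illustration, and the overload filtering may need adjusting for your Entities version):

```csharp
using System;
using System.Linq;
using System.Reflection;
using Unity.Entities;

public static class ReflectionSaveUtil
{
    // Calls the generic EntityManager.GetComponentData<T>(Entity) with a
    // System.Type that is only known at runtime. Returns the boxed component,
    // ready to hand to a reflection-based serializer.
    public static object GetComponentBoxed(EntityManager em, Entity entity, Type componentType)
    {
        // Filter the overloads explicitly to avoid an AmbiguousMatchException.
        MethodInfo open = typeof(EntityManager).GetMethods()
            .First(m => m.Name == nameof(EntityManager.GetComponentData)
                        && m.IsGenericMethodDefinition
                        && m.GetParameters().Length == 1
                        && m.GetParameters()[0].ParameterType == typeof(Entity));

        return open.MakeGenericMethod(componentType).Invoke(em, new object[] { entity });
    }
}
```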

This has been working so far and allows us to tinker with components without breaking saves (most of the time) while we’re developing.

2 Likes

That sounds exactly like what I had working last week. However, I didn’t really like how the type information had to be serialized, and all the reflection stuff was getting messy. I decided to switch to a system where the IComponentData and IBufferElementData types are explicitly registered with the serializer, which made things a bit simpler. I’m still trying to figure out the best way to handle different kinds of migrations without breaking saves. Currently I’m thinking of just adding a layer that can rewrite the raw JSON before it is deserialized into entities.

This is basically exactly what I’m doing as well. But reflection is slow, and the state in my factorio-like is massive and subject to a lot of entropy. I find myself having to do no small amount of ad hoc “post processing” of the data to really get it all working. I wish there was a better, faster way.

I suppose I’ll be eager to see what tertle is cooking up.

I use no reflection (except GetAttribute); I just use DynamicComponentTypeHandle. I can write up my current approach, and why I’m doing it this way, tonight when I’m off work if I find some time.

6 Likes

OK here it is. Big ol’ chunk of text. Skip to 3.1 if you don’t care about a quick bit of background. I’ve used 2 very different save systems at work, and I’ve just started writing my own for my personal projects based on experience gained from the systems I’ve used.

1. Serialize the entire world
This was our first iteration, which I did not write personally. It was the simplest approach, just using SerializeWorld, and the obvious benefit is that it’s extremely easy to use. It’s mostly done for you! Or is it… We stuck with this for nearly a year and had massive tooling built to manage the issues. However…

The first obvious downside is that any change to a component’s StableTypeHash breaks it. Change a namespace, the name of the component, any field on the component, add or remove any field: your save file is dead. We replaced old components with memory-identical stubs and then migrated to the new versions. We had a huge range of tools in place to help developers migrate, including detecting and auto-generating the replacements. Problems really start creeping in, though, when you do a huge refactor that just can’t be easily migrated.

The next big issue is inflexibility. You save the entire world; you load the entire world. But what if, in between, designers have come in and changed how things should behave? Maybe the giant lobster no longer has a sing ability, or you’ve decreased the max. It’s a huge pain and a limitation on your designers to have to manually migrate the world to the new base prefabs. (NOTE: not all games want to apply changes to existing saves, in which case this doesn’t apply.)

You’re also saving a lot of data you don’t need. 90%+ of components/data you do not need to save. The worst thing is, because any component change requires a migration, you still need to migrate a component even if you don’t care what data is on it.

But the final nail in the coffin, at least for me: most Unity updates break your saves, there is little you can do about it, and there is no guarantee that at some point in the future Unity won’t make a change to the entities package that simply makes this unavoidable. That is not an acceptable risk on a launched title.

2. Serialize each ā€˜archetype’ onto their own container
So yeah, a few months out from launch we just kept randomly breaking saves. We decided it was unacceptable and we needed a new approach, so I ‘volunteered’ and got to work writing something else. I loosely based it on what I had done, and seen done, in more traditional GameObject-type games.

We built containers for each archetype, e.g.
struct BuildingSave { int Type; float3 Position; quaternion Rotation; }

To serialize, we just have a large IJobEntityBatch that reads all the data we want to save for this archetype and writes each entity into its own container. Pretty quick.
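A hedged sketch of such a job. `BuildingType` and the exact shape of `BuildingSave` are illustrative stand-ins, not the shipped code:

```csharp
using Unity.Burst;
using Unity.Collections;
using Unity.Entities;
using Unity.Mathematics;
using Unity.Transforms;

// Illustrative component carrying the prefab/type id.
public struct BuildingType : IComponentData { public int Value; }

// Per-archetype save container, as in the struct above.
public struct BuildingSave
{
    public int Type;
    public float3 Position;
    public quaternion Rotation;
}

// Sketch of the per-archetype save job: read the fields we care about
// from each chunk and append one container entry per entity.
[BurstCompile]
public struct SaveBuildingsJob : IJobEntityBatch
{
    [ReadOnly] public ComponentTypeHandle<BuildingType> TypeHandle;
    [ReadOnly] public ComponentTypeHandle<Translation> TranslationHandle;
    [ReadOnly] public ComponentTypeHandle<Rotation> RotationHandle;

    public NativeList<BuildingSave>.ParallelWriter Output;

    public void Execute(ArchetypeChunk batchInChunk, int batchIndex)
    {
        var types = batchInChunk.GetNativeArray(this.TypeHandle);
        var translations = batchInChunk.GetNativeArray(this.TranslationHandle);
        var rotations = batchInChunk.GetNativeArray(this.RotationHandle);

        for (var i = 0; i < batchInChunk.Count; i++)
        {
            this.Output.AddNoResize(new BuildingSave
            {
                Type = types[i].Value,
                Position = translations[i].Value,
                Rotation = rotations[i].Value,
            });
        }
    }
}
```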

To deserialize, we first create a default instance of each saved archetype, exactly as if you were spawning it fresh in-game. Then we just apply the saved settings to it, if it still has the component. This has the huge benefit that any changes design has made to the components on the prefab are picked up. Added new components? They’ll be there. Updated a creature’s max health? It’s there (you only need to save current health).

Migration isn’t too bad since we control the containers; however, any component change might require migrating multiple containers, which is a bit annoying. We also have to migrate the entire container instead of just a single component.

However, we do save the bare minimum and only need to migrate rarely; probably one dev updates one minor thing once a month. We shipped with this 6 months ago and have not had a major issue since. It’s not perfect, but it’s been good enough that we haven’t considered changing it since it was up and running.

This brings us to what I’m looking at doing now.
First question: why? It seems like we have a proven working solution, and that’s true. However, this is for my own project, and I can’t exactly copy code I wrote at work; doing it a completely different way helps me avoid any issues (not that I think I’d actually have an issue with my employer).

But the main reason is that it’s still not completely without fault. While it’s reasonably easy to maintain, it still takes a lot of code to set up initially, and there are some ugly things about it (keeping old systems around for each migration). I always wanted to codegen this, but doing it this way makes codegen quite a bit of work, and I ran out of time.

3.1 Serialize each Component
Disclaimer: this is a one-weekend test and has not been proven production-worthy yet.
The approach I’m taking now is to serialize each component separately. The process is simple.

Give each entity you want saved a ‘type’ component (I call it Savable). This has a reference to its prefab (either an int, a weak asset reference, whatever you want).

Then you can save each component by simply giving it a [Save] attribute (it’s also very easy to manually register types for those in 3rd-party libraries).

foreach (var type in TypeManager.AllTypes)
{
    if (type.Category == TypeManager.TypeCategory.ComponentData
        && type.Type.GetCustomAttribute(typeof(SaveAttribute)) != null)
    {
        var saver = new ComponentSave(this, type.TypeIndex);
    }
}

From the TypeIndex you can get a ComponentType and then a DynamicComponentTypeHandle:

this.System.GetDynamicComponentTypeHandle(this.componentTypeRead)

Using that, you can serialize the component data from a chunk:

foreach (var chunk in this.Chunks)
{
    var components = chunk.GetDynamicComponentDataArrayReinterpret<byte>(this.ComponentType, this.ElementSize);
    this.Serializer.AddBufferNoResize(components);
}

Very simple serialization process. No magic required except grabbing an attribute in OnCreate.

Benefits:

  • Just attach [Save] to any IComponentData or IBufferElementData and it will start working.
  • Changes to prefab are reflected in saved data.
  • Fast serialization
  • Very easy migration. Just done in 1 giant block per component.

Downsides

  • I have to store an int for each component I save to match it to the saved entity, so file size is larger.
  • Not-so-fast deserialization (though it’s actually doing much better than I expected; simply creating the entities is still the highest cost. I haven’t stress-tested really high component counts yet, so I’m expecting performance to drop there.)

Deserialize steps are basically

  1. Check each serialized component for a current matching type; if it doesn’t exist, look for a migration. If none is found, discard the data. [work in progress for this weekend]
  2. Create all entities.
  3. Apply components back one at a time. I thought this would be slow, but surprisingly it’s not nearly as bad as I expected; the biggest cost is just instantiating entities. Overall it’s not a huge deal, though, as my goal is fast serialization, since that often happens while you are playing, whereas deserialization usually happens in a load screen.
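Step 3 can reuse the same DynamicComponentTypeHandle machinery as the save path, just with a read-write handle. A sketch, where `savedBytes` and `offset` are assumed to come from the loaded file and are not part of the code above:

```csharp
// Write saved component bytes back into chunks through the same
// reinterpret trick used when saving. savedBytes holds the component
// data for this type laid out contiguously; offset tracks our position.
foreach (var chunk in this.Chunks)
{
    var components = chunk.GetDynamicComponentDataArrayReinterpret<byte>(this.ComponentType, this.ElementSize);
    components.CopyFrom(savedBytes.GetSubArray(offset, components.Length));
    offset += components.Length;
}
```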

One more note: entity references are really easy to handle. Thanks to Unity and the entity-offset info saved in the TypeManager, you can remap entities in components with something like:

public static unsafe void RemapEntityFields([ReadOnly] byte* ptr, TypeManager.EntityOffsetInfo* offsets, int offsetCount, NativeHashMap<Entity, Entity> remap)
{
    for (var i = 0; i < offsetCount; i++)
    {
        var entity = (Entity*)(ptr + offsets[i].Offset);
        *entity = remap.TryGetValue(*entity, out var newEntity) ? newEntity : Entity.Null;
    }
}

Final thoughts on 3.1.
I probably won’t get to it this weekend, as the weekend will be focused on making the migration workflow feel good, but the next thing I’m looking at implementing, which I think will be reasonably easy, is applying save data to entities in subscenes.

Also, currently I only support full-component serialization. I’ve considered this a lot; partial-component serialization isn’t that big a deal to implement, but it does make serialization a bit slower (instead of a memcpy of the entire array, each field needs an individual MemCpyStride) and migration a bit more of a pain. I haven’t decided if I’m going to support it yet, as I don’t think separating components with save data from those without is that bad a plan. That said, I will probably add the option, just to minimize its use, but it will be at the tail end of my feature implementation.

3.2 Serialize manually per chunk
Wait, what, 3.2? Yep. After I wrote 3.1, I decided to see if I could do a version that was faster and didn’t need to store reference ints per component, to decrease file size. This is basically the same as 3.1, but instead each possible component that can be saved is stored per chunk.

This is definitely even faster to serialize, a bit faster to load, and makes a measurably smaller file.
However, when I was planning out migration, I realized it was going to be a lot more rigid and painful to manage, so I decided to go back to my original plan. It’s still fast, and file size really isn’t that big an issue; we’re only talking 3 MB compressed for 8 components on 100k entities (which is a lot more entities than I’ll probably ever need to save, though I will need more components).

I’m not ruling out switching back to 3.2 at some point if I can figure out migration, but for now I’m sticking with 3.1 and fleshing out its migration.

Final Thoughts
I’ve loosely heard of a few more alternatives for saving: storing components in blobs, etc. I have no experience with those. Others might have completely different solutions, or have found ways around the downsides of SerializeWorld, in which case: great, please share!

19 Likes

Thank you for the great writeup @tertle , definitely appreciate the thought process.
For a previous project we also used something akin to method 2, storing certain components in a separate data structure. We made sure to avoid entity references at all costs, though, so some extra components were needed to, for example, identify the player or other serialized entities. It was very custom in the end, so I also tried to implement something more general like you described in method 3, but using code generation; I gave up, though, as integrating the generators, making sure people don’t forget to trigger them, and writing it all took more time than it was worth, tbh.
Using DynamicComponentTypeHandle is pretty clever though, and the fast-serialize, slower-deserialize tradeoff is imho very well chosen. Speaking from experience, 3 MB for a save game is also not uncommon, especially on console, where most of the time you end up with a minimum size of a few MB anyway.

Regarding full-component serialization: imho I would not support serializing components partially at all. ECS is already well suited to breaking big components up into multiple smaller ones, so that is what you should do. In the past we did partial serialization of MonoBehaviours and it was very error-prone and confusing for engineers coming into the project. Slowing down serialization for that is not a good tradeoff imho.

1 Like