I’ve been using FixedString64Bytes to store unique identifiers for persistence (to map scene objects / prefabs for saving / loading).
The problem - with thousands of objects it blows up file size really fast.
(with 1 id to 1 object)
So I’ve tried FixedString32Bytes.
And it turns out I can cut the id roughly in half and still have it somewhat readable
(e.g. by removing object prefixes, underscores and vowels, and halving the random id part).
Reducing from 64 → 32 bytes yielded ~40% less space wasted, less memory consumed and faster serialization / deserialization.
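A minimal sketch of that kind of shortening, in Python for brevity rather than Unity C#. The prefix list, the function names and the rule of keeping the first character are assumptions for illustration, not the exact rules used in the project.

```python
import re

def shorten_name(name, prefixes=("prefab_", "scene_")):
    """Compact an object name while keeping it roughly readable."""
    # Drop one known object prefix, if present (the prefix list is hypothetical).
    for prefix in prefixes:
        if name.startswith(prefix):
            name = name[len(prefix):]
            break
    # Remove underscores.
    name = name.replace("_", "")
    # Remove vowels, but keep the first character so the word stays recognizable.
    return name[:1] + re.sub(r"[aeiouAEIOU]", "", name[1:])

def shorten_id(name, random_part):
    """Combine the compacted name with the first half of the random id part."""
    return shorten_name(name) + random_part[: len(random_part) // 2]
```

For example, `shorten_id("prefab_health_potion", "a1b2c3d4")` gives `"hlthptna1b2"`, which is 11 bytes instead of 28.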
The real solution would be to switch to something like uint instead. But then the ability to debug would be lost.
It’s really nice to have something human readable to determine what is missing or changed during remapping. Especially when using binary serialization / deserialization.
I’m pretty sure it’s possible to trim it down even further, but there isn’t anything like a 16-byte string, or better, a 24-byte one (since 3 bytes are taken by internal fields).
There’s FixedBytes16 but I’d really like to avoid re-implementing the fixed string if possible.
Edit: A separate truncation check method would also be nice to have. Right now it’s only possible via try / catch.
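Until something like that exists, one workaround is to measure the UTF-8 length before assigning. A rough sketch in Python, assuming the usable payload is the declared size minus the 3 bytes of internal fields mentioned above; the `CAPACITY` table and `fits` are hypothetical names, not part of the Collections package.

```python
# Usable UTF-8 bytes per type: declared size minus 3 bytes of internal fields.
# These numbers follow from the assumption stated above.
CAPACITY = {
    "FixedString32Bytes": 32 - 3,
    "FixedString64Bytes": 64 - 3,
}

def fits(text, type_name):
    """Return True if text would fit without truncation."""
    # Count encoded bytes, not characters: non-ASCII characters take more than one byte.
    return len(text.encode("utf-8")) <= CAPACITY[type_name]
```

The same check in C# would compare the UTF-8 byte count of the source string against the capacity of the target type before copying.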
Using FixedStrings as UIDs sounds like a waste to me, and potentially slow. What prevents you from designing it as a combination of some integers?
This: “It’s really nice to have something human readable to determine what is missing or changed during remapping.”
This particular problem was addressed in Rukhanka by introducing a special debug operation mode enabled by the script compilation symbol. In this mode, I will keep all names in FixedString fields.
Versioning & debugging purposes. In case ids ever change, you can figure out what fails and where.
Potentially override data by id manually.
Can’t do that with just ints unless you track each change done with each remapping.
Meaning you’d have to store that remapping table somewhere forever for all game designers / versions. Which is near impossible and leads to save files that become incompatible across versions / major changes.
One uint would be 4 bytes so I’d rather pay x4 (e.g. for FixedString16Bytes) per id to have a more robust versioning.
E.g. a player messing up the scene somewhat resulted in 5 MB per save with FixedString64Bytes.
With FixedString32Bytes it’s 2.2 MB;
With 16 bytes it’s probably gonna be 30% less.
So, overall, it’s not that bad, but it could be better.
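For a feel of the raw per-id cost, here is a back-of-the-envelope sketch in Python. The object count is a made-up number and this counts id storage only, so it won't match the measured save sizes above, which also include the actual object data.

```python
def id_storage_bytes(object_count, bytes_per_id):
    """Bytes spent on ids alone, with one id per object."""
    return object_count * bytes_per_id

OBJECT_COUNT = 10_000  # hypothetical number of saved objects

SIZES = {
    "FixedString64Bytes": 64,
    "FixedString32Bytes": 32,
    "16-byte string (does not exist yet)": 16,
    "uint": 4,
}

for label, size in SIZES.items():
    print(f"{label}: {id_storage_bytes(OBJECT_COUNT, size)} bytes")
```

With 10,000 objects that is 640,000 bytes of ids at 64 bytes each versus 40,000 bytes with uint, so a 16-byte id sits at 4x the uint cost, as mentioned above.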
SL Offtopic
Depending on the type of game I’d say it’s decent even with FixedString32Bytes.
E.g. Rimworld with JSON serialization produces a ~30-50MB save file on average. But for a first person action RPG with quick saving / quick loading / autosaving, that kind of stall during the process isn’t that good. Fortunately the actual file write can be offloaded to a separate thread. Unfortunately, it doesn’t get faster than IO speed.
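Offloading the write could look roughly like this (Python sketch, not Unity code). It assumes the save has already been serialized into a byte buffer on the main thread; `save_async` is a hypothetical helper name.

```python
import threading

def save_async(path, payload):
    """Write an already-serialized save on a background thread.

    Returns the thread so the caller can join it, e.g. before quitting
    or before starting another save to the same path.
    """
    def _write():
        with open(path, "wb") as f:
            f.write(payload)

    worker = threading.Thread(target=_write)
    worker.start()
    return worker
```

This only hides the stall from the frame. The total time until the file is on disk is still bounded by IO speed.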
I actually use strings as unique identifiers for bone names in Kinemation for the runtime binding feature (so not just debug, but actual gameplay). Now granted, I store these in blobs rather than fixed strings, but using strings directly not only avoids collisions from string hashing, but the strings are actually really short in a lot of bones, so they can save memory. And performance is fine, because I schedule a worker thread job to run alongside the sync point and it becomes a race which thread finishes first. Worker thread using the strings always wins.
This sounds interesting, I never thought about strings like this. Though I’m still struggling to figure out how your string UIDs are being used. Could you please give me an example of what they look like? Thanks.
Yeah, I was thinking about copy & pasting the existing one, since FixedString is code-generated anyway.
But then again, I’d rather have an officially supported version so that no updates are required each time the Collections package updates.
So basically:
If a scene object / prefab instanced object has an implementation, it gets packed into a separate package;
Each scene object and prefab object [pooled / instantiated object] packs a unique id to its “prototype”.
Ids are generated in the editor & checked for uniqueness.
If an id is valid, it does not change during remapping.
Basically it’s a field inside the package. This id is the FixedString.
If something goes wrong during unpacking I can just log out the stored id, which contains the object name / data.
This allows me to figure out which kind of asset is broken, since the id usually contains a prefix, e.g. “pckp” for pickup, and the name. So at least you’ve got some clues about where to begin looking. And since packages are split into two groups, from there it’s pretty trivial to find what’s been broken since the previous change.
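As an illustration of why the readable id helps, here is a small Python sketch of such a log message. Only the “pckp” prefix comes from the setup described above; the lookup table, function name and message format are made up.

```python
# Known id prefixes and the asset kind they hint at.
# Only "pckp" is a real example; extend the table as needed.
PREFIX_HINTS = {"pckp": "pickup"}

def describe_unpack_failure(uid):
    """Build a log message that points at the kind of asset that failed."""
    for prefix, kind in PREFIX_HINTS.items():
        if uid.startswith(prefix):
            return f"Failed to unpack '{uid}' (looks like a {kind})"
    return f"Failed to unpack '{uid}' (unknown asset kind)"
```

With a plain uint id the same message could only print a number, which says nothing about where to start looking.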
In future I’m thinking about writing more robust versioning on top if necessary.
If game ever goes live.
Those ids that may go missing during changes might get replaced by manually generated packages with the required “patch” data.
So theoretically I can still cut down some overhead by using something like MultiHashMap<UID, Package> for the prefab objects path, since multiple copies of the same instanced prefab will have identical ids.
But at the same time I don’t want more complexity than there already is, so I’ll just leave it as is for now.
Scene objects however always have unique ids so not much can be done about it.
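The multi-map idea for prefab instances, sketched in Python with a dict of lists standing in for a MultiHashMap. The data shape (pairs of uid and package) is an assumption.

```python
from collections import defaultdict

def group_by_uid(instances):
    """Group packages of instanced prefabs under their shared prototype id.

    instances: iterable of (uid, package) pairs.
    Each uid is then stored once per group instead of once per instance.
    """
    groups = defaultdict(list)
    for uid, package in instances:
        groups[uid].append(package)
    return dict(groups)
```

For example, three crates spawned from the same prefab would store the id once with three packages under it. Scene objects would not benefit, since every group would contain exactly one entry.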
Note that objects are default UnityEngine.Object. Like MonoBehaviours / SO’s.
I’m using a hybrid setup, so entity authoring is part of those objects.
No entities are involved until the actual objects are generated at runtime.
Which means I can just grab their asset name.
So anyway, the generation:
Iterate over existing objects in the opened scene setup / search the project:
Grab existing id;
Check if it exists in the buffer; if it does, generate a new one; if that fails → notify the user to rename it / re-run the process;
Grab name of the object;
Trim it to fit in a FixedString of the target size. E.g. remove ‘_’ prefixes, run a couple of regexes;
Generate unique symbols at the end (like a short UID);
Store it in the behaviour / update prefab lookup / update SO lookup for runtime use;
Uniqueness is guaranteed per object group. Scene objects / prefabs / SO’s are stored separately.
So as long as ids are not compared against other groups / buckets, hash collisions are non-existent unless uids match exactly. And well, if they do, the user will know and can just rename and re-run the process.
Or a new id will be generated automatically (in most cases).
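The generation steps above can be sketched like this in Python. The 29-byte capacity, the 4-character hex suffix, the retry count and all names are assumptions for illustration. The real version runs in the Unity editor and trims names with its own rules.

```python
import secrets

def assign_id(existing_id, name, taken, capacity=29, suffix_len=4, attempts=8):
    """Return a unique id within one group, or None if the user has to rename.

    taken is the set of ids already used in this group (scene objects,
    prefabs and SO's each get their own set).
    """
    # A valid, unused existing id is kept as is so saved data stays mappable.
    if existing_id and existing_id not in taken:
        taken.add(existing_id)
        return existing_id

    # Trim the name by bytes so name + suffix fits the capacity.
    # "ignore" drops a multi-byte character cut in half by the slice.
    stem = name.encode("utf-8")[: capacity - suffix_len].decode("utf-8", "ignore")

    for _ in range(attempts):
        candidate = stem + secrets.token_hex(suffix_len // 2)  # short random tail
        if candidate not in taken:
            taken.add(candidate)
            return candidate
    return None  # caller notifies the user to rename / re-run
```

Usage would be one `taken` set per group, e.g. `taken = {"scene": set(), "prefab": set(), "so": set()}`, so ids are only ever compared within their own bucket.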
The dynamic data remapping phase happens later, which allows binding Entity ↔ Entity during saving / loading.
Technically, UID is also involved in it, though it does not matter that much as links / references between entities aren’t used as often. And as a result doesn’t contribute to the file size / memory / performance of S/L as much.
Though using a FixedString16 would also improve it, as the uid type is universal across the project.