More Compact Save Files

I’m currently working on saving for my Terraria/Starbound-like game. I’ve brought the size of the save down quite a bit, but it’s not enough: the average large world save in Terraria is around 12 MB, while my small world is 36 MB. I know of a few different tricks, but what I’m focussing on now is reducing the size of the Tile object. It stores 15 bytes of info, but saving one takes up 329 bytes on my drive. I know there’s got to be a more efficient way of storing it without compression. I’ve tried structs, but they take even more space (333 bytes). Any suggestions will help, thanks!

15 bytes of info but saving one takes up 329 bytes on my drive

How? You are not serializing that as text, surely?

This is how you can serialize structs manually:

using System.Collections.Generic;
using UnityEngine;
using BitConverter = System.BitConverter;
  
public class TestSerialization : MonoBehaviour
{
    [SerializeField] TILE input = default(TILE);
    [SerializeField] byte[] bytes;
    [SerializeField] TILE output;
  
    void OnValidate ()
    {
        bytes = input.ToBytes();
        int i = 0;
        output = TILE.FromBytes( bytes , ref i );
    }
  
    [System.Serializable]
    public struct TILE
    {
        public ushort x, y;
        public byte BlockID, WallID, OreID, NatID;
        public float LiquidAmt;
        public byte Bitmask, SettleCount, flag0;
  
        // Serializes this tile into a new 15-byte array by reusing the buffer overload below.
        // Note: BitConverter uses the machine's byte order, so the bytes are only portable
        // between platforms with the same endianness.
        public byte[] ToBytes ()
        {
            List<byte> bytes = new List<byte>( 15 );
            ToBytes( bytes );
            return bytes.ToArray();
        }
        public void ToBytes ( List<byte> buffer )
        {
            buffer.AddRange( BitConverter.GetBytes(x) );
            buffer.AddRange( BitConverter.GetBytes(y) );
            buffer.Add( BlockID );
            buffer.Add( WallID );
            buffer.Add( OreID );
            buffer.Add( NatID );
            buffer.AddRange( BitConverter.GetBytes(LiquidAmt) );
            buffer.Add( Bitmask );
            buffer.Add( SettleCount );
            buffer.Add( flag0 );
        }
  
        public static TILE FromBytes ( byte[] bytes , ref int index )
        {
            TILE result = default(TILE);
            {
                result.x = BitConverter.ToUInt16( bytes , index ); index += 2;
                result.y = BitConverter.ToUInt16( bytes , index ); index += 2;
                result.BlockID = bytes[ index++ ];
                result.WallID = bytes[ index++ ];
                result.OreID = bytes[ index++ ];
                result.NatID = bytes[ index++ ];
                result.LiquidAmt = BitConverter.ToSingle( bytes , index ); index += 4;
                result.Bitmask = bytes[ index++ ];
                result.SettleCount = bytes[ index++ ];
                result.flag0 = bytes[ index++ ];
            }
            return result;
        }
    }
}

Proof that this works and produces exactly 15 bytes every time: https://i.imgur.com/jBjhiW5l.jpg
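One further saving, if the tiles live in a fixed-size grid: x and y don't need to be stored per tile, because they follow from the tile's position in the file. Here is a rough sketch of that idea using BinaryWriter. GridTile and GridSerializer are names I made up for illustration, and the row-major layout with a small width/height header is just an assumption about how the world is organised:

```csharp
using System.IO;

// Hypothetical tile without x/y: 4 ID bytes + 4-byte float + 3 flag bytes = 11 bytes.
public struct GridTile
{
    public byte BlockID, WallID, OreID, NatID;
    public float LiquidAmt;
    public byte Bitmask, SettleCount, flag0;
}

public static class GridSerializer
{
    public const int BytesPerTile = 11;

    // Writes a 4-byte header (width, height) followed by the tiles in row-major order.
    public static byte[] Write(GridTile[,] tiles)
    {
        int width = tiles.GetLength(0), height = tiles.GetLength(1);
        using (var stream = new MemoryStream(4 + width * height * BytesPerTile))
        using (var writer = new BinaryWriter(stream))
        {
            writer.Write((ushort)width);
            writer.Write((ushort)height);
            for (int y = 0; y < height; y++)
            for (int x = 0; x < width; x++)
            {
                GridTile t = tiles[x, y];
                writer.Write(t.BlockID);
                writer.Write(t.WallID);
                writer.Write(t.OreID);
                writer.Write(t.NatID);
                writer.Write(t.LiquidAmt);
                writer.Write(t.Bitmask);
                writer.Write(t.SettleCount);
                writer.Write(t.flag0);
            }
            writer.Flush();
            return stream.ToArray();
        }
    }

    // Reads the header, then rebuilds each tile; x and y come from the loop position.
    public static GridTile[,] Read(byte[] bytes)
    {
        using (var reader = new BinaryReader(new MemoryStream(bytes)))
        {
            int width = reader.ReadUInt16();
            int height = reader.ReadUInt16();
            var tiles = new GridTile[width, height];
            for (int y = 0; y < height; y++)
            for (int x = 0; x < width; x++)
            {
                var t = new GridTile();
                t.BlockID = reader.ReadByte();
                t.WallID = reader.ReadByte();
                t.OreID = reader.ReadByte();
                t.NatID = reader.ReadByte();
                t.LiquidAmt = reader.ReadSingle();
                t.Bitmask = reader.ReadByte();
                t.SettleCount = reader.ReadByte();
                t.flag0 = reader.ReadByte();
                tiles[x, y] = t;
            }
            return tiles;
        }
    }
}
```

With this layout a world costs 4 bytes plus 11 bytes per tile. BinaryWriter also always writes little-endian, which sidesteps the byte-order question that comes with BitConverter.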

Check out the com.unity.serialization package. It lets you save to binary quite easily (if you can get past the sparse docs), and you can create highly efficient binary streams that way.

And then, once you have the byte array, just run it through GZip to compress it. I use these two extension methods:

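using System;
using System.IO;
using System.IO.Compression;

// Extension methods must live in a static class; the class name here is arbitrary.
public static class ByteArrayCompression
{
	public static Byte[] Compress(this Byte[] uncompressedBuffer)
	{
		using (var memoryStream = new MemoryStream())
		{
			// Dispose the GZipStream before reading the result so the final block and footer get flushed.
			using (var zipStream = new GZipStream(memoryStream, CompressionMode.Compress))
			{
				zipStream.Write(uncompressedBuffer, 0, uncompressedBuffer.Length);
			}
			// ToArray still works after the MemoryStream has been closed.
			return memoryStream.ToArray();
		}
	}

	public static Byte[] Decompress(this Byte[] compressedBuffer)
	{
		using (var compressedStream = new MemoryStream(compressedBuffer))
		using (var unzipStream = new GZipStream(compressedStream, CompressionMode.Decompress))
		using (var uncompressedStream = new MemoryStream())
		{
			unzipStream.CopyTo(uncompressedStream);
			return uncompressedStream.ToArray();
		}
	}
}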
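If the save goes straight to disk, you can also skip the intermediate compressed array and chain the streams. This is only a sketch of that variation: WorldFile, Save and Load are hypothetical names, and the file layout (an int length followed by the raw bytes) is something I picked for the example.

```csharp
using System.IO;
using System.IO.Compression;

// Hypothetical helper: writes a length-prefixed byte array through GZip directly to a file.
public static class WorldFile
{
    public static void Save(string path, byte[] data)
    {
        using (var file = File.Create(path))
        using (var gzip = new GZipStream(file, CompressionMode.Compress))
        using (var writer = new BinaryWriter(gzip))
        {
            writer.Write(data.Length); // 4-byte length prefix
            writer.Write(data);        // raw payload
        }
    }

    public static byte[] Load(string path)
    {
        using (var file = File.OpenRead(path))
        using (var gzip = new GZipStream(file, CompressionMode.Decompress))
        using (var reader = new BinaryReader(gzip))
        {
            int length = reader.ReadInt32();
            return reader.ReadBytes(length);
        }
    }
}
```

Something like WorldFile.Save(path, bytes) followed by WorldFile.Load(path) should hand back the same bytes.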

As I recall, Terraria uses simple run-length encoding in its world save data. This significantly improves efficiency for the sky as well as for common materials like dirt and stone.
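To show what run-length encoding looks like in general, here is a minimal sketch over a flat array of block IDs. This is not Terraria's actual file format; the RunLength class and the (count, value) pair layout with runs capped at 255 are my own assumptions for the example.

```csharp
using System.Collections.Generic;

// Hypothetical RLE helper: stores each run as a (count, value) byte pair.
public static class RunLength
{
    public static byte[] Encode(byte[] data)
    {
        var output = new List<byte>();
        int i = 0;
        while (i < data.Length)
        {
            byte value = data[i];
            int run = 1;
            // Extend the run while the value repeats; the count must fit in one byte.
            while (i + run < data.Length && data[i + run] == value && run < 255)
                run++;
            output.Add((byte)run);
            output.Add(value);
            i += run;
        }
        return output.ToArray();
    }

    public static byte[] Decode(byte[] encoded)
    {
        var output = new List<byte>();
        for (int i = 0; i + 1 < encoded.Length; i += 2)
        {
            // encoded[i] is the run length, encoded[i + 1] is the repeated value.
            for (int n = 0; n < encoded[i]; n++)
                output.Add(encoded[i + 1]);
        }
        return output.ToArray();
    }
}
```

A row of 200 sky tiles collapses to 2 bytes this way, while data with no repeats doubles in size, so it pays off mainly on large uniform regions.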

I’m not positive exactly how Starbound handles its save data, but it most likely records only the changes from the initial seeded, procedurally generated state of a given world, rather than saving out every world you’ve been to in full.
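The "save only the changes" idea can be sketched roughly like this. To be clear, this is not how Starbound actually does it (I don't know its format); WorldDelta and its methods are hypothetical, and Generate is a stand-in for whatever deterministic generator your game uses.

```csharp
using System.Collections.Generic;

public static class WorldDelta
{
    // Stand-in generator: the same seed must always produce the same tiles.
    // A real game would want its own generator rather than relying on System.Random,
    // whose output is not promised to stay identical across runtime versions.
    public static byte[] Generate(int seed, int tileCount)
    {
        var rng = new System.Random(seed);
        var tiles = new byte[tileCount];
        for (int i = 0; i < tileCount; i++)
            tiles[i] = (byte)rng.Next(0, 4);
        return tiles;
    }

    // Collects only the tiles that differ from the freshly generated world.
    public static Dictionary<int, byte> Diff(byte[] generated, byte[] current)
    {
        var changes = new Dictionary<int, byte>();
        for (int i = 0; i < current.Length; i++)
            if (current[i] != generated[i])
                changes[i] = current[i];
        return changes;
    }

    // Rebuilds the current world from the seed plus the saved changes.
    public static byte[] Apply(int seed, int tileCount, Dictionary<int, byte> changes)
    {
        byte[] tiles = Generate(seed, tileCount);
        foreach (var change in changes)
            tiles[change.Key] = change.Value;
        return tiles;
    }
}
```

The save file then only needs the seed, the world size and the change list, so an untouched world costs almost nothing.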

In either case, it sounds like you would probably be better off planning your own data output, rather than relying on automated serialization to guess for you. (For example, neither of those games would deem it necessary to save the current, partially destroyed state of a block, whereas automatic serialization could easily try to include that information.)