Best Image Format?

Good Morning,

So I have a bunch of animation images for a character in my game. I have imported images into my games before from outside of the Unity project directory, and I was wondering:

since a Texture2D must be created first, before the tex can LoadImage(filebytedata), is there any way to obtain the dimensions of the image before creating the texture?

I intend to create my own format to read these images more conveniently, combining all of the frames into a single file that I can read to obtain this type of data more easily, in addition to modifying said data, for example key colours.

If I open a PNG file, for example, as text, everything is encoded so that I cannot quickly obtain the dimension data in a single parse from a specific line.
Are there other common readable formats where such data can already be obtained?

Or am I better off writing my own format?
The purpose of this being that, since a single character can have as many as 60 or more frames, I’d rather interact with a single file to obtain a single character’s entire frame content, instead of giving each character its own directory full of these files, or otherwise having a mixed bag of jumbled frames to sort through. I feel compacting the data of multiple frames into a single file is the way to go.

Any thoughts?

Here’s a quick example of the problem:

 public void IMPORT_BULK_IMAGES()
    {
        IMAGEDIRECTORIES = Directory.GetFiles("C:/Users/SCHar/Desktop/COMPILER");
        List<Texture2D> COMPACT_LIST = new List<Texture2D>();
        List<string> FRAME_ID = new List<string>();
        for (int i = 0; i < IMAGEDIRECTORIES.Length; i++)
        {
            if (File.Exists(IMAGEDIRECTORIES[i]))
            {
                var fileData = File.ReadAllBytes(IMAGEDIRECTORIES[i]);
                // So I must read the bytes ^^ to obtain the texture dimensions?
                var tex = new Texture2D(UNKNOWN_DIMENSION_X, UNKNOWN_DIMENSION_Y);
                // And now I load the image >>
                tex.LoadImage(fileData);
                COMPACT_LIST.Add(tex); // keep the frame so it actually reaches CREATE_COMPACT_FILE
                // Path.GetFileNameWithoutExtension handles both '/' and '\\'
                // separators and any extension length, unlike a manual
                // char-by-char parse that assumes a 4-character extension
                FRAME_ID.Add(Path.GetFileNameWithoutExtension(IMAGEDIRECTORIES[i]));
            }
        }
        CREATE_COMPACT_FILE(COMPACT_LIST, "INFANTRY_A", FRAME_ID);
    }

Onwards it will go →

public void CREATE_COMPACT_FILE(List<Texture2D> COMPACT_LIST, string COMPACT_FILE_NAME, List<string> FRAME_ID)
    {
        // Here, interpret FRAME_ID (which may satisfy Contains("W_") for walk, for example)
    }

Ending with

 public void IMPORT_COMPACT_IMAGES() // HOW I WILL READ THE FINAL COMPACT FILE
    {
        IMAGEDIRECTORIES = Directory.GetFiles("C:/Users/SCHar/Desktop/OBJECTIVE/DATA/ART");
        for (int i = 0; i < IMAGEDIRECTORIES.Length; i++)
        {
            if (File.Exists(IMAGEDIRECTORIES[i]))
            {
                var MOVE = IMAGEDIRECTORIES[i];
                var MOVELINE = File.ReadAllLines(MOVE);
                for (int l = 0; l < MOVELINE.Length; l++)
                {
                    // I'll rebuild the images from the colour data, tags, and texture dimensions
                }
            }
        }
    }

Other format examples: I just need Vector4 Color data, dimension data, and the potential for tag locations. None of this binary “encryption”; I’m effectively going for a non-binary, plain-text format.

That’s because it’s not text, it’s raw binary. However, it’s a published specification, so you can look up the spec and figure out how to read it.

According to the spec at FileFormat.info, the width and height of a PNG image are the first two DWORDs of the IHDR chunk’s data. The IHDR chunk follows the 8-byte PNG signature and starts with its own 4-byte length and 4-byte type fields, so counting from the start of the file the width occupies bytes 17 through 20 and the height bytes 21 through 24 (zero-based offsets 16 and 20), both stored big-endian.
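In C# that lookup is only a few lines. A minimal sketch, assuming a well-formed PNG (the PngHeader class is my own naming; the offsets come from the spec, and the values are big-endian, so they need reversing on little-endian machines):

using System;
using System.IO;

public static class PngHeader
{
    // PNG layout: 8-byte signature, then the IHDR chunk's 4-byte length
    // and 4-byte type, so width sits at byte offset 16 and height at 20.
    // Both are big-endian 32-bit integers.
    public static (int width, int height) Read(string path)
    {
        using (var stream = File.OpenRead(path))
        using (var reader = new BinaryReader(stream))
        {
            stream.Seek(16, SeekOrigin.Begin);
            byte[] w = reader.ReadBytes(4);
            byte[] h = reader.ReadBytes(4);
            if (BitConverter.IsLittleEndian)
            {
                Array.Reverse(w); // convert big-endian file order
                Array.Reverse(h); // to the machine's native order
            }
            return (BitConverter.ToInt32(w, 0), BitConverter.ToInt32(h, 0));
        }
    }
}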

But…

I assume that your art tool doesn’t already support some kind of packaged format, otherwise you wouldn’t be looking into this.

With that in mind, my real question is: how does doing this help you?

I’d just output folders of images from my art tool, and write an Editor extension in Unity to quickly make animation assets (or whatever you’re using) from a folder of images. Yes, you still have a folder of images, but aside from when you draw them you shouldn’t have to mess with the files.

I fully understand that having associated objects packaged in one file is neater, but one thing to note is that it’s also worse for most version control systems. Every time you change a frame you’d need to re-upload the whole package.

I’m not usually a 2D developer, so please say so if I’m overlooking something.


If you really want to do this just for fun, here’s what I’d do. (Edit: see next post.)

In an art tool I could write a plugin for, I’d write one which does the following:
1. Check / enforce that all images are same format.
2. Export all files as .PNG to some temp location.
3. Create a little JSON / XML / whatever file with metadata in it - name of animation, number of frames, timing, whatever else is relevant.
4. Package all of the files, including the metadata, using some compression format which is also available in Unity.
5. Write an importer for the Unity Editor which does the above backwards, but puts the extracted pixels into Texture2D assets.

This does strike me as a fair bit of work for a programmer to create (and likely maintain) to save other people from doing a relatively small amount of work. Again, though, I could be overlooking something, as I’m not a 2D dev.


On slightly further thought, I wouldn’t even bother with the compression, because PNG is already compressed.

  1. Check / enforce that all images are same format.
  2. Export all files as .PNG to some temp location.
  3. Create a little JSON / XML / whatever string with metadata in it - name of animation, number of frames, timing, whatever else is relevant, and a list of the sizes of each PNG file in bytes.
  4. Open a file stream, write the metadata to it, then the contents of each PNG file (sketched just below).
  5. Write an importer for the Unity Editor which reads the metadata, then uses it to read each PNG in sequence.
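A rough sketch of steps 3 and 4 (the AnimationPacker name and the exact metadata layout are improvised for illustration):

using System.IO;
using System.Text;

public static class AnimationPacker
{
    // Writes one ASCII metadata line (name, frame count, byte size of each
    // PNG), then appends every PNG verbatim. The importer reads the line
    // first, then slices the remaining bytes using the recorded sizes.
    public static void Pack(string outputPath, string animName, string[] pngPaths)
    {
        using (var output = File.Create(outputPath))
        {
            var meta = new StringBuilder(animName).Append(' ').Append(pngPaths.Length);
            foreach (var p in pngPaths)
                meta.Append(' ').Append(new FileInfo(p).Length);
            byte[] header = Encoding.ASCII.GetBytes(meta.Append('\n').ToString());
            output.Write(header, 0, header.Length);

            // The PNGs stay compressed; they are just concatenated.
            foreach (var p in pngPaths)
            {
                byte[] data = File.ReadAllBytes(p);
                output.Write(data, 0, data.Length);
            }
        }
    }
}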

Note that before an image can be used on the GPU it has to be loaded into a texture format or fully decompressed. When you put an image file in the Editor it does this at build time, so your game doesn’t have to do it at runtime. If you’re loading images from a format such as PNG at runtime then you’re making more work (they need to be decompressed at least, and then possibly put into another format) and this will slow down load times.

Does that matter? That’ll depend entirely on your game.

Can you improve on it? Definitely, but the complexity of that is more than I can write in a couple of forum posts while waiting for a game to update. :wink:


It’s one of those cases where you need a chocolate bar to make a chocolate bar.

For simplicity, it can act as a bulk image modifier: I could set all reds to yellow and have them pre-baked in memory, instead of running a real-time shader to convert these colours, as another alternative.
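For instance, a pre-baked red-to-yellow swap could look something like this sketch (the PaletteSwap name and the exact-match rule are my own assumptions):

using UnityEngine;

public static class PaletteSwap
{
    // Bulk-recolours a readable texture in memory: every pixel that exactly
    // matches `from` becomes `to`. An offline alternative to a runtime shader.
    public static void Swap(Texture2D tex, Color32 from, Color32 to)
    {
        Color32[] pixels = tex.GetPixels32();
        for (int i = 0; i < pixels.Length; i++)
        {
            Color32 p = pixels[i];
            if (p.r == from.r && p.g == from.g && p.b == from.b && p.a == from.a)
                pixels[i] = to;
        }
        tex.SetPixels32(pixels);
        tex.Apply(); // push the modified pixels back to the GPU copy
    }
}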

I know in the end I can use Unity’s PNG importer to make the file I want, but as for obtaining those specific byte values you mentioned, I am about to pursue that route right now.

But a shader is both simpler and more flexible…


You are probably correct, but I need to know for sure. I could also arrange such frames into a custom atlas of sorts, convert back to PNG, and edit the entire sheet.

I found the above; maybe that’s the solution.

I don’t quite understand what the problem is :slight_smile: The image data will be replaced on load anyway, no matter what the dimensions of the Texture2D were before.

What exact problem do you want to solve? I once created my PNGTools, which provides a way to access all chunks stored in a PNG file. Though I created it quickly in order to change the PPI settings of a PNG file, as was asked in this question, so currently I just parse the file into separate chunks, and the chunks themselves are not further parsed. The “IHDR” chunk, which is the header chunk and always comes first, actually contains the width and height as its first 8 bytes. It’s quite trivial to read those out. Though the question remains: what exact problem should this solve? Why not just load the image and then read the width and height?

It is common to create spritesheets, yes. But it’s usually not done to simplify loading; it’s done to get better runtime performance, since using a spritesheet means you only have a single texture in video memory and you only choose a subsection of that single texture. That means several objects can use the exact same material and can be batched.
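For illustration, carving sprites out of one sheet might look like this sketch (assuming hypothetical 28x32 frames; the names are mine):

using UnityEngine;

public class SheetSlicer : MonoBehaviour
{
    public Texture2D sheet; // one atlas texture shared by every frame

    void Start()
    {
        // Each sprite is just a rect into the same texture, so renderers
        // using them can share one material and be batched together.
        Sprite frame0 = Sprite.Create(sheet, new Rect(0, 0, 28, 32), new Vector2(0.5f, 0.5f));
        Sprite frame1 = Sprite.Create(sheet, new Rect(28, 0, 28, 32), new Vector2(0.5f, 0.5f));
    }
}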

I just want to make this clear: when you use LoadImage on a Texture2D, the image data is completely replaced. So it’s common to create the Texture2D with dimensions (1, 1); after the LoadImage call it’s replaced anyway.
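So the whole loading step reduces to a minimal sketch like this (the LoadFrame wrapper name is mine):

using System.IO;
using UnityEngine;

public static class TextureLoading
{
    // The (1, 1) is a throwaway: LoadImage replaces both the pixel data
    // and the dimensions with whatever the image file actually contains.
    public static Texture2D LoadFrame(string path)
    {
        var tex = new Texture2D(1, 1);
        tex.LoadImage(File.ReadAllBytes(path));
        return tex; // tex.width / tex.height now hold the real dimensions
    }
}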


Oh, okay. Now I feel stupid. Thank you for the everyday genius that you provide, Bunny. Honestly, how you know everything I can’t imagine.

But I did get as far as this, and I may continue the coding adventure just to see what is at the end of this nook and cranny:

 for (int i = 0; i < 1000; i++)
        {
            if (!File.Exists("C:/Users/SCHar/Desktop/" + "INFANTRY_" + i))
            {
                var COMPACTFILE = File.Create("C:/Users/SCHar/Desktop/" + "INFANTRY_" + i);
                var SW = new StreamWriter(COMPACTFILE);
                for (int c = 0; c < COMPACT_LIST.Count; c++)
                {
                    SW.WriteLine(FRAME_ID[c]);
                    SW.WriteLine("d " + COMPACT_LIST[c].width + " " + COMPACT_LIST[c].height);
                    var TA = COMPACT_LIST[c].GetPixels();
                    for (int t = 0; t < TA.Length; t++)
                    {
                        SW.WriteLine("c " + TA[t].r + " " + TA[t].g + " " + TA[t].b + " " + TA[t].a);
                    }
                }
                SW.Close(); // Close() flushes the writer, so no per-line Flush() is needed
                Debug.Log("WE DID");
                break;
            }
        }
A01 // My Frame ID
d 28 32 // My Width and Height :)
c 0 0 0 0 // My Colors etc etc etc
c 0 0 0 0
c 0 0 0 0
c 0 0 0 0
c 0 0 0 0

So if a line does not begin with "d " or "c ", I know it is referencing a frame ID.
So I can now just read the entire data sheet and modify numbers in a single run,
as perhaps I would create 12 different versions of the same image, where a specific colour is changed at a later time.
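Reading it back could then follow a sketch along these lines (parsing only; rebuilding the textures is omitted):

using System.Globalization;
using System.IO;

public static class CompactTextReader
{
    // "d w h" lines carry dimensions, "c r g b a" lines carry one pixel,
    // and anything else is taken as a frame ID such as "A01".
    public static void Parse(string path)
    {
        foreach (string line in File.ReadAllLines(path))
        {
            string[] p = line.Split(' ');
            if (p[0] == "d")
            {
                int width = int.Parse(p[1]);
                int height = int.Parse(p[2]);
                // begin a new width x height frame...
            }
            else if (p[0] == "c")
            {
                float r = float.Parse(p[1], CultureInfo.InvariantCulture);
                // ...and g, b, a likewise; append the pixel to the frame
            }
            else
            {
                string frameId = line;
            }
        }
    }
}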

Again, thanks a lot. I’ll mark this as resolved.

To note: the text view of this file opens much faster than a .TIFF.
I don’t know if this is even going to be worthwhile, but what I thought was that we are standing on the shoulders of people who built image formats with the intention of compressing images for storage methods such as floppy discs, at a time when space saving of that nature was more valuable. I am also fairly certain that certain encodings and encryptions initially existed primarily to protect a software developer’s product from exploitation or replication in the early days of software development; things that are perhaps unnecessary today.

In a twist of fate, as it turns out, a single PNG file here is worth 15 KB, and there are 55 of them. So that is 825 KB.
My final file size in the new compiled format is 576 KB.
That is 249 KB of extra data present just to encode them.

That is a saving in bytes of approximately 184,307.

And with that;
I will speak no more of it.

Edit: my mistake.

My file is in fact nearly 42x the size of the original encoded files. But like I said, preserving that space, next to the organisation this gives me, is not an issue for me.

This is a complete red herring. Interpreting binary data as text makes no sense. A text viewer has to interpret and organise that data in memory for display. Most text viewers work on a per-line basis, which is completely violated by binary data. Also, as text viewers try to interpret the data as Unicode (usually UTF-8), they hit all sorts of error cases and invalid Unicode characters. They may also accidentally recognise some rare Unicode characters from other languages, which means they have to use a Unicode font and load all those character representations in order to display them. Opening binary data in a text viewer just makes no sense. Install a hex editor / viewer if you want to view binary files (personally I use this old hex editor, which is a German product, but it has an English language pack).

That doesn’t sound right. The “format” you just presented is one of the most inefficient formats possible. It’s far worse than BMP files, and those are uncompressed files with minimal header data and just the color information in binary.

The single PNG you showed above had a size of 251 bytes! (not kilobytes). Furthermore, you currently use the default ToString conversion of the float values of each pixel. A component may be encoded in two bytes when the color information is 0, but any actual value would be something like 9 bytes per component, so more like 40 bytes per pixel, where you would only need 4 bytes in an uncompressed binary format. Also, since you use the default ToString conversion, the numbers will be formatted according to your local culture settings; if you run this code on different machines, you will get different output. Germany, France, Spain and many other countries (actually about half of the world) use a comma as the decimal point and periods as group separators, whereas the English-speaking world usually uses a dot as the decimal point and a comma as the group separator. If you really want to use a text encoding, you should use the invariant culture.
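A minimal sketch of that culture-safe round trip:

using System.Globalization;

public static class CultureSafe
{
    // Always writes "1.5", never "1,5", regardless of the OS locale.
    public static string Format(float value) =>
        value.ToString(CultureInfo.InvariantCulture);

    // And reads it back under the same rules.
    public static float Parse(string text) =>
        float.Parse(text, CultureInfo.InvariantCulture);
}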

That’s also wrong ^^. Yes, in the past disk space was more valuable than it is today, and you could argue that it doesn’t matter that much anymore. However, there is much more to it than disk space: sending data over the internet is also a factor. Apart from that, any text format requires parsing the text, which is always slower than reading an uncompressed binary format, and the binary format is generally smaller.

No, this was never the intention, and there is no encryption going on, just compression. Any data needs an encoding. Your format uses a text encoding (ASCII or UTF-8), and on top of that the actual numbers are encoded as base-10 decimal numbers which need to be converted in order to be usable.

Most image formats only support 8 bits per color channel; there are only very few exceptions with more than 8 bits per channel. That’s why, for the majority of image formats, it makes much more sense to represent a color as a Color32 (that is, 4 bytes). A BMP image uses 3 or 4 bytes per pixel, plus a header of a few dozen bytes (54 bytes).

What benefit does saving the image in a human-readable format actually give you? I don’t quite understand the reasoning behind that.


Forgive me, but does anybody have any insight into why the characters in the PNG and TIFF formats are cheaper in byte data than writing the values out in ordinary characters? I have reduced the file size via compression, so it is merely 14x as large as the original PNG, and it contains less literal text: instead we just store unique colours, and then their index numbers according to the dimensions.

So presumably, if I write bytes instead of using WriteLine, we should get even smaller and closer to the PNG.

                for (int m = 0; m < COMPACT_LIST.Count; m++)
                {
                    for (int F = 0; F < FRAME_ID[m].Length; F++)
                    {
                        BYTE.Add(Convert.ToByte(FRAME_ID[m][F]));
                    }
                    BYTE.Add(Convert.ToByte('d'));
                    // NOTE: a single byte caps width/height at 255 (fine for 28x32 frames)
                    BYTE.Add(Convert.ToByte(COMPACT_LIST[m].width));
                    BYTE.Add(Convert.ToByte(COMPACT_LIST[m].height));
                    for (int a = 0; a < COL[m].Count; a++)
                    {
                        BYTE.Add(Convert.ToByte('R'));
                        // scale the 0..1 floats to 0..255; Convert.ToByte on the
                        // raw float would round almost everything to 0 or 1
                        BYTE.Add((byte)(COL[m][a].r * 255f));
                        BYTE.Add((byte)(COL[m][a].g * 255f));
                        BYTE.Add((byte)(COL[m][a].b * 255f));
                        BYTE.Add((byte)(COL[m][a].a * 255f));
                        var C = COL_REFERENCE[m][a];
                        BYTE.Add(Convert.ToByte('i'));
                        for (int v = 0; v < COL_COORDINATES[m][C].Count; v++)
                        {
                            // still writes the index digits as text with no separator
                            // between numbers, which is ambiguous to read back
                            var B = COL_COORDINATES[m][C][v].ToString().ToCharArray();
                            for (int BA = 0; BA < B.Length; BA++)
                            {
                                BYTE.Add(Convert.ToByte(B[BA]));
                            }
                        }
                    }
                }
                for (int b = 0; b < BYTE.Count; b++)
                    COMPACTFILE.WriteByte(BYTE[b]); // b, not i

I can’t make many more gains. I’m going to call it off. Interesting learning experience :slight_smile:
Thanks, Bunny.

I suppose I could iterate through my bytes looking for duplicates, make a new byte list, and reference the indices of the duplicate bytes, but this is turning into fractal bytes. The conclusion: unless I am prepared to fractal-byte and God knows what else, I am better off keeping my files loose in the directory :slight_smile:
Edit: fractal bytes failed; of course it would result in the same number of bytes :expressionless:

You still have the wrong idea about data stored in files ^^. Data is just data. PNG files do not contain text; they contain just binary data. As you may know, text may be represented in a computer file either as ASCII characters, where each character requires 1 byte (8 bits), or in UTF-8, which may use a variable number of bytes per character but is essentially backwards compatible with ASCII for the first 128 characters.
Have a look at this ASCII table to better understand what we’re talking about. A text file does not contain “text”; it contains binary data that is interpreted as text. So the word “Hello” would be this:

//  01001000 01100101 01101100 01101100 01101111
//  |  H   | |  e   | |  l   | |  l   | |  o   |

So human-readable text is just one way to interpret the data in that file. The file itself just contains those 40 bits (or 5 bytes), which, when interpreted with the ASCII table, come out as the string “Hello”. A binary format does not use ASCII characters but stores data directly in those bytes.

A single byte (a group of 8 bits) can represent values between 0 and 255. For example, the BMP format stores each color channel in a single byte. When you store a number as a decimal number represented as ASCII text, a single digit can only hold the values 0 to 9, so just storing values in the range of a byte as decimal text requires up to 3 characters. Furthermore, you have to insert space characters as well, and a space is just another character (the number 32, or 0x20 in hexadecimal).

When storing numbers in decimal form you need a way to separate the values; that’s why you inserted those spaces. With a variable length per number you need some sort of separator, otherwise you would not know how to interpret the values. Imagine the values 240, 42, 7. If you write them without spaces as decimal values you get “240427”. How would you know to interpret that string as 3 numbers? It could be 2, 404, 27 or 24, 042, 7 or 2404, 2, 7 or many other interpretations. So storing numbers as text requires either a separator or a fixed number of digits. If you know the largest values have 3 digits, you could store them as 240042007; knowing that each value always has 3 digits makes it easy to read those back as 240, 042, 007. Storing 3 values that way requires 9 bytes, though, while values in the range 0 to 255 would only need 3 bytes in binary. So it’s already 3 times as much data.

On Windows machines a new line (yes, a new line also requires characters) is actually represented by two characters: a “carriage return” (\r or 0x0D) followed by a “line feed” (\n or 0x0A). So each line you produce, even an empty one, requires two extra bytes.

You actually store floating point values as strings, which can run to around 13 characters as a decimal number, plus the space we just talked about to separate the values. A float inside the computer’s memory is represented by just 4 bytes. Have a look at this website; it lets you play around with float values and see all 32 bits (4 bytes) that make up a floating point number. So here, too, you’re at over 3 times the required space.
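A small sketch to see that difference for yourself:

using System;

public static class FloatSizeDemo
{
    public static void Run()
    {
        float value = 0.12345678f;
        byte[] binary = BitConverter.GetBytes(value); // always exactly 4 bytes
        string text = value.ToString("R");            // round-trip text form
        Console.WriteLine($"binary: {binary.Length} bytes, text: {text.Length} characters");
    }
}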

Though the most important difference is that your data is uncompressed. Compression can reduce the required memory significantly, especially if there’s a lot of repetition in the data. That’s what compression algorithms do: they search for repeating patterns and essentially replace the repeats with a single version plus some additional information about how often the pattern occurs and how to put it back together. Note that this is just a very rough simplification; different algorithms use quite different techniques to reduce the size of the data.

In the end it’s all just about information entropy, but I guess this goes too far now ^^. Note that other file formats, like GIF for example, use a color palette, and each pixel is just an index into that palette. For black-and-white (or simply two-color) images, that means we can store a single pixel in a single bit, so one byte of data actually contains 8 pixels, and an 8x8 image would only require 8 bytes of raw data. But of course you have to know how to read it. GIF is of course very limited, as the maximum number of colors in the palette is 256 distinct colors; it may work well for small pixel-art images, but not for anything photo-like. Usually we work with 24-bit color images. That’s 8 bits per color channel, so an RGB value is made up of 3 bytes (3 * 8 bits), which gives us a total of about 16.7 million distinct colors.

Your format could be improved by using Color32 values, where each component is a byte rather than a float. However, storing it as text is still extremely inefficient, and I mean inefficient in both memory and parsing / processing.
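For comparison, a sketch of a Color32-based binary layout (the layout itself is improvised, not a standard: two 4-byte ints for the dimensions, then 4 bytes per pixel):

using System.IO;
using UnityEngine;

public static class CompactBinary
{
    public static void Write(string path, Texture2D tex)
    {
        using (var writer = new BinaryWriter(File.Create(path)))
        {
            writer.Write(tex.width);
            writer.Write(tex.height);
            foreach (Color32 c in tex.GetPixels32())
            {
                writer.Write(c.r); // one byte per channel,
                writer.Write(c.g); // four bytes per pixel
                writer.Write(c.b);
                writer.Write(c.a);
            }
        }
    }

    public static Texture2D Read(string path)
    {
        using (var reader = new BinaryReader(File.OpenRead(path)))
        {
            int w = reader.ReadInt32();
            int h = reader.ReadInt32();
            var pixels = new Color32[w * h];
            for (int i = 0; i < pixels.Length; i++)
                pixels[i] = new Color32(reader.ReadByte(), reader.ReadByte(),
                                        reader.ReadByte(), reader.ReadByte());
            var tex = new Texture2D(w, h, TextureFormat.RGBA32, false);
            tex.SetPixels32(pixels);
            tex.Apply();
            return tex;
        }
    }
}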

So I’m still wondering what actual problem you are trying to solve with your own format. Again, what benefit does it actually give you?


I’m struggling with the file access settings to allow for this >>

  public void COMPRESSFILE(string PATH)
    {
        // The DeflateStream must be closed BEFORE the file stream it wraps,
        // otherwise the final compressed block is never flushed. using-blocks
        // dispose in the right (reverse) order, even if something throws.
        using (FileStream originalFileStream = File.Open(PATH, FileMode.Open))
        using (FileStream compressedFileStream = File.Create(PATH + "_C"))
        using (var Compressor = new DeflateStream(compressedFileStream, CompressionMode.Compress))
        {
            originalFileStream.CopyTo(Compressor);
        }
    }
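And the matching decompression, sketched the same way (the Decompress name is mine):

using System.IO;
using System.IO.Compression;

public static class FileDecompression
{
    // Inflates a file written by COMPRESSFILE back to its original bytes.
    public static byte[] Decompress(string compressedPath)
    {
        using (var compressed = File.OpenRead(compressedPath))
        using (var inflater = new DeflateStream(compressed, CompressionMode.Decompress))
        using (var result = new MemoryStream())
        {
            inflater.CopyTo(result);
            return result.ToArray();
        }
    }
}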

To be honest, Bunny, I am only playing around. I am not doing anything meaningfully productive here; I am just interested in how these things work. What you explained up there just now is really well written.

I will probably bail on this venture pretty soon; I just keep coming back to it this whole day.
I might try the Color32 in a little bit. This whole type of thing is pretty new to me. I might be able to do all this eventually, but then I still need to decompress, obtain the bytes, and rebuild :) I honestly can’t give you a good reason why it’s so important that I am attempting this. It’s not important; it’s just one of those things I guess I’ve just gotta know.

I guess what you are hinting at is to convert to binary and then write the bytes :open_mouth:

:open_mouth:

I’ll probably sleep on it tonight and decide if I need to be that much of a genius. Do I?
I’ll sleep on it.

I’m jumping into the midst of this quality conversation, but I just wanted to add (after Bunny explained how text files are actually “text”, because that’s really required) that the PNG format has a long history of being VERY complicated internally, unlike TIFF, which is effectively VERY simple by contrast.

Comparing the two directly is very unhealthy. That’s my first point, and I’ll explain why shortly.
My second point is that it’s important to understand the difference between raw, lossy and lossless.

Raw image is what the hardware is hungry for. You stream that bitch, you get its contents on the screen. Typically what we call bitmaps (not the BMP format per se, that’s Windows Bitmap) are very near to what you actually want, plus or minus some rearranging, reordering, flipping axes, and whatnot. You do almost nothing special to prepare such a stream, and voila, there’s a show on your screen or your printer. But there is a problem: raw data is HUGE! Even by today’s standards.

Lossless compression works like a zip: it packs HUGE data into much less data, via magic and interdimensional portals, and you can get your information back in pristine condition, but you have to trade some CPU time (and memory) to push stuff through these portals.

Lossy compression is even better. It uses smoke and tricks to fool the brain into seeing one thing, by taking advantage of our limited perception, thus it can crunch raw data even further, but the price you pay is that some information has been permanently lost.

Ideally, why use raw, other than for speed, when you can use lossless, provided you can cope with the coder/decoder (codec) performance? Lossy is only really good for dumb memes; who cares about that, right, let’s conserve some bandwidth instead.

Both PNG and TIFF are lossless formats, by their design and intention, and given their purpose, you’ll see why PNG is much more complicated than TIFF.

PNG was intended as a replacement for GIF, historically, and because it was intended for the internet, it employs some heavy-duty compression that verges on astrophysics (not literally), established from doctoral papers on computer graphics. Sadly, it turns out it doesn’t really do its job; not in terms of compression (that part is brilliant), but in terms of decoding performance. GIF was legally in limbo at the time, being patented (and the patent changed hands), and it was also becoming obsolete because the hardware grew more powerful, so the authors of PNG were pushed to make something universally acceptable on the internet, and for this they pursued a masterpiece in lossless compression.

In the end, PNG (Portable Network Graphics) lost to JPEG (Joint Photographic Experts Group), which is a lossy compression, but one that became so ubiquitous that it seriously destroyed every image it ever encoded. It took us some time to recover from the early internet, and to get better processing power, until PNG became as regular as it is nowadays.

TIFF was historically designed to be very portable and not so much compressed. It’s an archival format intended for professional usage in desktop publishing and the printing industry, created by Aldus, which was later gobbled up by Adobe. The idea behind it was not the internet but actual raw storage, so it supported various color spaces from the get-go, and it was designed to streamline color separations as independent bitmaps.

But because space was a problem back then (and printing quality demands HUGE data), they decided to go with a plug-in based model for compressing images in the codec directly. So the format was designed to encourage 3rd-party meddling with all kinds of encoders and compression algorithms, of which only two actually won out. One is LZW, the Lempel-Ziv-Welch compression algorithm (there are numerous algorithms in this family; these guys were very productive, together and on their own), and the other is “plain” ZIP. It turns out LZW works slightly faster with CMYK separations (specifically at or beyond 300 dpi), but the differences in compression itself are negligible. Anyway, this explains why TIFF (Tagged Image File Format) has a much better internal spec for its header and layout, and it even contains some human-readable data.

To finish, I just wanted to share this almost-documentary by Reducible (an excellent YouTube channel if you’re into computer engineering) on how exactly PNG works internally. Near the end it also shows another file format, called QOI (“quite OK image”), invented only recently by some random guy, who managed to pull off only slightly worse compression than PNG’s (on a stochastic 32-bit image database, in a direct one-to-one comparison), yet the codec itself is trivially simple and turns out to possess an incredible performance rating. By contrast, none of the formats mentioned so far were made by some random guy, but by big consortiums and academia, so there is obviously a lot of room for improvement.

There you go, in case you really want to learn more about this stuff. I was doing what you’re doing back in the early 90s, so all of this might be useful to you.


That is some quality info, my friend. Bravo! I’ll take a watch of that doc; I watched a little documentary on JPEG maybe 6 months ago. Maybe it plays a part in what I’m trying to do here. I will likely chase that PNG byte data a little further, but I expect I will settle for the large readable file, for reasons my brain is too melted to get into right now. Chasing that PNG will be like chasing the dragon, though. Eventually, you would assume, there could logically be only one superior way to do it. But

does size matter?

Of course, as Bunny said, it matters for internet-related things. But in the context in which I would use my own custom file in Unity, does the size of that file matter? Personally, it somewhat matters. That is to say: assuming I complete the project and sell it on Steam, if somebody pried it open to extract my graphics, I guess I would prefer that they have a hard time and, at the end, I guess, respect me for it.

But I am totally, 100%, not assuming I will get close to making the best file type our people have ever seen. It is highly unlikely, in fact; or at least, the odds of that are, at the current moment, pretty slim.

But I will chase that PNG.
I will attempt the binary version and measure the size difference.
I will then hopefully be at some grade to apply and un-apply compression, and then I can really gauge how close I am getting, and how much a binary byte encoding plus compression does.

So far it went from 42x the size, down to 14x the size, and then 10x the size. I’m assuming that with compression maybe it’s a third smaller; or at least, I would be very happy if a significant amount came from the compression, leaving me in a situation where the remaining data reduction is a matter of algorithm.