Significance of binary fields in the Addressables catalog (for example m_KeyDataString)

We noticed there are 4 binary fields in the catalog, encoded in Base64: m_KeyDataString, m_BucketDataString, m_EntryDataString, m_ExtraDataString.

These take up a lot of space in the catalog, but I assume they are necessary and probably “dependency”-related.

  1. Can someone say with certainty what they are used for? It’s not clear to us from the code.
  2. If we change something manually in a catalog (its readable parts, not its binary data) what special care must be taken with this binary data? For example, imagine that we change the bundle URLs inside the catalog to point to another host since we don’t want to create a new addressables build just for that (as the addressables are exactly the same). Would that change need changes to the binary data?
    2.5. Would we be able to regenerate the catalog hash somehow after the change?

Anyone?

I also would be really interested in learning more about the different parts of the catalog. But it looks like the only way is to scrutinize the BuildScriptPackedMode.cs script at the moment.

I’m also having an issue with the catalog file: it is too big (I have many files and pack them separately). Can I modify it manually to reduce the file size, or split it into multiple catalog files? Thanks

I feel uncomfortable with such a huge catalog. We host approximately 2K very small PNG files, packed separately, on a remote CDN, but the biggest overhead is downloading a 4MB catalog. Roughly 75% of it is the m_KeyDataString field, which we have very little information about. Couldn’t that field live only in the local catalog?

@unity_bill

  1. Can someone say with certainty what they are used for? It’s not clear to us from the code.

m_KeyDataString holds all of your keys: Addresses, Labels, and GUIDs.
m_EntryDataString holds all of your Addressable entries. Each entry contains indexes into the internal IDs for its load path, its dependencies, and so on.
m_BucketDataString holds the mappings between the keys and the entries.
m_ExtraDataString holds extra data for the entries. This will likely be exclusively information on how to load the AssetBundle: CRC, cache hash, etc.
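
To see which of these fields dominates your own catalog, you can measure them offline. The following is only a rough sketch: it assumes the catalog is a JSON file with the four Base64 fields at the top level, and `binary_field_sizes` is a hypothetical helper name. It reports decoded byte counts and does not try to interpret the binary layout.

```python
import base64
import json

# The four Base64-encoded fields named in this thread.
BINARY_FIELDS = (
    "m_KeyDataString",
    "m_BucketDataString",
    "m_EntryDataString",
    "m_ExtraDataString",
)


def binary_field_sizes(catalog: dict) -> dict:
    """Return the decoded byte length of each Base64 field present in the catalog."""
    sizes = {}
    for name in BINARY_FIELDS:
        encoded = catalog.get(name)
        if encoded is not None:
            sizes[name] = len(base64.b64decode(encoded))
    return sizes


# Example usage against a file on disk (the path is a placeholder):
# with open("catalog.json", encoding="utf-8") as f:
#     print(binary_field_sizes(json.load(f)))
```

If m_KeyDataString comes out much larger than the others, stripping unused key types from the catalog is the likely place to save space.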

  2. If we change something manually in a catalog (its readable parts, not its binary data) what special care must be taken with this binary data? For example, imagine that we change the bundle URLs inside the catalog to point to another host since we don’t want to create a new addressables build just for that (as the addressables are exactly the same). Would that change need changes to the binary data?

No, you can change these freely, as long as you do not change the order of the entries. The binary data refers to the internal IDs by index, so it will still resolve them correctly.
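
As an illustration, a host swap that keeps the order intact could look roughly like this. It is a sketch under assumptions: the internal IDs are taken to be a JSON array named `m_InternalIds`, and the function name and example hosts are hypothetical.

```python
def rehost_internal_ids(catalog: dict, old_prefix: str, new_prefix: str) -> dict:
    """Return a copy of the catalog with old_prefix swapped for new_prefix.

    Only strings that start with old_prefix are rewritten. Nothing is added,
    removed, or reordered, so index-based references into the list stay valid.
    """
    ids = catalog["m_InternalIds"]  # assumed field name for the internal IDs
    updated = [
        new_prefix + value[len(old_prefix):] if value.startswith(old_prefix) else value
        for value in ids
    ]
    result = dict(catalog)  # shallow copy; the Base64 fields are carried over untouched
    result["m_InternalIds"] = updated
    return result


# Example usage (hosts are placeholders):
# patched = rehost_internal_ids(catalog, "https://old.example.com/", "https://new.example.com/")
```

The list comprehension maps each element in place, which is what keeps the positions stable.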

2.5. Would we be able to regenerate the catalog hash somehow after the change?

You could calculate the hash of the new text in the catalog.json. However, the value does not have to match; think of this hash as a version number. We compare the “version number” on the server with the one that is cached. If it is different, then we know the catalog has changed and requires a download. As long as you set it to something different from previous downloads, you will not have any problems.
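
One way to produce such a value, sketched here with assumptions: MD5 is used only as a convenient way to get a string that changes whenever the text changes, not because it is known to be the algorithm Addressables uses, and `new_catalog_hash` is a hypothetical helper.

```python
import hashlib


def new_catalog_hash(catalog_text: str, previous_hash: str = "") -> str:
    """Return a hex digest of the edited catalog text to use as its new 'version number'.

    Raises ValueError if the result equals the previously published value,
    because clients would then not notice that the catalog changed.
    """
    digest = hashlib.md5(catalog_text.encode("utf-8")).hexdigest()
    if digest == previous_hash:
        raise ValueError("hash is unchanged; clients would keep their cached catalog")
    return digest


# Example usage (file name is a placeholder):
# with open("catalog.json", encoding="utf-8") as f:
#     print(new_catalog_hash(f.read()))
```

Where the resulting string has to be published depends on how your catalog and its hash are hosted, so check that against your own build output.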

As for the various file size issues with the catalog file: we will take this feedback on board and see what can be done in the future. For now, the best way to reduce the size is to include only the keys that you intend to use. For example, if you do not load the contents of a Group through AssetReference, disable “Include GUIDs in Catalog”. The same goes for Labels and Addresses: if you do not use them to load, you can strip them from the catalog with the corresponding setting.
