How to control fileID using custom ScriptedImporter

Hello

The full situation:
We have large static (non-changeable) mesh assets in the project. They are serialized as YAML text, which is terrible in several ways:

  • git treats these files as text (although they are effectively binary data), which is slow
  • we cannot use Git LFS because the .asset extension is used for every ScriptableObject
  • YAML serialization has an unavoidable cost in both speed and memory

I’ve looked online and found this brilliant solution:

Everything works great, but I want to convert the existing mesh assets into the new compressed assets (which are 10 times smaller) without losing the references to the original meshes in the scene.
I can successfully keep the asset guid by replacing it in the .meta file, but I also need to match the fileID.

Here is how a reference to the original mesh looks in the scene YAML:
m_Mesh: {fileID: 4300000, guid: 57ea71a33f9a3d44eb6c4a2331ed3575, type: 3}
And this is how it looks for the compressed version:
m_Mesh: {fileID: -3755217185360081728, guid: 57ea71a33f9a3d44eb6c4a2331ed3575, type: 3}

As you can see, the guids are identical; the only difference is the fileID. The value 4300000 is hardcoded: it is the common value for all “standalone” mesh assets.

Original asset meta file:

fileFormatVersion: 2
guid: 57ea71a33f9a3d44eb6c4a2331ed3575
NativeFormatImporter:
  externalObjects: {}
  mainObjectFileID: 4300000
  userData:
  assetBundleName:
  assetBundleVariant:

ScriptedImporter asset meta file:

fileFormatVersion: 2
guid: 57ea71a33f9a3d44eb6c4a2331ed3575
ScriptedImporter:
  internalIDToNameTable: []
  externalObjects: {}
  serializedVersion: 2
  userData:
  assetBundleName:
  assetBundleVariant:
  script: {fileID: 11500000, guid: e61c173955c747f1836b5b352678d0f5, type: 3}

The fileID: -3755217185360081728 comes from the identifier provided in ScriptedImporter:

ctx.AddObjectToAsset("$IDENTIFIER$", object)

Changing the IDENTIFIER changes the fileID.
That means we are “completely” responsible for the fileID value, but we cannot set it directly.

Is there a way to provide a specific fileID? Or how else can I overcome this problem?
Cheers!

Hi there!

No. The fileID is the result of hashing the identifier together with the type of the object; we don’t have any API that would allow forcing a certain value.

I didn’t try it, but maybe you could temporarily change EditorSettings.serializationMode to ForceBinary before creating your .asset files, so that they are actual binary serialized files and not text YAML assets?
I think it should work, though maybe not for already existing assets (because Unity may remember that they were text and keep using that format anyway), but you could try serializing them to another file and then replacing the original.

So if we knew the hash algorithm, we could iterate over identifiers until we find one that produces the required fileID?

Well, technically yes… but we’re using the 64-bit xxHash algorithm (xxh64), and I’m not sure it’s worth trying to brute-force a matching identifier: a 64-bit hash has 2^64 (roughly 1.8 × 10^19) possible values, so you would have to try an enormous number of input strings before one happens to match.
I won’t prevent you from trying if you really want to.
The string we’re hashing is
“Type:YOUR_OBJECT_TYPE_NAME->IDENTIFIER0”
In your case I think YOUR_OBJECT_TYPE_NAME would be Mesh.
Give that string to an xxh64 implementation (http://cyan4973.github.io/xxHash/) and you should get the same result as what the AssetDatabase is registering.

Other options (though you should try setting the files to binary first; it would be easier in my opinion):

  • You could rename your .asset files to .mesh.asset and add this composite extension to the git lfs configuration (**.mesh.asset instead of **.asset) so that only these specific .asset files get tracked.

  • Do the change to use an importer, get your new fileID, and then make a script to edit all existing yaml files in your project to reference the new ID. That would execute way faster than trying to solve the hash problem. Run it once, everyone gets fixed, and you can remove that script and forget you ever had the issue.
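A one-off rewrite script of the kind described in the second option could look roughly like the sketch below. It is not taken from the thread: the class and method names are hypothetical, it assumes each reference is written in the form {fileID: X, guid: Y, ...} as in the scene YAML quoted in the first post, and it assumes that only .unity, .prefab and .asset files need patching. Run it on a clean working copy so the result can be reviewed as a diff.

```csharp
using System;
using System.Globalization;
using System.IO;

// Hypothetical helper: rewrites references to one asset from an old fileID to a new one.
public static class MeshReferencePatcher
{
    // Pure text transformation, kept separate so it can be checked without touching the disk.
    public static string Patch(string yaml, string guid, long oldFileId, long newFileId)
    {
        // Format the IDs culture-independently so the minus sign is always '-'.
        string oldId = oldFileId.ToString(CultureInfo.InvariantCulture);
        string newId = newFileId.ToString(CultureInfo.InvariantCulture);

        // The guid is part of the match, so other assets that also use
        // fileID 4300000 are left alone.
        string oldRef = "{fileID: " + oldId + ", guid: " + guid + ",";
        string newRef = "{fileID: " + newId + ", guid: " + guid + ",";
        return yaml.Replace(oldRef, newRef);
    }

    // Walks a folder (for example the project's Assets folder) and patches matching files in place.
    public static int PatchFolder(string root, string guid, long oldFileId, long newFileId)
    {
        int patchedFiles = 0;
        foreach (string path in Directory.EnumerateFiles(root, "*", SearchOption.AllDirectories))
        {
            string ext = Path.GetExtension(path);
            if (ext != ".unity" && ext != ".prefab" && ext != ".asset") continue;

            string text = File.ReadAllText(path);
            string patched = Patch(text, guid, oldFileId, newFileId);
            if (patched == text) continue; // nothing to rewrite, leave the file untouched

            File.WriteAllText(path, patched);
            patchedFiles++;
        }
        return patchedFiles;
    }
}
```

Calling MeshReferencePatcher.PatchFolder("Assets", guid, 4300000, newFileId) once per converted mesh would cover the whole project. References that Unity has wrapped across lines between the fileID and the guid would not be matched by this simple string replace.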

I’m not sure I understood correctly how to replicate this string. Are there two symbols, ‘-’ and ‘>’? And is the character at the end ‘0’ (digit zero)?

I’ve tried it in a .NET 6 console application, using the xxHash implementation from HashDepot (GitHub: ssg/HashDepot, a .NET library for xxHash, FNV, MurmurHash3 and SipHash algorithms).

using System.Text;
using HashDepot;

// Here I've tried different possible combinations
var strings = new []
{
    "Type:Mesh->mesh",
    "Type:Mesh->mesh0",
    "Type:UnityEngine.Mesh->mesh",
    "Type:UnityEngine.Mesh->mesh0",
};

// My custom importer produces the same fileID for different assets
// using this code ctx.AddObjectToAsset("mesh", obj);
// so my first task is to get the same value from hash function
// m_Mesh: {fileID: -8670151273213183110, guid: b434c64c770f87549820d17cce48bbf6, type: 3}
// m_Mesh: {fileID: -8670151273213183110, guid: 5fd4a69ce2fe7a64a8801ab8b7139529, type: 3}

foreach (var s in strings)
{
    var buffer = Encoding.UTF8.GetBytes(s);
    var result =  XXHash.Hash64(buffer);
    Console.WriteLine($"{s} = {result}");
}

And that’s what I’ve got:

Type:Mesh->mesh = 16347258017382355574
Type:Mesh->mesh0 = 9776592800496368506
Type:UnityEngine.Mesh->mesh = 8067832460341274838
Type:UnityEngine.Mesh->mesh0 = 13028042163031573331

OK, never mind: the hash function returns a ulong and I didn’t cast it to long.
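The missing cast can be checked against the numbers already posted in this thread. The sketch below (the class and method names are made up for illustration) reinterprets the unsigned value printed for "Type:Mesh->mesh0" as a signed 64-bit integer, which gives the fileID seen in the scene file.

```csharp
using System;

// Hypothetical helper illustrating the ulong-to-long reinterpretation.
public static class FileIdCast
{
    // xxh64 yields an unsigned 64-bit value, while the YAML stores a signed one.
    // unchecked keeps the bit pattern instead of throwing in a checked context.
    public static long ToFileId(ulong hash)
    {
        return unchecked((long)hash);
    }
}
```

With the output above, FileIdCast.ToFileId(9776592800496368506UL) returns -8670151273213183110, the fileID that the importer produced for ctx.AddObjectToAsset("mesh", obj).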

To be precise, here is the string that gets hashed:
$"Type:{obj.GetType().Name}->{id}0"

This seems to be the case for built-in Unity types (Mesh, Material…), but I’m not getting the proper ID for ScriptableObject-derived classes.
Is there a different format for those?

The format should be essentially the same for ScriptedImporters; we’re using the same string, Type:YOUR_OBJECT_TYPE_NAME->IDENTIFIER0
The identifier is the string given to AddObjectToAsset, and the type is always the Unity native type, which means that every Object inheriting from MonoBehaviour or ScriptableObject resolves to MonoBehaviour.

ctx.AddObjectToAsset("identifier", ScriptableObject.CreateInstance<MyCustomScriptableObject>());

Will resolve into:

Type:MonoBehaviour->identifier0

If the type is a GameObject, then it’ll be saved as a Prefab, in which case the string will be:
Type:GameObject->IDENTIFIER/path/to/gameobject0
For each Object in the GameObject, depending on its type, we build the following string:
Type:YOUR_OBJECT_TYPE_NAME->IDENTIFIER/path/to/gameobject/SOME_ADDITION0
SOME_ADDITION resolves as follows:
If it’s a MonoBehaviour → MonoBehaviour-C#ScriptFullName
If it’s a Unity Component → UnityNativeTypeName

As an example:

var root = new GameObject("MyRoot");
var child = new GameObject("MyChild");
child.AddComponent<MyMonobehaviour>();
child.transform.SetParent(root.transform);
ctx.AddObjectToAsset("root", root);

Will produce hashes based on the following strings:

"Type:GameObject->root/MyRoot0"
"Type:Transform->root/MyRoot/Transform0"
"Type:GameObject->root/MyRoot/MyChild0"
"Type:Transform->root/MyRoot/MyChild/Transform0"
"Type:MonoBehaviour->root/MyRoot/MyChild/MonoBehaviour-MyMonobehaviour0"

If your MonoBehaviour is in a specific namespace, its name will become The.Name.Space.ScriptName
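For anyone who wants to reproduce these strings before feeding them to an xxh64 implementation, the rules above can be captured in a small builder. This is only a sketch based on the format described in this thread; the class, method and parameter names are invented for illustration, and the caller is expected to pass the Unity native type name (for example MonoBehaviour for any script).

```csharp
using System;

// Hypothetical helper that assembles the string Unity is described as hashing.
public static class FileIdHashInput
{
    // nativeType:    Unity native type name, e.g. "Mesh", "GameObject", "MonoBehaviour"
    // identifier:    the string passed to ctx.AddObjectToAsset
    // hierarchyPath: path of the GameObject inside the prefab, e.g. "MyRoot/MyChild" (empty for plain assets)
    // addition:      "Transform" for a native component, "MonoBehaviour-Full.Script.Name" for a script
    public static string Build(string nativeType, string identifier,
                               string hierarchyPath = "", string addition = "")
    {
        string key = identifier;
        if (!string.IsNullOrEmpty(hierarchyPath)) key += "/" + hierarchyPath;
        if (!string.IsNullOrEmpty(addition)) key += "/" + addition;

        // The trailing digit zero is part of the hashed string.
        return "Type:" + nativeType + "->" + key + "0";
    }
}
```

For the example above, FileIdHashInput.Build("MonoBehaviour", "root", "MyRoot/MyChild", "MonoBehaviour-MyMonobehaviour") yields the last string in the list, and FileIdHashInput.Build("Mesh", "mesh") yields the string used for the standalone mesh case earlier in the thread.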

Hello,

Yesterday I ran into a similar problem and posted this question: “Issue with retaining fileIDs in Unity’s Scripted Importer”. In it, I detailed a problem I was facing with the generation of fileIDs in Unity’s Scripted Importer, specifically with child GameObjects (prefabs) whose fileIDs changed when the source asset was renamed.

Today I found this thread and I appreciate the information and insights shared here. They helped me gain a better understanding of how the fileIDs are generated based on the identifier, type, and path information. This pattern does make sense in a context where identifiers used in AddObjectToAsset might not be guaranteed to be unique.

However, as I’ve worked through this, I’ve come to think that scripted importers should ideally be designed to give developers and users of the editor as much flexibility as possible. This includes the ability to rename assets and C# scripts without breaking references. In my case, fileIDs were changing despite my using a consistent, unique identifier, just because the name of the GameObject (which forms part of the path) or of the script was changed. It seems that the current design doesn’t fully take into account common use cases like renaming assets or scripts.

This flexibility is particularly important when we consider that the name of the asset being imported often needs to be included in the name of the prefabs it produces. This is because the name of the prefab is what is shown when added as a reference in the inspector. Without the ability to change asset names without consequences, developers might find it difficult to manage and navigate their project as it grows in complexity.

Two potential improvements could be:

  • Allow passing guaranteed-unique identifiers to AddObjectToAsset for full control: If an identifier is guaranteed to be unique, there should be no need to supplement it with additional information when generating the fileID. Providing this as an opt-in feature would give developers who wish to maintain full control over their fileIDs the ability to do so.

  • Base inferred information on the script assetID, not C# type names: If Unity’s system infers additional information from the type, consider using the assetID of the script instead of the name of the C# type. This would allow the C# class to be renamed without affecting the fileID, as the assetID would remain the same. If there is no one-to-one relationship between classes and files, it should be possible to tag C# classes with an attribute that specifies a unique guid to be used for this purpose.

I appreciate any feedback or suggestions on how I can mitigate this issue moving forward until some potential fix has been implemented.

Thank you!

I came up with a workaround for my issue in the previous post:

  • In the ScriptedImporter, I generate the GameObjects (prefabs) using a deterministic placeholder part for the name, which does not include the name of the source asset. The identifier for these objects is kept the same.

  • Then, in an AssetPostprocessor, I use the OnPostprocessAllAssets(string[] importedAssets, string[] deletedAssets, string[] movedAssets, string[] movedFromAssetPaths) method.

  • Within OnPostprocessAllAssets(), I check each imported asset to see if it’s one of my .pyxel files. If it is, I use AssetDatabase.LoadAllAssetsAtPath() to load all the assets at the imported .pyxel file path.

  • I then filter the loaded assets to only get the GameObjects. For each GameObject, I check its name for the presence of the placeholder part. If it’s found, I replace it with the name of the source asset.

  • I call AssetDatabase.SaveAssets().

This way, I can change the name of my .pyxel files without causing the fileIDs of the child GameObjects to change, thus preserving the references in my prefabs. The only caveat I’ve found so far to this workaround is that the updated names of the GameObjects (prefabs) are not displayed in the Unity Editor’s project window. They appear correctly in the Inspector, though.

By using the placeholder in the prefab names during the import process, the fileIDs remain consistent even when the source asset’s name changes. This ensures the integrity of the fileID despite any subsequent asset name modifications.
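The steps of this workaround can be sketched roughly as follows. This is an illustration under assumptions, not the code actually used: the placeholder token and the class names are hypothetical, and the Unity-specific part is wrapped in #if UNITY_EDITOR so that the name-substitution helper can also be compiled and checked outside the editor.

```csharp
using System;
using System.IO;
#if UNITY_EDITOR
using UnityEditor;
using UnityEngine;
#endif

// Hypothetical helper: swaps the placeholder in an object name for the source asset's file name.
public static class PlaceholderNames
{
    // Assumed token; the importer would put this into the generated GameObject names.
    public const string Placeholder = "%SOURCE%";

    public static string Resolve(string objectName, string assetPath)
    {
        string sourceName = Path.GetFileNameWithoutExtension(assetPath);
        return objectName.Replace(Placeholder, sourceName);
    }
}

#if UNITY_EDITOR
// Renames imported GameObjects after the ScriptedImporter has run, so the
// names seen by the importer (and used for the fileID hash) stay constant.
public class PyxelNamePostprocessor : AssetPostprocessor
{
    static void OnPostprocessAllAssets(string[] importedAssets, string[] deletedAssets,
        string[] movedAssets, string[] movedFromAssetPaths)
    {
        bool renamedAny = false;
        foreach (string path in importedAssets)
        {
            if (!path.EndsWith(".pyxel", StringComparison.OrdinalIgnoreCase)) continue;

            foreach (UnityEngine.Object asset in AssetDatabase.LoadAllAssetsAtPath(path))
            {
                GameObject go = asset as GameObject;
                if (go == null || !go.name.Contains(PlaceholderNames.Placeholder)) continue;

                go.name = PlaceholderNames.Resolve(go.name, path);
                renamedAny = true;
            }
        }

        if (renamedAny) AssetDatabase.SaveAssets();
    }
}
#endif
```

With this token, an imported object named "%SOURCE%_idle" in Assets/Art/hero.pyxel would show up as "hero_idle", while the importer itself only ever sees the placeholder name.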

While I am not particularly happy with having to patch it like this, it will have to do for now. I hope this workaround can help others who face a similar issue until Unity offers a more flexible approach to maintaining fileIDs across asset name changes.
