YAML fileID hash function for DLL scripts

Does anybody know what hash function (if any) is used to compute the fileID number for DLL script references? What I am talking about is this:

m_Script: {fileID: -1167294237, guid: 2b16a1acf52a2a64e916f8a9e6d5df31, type: 3}

I’ve found some interesting things in posts like this one, but I cannot find the actual hash function used to compute the fileID.

Why do I need this? I am an Asset Store publisher and I want to create an evaluation version of my tools. My tools consist of scripts, prefabs, materials and scenes, so when I compile the scripts into DLLs, the rest will break. If I had a method to compute the fileID for compiled scripts, I would create a Ruby script to fix all references and share it with the community.

I know!

Given a type t, the fileID is equal to the first four bytes of the MD4 of the string `"s\0\0\0" + t.Namespace + t.Name` as a little-endian 32-bit integer.

Below is a chunk of code that will compute the fileID. The MD4 class is just the first simple MD4 C# implementation I came across, so feel free to drop in your own if you want. The important part is the FileIDUtil.Compute function.

using System;
using System.Linq;
using System.Collections.Generic;
using System.Security.Cryptography;

// Taken from http://www.superstarcoders.com/blogs/posts/md4-hash-algorithm-in-c-sharp.aspx
// Probably not the best implementation of MD4, but it works.
public class MD4 : HashAlgorithm
{
   private uint _a;
   private uint _b;
   private uint _c;
   private uint _d;
   private uint[] _x;
   private int _bytesProcessed;
 
   public MD4()
   {
      _x = new uint[16];
  
      Initialize();
   }
 
   public override void Initialize()
   {
      _a = 0x67452301;
      _b = 0xefcdab89;
      _c = 0x98badcfe;
      _d = 0x10325476;
  
      _bytesProcessed = 0;
   }
 
   protected override void HashCore(byte[] array, int offset, int length)
   {
      ProcessMessage(Bytes(array, offset, length));
   }
 
   protected override byte[] HashFinal()
   {
      try
      {
         ProcessMessage(Padding());
  
         return new [] {_a, _b, _c, _d}.SelectMany(word => Bytes(word)).ToArray();
      }
      finally
      {
         Initialize();
      }
   }
 
   private void ProcessMessage(IEnumerable<byte> bytes)
   {
      foreach (byte b in bytes)
      {
         int c = _bytesProcessed & 63;
         int i = c >> 2;
         int s = (c & 3) << 3;
   
         _x[i] = (_x[i] & ~((uint)255 << s)) | ((uint)b << s);
   
         if (c == 63)
         {
            Process16WordBlock();
         }
   
         _bytesProcessed++;
      }
   }
 
   private static IEnumerable<byte> Bytes(byte[] bytes, int offset, int length)
   {
      // Stop at offset + length, so a non-zero offset is handled correctly.
      for (int i = offset; i < offset + length; i++)
      {
         yield return bytes[i];
      }
   }
 
   private IEnumerable<byte> Bytes(uint word)
   {
      yield return (byte)(word & 255);
      yield return (byte)((word >> 8) & 255);
      yield return (byte)((word >> 16) & 255);
      yield return (byte)((word >> 24) & 255);
   }
 
   private IEnumerable<byte> Repeat(byte value, int count)
   {
      for (int i = 0; i < count; i++)
      {
         yield return value;
      }
   }
 
   private IEnumerable<byte> Padding()
   {
      return Repeat(128, 1)
         .Concat(Repeat(0, ((_bytesProcessed + 8) & 0x7fffffc0) + 55 - _bytesProcessed))
         .Concat(Bytes((uint)_bytesProcessed << 3))
         .Concat(Repeat(0, 4));
   }
 
   private void Process16WordBlock()
   {
      uint aa = _a;
      uint bb = _b;
      uint cc = _c;
      uint dd = _d;
  
      foreach (int k in new [] { 0, 4, 8, 12 })
      {
         aa = Round1Operation(aa, bb, cc, dd, _x[k], 3);
         dd = Round1Operation(dd, aa, bb, cc, _x[k + 1], 7);
         cc = Round1Operation(cc, dd, aa, bb, _x[k + 2], 11);
         bb = Round1Operation(bb, cc, dd, aa, _x[k + 3], 19);
      }
  
      foreach (int k in new [] { 0, 1, 2, 3 })
      {
         aa = Round2Operation(aa, bb, cc, dd, _x[k], 3);
         dd = Round2Operation(dd, aa, bb, cc, _x[k + 4], 5);
         cc = Round2Operation(cc, dd, aa, bb, _x[k + 8], 9);
         bb = Round2Operation(bb, cc, dd, aa, _x[k + 12], 13);
      }
  
      foreach (int k in new [] { 0, 2, 1, 3 })
      {
         aa = Round3Operation(aa, bb, cc, dd, _x[k], 3);
         dd = Round3Operation(dd, aa, bb, cc, _x[k + 8], 9);
         cc = Round3Operation(cc, dd, aa, bb, _x[k + 4], 11);
         bb = Round3Operation(bb, cc, dd, aa, _x[k + 12], 15);
      }
  
      unchecked
      {
         _a += aa;
         _b += bb;
         _c += cc;
         _d += dd;
      }
   }
 
   private static uint ROL(uint value, int numberOfBits)
   {
      return (value << numberOfBits) | (value >> (32 - numberOfBits));
   }
 
   private static uint Round1Operation(uint a, uint b, uint c, uint d, uint xk, int s)
   {
      unchecked
      {
         return ROL(a + ((b & c) | (~b & d)) + xk, s);
      }
   }
 
   private static uint Round2Operation(uint a, uint b, uint c, uint d, uint xk, int s)
   {
      unchecked
      {
         return ROL(a + ((b & c) | (b & d) | (c & d)) + xk + 0x5a827999, s);
      }
   }
 
   private static uint Round3Operation(uint a, uint b, uint c, uint d, uint xk, int s)
   {
      unchecked
      {
         return ROL(a + (b ^ c ^ d) + xk + 0x6ed9eba1, s);
      }
   }
}

public static class FileIDUtil
{
    public static int Compute(Type t)
    {
        string toBeHashed = "s\0\0\0" + t.Namespace + t.Name;

        using (HashAlgorithm hash = new MD4())
        {
            byte[] hashed = hash.ComputeHash(System.Text.Encoding.UTF8.GetBytes(toBeHashed));

            int result = 0;

            for(int i = 3; i >= 0; --i)
            {
                result <<= 8;
                result |= hashed[i];
            }

            return result;
        }
    }
}

Also, since you mentioned writing a Ruby script, here’s the same thing in Ruby:

require("openssl")
def getFileID(namespace, name)
    s = "s\0\0\0"+namespace+name
    return OpenSSL::Digest.digest("MD4", s)[0..3].unpack('l<').first
end

How… do… you… know… this…?!
That’s just great!!! Thank you!

Short Story: Used dtrace to figure out which calls were being made to libssl.dylib and which arguments were being passed.

Longer Story: First, I figured out what that value is in binary and then searched the binaries in the Library folder for that value to figure out where it was contained. It shows up in assetDatabase3 and in the individual file for the DLL in the Library/metadata folder.
Then I used Instruments’ File I/O instrument to figure out which functions were being called when those files were written. Figured that out and then used the Sampler to get a more complete view of what was being called around that function. Finally found a function called AssetImporter::GenerateFileIDHashBased.
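
To make the first step above concrete, here is a minimal Ruby sketch that turns the fileID from the question into the four little-endian bytes one would search the Library binaries for. The helper name `file_id_bytes` is made up for illustration.

```ruby
# Hypothetical helper: the 4 bytes that a 32-bit signed fileID occupies
# when it is stored in little-endian order.
def file_id_bytes(file_id)
  [file_id].pack('l<').bytes
end

bytes = file_id_bytes(-1167294237)
puts bytes.map { |b| format('%02x', b) }.join(' ')  # prints "e3 80 6c ba"
```

Reading those four bytes back with `unpack('l<')` gives the original signed number again, which is the same conversion the Ruby snippet above applies to the first four bytes of the digest.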

Used lldb to step through that function, assembly instruction by assembly instruction, and eventually found that it was making a call to MD4_Update in libssl. So it was based on MD4, and I quickly tried shoving “Namespace + Name” into MD4, but I didn’t see anything that looked like it was related to the fileID.

So then I made a custom dtrace instrument which just probed all calls to MD4_Update and displayed their arguments. I ran Unity with the dtrace probes and found that it was calling MD4_Update with the int 0x73 (“s\0\0\0” as a string) and then MD4_Update with the string “Namespace + Name”. So I ran that through MD4 and noticed that the first four bytes were the fileID (in little-endian byte order).


Thank you for sharing this amazing thing!

OK this is not yet finished, but I will post it anyway.
Here’s my toolbox: GitHub - genail/genail-toolbox: My Unity3D toolset for Unity Asset Store publishers

It consists of several scripts:
gt-fileid - computes fileid using the script above (and it works!)
gt-genguid - for guid generation
gt-regenguids - replaces all meta files guids in chosen directory and fixes references (good for duplicating)
gt-replaces - replace strings in sources. You can specify a configuration file
gt-update-references - Replaces recursively references to resource with another (guid and fileid)
gt-unitymake - utilizes all the rest and creates a simple build script

It’s not finished. It has some hard coded paths (Windows Unity install location), and documentation is not valid in some places, but it can be used!

Here’s an example ReplaceFile.rb that has been used to build Mad Level Manager trial version: STRINGS = { "_ASSET_NAME_" => "Mad Level Manager", "_NAMESPACE_" => "M - Pastebin.com

Thank you for posting the MD4 script. I’m trying to recreate the “m_localIdentifiertInFile”, which I think is the ID at the bottom rather than the fileID itself:
Transform:
  m_ObjectHideFlags: 0
  m_PrefabParentObject: {fileID: 400018, guid: 9f2fa551b27e9824094b74d1028e77fa, type: 2}
  m_PrefabInternal: {fileID: 265297395}
  m_GameObject: {fileID: 18793384}
  m_LocalRotation: {x: 0, y: 0, z: 0, w: 1}
  m_LocalPosition: {x: .149999976, y: -.099999994, z: 0}
  m_LocalScale: {x: 1, y: 1, z: 1}
  m_Children: []
  m_Father: {fileID: 93346996}
  m_RootOrder: 0
--- !u!114 &18793387

But I can’t tell yet what I need for it. “s\0\0\0” + t.Namespace + t.Name doesn’t seem to be all I need. Any hints?


Note that the initial “s\0\0\0” is actually 115 (as a 32-bit integer), which is the class ID for MonoScript. That can’t be a coincidence. :)
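
A quick way to check that claim, sketched in Ruby: the four prefix bytes read as a little-endian 32-bit integer give 115, and packing 115 the same way gives the prefix back.

```ruby
prefix = "s\0\0\0"

# Read the four prefix bytes as a little-endian 32-bit signed integer.
puts prefix.unpack('l<').first               # prints 115

# Going the other way: 115 packed as little-endian yields the same bytes.
puts [115].pack('l<').bytes == prefix.bytes  # prints true
```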


Does this algorithm still hold?

I find that the fileID of a “Test Component” from UnityTestTools is different when it lives in loose scripts than when it lives in an external DLL on Unity 5.5.0f3.

Did Unity change how they calculate the ID, or did I get some steps wrong here?

====== update ======

After some verification, this algorithm still works for external DLLs on Unity 5.5.0f3.

I have written a plugin to help switch scripts between source code and DLL without ending up with missing references.

I hope it helps.


The same solution as a compiled C# program (build and run in MonoDevelop) instead of Ruby.
Another approach, using a C# editor script, is included too. I haven’t tested either of them, but maybe they could be helpful to someone:
Forcing a GUID to regenerate/refresh - Questions & Answers - Unity Discussions

I needed to change a project from DLL to source code, and it just can’t work without searching and replacing every GUID and FileID. I ended up writing a little script to do that inside Unity from the menu.

You write the original and final GUID and FileID in the script itself (the oldGUIDsList, newGUIDsList, oldFileIDsList and newFileIDsList arrays, lines 36, 37, 39 & 40) and run it from menu->Tools->Regenerate asset GUIDs.
In theory you can list more than one GUID and FileID, but I haven’t tested that.

I used it to change 2 GUIDs in a very, very large and complicated project without any problem.

To find out which GUIDs you want to replace (original and final), look here (Step 3):
http://forum.unity3d.com/threads/148078-Reducing-script-compile-time-or-a-better-workflow-to-reduce-excessive-recompiling?p=1026639&viewfull=1#post1026639

Back up your project first. This can easily break your project for good!

/*

BACK UP YOUR PROJECT

*/


using System;
using System.Collections.Generic;
using System.IO;
using UnityEditor;
using UnityEngine;

public class UnityGuidRegeneratorMenu : MonoBehaviour {
    [MenuItem("Tools/Regenerate asset GUIDs")]
    public static void RegenerateGuids() {
        if (EditorUtility.DisplayDialog("GUIDs regeneration",
            "You are going to start the process of GUID regeneration. This may have unexpected results. \n\n MAKE A PROJECT BACKUP BEFORE PROCEEDING!",
            "Regenerate GUIDs", "Cancel")) {
            try {
                AssetDatabase.StartAssetEditing();
                string path = Path.GetFullPath(".") + Path.DirectorySeparatorChar + "Assets";
                RegenerateGuids (path);
            }
            finally {
                AssetDatabase.StopAssetEditing();
                EditorUtility.ClearProgressBar();
                AssetDatabase.Refresh();
            }
        }
    }

    private static readonly string[] fileListPath = { "*.meta", "*.mat", "*.anim", "*.prefab", "*.unity", "*.asset" };

    static string[] oldGUIDsList = new string[1] { "74dfce233ddb29b4294c3e23c1d3650d" };
    static string[] newGUIDsList = new string[1] { "89f0137620f6af44b9ba852b4190e64e" };

    static string[] oldFileIDsList = new string[1] { "11500000" };
    static string[] newFileIDsList = new string[1] { "-667331979" };
 
    static  string _assetsPath;
    static Dictionary<string, string> GUIDDict = new Dictionary<string, string>();
    static Dictionary<string, string> FileIDDict = new Dictionary<string, string>();

    public static void RegenerateGuids(string path) {
        //Debug.Log ("Init.");
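        // Use the indexer rather than Add(): the dictionaries are static, so running
        // the menu item a second time would otherwise throw on the duplicate keys.
        for(int i = 0; i < oldGUIDsList.Length; i++)
            GUIDDict[oldGUIDsList[i]] = newGUIDsList[i];

        for(int i = 0; i < oldFileIDsList.Length; i++)
            FileIDDict[oldFileIDsList[i]] = newFileIDsList[i];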

         //Get the list of files to modify
        _assetsPath = path;
        Debug.Log ("Read File List: "+ _assetsPath);
        //string[] fileList = File.ReadAllLines(_assetsPath + fileListPath);
        // Get list of working files
        List<string> fileList = new List<string>();
        foreach (string extension in fileListPath) {
            fileList.AddRange( Directory.GetFiles(_assetsPath, extension, SearchOption.AllDirectories) );
        }

        //Debug.Log ("GUI Start for each");
        foreach (string f in fileList) {
            //Debug.Log ("file: " + f);
            string[] fileLines = File.ReadAllLines( f );
        
             for(int i = 0; i < fileLines.Length; i++) {
                bool GUIReplaced = false;
                //find all instances of the string "guid: " and grab the next 32 characters as the old GUID
                if(fileLines[i].Contains("guid: ")) {
                     int index = fileLines[i].IndexOf("guid: ") + 6;
                     string oldGUID = fileLines[i].Substring(index, 32); // GUID has 32 characters.
                     //use that as a key to the dictionary and find the value
                     //replace those 32 characters with the new GUID value
                     if(GUIDDict.ContainsKey(oldGUID)) {
                        fileLines[i] = fileLines[i].Replace(oldGUID, GUIDDict[oldGUID]);
                        GUIReplaced = true;
                        Debug.Log("replaced GUID \"" + oldGUID + "\" with \"" + GUIDDict[oldGUID] + "\" in file " + f);
                     }
                    //else Debug.Log("GUIDDict did not contain the key " + oldGUID);
                }

                if (GUIReplaced && fileLines [i].Contains ("fileID: ")) {
                    int index = fileLines[i].IndexOf("fileID: ") + 8;
                    int index2 = fileLines[i].IndexOf(",", index);
                    string oldFileID = fileLines[i].Substring(index, index2-index); // The fileID runs up to the next comma.
                    //Debug.Log("FileID: "+oldFileID);
                    //use that as a key to the dictionary and find the value
                    //replace the old fileID with the new fileID value
                    if(FileIDDict.ContainsKey(oldFileID)) {
                        fileLines[i] = fileLines[i].Replace(oldFileID, FileIDDict[oldFileID]);
                        Debug.Log("replaced FileID \"" + oldFileID + "\" with \"" + FileIDDict[oldFileID] + "\" in file " + f);
                    }
                    //else Debug.Log("FileIDDict did not contain the key " + oldFileID);
                }
             }
             //Write the lines back to the file
            File.WriteAllLines(f, fileLines);
         }
     }

}

It took me a lot of time to understand what to do and how to do it, and to get the script working.
I hope it will be useful to somebody.

Do the .meta, .mat and .anim files need to be replaced? I’ve written my own solution replacing .prefab, .unity and .asset files only. As far as I’ve experienced, the files that need to have their references replaced are those that can store a reference to a MonoBehaviour or ScriptableObject only.

It depends on which references actually change when you use a DLL-based utility. In my case the material references do change between DLL and source, so I included those too. It was a utility that actually DOES create and use its own materials and textures.
In my case, not doing that resulted in my materials not being updated (there were no materials at all). I suppose it is not mandatory in all cases…

Greetings from Donosti.
If you run into any problem, let me know; it took me a huge amount of effort to understand and get through this business of swapping a source version for a DLL version (or vice versa). email: jocyf@jocyf.com

Under what conditions does the editor assign a file ID (in 5.6)? I’ve noticed that after saving a scene, some components of scene objects get them and others do not. Transforms often get them, but not always.

The change always happens when, instead of using a script, you replace it with a DLL that contains that script (or vice versa).
For the Transform component I don’t know, because that script vs. DLL “change” cannot be done there.

Sorry, I guess I was a bit off topic. I was just asking about the normal Unity behavior for assigning a File ID. Sometimes a component on a scene object still has this set to 0 after saving the scene. The rules for when one gets assigned aren’t obvious to me.

Thanks! I’ll answer in English so that it can be read on the forum.

Many thanks! I also came up with a solution myself that allows me to swap a DLL for source code (and the other way round). First, I parse the .cs files looking for MonoBehaviour-derived classes, grabbing their GUIDs and computing the corresponding FileID. The result is written to a CSV file, which contains the name, GUID and FileID of each component. Using that file I can then search the project files for GUID references and replace them with the corresponding FileID. The reverse procedure is straightforward.
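
A rough Ruby sketch of the replacement step described above, not the poster's actual tool. It assumes a name,guid,fileID column order for the CSV, a single DLL GUID that all rewritten references point to, and the `{fileID: …, guid: …, type: 3}` reference shape shown in the question. The component name "TestComponent" is made up; the GUIDs and fileID are the sample values from the editor script earlier in the thread.

```ruby
# Assumed CSV layout: name,guid,fileID (one row per component).
MAPPING_CSV = "TestComponent,74dfce233ddb29b4294c3e23c1d3650d,-667331979\n"
DLL_GUID = '89f0137620f6af44b9ba852b4190e64e'

# Matches a whole script reference such as
# {fileID: 11500000, guid: <32 hex chars>, type: 3}
SCRIPT_REF = /\{fileID: -?\d+, guid: ([0-9a-f]{32}), type: 3\}/

# Builds { script guid => fileID inside the DLL } from the CSV text.
def load_mapping(csv_text)
  csv_text.each_line.each_with_object({}) do |row, mapping|
    _name, guid, file_id = row.strip.split(',')
    mapping[guid] = file_id if guid && file_id
  end
end

# Rewrites source-script references on one line so they point into the DLL.
# References whose guid is not in the mapping are left untouched.
def to_dll_reference(line, mapping, dll_guid)
  line.gsub(SCRIPT_REF) do |match|
    file_id = mapping[Regexp.last_match(1)]
    file_id ? "{fileID: #{file_id}, guid: #{dll_guid}, type: 3}" : match
  end
end

mapping = load_mapping(MAPPING_CSV)
line = '  m_Script: {fileID: 11500000, guid: 74dfce233ddb29b4294c3e23c1d3650d, type: 3}'
puts to_dll_reference(line, mapping, DLL_GUID)
```

Matching the whole reference, rather than replacing the GUID and fileID substrings separately, keeps a fileID from being rewritten by accident where the same digits happen to appear elsewhere on the line.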


Thanks for the solution. Here is an F# function, using MD4 from NuGet Gallery | HashLib 2.0.1, that I got working thanks to this thread. I wrote the null character as \000 in F# instead of just \0 as in C#.

open System

let md4 = HashLib.HashFactory.Crypto.CreateMD4()

let computeFileID (tp: Type): int =
    // https://discussions.unity.com/t/541112
    let hash =
        sprintf "s\000\000\000%s%s" tp.Namespace tp.Name
        |> Text.Encoding.UTF8.GetBytes
        |> md4.ComputeBytes
    BitConverter.ToInt32(hash.GetBytes(), 0)