Processing a huge text file (lots of strings)

I have some existing C# code that imports a word list text file (via ScriptedImporter) and does a huge amount of processing on each word (line by line) and converts the whole list into a tightly packed Gaddag.

I was jobifying the runtime side of this project, and decided I’d like to make this import faster (can take up to 30s for larger lists, currently).

I’m trying to figure out the best way to convert the initial string data from the file into data for the jobs (I’d have parallel jobs processing each word).

I was thinking of building a NativeArray from the string[] that File.ReadAllLines returns - but this is somewhat cumbersome to populate (you can't just memcpy managed strings into native memory; you have to iterate and copy each one).
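For reference, a minimal sketch of that copy-per-element approach, using Unity.Collections' FixedString64Bytes as the unmanaged element type (this assumes no word exceeds that type's capacity; pick a larger FixedString variant for longer entries):

```csharp
using System.IO;
using Unity.Collections;

// Copy each line into a fixed-size unmanaged string so jobs can read it.
string[] lines = File.ReadAllLines(path);
var words = new NativeArray<FixedString64Bytes>(lines.Length, Allocator.Persistent);
for (int i = 0; i < lines.Length; i++)
    words[i] = new FixedString64Bytes(lines[i]); // per-element copy; no single memcpy possible
```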

Another way might be to read the file contents in chunks of 4096 characters at a time? Each job would have to process whole words only, so I'm not sure how I'd ensure I'm not cutting a word off in the middle.
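One common way to avoid splitting a word is to backtrack each chunk boundary to the last newline. A sketch in plain C# (here `text` is assumed to come from File.ReadAllText):

```csharp
using System;
using System.Collections.Generic;

// Split the raw text into ~4096-char chunks, ending each chunk at the last
// newline so no word is cut in half.
var chunks = new List<string>();
int start = 0;
while (start < text.Length)
{
    int end = Math.Min(start + 4096, text.Length);
    if (end < text.Length)
    {
        int lastNewline = text.LastIndexOf('\n', end - 1, end - start);
        if (lastNewline >= start) end = lastNewline + 1; // keep the newline in this chunk
    }
    chunks.Add(text.Substring(start, end - start));
    start = end;
}
```

Each resulting chunk then contains only complete lines, so a per-chunk job never sees a partial word.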

Any other way to get a massive string (File.ReadAllText) or string[] (File.ReadAllLines) into a job-friendly format?

A BlobAssetReference with a BlobString?
I assume you don't change the file content, just read it to populate your Gaddag?
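For the read-only case, one possible shape (a sketch assuming the Unity.Entities BlobBuilder API; `WordList` and `BuildBlob` are illustrative names, not from the original post) is a BlobArray of BlobStrings built once at import time:

```csharp
using Unity.Collections;
using Unity.Entities;

public struct WordList
{
    public BlobArray<BlobString> Words;
}

// Pack the imported lines into a single immutable blob that jobs can read.
public static BlobAssetReference<WordList> BuildBlob(string[] lines)
{
    using var builder = new BlobBuilder(Allocator.Temp);
    ref WordList root = ref builder.ConstructRoot<WordList>();
    BlobBuilderArray<BlobString> words = builder.Allocate(ref root.Words, lines.Length);
    for (int i = 0; i < lines.Length; i++)
        builder.AllocateString(ref words[i], lines[i]);
    return builder.CreateBlobAssetReference<WordList>(Allocator.Persistent);
}
```

Jobs can then take the BlobAssetReference&lt;WordList&gt; as a field and index into `Value.Words` without touching managed memory.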


Exactly this! Thank you, I was only looking in the Unity.Collections namespace and didn’t realise this existed. I’ll give that a go