What are the best tools for Many to Many relationships in DOTS?

I tried doing some digging around and going through the documentation without much success but I might be missing something.

What’s the best way to implement Many to Many relationships through Entities?

I’m aware that Entities will be used to define the relationships, but I’m asking about a good tool to implement them.

I’ll try giving a generic example: neural nets (following something like the NEAT algorithm)

Let’s say I have Node Entities holding a previousValue and a presentValue. I also have Connection Entities, each holding an InputNode reference (to the Entity) and an OutputNode reference. In my case, we’ll say that any number of connections can exit or enter a node (no predefined number of connections per node). Nonetheless, we will have a high number of Nodes and Connections.
It might be good to add the constraint that no two connections would ever share both the same OutputNode and the same InputNode.
Each frame, I need to read the previousValue of the InputNode and add it to the presentValue of the OutputNode (which has been reset to zero at the start of the frame by another system).
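As components, the data layout described above might look something like this (a sketch; the names are taken straight from the description):

```csharp
using Unity.Entities;

public struct Node : IComponentData
{
    public float PreviousValue;
    public float PresentValue;
}

public struct Connection : IComponentData
{
    public Entity InputNode;   // entity whose previousValue we read
    public Entity OutputNode;  // entity whose presentValue we accumulate into
}
```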

I figure it might be a good idea to separate the Read System and the Write System, maybe by adding an InputValue to the connection, which first gets copied from the InputNode’s value using ComponentDataFromEntity in the Read System.
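A minimal sketch of that Read pass, assuming Node and Connection components matching the description above and a hypothetical Signal component on each connection to hold the copied value (IJobForEach was the idiomatic job type in that era of Entities; since ComponentDataFromEntity is declared read-only here, running it in parallel is safe):

```csharp
using Unity.Burst;
using Unity.Collections;
using Unity.Entities;

public struct Signal : IComponentData { public float Value; }

[BurstCompile]
public struct ReadConnectionsJob : IJobForEach<Connection, Signal>
{
    // Read-only random access into node data is safe across threads.
    [ReadOnly] public ComponentDataFromEntity<Node> Nodes;

    public void Execute([ReadOnly] ref Connection conn, ref Signal signal)
    {
        signal.Value = Nodes[conn.InputNode].PreviousValue;
    }
}
```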

However, the Write System seems a lot more complex given the “undefined number of entering connections” premise. A few ideas I looked into, and the potential problems I thought of:

  • Using ComponentDataFromEntity in write mode:

This would allow keeping the read and write systems together (or not), but we would have to ensure thread safety ourselves, which can be tricky. Basically, we would have to make sure that all Connections sharing the same OutputNode are executed in the same job, which I’m not sure how to do. Furthermore, this could be difficult because we don’t know how many connections refer to that node. Could be one, could be ten, maybe a hundred?

  • Using the EntityCommandBuffer :

If my understanding is correct, this would again allow keeping the read and write systems together, but considering we have a high number of Nodes and Connections and that this would need to run every frame, the ECB might be a particularly bad tool for the job.

  • Using a SharedComponentData for the OutputNode reference:

To my understanding, this would group Connections with the same OutputNode together in the same chunks, maybe allowing us to use IJobChunk to ensure thread safety, so that no connections writing to the same node run in parallel. However, I’m not sure whether two chunks could still end up writing to the same Node (say, if the number of entering connections is greater than the chunk’s maximum length). I also don’t know what the impact would be if the opposite happens and most Nodes have very few entering connections: most chunks would contain only one or two entities, we would have a lot of chunks, and that could nullify the effects of ECS’s memory optimisation. My guess is that the latter is most likely, and that only a low number of connections would share the same OutputNode (<=10, maybe).
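For reference, the shared-component variant would be declared roughly like this (a sketch; the name is illustrative). Every distinct value gets its own chunk(s), which is exactly where the fragmentation concern above comes from:

```csharp
using System;
using Unity.Entities;

// All Connections with an equal OutputNodeShared value are stored
// together in the same chunk(s).
public struct OutputNodeShared : ISharedComponentData, IEquatable<OutputNodeShared>
{
    public Entity Value;

    public bool Equals(OutputNodeShared other) => Value == other.Value;
    public override int GetHashCode() => Value.GetHashCode();
}
```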

I know I’m probably missing some tools or knowledge; I’m just wondering how you would implement such a system and what kind of pitfalls I’m maybe overlooking or forgetting. Also, are there additional limitations we could impose on the premise (a max number of connections with the same OutputNode, or something like that) to make a specific tool/solution better?

Thanks a lot in advance, I’m pretty new to the world of ECS and slowly learning.

Ideally you want each node to have a list of all its inputs so that it can safely random-access read them in parallel. This is one of those situations where I think a NativeMultiHashMap makes sense. For each connection, you add an OutputNode key and an InputNode value to the map. Then each node looks up itself in the map and iterates through all its inputs using ComponentDataFromEntity.
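A rough sketch of that scheme, under the assumption that the connection data has already been copied somewhere readable (names like `inputsByNode` and `previousValues` are illustrative; the build step would normally run in a job via the map’s parallel writer):

```csharp
// Build: one entry per connection, keyed by the node the connection feeds.
var inputsByNode = new NativeMultiHashMap<Entity, Entity>(connectionCount, Allocator.TempJob);
// for each Connection conn:
//     inputsByNode.Add(conn.OutputNode, conn.InputNode);

// Gather: each node iterates all of its own inputs and sums their values.
float sum = 0f;
if (inputsByNode.TryGetFirstValue(nodeEntity, out Entity input, out var it))
{
    do
    {
        sum += previousValues[input].Value; // previousValues: a [ReadOnly] ComponentDataFromEntity
    }
    while (inputsByNode.TryGetNextValue(out input, ref it));
}
```

Because each node only reads other nodes and writes itself, the gather step can run fully in parallel.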

Thanks a lot for the answer, I didn’t know about this particular memory structure!

Wouldn’t that scale really badly with lots of Nodes/Connections? Let’s say in the 100’000s? (I’m asking, I don’t know)

I’m new to the field of DOTS, but it feels like we would want to avoid random-access whenever possible?

Also, in the case where I would want to add weights to connections to represent the strength of a particular link, could a NativeMultiHashMap still be used?

Yes. You would just make the “value” side of the NativeMultiHashMap a struct containing whatever extra per-connection data you need.
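For example, something like this (hypothetical names):

```csharp
// Hypothetical value struct: any extra per-connection data rides along
// with the input-node reference.
public struct InputLink
{
    public Entity InputNode;
    public float  Weight;
}

// The map becomes NativeMultiHashMap<Entity, InputLink>, filled as:
//     inputsByNode.Add(conn.OutputNode,
//         new InputLink { InputNode = conn.InputNode, Weight = conn.Weight });
```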

Very interesting!

After looking through the documentation (and the boid sample), I’m struggling to understand the application of what you described. Could you explain it a little bit more, or at least refer me to some good resources?

The boids sample won’t help you much. But what part of my description (the first reply) are you not understanding?

After many hours of video watching and head-banging, I finally managed to implement a working prototype. The connections manage to propagate node values accordingly, and I’m impressed by the performance (obviously, ahah).

In the end, the Boid sample was very helpful in understanding the specific structure and use case of the NativeMultiHashMap. If anyone is interested, at some point I’m open to sharing my solution.

In a database, a many-to-many relationship needs an extra table with two foreign keys. Since Entities/Components are basically an in-memory database, I’d be inclined to do the same: just have a component with two entity references, and use one entity per relationship. Not sure how well it would perform though; I assume the solution @DreamingImLatios suggested is better in both readability and performance, but that’s how I probably would have done it myself.

That’s how I ended up implementing it.
Connections are their own entities and store the connection strength. They also have a Signal component that pre-loads the value from their input node and multiplies it by the connection strength in a system.

The “Stimulation” system is then responsible for loading every signal’s value into a NativeMultiHashMap (along with other structures), which is used by a custom variant of the “JobNativeMultiHashMapMergedSharedKeyIndices” job (from Unity’s boid sample project) that collapses the MultiHashMap into a single native array, which is then written back to all the output nodes.
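Conceptually, that collapse step boils down to summing every map entry that shares an output-node key into one slot per node, roughly like this single-threaded sketch (names are illustrative; the real merged-shared-key job from the boid sample parallelizes this across unique keys):

```csharp
using Unity.Collections;
using Unity.Entities;
using Unity.Jobs;

public struct CollapseSignalsJob : IJob
{
    [ReadOnly] public NativeMultiHashMap<Entity, float> SignalsByOutputNode;
    [ReadOnly] public NativeArray<Entity> NodeEntities;
    public NativeArray<float> Sums; // one accumulator per node, written back afterwards

    public void Execute()
    {
        for (int i = 0; i < NodeEntities.Length; i++)
        {
            float sum = 0f;
            if (SignalsByOutputNode.TryGetFirstValue(NodeEntities[i], out float v, out var it))
            {
                do { sum += v; }
                while (SignalsByOutputNode.TryGetNextValue(out v, ref it));
            }
            Sums[i] = sum;
        }
    }
}
```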

It might not be the best solution, but it’s the most straightforward one I managed to wrap my head around.
I’ll probably get back to it at some point.

The nice thing about DOTS is, like Mike Geig showed in the horde shooter example Unity uploaded a few years ago, that you don’t even need to find the “perfect” solution to get great performance. Every single bullet checked the distance to every single enemy to determine collision, and he still got amazing performance out of it despite the implementation being terrible (his words, not mine). But yeah, if you want to squeeze every last frame out of it and find the very best solution, I assume that would depend on the context and scale, and you’d probably need to do some testing and digging through docs. I’m glad the entity-per-connection solution works out, though.