Inconsistent behavior instantiating entities in a C# Job

[Unity 2020.2.1f1 | Entities Version 0.17.0-preview.42]

Problem: My entities instantiate as expected only when I enter play mode for the first time after opening my project, or after I make and save changes to the code. Each subsequent time I enter play mode without doing one of those two things, some entities generate in unexpected ways, but a majority are correct.

For reference, the scene operates like this:
Start → generate entities and store them in a native array for later cloning & positioning → (on UI button press) generate terrain data using Mathf.PerlinNoise for entity positions (done on the main thread) → C# job clones entities and sets positions using terrain data → wait a frame (updates broadphase) → C# job raycasts against entities and stores results in a native array → C# job (dependent on the previous job) then runs comparisons on the raycasts and spawns additional entities. The comparisons sometimes trigger false positives, which create the out-of-place entities on subsequent play mode sessions (but never the first, as mentioned above).

My initial thoughts were that it was a scheduling issue (false positives/weird behavior & jobs is a red flag) or floating point imprecision in the comparisons, but it always works right once then never again after.

The last job (the one with the actual issue) also calls Mathf.PerlinNoise from within the job for some comparisons between terrain chunk borders (think Minecraft-style chunks). I’ve done a little research, and using Mathf in a job doesn’t seem to be bad (just slower, maybe), but I noticed that Unity.Mathematics also has a bunch of different noise implementations which may be worth switching to down the line. I don’t think Mathf.PerlinNoise is the actual issue tho, because it is marked as static and thread safe, but then again this issue only cropped up when I started using it, so…

Possible Solution: IF calling Mathf.PerlinNoise from the job is in fact the issue, I could cache the borders of neighboring terrain chunks in each terrain chunk, which would eliminate the call to Perlin noise in the job at the cost of some memory. (I would like to avoid this since it will take a teeny bit of refactoring, but I am more than willing if it’s necessary.)

Questions:

  1. Does the Unity editor have a known issue with caching information between play sessions in some weird way that would cause a problem with C# jobs/native collections like this? (I feel like I remember reading a forum post about something like this a few years ago, but I couldn’t find anything.)

  2. Is using Mathf.PerlinNoise in a C# job actually bad practice? If yes, would it be safe to use the Unity.Mathematics implementation, or should I try something like my proposed solution instead?

I might try to implement my caching solution when I have time, and will share the results if I do. Regardless, if y’all need more info or have any input, it would be super appreciated ^-^

For (1), there’s no caching in DOTS natively that I would expect to cause problems because it supports fast play-mode options. However, if you set those options in your project and are using static fields, that may be your issue.


Okay, just a mini update: I enabled the play mode options (both reload domain & reload scene) since they seem worthwhile to test with. They say it makes it closer to the player (which I assume means a build). Even with those settings enabled, the error still occurred tho. (I didn’t know they existed, so I’m still glad you mentioned them! ^-^)

I also disabled the code that calls Mathf.PerlinNoise in the C# job, and sure enough the false positives seem to have disappeared. I’ll still need to sample the noise for the cases I disabled, but I’m going to try the solution I mentioned above and will update again on whether it solves the problem entirely.

Okay another update, I’ve made some progress and I think I’ve managed to isolate the issue but still am not sure what to do!

First, to answer both parts of my previous question (2): yes, using Mathf.PerlinNoise in a C# job is bad practice, and from what I have seen, yes, the Unity.Mathematics implementation is safe to use. I haven’t tested it with the noise functions personally, but I’ve seen Unity.Mathematics used in C# job examples without issue. As for Mathf.PerlinNoise, the C# Job System tips and troubleshooting page states that accessing static data will cause issues and will be unsupported in the future. So that’s that!
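For reference, a minimal sketch of sampling noise inside a job with Unity.Mathematics instead of Mathf. The job name, field names, and the frequency parameter are made up for illustration and are not from my project. Note that `noise.cnoise` returns values in roughly [-1, 1] and does not reproduce the values of Mathf.PerlinNoise, so the terrain data and the in-job comparisons would both have to use the same function.

```csharp
using Unity.Burst;
using Unity.Collections;
using Unity.Jobs;
using Unity.Mathematics;

// Hypothetical job: samples classic Perlin noise at a set of 2D points.
[BurstCompile]
public struct BorderNoiseJob : IJobParallelFor
{
    [ReadOnly] public NativeArray<float2> SamplePoints; // points along a chunk border
    [WriteOnly] public NativeArray<float> Heights;      // one result per sample point
    public float Frequency;                             // assumed scale factor

    public void Execute(int index)
    {
        // cnoise is roughly in [-1, 1]; remap to [0, 1] so it is used
        // like Mathf.PerlinNoise output (the actual values will differ).
        float n = noise.cnoise(SamplePoints[index] * Frequency);
        Heights[index] = n * 0.5f + 0.5f;
    }
}
```

Scheduling would look like `new BorderNoiseJob { ... }.Schedule(points.Length, 64)`, with both native arrays allocated by the caller and disposed once the handle has completed.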

As for my progress: I have cleaned up my code a fair bit, and implemented the caching solution for the perimeter of my terrain chunks and it works well, but the problem still happens just in a new form. >.< I’ve been doing lots of research and profiling and I think I’ve reached some kind of breakthrough.

To start, I’ve isolated each of the 4 phases of the process so there is 1 frame of buffer between each respective JobHandle.Complete() call. So to reiterate, the jobs are called in an Update() method like this:

press UI button to pre-generate chunks of terrain data on the main thread (Update()) and move to phase 0

[now in Update()]
phase 0: create a C# job to generate the source entities and set their physics colliders via a command buffer. These entities are later cloned to make up the actual terrain chunks. (This is done on scene start or if the terrain is reset.)

wait 1 frame

phase 1: create a C# job to generate the base terrain out of the previously made source entities via a command buffer.

wait 1 frame

phase 2: create a C# job that uses raycasts to sample terrain data for processing in the next job.

wait 1 frame

phase 3: create a C# job that analyzes the data from the last job, and generates more entities from the source entities made earlier.

done!
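The frame gating in the list above can be sketched in plain C# with no Unity dependency. This is only an illustration of the waiting logic: `PhasedGenerator` and its members are hypothetical names, and each phase is just a delegate standing in for "schedule and complete the job".

```csharp
using System;

// Hypothetical helper: runs one phase, then skips a configurable
// number of frames before the next phase is allowed to run.
public sealed class PhasedGenerator
{
    private readonly Action[] phases;    // work to kick off for each phase
    private readonly int[] framesToWait; // frames to skip after each phase
    private int next;                    // index of the next phase to run
    private int cooldown;                // frames left before it may run

    public PhasedGenerator(Action[] phases, int[] framesToWait)
    {
        if (phases.Length != framesToWait.Length)
            throw new ArgumentException("Need one wait value per phase.");
        this.phases = phases;
        this.framesToWait = framesToWait;
        next = phases.Length; // idle until Begin() is called
    }

    public bool IsDone => next >= phases.Length;

    // Call from the UI button handler.
    public void Begin()
    {
        next = 0;
        cooldown = 0;
    }

    // Call once per frame, e.g. from MonoBehaviour.Update().
    public void Tick()
    {
        if (IsDone) return;
        if (cooldown > 0) { cooldown--; return; }
        phases[next]();
        cooldown = framesToWait[next];
        next++;
    }
}
```

With wait values of `{ 1, 1, 1, 0 }` this matches the list above; changing the third value to 2 gives the two-frame gap between phase 2 and phase 3.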

Here is a sample of what a single terrain chunk looks like when it runs properly…
(for reference, the gray square underneath is just a blank tile at 0,0,0 and is unrelated. The mess of things next to it are the actual source entities that are used to later clone and make the terrain)

…and here is when the problem happens:

The red circle is highlighting the cloned entities that end up in the wrong place (0,0,0 in this case). Sometimes it comes out like the second image, but most of the time it works like the first image.

Here’s the kicker tho, if I wait TWO frames instead of one between phase 2 and 3 it works perfectly 100% of the time. If I only wait for one frame it mostly works and sometimes doesn’t.

So the original issue seems to have been multiple things, but now that I’ve refactored my code I’ve narrowed it down to what seems to be a scheduling issue with the physics system, and what I can only assume is getting unlucky with when the job completes relative to the physics system’s broadphase update. I can live with the two-frame delay for the terrain due to the nature of my game, but that’s 2 frames EVERY time I want to raycast, which would be problematic in other situations, and it all just screams that I’m still doing something wrong.

Which brings me to my questions:

  1. I had put the [UpdateInGroup], [UpdateBefore] and [UpdateAfter] attributes on my C# jobs. Looking at the profiler, they don’t appear to actually affect WHEN the C# jobs are scheduled, and I have since removed them. I also realized, based on some research, that these attributes might only apply to ECS systems, and that C# jobs and ECS systems are independent of each other even if they are compatible. Is that true?

  2. Is there a way to force a job to complete within a certain system/group (similar to the [UpdateInGroup]/[UpdateBefore]/[UpdateAfter] attributes), or is that impossible?

  3. Would this work better as an ECS system (due to being able to set what group/system it updates in), and if so, how can I make the system only run on demand (like on a button press) and not every frame?

Once again, any input is super appreciated. Unfortunately this problem just doesn’t seem to want to get solved yet! Also, if you need code samples for a better idea of what’s happening, let me know and I’ll put something together! Regardless, thank you for your time ^-^
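To make question 3 concrete, here is a rough, untested sketch of what an on-demand system could look like. It assumes the Unity Physics package that pairs with Entities 0.17, where (as far as I understand) the physics systems run in FixedStepSimulationSystemGroup, which may not tick on every rendered frame. `TerrainSampleRequest` and `TerrainSampleSystem` are hypothetical names, and the ray endpoints are placeholders.

```csharp
using Unity.Entities;
using Unity.Mathematics;
using Unity.Physics;
using Unity.Physics.Systems;

// Hypothetical tag: create one entity with this component on button press,
// after the terrain entities already exist in the world.
public struct TerrainSampleRequest : IComponentData { }

// Runs in the same group as physics, right after the broadphase is built.
[UpdateInGroup(typeof(FixedStepSimulationSystemGroup))]
[UpdateAfter(typeof(BuildPhysicsWorld))]
[UpdateBefore(typeof(StepPhysicsWorld))]
public class TerrainSampleSystem : SystemBase
{
    private BuildPhysicsWorld buildPhysicsWorld;

    protected override void OnCreate()
    {
        buildPhysicsWorld = World.GetOrCreateSystem<BuildPhysicsWorld>();
        // OnUpdate is skipped unless a request entity exists.
        RequireSingletonForUpdate<TerrainSampleRequest>();
    }

    protected override void OnUpdate()
    {
        // Wait for the jobs that build the physics world before reading it.
        buildPhysicsWorld.GetOutputDependency().Complete();
        CollisionWorld collisionWorld = buildPhysicsWorld.PhysicsWorld.CollisionWorld;

        var input = new RaycastInput
        {
            Start = new float3(0f, 100f, 0f),  // placeholder ray
            End = new float3(0f, -100f, 0f),
            Filter = CollisionFilter.Default
        };
        if (collisionWorld.CastRay(input, out RaycastHit hit))
        {
            // hit.Position would feed the phase 3 comparisons.
        }

        // Consume the request so the system goes idle again.
        EntityManager.DestroyEntity(GetSingletonEntity<TerrainSampleRequest>());
    }
}
```

The idea is that the system only ever runs on a tick where BuildPhysicsWorld has also run, instead of counting rendered frames from a MonoBehaviour. The raycast here is done on the main thread to keep the sketch short; moving it into a job would need extra dependency handling with the physics systems.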

Sorry I haven’t read everything in this thread.

But I would suspect a problem related to entity order. Basically, the order of entities and their indices are not guaranteed to be the same on every consecutive game start. I don’t know if that is the problem here, but in any case, entity order alone should not be relied upon.

I appreciate your input, and you’re definitely right about that! :) The documentation clearly states it’s not necessarily reliable to access an entity by its index/version, and that each iteration of an IJobParallelFor should be independent from the others (which mine are, to my knowledge).

To clarify: the source entities are generated and stored in a persistent native array in a pre-determined, never-changing order (i.e. 0 = flat, 1 = NE corner, 2 = SE corner, etc.) and later individually cloned and positioned in phases 1 & 3 based on the data from phases 0 & 2 respectively. At no point are they (or any other generated entities) accessed by their respective indices/versions OR the index of the IJobParallelFor.

Which brings me back to what feels like my band-aid solution of waiting two frames between phase 2 and 3 which does make it work 100% of the time, but with a noticeable delay. This 2 frame delay isn’t necessarily a problem for the terrain portion of my project, but in an instance that requires more precision it could negatively impact the player’s experience.

I’m most familiar with OOP coding, but have been learning the ECS/DOTS approach for ~1 year and I feel like I’ve come a long way, but it also just feels like I’m still doing this wrong :/