Writing to NativeArray in IJobParallelFor is extremely slow

I am working on a simple SPH fluid simulation, for which I wrote a job to compute density and pressure. It looks like this:

    [BurstCompile]
    private struct ComputeDensityAndPressureJob : IJobParallelFor
    {
        [ReadOnly] public NativeArray<Translation> positions;
        [ReadOnly] public NativeArray<int> positionToGrid;
        [ReadOnly] public NativeArray<FluidParticleComponent> particles;
        [ReadOnly] public int3 gridLength;
        [ReadOnly] public float kernelRadiusRate;
        [ReadOnly] public NativeMultiHashMap<int, int> hashMap;
        public NativeArray<float> densities;
        public NativeArray<float> pressures;

        public void Execute(int index)
        {
            float density = 0f;
            var gridIndex = positionToGrid[index];
            var position = positions[index].Value;
            var setting = particles[index];
            var kernelRadius = kernelRadiusRate * setting.radius;
            var kernelRadius2 = math.pow(kernelRadius, 2f);
            var poly6Constant = setting.density * setting.volume * 315f / (64f * math.PI * math.pow(kernelRadius, 9f));

            for (int x = -1; x <= 1; x++)
            {
                for (int y = -1; y <= 1; y++)
                {
                    for (int z = -1; z <= 1; z++)
                    {
                        var neighborGridIndex = gridIndex + XyzToGridIndex(x, y, z, gridLength);
                        var found = hashMap.TryGetFirstValue(neighborGridIndex, out var j, out var iterator);
                        while (found)
                        {
                            float distanceSq = math.lengthsq(position - positions[j].Value);
                            if (distanceSq < kernelRadius2)
                            {
                                // poly6: \rho_i = m \sum_j W_{poly6}
                                // W_poly6 = \frac{315}{64 \pi h^9} (h^2 - r^2)^3
                                density += poly6Constant * math.pow(kernelRadius2 - distanceSq, 3f);
                            }
                            found = hashMap.TryGetNextValue(out j, ref iterator);
                        }
                    }
                }
            }

            densities[index] = density;
            // p_i = k (\rho_i - \rho_0), with k = 2000
            pressures[index] = 2000f * (density - setting.density);
        }
    }
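
For reference, the quantities the job evaluates can be written out from the code comments as below. This is only a restatement: $m$ corresponds to `setting.density * setting.volume`, $h$ to `kernelRadius`, $r_{ij}^2$ to `distanceSq`, and $\rho_0$ to `setting.density`.

```latex
\rho_i = m \sum_j W_{\text{poly6}}(r_{ij}, h),
\qquad
W_{\text{poly6}}(r, h) =
\begin{cases}
  \dfrac{315}{64 \pi h^9} \left(h^2 - r^2\right)^3, & r^2 < h^2 \\
  0, & \text{otherwise}
\end{cases}

p_i = k \left(\rho_i - \rho_0\right), \qquad k = 2000
```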

When I play the demo in the editor, it runs at only 4 frames per second. In the Profiler window, I can see that the job takes 200 ms.
But when I remove the last two lines, which write back to the NativeArrays, it runs at 250 frames per second, and the CPU main thread time is only 4 ms.
This seems strange. I have turned off the safety checks. Why is writing data back to a NativeArray so slow?

The full code is linked below. You can run my project on Unity version 2020.3.25.
https://github.com/Acceyuriko/FluidSimulation/blob/main/Assets/Scripts/FluidSPHSystem.cs

Have you checked the assembly to see that the job actually does anything when you remove writing to the native array? Because it sounds like the compiler is simply removing all the work since you’re not storing the result of it anywhere.

It has nothing to do with writing to the array. You can check by wrapping the writes in a profiler marker:

            using (new ProfilerMarker("WriteToArray").Auto())
            {
                densities[index] = density;
                pressures[index] = 2000f * (density - setting.density);
            }

It’s just that when you remove this part of the code, Burst is smart enough to see that the job no longer produces any result, so it strips out all of the work.
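
To make that concrete, here is a minimal sketch of the same situation. The job and class names (`SumWithStoreJob`, `SumWithoutStoreJob`, `DeadCodeDemo`) are hypothetical and not taken from the linked project, and whether the loop is actually removed depends on the Burst version and its optimization settings, so treat it as an illustration rather than a guarantee. The two jobs do identical arithmetic; only the first one stores the result.

```csharp
using Unity.Burst;
using Unity.Collections;
using Unity.Jobs;
using Unity.Mathematics;

// Hypothetical job: does some work and stores the result.
[BurstCompile]
public struct SumWithStoreJob : IJobParallelFor
{
    [ReadOnly] public NativeArray<float> input;
    public NativeArray<float> output;

    public void Execute(int index)
    {
        float sum = 0f;
        for (int j = 0; j < input.Length; j++)
        {
            sum += math.sqrt(input[j] + index);
        }

        // The store makes `sum` observable, so the loop has to be kept.
        output[index] = sum;
    }
}

// Hypothetical job: same loop, but the result is never stored.
[BurstCompile]
public struct SumWithoutStoreJob : IJobParallelFor
{
    [ReadOnly] public NativeArray<float> input;

    public void Execute(int index)
    {
        float sum = 0f;
        for (int j = 0; j < input.Length; j++)
        {
            sum += math.sqrt(input[j] + index);
        }

        // Nothing observable depends on `sum`, so the optimizer is free
        // to delete the loop and leave an (almost) empty Execute.
    }
}

public static class DeadCodeDemo
{
    // Schedules both jobs back to back so they can be compared in the Profiler.
    public static void Run()
    {
        var input = new NativeArray<float>(1024, Allocator.TempJob);
        var output = new NativeArray<float>(1024, Allocator.TempJob);

        new SumWithStoreJob { input = input, output = output }
            .Schedule(input.Length, 64)
            .Complete();

        new SumWithoutStoreJob { input = input }
            .Schedule(input.Length, 64)
            .Complete();

        input.Dispose();
        output.Dispose();
    }
}
```

If the second job shows up as nearly free in the Profiler, that is the time of an empty job, not the time of the computation without the writes. To measure the cost of the computation itself, keep a store of the result and compare variants of the loop instead.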

Thank you very much. I think I have found the answer.