Hello everyone, I am on the final stretch of my Master’s thesis. My research focuses on multithreaded performance in Unity Jobs with Burst, and I have a few questions about it. I would like to share them here and open this thread for discussion as well.
1. What is the best way to evaluate the performance of a multithreaded script in Unity Jobs? I am currently using the Unity Profiler, which I find difficult to interpret. I am unsure whether I should evaluate the cost of the job in ms by picking every slice of the job on each thread and averaging them. In some of my scripts, the blue slices in the threads (the job I am running) are split across many places, which does not happen in the Unity samples. I have also seen articles that measure performance by the FPS of the entire execution. (Picture 1: a screenshot of the Unity Profiler for the Unity sample FindNearest job. Picture 2: a screenshot of my own job implementation of a flocking algorithm.) If FPS is the best measure, how do I collect FPS data properly? Take the average of the highest and lowest peaks? Use the gizmo in the Game tab? Use the Profiler?
2. Regarding memory allocation cost, do jobs have any effect on memory allocation? I see only a small variation between the parallel and sequential algorithms: let’s say the parallel script uses 1.15 GB in total and the sequential one 1.16 GB. (I am not using ECS.)
3. This one is highly relevant: why can’t I use a batch size (the second argument to Schedule) lower than the number of elements the job processes? In the sample code below, the batch size is set to 64 even when the number of Seekers is larger than that. I cannot reproduce this in any other script I write with Unity Jobs. In the second script, which is mine, I am trying to make the batch size lower than the number of Boid units (from FlockManager) so that the work is spread across threads better. However, I get the three errors listed below. So far, it only runs without errors if the batch size is equal to or greater than the number of job units. What am I doing wrong? (Picture 3: the flocking job cannot use a batch size lower than the number of job units.)
4. I understand that adding Jobs and Burst to an existing script can take less development time than implementing ECS, which is the primary reason I am not using ECS at the moment. I am curious whether this was also the main reason for not using ECS in this Unity sample (FindNearest). Unlike other Unity DOTS samples, this one does not use ECS. Could you share more about why that choice was made?
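To make question 1 more concrete, this is roughly the kind of FPS collection I have in mind. It is a minimal sketch, not code I am currently running: `FrameTimeStats` and its members are hypothetical names, and it assumes one frame duration in milliseconds is fed in per frame (in Unity that could be `Time.unscaledDeltaTime * 1000`).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not a Unity API): accumulates per-frame durations
// in milliseconds and summarizes them after a run.
public sealed class FrameTimeStats
{
    private readonly List<double> _frameMs = new List<double>();

    // Call once per frame with that frame's duration in milliseconds.
    public void Add(double frameMs)
    {
        if (frameMs <= 0.0) throw new ArgumentOutOfRangeException(nameof(frameMs));
        _frameMs.Add(frameMs);
    }

    public int Count => _frameMs.Count;

    // The summaries below throw InvalidOperationException if no frame was added.
    public double MeanMs => _frameMs.Average();
    public double MinMs => _frameMs.Min();
    public double MaxMs => _frameMs.Max();

    // Average FPS derived from the mean frame time,
    // rather than from averaging per-frame FPS values.
    public double MeanFps => 1000.0 / MeanMs;

    // Nearest-rank percentile of the frame times, with p in (0, 100].
    public double PercentileMs(double p)
    {
        if (p <= 0.0 || p > 100.0) throw new ArgumentOutOfRangeException(nameof(p));
        List<double> sorted = _frameMs.OrderBy(x => x).ToList();
        int rank = (int)Math.Ceiling(p / 100.0 * sorted.Count);
        return sorted[Math.Max(rank, 1) - 1];
    }
}
```

The idea would be to report the mean and a high percentile of the frame time over a fixed number of frames. Averaging only the highest and lowest values depends on just two samples, so a single spike would dominate the result.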
Below are the scripts mentioned and the errors I get; the pictures are attached as well.
Thank you very much, I appreciate it in advance!
(I didn’t know whether to tag this as a Question or a Discussion, since I’m planning on posting new answers here, if you allow me.)
Best regards!
Patrick Machado
Script FindNearest Sample (described in question 3):
using Unity.Collections;
using Unity.Jobs;
using Unity.Mathematics;
using UnityEngine;

namespace Step3
{
    public class FindNearest : MonoBehaviour
    {
        public NativeArray<float3> TargetPositions;
        public NativeArray<float3> SeekerPositions;
        public NativeArray<float3> NearestTargetPositions;

        public void Awake()
        {
            Spawner spawner = Object.FindObjectOfType<Spawner>();
            TargetPositions = new NativeArray<float3>(spawner.NumTargets, Allocator.Persistent);
            SeekerPositions = new NativeArray<float3>(spawner.NumSeekers, Allocator.Persistent);
            NearestTargetPositions = new NativeArray<float3>(spawner.NumSeekers, Allocator.Persistent);
        }

        public void OnDestroy()
        {
            TargetPositions.Dispose();
            SeekerPositions.Dispose();
            NearestTargetPositions.Dispose();
        }

        public void Update()
        {
            for (int i = 0; i < TargetPositions.Length; i++)
            {
                TargetPositions[i] = Spawner.TargetTransforms[i].localPosition;
            }

            for (int i = 0; i < SeekerPositions.Length; i++)
            {
                SeekerPositions[i] = Spawner.SeekerTransforms[i].localPosition;
            }

            FindNearestJob findJob = new FindNearestJob
            {
                TargetPositions = TargetPositions,
                SeekerPositions = SeekerPositions,
                NearestTargetPositions = NearestTargetPositions,
            };

            // Execute will be called once for every element of the SeekerPositions array,
            // with every index from 0 up to (but not including) the length of the array.
            // The Execute calls will be split into batches of 64.
            JobHandle findHandle = findJob.Schedule(SeekerPositions.Length, 64);
            findHandle.Complete();

            for (int i = 0; i < SeekerPositions.Length; i++)
            {
                Debug.DrawLine(SeekerPositions[i], NearestTargetPositions[i]);
            }
        }
    }
}
Script FlockManager (described in question 3):
using UnityEngine;
using Unity.Jobs;
using Unity.Mathematics;
using Unity.Collections;
using System.Collections;
using Unity.Burst;
using Unity.Jobs.LowLevel.Unsafe;
using UnityEngine.Profiling;

namespace NewBoid_JobParallelized
{
    public struct Boid
    {
        public float3 position;
        public float3 velocity;
        public float3 acceleration;
    }

    public class FlockManager : MonoBehaviour
    {
        public GameObject boidPrefab;
        public int numBoids = 50;
        public float maxSpeed = 5f;
        public float maxForce = 1f;
        public float separationDistance = 2f;
        public float alignmentDistance = 10f;
        public float cohesionDistance = 10f;
        public float separationWeight = 1f;
        public float alignmentWeight = 1f;
        public float cohesionWeight = 1f;
        public float boundsRadius = 50f;
        public Transform target;
        private NativeArray<Boid> boids;
        private JobHandle flockJobHandle;
        public GameObject[] boidPrefabs;
        public float spawnRadius;
        public int BatchSize = 64;
        public int ThreadLimitedTo = -1;

        void Start()
        {
            boids = new NativeArray<Boid>(numBoids, Allocator.Persistent);
            boidPrefabs = new GameObject[numBoids];
            for (int i = 0; i < numBoids; i++)
            {
                // Instantiate the boid prefab and store the reference in boidPrefabs
                Vector3 position = new Vector3(UnityEngine.Random.Range(-spawnRadius, spawnRadius), UnityEngine.Random.Range(-spawnRadius, spawnRadius), UnityEngine.Random.Range(-spawnRadius, spawnRadius));
                Quaternion rotation = Quaternion.Euler(UnityEngine.Random.Range(-180, 180), UnityEngine.Random.Range(-180, 180), UnityEngine.Random.Range(-180, 180));
                GameObject _boidPrefab = Instantiate(boidPrefab, position, rotation);
                boidPrefabs[i] = _boidPrefab;
            }
            InitializeBoids();
        }

        void InitializeBoids()
        {
            for (int i = 0; i < boids.Length; i++)
            {
                Boid boid = new Boid
                {
                    position = boidPrefabs[i].transform.position,
                    velocity = boidPrefabs[i].GetComponent<Rigidbody>().velocity
                };
                boids[i] = boid;
            }
        }

        void Update()
        {
            System.Diagnostics.Stopwatch stopwatch = System.Diagnostics.Stopwatch.StartNew();
            // Create the flock job and schedule it
            var flockJob = new FlockJob
            {
                boids = boids,
                maxSpeed = maxSpeed,
                maxForce = maxForce,
                separationDistance = separationDistance,
                alignmentDistance = alignmentDistance,
                cohesionDistance = cohesionDistance,
                separationWeight = separationWeight,
                alignmentWeight = alignmentWeight,
                cohesionWeight = cohesionWeight,
                boundsRadius = boundsRadius,
                targetPosition = target.position,
                deltaTime = Time.deltaTime
            };
            // Update the Boid data with the current position and velocity of the prefab
            for (int i = 0; i < boids.Length; i++)
            {
                GameObject boidPrefab = boidPrefabs[i];
                boidPrefab.transform.position = boids[i].position;
                boidPrefab.transform.rotation = Quaternion.LookRotation(boids[i].velocity);
                Boid boid = boids[i];
                boid.position = boidPrefab.transform.position;
                boid.velocity = boidPrefab.GetComponent<Rigidbody>().velocity;
                boids[i] = boid;
            }
            //int numBoidsPerJob = (int)Mathf.Ceil((float)numBoids / 64f);
            if (ThreadLimitedTo != -1) JobsUtility.JobWorkerCount = ThreadLimitedTo;
            Debug.Log("Threads: " + JobsUtility.JobWorkerCount);
            flockJobHandle = flockJob.Schedule(boids.Length, BatchSize);
        }

        void LateUpdate()
        {
            // Wait for the flock job to complete and update the boid positions
            flockJobHandle.Complete();
            for (int i = 0; i < numBoids; i++)
            {
                Boid boid = boids[i];
                boid.velocity += boid.acceleration * Time.deltaTime;
                boid.velocity = math.clamp(boid.velocity, -maxSpeed, maxSpeed);
                // add target following behavior
                float3 targetOffset = new float3(target.position.x, target.position.y, target.position.z) - boid.position;
                float targetDistance = math.length(targetOffset);
                if (targetDistance > 0.1f) // if the boid is far from the target
                {
                    float3 targetVelocity = math.normalize(targetOffset) * maxSpeed;
                    float3 targetAcceleration = (targetVelocity - boid.velocity) * 10f; // use a high multiplier to make the boids follow the target more quickly
                    boid.acceleration += targetAcceleration;
                }
                boid.position += boid.velocity * Time.deltaTime;
                boid.acceleration = float3.zero;
                boidPrefabs[i].transform.position = boid.position;
                boidPrefabs[i].transform.rotation = Quaternion.LookRotation(boid.velocity);
                boids[i] = boid;
            }
        }

        void OnDestroy()
        {
            // Dispose of the boids array when the FlockManager is destroyed
            boids.Dispose();
        }

        [BurstCompile]
        struct FlockJob : IJobParallelFor
        {
            public NativeArray<Boid> boids;
            public float maxSpeed;
            public float maxForce;
            public float separationDistance;
            public float alignmentDistance;
            public float cohesionDistance;
            public float separationWeight;
            public float alignmentWeight;
            public float cohesionWeight;
            public float boundsRadius;
            public float3 targetPosition;
            public float deltaTime;

            public void Execute(int i)
            {
                Boid boid = boids[i];
                float3 separation = float3.zero;
                float3 alignment = float3.zero;
                float3 cohesion = float3.zero;
                int numNeighbors = 0;
                for (int j = 0; j < boids.Length; j++)
                {
                    if (i == j) continue;
                    Boid other = boids[j];
                    float3 offset = other.position - boid.position;
                    float distance = math.length(offset);
                    if (distance < separationDistance)
                    {
                        separation -= math.normalize(offset) / distance;
                    }
                    else if (distance < alignmentDistance)
                    {
                        alignment += other.velocity;
                        numNeighbors++;
                    }
                    else if (distance < cohesionDistance)
                    {
                        cohesion += other.position;
                        numNeighbors++;
                    }
                }
                if (numNeighbors > 0)
                {
                    alignment /= numNeighbors;
                    cohesion /= numNeighbors;
                    cohesion = math.normalize(cohesion - boid.position);
                }
                float3 boundsOffset = float3.zero;
                if (math.length(boid.position) > boundsRadius)
                {
                    boundsOffset = -math.normalize(boid.position) * (math.length(boid.position) - boundsRadius);
                }
                separation = math.normalize(separation) * separationWeight;
                alignment = math.normalize(alignment) * alignmentWeight;
                cohesion = math.normalize(cohesion) * cohesionWeight;
                boundsOffset = math.normalize(boundsOffset);
                boid.acceleration = separation + alignment + cohesion + boundsOffset;
                boid.acceleration = math.clamp(boid.acceleration, -maxForce, maxForce);
                // add target following behavior
                float3 targetOffset = targetPosition - boid.position;
                float targetDistance = math.length(targetOffset);
                if (targetDistance > 0.1f) // if the boid is far from the target
                {
                    float3 targetVelocity = math.normalize(targetOffset) * maxSpeed;
                    float3 targetAcceleration = (targetVelocity - boid.velocity) * 10f; // use a high multiplier to make the boids follow the target more quickly
                    boid.acceleration += targetAcceleration;
                }
                boids[i] = boid;
            }
        }
    }
}
Error 1/3 (described in question 3):
IndexOutOfRangeException: Index 64 is out of restricted IJobParallelFor range [0...63] in ReadWriteBuffer.
ReadWriteBuffers are restricted to only read & write the element at the job index. You can use double buffering strategies to avoid race conditions due to reading & writing in parallel to the same elements from a job.
Unity.Collections.NativeArray`1[T].FailOutOfRangeError (System.Int32 index) (at <44569b2d1c974b5eb5e85c3450cfe46d>:0)
Unity.Collections.NativeArray`1[T].CheckElementReadAccess (System.Int32 index) (at <44569b2d1c974b5eb5e85c3450cfe46d>:0)
Unity.Collections.NativeArray`1[T].get_Item (System.Int32 index) (at <44569b2d1c974b5eb5e85c3450cfe46d>:0)
NewBoid_JobParallelized.FlockManager+FlockJob.Execute (System.Int32 i) (at Assets/AIs/AI_01_Boids/New_ParallelJobs_Boids/FlockManager.cs:182)
Unity.Jobs.IJobParallelForExtensions+ParallelForJobStruct`1[T].Execute (T& jobData, System.IntPtr additionalPtr, System.IntPtr bufferRangePatchData, Unity.Jobs.LowLevel.Unsafe.JobRanges& ranges, System.Int32 jobIndex) (at <44569b2d1c974b5eb5e85c3450cfe46d>:0)
Error 2/3 (described in question 3):
IndexOutOfRangeException: Index 64 is out of restricted IJobParallelFor range [0...63] in ReadWriteBuffer.
ReadWriteBuffers are restricted to only read & write the element at the job index. You can use double buffering strategies to avoid race conditions due to reading & writing in parallel to the same elements from a job.
Unity.Collections.NativeArray`1[T].FailOutOfRangeError (System.Int32 index) (at <44569b2d1c974b5eb5e85c3450cfe46d>:0)
Unity.Collections.NativeArray`1[T].CheckElementReadAccess (System.Int32 index) (at <44569b2d1c974b5eb5e85c3450cfe46d>:0)
Unity.Collections.NativeArray`1[T].get_Item (System.Int32 index) (at <44569b2d1c974b5eb5e85c3450cfe46d>:0)
NewBoid_JobParallelized.FlockManager+FlockJob.Execute (System.Int32 i) (at Assets/AIs/AI_01_Boids/New_ParallelJobs_Boids/FlockManager.cs:182)
Unity.Jobs.IJobParallelForExtensions+ParallelForJobStruct`1[T].Execute (T& jobData, System.IntPtr additionalPtr, System.IntPtr bufferRangePatchData, Unity.Jobs.LowLevel.Unsafe.JobRanges& ranges, System.Int32 jobIndex) (at <44569b2d1c974b5eb5e85c3450cfe46d>:0)
Error 3/3 (described in question 3):
IndexOutOfRangeException: Index 0 is out of restricted IJobParallelFor range [256...319] in ReadWriteBuffer.
ReadWriteBuffers are restricted to only read & write the element at the job index. You can use double buffering strategies to avoid race conditions due to reading & writing in parallel to the same elements from a job.
Unity.Collections.NativeArray`1[T].FailOutOfRangeError (System.Int32 index) (at <44569b2d1c974b5eb5e85c3450cfe46d>:0)
Unity.Collections.NativeArray`1[T].CheckElementReadAccess (System.Int32 index) (at <44569b2d1c974b5eb5e85c3450cfe46d>:0)
Unity.Collections.NativeArray`1[T].get_Item (System.Int32 index) (at <44569b2d1c974b5eb5e85c3450cfe46d>:0)
NewBoid_JobParallelized.FlockManager+FlockJob.Execute (System.Int32 i) (at Assets/AIs/AI_01_Boids/New_ParallelJobs_Boids/FlockManager.cs:182)
Unity.Jobs.IJobParallelForExtensions+ParallelForJobStruct`1[T].Execute (T& jobData, System.IntPtr additionalPtr, System.IntPtr bufferRangePatchData, Unity.Jobs.LowLevel.Unsafe.JobRanges& ranges, System.Int32 jobIndex) (at <44569b2d1c974b5eb5e85c3450cfe46d>:0)
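
For reference, the index ranges in the three errors above look consistent with the array being cut into consecutive blocks of 64 indexes. The sketch below only reproduces that arithmetic in plain C#; it is not Unity's scheduler. `BatchRanges`, `Split`, and `InSameBatch` are hypothetical names, and 320 is an assumed boid count chosen only so that the last range matches [256...319].

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not a Unity API): models batches as consecutive
// blocks of batchSize indexes, which is an assumption based on the ranges
// printed in the error messages.
public static class BatchRanges
{
    // Returns the inclusive (First, Last) index range of each batch.
    public static List<(int First, int Last)> Split(int arrayLength, int batchSize)
    {
        if (arrayLength < 0) throw new ArgumentOutOfRangeException(nameof(arrayLength));
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));

        var ranges = new List<(int First, int Last)>();
        for (int first = 0; first < arrayLength; first += batchSize)
        {
            // The last batch may be shorter than batchSize.
            int last = Math.Min(first + batchSize, arrayLength) - 1;
            ranges.Add((first, last));
        }
        return ranges;
    }

    // True when 'index' lies in the same batch as 'jobIndex'.
    public static bool InSameBatch(int jobIndex, int index, int batchSize)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));
        return jobIndex / batchSize == index / batchSize;
    }
}
```

Under these assumptions, `BatchRanges.Split(320, 64)` gives five ranges, from [0...63] to [256...319]. Index 64 falls outside the first range and index 0 falls outside the last one, which matches the messages. With a single batch (a batch size equal to or greater than the array length), every index is inside the range, which would match the only configuration that runs without errors for me.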


