My Hybrid ECS Performance Test

I’ve recently begun digging into the current ECS system and learning how I might convert parts of my current projects into hybrid ECS code. My goal here was to see whether moving MonoBehaviour logic into hybrid ECS logic would actually give any kind of performance gain. I’ve seen this discussed in a few places on these forums, and people generally say there is not much performance to be gained by using hybrid ECS over the traditional way, but in my test I’ve found that it’s actually considerably worse.

The Test:
In my test I have 25000 objects and am essentially performing the same tasks in several different ways. Each object has 3 variables which are toggled or incremented each frame:

  • int
  • boolean
  • float

Method A:
Classic MonoBehaviour where all these operations are done in the Update loop

Method B:
Hybrid ECS using 3 separate ComponentData components and 3 separate ComponentSystems that each perform one of the operations

Method C:
A manager that keeps a list of all the MonoBehaviour objects and calls a manual update function which performs the same logic as the normal Update function

Results:

  • A - 20.69 ms

  • B - 198.96 ms

  • C - 4.56 ms

Conclusion:
Based on this test it seems like a very bad idea to move code over into a hybrid ECS system, as there appears to be a lot of overhead associated with it. This may be obvious to some people, but when I started looking into this I kind of expected even hybrid ECS to offer at least comparable performance to the standard MonoBehaviour approach.

Without source code and peer review of your methodology, these results are useless.

I emulated about the same tests as yours, but profiled with frame measurement.

  • A : 250000 game objects with Update().
  • B : Manager game object looping 250000 times in one frame to call Update() equivalent.
  • C : One ECS system iterates over 250000 entities that were generated from ConvertToEntity in inject mode, and modifies values directly on the MonoBehaviour reference with Entities.ForEach, like Update() does. The modified values are reflected in the GameObject’s inspector in the scene. No Burst, no leak detection.
  • D : Like C but with a little more work. Added a dedicated IComponentData data storage on conversion for parallel calculation, then applied the calculated value back by a copy. Burst on, no safety checks, in editor. No leak detection.
  • E : Removed the copy back step from D.

Results
All tests were run on a MacBook Pro (Early 2015) in the editor. All units are in milliseconds.

  • A : Median:88.87 Min:83.32 Max:657.38 Avg:113.88 Std:77.03 Zeroes:0 SampleCount: 100 Sum: 11388.47
  • B : Median:21.48 Min:18.86 Max:48.13 Avg:23.81 Std:6.50 Zeroes:0 SampleCount: 100 Sum: 2380.65
  • C : Median:27.93 Min:23.71 Max:29.69 Avg:27.38 Std:1.46 Zeroes:0 SampleCount: 100 Sum: 2738.49
  • D : Median:24.09 Min:23.31 Max:40.47 Avg:25.19 Std:2.54 Zeroes:0 SampleCount: 100 Sum: 2518.88
  • E : Median:7.95 Min:6.32 Max:11.36 Avg:8.04 Std:0.57 Zeroes:0 SampleCount: 100 Sum: 804.17

Evaluation

  • Update() is expensive.
  • There is a cost to using the query to assemble the native array for ForEach from a large number of entities. Otherwise C should be about the same as B, and D should be a bit faster than B.
  • You are better off designing for a one-way hybrid approach. Let the GameObject use the calculated result stored in ECS via EntityManager, instead of the expensive step of setting the value back on the MonoBehaviour, which is not as cache friendly.
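As a rough illustration of the one-way idea, here is a minimal sketch in which the GameObject side pulls a value out of ECS only when it needs it, and no system ever writes back into a MonoBehaviour. PointlessFloatReader, its fields, and ReadFromEcs are hypothetical names made up for this example; PointlessFloat is the IComponentData struct posted elsewhere in this thread, and World.Active is assumed to be available as in the Entities versions discussed here.

```csharp
using Unity.Entities;
using UnityEngine;

// Hypothetical reader component (not from the linked repository).
// It has no Update(), so it adds no per-frame MonoBehaviour cost.
public class PointlessFloatReader : MonoBehaviour
{
    // Assumed to be assigned by whatever code created or converted the entity.
    public Entity entity;

    // Last value read from ECS, kept for display or gameplay code.
    public float latestValue;

    // Call this only for the objects that actually need the value this frame.
    public void ReadFromEcs()
    {
        EntityManager manager = World.Active.EntityManager;

        // Skip entities that were never assigned or were destroyed.
        if (!manager.Exists(entity))
        {
            return;
        }

        // One-way flow: ECS stays the source of truth, the GameObject only reads.
        latestValue = manager.GetComponentData<PointlessFloat>(entity).value;
    }
}
```

Reading through EntityManager forces any job writing PointlessFloat to finish first, so it is probably best kept to the few objects that need the result.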

Source : IntBoolFloat/Assets/Tests/HybridPerformanceTest.cs at master · 5argon/IntBoolFloat · GitHub

The fastest simple way to iterate over hybrid entities is to:

class MySystem : JobComponentSystem
    OnUpdate
        Entities.WithoutBurst().ForEach()

JobComponentSystem uses high performance codegen based component iteration.
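Spelled out, the outline above might look roughly like the sketch below. It assumes the Entities package available for Unity 2019.3, and that the TestCube MonoBehaviour posted in this thread is attached to each entity as a hybrid component (for example through ConvertToEntity in inject mode). TestCubeHybridSystem is a name made up for this example.

```csharp
using Unity.Entities;
using Unity.Jobs;

// Sketch only: iterates hybrid entities that carry a TestCube MonoBehaviour.
public class TestCubeHybridSystem : JobComponentSystem
{
    protected override JobHandle OnUpdate(JobHandle inputDeps)
    {
        Entities
            .WithoutBurst()               // managed objects cannot be Burst-compiled
            .ForEach((TestCube cube) =>
            {
                // Same work the manager test does, driven by the system instead.
                cube.ManualUpdate();
            })
            .Run();                       // managed access has to stay on the main thread

        // Nothing was scheduled, so pass the incoming dependency through unchanged.
        return inputDeps;
    }
}
```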

Ok posting some source here to give a bit more information on how the test was implemented.

Test Manager:
Instantiates 250,000 copies of a prefab, which is an empty GameObject with a “TestCube” component attached to it.
Iterates over them and calls a manual update function.

using System.Collections.Generic;
using UnityEngine;

public class TestManager : MonoBehaviour
{
    public int spawnCount = 10000;
    public TestCube cubePrefab;
    private List<TestCube> _cubes = new List<TestCube>();

    void Start()
    {
        for(int i = 0; i < spawnCount; i++)
        {
            _cubes.Add(Instantiate(cubePrefab));
        }
    }

    void Update()
    {
        for(int i = 0; i < _cubes.Count; i++)
        {
            _cubes[i].ManualUpdate();
        }
    }
}

Test Cube:
3 simple variables which are modified in an update loop. On start each test cube spawns an entity with 3 components on it mimicking the 3 simple variables.

using Unity.Entities;
using UnityEngine;

public class TestCube : MonoBehaviour
{
    private bool _pointlessBoolean = false;
    private int _pointlessInt = 0;
    private float _pointlessFloat = 0;
   
    void Start()
    {
        EntityManager manager = World.Active.EntityManager;
        Entity entity = manager.CreateEntity();
        manager.AddComponent(entity, typeof(PointlessBoolean));
        manager.AddComponent(entity, typeof(PointlessInt));
        manager.AddComponent(entity, typeof(PointlessFloat));
    }

    void Update()
    {
        _pointlessBoolean = !_pointlessBoolean;
        _pointlessInt += 1;
        _pointlessFloat += .1f;
    }

    public void ManualUpdate()
    {
        _pointlessBoolean = !_pointlessBoolean;
        _pointlessInt += 1;
        _pointlessFloat += .1f;
    }
}

public struct PointlessBoolean : IComponentData
{
    public bool value;
}

public struct PointlessInt : IComponentData
{
    public int value;
}

public struct PointlessFloat : IComponentData
{
    public float value;
}

Systems:
3 simple component systems that together perform the same functionality as the other update loops. They use Entities.ForEach to query the component data that they need.

using Unity.Entities;

public class ComponentSystem_PointlessFloat : ComponentSystem
{
    protected override void OnUpdate()
    {
        Entities.ForEach((ref PointlessFloat pointlessFloat) =>
        {
            pointlessFloat.value += .1f;
        });
    }
}

public class ComponentSystem_PointlessBoolean : ComponentSystem
{
    protected override void OnUpdate()
    {
        Entities.ForEach((ref PointlessBoolean boolean) =>
        {
            boolean.value = !boolean.value;
        });
    }
}

public class ComponentSystem_PointlessInt : ComponentSystem
{
    protected override void OnUpdate()
    {
        Entities.ForEach((ref PointlessInt pointlessInt) =>
        {
            pointlessInt.value += 1;
        });
    }
}

*Here is the system code that does this, which produces the C result: https://github.com/5argon/IntBoolFloat/blob/master/Assets/WorkObjectSystem.cs

Your code is equivalent to my E case where there is no copy back. It should be way faster than both Update() and manager looping over ManualUpdate(). Maybe try using the performance testing package to get a proper warm up and medians.

Well it is a ComponentSystem as opposed to a JobComponentSystem. Tried modifying the system to use a JobComponentSystem and Entities.ForEach but I get a compile error that Entities does not exist. I am on Unity 2019.2

I am using standard profiler but the measurements I provided are taken after it has run for a little bit and they seem pretty consistent (about 60ms per system). They also match what the Entity debugger says.

I wasn’t aware of the performance testing package until now. I’ll take a look at it; it seems like a pretty useful tool for this kind of thing.

In the latest version there is no reason to use a normal ComponentSystem anymore, since you can force a JobComponentSystem to run on the main thread (but Bursted, magically) with .Run() on either Entities.ForEach or Job.WithCode. (Then you can return default as the job handle.) Run may have a problem with a previously running job, but you can .Complete the incoming inputDeps on the first line, or use [AlwaysSynchronizeSystem] to automate that. You can also force it to run on a worker thread with .Schedule and then .Complete it immediately, as before, if you feel the job is bigger than the scheduling cost but you want the result now. So the current JCS makes CS obsolete, because it can do non-job work even better than CS.
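To make the two options concrete, here is a minimal sketch of both, assuming the Entities package for Unity 2019.3 and the PointlessInt and PointlessFloat structs posted earlier in this thread. The class names are made up for this example, and the exact behaviour may differ between package versions.

```csharp
using Unity.Entities;
using Unity.Jobs;

// Option 1: main thread, Burst-compiled, via Run().
// [AlwaysSynchronizeSystem] completes incoming jobs before OnUpdate is called.
[AlwaysSynchronizeSystem]
public class PointlessIntRunSystem : JobComponentSystem
{
    protected override JobHandle OnUpdate(JobHandle inputDeps)
    {
        Entities.ForEach((ref PointlessInt pointlessInt) =>
        {
            pointlessInt.value += 1;
        }).Run();

        // Nothing is left running, so there is no handle to hand on.
        return default;
    }
}

// Option 2: schedule on a worker thread, then wait for the result right away.
public class PointlessFloatScheduleSystem : JobComponentSystem
{
    protected override JobHandle OnUpdate(JobHandle inputDeps)
    {
        JobHandle handle = Entities.ForEach((ref PointlessFloat pointlessFloat) =>
        {
            pointlessFloat.value += .1f;
        }).Schedule(inputDeps);

        // Block until the job has finished, because the result is wanted now.
        handle.Complete();
        return default;
    }
}
```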

The latest Entities package only works with 2019.3. There, in a JobComponentSystem, ForEach gets code-genned to use jobs and Burst, which makes it significantly faster.

This is super helpful information thanks a lot!

I upgraded to 2019.3 and re-did my systems using the new JCS method in place of a normal CS. Definitely improved results for the systems. New results are

Standard Update: 20.3ms
Manager Update: 4.26ms
Job Systems (3 systems): 4.98ms + 4.67ms + 4.57ms = 14.22ms

So it seems like as long as you can get the same functionality as your MonoBehaviour in a small number of systems, you will probably get some speedup from moving to JobComponentSystems. However, calling an update function manually from a manager still seems faster than doing the same logic in 1 or more systems.

Tried a few more things

Commented out “WithoutBurst” on the systems and they dropped from about 4.8ms apiece to about .17ms apiece.

Also tried out a few different query combinations with one of the systems without burst
ComponentData Only: 4.5ms
Managed Component Only: 3.8ms
ComponentData and Managed: 7.4ms

Have you turned safety checks off yet? They cause about a 10x performance drop on each memory access. Turning leak detection off may also help make things behave more like on a real device, I guess?

Yeah I have. When burst can be used performance is looking great (less than .2ms per system). It’s just when Burst cannot be used (such as when referencing managed data) that the systems become a bit expensive

Considering you have 3 JobComponentSystems to update 3 different values but only one MonoBehaviour manager to update all three, it seems like you should either consolidate the job systems or split the managed MonoBehaviour into 3 sub-behaviours/managers.
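For the first option, a consolidated system could look roughly like this. It is only a sketch, assuming the Entities package for Unity 2019.3 and the three Pointless* structs posted earlier in this thread; ConsolidatedPointlessSystem is a made-up name.

```csharp
using Unity.Entities;
using Unity.Jobs;

// Sketch: one system doing the work of the three separate systems,
// so it lines up with the single manager loop it is compared against.
public class ConsolidatedPointlessSystem : JobComponentSystem
{
    protected override JobHandle OnUpdate(JobHandle inputDeps)
    {
        // Only entities that have all three components are matched by this query.
        return Entities.ForEach((ref PointlessBoolean pointlessBoolean,
                                 ref PointlessInt pointlessInt,
                                 ref PointlessFloat pointlessFloat) =>
        {
            pointlessBoolean.value = !pointlessBoolean.value;
            pointlessInt.value += 1;
            pointlessFloat.value += .1f;
        }).Schedule(inputDeps);
    }
}
```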

I’m still very new to data oriented design, but based on everything I’ve seen and read it generally involves many components with as little data as possible and systems that are dedicated to doing one particular thing. This test was based on the assumption that if you are converting an existing MonoBehaviour into ECS you will generally be developing several systems to do so.