Most efficient way to display large scatter plots?

I’d like to use Unity as a rendering platform to display scientific data as a large collection of scatter plot points. I tried the brute force method of just creating a cube primitive (no colliders) at each point, but the frame rate falls below 10 FPS by about 35K objects (on my machine). In practice, I’d like the plot points to have different shapes and materials, but a few dozen icons would suffice. (I want them to be 3D objects, and to be able to move.)

What would be the best way to tackle the problem? I see that GPU instancing is coming in June, and while I know nothing about it, it sounds like just what I need. Regardless, I’m looking to learn more about how the lower-level rendering engine works, so really, it’s just a curiosity to me.

Thanks in advance.

I don’t think Unity is a good tool at all for information visualization; I would recommend parallel coordinates or something similar to display the data. If you want to display the points anyway, use points and not primitives. I would wager it’s slow because Unity creates a new game object for each primitive.

I wouldn’t be surprised if there’s significant overhead in managing a large list of game objects. That’s why I’m wondering what would be the most efficient way to produce the same type of visual. It’s more effort, but I could create the entire plot with a procedural mesh. Maybe that’s a better way.
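For illustration, a minimal sketch of the procedural-mesh idea (class and field names are made up): batch many cube meshes into a single mesh so the whole plot renders in one draw call instead of one GameObject per point. In Unity 5.x a mesh is limited to 65,535 vertices, so a large data set would be split across several of these batches.

```csharp
using UnityEngine;

public class CombinedScatterPlot : MonoBehaviour
{
    public Mesh pointMesh;        // e.g. a cube (24 vertices)
    public Material pointMaterial;

    // Builds one combined mesh from the given points. For data sets that
    // exceed the 65,535-vertex limit, call this once per batch of points.
    public void Build(Vector3[] points, float scale)
    {
        var combine = new CombineInstance[points.Length];
        for (int i = 0; i < points.Length; i++)
        {
            combine[i].mesh = pointMesh;
            combine[i].transform = Matrix4x4.TRS(
                points[i], Quaternion.identity, Vector3.one * scale);
        }

        var mesh = new Mesh();
        mesh.CombineMeshes(combine); // merges everything into one mesh

        gameObject.AddComponent<MeshFilter>().sharedMesh = mesh;
        gameObject.AddComponent<MeshRenderer>().sharedMaterial = pointMaterial;
    }
}
```

This trades memory (every cube's vertices are duplicated) for draw-call count, so it helps with the per-object overhead but not with raw vertex throughput.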

I don’t want straight point plots. Size and orientation are too important to interpreting 3D spatial relationships, and eventually, I’d like the points to be represented by simple mesh icons.

I see. Maybe a vector field or something similar would be interesting for you; it helps display size and orientation, and you could add more complexity. I wish I were better at these things…

Realistically how many data points do you expect to graph?

Will you control the hardware? From what I’ve read (might be wrong) I’m under the impression that instancing only works on relatively recent desktop GPUs. Ah, from Unity: “Windows, Mac and Linux with D3D11/D3D12/GL4.1”

If only a subset of the objects are visible at any given time, the CullingGroup API might be helpful.

Can you turn off shadow casting/receiving? Can you use an unlit shader?

Check this out:

https://github.com/keijiro/KvantSpray

Try using the particle system and SetParticles. It should be able to chew through 35K without too much trouble, depending on hardware of course.

If you want different icons, then use a single material with an atlas texture and the UV module to set the texture for each sprite.
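A minimal sketch of the SetParticles approach (class and method names are made up): fill a ParticleSystem.Particle array from the data once and hand it to a ParticleSystem with emission disabled, so the system never spawns or kills particles on its own.

```csharp
using UnityEngine;

[RequireComponent(typeof(ParticleSystem))]
public class ParticleScatterPlot : MonoBehaviour
{
    public void Plot(Vector3[] points, float size)
    {
        var ps = GetComponent<ParticleSystem>();

        var main = ps.main;
        main.maxParticles = points.Length; // default cap is much lower

        var emission = ps.emission;
        emission.enabled = false;          // we supply every particle ourselves

        var particles = new ParticleSystem.Particle[points.Length];
        for (int i = 0; i < points.Length; i++)
        {
            particles[i].position = points[i];
            particles[i].startSize = size;
            particles[i].startColor = Color.white;
            particles[i].startLifetime = float.MaxValue;     // never expire
            particles[i].remainingLifetime = float.MaxValue;
        }
        ps.SetParticles(particles, particles.Length);
    }
}
```

To move points, keep the array around, update positions, and call SetParticles again each frame.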

@karl_jones,

Thanks for the suggestion. Do you happen to have an example that illustrates your suggestion?
I did some experimentation with the particle system. It’s been a while, but I remember it being a good solution up to maybe 200K points.

FWIW,

I cobbled together an example using geometry shader examples from the web.

I create a point-topology mesh at run time representing the data points. (Data is sorted into meshes that draw using the same material.) I then use a geometry shader to render each point as a 3D object (cubes, or whatever shape you can insert into the pipeline programmatically). I’d like to come up with a way to model and import 3D icons and draw them in a similar way. Maybe through GPU instancing?
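For anyone curious, the mesh-construction step might look roughly like this sketch (names are illustrative; the geometry shader that expands each point into a cube is not shown):

```csharp
using UnityEngine;

public static class PointMeshBuilder
{
    // Build one mesh of raw points. Sort the data beforehand so that points
    // sharing a material land in the same mesh (one draw call per material),
    // and split into multiple meshes to stay under the 65,535-vertex limit.
    public static Mesh Build(Vector3[] points)
    {
        var mesh = new Mesh();
        mesh.vertices = points;

        var indices = new int[points.Length];
        for (int i = 0; i < indices.Length; i++)
            indices[i] = i;

        // MeshTopology.Points: no triangles, one vertex per data point;
        // the geometry shader turns each vertex into a cube.
        mesh.SetIndices(indices, MeshTopology.Points, 0);

        // Set generous bounds up front so the mesh isn't culled when points
        // move in the shader and Unity doesn't recalculate bounds per frame.
        mesh.bounds = new Bounds(Vector3.zero, Vector3.one * 1000f);
        return mesh;
    }
}
```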

The best performance I’ve gotten has been from creating 2D billboards that face the camera and are textured to look 3D. The spheres here are just textures. Since a sphere is symmetrical, most people don’t detect that it’s turning and is really 2D. (Only graphics people, who notice the lighting doesn’t change, figure it out.)

This example draws 3.7M points at frame rates consistently above 60 FPS, usually closer to 100 FPS. Frame-rate drop-off has been more or less linear with the number of points.

It sounds like particles will not be able to help in this use case: there is a limit in the thousands for 3D mesh particles, and 2D ones are going to struggle to compete with what you already have.
We do have an instancing solution in 5.5, and some improvements in 5.6 that would be better suited here. In 5.6 we are adding DrawMeshInstancedIndirect. @richardkettlewell is in the process of adding this and said he would be happy to offer some guidance on getting it working for you once 5.6 is available (no ETA yet). So if your timeline allows it, it’s worth waiting for 5.6.

Thanks @karl_jones .

No problem. I’ll look for it in 5.6. I’m still in the learning phase with shaders and GPU instancing, so I’m not proficient enough to ask for help yet.

Eventually, I’m going to want to interact with nodes that I’m plotting which normally would mean raycasting. I’m assuming that something like a mesh collider isn’t going to work in a situation where a geometry shader is used. Any suggestions on where I might look for a solution?

You could try using simple colliders (box or sphere). The physics system is not an area I’m very familiar with, so I don’t know how well it would perform under your circumstances. It may be better to detect the mouse location in the world and use some sort of spatial structure, such as an octree, to determine the nearest particle and whether it’s interacted with.

I’ve done some tests on how many instanced meshes we can expect to render using our new “procedural” instancing (Unity 5.6). This is where you will be able to provide your instancing data from custom sources, e.g. a StructuredBuffer, instead of supplying/updating matrix arrays etc. via the CPU. It will allow you to generate instance data on the GPU, for example, and removes all per-instance CPU code paths.

So, if I use an instanced version of the Standard Shader, I can achieve around 800,000 cubes at 30fps.

But the Standard Shader does a lot of stuff you may not need. So, if I use a simple custom shader, I can render over 2 million cubes at over 30 fps on a GTX 980. If you want more complex meshes/shaders, this will severely impact how many items you can render; but conversely, there are also faster GPUs available than a GTX 980 :slight_smile:

And here’s a preview of what the script/shader might look like:

using UnityEngine;
using System.Collections;

public class ExampleClass : MonoBehaviour {

    public int instanceCount = 500000;
    public Mesh instanceMesh;
    public Material instanceMaterial;

    private int cachedInstanceCount = -1;
    private ComputeBuffer positionBuffer;
    private ComputeBuffer argsBuffer;

    void Update() {

        // Update starting position buffer
        if (cachedInstanceCount != instanceCount)
            OnValidate();

        // Render
        instanceMaterial.SetBuffer("positionBuffer", positionBuffer);
        Graphics.DrawMeshInstancedIndirect(instanceMesh, 0, instanceMaterial, new Bounds(Vector3.zero, new Vector3(100.0f, 100.0f, 100.0f)), argsBuffer);
    }

    void OnGUI() {

        GUI.Label(new Rect(265, 25, 200, 30), "Instance Count: " + instanceCount.ToString());
        instanceCount = (int)GUI.HorizontalSlider(new Rect(25, 20, 200, 30), (float)instanceCount, 0.0f, 5000000.0f);
    }

    void OnValidate() {

        // positions: xyz = position, w = scale
        if (positionBuffer != null)
            positionBuffer.Release();
        positionBuffer = new ComputeBuffer(instanceCount, 16);
        Vector4[] positions = new Vector4[instanceCount];
        for (int i = 0; i < instanceCount; i++)
        {
            float angle = Random.Range(0.0f, Mathf.PI * 2.0f);
            float distance = Random.Range(20.0f, 100.0f);
            float height = Random.Range(-2.0f, 2.0f);
            float size = Random.Range(0.05f, 0.25f);
            positions[i] = new Vector4(Mathf.Sin(angle) * distance, height, Mathf.Cos(angle) * distance, size);
        }
        positionBuffer.SetData(positions);

        // indirect args: index count, instance count, start index, base vertex, start instance
        uint numIndices = (instanceMesh != null) ? (uint)instanceMesh.GetIndexCount(0) : 0;
        uint[] args = new uint[5] { numIndices, (uint)instanceCount, 0, 0, 0 };
        if (argsBuffer == null)
            argsBuffer = new ComputeBuffer(1, args.Length * sizeof(uint), ComputeBufferType.IndirectArguments);
        argsBuffer.SetData(args);

        cachedInstanceCount = instanceCount;
    }

    void OnDisable() {

        // release the GPU buffers so they don't leak when the component is disabled
        if (positionBuffer != null)
            positionBuffer.Release();
        positionBuffer = null;
        if (argsBuffer != null)
            argsBuffer.Release();
        argsBuffer = null;
    }
}
Shader "Instanced/InstancedShader" {
    Properties {
        _MainTex ("Albedo (RGB)", 2D) = "white" {}
    }
    SubShader {

        Pass {

            Tags {"LightMode"="ForwardBase"}
      
            CGPROGRAM
      
            #pragma vertex vert
            #pragma fragment frag
            #pragma multi_compile_fwdbase nolightmap nodirlightmap nodynlightmap novertexlight
            #pragma target 4.5

            #include "UnityCG.cginc"
            #include "UnityLightingCommon.cginc"
            #include "AutoLight.cginc"

            sampler2D _MainTex;

        #ifdef SHADER_API_D3D11
            StructuredBuffer<float4> positionBuffer;
        #endif

            struct v2f
            {
                float4 pos : SV_POSITION;
                float2 uv_MainTex : TEXCOORD0;
                float3 ambient : TEXCOORD1;
                float3 diffuse : TEXCOORD2;
                float3 color : TEXCOORD3;
                SHADOW_COORDS(4)
            };

            v2f vert (appdata_full v, uint instanceID : SV_InstanceID)
            {
                float4 data = positionBuffer[instanceID];

                float3 localPosition = v.vertex.xyz * data.w;
                float3 worldPosition = data.xyz + localPosition;
                float3 worldNormal = v.normal;
              
                float3 ndotl = saturate(dot(worldNormal, _WorldSpaceLightPos0.xyz));
                float3 ambient = ShadeSH9(float4(worldNormal, 1.0f));
                float3 diffuse = (ndotl * _LightColor0.rgb);
                float3 color = v.color;

                v2f o;
                o.pos = mul(UNITY_MATRIX_VP, float4(worldPosition, 1.0f));
                o.uv_MainTex = v.texcoord;
                o.ambient = ambient;
                o.diffuse = diffuse;
                o.color = color;
                TRANSFER_SHADOW(o)
                return o;
            }

            fixed4 frag (v2f i) : SV_Target
            {
                fixed shadow = SHADOW_ATTENUATION(i);
                fixed4 albedo = tex2D(_MainTex, i.uv_MainTex);
                float3 lighting = i.diffuse * shadow + i.ambient;
                fixed4 output = fixed4(albedo.rgb * i.color * lighting, albedo.w);
                UNITY_APPLY_FOG(i.fogCoord, output);
                return output;
            }

            ENDCG
        }
    }
}


@richardkettlewell that’s awesome! Love it.

I get about a million cubes at 30 fps on a Quadro K5000 using a cube geometry shader (similar performance). Looks like your work will make it much simpler to draw complex shapes.

Are there any plans to integrate your work with the physics system so that I can raycast against shapes drawn with DrawMeshInstancedIndirect?

It would probably be infeasible to integrate our CPU physics with something like this if you want an interactive frame rate. All the data here is kept on the GPU for efficiency, so I would recommend a GPU-based system for any kind of collision testing.

Sadly, Unity doesn’t offer anything for this that I know of, but there are algorithms that tackle this kind of problem.
E.g., if your point cloud is static, a simple(ish) approach is to sort it by position along one axis; your raycasts can then binary-search for the relevant points to test against. If your point cloud is dynamic, you would need to re-sort the data each frame instead of just once at the beginning.
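A sketch of that sorted-axis approach, assuming the points are pre-sorted by x and stored as float4 (xyz = position, w = size); helper names are hypothetical:

```csharp
using UnityEngine;

public static class PointPicker
{
    // points must be pre-sorted by x. Returns the index of the first point
    // whose center lies within 'radius' of the ray, or -1 if none does.
    public static int Raycast(Vector4[] points, Ray ray, float maxDistance, float radius)
    {
        // x-interval swept by the ray, padded by the point radius
        float x0 = ray.origin.x;
        float x1 = ray.origin.x + ray.direction.x * maxDistance;
        float lo = Mathf.Min(x0, x1) - radius;
        float hi = Mathf.Max(x0, x1) + radius;

        // binary search for the first candidate, then scan the slab
        for (int i = LowerBound(points, lo); i < points.Length && points[i].x <= hi; i++)
        {
            // exact test: distance from the point to the ray segment
            Vector3 toPoint = (Vector3)points[i] - ray.origin;
            float t = Mathf.Clamp(Vector3.Dot(toPoint, ray.direction), 0f, maxDistance);
            if ((toPoint - ray.direction * t).sqrMagnitude <= radius * radius)
                return i;
        }
        return -1;
    }

    // index of the first element with x >= value
    static int LowerBound(Vector4[] points, float value)
    {
        int lo = 0, hi = points.Length;
        while (lo < hi)
        {
            int mid = (lo + hi) / 2;
            if (points[mid].x < value) lo = mid + 1; else hi = mid;
        }
        return lo;
    }
}
```

Note this scans every point in the x-slab, so a ray travelling nearly parallel to the sort axis degrades toward a linear scan; picking the axis most perpendicular to typical view rays helps.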


Thanks for the suggestion. The GPU is still a bit of a black box to me, so I need to do some research.

@mikewarren we just landed our new Procedural Instancing feature in 5.6.0a2.
I am able to render 2.7 million cubes at 30 fps on a GTX980 by using it.

When it ships, the script and shader I used to achieve that will be in the Script Manual. This link won’t work until we make the new docs public, but I think it will be this: Unity - Scripting API: Graphics.DrawMeshInstancedIndirect


I don’t quite understand GPU instancing yet, but I’m looking forward to it.
Great job, thanks!

This just got released today. Love it. Can you apply animation to this?


It’s your responsibility to provide the instance data to the shader, or to procedurally generate it. (See the setup function in the docs example: Unity - Scripting API: Graphics.DrawMeshInstancedIndirect)

So it’s totally possible to apply animation, e.g. via a compute shader that iterates over the instance data and manipulates it in some way. For example, you could specify an “attractor point” and write a compute shader that drags all the positions in the instance buffer towards that point. If that point is a game object, you can move it around and watch all the instances follow it :slight_smile:
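A sketch of what such an attractor kernel might look like (kernel, buffer, and uniform names are made up), operating on the same float4-per-instance layout as the example above:

```hlsl
// Attractor.compute -- drags every instance toward _AttractorPos each frame.
#pragma kernel Attract

RWStructuredBuffer<float4> positionBuffer; // xyz = position, w = size
float3 _AttractorPos;   // world-space attractor (e.g. a GameObject's position)
float  _DeltaTime;
float  _Strength;       // pull rate per second

[numthreads(64, 1, 1)]
void Attract(uint3 id : SV_DispatchThreadID)
{
    float4 data = positionBuffer[id.x];
    data.xyz = lerp(data.xyz, _AttractorPos, saturate(_Strength * _DeltaTime));
    positionBuffer[id.x] = data;
}
```

It would be dispatched from C# each frame with something like `shader.Dispatch(kernel, instanceCount / 64, 1, 1)` after setting the buffer and uniforms; the instance data never leaves the GPU.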


Thanks. Will try that.

Not sure if you are still following this, but for either the geometry-shader version, or once you get DrawMeshInstanced or DrawMeshInstancedIndirect working, I’d recommend an alternative to raycasting for purely selection-based detection (i.e. detecting mouse hits), since a physics- or maths-based system will generally struggle, or require considerable effort to run efficiently.

Basically, you’d render the scene again into a RenderTexture and assign every instance a unique color value, or better yet just use the instance ID value as a color. Then all you need to do is a screen-pixel lookup of the color in the RenderTexture to get its ‘ID’. Once the ID is known, you can then do whatever you need to the node back in your main code.

Obviously rendering the instances twice is going to cost performance, though a bare-minimum custom shader for the RenderTexture pass will help here (i.e. a surface shader would be overkill). Also, you probably only want to render to the RenderTexture on a mouse-down; there’s no point rendering it every frame waiting for a mouse click.

Now, I’ve used this process in the past, but I think I was limited to either 255 or 65,535 objects, meaning I could use a simple 8-bit or 16-bit RenderTexture. If you are rendering millions of nodes, that won’t work. Pretty sure an R32 float format would suffice, though; I’d double-check the RenderTexture formats to see if there is anything more suitable.

The final problem to solve is reading back the color value for the pixel from a non-RGBA32 format. That isn’t something I’ve had to do, and I’m unsure whether Unity’s ReadPixels can deal with those formats. I guess worst case a compute shader could work, but that seems a bit overkill. Maybe someone else who has experience doing this can chime in.
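For the plain RGBA32 case, packing the instance ID into the RGB channels gives about 16.7M distinct IDs, which sidesteps the format question. A sketch of the readback step (helper names are illustrative; assumes the ID pass has already been rendered into the texture):

```csharp
using UnityEngine;

public static class PickingReadback
{
    // Read the one pixel under the mouse from the ID render texture and
    // unpack the 24-bit instance ID encoded in its RGB channels.
    public static int ReadId(RenderTexture idTexture, Vector2 mousePos)
    {
        var prev = RenderTexture.active;
        RenderTexture.active = idTexture;

        // ReadPixels copies from the active RenderTexture into a Texture2D;
        // depending on platform/flip conventions the y coordinate may need
        // to be inverted (idTexture.height - mousePos.y).
        var tex = new Texture2D(1, 1, TextureFormat.RGBA32, false);
        tex.ReadPixels(new Rect(mousePos.x, mousePos.y, 1, 1), 0, 0);
        tex.Apply();

        RenderTexture.active = prev;

        Color32 c = tex.GetPixel(0, 0);
        Object.Destroy(tex);
        return c.r | (c.g << 8) | (c.b << 16); // reassemble the 24-bit ID
    }
}
```

The matching ID-pass shader would do the inverse: split the SV_InstanceID into three bytes and output them as the fragment color, with lighting, texturing, and fog all stripped out.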
