Hey there. I have two snippets of code, both doing the same rendering. For the first, I’m using
Graphics.RenderMeshInstanced(), because it has support for passing in NativeArray, which is convenient for my use case (heavy job system usage). Unfortunately, I’ve noticed that doing it this way generates garbage every frame and takes over twice as long as the ‘old’ method??
// 9.3 KB garbage per frame, ~0.25ms
for (var splitIndex = 0; splitIndex < trsData.Length; ++splitIndex)
{
//..
var renderParams = new RenderParams(material);
renderParams.matProps = propertyBlock;
renderParams.lightProbeUsage = unityLightProbeUsage;
renderParams.shadowCastingMode = shadowsMode;
Graphics.RenderMeshInstanced(renderParams, mesh, 0, renderGroup, renderGroup.Length, 0);
}
Here’s an example of using Graphics.DrawMeshInstanced(), with a cached copy of the NativeArray copied into a Matrix4x4. It’s much faster and does not generate garbage.
Yes, it’s generating a lot of GC because they’ve made the method generic and they’re trying to Parse your struct layout on each call using Marshal.OffsetOf (reason for GC), who knows why. Everybody wanted a NativeArray overload of the DrawMeshInstanced since 2018 but yet we get this. No replies from graphics team in years. We’re using the old DrawMeshInstanced so still have to pay the cost of copying matrices to temporary array.
It doesn’t explain why they need to do it every call instead of caching it per type.
BTW, if you want to write to a managed array from a job, you can pin it. I’m doing this because CommandBuffer’s DrawMeshInstanced still only accepts a Matrix4x4[ ]. To do this, set up a list of pins for each frame in a manager class (eg your render pipeline if doing a custom SRP):
private readonly List<GCHandle> _arrayHandles = new();
public T* pin<T>(T[] array) where T : unmanaged
{
if(array == null || array.Length == 0)
throw new Exception("Array must be non-null and non-empty");
GCHandle gcHandle = GCHandle.Alloc(array, GCHandleType.Pinned);
_arrayHandles.Add(gcHandle);
return (T*) gcHandle.AddrOfPinnedObject();
}
When scheduling a job, pin it with that method and put the pointer into the job like this
The problem is you can’t convert Pointers or Spans or NativeArrays back to regular C# array without allocation. The DrawMeshInstanced only accepts that.
Even if they added a startIndex parameter that would solve it.
yes can confirm garbage collection for RenderMeshInstanced. however it seems to run faster than drawmeshinstanced on my tests. fps spikes down due to gc. is there any indication for when this will be fixed?
UPDATE
drawmeshinstanced runs at the same speed the issue was converting my nativearray to matrix4x4 array. drawmeshinstanced has zero gc and runs at same speed
Further to that, if every render is going to use the same values, then you can take that out of the loop too. Anything which stays the same for every loop, doesn’t need to be in the loop, and can be initialised just before.
are you sure this will make it 0 GC? I don’t want to test it again and find out it’s still allocating. Also the material changes, so unless the RenderMeshInstanced will make a copy of the params, I cannot do what you are suggesting.
You would just need to add the material change in the loop if it changes. The problem of allocation is when you allocate new memory. I would hope that it grabs the RenderParams by reference in the function
Sorry I want just to be clear here. You are not saying this because I do new RenderParams right? Because RenderParams is a struct, not a class and therefore new RenderParams doesn’t allocate by itself.
*When you create a struct object using the new operator, it gets created and the appropriate constructor is called. Unlike classes, structs can be instantiated without using the new operator. If you do not use new, the fields will remain unassigned and the object cannot be used until all of the fields are initialized.
When you are using new RenderParams, you don’t have to fill in all the other values, they are set as a default. But it is still ‘creating’ a new struct, meaning the old one is marked for deletion. If you are using a struct persistently, then create that outside of where you use it, Class level: RenderParams myRP (allocated, uninitialised, not usable), Start: myRP = new RenderParams (allocate new, set as defaults) , Loop: renderParams.matProps = propertyBlock; (change a property)
When you change a part of the struct, you are not allocating new memory because it already exists from the class level, stored in memory. I hope I’m right anyway!
Here’s a test I made ages ago, 1million blades of grass, 170+fps, zero allocation. This is part of the construct, although I’m using RenderMeshIndirect here which is a little complex to setup, but showing the setup I use (cut out all the extra code)
public class DrawGrass2 : MonoBehaviour
{
RenderParams renderParams;
public void Start()
{
// Init with a material
renderParams = new RenderParams(material);
// Set receive shadows
renderParams.receiveShadows = true;
// Set shadow casting mode
renderParams.shadowCastingMode = UnityEngine.Rendering.ShadowCastingMode.Off;
// MaterialPropertyBlock so you can access the custom StructuredBuffer in the shader
renderParams.matProps = new MaterialPropertyBlock();
// Send PositionsBuffer to the GPU ShaderBuffer
renderParams.matProps.SetBuffer("_PerInstanceData", GB_positionsBuffer);
}
public void Update()
{
if (meshRenderer.isVisible)
{
// Instances to draw
GB_commandData[0].instanceCount = (uint)drawInstances;
GB_commandBuffer.SetData(GB_commandData);
Graphics.RenderMeshIndirect(in renderParams, mesh, GB_commandBuffer);
}
}
}
And you might be right about it not allocating if you use new, it may just reset the struct since it calls the default constructor. But the fact is in your code, you are creating a new struct by saying RenderParams/var renderParams = new RenderParams(); You are saying a new one exists, in every loop, so it would be marked for deletion at the end of every loop.
It makes more sense if you think of it as:
allocate new RenderParams renderParams = default RenderParams();