Most performant way to upload meshes and textures in PolySpatial

Hey everyone,

I’m currently working on an app for the AVP running in PolySpatial/Shared Space, which displays a sequence of meshes and textures that are streamed from disk in real time. This means that the displayed mesh and textures are updated up to 30 times per second.

I already know that PolySpatial is particularly inefficient in these cases, as all the meshes and textures need to be mirrored to the RealityKit renderer, but even loading a 256x256 texture and around 1000 polygons 30 times a second drops the framerate to around 30 FPS. On my M2 Mac, the same setup runs at around 350 FPS.

Are there any best practices/workarounds that could help speed up this process, like low-level APIs or special pathways? A fast texture upload pathway seems to exist for render textures, but it doesn’t seem to apply to classical textures?

I guess that the way the meshes and textures are uploaded doesn’t really matter, but just in case:

I read an .astc texture from disk, load the raw texture data into an already allocated texture slot, and then apply it:

frame.texture.LoadRawTextureData<byte>(frame.textureBufferRaw);
frame.texture.Apply(false);
frame.frameMeshRenderer.sharedMaterial.SetTexture("_MainTex", frame.texture);

For the meshes, I’m reading the vertices and indices from disk into a NativeArray and then setting them with:

meshFilter.sharedMesh.SetVertexBufferData<byte>(frame.vertexBufferRaw, 0, 0, frame.vertexBufferRaw.Length);
meshFilter.sharedMesh.SetIndexBufferData<byte>(frame.indiceBufferRaw, 0, 0, frame.indiceBufferRaw.Length);
meshFilter.sharedMesh.SetSubMesh(0, new SubMeshDescriptor(0, indiceCounts), MeshUpdateFlags.DontRecalculateBounds);
meshFilter.sharedMesh.RecalculateNormals();

I’d be very grateful for any feedback :slight_smile:

Currently, the only way I can think of to speed this up would be to store the mesh (and texture) data in RenderTextures (using floating point textures for the mesh data), and use a shader graph with a vertex stage to read the vertex data from the texture when drawing. That’s basically how we support bake-to-texture particles.
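As a rough sketch of that packing idea (all names here are hypothetical, and the shader-graph side that reads the texture back in the vertex stage is omitted), the CPU side might look like this: write one float4 per vertex into an RGBAFloat staging texture, then blit it into a RenderTexture that the shader graph samples.

```csharp
using Unity.Collections;
using UnityEngine;

// Hypothetical sketch: pack per-vertex positions into a float texture,
// then blit it into a RenderTexture sampled by a shader graph vertex stage.
public class VertexTextureUploader : MonoBehaviour
{
    public int textureWidth = 256;   // width * height must cover the vertex count
    public int textureHeight = 256;

    Texture2D stagingTexture;        // CPU-writable staging texture
    public RenderTexture vertexRT;   // bound to the shader graph as a texture

    void Awake()
    {
        stagingTexture = new Texture2D(textureWidth, textureHeight,
            TextureFormat.RGBAFloat, false);
        vertexRT = new RenderTexture(textureWidth, textureHeight, 0,
            RenderTextureFormat.ARGBFloat);
        vertexRT.filterMode = FilterMode.Point; // don't interpolate positions
        vertexRT.Create();
    }

    // positions: one float4 per vertex (xyz = position, w unused)
    public void UploadFrame(NativeArray<Vector4> positions)
    {
        stagingTexture.SetPixelData(positions, 0);
        stagingTexture.Apply(false);
        Graphics.Blit(stagingTexture, vertexRT); // GPU-side copy into the RT
    }
}
```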

However, visionOS 2 has added new low-level APIs (LowLevelMesh and LowLevelTexture) that we should be able to use in the future to speed up the transfer of mesh and texture data, and doing so is definitely on our road map. Right now, we transfer RenderTextures via a GPU blit (using the older, visionOS 1.0 DrawableQueue API), and with LowLevelMesh/LowLevelTexture, we should be able to do the same for any mesh/texture that’s loaded into the GPU on the Unity side.

Assuming this optimization would be particularly useful in your case, I’d suggest submitting it to our road map so that we can track interest and prioritize it.


Thank you very much for the detailed answer, I highly appreciate it! :slight_smile:

The render texture path is a very interesting idea. I’ll try implementing it for my texture upload first, since that should be fairly easy, and see how much it improves my performance! I’ll report back in this thread once I have conducted some tests.

I’d of course be super interested in the implementation of the low-level APIs, as they would avoid these kinds of workarounds. I imagine not only in my use case, but in many others as well (such as the particle system). I submitted it to the road map!


Sorry, one more question: I’m currently trying to implement the RenderTexture-based approach, but I’m stumbling a bit on how to upload my data to a render texture.

I couldn’t find a function in the RenderTexture docs that allows uploading directly from RAM/CPU to the GPU. Only Graphics.Blit() seems to be available. However, this requires that the texture has already been uploaded to the GPU, so the expensive operation has already happened.

Which methods are you using to upload the data into the RT, for the bake-to-texture particles? Or are these internal, unexposed functions?

I’m actually wondering about this too!

We use ParticleSystemRenderer.BakeTexture, which updates a Texture2D, and then we blit that to a RenderTexture using Graphics.Blit. ParticleSystemRenderer uses something like GetRawTextureData/Apply. So, unfortunately, some redundant blitting. I think one of the changes we’d like to make even before we attempt to use LowLevelMesh is to transfer non-RenderTextures (Texture2D, Texture2DArray, Texture3D, Cubemap) via blitting as well, to avoid that extra step (and generally to transfer them faster when in-process, versus Play to Device where we have to transfer them via the CPU).
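As a sketch, the staging pattern described above might look like this (hypothetical names; this assumes the frame data is in an uncompressed format, since the RenderTexture destination stores uncompressed data):

```csharp
using UnityEngine;

// Hypothetical staging pattern: fill a Texture2D on the CPU, then blit it
// into a RenderTexture so PolySpatial can use the fast RT transfer path.
Texture2D staging = new Texture2D(256, 256, TextureFormat.RGBA32, false);
RenderTexture target = new RenderTexture(256, 256, 0, RenderTextureFormat.ARGB32);

void UploadTexture(byte[] rawPixels)
{
    staging.LoadRawTextureData(rawPixels); // raw bytes -> Unity texture
    staging.Apply(false);                  // CPU -> GPU upload
    Graphics.Blit(staging, target);        // GPU-side copy into the RT
}
```

Note the redundancy mentioned above: the CPU-to-GPU upload via Apply still happens once per frame; only the PolySpatial-to-RealityKit transfer takes the faster blit path.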


Thanks a lot for the answer! Got it, so the way to do it is to fill a Texture2D as usual and then Blit it.

Just to fully understand it: when exactly does the expensive texture copy to RealityKit happen when I’m just using a classical Texture2D? When it gets assigned to a material? Or, asked the other way around, what do I need to avoid to benefit from the RT blitting improvement?

Blitting by default for non-RT textures would be much appreciated! I imagine this alone could give a big performance benefit.

Yes; we track the usage of each asset via the component trackers, so the first frame that you have a MeshRenderer/SkinnedMeshRenderer that uses a given Material, we start tracking that Material. The first frame that a Material is tracked, we start tracking its textures. The first frame that a texture is tracked, we send its initial contents over the CPU API (for non-RenderTextures), which ends up calling a function like TextureResource.init on the RealityKit side. Any time you change the texture (and thus it becomes “dirty” for that frame), we resend that data over the CPU and recreate the texture.

With a RenderTexture, when the texture is marked dirty (possibly explicitly via MarkDirty), instead of the contents we just send a native handle to the RealityKit side. Then, instead of TextureResource.init, we use a blit operation to copy the texture to one provided by RealityKit. Note that simply having RealityKit use the native Metal texture that Unity creates isn’t quite an option; RealityKit can only use textures that it creates (although we’ve looked into the possibility of having Unity use those directly, and may do so again in the future).

Anyway, the short answer is that any use of RenderTextures guarantees use of the faster path (when in-process, that is; on Play to Device, we can’t transfer native handles and must read the contents on the CPU, which makes things much slower).
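For reference, the explicit dirty-marking step might look like this (assuming the PolySpatialObjectUtils.MarkDirty API for render textures; check the docs for your package version):

```csharp
using Unity.PolySpatial;
using UnityEngine;

// After updating the RenderTexture on the GPU, flag it so PolySpatial sends
// its native handle to RealityKit this frame and performs the blit there.
Graphics.Blit(staging, vertexRT);          // staging/vertexRT: hypothetical names
PolySpatialObjectUtils.MarkDirty(vertexRT);
```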


Great, thank you a lot, this makes things much clearer to me; it’s much appreciated! I will report back with the performance gains I was able to achieve in case anyone is interested.


I was finally able to test the RenderTexture approach. Unfortunately, blitting into a render texture has the downside that all of the data needs to be uncompressed, which is not feasible for our use case. Compared to ASTC-compressed textures, uncompressed textures are nearly 8x larger, which, besides needing much more storage space, also means an 8x increase in read time.

However, thanks to your explanation of how you implemented the particle system, I was able to implement point cloud rendering in PolySpatial, which is great! I just seem to be missing two steps, which I hope you can help me with :slight_smile:

First, the RenderTexture always seems to get bilinear filtering applied when I run in the simulator. I set the filtering to point mode for all render textures, and in the Unity Editor everything works fine. In the simulator, however, the forced bilinear filtering unfortunately messes up the vertex positions in the float texture. Is there any setting that can prevent that?

From left to right: Editor point filtering, Editor Bilinear filtering, Simulator point filtering, Simulator bilinear filtering

The second question is regarding the billboarding of the particles/points. As far as I understand, PolySpatial doesn’t allow head position tracking in bounded mode. My solution is to use tetrahedrons as particles, which works and looks fine, but uses twice as many polygons as necessary. Is there any hint you could give on how you approached this problem?


You should be able to use a Sampler State node in the shader graph to force point sampling. Because there’s no way for a RealityKit MaterialX shader graph to take sampler states (such as wrap and filter modes) as parameters, we can’t apply the sampler state associated with the texture itself.

That is true for the head position on the CPU. Shader graphs, however, can access the camera parameters in order to perform billboarding even in bounded mode. There are a few different ways to do this, but one of the most straightforward is to get the object space View Direction. I was just now working on a shader graph that uses the cross product of that and the direction of a line in order to have the line always face the camera.


Thank you so much! Ah yes, I should have read the documentation more thoroughly; I didn’t even think that these properties would be available as shader properties/settings.

Thank you a lot for the fast and very helpful advice!
