Hi,
I just added some skinned meshes to my scene and was wondering about the skinning (especially in Unity3), and what exactly is being done on the CPU/GPU…
As far as I know there are 3 parts to rendering a skinned animation (assuming this is just 1 anim, and the renderer is interpolating between keyframes on that 1 anim):
1) Blending the matrices of two keyframes to get a third, interpolated set, using quaternions to blend between the poses.
2) Creating the matrix palette (by traversing the hierarchy of blended matrices to find the final matrices).
3) Creating the object/world-space verts by doing weighted multiplies of each vert with the matrices in the matrix palette. If lighting, also skin the normals. If normal mapped, also skin the tangents.
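To be clear about the math I mean, here's a sketch of steps 1) and 3) in HLSL-ish syntax (purely illustrative; names and the palette size are made up by me, and on the CPU this would of course be plain C++/C# math instead):

```
// Step 1 (illustrative): blend the rotations of two keyframes with a
// normalized lerp, flipping one quaternion so we take the shortest arc.
float4 BlendKeyframes(float4 q0, float4 q1, float t)
{
    if (dot(q0, q1) < 0.0) q1 = -q1;   // shortest-arc fix-up
    return normalize(lerp(q0, q1, t));
}

// Step 3 (illustrative): weighted blend of the palette matrices from
// step 2, then transform the vertex. The math is identical whether it
// runs on the CPU or in a vertex shader.
float4x4 bonePalette[64];   // output of step 2; size made up

float3 SkinPosition(float3 pos, int4 indices, float4 weights)
{
    float4x4 blended = bonePalette[indices.x] * weights.x
                     + bonePalette[indices.y] * weights.y
                     + bonePalette[indices.z] * weights.z
                     + bonePalette[indices.w] * weights.w;
    return mul(blended, float4(pos, 1.0)).xyz;
    // Normals/tangents: same blend, but use only the 3x3 part.
}
```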
When referring to 'skinning is done on the CPU'… am I right in thinking this is just parts 1) and 2) above? Surely not 3)? I've got a 5k-vert character in my scene, and the Unity3 profiler shows it spending 25% of my CPU on skinning.
It’s not doing the actual skinning on the CPU is it?
BUUUMP. Where's the in-depth, authoritative info on how CPU and GPU skinning actually work behind the curtain? Knowing how things are done in detail allows for better-informed design decisions early on. I know about the rendering pipeline in general, rasterization, shading, etc. So what exactly happens for skinned mesh renderers?
1 and 2 are always on the CPU and are usually called animation. #3 is skinning and it can be GPU or CPU. There are rare cases where CPU skinning is preferable, but generally GPU skinning is much faster.
Thanks MCN! I'd still be hugely interested in how it's done on the GPU when GPU skinning is enabled. It doesn't seem to be happening in vertex shaders, or does it? GPU basically means some kind of shader has to run for this purpose; which shaders?
Yeah, GPU skinning means it's done in a vertex shader. Unity has a lot of magic in its shader pipeline, which is probably why you don't see that code, but I'm not exactly sure. If you're curious as to how it's often done, Google "matrix palette skinning shader".
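The gist of what those tutorials show is something like this (a rough sketch from memory, not Unity's actual generated code; palette size and names are arbitrary):

```
// Textbook matrix palette skinning vertex shader (sketch).
// Bone indices/weights arrive as per-vertex attributes; the palette
// (step-2 output) is uploaded as shader constants every frame.
float4x4 bonePalette[56];
float4x4 modelViewProj;

struct VSIn {
    float3 pos     : POSITION;
    float3 normal  : NORMAL;
    float4 weights : BLENDWEIGHT;
    int4   indices : BLENDINDICES;
};

struct VSOut {
    float4 pos    : POSITION;
    float3 normal : TEXCOORD0;
};

VSOut SkinnedVS(VSIn v)
{
    // Weighted blend of up to four bone matrices per vertex.
    float4x4 skin = bonePalette[v.indices.x] * v.weights.x
                  + bonePalette[v.indices.y] * v.weights.y
                  + bonePalette[v.indices.z] * v.weights.z
                  + bonePalette[v.indices.w] * v.weights.w;

    VSOut o;
    o.pos    = mul(modelViewProj, mul(skin, float4(v.pos, 1.0)));
    o.normal = normalize(mul((float3x3)skin, v.normal));  // no translation for normals
    return o;
}
```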
GPU Skinning is D3D11-only, AFAIK. The skinning is performed by the vertex shader, and the results written to a new VB via StreamOut (which is why D3D11 is required). The new VB is then rendered similarly to any other mesh.
Yep, that’s how it works on DX11. On OGLES3.0 it uses transform feedback. Here’s the relevant change on the 4.2 release:
GPU Skinning (requires Unity Pro)
Completely automatic, no custom shaders needed.
Works on DirectX 11 (via stream-out), OpenGL ES 3.0 (via transform feedback) and Xbox 360 (via memexport). Other platforms will continue to use CPU skinning.
Okay, very interesting stuff guys, thanks! So contrary to the suggestion by @MakeCodeNow: while there is a technique called "matrix palette skinning" that uses vertex shaders, that isn't what Unity uses. Instead they pick one DX11-specific technique for DX11 and another (transform feedback) for GLES3, and although transform feedback should be available in most current desktop OpenGL (not ES) driver implementations, everything else falls back to the CPU. Good to know…
FWIW, the D3D11 and GLES3 techniques are effectively the same; they just have different names for the pipeline stages ("StreamOut" vs "Transform Feedback").
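Conceptually the shader side is just a skin-only vertex shader whose outputs get captured into a new buffer. A rough sketch (the capture itself is configured API-side, e.g. via ID3D11Device::CreateGeometryShaderWithStreamOutput on D3D11 or glTransformFeedbackVaryings on GLES3, not in the shader; names here are illustrative):

```
// Skin-only pass (sketch). There's no pixel shader; the runtime is
// configured (API side, not shown) to capture SkinnedVert outputs into
// a new vertex buffer, which is then drawn with the mesh's ordinary,
// non-skinned shaders.
float4x4 bonePalette[56];

struct SkinnedVert {   // layout of the captured vertex buffer
    float3 pos    : POSITION;
    float3 normal : NORMAL;
};

SkinnedVert SkinOnlyVS(float3 pos     : POSITION,
                       float3 normal  : NORMAL,
                       float4 weights : BLENDWEIGHT,
                       int4   indices : BLENDINDICES)
{
    float4x4 skin = bonePalette[indices.x] * weights.x
                  + bonePalette[indices.y] * weights.y
                  + bonePalette[indices.z] * weights.z
                  + bonePalette[indices.w] * weights.w;
    SkinnedVert o;
    o.pos    = mul(skin, float4(pos, 1.0)).xyz;
    o.normal = normalize(mul((float3x3)skin, normal));
    return o;
}
```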
I’m not entirely sure its that straightforward though. Been looking into this myself recently as I had tried to use GPU skinning on Android by forcing GLES 3.0, only to re-read the 4.2 release notes and discover its Pro only.
The problem i’ve got is that its all a little too black-box. Sure Unity says that its support for dx11 and GLES 3.0, but I also remember reading about gpu skinning being available on iOS then disabled due to performance issues. I have no idea if that statement is correct or misinformation and that’s the problem, you never quite know if its working on your target platform.
Furthermore, it would be really nice to be able to enable/disable GPU skinning on demand for a project, or even for a specific SkinnedMeshRenderer, since mileage will vary in terms of performance based on the target hardware. As far as I can tell there are no options to do this. For example, if your project is CPU-bound and you have a decent GPU, then GPU skinning is likely to be a win; conversely, if you have a poor GPU but idle CPU cores, then CPU skinning might be better.
Really, it's all a bit disappointing that we had to wait for the latest APIs to get a feature that can easily work with older versions and older hardware. I guess there must be complexities within Unity that made it impractical to implement older-style GPU skinning.
Well, they always go for "runs for many use-cases on a broad range of hardware", but what baffles me is that there isn't a complete SkinnedMeshRenderer replacement in the Asset Store yet; perhaps one based on augmenting existing vertex shaders with matrix-palette skinning, to sidestep those exotic "DX11 or GLES3 only" features entirely…
Re-sending full meshes to the GPU every frame is just stupidly wasteful. This whole practice should have been banned with the introduction of the programmable vertex stage over a decade ago. Granted, there are a few pitfalls and limitations when going with the vertex stage, but they're manageable IMHO. Oh well, if U5 doesn't improve on this and there's still nothing on the Asset Store in another half a year or so, I'll have to consider giving it my best shot myself…
But don’t get too excited, without access to vertex streams to pass bone indices and weights, plus the fact that I’ve always found skeleton animation math hard I might not get anywhere with it. I do have something working, by which I mean the mesh ‘moves’,but its just a grossly distorted mess at the moment. I’m hoping its because i’ve got the bone transformations wrong in the shader, but that sort of stuff is really hard to debug.
There are various technical limitations, such as the max number of bones and working out how to provide the bone indices and weights, but it looks feasible.
I trust you have already collected all the various matrix-palette vertex-shader tutorials from Google, whether GLSL or HLSL… when I took a quick first glance, a few results looked pretty promising, but I didn't bookmark them.
Not sure about "vertex streams"; from my cursory high-level research, it seemed some implementations just set lots of matrix and/or quaternion uniforms. So, going back to the very first post in this thread, or rather the first reply by @MakeCodeNow: it might be feasible to still perform steps 1 & 2 on parallel CPU cores and prepare everything as vertex-stage uniforms, so that the VS only needs to apply a couple of matrix transformations for step 3. That's still bound to be much faster than transforming tens of thousands of vertices on the CPU and re-uploading them each frame… which is the current out-of-the-box state of affairs outside of DX11 and GLES3.
BUT probably that’s what you’re struggling with already and the above is adding no real insights I’m definitely fairly noobish on this particular topic… when you need a second pair of shader-coding eyes or some external beta-tester lemme know!
Interesting info. I had no idea that Unity's GPU skinning was GLES 3.0/DX11-only. That's a remarkably high min spec for a concept that's been a good idea since the original Xbox. As folks elsewhere said, CPU skinning is worth it if your GPU is crappy, or if you render the same character lots of times per frame, but in my experience you pretty much always need to vectorize and parallelize CPU skinning for it to be reasonably efficient. Animation, too, is naturally parallelizable at many points in the pipeline. We were doing that early in the PS3/360 generation, and Unity definitely should be doing it now (if they aren't already).
If anyone does go for matrix palette skinning, note that you will have bone limits, and you will have to split your meshes (or fall back to CPU skinning) if they go over that limit. This is just the way it's always been, and one of the reasons the stream-out approaches are awesome, but so far too much hardware still lives in the land of DX9/ES2.
You need a vertex stream or two to provide the bone indices and bone weights to the vertex shader; much like the tangents you pass in for normal mapping, you need this data per vertex.
My current hacky solution, because we don't have the ability to add arbitrary streams, is to use the vertex color, but that means I'm limited to 2 bones per vertex and the weighting is quantised to 1/255 intervals. There are a couple of alternative options that should provide improved accuracy and up to 4 bones, but I want to get the basics working first.
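In case anyone wants to try the same hack, the decode side in the vertex shader looks roughly like this (the packing convention is mine and still subject to change; all names illustrative):

```
float4x4 bonePalette[56];

// Vertex color carries the bone data: r/g = bone indices / 255,
// b = weight of bone 0 (bone 1 gets 1 - w0). Weights are therefore
// quantised to 1/255 steps, and we're limited to 2 bones per vertex.
float3 SkinFromColor(float3 pos, float4 color)
{
    int   i0 = (int)round(color.r * 255.0);
    int   i1 = (int)round(color.g * 255.0);
    float w0 = color.b;
    float4x4 skin = bonePalette[i0] * w0
                  + bonePalette[i1] * (1.0 - w0);
    return mul(skin, float4(pos, 1.0)).xyz;
}
```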
As MakeCodeNow mentions, though, there are further restrictions to deal with, such as the number of matrices that can be passed into a Unity shader. For Shader Model 3 that appears to be around 56 bones currently, though it may be fewer with more complex shaders. This can be improved and could feasibly be doubled, but again, I want to get the basics working first and check there aren't any silly or strange roadblocks that make the concept pointless to implement.
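The kind of improvement I have in mind: upload each bone as an affine 3x4, i.e. three float4 rows instead of a full float4x4, since the bottom row is always (0,0,0,1). That buys roughly a third more bones; actually doubling the count would need something tighter still, e.g. a quaternion + translation pair per bone. A rough sketch, with made-up names:

```
// Pack each bone as 3 float4 rows; the 4th row is implicit.
static const int MAX_BONES = 74;        // vs ~56 full float4x4 matrices
float4 boneRows[MAX_BONES * 3];

float4x4 LoadBone(int i)
{
    float4x4 m;
    m[0] = boneRows[i * 3 + 0];
    m[1] = boneRows[i * 3 + 1];
    m[2] = boneRows[i * 3 + 2];
    m[3] = float4(0, 0, 0, 1);          // implicit affine bottom row
    return m;
}
```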
Talking of which: after much banging my head against a wall and randomly trying out different transformation/matrix combinations (because I had no idea why the mesh got messed up), I've actually got skinning working on the GPU for a simple test model/animation I knocked up in Blender. Might be a bit early to be sure, and I need to test with a proper animated character, plus it's like 5 am, but I'm pretty happy with the progress.
Oh, one other positive aspect: even if skinning doesn't work out, there is the possibility of using this technique for non-hardware instancing of objects in a single draw call, without having to batch or combine models. Indeed, I'm pretty sure others have already done that.
I wouldn’t mind betting that ‘skinning’ is the number one expensive calculation that occurs through the character pipeline. Usually. When Motion Builder brought GPU skinning into MB2011, you could raise the mesh density by about a factor of 10 (with a half decent GPU). Realtime motion editing with dense meshes. If you want to find if it’s available - do that to your meshes and test the performance
Yeah, I was disappointed when I found that out. IIRC the main reason is that this way it's entirely separate from the rest of the shading pipeline; if we were to use traditional matrix palette skinning, then we'd have to be patching our shaders (and/or surface shaders) to do all that work, which would require changes to the shader compiler, etc. Not that any of that sounded impossible to me, but I guess it's just one of those 'nobody's done it yet' things.
This is something I really don't get about Unity. What you've stated is pretty much the go-to explanation whenever anybody asks why Unity's graphics lag behind in hardware support, usability, feature completeness, etc. It's been like this for many years now, so why don't they just hire or re-allocate one or two specialists to focus on this stuff?
It's pretty important, and it clearly hurts Unity that many features simply cannot be implemented by developers because the APIs, hooks, and functionality don't exist.