To create a certain effect, I need to calculate the pose of a set of skinned meshes myself, since Unity doesn’t expose that data in version 3. I was able to get everything working, but I’m running into major performance issues. Since I don’t have Unity Pro, I did some ad-hoc timing of functions using Time.realtimeSinceStartup and narrowed my bottleneck down to this section of code:
private function CalculatedPosedMesh(skinnedMesh : SkinnedMeshRenderer) : Mesh
{
    // Build a new mesh
    var baseMesh = skinnedMesh.sharedMesh;
    var mesh = new Mesh();
    var baseMeshVertices = baseMesh.vertices;
    var baseMeshVerticesLength = baseMeshVertices.Length;
    var baseMeshBoneWeights = baseMesh.boneWeights;
    var baseMeshBindPoses = baseMesh.bindposes;
    var newVert : Vector3[] = new Vector3[baseMeshVertices.Length];
    var i : int = 0;
    for(i = 0; i < baseMeshVerticesLength; i++)
    {
        // Only the first bone is being factored in right now in order to cut down on calculations.
        // Apply bone weights
        newVert[i] = GetBoneInfluence(skinnedMesh, baseMesh, baseMeshBoneWeights[i].boneIndex0, 1.0, baseMeshVertices[i]);
        // Transform the vertex to be in local space for convenience.
        newVert[i] -= transform.position;
    }
    mesh.vertices = newVert;
    mesh.uv = baseMesh.uv;
    mesh.triangles = baseMesh.triangles;
    mesh.RecalculateBounds();
    mesh.RecalculateNormals();
    return mesh;
}
private function GetBoneInfluence(skinnedMesh : SkinnedMeshRenderer, baseMesh : Mesh, boneIndex : int, boneWeight : float, vertex : Vector3) : Vector3
{
    // Transform the mesh vertex first so that it’s local in bone space, and then transform the
    // local coordinates to world coordinates using the current bone transform.
    var localVertexPosition : Vector3 = baseMesh.bindposes[boneIndex].MultiplyPoint3x4( vertex );
    return skinnedMesh.bones[boneIndex].transform.localToWorldMatrix.MultiplyPoint3x4( localVertexPosition ) * boneWeight;
}
This code works, and the effect looks excellent, but it causes hitches of 0.2-1.0 seconds each time I call it on a 3k-vertex mesh, which is not acceptable. I got it down to its current state by caching many of the return values from the Mesh class (the property accessors apparently do a lot of work behind the scenes), but I’ve run out of ideas in that regard. If I can bring the calculation cost down consistently to half a second or less, I think I can mask the rest of it with multithreading or coroutines.
Can anyone spot redundant or unnecessary operations here that I might have missed, or another way to approach the problem? Any help would be appreciated!
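For reference, here is a minimal sketch of the kind of ad-hoc timing mentioned at the top, in case anyone wants to reproduce the measurements. The wrapper name TimedCalculatedPosedMesh is hypothetical and only for illustration; it is assumed to live in the same script as CalculatedPosedMesh, since that function is private. The actual measurement code may have differed in detail.

```
// Hypothetical wrapper, for illustration only. It must sit in the same script
// as CalculatedPosedMesh because that function is private.
private function TimedCalculatedPosedMesh(skinnedMesh : SkinnedMeshRenderer) : Mesh
{
    // Time.realtimeSinceStartup is the real time in seconds since the game started.
    var startTime : float = Time.realtimeSinceStartup;
    var posedMesh : Mesh = CalculatedPosedMesh(skinnedMesh);
    // Convert the elapsed seconds to milliseconds for easier reading in the console.
    var elapsedMs : float = (Time.realtimeSinceStartup - startTime) * 1000.0;
    Debug.Log("CalculatedPosedMesh took " + elapsedMs + " ms");
    return posedMesh;
}
```

Calling the wrapper in place of CalculatedPosedMesh logs one timing per call, which is enough to compare variants of the function without the Unity Pro profiler.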