I am doing the following, which results in about 14,000 calls, so I am trying to optimize the matrix multiplications. Because the matrices are structs, I get quite a few milliseconds of delay from copying, specifically in:

String.memcpy

Buffer.Memcpy

Buffer.memcpy4

```
SkinnedMeshRenderer renderer = target.GetComponent<SkinnedMeshRenderer>();
if( renderer != null )
{
    NeedsScaleAdjust = false;
    Mesh mesh = new Mesh();
    renderer.BakeMesh( mesh );
    mesh.boneWeights = renderer.sharedMesh.boneWeights;
    outMesh = mesh;
    Matrix4x4 scale = Matrix4x4.Scale( _target.transform.localScale ).inverse;
    outBindposes = new Matrix4x4[ renderer.bones.Length ];
    for( int i = 0; i < renderer.bones.Length; i++ )
    {
        outBindposes[ i ] = renderer.bones[ i ].worldToLocalMatrix * target.transform.localToWorldMatrix * scale;
    }
    return;
}
```

I am not sure whether using float4 from Unity Mathematics would help at all, since float4 is also a struct, but any ideas on how to improve this would be great!

Cache the matrices first, and if you are doing this every frame, keep a pool of matrices. You can also move the `target.transform.localToWorldMatrix * scale` product outside the loop. I have done things like this before: I wrote my own matrix multiplication methods and used multithreading, but I guess with Unity these days Burst etc. would do a good job.
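A minimal sketch of the hoisting suggestion, not the poster's actual code. `BindposeUtil` and `ComputeBindposes` are hypothetical names, and the sketch assumes `target` and `_target` in the question refer to the same object. It also reads `renderer.bones` once, on the assumption that the getter returns a fresh array copy on each access, as Unity array properties generally do. Regrouping the product is mathematically equivalent but can change the result in the last few floating-point bits.

```csharp
using UnityEngine;

// Hypothetical helper; the name and signature are not from the original post.
public static class BindposeUtil
{
    public static Matrix4x4[] ComputeBindposes( SkinnedMeshRenderer renderer, Transform target )
    {
        // Read the array property once instead of on every loop iteration.
        Transform[] bones = renderer.bones;

        // Loop-invariant part: computed once instead of once per bone.
        Matrix4x4 inverseScale = Matrix4x4.Scale( target.localScale ).inverse;
        Matrix4x4 rootTimesScale = target.localToWorldMatrix * inverseScale;

        Matrix4x4[] bindposes = new Matrix4x4[ bones.Length ];
        for( int i = 0; i < bones.Length; i++ )
        {
            // One multiplication per bone instead of two.
            bindposes[ i ] = bones[ i ].worldToLocalMatrix * rootTimesScale;
        }
        return bindposes;
    }
}
```

In the question's snippet this would replace the loop, e.g. `outBindposes = BindposeUtil.ComputeBindposes( renderer, target.transform );`.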

@SpookyCat I’ve actually found some of your questions from years ago in the forums and was reading through them.

I was wondering whether I could store the inputs in hash sets, so that if the same three matrices have already been multiplied I can just return the cached result. Is that what you meant by caching the matrices, @SpookyCat?

PS: My solution is already multithreaded, so it will get very complex very quickly if I also multiply the matrices in threads, although it’s not a bad idea.

This makes it a bit faster, for anyone who might need it. I’m also using Unity Mathematics, although I must have a bug somewhere, as the result isn’t the same; it could be in another part of the code, though.

```
public static float4x4 Multiply( ref float[ , ] matrix1, ref float[ , ] matrix2 )
{
    // caching matrix lengths for better performance
    int matrix1Rows = matrix1.GetLength( 0 );
    int matrix1Cols = matrix1.GetLength( 1 );
    int matrix2Rows = matrix2.GetLength( 0 );
    int matrix2Cols = matrix2.GetLength( 1 );
    // checking if the product is defined
    if( matrix1Cols != matrix2Rows )
    {
        return default;
    }
    // creating the final product matrix
    float[ , ] product = new float[ matrix1Rows, matrix2Cols ];
    // looping through matrix 1 rows
    for( int matrix1Row = 0; matrix1Row < matrix1Rows; matrix1Row++ )
    {
        // for each matrix 1 row, loop through matrix 2 columns
        for( int matrix2Col = 0; matrix2Col < matrix2Cols; matrix2Col++ )
        {
            // loop through matrix 1 columns to calculate the dot product
            for( int matrix1Col = 0; matrix1Col < matrix1Cols; matrix1Col++ )
            {
                product[ matrix1Row, matrix2Col ] +=
                    matrix1[ matrix1Row, matrix1Col ] *
                    matrix2[ matrix1Col, matrix2Col ];
            }
        }
    }
    // Convert is a helper (float[ , ] to float4x4) that is not shown in this post
    return Convert( product );
}
```
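For comparison, here is a sketch of the same three-matrix product using Unity Mathematics directly, which avoids allocating a `float[ , ]` per call. `BindposeMath` and `Combine` are hypothetical names, and the sketch assumes the implicit `Matrix4x4` to `float4x4` conversion that the Unity Mathematics package provides when used inside Unity. Since `Convert` is not shown above, this may also serve as a reference to check its row/column ordering against.

```csharp
using Unity.Mathematics;
using UnityEngine;

// Hypothetical helper; the name and signature are not from the original post.
public static class BindposeMath
{
    public static float4x4 Combine( Matrix4x4 boneWorldToLocal, Matrix4x4 rootLocalToWorld, Matrix4x4 inverseScale )
    {
        // Assumed: implicit Matrix4x4 -> float4x4 conversion from Unity.Mathematics.
        float4x4 a = boneWorldToLocal;
        float4x4 b = rootLocalToWorld;
        float4x4 c = inverseScale;

        // math.mul is the matrix product; the * operator on float4x4
        // multiplies component by component, which is a different operation.
        return math.mul( a, math.mul( b, c ) );
    }
}
```

The `b * c` part does not depend on the bone, so in a loop it could be computed once and passed in, as suggested earlier in the thread.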