Fast matrices

I have an algorithm that finds the determinate. It gets the correct answer but use to large of a matrix and it runs really slow. It uses a recursive method which is slowing it down. I tested it in java first and found that anything larger then 10 is running in about a second then after that is just increases even worse. I need to be able to run about 200 determines a frame and really large matrices.

public static int det(int [ ][ ] matrix){

int result = 0;

if(matrix.length ==1)
return matrix[0][0];

if(matrix.length == 2){
return matrix[0][0] * matrix[1][1] - matrix[0][1]* matrix[1][0];
}

for(int i = 0 ; i< matrix[0].length; i++){
int temp[ ][ ] = new int[matrix.length-1][matrix[0].length-1];
for ( int j = 1; j< matrix.length;j++){
for( int k = 0 ; k < matrix[0].length;k++){
if(k< i)
temp[j-1][k] = matrix[j][k];
else if(k>i)
temp[j-1][k-1] = matrix[j][k];
}
}

result += matrix[0]*Math.pow(-1, i)*det(temp);

  • }*

  • return result;*

}
}

Anything you can write as a recursive method, you can write as a loop, but I don’t think this will help you very much. Recursive methods are not necessarily slow because they are recursive.
There’s just a lot of operations that need to be done to compute the determinant of large matrices.
Usually you can take some shortcuts though, when there are a lot of 0s for example.
Also, in case you happen to know the matrices in advance, you can precompute them (or parts of them)

Unfortunately, I can’t precompute the matrices. That is kind of the point of the game that the weather will change without being able to predict it, so it is being read in from a set of files that are temporally spaced and then the weather needs to be calculated based on position. Unless the users path was know precomputing is not good. I just need a good set of matrix operations because I need the determinant to calculate the inverse, so I can divide two matrices. The multiplication is straight forward and the inverse is to after the determinate is finished. I am going to be Kriging the data to fill in the spaces between the data points.

Recursion is slower not in the number of steps, but at the hardware level. When you call a function it must be copied from the original program file and stored in the running instance of the program, which is stored in RAM. When that instance is needed it is pulled from the RAM and sent to the cache of the CPU, which takes many clock cycles usually 5-10 for the transfer of each instruction. If it is done iteratively there is only one instance of the code and it stays in the cache until its done computing because the cache has an algorithm that determines the “best” (determined by the company that designs the CPU and what they want it to do) code to stay in cache and it recognizes the loop. When you call a new function even if its the same function the old instance is is left in cache with a moderate chance of being removed based on the algorithm and replaced with code that is more likely to be excited before the old instance is needed again and the new instance with the new inputs are inserted which takes a lot of time to just set it up. The “Runtime analysis” is the same, but the clock cycles required are much more.

You don’t need the determinant to calculate the inverse - use gaussian elimination instead. It’s more efficient, both in number of instructions needed and memory footprint.

For operations of this complexity it’s worth really thinking about why you need the calculation to be done. Why are you trying to “divide” two matrices? It’s possible that there’s a better alogirthm right up at that level that would avoid the need for the inverse.

Also, if the matrix has a particular structure then different algorithms can work better - for example, if it’s symmetric, or if it’s mostly zeros.

Another approach may be to cache the partial inverses of areas of the matrix, so that if your changeable weather just modifies a few values at a time you don’t have to recompute the whole inverse again from scratch.

Wikipedia has information on several techniques, indexed here:

Also bear in mind that for something as generic as this there’s almost certainly a library available already which does what you want, and unless you have very strong reasons for implementing it yourself, it’s probably better to find an implementation by somebody who already knew what to do.

I agree with the above, you should really look for algorithmic improvements or change your methods first. If you really need to optimise the code you already have though, then here are some things you could try which might help:

  1. Instead of assigning memory for your temp matrix in each recursion you could pre-allocate an array of floats large enough to hold all the needed matrices at each level of recursion from the beginning and index into this array. Even better would be if you knew before hand what your biggest matrix would be, then you could make the float array static and just re-use it each time you need to invert a matrix. This will ensure your memory if aligned better and avoid costly allocation operations.

  2. Split your ‘k’ loop into two seperate loops in oder to get rid of the conditional statements currently contained within that loop to prevent code branching.

ie:

for(int k = 0; k < i; ++k)
{
temp[j-1][k] = matrix[j][k];
}

for(int k = i + 1; k < matrix[0].length; ++k)
{
temp[j-1][k-1] = matrix[j][k];
}

  1. At the end you’re using ‘Math.pow’ however this is rather wasteful since all you really need to know is whether the int ‘i’ is even or odd. This can be determined more efficiently using the modulus, ie: (i%2), if that returns 0 then i is even and if it returns 1 then i is odd.

Thanks, I decided to just do the matrix for the visible area, so its only a 3x3 which is very fast at calculating the inverse.