How to tell when you are micro-optimizing?

Hello,

I have been educating myself a lot on optimization, and there is a lot to learn! Because I do not really have a background in this subject, I find myself questioning whether the time I am putting in to optimize things is worth it. I know the default answer is to use a profiler, which I am doing, but that takes time too, as I still have to implement the optimizations in order to test them.

I have mostly been referring to this book about CPU layout and architecture:

And this book about real-time collision detection:

They talk about limiting function sizes (i.e. breaking long functions into smaller ones) for better cache locality, using smaller data types, storing integers implicitly using bits, eliminating branches using bit masks, and things like that.

Are these the types of optimizations you generally do later, when you have time? I know it depends on the project, but I would like to hear from your experience, whether on a hobby project or a professional one. I know that generally top-level algorithmic changes are where you start, and then you work your way down. I am trying to get a grasp of the scale of difference these things can potentially make to performance.

I just want to hear if you have ever done these types of low level optimizations and found a significant difference.

You are definitely going too far with micro-optimization if you are implementing all of that in a project.
There is no point going too deep down the rabbit hole if you are not hitting bottlenecks.
Always profile, stress test, and see if it is worth it.
It all comes with practice, of course.

Of course there are benefits to knowing what gives the best performance, so the knowledge is not wasted.
But chances are that the time you spend on micro-optimization could instead be spent organizing and optimizing your code at a higher level, probably gaining more performance that way.

Did you actually run tests to measure how much you gained from such micro-optimizations?

But as you said yourself, when you eliminate branches you also eliminate readability.

People were talking about branching and all that, so I also ran my own test, using Unity DOTS, jobs, and Burst.
https://discussions.unity.com/t/814311/3

Is it worth it for code which executes at most a few times per frame?
You need to recognize when it is worth applying these techniques.

If you really want to focus on optimization, I suggest redirecting your attention to DOTS.
It will force you to go that route, with some practical applications.

2 Likes

It’s good (AND FUN!) to know these things - but at some point your head just explodes :smile: I know it, and I’m not an absolute expert on performance (above average, though, I’d say).
The most important thing to realize is that, as you said, a top-level change such as swapping the algorithm is the most noticeable optimization. But it is equally important to know what the compiler is able to do for you - it is a huge topic, and knowing about it also results in much cleaner code in general.

I want to give you some advice, though: good programming practices help out a lot. By that I mean you should wrap as much of your logic as you can behind interfaces and call into those. Imagine you need bit array structs or something similar; instead of writing code like 1 == ((myInt >> index) & 1) ? [...] : [...] everywhere, define a “BitArray” datatype with member methods instead. That way you can optimize the method itself if you come across a better algorithm, essentially upgrading your entire program by changing just a tiny bit. This might be an obvious and/or trivial example, but IMO it applies to so much, even a TINY bit of logic, like a “myBranchFreeMaxFunction” for example…
I wouldn’t worry about function call overhead, since compilers have full control over inlining, and I wouldn’t worry about your layers of abstraction impacting performance whatsoever, especially if you compile with IL2CPP (C++ compilers collapse your abstractions very well).

1 Like

This is actually the reason I develop LSSS. GitHub - Dreaming381/lsss-wip: Latios Space Shooter Sample - an open Unity DOTS Project using the Latios Framework
I haven’t been building my framework just to make an unoriginal space shooter. I have much bigger plans. Instead, I have been developing LSSS so that I have a realistic but relatively small-scope project to put the tech I have been building in perspective.

My first pass through any code is to make it clean and use optimization tools that already exist where applicable (usually math.select). Sometimes I write code in a branchless way up front, if I suspect I may want that optimization in the future and it would drastically change the logical structure of the algorithm in question.

Then I profile. I don’t just profile the algorithm, I profile the entire game. I make sure to overload it with simulation until it is no longer running at target frame rates. Then I look at the profiling data and look at where the problems are. Got a bunch of threads sitting idle for three milliseconds? Either find work for them to do or optimize what they are blocked on. Then I look at slow parallel jobs and pick the one that I think could save me the most overall frame time for the least time spent on it. And I drill down that optimization until I believe my time is better spent optimizing something else.

But until you have a working project that can provide that perspective, you shouldn’t worry about optimizing much. You are just wasting time on things that likely won’t matter.

3 Likes

There are comments here from people who are more experienced than me saying not to worry about it until it matters, and I agree with them to a degree. However, I do think that it can be beneficial to develop habits that make your first pass reasonably fast. For example thinking about vectorisation when writing loop bodies - like Entities.ForEach - is good. Don’t expend undue effort on it until you find that you need to, but do try to develop habits that mean that you write reasonable code by default. This is where reading about performance techniques can be useful at the start.

I don’t usually profile new bits of Burst code when I write them, but I do usually take a quick glance at the assembly output just to confirm that the compiler is doing roughly what I expected - for example, to make sure that things vectorised.

3 Likes

Yeah, I kind of fell down this hole a bit. I looked at bit packing and separation-logic optimisations, and eventually realised they might not have applied to my case, might not have been worth the time and effort to implement, and might not have resulted in much of a speed-up. I wasn’t going to use a million cells anyway. It was probably just a delaying tactic so I didn’t have to focus on other, more complicated things and actually finish something. It’s better to at least get something fully working; then I can go back and optimise. I know anything I do in DOTS is way faster than anything I could do in MonoBehaviour, at least.

How about this… You are Micro-Optimizing when:

  • You are not learning something new, and…

  • You are adding/editing things not required to SHIP IT!

I’m a huge fan of MVP (Minimum Viable Product) and though I admittedly fall down the optimization hole, aiming for a MVP provides clarity of thought and vision. Hold off on optimizations until after.

2 Likes

Learn things, because knowing them is the only way to find out whether you will need them in a given situation.
Regarding how much to optimize:

  • Build good habits so you are not building things slower than needed, if it is not hard to build them fast.
  • Performance is a culture (a set of habits of a group), it helps if you always consider it.
  • Have targets and see if you are close to them. AAA games go so far as to set time and memory budgets for each section/team (AI, physics, …), but you might at least say: I want to run 60 simulation frames per second and never do worse than that on a certain system. Some systems run more than others, and some code runs on many entities, in loops or in multiple hot paths. Apply the micro-optimizations that are harder to do and make the code harder to read only there. As for whether you should, for example, try to avoid branching in all inner loops: it would be great if you did, but until it becomes a habit you can do it only in the code sections that are still slow after algorithm and data-layout optimizations.

Programming in general is a set of heuristics and loose rules, so there will never be a perfect cutoff for where to stop. But if you are learning, learn it well, and you will know which systems are taking the most time; naturally those are the most important to optimize to achieve your goal. If you don’t have a specific goal, then it is hard to say when to stop.

1 Like