Hi Martin,
thanks for your answer! The following text describes how we used to work at a previous company…
When starting a new project, we usually defined performance budgets by trial and error on a particular platform: finding out how many render-state changes, vertices, and so on the hardware could handle before dropping below the frame rate we had to achieve.
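A minimal sketch of what such a budget definition might look like (Python just for illustration; all field names and values here are hypothetical, the real numbers came from trial and error on the target hardware):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PerformanceBudget:
    """Per-platform limits found by trial and error on the target hardware."""
    target_fps: int          # minimum acceptable frame rate
    max_vertices: int        # vertices pushed per frame
    max_draw_calls: int      # draw calls per frame
    max_state_changes: int   # render-state changes per frame

# Example values, purely illustrative:
CONSOLE_BUDGET = PerformanceBudget(
    target_fps=30,
    max_vertices=1_000_000,
    max_draw_calls=2_000,
    max_state_changes=500,
)
```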
Then we had different automated performance tests, some of which ran as part of continuous integration.
For example, each level had several pre-defined camera positions. The performance test placed the camera at each of these points for a certain time frame and moved it around, for example a slow 360° turn around the Y-axis over 30 seconds for a first-person-shooter-like game.
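The sweep described above can be sketched roughly like this (Python for illustration; `render_frame` is a hypothetical engine hook that renders one frame at a given yaw and returns that frame's stats):

```python
def sweep_camera(render_frame, duration_s=30.0, fps=60):
    """Slowly yaw the camera 360 degrees around the Y-axis over `duration_s`
    seconds, collecting per-frame stats from the (hypothetical) engine hook."""
    samples = []
    total_frames = int(duration_s * fps)
    for frame in range(total_frames):
        yaw_degrees = 360.0 * frame / total_frames  # slow full turn
        samples.append(render_frame(yaw_degrees))
    return samples

# Example with a stub renderer that just echoes the yaw it was given:
def fake_render(yaw):
    return {"yaw": yaw, "frame_ms": 16.6}

samples = sweep_camera(fake_render, duration_s=30.0, fps=60)
```

In the real test, each sample would carry frame time, vertex counts, draw calls, and so on, ready for the budget comparison described below.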
The test measured the CPU performance during the entire run. At the end of the test, we had a graph similar to the one in this post. The test in the linked post is from my hobby project, but the idea is the same, just with a simpler implementation.
If we could also measure other statistics besides frame rate, it would allow us to automatically detect the most likely cause of a frame-rate drop at a certain camera position.
For example, if one of these camera positions drops below 30 fps and we detect 10× the number of vertices our budget allows, that immediately gives a clue where to look in order to fix it. This allows non-programmers to understand where to look for performance issues. Even simple things like "are we CPU or GPU bound?" are really useful to have.
If the performance test generates a text like:
Level “Harbour” camera position “Region 4” (Rotation=0,54,0) dropped to 20 fps, pushing 10,000,000 vertices more than our budget allows.
… it can be understood by non-programmers as well, for example by level builders. Especially if they just worked in that area and the test runs after they pushed their changes: a "failed" test then generates an email with this text and sends it to them or to a selected group of people.
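Generating that kind of message is mostly string formatting once the stats and budget are available. A hedged sketch (all names and the fixed vertex budget are illustrative, not the real system):

```python
def budget_report(level, camera_pos, rotation, measured_fps, measured_vertices,
                  target_fps=30, vertex_budget=1_000_000):
    """Return a human-readable failure message when a camera position misses
    the frame-rate target, or None when everything is within budget."""
    if measured_fps >= target_fps:
        return None  # within budget, nothing to report (or email)
    excess = measured_vertices - vertex_budget
    return (f'Level "{level}" camera position "{camera_pos}" '
            f"(Rotation={rotation}) dropped to {measured_fps} fps, "
            f"pushing {excess:,} vertices more than our budget allows.")

msg = budget_report("Harbour", "Region 4", "0,54,0",
                    measured_fps=20, measured_vertices=11_000_000)
```

The returned string could then be mailed to whoever pushed the last change, without a programmer having to translate the raw numbers first.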
Because we couldn’t generate this detailed information about why a performance test failed, a programmer always had to investigate.
EDIT: We also had a “performance overlay” in the game that we could activate at any time. It displayed CPU ms, memory consumption, etc. Adding more detail to this would also allow situations like this:
In-house QA tests the game, comes into a region that drops to 10 fps, checks the performance overlay, and sees 100,000 draw calls. They walk over to level design: “Didn’t you add more items to this region yesterday?” “Yes I did, looks great, right?” “It sure does, but maybe you placed a few too many objects. I’ll create you a ticket.” It’s also something that moves work away from programmers.
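The overlay itself can be as simple as one formatted line per stat. A minimal sketch (fields are hypothetical; the "GC allocs per frame" entry is the kind of addition mentioned in the second edit below):

```python
def overlay_text(cpu_ms, mem_mb, draw_calls, gc_allocs):
    """Format one line of a debug performance overlay from raw frame stats."""
    return (f"CPU {cpu_ms:.1f} ms | Mem {mem_mb} MB | "
            f"Draw calls {draw_calls:,} | GC allocs/frame {gc_allocs}")

line = overlay_text(cpu_ms=33.4, mem_mb=512, draw_calls=100_000, gc_allocs=0)
```

In-engine this string would be drawn on screen each frame while the overlay is active.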
I hope this clears things up; otherwise please let me know.
EDIT 2: Something like “GC allocs per frame” would be nice too.