Performance tips for low-poly style

Hi,
We’re making a dungeon crawler with generated maps made of tiles.
We are going for a flat-shaded low-poly style. On the first try I made a mesh with a fairly high polygon count (3000), and dozens to hundreds of meshes like this one will be used in the scene. A little test on my integrated GPU immediately dropped to 20 fps.


I tried using deferred lighting and a simple diffuse material, but it didn’t help much.
Any tips on getting playable framerates, or will I have to make less detailed meshes?

You are underlining useless information in the screenshot. Your real damage dealer here is the setpass count. It’s way too high and you should look at getting that and batches a bit lower.

You’ve provided far too little information about how you’re setting things up. There is no magic bullet if there’s no target. So I am guessing, but I don’t think the GPU is working especially hard. The CPU is working very hard. This means you have major inefficiencies with materials and nothing is being batched and you didn’t really use deferred under ideal conditions.

Thank you for the fast reply!
I saw the setpass calls and I agree it’s horrendous. The models are just dragged and dropped into the scene, all have the same material - Legacy diffuse. I think the problem is either that the material isn’t shared between the models, or the lights drive setpass calls so high. Because of that I tried setting rendering to deferred, but it didn’t seem to help much.
Also all gameobjects are marked as static, but they will be spawned at runtime, so I don’t know whether it applies. All models use the same palette texture too.

It doesn’t really apply, but you can manually batch things at runtime. In any case setpass would be low and batches high if it wasn’t switching materials. So I think you need to look closely: are you modifying materials at all at runtime? If so, you are probably instructing Unity to create a clone of the material. When the game is running in the editor, pause it and look at the scene properly. Are the walls using (clone) materials? Can they be moved (which would mean static batching is being ignored)? Lots of info for you to look at.

You want to first sort out material inefficiencies. This is what is contributing to the large number of setpass calls. Batches aren’t a bad thing if setpass is low, because a GPU can easily plough through similar batches, but it has to stop and set things up with a setpass.
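
As a rough illustration of the batches-versus-setpass distinction, here is a toy Python model (not Unity code). It assumes a draw list where each entry is just the name of the material used, and counts one setpass whenever the material differs from the previous draw. The material names and the counting rule are simplifying assumptions; Unity's real batching rules are more involved.

```python
def count_setpass(draws):
    """Count state changes: one each time the material differs from the previous draw."""
    setpass = 0
    previous = None
    for material in draws:
        if material != previous:
            setpass += 1
            previous = material
    return setpass


# Six draws that all share one material: six batches, but only one state change.
shared = ["palette"] * 6

# Six draws that each got their own cloned material: every draw forces a state change.
cloned = [f"palette (clone {i})" for i in range(6)]

print(len(shared), count_setpass(shared))  # 6 1
print(len(cloned), count_setpass(cloned))  # 6 6
```

In this model the number of draws is the same in both cases; only the number of material switches changes, which is why shared materials matter more than the raw batch count.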

In any case, the CPU is the one being stressed here, not the GPU.

Once you sort it out so your setpass is much lower, you need to look at using a shader to give you the faceted look, not splitting everything when everything is high poly. This is suicide for performance. A shader, or even great crisp normal maps will give you back the sharp edged polygonal look you’re looking for without a performance-draining hack.

(The normal maps approach can even let you lower your poly count and you would probably want to split no verts at all with this approach, dramatically lowering gpu cost).
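
To put a rough number on the cost of splitting vertices for a faceted look, here is a small Python estimate. It assumes a closed, sphere-like triangle mesh with no UV seams, where Euler's formula gives about half as many shared vertices as triangles; a fully split mesh needs three vertices per triangle. Real exported meshes will land somewhere in between, so treat the figures as bounds rather than measurements.

```python
def shared_vertex_count(triangles):
    """Vertices of a closed, sphere-like triangle mesh with fully shared vertices.

    From Euler's formula V - E + F = 2 with E = 3F / 2, so V = F / 2 + 2.
    """
    return triangles // 2 + 2


def split_vertex_count(triangles):
    """Vertices when every triangle owns its three corners (hard edges everywhere)."""
    return 3 * triangles


tris = 3000  # the per-tile triangle count mentioned in the thread
print(shared_vertex_count(tris))  # 1502
print(split_vertex_count(tris))   # 9000
```

Under these assumptions a fully split 3000-triangle tile carries roughly six times as many vertices as the same tile with shared vertices.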

[Screenshot: Bez názvu.png]

I noticed leftover cages from Blender; they may have caused the setpass calls even though they had no triangles, only edges. So setpass calls are down, but 30 ms of CPU is still pretty high. The materials aren’t clones and there are no scripts at all; even post processing is disabled. Meshes are batched fine too. At least on forward I get almost 60 fps.

With the second post, you mean I should make the meshes share vertices and then use some geometry shader to fix it? Isn’t the shader heavy for older PCs?

The batching limitations seem a bit rough: https://docs.unity3d.com/Manual/DrawCallBatching.html (his polygon counts are too high for dynamic batching according to that?). If a small palette of objects will be repeated many times, would instancing be better? https://docs.unity3d.com/Manual/GPUInstancing.html

I like the idea of forward rendering with this art style; you can probably do per-vertex lighting and get away with it, and it may even look good. Somewhat odd that you can get almost 60 fps with 30 ms of CPU per frame, or does that change with forward?
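
The oddity comes down to simple arithmetic, sketched here in Python. It assumes CPU time per frame is the only limit, which is a simplification.

```python
def fps_from_frame_ms(frame_ms):
    """Frames per second if every frame takes frame_ms milliseconds."""
    return 1000.0 / frame_ms


def frame_ms_from_fps(fps):
    """Milliseconds available per frame at a given frame rate."""
    return 1000.0 / fps


print(round(fps_from_frame_ms(30), 1))  # 33.3
print(round(frame_ms_from_fps(60), 2))  # 16.67
```

So 30 ms per frame caps out at about 33 fps, while 60 fps leaves under 17 ms per frame; the two figures most likely came from different rendering paths.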

No, no geometry shader is needed. Looks like the stats window is all within reason for a mobile title, let alone laptop. So that’s all fine.

At this point the bottleneck is something else, so you would probably want to use the profiler. If deferred is slower and forward is 60, you probably should stick with forward :) Most of the CPU time at this point will probably be something else, or even vsync (which you don’t need to be fixing).

It could be the device is truly pants. What are you running it on?

I’d say stick it on forward and be done with it. Test it once or twice with vsync off, to check how much it’s really flying.

The only thing to watch out for in forward is not to use lights that are too big in radius, as that would cause too many things to be drawn multiple times (in forward, each object is drawn once for each light touching it).
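
A toy Python model of that rule, assuming objects are points on a 2D plane and each light is a circle given as (x, y, radius). It simply counts object-light pairs in range, which is a simplification of what a forward renderer actually does.

```python
import math


def light_passes(objects, lights):
    """Count how many (object, light) pairs are in range of each other."""
    total = 0
    for ox, oy in objects:
        for lx, ly, radius in lights:
            if math.hypot(ox - lx, oy - ly) <= radius:
                total += 1
    return total


# Four tiles in a row, one unit apart (hypothetical layout).
tiles = [(0, 0), (1, 0), (2, 0), (3, 0)]

small_light = [(0, 0, 1.5)]  # reaches the first two tiles
big_light = [(0, 0, 3.5)]    # reaches all four tiles

print(light_passes(tiles, small_light))  # 2
print(light_passes(tiles, big_light))    # 4
```

In this model, growing one light's radius doubles the number of lit draws without adding any geometry.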

TLDR: if the profiler isn’t immediately telling you what the most ms is being spent on at this point, it’s probably just your laptop.

I will stick with forward, it runs fine now.
Thank you for your time, have a great day.

Really high-res normal maps do stress the GPU quite a bit, since they will generate polygons in tangent space even though your model is low poly. The shader also plays a very important role in your FPS, since those are the instructions for the GPU. What type of GPU you have also matters (you said you use an integrated GPU; on laptops those aren’t the best). Yes, forward is the default in Unity and performs best, but it is not the solution to your current issue. Lower the texture size of the normal/displacement maps and see what FPS you get. Also make sure your GPU is set to maximum power in the Windows power management so it doesn’t throttle down.

They aren’t using normal or displacement maps, it’s all polygons.

It’s several 3000-ish poly mesh tiles reused, they have spelled that out in the thread. The screenshot has post processing effects - looks like a chromatic aberration effect?

Post-processing does not generate tangent space.

But you are the only person in the thread talking about tangents.

I am sure Pagi can explain what was done in that screenshot himself. When he says flat low poly and posts a screenshot with extruded stones, that can only mean normal mapping or displacement. Otherwise it’s not flat low poly… it’s only logical…

If he did use mesh geometry, he should consider normal mapping instead, since the GPU has optimizations for this in the pixel shader.

I hope his performance will be better with some texture handling. Looking good so far.

I want to point out that on low-end GPU hardware it’s better to simulate things with textures than with geometry. Programs like 3D Coat can render geometry into normal maps and triple the performance by using low-poly objects with a normal map baked from a high-polygon object.

This is pretty much standard nowadays…

Yes these are basic models sculpted and decimated in blender. Colors by UVs from color palette, no normal maps. I know I could bake details into normal map, but it just isn’t the same, I want real 3D details if it is possible.

Hi Pagi, I understand. Then the performance is pretty much self-explanatory and there is very little you can do in Unity (other than turning off shadows and using forward rendering). Textures/materials will not change things for the better. The GPU renders all geometry that is within the frustum; you can also try to cull objects by distance if your whole scene is based on high-polygon objects. Also try another computer, maybe it’s just the GPU you’re currently using.

That said, I think you’d be surprised how well a normal map can simulate geometry (try a displacement map), since you actually get increased polys from the texture in tangent space. It’s just a lot faster :)

You can also bake geometry objects to increase performance.

Here you see an object that is low poly on the left, and the same model on the right with a normal map based on the high-polygon object (they pretty much look exactly the same).
http://wiki.polycount.com/wiki/Texture_Baking

Higher polycount = lower framerate, on any GPU. There are more than 3000 in that scene; if you look closely he has copied objects around, so I guess it’s 3000 per object, multiplied by the number of objects.

It was 1.6m tris so yeah…
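
For scale, a quick Python back-of-the-envelope using the two figures from the thread (3000 triangles per tile, 1.6m triangles reported). It assumes every triangle is counted once; the reported figure may also include extra passes for lights and shadows, so the real tile count could be lower.

```python
TRIS_PER_TILE = 3000  # per-tile figure from the thread


def scene_triangles(tile_count, tris_per_tile=TRIS_PER_TILE):
    """Total triangles if every tile is drawn once."""
    return tile_count * tris_per_tile


def tiles_for_budget(triangle_budget, tris_per_tile=TRIS_PER_TILE):
    """How many whole tiles fit into a triangle budget."""
    return triangle_budget // tris_per_tile


print(scene_triangles(100))         # 300000
print(tiles_for_budget(1_600_000))  # 533
```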

Culling is also pretty important.
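
A minimal Python sketch of what distance culling buys, assuming tiles are points on a 2D plane and a hypothetical `max_distance` cutoff around the camera. This is only the arithmetic, not how Unity performs culling.

```python
import math

TRIS_PER_TILE = 3000  # per-tile figure from the thread


def visible_tiles(tiles, camera, max_distance):
    """Keep only the tiles within max_distance of the camera position."""
    cx, cy = camera
    return [t for t in tiles if math.hypot(t[0] - cx, t[1] - cy) <= max_distance]


# Ten tiles in a row, one unit apart (hypothetical corridor).
corridor = [(x, 0) for x in range(10)]

kept = visible_tiles(corridor, camera=(0, 0), max_distance=3)
print(len(corridor) * TRIS_PER_TILE)  # 30000 triangles without culling
print(len(kept) * TRIS_PER_TILE)      # 12000 triangles with the cutoff
```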