Vertex Animation Shader Performance (tree vegetation)

Hi, my problem is not the shader itself but its performance, which is horrible. For example, I have a scene with custom trees that are vertex-animated for wind.

Is it useful to put the animation parameters in a cginc file? Or how else can I get a good performance boost, for example by animating the parameters in C# or something similar?

Thanks, Dan


Shader "TreeWind/Wind_Main" {

    Properties {

        _MainTex ("Main Texture", 2D) = "white" {}
        _BumpMap ("Normalmap", 2D) = "bump" {}
        _Color ("Color", Color) = (1,1,1,1)
        _AlphaCutoff ("Alpha Cutoff", Range(0,1)) = 0.03

        _wind_dir ("Wind Direction", Vector) = (0.5,0.05,0.5,0)

        _tree_sway ("Tree Sway Offset", Range(0,1)) = 0.2
        _tree_sway_disp ("Tree Sway Displacement", Range(0,1)) = 0.3
        _tree_sway_speed ("Tree Sway Speed", Range(0,10)) = 1
        _wind_size ("Wind Wave Size", Range(5,50)) = 15
    }

    SubShader {
        Tags { }    // the tag values are missing from the post
        LOD 400
        Cull Off

        CGPROGRAM
        #include "TerrainEngine.cginc"
        #pragma target 5.0

        #pragma surface surf Lambert vertex:vert addshadow fullforwardshadows

        // Declared variables (must match the property names above)

        float4 _wind_dir;
        float _wind_size;
        float _tree_sway_speed;
        float _tree_sway_disp;
        float _tree_sway;
        fixed _AlphaCutoff;

        sampler2D _BumpMap;
        sampler2D _MainTex;

        fixed4 _Color;

        struct Input {
            float2 uv_MainTex;
            float2 uv_BumpMap;
            float3 viewDir;
        };

        // Vertex manipulation function

        void vert (inout appdata_full i) {

            // Gets the vertex's world position

            float3 worldPos = mul (unity_ObjectToWorld, i.vertex).xyz;

            // Tree movement and wiggle

            i.vertex.x += (cos(_Time.z * _tree_sway_speed + (worldPos.x/_wind_size) + (sin(_Time.z * _tree_sway * _tree_sway_speed + (worldPos.x/_wind_size)) * _tree_sway) ) + 1)/2 * _tree_sway_disp * _wind_dir.x * (i.vertex.y / 10) + 
            cos(_Time.w * i.vertex.x * _tree_sway + (worldPos.x/_wind_size)) * _tree_sway * _wind_dir.x * i.color.b;

            i.vertex.z += (cos(_Time.z * _tree_sway_speed + (worldPos.z/_wind_size) + (sin(_Time.z * _tree_sway * _tree_sway_speed + (worldPos.z/_wind_size)) * _tree_sway) ) + 1)/2 * _tree_sway_disp * _wind_dir.z * (i.vertex.y / 10) + 
            cos(_Time.w * i.vertex.z * _tree_sway + (worldPos.x/_wind_size)) * _tree_sway * _wind_dir.z * i.color.b;

            i.vertex.y += cos(_Time.z * _tree_sway_speed + (worldPos.z/_wind_size)) * _tree_sway_disp * _wind_dir.y * (i.vertex.y / 10);

            // Branches movement

            i.vertex.y += sin(_Time.w * _tree_sway_speed + _wind_dir.x + (worldPos.z/_wind_size)) * _tree_sway  * i.color.r;
        }

        // Surface shader

        void surf (Input IN, inout SurfaceOutput o) {
            fixed4 c = tex2D(_MainTex, IN.uv_MainTex) * _Color;
            o.Albedo = c.rgb;
            o.Alpha = c.a;
            o.Normal = UnpackNormal(tex2D(_BumpMap, IN.uv_MainTex));
            // Alpha cutout: discard pixels below the cutoff
            clip(tex2D(_MainTex, IN.uv_MainTex).a * c.a - _AlphaCutoff);
        }
        ENDCG
    }

    Fallback "Legacy Shaders/Bumped Diffuse"
}

Does disabling the vertex animation improve framerate? If not, how many trees do you have and how many vertices are you rendering? Each vertex you are trying to render will need to run the vertex shader code. More vertices = less performance.

A cginc does nothing for performance. It exists so you don’t have to rewrite code that you want to share between different shaders. The contents of an included file are pasted into the shader before it is compiled, and every shader gets compiled down to GPU code before it runs, so the GPU has no concept of a cginc file.
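To make that concrete, here is a minimal sketch of what a shared include could look like. The file name TreeWind.cginc and the helper WindPhase are made up for illustration; they do not come from the posted shader or from Unity.

```hlsl
// TreeWind.cginc (hypothetical file name): code shared between shaders
#ifndef TREE_WIND_INCLUDED
#define TREE_WIND_INCLUDED

// Hypothetical helper: per-vertex wind phase from a world-space position.
float2 WindPhase (float3 worldPos, float windSize)
{
    return worldPos.xz / windSize;
}

#endif

// Usage inside any shader's CGPROGRAM block:
//     #include "TreeWind.cginc"
//     float2 phase = WindPhase(worldPos, _wind_size);
```

Since the include is resolved before compilation, the result is the same as if WindPhase had been typed directly into each shader, so this helps with maintenance and not with speed.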

If you post the vertex shader code, I can tell you if there are any optimizations you can make.

The vertex function is heavily math-bound; the rest is pretty “normal”. So the math on each vertex is what is killing your performance.


  1. Precompute values rather than recomputing them. Make a work vector for worldPos / _wind_size and do the divide with vector math, then use that result instead of computing each component separately (vector math is faster). You use the expression (worldPos.x/_wind_size) several times; compute it once.

  2. Consider using a lookup table for sin and cos if you think it will help.

  3. The same applies to other parts: i.vertex.y / 10 appears several times, so do the math once and reuse the resulting variable. Maybe _Time.z * _tree_sway_speed could be passed in once as a uniform, for example. Compute things in C# if they can be set once per frame, and compute them once in the shader if you must (vertex-specific values). Never compute the same thing multiple times. Some compilers may optimize the repeats out, but some may not, so do it yourself. Also, where possible, multiply first and then add, so the operation maps to a single multiply-add (MAD) instruction.

  4. Consider a completely different approach that computes a result for all trees in a table and then just looks up each individual tree’s values. For example, there could be 10 variations, with each tree assigned one of the 10 states, like a flipbook where you only use an index. Then you don’t have to do the math N times (if you have 1000 trees with 7000 verts each, that’s a lot of math).

  5. Make sure you cull trees you don’t need (you indicated billboards, good job).

Just some thoughts. :)
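Points 1 and 3 could look roughly like the sketch below. It is meant as a drop-in replacement for the vert function in the posted shader and relies on the uniforms declared there (_wind_dir, _wind_size, _tree_sway, _tree_sway_disp, _tree_sway_speed). The local names (phase, t, height, and so on) are my own. It should produce the same displacement as the original apart from floating-point rounding, but it has not been profiled, and the shader compiler may already remove some of the repeated expressions on its own, so measure before and after.

```hlsl
// Sketch: same wind math as the posted vert, with shared terms computed once.
void vert (inout appdata_full i) {
    float3 worldPos = mul (unity_ObjectToWorld, i.vertex).xyz;

    // Terms the original recomputed several times.
    float2 phase  = worldPos.xz / _wind_size;    // phase.x: world x, phase.y: world z
    float  t      = _Time.z * _tree_sway_speed;  // shared time factor
    float  height = i.vertex.y * 0.1;            // replaces (i.vertex.y / 10)

    // Tree movement: x and z handled together as one float2.
    float2 inner = sin(t * _tree_sway + phase) * _tree_sway;
    float2 sway  = (cos(t + phase + inner) + 1.0) * 0.5 * (_tree_sway_disp * height);

    // Wiggle, weighted by the blue vertex color. As in the original,
    // both axes use the world x phase.
    float2 wiggle = cos(_Time.w * _tree_sway * i.vertex.xz + phase.x) * (_tree_sway * i.color.b);

    // Vertical terms: trunk bob plus branch movement (red vertex color).
    float bob    = cos(t + phase.y) * _tree_sway_disp * _wind_dir.y * height;
    float branch = sin(_Time.w * _tree_sway_speed + _wind_dir.x + phase.y) * _tree_sway * i.color.r;

    i.vertex.xz += (sway + wiggle) * _wind_dir.xz;
    i.vertex.y  += bob + branch;
}
```

All offsets are computed from the unmodified vertex position before anything is written back, which matches the original, where each right-hand side only read components that had not been changed yet.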

The best ones I have seen use a cginc, and they also work with a wind zone.