TerrainData.GetSteepness Implementation

Does anyone know how GetSteepness is internally implemented? (Docs: Unity - Scripting API: TerrainData.GetSteepness)

I’m looking to call GetSteepness for a large number of heightmap points and the invocation cost of terrainData.GetSteepness is high due to crossing the C#/C++ code boundary.

Since I create terrain procedurally I already have height data. I’m hoping to calculate steepness from this height data.

From what I am learning, Unity generates an interpolated normal map using the following shader:

GitHub: unity-builtin-shaders/DefaultResourcesExtra/TerrainShaders/Utils/GenNormalmap.shader at master · hyperfact/unity-builtin-shaders (github.com)

Original Location: https://download.unity3d.com/download_unity/17028576122a/builtin_shaders-6000.0.5f1.zip

Using an array of normal map values, GetSteepness is then determined by …

float steepness = Dot(GetInterpolatedNormal(x, y), Vector3f(0.0F, 1.0F, 0.0F));
steepness = Rad2Deg(acos(steepness));
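
For readers who want to reproduce that last step on the C# side, here is a minimal sketch of the angle calculation. It uses plain .NET (System.MathF) rather than UnityEngine.Mathf so it runs outside Unity; the class and method names are made up for illustration and are not part of Unity's API.

```csharp
using System;

public static class SteepnessFromNormal
{
    // Angle in degrees between a surface normal and world up (0, 1, 0).
    // The normal does not need to be normalized beforehand.
    public static float Degrees(float nx, float ny, float nz)
    {
        float length = MathF.Sqrt(nx * nx + ny * ny + nz * nz);
        if (length <= 0f) return 0f; // degenerate normal: treat as flat

        // Dot(n, up) reduces to the y component of the unit normal.
        float cosAngle = Math.Clamp(ny / length, -1f, 1f);
        return MathF.Acos(cosAngle) * (180f / MathF.PI);
    }
}
```

A normal pointing straight up gives 0 degrees and a horizontal normal gives 90 degrees.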

The shader input parameters are as follows:

uniform float4 _MainTex_ST;
uniform float4 _TerrainNormalmapGenSize;  // (1.0f / (float)m_Width, 1.0f / (float)m_Height, 1.0f / hmScale.x, 1.0f / hmScale.z);
uniform float4 _TerrainTilesScaleOffsets[9]; // ((65535.0f / kMaxHeight) * hmScale.y, terrainPosition.y, 0.0f, 0.0f)

It looks like I am almost there. Does anyone know what should be contained in the _TerrainTilesScaleOffsets array?

This isn’t correct, but it is closer.

        Vector3 terrainSize = terrainData.size;
        Vector4 normalMapGenSize = new Vector4(
            1.0f / terrainData.heightmapResolution, 
            1.0f / terrainData.heightmapResolution,
            1.0f / terrainSize.x, 
            1.0f / terrainSize.z
        );
        normalMapMaterial.SetVector("_TerrainNormalmapGenSize", normalMapGenSize);

        Vector4[] terrainTilesScaleOffsets = new Vector4[9];
        for (int i = 0; i < terrainTilesScaleOffsets.Length; i++)
        {
            terrainTilesScaleOffsets[i] = new Vector4(terrainSize.y / 65535.0f, terrain.GetPosition().y, 0, 0);
        }
        normalMapMaterial.SetVectorArray("_TerrainTilesScaleOffsets", terrainTilesScaleOffsets);

The offsets appear to be for neighboring tiles… Any feedback on whether this is close, and on what order the offsets should be in, is greatly appreciated.

Does your application need very precise knowledge of the normals at arbitrary points? Or is it enough just to get the surface normal corresponding to the height map points you already know from your procedural generation?

If it’s the latter, since you know the adjacent height values, you can use the formula here and then take the acos of the Z component from the result. That would give you a “steepness map” corresponding to your height map.
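
The linked formula is not reproduced in this thread, so the following is only a sketch of one common way to do it: central differences on the neighboring heights, with one-sided differences at the borders. It assumes heights are already in world units and that the sample spacing is known; SteepnessMap and its parameters are hypothetical names.

```csharp
using System;

public static class SteepnessMap
{
    // heights[x, z] are world-space heights; spacingX and spacingZ are the
    // world-space distances between neighboring samples.
    public static float[,] Build(float[,] heights, float spacingX, float spacingZ)
    {
        int w = heights.GetLength(0), d = heights.GetLength(1);
        if (w < 2 || d < 2) throw new ArgumentException("Need at least 2x2 samples.");

        var result = new float[w, d];
        for (int z = 0; z < d; z++)
        for (int x = 0; x < w; x++)
        {
            // Clamp neighbor indices so border texels use a one-sided difference.
            int x0 = Math.Max(x - 1, 0), x1 = Math.Min(x + 1, w - 1);
            int z0 = Math.Max(z - 1, 0), z1 = Math.Min(z + 1, d - 1);
            float dhdx = (heights[x1, z] - heights[x0, z]) / ((x1 - x0) * spacingX);
            float dhdz = (heights[x, z1] - heights[x, z0]) / ((z1 - z0) * spacingZ);

            // The unnormalized normal is (-dhdx, 1, -dhdz). The acos of its
            // normalized up component equals atan of the gradient length.
            float slope = MathF.Sqrt(dhdx * dhdx + dhdz * dhdz);
            result[x, z] = MathF.Atan(slope) * (180f / MathF.PI);
        }
        return result;
    }
}
```

A plane that rises one unit per unit of distance comes out as 45 degrees everywhere, which is a quick sanity check before comparing against GetSteepness.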

With that in hand, you’d need to figure out whether a simple lookup into that is enough or if you also need to interpolate through those steepness values. The good answer there will depend a bit on the use case because you could get fast lookups by oversampling the data ahead of time, or you could save memory by doing the interpolation at runtime – which route to pursue is totally application dependent. But the interpolation there is probably fine as a plain old bilinear.
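
A plain bilinear lookup into such a map could look like the sketch below. The names are hypothetical, and u and v are assumed to be normalized 0..1 coordinates across the tile.

```csharp
using System;

public static class BilinearSampler
{
    // Samples map[x, z] at normalized coordinates (u, v) in [0, 1].
    public static float Sample(float[,] map, float u, float v)
    {
        int w = map.GetLength(0), d = map.GetLength(1);
        float fx = Math.Clamp(u, 0f, 1f) * (w - 1);
        float fz = Math.Clamp(v, 0f, 1f) * (d - 1);

        int x0 = (int)fx, z0 = (int)fz;
        int x1 = Math.Min(x0 + 1, w - 1), z1 = Math.Min(z0 + 1, d - 1);
        float tx = fx - x0, tz = fz - z0;

        // Interpolate along x on both rows, then along z between the rows.
        float a = map[x0, z0] + (map[x1, z0] - map[x0, z0]) * tx;
        float b = map[x0, z1] + (map[x1, z1] - map[x0, z1]) * tx;
        return a + (b - a) * tz;
    }
}
```

Note that interpolating steepness angles is not identical to taking the steepness of an interpolated normal, which is what the GetSteepness snippet earlier in the thread does, so small differences from Unity's values are to be expected.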

In any case, if you don’t have multiple terrain tiles, using the normal-from-heightmap check is pretty simple: the only bit of extra work is deciding how to handle the edges, where it probably makes the most sense just to extend the values from one rank in. If you do have multiple terrain tiles, you’ll need to get the correct heights from the adjacent tiles. You can skip all the math about scales and so on, since the end result will just be an array mapping “height in this terrain texel” to “normal in this terrain texel”. At runtime you’ll scale to find the correct texel lookup, but that’s a simple scale-and-offset based on the position and size of the terrain.
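
The runtime scale-and-offset can be sketched per axis as follows. This is an illustration only: the helper name is made up, and it assumes the map has one sample at each end of the tile (so resolution - 1 intervals), with positions outside the tile clamped to the border texel.

```csharp
using System;

public static class TerrainTexelLookup
{
    // Maps a world-space coordinate along one axis to the nearest texel index.
    // terrainOrigin is the tile's position on that axis, terrainSize its extent.
    public static int ToTexel(float world, float terrainOrigin, float terrainSize, int resolution)
    {
        float normalized = (world - terrainOrigin) / terrainSize;     // 0..1 across the tile
        int texel = (int)MathF.Round(normalized * (resolution - 1));  // nearest sample
        return Math.Clamp(texel, 0, resolution - 1);                  // extend the border outward
    }
}
```

Calling it once for x and once for z gives the index pair for the lookup. With multiple tiles, you would pick the tile first and then pass that tile's own origin.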

Hi @SteveTheodore,

These are great questions.

  • I’m using multiple terrain tiles.
  • The interpolated normal positions are arbitrary and not always aligned with heightmap coordinates.

I found C# logic for calculating a Sobel filter from a normalized position using height values. It returns nearly identical results to Unity’s existing GetSteepness() method – yay!
…but it is 4x slower than existing Unity logic.
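
The C# logic itself is not posted in the thread, so the block below is only a generic sketch of a 3x3 Sobel gradient evaluated at a heightmap texel, not the code referred to above. It assumes world-space heights and known sample spacing, and all names are hypothetical.

```csharp
using System;

public static class SobelSteepness
{
    // Steepness in degrees at texel (x, z) using a 3x3 Sobel kernel.
    public static float Degrees(float[,] heights, int x, int z, float spacingX, float spacingZ)
    {
        int w = heights.GetLength(0), d = heights.GetLength(1);

        // Clamped fetch: border texels reuse their nearest in-range neighbor,
        // which underestimates the slope along the edges.
        float H(int ix, int iz) => heights[Math.Clamp(ix, 0, w - 1), Math.Clamp(iz, 0, d - 1)];

        float gx = (H(x + 1, z - 1) + 2f * H(x + 1, z) + H(x + 1, z + 1))
                 - (H(x - 1, z - 1) + 2f * H(x - 1, z) + H(x - 1, z + 1));
        float gz = (H(x - 1, z + 1) + 2f * H(x, z + 1) + H(x + 1, z + 1))
                 - (H(x - 1, z - 1) + 2f * H(x, z - 1) + H(x + 1, z - 1));

        // Each side of the kernel has total weight 4 and the sides are two
        // texels apart, so dividing by 8 * spacing yields height per unit distance.
        float dhdx = gx / (8f * spacingX);
        float dhdz = gz / (8f * spacingZ);
        return MathF.Atan(MathF.Sqrt(dhdx * dhdx + dhdz * dhdz)) * (180f / MathF.PI);
    }
}
```

Each call reads eight neighbors and does a square root and an arctangent, which may be part of why a per-point version in C# struggles to beat the native call.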

I’m on board with creating a steepness map! I’m attempting to use the Unity GenNormalMap shader (above). It’s running but the values I am getting back are incorrect.

I may be using incorrect constants, or an incorrect order for terrain tile scale offsets. Creating 25 pre-generated normal maps takes 230ms – for 25 square miles of terrain. This is definitely ideal, but I need to actually use the shader correctly.

Do you or your team use this Unity shader? I’m supplying normalMapGenSize as

  Vector4 normalMapGenSize = new Vector4(
      1.0f / terrainData.heightmapResolution,
      1.0f / terrainData.heightmapResolution,
      1.0f / terrainSize.x,
      1.0f / terrainSize.z
  );

and _TerrainTilesScaleOffsets as

  Vector4[] terrainTilesScaleOffsets = new Vector4[9];
  terrainTilesScaleOffsets[0] = CalculateTileScaleOffset(terrain.bottomNeighbor?.leftNeighbor);  // BL
  terrainTilesScaleOffsets[1] = CalculateTileScaleOffset(terrain.bottomNeighbor);                // B
  terrainTilesScaleOffsets[2] = CalculateTileScaleOffset(terrain.bottomNeighbor?.rightNeighbor); // BR
  terrainTilesScaleOffsets[3] = CalculateTileScaleOffset(terrain.leftNeighbor);                  // L
  terrainTilesScaleOffsets[4] = CalculateTileScaleOffset(terrain);                               // C
  terrainTilesScaleOffsets[5] = CalculateTileScaleOffset(terrain.rightNeighbor);                 // R
  terrainTilesScaleOffsets[6] = CalculateTileScaleOffset(terrain.topNeighbor?.leftNeighbor);     // TL
  terrainTilesScaleOffsets[7] = CalculateTileScaleOffset(terrain.topNeighbor);                   // T
  terrainTilesScaleOffsets[8] = CalculateTileScaleOffset(terrain.topNeighbor?.rightNeighbor);    // TR
  NormalMapGenerationMaterial.SetVectorArray("_TerrainTilesScaleOffsets", terrainTilesScaleOffsets);

CalculateTileScaleOffset is

  float kMaxHeight = 32766;
  return new Vector4(65535.0f / kMaxHeight * terrainSize.y, terrainPosition.y, 0, 0);

I then convert to an array of Vector3 normals, via the following

private Vector3[,] TextureToNormals(Texture2D texture)
{
    int width = texture.width;
    int height = texture.height;

    Vector3[,] normalsArray = new Vector3[width, height];
    Color[] pixels = texture.GetPixels();

    for (int y = 0; y < height; y++)
        for (int x = 0; x < width; x++)
        {
            Color color = pixels[y * width + x];
            // Remap every channel from [0, 1] back to [-1, 1], including blue.
            normalsArray[x, y] = new Vector3((color.r - 0.5f) * 2.0f, (color.g - 0.5f) * 2.0f, (color.b - 0.5f) * 2.0f);
        }

    return normalsArray;
}

Do you see anything that stands out as incorrect? I haven’t been successful finding an example of how the shader is used.

And after typing all of this up, I realize that I will still have to interpolate if the x/y does not align with the precomputed normals map.

I’m really surprised at how challenging this is! I’m not as capable as I would like to be with Unity’s terrain system.

@SteveTheodore

The reason I am working so hard to speed this up is because I have other aspects of procedural generation working quickly!

The tall pole in the tent is steepness checks for vegetation placement.



Performance metrics:

Create Stamps: 1ms
Create Terrain: 437ms
Create Base Layers: 0ms
Stamp Features: 0ms
Update Heightmaps: 1270ms
Create Metadata: 63ms
Painting: 350ms
Grass: 2141ms -- slow!
Trees: 5049ms -- slow!
Spawn Objects: 0ms
Clean Up: 0ms
Total: 9514ms

Area: 25 sq km
380ms per sq km

Just so I get the problem – is this something that’s editor time or runtime?

It is runtime from start to finish.

unity-builtin-shaders/DefaultResourcesExtra/TerrainShaders/Utils/GenNormalmap.shader at master · hyperfact/unity-builtin-shaders (github.com)

I integrated the GenNormalMap.shader which is part of Unity’s built-in shaders. Assuming I get correct values from the shader, it would decrease computation time from ~7 seconds to ~1.5 seconds.

Any detail on how this shader is used is greatly appreciated.

I’m having difficulty understanding the expected order for the input parameter _TerrainTilesScaleOffsets. The referenced kMaxHeight constant appears to be a Unity constant, but its value is not shown.

...
Grass: 837ms -- faster!
Trees: 621ms -- faster!

Area: 16 sq km
222ms per sq km

kMaxHeight is 32766 (for a variety of reasons this is a 15-bit number).

The comment on _TerrainTilesScaleOffsets is:
uniform float4 _TerrainTilesScaleOffsets[9]; // ((65535.0f / kMaxHeight) * hmScale.y, terrainPosition.y, 0.0f, 0.0f)

It’s for calculating the scaled heightmap value using the funky heightmap number and the Y scale of the terrain and the Y position of the terrain object.
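
To make the arithmetic concrete, here is a sketch of how that scale and offset would turn a raw heightmap value into a world height. Two things are assumptions rather than facts from the shader source: that the shader applies them as sample * x + y to a 0..1 texture sample, and that hmScale.y here is the terrain's full world-space height (as in the C# attempts earlier in the thread). The class and method names are hypothetical.

```csharp
using System;

public static class TerrainHeightScale
{
    public const float MaxRawHeight = 32766f; // kMaxHeight, as stated in the thread

    // x = scale for a 0..1 sample of the 16-bit heightmap, y = terrain Y position.
    public static (float scale, float offset) ScaleOffset(float hmScaleY, float terrainPositionY)
        => (65535f / MaxRawHeight * hmScaleY, terrainPositionY);

    // Assumed usage: worldHeight = sample * scale + offset.
    public static float WorldHeight(ushort raw, float hmScaleY, float terrainPositionY)
    {
        var (scale, offset) = ScaleOffset(hmScaleY, terrainPositionY);
        float sample = raw / 65535f; // what a 16-bit normalized texel reads as
        return sample * scale + offset;
    }
}
```

Under those assumptions, the 65535 / kMaxHeight factor just cancels the texture normalization, so a raw value of 32766 lands exactly at the top of the terrain's height range.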

That shader is used to generate normal maps that are used when instanced rendering is turned on. If you are generating that normal map – I should have called this out before! – then all you need to grab is the ACOS of the up value in that map and you’ve got your mask since it’s already a normalized vector.

Depending on the nature of the algorithm you need for placement, I’d consider moving as much of the logic as possible into a shader which takes that normal map (and maybe the alpha (aka “splat”) maps) to decide what can be placed where. Doing that work on the GPU will give you much better parallelization than doing it on the CPU, since you’re running over big 2-d arrays of data, not C#'s strong point.

To minimize trips across the GPU-CPU boundary, if you did try a GPU-based approach to generating placements you could try 2 things:

  1. A simple approach would generate a 4-channel texture, where the RGB is just the normal map (so you can use that for your ground alignment) and the A is a bitmask – that would let you store on-off values for 8 different types of placements.
  2. A more complex approach would be to generate a new 4-channel texture which encoded valid XY positions and 2-channel normals only for locations where a placement is valid. You’d reconstruct the Z values when doing the actual placement. If you expect less than 50% of the locations in the map to be valid placements this would let you do less of the expensive “copy me to a texture, then make a CPU array out of the texture” path… Unfortunately this is also hard to write without compute because of the indirections involved, I would not try it unless forced to do so.
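
On the CPU side, reading the first layout back could look something like this sketch. The placement type names, the channel order (normal in the low three bytes, mask in the top byte), and the helper names are all assumptions for illustration.

```csharp
using System;

[Flags]
public enum PlacementMask : byte
{
    None  = 0,
    Grass = 1 << 0,
    Shrub = 1 << 1,
    Tree  = 1 << 2,
    Rock  = 1 << 3, // four more bits remain free
}

public static class PlacementTexel
{
    // Encodes one normal component from [-1, 1] into a byte, and back.
    public static byte EncodeNormal(float c)
        => (byte)MathF.Round((Math.Clamp(c, -1f, 1f) * 0.5f + 0.5f) * 255f);

    public static float DecodeNormal(byte b) => b / 255f * 2f - 1f;

    // Packs the normal into the low three bytes and the mask into the top byte.
    public static uint Pack(float nx, float ny, float nz, PlacementMask mask)
        => (uint)EncodeNormal(nx)
         | ((uint)EncodeNormal(ny) << 8)
         | ((uint)EncodeNormal(nz) << 16)
         | ((uint)mask << 24);

    // True if the texel's mask has any of the requested placement bits set.
    public static bool Allows(uint texel, PlacementMask type)
        => ((PlacementMask)(texel >> 24) & type) != 0;
}
```

With this layout a single 32-bit read answers both "can a tree go here?" and "which way is the ground facing?".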

In either case you’d render that into a RenderTexture, then you’d need to do a copy back to a Texture2D on the CPU side, then convert that to a C# array if that’s what you need. Async and using memory copies are your friends there for making it smooth, but it’s a data-intensive operation in any case, so not likely to be lightning-fast.

I should have called out also: doing this:

Vector3[,] normalsArray = new Vector3[width, height];
Color[] pixels = texture.GetPixels();

will be slow because of both the GetPixels() call and the creation of a new 2D array. I’d recommend checking out this article for how to speed up that readback. In particular, you may want to use Texture2D.GetRawTextureData, which won’t allocate, if you don’t need to keep your steepness array around after doing your placements. RawTextureData is a bit of a pain because it uses NativeArrays, but it can be worth it for the scale you’re working at. It also makes the work Burst-friendly, which might be helpful in a parallel-heavy space like this where you could process lots of potential placements simultaneously.
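
As a rough sketch of what processing the raw data directly might look like, the helper below walks a flat RGBA32 byte buffer and writes steepness into a caller-owned array, with no Color or Vector3 intermediates. It uses a plain byte[] so it runs outside Unity; inside Unity the NativeArray<byte> from GetRawTextureData<byte>() would take its place. It assumes the texture is RGBA32 and that the up component is stored in the green channel as n * 0.5 + 0.5. The names are hypothetical.

```csharp
using System;

public static class RawNormalmapReader
{
    // rgba: tightly packed RGBA32 texels, 4 bytes each.
    // steepnessDegrees: caller-owned output with one entry per texel, so it
    // can be reused across tiles instead of being reallocated.
    public static void FillSteepness(byte[] rgba, float[] steepnessDegrees)
    {
        if (rgba.Length != steepnessDegrees.Length * 4)
            throw new ArgumentException("Expected 4 bytes per output texel.");

        for (int i = 0; i < steepnessDegrees.Length; i++)
        {
            // Green channel (assumed to hold the up component), remapped to [-1, 1].
            float up = rgba[i * 4 + 1] / 255f * 2f - 1f;
            steepnessDegrees[i] = MathF.Acos(Math.Clamp(up, -1f, 1f)) * (180f / MathF.PI);
        }
    }
}
```

Because the loop body is independent per texel, it is the kind of code that could later be moved into a parallel job if the single-threaded version is still too slow.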

Hey Steve!

Thank you for the amazing assistance. My journey has been to make every mistake along the way – profiling and benchmarking each technique as I climb a ladder towards better performance!

I am now on a path to using a shader to create a mask which computes height, slope, and random placement via noise.

You have answered all of my questions and helped me succeed with the prior techniques.

A few thoughts –

  • You were right on memory vs speed considerations. Using a shader to create maps and store them was not a viable solution due to high memory consumption (1 GB for 25 sq km).

  • And you were spot on that there is no way to squeeze performance out of anything that requires a C# n^2 loop with detail maps 512x512 and larger.

  • I had to fail with each step along the way to graduate to the understanding that I need to offload this work to a shader.

My goal is to generate 25 square km of terrain (shown below) in 3.5 seconds. I’m at 5.4 seconds right now.

Thank you for the amazing assistance!

For any geometric calculation problem like this, one thing you should look for is the harmonics – ie, at what scale does your problem really operate?

5x5 km is a lot of data at typical game res (say, 1m terrain quads). However, depending on your algorithm of course, it might be that the actual frequency of the data is pretty consistent on a scale of 2 or even 4 meters. So maybe you can run two passes: the first on the second mip of your height-and-normal map, which is 25% of the memory cost, and then a slower but more exact version only in areas where the low-res check indicates the possibility of success.
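
Here is a CPU-side sketch of that two-pass idea. One swapped-in detail: instead of an ordinary averaged mip it uses a "min mip" (the lowest steepness in each 2x2 block), so the coarse pass never rejects a block that contains a valid texel. The names and the steepness-threshold test are assumptions for illustration.

```csharp
using System;

public static class CoarseToFine
{
    // Half-resolution map holding the minimum steepness of each 2x2 block.
    // A trailing odd row or column is ignored in this sketch.
    public static float[,] MinMip(float[,] steepness)
    {
        int w = steepness.GetLength(0) / 2, d = steepness.GetLength(1) / 2;
        var mip = new float[w, d];
        for (int z = 0; z < d; z++)
        for (int x = 0; x < w; x++)
            mip[x, z] = MathF.Min(
                MathF.Min(steepness[2 * x, 2 * z], steepness[2 * x + 1, 2 * z]),
                MathF.Min(steepness[2 * x, 2 * z + 1], steepness[2 * x + 1, 2 * z + 1]));
        return mip;
    }

    // Counts fine texels at or below maxDegrees, skipping every 2x2 block
    // that the coarse pass has already ruled out.
    public static int CountCandidates(float[,] steepness, float maxDegrees)
    {
        var mip = MinMip(steepness);
        int count = 0;
        for (int z = 0; z < mip.GetLength(1); z++)
        for (int x = 0; x < mip.GetLength(0); x++)
        {
            if (mip[x, z] > maxDegrees) continue; // no texel in this block can pass
            for (int dz = 0; dz < 2; dz++)
            for (int dx = 0; dx < 2; dx++)
                if (steepness[2 * x + dx, 2 * z + dz] <= maxDegrees) count++;
        }
        return count;
    }
}
```

In this sketch the mip is built from the full-resolution data, which only demonstrates the structure. The saving comes when the coarse map is generated directly at low resolution and the expensive fine evaluation runs only for blocks that survive.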

If this is expected to run at runtime, maybe you should look at ways to reduce the memory bandwidth of the generation algorithm itself. For example, we know the output is limited to the 15-bit resolution of a Unity height map. Generating the original heights as halfs rather than floats is already half the work, but if you could do an 8-bit integer algorithm instead and generate your placement data off of that, you’re doing 1/4 of the original work. Then your 1 GB map is a 256 MB map instead. Once you have the key work done, you could do something like a smooth upscale with a little extra noise to disguise the quantization.
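
The memory arithmetic can be checked with a small sketch. The quantization helpers assume heights normalized to 0..1, and the 16384 x 16384 resolution in the usage note is just an illustrative figure that happens to give 1 GiB at 4 bytes per texel. All names are hypothetical.

```csharp
using System;

public static class HeightQuantization
{
    // Maps a normalized height in [0, 1] to one byte (256 levels).
    public static byte Quantize(float height01)
        => (byte)MathF.Round(Math.Clamp(height01, 0f, 1f) * 255f);

    // Maps a stored byte back to a normalized height.
    public static float Dequantize(byte q) => q / 255f;

    // Storage needed for a square map at a given number of bytes per texel.
    public static long MapBytes(int resolution, int bytesPerTexel)
        => (long)resolution * resolution * bytesPerTexel;
}
```

MapBytes(16384, 4) is 1,073,741,824 bytes for 32-bit floats, while MapBytes(16384, 1) is 268,435,456 bytes for 8-bit values, the same 4x reduction described above. The cost is a quantization step of 1/255 of the height range, which is what the upscale-plus-noise pass would need to hide.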

Ultimately all perf is down to how much memory you are moving around. Reducing N^2 dimensions and/or reducing bitdepths is the key to perf improvement. Clever packing of textures to get more info across the CPU-GPU boundary is really the same thing (that’s why I suggested you could set flags on 8 different vegetation types into one channel of a 4-channel texture).

Good luck with it!