How can I use feature recognition on terrain?

Dear readers, I am currently working on my master thesis and for it I need to use real world height data. I have been able to use the height data and put it in a terrain, however I need to take this terrain and get interesting objects like buildings and trees etc out of the terrain, into a 3d object with a mesh and collider. I can of course see for myself that something is a building or not, but I need to do this automatically by c# code. I have added a screenshot of what the terrain looks like. All help is greatly appreciated! If you have any questions please ask!

Your screenshot isn’t showing up — did you forget to include it?

So what you’re attempting to do is actually very very difficult. If you break it down into a two-step problem, you first have to detect and localize objects (like trees), and then you have to isolate them (separate the particular tree mesh from everything else). If you have detection, mesh isolation is going to be a little easier.

For detection, state-of-the-art is with neural networks. You can take one of two approaches: localization or segmentation. I’d probably use a top-down view of the rendered terrain, if you have color data as well, or a heightmap, which should also work fine. Localization involves the neural network simply drawing (2D) boxes around the objects, while segmentation involves the neural network identifying every pixel in the image that is part of that object. Localization is much easier to train, but segmentation is going to give you better results.

From there, some relatively simple mesh filtering should get you the object from the ground/terrain. Overall, though, it’s pretty difficult and I’d say not entirely worth it. That’s also not going to run in pure C# for sure, you’d need something like Tensorflow as well.

If you’re interested in using neural networks (and honestly if there’s no other way around this component of the project I’d choose a different thesis unless you already have machine learning experience, it would take a lot of effort to learn), you’d probably want to use the SSD-Resnet model or similar for localization, and either the CRF-RNN or FCN-8 for segmentation.