Hello,

I’m a student looking to learn to code and ran into some performance problems for one of the first times. The problem line is a line that tests if a test location is within about two units of any point within a point cloud. This point cloud is for almost all intents and purposes, random and does not fit any grid. These points are stored as Vector3s in a very large Hashset. The search is just a foreach loop that iterates the entire set looking for a point within the appropriate distance. This search occurs quite frequently so any inefficiency really causes issues. It should be noted that the point cloud will also be adding new points with about the same frequency.

It feels like there is a large amount of wasted resources here looking at points that aren’t even close to the test location. What kind of data structure might be more effective for this kind of application?

Thank you for your time.

If your already generated points in the pointcloud are static (do not move) the best solution would be an octree. To minimize the cost of further subdividing the tree when new points are added you may store several points within a leave node until the count reaches a certain limit (for example 10). Once that is reached you just subdivide that node further down.

Another solution may be a BSP tree. However depending on your data the sub dividing and choosing proper division planes could be tricky.

Octrees are usually easier to handle as it’s still a regular grid, just with variing density. To determine the leave nodes you need to check for a certain location you would do an AABB test between the node’s bounding box and your test sphere’s bounding box (in your case a sphere with radius 2). Once you have collected all leave nodes you can just test against the points in those nodes the usual way.

Just as a performance tip: avoid using to many of the built-in operators of the vector structs. Manual inlining can give you quite a speed boost. Though it depends on the situation. Over here the inlined code runs 8 times faster than the straight forward implementation. Note that the unoptimised code doesn’t use any heavy methods. They are all simple linear functions or conditional statements, yet the inlined code runs way faster.