Job that iterates in parallel all *unique keys* from a NativeMultiHashMap?

What is the cleanest way of doing this? I am currently trying to use GetUniqueKeyArray and feeding that to an IJobParallelFor with the length of the unique keys (untested). Is there a better/cleaner way to accomplish this?
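Roughly what I have so far (untested; job and variable names are hypothetical, and I'm assuming the pre-1.0 Unity.Collections API where GetUniqueKeyArray returns a tuple of the sorted key array plus the count of unique keys):

```csharp
using Unity.Collections;
using Unity.Jobs;

// Hypothetical job that receives the unique keys as a plain array
// and looks each key's values back up in the multi hash map.
public struct VisitUniqueKeysJob : IJobParallelFor
{
    [ReadOnly] public NativeArray<int> UniqueKeys;
    [ReadOnly] public NativeMultiHashMap<int, float> Map;

    public void Execute(int index)
    {
        var key = UniqueKeys[index];
        if (Map.TryGetFirstValue(key, out var value, out var it))
        {
            do
            {
                // ... process each value for this key ...
            }
            while (Map.TryGetNextValue(out value, ref it));
        }
    }
}

// Usage sketch. Only the first `uniqueCount` entries of the returned
// array are valid unique keys, so schedule with that length:
// var (keys, uniqueCount) = map.GetUniqueKeyArray(Allocator.TempJob);
// new VisitUniqueKeysJob { UniqueKeys = keys, Map = map }
//     .Schedule(uniqueCount, 64).Complete();
```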

Thanks.

One way I’ve had success with:

At the same time that you write a key/value pair to the NativeMultiHashMap, also TryAdd the key to a hashset (represented by a NativeHashMap<TKey, bool>). You can safely do this from multiple threads.

Then, in your job that needs to iterate, iterate over just the hashset keys and pull all the values associated with each key from the NativeMultiHashMap.

Optionally, for the hashset, you could make your own empty value struct instead of using a bool. That data isn't used either way, since you only care about the keys.

You could potentially also use a NativeQueue or NativeStream for this purpose, depending on how you need to iterate. In those cases, you would still call TryAdd() on the hashset first, but you'd only be using it to check whether you'd already added that particular key.
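A minimal sketch of the write side, with hypothetical job and field names (depending on your Collections version, the parallel writer type may be .Concurrent rather than .ParallelWriter):

```csharp
using Unity.Collections;
using Unity.Jobs;

// Hypothetical producer job: every pair goes into the multi hash map,
// and the key also goes into the "hashset" (NativeHashMap<TKey, bool>).
public struct BuildMapJob : IJobParallelFor
{
    [ReadOnly] public NativeArray<int> Keys;
    [ReadOnly] public NativeArray<float> Values;

    public NativeMultiHashMap<int, float>.ParallelWriter Map;
    public NativeHashMap<int, bool>.ParallelWriter UniqueKeys;

    public void Execute(int index)
    {
        Map.Add(Keys[index], Values[index]);

        // TryAdd is safe from multiple threads; it simply returns false
        // if another thread (or an earlier iteration) already added this key.
        UniqueKeys.TryAdd(Keys[index], false);
    }
}
```

The consuming job then iterates the UniqueKeys entries and pulls each key's values out of Map with TryGetFirstValue/TryGetNextValue.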


Wrote a custom job for this at one point:

/// <summary>
/// Job that visits each unique key in a <see cref="NativeMultiHashMap{TKey,TValue}"/>.
/// </summary>
/// <typeparam name="TKey">The key type of the hash map.</typeparam>
/// <typeparam name="TValue">The value type of the hash map.</typeparam>
[JobProducerType(typeof(JobNativeMultiHashMapVisitUniqueKey.NativeMultiHashMapVisitUniqueKeyJobStruct<,,>))]
[SuppressMessage("ReSharper", "TypeParameterCanBeVariant", Justification = "Unity job system doesn't like it.")]
[SuppressMessage("ReSharper", "UnusedTypeParameter", Justification = "Required by burst.")]
public interface IJobNativeMultiHashMapVisitUniqueKeys<TKey, TValue>
    where TKey : struct, IEquatable<TKey>
    where TValue : struct
{
    /// <summary>
    /// Execute the job on a unique key from the hash map.
    /// </summary>
    /// <param name="key">The key from the hash map.</param>
    void Execute(TKey key);
}

This is amazing! May I ask whether this is performant for you? If so, I am very excited to use it, if that is ok with you.
Thank you.

Probably the most performant approach you could use. There's a simple unit test there you can run to make sure it's still working.


Perfect. Thanks a lot, this is incredible.

Are there any dependencies for this custom job other than this file? And do I have permission to use the code in your GitLab? This code is really great, thanks a lot.

Hmm, I thought I had a license file added to this repo; seems I do not. It should be MIT, so yes, you can use it fine.

As for dependencies, yes, it requires my imposters (to get internal access to the hashmap). Note this does mean that if the signature/layout of the hashmap changes in the future, it will break. Very easy fix though: just remap the imposter to match.

https://gitlab.com/tertle/com.bovinelabs.common/tree/master/Common/Collections
probably better off just importing that entire folder.


Thanks a lot for this. I was looking for a performant way to iterate only over the unique keys in a NativeMultiHashMap, and this yielded 10x performance gains over GetUniqueKeyArray and iterating that array.

If any of the Unity staff reads this: are there plans to implement a supported way to iterate unique keys in a job?


Your unique key job is extremely impressive. Can I use it for use cases that have Entity as the key and Entity as the value? The interface wants me to use int/int for key/value. Is it easy for me to change this? Thanks a lot for your help.

Why is it requiring an int?

    public interface IJobNativeMultiHashMapVisitUniqueKeys<TKey, TValue>
        where TKey : struct, IEquatable<TKey>
        where TValue : struct
    {
        /// <summary>
        /// Execute the job on a unique key from the hash map.
        /// </summary>
        /// <param name="key">The key from the hash map.</param>
        void Execute(TKey key);
    }

I may have stumbled on a bug. Is it possible that not all unique keys are visited in some cases?

When I have a lot of different keys but a low (though large enough) capacity for the NativeMultiHashMap, it seems like sometimes keys are skipped. When I make the capacity 10x larger than it has to be, this doesn't seem to happen.

Pretty hard to debug, but that's what I've found out so far. Any thoughts?

Do you have a repro? I haven't used this in a few months, but it's totally possible it could have an issue.

Can't share the original project code, unfortunately, but I've managed to recreate the problem in a simplified manner. The code below illustrates it: if you change the capacity to * 10, for example, the target hash map receives all of the nibbles; when it's kept the same as the count, only 13 nibbles make it to the target. Just copy this script, put it on a GameObject, and run:

using System;
using System.Collections.Generic;
using Unity.Collections;
using Unity.Mathematics;
using Unity.Jobs;
using Unity.Burst;
using UnityEngine;
using static Unity.Mathematics.math;
using BovineLabs.Common.Extensions;
using BovineLabs.Common.Jobs;
using BovineLabs.Common.Collections;

namespace UniqueKeyExample
{
    public class Test : MonoBehaviour
    {

        public struct Nibble
        {
            public float2 Position;
        }

        private NativeMultiHashMap<int, Nibble> source;
        private NativeMultiHashMap<int, Nibble> target;

        // Start is called before the first frame update
        void Start()
        {

            int nibbleCount = 100;
            int nibbleCap = nibbleCount; // * 10 makes the bug disappear

            source = new NativeMultiHashMap<int, Nibble>(nibbleCap, Allocator.Persistent);
            target = new NativeMultiHashMap<int, Nibble>(nibbleCap, Allocator.Persistent);

            for (int i = 0; i < nibbleCount; i++)
            {
                var nibble = CreateNibble();
                source.Add(PosToHash(nibble.Position), nibble);
            }

            Debug.Log("Source nibble count: " + source.Length);

            var job = new TestJob()
            {
                source = source,
                target = this.target.AsParallelWriter()
            };
            job.Schedule(source, 1).Complete();

            Debug.Log("Target nibble count: " + target.Length);

        }

        float r;
        private Nibble CreateNibble()
        {
            r += 5;
            float2 pos = float2(r, r);
            return new Nibble()
            {
                Position = pos,
            };
        }

        [BurstCompile(CompileSynchronously = false)]
        public struct TestJob : IJobNativeMultiHashMapVisitUniqueKeys<int, Nibble>
        {
            [ReadOnly] public NativeMultiHashMap<int, Nibble> source;
            [WriteOnly] public NativeMultiHashMap<int, Nibble>.ParallelWriter target;

            public void Execute(int key)
            {
                // transfer all nibbles to target
                source.TryGetFirstValue(key, out var nibble, out var iterator);

                target.Add(PosToHash(nibble.Position), nibble);
                while (source.TryGetNextValue(out nibble, ref iterator))
                    target.Add(PosToHash(nibble.Position), nibble);
            }
        }

        #region Buckets
        public const float CellSize = 8; // We want to keep the cellsize relatively low so workload can be more evenly distributed amongst threads

        public static int PosToHash(float2 pos) => PosToHash(pos, CellSize);
        public static int PosToHash(float2 pos, float cellSize) => CellToHash(GetCell(pos, cellSize));
        public static int CellToHash(int2 cell)
        {
            return cell.GetHashCode();
        }
        public static int2 GetCell(float2 position, float cellSize)
        {
            return (int2)floor(position / cellSize);
        }
        #endregion

        private void OnDestroy()
        {
            source.Dispose();
            target.Dispose();
        }
    }
}

Let me know if there’s anything I can help with. This unique key job was a lifesaver for me and I hope we can get it to work properly. Thanks!

Took me a while to figure something out. And a loooooot of alt-tabbing between Collections 0.8 and my IDE.

// <copyright file="IJobNativeMultiHashMapVisitUniqueKeys.cs" company="BovineLabs">
//     Copyright (c) BovineLabs. All rights reserved.
//     Modified by Kmsxkuse.
// </copyright>

using System;
using System.Diagnostics.CodeAnalysis;
using Unity.Collections;
using Unity.Collections.LowLevel.Unsafe;
using Unity.Jobs;
using Unity.Jobs.LowLevel.Unsafe;

namespace CustomCollections
{
    /// <summary>
    ///     Job that visits each unique key in a <see cref="NativeMultiHashMap{TKey,TValue}" />.
    /// </summary>
    /// <typeparam name="TKey">The key type of the hash map.</typeparam>
    /// <typeparam name="TValue">The value type of the hash map.</typeparam>
    [JobProducerType(typeof(JobNativeMultiHashMapVisitUniqueKey.NativeMultiHashMapVisitUniqueKeyJobStruct<,,>))]
    [SuppressMessage("ReSharper", "TypeParameterCanBeVariant", Justification = "Unity job system doesn't like it.")]
    [SuppressMessage("ReSharper", "UnusedTypeParameter", Justification = "Required by burst.")]
    public interface IJobParallelNativeMultiHashMapVisitUniqueKeys<TKey, TValue>
        where TKey : struct, IEquatable<TKey>
        where TValue : struct
    {
        /// <summary>
        ///     Execute the job on a unique key from the hash map.
        /// </summary>
        /// <param name="key">The key from the hash map.</param>
        void Execute(TKey key);
    }

    /// <summary>
    ///     Extension methods for <see cref="IJobParallelNativeMultiHashMapVisitUniqueKeys{TKey,TValue}" />.
    /// </summary>
    public static class JobNativeMultiHashMapVisitUniqueKey
    {
        /// <summary>
        ///     Schedule a <see cref="IJobParallelNativeMultiHashMapVisitUniqueKeys{TKey,TValue}" /> job.
        /// </summary>
        /// <param name="jobData">The job.</param>
        /// <param name="hashMap">The hash map.</param>
        /// <param name="minIndicesPerJobCount">Min indices per job count.</param>
        /// <param name="dependsOn">The job handle dependency.</param>
        /// <typeparam name="TJob">The type of the job.</typeparam>
        /// <typeparam name="TKey">The type of the key in the hash map.</typeparam>
        /// <typeparam name="TValue">The type of the value in the hash map.</typeparam>
        /// <returns>The handle to job.</returns>
        public static unsafe JobHandle Schedule<TJob, TKey, TValue>(
            this TJob jobData,
            NativeMultiHashMap<TKey, TValue> hashMap,
            int minIndicesPerJobCount,
            JobHandle dependsOn = default)
            where TJob : struct, IJobParallelNativeMultiHashMapVisitUniqueKeys<TKey, TValue>
            where TKey : unmanaged, IEquatable<TKey>
            where TValue : struct
        {
            var jobProducer = new NativeMultiHashMapVisitUniqueKeyJobStruct<TJob, TKey, TValue>
            {
                HashMap = hashMap,
                JobData = jobData
            };

            var scheduleParams = new JobsUtility.JobScheduleParameters(
                UnsafeUtility.AddressOf(ref jobProducer),
                NativeMultiHashMapVisitUniqueKeyJobStruct<TJob, TKey, TValue>.Initialize(),
                dependsOn,
                ScheduleMode.Parallel);

            return JobsUtility.ScheduleParallelFor(
                ref scheduleParams,
                hashMap.GetUnsafeBucketData().bucketCapacityMask + 1,
                minIndicesPerJobCount);
        }

        /// <summary>
        ///     The job execution struct.
        /// </summary>
        /// <typeparam name="TJob">The type of the job.</typeparam>
        /// <typeparam name="TKey">The type of the key in the hash map.</typeparam>
        /// <typeparam name="TValue">The type of the value in the hash map.</typeparam>
        internal struct NativeMultiHashMapVisitUniqueKeyJobStruct<TJob, TKey, TValue>
            where TJob : struct, IJobParallelNativeMultiHashMapVisitUniqueKeys<TKey, TValue>
            where TKey : unmanaged, IEquatable<TKey>
            where TValue : struct
        {
            [ReadOnly] public NativeMultiHashMap<TKey, TValue> HashMap;

            internal TJob JobData;

            // ReSharper disable once StaticMemberInGenericType
            private static IntPtr _jobReflectionData;

            internal static IntPtr Initialize()
            {
                if (_jobReflectionData == IntPtr.Zero)
                    _jobReflectionData = JobsUtility.CreateJobReflectionData(
                        typeof(NativeMultiHashMapVisitUniqueKeyJobStruct<TJob, TKey, TValue>),
                        typeof(TJob),
                        (ExecuteJobFunction) Execute);

                return _jobReflectionData;
            }

            private delegate void ExecuteJobFunction(
                ref NativeMultiHashMapVisitUniqueKeyJobStruct<TJob, TKey, TValue> jobProducer,
                IntPtr additionalPtr,
                IntPtr bufferRangePatchData,
                ref JobRanges ranges,
                int jobIndex);

            // ReSharper disable once MemberCanBePrivate.Global - Required by Burst
            internal static unsafe void Execute(
                ref NativeMultiHashMapVisitUniqueKeyJobStruct<TJob, TKey, TValue> jobProducer,
                IntPtr additionalPtr,
                IntPtr bufferRangePatchData,
                ref JobRanges ranges,
                int jobIndex)
            {
                var hashMapData = jobProducer.HashMap.GetUnsafeBucketData();
                using var seenKeys = new NativeHashSet<TKey>(hashMapData.bucketCapacityMask + 1, Allocator.Temp);
       
                while (true)
                {
                    if (!JobsUtility.GetWorkStealingRange(ref ranges, jobIndex, out var begin, out var end))
                        return;
           
                    var buckets = (int*) hashMapData.buckets;
                    var next = (int*) hashMapData.next;
                    var keys = hashMapData.keys;

                    for (var i = begin; i < end; i++)
                    {
                        var entryIndex = buckets[i];

                        while (entryIndex != -1)
                        {
                            var key = UnsafeUtility.ReadArrayElement<TKey>(keys, entryIndex);
                   
                            if (seenKeys.Add(key))
                                jobProducer.JobData.Execute(key);
                   
                            entryIndex = next[entryIndex];
                        }
                    }
                }
            }
        }
    }
}

Key problem with the original code: a single bucket can contain multiple keys. Keys with a small number of values can be merged into a single bucket.

That's why only a single key per bucket made it into Execute(TKey key): the original code was just accessing the first key/value pair in the bucket when there could be more.

So, to solve this, I iterated over the entire bucket like the old MultiHashMapVisitKeyValue did, but that would visit the same key multiple times as it moves through the bucket (once per value).

So, to solve that, I introduced a hashset, seenKeys, allocated with Allocator.Temp for "automatic" disposal (no clue if that actually happens), to check whether a key has already been visited.
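To make the bucket walk concrete, here's a tiny plain-C# model of the fix (toy arrays, no Unity types; all names are mine). buckets[i] points at the first entry of bucket i, next[] chains the entries, and the seen set ensures each key fires only once even though a multi hash map stores one entry per value and unrelated keys can share a bucket:

```csharp
using System;
using System.Collections.Generic;

static class BucketModel
{
    // Returns the keys in the order they would be handed to Execute.
    public static List<int> VisitUniqueKeys(int[] buckets, int[] next, int[] keys)
    {
        var seen = new HashSet<int>();
        var visited = new List<int>();
        foreach (var first in buckets)
        {
            // Walk the whole entry chain of this bucket, not just the head.
            for (var e = first; e != -1; e = next[e])
            {
                if (seen.Add(keys[e]))      // stands in for jobData.Execute(key)
                    visited.Add(keys[e]);
            }
        }
        return visited;
    }
}
```

One bucket holding entries for keys 7, 7, and 9 yields Execute calls for just 7 and 9, which is exactly the dedup the seenKeys hashset provides.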

I stuck a using statement in front of the NativeHashSet, but I have no clue whether it actually works once Burst compiles it. Maybe? No errors, and the transfer is clean.

The keys of multi hash maps using this job must also be a basic unmanaged type (I replaced TKey : struct with TKey : unmanaged). I couldn't get anything else to satisfy The type 'TKey' must be valid unmanaged type (simple numeric, 'bool', 'char', 'void', enumeration type or struct type with all fields of unmanaged types at any level of nesting) in order to use it as a type argument for the 'T' parameter when creating the NativeHashSet inside the Execute code.

Testing with the following script (trimmed from bartofzo's example above) shows full copying between the two NativeMultiHashMaps.

Further testing is needed to see whether seenKeys is actually deallocated or just sitting around with Unity not notifying me, and how bad a performance hit adding it actually is.

There are no further files or imposter structs needed; I just used GetUnsafeBucketData() to get all the pointer values.

using CustomCollections;
using Unity.Burst;
using Unity.Collections;
using UnityEngine;

public class Test : MonoBehaviour
{
    // Start is called before the first frame update
    void Start()
    {
        int nibbleCount = 100;
        int nibbleCap = nibbleCount * 2;
        using var source = new NativeMultiHashMap<int, int>(nibbleCap, Allocator.TempJob);
        using var target = new NativeMultiHashMap<int, int>(nibbleCap, Allocator.TempJob);
        for (var i = 0; i < nibbleCount; i++)
        {
            source.Add(i, i);
            source.Add(i, i + 1);
        }
        Debug.Log("Source nibble count: " + source.Count());
        new TestJobParallel
        {
            source = source,
            target = target.AsParallelWriter()
        }.Schedule(source, 1).Complete();
        Debug.Log("Target nibble count: " + target.Count());
    }
    [BurstCompile]
    public struct TestJobParallel : IJobParallelNativeMultiHashMapVisitUniqueKeys<int, int>
    {
        [ReadOnly] public NativeMultiHashMap<int, int> source;
        [WriteOnly] public NativeMultiHashMap<int, int>.ParallelWriter target;
        public void Execute(int key)
        {
            // transfer all nibbles to target
            source.TryGetFirstValue(key, out var nibble, out var iterator);
   
            target.Add(key, nibble);
            while (source.TryGetNextValue(out nibble, ref iterator))
                target.Add(key, nibble);
        }
    }
}

Edit: 1 year, ouch. Apologies everyone for the necromancy.
