From Barracuda to Sentis - ComputeBuffer as an Input

Hello! I am trying to port Keijiro’s ObjectDetection project, made with Barracuda and running on the built-in render pipeline, over to a project running Sentis and URP. Keijiro made clever use of compute shaders for the pre- and post-processing calculations to improve performance. On Barracuda, everything works well since you can create a Tensor that takes a ComputeBuffer as an input (see the ObjectDetector class) as follows:

using (var tensor = new Tensor(1, imageSize, imageSize, 3, _preBuffer))

Now, in Sentis there is no way to pass a ComputeBuffer when creating a Tensor, let alone to create a new Tensor directly as in the previous example. The closest method I found is TextureConverter.ToTensor(), which takes a CommandBuffer as an optional input, but I still can’t figure out how to make that work. Some guidance would be really appreciated. Thanks!

I suppose the reason for this is that on some devices compute shaders and buffers may not be supported. If you’re sure your backend is GPUCompute and it works, then the code above could be translated into something like:

Tensor _inputTensor = TensorFloat.Zeros(new TensorShape(1, imageSize, imageSize, 3));
ComputeBuffer _inputBuffer = ComputeTensorData.Pin(_inputTensor).buffer;

Correct, that’s the way to go, thanks @roumenf


Hey @roumenf! Thanks for your reply, it helped me make some progress. Unfortunately, I still can’t get it to work. I can get the bounding boxes to display, but only if I lower the confidence threshold to almost 0, and then they appear all over the place, which makes me think that no actual object detection is happening. I am starting to think the problem is the tensor layout: the original project was built on Barracuda, which according to the docs enforces NHWC (channel last), while Sentis seems to target NCHW (channel first) by default, so the data ends up in the wrong order. I am still unsure whether that order is hardcoded within the shaders of the original project, which could be the issue… Any idea?
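
To make the layout suspicion concrete, here is a minimal sketch in plain C# (no Unity or Sentis calls) of how the same pixel lands at different flat offsets in the two layouts. The names TensorLayoutDemo, OffsetNHWC, OffsetNCHW and ToNCHW are made up for illustration, and whether the original preprocessing shader really writes channel-last is an assumption I have not verified.

```csharp
using System;

static class TensorLayoutDemo
{
    // Flat offset of element (y, x, c) in a channel-last (NHWC) buffer, batch 0.
    public static int OffsetNHWC(int y, int x, int c, int height, int width, int channels)
        => (y * width + x) * channels + c;

    // Flat offset of the same element in a channel-first (NCHW) buffer, batch 0.
    public static int OffsetNCHW(int y, int x, int c, int height, int width, int channels)
        => (c * height + y) * width + x;

    // Reorders a single-image NHWC buffer into NCHW order (CPU reference only).
    public static float[] ToNCHW(float[] nhwc, int height, int width, int channels)
    {
        var nchw = new float[nhwc.Length];
        for (var y = 0; y < height; y++)
            for (var x = 0; x < width; x++)
                for (var c = 0; c < channels; c++)
                    nchw[OffsetNCHW(y, x, c, height, width, channels)]
                        = nhwc[OffsetNHWC(y, x, c, height, width, channels)];
        return nchw;
    }

    static void Main()
    {
        // Toy 2x2 RGB image: green channel (c = 1) of the pixel at y = 0, x = 1.
        Console.WriteLine(OffsetNHWC(0, 1, 1, 2, 2, 3)); // prints 4
        Console.WriteLine(OffsetNCHW(0, 1, 1, 2, 2, 3)); // prints 5
    }
}
```

If the shader fills the buffer using the first formula while the tensor shape tells the model to read it with the second, every value is read from the wrong place, which would be consistent with boxes showing up at random once the threshold is low enough.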

Based on the original Barracuda code found here, this is my Sentis version of the ObjectDetector class:

using System.Collections.Generic;
using Unity.Sentis;
using UnityEngine;

namespace TinyYoloV2
{
    public sealed class ObjectDetector : System.IDisposable
    {
        #region Internal objects

        ResourceSet _resources;
        ComputeBuffer _preBuffer;
        ComputeBuffer _post1Buffer;
        ComputeBuffer _post2Buffer;
        ComputeBuffer _countBuffer;
        ComputeBuffer tensorBuffer;
        Tensor tensor;
        IWorker _worker;

        #endregion

        #region Public constructor

        public ObjectDetector(ResourceSet resources)
        {
            _resources = resources;
            var imageSize = Config.ImageSize;
            tensor = TensorFloat.Zeros(new TensorShape(1, 3, imageSize, imageSize));

            tensorBuffer = ComputeTensorData.Pin(tensor).buffer;

            _preBuffer = new ComputeBuffer(Config.InputSize, sizeof(float));

            _post1Buffer = new ComputeBuffer
              (Config.MaxDetection, BoundingBox.Size, ComputeBufferType.Append);

            _post2Buffer = new ComputeBuffer
              (Config.MaxDetection, BoundingBox.Size, ComputeBufferType.Append);

            _countBuffer = new ComputeBuffer
              (1, sizeof(uint), ComputeBufferType.Raw);

            _worker = WorkerFactory.CreateWorker(BackendType.GPUCompute, ModelLoader.Load(_resources.model));
        }

        #endregion

        #region IDisposable implementation

        public void Dispose()
        {
            tensorBuffer = null;

            _preBuffer = null;

            _post1Buffer = null;

            _post2Buffer = null;

            _countBuffer = null;

            _worker = null;
        }

        #endregion

        #region Public accessors

        public ComputeBuffer BoundingBoxBuffer
          => _post2Buffer;

        public void SetIndirectDrawCount(ComputeBuffer drawArgs)
          => ComputeBuffer.CopyCount(_post2Buffer, drawArgs, sizeof(uint));

        public IEnumerable<BoundingBox> DetectedObjects
          => _post2ReadCache ?? UpdatePost2ReadCache();

        #endregion

        #region Main image processing function

        public void ProcessImage
          (Texture sourceTexture, float scoreThreshold, float overlapThreshold)
        {
            // Reset the compute buffer counters.

            int[] _preBuffData = new int[_preBuffer.count];

            // Preprocessing
            var pre = _resources.preprocess;
            pre.SetTexture(0, "_Texture", sourceTexture);
            pre.SetBuffer(0, "_Tensor", tensorBuffer);
            pre.SetInt("_ImageSize", Config.ImageSize);
            pre.Dispatch(0, Config.ImageSize / 8, Config.ImageSize / 8, 1);


            // Output tensor (13x13x125) -> Temporary render texture (125x169)
            var reshape = new TensorShape
                (1, Config.TotalCells, Config.OutputPerCell, 1);

            var reshapedRT = RenderTexture.GetTemporary
              (125, 169, 0, RenderTextureFormat.RFloat);

            var outputTensor = _worker.PeekOutput().ShallowReshape(reshape);
            TextureConverter.RenderToTexture(outputTensor as TensorFloat, reshapedRT);

            // 1st postprocess (bounding box aggregation)
            var post1 = _resources.postprocess1;
            post1.SetFloat("_Threshold", scoreThreshold);
            post1.SetTexture(0, "_Input", reshapedRT);
            post1.SetBuffer(0, "_Output", _post1Buffer);
            post1.Dispatch(0, 1, 1, 1);

            // Bounding box count
            ComputeBuffer.CopyCount(_post1Buffer, _countBuffer, 0);

            // 2nd postprocess (overlap removal)
            var post2 = _resources.postprocess2;
            post2.SetFloat("_Threshold", overlapThreshold);
            post2.SetBuffer(0, "_Input", _post1Buffer);
            post2.SetBuffer(0, "_Count", _countBuffer);
            post2.SetBuffer(0, "_Output", _post2Buffer);
            post2.Dispatch(0, 1, 1, 1);

            // Bounding box count after removal
            ComputeBuffer.CopyCount(_post2Buffer, _countBuffer, 0);

            // Read cache invalidation
            _post2ReadCache = null;
        }

        #endregion

        #region GPU to CPU readback function

        BoundingBox[] _post2ReadCache;
        int[] _countReadCache = new int[1];

        BoundingBox[] UpdatePost2ReadCache()
        {
            _countBuffer.GetData(_countReadCache, 0, 0, 1);
            var buffer = new BoundingBox[_countReadCache[0]];
            _post2Buffer.GetData(buffer, 0, 0, buffer.Length);
            return buffer;
        }

        #endregion
    }

} // namespace TinyYoloV2

Ah, it’s Keijiro’s samples. He should have updated the samples to work with Sentis.
Let me get back to you


Hey All - We are working with Keijiro to update his samples, but we don’t have a timeline yet :zipper_mouth_face:. Will let you know when we have a better line of sight.

Known internally as Unity issue 112.


Hey @Bill_Cullen and @alexandreribard_unity! Thanks for the update. Is there a way for me to follow the progress of that issue? In the meantime, I will look into it so as to have a temporary solution.

Hey there - We are working with him to update his sample, but don’t have a timeline yet. Stay tuned.
