Crashing while using GPU Compute on Android Devices

Using BSRGanX2 from the model zoo, GPU Compute mode works well on a PC with an RTX 3060, but the same operation crashes the app on Android. The logs below appear partway through the verbose layer output.

Logs

2023-08-26 13:56:29.614 25121 485 Info Unity Conv - name: 1884, inputs: [1883, HRconv.weight, HRconv.bias], fusedActivation: None, group: 1, strides: [1, 1], pads: [1, 1, 1, 1], dilations: [1, 1], autoPad: NotSet, kernelShape: [3, 3], fusedActivation: None

2023-08-26 13:56:29.614 25121 485 Warn Unity Exceeded safe compute dispatch group count limit per dimension [1, 116620, 1] for Conv2D_T16x16_R4x4

2023-08-26 13:56:29.618 25121 485 Error Unity Thread group count is above the maximum allowed limit. Maximum allowed thread group count is 65535.

2023-08-26 13:56:29.618 25121 485 Error Unity Unity.Sentis.ComputeHelper:Dispatch(ComputeFunc, Int32, Int32, Int32)

2023-08-26 13:56:29.618 25121 485 Error Unity Unity.Sentis.GPUComputeBackend:ConvMobile(TensorFloat, TensorFloat, TensorFloat, TensorFloat, Span`1, Span`1, Span`1, FusableActivation)

2023-08-26 13:56:29.618 25121 485 Error Unity Unity.Sentis.GPUComputeBackend:Conv(TensorFloat, TensorFloat, TensorFloat, Int32, Span`1, Span`1, Span`1, FusableActivation)

2023-08-26 13:56:29.618 25121 485 Error Unity Unity.Sentis.Layers.Conv:Execute(Tensor[], ExecutionContext)

2023-08-26 13:56:29.618 25121 485 Error Unity Unity.Sentis.<StartManualSchedule>d__33:MoveNext()

2023-08-26 13:56:29.618 25121 485 Error Unity Unity.Sentis.GenericWorker:Execute()

2023-08-26 13:56:29.618 25121 485 Error Unity <PerformOperation>d__10:MoveNext()

2023-08-26 13:56:29.618 25121 485 Error Unity UnityEngine.SetupCoroutine:InvokeMoveNext(IEnumerator, IntPtr)

2023-08-26 13:56:29.618 25121 485 Error Unity [ line -1636289176]

Code


    IEnumerator PerformOperation()
    {
        Debug.Log(tex.height);
        Debug.Log(tex.width);

        yield return null;
        originalImage.sprite = Sprite.Create(tex, new Rect(0, 0, tex.width, tex.height), new Vector2(0.5f, 0.5f),
            100f, 0, SpriteMeshType.FullRect);

        yield return null;
        _runtimeModel = ModelLoader.Load(modelAsset);
        Debug.Log(_runtimeModel);

        yield return null;
        _worker = WorkerFactory.CreateWorker(BackendType.GPUCompute, _runtimeModel, true);
        // The input tensor is created at half the texture's width and height.
        _tensor = TextureConverter.ToTensor(tex, width: tex.width / 2, height: tex.height / 2, 3);
        _worker.Execute(_tensor);
        Debug.LogError("_worker Executed");

        yield return null;
        _peekOutput = _worker.FinishExecutionAndDownloadOutput() as TensorFloat;
        _peekOutput.MakeReadable();
        Debug.LogError("Got data");

        // The output is written back at the original texture size.
        RenderTexture rt = TextureConverter.ToTexture(_peekOutput, width: tex.width, height: tex.height);
        rawImage.texture = rt;

        _peekOutput?.Dispose();
        _tensor?.Dispose();
        _worker?.Dispose();
        Debug.LogError("Executed...");
    }

Platform Info
Device : OnePlus Nord CE 3
OS : Android 13


Oh, that’s interesting.
The inputs are probably too large, so we exceed the maximum thread group count per dispatch when scheduling our kernels.
Could you share the input sizes you are using and a link to the model? We’ll investigate.
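For reference, the arithmetic behind the warning in the log can be sketched in plain C#. This is not Sentis code: how Conv2D_T16x16_R4x4 maps its work onto thread groups is not shown in this thread, and the names `DispatchLimits`, `CeilDiv`, `ExceedsLimit` and `SplitGroups` are hypothetical. The only values taken from the log are the limit (65535) and the requested count (116620). Splitting one dispatch into several also assumes the kernel can accept a starting group offset, which is an assumption.

```csharp
using System;
using System.Collections.Generic;

public static class DispatchLimits
{
    // Per-dimension limit quoted in the error log above.
    public const int MaxGroupsPerDimension = 65535;

    // Thread groups needed to cover workItems when each group handles itemsPerGroup.
    public static int CeilDiv(int workItems, int itemsPerGroup)
    {
        if (workItems < 0) throw new ArgumentOutOfRangeException(nameof(workItems));
        if (itemsPerGroup <= 0) throw new ArgumentOutOfRangeException(nameof(itemsPerGroup));
        return (int)(((long)workItems + itemsPerGroup - 1) / itemsPerGroup);
    }

    // True if any dimension of the requested dispatch is above the limit.
    public static bool ExceedsLimit(int groupsX, int groupsY, int groupsZ)
    {
        return groupsX > MaxGroupsPerDimension
            || groupsY > MaxGroupsPerDimension
            || groupsZ > MaxGroupsPerDimension;
    }

    // Splits one oversized dimension into (offset, count) chunks that each fit the limit.
    public static List<(int Offset, int Count)> SplitGroups(int totalGroups)
    {
        if (totalGroups < 0) throw new ArgumentOutOfRangeException(nameof(totalGroups));
        var chunks = new List<(int Offset, int Count)>();
        int offset = 0;
        int remaining = totalGroups;
        while (remaining > 0)
        {
            int count = Math.Min(remaining, MaxGroupsPerDimension);
            chunks.Add((offset, count));
            offset += count;
            remaining -= count;
        }
        return chunks;
    }
}
```

With the numbers from the log, `DispatchLimits.ExceedsLimit(1, 116620, 1)` is true, and `DispatchLimits.SplitGroups(116620)` yields two chunks: (0, 65535) and (65535, 51085). A smaller input lowers the group count directly, which is why the input size matters here.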

Sure, here it is: [WeTransfer link]

Hi,

The link has expired, could you share another link?

Thanks.

Hi,

Version 1.2 should be released this week or next. Try upgrading, and let me know if you can still reproduce the issue. Don’t forget to reimport the model and delete the Library folder if you run into any weird errors.

Hi! I’m on v1.2.0 and unfortunately having the same issue. The network evaluation appears to complete, but the app crashes straight after. The final log lines are:

...
01-12 04:53:00.567  5608  5652 I Unity   : Conv - name: Y, inputs: [/conv_act/Mul_output_0, conv_out.weight, conv_out.bias], fusedActivation: None, group: 1, strides: [1, 1], pads: [1, 1, 1, 1], dilations: [1, 1], autoPad: NotSet, kernelShape: [3, 3], fusedActivation: None
01-12 04:53:00.568  5608  5652 I Unity   : Unity.Sentis.DefaultVars

(ignore the timestamps, it might be time for bed…)

“Y” is the name of the only output, so I assume the evaluation is completely finished by this point. I’ve reduced the batch size to 1 (it is a dynamic axis) but still have this problem. It works fine on a laptop (GPUCompute) and on CPU on mobile!

Edit: more relevant logcat

01-12 05:15:41.194  8218  8401 W Adreno-GSL: <gsl_ldd_control:553>: ioctl fd 80 code 0xc040094a (IOCTL_KGSL_GPU_COMMAND) failed: errno 35 Resource deadlock would occur
01-12 05:15:41.194  8218  8401 W Adreno-GSL: <log_gpu_snapshot:462>: panel.gpuSnapshotPath is not set.not generating user snapshot
01-12 05:15:41.195  8218  8401 W Adreno-GSL: <gsl_ldd_control:553>: ioctl fd 80 code 0x400c0907 (IOCTL_KGSL_DEVICE_WAITTIMESTAMP_CTXTID) failed: errno 35 Resource deadlock would occur
01-12 05:15:41.195  8218  8401 W Adreno-GSL: <log_gpu_snapshot:462>: panel.gpuSnapshotPath is not set.not generating user snapshot
01-12 05:15:41.196  8218  8401 W Adreno-GSL: <gsl_ldd_control:553>: ioctl fd 80 code 0xc040094a (IOCTL_KGSL_GPU_COMMAND) failed: errno 35 Resource deadlock would occur
01-12 05:15:41.196  8218  8401 W Adreno-GSL: <log_gpu_snapshot:462>: panel.gpuSnapshotPath is not set.not generating user snapshot
01-12 05:15:41.196  8218  8401 W Adreno-GSL: <gsl_ldd_control:553>: ioctl fd 80 code 0x400c0907 (IOCTL_KGSL_DEVICE_WAITTIMESTAMP_CTXTID) failed: errno 35 Resource deadlock would occur
01-12 05:15:41.196  8218  8401 W Adreno-GSL: <log_gpu_snapshot:462>: panel.gpuSnapshotPath is not set.not generating user snapshot
01-12 05:15:41.204  1271  1351 E BufferQueueProducer: [SurfaceView - com.DefaultCompany.PixDiffusion/com.unity3d.player.UnityPlayerActivity#0](id:4f7000038e4,api:1,p:8218,c:1271) dequeueBuffer: attempting to exceed the max dequeued buffer count (2)
01-12 05:15:41.205  8218  8401 W vulkan  : dequeueBuffer timed out: Function not implemented (-38)
01-12 05:15:41.246  8218  8401 E CRASH   : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
01-12 05:15:41.246  8218  8401 E CRASH   : Version '2022.3.17f1 (4fc78088f837)', Build type 'Release', Scripting Backend 'mono', CPU 'armeabi-v7a'
01-12 05:15:41.246  8218  8401 E CRASH   : Build fingerprint: 'motorola/guamp_retailen/guamp:11/RPXS31.Q2-58-17-7-3/ad9c24:user/release-keys'
01-12 05:15:41.246  8218  8401 E CRASH   : Revision: 'pvt'
01-12 05:15:41.246  8218  8401 E CRASH   : ABI: 'arm'
01-12 05:15:41.251  8218  8401 E CRASH   : Timestamp: 2024-01-12 05:15:41.246966678+0000
01-12 05:15:41.251  8218  8401 E CRASH   : pid: 8218, tid: 8401, name: UnityGfxDeviceW  >>> com.DefaultCompany.PixDiffusion <<<
01-12 05:15:41.251  8218  8401 E CRASH   : uid: 10674
01-12 05:15:41.251  8218  8401 E CRASH   : signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr --------
01-12 05:15:41.251  8218  8401 E CRASH   : Cause: null pointer dereference
01-12 05:15:41.251  8218  8401 E CRASH   :     r0  00000000  r1  c1bcaa60  r2  00000002  r3  00000400
01-12 05:15:41.251  8218  8401 E CRASH   :     r4  c4755358  r5  00000001  r6  0000023e  r7  00000000
01-12 05:15:41.251  8218  8401 E CRASH   :     r8  00000000  r9  00000001  r10 00000008  r11 bd180104
01-12 05:15:41.251  8218  8401 E CRASH   :     ip  00000001  sp  bc24be70  lr  00000100  pc  c7218ca8
01-12 05:15:41.252  8218  8401 E CRASH   : backtrace:
01-12 05:15:41.252  8218  8401 E CRASH   :       #00 pc 00a9fca8  /data/app/~~-ijLScJ941dgXfC5k-WHGg==/com.DefaultCompany.PixDiffusion-6v8IDz-WmvfxXPmk4gRRBQ==/lib/arm/libunity.so (BuildId: 72096e5a0ef6b55e448480e2451becd1c044b449)
01-12 05:15:41.252  8218  8401 E CRASH   :       #01 pc 00a9ae50  /data/app/~~-ijLScJ941dgXfC5k-WHGg==/com.DefaultCompany.PixDiffusion-6v8IDz-WmvfxXPmk4gRRBQ==/lib/arm/libunity.so (BuildId: 72096e5a0ef6b55e448480e2451becd1c044b449)
01-12 05:15:41.252  8218  8401 E CRASH   :       #02 pc 00a58580  /data/app/~~-ijLScJ941dgXfC5k-WHGg==/com.DefaultCompany.PixDiffusion-6v8IDz-WmvfxXPmk4gRRBQ==/lib/arm/libunity.so (BuildId: 72096e5a0ef6b55e448480e2451becd1c044b449)
01-12 05:15:41.252  8218  8401 E CRASH   :       #03 pc 00a8d634  /data/app/~~-ijLScJ941dgXfC5k-WHGg==/com.DefaultCompany.PixDiffusion-6v8IDz-WmvfxXPmk4gRRBQ==/lib/arm/libunity.so (BuildId: 72096e5a0ef6b55e448480e2451becd1c044b449)
01-12 05:15:41.252  8218  8401 E CRASH   :       #04 pc 00a87bbc  /data/app/~~-ijLScJ941dgXfC5k-WHGg==/com.DefaultCompany.PixDiffusion-6v8IDz-WmvfxXPmk4gRRBQ==/lib/arm/libunity.so (BuildId: 72096e5a0ef6b55e448480e2451becd1c044b449)
01-12 05:15:41.252  8218  8401 E CRASH   :       #05 pc 00bc3f55  /data/app/~~-ijLScJ941dgXfC5k-WHGg==/com.DefaultCompany.PixDiffusion-6v8IDz-WmvfxXPmk4gRRBQ==/lib/arm/libunity.so (BuildId: 72096e5a0ef6b55e448480e2451becd1c044b449)
01-12 05:15:41.252  8218  8401 E CRASH   :       #06 pc 00bc32e1  /data/app/~~-ijLScJ941dgXfC5k-WHGg==/com.DefaultCompany.PixDiffusion-6v8IDz-WmvfxXPmk4gRRBQ==/lib/arm/libunity.so (BuildId: 72096e5a0ef6b55e448480e2451becd1c044b449)
01-12 05:15:41.252  8218  8401 E CRASH   :       #07 pc 00bc305b  /data/app/~~-ijLScJ941dgXfC5k-WHGg==/com.DefaultCompany.PixDiffusion-6v8IDz-WmvfxXPmk4gRRBQ==/lib/arm/libunity.so (BuildId: 72096e5a0ef6b55e448480e2451becd1c044b449)
01-12 05:15:41.252  8218  8401 E CRASH   :       #08 pc 004c8d57  /data/app/~~-ijLScJ941dgXfC5k-WHGg==/com.DefaultCompany.PixDiffusion-6v8IDz-WmvfxXPmk4gRRBQ==/lib/arm/libunity.so (BuildId: 72096e5a0ef6b55e448480e2451becd1c044b449)
01-12 05:15:41.252  8218  8401 E CRASH   :       #09 pc 0008170b  /apex/com.android.runtime/lib/bionic/libc.so (__pthread_start(void*)+40) (BuildId: 9d4f6aa585db1e76cb15e0aa4299910e)
01-12 05:15:41.252  8218  8401 E CRASH   :       #10 pc 0003a50d  /apex/com.android.runtime/lib/bionic/libc.so (__start_thread+30) (BuildId: 9d4f6aa585db1e76cb15e0aa4299910e)

Edit 2: problem unchanged when using the model with static shape

I saw it mentioned somewhere that blocking reads might be the problem, so I moved to async readback, but unfortunately I still get exactly the same error. I’m now reading the output like this:

    ...
    worker.Execute(inputTensors);

    TensorFloat output = worker.PeekOutput("Y") as TensorFloat;
    waiting = true;
    output.AsyncReadbackRequest(delegate (bool success)
    {
        if (success)
        {
            output.MakeReadable();
            float[] modelOutput = output.ToReadOnlyArray();

            inputTensor.Dispose();

            // ... do stuff with modelOutput

            waiting = false;
        }
    });

I tried upgrading to 1.3.0-pre2 with Unity 2023 and still get the same error, even with a newly imported, very small model, so I assume it’s not a resource problem.

@alexandreribard_unity sorry for the @, but I’m really stumped here, and I’m trying to make a deadline in a few days. Happy to provide a model and/or sample code if that helps. I don’t currently have another device to test on, but I will later today and will edit this post then.

EDIT: I tested on another device and was surprised to see it working fine, so this may be a different issue (some kind of compatibility issue). I have started a new thread for it: “App crashes when evaluating model with GPU compute on some Android devices”.