Where is data buffered when getting and setting data on a compute buffer/shader?

In this example I set an int value on the compute shader from the CPU, then dispatch the shader to fill a small buffer with this value squared. I then request the buffer back to the CPU asynchronously.
I can, however, do this multiple times before the results are read back, and each request still returns the correct result. Does this mean some values must be buffered? Does the data reside on the CPU side or the GPU side in the meantime? How does this chain work?

This is a typical result from the example:
Run Begin
Run End
FrameCount: 824 -> 827, data: 1, 1, 1
FrameCount: 824 -> 827, data: 4, 4, 4

We can see that Run() runs on frame 824, then on frame 827 both results are received with different content.

C# script:

using System;
using System.Runtime.InteropServices;
using Sirenix.OdinInspector; // Used to create an easy [Button]
using Unity.Collections;
using UnityEngine;
using UnityEngine.Rendering;

public class GPUAsyncTest : MonoBehaviour
{
    private ComputeBuffer _resultsBuffer;
    [SerializeField] private ComputeShader computeShader;
    private int _kernelID;
    private void Start()
    {
        _resultsBuffer = new ComputeBuffer(32, Marshal.SizeOf<int>());
        _kernelID = computeShader.FindKernel("Run");
        computeShader.SetBuffer(_kernelID, "Results", _resultsBuffer);
    }

    private void OnDestroy()
    {
        // Release the GPU buffer explicitly; Unity warns if a ComputeBuffer is left to the garbage collector.
        _resultsBuffer?.Release();
    }

    [Button]
    private void Run()
    {
        Debug.Log("Run Begin");
        int currentFrameCount = Time.frameCount;
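        // First pass: Value = 1, so the shader writes 1 * 1 into every element.
        computeShader.SetInt("Value", 1);
        computeShader.Dispatch(_kernelID, 1, 1, 1);
        AsyncGPUReadback.Request(_resultsBuffer, (request) => Callback(request, currentFrameCount));
        // Second pass on the same buffer, issued before the first readback has completed: Value = 2, so 2 * 2.
        computeShader.SetInt("Value", 2);
        computeShader.Dispatch(_kernelID, 1, 1, 1);
        AsyncGPUReadback.Request(_resultsBuffer, (request) => Callback(request, currentFrameCount));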
        Debug.Log("Run End");
    }

    private void Callback(AsyncGPUReadbackRequest request, int frameCountWhenCalled)
    {
        if (request.hasError)
        {
            throw new Exception("AsyncGPUReadback");
        }
       
        NativeArray<int> data = request.GetData<int>();
       
        Debug.Log($"FrameCount: {frameCountWhenCalled} -> {Time.frameCount}, data: {data[0]}, {data[15]}, {data[31]}");
    }
}

Shader code:

#pragma kernel Run

int Value;
RWStructuredBuffer<int> Results;

// One thread group of 32 threads covers all 32 elements of Results.
[numthreads(32,1,1)]
void Run (uint3 id : SV_DispatchThreadID)
{
    Results[id.x] = Value*Value;
}

Hi. Typically, when a readback is requested, the data is copied into a staging buffer that the CPU can then read. Implementation details are device-dependent, but performing several requests on the same buffer in the same frame is supported.
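
As a rough mental model of the staging-buffer idea, here is a toy sketch in plain C#. It is not Unity API and not how any particular driver is implemented; StagingModel, Simulate and the local helpers are hypothetical names made up for illustration. It assumes that each dispatch captures the constant value at the moment it is recorded, and that each readback request gets its own copy of the buffer, taken at that point in the command order.

```csharp
using System;
using System.Collections.Generic;

// Toy model only (assumption): commands are recorded first and executed later, in order.
public static class StagingModel
{
    public static List<int[]> Simulate()
    {
        var results = new int[32];          // stands in for the GPU-side Results buffer
        var queue = new List<Action>();     // stands in for the recorded command stream
        var staging = new List<int[]>();    // one staging copy per readback request

        // Recording a dispatch captures the current constant (assumed behaviour).
        void Dispatch(int value)
        {
            queue.Add(() =>
            {
                for (int i = 0; i < results.Length; i++)
                {
                    results[i] = value * value;
                }
            });
        }

        // Recording a readback schedules a copy of the buffer at this point in the queue.
        void RequestReadback()
        {
            queue.Add(() => staging.Add((int[])results.Clone()));
        }

        // Same sequence as Run() in the question; nothing executes yet.
        Dispatch(1);
        RequestReadback();
        Dispatch(2);
        RequestReadback();

        // Later, the queued commands execute in the order they were recorded.
        foreach (var command in queue)
        {
            command();
        }

        return staging;
    }
}
```

In this model, StagingModel.Simulate() returns two separate arrays, the first filled with 1 and the second with 4, even though both were copied from the same results array. The second dispatch overwrites results only after the first copy has been taken, which matches the two log lines in the question.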