QMIX algorithm

I am currently looking for a multi-agent reinforcement learning algorithm suitable for a USV combat simulation. The simulator is complete, and I have finished experimenting with the default POCA algorithm provided by ml-agents. I believe that QMIX is more suitable for the current simulator than POCA, so I would like to conduct experiments with QMIX.

QMIX is not provided in ml-agents, so I tried to write a QMIX trainer by copying the POCA code inside ml-agents. However, I am stuck: the POCA code is complex, and it is built around advantage estimates and a policy-loss update, neither of which fits QMIX, since QMIX is a value-based method.
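
To make the difference from POCA concrete, here is a minimal NumPy sketch of the two QMIX-specific pieces: a monotonic mixing function that combines per-agent Q-values into Q_tot, and the TD loss on Q_tot. It is only an illustration, not ml-agents code. The names (init_mixer, mix, td_loss), the single-layer hypernetworks, and the sizes are all assumptions made for this example, and gradients and optimisation are left out.

```python
import numpy as np


def init_mixer(n_agents, state_dim, embed_dim=8, seed=0):
    """Hypothetical hypernetwork parameters (single linear maps from the global state)."""
    rng = np.random.default_rng(seed)
    return {
        "w1": rng.normal(0.0, 0.1, (state_dim, n_agents * embed_dim)),
        "b1": rng.normal(0.0, 0.1, (state_dim, embed_dim)),
        "w2": rng.normal(0.0, 0.1, (state_dim, embed_dim)),
        "b2": rng.normal(0.0, 0.1, (state_dim, 1)),
        "n_agents": n_agents,
        "embed_dim": embed_dim,
    }


def elu(x):
    # ELU activation; it is monotonically increasing, which the mixer relies on.
    return np.where(x > 0, x, np.exp(np.minimum(x, 0.0)) - 1.0)


def mix(params, agent_qs, state):
    """Combine per-agent Q-values into Q_tot.

    agent_qs: (batch, n_agents) Q-value of each agent's chosen action
    state:    (batch, state_dim) global state
    returns:  (batch,) Q_tot
    """
    batch = agent_qs.shape[0]
    n, e = params["n_agents"], params["embed_dim"]
    # Absolute values keep the mixing weights non-negative, so Q_tot
    # never decreases when any single agent's Q-value increases.
    w1 = np.abs(state @ params["w1"]).reshape(batch, n, e)
    b1 = state @ params["b1"]
    hidden = elu(np.einsum("bn,bne->be", agent_qs, w1) + b1)
    w2 = np.abs(state @ params["w2"])
    b2 = (state @ params["b2"])[:, 0]
    return np.sum(hidden * w2, axis=1) + b2


def td_loss(q_tot, reward, done, target_q_tot_next, gamma=0.99):
    """Mean squared TD error on Q_tot; no advantage or policy loss is involved."""
    target = reward + gamma * (1.0 - done) * target_q_tot_next
    return np.mean((q_tot - target) ** 2)
```

In a full trainer, agent_qs would come from each agent's Q-network for the actions actually taken, and target_q_tot_next from a target mixer applied to each agent's maximum next-step Q-value.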

Is there a QMIX implementation that can be trained from the command line in the same way as the default algorithms provided by Unity ml-agents? If not, could I get help with this through the Unity Success Plan?

