Unity ML-Agents: Object detection in real time

Hello, I would like to ask a question about an ML-Agents project I am working on. I am currently studying self-driving drones at graduate school. More specifically, we are researching how to simulate in Unity a drone's ability to identify targets in real time as it flies into a target area after takeoff.
Currently, I have attached a Raycast sensor to the drone agent, which identifies the objects it hits by their tags, but that is not what I want.

As I see it, real AI should not rely on information obtained through contact with a Raycast sensor. Instead, the agent should detect the target I want among the various objects visible to its camera. I have been trying to solve this, but I haven’t found the right answer yet.

I’d like to know about technology that identifies objects from the drone’s first-person perspective in real time while the game is running, with an annotation displayed on a bounding box around the target.
To achieve this, I have tried various packages currently available for Unity, such as Barracuda, OpenCV, OpenVINO, and Perception, but I have not been able to solve it.
The current simulation uses vector observations via raycasts (Picture 1), and my research goal is target detection using a vision sensor on the agent's camera.

What I am curious about is:

  1. Is the approach described above actually possible?
  2. If so, are there any examples or GitHub repositories of such cases?

I sincerely hope to gain some insight from your advice. Thank you.

Picture 1 is an example of the scene I want to implement. I added the bounding box and annotation by hand in Photoshop on top of the simulation scene I created.
Picture 2 shows an example of a bounding box from Unity’s official blog ("Unity - AI & Machine Learning, Explained").

Hey there,

First up, a disclaimer: imo this is a hard topic to work on and the chances of success are low. In addition, chances are high that even if you get something working that produces correct object-detection results, it will be painfully slow.

That being said, let’s jump into the topic. If you check out Unity ML-Agents you will see that it is built around Reinforcement Learning (plus imitation learning). While this is nice for a lot of applications, it is probably not a good approach for your particular setup. This is the most crucial point you will have to consider if you want to use ML-Agents. In case you are still a beginner in this topic: Reinforcement Learning is not really good at object detection/classification. In addition, object detection/classification is really slow. (Depending on the image resolution, a single object detection can take seconds.)

On top of this, you’d need ages to train and evaluate the model, as designing rewards and observations for your Agent can be really difficult, especially in such a complex scenario.

Afaik Unity does not offer any further ML-based packages, which kind of limits your options, as most ML frameworks are (as you are probably aware) Python-based. What you could try is to design a Python/Unity interface, for example on top of a pub/sub messaging system like MQTT, and have 2 programs running: one for the simulation, one for object detection. This way you could also tap into the possibility of converting TensorFlow models to TFLite and using a TPU (e.g. from Coral) to actually get decent inference times on your object detection. ML-Agents would then take care of learning how to fly, while a Python program takes the visual observations from your drone and spits out classifications.
Note that while this can work in concept, it will still require a ton of work to make these 2 programs work together.
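
To make the two-program idea a bit more concrete, here is a minimal sketch of the messages such a bridge might exchange. Everything in it is an assumption for illustration: the topic names, the JSON layout, and `dummy_detector` are hypothetical and not part of ML-Agents or any MQTT library. The transport itself (e.g. an MQTT client publishing to these topics) is deliberately left out, so this only shows the encode/decode step on both sides.

```python
import base64
import json
from dataclasses import asdict, dataclass
from typing import Callable, List, Tuple

# Hypothetical topic names; any pub/sub broker (e.g. MQTT) could carry these payloads.
FRAME_TOPIC = "drone/camera/frame"
DETECTION_TOPIC = "drone/camera/detections"


@dataclass
class Detection:
    label: str
    score: float
    # Bounding box in pixel coordinates: left, top, width, height.
    x: int
    y: int
    w: int
    h: int


def encode_frame(frame_id: int, width: int, height: int, rgb: bytes) -> bytes:
    """Simulation side: pack one raw RGB camera frame as a JSON message."""
    if len(rgb) != width * height * 3:
        raise ValueError("rgb buffer does not match width * height * 3")
    message = {
        "frame_id": frame_id,
        "width": width,
        "height": height,
        "rgb": base64.b64encode(rgb).decode("ascii"),
    }
    return json.dumps(message).encode("utf-8")


def decode_frame(payload: bytes) -> Tuple[int, int, int, bytes]:
    """Detector side: unpack a frame message."""
    message = json.loads(payload.decode("utf-8"))
    rgb = base64.b64decode(message["rgb"])
    return message["frame_id"], message["width"], message["height"], rgb


def encode_detections(frame_id: int, detections: List[Detection]) -> bytes:
    """Detector side: pack the results, tagged with the frame they belong to."""
    message = {"frame_id": frame_id, "detections": [asdict(d) for d in detections]}
    return json.dumps(message).encode("utf-8")


def decode_detections(payload: bytes) -> Tuple[int, List[Detection]]:
    """Simulation side: unpack the results so boxes can be drawn on the right frame."""
    message = json.loads(payload.decode("utf-8"))
    return message["frame_id"], [Detection(**d) for d in message["detections"]]


def dummy_detector(width: int, height: int, rgb: bytes) -> List[Detection]:
    """Placeholder for a real model; always reports one centered box."""
    return [Detection("target", 0.5, width // 4, height // 4, width // 2, height // 2)]


def handle_frame(payload: bytes, detector: Callable[[int, int, bytes], List[Detection]]) -> bytes:
    """Detector side: decode a frame, run the detector, build the reply payload."""
    frame_id, width, height, rgb = decode_frame(payload)
    return encode_detections(frame_id, detector(width, height, rgb))
```

In a setup like this, the simulation would publish `encode_frame(...)` to the frame topic, the Python process would call `handle_frame` on each message and publish the reply to the detection topic. Carrying `frame_id` through matters because detection is slow: the reply may arrive several frames later, and the simulation needs to know which frame the boxes refer to.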

What you can use ML-Agents for is, for example, to teach your drone to fly or to explore some kind of environment based on the input you give it. This video, for example, shows some planes that learned to fly along a course.
What will imo not work is using ML-Agents to train an object detector.

Let me know if you have any questions regarding this.