Train models from images?

I’ve been very successful so far with ml-agents and the 3D game world, but now my agent needs image detection. I have an ml-agents drone that is able to fly around and navigate its world, but I want it to recognize whether a target is familiar or unfamiliar. If it is familiar, it will return to a default monitoring mode. However, if the target is unfamiliar, it will track its movement and follow it.

How do I start teaching it this new skill?

My initial thought was to have it recognize simple colors as familiar: red, blue, and green, with cyan as unfamiliar. But I wondered if I am going about it all wrong… since my real goal is face or image detection, shouldn’t I just go in that direction? But again… I’m not sure how to do that with the current plugin.
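For context, this is roughly how I imagine wiring up the camera observations for the color approach (a minimal sketch, assuming ML-Agents’ CameraSensorComponent is the right way to feed pixels to the policy; droneCamera and the 84x84 size are placeholders of mine):

```csharp
using Unity.MLAgents.Sensors;
using UnityEngine;

// Sketch: give the drone agent visual observations so the policy can
// learn from pixels (e.g. to tell red/blue/green targets from cyan ones).
public class DroneVisionSetup : MonoBehaviour
{
    public Camera droneCamera; // the drone's onboard camera

    void Awake()
    {
        var sensor = gameObject.AddComponent<CameraSensorComponent>();
        sensor.Camera = droneCamera;
        sensor.Width = 84;        // small resolutions keep training tractable
        sensor.Height = 84;
        sensor.Grayscale = false; // keep RGB so color matters to the agent
    }
}
```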

This led me to consider how to do image detection at all in Unity, which led me to the Perception package. I love the idea of generating synthetic data; I’m just not sure what to do with the labeled data in Unity.

Once I have a few thousand or so labeled images, how can I start training models?

Is this even available inside Unity yet?

I read this blog: https://blogs.unity3d.com/2020/09/17/training-a-performant-object-detection-ml-model-on-synthetic-data-using-unity-perception-tools/

…hoping to gain some insight, but it seems the actual training is all done with third-party tools.

Is there something I’m missing, or is this just something not yet integrated in ml-agents?

Any guidance would be appreciated, thanks!

Unity Perception / Simulation are offered by a different group here at Unity. I’d suggest cross-posting in that forum to get better visibility: https://forum.unity.com/forums/unity-simulation.407/

Since you are using ML-Agents, you may also get an additional perspective from that team:

Re: object detection - AFAIK we don’t have any tutorials yet, but you could try some of the ONNX pre-trained models at:
https://github.com/onnx/models#object-detection--image-segmentation-

One model we currently support is Tiny Yolo2:
https://docs.unity3d.com/Packages/com.unity.barracuda@1.0/manual/SupportedArchitectures.html

This is also a project I found (not from Unity though):
https://github.com/Syn-McJ/TFClassify-Unity-Barracuda
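If it helps to see it end-to-end, loading and running one of those ONNX files with Barracuda looks roughly like this (a minimal sketch against the Barracuda 1.0 API; YoloRunner and the RGB texture input are my own assumptions, not code from the projects above):

```csharp
using Unity.Barracuda;
using UnityEngine;

// Minimal sketch: load an imported .onnx asset and run inference on a texture.
public class YoloRunner : MonoBehaviour
{
    public NNModel modelAsset; // assign the imported .onnx asset in the Inspector
    IWorker worker;

    void Start()
    {
        var model = ModelLoader.Load(modelAsset);
        worker = WorkerFactory.CreateWorker(WorkerFactory.Type.Auto, model);
    }

    public Tensor Detect(Texture frame)
    {
        using (var input = new Tensor(frame, 3)) // 3 = RGB channels
        {
            worker.Execute(input);
            return worker.PeekOutput(); // raw detection grid; still needs decoding
        }
    }

    void OnDestroy() => worker?.Dispose();
}
```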

thanks @amirebrahimi_unity !

Tutorial content for object detection with Yolo v3 would be seriously useful. It’s currently hard to figure out what to do with the ONNX model’s outputs without an example.

Got Yolo v3 tiny working with Barracuda + AR Foundation here if anyone is interested. I can also provide links to tutorials on training Darknet and converting Darknet weights to an ONNX file to use with it. It was easy to learn for an ML noob like me.

I’m for sure interested @ROBYER1_1, thanks!

I wanted to keep this short, but it’s detailed enough that you should be able to follow my workflow. This workflow will suit Windows/Linux users.

Setting up Darknet

Use the popular AlexeyAB fork of Darknet; it’s fairly easy to set up on Windows and Linux.
If you have an Nvidia RTX 30 series card, you may have some issues installing OpenCV with CUDA 11 support, although I assume that may be fixed by now. DM me if you have issues; I have a workaround.

https://github.com/artynet/darknet-alexeyAB#how-to-compile-on-windows-using-vcpkg

Using Darknet to train image detection

There are hundreds of tutorials on this online, and I was a total noob doing it with no previous experience. It’s just a case of labelling the things you want Darknet to detect in your images and running training. It was simpler than I thought.
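For reference, the labels Darknet trains on are plain-text files, one per image, with one line per object: a class ID followed by the box centre/size normalized to 0..1 (these example values are made up):

```
0 0.512 0.430 0.210 0.365
2 0.115 0.774 0.083 0.120
```

Training itself is then a single Darknet call, something along the lines of darknet detector train data/obj.data cfg/yolov3-tiny-custom.cfg yolov3-tiny.conv.11 (exact file names depend on your setup; the tutorial linked below covers this).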

I run Darknet for training on Windows, and use WSL2 to run DarkMark on Linux for labeling images and creating the .cfg config files for Darknet, which saves me HOURS!

I used this WSL2 hack to see the Linux GUI in Windows, so I could have Windows and Linux running simultaneously:
https://gist.github.com/tdcosta100/385636cbae39fc8cd0937139e87b1c74

DarkMark is super useful for labeling images and generating the .cfg config file for Darknet. Once an early version of your trained object detection model can recognise things in the images you are labeling, you can use it to help you label faster (so you can scale up to training on thousands of images and improve accuracy).

https://www.ccoderun.ca/darkmark/Summary.html#DarkMarkDarknet

The creator of DarkMark made a Discord group that is very helpful too. Feel free to join; I’m very active there for questions:
https://github.com/AlexeyAB/darknet/issues/6455

I personally liked this tutorial; it showed me the whole process:

https://www.youtube.com/watch?v=zJDUhGL26iU

Converting trained Darknet weights to ONNX

Converting Yolo v4 / v4 Tiny and Yolo v3 / v3 Tiny to ONNX works using this repo: https://github.com/jkjung-avt/tensorrt_demos#demo-5-yolov4

  1. Make sure you have your trained .cfg and .weights files from Darknet

  2. Follow the naming conventions from the link below and keep the names identical for the .cfg and .weights files. Take note to use this naming convention for future models you want to convert: the conventions are strict, but they make sense and will help you identify what your files are.
    e.g. “yolov3-tiny-multipersons-384x288.cfg” and “yolov3-tiny-multipersons-384x288.weights”, where ‘multipersons’ is just whatever you want to label the file as; for my scenario I wrote ‘test’ instead of ‘multipersons’
    More info on the naming needed:
    https://jkjung-avt.github.io/trt-yolov3-custom/

  3. Clone this repo and install its dependencies; in our case, Demo #4 and Demo #5 require TensorRT 6.x+, and you may need some tools like make. I tested this all on Linux and sussed it out despite being a total ML/Linux noob:
    https://github.com/jkjung-avt/tensorrt_demos

  4. Follow the setup instructions here up until step 4; you have the converted .onnx file once you have run python3 yolo_to_onnx.py -m yolov4-416
    ** if you are using a custom model with a custom number of classes, add --category_num at the end of the command; e.g. for 2 classes, use:
    python3 yolo_to_onnx.py -m yolov3-tiny-darkmarktest-416 --category_num 2

Feel free to continue the remaining steps if you want to test the ONNX graph in TF, or open it in the Netron graph viewer (web version) to compare its structure to your original .cfg:
https://github.com/jkjung-avt/tensorrt_demos#demo-5-yolov4
(it’s all set to run with Python 3 and I tested on Ubuntu 20.04 LTS)

More reading from the repo creator here; he talks about how the scripts work:
https://jkjung-avt.github.io/tensorrt-yolov3/

Finally… using it
Assuming you are using the repo here https://github.com/derenlei/Unity_Detection2AR
Bring in your .onnx file and make sure to import your names list as a .txt file for Unity to read.
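For clarity, the names list is just one class name per line, in the same order as the class IDs you trained with; e.g. (made-up classes):

```
person
drone
```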

  • Upload the model and label file to Assets/Models. Use the Inspector to update your model settings in Scene: Detect → Game Object: Detector Yolo2-tiny / Detector Yolo3-tiny. Update the anchor info in the DetectorYolo script here or here (see the decoding sketch after this list).

  • On the PhoneARCamera script in the scene, choose Yolo v2 or v3 from the dropdown; the script should be on a GameObject like ‘Camera’ or ‘phonecamera’.
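To give an idea of what that anchor info is used for, here is a rough sketch of the standard Yolo v2-style decode for one anchor in one grid cell (my own summary, not code from the Detection2AR repo; all names are made up):

```csharp
using UnityEngine;

// Sketch: turn the raw outputs (tx, ty, tw, th, to) for one anchor in
// grid cell (cx, cy) into a normalized bounding box plus a confidence.
static class YoloDecode
{
    static float Sigmoid(float x) => 1f / (1f + Mathf.Exp(-x));

    public static float DecodeBox(
        float tx, float ty, float tw, float th, float to,
        int cx, int cy, int gridW, int gridH,
        float anchorW, float anchorH, // anchor size in grid units
        out Rect box)
    {
        float bx = (cx + Sigmoid(tx)) / gridW;      // box centre x in 0..1
        float by = (cy + Sigmoid(ty)) / gridH;      // box centre y in 0..1
        float bw = anchorW * Mathf.Exp(tw) / gridW; // box width in 0..1
        float bh = anchorH * Mathf.Exp(th) / gridH; // box height in 0..1
        box = new Rect(bx - bw / 2f, by - bh / 2f, bw, bh);
        return Sigmoid(to); // objectness: keep boxes above some threshold
    }
}
```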

I hope that is helpful to you; any questions, please let me know.

I expect that in your case you will want NPCs or something in the game to send a camera feed to the Yolo model for object detection? You can see how we do it in the PhoneARCamera script, which currently sends the AR CPU image of what’s in the phone camera directly to Unity Barracuda using the Yolo v3 tiny model.
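For the NPC case, the same idea should work with an ordinary Unity Camera instead of the AR image: render it into a RenderTexture and hand that to Barracuda. A rough sketch (NpcCameraFeed and the 416x416 size are placeholders of mine; it assumes a worker set up as in the sketch earlier in the thread):

```csharp
using Unity.Barracuda;
using UnityEngine;

// Sketch: render an NPC camera to a RenderTexture each frame and run detection.
public class NpcCameraFeed : MonoBehaviour
{
    public Camera npcCamera;   // the in-game camera acting as the "eye"
    public NNModel modelAsset; // the converted Yolo .onnx
    IWorker worker;
    RenderTexture frame;

    void Start()
    {
        worker = WorkerFactory.CreateWorker(WorkerFactory.Type.Auto,
                                            ModelLoader.Load(modelAsset));
        frame = new RenderTexture(416, 416, 0); // match the model's input size
        npcCamera.targetTexture = frame;
    }

    void Update()
    {
        npcCamera.Render(); // force a render into the RenderTexture this frame
        using (var input = new Tensor(frame, 3)) // 3 = RGB channels
        {
            worker.Execute(input);
            var output = worker.PeekOutput(); // decode as in the repo's scripts
        }
    }

    void OnDestroy() => worker?.Dispose();
}
```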

Excellent material @ROBYER1_1, much appreciated sir

Hello, Sir

I also encountered a situation very similar to your question and came across your post while googling.

I left the following questions on the Unity Forum, but I did not get a clear answer.

( Unity Ml_Agent : Object detection in real-time )

I would like to ask if you have solved the problem.

I recently found another very similar success story within Unity, and I’m sharing it with you.

(GitHub - JoSihun/CameraAutoFlightMLAgents: Drone Autonomous Flight based on Camera using Unity ML Agent)

I’m sorry to ask you this question all of a sudden; I’d appreciate an answer whenever you have time. Thank you.

Thank you @ROBYER1_1 for the detailed explanation!
