I would like my Agent to select from a list of possible targets, evaluating them on the threat they represent. Since the number of possible targets can vary, it would be good to add them to a BufferSensor. However, the BufferSensor doesn’t depend on the order of its entries, so using a discrete action to select an index from the list doesn’t work.
Is there a way to solve this, to be able to select a particular element while using attention?
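To illustrate why an index-based discrete action has nothing stable to point at, here is a minimal sketch. It uses plain mean pooling as a stand-in for the attention module (an assumption made for brevity; the real BufferSensor pipeline is more involved), and the two-feature target vectors are made up for the example.

```python
def mean_pool(entities):
    """Order-independent summary of a variable-length list of feature vectors."""
    n = len(entities)
    dim = len(entities[0])
    return [sum(e[i] for e in entities) / n for i in range(dim)]

# Hypothetical targets: [threat, distance]
targets = [[0.75, 1.0], [0.25, 5.0], [0.5, 3.0]]
# Same targets, different order in the buffer
shuffled = [targets[2], targets[0], targets[1]]

pooled_a = mean_pool(targets)
pooled_b = mean_pool(shuffled)

# The network sees the same summary either way, so "pick entry 0"
# would refer to a different target in each case.
print(pooled_a == pooled_b)
```

Because the pooled representation is identical for both orderings, the policy cannot tell which target sits at which index.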
I’ll ask our research team about this. The policy will implicitly learn which object is the biggest threat and either avoid it or try to neutralize it depending on how your game is set up. There is no mechanism for “viewing” what it perceives as the object which has the biggest signal at the moment.
The Sorter environment does take order into account, but I don’t believe it helps with the problem you are trying to solve.
Focusing on the most pressing target would be the idea, but the targeting action would be ‘pick one of these N discrete targets’, not a continuous spectrum.
Hello,
I was planning to ask the exact same question today, so I’m joining this topic.
I’d also like to be able to pick a target from a dynamic set of units. However, I’m not able to find any easy general approach for this.
The approach to unit selection in OpenAI Five is interesting: they use embeddings to “encode” each unit’s features in a fixed-size vector of dimension X. The neural network then outputs a vector of the same dimension, and you take the dot product of that output vector with each unit’s embedding to find the unit that best matches the network output.
It works in a similar way to recommendation engines.
However, in ML-Agents we cannot define our own embeddings and train them. Also, I have no idea how OpenAI Five trained their embeddings. More here: https://neuro.cs.ut.ee/the-use-of-embeddings-in-openai-five/
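To make the dot-product idea concrete, here is a minimal sketch of how I understand it. This is not OpenAI Five’s actual code and not an ML-Agents API; the function names, the 3-dimensional embeddings, and the query vector are all made up for illustration, and in a real setup both the embeddings and the query would come from trained networks.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def select_unit(query, unit_embeddings):
    """Score every unit against the network's output vector.

    Returns the index of the best-matching unit and a softmax
    distribution over all units (useful for sampling during training).
    Works for any number of units, since each unit is scored independently.
    """
    scores = [dot(query, e) for e in unit_embeddings]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, probs

# Hypothetical unit embeddings (one fixed-size vector per visible unit)
units = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.0],
]
# Hypothetical output vector produced by the policy network
query = [0.1, 0.9, 0.0]

best, probs = select_unit(query, units)
print(best, probs)
```

The useful property is that the action is tied to a unit’s embedding rather than to its position in a list, so the same code handles three units or thirty.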
I’d like to know if there is a technical name for this kind of “unit selection”, because I’m not having much success with searches. (In particular, when I use the word “selection” on Google in a machine learning context, I’m flooded with results about feature selection.)
If you have any ideas for different approaches, I’m also very interested.
Thank you!
Hi,
We have a feature request to expose network activations that may be relevant here. Maybe we can bump up the priority on that in order to enable this type of functionality.
That would be awesome! I will definitely try it out as soon as this feature comes out.
This feature request is tracked internally as MLA-812. Thanks!
Awesome!
In the meantime, is there any alternative we could try in order to achieve this? Thanks.
Could you elaborate a bit on how the activations help with selecting an element? I’m not too familiar with how exactly the attention mechanism of the buffer sensor works.
Exposing the model activations will allow you to see which parts of the network were “activated” for a specific set of inputs. For the buffer sensor and the attention mechanism, this means you could, in theory, inspect the network activations in order to determine which Observation was considered to be the “most important” by the previous forward pass of the network.
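As a rough sketch of what that inspection could look like: the snippet below assumes the attention weights were available as an N x N matrix in which each row is one observation’s attention distribution over all observations. ML-Agents does not currently expose this, so the matrix, its values, and the function name are hypothetical, and a single head is assumed for simplicity.

```python
def most_attended(attention):
    """Given a square attention-weight matrix (rows = queries, columns = keys,
    each row summing to 1), return the index of the observation that receives
    the most attention on average, plus the per-observation averages."""
    n = len(attention)
    importance = [sum(row[j] for row in attention) / n for j in range(n)]
    best = max(range(n), key=importance.__getitem__)
    return best, importance

# Hypothetical weights from one forward pass over three buffer entries
attention = [
    [0.2, 0.7, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.5, 0.2],
]

best, importance = most_attended(attention)
print(best, importance)
```

Note that this only reports what the network attended to. It is a diagnostic read-out, not a trained targeting action, so high attention does not necessarily mean “the target I want to pick”.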
+1 to this feature, it would be great to understand if there’s been any progress / prioritization here since!
+1 from me too. I am also looking for this kind of feature.
+1