Say you have a model with three inputs a, b, c and two outputs x, y.
Perhaps your model is such that x = a + b and y = b + c.
That means if you want the output x you shouldn't need to put in the input c, and if you just want the output y you shouldn't need to put in the input a.
I have noticed that you get an error if you don't put in all the inputs (even the ones you don't need). That makes sense, since it doesn't know which output you will need, so it has to calculate everything.
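Roughly what I have to do at the moment (just a sketch; m_Engine, Execute, PeekOutput and the Tensor type are placeholder names following this thread, not necessarily the exact API):

var inputs = new Dictionary<string, Tensor>
{
    { "a", tensorA },
    { "b", tensorB },
    { "c", tensorC } // must be supplied even though I only want output "x"
};
m_Engine.Execute(inputs);
var x = m_Engine.PeekOutput("x"); // x = a + b, never reads c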
A suggestion would be to do:
m_Engine.Execute(m_Inputs, m_Outputs);
Then it would only need to calculate the parts of the graph that lead to the requested outputs. In most cases this is not a big deal, but in some models it might speed things up by as much as 2x if one or more of the outputs is not needed.
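Something like this is what I'm picturing, purely as a hypothetical signature to make the suggestion concrete:

// Hypothetical overload, it does not exist today
var wantedOutputs = new List<string> { "x" };
var neededInputs = new Dictionary<string, Tensor>
{
    { "a", tensorA },
    { "b", tensorB } // "c" omitted because "x" never depends on it
};
m_Engine.Execute(neededInputs, wantedOutputs); // only evaluates the subgraph feeding "x"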
===
On another issue, I noticed that if you miss out an input, say b, but you have already set that input in a previous inference cycle, then that input seems to be saved and used again. (I don't know if that is safe, or whether that part of memory could be overwritten?)
Hey! Good spot.
We actually do that, but at import time.
Doing it at runtime is a bit too costly, because we would need to parse the graph and then skip nodes during execution.
We can expose our model optimizer methods in the public API if you want; that way you can remove all unused nodes given your model inputs.
As for inputs left unspecified in the next inference cycle, I'll investigate.
The way I was thinking of would be to start at the given output nodes and recursively collect their ancestor nodes until it finds which inputs are needed, and then when doing the inference just skip the inputs that are not needed. I don't know how it works in the backend, so I'm not sure how costly that would be.
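Something like this, over a generic node-to-parents map (just a sketch of the idea; I have no idea how the backend actually represents its graph):

using System.Collections.Generic;

static class SubgraphUtil
{
    // Given node -> direct parents, which model inputs feed the wanted outputs?
    public static HashSet<string> RequiredInputs(
        Dictionary<string, string[]> parents,
        HashSet<string> modelInputs,
        IEnumerable<string> wantedOutputs)
    {
        var required = new HashSet<string>();
        var visited = new HashSet<string>();
        var stack = new Stack<string>(wantedOutputs);
        while (stack.Count > 0)
        {
            var node = stack.Pop();
            if (!visited.Add(node)) continue;                 // already handled
            if (modelInputs.Contains(node)) { required.Add(node); continue; }
            if (parents.TryGetValue(node, out var p))
                foreach (var parent in p) stack.Push(parent); // walk up the graph
        }
        return required;
    }
}

For the example model above (x = a + b, y = b + c), asking for just "x" would come back with {a, b}, so the c input could be skipped.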
I don't think I'd want to remove any nodes, since they might be needed later if I want the other output; it's more about just skipping some. (I guess I could load the model twice and prune different nodes off each copy, but that wouldn't be very efficient.)
This is actually related to the other issue: if you miss out an input because you think you don't need it, it might use the cached input that you gave last time. (That's how I noticed it in the first place.) The cached input could make the inference fail if it is the wrong shape, even though it has no effect on the output.
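This is roughly the sequence where I ran into it (same placeholder naming as before):

// Cycle 1: all inputs supplied
m_Engine.Execute(new Dictionary<string, Tensor> { { "a", a1 }, { "b", b1 }, { "c", c1 } });

// Cycle 2: I only want "x", so I leave "c" out...
m_Engine.Execute(new Dictionary<string, Tensor> { { "a", a2 }, { "b", b2 } });
// ...but the c1 tensor from cycle 1 seems to be reused, and if its shape no longer
// matches it can make the inference fail, even though "x" never depends on c.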
OK, so if you want to keep the model unchanged, then I'd recommend using 0-dim tensors.
For example, if one input has shape (A, 2, 3), with A being dynamic, you can pass in an input of shape (0, 2, 3), and this will essentially cancel out the subgraph for that input.
A few caveats to be aware of:
- your dim needs to be dynamic so it doesn't mess up shape inference.
- there are some rules regarding 0-dim tensors: Concat will skip the input, and if you reduce on a 0-dim tensor then we'll return a 0f constant… but those rules are the same as in torch.
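As a rough sketch (the exact tensor and shape constructors here may differ depending on which version you're on):

// Sketch only: constructor names may differ between versions.
// "c" is the input we want to cancel; its shape is (A, 2, 3) with A dynamic.
var empty = new TensorFloat(new TensorShape(0, 2, 3), new float[0]); // zero elements
m_Engine.Execute(new Dictionary<string, Tensor>
{
    { "a", tensorA },
    { "b", tensorB },
    { "c", empty } // the branch that consumes "c" is effectively cancelled
});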