Model didn't import: ljspeech-jets-onnx

This model didn’t import:

ConvTranspose only supported in 2D case
ConvTranspose.InputError: outputPadding must have two values per spatial dimension or be null
Assertion failure. Value was False

I tested the same model with ONNX Runtime and it works.

Yes, apologies: ConvTranspose only supports the 2D case in the release.
We’ll fix this in an upcoming version.

No worries.
That would be great for a future version. I just tried it with ONNX Runtime and it is a really nice text-to-speech model:

I am trying to replicate your Orb demo! This model doesn’t have lip sync data though.

I could imagine using something like this for a game character’s voice.

Until we fix the ConvT 1D case you can try

Although you’ll need to code the FFT as a Conv1D call

That might be good for speech recognition, but I’m looking for speech generation. I’m sure I’ll find something.

One workaround for the ConvTranspose 1D case is to override it with a Custom Layer (cf. the sample),
in which you reshape the input to a 4D shape, call ConvTranspose, and reshape back to the original shape.

I had a look at this but I didn’t really understand what I was doing.

Also the model needs the “If” operator apparently.

Ah yes, OK. Then it will take a bit more time to support.

Amazingly I got this to work now! It was quite complicated:

  • Created some Python code that deletes the “If” nodes from the ONNX file and always routes execution down the “true” branches.
  • Overrode the ConvTranspose as suggested with some quick and dirty code. The main part of it is:
    public override Tensor Execute(Tensor[] inputs, ExecutionContext ctx)
    {
        var X = inputs[0] as TensorFloat;
        var K = inputs[1] as TensorFloat;
        var B = inputs[2] as TensorFloat;

        // Append a trailing spatial dimension of size 1 so the 1D input and kernel look 2D.
        X = ctx.ops.Reshape(X, new TensorShape(X.shape[0], X.shape[1], X.shape[2], 1)) as TensorFloat;
        K = ctx.ops.Reshape(K, new TensorShape(K.shape[0], K.shape[1], K.shape[2], 1)) as TensorFloat;

        // Extend the 1D parameters to their 2D form (padding is simply zeroed here).
        if (strides.Length != 2) strides = new int[] { strides[0], 1 };
        if (pads.Length != 4) pads = new int[] { 0, 0, 0, 0 };
        outputPadding = new int[] { 0, 0, 0, 0 };
        X = ctx.ops.Conv2DTrans(X, K, B, strides, pads, outputPadding, FusableActivation.None);

        // Drop the dummy dimension again.
        X = ctx.ops.Reshape(X, new TensorShape(X.shape[0], X.shape[1], X.shape[2])) as TensorFloat;

        return X;
    }

This no doubt breaks the usual 2D case, but it is just a test for the 1D case. Setting the padding parameters to zero seems to work for now, although pads = new int[] { pads[0], 0, pads[1], 0 } is probably the correct thing to do.
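
To illustrate why the reshape trick and the pads = { pads[0], 0, pads[1], 0 } mapping work, here is a minimal pure-Python sketch (not Sentis code). It assumes a single batch and a single channel, no dilation and no output padding, and the ONNX pads layout of all begin values followed by all end values. The function names are made up for this illustration.

```python
def conv_transpose_1d(x, k, stride=1, pads=(0, 0)):
    """Reference 1D transposed convolution; pads = (begin, end)."""
    full = [0.0] * ((len(x) - 1) * stride + len(k))
    for i, xv in enumerate(x):
        for j, kv in enumerate(k):
            # Each input sample scatters a scaled copy of the kernel.
            full[i * stride + j] += xv * kv
    begin, end = pads
    # Padding on a transposed convolution crops the full output.
    return full[begin:len(full) - end]


def conv_transpose_2d(x, k, strides=(1, 1), pads=(0, 0, 0, 0)):
    """Reference 2D version; pads = (h_begin, w_begin, h_end, w_end)."""
    h, w = len(x), len(x[0])
    kh, kw = len(k), len(k[0])
    sh, sw = strides
    oh, ow = (h - 1) * sh + kh, (w - 1) * sw + kw
    full = [[0.0] * ow for _ in range(oh)]
    for i in range(h):
        for j in range(w):
            for a in range(kh):
                for b in range(kw):
                    full[i * sh + a][j * sw + b] += x[i][j] * k[a][b]
    h_begin, w_begin, h_end, w_end = pads
    return [row[w_begin:ow - w_end] for row in full[h_begin:oh - h_end]]


def conv_transpose_1d_via_2d(x, k, stride=1, pads=(0, 0)):
    """1D case expressed through the 2D routine, as in the workaround."""
    x2 = [[v] for v in x]  # shape (L,) -> (L, 1)
    k2 = [[v] for v in k]  # shape (K,) -> (K, 1)
    y2 = conv_transpose_2d(
        x2, k2,
        strides=(stride, 1),             # stride 1 along the dummy axis
        pads=(pads[0], 0, pads[1], 0),   # zero padding along the dummy axis
    )
    return [row[0] for row in y2]        # shape (L_out, 1) -> (L_out,)
```

Under these assumptions both routes give the same output for the same stride and pads, which is consistent with the pads mapping suggested above.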

Yeah, you are correct.

FYI we have this in our backlog to fix, known as issue 27.

Hey there - A fix for the ConvT 1D case was included in 1.1.0… sorry I forgot to post here until now! Check it out and let us know if you see any improvement. I will mark this thread resolved for now though.

I’m just getting started and trying to use the same model but I’m getting a KeyNotFoundException when selecting it.
This does not happen with the sample models.

I just tried to import the model locally, unfortunately it seems to contain an “If” operator.
The current version of Sentis does not support this operator. You can see a list of all of the operators we support and don’t support here:
I believe the other errors you are seeing are just following on from the model not being able to import. We will aim to make our debug errors clearer (issue 114 internally).

Thanks for the feedback and link.
Would you also suggest this Inspect ONNX Model Operators to see which operators are used in an ONNX model?

Yes, we also recommend for examining .onnx models outside of Sentis.
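
For a quick scripted check of which operators a model uses, something like the following sketch could work. It assumes the onnx Python package for loading the file; the function itself only reads the node, op_type, attribute, g and graphs fields of the graph, and the name count_ops and the path "model.onnx" are placeholders.

```python
from collections import Counter


def count_ops(graph):
    """Count operator types in an ONNX-style graph, including subgraphs."""
    counts = Counter()
    for node in graph.node:
        counts[node.op_type] += 1
        for attr in node.attribute:
            # Control-flow operators such as If carry their branches as
            # graph-valued attributes; for other attributes `g` is an
            # empty graph and `graphs` is an empty list.
            for sub in (attr.g, *attr.graphs):
                counts.update(count_ops(sub))
    return counts


# Usage sketch (assumes `pip install onnx`; the path is a placeholder):
#   import onnx
#   model = onnx.load("model.onnx")
#   for op, n in sorted(count_ops(model.graph).items()):
#       print(op, n)
```

The resulting operator names can then be compared by hand against the list of supported operators.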

Side note: you actually get the correct error when it’s imported at the beginning.

Take a look at the end of the model; you can see what the “If” nodes are doing.

It’s super trivial behaviour. I’d just remove them from the model directly, either by modifying the ONNX file or by doing that in Sentis.
You can follow the CustomLayer sample and either make the “If” a no-op or do the shape logic it’s doing.
You can then choose whether to remove those last 10 nodes, which are not that useful for inference.
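
For the route of modifying the ONNX file directly, a rough sketch of replacing each “If” node with the contents of its “true” branch might look like this. It is not the script used earlier in this thread. It assumes the onnx Python package for loading and saving, handles only top-level If nodes, and gives up when a branch merely forwards an outer value; the function name and file paths are placeholders.

```python
def inline_then_branches(graph):
    """Replace every top-level If node with the nodes of its then_branch.

    `graph` is expected to look like an onnx GraphProto.
    """
    new_nodes = []
    for node in graph.node:
        if node.op_type != "If":
            new_nodes.append(node)
            continue
        branch = next(a.g for a in node.attribute if a.name == "then_branch")
        # The i-th branch output takes over the name of the i-th If output,
        # so downstream consumers keep working unchanged.
        rename = {out.name: new for out, new in zip(branch.output, node.output)}
        produced = {name for sub in branch.node for name in sub.output}
        missing = set(rename) - produced
        if missing:
            # A branch that forwards an outer value or initializer would
            # need an extra Identity node; this sketch does not do that.
            raise NotImplementedError(f"pass-through outputs: {sorted(missing)}")
        for sub in branch.node:
            for i in range(len(sub.input)):
                sub.input[i] = rename.get(sub.input[i], sub.input[i])
            for i in range(len(sub.output)):
                sub.output[i] = rename.get(sub.output[i], sub.output[i])
            new_nodes.append(sub)
        # Constants defined inside the branch move to the outer graph.
        graph.initializer.extend(branch.initializer)
    del graph.node[:]
    graph.node.extend(new_nodes)
    return graph


# Usage sketch (assumes `pip install onnx`; the paths are placeholders):
#   import onnx
#   model = onnx.load("model.onnx")
#   inline_then_branches(model.graph)
#   onnx.save(model, "model_no_if.onnx")
```

The node that computed the condition is left in place; it simply has no consumer afterwards. It is worth checking the edited file with ONNX Runtime before importing it.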

Thanks for this information and the push in this direction.

I saw that they are at the end and for a second I was wondering if I could remove them. But as I have never worked with this, I thought it can’t be that “simple”.

With 10 nodes you mean everything after the Conv, right?

Yes, I mean remove everything after the Tanh layer

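var newLayers = new List<Layer>();
foreach (var layer in model.layers)
{
    newLayers.Add(layer);
    if (layer.name == "...") break; // stop after the Tanh layer (comparing layer.name is an assumption)
}
model.layers = newLayers;
model.outputs = new [] {...};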