3D to image comparison, getting errors

Background: I want to build a 3D object out of a set of 15 base shapes, adjusting each shape’s position, rotation, scale, color and so on via actions, with the goal of matching a face photo. With 14 values per shape, this gives a (perhaps problematically high) total of 15 * 14 = 210 continuous actions. The agent’s only observation is an 84x84 camera pointed at the randomized photo on a quad. The reward measures how closely a snapshot from a second camera, pointed at the 3D creation, matches the photo, using the summed color distance over all pixels. I end each episode immediately via Done() after handing out the reward (not sure if that’s even appropriate). ResetOnDone then repeats the same process.
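
As a rough illustration of the reward described above, here is a minimal Python sketch. It assumes both images are flat lists of (r, g, b) tuples with channels in the 0..1 range. The offset of 100 and the factor of 0.01 are taken from the agent code further down; the function names are made up for this example.

```python
NUM_SHAPES = 15
ACTIONS_PER_SHAPE = 14
TOTAL_ACTIONS = NUM_SHAPES * ACTIONS_PER_SHAPE  # 210 continuous actions


def color_distance(photo, render):
    """Sum of per-channel absolute differences over all pixels."""
    if len(photo) != len(render):
        raise ValueError("images must have the same number of pixels")
    return sum(
        abs(p - r)
        for photo_px, render_px in zip(photo, render)
        for p, r in zip(photo_px, render_px)
    )


def reward(photo, render, offset=100.0, scale=0.01):
    """Higher is better: an exact match scores `offset`."""
    return offset - scale * color_distance(photo, render)


if __name__ == "__main__":
    black = [(0.0, 0.0, 0.0)] * 4
    white = [(1.0, 1.0, 1.0)] * 4
    print(TOTAL_ACTIONS, reward(black, black), reward(black, white))
```

Because every mismatching channel lowers the reward by the same fixed amount, the value can go well below zero for large images, which would be consistent with the negative mean reward in the log below.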

The Error: Even when trying various Config YAML settings (e.g. I tried time_horizon: 1, but also much higher, more normal values), I keep getting errors in the Anaconda/TensorFlow prompt after some steps. What might I be doing wrong? Error below. Thanks!

INFO:mlagents.trainers: Main: ClayxelFaceMatcher: Step: 1000. Time Elapsed: 22.805 s Mean Reward: -65.020. Std of Reward: 0.000. Training.
Process Process-1:
Traceback (most recent call last):
  File "e:\_misc\programs\anaconda\envs\ml-agents\lib\multiprocessing\process.py", line 258, in _bootstrap
    self.run()
  File "e:\_misc\programs\anaconda\envs\ml-agents\lib\multiprocessing\process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "e:\_misc\programs\anaconda\envs\ml-agents\lib\site-packages\mlagents\trainers\subprocess_env_manager.py", line 132, in worker
    env.step()
  File "e:\_misc\programs\anaconda\envs\ml-agents\lib\site-packages\mlagents_envs\timers.py", line 262, in wrapped
    return func(*args, **kwargs)
  File "e:\_misc\programs\anaconda\envs\ml-agents\lib\site-packages\mlagents_envs\environment.py", line 326, in step
    self._update_state(rl_output)
  File "e:\_misc\programs\anaconda\envs\ml-agents\lib\site-packages\mlagents_envs\environment.py", line 283, in _update_state
    agent_info_list, self._env_specs[brain_name]
  File "e:\_misc\programs\anaconda\envs\ml-agents\lib\site-packages\mlagents_envs\timers.py", line 262, in wrapped
    return func(*args, **kwargs)
  File "e:\_misc\programs\anaconda\envs\ml-agents\lib\site-packages\mlagents_envs\rpc_utils.py", line 127, in batched_step_result_from_proto
    _process_visual_observation(obs_index, obs_shape, agent_info_list)
  File "e:\_misc\programs\anaconda\envs\ml-agents\lib\site-packages\mlagents_envs\timers.py", line 262, in wrapped
    return func(*args, **kwargs)
  File "e:\_misc\programs\anaconda\envs\ml-agents\lib\site-packages\mlagents_envs\rpc_utils.py", line 73, in _process_visual_observation
    for agent_obs in agent_info_list
  File "e:\_misc\programs\anaconda\envs\ml-agents\lib\site-packages\mlagents_envs\rpc_utils.py", line 73, in <listcomp>
    for agent_obs in agent_info_list
  File "e:\_misc\programs\anaconda\envs\ml-agents\lib\site-packages\mlagents_envs\timers.py", line 262, in wrapped
    return func(*args, **kwargs)
  File "e:\_misc\programs\anaconda\envs\ml-agents\lib\site-packages\mlagents_envs\rpc_utils.py", line 51, in process_pixels
    image.load()
  File "e:\_misc\programs\anaconda\envs\ml-agents\lib\site-packages\PIL\ImageFile.py", line 250, in load
    self.load_end()
  File "e:\_misc\programs\anaconda\envs\ml-agents\lib\site-packages\PIL\PngImagePlugin.py", line 677, in load_end
    self.png.call(cid, pos, length)
  File "e:\_misc\programs\anaconda\envs\ml-agents\lib\site-packages\PIL\PngImagePlugin.py", line 140, in call
    return getattr(self, "chunk_" + cid.decode('ascii'))(pos, length)
  File "e:\_misc\programs\anaconda\envs\ml-agents\lib\site-packages\PIL\PngImagePlugin.py", line 356, in chunk_IDAT
    raise EOFError
EOFError

That’s pretty strange; it looks like the visual observation is getting truncated or corrupted. Two things that might help debug this:

  1. I assume you’re using a CameraSensorComponent - can you set the compression type to None? That will send the data as floats instead of PNG, so it’ll bypass PIL (the Python imaging library that is raising the exception).
  2. Do you mind saving a Demonstration file with your observations and attaching it here? We can hopefully load it and see if something is wrong with the visual data. You’ll probably need to set your “agent” to Heuristic behavior type (since it won’t be able to do inference, and training will crash).

Also, which pip versions of PIL and/or Pillow do you have installed?
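
The EOFError raised inside chunk_IDAT suggests the PNG data ended before the decoder expected it to. As a way to check a captured byte buffer without going through PIL, here is a minimal sketch using only the Python standard library. The helper names (check_png, make_chunk, tiny_png) are made up for this example, and it does not show how to get the raw bytes out of ML-Agents.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def check_png(data):
    """Return (ok, message) describing whether `data` is a complete PNG stream."""
    if not data.startswith(PNG_SIGNATURE):
        return False, "missing PNG signature"
    pos = len(PNG_SIGNATURE)
    while pos < len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, payload, 4-byte CRC.
        if pos + 8 > len(data):
            return False, "truncated chunk header at byte %d" % pos
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        name = ctype.decode("ascii", "replace")
        end = pos + 8 + length + 4
        if end > len(data):
            return False, "truncated %s chunk at byte %d" % (name, pos)
        payload = data[pos + 8:pos + 8 + length]
        (crc,) = struct.unpack(">I", data[end - 4:end])
        # The CRC covers the chunk type and the payload, not the length field.
        if zlib.crc32(ctype + payload) & 0xFFFFFFFF != crc:
            return False, "bad CRC in %s chunk at byte %d" % (name, pos)
        if ctype == b"IEND":
            return True, "complete"
        pos = end
    return False, "no IEND chunk (stream ends early)"


def make_chunk(ctype, payload):
    """Build one PNG chunk with a correct CRC (used only for the demo below)."""
    crc = zlib.crc32(ctype + payload) & 0xFFFFFFFF
    return struct.pack(">I", len(payload)) + ctype + payload + struct.pack(">I", crc)


def tiny_png():
    """A valid 1x1 8-bit RGB PNG holding a single red pixel."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 2, 0, 0, 0)
    idat = zlib.compress(b"\x00\xff\x00\x00")  # filter byte + R, G, B
    return (PNG_SIGNATURE + make_chunk(b"IHDR", ihdr)
            + make_chunk(b"IDAT", idat) + make_chunk(b"IEND", b""))


if __name__ == "__main__":
    good = tiny_png()
    print(check_png(good))
    print(check_png(good[:-20]))  # cut off partway through the IDAT chunk
```

If a buffer from the failing step reports a truncated IDAT chunk, that would point at the data being cut short before it reaches Python rather than at PIL itself.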


Thanks! Yes, I’m using a CameraSensorComponent. I’m still on ML-Agents 0.13, and it looks like the Compression option was only added in 0.14, so I guess I should try to upgrade first (I had 0.14 before, but had to downgrade because it didn’t play along with whatever Python libraries I had, I think). I’m using the Anaconda setup route, by the way, as I haven’t yet gotten the newly suggested Python integration route to work (the Anaconda route worked fine for me).

For reference, my versions:
ml-agents: 0.13.0
ml-agents-envs: 0.13.0
Communicator API: API-13
TensorFlow: 1.7.1
Pip: 18.1
Python: 3.7 (I have a bunch of different ones installed, too)
Barracuda: 0.4.0 (0.6.0 throws errors with my other setup)

Is my agent subclass what you mean by a Demonstration file? If upgrading to 0.14 fails to solve this, I can also share my full setup, but for starters, my agent subclass is below.

using UnityEngine;
using MLAgents;

public class ClayxelsAgent : Agent
{
    [SerializeField] Transform photoQuad = null;
    [SerializeField] RenderTexture clayxelsRenderTexture = null;

    Clayxel clayxelWrapper = null;
    ClayObject[] clayObjects = null;
    const int actionsPerClayObject = 14;
    int maxActions = -1;

    string[] imageNames = null;
    const string resourcesPath = "E:\\Projects\\Shapemaker\\Assets\\Resources";
    Material photoMaterial = null;
    Texture2D photoTexture = null;

    public override void InitializeAgent()
    {
        clayxelWrapper = GetComponent<Clayxel>();
        clayObjects = GetComponentsInChildren<ClayObject>();
        maxActions = clayObjects.Length * actionsPerClayObject;

        GetImageNames();
  
        Renderer renderer = photoQuad.GetComponent<Renderer>();
        photoMaterial = renderer.material;
  
        SetRandomImageOnPhotoQuad();
    }

    public override void CollectObservations()
    {
        // None needed, only CameraSensorComponent is automatically observed.
    }

    public override void AgentAction(float[] actions)
    {
        DoActions(actions);
        HandleReward();
        Done();
    }

    void DoActions(float[] actions)
    {
        for (int i = 0; i < clayObjects.Length; i++)
        {
            ClayObject clayObject = clayObjects[i];
            Transform clayTransform = clayObject.transform;

            int n = i * actionsPerClayObject;

            clayTransform.localPosition = new Vector3(
                actions[n++] * 2.5f,
                actions[n++] * 2.5f,
                actions[n++] * 1.5f
            );

            const float maxAngle = 180f;
            clayTransform.localEulerAngles = new Vector3(
                actions[n++] * maxAngle,
                actions[n++] * maxAngle,
                actions[n++] * maxAngle
            );

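            const float minScale = 0.1f;
            const float maxScale = 3f;
            // Remap each action from [-1, 1] to [minScale, maxScale] so the
            // scale can never become negative.
            clayTransform.localScale = new Vector3(
                minScale + (actions[n++] + 1f) * 0.5f * (maxScale - minScale),
                minScale + (actions[n++] + 1f) * 0.5f * (maxScale - minScale),
                minScale + (actions[n++] + 1f) * 0.5f * (maxScale - minScale)
            );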

            clayObject.color = new Color(
                (actions[n++] + 1f) * 0.5f,
                (actions[n++] + 1f) * 0.5f,
                (actions[n++] + 1f) * 0.5f,
                1f
            );

            clayObject.blend = actions[n++] * 1.5f;

            float roundness = (actions[n++] + 1f) * 0.5f * 0.5f;
            const float mirrorXOption = 2.0f;
            clayObject.attrs = new Vector4(roundness, 0f, 0f, mirrorXOption);
        }

        clayxelWrapper.needsUpdate = true;
        clayxelWrapper.Update();
        // Clayxel.reloadAll();
    }

    void HandleReward()
    {
        float colorDistance = 0f;

        Texture2D renderTexture2D = new Texture2D(
            clayxelsRenderTexture.width, clayxelsRenderTexture.height,
            TextureFormat.RGBA32, false
        );
        RenderTexture.active = clayxelsRenderTexture;
        renderTexture2D.ReadPixels(
            new Rect(0, 0, clayxelsRenderTexture.width, clayxelsRenderTexture.height), 0, 0
        );
        renderTexture2D.Apply();

        int width = clayxelsRenderTexture.width;
        int height = clayxelsRenderTexture.height;
        Color[] colorsSource = photoTexture.GetPixels(0, 0, width, height);
        Color[] colorsClayxels = renderTexture2D.GetPixels(0, 0, width, height);

        for (int i = 0; i < colorsSource.Length; i++)
        {
            colorDistance +=
                Mathf.Abs(colorsSource[i].r - colorsClayxels[i].r) +
                Mathf.Abs(colorsSource[i].g - colorsClayxels[i].g) +
                Mathf.Abs(colorsSource[i].b - colorsClayxels[i].b);
        }

        float reward = 100f - colorDistance * 0.01f;
        // print(reward);
        AddReward(reward);

        // Release the temporary texture; otherwise a new one leaks every step.
        Destroy(renderTexture2D);
    }

    public override void AgentReset()
    {
        SetRandomImageOnPhotoQuad();
    }

    public override float[] Heuristic()
    {
        float[] actions = new float[maxActions];
  
        float randomizeMax = Input.GetKey(KeyCode.Space) ? 1f : 0.1f;
        for (int i = 0; i < actions.Length; i++)
        {
            actions[i] = UnityEngine.Random.Range(-randomizeMax, randomizeMax);
        }

        return actions;
    }

    void SetRandomImageOnPhotoQuad()
    {
        int randomIndex = UnityEngine.Random.Range(0, imageNames.Length);
        string path = imageNames[randomIndex];
        photoTexture = Resources.Load(path) as Texture2D;
        photoMaterial.mainTexture = photoTexture;
    }

    void GetImageNames()
    {
        imageNames = System.IO.Directory.GetFiles(resourcesPath + "\\Faces", "*.jpg");
        for (int i = 0; i < imageNames.Length; i++)
        {
            imageNames[i] = imageNames[i].Replace(resourcesPath + "\\", "");
            imageNames[i] = imageNames[i].Replace(".chip.jpg", ".chip");
        }
    }

}
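
For readers who want to check the action layout outside Unity, the 14 values per shape consumed in DoActions can be sketched in Python as follows. This is only an illustration under stated assumptions: actions are taken to arrive in [-1, 1], the function names are made up, and the scale is remapped so that it stays within [0.1, 3], which is a choice of this sketch.

```python
NUM_SHAPES = 15
ACTIONS_PER_SHAPE = 14  # 3 position + 3 rotation + 3 scale + 3 color + blend + roundness


def remap(a, lo, hi):
    """Map an action in [-1, 1] linearly onto [lo, hi]."""
    return lo + (a + 1.0) * 0.5 * (hi - lo)


def decode_shape(actions, index):
    """Turn the 14 raw actions of shape `index` into named parameters."""
    start = index * ACTIONS_PER_SHAPE
    a = actions[start:start + ACTIONS_PER_SHAPE]
    if len(a) != ACTIONS_PER_SHAPE:
        raise ValueError("not enough actions for shape %d" % index)
    return {
        "position": (a[0] * 2.5, a[1] * 2.5, a[2] * 1.5),
        "euler_angles": tuple(v * 180.0 for v in a[3:6]),
        "scale": tuple(remap(v, 0.1, 3.0) for v in a[6:9]),
        "color": tuple(remap(v, 0.0, 1.0) for v in a[9:12]),
        "blend": a[12] * 1.5,
        "roundness": remap(a[13], 0.0, 0.5),
    }


if __name__ == "__main__":
    neutral = [0.0] * (NUM_SHAPES * ACTIONS_PER_SHAPE)
    print(decode_shape(neutral, 0))
```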

In Anaconda it still shows “ml-agents: 0.13.0” (causing a mismatch with Unity’s 0.14), even though I’ve just spent hours doing all the Python, Conda and Pip upgrades, uninstalling and upgrading TensorFlow, TensorBoard etc., removing my old ML-Agents, grabbing the 0.14 one, restarting Win 10 multiple times, and so on.

I have now also completely uninstalled all Python versions, re-installed the suggested Python 3.7, and done various Path additions and restarts, trying to get the newly suggested non-Anaconda install route to work. I’m still getting errors during “pip3 install mlagents”, though: “Could not find a version that satisfies the requirement tensorflow<2.1,>=1.7 (from mlagents) (from versions: none)”. (There were also Pip 19 to 20 version upgrade warnings, which I got rid of after several tries and restarts.) I’m now back to trying to install Anaconda again, but running into the same issue. Using the alternative “pip3 install --upgrade https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.7.1-py3-none-any.whl” did something, but now I’m getting “ImportError: No module named ‘_pywrap_tensorflow_internal’” when trying “mlagents-learn config/trainer_config.yaml --run-id=Main --train” in Anaconda. And now, I’m getting “Could not find conda environment: ml-agents” when trying “activate ml-agents”.

The whole Python dependency chain of the terrific Unity ML-Agents is a big stumbling block for me. Is there something on the roadmap that would create a nearly-one-click Windows install for the uninitiated like me who just care about the Unity C# side of it? I wonder if my best option might be to just wait for that.

Sorry for the Python troubles. The “Could not find a version that satisfies the requirement tensorflow<2.1,>=1.7 (from mlagents) (from versions: none)” error sounds like you might have gotten Python 3.8 instead of 3.7 - TensorFlow doesn’t currently support 3.8. Can you run “python --version” to check?

If you can get back to the original problem - what I was looking for by “Demonstration file” was the output from adding a Demonstration Recorder to the Agent: https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Training-Imitation-Learning.md#recording-demonstrations
That will capture the observations of the Agent so we can try to load it back up.

And when you do get up and running again, can you tell me the output of “pip3 show pillow” and “pip3 show pil”?
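
If it helps, the two checks asked for above (Python version and installed Pillow/PIL) can be run from inside the active environment with a small script. This is a sketch, not part of ML-Agents: the function names are made up, the upper bound of 3.8 comes from the remark above that TensorFlow doesn’t currently support 3.8, and the lower bound of 3.6 is an assumption.

```python
import sys


def python_ok_for_tf1(version_info, min_version=(3, 6), max_exclusive=(3, 8)):
    """True if the (major, minor) version falls inside the assumed supported range."""
    return min_version <= tuple(version_info[:2]) < max_exclusive


def installed_version(dist_name):
    """Return the installed version of a pip distribution, or None if absent."""
    try:
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:  # Python < 3.8 has no importlib.metadata
        import pkg_resources
        try:
            return pkg_resources.get_distribution(dist_name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("python:", ".".join(str(p) for p in sys.version_info[:3]),
          "ok" if python_ok_for_tf1(sys.version_info) else "outside assumed range")
    for name in ("pillow", "pil", "tensorflow", "mlagents"):
        print(name, "->", installed_version(name))
```

Printing sys.executable alongside the versions shows which Python installation actually answered, which matters when several are installed side by side.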

Hi! Thanks for the reply. In the Anaconda command line, “python --version” tells me Python 3.7.4. (For what it’s worth, earlier that day I had installed, then uninstalled, Python 3.8.)

If there was a single Unity-made Exe that installs all ML-Agents dependencies – I’d give it system-wide rights and tick any box saying “this will get rid of any other Python installed”, just for it to solve all problems! :) But that might be an impossibility to implement, I guess. And I guess there’s also no way to magically auto-convert TensorFlow to C# to ease setup.

I’m no Python expert, but I always assumed that creating a fresh environment and running a setup-file-based install should take care of all dependencies. I installed ml-agents quite some time ago, back when using conda was still the recommended practice. Since then I’ve only updated my existing environment, and I’m currently working with 0.12.
On the other hand though, I wonder if virtualenv and conda environments are completely self-contained? If that’s the case, maybe we could reserve a space in the forum or on GitHub for people to zip and share their working ml-agents Python environments? Not sure if that would make sense or be practical, just an idea.
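
Related to the environment question, a quick way to see which environment the running interpreter actually belongs to is sketched below. It relies on sys.prefix and sys.base_prefix from the standard library and on the CONDA_DEFAULT_ENV and CONDA_PREFIX variables that conda sets on activation. The function name is made up for this example, and the sketch does not answer whether a zipped environment would work on another machine.

```python
import os
import sys


def describe_environment(prefix, base_prefix, environ):
    """Summarize which Python environment an interpreter is running in."""
    return {
        "interpreter_prefix": prefix,
        # In a venv/virtualenv, sys.prefix differs from sys.base_prefix.
        "in_virtualenv": prefix != base_prefix,
        # Set by conda when an environment has been activated.
        "conda_env": environ.get("CONDA_DEFAULT_ENV"),
        "conda_prefix": environ.get("CONDA_PREFIX"),
    }


if __name__ == "__main__":
    info = describe_environment(sys.prefix, sys.base_prefix, os.environ)
    for key, value in info.items():
        print("%s: %s" % (key, value))
```

If conda_env is empty or does not read ml-agents after “activate ml-agents”, the packages being reported are probably coming from a different installation.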