Azure OCR on local image

Hello, I’m hoping someone can help me because I’m going a little bit crazy after many, many failed attempts… (I’m a noob, so I’m out of my depth here…)

I want to use Azure to run OCR on a local image in my (Android) app folder. The Azure docs say that to read a local image, you put the binary image data in the HTTP request body and set the Content-Type to application/octet-stream.

I’m using NativeGallery to take a screenshot on the phone and then I copy it to the local app folder using System.IO.File.Copy. I then convert the local image into a texture, then convert to bytes. I use WWWForm to create a web form, which Unity says to use for uploading images. I pass the form into my Post request but nothing happens… I’ve put what I think are the relevant bits below.

Thank you :)

private IEnumerator TakeScreenshotAndSave() //take a screenshot and save it
{
    yield return new WaitForEndOfFrame();

    Texture2D ss = new Texture2D(Screen.width, Screen.height, TextureFormat.RGB24, false);
    ss.ReadPixels(new Rect(0, 0, Screen.width, Screen.height), 0, 0);
    ss.Apply(); // takes screenshot and stores in texture

    // save image to phone gallery. copy image to local app folder
    NativeGallery.Permission permission = NativeGallery.SaveImageToGallery(ss, "TESTfolder", "image.png", (success, path) => System.IO.File.Copy(path, Application.persistentDataPath + "/SavedImage" + ".png", true));
    Debug.Log("Permission result: " + permission);

    ScreenshotLocation = (Application.persistentDataPath + "/SavedImage.png"); // path to screenshot

    ScanImageWithOCR(); // call the OCR function

    Destroy(ss); // To avoid memory leaks
}
////////////////////////////////////

void ScanImageWithOCR() // convert the image to bytes and upload to Azure
{
    // set up the subscription key header
    RequestHeader clientSecurityHeader = new RequestHeader
    {
        Key = clientId,
        Value = clientSecret
    };

    // set up the content type header
    RequestHeader contentTypeHeader = new RequestHeader
    {
        Key = "Content-Type",
        Value = "application/octet-stream"
    };

    var rawData = System.IO.File.ReadAllBytes(ScreenshotLocation); //read the screenshot at this path and store in var rawData
    var screenshotTexture = new Texture2D(Screen.width, Screen.height, TextureFormat.RGB24, false); // Create a texture the size of the screen, RGB24 format
    screenshotTexture.LoadImage(rawData); // put var rawData into new texture

    byte[] bytes = screenshotTexture.EncodeToPNG(); // convert texture to byte array
    Destroy(screenshotTexture); // delete texture

    // Create a Web Form containing bytes array
    WWWForm form = new WWWForm();
    form.AddField("frameCount", Time.frameCount.ToString());
    form.AddBinaryData("", bytes, null, null);

    // send a post request with bytes contained in WWWForm
    StartCoroutine(RestWebClient.Instance.HttpPost(baseUrl, form, (r) => OnRequestComplete(r), new List<RequestHeader>
        {
            clientSecurityHeader,
            contentTypeHeader
        }));
}
////////////////////////////////////////////////

public IEnumerator HttpPost(string url, WWWForm form, System.Action<Response> callback, IEnumerable<RequestHeader> headers = null)
{
    using (UnityWebRequest webRequest = UnityWebRequest.Post(url, form))
    {
        if (headers != null)
        {
            foreach (RequestHeader header in headers)
            {
                webRequest.SetRequestHeader(header.Key, header.Value);
            }
        }

        webRequest.uploadHandler.contentType = defaultContentType;
        // webRequest.uploadHandler = new UploadHandlerRaw(System.Text.Encoding.UTF8.GetBytes(bytes));

        yield return webRequest.SendWebRequest();

Networking, UnityWebRequest, WWW, Postman, curl, WebAPI, etc:

And setting up a proxy can be very helpful too, in order to compare traffic:

Thanks @Kurt-Dekker, I’ll have a look at those, though my issue is more that I can’t seem to format the data for the POST request. The POST request was working perfectly with an image URL but is not working with my attempt to upload binary image data, so I think the issue is in the formatting.
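
One thing that might be worth checking on the formatting side: UnityWebRequest.Post(url, form) with a WWWForm that contains binary data sends a multipart/form-data body, so the request body is not the bare image bytes that the docs quoted above ask for. Below is an untested sketch of sending the raw bytes instead, using UploadHandlerRaw. The class name RawImageUploader, the method PostRawImage and its parameters are all made up for illustration, and the key header name and value are whatever the existing clientId / clientSecret pair already holds.

```csharp
using System.Collections;
using UnityEngine;
using UnityEngine.Networking;

// Hypothetical helper, not part of the original project.
public class RawImageUploader : MonoBehaviour
{
    // POSTs the bytes of the file at imagePath as the raw request body.
    // keyHeaderName / keyValue are assumed to be the same pair the
    // original code passes as clientId / clientSecret.
    public IEnumerator PostRawImage(string url, string keyHeaderName, string keyValue,
                                    string imagePath, System.Action<string> onDone)
    {
        // The saved file is already an encoded image, so its bytes can be
        // sent unchanged (no Texture2D round trip needed).
        byte[] bytes = System.IO.File.ReadAllBytes(imagePath);

        using (UnityWebRequest webRequest = new UnityWebRequest(url, UnityWebRequest.kHttpVerbPOST))
        {
            // Raw body: exactly the image bytes, no multipart wrapper.
            webRequest.uploadHandler = new UploadHandlerRaw(bytes);
            webRequest.uploadHandler.contentType = "application/octet-stream";
            webRequest.downloadHandler = new DownloadHandlerBuffer();
            webRequest.SetRequestHeader(keyHeaderName, keyValue);

            yield return webRequest.SendWebRequest();

            if (!string.IsNullOrEmpty(webRequest.error))
            {
                // Log the status code and response body to help see what the server rejected.
                Debug.LogError("OCR request failed (" + webRequest.responseCode + "): "
                    + webRequest.error + "\n" + webRequest.downloadHandler.text);
                yield break;
            }

            onDone?.Invoke(webRequest.downloadHandler.text);
        }
    }
}
```

It could be started with something like StartCoroutine(uploader.PostRawImage(baseUrl, clientId, clientSecret, ScreenshotLocation, text => Debug.Log(text))). It may also be worth making sure the file copy has actually finished before reading the file, since the copy runs inside the SaveImageToGallery callback.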

Have you considered trying a different OCR engine, like Smart Engines? They offer fast and accurate OCR on a variety of platforms, including mobile. They also provide a number of resources and support to help you get up and running quickly and easily.
From what I can see, it seems like you might be having some issues with the data format and the HTTP request. The team at Smart Engines is well-versed in these types of issues and could help you troubleshoot and resolve them in no time.
Why not check them out and see if their solution could be a better fit for you?