How to use API to bulk download raw data export?

What I’m trying to do:

  • Use the Unity API to export raw data for all my ‘custom’ events from the last ~30 days into one file
  • I’m following the instructions found here
  • I managed to do the first step and actually create the raw data export and have it show up in the Unity dashboard
  • But now I can only download files one at a time through the dashboard, and each file seems to cover just one event / one day, which is very inefficient
  • I want to download 1 file that has all my custom events from the last ~30 days
  • How do I do this?

So… here’s where I’m at.

  • But I don’t want to manually download each of these individually as different files.
  • I want ONE file with ALL my data so that I can query it using Couchbase (or something else)
  • I imagine the end goal here isn’t supposed to be me downloading each of these files one at a time. I need a way to combine this all so that I can query/analyze the entire batch.

I followed the next logical step in the Unity Manual - i.e. used the line from the “Get Raw Data Export” section - thinking this may do what I wanted, but nothing happened and it did not work for me.

  • Open Notepad, type what’s in #2

  • curl --user {UNITY_PROJECT_ID}:{API_KEY} https://analytics.cloud.unity3d.com/api/v2/projects/{UNITY_PROJECT_ID}/rawdataexports/{ID}

  • Substituted UNITY_PROJECT_ID and API_KEY as I had done in a previous step, which worked. Then replaced ID with the relevant blacked-out ID (top one) from the image I attached above.

  • Typed this into command line, hit enter.

  • Nothing happens

Is this what I’m supposed to do to get the data in bulk combined into one?

  • If yes, why is it not working?
  • If no, what am I supposed to be doing?

I don’t see any result from curl; you might try the -v (verbose) option. So your first curl command did work? It’s not in the screenshot, so please provide that too. You don’t use the report IDs from the dashboard; please follow the documentation steps exactly.

@JeffDUnity3D My first command from curl, which did work in the command line, is below:

curl --user {UNITY_PROJECT_ID}:{API_KEY} --request POST --header "Content-Type: application/json" --data "{\"startDate\":\"2020-06-16\",\"endDate\":\"2020-07-08\",\"format\":\"json\",\"dataset\":\"custom\"}" Unity Cloud

I am trying to follow the documentation steps exactly. However, for someone that is not a programmer, these steps are not very intuitive or clear. For example, in the above code that worked for the initial step, I had to do further forum research and talk to programmers I know to figure out what to try to get it to work.

  • If that ID that’s supposed to be used is not the ID from the dashboard, what ID do I use?
  • The output from the first command above gives me an “id” - and that “id” matches the “id” in the dashboard. I’m not sure where else I’m supposed to get an “id” from?
  • I’m not sure what you mean by the “-v verbose” option. For someone that does not use the command line for their regular workflow, could you be very specific for me and tell me exactly what I should be trying here?

I’m just getting no response when I put the following (with real info in curly brackets) in the command line.

curl --user {UNITY_PROJECT_ID}:{API_KEY} https://analytics.cloud.unity3d.com/api/v2/projects/{UNITY_PROJECT_ID}/rawdataexports/{ID}

What should I be typing instead of this? And what is that “ID” supposed to be then?

The instructions on the documentation page walk you step by step through the multiple calls required, each of which depends on the output of the one before it. You’ll want to look in the documentation for curl regarding the -v option. It would be curl -v … The ID you mention is the raw data export ID returned from the previous curl call; see the screenshots in the documentation.
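For readers who prefer a script to curl, here is a minimal Python sketch of the first two calls and of how the second depends on the first. It is not taken from the Unity documentation. The helper names (`_call`, `create_export`, `get_export`) are made up for illustration, and the POST URL is an assumption: it is the "Get Raw Data Export" URL quoted earlier in this thread with the trailing /{ID} removed.

```python
import base64
import json
import urllib.request

# Assumption: same base path as the "Get Raw Data Export" URL in this thread.
BASE_URL = "https://analytics.cloud.unity3d.com/api/v2/projects"


def _call(url, project_id, api_key, payload=None):
    """Send one authenticated request and return the decoded JSON reply."""
    # Equivalent of curl --user {UNITY_PROJECT_ID}:{API_KEY} (HTTP basic auth).
    token = base64.b64encode(f"{project_id}:{api_key}".encode()).decode()
    headers = {"Authorization": f"Basic {token}"}
    data = None
    if payload is not None:
        # Supplying a body makes urllib send a POST, like curl --request POST.
        data = json.dumps(payload).encode()
        headers["Content-Type"] = "application/json"
    request = urllib.request.Request(url, data=data, headers=headers)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def create_export(project_id, api_key, start, end, dataset="custom", call=_call):
    """Step 1: create the export and return the "id" from the reply."""
    payload = {"startDate": start, "endDate": end,
               "format": "json", "dataset": dataset}
    reply = call(f"{BASE_URL}/{project_id}/rawdataexports",
                 project_id, api_key, payload)
    return reply["id"]


def get_export(project_id, api_key, export_id, call=_call):
    """Step 2: fetch the export created in step 1, using its id."""
    return call(f"{BASE_URL}/{project_id}/rawdataexports/{export_id}",
                project_id, api_key)
```

The value returned by `create_export` is the ID that goes into `get_export`, which is the same hand-off the curl steps perform by copying the "id" from one command's output into the next command.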

This may help too: curl syntax is different on Windows vs Mac. These are actual requests, with part of the credentials replaced by ****.

Windows syntax:

curl -v --user 4ad8afe2-78a5-48ac-88d4-*****:4dbbb00cb1675966d94872b**** --request POST --header "Content-Type: application/json" --data "{ \"startDate\": \"2020-03-10\" , \"endDate\": \"2020-03-12\", \"format\": \"json\", \"dataset\": \"appStart\" }" Unity Cloud

Mac syntax:

curl -v --user 4ad8afe2-78a5-48ac-88d4-b****:4dbbb00cb1675966d94872b**** --request POST --header "Content-Type: application/json" --data '{ "startDate": "2020-03-10" , "endDate": "2020-03-12", "format": "json", "dataset": "appStart" }' Unity Cloud

@JeffDUnity3D I have followed the documentation as instructed. I am stuck at this step where I’m trying to use the code as it is listed in the documentation, but it does not work.

I’ve tried the whole process over again on a smaller data set below, here’s how it goes… note, the “id” that’s generated from the first step actually does match the “id” in the Unity dashboard.

If I am supposed to try something that is not listed in the Unity Manual, which it seems like I have to divert from here - could you please very explicitly explain what it is I am supposed to do? I am not a programmer, so as many details as you could provide would be appreciated.

In your latest post, you provided syntax. You provided for the first step, which I have completed. Is there some special Windows syntax I should be using for the second step, where I am currently stuck? I’m on Windows.

You might not need those $ characters preceding the IDs; that might be a documentation issue. And please use the -v verbose syntax as suggested: curl -v --user …

Hi @JeffDUnity3D

I managed to get all 3 commands working in the command line. I followed the manual, but had to make various adjustments, because the commands as written did not work in Windows, even with your help. The steps that I got to work were:

  • Create Raw Data Export
  • Get Raw Data Export
  • List All Raw Data Exports

However, I cannot figure out a way to get what I actually want, which is:
I want to download one file with all custom events from the last ~31 days

I can only figure out how to download single partial pieces of the full custom raw data export data at a time, e.g. one day at a time (like only July 4, 2020 instead of the last ~31 days in one file). This is exactly what I could already do from the Unity Dashboard. Except, through these commands, I do it via an API/URL link rather than from the dashboard itself.

An example:

  • One raw data export (RDE) pulls all “custom” events from June 16 - July 8.
  • The entirety of this is 8,212,960,381 bytes (8.21GB)
  • Through the dashboard or the command line API, I can only download much smaller partial pieces of this at a time (e.g. one piece of 51,353,316 bytes, or 0.05GB)

How do I get one file with all the custom data for the last ~31 days? I’m not seeing a way to do this from the API URL result.

You can’t download all days at once. You need to write a Python program with a loop, and download each day and then import each day into your database.
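As a rough idea of what such a loop could look like, here is a minimal Python sketch that walks the file list of one export and appends every piece to a single local file. Several things here are assumptions rather than facts from this thread: that the "Get Raw Data Export" reply contains a `result` object with a `fileList` array whose entries each carry a `url`, and that those URLs can be fetched without extra credentials. The names `download_all` and `opener` are made up for illustration.

```python
import shutil
import urllib.request


def download_all(export, out_path, opener=urllib.request.urlopen):
    """Append every file listed in one export reply to a single local file.

    Assumes the reply looks like {"result": {"fileList": [{"url": ...}, ...]}};
    check the JSON your own "Get Raw Data Export" call returns.
    """
    files = export.get("result", {}).get("fileList", [])
    with open(out_path, "wb") as out:
        for entry in files:
            # Each entry is one downloadable piece (for example one day).
            with opener(entry["url"]) as piece:
                # Stream the piece to disk so large files are never held in memory.
                shutil.copyfileobj(piece, out)
    return len(files)
```

Here `export` is the parsed JSON reply for one finished export, and the return value is the number of pieces that were appended. If the pieces turn out to be gzip-compressed, the combined file is still gzip data and needs decompressing before it can be imported. Importing each piece separately, as suggested above, avoids building one 8GB file at all.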

@JeffDUnity3D I see, that’s unfortunate…

Final question for now then while I get support to do that. My Couchbase table has a custom_params column with a variety of information from the JSON. Do you know the syntax to pull this apart into separate columns?

Example:

  • Query result returns custom_params column, which has various sub-columns
  • How would I transform “# shifts this ship” into a proper column - such as country or city?

I’ve looked through Couchbase forums and Syntax, but didn’t easily find anything. Thought you may be able to provide a quick answer for me?

There is an example here in the bottom image https://support.unity3d.com/hc/en-us/articles/115004052703-Advanced-Queries-with-Unity-Analytics-and-Raw-Data-Export

@JeffDUnity3D With sub-columns that are one word or use underscores, this works. However, in the example I have, “# shifts this ship”, there are spaces - and the method from your doc doesn’t work. What syntax change do I need to make here to get it to work correctly?

Likely a bad naming convention for your parameters. I would suggest alpha-only parameter names, which is a fairly standard database object naming convention. Specifically, that # sign is likely going to cause problems later. Best to change the naming now, before you send more events, rather than spend time trying to parse these names. Perhaps you can rename the columns? It’s been a while since I’ve used Couchbase. Even so, you probably want proper naming for your parameters.

I agree the naming convention is poor. Right now, I’m tasked with trying to make sense of what is here though. I’m not clear on how I could rename this now to actually make sense of it and query this.

You will need to work with the developer who wrote the event code in Unity to rename your parameters (this is the correct solution), or change the name of the Couchbase columns. You need to remove the spaces and the # sign and make the syntax more like the example with alpha characters only and no spaces.
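If the events are already collected and the files are being processed by a script before import, the renaming could also happen at that stage. The following Python sketch is only an illustration of that idea. It assumes each event is a JSON object with a `custom_params` object inside it, as the query result described above suggests, and the function names `clean_name` and `clean_event` are hypothetical. It keeps letters, digits, and underscores, which is slightly looser than the alpha-only names suggested above.

```python
import re


def clean_name(name):
    """Turn a parameter name into letters, digits and underscores only."""
    # Replace each run of other characters (spaces, '#', ...) with one underscore.
    cleaned = re.sub(r"[^0-9A-Za-z]+", "_", name).strip("_")
    return cleaned or "param"


def clean_event(event):
    """Return a copy of one event with its custom_params keys renamed."""
    params = event.get("custom_params") or {}
    fixed = dict(event)
    fixed["custom_params"] = {clean_name(k): v for k, v in params.items()}
    return fixed
```

With this, a key such as "# shifts this ship" becomes "shifts_this_ship". Two different original names can clean to the same result, in which case one value would overwrite the other, so it is worth checking the list of parameter names first.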

I figured out how to do it with just query code.

custom_params.`# shifts this ship` AS num_shifts

This will pull the poorly named JSON object into its own column.

I might suggest however, since this is early in your project lifecycle, to consider properly naming the parameters in the first place.

Is there an example of this Python loop? I’m having trouble downloading each day.

Have you downloaded a single day?