Corrupted stack traces with il2cpp on iOS

Hey

We have an automated crash reporting system that also reports various non-crash errors that occur in the game.

On iOS, many of these are reported with a stack trace that just doesn’t make any sense.
We know that there used to be issues with this, but I believe these were already resolved.

We are using Unity 5.3.4p1, which is relatively up to date.

Is anyone else experiencing such issues?

BTW, we get the errors + stack traces using Unity’s event ‘logMessageReceived’:

Application.logMessageReceived += ReportError;
private void ReportError(string logString, string stackTrace, LogType type)
{
    // use the stackTrace string when reporting the error
}

@liortal

We did find and correct one issue related to stack traces in the 5.3.5p1 release (which should be available on Wednesday). This fix was specific to NullReferenceException and DivideByZeroException exceptions thrown by the runtime code though.

You hinted that you might be seeing incorrect stack traces in exceptions thrown by managed code in your project. Is that correct? If so, that is a problem we are not aware of, but one we would like to correct.

Hi Josh,

The problem is that these issues are pretty random. We’re only getting them from real-world players and can’t always reproduce them in house.

Also, I have to get permission from my employers in order to share our project with you guys (I have tried to get that in the past, but unfortunately did not get it).

I can share a few scenarios of stack traces that can’t be right:

NRE with no stack trace at all:
NullReferenceException: A null value was found where an object instance was required

This is the code that is reported to throw the exception:

private bool canSelectItem = true;

/// <summary>
/// A flag that determines whether a leaderboard item can currently be selected (clicked on).
/// </summary>
public bool CanSelectItem
{
    get { return canSelectItem; }
    set { canSelectItem = value; }
}

I need to look for a few other examples. We had others, and it looks like they are mostly NullReferenceExceptions.

@liortal

This looks a lot like what we were seeing with the bug we corrected. The call stack would have a managed method on it, but it is the wrong managed method. In fact, it is the next managed method in the binary after the method that actually threw the exception. Clearly, CanSelectItem did not throw an NRE!

I would recommend trying 5.3.5p1 when it is released (it may be delayed a day or two, from what I’m hearing). I guess it will be difficult to know for sure if the problem is corrected, as this is not consistently reproducible. But we should be able to find out soon if there are still incorrect stack traces reported.

Thanks Josh! I’ll upgrade and update this thread once I know whether the issue is fixed (or not…)

No signs of 5.3.5 or p1? Has there been a delay?

@MrEsquire

Yes, there has been a delay. I’ve heard that some part of our internal build publishing pipeline ran into some issues, although I don’t know the details. I would expect 5.3.5f1 in a day or two, and 5.3.5p1 maybe early next week now.

Thanks, you were correct, the official release is out. Any internal news on the first patch?

@MrEsquire

Sorry, I don’t have more news yet. I believe that some build publishing issues still linger, but we’re working on it.

We just started using 5.3.5p1 and are still getting corrupt stack traces. Here’s an example:

@liortal

The last known bug we had with incorrect stack traces on iOS was fixed in 5.3.5p1, so this is definitely something we have not seen. Is it possible for you to submit a project to reproduce this?

My company cannot share the project, not without protecting themselves in some legal way (it’s not up to me, sorry…).
I will see what the latest decision is.

In the meantime, please note that it seems all the stack traces we report now begin with the same method calls (which are clearly wrong). Here are 2 examples:

Fyber :: Request failed due to: An error happened while trying to retrieve ads
MALog:ReportError(String, String, LogType)
FyberPlugin.FyberCallbacksManager:ProcessRequestCallback(NativeMessage)
System.Collections.Generic.ICollection`1:get_Count()\n System.Collections.Generic.ICollection`1:get_Count()
System.Collections.Generic.ICollection`1:get_Count()

UnityException: SetAnimationImpactTimes : No events in shield animation
ShieldMainAnimation.SetAnimationImpactTimes (Int32 shieldCount)
ShieldMainAnimation.PlayAnimation (Int32 amountOfShields)
MainManager+c__Iterator92.MoveNext ()
System.Collections.Generic.ICollection`1[T].get_Count ()
System.Collections.Generic.ICollection`1[T].get_Count ()
System.Collections.Generic.ICollection`1[T].get_Count ()
System.Collections.Generic.ICollection`1:get_Count()

Not sure there’s much you can do without a project though…

@liortal

Thanks, I understand.

Bugs like this are difficult to track down without the project, because they are usually impacted by the layout of the binary executable file.

With that said, you can check at least one thing on your side. This error looks like one we had in the past when the MapFileParser utility was not working correctly. That utility runs after the linker, and reads the linker map file to produce a binary data file that libil2cpp reads at runtime to understand the function locations in the executable.

You can check the Xcode build output after the linker runs for information about the MapFileParser. It should run and not report an error. You can also look in the output project for files named SymbolMap* - these are the data files used by libil2cpp at runtime. If they are not present, we can get incorrect stack traces which look like this.

Of course, the incorrect stack traces could be caused by something different as well, but this might be worth a look.
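For anyone who wants to automate the SymbolMap check described above, here is a minimal sketch in plain C#. It only searches a folder for files whose names start with SymbolMap. The class name, method names, and the idea of passing in the generated Xcode project folder are assumptions for illustration; nothing here is part of Unity or IL2CPP, and it does not check whether MapFileParser itself reported an error.

```csharp
using System;
using System.IO;

// Hypothetical helper (not a Unity or IL2CPP API): looks for SymbolMap* files
// under a given folder, e.g. the generated Xcode project folder.
public static class SymbolMapCheck
{
    // Returns the paths of all files under projectRoot whose names start
    // with "SymbolMap". Returns an empty array if the folder does not exist.
    public static string[] FindSymbolMaps(string projectRoot)
    {
        if (string.IsNullOrEmpty(projectRoot) || !Directory.Exists(projectRoot))
        {
            return new string[0];
        }

        return Directory.GetFiles(projectRoot, "SymbolMap*", SearchOption.AllDirectories);
    }

    // True when at least one SymbolMap* file was found.
    public static bool HasSymbolMaps(string projectRoot)
    {
        return FindSymbolMaps(projectRoot).Length > 0;
    }
}
```

If HasSymbolMaps returns false for the build output, that would match the missing-file situation described above. A true result only means the files exist, not that their contents are correct.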

I will check that. Other info that may be useful: we have not yet upgraded our project from 5.3.4p1.

The way I do it is to flip the switch in Cloud Build to select a different version of the engine. I tried both 5.3.5p1 and 5.3.5p4 today, and both exhibit this issue. I will try to see if there’s any Xcode log in Unity Cloud Build and will report it if I find anything.

@liortal

I don’t think the fact that the project is still on 5.3.4p1 should have an impact. The stack traces should not depend on the code in the project, only on the runtime for the virtual machine. Hopefully cloud build will allow you to see the full Xcode log.

I found this in the log:

Is this related? (I will send you the full log in a PM if that is possible)

@liortal

Yes, that looks like a good source for this problem. Please PM me the full log if you can.

Just did, thanks!

Thanks to the bug report @liortal submitted, we were able to track down the cause of this problem. It looks like stack traces have probably not worked correctly with iOS builds from cloud build for some time, as the cloud build configuration for Xcode was subtly different from the default configuration, and the generation of the SymbolMap files IL2CPP needs to build proper managed stack traces was (almost) silently failing.

The good news is that we have a fix, and the fix should land on cloud build within the next Unity patch release or so.

I’m seeing a slightly different issue, but am wondering if it’s somehow related. We’re developing a game for iOS and Android. We use an in-game logger for our development builds called Lunar Mobile Console, and it shows us the Unity stack trace for all Debug.Log statements. We also publish our dev builds through HockeyApp, so we get to see the exceptions there as well. In LMC, on iOS, I’m seeing stack traces on normal Debug.Log statements that look like this:

However, in the Unity editor, the same log message looks like this:

There are two things of interest here: First, is it possible to get line numbers like we get in-editor? It would make tracking down exceptions so much easier. Second, why has the iOS build injected spurious calls to various methods (e.g., ThirdParty.MD5.MD5Managed:HashFinal and IAnalyticsFieldNameAdapter:Adapt)? The analytics field name adapter is literally a 4-line interface with 1 method that simply adapts typed fields into their string counterparts readable by our analytics system. Furthermore, it isn’t even in the call chain for this particular log statement, so I’m not sure why it’s even showing up.

We’re running Unity 5.3.5p2 against Xcode 7.3.1 using the Debug configuration with both Development Build and Script Debugging selected.

Update: I just noticed this crash come in from HockeyApp. It’s sort of ridiculous:

The Amazon method doesn’t cause that error to be logged, DC.Common.Tuning.GetDefById does. Why is the Amazon method there? And why are there now EIGHT mentions of the (not-at-all-in-the-actual-call-stack) IAnalyticsFieldNameAdapter#Adapt method? Hoping this is a known/fixed issue in 5.4…