Access Violation on standalone build

I originally posted this in the general support forum, but it seems to be a Windows-only error. See the original post here: Access Violation on standalone build - Unity Engine - Unity Discussions. The game crashes at random, usually after roughly 10 minutes. The previous build, made with 5.6, was 100% stable. The new build uses the latest 2017.1 release and features the new runtime navmesh generation. I have since filed a bug report as well. When I check the crash dump, it always fails on the same line of assembly. I’ve wasted a few days on this. Can someone please help?

When I examine the crash dump with WinDbg I get the following:

This dump file has an exception of interest stored in it.
The stored exception information can be accessed via .ecxr.
(4f78.10130): Access violation - code c0000005 (first/second chance not available)
00000000070819a8 41ffd3 call r11 {mono!mono_domain_get (00007ffd6b5f32b4)}
0:000> .ecxr
rax=000000011315ec68 rbx=0000000003f211e0 rcx=000000011315ebe0
rdx=fffffffeedca4308 rsi=000000011315ebe0 rdi=0000000000e02d08
rip=00000000070819a8 rsp=ffffffffffffffe0 rbp=0000000000e03110
r8=0000000000000000 r9=0000000000000000 r10=0000000000000000
r11=00007ffd6b5f32b4 r12=000000011315ebe0 r13=0000000000d02000
r14=0000000000000000 r15=0000000002e91560
iopl=0 nv up ei ng nz na pe cy
cs=0033 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010283
00000000070819a8 41ffd3 call r11 {mono!mono_domain_get (00007ffd6b5f32b4)}

I’m by no means an expert at this. Why would getting the mono domain crash the app?

Found a post with a similar issue: RSP/RBP-related crashes in windows standalone x64 build - Unity Engine - Unity Discussions. It seems the app is suffering from a stack overflow.

Looking at the disassembly, I see the following:

0000000006e01993 488bfc mov rdi,rsp
0000000006e01996 488b6128 mov rsp,qword ptr [rcx+28h]
0000000006e0199a 4883ec20 sub rsp,20h
0000000006e0199e 49bbb432e6d8f87f0000 mov r11,offset mono!mono_domain_get (00007ff8d8e632b4)
0000000006e019a8 41ffd3 call r11 {mono!mono_domain_get (00007ff8d8e632b4)}
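
The register dump shows rsp=ffffffffffffffe0, and the disassembly loads rsp from [rcx+28h] and then subtracts 20h. One reading consistent with both is that the loaded stack pointer was zero. This is an assumption, since the dump does not show the value that was loaded. A minimal Python check of that arithmetic:

```python
MASK64 = (1 << 64) - 1  # registers are 64 bits wide


def sub64(value, amount):
    """Subtract with 64-bit wraparound, as the CPU's `sub` instruction does."""
    return (value - amount) & MASK64


# Assumption: the qword at [rcx+28h] was zero when it was loaded into rsp.
loaded_rsp = 0
rsp_after_sub = sub64(loaded_rsp, 0x20)  # sub rsp,20h

print(f"{rsp_after_sub:016x}")  # prints ffffffffffffffe0
```

If that reading is right, the `call r11` would fault because it has to push a return address just below rsp, which is not a valid user-mode stack address.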

Following the post I printed out the memory address at rdi:

Last set context:
rax=00000001122e76c8 rbx=0000000003bb1390 rcx=00000001122e7640
rdx=fffffffeee19b8a8 rsi=00000001122e7640 rdi=0000000000482d08
rip=0000000006e019a8 rsp=ffffffffffffffe0 rbp=0000000000483110
r8=0000000000000000 r9=0000000000000000 r10=0000000000000000
r11=00007ff8d8e632b4 r12=00000001122e7640 r13=0000000000726000
r14=0000000000000000 r15=0000000002692080

Then I examined the memory at that stack location:

0000000000482d08 00007ff8d8f4deb7 mono!win32_handle_stack_overflow+0xb7 [c:\buildslave\mono\build\mono\mini\mini-windows.c @ 214]
0000000000482d10 0000000006f06d48
0000000000482d18 0000000003bb1390
0000000000482d20 0000000000483110
0000000000482d28 0000000003bb1390
0000000000482d30 0000000006f0fe60
0000000000482d38 0000000003bb1390
0000000000482d40 0000000000482d50
0000000000482d48 0000000000000000
0000000000482d50 0000000000000000
0000000000482d58 0000000000000000
0000000000482d60 00000001532d4880
0000000000482d68 00000001532d4880
0000000000482d70 0000000129e37ab0
0000000000482d78 0000000129e36330
0000000000482d80 0000000000484000
0000000000482d88 0000000000000001
0000000000482d90 0000000129e37ab0
0000000000482d98 00007ff8d8f822a5 mono!array_is_full+0x9 [c:\buildslave\mono\build\unity\unity_liveness.c @ 43]
0000000000482da0 0000000129e363c0
0000000000482da8 0000000000000001
0000000000482db0 000000000057f488
0000000000482db8 0000000000484178
0000000000482dc0 000000019f0b5000
0000000000482dc8 000000012e49f3e0
0000000000482dd0 0000000000000001
0000000000482dd8 0000000000000020

I see multiple references to 0000000003bb1390, but I’m not sure whether that is harmful or what exactly it means. I tried calling mono!mono_pmip on these addresses to see what they are, but had no luck doing so in WinDbg. I’m stumped about what to do next. Does anyone have advice, or is there an expert on this subject available?
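
One thing the dump itself shows is that 0000000003bb1390 is the same value as rbx in the register context, and 0000000000483110 matches rbp. A plausible but unconfirmed interpretation is that these are copies of registers saved on the stack. The sketch below cross-checks dump values against register values; the sample lines and register values are copied from the output above, and the helper name `parse_values` is made up for illustration.

```python
from collections import Counter

# A few entries copied from the dump, one "address value" pair per line.
DUMP = """\
0000000000482d10 0000000006f06d48
0000000000482d18 0000000003bb1390
0000000000482d20 0000000000483110
0000000000482d28 0000000003bb1390
0000000000482d30 0000000006f0fe60
0000000000482d38 0000000003bb1390
"""

# Register values copied from the "Last set context" output.
REGISTERS = {"rbx": 0x0000000003BB1390, "rbp": 0x0000000000483110}


def parse_values(dump):
    """Return the value column (second field) of each dump line as integers."""
    values = []
    for line in dump.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            # WinDbg may print 64-bit values with a backtick separator.
            values.append(int(fields[1].replace("`", ""), 16))
    return values


counts = Counter(parse_values(DUMP))
for name, value in REGISTERS.items():
    print(f"{name}={value:016x} appears {counts[value]} time(s) in the sample")
```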

We started disabling systems in our game to roughly pinpoint where the problem may be. To give some context, we are building an open-world style game. Players can go anywhere in the world. As players move around the world, terrains stream in as scenes. A separate system then spawns all of the entities. Entities include NPCs, trees, player-built structures, and anything else that can be removed or interacted with. The spawning system generates these entities the first time a chunk loads and loads them back from disk whenever a chunk is revisited.
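
As a rough illustration of that generate-once, reload-afterwards behaviour, here is a minimal Python sketch. The class and method names are hypothetical, and a dictionary stands in for the on-disk save data; the real game is a Unity project and will look different.

```python
class EntitySpawner:
    """Hypothetical sketch: generate entities on first visit, reload afterwards."""

    def __init__(self):
        self._saved = {}          # stands in for on-disk storage, keyed by chunk
        self.generated_count = 0  # how many chunks were procedurally generated

    def _generate(self, chunk):
        """Procedurally create the entities for a chunk seen for the first time."""
        self.generated_count += 1
        x, z = chunk
        return [f"tree@{x},{z}", f"npc@{x},{z}"]

    def load_chunk(self, chunk):
        """Return the chunk's entities, generating them only on the first load."""
        if chunk in self._saved:
            return list(self._saved[chunk])
        entities = self._generate(chunk)
        self._saved[chunk] = list(entities)
        return entities

    def unload_chunk(self, chunk, entities):
        """Persist the current entities so a later visit restores them."""
        self._saved[chunk] = list(entities)


spawner = EntitySpawner()
first_visit = spawner.load_chunk((0, 0))
spawner.unload_chunk((0, 0), first_visit[:-1])  # one entity was removed in play
second_visit = spawner.load_chunk((0, 0))
print(first_visit, second_visit, spawner.generated_count)
```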

First we disabled all NPCs; the crash still occurred. Then we disabled all of the new runtime navmesh generation code, and it still crashed. We then disabled just the trees, and the build seemed stable. We haven’t been able to crash it in two hours of testing.

Trees are made out of blocks in our game. There is a block for the trunk and a block for the branches. Trunk blocks are stacked on top of each other, and then the branches are placed at the top. The strange thing about trees potentially causing this crash is that the NPCs in the world are also made up of blocks. They use the same code and have additional systems such as animation and AI, but they do not trigger this crash. The only differences are that trees have pathfinding generation (which was removed during this witch hunt, so it’s not that) and that there are a lot more trees than NPCs in a chunk.

We haven’t been able to find solid replication steps either, which is very strange. Trees have two paths for creation. The first time, they are generated procedurally. If a tree unloads with a chunk and a player returns, it is created by reading the saved data and recursively building it back block by block. After triggering either of these paths, the game will sometimes crash immediately and sometimes run for up to 30 minutes. We also cannot replicate this in the editor, which rules out things like infinite loops.

I did fix one issue in this area of code that avoids stack overflow issues. I am not sure this is your exact issue, but perhaps you can try the fix when it’s available. DM me with your info if you want (maybe even a bug report is best) and I can get you a dll to try.

Note two things:

  1. The stack overflow I fixed is encountered when you have large arrays, usually referencing other arrays
  2. The liveness code mentioned in the call stack is traversing all objects. That means if there is a memory corruption (due to any reason) it will likely crash here.
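
To illustrate the first point in general terms: a traversal that recurses into arrays referencing other arrays uses stack space proportional to the nesting depth, while one that keeps an explicit worklist does not. The Python sketch below is not Mono’s liveness code. The function names are made up, and Python’s `RecursionError` stands in for a native stack overflow.

```python
import sys


def make_nested(depth):
    """Build arrays referencing other arrays: [[[ ... [] ... ]]]."""
    root = []
    current = root
    for _ in range(depth):
        child = []
        current.append(child)
        current = child
    return root


def count_recursive(obj):
    """Visit every array by recursing; call depth grows with nesting depth."""
    return 1 + sum(count_recursive(child) for child in obj)


def count_iterative(obj):
    """Visit every array using an explicit worklist; call depth stays constant."""
    pending = [obj]
    visited = 0
    while pending:
        current = pending.pop()
        visited += 1
        pending.extend(current)
    return visited


DEPTH = sys.getrecursionlimit() * 2  # deeper than the interpreter allows
deep = make_nested(DEPTH)

try:
    count_recursive(deep)
    recursive_ok = True
except RecursionError:
    recursive_ok = False

print("recursive traversal completed:", recursive_ok)
print("iterative traversal visited:", count_iterative(deep), "arrays")
```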

The two things you mentioned could be related to how the tree code works. Trees are stored in a list. Each tree has the data used to construct it and a list of the actual block game objects. The data contains positioning information, a list of children (which are instances of the same data class), and a reference to the associated game object created when the block was actually built. The actual block game objects also hold a reference back to the model data. When an area like the forest is built, there are hundreds of trees with many blocks, and the total can reach 10000+ blocks. When a block is removed because a tree is destroyed or a chunk unloads, we null out the data.
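
A minimal Python sketch of the object graph described above may make the references easier to follow. All class, field, and function names here are hypothetical, and "null out the data" is interpreted as clearing the link in both directions, which is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(eq=False)  # identity comparison only; the two classes reference each other
class BlockObject:
    """Stand-in for the spawned block game object."""
    name: str
    data: Optional["BlockData"] = None  # back-reference to the model data


@dataclass(eq=False)
class BlockData:
    """Model data for one block: position, children, and the spawned object."""
    position: Tuple[int, int, int]
    children: List["BlockData"] = field(default_factory=list)
    game_object: Optional[BlockObject] = None


def build(node: BlockData, spawned: List[BlockObject]) -> None:
    """Recursively spawn an object for this block and each of its children."""
    obj = BlockObject(name=f"block@{node.position}", data=node)
    node.game_object = obj
    spawned.append(obj)
    for child in node.children:
        build(child, spawned)


def remove(obj: BlockObject) -> None:
    """Clear the link between a block object and its model data."""
    if obj.data is not None:
        obj.data.game_object = None
    obj.data = None


# A tiny tree: three stacked trunk blocks with one branch block on top.
branch = BlockData((0, 3, 0))
trunk = BlockData((0, 0, 0), [BlockData((0, 1, 0), [BlockData((0, 2, 0), [branch])])])

blocks: List[BlockObject] = []
build(trunk, blocks)
remove(blocks[-1])  # the branch block is removed
print(len(blocks), "blocks spawned; removed block's data link:", blocks[-1].data)
```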

I DM’d you about the dll fix.

In case anyone else is experiencing a crash like this: the fix joncham implemented has completely solved our crash issue.

Joncham, I have an old game that needs your dll! Please share it with me…

What version of Unity are you on? This fix has been backported to many releases now.

2017.3.1f1

I would at least try using 2017.4.x as backported fixes have gone into the LTS release.