Rendering a crowd of 3,000-4,000 in Unity3D, like the audience in FIFA

I am doing R&D on efficiently rendering an animated 3D audience in Unity3D,
like the reference image from a FIFA game (as other AAA sports games also do).
[Image: reference screenshot from a FIFA game]
Here are a few things I want to list:

  1. The target platform is Windows PC, with a minimum configuration of a DirectX 11 GPU and a 4th-gen CPU.
  2. Tried Mecanim animation (Optimise hierarchy turned on) with GPU skinning, but performance still drops, and the Unity scene file comes to around 200-300 MB.
  3. Tried Legacy animation with MeshBaker (http://www.digitalopus.ca/site/mesh-baker/), but the scene is still 500 MB+ and runs at 5-10 FPS on a high-end system.
  4. Need to render roughly 3,000-4,000 audience members all around the stadium.
    Approach: 8-16 unique audience prefabs, each with its own animation set.

A few studies:

  1. http://http.developer.nvidia.com/GPUGems3/gpugems3_ch02.html
    Not sure whether we can achieve this in Unity3D.
  2. http://http.developer.nvidia.com/GPUGems2/gpugems2_chapter03.html
    With this approach, an animated character can be instanced at random positions in the stadium, and I personally feel this is what happens in crowd rendering in games like FIFA,
    because we can easily see a few randomly placed prefab instances playing the same animation at the same time, something like
    Unity’s SkinnedMeshRenderer.BakeMesh.
Bottlenecks:

  1. During game replays the camera is placed in a corner, and most of the audience (60-70%) is rendered from this spot. This is something we cannot remove.
  2. The audience needs to be 3D, not 2D.

Any help, suggestions, or thoughts are welcome; please post them here. I would like to find the best approach to achieve this in Unity3D.

Thank you.

I don’t think even the current-gen Madden/FIFA games render entirely 3D audiences. They use LODs so that the closest spectators are 3D, then switch to progressively lower-detail 2D renders farther from the camera.
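
As a minimal sketch of that kind of tiering (engine-agnostic Python, not Unity API), the selection could look like the following. The tier names and distance thresholds are hypothetical and would need tuning per stadium and camera.

```python
# Hypothetical distance thresholds in metres, nearest tier first.
LOD_TIERS = [
    (30.0, "3d_animated"),       # closest rows: full animated 3D mesh
    (80.0, "3d_low_poly"),       # middle rows: simplified 3D mesh
    (float("inf"), "2d_card"),   # far rows: flat 2D render
]


def pick_lod(distance):
    """Return the name of the first tier whose max distance covers `distance`."""
    for max_distance, name in LOD_TIERS:
        if distance <= max_distance:
            return name
    return LOD_TIERS[-1][1]


for d in (10.0, 50.0, 200.0):
    print(d, pick_lod(d))
```

Unity's LODGroup component performs a similar switch, although it works from screen-relative size rather than raw distance.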

In the image you posted, I’d be willing to bet that more than half is a clever 2D render.

However, here are a few ideas to consider:
Have a single (or a few) skeletons to animate, and apply them to the rest of the models. Don’t have them all animate themselves independently.
Bake the lighting in and use unlit shaders. Notice how, in the image you posted, the shadows don’t ever seem to fall on part of a person. The whole person is either shadowed, or not.
To reduce the scene size, you’ll need to reduce the number of game objects. Try using as few bones as possible, and maybe batch multiple people into a single model.
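
The first idea, a small pool of animated skeletons driving the whole crowd, can be sketched as below. This is engine-agnostic Python; the function name, pool size, and crowd size are hypothetical. It only shows the bookkeeping: each spectator is mapped to one shared animation source, so the number of skeletons evaluated per frame is bounded by the pool size rather than the crowd size.

```python
import random


def assign_animation_sources(spectator_count, source_count, seed=0):
    """Map each spectator index to one of `source_count` shared animation sources."""
    rng = random.Random(seed)  # seeded so the layout is stable between runs
    return [rng.randrange(source_count) for _ in range(spectator_count)]


assignment = assign_animation_sources(spectator_count=4000, source_count=16)

# Skeletons evaluated per frame: one per source in use, not one per spectator.
print(len(set(assignment)), "animated skeletons drive", len(assignment), "spectators")
```

Random assignment is what makes the repetition hard to spot; a fixed pattern (every 16th seat) would be cheaper to compute but easier to notice.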

The last thing I can recommend is to design the game around the necessary specs, and not the other way around. One of the most overlooked optimizations, especially in games like this, is simply to control the camera angles so that there’s not too much on screen at once. Use 2D audiences for wide shots and 3D audiences for close-ups. You’d be surprised how much creativity a constraint like that can bring out in people.

Skinned instancing would be ideal, as you note, but I’m also not sure how feasible it is in Unity. On the other hand, Unity does have great built-in instancing, provided you’re not generating new materials or meshes for each person.
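
For a rough sense of what instancing buys, here is a back-of-the-envelope sketch in Python (engine-agnostic, not Unity API). It assumes every instance of a prefab shares one mesh and one material, and that one instanced draw call holds at most 1023 instances, which is the per-call limit of Unity's Graphics.DrawMeshInstanced. The even crowd split is hypothetical.

```python
import math

# Assumed per-call cap (the limit of Unity's Graphics.DrawMeshInstanced).
MAX_INSTANCES_PER_CALL = 1023


def instanced_draw_calls(instances_per_prefab, max_per_call=MAX_INSTANCES_PER_CALL):
    """Return the instanced draw calls needed when each prefab is one mesh + material."""
    return sum(math.ceil(n / max_per_call) for n in instances_per_prefab if n > 0)


# Hypothetical crowd: 4000 spectators spread evenly over 16 unique prefabs.
crowd = [4000 // 16] * 16
print(instanced_draw_calls(crowd))  # draw calls with instancing
print(sum(crowd))                   # draw calls if each spectator is drawn separately
```

The saving only holds while meshes and materials really are shared; a unique material per spectator would put every spectator back into its own draw call.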