Just need some clarification regarding Component variables

Having just finished coding all the scripts for a game prototype I was working on, I started looking for ways to optimize my code further. While doing so, I read the following suggestion for optimizing one’s code on this page.

Whenever you access a component through GetComponent or an accessor variable, Unity has to find the right component from the game object. This time can easily be saved by caching a reference to the component in a private variable.

Simply turn this:

function Update()
{
  transform.Translate(0, 0, 5);
}

Into this:

private var myTransform : Transform;

function Awake()
{
  myTransform = transform;
}

function Update()
{
  myTransform.Translate(0, 0, 5);
}

The latter code will run a lot faster since Unity doesn’t have to find the transform component in the game object each frame. The same applies for scripted components, where you use GetComponent instead of the transform or other shorthand property.

I understand the point of caching component references. I do it to optimize my code whenever possible. My question is whether the above example is actually optimizing anything at all.

Let me break down the reasoning behind my confusion.

The above script obviously inherits from MonoBehaviour, which itself inherits from Component. transform is a variable inherited by that script from Component which points to the Transform component of the game object which it is attached to. As such, the Transform component is already cached in transform. Obviously, re-caching it won’t do any good efficiency-wise.

So now I pose my question: is there a flaw in my knowledge of how the inherited variables from Component work, or was that just a bad example?

No, .transform is a property, not a variable. Properties are a C# language feature and they look like this:

public Transform transform
{
    get 
    {
        return ....;
    }
    set
    {
        .... = value;
    }
}

Whenever you read the property, the getter is called, and it has to return a value. Whenever you assign something to the property, the setter is called and the assigned value is passed in as value.

All those “accessor variables” just execute GetComponent behind the scenes.

So, conceptually, the transform property would look like this:

public Transform transform
{
    get 
    {
        return GetComponent<Transform>();
    }
}

It has no setter, so you can’t assign anything to it, which wouldn’t make sense anyway.
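To make the getter behaviour concrete, here is a minimal sketch in plain C# (no Unity required). FakeGameObject, FakeBehaviour and PropertyDemo are hypothetical stand-ins invented for this illustration, not Unity types, and the lookup counter only simulates the idea that each property read triggers a component lookup. How Unity actually implements its accessors internally is not shown here.

```csharp
using System;

// Hypothetical stand-in for a game object; NOT a Unity type.
class FakeGameObject
{
    private readonly object transformComponent = new object();

    // Counts how often the simulated component lookup runs.
    public int LookupCount { get; private set; }

    // Simulates a GetComponent-style call: every call performs a lookup.
    public object FindTransform()
    {
        LookupCount++;
        return transformComponent;
    }
}

// Hypothetical stand-in for a script attached to the game object.
class FakeBehaviour
{
    private readonly FakeGameObject owner;

    public FakeBehaviour(FakeGameObject owner) { this.owner = owner; }

    // Read-only property: the getter body runs on every read.
    public object transform
    {
        get { return owner.FindTransform(); }
    }
}

static class PropertyDemo
{
    // Reads the property `reads` times; returns the number of lookups performed.
    public static int LookupsViaProperty(int reads)
    {
        var owner = new FakeGameObject();
        var script = new FakeBehaviour(owner);
        object last = null;
        for (int i = 0; i < reads; i++)
        {
            last = script.transform;   // getter call on each iteration
        }
        GC.KeepAlive(last);
        return owner.LookupCount;
    }

    // Reads the property once, then reuses the cached reference.
    public static int LookupsViaCache(int reads)
    {
        var owner = new FakeGameObject();
        var script = new FakeBehaviour(owner);
        object cached = script.transform;  // single getter call
        object last = null;
        for (int i = 0; i < reads; i++)
        {
            last = cached;                 // plain variable read, no lookup
        }
        GC.KeepAlive(last);
        return owner.LookupCount;
    }
}
```

Calling PropertyDemo.LookupsViaProperty(1000) performs one simulated lookup per read, while PropertyDemo.LookupsViaCache(1000) performs a single lookup no matter how many reads follow. That is the whole point of caching the reference in a private variable.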

Well, I created a small benchmark scene to find out how much caching transforms can impact performance. If you want to reproduce this on your own machine, just create a new scene containing only the Main Camera (no need to install any packages). Save the scripts Control.js and BlockScript.js below in the Assets folder, assign Control.js to the Main Camera and press the Play button.

The whole scene contains only a rectangular array of blocks waving like Hawaiian dancers, and a GUI box at the top left corner showing the statistics: the averaged frame time for cached and non-cached transforms, the extra time taken by the non-cached frames, and the extra time taken by each transform access. On my notebook (1.86 GHz Intel Core Duo), each transform access takes about 0.13 microseconds more than the cached access. I reproduced the benchmark on another machine (2.9 GHz Athlon X2), and the difference dropped to 0.08 microseconds.

My conclusion on this matter: if a script is attached to a large number of objects (>100) in your game, caching its transform property can bring a real increase in performance; in scripts used on only a few objects, it will not make a significant difference. On a machine like mine, for instance, caching the transform saves about 13 µs per 100 accesses. Since a frame typically takes somewhere between 5000 µs and 100000 µs, this alone will not improve performance much.

EDITED: I also ran the benchmark in the WebPlayer and Standalone versions. In the WebPlayer, the difference dropped to about half (about 0.07 µs per access), and in the Standalone version there was almost no difference!
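As a quick sanity check of that estimate, the arithmetic can be written out using only the figures quoted above (0.13 µs per access, 100 accesses, and frame times between 5000 µs and 100000 µs):

```latex
\begin{align*}
\Delta t &= 100 \times 0.13\,\mu\text{s} = 13\,\mu\text{s} \\
\frac{13\,\mu\text{s}}{5000\,\mu\text{s}} &= 0.26\% \quad \text{(fast frame)} \\
\frac{13\,\mu\text{s}}{100000\,\mu\text{s}} &= 0.013\% \quad \text{(slow frame)}
\end{align*}
```

So at this access count the saving is well under one percent of a frame, even for the fastest frame time considered.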

These results were obtained from the BlockScript.js script. It’s an incredibly useless piece of code, which does almost nothing but access the object’s transform 1000 times per frame. These accesses are made directly or via the cache variable cachedTrf, according to the control variable cacheTransform. In the Control.js script, cacheTransform is toggled every second, and the times measured in each mode are compared to evaluate the extra time taken by non-cached transform accesses. The number of blocks generated in Start() is defined by the array length and width, which are set through the Inspector variable blocks.

This is the BlockScript.js script:

private var cachedTrf: Transform;
private var pos0: Vector3;
private var phase: float;

function Start(){

	cachedTrf = transform;
	pos0 = transform.position;
	phase = Mathf.Sqrt(Mathf.Abs(pos0.x*pos0.z))/5;
}

function Update(){

	var min: float = -5000;
	var max: float = 5000;
	var i: int;
	
	// Access transform 1000 times per frame
	// via cache variable or via getter
	
	if (Control.cacheTransform){
		for (i=0; i<125; i++){
			if (cachedTrf.position.x>min && cachedTrf.position.x<max &&
				cachedTrf.position.y>min && cachedTrf.position.y<max &&
				cachedTrf.position.z>min && cachedTrf.position.z<max &&
				cachedTrf.position.x<max+1 && cachedTrf.position.x>9999){
					print("never happens");
			}
		}
	} else {
		for (i=0; i<125; i++){
			if (transform.position.x>min && transform.position.x<max &&
				transform.position.y>min && transform.position.y<max &&
				transform.position.z>min && transform.position.z<max &&
				transform.position.x<max+1 && transform.position.x>9999){
					print("never happens");
			}
		}
	}
	transform.position = pos0+Vector3(0, Mathf.Sin(Time.time+phase), 0);
}

This is the Control.js script. It generates the blocks array according to the dimensions set in the blocks variable, and shows the statistics:

var blocks:Vector2 = Vector2(10,10);
static var cacheTransform:boolean = true;

private var t0: float;
private var frames: int = 0;
private var tFrame: float;
private var fps: float;
private var avgCachd: float = 0;
private var sumCachd: float = 0;
private var qCachd: int = 0;
private var avgGetter: float = 0;
private var sumGetter: float = 0;
private var qGetter: int = 0;
private var extraTime: float = 0;
private var getterTime: float = 0;
    
function OnGUI (){

	GUI.Box(Rect(10,5,220,130),"");
	var s = "Cached:\n\n";
	s += " T frame: "+avgCachd.ToString("F2")+" mS\n";
	s += "Transform:\n";
	s += " T frame: "+avgGetter.ToString("F2")+" mS\n";
	s += "Difference:\n";
	s += " Extra time: "+extraTime.ToString("F2")+" mS\n";
	s += " Extra time per GET: "+(1000*getterTime).ToString("F2")+" uS";
	GUI.Label(Rect(20,10,200,120), s);
}

function Start () {
    for (var y = 0; y < blocks.y; y++) {
        for (var x = 0; x < blocks.x; x++) {
            var blk = GameObject.CreatePrimitive(PrimitiveType.Cube);
            blk.AddComponent(BlockScript);
            blk.transform.position = Vector3 (3*(x-y), 0,3*(x+y));
        }
    }
	t0 = Time.time;
}

function Update () {

	frames++; // count frame
	var t = Time.time-t0;
	if (t>=1){ // one second elapsed:
		t0 = Time.time; // update t0
		fps = frames/t;
		tFrame = 1000/fps; // tFrame in mS
		frames = 0;
		if (cacheTransform){
			sumCachd += tFrame;
			qCachd++;
			avgCachd = sumCachd/qCachd;
		} else {
			sumGetter += tFrame;
			qGetter++;
			avgGetter = sumGetter/qGetter;
		}
		if (qGetter*qCachd>0){
			extraTime = sumGetter/qGetter-sumCachd/qCachd;
			getterTime = extraTime/(1000*(blocks.x*blocks.y));
		}
		cacheTransform = !cacheTransform;
	}
}
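The per-access figure shown in the GUI comes from the last lines of Update(): the difference between the two averaged frame times is divided by the total number of transform accesses per frame (1000 per block times the number of blocks). A minimal restatement of that calculation in plain C# might look like the sketch below. GetterCost and its parameter names are hypothetical and introduced only for illustration.

```csharp
// Hypothetical helper that mirrors the extraTime/getterTime lines in Control.js.
static class GetterCost
{
    // Returns the extra time per transform access, in microseconds.
    // Frame times are averages in milliseconds, as computed by the benchmark.
    public static double PerAccessMicroseconds(
        double avgGetterFrameMs,   // average frame time using the property
        double avgCachedFrameMs,   // average frame time using the cached reference
        int blockCount,            // blocks.x * blocks.y
        int accessesPerBlock)      // 1000 in BlockScript.js
    {
        double extraMs = avgGetterFrameMs - avgCachedFrameMs;
        double perAccessMs = extraMs / ((double)accessesPerBlock * blockCount);
        return 1000.0 * perAccessMs;   // convert ms to microseconds
    }
}
```

For example, with made-up frame times of 33.0 ms (property) and 20.0 ms (cached) for a 10x10 array, GetterCost.PerAccessMicroseconds(33.0, 20.0, 100, 1000) gives about 0.13 µs per access. Those two frame times are purely illustrative and were not measured.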

Accessor variable vs cached variable performance test on iPhone

I also tested accessor variable vs cached variable performance using the iPhone internal profiler. It looks like the difference in CPU time is about 10%, although this test makes 2000 transform accesses per frame, which is quite a lot.

Testing software and device:

Unity 3.4.2f2 (Unity Pro + iOS Pro)

iOS SDK 5.0

iPhone 3GS (4.3.5)

The test scene was one rotating cube which had this script attached:

using UnityEngine;
using System.Collections;

public class TransformAccess : MonoBehaviour {
	
	private Transform cachedTransform;
	
	private bool useCached = false;
	
	void Start () {
		cachedTransform = this.transform;
	}
	
	void Update () {
		for(int i = 0; i < 1000; i++) {
			if(useCached) {
				cachedTransform.RotateAround(cachedTransform.up, 0.002f * Time.deltaTime);
			}
			else {
				transform.RotateAround(transform.up, 0.002f * Time.deltaTime);
			}
		}
		
		if(Input.GetButtonUp("Fire1")) {
			useCached = !useCached;
			Debug.Log("use cached: " + useCached);
		}
	}
}

The script rotates the cube around its up axis a thousand times per frame. Each iteration accesses the transform twice (transform.RotateAround() and transform.up), for 2000 accesses per frame.

The internal profiler data using the accessor:

----------------------------------------
iPhone Unity internal profiler stats:
cpu-player>    min: 30.9   max: 39.1   avg: 34.0
cpu-ogles-drv> min:  0.1   max:  0.2   avg:  0.1
frametime>     min: 35.3   max: 41.9   avg: 37.6
draw-call #>   min:   1    max:   1    avg:   1     | batched:     0
tris #>        min:    12  max:    12  avg:    12   | batched:     0
verts #>       min:    24  max:    24  avg:    24   | batched:     0
player-detail> physx:  0.4 animation:  0.0 culling  0.0 skinning:  0.0 batching:  0.0 render: -0.1 fixed-update-count: 1 .. 2
mono-scripts>  update: 32.6   fixedUpdate:  0.0 coroutines:  0.0 
mono-memory>   used heap: 143360 allocated heap: 196608  max number of collections: 0 collection total duration:  0.0
----------------------------------------

The internal profiler data using the cached variable:

----------------------------------------
iPhone Unity internal profiler stats:
cpu-player>    min: 28.4   max: 33.9   avg: 30.6
cpu-ogles-drv> min:  0.1   max:  0.3   avg:  0.1
frametime>     min: 31.9   max: 43.1   avg: 33.9
draw-call #>   min:   1    max:   1    avg:   1     | batched:     0
tris #>        min:    12  max:    12  avg:    12   | batched:     0
verts #>       min:    24  max:    24  avg:    24   | batched:     0
player-detail> physx:  0.4 animation:  0.0 culling  0.0 skinning:  0.0 batching:  0.0 render:  0.1 fixed-update-count: 1 .. 2
mono-scripts>  update: 29.2   fixedUpdate:  0.0 coroutines:  0.0 
mono-memory>   used heap: 143360 allocated heap: 196608  max number of collections: 0 collection total duration:  0.0
----------------------------------------
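The roughly 10% figure can be checked against the cpu-player averages in the two profiler dumps above. This assumes the profiler reports times in milliseconds per frame, and uses the 2000 accesses per frame stated earlier:

```latex
\begin{align*}
\Delta t_{\text{cpu}} &= 34.0\ \text{ms} - 30.6\ \text{ms} = 3.4\ \text{ms} \\
\frac{3.4\ \text{ms}}{34.0\ \text{ms}} &= 10\% \\
\frac{3.4\ \text{ms}}{2000\ \text{accesses}} &= 1.7\,\mu\text{s per access}
\end{align*}
```

The mono-scripts update averages give the same difference (32.6 − 29.2 = 3.4), so the saving appears to come from the script's Update() itself.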