At university, when I was studying Java, I was told never to create a variable inside a loop because it is inefficient: the variable would be created and destroyed over and over again. They said that we should declare and initialize the variable outside of the loop instead. I see lots of Unity developers creating variables inside of loops. Is this an issue in C#?
Which code is more efficient?
Code A
    public void Update()
    {
        float value = 1f;
    }
Code B
    private float value;

    public void Update()
    {
        value = 1f;
    }
That advice just completely misunderstands the way both the C# and Java compilers work.
Both C# and Java use what is called a ‘stack’ for their function calls (honestly, most languages do… it’s a very effective way to describe a function call structure).
Specifically in C# and Java, the stack behaves like this: when a function is called, a chunk of memory called a stack frame is allocated on the stack to hold the function’s working state. The size of that chunk is determined by how many variables the function needs and how large those variables are. When the function returns, the stack frame is “popped” (removed) from the stack.
An allocated stack frame can’t easily be resized… so instead the compiler figures out all the variables needed for the function and allocates them at the beginning of the function.
The “y” variable in this case does not get recreated over and over; it’s set up once as part of the stack frame allocation.
And Java bytecode behaves in a very similar way (though it looks slightly different as a language). It too has a local variable array as part of its stack frame, and that array is created when the frame is allocated.
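As a rough illustration of the point above, here is a minimal sketch (the class and method names are made up for this example). Both methods declare the same local, once inside the loop body and once above it. In each case the local is just one slot in the method's frame, so the two versions do the same work and return the same result.

```csharp
using System;

public static class LoopLocals
{
    // Local declared inside the loop body.
    public static int SumInside(int n)
    {
        int acc = 0;
        for (int i = 0; i < n; i++)
        {
            int y = i * 2; // one slot in the method's frame, reused each iteration
            acc += y;
        }
        return acc;
    }

    // The same local declared above the loop.
    public static int SumOutside(int n)
    {
        int acc = 0;
        int y;
        for (int i = 0; i < n; i++)
        {
            y = i * 2;
            acc += y;
        }
        return acc;
    }

    public static void Main()
    {
        Console.WriteLine(SumInside(10));  // 90
        Console.WriteLine(SumOutside(10)); // 90
    }
}
```

If you want to check this on your own code, an IL viewer will show the method's locals listed once in the method header, wherever they are declared in the source.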
…
Now, in relation to Update: this is not a loop in the same sense. It really is a function being called over and over, which means a stack frame is allocated every single time Update is called, and the variables used inside the function are allocated in that frame on every call.
This is important because if the function is active twice at the same time (through recursion, say, or on two threads)… you technically have the same function, but with 2 distinct states, running at the same time. Those 2 distinct states are represented by their respective stack frames.
By declaring the variable outside of the Update method, it’s no longer a member of the function’s stack frame and instead is a member of the object that the function belongs to (in this case the MonoBehaviour class). Since this is a class, that variable will be allocated where the object is, which is over on the heap. Any reference to it doesn’t point to an address on the stack, but to an address in the heap (just a different portion of RAM).
This also means that the state of that variable is not distinct to any specific call to Update. Its state is independent and therefore persists between all calls to Update.
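Recursion is probably the easiest way to see two calls of the same function alive at once. The sketch below is a hypothetical example (none of these names come from the thread): every nested call gets its own copy of the local `mine`, so the outer call's value is still intact when the inner calls return.

```csharp
using System;

public static class FrameDemo
{
    // Hypothetical example: sums n + (n - 1) + ... + 0 recursively.
    public static int SumTo(int n)
    {
        int mine = n;                        // lives in this call's frame only
        int rest = n > 0 ? SumTo(n - 1) : 0; // nested calls get their own 'mine'
        return mine + rest;                  // 'mine' was not touched by the nested calls
    }

    public static void Main()
    {
        Console.WriteLine(SumTo(4)); // 10
    }
}
```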
With this said… declaring a variable outside of Update TECHNICALLY speeds up the initialization of the stack frame since it has fewer variables to allocate.
But it comes at the cost of the state of that variable behaving differently.
And honestly… the speed difference… it’s negligible.
At the end of the day… you choose the scope of a variable not based on the speed you require (since the speed differences are really insignificant)… but on what you need: how you expect the variable to behave.
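To make the behavioral difference concrete, here is a small sketch. `Counter` is a hypothetical plain class standing in for a MonoBehaviour so that it runs without Unity, and the loop in `Main` stands in for the engine calling Update once per frame.

```csharp
using System;

// Hypothetical stand-in for a MonoBehaviour; a plain class so it runs without Unity.
public class Counter
{
    private int fieldCount; // stored with the object, persists between calls

    public int UpdateWithLocal()
    {
        int localCount = 0; // fresh for every call
        localCount++;
        return localCount;
    }

    public int UpdateWithField()
    {
        fieldCount++;
        return fieldCount;
    }
}

public static class ScopeDemo
{
    public static void Main()
    {
        var c = new Counter();
        for (int frame = 0; frame < 3; frame++)
        {
            Console.WriteLine($"local: {c.UpdateWithLocal()}, field: {c.UpdateWithField()}");
        }
        // The local is 1 on every call; the field counts 1, 2, 3.
    }
}
```

Neither version is "the fast one". They do different things, and which one is right depends on whether the value should survive between calls.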
So this post gets a like every once in a while which says to me it’s popping up in people’s searches.
I would like to add an addendum to it.
Rereading OP’s question and my response, I think I may know what the person who told OP never to create a variable in a loop may have actually been referring to.
I’m willing to bet that they meant to not create an instance of an object that could be reused in a loop. Something like this:
    for (int i = 0; i < 10; i++)
    {
        var obj = new SomeObject();
        obj.DoStuff(i);
    }
vs
    var obj = new SomeObject();
    for (int i = 0; i < 10; i++)
    {
        obj.DoStuff(i);
    }
The implication being that in the first, 10 SomeObjects are created, whereas in the second, only 1.
Mind you… there are situations where you may want 10 SomeObjects, and others where you want 1. The statement “never create X in a loop” is reductive at best. You should understand the implications and ramifications of what you’re doing in or out of a loop and decide accordingly.
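A quick way to see the difference is to count constructor calls. In this sketch the body of `SomeObject` is invented for illustration (the `Created` counter and `Total` field are assumptions, not something from the original snippet).

```csharp
using System;

public class SomeObject
{
    public static int Created; // counts constructor calls, for illustration only
    public int Total;

    public SomeObject() { Created++; }

    public void DoStuff(int i) { Total += i; }
}

public static class AllocationDemo
{
    public static int CreateInsideLoop()
    {
        SomeObject.Created = 0;
        for (int i = 0; i < 10; i++)
        {
            var obj = new SomeObject(); // a new heap object on every iteration
            obj.DoStuff(i);
        }
        return SomeObject.Created;
    }

    public static int CreateOutsideLoop()
    {
        SomeObject.Created = 0;
        var obj = new SomeObject(); // one heap object, reused
        for (int i = 0; i < 10; i++)
        {
            obj.DoStuff(i);
        }
        return SomeObject.Created;
    }

    public static void Main()
    {
        Console.WriteLine(CreateInsideLoop());  // 10
        Console.WriteLine(CreateOutsideLoop()); // 1
    }
}
```

Note that the two versions are not interchangeable: the reused object carries its `Total` from one iteration to the next, while each fresh object starts from zero.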
I kinda doubt that ^^. I think the OP made it clear that he meant the declaration / creation of the variable itself, not the assignment, though we don’t know for sure.
A common pitfall in Java is that Java has the wrapper / box classes for the primitive types directly available. So the type “int” has the box type “Integer”, and Java supports auto boxing / unboxing for those types, so you have to be careful with them. In C# this can’t happen “that” easily or sneakily. Yes, C# does box automatically when you do
object o = 5;
though that should be quite clear.
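For anyone who has not run into boxing before, here is a minimal sketch of what that line does (the class name is just for this example). Assigning an int to an `object` copies the value into a new heap object, and a cast is needed to get it back out.

```csharp
using System;

public static class BoxingDemo
{
    // Returning an int as object boxes it implicitly.
    public static object Box(int value) => value;

    public static void Main()
    {
        int n = 5;
        object o = n;                // boxing: the value is copied into a heap object
        n = 6;                       // changing n does not affect the boxed copy

        Console.WriteLine(o);        // 5
        Console.WriteLine(o is int); // True

        int back = (int)o;           // unboxing needs an explicit cast
        Console.WriteLine(back + 1); // 6
    }
}
```

Doing this inside a hot loop is where it can start to matter, since every boxing conversion is a small allocation.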
Exactly, apart from the whole boxing topic, Java as well as C# supports closures. Now it gets really tough for people who have never really learned about them.
If you create a delegate inside a loop, the compiler will automatically create closure objects for any variables declared outside the delegate that the delegate uses. Such variables are “captured”. Variable capturing completely alters how the code is compiled and how it works. For proper closures you often need to declare a variable inside the loop, otherwise all the delegates would use the same closure object. However, that only happens when you use closures. An important thing to remember is that in C#, closures do not capture values but actual variables, which sounds strange at first since you cannot really “reference” variables. The compiler knows that and simply packs the variable in question into a compiler-generated closure class. (Java is stricter here: its lambdas can only capture locals that are effectively final.) For more information, see this blog post.
So what you really should be careful with are closures ^^. Closures can do crazy things like this:
Crazy Closure madness ahead
    using System;
    using System.Collections.Generic;

    // Exposes a "variable" purely through a getter and a setter delegate.
    public class MyDelegateVar<T>
    {
        private Func<T> m_GetVar;
        private Action<T> m_SetVar;

        public MyDelegateVar(Func<T> aGetVar, Action<T> aSetVar)
        {
            m_GetVar = aGetVar;
            m_SetVar = aSetVar;
        }

        public T Value
        {
            get => m_GetVar();
            set => m_SetVar(value);
        }
    }

    public class Test
    {
        public List<MyDelegateVar<int>> variables;

        public void Setup()
        {
            variables = new List<MyDelegateVar<int>>();
            for (int i = 0; i < 20; i++)
            {
                int val = i; // a new captured variable on every iteration
                variables.Add(new MyDelegateVar<int>(() => val, (v) => val = v));
            }
        }
    }
Note that this example does not make much practical sense. However, note that each of those “MyDelegateVar” instances does not have any int variable, just two delegates. Inside the loop we create a “local” int variable and two delegates (one for setting the variable and one for getting it). Those two delegates capture this local variable into a hidden closure object that they share. As a result, each of our MyDelegateVar instances will have its own version of that variable, and the variable will live beyond the method execution.
Note that if you moved the declaration of the variable “val” outside the loop, the result would be very different.
    public void Setup()
    {
        variables = new List<MyDelegateVar<int>>();
        int val;
        for (int i = 0; i < 20; i++)
        {
            val = i;
            variables.Add(new MyDelegateVar<int>(() => val, (v) => val = v));
        }
    }
In this case all the MyDelegateVar instances will capture the same variable. So after Setup has completed, all the instances will use the same closure object, whose variable holds the value 19. Changing one of our MyDelegateVar instances would affect all of them, since there really is only one closure object behind all those instances.
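The same effect can be seen in a stripped-down form with plain `Func<int>` delegates. This is only a sketch with made-up names and 3 iterations instead of 20, but the capture behavior is the same as in the two Setup versions.

```csharp
using System;
using System.Collections.Generic;

public static class ClosureDemo
{
    // 'val' is declared inside the loop: each delegate captures its own variable.
    public static List<Func<int>> CapturePerIteration()
    {
        var getters = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
        {
            int val = i;
            getters.Add(() => val);
        }
        return getters;
    }

    // 'val' is declared outside the loop: every delegate captures the same variable.
    public static List<Func<int>> CaptureShared()
    {
        var getters = new List<Func<int>>();
        int val;
        for (int i = 0; i < 3; i++)
        {
            val = i;
            getters.Add(() => val);
        }
        return getters;
    }

    public static void Main()
    {
        foreach (var g in CapturePerIteration()) Console.Write(g() + " "); // 0 1 2
        Console.WriteLine();
        foreach (var g in CaptureShared()) Console.Write(g() + " ");       // 2 2 2
        Console.WriteLine();
    }
}
```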
I think OP might appreciate a simple answer: it depends. With primitive types the compiler will optimize it and in other cases I wouldn’t worry about it until it becomes a measurable problem.
Always prioritize clarity over cleverness.
To answer the actual question, these three snippets should compile to equivalent machine code with any reasonable optimizing compiler:
    // Example 1:
    int acc = 0;
    int x;
    for (int i = 0; i < 100; ++i) {
        x = GetSomeNumber(i);
        acc += x;
    }

    // Example 2:
    int acc = 0;
    for (int i = 0; i < 100; ++i) {
        acc += GetSomeNumber(i);
    }

    // Example 3:
    int acc = 0;
    for (int i = 0; i < 100; ++i) {
        int x = GetSomeNumber(i);
        acc += x;
    }
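But that can stop being true when ‘x’ is no longer a primitive type (int, float…). In C++ the assignment operator can be overloaded and is allowed to have arbitrary side effects, so examples 1 and 3 can actually have different outcomes than example 2. C# and Java don’t let you overload plain assignment, although in C# a user-defined implicit conversion or operator can still run arbitrary code. Also, in C++ examples 1 and 3 can have different outcomes from each other, because a non-primitive ‘x’ might have a constructor that gets called on every iteration in example 3. In C# and Java, declaring a local doesn’t call a constructor; an object is only created where you write ‘new’.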