Properly distribute random chance for object instantiation

Hey, I’m struggling a little with some random generation logic. I know I can create a horrid else if statement that spans the page, but I assume there is a better way.

Obvious (horrid) method:

if(Random code 0 - 10){
make my object;
}else if ( random code 11 - 15 ){
make a different object;
} etc

I’d rather have a function that evenly distributes the prefabs connected to the script and allows me to weight them manually. Does anyone have a theory? I don’t need you to do all the work, just a recommendation. I’m willing to do the research needed to get it going.

http://forum.unity3d.com/threads/221890-Weighted-random-selection

The last post in this thread (not the links in it) has everything you need.

Thanks for the link. I actually came up with another idea:

Iterate through each object, rolling your random number and adding the weight to it. Whichever rolls the highest total is the object that will be spawned.

Both techniques are not suitable for very large lists though. I may try both and see which one works the best.

Actually, in retrospect, that link isn’t what I’m looking for. It still favors items closer to the top of the list, just moving the bias to object 2. I think the only way to give each object the anticipated chance is to roll for each item separately and use the highest score. This gives each object it’s weighted chance without bias towards those at the top f the list.

There is a bias towards the items a the top of the list, because they have to be ordered by weight, descending. If you have one object with 0.5 and two objects with 0.25 , everything below 0.5 will result in the first object, proper weighted.

Right, but the problem with that is that it exponentially reduces the chance that an object at the bottom of the list will be selected.

For example:

int roll = Random.Range(0, 100);

//sloppy for statement
for(i = 0; i < stuff.length; i++){
     if(stuff.length[i].weight + roll > 100){
          make something;
     }

}

Okay, so lets say the first item is weighted at 50. That means Item 2 only has a 50% chance of being created. So lets say we have 5 objects with these weights:

50
25
10
5
1

The first item has a 50% chance. The second item has a 25% - 50% chance (total of 12.5%) of being instantiated and it goes down from there. Even if you reverse the weights, you still have the distinct problem of the upper weights reducing the chance of the lower ones.

Instead, I’m doing this:

void SelectEnemySpawn ()
		{
				int current = 0;
				int next = 0;
				for (int i = 0; i < enemyPool.Length; i++) {
						next = enemyPool [i].GetComponent<EnemyData> ().spawnWeight + Random.Range (0, 1000);
						Debug.Log (enemyPool[i] + ": " + next);
						if (next >= current) {
								current = next;
								enemyToSpawn = enemyPool [i];
						}
				}
		}

It’s not extremely efficient, but it never iterates through more than 10 objects and it’s running fine so far. Each item has its exactly specified chance of being selected. Well, almost. Objects lower down the list have a slightly higher chance since they auto-win ties. But that’s probably a .00001% increase. The point is, the result is predictable every time, no matter where the object is on the list.

1 Like

Why do you think it has 12.5? it has a chance of 25/91 if it is weighted 25/91:
You have a list with your weights, 50, 25, 10, 5, 1. You add everything and get 91. Get a random number between 0 and 91. Maybe 55.
Take the first object, its weight is 50. Thats too low. Add the second one, so you have 75. Thats the first one over 55, so this is your object. It has a chance of 25/91 like stated.

Because, the objects at the bottom have to wait their turn. The objects at the top will always get a chance to be created.

So using your example, I will roll 50+ half the time (that’s statistical averages) which means 50% of the time, it won’t be over the first item. Now this is expected behavior.

However, what happens if the next item is also 50? Now it skips both items. Or what if the next item is 51? The item before it catches the 0 - 50. The item after it catches the 52 - 91.

Does that make sense? Any time you stop checking items without iterating the whole list, the items at the bottom are at a disadvantage.

Anyway, the script I wrote works excellent and is just as easy to expand.

You don’t have to iterate over the whole list, you only throw one random.range between zero and the sum of all propabilities. You don’t do a new random number after every object. I have no problem with your code, but you wanted something unbiased :wink:

I think the problem with our misunderstanding is, that you try and roll a second random number.
Just pseudocode, haven’t got any MonoDevelop here at the moment:

	public string GetWeightedStrings () {
		
		Dictionary<int, string> items=new Dictionary<int, string>();
		items.Add (50, "Testobject 1");
		items.Add (25, "Testobject 2");
		items.Add (10, "Testobject 3");
		items.Add (5, "Testobject 4");
		items.Add (1, "Testobject 5");

		int sum = 0;
		foreach (DictionaryEntry item in items) {
			sum+=item.Key;
				}
		int rand = Random.Range (0, sum + 1);

		int cur = 0;
		foreach (DictionaryEntry item in items) {
			cur+=item.Key;
			if(cur>rand) return item.Value;
		}

	}
1 Like

No, I wasn’t generating more than one random number for that. Only for my solution. The problem I have with the above is purely statistics based. In the above, the first object will always appear 50% of the time. But this takes away from the second item. The second item doesn’t even get a chance 50% of the time. The loop doesn’t even reach it. Then the 3rd item down is selected even less, because both the first and second item has to fail before it gets a chance.

essentially, it’s no different than rolling like this:

if random 0 - 100 | make 1
else if random 0 - 100 | make 2
else if random 0 - 100 | make 3

In both scenarios, the top objects MUST fail in order for the lower objects to get a chance. It doesn’t matter if you roll once or each time - once something succeeds then the rest are out of luck. The further down the list it goes, the lower the probability that an item will be selected.

Believe me, I had a very lengthy conversation trying to explain this to my friend. It’s all about statistical probability. It’s a lot like rolling 2x 6 sided dice. The further from 2 you want to roll, the exponentially harder it is to roll that number. For example, going from 9 to 10 is exponentially harder than going from 7 to 8.

In my system, no object has to fail the check. The items each have their set chance of appearing based on it’s roll and weight alone. It’s place on the list has no bearing.

If you don’t believe me, or want to see for yourself, do this:

set up a script to do exactly that, and weight each item at 25, then count the number of times each object is created (just store them in an int variable or something). I’ll bet you object 1 appears more than 25% of the time, and object 4 appears less than 25% of the time.

Note that the method you described is close to what I want. The bias is much lower than other methods, but I think mine is the least biased I’ve seen so far. Either way, I’m happy with how it’s working :slight_smile:

No, the short one is:
if random 0-50 make 1
if random 51-75 make 2
if random 76 to 85 make 3…

Edit: I tested it, with 10000000 iterations and the weights set to 27,26, 25, 24 (dictionaries can’t store similar keys…)

the results are:
2728803
2523716
2424700
2322781
27+26+25+24=102, so the last one should have:
2352941
That looks significant enough, or does it?

That’s the completely unbiased method. That differs from your version in that it rolls once and applies the matching conditional statement. In yours, you start iterating and stop at the first number.

Here is another example. Consider the following weights:

50
50
25
5
1
1
1

Okay, so you roll a 75.

Skip the first one.
Apply the second one.
25 would match, but it never gets the chance to.

Okay, now you roll a 55.

Skip the first one
apply the second one
25 would match, but it never gets a chance to

okay, now you roll a 70

skip the first one
apply the second one
25 would match, but it never gets a chance to.

Now do you see the pattern? The first 50 has a 50% chance. The second 50 has a 25% chance. The 25 has a 12.5% chance, and so on. Trust me, those are the results :slight_smile:

Look at the disparity. 27 is close, only off by 28,803 (above the weight), but the second in line is down by 77,000, as is each following, . And these numbers are close together, only 4 points of separation. See what 50, 51, 25, 10, 1 would give you.

Thats the one I wrote in code. I don’t stop at the first number, I stop at the number in which part the random number lies.

50, 25, 10, 5, 1 gives you (in brackets how much it should be):
5542924 ( 5494505, off by 0.8%)
2717595 (2747252, off by 1%)
1087287 (1098901, off by 1.6%)
543323 (549450, off by 1.1%)
108871 (109890, off by 0.9%)

The 25 in your post (with 2x 50) will get thrown if the random number is between 100 and 125 of the whole range of 0-133.

I was referring to the code you originally linked me to. I already know the else if chain will work perfectly. I’ve used it before. I just don’t like using it because if you have 100+ items it becomes a major hassle.

I didn’t link to any code, I linked to a thread with a how to and thats what I wrote in this thread and that is exactly the code I tested (I just replaced dictionaryentry with keyvaluepair).

The code in the thread you linked. There are 3 code samples we’re talking about

The one that adds the weight as it iterates, the one that uses a wad of else if statements, and one that generates a random number per objects.

Here is the results of the one I"m using:

50007 generations:

50 39.2%
51 41.3%
25 12.6%
10 .04%
1 .02%

This allows me to weight two items equally and get similar performance out of them. I could also reduce everything to 0 weight and get an equal distribution. Or increase everything to 50 to get an equal distribution.

I like the flexibility more. And the fact that I can just attach 2 prefabs with the same weight to the script and it will just work.

Actually the script you wrote won’t provide statistically accurate results. The problem is that you aren’t actually getting results that are based on the weights. The are affected by them, but not dependent on them. You are basically rolling a new die for each one and comparing them, instead of rolling a single die and selecting. You could probably adjust the random number you are generating for each based on the total to get somewhere in the ballpark. (it gets closer the close the die sides are to the pool total, but still not accurate)

What I do is generate basically a lookup table. It takes a couple of extra steps to set up, but it is faster to run, as I only have to roll one die. And it is statistically accurate. For example (I just used a simple object to illustrate):

using UnityEngine;
using System.Collections;

public class RandomMonster : MonoBehaviour {
	
	Monster[] monsters;
	int[] monster_selection_pool;
	int pool_total = 0;

	void Start () {
		monsters = new Monster[3];
		monsters[0] = new Monster("Dragon",60);
		monsters[1] = new Monster("Troll",30);
		monsters[2] = new Monster("Ogre",5);
		BuildPool();
	}
	
	void BuildPool()
	{
		// fill the pool.  Used extra loops to leverage builtin arrays
	    foreach(Monster monster in monsters) pool_total+=monster.weight;
		monster_selection_pool = new int[pool_total];
		int i = 0;
		for (int m = 0; m < monsters.Length; m += 1) {
			for (int p = 0; p < monsters[m].weight; p += 1) {
				monster_selection_pool[i] = m;
				i++;
			} 
		}
	}
	
	public Monster GetRandomMonster()
	{
		// simple example, returns a random monster based on weight
		return monsters[monster_selection_pool[Random.Range(0,pool_total)]];
	}
	
	// example object
    public class Monster
    {
        public string name;
        public int weight;
		
		public Monster(string n,int w)
		{
			name = n;
			weight = w;
		}
    }
}

Next post has several sets of results for comparison.(first are the expected results, then with pooling shown above, and last is your method).

Result samples:

RESULTS: (10000 samples, 1000 sided die rolls)
 EXPECTED---------------------   ARRAY POOL-------------------   MULTI-DIE--------------------  
  Dragon (60) - 6315 (63%)        Dragon (60) - 6400 (64%)        Dragon (60) - 3816 (38%)      
   Troll (30) - 3157 (32%)         Troll (30) - 3093 (31%)         Troll (30) - 3296 (33%)      
     Ogre (5) - 526 (5%)             Ogre (5) - 507 (5%)             Ogre (5) - 2888 (29%)      


RESULTS: (1000 samples, 1000 sided die rolls)
 EXPECTED---------------------   ARRAY POOL-------------------   MULTI-DIE--------------------  
  Dragon (60) - 631 (63%)         Dragon (60) - 631 (63%)         Dragon (60) - 379 (38%)       
   Troll (30) - 315 (32%)          Troll (30) - 324 (32%)          Troll (30) - 329 (33%)       
     Ogre (5) - 52 (5%)              Ogre (5) - 45 (4%)              Ogre (5) - 292 (29%)       


RESULTS: (1000 samples, 100 sided die rolls)
 EXPECTED---------------------   ARRAY POOL-------------------   MULTI-DIE--------------------  
  Dragon (60) - 631 (63%)         Dragon (60) - 641 (64%)         Dragon (60) - 711 (71%)       
   Troll (30) - 315 (32%)          Troll (30) - 308 (31%)          Troll (30) - 219 (22%)       
     Ogre (5) - 52 (5%)              Ogre (5) - 51 (5%)              Ogre (5) - 70 (7%)         


RESULTS: (100000 samples, 5000 sided die rolls)
 EXPECTED---------------------   ARRAY POOL-------------------   MULTI-DIE--------------------  
 Dragon (400) - 56338 (56%)      Dragon (400) - 56299 (56%)      Dragon (400) - 31166 (31%)     
   Troll (90) - 12676 (13%)        Troll (90) - 12626 (13%)        Troll (90) - 22644 (23%)     
   Ogre (150) - 21126 (21%)        Ogre (150) - 21229 (21%)        Ogre (150) - 24095 (24%)     
   Bunny (70) - 9859 (10%)         Bunny (70) - 9846 (10%)         Bunny (70) - 22095 (22%)     


RESULTS: (1000 samples, 2000 sided die rolls)
 EXPECTED---------------------   ARRAY POOL-------------------   MULTI-DIE--------------------  
 Dragon (400) - 563 (56%)        Dragon (400) - 562 (56%)        Dragon (400) - 390 (39%)       
   Troll (90) - 126 (13%)          Troll (90) - 115 (12%)          Troll (90) - 198 (20%)       
   Ogre (150) - 211 (21%)          Ogre (150) - 221 (22%)          Ogre (150) - 238 (24%)       
   Bunny (70) - 98 (10%)           Bunny (70) - 102 (10%)          Bunny (70) - 174 (17%)       


RESULTS: (100 samples, 2000 sided die rolls)
 EXPECTED---------------------   ARRAY POOL-------------------   MULTI-DIE--------------------  
 Dragon (400) - 56 (56%)         Dragon (400) - 53 (53%)         Dragon (400) - 33 (33%)        
   Troll (90) - 12 (13%)           Troll (90) - 13 (13%)           Troll (90) - 19 (19%)        
   Ogre (150) - 21 (21%)           Ogre (150) - 22 (22%)           Ogre (150) - 27 (27%)        
   Bunny (70) - 9 (10%)            Bunny (70) - 12 (12%)           Bunny (70) - 21 (21%)

That looks like what I am doing. You are calculating the index of the monster array based on the sum of its weights, it’s a bit shorter, but works only with integers, not with floats. But I think it is a bit faster, I’ll try it :wink:

Edit: based on your tests, 10000 samples:
6327 (you expected 6315)
3141 (3157)
532 (526)