using System.Collections;
using System.Collections.Generic;
using System.Linq;
using UnityEngine;

public class ExampleCode
{
    public static List<int> exampleList = new List<int>() { 48, 28, 30, 30, 30, 40, 38, 49, 30, 30 };

    public static void testMethod()
    {
        List<int> testMethodList = exampleList.Distinct().ToList();
        for (int i = 0; i < testMethodList.Count; i++)
        {
            Debug.Log(testMethodList[i]);
        }
    }
}
I’m looking for a way to remove a chosen number of duplicates, rather than removing all duplicate elements. I’m not looking to remove a particular duplicate; I want to remove a fixed amount of any duplicates in the list.
Thank you for your response, Kurt! Unfortunately it’s not that easy. When getting a list of distinct elements, I don’t get any dupes whatsoever (obviously, because I get a list of distinct elements), but I was wondering whether there was a way to “tell” .Distinct() to overlook a few entries.
Now that I’ve written it out, I can think of another option: comparing exampleList and testMethodList, detecting the difference between the two, creating a new list out of that difference, removing the number of elements I’d like to remove, and then adding the rest back to testMethodList. It sounds inefficient, though.
Rather than relying on some pre-baked thing like Distinct(), it seems like you’ll have to go through the list yourself. Honestly this problem feels like something you’d see on LeetCode or similar.
Thank you for your response, PraetorBlue! Yeah, the more I look into it, the more I realize that my two ways of going about it are either the inefficient one or skipping Distinct() and writing a custom solution.
Sort the list.
Start at the end of the list.
Iterate through the list:
if value[i] equals value[i - 1], delete value[i] and increment count; end processing when count = max.
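The steps above might be sketched like this in plain C# (standalone rather than Unity so it can run anywhere; the method and parameter names such as `maxRemovals` are mine, not from the post):

```csharp
using System.Collections.Generic;

public static class DuplicateTrimmer
{
    // Removes up to maxRemovals duplicate entries from the list,
    // following the steps above: sort first, then walk backwards so
    // that RemoveAt never shifts an element we have yet to visit.
    // Returns how many duplicates were actually removed.
    public static int RemoveSomeDuplicates(List<int> list, int maxRemovals)
    {
        list.Sort();
        int removed = 0;
        for (int i = list.Count - 1; i > 0; i--)
        {
            if (removed >= maxRemovals)
                break; // end processing when count reaches max

            if (list[i] == list[i - 1])
            {
                list.RemoveAt(i); // safe: only indices >= i are affected
                removed++;
            }
        }
        return removed;
    }
}
```

Because the list is sorted and traversed from the end, removal never disturbs the part of the list still to be scanned.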
Thank you for the response, JeffDUnity3D! Wow, that seems like a far better solution than the one I’m currently working on. It’s so much easier to implement. Again, thank you for your input!
While I think I’ll go with the solution graciously provided by JeffDUnity3D, I have encountered a problem while trying to solve my issue:
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using UnityEngine;

public class ExampleCode
{
    public static List<int> exampleList = new List<int>() { 48, 28, 30, 30, 30, 40, 38, 49, 30, 30 };

    public static void testMethod()
    {
        List<int> testMethodList = exampleList.Distinct().ToList();
        for (int i = 0; i < testMethodList.Count; i++)
        {
            Debug.Log(testMethodList[i]);
        }

        List<int> listDifference = exampleList.Except(testMethodList).ToList();
        // List<int> listDifferenceList = listDifference.ToList();
        // for (int i = 0; i < listDifference.ToList().Count; i++)
        // {
        //     Debug.Log(listDifference.ToList()[i] + " is an item in listDifference.");
        // }
        Debug.Log(listDifference.Count + " is a count of List Difference");
    }
}
It returns a count of 0 for listDifference, even though I expected to get 4. Odd.
Doesn’t matter at all :). My goal is just to remove a particular number of duplicates in the easiest-to-control fashion, instead of getting rid of them altogether, and your solution seems to be exactly what I was looking for.
Except doesn’t really know or care about duplicates. If an element appears at least once in the second list, it’s excluded from the result. It doesn’t matter if there are two of them in one list and one in the other.
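To see this set-like behavior concretely, here is a minimal standalone example (plain C#, not Unity):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExceptDemo
{
    public static void Main()
    {
        var a = new List<int> { 30, 30, 30, 40 };
        var b = new List<int> { 30, 40 };

        // Except performs set subtraction: any element of 'a' that
        // appears anywhere in 'b' is excluded, regardless of how many
        // times it occurs in either list.
        var diff = a.Except(b).ToList();

        Console.WriteLine(diff.Count); // prints 0 — the extra 30s are not kept
    }
}
```

That’s why the original exampleList.Except(testMethodList) comes back empty: every value in exampleList also appears (at least once) in the distinct list.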
Thank you for the quick response, PraetorBlue! That’s disappointing to hear, as I wanted to store every differing value in a third list, remove some random amount from that list, and then add the rest back to whichever list I wanted. But the easy solution provided by JeffDUnity3D seems to be a much better option, and I presume less resource-hungry.
Posting a “template” of what JeffDUnity3D suggested, with some changes (in case someone comes looking for a solution to this issue :)).
using System.Collections;
using System.Collections.Generic;
using UnityEngine;

public class ExampleCode
{
    public static List<int> exampleList = new List<int>() { 48, 28, 30, 30, 30, 40, 38, 49, 30, 30 };

    public static void testMethod()
    {
        exampleList.Sort();
        Debug.Log(exampleList[0]);
        int count = 0;
        Debug.Log(exampleList.Count + " is the count of Example List before removing 1 duplicate");
        for (int i = 0; i < exampleList.Count; i++)
        {
            if (exampleList[i] == exampleList[i + 1])
            {
                exampleList.RemoveAt(i);
                count = count + 1;
                if (count < 2)
                {
                    return;
                }
            }
        }
        Debug.Log(exampleList.Count + " is the count of Example List after removing 1 duplicate.");
    }
}
This snippet is easy to modify to your needs; however, as it is, it has an obvious issue:
Each time you call the method, it won’t stop until the entire list is free of duplicates.
So you can either call it a certain number of times (which adds up across your project) or adjust it to your project’s needs in whatever way you see fit ;).
Again, a big thank you to JeffDUnity3D for his “thoughts out loud”! The issue is resolved!
I had recommended starting at the end of the list; you are starting at the beginning. I did it that way so there are no holes in the array when you do RemoveAt. I haven’t tested it, but does RemoveAt change exampleList.Count? If so, that’s why I started at the end.
No, that’s not what happens. First of all, as Jeff said, you’re actually skipping elements when you go forward through the list. However, your code has several other issues. Your condition
is backwards. Your count starts at 0 and increases, and you return out of your method when count is less than 2, which is already the case after the very first removal. Also, you said it does not “stop” before all duplicates are gone. How do you judge that? By the Debug.Log statement at the end of your method? When you return out of your method, that log statement is never reached. You may want to use break instead. So all your method actually does is remove exactly one duplicate element and terminate, no matter what.
There are several reasons to iterate through a list backwards. First, it avoids skipping elements when you remove one in front of you. Second, removing elements near the end is cheaper than removing elements near the front, since the list has to close the gap. So the overall performance is better (though with this element count it’s negligible).
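The skipping problem can be demonstrated with a tiny standalone comparison (plain C#; the example values are mine):

```csharp
using System;
using System.Collections.Generic;

public static class SkipDemo
{
    public static void Main()
    {
        // Forward: after RemoveAt(i), the next element slides into
        // slot i, and the i++ that follows jumps right past it.
        var forward = new List<int> { 30, 30, 30 };
        for (int i = 0; i < forward.Count - 1; i++)
        {
            if (forward[i] == forward[i + 1])
                forward.RemoveAt(i); // shifts forward[i+1] into slot i
        }
        Console.WriteLine(forward.Count); // prints 2 — a duplicate survives

        // Backward: removing at index i never moves the elements at
        // indices below i, so nothing is skipped.
        var backward = new List<int> { 30, 30, 30 };
        for (int i = backward.Count - 1; i > 0; i--)
        {
            if (backward[i] == backward[i - 1])
                backward.RemoveAt(i);
        }
        Console.WriteLine(backward.Count); // prints 1 — all duplicates gone
    }
}
```

The forward loop leaves a duplicate behind precisely because of the element shift Bunny83 describes; the backward loop does not.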
I decided to start from the beginning (in my particular case) because there are cases in which I might be more interested in removing duplicates in increasing order of value instead of going for the largest values first.
Poor wording on my side. I meant to say that each call removes one element per cycle.
Thank you for the tips! I was glad to have at least some sort of a solution to my problem, hence the poor wording. I’ll be sure to iterate through the list backwards and apply those practices to big lists. However, oddly enough, my Debug.Log is reached just fine; I tested it and it does what I want it to.
Which Debug.Log? Going through the list backwards isn’t about removing the largest duplicates; it’s about adhering to proper programming practices, as described.
To take a small tangent and go back to OP’s desire for a “Distinct” with a specific count:
You could accomplish it another way, at the expense of creating a little garbage. Note that any use of LINQ generally comes with garbage overhead, because it tends to rely on iterator functions.
The Distinct method actually just uses a “set” (a trimmed-down, lightweight Set implementation not unlike HashSet):
public static IEnumerable<TSource> Distinct<TSource>(this IEnumerable<TSource> source) {
    if (source == null) throw Error.ArgumentNull("source");
    return DistinctIterator<TSource>(source, null);
}

public static IEnumerable<TSource> Distinct<TSource>(this IEnumerable<TSource> source, IEqualityComparer<TSource> comparer) {
    if (source == null) throw Error.ArgumentNull("source");
    return DistinctIterator<TSource>(source, comparer);
}

static IEnumerable<TSource> DistinctIterator<TSource>(IEnumerable<TSource> source, IEqualityComparer<TSource> comparer) {
    Set<TSource> set = new Set<TSource>(comparer);
    foreach (TSource element in source)
        if (set.Add(element)) yield return element;
}
So, technically speaking, if you wanted to create a version of ‘Distinct’ of your own that took a count, you could do it in much the same way:
using System.Collections.Generic;

public static class CustomEnumerableExtensions
{
    /// Returns a distinct sequence for as long as the number of duplicates
    /// is < 'count'. Once 'count' duplicates have been skipped, all
    /// subsequent duplicates are returned as-is.
    public static IEnumerable<T> CustomDistinct<T>(this IEnumerable<T> source, int count)
    {
        var hash = new HashSet<T>();
        int cnt = 0;
        foreach (var element in source)
        {
            if (hash.Add(element))
                yield return element;
            else if (cnt < count)
                cnt++;
            else
                yield return element;
        }
    }
}
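As a quick check, the extension can be exercised in a standalone program (the extension is repeated here, with a demo class name of my choosing, so the snippet compiles on its own):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CustomDistinctDemo
{
    // Skips the first 'count' duplicate occurrences, then lets any
    // further duplicates through unchanged.
    public static IEnumerable<T> CustomDistinct<T>(this IEnumerable<T> source, int count)
    {
        var seen = new HashSet<T>();
        int skipped = 0;
        foreach (var element in source)
        {
            if (seen.Add(element))
                yield return element;   // first occurrence: always kept
            else if (skipped < count)
                skipped++;              // duplicate: dropped, up to 'count' times
            else
                yield return element;   // quota used up: duplicate kept
        }
    }

    public static void Main()
    {
        var list = new List<int> { 48, 28, 30, 30, 30, 40, 38, 49, 30, 30 };

        // Remove at most two duplicates; the later 30s pass through.
        var trimmed = list.CustomDistinct(2).ToList();
        Console.WriteLine(string.Join(", ", trimmed));
        // prints: 48, 28, 30, 40, 38, 49, 30, 30
    }
}
```

Like Distinct, this is lazy: nothing is removed until the sequence is actually enumerated (here via ToList).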
Debug.Log(exampleList.Count + " is the count of Example List after removing 1 duplicate.");
I understand this now. But at the time I viewed it as being solely about removing the largest duplicates. As far as I understand, in order both to go for the smallest values first and to adhere to best practices, I’d simply need to sort the list from the biggest values to the smallest and then go through it backwards, correct?