I’m making a very simple crafting system. You click 2 items, and you get a result.
I did some light research, but everything I find is from really long ago.
First, I found this stackoverflow thread/page discussing HashSet vs List speeds. According to one post, the more items are in a List, the slower it gets, but a HashSet is roughly the same speed regardless. However at the very lowest point, a List is actually faster, and since I will only ever need 2 elements, List would be the more performant one. But the post is from 2012 and was last edited in 2014.
Next there’s this stackoverflow thread/page about Arrays vs Lists. According to it, arrays are faster. However, the post is from 2009.
I would just like some up-to-date info on whether these findings are still accurate.
They are not inherently faster or slower than each other. It depends on what you’re doing with each. If you are searching for entries in a List then you can potentially take linear time up to the size of the list and where the element is you’re searching for but lists take minimal memory typically. A HashSet looks up your item by creating a unique (ideally) hash which equates to a specific “bucket” in the set so it can identify where it is quickly. Non-unique hashes means multiple items can reside on the same bucket which can lead to a little searching or in the worst case, as bad as a list if not worse. HashSets occupy more memory by the sheer fact that this key/value bucket set-up occupies more space.
Abstract questions on which is faster always comes down to your usage and the scale you’re using it at. Set-up a quick prototype at the scale of items you’re using and use the profiler to tell you.
Like almost everything in engineering, it’s a balancing act of various factors.
First of all Hashsets or Dictionaries are not simply “faster”. It depends on what operations you want to do. For just “storing” elements a Heshset / Dictionary is always slower. The only benefit they have is testing for a specific element in the collection. Note that so many people get the concept of the “big-O” notation wrong. The big-O does not tell you the absolute “speed” of an operation but how the speed scales with the number of elements. So O(n) scales linearly. That means if 5 elements take x amount of time, when you have 10 elements (twice as many) it takes roughly twice the amount of time. All those values are relative.
On the other hand O(1) has a constant time. That does not even mean that it really takes the same amount of time, it just states that the runtime of the algorithm has no direct correlation to the number of elements. Again, this does not tell you anything about the absolute runtime. This can only be determined by running some test cases on a specific hardware.
A List is just a wrapper class for a normal C# array. So yes, a List always adds a bit of overhead, however the choice of which collection you want / should use should not only depends on “which is the fastest”. It depends on the actual usecase. Arrays are great if you know the number of elements beforehand and if the number of elements do not change. Lists are great if you don’t know the number of elements beforehand as it takes care of creating a new internal array when necessary. Other classes like a Hashset, Dictionary, Stack, Queue, LinkedList can’t really be compared with an array / List as they serve completely different usecases. A hammer is a good tool to drive a nail into the wall. However you wouldn’t use a hammer for a screw. That’s why we have screwdrivers. On the other hand when you want to drill a hole you wouldn’t use a hammer or a screwdriver because they are horrible inefficient in drilling holes.
All those different classes have been made to solve specific problems more efficient than others. Note that efficiency comes in many flavors. There are several optimisation targets you can aim for and most are counterparts of the others. We usually talk about memory usage and speed. Often times you can trade one for the other. Also keep in mind that you can do various operations on those collections. Some operations are faster / more optimised in a certain class while others are getting worse. Some features are completely lost on others. For example arrays and Lists provide random access as you can simply use an index to access any element in constant time. So reading element 0 is just as fast as reading element 23015. However a LinkedList does not have indices and there is no way to access element 23015 directly.
So your question is just way too abstract and detached from reality. As a programmer you should face read problems and use the right tools at hand to solve the problem. If there’s one all-dominating-fully-optimised tool out there, why would anyone use any of the others? So there simply is no “better” choice in general.
If you like to get an answer to somehow simplify the task of choosing the right collection, that’s just not possible. You just have to learn the differences between those classes, what are the advantages of each and what are the disadvantages. Finally you have to decide what aspect is the more important one in your case.
Along with that was already mentioned, it’s probably worth mentioning one minor detail that might even make the decision for you:
Lists can be indexed - HashSets cannot.
If you have a case where you want to get an element at an index in a collection, that pretty much rules-out a HashSet entirely.