Advice regarding C# library and Tensors

This question probably doesn't fit here, but I couldn't think of a better place to ask it:

I am building a generic C# library with a lot of common logic for both 2D and 3D, and sometimes even 2.5D (xz for heightmaps). I find myself repeating a lot of code, which got me thinking about generic vectors.

Why not use tensors? They're any-numeric-type, any-length "vectors" (or scalars, or matrices), so it seems to me like I could make one class implementing tensors instead of two classes implementing Vector2 and Vector3 separately. But: has anyone ever played with this idea? And is it advisable?

1 Like

Don't Unity's and C#'s Vector2, Vector3, and Vector4 already support a tensor paradigm?

There is an indexer on the vector classes, so you can access a vector by index 0, 1, 2, 3 in a tensor-like way: Unity - Scripting API: Vector2.this[int]

Maybe what's lacking is the dimension and cardinality, but we could get that from a switch on the type. Which tensor operations do you want to implement?

Unity's vector indexers just check which index you supplied and return the appropriate vector property. X, Y, and Z are just properties afaik.
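For illustration, here is roughly what such an indexer looks like (a sketch, not Unity's actual source; the type name is made up):

```csharp
// Sketch of what an indexer like Vector3.this[int] roughly does:
// switch on the index and return/assign the matching component field.
public struct MyVector3
{
    public float x, y, z;

    public float this[int index]
    {
        get => index switch
        {
            0 => x,
            1 => y,
            2 => z,
            _ => throw new System.IndexOutOfRangeException("Invalid Vector3 index!")
        };
        set
        {
            switch (index)
            {
                case 0: x = value; break;
                case 1: y = value; break;
                case 2: z = value; break;
                default: throw new System.IndexOutOfRangeException("Invalid Vector3 index!");
            }
        }
    }
}
```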

I have tried multiple times to code exactly what you described inside of Unity's version of .NET, to no avail (I specifically wanted swizzling and/or single-component access to work like it does normally, and math to work only for number types, which is not achievable without abstract statics).

While the idea is definitely possible, the real question becomes: how's the performance?

I'd assume using a fixed-size array would technically allow for cache hits because of the packed nature of the data (which struct fields already give you anyway). At the same time, the fact that the code needs to be abstract might mean there is some kind of runtime overhead incurred when using abstracts.

Keep in mind, vectors are used everywhere in Unity (and every game engine ever), so performance here is critical, to the point that I would assume someone has already tried this and found out that it doesn't really work that well in practice…

Even so, I'm going to ask the .NET compiler experts here: are abstract statics, and abstracts in general, more like generics (code generated at compile time) or like objects/classes (runtime validation checks/boxing)?
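For anyone unfamiliar, here is a minimal sketch of the kind of "abstract statics" being asked about (static abstract interface members, i.e. the C# 11 / .NET 7 generic math interfaces); the class and method names are just for illustration:

```csharp
using System.Numerics;

// Sketch: "abstract statics" = static abstract interface members,
// which the System.Numerics generic-math interfaces are built on.
static class GenericMathSketch
{
    public static T Sum<T>(T a, T b) where T : IAdditionOperators<T, T, T>
        => a + b; // '+' resolves via the static abstract operator on the interface

    // Usage:
    // float f = Sum(1.5f, 2.5f);
    // int   i = Sum(3, 4);
}
```

For value-type T, RyuJIT specializes the generic method per type, so at least in that respect it behaves more like the compile-time option than like boxing.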

3 Likes

If you do this again I’ll get staff to step in. The roadmap thread is not a personal resource to find developers to answer your questions.

3 Likes

Noted. Is there any way I can split off-topic posts into their own thread myself?

I've perhaps also been too reluctant to interact with the forums since their move to Discussions. For that, I humbly apologize.

Right, one thing I have been experiencing is that the new Discussions site UX does not encourage people to start their own threads. Pinned threads take up too much of the space people can see, and it's overwhelming. Updates and activity are not sorted properly, and it's unclear how to navigate to the area we actually want to see. Everything is bunched up on the first page.

It looks good and might be great for announcements and discussion, but it's not as welcoming as the forum was.

1 Like

I don’t know what this means. Your question has nothing to do with the original thread. Just choose the most appropriate category and tags and make a new post.
If you’re referring to something in another thread, copy a quote from a post or link to it.

I'm not in control of any of this, but note that you can click the pin icon and unpin things locally.

3 Likes

While I thank you for the information, I am not voicing my own perspective alone. I am trying to explain the UX that everyone would be experiencing.

It's not that pinned threads should be unpinned; it's that the pinned threads of every topic from the whole site are too much to fit on one page (which is the front page of the site). They push everyone's threads down too far, so no one can see the threads we made, and we cannot discuss anything with anyone aside from the pinned threads at all.

While pinned threads are great for seeing what is important in each tag/topic, when everything is important, everything else becomes unimportant. I want to see what is pinned for Animation or Web, but when I click Latest topics I want to actually see the latest topics. And when I navigate to the home page, I wish to see a summary of topic updates, not everything Unity has pinned.

3 Likes

Just wanted to chime in to say that I agree with all of this. Obviously it's not good to hijack threads, but with how Discussions is set up, I feel that it funnels people towards doing that.

1 Like

This thread summarizes perfectly how much activity a completely new (and important) thread gets:

Meh, I disagree. The post currently has 182 views, so realistically speaking, it is being shown to others.

I think in this specific case, the topic discussed in the thread is way too niche for many users, which is why they have refrained from commenting.

At the same time, I do agree with Thaina that having 20 pinned threads in the "Latest" category defeats its own purpose…

2 Likes

C# doesn't have something like C++ templates, which would let you define a variable number of fields at compile time.

So to support a variable number of dimensions, you need to store the size of the tensor. For small tensors like 2D or 3D, that is a 50% or 33% added storage cost. Larger storage also means lower iteration performance.

A non-constant size also means you can't unroll the loop over the dimensions that most operations use, which adds further overhead.
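As a rough sketch of the dynamic-dimension approach being described (the type is hypothetical, purely for illustration):

```csharp
// Sketch of a dimension-agnostic vector: the component count is runtime data,
// so every instance pays for a heap array (with its stored length) and every
// operation has to loop over an unknown number of components.
public readonly struct VectorN
{
    private readonly float[] components; // heap allocation + length stored per instance

    public VectorN(params float[] values) => components = values;

    public int Dimensions => components.Length;

    public static VectorN operator +(VectorN a, VectorN b)
    {
        // (A real implementation would also validate that the dimensions match.)
        var result = new float[a.components.Length];
        for (int i = 0; i < result.Length; i++)          // trip count unknown at compile time,
            result[i] = a.components[i] + b.components[i]; // so the loop can't be fully unrolled
        return new VectorN(result);
    }
}
```

Compare that with a hand-written Vector2/Vector3, where the components are inline fields of the struct and the "loop" is just two or three additions the JIT can see directly.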

You could look into code generation, but it is most likely more trouble than it's worth, given that many vector operations are already implemented for you.

2 Likes

Edit: Sorry in advance for the compressed stream of thoughts below. It's 1:30am on Christmas Eve, I got linked here from another location, and I was/am somewhat frustrated about some of the perceived misinformation and lack of familiarity that tends to coincide with these topics. It's not the fault of anyone here, it's just one of those cases that I regularly need to address. It is much like how I often have to explain the basics of how IEEE 754 floating-point works and that it is deterministic, how you don't need to use double to solve precision issues, how chunking and floating origin systems work, how fixed-point doesn't solve the underlying problems and introduces additional problems instead, etc. – I end up reiterating a lot of the same information many times in many places throughout the year and repeating it year after year.

=========================

Types like Vector2/3/4 and Matrix3x3/4x4 don’t get templated because they are highly specific, specialized, and often “primitive” types. They get their own static types that are used in these highly specific scenarios and which are typically hand-tuned to use very explicit paths that are designed for and around the needs of graphics, general image processing, and similar scenarios. These types then get used much like other primitives (such as int, float, bool, etc) to build other specialized types representing the structured data.

This is why you find that libraries like DirectX Math or GLM (OpenGL Mathematics) don't really use templates in the way being described/considered. GLM itself does use some level of templating, and if you naively look at the code you might presume that it is using this to optimize. However, if you dig deeper you will find that this general support is actually quite lacking and that the main library is actually built around explicit specialization of the core types/sizes, such that they've functionally defined multiple unique types with no shared logic. The underlying vec<length_t, typename, qualifier> (and the equivalent for mat) is really just an almost unusable interface (a trait-like definition) and isn't used for storage or other considerations. You can functionally define "the same thing" in .NET (RyuJIT) via an actual generic interface, a TSelf generic, and relying on inlining/devirtualization or generic specialization for value types.

– Notably this general specialization support, ability to use interfaces and modern language features, etc also means that real world usages do not incur additional fields, do not incur lower iteration performance, etc. You can build something (in .NET) that has the same or even better codegen than C/C++ (as you’re not targeting a lowest common machine by default).
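A rough sketch of that interface + TSelf pattern, assuming C# 11 static abstract interface members (the names here are illustrative, not an actual .NET API):

```csharp
// Sketch of the TSelf/generic-interface pattern: the interface only describes
// operations; each concrete type (Vec2, Vec3, ...) keeps its own fixed fields.
public interface IVec<TSelf> where TSelf : struct, IVec<TSelf>
{
    static abstract TSelf operator +(TSelf left, TSelf right);
    static abstract float Dot(TSelf left, TSelf right);
}

public readonly struct Vec2 : IVec<Vec2>
{
    public readonly float X, Y;
    public Vec2(float x, float y) { X = x; Y = y; }

    public static Vec2 operator +(Vec2 l, Vec2 r) => new(l.X + r.X, l.Y + r.Y);
    public static float Dot(Vec2 l, Vec2 r) => l.X * r.X + l.Y * r.Y;
}

// Shared logic written once against the interface; for struct TSelf the JIT
// specializes and devirtualizes, so there is no per-instance size field or loop.
public static class VecOps
{
    public static float SquaredLength<TVec>(TVec v) where TVec : struct, IVec<TVec>
        => TVec.Dot(v, v);
}
```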

Tensor libraries, on the other hand, are more general purpose. They are typically working with non-static data where the size, shape, sparseness of the data, and other considerations are never decided by the developer (either the tensor author or the tensor consumer). Instead, they are often dictated by the source data set, which is often external to the library and specified by the data scientist or, in the case of general ML applications, by the arbitrary input provided by the user.

Because they are designed to be general purpose and not understand the size or other information, it is incredibly atypical to use templates for representing size information (as it is never statically known). The allocations are generally large enough that any additional field cost used to track the size, shape, or sparseness is negligible. The data is typically sliced out of larger memory, often breaking it down to the largest contiguous "row" of information, accounting for where the shape matters vs. where it is irrelevant. If there is a core/common size where extra optimizations are meaningful to your app, you can do the necessary dynamic check and then optimize accordingly (but it's often not meaningful).
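As a rough illustration of that "shape is runtime data" model (not any particular library's API):

```csharp
// Sketch: a general tensor stores a flat buffer plus runtime shape/strides;
// nothing about the dimensions is known at compile time.
public sealed class DenseTensor
{
    private readonly float[] data;   // one contiguous buffer
    private readonly int[] shape;    // e.g. { 64, 128, 3 }, supplied at runtime
    private readonly int[] strides;  // computed from the shape (row-major)

    public DenseTensor(int[] shape)
    {
        this.shape = shape;
        strides = new int[shape.Length];
        int stride = 1;
        for (int i = shape.Length - 1; i >= 0; i--)
        {
            strides[i] = stride;
            stride *= shape[i];
        }
        data = new float[stride]; // total element count
    }

    // Element access resolves indices against the runtime strides.
    public float this[params int[] indices]
    {
        get
        {
            int offset = 0;
            for (int i = 0; i < indices.Length; i++)
                offset += indices[i] * strides[i];
            return data[offset];
        }
    }
}
```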

=========================

You ultimately have different types for different scenarios and goals. .NET itself has in-box Vector, Matrix, Plane, Quaternion, and other types for graphics scenarios in the System.Numerics namespace. .NET is then currently building its own set of “tensor primitives” and a general tensor interchange type in the System.Numerics.Tensors namespace, some of which are stable and others which are still experimental.

You’re going to be hard pressed to build something that is better accelerated or handled across all the relevant scenarios, especially when considering the range of x86, x64, Arm64, WASM, and potential other future targets.

Where there are places that need additional APIs, defining them as extensions or utility methods on top of these existing APIs will give you the best performance and compatibility. Opening issues so that we can add them officially in box, where relevant, is also goodness (dotnet/runtime on GitHub). Noting that we can’t add “everything”, nor is everything applicable enough to warrant a central in-box method.

Providing additional interchange concepts can also be done where relevant. On the eventual radar is some kind of Vector2<T> and related set of interfaces so that you can have vectors of half/double. Potentially some kind of similar Point2<T> so that you can have similar concepts (but not all the same APIs) for types like int and byte as well might be possible – Noting that Vector2<int> despite being a common term to see in some libraries in many ways doesn’t make “sense” because core operations like Normalize or Length/Magnitude which are foundational to Euclidean Vectors do not work with integer types. The integer usage is generally as a basic Euclidean Point or other basic tuple, which is a related but not quite identical concept and where the set of functions exposed differs. – This really gets into the differences between scalar/vector/matrix/tensor the mathematical concepts vaguely representing programming arrays or collections and vector/matrix the geometry (typically Euclidean) concepts that represent a concrete piece of information around a coordinate system.

I don’t think that some IVector2<TSelf, T> or similar interface is necessary. It is sufficient to simply define a conversion operator that does what will typically be a bitcast (such as via Unsafe.BitCast) in the cases you have one type of Vector2 and need to reinterpret as another Vector2 (such as interchanging between the .NET and Unity vector types).
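A minimal sketch of what that could look like, assuming .NET 8+ for Unsafe.BitCast (on older runtimes, including current Unity, an Unsafe.As-style reinterpretation or a plain field copy would be the equivalent) and assuming both types are exactly two floats with matching layout:

```csharp
using System.Runtime.CompilerServices;

public static class VectorInterop
{
    // Sketch: reinterpret one 2-float vector layout as another via a bitcast.
    // Assumes both structs are exactly two floats with identical layout.
    public static System.Numerics.Vector2 ToNumerics(UnityEngine.Vector2 v)
        => Unsafe.BitCast<UnityEngine.Vector2, System.Numerics.Vector2>(v);

    public static UnityEngine.Vector2 ToUnity(System.Numerics.Vector2 v)
        => Unsafe.BitCast<System.Numerics.Vector2, UnityEngine.Vector2>(v);
}
```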

5 Likes

While I do think the OP didn't mean generic element types for vectors, but rather generic lengths of vectors, I still totally agree with what you said.

Integer vectors don't make sense, etc… Yet I still have to ask: is it not possible to generically limit functions such as Normalize to only be applicable to IFloatingPoint types?
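Something along these lines seems like what is being asked about; a sketch, assuming the .NET 7 generic math interfaces (Sqrt lives on IRootFunctions<T> rather than IFloatingPoint<T>), with illustrative names:

```csharp
using System.Numerics;

// Sketch: the constraint limits the whole type (or just Normalize, if moved to
// an extension) to element types that support the required operations.
public readonly struct Vec2<T> where T : IRootFunctions<T>
{
    public readonly T X, Y;
    public Vec2(T x, T y) { X = x; Y = y; }

    public T Length() => T.Sqrt(X * X + Y * Y);

    public Vec2<T> Normalized()
    {
        T len = Length();
        return new Vec2<T>(X / len, Y / len);
    }
}

// Vec2<float> and Vec2<double> compile; Vec2<int> does not, because
// int does not implement IRootFunctions<int>.
```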

You're right, but I feel like there has to be a way to unsafely recreate an array where we only need 2 bits for length storage (4 possible states, i.e. sizes up to Vector4), which would then be negligible.

I don't think you're right here… The compiler knows how long the array is, so it should be able to unroll the loop itself, especially because we are creating a generically assigned array length. Or so I presume.

You do bring up a good point though, which is why I am currently thinking about whether it is even possible to hint to the compiler how big an array will stay…

I'm not arguing against specialization. The OP is asking why it is done that way - why he has to write boilerplate code for Vec2 and Vec3. I explained the limitations of C# and the potential performance cost at a basic level. By codegen, I mean generating C# in place of a templating system, not IL/bytecode.

As a Unity user, I'm not up to date on the latest C# improvements. Take a simple element-wise Add method: how do you handle multiple dimensions without writing multiple overloads? You can't loop through fields without reflection, so you need to store the members in an array. Can the compiler skip the per-instance storage of the array length if it is a per-type constant?

You'll either waste space on padding or waste instructions on manual packing/unpacking. It can make sense in some situations, like on the GPU when you're severely bandwidth-starved, but not for general-purpose tasks.

Yes, using old versions of Unity makes me forget we have static interface methods now. Whether the compiler can unroll loops if the method returns a constant, I have not tested.

1 Like

That was directed at tannergooding, not you… Sorry for the confusion. I should have specifically quoted tannergooding.

Sure… if the compiler kept the array around… If we can use it as a compiler hint, though, where the compiler optimizes the array away because it has a constant size, we might be able to get away with using it…

I have not tested this thesis either, but it seems a way to hint to the compiler about a makeshift constant-size, stack-allocated array is possible… See:

Or:

Specifically:

Remarks:
This attribute can be used to annotate a struct type with a single field. The runtime replicates that field in the actual type layout as many times as is specified.

This could mean that a length is not even stored with this method, making it possible to bypass storing a length byte altogether…
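For reference, a minimal sketch of what that looks like, assuming .NET 8 / C# 12 (the type name is made up):

```csharp
using System.Runtime.CompilerServices;

// Sketch: [InlineArray(3)] makes the runtime lay out three copies of the single
// field inline in the struct, so there is no separate array object and no
// stored length; the element count is part of the type itself.
[InlineArray(3)]
public struct Float3Buffer
{
    private float _element0;
}

// Usage: index it like a fixed-size array of 3 floats.
// var buf = new Float3Buffer();
// buf[0] = 1f; buf[1] = 2f; buf[2] = 3f;
// float y = buf[1];
```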

So did I with that paragraph. I hit the quote button under his comment, but it did not show up. This forum is still confusing to me.

The padding statement was in response to your 2-bit idea.

That inline array is very interesting. Looking forward to using it in 2047 :slight_smile:

1 Like

Have you considered System.Runtime.Intrinsics.Vector128<T>? Vector128<T> Struct (System.Runtime.Intrinsics) | Microsoft Learn

Standard floating-point vectors (System.Numerics.Vector2, Vector3, Vector4) are implemented in terms of Vector128<float>.

Vector128<T> is not that generic, since it only accepts a limited list of primitive types, but still.
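For completeness, a minimal sketch of what using it looks like (assuming .NET 7+ for the operators; the method name is just for illustration):

```csharp
using System.Runtime.Intrinsics;

// Sketch: Vector128<T> gives SIMD-width generic operations for the supported
// primitive element types (float, double, int, ...).
static class Vector128Sketch
{
    public static Vector128<float> AddFourFloats()
    {
        var a = Vector128.Create(1f, 2f, 3f, 4f);
        var b = Vector128.Create(10f, 20f, 30f, 40f);
        return a + b; // or Vector128.Add(a, b)
    }
}
```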

Why don’t Vector2 and Vector128<T> implement at least some of the natural “generic math” interfaces like IAdditionOperators<TSelf,TOther,TResult>?

1 Like