It does seem like you are getting into the deeper topics of software development, but do you actually have to?
Do not fall into the trap of “premature optimization”. Does your game contain many trees? A couple hundred MonoBehaviours are still fine, and you can recover some performance via static batching. Unity has a topic on that:
You could read the parent topic of that as well.
First of all, it seems you have one misconception:
Unfortunately, when it comes to rendering, you cannot have a “DoRenderALeaf(leaf_info)” method, call it somehow in parallel for all leaves, and expect a performance bonus over separate MonoBehaviours.
The reason is that not only is reading data back from the GPU slow, sending data to the GPU is slow as well. Every “DoRender” call would be a so-called “draw call”, and you should not have more than maybe a few thousand draw calls per frame on PC.
Instead, you want to send all the information on what to render to the GPU at once. Usually that is not that easy. Unity’s draw call batching does help. The link above will tell you more.
Alternatively there is also Graphics.DrawMeshInstanced (see the Unity Scripting API), but you’ll quickly see that it requires more from the developer.
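To give an idea of what that “more” looks like, here is a minimal sketch of drawing many leaves with Graphics.DrawMeshInstanced. The class and field names (LeafInstancer, leafMesh, leafMaterial) are made up for illustration, and it assumes the material has “Enable GPU Instancing” ticked. One call accepts at most 1023 instances, so the sketch simply clamps to that; a real scene would split the matrices into several arrays.

```csharp
using UnityEngine;

// Hypothetical example component, not from the original post.
public class LeafInstancer : MonoBehaviour
{
    public Mesh leafMesh;
    public Material leafMaterial;   // assumption: GPU instancing is enabled on it
    public int leafCount = 1000;

    const int MaxInstancesPerCall = 1023; // limit of one DrawMeshInstanced call
    Matrix4x4[] matrices;

    void Start()
    {
        leafCount = Mathf.Min(leafCount, MaxInstancesPerCall);
        matrices = new Matrix4x4[leafCount];
        for (int i = 0; i < leafCount; i++)
        {
            // Random placement stands in for whatever algorithm creates the leaves.
            Vector3 position = transform.position + Random.insideUnitSphere * 5f;
            matrices[i] = Matrix4x4.TRS(position, Random.rotation, Vector3.one);
        }
    }

    void Update()
    {
        // No GameObjects involved: all leaves are submitted with this one call,
        // which has to be repeated every frame.
        Graphics.DrawMeshInstanced(leafMesh, 0, leafMaterial, matrices, leafCount);
    }
}
```

Note that nothing here is culled or sorted for you, which is part of why this route asks more from the developer than batching does.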
Another way, which I personally use to populate an underwater world with corals, is to use the particle system and make all leaves particles (which you set by code) with infinite lifetime. The particle system also batches them into a single draw call (and that happens in a C++ implementation, so it is really fast). Of course this is only really meaningful if your leaves are already created by an algorithm and you don’t need to place them by hand in the editor.
Here is a quick overview of the parallelism techniques, as far as I know them:
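Roughly, the particle approach could look like the sketch below. This is not my exact project code: the component name LeafParticles, the random placement and the chosen lifetime value are assumptions for illustration, and a very large finite lifetime stands in for “infinite”. It expects a ParticleSystem on the same GameObject.

```csharp
using UnityEngine;

// Hypothetical example component: fills a ParticleSystem with static "leaves".
[RequireComponent(typeof(ParticleSystem))]
public class LeafParticles : MonoBehaviour
{
    public int leafCount = 5000;

    // Assumption: about 11 days, long enough to count as "infinite" in a game session.
    const float EffectivelyInfinite = 1e6f;

    void Start()
    {
        var ps = GetComponent<ParticleSystem>();

        // Make room for all leaves and keep the system from emitting its own particles.
        var main = ps.main;
        main.maxParticles = leafCount;
        var emission = ps.emission;
        emission.enabled = false;

        var particles = new ParticleSystem.Particle[leafCount];
        for (int i = 0; i < leafCount; i++)
        {
            // Positions would normally come from your placement algorithm.
            particles[i].position = Random.insideUnitSphere * 10f;
            particles[i].startSize = 0.2f;
            particles[i].startColor = Color.green;
            particles[i].startLifetime = EffectivelyInfinite;
            particles[i].remainingLifetime = EffectivelyInfinite;
        }

        // Hand the whole array to the particle system in one go.
        ps.SetParticles(particles, leafCount);
    }
}
```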
Compute shaders:
A way to do massively parallel computations on the GPU. Working with this is tricky because there’s usually no debugger or print method available for the code that runs on the GPU. When you harness that power you can do amazing things like these:
Every frame, a formula is executed for every single pixel of the fluid texture!
This is only really useful for very specific use cases in game development, though. As you already know, retrieving the data from the GPU is an additional hurdle, so it’s most commonly used for visual effects where all information remains on the GPU. In the case of the fluid, the newly computed fluid particle positions are used directly in the next frame, so no data needs to be sent back to the CPU.
Vertex manipulation in shaders:
The enemy of batching draw calls is changing data. E.g. if you want the leaves to move in the wind, all batching is thrown out if you need to change the coordinates, angles etc. every frame. Normal, regular shaders come to the rescue. You can access a time value in a shader and manipulate the vertices as well as colors etc. to achieve the desired effect very efficiently. Randomly varying the orientation of the leaves can be done the same way.
This is less involved than compute shaders and you’ll find many tutorials.
The topics above are interesting for rendering-related things, but there can be other situations where you have high performance requirements. Then your options are:
Threads:
These are almost an honorable mention because they are not very common in Unity.
In principle, however, your Unity scripts are just C# code, so you can spawn threads and even processes. The infamous “race conditions” are the typical result though, and Unity has put a hard stop to those: you are not allowed to access pretty much any Unity API from any thread other than the “main thread” (which is the one calling FixedUpdate, Update etc.). You cannot even set the transform.position of a game object.
Furthermore, spawning threads is slow. Creating and closing a thread every frame will most likely eat up any gains from parallelization.
Thread pools address that, but they get tricky.
Personally I only use threads to monitor a custom written DLL.
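If you do go down the thread route, a common pattern is to let the worker do pure C# work and hand the result back to the main thread through a thread-safe queue. The sketch below assumes exactly that; the class name BackgroundWorker and the dummy calculation are made up for illustration. Task.Run borrows a thread from the .NET thread pool, which avoids the cost of creating a fresh thread.

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;
using UnityEngine;

// Hypothetical example component, not from the original post.
public class BackgroundWorker : MonoBehaviour
{
    // Thread-safe hand-over point between the worker and the main thread.
    readonly ConcurrentQueue<Vector3> finished = new ConcurrentQueue<Vector3>();

    void Start()
    {
        Task.Run(() =>
        {
            // Only plain C# in here: no Unity API calls on this thread.
            double sum = 0;
            for (int i = 1; i <= 1000000; i++) sum += System.Math.Sqrt(i);

            // Vector3 is a plain struct, so constructing one here is fine.
            finished.Enqueue(new Vector3(0f, (float)(sum % 10.0), 0f));
        });
    }

    void Update()
    {
        // Back on the main thread, touching the transform is allowed again.
        while (finished.TryDequeue(out Vector3 position))
            transform.position = position;
    }
}
```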
Job System:
Instead of threads that are managed by the OS, Unity has its own form of micro threads, the “jobs”. Using them is somewhat involved, but it can be learned and the benefits are awesome. You need a more data-oriented way of thinking (as opposed to object-oriented), but in return Unity can guarantee that there are no race conditions without you having to deal with semaphores, mutexes and similar constructs. Despite running on the CPU, you can easily start hundreds of jobs and they will be executed with incredibly little overhead. Unity uses them internally for several systems as well.
API-wise, jobs work a tiny bit like compute shaders: you provide the data, launch the job (via Schedule()) and later (ideally after 1-4 frames), when you need the result, you enforce completion with Complete() (in case it hasn’t finished yet). Since everything stays on the CPU, there’s no extra time needed for data retrieval.
Finally, jobs work in tandem with the Burst compiler. As a programmer you do not need to do much besides adding it to the project and marking your job structs accordingly. It results in a massive speed boost compared to regular C#. I’ve witnessed 40x.
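To make the Schedule()/Complete() flow a bit more concrete, here is a minimal sketch of a Burst-compiled parallel job. It assumes the Burst and Mathematics packages are installed; the names LeafSwayJobs and SwayJob and the sway formula are invented for illustration. For simplicity it completes the job in LateUpdate of the same frame rather than a few frames later.

```csharp
using Unity.Burst;
using Unity.Collections;
using Unity.Jobs;
using Unity.Mathematics;
using UnityEngine;

// Hypothetical example component, not from the original post.
public class LeafSwayJobs : MonoBehaviour
{
    [BurstCompile] // marks the job struct for the Burst compiler
    struct SwayJob : IJobParallelFor
    {
        public float time;
        [ReadOnly] public NativeArray<float> phases;
        public NativeArray<float> angles;

        // Called once per index, spread over the worker threads.
        public void Execute(int index)
        {
            angles[index] = 10f * math.sin(time + phases[index]);
        }
    }

    const int LeafCount = 10000;
    NativeArray<float> phases, angles;
    JobHandle handle;

    void OnEnable()
    {
        // Persistent allocations may live across frames but must be disposed manually.
        phases = new NativeArray<float>(LeafCount, Allocator.Persistent);
        angles = new NativeArray<float>(LeafCount, Allocator.Persistent);
        for (int i = 0; i < LeafCount; i++) phases[i] = i * 0.1f;
    }

    void Update()
    {
        var job = new SwayJob { time = Time.time, phases = phases, angles = angles };
        handle = job.Schedule(LeafCount, 64); // 64 = indices handed out per batch
    }

    void LateUpdate()
    {
        handle.Complete(); // waits only if the job has not finished yet
        // angles can now be read safely on the main thread.
    }

    void OnDisable()
    {
        handle.Complete();
        phases.Dispose();
        angles.Dispose();
    }
}
```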
ECS/DOTS:
The Entity Component System, which is part of Unity’s Data-Oriented Technology Stack, goes another step further.
Parallelization of code-execution is sometimes not enough because even if you tell your CPU to compute X things in parallel and it has the cores to do so, the memory is often the final bottleneck.
That’s why, to achieve the absolute maximum, you need to think about how your data is stored in memory so that it is available as quickly as possible. That usually means keeping data that is needed at the same time “nearby”, so there is less random memory access.
ECS is a framework for exactly that.
It builds upon Jobs and Burst but tries to have few ties to MonoBehaviour instances, because those are stored in those undesirable arbitrary memory locations. Instead you have ordered “Entities” with their data stored in lists.
It is a different way of programming. However, you do not need to dig into it right away unless you intend to develop an RTS game.
It’s also still in development.
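The memory layout idea can be shown without any ECS API at all. The sketch below is plain C#, not Unity’s Entities package, and all names in it are hypothetical: the first method has to chase a reference per leaf, the second walks one tightly packed array.

```csharp
// Object-oriented layout: every leaf is its own heap object, so an array of
// leaves is really an array of references to scattered memory locations.
class LeafObject
{
    public float Angle;
    public string Name = "leaf"; // unrelated data that gets dragged along
}

static class LayoutDemo
{
    // Follows one reference per leaf before it can read the angle.
    public static float SumAnglesObjects(LeafObject[] leaves)
    {
        float sum = 0f;
        foreach (var leaf in leaves) sum += leaf.Angle;
        return sum;
    }

    // Data-oriented layout: only the values that are needed, stored contiguously.
    public static float SumAnglesPacked(float[] angles)
    {
        float sum = 0f;
        for (int i = 0; i < angles.Length; i++) sum += angles[i];
        return sum;
    }
}
```

Both methods compute the same result; the difference only shows up in how the memory is accessed once there are very many leaves.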
Coroutines:
Another honorable mention because, despite their name (which may confuse you if you come from other programming languages), they are not “real” parallelism. They are only a concise way to tell Unity to execute something over the next X frames (or until a condition is met etc.). They are executed on the main thread, so there is no direct performance gain. They can bring a small gain if the alternative is to have many “if” checks in an Update() method, which would be evaluated every frame for the entire game, while the coroutine might only run occasionally, say once per minute.
They are easy to learn and lead to more readable code, so worth learning early.
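As a small illustration, here is a sketch of a coroutine that does something once per minute instead of checking a timer in Update() every frame. The class name LeafDropper and the log message are placeholders.

```csharp
using System.Collections;
using UnityEngine;

// Hypothetical example component, not from the original post.
public class LeafDropper : MonoBehaviour
{
    void Start()
    {
        StartCoroutine(DropLeafEveryMinute());
    }

    IEnumerator DropLeafEveryMinute()
    {
        // Reuse one wait object instead of allocating a new one each loop.
        var oneMinute = new WaitForSeconds(60f);
        while (true)
        {
            // Unity resumes the method here, on the main thread, after the wait.
            yield return oneMinute;
            Debug.Log("Dropping a leaf");
        }
    }
}
```

The coroutine stops on its own when the GameObject is destroyed or deactivated, so the endless loop is not a problem here.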
Hope this gives you a head start.
P.S. Before you employ any of the parallelism solutions: use the Profiler and the Frame Debugger to verify that you actually have a performance problem in the code you are trying to rework.