You can already do this if you don't have a discrete GPU. If you do have a discrete GPU, it makes more sense to use its parallelism than the APU's, because it's dramatically more efficient and it takes both heat and power load off the APU.
The fact that this isn't a thing points to one of these issues, probably both:
- it's not possible, or it's technically very challenging
- it's not worth the effort: the added programming complexity outweighs the small benefit
Keep in mind those are low-end GPUs. Even a $200 dedicated GPU (it used to be less than half that price before the war in Europe) beats those IGPs (integrated graphics processors). Not just beats them: it stomps them to death.
As I already said, all of these things would be better served by literally any discrete GPU, and it makes sense not to put extra heat and electrical load on an APU for these tasks. Any discrete GPU can handle them without a problem, to the point where trying to split the work up wouldn't even be an optimization. There's also the matter of passing all of this data to the GPU that's actually doing the rendering, in time for the frame.
This would require a ridiculously complex solution to a problem that doesn't even exist, and it would probably create bottlenecks around synchronizing that data. It would likely slow things down, and even if it didn't, it would provide no meaningful benefit over even the most low-power discrete GPUs.
It's not just that they're low-end, either. Integrated graphics has to share memory bandwidth with the CPU. I've seen benchmarks of AMD's four-core APUs showing memory bandwidth utilization, and they were bandwidth-starved. Imagine what happens when the CPU side is no longer just four cores.
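To put a rough number on "bandwidth-starved", here's a tiny comparison sketch. The figures are ballpark assumptions (dual-channel DDR5-4800 for the APU, a 192-bit GDDR6 card for the mid-range example), not benchmarks:

```cpp
// Rough comparison of the memory bandwidth an iGPU has to fight over versus
// what even a mid-range discrete card gets all to itself. Figures are
// ballpark assumptions, not measurements.
#include <cstdio>

int main()
{
    const double ddr5_dual_channel = 4800e6 * 8 * 2; // DDR5-4800, 2 channels: ~77 GB/s
    const double gddr6_midrange    = 360e9;          // e.g. a 192-bit GDDR6 card: ~360 GB/s

    std::printf("APU total (CPU cores + iGPU share it): %.0f GB/s\n", ddr5_dual_channel / 1e9);
    std::printf("Mid-range dGPU (all to itself)       : %.0f GB/s\n", gddr6_midrange / 1e9);
    // The APU figure is the ceiling for *everything*: all the CPU cores, the
    // iGPU, and any extra copies you add on top for a cross-GPU handoff.
    return 0;
}
```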
Additionally, the AMD 7000 series only has a single execution unit in its iGPU. Aside from the hardware encoding and decoding capabilities, it's intended to be a framebuffer and little else. It's definitely not intended for anything that needs GPU-accelerated processing, like games.
Like the others said, the main issue is getting the results of the IGP job over to the discrete GPU, which means uploading texture or mesh data from main memory to the GPU. That will most likely be slower than letting the discrete GPU do the job itself with data it probably already has in its own memory.
It gets really complex when the output needs to be synchronized, i.e. both GPUs having their results ready within the same frame.
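For a sense of what that handoff looks like at the API level, here is a minimal sketch, assuming one OpenGL context per GPU (objects generally can't be shared across GPUs from different vendors, so everything bounces through the CPU). The two make_*_context_current() helpers are hypothetical placeholders for whatever context management you'd actually use (GLFW, SDL, EGL, etc.):

```cpp
// Sketch of a cross-GPU handoff through the CPU. Assumes a GL function loader
// (e.g. glad) and two contexts, one per GPU; the context-switch helpers below
// are hypothetical placeholders.
#include <glad/glad.h>

void make_igpu_context_current();   // placeholder: bind the iGPU's context
void make_dgpu_context_current();   // placeholder: bind the discrete GPU's context

void hand_off_result(GLuint igpuFbo, GLuint dgpuTex,
                     int width, int height, unsigned char* staging)
{
    // 1. Wait on the CPU until the iGPU has actually finished its work.
    make_igpu_context_current();
    GLsync done = glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0);
    glClientWaitSync(done, GL_SYNC_FLUSH_COMMANDS_BIT, 1000000000ull); // wait up to 1 s
    glDeleteSync(done);

    // 2. Read the result back into system memory (stalls the pipeline).
    glBindFramebuffer(GL_READ_FRAMEBUFFER, igpuFbo);
    glReadPixels(0, 0, width, height, GL_RGBA, GL_UNSIGNED_BYTE, staging);

    // 3. Upload the exact same bytes again, into the discrete GPU's memory.
    make_dgpu_context_current();
    glBindTexture(GL_TEXTURE_2D, dgpuTex);
    glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, width, height,
                    GL_RGBA, GL_UNSIGNED_BYTE, staging);
    // Only now can the discrete GPU consume the iGPU's output, and the whole
    // dance repeats every single frame.
}
```

The readback and re-upload in steps 2 and 3 are exactly the copies people mention further down, and step 1 is the synchronization point that has to land inside the same frame.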
Finally, you can simply plug in a second discrete GPU and nearly double the processing power. There are apps like DaVinci Resolve that can dedicate an entire GPU to specific background processing tasks while the main GPU does the usual work. Even these apps don't bother trying to use an available IGP, which is telling.
The whole multi-GPU thing died several years ago. At the API level you can barely use two identical GPUs together for rendering anymore, let alone two from different vendors.
How do you even get the results from one GPU to the other?
The closest we got was when PhysX allowed using one of the GPUs for physics. But this was mostly limited to non-gameplay physics like cloth and hair simulations, and the data still had to go back to the CPU, at the very least, which adds massive latency.
A readback, then a texture upload, at the very least. Both of which are things you want to minimise.
The IGPU uses system RAM, which is slower than VRAM, and would need its own copies of stuff, so it both uses space and competes for bandwidth.
And some of the suggested uses would require getting data from the main GPU's buffers. Rendering a typical particle effect on a different GPU requires depth data. So the main GPU has to finish opaque rendering (at least), the depth buffer has to be read back and then uploaded to the IGPU, which (slowly) does its thing while using up some RAM and bandwidth, and outputs the result to a texture. That texture is then read back and uploaded to the main GPU, where it's… probably rendered as a quad with a shader anyway? And the main GPU may not be able to move past the transparent render queue until all of that has happened.
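To put rough numbers on that round trip (all assumptions: a 1080p 32-bit depth buffer going out, an RGBA16F particle layer coming back, and about 25 GB/s of usable PCIe 4.0 x16 bandwidth):

```cpp
// Ballpark cost of the depth-buffer round trip described above, per frame.
// Bandwidth and format figures are assumptions, not measurements.
#include <cstdio>

int main()
{
    const double frame_budget_ms = 1000.0 / 60.0;   // 60 fps target
    const double depth_bytes  = 1920.0 * 1080 * 4;  // 32-bit depth out to the iGPU
    const double result_bytes = 1920.0 * 1080 * 8;  // RGBA16F particle layer coming back
    const double pcie_Bps     = 25e9;               // usable PCIe 4.0 x16, assumed

    // The depth buffer crosses the bus once (dGPU -> system RAM) and the
    // finished particle texture crosses it once on the way back (RAM -> dGPU).
    const double pcie_ms = (depth_bytes + result_bytes) / pcie_Bps * 1000.0;

    std::printf("PCIe copy time : %.2f ms of a %.2f ms frame budget\n",
                pcie_ms, frame_budget_ms);
    // Roughly 1 ms gone before the iGPU has simulated or drawn a single
    // particle, before any sync stalls, and not counting the extra copies that
    // land in the same system RAM the CPU and iGPU already share.
    return 0;
}
```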
In short, it's a lot of work to be able to do something more slowly.
For games only. SLI was always a niche, and it was always rife with issues (I had SLI once, about 15 years ago). Since it was targeted at hardcore gamers, the issue that concerned prospective buyers the most was the additional input lag it introduced (if they knew about it in advance). I'd say that, plus the fact that within less than two years you could get the same horsepower on a single GPU for half the price, made SLI a terrible choice, on top of all the compatibility issues, including some fullscreen post-processing effects not working (back in the day in WoW, for instance, you couldn't enable Bloom or FXAA on SLI, or at least not both at the same time, something like that).
However, for other heavy-compute purposes such as mining bitcoin, video processing, scientific calculations and so on, the whole multi-GPU thing took off. Some things (fortunately) seem to be on the way out though, like game streaming (goodbye Stadia, it was nice not knowing you).
Nvidia now makes dedicated GPUs without video-out ports that are essentially "RTX 3099" variants of the same gaming-level chips they produce, like the A100 line (the H100 line is for data centers): https://www.nvidia.com/en-us/data-center/a100/
Yeah, multi-GPU never really made that much sense for games in the first place because its entry price always made it a really small niche, but I've personally eyed it for improving offline rendering speeds and for some ML applications.
Multi-GPU works if there's little interdependency between the data on each GPU, or if the data would have to be routed to/from main memory anyway (like video processing), but yeah, real-time rendering is a different beast altogether.
You could use other GPUs for some compute tasks, but it would only be worth it for work whose results get sent back to the CPU anyway. The latency and bandwidth restrictions on that limit its usefulness for games. Something like running a particle system on another GPU would be a nightmare, and the bottleneck of getting the simulation results from one GPU to the other would eat away most of the gains of offloading it in the first place.
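For the one case that can pay off, results the CPU wants anyway (pathfinding, bakes, analysis jobs), the shape is roughly this. A sketch only, assuming a GL 4.3+ context already created on the second GPU, with the compute program and buffer set up elsewhere:

```cpp
// Sketch: offloading a compute job to a second GPU when the CPU is the final
// consumer, so there's only one copy (GPU -> CPU) instead of a full round trip.
// Assumes a GL 4.3+ context on that GPU and a loader (e.g. glad); creating
// `computeProgram` and `resultBuffer` is left out.
#include <glad/glad.h>
#include <vector>

void run_offloaded_job(GLuint computeProgram, GLuint resultBuffer,
                       GLsizeiptr resultSize, std::vector<float>& out)
{
    glUseProgram(computeProgram);
    glDispatchCompute(1024, 1, 1);                    // the actual work
    glMemoryBarrier(GL_BUFFER_UPDATE_BARRIER_BIT);    // make SSBO writes visible to the readback

    // Single GPU -> CPU copy; the CPU stalls here until the job is finished.
    // Tolerable for results you need next frame or later, hopeless for
    // anything the main GPU needs in the middle of the current frame.
    glBindBuffer(GL_SHADER_STORAGE_BUFFER, resultBuffer);
    out.resize(resultSize / sizeof(float));
    glGetBufferSubData(GL_SHADER_STORAGE_BUFFER, 0, resultSize, out.data());
}
```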
We saw this problem a lot even with things like SLI, which was designed to make this easier. SLI was an inconsistent experience at best, often suffering from pretty bad glitches, or some games outright not working if it was enabled at all.