CPUs from Intel and AMD now have GPUs built in. Can Unity tap into this unused processing power?

AMD's 7000 series and a lot of Intel CPUs now have on-board GPUs that, in a modern gaming rig, will just be sitting there twiddling their thumbs.

Can these modest chunks of parallel processing power be tapped into by Unity?

Or could we as game developers detect these dormant GPUs and use their power to improve our games?

PS: How would you use on-chip GPU processing power if it were easy to tap into?
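For reference, this is roughly as far as you can get from inside Unity just by asking about the device it is rendering on. It's a minimal sketch: SystemInfo only describes the active GPU, it doesn't enumerate idle adapters, and the "looks integrated" check is a name-based heuristic, not a guarantee.

```csharp
using UnityEngine;

// Minimal sketch: Unity only exposes the GPU it is actually rendering on via
// SystemInfo. Enumerating other, idle adapters would need a native plugin
// (DXGI/Vulkan) and is not shown here.
public class GpuInfoProbe : MonoBehaviour
{
    void Start()
    {
        string deviceName = SystemInfo.graphicsDeviceName;   // e.g. "AMD Radeon(TM) Graphics"
        int vramMb = SystemInfo.graphicsMemorySize;          // shared-memory iGPUs report small values

        // Heuristic only: device names are vendor-dependent and not guaranteed stable.
        bool looksIntegrated =
            deviceName.Contains("Radeon(TM) Graphics") ||
            (deviceName.Contains("Intel") && (deviceName.Contains("UHD") || deviceName.Contains("Iris")));

        Debug.Log($"Rendering on: {deviceName} ({vramMb} MB), integrated? {looksIntegrated}");
    }
}
```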

You can already do this if you don't have a discrete GPU. If you do have one, it makes more sense to use that parallelism than the stuff on an APU, because it will be dramatically more efficient and will take both heat and power load off of the APU.

The fact that this isn't a thing is telling of one of these issues, or probably both:

  • it's not possible or technically very challenging
  • it's not worth the effort due to increased programming complexity vs little benefit

Keep in mind those are low-end GPUs. Even a $200 dedicated GPU (it used to be less than half that price before the war in Europe) beats those IGPs (integrated graphics processors). Not just beats them, it stomps them to death.

What some say is the fastest integrated GPU from AMD scores around 2000 points:
https://www.videocardbenchmark.net/gpu.php?gpu=Ryzen+7+5700G+with+Radeon+Graphics&id=4405

The GTX 1050 Ti, one of the lowest-end Nvidia cards you can still buy, scores around 6000 points:
https://www.videocardbenchmark.net/gpu.php?gpu=GeForce+GTX+1050+Ti&id=3595


It's probably tricky to get the balance right of moving non-essential features, or some light but broad processing, to these IGPUs.

Maybe the key is to think of what they could be used for:

  • In-game low-resolution camera/TV systems.
  • The mini map.
  • Some particle or cloud systems.
  • Animal or plant systems, e.g. fish (boids), birds, insects, rats.

Could they be used for mesh distortion or destruction, offloading work from the CPU to IGPUs?
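For context, this is roughly how that kind of offloading looks today as a compute shader dispatch. It's a hedged sketch with placeholder names ("BoidsCompute", the "CSMain" kernel and "_Positions" buffer are assumptions), and the important caveat is that Unity schedules this on the GPU it renders with; there is no built-in way to target a second, idle IGPU.

```csharp
using UnityEngine;

// Sketch: moving simulation work (boids, particles, mesh deformation) off the
// CPU via a ComputeShader. The shader asset and its kernel/buffer names are
// placeholders; Unity runs the dispatch on the active rendering device.
public class BoidsDispatcher : MonoBehaviour
{
    public ComputeShader boidsCompute;   // hypothetical .compute asset
    const int BoidCount = 4096;

    ComputeBuffer positions;
    int kernel;

    void Start()
    {
        positions = new ComputeBuffer(BoidCount, sizeof(float) * 3);
        kernel = boidsCompute.FindKernel("CSMain");
        boidsCompute.SetBuffer(kernel, "_Positions", positions);
    }

    void Update()
    {
        boidsCompute.SetFloat("_DeltaTime", Time.deltaTime);
        // Assumes 64 threads per group in the .compute file.
        boidsCompute.Dispatch(kernel, BoidCount / 64, 1, 1);
        // Results stay in GPU memory and can be consumed directly by rendering
        // shaders; reading them back to the CPU is the expensive part.
    }

    void OnDestroy() => positions.Release();
}
```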

As I already said, all of these things would be better served by literally any discrete GPU, and it would make sense not to apply extra heat and electrical load to an APU for these tasks. Any discrete GPU is going to be able to handle these things no problem, to the point where trying to divide up these tasks would not even be an optimization. There's also the matter of getting all of this over to the GPU that's actually rendering, in time.

This would require a ridiculously complex solution to solve a problem that doesn't even exist and would probably lead to bottlenecks when it comes to synchronizing that data. This would likely slow things down and, if it didn't, it would provide no meaningful benefit over even the most low-power discrete GPUs.


It's not just that they're low-end either. Integrated graphics has to share memory bandwidth with the CPU. I've seen benchmarks for AMD's four-core APUs showing memory usage, and they were bandwidth-starved. Imagine what will happen to memory bandwidth when the CPU is no longer just four cores.

Additionally, the AMD 7000 series only has a couple of compute units in its IGPU. Aside from the hardware encoding and decoding abilities, it's intended to be a framebuffer and little else. It's definitely not intended for anything that needs GPU-accelerated processing, like games.

Like the others said, the main issue is getting the results of the IGP job over to the discrete GPU which means uploading a texture or mesh data from main memory to the GPU. That will most likely be slower than letting the discrete GPU perform that job with data that it likely already has in memory.

It gets really complex when the output needs to be synchronized, i.e. both GPUs having their results ready in the same frame.

Finally, you can simply plug in a second discrete GPU and nearly double the processing power. There are apps like DaVinci Resolve which can dedicate an entire GPU to specific background processing tasks while the main GPU does the usual stuff. These apps wouldn't even bother trying to use any available IGP, which is telling.


The whole multi-GPU thing died several years ago. At the API level, you can barely use two of the same GPUs together for rendering anymore, let alone two from different vendors.

How do you even get the results from one GPU to the other?

The closest we got was when PhysX allowed using one of the GPUs for physics. But this was limited mostly to non-gameplay physics like cloth/hair simulations, and the data still had to go back to the CPU at the very least, which adds massive latency.


A readback, then a texture upload, at the very least. Both of which are things you want to minimise.

The IGPU uses system RAM, which is slower than VRAM, and would need its own copies of stuff, so it both uses space and competes for bandwidth.

And some of the suggested uses would require getting data from the main GPU's buffers. Rendering a typical particle effect on a different GPU requires depth data. So the main GPU has to finish opaque rendering (at least), the depth buffer has to be read back then uploaded to the IGPU, which (slowly) does its thing while using up some RAM and bandwidth and outputs the result to a texture. That texture is then read back and uploaded to the main GPU, where it's… probably rendered as a quad with a shader anyway? And the main GPU may not be able to move past the transparent render queue until that's happened.
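To put that round trip in code, here is a sketch of just the CPU-side hand-off, assuming (hypothetically) that a second device had somehow produced "offscreenResult" for us. Even with the asynchronous readback API, the data has to come back to system memory and then be uploaded to the rendering GPU again.

```csharp
using UnityEngine;
using UnityEngine.Rendering;

// Sketch of the readback -> re-upload round trip. "offscreenResult" is a
// placeholder standing in for data produced elsewhere; in a normal single-GPU
// Unity project both steps below are pure overhead.
public class ReadbackRoundTrip : MonoBehaviour
{
    public RenderTexture offscreenResult;   // pretend this came from the IGPU
    Texture2D cpuCopy;

    void Start()
    {
        cpuCopy = new Texture2D(offscreenResult.width, offscreenResult.height,
                                TextureFormat.RGBA32, false);
    }

    void LateUpdate()
    {
        // 1) GPU -> CPU: async readback still costs at least a frame of latency.
        AsyncGPUReadback.Request(offscreenResult, 0, TextureFormat.RGBA32, request =>
        {
            if (request.hasError) return;

            // 2) CPU -> GPU: re-upload to the rendering GPU as a regular texture.
            cpuCopy.LoadRawTextureData(request.GetData<byte>());
            cpuCopy.Apply(false);
            // ...at which point it's just a texture the main GPU could have
            // rendered directly in the first place.
        });
    }
}
```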

In short, itā€™s a lot of work to be able to do something more slowly.


For games only. SLI has always been a niche, and was always rife with issues (I had SLI once, about 15 years ago). Since it was targeted at hardcore gamers, the issue that concerned prospective buyers the most was the additional input lag it introduced (if they knew about it in advance). I'd say that, plus the fact that within less than two years you could get the same horsepower on a single GPU for half the price, made SLI a terrible choice. On top of that came all the compatibility issues, including not being able to use some fullscreen post-processing effects (back in the day in WoW, for instance, you couldn't enable Bloom or FXAA on SLI, or not both at the same time, something like that).

However, for other heavy-compute purposes such as mining bitcoins, video processing, scientific calculations, and so on, the whole multi-GPU thing took off. Some things (fortunately) seem to be on the way out though, like game streaming (goodbye Stadia, it was nice not knowing you).

Nvidia now makes dedicated GPUs without video-out ports that are essentially "RTX 3099" variants of the same gaming-level chips they produce, like the A100 line (the H100 line is for data centers): https://www.nvidia.com/en-us/data-center/a100/


Yeah, multi-GPU never really made that much sense for games in the first place because its entry price always made it a really small niche, but I've personally eyed it for improving offline rendering speeds and for some ML applications.

Multi-GPU works if there's little interdependency between the data on each GPU, or if the data would have to be routed to/from main memory anyway (like video processing), but yeah, real-time rendering is a different beast altogether.

You could use other GPUs for some compute tasks, but it would only be worth it for results that get sent back to the CPU anyway. The latency and bandwidth restrictions on that limit its use for games. Something like running a particle system on another GPU would be a nightmare; the bottleneck of getting the simulation from one GPU to the other would eat away most of the gains of offloading it in the first place.
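To illustrate that "send it back to the CPU" cost, here is a hedged sketch using an assumed compute shader and buffer (the "sim" asset, "CSMain" kernel and "_Results" name are placeholders). Even on a single GPU, the blocking readback stalls the CPU, and the asynchronous version trades the stall for a frame or more of latency; the same trade-off would apply to any cross-GPU hand-off.

```csharp
using UnityEngine;
using UnityEngine.Rendering;

// Sketch: the cost of getting compute results back to the CPU.
// GetData() is a synchronous stall; AsyncGPUReadback avoids the stall but
// delivers the data one or more frames after the dispatch.
public class ComputeReadbackCost : MonoBehaviour
{
    public ComputeShader sim;   // hypothetical simulation shader
    ComputeBuffer results;
    int kernel;

    void Start()
    {
        results = new ComputeBuffer(1024, sizeof(float));
        kernel = sim.FindKernel("CSMain");
        sim.SetBuffer(kernel, "_Results", results);
    }

    void Update()
    {
        sim.Dispatch(kernel, 1024 / 64, 1, 1);

        // Option A: blocking readback -- the CPU waits for the GPU right here.
        // float[] data = new float[1024];
        // results.GetData(data);

        // Option B: async readback -- no stall, but results arrive frames later.
        AsyncGPUReadback.Request(results, request =>
        {
            if (!request.hasError)
            {
                var data = request.GetData<float>();   // NativeArray<float>
                // ...consume on the CPU, well after the dispatch that produced it.
            }
        });
    }

    void OnDestroy() => results.Release();
}
```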

We even saw this problem a lot with things like SLI, which was designed to make this easier. SLI was always an inconsistent experience at best, often suffering from pretty bad glitches, or some games just outright not working if it was enabled at all.

"Look, I have my new V8 sitting out in the garage. But I'm wondering if I can connect it up to my RC model car to get more power."
