
OpenGL Multi-threading, what it is and what it means

Disclaimer: this information is all easily available around, I’m trying to condense it a bit, and focus on a particular topic for discussion. I may gloss over, or simplify information, but the central ideas should apply.

There’s been a fair bit of talk recently about attempting to multi-thread using OpenGL so I thought I’d write a bit more about what “multi-threading” an OpenGL game is, what’s normally done, and how it compares to multi-threading in Vulkan.

Some History

For those not aware, OpenGL is, in computing terms, old. It was designed before multiple CPU cores were even available to the general consumer, and long before just about every part of a graphics pipeline was programmable.
The central concept of OpenGL is a state machine. This has served it very well for a long time, but a single OpenGL context (state machine) is based on sequential inputs from the application - the application calls the API, and the OpenGL implementation reacts. State is changed, rendering commands issued, resources loaded, and so on.


State of OpenGL

Being state based, and because almost any API call has the potential to change state, multi-threaded access to a single OpenGL context is very difficult. Say one thread is working with some state "x" and assumes it has exclusive use of it, but another thread changes "x" to "x+1"; when the original thread carries on, it has no idea the state underneath it has changed. Concurrent access to one context from multiple threads isn’t permitted in most cases and can result in exactly that kind of undefined behaviour. There are some exceptions: contexts are allowed to share certain data such as texture information and vertex buffers, but more on that later.
A further consequence of a state-based design is that OpenGL implementations must ensure the state is always valid. They must ensure that data is correctly bound, in range, and that nothing will break the system. Up to this point, everything is still CPU-side. If everything is okay, the implementation may then generate commands to send to the GPU itself.

Drivers can do some fancy things behind the scenes of course, but the end result, as presented to the application, is the same. Accept an API command, modify and verify state, send hardware commands to the GPU.

Recent versions of OpenGL have been a big help in cutting out a lot of the overhead. State-validity checking can be reduced, greatly shortening the path from API call to GPU command, but it’s still very much something that has to happen on a single thread.


Threading

So how can developers “multi-thread” OpenGL? It is possible to have multiple contexts in multiple threads, and use them to load texture data, update vertex buffers, and possibly compile new shaders, in different threads. The tricky part is that sharing this information between OpenGL contexts depends on the drivers behaving themselves, in addition to the application not trying (by accident or intent) to do anything strange, so it’s often unstable. It can be quite the adventure getting a game running with this approach, and the runtime improvements are often simply not worth the effort - it can run worse if drivers need to synchronise data between contexts frequently. For the curious, things like editors with multiple rendering windows do this, but that’s a different scenario - each window isn’t trying to interfere with the others while rendering, so multi-threading doesn’t normally come into play.

This leads to the second approach to multi-threading OpenGL: developers don’t! If OpenGL works best by submitting commands sequentially on the thread where a context is active, then that’s simply the best thing to do. Nothing stops a game developer building their own queue of OpenGL API calls to perform, though, and building that queue can be multi-threaded. To give an example, if a game has a big list of objects, there’s going to be a bit of processing in deciding whether to draw each one: the game first decides whether each object might be visible, and only renders it if it will actually be seen. The check for each object takes time, but each object is independent. So the list can be split into multiple sub-lists, and each sub-list given to a separate thread to run visibility checks on. Each thread has its own rendering list, to which objects that should be rendered are added. When done, each rendering list can be iterated over in turn and the objects submitted to OpenGL from a single thread. This is a very simple example, but there’s normally a fair amount of similar logic in deciding what to render. So it’s not multi-threading OpenGL, but rather multi-threading the decisions about how to use OpenGL.


Vulkan

Earlier I mentioned that OpenGL verifies state information and then generates commands for the GPU.

Firstly, once a developer has finished making everything work, all that verification is still done, but is no longer actually required. It’s useful during development, but later it’s (hopefully!) a waste of time. So even with a dedicated thread submitting commands to OpenGL, there’s quite an overhead before commands are actually sent to the GPU itself. It would be nice if there were some way to pre-build a list of commands, known to be valid, to send to the GPU.
Secondly, as with the example above of a game splitting object visibility checks into multiple sub-lists, it would also be nice if multiple GPU command lists could be created on separate threads, and then submitted to the GPU in turn. They are separate after all, and don’t require GPU access to actually prepare.
This is essentially what Vulkan allows. There are some requirements: all the state must be known up-front, and prepared for well before it’s time to actually render something. The flip side is that there is much, much less driver overhead, and the API itself can be used multi-threaded. Actual submission of commands to the GPU is still done sequentially, in a single thread, however there’s very little overhead; all error checking has been done, and it’s just sending commands directly to the GPU (feeding the beast).
There are other areas of Vulkan that lend themselves nicely to application-level multi-threading, but I won’t cover them here. Suffice it to say that Vulkan does not contain a central state machine, and instead tries to keep everything as isolated and contained as possible, meaning things like building a shader don’t block loading a texture, making multi-threaded designs easier to achieve.


Not Always Applicable

On a final note: when porting games, the way a game handles its data is not always compatible with the multi-threading ideas mentioned above, so they can’t be expected in every port. In addition, it might simply be easier in time and effort (not to mention testing and stability) to run things in a single thread anyway. Not as efficient, but possibly less error prone and faster to get a port out the door. Article taken from GamingOnLinux.com.
The comments on this article are closed.
18 comments

Shmerl 12 Feb, 2017
Quote: "Actual submission of commands to the GPU is still done sequentially, in a single thread, however there’s very little overhead; all error checking has been done"

Is that true? From what I've read, modern GPUs support multiple input queues (some for graphics, some for compute). I'm not sure what a GPU is supposed to do with multiple graphics queues, for example, since in the end the rendered image is a single frame, but if they exist, it should be possible to feed them from multiple threads (one thread per GPU input queue). And Vulkan should support that.

Also, it's possible to have multiple GPUs working in parallel (Vulkan aims to support that) to increase computational power. You certainly don't want to have one thread feeding such a hardware setup - it's going to be underutilized.


Last edited by Shmerl on 12 February 2017 at 4:58 am UTC
etonbears 12 Feb, 2017
Quoting: Shmerl
Quote: "Actual submission of commands to the GPU is still done sequentially, in a single thread, however there’s very little overhead; all error checking has been done"

Is that true? From what I've read, modern GPUs support multiple input queues (some for graphics, some for compute). I'm not sure what a GPU is supposed to do with multiple graphics queues, for example, since in the end the rendered image is a single frame, but if they exist, it should be possible to feed them from multiple threads (one thread per GPU input queue). And Vulkan should support that.

Also, it's possible to have multiple GPUs working in parallel (Vulkan aims to support that) to increase computational power. You certainly don't want to have one thread feeding such a hardware setup - it's going to be underutilized.

Yes, Vulkan uses a single thread for GPU submission. AFAIK, in hardware terms, the most common case where a single thread may cause throttling would be for an extremely powerful GPU with a relatively weak CPU. In such a case, you would be using a dedicated PCIe card for the GPU, and the need to use PCIe would enforce single-thread synchronization in the driver regardless of what you do higher up the software stack.

AMD's APUs and Intel integrated graphics are different in that they are monolithic silicon, and therefore might be expected to benefit from multi-threading; but as the GPU elements are relatively weak, it is probably not the case.

Either way, an application design is probably more robust if it explicitly synchronizes submission order through a single thread. Responsibility for explicit synchronization is the trade-off developers accept for the benefits of using Vulkan.

Multiple independently operating GPUs ( say, one for compute and another for graphics ) could clearly benefit from a submission thread per GPU, but if they are co-operating on the same tasks, you still need to synchronize submissions, so you would probably still want a single thread to do that work.
mirv 12 Feb, 2017
This is something I kind of glossed over with Vulkan.

I'll limit this to assuming graphics and presentation support is in the same queue family, and on the same queue from that. I'll also state, as above, the hardware itself might require single-threaded submission anyway - but let's pretend that's not the case.

If you use graphics, transfer, and compute with different queues (probably a good thing), it might be ok to submit on separate threads, but there still needs to be synchronisation between them - you can't render an object until the data transfer for it is complete, for example. Given that command submission is relatively cheap (all the heavy lifting of generating commands has been done, so it really is just piping them through to the GPU directly), the effort of trying to sync things might end up making it slower, not to mention making the code more difficult to maintain. And if submitting commands takes longer than the commands take to execute, there are bigger problems!

Multiple queues are useful of course, because the GPU can carry out those workloads in parallel - but using the CPU to tell it how to do that is (most likely) better off single threaded.
Shmerl 12 Feb, 2017
I saw this topic mentioned in the Mantle document: https://www.amd.com/Documents/Mantle-Programming-Guide-and-API-Reference.pdf

Search for "GPU queue" there. Synchronization is also covered there with queue semaphores. So I assume Vulkan should have an analog of the same idea.

I can't find it now, but I saw someone ask a similar question in one of the Khronos Q&As, and they said Vulkan should support multiple parallel GPU queues.


Last edited by Shmerl on 12 February 2017 at 5:31 pm UTC
Shmerl 12 Feb, 2017
Vulkan doc has various chapters as well about devices and queues, and synchronization: https://www.khronos.org/registry/vulkan/specs/1.0/pdf/vkspec.pdf

But they avoid details about practical usage and threading in that regard. I suppose some higher-level articles should dive into that.


Last edited by Shmerl on 12 February 2017 at 6:21 pm UTC
Shmerl 12 Feb, 2017
Here is one interesting article on this topic: http://gpuopen.com/concurrent-execution-asynchronous-queues/

From there it at least seems that GCN hardware has only one graphics queue, but in theory nothing prevents there being multiple, which is clearly a possibility with a multi-GPU setup. I.e. SLI/Crossfire-like scenarios, where multiple GPUs render to a single target, would be one such case. Supposedly it's coming in Vulkan-next.


Last edited by Shmerl on 12 February 2017 at 6:39 pm UTC
Comandante Ñoñardo 13 Feb, 2017
Very informative...
But, in a few words.
Is OpenGL obsolete because it doesn't make efficient use of modern CPUs?

Now, for the sake of porting Windows games to Linux, and for the performance of those ports, instead of using OpenGL wouldn't it be more convenient to teach Linux how to speak D3D11 (like Gallium9 does with D3D9)?

Vulkan may be the future, but it's not the standard in current blockbuster games.
mirv 13 Feb, 2017
Quoting: Comandante Ñoñardo
Very informative...
But, in a few words.
Is OpenGL obsolete because it doesn't make efficient use of modern CPUs?

Now, for the sake of porting Windows games to Linux, and for the performance of those ports, instead of using OpenGL wouldn't it be more convenient to teach Linux how to speak D3D11 (like Gallium9 does with D3D9)?

Vulkan may be the future, but it's not the standard in current blockbuster games.

I wouldn't call OpenGL obsolete. It will be a long while before that might be considered the case.

There are several reasons it would be a bad idea to try to get some kind of DirectX-like API natively on GNU/Linux. That in itself is almost worth another article. Gallium9 is nice & all, but better instead to just focus on Vulkan.