How to use Compute Shaders and DrawIndirect


Click below to see a short video clip of the result.


This tutorial builds on the geometry shader/compute buffer tutorial, which can be followed here.  The purpose of this example is to demonstrate how to use compute shaders and indirect drawing in a straightforward manner.  There are many ways you could extend this example to make it faster.  For example, you could remove the geometry shader entirely by having the compute shader output a quad (a pair of triangles) per pixel at initialization time.

Why Should I Care?

Compute Shaders are great for many reasons.  For one, they allow you to run large amounts of general-purpose calculation on the GPU without having to jump through hoops writing graphics code.  Compute Shaders can accept, as input, any kind of buffer with any kind of data in it.  You could even pass sound data to the GPU to process and then read it back to the CPU to save as an audio file.  You can also compute data at initialization time and then reference it in a classic rasterization shader while your CPU is doing other work.

Keep in mind that any time you send or receive data between the CPU and GPU, there is a performance cost.  Imagine you generate a procedural mesh in your compute shader and then want to draw it.  In order to submit a draw call on the CPU, you need to know how many vertices you have.  However, the compute shader generated the mesh data procedurally on the GPU, so you have no idea how many triangles it generated.  What do you do?  Well, you could read the buffer data back from the GPU to the CPU, but that is pretty unfortunate because the only reason you are reading the data back is to learn how many vertices the GPU generated, just so that you can send that vertex count back to the GPU!  That is like a guy in New York City sending a letter to another guy in Seattle asking what restaurants are in Brooklyn.  Pretty inefficient!

Fortunately, we have a solution for this called DrawProceduralIndirect!  After you finish filling your compute buffer with data, you can use another compute buffer to store the draw call arguments you would normally have passed into DrawProcedural.  Now you don’t need to read any data back from the GPU to the CPU!  The compute buffer data and the draw call args both live in GPU local memory, so they are very fast for the GPU to fetch and use.

DrawProceduralIndirect is a way for the CPU to say “I have no idea how many triangles are in whatever it is the GPU made in the compute shader, but the GPU knows, so just draw it!”
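
To make the idea concrete, here is a minimal sketch of what the C# side could look like.  It is not the code from the sample project.  The class name IndirectPointDrawer, the kernel name "CSMain", the buffer name "points", the thread group size of 64 and the use of an append buffer are all assumptions made for illustration.  It also uses Graphics.DrawProceduralIndirectNow, which is the name Unity 2019.1 and later give to the immediate-mode call; older versions expose it as Graphics.DrawProceduralIndirect.

```csharp
using UnityEngine;

// Hypothetical component: names are illustrative, not taken from the sample project.
public class IndirectPointDrawer : MonoBehaviour
{
    // Assumed assets: a compute shader whose kernel "CSMain" appends float3 values
    // to an AppendStructuredBuffer named "points", and a material whose shader
    // reads a StructuredBuffer<float3> with the same name.
    public ComputeShader generator;
    public Material pointMaterial;
    public int maxPoints = 65536;

    ComputeBuffer pointBuffer;
    ComputeBuffer argsBuffer;

    void Start()
    {
        // Append buffer filled on the GPU; only the GPU knows the final element count.
        pointBuffer = new ComputeBuffer(maxPoints, sizeof(float) * 3, ComputeBufferType.Append);
        pointBuffer.SetCounterValue(0);

        // Draw args are four uints: vertex count per instance, instance count,
        // start vertex location, start instance location.
        argsBuffer = new ComputeBuffer(4, sizeof(uint), ComputeBufferType.IndirectArguments);
        argsBuffer.SetData(new uint[] { 0, 1, 0, 0 });

        int kernel = generator.FindKernel("CSMain");
        generator.SetBuffer(kernel, "points", pointBuffer);
        generator.Dispatch(kernel, maxPoints / 64, 1, 1); // assumes [numthreads(64,1,1)]

        // GPU-side copy of the append counter into the first arg (byte offset 0).
        // The count never travels back to the CPU.
        ComputeBuffer.CopyCount(pointBuffer, argsBuffer, 0);

        pointMaterial.SetBuffer("points", pointBuffer);
    }

    void OnRenderObject()
    {
        // The CPU issues the draw without knowing the vertex count;
        // the GPU reads it from argsBuffer.
        pointMaterial.SetPass(0);
        Graphics.DrawProceduralIndirectNow(MeshTopology.Points, argsBuffer, 0);
    }

    void OnDestroy()
    {
        pointBuffer?.Release();
        argsBuffer?.Release();
    }
}
```

The important line is ComputeBuffer.CopyCount: it writes the number of appended elements straight into the args buffer on the GPU, so the letter never has to go to Seattle and back.  If your compute shader writes to a plain structured buffer instead of an append buffer, it would need to write the vertex count into the args buffer itself.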

Frame Debugger Capture:

The following capture shows that there is a single instanced draw call used to draw all the points.

The following capture shows the contents of the compute buffer that was uploaded to the GPU and used by the rasterization shader.





Download the Sample Project
