
06. GPU Frame Capture: pt.3 Profiling

I explained how to capture GPU frames in Xcode in the previous two episodes. This one focuses only on the Performance section. I won't cover the meaning of each counter and metric, because there are far too many of them to fit here, but you'll at least learn what is available and where to find it.

Overview

  1. Performance section. Most of the performance data we're interested in lives here. There's also a Performance section under every GPU call, but it contains only counters.
  2. Instrument tabs. I'll explain each one in the following sections.
  3. Encoders list. Here you can see the cost of each encoder and select the encoder, pipeline, or call you want to look at more closely.
  4. Instrument area. Here you can find the details of the selected instrument. For Overview, it's a summary, a basic timeline, and shader statistics.
  5. Details for the selected object. Here you can see basic statistics for the selected kernel. As you can see, this is already quite helpful.

Timeline

This instrument shows what happens and when: when each of your encoders runs, how long it takes, and which resources it uses.

  • The top half shows calls: what is called, when, and how long it takes. That helps you see the order of calls, what runs in parallel, and whether there are any idle gaps between calls.
  • The bottom half shows resource usage, limiters, etc. It shows when you use the GPU's capabilities and how much of them. That helps you judge whether your dispatches are set up properly, and optimise them and your shaders for better resource use.

You can set up filters to hide the counters you aren't interested in.

You can also select a group of counters so that the others don't distract you.

NOTE: if you select an encoder or call in the encoders list, it gets highlighted.

Shaders

This section lists the shaders you use, with some basic statistics. It's pretty boring, and you can find the same information in the Overview section.

Heat Map

This instrument is relatively new. It has three sections:

  1. The top section contains the heat maps themselves. The available metrics depend on the shader type; I'll use a fragment shader as the example because it's the easiest to understand. By default there are only the Shader Execution Cost and Color Attachment maps, but you can add others with the [+] button. For example, I've added the Overdraw heat map, which shows where different draw calls overlap.
  2. The middle section shows how the time spent is split between operations. It gives a good visual overview and lets you navigate to the function in the code (next section).
  3. The bottom section shows your code with per-line statistics. You can choose which statistic is shown under the code (currently it's Number of Instructions). I'll provide more details in the next chapter.

Cost Graph

This instrument looks like the Heat Map instrument without the heat maps, but it's more convenient for working with code.

If you hover the mouse cursor over the pie chart, it shows more detailed statistics. That's very helpful for optimisation work (see the last chapter of this episode).

Counters

This instrument shows all counters for every draw or dispatch call. It's similar to the counters in the Timeline instrument, but summarised and with more numbers. Use it to see what's overloaded and what's underloaded, to find optimisation opportunities, and to compare calls with each other.

Particular GPU call counters

If you select the Performance section under a GPU call, you get the counters for that particular call in a more convenient form. I would use this view instead of Counters in the common Performance section, except that here you can't compare counters between different calls.

There's also Pipeline Statistics, but it's too basic and not very informative compared with everything above.

On-the-fly profile

If you select a shader in Bound Resources, you can see the same statistics pie charts, but here you can also edit your code.

Once you've made all the changes you think could improve the performance of your shaders, click the round arrow button. Xcode reruns your captured frames and profiles them again. If you did it right, the percentage of the line you optimised should go down (and the others should go up).
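Why the other lines go up is plain arithmetic, assuming the per-line percentages are shares of the shader's total cost. The sketch below uses made-up cost numbers in arbitrary units and a hypothetical helper `costShares`; it is not how Xcode computes its statistics, only an illustration of the effect.

```swift
/// Hypothetical helper: converts per-line costs (arbitrary units)
/// into percentages of the total.
func costShares(_ costs: [Double]) -> [Double] {
    let total = costs.reduce(0, +)
    guard total > 0 else { return costs.map { _ in 0 } }
    return costs.map { $0 * 100 / total }
}

// Made-up numbers: three shader lines, the first one dominates.
let before = costShares([4.0, 3.0, 3.0])  // 40%, 30%, 30%
// The first line is optimised to half its cost; the other two are untouched.
let after = costShares([2.0, 3.0, 3.0])   // 25%, 37.5%, 37.5%

print(before, after)
```

The untouched lines cost exactly the same as before, but their share grows from 30% to 37.5% because the total shrank. So a rising percentage on a line you didn't touch isn't a regression by itself.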

You can also see the difference in Pipeline Statistics and the percentage difference in the call hierarchy. These numbers can be very inaccurate, though, so for time profiling I recommend using Instruments instead of GPU frame capture.
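If you also want a coarse number straight from your app as a sanity check next to Instruments, a command buffer reports when the GPU started and finished executing it. This is a minimal sketch, not part of the Xcode workflow described here: `gpuMilliseconds` and `commitAndLogGPUTime` are hypothetical helper names, while `gpuStartTime`, `gpuEndTime` and `addCompletedHandler` are `MTLCommandBuffer` APIs. It measures a whole command buffer, so it can't tell you which encoder or shader line is slow.

```swift
#if canImport(Metal)
import Metal
#endif

/// Hypothetical helper: converts a start/end timestamp pair in seconds
/// to a duration in milliseconds, clamped to zero.
func gpuMilliseconds(start: Double, end: Double) -> Double {
    return max(0, end - start) * 1000
}

#if canImport(Metal)
/// Hypothetical helper: commits the command buffer and logs how long
/// the GPU spent executing it. The handler must be added before commit().
/// Assumes a deployment target where gpuStartTime/gpuEndTime are available.
func commitAndLogGPUTime(_ commandBuffer: MTLCommandBuffer) {
    commandBuffer.addCompletedHandler { buffer in
        let ms = gpuMilliseconds(start: buffer.gpuStartTime, end: buffer.gpuEndTime)
        print("\(buffer.label ?? "command buffer"): \(ms) ms on GPU")
    }
    commandBuffer.commit()
}
#endif
```

Call `commitAndLogGPUTime(commandBuffer)` where you would normally call `commit()`, and compare the logged values before and after an optimisation over many frames, not just one.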

Conclusion

  • GPU frame capture can give you lots of helpful performance insights.
  • These tools are still being developed, so they may change.
  • You can see the results of your optimisations on the fly, in the same capture session.
  • There are lots of instruments, so the best way to learn them is to experiment with your own code, because every case has its own nuances.
