I know that getrusage() can provide per-thread CPU utilization, but it only reports the time spent on the CPU. Is there any way to get the number of CPU instructions executed, or the number of cycles spent on the CPU? Basically, I need a reproducible measure of how much work a thread does on the CPU. Any suggestions for doing this in C?

UPDATE (to respond to comments):

  • Ideally I'd need this in a platform-independent way, but Linux would be the most useful.
  • Reproducibility matters most to me, even if that means the actual runtime may be slightly different.
  • I know VTune (and have used it), but I'd like to have this info programmatically while my code is running. So VTune is out, as are the suggestions made in the post linked by Craig Estey.
  • I did look at the Intel Intrinsics Guide, but did not find anything useful...
  • 3
    This has to be platform dependent. Specify an O/S please. Commented Jan 29, 2019 at 19:23
  • 1
    NB. Number of cycles is probably more useful, but not exactly reproducible wrt. pipeline stalls, cache misses etc. Number of instructions should be reproducible, but doesn't tell you much about speed, since instruction latency isn't uniform (except perhaps on older RISC or embedded chips). Commented Jan 29, 2019 at 19:29
  • 2
    No suggestion for doing this in C but if you are using an Intel processor take a look at en.wikipedia.org/wiki/VTune Commented Jan 29, 2019 at 19:43
  • 1
    See: stackoverflow.com/questions/54355631/… It is a virtual duplicate of your question and gives a number of different methods Commented Jan 29, 2019 at 20:07
  • 1
    @tadman: I find that in CPU-intensive code that doesn't sleep or wait for I/O, looking at cycle counts instead of time is a useful way to factor out CPU frequency variation when tuning code to be more efficient on a clock-for-clock basis. That might not hold up if waiting for data from other cores is a factor, though. That depends on uncore clock, not this core's clock. (Although on desktop CPUs, all cores and the uncore are locked to the same frequency.) Commented May 21, 2020 at 16:40

1 Answer

2

Take a look at Google's Filament engine, which does exactly that. Look at its profiler: https://github.com/google/filament/blob/master/libs/utils/src/Profiler.cpp You can also get more info from this link: https://www.youtube.com/watch?v=Lcq_fzet9Iw
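
On Linux, one way to read per-thread hardware counters from inside a running program is the perf_event_open system call. The following is a minimal sketch of that approach, not code taken from the linked profiler. The helper names open_thread_counter, counter_start and counter_stop are hypothetical. It assumes the kernel allows unprivileged perf events (see /proc/sys/kernel/perf_event_paranoid) and that the hardware exposes a performance monitoring unit, which is often not the case inside virtual machines or containers.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open a hardware counter for the calling thread.
 * `config` is e.g. PERF_COUNT_HW_INSTRUCTIONS or PERF_COUNT_HW_CPU_CYCLES.
 * Returns a file descriptor, or -1 on failure with errno set. */
static int open_thread_counter(uint64_t config)
{
    struct perf_event_attr attr;

    memset(&attr, 0, sizeof attr);
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof attr;
    attr.config = config;
    attr.disabled = 1;       /* created stopped; enabled explicitly later */
    attr.exclude_kernel = 1; /* count user-space work only */
    attr.exclude_hv = 1;     /* ignore hypervisor activity */

    /* glibc has no wrapper for this call, so go through syscall().
     * pid = 0, cpu = -1: the calling thread, on whichever CPU it runs.
     * group_fd = -1: no event group; flags = 0. */
    return (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL);
}

/* Hypothetical helper: zero the counter and start counting. */
static int counter_start(int fd)
{
    if (ioctl(fd, PERF_EVENT_IOC_RESET, 0) == -1)
        return -1;
    return ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
}

/* Hypothetical helper: stop counting and read the value into *count. */
static int counter_stop(int fd, uint64_t *count)
{
    if (ioctl(fd, PERF_EVENT_IOC_DISABLE, 0) == -1)
        return -1;
    return read(fd, count, sizeof *count) == (ssize_t)sizeof *count ? 0 : -1;
}
```

To use it, a thread would call open_thread_counter(PERF_COUNT_HW_INSTRUCTIONS) once, wrap the code of interest between counter_start(fd) and counter_stop(fd, &n), and close(fd) when done. Passing PERF_COUNT_HW_CPU_CYCLES instead gives cycles. As noted in the comments, instruction counts tend to be more repeatable than cycle counts, though small run-to-run differences are still possible.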
