Gprof

Gprof is a performance analysis tool for Unix applications. It uses a hybrid of instrumentation and sampling[1] and was created as an extended version of the older "prof" tool. Unlike prof, gprof can collect and print a limited call graph.[1][2]

History

GPROF was originally written by a group led by Susan L. Graham at the University of California, Berkeley for Berkeley Unix (4.2BSD[3]). Another implementation was written as part of the GNU project for GNU Binutils in 1988 by Jay Fenlason.[4][5]

Implementation

Instrumentation code is automatically inserted into the program during compilation (for example, by using the '-pg' option of the gcc compiler) to gather caller-function data. A call to the monitor function 'mcount' is inserted at the entry of each function, so that every call is recorded together with its caller.[6]

Sampling data is saved to a file named 'gmon.out' or 'progname.gmon' just before the program exits, and can be analyzed with the 'gprof' command-line tool. Several gmon files can be combined with 'gprof -s' to accumulate data from several runs of a program.
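
Conceptually, accumulating several runs amounts to adding up the counts from each run. The following Python sketch shows that idea on in-memory data. The dictionary layout with "samples" and "arcs" keys is an assumption made for illustration and does not reflect the gmon file format.

```python
from collections import Counter


def merge_profiles(profiles):
    """Sum per-function sample counts and per-arc call counts over runs.

    Each profile is a hypothetical dict with two mappings:
      "samples": function name -> number of samples
      "arcs":    (caller, callee) -> call count
    """
    total = {"samples": Counter(), "arcs": Counter()}
    for profile in profiles:
        total["samples"].update(profile["samples"])  # update() adds counts
        total["arcs"].update(profile["arcs"])
    return total


if __name__ == "__main__":
    run1 = {"samples": {"compress": 40}, "arcs": {("main", "compress"): 2}}
    run2 = {"samples": {"compress": 30, "checksum": 5},
            "arcs": {("main", "compress"): 1, ("main", "checksum"): 1}}
    print(merge_profiles([run1, run2]))
```

More samples from more runs reduce the relative statistical error of the timing figures.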

GPROF output consists of two parts: the flat profile and the call graph. The flat profile gives the total execution time spent in each function and its percentage of the total running time. Function call counts are also reported. Output is sorted by percentage, with hot spots at the top of the list.
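
As a rough illustration of how a flat profile can be derived, the Python sketch below turns per-function sample counts and call counts into rows sorted with the hottest function first. The function name `flat_profile`, the input dictionaries and the row layout are assumptions for this example and do not match gprof's actual output columns.

```python
def flat_profile(samples, calls, period=0.01):
    """Build flat-profile rows from sampling and call-count data.

    samples: function name -> number of samples that landed in it
    calls:   function name -> number of times it was called
    period:  assumed sampling period in seconds
    Returns (percent, self_seconds, call_count, name) tuples,
    sorted by percentage in descending order.
    """
    total = sum(samples.values())
    rows = []
    for name in set(samples) | set(calls):
        n = samples.get(name, 0)
        percent = 100.0 * n / total if total else 0.0
        rows.append((percent, n * period, calls.get(name, 0), name))
    # Hottest first; ties are broken by name for a stable order.
    rows.sort(key=lambda row: (-row[0], row[3]))
    return rows


if __name__ == "__main__":
    samples = {"compress": 70, "checksum": 25, "main": 5}
    calls = {"compress": 10, "checksum": 1000, "main": 1}
    for percent, seconds, count, name in flat_profile(samples, calls):
        print(f"{percent:6.2f}  {seconds:8.2f}  {count:8d}  {name}")
```

Here the time figures come only from samples and the call counts only from instrumentation, which mirrors the hybrid approach described above.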

The second part of the output is the textual call graph, which shows, for each function, which functions called it (its parents) and which functions it called (its children). An external tool called gprof2dot can convert the call graph from gprof into graphical form.[7]
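
The parent and child listing can be derived from recorded arcs alone. The Python sketch below shows one way to do that, assuming the arcs are available as a mapping from (caller, callee) pairs to call counts. The helper names and the text layout are hypothetical and much simpler than gprof's real call graph output.

```python
from collections import defaultdict


def call_graph(arcs):
    """Index arcs by callee (parents) and by caller (children)."""
    parents = defaultdict(dict)
    children = defaultdict(dict)
    for (caller, callee), count in arcs.items():
        parents[callee][caller] = count
        children[caller][callee] = count
    return parents, children


def format_entry(name, parents, children):
    """Render one entry: parents above the function, children below it."""
    lines = []
    for parent, count in sorted(parents.get(name, {}).items()):
        lines.append(f"    {count:>6}  {parent}")
    lines.append(f"[{name}]")
    for child, count in sorted(children.get(name, {}).items()):
        lines.append(f"    {count:>6}  {child}")
    return "\n".join(lines)


if __name__ == "__main__":
    arcs = {("main", "helper"): 1, ("helper", "leaf"): 2, ("main", "leaf"): 1}
    parents, children = call_graph(arcs)
    for function in ("main", "helper", "leaf"):
        print(format_entry(function, parents, children))
        print("-" * 24)
```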

Limitations and accuracy

At run time, timing values are obtained by statistical sampling. Sampling is done by probing the target program's program counter at regular intervals using operating system interrupts (programmed via the profil(2) or setitimer(2) system calls). The resulting data is not exact but a statistical approximation. The amount of error is usually more than one sampling period: if a value is n times the sampling period, the expected error in the value is the square root of n sampling periods.[8][9] A typical sampling period is 0.01 second (10 ms) or 0.001 second (1 ms), which corresponds to 100 or 1000 samples per second of CPU running time.
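
A short worked example of the square-root rule, assuming a sampling period of T = 0.01 s and two illustrative measured run times (1 s and 100 s; these values are chosen for the example only):

```latex
% n = t / T is the number of samples; the expected error is sqrt(n) * T.
\begin{align*}
t = 1\,\mathrm{s}:   \quad & n = \frac{1}{0.01} = 100,
  & \sqrt{n}\,T &= 10 \times 0.01\,\mathrm{s} = 0.1\,\mathrm{s}
  & &(10\%\ \text{of } t) \\
t = 100\,\mathrm{s}: \quad & n = \frac{100}{0.01} = 10000,
  & \sqrt{n}\,T &= 100 \times 0.01\,\mathrm{s} = 1\,\mathrm{s}
  & &(1\%\ \text{of } t)
\end{align*}
```

The absolute error grows with run time, but the relative error shrinks, so figures for functions that accumulate many samples are more trustworthy than figures for functions that accumulate few.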

In some versions, such as BSD, profiling of shared libraries can be limited because of restrictions of the profil function, which may be implemented as a library function or as a system call. There is an analogous utility in glibc called 'sprof' for profiling dynamic libraries.[10]

Gprof cannot measure time spent in kernel mode (system calls, waiting for the CPU or waiting for I/O); only user-space code is profiled.[9]

The mcount function may not be thread-safe in some implementations, so profiles of multi-threaded applications can be incorrect (typically only the main thread of the application is profiled).[11]

Instrumentation overhead can be high (estimated at 30%[12] to 260%[13]) for higher-order or object-oriented programs. Mutual recursion and non-trivial cycles are not resolvable by the gprof approach (context-insensitive call graph), because it only records arc traversal, not full call chains.[13][14][15]
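
One consequence of recording only arcs is that a callee's time has to be divided among its callers by call count, on the assumption that every call costs about the same. The Python sketch below illustrates that attribution rule. The function name `propagate_child_time` and the example figures are hypothetical.

```python
def propagate_child_time(child_time, calls_from):
    """Split a callee's total time among callers in proportion to call counts.

    child_time: total seconds sampled in the callee
    calls_from: caller name -> number of calls made to the callee
    """
    total_calls = sum(calls_from.values())
    if total_calls == 0:
        return {caller: 0.0 for caller in calls_from}
    return {caller: child_time * count / total_calls
            for caller, count in calls_from.items()}


if __name__ == "__main__":
    # Assumed scenario: work() ran for 10 s in total. The one call from
    # quick_path really took 0.1 s and the one from slow_path took 9.9 s,
    # but that split is not recoverable from arc counts alone.
    print(propagate_child_time(10.0, {"quick_path": 1, "slow_path": 1}))
```

In this scenario both callers are charged 5 seconds each, because each made one call. A profiler that records full call chains could attribute the time to the actual call path.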

Gprof with call-graph collection can be used only with compatible compilers, such as GCC, clang/LLVM and some others.

Reception

In 2004, a GPROF paper appeared on the list of the 50 most influential PLDI papers of all time, as one of four papers from 1982.[16]

According to Thiel,[6] "GPROF ... revolutionized the performance analysis field and quickly became the tool of choice for developers around the world ... the tool still maintains a large following ... the tool is still actively maintained and remains relevant in the modern world."

References

  1. ^ a b Susan L. Graham, Peter B. Kessler, and Marshall K. McKusick. gprof: a Call Graph Execution Profiler // Proceedings of the SIGPLAN '82 Symposium on Compiler Construction, SIGPLAN Notices, Vol. 17, No 6, pp. 120–126; doi: 10.1145/800230.806987
  2. ^ gprof --- Call Graph // Ping Huang, Reinventing Computing, MIT AI Lab
  3. ^ HISTORY The gprof profiler appeared in 4.2BSD
  4. ^ GNU gprof manual: "GNU gprof was written by Jay Fenlason."
  5. ^ GNU's Bulletin, vol. 1 no. 5 (1988): "Gprof replacement Foundation staffer Jay Fenlason has recently completed a profiler to go with GNU C, compatible with `GPROF' from Berkeley Unix. "
  6. ^ a b Justin Thiel, An Overview of Software Performance Analysis Tools and Techniques: From GProf to DTrace (2006) "2.1.1 Overview of GProf"
  7. ^ Gprof call graph visualization // Cookbook for scientific computing. Python cookbook. École polytechnique fédérale de Lausanne (EPFL)
  8. ^ Statistical Inaccuracy of gprof Output Archived 2012-05-29 at the Wayback Machine
  9. ^ a b gprof Profiling Tools on BG/P Systems Archived 2013-12-21 at the Wayback Machine, "Issues in Interpreting Profile Data", Argonne Leadership Computing Facility
  10. ^ "The qprof project". HP Labs, Research (archived). Archived from the original on 4 August 2014. Retrieved 28 September 2023.
  11. ^ HOWTO: using gprof with multithreaded applications // Sam Hocevar, 2004-12-13
  12. ^ GNU gprof Profiler Archived 2015-12-08 at the Wayback Machine, Yu Kai Hong, Department of Mathematics at National Taiwan University; July 19, 2008
  13. ^ a b Low-Overhead Call Path Profiling of Unmodified, Optimized Code, ACM 1-59593-167/8/06/2005 .
  14. ^ Spivey, J. M. (2004). "Fast, accurate call graph profiling". Software: Practice and Experience. 34 (3): 249–264. CiteSeerX 10.1.1.62.1032. doi:10.1002/spe.562. S2CID 17866706. Archived 2012-02-07 at the Wayback Machine.
  15. ^ Yossi Kreinin, How profilers lie: the cases of gprof and KCachegrind // February 2nd, 2013
  16. ^ 20 Years of PLDI (1979–1999): A Selection, Kathryn S. McKinley, Editor

Further reading

  • Susan L. Graham, Peter B. Kessler, and Marshall K. McKusick. gprof: a Call Graph Execution Profiler // Proceedings of the SIGPLAN '82 Symposium on Compiler Construction, SIGPLAN Notices, Vol. 17, No 6, pp. 120–126; doi: 10.1145/800230.806987
  • Graham, S. L., Kessler, P. B. and McKusick, M. K. (1983), An execution profiler for modular programs. Softw: Pract. Exper., 13: 671–685. doi: 10.1002/spe.4380130803