What are good tools for determining where a program is spending the most wallclock time?

CURATOR: Jack Smith

ANSWER: [Katia] Benchmarking/profiling tools depend on the language the application is written in:
For a program written in C and compiled with gcc (using the -pg flag), one can use gprof: http://sourceware.org/binutils/docs/gprof/.
There are a number of other popular profilers.

For R scripts, basic profiling can be done with the Rprof() function, which comes with base R: http://stat.ethz.ch/R-manual/R-devel/library/utils/html/Rprof.html. There are also other very helpful tools, such as the proftools package (https://cran.r-project.org/web/packages/proftools/index.html) and the profvis package (https://rstudio.github.io/profvis/).

Python, like R, comes with built-in profiling tools: https://docs.python.org/2/library/profile.html. As with R, there are additional packages that can help identify the bottleneck graphically, for example: http://pycallgraph.slowchop.com/en/master/
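
As a rough illustration of the built-in Python profiler, here is a minimal sketch using the standard cProfile and pstats modules. The functions slow_sum, main and profile_report are hypothetical names made up for this example; only cProfile.Profile, pstats.Stats, sort_stats and print_stats come from the standard library.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Hypothetical hot spot: sum of squares in a pure-Python loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def main():
    """Hypothetical workload that calls the hot spot several times."""
    return [slow_sum(200_000) for _ in range(5)]


def profile_report(func, top=5):
    """Run func under cProfile and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()

    # Send the report to a string buffer instead of stdout.
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


if __name__ == "__main__":
    print(profile_report(main))
```

In the printed report, the functions with the largest cumulative time are usually the first places to look. An existing script can also be profiled without modification by running it as python -m cProfile -s cumulative myscript.py, where myscript.py stands for your own file.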

MATLAB has its own built-in profiler.

COMMENTARY: Could you please explain what you are trying to do? And why you want to find out about the wallclock time?

If one is using a Linux system, then strace might be a good option to get some information about how much time a program is spending interacting with the kernel. The following description is provided on the developer’s website:

strace is a diagnostic, debugging and instructional userspace utility for Linux. It is used to monitor and tamper with interactions between processes and the Linux kernel, which include system calls, signal deliveries, and changes of process state.

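If one adds the -c flag, strace prints a summary of how many system calls of each type were made and how much time they took. Note that by default this summary reports system (kernel CPU) time; adding the -w flag makes it report wall-clock time per system call instead. Below is an example taken from strace.io: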

$ strace -c ls > /dev/null
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 89.76    0.008016           4      1912           getdents
  8.71    0.000778           0     11778           lstat
  0.81    0.000072           0      8894           write
  0.60    0.000054           0       943           open
  0.11    0.000010           0       942           close
  0.00    0.000000           0         1           read
  0.00    0.000000           0       944           fstat
  0.00    0.000000           0         8           mmap
  0.00    0.000000           0         4           mprotect
  0.00    0.000000           0         1           munmap
  0.00    0.000000           0         7           brk
  0.00    0.000000           0         3         3 access
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         1           sysinfo
  0.00    0.000000           0         1           arch_prctl
------ ----------- ----------- --------- --------- ----------------
100.00    0.008930                 25440         3 total

The table is sorted by decreasing time spent, as given by the % time and seconds columns. If write and open system calls appear near the top of the summary, this may indicate that reading and writing files (I/O) is a potential bottleneck for the program.
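
If you want to check this automatically rather than by eye, the summary can be parsed with a few lines of Python. The following is only a sketch under some assumptions: the input has the same column layout as the table above, the sample text is a shortened copy of that table, and the set FILE_IO_CALLS is my own guess at which system calls to count as file I/O. The function names are hypothetical and not part of strace.

```python
# Shortened copy of the strace -c summary shown above.
SAMPLE = """
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 89.76    0.008016           4      1912           getdents
  8.71    0.000778           0     11778           lstat
  0.81    0.000072           0      8894           write
  0.60    0.000054           0       943           open
  0.00    0.000000           0         3         3 access
------ ----------- ----------- --------- --------- ----------------
100.00    0.008930                 25440         3 total
"""

# Assumption: which syscalls count as "file I/O" is a judgment call.
FILE_IO_CALLS = {"read", "write", "open", "openat", "close"}


def parse_strace_summary(text):
    """Return one dict per syscall row between the two dashed separator lines."""
    rows = []
    in_table = False
    for line in text.splitlines():
        if line.lstrip().startswith("------"):
            if in_table:
                break  # second separator: the "total" line follows, stop here
            in_table = True
            continue
        if not in_table:
            continue
        fields = line.split()
        if len(fields) < 5:
            continue
        # The errors column may be empty, so take the name from the end.
        rows.append({
            "syscall": fields[-1],
            "percent": float(fields[0]),
            "seconds": float(fields[1]),
            "calls": int(fields[3]),
        })
    return rows


def file_io_share(rows):
    """Sum the % time of the rows whose syscall is in FILE_IO_CALLS."""
    return sum(r["percent"] for r in rows if r["syscall"] in FILE_IO_CALLS)


if __name__ == "__main__":
    rows = parse_strace_summary(SAMPLE)
    for row in rows:
        print(row["syscall"], row["percent"], row["calls"])
    print("file I/O share (% of summarized time):", file_io_share(rows))
```

A large file I/O share is only a hint, not proof of a bottleneck, since the percentages are relative to the time spent in system calls and not to the total run time of the program.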