Commit graph

1344 commits

Author SHA1 Message Date
Masami Hiramatsu
b2a3c12b74 perf probe: Support tracing an entry of array
Add array-entry tracing support to perf probe. This makes it possible to trace
an array entry that is indexed by a constant value, e.g. array[0].

For example:

  $ perf probe -a 'bio_split bi->bi_io_vec[0]'

Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20100519195742.2885.5344.stgit@localhost6.localdomain6>
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-07-05 18:50:27 -03:00
Masami Hiramatsu
73317b9540 perf probe: Support "string" type
Support casting an event argument to the string type. If perf-probe finds an
argument cast as string, it ensures the target variable is "(unsigned/signed)
char *" (or "[]"). perf-probe also adds a dereference if the target is a pointer.

So both 'char buf[10];' and 'char *buf;' can be accessed as 'buf:string'.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20100519195734.2885.1666.stgit@localhost6.localdomain6>
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-07-05 18:47:28 -03:00
Thavidu Ranatunga
869599ceda perf: Version String fix, for fallback if not from git
This gets rid of the default version fallback for Perf and
changes it so that it returns the version of the kernel from
its Makefile (if sources were not from git, i.e. if it was
downloaded from a tarball)

Signed-off-by: Thavidu Ranatunga <tharan@au1.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1278316815-6099-2-git-send-email-tharan@au1.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-07-05 10:42:58 +02:00
Thavidu Ranatunga
2190de2f59 perf: Version String fix, using kernel version
Changes the Perf --version string such that it shows the kernel
version as suggested by Ingo as follows:

That way the perf that comes with v2.6.34 will be:

  perf version v2.6.34

while interim versions will have the version of the interim
kernel - for example:

 perf version v2.6.35-rc4-70-g39ef13a

This functionality was already in the perf version generator
file except that it was looking for a .git in the perf directory
instead of the kernel directory.

Signed-off-by: Thavidu Ranatunga <tharan@au1.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1278316815-6099-1-git-send-email-tharan@au1.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-07-05 10:42:58 +02:00
Ingo Molnar
08f8ba0799 Merge commit 'v2.6.35-rc4' into perf/core
Merge reason: Pick up the latest perf fixes

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-07-05 08:30:58 +02:00
Conny Seidel
167a58f10d perf tools: Fix fallback to cplus_demangle() when bfd_demangle() is not available
make version 3.80 doesn't support "else ifdef" on the same line, also it
doesn't support unindented nested constructs.

Build fails with:
Makefile:608: Extraneous text after `else' directive
Makefile:611: *** only one `else' per conditional.  Stop.

This patch fixes the build for make 3.80.

Cc: Borislav Petkov <borislav.petkov@amd.com>
LKML-Reference: <1277990366-1462-1-git-send-email-conny.seidel@amd.com>
Signed-off-by: Conny Seidel <conny.seidel@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-07-02 10:01:58 -03:00
Gui Jianfeng
c214909b36 perf tools: Fix find tids routine by excluding "." and ".."
Introduce a filter function to skip the "." and ".." directory entries when
collecting tids, otherwise tid 0 will be included in the all_tid result array.
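
The effect of such a filter can be sketched in Python. This is only an illustration, not the perf code: `c_atoi`, `is_tid_entry` and `find_tids` are hypothetical names, and the sketch assumes that entry names are converted with an atoi()-like routine, which would explain why "." and ".." show up as tid 0.

```python
def c_atoi(text):
    """Roughly mimic C atoi() for unsigned input: parse leading digits, else 0."""
    digits = ""
    for ch in text:
        if ch not in "0123456789":
            break
        digits += ch
    return int(digits) if digits else 0


def is_tid_entry(name):
    """Hypothetical filter: reject the "." and ".." directory entries."""
    return name not in (".", "..")


def find_tids(entries, use_filter=True):
    """Convert directory entry names (as a directory scan might return them) to tids."""
    if use_filter:
        entries = [name for name in entries if is_tid_entry(name)]
    return [c_atoi(name) for name in entries]


entries = [".", "..", "4242", "4243"]
print(find_tids(entries, use_filter=False))  # unfiltered: "." and ".." both become 0
print(find_tids(entries))                    # filtered: only the real tids remain
```

Without the filter, the two dot entries each contribute a bogus tid 0 to the result.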

Cc: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <4C185F68.1020505@cn.fujitsu.com>
Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-07-01 14:02:38 -03:00
Srikar Dronamraju
0879b100f3 perf: Fix hist_entry__tui_annotate() build failure
When compiling perf on latest tip/master I see the following
error:

  cc1: warnings being treated as errors
  util/newt.c: In function 'hist_entry__tui_annotate':
  util/newt.c:764: warning: 'ret' is used uninitialized in this function
  make: *** [util/newt.o] Error 1

I think the problem was introduced by commit
13f499f076

Below is a patch that fixes the problem.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <20100629173226.GC23231@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-06-29 22:59:05 +02:00
Thomas Gleixner
f384c954c9 Merge branch 'linus' into perf/core
Reason: Further changes conflict with upstream fixes

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-06-28 22:33:24 +02:00
Ingo Molnar
9a15a07fe2 Merge branch 'perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux-2.6 into perf/core
2010-06-25 16:16:44 +02:00
Gui Jianfeng
830f4c8031 perf kvm: Get rid of unused guest_kallsyms
guest_kallsyms is redundant here, remove it.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com>
LKML-Reference: <4C241140.9090008@cn.fujitsu.com>
Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-25 07:28:21 -03:00
Frederic Weisbecker
ffabd99e05 perf: Report lost events in perf trace debug mode
Account and report lost events in perf trace debugging mode,
useful to check the reliability of the traces.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
2010-06-24 23:36:23 +02:00
Frederic Weisbecker
6fcf7ddbb7 perf: Don't print traces when debugging ordering
Errors due to ordering bugs are easily lost in the middle
of traces.

When we are in this mode, don't print the traces so that
we don't miss the debugging messages.
But display a comforting message if we didn't encounter any
ordering problem.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
2010-06-24 23:36:05 +02:00
Frederic Weisbecker
aa59a48596 perf: Don't use 4 bytes as a default instruction breakpoint length
4 bytes is fine as a default access for data breakpoints. But
instruction breakpoints should take the native pointer length,
otherwise we get a -EINVAL in x86-64.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Prasad <prasad@linux.vnet.ibm.com>
Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Jason Wessel <jason.wessel@windriver.com>
2010-06-24 23:35:49 +02:00
Arnaldo Carvalho de Melo
9f61d85fd0 perf ui: Move objdump_line specific stuff out of ui_browser
By adding a ui_browser->refresh_entries() pure virtual member.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-21 19:38:33 -03:00
Arnaldo Carvalho de Melo
13f499f076 perf ui: Separate showing the entries from running the browser
Another patch eroding the changes I had to move to a tree widget that
doesn't require adding all entries in an existing list/tree structure
to a generic tree widget, but instead allows traversing just the entries
that should appear on the screen at a given moment.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-21 18:04:02 -03:00
Arnaldo Carvalho de Melo
46b0a07a45 perf ui: Introduce ui_browser->seek to support multiple list structures
So that we can use the ui_browser on things like an rb_tree, etc.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-21 13:36:20 -03:00
Arnaldo Carvalho de Melo
8c694d2559 perf ui: Introduce routine ui_browser__is_current_entry
Will be used in more places in the new tree widget.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-21 13:35:56 -03:00
Tom Zanussi
bfde744863 perf scripts perl: Makefile fix
Fix a typo introduced by recent Makefile changes, in f9af3a4.  Without it, Perl
scripting support won't get compiled in.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <1276836006.7762.15.camel@tropicana>
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-18 08:39:01 -03:00
Ian Munsie
5ffc88819c perf record: prevent kill(0, SIGTERM);
At exit, perf record will kill the process it was profiling by sending a
SIGTERM to child_pid (if it had been initialised), but in certain situations
child_pid may be 0 and perf would mistakenly kill more processes than intended.

child_pid is set to the return of fork() to either 0 or the pid of the child.
Ordinarily this would not present an issue as the child calls execvp to spawn
the process to be profiled and would therefore never run its sig_atexit and
never attempt to kill pid 0.

However, if a nonexistent binary had been passed in to perf record the call to
execvp would fail and child_pid would be left set to 0. The child would then
exit and its atexit handler, finding that child_pid was initialised to 0,
would call kill(0, SIGTERM), resulting in every process within its process
group being killed.

In the case that perf was being run directly from the shell this typically
would not be an issue as the shell isolates the process.  However, if perf was
being called from another program it could kill unexpected processes, which may
even include X.

This patch changes the logic of the test for whether child_pid was initialised
to only consider positive pids as valid, thereby never attempting to kill pid
0.
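
The guard described above can be sketched as follows. This is a Python illustration rather than the patch itself: the `kill` parameter is a hypothetical hook added only so the example can run without signalling anything, and the -1 "never initialised" sentinel is an assumption.

```python
import os
import signal


def sig_atexit(child_pid, kill=os.kill):
    """Send SIGTERM to the child only if child_pid is a real (positive) pid.

    kill(0, sig) signals every process in the caller's process group, so 0
    must never be treated as a valid child pid. Returns True if a signal
    was sent.
    """
    if child_pid > 0:
        kill(child_pid, signal.SIGTERM)
        return True
    return False


# Demonstration with a recording stand-in for os.kill, so nothing is signalled.
sent = []
fake_kill = lambda pid, sig: sent.append((pid, sig))
sig_atexit(0, kill=fake_kill)      # failed exec in the child: no signal
sig_atexit(-1, kill=fake_kill)     # never initialised (assumed sentinel): no signal
sig_atexit(1234, kill=fake_kill)   # parent with a real child: SIGTERM sent
print(sent)
```

Only the call with a positive pid reaches `kill`, which is the behaviour the commit message describes.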

Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <1276072680-17378-1-git-send-email-imunsie@au1.ibm.com>
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-17 14:24:43 -03:00
Andy Isaacson
0f2c3de2ba perf session: fix error message on failure to open perf.data
If we cannot open our data file, print strerror(errno) for a more
comprehensible error message; and only suggest 'perf record' on ENOENT.

In particular, this fixes the nonsensical advice when:

    % sudo perf record sleep 1
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.009 MB perf.data (~381 samples) ]
    % perf trace
    failed to open file: perf.data  (try 'perf record' first)
    %
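
A minimal sketch of the message logic described above, in Python. The function name and the exact wording of the message are assumptions made for illustration. Only the idea comes from the commit message: always include strerror(errno), and add the 'perf record' hint on ENOENT alone.

```python
import errno
import os


def open_error_message(filename, err):
    """Build an error message for a failed open of the data file."""
    message = f"failed to open {filename}: {os.strerror(err)}"
    # Only suggest recording when the file really does not exist.
    if err == errno.ENOENT:
        message += "  (try 'perf record' first)"
    return message


print(open_error_message("perf.data", errno.ENOENT))
print(open_error_message("perf.data", errno.EACCES))
```

In the scenario quoted above, the file exists but is owned by root, so the open fails with a permission error and the hint is no longer printed.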

Cc: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <20100612033615.GA24731@hexapodia.org>
Signed-off-by: Andy Isaacson <adi@hexapodia.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-17 13:55:54 -03:00
Andy Isaacson
84c104ad42 perf debug: fix hex dump partial final line
The loop counter math in trace_event was much more complicated than
necessary, resulting in incorrectly decoding the human-readable
portion of the partial last line of hexdump in "perf trace -D" output:

.  0020:  00 00 00 00 00 00 00 00 2f 73 62 69 6e 2f 69 6e  ......../sbin/i
.  0030:  69 74 00 00 00 00 00 00                          /sbin/i

With this fixed (and simpler!) code, we get the correct output:

.  0020:  00 00 00 00 00 00 00 00 2f 73 62 69 6e 2f 69 6e  ......../sbin/in
.  0030:  69 74 00 00 00 00 00 00                          it......
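
For illustration, a simple per-line loop that handles a partial final line might look like the Python sketch below. It is not the trace_event code: `hexdump_lines` and its parameters are hypothetical, and the layout only approximates the output shown above.

```python
def hexdump_lines(data, width=16, base=0):
    """Format bytes as hex dump lines, including a short final line."""
    lines = []
    for start in range(0, len(data), width):
        chunk = data[start:start + width]
        hex_part = " ".join(f"{byte:02x}" for byte in chunk)
        # Pad the hex column so the text column stays aligned on a short line.
        hex_part = hex_part.ljust(width * 3 - 1)
        # The text column is built from the same chunk as the hex column.
        text = "".join(chr(byte) if 32 <= byte < 127 else "." for byte in chunk)
        lines.append(f".  {base + start:04x}:  {hex_part}  {text}")
    return lines


data = bytes(8) + b"/sbin/init" + bytes(6)
for line in hexdump_lines(data, base=0x20):
    print(line)
```

Because the hex and text columns are derived from the same slice, the last line decodes as "it......" rather than repeating text from the previous line.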

Cc: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <20100612024404.GA24469@hexapodia.org>
Signed-off-by: Andy Isaacson <adi@hexapodia.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-17 13:20:50 -03:00
Chase Douglas
9ed7e1b85c perf probe: Add kernel source path option
The probe plugin requires access to the source code for some operations.  The
source code must be in the exact same location as specified by the DWARF tags,
but sometimes the location is an absolute path that cannot be replicated by a
normal user. This change adds the -s|--source option to allow the user to
specify the root of the kernel source tree.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
LKML-Reference: <1276543590-10486-1-git-send-email-chase.douglas@canonical.com>
Signed-off-by: Chase Douglas <chase.douglas@canonical.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-17 10:27:57 -03:00
Kirill Smelkov
cfc21cc641 perf tools: .gitignore += config.make config.make.autogen
These are local-configuration files and should be ignored.

LKML-Reference: <1276516847-25817-1-git-send-email-kirr@landau.phys.spbu.ru>
Signed-off-by: Kirill Smelkov <kirr@landau.phys.spbu.ru>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-17 10:24:31 -03:00
Stephane Eranian
a1ac1d3c08 perf record: Add option to avoid updating buildid cache
There are situations where there is enough information in the perf.data
to process the samples. Updating the buildid cache may add unnecessary
overhead in terms of disk space and time (copying large elf images).

A persistent option to do this already exists via the perfconfig file,
simply do:

[buildid]
dir = /dev/null

This patch provides a way to suppress buildid cache updates on a per-run
basis.  It adds a new option, -N, to perf record. Buildids are still
generated in the perf.data file.

Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <4c19ef89.93ecd80a.40dc.fffff8e9@mx.google.com>
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-17 10:20:44 -03:00
Eric B Munson
70c3856b2f perf symbols: Function descriptor symbol lookup
Currently symbol resolution does not work for 64-bit programs on architectures
that use function descriptors such as ppc64.

The problem is that a symbol doesn't point to a text address, it points to a
data area that contains (amongst other things) a pointer to the text address.

We look for a section called ".opd" which is the function descriptor area. To
create the full symbol table, when we see a symbol in the function descriptor
section we load the first pointer and use that as the text address.
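
The lookup described above can be sketched in Python. This is an illustration under stated assumptions, not the perf implementation: the function name is hypothetical, the ".opd" section contents are assumed to be already loaded as bytes, and pointers are assumed to be 64-bit big-endian.

```python
import struct


def resolve_text_address(sym_addr, opd_addr, opd_data):
    """Return the text address for a symbol, following a function descriptor.

    If sym_addr falls inside the ".opd" section, the first pointer of the
    descriptor it points at is used as the text address. Otherwise the
    symbol value is returned unchanged.
    """
    if opd_addr <= sym_addr < opd_addr + len(opd_data):
        offset = sym_addr - opd_addr
        # Assumption: 64-bit big-endian pointer at the start of the descriptor.
        (text_addr,) = struct.unpack_from(">Q", opd_data, offset)
        return text_addr
    return sym_addr


# One hypothetical descriptor: text address, then two further 64-bit fields.
opd = struct.pack(">QQQ", 0x10001000, 0x10020000, 0)
print(hex(resolve_text_address(0x10030000, 0x10030000, opd)))
print(hex(resolve_text_address(0x10001000, 0x10030000, opd)))
```

A symbol outside the descriptor section is left alone, so ordinary text symbols are unaffected by the lookup.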

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <1276523793-15422-1-git-send-email-ebmunson@us.ibm.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Eric B Munson <ebmunson@us.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-17 10:06:27 -03:00
Stephane Eranian
cf103a14dd perf record: Avoid synthesizing mmap() for all processes in per-thread mode
A bug was introduced by commit c45c6ea2e5.

Perf record was scanning /proc/PID to create synthetic PERF_RECORD_MMAP
entries even though it was running in per-thread mode. There was a bogus
check to select what mmaps to synthesize. We only need all processes in
system-wide mode.

Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <4c192107.4f1ee30a.4316.fffff98e@mx.google.com>
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-17 08:57:31 -03:00
Arnaldo Carvalho de Melo
720a3aeb73 perf session: Remove threads from tree on PERF_RECORD_EXIT
Move them to a session->dead_threads list just like we do with maps that
are replaced, because we may have hist_entries pointing to them.

This fixes a bug when inserting maps for a new thread that reused the
TID, mixing maps for two different threads, causing an endless loop.

The code for inserting maps should be made more robust but for .35 this
is the minimalistic patch.
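
The idea can be illustrated with a small Python model. The class and field names here are hypothetical and the real session structures differ. The sketch only shows why moving an exited thread aside keeps its maps from being mixed with those of a later thread that reuses the TID.

```python
class Session:
    """Toy model of a session that tracks live and dead threads."""

    def __init__(self):
        self.threads = {}        # live threads, keyed by tid
        self.dead_threads = []   # exited threads, kept because others may reference them

    def find_or_create(self, tid):
        if tid not in self.threads:
            self.threads[tid] = {"tid": tid, "maps": []}
        return self.threads[tid]

    def on_exit(self, tid):
        # Move the thread out of the lookup structure instead of freeing it.
        thread = self.threads.pop(tid, None)
        if thread is not None:
            self.dead_threads.append(thread)


session = Session()
old = session.find_or_create(100)
old["maps"].append("libfoo.so")
session.on_exit(100)
new = session.find_or_create(100)   # the TID is reused by a new thread
print(new["maps"], session.dead_threads[0]["maps"])
```

The old thread object stays reachable for anything still pointing at it, while the new thread starts with an empty map list.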

Reported-by: Ingo Molnar <mingo@elte.hu>
Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-17 08:37:44 -03:00
Arnaldo Carvalho de Melo
1d90f2e707 perf record: Don't call newt functions when not initialized
When processing events we want to give visual feedback to the user when
using the newt browser, so there are ui_progress calls in
__perf_session__process_events, but those should check if newt is being
used.

Reported-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Tested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <20100609123530.GB9471@ghostprotocols.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-10 08:29:19 -03:00
Arnaldo Carvalho de Melo
f9af3a4c1f perf tools: Reorganize the Makefile feature tests
Moving the tests to a separate file, feature-tests.mak and using a try-cc
function similar to the try-run in Kbuild.

This also makes the output more quiet as we can stop using the INTERMEDIATE
target to remove the .perf.dev.null file needed for some gcc versions where
/dev/null can't be used as the output file name.

As the tests get shorter by uninlining the source code used to test for
features, we can more properly use indentation.

The feature tests themselves can be made clearer and reused, like when trying to
see what is needed to have bfd_demangle.

We also get a bit closer to reusing scripts/Kbuild.include, reducing the
distance from the kernel build system.

Tests performed:

[root@emilia perf]# make -j9 O=/tmp/perf
PERF_VERSION = 0.0.2.PERF
    GEN /tmp/perf/common-cmds.h
    * new build flags or prefix
    GEN perf-archive
    CC /tmp/perf/builtin-annotate.o
    CC /tmp/perf/bench/sched-messaging.o
    CC /tmp/perf/builtin-diff.o
<SNIP>
    CC /tmp/perf/scripts/python/Perf-Trace-Util/Context.o
    CC /tmp/perf/perf.o
    CC /tmp/perf/builtin-help.o
    AR /tmp/perf/libperf.a
    LINK /tmp/perf/perf
[root@emilia perf]#

If we uninstall, for instance newt-devel we get:

[root@emilia perf]# rpm -e newt-devel
[root@emilia perf]# make -j9 O=/tmp/perf
Makefile:564: newt not found, disables TUI support. Please install newt-devel or libnewt-dev
    * new build flags or prefix
    GEN perf-archive
    CC /tmp/perf/perf.o
    CC /tmp/perf/builtin-annotate.o
<SNIP>
    AR /tmp/perf/libperf.a
    LINK /tmp/perf/perf
[root@emilia perf]#

And then binutils-devel:

[root@emilia perf]# make -j9 O=/tmp/perf
Makefile:564: newt not found, disables TUI support. Please install newt-devel or libnewt-dev
Makefile:632: No bfd.h/libbfd found, install binutils-dev[el]/zlib-static to gain symbol demangling
    * new build flags or prefix
    GEN perf-archive
    CC /tmp/perf/perf.o
<SNIP>
    AR /tmp/perf/libperf.a
    LINK /tmp/perf/perf
[root@emilia perf]#

And then strictly required devel packages:

[root@emilia perf]# rpm -e elfutils-libelf-devel elfutils-devel
[root@emilia perf]# make -j9 O=/tmp/perf
Makefile:509: No libdw.h found or old libdw.h found or elfutils is older than 0.138, disables dwarf support. Please install new elfutils-devel/libdw-dev
Makefile:542: *** No libelf.h/libelf found, please install libelf-dev/elfutils-libelf-devel.  Stop.
[root@emilia perf]#

After installing everything back on:

[root@emilia perf]# yum install elfutils-devel binutils-devel newt-devel
<SNIP>
Installed:
  binutils-devel.x86_64 0:2.20.51.0.2-5.11.el6
  elfutils-devel.x86_64 0:0.147-1.el6
  elfutils-libelf-devel.x86_64 0:0.147-1.el6
  newt-devel.x86_64 0:0.52.11-1.el6

Complete!
[root@emilia perf]# make -j9
PERF_VERSION = 0.0.2.PERF
    GEN common-cmds.h
    * new build flags or prefix
    GEN perf-archive
    CC builtin-annotate.o
<SNIP>
    AR libperf.a
    LINK perf
[root@emilia perf]# make -j9
[root@emilia perf]#

Thanks to Sam for pointing me to try-run.

Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-09 16:57:39 -03:00
Eric B Munson
3af9e85928 perf: Add non-exec mmap() tracking
Add the capability to track data mmap()s. This can be used together
with PERF_SAMPLE_ADDR for data profiling.

Signed-off-by: Anton Blanchard <anton@samba.org>
[Updated code for stable perf ABI]
Signed-off-by: Eric B Munson <ebmunson@us.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1274193049-25997-1-git-send-email-ebmunson@us.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-06-09 11:12:34 +02:00
Arun Sharma
f60f359383 perf report: Implement --sort cpu
In a shared multi-core environment, users want to analyze why their
program was slow. In particular, if the code ran slower only on certain
CPUs due to interference from other programs or kernel threads, the user
should be able to notice that.

Sample usage:

perf record -f -a -- sleep 3
perf report --sort cpu,comm

Workload:

program is running on 16 CPUs
Experiencing interference from an antagonist only on 4 CPUs.

  Samples: 106218177676 cycles

  Overhead  CPU          Command
  ........  ...  ...............

     6.25%  2            program
     6.24%  6            program
     6.24%  11           program
     6.24%  5            program
     6.24%  9            program
     6.24%  10           program
     6.23%  15           program
     6.23%  7            program
     6.23%  3            program
     6.23%  14           program
     6.22%  1            program
     6.20%  13           program
     3.17%  12           program
     3.15%  8            program
     3.14%  0            program
     3.13%  4            program
     3.11%  4         antagonist
     3.11%  0         antagonist
     3.10%  8         antagonist
     3.07%  12        antagonist
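
How a (cpu, comm) breakdown like the one above can be computed is sketched below in Python. This is not the perf report code: the sample tuples, the function name and the use of a per-sample period as the weight are assumptions for illustration.

```python
from collections import Counter


def overhead_by_cpu_comm(samples):
    """Aggregate (cpu, comm, period) samples into overhead percentages.

    Returns rows of (percent, cpu, comm), sorted by descending overhead.
    """
    totals = Counter()
    for cpu, comm, period in samples:
        totals[(cpu, comm)] += period
    grand_total = sum(totals.values())
    if grand_total == 0:
        return []
    rows = [(100.0 * period / grand_total, cpu, comm)
            for (cpu, comm), period in totals.items()]
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows


samples = [
    (0, "program", 50), (0, "antagonist", 50),  # CPU 0 is shared
    (1, "program", 100),                        # CPU 1 is not
]
for percent, cpu, comm in overhead_by_cpu_comm(samples):
    print(f"{percent:8.2f}%  {cpu:<3}  {comm:>15}")
```

On the shared CPU the program gets roughly half the overhead it gets elsewhere, which is the pattern visible for CPUs 0, 4, 8 and 12 in the listing above.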

Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <20100505181612.GA5091@sharma-home.net>
Signed-off-by: Arun Sharma <aruns@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-05 09:35:53 -03:00
Arnaldo Carvalho de Melo
41a37e2017 perf tools: Make event__preprocess_sample parse the sample
Simplifying the tools that were using both in sequence and allowing
upcoming simplifications, such as Arun's patch to sort by cpus.

Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-05 09:35:19 -03:00
Stephane Eranian
45d8e8025a perf annotate: Ask objdump to demangle symbols
Perf report demangles symbols, but perf annotate does not.

The former uses internal demangling via libbfd or libiberty. The latter
executes objdump which by default does not demangle symbols.

This patch adds the -C option to the objdump cmdline to enable symbol
demangling.

Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <4c07b323.2126e30a.6245.0e1e@mx.google.com>
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-05 09:34:59 -03:00
Stephane Eranian
45de34bbe3 perf buildid: add perfconfig option to specify buildid cache dir
This patch adds the ability to specify an alternate directory to store the
buildid cache (buildids, copy of binaries). By default, it is hardcoded to
$HOME/.debug. This directory contains immutable data. The layout of the
directory is such that no conflicts in filenames are possible. A modification
to a file yields a different buildid and thus a different location in the
subdir hierarchy.

You may want to put the buildid cache elsewhere because of disk space
limitation or simply to share the cache between users. It is also useful for
remote collect vs. local analysis of profiles.

This patch adds a new config option to the perfconfig file.  Under the tag
'buildid', there is a dir option. For instance, if you have:

$ cat /etc/perfconfig
[buildid]
dir = /var/cache/perf-buildid

All buildids and binaries are saved in the directory specified. The perf
record, buildid-list, buildid-cache, report, annotate, and archive commands
will use it to pull information out.

The option can be set in the system-wide perfconfig file or in the
$HOME/.perfconfig file.
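
As a rough illustration of the lookup, the Python sketch below reads the `dir` option from a `[buildid]` section and falls back to $HOME/.debug. It is an approximation only: perf has its own config parser, `buildid_dir` is a hypothetical name, and the standard-library `configparser` is assumed to be close enough for this simple fragment.

```python
import configparser
import posixpath


def buildid_dir(config_text, home):
    """Return the buildid cache directory for the given perfconfig text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    # Fall back to the default location when the section or option is missing.
    default = posixpath.join(home, ".debug")
    return parser.get("buildid", "dir", fallback=default)


config = "[buildid]\ndir = /var/cache/perf-buildid\n"
print(buildid_dir(config, "/home/user"))   # configured directory
print(buildid_dir("", "/home/user"))       # default under $HOME
```

How the system-wide and per-user files are combined is not modelled here.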

Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <4c055fb7.df0ce30a.5f0d.ffffae52@mx.google.com>
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-05 09:34:04 -03:00
Arnaldo Carvalho de Melo
8e5564e6c7 perf tools: Make target to generate self contained source tarball
Useful for when people want to try some version of the perf tools and don't
want to download the kernel tarball.

Here is a session using this new target:

  [root@emilia linux-2.6-tip]# make help | grep -i perf
    perf-tar-src-pkg    - Build perf-2.6.35-rc1.tar source tarball
    perf-targz-src-pkg  - Build perf-2.6.35-rc1.tar.gz source tarball
    perf-tarbz2-src-pkg - Build perf-2.6.35-rc1.tar.bz2 source tarball
  [root@emilia linux-2.6-tip]# make perf-tarbz2-src-pkg
    TAR
  [root@emilia linux-2.6-tip]# ls -la perf-2.6.35-rc1.tar.bz2
  -rw-r--r-- 1 root root 295731 May 31 11:18 perf-2.6.35-rc1.tar.bz2
  [root@emilia linux-2.6-tip]# tar xf perf-2.6.35-rc1.tar.bz2
  [root@emilia linux-2.6-tip]# cd perf-2.6.35-rc1
  [root@emilia perf-2.6.35-rc1]# ls
  arch  HEAD  include  lib  tools
  [root@emilia perf-2.6.35-rc1]# cd tools/perf
  [root@emilia perf]# make -j9 2>&1 | tail
      CC arch/x86/util/dwarf-regs.o
      CC util/probe-finder.o
      CC util/newt.o
      CC util/scripting-engines/trace-event-perl.o
      CC scripts/perl/Perf-Trace-Util/Context.o
      CC perf.o
      CC builtin-help.o
      AR libperf.a
      LINK perf
  rm .perf.dev.null
  [root@emilia perf]# ./perf record -a sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.262 MB perf.data (~11457 samples) ]
  [root@emilia perf]# ./perf report | head -12
  # Events: 6K cycles
  #
  # Overhead          Command       Shared Object  Symbol
  # ........  ...............  ..................  ......
  #
       4.73%             perf  [kernel.kallsyms]   [k] format_decode
       4.49%             perf  libc-2.12.so        [.] _IO_file_underflow_internal
       4.38%             init  [kernel.kallsyms]   [k] mwait_idle
       3.29%             perf  [kernel.kallsyms]   [k] vsnprintf
       2.38%             init  [kernel.kallsyms]   [k] sched_clock_local
       2.35%             init  [kernel.kallsyms]   [k] apic_timer_interrupt
       1.86%     sirq-timer/5  [kernel.kallsyms]   [k] find_busiest_group
  [root@emilia perf]#

Acked-by: Michal Marek <mmarek@suse.cz>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <20100528185357.GA28009@ghostprotocols.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-05 09:33:35 -03:00
Stephane Eranian
c45c6ea2e5 perf tools: Add the ability to specify list of cpus to monitor
This patch adds a -C option to stat, record, top to designate a list of CPUs to
monitor. CPUs can be specified as a comma-separated list or ranges, no space
allowed.

Examples:
$ perf record -a -C0-1,4-7 sleep 1
$ perf top -C0-4
$ perf stat -a -C1,2,3,4 sleep 1

With perf record in per-thread mode with inherit mode on, samples are collected
only when the thread runs on the designated CPUs.

The -C option does not turn on system-wide mode automatically.
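
Parsing of such a CPU list can be sketched as follows. This Python version is illustrative only and is not the parser used by perf: the function name is hypothetical, and rejecting reversed ranges is an assumption.

```python
import re

# One list item: a single CPU number or an inclusive range, with no spaces.
_CPU_ITEM = re.compile(r"[0-9]+(-[0-9]+)?")


def parse_cpu_list(spec):
    """Expand a string such as "0-1,4-7" into a list of CPU numbers."""
    cpus = []
    for item in spec.split(","):
        if not _CPU_ITEM.fullmatch(item):
            raise ValueError(f"invalid CPU list item: {item!r}")
        first, _, last = item.partition("-")
        low = int(first)
        high = int(last) if last else low
        if high < low:
            raise ValueError(f"reversed CPU range: {item!r}")
        cpus.extend(range(low, high + 1))
    return cpus


print(parse_cpu_list("0-1,4-7"))
print(parse_cpu_list("1,2,3,4"))
```

An item containing a space, such as " 1" in "0, 1", fails the pattern check and raises ValueError, matching the "no space allowed" rule.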

Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <4bff9496.d345d80a.41fe.7b00@mx.google.com>
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-05 09:33:01 -03:00
Stephane Eranian
761844b9c6 perf report: Make -D print sampled CPU
It is useful to know on which CPU a sample was captured.
The information is captured with perf record -R but it was
not printed out by perf report -D. This patch adds this.

When -R is not used, cpu is set to -1 to indicate that
the CPU is unknown (it is not captured).

Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <4bff964c.e88cd80a.3106.7d31@mx.google.com>
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-05 09:32:24 -03:00
Arnaldo Carvalho de Melo
e7dadc0089 perf symbols: Set the DSO long name when using symbol_conf.vmlinux_name
We need to set the long name to the name specified via, for instance,
'perf annotate --vmlinux /path/to/vmlinux', if not it will remain as
'[kernel.kallsyms]' and that will make annotate fail when passing this
as the vmlinux name in the call to objdump.

The way this is set up grew unwieldy and dso__load_vmlinux is the
function that should allocate space for the long name, with callers not
assuming that filenames should be allocated somehow by then (strdup,
dso__build_id_filename, etc).

For now this is the minimalistic patch, a proper fix for .36 will be
made.

Reported-by: Stephane Eranian <eranian@google.com>
Tested-by: Stephane Eranian <eranian@google.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <20100604003900.GD10469@ghostprotocols.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-04 07:07:52 -03:00
Ingo Molnar
da3fd1a001 Merge branch 'perf/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux-2.6 into perf/urgent 2010-06-02 09:13:12 +02:00
Arnaldo Carvalho de Melo
b5c874f14c perf buildid-list: Fix --with-hits event processing
When we use plain 'perf buildid-list' we use only what is in the buildid
table in the perf.data header. And those have absolute pathnames because
at 'perf record' time we used __perf_session__process_events and that
doesn't set up the path shortening code in map__new() that happens if
symbol_conf.full_paths is false, the default.

On the other hand, when we use 'perf buildid-list --with-hits' we
process all the events using perf_session__process_events, adding
entries to the global DSO list _after_ removing the current directory
from the DSO name, for presentation purposes.

Because of that we end up having two entries in the DSO list when
recording events for binaries using relative pathnames.

Fix it minimally by setting symbol_conf.full_paths to true when marking
the DSOs with hits in 'perf buildid-list --with-hits', as used by 'perf
archive'

The right fix, longer term, is to shorten the path only at presentation time.
Will be done for 2.6.36.
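
The duplicate-entry problem can be pictured with a small Python sketch. The paths and the helper names (shorten, dso_names) are made up for illustration and do not correspond to perf's data structures; the point is only that a table holding absolute names and a second pass that strips the current directory produce two names for one binary.

```python
def shorten(path, cwd):
    """Strip the current directory prefix, as done for presentation."""
    prefix = cwd.rstrip("/") + "/"
    return path[len(prefix):] if path.startswith(prefix) else path


def dso_names(paths, cwd, full_paths):
    """Names added while processing events (hypothetical model)."""
    return {p if full_paths else shorten(p, cwd) for p in paths}


# Hypothetical example: the header table always has absolute names.
header_table = {"/home/user/bin/app"}
seen_in_events = ["/home/user/bin/app"]

# Path shortening on: the same binary ends up under two names.
mixed = header_table | dso_names(seen_in_events, "/home/user", False)

# full_paths forced to true: both passes agree on one name.
fixed = header_table | dso_names(seen_in_events, "/home/user", True)
```

Here mixed holds both "/home/user/bin/app" and "bin/app", while fixed holds a single entry.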

Reported-by: Stephane Eranian <eranian@google.com>
Tested-by: Stephane Eranian <eranian@google.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <20100601183837.GC4093@ghostprotocols.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-01 16:16:11 -03:00
Pierre Tardy
c02514850d perf scripts python: Give field dict to unhandled callback
The trace_unhandled() callback does not allow access to event fields; this
patch resolves the problem.

It can also be used as a more pythonic and flexible way for script writers
to demux event types.

This will for example greatly simplify pytimechart event demux.
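
A minimal sketch of such a demux, assuming the callback receives the event name, a context object and a dict of event fields. The handler table, the handler function and the field names used here are illustrative assumptions, not taken from pytimechart or from this patch.

```python
def on_sched_switch(fields):
    # Hypothetical handler: formats two fields assumed to be in the dict.
    return "%s -> %s" % (fields["prev_comm"], fields["next_comm"])


# Hypothetical dispatch table keyed by event name.
HANDLERS = {
    "sched__sched_switch": on_sched_switch,
}


def trace_unhandled(event_name, context, event_fields_dict):
    """Route any event without a dedicated handler through one table."""
    handler = HANDLERS.get(event_name)
    if handler is None:
        return None  # event type we do not care about
    return handler(event_fields_dict)
```

With this shape a script only needs one entry point, and adding support for a new event type is a matter of adding one entry to the table.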

Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Tom Zanussi <tzanussi@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <1275340329-2397-1-git-send-email-tardyp@gmail.com>
Signed-off-by: Pierre Tardy <tardyp@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-01 06:12:35 -03:00
Konstantin Stepanyuk
75d9ef1707 perf hist: fix objdump output parsing
hist_entry__annotate() runs objdump with -S option so the output may contain
lines of any format. If a line starts with a colon, strtoull() returns 0 and
the calculated offset will be negative. This causes perf annotate to segfault.

Make sure that strtoull() has parsed at least one digit.
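
The check can be sketched in Python as follows. This is an approximation of the idea, not the C code in hist.c: the function name parse_line_offset is hypothetical, and the requirement that the address be followed by a colon is an assumption about the objdump line format.

```python
import string


def parse_line_offset(line, start):
    """Return (address - start) for an objdump line, or None.

    None is returned when no hex digit was consumed, which is the
    case the fix guards against, or when no ':' follows the digits.
    """
    text = line.lstrip()
    ndigits = 0
    while ndigits < len(text) and text[ndigits] in string.hexdigits:
        ndigits += 1
    if ndigits == 0:
        return None  # nothing parsed: do not treat the line as address 0
    if text[ndigits:ndigits + 1] != ":":
        return None  # digits not followed by ':' are not an address
    return int(text[:ndigits], 16) - start
```

A line such as ": stray text" yields None instead of an offset computed from a bogus address of 0.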

Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Konstantin Stepanyuk <konstantin.stepanyuk@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-01 05:44:36 -03:00
Borislav Petkov
2fb750e825 perf-record: Check correct pid when forking
When forking the child to be traced, we should check the correct
return value from fork() and not a local variable which is otherwise
unused.

Signed-off-by: Borislav Petkov <bp@alien8.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <20100531211818.GA30175@liondog.tnic>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2010-06-01 00:57:14 +02:00
Frederic Weisbecker
dd833d713c perf: Do the comm inheritance per thread in event__process_task
event__process_task() doesn't propagate the comm copy on clone,
but only on process fork. So we lose all the tid:comm resolution
for tasks that aren't a main process thread.

Propagate the per thread granularity to event__process_task for
pid resolution.

This fixes various unresolved pids in perf sched, especially when
we trace multithread processes. The problem is quickly reproducible
with the messaging benchmark using the multithread mode "-t" :

	perf sched record perf bench sched messaging -t

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
2010-06-01 00:43:07 +02:00
Frederic Weisbecker
af64865ba6 perf: Use event__process_task from perf sched
perf sched uses event__process_comm(), which means it can resolve
comms from:

- tasks that have exec'ed (kernel comm events)
- tasks that were running when perf record started the actual
  recording (synthesized comm events)

But perf sched can't resolve the pids of tasks that were created
after the recording started.

To solve this, we need to inherit the comms on fork events using
event__process_task().

This fixes various unresolved pids in perf sched, easily visible
with:
	perf sched record perf bench sched messaging
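
The inheritance described above can be modelled with a small Python sketch. The comm table and both function names are hypothetical stand-ins, not perf's thread structures; the sketch only shows why tasks created after recording started stay unresolved unless fork events are processed.

```python
def process_comm(comms, tid, comm):
    """Record the name carried by a comm event."""
    comms[tid] = comm


def process_fork(comms, parent_tid, child_tid):
    """On a fork event, let the child inherit its parent's comm."""
    if parent_tid in comms:
        comms[child_tid] = comms[parent_tid]


comms = {}
process_comm(comms, 100, "perf")   # task known from a comm event
process_fork(comms, 100, 101)      # task created during the recording
```

Without the process_fork() step, tid 101 would have no entry in the table and would show up as an unresolved pid.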

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
2010-06-01 00:10:32 +02:00
Frederic Weisbecker
13eb04fdbe perf: Process comm events by tid
When we synthesize the existing running tasks through procfs,
we walk through every thread of a process, queuing one comm
event per tid.

But then on report time, event__process_comm() only creates and
sets the comm on a per process granularity. This is the right
thing for comm events that came from the kernel, as they are
only created on exec. Sub-threads then inherit their comm
from fork events. But that doesn't work with our synthesized
comm events taken from procfs information as the per thread
granularity is done on comm events directly there.

Hence we need event__process_comm() to work with the tid rather
than the pid. It won't change anything for comm events coming
from the kernel but this will fix the synthesized ones.

Before:

	$ ./perf report -D | grep COMM | grep firefox

	0x2c7b8 [0x18]: PERF_RECORD_COMM: firefox:5297
	0x2c7d0 [0x18]: PERF_RECORD_COMM: firefox:5297
	0x2c7e8 [0x18]: PERF_RECORD_COMM: firefox:5297
	0x2c800 [0x18]: PERF_RECORD_COMM: firefox:5297
	0x2c818 [0x18]: PERF_RECORD_COMM: firefox:5297
	0x2c830 [0x18]: PERF_RECORD_COMM: firefox:5297

After:
	$ ./perf report -D | grep COMM | grep firefox

	0x2c7b8 [0x18]: PERF_RECORD_COMM: firefox:5297
	0x2c7d0 [0x18]: PERF_RECORD_COMM: firefox:5299
	0x2c7e8 [0x18]: PERF_RECORD_COMM: firefox:5300
	0x2c800 [0x18]: PERF_RECORD_COMM: firefox:5308
	0x2c818 [0x18]: PERF_RECORD_COMM: firefox:5309
	0x2c830 [0x18]: PERF_RECORD_COMM: firefox:5312

This fixes various unresolved pids in perf sched.
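
The difference between the two outputs can be reproduced with a toy table in Python. The tuple layout (pid, tid, comm) and the resolve() helper are assumptions made for this sketch; the tids are the first three from the output above.

```python
# (pid, tid, comm) for three threads of the same process.
events = [
    (5297, 5297, "firefox"),
    (5297, 5299, "firefox"),
    (5297, 5300, "firefox"),
]


def resolve(events, key):
    """Build a comm table keyed by "pid" or by "tid"."""
    index = 0 if key == "pid" else 1
    table = {}
    for event in events:
        table[event[index]] = event[2]
    return table


by_pid = resolve(events, "pid")  # all threads collapse into one entry
by_tid = resolve(events, "tid")  # one entry per thread
```

Keyed by pid the table only knows about 5297; keyed by tid it also resolves 5299 and 5300.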

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
2010-05-31 23:59:50 +02:00
Arnaldo Carvalho de Melo
c4fe52a8ee perf tui: Fix last use_browser problem related to .perfconfig
When we moved to using ~/.perfconfig to set the value of use_browser,
it changed from a boolean to an int so that the convention used for
use_pager was followed.

That convention is:

-1: unspecified, that is what use_{browser,pager} is initialized
 0: Don't use the browser (should be TUI), because it was explicitly
    set to 0/off/false on ~/.perfconfig [tui] cmd =, or because
    we're redirecting the stdout to a file or piping it to some
    other command (!isatty()).
 1: Use the TUI

Some code was not properly audited and continued testing it as a
boolean, this seems to be the last one.
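
To see why a boolean test goes wrong with this convention, consider the following Python sketch. Both function names are hypothetical, and treating "unspecified" as "use the TUI when on a terminal" is an assumption of the sketch rather than something stated above.

```python
def resolve_use_browser(configured, stdout_is_tty):
    """Apply the convention: -1 unspecified, 0 off, 1 use the TUI."""
    if not stdout_is_tty:
        return 0  # redirected to a file or piped: never use the TUI
    if configured < 0:
        return 1  # assumption: unspecified falls back to the TUI
    return configured


def should_start_tui(use_browser):
    """Compare explicitly; a plain truth test would accept -1 too."""
    return use_browser > 0
```

The value -1 is truthy, so a leftover boolean test on the raw variable would take the TUI path even though nothing has been decided yet; the explicit comparison avoids that.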

Reported-by: Frédéric Weisbecker <fweisbec@gmail.com>
Tested-by: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-27 09:53:40 -03:00
Arnaldo Carvalho de Melo
5ad90e4ea4 perf symbols: Add the build id cache to the vmlinux path
So that if the kernel DSO has a build id because record inserted it in
the perf.data build id table in the header, or a BUILD_ID event was
inserted in the stream, we first look at the build id cache
($HOME/.debug/).

If we find it there, try to use it, allowing offline annotation in
addition to 'perf report'.

Reported-by: Stephane Eranian <eranian@google.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-26 13:26:02 -03:00
Arnaldo Carvalho de Melo
62e3436b5f perf tui: Reset use_browser if stdout is not a tty
The newt initialization routines weren't being called because the output
was a file (perf annotate > /tmp/bla) but use_browser was still 1,
because ~/.perfconfig had it as 'on', so, later on newt routines
segfaulted.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-26 13:22:26 -03:00
Arnaldo Carvalho de Melo
d67f088e08 perf report: Support multiple events on the TUI
The hists__tty_browse_tree function was created with the loop to print
all events, and its equivalent, hists__tui_browse_tree, was created in a
similar fashion, where it is possible to switch among the multiple
events, if present, using TAB to go to the next event, and shift+TAB
(UNTAB) to go to the previous.

The report TUI now shows as the window title the name of the event and a
leak was fixed wrt pstacks.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-23 22:36:51 -03:00
Arnaldo Carvalho de Melo
44bf460649 perf annotate: Fix up usage of the build id cache
It was assuming that the cache was always available and also wasn't
checking if the file found in the build id cache was just a kallsyms
file, that is not supported by objdump for disassembly.

Reported-by: Ingo Molnar <mingo@elte.hu>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-23 22:35:07 -03:00
Arnaldo Carvalho de Melo
46e3e055ce perf annotate: Add TUI interface
When annotating multiple entries, for instance, when running simply as:

$ perf annotate

the right and left keys, as well as TAB can be used to cycle thru the
multiple symbols being annotated.

If one doesn't like TUI annotate, disable it by editing ~/.perfconfig
and adding:

[tui]

	annotate = off

Just like it is possible for report.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-22 11:25:40 -03:00
Arnaldo Carvalho de Melo
6e78c9fd1b perf tui: Remove annotate from popup menu after failure
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-22 11:20:24 -03:00
Arnaldo Carvalho de Melo
0e8dc25974 perf report: Don't start the TUI if -D is used
One day we'll have support for the "dump raw trace in ASCII" in the TUI
frontend, but till then, use the tty code.

Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-21 14:22:13 -03:00
Frederic Weisbecker
598357eba6 perf: Fix getline undeclared
We need to have stdio.h included with _GNU_SOURCE for getline,
which is broken with the inclusion of build-id.h.

Keep util.h included first in hist.c

Fixes:
	util/hist.c: In function 'hist_entry__parse_objdump_line':
	util/hist.c:938: warning: implicit declaration of function 'getline'
	util/hist.c:938: warning: nested extern declaration of 'getline'
	make: *** [util/hist.o] Error 1

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1274438919-5104-1-git-send-regression-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-21 13:55:32 +02:00
Peter Zijlstra
0e2e63dd60 perf-record: Share per-cpu buffers
It seems a waste of space to create a buffer per
event, share it per-cpu.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <20100521090710.634824884@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-21 11:37:58 +02:00
Peter Zijlstra
57adc51dce perf-record: Remove -M
Since it is not allowed to create cross-cpu (or
cross-task) buffers, this option is no longer valid.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <20100521090710.582740993@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-21 11:37:57 +02:00
Ingo Molnar
1c34bde13a Merge branch 'perf' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux-2.6 into perf/core 2010-05-21 09:50:09 +02:00
Arnaldo Carvalho de Melo
5d06e6915b perf tui: Allow disabling the TUI on a per command basis in ~/.perfconfig
Using the same scheme as for git's/perf's pager setup, i.e. if one wants,
on a newt enabled perf binary, to disable the TUI for 'perf report', it's
just a matter of doing:

  [root@doppio linux-2.6-tip]# printf "[tui]\n\nreport = off\n" > /root/.perfconfig
  [root@doppio linux-2.6-tip]# cat /root/.perfconfig
  [tui]

  report = off
  [root@doppio linux-2.6-tip]#

System wide settings are also possible, by editing /etc/perfconfig, etc,
i.e. the git machinery for config files applies to perf as well, so when
in doubt where to put your settings, consult the git documentation, if
it fails, please let us know.

Suggested-by: Ingo Molnar <mingo@elte.hu>
Discussed-with: Stephane Eranian <eranian@google.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-20 22:01:10 -03:00
Russ Anderson
ef365cefbc perf record: remove unneeded gettimeofday() call
Perf record repeatedly calls gettimeofday() which adds noise to the performance
measurements.  Since gettimeofday() is only used for the error printf, delete
it.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <20100518225240.GC25589@sgi.com>
Signed-off-by: Russ Anderson <rja@sgi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-20 21:53:58 -03:00
Arnaldo Carvalho de Melo
b36f19d572 perf annotate: Use build-ids to find the right DSO
We were still using the pathname found in the MMAP event, which might not
be the one we used when recording, so use the build-id cache for that,
only falling back to the pathname in the MMAP event if no build-ids
are available.

With this we now also are able to do secure, seamless offline annotation.

Example:

[root@doppio linux-2.6-tip]# perf report -g none -v 2> /dev/null | head -10
     8.12%     Xorg  /usr/lib64/libpixman-1.so.0.14.0       0x0000000000026d02 B [.] pixman_rasterize_edges
     4.68%  firefox  /usr/lib64/xulrunner-1.9.1/libxul.so   0x00000000005dbdba B [.] 0x000000005dbdba
     3.70%  swapper  /lib/modules/2.6.34-rc6/build/vmlinux  0xffffffff81022cea ! [k] read_hpet
     2.96%     init  /lib/modules/2.6.34-rc6/build/vmlinux  0xffffffff81022cea ! [k] read_hpet
     2.73%  swapper  /lib/modules/2.6.34-rc6/build/vmlinux  0xffffffff8100a738 ! [k] mwait_idle_with_hints
[root@doppio linux-2.6-tip]# perf annotate -v pixman_rasterize_edges 2>&1 | grep Executing
Executing: objdump --start-address=0x000000371ce26670 --stop-address=0x000000371ce2709f -dS /root/.debug/.build-id/bd/6ac5199137aaeb279f864717d8d061477466c1|grep -v /root/.debug/.build-id/bd/6ac5199137aaeb279f864717d8d061477466c1|expand
[root@doppio linux-2.6-tip]# perf buildid-list | grep libpixman-1.so.0.14.0
bd6ac5199137aaeb279f864717d8d061477466c1 /usr/lib64/libpixman-1.so.0.14.0
[root@doppio linux-2.6-tip]#
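
Judging from the output above, the cache entry for a DSO is found by splitting its build-id after the first two hex digits. A small Python sketch of that mapping follows; the function name is hypothetical and the layout is inferred from the example paths only.

```python
import posixpath


def build_id_cache_path(build_id, debug_dir):
    """Map a build-id to its file under <debug_dir>/.build-id/.

    Layout inferred from the example: the first two hex digits name
    a directory, the remaining digits name the file inside it.
    """
    return posixpath.join(debug_dir, ".build-id", build_id[:2], build_id[2:])
```

Applied to the libpixman build-id listed by 'perf buildid-list' and /root/.debug, this yields the path passed to objdump in the "Executing:" line.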

Reported-by: Stephane Eranian <eranian@google.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-20 12:15:33 -03:00
Arnaldo Carvalho de Melo
17930b405e perf TUI: Make 'space' be an alias to 'PgDn'
Just like if one is using the stdio based pager, or more/less, for that
matter.

Suggested-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-20 11:35:22 -03:00
Ingo Molnar
dfacc4d6c9 Merge branch 'perf/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/core 2010-05-20 14:38:55 +02:00
Frederic Weisbecker
85cb68b27c perf: Fix unaligned accesses while fetching trace values
Accessing 8-byte trace values may end up in a segfault
on archs that can't deal with misaligned accesses, which is the
case for sparc 64. This is because PERF_SAMPLE_RAW data is aligned
to 4 bytes and not to 8.

Fix this in the macros that fetch 8-byte values.

This fixes segfaults on perf tools in sparc 64.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: David Miller <davem@davemloft.net>
2010-05-20 11:21:57 +02:00
Tom Zanussi
cbb5cf7ff6 perf: Use read() instead of lseek() in trace_event_read.c:skip()
This is a small fix for a problem affecting live-mode, introduced
recently:

root@tropicana:~# perf trace rwtop
perf trace started with Perl
script /root/libexec/perf-core/scripts/perl/rwtop.pl

  Fatal: did not read header event

commit d00a47cce5 added a skip()
function to skip over e.g. header_page, but this doesn't work for
live mode.  This patch re-implements skip() to use read() instead of
lseek() to fix that.
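
The reason read() works where lseek() does not is that live mode feeds the data through a pipe, and pipes are not seekable. The idea can be sketched in Python as below; this is an illustration of the technique, not the C implementation, and the chunk size is an arbitrary choice.

```python
import os


def skip(fd, size, chunk=4096):
    """Discard `size` bytes from fd by reading them.

    Unlike lseek(), this also works on pipes and sockets.
    """
    remaining = size
    while remaining > 0:
        data = os.read(fd, min(chunk, remaining))
        if not data:
            raise EOFError("input ended before %d bytes were skipped" % size)
        remaining -= len(data)
```

On a regular file the two approaches end up at the same offset; on a pipe only the read-based version succeeds.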

Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1273032130.6383.28.camel@tropicana>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2010-05-20 08:37:17 +02:00
Arnaldo Carvalho de Melo
f869097e88 perf session: Make read_build_id routines look at the host_machine too
The changes made to support host and guest machines in a session, that
started when the 'perf kvm' tool was introduced ended up introducing a
bug where the host_machine was not having its DSOs traversed for
build-id processing.

Fix it by moving some methods to the right classes and considering the
host_machine when processing build-ids.

Reported-by: Tom Zanussi <tzanussi@gmail.com>
Reported-by: Stephane Eranian <eranian@google.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-19 13:45:08 -03:00
Arnaldo Carvalho de Melo
f6e1467d83 perf symbols: Don't try to read the build-id twice
In __dsos__read_build_ids if the dso already had its build-id read,
don't try again.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-19 13:44:41 -03:00
Arnaldo Carvalho de Melo
151f85a471 perf tools: remove xstrndup, xmalloc, xzalloc
All the functions that call this can handle the equivalent, non
panic'ing wrapped routines.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-18 23:05:28 -03:00
Arnaldo Carvalho de Melo
8a7ddad8e7 perf probe: Don't call die()
Functions that were calling xzalloc also returned -1 when, for other
reasons, they could fail, and the callers are coping with failures, so
stop using die() and xzalloc().

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-18 23:05:18 -03:00
Arnaldo Carvalho de Melo
b448c4b613 perf probe: Fix some error exit paths
That could leave file descriptors open and leak memory. Also stop using
xmalloc, use malloc and handle results just like other error cases in
the same routine that used it.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-18 23:04:28 -03:00
Arnaldo Carvalho de Melo
a41794cdd7 perf tools: Remove some unused functions
Without the bloated cplus_demangle from binutils, i.e building with:

$ make NO_DEMANGLE=1 O=~acme/git/build/perf -j3 -C tools/perf/ install

Before:

   text	   data	    bss	    dec	    hex	filename
 471851	  29280	4025056	4526187	 45106b	/home/acme/bin/perf

After:

[acme@doppio linux-2.6-tip]$ size ~/bin/perf
   text	   data	    bss	    dec	    hex	filename
 446886	  29232	4008576	4484694	 446e56	/home/acme/bin/perf

So it's a 5.3% size reduction in code, but the interesting part is in the git
diff --stat output:

 19 files changed, 20 insertions(+), 1909 deletions(-)

If we ever need some of the things we got from git but weren't using, we just
have to go to the git repo and get fresh, up-to-date source code bits.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-18 23:03:35 -03:00
Stephane Eranian
5af52b51f7 perf stat: add perf stat -B to pretty print large numbers
It is hard to read very large numbers so provide an option to perf stat
to separate thousands using a separator. The patch leverages the locale
support of stdio. You need to set your LC_NUMERIC appropriately, for
instance LC_NUMERIC=en_US.UTF8. You need to pass -B to activate this
feature. This way existing scripts parsing the output do not need to be
changed. Here is an example.

$ perf stat noploop 2
noploop for 2 seconds

 Performance counter stats for 'noploop 2':

        1998.347031  task-clock-msecs         #      0.998 CPUs
                 61  context-switches         #      0.000 M/sec
                  0  CPU-migrations           #      0.000 M/sec
                118  page-faults              #      0.000 M/sec
      4,138,410,900  cycles                   #   2070.917 M/sec  (scaled from 70.01%)
      2,062,650,268  instructions             #      0.498 IPC    (scaled from 70.01%)
      2,057,653,466  branches                 #   1029.678 M/sec  (scaled from 70.01%)
             40,267  branch-misses            #      0.002 %      (scaled from 30.04%)
      2,055,961,348  cache-references         #   1028.831 M/sec  (scaled from 30.03%)
             53,725  cache-misses             #      0.027 M/sec  (scaled from 30.02%)

        2.001393933  seconds time elapsed

$ perf stat -B  noploop 2
noploop for 2 seconds

 Performance counter stats for 'noploop 2':

        1998.297883  task-clock-msecs         #      0.998 CPUs
                 59  context-switches         #      0.000 M/sec
                  0  CPU-migrations           #      0.000 M/sec
                119  page-faults              #      0.000 M/sec
      4,131,380,160  cycles                   #   2067.450 M/sec  (scaled from 70.01%)
      2,059,096,507  instructions             #      0.498 IPC    (scaled from 70.01%)
      2,054,681,303  branches                 #   1028.216 M/sec  (scaled from 70.01%)
             25,650  branch-misses            #      0.001 %      (scaled from 30.05%)
      2,056,283,014  cache-references         #   1029.017 M/sec  (scaled from 30.03%)
             47,097  cache-misses             #      0.024 M/sec  (scaled from 30.02%)

        2.001391016  seconds time elapsed
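
Locale-driven grouping of this kind can be tried out in Python with the standard locale module. The sketch below is only an analogy to what -B does: init_locale and format_count are hypothetical names, and the separator actually printed depends on the LC_NUMERIC setting and on which locales are installed.

```python
import locale


def init_locale():
    """Adopt LC_NUMERIC from the environment, if that locale exists."""
    try:
        locale.setlocale(locale.LC_NUMERIC, "")
    except locale.Error:
        pass  # keep the default "C" locale, which has no separator


def format_count(value, big_num):
    """Format a counter, grouping thousands only when big_num is set."""
    return locale.format_string("%d", value, grouping=big_num)
```

With grouping off the digits are printed as-is, so scripts that parse the default output are unaffected; with grouping on and an en_US style locale, thousands are separated by commas.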

Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <4bf28fe8.914ed80a.01ca.fffff5f5@mx.google.com>
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-18 23:03:22 -03:00
Linus Torvalds
4d7b4ac22f Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (311 commits)
  perf tools: Add mode to build without newt support
  perf symbols: symbol inconsistency message should be done only at verbose=1
  perf tui: Add explicit -lslang option
  perf options: Type check all the remaining OPT_ variants
  perf options: Type check OPT_BOOLEAN and fix the offenders
  perf options: Check v type in OPT_U?INTEGER
  perf options: Introduce OPT_UINTEGER
  perf tui: Add workaround for slang < 2.1.4
  perf record: Fix bug mismatch with -c option definition
  perf options: Introduce OPT_U64
  perf tui: Add help window to show key associations
  perf tui: Make <- exit menus too
  perf newt: Add single key shortcuts for zoom into DSO and threads
  perf newt: Exit browser unconditionally when CTRL+C, q or Q is pressed
  perf newt: Fix the 'A'/'a' shortcut for annotate
  perf newt: Make <- exit the ui_browser
  x86, perf: P4 PMU - fix counters management logic
  perf newt: Make <- zoom out filters
  perf report: Report number of events, not samples
  perf hist: Clarify events_stats fields usage
  ...

Fix up trivial conflicts in kernel/fork.c and tools/perf/builtin-record.c
2010-05-18 08:19:03 -07:00
Arnaldo Carvalho de Melo
32ec6acfdc perf tui: Fix build problem with slang <= 2.0.6
slang versions <= 2.0.6 have a "#if HAVE_LONG_LONG" that breaks the
build if it isn't defined. Use the equivalent one that glibc has in
features.h.

Reported-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-18 00:25:36 -03:00
Masami Hiramatsu
7752f1b096 perf probe: Don't compile CFI related code if elfutils is old
Check the elfutils version and, if it is old, don't compile the CFI analysis
code. This allows perf to be compiled with old elfutils.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Tested-by: Stephane Eranian <eranian@google.com>
Reported-by: Robert Richter <robert.richter@amd.com>
Cc: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <20100510171207.26029.97604.stgit@localhost6.localdomain6>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-17 22:13:51 -03:00
Arnaldo Carvalho de Melo
94f3ca9578 perf tools: Add mode to build without newt support
make NO_NEWT=1

Will avoid building the newt (tui) support.

Suggested-by: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-17 18:18:11 -03:00
Arnaldo Carvalho de Melo
2f51903bc3 perf symbols: symbol inconsistency message should be done only at verbose=1
That happened for an old perf.data file that had no fake MMAP events for
the kernel modules, but even then it should warn once for each module,
not one time for every symbol in every module not found.

Reported-by: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-17 17:57:59 -03:00
Arnaldo Carvalho de Melo
63aa9e7e3a perf tui: Add explicit -lslang option
At least on rawhide using -lnewt is not enough if we use SLang routines
directly, so add an explicit -lslang since we use SLang routines.

Reported-by: Ingo Molnar <mingo@elte.hu>
Tested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-17 16:42:37 -03:00
Arnaldo Carvalho de Melo
edb7c60e27 perf options: Type check all the remaining OPT_ variants
OPT_SET_INT was renamed to OPT_SET_UINT since the only use in these
tools is to set something that has an enum type, that is builtin
compatible with unsigned int.

Several string constifications were done to make OPT_STRING require a
const char * type.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-17 16:22:41 -03:00
Arnaldo Carvalho de Melo
8035458fbb perf options: Type check OPT_BOOLEAN and fix the offenders
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-17 16:22:37 -03:00
Arnaldo Carvalho de Melo
1967936d68 perf options: Check v type in OPT_U?INTEGER
To avoid problems like the one fixed by Stephane Eranian in 3de29ca, we'll
now get this instead:

	bench/sched-messaging.c:259: error: negative width in bit-field ‘<anonymous>’
	bench/sched-messaging.c:261: error: negative width in bit-field ‘<anonymous>’

Which is rather cryptic, but is how BUILD_BUG_ON_ZERO works, so kernel
hackers should be already used to this.

With it in place, some problems were found and fixed by changing the affected
variables to sensible types or by changing some OPT_INTEGER to OPT_UINTEGER.

The next csets will convert each of the remaining OPT_ variants, so that
review is made easier by grouping changes per type per patch.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-17 15:43:38 -03:00
Arnaldo Carvalho de Melo
c100edbee8 perf options: Introduce OPT_UINTEGER
For unsigned int options to be parsed, next patches will make use of it.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-17 15:30:00 -03:00
Arnaldo Carvalho de Melo
dc4ff19341 perf tui: Add workaround for slang < 2.1.4
Older versions of the slang library didn't use the 'const' specifier,
causing problems of this kind with modern compilers:

util/newt.c:252: error: passing argument 1 of ‘SLsmg_printf’ discards
qualifiers from pointer target type

Fix it by using some wrappers that, when needed, cast the affected
parameters back to plain (char *).

Reported-by: Lin Ming <ming.m.lin@intel.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <20100517145421.GD29052@ghostprotocols.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-17 12:28:34 -03:00
Stephane Eranian
3de29cab1f perf record: Fix bug mismatch with -c option definition
The -c option defines the user requested sampling period. It was implemented
using an unsigned int variable but the type of the option was OPT_LONG. Thus,
the option parser was overwriting memory belonging to other variables, namely
the mmap_pages leading to a zero page sampling buffer. The bug was exposed only
when compiling at -O0, probably because the compiler was padding variables at
higher optimization levels.

This patch fixes this problem by declaring user_interval as u64. This also
avoids wrap-around issues for large period on 32-bit systems.

Committer note:

Made it use OPT_U64(user_interval) after implementing OPT_U64 in the
previous patch.
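
The overwrite described above can be reproduced in a small model. This is only
a sketch in Python, not perf's C code: it assumes a little-endian machine, a
64-bit long, and that the two 32-bit variables sit back to back in memory,
which is what the commit suggests happened at -O0. The value 128 for
mmap_pages is made up for the illustration.

```python
import struct

# Two adjacent 32-bit variables, laid out back to back (assumed layout).
memory = bytearray(8)
struct.pack_into("<I", memory, 0, 0)     # user_interval, an unsigned int
struct.pack_into("<I", memory, 4, 128)   # mmap_pages, illustrative value

# The option parser thinks the target is a 64-bit long, so "-c 1000" stores
# 8 bytes at the address of the 4-byte variable.
struct.pack_into("<q", memory, 0, 1000)

user_interval = struct.unpack_from("<I", memory, 0)[0]
mmap_pages = struct.unpack_from("<I", memory, 4)[0]
print(user_interval, mmap_pages)
```

The upper half of the 64-bit store lands on the neighbouring variable, which is
why the sampling buffer ended up with zero pages.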

Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <4bf11ae9.e88cd80a.06b0.ffffa8e3@mx.google.com>
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-17 12:23:18 -03:00
Arnaldo Carvalho de Melo
6ba85cea87 perf options: Introduce OPT_U64
We have things like user_interval (-c/--count) in 'perf record' that
need this.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-17 12:16:48 -03:00
Arnaldo Carvalho de Melo
a9a4ab747e perf tui: Add help window to show key associations
Suggested-by: Ingo Molnar <mingo@elte.hu>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-16 21:04:27 -03:00
Arnaldo Carvalho de Melo
a308f3a868 perf tui: Make <- exit menus too
In fact it is now added to the hot key list when newt_form__new is used,
allowing us to remove the explicit assignment in all its users.

The visible change is that <- will exit the menu that pops up when -> is
pressed (and Enter when callchains are not being used).

Suggested-by: Ingo Molnar <mingo@elte.hu>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-16 20:29:38 -03:00
Arnaldo Carvalho de Melo
9d192e118a perf newt: Add single key shortcuts for zoom into DSO and threads
'D'/'d' for zooming into the DSO in the current highlighted hist entry,
'T'/'t' for zooming into the current thread.

Suggested-by: Ingo Molnar <mingo@elte.hu>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-15 21:15:01 -03:00
Arnaldo Carvalho de Melo
29351db6a0 perf newt: Exit browser unconditionally when CTRL+C, q or Q is pressed
ESC still asks for confirmation.

Suggested-by: Ingo Molnar <mingo@elte.hu>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-15 21:06:58 -03:00
Arnaldo Carvalho de Melo
c1ec5fefd9 perf newt: Fix the 'A'/'a' shortcut for annotate
Reported-by: Ingo Molnar <mingo@elte.hu>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-15 20:48:25 -03:00
Arnaldo Carvalho de Melo
605539034f perf newt: Make <- exit the ui_browser
Right now that means that pressing the left arrow will make the symbol
annotation window exit back to the main symbol histogram browser.

This is another improvement on the UI fastpath, i.e. just the arrows and
enter are enough for most browsing.

Suggested-by: Ingo Molnar <mingo@elte.hu>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-15 20:48:24 -03:00
Arnaldo Carvalho de Melo
3e1bbdc3a7 perf newt: Make <- zoom out filters
After we use the filters to zoom into DSOs or threads, we can use <-
(left arrow) to zoom out from the last filter applied.

It is still possible to zoom out of order by using the popup menu.

With this we now have the zoom out operation on the browsing fast path,
allowing fast navigation using just the four arrows and the Enter key
to expand/collapse callchains.

Suggested-by: Ingo Molnar <mingo@elte.hu>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-14 20:05:21 -03:00
Arnaldo Carvalho de Melo
c82ee828aa perf report: Report number of events, not samples
The number of samples is meaningless after we switched to auto-freq, so
report the number of events, i.e. not the sum of the different periods,
but the number of PERF_RECORD_SAMPLE events emitted by the kernel.

While doing this I noticed that using the name "count" for the sum of all
the event periods can be confusing, so rename it to .period, just like in
struct sample.data, so that we become more consistent.

This helps with the next step, which is to record in struct hist_entry
the number of sample events for each instance. We need that because we
use it to generate the number of events when applying filters to the
tree of hist entries, as is done in the TUI report browser.
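
The distinction between the two counters can be sketched as follows. This is
an illustrative Python model, not the perf code: the class name HistEntry and
the field name nr_events are assumptions made for the example, and the period
values are invented.

```python
from dataclasses import dataclass

@dataclass
class HistEntry:
    period: int = 0      # sum of the periods of all samples in this entry
    nr_events: int = 0   # how many sample events were folded into it

def add_sample(entry, sample_period):
    """Account one sample event carrying the given period."""
    entry.period += sample_period
    entry.nr_events += 1

entry = HistEntry()
for sample_period in (1000, 4000, 2500):  # auto-freq varies the period
    add_sample(entry, sample_period)
```

With auto-freq every sample may carry a different period, so the sum and the
event count diverge and have to be tracked separately.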

Suggested-by: Ingo Molnar <mingo@elte.hu>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-14 14:19:35 -03:00
Arnaldo Carvalho de Melo
cee75ac7ec perf hist: Clarify events_stats fields usage
The events_stats.total field is too generic, rename it to .total_period,
and also add a comment explaining that it is the sum of all the .period
fields in samples, that is needed because we use auto-freq to avoid
sampling artifacts.

Ditto for events_stats.lost, that is the sum of all lost_event.lost
fields, i.e. the number of events the kernel dropped.

Looking at the users, builtin-sched.c can make use of these fields and
stop doing it again.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-14 13:16:55 -03:00
Arnaldo Carvalho de Melo
c8446b9bda perf hist: Make event__totals per hists
This is one more thing that started out global but is more useful per hist
or per session.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-14 10:36:42 -03:00
Kirill Smelkov
5d2be7cb19 perf trace scripts: Fix typos in perf-trace-python.txt
option option   -> option
special special -> special

Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <1273747165-17242-1-git-send-email-kirr@mns.spb.ru>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-13 17:10:40 -03:00
Stephane Eranian
2e6cdf996b perf tools: change event inheritance logic in stat and record
By default, event inheritance across fork and pthread_create was on, but the -i
option of stat and record, which enabled inheritance, led users to believe it
was off by default.

This patch fixes this logic by inverting the meaning of the -i option.  By
default inheritance is on whether you attach to a process (-p), a thread (-t)
or start a process. If you pass -i, then you turn off inheritance. Turning off
inheritance if you don't need it, helps limit perf resource usage as well.
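
The inverted meaning of -i can be modelled with a few lines of option parsing.
This is a sketch using Python's argparse, not perf's own option parser; the
function name parse_inherit and the destination name no_inherit are
hypothetical.

```python
import argparse

def parse_inherit(argv):
    """Return True when child tasks should inherit the counters."""
    parser = argparse.ArgumentParser(prog="sketch")
    # Passing -i now turns inheritance OFF; the default is ON.
    parser.add_argument("-i", dest="no_inherit", action="store_true",
                        help="child tasks do not inherit counters")
    args = parser.parse_args(argv)
    return not args.no_inherit
```

Calling parse_inherit([]) yields True and parse_inherit(["-i"]) yields False,
matching the behaviour described above.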

The patch also fixes perf stat -t xxxx and perf record -t xxxx which did not
start the counters.

Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <4bea9d2f.d60ce30a.0b5b.08e1@mx.google.com>
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-13 16:39:12 -03:00
Frederic Weisbecker
8a0ecfb8b4 perf hist: Fix missing getline declaration
hist.c needs to include util.h so that it gets stdio.h
inclusion with __GNU_SOURCE defined.

Fixes:
	util/hist.c: In function ‘hist_entry__parse_objdump_line’:
	util/hist.c:931: erreur: implicit declaration of function ‘getline’
	util/hist.c:931: erreur: nested extern declaration of ‘getline’

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1273772836-11533-1-git-send-regression-fweisbec@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-13 16:32:58 -03:00
Frederic Weisbecker
8769e1c717 perf hist: Fix hists__browse no-newt case
Fix mistake in a parameter type of the no-newt hists__browse()
version.

Fixes:
	builtin-report.c: In function ‘__cmd_report’:
	builtin-report.c:314: erreur: incompatible type for argument 1 of ‘hists__browse’

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1273771378-8577-1-git-send-regression-fweisbec@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-13 16:32:28 -03:00
Arnaldo Carvalho de Melo
46db2c3205 perf record: Add a fallback to the reference relocation symbol
Usually "_text" is enough, but I received reports that it's not always
available, so fallback to "_stext" for the symbol we use to check if we
need to apply any relocation to all the symbols in the kernel symtab,
for when, for instance, kexec is being used.
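
The fallback can be pictured as trying a short list of names in order. A
minimal Python sketch, assuming the symbol table is a plain name-to-address
mapping; the function name find_ref_reloc_sym and the addresses in the usage
note are made up for illustration.

```python
def find_ref_reloc_sym(symtab, candidates=("_text", "_stext")):
    """Return (name, address) of the first candidate present, else None."""
    for name in candidates:
        if name in symtab:
            return name, symtab[name]
    return None
```

For a table holding only "_stext", for example
find_ref_reloc_sym({"_stext": 0x1000}), the second candidate is returned.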

Reported-by: Darren Hart <dvhltc@us.ibm.com>
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-13 07:55:29 +02:00
Arnaldo Carvalho de Melo
ef7b93a119 perf report: Librarize the annotation code and use it in the newt browser
We no longer use popen to run 'perf annotate' for the selected
symbol. Instead, we collect per-address samples when processing samples
in 'perf report' if we're using the newt browser, then we use this data
directly to do annotation.

Done this way we can actually traverse the objdump_line objects
directly, matching the addresses to the collected samples and colouring
them appropriately using lower level slang routines.

The new ui_browser class will be reused for the main, callchain aware,
histogram browser, once it is made generic and no longer assumes that
the objects are always instances of the objdump_line class maintained
using list_heads.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-11 23:23:20 -03:00
Arnaldo Carvalho de Melo
3798ed7bc7 perf ui: Add ui_helpline methods
Initially this was just to be able to have a printf-like method to
prepare the formatted string and then pass it to newtPushHelpLine, but as
we already have for ui_progress, etc, it's a step in identifying a
restricted, high-level set of widgets that we can then have implementations
of for multiple widget sets (GTK, etc).

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-11 18:01:23 -03:00
Kyle McMartin
d11c7addfe perf symbols: allow forcing use of cplus_demangle
For Fedora, I want to force perf to link against libiberty.a for
cplus_demangle, rather than libbfd.a for bfd_demangle due to licensing insanity
on binutils. (libiberty is LGPL2, libbfd is GPL3.)

If we just rely on autodetection, we'll end up with libbfd linked against us,
since they're both in binutils-static in the buildroot.

Cc: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <20100510204335.GA7565@bombadil.infradead.org>
Signed-off-by: Kyle McMartin <kyle@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-11 12:43:11 -03:00
Masami Hiramatsu
6b3c4ef504 perf probe: Check older elfutils and set NO_DWARF
Check whether elfutils is older than 0.138 (the version in which the version
checking routine was introduced). If so, set NO_DWARF because it is hard
to check the API dependency without version checking.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Reported-by: Robert Richter <robert.richter@amd.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <20100511045953.9913.19485.stgit@localhost6.localdomain6>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-11 12:43:11 -03:00
Arnaldo Carvalho de Melo
b09e0190ac perf hist: Adopt filter by dso and by thread methods from the newt browser
Those are really not specific to the newt code, can be used by other UI
frontends.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-11 12:43:10 -03:00
Frederic Weisbecker
de068ec048 perf: Fix static strings treated like dynamic ones
The raw_field_ptr() helper, used to retrieve the address of a field
inside a trace event, treats every string as if it were dynamic,
i.e. as having a secondary level of indirection to retrieve its
contents.

FIELD_IS_STRING doesn't mean FIELD_IS_DYNAMIC, we only need to
compute the secondary dereference for the latter case.
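
The difference between the two kinds of string field can be sketched like
this. It is a Python model, not perf's raw_field_ptr(): the flag values are
illustrative, and the encoding of a dynamic field (a 32-bit locator whose low
16 bits hold the offset of the real data) is an assumption of the sketch.

```python
import struct

# Flag values are illustrative only; they are not perf's actual constants.
FIELD_IS_STRING = 1 << 0
FIELD_IS_DYNAMIC = 1 << 1

def field_data_offset(record, offset, flags):
    """Return the offset in `record` where the field's contents start."""
    if flags & FIELD_IS_DYNAMIC:
        # Dynamic field: follow the locator stored in the field's slot.
        (locator,) = struct.unpack_from("<I", record, offset)
        return locator & 0xFFFF
    # Static field (string or not): the contents are stored inline.
    return offset

def read_cstring(record, start):
    """Decode the NUL-terminated string beginning at `start`."""
    end = record.index(b"\0", start)
    return record[start:end].decode()

# A static string stored inline at offset 4.
static_rec = b"\0" * 4 + b"bash\0"
# A dynamic string: the locator at offset 0 points to offset 8.
dynamic_rec = struct.pack("<I", (5 << 16) | 8) + b"\0" * 4 + b"perf\0"

static_comm = read_cstring(
    static_rec, field_data_offset(static_rec, 4, FIELD_IS_STRING))
dynamic_comm = read_cstring(
    dynamic_rec,
    field_data_offset(dynamic_rec, 0, FIELD_IS_STRING | FIELD_IS_DYNAMIC))
```

Only the FIELD_IS_DYNAMIC test decides whether the extra dereference happens;
treating the inline "bash" bytes as a locator would point at unrelated memory.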

This fixes perf sched segfaults, bad cmdline reports and maybe
some other bugs.

Reported-by: Jason Baron <jbaron@redhat.com>
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
2010-05-11 09:14:24 +02:00
Tom Zanussi
e61a639a79 perf/trace/scripting: syscall-counts script cleanup
A small fix for the syscall counts script:

 - silence the match output in the shell script

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <1273466820-9330-10-git-send-email-tzanussi@gmail.com>
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-10 19:51:02 -03:00
Tom Zanussi
79e653f1bf perf/trace/scripting: syscall-counts-by-pid script cleanup
A small fix for the syscall counts by pid script:

- silence the match output in the shell script

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <1273466820-9330-9-git-send-email-tzanussi@gmail.com>
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-10 19:51:01 -03:00
Tom Zanussi
a4ab0c1297 perf/trace/scripting: failed-syscalls-by-pid script cleanup
A small fix for the failed syscalls by pid script:

 - silence the match output in the shell script

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <1273466820-9330-8-git-send-email-tzanussi@gmail.com>
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-10 19:51:00 -03:00
Tom Zanussi
3824a4e8da perf/trace/scripting: don't show script start/stop messages by default
Only print the script start/stop messages in verbose mode - users
normally don't care and it just clutters up the output.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <1273466820-9330-7-git-send-email-tzanussi@gmail.com>
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-10 19:50:59 -03:00
Tom Zanussi
a3412d9b35 perf/trace/scripting: workqueue-stats script cleanup
Some minor fixes for the workqueue-stats script:

 - Fix nuisance 'use of uninitialized value' warnings

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <1273466820-9330-6-git-send-email-tzanussi@gmail.com>
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-10 19:50:58 -03:00
Tom Zanussi
e366728d57 perf/trace/scripting: wakeup-latency script cleanup
Some minor fixes for the wakeup-latency script:

 - Fix nuisance 'use of uninitialized value' warnings

 - Avoid divide-by-zero error

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <1273466820-9330-5-git-send-email-tzanussi@gmail.com>
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-10 19:50:57 -03:00
Tom Zanussi
e88a4bfbcd perf/trace/scripting: rwtop script cleanup
A couple of fixes for the rwtop script:

- printing the totals and clearing the hashes in the signal handler
  eventually leads to various random and serious problems when running
  the rwtop script continuously.  Moving the print_totals() calls to
  the event handlers solves that problem, and the event handlers are
  invoked frequently enough that it doesn't affect the timeliness of
  the output.

- Fix nuisance 'use of uninitialized value' warnings
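
The first fix above follows a common pattern: the signal handler only records
that output is due, and the event handlers do the printing. A minimal Python
sketch of that pattern, not the rwtop script itself; the class and method
names are hypothetical, and whether rwtop uses exactly this flag-based scheme
is an assumption.

```python
class RwTop:
    def __init__(self):
        self.reads = {}             # pid -> bytes read since last report
        self.print_pending = False  # set from the timer, consumed by handlers
        self.reports = []           # stands in for printing the totals

    def on_timer(self):
        # Signal-handler side: set a flag and touch nothing else.
        self.print_pending = True

    def on_read(self, pid, nbytes):
        # Event-handler side: update the totals, then report if one is due.
        self.reads[pid] = self.reads.get(pid, 0) + nbytes
        self.print_check()

    def print_check(self):
        if self.print_pending:
            self.reports.append(dict(self.reads))
            self.reads.clear()
            self.print_pending = False
```

Because the hashes are only read and cleared from the event handlers, the timer
can never interrupt a half-finished update.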

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Message-Id: <1273466820-9330-4-git-send-email-tzanussi@gmail.com>
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-10 19:50:56 -03:00
Tom Zanussi
6922c3d772 perf/trace/scripting: rw-by-pid script cleanup
Some minor fixes for the rw-by-pid script:

- Fix nuisance 'use of uninitialized value' warnings

- Change the failed read/write sections to sort by error counts

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <1273466820-9330-3-git-send-email-tzanussi@gmail.com>
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-10 19:50:55 -03:00
Tom Zanussi
c3f5fd287a perf/trace/scripting: failed-syscalls script cleanup
A couple small fixes for the failed syscalls script:

- The script description says it can be restricted to a specific comm,
  make it so.

- silence the match output in the shell script

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <1273466820-9330-2-git-send-email-tzanussi@gmail.com>
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-10 19:50:54 -03:00
Arnaldo Carvalho de Melo
fefb0b94bb perf hist: Calculate max_sym name len and nr_entries
Better done when we are adding entries, be it initially or when we're
re-sorting the histograms.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-10 19:49:08 -03:00
Arnaldo Carvalho de Melo
1c02c4d2e9 perf hist: Introduce hists class and move lots of methods to it
In cbbc79a we introduced support for multiple events by introducing a
new "event_stat_id" struct and then made several perf_session methods
receive a pointer to it instead of a pointer to perf_session, and kept the
event_stats and hists rb_tree in perf_session.

While working on the new newt based browser, I realised that it would be
better to introduce a new class, "hists" (short for "histograms"),
renaming the "event_stat_id" struct and the perf_session methods that
were really "hists" methods, as they manipulate only struct hists
members, not touching anything in the other perf_session members.

Other optimizations, such as calculating the maximum length of a symbol
name present in a hists instance, will be possible as we add them,
avoiding a re-traversal just for finding that information.

The rationale for the name "hists" to replace "event_stat_id" is that we
may have multiple sets of hists for the same event_stat id, as, for
instance, the 'perf diff' tool has, so event stat id is not what
characterizes what this struct and the functions that manipulate it do.

Cc: Eric B Munson <ebmunson@us.ibm.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-10 13:13:49 -03:00
Arnaldo Carvalho de Melo
d118f8ba6a perf session: create_kernel_maps should use ->host_machine
Using machines__create_kernel_maps(..., HOST_KERNEL_ID) it would create
another machine instance for the host machine, and since 1f626bc we have
it out of the machines rb_tree.

Fix it by using machine__create_kernel_maps(&self->host_machine)
directly.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-10 12:51:05 -03:00
Arnaldo Carvalho de Melo
cdd5b75b0c perf callchains: Use zalloc to allocate objects
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-10 10:57:39 -03:00
Arnaldo Carvalho de Melo
7f8264539c perf newt: Use newtAddComponent()
Instead of newtAddComponents(just-one-entry, NULL), that is not needed
if, like in this browser, we're adding just one component at a time.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-10 10:51:25 -03:00
Ingo Molnar
1f0ac7183f Merge branch 'perf/test' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/core
2010-05-10 08:20:19 +02:00
Arnaldo Carvalho de Melo
232a5c948d perf report: Allow limiting the number of entries to print in callchains
Works by adding a third parameter to the '-g' argument, after the graph
type and minimum percentage, for example:

[root@doppio linux-2.6-tip]# perf report -g fractal,0.5,2

Will show only the first two symbols where at least 0.5% of the samples
took place.

All the other symbols that fall outside these constraints will be
put together in the last entry, prefixed with "[...]" and the total
percentage for them.
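
The argument format and the grouping can be sketched as follows. This is a
Python illustration, not perf's parser: the function names are hypothetical,
treating a limit of 0 as "no limit" is an assumption, and the entries are
assumed to arrive sorted by descending percentage.

```python
def parse_callchain_opt(arg):
    """Split 'type,min_percent,print_limit' into typed values."""
    parts = arg.split(",")
    mode = parts[0]
    min_percent = float(parts[1]) if len(parts) > 1 else 0.0
    print_limit = int(parts[2]) if len(parts) > 2 else 0  # 0: no limit
    return mode, min_percent, print_limit

def limit_entries(entries, min_percent, print_limit):
    """Keep the first entries that pass both constraints; fold the rest.

    `entries` is a list of (symbol, percent) pairs.
    """
    shown, rest = [], 0.0
    for symbol, percent in entries:
        under_limit = not print_limit or len(shown) < print_limit
        if percent >= min_percent and under_limit:
            shown.append((symbol, percent))
        else:
            rest += percent
    if rest:
        shown.append(("[...]", rest))
    return shown
```

Everything that is not shown is summed into a single trailing "[...]" entry, so
the displayed percentages still add up to the total.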

Suggested-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-09 21:15:35 -03:00
Arnaldo Carvalho de Melo
1f626bc368 perf session: Embed the host machine data on perf_session
We have just one host on a given session, and that is the most common
setup right now, so embed a ->host_machine struct machine instance
directly in the perf_session class, check if we're looking for it before
going to the rb_tree.

This also fixes a problem found when we try to process old perf.data
files where we didn't have MMAP events for the kernel and modules and
thus don't create the kernel maps, do it in event__preprocess_sample if
it wasn't already.

Reported-by: Ingo Molnar <mingo@elte.hu>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Zhang, Yanmin <yanmin_zhang@linux.intel.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-09 21:14:52 -03:00
Arnaldo Carvalho de Melo
4cc4945844 perf symbols: Check if a struct machine instance was found
Which can happen when processing old files that had no fake kernel MMAP
events.

That shouldn't result in perf_session__create_kernel_maps not being
called, this will be fixed in a followup patch, for now do these checks
to avoid segfaulting.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-09 21:14:07 -03:00
Arnaldo Carvalho de Melo
3ceb0d4438 perf symbols: Consider unresolved DSOs in the dso__col_widt calculation
By using BITS_PER_LONG / 4, that is the number of chars that will be
used in such cases as the DSO "name".
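
The arithmetic behind that width, as a small sketch:
unresolved_dso_width() is a hypothetical name, and the only assumption
is that the unresolved DSO "name" is an address printed in hex, where
one hex digit covers 4 bits.

```python
def unresolved_dso_width(bits_per_long):
    """Number of hex digits needed to print a long of the given width."""
    return bits_per_long // 4  # one hex digit encodes 4 bits


# A 64-bit kernel address printed in hex takes 16 characters.
example_addr = 0xffffffff81cb81b1
example_width = len(format(example_addr, "x"))
```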

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-09 18:32:32 -03:00
Hitoshi Mitake
76ba7e846f perf lock: Drop "-a" option from cmd_record() default arguments set
This patch drops "-a" from the default arguments passed to
perf record by perf lock.

If a user wants to do a system wide record of lock events,
        perf lock record -a <program> <argument> ...
is enough for this purpose.

This can reduce the size of the perf.data file.

% sudo ./perf lock record whoami
root
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.439 MB perf.data (~19170 samples) ]
% sudo ./perf lock record -a whoami   # with -a option
root
[ perf record: Woken up 0 times to write data ]
[ perf record: Captured and wrote 48.962 MB perf.data (~2139197 samples) ]

Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
LKML-Reference: <1273306229-5216-1-git-send-email-mitake@dcl.info.waseda.ac.jp>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2010-05-09 21:52:27 +02:00
Arnaldo Carvalho de Melo
28e2a106d1 perf hist: Simplify the insertion of new hist_entry instances
And with that fix at least one bug:

The first hit for an entry, the one that calls malloc to create a new
instance in __perf_session__add_hist_entry, wasn't adding the count to
the per cpumode (PERF_RECORD_MISC_USER, etc) total variable.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-09 13:10:39 -03:00
Arnaldo Carvalho de Melo
39d1e1b1e2 perf report: Fix leak of resolved callchains array on error path
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-09 13:07:39 -03:00
Arnaldo Carvalho de Melo
139633c6a4 perf callchain: Move validate_callchain to callchain lib
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-09 13:07:05 -03:00
Tom Zanussi
794e43b56c perf/live-mode: Handle payload-less events
Some events, such as the PERF_RECORD_FINISHED_ROUND event consist of
only an event header and no data.  In this case, a 0-length payload
will be read, and the 0 return value will be wrongly interpreted as an
'unexpected end of event stream'.

This patch allows for proper handling of data-less events by skipping
0-length reads.
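
A rough sketch of the idea, assuming a stream of records that each
start with a header of (u32 type, u16 misc, u16 size) where size counts
the header itself. read_events() is a hypothetical reader, not the perf
implementation, and the event type numbers used with it are arbitrary.

```python
import io
import struct

# type, misc, size: size includes the 8 header bytes.
HEADER = struct.Struct("<IHH")


def read_events(stream):
    """Return a list of (type, payload) read from a binary stream."""
    events = []
    while True:
        raw = stream.read(HEADER.size)
        if not raw:
            break  # clean end of stream between two events
        if len(raw) < HEADER.size:
            raise EOFError("truncated event header")
        ev_type, _misc, size = HEADER.unpack(raw)
        if size < HEADER.size:
            raise ValueError("event size smaller than its header")
        payload_len = size - HEADER.size
        payload = b""
        if payload_len > 0:
            # Only read when there is a payload: a header-only event must
            # not be mistaken for an unexpected end of the stream.
            payload = stream.read(payload_len)
            if len(payload) < payload_len:
                raise EOFError("unexpected end of event stream")
        events.append((ev_type, payload))
    return events
```

The key point is that the read is skipped entirely for header-only
events, so a legitimate zero-length payload never reaches the
short-read check.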

Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
LKML-Reference: <1273038527.6383.51.camel@tropicana>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2010-05-09 13:49:52 +02:00
Frederic Weisbecker
90c0e5fc7b perf lock: Always check min AND max wait time
When a lock is acquired after being contended, we update the
wait time statistics for the given lock.
But if the min wait time is updated, we don't check the max wait
time. This is wrong because the first time we update the wait time,
we want to update both min and max wait time.

Before:
	Name   acquired  contended total wait (ns)   max wait (ns)   min wait (ns)
	key          8          1           21656           0           21656

After:
	Name   acquired  contended total wait (ns)   max wait (ns)   min wait (ns)
	key          8          1           21656           21656           21656
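
The fix can be pictured with the following sketch. LockStat and
account_wait() are hypothetical names, not the perf lock structures;
the point is only that min and max are checked independently.

```python
class LockStat:
    """Per-lock wait time statistics (illustrative only)."""

    def __init__(self):
        self.wait_time_total = 0
        self.wait_time_min = None  # no wait observed yet
        self.wait_time_max = 0

    def account_wait(self, wait_ns):
        self.wait_time_total += wait_ns
        # Two independent checks, not an if/else chain: the first wait
        # seen has to update both the min and the max.
        if self.wait_time_min is None or wait_ns < self.wait_time_min:
            self.wait_time_min = wait_ns
        if wait_ns > self.wait_time_max:
            self.wait_time_max = wait_ns
```

Feeding the single 21656 ns wait from the output above gives a min and
a max of 21656, matching the "After" table.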

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
2010-05-09 13:45:30 +02:00
Frederic Weisbecker
5efe08cf68 perf: Fix perf lock bad rate
Fix the cast made to get the bad rate. It is applied to the result
instead of the operands. We need the operands to be cast to double,
otherwise the integer division truncates and the result is always zero.
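
The effect can be illustrated as follows. This is a sketch in Python,
where `//` stands in for C integer division; the function names are
hypothetical and the sample counts are made up.

```python
def bad_rate_cast_on_result(bad, total):
    # Mirrors (double)(bad / total) * 100 in C: the division is done on
    # integers first, so any rate below 100% is truncated to zero.
    return float(bad // total) * 100


def bad_rate_cast_on_operands(bad, total):
    # Mirrors (double)bad / (double)total * 100 in C.
    return float(bad) / float(total) * 100
```

With 1 bad sequence out of 4, the first form yields 0.0 and the second
yields 25.0.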

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
2010-05-09 13:45:29 +02:00
Frederic Weisbecker
84c7a21791 perf: Humanize lock flags in perf lock
Use an enum instead of plain constants for lock flags.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
2010-05-09 13:45:27 +02:00
Frederic Weisbecker
10350ec362 perf: Cleanup perf lock broken states
Use enum to get a human view of bad_hist indexes and
put bad histogram output in its own function.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
2010-05-09 13:45:26 +02:00
Hitoshi Mitake
26242d859c perf lock: Add "info" subcommand for dumping misc information
This adds the "info" subcommand to perf lock which can be used
to dump metadata like threads or addresses of lock instances.
"map" was removed because info should do the work for it.

This will be useful not only for debugging but also for ordinary
analysis.

v2: adding example of usage
% sudo ./perf lock info -t
 | Thread ID: comm
 |         0: swapper
 |         1: init
 |        18: migration/5
 |        29: events/2
 |        32: events/5
 |        33: events/6
...

% sudo ./perf lock info -m
| Address of instance: name of class
|  0xffff8800b95adae0: &(&sighand->siglock)->rlock
|  0xffff8800bbb41ae0: &(&sighand->siglock)->rlock
|  0xffff8800bf165ae0: &(&sighand->siglock)->rlock
|  0xffff8800b9576a98: &p->cred_guard_mutex
|  0xffff8800bb890a08: &(&p->alloc_lock)->rlock
|  0xffff8800b9522a08: &(&p->alloc_lock)->rlock
|  0xffff8800bb8aaa08: &(&p->alloc_lock)->rlock
|  0xffff8800bba72a08: &(&p->alloc_lock)->rlock
|  0xffff8800bf18ea08: &(&p->alloc_lock)->rlock
|  0xffff8800b8a0d8a0: &(&ip->i_lock)->mr_lock
|  0xffff88009bf818a0: &(&ip->i_lock)->mr_lock
|  0xffff88004c66b8a0: &(&ip->i_lock)->mr_lock
|  0xffff8800bb6478a0: &(shost->host_lock)->rlock

v3: fixed some problems Frederic pointed out
 * better rbtree tracking in dump_threads()
 * removed printf() and used pr_info() and pr_debug()

Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
LKML-Reference: <1272863520-16179-1-git-send-email-mitake@dcl.info.waseda.ac.jp>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2010-05-09 13:45:24 +02:00
Frederic Weisbecker
d6b17bebd7 perf: Provide a new deterministic events reordering algorithm
The current events reordering algorithm is based on a heuristic that
gets broken once we deal with a very fast flow of events.

Indeed the time period based flushing is not suitable anymore
in the following case, assuming we have a flush period of two
seconds.

    CPU 0           |        CPU 1
                    |
  cnt1 timestamps   |      cnt1 timestamps
                    |
    0               |         0
    1               |         1
    2               |         2
    3               |         3
    [...]           |        [...]
    4 seconds later

If we spend too much time reading the buffers (when there are a lot of
events to record in each buffer or when we have a lot of CPU buffers
to read), in the next pass the CPU 0 buffer could contain a slice
of several seconds of events. We'll read them all and notice we've
reached the period to flush. In the above example we flush the first
half of the CPU 0 buffer, then we read the CPU 1 buffer where we
have events that were on the flush slice and then the reordering
fails.

It's simple to reproduce with:

	perf lock record perf bench sched messaging

To solve this, we use a new solution that doesn't rely on a
heuristic time slice period anymore but on a deterministic basis
based on how perf record does its job.

perf record saves the buffers in passes. A pass is a tour
over every buffer of every CPU. This is done in order: for
each CPU we read the buffers of every counter. So the more
buffers we visit, the later the timestamps of their events will be.

When perf record finishes a pass it records a
PERF_RECORD_FINISHED_ROUND pseudo event.
We record the max timestamp t found in pass n. Assuming these
timestamps are monotonic across cpus, we know that if a buffer
still has events with timestamps below t, they will all be available
and then read in pass n + 1.
Hence when we start to read pass n + 2, we can safely flush every
event with a timestamp below t.

      ============ PASS n =================
         CPU 0         |   CPU 1
                       |
      cnt1 timestamps  |   cnt2 timestamps
            1          |         2
            2          |         3
            -          |         4  <--- max recorded

      ============ PASS n + 1 ==============
         CPU 0         |   CPU 1
                       |
      cnt1 timestamps  |   cnt2 timestamps
            3          |         5
            4          |         6
            5          |         7 <---- max recorded

        Flush every event below timestamp 4

      ============ PASS n + 2 ==============
         CPU 0         |   CPU 1
                       |
      cnt1 timestamps  |   cnt2 timestamps
            6          |         8
            7          |         9
            -          |         10

        Flush every event below timestamp 7
        etc...

It also works on perf.data versions that don't have
PERF_RECORD_FINISHED_ROUND pseudo events. The difference is that
the events will only be flushed at the end of the perf.data
processing. It will then consume more memory and scale worse with
large perf.data files.
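
The scheme above can be sketched as follows. This is an illustration
under stated assumptions, not the perf session code: OrderedEvents and
its method names are hypothetical, and the flush uses a strict "below"
comparison to follow the wording of this message.

```python
import heapq


class OrderedEvents:
    """Reorder events using end-of-round markers (illustrative only)."""

    def __init__(self):
        self.queue = []         # min-heap of (timestamp, seq, event)
        self.seq = 0            # tie-breaker, keeps insertion order
        self.max_timestamp = 0  # max timestamp queued so far
        self.next_flush = 0     # max timestamp known at the previous round
        self.delivered = []     # events handed out, in timestamp order

    def queue_event(self, timestamp, event):
        heapq.heappush(self.queue, (timestamp, self.seq, event))
        self.seq += 1
        self.max_timestamp = max(self.max_timestamp, timestamp)

    def finished_round(self):
        # Called on the end-of-pass marker: everything below the max of
        # the previous pass is now safe to deliver.
        self._flush(self.next_flush)
        self.next_flush = self.max_timestamp

    def finish(self):
        # End of the data: deliver whatever is left. Files without round
        # markers only ever reach this point.
        self._flush(float("inf"))

    def _flush(self, limit):
        while self.queue and self.queue[0][0] < limit:
            timestamp, _, event = heapq.heappop(self.queue)
            self.delivered.append((timestamp, event))
```

Replaying the three passes from the diagrams, nothing is delivered after
pass n, timestamps 1 to 3 are delivered after pass n + 1, and timestamps
4 to 6 after pass n + 2.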

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
2010-05-09 13:43:42 +02:00
Frederic Weisbecker
9840280757 perf: Introduce a new "round of buffers read" pseudo event
In order to provide a more robust and deterministic reordering
algorithm, we need to know when we reach a point where we just
did a pass over every counter buffer to read everything
they had.

This patch introduces a new PERF_RECORD_FINISHED_ROUND pseudo event
that consists only of an event header and doesn't need to contain
anything.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
2010-05-09 13:43:42 +02:00
Pekka Enberg
e157eb8341 perf report: Document '--call-graph' better for usage
This patch improves 'perf report -h' output for the
'--call-graph' command line option by enumerating the
different output types.

Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1273332783-4268-1-git-send-email-penberg@cs.helsinki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-08 18:11:44 +02:00
Ingo Molnar
ed82702155 Merge branch 'perf' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux-2.6 into perf/core 2010-05-08 10:02:57 +02:00
Arnaldo Carvalho de Melo
1cf4a0632c perf list: Improve the raw hw event descriptor documentation
It was x86 specific and incomplete at that, improve the situation by
making it clear where the example provided applies and by adding the
URLs for the Intel and AMD manuals where this is discussed in depth.

Acked-by: Robert Richter <robert.richter@amd.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Robert Richter <robert.richter@amd.com>
Reported-by: Robert Richter <robert.richter@amd.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-07 14:07:05 -03:00
Peter Zijlstra
ab608344bc perf, x86: Improve the PEBS ABI
Rename perf_event_attr::precise to perf_event_attr::precise_ip and
widen it to 2 bits. This new field describes the required precision of
the PERF_SAMPLE_IP field:

  0 - SAMPLE_IP can have arbitrary skid
  1 - SAMPLE_IP must have constant skid
  2 - SAMPLE_IP requested to have 0 skid
  3 - SAMPLE_IP must have 0 skid

And modify the Intel PEBS code accordingly. The PEBS implementation
now supports up to precise_ip == 2, where we perform the IP fixup.

Also s/PERF_RECORD_MISC_EXACT/&_IP/ to clarify its meaning, this bit
should be set for each PERF_SAMPLE_IP field known to match the actual
instruction triggering the event.

This new scheme allows for a PEBS mode that uses the buffer for more
than a single event.
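
As an illustration of the four levels, here is a small sketch. The enum
member names and pebs_supports() are hypothetical; only the numeric
values and the "up to precise_ip == 2" limit come from this message.

```python
from enum import IntEnum


class PreciseIp(IntEnum):
    """Values of the 2-bit precise_ip field (member names are made up)."""
    ARBITRARY_SKID = 0       # SAMPLE_IP can have arbitrary skid
    CONSTANT_SKID = 1        # SAMPLE_IP must have constant skid
    ZERO_SKID_REQUESTED = 2  # SAMPLE_IP requested to have 0 skid
    ZERO_SKID_REQUIRED = 3   # SAMPLE_IP must have 0 skid


# Highest level the PEBS implementation handles, per the text above.
PEBS_MAX_PRECISE_IP = PreciseIp.ZERO_SKID_REQUESTED


def pebs_supports(level):
    """Check whether a requested precision fits what PEBS can provide."""
    level = PreciseIp(level)  # raises ValueError outside the 2-bit range
    return level <= PEBS_MAX_PRECISE_IP
```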

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-07 11:31:02 +02:00
Arnaldo Carvalho de Melo
4778e0e8c6 perf tools: Fixup minor doc formatting issues
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-05 11:23:27 -03:00
Arnaldo Carvalho de Melo
9e32a3cb06 perf list: Add explanation about raw hardware event descriptors
Using explanation given by Ingo Molnar in the oprofile mailing list.

Suggested-by: Nick Black <dank@qemfd.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Nick Black <dank@qemfd.net>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-05 11:20:05 -03:00
Tom Zanussi
db620b1c2f perf/record: simplify TRACE_INFO tracepoint check
Fix a couple of inefficiencies and redundancies related to
have_tracepoints() and its use when checking whether to write
TRACE_INFO.

First, there's no need to use get_tracepoints_path() in
have_tracepoints() - we really just want the part that checks whether
any attributes correspond to tracepoints.

Second, we really don't care about raw_samples per se - tracepoints
are always raw_samples.  In any case, the have_tracepoints() check
should be sufficient to decide whether or not to write TRACE_INFO.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1273030770.6383.6.camel@tropicana>
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-05 11:12:53 -03:00
Arnaldo Carvalho de Melo
9890948d85 perf report: Make dso__calc_col_width agree with hist_entry__dso_snprintf
The first was always using the ->long_name, while the latter used
->short_name if verbose was not set, resulting in the dso column to be
much wider than needed most of the time.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-05 09:49:48 -03:00
Ingo Molnar
c4f3b5a2d7 Merge branch 'perf' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux-2.6 into perf/core 2010-05-04 18:31:47 +02:00
Anton Blanchard
02bf60aad7 perf: Fix performance issue with perf report
On a large machine we spend a lot of time in perf_header__find_attr when
running perf report.

If we are parsing a file without PERF_SAMPLE_ID then for each sample we call
perf_header__find_attr and loop through all counter IDs, never finding a match.
As the machine gets larger there are more per cpu counters and we spend an
awful lot of time in there.

The patch below initialises each sample id to -1ULL and checks for this in
perf_header__find_attr. We may need to do something more intelligent eventually
(eg a hash lookup from counter id to attr) but this at least fixes the most
common usage of perf report.
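
The short-circuit can be sketched like this. find_attr() and the dict
layout are hypothetical stand-ins for the header lookup, and returning
None for "no match" is an assumption about the intended result.

```python
NO_SAMPLE_ID = 0xFFFFFFFFFFFFFFFF  # -1ULL as an unsigned 64-bit value


def find_attr(attrs, sample_id):
    """Return the attr owning sample_id, or None if there is no match."""
    if sample_id == NO_SAMPLE_ID:
        # The file carries no sample ids: skip the walk over every
        # per-cpu counter id, which could never match anyway.
        return None
    for attr in attrs:
        if sample_id in attr["ids"]:
            return attr
    return None
```

The result for files without PERF_SAMPLE_ID is the same as before, but
it is reached without scanning a list that grows with the number of
CPUs.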

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Eric B Munson <ebmunson@us.ibm.com>
Acked-by: Eric B Munson <ebmunson@us.ibm.com>
LKML-Reference: <20100504111915.GB14636@kryten>
Signed-off-by: Anton Blanchard <anton@samba.org>
--
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-04 10:54:09 -03:00
Arnaldo Carvalho de Melo
11d232ec28 perf inject: Add missing bits
New commands need to have Documentation and be added to command-list.txt
so that they can appear when 'perf' is called without any subcommand:

[root@doppio linux-2.6-tip]# perf

 usage: perf [--version] [--help] COMMAND [ARGS]

 The most commonly used perf commands are:
   annotate        Read perf.data (created by perf record) and display annotated code
   archive         Create archive with object files with build-ids found in perf.data file
   bench           General framework for benchmark suites
   buildid-cache   Manage build-id cache.
   buildid-list    List the buildids in a perf.data file
   diff            Read two perf.data files and display the differential profile
   inject          Filter to augment the events stream with additional information
   kmem            Tool to trace/measure kernel memory(slab) properties
   kvm             Tool to trace/measure kvm guest os
   list            List all symbolic event types
   lock            Analyze lock events
   probe           Define new dynamic tracepoints
   record          Run a command and record its profile into perf.data
   report          Read perf.data (created by perf record) and display the profile
   sched           Tool to trace/measure scheduler properties (latencies)
   stat            Run a command and gather performance counter statistics
   test            Runs sanity tests.
   timechart       Tool to visualize total system behavior during a workload
   top             System profiling tool.
   trace           Read perf.data (created by perf record) and display trace output

 See 'perf help COMMAND' for more information on a specific command.

[root@doppio linux-2.6-tip]#

The new 'perf inject' command had neither, so it wasn't appearing on that list.

Also fix the long option, which should have no spaces in it: rename the faulty
'--inject build-ids' to '--build-ids'.

Reported-by: Ingo Molnar <mingo@elte.hu>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-04 10:48:22 -03:00
Tom Zanussi
63e0c7715a perf: record TRACE_INFO only if using tracepoints and SAMPLE_RAW
The current perf code implicitly assumes SAMPLE_RAW means tracepoints
are being used, but doesn't check for that.  It happily records the
TRACE_INFO even if SAMPLE_RAW is used without tracepoints, but when the
perf data is read it won't go any further when it finds TRACE_INFO but
no tracepoints, and displays misleading errors.

This adds a check for both in perf-record, and won't record TRACE_INFO
unless both are true.  This at least allows perf report -D to dump raw
events, and avoids triggering a misleading error condition in perf
trace.  It doesn't actually enable the non-tracepoint raw events to be
displayed in perf trace, since perf trace currently only deals with
tracepoint events.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1272865861.7932.16.camel@tropicana>
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-03 10:31:48 -03:00
Ingo Molnar
0806ebd974 Merge branch 'perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/core 2010-05-03 08:29:35 +02:00
Arnaldo Carvalho de Melo
090f7204df perf inject: Refactor read_buildid function
Into two functions, one that actually reads the build_id for the dso if
it wasn't already read, and another that will inject the event if the
build_id is available.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-02 19:46:36 -03:00
Arnaldo Carvalho de Melo
2c9faa0600 perf record: Don't exit in live mode when no tracepoints are enabled
With this I was able to actually test Tom Zanussi's two previous patches
in my usual perf testing ways, i.e. without any tracepoints activated.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-02 13:37:24 -03:00
Tom Zanussi
454c407ec1 perf: add perf-inject builtin
Currently, perf 'live mode' writes build-ids at the end of the
session, which isn't actually useful for processing live mode events.

What would be better would be to have the build-ids sent before any of
the samples that reference them, which can be done by processing the
event stream and retrieving the build-ids on the first hit.  Doing
that in perf-record itself, however, is off-limits.

This patch introduces perf-inject, which does the same job while
leaving perf-record untouched.  Normal mode perf still records the
build-ids at the end of the session as it should, but for live mode,
perf-inject can be injected in between the record and report steps
e.g.:

perf record -o - ./hackbench 10 | perf inject -v -b | perf report -v -i -

perf-inject reads a perf-record event stream and repipes it to stdout.
At any point the processing code can inject other events into the
event stream - in this case build-ids (-b option) are read and
injected as needed into the event stream.

Build-ids are just the first user of perf-inject - potentially
anything that needs userspace processing to augment the trace stream
with additional information could make use of this facility.
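
The "inject on first hit" idea can be sketched as a stream filter. This
is only an illustration: inject_build_ids(), the dict-based event shape
and the read_build_id callback are all hypothetical, not perf-inject
internals.

```python
def inject_build_ids(events, read_build_id):
    """Yield events, adding a build-id event before a DSO's first sample.

    events: iterable of dicts with a "type" key; samples carry a "dso".
    read_build_id: callable returning the build-id for a DSO, or None.
    """
    seen = set()
    for event in events:
        if event["type"] == "sample":
            dso = event["dso"]
            if dso not in seen:
                seen.add(dso)  # look each DSO up only once
                build_id = read_build_id(dso)
                if build_id is not None:
                    yield {"type": "build_id", "dso": dso,
                           "build_id": build_id}
        yield event  # every input event is repiped unchanged
```

Because it is a generator, the filter can sit between a producer and a
consumer without buffering the whole stream, which is the property the
pipeline above relies on.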

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1272696080-16435-3-git-send-email-tzanussi@gmail.com>
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-02 13:36:56 -03:00
Tom Zanussi
789688faef perf/live: don't synthesize build ids at the end of a live mode trace
It doesn't really make sense to record the build ids at the end of a
live mode session - live mode samples need that information during the
trace rather than at the end.

Leave event__synthesize_build_id() in place, however; we'll still be
using that to synthesize build ids in a more timely fashion in a
future patch.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1272696080-16435-2-git-send-email-tzanussi@gmail.com>
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-02 12:04:05 -03:00
Arnaldo Carvalho de Melo
fb72014d98 perf tools: Don't use code surrounded by __KERNEL__
We need to refactor code to be explicitly shared by the kernel and at
least the tools/ userspace programs, so, till we do that, copy the bare
minimum bitmap/bitops code needed by tools/perf.

Reported-by: "H. Peter Anvin" <hpa@zytor.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-05-02 12:00:44 -03:00
Frederic Weisbecker
d00a47cce5 perf: Fix warning while reading ring buffer headers
commit e9e94e3bd8
"perf trace: Ignore "overwrite" field if present in
/events/header_page" makes perf trace emit spurious warnings
about unexpected tokens read:

	Warning: Error: expected type 6 but read 4

This change tries to handle the overwrite field in the header_page
file whether this field is present or not.

The problem is that if this field is not present, we try to find it
and give up in the middle of the line when we realize we are actually
dealing with another field, which is the "data" one. And this failure
abandons the file pointer in the middle of the "data" description
line:

	field: u64 timestamp;	offset:0;	size:8;	signed:0;
	field: local_t commit;	offset:8;	size:8;	signed:1;
	field: char data;	offset:16;	size:4080;	signed:1;
                      ^^^
                      Here

What happens next is that we want to read this line to parse the data
field, but we fail because the pointer is not in the beginning of the
line.

We could probably fix that by rewinding the pointer. But in fact we
don't care much about these headers that only concern the ftrace
ring-buffer. We don't use them from perf.

Just skip this part of perf.data, but don't remove it from recording
to stay compatible with older perf.data files.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
2010-05-01 04:31:48 +02:00
Frederic Weisbecker
e5a5f1f015 perf: Remove leftover useless options to record trace events from scripts
-f, -c 1, -R are now useless for trace events recording; moreover
-M is useless and even hurts.

Remove them from the documentation examples and from record scripts.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
2010-04-30 19:55:00 +02:00
Arnaldo Carvalho de Melo
1c6a800cde perf test: Initial regression testing command
First an example with the first internal test:

[acme@doppio linux-2.6-tip]$ perf test
 1: vmlinux symtab matches kallsyms: Ok

So it ran just one test, that is "vmlinux symtab matches kallsyms", and it was
successful.

If we run it in verbose mode, we'll see details about errors and extra warnings
for non-fatal problems:

[acme@doppio linux-2.6-tip]$ perf test -v
 1: vmlinux symtab matches kallsyms:
--- start ---
Looking at the vmlinux_path (5 entries long)
No build_id in vmlinux, ignoring it
No build_id in /boot/vmlinux, ignoring it
No build_id in /boot/vmlinux-2.6.34-rc4-tip+, ignoring it
Using /lib/modules/2.6.34-rc4-tip+/build/vmlinux for symbols
Maps only in vmlinux:
 ffffffff81cb81b1-ffffffff81e1149b 0 [kernel].init.text
 ffffffff81e1149c-ffffffff9fffffff 0 [kernel].exit.text
 ffffffffff600000-ffffffffff6000ff 0 [kernel].vsyscall_0
 ffffffffff600100-ffffffffff6003ff 0 [kernel].vsyscall_fn
 ffffffffff600400-ffffffffff6007ff 0 [kernel].vsyscall_1
 ffffffffff600800-ffffffffffffffff 0 [kernel].vsyscall_2
Maps in vmlinux with a different name in kallsyms:
 ffffffffff600000-ffffffffff6000ff 0 [kernel].vsyscall_0 in kallsyms as [kernel].0
 ffffffffff600100-ffffffffff6003ff 0 [kernel].vsyscall_fn in kallsyms as:
*ffffffffff600100-ffffffffff60012f 0 [kernel].2
 ffffffffff600400-ffffffffff6007ff 0 [kernel].vsyscall_1 in kallsyms as [kernel].6
 ffffffffff600800-ffffffffffffffff 0 [kernel].vsyscall_2 in kallsyms as [kernel].8
Maps only in kallsyms:
 ffffffffff600130-ffffffffff6003ff 0 [kernel].4
---- end ----
vmlinux symtab matches kallsyms: Ok
[acme@doppio linux-2.6-tip]$

In the above case we only know the name of the non contiguous kernel ranges in
the address space when reading the symbol information from the ELF symtab in
vmlinux.

The /proc/kallsyms file lacks this, we only notice they are separate because
there are modules after the kernel and after that more kernel functions, so we
need to have a module rbtree backed by the module .ko path to get symtabs in
the vmlinux case.

The tool uses it to match by address and emit an appropriate warning, but
doesn't consider this fatal.

The .init.text and .exit.text lines, of course, aren't in kallsyms, so I left
these cases just as extra info in verbose mode.

The ends of the sections also aren't in kallsyms, so the symbols layer does
another pass and sets each end address to the next map's start minus one, which
sometimes pads, causing harmless mismatches.
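
The second pass described above can be sketched as follows. This is only an
illustration under stated assumptions: struct map_range and fixup_range_ends()
are hypothetical names rather than the symbol library's own, the ranges are
assumed to be sorted by start address, and an end of 0 is assumed to mean
"unknown".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a map: only the address range. */
struct map_range {
	uint64_t start;
	uint64_t end;	/* 0 means "unknown" in this sketch */
};

/*
 * Walk ranges sorted by start address and give every unknown end the value
 * "next start minus one". The last range is extended to the top of the
 * address space, since there is no next range to bound it.
 */
static void fixup_range_ends(struct map_range *maps, size_t nr)
{
	assert(maps != NULL || nr == 0);

	for (size_t i = 0; i < nr; i++) {
		if (maps[i].end != 0)
			continue;
		maps[i].end = (i + 1 < nr) ? maps[i + 1].start - 1 : UINT64_MAX;
	}
}
```

Because the computed end runs right up to the next start, it can cover padding
that the real section does not contain, which is where the harmless mismatches
come from.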

But at least the symbols match, tested it by copying /proc/kallsyms to
/tmp/kallsyms and doing changes to see if they were detected.

This first test also should serve as a first stab at documenting the
symbol library by providing a self contained example that exercises it
together with comments about what is being done.

More tests to check if actions done on a monitored app, like doing mmaps, etc,
make the kernel generate the expected events should be added next.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-29 18:59:23 -03:00
Arnaldo Carvalho de Melo
5c0541d53e perf symbols: Add machine helper routines
Created when writing the first 'perf test' regression testing routine.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-29 15:25:23 -03:00
Arnaldo Carvalho de Melo
18acde52b8 perf tools: Create $(OUTPUT)arch/$(ARCH)/util/ directory
So that "make -C tools/perf O=/tmp/some/path" works again.

Problem introduced in:

cd932c5 "perf: Move arch specific code into separate arch director"

Cc: Ian Munsie <imunsie@au.ibm.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-27 22:29:45 -03:00
Arnaldo Carvalho de Melo
cbf6968098 perf machines: Make the machines class adopt the dsos__fprintf methods
Now those methods don't operate on a global list of dsos, but on lists
of machines, so make this clear by renaming the functions.

Cc: Avi Kivity <avi@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zhang, Yanmin <yanmin_zhang@linux.intel.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-27 21:22:44 -03:00
Arnaldo Carvalho de Melo
d28c62232e perf machine: Adopt some map_groups functions
Those functions operated on members now grouped in 'struct machine', so
move those methods to this new class.

The changes made to 'perf probe' show that, with this abstraction,
inserting probes on guests is almost supported for free.

Cc: Avi Kivity <avi@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zhang, Yanmin <yanmin_zhang@linux.intel.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-27 21:21:18 -03:00
Arnaldo Carvalho de Melo
48ea8f5470 perf machine: Pass buffer size to machine__mmap_name
Don't blindly assume that the size of the buffer is enough, use
snprintf.
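
A minimal sketch of the pattern, not the actual machine__mmap_name(): the
helper name format_mmap_name(), its return convention and the name format
string are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: formats a map name into a caller-supplied buffer
 * whose size is passed in explicitly.
 * Returns 0 on success, -1 if the name did not fit and was truncated.
 */
static int format_mmap_name(char *buf, size_t size, int pid)
{
	int n;

	assert(buf != NULL && size > 0);

	n = snprintf(buf, size, "[guest.kernel.kallsyms.%d]", pid);
	/* snprintf() returns the length it wanted to write, excluding '\0'. */
	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

Callers can then pass sizeof(buf) and react to truncation instead of
overrunning the buffer.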

Cc: Avi Kivity <avi@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Zhang, Yanmin <yanmin_zhang@linux.intel.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-27 21:19:05 -03:00
Arnaldo Carvalho de Melo
23346f21b2 perf tools: Rename "kernel_info" to "machine"
struct kernel_info and kerninfo__ are too vague, what they really
describe are machines, virtual ones or hosts.

There are more changes to introduce helpers to shorten function calls
and to make more clear what is really being done, but I left that for
subsequent patches.

Cc: Avi Kivity <avi@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Zhang, Yanmin <yanmin_zhang@linux.intel.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-27 21:17:50 -03:00
Ingo Molnar
462b04e28a Merge branch 'perf' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux-2.6 into perf/core
2010-04-27 11:16:54 +02:00
Stefan Hajnoczi
f93830fbb0 perf tools: Fix libdw-dev package name in error message
The headers required for DWARF support are provided by the libdw-dev
package in Debian-based distros.  This patch corrects the elfutils-dev
package name to libdw-dev in the Makefile error message when libdw.h is
not found.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <1272292023-9869-1-git-send-email-stefanha@linux.vnet.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-26 15:39:54 -03:00
Masami Hiramatsu
ef4a356574 perf probe: Add --max-probes option
Add a --max-probes option to change the maximum number of
findable probe points per event, since an inlined function can be
expanded into thousands of probe points. The default value is 128.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <20100421195640.24664.62984.stgit@localhost6.localdomain6>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-26 15:35:20 -03:00
Masami Hiramatsu
5d1ee0413c perf probe: Fix to exit callback soon after finding too many probe points
Fix to exit callback soon after finding too many probe points.
Don't try to continue searching because it already failed.
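
The early exit can be sketched like this. It is an illustration only: struct
probe_search, probe_point_cb() and walk_candidates() are hypothetical names,
and returning -ERANGE is an assumption about how the failure might be
reported.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical search state: a limit and the number of points found so far. */
struct probe_search {
	int max_probes;
	int nfound;
};

/* Called once per candidate probe point; non-zero means "stop searching". */
static int probe_point_cb(struct probe_search *ps)
{
	assert(ps != NULL);

	if (ps->nfound == ps->max_probes)
		return -ERANGE;	/* too many points: the search has already failed */
	ps->nfound++;
	return 0;
}

/* The walker propagates the first non-zero return instead of continuing. */
static int walk_candidates(struct probe_search *ps, int ncandidates)
{
	for (int i = 0; i < ncandidates; i++) {
		int ret = probe_point_cb(ps);

		if (ret != 0)
			return ret;
	}
	return 0;
}
```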

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <20100421195632.24664.42598.stgit@localhost6.localdomain6>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-26 15:33:08 -03:00
Masami Hiramatsu
15eca306ec perf probe: Fix to use symtab only if no debuginfo
Fix perf probe to use symtab only if there is no debuginfo, because debuginfo
has more information than symtab.

If we can't find a function in debuginfo, we never find it in symtab.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <20100421195624.24664.46214.stgit@localhost6.localdomain6>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-26 15:32:37 -03:00
Masami Hiramatsu
0ab061cd52 perf tools: Initialize dso->node member in dso__new
If dso->node member is not initialized, it causes a segmentation fault when
adding to other lists.

It should be initialized in dso__new().
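
A minimal sketch of the idea, using a stand-alone list type in the style of
the kernel's list.h. struct dso_sketch and the helper names are hypothetical;
only the principle (make the node point at itself in the constructor) is taken
from the description above.

```c
#include <assert.h>
#include <stdlib.h>

/* Minimal doubly linked list node in the style of the kernel's list.h. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *list)
{
	assert(list != NULL);
	list->next = list;
	list->prev = list;
}

/* A node that points at itself is on no list and is safe to test or unlink. */
static int list_is_unlinked(const struct list_head *entry)
{
	return entry->next == entry && entry->prev == entry;
}

/* Hypothetical stand-in for struct dso: only the list node matters here. */
struct dso_sketch {
	struct list_head node;
};

static struct dso_sketch *dso_sketch__new(void)
{
	struct dso_sketch *dso = calloc(1, sizeof(*dso));

	if (dso != NULL)
		init_list_head(&dso->node);	/* never leave next/prev dangling */
	return dso;
}
```

Without the init_list_head() call, calloc() leaves both pointers NULL, so any
list operation that follows node.next or node.prev dereferences a null pointer.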

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <20100421195616.24664.89980.stgit@localhost6.localdomain6>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-26 15:31:32 -03:00
William Cohen
cfadf9d4ac perf: Some perf-kvm documentation edits
asciidoc does not allow the "===" to be longer than the line
above it.
Also fix a couple of typos and formatting errors.

Signed-off-by: William Cohen <wcohen@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <4BD204C5.9000504@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2010-04-24 03:50:49 +02:00
Frederic Weisbecker
e1889d75af perf: Add a perf trace option to check samples ordering reliability
To ensure sample events time reordering is reliable, add a -d option
to perf trace to check that automatically.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
2010-04-24 03:50:48 +02:00
Frederic Weisbecker
9df9bbba9f perf: Use generic sample reordering in perf timechart
Use the new generic sample events reordering from perf timechart,
this drops the ad hoc sample reordering it was using before.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
2010-04-24 03:50:46 +02:00
Frederic Weisbecker
e0a808c65c perf: Use generic sample reordering in perf trace
Use the new generic sample events reordering from perf trace.
Before that, the displayed traces were ordered as they were
in the input as recorded by perf record (not time ordered).

This finally makes perf trace display the events in time
order.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
2010-04-24 03:50:45 +02:00
Frederic Weisbecker
587570d4cc perf: Use generic sample reordering in perf kmem
Use the new generic sample events reordering from perf kmem,
this drops the need of multiplexing the buffers on record time,
improving the scalability of perf kmem.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Li Zefan <lizf@cn.fujitsu.com>
2010-04-24 03:50:44 +02:00
Frederic Weisbecker
a64eae703b perf: Use generic sample reordering in perf sched
Use the new generic sample events reordering from perf sched,
this drops the need of multiplexing the buffers on record time,
improving the scalability of perf sched.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
2010-04-24 03:50:42 +02:00
Frederic Weisbecker
c61e52ee70 perf: Generalize perf lock's sample event reordering to the session layer
The sample events recorded by perf record are not time ordered
because we have one buffer per cpu for each event (even demultiplexed
per task/per cpu for task bound events). But when we read trace events
we want them to be ordered by time because many state machines are
involved.

There are currently two ways perf tools deal with that:

- use -M to multiplex every buffer (perf sched, perf kmem)
  But this creates a lot of contention in SMP machines on
  record time.

- use a post-processing time reordering (perf timechart, perf lock)
  The reordering used by timechart is simple but doesn't scale well
  with huge flow of events, in terms of performance and memory use
  (unusable with perf lock for example).
  Perf lock has its own samples reordering that flushes its memory
  use on a regular basis and that uses a sorting based on the
  previous event queued (a new event to be queued is close to the
  previous one most of the time).

This patch proposes to export perf lock's samples reordering facility
to the session layer that reads the events. So if a tool wants to
get ordered sample events, it needs to set its
struct perf_event_ops::ordered_samples to true and that's it.

This prepares tracing based perf tools to get rid of the need to
use buffers multiplexing (-M) or to implement their own
reordering.
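
The "sort starting from the previously queued event" idea can be sketched as
below. This is a simplified illustration, not the session layer's code: struct
sample, struct ordered_queue and queue_insert() are hypothetical names, and the
periodic flushing is left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sample {
	uint64_t timestamp;
	struct sample *prev, *next;
};

struct ordered_queue {
	struct sample *head;		/* oldest sample */
	struct sample *tail;		/* newest sample */
	struct sample *last_inserted;	/* hint: where the search starts */
};

/* Insert s so the list stays sorted by timestamp, searching from the hint. */
static void queue_insert(struct ordered_queue *q, struct sample *s)
{
	struct sample *pos;

	assert(q != NULL && s != NULL);
	pos = q->last_inserted;
	s->prev = s->next = NULL;

	if (pos == NULL) {			/* empty queue */
		q->head = q->tail = s;
	} else if (s->timestamp >= pos->timestamp) {
		/* Walk forward, then link s right after pos. */
		while (pos->next && pos->next->timestamp <= s->timestamp)
			pos = pos->next;
		s->prev = pos;
		s->next = pos->next;
		if (pos->next)
			pos->next->prev = s;
		else
			q->tail = s;
		pos->next = s;
	} else {
		/* Walk backward, then link s right before pos. */
		while (pos->prev && pos->prev->timestamp > s->timestamp)
			pos = pos->prev;
		s->next = pos;
		s->prev = pos->prev;
		if (pos->prev)
			pos->prev->next = s;
		else
			q->head = s;
		pos->prev = s;
	}
	q->last_inserted = s;
}
```

Since a new event is usually close in time to the previous one, the walk from
the hint is short in the common case.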

Also lower the flush period to 2 as it's sufficient already.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
2010-04-24 03:49:58 +02:00
Stephane Eranian
5710fcad7c perf: Fix initialization bug in parse_single_tracepoint_event()
The parse_single_tracepoint_event() was setting some attributes
before it validated the event was indeed a tracepoint event. This
caused problems with other initialization routines, like the one in
the builtin-top.c module, which only sets sample_period if it is still 0.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <4bcf232b.698fd80a.6fbe.ffffb737@mx.google.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2010-04-24 03:24:09 +02:00
Hitoshi Mitake
e4cef1f650 perf lock: Fix state machine to recognize lock sequence
Previous state machine of perf lock was really broken.
This patch improves it a little.

This patch prepares a list of state machines that represent
lock sequences for each thread.

These state machines can be one of these sequences:

      1) acquire -> acquired -> release
      2) acquire -> contended -> acquired -> release
      3) acquire (w/ try) -> release
      4) acquire (w/ read) -> release

The case of 4) is a little special.
Double acquire of read lock is allowed, so the state machine
counts read lock number, and permits double acquire and release.
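
The four sequences can be sketched as a small transition function. This is an
illustration under assumptions: the enum and function names below are
hypothetical and do not match the patch's own identifiers, and the handling of
the "try" case (going straight to a releasable state) is one possible reading
of sequence 3.

```c
#include <assert.h>
#include <stddef.h>

enum lock_event {
	EV_ACQUIRE, EV_ACQUIRE_TRY, EV_ACQUIRE_READ,
	EV_CONTENDED, EV_ACQUIRED, EV_RELEASE,
};

enum seq_state {
	SEQ_UNINITIALIZED,	/* no sequence in progress */
	SEQ_ACQUIRING,		/* saw acquire, waiting for contended/acquired */
	SEQ_CONTENDED,
	SEQ_ACQUIRED,
	SEQ_READ_ACQUIRED,	/* read lock held, possibly more than once */
	SEQ_BAD,		/* event did not fit any known sequence */
};

struct lock_seq {
	enum seq_state state;
	int read_count;
};

/* Advance one per-thread, per-lock sequence by one event. */
static enum seq_state lock_seq_step(struct lock_seq *seq, enum lock_event ev)
{
	enum seq_state next = SEQ_BAD;

	assert(seq != NULL);

	switch (seq->state) {
	case SEQ_UNINITIALIZED:
		if (ev == EV_ACQUIRE) {
			next = SEQ_ACQUIRING;
		} else if (ev == EV_ACQUIRE_TRY) {
			next = SEQ_ACQUIRED;		/* 3) only release follows */
		} else if (ev == EV_ACQUIRE_READ) {
			seq->read_count = 1;		/* 4) start counting readers */
			next = SEQ_READ_ACQUIRED;
		}
		break;
	case SEQ_ACQUIRING:
		if (ev == EV_CONTENDED)
			next = SEQ_CONTENDED;
		else if (ev == EV_ACQUIRED)
			next = SEQ_ACQUIRED;
		break;
	case SEQ_CONTENDED:
		if (ev == EV_ACQUIRED)
			next = SEQ_ACQUIRED;
		break;
	case SEQ_ACQUIRED:
		if (ev == EV_RELEASE)
			next = SEQ_UNINITIALIZED;	/* sequence complete */
		break;
	case SEQ_READ_ACQUIRED:
		if (ev == EV_ACQUIRE_READ) {
			seq->read_count++;		/* double acquire is allowed */
			next = SEQ_READ_ACQUIRED;
		} else if (ev == EV_RELEASE) {
			next = (--seq->read_count == 0) ?
				SEQ_UNINITIALIZED : SEQ_READ_ACQUIRED;
		}
		break;
	case SEQ_BAD:
		break;
	}
	seq->state = next;
	return next;
}
```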

But, things are not so simple. Something in my model is still wrong.
I counted the number of lock instances with bad sequence,
and ratio is like this (case of tracing whoami): bad:233, total:2279

version 2:
 * threads are now identified with tid, not pid
 * prepared SEQ_STATE_READ_ACQUIRED for read lock.
 * bunch of struct lock_seq_stat is now linked list
 * debug information enhanced (this has to be removed someday)
   e.g.
     | === output for debug===
     |
     | bad:233, total:2279
     | bad rate:0.000000
     | histogram of events caused bad sequence
     |     acquire: 165
     |    acquired: 0
     |   contended: 0
     |     release: 68

Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <1271852634-9351-1-git-send-email-mitake@dcl.info.waseda.ac.jp>
[rename SEQ_STATE_UNINITED to SEQ_STATE_UNINITIALIZED]
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2010-04-24 03:23:14 +02:00
Ian Munsie
fead7960f0 perf probe: Add PowerPC DWARF register number mappings
This adds mappings from the register numbers from DWARF to the
register names used in the PowerPC Regs and Stack Access API.  This
allows perf probe to be used to record variable contents on PowerPC.

This requires the functionality represented by the config symbol
HAVE_REGS_AND_STACK_ACCESS_API in order to function, although it will
compile without it.  That functionality is added for PowerPC in commit
359e4284 ("powerpc: Add kprobe-based event tracer").

Signed-off-by: Ian Munsie <imunsie@au.ibm.com>
Acked-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2010-04-22 13:48:31 +10:00
Ian Munsie
cd932c5939 perf: Move arch specific code into separate arch directory
The perf userspace tool included some architecture specific code to map
registers from the DWARF register number into the names used by the regs
and stack access API.

This moves the architecture specific code out into a separate
arch/x86 directory along with the infrastructure required to use it.

Signed-off-by: Ian Munsie <imunsie@au.ibm.com>
Acked-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2010-04-22 13:48:31 +10:00
Frederic Weisbecker
6eca8cc35b perf: Fix perf probe build error
When we run in dry run mode, we want write_kprobe_trace_event
to succeed on writing the event. Let's initialize ret to 0.

Fixes the following build error:
	util/probe-event.c:1266: warning: ‘ret’ may be used uninitialized in this function
	util/probe-event.c:1266: note: ‘ret’ was declared here

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1271808065-25290-1-git-send-regression-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-21 09:39:52 +02:00
Zhang, Yanmin
a1645ce12a perf: 'perf kvm' tool for monitoring guest performance from host
Here is the patch of userspace perf tool.

Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-04-19 12:37:24 +03:00
Ingo Molnar
b5a80b7e91 Merge branch 'perf' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux-2.6 into perf/core
2010-04-15 09:16:51 +02:00
Ingo Molnar
84b13fd596 Merge branch 'perf/live' into perf/core
Conflicts:
	tools/perf/builtin-record.c

Merge reason: add the live tracing feature, resolve conflict.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-15 09:13:26 +02:00
Frederic Weisbecker
f921281930 perf: Make the trace events sample period default to 1
Trace events are mostly used for tracing, so they should not be
lost when that can be avoided. Unlike hardware events, which really
need to trigger only after a given sample period, trace events mostly
need to trigger every time.

It is a frustrating experience to trace with perf and realize we
lost a lot of events because we forgot the "-c 1" option.

So default sample_period to 1 for trace events, but let the user
override it.
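
The defaulting rule can be sketched as follows. The struct, the function name
and the "fallback" parameter for non-trace events are all hypothetical; the
sketch only shows the precedence: an explicit user value wins, otherwise trace
events get 1.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced view of an event's attributes. */
struct event_attr_sketch {
	int is_tracepoint;
	unsigned long long sample_period;
};

/*
 * user_period == 0 means the user gave no "-c" value in this sketch.
 * fallback is whatever period the tool would otherwise use.
 */
static void apply_sample_period(struct event_attr_sketch *attr,
				unsigned long long user_period,
				unsigned long long fallback)
{
	assert(attr != NULL);

	if (user_period != 0)
		attr->sample_period = user_period;	/* explicit override */
	else if (attr->is_tracepoint)
		attr->sample_period = 1;		/* trigger every time */
	else
		attr->sample_period = fallback;
}
```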

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
2010-04-15 04:12:52 +02:00
Frederic Weisbecker
bdef3b02ce perf: Always record tracepoints raw samples from perf record
Trace events are mostly used for tracing rather than simple
counting. Don't bother anymore with adding -R when using them,
just record raw samples of trace events every time.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
2010-04-15 04:12:52 +02:00
Frederic Weisbecker
7865e817e9 perf: Make -f the default for perf record
Force the overwriting mode by default if append mode is not explicit.
Adding -f every time one uses perf on a daily basis quickly becomes a
burden.

Keep -f among the options though, to avoid breaking some random
users' scripts.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
2010-04-15 04:12:51 +02:00
Thomas Gleixner
a1e2f60e3e perf: Fix dynamic field detection
Checking if a tracing field is an array with a dynamic length
requires checking the field type and looking for the "__data_loc"
string that precedes the actual type, as can be found in a trace
event format file:

	field:__data_loc char[] name;	offset:16;	size:4;	signed:1;

But we actually use strcmp() to check if the field type fully
matches "__data_loc", which may fail as we trip over the rest of
the type.
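
The difference between an exact match and a prefix match on the type string
can be sketched like this. The helper name field_is_dynamic() is hypothetical;
the type string is the one from the format line above.

```c
#include <assert.h>
#include <string.h>

/* Returns non-zero if a trace field type starts with "__data_loc". */
static int field_is_dynamic(const char *type)
{
	static const char prefix[] = "__data_loc";

	assert(type != NULL);
	/*
	 * strcmp(type, prefix) == 0 would only match the bare string and so
	 * would miss "__data_loc char[]". Comparing just the prefix length
	 * ignores whatever type follows.
	 */
	return strncmp(type, prefix, sizeof(prefix) - 1) == 0;
}
```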

To fix this, use strncmp to only check if it starts with
"__data_loc".

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1271282283-23721-1-git-send-regression-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-15 01:34:46 +02:00
Masami Hiramatsu
f6c903f585 perf probe: Show function entry line as probe-able
The function entry line should be shown as a probe-able line,
because each function has a declared-line attribute.

LKML-Reference: <20100414224007.14630.96915.stgit@localhost6.localdomain6>
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-14 17:45:39 -03:00
Masami Hiramatsu
de1439d8a5 perf probe: Support DW_OP_plus_uconst in DW_AT_data_member_location
DW_OP_plus_uconst can be used for DW_AT_data_member_location.
This patch adds DW_OP_plus_uconst support when getting
structure member offset.

Committer note:

Fixed up the size_t format specifier in one case:

cc1: warnings being treated as errors
util/probe-finder.c: In function ‘die_get_data_member_location’:
util/probe-finder.c:270: error: format ‘%d’ expects type ‘int’, but argument 4 has type ‘size_t’
make: *** [/home/acme/git/build/perf/util/probe-finder.o] Error 1

LKML-Reference: <20100414223958.14630.5230.stgit@localhost6.localdomain6>
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-14 17:44:00 -03:00
Masami Hiramatsu
dda4ab34fe perf probe: Fix line range to show end line
A line range should be rejected if its number of lines is 0
(e.g. "sched.c:1024+0"), and the lines shown should include
the end line of the range (e.g. "sched.c:1024-2048" should show
the 2048th line).

LKML-Reference: <20100414223950.14630.42263.stgit@localhost6.localdomain6>
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-14 17:41:30 -03:00
Masami Hiramatsu
d3b63d7ae0 perf probe: Fix a bug that --line range can be overflow
Since line_finder.lno_s/e are signed int but line_range.start/end
are unsigned int, an overflow is possible when converting
line_range->start/end to line_finder->lno_s/e.
This changes line_range.start/end and line_list.line to signed int
and adds overflow checks when setting line_finder.lno_s/e.
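
A minimal sketch of such an overflow check when a line number arrives as an
unsigned value and must be stored in a signed int. The helper name
line_to_int() and the -EINVAL return are assumptions for illustration, not the
patch's code.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical helper: store an unsigned line number into a signed int,
 * refusing values that a signed int cannot represent.
 */
static int line_to_int(unsigned int line, int *out)
{
	assert(out != NULL);

	if (line > (unsigned int)INT_MAX)
		return -EINVAL;	/* would wrap to a negative line number */
	*out = (int)line;
	return 0;
}
```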

LKML-Reference: <20100414223942.14630.72730.stgit@localhost6.localdomain6>
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-14 17:41:21 -03:00
Masami Hiramatsu
dd259c5db2 perf probe: Fix mis-estimation for shortening filename
Fix the mis-estimated size used when making a short filename.
Since the buffer size is 32 bytes and it must hold the '@' prefix and
the '\0' terminator, the maximum shortened filename length should be
30. This means that, before searching for '/', it should be 31 bytes.
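
The arithmetic can be sketched as below. This is one possible reading of the
description, not the code from util/probe-event.c: shorten_filename() and the
macro names are hypothetical. The search for '/' starts 31 bytes before the
end, so that the part after a '/' found at that position is exactly 30 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define SHORT_NAME_BUFSZ 32
#define SHORT_NAME_MAX	 (SHORT_NAME_BUFSZ - 2)	/* minus '@' and '\0' */

/* Writes "@<tail of path>" into buf; returns 0 on success, -1 on truncation. */
static int shorten_filename(char *buf, size_t size, const char *path)
{
	const char *tail = path;
	size_t len;
	int n;

	assert(buf != NULL && path != NULL);
	len = strlen(path);

	if (len > SHORT_NAME_MAX) {
		/* Look for a '/' within the last 31 bytes of the path. */
		const char *slash = strchr(path + len - (SHORT_NAME_MAX + 1), '/');

		/* Prefer a component boundary, else keep the last 30 bytes. */
		tail = slash ? slash + 1 : path + len - SHORT_NAME_MAX;
	}
	n = snprintf(buf, size, "@%s", tail);
	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```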

LKML-Reference: <20100414223935.14630.11954.stgit@localhost6.localdomain6>
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-14 17:41:14 -03:00
Masami Hiramatsu
7ca5989dd0 perf probe: Fix to use correct debugfs path finder
Instead of using debugfs_path, use debugfs_find_mountpoint()
to find actual debugfs path.

LKML-Reference: <20100414223928.14630.38326.stgit@localhost6.localdomain6>
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Reported-by: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-14 17:41:06 -03:00
Masami Hiramatsu
02b95dadc8 perf probe: Remove xstrdup()/xstrndup() from util/probe-{event, finder}.c
Remove all xstr*dup() calls from util/probe-{event,finder}.c since
it may cause 'sudden death' in utility functions and it makes
reusing it from other code difficult.
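
The conversion can be sketched as follows: instead of a duplicating helper
that exits the process on allocation failure, the utility function returns an
error code and lets the caller decide. struct probe_point_sketch and
probe_point_set_function() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, reduced probe point: only the function name. */
struct probe_point_sketch {
	char *function;
};

/* Returns 0 on success or -ENOMEM; never terminates the process. */
static int probe_point_set_function(struct probe_point_sketch *pp,
				    const char *name)
{
	size_t len;
	char *copy;

	assert(pp != NULL && name != NULL);

	len = strlen(name) + 1;
	copy = malloc(len);
	if (copy == NULL)
		return -ENOMEM;	/* an x*dup()-style helper would exit here */
	memcpy(copy, name, len);

	free(pp->function);
	pp->function = copy;
	return 0;
}
```

A caller such as a TUI can then show the error and keep running rather than
dying inside a library routine.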

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20100412171756.3790.89607.stgit@localhost6.localdomain6>
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-14 17:28:52 -03:00
Masami Hiramatsu
e334016f1d perf probe: Remove xzalloc() from util/probe-{event, finder}.c
Remove all xzalloc() calls from util/probe-{event,finder}.c since
it may cause 'sudden death' in utility functions and it makes
reusing it from other code difficult.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20100412171749.3790.33303.stgit@localhost6.localdomain6>
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-14 17:28:45 -03:00
Masami Hiramatsu
146a143948 perf probe: Remove die() from probe-event code
Remove die() and DIE_IF() code from util/probe-event.c since
these 'sudden death' in utility functions make reusing it from
other code (especially tui/gui) difficult.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20100412171742.3790.33650.stgit@localhost6.localdomain6>
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-14 17:28:36 -03:00
Masami Hiramatsu
b55a87ade3 perf probe: Remove die() from probe-finder code
Remove die() and DIE_IF() code from util/probe-finder.c since
these 'sudden death' in utility functions make reusing it from
other code (especially tui/gui) difficult.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20100412171735.3790.88853.stgit@localhost6.localdomain6>
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-14 17:28:30 -03:00
Masami Hiramatsu
a34a985499 perf probe: Support DW_OP_call_frame_cfa in debuginfo
When building the kernel without CONFIG_FRAME_POINTER, gcc uses the
CFA (canonical frame address) as the frame base. With this patch,
perf probe gets the CFI (call frame information) from debuginfo
and searches it for the corresponding CFA. IOW, this allows
perf probe to work correctly on a kernel without CONFIG_FRAME_POINTER.

<Before>
 ./perf probe -fn sched_slice:12 lw.weight
  Fatal: DW_OP 156 is not supported.
              (^^^ DW_OP_call_frame_cfa)

<After>
./perf probe -fn sched_slice:12 lw.weight
Add new event:
  probe:sched_slice    (on sched_slice:12 with weight=lw.weight)

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20100412171728.3790.98217.stgit@localhost6.localdomain6>
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-14 17:28:20 -03:00
Masami Hiramatsu
11a1ca3554 perf probe: Support basic type casting
Add basic type casting for arguments to perf probe. This allows
users to specify the actual type of arguments. Of course, if the
user sets invalid types, kprobe-tracer rejects them.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20100412171722.3790.50372.stgit@localhost6.localdomain6>
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-14 17:28:09 -03:00
Masami Hiramatsu
4984912eb2 perf probe: Query basic types from debuginfo
Query the basic type information (byte-size and signed-flag) from
debuginfo and pass that to kprobe-tracer. This is especially useful
for tracing the members of a data structure, because each member can
have a different byte size in memory.
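
One way to turn the two pieces of information into a type string is sketched
below. It assumes type names of the form u8/u16/u32/u64 and s8/s16/s32/s64;
the helper name basic_type_name() is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Builds a type name such as "u32" or "s64" from a byte size and a
 * signedness flag. Returns 0 on success, -1 on an unsupported size or if
 * the buffer is too small.
 */
static int basic_type_name(char *buf, size_t size,
			   unsigned int byte_size, int is_signed)
{
	int n;

	assert(buf != NULL && size > 0);

	if (byte_size != 1 && byte_size != 2 &&
	    byte_size != 4 && byte_size != 8)
		return -1;

	n = snprintf(buf, size, "%c%u", is_signed ? 's' : 'u', byte_size * 8);
	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```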

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20100412171715.3790.23730.stgit@localhost6.localdomain6>
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-14 17:27:56 -03:00
Masami Hiramatsu
df0faf4be0 perf probe: Use the last field name as the argument name
Set the last field name as the argument name when the argument
refers to a data-structure member.

e.g.
 ./perf probe --add 'vfs_read file->f_mode'
 Add new event:
   probe:vfs_read       (on vfs_read with f_mode=file->f_mode)

 This probe records file->f_mode, but the argument name becomes "f_mode".

This enables perf-trace command to parse trace event format correctly.
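
The naming rule above can be sketched roughly as follows. `default_arg_name` is a hypothetical helper written for illustration only; it is not a function from perf.

```python
import re


def default_arg_name(expr: str) -> str:
    """Return the last field of an access expression such as 'file->f_mode'."""
    # Split on the C member-access operators '->' and '.'.
    return re.split(r"->|\.", expr)[-1]


print(default_arg_name("file->f_mode"))  # f_mode
print(default_arg_name("file"))          # file
```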

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20100412171700.3790.72961.stgit@localhost6.localdomain6>
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-14 17:26:14 -03:00
Masami Hiramatsu
48481938b0 perf probe: Support argument name
Set given names to event arguments. The syntax is the same as kprobe-tracer's:
you can add 'NAME=' right before each argument.

e.g.
  ./perf probe vfs_read foo=file

 then, 'foo' is set to the argument name as below.

  ./perf probe -l
  probe:vfs_read       (on vfs_read@linux-2.6-tip/fs/read_write.c with foo)

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20100412171653.3790.74624.stgit@localhost6.localdomain6>
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-14 17:26:04 -03:00
Frederic Weisbecker
fcd1498405 perf tools: Fix accidentally preprocessed snprintf callback
struct sort_entry has a callback named snprintf that turns an
entry into a string result.
But there are glibc versions that implement snprintf through a
macro. The following expression is then going to get the snprintf
call preprocessed:

        ent->snprintf(...)

to finally end up in a build error:

        util/hist.c: In function 'hist_entry__snprintf':
        util/hist.c:539: error: 'struct sort_entry' has no member named '__builtin___snprintf_chk'

To fix this, prepend struct sort_entry callbacks with an "se_"
prefix.
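
A toy model of the clash, assuming a much simplified function-like macro (the real glibc macro also adds arguments): the preprocessor rewrites any `snprintf(` token, including a struct member call, while a member with the "se_" prefix is a different token and is left alone. `preprocess` is a hypothetical stand-in for the C preprocessor.

```python
import re

# Simplified stand-in for a glibc that defines snprintf as a function-like
# macro. Only the name is rewritten here; the real macro does more.
MACRO = re.compile(r"\bsnprintf\s*\(")


def preprocess(line: str) -> str:
    return MACRO.sub("__builtin___snprintf_chk(", line)


print(preprocess("ent->snprintf(self, bf, size);"))
print(preprocess("ent->se_snprintf(self, bf, size);"))
```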

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-14 16:59:21 -03:00
Tom Zanussi
a0cccc2e8e perf trace: Invoke live mode automatically if record/report not specified
Currently, live mode is invoked by explicitly invoking the
record and report sides and connecting them with a pipe e.g.

 $ perf trace record rwtop -o - | perf trace report rwtop 5 -i -

In terms of usability, it's not that bad, but it does require
the user to type and remember more than necessary.

This patch allows the user to accomplish the same thing without
specifying the separate record/report steps or the pipe.  So the
same command as above can be accomplished more simply as:

 $ perf trace rwtop 5

Notice that the '-i -' and '-o -' aren't required in this case -
they're added internally, and that any extra arguments are
passed along to the report script (but not to the record
script).

The overall effect is that any of the scripts listed in 'perf
trace -l' can now be used directly in live mode, with the
expected arguments, by simply specifying the script and args to
'perf trace'.
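
As a rough sketch of the expansion described above (`live_mode_commands` is a hypothetical helper, not perf code): the script name goes to both sides, '-o -' and '-i -' are added internally, and the extra arguments go only to the report side.

```python
def live_mode_commands(script, report_args=()):
    """Build the record and report command lines that live mode implies."""
    record = ["perf", "trace", "record", script, "-o", "-"]
    # Extra arguments are passed to the report script only.
    report = ["perf", "trace", "report", script, *report_args, "-i", "-"]
    return record, report


record, report = live_mode_commands("rwtop", ["5"])
print(" ".join(record) + " | " + " ".join(report))
```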

Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fweisbec@gmail.com
Cc: rostedt@goodmis.org
Cc: k-keiichi@bx.jp.nec.com
Cc: acme@ghostprotocols.net
LKML-Reference: <1270184365-8281-12-git-send-email-tzanussi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-14 11:56:09 +02:00
Tom Zanussi
00b21a0193 perf trace/scripting: Enable scripting shell scripts for live mode
It should be possible to run any perf trace script in 'live
mode'. This requires being able to pass in e.g. '-i -' or other
args, which the current shell scripts aren't equipped to handle.
In a few cases, there are required or optional args that also
need special handling. This patch changes the current set
of shell scripts as necessary.

Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fweisbec@gmail.com
Cc: rostedt@goodmis.org
Cc: k-keiichi@bx.jp.nec.com
Cc: acme@ghostprotocols.net
LKML-Reference: <1270184365-8281-11-git-send-email-tzanussi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-14 11:56:08 +02:00
Tom Zanussi
47902f3611 perf trace/scripting: Add rwtop and sctop scripts
A couple of scripts, one in Python and the other in Perl, that
demonstrate 'live mode' tracing.  For each, the output of the
perf event stream is fed continuously to the script, which
continuously aggregates the data and reports the current results
every 3 seconds, or at the optionally specified interval.  After
the current results are displayed, the aggregations are cleared
and the cycle begins anew.
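
The aggregate/report/clear cycle might look roughly like this. It is a sketch only, not the rwtop or sctop code: `IntervalAggregator` is hypothetical, and it is driven by event timestamps rather than a timer so that it can be run deterministically.

```python
from collections import Counter


class IntervalAggregator:
    """Count events per key, emitting and clearing totals every interval."""

    def __init__(self, interval=3.0):
        self.interval = interval
        self.counts = Counter()
        self.window_start = None
        self.reports = []  # snapshots emitted so far

    def add(self, timestamp, key):
        if self.window_start is None:
            self.window_start = timestamp
        # Report and reset once the current window has elapsed.
        while timestamp - self.window_start >= self.interval:
            self.reports.append(dict(self.counts))
            self.counts.clear()
            self.window_start += self.interval
        self.counts[key] += 1


agg = IntervalAggregator()
for ts, name in [(0.0, "read"), (1.0, "read"), (3.5, "write")]:
    agg.add(ts, name)
print(agg.reports, dict(agg.counts))
```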

To run the scripts, simply pipe the output of the 'perf trace
record' step as input to the corresponding 'perf trace report'
step, using '-' as the filename to -o and -i:

 $ perf trace record sctop -o - | perf trace report sctop -i -

Also adds clear_term() utility functions to the Util.pm and
Util.py utility modules, for use by any script to clear the
screen.

Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fweisbec@gmail.com
Cc: rostedt@goodmis.org
Cc: k-keiichi@bx.jp.nec.com
Cc: acme@ghostprotocols.net
LKML-Reference: <1270184365-8281-10-git-send-email-tzanussi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-14 11:56:08 +02:00
Tom Zanussi
c7929e4727 perf: Convert perf header build_ids into build_id events
Bypasses the build_id perf header code and replaces it with a
synthesized event and processing function that accomplishes the
same thing, used when reading/writing perf data to/from a pipe.

Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fweisbec@gmail.com
Cc: rostedt@goodmis.org
Cc: k-keiichi@bx.jp.nec.com
Cc: acme@ghostprotocols.net
LKML-Reference: <1270184365-8281-9-git-send-email-tzanussi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-14 11:56:08 +02:00
Tom Zanussi
9215545e99 perf: Convert perf tracing data into a tracing_data event
Bypasses the tracing_data perf header code and replaces it with
a synthesized event and processing function that accomplishes
the same thing, used when reading/writing perf data to/from a
pipe.

The tracing data is pretty large, and this patch doesn't attempt
to break it down into component events.  The tracing_data event
itself doesn't actually contain the tracing data, rather it
arranges for the event processing code to skip over it after
it's read, using the skip return value added to the event
processing loop in a previous patch.

Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fweisbec@gmail.com
Cc: rostedt@goodmis.org
Cc: k-keiichi@bx.jp.nec.com
Cc: acme@ghostprotocols.net
LKML-Reference: <1270184365-8281-8-git-send-email-tzanussi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-14 11:56:07 +02:00
Tom Zanussi
cd19a035f3 perf: Convert perf event types into event type events
Bypasses the event type perf header code and replaces it with a
synthesized event and processing function that accomplishes the
same thing, used when reading/writing perf data to/from a pipe.

Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fweisbec@gmail.com
Cc: rostedt@goodmis.org
Cc: k-keiichi@bx.jp.nec.com
Cc: acme@ghostprotocols.net
LKML-Reference: <1270184365-8281-7-git-send-email-tzanussi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-14 11:56:07 +02:00
Tom Zanussi
2c46dbb517 perf: Convert perf header attrs into attr events
Bypasses the attr perf header code and replaces it with a
synthesized event and processing function that accomplishes the
same thing, used when reading/writing perf data to/from a pipe.

Making the attrs into events allows them to be streamed over a
pipe along with the rest of the header data (in later patches).
It also paves the way to allowing events to be added and removed
from perf sessions dynamically.

Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fweisbec@gmail.com
Cc: rostedt@goodmis.org
Cc: k-keiichi@bx.jp.nec.com
Cc: acme@ghostprotocols.net
LKML-Reference: <1270184365-8281-6-git-send-email-tzanussi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-14 11:56:07 +02:00
Tom Zanussi
c239da3b4b perf trace: Introduce special handling for pipe input
Adds special treatment for stdin - if the user specifies '-i -'
to perf trace, the intent is that the event stream be read from
stdin rather than from a disk file.

The actual handling of the '-' filename is done by the session;
this just adds a signal handler to stop reporting, and turns off
interference by the pager.

Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fweisbec@gmail.com
Cc: rostedt@goodmis.org
Cc: k-keiichi@bx.jp.nec.com
Cc: acme@ghostprotocols.net
LKML-Reference: <1270184365-8281-5-git-send-email-tzanussi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-14 11:56:06 +02:00
Tom Zanussi
46656ac7fb perf report: Introduce special handling for pipe input
Adds special treatment for stdin - if the user specifies '-i -'
to perf report, the intent is that the event stream be read
from stdin rather than from a disk file.

The actual handling of the '-' filename is done by the session;
this just adds a signal handler to stop reporting, and turns off
interference by the pager.

Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fweisbec@gmail.com
Cc: rostedt@goodmis.org
Cc: k-keiichi@bx.jp.nec.com
Cc: acme@ghostprotocols.net
LKML-Reference: <1270184365-8281-4-git-send-email-tzanussi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-14 11:56:06 +02:00
Tom Zanussi
529870e374 perf record: Introduce special handling for pipe output
Adds special treatment for stdout - if the user specifies '-o -'
to perf record, the intent is that the event stream be written
to stdout rather than to a disk file.

Also, redirect stdout of forked child to stderr - in pipe mode,
stdout of the forked child interferes with the stdout perf
stream, so redirect it to stderr where it can still be seen but
won't be mixed in with the perf output.

Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fweisbec@gmail.com
Cc: rostedt@goodmis.org
Cc: k-keiichi@bx.jp.nec.com
Cc: acme@ghostprotocols.net
LKML-Reference: <1270184365-8281-3-git-send-email-tzanussi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-14 11:56:06 +02:00
Tom Zanussi
8dc58101f2 perf: Add pipe-specific header read/write and event processing code
This patch makes several changes to allow the perf event stream
to be sent and received over a pipe:

- adds pipe-specific versions of the header read/write code

- adds pipe-specific version of the event processing code

- adds a range of event types to be used for header or other
  pseudo events, above the range used by the kernel

- checks the return value of event handlers, which they can use
  to skip over large events during event processing rather than actually
  reading them into event objects.

- unifies the multiple do_read() functions and updates its
  users.

Note that none of these changes affect the existing perf data
file format or processing - this code only comes into play if
perf output is sent to stdout (or is read from stdin).
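
A minimal sketch of the skip return value, under stated assumptions: the header framing (32-bit type, 16-bit misc, 16-bit size), the event type number and all helper names are illustrative, not taken from this patch. A handler that returns a positive number tells the loop to discard that many bytes instead of reading them into an event object.

```python
import io
import struct

# Assumed framing: 32-bit type, 16-bit misc, 16-bit size (header included).
HEADER = struct.Struct("<IHH")
TRACING_DATA = 66  # illustrative pseudo event type, above the kernel's range


def process_stream(stream, handlers):
    """Dispatch each event; a positive handler return is a byte count to skip."""
    seen = []
    while True:
        raw = stream.read(HEADER.size)
        if len(raw) < HEADER.size:
            break
        etype, _misc, size = HEADER.unpack(raw)
        payload = stream.read(size - HEADER.size)
        handler = handlers.get(etype)
        skip = handler(payload) if handler else 0
        seen.append(etype)
        if skip > 0:
            stream.read(skip)  # discard bulk data without building an event
    return seen


def tracing_data_handler(payload):
    # In this sketch the event carries only the size of the blob that follows.
    return struct.unpack("<I", payload)[0]


def make_event(etype, payload=b""):
    return HEADER.pack(etype, 0, HEADER.size + len(payload)) + payload


pipe = io.BytesIO(
    make_event(TRACING_DATA, struct.pack("<I", 5)) + b"XXXXX" + make_event(1)
)
print(process_stream(pipe, {TRACING_DATA: tracing_data_handler}))
```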

Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: fweisbec@gmail.com
Cc: rostedt@goodmis.org
Cc: k-keiichi@bx.jp.nec.com
Cc: acme@ghostprotocols.net
LKML-Reference: <1270184365-8281-2-git-send-email-tzanussi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-14 11:56:05 +02:00
Ian Munsie
c055564217 perf: Fix endianness argument compatibility with OPT_BOOLEAN() and introduce OPT_INCR()
Parsing an option from the command line with OPT_BOOLEAN on a
bool data type would not work on a big-endian machine due to the
manner in which the boolean was being cast into an int and
incremented. For example, running 'perf probe --list' on a
PowerPC machine would fail to properly set the list_events bool
and would therefore print out the usage information and
terminate.
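
The byte-order effect can be illustrated with a small sketch (hypothetical helper, not perf code): when an int-sized value of 1 is stored at the address of a one-byte bool, the bool reads back the byte at the lowest address, which holds 1 on little-endian and 0 on big-endian.

```python
import struct


def bool_seen_after_int_store(byte_order: str) -> int:
    """byte_order is a struct prefix: '>' big-endian, '<' little-endian."""
    as_int = struct.pack(byte_order + "i", 1)  # int-sized store of the value 1
    return as_int[0]  # the single byte a bool at that address reads back


print(bool_seen_after_int_store("<"))  # 1
print(bool_seen_after_int_store(">"))  # 0
```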

This patch makes OPT_BOOLEAN work as expected with a bool
datatype. For cases where the original OPT_BOOLEAN was
intentionally being used to increment an int each time it was
passed in on the command line, this patch introduces OPT_INCR
with the old behaviour of OPT_BOOLEAN (the verbose variable is
currently the only such example of this).

I have reviewed every use of OPT_BOOLEAN to verify that a true
C99 bool was passed. Where integers were used, I verified that
they were only being used for boolean logic and changed them to
bools to ensure that they would not be mistakenly used as ints.
The major exception was the verbose variable which now uses
OPT_INCR instead of OPT_BOOLEAN.

Signed-off-by: Ian Munsie <imunsie@au.ibm.com>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: <stable@kernel.org> # NOTE: wont apply to .3[34].x cleanly, please backport
Cc: Git development list <git@vger.kernel.org>
Cc: Ian Munsie <imunsie@au1.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Eric B Munson <ebmunson@us.ibm.com>
Cc: Valdis.Kletnieks@vt.edu
Cc: WANG Cong <amwang@redhat.com>
Cc: Thiago Farina <tfransosi@gmail.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Cc: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: John Kacur <jkacur@redhat.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1271147857-11604-1-git-send-email-imunsie@au.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-14 11:26:44 +02:00
Arnaldo Carvalho de Melo
53e5b5c215 perf tools: Fix perl support installation when O= is used
We need to create the $O/scripts/perl/Perf-Trace-Util/ directory too.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-09 13:33:54 -03:00
Arnaldo Carvalho de Melo
e9e94e3bd8 perf trace: Ignore "overwrite" field if present in /events/header_page
That is not used in perf where we have the LOST events.

Without this patch we get:

[root@doppio ~]# perf lock report | head -3
  Warning: Error: expected 'data' but read 'overwrite'

So, to make the same perf command work with kernels with and without
this field, introduce variants for the parsing routines to not warn the
user in such case.

Discussed-with: Steven Rostedt <rostedt@goodmis.org>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-08 11:34:26 -03:00
Randy Dunlap
854c5548df perf: cleanup some Documentation
Correct typos in perf bench & perf sched help text.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <20100331113100.cc898487.randy.dunlap@oracle.com>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
2010-04-08 11:34:26 -03:00
Randy Dunlap
f0e9c4fcef perf bench: fix spello
Fix spello in user message.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <20100331113056.2c7df509.randy.dunlap@oracle.com>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
2010-04-08 11:34:26 -03:00
Arnaldo Carvalho de Melo
eed05fe70f perf tools: Reorganize some structs to save space
Using 'pahole --packable' I found some structs that could be reorganized
to eliminate alignment holes, in some cases getting them to be cacheline
multiples.
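
A small illustration of an alignment hole, using ctypes to mimic C struct layout. The two structures are made up for this sketch and are not the perf structs listed below; exact sizes depend on the platform ABI.

```python
import ctypes


class WithHoles(ctypes.Structure):
    # A 1-byte member before an 8-byte member forces padding after it.
    _fields_ = [
        ("flag", ctypes.c_uint8),
        ("addr", ctypes.c_uint64),
        ("kind", ctypes.c_uint8),
    ]


class Reordered(ctypes.Structure):
    # Same members, largest first, so the small ones share one padded slot.
    _fields_ = [
        ("addr", ctypes.c_uint64),
        ("flag", ctypes.c_uint8),
        ("kind", ctypes.c_uint8),
    ]


print(ctypes.sizeof(WithHoles), ctypes.sizeof(Reordered))
```

On a typical x86_64 ABI this reports 24 and 16 bytes.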

[acme@doppio linux-2.6-tip]$ codiff perf.old ~/bin/perf
builtin-annotate.c:
  struct perf_session    |   -8
  struct perf_header     |   -8
 2 structs changed

builtin-diff.c:
  struct sample_data         |   -8
 1 struct changed
  diff__process_sample_event |   -8
 1 function changed, 8 bytes removed, diff: -8

builtin-sched.c:
  struct sched_atom      |   -8
 1 struct changed

builtin-timechart.c:
  struct per_pid         |   -8
 1 struct changed
  cmd_timechart          |  -16
 1 function changed, 16 bytes removed, diff: -16

builtin-probe.c:
  struct perf_probe_point |   -8
  struct perf_probe_event |   -8
 2 structs changed
  opt_add_probe_event     |   -3
 1 function changed, 3 bytes removed, diff: -3

util/probe-finder.c:
  struct probe_finder      |   -8
 1 struct changed
  find_kprobe_trace_events |  -16
 1 function changed, 16 bytes removed, diff: -16

/home/acme/bin/perf:
 4 functions changed, 43 bytes removed, diff: -43
[acme@doppio linux-2.6-tip]$

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-08 11:34:26 -03:00
Arnaldo Carvalho de Melo
c0ed55d2e4 perf TUI: Move "Yes" button to before "No"
Esc + Enter should be enough warning to avoid accidentally exiting from
the browser.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-08 11:34:25 -03:00
Arnaldo Carvalho de Melo
6e7ab4c649 perf TUI: Show filters on the title and add help line about how to zoom out
Suggested-by: Ingo Molnar <molnar@elte.hu>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-08 11:34:25 -03:00
Arnaldo Carvalho de Melo
8c40041f75 perf kmem: Fix breakage introduced by 5a0e3ad slab.h script
Commit 5a0e3ad ("include cleanup: Update gfp.h and slab.h
includes to prepare for breaking implicit slab.h inclusion
from percpu.h") added a '#include <linux/slab.h>' to
tools/perf/builtin-kmem.h because that tool has lines like
this:

        if (!strcmp(event->name, "kmalloc") ||
            !strcmp(event->name, "kmem_cache_alloc")) {
                process_alloc_event(data, event, cpu, timestamp, thread, 0);
                return;
        }

So, using the script regex:

>>> import re
>>> s = re.compile(r'^(|.*[^a-zA-Z0-9_])_*(slab_is_available|kmem_cache_|k[mzc]alloc|krealloc|kz?free|ksize|__getname|putname)')
>>> l = '   !strcmp(event->name, "kmem_cache_alloc")) {'
>>> s.search(l)
<_sre.SRE_Match object at 0xb77b1ad0>
>>>

Remove that include: the file is not available in the tools/perf include
path, and thus builtin-kmem.c couldn't be compiled.

Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <1270561053-14308-1-git-send-email-acme@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-06 17:48:06 +02:00
Tejun Heo
336f5899d2 Merge branch 'master' into export-slabh
2010-04-05 11:37:28 +09:00
Hitoshi Mitake
8141d0050d perf: Swap inclusion order of util.h and string.h in util/string.c
Currently util/string.c includes headers in this order: string.h, util.h
But this causes a build error because __USE_GNU definition
is needed for strndup() definition:

	% make -j
	touch .perf.dev.null
	    CC util/string.o
	cc1: warnings being treated as errors
	util/string.c: In function ‘argv_split’:
	util/string.c:171: error: implicit declaration of function ‘strndup’
	util/string.c:171: error: incompatible implicit declaration of built-in function ‘strndup’

So this patch swaps the headers inclusion order.
util.h defines _GNU_SOURCE, and /usr/include/features.h defines
__USE_GNU as 1 if _GNU_SOURCE is defined.

Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1270368798-27232-1-git-send-email-mitake@dcl.info.waseda.ac.jp>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2010-04-04 16:40:42 +02:00
Arnaldo Carvalho de Melo
a5e29aca02 perf TUI: Add a "Zoom into COMM(PID) thread" and reverse operations
Now one can press the right arrow key and, in addition to being able to
filter by DSO, filter by thread too, or use a combination of both
filters.

With this one can start collecting events for the whole system, then
focus on a subset of the collected data quickly.
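
Combining the two filters could be sketched like this. `apply_filters` and the dictionary-based entries are hypothetical and stand in for the hist entries the browser works on.

```python
def apply_filters(entries, dso=None, thread=None):
    """Keep entries matching every filter that is set; None means 'no filter'."""
    return [
        e for e in entries
        if (dso is None or e["dso"] == dso)
        and (thread is None or e["thread"] == thread)
    ]


entries = [
    {"dso": "libc.so", "thread": "firefox(1234)", "sym": "memcpy"},
    {"dso": "libc.so", "thread": "bash(42)", "sym": "malloc"},
    {"dso": "[kernel]", "thread": "firefox(1234)", "sym": "schedule"},
]

print(len(apply_filters(entries, dso="libc.so")))                          # 2
print(len(apply_filters(entries, dso="libc.so", thread="firefox(1234)")))  # 1
print(len(apply_filters(entries)))                                         # 3
```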

Cc: Avi Kivity <avi@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-03 22:45:00 -03:00
Arnaldo Carvalho de Melo
83753190c1 perf newt: Add a "Zoom into foo.so DSO" and reverse operations
Clicking on -> will bring as one of the popup menu options a "Zoom into
CURRENT DSO", i.e. CURRENT will be replaced by the name of the DSO in
the current line.

Choosing this option will filter out all samples that didn't take place
in a symbol in this DSO.

After that the option reverts to "Zoom out of CURRENT DSO", to allow
going back to the more comprehensive view, not filtered by DSO.

Future similar operations will include zooming into a particular thread,
COMM, CPU, "last minute", "last N usecs", etc.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-03 22:36:56 -03:00
Ingo Molnar
22a4e4c435 Merge branch 'perf/urgent' into perf/core
Conflicts:
	tools/perf/Makefile

Merge reason: resolve the conflict.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-03 18:17:55 +02:00
Ingo Molnar
70a7c1271e Merge branch 'perf' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux-2.6 into perf/core
2010-04-03 18:16:42 +02:00
Arnaldo Carvalho de Melo
533c46c31c perf newt: Pass the input_name to perf_session__browse_hists
So that it can use it in the 'perf annotate' command line, otherwise
it'll use the default and not the specified -i filename passed to 'perf
report'.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-03 11:54:35 -03:00
Arnaldo Carvalho de Melo
e65713ea1e perf newt: Move the hist browser population bits to separate function
Next patches will use that when applying filters to then repopulate the
browser with the narrowed view.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-03 11:25:56 -03:00
Arnaldo Carvalho de Melo
fb6b893180 perf newt: Remove useless column width calculation
Not used in the TUI interface.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-03 11:04:55 -03:00
Anton Blanchard
4af8b35db6 perf symbols: Fill in pgoff in mmap synthesized events
When we synthesize mmap events we need to fill in the pgoff field.

I wasn't able to test this completely since I couldn't find an
executable region with a non 0 offset. We will see it when we start
doing data profiling.

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: David Miller <davem@davemloft.net>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <20100403115331.GK5594@kryten>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-03 10:20:31 -03:00
Arnaldo Carvalho de Melo
e206d556c5 perf tools: Move the prototypes in util/string.h to util.h
So that we avoid conflict with libc's string.h header.

Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Suggested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-03 10:19:26 -03:00
Borislav Petkov
b0f86f5a16 perf, probe-finder: Build fix on Debian
Building chokes with:

 In file included from /usr/include/gelf.h:53,
                 from /usr/include/elfutils/libdw.h:53,
                 from util/probe-finder.h:61,
                 from util/probe-finder.c:39:
 /usr/include/libelf.h:98: error: expected specifier-qualifier-list before 'off64_t'
 [...]

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Acked-by: Masami Hiramatsu <mhiramat@redhat.com>
LKML-Reference: <20100329164755.GA16034@aftab>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-02 22:46:26 +02:00
Tom Zanussi
b1dcc03cb8 perf/scripts: Tuple was set from long in both branches in python_process_event()
This is a fix to the signed/unsigned field handling in the
Python scripting engine, based on a patch from Roel Kluin.

Basically, Python wants to use a PyInt (which is internally a
long) if it can i.e. if the value will fit into that type.  If
not, it stores it into a PyLong, which isn't actually a long,
but an arbitrary-precision integer variable.

The code below is similar to what Python does internally, and
it seems to work as expected on the x86 and x86_64 systems I
tested it on.

Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Roel Kluin <roel.kluin@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: rostedt@goodmis.org
LKML-Reference: <1270184305.6422.10.camel@tropicana>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-02 21:32:16 +02:00
Arnaldo Carvalho de Melo
2aefa4f733 perf tools: sort_dimension__add shouldn't die
Propagate error instead.

LKML-Reference: <new-submission>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-02 16:28:35 -03:00
Arnaldo Carvalho de Melo
ad5b217b15 perf session: Remove one more exit() call from library code
Return NULL instead and make the caller propagate the error.

LKML-Reference: <new-submission>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-02 16:28:31 -03:00
Arnaldo Carvalho de Melo
b9fb930477 perf hist: Only allocate callchain_node if processing callchains
The struct callchain_node size is 120 bytes, that are never used when
there are no callchains or '-g none' is specified, so conditionally
allocate it, reducing sizeof(struct hist_entry) from 210 bytes to only
96, greatly speeding the non-callchain processing.

LKML-Reference: <new-submission>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-02 16:28:28 -03:00
Arnaldo Carvalho de Melo
71cf8b8ff7 perf kmem: Fixup the symbol address before using it
We get absolute addresses in the events, but relative ones from the
symbol subsystem, so calculate the absolute address by asking for the
map where the symbol was found, that has the place where the DSO was
actually loaded.

For the core kernel this poses no problems if the kernel is not
relocated by things like kexec, or if we use /proc/kallsyms, but for
modules we were getting really large, negative offsets.
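
A sketch of the fixup, assuming a map records where the DSO was loaded (`start`) and the offset of the mapping within it (`pgoff`). The `Map` class and the addresses are hypothetical, chosen only to show the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class Map:
    start: int  # address where the DSO was actually loaded
    pgoff: int  # offset of the mapping within the DSO

    def unmap_ip(self, rel_addr: int) -> int:
        """Turn a DSO-relative symbol address into an absolute one."""
        return rel_addr + self.start - self.pgoff


module_map = Map(start=0xFFFFFFFFA0000000, pgoff=0)
print(hex(module_map.unmap_ip(0x1230)))  # 0xffffffffa0001230
```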

LKML-Reference: <new-submission>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-02 16:28:25 -03:00
Arnaldo Carvalho de Melo
e727ca73f8 perf kmem: Resolve kernel symbols again
Due to the assumption in perf_session__new that the kernel maps would be
created using the fake PERF_RECORD_MMAP event in a perf.data file 'perf
kmem --stat caller', that doesn't have such event, ends up not being
able to resolve the kernel addresses.

Fix it by calling perf_session__create_kernel_maps() in __cmd_kmem().

LKML-Reference: <new-submission>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-02 16:28:20 -03:00
Arnaldo Carvalho de Melo
a4e3b956a8 perf hist: Replace ->print() routines by ->snprintf() equivalents
Then hist_entry__fprintf will just use the newly introduced
hist_entry__snprintf, add the newline and fprintf it to the supplied
FILE descriptor.

This allows us to remove the use_browser checking in the color_printf
routines, that now got color_snprintf variants too.

The newt TUI browser (and other GUIs that may come in the future) don't
have to worry about stdio specific stuff in the strings they get from
the se->snprintf routines and instead use whatever means to do the
equivalent.

Also the newt TUI browser doesn't have to use the fmemopen() hack; instead
it can use the se->snprintf routines directly. For now, though, use the
hist_entry__snprintf routine to reduce the patch size.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-02 16:28:15 -03:00
Arnaldo Carvalho de Melo
70162138c9 perf record: Add a fallback to the reference relocation symbol
Usually "_text" is enough, but I received reports that it's not always
available, so fallback to "_stext" for the symbol we use to check if we
need to apply any relocation to all the symbols in the kernel symtab,
for when, for instance, kexec is being used.
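
The fallback order can be sketched as below. `pick_ref_reloc_sym` is a hypothetical helper and the symbol table is a made-up dictionary of name to address.

```python
def pick_ref_reloc_sym(symbols, candidates=("_text", "_stext")):
    """Return (name, address) of the first candidate present, else None."""
    for name in candidates:
        if name in symbols:
            return name, symbols[name]
    return None


# Made-up addresses, for illustration only.
print(pick_ref_reloc_sym({"_text": 0x1000, "_stext": 0x1040}))
print(pick_ref_reloc_sym({"_stext": 0x1040}))
print(pick_ref_reloc_sym({}))
```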

Reported-by: Darren Hart <dvhltc@us.ibm.com>
Cc: Darren Hart <dvhltc@us.ibm.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-02 16:28:06 -03:00
Arnaldo Carvalho de Melo
c29ede615f perf tools: Allow specifying O= to build files in a separate directory
This avoids polluting the source tree with build files.

Reported-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-02 16:28:03 -03:00
Arnaldo Carvalho de Melo
8b2c551f96 perf tools: Use -o $(BITBUCKET) in one more case
As described in 1703f2c some gcc versions have issues using /dev/null, so
use the mechanism used elsewhere.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-02 16:27:59 -03:00
Arnaldo Carvalho de Melo
5f4d3f8816 perf report: Add progress bars
For when we are processing the events and inserting the entries in the
browser.

Experimentation here: naming "ui_something" we may be treading into
creating a TUI/GUI set of routines that can then be implemented in terms
of multiple backends.

Also, adding things to the "browser" visibly takes more time (I guess I
should do some profiling here ;-) ) than processing the events...

That means we probably need to create a custom hist_entry browser, so
that we reuse the structures we have in place instead of duplicating
them in newt.

But progress was made and at least we can see something while long files
are being loaded, that must be one of UI 101 bullet points :-)

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-02 16:27:55 -03:00
Arnaldo Carvalho de Melo
7e5e1b1404 perf symbols: map_groups__find_symbol must return the map too
Tools need to know from which map in the map_group a symbol was resolved
to, so that, for instance, we can annotate kernel module symbols by
getting its precise name, etc.

Also add the _by_name variants for completeness.
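
The interface shape this implies can be sketched with an out parameter that
reports which map satisfied the lookup. Everything below is a hypothetical
illustration (the demo_* types, absolute symbol addresses, half-open ranges);
it does not reproduce perf's struct map, struct map_groups, or the real
function signatures.

```c
#include <stddef.h>

/* Hypothetical symbol: covers the half-open address range [start, end). */
struct demo_symbol {
	const char *name;
	unsigned long long start, end;
};

/* Hypothetical map: one kernel or module mapping with its own symbols. */
struct demo_map {
	const char *dso_name;
	unsigned long long start, end;
	const struct demo_symbol *syms;
	size_t nr_syms;
};

/*
 * Resolve addr to a symbol. If mapp is non-NULL, *mapp is set to the map
 * the symbol came from, or to NULL when nothing matched.
 */
static const struct demo_symbol *
demo_maps__find_symbol(const struct demo_map *maps, size_t nr_maps,
		       unsigned long long addr, const struct demo_map **mapp)
{
	if (mapp != NULL)
		*mapp = NULL;

	for (size_t i = 0; i < nr_maps; i++) {
		const struct demo_map *map = &maps[i];

		if (addr < map->start || addr >= map->end)
			continue;

		for (size_t j = 0; j < map->nr_syms; j++) {
			const struct demo_symbol *sym = &map->syms[j];

			if (addr >= sym->start && addr < sym->end) {
				if (mapp != NULL)
					*mapp = map;
				return sym;
			}
		}
	}
	return NULL;
}
```

A caller that wants to annotate a module symbol can then read the module name
from the returned map instead of searching for it a second time.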

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-02 16:27:43 -03:00