linux/tools/perf
Frederic Weisbecker b67577dfb4 perf lock: Drop the buffers multiplexing dependency
We need to deal with time ordered events to build a correct
state machine of lock events. This is why we multiplex the lock
events buffers. But the ordering is done from the kernel, on
the tracing fast path, leading to high contention between cpus.

Without multiplexing, the events appears in a weak order.
If we have four events, each split per cpu, perf record will
read the events buffers in the following order:

[ CPU0 ev0, CPU0 ev1, CPU0 ev3, CPU0 ev4, CPU1 ev0, CPU1 ev0....]

To handle a post processing reordering, we could just read and sort
the whole in memory, but it just doesn't scale with high amounts
of events: lock events can fill huge amounts in few times.

Basically we need to sort in memory and find a "grace period"
point when we know that a given slice of previously sorted events
can be committed for post-processing, so that we can unload the
memory usage step by step and keep a scalable sorting list.

There is no strong rules about how to define such "grace period".
What does this patch is:

We define a FLUSH_PERIOD value that defines a grace period in
seconds.
We want to have a slice of events covering 2 * FLUSH_PERIOD in our
sorted list.
If FLUSH_PERIOD is big enough, it ensures every events that occured
in the first half of the timeslice have all been buffered and there
are none remaining and there won't be further to put inside this
first timeslice. Then once we reach the 2 * FLUSH_PERIOD
timeslice, we flush the first half to be gentle with the memory
(the second half can still get new events in the middle, so wait
another period to flush it)

FLUSH_PERIOD is defined to 5 seconds. Say the first event started on
time t0. We can safely assume that at the time we are processing
events of t0 + 10 seconds, ther won't be anymore events to read
from perf.data that occured between t0 and t0 + 5 seconds. Hence
we can safely flush the first half.

To point out funky bugs, we have a guardian that checks a new event
timestamp is not below the last event's timestamp flushed and that
displays a warning in this case.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
2010-02-27 17:06:19 +01:00
..
bench perf sched: Fix build failure on sparc 2009-12-14 08:59:12 +01:00
Documentation perf lock: Fix and add misc documentally things 2010-02-27 17:05:22 +01:00
scripts perf/scripts: Add syscall tracing scripts 2010-02-25 04:07:48 +01:00
util Merge commit 'v2.6.33' into perf/core 2010-02-27 16:18:46 +01:00
.gitignore perf: Ignore perf-archive temp file 2010-01-29 10:37:33 +01:00
builtin-annotate.c perf annotate: Handle samples not at objdump output addr boundaries 2010-02-26 15:42:49 +01:00
builtin-bench.c perf bench: Add "all" pseudo subsystem and "all" pseudo suite 2009-12-14 08:51:19 +01:00
builtin-buildid-cache.c perf buildid-cache: Add new command to manage build-id cache 2010-01-21 08:31:29 +01:00
builtin-buildid-list.c perf build-id: Move the routine to find DSOs with hits to the lib 2010-02-04 09:33:26 +01:00
builtin-diff.c perf tools: Don't cast RIP to pointers 2010-01-16 10:58:45 +01:00
builtin-help.c perf: Make cmd_to_page() function more compact 2010-01-13 10:53:51 +01:00
builtin-kmem.c perf symbols: Remove perf_session usage in symbols layer 2010-02-04 09:33:24 +01:00
builtin-list.c perf list: Fix large list output by using the pager 2009-08-13 09:05:48 +02:00
builtin-lock.c perf lock: Drop the buffers multiplexing dependency 2010-02-27 17:06:19 +01:00
builtin-probe.c perf probe: Don't use a perf_session instance just to resolve symbols 2010-02-04 09:33:26 +01:00
builtin-record.c perf record: Fix existing process callgraph symbol 2010-02-08 16:55:52 +01:00
builtin-report.c Merge branch 'perf/urgent' into perf/core 2010-01-29 10:36:22 +01:00
builtin-sched.c perf tools: Don't cast RIP to pointers 2010-01-16 10:58:45 +01:00
builtin-stat.c perf tools: Fix --pid option for stat 2010-01-13 10:09:08 +01:00
builtin-timechart.c Merge branch 'perf/urgent' into perf/core 2010-01-29 10:36:22 +01:00
builtin-top.c Merge commit 'v2.6.33' into perf/core 2010-02-27 16:18:46 +01:00
builtin-trace.c perf/scripts: Add Python scripting engine 2010-02-25 04:07:29 +01:00
builtin.h perf lock: Introduce new tool "perf lock", for analyzing lock statistics 2010-01-31 09:08:26 +01:00
command-list.txt perf lock: Fix and add misc documentally things 2010-02-27 17:05:22 +01:00
CREDITS perf_counter tools: Add CREDITS file for Git contributors 2009-06-24 19:54:29 +02:00
design.txt perf: Fix few typos + cosmetics 2010-01-13 17:39:44 +01:00
Makefile perf/scripts: Add syscall tracing scripts 2010-02-25 04:07:48 +01:00
perf-archive.sh perf archive: Add helper script to package files needed to do analysis 2010-01-16 10:58:49 +01:00
perf.c perf lock: Introduce new tool "perf lock", for analyzing lock statistics 2010-01-31 09:08:26 +01:00
perf.h perf tools: Allow building for ARM 2009-12-11 13:50:21 +01:00