Impact: build fix
On !CONFIG_CONTEXT_SWITCH_TRACER trace_find_cmdline() is not defined:
kernel/trace/trace_output.c: In function 'trace_ctxwake_print':
kernel/trace/trace_output.c:499: error: implicit declaration of function 'trace_find_cmdline'
kernel/trace/trace_output.c:499: warning: assignment makes pointer from integer without a cast
Move it to the generic section in trace.h.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: extend the tracing API
The goal of this patch is to normalize and make more easy the
implementation of statistical (histogram) tracing.
It implements a trace_stat file into the /debugfs/tracing directory where
one can print a one-shot output of statistics/histogram entries.
A tracer has to provide two basic iterator callbacks:
stat_start() => the first entry
stat_next(prev, idx) => the next one.
Note that it is adapted for arrays or hash tables or lists.... since it
provides a pointer to the previous entry and the current index of the
iterator.
These two callbacks are called to get a snapshot of the statistics at each
opening of the trace_stat file because. The values are so updated between
two "cat trace_stat". And the tracer is free to lock its datas during the
iteration to keep consistent values.
Since it is almost always interesting to sort statisticals values to
address the problems by priority, this infrastructure provides a "sorting"
of the stat entries too if desired. A tracer has just to provide a
stat_cmp callback to compare two entries and the stat tracing
infrastructure will build a sorted list of the given entries.
A last callback, called stat_headers, can be implemented by a tracer to
output headers on its trace.
If one of these callbacks is changed on runtime, it just have to signal it
to the stat tracing API by calling the init_tracer_stat() helper.
Changes in V2:
- Fix a memory leak if the user opens multiple times the trace_stat file
without closing it. Now we always free our list before rebuilding it.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: rework trace.c to use new event register API
Almost every ftrace event has to implement its output display in
trace.c through a different function. Some events did not handle
all the formats (trace, latency-trace, raw, hex, binary), and
this method does not scale well.
This patch converts the format functions to use the event API to
find the event and and print its format. Currently, we have
a print function for trace, latency_trace, raw, hex and binary.
A trace_nop_print is available if the event wants to avoid output
on a particular format.
Perhaps other tracers could use this in the future (like mmiotrace and
function_graph).
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: simplify/generalize/refactor trace.c
The trace.c file is becoming more difficult to maintain due to the
growing number of events. There is several formats that an event may
be printed. This patch sets up the infrastructure of an event hash to
allow for events to register how they should be printed.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup, remove obsolete code
Now that the ring buffer used by ftrace allows for variable length
entries, we do not need the 'cont' feature of the buffer. This code
makes other parts of ftrace more complex and by removing this it
simplifies the ftrace code.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'tracing-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (241 commits)
sched, trace: update trace_sched_wakeup()
tracing/ftrace: don't trace on early stage of a secondary cpu boot, v3
Revert "x86: disable X86_PTRACE_BTS"
ring-buffer: prevent false positive warning
ring-buffer: fix dangling commit race
ftrace: enable format arguments checking
x86, bts: memory accounting
x86, bts: add fork and exit handling
ftrace: introduce tracing_reset_online_cpus() helper
tracing: fix warnings in kernel/trace/trace_sched_switch.c
tracing: fix warning in kernel/trace/trace.c
tracing/ring-buffer: remove unused ring_buffer size
trace: fix task state printout
ftrace: add not to regex on filtering functions
trace: better use of stack_trace_enabled for boot up code
trace: add a way to enable or disable the stack tracer
x86: entry_64 - introduce FTRACE_ frame macro v2
tracing/ftrace: add the printk-msg-only option
tracing/ftrace: use preempt_enable_no_resched_notrace in ring_buffer_time_stamp()
x86, bts: correctly report invalid bts records
...
Fixed up trivial conflict in scripts/recordmcount.pl due to SH bits
being already partly merged by the SH merge.
Impact: tracer output improvement
Ending newlines are appended automatically on comments by the function
graph tracer because the newline needs to be placed after the "*/"
comment characters.
So if the user puts an ending newline, we want to strip it.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: extend the wakeup tracepoint with the info whether the wakeup was real
Add the information needed to distinguish 'real' wakeups from 'false'
wakeups.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: eliminate false WARN_ON message
If an interrupt goes off after the setting of the local variable
tail_page and before incrementing the write index of that page,
the interrupt could push the commit forward to the next page.
Later a check is made to see if interrupts pushed the buffer around
the entire ring buffer by comparing the next page to the last commited
page. This can produce a false positive if the interrupt had pushed
the commit page forward as stated above.
Thanks to Jiaying Zhang for finding this race.
Reported-by: Jiaying Zhang <jiayingz@google.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix stuck trace-buffers
If an interrupt comes in during the rb_set_commit_to_write and
pushes the tail page forward just at the right time, the commit
updates will miss the adding of the interrupt data. This will
cause the commit pointer to cease from moving forward.
Thanks to Jiaying Zhang for finding this race.
Reported-by: Jiaying Zhang <jiayingz@google.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup
This patch factors out common code from multiple tracers into a
tracing_reset_online_cpus() function and converts the tracers to use it.
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
these warnings:
kernel/trace/trace_sched_switch.c: In function ‘tracing_sched_register’:
kernel/trace/trace_sched_switch.c:96: warning: passing argument 1 of ‘register_trace_sched_wakeup_new’ from incompatible pointer type
kernel/trace/trace_sched_switch.c:112: warning: passing argument 1 of ‘unregister_trace_sched_wakeup_new’ from incompatible pointer type
kernel/trace/trace_sched_switch.c: In function ‘tracing_sched_unregister’:
kernel/trace/trace_sched_switch.c:121: warning: passing argument 1 of ‘unregister_trace_sched_wakeup_new’ from incompatible pointer type
Trigger because sched_wakeup_new tracepoints need the same trace
signature as sched_wakeup - which was changed recently.
Fix it.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
this warning:
kernel/trace/trace.c: In function ‘print_lat_fmt’:
kernel/trace/trace.c:1826: warning: unused variable ‘state’
Triggers because 'state' has become unused - remove it.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: remove dead code
struct ring_buffer.size is not set after ring_buffer is initialized
or resized. it is always 0.
we can use "buffer->pages * PAGE_SIZE" to get ring_buffer's size
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix occasionally incorrect trace output
The tracing code has interesting varieties of printing out task state.
Unfortunalely only one of the instances is correct as it copies the
code from sched.c:sched_show_task(). The others are plain wrong as
they treatthe bitfield as an integer offset into the character
array. Also the size check of the character array is wrong as it
includes the trailing \0.
Use a common state decoder inline which does the Right Thing.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: enhancement
Ingo Molnar has asked about a way to remove items from the filter
lists. Currently, you can only add or replace items. The way
items are added to the list is through opening one of the list
files (set_ftrace_filter or set_ftrace_notrace) via append.
If the file is opened for truncate, the list is cleared.
echo spin_lock > /debug/tracing/set_ftrace_filter
The above will replace the list with only spin_lock
echo spin_lock >> /debug/tracing/set_ftrace_filter
The above will add spin_lock to the list.
Now this patch adds:
echo '!spin_lock' >> /debug/tracing/set_ftrace_filter
This will remove spin_lock from the list.
The limited glob features of these lists also can be notted.
echo '!spin_*' >> /debug/tracing/set_ftrace_filter
This will remove all functions that start with 'spin_'
Note:
echo '!spin_*' > /debug/tracing/set_ftrace_filter
will simply clear out the list (notice the '>' instead of '>>')
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: clean up
Andrew Morton suggested to use the stack_tracer_enabled variable
to decide whether or not to start stack tracing on bootup.
This lets us remove the start_stack_trace variable.
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: enhancement to stack tracer
The stack tracer currently is either on when configured in or
off when it is not. It can not be disabled when it is configured on.
(besides disabling the function tracer that it uses)
This patch adds a way to enable or disable the stack tracer at
run time. It defaults off on bootup, but a kernel parameter 'stacktrace'
has been added to enable it on bootup.
A new sysctl has been added "kernel.stack_tracer_enabled" to let
the user enable or disable the stack tracer at run time.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: display ftrace_printk messages "as is"
By default, ftrace_printk() messages find their output with some other
informations like pid, caller, ...
Sometimes a developer just want to have the ftrace_printk left "as is", without
other information.
This is done by providing a default-off option called printk-msg-only.
To enable it, just do `echo printk-msg-only > /debugfs/tracing/trace_options`
Before the patch:
<...>-2739 [000] 145.692153: __might_sleep: I'm an ftrace_printk msg in __might_sleep
<...>-2739 [000] 145.692155: __might_sleep: I'm another ftrace_printk msg in __might_sleep
After the patch and the printk-msg-only option enabled:
I'm an ftrace_printk msg in __might_sleep
I'm another ftrace_printk msg in __might_sleep
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: prevent a trace recursion
After some tests with function graph tracer under x86-32, I saw some recursions
caused by ring_buffer_time_stamp() that calls preempt_enable_no_notrace() which
calls preempt_schedule() which is traced itself.
This patch re-enables preemption without rescheduling.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: change calling convention of existing cpumask APIs
Most cpumask functions started with cpus_: these have been replaced by
cpumask_ ones which take struct cpumask pointers as expected.
These four functions don't have good replacement names; fortunately
they're rarely used, so we just change them over.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: paulus@samba.org
Cc: mingo@redhat.com
Cc: tony.luck@intel.com
Cc: ralf@linux-mips.org
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: cl@linux-foundation.org
Cc: srostedt@redhat.com
Impact: restructure code, cleanup
Remove BTS bits from the hw-branch-tracer (renamed from bts-tracer) and
use the ds interface.
Signed-off-by: Markus Metzger <markut.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
I added EXPORT_SYMBOL_GPLs for all functions part of the API
(ring_buffer.h). This is required since oprofile is using the ring
buffer and the compilation as modules would fail otherwise.
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
this warning:
kernel/trace/trace.c: In function ‘trace_vprintk’:
kernel/trace/trace.c:3626: warning: ‘flags’ may be used uninitialized in this function
shows some confusion about irq_flags / flags use here. We already have
irq_flags so remove the extra flags variable.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: Provide a way to pause the function graph tracer
As suggested by Steven Rostedt, the previous patch that prevented from
spinlock function tracing shouldn't use the raw_spinlock to fix it.
It's much better to follow lockdep with normal spinlock, so this patch
adds a new flag for each task to make the function graph tracer able
to be paused. We also can send an ftrace_printk whithout worrying of
the irrelevant traced spinlock during insertion.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup
Apply some suggestions of Steven Rostedt:
_turn tracing_selftest_running into a simple int (no need of an atomic_t)
_set it __read_mostly
_fix a comment style
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: provide trace headers to explain a bit the output
This patch implements the print_headers callback for the function graph
tracer. These headers are output according to the current trace options.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: clean up
Using (struct pid *)-1 as the pointer for ftrace_swapper_pid is
a little confusing for others. This patch uses the address of the
actual init pid structure instead. This change is only for
clarity. It does not affect the code itself. Hopefully soon the
swapper tasks will all have their own pid structure and then
we can clean up the code a bit more.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup
As suggested by Steven Rostedt, this patch provide a new macro
task_curr_ret_stack() to move the cpp conditionnal CONFIG into
the linux/ftrace.h headers.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix default empty traces on function-graph-tracer
The actual ftrace_trace_task() checks if ftrace_pid_trace is allocated
and return 1 if it is true.
If it is NULL, it will check the bit of pid tracing flag for the current
task (which are not set by default).
So by default, a task is not traced.
Actually all tasks should be traced by default and filter_by_pid when
ftrace_pid_trace is allocated.
The appropriate condition should be to return 1 if filter_by_pid is
set.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acke-dby: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix tracer selfstests false results
After setting a ftrace_printk somewhere in th kernel, I saw the
Function tracer selftest failing.
When a selftest occurs, the ring buffer is lurked to see if
some entries were inserted. But concurrent insertion such as
ftrace_printk could occured at the same time and could give
false positive or negative results.
This patch prevent prevent from TRACE_PRINT entries insertion
during selftests.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Handle the TRACE_PRINT entries from the function grapg tracer
and output them as a C comment just below the function that called
it, as if it was a comment inside this function.
Example with an ftrace_printk inside might_sleep() function:
void __might_sleep(char *file, int line)
{
static unsigned long prev_jiffy; /* ratelimiting */
ftrace_printk("Hi I'm a comment in might_sleep() :-)");
A chunk of a resulting trace:
0) | _reiserfs_free_block() {
0) | reiserfs_read_bitmap_block() {
0) | __bread() {
0) | __getblk() {
0) | __find_get_block() {
0) 0.698 us | mark_page_accessed();
0) 2.267 us | }
0) | __might_sleep() {
0) | /* Hi I'm a comment in might_sleep() :-) */
0) 1.321 us | }
0) 5.872 us | }
0) 7.313 us | }
0) 8.718 us | }
And this patch brings two minor fixes:
- The newline after a switch-out task has disappeared
- The "|" sign just before the cpu number on task-switch has been deleted.
0) 0.616 us | pick_next_task_rt();
0) 1.457 us | _spin_trylock();
0) 0.653 us | _spin_unlock();
0) 0.728 us | _spin_trylock();
0) 0.631 us | _spin_unlock();
0) 0.729 us | native_load_sp0();
0) 0.593 us | native_load_tls();
------------------------------------------
0) cat-2834 => migrati-3
------------------------------------------
0) | finish_task_switch() {
0) 0.841 us | _spin_unlock_irq();
0) 0.616 us | post_schedule_rt();
0) 3.882 us | }
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix a bug in function filter setting
when writing function to set_graph_function, we should check whether it
has existed in set_graph_function to avoid duplicating.
Signed-off-by: Liming Wang <liming.wang@windriver.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: new feature
This patch lets the swapper tasks of all CPUS be filtered by the
set_ftrace_pid file.
If '0' is echoed into this file, then all the idle tasks (aka swapper)
is flagged to be traced. This affects all CPU idle tasks.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: clean up, extend PID filtering to PID namespaces
Eric Biederman suggested using the struct pid for filtering on
pids in the kernel. This patch is based off of a demonstration
of an implementation that Eric sent me in an email.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: New feature
This patch makes the changes to set_ftrace_pid apply to the function
graph tracer.
# echo $$ > /debugfs/tracing/set_ftrace_pid
# echo function_graph > /debugfs/tracing/current_tracer
Will cause only the current task to be traced. Note, the trace flags are
also inherited by child processes, so the children of the shell
will also be traced.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: clean up
Use the new task struct trace flags to determine if a process should be
traced or not.
Note: this moves the searching of the pid to the slow path of setting
the pid field. This needs to be converted to the pid name space.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch adds the file:
/debugfs/tracing/set_graph_function
which can be used along with the function graph tracer.
When this file is empty, the function graph tracer will act as
usual. When the file has a function in it, the function graph
tracer will only trace that function.
For example:
# echo blk_unplug > /debugfs/tracing/set_graph_function
# cat /debugfs/tracing/trace
[...]
------------------------------------------
| 2) make-19003 => kjournald-2219
------------------------------------------
2) | blk_unplug() {
2) | dm_unplug_all() {
2) | dm_get_table() {
2) 1.381 us | _read_lock();
2) 0.911 us | dm_table_get();
2) 1. 76 us | _read_unlock();
2) + 12.912 us | }
2) | dm_table_unplug_all() {
2) | blk_unplug() {
2) 0.778 us | generic_unplug_device();
2) 2.409 us | }
2) 5.992 us | }
2) 0.813 us | dm_table_put();
2) + 29. 90 us | }
2) + 34.532 us | }
You can add up to 32 functions into this file. Currently we limit it
to 32, but this may change with later improvements.
To add another function, use the append '>>':
# echo sys_read >> /debugfs/tracing/set_graph_function
# cat /debugfs/tracing/set_graph_function
blk_unplug
sys_read
Using the '>' will clear out the function and write anew:
# echo sys_write > /debug/tracing/set_graph_function
# cat /debug/tracing/set_graph_function
sys_write
Note, if you have function graph running while doing this, the small
time between clearing it and updating it will cause the graph to
record all functions. This should not be an issue because after
it sets the filter, only those functions will be recorded from then on.
If you need to only record a particular function then set this
file first before starting the function graph tracer. In the future
this side effect may be corrected.
The set_graph_function file is similar to the set_ftrace_filter but
it does not take wild cards nor does it allow for more than one
function to be set with a single write. There is no technical reason why
this is the case, I just do not have the time yet to implement that.
Note, dynamic ftrace must be enabled for this to appear because it
uses the dynamic ftrace records to match the name to the mcount
call sites.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Conflicts:
fs/nfsd/nfs4recover.c
Manually fixed above to use new creds API functions, e.g.
nfs4_save_creds().
Signed-off-by: James Morris <jmorris@namei.org>
Impact: fix to output of stack trace
If a function is not found in the stack of the stack tracer, the
number printed is quite strange. This fixes the algorithm to handle
missing functions better.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
CONFIG_FUNCTION_GRAPH_TRACER depends on FUNCTION_TRACER already,
(turning it non-default) so it so making it default-n is pointless.
So enable it by default - it's a nice extension of the function tracer.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: better trace output of duration for long calls
The old duration output didn't exceeded 9999.999 us to fit the column
and the nanosecs were always 3 numbers. As Ingo suggested, it's better
to have the whole microseconds elapsed time and shift the nanosecs precision
if needed to fit the maximum 7 numbers. And usec need more number, the case
should be rare and important enough to break a bit the column alignment to
show it.
So, depending of the duration value, we now have these patterns:
u.nnn us
uu.nnn us
uuu.nnn us
uuuu.nnn us
uuuuu.nn us
uuuuuu.n us
uuuuuuuu..... us
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: extend function-graph output: let one know which thread called a function
This patch implements a helper function to print the couple cmdline/pid.
Its output is provided during task switching and on each row if the new
"funcgraph-proc" defualt-off option is set through trace_options file.
The output is center aligned and never exceeds 14 characters. The cmdline
is truncated over 7 chars.
But note that if the pid exceeds 6 characters, the column will overflow (but
the situation is abnormal).
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: feature, let entry function decide to trace or not
This patch lets the graph tracer entry function decide if the tracing
should be done at the end as well. This requires all function graph
entry functions return 1 if it should trace, or 0 if the return should
not be traced.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: clean up
Andrew Morton pointed out that the kernel convention of a variable
named page should be of type page struct. The ring buffer uses
a variable named "page" for a pointer to something else.
This patch converts those to be called "bpage" (as in "buffer page").
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: new ftrace_graph_stop function
While developing more features of function graph, I hit a bug that
caused the WARN_ON to trigger in the prepare_ftrace_return function.
Well, it was hard for me to find out that was happening because the
bug would not print, it would just cause a hard lockup or reboot.
The reason is that it is not safe to call printk from this function.
Looking further, I also found that it calls unregister_ftrace_graph,
which grabs a mutex and calls kstop machine. This would definitely
lock the box up if it were to trigger.
This patch adds a fast and safe ftrace_graph_stop() which will
stop the function tracer. Then it is safe to call the WARN ON.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: new API to ring buffer
This patch adds a new interface into the ring buffer that allows a
page to be read from the ring buffer on a given CPU. For every page
read, one must also be given to allow for a "swap" of the pages.
rpage = ring_buffer_alloc_read_page(buffer);
if (!rpage)
goto err;
ret = ring_buffer_read_page(buffer, &rpage, cpu, full);
if (!ret)
goto empty;
process_page(rpage);
ring_buffer_free_read_page(rpage);
The caller of these functions must handle any waits that are
needed to wait for new data. The ring_buffer_read_page will simply
return 0 if there is no data, or if "full" is set and the writer
is still on the current page.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: get ready for splice changes
This patch moves the commit and timestamp into the beginning of each
data page of the buffer. This change will allow the page to be moved
to another location (disk, network, etc) and still have information
in the page to be able to read it.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix for lockdep and ftrace
The raw_local_irq_save/restore confuses lockdep. This patch
converts them to the local_irq_save/restore variants.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: extend and enable the function graph tracer to 64-bit x86
This patch implements the support for function graph tracer under x86-64.
Both static and dynamic tracing are supported.
This causes some small CPP conditional asm on arch/x86/kernel/ftrace.c I
wanted to use probe_kernel_read/write to make the return address
saving/patching code more generic but it causes tracing recursion.
That would be perhaps useful to implement a notrace version of these
function for other archs ports.
Note that arch/x86/process_64.c is not traced, as in X86-32. I first
thought __switch_to() was responsible of crashes during tracing because I
believed current task were changed inside but that's actually not the
case (actually yes, but not the "current" pointer).
So I will have to investigate to find the functions that harm here, to
enable tracing of the other functions inside (but there is no issue at
this time, while process_64.c stays out of -pg flags).
A little possible race condition is fixed inside this patch too. When the
tracer allocate a return stack dynamically, the current depth is not
initialized before but after. An interrupt could occur at this time and,
after seeing that the return stack is allocated, the tracer could try to
trace it with a random uninitialized depth. It's a prevention, even if I
hadn't problems with it.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tim Bird <tim.bird@am.sony.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix "no output from tracer" bug caused by ftrace_update_pid_func()
When disabling single thread function trace using
"echo -1 > set_ftrace_pid", the normal function trace
has to restore to original function, otherwise the normal
function trace will not work well.
Without this commit, something like below:
$ ps |grep 850
850 root 2556 S -/bin/sh
$ echo 850 > /debug/tracing/set_ftrace_pid
$ echo function > /debug/tracing/current_tracer
$ echo 1 > /debug/tracing/tracing_enabled
$ sleep 1
$ echo 0 > /debug/tracing/tracing_enabled
$ cat /debug/tracing/trace_pipe |wc -l
59704
$ echo -1 > /debug/tracing/set_ftrace_pid
$ echo 1 > /debug/tracing/tracing_enabled
$ sleep 1
$ echo 0 > /debug/tracing/tracing_enabled
$ more /debug/tracing/trace_pipe
<====== nothing output now!
it should output trace record.
Signed-off-by: Liming Wang <liming.wang@windriver.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix build error on branch tracer
This should fix a build error reported on alpha in linux-next:
CC kernel/trace/trace_branch.o
kernel/trace/trace_branch.c: In function 'probe_likely_condition':
kernel/trace/trace_branch.c:44: error: implicit declaration of function 'raw_local_irq_save'
kernel/trace/trace_branch.c:76: error: implicit declaration of function 'raw_local_irq_restore'
Unfortunately, I can't test it since I don't have any Alpha build environment.
Reported-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: make ftrace position computing more sane
First remove useless ->pos field. Then we needn't check seq_printf
in .show like other place.
Signed-off-by: Liming Wang <liming.wang@windriver.com>
Reviewed-by: Bruce Ashfield <bruce.ashfield@windriver.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
There are architectures that still have no stacktrace support.
Signed-off-by: Török Edwin <edwintorok@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: increase the visual qualities of the call-graph-tracer output
This patch applies various trace output formatting changes:
- CPU is now a decimal number, followed by a parenthesis.
- Overhead is now on the second column (gives a good visibility)
- Cost is now on the third column, can't exceed 9999.99 us. It is
followed by a virtual line based on a "|" character.
- Functions calls are now the last column on the right. This way, we
haven't dynamic column (which flow is harder to follow) on its right.
- CPU and Overhead have their own option flag. They are default-on but you
can disable them easily:
echo nofuncgraph-cpu > trace_options
echo nofuncgraph-overhead > trace_options
TODO:
_ Refactoring of the thread switch output.
_ Give a default-off option to output the thread and its pid on each row.
_ Provide headers
_ ....
Here is an example of the new trace style:
0) | mutex_unlock() {
0) 0.639 us | __mutex_unlock_slowpath();
0) 1.607 us | }
0) | remove_wait_queue() {
0) 0.616 us | _spin_lock_irqsave();
0) 0.616 us | _spin_unlock_irqrestore();
0) 2.779 us | }
0) 0.495 us | n_tty_set_room();
0) ! 9999.999 us | }
0) | tty_ldisc_deref() {
0) 0.615 us | _spin_lock_irqsave();
0) 0.616 us | _spin_unlock_irqrestore();
0) 2.793 us | }
0) | current_fs_time() {
0) 0.488 us | current_kernel_time();
0) 0.495 us | timespec_trunc();
0) 2.486 us | }
0) ! 9999.999 us | }
0) ! 9999.999 us | }
0) ! 9999.999 us | }
0) | sys_read() {
0) 0.796 us | fget_light();
0) | vfs_read() {
0) | rw_verify_area() {
0) | security_file_permission() {
0) 0.488 us | cap_file_permission();
0) 1.720 us | }
0) 3. 4 us | }
0) | tty_read() {
0) 0.488 us | tty_paranoia_check();
0) | tty_ldisc_ref_wait() {
0) | tty_ldisc_try() {
0) 0.615 us | _spin_lock_irqsave();
0) 0.615 us | _spin_unlock_irqrestore();
0) 5.436 us | }
0) 6.427 us | }
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: enhance the output of the graph-tracer
This patch applies some ideas of Ingo Molnar and Steven Rostedt.
* Output leaf functions in one line with parenthesis, semicolon and duration
output.
* Add a second column (after cpu) for an overhead sign.
if duration > 100 us, "!"
if duration > 10 us, "+"
else " "
* Print output in us with remaining nanosec: u.n
* Print duration on the right end, following the indentation of the functions.
Use also visual clues: "-" on entry call (no duration to output) and "+" on
return (duration output).
The name of the tracer has been fixed as well: function-branch becomes
function_branch.
Here is an example of the new output:
CPU[000] dequeue_entity() { -
CPU[000] update_curr() { -
CPU[000] update_min_vruntime(); + 0.512 us
CPU[000] } + 1.504 us
CPU[000] clear_buddies(); + 0.481 us
CPU[000] update_min_vruntime(); + 0.504 us
CPU[000] } + 4.557 us
CPU[000] hrtick_update() { -
CPU[000] hrtick_start_fair(); + 0.489 us
CPU[000] } + 1.443 us
CPU[000] + } + 14.655 us
CPU[000] + } + 15.678 us
CPU[000] + } + 16.686 us
CPU[000] msecs_to_jiffies(); + 0.481 us
CPU[000] put_prev_task_fair(); + 0.504 us
CPU[000] pick_next_task_fair(); + 0.482 us
CPU[000] pick_next_task_rt(); + 0.504 us
CPU[000] pick_next_task_fair(); + 0.481 us
CPU[000] pick_next_task_idle(); + 0.489 us
CPU[000] _spin_trylock(); + 0.655 us
CPU[000] _spin_unlock(); + 0.609 us
CPU[000] ------------8<---------- thread bash-2794 ------------8<----------
CPU[000] finish_task_switch() { -
CPU[000] _spin_unlock_irq(); + 0.722 us
CPU[000] } + 2.369 us
CPU[000] ! } + 501972.605 us
CPU[000] ! } + 501973.763 us
CPU[000] copy_from_read_buf() { -
CPU[000] _spin_lock_irqsave(); + 0.670 us
CPU[000] _spin_unlock_irqrestore(); + 0.699 us
CPU[000] copy_to_user() { -
CPU[000] might_fault() { -
CPU[000] __might_sleep(); + 0.503 us
CPU[000] } + 1.632 us
CPU[000] __copy_to_user_ll(); + 0.542 us
CPU[000] } + 3.858 us
CPU[000] tty_audit_add_data() { -
CPU[000] _spin_lock_irq(); + 0.609 us
CPU[000] _spin_unlock_irq(); + 0.624 us
CPU[000] } + 3.196 us
CPU[000] _spin_lock_irqsave(); + 0.624 us
CPU[000] _spin_unlock_irqrestore(); + 0.625 us
CPU[000] + } + 13.611 us
CPU[000] copy_from_read_buf() { -
CPU[000] _spin_lock_irqsave(); + 0.624 us
CPU[000] _spin_unlock_irqrestore(); + 0.616 us
CPU[000] } + 2.820 us
CPU[000]
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: prevent unnecessary stack recursion
if the resched flag was set before we entered, then don't reschedule.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: new "power-tracer" ftrace plugin
This patch adds a C/P-state ftrace plugin that will generate
detailed statistics about the C/P-states that are being used,
so that we can look at detailed decisions that the C/P-state
code is making, rather than the too high level "average"
that we have today.
An example way of using this is:
mount -t debugfs none /sys/kernel/debug
echo cstate > /sys/kernel/debug/tracing/current_tracer
echo 1 > /sys/kernel/debug/tracing/tracing_enabled
sleep 1
echo 0 > /sys/kernel/debug/tracing/tracing_enabled
cat /sys/kernel/debug/tracing/trace | perl scripts/trace/cstate.pl > out.svg
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: enhancement for function graph tracer
When run on a SMP box, the function graph tracer is confusing because
it shows the different CPUS as changes in the trace.
This patch adds the annotation of 'CPU[###]' where ### is a three digit
number. The output will look similar to this:
CPU[001] dput() {
CPU[000] } 726
CPU[001] } 487
CPU[000] do_softirq() {
CPU[001] } 2221
CPU[000] __do_softirq() {
CPU[000] __local_bh_disable() {
CPU[001] unroll_tree_refs() {
CPU[000] } 569
CPU[001] } 501
CPU[000] rcu_process_callbacks() {
CPU[001] kfree() {
What makes this nice is that now you can grep the file and produce
readable format for a particular CPU.
# cat /debug/tracing/trace > /tmp/trace
# grep '^CPU\[000\]' /tmp/trace > /tmp/trace0
# grep '^CPU\[001\]' /tmp/trace > /tmp/trace1
Will give you:
# head /tmp/trace0
CPU[000] ------------8<---------- thread sshd-3899 ------------8<----------
CPU[000] inotify_dentry_parent_queue_event() {
CPU[000] } 2531
CPU[000] inotify_inode_queue_event() {
CPU[000] } 505
CPU[000] } 69626
CPU[000] } 73089
CPU[000] audit_syscall_exit() {
CPU[000] path_put() {
CPU[000] dput() {
# head /tmp/trace1
CPU[001] ------------8<---------- thread pcscd-3446 ------------8<----------
CPU[001] } 4186
CPU[001] dput() {
CPU[001] } 543
CPU[001] vfs_permission() {
CPU[001] inode_permission() {
CPU[001] shmem_permission() {
CPU[001] generic_permission() {
CPU[001] } 501
CPU[001] } 2205
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: enhancement to function graph tracer
Export the trace_find_cmdline so the function graph tracer can
use it to print the comms of the threads.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: feature
This patch enables function tracing and function return to run together.
I've tested this by enabling the stack tracer and return tracer, where
both the function entry and function return are used together with
dynamic ftrace.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: more efficient code for ftrace graph tracer
This patch uses the dynamic patching, when available, to patch
the function graph code into the kernel.
This patch will ease the way for letting both function tracing
and function graph tracing run together.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: feature to function trace a single thread
This patch adds the ability to function trace a single thread.
The file:
/debugfs/tracing/set_ftrace_pid
contains the pid to trace. Valid pids are any positive integer.
Writing any negative number to this file will disable the pid
tracing and the function tracer will go back to tracing all of
threads.
This feature works with both static and dynamic function tracing.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: feature
This patch sets a C-like output for the function graph tracing.
For this aim, we now call two handler for each function: one on the entry
and one other on return. This way we can draw a well-ordered call stack.
The pid of the previous trace is loosely stored to be compared against
the one of the current trace to see if there were a context switch.
Without this little feature, the call tree would seem broken at
some locations.
We could use the sched_tracer to capture these sched_events but this
way of processing is much more simpler.
2 spaces have been chosen for indentation to fit the screen while deep
calls. The time of execution in nanosecs is printed just after closed
braces, it seems more easy this way to find the corresponding function.
If the time was printed as a first column, it would be not so easy to
find the corresponding function if it is called on a deep depth.
I plan to output the return value but on 32 bits CPU, the return value
can be 32 or 64, and its difficult to guess on which case we are.
I don't know what would be the better solution on X86-32: only print
eax (low-part) or even edx (high-part).
Actually it's thee same problem when a function return a 8 bits value, the
high part of eax could contain junk values...
Here is an example of trace:
sys_read() {
fget_light() {
} 526
vfs_read() {
rw_verify_area() {
security_file_permission() {
cap_file_permission() {
} 519
} 1564
} 2640
do_sync_read() {
pipe_read() {
__might_sleep() {
} 511
pipe_wait() {
prepare_to_wait() {
} 760
deactivate_task() {
dequeue_task() {
dequeue_task_fair() {
dequeue_entity() {
update_curr() {
update_min_vruntime() {
} 504
} 1587
clear_buddies() {
} 512
add_cfs_task_weight() {
} 519
update_min_vruntime() {
} 511
} 5602
dequeue_entity() {
update_curr() {
update_min_vruntime() {
} 496
} 1631
clear_buddies() {
} 496
update_min_vruntime() {
} 527
} 4580
hrtick_update() {
hrtick_start_fair() {
} 488
} 1489
} 13700
} 14949
} 16016
msecs_to_jiffies() {
} 496
put_prev_task_fair() {
} 504
pick_next_task_fair() {
} 489
pick_next_task_rt() {
} 496
pick_next_task_fair() {
} 489
pick_next_task_idle() {
} 489
------------8<---------- thread 4 ------------8<----------
finish_task_switch() {
} 1203
do_softirq() {
__do_softirq() {
__local_bh_disable() {
} 669
rcu_process_callbacks() {
__rcu_process_callbacks() {
cpu_quiet() {
rcu_start_batch() {
} 503
} 1647
} 3128
__rcu_process_callbacks() {
} 542
} 5362
_local_bh_enable() {
} 587
} 8880
} 9986
kthread_should_stop() {
} 669
deactivate_task() {
dequeue_task() {
dequeue_task_fair() {
dequeue_entity() {
update_curr() {
calc_delta_mine() {
} 511
update_min_vruntime() {
} 511
} 2813
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup
This patch changes the name of the "return function tracer" into
function-graph-tracer which is a more suitable name for a tracing
which makes one able to retrieve the ordered call stack during
the code flow.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: add new ftrace plugin
A prototype for a BTS ftrace plug-in.
The tracer collects branch trace in a cyclic buffer for each cpu.
The tracer is not configurable and the trace for each snapshot is
appended when doing cat /debug/tracing/trace.
This is a proof of concept that will be extended with future patches
to become a (hopefully) useful tool.
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add a callback to allow an ftrace plug-in to write its own header.
Move the call to trace->open() up a few lines.
The changes are required by the BTS ftrace plug-in.
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup, move all hrtimer processing into hardirq context
This is an attempt at removing some of the hrtimer complexity by
reducing the number of callback modes to 1.
This means that all hrtimer callback functions will be ran from HARD-irq
context.
I went through all the 30 odd hrtimer callback functions in the kernel
and saw only one that I'm not quite sure of, which is the one in
net/can/bcm.c - hence I'm CC-ing the folks responsible for that code.
Furthermore, the hrtimer core now calls callbacks directly with IRQs
disabled in case you try to enqueue an expired timer. If this timer is a
periodic timer (which should use hrtimer_forward() to advance its time)
then it might be possible to end up in an inf. recursive loop due to the
fact that hrtimer_forward() doesn't round up to the next timer
granularity, and therefore keeps on calling the callback - obviously
this needs a fix.
Aside from that, this seems to compile and actually boot on my dual core
test box - although I'm sure there are some bugs in, me not hitting any
makes me certain :-)
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix mmiotrace overrun tracing
When ftrace framework moved to use the ring buffer facility, the buffer
overrun detection was broken after 2.6.27 by commit
| commit 3928a8a2d9
| Author: Steven Rostedt <rostedt@goodmis.org>
| Date: Mon Sep 29 23:02:41 2008 -0400
|
| ftrace: make work with new ring buffer
|
| This patch ports ftrace over to the new ring buffer.
The detection is now fixed by using the ring buffer API.
When mmiotrace detects a buffer overrun, it will report the number of
lost events. People reading an mmiotrace log must know if something was
missed, otherwise the data may not make sense.
Signed-off-by: Pekka Paalanen <pq@iki.fi>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix race
vma->vm_file reference is only stable while holding the mmap_sem,
so move usage of it to within the critical section.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup
User stack tracing is just implemented for x86, but it is not x86 specific.
Introduce a generic config flag, that is currently enabled only for x86.
When other arches implement it, they will have to
SELECT USER_STACKTRACE_SUPPORT.
Signed-off-by: Török Edwin <edwintorok@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix refcounting/object-access bug
Hold mmap_sem while looking up/accessing vma.
Hold the RCU lock while using the task we looked up.
Signed-off-by: Török Edwin <edwintorok@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix compiler warning
The ftrace_pointers used in the branch profiler are constant values.
They should never change. But the compiler complains when they are
passed into the debugfs_create_file as a data pointer, because the
function discards the qualifier.
This patch typecasts the parameter to debugfs_create_file back to
a void pointer. To remind the callbacks that they are pointing to
a constant value, I also modified the callback local pointers to
be const struct ftrace_pointer * as well.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: add new API to disable all of ftrace on anomalies
It case of a serious anomaly being detected (like something caught by
lockdep) it is a good idea to disable all tracing immediately, without
grabing any locks.
This patch adds ftrace_off_permanent that disables the tracers, function
tracing and ring buffers without a way to enable them again. This should
only be used when something serious has been detected.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: feature to permanently disable ring buffer
This patch adds a API to the ring buffer code that will permanently
disable the ring buffer from ever recording. This should only be
called when some serious anomaly is detected, and the system
may be in an unstable state. When that happens, shutting down the
recording to the ring buffers may be appropriate.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: feature to profile if statements
This patch adds a branch profiler for all if () statements.
The results will be found in:
/debugfs/tracing/profile_branch
For example:
miss hit % Function File Line
------- --------- - -------- ---- ----
0 1 100 x86_64_start_reservations head64.c 127
0 1 100 copy_bootdata head64.c 69
1 0 0 x86_64_start_kernel head64.c 111
32 0 0 set_intr_gate desc.h 319
1 0 0 reserve_ebda_region head.c 51
1 0 0 reserve_ebda_region head.c 47
0 1 100 reserve_ebda_region head.c 42
0 0 X maxcpus main.c 165
Miss means the branch was not taken. Hit means the branch was taken.
The percent is the percentage the branch was taken.
This adds a significant amount of overhead and should only be used
by those analyzing their system.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup on output of branch profiler
When a branch has not been taken, it does not make sense to show
a percentage incorrect or hit. This patch changes the behaviour
to print out a 'X' when the branch has not been executed yet.
For example:
correct incorrect % Function File Line
------- --------- - -------- ---- ----
2096 0 0 do_arch_prctl process_64.c 832
0 0 X do_arch_prctl process_64.c 804
2604 0 0 IS_ERR err.h 34
130228 5765 4 __switch_to process_64.c 673
0 0 X enable_TSC process_64.c 448
0 0 X disable_TSC process_64.c 431
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: clean up to make one profiler of like and unlikely tracer
The likely and unlikely profiler prints out the file and line numbers
of the annotated branches that it is profiling. It shows the number
of times it was correct or incorrect in its guess. Having two
different files or sections for that matter to tell us if it was a
likely or unlikely is pretty pointless. We really only care if
it was correct or not.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: modify+improve the userstacktrace tracing visualization feature
Store thread group leader id, and use it to lookup the address in the
process's map. We could have looked up the address on thread's map,
but the thread might not exist by the time we are called. The process
might not exist either, but if you are reading trace_pipe, that is
unlikely.
Example usage:
mount -t debugfs nodev /sys/kernel/debug
cd /sys/kernel/debug/tracing
echo userstacktrace >iter_ctrl
echo sym-userobj >iter_ctrl
echo sched_switch >current_tracer
echo 1 >tracing_enabled
cat trace_pipe >/tmp/trace&
.... run application ...
echo 0 >tracing_enabled
cat /tmp/trace
You'll see stack entries like:
/lib/libpthread-2.7.so[+0xd370]
You can convert them to function/line using:
addr2line -fie /lib/libpthread-2.7.so 0xd370
Or:
addr2line -fie /usr/lib/debug/libpthread-2.7.so 0xd370
For non-PIC/PIE executables this won't work:
a.out[+0x73b]
You need to run the following: addr2line -fie a.out 0x40073b
(where 0x400000 is the default load address of a.out)
Signed-off-by: Török Edwin <edwintorok@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: add new (default-off) tracing visualization feature
Usage example:
mount -t debugfs nodev /sys/kernel/debug
cd /sys/kernel/debug/tracing
echo userstacktrace >iter_ctrl
echo sched_switch >current_tracer
echo 1 >tracing_enabled
.... run application ...
echo 0 >tracing_enabled
Then read one of 'trace','latency_trace','trace_pipe'.
To get the best output you can compile your userspace programs with
frame pointers (at least glibc + the app you are tracing).
Signed-off-by: Török Edwin <edwintorok@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>