Commit Graph

240 Commits (f92f6e6ee35d2779aa62e70f78ad8e1cd417eb52)

Author SHA1 Message Date
James Morris 88e9d34c72 seq_file: constify seq_operations
Make all seq_operations structs const, to help mitigate against
revectoring user-triggerable function pointers.

This is derived from the grsecurity patch, although generated from scratch
because it's simpler than extracting the changes from there.

Signed-off-by: James Morris <jmorris@namei.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-23 07:39:29 -07:00
Li Zefan 79fe249c83 tracing: Fix failure path in ftrace_regex_open()
Don't forget to free trace_parser if seq_open() returned failure.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <4AB86694.4040803@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-22 10:28:57 +02:00
Li Zefan 1eb90f138b tracing: Fix failure path in ftrace_graph_write()
Don't call trace_parser_put() on uninitialized trace_parser.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <4AB86639.3000003@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-22 10:28:56 +02:00
Li Zefan 4ba7978e98 tracing: Check the return value of trace_get_user()
Return immediately if trace_get_user() returned failure.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <4AB86614.7020803@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-22 10:28:55 +02:00
Li Zefan a4ec5e0c26 function-graph: use ftrace_graph_funcs directly
No need to store ftrace_graph_funcs in file->private.

Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
LKML-Reference: <4AB32364.7020602@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-09-19 11:26:54 -04:00
Steven Rostedt b375a11a23 tracing: switch function prints from %pf to %ps
For direct function pointers (like what mcount provides) PowerPC64
requires the use of %ps, otherwise nothing is printed.

This patch converts all prints of functions retrieved through mcount
to use the %ps format from the %pf.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-09-17 15:53:40 -04:00
Atsushi Tsuji b36461da2a tracing: Fix minor bugs for __unregister_ftrace_function_probe
Fix the condition of strcmp for "*".
Also fix NULL pointer dereference when glob is NULL.

Signed-off-by: Atsushi Tsuji <a-tsuji@bk.jp.nec.com>
LKML-Reference: <4AAF6726.5090905@bk.jp.nec.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-09-16 09:08:54 -04:00
jolsa@redhat.com 689fd8b65d tracing: trace parser support for function and graph
Convert the writing to 'set_graph_function', 'set_ftrace_filter'
and 'set_ftrace_notrace' to use the generic trace_parser
'trace_get_user' function.

Removed FTRACE_ITER_CONT flag, since it's not needed after this change.

Minor fix in set_graph_function display - g_show function.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
LKML-Reference: <1252682969-3366-4-git-send-email-jolsa@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-09-11 15:20:18 -04:00
Ingo Molnar ed011b22ce Merge commit 'v2.6.31-rc9' into tracing/core
Merge reason: move from -rc5 to -rc9.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-06 06:11:42 +02:00
Jiri Olsa eda1e32855 tracing: handle broken names in ftrace filter
If one filter item (for set_ftrace_filter and set_ftrace_notrace) is being
setup by more than 1 consecutive writes (FTRACE_ITER_CONT flag), it won't
be handled corretly.

I used following program to test/verify:

[snip]
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>

int main(int argc, char **argv)
{
        int fd, i;
        char *file = argv[1];

        if (-1 == (fd = open(file, O_WRONLY))) {
                perror("open failed");
                return -1;
        }

        for(i = 0; i < (argc - 2); i++) {
                int len = strlen(argv[2+i]);
                int cnt, off = 0;

                while(len) {
                        cnt = write(fd, argv[2+i] + off, len);
                        len -= cnt;
                        off += cnt;
                }
        }

        close(fd);
        return 0;
}
[snip]

before change:
sh-4.0# echo > ./set_ftrace_filter
sh-4.0# /test ./set_ftrace_filter "sys" "_open "
sh-4.0# cat ./set_ftrace_filter
#### all functions enabled ####
sh-4.0#

after change:
sh-4.0# echo > ./set_ftrace_notrace
sh-4.0# test ./set_ftrace_notrace "sys" "_open "
sh-4.0# cat ./set_ftrace_notrace
sys_open
sh-4.0#

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
LKML-Reference: <20090811152904.GA26065@jolsa.lab.eng.brq.redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-08-18 20:39:48 -04:00
Li Zefan 3be04b471b ftrace: Simplify seqfile code
Use seq_release_private().

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
LKML-Reference: <4A891AAB.8090701@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-17 11:25:10 +02:00
Ingo Molnar 89034bc2c7 Merge branch 'linus' into tracing/core
Conflicts:
	kernel/trace/trace_events_filter.c

We use the tracing/core version.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-11 14:19:09 +02:00
Steven Rostedt 8650ae32ef tracing: only truncate ftrace files when O_TRUNC is set
The current code will truncate the ftrace files contents if O_APPEND
is not set and the file is opened in write mode. This is incorrect.
It should only truncate the file if O_TRUNC is set. Otherwise
if one of these files is opened by a C program with fopen "r+",
it will incorrectly truncate the file.

Reported-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-07-23 10:07:18 -04:00
Li Zefan 87827111a5 function-graph: Fix seqfile memory leak
Every time we cat set_graph_function, we leak memory allocated
by seq_open().

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
LKML-Reference: <4A67D907.2010500@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-07-23 09:53:23 -04:00
jolsa@redhat.com 566b0aaf79 tracing: Remove unused fields/variables
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: rostedt@goodmis.org
LKML-Reference: <1247773468-11594-2-git-send-email-jolsa@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-18 12:21:16 +02:00
Ingo Molnar 45bceffc30 Merge branch 'linus' into tracing/core
Merge reason: tracing/core was on an older, pre-rc1 base.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-18 12:20:01 +02:00
Xiao Guangrong 6f2f3cf00e tracing/function: Cleanup for function tracer
We can directly use %pf input format instead of kallsyms_lookup()
and %s input format

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-07-17 01:45:51 -04:00
Xiao Guangrong 64fbcd1628 tracing/function: Simplify __ftrace_replace_code()
Rewrite the __ftrace_replace_code() function, simplify it, but don't
change the code's logic.

First, we get the state we want to set, if the record has the same
state, then do nothing, otherwise enable/disable it.

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-07-17 00:37:53 -04:00
Steven Rostedt 6ab5d668b1 tracing/function-profiler: do not free per cpu variable stat
The per cpu variable stat is freeded if we fail to allocate a name
on start up. This was due to stat at first being allocated in the
initial design. But since then, it has become a static per cpu variable
but the free on error was not removed.

Also added __init annotation to the function that this is in.

[ Impact: prevent possible memory corruption on low mem at boot up ]

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-13 11:01:10 +02:00
Li Zefan a32c7765e2 tracing: Fix stack tracer sysctl handling
This made my machine completely frozen:

  # echo 1 > /proc/sys/kernel/stack_tracer_enabled
  # echo 2 > /proc/sys/kernel/stack_tracer_enabled

The cause is register_ftrace_function() was called twice.

Also fix ftrace_enabled sysctl, though seems nothing bad happened
as I tested it.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <4A448D17.9010305@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-26 20:48:39 +02:00
Li Zefan 0296e4254f ftrace: Fix the output of profile
The first entry of the ftrace profile was always skipped when
reading trace_stat/functionX.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <4A443D59.4080307@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-26 09:25:42 +02:00
Li Zefan d82d62444f ftrace: Fix t_hash_start()
When the output of set_ftrace_filter is larger than PAGE_SIZE,
t_hash_start() will be called the 2nd time, and then we start
from the head of a hlist, which is wrong and causes some entries
to be outputed twice.

The worse is, if the hlist is large enough, reading set_ftrace_filter
won't stop but in a dead loop.

Reviewed-by: Liming Wang <liming.wang@windriver.com>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <4A41876E.2060407@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-24 11:02:53 +02:00
Li Zefan 694ce0a544 ftrace: Don't manipulate @pos in t_start()
It's rather confusing that in t_start(), in some cases @pos is
incremented, and in some cases it's decremented and then incremented.

This patch rewrites t_start() in a much more general way.

Thus we fix a bug that if ftrace_filtered == 1, functions have tracer
hooks won't be printed, because the branch is always unreachable:

static void *t_start(...)
{
	...
	if (!p)
		return t_hash_start(m, pos);
	return p;
}

Before:
  # echo 'sys_open' > /mnt/tracing/set_ftrace_filter
  # echo 'sys_write:traceon:4' >> /mnt/tracing/set_ftrace_filter
  sys_open

After:
  # echo 'sys_open' > /mnt/tracing/set_ftrace_filter
  # echo 'sys_write:traceon:4' >> /mnt/tracing/set_ftrace_filter
  sys_open
  sys_write:traceon:count=4

Reviewed-by: Liming Wang <liming.wang@windriver.com>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <4A41874B.4090507@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-24 11:02:53 +02:00
Li Zefan 85951842a1 ftrace: Don't increment @pos in g_start()
It's wrong to increment @pos in g_start(). It causes some entries
lost when reading set_graph_function, if the output of the file
is larger than PAGE_SIZE.

Reviewed-by: Liming Wang <liming.wang@windriver.com>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <4A418738.7090401@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-24 11:02:52 +02:00
Frederic Weisbecker 9ea1a153a4 tracing/urgent: warn in case of ftrace_start_up inbalance
Prevent from further ftrace_start_up inbalances so that we avoid
future nop patching omissions with dynamic ftrace.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
2009-06-20 06:52:21 +02:00
Linus Torvalds 991ec02cdc Merge branch 'tracing-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'tracing-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  function-graph: always initialize task ret_stack
  function-graph: move initialization of new tasks up in fork
  function-graph: add memory barriers for accessing task's ret_stack
  function-graph: enable the stack after initialization of other variables
  function-graph: only allocate init tasks if it was not already done

Manually fix trivial conflict in kernel/trace/ftrace.c
2009-06-10 19:58:10 -07:00
Steven Rostedt 84047e360a function-graph: always initialize task ret_stack
On creating a new task while running the function graph tracer, if
we fail to allocate the ret_stack, and then fail the fork, the
code will free the parent ret_stack. This is because the child
duplicated the parent and currently points to the parent's ret_stack.

This patch always initializes the task's ret_stack to NULL.

[ Impact: prevent crash of parent on low memory during fork ]

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-06-02 16:51:55 -04:00
Steven Rostedt 26c01624a2 function-graph: add memory barriers for accessing task's ret_stack
The code that handles the tasks ret_stack allocation for every task
assumes that only an interrupt can cause issues (even though interrupts
are disabled).

In reality, the code is allocating the ret_stack for tasks that may be
running on other CPUs and there are not efficient memory barriers to
handle this case.

[ Impact: prevent crash due to using of uninitialized ret_stack variables ]

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-06-02 14:42:17 -04:00
Steven Rostedt 82310a3272 function-graph: enable the stack after initialization of other variables
The function graph tracer checks if the task_struct has ret_stack defined
to know if it is OK or not to use it. The initialization is done for
all tasks by one process, but the idle tasks use the same initialization
used by new tasks.

If an interrupt happens on an idle task that just had the ret_stack
created, but before the rest of the initialization took place, then
we can corrupt the return address of the functions.

This patch moves the setting of the task_struct's ret_stack to after
the other variables have been initialized.

[ Impact: prevent kernel panic on idle task when starting function graph ]

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-06-02 14:41:50 -04:00
Steven Rostedt 179c498ae2 function-graph: only allocate init tasks if it was not already done
When the function graph tracer is enabled, it calls the initialization
needed for the init tasks that would be called on all created tasks.

The problem is that this is called every time the function graph tracer
is enabled, and the ret_stack is allocated for the idle tasks each time.
Thus, the old ret_stack is lost and a memory leak is created.

This is also dangerous because if an interrupt happened on another CPU
with the init task and the ret_stack is replaced, we then lose all the
return pointers for the interrupt, and a crash would take place.

[ Impact: fix memory leak and possible crash due to race ]

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-06-02 12:03:19 -04:00
Steven Rostedt 0f6ce3de4e ftrace: do not profile functions when disabled
A race was found that if one were to enable and disable the function
profiler repeatedly, then the system can panic. This was because a profiled
function may be preempted just before disabling interrupts. While
the profiler is disabled and then reenabled, the preempted function
could start again, and access the hash as it is being initialized.

This just adds a check in the irq disabled part to check if the profiler
is enabled, and if it is not then it will just exit.

When the system is disabled, the profile_enabled variable is cleared
before calling the unregistering of the function profiler. This
unregistering calls stop machine which also acts as a synchronize schedule.

[ Impact: fix panic in enabling/disabling function profiler ]

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-06-01 23:26:23 -04:00
Steven Rostedt 2af15d6a44 ftrace: add kernel command line function filtering
When using ftrace=function on the command line to trace functions
on boot up, one can not filter out functions that are commonly called.

This patch adds two new ftrace command line commands.

  ftrace_notrace=function-list
  ftrace_filter=function-list

Where function-list is a comma separated list of functions to filter.
The ftrace_notrace will make the functions listed not be included
in the function tracing, and ftrace_filter will only trace the functions
listed.

These two act the same as the debugfs/tracing/set_ftrace_notrace and
debugfs/tracing/set_ftrace_filter respectively.

The simple glob expressions that are allowed by the filter files can also
be used by the command line interface.

	ftrace_notrace=rcu*,*lock,*spin*

Will not trace any function that starts with rcu, ends with lock, or has
the word spin in it.

Note, if the self tests are enabled, they may interfere with the filtering
set by the command lines.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-06-01 23:23:10 -04:00
Ming Lei 24ed0c4bfc tracing: fix check for return value of register_module_notifier
return zero should be correct, so fix it.

[ Impact: eliminate incorrect syslog message ]

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: rostedt@goodmis.org
LKML-Reference: <1242545498-7285-1-git-send-email-tom.leiming@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-18 10:24:13 +02:00
Steven Rostedt 93eb677d74 ftrace: use module notifier for function tracer
The hooks in the module code for the function tracer must be called
before any of that module code runs. The function tracer hooks
modify the module (replacing calls to mcount to nops). If the code
is executed while the change occurs, then the CPU can take a GPF.

To handle the above with a bit of paranoia, I originally implemented
the hooks as calls directly from the module code.

After examining the notifier calls, it looks as though the start up
notify is called before any of the module's code is executed. This makes
the use of the notify safe with ftrace.

Only the startup notify is required to be "safe". The shutdown simply
removes the entries from the ftrace function list, and does not modify
any code.

This change has another benefit. It removes a issue with a reverse dependency
in the mutexes of ftrace_lock and module_mutex.

[ Impact: fix lock dependency bug, cleanup ]

Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-04-17 16:59:15 +02:00
Steven Rostedt ad8d75fff8 tracing/events: move trace point headers into include/trace/events
Impact: clean up

Create a sub directory in include/trace called events to keep the
trace point headers in their own separate directory. Only headers that
declare trace points should be defined in this directory.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Zhao Lei <zhaolei@cn.fujitsu.com>
Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-04-14 22:05:43 -04:00
Steven Rostedt 597af81537 function-graph: use int instead of atomic for ftrace_graph_active
Impact: cleanup

The variable ftrace_graph_active is only modified under the
ftrace_lock mutex, thus an atomic is not necessary for modification.

Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-04-07 14:43:08 +02:00
Frederic Weisbecker 5452af664f tracing/ftrace: factorize the tracing files creation
Impact: cleanup

Most of the tracing files creation follow the same pattern:

ret = debugfs_create_file(...)
if (!ret)
	pr_warning("Couldn't create ... entry\n")

Unify it!

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1238109938-11840-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-04-07 14:43:07 +02:00
Ingo Molnar 93776a8ec7 Merge branch 'linus' into tracing/core
Merge reason: update to upstream tracing facilities

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-07 13:47:45 +02:00
Steven Rostedt 2aad1b76e6 function-graph: allow unregistering twice
Impact: fix to permanent disabling of function graph tracer

There should be nothing to prevent a tracer from unregistering a
function graph callback more than once. This can simplify error paths.

But currently, the counter does not account for mulitple unregistering
of the function graph callback. If it happens, the function graph
tracer will be permanently disabled.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-01 14:47:42 +02:00
Steven Rostedt 34886c8bc5 tracing: add average time in function to function profiler
Show the average time in the function (Time / Hit)

  Function                               Hit    Time            Avg
  --------                               ---    ----            ---
  mwait_idle                              51    140326.6 us     2751.503 us
  smp_apic_timer_interrupt                47    3517.735 us     74.845 us
  schedule                                10    2738.754 us     273.875 us
  __schedule                              10    2732.857 us     273.285 us
  hrtimer_interrupt                       47    1896.104 us     40.342 us
  irq_exit                                56    1711.833 us     30.568 us
  __run_hrtimer                           47    1315.589 us     27.991 us
  tick_sched_timer                        47    1138.690 us     24.227 us
  do_softirq                              56    1116.829 us     19.943 us
  __do_softirq                            56    1066.932 us     19.052 us
  do_IRQ                                   9    926.153 us      102.905 us

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-03-25 21:03:36 -04:00
Steven Rostedt 318e0a73c9 tracing: remove on the fly allocator from function profiler
Impact: safer code

The on the fly allocator for the function profiler was to save
memory. But at the expense of stability. Although it survived several
tests, allocating from the function tracer is just too risky, just
to save space.

This patch removes the allocator and simply allocates enough entries
at start up.

Each function gets a profiling structure of 40 bytes. With an average
of 20K functions, and this is for each CPU, we have 800K per online
CPU. This is not too bad, at least for non-embedded.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-03-25 21:03:17 -04:00
Zhaolei 2a4efa4245 ftrace: Using FTRACE_WARN_ON() to check "freed record" in ftrace_release()
"Because when we call ftrace_free_rec we change the rec->ip to point to the
  next record in the chain. Something is very wrong if rec->ip >= s &&
  rec->ip < e and the record is already free."

 "Note, use FTRACE_WARN_ON() macro. This way it shuts down ftrace if it is
  hit and helps to avoid further damage later."
                   -- Steven Rostedt <rostedt@goodmis.org>

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-03-25 17:45:36 -04:00
Steven Rostedt fb9fb015e9 tracing: clean up tracing profiler
Ingo Molnar suggested clean ups for the profiling code. This patch
makes those updates.

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-03-25 13:26:41 -04:00
Steven Rostedt a2a16d6a31 function-graph: add option to calculate graph time or not
graph time is the time that a function is executing another function.
Thus if function A calls B, if graph-time is set, then the time for
A includes B. This is the default behavior. But if graph-time is off,
then the time spent executing B is subtracted from A.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-03-24 23:41:11 -04:00
Steven Rostedt cafb168a1c tracing: make the function profiler per cpu
Impact: speed enhancement

By making the function profiler record in per cpu data we not only
get better readings, avoid races, we also do not have to take any
locks.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-03-24 23:41:10 -04:00
Steven Rostedt 0706f1c48c tracing: adding function timings to function profiler
If the function graph trace is enabled, the function profiler will
use it to take the timing of the functions.

 cat /debug/tracing/trace_stat/functions

  Function                               Hit    Time
  --------                               ---    ----
  mwait_idle                             127    183028.4 us
  schedule                                26    151997.7 us
  __schedule                              31    151975.1 us
  sys_wait4                                2    74080.53 us
  do_wait                                  2    74077.80 us
  sys_newlstat                           138    39929.16 us
  do_path_lookup                         179    39845.79 us
  vfs_lstat_fd                           138    39761.97 us
  user_path_at                           153    39469.58 us
  path_walk                              179    39435.76 us
  __link_path_walk                       189    39143.73 us
[...]

Note the times are skewed due to the function graph tracer not taking
into account schedules.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-03-24 23:41:09 -04:00
Steven Rostedt 493762fc53 tracing: move function profiler data out of function struct
Impact: reduce size of memory in function profiler

The function profiler originally introduces its counters into the
function records itself. There is 20 thousand different functions on
a normal system, and that is adding 20 thousand counters for profiling
event when not needed.

A normal run of the profiler yields only a couple of thousand functions
executed, depending on what is being profiled. This means we have around
18 thousand useless counters.

This patch rectifies this by moving the data out of the function
records used by dynamic ftrace. Data is preallocated to hold the functions
when the profiling begins. Checks are made during profiling to see if
more recorcds should be allocated, and they are allocated if it is safe
to do so.

This also removes the dependency from using dynamic ftrace, and also
removes the overhead by having it enabled.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-03-24 23:41:06 -04:00
Steven Rostedt bac429f037 tracing: add function profiler
Impact: new profiling feature

This patch adds a function profiler. In debugfs/tracing/ two new
files are created.

  function_profile_enabled  - to enable or disable profiling

  trace_stat/functions   - the profiled functions.

For example:

  echo 1 > /debugfs/tracing/function_profile_enabled
  ./hackbench 50
  echo 0 > /debugfs/tracing/function_profile_enabled

yields:

  cat /debugfs/tracing/trace_stat/functions

  Function                               Hit
  --------                               ---
  _spin_lock                        10106442
  _spin_unlock                      10097492
  kfree                              6013704
  _spin_unlock_irqrestore            4423941
  _spin_lock_irqsave                 4406825
  __phys_addr                        4181686
  __slab_free                        4038222
  dput                               4030130
  path_put                           4023387
  unroll_tree_refs                   4019532
[...]

The most hit functions are listed first. Functions that are not
hit are not listed.

This feature depends on and uses dynamic function tracing. When the
function profiling is disabled, no overhead occurs. But it still
takes up around 300KB to hold the data, thus it is not recomended
to keep it enabled for systems low on memory.

When a '1' is echoed into the function_profile_enabled file, the
counters for is function is reset back to zero. Thus you can see what
functions are hit most by different programs.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-03-24 23:40:00 -04:00
Lai Jiangshan ee000b7f9f tracing: use union for multi-usages field
Impact: cleanup

struct dyn_ftrace::ip has different usages in his lifecycle,
we use union for it. And also for struct dyn_ftrace::flags.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Steven Rostedt <srostedt@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <49C871BE.3080405@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-24 16:43:12 +01:00
Lai Jiangshan cc59c9e8d0 ftrace: show virtual PID
Impact: fix PID output under namespaces

When current namespace is not the global namespace,
pid read from set_ftrace_pid is no correct.

 # ~/newpid_namespace_run bash
 # echo $$
 1
 # echo 1 > set_ftrace_pid
 # cat set_ftrace_pid
 3756

Since we write virtual PID to set_ftrace_pid, we need get
virtual PID when we read it.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Steven Rostedt <srostedt@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <49C84D65.9050606@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-24 16:42:49 +01:00
Steven Rostedt be6f164a02 function-graph: add option for include sleep times
Impact: give user a choice to show times spent while sleeping

The user may want to see the time a function spent sleeping.
This patch adds the trace option "sleep-time" to allow that.
The "sleep-time" option is default on.

 echo sleep-time > /debug/tracing/trace_options

produces:

 ------------------------------------------
 2)  avahi-d-3428  =>    <idle>-0
 ------------------------------------------

 2)               |      finish_task_switch() {
 2)   0.621 us    |        _spin_unlock_irq();
 2)   2.202 us    |      }
 2) ! 1002.197 us |    }
 2) ! 1003.521 us |  }

where as,

 echo nosleep-time > /debug/tracing/trace_options

produces:

 0)    <idle>-0    =>  yum-upd-3416
 ------------------------------------------

 0)               |              finish_task_switch() {
 0)   0.643 us    |                _spin_unlock_irq();
 0)   2.342 us    |              }
 0) + 41.302 us   |            }
 0) + 42.453 us   |          }

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-03-24 11:06:24 -04:00
Steven Rostedt 8aef2d2856 function-graph: ignore times across schedule
Impact: more accurate timings

The current method of function graph tracing does not take into
account the time spent when a task is not running. This shows functions
that call schedule have increased costs:

 3) + 18.664 us   |      }
 ------------------------------------------
 3)    <idle>-0    =>  kblockd-123
 ------------------------------------------

 3)               |      finish_task_switch() {
 3)   1.441 us    |        _spin_unlock_irq();
 3)   3.966 us    |      }
 3) ! 2959.433 us |    }
 3) ! 2961.465 us |  }

This patch uses the tracepoint in the scheduling context switch to
account for time that has elapsed while a task is scheduled out.
Now we see:

 ------------------------------------------
 3)    <idle>-0    =>  edac-po-1067
 ------------------------------------------

 3)               |      finish_task_switch() {
 3)   0.685 us    |        _spin_unlock_irq();
 3)   2.331 us    |      }
 3) + 41.439 us   |    }
 3) + 42.663 us   |  }

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-03-24 09:33:30 -04:00
Steven Rostedt 05ce5818ad function-graph: prevent more than one tracer registering
Impact: prevent crash due to multiple function graph tracers

The function graph tracer can currently only handle a single tracer
being registered. If another tracer registers with the function
graph tracer it can crash the system.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-03-24 09:32:52 -04:00
Frederic Weisbecker 03303549b1 tracing/ftrace: fix the check on nopped sites
Impact: fix a dynamic tracing failure

Recently, the function and function graph tracers failed to use dynamic
tracing after the following commit:

fa9d13cf13
(ftrace: don't try to __ftrace_replace_code on !FTRACE_FL_CONVERTED rec)

The patch is right except a mistake on the check for the FTRACE_FL_CONVERTED
flag. The code patching is aborted in case of successfully nopped sites.
What we want is the opposite: ignore the callsites that haven't been nopped.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-03-16 22:15:36 -04:00
Lai Jiangshan e94142a67f ftrace: remove struct list_head from struct dyn_ftrace
Impact: save memory

The struct dyn_ftrace table is very large, this patch will save
about 50%.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Steven Rostedt <srostedt@redhat.com>
LKML-Reference: <49BA2C9F.8020009@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-13 11:36:20 +01:00
Lai Jiangshan 850a80cfaa ftrace: use seq_read
Impact: cleanup

VFS layer has tested the file mode, we do not need test it.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Steven Rostedt <srostedt@redhat.com>
LKML-Reference: <49BA2BAB.6010608@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-13 11:35:31 +01:00
Zhaolei fa9d13cf13 ftrace: don't try to __ftrace_replace_code on !FTRACE_FL_CONVERTED rec
Do __ftrace_replace_code for !FTRACE_FL_CONVERTED rec will always
fail, we should ignore this rec.

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Cc: "Steven Rostedt ;" <rostedt@goodmis.org>
LKML-Reference: <49BA2472.4060206@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-13 10:25:06 +01:00
Zhaolei b00f0b6dc1 ftrace: avoid double-free of dyn_ftrace
If dyn_ftrace is freed before ftrace_release(), ftrace_release()
will free it again and make ftrace_free_records wrong.

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Cc: "Steven Rostedt ;" <rostedt@goodmis.org>
LKML-Reference: <49BA23D9.1050900@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-13 10:25:06 +01:00
KOSAKI Motohiro 10dd3ebe21 tracing: fix deadlock when setting set_ftrace_pid
Impact: fix deadlock while using set_ftrace_pid

Reproducer:

	# cd /sys/kernel/debug/tracing
	# echo $$ > set_ftrace_pid

	then, console becomes hung.

Details:

when writing set_ftracepid, kernel callstack is following

	ftrace_pid_write()
		mutex_lock(&ftrace_lock);
		ftrace_update_pid_func()
			mutex_lock(&ftrace_lock);
			mutex_unlock(&ftrace_lock);
		mutex_unlock(&ftrace_lock);

then, system always deadlocks when ftrace_pid_write() is called.

In past days, ftrace_pid_write() used ftrace_start_lock, but
commit e6ea44e9b4 consolidated
ftrace_start_lock to ftrace_lock.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Steven Rostedt <srostedt@redhat.com>
LKML-Reference: <20090306151155.0778.A69D9226@jp.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-03-06 12:07:38 +01:00
Steven Rostedt 5e2336a0d4 tracing: make all file_operations const
Impact: cleanup

All file_operations structures should be constant. No one is going to
change them.

Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-03-05 21:46:40 -05:00
Ingo Molnar c478f87869 Merge branch 'tip/x86/ftrace' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/ftrace
Conflicts:
	include/linux/ftrace.h
	kernel/trace/ftrace.c
2009-02-22 18:12:01 +01:00
Steven Rostedt 4377245aa9 ftrace: break out modify loop immediately on detection of error
Impact: added precaution on failure detection

Break out of the modifying loop as soon as a failure is detected.
This is just an added precaution found by code review and was not
found by any bug chasing.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-20 14:30:20 -05:00
Steven Rostedt 000ab69117 ftrace: allow archs to preform pre and post process for code modification
This patch creates the weak functions: ftrace_arch_code_modify_prepare
and ftrace_arch_code_modify_post_process that are called before and
after the stop machine is called to modify the kernel text.

If the arch needs to do pre or post processing, it only needs to define
these functions.

[ Update: Ingo Molnar suggested using the name ftrace_arch_code_modify_*
          over using ftrace_arch_modify_* ]

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-20 13:16:18 -05:00
Frederic Weisbecker f9349a8f97 tracing/function-graph-tracer: make set_graph_function file support ftrace regex
Impact: trace only functions matching a pattern

The set_graph_function file let one to trace only one or several
chosen functions and follow all their code flow.

Currently, only a constant function name is allowed so this patch
allows the ftrace_regex functions:

- matches all functions that end with "name":
  echo *name > set_graph_function

- matches all functions that begin with "name":
  echo name* > set_graph_function

- matches all functions that contains "name":
  echo *name* > set_graph_function

Example:

echo mutex* > set_graph_function

 0)               |  mutex_lock_nested() {
 0)   0.563 us    |    __might_sleep();
 0)   2.072 us    |  }
 0)               |  mutex_unlock() {
 0)   1.036 us    |    __mutex_unlock_slowpath();
 0)   2.433 us    |  }
 0)               |  mutex_unlock() {
 0)   0.691 us    |    __mutex_unlock_slowpath();
 0)   1.787 us    |  }
 0)               |  mutex_lock_interruptible_nested() {
 0)   0.548 us    |    __might_sleep();
 0)   1.945 us    |  }

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-20 11:36:24 +01:00
Ingo Molnar ac07bcaa82 Merge branch 'tip/tracing/ftrace' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/ftrace 2009-02-18 01:09:07 +01:00
Ingo Molnar 37bd824a35 Merge branches 'tracing/ftrace' and 'tracing/urgent' into tracing/core 2009-02-18 01:08:13 +01:00
Frederic Weisbecker 5b058bcde9 tracing/function-graph-tracer: trace the idle tasks
When the function graph tracer is activated, it iterates over the task_list
to allocate a stack to store the return addresses.

But the per cpu idle tasks are not iterated by using
do_each_thread / while_each_thread.

So we have to iterate on them manually.

This fixes somes weirdness in the traces and many losses of traces.
Examples on two cpus:

 0)   Xorg-4287    |   2.906 us    |              }
 0)   Xorg-4287    |   3.965 us    |            }
 0)   Xorg-4287    |   5.302 us    |          }
 ------------------------------------------
 0)   Xorg-4287    =>    <idle>-0
 ------------------------------------------

 0)    <idle>-0    |   2.861 us    |                        }
 0)    <idle>-0    |   0.526 us    |                        set_normalized_timespec();
 0)    <idle>-0    |   7.201 us    |                      }
 0)    <idle>-0    |   8.214 us    |                    }
 0)    <idle>-0    |               |                    clockevents_program_event() {
 0)    <idle>-0    |               |                      lapic_next_event() {
 0)    <idle>-0    |   0.510 us    |                        native_apic_mem_write();
 0)    <idle>-0    |   1.546 us    |                      }
 0)    <idle>-0    |   2.583 us    |                    }
 0)    <idle>-0    | + 12.435 us   |                  }
 0)    <idle>-0    | + 13.470 us   |                }
 0)    <idle>-0    |   0.608 us    |                _spin_unlock_irqrestore();
 0)    <idle>-0    | + 23.270 us   |              }
 0)    <idle>-0    | + 24.336 us   |            }
 0)    <idle>-0    | + 25.417 us   |          }
 0)    <idle>-0    |   0.593 us    |          _spin_unlock();
 0)    <idle>-0    | + 41.869 us   |        }
 0)    <idle>-0    | + 42.906 us   |      }
 0)    <idle>-0    | + 95.035 us   |    }
 0)    <idle>-0    |   0.540 us    |    menu_reflect();
 0)    <idle>-0    | ! 100.404 us  |  }
 0)    <idle>-0    |   0.564 us    |  mce_idle_callback();
 0)    <idle>-0    |               |  enter_idle() {
 0)    <idle>-0    |   0.526 us    |    mce_idle_callback();
 0)    <idle>-0    |   1.757 us    |  }
 0)    <idle>-0    |               |  cpuidle_idle_call() {
 0)    <idle>-0    |               |    menu_select() {
 0)    <idle>-0    |   0.525 us    |      pm_qos_requirement();
 0)    <idle>-0    |   0.518 us    |      tick_nohz_get_sleep_length();
 0)    <idle>-0    |   2.621 us    |    }
[...]
 1)    <idle>-0    |   0.518 us    |              touch_softlockup_watchdog();
 1)    <idle>-0    | + 14.355 us   |            }
 1)    <idle>-0    | + 22.840 us   |          }
 1)    <idle>-0    | + 25.949 us   |        }
 1)    <idle>-0    |               |        handle_irq() {
 1)    <idle>-0    |   0.511 us    |          irq_to_desc();
 1)    <idle>-0    |               |          handle_edge_irq() {
 1)    <idle>-0    |   0.638 us    |            _spin_lock();
 1)    <idle>-0    |               |            ack_apic_edge() {
 1)    <idle>-0    |   0.510 us    |              irq_to_desc();
 1)    <idle>-0    |               |              move_native_irq() {
 1)    <idle>-0    |   0.510 us    |                irq_to_desc();
 1)    <idle>-0    |   1.532 us    |              }
 1)    <idle>-0    |   0.511 us    |              native_apic_mem_write();
 ------------------------------------------
 1)    <idle>-0    =>    cat-5073
 ------------------------------------------

 1)    cat-5073    |   3.731 us    |                    }
 1)    cat-5073    |               |                    run_local_timers() {
 1)    cat-5073    |   0.533 us    |                      hrtimer_run_queues();
 1)    cat-5073    |               |                      raise_softirq() {
 1)    cat-5073    |               |                        __raise_softirq_irqoff() {
 1)    cat-5073    |               |                          /* nr: 1 */
 1)    cat-5073    |   2.718 us    |                        }
 1)    cat-5073    |   3.814 us    |                      }

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 19:20:17 +01:00
Steven Rostedt b6887d7916 ftrace: rename _hook to _probe
Impact: clean up

Ingo Molnar did not like the _hook naming convention used by the
select function tracer. Luis Claudio R. Goncalves suggested using
the "_probe" extension. This patch implements the change of
calling the functions and variables "_hook" and replacing them
with "_probe".

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-17 12:32:04 -05:00
Steven Rostedt 6a24a244cd ftrace: clean up coding style
Ingo Molnar pointed out some coding style issues with the recent ftrace
updates. This patch cleans them up.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-17 11:20:26 -05:00
Ingo Molnar 73d3fd96e7 ftrace: fix !CONFIG_DYNAMIC_FTRACE ftrace_swapper_pid definition
Impact: build fix

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-17 11:50:42 +01:00
Steven Rostedt 809dcf29ce ftrace: add pretty print to selected fuction traces
This patch adds a call back for the tracers that have hooks to
selected functions. This allows the tracer to show better output
in the set_ftrace_filter file.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-16 23:06:01 -05:00
Steven Rostedt 8fc0c701c5 ftrace: show selected functions in set_ftrace_filter
This patch adds output to show what functions have tracer hooks
attached to them.

  # echo 'sys_open:traceon:4' > /debug/tracing/set_ftrace_filter
  # cat set_ftrace_filter

 #### all functions enabled ####
 sys_open:ftrace_traceon:0000000000000004

  # echo 'do_fork:traceoff:' > set_ftrace_filter
  # cat set_ftrace_filter

 #### all functions enabled ####
 sys_open:ftrace_traceon:0000000000000002
 do_fork:ftrace_traceoff:ffffffffffffffff

Note the 4 changed to a 2. This is because The code was executed twice
since the traceoff was added. If a cat is done again:

 #### all functions enabled ####
 sys_open:ftrace_traceon
 do_fork:ftrace_traceoff:ffffffffffffffff

The number disappears. That is because it will not print a NULL.

Callbacks to allow the tracer to pretty print will be implemented soon.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-16 22:50:06 -05:00
Steven Rostedt 59df055f19 ftrace: trace different functions with a different tracer
Impact: new feature

Currently, the function tracer only gives you an ability to hook
a tracer to all functions being traced. The dynamic function trace
allows you to pick and choose which of those functions will be
traced, but all functions being traced will call all tracers that
registered with the function tracer.

This patch adds a new feature that allows a tracer to hook to specific
functions, even when all functions are being traced. It allows for
different functions to call different tracer hooks.

The way this is accomplished is by a special function that will hook
to the function tracer and will set up a hash table knowing which
tracer hook to call with which function. This is the most general
and easiest method to accomplish this. Later, an arch may choose
to supply their own method in changing the mcount call of a function
to call a different tracer. But that will be an exercise for the
future.

To register a function:

 struct ftrace_hook_ops {
	void			(*func)(unsigned long ip,
					unsigned long parent_ip,
					void **data);
	int			(*callback)(unsigned long ip, void **data);
	void			(*free)(void **data);
 };

 int register_ftrace_function_hook(char *glob, struct ftrace_hook_ops *ops,
				  void *data);

glob is a simple glob to search for the functions to hook.
ops is a pointer to the operations (listed below)
data is the default data to be passed to the hook functions when traced

ops:
 func is the hook function to call when the functions are traced
 callback is a callback function that is called when setting up the hash.
   That is, if the tracer needs to do something special for each
   function, that is being traced, and wants to give each function
   its own data. The address of the entry data is passed to this
   callback, so that the callback may wish to update the entry to
   whatever it would like.
 free is a callback for when the entry is freed. In case the tracer
   allocated any data, it is give the chance to free it.

To unregister we have three functions:

  void
  unregister_ftrace_function_hook(char *glob, struct ftrace_hook_ops *ops,
				void *data)

This will unregister all hooks that match glob, point to ops, and
have its data matching data. (note, if glob is NULL, blank or '*',
all functions will be tested).

  void
  unregister_ftrace_function_hook_func(char *glob,
				 struct ftrace_hook_ops *ops)

This will unregister all functions matching glob that has an entry
pointing to ops.

  void unregister_ftrace_function_hook_all(char *glob)

This simply unregisters all funcs.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-16 22:44:09 -05:00
Steven Rostedt e6ea44e9b4 ftrace: consolidate mutexes
Impact: clean up

Now that ftrace_lock is a mutex, there is no reason to have three
different mutexes protecting similar data. All the mutex paths
are not in hot paths, so having a mutex to cover more data is
not a problem.

This patch removes the ftrace_sysctl_lock and ftrace_start_lock
and uses the ftrace_lock to protect the locations that were protected
by these locks. By doing so, this change also removes some of
the lock nesting that was taking place.

There are still more mutexes in ftrace.c that can probably be
consolidated, but they can be dealt with later. We need to be careful
about the way the locks are nested, and by consolidating, we can cause
a recursive deadlock.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-16 18:15:31 -05:00
Steven Rostedt 52baf11922 ftrace: convert ftrace_lock from a spinlock to mutex
Impact: clean up

The older versions of ftrace required doing the ftrace list
search under atomic context. Now all the calls are in non-atomic
context. There is no reason to keep the ftrace_lock as a spinlock.

This patch converts it to a mutex.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-16 17:33:14 -05:00
Steven Rostedt f6180773d9 ftrace: add command interface for function selection
Allow for other tracers to add their own commands for function
selection. This interface gives a trace the ability to name a
command for function selection. Right now it is pretty limited
in what it offers, but this is a building step for more features.

The :mod: command is converted to this interface and also serves
as a template for other implementations.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-16 17:06:02 -05:00
Steven Rostedt e68746a271 ftrace: enable filtering only when a function is filtered on
Impact: fix to prevent empty set_ftrace_filter and no ftrace output

The function filter is used to only trace a given set of functions.
The filter is enabled when a function name is echoed into the
set_ftrace_filter file. But if the name has a typo and the function
is not found, the filter is enabled, but no function is listed.

This makes a confusing situation where set_ftrace_filter is empty
but no functions ever get enabled for tracing.

For example:

 # cat /debug/tracing/set_ftrace_filter

  #### all functions enabled ####

 # echo bad_name > set_ftrace_filter
 # cat /debug/tracing/set_ftrace_filter

 # echo function > current_tracer
 # cat trace

  # tracer: nop
  #
  #           TASK-PID    CPU#    TIMESTAMP  FUNCTION
  #              | |       |          |         |

This patch changes that to only enable filtering if a function
is set to be filtered on. Now, the filter is not enabled if
a bad name is echoed into set_ftrace_filter.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-16 17:03:49 -05:00
Steven Rostedt 64e7c44061 ftrace: add module command function filter selection
This patch adds a "command" syntax to the function filtering files:

  /debugfs/tracing/set_ftrace_filter
  /debugfs/tracing/set_ftrace_notrace

Of the format:  <function>:<command>:<parameter>

The command is optional, and dependent on the command, so are
the parameters.

 echo do_fork > set_ftrace_filter

Will only trace 'do_fork'.

 echo 'sched_*' > set_ftrace_filter

Will only trace functions starting with the letters 'sched_'.

 echo '*:mod:ext3' > set_ftrace_filter

Will trace only the ext3 module functions.

 echo '*write*:mod:ext3' > set_ftrace_notrace

Will prevent the ext3 functions with the letters 'write' in
the name from being traced.

 echo '!*_allocate:mod:ext3' > set_ftrace_filter

Will remove the functions in ext3 that end with the letters
'_allocate' from the ftrace filter.

Although this patch implements the 'command' format, only the
'mod' command is supported. More commands to follow.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-16 16:55:50 -05:00
Steven Rostedt 9f4801e30a ftrace: break up ftrace_match_records into smaller components
Impact: clean up

ftrace_match_records does a lot of things that other features
can use. This patch breaks up ftrace_match_records and pulls
out ftrace_setup_glob and ftrace_match_record.

ftrace_setup_glob prepares a simple glob expression for use with
ftrace_match_record. ftrace_match_record compares a single record
with a glob type.

Breaking this up will allow for more features to run on individual
records.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-16 16:49:57 -05:00
Steven Rostedt 7f24b31b01 ftrace: rename ftrace_match to ftrace_match_records
Impact: clean up

ftrace_match is too generic of a name. What it really does is
search all records and matches the records with the given string,
and either sets or unsets the functions to be traced depending
on if the parameter 'enable' is set or not.

This allows us to make another function called ftrace_match that
can be used to test a single record.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-16 16:33:15 -05:00
Steven Rostedt 265c831cb0 ftrace: add do_for_each_ftrace_rec and while_for_each_ftrace_rec
Impact: clean up

To iterate over all the functions that dynamic trace knows about
it requires two for loops. One to iterate over the pages and the
other to iterate over the records within the page.

There are several duplications of these loops in ftrace.c. This
patch creates the macros do_for_each_ftrace_rec and
while_for_each_ftrace_rec to handle this logic, and removes the
duplicate code.

While making this change, I also discovered and fixed a small
bug that one of the iterations should exit the loop after it found the
record it was searching for. This used a break when it should have
used a goto, since there were two loops it needed to break out
from.  No real harm was done by this bug since it would only continue
to search the other records, and the code was in a slow path anyway.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-16 16:25:12 -05:00
Steven Rostedt 0c75a3ed63 ftrace: state that all functions are enabled in set_ftrace_filter
Impact: clean up, make set_ftrace_filter less confusing

The set_ftrace_filter shows only the functions that will be traced.
But when it is empty, it will trace all functions. This can be a bit
confusing.

This patch makes set_ftrace_filter show:

  #### all functions enabled ####

When all functions will be traced, and we do not filter only a select
few.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-16 16:21:35 -05:00
Wenji Huang 57794a9d48 trace: trivial fixes in comment typos.
Impact: clean up

Fixed several typos in the comments.

Signed-off-by: Wenji Huang <wenji.huang@oracle.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
2009-02-07 20:03:36 -05:00
Ingo Molnar ce70a0b472 Merge branches 'tracing/blktrace', 'tracing/ftrace', 'tracing/urgent' and 'linus' into tracing/core 2009-02-04 20:45:41 +01:00
Oleg Nesterov 229c4ef8ae ftrace: do_each_pid_task() needs rcu lock
"ftrace: use struct pid" commit 978f3a45d9
converted ftrace_pid_trace to "struct pid*".

But we can't use do_each_pid_task() without rcu_read_lock() even if
we know the pid itself can't go away (it was pinned in ftrace_pid_write).
The exiting task can detach itself from this pid at any moment.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-02-03 22:50:58 +01:00
Frederic Weisbecker 00f57f545a tracing/function-graph-tracer: fix a regression while suspend to disk
Impact: fix a crash while kernel image restore

When the function graph tracer is running and while suspend to disk, some racy
and dangerous things happen against this tracer.

The current task will save its registers including the stack pointer which
contains the return address hooked by the tracer. But the current task will
continue to enter other functions after that to save the memory, and then
it will store other return addresses, and finally loose the old depth which
matches the return address saved in the old stack (during the registers saving).

So on image restore, the code will return to wrong addresses.
And there are other things: on restore, the task will have it's "current"
pointer overwritten during registers restoring....switching from one task to
another... That would be insane to try to trace function graphs at these
stages.

This patch makes the function graph tracer listening on power events, making
it's tracing disabled for the current task (the one that performs the
hibernation work) while suspend/resume to disk, making the tracing safe
during hibernation.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-21 15:21:30 +01:00
Frederic Weisbecker 4a2b8dda3f tracing/function-graph-tracer: fix a regression while suspend to disk
Impact: fix a crash while kernel image restore

When the function graph tracer is running and while suspend to disk, some racy
and dangerous things happen against this tracer.

The current task will save its registers including the stack pointer which
contains the return address hooked by the tracer. But the current task will
continue to enter other functions after that to save the memory, and then
it will store other return addresses, and finally loose the old depth which
matches the return address saved in the old stack (during the registers saving).

So on image restore, the code will return to wrong addresses.
And there are other things: on restore, the task will have it's "current"
pointer overwritten during registers restoring....switching from one task to
another... That would be insane to try to trace function graphs at these
stages.

This patch makes the function graph tracer listening on power events, making
it's tracing disabled for the current task (the one that performs the
hibernation work) while suspend/resume to disk, making the tracing safe
during hibernation.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-15 09:47:40 +01:00
Uwe Kleine-Koenig 32632920a7 ftrace, trivial: fix typo "resgister" -> "register"
Signed-off-by: Uwe Kleine-Koenig <ukleinek@strlen.de>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-14 12:11:49 +01:00
Shaohua Li f00012074b ftrace, ia64: Add macro for ftrace_caller
Define FTRACE_ADDR. In IA64, a function pointer isn't a 'unsigned long' but a
'struct {unsigned long ip, unsigned long gp}'.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-14 12:11:18 +01:00
Shaohua Li 25aac9dc7c ftrace, ia64: explictly ignore a file in recordmcount.pl
In IA64, a function pointer isn't a 'unsigned long' but a
'struct {unsigned long ip, unsigned long gp}'. MCOUNT_ADDR is determined
at link time not compile time, so explictly ignore kernel/trace/ftrace.o
in recordmcount.pl.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-14 12:11:13 +01:00
Steven Rostedt 431aa3fbf5 ftrace: convert unsigned index to signed
Impact: fix to unsigned compared to less than zero

Roel Kluin pointed out that there is a compare of an unsigned number
to less than zero. A previous clean up had the unsigned index set
to -1 for certain cases, but never converted it to signed.

Frederic Weisbecker noticed that another index is used to compare
the above index to and it also needs to be converted to signed.

[
  Converted ftrace_page->index to int from unsigned long as
  Andrew Morton pointed out that there's no need for it to
  stay a long.
]

Reported-by: Roel Kluin <roel.kluin@gmail.com>
Reported-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-07 10:43:52 +01:00
Steven Rostedt ea3a6d6d60 ftrace: add not to regex on filtering functions
Impact: enhancement

Ingo Molnar has asked about a way to remove items from the filter
lists. Currently, you can only add or replace items. The way
items are added to the list is through opening one of the list
files (set_ftrace_filter or set_ftrace_notrace) via append.
If the file is opened for truncate, the list is cleared.

  echo spin_lock > /debug/tracing/set_ftrace_filter

The above will replace the list with only spin_lock

  echo spin_lock >> /debug/tracing/set_ftrace_filter

The above will add spin_lock to the list.

Now this patch adds:

  echo '!spin_lock' >> /debug/tracing/set_ftrace_filter

This will remove spin_lock from the list.

The limited glob features of these lists also can be notted.

  echo '!spin_*' >> /debug/tracing/set_ftrace_filter

This will remove all functions that start with 'spin_'

Note:

  echo '!spin_*' > /debug/tracing/set_ftrace_filter

will simply clear out the list (notice the '>' instead of '>>')

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-18 12:57:09 +01:00
Frederic Weisbecker 380c4b1411 tracing/function-graph-tracer: append the tracing_graph_flag
Impact: Provide a way to pause the function graph tracer

As suggested by Steven Rostedt, the previous patch that prevented from
spinlock function tracing shouldn't use the raw_spinlock to fix it.
It's much better to follow lockdep with normal spinlock, so this patch
adds a new flag for each task to make the function graph tracer able
to be paused. We also can send an ftrace_printk whithout worrying of
the irrelevant traced spinlock during insertion.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-08 15:11:45 +01:00
Steven Rostedt 21bbecdaae ftrace: use init_struct_pid as swapper pid
Impact: clean up

Using (struct pid *)-1 as the pointer for ftrace_swapper_pid is
a little confusing for others. This patch uses the address of the
actual init pid structure instead. This change is only for
clarity. It does not affect the code itself. Hopefully soon the
swapper tasks will all have their own pid structure and then
we can clean up the code a bit more.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-05 14:51:29 +01:00
Liming Wang faec2ec505 ftrace: avoid duplicated function when writing set_graph_function
Impact: fix a bug in function filter setting

when writing function to set_graph_function, we should check whether it
has existed in set_graph_function to avoid duplicating.

Signed-off-by: Liming Wang <liming.wang@windriver.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-04 09:42:35 +01:00
Steven Rostedt e32d895691 ftrace: add ability to only trace swapper tasks
Impact: new feature

This patch lets the swapper tasks of all CPUS be filtered by the
set_ftrace_pid file.

If '0' is echoed into this file, then all the idle tasks (aka swapper)
is flagged to be traced.  This affects all CPU idle tasks.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-04 09:09:38 +01:00
Steven Rostedt 978f3a45d9 ftrace: use struct pid
Impact: clean up, extend PID filtering to PID namespaces

Eric Biederman suggested using the struct pid for filtering on
pids in the kernel. This patch is based off of a demonstration
of an implementation that Eric sent me in an email.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-04 09:09:37 +01:00
Steven Rostedt 804a685162 ftrace: trace single pid for function graph tracer
Impact: New feature

This patch makes the changes to set_ftrace_pid apply to the function
graph tracer.

  # echo $$ > /debugfs/tracing/set_ftrace_pid
  # echo function_graph > /debugfs/tracing/current_tracer

Will cause only the current task to be traced. Note, the trace flags are
also inherited by child processes, so the children of the shell
will also be traced.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-04 09:09:36 +01:00
Steven Rostedt 0ef8cde56a ftrace: use task struct trace flag to filter on pid
Impact: clean up

Use the new task struct trace flags to determine if a process should be
traced or not.

Note: this moves the searching of the pid to the slow path of setting
the pid field. This needs to be converted to the pid name space.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-04 09:09:35 +01:00
Steven Rostedt ea4e2bc4d9 ftrace: graph of a single function
This patch adds the file:

   /debugfs/tracing/set_graph_function

which can be used along with the function graph tracer.

When this file is empty, the function graph tracer will act as
usual. When the file has a function in it, the function graph
tracer will only trace that function.

For example:

 # echo blk_unplug > /debugfs/tracing/set_graph_function
 # cat /debugfs/tracing/trace
 [...]
 ------------------------------------------
 | 2)  make-19003  =>  kjournald-2219
 ------------------------------------------

 2)               |  blk_unplug() {
 2)               |    dm_unplug_all() {
 2)               |      dm_get_table() {
 2)      1.381 us |        _read_lock();
 2)      0.911 us |        dm_table_get();
 2)      1. 76 us |        _read_unlock();
 2) +   12.912 us |      }
 2)               |      dm_table_unplug_all() {
 2)               |        blk_unplug() {
 2)      0.778 us |          generic_unplug_device();
 2)      2.409 us |        }
 2)      5.992 us |      }
 2)      0.813 us |      dm_table_put();
 2) +   29. 90 us |    }
 2) +   34.532 us |  }

You can add up to 32 functions into this file. Currently we limit it
to 32, but this may change with later improvements.

To add another function, use the append '>>':

  # echo sys_read >> /debugfs/tracing/set_graph_function
  # cat /debugfs/tracing/set_graph_function
  blk_unplug
  sys_read

Using the '>' will clear out the function and write anew:

  # echo sys_write > /debug/tracing/set_graph_function
  # cat /debug/tracing/set_graph_function
  sys_write

Note, if you have function graph running while doing this, the small
time between clearing it and updating it will cause the graph to
record all functions. This should not be an issue because after
it sets the filter, only those functions will be recorded from then on.
If you need to only record a particular function then set this
file first before starting the function graph tracer. In the future
this side effect may be corrected.

The set_graph_function file is similar to the set_ftrace_filter but
it does not take wild cards nor does it allow for more than one
function to be set with a single write. There is no technical reason why
this is the case, I just do not have the time yet to implement that.

Note, dynamic ftrace must be enabled for this to appear because it
uses the dynamic ftrace records to match the name to the mcount
call sites.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-04 09:09:34 +01:00