Commit Graph

4997 Commits (decbec3838d10ecd7aabdb4c0e05aac0e5f5dc0c)

Author SHA1 Message Date
Ingo Molnar 970987beb9 Merge branches 'tracing/ftrace', 'tracing/function-graph-tracer' and 'tracing/urgent' into tracing/core 2008-12-05 14:45:22 +01:00
Steven Rostedt 62679efe0a ftrace: add checks on ret stack in function graph
Import: robustness checks

Add more checks in the function graph code to detect errors and
perhaps print out better information if a bug happens.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-03 08:56:27 +01:00
Steven Rostedt e49dc19c6a ftrace: function graph return for function entry
Impact: feature, let entry function decide to trace or not

This patch lets the graph tracer entry function decide if the tracing
should be done at the end as well. This requires all function graph
entry functions return 1 if it should trace, or 0 if the return should
not be traced.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-03 08:56:26 +01:00
Steven Rostedt 7ee991fbc6 ftrace: print real return in dumpstack for function graph
Impact: better dumpstack output

I noticed in my crash dumps and even in the stack tracer that a
lot of functions listed in the stack trace are simply
return_to_handler which is ftrace graphs way to insert its own
call into the return of a function.

But we lose out where the actually function was called from.

This patch adds in hooks to the dumpstack mechanism that detects
this and finds the real function to print. Both are printed to
let the user know that a hook is still in place.

This does give a funny side effect in the stack tracer output:

        Depth   Size      Location    (80 entries)
        -----   ----      --------
  0)     4144      48   save_stack_trace+0x2f/0x4d
  1)     4096     128   ftrace_call+0x5/0x2b
  2)     3968      16   mempool_alloc_slab+0x16/0x18
  3)     3952     384   return_to_handler+0x0/0x73
  4)     3568    -240   stack_trace_call+0x11d/0x209
  5)     3808     144   return_to_handler+0x0/0x73
  6)     3664    -128   mempool_alloc+0x4d/0xfe
  7)     3792     128   return_to_handler+0x0/0x73
  8)     3664     -32   scsi_sg_alloc+0x48/0x4a [scsi_mod]

As you can see, the real functions are now negative. This is due
to them not being found inside the stack.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-03 08:56:25 +01:00
Steven Rostedt 14a866c567 ftrace: add ftrace_graph_stop()
Impact: new ftrace_graph_stop function

While developing more features of function graph, I hit a bug that
caused the WARN_ON to trigger in the prepare_ftrace_return function.
Well, it was hard for me to find out that was happening because the
bug would not print, it would just cause a hard lockup or reboot.
The reason is that it is not safe to call printk from this function.

Looking further, I also found that it calls unregister_ftrace_graph,
which grabs a mutex and calls kstop machine. This would definitely
lock the box up if it were to trigger.

This patch adds a fast and safe ftrace_graph_stop() which will
stop the function tracer. Then it is safe to call the WARN ON.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-03 08:56:23 +01:00
Steven Rostedt bb4304c71c ftrace: have function graph use mcount caller address
Impact: consistency change for function graph

This patch makes function graph record the mcount caller address
the same way the function tracer does.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-03 08:56:22 +01:00
Steven Rostedt 347fdd9dd4 ftrace: clean up function graph asm
Impact: clean up

There exists macros for x86 asm to handle x86_64 and i386.
This patch updates function graph asm to use them.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-03 08:56:21 +01:00
Ingo Molnar dfdc5437bd Merge commit 'v2.6.28-rc7'; branch 'x86/dumpstack' into tracing/ftrace
Merge x86/dumpstack into tracing/ftrace because upcoming ftrace changes
depend on cleanups already in x86/dumpstack.

Also merge to latest upstream -rc.
2008-12-03 08:55:34 +01:00
Linus Torvalds b7d6266062 Merge branch 'kvm-updates/2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm
* 'kvm-updates/2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm:
  KVM: MMU: avoid creation of unreachable pages in the shadow
  KVM: ppc: stop leaking host memory on VM exit
  KVM: MMU: fix sync of ptes addressed at owner pagetable
  KVM: ia64: Fix: Use correct calling convention for PAL_VPS_RESUME_HANDLER
  KVM: ia64: Fix incorrect kbuild CFLAGS override
  KVM: VMX: Fix interrupt loss during race with NMI
  KVM: s390: Fix problem state handling in guest sigp handler
2008-12-02 15:56:17 -08:00
Frederic Weisbecker 48d68b20d0 tracing/function-graph-tracer: support for x86-64
Impact: extend and enable the function graph tracer to 64-bit x86

This patch implements the support for function graph tracer under x86-64.
Both static and dynamic tracing are supported.

This causes some small CPP conditional asm on arch/x86/kernel/ftrace.c I
wanted to use probe_kernel_read/write to make the return address
saving/patching code more generic but it causes tracing recursion.

That would be perhaps useful to implement a notrace version of these
function for other archs ports.

Note that arch/x86/process_64.c is not traced, as in X86-32. I first
thought __switch_to() was responsible of crashes during tracing because I
believed current task were changed inside but that's actually not the
case (actually yes, but not the "current" pointer).

So I will have to investigate to find the functions that harm here, to
enable tracing of the other functions inside (but there is no issue at
this time, while process_64.c stays out of -pg flags).

A little possible race condition is fixed inside this patch too. When the
tracer allocate a return stack dynamically, the current depth is not
initialized before but after. An interrupt could occur at this time and,
after seeing that the return stack is allocated, the tracer could try to
trace it with a random uninitialized depth. It's a prevention, even if I
hadn't problems with it.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tim Bird <tim.bird@am.sony.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-02 09:47:48 +01:00
Linus Torvalds 72244c0e68 Merge branch 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  irq.h: fix missing/extra kernel-doc
  genirq: __irq_set_trigger: change pr_warning to pr_debug
  irq: fix typo
  x86: apic honour irq affinity which was set in early boot
  genirq: fix the affinity setting in setup_irq
  genirq: keep affinities set from userspace across free/request_irq()
2008-11-30 13:06:20 -08:00
Linus Torvalds 66a45cc4cc Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: always define DECLARE_PCI_UNMAP* macros
  x86: fixup config space size of CPU functions for AMD family 11h
  x86, bts: fix wrmsr and spinlock over kmalloc
  x86, pebs: fix PEBS record size configuration
  x86, bts: turn macro into static inline function
  x86, bts: exclude ds.c from build when disabled
  arch/x86/kernel/pci-calgary_64.c: change simple_strtol to simple_strtoul
  x86: use limited register constraint for setnz
  xen: pin correct PGD on suspend
  x86: revert irq number limitation
  x86: fixing __cpuinit/__init tangle, xsave_cntxt_init()
  x86: fix __cpuinit/__init tangle in init_thread_xstate()
  oprofile: fix an overflow in ppro code
2008-11-30 13:01:04 -08:00
Linus Torvalds 8c7b905a2d Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq:
  [CPUFREQ] powernow-k8: ignore out-of-range PstateStatus value
  [CPUFREQ] Documentation: Add Blackfin to list of supported processors
2008-11-30 11:43:41 -08:00
Christoph Hellwig 96b8936a9e remove __ARCH_WANT_COMPAT_SYS_PTRACE
All architectures now use the generic compat_sys_ptrace, as should every
new architecture that needs 32bit compat (if we'll ever get another).

Remove the now superflous __ARCH_WANT_COMPAT_SYS_PTRACE define, and also
kill a comment about __ARCH_SYS_PTRACE that was added after
__ARCH_SYS_PTRACE was already gone.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-30 11:00:15 -08:00
Al Viro df6b07949b xen_play_dead() is __cpuinit
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-30 10:03:38 -08:00
Al Viro 37af46efa5 xen_setup_vcpu_info_placement() is not init on x86
... so get xen-ops.h in agreement with xen/smp.c

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-30 10:03:37 -08:00
Al Viro 23a14b9e9d kvm_setup_secondary_clock() is cpuinit
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-30 10:03:37 -08:00
Al Viro 2236d252e0 enable_IR_x2apic() needs to be __init
calls __init, called only from __init

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-30 10:03:37 -08:00
Joerg Roedel b627c8b17c x86: always define DECLARE_PCI_UNMAP* macros
Impact: fix boot crash on AMD IOMMU if CONFIG_GART_IOMMU is off

Currently these macros evaluate to a no-op except the kernel is compiled
with GART or Calgary support. But we also need these macros when we have
SWIOTLB, VT-d or AMD IOMMU in the kernel. Since we always compile at
least with SWIOTLB we can define these macros always.

This patch is also for stable backport for the same reason the SWIOTLB
default selection patch is.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-27 12:44:08 +01:00
Ingo Molnar c7cc773076 Merge branches 'tracing/blktrace', 'tracing/ftrace', 'tracing/function-graph-tracer' and 'tracing/power-tracer' into tracing/core 2008-11-27 10:56:13 +01:00
Marcelo Tosatti 6c475352e8 KVM: MMU: avoid creation of unreachable pages in the shadow
It is possible for a shadow page to have a parent link
pointing to a freed page. When zapping a high level table,
kvm_mmu_page_unlink_children fails to remove the parent_pte link.
For that to happen, the child must be unreachable via the shadow
tree, which can happen in shadow_walk_entry if the guest pte was
modified in between walk() and fetch(). Remove the parent pte
reference in such case.

Possible cause for oops in bug #2217430.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-11-26 12:34:27 +02:00
Arjan van de Ven f3f47a6768 tracing: add "power-tracer": C/P state tracer to help power optimization
Impact: new "power-tracer" ftrace plugin

This patch adds a C/P-state ftrace plugin that will generate
detailed statistics about the C/P-states that are being used,
so that we can look at detailed decisions that the C/P-state
code is making, rather than the too high level "average"
that we have today.

An example way of using this is:

 mount -t debugfs none /sys/kernel/debug
 echo cstate > /sys/kernel/debug/tracing/current_tracer
 echo 1 > /sys/kernel/debug/tracing/tracing_enabled
 sleep 1
 echo 0 > /sys/kernel/debug/tracing/tracing_enabled
 cat /sys/kernel/debug/tracing/trace | perl scripts/trace/cstate.pl > out.svg

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-26 08:29:32 +01:00
Steven Rostedt 5a45cfe1c6 ftrace: use code patching for ftrace graph tracer
Impact: more efficient code for ftrace graph tracer

This patch uses the dynamic patching, when available, to patch
the function graph code into the kernel.

This patch will ease the way for letting both function tracing
and function graph tracing run together.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-26 06:52:54 +01:00
Andreas Herrmann ffd565a8b8 x86: fixup config space size of CPU functions for AMD family 11h
Impact: extend allowed configuration space access on 11h CPUs from 256 to 4K

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-26 03:53:27 +01:00
Ingo Molnar c2324b694f tracing: function graph tracer, fix
fix return-tracer => graph-tracer namespace rename fallout.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-26 03:10:01 +01:00
Frederic Weisbecker 287b6e68ca tracing/function-return-tracer: set a more human readable output
Impact: feature

This patch sets a C-like output for the function graph tracing.
For this aim, we now call two handler for each function: one on the entry
and one other on return. This way we can draw a well-ordered call stack.

The pid of the previous trace is loosely stored to be compared against
the one of the current trace to see if there were a context switch.

Without this little feature, the call tree would seem broken at
some locations.
We could use the sched_tracer to capture these sched_events but this
way of processing is much more simpler.

2 spaces have been chosen for indentation to fit the screen while deep
calls. The time of execution in nanosecs is printed just after closed
braces, it seems more easy this way to find the corresponding function.
If the time was printed as a first column, it would be not so easy to
find the corresponding function if it is called on a deep depth.

I plan to output the return value but on 32 bits CPU, the return value
can be 32 or 64, and its difficult to guess on which case we are.
I don't know what would be the better solution on X86-32: only print
eax (low-part) or even edx (high-part).

Actually it's thee same problem when a function return a 8 bits value, the
high part of eax could contain junk values...

Here is an example of trace:

sys_read() {
  fget_light() {
  } 526
  vfs_read() {
    rw_verify_area() {
      security_file_permission() {
        cap_file_permission() {
        } 519
      } 1564
    } 2640
    do_sync_read() {
      pipe_read() {
        __might_sleep() {
        } 511
        pipe_wait() {
          prepare_to_wait() {
          } 760
          deactivate_task() {
            dequeue_task() {
              dequeue_task_fair() {
                dequeue_entity() {
                  update_curr() {
                    update_min_vruntime() {
                    } 504
                  } 1587
                  clear_buddies() {
                  } 512
                  add_cfs_task_weight() {
                  } 519
                  update_min_vruntime() {
                  } 511
                } 5602
                dequeue_entity() {
                  update_curr() {
                    update_min_vruntime() {
                    } 496
                  } 1631
                  clear_buddies() {
                  } 496
                  update_min_vruntime() {
                  } 527
                } 4580
                hrtick_update() {
                  hrtick_start_fair() {
                  } 488
                } 1489
              } 13700
            } 14949
          } 16016
          msecs_to_jiffies() {
          } 496
          put_prev_task_fair() {
          } 504
          pick_next_task_fair() {
          } 489
          pick_next_task_rt() {
          } 496
          pick_next_task_fair() {
          } 489
          pick_next_task_idle() {
          } 489

------------8<---------- thread 4 ------------8<----------

finish_task_switch() {
} 1203
do_softirq() {
  __do_softirq() {
    __local_bh_disable() {
    } 669
    rcu_process_callbacks() {
      __rcu_process_callbacks() {
        cpu_quiet() {
          rcu_start_batch() {
          } 503
        } 1647
      } 3128
      __rcu_process_callbacks() {
      } 542
    } 5362
    _local_bh_enable() {
    } 587
  } 8880
} 9986
kthread_should_stop() {
} 669
deactivate_task() {
  dequeue_task() {
    dequeue_task_fair() {
      dequeue_entity() {
        update_curr() {
          calc_delta_mine() {
          } 511
          update_min_vruntime() {
          } 511
        } 2813

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-26 01:59:45 +01:00
Frederic Weisbecker fb52607afc tracing/function-return-tracer: change the name into function-graph-tracer
Impact: cleanup

This patch changes the name of the "return function tracer" into
function-graph-tracer which is a more suitable name for a tracing
which makes one able to retrieve the ordered call stack during
the code flow.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-26 01:59:45 +01:00
Andreas Herrmann a266d9f125 [CPUFREQ] powernow-k8: ignore out-of-range PstateStatus value
A workaround for AMD CPU family 11h erratum 311 might cause that the
P-state Status Register shows a "current P-state" which is larger than
the "current P-state limit" in P-state Current Limit Register. For the
wrong P-state value there is no ACPI _PSS object defined and
powernow-k8/cpufreq can't determine the proper CPU frequency for that
state.

As a consequence this can cause a panic during boot (potentially with
all recent kernel versions -- at least I have reproduced it with
various 2.6.27 kernels and with the current .28 series), as an
example:

powernow-k8: Found 1 AMD Turion(tm)X2 Ultra DualCore Mobile ZM-82 processors (2 \
)
powernow-k8:    0 : pstate 0 (2200 MHz)
powernow-k8:    1 : pstate 1 (1100 MHz)
powernow-k8:    2 : pstate 2 (600 MHz)
BUG: unable to handle kernel paging request at ffff88086e7528b8
IP: [<ffffffff80486361>] cpufreq_stats_update+0x4a/0x5f
PGD 202063 PUD 0
Oops: 0002 [#1] SMP
last sysfs file:
CPU 1
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.28-rc3-dirty #16
RIP: 0010:[<ffffffff80486361>]  [<ffffffff80486361>] cpufreq_stats_update+0x4a/0\
f
Synaptics claims to have extended capabilities, but I'm not able to read them.<6\
6
RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff88006e7528c0
RDX: 00000000ffffffff RSI: ffff88006e54af00 RDI: ffffffff808f056c
RBP: 00000000fffee697 R08: 0000000000000003 R09: ffff88006e73f080
R10: 0000000000000001 R11: 00000000002191c0 R12: ffff88006fb83c10
R13: 00000000ffffffff R14: 0000000000000001 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff88006fb50740(0000) knlGS:0000000000000000
Unable to initialize Synaptics hardware.
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: ffff88086e7528b8 CR3: 0000000000201000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 1, threadinfo ffff88006fb82000, task ffff88006fb816d0)
Stack:
 ffff88006e74da50 0000000000000000 ffff88006e54af00 ffffffff804863c7
 ffff88006e74da50 0000000000000000 00000000ffffffff 0000000000000000
 ffff88006fb83c10 ffffffff8024b46c ffffffff808f0560 ffff88006fb83c10
Call Trace:
 [<ffffffff804863c7>] ? cpufreq_stat_notifier_trans+0x51/0x83
 [<ffffffff8024b46c>] ? notifier_call_chain+0x29/0x4c
 [<ffffffff8024b561>] ? __srcu_notifier_call_chain+0x46/0x61
 [<ffffffff8048496d>] ? cpufreq_notify_transition+0x93/0xa9
 [<ffffffff8021ab8d>] ? powernowk8_target+0x1e8/0x5f3
 [<ffffffff80486687>] ? cpufreq_governor_performance+0x1b/0x20
 [<ffffffff80484886>] ? __cpufreq_governor+0x71/0xa8
 [<ffffffff80484b21>] ? __cpufreq_set_policy+0x101/0x13e
 [<ffffffff80485bcd>] ? cpufreq_add_dev+0x3f0/0x4cd
 [<ffffffff8048577a>] ? handle_update+0x0/0x8
 [<ffffffff803c2062>] ? sysdev_driver_register+0xb6/0x10d
 [<ffffffff8056592c>] ? powernowk8_init+0x0/0x7e
 [<ffffffff8048604c>] ? cpufreq_register_driver+0x8f/0x140
 [<ffffffff80209056>] ? _stext+0x56/0x14f
 [<ffffffff802c2234>] ? proc_register+0x122/0x17d
 [<ffffffff802c23a0>] ? create_proc_entry+0x73/0x8a
 [<ffffffff8025c259>] ? register_irq_proc+0x92/0xaa
 [<ffffffff8025c2c8>] ? init_irq_proc+0x57/0x69
 [<ffffffff807fc85f>] ? kernel_init+0x116/0x169
 [<ffffffff8020cc79>] ? child_rip+0xa/0x11
 [<ffffffff807fc749>] ? kernel_init+0x0/0x169
 [<ffffffff8020cc6f>] ? child_rip+0x0/0x11
Code: 05 c5 83 36 00 48 c7 c2 48 5d 86 80 48 8b 04 d8 48 8b 40 08 48 8b 34 02 48\

RIP  [<ffffffff80486361>] cpufreq_stats_update+0x4a/0x5f
 RSP <ffff88006fb83b20>
CR2: ffff88086e7528b8
---[ end trace 0678bac75e67a2f7 ]---
Kernel panic - not syncing: Attempted to kill init!

In short, aftereffect of the wrong P-state is that
cpufreq_stats_update() uses "-1" as index for some array in

cpufreq_stats_update (unsigned int cpu)
{
...
     if (stat->time_in_state)
                stat->time_in_state[stat->last_index] =
                        cputime64_add(stat->time_in_state[stat->last_index],
                                      cputime_sub(cur_time, stat->last_time));
...
}

Fortunately, the wrong P-state value is returned only if the core is
in P-state 0. This fix solves the problem by detecting the
out-of-range P-state, ignoring it, and using "0" instead.

Cc: Mark Langsdorf <mark.langsdorf@amd.com>
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Signed-off-by: Dave Jones <davej@redhat.com>
2008-11-25 13:38:29 -05:00
Markus Metzger 1e9b51c283 x86, bts, ftrace: a BTS ftrace plug-in prototype
Impact: add new ftrace plugin

A prototype for a BTS ftrace plug-in.

The tracer collects branch trace in a cyclic buffer for each cpu.

The tracer is not configurable and the trace for each snapshot is
appended when doing cat /debug/tracing/trace.

This is a proof of concept that will be extended with future patches
to become a (hopefully) useful tool.

Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-25 17:31:13 +01:00
Markus Metzger 6abb11aecd x86, bts, ptrace: move BTS buffer allocation from ds.c into ptrace.c
Impact: restructure DS memory allocation to be done by the usage site of DS

Require pre-allocated buffers in ds.h.

Move the BTS buffer allocation for ptrace into ptrace.c.
The pointer to the allocated buffer is stored in the traced task's
task_struct together with the handle returned by ds_request_bts().

Removes memory accounting code.

Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-25 17:31:12 +01:00
Markus Metzger ca0002a179 x86, bts: base in-kernel ds interface on handles
Impact: generalize the DS code to shared buffers

Change the in-kernel ds.h interface to identify the tracer via a
handle returned on ds_request_~().

Tracers used to be identified via their task_struct.

The changes are required to allow DS to be shared between different
tasks, which is needed for perfmon2 and for ftrace.

For ptrace, the handle is stored in the traced task's task_struct.
This should probably go into a (arch-specific) ptrace context some
time.

Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-25 17:31:11 +01:00
Ingo Molnar 7d55718b0c Merge branches 'tracing/core', 'x86/urgent' and 'x86/ptrace' into tracing/hw-branch-tracing
This pulls together all the topic branches that are needed
for the DS/BTS/PEBS tracing work.
2008-11-25 17:30:30 +01:00
Markus Metzger de90add30e x86, bts: fix wrmsr and spinlock over kmalloc
Impact: fix sleeping-with-spinlock-held bugs/crashes

- Turn a wrmsr to write the DS_AREA MSR into a wrmsrl.
- Use irqsave variants of spinlocks.
- Do not allocate memory while holding spinlocks.

Reported-by: Stephane Eranian <eranian@googlemail.com>
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-25 17:29:02 +01:00
Markus Metzger c4858ffc8f x86, pebs: fix PEBS record size configuration
Impact: fix DS hw enablement on 64-bit x86

Fix the PEBS record size in the DS configuration.

Reported-by: Stephane Eranian <eranian@googlemail.com>
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-25 17:28:53 +01:00
Markus Metzger e5e8ca633b x86, bts: turn macro into static inline function
Impact: cleanup

Replace a macro with a static inline function.

Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-25 17:28:51 +01:00
Markus Metzger 292c669cd7 x86, bts: exclude ds.c from build when disabled
Impact: cleanup

Move the CONFIG guard from the .c file into the makefile.

Reported-by: Andi Kleen <andi-suse@firstfloor.org>
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-25 17:28:50 +01:00
Julia Lawall eff79aee91 arch/x86/kernel/pci-calgary_64.c: change simple_strtol to simple_strtoul
Impact: fix theoretical option string parsing overflow

Since bridge is unsigned, it would seem better to use simple_strtoul that
simple_strtol.

A simplified version of the semantic patch that makes this change is as
follows: (http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@r2@
long e;
position p;
@@

e = simple_strtol@p(...)

@@
position p != r2.p;
type T;
T e;
@@

e =
- simple_strtol@p
+ simple_strtoul
  (...)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Cc: muli@il.ibm.com
Cc: jdmason@kudzu.us
Cc: discuss@x86-64.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-25 15:56:03 +01:00
Steven Rostedt 5cf02b7baf x86: use limited register constraint for setnz
Impact: build fix with certain compilers

GCC can decide to use %dil when "r" is used, which is not valid for
setnz.

This bug was brought out by Stephen Rothwell's merging of the
branch tracer into linux-next.

[ Thanks to Uros Bizjak for recommending 'q' over 'Q' ]

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-25 15:38:03 +01:00
Ingo Molnar 6f893fb2e8 Merge branches 'tracing/branch-tracer', 'tracing/fastboot', 'tracing/ftrace', 'tracing/function-return-tracer', 'tracing/power-tracer', 'tracing/powerpc', 'tracing/ring-buffer', 'tracing/stack-tracer' and 'tracing/urgent' into tracing/core 2008-11-24 17:46:24 +01:00
Marcelo Tosatti 0c0f40bdbe KVM: MMU: fix sync of ptes addressed at owner pagetable
During page sync, if a pagetable contains a self referencing pte (that
points to the pagetable), the corresponding spte may be marked as
writable even though all mappings are supposed to be write protected.

Fix by clearing page unsync before syncing individual sptes.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2008-11-23 15:24:19 +02:00
Avi Kivity bd2b3ca768 KVM: VMX: Fix interrupt loss during race with NMI
If an interrupt cannot be injected for some reason (say, page fault
when fetching the IDT descriptor), the interrupt is marked for
reinjection.  However, if an NMI is queued at this time, the NMI
will be injected instead and the NMI will be lost.

Fix by deferring the NMI injection until the interrupt has been
injected successfully.

Analyzed by Jan Kiszka.

Signed-off-by: Avi Kivity <avi@redhat.com>
2008-11-23 14:52:29 +02:00
Ian Campbell 86bbc2c235 xen: pin correct PGD on suspend
Impact: fix Xen guest boot failure

commit eefb47f6a1 ("xen: use
spin_lock_nest_lock when pinning a pagetable") changed xen_pgd_walk to
walk over mm->pgd rather than taking pgd as an argument.

This breaks xen_mm_(un)pin_all() because it makes init_mm.pgd readonly
instead of the pgd we are interested in and therefore the pin subsequently
fails.

(XEN) mm.c:2280:d15 Bad type (saw 00000000e8000001 != exp 0000000060000000) for mfn bc464 (pfn 21ca7)
(XEN) mm.c:2665:d15 Error while pinning mfn bc464

[   14.586913] 1 multicall(s) failed: cpu 0
[   14.586926] Pid: 14, comm: kstop/0 Not tainted 2.6.28-rc5-x86_32p-xenU-00172-gee2f6cc #200
[   14.586940] Call Trace:
[   14.586955]  [<c030c17a>] ? printk+0x18/0x1e
[   14.586972]  [<c0103df3>] xen_mc_flush+0x163/0x1d0
[   14.586986]  [<c0104bc1>] __xen_pgd_pin+0xa1/0x110
[   14.587000]  [<c015a330>] ? stop_cpu+0x0/0xf0
[   14.587015]  [<c0104d7b>] xen_mm_pin_all+0x4b/0x70
[   14.587029]  [<c022bcb9>] xen_suspend+0x39/0xe0
[   14.587042]  [<c015a330>] ? stop_cpu+0x0/0xf0
[   14.587054]  [<c015a3cd>] stop_cpu+0x9d/0xf0
[   14.587067]  [<c01417cd>] run_workqueue+0x8d/0x150
[   14.587080]  [<c030e4b3>] ? _spin_unlock_irqrestore+0x23/0x40
[   14.587094]  [<c014558a>] ? prepare_to_wait+0x3a/0x70
[   14.587107]  [<c0141918>] worker_thread+0x88/0xf0
[   14.587120]  [<c01453c0>] ? autoremove_wake_function+0x0/0x50
[   14.587133]  [<c0141890>] ? worker_thread+0x0/0xf0
[   14.587146]  [<c014509c>] kthread+0x3c/0x70
[   14.587157]  [<c0145060>] ? kthread+0x0/0x70
[   14.587170]  [<c0109d1b>] kernel_thread_helper+0x7/0x10
[   14.587181]   call  1/3: op=14 arg=[c0415000] result=0
[   14.587192]   call  2/3: op=14 arg=[e1ca2000] result=0
[   14.587204]   call  3/3: op=26 arg=[c1808860] result=-22

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-23 13:32:24 +01:00
Ingo Molnar 3d994e1076 Merge branch 'oprofile-for-tip' of git://git.kernel.org/pub/scm/linux/kernel/git/rric/oprofile into x86/urgent 2008-11-23 12:16:57 +01:00
Thomas Gleixner a1967d6441 x86: revert irq number limitation
Impact: fix MSIx not enough irq numbers available regression

The manual revert of the sparse_irq patches missed to bring the number
of possible irqs back to the .27 status. This resulted in a regression
when two multichannel network cards were placed in a system with only
one IO_APIC - causing the networking driver to not have the right
IRQ and the device not coming up.

Remove the dynamic allocation logic leftovers and simply return
NR_IRQS in probe_nr_irqs() for now.

   Fixes: http://lkml.org/lkml/2008/11/19/354

Reported-by: Jesper Dangaard Brouer <hawk@diku.dk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Jesper Dangaard Brouer <hawk@diku.dk>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-23 11:59:52 +01:00
Török Edwin 8d26487fd4 tracing/stack-tracer: introduce CONFIG_USER_STACKTRACE_SUPPORT
Impact: cleanup

User stack tracing is just implemented for x86, but it is not x86 specific.

Introduce a generic config flag, that is currently enabled only for x86.
When other arches implement it, they will have to
SELECT USER_STACKTRACE_SUPPORT.

Signed-off-by: Török Edwin <edwintorok@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-23 11:53:50 +01:00
Török Edwin 8d7c6a9616 tracing/stack-tracer: fix style issues
Impact: cleanup

Signed-off-by: Török Edwin <edwintorok@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-23 11:53:48 +01:00
Ingo Molnar 9f14416442 Merge commit 'v2.6.28-rc6' into irq/urgent 2008-11-23 10:52:33 +01:00
Török Edwin 02b67518e2 tracing: add support for userspace stacktraces in tracing/iter_ctrl
Impact: add new (default-off) tracing visualization feature

Usage example:

 mount -t debugfs nodev /sys/kernel/debug
 cd /sys/kernel/debug/tracing
 echo userstacktrace >iter_ctrl
 echo sched_switch >current_tracer
 echo 1 >tracing_enabled
 .... run application ...
 echo 0 >tracing_enabled

Then read one of 'trace','latency_trace','trace_pipe'.

To get the best output you can compile your userspace programs with
frame pointers (at least glibc + the app you are tracing).

Signed-off-by: Török Edwin <edwintorok@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-23 09:25:15 +01:00
Frederic Weisbecker f201ae2356 tracing/function-return-tracer: store return stack into task_struct and allocate it dynamically
Impact: use deeper function tracing depth safely

Some tests showed that function return tracing needed a more deeper depth
of function calls. But it could be unsafe to store these return addresses
to the stack.

So these arrays will now be allocated dynamically into task_struct of current
only when the tracer is activated.

Typical scheme when tracer is activated:
- allocate a return stack for each task in global list.
- fork: allocate the return stack for the newly created task
- exit: free return stack of current
- idle init: same as fork

I chose a default depth of 50. I don't have overruns anymore.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-23 09:17:26 +01:00
Ingo Molnar a0a70c735e Merge branches 'tracing/profiling', 'tracing/options' and 'tracing/urgent' into tracing/core 2008-11-23 09:10:32 +01:00