Commit graph

41 commits

Author SHA1 Message Date
Don Zickus
2bce5daca2 perf, x86, nmi: Move LVT un-masking into irq handlers
It was noticed that P4 machines were generating double NMIs for
each perf event.  These extra NMIs lead to 'Dazed and confused'
messages on the screen.

I tracked this down to a P4 quirk that said the overflow bit had
to be cleared before re-enabling the apic LVT mask.  My first
attempt was to move the un-masking inside the perf nmi handler
from before the chipset NMI handler to after.

This broke Nehalem boxes that seem to like the unmasking before
the counters themselves are re-enabled.

In order to keep this change simple for 2.6.39, I decided to
simply move the apic LVT un-masking to the beginning of all
the chipset NMI handlers, with the exception of Pentium4's, to
fix the double NMI issue.

Later on we can move the un-masking to later in the handlers to
save a number of 'extra' NMIs on those particular chipsets.
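
A minimal sketch of the resulting shape (hedged; the handler and the
apic_write() call are as they existed around 2.6.39, the comment
paraphrases this commit):

	static int intel_pmu_handle_irq(struct pt_regs *regs)
	{
		int handled = 0;

		/*
		 * Some chipsets need to unmask the LVTPC in a particular
		 * spot inside the nmi handler.  As a result, the unmasking
		 * was pushed into all the chipset nmi handlers, with the
		 * exception of P4 which must clear the overflow bit first.
		 */
		apic_write(APIC_LVTPC, APIC_DM_NMI);

		/* ... process counter overflows, bumping 'handled' ... */

		return handled;
	}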

I tested this change on a P4 machine, an AMD machine, a Nehalem
box, and a core2quad box.  'perf top' worked correctly along
with various other small 'perf record' runs.  Anything high
stress breaks all the machines but that is a different problem.

Thanks to various people for testing different versions of this
patch.

Reported-and-tested-by: Shaun Ruffell <sruffell@digium.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Link: http://lkml.kernel.org/r/1303900353-10242-1-git-send-email-dzickus@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-27 17:59:11 +02:00
Cyrill Gorcunov
1ea5a6afd9 perf, x86: P4 PMU - Don't forget to clear cpuc->active_mask on overflow
It's not enough to simply disable the event on overflow:
cpuc->active_mask should be cleared as well, otherwise the counter
may stall as "active" while in reality being already disabled (which
potentially may lead to a situation where the user can not use this
counter further).
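
The fix boils down to stopping the event through the common helper
instead of the bare disable -- a hedged sketch of the overflow path in
p4_pmu_handle_irq() at that time:

	if (perf_event_overflow(event, 1, &data, regs))
		/*
		 * unlike p4_pmu_disable_event(), x86_pmu_stop() also
		 * clears the event's bit in cpuc->active_mask
		 */
		x86_pmu_stop(event, 0);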

Don pointed out that:

 " I also noticed this patch fixed some unknown NMIs
   on a P4 when I stressed the box".

Tested-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Link: http://lkml.kernel.org/r/1303398203-2918-3-git-send-email-dzickus@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-22 10:21:34 +02:00
Linus Torvalds
2a20b02c05 Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf, x86: Complain louder about BIOSen corrupting CPU/PMU state and continue
  perf, x86: P4 PMU - Read proper MSR register to catch unflagged overflows
  perf symbols: Look at .dynsym again if .symtab not found
  perf build-id: Add quirk to deal with perf.data file format breakage
  perf session: Pass evsel in event_ops->sample()
  perf: Better fit max unprivileged mlock pages for tools needs
  perf_events: Fix stale ->cgrp pointer in update_cgrp_time_from_cpuctx()
  perf top: Fix uninitialized 'counter' variable
  tracing: Fix set_ftrace_filter probe function display
  perf, x86: Fix Intel fixed counters base initialization
2011-03-25 17:53:09 -07:00
Don Zickus
242214f9c1 perf, x86: P4 PMU - Read proper MSR register to catch unflagged overflows
The read of the proper MSR register was missed and instead of
the counter the configuration register was tested (it has
ARCH_P4_UNFLAGGED_BIT always cleared), leading to unknown NMIs
hitting the system. As a result the user may obtain "Dazed and
confused, but trying to continue" messages. Fix it by reading the
proper MSR register.

When an NMI happens on a P4, the perf nmi handler checks the
configuration register to see if the overflow bit is set or not
before taking appropriate action.  Unfortunately, various P4
machines had a broken overflow bit, so a backup mechanism was
implemented.  This mechanism checked to see if the counter
rolled over or not.

A previous commit that implemented this backup mechanism was
broken. Instead of reading the counter register, it used the
configuration register to determine if the counter rolled over
or not. Reading that bit would give incorrect results.

This would lead to 'Dazed and confused' messages for the end
user when using the perf tool (or if the nmi watchdog is
running).

The fix is to read the counter register before determining if
the counter rolled over or not.
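
A hedged sketch of the corrected check (identifiers as in the p4
driver of that time):

	static inline int p4_pmu_clear_cccr_ovf(struct hw_perf_event *hwc)
	{
		u64 v;

		/* the official way: the overflow flag in CCCR */
		rdmsrl(hwc->config_base, v);
		if (v & P4_CCCR_OVF) {
			wrmsrl(hwc->config_base, v & ~P4_CCCR_OVF);
			return 1;
		}

		/*
		 * The backup mechanism must read the counter itself
		 * (event_base), not the configuration register: a cleared
		 * high bit means the counter rolled over
		 */
		rdmsrl(hwc->event_base, v);
		if (!(v & ARCH_P4_UNFLAGGED_BIT))
			return 1;

		return 0;
	}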

Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Lin Ming <ming.m.lin@intel.com>
LKML-Reference: <4D8BAB49.3080701@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-24 21:40:01 +01:00
Lucas De Marchi
0d2eb44f63 x86: Fix common misspellings
They were generated by 'codespell' and then manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
Cc: trivial@kernel.org
LKML-Reference: <1300389856-1099-3-git-send-email-lucas.demarchi@profusion.mobi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-18 10:39:30 +01:00
Robert Richter
73d6e52206 perf, x86: Store perfctr msr addresses in config_base/event_base
Instead of storing the base addresses, we can store the counters' msr
addresses directly in config_base/event_base of struct hw_perf_event.
This avoids recalculating the address on each msr access. The
addresses are configured once. We also need this change to later
modify the address calculation.
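
Roughly, the setup then becomes a one-time assignment (hedged sketch;
the helper names follow this series):

	static inline unsigned int x86_pmu_config_addr(int index)
	{
		return x86_pmu.eventsel + index;
	}

	static inline unsigned int x86_pmu_event_addr(int index)
	{
		return x86_pmu.perfctr + index;
	}

	/* at event assignment time, instead of recomputing per access: */
	hwc->config_base = x86_pmu_config_addr(hwc->idx);
	hwc->event_base  = x86_pmu_event_addr(hwc->idx);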

Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1296664860-10886-5-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-16 13:30:52 +01:00
Cyrill Gorcunov
7d44ec193d perf, x86: P4 PMU: Fix spurious NMI messages
Several people have reported spurious unknown NMI
messages on some P4 CPUs.

This patch fixes it by checking for an overflow (negative
counter values) directly, instead of relying on the
P4_CCCR_OVF bit.

Reported-by: George Spelvin <linux@horizon.com>
Reported-by: Meelis Roos <mroos@linux.ee>
Reported-by: Don Zickus <dzickus@redhat.com>
Reported-by: Dave Airlie <airlied@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <AANLkTinfuTfCck_FfaOHrDqQZZehtRzkBum4SpFoO=KJ@mail.gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-16 12:26:12 +01:00
Stephane Eranian
d038b12c6d perf: Fix Pentium4 raw event validation
This patch fixes some issues with raw event validation on
Pentium 4 (Netburst) based processors.

As I was testing libpfm4 Netburst support, I ran into two
problems in the p4_validate_raw_event() function:

   - the shared field must be checked ONLY when HT is on
   - the binding to the ESCR register was missing

The second item was causing raw events to not be encoded
correctly compared to generic PMU events.

With this patch, I can now pass Netburst events to libpfm4
examples and get meaningful results:

  $ task -e global_power_events:running:u  noploop 1
  noploop for 1 seconds
  3,206,304,898 global_power_events:running
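
The first fix, sketched (hedged; p4_ht_active() and the bind map's
'shared' flag follow the p4 driver of that time):

	/* inside p4_validate_raw_event(); 'v' is the unpacked event index */
	if (p4_ht_active() && p4_event_bind_map[v].shared) {
		/* shared events stay guarded by perf_event_paranoid */
		if (perf_paranoid_cpu() && !capable(CAP_SYS_ADMIN))
			return -EACCES;
	}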

Signed-off-by: Stephane Eranian <eranian@google.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: peterz@infradead.org
Cc: paulus@samba.org
Cc: davem@davemloft.net
Cc: fweisbec@gmail.com
Cc: perfmon2-devel@lists.sf.net
Cc: eranian@gmail.com
Cc: robert.richter@amd.com
Cc: acme@redhat.com
Cc: gorcunov@gmail.com
Cc: ming.m.lin@intel.com
LKML-Reference: <4d3efb2f.1252d80a.1a80.ffffc83f@mx.google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-27 19:21:53 +01:00
Cyrill Gorcunov
047a3772fe perf, x86: P4 PMU - Fix unflagged overflows handling
Don found that the P4 PMU reads the CCCR register instead of the
counter itself (in an attempt to catch an unflagged event); this
makes the P4 NMI handler consume all NMIs it observes. So the
other NMI users such as kgdb simply have no chance to get the
NMI in their hands.

Side note: at the moment there is no way to run the nmi-watchdog
together with the perf tool. This is because both 'perf top' and the
nmi-watchdog use the same event. So while the nmi-watchdog reserves
one event/counter for its own needs, there is no room left for the
perf tool (there is a way to disable the nmi-watchdog at boot, of course).

Ming has tested this patch with the following results:

 | 1. watchdog disabled
 |
 | kgdb tests on boot OK
 | perf works OK
 |
 | 2. watchdog enabled, without patch perf-x86-p4-nmi-4
 |
 | kgdb tests on boot hang
 |
 | 3. watchdog enabled, without patch perf-x86-p4-nmi-4 and do not run kgdb
 | tests on boot
 |
 | "perf top" partialy works
 |   cpu-cycles            no
 |   instructions          yes
 |   cache-references      no
 |   cache-misses          no
 |   branch-instructions   no
 |   branch-misses         yes
 |   bus-cycles            no
 |
 | 4. watchdog enabled, with patch perf-x86-p4-nmi-4 applied
 |
 | kgdb tests on boot OK
 | perf does not work, NMI "Dazed and confused" messages show up
 |

Which means we still have problems with the p4 box due to 'unknown'
NMIs happening, but at least it should fix the kgdb test cases.

Reported-by: Jason Wessel <jason.wessel@windriver.com>
Reported-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Don Zickus <dzickus@redhat.com>
Acked-by: Lin Ming <ming.m.lin@intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <4D275E7E.3040903@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-09 10:40:52 +01:00
Ingo Molnar
00e8976200 Merge branch 'perf/urgent' into perf/core
Conflicts:
	tools/perf/util/ui/browsers/hists.c

Merge reason: fix the conflict and merge in changes for dependent patch.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-10-05 09:47:14 +02:00
Cyrill Gorcunov
03e22198d2 perf, x86: Handle in flight NMIs on P4 platform
Stephane reported that we forgot to guard the P4 platform
against spurious in-flight performance IRQs. Fix it.

This fixes potential spurious 'dazed and confused' NMI
messages.

Reported-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: fweisbec@gmail.com
Cc: peterz@infradead.org
Cc: Robert Richter <robert.richter@amd.com>
Cc: Lin Ming <ming.m.lin@intel.com>
LKML-Reference: <1285815698-4298-1-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-30 09:17:59 +02:00
Ingo Molnar
2aa61274ef Merge branch 'perf/urgent' into perf/core
Merge reason: Pick up pending fixes before applying dependent new changes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-09 20:40:08 +02:00
Peter Zijlstra
de725dec9d perf, x86: Fix handle_irq return values
Now that we rely on the number of handled overflows, ensure all
handle_irq implementations actually return the right number.
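
For the P4 handler the essence is to return the number of serviced
overflows from p4_pmu_handle_irq() -- a hedged sketch:

	/*
	 * report how many overflows were actually serviced, so the
	 * NMI code can decide how many extra NMIs to swallow
	 */
	return handled;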

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: peterz@infradead.org
Cc: robert.richter@amd.com
Cc: gorcunov@gmail.com
Cc: fweisbec@gmail.com
Cc: ying.huang@intel.com
Cc: ming.m.lin@intel.com
Cc: eranian@google.com
LKML-Reference: <1283454469-1909-4-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-03 08:05:18 +02:00
Cyrill Gorcunov
c9cf4a019c perf, x86, Pentium4: Add RAW events verification
Implements verification of

- Bits of the ESCR EventMask field (meaningful bits in the field are
  hardware predefined and other bits should be set to zero) -- sketched
  below

- The INSTR_COMPLETED event (it is available on predefined cpu models only)

- Thread-shared events (they should be guarded by the "perf_event_paranoid"
  sysctl for security reasons). The side effect of this action is
  that PERF_COUNT_HW_BUS_CYCLES becomes a "paranoid" general event.
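
A hedged sketch of the EventMask verification inside
p4_validate_raw_event() (the unpack helper and the escr_emask field
are illustrative names, not guaranteed verbatim):

	/* only hardware-predefined EventMask bits may be set */
	emask = p4_config_unpack_emask(event->attr.config);
	if (emask & ~p4_event_bind_map[v].escr_emask)
		return -EINVAL;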

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Tested-by: Lin Ming <ming.m.lin@intel.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20100825182334.GB14874@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-01 08:26:56 +02:00
Lin Ming
8d33091992 perf, x86, Pentium4: Clear the P4_CCCR_FORCE_OVF flag
If on Pentium4 CPUs the FORCE_OVF flag is set then an NMI happens
on every event, which can generate a flood of NMIs. Clear it.
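
The fix is essentially clearing the flag when the event config is set
up -- a hedged one-line sketch:

	/* P4_CCCR_FORCE_OVF would raise an NMI on every event: clear it */
	event->hw.config &= ~P4_CCCR_FORCE_OVF;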

Reported-by: Vince Weaver <vweaver1@eecs.utk.edu>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-08-25 15:15:33 +02:00
Cyrill Gorcunov
1c250d709f perf, x86: P4 PMU -- update nmi irq statistics and unmask lvt entry properly
If the last active performance counter has not overflowed at the
moment an NMI is triggered by another counter, the irq
statistics may miss an update stage. As a more serious
consequence -- the apic quirk may not be triggered, so the apic
LVT entry stays masked.

Tested-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20100805150917.GA6311@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-08-08 22:53:50 +02:00
Cyrill Gorcunov
39ef13a4ac perf, x86: P4 PMU -- redesign cache events
To support cache events we have reserved the low 6 bits in
hw_perf_event::config (which is actually a part of the CCCR
register configuration).

These bits represent the Replay Event metric enumerated in
enum P4_PEBS_METRIC (see the enum below). The caller should not
care about which exact bits should be set and how -- the caller
just chooses one P4_PEBS_METRIC entity and puts it into
the config. The kernel will track it and set the appropriate
additional MSR registers (metrics) when needed.

The reason for this redesign was the PEBS enable bit, which
should not be set until DS (and PEBS sampling) support is
implemented properly.
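
The metric enumeration this change introduces (hedged; reproduced as
in the series' perf_event_p4.h):

	enum P4_PEBS_METRIC {
		P4_PEBS_METRIC__none,

		P4_PEBS_METRIC__1stl_cache_load_miss_retired,
		P4_PEBS_METRIC__2ndl_cache_load_miss_retired,
		P4_PEBS_METRIC__dtlb_load_miss_retired,
		P4_PEBS_METRIC__dtlb_store_miss_retired,
		P4_PEBS_METRIC__dtlb_all_miss_retired,
		P4_PEBS_METRIC__tagged_mispred_branch,
		P4_PEBS_METRIC__mob_load_replay_retired,
		P4_PEBS_METRIC__split_load_retired,
		P4_PEBS_METRIC__split_store_retired,

		P4_PEBS_METRIC__max
	};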

TODO
====

 - PEBS sampling (note it's tricky and works with _one_ counter only,
   so for HT machines it will not be that easy to handle both threads)

 - tracking of PEBS register state; a user might need to turn
   PEBS off completely (ie no PEBS enable, no UOP_tag), but some
   other event may need it -- such events clash and should not
   run simultaneously, and at the moment we just don't support such events

 - eventually export the user space bits in a separate header, which will
   allow user apps to configure raw events more conveniently.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1278295769.9540.15.camel@minggr.sh.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-07-05 08:34:36 +02:00
Cyrill Gorcunov
68aa00ac0a perf, x86: Make a second write to performance counter if needed
On the Netburst PMU we need a second write to a performance counter
due to a cpu erratum.

A simple flag test was chosen instead of alternative instructions
because wrmsrl is already a macro, and if virtualization is turned
on it would need an additional wrapper call, which is more expensive.

nb: we should probably switch to jump labels once that facility
reaches mainline.
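
The relevant hunk in x86_perf_event_set_period(), sketched (hedged):

	wrmsrl(hwc->event_base + idx, (u64)(-left) & x86_pmu.cntval_mask);

	/*
	 * Due to erratum on certain cpus we need a second write to be
	 * sure the register is updated properly
	 */
	if (x86_pmu.perfctr_second_write) {
		wrmsrl(hwc->event_base + idx,
		       (u64)(-left) & x86_pmu.cntval_mask);
	}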

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20100602212304.GC5264@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-06-09 11:12:35 +02:00
Cyrill Gorcunov
9d36dfcf21 perf, x86: P4_pmu_schedule_events -- use smp_processor_id instead of raw_
This snippet somehow escaped the commit:

 | commit 137351e0fe
 | Author: Cyrill Gorcunov <gorcunov@openvz.org>
 | Date:   Sat May 8 15:25:52 2010 +0400
 |
 |    x86, perf: P4 PMU -- protect sensible procedures from preemption

so eventually bring it back. It helps to catch
preemption issues (if there ever are any; rule of thumb --
don't use raw_ if you can avoid it).

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20100518212439.167259349@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-19 09:41:05 +02:00
Cyrill Gorcunov
623aab896e perf, x86: P4 PMU -- do a real check for ESCR address being in hash
To prevent clashes in future code modifications,
do a real check for the ESCR address being in the hash. At
the moment the callers are known to pass sane values, but
better to be on the safe side.

And a comment fix.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
CC: Lin Ming <ming.m.lin@intel.com>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20100518212439.004503600@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-19 09:41:05 +02:00
Cyrill Gorcunov
ef4f30f54e perf, x86: P4 PMU -- fix typo in unflagged NMI handling
Tested-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
LKML-Reference: <1274174954.22793.17.camel@minggr.sh.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-18 12:05:20 +02:00
Cyrill Gorcunov
0db1a7bc00 perf, x86: P4 PMU -- handle unflagged events
It might happen that an event can overflow without
the proper overflow flag set. Check the sign bit in
the raw counter value to solve this problem.

Tested-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: fweisbec@gmail.com
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1274083984.6540.15.camel@minggr.sh.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-18 08:25:34 +02:00
Cyrill Gorcunov
1ff3d7d792 x86, perf: P4 PMU - fix counters management logic
Jaswinder reported this #GP:

 |
 | Message from syslogd@ht at May 14 09:39:32 ...
 | kernel:[  314.908612] EIP: [<c100ccca>]
 | x86_perf_event_set_period+0x19d/0x1b2 SS:ESP 0068:edac3d70
 |

Ming has narrowed it down to a comparison issue
between arguments with different sizes and
signs. As a result the event index reached a wrong
value which in turn led to a GP fault.

At the same time it was found that p4_next_cntr
had broken logic: it should return the counter
index only if it was not yet borrowed for
another event.
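
The fixed helper, sketched (hedged; a counter index is returned only
when it is valid and not yet taken):

	static int p4_next_cntr(int thread, unsigned long *used_mask,
				struct p4_event_bind *bind)
	{
		int i, j;

		for (i = 0; i < P4_CNTR_LIMIT; i++) {
			j = bind->cntr[thread][i];
			if (j != -1 && !test_bit(j, used_mask))
				return j;
		}

		return -1;
	}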

Reported-by: Jaswinder Singh Rajput <jaswinderlinux@gmail.com>
Reported-by: Lin Ming <ming.m.lin@intel.com>
Bisected-by: Lin Ming <ming.m.lin@intel.com>
Tested-by: Jaswinder Singh Rajput <jaswinderlinux@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20100514190815.GG13509@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-15 08:38:55 +02:00
Cyrill Gorcunov
720019908f x86, perf: P4 PMU -- use hash for p4_get_escr_idx()
Linear search over all p4 MSRs would be fine if only
we did not use it in the event scheduling routine, which
is pretty time critical. Let's use hashes; it should speed
scheduling up significantly.

v2: Steven proposed to use a more gentle approach than issuing a
    BUG on error, so we use WARN_ONCE now
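
The "hash" is in fact a direct-index table over the ESCR MSR range --
sketched (hedged; constants as in the patch):

	#define P4_ESCR_MSR_BASE	0x000003a0
	#define P4_ESCR_MSR_MAX		0x000003e1
	#define P4_ESCR_MSR_TABLE_SIZE	(P4_ESCR_MSR_MAX - P4_ESCR_MSR_BASE + 1)
	#define P4_ESCR_MSR_IDX(msr)	(msr - P4_ESCR_MSR_BASE)

	static int p4_get_escr_idx(unsigned int addr)
	{
		unsigned int idx = P4_ESCR_MSR_IDX(addr);

		if (unlikely(idx >= P4_ESCR_MSR_TABLE_SIZE ||
			     !p4_escr_table[idx])) {
			WARN_ONCE(1, "P4 PMU: Wrong address passed: %x\n", addr);
			return -1;
		}

		return idx;
	}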

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Lin Ming <ming.m.lin@intel.com>
LKML-Reference: <20100512174242.GA5190@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-13 08:51:13 +02:00
Cyrill Gorcunov
c7993165ef x86, perf: P4 PMU -- check for proper event index in RAW events
RAW events are special and we should be ready for users passing
in insane event index values.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Lin Ming <ming.m.lin@intel.com>
LKML-Reference: <20100508112717.315897547@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-08 14:17:53 +02:00
Cyrill Gorcunov
3f51b7119d x86, perf: P4 PMU -- Get rid of redundant check for array index
The caller has already done such a check.
And it was wrong anyway: it had to be '>=' rather than '>'.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Lin Ming <ming.m.lin@intel.com>
LKML-Reference: <20100508112717.130386882@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-08 14:17:53 +02:00
Cyrill Gorcunov
137351e0fe x86, perf: P4 PMU -- protect sensible procedures from preemption
Steven reported:

|
| I'm getting:
|
| Pid: 3477, comm: perf Not tainted 2.6.34-rc6 #2727
| Call Trace:
|  [<ffffffff811c7565>] debug_smp_processor_id+0xd5/0xf0
|  [<ffffffff81019874>] p4_hw_config+0x2b/0x15c
|  [<ffffffff8107acbc>] ? trace_hardirqs_on_caller+0x12b/0x14f
|  [<ffffffff81019143>] hw_perf_event_init+0x468/0x7be
|  [<ffffffff810782fd>] ? debug_mutex_init+0x31/0x3c
|  [<ffffffff810c68b2>] T.850+0x273/0x42e
|  [<ffffffff810c6cab>] sys_perf_event_open+0x23e/0x3f1
|  [<ffffffff81009e6a>] ? sysret_check+0x2e/0x69
|  [<ffffffff81009e32>] system_call_fastpath+0x16/0x1b
|
| When running perf record in latest tip/perf/core
|

Due to the fact that p4 counters are shared between HT threads
we synthetically divide the whole set of counters into two
non-intersecting subsets. And while we're "borrowing" counters
from these subsets we should not be preempted (well, strictly
speaking, in p4_hw_config we just pre-set a reference to the
subset, which allows us to save some cycles in the schedule routine
if it happens on the same cpu). So use the get_cpu/put_cpu pair,
as sketched below.

Also p4_pmu_schedule_events should use smp_processor_id rather
than the raw_ version. This allows us to catch preemption issues
(if there ever are any).
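
Sketch of the p4_hw_config() change (hedged; the body is elided):

	static int p4_hw_config(struct perf_event *event)
	{
		int cpu = get_cpu();	/* disables preemption */
		int rc = 0;

		/* ... pre-set the reference to this cpu's counter subset ... */

		put_cpu();		/* re-enables preemption */
		return rc;
	}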

Reported-by: Steven Rostedt <rostedt@goodmis.org>
Tested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Lin Ming <ming.m.lin@intel.com>
LKML-Reference: <20100508112716.963478928@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-08 14:17:53 +02:00
Cyrill Gorcunov
de902d967f x86, perf: P4 PMU -- configure predefined events
If an event is not RAW we should not exit p4_hw_config
early but call x86_setup_perfctr as well.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Robert Richter <robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-08 14:17:52 +02:00
Robert Richter
9d0fcba67e perf, x86: Call x86_setup_perfctr() from .hw_config()
The perfctr setup calls are in the corresponding .hw_config()
functions now. This makes it possible to introduce config functions
for other pmu events that are not perfctr specific.

Also, all of a sudden the code looks much nicer.

Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1271190201-25705-4-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-07 11:31:00 +02:00
Peter Zijlstra
caaa8be3b6 perf, x86: Fix __initconst vs const
All variables that have __initconst should also be const.

Suggested-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-02 19:52:05 +02:00
Peter Zijlstra
b4cdc5c264 perf, x86: Fix up the ANY flag stuff
Stephane noticed that the ANY flag was in generic arch code, and Cyrill
reported that it broke the P4 code.

Solve this by merging x86_pmu::raw_event into x86_pmu::hw_config and
providing intel_pmu and amd_pmu specific versions of this callback.

The intel_pmu one deals with the ANY flag, the amd_pmu adds the few extra
event bits AMD64 has.

Reported-by: Stephane Eranian <eranian@google.com>
Reported-by: Cyrill Gorcunov <gorcunov@gmail.com>
Acked-by: Robert Richter <robert.richter@amd.com>
Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>
Acked-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1269968113.5258.442.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-02 19:52:04 +02:00
Robert Richter
948b1bb89a perf, x86: Undo some some *_counter* -> *_event* renames
The big rename:

 cdd6c48 perf: Do the big rename: Performance Counters -> Performance Events

accidentally renamed some members of structs that were named after
registers in the spec. To avoid confusion this patch reverts some
changes. The related specs are MSR descriptions in AMD's BKDGs and the
ARCHITECTURAL PERFORMANCE MONITORING section in the Intel 64 and IA-32
Architectures Software Developer's Manuals.

This patch does:

 $ sed -i -e 's:num_events:num_counters:g' \
   arch/x86/include/asm/perf_event.h \
   arch/x86/kernel/cpu/perf_event_amd.c \
   arch/x86/kernel/cpu/perf_event.c \
   arch/x86/kernel/cpu/perf_event_intel.c \
   arch/x86/kernel/cpu/perf_event_p6.c \
   arch/x86/kernel/cpu/perf_event_p4.c \
   arch/x86/oprofile/op_model_ppro.c

 $ sed -i -e 's:event_bits:cntval_bits:g' -e 's:event_mask:cntval_mask:g' \
   arch/x86/kernel/cpu/perf_event_amd.c \
   arch/x86/kernel/cpu/perf_event.c \
   arch/x86/kernel/cpu/perf_event_intel.c \
   arch/x86/kernel/cpu/perf_event_p6.c \
   arch/x86/kernel/cpu/perf_event_p4.c

Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1269880612-25800-2-git-send-email-robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-02 19:52:02 +02:00
Peter Zijlstra
11164cd4f6 perf, x86: Add Nehalem PMU programming errata workaround
Implement the workaround for Intel Errata AAK100 and AAP53.

Also, remove the Core-i7 name for Nehalem events since there are
also Westmere based i7 chips.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <1269608924.12097.147.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-26 15:47:24 +01:00
Cyrill Gorcunov
d814f30105 x86, perf: Add raw events support for the P4 PMU
The adding of raw event support led to a complete code
refactoring. I hope it became more readable than it was.

The list of changes:

1)  The 64bit config field is enough to hold all the information we need
    to track event details. To achieve this we use our *own* enum for
    event selection in the ESCR register and map this key into a proper
    value at the moment of event enabling.

    For the same reason we use the 12 LSB bits in the CCCR register -- to track
    which exact cache trace event was requested. And we clear these bits
    at the real 'write' moment (the packing helpers are sketched after
    this list).

2)  There is no per-cpu area reserved for the P4 PMU anymore. We
    don't need it. Everything is held by the config.

3)  Now we may use any available counter, ie we try to grab any
    possible counter.
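
The packing helpers behind point 1 (ESCR lives in the high half, CCCR
in the low half of the 64bit config; hedged sketch):

	#define p4_config_pack_escr(v)		(((u64)(v)) << 32)
	#define p4_config_pack_cccr(v)		(((u64)(v)) & 0xffffffffULL)
	#define p4_config_unpack_escr(v)	(((u64)(v)) >> 32)
	#define p4_config_unpack_cccr(v)	(((u64)(v)) & 0xffffffffULL)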

v2:
  - Lin Ming reported the lack of ESCR selector in CCCR for cache events

v3:
  - Don't lose cache event codes in the config unpacking procedure; we may
    need them one day, so no obscure hacks behind our back -- better to clear
    reserved bits explicitly when needed (thanks Ming for pointing this out)

  - Lin Ming fixed misplaced opcodes in cache events

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Tested-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <1269403766.3409.6.camel@minggr.sh.intel.com>
[ v4: did a few whitespace fixlets ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-26 08:45:49 +01:00
Cyrill Gorcunov
9c8c6bad31 x86, perf: Fix few cosmetic dabs for P4 pmu (comments and constantify)
- A few ESCRs escaped fixing in the previous attempt.
- p4_escr_map is read-only, make it const.

Nothing serious.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Lin Ming <ming.m.lin@intel.com>
LKML-Reference: <20100318211256.GH5062@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-18 22:17:46 +01:00
Lin Ming
cb7d6b5053 perf, x86: Add cache events for the Pentium-4 PMU
Move the HT bit setting code from p4_pmu_event_map to
p4_hw_config, so the cache events can get the HT bit set correctly.

Tested on my P4 desktop; the 6 cache events below work:

 L1-dcache-load-misses
 LLC-load-misses
 dTLB-load-misses
 dTLB-store-misses
 iTLB-loads
 iTLB-load-misses

Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <1268908392.13901.128.camel@minggr.sh.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-18 17:04:02 +01:00
Lin Ming
f34edbc1cd perf, x86: Add a key to simplify template lookup in Pentium-4 PMU
Currently, we use opcode(Event and Event-Selector) + emask to
look up template in p4_templates.

But cache events (L1-dcache-load-misses, LLC-load-misses, etc)
use the same event(P4_REPLAY_EVENT) to do the counting, ie, they
have the same opcode and emask. So we can not use current lookup
mechanism to find the template for cache events.

This patch introduces a "key", which is the index into
p4_templates. The low 12 bits of CCCR are reserved, so we can
hide the "key" in the low 12 bits of hwc->config.

We extract the key from hwc->config and then quickly find the
template.
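
The lookup then becomes trivial -- sketched (hedged; the unpack helper
name is illustrative):

	static struct p4_event_template *p4_pmu_template_lookup(u64 config)
	{
		int key = p4_config_unpack_key(config);

		if (key < ARRAY_SIZE(p4_templates))
			return &p4_templates[key];
		else
			return NULL;
	}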

Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <1268908387.13901.127.camel@minggr.sh.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-18 17:03:51 +01:00
Cyrill Gorcunov
7335f75e9c x86, perf: Use apic_write unconditionally
Since apic_write() maps to a plain noop in the !CONFIG_X86_LOCAL_APIC
case we're safe to remove this conditional compilation and clean up
the code a bit.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: fweisbec@gmail.com
Cc: acme@redhat.com
Cc: eranian@google.com
Cc: peterz@infradead.org
LKML-Reference: <20100317104356.232371479@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-18 17:03:22 +01:00
Cyrill Gorcunov
e449526282 perf, x86: Enable not tagged retired instruction counting on P4s
This should turn on instruction counting on P4s, which was missing in
the first version of the new PMU driver.

It's inaccurate for now; we still need a dependent event to tag mops
before we can count them precisely. The result is that the number of
instructions may be inflated.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <1268629102.3355.11.camel@minggr.sh.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-15 08:14:34 +01:00
Cyrill Gorcunov
8576e19716 x86, perf: Unmask LVTPC only if we have APIC supported
Ingo reported:

 |
 | There's a build failure on -tip with the P4 driver, on UP 32-bit, if
 | PERF_EVENTS is enabled but UP_APIC is disabled:
 |
 | arch/x86/built-in.o: In function `p4_pmu_handle_irq':
 | perf_event.c:(.text+0xa756): undefined reference to `apic'
 | perf_event.c:(.text+0xa76e): undefined reference to `apic'
 |

So we have to unmask LVTPC only if we're configured to have one.
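
A sketch of the guard (hedged):

	#ifdef CONFIG_X86_LOCAL_APIC
		apic_write(APIC_LVTPC, APIC_DM_NMI);
	#endif

(A later commit in this log, 7335f75e9c, drops the conditional again,
since apic_write() is a plain noop without CONFIG_X86_LOCAL_APIC.)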

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
CC: Lin Ming <ming.m.lin@intel.com>
CC: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20100313081116.GA5179@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-13 13:32:27 +01:00
Cyrill Gorcunov
a072738e04 perf, x86: Implement initial P4 PMU driver
The netburst PMU is way different from the "architectural
perfomance monitoring" specification that current CPUs use.
P4 uses a tuple of ESCR+CCCR+COUNTER MSR registers to handle
perfomance monitoring events.

A few implementational details:

1) We need a separate x86_pmu::hw_config helper in struct
   x86_pmu since register bit-fields are quite different from the P6,
   Core and later cpu series.

2) For the same reason an x86_pmu::schedule_events helper is
   introduced (both callbacks are sketched after this list).

3) hw_perf_event::config consists of packed ESCR+CCCR values.
   It's allowed since in reality both registers only use a half
   of their size. Of course before making a real write into a
   particular MSR we need to unpack the value and extend it to
   a proper size.

4) The tuple of packed ESCR+CCCR in hw_perf_event::config
   doesn't describe the memory address of the ESCR MSR register,
   so we need to keep a mapping between the tuples used and the
   available ESCRs (various P4 events may use the same
   ESCRs, but not simultaneously); for this sake every active
   event has a per-cpu map of hw_perf_event::idx <--> ESCR
   addresses.

5) Since hw_perf_event::idx is an offset to a counter/control register
   we need to lift X86_PMC_MAX_GENERIC up, otherwise the kernel
   strips it down to 8 registers and an armed event may never be turned
   off (ie the bit in active_mask is set but the loop never reaches
   this index to check), thanks to Peter Zijlstra
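
Points 1 and 2 boil down to two new callbacks wired into the pmu
descriptor -- sketched (hedged, abridged):

	static __initconst struct x86_pmu p4_pmu = {
		.name			= "Netburst P4/Xeon",
		.handle_irq		= p4_pmu_handle_irq,
		.hw_config		= p4_hw_config,
		.schedule_events	= p4_pmu_schedule_events,
		/* ... counter limits, msr bases, event map, etc ... */
	};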

Restrictions:

 - No cascaded counters support (do we ever need them?)
 - No dependent events support (so PERF_COUNT_HW_INSTRUCTIONS
   doesn't work for now)
 - There are events with the same counters which can't work simultaneously
   (need to use intersected ones due to broken counter 1)
 - No PERF_COUNT_HW_CACHE_ events yet

Todo:

 - Implement dependent events
 - Need proper hashing for event opcodes (no linear search, good for
   debugging stage but not in real loads)
 - Some events counted during a clock cycle -- need to set threshold
   for them and count every clock cycle just to get summary statistics
   (ie to behave the same way as other PMUs do)
 - Need to switch to using event_constraints
 - To support RAW events we need to encode a global list of P4 events
   into p4_templates
 - Cache events need to be added

Event support status matrix:

 Event			status
 -----------------------------
 cycles			works
 cache-references	works
 cache-misses		works
 branch-misses		works
 bus-cycles		partially (does not work on 64bit cpu with HT enabled)
 instruction		doesn't work (needs dependent event [mop tagging])
 branches		doesn't work

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20100311165439.GB5129@lenovo>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-11 18:51:08 +01:00