Commit Graph

23843 Commits (6179b5562d5d17c7c09b54cb11dd925ca308d7a9)

Author SHA1 Message Date
Finn Thain d74472f0b2 SONIC: small fix and cleanup
Fix a potential problem in the timeout handling: don't free the DMA buffers
before resetting the chip.

Also a trivial cleanup. Bring macsonic and jazzsonic into sync.

Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-04 17:59:08 -07:00
Finn Thain 8b6aaab8c8 m68k: macmace fixes
Fix a race condition in the transmit code, where the dma interrupt could update
the free tx buffer count concurrently and wedge the tx queue.

Fix the misuse of the rx frame status and rx frame length registers: no more
"fifo overrun" errors caused by the OFLOW bit being tested in the frame length
register (instead of the status register), and no more missed packets due to
incorrect length taken from status register (instead of the frame length
register).

Fix a panic (skb_over_panic BUG) caused by allocating and then copying an
incoming packet while the packet length register was changing.

Cut-and-paste the reset code from the powermac mace driver (mace.c), so the NIC
functions when MacOS does not initialise it (important for anyone wanting to
use the Emile boot loader).

Cut-and-paste the error counting and timeout recovery code from mace.c.

Fix over allocation of rx buffer memory (it's page order, not page count).

Converted to driver model.

Converted to DMA API.

Since I've run out of ways to make it fail, and since it performs well now,
promote the driver from EXPERIMENTAL status. Tested on both quadra 840av and
660av.

Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-04 17:59:07 -07:00
Finn Thain 0251c38ce5 CUDA ADB fixes
Fix the flakiness in the CUDA ADB driver on m68k macs (keypresses getting
wedged down or ADB just going AWOL altogether).

The only IRQ used by this driver is the VIA shift register IRQ. The PowerMac
conditional code disables the other VIA IRQ sources, so don't mess with the
other IRQ flags in the common code -- m68k macs need them.

When polling, don't disable local interrupts when we only need to disable the
CUDA interrupt.

Unless polling, don't clear the shift register IRQ flag. On m68k macs this
creates a race that often breaks CUDA ADB.

Tested on Quadra 840av and LC630 (both m68k); also Beige G3 (powerpc).

Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-04 17:59:07 -07:00
Finn Thain d95fd5fce8 m68k: Mac II ADB fixes
Fix a crash caused by requests placed in the queue with the completed flag
already set. This lead to some ADB_SYNC requests returning early and their
request structs being popped off the stack while still queued. Stack corruption
ensued or an invalid request callback pointer was invoked or both. Eliminate
macii_retransmit() and its buggy implementation of macii_write(). Have
macii_queue_poll() fully initialise the request queues.

Fix a bug in macii_queue_poll() where the last_req pointer was not being set.
This caused some requests to leave the queue before being completed (and would
also corrupt the stack under certain conditions).

Fix a race in macii_start that could set the state machine to "reading" while
current_req was null.

No longer send poll commands with the ADBREQ_REPLY flag -- doing that caused
the replies to be stored in the request buffer where they were forgotten
about.

Don't autopoll by continuously sending new Talk commands. Get the controller to
do that for us. This reduces the ADB interrupt rate on an idle bus to about 5
per second. Only autopoll the devices that were probed.

Explicitly clear the interrupt flag when polling.

Use disable_irq rather than local_irq_save when polling.

Remove excess local_irq_save/restore pairs.

Improve bus timeout and service request detection.

Remove unused code (last_reply, adb_dir etc) and unneeded code (prefix_len,
first_byte etc).

Change TIP and TACK to their correct names on this ADB controller (ST_EVEN and
ST_ODD).

Add some commentry.

Add a generous quantity of sanity checks (BUG_ONs).

Let m68k macs use the adb_sync boot param too.

Tested on Mac II, Mac IIci, Quadra 650, Quadra 700 etc.

Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-04 17:59:07 -07:00
Finn Thain 2964db0f59 m68k: Mac DP8390 update
Fix the support for C/NET nubus ethernet cards etc. Sync up the DP8390 driver
with the latest code in the mac68k repo.

Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-04 17:59:07 -07:00
Finn Thain f877958879 NuBus header update
Sync the nubus defines with the latest code in the mac68k repo. Some of these
are needed for DP8390 driver update in the next patch.

Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-04 17:59:07 -07:00
Geert Uytterhoeven 3f5d987e62 m68k: Amiga A2065 and Ariadne TX statistics
Add missing code to the Amiga A2065 and Ariadne drivers to update
net_device_stats.tx_bytes.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-04 17:59:07 -07:00
Finn Thain bff832cda7 m68k: pmu_queue_request() declaration conflict
Fixes a "static qualifier follows non-static qualifier" error from gcc 4.

Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-04 17:59:06 -07:00
Geert Uytterhoeven b312b38c74 m68k: Atari SCSI workqueue updates
Workqueue updates for the Atari SCSI driver

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-04 17:59:06 -07:00
Geert Uytterhoeven f8744bc95d hilkbd: Kill compiler warning and fix comment dyslexia
hilkbd: Kill compiler warning and fix comment dyslexia

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-04 17:59:06 -07:00
Matthias Urlichs 39ad2cb352 m68k: Mac89x0 Ethernet netif updates
Macintosh CS89x0 Ethernet: Netif updates
Addition of netif_stop_queue() before transmission by Michael Schmitz
skb_copy_{from,to}_linear_data() conversion by Geert Uytterhoeven

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-04 17:59:06 -07:00
Michael Schmitz a100501212 m68k: Atari fb revival
Update the atari fb to 2.6 by Michael Schmitz,
Reformatting and rewrite of bit plane functions by Roman Zippel,
A few more fixes by Geert Uytterhoeven.

Signed-off-by: Michael Schmitz <schmitz@debian.org>
Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-04 17:59:05 -07:00
Michael Schmitz c04cb856e2 m68k: Atari keyboard and mouse support.
Atari keyboard and mouse support.
(reformating and Kconfig fixes by Roman Zippel)

Signed-off-by: Michael Schmitz <schmitz@debian.org>
Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-04 17:59:05 -07:00
Roman Zippel 3130d905ba m68k: Atari SCSI driver compile fixes
Atari SCSI driver compile fixes

Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-04 17:59:05 -07:00
Roman Zippel c28bda2517 m68k: Reformat the Atari SCSI driver
Reformat the Atari SCSI driver

Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-04 17:59:05 -07:00
Michael Schmitz fb810d121b m68k: Atari SCSI revival
SCSI should be working on a TT (but someone should really try!) but causes
trouble on a Falcon (as in: it ate a filesystem of mine) at least when
used concurrently with IDE. I have the notion it's because locking of the
ST-DMA interrupt by IDE is broken in 2.6 (the IDE driver always complains
about trying to release an already-released ST-DMA). Needs more work, but
that's on the IDE or m68k interrupt side rather than SCSI.

Signed-off-by: Michael Schmitz <schmitz@debian.org>
Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-04 17:59:05 -07:00
Linus Torvalds 8d41f0e8d5 Merge branch 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6
* 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6: (44 commits)
  i2c-s3c2410: Fix bug in releasing driver
  i2c-s3c2410: Fix I2C SDA to SCL setup time
  i2c: New i2c-tiny-usb bus driver
  i2c: Documentation update
  i2c: SPIN_LOCK_UNLOCKED cleanup
  i2c: Obsolete i2c-ixp2000, i2c-ixp4xx and scx200_i2c
  i2c: New Simtec I2C bus driver
  i2c: Bitbanging I2C bus driver using the GPIO API
  Use menuconfig objects - I2C
  i2c: Restore i2c_smbus_read_block_data
  i2c-pxa: Clean transaction stop
  i2c-algo-bit: Improve debugging
  i2c-algo-bit: Implement a 50/50 SCL duty cycle
  i2c-omap: Switch to static adapter numbering
  i2c: Blackfin Two Wire Interface driver
  i2c-algo-sgi: Comment and whitespace cleanups
  i2c: Make i2c_del_driver a void function
  i2c: Move i2c-isa-only exported symbol declarations
  i2c: Document i2c_new_device()
  i2c: Add i2c_new_probed_device()
  ...

Fixed trivial conflict in Documentation/feature-removal-schedule.txt manually.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-04 17:46:27 -07:00
Linus Torvalds ded1504dfa Merge master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq
* master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq:
  [CPUFREQ] Report the number of processors in PowerNow-k8 correctly
  [CPUFREQ] do not declare undefined functions
  [CPUFREQ] cleanup kconfig options
  [CPUFREQ] Longhaul - Revert Longhaul ver. 2
  [CPUFREQ] Remove deprecated /proc/acpi/processor/performance write support
  [CPUFREQ] Fix limited cpufreq when booted on battery
  Fix preemption warnings in speedstep-centrino.c
  [CPUFREQ] Longhaul - Correct PCI code
  [CPUFREQ] p4-clockmod: switch to rdmsr_on_cpu/wrmsr_on_cpu
2007-05-04 17:38:48 -07:00
Linus Torvalds 98b96173c7 Merge master.kernel.org:/pub/scm/linux/kernel/git/davej/agpgart
* master.kernel.org:/pub/scm/linux/kernel/git/davej/agpgart:
  [AGPGART] sworks-agp: Switch to PCI ref counting APIs
  [AGPGART] Nvidia AGP: Use refcount aware PCI interfaces
  [AGPGART] Fix sparse warning in sgi-agp.c
  [AGPGART] Intel-agp adjustments
  [AGPGART] Move [un]map_page_into_agp into asm/agp.h
  [AGPGART] Add missing calls to global_flush_tlb() to ali-agp
  [AGPGART] prevent probe collision of sis-agp and amd64_agp
2007-05-04 17:38:16 -07:00
Michael Holzheu e296306277 [S390] tape: New read configuration data.
Instead of the deprecated read_conf_data(), implement a new function
tape_3590_read_dev_chars().

Signed-off-by: Michael Holzheu <holzheu@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2007-05-04 18:48:26 +02:00
Cornelia Huck 6c82a8af92 [S390] qeth: New read configuration data.
Instead of the deprecated read_conf_data(), implement a new function
qeth_read_conf_data().

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2007-05-04 18:48:26 +02:00
Cornelia Huck 17283b56ec [S390] dasd: New read device characteristics and read configuration data.
Instead of the deprecated read_dev_chars() and read_conf_data_lpm(),
implement dasd_generic_read_dev_chars() and dasd_eckd_read_conf_lpm().
These should even recover better from error than the original cio
functions.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2007-05-04 18:48:26 +02:00
Ursula Braun 00c0c6466c [S390] qdio: make qdio statistics SMP-capable
Use atomic_t/atomic64_t to make qdio performance statistics smp safe.
Remove temporarily calculation of "total time of inbound actions".

Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2007-05-04 18:48:25 +02:00
Michael Chan fde82055c1 [BNX2]: Fix TSO problem with small MSS.
Remove the check for skb->len greater than MTU when doing TSO.  When
the destination has a smaller MSS than the source, a TSO packet may
be smaller than the MTU at the source and we still need to process it
as a TSO packet.

Thanks to Brian Ristuccia <bristuccia@starentnetworks.com> for
reporting the problem.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-03 17:23:35 -07:00
Pavel Emelianov 7562f876cd [NET]: Rework dev_base via list_head (v3)
Cleanup of dev_base list use, with the aim to simplify making device
list per-namespace. In almost every occasion, use of dev_base variable
and dev->next pointer could be easily replaced by for_each_netdev
loop. A few most complicated places were converted to using
first_netdev()/next_netdev().

Signed-off-by: Pavel Emelianov <xemul@openvz.org>
Acked-by: Kirill Korotaev <dev@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-03 15:13:45 -07:00
Michael Chan 72fbaeb623 [BNX2]: Update version and reldate.
Update version to 1.5.10.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-03 13:25:32 -07:00
Michael Chan 883e515118 [BNX2]: Print bus information for PCIE devices.
Fix the code to print PCI or PCIE bus information for all devices.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-03 13:25:11 -07:00
Michael Chan 8e6a72c435 [BNX2]: Add 1-shot MSI handler for 5709.
The 5709 supports the one-shot MSI handler similar to some of the tg3
chips.  In this mode, the MSI disables itself automatically until it
is re-enabled at the end of NAPI poll.

Put the request_irq/free_irq logic in common procedures.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-03 13:24:48 -07:00
Michael Chan da3e4fbed2 [BNX2]: Restructure PHY event handling.
Restructure by adding bnx2_phy_event_is_set() to make code cleaner
and easier to understand.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-03 13:24:23 -07:00
Michael Chan 1b8227c48e [BNX2]: Add indirect spinlock.
The indirect register access method will be used by more than one
caller in BH context (NAPI poll and timer), so a spinlock is required.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-03 13:24:05 -07:00
Michael Chan 27a005b883 [BNX2]: Add support for 5709 Serdes.
Add PCI ID and code to support the 5709 Serdes PHY.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-03 13:23:41 -07:00
Michael Chan 605a9e20aa [BNX2]: Re-structure the 2.5G Serdes code.
Add some common procedures to handle enabling and disabling 2.5G.
Add some missing code to resolve flow control.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-03 13:23:13 -07:00
Michael Chan ca58c3af99 [BNX2]: Put MII register offsets in the bnx2 struct.
The 5709 Serdes device uses non-standard MII register offsets.  This
re-structuring will make it easier to support 5709 Serdes.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-03 13:22:52 -07:00
Michael Chan 4666f87a82 [BNX2]: Add ipv6 TSO and checksum for 5709.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-03 13:22:28 -07:00
Michael Chan 874bb672fd [BNX2]: Update 5709 firmware.
Add ipv6 TSO support in firmware. 

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-03 13:21:48 -07:00
Michael Chan 41ccf61cf0 [BNX2]: Update 5708 firmware.
This fixes the problem of not counting all dropped multicast packets.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-03 13:21:13 -07:00
Michael Chan 30c517b291 [BNX2]: Save PCI state during suspend.
This is needed to save the MSI state which will be lost during
suspend.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-03 13:20:40 -07:00
Michael Chan 1b2f922f68 [BNX2]: Fix race conditions when calling register_netdev().
Hot-plug scripts can call bnx2_open() as soon as register_netdev() is
called in bnx2_init_one().  We need to call pci_set_drvdata() and
setup everything before calling register_netdev(). netif_carrier_off()
also needs to be moved to bnx2_open() to avoid race conditions with
the irq.
    
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-03 13:20:19 -07:00
Michael Chan 40453c839f [BNX2]: Add 40-bit DMA workaround for 5708.
The internal PCIE-to-PCIX bridge of the 5708 has the same 40-bit DMA
limitation as some of the tg3 chips.  Set dma_mask and persistent DMA
mask to 40-bit to workaround.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-03 13:19:18 -07:00
Michael Chan 5bae30c96a [BNX2]: Fix register and memory test on 5709.
Tweak registers and memory test range for 5709.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-03 13:18:46 -07:00
Michael Chan dad3e452da [BNX2]: Block MII access when ifdown.
The device may be in D3hot state and should not allow MII register
access.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-03 13:18:03 -07:00
Sascha Hauer ff4bfb2163 [ARM] 4328/1: Move i.MX UART regs to driver
This patch moves the i.MX UART register descriptions from
include/asm-arm/arch-imx/imx-regs.h to the serial driver itself.
This helps using the driver on other architectures like mx31

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2007-05-03 20:24:21 +01:00
Russell King 73b6a2be8b [ARM] Add support for ICSIDE interface on RiscPC
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2007-05-03 14:16:56 +01:00
Russell King a17dba8df9 [ARM] Add platform support for PATA on RiscPC
Add pata_platform device for RiscPC, thereby converting the primary
IDE channel on the machine to PATA.

Acked-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2007-05-03 14:16:55 +01:00
Andrew Victor 03abeac0a2 [ARM] 4357/1: AT91: Support slower serial baud-rates
Allow slower serial baud-rates by switching the UART clock from MCK to
MCK/8.

Based on patches by Mike Wolfram and Russell King.

Signed-off-by: Andrew Victor <andrew@sanpeople.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2007-05-03 14:12:45 +01:00
Avi Kivity 2ff81f70b5 KVM: Remove unused 'instruction_length'
As we no longer emulate in userspace, this is meaningless.  We don't
compute it on SVM anyway.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:32 +03:00
Avi Kivity 02c8320972 KVM: Don't require explicit indication of completion of mmio or pio
It is illegal not to return from a pio or mmio request without completing
it, as mmio or pio is an atomic operation.  Therefore, we can simplify
the userspace interface by avoiding the completion indication.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:32 +03:00
Avi Kivity e7df56e4a0 KVM: Remove extraneous guest entry on mmio read
When emulating an mmio read, we actually emulate twice: once to determine
the physical address of the mmio, and, after we've exited to userspace to
get the mmio value, we emulate again to place the value in the result
register and update any flags.

But we don't really need to enter the guest again for that, only to take
an immediate vmexit.  So, if we detect that we're doing an mmio read,
emulate a single instruction before entering the guest again.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:32 +03:00
Anthony Liguori 94dfbdb389 KVM: SVM: Only save/restore MSRs when needed
We only have to save/restore MSR_GS_BASE on every VMEXIT.  The rest can be
saved/restored when we leave the VCPU.  Since we don't emulate the DEBUGCTL
MSRs and the guest cannot write to them, we don't have to worry about
saving/restoring them at all.

This shaves a whopping 40% off raw vmexit costs on AMD.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:32 +03:00
Adrian Bunk 2807696c37 KVM: fix an if() condition
It might have worked in this case since PT_PRESENT_MASK is 1, but let's
express this correctly.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:31 +03:00
Anthony Liguori 2ab455ccce KVM: VMX: Add lazy FPU support for VT
Only save/restore the FPU host state when the guest is actually using the
FPU.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:31 +03:00
Anthony Liguori 25c4c2762e KVM: VMX: Properly shadow the CR0 register in the vcpu struct
Set all of the host mask bits for CR0 so that we can maintain a proper
shadow of CR0.  This exposes CR0.TS, paving the way for lazy fpu handling.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:31 +03:00
Avi Kivity e0e5127d06 KVM: Don't complain about cpu erratum AA15
It slows down Windows x64 horribly.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:31 +03:00
Anthony Liguori 7807fa6ca5 KVM: Lazy FPU support for SVM
Avoid saving and restoring the guest fpu state on every exit.  This
shaves ~100 cycles off the guest/host switch.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:31 +03:00
Avi Kivity 4c690a1e86 KVM: Allow passing 64-bit values to the emulated read/write API
This simplifies the API somewhat (by eliminating the special-case
cmpxchg8b on i386).

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:31 +03:00
Avi Kivity 1165f5fec1 KVM: Per-vcpu statistics
Make the exit statistics per-vcpu instead of global.  This gives a 3.5%
boost when running one virtual machine per core on my two socket dual core
(4 cores total) machine.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:30 +03:00
Yaozu Dong 3fca036530 KVM: VMX: Avoid unnecessary vcpu_load()/vcpu_put() cycles
By checking if a reschedule is needed, we avoid dropping the vcpu.

[With changes by me, based on Anthony Liguori's observations]

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:30 +03:00
Yaozu Dong d6c69ee9a2 KVM: MMU: Avoid heavy ASSERT at non debug mode.
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:30 +03:00
Avi Kivity 4d56c8a787 KVM: VMX: Only save/restore MSR_K6_STAR if necessary
Intel hosts only support syscall/sysret in long more (and only if efer.sce
is enabled), so only reload the related MSR_K6_STAR if the guest will
actually be able to use it.

This reduces vmexit cost by about 500 cycles (6400 -> 5870) on my setup.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:30 +03:00
Avi Kivity 35cc7f9711 KVM: Fold drivers/kvm/kvm_vmx.h into drivers/kvm/vmx.c
No meat in that file.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:30 +03:00
Avi Kivity e38aea3e93 KVM: VMX: Don't switch 64-bit msrs for 32-bit guests
Some msrs are only used by x86_64 instructions, and are therefore
not needed when the guest is legacy mode.  By not bothering to switch
them, we reduce vmexit latency by 2400 cycles (from about 8800) when
running a 32-bt guest on a 64-bit host.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:30 +03:00
Avi Kivity 2345df8c55 KVM: VMX: Reduce unnecessary saving of host msrs
THe automatically switched msrs are never changed on the host (with
the exception of MSR_KERNEL_GS_BASE) and thus there is no need to save
them on every vm entry.

This reduces vmexit latency by ~400 cycles on i386 and by ~900 cycles (10%)
on x86_64.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:29 +03:00
Avi Kivity c9047f5333 KVM: Handle guest page faults when emulating mmio
Usually, guest page faults are detected by the kvm page fault handler,
which detects if they are shadow faults, mmio faults, pagetable faults,
or normal guest page faults.

However, in ceratin circumstances, we can detect a page fault much later.
One of these events is the following combination:

- A two memory operand instruction (e.g. movsb) is executed.
- The first operand is in mmio space (which is the fault reported to kvm)
- The second operand is in an ummaped address (e.g. a guest page fault)

The Windows 2000 installer does such an access, an promptly hangs.  Fix
by adding the missing page fault injection on that path.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:29 +03:00
Avi Kivity 364b625b56 KVM: SVM: Report hardware exit reason to userspace instead of dmesg
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:29 +03:00
Avi Kivity 8c4385024d KVM: Retry sleeping allocation if atomic allocation fails
This avoids -ENOMEM under memory pressure.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:29 +03:00
Avi Kivity b5a33a7572 KVM: Use slab caches to allocate mmu data structures
Better leak detection, statistics, memory use, speed -- goodness all
around.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:29 +03:00
Avi Kivity 417726a3fb KVM: Handle partial pae pdptr
Some guests (Solaris) do not set up all four pdptrs, but leave some invalid.
kvm incorrectly treated these as valid page directories, pinning the
wrong pages and causing general confusion.

Fix by checking the valid bit of a pae pdpte.  This closes sourceforge bug
1698922.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:29 +03:00
Avi Kivity d917a6b92d KVM: Initialize cr0 to indicate an fpu is present
Solaris panics if it sees a cpu with no fpu, and it seems to rely on this
bit.  Closes sourceforge bug 1698920.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:29 +03:00
Eric Sesterhenn / Snakebyte 3964994bb5 KVM: Fix overflow bug in overflow detection code
The expression

   sp - 6 < sp

where sp is a u16 is undefined in C since 'sp - 6' is promoted to int,
and signed overflow is undefined in C.  gcc 4.2 actually warns about it.
Replace with a simpler test.

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:29 +03:00
Avi Kivity 5008fdf5b6 KVM: Use kernel-standard types
Noted by Joerg Roedel.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:28 +03:00
Joerg Roedel 80b7706e4c KVM: SVM: enable LBRV virtualization if available
This patch enables the virtualization of the last branch record MSRs on
SVM if this feature is available in hardware. It also introduces a small
and simple check feature for specific SVM extensions.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:28 +03:00
Avi Kivity b8836737d9 KVM: Add fpu get/set operations
These are really helpful when migrating an floating point app to another
machine.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:28 +03:00
Avi Kivity e8207547d2 KVM: Add physical memory aliasing feature
With this, we can specify that accesses to one physical memory range will
be remapped to another.  This is useful for the vga window at 0xa0000 which
is used as a movable window into the (much larger) framebuffer.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:28 +03:00
Avi Kivity 954bbbc236 KVM: Simply gfn_to_page()
Mapping a guest page to a host page is a common operation.  Currently,
one has first to find the memory slot where the page belongs (gfn_to_memslot),
then locate the page itself (gfn_to_page()).

This is clumsy, and also won't work well with memory aliases.  So simplify
gfn_to_page() not to require memory slot translation first, and instead do it
internally.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:28 +03:00
Dor Laor e0fa826f96 KVM: Add mmu cache clear function
Functions that play around with the physical memory map
need a way to clear mappings to possibly nonexistent or
invalid memory.  Both the mmu cache and the processor tlb
are cleared.

Signed-off-by: Dor Laor <dor.laor@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:28 +03:00
Avi Kivity df513e2cdd KVM: x86 emulator: fix bit string operations operand size
On x86, bit operations operate on a string of bits that can reside in
multiple words.  For example, 'btsl %eax, (blah)' will touch the word
at blah+4 if %eax is between 32 and 63.

The x86 emulator compensates for that by advancing the operand address
by (bit offset / BITS_PER_LONG) and truncating the bit offset to the
range (0..BITS_PER_LONG-1).  This has a side effect of forcing the operand
size to 8 bytes on 64-bit hosts.

Now, a 32-bit guest goes and fork()s a process.  It write protects a stack
page at 0xbffff000 using the 'btr' instruction, at offset 0xffc in the page
table, with bit offset 1 (for the write permission bit).

The emulator now forces the operand size to 8 bytes as previously described,
and an innocent page table update turns into a cross-page-boundary write,
which is assumed by the mmu code not to be a page table, so it doesn't
actually clear the corresponding shadow page table entry.  The guest and
host permissions are out of sync and guest memory is corrupted soon
afterwards, leading to guest failure.

Fix by not using BITS_PER_LONG as the word size; instead use the actual
operand size, so we get a 32-bit write in that case.

Note we still have to teach the mmu to handle cross-page-boundary writes
to guest page table; but for now this allows Damn Small Linux 0.4 (2.4.20)
to boot.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:28 +03:00
Avi Kivity afeb1f14c5 KVM: Remove debug message
No longer interesting.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:27 +03:00
Avi Kivity 36868f7b0e KVM: Use list_move()
Use list_move() where possible.  Noticed by Dor Laor.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:27 +03:00
Michal Piotrowski 55bf402834 KVM: Remove unused function
Remove unused function

CC      drivers/kvm/svm.o
drivers/kvm/svm.c:207: warning: ‘inject_db’ defined but not used

Signed-off-by: Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:27 +03:00
Avi Kivity 0cc5064d33 KVM: SVM: Ensure timestamp counter monotonicity
When a vcpu is migrated from one cpu to another, its timestamp counter
may lose its monotonic property if the host has unsynced timestamp counters.
This can confuse the guest, sometimes to the point of refusing to boot.

As the rdtsc instruction is rather fast on AMD processors (7-10 cycles),
we can simply record the last host tsc when we drop the cpu, and adjust
the vcpu tsc offset when we detect that we've migrated to a different cpu.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:27 +03:00
Avi Kivity d28c6cfbbc KVM: MMU: Fix hugepage pdes mapping same physical address with different access
The kvm mmu keeps a shadow page for hugepage pdes; if several such pdes map
the same physical address, they share the same shadow page.  This is a fairly
common case (kernel mappings on i386 nonpae Linux, for example).

However, if the two pdes map the same memory but with different permissions, kvm
will happily use the cached shadow page.  If the access through the more
permissive pde will occur after the access to the strict pde, an endless pagefault
loop will be generated and the guest will make no progress.

Fix by making the access permissions part of the cache lookup key.

The fix allows Xen pae to boot on kvm and run guest domains.

Thanks to Jeremy Fitzhardinge for reporting the bug and testing the fix.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:27 +03:00
Joerg Roedel 916ce2360f KVM: SVM: forbid guest to execute monitor/mwait
This patch forbids the guest to execute monitor/mwait instructions on
SVM. This is necessary because the guest can execute these instructions
if they are available even if the kvm cpuid doesn't report its
existence.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:26 +03:00
Sergey Kiselev 0e5bf0d0e4 KVM: Handle writes to MCG_STATUS msr
Some older (~2.6.7) kernels write MCG_STATUS register during kernel
boot (mce_clear_all() function, called from mce_init()). It's not
currently handled by kvm and will cause it to inject a GPF.
Following patch adds a "nop" handler for this.

Signed-off-by: Sergey Kiselev <sergey.kiselev@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:26 +03:00
Avi Kivity fcd3410870 KVM: Remove unused and write-only variables
Trivial cleanup.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:26 +03:00
Avi Kivity 6da63cf95f KVM: Don't allow the guest to turn off the cpu cache
The cpu cache is a host resource; the guest should not be able to turn
it off (even for itself).

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:26 +03:00
Avi Kivity 038881c8be KVM: Hack real-mode segments on vmx from KVM_SET_SREGS
As usual, we need to mangle segment registers when emulating real mode
as vm86 has specific constraints.  We special case the reset segment base,
and set the "access rights" (or descriptor flags) to vm86 comaptible values.

This fixes reboot on vmx.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:26 +03:00
Avi Kivity 024aa1c02f KVM: Modify guest segments after potentially switching modes
The SET_SREGS ioctl modifies both cr0.pe (real mode/protected mode) and
guest segment registers.  Since segment handling is modified by the mode on
Intel procesors, update the segment registers after the mode switch has taken
place.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:26 +03:00
Avi Kivity f6528b03f1 KVM: Remove set_cr0_no_modeswitch() arch op
set_cr0_no_modeswitch() was a hack to avoid corrupting segment registers.
As we now cache the protected mode values on entry to real mode, this
isn't an issue anymore, and it interferes with reboot (which usually _is_
a modeswitch).

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:25 +03:00
Avi Kivity 8cb5b03332 KVM: Workaround vmx inability to virtualize the reset state
The reset state has cs.selector == 0xf000 and cs.base == 0xffff0000,
which aren't compatible with vm86 mode, which is used for real mode
virtualization.

When we create a vcpu, we set cs.base to 0xf0000, but if we get there by
way of a reset, the values are inconsistent and vmx refuses to enter
guest mode.

Workaround by detecting the state and munging it appropriately.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:25 +03:00
Avi Kivity aac012245a KVM: MMU: Remove global pte tracking
The initial, noncaching, version of the kvm mmu flushed the all nonglobal
shadow page table translations (much like a native tlb flush).  The new
implementation flushes translations only when they change, rendering global
pte tracking superfluous.

This removes the unused tracking mechanism and storage space.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:25 +03:00
Avi Kivity ca5aac1f96 KVM: MMU: Remove unnecessary check for pdptr access
We already special case the pdptr access, so no need to check it again.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:25 +03:00
Avi Kivity 039576c03c KVM: Avoid guest virtual addresses in string pio userspace interface
The current string pio interface communicates using guest virtual addresses,
relying on userspace to translate addresses and to check permissions.  This
interface cannot fully support guest smp, as the check needs to take into
account two pages at one in case an unaligned string transfer straddles a
page boundary.

Change the interface not to communicate guest addresses at all; instead use
a buffer page (mmaped by userspace) and do transfers there.  The kernel
manages the virtual to physical translation and can perform the checks
atomically by taking the appropriate locks.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:25 +03:00
Avi Kivity f0fe510864 KVM: Future-proof argument-less ioctls
Some ioctls ignore their arguments.  By requiring them to be zero now,
we allow a nonzero value to have some special meaning in the future.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:25 +03:00
Avi Kivity 07c45a366d KVM: Allow kernel to select size of mmap() buffer
This allows us to store offsets in the kernel/user kvm_run area, and be
sure that userspace has them mapped.  As offsets can be outside the
kvm_run struct, userspace has no way of knowing how much to mmap.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:24 +03:00
Avi Kivity 1961d276c8 KVM: Add guest mode signal mask
Allow a special signal mask to be used while executing in guest mode.  This
allows signals to be used to interrupt a vcpu without requiring signal
delivery to a userspace handler, which is quite expensive.  Userspace still
receives -EINTR and can get the signal via sigwait().

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:24 +03:00
Avi Kivity 6722c51c51 KVM: Initialize the apic_base msr on svm too
Older userspace didn't care, but newer userspace (with the cpuid changes)
does.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:24 +03:00
Avi Kivity 1b19f3e61d KVM: Add a special exit reason when exiting due to an interrupt
This is redundant, as we also return -EINTR from the ioctl, but it
allows us to examine the exit_reason field on resume without seeing
old data.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:24 +03:00
Avi Kivity 8eb7d334bd KVM: Fold kvm_run::exit_type into kvm_run::exit_reason
Currently, userspace is told about the nature of the last exit from the
guest using two fields, exit_type and exit_reason, where exit_type has
just two enumerations (and no need for more).  So fold exit_type into
exit_reason, reducing the complexity of determining what really happened.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:24 +03:00
Avi Kivity b4e63f560b KVM: Allow userspace to process hypercalls which have no kernel handler
This is useful for paravirtualized graphics devices, for example.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:24 +03:00
Avi Kivity 5d308f4550 KVM: Add method to check for backwards-compatible API extensions
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-05-03 10:52:24 +03:00