If the stack pointer is 0xc057a000, then the first stack page is at
0xc0579000 (the stack pointer is decremented before use). Not
calculating this correctly caused guests with CONFIG_DEBUG_PAGEALLOC=y
to be killed with a "bad stack page" message: the initial kernel stack
was just proceeding the .smp_locks section which
CONFIG_DEBUG_PAGEALLOC marks read-only when freeing.
Thanks to Frederik Deweerdt for the bug report!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If a Guest makes hypercall which sets a GDT entry to not present, we
currently set any segment registers using that GDT entry to 0.
Unfortunately, this is not sufficient: there are other ways of
altering GDT entries which will cause a fault.
The correct solution to do what Linux does: let them set any GDT value
they want and handle the #GP when popping causes a fault. This has
the added benefit of making our Switcher slightly more robust in the
case of any other bugs which cause it to fault.
We kill the Guest if it causes a fault in the Switcher: it's the
Guest's responsibility to make sure it's not using segments when it
changes them.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A non-periodic clock_event_device and the "jiffies" clock don't mix well:
tick_handle_periodic() can go into an infinite loop.
Currently lguest guests use the jiffies clock when the TSC is
unusable. Instead, make the Host write the current time into the lguest
page on every interrupt. This doesn't cost much but is more precise
and at least as accurate as the jiffies clock. It also gets rid of
the GET_WALLCLOCK hypercall.
Also, delay setting sched_clock until our clock is set up, otherwise
the early printk timestamps can go backwards (not harmful, just ugly).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Documentation: The FIXMEs
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Documentation: The Host
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The netfilter code had very good documentation: the Netfilter Hacking HOWTO.
Noone ever read it.
So this time I'm trying something different, using a bit of Knuthiness.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The sense of the IF bit is backwards in the host interrupt handling.
This means we always save "IF=1" on the stack when injecting an
interrupt. It turns out this is almost always correct (unless the
guest is taking a page fault in an interrupt due to an unpopulated
vmalloc mapping), so went unnoticed.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is the code for the "lg.ko" module, which allows lguest guests to
be launched.
[akpm@linux-foundation.org: update for futex-new-private-futexes]
[akpm@linux-foundation.org: build fix]
[jmorris@namei.org: lguest: use hrtimers]
[akpm@linux-foundation.org: x86_64 build fix]
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andi Kleen <ak@suse.de>
Cc: Eric Dumazet <dada1@cosmosbay.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>