linux - Linux Kernel (branches are rebased on master from time to time)

Age	Commit message (Collapse)	Author	Files	Lines
2009-07-30	lguest and virtio: cleanup struct definitions to Linux style.	Rusty Russell	1	-6/+3
	I've been doing this for years, and akpm picked me up on it about 12 months ago. lguest partly serves as example code, so let's do it Right. Also, remove two unused fields in struct vblk_info in the example launcher. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Ingo Molnar <mingo@redhat.com>
2009-07-30	lguest: fix comment style	Rusty Russell	1	-8/+15
	I don't really notice it (except to begrudge the extra vertical space), but Ingo does. And he pointed out that one excuse of lguest is as a teaching tool, it should set a good example. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Ingo Molnar <mingo@redhat.com>
2009-07-17	lguest: remove unnecessary forward struct declaration	Davide Libenzi	1	-2/+0
	While fixing lg.h to drop the fwd declaration, I noticed there's another one ;) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-06-30	eventfd: revised interface and cleanups	Davide Libenzi	1	-1/+1
	Change the eventfd interface to de-couple the eventfd memory context, from the file pointer instance. Without such change, there is no clean way to racely free handle the POLLHUP event sent when the last instance of the file* goes away. Also, now the internal eventfd APIs are using the eventfd context instead of the file*. This patch is required by KVM's IRQfd code, which is still under development. Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Cc: Gregory Haskins <ghaskins@novell.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Benjamin LaHaise <bcrl@kvack.org> Cc: Avi Kivity <avi@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-12	lguest: remove obsolete LHREQ_BREAK call	Rusty Russell	1	-3/+1
	We no longer need an efficient mechanism to force the Guest back into host userspace, as each device is serviced without bothering the main Guest process (aka. the Launcher). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-06-12	lguest: use eventfds for device notification	Rusty Russell	1	-0/+13
	Currently, when a Guest wants to perform I/O it calls LHCALL_NOTIFY with an address: the main Launcher process returns with this address, and figures out what device to run. A far nicer model is to let processes bind an eventfd to an address: if we find one, we simply signal the eventfd. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Davide Libenzi <davidel@xmailserver.org>
2009-06-12	lguest: allow any process to send interrupts	Rusty Russell	1	-0/+1
	We currently only allow the Launcher process to send interrupts, but it as we already send interrupts from the hrtimer, it's a simple matter of extracting that code into a common set_interrupt routine. As we switch to a thread per virtqueue, this avoids a bottleneck through the main Launcher process. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-06-12	lguest: PAE support	Matias Zabaljauregui	1	-0/+5
	This version requires that host and guest have the same PAE status. NX cap is not offered to the guest, yet. Signed-off-by: Matias Zabaljauregui <zabaljauregui@gmail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-06-12	lguest: replace hypercall name LHCALL_SET_PMD with LHCALL_SET_PGD	Matias Zabaljauregui	1	-1/+1
	replace LHCALL_SET_PMD with LHCALL_SET_PGD hypercall name (That's really what it is, and the confusion gets worse with PAE support) Signed-off-by: Matias Zabaljauregui <zabaljauregui@gmail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Reported-by: Jeremy Fitzhardinge <jeremy@goop.org>
2009-06-12	lguest: Segment selectors are 16-bit long. Fix lg_cpu.ss1 definition.	Matias Zabaljauregui	1	-1/+1
	If GDT_ENTRIES were every > 256, this could become a problem. Signed-off-by: Matias Zabaljauregui <zabaljauregui at gmail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-06-12	lguest: improve interrupt handling, speed up stream networking	Rusty Russell	1	-2/+2
	lguest never checked for pending interrupts when enabling interrupts, and things still worked. However, it makes a significant difference to TCP performance, so it's time we fixed it by introducing a pending_irq flag and checking it on irq_restore and irq_enable. These two routines are now too big to patch into the 8/10 bytes patch space, so we drop that code. Note: The high latency on interrupt delivery had a very curious effect: once everything else was optimized, networking without GSO was faster than networking with GSO, since more interrupts were sent and hence a greater chance of one getting through to the Guest! Note2: (Almost) Closing the same loophole for iret doesn't have any measurable effect, so I'm leaving that patch for the moment. Before: 1GB tcpblast Guest->Host: 30.7 seconds 1GB tcpblast Guest->Host (no GSO): 76.0 seconds After: 1GB tcpblast Guest->Host: 6.8 seconds 1GB tcpblast Guest->Host (no GSO): 27.8 seconds Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-06-12	lguest: fix race in halt code	Rusty Russell	1	-1/+2
	When the Guest does the LHCALL_HALT hypercall, we go to sleep, expecting that a timer or the Waker will wake_up_process() us. But we do it in a stupid way, leaving a classic missing wakeup race. So split maybe_do_interrupt() into interrupt_pending() and try_deliver_interrupt(), and check maybe_do_interrupt() and the "break_out" flag before calling schedule. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-04-19	lguest: fix guest crash on non-linear addresses in gdt pvops	Rusty Russell	1	-1/+2
	Fixes guest crash 'lguest: bad read address 0x4800000 len 256' The new per-cpu allocator ends up handing a non-linear address to write_gdt_entry. We do __pa() on it, and hand it to the host, which kills us. I've long wanted to make the hypercall "LOAD_GDT_ENTRY" to match the IDT code, but had no pressing reason until now. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: lguest@ozlabs.org
2009-03-30	lguest: use bool instead of int	Matias Zabaljauregui	1	-4/+4
	Impact: clean up Rusty told me, some time ago, that he had become a fan of "bool". So, here are some replacements. Signed-off-by: Matias Zabaljauregui <zabaljauregui at gmail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-12-30	lguest: move the initial guest page table creation code to the host	Matias Zabaljauregui	1	-1/+1
	This patch moves the initial guest page table creation code to the host, so the launcher keeps working with PAE enabled configs. Signed-off-by: Matias Zabaljauregui <zabaljauregui@gmail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-05-27	x86/paravirt: add pte_flags to just get pte flags	Jeremy Fitzhardinge	1	-1/+0
	Add pte_flags() to extract the flags from a pte. This is a special case of pte_val() which is only guaranteed to return the pte's flags correctly; the page number may be corrupted or missing. The intent is to allow paravirt implementations to return pte flags without having to do any translation of the page number (most notably, Xen). Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-18	drivers: Remove unnecessary inclusions of asm/semaphore.h	Matthew Wilcox	1	-1/+0
	None of these files use any of the functionality promised by asm/semaphore.h. It's possible that they rely on it dragging in some unrelated header file, but I can't build all these files, so we'll have fix any build failures as they come up. Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
2008-01-30	lguest: Use explicit includes rateher than indirect	Glauber de Oliveira Costa	1	-0/+1
	explicitly use ktime.h include explicitly use hrtimer.h include explicitly use sched.h include This patch adds headers explicitly to lguest sources file, to avoid depending on them being included somewhere else. Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-01-30	lguest: get rid of lg variable assignments	Glauber de Oliveira Costa	1	-14/+14
	We can save some lines of code by getting rid of *lg = cpu... lines of code spread everywhere by now. Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-01-30	lguest: move changed bitmap to lg_cpu	Glauber de Oliveira Costa	1	-3/+3
	events represented in the 'changed' bitmap are per-cpu, not per-guest. move it to the lg_cpu structure Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-01-30	lguest: move last_pages to lg_cpu	Glauber de Oliveira Costa	1	-1/+2
	in our new model, pages are assigned to a virtual cpu, not to a guest. We move it to the lg_cpu structure. Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-01-30	lguest: per-vcpu lguest pgdir management	Glauber de Oliveira Costa	1	-6/+6
	this patch makes the pgdir management per-vcpu. The pgdirs pool is still guest-wide (although it'll probably need to grow when we are really executing more vcpus), but the pgdidx index is gone, since it makes no sense anymore. Instead, we use a per-vcpu index. Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-01-30	lguest: make pending notifications per-vcpu	Glauber de Oliveira Costa	1	-1/+2
	this patch makes the pending_notify field, used to control pending notifications, per-vcpu, instead of per-guest Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-01-30	lguest: makes special fields be per-vcpu	Glauber de Oliveira Costa	1	-8/+9
	lguest struct have room for some fields, namely, cr2, ts, esp1 and ss1, that are not really guest-wide, but rather, vcpu-wide. This patch puts it in the vcpu struct Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-01-30	lguest: per-vcpu lguest task management	Glauber de Oliveira Costa	1	-7/+7
	lguest uses tasks to control its running behaviour (like sending breaks, controlling halted state, etc). In a per-vcpu environment, each vcpu will have its own underlying task. So this patch makes the infrastructure for that possible Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-01-30	lguest: replace lguest_arch with lg_cpu_arch.	Glauber de Oliveira Costa	1	-9/+10
	The fields found in lguest_arch are not really per-guest, but per-cpu (gdt, idt, etc). So this patch turns lguest_arch into lg_cpu_arch. It makes sense to have a per-guest per-arch struct, but this can be addressed later, when the need arrives. Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-01-30	lguest: make registers per-vcpu	Glauber de Oliveira Costa	1	-4/+5
	This is the most obvious per-vcpu field: registers. So this patch moves it from struct lguest to struct vcpu, and patch the places in which they are used, accordingly Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-01-30	lguest: map_switcher_in_guest() per-vcpu	Glauber de Oliveira Costa	1	-1/+1
	The switcher needs to be mapped per-vcpu, because different vcpus will potentially have different page tables (they don't have to, because threads will share the same). So our first step is the make the function receive a vcpu struct Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-01-30	lguest: per-vcpu interrupt processing.	Glauber de Oliveira Costa	1	-5/+5
	This patch adapts interrupt processing for using the vcpu struct. Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-01-30	lguest: per-vcpu lguest timers	Glauber de Oliveira Costa	1	-5/+5
	Here, I introduce per-vcpu timers. With this, we can have local expiries, needed for accounting time in smp guests Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-01-30	lguest: make hypercalls use the vcpu struct	Glauber de Oliveira Costa	1	-8/+8
	this patch changes do_hcall() and do_async_hcall() interfaces (and obviously their callers) to get a vcpu struct. Again, a vcpu services the hypercall, not the whole guest Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-01-30	lguest: per-cpu run guest	Glauber de Oliveira Costa	1	-2/+2
	This patch makes the run_guest() routine use the lg_cpu struct. This is required since in a smp guest environment, there's no more the notion of "running the guest", but rather, it is "running the vcpu" Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-01-30	lguest: introduce vcpu struct	Glauber de Oliveira Costa	1	-0/+10
	this patch introduces a vcpu struct for lguest. In upcoming patches, more and more fields will be moved from the lguest struct to the vcpu Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2007-10-25	lguest: documentation update	Rusty Russell	1	-2/+2
	Went through the documentation doing typo and content fixes. This patch contains only comment and whitespace changes. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2007-10-25	lguest: remove unused "wake" element from struct lguest	Rusty Russell	1	-3/+0
	Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2007-10-23	generalize lgread_u32/lgwrite_u32.	Rusty Russell	1	-4/+19
	Jes complains that page table code still uses lgread_u32 even though it now uses general kernel pte types. The best thing to do is to generalize lgread_u32 and lgwrite_u32. This means we lose the efficiency of getuser(). We could potentially regain it if we used __copy_from_user instead of copy_from_user, but I'm not certain that our range check is equivalent to access_ok() on all platforms. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Jes Sorensen <jes@sgi.com>
2007-10-23	Remove old lguest I/O infrrasructure.	Rusty Russell	1	-26/+1
	This patch gets rid of the old lguest host I/O infrastructure and replaces it with a single hypercall "LHCALL_NOTIFY" which takes an address. The main change is the removal of io.c: that mainly did inter-guest I/O, which virtio doesn't yet support. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2007-10-23	Boot with virtual == physical to get closer to native Linux.	Rusty Russell	1	-5/+3
	1) This allows us to get alot closer to booting bzImages. 2) It means we don't have to know page_offset. 3) The Guest needs to modify the boot pagetables to create the PAGE_OFFSET mapping before jumping to C code. 4) guest_pa() walks the page tables rather than using page_offset. 5) We don't use page_offset to figure out whether to emulate: it was always kinda quesationable, and won't work for instructions done before remapping (bzImage unpacking in particular). 6) We still want the kernel address for tlb flushing: have the initial hypercall give us that, too. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2007-10-23	Allow guest to specify syscall vector to use.	Rusty Russell	1	-0/+3
	(Based on Ron Minnich's LGUEST_PLAN9_SYSCALL patch). This patch allows Guests to specify what system call vector they want, and we try to reserve it. We only allow one non-Linux system call vector, to try to avoid DoS on the Host. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2007-10-23	Rename "cr3" to "gpgdir" to avoid x86-specific naming.	Rusty Russell	1	-3/+3
	Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2007-10-23	Pagetables to use normal kernel types	Matias Zabaljauregui	1	-37/+8
	This is my first step in the migration of page_tables.c to the kernel types and functions/macros (2.6.23-rc3). Seems to be working OK. Signed-off-by: Matias Zabaljauregui <matias.zabaljauregui@cern.ch> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2007-10-23	Move register setup into i386_core.c	Jes Sorensen	1	-0/+1
	Move setup_regs() to lguest_arch_setup_regs() in i386_core.c given that this is very architecture specific. Signed-off-by: Jes Sorensen <jes@sgi.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2007-10-23	Make hypercalls arch-independent.	Jes Sorensen	1	-1/+3
	Clean up the hypercall code to make the code in hypercalls.c architecture independent. First process the common hypercalls and then call lguest_arch_do_hcall() if the call hasn't been handled. Rename struct hcall_ring to hcall_args. This patch requires the previous patch which reorganize the layout of struct lguest_regs on i386 so they match the layout of struct hcall_args. Signed-off-by: Jes Sorensen <jes@sgi.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2007-10-23	Introduce "hcall" pointer to indicate pending hypercall.	Rusty Russell	1	-0/+3
	Currently we look at the "trapnum" to see if the Guest wants a hypercall. But once the hypercall is done we have to reset trapnum to a bogus value, otherwise if we exit to userspace and return, we'd run the same hypercall twice (that was a nasty bug to find!). This has two main effects: 1) When Jes's patch changes the hypercall args to be a generic "struct hcall_args" we simply change the type of "lg->hcall". It's set by arch code, so if it has to copy args or something it can do so, and point "hcall" into lg->arch somewhere. 2) Async hypercalls only get run when an actual hypercall is pending. This simplfies the code a little and is a more logical semantic. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2007-10-23	Move i386 part of core.c to x86/core.c.	Jes Sorensen	1	-52/+11
	Separate i386 architecture specific from core.c and move it to x86/core.c and add x86/lguest.h header file to match. Signed-off-by: Jes Sorensen <jes@sgi.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2007-10-23	Make shadow IDT a complete IDT with 256 entries.	Rusty Russell	1	-2/+1
	This simplifies the code a little, in preparation for allowing alternate system call vectors in guests (Plan 9 uses 0x40). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2007-10-23	Remove fixed limit on number of guests, and lguests array.	Rusty Russell	1	-4/+1
	Back when we had all the Guest state in the switcher, we had a fixed array of them. This is no longer necessary. If we switch the network code to using random_ether_addr (46 bits is enough to avoid clashes), we can get rid of the concept of "guest id" altogether. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2007-10-23	Introduce guest mem offset, static link example launcher	Rusty Russell	1	-0/+3
	In order to avoid problematic special linking of the Launcher, we give the Host an offset: this means we can use any memory region in the Launcher as Guest memory rather than insisting on mmap() at 0. The result is quite pleasing: a number of casts are replaced with simple additions. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2007-10-23	Remove binfmts.h include from lg.h	Rusty Russell	1	-1/+0
	It wasn't needed since a very early prototype of lguest. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2007-07-28	Provide timespec to guests rather than jiffies clock.	Rusty Russell	1	-0/+1
	A non-periodic clock_event_device and the "jiffies" clock don't mix well: tick_handle_periodic() can go into an infinite loop. Currently lguest guests use the jiffies clock when the TSC is unusable. Instead, make the Host write the current time into the lguest page on every interrupt. This doesn't cost much but is more precise and at least as accurate as the jiffies clock. It also gets rid of the GET_WALLCLOCK hypercall. Also, delay setting sched_clock until our clock is set up, otherwise the early printk timestamps can go backwards (not harmful, just ugly). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>