summaryrefslogtreecommitdiffstats
path: root/drivers/xen/grant-table.c
AgeCommit message (Collapse)AuthorFilesLines
2013-12-04XEN: Grant table address, xen_hvm_resume_frames, is a phys_addr not a pfnEric Trudeau1-1/+2
From: Eric Trudeau <etrudeau@broadcom.com> xen_hvm_resume_frames stores the physical address of the grant table. englighten.c was incorrectly setting it as if it was a page frame number. This caused the table to be mapped into the guest at an unexpected physical address. Additionally, a warning is improved to include the grant table address which failed in xen_remap. Signed-off-by: Eric Trudeau <etrudeau@broadcom.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2013-11-26xen/gnttab: leave lazy MMU mode in the case of a m2p override failureMatt Wilson1-2/+4
Commit f62805f1 introduced a bug where lazy MMU mode isn't exited if a m2p_add/remove_override call fails. Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Anthony Liguori <aliguori@amazon.com> Cc: xen-devel@lists.xenproject.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Matt Wilson <msw@amazon.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> CC: stable@vger.kernel.org
2013-10-25grant-table: call set_phys_to_machine after mapping grant refsStefano Stabellini1-2/+17
When mapping/unmapping grant refs, call set_phys_to_machine to update the P2M with the new mappings for autotranslate guests. This is (almost) a nop on x86. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Changes in v9: - add in-code comments.
2013-08-09xen-gnt: prevent adding duplicate gnt callbacksRoger Pau Monne1-2/+11
With the current implementation, the callback in the tail of the list can be added twice, because the check done in gnttab_request_free_callback is bogus, callback->next can be NULL if it is the last callback in the list. If we add the same callback twice we end up with an infinite loop, were callback == callback->next. Replace this check with a proper one that iterates over the list to see if the callback has already been added. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Matt Wilson <msw@amazon.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> CC: stable@vger.kernel.org
2013-06-28xen: Convert printks to pr_<level>Joe Perches1-9/+8
Convert printks to pr_<level> (excludes printk(KERN_DEBUG...) to be more consistent throughout the xen subsystem. Add pr_fmt with KBUILD_MODNAME or "xen:" KBUILD_MODNAME Coalesce formats and add missing word spaces Add missing newlines Align arguments and reflow to 80 columns Remove DRV_NAME from formats as pr_fmt adds the same content This does change some of the prefixes of these messages but it also does make them more consistent. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-02-19xen: introduce xen_remap, use it instead of ioremapStefano Stabellini1-1/+1
ioremap can't be used to map ring pages on ARM because it uses device memory caching attributes (MT_DEVICE*). Introduce a Xen specific abstraction to map ring pages, called xen_remap, that is defined as ioremap on x86 (no behavioral changes). On ARM it explicitly calls __arm_ioremap with the right caching attributes: MT_MEMORY. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-01-18Merge tag 'stable/for-linus-3.8-rc3-tag' of ↵Linus Torvalds1-19/+29
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen Pull Xen fixes from Konrad Rzeszutek Wilk: - CVE-2013-0190/XSA-40 (or stack corruption for 32-bit PV kernels) - Fix racy vma access spotted by Al Viro - Fix mmap batch ioctl potentially resulting in large O(n) page allcations. - Fix vcpu online/offline BUG:scheduling while atomic.. - Fix unbound buffer scanning for more than 32 vCPUs. - Fix grant table being incorrectly initialized - Fix incorrect check in pciback - Allow privcmd in backend domains. Fix up whitespace conflict due to ugly merge resolution in Xen tree in arch/arm/xen/enlighten.c * tag 'stable/for-linus-3.8-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen: Fix stack corruption in xen_failsafe_callback for 32bit PVOPS guests. Revert "xen/smp: Fix CPU online/offline bug triggering a BUG: scheduling while atomic." xen/gntdev: remove erronous use of copy_to_user xen/gntdev: correctly unmap unlinked maps in mmu notifier xen/gntdev: fix unsafe vma access xen/privcmd: Fix mmap batch ioctl. Xen: properly bound buffer access when parsing cpu/*/availability xen/grant-table: correctly initialize grant table version 1 x86/xen : Fix the wrong check in pciback xen/privcmd: Relax access control in privcmd_ioctl_mmap
2013-01-15xen/grant-table: correctly initialize grant table version 1Matt Wilson1-19/+29
Commit 85ff6acb075a484780b3d763fdf41596d8fc0970 (xen/granttable: Grant tables V2 implementation) changed the GREFS_PER_GRANT_FRAME macro from a constant to a conditional expression. The expression depends on grant_table_version being appropriately set. Unfortunately, at init time grant_table_version will be 0. The GREFS_PER_GRANT_FRAME conditional expression checks for "grant_table_version == 1", and therefore returns the number of grant references per frame for v2. This causes gnttab_init() to allocate fewer pages for gnttab_list, as a frame can old half the number of v2 entries than v1 entries. After gnttab_resume() is called, grant_table_version is appropriately set. nr_init_grefs will then be miscalculated and gnttab_free_count will hold a value larger than the actual number of free gref entries. If a guest is heavily utilizing improperly initialized v1 grant tables, memory corruption can occur. One common manifestation is corruption of the vmalloc list, resulting in a poisoned pointer derefrence when accessing /proc/meminfo or /proc/vmallocinfo: [ 40.770064] BUG: unable to handle kernel paging request at 0000200200001407 [ 40.770083] IP: [<ffffffff811a6fb0>] get_vmalloc_info+0x70/0x110 [ 40.770102] PGD 0 [ 40.770107] Oops: 0000 [#1] SMP [ 40.770114] CPU 10 This patch introduces a static variable, grefs_per_grant_frame, to cache the calculated value. gnttab_init() now calls gnttab_request_version() early so that grant_table_version and grefs_per_grant_frame can be appropriately set. A few BUG_ON()s have been added to prevent this type of bug from reoccurring in the future. Signed-off-by: Matt Wilson <msw@amazon.com> Reviewed-and-Tested-by: Steven Noonan <snoonan@amazon.com> Acked-by: Ian Campbell <Ian.Campbell@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Annie Li <annie.li@oracle.com> Cc: xen-devel@lists.xen.org Cc: linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org # v3.3 and newer Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2013-01-03Drivers: xen: remove __dev* attributes.Greg Kroah-Hartman1-1/+1
CONFIG_HOTPLUG is going away as an option. As a result, the __dev* markings need to be removed. This change removes the use of __devinit, and __devinitdata from these drivers. Based on patches originally written by Bill Pemberton, but redone by me in order to handle some of the coding style issues better, by hand. Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Jan Beulich <jbeulich@suse.com> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Bill Pemberton <wfp5p@virginia.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-10-19Merge commit 'v3.7-rc1' into stable/for-linus-3.7Konrad Rzeszutek Wilk1-2/+4
* commit 'v3.7-rc1': (10892 commits) Linux 3.7-rc1 x86, boot: Explicitly include autoconf.h for hostprogs perf: Fix UAPI fallout ARM: config: make sure that platforms are ordered by option string ARM: config: sort select statements alphanumerically UAPI: (Scripted) Disintegrate include/linux/byteorder UAPI: (Scripted) Disintegrate include/linux UAPI: Unexport linux/blk_types.h UAPI: Unexport part of linux/ppp-comp.h perf: Handle new rbtree implementation procfs: don't need a PATH_MAX allocation to hold a string representation of an int vfs: embed struct filename inside of names_cache allocation if possible audit: make audit_inode take struct filename vfs: make path_openat take a struct filename pointer vfs: turn do_path_lookup into wrapper around struct filename variant audit: allow audit code to satisfy getname requests from its names_list vfs: define struct filename and have getname() return it btrfs: Fix compilation with user namespace support enabled userns: Fix posix_acl_file_xattr_userns gid conversion userns: Properly print bluetooth socket uids ...
2012-10-19xen: grant: use xen_pfn_t type for frame_list.Ian Campbell1-4/+4
This correctly sizes it as 64 bit on ARM but leaves it as unsigned long on x86 (therefore no intended change on x86). The long and ulong guest handles are now unused (and a bit dangerous) so remove them. Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-10-02Merge tag 'stable/for-linus-3.7-x86-tag' of ↵Linus Torvalds1-7/+60
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen Pull Xen update from Konrad Rzeszutek Wilk: "Features: - When hotplugging PCI devices in a PV guest we can allocate Xen-SWIOTLB later. - Cleanup Xen SWIOTLB. - Support pages out grants from HVM domains in the backends. - Support wild cards in xen-pciback.hide=(BDF) arguments. - Update grant status updates with upstream hypervisor. - Boot PV guests with more than 128GB. - Cleanup Xen MMU code/add comments. - Obtain XENVERS using a preferred method. - Lay out generic changes to support Xen ARM. - Allow privcmd ioctl for HVM (used to do only PV). - Do v2 of mmap_batch for privcmd ioctls. - If hypervisor saves the LED keyboard light - we will now instruct the kernel about its state. Fixes: - More fixes to Xen PCI backend for various calls/FLR/etc. - With more than 4GB in a 64-bit PV guest disable native SWIOTLB. - Fix up smatch warnings. - Fix up various return values in privmcmd and mm." * tag 'stable/for-linus-3.7-x86-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: (48 commits) xen/pciback: Restore the PCI config space after an FLR. xen-pciback: properly clean up after calling pcistub_device_find() xen/vga: add the xen EFI video mode support xen/x86: retrieve keyboard shift status flags from hypervisor. xen/gndev: Xen backend support for paged out grant targets V4. xen-pciback: support wild cards in slot specifications xen/swiotlb: Fix compile warnings when using plain integer instead of NULL pointer. xen/swiotlb: Remove functions not needed anymore. xen/pcifront: Use Xen-SWIOTLB when initting if required. xen/swiotlb: For early initialization, return zero on success. xen/swiotlb: Use the swiotlb_late_init_with_tbl to init Xen-SWIOTLB late when PV PCI is used. xen/swiotlb: Move the error strings to its own function. xen/swiotlb: Move the nr_tbl determination in its own function. xen/arm: compile and run xenbus xen: resynchronise grant table status codes with upstream xen/privcmd: return -EFAULT on error xen/privcmd: Fix mmap batch ioctl error status copy back. xen/privcmd: add PRIVCMD_MMAPBATCH_V2 ioctl xen/mm: return more precise error from xen_remap_domain_range() xen/mmu: If the revector fails, don't attempt to revector anything else. ...
2012-09-21xen/gndev: Xen backend support for paged out grant targets V4.Andres Lagar-Cavilla1-0/+53
Since Xen-4.2, hvm domains may have portions of their memory paged out. When a foreign domain (such as dom0) attempts to map these frames, the map will initially fail. The hypervisor returns a suitable errno, and kicks an asynchronous page-in operation carried out by a helper. The foreign domain is expected to retry the mapping operation until it eventually succeeds. The foreign domain is not put to sleep because itself could be the one running the pager assist (typical scenario for dom0). This patch adds support for this mechanism for backend drivers using grant mapping and copying operations. Specifically, this covers the blkback and gntdev drivers (which map foreign grants), and the netback driver (which copies foreign grants). * Add a retry method for grants that fail with GNTST_eagain (i.e. because the target foreign frame is paged out). * Insert hooks with appropriate wrappers in the aforementioned drivers. The retry loop is only invoked if the grant operation status is GNTST_eagain. It guarantees to leave a new status code different from GNTST_eagain. Any other status code results in identical code execution as before. The retry loop performs 256 attempts with increasing time intervals through a 32 second period. It uses msleep to yield while waiting for the next retry. V2 after feedback from David Vrabel: * Explicit MAX_DELAY instead of wrap-around delay into zero * Abstract GNTST_eagain check into core grant table code for netback module. V3 after feedback from Ian Campbell: * Add placeholder in array of grant table error descriptions for unrelated error code we jump over. * Eliminate single map and retry macro in favor of a generic batch flavor. * Some renaming. * Bury most implementation in grant_table.c, cleaner interface. V4 rebased on top of sync of Xen grant table interface headers. Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> Acked-by: Ian Campbell <ian.campbell@citrix.com> [v5: Fixed whitespace issues] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-09-12xen/m2p: do not reuse kmap_op->dev_bus_addrStefano Stabellini1-2/+4
If the caller passes a valid kmap_op to m2p_add_override, we use kmap_op->dev_bus_addr to store the original mfn, but dev_bus_addr is part of the interface with Xen and if we are batching the hypercalls it might not have been written by the hypervisor yet. That means that later on Xen will write to it and we'll think that the original mfn is actually what Xen has written to it. Rather than "stealing" struct members from kmap_op, keep using page->index to store the original mfn and add another parameter to m2p_remove_override to get the corresponding kmap_op instead. It is now responsibility of the caller to keep track of which kmap_op corresponds to a particular page in the m2p_override (gntdev, the only user of this interface that passes a valid kmap_op, is already doing that). CC: stable@kernel.org Reported-and-Tested-By: Sander Eikelenboom <linux@eikelenboom.it> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-08-21xen/apic/xenbus/swiotlb/pcifront/grant/tmem: Make functions or variables static.Konrad Rzeszutek Wilk1-7/+6
There is no need for those functions/variables to be visible. Make them static and also fix the compile warnings of this sort: drivers/xen/<some file>.c: warning: symbol '<blah>' was not declared. Should it be static? Some of them just require including the header file that declares the functions. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-08-21xen: missing includesStefano Stabellini1-0/+1
Changes in v2: - remove pvclock hack; - remove include linux/types.h from xen/interface/xen.h. v3: - Compile under IA64 Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-05-24Merge tag 'stable/for-linus-3.5-rc0-tag' of ↵Linus Torvalds1-10/+115
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen Pull Xen updates from Konrad Rzeszutek Wilk: "Features: * Extend the APIC ops implementation and add IRQ_WORKER vector support so that 'perf' can work properly. * Fix self-ballooning code, and balloon logic when booting as initial domain. * Move array printing code to generic debugfs * Support XenBus domains. * Lazily free grants when a domain is dead/non-existent. * In M2P code use batching calls Bug-fixes: * Fix NULL dereference in allocation failure path (hvc_xen) * Fix unbinding of IRQ_WORKER vector during vCPU hot-unplug * Fix HVM guest resume - we would leak an PIRQ value instead of reusing the existing one." Fix up add-add onflicts in arch/x86/xen/enlighten.c due to addition of apic ipi interface next to the new apic_id functions. * tag 'stable/for-linus-3.5-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen: do not map the same GSI twice in PVHVM guests. hvc_xen: NULL dereference on allocation failure xen: Add selfballoning memory reservation tunable. xenbus: Add support for xenbus backend in stub domain xen/smp: unbind irqworkX when unplugging vCPUs. xen: enter/exit lazy_mmu_mode around m2p_override calls xen/acpi/sleep: Enable ACPI sleep via the __acpi_os_prepare_sleep xen: implement IRQ_WORK_VECTOR handler xen: implement apic ipi interface xen/setup: update VA mapping when releasing memory during setup xen/setup: Combine the two hypercall functions - since they are quite similar. xen/setup: Populate freed MFNs from non-RAM E820 entries and gaps to E820 RAM xen/setup: Only print "Freeing XXX-YYY pfn range: Z pages freed" if Z > 0 xen/gnttab: add deferred freeing logic debugfs: Add support to print u32 array in debugfs xen/p2m: An early bootup variant of set_phys_to_machine xen/p2m: Collapse early_alloc_p2m_middle redundant checks. xen/p2m: Allow alloc_p2m_middle to call reserve_brk depending on argument xen/p2m: Move code around to allow for better re-usage.
2012-05-07xen: enter/exit lazy_mmu_mode around m2p_override callsStefano Stabellini1-0/+19
This patch is a significant performance improvement for the m2p_override: about 6% using the gntdev device. Each m2p_add/remove_override call issues a MULTI_grant_table_op and a __flush_tlb_single if kmap_op != NULL. Batching all the calls together is a great performance benefit because it means issuing one hypercall total rather than two hypercall per page. If paravirt_lazy_mode is set PARAVIRT_LAZY_MMU, all these calls are going to be batched together, otherwise they are issued one at a time. Adding arch_enter_lazy_mmu_mode/arch_leave_lazy_mmu_mode around the m2p_add/remove_override calls forces paravirt_lazy_mode to PARAVIRT_LAZY_MMU, therefore makes sure that they are always batched. However it is not safe to call arch_enter_lazy_mmu_mode if we are in interrupt context or if we are already in PARAVIRT_LAZY_MMU mode, so check for both conditions before doing so. Changes in v4: - rebased on 3.4-rc4: all the m2p_override users call gnttab_unmap_refs and gnttab_map_refs; - check whether we are in interrupt context and the lazy_mode we are in before calling arch_enter/leave_lazy_mmu_mode. Changes in v3: - do not call arch_enter/leave_lazy_mmu_mode in xen_blkbk_unmap, that can be called in interrupt context. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> [v5: s/int lazy/bool lazy/] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-04-18Merge commit 'c104f1fa1ecf4ee0fc06e31b1f77630b2551be81' into ↵Konrad Rzeszutek Wilk1-2/+5
stable/for-linus-3.4 * commit 'c104f1fa1ecf4ee0fc06e31b1f77630b2551be81': (14566 commits) cpufreq: OMAP: fix build errors: depends on ARCH_OMAP2PLUS sparc64: Eliminate obsolete __handle_softirq() function sparc64: Fix bootup crash on sun4v. kconfig: delete last traces of __enabled_ from autoconf.h Revert "kconfig: fix __enabled_ macros definition for invisible and un-selected symbols" kconfig: fix IS_ENABLED to not require all options to be defined irq_domain: fix type mismatch in debugfs output format staging: android: fix mem leaks in __persistent_ram_init() staging: vt6656: Don't leak memory in drivers/staging/vt6656/ioctl.c::private_ioctl() staging: iio: hmc5843: Fix crash in probe function. panic: fix stack dump print on direct call to panic() drivers/rtc/rtc-pl031.c: enable clock on all ST variants Revert "mm: vmscan: fix misused nr_reclaimed in shrink_mem_cgroup_zone()" hugetlb: fix race condition in hugetlb_fault() drivers/rtc/rtc-twl.c: use static register while reading time drivers/rtc/rtc-s3c.c: add placeholder for driver private data drivers/rtc/rtc-s3c.c: fix compilation error MAINTAINERS: add PCDP console maintainer memcg: do not open code accesses to res_counter members drivers/rtc/rtc-efi.c: fix section mismatch warning ...
2012-04-17xen/gnttab: add deferred freeing logicJan Beulich1-10/+96
Rather than just leaking pages that can't be freed at the point where access permission for the backend domain gets revoked, put them on a list and run a timer to (infrequently) retry freeing them. (This can particularly happen when unloading a frontend driver when devices are still present, and the backend still has them in non-closed state or hasn't finished closing them yet.) Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-04-17xen/grant-table: add error-handling code on failure of gnttab_resumeJulia Lawall1-4/+9
Jump to the label ini_nomem as done on the failure of the page allocations above. The code at ini_nomem is modified to accommodate different return values. Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-01-27xen/granttable: Disable grant v2 for HVM domains.Konrad Rzeszutek Wilk1-2/+5
As proper scaffolding for supporting error status is not yet implemented. BUG: unable to handle kernel NULL pointer dereference at 0000000000000400 IP: [<ffffffff81375ae9>] gnttab_end_foreign_access_ref_v2+0x29/0x40 PGD 32aa3067 PUD 32a87067 PMD 0 Oops: 0000 [#1] PREEMPT SMP CPU 0 Modules linked in: sg sr_mod cdrom ata_generic ata_piix libata scsi_mod xen_blkfront xen_netfront fb_sys_fops sysimgblt sysfillrect syscopyarea xen_kbdfront cmd Pid: 2307, comm: ip Not tainted 3.3.0-rc1 #1 Xen HVM domU RIP: 0010:[<ffffffff81375ae9>] [<ffffffff81375ae9>] gnttab_end_foreign_access_ref_v2+0x29/0x40 RSP: 0018:ffff88003be03d38 EFLAGS: 00010206 RAX: 0000000000000000 RBX: ffff880033210640 RCX: 0000000000000040 RDX: 0000000000002000 RSI: 0000000000000000 RDI: 0000000000000200 RBP: ffff88003be03d38 R08: 0000000000000101 R09: 0000000000000000 R10: dead000000100100 R11: 0000000000000000 R12: ffff88003be03e48 R13: 0000000000000001 R14: ffff880039461c00 R15: 0000000000000200 FS: 00007fb1f84ec700(0000) GS:ffff88003be00000(0000) knlGS:0000000000000000 ... Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-12-20xen/grant-table: Support mappings required by blkbackDaniel De Graaf1-19/+5
Add support for mappings without GNTMAP_contains_pte. This was not supported because the unmap operation assumed that this flag was being used; adding a parameter to the unmap operation to allow the PTE clearing to be disabled is sufficient to make unmap capable of supporting either mapping type. Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> [v1: Fix cleanpatch warnings] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-12-16xen/granttable: Support transitive grantsAnnie Li1-0/+70
These allow a domain A which has been granted access on a page of domain B's memory to issue domain C with a copy-grant on the same page. This is useful e.g. for forwarding packets between domains. Signed-off-by: Annie Li <annie.li@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-12-16xen/granttable: Support sub-page grantsAnnie Li1-0/+72
- They can't be used to map the page (so can only be used in a GNTTABOP_copy hypercall). - It's possible to grant access with a finer granularity than whole pages. - Xen guarantees that they can be revoked quickly (a normal map grant can only be revoked with the cooperation of the domain which has been granted access). Signed-off-by: Annie Li <annie.li@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-12-16xen/granttable: Improve comments for function pointersAnnie Li1-24/+24
Signed-off-by: Annie Li <annie.li@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-11-22xen/granttable: Keep code format cleanAnnie Li1-4/+3
Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Annie Li <annie.li@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-11-22xen/granttable: Grant tables V2 implementationAnnie Li1-4/+167
Receiver-side copying of packets is based on this implementation, it gives better performance and better CPU accounting. It totally supports three types: full-page, sub-page and transitive grants. However this patch does not cover sub-page and transitive grants, it mainly focus on Full-page part and implements grant table V2 interfaces corresponding to what already exists in grant table V1, such as: grant table V2 initialization, mapping, releasing and exported interfaces. Each guest can only supports one type of grant table type, every entry in grant table should be the same version. It is necessary to set V1 or V2 version before initializing the grant table. Grant table exported interfaces of V2 are same with those of V1, Xen is responsible to judge what grant table version guests are using in every grant operation. V2 fulfills the same role of V1, and it is totally backwards compitable with V1. If dom0 support grant table V2, the guests runing on it can run with either V1 or V2. Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Annie Li <annie.li@oracle.com> [v1: Modified alloc_vm_area call (new parameters), indentation, and cleanpatch warnings] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-11-22xen/granttable: Refactor some codeAnnie Li1-5/+10
Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Annie Li <annie.li@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-11-22xen/granttable: Introducing grant table V2 stuctureAnnie Li1-40/+141
This patch introduces new structures of grant table V2, grant table V2 is an extension from V1. Grant table is shared between guest and Xen, and Xen is responsible to do corresponding work for grant operations, such as: figure out guest's grant table version, perform different actions based on different grant table version, etc. Although full-page structure of V2 is different from V1, it play the same role as V1. Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Annie Li <annie.li@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-11-06Merge branch 'stable/cleanups-3.2' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen * 'stable/cleanups-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen: use static initializers in xen-balloon.c Xen: fix braces and tabs coding style issue in xenbus_probe.c Xen: fix braces coding style issue in xenbus_probe.h Xen: fix whitespaces,tabs coding style issue in drivers/xen/pci.c Xen: fix braces coding style issue in gntdev.c and grant-table.c Xen: fix whitespaces,tabs coding style issue in drivers/xen/events.c Xen: fix whitespaces,tabs coding style issue in drivers/xen/balloon.c Fix up trivial whitespace-conflicts in drivers/xen/{balloon.c,pci.c,xenbus/xenbus_probe.c}
2011-09-29xen: modify kernel mappings corresponding to granted pagesStefano Stabellini1-3/+3
If we want to use granted pages for AIO, changing the mappings of a user vma and the corresponding p2m is not enough, we also need to update the kernel mappings accordingly. Currently this is only needed for pages that are created for user usages through /dev/xen/gntdev. As in, pages that have been in use by the kernel and use the P2M will not need this special mapping. However there are no guarantees that in the future the kernel won't start accessing pages through the 1:1 even for internal usage. In order to avoid the complexity of dealing with highmem, we allocated the pages lowmem. We issue a HYPERVISOR_grant_table_op right away in m2p_add_override and we remove the mappings using another HYPERVISOR_grant_table_op in m2p_remove_override. Considering that m2p_add_override and m2p_remove_override are called once per page we use multicalls and hypercall batching. Use the kmap_op pointer directly as argument to do the mapping as it is guaranteed to be present up until the unmapping is done. Before issuing any unmapping multicalls, we need to make sure that the mapping has already being done, because we need the kmap->handle to be set correctly. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> [v1: Removed GRANT_FRAME_BIT usage] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-07-26Xen: fix braces coding style issue in gntdev.c and grant-table.cRuslan Pisarev1-1/+1
This is a patch to the gntdev.c and grant-table.c files that fixed up braces errors found by the checkpatch.pl tools. Signed-off-by: Ruslan Pisarev <ruslan@rpisarev.org.ua> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-07-26xen/grant: Fix compile warning.Konrad Rzeszutek Wilk1-1/+1
drivers/xen/grant-table.c:85: warning: ‘rc’ may be used uninitialized in this function Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-04-18xen/p2m/m2p/gnttab: Support GNTMAP_host_map in the M2P override.Konrad Rzeszutek Wilk1-7/+24
We only supported the M2P (and P2M) override only for the GNTMAP_contains_pte type mappings. Meaning that we grants operations would "contain the machine address of the PTE to update" If the flag is unset, then the grant operation is "contains a host virtual address". The latter case means that the Hypervisor takes care of updating our page table (specifically the PTE entry) with the guest's MFN. As such we should not try to do anything with the PTE. Previous to this patch we would try to clear the PTE which resulted in Xen hypervisor being upset with us: (XEN) mm.c:1066:d0 Attempt to implicitly unmap a granted PTE c0100000ccc59067 (XEN) domain_crash called from mm.c:1067 (XEN) Domain 0 (vcpu#0) crashed on cpu#3: (XEN) ----[ Xen-4.0-110228 x86_64 debug=y Not tainted ]---- and crashing us. This patch allows us to inhibit the PTE clearing in the PV guest if the GNTMAP_contains_pte is not set. On the m2p_remove_override path we provide the same parameter. Sadly in the grant-table driver we do not have a mechanism to tell m2p_remove_override whether to clear the PTE or not. Since the grant-table driver is used by user-space, we can safely assume that it operates only on PTE's. Hence the implementation for it to work on !GNTMAP_contains_pte returns -EOPNOTSUPP. In the future we can implement the support for this. It will require some extra accounting structure to keep track of the page[i], and the flag. [v1: Added documentation details, made it return -EOPNOTSUPP instead of trying to do a half-way implementation] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-03-09xen/p2m/m2p/gnttab: do not add failed grant maps to m2p overrideIan Campbell1-0/+4
The caller will not undo a mapping which failed and therefore the override will not be removed. This is especially bad in the case of GNTMAP_contains_pte mapping type mappings where m2p_add_override will destroy the kernel mapping of the page. This was observed via a failure of map_grant_pages in gntdev_mmap (due to userspace using a bad grant reference), which left the page in question unmapped (because it was a GNTMAP_contains_pte mapping) which led to a crash later on. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Daniel De Graaf <dgdegra@tycho.nsa.gov> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-02-14xen-gntdev: Support mapping in HVM domainsDaniel De Graaf1-0/+6
HVM does not allow direct PTE modification, so instead we request that Xen change its internal p2m mappings on the allocated pages and map the memory into userspace normally. Note: The HVM path for map and unmap is slightly different: HVM keeps the pages mapped until the area is deleted, while the PV case (use_ptemod being true) must unmap them when userspace unmaps the range. In the normal use case, this makes no difference to users since unmap time is deletion time. [v2: Expanded commit descr.] Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-01-11xen p2m: clear the old pte when adding a page to m2p_overrideJeremy Fitzhardinge1-3/+13
When adding a page to m2p_override we change the p2m of the page so we need to also clear the old pte of the kernel linear mapping because it doesn't correspond anymore. When we remove the page from m2p_override we restore the original p2m of the page and we also restore the old pte of the kernel linear mapping. Before changing the p2m mappings in m2p_add_override and m2p_remove_override, check that the page passed as argument is valid and return an error if it is not. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-01-11xen: introduce gnttab_map_refs and gnttab_unmap_refsStefano Stabellini1-0/+36
gnttab_map_refs maps some grant refs and uses the new m2p override to set a proper m2p mapping for the granted pages. gnttab_unmap_refs unmaps the granted refs and removes th mappings from the m2p override. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2010-07-22xen: Xen PCI platform device driver.Stefano Stabellini1-10/+67
Add the xen pci platform device driver that is responsible for initializing the grant table and xenbus in PV on HVM mode. Few changes to xenbus and grant table are necessary to allow the delayed initialization in HVM mode. Grant table needs few additional modifications to work in HVM mode. The Xen PCI platform device raises an irq every time an event has been delivered to us. However these interrupts are only delivered to vcpu 0. The Xen PCI platform interrupt handler calls xen_hvm_evtchn_do_upcall that is a little wrapper around __xen_evtchn_do_upcall, the traditional Xen upcall handler, the very same used with traditional PV guests. When running on HVM the event channel upcall is never called while in progress because it is a normal Linux irq handler (and we cannot switch the irq chip wholesale to the Xen PV ones as we are running QEMU and might have passed in PCI devices), therefore we cannot be sure that evtchn_upcall_pending is 0 when returning. For this reason if evtchn_upcall_pending is set by Xen we need to loop again on the event channels set pending otherwise we might loose some event channel deliveries. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2010-03-30include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵Tejun Heo1-0/+1
implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2009-11-04xen: move Xen-testing predicates to common headerJeremy Fitzhardinge1-0/+1
Move xen_domain and related tests out of asm-x86 to xen/xen.h so they can be included whenever they are necessary. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-12-16xen: clean up asm/xen/hypervisor.hJeremy Fitzhardinge1-0/+1
Impact: cleanup hypervisor.h had accumulated a lot of crud, including lots of spurious #includes. Clean it all up, and go around fixing up everything else accordingly. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-20xen: clean up domain mode predicatesJeremy Fitzhardinge1-1/+1
There are four operating modes Xen code may find itself running in: - native - hvm domain - pv dom0 - pv domU Clean up predicates for testing for these states to make them more consistent. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Xen-devel <xen-devel@lists.xensource.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-05-27xen: implement save/restoreJeremy Fitzhardinge1-2/+2
This patch implements Xen save/restore and migration. Saving is triggered via xenbus, which is polled in drivers/xen/manage.c. When a suspend request comes in, the kernel prepares itself for saving by: 1 - Freeze all processes. This is primarily to prevent any partially-completed pagetable updates from confusing the suspend process. If CONFIG_PREEMPT isn't defined, then this isn't necessary. 2 - Suspend xenbus and other devices 3 - Stop_machine, to make sure all the other vcpus are quiescent. The Xen tools require the domain to run its save off vcpu0. 4 - Within the stop_machine state, it pins any unpinned pgds (under construction or destruction), performs canonicalizes various other pieces of state (mostly converting mfns to pfns), and finally 5 - Suspend the domain Restore reverses the steps used to save the domain, ending when all the frozen processes are thawed. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-24xen: make grant table arch portableIsaku Yamahata1-32/+3
split out x86 specific part from grant-table.c and allow ia64/xen specific initialization. ia64/xen grant table is based on pseudo physical address (guest physical address) unlike x86/xen. On ia64 init_mm doesn't map identity straight mapped area. ia64/xen specific grant table initialization is necessary. Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-24xen: replace callers of alloc_vm_area()/free_vm_area() with xen_ prefixed oneIsaku Yamahata1-1/+1
Don't use alloc_vm_area()/free_vm_area() directly, instead define xen_alloc_vm_area()/xen_free_vm_area() and use them. alloc_vm_area()/free_vm_area() are used to allocate/free area which are for grant table mapping. Xen/x86 grant table is based on virtual address so that alloc_vm_area()/free_vm_area() are suitable. On the other hand Xen/ia64 (and Xen/powerpc) grant table is based on pseudo physical address (guest physical address) so that allocation should be done differently. The original version of xenified Linux/IA64 have its own allocate_vm_area()/free_vm_area() definitions which don't allocate vm area contradictory to those names. Now vanilla Linux already has its definitions so that it's impossible to have IA64 definitions of allocate_vm_area()/free_vm_area(). Instead introduce xen_allocate_vm_area()/xen_free_vm_area() and use them. Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-24xen: add missing definitions for xen grant table which ia64/xen needsIsaku Yamahata1-1/+1
Add xen handles realted definitions for grant table which ia64/xen needs. Pointer argumsnts for ia64/xen hypercall are passed in pseudo physical address (guest physical address) so that it is required to convert guest kernel virtual address into pseudo physical address right before issuing hypercall. The xen guest handle represents such arguments. Define necessary handles and helper functions. Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-04-04xen: fix grant table bugMichael Abd-El-Malek1-6/+10
fix memory corruption and crash due to mis-sized grant table. A PV OS has two grant table data structures: the grant table itself and a free list. The free list is composed of an array of pages, which grow dynamically as the guest OS requires more grants. While the grant table contains 8-byte entries, the free list contains 4-byte entries. So we have half as many pages in the free list than in the grant table. There was a bug in the free list allocation code. The free list was indexed as if it was the same size as the grant table. But it's only half as large. So memory got corrupted, and I was seeing crashes in the slab allocator later on. Taken from: http://xenbits.xensource.com/linux-2.6.18-xen.hg?rev/4018c0da3360 Signed-off-by: Michael Abd-El-Malek <mabdelmalek@cmu.edu> Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-07-18xen: Add grant table supportJeremy Fitzhardinge1-0/+582
Add Xen 'grant table' driver which allows granting of access to selected local memory pages by other virtual machines and, symmetrically, the mapping of remote memory pages which other virtual machines have granted access to. This driver is a prerequisite for many of the Xen virtual device drivers, which grant the 'device driver domain' restricted and temporary access to only those memory pages that are currently involved in I/O operations. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Ian Pratt <ian.pratt@xensource.com> Signed-off-by: Christian Limpach <Christian.Limpach@cl.cam.ac.uk> Signed-off-by: Chris Wright <chrisw@sous-sol.org>