path: root/Documentation
Age | Commit message | Author | Files | Lines
2011-07-12 | x86, doc only: Correct real-mode kernel header offset for init_size | Darren Hart | 1 | -1/+1
The real-mode kernel header init_size field is located at 0x260 per the field listing in the "REAL-MODE KERNEL HEADER" section. It is listed as 0x25c in the "DETAILS OF HEADER FIELDS" section, which overlaps with pref_address. Correct the details listing to 0x260. Signed-off-by: Darren Hart <dvhart@linux.intel.com> Link: http://lkml.kernel.org/r/541cf88e2dfe5b8186d8b96b136d892e769a68c1.1310441260.git.dvhart@linux.intel.com CC: H. Peter Anvin <hpa@zytor.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
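The overlap mentioned in this message can be checked with simple range arithmetic. The following is only an illustrative sketch: the field sizes (8 bytes for pref_address at 0x258, 4 bytes for init_size) are assumptions based on the boot protocol's field listing rather than anything stated in the commit, and `overlaps` is a hypothetical helper.

```python
# Hypothetical helper; field offsets/sizes below are assumptions for illustration.
def overlaps(a_off, a_size, b_off, b_size):
    """Return True if [a_off, a_off+a_size) and [b_off, b_off+b_size) intersect."""
    return a_off < b_off + b_size and b_off < a_off + a_size

PREF_ADDRESS = (0x258, 8)  # assumed: 8-byte field starting at 0x258
INIT_SIZE_LEN = 4          # assumed: init_size is a 4-byte field

# The old listing (0x25c) collides with pref_address; the corrected one (0x260) does not.
old_listing_collides = overlaps(0x25C, INIT_SIZE_LEN, *PREF_ADDRESS)
new_listing_collides = overlaps(0x260, INIT_SIZE_LEN, *PREF_ADDRESS)
print(old_listing_collides, new_listing_collides)
```

Under those assumed sizes, pref_address occupies 0x258 through 0x25f, so a 4-byte field can only start at 0x260 or later.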
2011-07-12 | KVM: KVM Steal time guest/host interface | Glauber Costa | 1 | -0/+34
To implement steal time, we need the hypervisor to pass the guest information about how much time was spent running other processes outside the VM. This is per-vcpu, and using the kvmclock structure for that is an abuse we decided not to make. In this patchset, I am introducing a new msr, KVM_MSR_STEAL_TIME, that holds the memory area address containing information about steal time. This patch contains the headers for it. I am keeping it separate to facilitate backports for people who want to backport the kernel part but not the hypervisor, or the other way around. Signed-off-by: Glauber Costa <glommer@redhat.com> Acked-by: Rik van Riel <riel@redhat.com> Tested-by: Eric B Munson <emunson@mgebm.net> CC: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> CC: Peter Zijlstra <peterz@infradead.org> CC: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-07-12 | KVM: PPC: Allocate RMAs (Real Mode Areas) at boot for use by guests | Paul Mackerras | 1 | -0/+32
This adds infrastructure which will be needed to allow book3s_hv KVM to run on older POWER processors, including PPC970, which don't support the Virtual Real Mode Area (VRMA) facility, but only the Real Mode Offset (RMO) facility. These processors require a physically contiguous, aligned area of memory for each guest. When the guest does an access in real mode (MMU off), the address is compared against a limit value, and if it is lower, the address is ORed with an offset value (from the Real Mode Offset Register (RMOR)) and the result becomes the real address for the access. The size of the RMA has to be one of a set of supported values, which usually includes 64MB, 128MB, 256MB and some larger powers of 2. Since we are unlikely to be able to allocate 64MB or more of physically contiguous memory after the kernel has been running for a while, we allocate a pool of RMAs at boot time using the bootmem allocator. The size and number of the RMAs can be set using the kvm_rma_size=xx and kvm_rma_count=xx kernel command line options. KVM exports a new capability, KVM_CAP_PPC_RMA, to signal the availability of the pool of preallocated RMAs. The capability value is 1 if the processor can use an RMA but doesn't require one (because it supports the VRMA facility), or 2 if the processor requires an RMA for each guest. This adds a new ioctl, KVM_ALLOCATE_RMA, which allocates an RMA from the pool and returns a file descriptor which can be used to map the RMA. It also returns the size of the RMA in the argument structure. Having an RMA means we will get multiple KVM_SET_USER_MEMORY_REGION ioctl calls from userspace. To cope with this, we now preallocate the kvm->arch.ram_pginfo array when the VM is created with a size sufficient for up to 64GB of guest memory. Subsequently we will get rid of this array and use memory associated with each memslot instead.
This moves most of the code that translates the user addresses into host pfns (page frame numbers) out of kvmppc_prepare_vrma up one level to kvmppc_core_prepare_memory_region. Also, instead of having to look up the VMA for each page in order to check the page size, we now check that the pages we get are compound pages of 16MB. However, if we are adding memory that is mapped to an RMA, we don't bother with calling get_user_pages_fast and instead just offset from the base pfn for the RMA. Typically the RMA gets added after vcpus are created, which makes it inconvenient to have the LPCR (logical partition control register) value in the vcpu->arch struct, since the LPCR controls whether the processor uses RMA or VRMA for the guest. This moves the LPCR value into the kvm->arch struct and arranges for the MER (mediated external request) bit, which is the only bit that varies between vcpus, to be set in assembly code when going into the guest if there is a pending external interrupt request. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
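The real-mode address translation described in this message (compare against a limit, then OR with the RMOR offset) can be sketched as follows. This is a toy model for illustration only, not the hardware or KVM implementation: `rmo_translate` and `RealModeAccessError` are hypothetical names, and raising an exception for an out-of-range address is an assumption, since the message does not say what happens in that case.

```python
class RealModeAccessError(Exception):
    """Hypothetical: raised when a real-mode address is not below the limit."""


def rmo_translate(addr, limit, rmor):
    """Model of the RMO facility: addresses below `limit` are ORed with `rmor`."""
    if addr < limit:
        return addr | rmor
    # Assumption: the commit message does not describe the out-of-range case.
    raise RealModeAccessError(f"address {addr:#x} is not below limit {limit:#x}")


# Example values (assumed): a 64MB RMA placed at a 64MB-aligned base address.
RMA_SIZE = 64 << 20
RMA_BASE = 0x1000_0000

real = rmo_translate(0x1234, RMA_SIZE, RMA_BASE)
print(hex(real))
```

Because the RMA is aligned to its own power-of-two size, the low bits of the base are zero, so the OR gives the same result as adding the guest address to the base. That is why the area must be both physically contiguous and aligned.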
2011-07-12 | KVM: PPC: Allow book3s_hv guests to use SMT processor modes | Paul Mackerras | 1 | -0/+13
This lifts the restriction that book3s_hv guests can only run one hardware thread per core, and allows them to use up to 4 threads per core on POWER7. The host still has to run single-threaded. This capability is advertised to qemu through a new KVM_CAP_PPC_SMT capability. The return value of the ioctl querying this capability is the number of vcpus per virtual CPU core (vcore), currently 4. To use this, the host kernel should be booted with all threads active, and then all the secondary threads should be offlined. This will put the secondary threads into nap mode. KVM will then wake them from nap mode and use them for running guest code (while they are still offline). To wake the secondary threads, we send them an IPI using a new xics_wake_cpu() function, implemented in arch/powerpc/sysdev/xics/icp-native.c. In other words, at this stage we assume that the platform has a XICS interrupt controller and we are using icp-native.c to drive it. Since the woken thread will need to acknowledge and clear the IPI, we also export the base physical address of the XICS registers using kvmppc_set_xics_phys() for use in the low-level KVM book3s code. When a vcpu is created, it is assigned to a virtual CPU core. The vcore number is obtained by dividing the vcpu number by the number of threads per core in the host. This number is exported to userspace via the KVM_CAP_PPC_SMT capability. If qemu wishes to run the guest in single-threaded mode, it should make all vcpu numbers be multiples of the number of threads per core. We distinguish three states of a vcpu: runnable (i.e., ready to execute the guest), blocked (that is, idle), and busy in host. We currently implement a policy that the vcore can run only when all its threads are runnable or blocked. This way, if a vcpu needs to execute elsewhere in the kernel or in qemu, it can do so without being starved of CPU by the other vcpus. When a vcore starts to run, it executes in the context of one of the vcpu threads. 
The other vcpu threads all go to sleep and stay asleep until something happens requiring the vcpu thread to return to qemu, or to wake up to run the vcore (this can happen when another vcpu thread goes from busy in host state to blocked). It can happen that a vcpu goes from blocked to runnable state (e.g. because of an interrupt), and the vcore it belongs to is already running. In that case it can start to run immediately as long as none of the vcpus in the vcore have started to exit the guest. We send the next free thread in the vcore an IPI to get it to start to execute the guest. It synchronizes with the other threads via the vcore->entry_exit_count field to make sure that it doesn't go into the guest if the other vcpus are exiting by the time that it is ready to actually enter the guest. Note that there is no fixed relationship between the hardware thread number and the vcpu number. Hardware threads are assigned to vcpus as they become runnable, so we will always use the lower-numbered hardware threads in preference to higher-numbered threads if not all the vcpus in the vcore are runnable, regardless of which vcpus are runnable. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
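The vcpu-to-vcore mapping and the run policy described in this message can be sketched in a few lines. This is a simplified model, not the kernel code: all names here (`VcpuState`, `vcore_of`, `single_threaded_vcpu_ids`, `vcore_can_run`) are hypothetical, and the value 4 is the threads-per-core figure the message quotes for POWER7.

```python
from enum import Enum


class VcpuState(Enum):
    RUNNABLE = "runnable"          # ready to execute the guest
    BLOCKED = "blocked"            # idle
    BUSY_IN_HOST = "busy in host"  # executing elsewhere in the kernel or qemu


def vcore_of(vcpu_id, threads_per_core):
    """The vcore number is the vcpu number divided by the threads per core."""
    return vcpu_id // threads_per_core


def single_threaded_vcpu_ids(n_vcpus, threads_per_core):
    """vcpu numbers that put each vcpu alone in its own vcore."""
    return [i * threads_per_core for i in range(n_vcpus)]


def vcore_can_run(states):
    """Policy from the message: run only if every thread is runnable or blocked."""
    return all(s in (VcpuState.RUNNABLE, VcpuState.BLOCKED) for s in states)


THREADS_PER_CORE = 4  # value reported via KVM_CAP_PPC_SMT, per the message
ids = single_threaded_vcpu_ids(3, THREADS_PER_CORE)
print(ids, [vcore_of(i, THREADS_PER_CORE) for i in ids])
```

Choosing vcpu numbers that are multiples of the threads-per-core value gives each vcpu a distinct vcore, which is how the message says qemu can request single-threaded guest operation.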
2011-07-12 | KVM: PPC: Accelerate H_PUT_TCE by implementing it in real mode | David Gibson | 1 | -0/+35
This improves I/O performance for guests using the PAPR paravirtualization interface by making the H_PUT_TCE hcall faster, by implementing it in real mode. H_PUT_TCE is used for updating virtual IOMMU tables, and is used both for virtual I/O and for real I/O in the PAPR interface. Since this moves the IOMMU tables into the kernel, we define a new KVM_CREATE_SPAPR_TCE ioctl to allow qemu to create the tables. The ioctl returns a file descriptor which can be used to mmap the newly created table. The qemu driver models use them in the same way as userspace managed tables, but they can be updated directly by the guest with a real-mode H_PUT_TCE implementation, reducing the number of host/guest context switches during guest IO. There are certain circumstances where it is useful for userland qemu to write to the TCE table even if the kernel H_PUT_TCE path is used most of the time. Specifically, allowing this will avoid awkwardness when we need to reset the table. More importantly, we will in the future need to write the table in order to restore its state after a checkpoint resume or migration. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 | KVM: PPC: Add support for Book3S processors in hypervisor mode | Paul Mackerras | 1 | -0/+17
This adds support for KVM running on 64-bit Book 3S processors, specifically POWER7, in hypervisor mode. Using hypervisor mode means that the guest can use the processor's supervisor mode. That means that the guest can execute privileged instructions and access privileged registers itself without trapping to the host. This gives excellent performance, but does mean that KVM cannot emulate a processor architecture other than the one that the hardware implements. This code assumes that the guest is running paravirtualized using the PAPR (Power Architecture Platform Requirements) interface, which is the interface that IBM's PowerVM hypervisor uses. That means that existing Linux distributions that run on IBM pSeries machines will also run under KVM without modification. In order to communicate the PAPR hypercalls to qemu, this adds a new KVM_EXIT_PAPR_HCALL exit code to include/linux/kvm.h. Currently the choice between book3s_hv support and book3s_pr support (i.e. the existing code, which runs the guest in user mode) has to be made at kernel configuration time, so a given kernel binary can only do one or the other. This new book3s_hv code doesn't support MMIO emulation at present. Since we are running paravirtualized guests, this isn't a serious restriction. With the guest running in supervisor mode, most exceptions go straight to the guest. We will never get data or instruction storage or segment interrupts, alignment interrupts, decrementer interrupts, program interrupts, single-step interrupts, etc., coming to the hypervisor from the guest. Therefore this introduces a new KVMTEST_NONHV macro for the exception entry path so that we don't have to do the KVM test on entry to those exception handlers. We do however get hypervisor decrementer, hypervisor data storage, hypervisor instruction storage, and hypervisor emulation assist interrupts, so we have to handle those. In hypervisor mode, real-mode accesses can access all of RAM, not just a limited amount. 
Therefore we put all the guest state in the vcpu.arch and use the shadow_vcpu in the PACA only for temporary scratch space. We allocate the vcpu with kzalloc rather than vzalloc, and we don't use anything in the kvmppc_vcpu_book3s struct, so we don't allocate it. We don't have a shared page with the guest, but we still need a kvm_vcpu_arch_shared struct to store the values of various registers, so we include one in the vcpu_arch struct. The POWER7 processor has a restriction that all threads in a core have to be in the same partition. MMU-on kernel code counts as a partition (partition 0), so we have to do a partition switch on every entry to and exit from the guest. At present we require the host and guest to run in single-thread mode because of this hardware restriction. This code allocates a hashed page table for the guest and initializes it with HPTEs for the guest's Virtual Real Memory Area (VRMA). We require that the guest memory is allocated using 16MB huge pages, in order to simplify the low-level memory management. This also means that we can get away without tracking paging activity in the host for now, since huge pages can't be paged or swapped. This also adds a few new exports needed by the book3s_hv code. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 | KVM: PPC: e500: enable magic page | Scott Wood | 1 | -3/+5
This is a shared page used for paravirtualization. It is always present in the guest kernel's effective address space at the address indicated by the hypercall that enables it. The physical address specified by the hypercall is not used, as e500 does not have real mode. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 | KVM: MMU: Adjust shadow paging to work when SMEP=1 and CR0.WP=0 | Avi Kivity | 1 | -0/+18
When CR0.WP=0, we sometimes map user pages as kernel pages (to allow the kernel to write to them). Unfortunately this also allows the kernel to fetch from these pages, even if CR4.SMEP is set. Adjust for this by also setting NX on the spte in these circumstances. Signed-off-by: Avi Kivity <avi@redhat.com>
2011-07-12 | KVM: Fix KVM_ASSIGN_SET_MSIX_ENTRY documentation | Jan Kiszka | 1 | -2/+4
The documented behavior did not match the implemented one (which also never changed). Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-07-12 | KVM: Clarify KVM_ASSIGN_PCI_DEVICE documentation | Jan Kiszka | 1 | -6/+1
Neither host_irq nor the guest_msi struct are used anymore today. Tag the former, drop the latter to avoid confusion. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-07-12 | KVM: Fixup documentation section numbering | Jan Kiszka | 1 | -1/+1
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-07-12 | KVM: Document KVM_IOEVENTFD | Sasha Levin | 1 | -0/+30
Document KVM_IOEVENTFD that can be used to receive notifications of PIO/MMIO events without triggering an exit. Signed-off-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-07-12 | KVM: nVMX: Documentation | Nadav Har'El | 1 | -0/+251
This patch includes a brief introduction to the nested vmx feature in the Documentation/kvm directory. The document also includes a copy of the vmcs12 structure, as requested by Avi Kivity. [marcelo: move to Documentation/virtual/kvm] Signed-off-by: Nadav Har'El <nyh@il.ibm.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-07-12 | PM / Runtime: Add new helper function: pm_runtime_status_suspended() | Kevin Hilman | 1 | -0/+3
This boolean function simply returns whether or not the runtime status of the device is 'suspended'. Unlike pm_runtime_suspended(), this function returns the runtime status regardless of whether runtime PM for the device has been disabled. Also add an entry to Documentation/power/runtime.txt. Signed-off-by: Kevin Hilman <khilman@ti.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
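The difference between the two helpers can be illustrated with a small model. This is a sketch under stated assumptions, not the kernel implementation: `DeviceModel` and both functions below are hypothetical stand-ins, and the behaviour modelled for pm_runtime_suspended() (reporting false whenever runtime PM is disabled) is an assumption drawn from the contrast this message makes.

```python
from dataclasses import dataclass


@dataclass
class DeviceModel:
    """Hypothetical stand-in for the runtime PM fields of a device."""
    runtime_status: str        # e.g. "active" or "suspended"
    runtime_pm_disabled: bool  # True if runtime PM has been disabled


def status_suspended(dev):
    """Models pm_runtime_status_suspended(): looks at the status only."""
    return dev.runtime_status == "suspended"


def suspended(dev):
    """Models pm_runtime_suspended() (assumed): also requires runtime PM enabled."""
    return dev.runtime_status == "suspended" and not dev.runtime_pm_disabled


dev = DeviceModel(runtime_status="suspended", runtime_pm_disabled=True)
print(status_suspended(dev), suspended(dev))
```

In this model the two helpers disagree only for a device whose status is 'suspended' while runtime PM is disabled, which is the case the new helper is meant to cover.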
2011-07-12 | KVM: Document KVM_GET_LAPIC, KVM_SET_LAPIC ioctl | Avi Kivity | 1 | -0/+32
Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-07-11 | Documentation/Changes: remove some really obsolete text | Linus Torvalds | 1 | -25/+18
That file harkens back to the days of the big 2.4 -> 2.6 version jump, and was based even then on older versions. Some of it is just obsolete, and Jesper Juhl points out that it talks about kernel versions 2.6 and should be updated to 3.0. Remove some obsolete text, and re-phrase some other to not be 2.6-specific. Reported-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-11 | Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6 | Linus Torvalds | 1 | -0/+22
* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6:
  [media] msp3400: fill in v4l2_tuner based on vt->type field
  [media] tuner-core.c: don't change type field in g_tuner or g_frequency
  [media] cx18/ivtv: fix g_tuner support
  [media] tuner-core: power up tuner when called with s_power(1)
  [media] v4l2-ioctl.c: check for valid tuner type in S_HW_FREQ_SEEK
  [media] tuner-core: simplify the standard fixup
  [media] tuner-core/v4l2-subdev: document that the type field has to be filled in
  [media] v4l2-subdev.h: remove unused s_mode tuner op
  [media] feature-removal-schedule: change in how radio device nodes are handled
  [media] bttv: fix s_tuner for radio
  [media] pvrusb2: fix g/s_tuner support
  [media] v4l2-ioctl.c: prefill tuner type for g_frequency and g/s_tuner
  [media] tuner-core: fix tuner_resume: use t->mode instead of t->type
  [media] tuner-core: fix s_std and s_tuner
2011-07-11 | Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mjg59/platform-drivers-x86 | Linus Torvalds | 1 | -0/+5
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mjg59/platform-drivers-x86:
  hp-wmi: fix use after free
  dell-laptop - using buffer without mutex_lock
  Revert: "dell-laptop: Toggle the unsupported hardware killswitch"
  platform-drivers-x86: set backlight type to BACKLIGHT_PLATFORM
  thinkpad-acpi: handle HKEY 0x4010, 0x4011 events
  drivers/platform/x86: Fix memory leak
  thinkpad-acpi: handle some new HKEY 0x60xx events
  acer-wmi: fix bitwise bug when set device state
  acer-wmi: Only update rfkill status for associated hotkey events
2011-07-11 | Documentation/spinlocks.txt: Remove reference to sti()/cli() | Muthu Kumar | 1 | -38/+7
Since we removed sti()/cli() and related, how about removing it from Documentation/spinlocks.txt? Signed-off-by: Muthukumar R <muthur@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-11 | mac80211: fix docbook | Johannes Berg | 1 | -2/+3
I changed the TKIP key functions, but forgot to update the documentation includes. Fix that. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-07-11 | HID: wiimote: Add sysfs support to wiimote driver | David Herrmann | 1 | -0/+10
Add sysfs files for each led of the wiimote. Writing 1 to the file enables the led and 0 disables the led. We do not need memory barriers when checking wdata->ready since we use a spinlock directly after it. Signed-off-by: David Herrmann <dh.herrmann@googlemail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
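From userspace, toggling an LED through such a file comes down to writing "1" or "0" to it. The sketch below only illustrates that convention: `set_led` is a hypothetical helper, the attribute path is not given in this message and must be supplied by the caller, and the demo writes to a temporary file rather than to sysfs.

```python
import os
import tempfile


def set_led(attr_path, on):
    """Write "1" to enable or "0" to disable the LED behind `attr_path`."""
    with open(attr_path, "w") as f:
        f.write("1" if on else "0")


def demo():
    """Exercise set_led() against a temporary file standing in for the sysfs file."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "led1")  # placeholder name, not the real attribute
        set_led(path, True)
        with open(path) as f:
            enabled = f.read()
        set_led(path, False)
        with open(path) as f:
            disabled = f.read()
    return enabled, disabled


print(demo())
```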
2011-07-11 | Merge branch 'master' into for-next | Jiri Kosina | 25 | -201/+397
Sync with Linus' tree to be able to apply pending patches that are based on newer code already present upstream.
2011-07-10 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 | Linus Torvalds | 1 | -0/+2
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  PCI: conditional resource-reallocation through kernel parameter pci=realloc
2011-07-10 | tty/serial: Add devicetree support for nVidia Tegra serial ports | Grant Likely | 1 | -1/+1
Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-07-08 | PCI: conditional resource-reallocation through kernel parameter pci=realloc | Ram Pai | 1 | -0/+2
Multiple attempts to dynamically reallocate pci resources have unfortunately led to regressions. Though we continue to fix the regressions and fine tune the dynamic-reallocation behavior, we have not reached an acceptable state yet. This patch provides an interim solution. It disables dynamic reallocation by default, but adds the ability to enable it through the pci=realloc kernel command line parameter. Tested-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-07-08 | USB: EHCI: Allow users to override 80% max periodic bandwidth | Kirill Smelkov | 2 | -0/+25
There are cases when 80% max isochronous bandwidth is too limiting. For example I have two USB video capture cards which stream uncompressed video, and to stream full NTSC + PAL videos we'd need

    NTSC 640x480 YUV422 @30fps    ~17.6 MB/s
    PAL  720x576 YUV422 @25fps    ~19.7 MB/s

isoc bandwidth.

Now, due to limited alt settings in capture devices, the NTSC one ends up streaming with max_pkt_size=2688 and the PAL one with max_pkt_size=2892, both with interval=1. In terms of microframe time allocation this gives

    NTSC    ~53us
    PAL     ~57us

and together ~110us > 100us == 80% of 125us uframe time.

So those two devices can't work together simultaneously because they'd over-allocate isochronous bandwidth.

80% seemed a bit arbitrary to me, and I've tried to raise it to 90% and both devices started to work together, so I thought sometimes it would be a good idea for users to override the hardcoded default of max 80% isoc bandwidth.

After all, isn't it a user who should decide how to load the bus? If I can live with 10% or even 5% bulk bandwidth that should be ok. I'm a USB newcomer, but that 80% set in stone by the USB 2.0 specification seems to be chosen pretty arbitrarily to me, just to serve as a reasonable default.

NOTE 1
~~~~~~

For two streams with max_pkt_size=3072 (worst case) both time allocations would be 60us+60us=120us, which is 96% periodic bandwidth, leaving 4% for bulk and control. Alan Stern suggested that bulk then would be problematic (less than 300*8 bittimes left per microframe), but I think that is still enough for control traffic.

NOTE 2
~~~~~~

Sarah Sharp expressed concern that maxing out periodic bandwidth could lead to vendor-specific hardware bugs on host controllers, because

> It's entirely possible that you'll run into
> vendor-specific bugs if you try to pack the schedule with isochronous
> transfers. I don't think any hardware designer would seriously test or
> validate their hardware with a schedule that is basically a violation of
> the USB bus spec (more than 80% for periodic transfers).

So far I've only tested this patch on my HP Mini 5103 with N10 chipset

    kirr@mini:~$ lspci
    00:00.0 Host bridge: Intel Corporation N10 Family DMI Bridge
    00:02.0 VGA compatible controller: Intel Corporation N10 Family Integrated Graphics Controller
    00:02.1 Display controller: Intel Corporation N10 Family Integrated Graphics Controller
    00:1b.0 Audio device: Intel Corporation N10/ICH 7 Family High Definition Audio Controller (rev 02)
    00:1c.0 PCI bridge: Intel Corporation N10/ICH 7 Family PCI Express Port 1 (rev 02)
    00:1c.3 PCI bridge: Intel Corporation N10/ICH 7 Family PCI Express Port 4 (rev 02)
    00:1d.0 USB Controller: Intel Corporation N10/ICH 7 Family USB UHCI Controller #1 (rev 02)
    00:1d.1 USB Controller: Intel Corporation N10/ICH 7 Family USB UHCI Controller #2 (rev 02)
    00:1d.2 USB Controller: Intel Corporation N10/ICH 7 Family USB UHCI Controller #3 (rev 02)
    00:1d.3 USB Controller: Intel Corporation N10/ICH 7 Family USB UHCI Controller #4 (rev 02)
    00:1d.7 USB Controller: Intel Corporation N10/ICH 7 Family USB2 EHCI Controller (rev 02)
    00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev e2)
    00:1f.0 ISA bridge: Intel Corporation NM10 Family LPC Controller (rev 02)
    00:1f.2 SATA controller: Intel Corporation N10/ICH7 Family SATA AHCI Controller (rev 02)
    01:00.0 Network controller: Broadcom Corporation BCM4313 802.11b/g/n Wireless LAN Controller (rev 01)
    02:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8059 PCI-E Gigabit Ethernet Controller (rev 11)

and the system works stably with 110us/uframe (~88%) isoc bandwidth allocated for the above-mentioned isochronous transfers.

NOTE 3
~~~~~~

This feature is off by default. I mean max periodic bandwidth is set to 100us/uframe by default exactly as it was before the patch. So only those of us who need the extreme settings are taking the risk - normal users who do not alter the uframe_periodic_max sysfs attribute should not see any change at all.

NOTE 4
~~~~~~

I've tried to update documentation in Documentation/ABI/ thoroughly, but only "TBD" was put into Documentation/usb/ehci.txt -- the text there seems to be outdated and much in need of refreshing before it could be amended.

Cc: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
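The bandwidth arithmetic in this message can be reproduced with a few lines. This is only a worked example using the figures quoted above (125us per microframe, per-stream allocations of 53us, 57us and 60us); the function names are hypothetical and nothing here reflects how the EHCI driver itself does its accounting.

```python
UFRAME_US = 125  # length of one USB 2.0 microframe in microseconds


def periodic_percent(alloc_us):
    """Share of a microframe taken by periodic transfers, in percent."""
    return 100 * alloc_us / UFRAME_US


def fits(stream_allocs_us, uframe_periodic_max_us=100):
    """True if the streams fit under the periodic limit (default 100us == 80%)."""
    return sum(stream_allocs_us) <= uframe_periodic_max_us


ntsc_pal = [53, 57]    # ~us per microframe for the NTSC and PAL streams
worst_case = [60, 60]  # two streams with max_pkt_size=3072

print(periodic_percent(100))              # default limit
print(periodic_percent(sum(ntsc_pal)))    # NTSC + PAL together
print(periodic_percent(sum(worst_case)))  # worst case from NOTE 1
print(fits(ntsc_pal), fits(ntsc_pal, uframe_periodic_max_us=110))
```

With the default limit the NTSC and PAL streams do not fit together (110us > 100us), while raising the limit to 110us admits both at roughly 88% of the microframe.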
2011-07-08 | gpio/mxc: add device tree probe support | Shawn Guo | 1 | -0/+22
The patch adds device tree probe support for gpio-mxc driver. Signed-off-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-07-08 | net: Fix default in docs for tcp_orphan_retries. | David S. Miller | 1 | -1/+1
Default should be listed as 8 instead of 7. Reported-by: Denys Fedoryshchenko <denys@visp.net.lb> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-08 | Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem | John W. Linville | 1 | -0/+128
2011-07-08 | drivers/virt: introduce Freescale hypervisor management driver | Timur Tabi | 1 | -0/+1
Add the drivers/virt directory, which houses drivers that support virtualization environments, and add the Freescale hypervisor management driver. The Freescale hypervisor management driver provides several services to drivers and applications related to the Freescale hypervisor:
1. An ioctl interface for querying and managing partitions
2. A file interface for reading incoming doorbells
3. An interrupt handler for shutting down the partition upon receiving the shutdown doorbell from a manager partition
4. A kernel interface for receiving callbacks when a managed partition shuts down.
Signed-off-by: Timur Tabi <timur@freescale.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-07-07 | FS-Cache: Add a helper to bulk uncache pages on an inode | David Howells | 1 | -0/+16
Add an FS-Cache helper to bulk uncache pages on an inode. This will only work for the circumstance where the pages in the cache correspond 1:1 with the pages attached to an inode's page cache. This is required for CIFS and NFS: When disabling inode cookie, we were returning the cookie and setting cifsi->fscache to NULL but failed to invalidate any previously mapped pages. This resulted in "Bad page state" errors and manifested in other kind of errors when running fsstress. Fix it by uncaching mapped pages when we disable the inode cookie. This patch should fix the following oops and "Bad page state" errors seen during fsstress testing. ------------[ cut here ]------------ kernel BUG at fs/cachefiles/namei.c:201! invalid opcode: 0000 [#1] SMP Pid: 5, comm: kworker/u:0 Not tainted 2.6.38.7-30.fc15.x86_64 #1 Bochs Bochs RIP: 0010: cachefiles_walk_to_object+0x436/0x745 [cachefiles] RSP: 0018:ffff88002ce6dd00 EFLAGS: 00010282 RAX: ffff88002ef165f0 RBX: ffff88001811f500 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000100 RDI: 0000000000000282 RBP: ffff88002ce6dda0 R08: 0000000000000100 R09: ffffffff81b3a300 R10: 0000ffff00066c0a R11: 0000000000000003 R12: ffff88002ae54840 R13: ffff88002ae54840 R14: ffff880029c29c00 R15: ffff88001811f4b0 FS: 00007f394dd32720(0000) GS:ffff88002ef00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00007fffcb62ddf8 CR3: 000000001825f000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process kworker/u:0 (pid: 5, threadinfo ffff88002ce6c000, task ffff88002ce55cc0) Stack: 0000000000000246 ffff88002ce55cc0 ffff88002ce6dd58 ffff88001815dc00 ffff8800185246c0 ffff88001811f618 ffff880029c29d18 ffff88001811f380 ffff88002ce6dd50 ffffffff814757e4 ffff88002ce6dda0 ffffffff8106ac56 Call Trace: cachefiles_lookup_object+0x78/0xd4 [cachefiles] fscache_lookup_object+0x131/0x16d [fscache] 
fscache_object_work_func+0x1bc/0x669 [fscache] process_one_work+0x186/0x298 worker_thread+0xda/0x15d kthread+0x84/0x8c kernel_thread_helper+0x4/0x10 RIP cachefiles_walk_to_object+0x436/0x745 [cachefiles] ---[ end trace 1d481c9af1804caa ]--- I tested the uncaching by the following means: (1) Create a big file on my NFS server (104857600 bytes). (2) Read the file into the cache with md5sum on the NFS client. Look in /proc/fs/fscache/stats: Pages : mrk=25601 unc=0 (3) Open the file for read/write ("bash 5<>/warthog/bigfile"). Look in proc again: Pages : mrk=25601 unc=25601 Reported-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-and-Tested-by: Suresh Jayaraman <sjayaraman@suse.de> cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-07 | dt: bindings: move SEC node under new crypto/ | Kim Phillips | 1 | -1/+1
Since technically it's not powerpc arch-specific. Also rename it sec2 to differentiate it from its incompatible successor, the SEC 4. Signed-off-by: Kim Phillips <kim.phillips@freescale.com> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-07-07 | [media] feature-removal-schedule: change in how radio device nodes are handled | Hans Verkuil | 1 | -0/+22
Radio devices have weird side-effects when used with combined TV/radio tuners and the V4L2 spec is ambiguous on how it should work. This results in inconsistent driver behavior which makes life hard for everyone. Be more strict in when and how the switch between radio and tv mode takes place and make sure all drivers behave the same. Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2011-07-07 | thinkpad-acpi: handle HKEY 0x4010, 0x4011 events | Henrique de Moraes Holschuh | 1 | -0/+2
Handle events 0x4010 and 0x4011 so that we do not pester users about them. These events report when the thinkpad is docked/undocked to a native hotplug dock (i.e. one that does not need ACPI handling, nor is represented in the ACPI device tree). Such docks are based on USB 2.0/3.0, and also work as port replicators. We really want a proper dock class to report these, or at least new input EV_SW events. Since it is not clear which one to use yet, keep reporting them as vendor-specific ThinkPad events. WARNING: As defined by the thinkpad-acpi sysfs ABI rules of engagement, the vendor-specific events will be REMOVED as soon as generic events are made available (duplicate events are a big problem), with an appropriate update to the thinkpad-acpi sysfs/event ABI versioning. Userspace is already prepared to provide easy backwards compatibility for such changes when convenient to the distro (see acpi-fakekey). * Event 0x4010: docking to hotplug dock/port replicator * Event 0x4011: undocking from hotplug dock/port replicator Typical usecase would be to trigger display reconfiguration. Reports mention T410, T510, and series 3 docks/port replicators. Special thanks to Robert de Rooy for his extensive report and analysis of the situation. http://www.thinkwiki.org/wiki/ThinkPad_Port_Replicator_Series_3 http://www.thinkwiki.org/wiki/ThinkPad_Mini_Dock_Series_3 http://www.thinkwiki.org/wiki/ThinkPad_Mini_Dock_Plus_Series_3 http://www.thinkwiki.org/wiki/ThinkPad_Mini_Dock_Plus_Series_3_for_Mobile_Workstations http://lenovoblogs.com/insidethebox/?p=290 Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Cc: Matthew Garrett <mjg59@srcf.ucam.org> Reported-by: Claudius Hubig <claudiushubig@chubig.net> Reported-by: Doctor Bill <docbill@gmail.com> Reported-by: Korte Noack <gbk.noack@gmx.de> Reported-by: Robert de Rooy <robert.de.rooy@gmail.com> Reported-by: Sebastian Will <swill@csail.mit.edu> Signed-off-by: Matthew Garrett <mjg@redhat.com>
2011-07-07 | thinkpad-acpi: handle some new HKEY 0x60xx events | Henrique de Moraes Holschuh | 1 | -0/+3
Handle some user interface events from the newer Lenovo models. We are likely to do something smart with these events in the future; for now, hide the ones we are already certain about from both the user and userspace.
* Events 0x6000 and 0x6005 are key-related. 0x6005 is not properly identified yet. Ignore these events, and do not report them.
* Event 0x6040 has not been properly identified yet, and we don't know if it is important (looks like it isn't, but still...). Keep reporting it.
* Change the message the driver outputs on unknown 0x6xxx events, as all recent events are not related to thermal alarms. Degrade log level from ALERT to WARNING.
Thanks to all users who reported these events or asked about them in a number of mailing lists. Your help is highly appreciated, even if I did take a lot of time to act on them. For that I apologise. I will list those that identified the reasons for the events as "reported-by", and I apologise in advance if I leave anyone out: it was not done on purpose, I made the mistake of not properly tagging all event report emails separately, and might have missed some. Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Reported-by: Markus Malkusch <markus@malkusch.de> Reported-by: Peter Giles <g1l3sp@gmail.com> Signed-off-by: Matthew Garrett <mjg@redhat.com>
2011-07-07Update my e-mail addressMichael Büsch1-1/+1
Signed-off-by: Michael Buesch <m@bues.ch> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-07-07net: doc: fix compile warning of no format arguments in ifenslave.cShan Wei1-9/+9
Fix the following warnings in ifenslave.c with gcc version 4.5.2 (Ubuntu/Linaro 4.5.2-8ubuntu4):

Documentation/networking/ifenslave.c:263:4: warning: format not a string literal and no format arguments
Documentation/networking/ifenslave.c:271:3: warning: format not a string literal and no format arguments
Documentation/networking/ifenslave.c:277:3: warning: format not a string literal and no format arguments
Documentation/networking/ifenslave.c:285:3: warning: format not a string literal and no format arguments
Documentation/networking/ifenslave.c:291:3: warning: format not a string literal and no format arguments
Documentation/networking/ifenslave.c:292:3: warning: format not a string literal and no format arguments
Documentation/networking/ifenslave.c:312:4: warning: format not a string literal and no format arguments
Documentation/networking/ifenslave.c:323:3: warning: format not a string literal and no format arguments
Documentation/networking/ifenslave.c:342:4: warning: format not a string literal and no format arguments

Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
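For readers unfamiliar with this warning: gcc emits it when a non-literal string is passed as the format argument of a printf-style function with no further arguments, because any '%' in that string would be interpreted as a conversion. The sketch below shows two common ways to avoid it. It is not the actual ifenslave.c change; the helper names are hypothetical.

```c
#include <stdio.h>

/* The pattern that triggers the warning looks like this (left as a comment
 * so that the sketch compiles cleanly):
 *
 *     fprintf(out, msg);    // msg is a variable, not a string literal
 */

/* Option 1: no formatting is needed, so write the string as-is. */
int print_message(FILE *out, const char *msg)
{
    return fputs(msg, out);
}

/* Option 2: keep the printf family, but pass a literal "%s" format so
 * that msg is treated as data and never as a format string. */
int format_message(char *buf, size_t len, const char *msg)
{
    return snprintf(buf, len, "%s", msg);
}
```

With either option a message that happens to contain a '%' character is printed verbatim.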
2011-07-06x86, olpc: Add XO-1 RTC driverDaniel Drake1-0/+5
Add a driver to configure the XO-1 RTC via CS5536 MSRs, to be used as a system wakeup source via olpc-xo1-pm. Device detection is based on finding the relevant device tree node. Signed-off-by: Daniel Drake <dsd@laptop.org> Link: http://lkml.kernel.org/r/1309019658-1712-11-git-send-email-dsd@laptop.org Acked-by: Andres Salomon <dilinger@queued.net> Acked-by: Grant Likely <grant.likely@secretlab.ca> Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: devicetree-discuss@lists.ozlabs.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-07-06Documentation: fix cgroup blkio throttle filenamesAndrea Righi1-6/+6
All the blkio.throttle.* file names are incorrectly reported without ".throttle" in the documentation. Fix it. Signed-off-by: Andrea Righi <andrea@betterlinux.com> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-06Documentation: update CodingStyle memory allocatorsJesper Juhl1-2/+2
The list of available general purpose memory allocators in Documentation/CodingStyle chapter 14 is incomplete. This patch adds the missing vzalloc() to the list. Signed-off-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-06PM / Runtime: Replace "run-time" with "runtime" in documentationRafael J. Wysocki1-65/+65
The runtime PM documentation and kerneldoc comments sometimes spell "runtime" with a dash (i.e. "run-time"). Replace all of those instances with "runtime" to make the naming consistent. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-07-06PM / Runtime: Improve documentation of enable, disable and barrierRafael J. Wysocki1-4/+14
The runtime PM documentation in Documentation/power/runtime_pm.txt doesn't say that pm_runtime_enable() and pm_runtime_disable() work by operating on power.disable_depth. That omission is a problem, because the possibility of nesting disables doesn't follow from the description of these functions. Also, there is no description of pm_runtime_barrier() at all in the document, which is confusing. Improve the documentation by fixing those issues. Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
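The nesting behaviour that follows from a depth counter can be illustrated with a small user-space model. This is a sketch only, not kernel code: the struct and function names are hypothetical, and the initial depth of 1 and the rejection of an unbalanced enable are assumptions of the model rather than statements taken from the commit message.

```c
#include <stdbool.h>

/* User-space model of a per-device disable counter. */
struct rpm_model {
    unsigned int disable_depth; /* 0 means runtime PM is enabled */
};

/* Assumption: a device starts out with runtime PM disabled once. */
void rpm_model_init(struct rpm_model *m)
{
    m->disable_depth = 1;
}

/* Each disable nests: it increments the counter. */
void rpm_model_disable(struct rpm_model *m)
{
    m->disable_depth++;
}

/* Each enable undoes exactly one disable. An unbalanced enable is
 * rejected in this model (returns false). */
bool rpm_model_enable(struct rpm_model *m)
{
    if (m->disable_depth == 0)
        return false;
    m->disable_depth--;
    return true;
}

/* Runtime PM is usable only when every disable has been balanced. */
bool rpm_model_enabled(const struct rpm_model *m)
{
    return m->disable_depth == 0;
}
```

In this model two disables in a row require two enables before runtime PM is usable again, which is the nesting property the documentation update is meant to make explicit.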
2011-07-06PM: Limit race conditions between runtime PM and system sleep (v2)Rafael J. Wysocki1-0/+21
One of the roles of the PM core is to prevent different PM callbacks executed for the same device object from racing with each other. Unfortunately, after commit e8665002477f0278f84f898145b1f141ba26ee26 (PM: Allow pm_runtime_suspend() to succeed during system suspend) runtime PM callbacks may be executed concurrently with system suspend/resume callbacks for the same device.

The main reason for commit e8665002477f0278f84f898145b1f141ba26ee26 was that some subsystems and device drivers wanted to use runtime PM helpers, pm_runtime_suspend() and pm_runtime_put_sync() in particular, for carrying out the suspend of devices in their .suspend() callbacks. However, as it's been determined recently, there are multiple reasons not to do so, including:

* The caller really doesn't control the runtime PM usage counters, because user space can access them through sysfs and effectively block runtime PM. That means using pm_runtime_suspend() or pm_runtime_put_sync() to suspend devices during system suspend may or may not work.

* If a driver calls pm_runtime_suspend() from its .suspend() callback, it causes the subsystem's .runtime_suspend() callback to be executed, which leads to the call sequence:

    subsys->suspend(dev)
       driver->suspend(dev)
          pm_runtime_suspend(dev)
             subsys->runtime_suspend(dev)

  recursive from the subsystem's point of view. For some subsystems that may actually work (e.g. the platform bus type), but for some it will fail in a rather spectacular fashion (e.g. PCI). In each case it means a layering violation.

* Both the subsystem and the driver can provide .suspend_noirq() callbacks for system suspend that can do whatever the .runtime_suspend() callbacks do just fine, so it really isn't necessary to call pm_runtime_suspend() during system suspend.

* The runtime PM's handling of wakeup devices is usually different from the system suspend's one, so .runtime_suspend() may simply be inappropriate for system suspend.

* System suspend is supposed to work even if CONFIG_PM_RUNTIME is unset.

* The runtime PM workqueue is frozen before system suspend, so if whatever the driver is going to do during system suspend depends on it, that simply won't work.

Still, there is a good reason to allow pm_runtime_resume() to succeed during system suspend and resume (for instance, some subsystems and device drivers may legitimately use it to ensure that their devices are in full-power states before suspending them). Moreover, there is no reason to prevent runtime PM callbacks from being executed in parallel with the system suspend/resume .prepare() and .complete() callbacks and the code removed by commit e8665002477f0278f84f898145b1f141ba26ee26 went too far in this respect.

On the other hand, runtime PM callbacks, including .runtime_resume(), must not be executed during system suspend's "late" stage of suspending devices and during system resume's "early" device resume stage.

Taking all of the above into consideration, make the PM core acquire a runtime PM reference to every device and resume it if there's a runtime PM resume request pending right before executing the subsystem-level .suspend() callback for it. Make the PM core drop references to all devices right after executing the subsystem-level .resume() callbacks for them.

Additionally, make the PM core disable the runtime PM framework for all devices during system suspend, after executing the subsystem-level .suspend() callbacks for them, and enable the runtime PM framework for all devices during system resume, right before executing the subsystem-level .resume() callbacks for them.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Kevin Hilman <khilman@ti.com>
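The ordering described in the last two paragraphs can be sketched as a user-space model. This is an illustration only, not the PM core's code: the struct, its fields and all function names are hypothetical, and the two counters are a simplification of the real per-device state.

```c
#include <stdbool.h>

/* Simplified per-device state for this model. */
struct pm_dev_model {
    unsigned int usage_count;   /* runtime PM references held */
    unsigned int disable_depth; /* nonzero: runtime PM framework disabled */
    bool runtime_suspended;
    bool resume_pending;        /* a runtime resume request is queued */
};

/* Runtime resume works only while the framework is enabled. */
bool model_runtime_resume(struct pm_dev_model *d)
{
    if (d->disable_depth)
        return false;
    d->runtime_suspended = false;
    d->resume_pending = false;
    return true;
}

/* Runtime suspend additionally requires that nobody holds a reference. */
bool model_runtime_suspend(struct pm_dev_model *d)
{
    if (d->disable_depth || d->usage_count)
        return false;
    d->runtime_suspended = true;
    return true;
}

/* System suspend: reference first, .suspend() next, disable last. */
void model_system_suspend(struct pm_dev_model *d)
{
    d->usage_count++;               /* acquire a runtime PM reference */
    if (d->resume_pending)
        model_runtime_resume(d);    /* carry out the pending resume request */
    /* the subsystem-level .suspend() callback would run here; it may still
     * resume the device, but it cannot runtime-suspend it */
    d->disable_depth++;             /* no runtime PM during the "late" stage */
}

/* System resume: enable first, .resume() next, drop the reference last. */
void model_system_resume(struct pm_dev_model *d)
{
    d->disable_depth--;             /* re-enable right before .resume() */
    /* the subsystem-level .resume() callback would run here */
    d->usage_count--;               /* drop the reference taken at suspend */
}
```

In this model, after model_system_suspend() both model_runtime_suspend() and model_runtime_resume() fail, which corresponds to the requirement that no runtime PM callbacks run during the "late" suspend and "early" resume stages.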
2011-07-05Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6David S. Miller3-59/+22
2011-07-05spi/tegra: Use engineering names in DT compatible propertyStephen Warren1-1/+1
Engineering names are more stable than marketing names. Hence, use them for Device Tree compatible properties instead. Signed-off-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-07-05gpio/tegra: Use engineering names in DT compatible propertyStephen Warren1-1/+1
Engineering names are more stable than marketing names. Hence, use them for Device Tree compatible properties instead. Signed-off-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-07-05NFC: add Documentation/networking/nfc.txtAloisio Almeida Jr1-0/+128
Signed-off-by: Aloisio Almeida Jr <aloisio.almeida@openbossa.org> Signed-off-by: Lauro Ramos Venancio <lauro.venancio@openbossa.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-07-05Merge Linux 3.0-rc6 into staging-nextGreg Kroah-Hartman22-194/+372
This handles the merge conflicts with the drivers/staging/brcm80211/Kconfig file due to changes on the two different branches. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-07-04Update documented default values for various TCP/UDP tunablesMax Matveev1-4/+4
tcp_rmem and tcp_wmem use 1 page as the default value for the minimum amount of memory to be used, same as udp_wmem_min and udp_rmem_min. Page size differs between architectures, so use the right units when describing the defaults. Reviewed-by: Shan Wei <shanwei@cn.fujitsu.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Max Matveev <makc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
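Since the default is expressed in pages, the corresponding byte count depends on the machine. A minimal sketch of the conversion on a POSIX system follows; the helper name pages_to_bytes() is hypothetical, while sysconf(_SC_PAGESIZE) is the standard way to query the page size from user space.

```c
#include <unistd.h>

/* Convert a number of pages to bytes using the running system's page
 * size. Returns -1 if the page size cannot be determined. */
long pages_to_bytes(long pages)
{
    long page_size = sysconf(_SC_PAGESIZE);

    if (page_size <= 0)
        return -1;
    return pages * page_size;
}
```

Calling pages_to_bytes(1) gives the byte value of a one-page default on the current machine, which is why a fixed byte figure in the documentation would be correct on some architectures only.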
2011-07-04Update description of net.sctp.sctp_rmem and net.sctp.sctp_wmem tunablesMax Matveev1-2/+9
sctp does not use second and third ("default" and "max") values of sctp_rmem tunable. The format is the same as tcp_rmem but the meaning is different so make the documentation explicit to avoid confusion. sctp_wmem is not used at all. Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Max Matveev <makc@redhat.com> Reviewed-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>