summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2008-02-26x86: fix spontaneous reboot with allyesconfig bzImageIngo Molnar3-12/+23
recently the 64-bit allyesconfig bzImage kernel started spontaneously rebooting during early bootup. after a few fun hours spent with early init debugging, it turns out that we've got this rather annoying limit on the size of the kernel image: #define KERNEL_TEXT_SIZE (40*1024*1024) which limit my vmlinux just happened to pass: text data bss dec hex filename 29703744 4222751 8646224 42572719 2899baf vmlinux 40 MB is 42572719 bytes, so my vmlinux was just 1.5% above this limit :-/ So it happily crashed right in head_64.S, which - as we all know - is the most debuggable code in the whole architecture ;-) So increase the limit to allow an up to 128MB kernel image to be mapped. (should anyone be that crazy or lazy) We have a full 4K of pagetable (level2_kernel_pgt) allocated for these mappings already, so there's no RAM overhead and the limit was rather pointless and arbitrary. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-26x86: remove double-checking empty zero pages debugYinghai Lu1-8/+0
so far no one complained about that. Signed-off-by: Yinghai Lu <yinghai.lu@sun.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-26x86: notsc is ignored on common configurationsPavel Machek1-1/+2
notsc is ignored in 32-bit kernels if CONFIG_X86_TSC is on.. which is bad, fix it. Signed-off-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-26x86/mtrr: fix kernel-doc missing notationRandy Dunlap1-0/+1
Fix mtrr kernel-doc warning: Warning(linux-2.6.24-git12//arch/x86/kernel/cpu/mtrr/main.c:677): No description found for parameter 'end_pfn' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-26x86: handle BIOSes which terminate e820 with CF=1 and no SMAPH. Peter Anvin1-3/+6
The proper way to terminate the e820 chain is with %ebx == 0 on the last legitimate memory block. However, several BIOSes don't do that and instead return error (CF = 1) when trying to read off the end of the list. For this error return, %eax doesn't necessarily return the SMAP signature -- correctly so, since %ah should contain an error code in this case. To deal with some particularly broken BIOSes, we clear the entire e820 chain if the SMAP signature is missing in the middle, indicating a plain insane e820 implementation. However, we need to make the test for CF = 1 before the SMAP check. This fixes at least one HP laptop (nc6400) for which none of the memory-probing methods (e820, e801, 88) functioned fully according to spec. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-02-26x86: add comments for NOPsH. Peter Anvin1-17/+45
Add comments describing the various NOP sequences. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-02-26x86: don't use P6_NOPs if compiling with CONFIG_X86_GENERICH. Peter Anvin1-1/+10
P6_NOPs are definitely not supported on some VIA CPUs, and possibly (unverified) on AMD K7s. It is also the only thing that prevents a 686 kernel from running on Transmeta TM3x00/5x00 (Crusoe) series. The performance benefit over generic NOPs is very small, so when building for generic consumption, avoid using them. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-02-26x86: require family >= 6 if we are using P6 NOPsH. Peter Anvin2-3/+6
The P6 family of NOPs are only available on family >= 6 or above, so enforce that in the boot code. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-02-26x86: do not promote TM3x00/TM5x00 to i686-classH. Peter Anvin1-7/+0
We have been promoting Transmeta TM3x00/TM5x00 chips to i686-class based on the notion that they contain all the user-space visible features of an i686-class chip. However, this is not actually true: they lack the EA-taking long NOPs (0F 1F /0). Since this is a userspace-visible incompatibility, downgrade these CPUs to the manufacturer-defined i586 level. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-02-26x86: hpet fix docbook commentPavel Machek1-2/+2
Signed-off-by: Pavel Machek <Pavel@suse.cz> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-02-26x86: make DEBUG_PAGEALLOC and CPA more robustIngo Molnar1-33/+51
Use PF_MEMALLOC to prevent recursive calls in the DBEUG_PAGEALLOC case. This makes the code simpler and more robust against allocation failures. This fixes the following fallback to non-mmconfig: http://lkml.org/lkml/2008/2/20/551 http://bugzilla.kernel.org/show_bug.cgi?id=10083 Also, for DEBUG_PAGEALLOC=n reduce the pool size to one page. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-02-26x86/lguest: fix pgdir pmd index calculationAhmed S. Darwish1-1/+1
Hi all, Beginning from commits close to v2.6.25-rc2, running lguest always oopses the host kernel. Oops is at [1]. Bisection led to the following commit: commit 37cc8d7f963ba2deec29c9b68716944516a3244f x86/early_ioremap: don't assume we're using swapper_pg_dir At the early stages of boot, before the kernel pagetable has been fully initialized, a Xen kernel will still be running off the Xen-provided pagetables rather than swapper_pg_dir[]. Therefore, readback cr3 to determine the base of the pagetable rather than assuming swapper_pg_dir[]. static inline pmd_t * __init early_ioremap_pmd(unsigned long addr) { - pgd_t *pgd = &swapper_pg_dir[pgd_index(addr)]; + /* Don't assume we're using swapper_pg_dir at this point */ + pgd_t *base = __va(read_cr3()); + pgd_t *pgd = &base[pgd_index(addr)]; pud_t *pud = pud_offset(pgd, addr); pmd_t *pmd = pmd_offset(pud, addr); Trying to analyze the problem, it seems on the guest side of lguest, %cr3 has a different value from &swapper_pg-dir (which is AFAIK fine on a pravirt guest): Putting some debugging messages in early_ioremap_pmd: /* Appears 3 times */ [ 0.000000] *************************** [ 0.000000] __va(%cr3) = c0000000, &swapper_pg_dir = c02cc000 [ 0.000000] *************************** After 8 hours of debugging and staring on lguest code, I noticed something strange in paravirt_ops->set_pmd hypercall invocation: static void lguest_set_pmd(pmd_t *pmdp, pmd_t pmdval) { *pmdp = pmdval; lazy_hcall(LHCALL_SET_PMD, __pa(pmdp)&PAGE_MASK, (__pa(pmdp)&(PAGE_SIZE-1))/4, 0); } The first hcall parameter is global pgdir which looks fine. The second parameter is the pmd index in the pgdir which is suspectful. AFAIK, calculating the index of pmd does not need a divisoin over four. Removing the division made lguest work fine again . Patch is at [2]. I am not sure why the division over four existed in the first place. It seems bogus, maybe the Xen patch just made the problem appear ? [2]: The patch: [PATCH] lguest: fix pgdir pmd index cacluation Remove an error in index calculation which leads to removing a not existing shadow page table (leading to a Null dereference). Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-02-26lguest: fix build breakageTony Breeds1-3/+1
[ mingo@elte.hu: merged to Rusty's patch ] Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-02-26lguest: include function prototypesHarvey Harrison2-9/+12
Added a declaration to asm-x86/lguest.h and moved the extern arrays there as well. As an alternative to including asm/lguest.h directly, an include could be put in linux/lguest.h Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Cc: "rusty@rustcorp.com.au" <rusty@rustcorp.com.au> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-02-24Linux 2.6.25-rc3v2.6.25-rc3Linus Torvalds1-1/+1
2008-02-24i2c-i801: Add support for the ICH10Gaston, Jason D3-3/+12
Add the Intel ICH10 SMBus Controller DeviceID's and updates Tolapai support. Signed-off-by: Jason Gaston <jason.d.gaston@intel.com> Signed-off-by: Jean Delvare <khali@linux-fr.org>
2008-02-24i2c: Make i2c_register_board_info() a NOP when CONFIG_I2C_BOARDINFO=nDavid Brownell1-1/+8
Don't require platform code to be #ifdeffed according to whether I2C is enabled or not ... if it's not enabled, let GCC compile out all I2C device declarations. (Issue noted on an NSLU2 build that didn't configure I2C.) Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Jean Delvare <khali@linux-fr.org>
2008-02-24i2c-pca-isa: Add access check to legacy ioportsChristian Krafft1-0/+7
When probing i2c-pca-isa writes to legacy ioports, which crashes the kernel if there is no device at that port. This patch adds a check_legacy_ioport call, so probe fails gracefully and thus prevents the oops. Signed-off-by: Christian Krafft <krafft@de.ibm.com> Signed-off-by: Jean Delvare <khali@linux-fr.org>
2008-02-24Alchemy: compile fixManuel Lauss4-0/+4
Commit 8b798c4d16b762d15f4055597ff8d87f73b35552 broke alchemy build, fix it. Pointed out by Adrian Bunk. Signed-off-by: Manuel Lauss <mano@roarinelk.homelinux.net> Signed-off-by: Jean Delvare <khali@linux-fr.org>
2008-02-24i2c: Storage class should be before const qualifierTobias Klauser2-3/+3
The C99 specification states in section 6.11.5: The placement of a storage-class specifier other than at the beginning of the declaration specifiers in a declaration is an obsolescent feature. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Jean Delvare <khali@linux-fr.org>
2008-02-24i2c-pxa: Misc fixesWolfram Sang1-11/+14
While working on the PCA9564-platform driver, I sometimes had a glimpse at the pxa-driver. I found some suspicious places, and this patch contains my suggestions. Note: They are not tested, due to no hardware. [JD: Some more fixes.] Signed-off-by: Wolfram Sang <w.sang@pengutronix.de> Signed-off-by: Jean Delvare <khali@linux-fr.org> Tested-by: Mike Rapoport <mike@compulab.co.il> Tested-by: Eric Miao <ymiao3@marvell.com>
2008-02-24ARM: OMAP: Release i2c_adapter after use (Siemens SX1)Jean Delvare1-0/+2
Each call to i2c_get_adapter() must be followed by a call to i2c_put_adapter() to release the grabbed reference. Otherwise the reference count grows forever and the adapter can never be unregistered. Signed-off-by: Jean Delvare <khali@linux-fr.org> Acked-by: Vladimir Ananiev <vovan888@gmail.com> Acked-by: Tony Lindgren <tony@atomide.com>
2008-02-23Merge branch 'upstream-linus' of ↵Linus Torvalds8-20/+44
git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: libata-core: fix kernel-doc warning sata_fsl: fix build with ATA_VERBOSE_DEBUG [libata] ahci: AMD SB700/SB800 SATA support 64bit DMA libata-pmp: clear hob for pmp register accesses libata: automatically use DMADIR if drive/bridge requires it power_state: get rid of write-only variable in SATA pata_atiixp: Use 255 sector limit
2008-02-24libata-core: fix kernel-doc warningRandy Dunlap1-1/+1
Fix libata-core kernel-doc warning: Warning(linux-2.6.25-rc2-git6//drivers/ata/libata-core.c:168): No description found for parameter 'ap' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-02-24sata_fsl: fix build with ATA_VERBOSE_DEBUGAnton Vorontsov1-3/+5
This patch fixes build and few warnings when ATA_VERBOSE_DEBUG is defined: CC drivers/ata/sata_fsl.o drivers/ata/sata_fsl.c: In function ‘sata_fsl_fill_sg’: drivers/ata/sata_fsl.c:338: warning: format ‘%x’ expects type ‘unsigned int’, but argument 3 has type ‘void *’ drivers/ata/sata_fsl.c:338: warning: format ‘%x’ expects type ‘unsigned int’, but argument 4 has type ‘struct prde *’ drivers/ata/sata_fsl.c: In function ‘sata_fsl_qc_issue’: drivers/ata/sata_fsl.c:459: error: ‘csr_base’ undeclared (first use in this function) drivers/ata/sata_fsl.c:459: error: (Each undeclared identifier is reported only once drivers/ata/sata_fsl.c:459: error: for each function it appears in.) drivers/ata/sata_fsl.c: In function ‘sata_fsl_freeze’: drivers/ata/sata_fsl.c:525: error: ‘csr_base’ undeclared (first use in this function) make[2]: *** [drivers/ata/sata_fsl.o] Error 1 Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-02-24[libata] ahci: AMD SB700/SB800 SATA support 64bit DMAShane Huang1-6/+17
SB700 SATA controller can support 64 bit DMA, the previous commit badc2341579511a247f5993865aa68379e283c5c was added with careless reference to SB600, which should be modified by this patch. Signed-off-by: Shane Huang <shane.huang@amd.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-02-24libata-pmp: clear hob for pmp register accessesMark Lord1-2/+2
>> Mark Lord wrote: >>> Tejun, I've added PMP to sata_mv, and am now trying to get it >>> to work with a Marvell PM attached. >>> >>> And the behaviour I see is very bizarre. >>> >>> After hard+soft resets, the PM signature is found, >>> and libata interrogates the PM registers. >>> >>> It successfully reads register 0, and then register 1. >>> But all subsequent registers read out (incorrectly) as zeros. ... This behavior has been confirmed by Marvell with a SATA analyzer. The Marvell port-multiplier apparently likes to see clean HOB information when accessing PMP registers. Since sata_mv uses PIO shadow register access, this doesn't happen automatically, as it might in a more purely FIS-based driver (eg. ahci). One way to fix this is to flag these commands with ATA_TFLAG_LBA48, forcing libata to write out the HOB fields with known (zero) values. Signed-off-by: Saeed Bishara <saeed@marvell.com> Acked-by: Mark Lord <mlord@pobox.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-02-24libata: automatically use DMADIR if drive/bridge requires itTejun Heo4-3/+17
Back in 2.6.17-rc2, a libata module parameter was added for atapi_dmadir. That's nice, but most SATA devices which need it will tell us about it in their IDENTIFY PACKET response, as bit-15 of word-62 of the returned data (as per ATA7, ATA8 specifications). So for those which specify it, we should automatically use the DMADIR bit. Otherwise, disc writing will fail by default on many SATA-ATAPI drives. This patch adds ATA_DFLAG_DMADIR and make ata_dev_configure() set it if atapi_dmadir is set or identify data indicates DMADIR is necessary. atapi_xlat() is converted to check ATA_DFLAG_DMADIR before setting DMADIR. Original patch is from Mark Lord. Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Mark Lord <mlord@pobox.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-02-24power_state: get rid of write-only variable in SATAPavel Machek1-3/+0
power_state is scheduled for removal, and libata uses it in write-only mode. Remove it. Signed-off-by: Pavel Machek <pavel@suse.cz> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-02-24pata_atiixp: Use 255 sector limitAlan Cox1-2/+2
AHCI needs sorting too but this deals with the old interface Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
2008-02-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds51-180/+521
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (37 commits) [NETFILTER]: fix ebtable targets return [IP_TUNNEL]: Don't limit the number of tunnels with generic name explicitly. [NET]: Restore sanity wrt. print_mac(). [NEIGH]: Fix race between neighbor lookup and table's hash_rnd update. [RTNL]: Validate hardware and broadcast address attribute for RTM_NEWLINK tg3: ethtool phys_id default [BNX2]: Update version to 1.7.4. [BNX2]: Disable parallel detect on an HP blade. [BNX2]: More 5706S link down workaround. ssb: Fix support for PCI devices behind a SSB->PCI bridge zd1211rw: fix sparse warnings rtl818x: fix sparse warnings ssb: Fix pcicore cardbus mode ssb: Make the GPIO API reentrancy safe ssb: Fix the GPIO API ssb: Fix watchdog access for devices without a chipcommon ssb: Fix serial console on new bcm47xx devices ath5k: Fix build warnings on some 64-bit platforms. WDEV, ath5k, don't return int from bool function WDEV: ath5k, fix lock imbalance ...
2008-02-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6Linus Torvalds12-84/+34
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6: [SPARC64]: make IOMMU code respect the segment boundary limits [SPARC64]: Fix cpu trampoline et al. mismatch warnings. [SPARC64]: More sparse warning fixes in process.c [SPARC64]: Fix sparse warning wrt. fault_in_user_windows. [SPARC64]: Kill show_regs32(). [SPARC64]: Fix sparse warnings wrt. __show_regs(). [SPARC64]: Kill show_stackframe{,32}(). [SPARC64]: Fix sparse warnings wrt. machine_alt_power_off().
2008-02-23Fix u132-hcd.c compile errorMirco Tischler1-2/+2
This fixes the following compile error caused by commit 3a2d5b700132f35401f1d9e22fe3c2cab02c2549 ("PM: Introduce PM_EVENT_HIBERNATE callback state") CC [M] drivers/usb/host/u132-hcd.o drivers/usb/host/u132-hcd.c: In function ‘u132_suspend’: drivers/usb/host/u132-hcd.c:3224: error: expected expression before ‘int’ drivers/usb/host/u132-hcd.c:3225: error: ‘ports’ undeclared (first use in this function) ... Signed-off-by: Mirco Tischler <mt-ml@gmx.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-23[NETFILTER]: fix ebtable targets returnJoonwoo Park3-3/+3
The function ebt_do_table doesn't take NF_DROP as a verdict from the targets. Signed-off-by: Joonwoo Park <joonwpark81@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-23[IP_TUNNEL]: Don't limit the number of tunnels with generic name explicitly.Pavel Emelyanov5-43/+10
Use the added dev_alloc_name() call to create tunnel device name, rather than iterate in a hand-made loop with an artificial limit. Thanks Patrick for noticing this. [ The way this works is, when the device is actually registered, the generic code noticed the '%' in the name and invokes dev_alloc_name() to fully resolve the name. -DaveM ] Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-23[NET]: Restore sanity wrt. print_mac().David S. Miller2-5/+8
MAC_FMT had only one user and we tried to get rid of that, but this created more problems than it solved. As a result, this reverts three commits: 235365f3aaaa10b7056293877c0ead50425f25c7 ("net/8021q/vlan_dev.c: Use print_mac."), fea5fa875eb235dc186b1f5184eb36abc63e26cc ("[NET]: Remove MAC_FMT"), and 8f789c48448aed74fe1c07af76de8f04adacec7d ("[NET]: Elminate spurious print_mac() calls.") Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-23[NEIGH]: Fix race between neighbor lookup and table's hash_rnd update.Pavel Emelyanov1-2/+4
The neigh_hash_grow() may update the tbl->hash_rnd value, which is used in all tbl->hash callbacks to calculate the hashval. Two lookup routines may race with this, since they call the ->hash callback without the tbl->lock held. Since the hash_rnd is changed with this lock write-locked moving the calls to ->hash under this lock read-locked closes this gap. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-23[RTNL]: Validate hardware and broadcast address attribute for RTM_NEWLINKThomas Graf1-6/+19
RTM_NEWLINK allows for already existing links to be modified. For this purpose do_setlink() is called which expects address attributes with a payload length of at least dev->addr_len. This patch adds the necessary validation for the RTM_NEWLINK case. The address length for links to be created is not checked for now as the actual attribute length is used when copying the address to the netdevice structure. It might make sense to report an error if less than addr_len bytes are provided but enforcing this might break drivers trying to be smart with not transmitting all zero addresses. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-23tg3: ethtool phys_id defaultStephen Hemminger1-1/+1
When asked to blink LEDs the tg3 driver behaves when using: ethtool -p ethX The default value for data is zero, and other drivers interpret this as blink forever (or at least a really long time). The tg3 driver interprets this as blink once. All drivers should have the same behaviour. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-23[BNX2]: Update version to 1.7.4.Michael Chan1-2/+2
Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-23fix vmsas.c file permissionsOliver Pinter1-0/+0
Signed-off-by: Oliver Pinter <oliver.pntr@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-23[BNX2]: Disable parallel detect on an HP blade.Michael Chan2-1/+14
Because of some board issues, we need to disable parallel detect on an HP blade. Without this patch, the link state can become stuck when it goes into parallel detect mode. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-23[BNX2]: More 5706S link down workaround.Michael Chan1-15/+17
The previous patches to workaround the 5706S on an HP blade were not sufficient. The link state still does not change properly in some cases. This patch adds polling to make it completely reliable. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-23Add memory barrier semantics to wake_up() & coLinus Torvalds1-0/+1
Oleg Nesterov and others have pointed out that on some architectures, the traditional sequence of set_current_state(TASK_INTERRUPTIBLE); if (CONDITION) return; schedule(); is racy wrt another CPU doing CONDITION = 1; wake_up_process(p); because while set_current_state() has a memory barrier separating setting of the TASK_INTERRUPTIBLE state from reading of the CONDITION variable, there is no such memory barrier on the wakeup side. Now, wake_up_process() does actually take a spinlock before it reads and sets the task state on the waking side, and on x86 (and many other architectures) that spinlock is in fact equivalent to a memory barrier, but that is not generally guaranteed. The write that sets CONDITION could move into the critical region protected by the runqueue spinlock. However, adding a smp_wmb() to before the spinlock should now order the writing of CONDITION wrt the lock itself, which in turn is ordered wrt the accesses within the spinlock (which includes the reading of the old state). This should thus close the race (which probably has never been seen in practice, but since smp_wmb() is a no-op on x86, it's not like this will make anything worse either on the most common architecture where the spinlock already gave the required protection). Acked-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: Dmitry Adamushko <dmitry.adamushko@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-23mvsas: fix build warning, clean prototypesJeff Garzik1-12/+1
- Fix build 'make randconfig' build warning spotted by Toralf Foerster: drivers/scsi/mvsas.c: In function 'mvs_hexdump': drivers/scsi/mvsas.c:715: error: implicit declaration of function 'isalnum' - Remove unneeded prototypes (spotted by hch) Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-23documentation: atomic_add_unless() doesn't imply mb() on failureOleg Nesterov2-2/+3
(sorry for being offtpoic, but while experts are here...) A "typical" implementation of atomic_add_unless() can return 0 immediately after the first atomic_read() (before doing cmpxchg). In that case it doesn't provide any barrier semantics. See include/asm-ia64/atomic.h as an example. We should either change the implementation, or fix the docs. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-23memcgroup: return negative error code in mem_cgroup_create()Li Zefan1-2/+2
Cgroup requires the subsystem to return negative error code on error in the create method. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Balbir Singh <balbir@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-23memcgroup: remove a useless VM_BUG_ON()Li Zefan1-1/+0
Remove this VM_BUG_ON(), as Balbir stated: We used to have a for loop with !list_empty() as a termination condition and VM_BUG_ON(!pc) is a spill over. With the new loop, VM_BUG_ON(!pc) does not make sense. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Balbir Singh <balbir@in.ibm.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-23memcgroup: fix and update documentationLi Zefan1-15/+9
- remove trailing " Bytes"s in the demonstration - remove section 4.4 (feature control_type has been removed) - fix reference section Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Balbir Singh <balbir@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-23cgroup: remove dead code in cgroup_get_rootdir()Li Zefan1-1/+0
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Paul Menage <menage@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>