summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2012-03-20GFS2: Change truncate page allocation to be GFP_NOFSBob Peterson2-3/+3
This patch changes the page allocation in gfs2_block_truncate_page and two others to GFP_NOFS to avoid deadlock in low-memory conditions. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2012-03-09GFS2: call gfs2_write_alloc_required for each chunkBenjamin Marzinski1-5/+8
gfs2_fallocate was calling gfs2_write_alloc_required() once at the start of the function. This caused problems since gfs2_write_alloc_required used a long unsigned int for the len, but gfs2_fallocate could allocate a much larger amount. This patch will move the call into the loop where the chunks are actually allocated and zeroed out. This will keep the allocation size under the limit, and also allow gfs2_fallocate to quickly skip over sections of the file that are already completely allocated. fallcate_chunk was also not correctly setting the file size. It was using the len veriable to find the last block written to, but by the time it was setting the size, the len variable had already been decremented to 0. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2012-03-09GFS2: Clean up log flush header writingSteven Whitehouse1-65/+66
We already send both a pre and post flush to the block device when writing a journal header. There is no need to wait for the previous I/O specifically when we do this, unless we've turned "barriers" off. As a side effect, this also cleans up the code path for flushing the journal and makes it more readable. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2012-03-08GFS2: Remove a __GFP_NOFAIL allocationSteven Whitehouse4-2/+25
In order to ensure that we've got enough buffer heads for flushing the journal, the orignal code used __GFP_NOFAIL when performing this allocation. Here we dispense with that in favour of using a mempool. This should improve efficiency in low memory conditions since flushing the journal is a good way to get memory back, we don't want to be spinning, waiting on memory allocations. The buffers which are allocated via this mempool are fairly short lived, so that we'll recycle them pretty quickly. Although there are other memory allocations which occur during the journal flush process, this is the one which can potentially require the most memory, so the most important one to fix. The amount of memory reserved is a fixed amount, and we should not need to scale it when there are a greater number of filesystems in use. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2012-03-07GFS2: Flush pending glock work when evicting an inodeSteven Whitehouse1-0/+1
This ensures that we will not try to access the inode thats being flushed via the glock after it has been freed. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2012-03-05GFS2: make sure rgrps are up to date in func gfs2_blk2rgrpdBob Peterson1-10/+4
This patch adds a call to gfs2_rindex_update from function gfs2_blk2rgrpd and removes calls to it that are made redundant by it. The problem is that a gfs2_grow can add rgrps to the rindex, then put those rgrps into use, thus rendering the rindex we read in at mount time incomplete. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2012-03-05GFS2: Eliminate sd_rindex_mutexBob Peterson3-14/+10
Over time, we've slowly eliminated the use of sd_rindex_mutex. Up to this point, it was only used in two places: function gfs2_ri_total (which totals the file system size by reading and parsing the rindex file) and function gfs2_rindex_update which updates the rgrps in memory. Both of these functions have the rindex glock to protect them, so the rindex is unnecessary. Since gfs2_grow writes to the rindex via the meta_fs, the mutex is in the wrong order according to the normal rules. This patch eliminates the mutex entirely to avoid the problem. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2012-03-01GFS2: Unlock rindex mutex on glock errorBob Peterson1-1/+2
This patch fixes an error path in function gfs2_rindex_update that leaves the rindex mutex held. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2012-02-28GFS2: Make bd_cmp() staticSteven Whitehouse1-1/+1
Add missing static to bd_cmp() Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2012-02-28GFS2: Sort the ordered write listBob Peterson1-0/+16
This patch sorts the ordered write list for GFS2 writes. This increases the throughput for simultaneous writes. For example, if you have ten processes, all doing: dd if=/dev/zero of=/mnt/gfs2/fileX on different files, the throughput will be much better. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2012-02-28GFS2: FITRIM ioctl supportSteven Whitehouse8-36/+153
The FITRIM ioctl provides an alternative way to send discard requests to the underlying device. Using the discard mount option results in every freed block generating a discard request to the block device. This can be slow, since many block devices can only process discard requests of larger sizes, and also such operations can be time consuming. Rather than using the discard mount option, FITRIM allows a sweep of the filesystem on an occasional basis, and also to optionally avoid sending down discard requests for smaller regions. In GFS2 FITRIM will work at resource group granularity. There is a flag for each resource group which keeps track of which resource groups have been trimmed. This flag is reset whenever a deallocation occurs in the resource group, and set whenever a successful FITRIM of that resource group has taken place. This helps to reduce repeated discard requests for the same block ranges, again improving performance. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2012-02-28GFS2: Move two functions from log.c to lops.cSteven Whitehouse3-101/+97
gfs2_log_get_buf() and gfs2_log_fake_buf() are both used only in lops.c, so move them next to their callers and they can then become static. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2012-02-28GFS2: glock statistics gatheringSteven Whitehouse5-19/+431
The stats are divided into two sets: those relating to the super block and those relating to an individual glock. The super block stats are done on a per cpu basis in order to try and reduce the overhead of gathering them. They are also further divided by glock type. In the case of both the super block and glock statistics, the same information is gathered in each case. The super block statistics are used to provide default values for most of the glock statistics, so that newly created glocks should have, as far as possible, a sensible starting point. The statistics are divided into three pairs of mean and variance, plus two counters. The mean/variance pairs are smoothed exponential estimates and the algorithm used is one which will be very familiar to those used to calculation of round trip times in network code. The three pairs of mean/variance measure the following things: 1. DLM lock time (non-blocking requests) 2. DLM lock time (blocking requests) 3. Inter-request time (again to the DLM) A non-blocking request is one which will complete right away, whatever the state of the DLM lock in question. That currently means any requests when (a) the current state of the lock is exclusive (b) the requested state is either null or unlocked or (c) the "try lock" flag is set. A blocking request covers all the other lock requests. There are two counters. The first is there primarily to show how many lock requests have been made, and thus how much data has gone into the mean/variance calculations. The other counter is counting queueing of holders at the top layer of the glock code. Hopefully that number will be a lot larger than the number of dlm lock requests issued. So why gather these statistics? There are several reasons we'd like to get a better idea of these timings: 1. To be able to better set the glock "min hold time" 2. To spot performance issues more easily 3. To improve the algorithm for selecting resource groups for allocation (to base it on lock wait time, rather than blindly using a "try lock") Due to the smoothing action of the updates, a step change in some input quantity being sampled will only fully be taken into account after 8 samples (or 4 for the variance) and this needs to be carefully considered when interpreting the results. Knowing both the time it takes a lock request to complete and the average time between lock requests for a glock means we can compute the total percentage of the time for which the node is able to use a glock vs. time that the rest of the cluster has its share. That will be very useful when setting the lock min hold time. The other point to remember is that all times are in nanoseconds. Great care has been taken to ensure that we measure exactly the quantities that we want, as accurately as possible. There are always inaccuracies in any measuring system, but I hope this is as accurate as we can reasonably make it. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2012-02-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-fixesLinus Torvalds4-12/+25
* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-fixes: GFS2: Read resource groups on mount GFS2: Ensure rindex is uptodate for fallocate GFS2: Read in rindex if necessary during unlink GFS2: Fix race between lru_list and glock ref count
2012-02-28Merge tag 'iommu-fixes-v3.3-rc5' of ↵Linus Torvalds3-16/+49
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu IOMMU fixes for Linux 3.3-rc5 All the fixes are for the OMAP IOMMU driver. The first patch is the biggest one. It fixes the calls of the function omap_find_iovm_area() in the omap-iommu-debug module which expects a 'struct device' parameter since commit fabdbca instead of an omap_iommu handle. The omap-iommu-debug code still passed the handle to the function which caused a crash. The second patch fixes a NULL pointer dereference in the OMAP code and the third patch makes sure that the omap-iommu is initialized before the omap-isp driver, which relies on the iommu. The last patch is only a workaround until defered probing is implemented. * tag 'iommu-fixes-v3.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: ARM: OMAP: make iommu subsys_initcall to fix builtin omap3isp iommu/omap: fix NULL pointer dereference iommu/omap: fix erroneous omap-iommu-debug API calls
2012-02-28GFS2: Read resource groups on mountSteven Whitehouse4-20/+17
This makes mount take slightly longer, but at the same time, the first write to the filesystem will be faster too. It also means that if there is a problem in the resource index, then we can refuse to mount rather than having to try and report that when the first write occurs. In addition, to avoid recursive locking, we hvae to take account of instances when the rindex glock may already be held when we are trying to update the rbtree of resource groups. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2012-02-28GFS2: Ensure rindex is uptodate for fallocateBob Peterson1-0/+5
This patch fixes a problem whereby gfs2_grow was failing and causing GFS2 to assert. The problem was that when GFS2's fallocate operation tried to acquire an "allocation" it made sure the rindex was up to date, and if not, it called gfs2_rindex_update. However, if the file being fallocated was the rindex itself, it was already locked at that point. By calling gfs2_rindex_update at an earlier point in time, we bring rindex up to date and thereby avoid trying to lock it when the "allocation" is acquired. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2012-02-28GFS2: Read in rindex if necessary during unlinkBob Peterson1-2/+7
This patch fixes a problem whereby you were unable to delete files until other file system operations were done (such as statfs, touch, writes, etc.) that caused the rindex to be read in. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2012-02-28GFS2: Fix race between lru_list and glock ref countSteven Whitehouse1-4/+10
This patch fixes a narrow race window between the glock ref count hitting zero and glocks being removed from the lru_list. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2012-02-27Merge tag 'ktest-fix-make-min-failed-build-for-real' of ↵Linus Torvalds1-3/+5
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest While demoing ktest at ELC in 2012, it was embarrassing that the make_min_config test failed to work because the snowball board I was testing it against had a config that would not build. But the make_min_config only tested the testing part and ignored build failures. The end result was a config file that would not boot. This time, for real. * tag 'ktest-fix-make-min-failed-build-for-real' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest: ktest: Fix make_min_config test when build fails
2012-02-27ktest: Fix make_min_config test when build failsSteven Rostedt1-3/+5
The make_min_config does not take into account when the build fails, resulting in a invalid MIN_CONFIG .config file. When the build fails, it is ignored and the boot test is executed, using the previous built kernel. The configs that should be tested are not tested and they may be added or removed depending on the result of the last kernel that succeeded to be built. If the build fails, mark the current config as a failure and the configs that were disabled may still be needed. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-02-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/aia21/ntfsLinus Torvalds5-17/+12
Here are some trivial NTFS changes (a spelling fix and two use before NULL check cases found by Coverity as well as an update in MAINTAINERS for the path to the ntfs git repo) together with a simple LDM fix for parsing fragmented VBLKs. * git://git.kernel.org/pub/scm/linux/kernel/git/aia21/ntfs: NTFS: Update git repo path in MAINTAINERS file. LDM: Fix reassembly of extended VBLKs. NTFS: Correct two spelling errors "dealocate" to "deallocate" in mft.c. NTFS: Do not dereference pointer before checking for NULL. NTFS: Remove unused variable.
2012-02-27Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds4-10/+46
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mce/AMD: Fix UP build error x86: Specify a size for the cmp in the NMI handler x86/nmi: Test saved %cs in NMI to determine nested NMI case x86/amd: Fix L1i and L2 cache sharing information for AMD family 15h processors x86/microcode: Remove noisy AMD microcode warning
2012-02-27Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds3-51/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/events: Revert trace_sched_stat_sleeptime()
2012-02-27Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds4-13/+37
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq: Handle pending irqs in irq_startup() genirq: Unmask oneshot irqs when thread was not woken
2012-02-27compat: fix compile breakage on s390Heiko Carstens13-14/+9
The new is_compat_task() define for the !COMPAT case in include/linux/compat.h conflicts with a similar define in arch/s390/include/asm/compat.h. This is the minimal patch which fixes the build issues. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-02-27ARM: OMAP: make iommu subsys_initcall to fix builtin omap3ispOhad Ben-Cohen2-2/+4
omap3isp depends on omap's iommu and will fail to probe if initialized before it (which always happen if they are builtin). Make omap's iommu subsys_initcall as an interim solution until the probe deferral mechanism is merged. Reported-by: James <angweiyang@gmail.com> Debugged-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com> Cc: stable <stable@vger.kernel.org> Cc: Tony Lindgren <tony@atomide.com> Cc: Hiroshi Doyu <hdoyu@nvidia.com> Cc: Joerg Roedel <Joerg.Roedel@amd.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2012-02-27NTFS: Update git repo path in MAINTAINERS file.Anton Altaparmakov1-1/+1
Signed-off-by: Anton Altaparmakov <anton@tuxera.com>
2012-02-27Merge branch 'master' of /Volumes/CaseSensitiveDisk/linuxAnton Altaparmakov154-660/+1223
2012-02-26Merge tag 'stable/for-linus-fixes-3.3-rc5' of ↵Linus Torvalds2-8/+6
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen Two fixes to fix a memory corruption bug when WC pages never get converted back to WB but end up being recycled in the general memory pool as WC. There is a better way of fixing this, but there is not enough time to do the full benchmarking to pick one of the right options - so picking the one that favors stability for right now. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> * tag 'stable/for-linus-fixes-3.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen/pat: Disable PAT support for now. xen/setup: Remove redundant filtering of PTE masks.
2012-02-26Merge tag 'for-linus' of git://github.com/rustyrussell/linuxLinus Torvalds1-4/+31
* tag 'for-linus' of git://github.com/rustyrussell/linux: mod/file2alias: make modpost compile on darwin again
2012-02-26autofs4 - update MAINTAINERS mailing list entryIan Kent1-1/+1
The autofs mailing list has moved to vger.kernel.org. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-02-27mod/file2alias: make modpost compile on darwin againAndreas Bießmann1-4/+31
commit e49ce14150c64b29a8dd211df785576fa19a9858 breaks cross compiling the linux kernel on darwin hosts. This fix introduce some minimal glue to adopt linker section handling for darwin hosts. Signed-off-by: Andreas Bießmann <andreas@biessmann.de> CC: Rusty Russell <rusty@rustcorp.com.au> CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org> CC: Jochen Friedrich <jochen@scram.de> CC: Samuel Ortiz <sameo@linux.intel.com> CC: "K. Y. Srinivasan" <kys@microsoft.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Tested-by: Bernhard Walle <bernhard@bwalle.de>
2012-02-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds61-204/+380
1) ICMP sockets leave err uninitialized but we try to return it for the unsupported MSG_OOB case, reported by Dave Jones. 2) Add new Zaurus device ID entries, from Dave Jones. 3) Pointer calculation in hso driver memset is wrong, from Dan Carpenter. 4) ks8851_probe() checks unsigned value as negative, fix also from Dan Carpenter. 5) Fix crashes in atl1c driver due to TX queue handling, from Eric Dumazet. I anticipate some TX side locking fixes coming in the near future for this driver as well. 6) The inline directive fix in Bluetooth which was breaking the build only with very new versions of GCC, from Johan Hedberg. 7) Fix crashes in the ATP CLIP code due to ARP cleanups this merge window, reported by Meelis Roos and fixed by Eric Dumazet. 8) JME driver doesn't flush RX FIFO correctly, from Guo-Fu Tseng. 9) Some ip6_route_output() callers test the return value for NULL, but this never happens as the convention is to return a dst entry with dst->error set. Fixes from RonQing Li. 10) Logitech Harmony 900 should be handled by zaurus driver not cdc_ether, update white lists and black lists accordingly. From Scott Talbert. 11) Receiving from certain kinds of devices there won't be a MAC header, so there is no MAC header to fixup in the IPSEC code, and if we try to do it we'll crash. Fix from Eric Dumazet. 12) Port type array indexing off-by-one in mlx4 driver, fix from Yevgeny Petrilin. 13) Fix regression in link-down handling in davinci_emac which causes all RX descriptors to be freed up and therefore RX to wedge completely, from Christian Riesch. 14) It took two attempts, but ctnetlink soft lockups seem to be cured now, from Pablo Neira Ayuso. 15) Endianness bug fix in ENIC driver, from Santosh Nayak. 16) The long ago conversion of the PPP fragmentation code over to abstracted SKB list handling wasn't perfect, once we get an out of sequence SKB we don't flush the rest of them like we should. From Ben McKeegan. 17) Fix regression of ->ip_summed initialization in sfc driver. From Ben Hutchings. 18) Bluetooth timeout mistakenly using msecs instead of jiffies, from Andrzej Kaczmarek. 19) Using _sync variant of work cancellation results in deadlocks, use the non _sync variants instead. From Andre Guedes. 20) Bluetooth rfcomm code had reference counting problems leading to crashes, fix from Octavian Purdila. 21) The conversion of netem over to classful qdisc handling added two bugs to netem_dequeue(), fixes from Eric Dumazet. 22) Missing pci_iounmap() in ATM Solos driver. Fix from Julia Lawall. 23) b44_pci_exit() should not have __exit tag since it's invoked from non-__exit code. From Nikola Pajkovsky. 24) The conversion of the neighbour hash tables over to RCU added a race, fixed here by adding the necessary reread of tbl->nht, fix from Michel Machado. 25) When we added VF (virtual function) attributes for network device dumps, this potentially bloats up the size of the dump of one network device such that the dump size is too large for the buffer allocated by properly written netlink applications. In particular, if you add 255 VFs to a network device, parts of GLIBC stop working. To fix this, we add an attribute that is used to turn on these extended portions of the network device dump. Sophisticaed applications like 'ip' that want to see this stuff will be changed to set the attribute, whereas things like GLIBC that don't care about VFs simply will not, and therefore won't be busted by the mere presence of VFs on a network device. Thanks to the tireless work of Greg Rose on this fix. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (53 commits) sfc: Fix assignment of ip_summed for pre-allocated skbs ppp: fix 'ppp_mp_reconstruct bad seq' errors enic: Fix endianness bug. gre: fix spelling in comments netfilter: ctnetlink: fix soft lockup when netlink adds new entries (v2) Revert "netfilter: ctnetlink: fix soft lockup when netlink adds new entries" davinci_emac: Do not free all rx dma descriptors during init mlx4_core: Fixing array indexes when setting port types phy: IC+101G and PHY_HAS_INTERRUPT flag netdev/phy/icplus: Correct broken phy_init code ipsec: be careful of non existing mac headers Move Logitech Harmony 900 from cdc_ether to zaurus hso: memsetting wrong data in hso_get_count() netfilter: ip6_route_output() never returns NULL. ethernet/broadcom: ip6_route_output() never returns NULL. ipv6: ip6_route_output() never returns NULL. jme: Fix FIFO flush issue atm: clip: remove clip_tbl ipv4: ping: Fix recvmsg MSG_OOB error handling. rtnetlink: Fix problem with buffer allocation ...
2012-02-26Fix autofs compile without CONFIG_COMPATLinus Torvalds1-0/+4
The autofs compat handling fix caused a compile failure when CONFIG_COMPAT isn't defined. Instead of adding random #ifdef'fery in autofs, let's just make the compat helpers earlier to use: without CONFIG_COMPAT, is_compat_task() just hardcodes to zero. We could probably do something similar for a number of other cases where we have #ifdef's in code, but this is the low-hanging fruit. Reported-and-tested-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-02-25Linux 3.3-rc5v3.3-rc5Linus Torvalds1-1/+1
2012-02-25Merge tag 'hwmon-for-linus' of ↵Linus Torvalds4-17/+17
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Couple of minor driver fixes. * tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: hwmon: (max34440) Fix resetting temperature history hwmon: (f75375s) Fix register write order when setting fans to full speed hwmon: (ads1015) Fix file leak in probe function hwmon: (max6639) Fix PPR register initialization to set both channels hwmon: (max6639) Fix FAN_FROM_REG calculation
2012-02-25Merge branch 'rc-fixes' of ↵Linus Torvalds3-21/+10
git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild three kbuild fixes for 3.3: - make deb-pkg symlink race fix. - make coccicheck fix. - Dropping the check for modutils. This is not a regression, but allows the module-init-tools replacement kmod work with the 3.3 kernel. * 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: coccicheck: change handling of C={1,2} when M= is set builddeb: Don't create files in /tmp with predictable names kbuild: do not check for ancient modutils tools
2012-02-25autofs: work around unhappy compat problem on x86-64Ian Kent4-3/+23
When the autofs protocol version 5 packet type was added in commit 5c0a32fc2cd0 ("autofs4: add new packet type for v5 communications"), it obvously tried quite hard to be word-size agnostic, and uses explicitly sized fields that are all correctly aligned. However, with the final "char name[NAME_MAX+1]" array at the end, the actual size of the structure ends up being not very well defined: because the struct isn't marked 'packed', doing a "sizeof()" on it will align the size of the struct up to the biggest alignment of the members it has. And despite all the members being the same, the alignment of them is different: a "__u64" has 4-byte alignment on x86-32, but native 8-byte alignment on x86-64. And while 'NAME_MAX+1' ends up being a nice round number (256), the name[] array starts out a 4-byte aligned. End result: the "packed" size of the structure is 300 bytes: 4-byte, but not 8-byte aligned. As a result, despite all the fields being in the same place on all architectures, sizeof() will round up that size to 304 bytes on architectures that have 8-byte alignment for u64. Note that this is *not* a problem for 32-bit compat mode on POWER, since there __u64 is 8-byte aligned even in 32-bit mode. But on x86, 32-bit and 64-bit alignment is different for 64-bit entities, and as a result the structure that has exactly the same layout has different sizes. So on x86-64, but no other architecture, we will just subtract 4 from the size of the structure when running in a compat task. That way we will write the properly sized packet that user mode expects. Not pretty. Sadly, this very subtle, and unnecessary, size difference has been encoded in user space that wants to read packets of *exactly* the right size, and will refuse to touch anything else. Reported-and-tested-by: Thomas Meyer <thomas@m3y3r.de> Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-02-24Merge tag 'rdma-for-linus' of ↵Linus Torvalds2-7/+7
git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband One InfiniBand/RDMA regression fix for 3.3: - mlx4 SR-IOV changes added static exported functions, which doesn't build on powerpc at least. Fix from Doug Ledford for this. * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: mlx4_core: Exported functions can't be static
2012-02-24Merge branch 'sfc-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfcDavid S. Miller1-2/+2
2012-02-25sfc: Fix assignment of ip_summed for pre-allocated skbsBen Hutchings1-2/+2
When pre-allocating skbs for received packets, we set ip_summed = CHECKSUM_UNNCESSARY. We used to change it back to CHECKSUM_NONE when the received packet had an incorrect checksum or unhandled protocol. Commit bc8acf2c8c3e43fcc192762a9f964b3e9a17748b ('drivers/net: avoid some skb->ip_summed initializations') mistakenly replaced the latter assignment with a DEBUG-only assertion that ip_summed == CHECKSUM_NONE. This assertion is always false, but it seems no-one has exercised this code path in a DEBUG build. Fix this by moving our assignment of CHECKSUM_UNNECESSARY into efx_rx_packet_gro(). Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2012-02-24Merge tag 'scsi-fixes' of ↵Linus Torvalds17-100/+101
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6 SCSI fixes on 20120224: "This is a set of assorted bug fixes for power management, mpt2sas, ipr, the rdac device handler and quite a big chunk for qla2xxx (plus a use after free of scsi_host in scsi_scan.c). " * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6: [SCSI] scsi_dh_rdac: Fix for unbalanced reference count [SCSI] scsi_pm: Fix bug in the SCSI power management handler [SCSI] scsi_scan: Fix 'Poison overwritten' warning caused by using freed 'shost' [SCSI] qla2xxx: Update version number to 8.03.07.13-k. [SCSI] qla2xxx: Proper detection of firmware abort error code for ISP82xx. [SCSI] qla2xxx: Remove resetting memory during device initialization for ISP82xx. [SCSI] qla2xxx: Complete mailbox command timedout to avoid initialization failures during next reset cycle. [SCSI] qla2xxx: Remove check for null fcport from host reset handler. [SCSI] qla2xxx: Correct out of bounds read of ISP2200 mailbox registers. [SCSI] qla2xxx: Remove errant clearing of MBX_INTERRUPT flag during CT-IOCB processing. [SCSI] qla2xxx: Clear options-flags while issuing stop-firmware mbx command. [SCSI] qla2xxx: Add an "is reset active" helper. [SCSI] qla2xxx: Add check for null fcport references in qla2xxx_queuecommand. [SCSI] qla2xxx: Propagate up abort failures. [SCSI] isci: Fix NULL ptr dereference when no firmware is being loaded [SCSI] ipr: fix eeh recovery for 64-bit adapters [SCSI] mpt2sas: Fix mismatch in mpt2sas_base_hard_reset_handler() mutex lock-unlock
2012-02-24ppp: fix 'ppp_mp_reconstruct bad seq' errorsBen McKeegan1-0/+23
This patch fixes a (mostly cosmetic) bug introduced by the patch 'ppp: Use SKB queue abstraction interfaces in fragment processing' found here: http://www.spinics.net/lists/netdev/msg153312.html The above patch rewrote and moved the code responsible for cleaning up discarded fragments but the new code does not catch every case where this is necessary. This results in some discarded fragments remaining in the queue, and triggering a 'bad seq' error on the subsequent call to ppp_mp_reconstruct. Fragments are discarded whenever other fragments of the same frame have been lost. This can generate a lot of unwanted and misleading log messages. This patch also adds additional detail to the debug logging to make it clearer which fragments were lost and which other fragments were discarded as a result of losses. (Run pppd with 'kdebug 1' option to enable debug logging.) Signed-off-by: Ben McKeegan <ben@netservers.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-24enic: Fix endianness bug.Santosh Nayak2-2/+2
Sparse complaints the endian bug. Signed-off-by: Santosh Nayak <santoshprasadnayak@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-24coccicheck: change handling of C={1,2} when M= is setGreg Dietsche1-9/+4
This patch reverts a portion of d0bc1fb4 so that coccicheck will work properly when C=1 or C=2. Reported-and-tested-by: Brice Goglin <Brice.Goglin@inria.fr> Signed-off-by: Greg Dietsche <Gregory.Dietsche@cuw.edu> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Michal Marek <mmarek@suse.cz>
2012-02-24Merge branch 'master' of git://1984.lsi.us.es/netDavid S. Miller3-12/+39
2012-02-24gre: fix spelling in commentsstephen hemminger1-5/+5
The original spelling and bad word choice makes these comments hard to read. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-02-24Merge branch 'v4l_for_linus' of ↵Linus Torvalds6-23/+74
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media * 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: [media] hdpvr: update picture controls to support firmware versions > 0.15 [media] wl128x: fix build errors when GPIOLIB is not enabled [media] hdpvr: fix race conditon during start of streaming [media] omap3isp: Fix crash caused by subdevs now having a pointer to devnodes [media] imon: don't wedge hardware after early callbacks
2012-02-24epoll: ep_unregister_pollwait() can use the freed pwq->wheadOleg Nesterov2-4/+32
signalfd_cleanup() ensures that ->signalfd_wqh is not used, but this is not enough. eppoll_entry->whead still points to the memory we are going to free, ep_unregister_pollwait()->remove_wait_queue() is obviously unsafe. Change ep_poll_callback(POLLFREE) to set eppoll_entry->whead = NULL, change ep_unregister_pollwait() to check pwq->whead != NULL under rcu_read_lock() before remove_wait_queue(). We add the new helper, ep_remove_wait_queue(), for this. This works because sighand_cachep is SLAB_DESTROY_BY_RCU and because ->signalfd_wqh is initialized in sighand_ctor(), not in copy_sighand. ep_unregister_pollwait()->remove_wait_queue() can play with already freed and potentially reused ->sighand, but this is fine. This memory must have the valid ->signalfd_wqh until rcu_read_unlock(). Reported-by: Maxime Bizon <mbizon@freebox.fr> Cc: <stable@kernel.org> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>