summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2012-04-11tcp: avoid order-1 allocations on wifi and tx pathEric Dumazet3-5/+18
Marc Merlin reported many order-1 allocations failures in TX path on its wireless setup, that dont make any sense with MTU=1500 network, and non SG capable hardware. After investigation, it turns out TCP uses sk_stream_alloc_skb() and used as a convention skb_tailroom(skb) to know how many bytes of data payload could be put in this skb (for non SG capable devices) Note : these skb used kmalloc-4096 (MTU=1500 + MAX_HEADER + sizeof(struct skb_shared_info) being above 2048) Later, mac80211 layer need to add some bytes at the tail of skb (IEEE80211_ENCRYPT_TAILROOM = 18 bytes) and since no more tailroom is available has to call pskb_expand_head() and request order-1 allocations. This patch changes sk_stream_alloc_skb() so that only sk->sk_prot->max_header bytes of headroom are reserved, and use a new skb field, avail_size to hold the data payload limit. This way, order-0 allocations done by TCP stack can leave more than 2 KB of tailroom and no more allocation is performed in mac80211 layer (or any layer needing some tailroom) avail_size is unioned with mark/dropcount, since mark will be set later in IP stack for output packets. Therefore, skb size is unchanged. Reported-by: Marc MERLIN <marc@merlins.org> Tested-by: Marc MERLIN <marc@merlins.org> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-11net: allow pskb_expand_head() to get maximum tailroomEric Dumazet1-1/+3
Marc Merlin reported many order-1 allocations failures in TX path on its wireless setup, that dont make any sense with MTU=1500 network, and non SG capable hardware. Turns out part of the problem comes from pskb_expand_head() not using ksize() to get exact head size given by kmalloc(). Doing the same thing than __alloc_skb() allows more tailroom in skb and can prevent future reallocations. As a bonus, struct skb_shared_info becomes cache line aligned. Reported-by: Marc MERLIN <marc@merlins.org> Tested-by: Marc MERLIN <marc@merlins.org> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-11bridge: Do not send queries on multicast group leavesHerbert Xu2-85/+0
As it stands the bridge IGMP snooping system will respond to group leave messages with queries for remaining membership. This is both unnecessary and undesirable. First of all any multicast routers present should be doing this rather than us. What's more the queries that we send may end up upsetting other multicast snooping swithces in the system that are buggy. In fact, we can simply remove the code that send these queries because the existing membership expiry mechanism doesn't rely on them anyway. So this patch simply removes all code associated with group queries in response to group leave messages. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-10MAINTAINERS: Mark NATSEMI driver as orphan'd.David S. Miller1-2/+1
After discussion with Tim Hockin. Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-10tcp: fix tcp_rcv_rtt_update() use of an unscaled RTT sampleNeal Cardwell1-2/+5
Fix a code path in tcp_rcv_rtt_update() that was comparing scaled and unscaled RTT samples. The intent in the code was to only use the 'm' measurement if it was a new minimum. However, since 'm' had not yet been shifted left 3 bits but 'new_sample' had, this comparison would nearly always succeed, leading us to erroneously set our receive-side RTT estimate to the 'm' sample when that sample could be nearly 8x too high to use. The overall effect is to often cause the receive-side RTT estimate to be significantly too large (up to 40% too large for brief periods in my tests). Signed-off-by: Neal Cardwell <ncardwell@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-10tcp: restore correct limitEric Dumazet1-2/+1
Commit c43b874d5d714f (tcp: properly initialize tcp memory limits) tried to fix a regression added in commits 4acb4190 & 3dc43e3, but still get it wrong. Result is machines with low amount of memory have too small tcp_rmem[2] value and slow tcp receives : Per socket limit being 1/1024 of memory instead of 1/128 in old kernels, so rcv window is capped to small values. Fix this to match comment and previous behavior. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Glauber Costa <glommer@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-10Merge branch 'master' of git://1984.lsi.us.es/netDavid S. Miller5-20/+24
2012-04-10netfilter: nf_conntrack: fix incorrect logic in nf_conntrack_init_netGao feng1-1/+1
in function nf_conntrack_init_net,when nf_conntrack_timeout_init falied, we should call nf_conntrack_ecache_fini to do rollback. but the current code calls nf_conntrack_timeout_fini. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-04-10netfilter: nf_ct_ipv4: packets with wrong ihl are invalidJozsef Kadlecsik1-0/+8
It was reported that the Linux kernel sometimes logs: klogd: [2629147.402413] kernel BUG at net / netfilter / nf_conntrack_proto_tcp.c: 447! klogd: [1072212.887368] kernel BUG at net / netfilter / nf_conntrack_proto_tcp.c: 392 ipv4_get_l4proto() in nf_conntrack_l3proto_ipv4.c and tcp_error() in nf_conntrack_proto_tcp.c should catch malformed packets, so the errors at the indicated lines - TCP options parsing - should not happen. However, tcp_error() relies on the "dataoff" offset to the TCP header, calculated by ipv4_get_l4proto(). But ipv4_get_l4proto() does not check bogus ihl values in IPv4 packets, which then can slip through tcp_error() and get caught at the TCP options parsing routines. The patch fixes ipv4_get_l4proto() by invalidating packets with bogus ihl value. The patch closes netfilter bugzilla id 771. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-04-10netfilter: nf_ct_ipv4: handle invalid IPv4 and IPv6 packets consistentlyJozsef Kadlecsik1-2/+2
IPv6 conntrack marked invalid packets as INVALID and let the user drop those by an explicit rule, while IPv4 conntrack dropped such packets itself. IPv4 conntrack is changed so that it marks INVALID packets and let the user to drop them. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-04-09netfilter: ip6_tables: ip6t_ext_hdr is now static inlinePablo Neira Ayuso2-15/+11
We may hit this in xt_LOG: net/built-in.o:xt_LOG.c:function dump_ipv6_packet: error: undefined reference to 'ip6t_ext_hdr' happens with these config options: CONFIG_NETFILTER_XT_TARGET_LOG=y CONFIG_IP6_NF_IPTABLES=m ip6t_ext_hdr is fairly small and it is called in the packet path. Make it static inline. Reported-by: Simon Kirby <sim@netnation.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-04-09netfilter: nf_ct_tcp: don't scale the size of the window up twiceChangli Gao1-2/+2
For a picked up connection, the window win is scaled twice: one is by the initialization code, and the other is by the sender updating code. I use the temporary variable swin instead of modifying the variable win. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-04-06Make the "word-at-a-time" helper functions more commonly usableLinus Torvalds2-32/+49
I have a new optimized x86 "strncpy_from_user()" that will use these same helper functions for all the same reasons the name lookup code uses them. This is preparation for that. This moves them into an architecture-specific header file. It's architecture-specific for two reasons: - some of the functions are likely to want architecture-specific implementations. Even if the current code happens to be "generic" in the sense that it should work on any little-endian machine, it's likely that the "multiply by a big constant and shift" implementation is less than optimal for an architecture that has a guaranteed fast bit count instruction, for example. - I expect that if architectures like sparc want to start playing around with this, we'll need to abstract out a few more details (in particular the actual unaligned accesses). So we're likely to have more architecture-specific stuff if non-x86 architectures start using this. (and if it turns out that non-x86 architectures don't start using this, then having it in an architecture-specific header is still the right thing to do, of course) Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-04-06Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds38-376/+602
Pull networking updates from David Miller: 1) Fix inaccuracies in network driver interface documentation, from Ben Hutchings. 2) Fix handling of negative offsets in BPF JITs, from Jan Seiffert. 3) Compile warning, locking, and refcounting fixes in netfilter's xt_CT, from Pablo Neira Ayuso. 4) phonet sendmsg needs to validate user length just like any other datagram protocol, fix from Sasha Levin. 5) Ipv6 multicast code uses wrong loop index, from RongQing Li. 6) Link handling and firmware fixes in bnx2x driver from Yaniv Rosner and Yuval Mintz. 7) mlx4 erroneously allocates 4 pages at a time, regardless of page size, fix from Thadeu Lima de Souza Cascardo. 8) SCTP socket option wasn't extended in a backwards compatible way, fix from Thomas Graf. 9) Add missing address change event emissions to bonding, from Shlomo Pongratz. 10) /proc/net/dev regressed because it uses a private offset to track where we are in the hash table, but this doesn't track the offset pullback that the seq_file code does resulting in some entries being missed in large dumps. Fix from Eric Dumazet. 11) do_tcp_sendpage() unloads the send queue way too fast, because it invokes tcp_push() when it shouldn't. Let the natural sequence generated by the splice paths, and the assosciated MSG_MORE settings, guide the tcp_push() calls. Otherwise what goes out of TCP is spaghetti and doesn't batch effectively into GSO/TSO clusters. From Eric Dumazet. 12) Once we put a SKB into either the netlink receiver's queue or a socket error queue, it can be consumed and freed up, therefore we cannot touch it after queueing it like that. Fixes from Eric Dumazet. 13) PPP has this annoying behavior in that for every transmit call it immediately stops the TX queue, then calls down into the next layer to transmit the PPP frame. But if that next layer can take it immediately, it just un-stops the TX queue right before returning from the transmit method. Besides being useless work, it makes several facilities unusable, in particular things like the equalizers. Well behaved devices should only stop the TX queue when they really are full, and in PPP's case when it gets backlogged to the downstream device. David Woodhouse therefore fixed PPP to not stop the TX queue until it's downstream can't take data any more. 14) IFF_UNICAST_FLT got accidently lost in some recent stmmac driver changes, re-add. From Marc Kleine-Budde. 15) Fix link flaps in ixgbe, from Eric W. Multanen. 16) Descriptor writeback fixes in e1000e from Matthew Vick. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (47 commits) net: fix a race in sock_queue_err_skb() netlink: fix races after skb queueing doc, net: Update ndo_start_xmit return type and values doc, net: Remove instruction to set net_device::trans_start doc, net: Update netdev operation names doc, net: Update documentation of synchronisation for TX multiqueue doc, net: Remove obsolete reference to dev->poll ethtool: Remove exception to the requirement of holding RTNL lock MAINTAINERS: update for Marvell Ethernet drivers bonding: properly unset current_arp_slave on slave link up phonet: Check input from user before allocating tcp: tcp_sendpages() should call tcp_push() once ipv6: fix array index in ip6_mc_add_src() mlx4: allocate just enough pages instead of always 4 pages stmmac: re-add IFF_UNICAST_FLT for dwmac1000 bnx2x: Clear MDC/MDIO warning message bnx2x: Fix BCM57711+BCM84823 link issue bnx2x: Clear BCM84833 LED after fan failure bnx2x: Fix BCM84833 PHY FW version presentation bnx2x: Fix link issue for BCM8727 boards. ...
2012-04-06net: fix a race in sock_queue_err_skb()Eric Dumazet1-1/+3
As soon as an skb is queued into socket error queue, another thread can consume it, so we are not allowed to reference skb anymore, or risk use after free. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-06netlink: fix races after skb queueingEric Dumazet1-11/+13
As soon as an skb is queued into socket receive_queue, another thread can consume it, so we are not allowed to reference skb anymore, or risk use after free. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-06doc, net: Update ndo_start_xmit return type and valuesBen Hutchings1-10/+12
Commit dc1f8bf68b311b1537cb65893430b6796118498a ('netdev: change transmit to limited range type') changed the required return type and 9a1654ba0b50402a6bd03c7b0fe9b0200a5ea7b1 ('net: Optimize hard_start_xmit() return checking') changed the valid numerical return values. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-06doc, net: Remove instruction to set net_device::trans_startBen Hutchings1-5/+2
Commit 08baf561083bc27a953aa087dd8a664bb2b88e8e ('net: txq_trans_update() helper') made it unnecessary for most drivers to set net_device::trans_start (or netdev_queue::trans_start). Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-06doc, net: Update netdev operation namesBen Hutchings2-14/+14
Commits d314774cf2cd5dfeb39a00d37deee65d4c627927 ('netdev: network device operations infrastructure') and 008298231abbeb91bc7be9e8b078607b816d1a4a ('netdev: add more functions to netdevice ops') moved and renamed net device operation pointers. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-06doc, net: Update documentation of synchronisation for TX multiqueueBen Hutchings1-3/+3
Commits e308a5d806c852f56590ffdd3834d0df0cbed8d7 ('netdev: Add netdev->addr_list_lock protection.') and e8a0464cc950972824e2e128028ae3db666ec1ed ('netdev: Allocate multiple queues for TX.') introduced more fine-grained locks. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-06doc, net: Remove obsolete reference to dev->pollBen Hutchings1-2/+1
Commit bea3348eef27e6044b6161fd04c3152215f96411 ('[NET]: Make NAPI polling independent of struct net_device objects.') removed the automatic disabling of NAPI polling by dev_close(), and drivers must now do this themselves. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-06ethtool: Remove exception to the requirement of holding RTNL lockBen Hutchings1-2/+1
Commit e52ac3398c3d772d372b9b62ab408fd5eec96840 ('net: Use device model to get driver name in skb_gso_segment()') removed the only in-tree caller of ethtool ops that doesn't hold the RTNL lock. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-05Merge tag 'fixes-for-linus' of ↵Linus Torvalds71-342/+603
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull "ARM: SoC fixes: from Olof Johansson: "A bunch of fixes for regressions (and a few other problems) in 3.4-rc1: - Fix for regression of mach/io.h cleanup on platforms with PCI or PCMCIA (adding back the include file on those for now) - AT91 fixes for usb and spi - smsc911x ethernet fixes for i.MX - smsc911x fixes for OMAP - gpio fixes for Tegra - A handful of build error and warning fixes for various platforms - cpufreq kconfig dependencies, build and lowlevel debug fixes for Samsung platforms In other words, more or less the regular collection of -rc1/2 type material. A few of them, in particular the smsc911x for OMAP series, aren't technically regressions for 3.4, but they're valid fixes and we're still relatively early in the rc cycle so it seems appropriate to include them." * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (60 commits) ARM: fix __io macro for PCMCIA ARM: EXYNOS: Fix compiler warning in dma.c file ARM: EXYNOS: fix ISO C90 warning ARM: OMAP2+: hwmod: Fix wrong SYSC_TYPE1_XXX_MASK bit definitions ARM: OMAP2+: hwmod: Make omap_hwmod_softreset wait for reset status ARM: OMAP2+: hwmod: Restore sysc after a reset ARM: OMAP2+: omap_hwmod: Allow io_ring wakeup configuration for all modules ARM: OMAP3: clock data: fill in some missing clockdomains ARM: OMAP4: clock data: Force a DPLL clkdm/pwrdm ON before a relock ARM: OMAP4: clock data: fix mult and div mask for USB_DPLL ARM: OMAP2+: powerdomain: Wait for powerdomain transition in pwrdm_state_switch() gpio: tegra: Iterate over the correct number of banks gpio: tegra: fix register address calculations for Tegra30 EXYNOS: fix dependency for EXYNOS_CPUFREQ ARM: at91: dt: remove unit-address part for memory nodes ARM: at91: fix check of valid GPIO for SPI and USB USB: ehci-atmel: add needed of.h header file ARM: at91/NAND DT bindings: add comments ARM: at91/at91sam9x5.dtsi: fix NAND ale/cle in DT file USB: ohci-at91: trivial return code name change ...
2012-04-05Merge branch 'for-linus' of ↵Linus Torvalds3-3/+14
git://git.kernel.org/pub/scm/linux/kernel/git/lliubbo/blackfin Pull a few blackfin compile fixes from Bob Liu. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lliubbo/blackfin: blackfin: update defconfig for bf527-ezkit blackfin: gpio: fix compile error if !CONFIG_GPIOLIB blackfin: fix L1 data A overflow link issue
2012-04-06blackfin: update defconfig for bf527-ezkitBob Liu1-0/+1
To fix compile error: drivers/usb/musb/blackfin.h:51:3: error: #error "Please use PIO mode in MUSB driver on bf52x chip v0.0 and v0.1" make[4]: *** [drivers/usb/musb/blackfin.o] Error 1 Signed-off-by: Bob Liu <lliubbo@gmail.com>
2012-04-06blackfin: gpio: fix compile error if !CONFIG_GPIOLIBBob Liu1-2/+12
Add __gpio_get_value()/__gpio_set_value() to fix compile error if CONFIG_GPIOLIB = n. Signed-off-by: Bob Liu <lliubbo@gmail.com>
2012-04-06blackfin: fix L1 data A overflow link issueMike Frysinger1-1/+1
This patch fix below compile error: "bfin-uclinux-ld: L1 data A overflow!" It is due to the recent lib/gen_crc32table.c change: 46c5801eaf86e83cb3a4142ad35188db5011fff0 crc32: bolt on crc32c it added 8KiB more data to __cacheline_aligned which cause blackfin L1 data cache overflow. Signed-off-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Bob Liu <lliubbo@gmail.com>
2012-04-05Merge branch 'for-linus' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/apm Pull an APM fix from Jiri Kosina: "One deadlock/race fix from Niel that got introduced when we were moving away from freezer_*_count() to wait_event_freezable()." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/apm: APM: fix deadlock in APM_IOC_SUSPEND ioctl
2012-04-05Merge tag 'omap-fixes-a2-for-3.4rc' of ↵Olof Johansson9-68/+105
git://git.kernel.org/pub/scm/linux/kernel/git/pjw/omap-pending into fixes From Paul Walmsley: OMAP clock, powerdomain, clockdomain, and hwmod fixes intended for the early v3.4-rc series. Also contains an HSMMC integration refinement of an earlier hardware bug workaround. * tag 'omap-fixes-a2-for-3.4rc' of git://git.kernel.org/pub/scm/linux/kernel/git/pjw/omap-pending: ARM: OMAP2+: hwmod: Fix wrong SYSC_TYPE1_XXX_MASK bit definitions ARM: OMAP2+: hwmod: Make omap_hwmod_softreset wait for reset status ARM: OMAP2+: hwmod: Restore sysc after a reset ARM: OMAP2+: omap_hwmod: Allow io_ring wakeup configuration for all modules ARM: OMAP3: clock data: fill in some missing clockdomains ARM: OMAP4: clock data: Force a DPLL clkdm/pwrdm ON before a relock ARM: OMAP4: clock data: fix mult and div mask for USB_DPLL ARM: OMAP2+: powerdomain: Wait for powerdomain transition in pwrdm_state_switch() ARM: OMAP AM3517/3505: clock data: change EMAC clocks aliases ARM: OMAP: clock: fix race in disable all clocks ARM: OMAP4: hwmod data: Add aliases for McBSP fclk clocks ARM: OMAP3xxx: clock data: fix DPLL4 CLKSEL masks ARM: OMAP3xxx: HSMMC: avoid erratum workaround when transceiver is attached ARM: OMAP44xx: clockdomain data: correct the emu_sys_clkdm CLKTRCTRL data
2012-04-05MAINTAINERS: update for Marvell Ethernet driversstephen hemminger1-12/+7
Marvell has agreed to do maintenance on the sky2 driver. * Add the developer to the maintainers file * Remove the old reference to the long gone (sk98lin) driver * Rearrange to fit current topic organization Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-05bonding: properly unset current_arp_slave on slave link upVeaceslav Falico1-1/+5
When a slave comes up, we're unsetting the current_arp_slave without removing active flags from it, which can lead to situations where we have more than one slave with active flags in active-backup mode. To avoid this situation we must remove the active flags from a slave before removing it as a current_arp_slave. Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-05phonet: Check input from user before allocatingSasha Levin1-0/+3
A phonet packet is limited to USHRT_MAX bytes, this is never checked during tx which means that the user can specify any size he wishes, and the kernel will attempt to allocate that size. In the good case, it'll lead to the following warning, but it may also cause the kernel to kick in the OOM and kill a random task on the server. [ 8921.744094] WARNING: at mm/page_alloc.c:2255 __alloc_pages_slowpath+0x65/0x730() [ 8921.749770] Pid: 5081, comm: trinity Tainted: G W 3.4.0-rc1-next-20120402-sasha #46 [ 8921.756672] Call Trace: [ 8921.758185] [<ffffffff810b2ba7>] warn_slowpath_common+0x87/0xb0 [ 8921.762868] [<ffffffff810b2be5>] warn_slowpath_null+0x15/0x20 [ 8921.765399] [<ffffffff8117eae5>] __alloc_pages_slowpath+0x65/0x730 [ 8921.769226] [<ffffffff81179c8a>] ? zone_watermark_ok+0x1a/0x20 [ 8921.771686] [<ffffffff8117d045>] ? get_page_from_freelist+0x625/0x660 [ 8921.773919] [<ffffffff8117f3a8>] __alloc_pages_nodemask+0x1f8/0x240 [ 8921.776248] [<ffffffff811c03e0>] kmalloc_large_node+0x70/0xc0 [ 8921.778294] [<ffffffff811c4bd4>] __kmalloc_node_track_caller+0x34/0x1c0 [ 8921.780847] [<ffffffff821b0e3c>] ? sock_alloc_send_pskb+0xbc/0x260 [ 8921.783179] [<ffffffff821b3c65>] __alloc_skb+0x75/0x170 [ 8921.784971] [<ffffffff821b0e3c>] sock_alloc_send_pskb+0xbc/0x260 [ 8921.787111] [<ffffffff821b002e>] ? release_sock+0x7e/0x90 [ 8921.788973] [<ffffffff821b0ff0>] sock_alloc_send_skb+0x10/0x20 [ 8921.791052] [<ffffffff824cfc20>] pep_sendmsg+0x60/0x380 [ 8921.792931] [<ffffffff824cb4a6>] ? pn_socket_bind+0x156/0x180 [ 8921.794917] [<ffffffff824cb50f>] ? pn_socket_autobind+0x3f/0x90 [ 8921.797053] [<ffffffff824cb63f>] pn_socket_sendmsg+0x4f/0x70 [ 8921.798992] [<ffffffff821ab8e7>] sock_aio_write+0x187/0x1b0 [ 8921.801395] [<ffffffff810e325e>] ? sub_preempt_count+0xae/0xf0 [ 8921.803501] [<ffffffff8111842c>] ? __lock_acquire+0x42c/0x4b0 [ 8921.805505] [<ffffffff821ab760>] ? __sock_recv_ts_and_drops+0x140/0x140 [ 8921.807860] [<ffffffff811e07cc>] do_sync_readv_writev+0xbc/0x110 [ 8921.809986] [<ffffffff811958e7>] ? might_fault+0x97/0xa0 [ 8921.811998] [<ffffffff817bd99e>] ? security_file_permission+0x1e/0x90 [ 8921.814595] [<ffffffff811e17e2>] do_readv_writev+0xe2/0x1e0 [ 8921.816702] [<ffffffff810b8dac>] ? do_setitimer+0x1ac/0x200 [ 8921.818819] [<ffffffff810e2ec1>] ? get_parent_ip+0x11/0x50 [ 8921.820863] [<ffffffff810e325e>] ? sub_preempt_count+0xae/0xf0 [ 8921.823318] [<ffffffff811e1926>] vfs_writev+0x46/0x60 [ 8921.825219] [<ffffffff811e1a3f>] sys_writev+0x4f/0xb0 [ 8921.827127] [<ffffffff82658039>] system_call_fastpath+0x16/0x1b [ 8921.829384] ---[ end trace dffe390f30db9eb7 ]--- Signed-off-by: Sasha Levin <levinsasha928@gmail.com> Acked-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-05tcp: tcp_sendpages() should call tcp_push() onceEric Dumazet4-6/+9
commit 2f533844242 (tcp: allow splice() to build full TSO packets) added a regression for splice() calls using SPLICE_F_MORE. We need to call tcp_flush() at the end of the last page processed in tcp_sendpages(), or else transmits can be deferred and future sends stall. Add a new internal flag, MSG_SENDPAGE_NOTLAST, acting like MSG_MORE, but with different semantic. For all sendpage() providers, its a transparent change. Only sock_sendpage() and tcp_sendpages() can differentiate the two different flags provided by pipe_to_sendpage() Reported-by: Tom Herbert <therbert@google.com> Cc: Nandita Dukkipati <nanditad@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: H.K. Jerry Chu <hkchu@google.com> Cc: Maciej Żenczykowski <maze@google.com> Cc: Mahesh Bandewar <maheshb@google.com> Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: Eric Dumazet <eric.dumazet@gmail>com> Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-05Merge branch 'akpm' (Andrew's patch-bomb)Linus Torvalds78-631/+518
Merge batch of fixes from Andrew Morton: "The simple_open() cleanup was held back while I wanted for laggards to merge things. I still need to send a few checkpoint/restore patches. I've been wobbly about merging them because I'm wobbly about the overall prospects for success of the project. But after speaking with Pavel at the LSF conference, it sounds like they're further toward completion than I feared - apparently davem is at the "has stopped complaining" stage regarding the net changes. So I need to go back and re-review those patchs and their (lengthy) discussion." * emailed from Andrew Morton <akpm@linux-foundation.org>: (16 patches) memcg swap: use mem_cgroup_uncharge_swap fix backlight: add driver for DA9052/53 PMIC v1 C6X: use set_current_blocked() and block_sigmask() MAINTAINERS: add entry for sparse checker MAINTAINERS: fix REMOTEPROC F: typo alpha: use set_current_blocked() and block_sigmask() simple_open: automatically convert to simple_open() scripts/coccinelle/api/simple_open.cocci: semantic patch for simple_open() libfs: add simple_open() hugetlbfs: remove unregister_filesystem() when initializing module drivers/rtc/rtc-88pm860x.c: fix rtc irq enable callback fs/xattr.c:setxattr(): improve handling of allocation failures fs/xattr.c:listxattr(): fall back to vmalloc() if kmalloc() failed fs/xattr.c: suppress page allocation failure warnings from sys_listxattr() sysrq: use SEND_SIG_FORCED instead of force_sig() proc: fix mount -t proc -o AAA
2012-04-05memcg swap: use mem_cgroup_uncharge_swap fixMichal Hocko1-7/+7
Although mem_cgroup_uncharge_swap has an empty placeholder for !CONFIG_CGROUP_MEM_RES_CTLR_SWAP the definition is placed in the CONFIG_SWAP ifdef block so we are missing the same definition for !CONFIG_SWAP which implies !CONFIG_CGROUP_MEM_RES_CTLR_SWAP. This has not been an issue before, because mem_cgroup_uncharge_swap was not called from !CONFIG_SWAP context. But Hugh Dickins has a cleanup patch to call __mem_cgroup_commit_charge_swapin which is defined also for !CONFIG_SWAP. Let's move both the empty definition and declaration outside of the CONFIG_SWAP block to avoid the following compilation error: mm/memcontrol.c: In function '__mem_cgroup_commit_charge_swapin': mm/memcontrol.c:2837: error: implicit declaration of function 'mem_cgroup_uncharge_swap' if CONFIG_SWAP is disabled. Reported-by: David Rientjes <rientjes@google.com> Signed-off-by: Michal Hocko <mhocko@suse.cz> Cc: Hugh Dickins <hughd@google.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-04-05backlight: add driver for DA9052/53 PMIC v1Ashish Jangam3-0/+194
DA9052/53 PMIC has capability to supply power for upto 3 banks of 6 white serial LEDS. It can also control intensity of independent banks and to drive these banks boost converter will provide up to 24V and forward current of max 50mA. This patch allows to control intensity of the individual WLEDs bank through DA9052/53 PMIC. This patch is functionally tested on Samsung SMDKV6410. Signed-off-by: David Dajun Chen <dchen@diasemi.com> Signed-off-by: Ashish Jangam <ashish.jangam@kpitcummins.com> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-04-05C6X: use set_current_blocked() and block_sigmask()Matt Fleming1-13/+3
As described in e6fa16ab9c1e ("signal: sigprocmask() should do retarget_shared_pending()") the modification of current->blocked is incorrect as we need to check whether the signal we're about to block is pending in the shared queue. Also, use the new helper function introduced in commit 5e6292c0f28f ("signal: add block_sigmask() for adding sigmask to current->blocked") which centralises the code for updating current->blocked after successfully delivering a signal and reduces the amount of duplicate code across architectures. In the past some architectures got this code wrong, so using this helper function should stop that from happening again. Acked-by: Mark Salter <msalter@redhat.com> Cc: Aurelien Jacquiot <a-jacquiot@ti.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-04-05MAINTAINERS: add entry for sparse checkerChristopher Li1-0/+9
Signed-off-by: Christopher Li <sparse@chrisli.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-04-05MAINTAINERS: fix REMOTEPROC F: typoJoe Perches1-1/+1
remoteproc.txt should have been .h Signed-off-by: Joe Perches <joe@perches.com> Cc: Ohad Ben-Cohen <ohad@wizery.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-04-05alpha: use set_current_blocked() and block_sigmask()Matt Fleming1-21/+10
As described in e6fa16ab9c1e ("signal: sigprocmask() should do retarget_shared_pending()") the modification of current->blocked is incorrect as we need to check for shared signals we're about to block. Also, use the new helper function introduced in commit 5e6292c0f28f ("signal: add block_sigmask() for adding sigmask to current->blocked") which centralises the code for updating current->blocked after successfully delivering a signal and reduces the amount of duplicate code across architectures. In the past some architectures got this code wrong, so using this helper function should stop that from happening again. Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-04-05simple_open: automatically convert to simple_open()Stephen Boyd63-572/+176
Many users of debugfs copy the implementation of default_open() when they want to support a custom read/write function op. This leads to a proliferation of the default_open() implementation across the entire tree. Now that the common implementation has been consolidated into libfs we can replace all the users of this function with simple_open(). This replacement was done with the following semantic patch: <smpl> @ open @ identifier open_f != simple_open; identifier i, f; @@ -int open_f(struct inode *i, struct file *f) -{ ( -if (i->i_private) -f->private_data = i->i_private; | -f->private_data = i->i_private; ) -return 0; -} @ has_open depends on open @ identifier fops; identifier open.open_f; @@ struct file_operations fops = { ... -.open = open_f, +.open = simple_open, ... }; </smpl> [akpm@linux-foundation.org: checkpatch fixes] Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Julia Lawall <Julia.Lawall@lip6.fr> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-04-05scripts/coccinelle/api/simple_open.cocci: semantic patch for simple_open()Julia Lawall1-0/+70
Find instances of an open-coded simple_open() and replace them with calls to simple_open(). Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Reported-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-04-05libfs: add simple_open()Stephen Boyd2-0/+9
debugfs and a few other drivers use an open-coded version of simple_open() to pass a pointer from the file to the read/write file ops. Add support for this simple case to libfs so that we can remove the many duplicate copies of this simple function. Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-04-05hugetlbfs: remove unregister_filesystem() when initializing moduleHillf Danton1-1/+0
It was introduced by d1d5e05ffdc1 ("hugetlbfs: return error code when initializing module") but as Al pointed out, is a bad idea. Quoted comments from Al: "Note that unregister_filesystem() in module init is *always* wrong; it's not an issue here (it's done too early to care about and realistically the box is not going anywhere - it'll panic when attempt to exec /sbin/init fails, if not earlier), but it's a damn bad example. Consider a normal fs module. Somebody loads it and in parallel with that we get a mount attempt on that fs type. It comes between register and failure exits that causes unregister; at that point we are screwed since grabbing a reference to module as done by mount is enough to prevent exit, but not to prevent the failure of init. As the result, module will get freed when init fails, mounted fs of that type be damned." So remove it. Signed-off-by: Hillf Danton <dhillf@gmail.com> Cc: David Rientjes <rientjes@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-04-05drivers/rtc/rtc-88pm860x.c: fix rtc irq enable callbackJett.Zhou1-2/+2
According to 88pm860x spec, rtc alarm irq enable control is bit3 for RTC_ALARM_EN, so fix it. Signed-off-by: Jett.Zhou <jtzhou@marvell.com> Cc: Axel Lin <axel.lin@gmail.com> Cc: Haojian Zhuang <haojian.zhuang@marvell.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-04-05fs/xattr.c:setxattr(): improve handling of allocation failuresAndrew Morton1-4/+17
This allocation can be as large as 64k. - Add __GFP_NOWARN so the a falied kmalloc() is silent - Fall back to vmalloc() if the kmalloc() failed Cc: Dave Chinner <david@fromorbit.com> Cc: Dave Jones <davej@codemonkey.org.uk> Cc: David Rientjes <rientjes@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-04-05fs/xattr.c:listxattr(): fall back to vmalloc() if kmalloc() failedAndrew Morton1-4/+13
This allocation can be as large as 64k. As David points out, "falling back to vmalloc here is much better solution than failing to retreive the attribute - it will work no matter how fragmented memory gets. That means we don't get incomplete backups occurring after days or months of uptime and successful backups". Cc: Dave Chinner <david@fromorbit.com> Cc: Dave Jones <davej@codemonkey.org.uk> Cc: David Rientjes <rientjes@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-04-05fs/xattr.c: suppress page allocation failure warnings from sys_listxattr()Dave Jones1-1/+1
This size is user controllable, up to a maximum of XATTR_LIST_MAX (64k). So it's trivial for someone to trigger a stream of order:4 page allocation errors. Signed-off-by: Dave Jones <davej@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Dave Chinner <david@fromorbit.com> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-04-05sysrq: use SEND_SIG_FORCED instead of force_sig()Anton Vorontsov1-1/+1
Change send_sig_all() to use do_send_sig_info(SEND_SIG_FORCED) instead of force_sig(SIGKILL). With the recent changes we do not need force_ to kill the CLONE_NEWPID tasks. And this is more correct. force_sig() can race with the exiting thread, while do_send_sig_info(group => true) kill the whole process. Some more notes from Oleg Nesterov: > Just one note. This change makes no difference for sysrq_handle_kill(). > But it obviously changes the behaviour sysrq_handle_term(). I think > this is fine, if you want to really kill the task which blocks/ignores > SIGTERM you can use sysrq_handle_kill(). > > Even ignoring the reasons why force_sig() is simply wrong here, > force_sig(SIGTERM) looks strange. The task won't be killed if it has > a handler, but SIG_IGN can't help. However if it has the handler > but blocks SIGTERM temporary (this is very common) it will be killed. Also, > force_sig() can't kill the process if the main thread has already > exited. IOW, it is trivial to create the process which can't be > killed by sysrq. So, this patch fixes the issue. Suggested-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org> Cc: Alan Cox <alan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-04-05proc: fix mount -t proc -o AAAVasiliy Kulikov1-4/+5
The proc_parse_options() call from proc_mount() runs only once at boot time. So on any later mount attempt, any mount options are ignored because ->s_root is already initialized. As a consequence, "mount -o <options>" will ignore the options. The only way to change mount options is "mount -o remount,<options>". To fix this, parse the mount options unconditionally. Signed-off-by: Vasiliy Kulikov <segoon@openwall.com> Reported-by: Arkadiusz Miskiewicz <a.miskiewicz@gmail.com> Tested-by: Arkadiusz Miskiewicz <a.miskiewicz@gmail.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>