Age | Commit message (Collapse) | Author | Files | Lines |
|
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
"New notable features:
- The seccomp work from Will Drewry
- PR_{GET,SET}_NO_NEW_PRIVS from Andy Lutomirski
- Longer security labels for Smack from Casey Schaufler
- Additional ptrace restriction modes for Yama by Kees Cook"
Fix up trivial context conflicts in arch/x86/Kconfig and include/linux/filter.h
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (65 commits)
apparmor: fix long path failure due to disconnected path
apparmor: fix profile lookup for unconfined
ima: fix filename hint to reflect script interpreter name
KEYS: Don't check for NULL key pointer in key_validate()
Smack: allow for significantly longer Smack labels v4
gfp flags for security_inode_alloc()?
Smack: recursive tramsmute
Yama: replace capable() with ns_capable()
TOMOYO: Accept manager programs which do not start with / .
KEYS: Add invalidation support
KEYS: Do LRU discard in full keyrings
KEYS: Permit in-place link replacement in keyring list
KEYS: Perform RCU synchronisation on keys prior to key destruction
KEYS: Announce key type (un)registration
KEYS: Reorganise keys Makefile
KEYS: Move the key config into security/keys/Kconfig
KEYS: Use the compat keyctl() syscall wrapper on Sparc64 for Sparc32 compat
Yama: remove an unused variable
samples/seccomp: fix dependencies on arch macros
Yama: add additional ptrace scopes
...
|
|
|
|
Move tcp_try_coalesce() protocol independent part to
skb_try_coalesce().
skb_try_coalesce() can be used in IPv4 defrag and IPv6 reassembly,
to build optimized skbs (less sk_buff, and possibly less 'headers')
skb_try_coalesce() is zero copy, unless the copy can fit in destination
header (its a rare case)
kfree_skb_partial() is also moved to net/core/skbuff.c and exported,
because IPv6 will need it in patch (ipv6: use skb coalescing in
reassembly).
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
No need to export napi_frags_skb()
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
commit c57b5468406 (pktgen: fix crash at module unload) did a very poor
job with list primitives.
1) list_splice() arguments were in the wrong order
2) list_splice(list, head) has undefined behavior if head is not
initialized.
3) We should use the list_splice_init() variant to clear pktgen_threads
list.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix two issues introduced in commit a1c7fff7e18f5
( net: netdev_alloc_skb() use build_skb() )
- Must be IRQ safe (non NAPI drivers can use it)
- Must not leak the frag if build_skb() fails to allocate sk_buff
This patch introduces netdev_alloc_frag() for drivers willing to
use build_skb() instead of __netdev_alloc_skb() variants.
Factorize code so that :
__dev_alloc_skb() is a wrapper around __netdev_alloc_skb(), and
dev_alloc_skb() a wrapper around netdev_alloc_skb()
Use __GFP_COLD flag.
Almost all network drivers now benefit from skb->head_frag
infrastructure.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When I first wrote drop monitor I wrote it to just build monolithically. There
is no reason it can't be built modularly as well, so lets give it that
flexibiity.
I've tested this by building it as both a module and monolithically, and it
seems to work quite well
Change notes:
v2)
* fixed for_each_present_cpu loops to be more correct as per Eric D.
* Converted exit path failures to BUG_ON as per Ben H.
v3)
* Converted del_timer to del_timer_sync to close race noted by Ben H.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Ben Hutchings <bhutchings@solarflare.com>
Reviewed-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
netdev_alloc_skb() is used by networks driver in their RX path to
allocate an skb to receive an incoming frame.
With recent skb->head_frag infrastructure, it makes sense to change
netdev_alloc_skb() to use build_skb() and a frag allocator.
This permits a zero copy splice(socket->pipe), and better GRO or TCP
coalescing.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use the current logging style.
This enables use of dynamic debugging as well.
Convert printk(KERN_<LEVEL> to pr_<level>.
Add pr_fmt. Remove embedded prefixes, use
%s, __func__ instead.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Convert printk(KERN_DEBUG to pr_debug which can
enable dynamic debugging.
Remove embedded prefixes from the conversions as
pr_fmt adds them.
Align arguments.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
- sock_flag() accepts a const pointer
- sock_flag() returns a boolean
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We are going to delete the Token ring support. This removes any
special processing in the core networking for token ring, (aside
from net/tr.c itself), leaving the drivers and remaining tokenring
support present but inert.
The mass removal of the drivers and net/tr.c will be in a separate
commit, so that the history of these files that we still care
about won't have the giant deletion tied into their history.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
|
|
Standardize the net core ratelimited logging functions.
Coalesce formats, align arguments.
Change a printk then vprintk sequence to use printf extension %pV.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
commit 7d3d43dab4e9 (net: In unregister_netdevice_notifier unregister
the netdevices.) makes pktgen crashing at module unload.
[ 296.820578] BUG: spinlock bad magic on CPU#6, rmmod/3267
[ 296.820719] lock: ffff880310c38000, .magic: ffff8803, .owner: <none>/-1, .owner_cpu: -1
[ 296.820943] Pid: 3267, comm: rmmod Not tainted 3.4.0-rc5+ #254
[ 296.821079] Call Trace:
[ 296.821211] [<ffffffff8168a715>] spin_dump+0x8a/0x8f
[ 296.821345] [<ffffffff8168a73b>] spin_bug+0x21/0x26
[ 296.821507] [<ffffffff812b4741>] do_raw_spin_lock+0x131/0x140
[ 296.821648] [<ffffffff8169188e>] _raw_spin_lock+0x1e/0x20
[ 296.821786] [<ffffffffa00cc0fd>] __pktgen_NN_threads+0x4d/0x140 [pktgen]
[ 296.821928] [<ffffffffa00ccf8d>] pktgen_device_event+0x10d/0x1e0 [pktgen]
[ 296.822073] [<ffffffff8154ed4f>] unregister_netdevice_notifier+0x7f/0x100
[ 296.822216] [<ffffffffa00d2a0b>] pg_cleanup+0x48/0x73 [pktgen]
[ 296.822357] [<ffffffff8109528e>] sys_delete_module+0x17e/0x2a0
[ 296.822502] [<ffffffff81699652>] system_call_fastpath+0x16/0x1b
Hold the pktgen_thread_lock while splicing pktgen_threads, and test
pktgen_exiting in pktgen_device_event() to make unload faster.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This reverts commit 8a83a00b0735190384a348156837918271034144.
It causes regressions for S390 devices, because it does an
unconditional DST drop on SKBs for vlans and the QETH device
needs the neighbour entry hung off the DST for certain things
on transmit.
Arnd can't remember exactly why he even needed this change.
Conflicts:
drivers/net/macvlan.c
net/8021q/vlan_dev.c
net/core/dev.c
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
ETHTOOL_GMODULEINFO returns a new struct ethtool_modinfo that will return the
type and size of plug-in module eeprom (such as SFP+) for parsing
by userland program.
ETHTOOL_GMODULEEEPROM returns the raw eeprom information
using the existing ethtool_eeprom structture to return the data
Signed-off-by: Stuart Hodgson <smhodgson@solarflare.com>
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
|
|
We want to support reading module (SFP+, XFP, ...) EEPROMs as well as
NIC EEPROMs. They will need a different command number and driver
operation, but the structure and arguments will be the same and so we
can share most of the code here.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
|
|
To build ip_vs as a module sysctl_rmem_max and sysctl_wmem_max
needs to be exported.
The dependency was added by "ipvs: wakeup master thread" patch.
Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Conflicts:
drivers/net/ethernet/intel/e1000e/param.c
drivers/net/wireless/iwlwifi/iwl-agn-rx.c
drivers/net/wireless/iwlwifi/iwl-trans-pcie-rx.c
drivers/net/wireless/iwlwifi/iwl-trans.h
Resolved the iwlwifi conflict with mainline using 3-way diff posted
by John Linville and Stephen Rothwell. In 'net' we added a bug
fix to make iwlwifi report a more accurate skb->truesize but this
conflicted with RX path changes that happened meanwhile in net-next.
In e1000e a conflict arose in the validation code for settings of
adapter->itr. 'net-next' had more sophisticated logic so that
logic was used.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
With the recent changes for how we compute the skb truesize it occurs to me
we are probably going to have a lot of calls to skb_end_pointer -
skb->head. Instead of running all over the place doing that it would make
more sense to just make it a separate inline skb_end_offset(skb) that way
we can return the correct value without having gcc having to do all the
optimization to cancel out skb->head - skb->head.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Since there is now only one spot that actually uses "fastpath" there isn't
much point in carrying it. Instead we can just use a check for skb_cloned
to verify if we can perform the fast-path free for the head or not.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The fast-path for pskb_expand_head contains a check where the size plus the
unaligned size of skb_shared_info is compared against the size of the data
buffer. This code path has two issues. First is the fact that after the
recent changes by Eric Dumazet to __alloc_skb and build_skb the shared info
is always placed in the optimal spot for a buffer size making this check
unnecessary. The second issue is the fact that the check doesn't take into
account the aligned size of shared info. As a result the code burns cycles
doing a memcpy with nothing actually being shifted.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Linux 3.4-rc5
Merge to pull in prerequisite change for Smack:
86812bb0de1a3758dc6c7aa01a763158a7c0638a
Requested by Casey.
|
|
This patch adds support for a skb_head_is_locked helper function. It is
meant to be used any time we are considering transferring the head from
skb->head to a paged frag. If the head is locked it means we cannot remove
the head from the skb so it must be copied or we must take the skb as a
whole.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
GRO is very optimistic in skb truesize estimates, only taking into
account the used part of fragments.
Be conservative, and use more precise computation, so that bloated GRO
skbs can be collapsed eventually.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This change is meant ot prevent stealing the skb->head to use as a page in
the event that the skb->head was cloned. This allows the other clones to
track each other via shinfo->dataref.
Without this we break down to two methods for tracking the reference count,
one being dataref, the other being the page count. As a result it becomes
difficult to track how many references there are to skb->head.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
I just noticed after some recent updates, that the init path for the drop
monitor protocol has a minor error. drop monitor maintains a per cpu structure,
that gets initalized from a single cpu. Normally this is fine, as the protocol
isn't in use yet, but I recently made a change that causes a failed skb
allocation to reschedule itself . Given the current code, the implication is
that this workqueue reschedule will take place on the wrong cpu. If drop
monitor is used early during the boot process, its possible that two cpus will
access a single per-cpu structure in parallel, possibly leading to data
corruption.
This patch fixes the situation, by storing the cpu number that a given instance
of this per-cpu data should be accessed from. In the case of a need for a
reschedule, the cpu stored in the struct is assigned the rescheule, rather than
the currently executing cpu
Tested successfully by myself.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: David Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
TCP or UDP stacks have big enough latencies that prefetching next
pointer is worth it.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
__skb_splice_bits() can check if skb to be spliced has its skb->head
mapped to a page fragment, instead of a kmalloc() area.
If so we can avoid a copy of the skb head and get a reference on
underlying page.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: Matt Carlson <mcarlson@broadcom.com>
Cc: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
GRO can check if skb to be merged has its skb->head mapped to a page
fragment, instead of a kmalloc() area.
We 'upgrade' skb->head as a fragment in itself
This avoids the frag_list fallback, and permits to build true GRO skb
(one sk_buff and up to 16 fragments), using less memory.
This reduces number of cache misses when user makes its copy, since a
single sk_buff is fetched.
This is a followup of patch "net: allow skb->head to be a page fragment"
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: Matt Carlson <mcarlson@broadcom.com>
Cc: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
skb->head is currently allocated from kmalloc(). This is convenient but
has the drawback the data cannot be converted to a page fragment if
needed.
We have three spots were it hurts :
1) GRO aggregation
When a linear skb must be appended to another skb, GRO uses the
frag_list fallback, very inefficient since we keep all struct sk_buff
around. So drivers enabling GRO but delivering linear skbs to network
stack aren't enabling full GRO power.
2) splice(socket -> pipe).
We must copy the linear part to a page fragment.
This kind of defeats splice() purpose (zero copy claim)
3) TCP coalescing.
Recently introduced, this permits to group several contiguous segments
into a single skb. This shortens queue lengths and save kernel memory,
and greatly reduce probabilities of TCP collapses. This coalescing
doesnt work on linear skbs (or we would need to copy data, this would be
too slow)
Given all these issues, the following patch introduces the possibility
of having skb->head be a fragment in itself. We use a new skb flag,
skb->head_frag to carry this information.
build_skb() is changed to accept a frag_size argument. Drivers willing
to provide a page fragment instead of kmalloc() data will set a non zero
value, set to the fragment size.
Then, on situations we need to convert the skb head to a frag in itself,
we can check if skb->head_frag is set and avoid the copies or various
fallbacks we have.
This means drivers currently using frags could be updated to avoid the
current skb->head allocation and reduce their memory footprint (aka skb
truesize). (thats 512 or 1024 bytes saved per skb). This also makes
bpf/netfilter faster since the 'first frag' will be part of skb linear
part, no need to copy data.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: Matt Carlson <mcarlson@broadcom.com>
Cc: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fixed a coding style issue relating to spaces
in net/core/sock.c
Signed-off-by: Jeffrin Jose <ahiliation@yahoo.co.in>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Eric Dumazet pointed out to me that the drop_monitor protocol has some holes in
its smp protections. Specifically, its possible to replace data->skb while its
being written. This patch corrects that by making data->skb an rcu protected
variable. That will prevent it from being overwritten while a tracepoint is
modifying it.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: David Miller <davem@davemloft.net>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Eric Dumazet pointed out this warning in the drop_monitor protocol to me:
[ 38.352571] BUG: sleeping function called from invalid context at kernel/mutex.c:85
[ 38.352576] in_atomic(): 1, irqs_disabled(): 0, pid: 4415, name: dropwatch
[ 38.352580] Pid: 4415, comm: dropwatch Not tainted 3.4.0-rc2+ #71
[ 38.352582] Call Trace:
[ 38.352592] [<ffffffff8153aaf0>] ? trace_napi_poll_hit+0xd0/0xd0
[ 38.352599] [<ffffffff81063f2a>] __might_sleep+0xca/0xf0
[ 38.352606] [<ffffffff81655b16>] mutex_lock+0x26/0x50
[ 38.352610] [<ffffffff8153aaf0>] ? trace_napi_poll_hit+0xd0/0xd0
[ 38.352616] [<ffffffff810b72d9>] tracepoint_probe_register+0x29/0x90
[ 38.352621] [<ffffffff8153a585>] set_all_monitor_traces+0x105/0x170
[ 38.352625] [<ffffffff8153a8ca>] net_dm_cmd_trace+0x2a/0x40
[ 38.352630] [<ffffffff8154a81a>] genl_rcv_msg+0x21a/0x2b0
[ 38.352636] [<ffffffff810f8029>] ? zone_statistics+0x99/0xc0
[ 38.352640] [<ffffffff8154a600>] ? genl_rcv+0x30/0x30
[ 38.352645] [<ffffffff8154a059>] netlink_rcv_skb+0xa9/0xd0
[ 38.352649] [<ffffffff8154a5f0>] genl_rcv+0x20/0x30
[ 38.352653] [<ffffffff81549a7e>] netlink_unicast+0x1ae/0x1f0
[ 38.352658] [<ffffffff81549d76>] netlink_sendmsg+0x2b6/0x310
[ 38.352663] [<ffffffff8150824f>] sock_sendmsg+0x10f/0x130
[ 38.352668] [<ffffffff8150abe0>] ? move_addr_to_kernel+0x60/0xb0
[ 38.352673] [<ffffffff81515f04>] ? verify_iovec+0x64/0xe0
[ 38.352677] [<ffffffff81509c46>] __sys_sendmsg+0x386/0x390
[ 38.352682] [<ffffffff810ffaf9>] ? handle_mm_fault+0x139/0x210
[ 38.352687] [<ffffffff8165b5bc>] ? do_page_fault+0x1ec/0x4f0
[ 38.352693] [<ffffffff8106ba4d>] ? set_next_entity+0x9d/0xb0
[ 38.352699] [<ffffffff81310b49>] ? tty_ldisc_deref+0x9/0x10
[ 38.352703] [<ffffffff8106d363>] ? pick_next_task_fair+0x63/0x140
[ 38.352708] [<ffffffff8150b8d4>] sys_sendmsg+0x44/0x80
[ 38.352713] [<ffffffff8165f8e2>] system_call_fastpath+0x16/0x1b
It stems from holding a spinlock (trace_state_lock) while attempting to register
or unregister tracepoint hooks, making in_atomic() true in this context, leading
to the warning when the tracepoint calls might_sleep() while its taking a mutex.
Since we only use the trace_state_lock to prevent trace protocol state races, as
well as hardware stat list updates on an rcu write side, we can just convert the
spinlock to a mutex to avoid this problem.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: David Miller <davem@davemloft.net>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Use min_t()/max_t() macros, reformat two comments, use !!test_bit() to
match !!sock_flag()
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
read only, so change it to const.
Signed-off-by: Shan Wei <davidshan@tencent.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix merge between commit 3adadc08cc1e ("net ax25: Reorder ax25_exit to
remove races") and commit 0ca7a4c87d27 ("net ax25: Simplify and
cleanup the ax25 sysctl handling")
The former moved around the sysctl register/unregister calls, the
later simply removed them.
With help from Stephen Rothwell.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Commit 35f3d14db (pipe: add support for shrinking and growing pipes)
added a slowdown for splice(socket -> pipe), as we might grow the spd
used in skb_splice_bits() for each skb we process in splice() syscall.
Its not needed since skb lengths are capped. The default on-stack arrays
are more than enough.
Use MAX_SKB_FRAGS instead of PIPE_DEF_BUFFERS to describe the reasonable
limit per skb.
Add coalescing support to help splicing of GRO skbs built from linear
skbs (linked into frag_list)
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
sk_add_backlog() & sk_rcvqueues_full() hard coded sk_rcvbuf as the
memory limit. We need to make this limit a parameter for TCP use.
No functional change expected in this patch, all callers still using the
old sk_rcvbuf limit.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Rick Jones <rick.jones2@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
splice() from socket to pipe needs linear_to_page() helper to transfert
skb header to part of page.
We can reset the offset in the current sk->sk_sndmsg_page if we are the
last user of the page.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
It seems there is a logic error in trace_drop_common(), since we store
only 64 drops, even if they are from same location.
This fix is a one liner, but we probably need more work to avoid useless
atomic dec/inc
Now I can watch 1 Mpps drops through dropwatch...
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Name them in a "backward compatible" manner, i.e. reuse or not
are still 1 and 0 respectively. The reuse value of 2 means that
the socket with it will forcibly reuse everyone else's port.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We don't use struct ctl_path anymore so delete the exported constants.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This results in code with less boiler plate that is a bit easier
to read.
Additionally stops us from using compatibility code in the sysctl
core, hastening the day when the compatibility code can be removed.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Using an ascii path to register_net_sysctl as opposed to the slightly
awkward ctl_path allows for much simpler code.
We no longer need to malloc dev_name to keep it alive the length of our
sysctl register instead we can use a small temporary buffer on the
stack.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
On the next line we register the net_core_table in net/core which
creates the directory and ensures it exists.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|