Age | Commit message (Collapse) | Author | Files | Lines |
|
Instead of using the SKB queue with the fake pkt_type for the
offloaded RX BA session management, also handle this with the
normal aggregation state machine worker. This also makes the
use of this more reliable since it gets rid of the allocation
of the fake skb.
Combined with the previous patch, this finally allows us to
get rid of the pkt_type hack entirely, so do that as well.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
This brings in commit 7a7c0a6438b8 ("mac80211: fix TX aggregation
start/stop callback race") to allow the follow-up cleanup.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Just some simple overlapping changes in marvell PHY driver
and the DSA core code.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
XAUI allows XGMII to reach an extended distance by using a XGXS layer at
each end of the MAC to PHY link, operating over four Serdes lanes.
10GBASE-KR is a single lane Serdes backplane ethernet connection method
with autonegotiation on the link. Some PHYs use this to connect to the
ethernet interface at 10G speeds, switching to other connection types
when utilising slower speeds.
10GBASE-KR is also used for XFI and SFI to connect to XFP and SFP fiber
modules.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
genphy_restart_aneg() can only restart autonegotiation on clause 22
PHYs. Add a phy_restart_aneg() function which selects between the
clause 22 and clause 45 restart functionality depending on the PHY
type and whether the Clause 45 PHY supports the Clause 22 register set.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add generic helpers for 802.3 clause 45 PHYs for >= 10Gbps support.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pull networking fixes from David Miller:
1) Made TCP congestion control documentation match current reality,
from Anmol Sarma.
2) Various build warning and failure fixes from Arnd Bergmann.
3) Fix SKB list leak in ipv6_gso_segment().
4) Use after free in ravb driver, from Eugeniu Rosca.
5) Don't use udp_poll() in ping protocol driver, from Eric Dumazet.
6) Don't crash in PCI error recovery of cxgb4 driver, from Guilherme
Piccoli.
7) _SRC_NAT_DONE_BIT needs to be cleared using atomics, from Liping
Zhang.
8) Use after free in vxlan deletion, from Mark Bloch.
9) Fix ordering of NAPI poll enabled in ethoc driver, from Max
Filippov.
10) Fix stmmac hangs with TSO, from Niklas Cassel.
11) Fix crash in CALIPSO ipv6, from Richard Haines.
12) Clear nh_flags properly on mpls link up. From Roopa Prabhu.
13) Fix regression in sk_err socket error queue handling, noticed by
ping applications. From Soheil Hassas Yeganeh.
14) Update mlx4/mlx5 MAINTAINERS information.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (78 commits)
net: stmmac: fix a broken u32 less than zero check
net: stmmac: fix completely hung TX when using TSO
net: ethoc: enable NAPI before poll may be scheduled
net: bridge: fix a null pointer dereference in br_afspec
ravb: Fix use-after-free on `ifconfig eth0 down`
net/ipv6: Fix CALIPSO causing GPF with datagram support
net: stmmac: ensure jumbo_frm error return is correctly checked for -ve value
Revert "sit: reload iphdr in ipip6_rcv"
i40e/i40evf: proper update of the page_offset field
i40e: Fix state flags for bit set and clean operations of PF
iwlwifi: fix host command memory leaks
iwlwifi: fix min API version for 7265D, 3168, 8000 and 8265
iwlwifi: mvm: clear new beacon command template struct
iwlwifi: mvm: don't fail when removing a key from an inexisting sta
iwlwifi: pcie: only use d0i3 in suspend/resume if system_pm is set to d0i3
iwlwifi: mvm: fix firmware debug restart recording
iwlwifi: tt: move ucode_loaded check under mutex
iwlwifi: mvm: support ibss in dqa mode
iwlwifi: mvm: Fix command queue number on d0i3 flow
iwlwifi: mvm: rs: start using LQ command color
...
|
|
GCC explicitly does not warn for unused static inline functions for
-Wunused-function. The manual states:
Warn whenever a static function is declared but not defined or
a non-inline static function is unused.
Clang does warn for static inline functions that are unused.
It turns out that suppressing the warnings avoids potentially complex
#ifdef directives, which also reduces LOC.
Suppress the warning for clang.
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
A single BPF_OBJ_GET_INFO_BY_FD cmd is used to obtain the info
for both bpf_prog and bpf_map. The kernel can figure out the
fd is associated with a bpf_prog or bpf_map.
The suggested struct bpf_prog_info and struct bpf_map_info are
not meant to be a complete list and it is not the goal of this patch.
New fields can be added in the future patch.
The focus of this patch is to create the interface,
BPF_OBJ_GET_INFO_BY_FD cmd for exposing the bpf_prog's and
bpf_map's info.
The obj's info, which will be extended (and get bigger) over time, is
separated from the bpf_attr to avoid bloating the bpf_attr.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Alexei Starovoitov <ast@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add jited_len to struct bpf_prog. It will be
useful for the struct bpf_prog_info which will
be added in the later patch.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Alexei Starovoitov <ast@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add BPF_MAP_GET_FD_BY_ID command to allow user to get a fd
from a bpf_map's ID.
bpf_map_inc_not_zero() is added and is called with map_idr_lock
held.
__bpf_map_put() is also added which has the 'bool do_idr_lock'
param to decide if the map_idr_lock should be acquired when
freeing the map->id.
In the error path of bpf_map_inc_not_zero(), it may have to
call __bpf_map_put(map, false) which does not need
to take the map_idr_lock when freeing the map->id.
It is currently limited to CAP_SYS_ADMIN which we can
consider to lift it in followup patches.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Alexei Starovoitov <ast@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add BPF_PROG_GET_FD_BY_ID command to allow user to get a fd
from a bpf_prog's ID.
bpf_prog_inc_not_zero() is added and is called with prog_idr_lock
held.
__bpf_prog_put() is also added which has the 'bool do_idr_lock'
param to decide if the prog_idr_lock should be acquired when
freeing the prog->id.
In the error path of bpf_prog_inc_not_zero(), it may have to
call __bpf_prog_put(map, false) which does not need
to take the prog_idr_lock when freeing the prog->id.
It is currently limited to CAP_SYS_ADMIN which we can
consider to lift it in followup patches.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Alexei Starovoitov <ast@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch adds BPF_PROG_GET_NEXT_ID and BPF_MAP_GET_NEXT_ID
to allow userspace to iterate all bpf_prog IDs and bpf_map IDs.
The API is trying to be consistent with the existing
BPF_MAP_GET_NEXT_KEY.
It is currently limited to CAP_SYS_ADMIN which we can
consider to lift it in followup patches.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Alexei Starovoitov <ast@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch generates an unique ID for each created bpf_map.
The approach is similar to the earlier patch for bpf_prog ID.
It is worth to note that the bpf_map's ID and bpf_prog's ID
are in two independent ID spaces and both have the same valid range:
[1, INT_MAX).
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Alexei Starovoitov <ast@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch generates an unique ID for each BPF_PROG_LOAD-ed prog.
It is worth to note that each BPF_PROG_LOAD-ed prog will have
a different ID even they have the same bpf instructions.
The ID is generated by the existing idr_alloc_cyclic().
The ID is ranged from [1, INT_MAX). It is allocated in cyclic manner,
so an ID will get reused every 2 billion BPF_PROG_LOAD.
The bpf_prog_alloc_id() is done after bpf_prog_select_runtime()
because the jit process may have allocated a new prog. Hence,
we need to ensure the value of pointer 'prog' will not be changed
any more before storing the prog to the prog_idr.
After bpf_prog_select_runtime(), the prog is read-only. Hence,
the id is stored in 'struct bpf_prog_aux'.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Alexei Starovoitov <ast@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
It's unused, so remove it.
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Introduce a helper called is_tcf_gact_trap which could be used to
tell if the action is gact trap or not.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Yotam Gigi <yotamg@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There is need to instruct the HW offloaded path to push certain matched
packets to cpu/kernel for further analysis. So this patch introduces a
new TRAP control action to TC.
For kernel datapath, this action does not make much sense. So with the
same logic as in HW, new TRAP behaves similar to STOLEN. The skb is just
dropped in the datapath (and virtually ejected to an upper level, which
does not exist in case of kernel).
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Yotam Gigi <yotamg@mellanox.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:
"Some bug fixes:
- Don't fail build if atomisp has warnings
- Some CEC Kconfig changes to allow it to be used by DRM without
media dependencies
- A race fix at RC initialization code
- A driver fix at rainshadow-cec
IMHO, the one that affects most people in this series is a build fix:
if you try to build the Kernel with W=1 or using gcc7 and
all[yes|mod]config, build will fail due to -Werror at atomisp
makefiles"
* tag 'media/v4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
[media] rc-core: race condition during ir_raw_event_register()
[media] cec: drop MEDIA_CEC_DEBUG
[media] cec: rename MEDIA_CEC_NOTIFIER to CEC_NOTIFIER
[media] cec: select CEC_CORE instead of depend on it
[media] rainshadow-cec: ensure exit_loop is intialized
[media] atomisp: don't treat warnings as errors
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
David Howells says:
====================
rxrpc: Support service upgrade
Here's a set of patches that allow AF_RXRPC to support the AuriStor service
upgrade facility. This allows the server to change the service ID
requested to an upgraded service if the client requests it upon the
initiation of a connection.
This is used by the AuriStor AFS-compatible servers to implement IPv6
handling and improved facilities by providing improved volume location,
volume, protection, file and cache management services. Note that certain
parts of the AFS protocol carry hard-coded IPv4 addresses.
The reason AuriStor does it this way is that probing the improved service
ID first will not incur an ABORT or any other response on some servers if
the server is not listening on it - and so one have to employ a timeout.
This is implemented in the server by allowing an AF_RXRPC server to call
bind() twice on a socket to allow it to listen on two service IDs and then
call setsockopt() to instruct the server to upgrade one into the other if
the client requests it (by setting userStatus to 1 on the first DATA packet
on a connection). If the upgrade occurs, all further operations on that
connection are done with the new service ID. AF_RXRPC has to handle this
automatically as connections are not exposed to userspace.
Clients can request this facility by setting an RXRPC_UPGRADE_SERVICE
command in the sendmsg() control buffer and then observing the resultant
service ID in the msg_addr returned by recvmsg(). This should only be used
to probe the service. Clients should then use the returned service ID in
all subsequent communications with that server. Note that the kernel will
not retain this information should the connection expire from its cache.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fixes from Tejun Heo:
"Two cgroup fixes. One to address RCU delay of cpuset removal affecting
userland visible behaviors. The other fixes a race condition between
controller disable and cgroup removal"
* 'for-4.12-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cpuset: consider dying css as offline
cgroup: Prevent kill_css() from being called more than once
|
|
return void
Make return value void since functions never returns meaningfull value.
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
commit d91824c08fbc ("genetlink: register family ops as array") removed the
ops_list member from both genl_family and genl_ops; while the
documentation of genl_family was updated accordingly by this patch,
ops_list remained in the documentation of the genl_ops object.
This patch fixes it by removing ops_list from genl_ops documentation.
Signed-off-by: Rami Rosen <rami.rosen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Update tcp.txt to fix mandatory congestion control ops and default
CCA selection. Also, fix comment in tcp.h for undo_cwnd.
Signed-off-by: Anmol Sarma <me@anmolsarma.in>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Make it possible for a client to use AuriStor's service upgrade facility.
The client does this by adding an RXRPC_UPGRADE_SERVICE control message to
the first sendmsg() of a call. This takes no parameters.
When recvmsg() starts returning data from the call, the service ID field in
the returned msg_name will reflect the result of the upgrade attempt. If
the upgrade was ignored, srx_service will match what was set in the
sendmsg(); if the upgrade happened the srx_service will be altered to
indicate the service the server upgraded to.
Note that:
(1) The choice of upgrade service is up to the server
(2) Further client calls to the same server that would share a connection
are blocked if an upgrade probe is in progress.
(3) This should only be used to probe the service. Clients should then
use the returned service ID in all subsequent communications with that
server (and not set the upgrade). Note that the kernel will not
retain this information should the connection expire from its cache.
(4) If a server that supports upgrading is replaced by one that doesn't,
whilst a connection is live, and if the replacement is running, say,
OpenAFS 1.6.4 or older or an older IBM AFS, then the replacement
server will not respond to packets sent to the upgraded connection.
At this point, calls will time out and the server must be reprobed.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Implement AuriStor's service upgrade facility. There are three problems
that this is meant to deal with:
(1) Various of the standard AFS RPC calls have IPv4 addresses in their
requests and/or replies - but there's no room for including IPv6
addresses.
(2) Definition of IPv6-specific RPC operations in the standard operation
sets has not yet been achieved.
(3) One could envision the creation a new service on the same port that as
the original service. The new service could implement improved
operations - and the client could try this first, falling back to the
original service if it's not there.
Unfortunately, certain servers ignore packets addressed to a service
they don't implement and don't respond in any way - not even with an
ABORT. This means that the client must then wait for the call timeout
to occur.
What service upgrade does is to see if the connection is marked as being
'upgradeable' and if so, change the service ID in the server and thus the
request and reply formats. Note that the upgrade isn't mandatory - a
server that supports only the original call set will ignore the upgrade
request.
In the protocol, the procedure is then as follows:
(1) To request an upgrade, the first DATA packet in a new connection must
have the userStatus set to 1 (this is normally 0). The userStatus
value is normally ignored by the server.
(2) If the server doesn't support upgrading, the reply packets will
contain the same service ID as for the first request packet.
(3) If the server does support upgrading, all future reply packets on that
connection will contain the new service ID and the new service ID will
be applied to *all* further calls on that connection as well.
(4) The RPC op used to probe the upgrade must take the same request data
as the shadow call in the upgrade set (but may return a different
reply). GetCapability RPC ops were added to all standard sets for
just this purpose. Ops where the request formats differ cannot be
used for probing.
(5) The client must wait for completion of the probe before sending any
further RPC ops to the same destination. It should then use the
service ID that recvmsg() reported back in all future calls.
(6) The shadow service must have call definitions for all the operation
IDs defined by the original service.
To support service upgrading, a server should:
(1) Call bind() twice on its AF_RXRPC socket before calling listen().
Each bind() should supply a different service ID, but the transport
addresses must be the same. This allows the server to receive
requests with either service ID.
(2) Enable automatic upgrading by calling setsockopt(), specifying
RXRPC_UPGRADEABLE_SERVICE and passing in a two-member array of
unsigned shorts as the argument:
unsigned short optval[2];
This specifies a pair of service IDs. They must be different and must
match the service IDs bound to the socket. Member 0 is the service ID
to upgrade from and member 1 is the service ID to upgrade to.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Our previous patch (cited below) introduced a regression
for RAW Eth QPs.
Fix it by checking if the QP number provided by user-space
exists, hence allowing steering rules to be added for valid
QPs only.
Fixes: 89c557687a32 ("net/mlx4_en: Avoid adding steering rules with invalid ring")
Reported-by: Or Gerlitz <gerlitz.or@gmail.com>
Signed-off-by: Talat Batheesh <talatb@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The final addition on the qed front -
- VFs would now require their PFs to provide multiple CIDs
- Based on the availability of connections from PF, determine whether
XDP is feasible and share it with qede via dev_info.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This adds the infrastructure for supporting VFs that want to open
multiple transmission queues on the same queue-zone.
At this point, there are no VFs that actually request this functionality,
but later patches would remedy that.
a. VF and PF would communicate the capability during ACQUIRE;
Legacy VFs would continue on behaving as they do today
b. PF would communicate number of supported CIDs to the VF
and would enforce said limitation
c. Whenever VF passes a request for a given queue configuration
it would also pass an associated index within said queue-zone
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Part of an effort of a cleaner seperation between qed and the protocol
drivers, the L2 interface is to use the SB structure for initialization
purposes opaquely.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This is a defense-in-depth measure in response to bugs like
4d6fa57b4dab ("macsec: avoid heap overflow in skb_to_sgvec"). There's
not only a potential overflow of sglist items, but also a stack overflow
potential, so we fix this by limiting the amount of recursion this function
is allowed to do. Not actually providing a bounded base case is a future
disaster that we can easily avoid here.
As a small matter of house keeping, we take this opportunity to move the
documentation comment over the actual function the documentation is for.
While this could be implemented by using an explicit stack of skbuffs,
when implementing this, the function complexity increased considerably,
and I don't think such complexity and bloat is actually worth it. So,
instead I built this and tested it on x86, x86_64, ARM, ARM64, and MIPS,
and measured the stack usage there. I also reverted the recent MIPS
changes that give it a separate IRQ stack, so that I could experience
some worst-case situations. I found that limiting it to 24 layers deep
yielded a good stack usage with room for safety, as well as being much
deeper than any driver actually ever creates.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: David Howells <dhowells@redhat.com>
Cc: Sabrina Dubroca <sd@queasysnail.net>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Alexander reported various KASAN messages triggered in recent kernels
The problem is that ping sockets should not use udp_poll() in the first
place, and recent changes in UDP stack finally exposed this old bug.
Fixes: c319b4d76b9e ("net: ipv4: add IPPROTO_ICMP socket kind")
Fixes: 6d0bfe226116 ("net: ipv6: Add IPv6 support to the ping socket.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Sasha Levin <alexander.levin@verizon.com>
Cc: Solar Designer <solar@openwall.com>
Cc: Vasiliy Kulikov <segoon@openwall.com>
Cc: Lorenzo Colitti <lorenzo@google.com>
Acked-By: Lorenzo Colitti <lorenzo@google.com>
Tested-By: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This commit updates documentation of the bpf_perf_event_output and
bpf_perf_event_read helpers to match their implementation.
Signed-off-by: Teng Qin <qinteng@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Allow BPF_PROG_TYPE_PERF_EVENT program types to attach to all
perf_event types, including HW_CACHE, RAW, and dynamic pmu events.
Only tracepoint/kprobe events are treated differently which require
BPF_PROG_TYPE_TRACEPOINT/BPF_PROG_TYPE_KPROBE program types accordingly.
Also add support for reading all event counters using
bpf_perf_event_read() helper.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The command
# arp -s 62.2.0.1 a:b:c:d:e:f dev eth2
adds an entry like the following (listed by "arp -an")
? (62.2.0.1) at 0a:0b:0c:0d:0e:0f [ether] PERM on eth2
but the symmetric deletion command
# arp -i eth2 -d 62.2.0.1
does not remove the PERM entry from the table, and instead leaves behind
? (62.2.0.1) at <incomplete> on eth2
The reason is that there is a refcnt of 1 for the arp_tbl itself
(neigh_alloc starts off the entry with a refcnt of 1), thus
the neigh_release() call from arp_invalidate() will (at best) just
decrement the ref to 1, but will never actually free it from the
table.
To fix this, we need to do something like neigh_forced_gc: if
the refcnt is 1 (i.e., on the table's ref), remove the entry from
the table and free it. This patch refactors and shares common code
between neigh_forced_gc and the newly added neigh_remove_one.
A similar issue exists for IPv6 Neighbor Cache entries, and is fixed
in a similar manner by this patch.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Reviewed-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There was no reason for duplicating the code that initializes
ds->enabled_port_mask in both dsa_parse_ports_dn() and
dsa_parse_ports(), instead move this to dsa_ds_parse() which is early
enough before ops->setup() has run.
While at it, we can now make dsa_is_cpu_port() check ds->cpu_port_mask
which is a step towards being multi-CPU port capable.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Benefit from the support of ip header fields dissection and
allow users to set rules matching on ipv4 tos and ttl or
ipv6 traffic-class and hoplimit.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add support for dissection of ip tos and ttl and ipv6 traffic-class
and hoplimit. Both are dissected into the same struct.
Uses similar call to ip dissection function as with tcp, arp and others.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This config option is strictly speaking independent of the
media subsystem since it can be used by drm as well.
Besides, it looks odd when drivers select CEC_CORE and
MEDIA_CEC_NOTIFIER, that's inconsistent naming.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull rdma fixes from Doug Ledford:
"For the most part this is just a minor -rc cycle for the rdma
subsystem. Even given that this is all of the -rc patches since the
merge window closed, it's still only about 25 patches:
- Multiple i40iw, nes, iw_cxgb4, hfi1, qib, mlx4, mlx5 fixes
- A few upper layer protocol fixes (IPoIB, iSER, SRP)
- A modest number of core fixes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (26 commits)
RDMA/SA: Fix kernel panic in CMA request handler flow
RDMA/umem: Fix missing mmap_sem in get umem ODP call
RDMA/core: not to set page dirty bit if it's already set.
RDMA/uverbs: Declare local function static and add brackets to sizeof
RDMA/netlink: Reduce exposure of RDMA netlink functions
RDMA/srp: Fix NULL deref at srp_destroy_qp()
RDMA/IPoIB: Limit the ipoib_dev_uninit_default scope
RDMA/IPoIB: Replace netdev_priv with ipoib_priv for ipoib_get_link_ksettings
RDMA/qedr: add null check before pointer dereference
RDMA/mlx5: set UMR wqe fence according to HCA cap
net/mlx5: Define interface bits for fencing UMR wqe
RDMA/mlx4: Fix MAD tunneling when SRIOV is enabled
RDMA/qib,hfi1: Fix MR reference count leak on write with immediate
RDMA/hfi1: Defer setting VL15 credits to link-up interrupt
RDMA/hfi1: change PCI bar addr assignments to Linux API functions
RDMA/hfi1: fix array termination by appending NULL to attr array
RDMA/iw_cxgb4: fix the calculation of ipv6 header size
RDMA/iw_cxgb4: calculate t4_eq_status_entries properly
RDMA/iw_cxgb4: Avoid touch after free error in ARP failure handlers
RDMA/nes: ACK MPA Reply frame
...
|
|
Merge misc fixes from Andrew Morton:
"15 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
scripts/gdb: make lx-dmesg command work (reliably)
mm: consider memblock reservations for deferred memory initialization sizing
mm/hugetlb: report -EHWPOISON not -EFAULT when FOLL_HWPOISON is specified
mlock: fix mlock count can not decrease in race condition
mm/migrate: fix refcount handling when !hugepage_migration_supported()
dax: fix race between colliding PMD & PTE entries
mm: avoid spurious 'bad pmd' warning messages
mm/page_alloc.c: make sure OOM victim can try allocations with no watermarks once
pcmcia: remove left-over %Z format
slub/memcg: cure the brainless abuse of sysfs attributes
initramfs: fix disabling of initramfs (and its compression)
mm: clarify why we want kmalloc before falling backto vmallock
frv: declare jiffies to be located in the .data section
include/linux/gfp.h: fix ___GFP_NOLOCKDEP value
ksm: prevent crash after write_protect_page fails
|
|
We have seen an early OOM killer invocation on ppc64 systems with
crashkernel=4096M:
kthreadd invoked oom-killer: gfp_mask=0x16040c0(GFP_KERNEL|__GFP_COMP|__GFP_NOTRACK), nodemask=7, order=0, oom_score_adj=0
kthreadd cpuset=/ mems_allowed=7
CPU: 0 PID: 2 Comm: kthreadd Not tainted 4.4.68-1.gd7fe927-default #1
Call Trace:
dump_stack+0xb0/0xf0 (unreliable)
dump_header+0xb0/0x258
out_of_memory+0x5f0/0x640
__alloc_pages_nodemask+0xa8c/0xc80
kmem_getpages+0x84/0x1a0
fallback_alloc+0x2a4/0x320
kmem_cache_alloc_node+0xc0/0x2e0
copy_process.isra.25+0x260/0x1b30
_do_fork+0x94/0x470
kernel_thread+0x48/0x60
kthreadd+0x264/0x330
ret_from_kernel_thread+0x5c/0xa4
Mem-Info:
active_anon:0 inactive_anon:0 isolated_anon:0
active_file:0 inactive_file:0 isolated_file:0
unevictable:0 dirty:0 writeback:0 unstable:0
slab_reclaimable:5 slab_unreclaimable:73
mapped:0 shmem:0 pagetables:0 bounce:0
free:0 free_pcp:0 free_cma:0
Node 7 DMA free:0kB min:0kB low:0kB high:0kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:52428800kB managed:110016kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:320kB slab_unreclaimable:4672kB kernel_stack:1152kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
lowmem_reserve[]: 0 0 0 0
Node 7 DMA: 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB 0*8192kB 0*16384kB = 0kB
0 total pagecache pages
0 pages in swap cache
Swap cache stats: add 0, delete 0, find 0/0
Free swap = 0kB
Total swap = 0kB
819200 pages RAM
0 pages HighMem/MovableOnly
817481 pages reserved
0 pages cma reserved
0 pages hwpoisoned
the reason is that the managed memory is too low (only 110MB) while the
rest of the the 50GB is still waiting for the deferred intialization to
be done. update_defer_init estimates the initial memoty to initialize
to 2GB at least but it doesn't consider any memory allocated in that
range. In this particular case we've had
Reserving 4096MB of memory at 128MB for crashkernel (System RAM: 51200MB)
so the low 2GB is mostly depleted.
Fix this by considering memblock allocations in the initial static
initialization estimation. Move the max_initialise to
reset_deferred_meminit and implement a simple memblock_reserved_memory
helper which iterates all reserved blocks and sums the size of all that
start below the given address. The cumulative size is than added on top
of the initial estimation. This is still not ideal because
reset_deferred_meminit doesn't consider holes and so reservation might
be above the initial estimation whihch we ignore but let's make the
logic simpler until we really need to handle more complicated cases.
Fixes: 3a80a7fa7989 ("mm: meminit: initialise a subset of struct pages if CONFIG_DEFERRED_STRUCT_PAGE_INIT is set")
Link: http://lkml.kernel.org/r/20170531104010.GI27783@dhcp22.suse.cz
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Tested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org> [4.2+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
KVM uses get_user_pages() to resolve its stage2 faults. KVM sets the
FOLL_HWPOISON flag causing faultin_page() to return -EHWPOISON when it
finds a VM_FAULT_HWPOISON. KVM handles these hwpoison pages as a
special case. (check_user_page_hwpoison())
When huge pages are involved, this doesn't work so well.
get_user_pages() calls follow_hugetlb_page(), which stops early if it
receives VM_FAULT_HWPOISON from hugetlb_fault(), eventually returning
-EFAULT to the caller. The step to map this to -EHWPOISON based on the
FOLL_ flags is missing. The hwpoison special case is skipped, and
-EFAULT is returned to user-space, causing Qemu or kvmtool to exit.
Instead, move this VM_FAULT_ to errno mapping code into a header file
and use it from faultin_page() and follow_hugetlb_page().
With this, KVM works as expected.
This isn't a problem for arm64 today as we haven't enabled
MEMORY_FAILURE, but I can't see any reason this doesn't happen on x86
too, so I think this should be a fix. This doesn't apply earlier than
stable's v4.11.1 due to all sorts of cleanup.
[james.morse@arm.com: add vm_fault_to_errno() call to faultin_page()]
suggested.
Link: http://lkml.kernel.org/r/20170525171035.16359-1-james.morse@arm.com
[akpm@linux-foundation.org: coding-style fixes]
Link: http://lkml.kernel.org/r/20170524160900.28786-1-james.morse@arm.com
Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org> [4.11.1+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Commit 7c30f352c852 ("jiffies.h: declare jiffies and jiffies_64 with
____cacheline_aligned_in_smp") removed a section specification from the
jiffies declaration that caused conflicts on some platforms.
Unfortunately this change broke the build for frv:
kernel/built-in.o: In function `__do_softirq': (.text+0x6460): relocation truncated to fit: R_FRV_GPREL12 against symbol
`jiffies' defined in *ABS* section in .tmp_vmlinux1
kernel/built-in.o: In function `__do_softirq': (.text+0x6574): relocation truncated to fit: R_FRV_GPREL12 against symbol
`jiffies' defined in *ABS* section in .tmp_vmlinux1
kernel/built-in.o: In function `pwq_activate_delayed_work': workqueue.c:(.text+0x15b9c): relocation truncated to fit: R_FRV_GPREL12 against
symbol `jiffies' defined in *ABS* section in .tmp_vmlinux1
...
Add __jiffy_arch_data to the declaration of jiffies and use it on frv to
include the section specification. For all other platforms
__jiffy_arch_data (currently) has no effect.
Fixes: 7c30f352c852 ("jiffies.h: declare jiffies and jiffies_64 with ____cacheline_aligned_in_smp")
Link: http://lkml.kernel.org/r/20170516221333.177280-1-mka@chromium.org
Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: David Howells <dhowells@redhat.com>
Cc: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Igor Stoppa has noticed that __GFP_NOLOCKDEP can use a lower bit. At
the time commit 7e7844226f10 ("lockdep: allow to disable reclaim lockup
detection") was written we still had __GFP_OTHER_NODE but I have removed
it in commit 41b6167e8f74 ("mm: get rid of __GFP_OTHER_NODE") and forgot
to lower the bit value.
The current value is outside of __GFP_BITS_SHIFT so it cannot be used
actually.
Fixes: 7e7844226f10 ("lockdep: allow to disable reclaim lockup detection")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Igor Stoppa <igor.stoppa@nokia.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://people.freedesktop.org/~airlied/linux
Pull drm displayport quirk support:
"DP quirk for usb c dongles.
As mentioned I have a separate request for fixing a regression, but
also keeping the broken hw working, for certain USB-C DP adapters they
require a minimised n/m parameters, but an attempt to do this
generically has failed, we need to quirk these specific adapters.
However doing it generically regressed some eDP panels.
This pull adds the infrastructure and a quirk for the adapter"
* tag 'drm-dp-quirk-for-v4.12-rc4' of git://people.freedesktop.org/~airlied/linux:
drm/i915: Detect USB-C specific dongles before reducing M and N
drm/dp: start a DPCD based DP sink/branch device quirk database
drm/i915: use drm DP helper to read DPCD desc
drm/dp: add helper for reading DP sink/branch device desc from DPCD
|
|
Since last patch, sctp doesn't need to alloc memory for asoc->stream any
more. sctp_stream_new and sctp_stream_init both are used to alloc memory
for stream.in or stream.out, and their names are also confusing.
This patch is to merge them into sctp_stream_init, and only pass stream
and streamcnt parameters into it, instead of the whole asoc.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
As Marcelo's suggestion, stream is a fixed size member of asoc and would
not grow with more streams. To avoid an allocation for it, this patch is
to define it as an object instead of pointer and update the places using
it, also create sctp_stream_update() called in sctp_assoc_update() to
migrate the stream info from one stream to another.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:
====================
40GbE Intel Wired LAN Driver Updates 2017-06-01
This series contains updates to i40e, i40evf and the "new" AVF virtchnl.
This is the introduction of the Intel(R) Ethernet Adaptive Virtual
Function driver code and device ID, as presented at the NetDEV 1.2
conference in 2016.
http://netdevconf.org/1.2/session.html?anjali-singhai
The idea is to convert the interface between the i40evf driver
and the parent i40e PF driver to be generic, as the i40evf driver
should in the future be able to run on top of other Intel PF
drivers, and negotiate any features beyond a "base expected" set.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Enhance API between qedi and qed, allowing qedi to inform device's
firmware when the iSCSI mac is to be changed.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|