summaryrefslogtreecommitdiffstats
path: root/drivers
AgeCommit message (Collapse)AuthorFilesLines
2018-03-07net/mlx5: Export ipsec capabilitiesAviad Yehezkel4-24/+16
We will need that for ipsec verbs. Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-07net/mlx5: IPSec, Add command V2 supportAviad Yehezkel5-43/+63
This patch adds V2 command support. New fpga devices support extended features (udp encap, esn etc...), this features require new hardware sadb format therefore we have a new version of commands to manipulate it. Signed-off-by: Yossef Efraim <yossefe@mellanox.com> Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-07net/mlx5e: IPSec, Add support for ESP trailer removal by hardwareYossi Kuperman5-1/+68
Current hardware decrypts and authenticates incoming ESP packets. Subsequently, the software extracts the nexthdr field, truncates the trailer and adjusts csum accordingly. With this patch and a capable device, the trailer is being removed by the hardware and the nexthdr field is conveyed via PET. This way we avoid both the need to access the trailer (cache miss) and to compute its relative checksum, which significantly improve the performance. Experiment shows that trailer removal improves the performance by 2Gbps, (netperf). Both forwarding and host-to-host configurations. Signed-off-by: Yossi Kuperman <yossiku@mellanox.com> Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-07net/mlx5: IPSec, Generalize sandbox QP commandsYossi Kuperman1-51/+65
The current code assume only SA QP commands. Refactor in order to pave the way for new QP commands: 1. Generic cmd response format. 2. SA cmd checks are in dedicated functions. 3. Aligned debug prints. Signed-off-by: Yossi Kuperman <yossiku@mellanox.com> Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-07net/mlx5: Use MLX5_IPSEC_DEV macro for ipsec capsSaeed Mahameed1-2/+1
Fix build break of mlx5_accel_ipsec_device_caps is not defined when MLX5_ACCEL is not selected, use MLX5_IPSEC_DEV instead which handles such case. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reported-by: Doug Ledford <dledford@redhat.com>
2018-03-06net/mlx5: Flow steering cmd interface should get the fte when deletingAviad Yehezkel3-5/+5
Previously, deleting a flow steering entry only got the index. Since the FPGA implementation of FTE's deletion might need to dig inside the FTE itself, we would like to get the FTE's context. Changing the interface to pass the FTE context. Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-06{net,IB}/mlx5: Add flow steering helpersBoris Pismenny1-4/+3
Add helper functions that check if a protocol is part of a flow steering match criteria. Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-06net/mlx5: Embed mlx5_flow_act into fs_fteMatan Barak4-25/+21
fte objects contain the match value and action. Currently, extending the actions require in adding them both to the API and fs_fte. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-06net/mlx5: Add empty egress namespace to flow steering coreAviad Yehezkel4-0/+34
Currently, we don't support egress flow steering namespace in mlx5 flow steering core implementation. However, when we want to encrypt a packet, we model it as a flow steering rule in the egress path. To overcome this, we add an empty egress namespace to flow steering. This namespace is initialized only when ipsec support exists. In the future, this will grow to a full blown full steering implementation, resembling the ingress path. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-06net/mlx5: Add shim layer between fs and cmdMatan Barak4-100/+248
The shim layer allows each namespace to define possibly different functionality for add/delete/update commands. The shim layer introduced here, will be used to support flow steering with the FPGA. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com> Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-06{net,IB}/mlx5: Add has_tag to mlx5_flow_actMatan Barak3-2/+4
The has_tag member will indicate whether a tag action was specified in flow specification. A flow tag 0 = MLX5_FS_DEFAULT_FLOW_TAG is assumed a valid flow tag that is currently used by mlx5 RDMA driver, whereas in HW flow_tag = 0 means that the user doesn't care about flow_tag. HW always provide a flow_tag = 0 if all flow tags requested on a specific flow are 0. So we need a way (in the driver) to differentiate between a user really requesting flow_tag = 0 and a user who does not care, in order to be able to report conflicting flow tags on a specific flow. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Aviad Yehezkel <aviadye@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-06IB/mlx5: Pass mlx5_flow_act struct instead of multiple argumentsBoris Pismenny1-12/+8
Group and pass all function arguments of parse_flow_attr call in one common struct mlx5_flow_act. This patch passes all the action arguments of parse_flow_attr in one common struct mlx5_flow_act. It allows us to scale the number of actions without adding new arguments to the function. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Acked-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-06net/mlx5: FPGA and IPSec initialization to be before flow steeringMatan Barak1-19/+20
Some flow steering namespace initialization (i.e. egress namespace) might depend on FPGA capabilities. Changing the initialization order such that the FPGA will be initialized before flow steering. Flow steering fs cmds initialization might depend on IPSec capabilities. Changing the initialization order such that the IPSec will be initialized before flow steering as well. Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-06net/mlx5e: Removed not need synchronize_rcuAviad Yehezkel1-2/+2
This is already done by xfrm layer between state_dev_del callback to state_dev_free callback. Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-06net/mlx5e: Fixed sleeping inside atomic contextAviad Yehezkel1-9/+4
We can't allocate with GFP_KERNEL inside spinlock. Actually ida_simple doesn't require spinlock so remove it. Fixes: 547eede070eb ("net/mlx5e: IPSec, Innova IPSec offload infrastructure") Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-06net/mlx5e: Wait for FPGA command responses with a timeoutAviad Yehezkel1-3/+6
Generally, FPGA IPSec commands must always complete. We want to wait for one minute for them to complete gracefully also when killing a process. Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-06net/mlx5: Fixed compilation issue when CONFIG_MLX5_ACCEL is disabledAviad Yehezkel1-2/+2
IPSec init and cleanup functions also depends on linux/mlx5/driver.h. Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-06IB/mlx5: Removed not used parametersAviad Yehezkel2-5/+0
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Acked-by: Jason Gunthorpe <jgg@mellanox.com>
2018-02-23IB/mlx5: Disable self loopback check when in switchdev modeMark Bloch1-0/+8
When in switchdev mode, there is no need to do self loopback checks as we can't receive those packets, we insert steering rules to the eswitch that make sure packets can't be looped back. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-02-23net/mlx5: E-Switch, Reload IB interface when switching devlink modesMark Bloch4-17/+26
Up until this point it wasn't possible to activate IB representors when switching to switchdev mode, remove this limitation. We trigger reload of the PF IB interface in order to make sure that already allocated resources are invalid and new resources will be opened correctly with all the limitations of switchdev mode applied (only raw packet capabilities, without RoCE). We also move the remove/add to a place where the E-Switch mode is set/unset to better control when to trigger this action, this will allow the IB side to start in the correct mode. For better code reuse, create a function which reloads an interface and export it. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-02-23IB/mlx5: Add proper representors supportMark Bloch4-30/+192
This commit adds full support for IB representor: 1) Representors profile, We add two new profiles: nic_rep_profile - This profile will be used to create an IB device that represents the PF/UPLINK. rep_profile - This profile will be used to create an IB device that represents VFs. Each VF will be its own representor. 2) Proper load/unload callbacks, Those are called by the E-Switch when moving to/from switchdev mode. 3) Different flow DB handling for when we in switchdev mode. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-02-23IB/mlx5: E-Switch, Add rule to forward traffic to vportMark Bloch4-0/+45
In order to forward traffic from representor's SQ to the right virtual function, every time an SQ is created also add the corresponding flow rule to the FDB. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-02-23IB/mlx5: Don't expose MR cache in switchdev modeMark Bloch1-2/+3
When enabling many VFs and switching to switchdev mode, the total amount of mkeys we try to allocate when loading representors is very large and may cause timeouts on allocations, the same issues was observed on VFs and we employ the same fix that was done for them. We avoid allocating the full MR cache on load but still allow it to be manipulated once the IB device is loaded. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-02-23IB/mlx5: When in switchdev mode, expose only raw packet capabilitiesMark Bloch2-30/+124
Currently in switchdev mode we allow only for raw packet QPs. Expose the right capabilities and set the gid table length to 0, also make sure we don't try to enable RoCE, so split the function to enable RoCE so representors can enable only the notifier needed for net device events. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-02-23IB/mlx5: Listen to netdev register/unresiter events in switchdev modeMark Bloch2-3/+20
Currently we listen to netdev register/unregister event based on PCI device. When in switchdev mode PF and representors share the same PCI device, so in order to pair ib device and netdev in switchdev mode compare the netdev that triggered the event to that of the representor. Expose a function that lets you receive the netdev associated what a given representor. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-02-23IB/mlx5: Add match on vport when in switchdev modeMark Bloch1-0/+12
When we point to a representor, it means we are in switchdev mode. The flow db is shared between PF and virtual function representors so each rule created needs to have a match on its specific source port. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-02-23IB/mlx5: Allocate flow DB only on PF IB deviceMark Bloch2-14/+34
A flow DB is a shared resource between PF and representors, need to allocate it only when creating the PF IB device. Once we add IB representors, they will use the flow db which was created by the PF. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-02-23IB/mlx5: Add basic regiser/unregister representors codeMark Bloch5-0/+169
Create the basic infrastructure of registering and unregistering IB representors. The load/unload callbacks are left empty and proper implementation will be introduced in following patches. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-02-23net/mlx5: E-Switch, Optimize HW steering tables in switchdev modeMark Bloch2-7/+44
Under switchdev mode we insert an eswitch miss rule causing any unmatched traffic to be sent towards the PF vport. This miss rule can be optimized if we break it to two, one case is for multicast traffic and the other for unicast. Breaking the miss rule into two (unicast and multicast) allows the firmware to program the hardware in a more efficient way. Using ConncetX-5 Ex with IXIA and testpmd (which use IB representors): IXIA -> NIC -> PF -> IB representor -> NIC -> VF: - Without this optimization: 9.2 MPPS. - With this optimization: 18 MPPS. VF -> NIC -> IB representor-> PF -> NIC -> IXIA: - Without this optimization: 17 MPPS. - With this optimization: 23.4 MPPS. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-02-23net/mlx5: E-Switch, Increase number of FTEs in FDB in switchdev modeMark Bloch1-2/+3
The max FTE number should be the max number of SQs that can be opened. Ethernet representors open one SQ each. Once we add IB representor this will increase (depends on the user). For now lets start with 31 per IB representor and if needed increase in the future. This increase only affects the number of FTEs in the slow path FDB, offloaded rules (done via TC on the fast path portion of the FDB) aren't affected. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-02-23net/mlx5: E-Switch, Move representors definition to a global scopeMark Bloch4-49/+19
In preparation for IB representors, move representors structs to a global scope, also expose functions needed for registration, unregistration, eswitch mode and creating a flow rule to direct traffic from SQs to the right VF. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-02-23net/mlx5: E-Switch, Add callback to get representor deviceMark Bloch3-0/+40
Add a callback interface to get a protocol device (per representor type). The Ethernet representors will expose their netdev via this interface. This functionality can be later used by IB representor in order to find the corresponding net device representor. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-02-15IB/mlx5: Implement fragmented completion queue (CQ)Yonatan Cohen6-71/+87
The current implementation of create CQ requires contiguous memory, such requirement is problematic once the memory is fragmented or the system is low in memory, it causes for failures in dma_zalloc_coherent(). This patch implements new scheme of fragmented CQ to overcome this issue by introducing new type: 'struct mlx5_frag_buf_ctrl' to allocate fragmented buffers, rather than contiguous ones. Base the Completion Queues (CQs) on this new fragmented buffer. It fixes following crashes: kworker/29:0: page allocation failure: order:6, mode:0x80d0 CPU: 29 PID: 8374 Comm: kworker/29:0 Tainted: G OE 3.10.0 Workqueue: ib_cm cm_work_handler [ib_cm] Call Trace: [<>] dump_stack+0x19/0x1b [<>] warn_alloc_failed+0x110/0x180 [<>] __alloc_pages_slowpath+0x6b7/0x725 [<>] __alloc_pages_nodemask+0x405/0x420 [<>] dma_generic_alloc_coherent+0x8f/0x140 [<>] x86_swiotlb_alloc_coherent+0x21/0x50 [<>] mlx5_dma_zalloc_coherent_node+0xad/0x110 [mlx5_core] [<>] ? mlx5_db_alloc_node+0x69/0x1b0 [mlx5_core] [<>] mlx5_buf_alloc_node+0x3e/0xa0 [mlx5_core] [<>] mlx5_buf_alloc+0x14/0x20 [mlx5_core] [<>] create_cq_kernel+0x90/0x1f0 [mlx5_ib] [<>] mlx5_ib_create_cq+0x3b0/0x4e0 [mlx5_ib] Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-02-15net/mlx5: Remove redundant EQ API exportsSaeed Mahameed2-3/+17
EQ structure and API is private to mlx5_core driver only, external drivers should not have access or the means to manipulate EQ objects. Remove redundant exports and move API functions out of the linux/mlx5 include directory into the driver's mlx5_core.h private include file. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Gal Pressman <galp@mellanox.com>
2018-02-15net/mlx5: Move CQ completion and event forwarding logic to eq.cSaeed Mahameed2-47/+47
Since CQ tree is now per EQ, CQ completion and event forwarding became specific implementation of EQ logic, this patch moves that logic to eq.c and makes those functions static. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Gal Pressman <galp@mellanox.com>
2018-02-15net/mlx5: CQ hold/put APISaeed Mahameed1-23/+19
Now as the CQ table is per EQ, add an API to hold/put CQ to be used from eq.c in downstream patch. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Gal Pressman <galp@mellanox.com>
2018-02-15net/mlx5: EQ add/del CQ APISaeed Mahameed3-45/+53
Add API to add/del CQ to/from EQs CQ table to be used in cq.c upon CQ creation/destruction, as CQ table is now private to eq.c. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Gal Pressman <galp@mellanox.com>
2018-02-15net/mlx5: Add missing likely/unlikely hints to cq eventsSaeed Mahameed1-3/+3
If a hardware event is targeting a CQ, that CQ should exist. Add unlikely to error handling flows. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Gal Pressman <galp@mellanox.com>
2018-02-15net/mlx5: CQ Database per EQSaeed Mahameed3-35/+52
Before this patch the driver had one CQ database protected via one spinlock, this spinlock is meant to synchronize between CQ adding/removing and CQ IRQ interrupt handling. On a system with large number of CPUs and on a work load that requires lots of interrupts, this global spinlock becomes a very nasty hotspot and introduces a contention between the active cores, which will significantly hurt performance and becomes a bottleneck that prevents seamless cpu scaling. To solve this we simply move the CQ database and its spinlock to be per EQ (IRQ), thus per core. Tested with: system: 2 sockets, 14 cores per socket, hyperthreading, 2x14x2=56 cores netperf command: ./super_netperf 200 -P 0 -t TCP_RR -H <server> -l 30 -- -r 300,300 -o -s 1M,1M -S 1M,1M WITHOUT THIS PATCH: Average: CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle Average: all 4.32 0.00 36.15 0.09 0.00 34.02 0.00 0.00 0.00 25.41 Samples: 2M of event 'cycles:pp', Event count (approx.): 1554616897271 Overhead Command Shared Object Symbol + 14.28% swapper [kernel.vmlinux] [k] intel_idle + 12.25% swapper [kernel.vmlinux] [k] queued_spin_lock_slowpath + 10.29% netserver [kernel.vmlinux] [k] queued_spin_lock_slowpath + 1.32% netserver [kernel.vmlinux] [k] mlx5e_xmit WITH THIS PATCH: Average: CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle Average: all 4.27 0.00 34.31 0.01 0.00 18.71 0.00 0.00 0.00 42.69 Samples: 2M of event 'cycles:pp', Event count (approx.): 1498132937483 Overhead Command Shared Object Symbol + 23.33% swapper [kernel.vmlinux] [k] intel_idle + 1.69% netserver [kernel.vmlinux] [k] mlx5e_xmit Tested-by: Song Liu <songliubraving@fb.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Gal Pressman <galp@mellanox.com>
2018-02-11vfs: do bulk POLL* -> EPOLL* replacementLinus Torvalds187-510/+510
This is the mindless scripted replacement of kernel use of POLL* variables as described by Al, done by this script: for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'` for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done done with de-mangling cleanups yet to come. NOTE! On almost all architectures, the EPOLL* constants have the same values as the POLL* constants do. But they keyword here is "almost". For various bad reasons they aren't the same, and epoll() doesn't actually work quite correctly in some cases due to this on Sparc et al. The next patch from Al will sort out the final differences, and we should be all done. Scripted-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-11Merge branch 'work.poll2' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull more poll annotation updates from Al Viro: "This is preparation to solving the problems you've mentioned in the original poll series. After this series, the kernel is ready for running for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'` for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done done as a for bulk search-and-replace. After that, the kernel is ready to apply the patch to unify {de,}mangle_poll(), and then get rid of kernel-side POLL... uses entirely, and we should be all done with that stuff. Basically, that's what you suggested wrt KPOLL..., except that we can use EPOLL... instead - they already are arch-independent (and equal to what is currently kernel-side POLL...). After the preparations (in this series) switch to returning EPOLL... from ->poll() instances is completely mechanical and kernel-side POLL... can go away. The last step (killing kernel-side POLL... and unifying {de,}mangle_poll() has to be done after the search-and-replace job, since we need userland-side POLL... for unified {de,}mangle_poll(), thus the cherry-pick at the last step. After that we will have: - POLL{IN,OUT,...} *not* in __poll_t, so any stray instances of ->poll() still using those will be caught by sparse. - eventpoll.c and select.c warning-free wrt __poll_t - no more kernel-side definitions of POLL... - userland ones are visible through the entire kernel (and used pretty much only for mangle/demangle) - same behavior as after the first series (i.e. sparc et.al. epoll(2) working correctly)" * 'work.poll2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: annotate ep_scan_ready_list() ep_send_events_proc(): return result via esed->res preparation to switching ->poll() to returning EPOLL... add EPOLLNVAL, annotate EPOLL... and event_poll->event use linux/poll.h instead of asm/poll.h xen: fix poll misannotation smc: missing poll annotations
2018-02-10Merge tag 'for-linus-20180210' of git://git.kernel.dk/linux-blockLinus Torvalds9-36/+111
Pull block fixes from Jens Axboe: "A few fixes to round off the merge window on the block side: - a set of bcache fixes by way of Michael Lyle, from the usual bcache suspects. - add a simple-to-hook-into function for bpf EIO error injection. - fix blk-wbt that mischarectized flushes as reads. Improve the logic so that flushes and writes are accounted as writes, and only reads as reads. From me. - fix requeue crash in BFQ, from Paolo" * tag 'for-linus-20180210' of git://git.kernel.dk/linux-block: block, bfq: add requeue-request hook bcache: fix for data collapse after re-attaching an attached device bcache: return attach error when no cache set exist bcache: set writeback_rate_update_seconds in range [1, 60] seconds bcache: fix for allocator and register thread race bcache: set error_limit correctly bcache: properly set task state in bch_writeback_thread() bcache: fix high CPU occupancy during journal bcache: add journal statistic block: Add should_fail_bio() for bpf error injection blk-wbt: account flush requests correctly
2018-02-10Merge tag 'platform-drivers-x86-v4.16-3' of git://github.com/dvhart/linux-pdx86Linus Torvalds2-11/+363
Pull x86 platform driver updates from Darren Hart: "Mellanox fixes and new system type support. Mostly data for new system types with a correction and an uninitialized variable fix" [ Pulling from github because git.infradead.org currently seems to be down for some reason, but Darren had a backup location - Linus ] * tag 'platform-drivers-x86-v4.16-3' of git://github.com/dvhart/linux-pdx86: platform/x86: mlx-platform: Add support for new 200G IB and Ethernet systems platform/x86: mlx-platform: Add support for new msn201x system type platform/x86: mlx-platform: Add support for new msn274x system type platform/x86: mlx-platform: Fix power cable setting for msn21xx family platform/x86: mlx-platform: Add define for the negative bus platform/x86: mlx-platform: Use defines for bus assignment platform/mellanox: mlxreg-hotplug: Fix uninitialized variable
2018-02-10Merge tag 'chrome-platform-for-linus-4.16' of ↵Linus Torvalds4-19/+57
git://git.kernel.org/pub/scm/linux/kernel/git/bleung/chrome-platform Pull chrome platform updates from Benson Leung: - move cros_ec_dev to drivers/mfd - other small maintenance fixes [ The cros_ec_dev movement came in earlier through the MFD tree - Linus ] * tag 'chrome-platform-for-linus-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/bleung/chrome-platform: platform/chrome: Use proper protocol transfer function platform/chrome: cros_ec_lpc: Add support for Google Glimmer platform/chrome: cros_ec_lpc: Register the driver if ACPI entry is missing. platform/chrome: cros_ec_lpc: remove redundant pointer request cros_ec: fix nul-termination for firmware build info platform/chrome: chromeos_laptop: make chromeos_laptop const
2018-02-10Merge tag 'kvm-4.16-1' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds8-2/+1017
Pull KVM updates from Radim Krčmář: "ARM: - icache invalidation optimizations, improving VM startup time - support for forwarded level-triggered interrupts, improving performance for timers and passthrough platform devices - a small fix for power-management notifiers, and some cosmetic changes PPC: - add MMIO emulation for vector loads and stores - allow HPT guests to run on a radix host on POWER9 v2.2 CPUs without requiring the complex thread synchronization of older CPU versions - improve the handling of escalation interrupts with the XIVE interrupt controller - support decrement register migration - various cleanups and bugfixes. s390: - Cornelia Huck passed maintainership to Janosch Frank - exitless interrupts for emulated devices - cleanup of cpuflag handling - kvm_stat counter improvements - VSIE improvements - mm cleanup x86: - hypervisor part of SEV - UMIP, RDPID, and MSR_SMI_COUNT emulation - paravirtualized TLB shootdown using the new KVM_VCPU_PREEMPTED bit - allow guests to see TOPOEXT, GFNI, VAES, VPCLMULQDQ, and more AVX512 features - show vcpu id in its anonymous inode name - many fixes and cleanups - per-VCPU MSR bitmaps (already merged through x86/pti branch) - stable KVM clock when nesting on Hyper-V (merged through x86/hyperv)" * tag 'kvm-4.16-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (197 commits) KVM: PPC: Book3S: Add MMIO emulation for VMX instructions KVM: PPC: Book3S HV: Branch inside feature section KVM: PPC: Book3S HV: Make HPT resizing work on POWER9 KVM: PPC: Book3S HV: Fix handling of secondary HPTEG in HPT resizing code KVM: PPC: Book3S PR: Fix broken select due to misspelling KVM: x86: don't forget vcpu_put() in kvm_arch_vcpu_ioctl_set_sregs() KVM: PPC: Book3S PR: Fix svcpu copying with preemption enabled KVM: PPC: Book3S HV: Drop locks before reading guest memory kvm: x86: remove efer_reload entry in kvm_vcpu_stat KVM: x86: AMD Processor Topology Information x86/kvm/vmx: do not use vm-exit instruction length for fast MMIO when running nested kvm: embed vcpu id to dentry of vcpu anon inode kvm: Map PFN-type memory regions as writable (if possible) x86/kvm: Make it compile on 32bit and with HYPYERVISOR_GUEST=n KVM: arm/arm64: Fixup userspace irqchip static key optimization KVM: arm/arm64: Fix userspace_irqchip_in_use counting KVM: arm/arm64: Fix incorrect timer_is_pending logic MAINTAINERS: update KVM/s390 maintainers MAINTAINERS: add Halil as additional vfio-ccw maintainer MAINTAINERS: add David as a reviewer for KVM/s390 ...
2018-02-09Merge tag 'kbuild-v4.16-2' of ↵Linus Torvalds4-4/+0
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull more Kbuild updates from Masahiro Yamada: "Makefile changes: - enable unused-variable warning that was wrongly disabled for clang Kconfig changes: - warn about blank 'help' and fix existing instances - fix 'choice' behavior to not write out invisible symbols - fix misc weirdness Coccinell changes: - fix false positive of free after managed memory alloc detection - improve performance of NULL dereference detection" * tag 'kbuild-v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (21 commits) kconfig: remove const qualifier from sym_expand_string_value() kconfig: add xrealloc() helper kconfig: send error messages to stderr kconfig: echo stdin to stdout if either is redirected kconfig: remove check_stdin() kconfig: remove 'config*' pattern from .gitignnore kconfig: show '?' prompt even if no help text is available kconfig: do not write choice values when their dependency becomes n coccinelle: deref_null: avoid useless computation coccinelle: devm_free: reduce false positives kbuild: clang: disable unused variable warnings only when constant kconfig: Warn if help text is blank nios2: kconfig: Remove blank help text arm: vt8500: kconfig: Remove blank help text MIPS: kconfig: Remove blank help text MIPS: BCM63XX: kconfig: Remove blank help text lib/Kconfig.debug: Remove blank help text Staging: rtl8192e: kconfig: Remove blank help text Staging: rtl8192u: kconfig: Remove blank help text mmc: kconfig: Remove blank help text ...
2018-02-09platform/x86: mlx-platform: Add support for new 200G IB and Ethernet systemsVadim Pasternak1-0/+142
It adds support for new Mellanox system types of basic classes qmb7, sn34, sn37, containing systems QMB700 (40x200GbE InfiniBand switch), SN3700 (32x200GbE and 16x400GbE Ethernet switch) and SN3410 (6x400GbE plus 48x50GbE Ethernet switch). These are the Top of the Rack systems, equipped with Mellanox COM-Express carrier board and switch board with Mellanox Quantum device, which supports InfiniBand switching with 40X200G ports and line rate of up to HDR speed or with Mellanox Spectrum-2 device, which supports Ethernet switching with 32X200G ports line rate of up to HDR speed. Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
2018-02-09platform/x86: mlx-platform: Add support for new msn201x system typeVadim Pasternak1-0/+59
It adds support for new Mellanox system types of basic half unit size class msn201x, containing system MSN2010 (18x10GbE plus 4x4x25GbE) half and its derivatives. This is the Top of the Rack system, equipped with Mellanox Small Form Factor carrier board and switch board with Mellanox Spectrum device, which supports Ethernet switching with 32X100G ports line rate of up to EDR speed. Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
2018-02-09platform/x86: mlx-platform: Add support for new msn274x system typeVadim Pasternak1-0/+124
It adds support for new Mellanox system types of basic class msn274x, containing system MSN2740 (32x100GbE Ethernet switch with cost reduction) and its derivatives. These are the Top of the Rack system, equipped with Mellanox Small Form Factor carrier board and switch board with Mellanox Spectrum device, which supports Ethernet switching with 32X100G ports line rate of up to EDR speed. Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
2018-02-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds51-100/+330
Pull networking fixes from David Miller: 1) Make allocations less aggressive in x_tables, from Minchal Hocko. 2) Fix netfilter flowtable Kconfig deps, from Pablo Neira Ayuso. 3) Fix connection loss problems in rtlwifi, from Larry Finger. 4) Correct DRAM dump length for some chips in ath10k driver, from Yu Wang. 5) Fix ABORT handling in rxrpc, from David Howells. 6) Add SPDX tags to Sun networking drivers, from Shannon Nelson. 7) Some ipv6 onlink handling fixes, from David Ahern. 8) Netem packet scheduler interval calcualtion fix from Md. Islam. 9) Don't put crypto buffers on-stack in rxrpc, from David Howells. 10) Fix handling of error non-delivery status in netlink multicast delivery over multiple namespaces, from Nicolas Dichtel. 11) Missing xdp flush in tuntap driver, from Jason Wang. 12) Synchonize RDS protocol netns/module teardown with rds object management, from Sowini Varadhan. 13) Add nospec annotations to mpls, from Dan Williams. 14) Fix SKB truesize handling in TIPC, from Hoang Le. 15) Interrupt masking fixes in stammc from Niklas Cassel. 16) Don't allow ptr_ring objects to be sized outside of kmalloc's limits, from Jason Wang. 17) Don't allow SCTP chunks to be built which will have a length exceeding the chunk header's 16-bit length field, from Alexey Kodanev. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (82 commits) ibmvnic: Remove skb->protocol checks in ibmvnic_xmit bpf: fix rlimit in reuseport net selftest sctp: verify size of a new chunk in _sctp_make_chunk() s390/qeth: fix SETIP command handling s390/qeth: fix underestimated count of buffer elements ptr_ring: try vmalloc() when kmalloc() fails ptr_ring: fail early if queue occupies more than KMALLOC_MAX_SIZE net: stmmac: remove redundant enable of PMT irq net: stmmac: rename GMAC_INT_DEFAULT_MASK for dwmac4 net: stmmac: discard disabled flags in interrupt status register ibmvnic: Reset long term map ID counter tools/libbpf: handle issues with bpf ELF objects containing .eh_frames selftests/bpf: add selftest that use test_libbpf_open selftests/bpf: add test program for loading BPF ELF files tools/libbpf: improve the pr_debug statements to contain section numbers bpf: Sync kernel ABI header with tooling header for bpf_common.h net: phy: fix phy_start to consider PHY_IGNORE_INTERRUPT net: thunder: change q_len's type to handle max ring size tipc: fix skb truesize/datasize ratio control net/sched: cls_u32: fix cls_u32 on filter replace ...