| Age | Commit message | Author | Files | Lines | 
|---|
|  | This functionality is not used by the IIO subsystem. Due
to removal of legacy API it can also be removed.
Signed-off-by: Sebastian Reichel <sre@kernel.org> | 
|  | This struct is no longer used by anything in the kernel.
Signed-off-by: Sebastian Reichel <sre@kernel.org> | 
|  | All madc users have been converted to IIO API, so drop the
legacy API. The function is still used inside of the driver.
Signed-off-by: Sebastian Reichel <sre@kernel.org> | 
|  | Drop legacy twl4030_get_madc_conversion() method. It has been
used by drivers to get madc data before the conversion to IIO
API. There are no users in the mainline kernel anymore.
Signed-off-by: Sebastian Reichel <sre@kernel.org> | 
|  | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner:
 - Prevent double activation of interrupt lines, which causes problems
   on certain interrupt controllers
 - Handle the fallout of the above because x86 (ab)uses the activation
   function to reconfigure interrupts under the hood.
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/irq: Make irq activate operations symmetric
  irqdomain: Avoid activating interrupts more than once | 
|  | git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
 "Here are two bugfixes that resolve some reported issues. One in the
  firmware loader, that should fix the much-reported problem of crashes
  with it. The other is a hyperv fix for a reported regression.
  Both have been in linux-next for a week or so with no reported issues"
* tag 'char-misc-4.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  Drivers: hv: vmbus: finally fix hv_need_to_signal_on_read()
  firmware: fix NULL pointer dereference in __fw_load_abort() | 
|  | Reading a sysfs "memoryN/valid_zones" file leads to the following oops
when the first page of a range is not backed by struct page.
show_valid_zones() assumes that 'start_pfn' is always valid for
page_zone().
 BUG: unable to handle kernel paging request at ffffea017a000000
 IP: show_valid_zones+0x6f/0x160
This issue may happen on x86-64 systems with 64GiB or more memory since
their memory block size is bumped up to 2GiB.  [1] An example of such
systems is described below.  0x3240000000 is only aligned to 1GiB and
this memory block starts from 0x3200000000, which is not backed by
struct page.
 BIOS-e820: [mem 0x0000003240000000-0x000000603fffffff] usable
Since test_pages_in_a_zone() already checks holes, fix this issue by
extending this function to return 'valid_start' and 'valid_end' for a
given range.  show_valid_zones() then proceeds with the valid range.
[1] 'Commit bdee237c0343 ("x86: mm: Use 2GB memory block size on
    large-memory x86-64 systems")'
Link: http://lkml.kernel.org/r/20170127222149.30893-3-toshi.kani@hpe.com
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Zhang Zhen <zhenzhang.zhang@huawei.com>
Cc: Reza Arbab <arbab@linux.vnet.ibm.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: <stable@vger.kernel.org>	[4.4+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> | 
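The alignment arithmetic in the report above can be checked with a small sketch. It assumes the 2GiB memory block size described in the message; the helper name `block_start` and the `GIB` constant are illustrative and are not kernel code.

```c
#include <assert.h>
#include <stdint.h>

#define GIB (UINT64_C(1) << 30)

/* Hypothetical helper: round an address down to the start of its memory
 * block. block_size must be a power of two. */
static uint64_t block_start(uint64_t addr, uint64_t block_size)
{
    return addr & ~(block_size - 1);
}

/* Self-check using the e820 range quoted in the commit message. */
static void check_block_start(void)
{
    const uint64_t usable_start = UINT64_C(0x3240000000);

    assert(usable_start % GIB == 0);        /* aligned to 1GiB */
    assert(usable_start % (2 * GIB) != 0);  /* but not to 2GiB */
    /* The enclosing 2GiB block begins before the usable range does. */
    assert(block_start(usable_start, 2 * GIB) == UINT64_C(0x3200000000));
}
```

The gap between 0x3200000000 and 0x3240000000 is the part of the block that, per the message, is not backed by struct page.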
|  | git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
 "Another fixes pull for v4.10, it's a bit big due to the backport of
  the VMA fixes for i915 that should fix the oops on shutdown problems
  that you've worked around.
  There are also two drm core connector registration fixes, a bunch of
  nouveau regression fixes and two AMD fixes"
* tag 'drm-fixes-for-v4.10-rc7' of git://people.freedesktop.org/~airlied/linux:
  drm/radeon: Fix vram_size/visible values in DRM_RADEON_GEM_INFO ioctl
  drm/amdgpu/si: fix crash on headless asics
  drm/i915: Track pinned vma in intel_plane_state
  drm/atomic: Unconditionally call prepare_fb.
  drm/atomic: Fix double free in drm_atomic_state_default_clear
  drm/nouveau/kms/nv50: request vblank events for commits that send completion events
  drm/nouveau/nv1a,nv1f/disp: fix memory clock rate retrieval
  drm/nouveau/disp/gt215: Fix HDA ELD handling (thus, HDMI audio) on gt215
  drm/nouveau/nouveau/led: prevent compiling the led-code if nouveau=y and leds=m
  drm/nouveau/disp/mcp7x: disable dptmds workaround
  drm/nouveau: prevent userspace from deleting client object
  drm/nouveau/fence/g84-: protect against concurrent access to semaphore buffers
  drm: Don't race connector registration
  drm: prevent double-(un)registration for connectors | 
|  | Merge kcrctab entry fixes from Ard Biesheuvel:
 "This is a followup to [0] 'modversions: redefine kcrctab entries as
  relative CRC pointers', but since relative CRC pointers do not work in
  modules, and are actually only needed by powerpc with
  CONFIG_RELOCATABLE=y, I have made it a Kconfig selectable feature
  instead.
  First it introduces the MODULE_REL_CRCS Kconfig symbol, and adds the
  kbuild handling of it, i.e., modpost, genksyms and kallsyms.
  Then it switches all architectures to 32-bit CRC entries in kcrctab,
  where all architectures except powerpc with CONFIG_RELOCATABLE=y use
  absolute ELF symbol references as before"
[0] http://marc.info/?l=linux-arch&m=148493613415294&w=2
* emailed patches from Ard Biesheuvel:
  module: unify absolute krctab definitions for 32-bit and 64-bit
  modversions: treat symbol CRCs as 32 bit quantities
  kbuild: modversions: add infrastructure for emitting relative CRCs | 
|  | The function order_base_2() is defined (according to the comment block)
as returning zero on input zero, but subsequently passes the input into
roundup_pow_of_two(), which is explicitly undefined for input zero.
This has gone unnoticed until now, but optimization passes in GCC 7 may
produce constant folded function instances where a constant value of
zero is passed into order_base_2(), resulting in link errors against the
deliberately undefined '____ilog2_NaN'.
So update order_base_2() to adhere to its own documented interface.
[ See
     http://marc.info/?l=linux-kernel&m=147672952517795&w=2
  and follow-up discussion for more background. The gcc "optimization
  pass" is really just broken, but now the GCC trunk problem seems to
  have escaped out of just specially built daily images, so we need to
  work around it in mainline.    - Linus ]
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> | 
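As a rough illustration of the documented interface, the following is a portable sketch of a function that returns the order of the smallest power of two not less than its input, with inputs 0 and 1 both mapping to 0. The name `order_base_2_sketch` is hypothetical and the loop is only a stand-in; it is not the kernel's macro or its ilog2()-based implementation.

```c
#include <assert.h>

/* Hypothetical stand-in: smallest order such that (1UL << order) >= n,
 * with input 0 handled explicitly instead of being passed on to a
 * rounding helper that is undefined for zero. */
static unsigned int order_base_2_sketch(unsigned long n)
{
    unsigned int order = 0;

    if (n <= 1)
        return 0;

    /* Count the bits needed to represent n - 1. */
    for (n -= 1; n != 0; n >>= 1)
        order++;

    return order;
}

static void check_order_base_2_sketch(void)
{
    assert(order_base_2_sketch(0) == 0);
    assert(order_base_2_sketch(1) == 0);
    assert(order_base_2_sketch(2) == 1);
    assert(order_base_2_sketch(3) == 2);
    assert(order_base_2_sketch(4) == 2);
    assert(order_base_2_sketch(5) == 3);
}
```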
|  | The previous patch introduced a separate inline asm version of the
kcrctab declaration template for use with 64-bit architectures, which
cannot refer to ELF symbols using 32-bit quantities.
This declaration should be equivalent to the C one for 32-bit
architectures, but just in case - unify them in a separate patch, which
can simply be dropped if it turns out to break anything.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> | 
|  | The modversion symbol CRCs are emitted as ELF symbols, which allows us
to easily populate the kcrctab sections by relying on the linker to
associate each kcrctab slot with the correct value.
This has a couple of downsides:
 - Given that the CRCs are treated as memory addresses, we waste 4 bytes
   for each CRC on 64 bit architectures,
 - On architectures that support runtime relocation, a R_<arch>_RELATIVE
   relocation entry is emitted for each CRC value, which identifies it
   as a quantity that requires fixing up based on the actual runtime
   load offset of the kernel. This results in corrupted CRCs unless we
   explicitly undo the fixup (and this is currently being handled in the
   core module code)
 - Such runtime relocation entries take up 24 bytes of __init space
   each, resulting in a x8 overhead in [uncompressed] kernel size for
   CRCs.
Switching to explicit 32 bit values on 64 bit architectures fixes most
of these issues, given that 32 bit values are not treated as quantities
that require fixing up based on the actual runtime load offset.  Note
that on some ELF64 architectures [such as PPC64], these 32-bit values
are still emitted as [absolute] runtime relocatable quantities, even if
the value resolves to a build time constant.  Since relative relocations
are always resolved at build time, this patch enables MODULE_REL_CRCS on
powerpc when CONFIG_RELOCATABLE=y, which turns the absolute CRC
references into relative references into .rodata where the actual CRC
value is stored.
So redefine all CRC fields and variables as u32, and redefine the
__CRC_SYMBOL() macro for 64 bit builds to emit the CRC reference using
inline assembler (which is necessary since 64-bit C code cannot use
32-bit types to hold memory addresses, even if they are ultimately
resolved using values that do not exceed 0xffffffff).  To avoid
potential problems with legacy 32-bit architectures using legacy
toolchains, the equivalent C definition of the kcrctab entry is retained
for 32-bit architectures.
Note that this mostly reverts commit d4703aefdbc8 ("module: handle ppc64
relocating kcrctabs when CONFIG_RELOCATABLE=y")
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> | 
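One plausible reading of the "x8" figure above is sketched below: an 8-byte kcrctab slot plus a 24-byte relocation entry, compared against the 4 bytes a CRC actually needs. The struct mirrors the standard three-field ELF64 RELA layout; the names and the interpretation of the ratio are assumptions, not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Same field layout as an ELF64 RELA relocation entry (illustrative copy). */
struct rela64_sketch {
    uint64_t r_offset;
    uint64_t r_info;
    int64_t  r_addend;
};

enum {
    CRC_PAYLOAD_BYTES = 4,  /* a CRC is a 32-bit value */
    SLOT_BYTES_64BIT  = 8   /* slot size when the CRC is stored as an address */
};

/* Bytes consumed per CRC when each slot also needs a relocation entry. */
static unsigned int bytes_per_crc_with_reloc(void)
{
    return SLOT_BYTES_64BIT + (unsigned int)sizeof(struct rela64_sketch);
}

static void check_crc_overhead(void)
{
    assert(sizeof(struct rela64_sketch) == 24);
    assert(bytes_per_crc_with_reloc() == 32);
    assert(bytes_per_crc_with_reloc() / CRC_PAYLOAD_BYTES == 8);
}
```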
|  | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "Five kernel fixes:
   - an mmap tracing ABI fix for certain mappings
   - a use-after-free fix, found via KASAN
   - three CPU hotplug related x86 PMU driver fixes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel/uncore: Make package handling more robust
  perf/x86/intel/uncore: Clean up hotplug conversion fallout
  perf/x86/intel/rapl: Make package handling more robust
  perf/core: Fix PERF_RECORD_MMAP2 prot/flags for anonymous memory
  perf/core: Fix use-after-free bug | 
|  | Pull networking fixes from David Miller:
 1) Fix handling of interrupt status in stmmac driver. Just because we
    have masked the event from generating interrupts, doesn't mean the
    bit won't still be set in the interrupt status register. From Alexey
    Brodkin.
 2) Fix DMA API debugging splats in gianfar driver, from Arseny Solokha.
 3) Fix off-by-one error in __ip6_append_data(), from Vlad Yasevich.
 4) cls_flower does not match on icmpv6 codes properly, from Simon Horman.
 5) Initial MAC address can be set incorrectly in some scenarios, from
    Ivan Vecera.
 6) Packet header pointer arithmetic fix in ip6_tnl_parse_tlv_enc_lim(),
    from Dan Carpenter.
 7) Fix divide by zero in __tcp_select_window(), from Eric Dumazet.
 8) Fix crash in iwlwifi when unregistering thermal zone, from Jens
    Axboe.
 9) Check for DMA mapping errors in starfire driver, from Alexey
    Khoroshilov.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (31 commits)
  tcp: fix 0 divide in __tcp_select_window()
  ipv6: pointer math error in ip6_tnl_parse_tlv_enc_lim()
  net: fix ndo_features_check/ndo_fix_features comment ordering
  net/sched: matchall: Fix configuration race
  be2net: fix initial MAC setting
  ipv6: fix flow labels when the traffic class is non-0
  net: thunderx: avoid dereferencing xcv when NULL
  net/sched: cls_flower: Correct matching on ICMPv6 code
  ipv6: Paritially checksum full MTU frames
  net/mlx4_core: Avoid command timeouts during VF driver device shutdown
  gianfar: synchronize DMA API usage by free_skb_rx_queue w/ gfar_new_page
  net: ethtool: add support for 2500BaseT and 5000BaseT link modes
  can: bcm: fix hrtimer/tasklet termination in bcm op removal
  net: adaptec: starfire: add checks for dma mapping errors
  net: phy: micrel: KSZ8795 do not set SUPPORTED_[Asym_]Pause
  can: Fix kernel panic at security_sock_rcv_skb
  net: macb: Fix 64 bit addressing support for GEM
  stmmac: Discard masked flags in interrupt status register
  net/mlx5e: Check ets capability before ets query FW command
  net/mlx5e: Fix update of hash function/key via ethtool
  ... | 
|  | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull fscache fixes from Al Viro.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fscache: Fix dead object requeue
  fscache: Clear outstanding writes when disabling a cookie
  FS-Cache: Initialise stores_lock in netfs cookie | 
|  | Commit cdba756f5803a2 ("net: move ndo_features_check() close to
ndo_start_xmit()") inadvertently moved the doc comment for
.ndo_fix_features instead of .ndo_features_check. Fix the comment
ordering.
Fixes: cdba756f5803a2 ("net: move ndo_features_check() close to ndo_start_xmit()")
Signed-off-by: Dimitris Michailidis <dmichail@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net> | 
|  | The package management code in uncore relies on package mapping being
available before a CPU is started. This changed with:
  9d85eb9119f4 ("x86/smpboot: Make logical package management more robust")
because the ACPI/BIOS information turned out to be unreliable, but that
left uncore in broken state. This was not noticed because on a regular boot
all CPUs are online before uncore is initialized.
Move the allocation to the CPU online callback and simplify the hotplug
handling. At this point the package mapping is established and correct.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Yasuaki Ishimatsu <yasu.isimatu@gmail.com>
Fixes: 9d85eb9119f4 ("x86/smpboot: Make logical package management more robust")
Link: http://lkml.kernel.org/r/20170131230141.377156255@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org> | 
|  | The package management code in RAPL relies on package mapping being
available before a CPU is started. This changed with:
  9d85eb9119f4 ("x86/smpboot: Make logical package management more robust")
because the ACPI/BIOS information turned out to be unreliable, but that
left RAPL in broken state. This was not noticed because on a regular boot
all CPUs are online before RAPL is initialized.
A possible fix would be to reintroduce the mess which allocates a package
data structure in CPU prepare and when it turns out to already exist in
starting throw it away later in the CPU online callback. But that's a
horrible hack and not required at all because RAPL becomes functional for
perf only in the CPU online callback. That's correct because user space is
not yet informed about the CPU being onlined, so nothing can rely on RAPL
being available on that particular CPU.
Move the allocation to the CPU online callback and simplify the hotplug
handling. At this point the package mapping is established and correct.
This also adds a missing check for available package data in the
event_init() function.
Reported-by: Yasuaki Ishimatsu <yasu.isimatu@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Fixes: 9d85eb9119f4 ("x86/smpboot: Make logical package management more robust")
Link: http://lkml.kernel.org/r/20170131230141.212593966@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org> | 
|  | git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
Pull percpu fix from Tejun Heo:
 "Douglas found and fixed a ref leak bug in percpu_ref_tryget[_live]().
  The bug is caused by storing the return value of atomic_long_inc_not_zero()
  into an int temp variable before returning it as a bool. The interim
  cast to int loses the upper bits and can lead to false negatives. As
  percpu_ref uses a high bit to mark a draining counter, this can happen
  relatively easily.
  Fixed by using bool for the temp variable"
* 'for-4.10-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
  percpu-refcount: fix reference leak during percpu-atomic transition | 
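The truncation described in this pull request can be modeled in isolation. The sketch below uses fixed-width unsigned types so the narrowing is well defined in portable C; the function names and the `HIGH_BIT_RESULT` value are hypothetical and only stand in for a "nonzero means success" result whose set bits all lie above bit 31.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result: nonzero, but only a high bit is set. */
#define HIGH_BIT_RESULT (UINT64_C(1) << 63)

/* Buggy pattern: the wide result passes through a 32-bit temporary. */
static bool tryget_via_narrow_temp(uint64_t result)
{
    uint32_t tmp = (uint32_t)result;  /* upper 32 bits are discarded */

    return tmp;                       /* 0 becomes false although result != 0 */
}

/* Fixed pattern: convert directly to bool. */
static bool tryget_via_bool(uint64_t result)
{
    bool ret = result;                /* any nonzero value becomes true */

    return ret;
}

static void check_tryget_models(void)
{
    assert(tryget_via_narrow_temp(HIGH_BIT_RESULT) == false); /* false negative */
    assert(tryget_via_bool(HIGH_BIT_RESULT) == true);
    assert(tryget_via_narrow_temp(1) == true);  /* small values hide the bug */
}
```

In the model, a false negative means the caller believes no reference was taken and never drops it, which is the leak the pull request describes.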
|  | Under some circumstances, an fscache object can become queued such that
fscache_object_work_func() can be called once the object is in the
OBJECT_DEAD state.  This results in the kernel oopsing when it tries to
invoke the handler for the state (which is hard coded to 0x2).
The way this comes about is something like the following:
 (1) The object dispatcher is processing a work state for an object.  This
     is done in workqueue context.
 (2) An out-of-band event comes in that isn't masked, causing the object to
     be queued, say EV_KILL.
 (3) The object dispatcher finishes processing the current work state on
     that object and then sees there's another event to process, so,
     without returning to the workqueue core, it processes that event too.
     It then follows the chain of events that initiates until we reach
     OBJECT_DEAD without going through a wait state (such as
     WAIT_FOR_CLEARANCE).
     At this point, object->events may be 0, object->event_mask will be 0
     and oob_event_mask will be 0.
 (4) The object dispatcher returns to the workqueue processor, and in due
     course, this sees that the object's work item is still queued and
     invokes it again.
 (5) The current state is a work state (OBJECT_DEAD), so the dispatcher
     jumps to it - resulting in an OOPS.
When I'm seeing this, the work state in (1) appears to have been either
LOOK_UP_OBJECT or CREATE_OBJECT (object->oob_table is
fscache_osm_lookup_oob).
The window for (2) is very small:
 (A) object->event_mask is cleared whilst the event dispatch process is
     underway - though there's no memory barrier to force this to the top
     of the function.
     The window, therefore is from the time the object was selected by the
     workqueue processor and made requeueable to the time the mask was
     cleared.
 (B) fscache_raise_event() will only queue the object if it manages to set
     the event bit and the corresponding event_mask bit was set.
     The enqueuement is then deferred slightly whilst we get a ref on the
     object and get the per-CPU variable for workqueue congestion.  This
     slight deferral slightly increases the probability by allowing extra
     time for the workqueue to make the item requeueable.
Handle this by giving the dead state a processor function and checking
for the dead state address rather than seeing if the processor function is
address 0x2.  The dead state processor function can then set a flag to
indicate that it's occurred and give a warning if it occurs more than once
per object.
If this race occurs, an oops similar to the following is seen (note the RIP
value):
BUG: unable to handle kernel NULL pointer dereference at 0000000000000002
IP: [<0000000000000002>] 0x1
PGD 0
Oops: 0010 [#1] SMP
Modules linked in: ...
CPU: 17 PID: 16077 Comm: kworker/u48:9 Not tainted 3.10.0-327.18.2.el7.x86_64 #1
Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 12/27/2015
Workqueue: fscache_object fscache_object_work_func [fscache]
task: ffff880302b63980 ti: ffff880717544000 task.ti: ffff880717544000
RIP: 0010:[<0000000000000002>]  [<0000000000000002>] 0x1
RSP: 0018:ffff880717547df8  EFLAGS: 00010202
RAX: ffffffffa0368640 RBX: ffff880edf7a4480 RCX: dead000000200200
RDX: 0000000000000002 RSI: 00000000ffffffff RDI: ffff880edf7a4480
RBP: ffff880717547e18 R08: 0000000000000000 R09: dfc40a25cb3a4510
R10: dfc40a25cb3a4510 R11: 0000000000000400 R12: 0000000000000000
R13: ffff880edf7a4510 R14: ffff8817f6153400 R15: 0000000000000600
FS:  0000000000000000(0000) GS:ffff88181f420000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000002 CR3: 000000000194a000 CR4: 00000000001407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Stack:
 ffffffffa0363695 ffff880edf7a4510 ffff88093f16f900 ffff8817faa4ec00
 ffff880717547e60 ffffffff8109d5db 00000000faa4ec18 0000000000000000
 ffff8817faa4ec18 ffff88093f16f930 ffff880302b63980 ffff88093f16f900
Call Trace:
 [<ffffffffa0363695>] ? fscache_object_work_func+0xa5/0x200 [fscache]
 [<ffffffff8109d5db>] process_one_work+0x17b/0x470
 [<ffffffff8109e4ac>] worker_thread+0x21c/0x400
 [<ffffffff8109e290>] ? rescuer_thread+0x400/0x400
 [<ffffffff810a5acf>] kthread+0xcf/0xe0
 [<ffffffff810a5a00>] ? kthread_create_on_node+0x140/0x140
 [<ffffffff816460d8>] ret_from_fork+0x58/0x90
 [<ffffffff810a5a00>] ? kthread_create_on_node+0x140/0x140
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Jeremy McNicoll <jeremymc@redhat.com>
Tested-by: Frank Sorenson <sorenson@redhat.com>
Tested-by: Benjamin Coddington <bcodding@redhat.com>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> | 
|  | ip6_make_flowlabel() determines the flow label for IPv6 packets. It's
supposed to be passed a flow label, which it returns as is if non-0 and
in some other cases, otherwise it calculates a new value.
The problem is callers often pass a flowi6.flowlabel, which may also
contain traffic class bits. If the traffic class is non-0
ip6_make_flowlabel() mistakes the non-0 it gets as a flow label and
returns the whole thing. Thus it can return a 'flow label' longer than
20b and the low 20b of that is typically 0 resulting in packets with 0
label. Moreover, different packets of a flow may be labeled differently.
For a TCP flow with ECN non-payload and payload packets get different
labels as exemplified by this pair of consecutive packets:
(pure ACK)
Internet Protocol Version 6, Src: 2002:af5:11a3::, Dst: 2002:af5:11a2::
    0110 .... = Version: 6
    .... 0000 0000 .... .... .... .... .... = Traffic Class: 0x00 (DSCP: CS0, ECN: Not-ECT)
        .... 0000 00.. .... .... .... .... .... = Differentiated Services Codepoint: Default (0)
        .... .... ..00 .... .... .... .... .... = Explicit Congestion Notification: Not ECN-Capable Transport (0)
    .... .... .... 0001 1100 1110 0100 1001 = Flow Label: 0x1ce49
    Payload Length: 32
    Next Header: TCP (6)
(payload)
Internet Protocol Version 6, Src: 2002:af5:11a3::, Dst: 2002:af5:11a2::
    0110 .... = Version: 6
    .... 0000 0010 .... .... .... .... .... = Traffic Class: 0x02 (DSCP: CS0, ECN: ECT(0))
        .... 0000 00.. .... .... .... .... .... = Differentiated Services Codepoint: Default (0)
        .... .... ..10 .... .... .... .... .... = Explicit Congestion Notification: ECN-Capable Transport codepoint '10' (2)
    .... .... .... 0000 0000 0000 0000 0000 = Flow Label: 0x00000
    Payload Length: 688
    Next Header: TCP (6)
This patch allows ip6_make_flowlabel() to be passed more than just a
flow label and has it extract the part it really wants. This was simpler
than modifying the callers. With this patch packets like the above become
Internet Protocol Version 6, Src: 2002:af5:11a3::, Dst: 2002:af5:11a2::
    0110 .... = Version: 6
    .... 0000 0000 .... .... .... .... .... = Traffic Class: 0x00 (DSCP: CS0, ECN: Not-ECT)
        .... 0000 00.. .... .... .... .... .... = Differentiated Services Codepoint: Default (0)
        .... .... ..00 .... .... .... .... .... = Explicit Congestion Notification: Not ECN-Capable Transport (0)
    .... .... .... 1010 1111 1010 0101 1110 = Flow Label: 0xafa5e
    Payload Length: 32
    Next Header: TCP (6)
Internet Protocol Version 6, Src: 2002:af5:11a3::, Dst: 2002:af5:11a2::
    0110 .... = Version: 6
    .... 0000 0010 .... .... .... .... .... = Traffic Class: 0x02 (DSCP: CS0, ECN: ECT(0))
        .... 0000 00.. .... .... .... .... .... = Differentiated Services Codepoint: Default (0)
        .... .... ..10 .... .... .... .... .... = Explicit Congestion Notification: ECN-Capable Transport codepoint '10' (2)
    .... .... .... 1010 1111 1010 0101 1110 = Flow Label: 0xafa5e
    Payload Length: 688
    Next Header: TCP (6)
Signed-off-by: Dimitris Michailidis <dmichail@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net> | 
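The masking idea in this commit can be sketched on a host-order 32-bit word in which the flow label occupies the low 20 bits and the traffic class the 8 bits above it. All names here are hypothetical, byte order handling is ignored, and the "some other cases" mentioned in the message are left out; this is a simplified model, not the kernel function.

```c
#include <assert.h>
#include <stdint.h>

#define FLOWLABEL_MASK_SKETCH UINT32_C(0x000FFFFF)  /* low 20 bits */
#define TCLASS_SHIFT_SKETCH   20

/* Pre-fix model: any nonzero input is treated as a flow label. */
static uint32_t label_before_fix(uint32_t flowinfo, uint32_t computed)
{
    return flowinfo ? flowinfo : computed;
}

/* Post-fix model: extract the 20-bit label before testing it. */
static uint32_t label_after_fix(uint32_t flowinfo, uint32_t computed)
{
    uint32_t label = flowinfo & FLOWLABEL_MASK_SKETCH;

    return label ? label : (computed & FLOWLABEL_MASK_SKETCH);
}

static void check_flowlabel_models(void)
{
    /* Traffic class 0x02 (ECT(0)) with no flow label, as in the payload packet. */
    const uint32_t flowinfo = UINT32_C(0x02) << TCLASS_SHIFT_SKETCH;
    const uint32_t computed = UINT32_C(0xafa5e);

    /* Before: the traffic class bits are returned and the low 20 bits are 0. */
    assert((label_before_fix(flowinfo, computed) & FLOWLABEL_MASK_SKETCH) == 0);
    /* After: the computed label is used regardless of the traffic class. */
    assert(label_after_fix(flowinfo, computed) == UINT32_C(0xafa5e));
    assert(label_after_fix(0, computed) == UINT32_C(0xafa5e));
}
```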
|  | Commit a389fcfd2cb5 ("Drivers: hv: vmbus: Fix signaling logic in
hv_need_to_signal_on_read()")
added the proper mb(), but removed the test "prev_write_sz < pending_sz"
when making the signal decision.
As a result, the guest can signal the host unnecessarily,
and then the host can throttle the guest because the host
thinks the guest is buggy or malicious; finally the user
running stress test can perceive intermittent freeze of
the guest.
This patch brings back the test, and properly handles the
in-place consumption APIs used by NetVSC (see get_next_pkt_raw(),
put_pkt_raw() and commit_rd_index()).
Fixes: a389fcfd2cb5 ("Drivers: hv: vmbus: Fix signaling logic in
hv_need_to_signal_on_read()")
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reported-by: Rolf Neugebauer <rolf.neugebauer@docker.com>
Tested-by: Rolf Neugebauer <rolf.neugebauer@docker.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> | 
|  | This patch introduces support for 2500BaseT and 5000BaseT link modes.
These modes are included in the new IEEE 802.3bz standard.
Signed-off-by: Pavel Belous <pavel.s.belous@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net> | 
|  | Since commit f3b0946d629c ("genirq/msi: Make sure PCI MSIs are
activated early"), we can end up activating a PCI/MSI twice (once
at allocation time, and once at startup time).
This is normally of no consequence, except that there is some
HW out there that may misbehave if activate is used more than once
(the GICv3 ITS, for example, uses the activate callback
to issue the MAPVI command, and the architecture spec says that
"If there is an existing mapping for the EventID-DeviceID
combination, behavior is UNPREDICTABLE").
While this could be worked around in each individual driver, it may
make more sense to tackle the issue at the core level. In order to
avoid getting in that situation, let's have a per-interrupt flag
to remember if we have already activated that interrupt or not.
Fixes: f3b0946d629c ("genirq/msi: Make sure PCI MSIs are activated early")
Reported-and-tested-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1484668848-24361-1-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> | 
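A per-interrupt "already activated" flag of the kind described above might look roughly like the following. The structure, field, and function names are invented for illustration and do not correspond to the kernel's irq data structures or helpers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-interrupt state. */
struct irq_sketch {
    bool activated;                  /* remembers whether activate already ran */
    unsigned int hw_activate_calls;  /* counts calls that reach the "hardware" */
};

/* Stand-in for a controller callback that must not run twice in a row. */
static void hw_activate(struct irq_sketch *irq)
{
    irq->hw_activate_calls++;
}

static void activate_irq(struct irq_sketch *irq)
{
    if (irq->activated)
        return;                      /* second activation is a no-op */
    hw_activate(irq);
    irq->activated = true;
}

static void deactivate_irq(struct irq_sketch *irq)
{
    if (!irq->activated)
        return;
    irq->activated = false;          /* allows a later re-activation */
}

static void check_activate_once(void)
{
    struct irq_sketch irq = { false, 0 };

    activate_irq(&irq);              /* e.g. at allocation time */
    activate_irq(&irq);              /* e.g. again at startup time */
    assert(irq.hw_activate_calls == 1);

    deactivate_irq(&irq);
    activate_irq(&irq);
    assert(irq.hw_activate_calls == 2);
}
```

Keeping the flag in the core rather than in each driver matches the commit's stated preference for tackling the issue at the core level.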
|  | I was under the misconception that the sysfs dev stuff can be fully
set up, and then registered all in one step with device_add. That's
true for properties and property groups, but not for parents and child
devices. Those must be fully registered before you can register a
child.
Add a bit of tracking to make sure that asynchronous mst connector
hotplugging gets this right. For consistency we rely upon the implicit
barriers of the connector->mutex, which is taken anyway, to ensure
that at least either the connector or device registration call will
work out.
Mildly tested since I can't reliably reproduce this on my mst box
here.
Reported-by: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1484237756-2720-1-git-send-email-daniel.vetter@ffwll.ch | 
|  | If we're unlucky then the registration from a hotplugged connector
might race with the final registration step on driver load. And since
MST topology discovery is asynchronous that's even somewhat likely.
v2: Also update the kerneldoc for @registered!
v3: Review from Chris:
- Improve kerneldoc for late_register/early_unregister callbacks.
- Use mutex_destroy.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161218133545.2106-1-daniel.vetter@ffwll.ch
(cherry picked from commit e73ab00e9a0f1731f34d0620a9c55f5c30c4ad4e) | 
|  | Zhang Yanmin reported crashes [1] and provided a patch adding a
synchronize_rcu() call in can_rx_unregister().
The main problem seems to be that the sockets themselves are not RCU
protected.
If CAN uses RCU for delivery, then sockets should be freed only after
one RCU grace period.
Recent kernels could use sock_set_flag(sk, SOCK_RCU_FREE), but let's
ease stable backports with the following fix instead.
[1]
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff81495e25>] selinux_socket_sock_rcv_skb+0x65/0x2a0
Call Trace:
 <IRQ>
 [<ffffffff81485d8c>] security_sock_rcv_skb+0x4c/0x60
 [<ffffffff81d55771>] sk_filter+0x41/0x210
 [<ffffffff81d12913>] sock_queue_rcv_skb+0x53/0x3a0
 [<ffffffff81f0a2b3>] raw_rcv+0x2a3/0x3c0
 [<ffffffff81f06eab>] can_rcv_filter+0x12b/0x370
 [<ffffffff81f07af9>] can_receive+0xd9/0x120
 [<ffffffff81f07beb>] can_rcv+0xab/0x100
 [<ffffffff81d362ac>] __netif_receive_skb_core+0xd8c/0x11f0
 [<ffffffff81d36734>] __netif_receive_skb+0x24/0xb0
 [<ffffffff81d37f67>] process_backlog+0x127/0x280
 [<ffffffff81d36f7b>] net_rx_action+0x33b/0x4f0
 [<ffffffff810c88d4>] __do_softirq+0x184/0x440
 [<ffffffff81f9e86c>] do_softirq_own_stack+0x1c/0x30
 <EOI>
 [<ffffffff810c76fb>] do_softirq.part.18+0x3b/0x40
 [<ffffffff810c8bed>] do_softirq+0x1d/0x20
 [<ffffffff81d30085>] netif_rx_ni+0xe5/0x110
 [<ffffffff8199cc87>] slcan_receive_buf+0x507/0x520
 [<ffffffff8167ef7c>] flush_to_ldisc+0x21c/0x230
 [<ffffffff810e3baf>] process_one_work+0x24f/0x670
 [<ffffffff810e44ed>] worker_thread+0x9d/0x6f0
 [<ffffffff810e4450>] ? rescuer_thread+0x480/0x480
 [<ffffffff810ebafc>] kthread+0x12c/0x150
 [<ffffffff81f9ccef>] ret_from_fork+0x3f/0x70
Reported-by: Zhang Yanmin <yanmin.zhang@intel.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net> | 
|  | Pull NFS client bugfixes from Trond Myklebust:
 "Stable patches:
   - NFSv4.1: Fix a deadlock in layoutget
   - NFSv4 must not bump sequence ids on NFS4ERR_MOVED errors
   - NFSv4 Fix a regression with OPEN EXCLUSIVE4 mode
   - Fix a memory leak when removing the SUNRPC module
  Bugfixes:
   - Fix a reference leak in _pnfs_return_layout"
* tag 'nfs-for-4.10-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  pNFS: Fix a reference leak in _pnfs_return_layout
  nfs: Fix "Don't increment lock sequence ID after NFS4ERR_MOVED"
  SUNRPC: cleanup ida information when removing sunrpc module
  NFSv4.0: always send mode in SETATTR after EXCLUSIVE4
  nfs: Don't increment lock sequence ID after NFS4ERR_MOVED
  NFSv4.1: Fix a deadlock in layoutget | 
|  | git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull ARC fixes from Vineet Gupta:
 "Hopefully last set of changes for ARC for 4.10:
   - fix for unaligned access emulation corner case
   - fix for udelay loop inline asm regression
   - fix irq affinity finally for AXS103 board [Yuriy]
   - final fixes for setting IO-coherency sanely in SMP"
* tag 'arc-4.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  ARC: [arcompact] handle unaligned access delay slot corner case
  ARCv2: smp-boot: wake_flag polling by non-Masters needs to be uncached
  ARC: smp-boot: Decouple Non masters waiting API from jump to entry point
  ARCv2: MCIP: update the BCR per current changes
  ARC: udelay: fix inline assembler by adding LP_COUNT to clobber list
  ARCv2: MCIP: Deprecate setting of affinity in Device Tree | 
|  | percpu_ref_tryget() and percpu_ref_tryget_live() should return
"true" IFF they acquire a reference. But the return value from
atomic_long_inc_not_zero() is a long and may have high bits set,
e.g. PERCPU_COUNT_BIAS, and the return value of the tryget routines
is bool so the reference may actually be acquired but the routines
return "false" which results in a reference leak since the caller
assumes it does not need to do a corresponding percpu_ref_put().
This was seen when performing CPU hotplug during I/O, as hangs in
blk_mq_freeze_queue_wait where percpu_ref_kill (blk_mq_freeze_queue_start)
raced with percpu_ref_tryget (blk_mq_timeout_work).
Sample stack trace:
__switch_to+0x2c0/0x450
__schedule+0x2f8/0x970
schedule+0x48/0xc0
blk_mq_freeze_queue_wait+0x94/0x120
blk_mq_queue_reinit_work+0xb8/0x180
blk_mq_queue_reinit_prepare+0x84/0xa0
cpuhp_invoke_callback+0x17c/0x600
cpuhp_up_callbacks+0x58/0x150
_cpu_up+0xf0/0x1c0
do_cpu_up+0x120/0x150
cpu_subsys_online+0x64/0xe0
device_online+0xb4/0x120
online_store+0xb4/0xc0
dev_attr_store+0x68/0xa0
sysfs_kf_write+0x80/0xb0
kernfs_fop_write+0x17c/0x250
__vfs_write+0x6c/0x1e0
vfs_write+0xd0/0x270
SyS_write+0x6c/0x110
system_call+0x38/0xe0
Examination of the queue showed a single reference (no PERCPU_COUNT_BIAS,
and __PERCPU_REF_DEAD, __PERCPU_REF_ATOMIC set) and no requests.
However, conditions at the time of the race are count of PERCPU_COUNT_BIAS + 0
and __PERCPU_REF_DEAD and __PERCPU_REF_ATOMIC set.
The fix is to make the tryget routines use an actual boolean internally instead
of the atomic long result truncated to an int.
Fixes: e625305b3907 percpu-refcount: make percpu_ref based on longs instead of ints
Link: https://bugzilla.kernel.org/show_bug.cgi?id=190751
Signed-off-by: Douglas Miller <dougmill@linux.vnet.ibm.com>
Reviewed-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: e625305b3907 ("percpu-refcount: make percpu_ref based on longs instead of ints")
Cc: stable@vger.kernel.org # v3.18+ | 
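
The truncation described in the percpu_ref commit above can be reproduced in plain C. This is a minimal sketch, not the kernel code: `fake_inc_not_zero()` and `FAKE_COUNT_BIAS` are hypothetical stand-ins for atomic_long_inc_not_zero() and PERCPU_COUNT_BIAS, and the primitive is assumed to report success by returning the previous non-zero 64-bit counter value. A `uint32_t` is used for the narrowed result so that the truncation is well defined in standard C.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: stand-in for PERCPU_COUNT_BIAS, a value with only a high bit set. */
#define FAKE_COUNT_BIAS (UINT64_C(1) << 63)

/*
 * Hypothetical model of an inc-not-zero primitive that reports success
 * by returning the previous (non-zero) 64-bit counter value.
 */
static uint64_t fake_inc_not_zero(uint64_t *count)
{
	uint64_t old = *count;

	if (old != 0)
		*count = old + 1;
	return old;
}

/* Buggy pattern: the 64-bit result is narrowed to 32 bits before it becomes a bool. */
static bool tryget_truncating(uint64_t *count)
{
	uint32_t ret = (uint32_t)fake_inc_not_zero(count);

	return ret;	/* false whenever the low 32 bits are all zero */
}

/* Fixed pattern: compute a real boolean before anything can be truncated. */
static bool tryget_boolean(uint64_t *count)
{
	bool ret = fake_inc_not_zero(count) != 0;

	return ret;
}
```

With a counter equal to `FAKE_COUNT_BIAS`, `tryget_truncating()` increments the counter yet returns false, so a caller would skip the matching put and leak the reference; `tryget_boolean()` returns true for the same input.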
|  | Pull networking fixes from David Miller:
 1) GTP fixes from Andreas Schultz (missing genl module alias, clear IP
    DF on transmit).
 2) Netfilter needs to reflect the fwmark when sending resets, from Pau
    Espin Pedrol.
 3) nftable dump OOPS fix from Liping Zhang.
 4) Fix erroneous setting of VIRTIO_NET_HDR_F_DATA_VALID on transmit,
    from Rolf Neugebauer.
 5) Fix build error of ipt_CLUSTERIP when procfs is disabled, from Arnd
    Bergmann.
 6) Fix regression in handling of NETIF_F_SG in harmonize_features(),
    from Eric Dumazet.
 7) Fix RTNL deadlock wrt. lwtunnel module loading, from David Ahern.
 8) tcp_fastopen_create_child() needs to setup tp->max_window, from
    Alexey Kodanev.
 9) Missing kmemdup() failure check in ipv6 segment routing code, from
    Eric Dumazet.
10) Don't execute unix_bind() under the bindlock, otherwise we deadlock
    with splice. From WANG Cong.
11) ip6_tnl_parse_tlv_enc_lim() potentially reallocates the skb buffer,
    therefore callers must reload cached header pointers into that skb.
    Fix from Eric Dumazet.
12) Fix various bugs in legacy IRQ fallback handling in alx driver, from
    Tobias Regnery.
13) Do not allow lwtunnel drivers to be unloaded while they are
    referenced by active instances, from Robert Shearman.
14) Fix truncated PHY LED trigger names, from Geert Uytterhoeven.
15) Fix a few regressions from virtio_net XDP support, from John
    Fastabend and Jakub Kicinski.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (102 commits)
  ISDN: eicon: silence misleading array-bounds warning
  net: phy: micrel: add support for KSZ8795
  gtp: fix cross netns recv on gtp socket
  gtp: clear DF bit on GTP packet tx
  gtp: add genl family modules alias
  tcp: don't annotate mark on control socket from tcp_v6_send_response()
  ravb: unmap descriptors when freeing rings
  virtio_net: reject XDP programs using header adjustment
  virtio_net: use dev_kfree_skb for small buffer XDP receive
  r8152: check rx after napi is enabled
  r8152: re-schedule napi for tx
  r8152: avoid start_xmit to schedule napi when napi is disabled
  r8152: avoid start_xmit to call napi_schedule during autosuspend
  net: dsa: Bring back device detaching in dsa_slave_suspend()
  net: phy: leds: Fix truncated LED trigger names
  net: phy: leds: Break dependency of phy.h on phy_led_triggers.h
  net: phy: leds: Clear phy_num_led_triggers on failure to avoid crash
  net-next: ethernet: mediatek: change the compatible string
  Documentation: devicetree: change the mediatek ethernet compatible string
  bnxt_en: Fix RTNL lock usage on bnxt_get_port_module_status().
  ... | 
|  | git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull rdma fixes from Doug Ledford:
 "Second round of -rc fixes for 4.10.
  This -rc cycle has been slow for the rdma subsystem. I had already
  sent you the first batch before the Holiday break. After that, we kept
  only getting a few here or there. Up until this week, when I got a
  drop of 13 to one driver (qedr). So, here's the -rc patches I have. I
  currently have none held in reserve, so unless something new comes in,
  this is it until the next merge window opens.
  Summary:
   - series of iw_cxgb4 fixes to make it work with the drain cq API
   - one or two patches each to: srp, iser, cxgb3, vmw_pvrdma, umem,
     rxe, and ipoib
   - one big series (13 patches) for the new qedr driver"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (27 commits)
  RDMA/cma: Fix unknown symbol when CONFIG_IPV6 is not enabled
  IB/rxe: Prevent from completer to operate on non valid QP
  IB/rxe: Fix rxe dev insertion to rxe_dev_list
  IB/umem: Release pid in error and ODP flow
  RDMA/qedr: Dispatch port active event from qedr_add
  RDMA/qedr: Fix and simplify memory leak in PD alloc
  RDMA/qedr: Fix RDMA CM loopback
  RDMA/qedr: Fix formatting
  RDMA/qedr: Mark three functions as static
  RDMA/qedr: Don't reset QP when queues aren't flushed
  RDMA/qedr: Don't spam dmesg if QP is in error state
  RDMA/qedr: Remove CQ spinlock from CM completion handlers
  RDMA/qedr: Return max inline data in QP query result
  RDMA/qedr: Return success when not changing QP state
  RDMA/qedr: Add uapi header qedr-abi.h
  RDMA/qedr: Fix MTU returned from QP query
  RDMA/core: Add the function ib_mtu_int_to_enum
  IB/vmw_pvrdma: Fix incorrect cleanup on pvrdma_pci_probe error path
  IB/vmw_pvrdma: Don't leak info from alloc_ucontext
  IB/cxgb3: fix misspelling in header guard
  ... | 
|  | git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:
 - fix a regression on tvp5150 causing failures at input selection and
   image glitches
 - CEC was moved out of staging for v4.10. Fix some bugs on it while not
   too late
 - fix a regression on pctv452e caused by VM stack changes
 - fix suspend issues with smiapp
 - fix a regression on cobalt driver
 - fix some warnings and Kconfig issues with some random configs.
* tag 'media/v4.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  [media] s5k4ecgx: select CRC32 helper
  [media] dvb: avoid warning in dvb_net
  [media] v4l: tvp5150: Don't override output pinmuxing at stream on/off time
  [media] v4l: tvp5150: Fix comment regarding output pin muxing
  [media] v4l: tvp5150: Reset device at probe time, not in get/set format handlers
  [media] pctv452e: move buffer to heap, no mutex
  [media] media/cobalt: use pci_irq_allocate_vectors
  [media] cec: fix race between configuring and unconfiguring
  [media] cec: move cec_report_phys_addr into cec_config_thread_func
  [media] cec: replace cec_report_features by cec_fill_msg_report_features
  [media] cec: update log_addr[] before finishing configuration
  [media] cec: CEC_MSG_GIVE_FEATURES should abort for CEC version < 2
  [media] cec: when canceling a message, don't overwrite old status info
  [media] cec: fix report_current_latency
  [media] smiapp: Make suspend and resume functions __maybe_unused
  [media] smiapp: Implement power-on and power-off sequences without runtime PM | 
|  | This adds support for the PHYs in the KSZ8795 5-port managed switch.
It allows detecting the link between the switch and the SoC
and uses the same read_status functions as the KSZ8873MLL switch.
Signed-off-by: Sean Nyekjaer <sean.nyekjaer@prevas.dk>
Signed-off-by: David S. Miller <davem@davemloft.net> | 
|  | Unlike ipv4, this control socket is shared by all cpus so we cannot use
it as scratchpad area to annotate the mark that we pass to ip6_xmit().
Add a new parameter to ip6_xmit() to indicate the mark. The SCTP socket
family caches the flowi6 structure in the sctp_transport structure, so
we cannot use it to carry the mark unless we later reset it, an option
I discarded since it looks ugly to me.
Fixes: bf99b4ded5f8 ("tcp: fix mark propagation with fwmark_reflect enabled")
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net> | 
|  | git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
 "This is the main request for rc6, since really the one earlier was the
  rc5 one :-)
  The main thing are the nouveau specific race fixes for the connector
  locking bug we fixed in -next and reverted here as it has quite large
  prereqs. These two fixes should solve the problem at that level and we
  can fix it properly in 4.11
  Otherwise i915 has a bunch of changes, one ABI change for GVT related
  stuff, some VC4 leak fixes, one core fence fix and some AMD changes,
  oh and one ast hang avoidance fix.
  Hoping it calms down around now"
* tag 'drm-fixes-for-v4.10-rc6-part-two' of git://people.freedesktop.org/~airlied/linux: (25 commits)
  drm/nouveau: Handle fbcon suspend/resume in seperate worker
  drm/nouveau: Don't enabling polling twice on runtime resume
  drm/ast: Fixed system hanged if disable P2A
  Revert "drm/radeon: always apply pci shutdown callbacks"
  drm/i915: reinstate call to trace_i915_vma_bind
  drm/i915: Move atomic state free from out of fence release
  drm/i915: Check for NULL atomic state in intel_crtc_disable_noatomic()
  drm/i915: Fix calculation of rotated x and y offsets for planar formats
  drm/i915: Don't init hpd polling for vlv and chv from runtime_suspend()
  drm/i915: Don't leak edid in intel_crt_detect_ddc()
  drm/i915: Release temporary load-detect state upon switching
  drm/i915: prevent crash with .disable_display parameter
  drm/i915: Avoid drm_atomic_state_put(NULL) in intel_display_resume
  MAINTAINERS: update new mail list for intel gvt driver
  drm/i915/gvt: Fix kmem_cache_create() name
  drm/i915/gvt/kvmgt: mdev ABI is available_instances, not available_instance
  drm/amdgpu: fix unload driver issue for virtual display
  drm/amdgpu: check ring being ready before using
  drm/vc4: Return -EINVAL on the overflow checks failing.
  drm/vc4: Fix an integer overflow in temporary allocation layout.
  ... | 
|  | git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
 "These fix two regressions introduced recently, one by reverting the
  problematic commit and one by fixing up the behavior in an overlooked
  case.
  Specifics:
   - Revert the recent change that caused suspend-to-idle to be used as
     the default suspend method on systems where it is indicated to be
     efficient by the ACPI tables, as that turned out to be premature
     and introduced suspend regressions on some systems with missing
     power management support in device drivers (Rafael Wysocki).
   - Fix up the intel_pstate driver to take changes of the global limits
     via sysfs correctly when the performance policy is used which has
     been broken by a recent change in it (Srinivas Pandruvada)"
* tag 'pm-4.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq: intel_pstate: Fix sysfs limits enforcement for performance policy
  Revert "PM / sleep / ACPI: Use the ACPI_FADT_LOW_POWER_S0 flag" | 
|  | git://anongit.freedesktop.org/git/drm-misc into drm-fixes
Single fence fix.
* tag 'drm-misc-fixes-2017-01-23' of git://anongit.freedesktop.org/git/drm-misc:
  drm/fence: fix memory overwrite when setting out_fence fd | 
|  | * pm-sleep:
  Revert "PM / sleep / ACPI: Use the ACPI_FADT_LOW_POWER_S0 flag"
* pm-cpufreq:
  cpufreq: intel_pstate: Fix sysfs limits enforcement for performance policy | 
|  | Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains a large batch with Netfilter fixes for
your net tree, they are:
1) Two patches to solve conntrack garbage collector cpu hogging, one to
   remove GC_MAX_EVICTS and another to look at the ratio (scanned entries
   vs. evicted entries) to make a decision on whether to reduce or not
   the scanning interval. From Florian Westphal.
2) Two patches to fix incorrect set element counting if NLM_F_EXCL is
   not set. Moreover, don't decrement set->nelems from abort path
   if -ENFILE which leaks a spare slot in the set. This includes a
   patch to deconstify the set walk callback to update set->ndeact.
3) Two fixes for the fwmark_reflect sysctl feature: Propagate mark to
   reply packets both from nf_reject and local stack, from Pau Espin Pedrol.
4) Fix incorrect handling of loopback traffic in rpfilter and nf_tables
   fib expression, from Liping Zhang.
5) Fix oops on stateful objects netlink dump, when no filter is specified.
   Also from Liping Zhang.
6) Fix a build error if proc is not available in ipt_CLUSTERIP, related
   to fix that was applied in the previous batch for net. From Arnd Bergmann.
7) Fix lack of string validation in table, chain, set and stateful
   object names in nf_tables, from Liping Zhang. Moreover, restrict
   maximum log prefix length to 127 bytes, otherwise explicitly bail
   out.
8) Two patches to fix spelling and typos in nf_tables uapi header file
   and Kconfig, patches from Alexander Alemayhu and William Breathitt Gray.
====================
Signed-off-by: David S. Miller <davem@davemloft.net> | 
|  | git://people.freedesktop.org/~airlied/linux
Pull drm revert from Dave Airlie:
 "Revert one patch missing some prereqs.
  One of the connector fixes was missing some prereqs, we have an
  alternate driver fix that should work that I'll send tomorrow.
  Today is a holiday here so quickly smashing this out"
Daniel Vetter explains:
 "I pushed a locking change to fix a nouveau rpm issue to -fixes that
  needed the connector_list rework. And that's only in -next, but I
  missed that. Dave has the revert in a pull, and he'll follow-up with
  the hack nouveau patch for 4.10, and then we'll reapply the proper fix
  again for -next and revert the hacks. A bit of a mess, but should be
  sorted soon"
* tag 'drm-fixes-for-v4.10-rc6-revert-one' of git://people.freedesktop.org/~airlied/linux:
  Revert "drm/probe-helpers: Drop locking from poll_enable" | 
|  | This reverts commit 3846fd9b86001bea171943cc3bb9222cb6da6b42.
Some precursor commits around connector locking were missing for this;
we should probably merge Lyude's nouveau patch that avoids the problem. | 
|  | Commit 4567d686f5c6d955 ("phy: increase size of MII_BUS_ID_SIZE and
bus_id") increased the size of MII bus IDs, but forgot to update the
private definition in <linux/phy_led_triggers.h>.
This may cause:
  1. Truncation of LED trigger names,
  2. Duplicate LED trigger names,
  3. Failures registering LED triggers,
  4. Crashes due to bad error handling in the LED trigger failure path.
To fix this, and prevent the definitions going out of sync again in the
future, let the PHY LED trigger code use the existing MII_BUS_ID_SIZE
definition.
Example:
  - Before I had triggers "ee700000.etherne:01:100Mbps" and
    "ee700000.etherne:01:10Mbps",
  - After the increase of MII_BUS_ID_SIZE, both became
    "ee700000.ethernet-ffffffff:01:" => FAIL,
  - Now, the triggers are "ee700000.ethernet-ffffffff:01:100Mbps" and
    "ee700000.ethernet-ffffffff:01:10Mbps", which are unique again.
Fixes: 4567d686f5c6d955 ("phy: increase size of MII_BUS_ID_SIZE and bus_id")
Fixes: 2e0bc452f4721520 ("net: phy: leds: add support for led triggers on phy link state change")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net> | 
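
The name collision in the LED trigger commit above comes down to formatting into a buffer that is too small. The following is a minimal sketch under stated assumptions: `STALE_NAME_SIZE` and `FIXED_NAME_SIZE` are illustrative values chosen to reproduce the strings quoted in the commit message, not the kernel's definitions, and `make_trigger_name()` and `names_collide()` are hypothetical helpers.

```c
#include <stdio.h>
#include <string.h>

/* Assumption: illustrative sizes, not the kernel's definitions. */
#define STALE_NAME_SIZE 31	/* sized for the old, shorter bus IDs */
#define FIXED_NAME_SIZE 64	/* large enough for the longer bus IDs */

/* Build "<phy id>:<speed>" into buf; snprintf() silently truncates. */
static void make_trigger_name(char *buf, size_t size,
			      const char *phy_id, const char *speed)
{
	snprintf(buf, size, "%s:%s", phy_id, speed);
}

/* Return 1 if the 100Mbps and 10Mbps names collide for this buffer size. */
static int names_collide(size_t size, const char *phy_id)
{
	char a[FIXED_NAME_SIZE], b[FIXED_NAME_SIZE];

	if (size > sizeof(a))
		size = sizeof(a);
	make_trigger_name(a, size, phy_id, "100Mbps");
	make_trigger_name(b, size, phy_id, "10Mbps");
	return strcmp(a, b) == 0;
}
```

For the PHY ID "ee700000.ethernet-ffffffff:01", the stale size cuts both names off right after the second colon, so the two triggers can no longer be told apart. Deriving the buffer size from a single shared definition, as the commit does with MII_BUS_ID_SIZE, keeps the two from drifting apart again.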
|  | <linux/phy.h> includes <linux/phy_led_triggers.h>, which is not really
needed.  Drop the include from <linux/phy.h>, and add it to all users
that didn't include it explicitly.
Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net> | 
|  | Patch series "fix premature OOM regression in 4.7+ due to cpuset races".
This is v2 of my attempt to fix the recent report based on LTP cpuset
stress test [1].  The intention is to go to stable 4.9 LTS with this,
as triggering repeated OOMs is not nice.  That's why the patches try
not to be too intrusive.
Unfortunately, while investigating I found that modifying the testcase to
use per-VMA policies instead of per-task policies will bring the OOMs
back, but that seems to be a much older and harder-to-fix problem.  I have
posted an RFC [2] but I believe that fixing the recent regressions has a
higher priority.
Longer-term we might try to think how to fix the cpuset mess in a better
and less error-prone way.  I was for example very surprised to learn
that cpuset updates change not only task->mems_allowed, but also the
nodemask of mempolicies.  Until now I expected the parameter to
alloc_pages_nodemask() to be stable.  I wonder why we then treat
cpusets specially in get_page_from_freelist() and distinguish HARDWALL
etc, when there's unconditional intersection between mempolicy and
cpuset.  I would expect the nodemask adjustment for saving overhead in
g_p_f(), but that clearly doesn't happen in the current form.  So we
have both crazy complexity and overhead, AFAICS.
[1] https://lkml.kernel.org/r/CAFpQJXUq-JuEP=QPidy4p_=FN0rkH5Z-kfB4qBvsf6jMS87Edg@mail.gmail.com
[2] https://lkml.kernel.org/r/7c459f26-13a6-a817-e508-b65b903a8378@suse.cz
This patch (of 4):
Since commit c33d6c06f60f ("mm, page_alloc: avoid looking up the first
zone in a zonelist twice") we have a wrong check for NULL preferred_zone,
which can theoretically happen due to concurrent cpuset modification.  We
check the zoneref pointer which is never NULL and we should check the zone
pointer.  Also document this in first_zones_zonelist() comment per Michal
Hocko.
Fixes: c33d6c06f60f ("mm, page_alloc: avoid looking up the first zone in a zonelist twice")
Link: http://lkml.kernel.org/r/20170120103843.24587-2-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Ganapatrao Kulkarni <gpkulkarni@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> | 
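
The wrong NULL check described in the preferred_zone commit above can be illustrated with a terminated array of references. This is a simplified sketch, not the kernel's zonelist code: `struct zone`, `struct zoneref`, and `first_allowed()` are reduced, hypothetical versions, and the assumption is that the lookup always returns a pointer into the array, landing on a terminating entry with a NULL zone when nothing matches.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical versions of the structures involved. */
struct zone {
	int id;			/* must be below 32 for the mask test below */
};

struct zoneref {
	struct zone *zone;	/* NULL marks the end of the list */
};

/*
 * Return the first entry whose zone is allowed by the mask. If none is
 * allowed, return the terminating entry. The result is never NULL.
 */
static struct zoneref *first_allowed(struct zoneref *list, unsigned int allowed_mask)
{
	while (list->zone && !(allowed_mask & (1u << list->zone->id)))
		list++;
	return list;
}

/* Buggy check: the reference itself is never NULL, so this is always true. */
static bool has_preferred_buggy(const struct zoneref *ref)
{
	return ref != NULL;
}

/* Fixed check: test the zone pointer stored inside the reference. */
static bool has_preferred_fixed(const struct zoneref *ref)
{
	return ref->zone != NULL;
}
```

When a concurrent update leaves no allowed zone, only the second check notices that there is no preferred zone to use.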
|  | On an overloaded system, it is possible that a change in the watchdog
threshold can be delayed long enough to trigger a false positive.
This can easily be achieved by having a cpu spinning indefinitely on a
task, while another cpu updates watchdog threshold.
What happens is while trying to park the watchdog threads, the hrtimers
on the other cpus trigger and reprogram themselves with the new slower
watchdog threshold.  Meanwhile, the nmi watchdog is still programmed
with the old faster threshold.
Because the one cpu is blocked, it prevents the thread parking on the
other cpus from completing, which is needed to shutdown the nmi watchdog
and reprogram it correctly.  As a result, a false positive from the nmi
watchdog is reported.
Fix this by setting a park_in_progress flag to block all lockups until
the parking is complete.
Fix provided by Ulrich Obergfell.
[akpm@linux-foundation.org: s/park_in_progress/watchdog_park_in_progress/]
Link: http://lkml.kernel.org/r/1481041033-192236-1-git-send-email-dzickus@redhat.com
Signed-off-by: Don Zickus <dzickus@redhat.com>
Reviewed-by: Aaron Tomlin <atomlin@redhat.com>
Cc: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> | 
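
The idea behind the park-in-progress flag in the watchdog commit above can be sketched in user-space C. This is only an illustration under stated assumptions: all names here are hypothetical, the real code deals with per-CPU hrtimers and NMI handlers, and `fake_reprogram()` merely simulates a detector firing while the threshold update is still underway.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag: set while watchdog threads are being parked. */
static atomic_int park_in_progress;

/* Records what a detector firing in the middle of the update decided. */
static bool reported_during_park;

/* Lockup detector: suppress every report while parking is underway. */
static bool should_report_lockup(bool threshold_exceeded)
{
	if (atomic_load(&park_in_progress))
		return false;
	return threshold_exceeded;
}

/* Change the threshold with reports blocked for the whole update. */
static void update_threshold(void (*reprogram)(void))
{
	atomic_store(&park_in_progress, 1);
	reprogram();		/* timers are inconsistent in here */
	atomic_store(&park_in_progress, 0);
}

/* Simulates a detector firing with stale timing during the update. */
static void fake_reprogram(void)
{
	reported_during_park = should_report_lockup(true);
}
```

Calling `update_threshold(fake_reprogram)` leaves `reported_during_park` false, while the same detector call made after the update completes reports normally.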
|  | online_{kernel|movable} is used to change the memory zone to
ZONE_{NORMAL|MOVABLE} and online the memory.
To check that memory zone can be changed, zone_can_shift() is used.
Currently the function returns a negative integer value, a positive integer
value, or 0. When the function returns a negative or positive integer value,
it means that the memory zone can be changed to ZONE_{NORMAL|MOVABLE}.
But when the function returns 0, there are two meanings.
One of the meanings is that the memory zone does not need to be changed.
For example, when memory is in ZONE_NORMAL and onlined by online_kernel
the memory zone does not need to be changed.
Another meaning is that the memory zone cannot be changed. When memory
is in ZONE_NORMAL and onlined by online_movable, the memory zone may
not be changed to ZONE_MOVABLE due to memory online limitation (see
Documentation/memory-hotplug.txt). In this case, memory must not be
onlined.
The patch changes the return type of zone_can_shift() so that memory
online operation fails when memory zone cannot be changed as follows:
Before applying patch:
   # grep -A 35 "Node 2" /proc/zoneinfo
   Node 2, zone   Normal
   <snip>
      node_scanned  0
           spanned  8388608
           present  7864320
           managed  7864320
   # echo online_movable > memory4097/state
   # grep -A 35 "Node 2" /proc/zoneinfo
   Node 2, zone   Normal
   <snip>
      node_scanned  0
           spanned  8388608
           present  8388608
           managed  8388608
   online_movable operation succeeded. But memory is onlined as
   ZONE_NORMAL, not ZONE_MOVABLE.
After applying patch:
   # grep -A 35 "Node 2" /proc/zoneinfo
   Node 2, zone   Normal
   <snip>
      node_scanned  0
           spanned  8388608
           present  7864320
           managed  7864320
   # echo online_movable > memory4097/state
   bash: echo: write error: Invalid argument
   # grep -A 35 "Node 2" /proc/zoneinfo
   Node 2, zone   Normal
   <snip>
      node_scanned  0
           spanned  8388608
           present  7864320
           managed  7864320
   online_movable operation failed because of failure of changing
   the memory zone from ZONE_NORMAL to ZONE_MOVABLE
Fixes: df429ac03936 ("memory-hotplug: more general validation of zone during online")
Link: http://lkml.kernel.org/r/2f9c3837-33d7-b6e5-59c0-6ca4372b2d84@gmail.com
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Reviewed-by: Reza Arbab <arbab@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> | 
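
The ambiguity of a zero return value in the zone_can_shift() commit above, and the general shape of the fix, can be sketched as follows. This is a simplified illustration, not the kernel function: the zone constants, the `change_allowed` parameter (standing in for the memory online limitation), and all three helpers are hypothetical.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical zone indices for the sketch. */
enum sketch_zone { SKETCH_ZONE_NORMAL = 0, SKETCH_ZONE_MOVABLE = 1 };

/* Old style: 0 means both "no change needed" and "change not possible". */
static int can_shift_old(int current, int target, bool change_allowed)
{
	if (current == target || !change_allowed)
		return 0;
	return target - current;
}

/*
 * New style: the return value says whether onlining may proceed, and
 * the shift is passed back separately through an out-parameter.
 */
static bool can_shift_new(int current, int target, bool change_allowed,
			  int *shift)
{
	*shift = 0;
	if (current == target)
		return true;	/* nothing to change, still fine */
	if (!change_allowed)
		return false;	/* caller must refuse to online */
	*shift = target - current;
	return true;
}

/* Caller: fail the online request instead of using the wrong zone. */
static int online_memory(int current, int target, bool change_allowed)
{
	int shift;

	if (!can_shift_new(current, target, change_allowed, &shift))
		return -EINVAL;
	return 0;
}
```

With the old helper, a caller cannot tell the online_kernel case on ZONE_NORMAL memory apart from a rejected online_movable request; with the boolean result, the second case fails with -EINVAL, which matches the "Invalid argument" error shown in the commit message.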
|  | Modules implementing lwtunnel ops should not be allowed to unload
while there is state alive using those ops, so specify the owning
module for all lwtunnel ops.
Signed-off-by: Robert Shearman <rshearma@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net> | 
|  | The flush operation needs to modify set and element objects, so let's
deconstify this.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> | 
|  | First, log prefix will be truncated to NF_LOG_PREFIXLEN-1, i.e. 127,
at nf_log_packet(), so the extra part is useless.
Second, after adding a log rule with a very long prefix, we will
fail to dump the nft rules after this _special_ one, although they
actually do exist. For example:
  # name_65000=$(printf "%0.sQ" {1..65000})
  # nft add rule filter output log prefix "$name_65000"
  # nft add rule filter output counter
  # nft add rule filter output counter
  # nft list chain filter output
  table ip filter {
      chain output {
          type filter hook output priority 0; policy accept;
      }
  }
So now, restrict the log prefix length to NF_LOG_PREFIXLEN-1.
Fixes: 96518518cc41 ("netfilter: add nftables")
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> |
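
The length restriction in the log prefix commit above amounts to rejecting over-long input up front rather than truncating it later. A minimal sketch, assuming a limit of 128 bytes including the terminator (consistent with "NF_LOG_PREFIXLEN-1, i.e. 127" in the message); `SKETCH_LOG_PREFIXLEN` and `check_log_prefix()` are hypothetical names, and the -EINVAL error code is an assumption, not taken from the patch.

```c
#include <errno.h>
#include <string.h>

/* Assumption: mirrors NF_LOG_PREFIXLEN, so usable prefixes hold 127 bytes. */
#define SKETCH_LOG_PREFIXLEN 128

/*
 * Accept a prefix only if it fits without truncation. The error code
 * is an assumption made for this sketch.
 */
static int check_log_prefix(const char *prefix)
{
	if (strlen(prefix) > SKETCH_LOG_PREFIXLEN - 1)
		return -EINVAL;
	return 0;
}
```

A 127-character prefix passes this check and a 128-character one is refused, so a rule like the 65000-character example above would be rejected when it is added instead of breaking later dumps.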