Age | Commit message (Collapse) | Author | Files | Lines |
|
Provide function equivalent to unmap_underlying_metadata() for a range
of blocks. We somewhat optimize the function to use pagevec lookups
instead of looking up buffer heads one by one and use page lock to pin
buffer heads instead of mapping's private_lock to improve scalability.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Currently block plug holds up to 16 non-mergeable requests. This makes
sense if the request size is small, eg, reduce lock contention. But if
request size is big enough, we don't need to worry about lock
contention. Holding such request makes no sense and it lows the disk
utilization.
In practice, this improves 10% throughput for my raid5 sequential write
workload.
The size (128k) is arbitrary right now, but it makes sure lock
contention is small. This probably could be more intelligent, eg, check
average request size holded. Since this is mainly for sequential IO,
probably not worthy.
V2: check the last request instead of the first request, so as long as
there is one big size request we flush the plug.
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Most blk_mq_requeue_request() and blk_mq_add_to_requeue_list() calls
are followed by kicking the requeue list. Hence add an argument to
these two functions that allows to kick the requeue list. This was
proposed by Christoph Hellwig.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
blk_mq_quiesce_queue() waits until ongoing .queue_rq() invocations
have finished. This function does *not* wait until all outstanding
requests have finished (this means invocation of request.end_io()).
The algorithm used by blk_mq_quiesce_queue() is as follows:
* Hold either an RCU read lock or an SRCU read lock around
.queue_rq() calls. The former is used if .queue_rq() does not
block and the latter if .queue_rq() may block.
* blk_mq_quiesce_queue() first calls blk_mq_stop_hw_queues()
followed by synchronize_srcu() or synchronize_rcu(). The latter
call waits for .queue_rq() invocations that started before
blk_mq_quiesce_queue() was called.
* The blk_mq_hctx_stopped() calls that control whether or not
.queue_rq() will be called are called with the (S)RCU read lock
held. This is necessary to avoid race conditions against
blk_mq_quiesce_queue().
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Ming Lei <tom.leiming@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Since blk_mq_requeue_work() no longer restarts stopped queues
canceling requeue work is no longer needed to prevent that a
stopped queue would be restarted. Hence remove this function.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
The function blk_queue_stopped() allows to test whether or not a
traditional request queue has been stopped. Introduce a helper
function that allows block drivers to query easily whether or not
one or more hardware contexts of a blk-mq queue have been stopped.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
This is a helper that pins down a range from an iov_iter and adds it to
a bio without requiring a separate memory allocation for the page array.
It will be used for upcoming direct I/O implementations for block devices
and iomap based file systems.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
[hch: ported to the iov_iter interface, renamed and added comments.
All blame should be directed to me and all fame should go to Kent
after this!]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
If we're doing background type writes, then use the appropriate
background write flags for that.
Signed-off-by: Jens Axboe <axboe@fb.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
|
Add wbc_to_write_flags(), which returns the write modifier flags to use,
based on a struct writeback_control. No functional changes in this
patch, but it prepares us for factoring other wbc fields for write type.
Signed-off-by: Jens Axboe <axboe@fb.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
|
This adds a new request flag, REQ_BACKGROUND, that callers can use to
tell the block layer that this is background (non-urgent) IO.
Signed-off-by: Jens Axboe <axboe@fb.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
|
Now that we have a separate header for struct bio_vec there is absolutely
no excuse for including this header from non-block I/O code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
It's only needed for the CONFIG_SWAP-only use of bio_end_io_t.
Because CONFIG_SWAP implies CONFIG_BLOCK this will allow to drop some
ifdefs in blk_types.h.
Instead we'll need to add a few explicit includes that were implicit
before, though.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
The file only needs the struct bvec_iter delcaration, which is available
from bvec.h.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Nothing in fs.h should require blk_types.h to be included.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
This is where all the other bio operations live, so users must include
bio.h anyway.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Move READ and WRITE to kernel.h and don't define them in terms of block
layer ops; they are our generic data direction indicators these days
and have no more resemblance with the block layer ops.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Remove the WRITE_* and READ_SYNC wrappers, and just use the flags
directly. Where applicable this also drops usage of the
bio_set_op_attrs wrapper.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Noidle should be the default for writes as seen by all the compounds
definitions in fs.h using it. In fact only direct I/O really should
be using NODILE, so turn the whole flag around to get the defaults
right, which will make our life much easier especially onces the
WRITE_* defines go away.
This assumes all the existing "raw" users of REQ_SYNC for writes
want noidle behavior, which seems to be spot on from a quick audit.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Instead of requiring everyone to specify the REQ_SYNC flag aѕ well.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Reads are synchronous per definition, don't add another flag for it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Currently the block layer op_is_write, bio_data_dir and rq_data_dir
helper treat every operation that is not a READ as a data out operation.
This worked surprisingly long, but the new REQ_OP_ZONE_REPORT operation
actually adds a second operation that reads data from the device.
Surprisingly nothing critical relied on this direction, but this might
be a good opportunity to properly fix this issue up.
We take a little inspiration and use the least significant bit of the
operation number to encode the data direction, which just requires us
to renumber the operations to fix this scheme.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Shaun Tancheff <shaun.tancheff@seagate.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Now that we don't need the common flags to overflow outside the range
of a 32-bit type we can encode them the same way for both the bio and
request fields. This in addition allows us to place the operation
first (and make some room for more ops while we're at it) and to
stop having to shift around the operation values.
In addition this allows passing around only one value in the block layer
instead of two (and eventuall also in the file systems, but we can do
that later) and thus clean up a lot of code.
Last but not least this allows decreasing the size of the cmd_flags
field in struct request to 32-bits. Various functions passing this
value could also be updated, but I'd like to avoid the churn for now.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
A lot of the REQ_* flags are only used on struct requests, and only of
use to the block layer and a few drivers that dig into struct request
internals.
This patch adds a new req_flags_t rq_flags field to struct request for
them, and thus dramatically shrinks the number of common requests. It
also removes the unfortunate situation where we have to fit the fields
from the same enum into 32 bits for struct bio and 64 bits for
struct request.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Shaun Tancheff <shaun.tancheff@seagate.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
It's the last bio-only REQ_* flag, and we have space for it in the bio
bi_flags field.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Shaun Tancheff <shaun.tancheff@seagate.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
The information that am I/O is a read-ahead can be useful for drivers.
In fact the NVMe driver already checks it, even if it won't ever be set
at the moment.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Shaun Tancheff <shaun.tancheff@seagate.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
So move it into the common setion of the request flags.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Shaun Tancheff <shaun.tancheff@seagate.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
With the addition of the zoned operations the tests in this function
became incorrect. But I think it's much better to just open code the
allow operations in the only caller anyway.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Shaun Tancheff <shaun.tancheff@seagate.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Implement ZBC support functions to setup zoned disks, both
host-managed and host-aware models. Only zoned disks that satisfy
the following conditions are supported:
1) All zones are the same size, with the exception of an eventual
last smaller runt zone.
2) For host-managed disks, reads are unrestricted (reads are not
failed due to zone or write pointer alignement constraints).
Zoned disks that do not satisfy these 2 conditions are setup with
a capacity of 0 to prevent their use.
The function sd_zbc_read_zones, called from sd_revalidate_disk,
checks that the device satisfies the above two constraints. This
function may also change the disk capacity previously set by
sd_read_capacity for devices reporting only the capacity of
conventional zones at the beginning of the LBA range (i.e. devices
reporting rc_basis set to 0).
The capacity message output was moved out of sd_read_capacity into
a new function sd_print_capacity to include this eventual capacity
change by sd_zbc_read_zones. This new function also includes a call
to sd_zbc_print_zones to display the number of zones and zone size
of the device.
Signed-off-by: Hannes Reinecke <hare@suse.de>
[Damien: * Removed zone cache support
* Removed mapping of discard to reset write pointer command
* Modified sd_zbc_read_zones to include checks that the
device satisfies the kernel constraints
* Implemeted REPORT ZONES setup and post-processing based
on code from Shaun Tancheff <shaun.tancheff@seagate.com>
* Removed confusing use of 512B sector units in functions
interface]
Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Shaun Tancheff <shaun.tancheff@seagate.com>
Tested-by: Shaun Tancheff <shaun.tancheff@seagate.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Adds the new BLKREPORTZONE and BLKRESETZONE ioctls for respectively
obtaining the zone configuration of a zoned block device and resetting
the write pointer of sequential zones of a zoned block device.
The BLKREPORTZONE ioctl maps directly to a single call of the function
blkdev_report_zones. The zone information result is passed as an array
of struct blk_zone identical to the structure used internally for
processing the REQ_OP_ZONE_REPORT operation. The BLKRESETZONE ioctl
maps to a single call of the blkdev_reset_zones function.
Signed-off-by: Shaun Tancheff <shaun.tancheff@seagate.com>
Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Implement zoned block device zone information reporting and reset.
Zone information are reported as struct blk_zone. This implementation
does not differentiate between host-aware and host-managed device
models and is valid for both. Two functions are provided:
blkdev_report_zones for discovering the zone configuration of a
zoned block device, and blkdev_reset_zones for resetting the write
pointer of sequential zones. The helper function blk_queue_zone_size
and bdev_zone_size are also provided for, as the name suggest,
obtaining the zone size (in 512B sectors) of the zones of the device.
Signed-off-by: Hannes Reinecke <hare@suse.de>
[Damien: * Removed the zone cache
* Implement report zones operation based on earlier proposal
by Shaun Tancheff <shaun.tancheff@seagate.com>]
Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@seagate.com>
Tested-by: Shaun Tancheff <shaun.tancheff@seagate.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Define REQ_OP_ZONE_REPORT and REQ_OP_ZONE_RESET for handling zones of
host-managed and host-aware zoned block devices. With with these two
new operations, the total number of operations defined reaches 8 and
still fits with the 3 bits definition of REQ_OP_BITS.
Signed-off-by: Shaun Tancheff <shaun.tancheff@seagate.com>
Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Add the zoned queue limit to indicate the zoning model of a block device.
Defined values are 0 (BLK_ZONED_NONE) for regular block devices,
1 (BLK_ZONED_HA) for host-aware zone block devices and 2 (BLK_ZONED_HM)
for host-managed zone block devices. The standards defined drive managed
model is not defined here since these block devices do not provide any
command for accessing zone information. Drive managed model devices will
be reported as BLK_ZONED_NONE.
The helper functions blk_queue_zoned_model and bdev_zoned_model return
the zoned limit and the functions blk_queue_is_zoned and bdev_is_zoned
return a boolean for callers to test if a block device is zoned.
The zoned attribute is also exported as a string to applications via
sysfs. BLK_ZONED_NONE shows as "none", BLK_ZONED_HA as "host-aware" and
BLK_ZONED_HM as "host-managed".
Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@seagate.com>
Tested-by: Shaun Tancheff <shaun.tancheff@seagate.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
pkey_set() and pkey_get() were syscalls present in older versions
of the protection keys patches. They were fully excised from the
x86 code, but some cruft was left in the generic syscall code. The
C++ comments were intended to help to make it more glaring to me to
fix them before actually submitting them. That technique worked,
but later than I would have liked.
I test-compiled this for arm64.
Fixes: a60f7b69d92c0 ("generic syscalls: Wire up memory protection keys syscalls")
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Cc: linux-arch@vger.kernel.org
Cc: mgorman@techsingularity.net
Cc: linux-api@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: luto@kernel.org
Cc: akpm@linux-foundation.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull gcc plugins update from Kees Cook:
"This adds a new gcc plugin named "latent_entropy". It is designed to
extract as much possible uncertainty from a running system at boot
time as possible, hoping to capitalize on any possible variation in
CPU operation (due to runtime data differences, hardware differences,
SMP ordering, thermal timing variation, cache behavior, etc).
At the very least, this plugin is a much more comprehensive example
for how to manipulate kernel code using the gcc plugin internals"
* tag 'gcc-plugins-v4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
latent_entropy: Mark functions with __latent_entropy
gcc-plugins: Add latent_entropy plugin
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
"Some fixes from Omar and Dave Sterba for our new free space tree.
This isn't heavily used yet, but as we move toward making it the new
default we wanted to nail down an endian bug"
* 'for-linus-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
btrfs: tests: uninline member definitions in free_space_extent
btrfs: tests: constify free space extent specs
Btrfs: expand free space tree sanity tests to catch endianness bug
Btrfs: fix extent buffer bitmap tests on big-endian systems
Btrfs: catch invalid free space trees
Btrfs: fix mount -o clear_cache,space_cache=v2
Btrfs: fix free space tree bitmaps on big-endian systems
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs
Pull overlayfs updates from Miklos Szeredi:
"This update contains fixes to the "use mounter's permission to access
underlying layers" area, and miscellaneous other fixes and cleanups.
No new features this time"
* 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
ovl: use vfs_get_link()
vfs: add vfs_get_link() helper
ovl: use generic_readlink
ovl: explain error values when removing acl from workdir
ovl: Fix info leak in ovl_lookup_temp()
ovl: during copy up, switch to mounter's creds early
ovl: lookup: do getxattr with mounter's permission
ovl: copy_up_xattr(): use strnlen
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
Pull kbuild updates from Michal Marek:
- EXPORT_SYMBOL for asm source by Al Viro.
This does bring a regression, because genksyms no longer generates
checksums for these symbols (CONFIG_MODVERSIONS). Nick Piggin is
working on a patch to fix this.
Plus, we are talking about functions like strcpy(), which rarely
change prototypes.
- Fixes for PPC fallout of the above by Stephen Rothwell and Nick
Piggin
- fixdep speedup by Alexey Dobriyan.
- preparatory work by Nick Piggin to allow architectures to build with
-ffunction-sections, -fdata-sections and --gc-sections
- CONFIG_THIN_ARCHIVES support by Stephen Rothwell
- fix for filenames with colons in the initramfs source by me.
* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: (22 commits)
initramfs: Escape colons in depfile
ppc: there is no clear_pages to export
powerpc/64: whitelist unresolved modversions CRCs
kbuild: -ffunction-sections fix for archs with conflicting sections
kbuild: add arch specific post-link Makefile
kbuild: allow archs to select link dead code/data elimination
kbuild: allow architectures to use thin archives instead of ld -r
kbuild: Regenerate genksyms lexer
kbuild: genksyms fix for typeof handling
fixdep: faster CONFIG_ search
ia64: move exports to definitions
sparc32: debride memcpy.S a bit
[sparc] unify 32bit and 64bit string.h
sparc: move exports to definitions
ppc: move exports to definitions
arm: move exports to definitions
s390: move exports to definitions
m68k: move exports to definitions
alpha: move exports to actual definitions
x86: move exports to actual definitions
...
|
|
Pull one more documentation update from Jonathan Corbet:
"A single commit converting the mac80211 DocBook template over to
Sphinx. Only 32 more to go..."
* tag 'docs-4.9-2' of git://git.lwn.net/linux:
docs-rst: sphinxify 802.11 documentation
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull rdma qedr RoCE driver from Doug Ledford:
"Early on in the merge window I mentioned I had a backlog of new
drivers waiting to be reviewed and that, in addition to the hns-roce
driver, I wanted to get possible a couple more reviewed. I ended up
only having the time to complete one of the additional drivers.
During Dave Miller's pull request this go around, there were a series
of 9 patches to the QLogic qed net driver that add basic support for a
paired RoCE driver. That support is currently not functional because
it is missing the matching RoCE driver in the RDMA subsystem. I
managed to finish that review. However, because it goes against part
of Dave's net pull, and a part that was accepted a day or two after
the merge window opened, to apply cleanly it has to be applied to
either the tip of Dave's net branch, or as I did in this case, I just
applied it to your master after you had taken Dave's pull request."
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
qedr: Add events support and register IB device
qedr: Add GSI support
qedr: Add LL2 RoCE interface
qedr: Add support for data path
qedr: Add support for memory registeration verbs
qedr: Add support for QP verbs
qedr: Add support for PD,PKEY and CQ verbs
qedr: Add support for user context verbs
qedr: Add support for RoCE HW init
qedr: Add RoCE driver framework
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more ACPI updates from Rafael Wysocki:
"This includes a couple of fixes needed after recent changes, two ACPI
driver fixes (fan and "Processor Aggregator"), an update of the ACPI
device properties handling code and a new MAINTAINERS entry for ACPI
on ARM64.
Specifics:
- Fix an unused function warning that started to appear after recent
changes in the ACPI EC driver (Eric Biggers).
- Fix the KERN_CONT usage in acpi_os_vprintf() that has become
(particularly) annoying recently (Joe Perches).
- Fix the fan status checking in the ACPI fan driver to avoid
returning incorrect error codes sometimes (Srinivas Pandruvada).
- Fix the ACPI Processor Aggregator driver (PAD) to always let the
special processor_aggregator driver from Xen take over when running
as Xen dom0 (Juergen Gross).
- Update the handling of reference device properties in ACPI by
allowing empty rows ("holes") to appear in reference property lists
(Mika Westerberg).
- Add a new MAINTAINERS entry for ACPI on ARM64 (Lorenzo Pieralisi)"
* tag 'acpi-extra-4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
acpi_os_vprintf: Use printk_get_level() to avoid unnecessary KERN_CONT
ACPI / PAD: don't register acpi_pad driver if running as Xen dom0
ACPI / property: Allow holes in reference properties
MAINTAINERS: Add ARM64-specific ACPI maintainers entry
ACPI / EC: Fix unused function warning when CONFIG_PM_SLEEP=n
ACPI / fan: Fix error reading cur_state
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more power management updates from Rafael Wysocki:
"This includes a couple of fixes for cpufreq regressions introduced in
4.8, a rework of the intel_pstate algorithm used on Atom processors
(that took some time to test) plus a fix and a couple of cleanups in
that driver, a CPPC cpufreq driver fix, and a some devfreq fixes and
cleanups (core and exynos-nocp).
Specifics:
- Fix two cpufreq regressions causing undesirable changes in behavior
to appear (one in the core and one in the conservative governor)
introduced during the 4.8 cycle (Aaro Koskinen, Rafael Wysocki).
- Fix the way the intel_pstate driver accesses MSRs related to the
hardware-managed P-states (HWP) feature during the initialization
which currently is unsafe and may cause the processor to generate a
general protection fault (Srinivas Pandruvada).
- Rework the intel_pstate's P-state selection algorithm used on Atom
processors to avoid known problems with the current one and to make
the computation more straightforward, which also happens to improve
performance in multiple benchmarks a bit (Rafael Wysocki).
- Improve two comments in the intel_pstate driver (Rafael Wysocki).
- Fix the desired performance computation in the CPPC cpufreq driver
(Hoan Tran).
- Fix the devfreq core to avoid printing misleading error messages in
some cases (Tobias Jakobi).
- Fix the error code path in devfreq_add_device() to use proper
locking around list modifications (Axel Lin).
- Fix a build failure and remove a couple of redundant updates of
variables in the exynos-nocp devfreq driver (Axel Lin)"
* tag 'pm-extra-4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: CPPC: Correct desired_perf calculation
cpufreq: conservative: Fix next frequency selection
cpufreq: skip invalid entries when searching the frequency
cpufreq: intel_pstate: Fix struct pstate_adjust_policy kerneldoc
cpufreq: intel_pstate: Proportional algorithm for Atom
PM / devfreq: Skip status update on uninitialized previous_freq
PM / devfreq: Add proper locking around list_del()
PM / devfreq: exynos-nocp: Remove redundant code
PM / devfreq: exynos-nocp: Select REGMAP_MMIO
cpufreq: intel_pstate: Clarify comment in get_target_pstate_use_performance()
cpufreq: intel_pstate: Fix unsafe HWP MSR access
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo:
- tracepoints for basic cgroup management operations added
- kernfs and cgroup path formatting functions updated to behave in the
style of strlcpy()
- non-critical bug fixes
* 'for-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
blkcg: Unlock blkcg_pol_mutex only once when cpd == NULL
cgroup: fix error handling regressions in proc_cgroup_show() and cgroup_release_agent()
cpuset: fix error handling regression in proc_cpuset_show()
cgroup: add tracepoints for basic operations
cgroup: make cgroup_path() and friends behave in the style of strlcpy()
kernfs: remove kernfs_path_len()
kernfs: make kernfs_path*() behave in the style of strlcpy()
kernfs: add dummy implementation of kernfs_path_from_node()
|
|
Add support for Queue Pair verbs which adds, deletes,
modifies and queries Queue Pairs.
Signed-off-by: Rajesh Borundia <rajesh.borundia@cavium.com>
Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Add support for protection domain and completion queue verbs.
Signed-off-by: Rajesh Borundia <rajesh.borundia@cavium.com>
Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Add support for ucontext, query port, add and del gid verbs.
Signed-off-by: Rajesh Borundia <rajesh.borundia@cavium.com>
Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Adds a skeletal implementation of the qed* RoCE driver -
basically the ability to communicate with the qede driver and
receive notifications from it regarding various init/exit events.
Signed-off-by: Rajesh Borundia <rajesh.borundia@cavium.com>
Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
Pull percpu updates from Tejun Heo:
- Nick improved generic implementations of percpu operations which
modify the variable and return so that they calculate the physical
address only once.
- percpu_ref percpu <-> atomic mode switching improvements. The
patchset was originally posted about a year ago but fell through the
crack.
- misc non-critical fixes.
* 'for-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
mm/percpu.c: fix potential memory leakage for pcpu_embed_first_chunk()
mm/percpu.c: correct max_distance calculation for pcpu_embed_first_chunk()
percpu: eliminate two sparse warnings
percpu: improve generic percpu modify-return implementation
percpu-refcount: init ->confirm_switch member properly
percpu_ref: allow operation mode switching operations to be called concurrently
percpu_ref: restructure operation mode switching
percpu_ref: unify staggered atomic switching wait behavior
percpu_ref: reorganize __percpu_ref_switch_to_atomic() and relocate percpu_ref_switch_to_atomic()
percpu_ref: remove unnecessary RCU grace period for staggered atomic switching confirmation
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata
Pull libata updates from Tejun Heo:
- Write same support added
- Minor ahci MSIX irq handling updates
- Non-critical SCSI command translation fixes
- Controller specific changes
* 'for-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
ahci: qoriq: Revert "ahci: qoriq: Disable NCQ on ls2080a SoC"
libata: remove <asm-generic/libata-portmap.h>
libata: remove unused definitions from <asm/libata-portmap.h>
pata_at91: Use PTR_ERR_OR_ZERO rather than if(IS_ERR(...)) + PTR_ERR
ata: Replace BUG() with BUG_ON().
ata: sata_mv: Replacing dma_pool_alloc and memset with a single call dma_pool_zalloc.
libata: Some drives failing on SCT Write Same
ahci: use pci_alloc_irq_vectors
libata: SCT Write Same handle ATA_DFLAG_PIO
libata: SCT Write Same / DSM Trim
libata: Add support for SCT Write Same
libata: Safely overwrite attached page in WRITE SAME xlat
ahci: also use a per-port lock for the multi-MSIX case
ARM: dts: STiH407-family: Add ports-implemented property in sata nodes
ahci: st: Add ports-implemented property in support
ahci: qoriq: enable snoopable sata read and write
ahci: qoriq: adjust sata parameter
libata-scsi: fix MODE SELECT translation for Control mode page
libata-scsi: use u8 array to store mode page copy
|
|
This easy-to-trigger warning shows up instantly when running
Trinity on a kernel with CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS disabled.
At most this should have been a printk, but the -EINVAL alone should be more
than adequate indicator that something isn't available.
Signed-off-by: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|