Age | Commit message (Collapse) | Author | Files | Lines |
|
git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2:
nilfs2: add reader's lock for cno in nilfs_ioctl_sync
nilfs2: delete unnecessary condition in load_segment_summary
nilfs2: move iterator to write log into segment buffer
nilfs2: get rid of s_dirt flag use
nilfs2: get rid of nilfs_segctor_req struct
nilfs2: delete unnecessary condition in nilfs_dat_translate
nilfs2: fix potential hang in nilfs_error on errors=remount-ro
nilfs2: use mnt_want_write in ioctls where write access is needed
nilfs2: issue discard request after cleaning segments
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2: (36 commits)
Ocfs2: Move ocfs2 ioctl definitions from ocfs2_fs.h to newly added ocfs2_ioctl.h
ocfs2: send SIGXFSZ if new filesize exceeds limit -v2
ocfs2/userdlm: Add tracing in userdlm
ocfs2: Use a separate masklog for AST and BASTs
dlm: allow dlm do recovery during shutdown
ocfs2: Only bug out in direct io write for reflinked extent.
ocfs2: fix warning in ocfs2_file_aio_write()
ocfs2_dlmfs: Enable the use of user cluster stacks.
ocfs2_dlmfs: Use the stackglue.
ocfs2_dlmfs: Don't honor truncate. The size of a dlmfs file is LVB_LEN
ocfs2: Pass the locking protocol into ocfs2_cluster_connect().
ocfs2: Remove the ast pointers from ocfs2_stack_plugins
ocfs2: Hang the locking proto on the cluster conn and use it in asts.
ocfs2: Attach the connection to the lksb
ocfs2: Pass lksbs back from stackglue ast/bast functions.
ocfs2_dlmfs: Move to its own directory
ocfs2_dlmfs: Use poll() to signify BASTs.
ocfs2_dlmfs: Add capabilities parameter.
ocfs2: Handle errors while setting external xattr values.
ocfs2: Set inline xattr entries with ocfs2_xa_set()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: fix large stack use
fuse: cleanup in fuse_notify_inval_...()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
percpu: add __percpu sparse annotations to what's left
percpu: add __percpu sparse annotations to fs
percpu: add __percpu sparse annotations to core kernel subsystems
local_t: Remove leftover local.h
this_cpu: Remove pageset_notifier
this_cpu: Page allocator conversion
percpu, x86: Generic inc / dec percpu instructions
local_t: Move local.h include to ringbuffer.c and ring_buffer_benchmark.c
module: Use this_cpu_xx to dynamically allocate counters
local_t: Remove cpu_local_xx macros
percpu: refactor the code in pcpu_[de]populate_chunk()
percpu: remove compile warnings caused by __verify_pcpu_ptr()
percpu: make accessors check for percpu pointer in sparse
percpu: add __percpu for sparse.
percpu: make access macros universal
percpu: remove per_cpu__ prefix.
|
|
* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw:
GFS2: print glock numbers in hex
GFS2: ordered writes are backwards
GFS2: Remove old, unused linked list code from quota
GFS2: Remove loopy umount code
GFS2: Metadata address space clean up
|
|
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
[CIFS] pSesInfo->sesSem is used as mutex. Rename it to session_mutex and
[CIFS] Use unsigned ea length for clarity
cifs: set server_eof in cifs_fattr_to_inode
[CIFS] Minor cleanup to EA patch
cifs: merge CIFSSMBQueryEA with CIFSSMBQAllEAs
cifs: verify lengths of QueryAllEAs reply
cifs: increase maximum buffer size in CIFSSMBQAllEAs
cifs: rename name_len to list_len in CIFSSMBQAllEAs
cifs: clean up indentation in CIFSSMBQAllEAs
cifs: add parens around smb_var in BCC macros
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (38 commits)
SELinux: Make selinux_kernel_create_files_as() shouldn't just always return 0
TOMOYO: Protect find_task_by_vpid() with RCU.
Security: add static to security_ops and default_security_ops variable
selinux: libsepol: remove dead code in check_avtab_hierarchy_callback()
TOMOYO: Remove __func__ from tomoyo_is_correct_path/domain
security: fix a couple of sparse warnings
TOMOYO: Remove unneeded parameter.
TOMOYO: Use shorter names.
TOMOYO: Use enum for index numbers.
TOMOYO: Add garbage collector.
TOMOYO: Add refcounter on domain structure.
TOMOYO: Merge headers.
TOMOYO: Add refcounter on string data.
TOMOYO: Reduce lines by using common path for addition and deletion.
selinux: fix memory leak in sel_make_bools
TOMOYO: Extract bitfield
syslog: clean up needless comment
syslog: use defined constants instead of raw numbers
syslog: distinguish between /proc/kmsg and syscalls
selinux: allow MLS->non-MLS and vice versa upon policy reload
...
|
|
Currently we were adding ioctl cmds/structures for ocfs2 into ocfs2_fs.h
which was used for define ocfs2 on-disk layout. That sounds a little bit
confusing, and it may be quickly polluted espcially when growing the
ocfs2_info_request ioctls afterwards(it will grow i bet).
As a result, such OCFS2 IOCs do need to be placed somewhere other than
ocfs2_fs.h, a separated ocfs2_ioctl.h will be added to store such ioctl
structures and definitions which could also be used from userspace to
invoke ioctls call.
Signed-off-by: Tristan Ye <tristan.ye@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
|
|
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
Revert "blkdev: fix merge_bvec_fn return value checks"
|
|
This reverts commit 9f7cdbc33f36d28e57eaba0093f68f0d14c38c5b.
It's causing oopses om dm setups, so revert it until we investigate.
Reported-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Tested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1341 commits)
virtio_net: remove forgotten assignment
be2net: fix tx completion polling
sis190: fix cable detect via link status poll
net: fix protocol sk_buff field
bridge: Fix build error when IGMP_SNOOPING is not enabled
bnx2x: Tx barriers and locks
scm: Only support SCM_RIGHTS on unix domain sockets.
vhost-net: restart tx poll on sk_sndbuf full
vhost: fix get_user_pages_fast error handling
vhost: initialize log eventfd context pointer
vhost: logging thinko fix
wireless: convert to use netdev_for_each_mc_addr
ethtool: do not set some flags, if others failed
ipoib: returned back addrlen check for mc addresses
netlink: Adding inode field to /proc/net/netlink
axnet_cs: add new id
bridge: Make IGMP snooping depend upon BRIDGE.
bridge: Add multicast count/interval sysfs entries
bridge: Add hash elasticity/max sysfs entries
bridge: Add multicast_snooping sysfs toggle
...
Trivial conflicts in Documentation/feature-removal-schedule.txt
|
|
* 'for-2.6.34' of git://git.kernel.dk/linux-2.6-block: (38 commits)
block: don't access jiffies when initialising io_context
cfq: remove 8 bytes of padding from cfq_rb_root on 64 bit builds
block: fix for "Consolidate phys_segment and hw_segment limits"
cfq-iosched: quantum check tweak
blktrace: perform cleanup after setup error
blkdev: fix merge_bvec_fn return value checks
cfq-iosched: requests "in flight" vs "in driver" clarification
cciss: Fix problem with scatter gather elements in the scsi half of the driver
cciss: eliminate unnecessary pointer use in cciss scsi code
cciss: do not use void pointer for scsi hba data
cciss: factor out scatter gather chain block mapping code
cciss: fix scatter gather chain block dma direction kludge
cciss: simplify scatter gather code
cciss: factor out scatter gather chain block allocation and freeing
cciss: detect bad alignment of scsi commands at build time
cciss: clarify command list padding calculation
cfq-iosched: rethink seeky detection for SSDs
cfq-iosched: rework seeky detection
block: remove padding from io_context on 64bit builds
block: Consolidate phys_segment and hw_segment limits
...
|
|
This patch changes glock numbers from printing in decimal to hex.
Since DLM prints corresponding resource IDs in hex, it makes debugging
easier.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
|
When we queue data buffers for ordered write, the buffers are added
to the head of the ordered write list. When the log needs to push
these buffers to disk, it also walks the list from the head. The
result is that the the ordered buffers are submitted to disk in
reverse order.
For large writes, this means that whenever the log flushes large
streams of reverse sequential order buffers are pushed down into the
block layers. The elevators don't handle this particularly well, so
IO rates tend to be significantly lower than if the IO was issued in
ascending block order.
Queue new ordered buffers to the tail of the ordered buffer list to
ensure that IO is dispatched in the order it was submitted. This
should significantly improve large sequential write speeds. On a
disk capable of 85MB/s, speeds increase from 50MB/s to 65MB/s for
noop and from 38MB/s to 50MB/s for cfq.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
|
As a consequence of the previous patch, we can now remove the
loop which used to be required due to the circular dependency
between the inodes and glocks. Instead we can just invalidate
the inodes, and then clear up any glocks which are left.
Also we no longer need the rwsem since there is no longer any
danger of the inode invalidation calling back into the glock
code (and from there back into the inode code).
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
|
Since the start of GFS2, an "extra" inode has been used to store
the metadata belonging to each inode. The only reason for using
this inode was to have an extra address space, the other fields
were unused. This means that the memory usage was rather inefficient.
The reason for keeping each inode's metadata in a separate address
space is that when glocks are requested on remote nodes, we need to
be able to efficiently locate the data and metadata which relating
to that glock (inode) in order to sync or sync and invalidate it
(depending on the remotely requested lock mode).
This patch adds a new type of glock, which has in addition to
its normal fields, has an address space. This applies to all
inode and rgrp glocks (but to no other glock types which remain
as before). As a result, we no longer need to have the second
inode.
This results in three major improvements:
1. A saving of approx 25% of memory used in caching inodes
2. A removal of the circular dependency between inodes and glocks
3. No confusion between "normal" and "metadata" inodes in super.c
Although the first of these is the more immediately apparent, the
second is just as important as it now enables a number of clean
ups at umount time. Those will be the subject of future patches.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
|
|
Conflicts:
drivers/firmware/iscsi_ibft.c
|
|
|
|
merge_bvec_fn() returns bvec->bv_len on success. So we have to check
against this value. But in case of fs_optimization merge we compare
with wrong value. This patch must be included in
b428cd6da7e6559aca69aa2e3a526037d3f20403
But accidentally i've forgot to add this in the initial patch.
To make things straight let's replace all such checks.
In fact this makes code easy to understand.
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (44 commits)
rcu: Fix accelerated GPs for last non-dynticked CPU
rcu: Make non-RCU_PROVE_LOCKING rcu_read_lock_sched_held() understand boot
rcu: Fix accelerated grace periods for last non-dynticked CPU
rcu: Export rcu_scheduler_active
rcu: Make rcu_read_lock_sched_held() take boot time into account
rcu: Make lockdep_rcu_dereference() message less alarmist
sched, cgroups: Fix module export
rcu: Add RCU_CPU_STALL_VERBOSE to dump detailed per-task information
rcu: Fix rcutorture mod_timer argument to delay one jiffy
rcu: Fix deadlock in TREE_PREEMPT_RCU CPU stall detection
rcu: Convert to raw_spinlocks
rcu: Stop overflowing signed integers
rcu: Use canonical URL for Mathieu's dissertation
rcu: Accelerate grace period if last non-dynticked CPU
rcu: Fix citation of Mathieu's dissertation
rcu: Documentation update for CONFIG_PROVE_RCU
security: Apply lockdep-based checking to rcu_dereference() uses
idr: Apply lockdep-based diagnostics to rcu_dereference() uses
radix-tree: Disable RCU lockdep checking in radix tree
vfs: Abstract rcu_dereference_check for files-fdtable use
...
|
|
This patch makes ocfs2 send SIGXFSZ if new file size exceeds the rlimit.
Processes may get SIGXFSZ on one node (in the cluster) while others will
not on another if file size limits are different on the two nodes.
Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
|
|
Make use of the newly added BASTS masklog to trace ASTs and BASTs in userdlm.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
|
|
This patch adds a new masklog and uses it allow tracing ASTs and BASTs
in the dlmglue layer. This has been found to be very useful in debugging
cluster locking issues.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
|
|
There's currently an open Ubuntu bug[0], with the intent to compile NFS_FSCACHE
(and possibly AFS_FSCACHE, 9P_FSCACHE) into the standard Ubuntu kernel.
However, since *_FSCACHE still depends on EXPERIMENTAL, this won't happen.
As Arjan van de Ven pointed out[1], the EXPERIMENTAL flag doesn't mean that
much any more, I propose the following patch to fs/nfs/Kconfig. I'd do the
same for fs/9p/Kconfig and fs/afs/Kconfig, but as I did not test 9p or AFS, I
feel it would not be appropriate for me to remove the flag.
[0] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/440522/comments/5
[1] http://lkml.org/lkml/2010/1/23/145
Signed-off-by: Christian Kujau <lists@nerdbynature.de>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/dlm:
dlm: use bastmode in debugfs output
dlm: Send lockspace name with uevents
dlm: send reply before bast
dlm: fix ordering of bast and cast
|
|
* 'for-linus' of git://oss.sgi.com/xfs/xfs: (52 commits)
fs/xfs: Correct NULL test
xfs: optimize log flushing in xfs_fsync
xfs: only clear the suid bit once in xfs_write
xfs: kill xfs_bawrite
xfs: log changed inodes instead of writing them synchronously
xfs: remove invalid barrier optimization from xfs_fsync
xfs: kill the unused XFS_QMOPT_* flush flags V2
xfs: Use delay write promotion for dquot flushing
xfs: Sort delayed write buffers before dispatch
xfs: Don't issue buffer IO direct from AIL push V2
xfs: Use delayed write for inodes rather than async V2
xfs: Make inode reclaim states explicit
xfs: more reserved blocks fixups
xfs: turn off sign warnings
xfs: don't hold onto reserved blocks on remount,ro
xfs: quota limit statvfs available blocks
xfs: replace KM_LARGE with explicit vmalloc use
xfs: cleanup up xfs_log_force calling conventions
xfs: kill XLOG_VEC_SET_TYPE
xfs: remove duplicate buffer flags
...
|
|
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/xfs-vipt:
xfs: fix xfs to work with Virtually Indexed architectures
sh: add mm API for DMA to vmalloc/vmap areas
arm: add mm API for DMA to vmalloc/vmap areas
parisc: add mm API for DMA to vmalloc/vmap areas
mm: add coherence API for DMA to vmalloc/vmap areas
|
|
If a node down event happens while dlm shutdown in progress, dlm recovery
should be done before dlm is shutdown. We can't migrate unrecovered locks,
obviously. But dlm_reco_thread only does recovery if the dlm_state is
in DLM_CTXT_JOINED.
dlm_reco_thread should do recovery if dlm_state is in DLM_CTXT_JOINED or
DLM_CTXT_IN_SHUTDOWN.
Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
|
|
In ocfs2_direct_IO_get_blocks, we only need to bug out
in case of we are going to write a recounted extent rec.
What a silly bug introduced by me!
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Cc: stable@kernel.org
|
|
This patch fixes a compiling warning in ocfs2_file_aio_write().
Signed-off-by: Coly Li <coly.li@suse.de>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
|
|
Unlike ocfs2, dlmfs has no permanent storage. It can't store off a
cluster stack it is supposed to be using. So it can't specify the stack
name in ocfs2_cluster_connect().
Instead, we create ocfs2_cluster_connect_agnostic(), which simply uses
the stack that is currently enabled. This is find for dlmfs, which will
rely on the stack initialization.
We add the "stackglue" capability to dlmfs's capability list. This lets
userspace know dlmfs can be used with all cluster stacks.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
|
|
Rather than directly using o2dlm, dlmfs can now use the stackglue. This
allows it to use userspace cluster stacks and fs/dlm. This commit
forces o2cb for now. A latter commit will bump the protocol version and
allow non-o2cb stacks.
This is one big sed, really. LKM_xxMODE becomes DLM_LOCK_xx. LKM_flag
becomes DLM_LKF_flag.
We also learn to check that the LVB is valid before reading it. Any DLM
can lose the contents of the LVB during a complicated recovery. userdlm
should be checking this. Now it does. dlmfs will return 0 from read(2)
if the LVB was invalid.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
|
|
We want folks using dlmfs to be able to use the LVB in places other than
just write(2)/read(2). By ignoring truncate requests, we allow 'echo
"contents" > /dlm/space/lockname' to work.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
|
|
Inside the stackglue, the locking protocol structure is hanging off of
the ocfs2_cluster_connection. This takes it one further; the locking
protocol is passed into ocfs2_cluster_connect(). Now different cluster
connections can have different locking protocols with distinct asts.
Note that all locking protocols have to keep their maximum protocol
version in lock-step.
With the protocol structure set in ocfs2_cluster_connect(), there is no
need for the stackglue to have a static pointer to a specific protocol
structure. We can change initialization to only pass in the maximum
protocol version.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
|
|
With the full ocfs2_locking_protocol hanging off of the
ocfs2_cluster_connection, ast wrappers can get the ast/bast pointers
there. They don't need to get them from their plugin structure.
The user plugin still needs the maximum locking protocol version,
though. This changes the plugin structure so that it only holds the max
version, not the entire ocfs2_locking_protocol pointer.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
|
|
With the ocfs2_cluster_connection hanging off of the ocfs2_dlm_lksb, we
have access to it in the ast and bast wrapper functions. Attach the
ocfs2_locking_protocol to the conn.
Now, instead of refering to a static variable for ast/bast pointers, the
wrappers can look at the connection. This means different connections
can have different ast/bast pointers, and it reduces the need for the
static pointer.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
|
|
We're going to want it in the ast functions, so we convert union
ocfs2_dlm_lksb to struct ocfs2_dlm_lksb and let it carry the connection.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
|
|
The stackglue ast and bast functions tried to maintain the fiction that
their arguments were void pointers. In reality, stack_user.c had to
know that the argument was an ocfs2_lock_res in order to get the status
off of the lksb. That's ugly.
This changes stackglue to always pass the lksb as the argument to ast
and bast functions. The caller can always use container_of() to get the
ocfs2_lock_res or user_dlm_lock_res. The net effect to the caller is
zero. They still get back the lockres in their ast. stackglue gets
cleaner, and now can use the lksb itself.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
|
|
We're going to remove the tie between ocfs2_dlmfs and o2dlm.
ocfs2_dlmfs doesn't belong in the fs/ocfs2/dlm directory anymore. Here
we move it to fs/ocfs2/dlmfs.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
|
|
o2dlm's userspace filesystem is an easy way to use the DLM from
userspace. It is intentionally simple. For example, it does not allow
for asynchronous behavior or lock conversion. This is intentional to
keep the interface simple.
Because there is no asynchronous notification, there is no way for a
process holding a lock to know another node needs the lock. This is the
number one complaint of ocfs2_dlmfs users. Turns out, we can solve this
very easily. We add poll() support to ocfs2_dlmfs. When a BAST is
received, the lock's file descriptor will receive POLLIN.
This is trivial to implement. Userdlm already has an appropriate
waitqueue, and the lock knows when it is blocked.
We add the "bast" capability to tell userspace this is available.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Acked-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
|
|
Over time, dlmfs has added some features that were not part of the
initial ABI. Unfortunately, some of these features are not detectable
via standard usage. For example, Linux's default poll always returns
POLLIN, so there is no way for a caller of poll(2) to know when dlmfs
added poll support. Instead, we provide this list of new capabilities.
Capabilities is a read-only attribute. We do it as a module parameter
so we can discover it whether dlmfs is built in, loaded, or even not
loaded (via modinfo).
The ABI features are local to this machine's dlmfs mount. This is
distinct from the locking protocol, which is concerned with inter-node
interaction.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
|
|
ocfs2 can store extended attribute values as large as a single file. It
does this using a standard ocfs2 btree for the large value. However,
the previous code did not handle all error cases cleanly.
There are multiple problems to have.
1) We have trouble allocating space for a new xattr. This leaves us
with an empty xattr.
2) We overwrote an existing local xattr with a value root, and now we
have an error allocating the storage. This leaves us an empty xattr.
where there used to be a value. The value is lost.
3) We have trouble truncating a reused value. This leaves us with the
original entry pointing to the truncated original value. The value
is lost.
4) We have trouble extending the storage on a reused value. This leaves
us with the original value safely in place, but with more storage
allocated when needed.
This doesn't consider storing local xattrs (values that don't require a
btree). Those only fail when the journal fails.
Case (1) is easy. We just remove the xattr we added. We leak the
storage because we can't safely remove it, but otherwise everything is
happy. We'll print a warning about the leak.
Case (4) is easy. We still have the original value in place. We can
just leave the extra storage attached to this xattr. We return the
error, but the old value is untouched. We print a warning about the
storage.
Case (2) and (3) are hard because we've lost the original values. In
the old code, we ended up with values that could be partially read.
That's not good. Instead, we just wipe the xattr entry and leak the
storage. It stinks that the original value is lost, but now there isn't
a partial value to be read. We'll print a big fat warning.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
|
|
ocfs2_xattr_ibody_set() is the only remaining user of
ocfs2_xattr_set_entry(). ocfs2_xattr_set_entry() actually does two
things: it calls ocfs2_xa_set(), and it initializes the inline xattrs.
Initializing the inline space really belongs in its own call.
We lift the initialization to ocfs2_xattr_ibody_init(), called from
ocfs2_xattr_ibody_set() only when necessary. Now
ocfs2_xattr_ibody_set() can call ocfs2_xa_set() directly.
ocfs2_xattr_set_entry() goes away.
Another nice fact is that ocfs2_init_dinode_xa_loc() can trust
i_xattr_inline_size.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
|
|
ocfs2_xattr_block_set() calls into ocfs2_xattr_set_entry() with just the
HAS_XATTR flag. Most of the machinery of ocfs2_xattr_set_entry() is
skipped. All that really happens other than the call to ocfs2_xa_set()
is making sure the HAS_XATTR flag is set on the inode.
But HAS_XATTR should be set when we also set di->i_xattr_loc. And
that's done in ocfs2_create_xattr_block(). So let's move it there, and
then ocfs2_xattr_block_set() can just call ocfs2_xa_set().
While we're there, ocfs2_create_xattr_block() can take the set_ctxt for
a smaller argument list. It also learns to set HAS_XATTR_FL, because it
knows for sure. ocfs2_create_empty_xatttr_block() in the reflink path
fakes a set_ctxt to call ocfs2_create_xattr_block().
Signed-off-by: Joel Becker <joel.becker@oracle.com>
|
|
ocfs2_xattr_set_in_bucket() doesn't need to do its own hacky space
checking. Let's let ocfs2_xa_prepare_entry() (via ocfs2_xa_set()) do
the more accurate work. Whenever it doesn't have space,
ocfs2_xattr_set_in_bucket() can try to get more space.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
|
|
ocfs2_xa_set() wraps the ocfs2_xa_prepare_entry()/ocfs2_xa_store_value()
logic. Both callers can now use the same routine. ocfs2_xa_remove()
moves directly into ocfs2_xa_set().
Signed-off-by: Joel Becker <joel.becker@oracle.com>
|
|
ocfs2_xa_prepare_entry() gets all the logic to add, remove, or modify
external value trees. Now, when it exits, the entry is ready to receive
a value of any size.
ocfs2_xa_remove() is added to handle the complete removal of an entry.
It truncates the external value tree before calling
ocfs2_xa_remove_entry().
ocfs2_xa_store_inline_value() becomes ocfs2_xa_store_value(). It can
store any value.
ocfs2_xattr_set_entry() loses all the allocation logic and just uses
these functions. ocfs2_xattr_set_value_outside() disappears.
ocfs2_xattr_set_in_bucket() uses these functions and makes
ocfs2_xattr_set_entry_in_bucket() obsolete. That goes away, as does
ocfs2_xattr_bucket_set_value_outside() and
ocfs2_xattr_bucket_value_truncate().
Signed-off-by: Joel Becker <joel.becker@oracle.com>
|
|
We're going to want to make sure our buffers get accessed and dirtied
correctly. So have the xa_loc do the work. This includes storing the
inode on ocfs2_xa_loc.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
|
|
We use the ocfs2_xattr_value_buf structure to manage external values.
It lets the value tree code do its work regardless of the containing
storage. ocfs2_xa_fill_value_buf() initializes a value buf from an
ocfs2_xa_loc entry.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
|
|
Previously the xattr code would send in a fake value, containing a tree
root, to the function that installed name+value pairs. Instead, we pass
the real value to ocfs2_xa_set_inline_value(), and it notices that the
value cannot fit. Thus, it installs a tree root.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
|