summaryrefslogtreecommitdiffstats
path: root/drivers/block
AgeCommit message (Collapse)AuthorFilesLines
2010-10-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6Linus Torvalds1-1/+1
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: (141 commits) USB: mct_u232: fix broken close USB: gadget: amd5536udc.c: fix error path USB: imx21-hcd - fix off by one resource size calculation usb: gadget: fix Kconfig warning usb: r8a66597-udc: Add processing when USB was removed. mxc_udc: add workaround for ENGcm09152 for i.MX35 USB: ftdi_sio: add device ids for ScienceScope USB: musb: AM35x: Workaround for fifo read issue USB: musb: add musb support for AM35x USB: AM35x: Add musb support usb: Fix linker errors with CONFIG_PM=n USB: ohci-sh - use resource_size instead of defining its own resource_len macro USB: isp1362-hcd - use resource_size instead of defining its own resource_len macro USB: isp116x-hcd - use resource_size instead of defining its own resource_len macro USB: xhci: Fix compile error when CONFIG_PM=n USB: accept some invalid ep0-maxpacket values USB: xHCI: PCI power management implementation USB: xHCI: bus power management implementation USB: xHCI: port remote wakeup implementation USB: xHCI: port power management implementation ... Manually fix up (non-data) conflict: the SCSI merge gad renamed the 'hw_sector_size' member to 'physical_block_size', and the USB tree brought a new use of it.
2010-10-22Merge branch 'for-2.6.37/barrier' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds9-84/+41
* 'for-2.6.37/barrier' of git://git.kernel.dk/linux-2.6-block: (46 commits) xen-blkfront: disable barrier/flush write support Added blk-lib.c and blk-barrier.c was renamed to blk-flush.c block: remove BLKDEV_IFL_WAIT aic7xxx_old: removed unused 'req' variable block: remove the BH_Eopnotsupp flag block: remove the BLKDEV_IFL_BARRIER flag block: remove the WRITE_BARRIER flag swap: do not send discards as barriers fat: do not send discards as barriers ext4: do not send discards as barriers jbd2: replace barriers with explicit flush / FUA usage jbd2: Modify ASYNC_COMMIT code to not rely on queue draining on barrier jbd: replace barriers with explicit flush / FUA usage nilfs2: replace barriers with explicit flush / FUA usage reiserfs: replace barriers with explicit flush / FUA usage gfs2: replace barriers with explicit flush / FUA usage btrfs: replace barriers with explicit flush / FUA usage xfs: replace barriers with explicit flush / FUA usage block: pass gfp_mask and flags to sb_issue_discard dm: convey that all flushes are processed as empty ...
2010-10-22Merge branch 'for-2.6.37/drivers' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds15-1372/+2390
* 'for-2.6.37/drivers' of git://git.kernel.dk/linux-2.6-block: (95 commits) cciss: fix PCI IDs for new Smart Array controllers drbd: add race-breaker to drbd_go_diskless drbd: use dynamic_dev_dbg to optionally log uuid changes dynamic_debug.h: Fix dynamic_dev_dbg() macro if CONFIG_DYNAMIC_DEBUG not set drbd: cleanup: change "<= 0" to "== 0" drbd: relax the grace period of the md_sync timer again drbd: add some more explicit drbd_md_sync drbd: drop wrong debug asserts, fix recently introduced race drbd: cleanup useless leftover warn/error printk's drbd: add explicit drbd_md_sync to drbd_resync_finished drbd: Do not log an ASSERT for P_OV_REQUEST packets while C_CONNECTED drbd: fix for possible deadlock on IO error during resync drbd: fix unlikely access after free and list corruption drbd: fix for spurious fullsync (uuids rotated too fast) drbd: allow for explicit resync-finished notifications drbd: preparation commit, using full state in receive_state() drbd: drbd_send_ack_dp must not rely on header information drbd: Fix regression in recv_bm_rle_bits (compressed bitmap) drbd: Fixed a stupid copy and paste error drbd: Allow larger values for c-fill-target. ... Fix up trivial conflict in drivers/block/ataflop.c due to BKL removal
2010-10-22Merge branch 'for-2.6.37/core' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds1-1/+0
* 'for-2.6.37/core' of git://git.kernel.dk/linux-2.6-block: (39 commits) cfq-iosched: Fix a gcc 4.5 warning and put some comments block: Turn bvec_k{un,}map_irq() into static inline functions block: fix accounting bug on cross partition merges block: Make the integrity mapped property a bio flag block: Fix double free in blk_integrity_unregister block: Ensure physical block size is unsigned int blkio-throttle: Fix possible multiplication overflow in iops calculations blkio-throttle: limit max iops value to UINT_MAX blkio-throttle: There is no need to convert jiffies to milli seconds blkio-throttle: Fix link failure failure on i386 blkio: Recalculate the throttled bio dispatch time upon throttle limit change blkio: Add root group to td->tg_list blkio: deletion of a cgroup was causes oops blkio: Do not export throttle files if CONFIG_BLK_DEV_THROTTLING=n block: set the bounce_pfn to the actual DMA limit rather than to max memory block: revert bad fix for memory hotplug causing bounces Fix compile error in blk-exec.c for !CONFIG_DETECT_HUNG_TASK block: set the bounce_pfn to the actual DMA limit rather than to max memory block: Prevent hang_check firing during long I/O cfq: improve fsync performance for small files ... Fix up trivial conflicts due to __rcu sparse annotation in include/linux/genhd.h
2010-10-22Merge branch 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bklLinus Torvalds5-1/+6
* 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl: vfs: make no_llseek the default vfs: don't use BKL in default_llseek llseek: automatically add .llseek fop libfs: use generic_file_llseek for simple_attr mac80211: disallow seeks in minstrel debug code lirc: make chardev nonseekable viotape: use noop_llseek raw: use explicit llseek file operations ibmasmfs: use generic_file_llseek spufs: use llseek in all file operations arm/omap: use generic_file_llseek in iommu_debug lkdtm: use generic_file_llseek in debugfs net/wireless: use generic_file_llseek in debugfs drm: use noop_llseek
2010-10-22Merge branch 'trivial' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bklLinus Torvalds26-155/+178
* 'trivial' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl: block: autoconvert trivial BKL users to private mutex drivers: autoconvert trivial BKL users to private mutex ipmi: autoconvert trivial BKL users to private mutex mac: autoconvert trivial BKL users to private mutex mtd: autoconvert trivial BKL users to private mutex scsi: autoconvert trivial BKL users to private mutex Fix up trivial conflicts (due to addition of private mutex right next to deletion of a version string) in drivers/char/pcmcia/cm40[04]0_cs.c
2010-10-22USB: storage: Use USB_ prefix instead of US_ prefixMichal Nazarewicz1-1/+1
This commit changes prefix for some of the USB mass storage class related macros (ie. USB_SC_ for subclass and USB_PR_ for class). Signed-off-by: Michal Nazarewicz <mina86@mina86.com> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-10-22xen-blkfront: disable barrier/flush write supportJens Axboe1-0/+7
The driver doesn't handle empty flushes. Disable barrier/flush write support until this is fixed up. Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-10-21Merge branch 'virtio' of ↵Linus Torvalds1-15/+2
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus * 'virtio' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: virtio_blk: remove BKL leftovers virtio: console: Disable lseek(2) for port file operations virtio: console: Send SIGIO in case of port unplug virtio: console: Send SIGIO on new data arrival on ports virtio: console: Send SIGIO to processes that request it for host events virtio: console: Reference counting portdev structs is not needed virtio: console: Add reference counting for port struct virtio: console: Use cdev_alloc() instead of cdev_init() virtio: console: Add a find_port_by_devt() function virtio: console: Add a list of portdevs that are active virtio: console: open: Use a common path for error handling virtio: console: remove_port() should return void virtio: console: Make write() return -ENODEV on hot-unplug virtio: console: Make read() return -ENODEV on hot-unplug virtio: console: Unblock poll on port hot-unplug virtio: console: Un-block reads on chardev close virtio: console: Check if portdev is valid in send_control_msg() virtio: console: Remove control vq data only if using multiport support virtio: console: Reset vdev before removing device
2010-10-21Merge branch 'for-linus' of ↵Linus Torvalds4-0/+1932
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (22 commits) ceph: do not carry i_lock for readdir from dcache fs/ceph/xattr.c: Use kmemdup rbd: passing wrong variable to bvec_kunmap_irq() rbd: null vs ERR_PTR ceph: fix num_pages_free accounting in pagelist ceph: add CEPH_MDS_OP_SETDIRLAYOUT and associated ioctl. ceph: don't crash when passed bad mount options ceph: fix debugfs warnings block: rbd: removing unnecessary test block: rbd: fixed may leaks ceph: switch from BKL to lock_flocks() ceph: preallocate flock state without locks held ceph: add pagelist_reserve, pagelist_truncate, pagelist_set_cursor ceph: use mapping->nrpages to determine if mapping is empty ceph: only invalidate on check_caps if we actually have pages ceph: do not hide .snap in root directory rbd: introduce rados block device (rbd), based on libceph ceph: factor out libceph from Ceph file system ceph-rbd: osdc support for osd call and rollback operations ceph: messenger and osdc changes for rbd ...
2010-10-21virtio_blk: remove BKL leftoversChristoph Hellwig1-15/+2
Remove the BKL usage added in "block: push down BKL into .locked_ioctl". Virtio-blk doesn't use the BKL for anything, and doesn't implement any ioctl command by itself, but only uses the generic scsi_cmd_ioctl which is fine without the BKL. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-10-20rbd: passing wrong variable to bvec_kunmap_irq()Dan Carpenter1-1/+1
We should be passing "buf" here insead of "bv". This is tricky because it's not the same as kmap() and kunmap(). GCC does warn about it if you compile on i386 with CONFIG_HIGHMEM. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Sage Weil <sage@newdream.net>
2010-10-20rbd: null vs ERR_PTRDan Carpenter1-2/+2
ceph_alloc_page_vector() returns ERR_PTR(-ENOMEM) on errors. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Sage Weil <sage@newdream.net>
2010-10-20block: rbd: removing unnecessary testYehuda Sadeh1-4/+0
rbd_get_segment() can't return a negative value, we don't need to check the return output. Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
2010-10-20block: rbd: fixed may leaksVasiliy Kulikov1-6/+8
rbd_client_create() doesn't free rbdc, this leads to many leaks. seg_len in rbd_do_op() is unsigned, so (seg_len < 0) makes no sense. Also if fixed check fails then seg_name is leaked. Signed-off-by: Vasiliy Kulikov <segooon@gmail.com> Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
2010-10-20rbd: introduce rados block device (rbd), based on libcephYehuda Sadeh4-0/+1934
The rados block device (rbd), based on osdblk, creates a block device that is backed by objects stored in the Ceph distributed object storage cluster. Each device consists of a single metadata object and data striped over many data objects. The rbd driver supports read-only snapshots. Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net> Signed-off-by: Sage Weil <sage@newdream.net>
2010-10-19cciss: fix PCI IDs for new Smart Array controllersMike Miller1-10/+12
cciss: fix PCI IDs for new controllers This patch fixes the botched up PCI IDs of new controllers. Please consider this patch for inclusion. Signed-off-by: Mike Miller <mike.miller@hp.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-10-19Merge branch 'v2.6.36-rc8' into for-2.6.37/barrierJens Axboe5-6/+20
Conflicts: block/blk-core.c drivers/block/loop.c mm/swapfile.c Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-10-15llseek: automatically add .llseek fopArnd Bergmann5-1/+6
All file_operations should get a .llseek operation so we can make nonseekable_open the default for future file operations without a .llseek pointer. The three cases that we can automatically detect are no_llseek, seq_lseek and default_llseek. For cases where we can we can automatically prove that the file offset is always ignored, we use noop_llseek, which maintains the current behavior of not returning an error from a seek. New drivers should normally not use noop_llseek but instead use no_llseek and call nonseekable_open at open time. Existing drivers can be converted to do the same when the maintainer knows for certain that no user code relies on calling seek on the device file. The generated code is often incorrectly indented and right now contains comments that clarify for each added line why a specific variant was chosen. In the version that gets submitted upstream, the comments will be gone and I will manually fix the indentation, because there does not seem to be a way to do that using coccinelle. Some amount of new code is currently sitting in linux-next that should get the same modifications, which I will do at the end of the merge window. Many thanks to Julia Lawall for helping me learn to write a semantic patch that does all this. ===== begin semantic patch ===== // This adds an llseek= method to all file operations, // as a preparation for making no_llseek the default. // // The rules are // - use no_llseek explicitly if we do nonseekable_open // - use seq_lseek for sequential files // - use default_llseek if we know we access f_pos // - use noop_llseek if we know we don't access f_pos, // but we still want to allow users to call lseek // @ open1 exists @ identifier nested_open; @@ nested_open(...) { <+... nonseekable_open(...) ...+> } @ open exists@ identifier open_f; identifier i, f; identifier open1.nested_open; @@ int open_f(struct inode *i, struct file *f) { <+... ( nonseekable_open(...) | nested_open(...) ) ...+> } @ read disable optional_qualifier exists @ identifier read_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; expression E; identifier func; @@ ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off) { <+... ( *off = E | *off += E | func(..., off, ...) | E = *off ) ...+> } @ read_no_fpos disable optional_qualifier exists @ identifier read_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; @@ ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off) { ... when != off } @ write @ identifier write_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; expression E; identifier func; @@ ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off) { <+... ( *off = E | *off += E | func(..., off, ...) | E = *off ) ...+> } @ write_no_fpos @ identifier write_f; identifier f, p, s, off; type ssize_t, size_t, loff_t; @@ ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off) { ... when != off } @ fops0 @ identifier fops; @@ struct file_operations fops = { ... }; @ has_llseek depends on fops0 @ identifier fops0.fops; identifier llseek_f; @@ struct file_operations fops = { ... .llseek = llseek_f, ... }; @ has_read depends on fops0 @ identifier fops0.fops; identifier read_f; @@ struct file_operations fops = { ... .read = read_f, ... }; @ has_write depends on fops0 @ identifier fops0.fops; identifier write_f; @@ struct file_operations fops = { ... .write = write_f, ... }; @ has_open depends on fops0 @ identifier fops0.fops; identifier open_f; @@ struct file_operations fops = { ... .open = open_f, ... }; // use no_llseek if we call nonseekable_open //////////////////////////////////////////// @ nonseekable1 depends on !has_llseek && has_open @ identifier fops0.fops; identifier nso ~= "nonseekable_open"; @@ struct file_operations fops = { ... .open = nso, ... +.llseek = no_llseek, /* nonseekable */ }; @ nonseekable2 depends on !has_llseek @ identifier fops0.fops; identifier open.open_f; @@ struct file_operations fops = { ... .open = open_f, ... +.llseek = no_llseek, /* open uses nonseekable */ }; // use seq_lseek for sequential files ///////////////////////////////////// @ seq depends on !has_llseek @ identifier fops0.fops; identifier sr ~= "seq_read"; @@ struct file_operations fops = { ... .read = sr, ... +.llseek = seq_lseek, /* we have seq_read */ }; // use default_llseek if there is a readdir /////////////////////////////////////////// @ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier readdir_e; @@ // any other fop is used that changes pos struct file_operations fops = { ... .readdir = readdir_e, ... +.llseek = default_llseek, /* readdir is present */ }; // use default_llseek if at least one of read/write touches f_pos ///////////////////////////////////////////////////////////////// @ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read.read_f; @@ // read fops use offset struct file_operations fops = { ... .read = read_f, ... +.llseek = default_llseek, /* read accesses f_pos */ }; @ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier write.write_f; @@ // write fops use offset struct file_operations fops = { ... .write = write_f, ... + .llseek = default_llseek, /* write accesses f_pos */ }; // Use noop_llseek if neither read nor write accesses f_pos /////////////////////////////////////////////////////////// @ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read_no_fpos.read_f; identifier write_no_fpos.write_f; @@ // write fops use offset struct file_operations fops = { ... .write = write_f, .read = read_f, ... +.llseek = noop_llseek, /* read and write both use no f_pos */ }; @ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier write_no_fpos.write_f; @@ struct file_operations fops = { ... .write = write_f, ... +.llseek = noop_llseek, /* write uses no f_pos */ }; @ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; identifier read_no_fpos.read_f; @@ struct file_operations fops = { ... .read = read_f, ... +.llseek = noop_llseek, /* read uses no f_pos */ }; @ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @ identifier fops0.fops; @@ struct file_operations fops = { ... +.llseek = noop_llseek, /* no read or write fn */ }; ===== End semantic patch ===== Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Julia Lawall <julia@diku.dk> Cc: Christoph Hellwig <hch@infradead.org>
2010-10-15drbd: add race-breaker to drbd_go_disklessLars Ellenberg1-0/+3
This adds a necessary race breaker to these commits: drbd: fix for possible deadlock on IO error during resync drbd: drop wrong debug asserts, fix recently introduced race What we do is get a refcount, check the state, then depending on the state and the requested minimum disk state, either hold it (success), or give it back immediately (failed "try lock"). Some code paths (flushing of drbd metadata) may still grab and hold a refcount even if we are D_FAILED (application IO won't). So even if we hit local_cnt == 0 once after being D_FAILED, we still need to wait for that again after we changed to D_DISKLESS. Once local_cnt reaches 0 while we are D_DISKLESS, we can be sure that no one will look at the protected members anymore, so only then is it safe to free them. We cannot easily convert to standard locking primitives here, as we want to be able to use it in atomic context (we always do a "try lock"), as well as hold references for a "long time" (from IO submission to completion callback). Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-15drbd: use dynamic_dev_dbg to optionally log uuid changesLars Ellenberg1-1/+31
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14drbd: cleanup: change "<= 0" to "== 0"Dan Carpenter1-1/+1
dt is unsigned so it's never less than zero. We are calculating the elapsed time, and that's never less than zero (unless there is a bug or we invent time travel). The comparison here is just to guard against divide by zero bugs. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
2010-10-14drbd: relax the grace period of the md_sync timer againLars Ellenberg1-2/+2
Consolidate the ifdef's for the debug level, accidentally the used both DEBUG and DRBD_DEBUG_MD_SYNC. Default to off. For production, we can safely reduce the grace period for this timer again the the value we used to have. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14drbd: add some more explicit drbd_md_syncLars Ellenberg1-0/+5
It sometimes may take a while for the after state change work to be scheduled, which does drbd_md_sync. At convenient places, we should do explicit drbd_md_sync to have the new state information on disk as soon as possible. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14drbd: drop wrong debug asserts, fix recently introduced raceLars Ellenberg1-9/+19
commit 2372c38caadeaebc68a5ee190782c2a0df01edc3 drbd: fix for possible deadlock on IO error during resync introduced a new ASSERT, which turns out to be wrong. Drop it. Also serialize the state change to D_DISKLESS with the after state change work of the -> D_FAILED transition, don't open a new race. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14drbd: cleanup useless leftover warn/error printk'sLars Ellenberg2-6/+1
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14drbd: add explicit drbd_md_sync to drbd_resync_finishedLars Ellenberg1-0/+2
As we usually update the generation UUIDs here, we should explicitly sync them to disk. So far this has been done only implicitly by related code paths. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14drbd: Do not log an ASSERT for P_OV_REQUEST packets while C_CONNECTEDPhilipp Reisner1-8/+22
This might happen if on the VERIFY_S node the disk gets dropped. Although this is an cluster wide state transition, the VERIFY_T node, updates it connection state first. Then the ack packet for the cluster wide state transition travels back, and the VERIFY_S node stops to produce the P_OV_REQUEST packets. There is absolutely nothing wrong with that. Further, do not log "Can not satisfy peer's..." on the VERIFY_S node in this case, but pretend that they had equal checksum. [Bugz 327] Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14drbd: fix for possible deadlock on IO error during resyncLars Ellenberg2-22/+54
Scenario: Something (say, flush-147:0) is in drbd_al_begin_io, holding a local_cnt, waiting for the resync to make progress. Disk fails, worker in after_state_ch does drbd_rs_cancel_all, then waits for local_cnt to drop to zero. flush-147:0 is woken by drbd_rs_cancel_all, needs to write an AL transaction, and queues that on the worker. Deadlock. Fix: do not wait in the worker, have put_ldev() trigger the state change D_FAILED -> D_DISKLESS when necessary. put_ldev() cannot do the state change directly, as it may or may not already hold various spinlocks. We queue a short work instead. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14drbd: fix unlikely access after free and list corruptionLars Ellenberg2-0/+32
Various cleanup paths have been incomplete, for the very unlikely case that we cannot allocate enough bios from process context when submitting on behalf of the peer or resync process. Never observed. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14drbd: fix for spurious fullsync (uuids rotated too fast)Lars Ellenberg2-11/+36
If it was an "empty" resync, the SyncSource may have already "finished" the resync and rotated the UUIDs, before noticing the connection loss (and generating a new uuid, if Primary, rotating again), while the SyncTarget did not change its uuids at all, or only got to the previous sync-uuid. This would then again lead to a full sync on next handshake (see also Bug #251). Fix: Use explicit resync finished notification even for empty resyncs, do not finish an empty resync implicitly on the SyncSource. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14drbd: allow for explicit resync-finished notificationsLars Ellenberg1-0/+34
Preparation patch so more drbd_send_state() usage on the peer will not confuse drbd in receive_state(). Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14drbd: preparation commit, using full state in receive_state()Lars Ellenberg1-21/+18
no functional change, just using full state instead of just the .conn part of it for comparisons. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14drbd: drbd_send_ack_dp must not rely on header informationLars Ellenberg3-8/+9
drbd commit 17c854fea474a5eb3cfa12e4fb019e46debbc4ec drbd: receiving of big packets, for payloads between 64kByte and 4GByte introduced a new on-the-wire packet header format. We must no longer assume either format, but use the result of whatever drbd_recv_header has decoded. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14drbd: Fix regression in recv_bm_rle_bits (compressed bitmap)Lars Ellenberg1-12/+15
We used to be16_to_cpu the length field in our received packet header. drbd commit 17c854fea474a5eb3cfa12e4fb019e46debbc4ec drbd: receiving of big packets, for payloads between 64kByte and 4GByte changed this, but forgot to adjust a few places where we relied on h->length being in native byte order. This broke the receiving side of the RLE compressed bitmap exchange. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14drbd: Fixed a stupid copy and paste errorPhilipp Reisner1-1/+1
This caused rs_planed to be not in sync with the content of the fifo. That in turn could cause that the resync comes to a complete halt. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14drbd: Allow larger values for c-fill-target.Philipp Reisner1-1/+1
Connections through a compressing proxy might have more bits on the fly. 500MByte instead of 50MByte Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14drbd: fix possible access after freeLars Ellenberg1-1/+3
If we release the page pointed to by md_io_tmpp, we need to zero out the pointer, too, as that may be used later to decide whether we need to allocate a new page again. Impact: a previously freed page may be used and clobbered. Depending on what that particular page is being used for meanwhile, this may result in silent data corruption of completely unrelated things. Only of concern on devices with logical_block_size != 512 byte, if you re-attach after becoming diskless once. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14drbd: protocol compatibility for maximum packet sizesLars Ellenberg2-3/+17
Two missing corner cases to the "maximum packet size" handshake. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14drbd: Track the reasons to suspend IO in dedicated state bitsPhilipp Reisner6-27/+52
There are three ways to get IO suspended: * Loss of any access to data * Fence-peer-handler running * User requested to suspend IO Track those in different bits, so that one condition clearing its state bit does not interfere with the other two conditions. Only when the user resumes IO he overrules all three bits. The fact is hidden from the user, he sees only a single suspend bit. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14drbd: DIV_ROUND_UP not needed hereLars Ellenberg1-1/+1
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14drbd: Fixed compatibility with protocol versions smaller than 95Philipp Reisner1-2/+3
Forgot to consider the max size for the resync requests. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14drbd: fix for spurious full sync (becoming sync target looked like invalidate)Lars Ellenberg1-2/+3
If a synctarget lost connection while being WFSyncUUID, due to "state sanitizing", the attempted state change to SyncTarget looked like an "invalidate" to after_state_ch() later, thus caused a full sync on next handshake (Bug #318). drbd0: PingAck did not arrive in time. drbd0: peer( Primary -> Unknown ) conn( WFSyncUUID -> NetworkFailure ) pdsk( UpToDate -> DUnknown ) from : { cs:NetworkFailure ro:Secondary/Unknown ds:UpToDate/DUnknown r--- } to : { cs:SyncTarget ro:Secondary/Unknown ds:Inconsistent/DUnknown r--- } after sanizising, resulted in state: { cs:NetworkFailure ro:Secondary/Unknown ds:Inconsistent/DUnknown r--- } drbd0: disk( UpToDate -> Inconsistent ) Fix: don't mask state transition errors in "sanitizing", so the requested state change to SyncTarget fails, instead of being implicitly "remaped" to invalidate. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14drbd: cosmetic, don't report resync for online-verifyLars Ellenberg1-5/+7
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14drbd: fix spurious protocol errorLars Ellenberg1-1/+2
If we cannot satisfy a request (because our disk just broke), we still need to drain the payload. Or we'll get a protocol error when interpreting the payload as DRBD packet header. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14drbd: fix potential kernel BUG (NULL deref)Lars Ellenberg2-6/+19
BUG trace would look like: lc_find drbd_rs_complete_io got_OVResult drbd_asender Could be triggered by explicit, or IO-error policy based, detach during online-verify. We may only dereference mdev->resync, if we first get_ldev(), as the disk may break any time, causing mdev->resync to disappear once all ldev references have been returned. Already in flight online-verify requests or replies may still come in, which we then need to ignore. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14drbd: don't count sendpage()d pages only referenced by tcp as in useLars Ellenberg4-12/+27
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14drbd: Adding support for BIO/Request flags: REQ_FUA, REQ_FLUSH and REQ_DISCARDPhilipp Reisner3-22/+34
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14drbd: drbd_md_sync before calling user space helpersLars Ellenberg1-0/+4
Just in case we have some pending meta data changes to sync, do it before we call our userland helper, as that may take some time, or even cause a hard reboot. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-10-14drbd: fix race on meta-data update, addendumLars Ellenberg2-5/+31
addendum to baa33ae4eaa4477b60af7c434c0ddd1d182c1ae7 The race: drbd_md_sync() if (!test_and_clear_bit(MD_DIRTY, &mdev->flags)) return; ==> RACE with drbd_md_mark_dirty() rearming the timer. del_timer(&mdev->md_sync_timer); Fixed by moving the del_timer before the test_and_clear_bit. Additionally only rearm the timer in drbd_md_mark_dirty, if MD_DIRTY was not already set, reduce the grace period from five to one second, and add an ifdef'ed debuging aid to find code paths missing an explicit drbd_md_sync, if any, as those are the only relevant ones for this race. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>