linux

Commit Graph

Author	SHA1	Message	Date
Linus Torvalds	8d7a8fe2ce	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull ceph fixes from Sage Weil: "There is a pair of fixes for double-frees in the recent bundle for 3.10, a couple of fixes for long-standing bugs (sleep while atomic and an endianness fix), and a locking fix that can be triggered when osds are going down" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: rbd: fix cleanup in rbd_add() rbd: don't destroy ceph_opts in rbd_add() ceph: ceph_pagelist_append might sleep while atomic ceph: add cpu_to_le32() calls when encoding a reconnect capability libceph: must hold mutex for reset_changed_osds()	2013-06-12 08:28:19 -07:00
Alex Elder	3abef3b358	rbd: fix cleanup in rbd_add() Bjorn Helgaas pointed out that a recent commit introduced a use-after-free condition in an error path for rbd_add(). He correctly stated: I think `b536f69a3a` "rbd: set up devices only for mapped images" introduced a use-after-free error in rbd_add(): ... If rbd_dev_device_setup() returns an error, we call rbd_dev_image_release(), which ultimately kfrees rbd_dev. Then we call rbd_dev_destroy(), which references fields in the already-freed rbd_dev struct before kfreeing it again. The simple fix is to return the error code after the call to rbd_dev_image_release(). Closer examination revealed that there's no need to clean up rbd_opts in that function, so fix that too. Update some other comments that have also become out of date. Reported-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-17 12:50:10 -05:00
Alex Elder	7262cfca43	rbd: don't destroy ceph_opts in rbd_add() Whether rbd_client_create() successfully creates a new client or not, it takes responsibility for getting the ceph_opts structure it's passed destroyed. If successful, the structure becomes associated with the created client; if not, rbd_client_create() will destroy it. Previously, rbd_get_client() would call ceph_destroy_options() if rbd_get_client() failed, and that meant it got called twice. That led freeing various pointers more than once, which is never a good idea. This resolves: http://tracker.ceph.com/issues/4559 Cc: stable@vger.kernel.org # 3.8+ Reported-by: Dan van der Ster <dan@vanderster.com> Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-17 12:50:03 -05:00
Linus Torvalds	109c3c0292	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph fixes from Sage Weil: "Yes, this is a much larger pull than I would like after -rc1. There are a few things included: - a few fixes for leaks and incorrect assertions - a few patches fixing behavior when mapped images are resized - handling for cloned/layered images that are flattened out from underneath the client The last bit was non-trivial, and there is some code movement and associated cleanup mixed in. This was ready and was meant to go in last week but I missed the boat on Friday. My only excuse is that I was waiting for an all clear from the testing and there were many other shiny things to distract me. Strictly speaking, handling the flatten case isn't a regression and could wait, so if you like we can try to pull the series apart, but Alex and I would much prefer to have it all in as it is a case real users will hit with 3.10." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (33 commits) rbd: re-submit flattened write request (part 2) rbd: re-submit write request for flattened clone rbd: re-submit read request for flattened clone rbd: detect when clone image is flattened rbd: reference count parent requests rbd: define parent image request routines rbd: define rbd_dev_unparent() rbd: don't release write request until necessary rbd: get parent info on refresh rbd: ignore zero-overlap parent rbd: support reading parent page data for writes rbd: fix parent request size assumption libceph: init sent and completed when starting rbd: kill rbd_img_request_get() rbd: only set up watch for mapped images rbd: set mapping read-only flag in rbd_add() rbd: support reading parent page data rbd: fix an incorrect assertion condition rbd: define rbd_dev_v2_header_info() rbd: get rid of trivial v1 header wrappers ...	2013-05-15 13:36:19 -07:00
Alex Elder	638f5abed3	rbd: re-submit flattened write request (part 2) Add code to rbd_img_obj_exists_callback() to detect when a clone's parent image has disappeared, and re-submit the original write request in that case. Kill off some redundant assertions. This completes the resolution for: http://tracker.ceph.com/issues/3763 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-13 15:06:46 -05:00
Alex Elder	bbea1c1a31	rbd: re-submit write request for flattened clone Add code to rbd_img_parent_read_full_callback() to detect when a clone's parent image has disappeared, and re-submit the original write request in that case. (See the previous commit for more reasoning about why this is appropriate.) Rename some variables in rbd_img_obj_parent_read_full_callback() to match the convention used in the previous patch. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-13 15:06:45 -05:00
Alex Elder	02c74fbad9	rbd: re-submit read request for flattened clone If a clone image gets flattened while a parent read request is underway, the original rbd object request needs to be resubmitted. The reason is that by the time we get the response to the parent read request, the data read from the parent may be out of date. In other words, we could see this sequence of events: rbd client parent image/osd ---------- ---------------- original object ENOENT; issue parent read respond to parent read child image flattened original image header refresh <--- original object written independently here parent read response received Add code to rbd_img_parent_read_callback() to detect when a clone's parent image has disappeared (as evidenced by its parent overlap becoming 0), and re-submit the original read request in that case. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-13 15:06:45 -05:00
Alex Elder	392a9dad7e	rbd: detect when clone image is flattened A format 2 clone image can be the subject of a "flatten" operation, during which all of its data gets "copied up" from its parent image, leaving the image fully populated. Once this is complete, the clone's association with the parent is abolished. Since this can occur when a clone is mapped, we need to detect when it has occurred and handle it accordingly. We know an image has been flattened when we know it at one time had a parent, but we have learned (via a "get_parent" object class method call) it no longer has one. There might be in-flight requests at the point we learn an image has been flattened, so we can't simply clean up parent data structures right away. Instead, we'll drop the initial parent reference when the parent has disappeared (rather than when the image gets destroyed), which will allow the last in-flight reference to clean things up when it's complete. We leverage the fact that a zero parent overlap renders an image effectively unlayered. We set the overlap to 0 at the point we detect the clone image has flattened, which allows the unlayered behavior to take effect immediately, while keeping other parent structures in place until in-flight requests to complete. This and the next few patches resolve: http://tracker.ceph.com/issues/3763 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-13 15:06:45 -05:00
Alex Elder	a2acd00e79	rbd: reference count parent requests Keep a reference count for uses of the parent information for an rbd device. An initial reference is set in rbd_img_request_create() if the target image has a parent (with non-zero overlap). Each image request for an image with a non-zero parent overlap gets another reference when it's created, and that reference is dropped when the request is destroyed. The initial reference is dropped when the image gets torn down. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-13 15:06:44 -05:00
Alex Elder	e93f315235	rbd: define parent image request routines Define rbd_parent_request_create() and rbd_parent_request_destroy() to handle the creation of parent image requests submitted for layered image objects. For simplicity, let rbd_img_request_put() handle dropping the reference to any image request (parent or not), and call whichever destructor is appropriate on the last put. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-13 15:06:44 -05:00
Alex Elder	fb65d2284c	rbd: define rbd_dev_unparent() Define rbd_dev_unparent() to encapsulate cleaning up parent data structures from a layered rbd image. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-13 15:06:43 -05:00
Alex Elder	8785b1d487	rbd: don't release write request until necessary Previously when a layered write was going to involve a copyup request, the original osd request was released before submitting the parent full-object read. The osd request for the copyup would then be allocated in rbd_img_obj_parent_read_full_callback(). Shortly we will be handling the event of mapped layered images getting flattened, and when that occurs we need to resubmit the original request. We therefore don't want to release the osd request until we really konw we're going to replace it--in the callback function. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-13 15:06:43 -05:00
Alex Elder	642a25375f	rbd: get parent info on refresh Get parent info for format 2 images on every refresh (rather than just during the initial probe). This will be needed to detect the disappearance of the parent image in the event a mapped image becomes unlayered (i.e., flattened). Avoid leaking the previous parent spec on the second and subsequent times this information is requested by dropping the previous one (if any) before updating it. (Also, extract the pool id into a local variable before assigning it into the parent spec.) Switch to using a non-zero parent overlap value rather than the existence of a parent (a non-null parent_spec pointer) to determine whether to mark a request layered. It will soon be possible for a layered image to become unlayered while a request is in flight. This means that the layered flag for an image request indicates that there was a non-zero parent overlap at the time the image request was created. The parent overlap can change thereafter, which may lead to special handling at request submission or completion time. This and the next several patches are related to: http://tracker.ceph.com/issues/3763 NOTE: If an error occurs while refreshing the parent info (i.e., requesting it after initial probe), the old parent info will persist. This is not really correct, and is a scenario that needs to be addressed. For now we'll assert that the failure mode is unlikely, but the issue has been documented in tracker issue 5040. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-13 15:06:33 -05:00
Alex Elder	70cf49cfc7	rbd: ignore zero-overlap parent An rbd clone image that has an overlap with its parent of 0 is effectively not a layered image at all. Detect this case and treat such an image as non-layered. Issue a warning to be sure the user knows what's going on. This resolves: http://tracker.ceph.com/issues/5028 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-13 14:12:41 -05:00
Alex Elder	b91f09f17b	rbd: support reading parent page data for writes Currently, rbd_img_obj_parent_read_full() assumes the incoming object request contains bio data. But if a layered image is part of a multi-layer stack of images it will result in read requests of page data to parent images. This is handling the same kind of issue as was resolved by this commit: `5b2ab72d` rbd: support reading parent page data This resolves: http://tracker.ceph.com/issues/5027 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-13 14:12:40 -05:00
Alex Elder	ebda6408f2	rbd: fix parent request size assumption The code that reads object data from the parent for a copyup on write request currently assumes that the size of that request is the size of a "full" object from the original target image. That is not necessarily the case. The parent overlap could reduce the request size below that. To fix that assumption we need to record the number of pages in the copyup_pages array, for both an image request and an object request. Rename a local variable in rbd_img_obj_parent_read_full_callback() to reflect we're recording the length of the parent read request, not the size of the target object. This resolves: http://tracker.ceph.com/issues/5038 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-13 14:09:01 -05:00
Alex Elder	c48f3f86e2	rbd: kill rbd_img_request_get() Get rid of rbd_img_request_get(), because it isn't used, and maybe won't ever be needed. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-08 20:17:00 -05:00
Alex Elder	1f3ef78861	rbd: only set up watch for mapped images Any changes to parent images are immaterial to any mapped clone. So there is no need to have a watch event registered on header objects except for the header object of an image that is mapped. In fact, a watch request is a write operation, and we may only have read access to a parent image. We can't set up the watch request until we know the name of the header object though. So pass a flag to rbd_dev_image_probe() to indicate whether this probe is for a mapping or for a parent image. Change the second parameter to rbd_dev_header_watch_sync() be Boolean while we're at it. This resolves: http://tracker.ceph.com/issues/4941 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-08 20:16:55 -05:00
Alex Elder	7ce4eef7b5	rbd: set mapping read-only flag in rbd_add() The rbd_dev->mapping field for a parent image is not meaningful. Since rbd_image_probe() is used both for images being mapped and their parents, it doesn't make sense to set that flag in that function. So move the setting of the mapping.read_only flag out of rbd_dev_image_probe() and into rbd_add() instead. This resolves: http://tracker.ceph.com/issues/4940 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-08 20:16:50 -05:00
Alex Elder	5b2ab72d36	rbd: support reading parent page data Currently, rbd_img_parent_read() assumes the incoming object request contains bio data. But if a layered image is part of a multi-layer stack of images it will result in read requests of page data to parent images. Fortunately, it's not hard to add support for page data. This resolves: http://tracker.ceph.com/issues/4939 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-08 20:16:25 -05:00
Alex Elder	91c6febb38	rbd: fix an incorrect assertion condition In rbd_img_obj_parent_read_full_callback() there is an assertion intended to verify the size of the image request for a full parent read was the size of the original request's target object. But assertion was looking at the parent image order rather than the original one, and these values can differ. Fix that. This resolves: http://tracker.ceph.com/issues/4938 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-08 20:16:10 -05:00
Alex Elder	2df3fac758	rbd: define rbd_dev_v2_header_info() This rearranges rbd_dev_v2_refresh() so it works more like rbd_dev_v1_header_info(). While format 1 images need to read the whole header object to get any information, format 2 can collect almost all information selectively. So the one-time initialization will remain in a separate function--based on rbd_dev_v2_probe(). Rename rbd_dev_v2_refresh() to be rbd_dev_v2_header_info(), and have it call rbd_dev_v2_header_onetime() if it's being called for the first time for the given rbd device. Rename rbd_dev_v2_probe() to be rbd_dev_v2_header_onetime() and remove the image size and snapshot context calls it held in common with the refresh function. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-08 17:00:52 -05:00
Alex Elder	99a41ebcee	rbd: get rid of trivial v1 header wrappers Get rid of the trivial wrapper functions rbd_dev_v1_refresh() and rbd_dev_v1_probe(), substituting rbd_dev_v1_header_read() calls in their place. Rename rbd_dev_v1_header_read() to be rbd_dev_v1_header_info(), to be more generic (it will better reflect what happens with format 2 images). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-08 17:00:46 -05:00
Alex Elder	30d60ba2f2	rbd: simplify rbd_dev_v1_probe() An rbd_dev structure's fields are all zero-filled for an initial probe, so there's no need to explicitly zero the parent_spec and parent_overlap fields in rbd_dev_v1_probe(). Removing these assignments makes rbd_dev_v1_probe() almost trivial. Move the dout() message that announces discovery of an image into rbd_dev_image_probe(), generalize to support images in either format and only show it if an image is fully discovered. This highlights that are some unnecessary cleanups in the error path for rbd_dev_v1_probe(), so they can be removed. Now rbd_dev_v1_probe() is a trivial wrapper function. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-08 17:00:41 -05:00
Alex Elder	662518b128	rbd: update in-core header directly Now that rbd_header_from_disk() only fills in one-time fields once, we can extend it slightly so it releases the other fields before replacing their values. This way there's no need to pass a temporary buffer and then copy all the results in. Just use the rbd device header structure in rbd_header_from_disk() so its values get updated directly. Note that this means we need to take the header semaphore at the point we update things. So pass the rbd_dev rather than the address of its header as its first argument to rbd_header_from_disk(), and have it return an error code. As a result, rbd_dev_v1_header_read() does all the work, rbd_read_header() becomes unnecessary, and rbd_dev_v1_refresh() becomes a very simple wrapper. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-08 17:00:37 -05:00
Alex Elder	bb23e37acb	rbd: refactor rbd_header_from_disk() This rearranges rbd_header_from_disk so that it: - allocates the snapshot context right away - keeps results in local variables, not changing the passed-in header until it's known we'll succeed - does initialization of set-once fields in a header only if they have not already been set The last point is moot at the moment, because rbd_read_header() (the only caller) always supplies a zero-filled header buffer. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-08 17:00:33 -05:00
Alex Elder	46578dcdca	rbd: zero format 1 header structure earlier The passed-in header structure is zeroed in rbd_header_from_disk(). Instead, have the caller do it. Note that there are two callers, rbd_dev_v1_refresh() and rbd_dev_v1_probe(). The latter already has a zeroed header structure so zeroing it isn't necessary there. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-08 17:00:28 -05:00
Alex Elder	f35a4dee14	rbd: set the mapping size and features later Defer setting the size and features fields of a mapped image until after the Linux disk structure is set up. Set the capacity of the disk after that. Rearrange the definition of rbd_image_header, separating the fields that are set only once from those that can be updated. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-08 17:00:00 -05:00
Linus Torvalds	4de13d7aa8	Merge branch 'for-3.10/core' of git://git.kernel.dk/linux-block Pull block core updates from Jens Axboe: - Major bit is Kents prep work for immutable bio vecs. - Stable candidate fix for a scheduling-while-atomic in the queue bypass operation. - Fix for the hang on exceeded rq->datalen 32-bit unsigned when merging discard bios. - Tejuns changes to convert the writeback thread pool to the generic workqueue mechanism. - Runtime PM framework, SCSI patches exists on top of these in James' tree. - A few random fixes. * 'for-3.10/core' of git://git.kernel.dk/linux-block: (40 commits) relay: move remove_buf_file inside relay_close_buf partitions/efi.c: replace useless kzalloc's by kmalloc's fs/block_dev.c: fix iov_shorten() criteria in blkdev_aio_read() block: fix max discard sectors limit blkcg: fix "scheduling while atomic" in blk_queue_bypass_start Documentation: cfq-iosched: update documentation help for cfq tunables writeback: expose the bdi_wq workqueue writeback: replace custom worker pool implementation with unbound workqueue writeback: remove unused bdi_pending_list aoe: Fix unitialized var usage bio-integrity: Add explicit field for owner of bip_buf block: Add an explicit bio flag for bios that own their bvec block: Add bio_alloc_pages() block: Convert some code to bio_for_each_segment_all() block: Add bio_for_each_segment_all() bounce: Refactor __blk_queue_bounce to not use bi_io_vec raid1: use bio_copy_data() pktcdvd: Use bio_reset() in disabled code to kill bi_idx usage pktcdvd: use bio_copy_data() block: Add bio_copy_data() ...	2013-05-08 10:13:35 -07:00
Alex Elder	51344a38ba	rbd: always set read-only flag in rbd_add() Hold off setting the read-only flag in rbd_add() for an image being mapped until we have successfully probed the image. At that point we know whether it's a snapshot mapping or not, so we can set the read-only flag in that one place rather than doing so (for snapshots) in rbd_dev_mapping_set(). To do this, pass a flag to the image probe routine indicating whether we want a read-only mapping. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-08 07:48:12 -05:00
Alex Elder	6d80b130d5	rbd: kill rbd_dev_clear_mapping() This function is a duplicate of rbd_dev_mapping_clear(), and was added by mistake. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-08 07:48:12 -05:00
Alex Elder	8f4b7d9821	rbd: don't look up snapshot id in rbd_dev_mapping_set() Currently rbd_dev_mapping_set() looks up the snapshot id for the snapshot whose name is found in the rbd device's spec structure. That function gets called by rbd_dev_device_setup(), which is called by rbd_add() after rbd_dev_image_probe(). If the image probe succeeds, the rbd device's spec will already have been updated to include names and ids for all fields. Therefore there's no need to look up the snapshot id in rbd_dev_mapping_set(). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-08 07:48:11 -05:00
Alex Elder	c734b79655	rbd: don't print warning if not mapping a parent The presence of the LAYERING bit in an rbd image's feature mask does not guarantee the image actually has a parent image. Currently that bit is set only when a clone (i.e., image with a parent) is created, but it is (currently) not cleared if that clone gets flattened back into a "normal" image. A "parent_id" query will leave the parent_spec for the image being mapped a null pointer, but will not return an error. Currently, whenever an image with the LAYERED feature gets mapped, a warning about the use of layered images gets printed. But we don't want to do this for a flattened image, so print the warning only if we find there is a parent spec after the probe. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-08 07:48:11 -05:00
Alex Elder	29334ba49c	rbd: kill rbd_update_mapping_size() Since rbd_update_mapping_size() is now a trivial wrapper, just open code it in its two callers. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-08 07:45:39 -05:00
Alex Elder	00a653e216	rbd: update capacity in rbd_dev_refresh() When a mapped image changes size, we change the capacity recorded for the Linux disk associated with it, in rbd_update_mapping_size(). That function is called in two places--the format 1 and format 2 refresh routines. There is no need to set the capacity while holding the header semaphore. Instead, do it in the common rbd_dev_refresh(), using the logic that's already there to initiate disk revalidation. Add handling in the request function, just in case a request that exceeds the capacity of the device comes in (perhaps one that was started before a refresh shrunk the device). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-08 07:45:30 -05:00
Alex Elder	e627db085e	rbd: revalidate only for mapping size changes This commit: `d98df63e` rbd: revalidate_disk upon rbd resize instituted a call to revalidate_disk() to notify interested parties that a mapped image has changed size. This works well, as long as the the rbd device doesn't map a snapshot. A snapshot will never change size. However, the base image the snapshot is associated with can, and it can do so while the snapshot is mapped. The problem is that the test for the size is looking at the size of the base image, not the size of the mapped snapshot. This patch corrects that. Update the warning message shown in the event of error, and move it into the callers. This resolves: http://tracker.ceph.com/issues/4911 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-08 07:40:48 -05:00
Alex Elder	49ece55428	rbd: fix leak of format 2 snapshot context When rbd_dev_v2_refresh() is called, the rbd device already has a snapshot context associated with it. But that never gets freed, the pointer just gets overwritten. Fix this by dropping the rbd device's reference to the snapshot context before overwriting the pointer. Because ceph_put_snap_context() already handles for a null pointer we don't need to check for that (for the probe case, where no context has yet been assigned). This resolves: http://tracker.ceph.com/issues/4912 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-08 07:38:30 -05:00
Linus Torvalds	292088ee03	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull more vfs updates from Al Viro: "A couple of fixes + getting rid of __blkdev_put() return value" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: proc: Use PDE attribute setting accessor functions make blkdev_put() return void block_device_operations->release() should return void mtd_blktrans_ops->release() should return void hfs: SMP race on directory close()	2013-05-07 15:14:53 -07:00
Al Viro	db2a144bed	block_device_operations->release() should return void The value passed is 0 in all but "it can never happen" cases (and those only in a couple of drivers) and it would've been lost on the way out anyway, even if something tried to pass something meaningful. Just don't bother. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-05-07 02:16:21 -04:00
Alex Elder	b5b09be30c	rbd: fix image request leak on parent read When a read for a layered image object finds the target object doesn't exist, a read image request for the parent image is created and submitted. When that completes, the callback routine was not releasing that parent image request. Fix that. The slab allocation stuff just added has greatly simplified the search for the source of this memory leak. This resolves: http://tracker.ceph.com/issues/4803 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-02 12:15:28 -05:00
Alex Elder	78c2a44aae	rbd: allocate image object names with a slab allocator The names of objects used for image object requests are always fixed size. So create a slab cache to manage them. Define a new function rbd_segment_name_free() to match rbd_segment_name() (which is what supplies the dynamically-allocated name buffer). This is part of: http://tracker.ceph.com/issues/3926 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-02 11:58:30 -05:00
Alex Elder	868311b1eb	rbd: allocate object requests with a slab allocator Create a slab cache to manage rbd_obj_request allocation. We aren't using a constructor, and we'll zero-fill object request structures when they're allocated. This is part of: http://tracker.ceph.com/issues/3926 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-02 11:58:30 -05:00
Alex Elder	f907ad5596	rbd: allocate name separate from obj_request The next patch will define a slab allocator for a object requests. To use that we'll need to allocate the name of an object separate from the request structure itself. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-02 11:58:29 -05:00
Alex Elder	1c2a9dfe21	rbd: allocate image requests with a slab allocator Create a slab cache to manage rbd_img_request allocation. Nothing too fancy at this point--we'll still initialize everything at allocation time (no constructor) This is part of: http://tracker.ceph.com/issues/3926 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-02 11:58:29 -05:00
Alex Elder	30d1cff817	rbd: use binary search for snapshot lookup Use bsearch(3) to make snapshot lookup by id more efficient. (There could be thousands of snapshots, and conceivably many more.) Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-02 11:58:17 -05:00
Alex Elder	15228ede7d	rbd: clear EXISTS flag if mapped snapshot disappears This functionality inadvertently disappeared in the last patch. Image snapshots can get removed at just about any time. In particular it can disappear even if it is in use by an rbd client as a mapped image. The rbd client deals with such a disappearance by responding to new requests with ENXIO. This is implemented by each rbd device maintaining an EXISTS flag, which is normally set but cleared if a snapshot disappears. This patch (re-)implements the clearing of that flag. Whenever mapped image header information is refreshed, if the mapping is for a snapshot, verify the mapped snapshot is still present in the updated snapshot context. If it is not, clear the flag. It is not necessary to check this in the initial probe, because the probe will not succeed if the snapshot doesn't exist. This resolves: http://tracker.ceph.com/issues/4880 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-02 11:57:03 -05:00
Alex Elder	33dca39f5c	rbd: kill off the snapshot list We no longer use the snapshot list for anything. When we need to look up a snapshot name, id, size, or feature mask, we just do it directly rather than relying on this list being updated with every refresh. The main reason it existed was for the benefit of the device/sysfs entries that previously were associated with snapshots. So get rid of the snapshot list, and struct rbd_snap, and the hundreds of lines of code that supported them. This resolves: http://tracker.ceph.com/issues/4868 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:20:22 -07:00
Alex Elder	2ad3d7167e	rbd: define rbd_snap_size() and rbd_snap_features() This patch defines a handful of new functions that will allow us to get rid of the rbd device structure's list of snapshots. Define rbd_snap_id_by_name() to look up a snapshot id given its name. This is efficient for format 1 images but not for format 2. Fortunately it only gets called at mapping time so it's not that critical. Use rbd_snap_id_by_name() to find out the id for a snapshot getting mapped, and pass that id to new functions rbd_snap_size() and rbd_snap_features() to look up information about a given snapshot's size and feature mask given its snapshot id. All this gets done in rbd_dev_mapping_set(). As a result, snap_by_name() is no longer needed, so get rid of it. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:20:20 -07:00
Alex Elder	54cac61fb6	rbd: use snap_id not index to look up snap info In order to align with what was needed for format 1 rbd images, rbd_dev_v2_snap_info() was set up to take as argument an index into the array of snapshot ids in a rbd device's snapshot context. This switches that around, so we pass the snapshot id instead. In doing this, rbd_snap_name() now returns a dynamically-allocated string rather than a fixed one, so there's no need to make a duplicate in its caller, rbd_dev_spec_update(). This means the following functions take a snapshot id where they previously used an index value: rbd_dev_snap_info() rbd_dev_v1_snap_info() rbd_dev_v2_snap_info() A new function, rbd_dev_snap_index(), determines the snap index for format 1 images and uses it to look up the name. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:20:19 -07:00
Alex Elder	9682fc6d3a	rbd: look up snapshot name in names buffer Rather than scanning the list of snapshot structures for it, scan the snapshot context buffer containing snapshot names in order to determine for a format 1 image the name associated with a given snapshot id. Pull out the part of rbd_dev_v1_snap_info() that does this scan into a new function, _rbd_dev_v1_snap_name(). Have that function return a dynamically-allocated copy of the name, and don't duplicate it in rbd_dev_v1_snap_info(). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>	2013-05-01 21:20:18 -07:00

1 2 3 4 5 ...

444 Commits (0dbc3463c8606e8122f3c8aea9815d0ef6e66bf9)