Commit Graph

53 Commits (38d8879baeb61b6946052739e7c03fa79b3a57f0)

Author SHA1 Message Date
Dan Williams 38d8879bae isci: combine request flags
Combine three bools into one unsigned long 'flags'.  Doesn't increase the
request size due to packing. (to do: optimize the structure layout).

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 04:04:51 -07:00
Dan Williams 312e0c2455 isci: unify can_queue tracking on the tci_pool, uplevel tag assignment
The tci_pool tracks our outstanding command slots which are also the 'index'
portion of our tags.  Grabbing the tag early in ->lldd_execute_task let's us
drop the isci_host_can_queue() and ->was_tag_assigned_by_user infrastructure.
->was_tag_assigned_by_user required the task context to be duplicated in
request-local buffer.  With the tci established early we can build the
task_context directly into its final location and skip a memcpy.

With the task context buffer at a known address at request construction we
have the opportunity/obligation to also fix sgl handling.  This rework feels
like it belongs in another patch but the sgl handling and task_context are too
intertwined.
1/ fix the 'ab' pair embedded in the task context to point to the 'cd' pair in
   the task context (previously we were prematurely linking to the staging
   buffer).
2/ fix the broken iteration of pio sgls that assumes all sgls are relative to
   the request, and does a dangerous looking reverse lookup of physical
   address to virtual address.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 04:04:51 -07:00
Jeff Skirvin 9274f45ea5 isci: Terminate dev requests on FIS err bit rx in NCQ
When the remote device transitions to a not-ready state because of
an NCQ error condition, all outstanding requests to that device
are terminated and completed to libsas on the normal path.  The
device then waits for a READ LOG EXT command to issue on the task
management path.

Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 04:04:51 -07:00
Dan Williams 086a0dabc5 isci: fix isci_task_execute_tmf completion
1/ fix the timeout for wait_for_completion_timeout
2/ In the tmf timeout case we need to wait for our termination callback
3/ Once the request is successfully started it will be freed according to the
   normal lifetime for requests.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 04:04:51 -07:00
Dan Williams f208826751 isci: kill isci_remote_device_change_state()
Now that "stopping/stopped" are one in the same and signalled by a NULL device
pointer the rest of the device status infrastructure can be removed (->status
and ->state_lock).  The "not ready for i/o state" is replaced with a state
flag, and is evaluated under scic_lock so that we don't see transients from
taking the device reference to submitting the i/o.

This also fixes a potential leakage of can_queue slots in the rare case that
SAS_TASK_ABORTED is set at submission.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 04:04:51 -07:00
Dan Williams 209fae14fa isci: atomic device lookup and reference counting
We have unsafe references to remote devices that are notified to
disappear at lldd_dev_gone.  In order to clean this up we need a single
canonical source for device lookups and stable references once a lookup
succeeds.  Towards that end guarantee that domain_device.lldd_dev is
NULL as soon as we start the process of stopping a device.  Any code
path that wants to safely lookup a remote device must do so through
task->dev->lldd_dev (isci_lookup_device()).

For in-flight references outside of scic_lock we need reference counting
to ensure that the device is not recycled before we are done with it.
Simplify device back references to just scic_sds_request.target_device
which is now the only permissible internal reference that is maintained
relative to the reference count.

There were two occasions where we wanted new i/o's to be treated as
SAS_TASK_UNDELIVERED but where the domain_dev->lldd_dev link is still
intact.  Introduce a 'gone' flag to prevent i/o while waiting for libsas
to take action on the port down event.

One 'core' leftover is that we currently call
scic_remote_device_destruct() from isci_remote_device_deconstruct()
which is called when the 'core' says the device is stopped.  It would be
more natural for the final put to trigger
isci_remote_device_deconstruct() but this implementation is deferred as
it requires other changes.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 04:04:51 -07:00
Dan Williams 0d0cf14c9b isci: cleanup request allocation
Rather than return an error code and update a pointer that was passed by
reference just return the request object directly (or null if allocation
failed).

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 04:04:51 -07:00
Dan Williams 980d3aeb38 isci: fix isci_terminate_pending() list management
Walk through the list of pending requests being careful to consider that
multiple requests can be terminated when the lock is dropped (i.e.
invalidating the 'next' reference established by
list_for_each_entry_safe).

Also noticed that all callers to isci_terminate_pending_requests()
specifying terminating, so just drop the parameter.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 04:04:50 -07:00
Jeff Skirvin 77c852f312 isci: Handle timed-out request terminations correctly
In the situation where a termination of an I/O times-out,
make sure that the linkage from the request to the task
is severed completely.  Also make sure that the selection
of tasks to terminate occurs under scic_lock.

Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 04:04:50 -07:00
Jeff Skirvin 61aaff49e2 isci: filter broadcast change notifications during SMP phy resets
When resetting a sata device in the domain we have seen occasions where
libsas prematurely marks a device gone in the time it takes for the
device to re-establish the link.  This plays badly with software raid
arrays.  Other libsas drivers have non-uniform delays in their reset
handlers to try to cover this condition, but not sufficient to close the
hole.  Given that a sata device can take many seconds to recover we
filter bcns and poll for the device reattach state before notifying
libsas that the port needs the domain to be rediscovered.  Once this has
been proven out at the lldd level we can think about uplevelling this
feature to a common implementation in libsas.

Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
[ use kzalloc instead of kmem_cache ]
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
[ use eventq and time macros ]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 04:04:50 -07:00
Jeff Skirvin ff717ab05f isci: Move the reset delay after the remote node resumption.
Delay after bringing up the RNC to allow for resumption latency.

Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 04:04:50 -07:00
Dave Jiang 8d2c65c09c isci: Removing unused variables compiler warnings
Newer gcc's are better at identifying "set, but not used" variables.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 04:04:50 -07:00
Edmund Nadolski 8db02da528 isci: remove isci_timer interface
Delete code which is no longer used.

Signed-off-by: Edmund Nadolski <edmund.nadolski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 04:04:50 -07:00
Edmund Nadolski fd18388bc5 isci: Remove tmf timeout_timer
Replace the timeout_timer in the isci_tmf with a call to
wait_for_completion_timeout

Signed-off-by: Edmund Nadolski <edmund.nadolski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 04:04:50 -07:00
Dan Williams f1f52e7593 isci: uplevel request infrastructure
* Consolidate tiny header files
* Move files out of core/ (drop core/scic_sds_ prefix)
* Merge core/scic_sds_request.[ch] into request.[ch]
* Cleanup request.c namespace (clean forward declarations and global
  namespace pollution)

Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 04:04:47 -07:00
Dan Williams cc9203bf38 isci: move core/controller to host
Now that the data structures are unified unify the implementation in
host.[ch] and cleanup namespace pollution.

Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 04:04:47 -07:00
Dan Williams ce2b3261b6 isci: unify constants
cross driver constants are spread out over multiple header files, consolidate
them into isci.h, and push some includes out to the source files that need
them.

TODO: remove SCI_MODE_SIZE infrastructure.
TODO: task.h is full of inlines that are too large

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 04:04:47 -07:00
Dan Williams 67ea838d17 isci: unify request data structures
Make scic_sds_request a proper member of isci_request.  Also let's us
get rid of the dma pool object size tracking since we now know that all
requests are sizeof(isci_request).  While cleaning up the construct
routine incidentally replaced SCI_FIELD_OFFSET with offsetof.

Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 04:04:47 -07:00
Dan Williams b7645818cf isci: make command/response iu explicit request object members
Final elimination of the anonymous data at the end of the request
structure.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 04:04:47 -07:00
Dan Williams 0d84366fbe isci: make sgl explicit/aligned request object member
Towards unifying request objects we need all members to be defined in the
object and not carved out of anonymous buffer space.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 04:04:47 -07:00
Dan Williams 827a84d4e0 isci: move stp request info to scic_sds_request
In preparation for unifying allocation of all request information make stp
data available in all requests.  Incidentally collapse indentation.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 04:04:47 -07:00
Artur Wojcik cc3dbd0a91 isci: unify isci_host data structures
Make it explicit that isci_host and scic_sds_controller are one in the same
object.

Signed-off-by: Artur Wojcik <artur.wojcik@intel.com>
[removed ->ihost back pointer]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 04:04:46 -07:00
Dan Williams d06b487b78 isci: implement I_T_nexus_reset
This is a requirement for 2.6.39's new libata eh.

Still some questions about lldd_dev_gone racing against dev->lldd_dev
lookups, but we are at least no more broken than mvsas in this regard.

We also short-circuit I_T_nexus_reset invocations from the device
discovery path (IDEV_EH similar to MVS_DEV_EH) to filter out the
resulting domain rediscoveries triggered by the reset.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 04:04:46 -07:00
Dave Jiang af5ae89350 isci: Convert of sci_ssp_response_iu to ssp_response_iu
Converting to Linux native format. However the isci driver does a lot of
the calculation based on the max size of this data structure and the
Linux data structure only has a pointer to the response data. Thus the
sizeof(struct ssp_response_iu) will be incorrect and we need to define
the max size.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 04:04:46 -07:00
Dave Jiang 0cfa890e5a isci: Fixup SSP command IU and task IU
Fixup of SSP command IU and SSP task IU to something that looks like Linux

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 04:04:46 -07:00
Dave Jiang f2f300806f isci: Convert SATA fis data structures to Linux native
Converting of sata_fis_reg_d2h to dev_to_host_fis
Converting of sata_fis_reg_h2d to host_to_dev_fis

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 04:04:45 -07:00
Maciej Patelczyk 890cae9b8a isci: Removed sci_base_object from scic_sds_request.
The 'struct sci_base_object' was removed from the struct
scic_sds_request and was replaced by a pointer to
struct isci_request.

Signed-off-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 04:00:38 -07:00
Dave Jiang d2d61433a8 isci: Remove excessive log noise with expander hot-unplug
We are logging excessive output when hot unplug from expander. Moving
that to debug.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 04:00:38 -07:00
Dan Williams a1a113b0a1 isci: kill smp_discover_response_protocols in favor of domain_device.dev_type
This is step 1 of removing the contortions to:
1/ unparse expander phy data into a smp discover frame
2/ open-code-parse the smp discover fram into a domain_device.dev_type equivalent

libsas has already spent cycles determining the dev_type, so now that
scic_sds_remote_device is unified with isci_remote_device we can
directly reference dev_type.

This might also change multi-level expander detection as we previously only
looked at dev_type == EDGE_DEV and we did not consider the FANOUT_DEV case.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 04:00:38 -07:00
Dan Williams 88f3b62ac1 isci: move remote_device handling out of the core
Now that the core/lldd remote_device data structures are nominally unified
merge the corresponding sources into the top-level directory.  Also move the
remote_node_context infrastructure which has no analog at the lldd level.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 04:00:38 -07:00
Dan Williams 57f20f4ed6 isci: unify remote_device data structures
Make it explicit that isci_remote_device and scic_sds_remote_device are
one in the same object.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 04:00:38 -07:00
Bartosz Barcinski 6cb4d6b382 isci: audit usage of BUG_ON macro in isci driver
Removes unnecessary usage of BUG_ON macro, excluding core directory.
In some cases macro is unnecesary, check is done in caller function.
In other cases macro is replaced by if construction with
appropriate warning.

Signed-off-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
[changed some survivable bug conditions to WARN_ONCE]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 04:00:37 -07:00
Bartosz Barcinski 467e855a03 isci: sparse warnings cleanup
Clean warnings and errors reported by sparse tool.

request.c:430:50: warning: mixing different enum types
remote_device.c:534:39: warning: symbol 'flags' shadows an earlier one
task.c:495:44: warning: mixing different enum types
scic_sds_controller.c:2155:24: warning: mixing different enum types
scic_sds_controller.c:2272:36: warning: mixing different enum types
scic_sds_controller.c:2911:38: warning: incorrect type in initializer (different address spaces)
scic_sds_controller.c:2913:25: warning: incorrect type in argument 2 (different address spaces)
scic_sds_request.c:875:34: warning: cast removes address space of expression
scic_sds_request.c:876:123: warning: incorrect type in argument 2 (different address spaces)
scic_sds_port.c:585:51: warning: incorrect type in assignment (different address spaces)
scic_sds_port.c:712:9: warning: incorrect type in argument 2 (different address spaces)
scic_sds_port.c:1770:25: warning: incorrect type in argument 2 (different address spaces)

Signed-off-by: Bartosz Barcinski <Bartosz.Barcinski@intel.com>
Signed-off-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
[fixed up some false positives and misconversions]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 04:00:37 -07:00
Christoph Hellwig c629582d0d isci: remove scic_controller state handlers
Remove the state handler indirections for the scic_controller, and replace
them with procedural calls that check for the correct state first.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 04:00:37 -07:00
Dan Williams 4393aa4e6b isci: fix fragile/conditional isci_host lookups
A domain_device can always reference back to ->lldd_ha unlike local lldd
structures.  Fix up cases where the driver uses local objects to look up the
isci_host.  This also changes the calling conventions of some routines to
expect a valid isci_host parameter rather than re-lookup the pointer on entry.

Incidentally cleans up some macros that are longer to type than the open-coded
equivalent:
  isci_host_from_sas_ha
  isci_dev_from_domain_dev

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 04:00:36 -07:00
Jeff Skirvin ed8a72d108 isci: Qualify when the host lock is managed for STP/SATA callbacks.
In the case of internal discovery related STP/SATA I/O started
through sas_execute_task the host lock is not taken by libsas before
calling lldd_execute_task, so the lock should not be managed before
calling back to libsas through task->task_done or sas_task_abort.

Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 04:00:36 -07:00
Jeff Skirvin f219f010a3 isci: Properly handle requests in the "aborting" state.
When a TMF times-out, the request is set back to "aborting".
Requests in the "aborting" state must be terminated when
LUN and device resets occur.

Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 04:00:36 -07:00
Dan Williams 35173d579a isci: namespacecheck cleanups
* mark needlessly global routines static
* delete unused functions
* move kernel-doc blocks from header files to source
* reorder some functions to delete declarations
* more default handler cleanups phy

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 04:00:36 -07:00
Dan Williams 1077a57410 isci: fix incorrect assumptions about task->dev and task->dev->port being NULL
A domain_device has the same lifetime as its related scsi_target.  The
scsi_target is reference counted based on outstanding commands,
therefore it is safe to assume that if we have a valid sas_task that the
->dev pointer is also valid.

The asd_sas_port of a domain_device has the same lifetime as the driver
so it can also never be NULL as long as the sas_task is valid and the
driver is loaded.

This also cleans up isci_task_complete_for_upper_layer(), renames it to
isci_task_refuse() and notices that the isci_completion_selection
parameter was set to isci_perform_normal_io_completion by all callers.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 03:55:30 -07:00
Dan Williams 50e7f9b5a9 isci: Errors in the submit path for SATA devices manage the ap lock.
Since libsas takes the domain device sata_dev.ap->lock before submitting
a task, error completions in the submit path for SATA devices must
unlock/relock when completing the sas_task back to libsas.

Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 03:55:30 -07:00
Jeff Skirvin 70957a94d7 isci: Fixed BUG_ON in isci_abort_task_process_cb callback.
The request may be in the "aborted" or the "completed" state when
performing a task management operation on it.

Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 03:55:30 -07:00
Jeff Skirvin c3f42feb0c isci: Fix TMF build for SAS/SATA LUN reset cases.
In the case where a SAS or SATA LUN reset TMF is built a NULL pointer
dereference occurred because of the (unused) callback data pointer.

Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Jacek Danecki <Jacek.Danecki@intel.com>
2011-07-03 03:55:30 -07:00
Jeff Skirvin 4dc043c410 isci: Termination handling cleanup, added termination timeouts.
Added a request "dead" state for use when a termination wait times-out.

isci_terminate_pending_requests now detaches the device's pending list
and terminates each entry on the detached list.

Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 03:55:30 -07:00
Jeff Skirvin cbb65c665b isci: Code review change for completion pointer cleanup.
Since the request structure contains a pointer to the completion to be
used if the request is being aborted or terminated, there is no reason
to pass the completion as a pointer to isci_terminate_request_core().

Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Jacek Danecki <Jacek.Danecki@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 03:55:30 -07:00
Jeff Skirvin f0846c6891 isci: Cleaning up task execute path.
Made sure the device ready check accounts for all states.
Moved the aborted task check into the loop of pulling task requests
off of the submitted list.

Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Jacek Danecki <Jacek.Danecki@intel.com>
[remove host and device starting state checks]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 03:55:30 -07:00
Jeff Skirvin 1fad9e934a isci: save the i/o tag outside the scic request structure.
The pointer to the core representation of a request is marked NULL at
completion, but we need to save the i/o tag for task management.

Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Jacek Danecki <Jacek.Danecki@intel.com>
[revise changelog]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 03:55:30 -07:00
Jeff Skirvin a5fde22536 isci: fix completion / abort path.
Corrected use of the request state_lock in the completion callback.

In the case where an abort (or reset) thread is trying to terminate an
I/O request, it sets the request state to "aborting" (or "terminating")
if the state is still "starting".  One of the bugs was to never set the
state to "completed".  Another was to not correctly recognize the
situation where the I/O had completed but the sas_task was still pending
callback to task_done - this was typically a problem in the LUN and
device reset cases.

It is now possible that we leave isci_task_abort_task() with
request->io_request_completion pointing to localy allocated
aborted_io_completion struct. It may result in a system crash.

Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Maciej Trela <Maciej.Trela@intel.com>
Signed-off-by: Jacek Danecki <Jacek.Danecki@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 03:55:29 -07:00
Jeff Skirvin 18d3d72a42 isci: isci_request_cleanup_completed_loiterer checks task before task_done
In the condition where outstanding I/Os are being cleaned from the device
requests in process list, the cleanup function needs to check that the
request is actually a sas-task and not a task management function.

Signed-off-by: Jeff Skirvin <jeffrey.d.skirvin@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 03:55:29 -07:00
Dan Williams 8acaec1593 isci: kill "host quiesce" mechanism
The midlayer is already throttling i/o in the places where host_quiesce
was trying to prevent further i/o to the device.  It's also problematic
in that it holds a lock over GFP_KERNEL allocations.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 03:55:29 -07:00
Dan Williams 3a97eec6d7 isci: remove sci_device_handle
It belies the fact that isci_remote_device and scic_sds_remote_device
are one in same object with the same lifetime rules.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2011-07-03 03:55:29 -07:00