port of linux patch for hyperv storvsc -- works but how?

Scott Long

2019-11-13 16:04:16 UTC

Post by Andriy Gapon
https://reviews.freebsd.org/D22312
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=40630f462824ee24bc00d692865c86c3828094e0
The change is quite simple, but I have a hard time understanding how it works.
ccb->csio.scsi_status = (vm_srb->scsi_status & 0xFF);
- ccb->csio.resid = ccb->csio.dxfer_len - vm_srb->transfer_len;
+ if (srb_status == SRB_STATUS_SUCCESS ||
+ srb_status == SRB_STATUS_DATA_OVERRUN)
+ ccb->csio.resid = ccb->csio.dxfer_len - vm_srb->transfer_len;
+ else
+ ccb->csio.resid = ccb->csio.dxfer_len;
My motivation for porting that change is a report from a user that they are
getting a lot of zpool checksum errors while concurrently running zpool scrub
and iozone -l 1 -s 100G on a pool created on top of storvsc devices.
The user reports that the change fixes the problem.
I have not been able to analyze the earlier errors in details, so I am not sure
if they were getting any SRB_STATUS_DATA_OVERRUN errors. Also, I don't know
what kind of sense data was provided for errors.
I do know that they were getting a lot of these messages duirng the test when
storvsc scsi_status = 2, srb_status = 4
scsi_status = 2 is SCSI_STATUS_CHECK_COND
srb_status = 4 is SRB_STATUS_ERROR
So, nothing unusual.
I would like to understand how the change can possibly make a difference.
As far as I can tell, csio.resid is completely ignored if a ccb fails.

The csio.resid is passed to bp->bio_resid in dadone() in the event of an error. This
in turn is passed up to callers, and probably informs ZFS on how much data did not
actually make it to the device in the event of an error (I’m not going to dig into that
right now, though).

Post by Andriy Gapon
Also, as far as I can see, it's ignored for ccb retries as well.
That is, we never modify dxfer_len.

Disk I/O needs to be atomic and idempotent; if we tried to restart partially failed
transactions, we might race another process that is doing I/O to the same area, and
get merged results. Thus, we always retry full transactions.

Post by Andriy Gapon
And if the ccb succeeds then resid would be set the same with the old and new code.

Presumably this will be “0”, which is good.

Post by Andriy Gapon
The only possibility I could come up with is the ccb's sense data having
SSD_KEY_RECOVERED_ERROR. In that case, the ccb would be considered successful.
But I am not sure even about that case.

Yeah, there’s a lot more logic that is needed, and i think the problem is that the
hyperV model doesn’t provide a useful resid value, forcing us to guess at what it
should be.

I think that SRB_STATUS_DATA_UNDERRUN might also require special handling.

Scott