Discussion:
[Bug 261387] Should cam be calling callout drain for struct cam_sim and struct cam_ed?
(too old to reply)
b***@freebsd.org
2022-01-25 15:38:06 UTC
Permalink
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=261387

Mark Johnston <***@FreeBSD.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
Status|New |Open
CC| |***@FreeBSD.org
Assignee|***@FreeBSD.org |***@FreeBSD.org

--- Comment #1 from Mark Johnston <***@FreeBSD.org> ---
Looks like the cam_sim callout has been removed:
https://cgit.freebsd.org/src/commit/?id=28027f28e607012b83ee9062eed3b8ed82e819c1

I think you're right that it should have been drained, but it's somewhat moot
since it apparently wasn't used.

The cam_ed callout is a bit more complicated: it's interlocked by the device's
SIM's devq lock, dev->sim->devq->send_mtx. So the callout handler,
xpt_release_devq_timeout(), runs with send_mtx held, as does
xpt_release_device() when it stops the callout. The interlock is mostly
sufficient to ensure that the callout does not continue running. It's not
completely sufficient, though, since xpt_release_devq_timeout() may drop the
send_mtx lock to schedule I/O (presumably from a different device). During
this window, xpt_release_device() may try to stop the callout, fail, and
continue destroying the device, while the callout is still running.

It's not very clear whether it's ok to free a callout while the handler is
running. At that point the callout subsystem no longer references the callout,
and xpt_release_devq_timeout() presumably won't access a "doomed" device after
dropping the send lock, since it's shouldn't appear in the devq anymore.

So I suspect that it's mostly a theoretical problem, though the callout
documentation really doesn't make this clear.
--
You are receiving this mail because:
You are the assignee for the bug.

--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de
b***@freebsd.org
2022-01-27 17:54:43 UTC
Permalink
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=261387

--- Comment #2 from ***@stratus.com ---
It's good that the callout was removed from the sim. Calling callout_drain
from that removal routine would have been scary...

I agree on the mostly theoretical aspect; however, on systems with huge caches
and many cores, theory can become practical on rare occasions.
--
You are receiving this mail because:
You are the assignee for the bug.

--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de
b***@freebsd.org
2022-01-28 15:38:09 UTC
Permalink
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=261387

--- Comment #3 from Mark Johnston <***@FreeBSD.org> ---
(In reply to Herbie.Robinson from comment #2)
"theoretical" was a bad word choice. I don't mean that this becomes a problem
with sufficiently large timing delays or anything like that. (At least, I
can't see how it does.) Rather, this code seemingly violates the implicit
contract that consumers have with the callout subsystem, that callout
structures shouldn't be freed while the callout handler is running. In fact, I
believe this is safe to do in this particular case. softclock_call_cc(), the
function that actually invokes the callout handler, does not access the callout
structure after calling the handler except in two cases which do not apply to
the cam_ed callout.

So while it's formally correct to drain the callout before destroying the
device in cam_destroy_device(), I can't see how it would fix any observed
problems. To be clear, I don't disagree with the patch.
--
You are receiving this mail because:
You are the assignee for the bug.

--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de
Loading...