Discussion:
[Bug 212841] getting panic during mps reinitialization.
b***@freebsd.org
2024-10-13 03:56:51 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=212841

Mark Linimon <***@FreeBSD.org> changed:

What     |Removed |Added
----------------------------------------------------------------------------
Keywords |patch   |crash
--
You are receiving this mail because:
You are on the CC list for the bug.

--
Posted automagically by a mail2news gateway at muc.de e.V.
Please direct questions, flames, donations, etc. to news-***@muc.de
b***@freebsd.org
2024-10-13 21:38:43 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=212841

--- Comment #12 from commit-***@FreeBSD.org ---
A commit in branch main references this bug:

URL:
https://cgit.FreeBSD.org/src/commit/?id=c0e0e530ced057502f51d7a6086857305e08fae0

commit c0e0e530ced057502f51d7a6086857305e08fae0
Author: prateek sethi <***@gmail.com>
AuthorDate: 2024-10-13 18:38:54 +0000
Commit: Warner Losh <***@FreeBSD.org>
CommitDate: 2024-10-13 21:38:01 +0000

mps/mpr: Add workaround for firmware not responding to IOC_FACTS or
IOC_INIT

Sometimes, especially with older firmware, mps(4) would have trouble
initializing the card in one of these two steps. Add in a retry after a
short delay. Sean Bruno and Stephen McConnell thought this was OK in the
bug discussions, but never committed it. Steve indicated the delay
might not be necessary, but the OP clearly needed to make it longer to
make things work. I've kept the delay, and added the suggested comment.

Ported the iocfacts part to mpr as well, since we see similar errors
about once every month or two over a few thousand controllers at
work. We've not seen it with IOC_INIT as far back as I can query the
error log database, so I didn't port that forward. We'll see if this
helps, but won't know for sure until next year (so I'm committing it now
since it won't hurt and might help). We usually see this failure in
connection with complicated recovery operations with a drive that's
failing, though, at least in the last year's worth of failures. It's
not clear this is the same as OP or not.

PR: 212841
Sponsored by: Netflix
Co-authored-by: imp

sys/dev/mpr/mpr.c | 23 ++++++++++++++++++-----
sys/dev/mps/mps.c | 38 ++++++++++++++++++++++++++++++--------
2 files changed, 48 insertions(+), 13 deletions(-)
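
The retry-after-a-short-delay idea described in the commit message can be
sketched roughly as follows. This is a userland illustration only, not the
code that was committed to mps(4) or mpr(4): the names retry_with_delay,
ioc_request_fn, fake_ioc and fake_get_iocfacts are hypothetical, the attempt
count and delay are arbitrary parameters, and a real driver would use a
kernel delay primitive rather than nanosleep().

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* for nanosleep() */
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical request callback: returns 0 on success, an errno value on failure. */
typedef int (*ioc_request_fn)(void *ctx);

/*
 * Call req() up to max_attempts times, sleeping delay_ms milliseconds
 * between failed attempts. Returns 0 on the first success, otherwise the
 * error from the last attempt.
 */
static int
retry_with_delay(ioc_request_fn req, void *ctx, int max_attempts, long delay_ms)
{
	struct timespec ts;
	int attempt, error;

	assert(req != NULL && max_attempts > 0 && delay_ms >= 0);
	ts.tv_sec = delay_ms / 1000;
	ts.tv_nsec = (delay_ms % 1000) * 1000000L;

	error = EIO;
	for (attempt = 1; attempt <= max_attempts; attempt++) {
		error = req(ctx);
		if (error == 0)
			return (0);
		/* Give the (simulated) firmware time to settle before retrying. */
		if (attempt < max_attempts)
			nanosleep(&ts, NULL);
	}
	return (error);
}

/* Stand-in for firmware that fails a fixed number of times, then answers. */
struct fake_ioc {
	int failures_left;
	int calls;
};

/* Hypothetical request: fails with ETIMEDOUT until failures_left reaches 0. */
static int
fake_get_iocfacts(void *ctx)
{
	struct fake_ioc *ioc = ctx;

	ioc->calls++;
	if (ioc->failures_left > 0) {
		ioc->failures_left--;
		return (ETIMEDOUT);
	}
	return (0);
}
```

With a fake_ioc that fails once, retry_with_delay(fake_get_iocfacts, &ioc, 2, 1)
returns 0 on the second call; with one that keeps failing, it gives up after
max_attempts and returns the last error.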

b***@freebsd.org
2024-10-13 22:35:57 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=212841

Warner Losh <***@FreeBSD.org> changed:

What  |Removed |Added
----------------------------------------------------------------------------
Flags |        |mfc-stable14+, mfc-stable13+
CC    |        |***@FreeBSD.org

--- Comment #13 from Warner Losh <***@FreeBSD.org> ---
I neglected to flag this as MFC After, so tagged in bug.

b***@freebsd.org
2024-10-16 14:32:37 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=212841

--- Comment #14 from commit-***@FreeBSD.org ---
A commit in branch stable/14 references this bug:

URL:
https://cgit.FreeBSD.org/src/commit/?id=b57229a80465b390a31fb30b3127c4cb93d8987f

commit b57229a80465b390a31fb30b3127c4cb93d8987f
Author: prateek sethi <***@gmail.com>
AuthorDate: 2024-10-13 18:38:54 +0000
Commit: Warner Losh <***@FreeBSD.org>
CommitDate: 2024-10-16 14:19:21 +0000

mps/mpr: Add workaround for firmware not responding to IOC_FACTS or
IOC_INIT

Sometimes, especially with older firmware, mps(4) would have trouble
initializing the card in one of these two steps. Add in a retry after a
short delay. Sean Bruno and Stephen McConnell thought this was OK in the
bug discussions, but never committed it. Steve indicated the delay
might not be necessary, but the OP clearly needed to make it longer to
make things work. I've kept the delay, and added the suggested comment.

Ported the iocfacts part to mpr as well, since we see similar errors
about once every month or two over a few thousand controllers at
work. We've not seen it with IOC_INIT as far back as I can query the
error log database, so I didn't port that forward. We'll see if this
helps, but won't know for sure until next year (so I'm committing it now
since it won't hurt and might help). We usually see this failure in
connection with complicated recovery operations with a drive that's
failing, though, at least in the last year's worth of failures. It's
not clear this is the same as OP or not.

PR: 212841
Sponsored by: Netflix
Co-authored-by: imp

(cherry picked from commit c0e0e530ced057502f51d7a6086857305e08fae0)

sys/dev/mpr/mpr.c | 23 ++++++++++++++++++-----
sys/dev/mps/mps.c | 38 ++++++++++++++++++++++++++++++--------
2 files changed, 48 insertions(+), 13 deletions(-)

b***@freebsd.org
2024-10-16 14:44:39 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=212841

--- Comment #15 from commit-***@FreeBSD.org ---
A commit in branch stable/13 references this bug:

URL:
https://cgit.FreeBSD.org/src/commit/?id=e360f8c8fecc78d8f2aa2aee46940ec5eca86c87

commit e360f8c8fecc78d8f2aa2aee46940ec5eca86c87
Author: prateek sethi <***@gmail.com>
AuthorDate: 2024-10-13 18:38:54 +0000
Commit: Warner Losh <***@FreeBSD.org>
CommitDate: 2024-10-16 14:33:33 +0000

mps/mpr: Add workaround for firmware not responding to IOC_FACTS or
IOC_INIT

Sometimes, especially with older firmware, mps(4) would have trouble
initializing the card in one of these two steps. Add in a retry after a
short delay. Sean Bruno and Stephen McConnell thought this was OK in the
bug discussions, but never committed it. Steve indicated the delay
might not be necessary, but the OP clearly needed to make it longer to
make things work. I've kept the delay, and added the suggested comment.

Ported the iocfacts part to mpr as well, since we see similar errors
about once every month or two over a few thousand controllers at
work. We've not seen it with IOC_INIT as far back as I can query the
error log database, so I didn't port that forward. We'll see if this
helps, but won't know for sure until next year (so I'm committing it now
since it won't hurt and might help). We usually see this failure in
connection with complicated recovery operations with a drive that's
failing, though, at least in the last year's worth of failures. It's
not clear this is the same as OP or not.

PR: 212841
Sponsored by: Netflix
Co-authored-by: imp

(cherry picked from commit c0e0e530ced057502f51d7a6086857305e08fae0)

sys/dev/mpr/mpr.c | 23 ++++++++++++++++++-----
sys/dev/mps/mps.c | 38 ++++++++++++++++++++++++++++++--------
2 files changed, 48 insertions(+), 13 deletions(-)

b***@freebsd.org
2024-11-26 00:09:08 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=212841

Mark Linimon <***@FreeBSD.org> changed:

What       |Removed         |Added
----------------------------------------------------------------------------
Resolution |---             |FIXED
Assignee   |***@FreeBSD.org |***@FreeBSD.org
Status     |New             |Closed