The NetBSD Project

CVS log for src/sys/kern/kern_heartbeat.c


Default branch: MAIN


Revision 1.10, Wed Sep 6 12:29:14 2023 UTC (5 months, 2 weeks ago) by riastradh
Branch: MAIN
CVS Tags: thorpej-ifq-base, thorpej-ifq, thorpej-altq-separation-base, thorpej-altq-separation, HEAD
Changes since 1.9: +32 -42 lines

heartbeat(9): Make heartbeat_suspend/resume nestable.

And make them bind to the CPU as a side effect, instead of requiring
the caller to have already done so.

This lets us eliminate the assertions so we can use them in ddb even
when things are going haywire and we just want to get diagnostics.

XXX kernel revbump -- struct cpu_info change
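
A minimal sketch of what "nestable" could mean here, assuming a per-CPU depth counter. The struct and function names below are hypothetical and are not taken from kern_heartbeat.c; CPU binding is left out, and the sketch tolerates an unbalanced resume rather than asserting, in the spirit of the message above.

```c
#include <stdbool.h>

/* Hypothetical per-CPU state; the real struct cpu_info fields are not shown in this log. */
struct cpu_state {
    unsigned suspend_depth;  /* number of outstanding suspend calls */
    bool     checked;        /* whether heartbeat checks apply to this CPU */
};

/* Only the outermost suspend turns checking off. */
static void
sketch_suspend(struct cpu_state *cs)
{
    if (cs->suspend_depth++ == 0)
        cs->checked = false;
}

/* Only the matching outermost resume turns checking back on. */
static void
sketch_resume(struct cpu_state *cs)
{
    if (cs->suspend_depth == 0)
        return;  /* unbalanced resume: ignore instead of asserting */
    if (--cs->suspend_depth == 0)
        cs->checked = true;
}
```

With a counter like this, a caller that suspends while an outer caller has already suspended does not re-enable checks early when it resumes.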

Revision 1.9, Sat Sep 2 17:44:41 2023 UTC (5 months, 2 weeks ago) by riastradh
Branch: MAIN
Changes since 1.8: +19 -7 lines

heartbeat(9): Move panicstr check into the IPI itself.

We can't return early from defibrillate because the IPI may have yet
to run -- we can't return until the other CPU is definitely done
using the ipi_msg_t we created on the stack.

We should avoid calling panic again on the patient CPU in case it was
already in the middle of a panic, so that we don't re-enter panic
while, e.g., trying to print a stack trace.

Sprinkle some comments.

Revision 1.8, Sat Sep 2 17:44:32 2023 UTC (5 months, 2 weeks ago) by riastradh
Branch: MAIN
Changes since 1.7: +10 -6 lines

heartbeat(9): More detail about manual test success criteria.

Changes comments only, no functional change.

Revision 1.7, Sat Sep 2 17:44:23 2023 UTC (5 months, 2 weeks ago) by riastradh
Branch: MAIN
Changes since 1.6: +41 -4 lines

heartbeat(9): Ignore stale tc if primary CPU heartbeat is suspended.

The timecounter ticks only on the primary CPU, so of course it will
go stale if it's suspended.

(It is, perhaps, a mistake that it only ticks on the primary CPU,
even if the primary CPU is offlined or in a polled-input console
loop, but that's a separate issue.)

Revision 1.6, Sat Sep 2 17:43:37 2023 UTC (5 months, 2 weeks ago) by riastradh
Branch: MAIN
Changes since 1.5: +45 -19 lines

heartbeat(9): New flag SPCF_HEARTBEATSUSPENDED.

This way we can suspend heartbeats on a single CPU while the console
is in polling mode, not just when the CPU is offlined.  This should
be rare, so it's not _convenient_, but it should enable us to fix
polling-mode console input when the hardclock timer is still running
on other CPUs.

Revision 1.5, Sun Jul 16 10:18:19 2023 UTC (7 months, 1 week ago) by riastradh
Branch: MAIN
Changes since 1.4: +5 -5 lines

heartbeat(9): For now, use time_uptime without atomic_load_relaxed.

A later commit will change time_uptime to a macro so it is atomic,
using atomic_load_relaxed if possible or seqlock if not.
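
This log does not show how the later time_uptime macro is implemented. As a rough illustration of the seqlock idea mentioned above, here is a sketch in C11 atomics for reading a 64-bit value stored as two 32-bit halves, assuming a single writer. All names (seq_time, seq_write, seq_read) are hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical container: a 64-bit time value split into 32-bit halves. */
struct seq_time {
    _Atomic unsigned gen;  /* even: stable; odd: write in progress */
    _Atomic uint32_t lo;
    _Atomic uint32_t hi;
};

/* Single writer only: bump gen to odd, store both halves, bump gen to even. */
static void
seq_write(struct seq_time *st, uint64_t v)
{
    unsigned g = atomic_load_explicit(&st->gen, memory_order_relaxed);

    atomic_store_explicit(&st->gen, g + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&st->lo, (uint32_t)v, memory_order_relaxed);
    atomic_store_explicit(&st->hi, (uint32_t)(v >> 32), memory_order_relaxed);
    atomic_store_explicit(&st->gen, g + 2, memory_order_release);
}

/* Readers retry until they see the same even generation before and after. */
static uint64_t
seq_read(struct seq_time *st)
{
    unsigned g0, g1;
    uint32_t lo, hi;

    do {
        g0 = atomic_load_explicit(&st->gen, memory_order_acquire);
        lo = atomic_load_explicit(&st->lo, memory_order_relaxed);
        hi = atomic_load_explicit(&st->hi, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        g1 = atomic_load_explicit(&st->gen, memory_order_relaxed);
    } while (g0 != g1 || (g0 & 1) != 0);

    return ((uint64_t)hi << 32) | lo;
}
```

On a platform where a single relaxed atomic load of the full value is available, none of this is needed, which is presumably why the message prefers atomic_load_relaxed where possible.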

Revision 1.4, Sun Jul 16 10:18:07 2023 UTC (7 months, 1 week ago) by riastradh
Branch: MAIN
Changes since 1.3: +38 -7 lines

heartbeat(9): Avoid xcall(9) while cold.

Revision 1.3, Sat Jul 8 13:59:05 2023 UTC (7 months, 2 weeks ago) by riastradh
Branch: MAIN
Changes since 1.2: +2 -12 lines

curcpu_stable(9): New function for asserting curcpu() is stable.

Revision 1.2, Fri Jul 7 17:05:13 2023 UTC (7 months, 2 weeks ago) by riastradh
Branch: MAIN
Changes since 1.1: +17 -7 lines

heartbeat(9): Test whether curcpu is stable, not kpreempt_disabled.

kpreempt_disabled worked for my testing because I tested on aarch64,
which doesn't have kpreemption.

XXX Should move curcpu_stable() to somewhere that other things can
use it.

Revision 1.1, Fri Jul 7 12:34:50 2023 UTC (7 months, 2 weeks ago) by riastradh
Branch: MAIN

heartbeat(9): New mechanism to check progress of kernel.

This uses hard interrupts to check progress of low-priority soft
interrupts, and one CPU to check progress of another CPU.

If no progress has been made after a configurable number of seconds
(kern.heartbeat.max_period, default 15), then the system panics --
preferably on the CPU that is stuck so we get a stack trace in dmesg
of where it was stuck, but if the stuckness was detected by another
CPU and the stuck CPU doesn't acknowledge the request to panic within
one second, the detecting CPU panics instead.
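
The decision described above can be modelled in a few lines. This is a simplified sketch, not the kernel's code: the names are hypothetical, time is plain seconds of uptime, the one-second wait for the stuck CPU is reduced to a boolean, and treating exactly max_period seconds as still healthy is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

enum hb_verdict { HB_OK, HB_PANIC_ON_PATIENT, HB_PANIC_ON_DETECTOR };

/* Hypothetical model of one monitored CPU (the "patient"). */
struct hb_model {
    uint64_t last_progress;  /* uptime in seconds when progress was last seen */
    uint64_t max_period;     /* models kern.heartbeat.max_period */
};

/* Called whenever the monitored work makes progress. */
static void
hb_progress(struct hb_model *hb, uint64_t now)
{
    hb->last_progress = now;
}

/*
 * Called by the checker.  patient_acked models whether the stuck CPU
 * acknowledged the request to panic in time.  Assumes now >= last_progress.
 */
static enum hb_verdict
hb_check(const struct hb_model *hb, uint64_t now, bool patient_acked)
{
    if (now - hb->last_progress <= hb->max_period)
        return HB_OK;
    return patient_acked ? HB_PANIC_ON_PATIENT : HB_PANIC_ON_DETECTOR;
}
```

Panicking on the patient is preferred because, as the message says, that is where the useful stack trace is.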

This doesn't supplant hardware watchdog timers.  It is possible for
hard interrupts to be stuck on all CPUs for some reason too; in that
case heartbeat(9) has no opportunity to complete.

Downside: heartbeat(9) relies on hardclock to run at a reasonably
consistent rate, which might cause trouble for the glorious tickless
future.  However, it could be adapted to take a parameter for an
approximate number of units that have elapsed since the last call on
the current CPU, rather than treating that as a constant 1.
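
One way to picture that adaptation, as a sketch only: the accounting routine takes the number of elapsed units as an argument, and today's fixed-rate behaviour is the special case of always passing 1. The names tick_model, tick_advance and tick_progress are hypothetical.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical per-CPU accounting of time since progress was last seen. */
struct tick_model {
    unsigned since_progress;  /* units elapsed since the last progress */
    unsigned limit;           /* units allowed before declaring a stall */
};

/* Record progress: restart the count. */
static void
tick_progress(struct tick_model *m)
{
    m->since_progress = 0;
}

/*
 * Account for `elapsed` units since the last call on this CPU.
 * Returns true once the limit has been exceeded.
 */
static bool
tick_advance(struct tick_model *m, unsigned elapsed)
{
    /* Saturate instead of wrapping so a huge gap cannot look healthy. */
    if (elapsed > UINT_MAX - m->since_progress)
        m->since_progress = UINT_MAX;
    else
        m->since_progress += elapsed;
    return m->since_progress > m->limit;
}
```

A fixed-rate caller would use tick_advance(m, 1) on every tick; a tickless caller would pass its estimate of how many units went by.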

XXX kernel revbump -- changes struct cpu_info layout
