Give an indication in container events for probe failure as to whether the failure was ignored due to FailureThreshold · Issue #115823 · kubernetes/kubernetes

Give an indication in container events for probe failure as to whether the failure was ignored due to FailureThreshold #115823

Open
@intUnderflow

Description

Probes of all kinds currently support FailureThreshold (and SuccessThreshold). These properties let a user specify that Kubernetes should not take action in response to a failed probe unless the probe fails a given number of times in succession.

This is useful for end users because it lets them mitigate the effects of probes that "flake" by requiring several successive failures before anything happens.
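To make the threshold behaviour concrete, here is a minimal sketch of consecutive-result counting. The `thresholdTracker` type and its `observe` method are hypothetical names invented for illustration; this is not the kubelet's code, only an approximation of the semantics described above.

```go
package main

import "fmt"

// thresholdTracker is a hypothetical helper (not a kubelet type). It counts
// consecutive identical probe results and reports when a threshold is met.
type thresholdTracker struct {
	failureThreshold int
	successThreshold int
	lastOK           bool
	run              int // length of the current streak of identical results
}

// observe records one probe result and reports whether the current streak
// has reached the threshold that applies to that kind of result.
func (t *thresholdTracker) observe(ok bool) (reached bool) {
	if t.run > 0 && ok == t.lastOK {
		t.run++
	} else {
		// The result flipped (or this is the first result): start a new streak.
		t.lastOK = ok
		t.run = 1
	}
	if ok {
		return t.run >= t.successThreshold
	}
	return t.run >= t.failureThreshold
}

func main() {
	t := &thresholdTracker{failureThreshold: 3, successThreshold: 1}
	// Two flaky failures, a success that resets the streak, then three
	// failures in a row. Only the last failure reaches the threshold.
	for _, ok := range []bool{false, false, true, false, false, false} {
		fmt.Printf("ok=%v reached=%v\n", ok, t.observe(ok))
	}
}
```

With a failure threshold of 3, the first two failures and the two failures after the reset are below the threshold, so a consumer watching only failure events would see five failures but only one that led to action.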

When a probe fails in Kubernetes, we emit a container event indicating this here: https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/prober/prober.go#L110. End users can consume these events via the API for their own purposes. The event is emitted regardless of whether the FailureThreshold has been reached.

Currently, when a user consumes a probe failure event, they have no way of knowing whether the failure resulted in any action by the kubelet, because the failure can be ignored due to FailureThreshold and the event carries no information about this. This can lead users to assume there is a problem and that a container/pod was restarted when nothing actually happened.

I think we should expose the keepGoing value from https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/prober/worker.go#L203 in the emitted event somehow. My preferred solution is to emit the probe failure event in the worker rather than where it currently sits in https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/prober/prober.go#L110. There is also the option of passing some information down the stack from the worker into the prober (such as making the FailureThreshold/SuccessThreshold decision in the prober), but I'm worried about separation of concerns. Happy to hear what other folks think :)
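As a rough illustration of what the event could carry if it were composed where the threshold decision is known, here is a small sketch. The function name `probeFailureMessage`, its parameters, and the message wording are all assumptions made for this example; none of them come from the kubelet source, and the real change might look quite different.

```go
package main

import "fmt"

// probeFailureMessage is a hypothetical helper (not part of the kubelet).
// It builds an event message that states whether the failure was ignored
// because the consecutive-failure count is still below FailureThreshold.
func probeFailureMessage(probeType, output string, consecutiveFailures, failureThreshold int) string {
	if consecutiveFailures < failureThreshold {
		// Below the threshold: the failure is recorded but nothing is done.
		return fmt.Sprintf("%s probe failed (%d/%d, below FailureThreshold, no action taken): %s",
			probeType, consecutiveFailures, failureThreshold, output)
	}
	// Threshold reached: this is the failure that triggers action.
	return fmt.Sprintf("%s probe failed (%d/%d, FailureThreshold reached): %s",
		probeType, consecutiveFailures, failureThreshold, output)
}

func main() {
	fmt.Println(probeFailureMessage("Liveness", "HTTP probe failed with statuscode: 500", 1, 3))
	fmt.Println(probeFailureMessage("Liveness", "HTTP probe failed with statuscode: 500", 3, 3))
}
```

Composing the message in one place that already knows the streak length keeps the threshold decision out of the code that runs the probe, which is the separation-of-concerns worry raised above.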

Also of note: FailureThreshold/SuccessThreshold is the only filter I can see where a probe result can be ignored after the probe has run (and therefore after a container event has been emitted).

I’m happy to write this PR once we’re confident in our approach :)

Metadata

Labels

good first issue: Denotes an issue ready for a new contributor, according to the "help wanted" guidelines.
help wanted: Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines.
kind/cleanup: Categorizes issue or PR as related to cleaning up code, process, or technical debt.
kind/documentation: Categorizes issue or PR as related to documentation.
priority/backlog: Higher priority than priority/awaiting-more-evidence.
sig/node: Categorizes an issue or PR as relevant to SIG Node.
triage/accepted: Indicates an issue or PR is ready to be actively worked on.
