TL;DR
- clint-go has features for event aggregation, which groups similar events into one, and spam filtering, which applies rate limits to events.
- Event aggregation uses an aggregation key generated by the EventAggregatorByReasonFunc as a key, and if the same event is published 10 or more times within 10 minutes, instead of posting a new event, update the existing to save etcd capacity.
- Spam filtering performs using the value generated by the getSpamKey function as a key. The spam filter uses a rate limiter based on the token bucket algorithm, with an initial 25 tokens and a refill rate of 1 token per 5 minutes. If an event is published beyond the limit, the event is discarded.
Background
When I wanted to create a CustomController that detects the completion of a child Job of a CronJob and performs an action, I noticed that CronJobController (v2) published an Event called SawCompletedJob when the child Job completed. I thought, "Oh, I can simply use the Event as a cue."
However, when I created a CronJob that ran once a minute in a local cluster (kind), I discovered that a SawCompletedJob event became unobservable from the middle of the run (about the 10th run). Specifically, the Events looked like the ones below.
$ kubectl alpha events --for cronjob/hello
...
8m29s (x2 over 8m29s) Normal SawCompletedJob CronJob/hello Saw completed job: hello-28023907, status: Complete
8m29s Normal SuccessfulDelete CronJob/hello Deleted job hello-28023904
7m35s Normal SuccessfulCreate CronJob/hello Created job hello-28023908
7m28s (x2 over 7m28s) Normal SawCompletedJob CronJob/hello Saw completed job: hello-28023908, status: Complete
7m28s Normal SuccessfulDelete CronJob/hello Deleted job hello-28023905
6m35s Normal SuccessfulCreate CronJob/hello Created job hello-28023909
6m28s Normal SawCompletedJob CronJob/hello Saw completed job: hello-28023909, status: Complete
2m35s (x3 over 4m35s) Normal SuccessfulCreate CronJob/hello (combined from similar events): Created job hello-28023913
SawCompletedJob events were published until 6m28s, but they became unobservable after that. SuccessfulCreate events were published in an aggregated form, but we cannot see all of them. By the way, all events from child Jobs and child Pods are observable.
$ kubectl alpha events
4m18s Normal Scheduled Pod/hello-28023914-frb94 Successfully assigned default/hello-28023914-frb94 to kind-control-plane
4m11s Normal Completed Job/hello-28023914 Job completed
3m18s Normal Started Pod/hello-28023915-5fsh5 Started container hello
3m18s Normal SuccessfulCreate Job/hello-28023915 Created pod: hello-28023915-5fsh5
3m18s Normal Created Pod/hello-28023915-5fsh5 Created container hello
3m18s Normal Pulled Pod/hello-28023915-5fsh5 Container image "busybox:1.28" already present on machine
3m18s Normal Scheduled Pod/hello-28023915-5fsh5 Successfully assigned default/hello-28023915-5fsh5 to kind-control-plane
3m11s Normal Completed Job/hello-28023915 Job completed
2m18s Normal Started Pod/hello-28023916-qbqqk Started container hello
2m18s Normal Pulled Pod/hello-28023916-qbqqk Container image "busybox:1.28" already present on machine
2m18s Normal Created Pod/hello-28023916-qbqqk Created container hello
2m18s Normal SuccessfulCreate Job/hello-28023916 Created pod: hello-28023916-qbqqk
2m18s Normal Scheduled Pod/hello-28023916-qbqqk Successfully assigned default/hello-28023916-qbqqk to kind-control-plane
2m11s Normal Completed Job/hello-28023916 Job completed
78s Normal SuccessfulCreate Job/hello-28023917 Created pod: hello-28023917-kpxvn
78s Normal Created Pod/hello-28023917-kpxvn Created container hello
78s Normal Pulled Pod/hello-28023917-kpxvn Container image "busybox:1.28" already present on machine
78s Normal Started Pod/hello-28023917-kpxvn Started container hello
78s Normal Scheduled Pod/hello-28023917-kpxvn Successfully assigned default/hello-28023917-kpxvn to kind-control-plane
71s Normal Completed Job/hello-28023917 Job completed
18s (x8 over 7m18s) Normal SuccessfulCreate CronJob/hello (combined from similar events): Created job hello-28023918
18s Normal Started Pod/hello-28023918-grvbz Started container hello
18s Normal Created Pod/hello-28023918-grvbz Created container hello
18s Normal Pulled Pod/hello-28023918-grvbz Container image "busybox:1.28" already present on machine
18s Normal SuccessfulCreate Job/hello-28023918 Created pod: hello-28023918-grvbz
18s Normal Scheduled Pod/hello-28023918-grvbz Successfully assigned default/hello-28023918-grvbz to kind-control-plane
11s Normal Completed Job/hello-28023918 Job completed
As I couldn't find any specific description of this behavior in the Kubernetes official documentation, I investigated it by reading the source code to figure out what was happening.
Kubernetes Event Publication Flow
client-go sends Kubernetes Events to kube-apiserver through and etcd stores them. client-go is responsible for event aggregation and spam filtering, but the flow of client-go sending Events to kube-apiserver is quite complex, so we will explain it first. Note that we will use the SawCompleteJob Event of CronJobController as an example, but please note that the details may vary on each controller.
- CronJobController publishes an Event via the Recorder's Eventf method. The method internally calls the Broadcaster's ActionOrDrop method and sends the Event to the incoming channel.
- The Event in the incoming channel is retrieved by the loop goroutine and forwarded to each Watcher's result channel.
- The Event Watcher goroutine calls the eventHandler for the Event received from the result channel. The eventHandler calls recordToSink in the eventBroadcasterImpl, where EventCorrelator performs event aggregation and spam filtering and then calls recordEvent to post an Event (or update the Event if it has been aggregated).
Note: Starting loop goroutine
The loop goroutine starts through the NewLongQueueBroadcaster function, which is called through the NewBroadcaster function. The NewBroadcaster function is called in the NewControllerV2 function.
Note: Starting Event Watcher
The CronJobController calls the StartRecordingToSink method of eventBroadcasterImpl, which starts the Event Watcher from the StartEventWatcher method. The StartEventWatcher method initializes and registers the Watcher through the Watch method of the Broadcaster. What I found interesting is that the registration process of the Watcher itself is sent to the incoming channel, and the loop goroutine executes it, making the events published before the start of the Watcher invisible to it (the comment calls it as a "terrible hack" though).
EventCorrelator
EventCorrelator implements the core logic for Kubernetes Event aggregation and spam filtering. EventCorrelator is initialized through the NewEventCorrelatorWithOptions function called in the StartRecordingToSink method of eventBroadcasterImpl. Note that e.options is empty, so the Controller uses the default values. eventBroadcasterImpl's recordToSink method calls EventCorrelate method, which aggregates Events and applies tge spam filter.
Aggregation
The EventCorrelator's EventAggregate method and the eventObserve method of eventLogger are used for Event aggregation. The source code has detailed comments, so it's recommended to refer to it directly for more information, but here's a brief overview of the process:
- Calculate aggregationKey and localKey using EventAggregatorByReasonFunc. aggregationKey consists of event.Source, event.InvolvedObject, event.Type, event.Reason, event.ReportingController, and event.ReportingInstance, while localKey is event.Message.
- Search the cache of EventAggregator using the the aggregationKey. The cache value is aggregateRecord. If the number of the localKeys within the maxIntervalInSeconds (default 600 seconds) is greater than or equal to the maxEvents (default 10), return aggregationKey as the key, otherwise return the eventKey as the key. Note that the eventKey should be unique to each Event.
- Call the eventObserve method of eventLogger with the key returned from the EventCorrelate method, and search the cache of eventLogger. The cache value is eventLog. If it hits the cache (i.e., the key is aggregationKey), compute the patch to update the Event.
Spam Filtering
For spam filtering, the filterFunc of EventCorrelator is called to apply it. The actual implementation of filterFunc is the Filter method of EventSourceObjectSpamFilter. Again, it's recommended to refer to the source code for details, but here's a brief overview of the process:
- Calculate the eventKey from the Event using the getSpamKey function.
- Search the cache of EventSourceObjectSpamFilter using the eventKey. The cache value is spamRecord, which contains the rate limiter. The default values for qps and burst of the rate limiter are 1/300 and 25, respectively. According to the comment, the rate limiter uses the token bucket algorithm, so there are initially 25 tokens, and then one token is refilled every 5 minutes.
- Call the TryAccept method of tokenBucketPassiveRateLimiter to check the rate limit. If it exceeds it, discard the Event.
Why did SawCompletedJob Event become unobservable?
Taking the above into consideration, let's think about why the SawCompleteJob Event became unobservable. In short, it is likely due to be caused by the spam filter.
- CronJobController issues three Events, SuccessfulCreate, SawCompletedJob, and SuccessfulDelete, per child Job every minute (Strictly speaking, it publishes SuccessfulDelete only when it reaches the HistoryLimit).
- The Controller uses a spam filter whose the key is solely based on the Source and InvolvedObject (See getSpamKey function). Therefore, these thee types of Events are identified as the same Event.
- The Controller consumed the first 25 tokens at a rate of three tokens per minute. One token is refilled every five minutes, but around nine minutes, the tokens started running out. After that, A toke was refilled every five minutes, but it was consumed by a (aggregated) SuccessfulCreate Event, so SawCompletedJob and SuccessfulDelete were never published thereafter.
Note: Event Types
I'll list SuccessfulCreate
γ SawCompletedJob
γSuccessfulDelete
events' involvedObject and source below.
SuccessfulCreate Event
apiVersion: v1
kind: Event
involvedObject:
apiVersion: batch/v1
kind: CronJob
name: hello
namespace: default
resourceVersion: "289520"
uid: 5f3cfeca-8a83-452a-beb9-7a5f9c1eff63
source:
component: cronjob-controller
...
reason: SuccessfulCreate
message: Created job hello-28025408
SawCompletedJob Event
apiVersion: v1
kind: Event
involvedObject:
apiVersion: batch/v1
kind: CronJob
name: hello
namespace: default
resourceVersion: "289020"
uid: 5f3cfeca-8a83-452a-beb9-7a5f9c1eff63
source:
component: cronjob-controller
...
reason: SawCompletedJob
message: 'Saw completed job: hello-28025408, status: Complete'
SuccessfulDelete Event
apiVersion: v1
kind: Event
involvedObject:
apiVersion: batch/v1
kind: CronJob
name: hello
namespace: default
resourceVersion: "289520"
uid: 5f3cfeca-8a83-452a-beb9-7a5f9c1eff63
source:
component: cronjob-controller
...
reason: SuccessfulDelete
message: Deleted job hello-28025408
Top comments (0)