Did the user really ask for Exactly Once? Fault Tolerance

#2020 #apachekafka #apachenifi #apacheflink

Exactly Once Requirements

Exactly-once is tricky and can degrade performance. If your user can live with at-least-once, always go with that. Data sinks such as Kudu, where you can do an upsert, make exactly-once less necessary.
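
To see why an upsert sink takes the pressure off, here is a minimal sketch in plain Java. It does not use the Kudu client at all: the `UpsertDemo` and `Event` names are made up for illustration, and a `HashMap` stands in for a table keyed by primary key. It only shows that a redelivered record duplicates a row in an append-only sink but simply overwrites the same row in an upsert sink.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: an in-memory stand-in for a sink table.
final class UpsertDemo {

    /** A record with a primary key and a value (names are assumptions). */
    static final class Event {
        final String key;
        final long value;

        Event(String key, long value) {
            this.key = key;
            this.value = value;
        }
    }

    /** Append-only sink: every delivery becomes a new row, duplicates included. */
    static List<Event> appendAll(List<Event> deliveries) {
        return new ArrayList<>(deliveries);
    }

    /** Upsert sink: a redelivered event overwrites the row with the same key. */
    static Map<String, Long> upsertAll(List<Event> deliveries) {
        Map<String, Long> table = new HashMap<>();
        for (Event e : deliveries) {
            table.put(e.key, e.value);
        }
        return table;
    }
}
```

This only holds when the write itself is idempotent, such as setting a value for a key. A sink operation like "increment the counter" would still double count under at-least-once delivery.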

https://docs.cloudera.com/csa/1.2.0/datastream-connectors/topics/csa-kafka.html

Apache Flink, Apache NiFi Stateless and Apache Kafka can all participate in exactly-once delivery.

For CDF Stream Processing and Analytics with Apache Flink 1.10 Streaming:

Both Kafka sources and sinks can be used with exactly-once processing guarantees when checkpointing is enabled.

End-to-End Guaranteed Exactly-Once Record Delivery

The Data Source and Data Sink both need to support exactly-once state semantics and take part in checkpointing.
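
One way to picture a sink "taking part in checkpointing" is a two-phase commit: records go into an open transaction, the transaction is frozen when a checkpoint starts, and it only becomes visible once the checkpoint completes. The toy model below is a sketch of that idea in plain Java. It is not Flink's API, and the class and method names are hypothetical. It also assumes the simplest failure case, where the job fails before the checkpoint completes and the source replays from the last successful checkpoint.

```java
import java.util.ArrayList;
import java.util.List;

// Toy in-memory model of a transactional sink; all names are assumptions.
final class ToyTransactionalSink {
    private final List<String> committed = new ArrayList<>(); // visible to readers
    private List<String> open = new ArrayList<>();            // current transaction
    private List<String> pending = null;                      // frozen, awaiting checkpoint result

    /** Buffer a record in the open transaction; nothing is visible yet. */
    void write(String record) {
        open.add(record);
    }

    /** A checkpoint starts: freeze the open transaction and begin a new one. */
    void onCheckpoint() {
        if (pending != null) {
            throw new IllegalStateException("previous checkpoint not resolved yet");
        }
        pending = open;
        open = new ArrayList<>();
    }

    /** The checkpoint completed: publish the frozen transaction. */
    void onCheckpointComplete() {
        if (pending != null) {
            committed.addAll(pending);
            pending = null;
        }
    }

    /** Failure before the checkpoint completed: drop everything uncommitted. */
    void onFailure() {
        open = new ArrayList<>();
        pending = null;
    }

    /** Snapshot of what downstream readers can see. */
    List<String> committed() {
        return new ArrayList<>(committed);
    }
}
```

If the job writes `c`, fails, and the source replays `c`, readers still see `c` once, because the first copy was never committed. The cost is latency: nothing becomes visible until the next checkpoint completes.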

Data Sources

Apache Kafka - must have exactly-once selected, transactions enabled and the correct driver version.

Select: Semantic.EXACTLY_ONCE

Data Sinks

HDFS BucketingSink
Apache Kafka

For Kafka, check that the transaction timeouts line up with your checkpoint settings. https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/connectors/kafka.html#kafka-producers-and-fault-tolerance
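
As a rough sketch of that check, the helper below builds the Kafka client properties and refuses settings that cannot work together. The property keys `transaction.timeout.ms` and `isolation.level` are standard Kafka client settings, and the 15 minute broker default for `transaction.max.timeout.ms` comes from the Flink page linked above. The class name, method names and the sizing rule (transaction timeout longer than checkpoint interval plus expected downtime) are assumptions for illustration, so adjust them to your own recovery times.

```java
import java.util.Properties;

// Hypothetical helper; only the Kafka property keys are real settings.
final class KafkaExactlyOnceSettings {

    /** Kafka broker default for transaction.max.timeout.ms (15 minutes). */
    static final long BROKER_DEFAULT_MAX_TXN_TIMEOUT_MS = 15 * 60 * 1000L;

    /** Producer properties, validated against broker and checkpoint timing. */
    static Properties producerProps(long txnTimeoutMs,
                                    long brokerMaxTxnTimeoutMs,
                                    long checkpointIntervalMs,
                                    long maxExpectedDowntimeMs) {
        // The broker rejects producers that ask for more than its maximum.
        if (txnTimeoutMs > brokerMaxTxnTimeoutMs) {
            throw new IllegalArgumentException(
                "transaction.timeout.ms exceeds the broker's transaction.max.timeout.ms");
        }
        // Assumed rule of thumb: a transaction must outlive a checkpoint plus a restart,
        // otherwise it can expire before it is committed and data is lost.
        if (txnTimeoutMs <= checkpointIntervalMs + maxExpectedDowntimeMs) {
            throw new IllegalArgumentException(
                "transaction.timeout.ms is too short for the checkpoint interval and downtime");
        }
        Properties props = new Properties();
        props.setProperty("transaction.timeout.ms", Long.toString(txnTimeoutMs));
        return props;
    }

    /** Consumer properties so downstream readers skip uncommitted records. */
    static Properties consumerProps() {
        Properties props = new Properties();
        props.setProperty("isolation.level", "read_committed");
        return props;
    }
}
```

Flink's Kafka producer defaults `transaction.timeout.ms` to one hour, which a broker running with the 15 minute default will reject, so either raise the broker setting or lower the producer one before switching to `Semantic.EXACTLY_ONCE`.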
