The talk explains how Apache Flink checkpoints stateful jobs using the asynchronous barrier snapshotting algorithm to provide exactly-once semantics in streaming. It also describes Flink's approach to master high availability (HA), which addresses the problem of the JobManager being a single point of failure. Job checkpointing combined with HA forms the basis of Flink's fault tolerance mechanism for recovering from failures.
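
As a rough illustration of the barrier idea behind such checkpoints, the following toy sketch simulates barrier alignment in a single operator with two input channels. It is not Flink code: the `SummingOperator` class, the `BARRIER` marker, and the in-memory `snapshots` list are all hypothetical names chosen for this example. It also leaves out what a real system does, such as writing state asynchronously and forwarding barriers downstream.

```python
from collections import deque

# Hypothetical marker object standing in for a checkpoint barrier.
BARRIER = "BARRIER"


class SummingOperator:
    """Toy operator that sums integers arriving on several input channels."""

    def __init__(self, num_channels):
        self.num_channels = num_channels
        self.total = 0            # the operator's state
        self.blocked = set()      # channels that already delivered the barrier
        self.buffered = deque()   # input held back while aligning
        self.snapshots = []       # stand-in for durable checkpoint storage

    def receive(self, channel, item):
        if channel in self.blocked:
            # This channel is past the barrier: hold its input back so that
            # post-barrier records do not leak into the snapshot.
            self.buffered.append((channel, item))
            return
        if item is BARRIER:
            self.blocked.add(channel)
            if len(self.blocked) == self.num_channels:
                self._complete_checkpoint()
            return
        self.total += item

    def _complete_checkpoint(self):
        # All barriers arrived: the state reflects exactly the pre-barrier input.
        self.snapshots.append(self.total)
        self.blocked.clear()
        # Replay the input that was held back during alignment, in order.
        pending, self.buffered = self.buffered, deque()
        for channel, item in pending:
            self.receive(channel, item)


op = SummingOperator(num_channels=2)
events = [(0, 1), (0, BARRIER), (0, 10), (1, 2), (1, BARRIER), (1, 20)]
for channel, item in events:
    op.receive(channel, item)
print(op.snapshots, op.total)
```

In this run the record `10` arrives on channel 0 after that channel's barrier, so it is buffered and excluded from the snapshot, while the record `2` on channel 1 arrives before its barrier and is included. Restoring from the snapshot and replaying only post-barrier input would therefore count each record once.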