什么是State(狀態(tài))?
- 某task/operator在某時刻的一個中間結(jié)果
- 快照(shapshot)
- 在flink中狀態(tài)可以理解為一種數(shù)據(jù)結(jié)構(gòu)
- 舉例
對輸入源為<key,value>的數(shù)據(jù),計算其中某key的最大值,如果使用HashMap,也可以進行計算,但是每次都需要重新遍歷,使用狀態(tài)的話,可以獲取最近的一次計算結(jié)果,減少了系統(tǒng)的計算次數(shù) - 程序一旦crash,恢復
- 程序擴容
State類型
Operator State(算子狀態(tài))
With Operator State (or non-keyed state), each operator state is bound to one parallel operator instance. The Kafka Connector is a good motivating example for the use of Operator State in Flink. Each parallel instance of the Kafka consumer maintains a map of topic partitions and offsets as its Operator State.
The Operator State interfaces support redistributing state among parallel operator instances when the parallelism is changed. There can be different schemes for doing this redistribution.

flink官方文檔用kafka的消費者舉例,認為kafka消費者的partitionId和offset類似flink的operator state
提供的數(shù)據(jù)結(jié)構(gòu):ListState<T>
每一個Operator都存在自己的狀態(tài)
key State
Keyed State is always relative to keys and can only be used in functions and operators on a KeyedStream.
You can think of Keyed State as Operator State that has been partitioned, or sharded, with exactly one state-partition per key. Each keyed-state is logically bound to a unique composite of <parallel-operator-instance, key>, and since each key “belongs” to exactly one parallel instance of a keyed operator, we can think of this simply as <operator, key>.
Keyed State is further organized into so-called Key Groups. Key Groups are the atomic unit by which Flink can redistribute Keyed State; there are exactly as many Key Groups as the defined maximum parallelism. During execution each parallel instance of a keyed operator works with the keys for one or more Key Groups.
基于KeyStream之上的狀態(tài)
可理解為dataStream.keyBy()之后的Operator State,Operator State是對每一個Operator的狀態(tài)進行記錄,而key State則是在dataSteam進行keyBy()后,記錄相同keyId的keyStream上的狀態(tài)
key State提供的數(shù)據(jù)類型:ValueState<T>、ListState<T>、ReducingState<T>、MapState<T>
狀態(tài)容錯
- Introduction
Apache Flink offers a fault tolerance mechanism to consistently recover the state of data streaming applications. The mechanism ensures that even in the presence of failures, the program’s state will eventually reflect every record from the data stream exactly once. Note that there is a switch to downgrade the guarantees to at least once (described below).
The fault tolerance mechanism continuously draws snapshots of the distributed streaming data flow. For streaming applications with small state, these snapshots are very light-weight and can be drawn frequently without impacting the performance much. The state of the streaming applications is stored at a configurable place (such as the master node, or HDFS).
In case of a program failure (due to machine-, network-, or software failure), Flink stops the distributed streaming dataflow. The system then restarts the operators and resets them to the latest successful checkpoint. The input streams are reset to the point of the state snapshot. Any records that are processed as part of the restarted parallel dataflow are guaranteed to not have been part of the checkpointed state before.
Note: For this mechanism to realize its full guarantees, the data stream source (such as message queue or broker) needs to be able to rewind the stream to a defined recent point. Apache Kafka has this ability and Flink’s connector to Kafka exploits this ability.
Note: Because Flink’s checkpoints are realized through distributed snapshots, we use the words snapshot and checkpointinterchangeably.
依靠checkPoint
checkPoint概念:進行全局快照,持久化保存所有的task/operator的State
- 特點:
異步:輕量級,不影響系統(tǒng)處理數(shù)據(jù)
Barrier機制
失敗情況下可回滾致最近一次成功的checkpoint
周期性 -
保證exactly-once
chcekPoint
Restore
shapshot(快照)
- Barriers(屏障)
Barriers是flink分布式快照中的重要元素
單并行度Barriers
多并行度Barriers
Barrier被注入數(shù)據(jù)流中,并隨著數(shù)據(jù)流和記錄一起流動,每一個Barrier攜帶者快照ID,并且十分輕量級,不會打斷數(shù)據(jù)的流動,不同時期的快照的barrier可以同時存在數(shù)據(jù)流中,所以各種快照可以同時發(fā)生。
相對于單并行度,多并行度的快照需要不同數(shù)據(jù)流中攜帶相同快照ID的Barrier經(jīng)過operator之后,才能進行checkpoint。

個人理解:感覺對于Flink的狀態(tài)遷移和容錯來說,主要依賴checkpoint機制,而其中最重要的元素就是Barrier,通過Barrier保證流入Operator的數(shù)據(jù)都進行了checkpoint



