Parameter problem
Description:

Solution: carefully check the number of input and output parameters of the upstream and downstream functions, and each function's parameter settings.
Testing the relationship between global variables, parallelism, and cluster mode
Unfinished
- Non-static field in the bolt, parallelism 3, local mode
  Test: a field i in the bolt; execute() prints i and then increments it
  Result: (screenshot: image.png)
  Analysis: each bolt instance maintains its own copy of i.
- static field in the bolt, parallelism 3, local mode
  Test: a static field i in the bolt; execute() prints i and then increments it
  Result: (screenshot: image.png)
  Analysis: all bolt instances share the single static i.
  Reason: in local mode everything runs on one machine inside one JVM, and a JVM holds only one copy of a static field.
- static field in the bolt, parallelism 3, cluster mode
  Test: the same static field i; execute() prints i and then increments it
  Result: (screenshot: image.png)
  Analysis: each bolt instance maintains its own i.
  Reason: the instances run on different machines in different JVMs, and each JVM has its own copy of the static field.
- public static field on the testtopology class, parallelism 3, local mode
  Test: the bolt reads testtopology.i in execute(), prints it, then increments it
  Result: (screenshot: image.png)
  Analysis: all bolt instances share the static i.
  Reason: one machine, one JVM, so only one copy of the static field.
- public static field on the testtopology class, parallelism 3, cluster mode
  Test: the bolt reads testtopology.i in execute(), prints it, then increments it
  Result: (screenshot: image.png)
  Analysis: each bolt instance maintains its own i.
  Reason: different machines, different JVMs, each with its own copy of the static field.
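The local-mode observations above can be reproduced in plain Java, no Storm required (a minimal sketch; class and field names are made up for illustration): three "bolt instances" in one JVM share a static field, while each keeps its own copy of a non-static field.

```java
// Plain-Java sketch of the local-mode results above: three "bolt instances"
// in one JVM share a static counter, while each instance has its own copy of
// a non-static counter. In cluster mode each worker JVM loads the class
// separately, so even the static counter would no longer be shared.
public class StaticVsInstance {
    static class Bolt {
        static int staticCount = 0; // one copy per JVM (per classloader)
        int instanceCount = 0;      // one copy per bolt instance

        void execute() {
            staticCount++;
            instanceCount++;
        }
    }

    public static void main(String[] args) {
        Bolt[] bolts = { new Bolt(), new Bolt(), new Bolt() }; // parallelism 3
        for (Bolt b : bolts) b.execute(); // deliver one tuple to each instance

        System.out.println("static   = " + Bolt.staticCount);      // 3: shared
        System.out.println("instance = " + bolts[0].instanceCount); // 1: per instance
    }
}
```

This is why the static trick only "works" in local mode: it relies on all tasks living in the same JVM.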
Global state
Link: https://stackoverflow.com/questions/33755697/flink-sharing-state-in-coflatmapfunction
Is the CoFlatMapFunction intended to be executed in parallel?
If yes, you need some way to deterministically assign which record goes to which parallel instance. In some way the CoFlatMapFunction does a parallel (partitions) join between the model and the result of the session windows, so you need some form of key that selects which partition the elements go to. Does that make sense?
If not, try to set it to parallelism 1 explicitly.
Greetings, Stephan
A global state that all can access read-only is doable via broadcast().
A global state that is available to all for read and update is currently not available. Consistent operations on that would be quite costly, require some form of distributed communication/consensus.
Instead, I would encourage you to go with the following:
- If you can partition the state, use a keyBy().mapWithState() - That localizes state operations and makes it very fast.
- If your state is not organized by key, your state is probably very small, and you may be able to use a non-parallel operation.
- If some operation updates the state and another one accesses it, you can often implement that with iterations and a CoFlatMapFunction (one side is the original input, the other the feedback input).
All approaches in the end localize state access and modifications, which is a good pattern to follow, if possible.
Greetings, Stephan
Link: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Issue-with-sharing-state-in-CoFlatMapFunction-td3529.html
Flink mailing-list discussion of multiple parallelism and global variables
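Stephan's first suggestion — partition the state with keyBy() so each parallel instance only touches its own slice — can be sketched in plain Java (hypothetical names; Flink's real keyed-state / mapWithState APIs differ):

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java sketch of the keyBy().mapWithState() idea from the quote above
// (hypothetical names; Flink's actual keyed-state API differs). Records are
// routed to a partition by key hash, and each partition owns the state for
// its keys, so no cross-partition synchronization is ever needed.
public class KeyedStateSketch {
    static final int PARALLELISM = 3;

    // One state map per parallel instance; only that instance touches it.
    @SuppressWarnings("unchecked")
    static final Map<String, Long>[] localState = new Map[PARALLELISM];
    static {
        for (int i = 0; i < PARALLELISM; i++) localState[i] = new HashMap<>();
    }

    static int partitionFor(String key) {
        return Math.abs(key.hashCode() % PARALLELISM);
    }

    // "mapWithState": update and read state local to the key's partition.
    static long countEvent(String key) {
        Map<String, Long> state = localState[partitionFor(key)];
        long next = state.getOrDefault(key, 0L) + 1;
        state.put(key, next);
        return next;
    }

    public static void main(String[] args) {
        System.out.println(countEvent("user-a")); // 1
        System.out.println(countEvent("user-a")); // 2
        System.out.println(countEvent("user-b")); // 1
    }
}
```

Because every key deterministically lands in one partition, state access stays local and fast — the pattern Stephan recommends in place of a read-write global variable.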
Window function bug:

Solution: when keyBy() is called with field positions or names, the key it emits is typed as a generic Tuple, so you have to define the key type yourself (e.g. by supplying your own key selector).
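The fix can be sketched with a self-defined selector interface (hypothetical, mirroring the shape of Flink's KeySelector): extracting the key with an explicit selector gives the downstream function a concrete key type instead of a generic Tuple.

```java
// Sketch of the workaround (hypothetical interface mirroring the shape of
// Flink's KeySelector): when keyBy() is given field positions, the window
// function receives the key as a generic Tuple; supplying an explicit key
// selector instead gives the key a concrete type such as String.
public class TypedKeyExample {
    @FunctionalInterface
    interface KeySelector<IN, KEY> {
        KEY getKey(IN value);
    }

    static class Event {
        final String userId;
        final long count;
        Event(String userId, long count) { this.userId = userId; this.count = count; }
    }

    public static void main(String[] args) {
        // Explicit selector: the key is a plain String, no Tuple cast needed.
        KeySelector<Event, String> byUser = e -> e.userId;
        System.out.println(byUser.getKey(new Event("user-42", 7L))); // user-42
    }
}
```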

Bug when reading from Kafka with Flink 1.4.1
KafkaFetcher09/010/011 uses wrong user code classloader
Link: https://issues.apache.org/jira/browse/FLINK-8741
Fixed in 1.4.2.
Solution: upgrade to the fixed version.
Flink 1.4.1 automatically added Hadoop to the classpath; the 1.7.0 start scripts no longer do this.

Solution: set the Hadoop dependency manually by adding export HADOOP_CLASSPATH=`hadoop classpath` to /etc/profile.
Error message: Hadoop is not in the classpath/dependencies.
To use the filesystem state backend (FsStateBackend), you must run a Flink build that bundles the Hadoop dependencies:
you're right that if you want to access HDFS from the user code only it should be possible to use the Hadoop free Flink version and bundle the Hadoop dependencies with your user code. However, if you want to use Flink's file system state backend as you did, then you have to start the Flink cluster with the Hadoop dependency in its classpath. The reason is that the FsStateBackend is part of the Flink distribution and will be loaded using the system class loader. One thing you could try out is to use the RocksDB state backend instead. Since the RocksDBStateBackend is loaded dynamically, I think it should use the Hadoop dependencies when trying to load the filesystem.
Solution: install the Flink distribution bundled with the Hadoop dependencies. Confirmed working.




