激情在线视频,久久草在线,第一区二卡

序

本文主要研究一下flink如何兼容StormTopology

實(shí)例

    @Test
    public void testStormWordCount() throws Exception {
        //NOTE 1 build Topology the Storm way
        final TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("spout", new RandomWordSpout(), 1);
        builder.setBolt("count", new WordCountBolt(), 5)
                .fieldsGrouping("spout", new Fields("word"));
        builder.setBolt("print", new PrintBolt(), 1)
                .shuffleGrouping("count");

        //NOTE 2 convert StormTopology to FlinkTopology
        FlinkTopology flinkTopology = FlinkTopology.createTopology(builder);

        //NOTE 3 execute program locally using FlinkLocalCluster
        Config conf = new Config();
        // only required to stabilize integration test
        conf.put(FlinkLocalCluster.SUBMIT_BLOCKING, true);

        final FlinkLocalCluster cluster = FlinkLocalCluster.getLocalCluster();
        cluster.submitTopology("stormWordCount", conf, flinkTopology);
        cluster.shutdown();
    }

這里使用FlinkLocalCluster.getLocalCluster()來創(chuàng)建或獲取FlinkLocalCluster，之后調(diào)用FlinkLocalCluster.submitTopology來提交topology，結(jié)束時(shí)通過FlinkLocalCluster.shutdown來關(guān)閉cluster
這里構(gòu)建的RandomWordSpout繼承自storm的BaseRichSpout，WordCountBolt繼承自storm的BaseBasicBolt；PrintBolt繼承自storm的BaseRichBolt(由于flink是使用的Checkpoint機(jī)制，不會(huì)轉(zhuǎn)換storm的ack操作，因而這里用BaseBasicBolt還是BaseRichBolt都無特別要求)
FlinkLocalCluster.submitTopology這里使用的topology是StormTopoloy轉(zhuǎn)換后的FlinkTopology

LocalClusterFactory

flink-release-1.6.2/flink-contrib/flink-storm/src/main/java/org/apache/flink/storm/api/FlinkLocalCluster.java

    // ------------------------------------------------------------------------
    //  Access to default local cluster
    // ------------------------------------------------------------------------

    // A different {@link FlinkLocalCluster} to be used for execution of ITCases
    private static LocalClusterFactory currentFactory = new DefaultLocalClusterFactory();

    /**
     * Returns a {@link FlinkLocalCluster} that should be used for execution. If no cluster was set by
     * {@link #initialize(LocalClusterFactory)} in advance, a new {@link FlinkLocalCluster} is returned.
     *
     * @return a {@link FlinkLocalCluster} to be used for execution
     */
    public static FlinkLocalCluster getLocalCluster() {
        return currentFactory.createLocalCluster();
    }

    /**
     * Sets a different factory for FlinkLocalClusters to be used for execution.
     *
     * @param clusterFactory
     *      The LocalClusterFactory to create the local clusters for execution.
     */
    public static void initialize(LocalClusterFactory clusterFactory) {
        currentFactory = Objects.requireNonNull(clusterFactory);
    }

    // ------------------------------------------------------------------------
    //  Cluster factory
    // ------------------------------------------------------------------------

    /**
     * A factory that creates local clusters.
     */
    public interface LocalClusterFactory {

        /**
         * Creates a local Flink cluster.
         * @return A local Flink cluster.
         */
        FlinkLocalCluster createLocalCluster();
    }

    /**
     * A factory that instantiates a FlinkLocalCluster.
     */
    public static class DefaultLocalClusterFactory implements LocalClusterFactory {

        @Override
        public FlinkLocalCluster createLocalCluster() {
            return new FlinkLocalCluster();
        }
    }

flink在FlinkLocalCluster里頭提供了一個(gè)靜態(tài)方法getLocalCluster，用來獲取FlinkLocalCluster，它是通過LocalClusterFactory來創(chuàng)建一個(gè)FlinkLocalCluster
LocalClusterFactory這里使用的是DefaultLocalClusterFactory實(shí)現(xiàn)類，它的createLocalCluster方法，直接new了一個(gè)FlinkLocalCluster
目前的實(shí)現(xiàn)來看，每次調(diào)用FlinkLocalCluster.getLocalCluster，都會(huì)創(chuàng)建一個(gè)新的FlinkLocalCluster，這個(gè)在調(diào)用的時(shí)候是需要注意一下的

FlinkTopology

flink-release-1.6.2/flink-contrib/flink-storm/src/main/java/org/apache/flink/storm/api/FlinkTopology.java

    /**
     * Creates a Flink program that uses the specified spouts and bolts.
     * @param stormBuilder The Storm topology builder to use for creating the Flink topology.
     * @return A {@link FlinkTopology} which contains the translated Storm topology and may be executed.
     */
    public static FlinkTopology createTopology(TopologyBuilder stormBuilder) {
        return new FlinkTopology(stormBuilder);
    }

    private FlinkTopology(TopologyBuilder builder) {
        this.builder = builder;
        this.stormTopology = builder.createTopology();
        // extract the spouts and bolts
        this.spouts = getPrivateField("_spouts");
        this.bolts = getPrivateField("_bolts");

        this.env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Kick off the translation immediately
        translateTopology();
    }

FlinkTopology提供了一個(gè)靜態(tài)工廠方法createTopology用來創(chuàng)建FlinkTopology
FlinkTopology先保存一下TopologyBuilder，然后通過getPrivateField反射調(diào)用getDeclaredField獲取_spouts、_bolts私有屬性然后保存起來，方便后面轉(zhuǎn)換topology使用
之后先獲取到ExecutionEnvironment，最后就是調(diào)用translateTopology進(jìn)行整個(gè)StormTopology的轉(zhuǎn)換

translateTopology

flink-release-1.6.2/flink-contrib/flink-storm/src/main/java/org/apache/flink/storm/api/FlinkTopology.java

    /**
     * Creates a Flink program that uses the specified spouts and bolts.
     */
    private void translateTopology() {

        unprocessdInputsPerBolt.clear();
        outputStreams.clear();
        declarers.clear();
        availableInputs.clear();

        // Storm defaults to parallelism 1
        env.setParallelism(1);

        /* Translation of topology */

        for (final Entry<String, IRichSpout> spout : spouts.entrySet()) {
            final String spoutId = spout.getKey();
            final IRichSpout userSpout = spout.getValue();

            final FlinkOutputFieldsDeclarer declarer = new FlinkOutputFieldsDeclarer();
            userSpout.declareOutputFields(declarer);
            final HashMap<String, Fields> sourceStreams = declarer.outputStreams;
            this.outputStreams.put(spoutId, sourceStreams);
            declarers.put(spoutId, declarer);

            final HashMap<String, DataStream<Tuple>> outputStreams = new HashMap<String, DataStream<Tuple>>();
            final DataStreamSource<?> source;

            if (sourceStreams.size() == 1) {
                final SpoutWrapper<Tuple> spoutWrapperSingleOutput = new SpoutWrapper<Tuple>(userSpout, spoutId, null, null);
                spoutWrapperSingleOutput.setStormTopology(stormTopology);

                final String outputStreamId = (String) sourceStreams.keySet().toArray()[0];

                DataStreamSource<Tuple> src = env.addSource(spoutWrapperSingleOutput, spoutId,
                        declarer.getOutputType(outputStreamId));

                outputStreams.put(outputStreamId, src);
                source = src;
            } else {
                final SpoutWrapper<SplitStreamType<Tuple>> spoutWrapperMultipleOutputs = new SpoutWrapper<SplitStreamType<Tuple>>(
                        userSpout, spoutId, null, null);
                spoutWrapperMultipleOutputs.setStormTopology(stormTopology);

                @SuppressWarnings({ "unchecked", "rawtypes" })
                DataStreamSource<SplitStreamType<Tuple>> multiSource = env.addSource(
                        spoutWrapperMultipleOutputs, spoutId,
                        (TypeInformation) TypeExtractor.getForClass(SplitStreamType.class));

                SplitStream<SplitStreamType<Tuple>> splitSource = multiSource
                        .split(new StormStreamSelector<Tuple>());
                for (String streamId : sourceStreams.keySet()) {
                    SingleOutputStreamOperator<Tuple> outStream = splitSource.select(streamId)
                            .map(new SplitStreamMapper<Tuple>());
                    outStream.getTransformation().setOutputType(declarer.getOutputType(streamId));
                    outputStreams.put(streamId, outStream);
                }
                source = multiSource;
            }
            availableInputs.put(spoutId, outputStreams);

            final ComponentCommon common = stormTopology.get_spouts().get(spoutId).get_common();
            if (common.is_set_parallelism_hint()) {
                int dop = common.get_parallelism_hint();
                source.setParallelism(dop);
            } else {
                common.set_parallelism_hint(1);
            }
        }

        /**
         * 1. Connect all spout streams with bolts streams
         * 2. Then proceed with the bolts stream already connected
         *
         * <p>Because we do not know the order in which an iterator steps over a set, we might process a consumer before
         * its producer
         * ->thus, we might need to repeat multiple times
         */
        boolean makeProgress = true;
        while (bolts.size() > 0) {
            if (!makeProgress) {
                StringBuilder strBld = new StringBuilder();
                strBld.append("Unable to build Topology. Could not connect the following bolts:");
                for (String boltId : bolts.keySet()) {
                    strBld.append("\n  ");
                    strBld.append(boltId);
                    strBld.append(": missing input streams [");
                    for (Entry<GlobalStreamId, Grouping> streams : unprocessdInputsPerBolt
                            .get(boltId)) {
                        strBld.append("'");
                        strBld.append(streams.getKey().get_streamId());
                        strBld.append("' from '");
                        strBld.append(streams.getKey().get_componentId());
                        strBld.append("'; ");
                    }
                    strBld.append("]");
                }

                throw new RuntimeException(strBld.toString());
            }
            makeProgress = false;

            final Iterator<Entry<String, IRichBolt>> boltsIterator = bolts.entrySet().iterator();
            while (boltsIterator.hasNext()) {

                final Entry<String, IRichBolt> bolt = boltsIterator.next();
                final String boltId = bolt.getKey();
                final IRichBolt userBolt = copyObject(bolt.getValue());

                final ComponentCommon common = stormTopology.get_bolts().get(boltId).get_common();

                Set<Entry<GlobalStreamId, Grouping>> unprocessedBoltInputs = unprocessdInputsPerBolt.get(boltId);
                if (unprocessedBoltInputs == null) {
                    unprocessedBoltInputs = new HashSet<>();
                    unprocessedBoltInputs.addAll(common.get_inputs().entrySet());
                    unprocessdInputsPerBolt.put(boltId, unprocessedBoltInputs);
                }

                // check if all inputs are available
                final int numberOfInputs = unprocessedBoltInputs.size();
                int inputsAvailable = 0;
                for (Entry<GlobalStreamId, Grouping> entry : unprocessedBoltInputs) {
                    final String producerId = entry.getKey().get_componentId();
                    final String streamId = entry.getKey().get_streamId();
                    final HashMap<String, DataStream<Tuple>> streams = availableInputs.get(producerId);
                    if (streams != null && streams.get(streamId) != null) {
                        inputsAvailable++;
                    }
                }

                if (inputsAvailable != numberOfInputs) {
                    // traverse other bolts first until inputs are available
                    continue;
                } else {
                    makeProgress = true;
                    boltsIterator.remove();
                }

                final Map<GlobalStreamId, DataStream<Tuple>> inputStreams = new HashMap<>(numberOfInputs);

                for (Entry<GlobalStreamId, Grouping> input : unprocessedBoltInputs) {
                    final GlobalStreamId streamId = input.getKey();
                    final Grouping grouping = input.getValue();

                    final String producerId = streamId.get_componentId();

                    final Map<String, DataStream<Tuple>> producer = availableInputs.get(producerId);

                    inputStreams.put(streamId, processInput(boltId, userBolt, streamId, grouping, producer));
                }

                final SingleOutputStreamOperator<?> outputStream = createOutput(boltId,
                        userBolt, inputStreams);

                if (common.is_set_parallelism_hint()) {
                    int dop = common.get_parallelism_hint();
                    outputStream.setParallelism(dop);
                } else {
                    common.set_parallelism_hint(1);
                }

            }
        }
    }

整個(gè)轉(zhuǎn)換是先轉(zhuǎn)換spout，再轉(zhuǎn)換bolt，他們根據(jù)的spouts及bolts信息是在構(gòu)造器里頭使用反射從storm的TopologyBuilder對象獲取到的
flink使用FlinkOutputFieldsDeclarer(它實(shí)現(xiàn)了storm的OutputFieldsDeclarer接口)來承載storm的IRichSpout及IRichBolt里頭配置的declareOutputFields信息，不過要注意的是flink不支持dirct emit；這里通過userSpout.declareOutputFields方法，將原始spout的declare信息設(shè)置到FlinkOutputFieldsDeclarer
flink使用SpoutWrapper來包裝spout，將其轉(zhuǎn)換為RichParallelSourceFunction類型，這里對spout的outputStreams的個(gè)數(shù)是否大于1進(jìn)行不同處理；之后就是將RichParallelSourceFunction作為StreamExecutionEnvironment.addSource方法的參數(shù)創(chuàng)建flink的DataStreamSource，并添加到availableInputs中，然后根據(jù)spout的parallelismHit來設(shè)置DataStreamSource的parallelism
對于bolt的轉(zhuǎn)換，這里維護(hù)了unprocessdInputsPerBolt，key為boltId，value為該bolt要連接的GlobalStreamId及Grouping方式，由于是使用map來進(jìn)行遍歷的，因此轉(zhuǎn)換的bolt可能亂序，如果連接的GlobalStreamId存在則進(jìn)行轉(zhuǎn)換，然后從bolts中移除，bolt連接的GlobalStreamId不在availableInputs中的時(shí)候，需要跳過處理下一個(gè)，不會(huì)從bolts中移除，因?yàn)橥鈱拥难h(huán)條件是bolts的size大于0，就是依靠這個(gè)機(jī)制來處理亂序
對于bolt的轉(zhuǎn)換有一個(gè)重要的方法就是processInput，它把bolt的grouping轉(zhuǎn)換為對spout的DataStream的對應(yīng)操作(比如shuffleGrouping轉(zhuǎn)換為對DataStream的rebalance操作，fieldsGrouping轉(zhuǎn)換為對DataStream的keyBy操作，globalGrouping轉(zhuǎn)換為global操作，allGrouping轉(zhuǎn)換為broadcast操作)，之后調(diào)用createOutput方法轉(zhuǎn)換bolt的執(zhí)行邏輯，它使用BoltWrapper或者M(jìn)ergedInputsBoltWrapper將bolt轉(zhuǎn)換為flink的OneInputStreamOperator，然后作為參數(shù)對stream進(jìn)行transform操作返回flink的SingleOutputStreamOperator，同時(shí)將轉(zhuǎn)換后的SingleOutputStreamOperator添加到availableInputs中，之后根據(jù)bolt的parallelismHint對這個(gè)SingleOutputStreamOperator設(shè)置parallelism

FlinkLocalCluster

flink-storm_2.11-1.6.2-sources.jar!/org/apache/flink/storm/api/FlinkLocalCluster.java

/**
 * {@link FlinkLocalCluster} mimics a Storm {@link LocalCluster}.
 */
public class FlinkLocalCluster {

    /** The log used by this mini cluster. */
    private static final Logger LOG = LoggerFactory.getLogger(FlinkLocalCluster.class);

    /** The Flink mini cluster on which to execute the programs. */
    private FlinkMiniCluster flink;

    /** Configuration key to submit topology in blocking mode if flag is set to {@code true}. */
    public static final String SUBMIT_BLOCKING = "SUBMIT_STORM_TOPOLOGY_BLOCKING";

    public FlinkLocalCluster() {
    }

    public FlinkLocalCluster(FlinkMiniCluster flink) {
        this.flink = Objects.requireNonNull(flink);
    }

    @SuppressWarnings("rawtypes")
    public void submitTopology(final String topologyName, final Map conf, final FlinkTopology topology)
            throws Exception {
        this.submitTopologyWithOpts(topologyName, conf, topology, null);
    }

    @SuppressWarnings("rawtypes")
    public void submitTopologyWithOpts(final String topologyName, final Map conf, final FlinkTopology topology, final SubmitOptions submitOpts) throws Exception {
        LOG.info("Running Storm topology on FlinkLocalCluster");

        boolean submitBlocking = false;
        if (conf != null) {
            Object blockingFlag = conf.get(SUBMIT_BLOCKING);
            if (blockingFlag instanceof Boolean) {
                submitBlocking = ((Boolean) blockingFlag).booleanValue();
            }
        }

        FlinkClient.addStormConfigToTopology(topology, conf);

        StreamGraph streamGraph = topology.getExecutionEnvironment().getStreamGraph();
        streamGraph.setJobName(topologyName);

        JobGraph jobGraph = streamGraph.getJobGraph();

        if (this.flink == null) {
            Configuration configuration = new Configuration();
            configuration.addAll(jobGraph.getJobConfiguration());

            configuration.setString(TaskManagerOptions.MANAGED_MEMORY_SIZE, "0");
            configuration.setInteger(TaskManagerOptions.NUM_TASK_SLOTS, jobGraph.getMaximumParallelism());

            this.flink = new LocalFlinkMiniCluster(configuration, true);
            this.flink.start();
        }

        if (submitBlocking) {
            this.flink.submitJobAndWait(jobGraph, false);
        } else {
            this.flink.submitJobDetached(jobGraph);
        }
    }

    public void killTopology(final String topologyName) {
        this.killTopologyWithOpts(topologyName, null);
    }

    public void killTopologyWithOpts(final String name, final KillOptions options) {
    }

    public void activate(final String topologyName) {
    }

    public void deactivate(final String topologyName) {
    }

    public void rebalance(final String name, final RebalanceOptions options) {
    }

    public void shutdown() {
        if (this.flink != null) {
            this.flink.stop();
            this.flink = null;
        }
    }

    //......
}

FlinkLocalCluster的submitTopology方法調(diào)用了submitTopologyWithOpts，而后者主要是設(shè)置一些參數(shù)，調(diào)用topology.getExecutionEnvironment().getStreamGraph()根據(jù)transformations生成StreamGraph，再獲取JobGraph，然后創(chuàng)建LocalFlinkMiniCluster并start，最后使用LocalFlinkMiniCluster的submitJobAndWait或submitJobDetached來提交整個(gè)JobGraph

小結(jié)

flink通過FlinkTopology對storm提供了一定的兼容性，這對于遷移storm到flink非常有幫助
要在flink上運(yùn)行storm的topology，主要有幾個(gè)步驟，分別是構(gòu)建storm原生的TopologyBuilder，之后通過FlinkTopology.createTopology(builder)來將StormTopology轉(zhuǎn)換為FlinkTopology，最后是通過FlinkLocalCluster(本地模式)或者FlinkSubmitter(遠(yuǎn)程提交)的submitTopology方法提交FlinkTopology
FlinkTopology是flink兼容storm的核心，它負(fù)責(zé)將StormTopology轉(zhuǎn)換為flink對應(yīng)的結(jié)構(gòu)，比如使用SpoutWrapper將spout轉(zhuǎn)換為RichParallelSourceFunction，然后添加到StreamExecutionEnvironment創(chuàng)建DataStream，把bolt的grouping轉(zhuǎn)換為對spout的DataStream的對應(yīng)操作(比如shuffleGrouping轉(zhuǎn)換為對DataStream的rebalance操作，fieldsGrouping轉(zhuǎn)換為對DataStream的keyBy操作，globalGrouping轉(zhuǎn)換為global操作，allGrouping轉(zhuǎn)換為broadcast操作)，然后使用BoltWrapper或者M(jìn)ergedInputsBoltWrapper將bolt轉(zhuǎn)換為flink的OneInputStreamOperator，然后作為參數(shù)對stream進(jìn)行transform操作
構(gòu)建完FlinkTopology之后，就使用FlinkLocalCluster提交到本地執(zhí)行，或者使用FlinkSubmitter提交到遠(yuǎn)程執(zhí)行
FlinkLocalCluster的submitTopology方法主要是通過FlinkTopology作用的StreamExecutionEnvironment生成StreamGraph，通過它獲取JobGraph，然后創(chuàng)建LocalFlinkMiniCluster并start，最后通過LocalFlinkMiniCluster提交JobGraph

doc

Storm Compatibility Beta

色偷偷精品伊人,欧洲久久精品,欧美综合婷婷骚逼,国产AV主播,国产最新探花在线,九色在线视频一区,伊人大交九欧美,1769亚洲,黄色成人av

聊聊flink如何兼容StormTopology

聊聊flink如何兼容StormTopology

序

實(shí)例

LocalClusterFactory

FlinkTopology

translateTopology

FlinkLocalCluster

小結(jié)

doc

相關(guān)閱讀更多精彩內(nèi)容

友情鏈接更多精彩內(nèi)容

色偷偷精品伊人,欧洲久久精品,欧美综合婷婷骚逼,国产AV主播,国产最新探花在线,九色在线视频一区,伊人大交九 欧美,1769亚洲,黄色成人av

聊聊flink如何兼容StormTopology

序

實(shí)例

LocalClusterFactory

FlinkTopology

translateTopology

FlinkLocalCluster

小結(jié)

doc

相關(guān)閱讀更多精彩內(nèi)容

友情鏈接更多精彩內(nèi)容

色偷偷精品伊人,欧洲久久精品,欧美综合婷婷骚逼,国产AV主播,国产最新探花在线,九色在线视频一区,伊人大交九欧美,1769亚洲,黄色成人av