
This article uses a socket data source as its example and follows a WordCount computation to trace how jobs are generated.
The code is as follows:

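    import org.apache.spark.SparkConf
    import org.apache.spark.storage.StorageLevel
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object NetworkWordCount {
      def main(args: Array[String]): Unit = {
        if (args.length < 2) {
          System.err.println("Usage: NetworkWordCount <hostname> <port>")
          System.exit(1)
        }
        // local[2]: one thread for the receiver, one for processing
        val sparkConf = new SparkConf().setAppName("NetworkWordCount").setMaster("local[2]")
        // Batch interval of 1 second
        val ssc = new StreamingContext(sparkConf, Seconds(1))
        val lines = ssc.socketTextStream(args(0), args(1).toInt, StorageLevel.MEMORY_AND_DISK_SER)
        // Split each line into words on spaces
        val words = lines.flatMap(_.split(" "))
        val wordCounts = words.map(x => (x, 1)).reduceByKey(_ + _)
        wordCounts.print()
        ssc.start()
        ssc.awaitTermination()
      }
    }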

Start from ssc.start(). The start method calls the start() method of scheduler, and the scheduler here is the
JobScheduler. Here is the code of its start method:

def start(): Unit = synchronized {
  if (eventLoop != null) return // scheduler has already been started

  logDebug("Starting JobScheduler")
  eventLoop = new EventLoop[JobSchedulerEvent]("JobScheduler") {
    override protected def onReceive(event: JobSchedulerEvent): Unit = processEvent(event)

    override protected def onError(e: Throwable): Unit = reportError("Error in job scheduler", e)
  }
  // Start the JobScheduler's event loop
  eventLoop.start()

  // attach rate controllers of input streams to receive batch completion updates
  for {
    inputDStream <- ssc.graph.getInputStreams
    rateController <- inputDStream.rateController
  } ssc.addStreamingListener(rateController)

  listenerBus.start(ssc.sparkContext)
  receiverTracker = new ReceiverTracker(ssc)
  inputInfoTracker = new InputInfoTracker(ssc)
  // Start the ReceiverTracker; the data-receiving logic begins here
  receiverTracker.start()
  // Start the JobGenerator; job generation begins here
  jobGenerator.start()
  logInfo("Started JobScheduler")
}

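Spark Streaming is built around three major components: JobScheduler, ReceiverTracker, and JobGenerator.
ReceiverTracker and JobGenerator are both held inside JobScheduler. The start method above starts each of the three components in turn.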

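Let us look at job generation first, that is, the jobGenerator.start() method. What does JobGenerator's start method do? Read on.
It first starts an EventLoop whose onReceive calls back into the processEvent method. So when is that callback triggered? Take a look at the internal structure of EventLoop: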

private[spark] abstract class EventLoop[E](name: String) extends Logging {

  // Thread-safe blocking queue
  private val eventQueue: BlockingQueue[E] = new LinkedBlockingDeque[E]()

  private val stopped = new AtomicBoolean(false)

  private val eventThread = new Thread(name) {
    // Run as a daemon (background) thread
    setDaemon(true)

    override def run(): Unit = {
      try {
        while (!stopped.get) {
          val event = eventQueue.take()
          try {
            // Call back into the subclass's onReceive method, which holds the event-handling logic
            onReceive(event)
          } catch {
            case NonFatal(e) => {
              try {
                onError(e)
              } catch {
                case NonFatal(e) => logError("Unexpected error in " + name, e)
              }
            }
          }
        }
      } catch {
        case ie: InterruptedException => // exit even if eventQueue is not empty
        case NonFatal(e) => logError("Unexpected error in " + name, e)
      }
    }
  }

  def start(): Unit = {
    if (stopped.get) {
      throw new IllegalStateException(name + " has already been stopped")
    }
    // Call onStart before starting the event thread to make sure it happens before onReceive
    onStart()
    // Start the event loop thread
    eventThread.start()
  }

  def stop(): Unit = {
    // stopped.compareAndSet(false, true) atomically checks that the value is false and sets it to true
    if (stopped.compareAndSet(false, true)) {
      eventThread.interrupt()
      var onStopCalled = false
      try {
        eventThread.join()
        // Call onStop after the event thread exits to make sure onReceive happens before onStop
        onStopCalled = true
        onStop()
      } catch {
        case ie: InterruptedException =>
          Thread.currentThread().interrupt()
          if (!onStopCalled) {
            // ie is thrown from `eventThread.join()`. Otherwise, we should not call `onStop` since
            // it's already called.
            onStop()
          }
      }
    } else {
      // Keep quiet to allow calling `stop` multiple times.
    }
  }

  def post(event: E): Unit = {
    eventQueue.put(event)
  }

  def isActive: Boolean = eventThread.isAlive

  protected def onStart(): Unit = {}

  protected def onStop(): Unit = {}

  protected def onReceive(event: E): Unit

  protected def onError(e: Throwable): Unit

}
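
The pattern in the listing above (one blocking queue, one daemon thread, one callback) can be reproduced in a few lines. The following is only a minimal sketch, not Spark's EventLoop: the names MiniEventLoop and MiniEventLoopDemo are hypothetical, events are plain strings, and error handling is left out.

```scala
import java.util.concurrent.{CountDownLatch, LinkedBlockingDeque, TimeUnit}
import java.util.concurrent.atomic.AtomicBoolean
import scala.collection.mutable.ArrayBuffer

// Hypothetical, stripped-down event loop (an illustration, not Spark's class).
abstract class MiniEventLoop[E](name: String) {
  private val queue = new LinkedBlockingDeque[E]()
  private val stopped = new AtomicBoolean(false)

  private val thread = new Thread(name) {
    setDaemon(true)
    override def run(): Unit = {
      try {
        while (!stopped.get) {
          onReceive(queue.take()) // blocks until an event has been posted
        }
      } catch {
        case _: InterruptedException => // stop() was called; exit quietly
      }
    }
  }

  def start(): Unit = thread.start()

  // Producers (for example a timer) only enqueue; they never run the handler.
  def post(event: E): Unit = queue.put(event)

  def stop(): Unit =
    if (stopped.compareAndSet(false, true)) {
      thread.interrupt()
      thread.join()
    }

  protected def onReceive(event: E): Unit
}

object MiniEventLoopDemo {
  // Posts three events and returns them in the order the loop thread handled them.
  def run(): List[String] = {
    val handled = ArrayBuffer.empty[String]
    val done = new CountDownLatch(3)
    val loop = new MiniEventLoop[String]("demo-loop") {
      override protected def onReceive(event: String): Unit = {
        handled.synchronized { handled += event }
        done.countDown()
      }
    }
    loop.start()
    Seq("GenerateJobs(1000)", "GenerateJobs(2000)", "DoCheckpoint(2000)").foreach(loop.post)
    done.await(5, TimeUnit.SECONDS)
    loop.stop()
    handled.synchronized { handled.toList }
  }
}
```

Because a single thread drains a FIFO queue, events are handled one at a time in the order they were posted, whichever thread called post.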

Internally, EventLoop maintains a queue and runs a daemon thread that calls back into the implementing class's onReceive method.
So when are events put into the EventLoop's queue? To answer that, look for calls to EventLoop's post method. When JobGenerator is
instantiated, it creates a RecurringTimer, as follows:

private val timer = new RecurringTimer(clock, ssc.graph.batchDuration.milliseconds,
  // The callback posts a GenerateJobs event to the event loop
  longTime => eventLoop.post(GenerateJobs(new Time(longTime))), "JobGenerator")

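RecurringTimer is simply a timer. Look at its constructor parameters and internal code:
* @param clock the clock
* @param period the interval between ticks
* @param callback the callback method
* @param name the name of the timer
It is clear that the timer periodically invokes callback at the interval passed in by the user. The callback is the one seen above: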

longTime => eventLoop.post(GenerateJobs(new Time(longTime)))

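It submits a GenerateJobs event to the EventLoop's queue. At this point the RecurringTimer has not started running yet.
Going back to JobGenerator's start method and reading on: because this is the first run, the startFirstTime method is called.
In startFirstTime there is one key line, timer.start(startTime.milliseconds), and here at last the timer is started.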
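
To make the timer's role concrete, here is a minimal sketch of a recurring timer that invokes a callback with each scheduled time. It is not Spark's RecurringTimer: MiniRecurringTimer, BatchTimes, and MiniRecurringTimerDemo are hypothetical names, the system clock replaces the clock parameter, and aligning the start time to the next multiple of the period is an assumption made for this sketch.

```scala
import java.util.concurrent.{CountDownLatch, TimeUnit}
import scala.collection.mutable.ArrayBuffer

object BatchTimes {
  // Assumption for this sketch: the first tick is the first multiple of the
  // period strictly after `now`.
  def alignedStartTime(now: Long, periodMs: Long): Long = (now / periodMs + 1) * periodMs
}

// Hypothetical simplified timer (an illustration, not Spark's class).
class MiniRecurringTimer(periodMs: Long, callback: Long => Unit, name: String) {
  @volatile private var stopped = false
  private var nextTime = 0L

  private val thread = new Thread("MiniRecurringTimer - " + name) {
    setDaemon(true)
    override def run(): Unit = {
      while (!stopped) {
        val sleepMs = nextTime - System.currentTimeMillis()
        if (sleepMs > 0) Thread.sleep(sleepMs)
        callback(nextTime) // pass the scheduled time, not the wake-up time
        nextTime += periodMs
      }
    }
  }

  // Nothing fires until start is called, even though the callback already exists.
  def start(startTime: Long): Long = {
    nextTime = startTime
    thread.start()
    startTime
  }

  def stop(): Unit = {
    stopped = true
    thread.join()
  }
}

object MiniRecurringTimerDemo {
  // Collects the first three tick times produced with a 50 ms period.
  def run(): List[Long] = {
    val period = 50L
    val times = ArrayBuffer.empty[Long]
    val done = new CountDownLatch(3)
    val timer = new MiniRecurringTimer(period, t => {
      times.synchronized { if (times.size < 3) times += t }
      done.countDown()
    }, "demo")
    timer.start(BatchTimes.alignedStartTime(System.currentTimeMillis(), period))
    done.await(5, TimeUnit.SECONDS)
    timer.stop()
    times.synchronized { times.toList }
  }
}
```

Passing the scheduled time rather than the wake-up time keeps consecutive ticks exactly one period apart, even when the thread wakes up slightly late.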

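Tracing back from the timer's start method: the timer periodically invokes eventLoop.post, which places a GenerateJobs event on the EventLoop's queue. The event thread then calls back into processEvent, which leads to generateJobs(time).
The code of generateJobs is as follows: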

private def generateJobs(time: Time) {
  // Set the SparkEnv in this thread, so that job generation code can access the environment
  // Example: BlockRDDs are created in this thread, and it needs to access BlockManager
  // Update: This is probably redundant after threadlocal stuff in SparkEnv has been removed.
  SparkEnv.set(ssc.env)
  Try {
    jobScheduler.receiverTracker.allocateBlocksToBatch(time) // allocate received blocks to batch
    graph.generateJobs(time) // generate jobs using allocated block
  } match {
    case Success(jobs) =>
      // Get the input metadata for this batch
      val streamIdToInputInfos = jobScheduler.inputInfoTracker.getInfo(time)
      // Submit the JobSet
      jobScheduler.submitJobSet(JobSet(time, jobs, streamIdToInputInfos))
    case Failure(e) =>
      jobScheduler.reportError("Error generating jobs for time " + time, e)
  }
  eventLoop.post(DoCheckpoint(time, clearCheckpointDataLater = false))
}

Stepping into graph.generateJobs(time): it calls the generateJob method of every output stream. The code of generateJob is as follows:

private[streaming] def generateJob(time: Time): Option[Job] = {
  getOrCompute(time) match {
    case Some(rdd) => {
      // jobFunc wraps the call to runJob
      val jobFunc = () => {
        val emptyFunc = { (iterator: Iterator[T]) => {} }
        context.sparkContext.runJob(rdd, emptyFunc)
      }
      Some(new Job(time, jobFunc))
    }
    case None => None
  }
}

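getOrCompute returns an RDD (how that RDD is generated will be covered later). The method then defines a function, jobFunc, whose purpose is to submit the Spark job.
jobFunc is only defined here, not invoked. It is wrapped in a Job object, which is then returned.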

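graph.generateJobs returns multiple jobs. Once the jobs have been generated successfully, they are submitted as a JobSet:

jobScheduler.submitJobSet(JobSet(time, jobs, streamIdToInputInfos))

Each job is then submitted individually: the job is wrapped in a JobHandler (a Runnable subclass) and handed to a thread pool. The pool
executes JobHandler's run method, which calls job.run(). Job's run method is a single line that executes Try(func()), and this func is
the jobFunc from the code above. At this point the whole path from job generation to job submission is connected.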
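
The hand-off described above (a function wrapped in a job, the job wrapped in a Runnable, the Runnable run by a thread pool) can be sketched as follows. This is only an illustration with hypothetical names (MiniJob, MiniJobHandler, MiniJobSchedulerDemo), not Spark's Job, JobSet, or JobHandler classes, and the single-thread pool is an assumption of the sketch.

```scala
import java.util.concurrent.{Executors, TimeUnit}
import java.util.concurrent.atomic.AtomicInteger
import scala.util.Try

// Hypothetical stand-in for a job: it only holds a function until run() is called.
class MiniJob(val time: Long, func: () => Any) {
  @volatile var result: Option[Try[Any]] = None

  // Mirrors the single line described above: execute Try(func()).
  def run(): Unit = { result = Some(Try(func())) }
}

// Hypothetical stand-in for the handler: a Runnable that delegates to the job.
class MiniJobHandler(job: MiniJob) extends Runnable {
  override def run(): Unit = job.run()
}

object MiniJobSchedulerDemo {
  // Builds one job per output operation, submits them to a pool, and returns how many ran.
  def run(): Int = {
    val executed = new AtomicInteger(0)
    val jobs = Seq("output-1", "output-2").map { _ =>
      new MiniJob(1000L, () => executed.incrementAndGet())
    }
    // Creating the jobs has not executed anything yet.
    require(executed.get() == 0)

    val pool = Executors.newFixedThreadPool(1)
    jobs.foreach(job => pool.execute(new MiniJobHandler(job)))
    pool.shutdown()
    pool.awaitTermination(5, TimeUnit.SECONDS)
    executed.get()
  }
}
```

The point of the sketch is the deferred execution: generating a job only packages a function, and the work happens later on a pool thread when the handler's run method is invoked.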

A flow chart of dynamic job generation is attached below.

If anything above is wrong, corrections are welcome.


6 Spark Streaming: Dynamic Job Generation
