[Chapter 8] An In-Depth Look at How the Worker Works

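In the previous section we walked through the source code of Spark's resource scheduling algorithm. There, the Master sends LaunchDriver and LaunchExecutor messages to have a Worker start a Driver and Executors. This section examines how the Worker handles each of these two cases.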

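1: The Master asks the Worker to start a Driver and an Executor by sending it LaunchDriver and LaunchExecutor messages respectively. The Worker handles LaunchDriver as follows: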

 case LaunchDriver(driverId, driverDesc) => {
      logInfo(s"Asked to launch driver $driverId")
      val driver = new DriverRunner(
        conf,
        driverId,
        workDir,
        sparkHome,
        driverDesc.copy(command = Worker.maybeUpdateSSLSettings(driverDesc.command, conf)),
        self,
        akkaUrl)
      drivers(driverId) = driver
      driver.start()

      coresUsed += driverDesc.cores
      memoryUsed += driverDesc.mem
    }
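The bookkeeping in this handler can be tried in isolation. The following is a minimal sketch, not Spark's code: DriverDesc, LaunchDriver and ToyWorker are hypothetical stand-ins defined here, and no DriverRunner or Akka actor is involved. It only shows how a message is matched and how its cores and memory are charged to the worker.

```scala
import scala.collection.mutable

// Hypothetical stand-ins for Spark's description and message types.
final case class DriverDesc(cores: Int, mem: Int)
final case class LaunchDriver(driverId: String, driverDesc: DriverDesc)

// A toy worker that only records what it has been asked to launch.
class ToyWorker {
  val drivers = mutable.HashMap.empty[String, DriverDesc]
  var coresUsed = 0
  var memoryUsed = 0

  // Same shape as the handler above: match the message, register the
  // driver, then add its resources to the worker's usage counters.
  def receive(message: Any): Unit = message match {
    case LaunchDriver(driverId, driverDesc) =>
      drivers(driverId) = driverDesc
      coresUsed += driverDesc.cores
      memoryUsed += driverDesc.mem
    case other =>
      println(s"Ignoring unknown message: $other")
  }
}

object ToyWorkerDemo {
  def main(args: Array[String]): Unit = {
    val worker = new ToyWorker
    worker.receive(LaunchDriver("driver-1", DriverDesc(cores = 2, mem = 1024)))
    worker.receive(LaunchDriver("driver-2", DriverDesc(cores = 1, mem = 512)))
    println(s"cores=${worker.coresUsed}, memory=${worker.memoryUsed}")
  }
}
```

Running ToyWorkerDemo prints `cores=3, memory=1536`, the sum of what the two launched drivers requested.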

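The handler for LaunchDriver creates a DriverRunner object and calls driver.start(). The start() method does not run the driver inline; it starts a new thread, as the following DriverRunner code shows: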

  /** Starts a thread to run and manage the driver. */
  def start() = {
    // Create a new thread and start it via start()
    new Thread("DriverRunner for " + driverId) {
      override def run() {
        try {
          // Create the driver's working directory
          val driverDir = createWorkingDirectory()
          // Download the jar uploaded by the user (the application we wrote)
          val localJarFilename = downloadUserJar(driverDi

          def substituteVariables(argument: String): String = argument match {
            case "{{WORKER_URL}}" => workerUrl
            case "{{USER_JAR}}" => localJarFilename
            case other => other

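As you can see, this is an ordinary Java thread (java.lang.Thread). The Spark source code uses Java classes heavily, and we will meet more of them later. It also means that having learned Scala does not oblige us to write an Application purely in Scala.
In the code above, createWorkingDirectory() is called first to create the working directory. Its driverDir = new File(...) is likewise Java's File class: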

private def createWorkingDirectory(): File = {
   val driverDir = new File(workDir, driverId)
   if (!driverDir.exists() && !driverDir.mkdirs()) {
     throw new IOException("Failed to create directory " + driverDir)
   }
   driverDir
 }
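Both points, the named Java thread and the java.io.File working directory, can be tried together outside Spark. This is a small sketch under stated assumptions: WorkDirDemo, the "demo-driver" id and the temporary directory that stands in for the Worker's workDir are all made up for illustration.

```scala
import java.io.{File, IOException}
import java.nio.file.Files

object WorkDirDemo {
  // Same check as createWorkingDirectory above: fail only if the
  // directory neither exists already nor can be created.
  def createWorkingDirectory(workDir: File, driverId: String): File = {
    val driverDir = new File(workDir, driverId)
    if (!driverDir.exists() && !driverDir.mkdirs()) {
      throw new IOException("Failed to create directory " + driverDir)
    }
    driverDir
  }

  def main(args: Array[String]): Unit = {
    // A temporary directory stands in for the Worker's workDir.
    val workDir = Files.createTempDirectory("toy-worker").toFile

    // A plain java.lang.Thread, subclassed anonymously from Scala.
    val thread = new Thread("DriverRunner for demo-driver") {
      override def run(): Unit = {
        val dir = createWorkingDirectory(workDir, "demo-driver")
        println(s"${Thread.currentThread().getName} created ${dir.getName}")
      }
    }
    thread.start()
    thread.join() // wait so the JVM does not exit before the thread is done
  }
}
```

Calling createWorkingDirectory twice with the same arguments is harmless, because the exists() check short-circuits before mkdirs() is attempted.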

Next, the code creates a ProcessBuilder and uses it to start the driver process:

val builder = CommandUtils.buildProcessBuilder(driverDesc.command, driverDesc.mem,
            sparkHome.getAbsolutePath, substituteVariables)
          launchDriver(builder, driverDir, driverDesc.supervise)
        }

...
 val processStart = clock.getTimeMillis()
      val exitCode = process.get.waitFor()
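The substitute-then-launch pattern can be sketched with the JDK alone. The code below is an illustration, not CommandUtils: LaunchDemo, the sample class name, the worker URL and the jar path are hypothetical, and the child process is simply the current JVM's own `java -version`, assuming a launcher exists under java.home/bin.

```scala
import java.io.File

object LaunchDemo {
  // Same shape as substituteVariables: known placeholders are replaced,
  // every other argument passes through unchanged.
  def substitute(workerUrl: String, userJar: String)(argument: String): String =
    argument match {
      case "{{WORKER_URL}}" => workerUrl
      case "{{USER_JAR}}" => userJar
      case other => other
    }

  // Start a child process in workingDir and block until it exits.
  def run(command: Seq[String], workingDir: File): Int = {
    val builder = new ProcessBuilder(command: _*)
    builder.directory(workingDir)
    builder.inheritIO() // let the child write to this process's console
    val process = builder.start()
    process.waitFor() // returns the child's exit code
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical argument template with the two placeholders.
    val template = Seq("org.example.DriverMain", "{{WORKER_URL}}", "{{USER_JAR}}", "--verbose")
    val resolved = template.map(arg => substitute("akka.tcp://worker@host:1234", "/tmp/app.jar")(arg))
    println(resolved.mkString(" "))

    // Assumption: the running JVM ships a launcher at java.home/bin/java.
    val java = new File(new File(System.getProperty("java.home"), "bin"), "java").getPath
    val exitCode = run(Seq(java, "-version"), new File("."))
    println(s"exit code: $exitCode")
  }
}
```

As in the excerpt above, waitFor() blocks the calling thread until the child exits, which is one reason the launch runs on its own thread rather than inside the Worker's message handler.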

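Next, once the driver process has ended, whether it exited on its own or was killed, the DriverRunner sends a DriverStateChanged message to the Worker, which in turn notifies the Master so that it can change the driver's state: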

  finalState = Some(state)
  worker ! DriverStateChanged(driverId, state, finalException)
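The excerpt does not show how `state` is computed. As a hedged illustration only, one plausible rule maps a kill request, a caught exception and the exit code to the four final states that the Worker logs. The DriverState enumeration and the finalState function below are stand-ins written for this sketch, and the precedence between the cases is an assumption, not something quoted from the article.

```scala
// Hypothetical stand-in that reuses the four state names the Worker logs.
object DriverState extends Enumeration {
  val FINISHED, FAILED, KILLED, ERROR = Value
}

object FinalStateDemo {
  // Assumed precedence: an explicit kill wins, then an exception raised
  // while launching, and only then is the exit code inspected.
  def finalState(
      killed: Boolean,
      exception: Option[Exception],
      exitCode: Option[Int]): DriverState.Value =
    if (killed) DriverState.KILLED
    else if (exception.isDefined) DriverState.ERROR
    else exitCode match {
      case Some(0) => DriverState.FINISHED
      case _ => DriverState.FAILED
    }

  def main(args: Array[String]): Unit = {
    println(finalState(killed = false, exception = None, exitCode = Some(0)))
    println(finalState(killed = false, exception = None, exitCode = Some(1)))
    println(finalState(killed = true, exception = None, exitCode = Some(143)))
  }
}
```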

Below is the Worker's handler for DriverStateChanged:

case DriverStateChanged(driverId, state, exception) => {
   state match {
     case DriverState.ERROR =>
       logWarning(s"Driver $driverId failed with unrecoverable exception: ${exception.get}")
     case DriverState.FAILED =>
       logWarning(s"Driver $driverId exited with failure")
     case DriverState.FINISHED =>
       logInfo(s"Driver $driverId exited successfully")
     case DriverState.KILLED =>
       logInfo(s"Driver $driverId was killed by user")
     case _ =>
       logDebug(s"Driver $driverId changed state to $state")
   }
   // Notify the Master so that it updates the driver's state
   master ! DriverStateChanged(driverId, state, exception)
   val driver = drivers.remove(driverId).get
   finishedDrivers(driverId) = driver
   memoryUsed -= driver.driverDesc.mem

At this point the analysis connects with what we covered in the previous sections: when the Master receives the state change from the Worker, it updates the Driver information it keeps in memory. That is how a Driver runs on a Worker.
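The Worker-side steps on a state change (forward to the Master, move the driver to the finished map, release its memory) can be sketched without Akka. Everything here is hypothetical: RunningDriver, StateChanged and ToyWorkerState are made-up names, and the Master is reduced to a callback function.

```scala
import scala.collection.mutable

// Hypothetical stand-ins; the Master is modelled as a plain callback.
final case class RunningDriver(id: String, mem: Int)
final case class StateChanged(driverId: String, state: String)

class ToyWorkerState(notifyMaster: StateChanged => Unit) {
  val drivers = mutable.HashMap.empty[String, RunningDriver]
  val finishedDrivers = mutable.HashMap.empty[String, RunningDriver]
  var memoryUsed = 0

  def launch(driver: RunningDriver): Unit = {
    drivers(driver.id) = driver
    memoryUsed += driver.mem
  }

  def onStateChanged(msg: StateChanged): Unit = {
    // Forward the change first, as the handler above does.
    notifyMaster(msg)
    // Then move the driver to the finished map and release its memory.
    drivers.remove(msg.driverId).foreach { driver =>
      finishedDrivers(msg.driverId) = driver
      memoryUsed -= driver.mem
    }
  }
}

object ToyWorkerStateDemo {
  def main(args: Array[String]): Unit = {
    val worker = new ToyWorkerState(msg => println(s"to master: $msg"))
    worker.launch(RunningDriver("driver-1", mem = 1024))
    worker.onStateChanged(StateChanged("driver-1", "FINISHED"))
    println(s"memoryUsed=${worker.memoryUsed}, finished=${worker.finishedDrivers.keys.mkString(",")}")
  }
}
```

One deliberate difference from the excerpt: the sketch uses `remove(...).foreach` instead of `remove(...).get`, so a state change for an unknown driver id is ignored rather than throwing.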

2: How an Executor is started on a Worker:
