事件監(jiān)聽模式一般需要定義3種組件:事件對(duì)象,事件源,事件監(jiān)聽器。在spark里面事件監(jiān)聽由ListenerBus組件負(fù)責(zé),ListenerBus是spark事件總線,spark中的事件監(jiān)聽由ListenerBus負(fù)責(zé),其中事件源是spark里面定義的各種事件,事件對(duì)象也即是這個(gè)事件對(duì)應(yīng)的Event,事件監(jiān)聽器就是負(fù)責(zé)處理這些事件的Listener
一:ListenerBus的初始化
ListenerBus的初始化是在SparkContext的初始化中完成的,SparkContext在初始化的時(shí)候需要將作業(yè)運(yùn)行的環(huán)境以及各種相關(guān)組件加載好,比如說sparkEnv,mapStatusManager,BlockManager,ShuffleManager,MetricSystem,ListenerBus等組件,sparkContext是spark作業(yè)的運(yùn)行的入口類,在sparkContext初始化的時(shí)候?qū)⑦@些組件都初始化好,所以ListenerBus作為事件總線,責(zé)任重大,當(dāng)然也會(huì)在這里初始化
private var _listenerBus: LiveListenerBus = _
listenerBus = new LiveListenerBus(_conf)
這里初始化的是LiveListenerBus對(duì)象,該對(duì)象是內(nèi)部維護(hù)了兩個(gè)隊(duì)列queues和queuedEvents
private val queues = new CopyOnWriteArrayList[AsyncEventQueue]()
// Visible for testing.
@volatile private[scheduler] var queuedEvents = new mutable.ListBuffer[SparkListenerEvent]()
ListenerBus事件監(jiān)聽就是通過這兩個(gè)隊(duì)列實(shí)現(xiàn)的,具體看看這兩個(gè)隊(duì)列是如何工作的
先來看看ListenerBus的post方法,也就將事件對(duì)象發(fā)送到事件隊(duì)列中
def post(event: SparkListenerEvent): Unit = {
if (stopped.get()) {
return
}
//此處是spark的測(cè)量系統(tǒng),該source是計(jì)數(shù)器,記錄多少個(gè)事件發(fā)送了
metrics.numEventsPosted.inc()
// If the event buffer is null, it means the bus has been started and we can avoid
// synchronization and post events directly to the queues. This should be the most
// common case during the life of the bus.
// queuedEvents如果為空的話,那么說明LiveListener已經(jīng)啟動(dòng)過了,那么而直接掉好用postToQueues方法,該方法內(nèi)部調(diào)用的AsyncEventQueue的post
// 方法,將event添加到AsyncEventQueue內(nèi)部維護(hù)的一個(gè)事件隊(duì)列中(private val eventQueue = new LinkedBlockingQueue[SparkListenerEvent](
// conf.get(LISTENER_BUS_EVENT_QUEUE_CAPACITY) ))
if (queuedEvents == null) {
postToQueues(event)
return
}
// Otherwise, need to synchronize to check whether the bus is started, to make sure the thread
// calling start() picks up the new event.
//如果LiveListener沒有啟動(dòng)過,在將事件天添加到queuedEvents隊(duì)列中,等待start()將事件發(fā)送的監(jiān)聽器
synchronized {
if (!started.get()) {
queuedEvents += event
return
}
}
// If the bus was already started when the check above was made, just post directly to the
// queues.
postToQueues(event)
}
上面的代碼解釋了事件對(duì)象添加到隊(duì)列中的原理,下面再來看看,事件是如何發(fā)送的監(jiān)聽器的,該過程由LiveListenerBus中的start()方法觸發(fā)
/**
* Start sending events to attached listeners.
*
* This first sends out all buffered events posted before this listener bus has started, then
* listens for any additional events asynchronously while the listener bus is still running.
* This should only be called once.
*
* @param sc Used to stop the SparkContext in case the listener thread dies.
*/
def start(sc: SparkContext, metricsSystem: MetricsSystem): Unit = synchronized {
//判斷l(xiāng)iveListenerBus是否已經(jīng)啟動(dòng)過,啟動(dòng)了則拋出異常
if (!started.compareAndSet(false, true)) {
throw new IllegalStateException("LiveListenerBus already started.")
}
this.sparkContext = sc
// 重點(diǎn)來了,此處先調(diào)用queues的start()方法,實(shí)則是啟動(dòng)了一個(gè)線程,
queues.asScala.foreach { q =>
q.start(sc)
queuedEvents.foreach(q.post)
}
queuedEvents = null
metricsSystem.registerSource(metrics)
}
該方法的作用一開始就說明了,將events發(fā)送到listeners。該方法內(nèi)部首先調(diào)用了q.start()方法,實(shí)則是啟動(dòng)了一個(gè)線程去輪詢的將event發(fā)送到監(jiān)聽器,queuedEvents.foreach(q.post) 和上面的postToQueues(event)的作用最后調(diào)用的都是post方法,將event加入到AsyncEventQueue內(nèi)部的一個(gè)eventQueue中。
下面重點(diǎn)看看AsyncEventQueue類 的start()方法是如何將event發(fā)送到listener中的
/**
* AsyncEventQueue 該類是SparkListenerBus的子類
**/
private[scheduler] def start(sc: SparkContext): Unit = {
if (started.compareAndSet(false, true)) {
this.sc = sc
dispatchThread.start()
} else {
throw new IllegalStateException(s"$name already started!")
}
}
private val dispatchThread = new Thread(s"spark-listener-group-$name") {
setDaemon(true)
override def run(): Unit = Utils.tryOrStopSparkContext(sc) {
dispatch()
}
}
private def dispatch(): Unit = LiveListenerBus.withinListenerThread.withValue(true) {
// 從eventQueue中取出待處理的event,并通過postToAll將事件發(fā)送到listener中
var next: SparkListenerEvent = eventQueue.take()
while (next != POISON_PILL) {
val ctx = processingTime.time()
try {
// 重點(diǎn)方法
super.postToAll(next)
} finally {
ctx.stop()
}
eventCount.decrementAndGet()
next = eventQueue.take()
}
eventCount.decrementAndGet()
}
再來看看super.postToAll
/**
* SparkListenerBus
*
**/
protected override def doPostEvent(
listener: SparkListenerInterface,
event: SparkListenerEvent): Unit = {
event match {
case stageSubmitted: SparkListenerStageSubmitted =>
listener.onStageSubmitted(stageSubmitted)
case stageCompleted: SparkListenerStageCompleted =>
listener.onStageCompleted(stageCompleted)
case jobStart: SparkListenerJobStart =>
listener.onJobStart(jobStart)
case jobEnd: SparkListenerJobEnd =>
listener.onJobEnd(jobEnd)
case taskStart: SparkListenerTaskStart =>
listener.onTaskStart(taskStart)
case taskGettingResult: SparkListenerTaskGettingResult =>
listener.onTaskGettingResult(taskGettingResult)
case taskEnd: SparkListenerTaskEnd =>
listener.onTaskEnd(taskEnd)
case environmentUpdate: SparkListenerEnvironmentUpdate =>
listener.onEnvironmentUpdate(environmentUpdate)
case blockManagerAdded: SparkListenerBlockManagerAdded =>
listener.onBlockManagerAdded(blockManagerAdded)
case blockManagerRemoved: SparkListenerBlockManagerRemoved =>
listener.onBlockManagerRemoved(blockManagerRemoved)
case unpersistRDD: SparkListenerUnpersistRDD =>
listener.onUnpersistRDD(unpersistRDD)
case applicationStart: SparkListenerApplicationStart =>
listener.onApplicationStart(applicationStart)
case applicationEnd: SparkListenerApplicationEnd =>
listener.onApplicationEnd(applicationEnd)
case metricsUpdate: SparkListenerExecutorMetricsUpdate =>
listener.onExecutorMetricsUpdate(metricsUpdate)
case executorAdded: SparkListenerExecutorAdded =>
listener.onExecutorAdded(executorAdded)
case executorRemoved: SparkListenerExecutorRemoved =>
listener.onExecutorRemoved(executorRemoved)
case executorBlacklistedForStage: SparkListenerExecutorBlacklistedForStage =>
listener.onExecutorBlacklistedForStage(executorBlacklistedForStage)
case nodeBlacklistedForStage: SparkListenerNodeBlacklistedForStage =>
listener.onNodeBlacklistedForStage(nodeBlacklistedForStage)
case executorBlacklisted: SparkListenerExecutorBlacklisted =>
listener.onExecutorBlacklisted(executorBlacklisted)
case executorUnblacklisted: SparkListenerExecutorUnblacklisted =>
listener.onExecutorUnblacklisted(executorUnblacklisted)
case nodeBlacklisted: SparkListenerNodeBlacklisted =>
listener.onNodeBlacklisted(nodeBlacklisted)
case nodeUnblacklisted: SparkListenerNodeUnblacklisted =>
listener.onNodeUnblacklisted(nodeUnblacklisted)
case blockUpdated: SparkListenerBlockUpdated =>
listener.onBlockUpdated(blockUpdated)
case speculativeTaskSubmitted: SparkListenerSpeculativeTaskSubmitted =>
listener.onSpeculativeTaskSubmitted(speculativeTaskSubmitted)
case _ => listener.onOtherEvent(event)
}
事件在SparkListenerBus中被處理掉,整個(gè)事件總線的處理流程完成。再來看看spark中事件總線涉及到類的關(guān)系圖

listenerBus事件總線主要涉及到上面的4個(gè)類,補(bǔ)充一點(diǎn)添加/移除監(jiān)聽器是用ListenerBus這個(gè)抽象類提供的addListener和removeListener來完成的,整個(gè)事件總線工作流程分析完成,再次說明,設(shè)計(jì)一個(gè)事件監(jiān)聽模型,最少要定義清楚3個(gè)組件:事件源,事件對(duì)象,事件監(jiān)聽器,最后就是怎么樣將事件對(duì)象發(fā)送到事件監(jiān)聽器中處理,可以參照spark里面的設(shè)計(jì),將事件對(duì)象緩存到一個(gè)隊(duì)列中,然后再由線程去輪詢這個(gè)隊(duì)列完成事件對(duì)象到事件監(jiān)聽的映射