Oozie Monitoring System: Instrumentation

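The first version of Oozie's monitoring was custom-built. Later, Oozie adopted Codahale Metrics, the currently mainstream monitoring framework. This article starts with Oozie's own Instrumentation and then turns to how it connects to Codahale Metrics.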

Oozie's own Instrumentation framework defines four types of metrics: Timers, Counters, Variables, and Samplers. Every metric is classified by two keys, group and name.

public class Instrumentation {
    private ScheduledExecutorService scheduler;
    private Lock counterLock;
    private Lock timerLock;
    private Lock variableLock;
    private Lock samplerLock;
    private Map<String, Map<String, Map<String, Object>>> all;
    private Map<String, Map<String, Element<Long>>> counters;
    private Map<String, Map<String, Element<Timer>>> timers;
    private Map<String, Map<String, Element<Variable>>> variables;
    private Map<String, Map<String, Element<Double>>> samplers;
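The group/name keying can be pictured as a map of maps. The following is a minimal sketch of that idea for counters only, not Oozie's actual implementation: the class `GroupNameRegistrySketch` and its methods `incr` and `get` are hypothetical names chosen for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: counters addressed by (group, name), mirroring the
// Map<group, Map<name, element>> layout of the fields above.
class GroupNameRegistrySketch {
    private final Map<String, Map<String, AtomicLong>> counters = new ConcurrentHashMap<>();

    // Adds delta to the counter at group/name, creating both levels on first use.
    long incr(String group, String name, long delta) {
        return counters
                .computeIfAbsent(group, g -> new ConcurrentHashMap<>())
                .computeIfAbsent(name, n -> new AtomicLong())
                .addAndGet(delta);
    }

    // Returns the current value, or 0 if the counter was never touched.
    long get(String group, String name) {
        Map<String, AtomicLong> byName = counters.get(group);
        AtomicLong counter = (byName == null) ? null : byName.get(name);
        return (counter == null) ? 0L : counter.get();
    }

    public static void main(String[] args) {
        GroupNameRegistrySketch registry = new GroupNameRegistrySketch();
        registry.incr("jobs", "submit", 1);
        registry.incr("jobs", "submit", 1);
        registry.incr("webservices", "jobs-GET", 1);
        System.out.println("jobs/submit = " + registry.get("jobs", "submit"));
        System.out.println("webservices/jobs-GET = " + registry.get("webservices", "jobs-GET"));
    }
}
```

The sketch uses concurrent maps instead of the explicit locks declared in the real class, purely to keep the example short.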

Definition of the Cron stopwatch:

/**
 * Cron is a stopwatch that can be started/stopped several times. <p/> This class is not thread safe, it does not
 * need to be. <p/> It keeps track of the total time (first start to last stop) and the running time (total time
 * minus the stopped intervals). <p/> Once a Cron is complete it must be added to the corresponding group/name in a
 * Instrumentation instance.
 */
public static class Cron {
    private long start;
    private long end;
    private long lapStart;
    private long own;
    private long total;
    private boolean running;
    /**
     * Creates new Cron, stopped, in zero.
     */
    public Cron() {
        running = false;
    }
    /**
     * Start the cron. It cannot be already started.
     */
    public void start() {
        if (!running) {
            if (lapStart == 0) {
                lapStart = System.currentTimeMillis();
                if (start == 0) {
                    start = lapStart;
                    end = start;
                }
            }
            running = true;
        }
    }
    /**
     * Stops the cron. It cannot be already stopped.
     */
    public void stop() {
        if (running) {
            end = System.currentTimeMillis();
            if (start == 0) {
                start = end;
            }
            total = end - start;
            if (lapStart > 0) {
                own += end - lapStart;
                lapStart = 0;
            }
            running = false;
        }
    }
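To see how the own time and total time diverge, the start/stop logic above can be replayed with fixed timestamps. This is a sketch under stated assumptions: `CronSketch` is a hypothetical copy of the logic with an injectable clock (the original reads `System.currentTimeMillis()` directly), and the timestamps are invented for the demonstration.

```java
import java.util.function.LongSupplier;

// Hypothetical re-implementation of the Cron logic above with an injectable
// clock, so the own/total arithmetic can be followed with fixed timestamps.
class CronSketch {
    private final LongSupplier clock;
    private long start, end, lapStart, own, total;
    private boolean running;

    CronSketch(LongSupplier clock) {
        this.clock = clock;
    }

    void start() {
        if (!running) {
            if (lapStart == 0) {
                lapStart = clock.getAsLong();
                if (start == 0) {
                    start = lapStart;
                    end = start;
                }
            }
            running = true;
        }
    }

    void stop() {
        if (running) {
            end = clock.getAsLong();
            if (start == 0) {
                start = end;
            }
            total = end - start;          // first start to last stop
            if (lapStart > 0) {
                own += end - lapStart;    // only the time spent running
                lapStart = 0;
            }
            running = false;
        }
    }

    long getOwn() { return own; }
    long getTotal() { return total; }

    // Runs two laps, 100..130 and 200..250, and returns {own, total}.
    static long[] demo() {
        long[] timestamps = {100, 130, 200, 250};
        int[] next = {0};
        CronSketch cron = new CronSketch(() -> timestamps[next[0]++]);
        cron.start();
        cron.stop();   // lap 1 runs for 30 ms
        cron.start();
        cron.stop();   // lap 2 runs for 50 ms, after a 70 ms pause
        return new long[] {cron.getOwn(), cron.getTotal()};
    }

    public static void main(String[] args) {
        long[] result = demo();
        System.out.println("own=" + result[0] + " total=" + result[1]); // prints own=80 total=150
    }
}
```

The 70 ms pause between the two laps counts toward the total time but not toward the own time.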

Counter definitions:

  • action.executors - Counters related to actions.
      • [action_type]#action.[operation_performed] (start, end, check, kill)
      • [action_type]#ex.[exception_type] (transient, non-transient, error, failed)
  • callablequeue - Count of events in various execution queues.
      • delayed.queued: Number of commands queued with a delay.
      • executed: Number of executions from the queue.
      • failed: Number of queue attempts which failed.
      • queued: Number of queued commands.
  • commands: Execution counts for various commands. This data is generated for all commands.
      • action.end
      • action.notification
      • action.start
      • callback
      • job.info
      • job.notification
      • purge
      • signal
      • start
      • submit
  • jobs: Job statistics.
      • start: Number of started jobs.
      • submit: Number of submitted jobs.
      • succeeded: Number of jobs which succeeded.
      • kill: Number of killed jobs.
  • authorization
      • failed: Number of failed authorization attempts.
  • webservices: Number of requests to various web services along with the request type.
      • failed: Total number of failed requests.
      • requests: Total number of requests.
      • admin
      • admin-GET
      • callback
      • callback-GET
      • jobs
      • jobs-GET
      • jobs-POST
      • version
      • version-GET

private static class Counter extends AtomicLong implements Element<Long> {
    /**
     * Return the counter snapshot.
     *
     * @return the counter snapshot.
     */
    public Long getValue() {
        return get();
    }
    /**
     * Return the String representation of the counter value.
     *
     * @return the String representation of the counter value.
     */
    public String toString() {
        return Long.toString(get());
    }
}
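Because `Counter` extends `AtomicLong`, increments from several threads do not lose updates even without an external lock around the increment itself. The following sketch illustrates that property with a plain `AtomicLong`; the class `AtomicCounterSketch` and the method `countConcurrently` are hypothetical names, not part of Oozie.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: several threads increment one shared AtomicLong.
class AtomicCounterSketch {
    // Starts the given number of threads, each incrementing the same counter,
    // waits for all of them, and returns the final counter value.
    static long countConcurrently(int threads, int incrementsPerThread) {
        AtomicLong counter = new AtomicLong();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    counter.incrementAndGet();
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        // 4 threads x 10,000 increments each.
        System.out.println(countConcurrently(4, 10_000)); // prints 40000
    }
}
```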

Timer definitions:

  • action.executors - Timers related to actions.
      • [action_type]#action.[operation_performed] (start, end, check, kill)
  • callablequeue
      • time.in.queue: Time a callable spent in the queue before being processed.
  • commands: Generated for all commands.
      • action.end
      • action.notification
      • action.start
      • callback
      • job.info
      • job.notification
      • purge
      • signal
      • start
      • submit
  • db - Timers related to various database operations.
      • create-workflow
      • load-action
      • load-pending-actions
      • load-running-actions
      • load-workflow
      • load-workflows
      • purge-old-workflows
      • save-action
      • update-action
      • update-workflow
  • webservices
      • admin
      • admin-GET
      • callback
      • callback-GET
      • jobs
      • jobs-GET
      • jobs-POST
      • version
      • version-GET

public static class Timer implements Element<Timer> {
    Lock lock = new ReentrantLock();
    private long ownTime;
    private long totalTime;
    private long ticks;
    private long ownSquareTime;
    private long totalSquareTime;
    private long ownMinTime;
    private long ownMaxTime;
    private long totalMinTime;
    private long totalMaxTime;
    /**
     * Timer constructor. <p/> It is project private for test purposes.
     */
    Timer() {    }
    /**
     * Return the String representation of the timer value.
     *
     * @return the String representation of the timer value.
     */
    public String toString() {
        return XLog.format("ticks[{0}] totalAvg[{1}] ownAvg[{2}]", ticks, getTotalAvg(), getOwnAvg());
    }
    /**
     * Return the timer snapshot.
     *
     * @return the timer snapshot.
     */
    public Timer getValue() {
        try {
            lock.lock();
            Timer timer = new Timer();
            timer.ownTime = ownTime;
            timer.totalTime = totalTime;
            timer.ticks = ticks;
            timer.ownSquareTime = ownSquareTime;
            timer.totalSquareTime = totalSquareTime;
            timer.ownMinTime = ownMinTime;
            timer.ownMaxTime = ownMaxTime;
            timer.totalMinTime = totalMinTime;
            timer.totalMaxTime = totalMaxTime;
            return timer;
        }
        finally {
            lock.unlock();
        }
    }
    /**
     * Add a cron to a timer. <p/> It is project private for test purposes.
     *
     * @param cron Cron to add.
     */
    void addCron(Cron cron) {
        try {
            lock.lock();
            long own = cron.getOwn();
            long total = cron.getTotal();
            ownTime += own;
            totalTime += total;
            ticks++;
            ownSquareTime += own * own;
            totalSquareTime += total * total;
            if (ticks == 1) {
                ownMinTime = own;
                ownMaxTime = own;
                totalMinTime = total;
                totalMaxTime = total;
            }
            else {
                ownMinTime = Math.min(ownMinTime, own);
                ownMaxTime = Math.max(ownMaxTime, own);
                totalMinTime = Math.min(totalMinTime, total);
                totalMaxTime = Math.max(totalMaxTime, total);
            }
        }
        finally {
            lock.unlock();
        }
    }
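Keeping the running sum, the sum of squares, and the tick count is enough to derive an average and a standard deviation without storing individual measurements. The sketch below shows that arithmetic for one series; `TimerStatsSketch` and its method names are hypothetical, and the population formula sqrt(E[x^2] - E[x]^2) is an assumption for illustration and may differ from the formula Oozie actually uses.

```java
// Hypothetical sketch: statistics derived from running sums, as the
// totalTime / totalSquareTime / ticks fields above allow.
class TimerStatsSketch {
    private long sum;        // plays the role of totalTime
    private long squareSum;  // plays the role of totalSquareTime
    private long ticks;

    // Records one elapsed time, updating the three running aggregates.
    void add(long elapsedMillis) {
        sum += elapsedMillis;
        squareSum += elapsedMillis * elapsedMillis;
        ticks++;
    }

    // Mean of all recorded values, or 0 when nothing has been recorded.
    double avg() {
        return ticks == 0 ? 0.0 : (double) sum / ticks;
    }

    // Population standard deviation: sqrt(E[x^2] - E[x]^2).
    double stdDev() {
        if (ticks == 0) {
            return 0.0;
        }
        double mean = avg();
        double variance = (double) squareSum / ticks - mean * mean;
        return Math.sqrt(Math.max(variance, 0.0)); // guard against rounding below zero
    }

    public static void main(String[] args) {
        TimerStatsSketch stats = new TimerStatsSketch();
        stats.add(10);
        stats.add(20);
        stats.add(30);
        System.out.println("avg=" + stats.avg() + " stdDev=" + stats.stdDev());
    }
}
```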

Variable definitions:

  • oozie
      • version: Oozie build version.
  • configuration
      • config.dir: Directory from which the configuration files are loaded. If null, all configuration files are loaded from the classpath.
      • config.file: The Oozie custom configuration for the instance.
  • jvm
      • free.memory
      • max.memory
      • total.memory
  • locks
      • locks: Locks are used by Oozie to synchronize access to workflow and action entries when the database being used does not support 'select for update' queries. (MySQL supports 'select for update'.)
  • logging
      • config.file: Log4j '.properties' configuration file.
      • from.classpath: Whether the config file has been read from the classpath or from the config directory.
      • reload.interval: Interval at which the config file is reloaded. 0 if the config file is never reloaded; a file loaded from the classpath is never reloaded.

public interface Variable<T> extends Element<T> {}
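A variable is simply a value that is computed on demand whenever it is read. As a rough sketch of how the jvm group could be fed, the example below wraps `java.lang.Runtime` calls in variables. The nested `Element` and `Variable` interfaces are local stand-ins for the ones quoted above, and `JvmVariablesSketch` with its factory methods is a hypothetical name, not Oozie code.

```java
// Hypothetical sketch: JVM memory figures exposed as on-demand variables.
class JvmVariablesSketch {
    // Local stand-ins for the Element and Variable interfaces quoted above.
    interface Element<T> {
        T getValue();
    }

    interface Variable<T> extends Element<T> {
    }

    // Each variable re-reads the Runtime every time getValue() is called.
    static Variable<Long> freeMemory() {
        return () -> Runtime.getRuntime().freeMemory();
    }

    static Variable<Long> maxMemory() {
        return () -> Runtime.getRuntime().maxMemory();
    }

    static Variable<Long> totalMemory() {
        return () -> Runtime.getRuntime().totalMemory();
    }

    public static void main(String[] args) {
        System.out.println("free.memory  = " + freeMemory().getValue());
        System.out.println("max.memory   = " + maxMemory().getValue());
        System.out.println("total.memory = " + totalMemory().getValue());
    }
}
```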

Sampler definitions:

  • callablequeue
      • delayed.queue.size: The size of the delayed command queue.
      • queue.size: The size of the command queue.
      • threads.active: The number of threads processing callables.
  • jdbc
      • connections.active: Active connections over the past minute.
  • webservices: Requests to the Oozie HTTP endpoints over the last minute.
      • admin
      • callback
      • job
      • jobs
      • requests
      • version

private static class Sampler implements Element<Double>, Runnable {
    private Lock lock = new ReentrantLock();
    private int samplingInterval;
    private Variable<Long> variable;
    private long[] values;
    private int current;
    private long valuesSum;
    private double rate;
    public Sampler(int samplingPeriod, int samplingInterval, Variable<Long> variable) {
        this.samplingInterval = samplingInterval;
        this.variable = variable;
        values = new long[samplingPeriod / samplingInterval];
        valuesSum = 0;
        current = -1;
    }
    public int getSamplingInterval() {
        return samplingInterval;
    }
    public void run() {
        try {
            lock.lock();
            long newValue = variable.getValue();
            if (current == -1) {
                valuesSum = newValue;
                current = 0;
                values[current] = newValue;
            }
            else {
                current = (current + 1) % values.length;
                valuesSum = valuesSum - values[current] + newValue;
                values[current] = newValue;
            }
            rate = ((double) valuesSum) / values.length;
        }
        finally {
            lock.unlock();
        }
    }
    public Double getValue() {
        return rate;
    }
}
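The sampler above is a fixed-size ring buffer whose sum is updated incrementally, so each sample costs constant time. The sketch below replays that update rule with hand-fed values instead of a scheduled `Variable`; `RollingSamplerSketch` and its `sample` method are hypothetical names, and the locking of the original is left out because the example is single-threaded.

```java
// Hypothetical sketch of the ring-buffer averaging used by the Sampler above.
class RollingSamplerSketch {
    private final long[] values;
    private int current = -1;
    private long valuesSum;
    private double rate;

    // The number of slots is samplingPeriod / samplingInterval, as in the original.
    RollingSamplerSketch(int samplingPeriod, int samplingInterval) {
        values = new long[samplingPeriod / samplingInterval];
    }

    // Stores one sample, overwriting the oldest slot once the buffer has wrapped.
    void sample(long newValue) {
        if (current == -1) {
            valuesSum = newValue;
            current = 0;
            values[current] = newValue;
        } else {
            current = (current + 1) % values.length;
            valuesSum = valuesSum - values[current] + newValue;
            values[current] = newValue;
        }
        // Always divides by the slot count, even before the buffer is full.
        rate = ((double) valuesSum) / values.length;
    }

    double getValue() {
        return rate;
    }

    public static void main(String[] args) {
        RollingSamplerSketch sampler = new RollingSamplerSketch(60, 20); // 3 slots
        for (long v : new long[] {3, 6, 9, 12}) {
            sampler.sample(v);
            System.out.println("sampled " + v + " -> rate " + sampler.getValue());
        }
    }
}
```

With three slots and the samples 3, 6, 9, 12, the rate goes 1.0, 3.0, 6.0, 9.0. The first two figures are lower than the mean of the values seen so far because the divisor is the slot count rather than the number of samples taken.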