
Reposted from: http://matt33.com/2017/06/25/kafka-producer-send-module/
Kafka is currently the most widely used message queue in the big-data field, and many aspects of its internal implementation and design are worth studying in depth.

In Kafka 0.10.2, the client is implemented in Java and the server in Scala. The client is the part of Kafka that users touch first, so this planned series of source-code analyses starts with the client, and within the client with the Producer. This post covers how the Producer's send model is implemented.

Using the Producer

Before analyzing the Producer's send model, let's look at how a user writes data to Kafka with a Producer. Below is a minimal Producer example.

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.Producer;

import java.util.Properties;

public class ProducerTest {
    private static String topicName;
    private static int msgNum;
    private static int key;

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "127.0.0.1:9092,127.0.0.2:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        topicName = "test";
        msgNum = 10; // number of messages to send

        Producer<String, String> producer = new KafkaProducer<>(props);
        for (int i = 0; i < msgNum; i++) {
            String msg = i + " This is matt's blog.";
            producer.send(new ProducerRecord<String, String>(topicName, msg));
        }
        producer.close();
    }
}

As the code above shows, Kafka gives users a very simple API. Using it takes only two steps:

1. Initialize a KafkaProducer instance;
2. Call the send interface to send data.

This post is organized around how the Producer implements the send interface internally.

Producer data send flow

The following walks through the send source code step by step to trace how the Producer sends data.

The Producer's send implementation

Users send data directly with producer.send(), so let's first look at the implementation of send():

// Asynchronously send a record to a topic
@Override
public Future<RecordMetadata> send(ProducerRecord<K, V> record) {
    return send(record, null);
}

// Asynchronously send a record to a topic, and invoke the callback once the send has been acknowledged
@Override
public Future<RecordMetadata> send(ProducerRecord<K, V> record, Callback callback) {
    // intercept the record, which can be potentially modified; this method does not throw exceptions
    ProducerRecord<K, V> interceptedRecord = this.interceptors == null ? record : this.interceptors.onSend(record);
    return doSend(interceptedRecord, callback);
}

In the end, sending data is implemented by calling the Producer's doSend() method.
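The interceptor step in send() can be pictured with a small stand-alone sketch. Everything in it is hypothetical (the class name `InterceptorChainSketch`, the use of `UnaryOperator<String>` as a stand-in for an interceptor, and a plain string as the record); it only mirrors the shape of the call: each interceptor may return a modified record, and a failing interceptor is skipped instead of breaking the send, which is what the "does not throw exceptions" comment suggests.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical stand-in for the interceptor chain; not Kafka's own class.
class InterceptorChainSketch {
    private final List<UnaryOperator<String>> interceptors = new ArrayList<>();

    InterceptorChainSketch add(UnaryOperator<String> interceptor) {
        interceptors.add(interceptor);
        return this;
    }

    // Pass the record through every interceptor in order.
    // A failing interceptor is skipped so that sending can continue.
    String onSend(String record) {
        String current = record;
        for (UnaryOperator<String> interceptor : interceptors) {
            try {
                current = interceptor.apply(current);
            } catch (RuntimeException e) {
                System.err.println("interceptor failed, keeping previous record: " + e);
            }
        }
        return current;
    }

    public static void main(String[] args) {
        InterceptorChainSketch chain = new InterceptorChainSketch()
                .add(r -> r + " [traced]")
                .add(r -> { throw new IllegalStateException("broken interceptor"); })
                .add(String::toUpperCase);
        // The broken interceptor is skipped; the other two still apply.
        System.out.println(chain.onSend("hello"));
    }
}
```

The record that leaves the chain is what doSend() receives, so any later step (serialization, partitioning) sees the intercepted version rather than the original.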

The Producer's doSend implementation

Below is the implementation of doSend():

private Future<RecordMetadata> doSend(ProducerRecord<K, V> record, Callback callback) {
       TopicPartition tp = null;
       try {
           // 1. Make sure the metadata for the topic the record is sent to is available
           ClusterAndWaitTime clusterAndWaitTime = waitOnMetadata(record.topic(), record.partition(), maxBlockTimeMs);
           long remainingWaitMs = Math.max(0, maxBlockTimeMs - clusterAndWaitTime.waitedOnMetadataMs);
           Cluster cluster = clusterAndWaitTime.cluster;
           // 2. Serialize the record's key and value
           byte[] serializedKey;
           try {
               serializedKey = keySerializer.serialize(record.topic(), record.key());
           } catch (ClassCastException cce) {
               throw new SerializationException("Can't convert key of class " + record.key().getClass().getName() +
                       " to class " + producerConfig.getClass(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG).getName() +
                       " specified in key.serializer");
           }
           byte[] serializedValue;
           try {
               serializedValue = valueSerializer.serialize(record.topic(), record.value());
           } catch (ClassCastException cce) {
               throw new SerializationException("Can't convert value of class " + record.value().getClass().getName() +
                       " to class " + producerConfig.getClass(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG).getName() +
                       " specified in value.serializer");
           }

           // 3. Determine the record's partition (either specified explicitly or computed by the partitioner)
           int partition = partition(record, serializedKey, serializedValue, cluster);
           int serializedSize = Records.LOG_OVERHEAD + Record.recordSize(serializedKey, serializedValue);
           ensureValidRecordSize(serializedSize); // throws RecordTooLargeException if the record exceeds the size limit or the memory limit
           tp = new TopicPartition(record.topic(), partition);
           long timestamp = record.timestamp() == null ? time.milliseconds() : record.timestamp(); // timestamp
           log.trace("Sending record {} with callback {} to topic {} partition {}", record, callback, record.topic(), partition);
           Callback interceptCallback = this.interceptors == null ? callback : new InterceptorCallback<>(callback, this.interceptors, tp);
           // 4. Append the record to the accumulator
           RecordAccumulator.RecordAppendResult result = accumulator.append(tp, timestamp, serializedKey, serializedValue, interceptCallback, remainingWaitMs);
           // 5. If the batch is full or a new batch was created, wake up the sender thread to send data
           if (result.batchIsFull || result.newBatchCreated) {
               log.trace("Waking up the sender since topic {} partition {} is either full or getting a new batch", record.topic(), partition);
               this.sender.wakeup();
           }
           return result.future;
       } catch (ApiException e) {
           log.debug("Exception occurred during message send:", e);
           if (callback != null)
               callback.onCompletion(null, e);
           this.errors.record();
           if (this.interceptors != null)
               this.interceptors.onSendError(record, tp, e);
           return new FutureFailure(e);
       } catch (InterruptedException e) {
           this.errors.record();
           if (this.interceptors != null)
               this.interceptors.onSendError(record, tp, e);
           throw new InterruptException(e);
       } catch (BufferExhaustedException e) {
           this.errors.record();
           this.metrics.sensor("buffer-exhausted-records").record();
           if (this.interceptors != null)
               this.interceptors.onSendError(record, tp, e);
           throw e;
       } catch (KafkaException e) {
           this.errors.record();
           if (this.interceptors != null)
               this.interceptors.onSendError(record, tp, e);
           throw e;
       } catch (Exception e) {
           if (this.interceptors != null)
               this.interceptors.onSendError(record, tp, e);
           throw e;
       }
   }

In doSend(), sending one record breaks down into the following five steps:

1. Make sure the metadata for the target topic is available (it is available if the partition's leader exists and, when authorization is enabled, the client has the required permissions). If the topic's metadata is missing, it has to be fetched first;
2. Serialize the record's key and value;
3. Determine the partition the record goes to (either specified explicitly or computed by the partitioner);
4. Append the record to the accumulator, where the data is buffered first;
5. If, after the append, the corresponding RecordBatch has reached batch.size (or its remaining space cannot hold the next record), wake up the sender thread to send the data. As the code above shows, the sender is also woken when a new batch was created.

That is the send process in five points. The sections below analyze the implementation of each part in detail.

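Send process in detail

Fetching the topic's metadata

The Producer obtains the metadata for a topic through the waitOnMetadata() method. A separate post will cover this part, so it is not detailed here. In short: before sending data, the Producer must first confirm that the topic is available.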
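One detail of this step is visible in doSend() itself: the time spent in waitOnMetadata() is subtracted from maxBlockTimeMs, and only the remainder is passed on to accumulator.append(), so both blocking steps share a single budget. A minimal sketch of that arithmetic follows; the class name and the numbers are made up for illustration.

```java
// Hypothetical helper illustrating the shared blocking budget in doSend(); not Kafka code.
class BlockBudgetSketch {
    // Time left for buffer allocation after the metadata wait, never negative.
    static long remainingWaitMs(long maxBlockTimeMs, long waitedOnMetadataMs) {
        return Math.max(0, maxBlockTimeMs - waitedOnMetadataMs);
    }

    public static void main(String[] args) {
        long maxBlockTimeMs = 60_000L; // assumed budget for this example
        System.out.println(remainingWaitMs(maxBlockTimeMs, 1_500L));  // 58500
        System.out.println(remainingWaitMs(maxBlockTimeMs, 61_000L)); // 0
    }
}
```

A slow metadata fetch therefore shortens the time a send may later block while waiting for buffer memory.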

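Serializing the key and value
The Producer serializes the record's key and value, and the Consumer performs the matching deserialization. The serializers and deserializers that ship with Kafka are shown in the figure below:

[Figure: serializers and deserializers provided by Kafka]

You can of course supply your own serializer implementation, but in most cases the ones Kafka provides are sufficient.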
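To give a feel for what a custom serializer involves, here is a self-contained sketch. The interfaces `SketchSerializer` and `SketchDeserializer` are hypothetical and reduced: they mirror only the serialize(topic, data) call seen in doSend(), not the full Kafka interfaces. The `UserEvent` type and its byte layout are likewise invented for the example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical, reduced interfaces; the real Kafka interfaces have additional methods.
interface SketchSerializer<T> {
    byte[] serialize(String topic, T data);
}

interface SketchDeserializer<T> {
    T deserialize(String topic, byte[] data);
}

// Example payload type, made up for this sketch.
final class UserEvent {
    final int userId;
    final String action;

    UserEvent(int userId, String action) {
        this.userId = userId;
        this.action = action;
    }
}

// Byte layout: 4-byte user id followed by the UTF-8 bytes of the action.
class UserEventSerde implements SketchSerializer<UserEvent>, SketchDeserializer<UserEvent> {
    @Override
    public byte[] serialize(String topic, UserEvent data) {
        if (data == null) return null;
        byte[] action = data.action.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(4 + action.length).putInt(data.userId).put(action).array();
    }

    @Override
    public UserEvent deserialize(String topic, byte[] data) {
        if (data == null) return null;
        ByteBuffer buf = ByteBuffer.wrap(data);
        int userId = buf.getInt();
        byte[] action = new byte[buf.remaining()];
        buf.get(action);
        return new UserEvent(userId, new String(action, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        UserEventSerde serde = new UserEventSerde();
        byte[] bytes = serde.serialize("test", new UserEvent(42, "login"));
        UserEvent back = serde.deserialize("test", bytes);
        System.out.println(back.userId + " " + back.action);
    }
}
```

The point of the round trip is that producer and consumer must agree on the byte layout; the topic argument is unused here but is available if the layout should differ per topic.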

Determining the partition
The partition is computed in one of three ways:

1. If a partition is specified, that value is used directly;
2. If no partition is specified but there is a key, the hash of the key modulo the topic's partition count gives the partition;
3. If there is neither a partition nor a key, a random integer is generated on the first call (and incremented on every later call), and this value modulo the topic's number of available partitions gives the partition. This is the familiar round-robin algorithm.

The implementation is as follows:

// If the record specifies a partition, return it directly; otherwise call the partitioner's partition() method to compute it (KafkaProducer.class)
private int partition(ProducerRecord<K, V> record, byte[] serializedKey, byte[] serializedValue, Cluster cluster) {
    Integer partition = record.partition();
    return partition != null ?
            partition :
            partitioner.partition(
                    record.topic(), record.key(), serializedKey, record.value(), serializedValue, cluster);
}

The Producer's default partitioner is org.apache.kafka.clients.producer.internals.DefaultPartitioner, and users can also define their own partitioning strategy. Below are the implementations of this class's two methods:

public int partition(String topic, Object key, byte[] keyBytes, Object value, byte[] valueBytes, Cluster cluster) {
    List<PartitionInfo> partitions = cluster.partitionsForTopic(topic);
    int numPartitions = partitions.size();
    if (keyBytes == null) { // no key specified
        int nextValue = nextValue(topic); // a random integer on the first call, incremented on each later call
        List<PartitionInfo> availablePartitions = cluster.availablePartitionsForTopic(topic);
        // a partition is available when its leader is not null
        if (availablePartitions.size() > 0) {
            int part = Utils.toPositive(nextValue) % availablePartitions.size();
            return availablePartitions.get(part).partition();
        } else {
            return Utils.toPositive(nextValue) % numPartitions;
        }
    } else { // a key is present: use the hash of the key
        return Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions; // pick the partition from the key's hash
    }
}

// Look up the integer counter for the topic
private int nextValue(String topic) {
    AtomicInteger counter = topicCounterMap.get(topic);
    if (null == counter) { // first call: start from a random value
        counter = new AtomicInteger(new Random().nextInt());
        AtomicInteger currentCounter = topicCounterMap.putIfAbsent(topic, counter);
        if (currentCounter != null) {
            counter = currentCounter;
        }
    }
    return counter.getAndIncrement(); // later calls increment the previous value
}

That is the Producer's default partitioner implementation.
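The two branches can be tried out in isolation with the simplified sketch below. It is not the DefaultPartitioner: the class name is hypothetical, `Arrays.hashCode` is an assumption standing in for the murmur2 hash, the counter starts from a caller-supplied value rather than a random one, and the distinction between all partitions and available partitions is left out. The `toPositive` helper clears the sign bit, which is one way to keep the modulo result non-negative even for Integer.MIN_VALUE, where Math.abs would still return a negative number.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

// Simplified stand-in for the default partitioning logic; not Kafka's DefaultPartitioner.
class PartitionChoiceSketch {
    private final AtomicInteger counter;

    PartitionChoiceSketch(int start) {
        this.counter = new AtomicInteger(start); // the code above starts from a random value instead
    }

    // Clear the sign bit so the result is never negative.
    static int toPositive(int number) {
        return number & 0x7fffffff;
    }

    int partition(byte[] keyBytes, int numPartitions) {
        if (keyBytes == null) {
            // No key: step through the partitions using the counter.
            return toPositive(counter.getAndIncrement()) % numPartitions;
        }
        // Keyed: the same key bytes always map to the same partition.
        // Arrays.hashCode is an assumption standing in for the murmur2 hash.
        return toPositive(Arrays.hashCode(keyBytes)) % numPartitions;
    }

    public static void main(String[] args) {
        PartitionChoiceSketch p = new PartitionChoiceSketch(0);
        for (int i = 0; i < 4; i++) {
            System.out.println("no key -> partition " + p.partition(null, 3));
        }
        System.out.println("Math.abs(Integer.MIN_VALUE) = " + Math.abs(Integer.MIN_VALUE));
        System.out.println("toPositive(Integer.MIN_VALUE) = " + toPositive(Integer.MIN_VALUE));
    }
}
```

Because the keyed branch depends on the partition count, adding partitions to a topic changes which partition a given key maps to.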

Writing data to the accumulator

The Producer first writes the record into a buffer and, once a batch reaches batch.size, wakes up the sender thread to send the RecordBatch (step 5). Let's first look in detail at how the Producer writes data into the buffer.

The Producer appends data through a RecordAccumulator instance. The RecordAccumulator model is shown in the figure below. One important field is ConcurrentMap<TopicPartition, Deque<RecordBatch>> batches: each TopicPartition has its own Deque<RecordBatch>. When data is added, the record is appended to the most recently created RecordBatch in the queue for its topic-partition; when data is sent, sending starts from the oldest RecordBatch in the queue.

[Figure: RecordAccumulator model]

// org.apache.kafka.clients.producer.internals.RecordAccumulator
// Append a record to the accumulator and return the append result (mainly the future metadata, a flag for whether the batch is full, and a flag for whether a new batch was created). maxTimeToBlock is the maximum time to block waiting for buffer memory (buffer.memory).
public RecordAppendResult append(TopicPartition tp,
                                 long timestamp,
                                 byte[] key,
                                 byte[] value,
                                 Callback callback,
                                 long maxTimeToBlock) throws InterruptedException {
    appendsInProgress.incrementAndGet();
    try {
        Deque<RecordBatch> dq = getOrCreateDeque(tp); // one queue per topic-partition
        synchronized (dq) { // operations on a queue are synchronized for thread safety
            if (closed)
                throw new IllegalStateException("Cannot send after the producer is closed.");
            RecordAppendResult appendResult = tryAppend(timestamp, key, value, callback, dq); // try to append
            if (appendResult != null) // the record fit into an existing batch
                return appendResult;
        }

        // Create a new RecordBatch for the topic-partition. Its buffer is sized as
        // max(batch.size, size of this record including the log overhead)
        int size = Math.max(this.batchSize, Records.LOG_OVERHEAD + Record.recordSize(key, value));
        log.trace("Allocating a new {} byte message buffer for topic {} partition {}", size, tp.topic(), tp.partition());
        ByteBuffer buffer = free.allocate(size, maxTimeToBlock); // allocate a buffer for this RecordBatch
        synchronized (dq) {
            if (closed)
                throw new IllegalStateException("Cannot send after the producer is closed.");

            RecordAppendResult appendResult = tryAppend(timestamp, key, value, callback, dq);
            if (appendResult != null) { // a batch with room appeared in the meantime, so release the buffer just allocated
                free.deallocate(buffer);
                return appendResult;
            }
            // Create a RecordBatch for the topic-partition
            MemoryRecordsBuilder recordsBuilder = MemoryRecords.builder(buffer, compression, TimestampType.CREATE_TIME, this.batchSize);
            RecordBatch batch = new RecordBatch(tp, recordsBuilder, time.milliseconds());
            // Append the record to the new RecordBatch
            FutureRecordMetadata future = Utils.notNull(batch.tryAppend(timestamp, key, value, callback, time.milliseconds()));

            dq.addLast(batch); // add the RecordBatch to its queue
            incomplete.add(batch); // track the batch in the set of not-yet-acknowledged batches
            // dq.size() > 1 means the queue holds an older batch that can be sent
            return new RecordAppendResult(future, dq.size() > 1 || batch.isFull(), true);
        }
    } finally {
        appendsInProgress.decrementAndGet();
    }
}

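To summarize, the record write path is shown in the figure below:

[Figure: record append flow in RecordAccumulator]

1. Get the queue for the topic-partition, creating an empty queue if there is none;
2. Try to append to the queue: take the most recently added RecordBatch; if it does not exist, or exists but lacks the space for this record, return null; if the write succeeds, return the result and the append is done;
3. Create a new RecordBatch whose initial memory size is max(batch.size, Records.LOG_OVERHEAD + Record.recordSize(key, value)), which guards against a single oversized record;
4. Write the record into the new RecordBatch, add the RecordBatch to the queue, and return the result; the append is done.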
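The four steps above can be modelled with plain JDK collections. The sketch below is a simplification under stated assumptions: all names are hypothetical, a record is represented only by its size in bytes, the topic-partition is a string, and the lock is held across both append attempts, whereas the real method releases it while allocating the buffer and then retries.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Simplified model of the append path; all names are hypothetical, not Kafka classes.
class AccumulatorSketch {
    // A batch with a fixed capacity that tracks how many bytes have been written.
    static final class Batch {
        final int capacity;
        int used;
        int records;

        Batch(int capacity) { this.capacity = capacity; }

        boolean tryAppend(int recordSize) {
            if (used + recordSize > capacity) return false;
            used += recordSize;
            records++;
            return true;
        }

        boolean isFull() { return used >= capacity; }
    }

    static final class AppendResult {
        final boolean batchIsFull;
        final boolean newBatchCreated;

        AppendResult(boolean batchIsFull, boolean newBatchCreated) {
            this.batchIsFull = batchIsFull;
            this.newBatchCreated = newBatchCreated;
        }
    }

    private final int batchSize;
    private final ConcurrentMap<String, Deque<Batch>> batches = new ConcurrentHashMap<>();

    AccumulatorSketch(int batchSize) { this.batchSize = batchSize; }

    AppendResult append(String topicPartition, int recordSize) {
        // Step 1: get the queue for this topic-partition, creating it if absent.
        Deque<Batch> dq = batches.computeIfAbsent(topicPartition, tp -> new ArrayDeque<>());
        synchronized (dq) {
            // Step 2: try the newest batch first.
            Batch last = dq.peekLast();
            if (last != null && last.tryAppend(recordSize)) {
                return new AppendResult(dq.size() > 1 || last.isFull(), false);
            }
            // Step 3: size the new batch so that an oversized record still fits.
            Batch batch = new Batch(Math.max(batchSize, recordSize));
            // Step 4: write the record and enqueue the batch.
            batch.tryAppend(recordSize);
            dq.addLast(batch);
            return new AppendResult(dq.size() > 1 || batch.isFull(), true);
        }
    }

    int queuedBatches(String topicPartition) {
        Deque<Batch> dq = batches.get(topicPartition);
        if (dq == null) return 0;
        synchronized (dq) { return dq.size(); }
    }
}
```

With a batch size of 100, appending three 40-byte records to the same topic-partition fills the first batch to 80 bytes and puts the third record into a second batch; at that point the queue holds two batches, so the result reports that something is ready to send.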

Sending the RecordBatch

Once the record has been written, if a RecordBatch meets the conditions for sending (typically the queue holds more than one batch, in which case the oldest batches are certainly ready to send), the sender thread is woken up to send the RecordBatch.

The sender thread handles RecordBatches in its run() method, implemented as follows:

void run(long now) {
        Cluster cluster = metadata.fetch();
        // Get the nodes that have RecordBatches ready to send
        RecordAccumulator.ReadyCheckResult result = this.accumulator.ready(cluster, now);

        // If the leader of any topic-partition is unknown, force a metadata update
        if (!result.unknownLeaderTopics.isEmpty()) {
            for (String topic : result.unknownLeaderTopics)
                this.metadata.add(topic);
            this.metadata.requestUpdate();
        }

        // If a node is not connected (a connection is initiated where possible), it cannot receive data for now, so remove it from the ready set
        Iterator<Node> iter = result.readyNodes.iterator();
        long notReadyTimeout = Long.MAX_VALUE;
        while (iter.hasNext()) {
            Node node = iter.next();
            if (!this.client.ready(node, now)) {
                iter.remove();
                notReadyTimeout = Math.min(notReadyTimeout, this.client.connectionDelay(node, now));
            }
        }

        // Collect, per node, all RecordBatches that can be sent (keyed by node.id), removing each RecordBatch from its queue
        Map<Integer, List<RecordBatch>> batches = this.accumulator.drain(cluster, result.readyNodes, this.maxRequestSize, now);
        if (guaranteeMessageOrder) {
            // Mute the partitions of the batches about to be sent, so ordering is preserved
            for (List<RecordBatch> batchList : batches.values()) {
                for (RecordBatch batch : batchList)
                    this.accumulator.mutePartition(batch.topicPartition);
            }
        }

        // Abort RecordBatches whose send has timed out because metadata was unavailable
        List<RecordBatch> expiredBatches = this.accumulator.abortExpiredBatches(this.requestTimeout, now);
        for (RecordBatch expiredBatch : expiredBatches)
            this.sensors.recordErrors(expiredBatch.topicPartition.topic(), expiredBatch.recordCount);

        sensors.updateProduceRequestMetrics(batches);

        long pollTimeout = Math.min(result.nextReadyCheckDelayMs, notReadyTimeout);
        if (!result.readyNodes.isEmpty()) {
            log.trace("Nodes with data ready to send: {}", result.readyNodes);
            pollTimeout = 0;
        }
        // Send the RecordBatches
        sendProduceRequests(batches, now);

        this.client.poll(pollTimeout, now); // the actual socket reads and writes (including metadata updates)
    }

Much of this code is other bookkeeping, such as removing nodes that are temporarily unavailable and handling RecordBatches that timed out because metadata was unavailable. The method that handles sending the RecordBatches is sendProduceRequests(batches, now):

/**
 * Transfer the record batches into a list of produce requests on a per-node basis
 */
private void sendProduceRequests(Map<Integer, List<RecordBatch>> collated, long now) {
    for (Map.Entry<Integer, List<RecordBatch>> entry : collated.entrySet())
        sendProduceRequest(now, entry.getKey(), acks, requestTimeout, entry.getValue());
}

/**
 * Create a produce request from the given record batches
 */
// Send the produce request
private void sendProduceRequest(long now, int destination, short acks, int timeout, List<RecordBatch> batches) {
    Map<TopicPartition, MemoryRecords> produceRecordsByPartition = new HashMap<>(batches.size());
    final Map<TopicPartition, RecordBatch> recordsByPartition = new HashMap<>(batches.size());
    for (RecordBatch batch : batches) {
        TopicPartition tp = batch.topicPartition;
        produceRecordsByPartition.put(tp, batch.records());
        recordsByPartition.put(tp, batch);
    }

    ProduceRequest.Builder requestBuilder =
            new ProduceRequest.Builder(acks, timeout, produceRecordsByPartition);
    RequestCompletionHandler callback = new RequestCompletionHandler() {
        public void onComplete(ClientResponse response) {
            handleProduceResponse(response, recordsByPartition, time.milliseconds());
        }
    };

    String nodeId = Integer.toString(destination);
    ClientRequest clientRequest = client.newClientRequest(nodeId, requestBuilder, now, acks != 0, callback);
    client.send(clientRequest, now);
    log.trace("Sent produce request to {}: {}", nodeId, requestBuilder);
}

This code is much simpler. In summary, all RecordBatches in batches whose leader is the same node are placed in a single request. Nothing is actually sent over the network here; the real network send happens later, in the poll method.
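The "one request per leader node" idea can be illustrated without any Kafka classes. In the sketch below every name is hypothetical: a batch is reduced to its topic-partition string and a record count, and the leader lookup is a plain map from topic-partition to node id.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of grouping batches by leader node; not Kafka's Sender.
class RequestGroupingSketch {
    static final class Batch {
        final String topicPartition;
        final int recordCount;

        Batch(String topicPartition, int recordCount) {
            this.topicPartition = topicPartition;
            this.recordCount = recordCount;
        }
    }

    // Group batches by the id of the node that leads their partition.
    static Map<Integer, List<Batch>> groupByLeader(List<Batch> batches, Map<String, Integer> leaderOf) {
        Map<Integer, List<Batch>> byNode = new LinkedHashMap<>();
        for (Batch batch : batches) {
            Integer leader = leaderOf.get(batch.topicPartition);
            if (leader == null) continue; // unknown leader: cannot be sent yet
            byNode.computeIfAbsent(leader, id -> new ArrayList<>()).add(batch);
        }
        return byNode;
    }

    public static void main(String[] args) {
        Map<String, Integer> leaderOf = new LinkedHashMap<>();
        leaderOf.put("test-0", 1);
        leaderOf.put("test-1", 2);
        leaderOf.put("test-2", 1);

        List<Batch> ready = new ArrayList<>();
        ready.add(new Batch("test-0", 10));
        ready.add(new Batch("test-1", 7));
        ready.add(new Batch("test-2", 3));

        // One request per node, each carrying every batch led by that node.
        for (Map.Entry<Integer, List<Batch>> e : groupByLeader(ready, leaderOf).entrySet()) {
            System.out.println("request to node " + e.getKey() + " with " + e.getValue().size() + " batch(es)");
        }
    }
}
```

Grouping this way means the number of requests scales with the number of brokers rather than with the number of partitions.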
