Understanding keras-yolov3: principles and code details

GitHub source for this article: https://github.com/qqwweee/keras-yolo3

yolov3 paper: https://pjreddie.com/media/files/papers/YOLOv3.pdf

yolov3 official site: https://pjreddie.com/darknet/yolo/

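I have recently become very interested in YOLOv3, read a lot of material, and worked on a few related projects. These are my notes, written down so I can look things up later.

YOLO, short for You Only Look Once, is an object detection algorithm based on convolutional neural networks (CNN).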

YOLO design philosophy

Overall, YOLO uses a CNN to detect objects end-to-end. The pipeline is shown in Figure 1.

image.png

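Figure 1

Specifically (based on YOLOv3):

1: Take an input image of any size and, keeping its aspect ratio, scale it until w or h reaches 416, then paste it onto a new 416*416 canvas that serves as the network input. In other words, the network input is a 416*416, 3-channel RGB image.

2: Run the network. YOLO's CNN divides the image into S*S grid cells (yolov3 predicts at multiple scales and outputs 3 layers, each with S*S cells: 13*13, 26*26 and 52*52). Each cell is responsible for detecting the objects whose center falls inside it, as shown in Figure 2. Each cell predicts 3*(4+1+B) values, so for an S*S grid each layer's final prediction is a tensor of size S*S*3*(4+1+B). B is the number of classes (80 for COCO), so B=80. 3 is the number of anchor boxes per layer, 4 is the box position and size (x, y, w, h), and 1 is the confidence.

3: Apply NMS (non-maximum suppression) to filter the boxes, which gives the kept boxes class_boxes and their scores class_box_scores; then generate the class labels classes, assemble the final detection boxes, and return them.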
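The aspect-preserving resize in step 1 can be sketched as plain arithmetic. This is a minimal illustration rather than the repository's letterbox_image() itself; the function name letterbox_geometry and its return layout are assumptions made for the example.

```python
def letterbox_geometry(image_size, target_size=(416, 416)):
    """Return (new_w, new_h, pad_x, pad_y) for an aspect-preserving resize.

    image_size and target_size are (width, height) tuples.
    """
    iw, ih = image_size
    tw, th = target_size
    # Scale by the tighter of the two ratios so that one side reaches the target.
    scale = min(tw / iw, th / ih)
    new_w, new_h = int(iw * scale), int(ih * scale)
    # Center the resized image on the target canvas.
    pad_x = (tw - new_w) // 2
    pad_y = (th - new_h) // 2
    return new_w, new_h, pad_x, pad_y


print(letterbox_geometry((832, 640)))  # (416, 320, 0, 48)
```

An 832*640 image is scaled by 0.5 to 416*320 and padded with 48 rows above and below.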

image.png

Figure 2

image.png

Figure 3

YOLOv3 network structure:

image.png

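Multi-scale:

yolov3 uses multi-scale prediction: (13*13), (26*26), (52*52).

Small scale (13*13 feature map):

The network takes a 416*416 image and downsamples it with 5 stride-2 convolutions (416 / 2^5 = 13), giving a 13*13 output.
Medium scale (26*26 feature map):

The second-to-last convolutional layer of the small-scale branch is upsampled (x2) and concatenated with the last 26x26 feature map of the backbone, giving a 26*26 output.
Large scale (52*52 feature map):

Same operation as the medium scale, giving a 52*52 output.
Benefit: the network learns deep and shallow features at the same time. Adjacent features of the shallow feature map are stacked into different channels (rather than spatial positions), similar to the identity mapping in ResNet. This approach stacks a 26x26x512 feature map into a 13x13x2048 one and concatenates it with the original deep feature map, which gives the model fine-grained features and improves its ability to detect small objects. (This particular stacking is the passthrough layer described for YOLOv2; YOLOv3 gets a comparable effect from the upsample-and-concatenate step above.)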
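The grid sizes and output depths quoted in this section follow from the input size and the three strides. A small sketch of that arithmetic, assuming strides of 32, 16 and 8 and the COCO class count; the helper names are made up for the example.

```python
def output_shapes(input_size=416, num_classes=80, anchors_per_scale=3):
    """Return the (height, width, depth) of each of the three output layers."""
    strides = (32, 16, 8)  # total downsampling factor at each scale
    depth = anchors_per_scale * (4 + 1 + num_classes)  # 3 * (box + confidence + classes)
    return [(input_size // s, input_size // s, depth) for s in strides]


def total_boxes(shapes, anchors_per_scale=3):
    """Count every candidate box the network predicts before filtering."""
    return sum(h * w * anchors_per_scale for h, w, _ in shapes)


shapes = output_shapes()
print(shapes)               # [(13, 13, 255), (26, 26, 255), (52, 52, 255)]
print(total_boxes(shapes))  # 10647
```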

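anchor box:

yolov3 has 9 anchor boxes in total, obtained by k-means clustering. On the COCO dataset the 9 clusters are: (10*13); (16*30); (33*23); (30*61); (62*45); (59*119); (116*90); (156*198); (373*326).

Feature maps of different sizes are paired with priors of different sizes.

13*13 feature map: [(116*90), (156*198), (373*326)]
26*26 feature map: [(30*61), (62*45), (59*119)]
52*52 feature map: [(10*13), (16*30), (33*23)]
Reason: the larger the feature map, the smaller the receptive field. It is more sensitive to small objects, so it gets the small anchor boxes.

      The smaller the feature map, the larger the receptive field. It is more sensitive to large objects, so it gets the large anchor boxes.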
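The pairing above can be written as an index mask over the sorted anchor list, which is also how yolo_eval() selects anchors per layer later in this article. A minimal sketch; the constant and function names are illustrative only.

```python
# The 9 COCO anchors (width, height), sorted from small to large.
ANCHORS = [(10, 13), (16, 30), (33, 23), (30, 61), (62, 45),
           (59, 119), (116, 90), (156, 198), (373, 326)]
# One index triple per output layer, coarsest grid first.
ANCHOR_MASK = [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
GRID_SIZES = [13, 26, 52]


def anchors_by_grid():
    """Map each grid size to the three anchors assigned to it."""
    return {grid: [ANCHORS[i] for i in mask]
            for grid, mask in zip(GRID_SIZES, ANCHOR_MASK)}


for grid, anchors in anchors_by_grid().items():
    print(grid, anchors)
```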

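Box prediction:

Predict tx, ty, tw, th

Apply sigmoid to tx and ty, then add the corresponding offset (Cx, Cy in the figure below)
Apply exp to th and tw, then multiply by the corresponding anchor dimensions
Convert the result from grid units to input-image pixels by multiplying by the corresponding stride: 416/13, 416/26, 416/52 (in keras-yolo3 the anchors are already given in pixels, so only the center needs this scaling)
Finally, apply sigmoid to the objectness and class confidences to get probabilities in 0~1. Sigmoid replaces the softmax of earlier versions because softmax inflates the largest class probability and suppresses the others.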
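The difference between the two activations is easy to see on a toy example. The sketch below uses made-up logits for three classes, two of which plausibly apply to the same object.

```python
import math


def sigmoid(x):
    """Squash one logit to (0, 1) independently of the others."""
    return 1.0 / (1.0 + math.exp(-x))


def softmax(logits):
    """Normalize logits so that the outputs compete and sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.9, -3.0]                    # hypothetical class logits
independent = [sigmoid(z) for z in logits]   # both strong classes stay high
competing = softmax(logits)                  # the strong classes split the mass

print([round(p, 3) for p in independent])
print([round(p, 3) for p in competing])
```

With sigmoid both of the first two classes score above 0.8, while softmax forces them to share the probability mass, so neither reaches 0.6.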

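(tx,ty): the offset of the object center relative to the top-left corner of the grid cell it falls in. After sigmoid the values lie in [0,1]. In the figure, roughly (0.3, 0.4).

(cx,cy): the number of cells between the top-left corner of that cell and the top-left corner of the image. In the figure, (1,1).

(pw,ph): the side lengths of the anchor box

(tw,th): the raw width and height predictions, relative to the anchor; the predicted box width and height are bw = pw*exp(tw) and bh = ph*exp(th)

PS: the final box coordinates are bx, by, bw, bh, while what the network learns is tx, ty, tw, th.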
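Putting the symbols together, decoding one prediction into input-image pixels might look like the sketch below. It assumes anchors expressed in pixels (as in keras-yolo3) and a hypothetical helper name decode_box; it is an illustration of the formulas, not the repository's code.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def decode_box(t, cell, anchor, stride):
    """Turn raw outputs (tx, ty, tw, th) into a box in input-image pixels.

    cell is (cx, cy) in grid units, anchor is (pw, ph) in pixels,
    stride is the size of one grid cell in pixels (32, 16 or 8 for a 416 input).
    """
    tx, ty, tw, th = t
    cx, cy = cell
    pw, ph = anchor
    bx = (sigmoid(tx) + cx) * stride  # center x: offset inside the cell plus cell index
    by = (sigmoid(ty) + cy) * stride  # center y
    bw = pw * math.exp(tw)            # width: the anchor scaled by exp(tw)
    bh = ph * math.exp(th)            # height
    return bx, by, bw, bh


# All-zero outputs place the box at the middle of cell (6, 6) with the anchor's size.
print(decode_box((0.0, 0.0, 0.0, 0.0), (6, 6), (116, 90), 32))  # (208.0, 208.0, 116.0, 90.0)
```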

Loss function

YOLOv3 replaces the softmax loss of YOLOv2 with a logistic loss.

                                             This figure is for reference only and differs slightly from YOLOv3
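For the class term, a logistic loss means one binary cross-entropy per class instead of a single softmax cross-entropy. A minimal sketch of that term alone, with hypothetical function names; the full YOLOv3 loss also has box and objectness terms that are not shown here.

```python
import math


def binary_cross_entropy(target, logit):
    """Logistic loss for one class: target is 0.0 or 1.0, logit is the raw output."""
    p = 1.0 / (1.0 + math.exp(-logit))
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def class_loss(targets, logits):
    """Sum the independent per-class losses for one box."""
    return sum(binary_cross_entropy(t, z) for t, z in zip(targets, logits))


targets = [1.0, 0.0, 0.0]
print(class_loss(targets, [0.0, 0.0, 0.0]))    # undecided logits: 3 * ln(2)
print(class_loss(targets, [4.0, -4.0, -4.0]))  # confident and correct: much smaller
```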

Code walkthrough: the detection part of the source

Usage

git clone https://github.com/qqwweee/keras-yolo3
Download the yolov3 weights from the YOLO website
Convert the Darknet YOLO model to a Keras model
Run YOLO detection

Initialization parameters of the YOLO class:
class YOLO(object):
    _defaults = {
        "model_path": 'model_data/yolo.h5',             # trained model
        "anchors_path": 'model_data/yolo_anchors.txt',  # the 9 anchor boxes, sorted from small to large
        "classes_path": 'model_data/coco_classes.txt',  # class names
        "score": 0.3,                                   # score threshold
        "iou": 0.45,                                    # IoU threshold
        "model_image_size": (416, 416),                 # network input size
        "gpu_num": 1,                                   # number of GPUs
    }
Run yolo_video.py:

from PIL import Image

def detect_img(yolo):
    while True:
        img = input('Input image filename:')  # path of the image to detect
        try:
            image = Image.open(img)
        except:
            print('Open Error! Try again!')
            continue
        else:
            r_image = yolo.detect_image(image)  # run the detection in yolo.detect_image
            r_image.show()
    yolo.close_session()

The detect_image() function is at line 102 of yolo.py

def detect_image(self, image):
    start = timer()

    if self.model_image_size != (None, None):  # a fixed network input size is configured
        # model_image_size holds the input height and width; both must be multiples of 32
        assert self.model_image_size[0]%32 == 0, 'Multiples of 32 required'
        assert self.model_image_size[1]%32 == 0, 'Multiples of 32 required'
        # letterbox_image() is defined at line 20 of utils.py. It takes the image and
        # (w=416, h=416) and returns a new image padded so the aspect ratio is unchanged.
        boxed_image = letterbox_image(image, tuple(reversed(self.model_image_size)))
    else:
        new_image_size = (image.width - (image.width % 32),
                          image.height - (image.height % 32))
        boxed_image = letterbox_image(image, new_image_size)
    image_data = np.array(boxed_image, dtype='float32')
    print(image_data.shape)  # (416, 416, 3)
    image_data /= 255.  # normalize to [0, 1]
    # add a batch dimension -> (1, 416, 416, 3) to match the network input format (batch, h, w, c)
    image_data = np.expand_dims(image_data, 0)

    # Fetch boxes, scores and classes; how they are computed is defined in generate(), line 61 of yolo.py
    out_boxes, out_scores, out_classes = self.sess.run(
        [self.boxes, self.scores, self.classes],
        feed_dict={  # values fed to the graph
            self.yolo_model.input: image_data,  # image data
            self.input_image_shape: [image.size[1], image.size[0]],  # original image size (height, width)
            K.learning_phase(): 0  # learning phase, 0: test, 1: train
        })

    print('Found {} boxes for {}'.format(len(out_boxes), 'img'))

    # Draw the results with Pillow: font size and line thickness are derived from
    # the image size, then each box and its class label are drawn.
    font = ImageFont.truetype(font='font/FiraMono-Medium.otf',
                size=np.floor(3e-2 * image.size[1] + 0.5).astype('int32'))  # font
    thickness = (image.size[0] + image.size[1]) // 300  # line thickness

    for i, c in reversed(list(enumerate(out_classes))):
        predicted_class = self.class_names[c]  # class name
        box = out_boxes[i]  # box
        score = out_scores[i]  # score

        label = '{} {:.2f}'.format(predicted_class, score)  # label text
        draw = ImageDraw.Draw(image)  # drawing context
        # label size in pixels (textsize was removed in Pillow 10; newer releases provide textbbox)
        label_size = draw.textsize(label, font)

        top, left, bottom, right = box
        top = max(0, np.floor(top + 0.5).astype('int32'))
        left = max(0, np.floor(left + 0.5).astype('int32'))
        bottom = min(image.size[1], np.floor(bottom + 0.5).astype('int32'))
        right = min(image.size[0], np.floor(right + 0.5).astype('int32'))
        print(label, (left, top), (right, bottom))  # box corners

        if top - label_size[1] >= 0:  # put the label above the box if there is room
            text_origin = np.array([left, top - label_size[1]])
        else:
            text_origin = np.array([left, top + 1])

        # My kingdom for a good redistributable image drawing library.
        for i in range(thickness):  # draw the box outline
            draw.rectangle(
                [left + i, top + i, right - i, bottom - i],
                outline=self.colors[c])
        draw.rectangle(  # label background
            [tuple(text_origin), tuple(text_origin + label_size)],
            fill=self.colors[c])
        draw.text(text_origin, label, fill=(0, 0, 0), font=font)  # label text
        del draw

    end = timer()
    print(end - start)
    return image
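The rounding and clipping that detect_image() applies to each box before drawing can be isolated as a small pure-Python helper. The name clip_box is an assumption for this sketch; the arithmetic mirrors the max/min lines above, with math.floor standing in for np.floor.

```python
import math


def clip_box(box, image_size):
    """Round a (top, left, bottom, right) box and clamp it to the image.

    image_size is (width, height), the same order as PIL's image.size.
    """
    top, left, bottom, right = box
    width, height = image_size
    top = max(0, math.floor(top + 0.5))          # round to nearest, not below 0
    left = max(0, math.floor(left + 0.5))
    bottom = min(height, math.floor(bottom + 0.5))  # not beyond the image height
    right = min(width, math.floor(right + 0.5))     # not beyond the image width
    return top, left, bottom, right


print(clip_box((-3.2, 10.4, 500.7, 650.0), (640, 480)))  # (0, 10, 480, 640)
```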
generate() is at line 61 of yolo.py
def generate(self):
    model_path = os.path.expanduser(self.model_path)  # resolve the model path
    assert model_path.endswith('.h5'), 'Keras model or weights must be a .h5 file.'  # the file must end in .h5

    # Load model, or construct model and load weights.
    num_anchors = len(self.anchors)      # num_anchors = 9: yolov3 has 9 prior boxes
    num_classes = len(self.class_names)  # num_classes = 80: COCO has 80 classes
    is_tiny_version = num_anchors==6  # default setting: is_tiny_version = False
    try:
        self.yolo_model = load_model(model_path, compile=False)  # load the saved model
    except:
        self.yolo_model = tiny_yolo_body(Input(shape=(None,None,3)), num_anchors//2, num_classes) \
            if is_tiny_version else yolo_body(Input(shape=(None,None,3)), num_anchors//3, num_classes)
        self.yolo_model.load_weights(self.model_path)  # make sure model, anchors and classes match
    else:
        # model.layers[-1] is the last layer of the network; output_shape[-1] is the
        # last dimension of its output, e.g. (?, 13, 13, 255).
        # 255 = 9/3*(80+5). 9/3: 3 anchor boxes per feature map; 80: classes;
        # 5: 4+1, the 4 box values plus 1 confidence.
        assert self.yolo_model.layers[-1].output_shape[-1] == \
            num_anchors/len(self.yolo_model.output) * (num_classes + 5), \
            'Mismatch between model and given anchor and class sizes'

    print('{} model, anchors, and classes loaded.'.format(model_path))
    # Generate colors for drawing bounding boxes.
    # h (hue): x/len(self.class_names), s (saturation): 1.0, v (value): 1.0
    hsv_tuples = [(x / len(self.class_names), 1., 1.)
                  for x in range(len(self.class_names))]
    self.colors = list(map(lambda x: colorsys.hsv_to_rgb(*x), hsv_tuples))  # convert HSV to RGB
    # hsv_to_rgb returns values in [0, 1] while RGB channels range over [0, 255], so multiply by 255
    self.colors = list(
        map(lambda x: (int(x[0] * 255), int(x[1] * 255), int(x[2] * 255)),
            self.colors))
    np.random.seed(10101)  # fix the seed so colors are consistent across runs
    np.random.shuffle(self.colors)  # shuffle colors to decorrelate adjacent classes
    np.random.seed(None)  # reset the seed to default

    # Generate output tensor targets for filtered bounding boxes.
    self.input_image_shape = K.placeholder(shape=(2, ))  # K.placeholder: a Keras placeholder
    if self.gpu_num>=2:
        self.yolo_model = multi_gpu_model(self.yolo_model, gpus=self.gpu_num)
    boxes, scores, classes = yolo_eval(self.yolo_model.output, self.anchors,
            len(self.class_names), self.input_image_shape,
            score_threshold=self.score, iou_threshold=self.iou)  # yolo_eval(): the YOLO evaluation function
    return boxes, scores, classes
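The color generation in generate() can be tried on its own with the standard library. The sketch below swaps np.random for a local random.Random instance, so the shuffled order will differ from the repository's; the function name class_colors is made up for the example.

```python
import colorsys
import random


def class_colors(num_classes, seed=10101):
    """Build one distinct RGB color per class by spreading hues evenly."""
    hsv = [(i / num_classes, 1.0, 1.0) for i in range(num_classes)]
    rgb = [colorsys.hsv_to_rgb(h, s, v) for h, s, v in hsv]  # channels in [0, 1]
    colors = [(int(r * 255), int(g * 255), int(b * 255)) for r, g, b in rgb]
    # A seeded local generator gives the same order on every run
    # without touching any global random state.
    random.Random(seed).shuffle(colors)
    return colors


colors = class_colors(80)
print(len(colors), colors[:3])
```

Using a local generator avoids the seed, shuffle, reset sequence that the original needs to leave the global NumPy state untouched.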

def yolo_eval(yolo_outputs,  # model outputs: [(?, 13, 13, 255), (?, 26, 26, 255), (?, 52, 52, 255)]; ?: batch size; 13/26/52: prediction scales; 255: predicted values, 3*(80+5)
              anchors,  # [(10,13), (16,30), (33,23), (30,61), (62,45), (59,119), (116,90), (156,198), (373,326)]
              num_classes,  # number of classes, 80 for COCO
              image_shape,  # TF placeholder holding the original image size (height, width)
              max_boxes=20,  # at most 20 boxes are kept per class per image
              score_threshold=.6,  # box score threshold: boxes below it are dropped; lower it to keep more boxes, raise it to keep fewer
              iou_threshold=.5):  # IoU threshold for same-class boxes: overlapping boxes above it are dropped; raise it when many objects overlap, lower it when few do
    """Evaluate YOLO model on given input and return filtered boxes."""
    num_layers = len(yolo_outputs)  # number of output layers; num_layers = 3 -> 13, 26, 52
    # default setting: each layer gets 3 anchor boxes, e.g. 13*13 gets [6,7,8], i.e. (116,90), (156,198), (373,326)
    anchor_mask = [[6,7,8], [3,4,5], [0,1,2]] if num_layers==3 else [[3,4,5], [1,2,3]]
    # yolo_outputs[0] has shape (?, 13, 13, 255); dimensions 1 and 2 times 32 -> 13*32 = 416, so input_shape is (416, 416)
    input_shape = K.shape(yolo_outputs[0])[1:3] * 32
    boxes = []
    box_scores = []
    for l in range(num_layers):
        _boxes, _box_scores = yolo_boxes_and_scores(yolo_outputs[l],
            anchors[anchor_mask[l]], num_classes, input_shape, image_shape)
        boxes.append(_boxes)
        box_scores.append(_box_scores)
    boxes = K.concatenate(boxes, axis=0)  # K.concatenate: join the per-layer results -> (?, 4)
    box_scores = K.concatenate(box_scores, axis=0)  # -> (?, 80)

    mask = box_scores >= score_threshold  # boolean mask: keep only scores at or above the threshold
    max_boxes_tensor = K.constant(max_boxes, dtype='int32')  # maximum number of boxes, 20
    boxes_ = []
    scores_ = []
    classes_ = []
    for c in range(num_classes):
        # TODO: use keras backend instead of tf.
        class_boxes = tf.boolean_mask(boxes, mask[:, c])  # boxes that pass the mask for class c
        class_box_scores = tf.boolean_mask(box_scores[:, c], mask[:, c])  # their scores for class c
        nms_index = tf.image.non_max_suppression(  # run non-maximum suppression
            class_boxes, class_box_scores, max_boxes_tensor, iou_threshold=iou_threshold)
        class_boxes = K.gather(class_boxes, nms_index)  # K.gather: pick class_boxes by the indices in nms_index
        class_box_scores = K.gather(class_box_scores, nms_index)  # pick class_box_scores by the same indices
        classes = K.ones_like(class_box_scores, 'int32') * c  # class index c repeated once per kept box
        boxes_.append(class_boxes)
        scores_.append(class_box_scores)
        classes_.append(classes)

    # K.concatenate() joins tensors of matching shape; boxes_ becomes (?, 4); ?: number of boxes; 4: (y_min, x_min, y_max, x_max)
    boxes_ = K.concatenate(boxes_, axis=0)
    scores_ = K.concatenate(scores_, axis=0)  # becomes (?,)
    classes_ = K.concatenate(classes_, axis=0)  # becomes (?,)

    return boxes_, scores_, classes_
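yolo_eval() delegates the suppression step to tf.image.non_max_suppression. The greedy procedure behind it can be sketched in plain Python as follows; this is an illustrative reimplementation under the usual definition of NMS, not TensorFlow's code, and the names iou and nms are chosen for the example.

```python
def iou(a, b):
    """Intersection over union of two (y_min, x_min, y_max, x_max) boxes."""
    inter_h = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_w = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_h * inter_w
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, max_boxes=20, iou_threshold=0.5):
    """Return indices of the boxes to keep, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if len(keep) >= max_boxes:
            break
        # Keep a box only if it does not overlap any already kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # [0, 2]
```

The second box overlaps the first with an IoU of about 0.68, so it is suppressed at the default threshold of 0.5 and survives if the threshold is raised to 0.7.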

yolo_boxes_and_scores() is at line 176 of model.py
def yolo_boxes_and_scores(feats, anchors, num_classes, input_shape, image_shape):
    '''Process Conv layer output'''
    # feats: one output layer, shape (?, 13, 13, 255); anchors: the 3 anchor boxes of that layer
    # num_classes: number of classes (80); input_shape: (416, 416); image_shape: original image size

    # yolo_head(): box_xy is the box center as a relative position in (0~1); box_wh is the box width and height as relative values in (0~1);
    # box_confidence is the objectness confidence of the box; box_class_probs is the class confidence
    box_xy, box_wh, box_confidence, box_class_probs = yolo_head(feats,
        anchors, num_classes, input_shape)
    # convert the (0~1) relative box_xy and box_wh to real coordinates; boxes holds (y_min, x_min, y_max, x_max)
    boxes = yolo_correct_boxes(box_xy, box_wh, input_shape, image_shape)
    # reshape the per-cell values into a list of boxes: (?, 13, 13, 3, 4) -> (?, 4); ?: number of boxes
    boxes = K.reshape(boxes, [-1, 4])
    # box score = box confidence * class confidence
    box_scores = box_confidence * box_class_probs
    # flatten the box scores to (?, 80); ?: number of boxes
    box_scores = K.reshape(box_scores, [-1, num_classes])
    return boxes, box_scores
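For a single box, the score computation and the later threshold mask reduce to a few lines. A toy sketch with made-up probabilities and illustrative function names:

```python
def box_scores(confidence, class_probs):
    """Score per class = objectness confidence * class probability."""
    return [confidence * p for p in class_probs]


def passing_classes(scores, threshold=0.6):
    """Indices of the classes whose score reaches the threshold."""
    return [c for c, s in enumerate(scores) if s >= threshold]


scores = box_scores(0.9, [0.8, 0.1, 0.7])  # hypothetical 3-class prediction
print(passing_classes(scores))             # [0, 2]
```

Because each class is scored independently, one box can pass the threshold for more than one class.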

yolo_head() is at line 122 of model.py
def yolo_head(feats, anchors, num_classes, input_shape, calc_loss=False):  # same parameters as above
    """Convert final layer features to bounding box parameters."""
    num_anchors = len(anchors)  # num_anchors = 3
    # Reshape to batch, height, width, num_anchors, box_params.
    anchors_tensor = K.reshape(K.constant(anchors), [1, 1, 1, num_anchors, 2])  # reshape -> (1, 1, 1, 3, 2)

    grid_shape = K.shape(feats)[1:3]  # height, width: (?, 13, 13, 255) -> (13, 13)
    # grid_y and grid_x build the grid: arange, reshape and tile create the 0~12 indices
    # along y (grid_y) and along x (grid_x); concatenating the two gives grid
    grid_y = K.tile(K.reshape(K.arange(0, stop=grid_shape[0]), [-1, 1, 1, 1]),
        [1, grid_shape[1], 1, 1])
    grid_x = K.tile(K.reshape(K.arange(0, stop=grid_shape[1]), [1, -1, 1, 1]),
        [grid_shape[0], 1, 1, 1])
    grid = K.concatenate([grid_x, grid_y])
    grid = K.cast(grid, K.dtype(feats))  # K.cast(): give grid the same dtype as feats

    # split the last dimension of feats so the anchors are separated from
    # the other values (classes + 4 box values + box confidence)
    feats = K.reshape(
        feats, [-1, grid_shape[0], grid_shape[1], num_anchors, num_classes + 5])

    # Adjust predictions to each spatial grid point and anchor size.
    # xywh formulas: tx, ty, tw and th are the feats values, bx, by, bw and bh are the outputs (see the box prediction section)
    box_xy = (K.sigmoid(feats[..., :2]) + grid) / K.cast(grid_shape[::-1], K.dtype(feats))
    box_wh = K.exp(feats[..., 2:4]) * anchors_tensor / K.cast(input_shape[::-1], K.dtype(feats))
    box_confidence = K.sigmoid(feats[..., 4:5])
    box_class_probs = K.sigmoid(feats[..., 5:])
    # K.sigmoid is the σ in the formulas
    # In an index, "..." (Ellipsis) stands for all the leading dimensions, which are
    # left unchanged; only the last dimension is sliced here.

    if calc_loss == True:
        return grid, feats, box_xy, box_wh
    return box_xy, box_wh, box_confidence, box_class_probs
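What yolo_head() does with tensors can be followed for one cell and one anchor in plain Python. The sketch below builds the same kind of (x, y) offset grid and applies the two normalizations: centers divided by the grid size, sizes divided by the input size. The names make_grid and head_decode are assumptions for the example.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_grid(grid_h, grid_w):
    """grid[row][col] == (col, row), the (x, y) offset of each cell."""
    return [[(col, row) for col in range(grid_w)] for row in range(grid_h)]


def head_decode(t, cell, anchor, grid_shape, input_shape):
    """Return ((x, y), (w, h)) relative to the network input, each in 0~1.

    cell is (col, row); grid_shape and input_shape are (height, width).
    """
    tx, ty, tw, th = t
    col, row = cell
    grid_h, grid_w = grid_shape
    in_h, in_w = input_shape
    x = (sigmoid(tx) + col) / grid_w      # center, as a fraction of the grid width
    y = (sigmoid(ty) + row) / grid_h
    w = math.exp(tw) * anchor[0] / in_w   # size, as a fraction of the input width
    h = math.exp(th) * anchor[1] / in_h
    return (x, y), (w, h)


grid = make_grid(13, 13)
xy, wh = head_decode((0.0, 0.0, 0.0, 0.0), grid[6][6], (116, 90), (13, 13), (416, 416))
print(xy, wh)
```

With all-zero outputs in the middle cell of a 13*13 grid, the center comes out at (0.5, 0.5) and the size equals the anchor divided by 416.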

yolo_correct_boxes() is at line 150 of model.py
def yolo_correct_boxes(box_xy, box_wh, input_shape, image_shape):  # map x, y, w, h back to the original image
    '''Get corrected boxes'''
    box_yx = box_xy[..., ::-1]  # "::-1" reverses the last dimension: (x, y) -> (y, x)
    box_hw = box_wh[..., ::-1]
    input_shape = K.cast(input_shape, K.dtype(box_yx))
    image_shape = K.cast(image_shape, K.dtype(box_yx))
    # size of the resized image inside the letterboxed network input
    new_shape = K.round(image_shape * K.min(input_shape/image_shape))
    offset = (input_shape-new_shape)/2./input_shape  # padding on each side, as a fraction of the input
    scale = input_shape/new_shape  # stretch from the padded input to the image content
    box_yx = (box_yx - offset) * scale
    box_hw *= scale

    box_mins = box_yx - (box_hw / 2.)
    box_maxes = box_yx + (box_hw / 2.)
    boxes = K.concatenate([
        box_mins[..., 0:1],   # y_min
        box_mins[..., 1:2],   # x_min
        box_maxes[..., 0:1],  # y_max
        box_maxes[..., 1:2]   # x_max
    ])

    # Scale boxes back to original image shape.
    boxes *= K.concatenate([image_shape, image_shape])
    return boxes
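The same correction can be followed for a single box without tensors. This sketch mirrors the offset and scale arithmetic above in plain Python; the function name correct_box is hypothetical, and Python's round() stands in for K.round.

```python
def correct_box(box_yx, box_hw, input_shape, image_shape):
    """Map one letterboxed, normalized box back to original-image pixels.

    box_yx and box_hw are (y, x) and (h, w) in 0~1 relative to the network
    input; input_shape and image_shape are (height, width).
    Returns (y_min, x_min, y_max, x_max) in pixels.
    """
    ratio = min(input_shape[0] / image_shape[0], input_shape[1] / image_shape[1])
    new_shape = [round(image_shape[k] * ratio) for k in range(2)]
    spans = []
    for k in range(2):  # k = 0 handles y/height, k = 1 handles x/width
        offset = (input_shape[k] - new_shape[k]) / 2.0 / input_shape[k]  # padding fraction
        scale = input_shape[k] / new_shape[k]
        center = (box_yx[k] - offset) * scale
        size = box_hw[k] * scale
        spans.append((center - size / 2.0, center + size / 2.0))
    (y_min, y_max), (x_min, x_max) = spans
    return (y_min * image_shape[0], x_min * image_shape[1],
            y_max * image_shape[0], x_max * image_shape[1])


# A box covering exactly the non-padded region of a 832*640 (w*h) image
# maps back to the full original image.
full = correct_box((0.5, 0.5), (320 / 416, 1.0), (416, 416), (640, 832))
print([round(v, 3) for v in full])
```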

OK, that's all! Enjoy it!
