I. Preface

Please do not repost without permission, thanks~

This post records how I processed and loaded the UCF101 video dataset; the same approach also works for other video datasets.

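1. Motivation

PyTorch ships ready-made interfaces for datasets that are common in computer vision, such as CIFAR10, so a single call gives you the train_x, train_y, test_x and test_y you need. The UCF101 dataset I needed this time did not yet get that treatment, so the data has to be read in first before any network training or testing can happen.

In short, this post implements the processing and loading of UCF101 so that, for a given batch_size, each call returns the train_x, train_y, test_x and test_y needed for a video classification task.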

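2. Limitations

The approach in this post is simple and crude, and it also applies to other datasets.

Before you read on, though: the title already says that no deep learning framework is used, but so as not to waste your time I should spell out what that means. When I wrote this code I only had in mind that PyTorch's ready-made dataset interfaces could not be used directly, and I forgot that it also provides data loading utilities such as DataLoader.

So there is probably a lot of room for improvement in data processing efficiency and memory overhead~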

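II. The UCF101 dataset

A brief introduction to the UCF101 dataset:

Number of videos: 13,320 short clips
Video source: YouTube
Number of classes: 101
Main action types (5): human-object interaction, body motion only, human-human interaction, playing musical instruments, sports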

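III. Implementation approach

1. Preparing the dataset

Download the UCF101 dataset UCF101.zip and extract it;
Download the annotation file together with the train and test list files: "The Train/Test Splits for Action Recognition on UCF101 data set".
Judging by the files the full code in this post reads, it contains classInd.txt, trainlist01.txt to trainlist03.txt and testlist01.txt to testlist03.txt.

Both downloads are available from the official UCF dataset website.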
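
To make the file layout concrete, here is a minimal sketch of how lines from these list files can be parsed. The sample lines mimic the format that the full code in this post relies on (an index and a class name in classInd.txt, a relative video path plus a 1-based class index in the train lists). The helper names parse_class_index and parse_train_line are hypothetical and not part of the original code.

```python
# Hypothetical helpers; the sample lines below only mimic the assumed file format.

def parse_class_index(lines):
    """Turn 'index name' lines (classInd.txt style) into a list where position = 0-based label."""
    names = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        _, name = line.split(' ', 1)
        names.append(name)
    return names


def parse_train_line(line):
    """Split a train-list line 'Class/video.avi k' into (path, 0-based label)."""
    path, index = line.strip().rsplit(' ', 1)
    return path, int(index) - 1


class_names = parse_class_index(["1 ApplyEyeMakeup", "2 ApplyLipstick"])
path, label = parse_train_line("ApplyEyeMakeup/v_ApplyEyeMakeup_g08_c01.avi 1")
print(class_names[label], path)
```

Subtracting 1 from the class index is what makes the labels run from 0 to 100 instead of 1 to 101.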

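2. Preprocessing

Reference code: two-stream-action-recognition
Preprocessing consists of two steps: splitting each video into frames, and counting the number of frames of each video.
The code for both steps is provided in the reference above; just download the sources video_jpg_ucf101_hmdb51.py and n_frames_ucf101_hmdb51.py.

Here is how to use them and what they produce:

Step 1: split every video in UCF101 into frame images, keeping the directory structure unchanged.
python utils_fyq/video_jpg_ucf101_hmdb51.py /home/hl/Desktop/lovelyqian/CV_Learning/UCF101 /home/hl/Desktop/lovelyqian/CV_Learning/UCF101_jpg
Result: every video is split frame by frame into images under the same directory structure. The number of frames differs from video to video (around 150), and every image is 320*240.
Step 2: count the frames (images) of each video.
python utils_fyq/n_frames_ucf101_hmdb51.py /home/hl/Desktop/lovelyqian/CV_Learning/UCF101_jpg
Result: every video's frame folder now contains an n_frames file recording the number of frames of that video (the full code below opens it under the name n_frames, without an extension).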
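
To show what the frame-counting step amounts to, here is a minimal sketch. It is not the referenced script; it assumes the layout <root>/<class>/<video>/image_00001.jpg and a count file named n_frames, both taken from how the full code in this post reads the data. The function names count_frames and write_frame_counts are hypothetical.

```python
import os
import tempfile


def count_frames(video_dir):
    """Count files named image_XXXXX.jpg in one video's frame folder."""
    return sum(
        1 for name in os.listdir(video_dir)
        if name.startswith('image_') and name.endswith('.jpg')
    )


def write_frame_counts(jpg_root):
    """Walk jpg_root/<class>/<video>/ and write each frame count to a file named n_frames."""
    counts = {}
    for class_name in sorted(os.listdir(jpg_root)):
        class_dir = os.path.join(jpg_root, class_name)
        if not os.path.isdir(class_dir):
            continue
        for video_name in sorted(os.listdir(class_dir)):
            video_dir = os.path.join(class_dir, video_name)
            if not os.path.isdir(video_dir):
                continue
            n = count_frames(video_dir)
            with open(os.path.join(video_dir, 'n_frames'), 'w') as f:
                f.write(str(n))
            counts[os.path.join(class_name, video_name)] = n
    return counts


# Tiny demo on a throwaway directory holding three empty "frames".
demo_root = tempfile.mkdtemp()
demo_video = os.path.join(demo_root, 'ApplyEyeMakeup', 'v_ApplyEyeMakeup_g01_c01')
os.makedirs(demo_video)
for i in range(1, 4):
    open(os.path.join(demo_video, 'image_%05d.jpg' % i), 'w').close()
demo_counts = write_frame_counts(demo_root)
print(demo_counts)
```

Storing the count next to the frames means the loader can pick a random clip without listing the folder every time.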

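3. Further processing

A UCF101 class is defined, with the following goals:

train_x: [batch_size,16,3,160,160]
test_x : [batch_size,16,3,160,160]
16 consecutive frames are taken from each video, starting at a random position
Each image has 3 channels and is resized to (160,160)
There are 101 classes in total, so label values are 0-100
train_y: [batch_size], the corresponding label values;
test_y_label: [batch_size], the label derived from each video's name (its class name), used for comparison with the predictions.
classNames[101]: the index is the label and the value is the class name, e.g. classNames[0]='ApplyEyeMakeup'

Note that __getitem__ in the full code applies permute(0,2,1,3,4) before returning, so the tensors actually returned have shape [batch_size,3,16,160,160].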
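
The "16 consecutive frames from a random position" goal can be sketched without any image loading. The snippet below only computes which frame files would be read, assuming frames are numbered from 1 and named image_00001.jpg and so on, as in the full code; sample_clip_frame_names is a hypothetical name.

```python
import random


def sample_clip_frame_names(frame_count, clip_len=16, rng=random):
    """Return clip_len consecutive frame file names, starting at a random 1-based frame id."""
    if frame_count < clip_len:
        raise ValueError('video has fewer frames than the clip length')
    # randint is inclusive, so the last valid start is frame_count - clip_len + 1.
    start = rng.randint(1, frame_count - clip_len + 1)
    return ['image_%05d.jpg' % i for i in range(start, start + clip_len)]


clip = sample_clip_frame_names(150)
print(clip[0], clip[-1])
```

Because the start is drawn again on every call, the same video yields a different clip in each epoch, which acts as a simple form of temporal augmentation.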

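The functions are described one by one below:

get_className()
Reads classInd.txt from the downloaded annotation files to get the value (class name) for each index.
get_train()
Reads the downloaded train list files (trainlist01.txt to trainlist03.txt) to get the paths of the training videos, train_x_path, and the corresponding class labels, train_y.
get_label()
Extracts the class a video belongs to from its file name.
get_test()
Reads the downloaded test list files (testlist01.txt to testlist03.txt) to get the paths to test, test_x_path, and calls the function above on each path to get the correct labels, test_y_label, which are used to compute prediction accuracy.
get_single_image()
Reads an image from its path; in this post the result is a Tensor of size (3,160,160).
get_single_video_x()
Randomly picks 16 consecutive frames from a video's frame folder.
set_mode()
Sets whether the data to fetch is training data or test data.
get_minibatches_index()
Given the total number of training (test) videos and batch_size, returns the video indices for each training (test) batch.
__getitem__()
Uses Python's special method so that elements can be accessed by index and iterated over automatically. Using this, batch_index serves as the index, and each access returns the required values depending on whether the current mode is train or test.
__init__()
Some initialization, plus calls to get_train() and get_test() to obtain the respective video path lists and label information up front.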
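
How set_mode(), get_minibatches_index() and __getitem__() work together can be illustrated with a toy class that batches plain integers instead of video tensors. This is a simplified sketch of the same idea, not the class from this post; ToyBatches and make_minibatches are hypothetical names.

```python
import random


class ToyBatches:
    """Stand-in for the UCF101 class: items are integers instead of video tensors."""

    def __init__(self, items, batch_size):
        self.items = list(items)
        self.batch_size = batch_size
        self.minibatches = []

    def make_minibatches(self, shuffle=True):
        """Shuffle the sample indices, then cut them into batch_size chunks (the last may be shorter)."""
        indices = list(range(len(self.items)))
        if shuffle:
            random.shuffle(indices)
        self.minibatches = [
            indices[start:start + self.batch_size]
            for start in range(0, len(indices), self.batch_size)
        ]
        return len(self.minibatches)

    def __getitem__(self, batch_index):
        """obj[batch_index] returns the samples that belong to that batch."""
        return [self.items[i] for i in self.minibatches[batch_index]]


toy = ToyBatches(range(10), batch_size=4)
batch_num = toy.make_minibatches()
batches = [toy[b] for b in range(batch_num)]
print(batch_num, [len(b) for b in batches])
```

With 10 items and a batch size of 4 this gives three batches, the last holding the remaining two items, and every item appears exactly once per pass.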

4. Usage

    myUCF101=UCF101()

    # get classNames
    className=myUCF101.get_className()

    # train
    batch_num=myUCF101.set_mode('train')
    for batch_index in range(batch_num):
        train_x,train_y=myUCF101[batch_index]
        print (train_x,train_y)
        print ("train batch:",batch_index)

    # test
    batch_num=myUCF101.set_mode('test')
    for batch_index in range(batch_num):
        test_x,test_y_label=myUCF101[batch_index]
        print (test_x,test_y_label)
        print ("test batch: " ,batch_index)
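
Since test_y_label holds class names while a classifier usually outputs class indices, comparing the two goes through the classNames list. Below is a minimal sketch of that comparison; batch_accuracy and the example predictions are hypothetical and only illustrate the bookkeeping.

```python
def batch_accuracy(predicted_indices, true_names, class_names):
    """Fraction of samples whose predicted index maps to the true class name."""
    if not true_names:
        return 0.0
    correct = sum(
        1 for idx, name in zip(predicted_indices, true_names)
        if class_names[idx] == name
    )
    return correct / len(true_names)


class_names = ['ApplyEyeMakeup', 'ApplyLipstick', 'Archery']
acc = batch_accuracy(
    [0, 2, 1, 1],
    ['ApplyEyeMakeup', 'Archery', 'Archery', 'ApplyLipstick'],
    class_names,
)
print(acc)
```

Here three of the four predictions map to the right class name, so the batch accuracy is 0.75.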

IV. Full code


import os
import random

import numpy as np
import pandas as pd
import torch
from skimage import io
from skimage.transform import resize


class UCF101:
    def __init__(self,mode='train'):
        self.videos_path='/home/hl/Desktop/lovelyqian/CV_Learning/UCF101_jpg'
        self.csv_dir_path='/home/hl/Desktop/lovelyqian/CV_Learning/UCF101_TrainTestlist/'
        self.label_csv_path = os.path.join(self.csv_dir_path, 'classInd.txt')
        # self.batch_size=128
        self.batch_size=8
        self.mode= mode

        self.get_train()
        self.get_test()

        
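    def get_className(self):
        # classInd.txt lines look like "1 ApplyEyeMakeup": column 0 is the index, column 1 the class name
        data = pd.read_csv(self.label_csv_path, delimiter=' ', header=None)
        # .iloc replaces the .ix indexer, which was removed from pandas
        labels = []
        for i in range(data.shape[0]):
            labels.append(data.iloc[i, 1])
        return labels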

    def get_train(self):
        train_x_path = []
        train_y = []
        # note: this reads all three official splits (trainlist01-03)
        for index in range(1,4):
            tmp_path='trainlist0'+str(index)+'.txt'
            train_csv_path = os.path.join(self.csv_dir_path, tmp_path)

            # each line: "<class>/<video>.avi <1-based class index>"
            data = pd.read_csv(train_csv_path, delimiter=' ', header=None)
            for i in range(data.shape[0]):
                train_x_path.append(data.iloc[i,0])
                # shift the class index to a 0-based label (0-100)
                train_y.append(data.iloc[i,1]-1)

        self.train_num=len(train_x_path)
        self.train_x_path=train_x_path
        self.train_y=train_y
        return train_x_path,train_y


    def get_test(self):
        test_x_path=[]
        test_y_label=[]
        for index in range(1,4):
            temp_path='testlist0'+str(index)+'.txt'
            test_csv_path=os.path.join(self.csv_dir_path,temp_path)

            # each line holds only "<class>/<video>.avi"; the label comes from the path
            data=pd.read_csv(test_csv_path,delimiter=' ',header=None)
            for i in range(data.shape[0]):
                test_x_path.append(data.iloc[i,0])
                label=self.get_label(data.iloc[i,0])
                test_y_label.append(label)
        self.test_num=len(test_x_path)
        self.test_x_path=test_x_path
        self.test_y_label=test_y_label
        return test_x_path,test_y_label


    def get_label(self,video_path):
        slash_rows = video_path.split('/')
        class_name = slash_rows[0]
        return class_name
    

    def get_single_image(self,image_path):
        image=resize(io.imread(image_path),output_shape=(160,160),preserve_range= True)    #240,320,3--160,160,3
        # io.imshow(image.astype(np.uint8))
        # io.show()
        image =image.transpose(2, 0, 1)              #3,160,160
        return torch.from_numpy(image)               #range[0,255]

    def get_single_video_x(self,train_x_path):
        # "Class/v_xxx.avi" -> "Class/v_xxx", the folder holding this video's frames
        slash_rows=train_x_path.split('.')
        dir_name=slash_rows[0]
        video_jpgs_path=os.path.join(self.videos_path,dir_name)
        # read the frame count written during preprocessing
        data=pd.read_csv(os.path.join(video_jpgs_path,'n_frames'),delimiter=' ',header=None)
        frame_count=int(data[0][0])
        train_x=torch.Tensor(16,3,160,160)

        # pick a random start for 16 consecutive frames; random.randint is inclusive at both ends,
        # so frame_count-17 leaves a small margin at the end and assumes frame_count >= 18
        image_start=random.randint(1,frame_count-17)
        image_id=image_start
        for i in range(16):
            s="%05d" % image_id
            image_name='image_'+s+'.jpg'
            image_path=os.path.join(video_jpgs_path,image_name)
            single_image=self.get_single_image(image_path)
            train_x[i,:,:,:]=single_image
            image_id+=1
        return train_x

    
    def get_minibatches_index(self, shuffle=True):
        """
        Build the list of minibatches (arrays of video indices) for the current mode.
        Uses self.train_num or self.test_num as the number of samples and self.batch_size as the batch size.
        :param shuffle: shuffle the data
        :return: None; the result is stored in self.minibatches_train or self.minibatches_test
        """
        if self.mode=='train':
            n=self.train_num
        elif self.mode=='test':
            n=self.test_num

        minibatch_size=self.batch_size

        index_list = np.arange(n, dtype="int32")

        # shuffle
        if shuffle:
            np.random.shuffle(index_list)

        # segment
        minibatches = []
        minibatch_start = 0
        for i in range(n // minibatch_size):
            minibatches.append(index_list[minibatch_start:minibatch_start + minibatch_size])
            minibatch_start += minibatch_size

        # processing the last batch
        if (minibatch_start != n):
            minibatches.append(index_list[minibatch_start:])

        if self.mode=='train':
            self.minibatches_train=minibatches
        elif self.mode=='test':
            self.minibatches_test=minibatches
        return


    
    def __getitem__(self, index):
        if self.mode=='train':
            batches=self.minibatches_train[index]
            N=batches.shape[0]
            train_x=torch.Tensor(N,16,3,160,160)
            train_y=torch.Tensor(N)
            for i in range (N):
                tmp_index=batches[i]
                tmp_video_path=self.train_x_path[tmp_index]
                tmp_train_x= self.get_single_video_x(tmp_video_path)
                tmp_train_y=self.train_y[tmp_index]
                train_x[i,:,:,:]=tmp_train_x
                train_y[i]=tmp_train_y
            train_x=train_x.permute(0,2,1,3,4)
            return train_x,train_y
        elif self.mode=='test':
            batches=self.minibatches_test[index]
            N=batches.shape[0]
            test_x=torch.Tensor(N,16,3,160,160)
            test_y_label=[]
            for i in range (N):
                tmp_index=batches[i]
                tmp_video_path=self.test_x_path[tmp_index]
                tmp_test_x= self.get_single_video_x(tmp_video_path)
                tmp_test_y=self.test_y_label[tmp_index]
                test_x[i,:,:,:]=tmp_test_x
                test_y_label.append(tmp_test_y)
            test_x=test_x.permute(0,2,1,3,4)
            return test_x,test_y_label
    
    def set_mode(self,mode):
        self.mode=mode
        # rebuild the minibatches and return how many there are,
        # including the last, possibly smaller, batch
        if mode=='train':
            self.get_minibatches_index()
            return len(self.minibatches_train)
        elif mode=='test':
            self.get_minibatches_index()
            return len(self.minibatches_test)





##  usage

if __name__=="__main__":
    myUCF101=UCF101()

    className=myUCF101.get_className()

    # train
    batch_num=myUCF101.set_mode('train')
    for batch_index in range(batch_num):
        train_x,train_y=myUCF101[batch_index]
        print (train_x,train_y)
        print ("train batch:",batch_index)

    # test
    batch_num=myUCF101.set_mode('test')
    for batch_index in range(batch_num):
        test_x,test_y_label=myUCF101[batch_index]
        print (test_x,test_y_label)
        print ("test batch: " ,batch_index)

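V. Closing remarks

I can't believe it did not occur to me to use the functions PyTorch provides for this; still too young, hahaha~

Still, it was a chance to sort out the whole approach to processing a dataset, which can't hurt.

Next time I'll write a version that uses the PyTorch framework's functions.

If you have questions, feel free to message me. Thanks!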

References

UCF dataset official website
two-stream-action-recognition
PyTorch (4): processing video data
Understand Python's special class method __getitem__ in seconds
