*The background for Lecture 2 is the fixed-length character recognition approach*

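0. Common Python modules for CV

Image reading and processing packages

Package	Features	Official site (Chinese)
`Pillow`	Provides common image reading and processing operations; integrates seamlessly with IPython notebooks	https://pillow.readthedocs.io/en/stable/
`OpenCV`	Offers a wide range of computer vision, digital image processing, and machine vision functionality; far more powerful than Pillow, but with a steeper learning curve	http://www.opencv.org.cn/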
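
As a quick illustration of reading an image with Pillow, the sketch below round-trips a synthetic image through an in-memory PNG instead of a real file (the image size and color are made up for the example) and then converts it to a NumPy array. Note the differing conventions: Pillow reports size as (width, height), while the array is laid out as (height, width, channels).

```python
import io

import numpy as np
from PIL import Image

# Build a synthetic 128x64 (width x height) RGB image; a real pipeline
# would call Image.open on a file path instead.
original = Image.new("RGB", (128, 64), color=(255, 0, 0))

# Save it to an in-memory PNG so Image.open can be used without a file on disk.
buffer = io.BytesIO()
original.save(buffer, format="PNG")
buffer.seek(0)

img = Image.open(buffer).convert("RGB")
pixels = np.asarray(img)  # array laid out as (height, width, channels)

print(img.size)      # Pillow convention: (width, height)
print(pixels.shape)  # NumPy convention: (height, width, channels)
```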

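Data augmentation packages

Package	Features	Official site
`torchvision`	Integrates with torch; provides basic data augmentation methods, though relatively few; medium speed	https://github.com/pytorch/vision
`imgaug`	Provides a wide variety of data augmentation methods that are easy to combine; fairly fast	https://github.com/aleju/imgaug
`albumentations`	Provides a wide variety of data augmentation methods, with support for image classification, semantic segmentation, object detection, and keypoint detection; fairly fast	https://albumentations.readthedocs.io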

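Data augmentation derives additional samples from a single original sample, as if viewing it from different angles. Transformations are typically applied to an image's color, size, shape, spatial layout, or pixel values, either individually or in combination. Be aware that some transformations can change a sample's correct label, such as $6\to 9$.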
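
The label-changing risk can be shown with a toy example. The sketch below uses hand-drawn 5x3 bitmaps (made up for illustration, not taken from any dataset) to show that rotating a 6 by 180 degrees yields exactly the 9 glyph, so an aggressive rotation would silently corrupt the label.

```python
import numpy as np

# Toy 5x3 bitmaps of the digits 6 and 9 (1 = ink, 0 = background).
SIX = np.array([
    [1, 1, 1],
    [1, 0, 0],
    [1, 1, 1],
    [1, 0, 1],
    [1, 1, 1],
])
NINE = np.array([
    [1, 1, 1],
    [1, 0, 1],
    [1, 1, 1],
    [0, 0, 1],
    [1, 1, 1],
])

rotated = np.rot90(SIX, k=2)  # rotate by 180 degrees
flipped = np.fliplr(SIX)      # horizontal (left-right) flip

# The rotated 6 is pixel-identical to the 9 glyph.
print(np.array_equal(rotated, NINE))
# The flipped 6 no longer matches the original 6 glyph.
print(np.array_equal(flipped, SIX))
```

This is one reason to keep rotations small and to avoid flips for digit recognition.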

Common methods:

transforms.CenterCrop crops the center of the image

transforms.ColorJitter randomly changes the contrast, saturation, and brightness of the image

transforms.FiveCrop crops the four corners and the center of the image, producing five images

transforms.Grayscale converts the image to grayscale

transforms.Pad pads the image borders with a fixed value

transforms.RandomAffine applies a random affine transformation

transforms.RandomCrop crops a random region

transforms.RandomHorizontalFlip applies a random horizontal flip

transforms.RandomRotation applies a random rotation

transforms.RandomVerticalFlip applies a random vertical flip

1. Usage

Taking torch as an example, data loading mainly involves customizing two classes:

Dataset: wraps the dataset and provides index-based access to individual samples
DataLoader: wraps a Dataset and provides batched iteration over it

import numpy as np
import torch
from PIL import Image
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms

class SVHNDataset(Dataset):
    def __init__(self, img_path, img_label, transform=None):
        self.img_path = img_path  # paths of all images, e.g. [/home/username/database/train/0000.jpg,...]
        self.img_label = img_label  # labels of all images, e.g. [1,0,1,1,0,...]; in this task each element is itself a list
        self.transform = transform  # optional augmentation pipeline; None disables it

    def __getitem__(self, index):  # the key method to override: fetches a single sample by index
        img = Image.open(self.img_path[index]).convert('RGB')

        if self.transform is not None:
            img = self.transform(img)

        # np.int was removed in NumPy 1.24, so use an explicit integer dtype
        lbl = np.array(self.img_label[index], dtype=np.int64)
        lbl = list(lbl) + (5 - len(lbl)) * [10]  # pad to 5 entries with the filler value 10

        return img, torch.from_numpy(np.array(lbl[:5]))

    def __len__(self):
        return len(self.img_path)

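Once this wrapper is in place, the index of the data stored in the directory is packaged up, individual samples can be read by index, and augmentation is applied at read time (each sample is loaded and augmented on demand).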
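
The label handling inside `__getitem__` can be looked at in isolation. The sketch below reproduces the pad-then-truncate step in plain Python; `pad_label`, `MAX_LEN`, and `BLANK` are hypothetical names introduced only for this illustration.

```python
MAX_LEN = 5  # fixed number of label slots, matching the 5 used in __getitem__
BLANK = 10   # filler value appended to short labels, matching the 10 used in __getitem__

def pad_label(digits, max_len=MAX_LEN, blank=BLANK):
    """Pad (or truncate) a list of digits to exactly max_len entries."""
    # A negative repeat count yields an empty list, so long labels are not padded.
    padded = list(digits) + [blank] * (max_len - len(digits))
    return padded[:max_len]

print(pad_label([1, 9]))              # short label: filled up with BLANK
print(pad_label([2, 0, 4, 8, 1, 7]))  # long label: cut to the first MAX_LEN digits
```

Because every label ends up with the same length, the labels in a batch can be stacked into a single tensor.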

train_loader = DataLoader(
        SVHNDataset(train_path, train_label,
                   transforms.Compose([
                       transforms.Resize((64, 128)), # resize to 64x128 (height x width)
                       transforms.ColorJitter(0.3, 0.3, 0.2), # random brightness, contrast, and saturation jitter
                       transforms.RandomRotation(5), # random rotation of up to 5 degrees in either direction
                       transforms.ToTensor(), # convert the image to a PyTorch tensor
                       transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]) # normalize pixel values per channel
            ])),
    batch_size=10, # number of samples per batch
    shuffle=False, # whether to shuffle the sample order
    num_workers=10, # number of worker subprocesses used for loading
)

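Once the DataLoader is added, data is fetched batch by batch; for each batch, the Dataset is called to read individual samples, which are then stacked together. Each batch then has the shapes:
torch.Size([10, 3, 64, 128]), torch.Size([10, 5])
The former is the image batch, ordered as batch size × channels × height × width; the latter holds the character labels. Because `__getitem__` pads every label to a fixed length of 5, the labels form a 10×5 matrix.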
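
The batch shapes can be checked without the SVHN data. The sketch below swaps in `ToyDataset`, a hypothetical stand-in that returns random image tensors and a fixed label padded to length 5 as in `SVHNDataset.__getitem__`; the sample count of 25 is an arbitrary choice, and `num_workers=0` keeps loading in the main process.

```python
import torch
from torch.utils.data import DataLoader, Dataset

class ToyDataset(Dataset):
    """Hypothetical stand-in for SVHNDataset that returns synthetic tensors."""

    def __init__(self, n_samples=25):
        self.n_samples = n_samples

    def __getitem__(self, index):
        img = torch.rand(3, 64, 128)            # channels x height x width
        lbl = torch.tensor([1, 9, 10, 10, 10])  # two digits padded with 10 to length 5
        return img, lbl

    def __len__(self):
        return self.n_samples

loader = DataLoader(ToyDataset(), batch_size=10, shuffle=False, num_workers=0)

# Collect the shape of every batch the loader yields.
shapes = [(imgs.shape, lbls.shape) for imgs, lbls in loader]
for img_shape, lbl_shape in shapes:
    print(img_shape, lbl_shape)
```

With 25 samples and a batch size of 10, the final batch holds only 5 samples; passing `drop_last=True` to `DataLoader` would discard it instead.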

Task02: Data reading and data augmentation
