For my research, I'm starting this post to summarize the datasets I have come across while reading papers, so that they are easy to find again later.
Classification + detection datasets
ImageNet
ImageNet needs no introduction. Here is the official description:
What is ImageNet?
ImageNet is an image dataset organized according to the WordNet hierarchy. Each meaningful concept in WordNet, possibly described by multiple words or word phrases, is called a "synonym set" or "synset". There are more than 100,000 synsets in WordNet, majority of them are nouns (80,000+). In ImageNet, we aim to provide on average 1000 images to illustrate each synset. Images of each concept are quality-controlled and human-annotated. In its completion, we hope ImageNet will offer tens of millions of cleanly sorted images for most of the concepts in the WordNet hierarchy.
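In short: ImageNet is an image dataset organized according to the WordNet hierarchy. Every meaningful concept in WordNet, whether described by one word or by several, is called a "synset". WordNet contains more than 100,000 synsets (concepts), of which more than 80,000 are nouns. ImageNet aims to provide about 1,000 images on average for each concept, all quality-controlled and human-annotated. Once complete, it is expected to offer tens of millions of cleanly sorted images.
The associated competition is ILSVRC (ImageNet Large Scale Visual Recognition Challenge).
Classification datasets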
MNIST
MNIST, a dataset for recognizing handwritten digits, is one of the works of the renowned Yann LeCun. Overview:
The MNIST database of handwritten digits, available from this page, has a training set of 60,000 examples, and a test set of 10,000 examples. It is a subset of a larger set available from NIST. The digits have been size-normalized and centered in a fixed-size image.
It is a good database for people who want to try learning techniques and pattern recognition methods on real-world data while spending minimal efforts on preprocessing and formatting.
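In short: MNIST is a dataset of handwritten digits (written by people). It has a training set of 60,000 examples and a test set of 10,000 examples, and it is a subset of the larger NIST collection. The digits have been size-normalized and centered in a fixed-size image.
It lets researchers try out learning techniques and pattern recognition methods with very little preprocessing effort.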
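As a rough illustration of how little preprocessing MNIST needs, here is a minimal sketch of a reader for the IDX container in which the MNIST files are distributed. The post itself does not describe the file format; the header layout used here (two zero bytes, a data-type byte, a dimension-count byte, then big-endian 32-bit sizes) is my understanding of the MNIST distribution, and `parse_idx` is a hypothetical helper name, not part of any library.

```python
import struct


def parse_idx(raw):
    """Parse the bytes of an IDX file into (dims, payload).

    Assumption: the file stores unsigned bytes (type code 0x08), which is
    the case for the MNIST image and label files.
    """
    # Header: 2 zero bytes, 1 byte data type, 1 byte number of dimensions.
    zero, dtype_code, ndim = struct.unpack(">HBB", raw[:4])
    if zero != 0 or dtype_code != 0x08:
        raise ValueError("not an unsigned-byte IDX file")

    # Each dimension size is a big-endian unsigned 32-bit integer.
    dims = struct.unpack(">" + "I" * ndim, raw[4:4 + 4 * ndim])
    payload = raw[4 + 4 * ndim:]

    # Sanity check: the payload must hold exactly prod(dims) bytes.
    expected = 1
    for size in dims:
        expected *= size
    if len(payload) != expected:
        raise ValueError("payload size does not match header")
    return dims, payload
```

For the training images one would expect `dims` to be `(60000, rows, cols)` and for the training labels `(60000,)`. The downloaded files are gzip-compressed, so something like `gzip.open(path, "rb").read()` would produce the `raw` bytes.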
CIFAR
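CIFAR is a dataset maintained by the Department of Computer Science at the University of Toronto. The name stands for Canadian Institute for Advanced Research. It consists of labeled images and is used to measure the classification error rate of algorithms. Since it comes from the University of Toronto, it is no surprise that Hinton is involved in maintaining it. CIFAR comes in two versions, CIFAR-10 and CIFAR-100; the difference is simply 10 classes versus 100 classes.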
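CIFAR-10 consists of 60,000 32x32 color images in 10 classes, with 6,000 images per class. There are 50,000 training images and 10,000 test images.
The dataset is divided into 5 training batches and 1 test batch, each with 10,000 images. The test batch contains exactly 1,000 randomly selected images from each class. The training batches contain the remaining images in random order, so an individual training batch may contain more images from one class than from another. Taken together, the five training batches contain exactly 5,000 images from each class.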
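The batch layout described above can be checked with a small sketch. It assumes the Python version of CIFAR-10, where each batch is a pickled dict whose `b"labels"` entry is a list of class indices and whose `b"data"` rows hold 3,072 bytes per image (1,024 red, then 1,024 green, then 1,024 blue values, row by row). The file names in the usage note and the helper names `load_batch`, `class_histogram` and `pixel_at` are my own assumptions for illustration.

```python
import pickle
from collections import Counter


def load_batch(path):
    """Load one CIFAR-10 batch file (assumed: Python-version pickle)."""
    with open(path, "rb") as f:
        # encoding="bytes" keeps the dict keys as bytes, e.g. b"labels".
        return pickle.load(f, encoding="bytes")


def class_histogram(batches):
    """Count how many images of each class appear across the batches."""
    counts = Counter()
    for batch in batches:
        counts.update(batch[b"labels"])
    return counts


def pixel_at(row, x, y):
    """Return the (r, g, b) value at column x, line y of one image row.

    Assumption: the 3072 values are stored channel by channel, and each
    channel is a 32x32 plane in row-major order.
    """
    offset = y * 32 + x
    return row[offset], row[1024 + offset], row[2048 + offset]
```

With the real files, `class_histogram(load_batch(f"data_batch_{i}") for i in range(1, 6))` should report 5,000 images for every class, even though a single batch on its own may be unbalanced.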
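The classes are mutually exclusive, with no overlap. For example, two of the classes are automobile and truck. Automobile includes sedans, SUVs and the like. Truck includes only big trucks. What about pickup trucks, you ask? Neither class contains them.
CIFAR-100 is much the same, except that it has ten times as many classes and a different number of images per class (600 instead of 6,000). Look up the details when you need them.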
YFCC100
YFCC100 (full name YFCC100M, Yahoo Flickr Creative Commons 100 Million) is Yahoo's image/video classification dataset.
Detection datasets
PASCAL VOC 2007/2012
Visual Object Classes Challenge 2012 (VOC 2012) is an object recognition dataset hosted at the University of Oxford. Overview:
The main goal of this challenge is to recognize objects from a number of visual object classes in realistic scenes (i.e. not pre-segmented objects). It is fundamentally a supervised learning problem in that a training set of labelled images is provided. The twenty object classes that have been selected are:
Person: person
Animal: bird, cat, cow, dog, horse, sheep
Vehicle: aeroplane, bicycle, boat, bus, car, motorbike, train
Indoor: bottle, chair, dining table, potted plant, sofa, tv/monitor
There are three main object recognition competitions: classification, detection, and segmentation, a competition on action classification, and a competition on large scale recognition run by ImageNet. In addition there is a "taster" competition on person layout.
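In short: the main goal of VOC 2012 is to recognize objects in realistic scenes. It is fundamentally a supervised learning problem, since a labeled training set is provided. The 20 object classes are:
- Person: person
- Animal: bird, cat, cow, dog, horse, sheep
- Vehicle: aeroplane, bicycle, boat, bus, car, motorbike, train
- Indoor: bottle, chair, dining table, potted plant, sofa, tv/monitor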
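The twenty classes and their four groups listed above can be written down as a small lookup table, which is handy when mapping class names to integer labels. This is only a sketch: the alphabetical index assignment and the helper names `build_label_index` and `group_of` are my own choices for illustration, not the official VOC label encoding.

```python
# The 20 VOC classes, grouped as in the challenge description.
VOC_CLASSES = {
    "Person": ["person"],
    "Animal": ["bird", "cat", "cow", "dog", "horse", "sheep"],
    "Vehicle": ["aeroplane", "bicycle", "boat", "bus", "car",
                "motorbike", "train"],
    "Indoor": ["bottle", "chair", "dining table", "potted plant",
               "sofa", "tv/monitor"],
}


def build_label_index(groups):
    """Map each class name to an integer id (assumed: alphabetical order)."""
    names = sorted(name for members in groups.values() for name in members)
    return {name: idx for idx, name in enumerate(names)}


def group_of(groups, name):
    """Return the group (e.g. "Vehicle") that a class name belongs to."""
    for group, members in groups.items():
        if name in members:
            return group
    raise KeyError(name)
```

For example, `group_of(VOC_CLASSES, "boat")` returns `"Vehicle"`, and `build_label_index(VOC_CLASSES)` gives one id per class, 20 in total.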
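The challenge consists of the following competitions:
- Three main object recognition competitions: classification, detection, and segmentation
- Action classification
- Large-scale recognition (run by ImageNet)
- In addition, a "taster" competition: person layout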
COCO
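COCO is a new image recognition, segmentation, and captioning dataset. Its images come with segmentations already annotated, so what matters is how low your algorithm's segmentation error is. The associated competition is the COCO 2016 Detection and Keypoint Challenges.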
KITTI
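KITTI Vision Benchmark Suite, for benchmarking autonomous driving. The images in this suite are street scenes captured from a moving car in the city of Karlsruhe, and all of them are labeled. The road-detection part is fairly small, with only 289 training images.
Some of the road labels include: highway, minor road
Segmentation datasets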
CityScapes Dataset
The CityScapes dataset targets semantic understanding of urban street scenes (to me it feels like object recognition within city street scenes). Features:
Type of annotations
- Semantic
- Instance-wise
- Dense pixel annotations
Complexity
- 30 classes
- See Class Definitions for a list of all classes and have a look at the applied labeling policy.
Diversity
- 50 cities
- Several months (spring, summer, fall)
- Daytime
- Good/medium weather conditions
- Manually selected frames
- Large number of dynamic objects
- Varying scene layout
- Varying background
Volume
- 5,000 annotated images with fine annotations (examples)
- 20,000 annotated images with coarse annotations (examples)