Stacked Denoising and Stacked Convolutional Autoencoders

An Evaluation of Transformation Robustness for Spatial Data Representations


This post collects some notes on the work on stacked denoising autoencoders (SDA) and stacked convolutional autoencoders (SCA) from TUM: https://mediatum.ub.tum.de/doc/1381852/54858742554.pdf

As I read it, the authors reach the following conclusions.

** SCA yield image features that are more useful for one specific purpose, namely image classification with a subsequent SVM classifier.

** SCA features exhibit a higher degree of invariance to input transformations than the representations generated by an SDA.

** The likely reason for this result is that SCA preserve and exploit the spatial structure of the input data, and also come with pooling, which by design leads to a degree of invariance.
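To make the pooling point concrete, here is a minimal sketch in plain Python. It assumes non-overlapping 2x2 max pooling and a toy 4x4 image; the function name `max_pool_2x2` and the example images are my own illustration, not taken from the thesis. It shows that a one-pixel shift inside a pooling window leaves the pooled output unchanged, while a larger shift does not.

```python
def max_pool_2x2(image):
    """Non-overlapping 2x2 max pooling over a list-of-lists image with even side lengths."""
    rows, cols = len(image), len(image[0])
    return [
        [
            max(image[r][c], image[r][c + 1], image[r + 1][c], image[r + 1][c + 1])
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


# Toy 4x4 image with a single bright pixel at row 0, column 0.
original = [
    [9, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

# Same pixel shifted one step right: it stays inside the same 2x2 window.
shifted_within = [
    [0, 9, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

# Shifted two steps right: it crosses into the neighbouring window.
shifted_across = [
    [0, 0, 9, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

pooled_original = max_pool_2x2(original)
pooled_within = max_pool_2x2(shifted_within)
pooled_across = max_pool_2x2(shifted_across)

print(pooled_original == pooled_within)  # small shift: pooled output is unchanged
print(pooled_original == pooled_across)  # larger shift: pooled output changes
```

So pooling only buys invariance to shifts that are small relative to the pooling window, which is why the notes speak of "a degree of" invariance.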


1 Introduction

Many of today's CV tasks rely on a standard procedure of feature extraction and representation. Once you have a good overall feature space, you can use image embeddings much like embeddings in NLP problems, namely as a component of many unsupervised and reinforcement learning tasks.

However, feature representation in CV comes with many problems, such as varying light sources, object shapes, and complex object surfaces.

A good example of a traditional method is PCA. While PCA has frequently been used for feature extraction, its capabilities are limited because it cannot perform nonlinear transformations. Useful features are generally highly nonlinear functions of the raw input.

The remedy is the autoencoder, a "nonlinear generalization of PCA".
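The "nonlinear generalization" idea can be illustrated with a tiny forward pass. This is only a sketch under my own assumptions: the weights `W` are hand-picked rather than learned, the decoder reuses the transposed encoder weights, and the helper names (`matvec`, `autoencode`) are hypothetical. With an identity activation the encoder is a linear projection, as in PCA; swapping in `tanh` makes the code a nonlinear function of the input.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def autoencode(x, weights, activation):
    """Encode x into a smaller code, then decode with tied (transposed) weights."""
    code = [activation(h) for h in matvec(weights, x)]
    reconstruction = matvec(transpose(weights), code)
    return code, reconstruction


def identity(h):
    return h


# Hypothetical hand-picked weights: one hidden unit along the unit direction (0.6, 0.8).
W = [[0.6, 0.8]]
x = [3.0, 4.0]
x_doubled = [6.0, 8.0]

# Linear encoder: doubling the input doubles the code, as in a PCA projection.
linear_code, linear_reconstruction = autoencode(x, W, identity)
linear_code_doubled, _ = autoencode(x_doubled, W, identity)

# tanh encoder: the code saturates, so it is no longer a linear function of x.
tanh_code, _ = autoencode(x, W, math.tanh)
tanh_code_doubled, _ = autoencode(x_doubled, W, math.tanh)

print(linear_code, linear_code_doubled)
print(tanh_code, tanh_code_doubled)
```

A real autoencoder would learn `W` (and bias terms) by minimizing reconstruction error; the point here is only the difference between the linear and the nonlinear encoder.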

Masci et al. [3] combine CNNs with stacked autoencoders (SAEs) to construct stacked convolutional autoencoders (SCA) for feature extraction. The approach promises to include the characteristics of CNNs, which currently significantly outperform all other solutions in the realm of image processing. In addition, SCA can be trained in an unsupervised fashion and then be used either for extracting features or for initializing the parameters of a traditional CNN.
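As a rough picture of what one convolutional encoder stage does, the sketch below applies a small kernel to a 2-D image, then a ReLU, then 2x2 max pooling, all in plain Python. Everything specific here is an assumption for illustration: the hand-written edge kernel, the ReLU activation, the 5x5 toy image, and the function names. It is not the architecture from Masci et al. or the thesis; in an actual SCA the kernels would be learned by minimizing reconstruction error through a matching decoder.

```python
def conv2d_valid(image, kernel):
    """2-D cross-correlation without padding ('valid' mode)."""
    kh, kw = len(kernel), len(kernel[0])
    out_rows = len(image) - kh + 1
    out_cols = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_cols)
        ]
        for r in range(out_rows)
    ]


def relu(feature_map):
    return [[max(0, value) for value in row] for row in feature_map]


def max_pool_2x2(feature_map):
    """Non-overlapping 2x2 max pooling; assumes even side lengths."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            max(
                feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1],
            )
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


# Toy 5x5 image: dark on the left, bright on the right (a vertical edge).
image = [[0, 0, 1, 1, 1] for _ in range(5)]

# Hypothetical hand-written kernel that responds to dark-to-bright vertical edges.
edge_kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = relu(conv2d_valid(image, edge_kernel))  # 4x4 map, still laid out in 2-D
encoded = max_pool_2x2(feature_map)                   # 2x2 map after pooling

print(feature_map)
print(encoded)
```

The encoded map is still a 2-D grid whose active column marks where the edge sits. A fully connected autoencoder layer, as used in an SDA, would instead take the 25 pixels as a flat vector with no notion of which pixels are neighbours.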

Figure: Common autoencoder approaches: (a) triangular shape, (b) rectangular shape.

Figure: SCA architecture.

Conclusion


Because SCA are able to exploit spatial relations and come with max pooling layers that naturally increase transformation invariance, they may be a better choice than SDA for representation learning in the realm of visual data.
