2016-05-28 Today's Collection

(videos)"International Conference on Learning Representations (ICLR) 2016, San Juan | VideoLectures"

Deep Structured Energy Based Models for Anomaly Detection
In this paper, we attack the anomaly detection problem by directly modeling the data distribution with deep architectures. We propose deep structured energy based models (DSEBMs), where the energy function is the output of a deterministic deep neural network with structure. We develop novel model architectures to integrate EBMs with different types of data such as static data, sequential data, and spatial data, and apply appropriate model architectures to adapt to the data structure. Our training algorithm is built upon the recent development of score matching \cite{sm}, which connects an EBM with a regularized autoencoder, eliminating the need for complicated sampling methods. A statistically sound decision criterion for anomaly detection can be derived from the perspective of the energy landscape of the data distribution. We investigate two decision criteria for performing anomaly detection: the energy score and the reconstruction error. Extensive empirical studies on benchmark tasks demonstrate that our proposed model consistently matches or outperforms all the competing methods.
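
For readers unfamiliar with score matching, the standard objective can be written as below. This is a sketch of the general technique, not the paper's exact training loss; the symbols E, Z, psi and J are assumed notation that does not appear in the abstract. The point to notice is that the partition function drops out, which is why no sampling is needed.

```latex
% Assumed notation (not from the abstract): energy E(x;\theta), partition function Z(\theta).
p(x;\theta) = \frac{\exp\{-E(x;\theta)\}}{Z(\theta)}, \qquad
\psi(x;\theta) = \nabla_x \log p(x;\theta) = -\nabla_x E(x;\theta)

% Score matching objective, up to an additive constant that does not depend on \theta.
% Z(\theta) does not appear because it vanishes under \nabla_x.
J(\theta) = \mathbb{E}_{x \sim p_{\mathrm{data}}}\left[
  \tfrac{1}{2}\,\lVert \nabla_x E(x;\theta) \rVert_2^2
  - \sum_{i} \frac{\partial^2 E(x;\theta)}{\partial x_i^2}
\right]
```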

Simultaneous Sparse Dictionary Learning and Pruning

How priors of initial hyperparameters affect Gaussian process regression models

Bayesian Variable Selection and Estimation Based on Global-Local Shrinkage Priors

Information Matrix Splitting
Efficient statistical estimation via the maximum likelihood method requires the observed information, the negative of the Hessian of the underlying log-likelihood function. Computing the observed information is expensive; therefore, the expected information matrix (the Fisher information matrix) is often preferred due to its simplicity. In this paper, we prove that the average of the observed and Fisher information of the restricted/residual log-likelihood function for the linear mixed model can be split into two matrices. The expectation of one part is the Fisher information matrix, but that part has a simpler form than the Fisher information matrix. The other part, which involves a lot of computation, is a zero random matrix and thus is negligible. Leveraging such a splitting can simplify evaluation of the approximate Hessian of a log-likelihood function.
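
The quantities named in this abstract can be written out as follows. The symbols (ell for the log-likelihood, I_O, I, I_A, I_1, I_2) are assumed notation, and the explicit forms of the two parts for the linear mixed model are not given in the abstract, so they are left abstract here.

```latex
% Assumed notation: \ell(\theta) is the (restricted) log-likelihood.
\mathcal{I}_O(\theta) = -\frac{\partial^2 \ell(\theta)}{\partial\theta\,\partial\theta^{\top}}
\quad\text{(observed information)}, \qquad
\mathcal{I}(\theta) = \mathbb{E}\left[\mathcal{I}_O(\theta)\right]
\quad\text{(Fisher information)}

% Average information and the splitting described in the abstract:
\mathcal{I}_A(\theta) = \tfrac{1}{2}\left(\mathcal{I}_O(\theta) + \mathcal{I}(\theta)\right)
  = \mathcal{I}_1(\theta) + \mathcal{I}_2(\theta),
\qquad \mathbb{E}\left[\mathcal{I}_1(\theta)\right] = \mathcal{I}(\theta)

% Since \mathbb{E}[\mathcal{I}_A] = \mathcal{I}, it follows that \mathbb{E}[\mathcal{I}_2] = 0,
% which is one way to read "zero random matrix" (zero in expectation).
```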

A Concise Overview of Standard Model-fitting Methods
"A Concise Overview of Standard Model-fitting Methods - Fitting a model via closed-form equations vs. Gradient Descent vs Stochastic Gradient Descent vs Mini-Batch Learning. What is the difference?" by Sebastian Raschka
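
To make the distinction in the title concrete, here is a minimal sketch that fits the same linear regression in four ways. It is not taken from Raschka's article; the function names (`make_data`, `fit_closed_form`, `fit_gradient`), the synthetic data, and the learning rates are all assumptions chosen for illustration. The only difference between batch gradient descent, SGD, and mini-batch learning in this sketch is how many samples feed each gradient step.

```python
import numpy as np


def make_data(n=200, seed=1):
    """Synthetic data y = 2 + 3x + noise (hypothetical values for this demo)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, size=n)
    X = np.column_stack([np.ones(n), x])  # bias column plus one feature
    y = 2.0 + 3.0 * x + 0.1 * rng.standard_normal(n)
    return X, y


def fit_closed_form(X, y):
    """Least-squares solution computed directly, with no iterations."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w


def fit_gradient(X, y, lr=0.05, epochs=500, batch_size=None, seed=0):
    """Minimize mean squared error by gradient steps.

    batch_size=None -> batch gradient descent (all samples per step)
    batch_size=1    -> stochastic gradient descent (one sample per step)
    otherwise       -> mini-batch learning
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    size = n if batch_size is None else batch_size
    for _ in range(epochs):
        order = rng.permutation(n)  # reshuffle the samples every epoch
        for start in range(0, n, size):
            idx = order[start:start + size]
            residual = X[idx] @ w - y[idx]
            grad = 2.0 * X[idx].T @ residual / len(idx)  # gradient of the MSE
            w = w - lr * grad
    return w


if __name__ == "__main__":
    X, y = make_data()
    print("closed form:", fit_closed_form(X, y))
    print("batch GD:   ", fit_gradient(X, y))
    print("SGD:        ", fit_gradient(X, y, lr=0.01, epochs=50, batch_size=1))
    print("mini-batch: ", fit_gradient(X, y, epochs=200, batch_size=32))
```

With a constant learning rate, the stochastic variants hover near the least-squares solution rather than converging to it exactly, which is why SGD is usually paired with a decaying step size in practice.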

"KDD2016 - Accepted Papers"

《Qs - Deep Gaussian Processes》by Neil Lawrence

FLAG: Fast Linearly-Coupled Adaptive Gradient Method
The celebrated Nesterov’s accelerated gradient method offers great speed-ups compared to the classical gradient descent method as it attains the optimal first-order oracle complexity for smooth convex optimization. On the other hand, the popular AdaGrad algorithm competes with mirror descent under the best regularizer by adaptively scaling the gradient. Recently, it has been shown that the accelerated gradient descent can be viewed as a linear combination of gradient descent and mirror descent steps. Here, we draw upon these ideas and present a fast linearly-coupled adaptive gradient method (FLAG) as an accelerated version of AdaGrad, and show that our algorithm can indeed offer the best of both worlds. Like Nesterov’s accelerated algorithm and its proximal variant, FISTA, our method has a convergence rate of 1/T^2 after T iterations. Like AdaGrad, our method adaptively chooses a regularizer, in a way that performs almost as well as the best choice of regularizer in hindsight.
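
The two ingredients this abstract combines can be sketched side by side. The code below is not FLAG itself, whose update rule is not given in the abstract. It shows only plain gradient descent, Nesterov's accelerated method in its FISTA-style form, and diagonal AdaGrad on a toy quadratic. The function names, the test problem, and the step sizes are assumptions for illustration.

```python
import numpy as np


def quad(a):
    """Return f and its gradient for f(x) = 0.5 * sum(a * x**2), minimized at x = 0."""
    def f(x):
        return 0.5 * float(np.sum(a * x * x))

    def grad(x):
        return a * x

    return f, grad


def gradient_descent(grad, x0, L, steps):
    """Classical gradient descent with step size 1/L (L = smoothness constant)."""
    x = x0.copy()
    for _ in range(steps):
        x = x - grad(x) / L
    return x


def nesterov(grad, x0, L, steps):
    """Nesterov's accelerated gradient with FISTA-style momentum; O(1/T^2) rate."""
    x, y, t = x0.copy(), x0.copy(), 1.0
    for _ in range(steps):
        x_next = y - grad(y) / L                      # gradient step at the look-ahead point
        t_next = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        y = x_next + ((t - 1.0) / t_next) * (x_next - x)  # momentum extrapolation
        x, t = x_next, t_next
    return x


def adagrad(grad, x0, lr, steps, eps=1e-8):
    """Diagonal AdaGrad: each coordinate is scaled by its own gradient history."""
    x = x0.copy()
    g2 = np.zeros_like(x)
    for _ in range(steps):
        g = grad(x)
        g2 += g * g
        x = x - lr * g / (np.sqrt(g2) + eps)
    return x


if __name__ == "__main__":
    a = np.logspace(-3, 0, 10)        # badly conditioned, axis-aligned quadratic
    f, grad = quad(a)
    x0, T = np.ones(10), 300
    print("GD:      ", f(gradient_descent(grad, x0, 1.0, T)))
    print("Nesterov:", f(nesterov(grad, x0, 1.0, T)))
    print("AdaGrad: ", f(adagrad(grad, x0, 0.5, T)))
```

The axis-aligned quadratic is a deliberately favorable case for AdaGrad, since per-coordinate scaling exactly undoes the ill-conditioning. On problems whose curvature is not aligned with the coordinate axes the gap would be smaller.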

Cognitive Dynamic Systems: A Technical Review of Cognitive Radar
We start with the history of cognitive radar, where the origins of the Perception-Action Cycle (PAC), Fuster's research on cognition, and the principles of cognition are provided. Fuster describes five cognitive functions: perception, memory, attention, language, and intelligence. We describe the Perception-Action Cycle as it applies to cognitive radar, and then discuss long-term memory, memory storage, memory retrieval and working memory. A comparison between memory in human cognition and cognitive radar is given as well. Attention is another function described by Fuster, and we compare attention in human cognition and in cognitive radar. We discuss the four functional blocks of the PAC: Bayesian filter, feedback information, dynamic programming and state-space model for the radar environment. Then, to show that the PAC improves the tracking accuracy of cognitive radar over traditional active radar, we provide simulation results. In the simulation, three nonlinear filters are compared: the Cubature Kalman Filter (CKF), the Unscented Kalman Filter (UKF) and the Extended Kalman Filter (EKF). Based on the results, radars implemented with the CKF perform better than those implemented with the UKF or the EKF. Further, the radar with the EKF has the worst accuracy and the largest computational load because of the derivation and evaluation of Jacobian matrices. We suggest using the concept of risk management to better control parameters and improve performance in cognitive radar. We believe spectrum sensing is of potential interest for cognitive radar, and we propose a new approach, Probabilistic ICA, which will presumably reduce noise based on estimation error in cognitive radar. Parallel computing is based on a divide-and-conquer mechanism, and we suggest using it in cognitive radar to reduce the processing time of complicated calculations or tasks.

Predict or classify: The deceptive role of time-locking in brain signal classification
Several experimental studies claim to be able to predict the outcome of simple decisions from brain signals measured before subjects are aware of their decision. Often, these studies use multivariate pattern recognition methods with the underlying assumption that the ability to classify the brain signal is equivalent to predicting the decision itself. Here we show instead that it is possible to correctly classify a signal even if it does not contain any predictive information about the decision. We first define a simple stochastic model that mimics the random decision process between two equivalent alternatives, and generate a large number of independent trials that contain no choice-predictive information. The trials are first time-locked to the time point of the final event and then classified using standard machine-learning techniques. The resulting classification accuracy is above chance level long before the time point of time-locking. We then analyze the same trials using information theory. We demonstrate that the high classification accuracy is a consequence of time-locking and that its time behavior is simply related to the large relaxation time of the process. We conclude that when time-locking is a crucial step in the analysis of neuronal activity patterns, both the emergence and the timing of the classification accuracy are affected by structural properties of the network that generates the signal.

Discrete Deep Feature Extraction: A Theory and New Architectures
First steps towards a mathematical theory of deep convolutional neural networks for feature extraction were made (for the continuous-time case) in Mallat, 2012, and Wiatowski and Bölcskei, 2015. This paper considers the discrete case, introduces new convolutional neural network architectures, and proposes a mathematical framework for their analysis. Specifically, we establish deformation and translation sensitivity results of local and global nature, and we investigate how certain structural properties of the input signal are reflected in the corresponding feature vectors. Our theory applies to general filters and general Lipschitz-continuous non-linearities and pooling operators. Experiments on handwritten digit classification and facial landmark detection, including feature importance evaluation, complement the theoretical findings.
