A new R package for detecting unusual time series

The anomalous package provides some tools to detect unusual time series in a large collection of time series. This is joint work with Earo Wang (an honours student at Monash) and Nikolay Laptev (from Yahoo Labs). Yahoo is interested in detecting unusual patterns in server metrics.
The package is based on this paper with Earo and Nikolay.
The basic idea is to measure a range of features of the time series (such as strength of seasonality, an index of spikiness, first order autocorrelation, etc.). Then a principal component decomposition of the feature matrix is calculated, and outliers are identified in the two-dimensional space of the first two principal component scores.
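To make the feature-then-PCA step concrete, here is a minimal sketch in Python using only the standard library. It is not the package's R code. The three features (lag-1 autocorrelation, standard deviation, and largest absolute jump as a crude stand-in for spikiness) and all function names are assumptions chosen for illustration, and the principal components are found by simple power iteration rather than a proper eigen-solver.

```python
import math
import random

def lag1_autocorrelation(x):
    """First order autocorrelation of a series."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    if denom == 0.0:
        return 0.0
    num = sum((x[t] - mean) * (x[t - 1] - mean) for t in range(1, n))
    return num / denom

def simple_features(x):
    """Illustrative features only; not the package's definitions."""
    n = len(x)
    mean = sum(x) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / (n - 1))
    max_jump = max(abs(x[t] - x[t - 1]) for t in range(1, n))
    return [lag1_autocorrelation(x), sd, max_jump]

def standardize_columns(rows):
    """Centre each feature column and scale it to unit variance."""
    n, p = len(rows), len(rows[0])
    out = [[0.0] * p for _ in range(n)]
    for j in range(p):
        col = [r[j] for r in rows]
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
        scale = sd if sd > 0 else 1.0  # guard against constant features
        for i in range(n):
            out[i][j] = (rows[i][j] - mean) / scale
    return out

def covariance(z):
    """Covariance matrix of already-centred rows."""
    n, p = len(z), len(z[0])
    return [[sum(z[i][a] * z[i][b] for i in range(n)) / (n - 1)
             for b in range(p)] for a in range(p)]

def top_eigenvector(c, iterations=1000):
    """Leading eigenpair of a symmetric matrix by power iteration."""
    p = len(c)
    start = [float(j + 1) for j in range(p)]
    norm = math.sqrt(sum(s * s for s in start))
    v = [s / norm for s in start]
    for _ in range(iterations):
        w = [sum(c[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(s * s for s in w))
        if norm == 0.0:
            break
        v = [s / norm for s in w]
    eigenvalue = sum(v[a] * sum(c[a][b] * v[b] for b in range(p))
                     for a in range(p))
    return eigenvalue, v

def first_two_pc_scores(feature_rows):
    """Scores of each series on the first two principal components."""
    z = standardize_columns(feature_rows)
    c = covariance(z)
    p = len(c)
    lam1, v1 = top_eigenvector(c)
    # Deflate: remove the first component, then find the next one.
    deflated = [[c[a][b] - lam1 * v1[a] * v1[b] for b in range(p)]
                for a in range(p)]
    _, v2 = top_eigenvector(deflated)
    return [[sum(r[j] * v1[j] for j in range(p)),
             sum(r[j] * v2[j] for j in range(p))] for r in z]

# Hypothetical data: 20 white-noise series, one with an injected spike.
random.seed(1)
series = [[random.gauss(0, 1) for _ in range(100)] for _ in range(20)]
series[0][50] += 15.0
scores = first_two_pc_scores([simple_features(s) for s in series])
```

Each row of `scores` is one series placed in the two-dimensional space in which the outlier search is then carried out. In R, the decomposition itself would normally be done with `prcomp` rather than by hand.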
We use two methods to identify outliers.

1. A bivariate kernel density estimate of the first two PC scores is computed, and the points are ordered based on the value of the density at each observation. This gives us a ranking of most outlying (least density) to least outlying (highest density).
2. A series of $\alpha$-convex hulls are computed on the first two PC scores with decreasing $\alpha$, and points are classified as outliers when they become singletons separated from the main hull. This gives us an alternative ranking with the most outlying having separated at the highest value of $\alpha$, and the remaining outliers with decreasing values of $\alpha$.
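The density-based ranking (the first of the two methods above) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the package's implementation: it uses a product Gaussian kernel with a rule-of-thumb bandwidth (Scott's rule, standard deviation times $n^{-1/6}$ in two dimensions), and the function names are made up for this example. The package may well choose its bandwidth differently. The $\alpha$-convex hull method is not covered here.

```python
import math

def kde_density_at_points(points):
    """Bivariate Gaussian kernel density estimate evaluated at each point.

    Bandwidths follow Scott's rule per dimension (an assumption made
    for this sketch).
    """
    n = len(points)
    bandwidths = []
    for j in range(2):
        col = [p[j] for p in points]
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
        bandwidths.append((sd if sd > 0 else 1.0) * n ** (-1.0 / 6.0))
    hx, hy = bandwidths
    norm = 1.0 / (2.0 * math.pi * hx * hy * n)
    densities = []
    for (x0, y0) in points:
        total = 0.0
        for (x, y) in points:
            u = (x0 - x) / hx
            v = (y0 - y) / hy
            total += math.exp(-0.5 * (u * u + v * v))
        densities.append(norm * total)
    return densities

def rank_by_density(points):
    """Indices ordered from most outlying (lowest density) to least."""
    dens = kde_density_at_points(points)
    return sorted(range(len(points)), key=lambda i: dens[i])

# Hypothetical PC scores: a 3 x 3 grid of typical points plus one far away.
pc_scores = [(float(a), float(b)) for a in (-1, 0, 1) for b in (-1, 0, 1)]
pc_scores.append((10.0, 10.0))
ranking = rank_by_density(pc_scores)
```

Here `ranking[0]` is the index of the point sitting in the emptiest part of the plane, which is the isolated point at (10, 10).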

I explained the ideas in a talk last Tuesday given at a joint meeting of the Statistical Society of Australia and the Melbourne Data Science Meetup Group. Slides are available here. A link to a video of the talk will also be added there when it is ready.
The density ranking of PC scores was also used in my work on detecting outliers in functional data. See my 2010 JCGS paper and the associated rainbow package for R.
There are two versions of the package: one under an ACM licence, and a limited version under a GPL licence. Eventually we hope to make the GPL version contain everything, but we are currently dependent on the alphahull package which has an ACM licence.
