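Time Series Mining and Analysis: tsfresh Features Translated (Part 1)

tsfresh is a Python module for extracting features from time series data (documentation: https://tsfresh.readthedocs.io/en/latest/index.html; install with pip install tsfresh). The extracted features can be used to describe time series or to cluster them. They can also be used to build models that perform classification or regression tasks on time series. Often these features give new insight into a time series and its dynamics. The project covers 64 features in total: I translated and debugged the first 32, and my colleague Thomas translated the remaining 32. The project materials and an introduction are hosted on GitHub: https://github.com/SimaShanhe/tsfresh-feature-translation/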

abs_energy(x)

  • Name: absolute energy
  • Returns the absolute energy of the time series, i.e. the sum of its squared values

E=\sum_{i=1}^n x_i^2

  • Parameters: x (pandas.Series), the time series to compute the feature on
  • Returns: the absolute energy (float)
  • Function type: simple
  • Code example:
#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd

ts = pd.Series(x)  # assumes the data x has already been loaded
ae = tsf.feature_extraction.feature_calculators.abs_energy(ts)
  • Note: describes how strongly the series fluctuates around zero, measured by squared values (its energy)
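
To make the formula concrete, here is a minimal pure-Python sketch of the sum of squares. The name abs_energy_sketch is hypothetical and this is not tsfresh's implementation, only an illustration of E above.

```python
def abs_energy_sketch(values):
    """Sum of squared values, E = sum(x_i ** 2)."""
    return sum(v * v for v in values)


sample = [1.0, -2.0, 3.0]
print(abs_energy_sketch(sample))  # 1 + 4 + 9
```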

absolute_sum_of_changes(x)

  • Name: absolute sum of first-order differences
  • Returns the sum of the absolute values of the first-order differences of the time series

\sum_{i=1}^{n-1} |x_{i+1}-x_i|

  • Parameters: x (pandas.Series), the time series to compute the feature on
  • Returns: the absolute sum of changes (non-negative float)
  • Function type: simple
  • Code example:
#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd

ts = pd.Series(x)  # assumes the data x has already been loaded
ae = tsf.feature_extraction.feature_calculators.absolute_sum_of_changes(ts)
  • Note: describes the absolute fluctuation between adjacent observations
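
The sum above can be sketched in a few lines of plain Python. The function name is hypothetical and the sketch only mirrors the formula; it is not the library code.

```python
def absolute_sum_of_changes_sketch(values):
    """Sum of |x_{i+1} - x_i| over all adjacent pairs."""
    return sum(abs(b - a) for a, b in zip(values, values[1:]))


# Changes are +3, -2, 0, so the absolute sum is 5.
print(absolute_sum_of_changes_sketch([1, 4, 2, 2]))
```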

agg_autocorrelation(x, param)

  • Name: aggregated statistic of the autocorrelations over lags
  • Returns an aggregate (for example variance or mean) of the autocorrelation R(l) taken over the lags l = 1, ..., maxlag

R(l)=\frac{1}{(n-l)\sigma^2}\sum_{i=1}^{n-l}(x_i-\mu)(x_{i+l}-\mu)

  • Parameters:
    • x (pandas.Series): the time series to compute the feature on
    • param (list): contains a dict {"f_agg": x, "maxlag": n}, where x is the name of the aggregation function and n is the maximum lag
  • Returns: the aggregated statistic of the autocorrelations (float)
  • Function type: combiner
  • Code example:
#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd

ts = pd.Series(x)  # assumes the data x has already been loaded
param = [{'f_agg': 'mean', 'maxlag': 2}]
ae = tsf.feature_extraction.feature_calculators.agg_autocorrelation(ts, param)
  • Note: aggregates the autocorrelations over the lags; different aggregation functions capture different characteristics of the series
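
As a rough illustration of R(l) and its aggregation, the sketch below computes the autocorrelation for each lag and then averages. It assumes the population variance for \sigma^2; the function names are hypothetical, and tsfresh's own return format may differ.

```python
from statistics import fmean, pvariance


def autocorrelation_sketch(values, lag):
    """R(l) from the formula above, using the population variance."""
    n = len(values)
    mu = fmean(values)
    var = pvariance(values, mu)
    cov = sum((values[i] - mu) * (values[i + lag] - mu) for i in range(n - lag))
    return cov / ((n - lag) * var)


def agg_autocorrelation_sketch(values, maxlag, agg=fmean):
    """Aggregate R(l) over the lags 1..maxlag (mean by default)."""
    return agg([autocorrelation_sketch(values, lag) for lag in range(1, maxlag + 1)])


series = [1.0, 2.0, 3.0, 4.0, 5.0]
print(autocorrelation_sketch(series, 1))
print(agg_autocorrelation_sketch(series, 2))
```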

agg_linear_trend(x, param)

  • Name: linear regression on chunk-wise aggregated values
  • Returns an ordinary least squares (OLS) linear regression fitted to the series after it has been aggregated over chunks
  • Parameters:
    • x (pandas.Series): the time series to compute the feature on
    • param (list): contains a dict {"attr": x, "chunk_len": l, "f_agg": f}, where "f_agg" is the name of the aggregation function, "chunk_len" is the number of values per chunk, and "attr" is the regression result attribute to return: "pvalue", "rvalue", "intercept", "slope" or "stderr"
  • Returns: the requested regression attribute (a zip object, readable with list)
  • Function type: combiner
  • Code example:
#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd

ts = pd.Series(x)  # assumes the data x has already been loaded
param = [{'f_agg': 'mean', 'attr': 'pvalue', 'chunk_len': 2}]
ae = tsf.feature_extraction.feature_calculators.agg_linear_trend(ts, param)
print(ae, list(ae))
  • Note: none
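
The two steps described above, chunk-wise aggregation followed by an OLS fit against the chunk index, can be sketched as follows. The helper names are hypothetical, only slope and intercept are computed, and keeping a shorter final chunk is an assumption of this sketch.

```python
from statistics import fmean


def chunk_aggregate(values, chunk_len, agg=fmean):
    """Aggregate consecutive chunks of chunk_len values (last chunk may be shorter)."""
    return [agg(values[i:i + chunk_len]) for i in range(0, len(values), chunk_len)]


def ols_slope_intercept(ys):
    """Least-squares line through the points (0, ys[0]), (1, ys[1]), ..."""
    xs = range(len(ys))
    x_mean = fmean(xs)
    y_mean = fmean(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


chunks = chunk_aggregate([1.0, 3.0, 5.0, 7.0, 9.0, 11.0], chunk_len=2)
print(chunks)                       # chunk means
print(ols_slope_intercept(chunks))  # (slope, intercept)
```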

approximate_entropy(x, m, r)

  • Name: approximate entropy
  • Measures the regularity, unpredictability and fluctuation of the time series
  • Parameters:
    • x (pandas.Series): the time series to compute the feature on
    • m (int): length of the compared runs of data
    • r (float): filtering threshold (non-negative)
  • Returns: the approximate entropy (float)
  • Function type: simple
  • Code example:
#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd

ts = pd.Series(x)  # assumes the data x has already been loaded
ae = tsf.feature_extraction.feature_calculators.approximate_entropy(ts, 10, 0.1)
  • Note: the value compares the entropy-like statistics for run lengths m and m+1, so it is a relative quantity

ar_coefficient(x, param)

  • Name: autoregressive coefficients
  • Fits an autoregressive AR(k) process to the time series and returns the requested coefficient

X_t=\psi_0+\sum_{i=1}^k \psi_i X_{t-i} + \varepsilon_t

  • Parameters:
    • x (pandas.Series): the time series to compute the feature on
    • param (list): contains a dict {"coeff": x, "k": y}, where "coeff" selects the x-th coefficient of the autoregression and "k" is the order of the autoregression
  • Returns: the autoregressive coefficients (pandas.Series)
  • Function type: combiner
  • Code example:
#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd

ts = pd.Series(x)  # assumes the data x has already been loaded
param = [{'coeff': 0, 'k': 10}]
ae = tsf.feature_extraction.feature_calculators.ar_coefficient(ts, param)
  • Note: the coefficients \psi_i of the autoregressive equation

augmented_dickey_fuller(x, param)

  • Name: augmented Dickey-Fuller test (ADF test)

  • Tests whether a unit root is present in an autoregressive model of the series, which assesses the stationarity of the time series

  • Parameters:

    • x (pandas.Series): the time series to compute the feature on
    • param (list): contains a dict {"attr": x}, where x is one of the strings "teststat", "pvalue" or "usedlag"
  • Returns: the requested ADF test result (float)

  • Function type: combiner

  • Code example:

#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd

ts = pd.Series(x)  # assumes the data x has already been loaded
param = [{'attr': 'pvalue'}]
ae = tsf.feature_extraction.feature_calculators.augmented_dickey_fuller(ts, param)
  • Note: returns the ADF test result selected by attr (here the p-value)

autocorrelation(x, lag)

  • Name: autocorrelation at lag lag

  • Computes the autocorrelation of the time series at the given lag (float)

  • Parameters:

    • x (pandas.Series): the time series to compute the feature on
    • lag (int): the lag
  • Returns: the autocorrelation value (float)

  • Function type: simple

  • Code example:

#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd

ts = pd.Series(x)  # assumes the data x has already been loaded
ae = tsf.feature_extraction.feature_calculators.autocorrelation(ts, 2)
  • Note: the autocorrelation value at the given lag

binned_entropy(x, max_bins)

  • Name: binned entropy

  • Splits the value range of the series into max_bins equidistant bins, assigns each value to its bin, and computes the entropy of the bin frequencies (float)

  • Parameters:

    • x (pandas.Series): the time series to compute the feature on
    • max_bins (int): the number of bins
  • Returns: the binned entropy (float)

  • Function type: simple

  • Code example:

#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd

ts = pd.Series(x)  # assumes the data x has already been loaded
ae = tsf.feature_extraction.feature_calculators.binned_entropy(ts, 10)
  • Note: entropy of the time series after equidistant binning
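
The binning-then-entropy procedure can be sketched in plain Python as below. The function name is hypothetical, the natural logarithm is an assumption, and values exactly on a bin edge may be assigned slightly differently than in the library.

```python
import math


def binned_entropy_sketch(values, max_bins):
    """Entropy of the frequencies of values in max_bins equidistant bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / max_bins
    counts = [0] * max_bins
    for v in values:
        # The maximum value belongs to the last bin; a constant series uses bin 0.
        idx = min(int((v - lo) / width), max_bins - 1) if width > 0 else 0
        counts[idx] += 1
    n = len(values)
    return -sum(c / n * math.log(c / n) for c in counts if c)


# Two equally filled bins give an entropy of log(2).
print(binned_entropy_sketch([0.0, 0.0, 1.0, 1.0], 2))
```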

c3(x, lag)

  • Name: non-linearity measure of the time series
  • A measure of non-linearity in the time series that originates from physics (float)

\frac{1}{n-2lag}\sum_{i=1}^{n-2lag}x_{i+2lag}\,x_{i+lag}\,x_i
which is equivalent to computing
\mathbb{E}[L^2(X)L(X)X]
where L is the lag operator and L^2 denotes applying it twice.

  • Parameters:
    • x (pandas.Series): the time series to compute the feature on
    • lag (int): the lag
  • Returns: the non-linearity measure (float)
  • Function type: simple
  • Code example:
#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd

ts = pd.Series(x)  # assumes the data x has already been loaded
ae = tsf.feature_extraction.feature_calculators.c3(ts, 2)
  • Note: the degree of non-linearity of the time series
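
The expectation form E[L^2(X) L(X) X], with L^2 read as the lag operator applied twice, can be sketched as the mean of products of three lagged values. The function name is hypothetical, and returning 0.0 when the series is too short is an assumption of this sketch.

```python
def c3_sketch(values, lag):
    """Mean of x[i + 2*lag] * x[i + lag] * x[i] over all valid i."""
    count = len(values) - 2 * lag
    if count <= 0:
        return 0.0  # assumption: too few points for this lag
    total = sum(values[i + 2 * lag] * values[i + lag] * values[i] for i in range(count))
    return total / count


# Products are 3*2*1, 4*3*2 and 5*4*3; their mean is printed.
print(c3_sketch([1.0, 2.0, 3.0, 4.0, 5.0], lag=1))
```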

change_quantiles(x, ql, qh, isabs, f_agg)

  • Name: descriptive statistic of the changes inside a given corridor

  • First uses the quantiles ql and qh of x to fix a corridor, then aggregates (with f_agg) the consecutive changes of the series that lie inside this corridor, optionally taking their absolute values (float)

  • Parameters:

    • x (pandas.Series): the time series to compute the feature on
    • ql (float): lower quantile of the corridor
    • qh (float): upper quantile of the corridor
    • isabs (bool): whether to use absolute differences
    • f_agg (str): name of a numpy aggregation function, such as mean or std
  • Returns: the descriptive statistic inside the corridor (float)

  • Function type: simple

  • Code example:

#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd
import numpy as np

ts = pd.Series(x)  # assumes the data x has already been loaded
ae = tsf.feature_extraction.feature_calculators.change_quantiles(ts, 0.05, 0.95, False, 'mean')
  • Note: a descriptive statistic of the changes of the series inside the corridor

cid_ce(x, normalize)

  • Name: time series complexity

  • Estimates the complexity of the time series; a more complex series has more peaks and valleys (float)

  • Parameters:

    • x (pandas.Series): the time series to compute the feature on
    • normalize (bool): whether to z-normalize the data first
  • Returns: the complexity of the time series (float)

  • Function type: simple

  • Code example:

#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd

ts = pd.Series(x)  # assumes the data x has already been loaded
ae = tsf.feature_extraction.feature_calculators.cid_ce(ts, True)
  • Note: the complexity of the time series
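
The text above gives no formula, so the sketch below assumes the usual complexity estimate: the square root of the sum of squared successive differences, optionally after z-normalization. The function name is hypothetical, and returning 0.0 for a constant series is an assumption.

```python
import math
from statistics import fmean, pstdev


def cid_ce_sketch(values, normalize):
    """sqrt(sum((x_{i+1} - x_i)^2)), optionally on z-normalized data."""
    if normalize:
        sd = pstdev(values)
        if sd == 0:
            return 0.0  # assumption: a constant series has zero complexity
        mu = fmean(values)
        values = [(v - mu) / sd for v in values]
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(values, values[1:])))


# A zigzag has more peaks and valleys than a single step, so it scores higher.
print(cid_ce_sketch([0.0, 1.0, 0.0, 1.0], False))
print(cid_ce_sketch([0.0, 0.0, 1.0, 1.0], False))
```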

count_above_mean(x)

  • Name: count above mean

  • Counts the values of the time series that are higher than its mean (integer)

  • Parameters:

    • x (pandas.Series): the time series to compute the feature on
  • Returns: the number of values above the mean (integer)

  • Function type: simple

  • Code example:

#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd

ts = pd.Series(x)  # assumes the data x has already been loaded
ae = tsf.feature_extraction.feature_calculators.count_above_mean(ts)
  • Note: the number of values above the mean

count_below_mean(x)

  • Name: count below mean

  • Counts the values of the time series that are lower than its mean (integer)

  • Parameters:

    • x (pandas.Series): the time series to compute the feature on
  • Returns: the number of values below the mean (integer)

  • Function type: simple

  • Code example:

#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd

ts = pd.Series(x)  # assumes the data x has already been loaded
ae = tsf.feature_extraction.feature_calculators.count_below_mean(ts)
  • Note: the number of values below the mean

cwt_coefficients(x, param)

  • Name: Ricker wavelet analysis
  • A continuous wavelet transform using the Ricker wavelet. The Ricker wavelet is a wavelet type commonly used in seismic exploration and is derived rigorously from the wave equation. (pandas.Series)

\frac{2}{\sqrt{3a}\pi^{\frac{1}{4}}}(1-\frac{x^2}{a^2})\exp(-\frac{x^2}{2a^2})
where a is the width parameter of the wavelet function.

  • Parameters:

    • x (pandas.Series): the time series to compute the feature on
    • param (list): contains a dict {"widths": x, "coeff": y, "w": z}, where x is an array of integers and y and z are integers
  • Returns: the wavelet coefficients (pandas.Series)

  • Function type: combiner

  • Code example:

#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd

ts = pd.Series(x)  # assumes the data x has already been loaded
param = [{'widths': tuple([2, 2, 2]), 'coeff': 2, 'w': 2}]
ae = tsf.feature_extraction.feature_calculators.cwt_coefficients(ts, param)
print(list(ae))
  • Note: the widths parameter must be a hashable object (hence the tuple); the result can be inspected with list

energy_ratio_by_chunks(x, param)

  • Name: energy ratio by chunks

  • Splits the time series into chunks and computes the ratio of the energy (sum of squares) of the chosen chunk to the energy of the whole series. If the data cannot be split evenly, the surplus values are spread over the earlier chunks. (float)

  • Parameters:

    • x (pandas.Series): the time series to compute the feature on
    • param (list): contains a dict {"num_segments": N, "segment_focus": i}, where N and i are integers
  • Returns: the energy ratio of the chosen chunk (list)

  • Function type: combiner

  • Code example:

#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd

ts = pd.Series(x)  # assumes the data x has already been loaded
param = [{'num_segments': 10, 'segment_focus': 5}]
ae = tsf.feature_extraction.feature_calculators.energy_ratio_by_chunks(ts, param)
  • Note: segment_focus is counted from 0; the return value is a list of tuples
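
A minimal sketch of the chunking rule and the ratio follows, taking the quantity as a ratio of sums of squares (as the function name energy_ratio_by_chunks suggests). The helper names are hypothetical; surplus values go to the earlier chunks, and segment_focus counts from 0 as noted above.

```python
def split_into_segments(values, num_segments):
    """Split values into num_segments chunks; earlier chunks take the surplus."""
    base, extra = divmod(len(values), num_segments)
    segments, start = [], 0
    for k in range(num_segments):
        size = base + (1 if k < extra else 0)
        segments.append(values[start:start + size])
        start += size
    return segments


def energy_ratio_sketch(values, num_segments, segment_focus):
    """Sum of squares of the focused chunk divided by that of the whole series."""
    total = sum(v * v for v in values)
    chunk = split_into_segments(values, num_segments)[segment_focus]
    return sum(v * v for v in chunk) / total


data = [1.0, 2.0, 3.0, 4.0, 5.0]
print(split_into_segments(data, 2))   # first chunk gets the extra value
print(energy_ratio_sketch(data, 2, 0))
```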

fft_aggregated(x, param)

  • Name: spectral statistics of the absolute Fourier transform

  • Returns the spectral centroid, variance, skew or kurtosis of the absolute Fourier transform spectrum (pandas.Series)

  • Parameters:

    • x (pandas.Series): the time series to compute the feature on
    • param (list): contains a dict {"aggtype": s}, where s is one of the strings "centroid", "variance", "skew" or "kurtosis"
  • Returns: the spectral statistic of the absolute Fourier transform (pandas.Series)

  • Function type: combiner

  • Code example:

#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd

ts = pd.Series(x)  # assumes the data x has already been loaded
param = [{'aggtype': 'skew'}]
ae = tsf.feature_extraction.feature_calculators.fft_aggregated(ts, param)
print(list(ae))
  • Note: none

fft_coefficient(x, param)

  • Name: Fourier transform coefficients
  • Computes the coefficients of the one-dimensional discrete Fourier transform using the fast Fourier transform algorithm (pandas.Series)

A_k=\sum_{m=0}^{n-1}a_m\exp{(-2\pi i\frac{mk}{n})}

  • Parameters:
    • x (pandas.Series): the time series to compute the feature on
    • param (list): contains a dict {"coeff": x, "attr": s}, where x is a non-negative integer and s is one of the strings "real", "imag", "abs" or "angle"
  • Returns: the Fourier transform coefficient (pandas.Series)
  • Function type: combiner
  • Code example:
#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd

ts = pd.Series(x)  # assumes the data x has already been loaded
param = [{'coeff': 2, 'attr': 'angle'}]
ae = tsf.feature_extraction.feature_calculators.fft_coefficient(ts, param)
print(list(ae))
  • Note: "real", "imag", "abs" and "angle" correspond to the real part, imaginary part, absolute value and angle of the coefficient.
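
To show what the four attributes mean, the sketch below evaluates A_k directly from the formula with the standard library instead of an FFT. The function name is hypothetical, and reporting the angle in degrees is an assumption of this sketch.

```python
import cmath
import math


def dft_coefficient(values, k):
    """A_k = sum_m a_m * exp(-2*pi*i*m*k/n), evaluated term by term."""
    n = len(values)
    return sum(a * cmath.exp(-2j * cmath.pi * m * k / n) for m, a in enumerate(values))


coef = dft_coefficient([1.0, 0.0, -1.0, 0.0], k=1)
attrs = {
    "real": coef.real,
    "imag": coef.imag,
    "abs": abs(coef),
    "angle": math.degrees(cmath.phase(coef)),  # assumption: degrees
}
print(attrs)
```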

first_location_of_maximum(x)

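  • Name: first location of the maximum

  • The first position of the maximum, relative to the length of the time series

  • Parameters:

    • x (pandas.Series): the time series to compute the feature on
  • Returns: the relative position of the maximum (float)

  • Function type: simple

  • Code example:

#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd

ts = pd.Series(x)  # assumes the data x has already been loaded
ae = tsf.feature_extraction.feature_calculators.first_location_of_maximum(ts)
  • Note: the return value is a float: the position at which the maximum first occurs in the series, divided by the length of the series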

first_location_of_minimum(x)

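  • Name: first location of the minimum

  • The first position of the minimum, relative to the length of the time series

  • Parameters:

    • x (pandas.Series): the time series to compute the feature on
  • Returns: the relative position of the minimum (float)

  • Function type: simple

  • Code example:

#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd

ts = pd.Series(x)  # assumes the data x has already been loaded
ae = tsf.feature_extraction.feature_calculators.first_location_of_minimum(ts)
  • Note: the return value is a float: the position at which the minimum first occurs in the series, divided by the length of the series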
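
The two first-location features can be sketched as the zero-based index of the first extreme value divided by the series length. The function names are hypothetical, and the zero-based convention is an assumption worth checking against the installed tsfresh version.

```python
def first_location_of_maximum_sketch(values):
    """Zero-based index of the first maximum, divided by the length."""
    return values.index(max(values)) / len(values)


def first_location_of_minimum_sketch(values):
    """Zero-based index of the first minimum, divided by the length."""
    return values.index(min(values)) / len(values)


data = [1, 5, 2, 5]
print(first_location_of_maximum_sketch(data))  # maximum first appears at index 1
print(first_location_of_minimum_sketch(data))  # minimum first appears at index 0
```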

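friedrich_coefficients(x, param)

Author's note: calling this interface did not succeed.

  • Name: polynomial coefficients fitted to the Langevin model
  • Coefficients of the polynomial h(x) fitted to the deterministic dynamics of the Langevin model (pandas.Series)

\dot{x}(t)=h(x(t))+\mathcal{N}(0,R)

  • Parameters:

    • x (pandas.Series): the time series to compute the feature on
    • param (list): contains a dict {"m": x, "r": y, "coeff": z}, where x is a positive integer giving the highest order of the fitted polynomial, y is a positive real number giving the number of quantiles used to compute the averages, and z is a positive integer selecting which polynomial term to return
  • Returns: the fitted polynomial coefficient (pandas.Series)

  • Function type: combiner

  • Code example:

#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd

ts = pd.Series(x)  # assumes the data x has already been loaded
# Illustrative values (assumed): cubic polynomial, 30 quantiles, coefficient 1
param = [{'m': 3, 'r': 30, 'coeff': 1}]
ae = tsf.feature_extraction.feature_calculators.friedrich_coefficients(ts, param)
  • Note: none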

has_duplicate(x)

  • Name: duplicate check

  • Checks whether any value occurs more than once in the time series (bool)

  • Parameters:

    • x (pandas.Series): the time series to compute the feature on
  • Returns: whether duplicate values exist (bool)

  • Function type: simple

  • Code example:

#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd

ts = pd.Series(x)  # assumes the data x has already been loaded
ae = tsf.feature_extraction.feature_calculators.has_duplicate(ts)
  • Note: none

has_duplicate_max(x)

  • Name: duplicate check for the maximum

  • Checks whether the maximum value of the time series occurs more than once (bool)

  • Parameters:

    • x (pandas.Series): the time series to compute the feature on
  • Returns: whether the maximum is duplicated (bool)

  • Function type: simple

  • Code example:

#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd

ts = pd.Series(x)  # assumes the data x has already been loaded
ae = tsf.feature_extraction.feature_calculators.has_duplicate_max(ts)
  • Note: none

has_duplicate_min(x)

  • Name: duplicate check for the minimum

  • Checks whether the minimum value of the time series occurs more than once (bool)

  • Parameters:

    • x (pandas.Series): the time series to compute the feature on
  • Returns: whether the minimum is duplicated (bool)

  • Function type: simple

  • Code example:

#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd

ts = pd.Series(x)  # assumes the data x has already been loaded
ae = tsf.feature_extraction.feature_calculators.has_duplicate_min(ts)
  • Note: none

index_mass_quantile(x, param)

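  • Name: index of the mass quantile

  • Computes the relative index at which the given quantile q of the mass of the series (the sum of absolute values) is reached (pandas.Series)

  • Parameters:

    • x (pandas.Series): the time series to compute the feature on
    • param (list): contains a dict {"q": x}, where x is the quantile, a float between 0 and 1
  • Returns: the index of the mass quantile (pandas.Series)

  • Function type: combiner

  • Code example:

#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd

ts = pd.Series(x)  # assumes the data x has already been loaded
param = [{'q': 0.5}]  # q is a fraction, so 0.5 means the 50% quantile
ae = tsf.feature_extraction.feature_calculators.index_mass_quantile(ts, param)
  • Note: none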
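
One way to read this feature, assuming q is a fraction between 0 and 1 and the mass is the sum of absolute values, is sketched below: walk along the cumulative mass and report the relative position where the fraction q is first reached. The function name and the NaN fallback are hypothetical.

```python
def index_mass_quantile_sketch(values, q):
    """Relative index where the cumulative |x| first reaches fraction q of the total."""
    abs_values = [abs(v) for v in values]
    total = sum(abs_values)
    running = 0.0
    for i, v in enumerate(abs_values):
        running += v
        if running / total >= q:
            return (i + 1) / len(values)
    return float("nan")  # assumption: q was never reached (e.g. q > 1)


print(index_mass_quantile_sketch([1.0, 1.0, 1.0, 1.0], 0.5))   # evenly spread mass
print(index_mass_quantile_sketch([0.0, 0.0, 0.0, 10.0], 0.5))  # mass at the end
```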

kurtosis(x)

  • Name: kurtosis

  • Computes the kurtosis based on the adjusted Fisher-Pearson standardized moment coefficient (float)

  • Parameters:

    • x (pandas.Series): the time series to compute the feature on
  • Returns: the adjusted kurtosis (float)

  • Function type: simple

  • Code example:

#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd

ts = pd.Series(x)  # assumes the data x has already been loaded
ae = tsf.feature_extraction.feature_calculators.kurtosis(ts)
  • Note: characterizes how high the peak of the probability density curve is at the mean.

large_standard_deviation(x, r)

  • Name: whether the standard deviation exceeds a multiple of the range
  • Checks whether the standard deviation is larger than r times the range of the data (bool)

std(x)>r*(max(x)-min(x))

  • Parameters:
    • x (pandas.Series): the time series to compute the feature on
    • r (float): the ratio
  • Returns: whether the standard deviation exceeds r times the range (bool)
  • Function type: simple
  • Code example:
#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd

ts = pd.Series(x)  # assumes the data x has already been loaded
ae = tsf.feature_extraction.feature_calculators.large_standard_deviation(ts, 0.2)
  • Note: as a rule of thumb, the standard deviation should be about a quarter of the range of the values.
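
The inequality above translates directly into code. The sketch assumes the population standard deviation, and the function name is hypothetical.

```python
from statistics import pstdev


def large_standard_deviation_sketch(values, r):
    """True if std(x) > r * (max(x) - min(x)), using the population std."""
    return pstdev(values) > r * (max(values) - min(values))


data = [0.0, 1.0]  # std is 0.5 and the range is 1.0
print(large_standard_deviation_sketch(data, 0.25))
print(large_standard_deviation_sketch(data, 0.6))
```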

last_location_of_maximum(x)

  • Name: last location of the maximum

  • The last position of the maximum, relative to the length of the time series (float)

  • Parameters:

    • x (pandas.Series): the time series to compute the feature on
  • Returns: the relative last position of the maximum (float)

  • Function type: simple

  • Code example:

#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd

ts = pd.Series(x)  # assumes the data x has already been loaded
ae = tsf.feature_extraction.feature_calculators.last_location_of_maximum(ts)
  • Note: the return value is the ratio of the position to the length of the time series

last_location_of_minimum(x)

  • Name: last location of the minimum

  • The last position of the minimum, relative to the length of the time series (float)

  • Parameters:

    • x (pandas.Series): the time series to compute the feature on
  • Returns: the relative last position of the minimum (float)

  • Function type: simple

  • Code example:

#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd

ts = pd.Series(x)  # assumes the data x has already been loaded
ae = tsf.feature_extraction.feature_calculators.last_location_of_minimum(ts)
  • Note: the return value is the ratio of the position to the length of the time series
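
The last-location features can be sketched by searching from the end of the series. This sketch assumes the position is counted from 1, so an extreme value at the final element gives 1.0; that matches my understanding of tsfresh but should be checked against the installed version. The function names are hypothetical.

```python
def last_location_of_maximum_sketch(values):
    """One-based position of the last maximum, divided by the length."""
    last_index = len(values) - 1 - values[::-1].index(max(values))
    return (last_index + 1) / len(values)


def last_location_of_minimum_sketch(values):
    """One-based position of the last minimum, divided by the length."""
    last_index = len(values) - 1 - values[::-1].index(min(values))
    return (last_index + 1) / len(values)


data = [1, 5, 2, 5]
print(last_location_of_maximum_sketch(data))  # last maximum is the final element
print(last_location_of_minimum_sketch(data))  # the only minimum is the first element
```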

length(x)

  • Name: number of records

  • Counts the total number of records in the time series (int)

  • Parameters:

    • x (pandas.Series): the time series to compute the feature on
  • Returns: the total number of rows (int)

  • Function type: simple

  • Code example:

#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd

ts = pd.Series(x)  # assumes the data x has already been loaded
ae = tsf.feature_extraction.feature_calculators.length(ts)
  • Note: none

linear_trend(x, param)

Author's note: calling this interface did not succeed.

  • Name: linear regression analysis

  • A least-squares linear regression of the values against the index (0 to len(x)-1), which assumes the data were sampled uniformly (pandas.Series)

  • Parameters:

    • x (pandas.Series): the time series to compute the feature on
    • param (list): contains a dict {"attr": x}, where x is the name of a regression result attribute, such as 'pvalue'
  • Returns: the requested regression result (pandas.Series)

  • Function type: combiner

  • Code example:

#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd

ts = pd.Series(x)  # assumes the data x has already been loaded
param = [{'attr': 'pvalue'}]
ae = tsf.feature_extraction.feature_calculators.linear_trend(ts, param)
  • Note: none

longest_strike_above_mean(x)

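  • Name: length of the longest consecutive run above the mean

  • Returns the length of the longest consecutive subsequence of x whose values are greater than the mean of x (int)

  • Parameters:

    • x (pandas.Series): the time series to compute the feature on
  • Returns: the length of the longest consecutive subsequence (int)

  • Function type: simple

  • Code example:

#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd

ts = pd.Series(x)  # assumes the data x has already been loaded
ae = tsf.feature_extraction.feature_calculators.longest_strike_above_mean(ts)
  • Note: none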

longest_strike_below_mean(x)

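  • Name: length of the longest consecutive run below the mean

  • Returns the length of the longest consecutive subsequence of x whose values are smaller than the mean of x (int)

  • Parameters:

    • x (pandas.Series): the time series to compute the feature on
  • Returns: the length of the longest consecutive subsequence (int)

  • Function type: simple

  • Code example:

#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd

ts = pd.Series(x)  # assumes the data x has already been loaded
ae = tsf.feature_extraction.feature_calculators.longest_strike_below_mean(ts)
  • Note: none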
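
Both longest-strike features reduce to finding the longest run of True values in a sequence of comparisons against the mean. A minimal sketch with itertools.groupby follows; the function names are hypothetical.

```python
from itertools import groupby
from statistics import fmean


def longest_run(flags):
    """Length of the longest run of consecutive True values (0 if none)."""
    return max((sum(1 for _ in group) for key, group in groupby(flags) if key), default=0)


def longest_strike_above_mean_sketch(values):
    mu = fmean(values)
    return longest_run(v > mu for v in values)


def longest_strike_below_mean_sketch(values):
    mu = fmean(values)
    return longest_run(v < mu for v in values)


data = [1, 5, 6, 7, 1, 8]  # the mean is about 4.67
print(longest_strike_above_mean_sketch(data))
print(longest_strike_below_mean_sketch(data))
```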