Random Forest Interpretability

1. Data download
https://www.kaggle.com/c/titanic/data?select=train.csv
2. Code


import pandas as pd
import numpy as np


#(1) Load the data
train_path = './titanic/train.csv'  # path to the training data
df = pd.read_csv(train_path)        # read the training data
df.drop(['PassengerId','Name','Ticket','Cabin'], inplace=True, axis=1)  # drop PassengerId, Name, Ticket and Cabin
print(df.head())

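#(2) Fill missing values in Age and Embarked (Cabin was already dropped in step 1)
# Assign the result back instead of calling fillna(inplace=True) on a column selection,
# which recent pandas versions warn about and which may not modify df.
df['Age'] = df['Age'].fillna(df['Age'].mean())
df['Embarked'] = df['Embarked'].fillna('S')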

#(3) Convert the categorical columns to numeric dummy variables
df = pd.get_dummies(df, columns=['Sex','Embarked'])
# print(df)

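#(4) Split into training and test (validation) sets
from sklearn.model_selection import train_test_split
df_x = df.iloc[:,1:]   # features: every column after Survived
df_y = df.iloc[:,:1]   # label: Survived, the first column after the drop in step 1

X = df_x.to_numpy(dtype=float)  # cast to float so mixed int/float/bool columns do not yield an object array
Y = df_y.iloc[:,0].to_numpy()

x_train, x_val, y_train, y_val = train_test_split(X, Y, test_size=0.2, random_state=0)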


#(5) Train the model
from sklearn.ensemble import RandomForestClassifier
model =  RandomForestClassifier(n_estimators=200,n_jobs=-1, random_state=0)
model.fit(x_train,y_train)

#(6) Model interpretability
import lime
from lime import lime_tabular
feature_names = list(df_x.columns)  # name of each feature column
explainer = lime_tabular.LimeTabularExplainer(  # build the explainer
    training_data=x_train,
    feature_names=feature_names,
    class_names=['died', 'survived'],
    mode='classification'
)

# Explain the first sample in x_val
exp = explainer.explain_instance(
    data_row=x_val[0],
    predict_fn=model.predict_proba
)
print(y_val[0])
exp.show_in_notebook(show_table=True)
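
To give some intuition for what explain_instance returns, here is a minimal NumPy sketch of the local-surrogate idea behind LIME: perturb the sample, query the black-box model, weight the perturbed points by how close they are to the original, and fit a weighted linear model whose coefficients serve as the explanation. This is an illustration under stated assumptions, not lime's actual implementation. The helper name local_surrogate, the Gaussian perturbation, the exponential kernel and its default width are all choices made for this sketch.

```python
import numpy as np


def local_surrogate(predict_fn, x, scale, n_samples=2000, kernel_width=None, seed=0):
    """Fit a proximity-weighted linear model around x.

    Illustrative helper only; it is not part of the lime package.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n_features = x.shape[0]
    if kernel_width is None:
        # Assumed default for this sketch.
        kernel_width = 0.75 * np.sqrt(n_features)

    # 1. Perturb the sample with Gaussian noise (z is in standardized units).
    z = rng.normal(size=(n_samples, n_features))
    perturbed = x + z * scale

    # 2. Query the black-box model on the perturbed points.
    y = np.asarray(predict_fn(perturbed), dtype=float)

    # 3. Weight each perturbed point by its closeness to x.
    distances = np.linalg.norm(z, axis=1)
    weights = np.exp(-(distances ** 2) / kernel_width ** 2)

    # 4. Weighted least squares: scale rows by sqrt(weight), then solve.
    design = np.hstack([np.ones((n_samples, 1)), z])
    root_w = np.sqrt(weights)
    coef, *_ = np.linalg.lstsq(design * root_w[:, None], y * root_w, rcond=None)
    return coef[0], coef[1:]  # intercept, per-feature local weights


# Toy black box: a known linear function, so the recovered weights are easy to check.
true_beta = np.array([2.0, -1.0, 0.0])
black_box = lambda X: X @ true_beta + 0.5

intercept, local_weights = local_surrogate(
    black_box, x=np.array([1.0, 2.0, 3.0]), scale=np.ones(3)
)
print(intercept, local_weights)
```

With the random forest above, one could in principle pass predict_fn=lambda X: model.predict_proba(X)[:, 1] and scale=x_train.std(axis=0). The weights will not match lime's output exactly, since lime also discretizes features and selects a subset of them by default.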

3. Experimental results


[Figure: image.png]

https://peaceful0907.medium.com/lime-explain-the-predictions-of-your-machine-learning-models-c089cf25989
