Logistic Regression (LogisticRegression)

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
X, y = make_classification(n_samples=1000, n_features=4)
print(X)  # np.ndarray of shape (1000, 4), since n_features=4
print(y)  # np.ndarray of shape (1000,)
lr = LogisticRegression()
X_train = X[:-200]
X_test = X[-200:]
y_train = y[:-200]
y_test = y[-200:]
lr.fit(X_train, y_train)
y_train_predictions = lr.predict(X_train)
print(type(y_train_predictions))
y_test_predictions = lr.predict(X_test)
print((y_train_predictions == y_train).sum().astype(float) / y_train.shape[0])
print((y_test_predictions == y_test).sum().astype(float) / y_test.shape[0])
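The two accuracy printouts above can also be obtained with the estimator's built-in `score` method, which returns mean accuracy directly. A minimal sketch restating the snippet above (the `random_state` value is an arbitrary choice for reproducibility):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=1000, n_features=4, random_state=0)
X_train, X_test = X[:-200], X[-200:]
y_train, y_test = y[:-200], y[-200:]

lr = LogisticRegression().fit(X_train, y_train)
# score() computes mean accuracy, equivalent to the manual comparison above
print(lr.score(X_train, y_train))
print(lr.score(X_test, y_test))
```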

LogisticRegression() accepts quite a few parameters, including:
(1) penalty: the regularization term; L2 regularization guards against overfitting by penalizing the sum of the squared weights.
(2) C: the inverse of the regularization strength in the objective function; a larger C therefore means weaker regularization.
(3) tol: tolerance for the solver's stopping criterion.
(4) solver: the optimization algorithm; the default is 'liblinear', a linear classifier. The available options are {'newton-cg', 'lbfgs', 'liblinear', 'sag', 'saga'}, default: 'liblinear'.
According to the API documentation, the trade-offs between the solvers are:
For small datasets, ‘liblinear’ is a good choice, whereas ‘sag’ and ‘saga’ are faster for large ones.
For multiclass problems, only ‘newton-cg’, ‘sag’, ‘saga’ and ‘lbfgs’ handle multinomial loss; ‘liblinear’ is limited to one-versus-rest schemes.
‘newton-cg’, ‘lbfgs’ and ‘sag’ only handle L2 penalty, whereas ‘liblinear’ and ‘saga’ handle L1 penalty.
Note that ‘sag’ and ‘saga’ fast convergence is only guaranteed on features with approximately the same scale. You can preprocess the data with a scaler from sklearn.preprocessing.
(5) dual: whether to solve the dual formulation; dual=True solves the dual problem, dual=False the primal (the dual is only implemented for the L2 penalty with the 'liblinear' solver).
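As a hedged illustration of these parameters (the values here are arbitrary examples, not recommendations), the sketch below combines an L1 penalty with the 'saga' solver and scaled features, following the API notes above that 'saga' supports L1 and converges fastest on similarly scaled data:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=1000, n_features=4, random_state=0)
# 'sag'/'saga' converge quickly only when features share a similar scale,
# so standardize first with a scaler from sklearn.preprocessing
X_scaled = StandardScaler().fit_transform(X)

# L1 penalty requires 'liblinear' or 'saga'; a larger C weakens regularization
clf = LogisticRegression(penalty='l1', C=0.5, tol=1e-4, solver='saga')
clf.fit(X_scaled, y)
print(clf.coef_)  # L1 can drive some coefficients exactly to zero
```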

Algorithm description

(Figure: 邏輯回歸算法語言描述.png — a pseudocode description of the logistic regression algorithm)

If anything in the algorithm description above is wrong, please leave a comment.
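In case the pseudocode figure does not render, here is a minimal batch-gradient-descent sketch of logistic regression training; the learning rate, iteration count, and toy data are illustrative assumptions, not the figure's exact content:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.1, n_iter=1000):
    """Batch gradient descent on the logistic (cross-entropy) loss."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(n_iter):
        p = sigmoid(X @ w + b)               # predicted probabilities
        grad_w = X.T @ (p - y) / n_samples   # gradient of the loss w.r.t. w
        grad_b = (p - y).mean()              # gradient w.r.t. the bias
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# toy data: the class label is determined by the sign of the first feature
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] > 0).astype(float)
w, b = fit_logistic(X, y)
pred = (sigmoid(X @ w + b) > 0.5).astype(float)
print((pred == y).mean())  # training accuracy
```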

最后編輯于
?著作權歸作者所有,轉載或內容合作請聯系作者
【社區(qū)內容提示】社區(qū)部分內容疑似由AI輔助生成,瀏覽時請結合常識與多方信息審慎甄別。
平臺聲明:文章內容(如有圖片或視頻亦包括在內)由作者上傳并發(fā)布,文章內容僅代表作者本人觀點,簡書系信息發(fā)布平臺,僅提供信息存儲服務。

相關閱讀更多精彩內容

友情鏈接更多精彩內容