Mathematical derivations and animated plots for several gradient descent variants (original article, in English)
https://ruder.io/optimizing-gradient-descent/index.html
Six gradient descent methods with code implementations
https://zhuanlan.zhihu.com/p/158813090#:~:text=%E4%BB%A5%E9%80%BB%E8%BE%91%E5%9B%9E%E5%BD%92%E4%B8%BA%E4%BE%8B%E4%BB%8B,dam%E5%8F%8A%E5%85%B6%E5%AE%9E%E7%8E%B0.
Gradient descent methods and a comparison (SGD & BGD)
https://blog.csdn.net/u012328159/article/details/80252012
Detailed explanation of several common gradient descent optimization algorithms
https://blog.csdn.net/huwenxing0801/article/details/85627245