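Yesterday a scientist in my small group chat brought up ML, and today I watched a video that explained it in a very simple, easy-to-follow way. I'm taking notes here so I can keep learning later. I also came across the argument today that many recently popular topics are not actually that hard, and that reaching an entry level just takes some effort. It's a reminder to myself to stay curious and pay attention to whatever I find interesting, anytime and anywhere.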
The 5 questions ML answers (just these five, nothing else; each one has a corresponding family of algorithms to handle it):
1. Is this A or B (or C...): Classification algorithm; handles multi-class questions, i.e. questions with a fixed number of possible answers
2. Is this weird: Anomaly detection algorithm
3. How much/How many: Regression algorithm; any question that asks for a number
4. How is this organized: Clustering algorithm; there is no one right answer, but it helps organize structure and better predict behavior/events
5. What should I do now: Reinforcement learning; makes a lot of small decisions without human guidance
Clearly, 4 and 5 are open-ended questions, so they demand more of ML.
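To make two of these question types concrete for myself, here is a minimal sketch using only the Python standard library. It is not from the video: the function names `fit_line` and `find_anomalies`, the z-score threshold of 2.0, and all the numbers are my own assumptions for illustration. It shows "How much/How many" as a least-squares line fit and "Is this weird" as a simple z-score check.

```python
from statistics import mean, pstdev


def fit_line(xs, ys):
    """Question 3 (how much?): least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def find_anomalies(values, threshold=2.0):
    """Question 2 (is this weird?): flag values far from the mean.

    A value is flagged when it lies more than `threshold` population
    standard deviations from the mean. The threshold is an assumption.
    """
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Made-up data, purely for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]
slope, intercept = fit_line(xs, ys)
prediction = slope * 6 + intercept  # the "number" answer for x = 6

readings = [10, 11, 9, 10, 12, 10, 11, 50]
outliers = find_anomalies(readings)

print(prediction, outliers)
```

Real anomaly detection and regression algorithms are far more capable than this; the point is only that each question type maps to a different kind of computation.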
Data Science:
Algorithm=Recipe
Data=Ingredients
Computer=Blender
Answer=Smoothie
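The next step is to check whether the data is any good. There are four criteria: relevant (do the kinds of data actually relate to each other), connected (even if the data is relevant, are values missing), accurate (would it lead to wrong conclusions), and enough to work with (with too little data the conclusions are fuzzy).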
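Two of these checks, "connected" and "enough to work with", are easy to sketch in code. The example below is only a rough illustration under my own assumptions: the helper names `missing_fraction` and `complete_rows`, the column names `temp_c` and `sales`, and the sample rows are all hypothetical. "Relevant" and "accurate" need domain knowledge and are not covered here.

```python
def missing_fraction(rows, columns):
    """'Connected' check: share of rows where each column is absent or None."""
    total = len(rows)
    if total == 0:
        return {col: 0.0 for col in columns}
    return {
        col: sum(1 for row in rows if row.get(col) is None) / total
        for col in columns
    }


def complete_rows(rows, columns):
    """'Enough' check: keep only rows that have a value for every column."""
    return [
        row for row in rows
        if all(row.get(col) is not None for col in columns)
    ]


# Hypothetical sample data with gaps.
rows = [
    {"temp_c": 21.5, "sales": 120},
    {"temp_c": None, "sales": 95},
    {"temp_c": 25.0, "sales": 150},
    {"temp_c": 19.0},  # "sales" is missing entirely
]
columns = ["temp_c", "sales"]

gaps = missing_fraction(rows, columns)
usable = complete_rows(rows, columns)

print(gaps)         # fraction of missing values per column
print(len(usable))  # how many rows are left to work with
```

If the number of usable rows ends up very small, that is the "fuzzy conclusions" situation described above.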
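Asking the data questions is important. Always ask a sharp question. Also think about the angle of the question, and whether the target data is already in the database (for example, to ask what a stock's price will be next week, you first need that stock's historical prices).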
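One way to see why the target has to already be in the data: a question like "what will the price be next?" can only be learned from past examples where both the earlier prices and the later price are known. The sketch below is my own illustration, not something from the video; `make_supervised`, the `window` and `horizon` parameters, and the prices are all assumptions.

```python
def make_supervised(prices, window=3, horizon=1):
    """Turn a price history into (recent prices, later price) pairs.

    Each example uses `window` consecutive prices as input and the price
    `horizon` steps after them as the target. Without a history there are
    no targets, so there is nothing to learn from.
    """
    examples = []
    for i in range(window, len(prices) - horizon + 1):
        features = prices[i - window:i]
        target = prices[i + horizon - 1]
        examples.append((features, target))
    return examples


# Made-up daily closing prices.
history = [10, 11, 12, 13, 14]
examples = make_supervised(history)

for features, target in examples:
    print(features, "->", target)
```

With an empty or very short history the list of examples is empty, which is exactly the case where the sharp question cannot be answered yet.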