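Starting today, I'm going to learn SQL all over again in English: a typical atypical way of studying English.
The course is SQL for Data Scientists on Coursera. Today I finished the first section, Course Introduction.
Below I collect the words and precise expressions that I had to hear several times before I understood them, or that I can understand but cannot yet use fluently, especially verb usage.
Vocabulary has to be learned in context, so I write out the full sentence first and then study the words and expressions.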
I'm a data scientist for VSP Global, where I build data stories, predictive models, and work with machine learning algorithms to help the organization improve their services, identify new markets, and a whole host of other things.
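This sentence has several expressions that can be treated as fixed collocations, and they are also the key responsibilities and value of a data analyst's work: build data stories, predictive models, identify new markets. There is also an expression I'm seeing for the first time: a whole host of other things.
It looks like today I only have time to study the first expression: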
build data stories
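Data stories are also often called data driven stories. The idea is how to tell a story with data, that is, how to turn data into a point of view that people can understand.
The five key points for telling data driven stories (Five Core Data-Driven Narratives) are: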
1. Trends
2. Comparisons
A common data-driven narrative is comparisons. For example, we can take a different angle on Twitter's failure to grow its active users by comparing how it is performing relative to Facebook. Below is an example chart to show how Facebook is outperforming Twitter.
3. Rank order or league tables
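A league table was originally just a ranking of companies, usually based on a set of criteria for evaluating them, such as revenue and profit. It was later applied to areas beyond finance, such as sports, universities, and science.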
4. Relationships
A simple approach to exploring relationships is to look at the correlation of two sets of data. It is important to remember that correlation is not the same as causation but it can highlight areas for further research.
Relationships between data are also a very good way to tell a story with data. Always remember that correlation between data is not the same as causation.

5. Surprising or counter-intuitive data
If the research turns up data that is shocking or contrary to expectations, that also makes a good talking point. The one below is a great example. Also, I'm off for a drink!!!
I personally liked the research which found that 5 glasses of champagne a day can help prevent Alzheimer's disease.
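
Going back to point 4 (Relationships): here is a minimal Python sketch of how the correlation of two sets of data could be checked. It is not from the course. The function name `pearson` and both data series are made up purely for illustration, and I am assuming the Pearson coefficient is the kind of correlation meant.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (covariance numerator)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each series
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical, invented numbers: not real measurements.
ice_cream_sales = [10, 20, 30, 40, 50]
sunburn_cases = [3, 5, 7, 9, 11]

r = pearson(ice_cream_sales, sunburn_cases)
print(f"r = {r:.3f}")
```

A value of r close to 1 or -1 only says the two series move together. In this invented example both could simply be driven by sunny weather, which is exactly why correlation is a pointer to further research and not proof of causation.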

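Ever since the pandemic began, there have been all sorts of releases of minor offenders, which seemed unreasonable to me. But looking at this chart, I wondered whether those minor offenders really could be released after all, given how many people are locked up. Then, on second thought, a large prison population also has to do with a high crime rate: no gun control, rampant drugs, and racial issues on top of that. These may be the real reasons there are so many inmates, rather than US law being strict.
Well, I can finally go to bed~ night night~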