Andrej Karpathy: A Survival Guide to a PhD (Part 2)


Writing papers

Writing good papers is an essential survival skill of an academic (kind of like making fire for a caveman). In particular, it is very important to realize that papers are a specific thing: they look a certain way, they flow a certain way, they have a certain structure, language, and statistics that the other academics expect. It’s usually a painful exercise for me to look through some of my early PhD paper drafts because they are quite terrible. There is a lot to learn here.

Review papers. If you’re trying to learn to write better papers it can feel like a sensible strategy to look at many good papers and try to distill patterns. This turns out to not be the best strategy; it’s analogous to only receiving positive examples for a binary classification problem. What you really want is to also have exposure to a large number of bad papers and one way to get this is by reviewing papers. Most good conferences have an acceptance rate of about 25% so most papers you’ll review are bad, which will allow you to build a powerful binary classifier. You’ll read through a bad paper and realize how unclear it is, or how it doesn’t define its variables, how vague and abstract its intro is, or how it dives into the details too quickly, and you’ll learn to avoid the same pitfalls in your own papers. Another related valuable experience is to attend (or form) journal clubs - you’ll see experienced researchers critique papers and get an impression of how your own papers will be analyzed by others.

Get the gestalt right. I remember being impressed with Fei-Fei (my adviser) once during a reviewing session. I had a stack of 4 papers I had reviewed over the last several hours and she picked them up, flipped through each one for 10 seconds, and said one of them was good and the other three bad. Indeed, I was accepting the one and rejecting the other three, but something that took me several hours took her seconds. Fei-Fei was relying on the gestalt of the papers as a powerful heuristic. Your papers, as you become a more senior researcher, take on a characteristic look. An introduction of ~1 page. A ~1 page related work section with a good density of citations - not too sparse but not too crowded. A well-designed pull figure (on page 1 or 2) and system figure (on page 3) that were not made in MS Paint. A technical section with some math symbols somewhere, results tables with lots of numbers and some of them bold, one additional cute analysis experiment, and the paper has exactly 8 pages (the page limit) and not a single line less. You’ll have to learn how to endow your papers with the same gestalt because many researchers rely on it as a cognitive shortcut when they judge your work.

Identify the core contribution. Before you start writing anything it’s important to identify the single core contribution that your paper makes to the field. I would especially highlight the word single. A paper is not a random collection of some experiments you ran that you report on. The paper sells a single thing that was not obvious or present before. You have to argue that the thing is important, that it hasn’t been done before, and then you support its merit experimentally in controlled experiments. The entire paper is organized around this core contribution with surgical precision. In particular it doesn’t have any additional fluff and it doesn’t try to pack anything else on the side. As a concrete example, I made a mistake in one of my earlier papers on video classification (Large-scale Video Classification with Convolutional Neural Networks) where I tried to pack in two contributions: 1) a set of architectural layouts for video convnets and an unrelated 2) multi-resolution architecture which gave small improvements. I added it because I reasoned first that maybe someone could find it interesting and follow up on it later and second because I thought that contributions in a paper are additive: two contributions are better than one. Unfortunately, this is false and very wrong. The second contribution was minor/dubious and it diluted the paper, it was distracting, and no one cared. I’ve made a similar mistake again in my CVPR 2014 paper (Deep Visual-Semantic Alignments for Generating Image Descriptions) which presented two separate models: a ranking model and a generation model. Several good in-retrospect arguments could be made that I should have submitted two separate papers; the reason it was one is more historical than rational.

The structure. Once you’ve identified your core contribution there is a default recipe for writing a paper about it. The upper level structure is by default Intro, Related Work, Model, Experiments, Conclusions. When I write my intro I find that it helps to put down a coherent top-level narrative in LaTeX comments and then fill in the text below. I like to organize each of my paragraphs around a single concrete point stated in the first sentence that is then supported in the rest of the paragraph. This structure makes it easy for a reader to skim the paper. A good flow of ideas is then along the lines of 1) X (+define X if not obvious) is an important problem. 2) The core challenges are this and that. 3) Previous work on X has addressed these with Y, but the problems with this are Z. 4) In this work we do W (?). 5) This has the following appealing properties and our experiments show this and that. You can play with this structure a bit but these core points should be clearly made. Note again that the paper is surgically organized around your exact contribution. For example, when you list the challenges you want to list exactly the things that you address later; you don’t go meandering about things unrelated to what you have done (you can speculate a bit more later in the conclusion). It is important to keep a sensible structure throughout your paper, not just in the intro. For example, when you explain the model each section should: 1) explain clearly what is being done in the section, 2) explain what the core challenges are, 3) explain what a baseline approach is or what others have done before, 4) motivate and explain what you do, 5) describe it.

Break the structure. You should also feel free (and you’re encouraged!) to play with these formulas to some extent and add some spice to your papers. For example, see the amusing 2014 paper from Razavian et al. (CNN Features off-the-shelf: an Astounding Baseline for Recognition) that structures the introduction as a dialog between a student and the professor. It’s clever and I like it. As another example, a lot of papers from Alyosha Efros have a playful tone and make great case studies in writing fun papers. As only one of many examples, see this paper he wrote with Antonio Torralba: Unbiased look at dataset bias. Another possibility I’ve seen work well is to include an FAQ section, possibly in the appendix.

Common mistake: the laundry list. One very common mistake to avoid is the “laundry list”, which looks as follows: “Here is the problem. Okay now to solve this problem first we do X, then we do Y, then we do Z, and now we do W, and here is what we get”. You should try very hard to avoid this structure. Each point should be justified, motivated, explained. Why do you do X or Y? What are the alternatives? What have others done? It’s okay to say things like this is common (add citation if possible). Your paper is not a report, an enumeration of what you’ve done, or some kind of a translation of your chronological notes and experiments into LaTeX. It is a highly processed and very focused discussion of a problem, your approach and its context. It is supposed to teach your colleagues something and you have to justify your steps, not just describe what you did.

The language. Over time you’ll develop a vocabulary of good words and bad words to use when writing papers. Speaking about machine learning or computer vision papers specifically as concrete examples, in your papers you never “study” or “investigate” (these are boring, passive, bad words); instead you “develop” or even better you “propose”. And you don’t present a “system” or, shudder, a “pipeline”; instead, you develop a “model”. You don’t learn “features”, you learn “representations”. And god forbid, you never “combine”, “modify” or “expand”. These are incremental, gross terms that will certainly get your paper rejected :).

An internal deadline 2 weeks prior. Not many labs do this, but luckily Fei-Fei is quite adamant about an internal deadline 2 weeks before the due date in which you must submit at least a 5-page draft with all the final experiments (even if not with final numbers) that goes through an internal review process identical to the external one (with the same review forms filled out, etc). I found this practice to be extremely useful because forcing yourself to lay out the full paper almost always reveals some number of critical experiments you must run for the paper to flow and for its argument to be coherent, consistent and convincing.

Another great resource on this topic is Tips for Writing Technical Papers from Jennifer Widom (https://cs.stanford.edu/people/widom/paper-writing.html).

Writing code

A lot of your time will of course be taken up with the execution of your ideas, which likely involves a lot of coding. I won’t dwell on this too much because it’s not uniquely academic, but I would like to bring up a few points.

Release your code. It’s a somewhat surprising fact but you can get away with publishing papers and not releasing your code. You will also feel a lot of incentive to not release your code: it can be a lot of work (research code can look like spaghetti since you iterate very quickly, so you have to clean up a lot), it can be intimidating to think that others might judge you on your at most decent coding abilities, it is painful to maintain code and answer questions from other people about it (forever), and you might also be concerned that people could spot bugs that invalidate your results. However, it is precisely for some of these reasons that you should commit to releasing your code: it will force you to adopt better coding habits due to fear of public shaming (which will end up saving you time!), it will force you to learn better engineering practices, it will force you to be more thorough with your code (e.g. writing unit tests to make bugs much less likely), it will make others much more likely to follow up on your work (and hence lead to more citations of your papers) and of course it will be much more useful to everyone as a record of exactly what was done for posterity. When you do release your code I recommend taking advantage of Docker containers (https://www.docker.com/); this will reduce the number of headaches people email you about when they can’t get all the dependencies (and their precise versions) installed.
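
As a minimal sketch of what "writing unit tests" might look like for research code, here is a tiny Python example. The `softmax` helper and the three test names are hypothetical and not taken from any particular repo; the point is only that a few cheap assertions can catch numerical bugs before a reader of your released code does.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a non-empty list of floats (hypothetical helper)."""
    if not scores:
        raise ValueError("scores must be non-empty")
    shift = max(scores)  # subtract the max so exp() cannot overflow
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def test_softmax_sums_to_one():
    # A probability distribution must sum to 1 (up to float rounding).
    probs = softmax([1.0, 2.0, 3.0])
    assert math.isclose(sum(probs), 1.0)


def test_softmax_is_shift_invariant():
    # Adding a constant to every score must not change the output.
    a = softmax([1.0, 2.0, 3.0])
    b = softmax([1001.0, 1002.0, 1003.0])
    assert all(math.isclose(x, y) for x, y in zip(a, b))


def test_softmax_handles_large_scores():
    # A naive exp(1000.0) would overflow; the shifted version should not.
    probs = softmax([1000.0, 1000.0])
    assert probs == [0.5, 0.5]
```

Functions named `test_*` like these can be picked up by a test runner such as pytest when they live in a `test_*.py` file, or simply called directly at the bottom of the script.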

Think of the future you. Make sure to document all your code very well for yourself. I guarantee you that you will come back to your code base a few months later (e.g. to do a few more experiments for the camera ready version of the paper), and you will feel completely lost in it. I got into the habit of creating very thorough readme.txt files in all my repos (for my personal use) as notes to future self on how the code works, how to run it, etc.

Giving talks

So, you published a paper and it’s an oral! Now you get to give a few minute talk to a large audience of people - what should it look like?

The goal of a talk. First, there’s a common misconception that the goal of your talk is to tell your audience about what you did in your paper. This is incorrect, and should only be a second or third degree design criterion. The goal of your talk is to 1) get the audience really excited about the problem you worked on (they must appreciate it or they will not care about your solution!) 2) teach the audience something (ideally while giving them a taste of your insight/solution; don’t be afraid to spend time on others’ related work), and 3) entertain (they will start checking their Facebook otherwise). Ideally, by the end of the talk the people in your audience are thinking some mixture of “wow, I’m working in the wrong area”, “I have to read this paper”, and “This person has an impressive understanding of the whole area”.

A few do’s: There are several properties that make talks better. For instance, Do: Lots of pictures. People love pictures. Videos and animations should be used more sparingly because they distract. Do: make the talk actionable - talk about something someone can do after your talk. Do: give a live demo if possible, it can make your talk more memorable. Do: develop a broader intellectual arc that your work is part of. Do: develop it into a story (people love stories). Do: cite, cite, cite - a lot! It takes very little slide space to pay credit to your colleagues. It pleases them and always reflects well on you because it shows that you’re humble about your own contribution, and aware that it builds on a lot of what has come before and what is happening in parallel. You can even cite related work published at the same conference and briefly advertise it. Do: practice the talk! First for yourself in isolation and later to your lab/friends. This almost always reveals very insightful flaws in your narrative and flow.

Don’t: texttexttext. Don’t crowd your slides with text. There should be very few or no bullet points - speakers sometimes try to use these as a crutch to remind themselves what they should be talking about, but the slides are not for you, they are for the audience. These should be in your speaker notes. On the topic of crowding the slides, also avoid complex diagrams as much as you can - your audience has a fixed bit bandwidth and I guarantee that your own very familiar and “simple” diagram is not as simple or interpretable to someone seeing it for the first time.

Careful with: result tables: Don’t include dense tables of results showing that your method works better. You got a paper, I’m sure your results were decent. I always find these parts boring and unnecessary unless the numbers show something interesting (other than your method works better), or of course unless there is a large gap that you’re very proud of. If you do include results or graphs build them up slowly with transitions, don’t post them all at once and spend 3 minutes on one slide.

Pitfall: the thin band between bored/confused. It’s actually quite tricky to design talks where a good portion of your audience learns something. A common failure case (as an audience member) is to see talks where I’m painfully bored during the first half and completely confused during the second half, learning nothing by the end. This can occur in talks that have a very general (too general) overview followed by a technical (too technical) second portion. Try to identify when your talk is in danger of having this property.

Pitfall: running out of time. Many speakers spend too much time on the early intro parts (that can often be somewhat boring) and then frantically speed through all the last few slides that contain the most interesting results, analysis or demos. Don’t be that person.

Pitfall: formulaic talks. I might be a special case but I’m always a fan of non-formulaic talks that challenge conventions. For instance, I despise the outline slide. It makes the talk so boring, it’s like saying: “This movie is about a ring of power. In the first chapter we’ll see a hobbit come into possession of the ring. In the second we’ll see him travel to Mordor. In the third he’ll cast the ring into Mount Doom and destroy it. I will start with chapter 1” - Come on! I use outline slides for much longer talks to keep the audience anchored if they zone out (at 30min+ they inevitably will a few times), but it should be used sparingly.

Observe and learn. Ultimately, the best way to become better at giving talks (as it is with writing papers too) is to make a conscious effort to pay attention to what great (and not so great) speakers do and build a binary classifier in your mind. Don’t just enjoy talks; analyze them, break them down, learn from them. Additionally, pay close attention to the audience and their reactions. Sometimes a speaker will put up a complex table with many numbers and you will notice half of the audience immediately look down at their phones and open Facebook. Build an internal classifier of the events that cause this to happen and avoid them in your talks.

Attending conferences

On the subject of conferences:

Go. It’s very important that you go to conferences, especially the 1-2 top conferences in your area. If your adviser lacks funds and does not want to pay for your travel expenses (e.g. if you don’t have a paper) then you should be willing to pay for yourself (usually about $2000 for travel, accommodation, registration and food). This is important because you want to become part of the academic community and get a chance to meet more people in the area and gossip about research topics. Science might have this image of a few brilliant lone wolves working in isolation, but the truth is that research is predominantly a highly social endeavor - you stand on the shoulders of many people, you’re working on problems in parallel with other people, and it is these people that you’re also writing papers for. Additionally, it’s unfortunate but each field has knowledge that doesn’t get serialized into papers but is instead spread across a shared understanding of the community; things such as what are the next important topics to work on, what papers are most interesting, what is the inside scoop on papers, how they developed historically, what methods work (not just on paper, in reality), etc. It is very valuable (and fun!) to become part of the community and get direct access to the hivemind - to learn from it first, and to hopefully influence it later.

Talks: choose by speaker. One conference trick I’ve developed is that if you’re choosing which talks to attend it can be better to look at the speakers instead of the topics. Some people give better talks than others (it’s a skill, and you’ll discover these people in time) and in my experience I find that it often pays off to see them speak even if it is on a topic that isn’t exactly connected to your area of research.

The real action is in the hallways. The speed of innovation (especially in Machine Learning) now works at timescales much faster than conferences so most of the relevant papers you’ll see at the conference are in fact old news. Therefore, conferences are primarily a social event. Instead of attending a talk I encourage you to view the hallway as one of the main events that doesn’t appear on the schedule. It can also be valuable to stroll the poster session and discover some interesting papers and ideas that you may have missed.

It is said that there are three stages to a PhD. In the first stage you look at a related paper’s reference section and you haven’t read most of the papers. In the second stage you recognize all the papers. In the third stage you’ve shared a beer with all the first authors of all the papers.

Closing thoughts

I can’t find the quote anymore but I heard Sam Altman of YC say that there are no shortcuts or cheats when it comes to building a startup. You can’t expect to win in the long run by somehow gaming the system or putting up false appearances. I think that the same applies in academia. Ultimately you’re trying to do good research and push the field forward and if you try to game any of the proxy metrics you won’t be successful in the long run. This is especially so because academia is in fact surprisingly small and highly interconnected, so anything shady you try to do to pad your academic resume (e.g. self-citing a lot, publishing the same idea multiple times with small remixes, resubmitting the same rejected paper over and over again with no changes, conveniently trying to leave out some baselines etc.) will eventually catch up with you and you will not be successful.

So at the end of the day it’s quite simple. Do good work, communicate it properly, people will notice and good things will happen. Have a fun ride!


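Appendix: PhD thesis

  • Thesis: Connecting Images and Natural Language

Advisor approval

Abstract: A long-standing goal of artificial intelligence is to develop agents that can perceive and understand the rich visual world around us and communicate with us about it in natural language. Thanks to recent advances in computing infrastructure, data collection, and algorithms, significant progress has been made toward this goal. Progress has been especially rapid in visual recognition: computers can now classify images with performance comparable to that of humans, and even surpass humans in some cases, such as identifying dog breeds. However, despite many exciting developments, most advances in visual recognition still concern assigning one or a few discrete labels (e.g. person, boat, keyboard) to an image.

In this dissertation we develop models and techniques that connect the domain of visual data with the domain of natural language utterances, allowing us to translate between elements of the two domains. Specifically, we first introduce a model that embeds both images and sentences into a common multi-modal embedding space. This space then lets us identify images that depict an arbitrary sentence description and, conversely, find sentences that describe an arbitrary image. Second, we develop an image captioning model that takes an image and directly generates a sentence description, without being restricted to a finite set of human-written choices. Finally, we describe a model that can localize and describe all salient parts of an image. We show that this model can also be used in reverse: it takes an arbitrary description (e.g. white tennis shoes) as input and efficiently localizes the described concept in a large collection of images. We argue that these models, the techniques they use internally, and the interactions they enable are a stepping stone on the path toward artificial intelligence, and that connecting images and natural language also brings many practical benefits and immediately valuable applications.

From a modeling perspective, our contribution lies not in designing and presenting explicit algorithms that process images and sentences through complex pipelines, but in designing hybrid convolutional and recurrent neural network architectures that connect visual data and natural language utterances within a single network. As a result, the computational processing of images, sentences, and the multi-modal embedding structures that relate them emerges automatically while optimizing a loss function with respect to the network’s parameters on a training dataset of images and their descriptions. This approach enjoys many of the advantages of neural networks, including the use of simple, homogeneous computations that are easy to parallelize on hardware, and strong performance, since end-to-end training formulates the problem as a single optimization problem in which all components of the model share the same final objective. We show that our models advance the state of the art on tasks that require joint processing of images and natural language, and that the architecture can be designed in a way that facilitates interpretable visual inspection of the network’s predictions.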


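(This post was compiled by the poster for study and personal reference only. The Chinese translation it was based on drew on the version by 機器之心; one paragraph missing from that translation was added and the text was lightly revised, with thanks to them. Do not repost without permission; authorized reposts must credit the source. Thank you!)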
