David Silver (computer scientist)

David Silver
Nationality	British
Alma mater	University of Cambridge (BSc); University of Alberta (PhD)
Known for	AlphaGo; AlphaZero; AlphaStar
Awards	ACM Prize in Computing (2019); Fellow of the Royal Society (2021)
Scientific career
Fields	Computer science
Institutions	DeepMind

David Silver FRS (born 1976) is a British computer scientist and businessman. He leads the reinforcement learning research group at DeepMind, was the lead researcher on AlphaGo and AlphaZero, and was co-lead on AlphaStar.

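Education

Silver graduated from the University of Cambridge in 1997, receiving the Addison-Wesley prize, and befriended Demis Hassabis while studying there^[1]. He returned to academia in 2004 at the University of Alberta to study for a PhD on reinforcement learning, where he co-introduced the algorithms used in the first master-level 9×9 Go programs, and graduated in 2009^[2]^[3]. His version of the program MoGo was one of the strongest Go programs as of 2009^[4].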

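Career

After graduating from university, Silver co-founded the video game company Elixir Studios, where he served as chief technology officer and lead programmer and received several awards for technology and innovation^[1]^[5].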

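Silver was awarded a Royal Society University Research Fellowship in 2011 and subsequently became a lecturer at University College London, where he is now a professor^[6]. His lectures on reinforcement learning are available on YouTube^[7]. Silver consulted for DeepMind from its inception and joined full-time in 2013.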

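Silver's recent work has focused on combining reinforcement learning with deep learning, including a program that learns to play Atari games directly from pixels^[8]. He led the AlphaGo project, which culminated in the first program to defeat a top professional player in the full-size game of Go^[9]. AlphaGo subsequently received an honorary 9-dan professional certification and won the Cannes Lion award for innovation^[10]. He then led the development of AlphaZero, which used the same AI to learn to play Go from scratch (learning only by playing against itself rather than from human games) and then learned chess and shogi in the same way, reaching a higher level than any other computer program.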

Silver is among the most published members of staff at DeepMind, with more than 130,000 citations and an h-index of 78^[11].

He was awarded the 2019 ACM Prize in Computing for breakthrough advances in computer game-playing^[12].

In 2021, Silver was elected a Fellow of the Royal Society for his contributions to deep Q-learning and AlphaGo^[13].

References

[1] Shead, Sam. David Silver: The unsung hero and intellectual powerhouse at Google DeepMind. Business Insider. Retrieved 2020-09-26. Archived from the original on 2022-11-16.
[2] Silver, David. Reinforcement Learning and Simulation-Based Search in Computer Go. ERA. 2009. doi:10.7939/R39D8T.
[3] Sylvain Gelly, David Silver. Achieving Master Level Play in 9 × 9 Computer Go (PDF). Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence. 2008. Retrieved 2023-02-23. Archived (PDF) from the original on 2022-04-03.
[4] Stuart J. Russell, Peter Norvig. Artificial Intelligence: A Modern Approach (3rd ed.). Prentice Hall. 2009.
[5] What the AI Behind AlphaGo Can Teach Us About Being Human. Wired.com. Retrieved 17 May 2016. Archived from the original on 2016-05-29.
[6] CSML | David Silver. www.csml.ucl.ac.uk. Retrieved 2017-05-27. Archived from the original on 2021-04-24.
[7] RL Course by David Silver - Lecture 1: Introduction to Reinforcement Learning. May 13, 2015. Retrieved 2023-02-23. Archived from the original on 2023-02-25. Via YouTube.
[8] Mnih, Volodymyr; Kavukcuoglu, Koray; Silver, David; Rusu, Andrei A.; Veness, Joel; Bellemare, Marc G.; Graves, Alex; Riedmiller, Martin; Fidjeland, Andreas K. Human-level control through deep reinforcement learning. Nature. 2015-02-26, 518 (7540): 529–533. Bibcode:2015Natur.518..529M. ISSN 0028-0836. PMID 25719670. S2CID 205242740. doi:10.1038/nature14236.
[9] Silver, David; Huang, Aja; Maddison, Chris J.; Guez, Arthur; Sifre, Laurent; Driessche, George van den; Schrittwieser, Julian; Antonoglou, Ioannis; Panneershelvam, Veda; Lanctot, Marc; Dieleman, Sander; Grewe, Dominik; Nham, John; Kalchbrenner, Nal; Sutskever, Ilya; Lillicrap, Timothy; Leach, Madeleine; Kavukcuoglu, Koray; Graepel, Thore; Hassabis, Demis. Mastering the game of Go with deep neural networks and tree search. Nature. 28 January 2016, 529 (7587): 484–489. Bibcode:2016Natur.529..484S. ISSN 0028-0836. PMID 26819042. S2CID 515925. doi:10.1038/nature16961.
[10] Google DeepMind AlphaGo in U.K. Wins Innovation Grand Prix. Retrieved 2017-05-27. Archived from the original on 2016-07-31.
[11] David Silver – Google Scholar Citations. Retrieved 2022-02-01. Archived from the original on 2023-03-25.
[12] Ormond, Jim. ACM Prize in Computing Awarded to AlphaGo Developer: David Silver Recognized for Breakthrough Advances in Computer Game-Playing. acm.org. Retrieved 2020-04-02. Archived from the original on 2023-03-07.
[13] Royal Society elects outstanding new Fellows and Foreign Members. royalsociety.org. Retrieved 2021-06-08. Archived from the original on 2021-05-06.
