Differentiable programming

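Differentiable programming is a programming paradigm in which a numerical computer program can be differentiated throughout via automatic differentiation.^[1]^[2]^[3]^[4] This allows gradient-based optimization of the program's parameters, usually via gradient descent. Differentiable programming is widely used in a range of fields, particularly scientific computing and artificial intelligence.^[4]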
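To illustrate how automatic differentiation and gradient descent fit together, the following is a minimal sketch using forward-mode dual numbers in plain Python. The names `Dual`, `derivative`, `loss`, and `gradient_descent`, as well as the objective and learning rate, are assumptions made for this example and do not come from any particular framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value paired with its derivative with respect to one chosen input."""
    val: float
    dot: float = 0.0

    def _lift(self, other):
        # Treat plain numbers as constants (derivative 0).
        return other if isinstance(other, Dual) else Dual(float(other))

    def __add__(self, other):
        other = self._lift(other)
        return Dual(self.val + other.val, self.dot + other.dot)

    __radd__ = __add__

    def __sub__(self, other):
        other = self._lift(other)
        return Dual(self.val - other.val, self.dot - other.dot)

    def __rsub__(self, other):
        return self._lift(other) - self

    def __mul__(self, other):
        other = self._lift(other)
        # Product rule: (uv)' = u'v + uv'.
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)

    __rmul__ = __mul__


def derivative(f, x):
    """Evaluate df/dx at x by seeding the input's derivative with 1."""
    return f(Dual(x, 1.0)).dot


def loss(w):
    # Hypothetical objective: squared error of the model 3*w against target 6.
    residual = 3 * w - 6
    return residual * residual


def gradient_descent(f, w, lr=0.05, steps=100):
    """Repeatedly step against the derivative computed by automatic differentiation."""
    for _ in range(steps):
        w = w - lr * derivative(f, w)
    return w


w_opt = gradient_descent(loss, 0.0)
```

Because `loss` is ordinary Python code, the same `derivative` helper works on any function built from the overloaded operations. Here the parameter converges toward 2, the minimizer of the assumed objective.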

Approaches

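Most differentiable programming frameworks work by constructing a graph that contains the control flow and data structures of the program.^[5] Attempts generally fall into two groups: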

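Static, compiled-graph approaches, such as TensorFlow 1, Theano, and MXNet. They are intended to allow good compiler optimization and to scale easily to large systems, but their static nature limits interactivity and the kinds of programs that can be created easily (for example, programs involving loops or recursion are hard to build), and makes it harder for users to reason effectively about their programs.^[5]^[6]^[7]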

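Operator-overloading, dynamic-graph approaches, such as PyTorch and Autograd for NumPy;^[8] TensorFlow 2 also uses the dynamic-graph approach by default. Their dynamic and interactive nature lets most programs be written and reasoned about more easily. However, they incur interpreter overhead (particularly when a program is composed of many small operations), scale less well, and benefit less from compiler optimization.^[6]^[7] Flux, written in Julia, uses the automatic differentiation package Zygote,^[9] which works directly on Julia's intermediate representation yet can still be optimized by Julia's JIT compiler.^[5]^[10]^[4]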
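The operator-overloading, dynamic-graph idea can be sketched in a few lines of plain Python. This is a toy reverse-mode implementation for illustration only; `Var`, `backward`, and `power` are hypothetical names and not the API of PyTorch, Autograd, or any other framework named above. Each arithmetic operation records its inputs as it runs, so the graph is built at run time and ordinary loops need no special treatment.

```python
class Var:
    """A scalar that records how it was computed (a node in a dynamic graph)."""

    def __init__(self, value, parents=()):
        self.value = float(value)
        self.parents = parents  # tuple of (parent Var, local derivative)
        self.grad = 0.0

    @staticmethod
    def _lift(x):
        # Treat plain numbers as constant nodes.
        return x if isinstance(x, Var) else Var(x)

    def __add__(self, other):
        other = Var._lift(other)
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    __radd__ = __add__

    def __mul__(self, other):
        other = Var._lift(other)
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

    __rmul__ = __mul__


def backward(output):
    """Accumulate d(output)/d(node) into node.grad for every recorded node."""
    order, seen = [], set()

    def visit(node):
        # Post-order traversal: parents are appended before their consumers.
        if id(node) not in seen:
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)

    visit(output)
    output.grad = 1.0
    # Reverse order guarantees a node's gradient is complete before it is used.
    for node in reversed(order):
        for parent, local in node.parents:
            parent.grad += local * node.grad


def power(x, n):
    # Ordinary Python control flow: the graph's shape depends on n at run time.
    result = Var(1.0)
    for _ in range(n):
        result = result * x
    return result


x = Var(2.0)
y = power(x, 3) + 4 * x  # y = x**3 + 4*x
backward(y)
```

After `backward(y)`, `x.grad` holds the derivative 3x² + 4 evaluated at x = 2. The per-operation bookkeeping in `Var` is also a small-scale picture of the interpreter overhead described above: every multiplication allocates a node, however cheap the arithmetic itself is.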

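A limitation of earlier approaches is that they can only differentiate code written in a style suited to the framework, which limits interoperability with other programs. Newer approaches resolve this by constructing the graph from the language's syntax or intermediate representation, allowing arbitrary code to be differentiated.^[5]^[6]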

Applications

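Differentiable programming has been applied in areas such as combining deep learning with physics engines in robotics, solving electronic structure problems with differentiable density functional theory, differentiable ray tracing, image processing, and probabilistic programming.^[11]^[12]^[13]^[14]^[15]^[4]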

References

[1] Baydin, Atilim Gunes; Pearlmutter, Barak; Radul, Alexey Andreyevich; Siskind, Jeffrey. Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research. 2018, 18: 1–43 [2021-01-14]. (Archived from the original on 2022-01-23.)
[2] Wang, Fei; Decker, James; Wu, Xilun; Essertel, Gregory; Rompf, Tiark. Bengio, S.; Wallach, H.; Larochelle, H.; Grauman, K. (eds.). Backpropagation with Callbacks: Foundations for Efficient and Expressive Differentiable Programming (PDF). Advances in Neural Information Processing Systems 31 (Curran Associates, Inc.). 2018: 10201–10212 [2019-02-13]. (Archived (PDF) from the original on 2021-02-15.)
[3] Innes, Mike. On Machine Learning and Programming Languages (PDF). SysML Conference 2018. 2018 [2021-01-14]. (Archived (PDF) from the original on 2020-06-05.)
[4] Innes, Mike; Edelman, Alan; Fischer, Keno; Rackauckas, Chris; Saba, Elliot; Shah, Viral B.; Tebbutt, Will. ∂P: A Differentiable Programming System to Bridge Machine Learning and Scientific Computing. 2019. arXiv:1907.07587.
[5] Innes, Michael; Saba, Elliot; Fischer, Keno; Gandhi, Dhairya; Rudilosso, Marco Concetto; Joy, Neethu Mariya; Karmali, Tejan; Pal, Avik; Shah, Viral. Fashionable Modelling with Flux. 2018-10-31 [2022-08-31]. arXiv:1811.01457 [cs.PL]. (Archived from the original on 2022-08-31.)
[6] Automatic Differentiation in Myia. [2019-06-24]. (Archived from the original on 2021-02-24.)
[7] TensorFlow: Static Graphs. [2019-03-04]. (Archived from the original on 2021-09-02.)
[8] Autograd - Efficiently computes derivatives of numpy code. [2022-08-28]. (Archived from the original on 2022-07-18.)
[9] Zygote. [2021-01-14]. (Archived from the original on 2021-02-14.)
[10] Innes, Michael. Don't Unroll Adjoint: Differentiating SSA-Form Programs. 2018-10-18. arXiv:1810.07951 [cs.PL].
[11] Degrave, Jonas; Hermans, Michiel; Dambre, Joni; wyffels, Francis. A Differentiable Physics Engine for Deep Learning in Robotics. 2016-11-05. arXiv:1611.01652 [cs.NE].
[12] Li, Li; Hoyer, Stephan; Pederson, Ryan; Sun, Ruoxi; Cubuk, Ekin D.; Riley, Patrick; Burke, Kieron. Kohn-Sham Equations as Regularizer: Building Prior Knowledge into Machine-Learned Physics. Physical Review Letters. 2021, 126 (3): 036401. doi:10.1103/PhysRevLett.126.036401.
[13] Differentiable Monte Carlo Ray Tracing through Edge Sampling. people.csail.mit.edu. [2019-02-13]. (Archived from the original on 2021-05-12.)
[14] SciML Scientific Machine Learning Open Source Software Organization Roadmap. sciml.ai. [2020-07-19]. (Archived from the original on 2021-10-17.)
[15] Differentiable Programming for Image Processing and Deep Learning in Halide. people.csail.mit.edu. [2019-02-13]. (Archived from the original on 2021-05-06.)
