PMID 7964366. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1334459/.
“Token reinforcement, choice, and self-control in pigeons”. Journal of the Experimental Analysis
M. G. H. (2002). “The Neural Basis of Human Error Processing: Reinforcement Learning, Dopamine, and the Error-related Negativity”. Psychological Review
Riedmiller, Martin et al. (2015-02). “Human-level control through deep reinforcement learning” (in English). Nature 518 (7540): 529–533. doi:10.1038/nature14236. ISSN 1476-4687