Loading AI tools
Broad class of interacting type Monte Carlo algorithms From Wikipedia, the free encyclopedia
Mean-field particle methods are a broad class of interacting type Monte Carlo algorithms for simulating from a sequence of probability distributions satisfying a nonlinear evolution equation.[1][2][3][4] These flows of probability measures can always be interpreted as the distributions of the random states of a Markov process whose transition probabilities depends on the distributions of the current random states.[1][2] A natural way to simulate these sophisticated nonlinear Markov processes is to sample a large number of copies of the process, replacing in the evolution equation the unknown distributions of the random states by the sampled empirical measures. In contrast with traditional Monte Carlo and Markov chain Monte Carlo methods these mean-field particle techniques rely on sequential interacting samples. The terminology mean-field reflects the fact that each of the samples (a.k.a. particles, individuals, walkers, agents, creatures, or phenotypes) interacts with the empirical measures of the process. When the size of the system tends to infinity, these random empirical measures converge to the deterministic distribution of the random states of the nonlinear Markov chain, so that the statistical interaction between particles vanishes. In other words, starting with a chaotic configuration based on independent copies of initial state of the nonlinear Markov chain model, the chaos propagates at any time horizon as the size the system tends to infinity; that is, finite blocks of particles reduces to independent copies of the nonlinear Markov process. This result is called the propagation of chaos property.[5][6][7] The terminology "propagation of chaos" originated with the work of Mark Kac in 1976 on a colliding mean-field kinetic gas model.[8]
The theory of mean-field interacting particle models had certainly started by the mid-1960s, with the work of Henry P. McKean Jr. on Markov interpretations of a class of nonlinear parabolic partial differential equations arising in fluid mechanics.[5][9] The mathematical foundations of these classes of models were developed from the mid-1980s to the mid-1990s by several mathematicians, including Werner Braun, Klaus Hepp,[10] Karl Oelschläger,[11][12][13] Gérard Ben Arous and Marc Brunaud,[14] Donald Dawson, Jean Vaillancourt[15] and Jürgen Gärtner,[16][17] Christian Léonard,[18] Sylvie Méléard, Sylvie Roelly,[6] Alain-Sol Sznitman[7][19] and Hiroshi Tanaka[20] for diffusion type models; F. Alberto Grünbaum,[21] Tokuzo Shiga, Hiroshi Tanaka,[22] Sylvie Méléard and Carl Graham[23][24][25] for general classes of interacting jump-diffusion processes.
We also quote an earlier pioneering article by Theodore E. Harris and Herman Kahn, published in 1951, using mean-field but heuristic-like genetic methods for estimating particle transmission energies.[26] Mean-field genetic type particle methods are also used as heuristic natural search algorithms (a.k.a. metaheuristic) in evolutionary computing. The origins of these mean-field computational techniques can be traced to 1950 and 1954 with the work of Alan Turing on genetic type mutation-selection learning machines[27] and the articles by Nils Aall Barricelli at the Institute for Advanced Study in Princeton, New Jersey.[28][29] The Australian geneticist Alex Fraser also published in 1957 a series of papers on the genetic type simulation of artificial selection of organisms.[30]
Quantum Monte Carlo, and more specifically Diffusion Monte Carlo methods can also be interpreted as a mean-field particle approximation of Feynman-Kac path integrals.[3][4][31][32][33][34][35] The origins of Quantum Monte Carlo methods are often attributed to Enrico Fermi and Robert Richtmyer who developed in 1948 a mean field particle interpretation of neutron-chain reactions,[36] but the first heuristic-like and genetic type particle algorithm (a.k.a. Resampled or Reconfiguration Monte Carlo methods) for estimating ground state energies of quantum systems (in reduced matrix models) is due to Jack H. Hetherington in 1984[35] In molecular chemistry, the use of genetic heuristic-like particle methods (a.k.a. pruning and enrichment strategies) can be traced back to 1955 with the seminal work of Marshall. N. Rosenbluth and Arianna. W. Rosenbluth.[37]
The first pioneering articles on the applications of these heuristic-like particle methods in nonlinear filtering problems were the independent studies of Neil Gordon, David Salmon and Adrian Smith (bootstrap filter),[38] Genshiro Kitagawa (Monte Carlo filter) ,[39] and the one by Himilcon Carvalho, Pierre Del Moral, André Monin and Gérard Salut[40] published in the 1990s. The term interacting "particle filters" was first coined in 1996 by Del Moral.[41] Particle filters were also developed in signal processing in the early 1989-1992 by P. Del Moral, J.C. Noyer, G. Rigal, and G. Salut in the LAAS-CNRS in a series of restricted and classified research reports with STCAN (Service Technique des Constructions et Armes Navales), the IT company DIGILOG, and the LAAS-CNRS (the Laboratory for Analysis and Architecture of Systems) on RADAR/SONAR and GPS signal processing problems.[42][43][44][45][46][47]
The foundations and the first rigorous analysis on the convergence of genetic type models and mean field Feynman-Kac particle methods are due to Pierre Del Moral[48][49] in 1996. Branching type particle methods with varying population sizes were also developed in the end of the 1990s by Dan Crisan, Jessica Gaines and Terry Lyons,[50][51][52] and by Dan Crisan, Pierre Del Moral and Terry Lyons.[53] The first uniform convergence results with respect to the time parameter for mean field particle models were developed in the end of the 1990s by Pierre Del Moral and Alice Guionnet[54][55] for interacting jump type processes, and by Florent Malrieu for nonlinear diffusion type processes.[56]
New classes of mean field particle simulation techniques for Feynman-Kac path-integration problems includes genealogical tree based models,[2][3][57] backward particle models,[2][58] adaptive mean field particle models,[59] island type particle models,[60][61] and particle Markov chain Monte Carlo methods[62][63]
In physics, and more particularly in statistical mechanics, these nonlinear evolution equations are often used to describe the statistical behavior of microscopic interacting particles in a fluid or in some condensed matter. In this context, the random evolution of a virtual fluid or a gas particle is represented by McKean-Vlasov diffusion processes, reaction–diffusion systems, or Boltzmann type collision processes.[11][12][13][25][64] As its name indicates, the mean field particle model represents the collective behavior of microscopic particles weakly interacting with their occupation measures. The macroscopic behavior of these many-body particle systems is encapsulated in the limiting model obtained when the size of the population tends to infinity. Boltzmann equations represent the macroscopic evolution of colliding particles in rarefied gases, while McKean Vlasov diffusions represent the macroscopic behavior of fluid particles and granular gases.
In computational physics and more specifically in quantum mechanics, the ground state energies of quantum systems is associated with the top of the spectrum of Schrödinger's operators. The Schrödinger equation is the quantum mechanics version of the Newton's second law of motion of classical mechanics (the mass times the acceleration is the sum of the forces). This equation represents the wave function (a.k.a. the quantum state) evolution of some physical system, including molecular, atomic of subatomic systems, as well as macroscopic systems like the universe.[65] The solution of the imaginary time Schrödinger equation (a.k.a. the heat equation) is given by a Feynman-Kac distribution associated with a free evolution Markov process (often represented by Brownian motions) in the set of electronic or macromolecular configurations and some potential energy function. The long time behavior of these nonlinear semigroups is related to top eigenvalues and ground state energies of Schrödinger's operators.[3][32][33][34][35][66] The genetic type mean field interpretation of these Feynman-Kac models are termed Resample Monte Carlo, or Diffusion Monte Carlo methods. These branching type evolutionary algorithms are based on mutation and selection transitions. During the mutation transition, the walkers evolve randomly and independently in a potential energy landscape on particle configurations. The mean field selection process (a.k.a. quantum teleportation, population reconfiguration, resampled transition) is associated with a fitness function that reflects the particle absorption in an energy well. Configurations with low relative energy are more likely to duplicate. In molecular chemistry, and statistical physics Mean field particle methods are also used to sample Boltzmann-Gibbs measures associated with some cooling schedule, and to compute their normalizing constants (a.k.a. free energies, or partition functions).[2][67][68][69]
In computational biology, and more specifically in population genetics, spatial branching processes with competitive selection and migration mechanisms can also be represented by mean field genetic type population dynamics models.[4][70] The first moments of the occupation measures of a spatial branching process are given by Feynman-Kac distribution flows.[71][72] The mean field genetic type approximation of these flows offers a fixed population size interpretation of these branching processes.[2][3][73] Extinction probabilities can be interpreted as absorption probabilities of some Markov process evolving in some absorbing environment. These absorption models are represented by Feynman-Kac models.[74][75][76][77] The long time behavior of these processes conditioned on non-extinction can be expressed in an equivalent way by quasi-invariant measures, Yaglom limits,[78] or invariant measures of nonlinear normalized Feynman-Kac flows.[2][3][54][55][66][79]
In computer sciences, and more particularly in artificial intelligence these mean field type genetic algorithms are used as random search heuristics that mimic the process of evolution to generate useful solutions to complex optimization problems.[80][81][82] These stochastic search algorithms belongs to the class of Evolutionary models. The idea is to propagate a population of feasible candidate solutions using mutation and selection mechanisms. The mean field interaction between the individuals is encapsulated in the selection and the cross-over mechanisms.
In mean field games and multi-agent interacting systems theories, mean field particle processes are used to represent the collective behavior of complex systems with interacting individuals.[83][84][85][86][87][88][89][90] In this context, the mean field interaction is encapsulated in the decision process of interacting agents. The limiting model as the number of agents tends to infinity is sometimes called the continuum model of agents[91]
In information theory, and more specifically in statistical machine learning and signal processing, mean field particle methods are used to sample sequentially from the conditional distributions of some random process with respect to a sequence of observations or a cascade of rare events.[2][3][73][92] In discrete time nonlinear filtering problems, the conditional distributions of the random states of a signal given partial and noisy observations satisfy a nonlinear updating-prediction evolution equation. The updating step is given by Bayes' rule, and the prediction step is a Chapman-Kolmogorov transport equation. The mean field particle interpretation of these nonlinear filtering equations is a genetic type selection-mutation particle algorithm[48] During the mutation step, the particles evolve independently of one another according to the Markov transitions of the signal . During the selection stage, particles with small relative likelihood values are killed, while the ones with high relative values are multiplied.[93][94] These mean field particle techniques are also used to solve multiple-object tracking problems, and more specifically to estimate association measures[2][73][95]
The continuous time version of these particle models are mean field Moran type particle interpretations of the robust optimal filter evolution equations or the Kushner-Stratonotich stochastic partial differential equation.[4][31][94] These genetic type mean field particle algorithms also termed Particle Filters and Sequential Monte Carlo methods are extensively and routinely used in operation research and statistical inference .[96][97][98] The term "particle filters" was first coined in 1996 by Del Moral,[41] and the term "sequential Monte Carlo" by Liu and Chen in 1998. Subset simulation and Monte Carlo splitting[99] techniques are particular instances of genetic particle schemes and Feynman-Kac particle models equipped with Markov chain Monte Carlo mutation transitions[67][100][101]
To motivate the mean field simulation algorithm we start with S a finite or countable state space and let P(S) denote the set of all probability measures on S. Consider a sequence of probability distributions on S satisfying an evolution equation:
(1) |
for some, possibly nonlinear, mapping These distributions are given by vectors
that satisfy:
Therefore, is a mapping from the -unit simplex into itself, where s stands for the cardinality of the set S. When s is too large, solving equation (1) is intractable or computationally very costly. One natural way to approximate these evolution equations is to reduce sequentially the state space using a mean field particle model. One of the simplest mean field simulation scheme is defined by the Markov chain
on the product space , starting with N independent random variables with probability distribution and elementary transitions
with the empirical measure
where is the indicator function of the state x.
In other words, given the samples are independent random variables with probability distribution . The rationale behind this mean field simulation technique is the following: We expect that when is a good approximation of , then is an approximation of . Thus, since is the empirical measure of N conditionally independent random variables with common probability distribution , we expect to be a good approximation of .
Another strategy is to find a collection
of stochastic matrices indexed by such that
(2) |
This formula allows us to interpret the sequence as the probability distributions of the random states of the nonlinear Markov chain model with elementary transitions
A collection of Markov transitions satisfying the equation (1) is called a McKean interpretation of the sequence of measures . The mean field particle interpretation of (2) is now defined by the Markov chain
on the product space , starting with N independent random copies of and elementary transitions
with the empirical measure
Under some weak regularity conditions[2] on the mapping for any function , we have the almost sure convergence
These nonlinear Markov processes and their mean field particle interpretation can be extended to time non homogeneous models on general measurable state spaces.[2]
To illustrate the abstract models presented above, we consider a stochastic matrix and some function . We associate with these two objects the mapping
and the Boltzmann-Gibbs measures defined by
We denote by the collection of stochastic matrices indexed by given by
for some parameter . It is readily checked that the equation (2) is satisfied. In addition, we can also show (cf. for instance[3]) that the solution of (1) is given by the Feynman-Kac formula
with a Markov chain with initial distribution and Markov transition M.
For any function we have
If is the unit function and , then we have
And the equation (2) reduces to the Chapman-Kolmogorov equation
The mean field particle interpretation of this Feynman-Kac model is defined by sampling sequentially N conditionally independent random variables with probability distribution
In other words, with a probability the particle evolves to a new state randomly chosen with the probability distribution ; otherwise, jumps to a new location randomly chosen with a probability proportional to and evolves to a new state randomly chosen with the probability distribution If is the unit function and , the interaction between the particle vanishes and the particle model reduces to a sequence of independent copies of the Markov chain . When the mean field particle model described above reduces to a simple mutation-selection genetic algorithm with fitness function G and mutation transition M. These nonlinear Markov chain models and their mean field particle interpretation can be extended to time non homogeneous models on general measurable state spaces (including transition states, path spaces and random excursion spaces) and continuous time models.[1][2][3]
We consider a sequence of real valued random variables defined sequentially by the equations
(3) |
with a collection of independent standard Gaussian random variables, a positive parameter σ, some functions and some standard Gaussian initial random state . We let be the probability distribution of the random state ; that is, for any bounded measurable function f, we have
with
The integral is the Lebesgue integral, and dx stands for an infinitesimal neighborhood of the state x. The Markov transition of the chain is given for any bounded measurable functions f by the formula
with
Using the tower property of conditional expectations we prove that the probability distributions satisfy the nonlinear equation
for any bounded measurable functions f. This equation is sometimes written in the more synthetic form
The mean field particle interpretation of this model is defined by the Markov chain
on the product space by
where
stand for N independent copies of and respectively. For regular models (for instance for bounded Lipschitz functions a, b, c) we have the almost sure convergence
with the empirical measure
for any bounded measurable functions f (cf. for instance [2]). In the above display, stands for the Dirac measure at the state x.
We consider a standard Brownian motion (a.k.a. Wiener Process) evaluated on a time mesh sequence with a given time step . We choose in equation (1), we replace and σ by and , and we write instead of the values of the random states evaluated at the time step Recalling that are independent centered Gaussian random variables with variance the resulting equation can be rewritten in the following form
(4) |
When h → 0, the above equation converge to the nonlinear diffusion process
The mean field continuous time model associated with these nonlinear diffusions is the (interacting) diffusion process on the product space defined by
where
are N independent copies of and For regular models (for instance for bounded Lipschitz functions a, b) we have the almost sure convergence
with and the empirical measure
for any bounded measurable functions f (cf. for instance.[7]). These nonlinear Markov processes and their mean field particle interpretation can be extended to interacting jump-diffusion processes[1][2][23][25]
Seamless Wikipedia browsing. On steroids.
Every time you click a link to Wikipedia, Wiktionary or Wikiquote in your browser's search results, it will show the modern Wikiwand interface.
Wikiwand extension is a five stars, simple, with minimum permission required to keep your browsing private, safe and transparent.