Foundation model
A foundation model is a machine learning or deep learning model that is trained on broad data such that it can be applied across a wide range of use cases.[1] Foundation models have transformed artificial intelligence (AI), powering prominent generative AI applications like ChatGPT.[1] The Stanford Institute for Human-Centered Artificial Intelligence's (HAI) Center for Research on Foundation Models (CRFM) created and popularized the term.[2]
Foundation models are general-purpose technologies that can support a diverse range of use cases. Building a foundation model is often highly resource-intensive, with the most expensive models costing hundreds of millions of dollars for the underlying data and compute.[3] In contrast, adapting an existing foundation model to a specific use case, or using it directly, is far less expensive.
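A minimal sketch of what such adaptation can look like in practice, assuming the Hugging Face transformers library, the bert-base-uncased checkpoint, and a hypothetical two-class text-classification task (none of which are specified in this article): the pretrained weights are reused as-is, and only a small classification head is initialized for the downstream use case.

```python
# Illustrative sketch: reusing a pretrained foundation model (BERT) for a
# downstream task instead of training from scratch. The model name, task,
# and library choice here are assumptions for illustration only.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=2,  # new, untrained 2-class head placed on top of the pretrained encoder
)

# Run the adapted model directly on a single example.
inputs = tokenizer("Foundation models are broadly useful.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
predicted_class = logits.argmax(dim=-1).item()
print(predicted_class)
```

Only the small task-specific head (and optionally the encoder, via fine-tuning) needs further training, which is why adaptation costs a tiny fraction of the original pretraining.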
Early examples of foundation models are language models (LMs) like Google's BERT[4] and OpenAI's "GPT-n" series. Beyond text, foundation models have been developed across a range of modalities, including DALL-E and Flamingo[5] for images, MusicGen[6] for music, and RT-2[7] for robotic control. Foundation models constitute a broad shift in AI development: they are being built for astronomy,[8] radiology,[9] genomics,[10] music,[11] coding,[12] time-series forecasting,[13] and mathematics.[14]