Leakage (machine learning)
Concept in machine learning / From Wikipedia, the free encyclopedia
In statistics and machine learning, leakage (also known as data leakage or target leakage) is the use of information during model training that would not be expected to be available at prediction time, causing the predictive scores (metrics) to overestimate the model's utility in a production environment.[1]
"Data leakage" redirects here. For the unauthorized exposure, disclosure, or loss of personal information, see Data breach.
Leakage is often subtle and indirect, making it hard to detect and eliminate. Because the inflated scores make a leaky model look better than it is, leakage can cause a statistician or modeler to select a suboptimal model, which could be outperformed by a leakage-free model.[1]
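The effect can be illustrated with a small synthetic experiment. The following is a minimal sketch rather than a standard procedure: the data generator, the feature names (weak, leaky), the one-feature threshold classifier (fit_stump) and the choice to fill the unavailable feature with 0.0 in production are all assumptions made for illustration. The leaky feature stands in for a value recorded only after the outcome is known, so it is present in the historical data but not at prediction time.

```python
import random

def make_rows(n, rng):
    """Build synthetic rows of the form (weak, leaky, label)."""
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        weak = label + rng.gauss(0.0, 1.5)   # noisy but legitimately available
        leaky = label + rng.gauss(0.0, 0.1)  # recorded after the outcome: leaks the label
        rows.append((weak, leaky, label))
    return rows

def accuracy(rows, feature, threshold):
    """Share of rows where 'feature value > threshold' matches 'label == 1'."""
    hits = sum((row[feature] > threshold) == (row[2] == 1) for row in rows)
    return hits / len(rows)

def fit_stump(rows, allowed):
    """Choose, among the allowed feature indices, the one whose midpoint
    threshold between class means gives the best training accuracy."""
    best = None
    for feature in allowed:
        neg = [row[feature] for row in rows if row[2] == 0]
        pos = [row[feature] for row in rows if row[2] == 1]
        threshold = (sum(neg) / len(neg) + sum(pos) / len(pos)) / 2
        score = accuracy(rows, feature, threshold)
        if best is None or score > best[2]:
            best = (feature, threshold, score)
    return best[0], best[1]

rng = random.Random(0)
train = make_rows(5000, rng)
test = make_rows(5000, rng)

# Assumption: at prediction time the leaky value does not exist yet,
# so production rows carry a constant placeholder instead.
production = [(weak, 0.0, label) for weak, _, label in test]

leaky_model = fit_stump(train, allowed=(0, 1))  # may use the leaky feature
clean_model = fit_stump(train, allowed=(0,))    # restricted to the weak feature

leaky_offline = accuracy(test, *leaky_model)
leaky_production = accuracy(production, *leaky_model)
clean_production = accuracy(production, *clean_model)

print("leaky model, offline test:", leaky_offline)
print("leaky model, production:  ", leaky_production)
print("clean model, production:  ", clean_production)
```

Under these assumptions, the model that is allowed to use the leaky feature scores almost perfectly on the held-out test set, yet falls to about chance level once that feature is unavailable, while the model restricted to the weak feature does modestly better than chance in production. A modeler comparing only the offline scores would therefore pick the worse model.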