Data mining is an interdisciplinary subfield of
computer science. It is the computational process of discovering patterns in large
data sets ("big data") involving methods at the intersection of
artificial intelligence, machine learning, statistics, and database systems. The overall goal of the data mining process is to extract information from a data set and transform it into an understandable structure for further use. Aside from the raw analysis step, it involves database and
data management aspects, data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures, visualization, and online updating. Data mining is the analysis step of the "knowledge discovery in databases" (KDD) process.
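
As a rough illustration of "discovering patterns" and of an interestingness metric, the following minimal Python sketch counts how often pairs of items occur together in a small set of transactions and keeps only the pairs whose support (the fraction of transactions containing both items) meets a threshold. The function name `frequent_pairs`, the `min_support` parameter, and the toy shopping-basket data are assumptions made for this example; the brute-force pair counting is one simple technique among many and is not suited to large data sets.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Return {pair: support} for item pairs with support >= min_support.

    Support is the fraction of transactions that contain both items.
    This is a hypothetical helper for illustration only.
    """
    counts = Counter()
    for items in transactions:
        # Deduplicate and sort so each pair is counted once per
        # transaction and always appears in the same order.
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1

    n = len(transactions)
    # If there are no transactions, counts is empty and no division occurs.
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}


# Toy data (assumed for the example): each inner list is one transaction.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
    ["bread", "eggs", "milk"],
]

result = frequent_pairs(transactions, min_support=0.5)
print(result)  # {('bread', 'milk'): 0.6}
```

Here the raw transactions play the role of the data set, the pair counting is the analysis step, the support threshold acts as a simple interestingness metric, and the returned dictionary is the "understandable structure" passed on for further use.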