Data mining is the process of extracting patterns from data. It is becoming an increasingly important tool for transforming data into information. It is commonly used in a wide range of profiling practices, such as marketing, surveillance, fraud detection and scientific discovery.

Data mining can be used to uncover patterns in data, but it is often carried out only on samples of the data. The mining process will be ineffective if the sample is not a good representation of the larger body of data. Data mining cannot discover patterns that may exist in the larger body of data if those patterns are not present in the sample being "mined". The inability to find patterns can be a cause of disputes between customers and service providers. Data mining is therefore not foolproof, but it can be useful if sufficiently representative data samples are collected. The discovery of a particular pattern in a particular set of data does not necessarily mean that the pattern holds elsewhere in the larger body of data from which the sample was drawn. An important part of the process is the verification and validation of patterns on other samples of data.
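
The validation step can be illustrated with a minimal sketch. It assumes a toy setting in which each record is a set of purchased items and a "pattern" is a set of items that tend to occur together; the helper names support and holds_in_all, the sample data and the 0.5 threshold are all hypothetical choices made for illustration, not part of any particular data mining tool.

```python
from typing import FrozenSet, Iterable, Sequence

Transaction = FrozenSet[str]


def support(transactions: Sequence[Transaction], pattern: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `pattern`."""
    items = frozenset(pattern)
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def holds_in_all(pattern: Iterable[str],
                 samples: Sequence[Sequence[Transaction]],
                 min_support: float) -> bool:
    """True only if the pattern reaches `min_support` in every sample."""
    items = frozenset(pattern)
    return all(support(s, items) >= min_support for s in samples)


# Hypothetical sample that was "mined": the pattern looks strong here.
mined_sample = [
    frozenset({"bread", "butter"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "butter", "milk"}),
    frozenset({"milk"}),
]

# Hypothetical second sample drawn from the same larger body of data.
other_sample = [
    frozenset({"bread"}),
    frozenset({"butter", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"milk"}),
]

pattern = {"bread", "butter"}

print(support(mined_sample, pattern))   # 0.75
print(support(other_sample, pattern))   # 0.25
print(holds_in_all(pattern, [mined_sample, other_sample], min_support=0.5))  # False
```

In this toy case the pattern appears in three of four mined records but in only one of four records in the second sample, so it would not be accepted as a property of the larger body of data.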

The related terms data dredging, data fishing and data snooping refer to the use of data mining techniques on samples that are (or may be) too small for statistical inferences to be made about the validity of any patterns found (see also data-snooping bias). Data dredging may, however, be used to develop new hypotheses, which must then be validated with sufficiently large sample sets.
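
The risk of dredging a small sample can be seen in a rough simulation. The sketch below assumes a deliberately artificial setup: 200 features and a target that are all independent random noise, a "mined" sample of 10 rows and a validation sample of 2000 rows. These sizes, the seed and the helper names pearson and draw are illustrative assumptions only.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def draw(rng, n_rows, n_features):
    """Draw a target and features that are independent Gaussian noise."""
    target = [rng.gauss(0, 1) for _ in range(n_rows)]
    features = [[rng.gauss(0, 1) for _ in range(n_rows)]
                for _ in range(n_features)]
    return target, features


rng = random.Random(0)
N_FEATURES = 200

# "Dredge" a tiny sample: keep whichever feature correlates best with the target.
target, features = draw(rng, 10, N_FEATURES)
best = max(range(N_FEATURES), key=lambda j: abs(pearson(features[j], target)))
dredged_r = pearson(features[best], target)

# Validate the same feature on a much larger, fresh sample.
big_target, big_features = draw(rng, 2000, N_FEATURES)
validated_r = pearson(big_features[best], big_target)

print(f"feature {best}: r = {dredged_r:.2f} on 10 rows")
print(f"feature {best}: r = {validated_r:.2f} on 2000 rows")
```

Because no feature is actually related to the target, the strong correlation found on 10 rows is an artifact of searching many candidates in a small sample, and it should largely vanish on the larger sample. The dredged feature is at best a hypothesis to test, not a finding.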

The main professional body in the field is the Association for Computing Machinery's Special Interest Group on Knowledge Discovery and Data Mining (SIGKDD). Since 1989 it has hosted an annual international conference and published its proceedings, [6] and since 1999 it has published a biannual academic journal titled "SIGKDD Explorations". Other computer science conferences on data mining include:

  • DMIN - International Conference on Data Mining;
  • DMKD - Research Issues on Data Mining and Knowledge Discovery;
  • ECML-PKDD - European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases;
  • ICDM - IEEE International Conference on Data Mining;
  • MLDM - Machine Learning and Data Mining in Pattern Recognition;
  • SDM - SIAM International Conference on Data Mining;
  • EDM - International Conference on Educational Data Mining.
