52 relations: Anomaly detection, Backward induction, Bellman equation, Chi-squared distribution, Cluster analysis, Clustering high-dimensional data, Combinatorial explosion, Combinatorics, Concentration of measure, Data dredging, Data mining, Data set, Database, Dimension, Dimensionality reduction, Directed graph, Dynamic programming, Euclidean distance, Feature (machine learning), Feature selection, Gamma function, Graph (discrete mathematics), Hypercube, Hypersphere, Independent and identically distributed random variables, Information retrieval, K-nearest neighbors algorithm, Linear least squares (mathematics), List of Fourier-related transforms, Machine learning, Mathematical optimization, Model order reduction, Multilinear principal component analysis, Multilinear subspace learning, Nearest neighbor search, Numerical analysis, Principal component analysis, Real number, Richard E. Bellman, Sampling (statistics), Semi-supervised learning, Signal-to-noise ratio, Singular-value decomposition, Skewness, Space, Space (mathematics), Statistical classification, Three-dimensional space, Time series, Unit cube, ..., Unit interval, Volume.
In data mining, anomaly detection (also outlier detection) is the identification of items, events or observations which do not conform to an expected pattern or other items in a dataset.
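A minimal sketch of one common approach, z-score thresholding: points whose standardized distance from the mean exceeds a cutoff are flagged as outliers (the function name and threshold here are illustrative, not from the source).

```python
from statistics import mean, stdev

def zscore_outliers(values, threshold=2.0):
    """Flag values whose z-score exceeds the threshold (illustrative helper)."""
    mu, sigma = mean(values), stdev(values)
    return [v for v in values if abs(v - mu) / sigma > threshold]

data = [10.1, 9.8, 10.3, 9.9, 10.0, 10.2, 55.0]
print(zscore_outliers(data))   # only 55.0 stands apart from the pattern
```

Note that a single extreme value inflates the standard deviation itself, which is why a modest threshold is used here; robust variants replace mean/stdev with median-based statistics.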
Backward induction is the process of reasoning backwards in time, from the end of a problem or situation, to determine a sequence of optimal actions.
A Bellman equation, named after Richard E. Bellman, is a necessary condition for optimality associated with the mathematical optimization method known as dynamic programming.
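The Bellman optimality condition can be iterated to a fixed point (value iteration). A toy sketch on a hypothetical three-state chain, where reaching state 2 yields reward 1 and ends the episode:

```python
# Toy deterministic MDP (hypothetical): states 0..2, actions move left/right.
GAMMA = 0.9
STATES = [0, 1, 2]

def step(s, a):            # a is -1 (left) or +1 (right)
    s2 = min(max(s + a, 0), 2)
    reward = 1.0 if s2 == 2 and s != 2 else 0.0
    return s2, reward

V = {s: 0.0 for s in STATES}
for _ in range(50):        # apply the Bellman optimality update repeatedly
    V = {s: (0.0 if s == 2 else
             max(r + GAMMA * V[s2]
                 for s2, r in (step(s, a) for a in (-1, 1))))
         for s in STATES}

print(V)   # state 1 is one step from the goal, state 0 is two steps away
```

The converged values satisfy V(s) = max_a [r(s, a) + γ·V(s')], the Bellman equation itself: V(1) = 1 and V(0) = γ·V(1) = 0.9.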
Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense) to each other than to those in other groups (clusters).
Clustering high-dimensional data is the cluster analysis of data with anywhere from a few dozen to many thousands of dimensions.
In mathematics, a combinatorial explosion is the rapid growth of the complexity of a problem due to how the combinatorics of the problem is affected by the input, constraints, and bounds of the problem.
Combinatorics is an area of mathematics primarily concerned with counting, both as a means and an end in obtaining results, and certain properties of finite structures.
In mathematics, concentration of measure (about a median) is a principle that is applied in measure theory, probability and combinatorics, and has consequences for other fields such as Banach space theory.
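A concrete illustration of why concentration of measure matters for high-dimensional data: the fraction of the unit cube [0,1]^d lying within ε of its boundary is 1 − (1 − 2ε)^d, which approaches 1 as d grows (a small stdlib sketch).

```python
# Fraction of the unit cube [0,1]^d within eps of the boundary: the interior
# "core" is a cube of side (1 - 2*eps), so the shell fraction is
# 1 - (1 - 2*eps)**d, which tends to 1 as the dimension d grows.
def boundary_fraction(d, eps=0.05):
    return 1.0 - (1.0 - 2.0 * eps) ** d

for d in (1, 10, 100):
    print(d, round(boundary_fraction(d), 4))
```

In 100 dimensions, essentially all of the cube's volume sits in a thin shell near the surface, so "typical" points are nowhere near the center.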
Data dredging (also data fishing, data snooping, and p-hacking) is the use of data mining to uncover patterns in data that can be presented as statistically significant, without first devising a specific hypothesis as to the underlying causality.
Data mining is the process of discovering patterns in large data sets involving methods at the intersection of machine learning, statistics, and database systems.
A data set (or dataset) is a collection of data.
A database is an organized collection of data, stored and accessed electronically.
In physics and mathematics, the dimension of a mathematical space (or object) is informally defined as the minimum number of coordinates needed to specify any point within it.
In statistics, machine learning, and information theory, dimensionality reduction or dimension reduction is the process of reducing the number of random variables under consideration by obtaining a set of principal variables.
In mathematics, and more specifically in graph theory, a directed graph (or digraph) is a graph that is a set of vertices connected by edges, where the edges have a direction associated with them.
Dynamic programming is both a mathematical optimization method and a computer programming method.
In mathematics, the Euclidean distance or Euclidean metric is the "ordinary" straight-line distance between two points in Euclidean space.
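The definition translates directly into code: the square root of the sum of squared coordinate differences (a minimal sketch).

```python
from math import sqrt

def euclidean(p, q):
    """Straight-line distance between two points of equal dimension."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

print(euclidean((0, 0), (3, 4)))   # 5.0, the classic 3-4-5 right triangle
```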
In machine learning and pattern recognition, a feature is an individual measurable property or characteristic of a phenomenon being observed.
In machine learning and statistics, feature selection, also known as variable selection, attribute selection or variable subset selection, is the process of selecting a subset of relevant features (variables, predictors) for use in model construction.
In mathematics, the gamma function (represented by Γ, the capital Greek letter gamma) is an extension of the factorial function, with its argument shifted down by 1, to real and complex numbers.
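The shift by 1 means Γ(n) = (n − 1)! for positive integers, which is easy to check with the standard library:

```python
import math

# Gamma extends the factorial: Gamma(n) == (n - 1)! for positive integers n,
# and it is also defined at non-integer arguments, e.g. Gamma(0.5) == sqrt(pi).
for n in range(1, 6):
    print(n, math.gamma(n), math.factorial(n - 1))
```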
In mathematics, and more specifically in graph theory, a graph is a structure amounting to a set of objects in which some pairs of the objects are in some sense "related".
In geometry, a hypercube is an n-dimensional analogue of a square and a cube.
In geometry of higher dimensions, a hypersphere is the set of points at a constant distance from a given point called its center.
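The volume enclosed by a hypersphere has a closed form involving the gamma function, V_n(r) = π^(n/2) / Γ(n/2 + 1) · r^n, and it shrinks relative to the enclosing hypercube as the dimension grows (a point often raised alongside the curse of dimensionality). A small sketch:

```python
import math

def ball_volume(n, r=1.0):
    """Volume of an n-dimensional ball: pi**(n/2) / Gamma(n/2 + 1) * r**n."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1) * r ** n

for n in (2, 3, 10, 20):
    # ratio of the unit ball's volume to the enclosing cube [-1, 1]^n
    print(n, ball_volume(n) / 2 ** n)
```

The printed ratio collapses toward zero: in high dimensions, almost none of the cube's volume lies inside the inscribed ball.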
In probability theory and statistics, a sequence or other collection of random variables is independent and identically distributed (i.i.d. or iid or IID) if each random variable has the same probability distribution as the others and all are mutually independent.
Information retrieval (IR) is the activity of obtaining information system resources relevant to an information need from a collection of information resources.
In pattern recognition, the k-nearest neighbors algorithm (k-NN) is a non-parametric method used for classification and regression.
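A minimal brute-force k-NN classifier (the data and helper name are illustrative): sort training points by distance to the query and take a majority vote among the k closest labels.

```python
from collections import Counter
from math import dist

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.
    `train` is a list of (point, label) pairs; brute-force, O(n log n)."""
    neighbors = sorted(train, key=lambda pl: dist(pl[0], query))[:k]
    return Counter(label for _, label in neighbors).most_common(1)[0][0]

train = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
         ((5, 5), "b"), ((6, 5), "b")]
print(knn_predict(train, (0.5, 0.5)))   # "a": all 3 nearest points are "a"
```

The same routine performs regression if the vote is replaced with an average of the neighbors' values.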
In statistics and mathematics, linear least squares is an approach to fitting a mathematical or statistical model to data in cases where the idealized value provided by the model for any data point is expressed linearly in terms of the unknown parameters of the model.
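For the simplest case, a line y = m·x + b fitted to data, the least-squares solution has a closed form derived from the normal equations (a stdlib-only sketch):

```python
# Ordinary least squares for one predictor: minimize sum((y - m*x - b)**2).
# Setting the partial derivatives to zero gives the closed-form solution below.
def fit_line(xs, ys):
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    m = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - m * sx) / n
    return m, b

print(fit_line([0, 1, 2, 3], [1, 3, 5, 7]))   # exact fit: (2.0, 1.0)
```

Because the data here lie exactly on y = 2x + 1, the residuals are zero; with noisy data the same formulas return the line minimizing the sum of squared residuals.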
This is a list of linear transformations of functions related to Fourier analysis.
Machine learning is a subset of artificial intelligence in the field of computer science that often uses statistical techniques to give computers the ability to "learn" (i.e., progressively improve performance on a specific task) with data, without being explicitly programmed.
In mathematics, computer science and operations research, mathematical optimization or mathematical programming, alternatively spelled optimisation, is the selection of a best element (with regard to some criterion) from some set of available alternatives.
Model order reduction (MOR) is a technique for reducing the computational complexity of mathematical models in numerical simulations.
Multilinear principal component analysis (MPCA) is a multilinear extension of principal component analysis (PCA).
Multilinear subspace learning is an approach to dimensionality reduction.
Nearest neighbor search (NNS), as a form of proximity search, is the optimization problem of finding the point in a given set that is closest (or most similar) to a given point.
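The baseline algorithm is a linear scan over all candidates, O(n) distance computations per query (a minimal sketch; tree- or hash-based indexes exist precisely to beat this in low dimensions):

```python
from math import dist

def nearest(points, query):
    """Brute-force nearest neighbor: compare the query to every point."""
    return min(points, key=lambda p: dist(p, query))

pts = [(0, 0), (2, 3), (5, 1), (-1, 4)]
print(nearest(pts, (4, 1)))   # (5, 1), at distance 1
```

In high dimensions, distances between random points concentrate around a common value, which degrades both indexed and brute-force search — one face of the curse of dimensionality.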
Numerical analysis is the study of algorithms that use numerical approximation (as opposed to general symbolic manipulations) for the problems of mathematical analysis (as distinguished from discrete mathematics).
Principal component analysis (PCA) is a statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components.
In mathematics, a real number is a value of a continuous quantity that can represent a distance along a line.
Richard Ernest Bellman (August 26, 1920 – March 19, 1984) was an American applied mathematician who introduced dynamic programming in 1953 and made important contributions in other fields of mathematics.
In statistics, quality assurance, and survey methodology, sampling is the selection of a subset (a statistical sample) of individuals from within a statistical population to estimate characteristics of the whole population.
Semi-supervised learning is a class of supervised learning tasks and techniques that also make use of unlabeled data for training – typically a small amount of labeled data with a large amount of unlabeled data.
Signal-to-noise ratio (abbreviated SNR or S/N) is a measure used in science and engineering that compares the level of a desired signal to the level of background noise.
In linear algebra, the singular-value decomposition (SVD) is a factorization of a real or complex matrix.
In probability theory and statistics, skewness is a measure of the asymmetry of the probability distribution of a real-valued random variable about its mean.
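Skewness is the third standardized moment, E[((x − μ)/σ)³]; symmetric data score zero, while a long right tail gives a positive value (a small stdlib sketch using the population form):

```python
from statistics import mean

def skewness(xs):
    """Population skewness: the third standardized moment of the sample."""
    mu = mean(xs)
    n = len(xs)
    var = sum((x - mu) ** 2 for x in xs) / n
    return sum((x - mu) ** 3 for x in xs) / (n * var ** 1.5)

print(skewness([1, 2, 3, 4, 5]))         # symmetric about the mean: 0.0
print(skewness([1, 1, 1, 1, 10]) > 0)    # long right tail: True
```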
Space is the boundless three-dimensional extent in which objects and events have relative position and direction.
In mathematics, a space is a set (sometimes called a universe) with some added structure.
In machine learning and statistics, classification is the problem of identifying to which of a set of categories (sub-populations) a new observation belongs, on the basis of a training set of data containing observations (or instances) whose category membership is known.
Three-dimensional space (also: 3-space or, rarely, tri-dimensional space) is a geometric setting in which three values (called parameters) are required to determine the position of an element (i.e., point).
A time series is a series of data points indexed (or listed or graphed) in time order.
A unit cube, more formally a cube of side 1, is a cube whose sides are 1 unit long.
In mathematics, the unit interval is the closed interval [0, 1], that is, the set of all real numbers that are greater than or equal to 0 and less than or equal to 1.
Volume is the quantity of three-dimensional space enclosed by a closed surface, for example, the space that a substance (solid, liquid, gas, or plasma) or shape occupies or contains.