Cross-validation (statistics)

Cross-validation, sometimes called rotation estimation or out-of-sample testing, is any of various similar model validation techniques for assessing how the results of a statistical analysis will generalize to an independent data set. [1]
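
As an illustration of the idea, the sketch below builds train/test index splits for k-fold cross-validation, one common variant. The function names `k_fold_indices` and `train_test_splits`, the shuffling step, and the fixed seed are assumptions made for this example and do not come from the source.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Split indices 0..n_samples-1 into k disjoint folds of near-equal size."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # assumption: shuffle before splitting
    # Fold i takes every k-th shuffled index, starting at position i.
    return [indices[i::k] for i in range(k)]

def train_test_splits(n_samples, k, seed=0):
    """Yield (train, test) index lists; each fold serves as the test set once."""
    folds = k_fold_indices(n_samples, k, seed)
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

splits = list(train_test_splits(n_samples=10, k=5))
```

A model would be fitted on each `train` list and scored on the matching `test` list, and the k scores would then be averaged.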

60 relations: Accuracy and precision, Bias (statistics), Binary classification, Binomial coefficient, Biometrika, Boosting (machine learning), Bootstrap aggregating, Bootstrapping (statistics), Cancer, Closed-form expression, Complement (set theory), Data, Data mining, Dichotomy, Drug, Euclidean vector, Expected value, Feature (machine learning), Feature selection, Forward chaining, Gene expression, Hyperplane, Independence (probability theory), Jackknife resampling, Journal of the American Statistical Association, K-nearest neighbors algorithm, Kernel regression, Least squares, Linear regression, Logistic regression, Mathematical optimization, Mean squared error, Median absolute deviation, Medical diagnosis, Model selection, Monte Carlo method, Optical character recognition, Overfitting, Parameter, Partition of a set, Positive and negative predictive values, Predictive modelling, PRESS statistic, Protein, Real number, Regression validation, Resampling (statistics), Root-mean-square deviation, Sample (statistics), Scientific Reports, Sherman–Morrison formula, Stability (learning theory), Statistical model, Statistical population, Statistics, Stock market prediction, Support vector machine, Training, test, and validation sets, Validity (statistics), Variance.

Accuracy and precision

Precision is a description of random errors, a measure of statistical variability.

Bias (statistics)

Statistical bias is a feature of a statistical technique or of its results whereby the expected value of the results differs from the true underlying quantitative parameter being estimated.

Binary classification

Binary or binomial classification is the task of classifying the elements of a given set into two groups (predicting which group each one belongs to) on the basis of a classification rule.

Binomial coefficient

In mathematics, any of the positive integers that occurs as a coefficient in the binomial theorem is a binomial coefficient.
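
A small sketch of how a binomial coefficient can be computed. The helper name `binomial` is hypothetical; `math.comb` is the standard-library equivalent (Python 3.8+). One place the quantity shows up in cross-validation is leave-p-out, where the number of distinct ways to hold out p of n observations is the binomial coefficient "n choose p".

```python
import math

def binomial(n, k):
    """Number of ways to choose k items from n, via the factorial formula."""
    if not 0 <= k <= n:
        return 0
    return math.factorial(n) // (math.factorial(k) * math.factorial(n - k))

# Holding out 2 of 5 observations can be done in 10 distinct ways.
n_splits = binomial(5, 2)
```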

Biometrika

Biometrika is a peer-reviewed scientific journal published by Oxford University Press for the Biometrika Trust.

Boosting (machine learning)

Boosting is a machine learning ensemble meta-algorithm used primarily to reduce bias, and also variance, in supervised learning; the term also covers a family of machine learning algorithms that convert weak learners into strong ones.

Bootstrap aggregating

Bootstrap aggregating, also called bagging, is a machine learning ensemble meta-algorithm designed to improve the stability and accuracy of machine learning algorithms used in statistical classification and regression.

Bootstrapping (statistics)

In statistics, bootstrapping is any test or metric that relies on random sampling with replacement.
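
To make "random sampling with replacement" concrete, the sketch below estimates the standard error of a statistic by resampling. The function name `bootstrap_standard_error`, the default of 1000 resamples, the fixed seed, and the sample values are all illustrative assumptions.

```python
import random
import statistics

def bootstrap_standard_error(data, statistic=statistics.mean,
                             n_resamples=1000, seed=0):
    """Standard deviation of a statistic over resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(n_resamples):
        # random.choices samples WITH replacement, so values can repeat.
        resample = rng.choices(data, k=n)
        estimates.append(statistic(resample))
    return statistics.stdev(estimates)

sample = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7]  # made-up values
se = bootstrap_standard_error(sample)
```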

Cancer

Cancer is a group of diseases involving abnormal cell growth with the potential to invade or spread to other parts of the body.

Closed-form expression

In mathematics, a closed-form expression is a mathematical expression that can be evaluated in a finite number of operations.

Complement (set theory)

In set theory, the complement of a set A refers to elements not in A.

Data

Data is a set of values of qualitative or quantitative variables.

Data mining

Data mining is the process of discovering patterns in large data sets involving methods at the intersection of machine learning, statistics, and database systems.

Dichotomy

A dichotomy is a partition of a whole (or a set) into two parts (subsets).

Drug

A drug is any substance (other than food that provides nutritional support) that, when inhaled, injected, smoked, consumed, absorbed via a patch on the skin, or dissolved under the tongue causes a temporary physiological (and often psychological) change in the body.

Euclidean vector

In mathematics, physics, and engineering, a Euclidean vector (sometimes called a geometric or spatial vector, or—as here—simply a vector) is a geometric object that has magnitude (or length) and direction.

Expected value

In probability theory, the expected value of a random variable, intuitively, is the long-run average value of repetitions of the experiment it represents.

Feature (machine learning)

In machine learning and pattern recognition, a feature is an individual measurable property or characteristic of a phenomenon being observed.

Feature selection

In machine learning and statistics, feature selection, also known as variable selection, attribute selection or variable subset selection, is the process of selecting a subset of relevant features (variables, predictors) for use in model construction.

Forward chaining

Forward chaining (or forward reasoning) is one of the two main methods of reasoning when using an inference engine and can be described logically as repeated application of modus ponens.

Gene expression

Gene expression is the process by which information from a gene is used in the synthesis of a functional gene product.

Hyperplane

In geometry, a hyperplane is a subspace whose dimension is one less than that of its ambient space.

Independence (probability theory)

In probability theory, two events are independent, statistically independent, or stochastically independent if the occurrence of one does not affect the probability of occurrence of the other.

Jackknife resampling

In statistics, the jackknife is a resampling technique especially useful for variance and bias estimation.
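
A minimal sketch of jackknife variance estimation: the statistic is recomputed with each observation left out in turn, and the spread of those replicates is rescaled. The function name `jackknife_variance` and the sample values are assumptions for illustration.

```python
import statistics

def jackknife_variance(data, statistic=statistics.mean):
    """Jackknife estimate of the variance of `statistic` on `data` (a list)."""
    n = len(data)
    # Leave-one-out replicates: the statistic with observation i removed.
    replicates = [statistic(data[:i] + data[i + 1:]) for i in range(n)]
    center = sum(replicates) / n
    return (n - 1) / n * sum((r - center) ** 2 for r in replicates)

sample = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7]  # made-up values
var_of_mean = jackknife_variance(sample)
```

For the sample mean, this estimate coincides with the usual sample variance divided by n, which gives a quick sanity check.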

Journal of the American Statistical Association

The Journal of the American Statistical Association (JASA) is the primary journal published by the American Statistical Association, the main professional body for statisticians in the United States.

K-nearest neighbors algorithm

In pattern recognition, the k-nearest neighbors algorithm (k-NN) is a non-parametric method used for classification and regression.
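
The classification case can be sketched as a majority vote among the k closest training points. The function name `knn_predict`, the Euclidean distance, the default k=3, and the toy data are assumptions for this example; `math.dist` requires Python 3.8+.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    # Pair each training label with its Euclidean distance to the query.
    distances = sorted(
        ((math.dist(point, query), label)
         for point, label in zip(train_points, train_labels)),
        key=lambda pair: pair[0],
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]
prediction = knn_predict(points, labels, query=(0.5, 0.5))
```

The value of k is a typical quantity to tune with cross-validation.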

Kernel regression

Kernel regression is a non-parametric technique in statistics to estimate the conditional expectation of a random variable.

Least squares

The method of least squares is a standard approach in regression analysis to approximate the solution of overdetermined systems, i.e., sets of equations in which there are more equations than unknowns.

Linear regression

In statistics, linear regression is a linear approach to modelling the relationship between a scalar response (or dependent variable) and one or more explanatory variables (or independent variables).

Logistic regression

In statistics, the logistic model (or logit model) is a statistical model that is usually taken to apply to a binary dependent variable.

Mathematical optimization

In mathematics, computer science and operations research, mathematical optimization or mathematical programming, alternatively spelled optimisation, is the selection of a best element (with regard to some criterion) from some set of available alternatives.

Mean squared error

In statistics, the mean squared error (MSE) or mean squared deviation (MSD) of an estimator (of a procedure for estimating an unobserved quantity) measures the average of the squares of the errors—that is, the average squared difference between the estimated values and what is estimated.
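
The sketch below computes the empirical counterpart of this quantity for a set of predictions, which is how it is typically used as a cross-validation score. The function name `mean_squared_error` and the example numbers are assumptions; the definition above concerns an estimator, whereas this code averages squared errors over observed data.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Squared errors are 0, 0 and 4, so the mean is 4/3.
mse = mean_squared_error([1, 2, 3], [1, 2, 5])
```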

Median absolute deviation

In statistics, the median absolute deviation (MAD) is a robust measure of the variability of a univariate sample of quantitative data.
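
A short sketch of the computation: take the median of the data, then the median of the absolute deviations from it. The function name `median_absolute_deviation` is hypothetical, and no consistency scaling constant is applied.

```python
import statistics

def median_absolute_deviation(data):
    """Median of the absolute deviations from the data's median."""
    center = statistics.median(data)
    return statistics.median(abs(x - center) for x in data)

# Median is 2; absolute deviations are 1, 1, 0, 0, 2, 4, 7; their median is 1.
mad = median_absolute_deviation([1, 1, 2, 2, 4, 6, 9])
```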

Medical diagnosis

Medical diagnosis (abbreviated Dx or DS) is the process of determining which disease or condition explains a person's symptoms and signs.

Model selection

Model selection is the task of selecting a statistical model from a set of candidate models, given data.

Monte Carlo method

Monte Carlo methods (or Monte Carlo experiments) are a broad class of computational algorithms that rely on repeated random sampling to obtain numerical results.

Optical character recognition

Optical character recognition (also optical character reader, OCR) is the mechanical or electronic conversion of images of typed, handwritten or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a scene-photo (for example the text on signs and billboards in a landscape photo) or from subtitle text superimposed on an image (for example from a television broadcast).

Overfitting

In statistics, overfitting is "the production of an analysis that corresponds too closely or exactly to a particular set of data, and may therefore fail to fit additional data or predict future observations reliably".

Parameter

A parameter (from the Ancient Greek παρά, para: "beside", "subsidiary"; and μέτρον, metron: "measure"), generally, is any characteristic that can help in defining or classifying a particular system (meaning an event, project, object, situation, etc.). That is, a parameter is an element of a system that is useful, or critical, when identifying the system, or when evaluating its performance, status, condition, etc.

Partition of a set

In mathematics, a partition of a set is a grouping of the set's elements into non-empty subsets, in such a way that every element is included in one and only one of the subsets.
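
The definition translates directly into a check, sketched below with a hypothetical helper `is_partition`. The folds used in k-fold cross-validation are one example of such a partition of the sample indices.

```python
def is_partition(subsets, universe):
    """True if `subsets` are non-empty, pairwise disjoint, and cover `universe`."""
    subsets = [set(s) for s in subsets]
    if any(not s for s in subsets):
        return False  # empty subsets are not allowed
    seen = set()
    for s in subsets:
        if seen & s:
            return False  # an element appears in more than one subset
        seen |= s
    return seen == set(universe)

folds = [[0, 3], [1, 4], [2, 5]]
ok = is_partition(folds, range(6))
```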

Positive and negative predictive values

The positive and negative predictive values (PPV and NPV respectively) are the proportions of positive and negative results in statistics and diagnostic tests that are true positive and true negative results, respectively.
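
In terms of confusion-matrix counts, PPV is the share of positive results that are true positives and NPV is the share of negative results that are true negatives. The sketch below assumes a hypothetical function `predictive_values` and made-up counts.

```python
def predictive_values(tp, fp, fn, tn):
    """Return (PPV, NPV) from true/false positive and negative counts."""
    if tp + fp == 0 or tn + fn == 0:
        raise ValueError("need at least one positive and one negative result")
    ppv = tp / (tp + fp)  # true positives among all positive results
    npv = tn / (tn + fn)  # true negatives among all negative results
    return ppv, npv

# Made-up screening counts for illustration only.
ppv, npv = predictive_values(tp=20, fp=180, fn=10, tn=1820)
```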

Predictive modelling

Predictive modelling uses statistics to predict outcomes.

PRESS statistic

In statistics, the predicted residual error sum of squares (PRESS) statistic is a form of cross-validation used in regression analysis to provide a summary measure of the fit of a model to a sample of observations that were not themselves used to estimate the model.
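
A minimal sketch for simple (one-predictor) linear regression: each observation is left out in turn, the line is refitted on the rest, and the squared prediction error for the held-out point is accumulated. The helper names `fit_line` and `press` and the data are assumptions; the brute-force refit is used here for clarity rather than efficiency.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def press(xs, ys):
    """Sum of squared leave-one-out prediction errors."""
    total = 0.0
    for i in range(len(xs)):
        # Refit without observation i, then predict it.
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        total += (ys[i] - (slope * xs[i] + intercept)) ** 2
    return total

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 2.7, 5.2, 6.9, 9.1]  # made-up values
press_value = press(xs, ys)
```

Because each point is predicted by a model that never saw it, PRESS is never smaller than the ordinary residual sum of squares of the full fit.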

Protein

Proteins are large biomolecules, or macromolecules, consisting of one or more long chains of amino acid residues.

Real number

In mathematics, a real number is a value of a continuous quantity that can represent a distance along a line.

Regression validation

In statistics, regression validation is the process of deciding whether the numerical results quantifying hypothesized relationships between variables, obtained from regression analysis, are acceptable as descriptions of the data.

Resampling (statistics)

In statistics, resampling is any of a variety of methods for doing one of the following.

Root-mean-square deviation

The root-mean-square deviation (RMSD) or root-mean-square error (RMSE) (or sometimes root-mean-squared error) is a frequently used measure of the differences between values (sample or population values) predicted by a model or an estimator and the values observed.
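
As a sketch, the measure is the square root of the average squared difference between predicted and observed values. The function name `root_mean_square_deviation` and the example numbers are assumptions.

```python
import math

def root_mean_square_deviation(observed, predicted):
    """Square root of the mean squared difference between two sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Squared differences are 9 and 16, so the result is sqrt(12.5).
rmsd = root_mean_square_deviation([0, 0], [3, 4])
```

Unlike the mean squared error, the result is in the same units as the data.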

Sample (statistics)

In statistics and quantitative research methodology, a data sample is a set of data collected and/or selected from a statistical population by a defined procedure.

Scientific Reports

Scientific Reports is an online open access scientific mega journal published by the Nature Publishing Group, covering all areas of the natural sciences.

Sherman–Morrison formula

In mathematics, in particular linear algebra, the Sherman–Morrison formula, named after Jack Sherman and Winifred J. Morrison, computes the inverse of the sum of an invertible matrix A and the outer product, u v^T, of vectors u and v. The Sherman–Morrison formula is a special case of the Woodbury formula.
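
Written out with the symbols used above (A an invertible n-by-n matrix, u and v column vectors), the formula reads as follows. It holds whenever the scalar in the denominator is nonzero, which is also the condition for A + u v^T to be invertible.

```latex
\left(A + u v^{\mathsf{T}}\right)^{-1}
  = A^{-1} - \frac{A^{-1} u \, v^{\mathsf{T}} A^{-1}}{1 + v^{\mathsf{T}} A^{-1} u},
\qquad 1 + v^{\mathsf{T}} A^{-1} u \neq 0 .
```

Rank-one updates of this kind are one way to refit a least-squares model after removing a single observation without recomputing the inverse from scratch.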

Stability (learning theory)

Stability, also known as algorithmic stability, is a notion in computational learning theory of how a machine learning algorithm is perturbed by small changes to its inputs.

Statistical model

A statistical model is a mathematical model that embodies a set of statistical assumptions concerning the generation of some sample data and similar data from a larger population.

Statistical population

In statistics, a population is a set of similar items or events which is of interest for some question or experiment.

Statistics

Statistics is a branch of mathematics dealing with the collection, analysis, interpretation, presentation, and organization of data.

Stock market prediction

Stock market prediction is the act of trying to determine the future value of a company stock or other financial instrument traded on an exchange.

Support vector machine

In machine learning, support vector machines (SVMs, also support vector networks) are supervised learning models with associated learning algorithms that analyze data used for classification and regression analysis.

Training, test, and validation sets

In machine learning, the study and construction of algorithms that can learn from and make predictions on data is a common task.

Validity (statistics)

Validity is the extent to which a concept, conclusion or measurement is well-founded and likely corresponds accurately to the real world based on probability.

Variance

In probability theory and statistics, variance is the expectation of the squared deviation of a random variable from its mean.
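
In symbols, for a random variable X with mean \mu = E[X], the definition and its standard expanded form are shown below; the symbol names are chosen here for illustration.

```latex
\operatorname{Var}(X) = \operatorname{E}\!\left[(X - \mu)^{2}\right]
                      = \operatorname{E}\!\left[X^{2}\right] - \mu^{2},
\qquad \mu = \operatorname{E}[X].
```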

Redirects here:

Hold-out cross-validation, LOOCV, Out of sample testing, Out-of-sample test, Root-mean-square error of cross-validation, Rotation estimation.

References

[1] https://en.wikipedia.org/wiki/Cross-validation_(statistics)
