# Statistics

Statistics is a branch of mathematics dealing with the collection, analysis, interpretation, presentation, and organization of data. [1]

## A. W. F. Edwards

Anthony William Fairbank Edwards, FRS (born 1935) is a British statistician, geneticist, and evolutionary biologist.

## Abundance estimation

Abundance estimation comprises all statistical methods for estimating the number of individuals in a population.

## Actuarial science

Actuarial science is the discipline that applies mathematical and statistical methods to assess risk in insurance, finance and other industries and professions.

## Adrien-Marie Legendre

Adrien-Marie Legendre (18 September 1752 – 10 January 1833) was a French mathematician.

## Algorithm

In mathematics and computer science, an algorithm is an unambiguous specification of how to solve a class of problems.

## Alternative hypothesis

In statistical hypothesis testing, the alternative hypothesis (or maintained hypothesis or research hypothesis) and the null hypothesis are the two rival hypotheses which are compared by a statistical hypothesis test.

## Analysis of variance

Analysis of variance (ANOVA) is a collection of statistical models and their associated estimation procedures (such as the "variation" among and between groups) used to analyze the differences among group means in a sample.

## Ancillary statistic

In statistics, an ancillary statistic is a statistic whose sampling distribution does not depend on the parameters of the model.

## Applied information economics

Applied information economics (AIE) is a decision analysis method developed by Douglas W. Hubbard and partially described in his book How to Measure Anything: Finding the Value of Intangibles in Business (2007; 2nd ed. 2010; 3rd ed. 2014).

## Arthur Lyon Bowley

Sir Arthur Lyon Bowley (Bristol, 6 November 1869 – Surrey, 21 January 1957) was an English statistician and economist who worked on economic statistics and pioneered the use of sampling techniques in social surveys.

## Assembly line

An assembly line is a manufacturing process (often called a progressive assembly) in which parts (usually interchangeable parts) are added as the semi-finished assembly moves from workstation to workstation where the parts are added in sequence until the final assembly is produced.

## Astrostatistics

Astrostatistics is a discipline which spans astrophysics, statistical analysis and data mining.

## Average treatment effect

The average treatment effect (ATE) is a measure used to compare treatments (or interventions) in randomized experiments, evaluation of policy interventions, and medical trials.

## Baseball statistics

Baseball statistics play an important role in evaluating a player's and/or team's progress.

## Bayesian inference

Bayesian inference is a method of statistical inference in which Bayes' theorem is used to update the probability for a hypothesis as more evidence or information becomes available.

## Bayesian probability

Bayesian probability is an interpretation of the concept of probability, in which, instead of frequency or propensity of some phenomenon, probability is interpreted as reasonable expectation representing a state of knowledge or as quantification of a personal belief.

## Bayesian statistics

Bayesian statistics, named for Thomas Bayes (1701–1761), is a theory in the field of statistics in which the evidence about the true state of the world is expressed in terms of degrees of belief known as Bayesian probabilities.

## Bias

Bias is disproportionate weight in favour of or against one thing, person, or group compared with another, usually in a way considered to be unfair.

## Bias (statistics)

Statistical bias is a feature of a statistical technique or of its results whereby the expected value of the results differs from the true underlying quantitative parameter being estimated.

## Bias of an estimator

In statistics, the bias (or bias function) of an estimator is the difference between this estimator's expected value and the true value of the parameter being estimated.
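
As a rough illustration, the bias of an estimator can be approximated by simulation. The sketch below is only an example under stated assumptions: the helper names `variance_estimate` and `simulated_bias` are hypothetical, the data are drawn from a standard normal distribution (true variance 1), and the sample size and trial count are arbitrary. It compares the variance estimator that divides by n with the one that divides by n − 1.

```python
import random

def variance_estimate(sample, ddof):
    """Sum of squared deviations from the sample mean, divided by (n - ddof)."""
    n = len(sample)
    mean = sum(sample) / n
    return sum((x - mean) ** 2 for x in sample) / (n - ddof)

def simulated_bias(ddof, n=5, trials=20000, seed=0):
    """Approximate E[estimate] - true value for N(0, 1) data (true variance is 1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        total += variance_estimate(sample, ddof)
    return total / trials - 1.0

bias_n = simulated_bias(ddof=0)          # divide by n: theoretical bias is -1/n = -0.2
bias_n_minus_1 = simulated_bias(ddof=1)  # divide by n - 1: theoretical bias is 0
```

With these settings the first value should land near −0.2 and the second near 0, up to simulation noise.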

## Big data

Big data refers to data sets that are so large and complex that traditional data-processing application software is inadequate to deal with them.

## Biological network

A biological network is any network that applies to biological systems.

## Biology

Biology is the natural science that studies life and living organisms, including their physical structure, chemical composition, function, development and evolution.

## Biometrics (journal)

Biometrics is a journal that publishes articles on the application of statistics and mathematics to the biological sciences.

## Biometrika

Biometrika is a peer-reviewed scientific journal published by Oxford University Press for the Biometrika Trust.

## Biostatistics

Biostatistics is the application of statistics to a wide range of topics in biology.

## Blaise Pascal

Blaise Pascal (19 June 1623 – 19 August 1662) was a French mathematician, physicist, inventor, writer and Catholic theologian.

## Blinded experiment

A blind or blinded experiment is an experiment in which information about the test is masked (kept) from the participant, to reduce or eliminate bias, until after a trial outcome is known.

## Blocking (statistics)

In the statistical theory of the design of experiments, blocking is the arranging of experimental units in groups (blocks) that are similar to one another.

## Boolean data type

In computer science, the Boolean data type is a data type that has one of two possible values (usually denoted true and false), intended to represent the two truth values of logic and Boolean algebra.

## Bootstrapping (statistics)

In statistics, bootstrapping is any test or metric that relies on random sampling with replacement.
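
One common use of resampling with replacement is a percentile interval for a sample mean. The following is a minimal sketch, not a definitive implementation: the function name `bootstrap_means`, the made-up observations, and the choice of 2000 resamples are all assumptions made for illustration.

```python
import random
import statistics

def bootstrap_means(data, n_resamples=2000, seed=0):
    """Return the means of resamples drawn with replacement from data."""
    rng = random.Random(seed)
    n = len(data)
    return [statistics.fmean(rng.choices(data, k=n)) for _ in range(n_resamples)]

data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7]  # made-up observations
means = sorted(bootstrap_means(data))

# Percentile interval: the middle 95% of the resampled means.
lower = means[int(0.025 * len(means))]
upper = means[int(0.975 * len(means)) - 1]
```

Each resample has the same size as the original data, so the spread of `means` reflects the sampling variability of the mean at that sample size.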

## Business statistics

"Business statistics is the science of good decision making in the face of uncertainty and is used in many disciplines such as financial analysis, econometrics, auditing, production and operations including services improvement and marketing research".

## Calculus

Calculus (from Latin calculus, literally 'small pebble', used for counting and calculations, as on an abacus), is the mathematical study of continuous change, in the same way that geometry is the study of shape and algebra is the study of generalizations of arithmetic operations.

## Case-control study

A case-control study is a type of observational study in which two existing groups differing in outcome are identified and compared on the basis of some supposed causal attribute.

## Categorical variable

In statistics, a categorical variable is a variable that can take on one of a limited, and usually fixed number of possible values, assigning each individual or other unit of observation to a particular group or nominal category on the basis of some qualitative property.

## Causality

Causality (also referred to as causation, or cause and effect) is what connects one process (the cause) with another process or state (the effect), where the first is partly responsible for the second, and the second is partly dependent on the first.

## Celsius

The Celsius scale, previously known as the centigrade scale, is a temperature scale used by the International System of Units (SI).

## Censoring (statistics)

In statistics, engineering, economics, and medical research, censoring is a condition in which the value of a measurement or observation is only partially known.

## Census

A census is the procedure of systematically acquiring and recording information about the members of a given population.

## Central tendency

In statistics, a central tendency (or measure of central tendency) is a central or typical value for a probability distribution.

## Chaos theory

Chaos theory is a branch of mathematics focusing on the behavior of dynamical systems that are highly sensitive to initial conditions.

## Chemistry

Chemistry is the scientific discipline involved with compounds composed of atoms, i.e. elements, and molecules, i.e. combinations of atoms: their composition, structure, properties, behavior and the changes they undergo during a reaction with other compounds.

## Chemometrics

Chemometrics is the science of extracting information from chemical systems by data-driven means.

## Chi-squared test

A chi-squared test, also written as χ² test, is any statistical hypothesis test where the sampling distribution of the test statistic is a chi-squared distribution when the null hypothesis is true.
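
A familiar member of this family is Pearson's goodness-of-fit test, whose statistic sums (observed − expected)² / expected over the categories. The sketch below computes only that statistic; the function name `chi_squared_statistic` and the die-roll counts are made up for illustration.

```python
def chi_squared_statistic(observed, expected):
    """Pearson's statistic: sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [18, 22, 20, 25, 15, 20]  # made-up counts from 120 rolls of a die
expected = [20] * 6                  # counts expected if the die is fair
stat = chi_squared_statistic(observed, expected)
```

Here the statistic would be compared with a chi-squared distribution with 6 − 1 = 5 degrees of freedom, whose 5% critical value is about 11.07.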

## Cohort study

A cohort study is a particular form of longitudinal study that samples a cohort (a group of people who share a defining characteristic, typically those who experienced a common event in a selected period, such as birth or graduation), performing a cross-section at intervals through time.

## Computational biology

Computational biology involves the development and application of data-analytical and theoretical methods, mathematical modeling and computational simulation techniques to the study of biological, behavioral, and social systems.

## Computational sociology

Computational sociology is a branch of sociology that uses computationally intensive methods to analyze and model social phenomena.

## Computational statistics

Computational statistics, or statistical computing, is the interface between statistics and computer science.

## Computer

A computer is a device that can be instructed to carry out sequences of arithmetic or logical operations automatically via computer programming.

## Confidence interval

In statistics, a confidence interval (CI) is a type of interval estimate, computed from the statistics of the observed data, that might contain the true value of an unknown population parameter.

## Confounding

In statistics, a confounder (also confounding variable, confounding factor or lurking variable) is a variable that influences both the dependent variable and the independent variable, causing a spurious association.

## Conjoint analysis

Conjoint analysis is a survey-based statistical technique used in market research that helps determine how people value different attributes (feature, function, benefits) that make up an individual product or service.

## Consistency (statistics)

In statistics, consistency of procedures, such as computing confidence intervals or conducting hypothesis tests, is a desired property of their behaviour as the number of items in the data set to which they are applied increases indefinitely.

## Consistent estimator

In statistics, a consistent estimator or asymptotically consistent estimator is an estimator—a rule for computing estimates of a parameter θ0—having the property that as the number of data points used increases indefinitely, the resulting sequence of estimates converges in probability to θ0.

## Convergence of random variables

In probability theory, there exist several different notions of convergence of random variables.

## Correlation and dependence

In statistics, dependence or association is any statistical relationship, whether causal or not, between two random variables or bivariate data.

## Correlation does not imply causation

In statistics, many statistical tests calculate correlations between variables and when two variables are found to be correlated, it is tempting to assume that this shows that one variable causes the other.

## Credible interval

In Bayesian statistics, a credible interval is a range of values within which an unobserved parameter value falls with a particular subjective probability.

## Cricket statistics

Cricket is a sport that generates a large number of statistics.

## Data

Data is a set of values of qualitative or quantitative variables.

## Data mining

Data mining is the process of discovering patterns in large data sets involving methods at the intersection of machine learning, statistics, and database systems.

## Data science

Data science is an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from data in various forms, both structured and unstructured, similar to data mining.

## Data set

A data set (or dataset) is a collection of data.

## Data type

In computer science and computer programming, a data type or simply type is a classification of data which tells the compiler or interpreter how the programmer intends to use the data.

## David A. Freedman

David Amiel Freedman (5 March 1938 – 17 October 2008) was Professor of Statistics at the University of California, Berkeley.

## Decision theory

Decision theory (or the theory of choice) is the study of the reasoning underlying an agent's choices.

## Deductive reasoning

Deductive reasoning, also called deductive logic or logical deduction, is the process of reasoning from one or more statements (premises) to reach a logically certain conclusion.

## Demography

Demography (from prefix demo- from Ancient Greek δῆμος dēmos meaning "the people", and -graphy from γράφω graphō, implies "writing, description or measurement") is the statistical study of populations, especially human beings.

## Dependent and independent variables

In mathematical modeling, statistical modeling and experimental sciences, the values of dependent variables depend on the values of independent variables.

## Descriptive statistics

A descriptive statistic (in the count noun sense) is a summary statistic that quantitatively describes or summarizes features of a collection of information, while descriptive statistics in the mass noun sense is the process of using and analyzing those statistics.

## Design of experiments

The design of experiments (DOE, DOX, or experimental design) is the design of any task that aims to describe or explain the variation of information under conditions that are hypothesized to reflect the variation.

## Difference in differences

Difference in differences (DID or DD) is a statistical technique used in econometrics and quantitative research in the social sciences that attempts to mimic an experimental research design using observational study data, by studying the differential effect of a treatment on a 'treatment group' versus a 'control group' in a natural experiment.
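
In its simplest two-group, two-period form, the estimate is the change in the treatment group minus the change in the control group. The sketch below shows that arithmetic only; the function name and the four group means are hypothetical.

```python
def difference_in_differences(treat_before, treat_after, control_before, control_after):
    """Change in the treatment group minus change in the control group."""
    return (treat_after - treat_before) - (control_after - control_before)

# Made-up group means: treatment rises by 5, control rises by 2.
did = difference_in_differences(10.0, 15.0, 9.0, 11.0)
```

The control group's change stands in for what would have happened to the treatment group without the treatment, which is an assumption of the method rather than something the arithmetic can check.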

## Differentiable function

In calculus (a branch of mathematics), a differentiable function of one real variable is a function whose derivative exists at each point in its domain.

## Differential equation

A differential equation is a mathematical equation that relates some function with its derivatives.

## Digital image processing

In computer science, digital image processing is the use of computer algorithms to perform image processing on digital images.

## Econometrics

Econometrics is the application of statistical methods to economic data and is described as the branch of economics that aims to give empirical content to economic relations.

## Effect size

In statistics, an effect size is a quantitative measure of the magnitude of a phenomenon.

## Efficient estimator

In statistics, an efficient estimator is an estimator that estimates the quantity of interest in some “best possible” manner.

## Egon Pearson

Egon Sharpe Pearson, CBE FRS (11 August 1895 – 12 June 1980) was one of the three children of Karl Pearson and, like his father, a leading British statistician.

## Energy distance

Energy distance is a statistical distance between probability distributions.

## Engineering statistics

Engineering statistics combines engineering and statistics using scientific methods for analyzing data.

## Epidemiology

Epidemiology is the study and analysis of the distribution (who, when, and where) and determinants of health and disease conditions in defined populations.

## Estimating equations

In statistics, the method of estimating equations is a way of specifying how the parameters of a statistical model should be estimated.

## Estimation theory

Estimation theory is a branch of statistics that deals with estimating the values of parameters based on measured empirical data that has a random component.

## Estimator

In statistics, an estimator is a rule for calculating an estimate of a given quantity based on observed data: thus the rule (the estimator), the quantity of interest (the estimand) and its result (the estimate) are distinguished.

## Evolution

Evolution is change in the heritable characteristics of biological populations over successive generations.

## Evolutionary biology

Evolutionary biology is the subfield of biology that studies the evolutionary processes that produced the diversity of life on Earth, starting from a single common ancestor.

## Expected value

In probability theory, the expected value of a random variable, intuitively, is the long-run average value of repetitions of the experiment it represents.
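
The long-run-average intuition can be checked with a small simulation. This is a sketch under simple assumptions (a fair six-sided die, a fixed random seed, and an arbitrary number of rolls); the variable names are illustrative.

```python
import random

faces = [1, 2, 3, 4, 5, 6]

# Expected value of a fair die: probability-weighted sum of the outcomes.
expected = sum(face * (1 / 6) for face in faces)

# Long-run average over many simulated rolls.
rng = random.Random(0)
rolls = [rng.choice(faces) for _ in range(100_000)]
long_run_average = sum(rolls) / len(rolls)
```

The simulated average should sit close to the computed expected value of 3.5, and it tends to get closer as the number of rolls grows.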

## Experiment

An experiment is a procedure carried out to support, refute, or validate a hypothesis.

## Extrapolation

In mathematics, extrapolation is the process of estimating, beyond the original observation range, the value of a variable on the basis of its relationship with another variable.

## Factor analysis

Factor analysis is a statistical method used to describe variability among observed, correlated variables in terms of a potentially lower number of unobserved variables called factors.

## Fahrenheit

The Fahrenheit scale is a temperature scale based on one proposed in 1724 by Dutch-German-Polish physicist Daniel Gabriel Fahrenheit (1686–1736).

## Fat-tailed distribution

A fat-tailed distribution is a probability distribution that has the property, along with the other heavy-tailed distributions, that it exhibits large skewness or kurtosis.

## Faulty generalization

A faulty generalization is a conclusion about all or many instances of a phenomenon that has been reached on the basis of just one or just a few instances of that phenomenon.

## Fisher information

In mathematical statistics, the Fisher information (sometimes simply called information) is a way of measuring the amount of information that an observable random variable X carries about an unknown parameter θ of a distribution that models X. Formally, it is the variance of the score, or the expected value of the observed information.

## Fisher's principle

Fisher's principle is an evolutionary model that explains why the sex ratio of most species that produce offspring through sexual reproduction is approximately 1:1 between males and females.

## Fisherian runaway

Fisherian runaway or runaway selection is a sexual selection mechanism proposed by the mathematical biologist Ronald Fisher in the early 20th century, to account for the evolution of exaggerated male ornamentation by persistent, directional female choice.

## Floating-point arithmetic

In computing, floating-point arithmetic is arithmetic using formulaic representation of real numbers as an approximation so as to support a trade-off between range and precision.

## Forecasting

Forecasting is the process of making predictions of the future based on past and present data and most commonly by analysis of trends.

## Foundations of statistics

The foundations of statistics concern the epistemological debate in statistics over how one should conduct inductive inference from data.

## Fractal

In mathematics, a fractal is an abstract object used to describe and simulate naturally occurring objects.

## Francis Galton

Sir Francis Galton, FRS (16 February 1822 – 17 January 1911) was an English Victorian era statistician, progressive, polymath, sociologist, psychologist, anthropologist, eugenicist, tropical explorer, geographer, inventor, meteorologist, proto-geneticist, and psychometrician.

## Frequentist inference

Frequentist inference is a type of statistical inference that draws conclusions from sample data by emphasizing the frequency or proportion of the data.

## Generalized linear model

In statistics, the generalized linear model (GLM) is a flexible generalization of ordinary linear regression that allows for response variables that have error distribution models other than a normal distribution.

## Geographic information system

A geographic information system (GIS) is a system designed to capture, store, manipulate, analyze, manage, and present spatial or geographic data.

## Geography

Geography (from Greek γεωγραφία, geographia, literally "earth description") is a field of science devoted to the study of the lands, the features, the inhabitants, and the phenomena of Earth.

## Gerolamo Cardano

Gerolamo (or Girolamo, or Geronimo) Cardano (Jérôme Cardan; Hieronymus Cardanus; 24 September 1501 – 21 September 1576) was an Italian polymath who was active as a mathematician, physician, biologist, physicist, chemist, astrologer, astronomer, philosopher, writer, and gambler.

## Gibbs sampling

In statistics, Gibbs sampling or a Gibbs sampler is a Markov chain Monte Carlo (MCMC) algorithm for obtaining a sequence of observations which are approximated from a specified multivariate probability distribution, when direct sampling is difficult.

## Glossary of probability and statistics

Most of the terms listed in Wikipedia glossaries are already defined and explained within Wikipedia itself.

## Hawthorne effect

The Hawthorne effect (also referred to as the observer effect) is a type of reactivity in which individuals modify an aspect of their behavior in response to their awareness of being observed.

## How to Lie with Statistics

How to Lie with Statistics is a book written by Darrell Huff in 1954 presenting an introduction to statistics for the general reader.

## Iannis Xenakis

Iannis Xenakis (Greek: Γιάννης (Ιάννης) Ξενάκης; 29 May 1922 – 4 February 2001) was a Romanian-born, Greek-French composer, music theorist, architect, and engineer.

## Independent and identically distributed random variables

In probability theory and statistics, a sequence or other collection of random variables is independent and identically distributed (i.i.d. or iid or IID) if each random variable has the same probability distribution as the others and all are mutually independent.

## Index (statistics)

In statistics and research design, an index is a composite statistic – a measure of changes in a representative group of individual data points, or in other words, a compound measure that aggregates multiple indicators.

## Inductive reasoning

Inductive reasoning (as opposed to deductive reasoning or abductive reasoning) is a method of reasoning in which the premises are viewed as supplying some evidence for the truth of the conclusion.

## Instrumental variables estimation

In statistics, econometrics, epidemiology and related disciplines, the method of instrumental variables (IV) is used to estimate causal relationships when controlled experiments are not feasible or when a treatment is not successfully delivered to every unit in a randomized experiment.

## Integer

An integer (from the Latin integer, meaning "whole") is a number that can be written without a fractional component. The first literal meaning of integer in Latin is "untouched", from in ("not") plus tangere ("to touch").

## Integer (computer science)

In computer science, an integer is a datum of integral data type, a data type that represents some range of mathematical integers.

## International Statistical Institute

The International Statistical Institute (ISI) is a professional association of statisticians.

## Interpolation

In the mathematical field of numerical analysis, interpolation is a method of constructing new data points within the range of a discrete set of known data points.
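
Linear interpolation is one of the simplest such methods: a new point is placed on the straight line between its two neighbouring known points. The sketch below assumes the known x-values are strictly increasing; the function name `linear_interpolate` and the sample points are made up.

```python
def linear_interpolate(xs, ys, x):
    """Linearly interpolate at x; xs must be strictly increasing and contain x in range."""
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x lies outside the range of the known points")
    for (x0, y0), (x1, y1) in zip(zip(xs, ys), zip(xs[1:], ys[1:])):
        if x0 <= x <= x1:
            t = (x - x0) / (x1 - x0)  # fraction of the way from x0 to x1
            return y0 + t * (y1 - y0)

xs = [0.0, 1.0, 2.0]
ys = [0.0, 10.0, 40.0]
value = linear_interpolate(xs, ys, 1.5)  # halfway between (1, 10) and (2, 40)
```

The range check is deliberate: a query outside the known points would be extrapolation rather than interpolation.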

## Jackson Pollock

Jackson Pollock (January 28, 1912 – August 11, 1956) was an American painter and a major figure in the abstract expressionist movement.

## Jargon

Jargon is a type of language that is used in a particular context and may not be well understood outside that context.

## Jerzy Neyman

Jerzy Neyman (April 16, 1894 – August 5, 1981), born Jerzy Spława-Neyman, was a Polish mathematician and statistician who spent the first part of his professional career at various institutions in Warsaw, Poland and then at University College London, and the second part at the University of California, Berkeley.

## John Graunt

John Graunt (24 April 1620 – 18 April 1674) was one of the first demographers, though by profession he was a haberdasher.

## Journal of the Royal Statistical Society

The Journal of the Royal Statistical Society is a peer-reviewed scientific journal of statistics.

## Juan Caramuel y Lobkowitz

Juan Caramuel y Lobkowitz (Juan Caramuel de Lobkowitz, May 23, 1606 in Madrid — September 7 or 8, 1682 in Vigevano) was a Spanish Catholic scholastic philosopher, ecclesiastic, mathematician and writer.

## Karl Pearson

Karl Pearson HFRSE LLD (originally named Carl; 27 March 1857 – 27 April 1936) was an English mathematician and biostatistician. He has been credited with establishing the discipline of mathematical statistics. He founded the world's first university statistics department at University College London in 1911, and contributed significantly to the field of biometrics, meteorology, theories of social Darwinism and eugenics. Pearson was also a protégé and biographer of Sir Francis Galton.

## Lady tasting tea

In the design of experiments in statistics, the lady tasting tea is a randomized experiment devised by Ronald Fisher and reported in his book The Design of Experiments (1935).

## Least absolute deviations

Least absolute deviations (LAD), also known as least absolute errors (LAE), least absolute value (LAV), least absolute residual (LAR), sum of absolute deviations, or the L1 norm condition, is a statistical optimality criterion and the statistical optimization technique that relies on it.

## Least squares

The method of least squares is a standard approach in regression analysis to approximate the solution of overdetermined systems, i.e., sets of equations in which there are more equations than unknowns.
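
For a straight-line model, the least-squares solution has a well-known closed form. The sketch below fits an intercept and slope to four made-up points, which give four equations in two unknowns; the function name `fit_line` is hypothetical.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.1, 4.9, 7.0]  # made-up observations, roughly y = 1 + 2x
intercept, slope = fit_line(xs, ys)
```

No line passes through all four points exactly, so the fitted line is the one that makes the total squared vertical distance to the points as small as possible.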

## Level of measurement

Level of measurement or scale of measure is a classification that describes the nature of information within the values assigned to variables.

## Lies, damned lies, and statistics

"Lies, damned lies, and statistics" is a phrase describing the persuasive power of numbers, particularly the use of statistics to bolster weak arguments.

## Limit (mathematics)

In mathematics, a limit is the value that a function (or sequence) "approaches" as the input (or index) "approaches" some value.

## Linear algebra

Linear algebra is the branch of mathematics concerning linear equations, linear functions, and their representations through matrices and vector spaces.

## Linear discriminant analysis

Linear discriminant analysis (LDA), normal discriminant analysis (NDA), or discriminant function analysis is a generalization of Fisher's linear discriminant, a method used in statistics, pattern recognition and machine learning to find a linear combination of features that characterizes or separates two or more classes of objects or events.

## Linear model

In statistics, the term linear model is used in different ways according to the context.

## Linear regression

In statistics, linear regression is a linear approach to modelling the relationship between a scalar response (or dependent variable) and one or more explanatory variables (or independent variables).

## List of academic statistical associations

Statistics is the study of the collection, organization, analysis, and interpretation of data.

## List of important publications in statistics

This is a list of important publications in statistics, organized by field.

## List of national and international statistical services

The following is a list of national and international statistical services.

## List of statistical packages

Statistical software consists of specialized computer programs for analysis in statistics and econometrics.

## List of statisticians

This list of statisticians lists people who have made notable contributions to the theories or application of statistics, or to the related fields of probability or machine learning.

## List of university statistical consulting centers

This list of university statistical consulting centers (or centres) is a simple list of universities in which there is a specifically designated team providing statistical consultancy services.

## Longitude

Longitude is a geographic coordinate that specifies the east-west position of a point on the Earth's surface.

## Machine learning

Machine learning is a subset of artificial intelligence in the field of computer science that often uses statistical techniques to give computers the ability to "learn" (i.e., progressively improve performance on a specific task) with data, without being explicitly programmed.

## Mann–Whitney U test

In statistics, the Mann–Whitney U test (also called the Mann–Whitney–Wilcoxon (MWW), Wilcoxon rank-sum test, or Wilcoxon–Mann–Whitney test) is a nonparametric test of the null hypothesis that it is equally likely that a randomly selected value from one sample will be less than or greater than a randomly selected value from a second sample.
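
The U statistic itself can be obtained by comparing every value in one sample with every value in the other. The sketch below computes only the statistic, not a p-value; the function name `mann_whitney_u`, the convention of counting ties as one half, and the two small samples are assumptions made for illustration.

```python
def mann_whitney_u(sample_a, sample_b):
    """U for sample_a: each pair with a > b counts 1, each tie counts 0.5."""
    u = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u

sample_a = [1, 4, 6]
sample_b = [2, 3, 5]
u_a = mann_whitney_u(sample_a, sample_b)
u_b = mann_whitney_u(sample_b, sample_a)
```

The two values always add up to the number of pairs, here 3 × 3 = 9, so a value of U near half that total is what the null hypothesis would lead one to expect.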

## Markov chain

A Markov chain is "a stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event".

## Mathematical analysis

Mathematical analysis is the branch of mathematics dealing with limits and related theories, such as differentiation, integration, measure, infinite series, and analytic functions.

## Mathematics

Mathematics (from Greek μάθημα máthēma, "knowledge, study, learning") is the study of such topics as quantity, structure, space, and change.

## Maximum likelihood estimation

In statistics, maximum likelihood estimation (MLE) is a method of estimating the parameters of a statistical model, given observations.

## Mean

In mathematics, mean has several different definitions depending on the context.

## Mean squared error

In statistics, the mean squared error (MSE) or mean squared deviation (MSD) of an estimator (of a procedure for estimating an unobserved quantity) measures the average of the squares of the errors—that is, the average squared difference between the estimated values and what is estimated.
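
As a small worked example of that average, the sketch below computes the MSE of a handful of made-up estimates around a known true value; the function name `mean_squared_error` and the numbers are hypothetical.

```python
def mean_squared_error(estimates, true_value):
    """Average of the squared differences between estimates and the true value."""
    return sum((e - true_value) ** 2 for e in estimates) / len(estimates)

estimates = [9.0, 11.0, 10.5, 9.5]  # made-up estimates of a quantity equal to 10
mse = mean_squared_error(estimates, 10.0)
```

The squared errors are 1, 1, 0.25 and 0.25, so larger misses dominate the average.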

## Medical statistics

Medical statistics deals with applications of statistics to medicine and the health sciences, including epidemiology, public health, forensic medicine, and clinical research.

## Medieval Roman law

Medieval Roman law is the continuation and development of ancient Roman law that developed in the European Late Middle Ages.

## Method of moments (statistics)

In statistics, the method of moments is a method of estimation of population parameters.

## Methodological advisor

A methodological advisor or consultant provides methodological and statistical advice and guidance to clients interested in making decisions regarding the design of studies, the collection and analysis of data, and the presentation and dissemination of research findings.

## Minimum-variance unbiased estimator

In statistics, a minimum-variance unbiased estimator (MVUE) or uniformly minimum-variance unbiased estimator (UMVUE) is an unbiased estimator that has lower variance than any other unbiased estimator for all possible values of the parameter.

## Missing data

In statistics, missing data, or missing values, occur when no data value is stored for the variable in an observation.

## Missouri State University

Missouri State University (MSU or MO State), formerly Southwest Missouri State University, is a public university located in Springfield, Missouri, United States.

## Misuse of statistics

Statistics are supposed to make something easier to understand but when used in a misleading fashion can trick the casual observer into believing something other than what the data shows.

## Multilevel model

Multilevel models (also known as hierarchical linear models, nested data models, mixed models, random coefficient, random-effects models, random parameter models, or split-plot designs) are statistical models of parameters that vary at more than one level.

## Multivariate analysis of variance

In statistics, multivariate analysis of variance (MANOVA) is a procedure for comparing multivariate sample means.

## Multivariate random variable

In probability and statistics, a multivariate random variable or random vector is a list of mathematical variables each of whose value is unknown, either because the value has not yet occurred or because there is imperfect knowledge of its value.

## Multivariate statistics

Multivariate statistics is a subdivision of statistics encompassing the simultaneous observation and analysis of more than one outcome variable.

## Natural experiment

A natural experiment is an empirical study in which individuals (or clusters of individuals) exposed to the experimental and control conditions are determined by nature or by other factors outside the control of the investigators, but the process governing the exposures arguably resembles random assignment.

## Nature

Nature, in the broadest sense, is the natural, physical, or material world or universe.

## Neural network

The term neural network was traditionally used to refer to a network or circuit of neurons.

## Non-linear least squares

Non-linear least squares is the form of least squares analysis used to fit a set of m observations with a model that is non-linear in n unknown parameters (m > n).

## Nonlinear regression

In statistics, nonlinear regression is a form of regression analysis in which observational data are modeled by a function which is a nonlinear combination of the model parameters and depends on one or more independent variables.

## Notation in probability and statistics

Probability theory and statistics have some commonly used conventions, in addition to standard mathematical notation and mathematical symbols.

## Null hypothesis

In inferential statistics, the term "null hypothesis" is a general statement or default position that there is no relationship between two measured phenomena, or no association among groups.

## Number theory

Number theory, or in older usage arithmetic, is a branch of pure mathematics devoted primarily to the study of the integers.

## Observational error

Observational error (or measurement error) is the difference between a measured value of a quantity and its true value.

## Observational study

In fields such as epidemiology, social sciences, psychology and statistics, an observational study draws inferences from a sample to a population where the independent variable is not under the control of the researcher because of ethical concerns or logistical constraints.

## Official statistics

Official statistics are statistics published by government agencies or other public bodies such as international organizations as a public good.

## Ordinary least squares

In statistics, ordinary least squares (OLS) or linear least squares is a method for estimating the unknown parameters in a linear regression model.
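For the simplest case of one predictor, the OLS estimates have a closed form. The sketch below assumes a model y = intercept + slope * x and uses an illustrative function name, `ols_fit`. It is not a general multi-variable solver.

```python
def ols_fit(xs, ys):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 points")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Points lying exactly on y = 1 + 2x.
print(ols_fit([0, 1, 2, 3], [1, 3, 5, 7]))
```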

## P-value

In statistical hypothesis testing, the p-value or probability value or asymptotic significance is the probability for a given statistical model that, when the null hypothesis is true, the statistical summary (such as the sample mean difference between two compared groups) would be the same as or of greater magnitude than the actual observed results.
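A minimal sketch of one concrete case, assuming an exact one-sided binomial test: the p-value is the probability, under the null success probability, of seeing at least as many successes as were observed. The function name and the coin-toss numbers are illustrative.

```python
from math import comb


def binomial_p_value_upper(successes, trials, p_null=0.5):
    """P(X >= successes) for X ~ Binomial(trials, p_null)."""
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    return sum(
        comb(trials, k) * p_null ** k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Probability of 9 or more heads in 10 tosses of a fair coin.
print(binomial_p_value_upper(9, 10))
```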

## Pattern recognition

Pattern recognition is a branch of machine learning that focuses on the recognition of patterns and regularities in data, although it is in some cases considered to be nearly synonymous with machine learning.

## Pearson correlation coefficient

In statistics, the Pearson correlation coefficient (PCC), also referred to as Pearson's r, the Pearson product-moment correlation coefficient (PPMCC) or the bivariate correlation, is a measure of the linear correlation between two variables X and Y. It has a value between +1 and −1, where 1 is total positive linear correlation, 0 is no linear correlation, and −1 is total negative linear correlation.
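The coefficient can be sketched as the covariance of X and Y divided by the product of their standard deviations. The function name `pearson_r` is illustrative, and the code assumes two equally long numeric sequences that are not constant.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 points")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


print(pearson_r([1, 2, 3], [2, 4, 6]))   # perfectly increasing linear relation
print(pearson_r([1, 2, 3], [6, 4, 2]))   # perfectly decreasing linear relation
```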

## Pearson distribution

The Pearson distribution is a family of continuous probability distributions.

## Performance art

Performance art is a performance presented to an audience within a fine art context, traditionally interdisciplinary.

## Pierre de Fermat

Pierre de Fermat (between 31 October and 6 December 1607 – 12 January 1665) was a French lawyer at the Parlement of Toulouse, France, and a mathematician who is given credit for early developments that led to infinitesimal calculus, including his technique of adequality.

## Pivotal quantity

In statistics, a pivotal quantity or pivot is a function of observations and unobservable parameters such that the function's probability distribution does not depend on the unknown parameters (including nuisance parameters).

## Political science

Political science is a social science which deals with systems of governance, and the analysis of political activities, political thoughts, and political behavior.

## Polynomial least squares

In mathematical statistics, polynomial least squares comprises a broad range of statistical methods for estimating an underlying polynomial that describes observations.

## Power (statistics)

The power of a binary hypothesis test is the probability that the test correctly rejects the null hypothesis (H0) when a specific alternative hypothesis (H1) is true.
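As a hedged illustration, the sketch below computes power for one specific setting that is assumed here, not given in the entry: a one-sided z-test of H0: mu = mu0 against H1: mu = mu0 + effect, with known standard deviation `sigma` and sample size `n`. All parameter names are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(effect, sigma, n, alpha=0.05):
    """Probability that a one-sided z-test rejects H0 when the true shift is `effect`."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)   # rejection threshold under H0
    shift = effect * sqrt(n) / sigma         # mean of the z statistic under H1
    return 1 - std_normal.cdf(z_crit - shift)


# Power grows with the sample size for a fixed effect.
print(z_test_power(0.5, 1.0, 10))
print(z_test_power(0.5, 1.0, 25))
```

With `effect=0` the function returns the significance level `alpha`, since rejecting a true null hypothesis is then the only way to reject.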

## Prediction

A prediction (Latin præ-, "before," and dicere, "to say"), or forecast, is a statement about a future event.

## Prior probability

In Bayesian statistical inference, a prior probability distribution, often simply called the prior, of an uncertain quantity is the probability distribution that would express one's beliefs about this quantity before some evidence is taken into account.

## Probability distribution

In probability theory and statistics, a probability distribution is a mathematical function that provides the probabilities of occurrence of different possible outcomes in an experiment.

## Probability interpretations

The word probability has been used in a variety of ways since it was first applied to the mathematical study of games of chance.

## Probability theory

Probability theory is the branch of mathematics concerned with probability.

## Procedure (term)

A procedure is a document written to support a "policy directive".

## Process art

Process art is an artistic movement as well as a creative sentiment where the end product of art and craft, the objet d’art (work of art/found object), is not the principal focus.

## Prosecutor's fallacy

The prosecutor's fallacy is a fallacy of statistical reasoning, typically used by the prosecution to argue for the guilt of a defendant during a criminal trial.

## Protocol (science)

In the natural sciences a protocol is a predefined written procedural method in the design and implementation of experiments.

## Psychological statistics

Psychological statistics is the application of formulas, theorems, numbers and laws to psychology.

## R (programming language)

R is a programming language and free software environment for statistical computing and graphics that is supported by the R Foundation for Statistical Computing.

## Random assignment

Random assignment or random placement is an experimental technique for assigning human participants or animal subjects to different groups in an experiment (e.g., a treatment group versus a control group) using randomization, such as by a chance procedure (e.g., flipping a coin) or a random number generator.
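A minimal sketch of random assignment with a random number generator, assuming two groups of near-equal size. The function name, the participant labels, and the `seed` parameter (included only so that a run can be reproduced) are illustrative.

```python
import random


def randomly_assign(participants, seed=None):
    """Shuffle participants and split them into (treatment, control) groups."""
    rng = random.Random(seed)
    shuffled = list(participants)   # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


treatment, control = randomly_assign(["P1", "P2", "P3", "P4", "P5", "P6"], seed=1)
print(treatment, control)
```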

## Random variable

In probability and statistics, a random variable, random quantity, aleatory variable, or stochastic variable is a variable whose possible values are outcomes of a random phenomenon.

## Randomized controlled trial

A randomized controlled trial (or randomized control trial; RCT) is a type of scientific (often medical) experiment which aims to reduce bias when testing a new treatment.

## Real data type

A real data type is a data type used in a computer program to represent an approximation of a real number.

## Reduced chi-squared statistic

In statistics, the reduced chi-squared statistic is used extensively in goodness of fit testing.

## Regression analysis

In statistical modeling, regression analysis is a set of statistical processes for estimating the relationships among variables.

## Reliability engineering

Reliability engineering is a sub-discipline of systems engineering that emphasizes dependability in the lifecycle management of a product.

## Resampling (statistics)

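In statistics, resampling is any of a variety of methods for estimating the precision of sample statistics (for example by jackknifing or bootstrapping), performing permutation tests, or validating models on random subsets of the data (for example by cross-validation).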
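As one illustration of resampling, the bootstrap draws new samples with replacement from the observed data and looks at how a statistic varies across them. The sketch below estimates a standard error this way. The function name, the default of 1000 resamples, and the fixed seed are assumptions made for the example.

```python
import random
import statistics


def bootstrap_standard_error(sample, statistic=statistics.fmean,
                             n_resamples=1000, seed=0):
    """Standard deviation of `statistic` over resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(sample)
    replicates = [
        statistic(rng.choices(sample, k=n))   # one bootstrap resample
        for _ in range(n_resamples)
    ]
    return statistics.stdev(replicates)


data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8]   # hypothetical measurements
print(bootstrap_standard_error(data))
```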

## Residual sum of squares

In statistics, the residual sum of squares (RSS), also known as the sum of squared residuals (SSR) or the sum of squared errors of prediction (SSE), is the sum of the squares of residuals (deviations of predicted values from actual empirical values of data).
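The definition translates directly into code. This sketch assumes paired sequences of observed and predicted values, and the function name is illustrative.

```python
def residual_sum_of_squares(observed, predicted):
    """Sum of squared differences between observed and predicted values."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


print(residual_sum_of_squares([1, 2, 3], [1, 1, 5]))   # 0 + 1 + 4 = 5
```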

## Ronald Fisher

Sir Ronald Aylmer Fisher (17 February 1890 – 29 July 1962), who published as R. A. Fisher, was a British statistician and geneticist.

## Root-mean-square deviation

The root-mean-square deviation (RMSD) or root-mean-square error (RMSE) (or sometimes root-mean-squared error) is a frequently used measure of the differences between values (sample or population values) predicted by a model or an estimator and the values observed.
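A short sketch of the measure, assuming paired sequences of observed and predicted values: square the differences, average them, and take the square root. The function name `rmsd` is illustrative.

```python
from math import sqrt


def rmsd(observed, predicted):
    """Square root of the mean squared difference between paired values."""
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("need two non-empty sequences of equal length")
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


print(rmsd([1, 2, 3], [2, 3, 4]))   # every prediction is off by exactly 1
```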

## Row and column vectors

In linear algebra, a column vector or column matrix is an m × 1 matrix, that is, a matrix consisting of a single column of m elements. Similarly, a row vector or row matrix is a 1 × m matrix, that is, a matrix consisting of a single row of m elements. Throughout, boldface is used for the row and column vectors.
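One way to picture the two shapes is with nested lists, where each inner list is a matrix row. This representation and the helper `transpose` are assumptions of the sketch. Transposing turns a column vector into a row vector and back.

```python
def transpose(matrix):
    """Swap rows and columns of a matrix stored as a list of rows."""
    return [list(column) for column in zip(*matrix)]


column_vector = [[1], [2], [3]]        # 3 x 1 matrix: three rows, one column
row_vector = transpose(column_vector)  # 1 x 3 matrix: one row, three columns
print(row_vector)
```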

## Sabermetrics

Sabermetrics is the empirical analysis of baseball, especially baseball statistics that measure in-game activity.

## Sample (statistics)

In statistics and quantitative research methodology, a data sample is a set of data collected and/or selected from a statistical population by a defined procedure.

## Sampling (statistics)

In statistics, quality assurance, and survey methodology, sampling is the selection of a subset (a statistical sample) of individuals from within a statistical population to estimate characteristics of the whole population.

## Sampling distribution

In statistics, a sampling distribution or finite-sample distribution is the probability distribution of a given random-sample-based statistic.

## SAS (software)

SAS (previously "Statistical Analysis System") is a software suite developed by SAS Institute for advanced analytics, multivariate analyses, business intelligence, data management, and predictive analytics.

## Scatter plot

A scatter plot (also called a scatterplot, scatter graph, scatter chart, scattergram, or scatter diagram) is a type of plot or mathematical diagram using Cartesian coordinates to display values for typically two variables for a set of data.

## Scientific control

A scientific control is an experiment or observation designed to minimize the effects of variables other than the independent variable.

## Sex ratio

The sex ratio is the ratio of males to females in a population.

## Sexual selection

Sexual selection is a mode of natural selection where members of one biological sex choose mates of the other sex to mate with (intersexual selection), and compete with members of the same sex for access to members of the opposite sex (intrasexual selection).

## Social research

Social research is research conducted by social scientists following a systematic plan.

## Social science

Social science is a major category of academic disciplines, concerned with society and the relationships among individuals within a society.

## Social statistics

Social statistics is the use of statistical measurement systems to study human behavior in a social environment.

## Sociology

Sociology is the scientific study of society, patterns of social relationships, social interaction, and culture.

## Spatial analysis

Spatial analysis or spatial statistics includes any of the formal techniques which study entities using their topological, geometric, or geographic properties.

## Spearman's rank correlation coefficient

In statistics, Spearman's rank correlation coefficient or Spearman's rho, named after Charles Spearman and often denoted by the Greek letter ρ (rho) or as r_s, is a nonparametric measure of rank correlation (statistical dependence between the rankings of two variables).
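A sketch of one common way to compute the coefficient: replace each value by its rank, giving tied values the average of their positions, and then take the Pearson correlation of the two rank sequences. The helper names are illustrative.

```python
from math import sqrt


def average_ranks(values):
    """Ranks from 1 to n, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Pearson correlation of the rank sequences of xs and ys."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 points")
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        raise ValueError("rank correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# A monotonic but non-linear relation still gives a perfect rank correlation.
print(spearman_rho([1, 2, 3, 4], [1, 4, 9, 16]))
```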

## SPSS

SPSS Statistics is a software package used for interactive, or batched, statistical analysis.

## Standard deviation

In statistics, the standard deviation (SD, also represented by the Greek letter sigma σ or the Latin letter s) is a measure that is used to quantify the amount of variation or dispersion of a set of data values.
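A minimal sketch of the computation, with an illustrative function name. The `sample` flag switches between dividing by n (population form, often written σ) and by n − 1 (sample form, often written s).

```python
from math import sqrt


def standard_deviation(values, sample=False):
    """Square root of the average squared deviation from the mean."""
    n = len(values)
    if n < (2 if sample else 1):
        raise ValueError("not enough values")
    mean = sum(values) / n
    squared_deviations = sum((v - mean) ** 2 for v in values)
    denominator = n - 1 if sample else n
    return sqrt(squared_deviations / denominator)


data = [2, 4, 4, 4, 5, 5, 7, 9]
print(standard_deviation(data))               # population form
print(standard_deviation(data, sample=True))  # sample form
```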

## Standard score

In statistics, the standard score is the signed number of standard deviations by which the value of an observation or data point differs from the mean value of what is being observed or measured.
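In code, the standard score is a single subtraction and division. The function name and the example numbers below are illustrative.

```python
def standard_score(x, mean, sd):
    """Signed number of standard deviations by which x differs from the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / sd


# A value of 130 on a scale with mean 100 and standard deviation 15.
print(standard_score(130, 100, 15))
```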

## Stanley Smith Stevens

Stanley Smith Stevens (November 4, 1906 – January 18, 1973) was an American psychologist who founded Harvard's Psycho-Acoustic Laboratory, studying psychoacoustics, and he is credited with the introduction of Stevens's power law.

## Statistic

A statistic (singular) or sample statistic is a single measure of some attribute of a sample (e.g. its arithmetic mean value).

## Statistical classification

In machine learning and statistics, classification is the problem of identifying to which of a set of categories (sub-populations) a new observation belongs, on the basis of a training set of data containing observations (or instances) whose category membership is known.

## Statistical dispersion

In statistics, dispersion (also called variability, scatter, or spread) is the extent to which a distribution is stretched or squeezed.

## Statistical hypothesis testing

A statistical hypothesis, sometimes called confirmatory data analysis, is a hypothesis that is testable on the basis of observing a process that is modeled via a set of random variables.

## Statistical inference

Statistical inference is the process of using data analysis to deduce properties of an underlying probability distribution.

## Statistical literacy

Statistical literacy is the ability to understand and reason with statistics and data.

## Statistical mechanics

Statistical mechanics is one of the pillars of modern physics.

## Statistical Methods for Research Workers

Statistical Methods for Research Workers is a classic book on statistics, written by the statistician R. A. Fisher.

## Statistical model

A statistical model is a mathematical model that embodies a set of statistical assumptions concerning the generation of some sample data and similar data from a larger population.

## Statistical population

In statistics, a population is a set of similar items or events which is of interest for some question or experiment.

## Statistical process control

Statistical process control (SPC) is a method of quality control which employs statistical methods to monitor and control a process.

## Statistical significance

In statistical hypothesis testing, a result has statistical significance when it is very unlikely to have occurred given the null hypothesis.

## Statistical theory

The theory of statistics provides a basis for the whole range of techniques, in both study design and data analysis, that are used within applications of statistics.

## Statistician

A statistician is a person who works with theoretical or applied statistics.

## Stochastic

The word stochastic is an adjective in English that describes something that was randomly determined.

## Stochastic calculus

Stochastic calculus is a branch of mathematics that operates on stochastic processes.

## Structural equation modeling

Structural equation modeling (SEM) includes a diverse set of mathematical models, computer algorithms, and statistical methods that fit networks of constructs to data.

## Structured data analysis (statistics)

Structured data analysis is the statistical data analysis of structured data.

## Student's t-test

The t-test is any statistical hypothesis test in which the test statistic follows a Student's t-distribution under the null hypothesis.
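As one example of such a test statistic, the sketch below computes the one-sample t statistic, assuming a null hypothesis that the population mean equals `mu0`. It stops at the statistic itself. Turning it into a p-value requires the t-distribution with n − 1 degrees of freedom, which is not included here.

```python
from math import sqrt
import statistics


def one_sample_t_statistic(sample, mu0):
    """(sample mean - mu0) divided by the estimated standard error of the mean."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least 2 observations")
    s = statistics.stdev(sample)   # sample standard deviation (n - 1 denominator)
    if s == 0:
        raise ValueError("t statistic is undefined for a constant sample")
    return (statistics.fmean(sample) - mu0) / (s / sqrt(n))


print(one_sample_t_statistic([1, 2, 3, 4, 5], 2))
```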

## Sufficient statistic

In statistics, a statistic is sufficient with respect to a statistical model and its associated unknown parameter if "no other statistic that can be calculated from the same sample provides any additional information as to the value of the parameter".

## Survey methodology

A field of applied statistics of human research surveys, survey methodology studies the sampling of individual units from a population and associated techniques of survey data collection, such as questionnaire construction and methods for improving the number and accuracy of responses to surveys.

## Survey sampling

In statistics, survey sampling describes the process of selecting a sample of elements from a target population to conduct a survey.

## Survival analysis

Survival analysis is a branch of statistics for analyzing the expected duration of time until one or more events happen, such as death in biological organisms and failure in mechanical systems.

## Temperature

Temperature is a physical quantity expressing hot and cold.

## Test statistic

A test statistic is a statistic (a quantity derived from the sample) used in statistical hypothesis testing.

## The American Statistician

The American Statistician is a quarterly peer-reviewed scientific journal covering statistics published by Taylor & Francis on behalf of the American Statistical Association.

## The Correlation between Relatives on the Supposition of Mendelian Inheritance

"The Correlation between Relatives on the Supposition of Mendelian Inheritance" is a scientific paper by Ronald Fisher which was published in the Transactions of the Royal Society of Edinburgh in 1918 (volume 52, pages 399–433).

## The Design of Experiments

The Design of Experiments is a 1935 book by the English statistician Ronald Fisher about the design of experiments and is considered a foundational work in experimental design.

## The Genetical Theory of Natural Selection

The Genetical Theory of Natural Selection is a book by Ronald Fisher which combines Mendelian genetics with Charles Darwin's theory of natural selection, with Fisher being the first to argue that "Mendelism therefore validates Darwinism" and stating with regard to mutations that "The vast majority of large mutations are deleterious; small mutations are both far more frequent and more likely to be useful", thus refuting orthogenesis.

## Time series

A time series is a series of data points indexed (or listed or graphed) in time order.

## Treatment and control groups

In the design of experiments, treatments are applied to experimental units in the treatment group(s).

## Type I and type II errors

In statistical hypothesis testing, a type I error is the rejection of a true null hypothesis (also known as a "false positive" finding), while a type II error is failing to reject a false null hypothesis (also known as a "false negative" finding).
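The four possible outcomes of a test can be laid out as a small lookup. The function name and the string labels are illustrative.

```python
def classify_decision(null_is_true, null_rejected):
    """Label the outcome of a hypothesis test given the (unknown) truth."""
    if null_is_true and null_rejected:
        return "type I error"        # false positive
    if not null_is_true and not null_rejected:
        return "type II error"       # false negative
    return "correct decision"


for truth in (True, False):
    for rejected in (True, False):
        print(truth, rejected, classify_decision(truth, rejected))
```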

## University College London

University College London (UCL) is a public research university in London, England, and a constituent college of the federal University of London.

## Variance

In probability theory and statistics, variance is the expectation of the squared deviation of a random variable from its mean.
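For a discrete random variable the expectation becomes a probability-weighted sum. The sketch below assumes the distribution is given as parallel lists of outcomes and probabilities, and the function name is illustrative.

```python
from math import isclose


def variance(outcomes, probabilities):
    """Expected squared deviation of a discrete random variable from its mean."""
    if len(outcomes) != len(probabilities):
        raise ValueError("outcomes and probabilities must have the same length")
    if not isclose(sum(probabilities), 1.0):
        raise ValueError("probabilities must sum to 1")
    mean = sum(x * p for x, p in zip(outcomes, probabilities))
    return sum(p * (x - mean) ** 2 for x, p in zip(outcomes, probabilities))


# A fair six-sided die; the exact variance is 35/12.
print(variance([1, 2, 3, 4, 5, 6], [1 / 6] * 6))
```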

## Western Electric

Western Electric Company (WE, WECo) was an American electrical engineering and manufacturing company that served as the primary supplier to AT&T from 1881 to 1996.

## William Sealy Gosset

William Sealy Gosset (13 June 1876 – 16 October 1937) was an English statistician.

## Wolfram Mathematica

Wolfram Mathematica (usually termed Mathematica) is a modern technical computing system spanning most areas of technical computing — including neural networks, machine learning, image processing, geometry, data science, visualizations, and others.

## References
