## Adjusted mutual information

In probability theory and information theory, adjusted mutual information, a variation of mutual information, may be used for comparing clusterings.

## Affinity propagation

In statistics and data mining, affinity propagation (AP) is a clustering algorithm based on the concept of "message passing" between data points.

## Algorithm

In mathematics and computer science, an algorithm is an unambiguous specification of how to solve a class of problems.

## Animal

Animals are multicellular eukaryotic organisms that form the biological kingdom Animalia.

## Anomaly detection

In data mining, anomaly detection (also outlier detection) is the identification of items, events or observations which do not conform to an expected pattern or other items in a dataset.

## Artificial neural network

Artificial neural networks (ANNs) or connectionist systems are computing systems vaguely inspired by the biological neural networks that constitute animal brains.

## Association for Computing Machinery

The Association for Computing Machinery (ACM) is an international learned society for computing.

## Association for the Advancement of Artificial Intelligence

The Association for the Advancement of Artificial Intelligence (AAAI) is an international, nonprofit, scientific society devoted to promoting research in, and responsible use of, artificial intelligence.

## Balanced clustering

Balanced clustering is a special case of clustering where, in the strictest sense, cluster sizes are constrained to $\lfloor n/k \rfloor$ or $\lceil n/k \rceil$, where n is the number of points and k is the number of clusters.
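
As a minimal sketch of the strict size constraint, the following computes cluster sizes that each equal the floor or the ceiling of n/k and sum to n. The function name `balanced_sizes` is hypothetical and chosen only for illustration; it does not assign points to clusters.

```python
def balanced_sizes(n, k):
    """Return k cluster sizes that differ by at most one and sum to n."""
    if k <= 0:
        raise ValueError("k must be positive")
    base, extra = divmod(n, k)  # base is floor(n / k)
    # The first `extra` clusters receive one additional point, i.e. ceil(n / k).
    return [base + 1 if i < extra else base for i in range(k)]


print(balanced_sizes(10, 3))  # [4, 3, 3]
```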

## Biclustering

Biclustering, block clustering, co-clustering, or two-mode clustering is a data mining technique which allows simultaneous clustering of the rows and columns of a matrix.

## Big data

Big data refers to data sets that are so large and complex that traditional data-processing application software is inadequate to deal with them.

## Bioinformatics

Bioinformatics is an interdisciplinary field that develops methods and software tools for understanding biological data.

## Biology

Biology is the natural science that studies life and living organisms, including their physical structure, chemical composition, function, development and evolution.

## BIRCH

BIRCH (balanced iterative reducing and clustering using hierarchies) is an unsupervised data mining algorithm used to perform hierarchical clustering over particularly large data sets.

## Canopy clustering algorithm

The canopy clustering algorithm is an unsupervised pre-clustering algorithm introduced by Andrew McCallum, Kamal Nigam and Lyle Ungar in 2000.

## Centroid

In mathematics and physics, the centroid or geometric center of a plane figure is the arithmetic mean position of all the points in the shape.
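
For a finite set of points, the arithmetic mean position can be sketched as below. This is an illustration only: the helper name `centroid` is an assumption, and for a continuous plane figure the mean would be taken over the whole region rather than over a list of sample points.

```python
def centroid(points):
    """Arithmetic mean position of a non-empty list of equal-length coordinate tuples."""
    n = len(points)
    if n == 0:
        raise ValueError("at least one point is required")
    # zip(*points) groups the coordinates by axis, so each axis is averaged separately.
    return tuple(sum(axis) / n for axis in zip(*points))


# Corners of a 4-by-3 rectangle; their mean is the rectangle's center.
print(centroid([(0, 0), (4, 0), (4, 3), (0, 3)]))  # (2.0, 1.5)
```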

## Climatology

Climatology (from Greek κλίμα, klima, "place, zone"; and -λογία, -logia) or climate science is the scientific study of climate, scientifically defined as weather conditions averaged over a period of time.

## Clique (graph theory)

In the mathematical area of graph theory, a clique is a subset of vertices of an undirected graph such that every two distinct vertices in the clique are adjacent; that is, its induced subgraph is complete.

## Cluster-weighted modeling

In data mining, cluster-weighted modeling (CWM) is an algorithm-based approach to non-linear prediction of outputs (dependent variables) from inputs (independent variables) based on density estimation using a set of models (clusters) that are each notionally appropriate in a sub-region of the input space.

## Clustering high-dimensional data

Clustering high-dimensional data is the cluster analysis of data with anywhere from a few dozen to many thousands of dimensions.

## Cohen's kappa

Cohen's kappa coefficient (κ) is a statistic which measures inter-rater agreement for qualitative (categorical) items.

## Community

A community is a small or large social unit (a group of living things) that has something in common, such as norms, religion, values, or identity.

## Complete-linkage clustering

Complete-linkage clustering is one of several methods of agglomerative hierarchical clustering.

## Computer graphics

Computer graphics are pictures and films created using computers.

## Computer science

Computer science deals with the theoretical foundations of information and computation, together with practical techniques for the implementation and application of these foundations.

## Conceptual clustering

Conceptual clustering is a machine learning paradigm for unsupervised classification developed mainly during the 1980s.

## Confusion matrix

In the field of machine learning and specifically the problem of statistical classification, a confusion matrix, also known as an error matrix, is a specific table layout that allows visualization of the performance of an algorithm, typically a supervised learning one (in unsupervised learning it is usually called a matching matrix).

## Consensus clustering

Clustering is the assignment of objects into groups (called clusters) so that objects from the same cluster are more similar to each other than objects from different clusters.

## Constrained clustering

In computer science, constrained clustering is a class of semi-supervised learning algorithms.

## Consumer

A consumer is a person or organization that uses economic services or commodities.

## Correlation and dependence

In statistics, dependence or association is any statistical relationship, whether causal or not, between two random variables or bivariate data.

## Correlation clustering

Clustering is the problem of partitioning data points into groups based on their similarity.

## Curse of dimensionality

The curse of dimensionality refers to various phenomena that arise when analyzing and organizing data in high-dimensional spaces (often with hundreds or thousands of dimensions) that do not occur in low-dimensional settings such as the three-dimensional physical space of everyday experience.

## Customer

In sales, commerce and economics, a customer (sometimes known as a client, buyer, or purchaser) is the recipient of a good, service, product or idea, obtained from a seller, vendor, or supplier via a financial transaction or exchange for money or some other valuable consideration.

## Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

## Data compression

In signal processing, data compression, source coding, or bit-rate reduction involves encoding information using fewer bits than the original representation.

## Data mining

Data mining is the process of discovering patterns in large data sets involving methods at the intersection of machine learning, statistics, and database systems.

## Data stream clustering

In computer science, data stream clustering is defined as the clustering of data that arrive continuously, such as telephone records, multimedia data, and financial transactions.

## Davies–Bouldin index

The Davies–Bouldin index (DBI) (introduced by David L. Davies and Donald W. Bouldin in 1979) is a metric for evaluating clustering algorithms.

## DBSCAN

Density-based spatial clustering of applications with noise (DBSCAN) is a data clustering algorithm proposed by Martin Ester, Hans-Peter Kriegel, Jörg Sander and Xiaowei Xu in 1996.

## Dendrogram

A dendrogram (from Greek dendro "tree" and gramma "drawing") is a tree diagram frequently used to illustrate the arrangement of the clusters produced by hierarchical clustering.

## Determining the number of clusters in a data set

Determining the number of clusters in a data set, a quantity often labelled k as in the k-means algorithm, is a frequent problem in data clustering, and is a distinct issue from the process of actually solving the clustering problem.

## Deterministic algorithm

In computer science, a deterministic algorithm is an algorithm which, given a particular input, will always produce the same output, with the underlying machine always passing through the same sequence of states.

## Digital data

Digital data, in information theory and information systems, is the discrete, discontinuous representation of information or works.

## Dimensionality reduction

In statistics, machine learning, and information theory, dimensionality reduction or dimension reduction is the process of reducing the number of random variables under consideration by obtaining a set of principal variables.

## DNA annotation

DNA annotation or genome annotation is the process of identifying the locations of genes and all of the coding regions in a genome and determining what those genes do.

## DNA microarray

A DNA microarray (also commonly known as DNA chip or biochip) is a collection of microscopic DNA spots attached to a solid surface.

## Dunn index

The Dunn index (DI) (introduced by J. C. Dunn in 1974) is a metric for evaluating clustering algorithms.

## Ecology

Ecology (from οἶκος, "house", or "environment"; -λογία, "study of") is the branch of biology which studies the interactions among organisms and their environment.

## Edge detection

Edge detection includes a variety of mathematical methods that aim at identifying points in a digital image at which the image brightness changes sharply or, more formally, has discontinuities.

## Educational data mining

Educational data mining (EDM) describes a research field concerned with the application of data mining, machine learning and statistics to information generated from educational settings (e.g., universities and intelligent tutoring systems).

## Empirical distribution function

In statistics, an empirical distribution function is the distribution function associated with the empirical measure of a sample.
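
A small sketch may make the definition concrete: the empirical distribution function evaluated at x is the fraction of sample values less than or equal to x. The helper name `ecdf` is an assumption made for this example.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return a function F where F(x) is the fraction of sample values <= x."""
    ordered = sorted(sample)
    n = len(ordered)
    if n == 0:
        raise ValueError("sample must not be empty")

    def F(x):
        # bisect_right counts how many sorted values are <= x.
        return bisect_right(ordered, x) / n

    return F


F = ecdf([1, 2, 2, 4])
print(F(0), F(2), F(4))  # 0.0 0.75 1.0
```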

## Enzyme

Enzymes are macromolecular biological catalysts.

## Evolutionary algorithm

In artificial intelligence, an evolutionary algorithm (EA) is a subset of evolutionary computation, a generic population-based metaheuristic optimization algorithm.

## Evolutionary biology

Evolutionary biology is the subfield of biology that studies the evolutionary processes that produced the diversity of life on Earth, starting from a single common ancestor.

## Expectation–maximization algorithm

In statistics, an expectation–maximization (EM) algorithm is an iterative method to find maximum likelihood or maximum a posteriori (MAP) estimates of parameters in statistical models, where the model depends on unobserved latent variables.

## Expressed sequence tag

In genetics, an expressed sequence tag (EST) is a short sub-sequence of a cDNA sequence.

## F1 score

In statistical analysis of binary classification, the F1 score (also F-score or F-measure) is a measure of a test's accuracy.
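
The F1 score is commonly computed as the harmonic mean of precision and recall. The sketch below assumes both inputs are already available as fractions between 0 and 1; the function name is hypothetical, and returning 0.0 when both inputs are zero is a convention adopted here rather than part of the definition.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # assumed convention to avoid division by zero
    return 2 * precision * recall / (precision + recall)


print(f1_from_precision_recall(0.5, 0.5))  # 0.5
```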

## False positives and false negatives

In medical testing, and more generally in binary classification, a false positive is an error in data reporting in which a test result improperly indicates presence of a condition, such as a disease (the result is positive), when in reality it is not present, while a false negative is an error in which a test result improperly indicates no presence of a condition (the result is negative), when in reality it is present.

## Flickr

Flickr (pronounced "flicker") is an image hosting service and video hosting service.

## Fowlkes–Mallows index

The Fowlkes–Mallows index is an external evaluation method that is used to determine the similarity between two clusterings (clusters obtained after a clustering algorithm).

## Fuzzy clustering

Fuzzy clustering (also referred to as soft clustering) is a form of clustering in which each data point can belong to more than one cluster.

## Gene

In biology, a gene is a sequence of DNA or RNA that codes for a molecule that has a function.

## Gene duplication

Gene duplication (or chromosomal duplication or gene amplification) is a major mechanism through which new genetic material is generated during molecular evolution.

## Genomics

Genomics is an interdisciplinary field of science focusing on the structure, function, evolution, mapping, and editing of genomes.

## Genotype

The genotype is the part of the genetic makeup of a cell, and therefore of an organism or individual, which determines one of its characteristics (phenotype).

## Gold standard (test)

In medicine and statistics, a gold standard test is usually the diagnostic test or benchmark that is the best available under reasonable conditions.

## Google

Google LLC is an American multinational technology company that specializes in Internet-related services and products, which include online advertising technologies, a search engine, cloud computing, software, and hardware.

## Graph (discrete mathematics)

In mathematics, and more specifically in graph theory, a graph is a structure amounting to a set of objects in which some pairs of the objects are in some sense "related".

## Hans-Peter Kriegel

Hans-Peter Kriegel (born 1 October 1948 in Germany) is a German computer scientist and professor at the Ludwig Maximilian University of Munich, where he leads the Database Systems Group in the Department of Computer Science.

## HCS clustering algorithm

The HCS (Highly Connected Subgraphs) clustering algorithm (also known as the HCS algorithm, and by other names such as Highly Connected Clusters/Components/Kernels) is an algorithm for cluster analysis based on graph connectivity. It first represents the similarity data in a similarity graph and then finds all the highly connected subgraphs, which it reports as clusters.

## Heidelberg University

Heidelberg University (Ruprecht-Karls-Universität Heidelberg; Universitas Ruperto Carola Heidelbergensis) is a public research university in Heidelberg, Baden-Württemberg, Germany.

## Hierarchical clustering

In data mining and statistics, hierarchical clustering (also called hierarchical cluster analysis or HCA) is a method of cluster analysis which seeks to build a hierarchy of clusters.

## High-dimensional statistics

In statistical theory, the field of high-dimensional statistics studies data whose dimension is larger than dimensions considered in classical multivariate analysis.

## Hopkins statistic

The Hopkins statistic (introduced by Brian Hopkins and John Gordon Skellam) is a way of measuring the cluster tendency of a data set.

## Human genetic clustering

Human genetic clustering is the degree to which human genetic variation can be partitioned into a small number of groups or clusters.

## Image

An image (from imago) is an artifact that depicts visual perception, for example, a photo or a two-dimensional picture, that has a similar appearance to some subject—usually a physical object or a person, thus providing a depiction of it.

## Image analysis

Image analysis is the extraction of meaningful information from images; mainly from digital images by means of digital image processing techniques.

## Image segmentation

In computer vision, image segmentation is the process of partitioning a digital image into multiple segments (sets of pixels, also known as super-pixels).

## Independent component analysis

In signal processing, independent component analysis (ICA) is a computational method for separating a multivariate signal into additive subcomponents.

## Information retrieval

Information retrieval (IR) is the activity of obtaining information system resources relevant to an information need from a collection of information resources.

## Information theory

Information theory studies the quantification, storage, and communication of information.

## Jaccard index

The Jaccard index, also known as Intersection over Union and the Jaccard similarity coefficient (originally coined coefficient de communauté by Paul Jaccard), is a statistic used for comparing the similarity and diversity of sample sets.
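
For two finite sets, the Jaccard index is the size of the intersection divided by the size of the union. The following is a minimal sketch; the function name is illustrative, and returning 1.0 for two empty sets is an assumed convention.

```python
def jaccard_index(a, b):
    """Size of the intersection divided by the size of the union."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0  # assumed convention when both sets are empty
    return len(a & b) / len(union)


print(jaccard_index({1, 2, 3}, {2, 3, 4}))  # 0.5
```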

## Journal of the American Statistical Association

The Journal of the American Statistical Association (JASA) is the primary journal published by the American Statistical Association, the main professional body for statisticians in the United States.

## K-means clustering

k-means clustering is a method of vector quantization, originally from signal processing, that is popular for cluster analysis in data mining.

## K-means++

In data mining, k-means++ is an algorithm for choosing the initial values (or "seeds") for the k-means clustering algorithm.

## K-medians clustering

In statistics and data mining, k-medians clustering is a cluster analysis algorithm.

## K-medoids

The k-medoids algorithm is a clustering algorithm related to the k-means algorithm and the medoidshift algorithm.

## Kernel density estimation

In statistics, kernel density estimation (KDE) is a non-parametric way to estimate the probability density function of a random variable.

## Knowledge extraction

Knowledge extraction is the creation of knowledge from structured (relational databases, XML) and unstructured (text, documents, images) sources.

## Latent class model

In statistics, a latent class model (LCM) relates a set of observed (usually discrete) multivariate variables to a set of latent variables.

## List of gene families

This is a list of gene families or gene complexes, that is, sets of genes that occur across a number of different species and often serve similar biological functions.

## Lloyd's algorithm

In computer science and electrical engineering, Lloyd's algorithm, also known as Voronoi iteration or relaxation, is an algorithm named after Stuart P. Lloyd for finding evenly spaced sets of points in subsets of Euclidean spaces and partitions of these subsets into well-shaped and uniformly sized convex cells.

## Local optimum

In applied mathematics and computer science, a local optimum of an optimization problem is a solution that is optimal (either maximal or minimal) within a neighboring set of candidate solutions.

## Machine learning

Machine learning is a subset of artificial intelligence in the field of computer science that often uses statistical techniques to give computers the ability to "learn" (i.e., progressively improve performance on a specific task) with data, without being explicitly programmed.

## Markedness

In linguistics and social sciences, markedness is the state of standing out as unusual or divergent in comparison to a more common or regular form.

## Market research

Market research (also in some contexts known as industrial research) is any organized effort to gather information about target markets or customers.

## Market segmentation

Market segmentation is the process of dividing a broad consumer or business market, normally consisting of existing and potential customers, into sub-groups of consumers (known as segments) based on some type of shared characteristics.

## Marketing

Marketing is the study and management of exchange relationships.

## Markov chain Monte Carlo

In statistics, Markov chain Monte Carlo (MCMC) methods comprise a class of algorithms for sampling from a probability distribution.

## Mathematical chemistry

Mathematical chemistry is the area of research engaged in novel applications of mathematics to chemistry; it concerns itself principally with the mathematical modeling of chemical phenomena.

## Matthews correlation coefficient

The Matthews correlation coefficient is used in machine learning as a measure of the quality of binary (two-class) classifications, introduced by biochemist Brian W. Matthews in 1975.
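
Given the four counts of a binary confusion matrix, the coefficient can be sketched as follows. The function name and argument order are assumptions for illustration, and returning 0.0 when the denominator is zero is a commonly used convention rather than part of the definition.

```python
import math


def matthews_corrcoef_from_counts(tp, tn, fp, fn):
    """MCC from true/false positive and negative counts; ranges from -1 to +1."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # assumed convention when any marginal total is zero
    return (tp * tn - fp * fn) / denominator


print(matthews_corrcoef_from_counts(tp=50, tn=50, fp=0, fn=0))  # 1.0
```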

## Mean shift

Mean shift is a non-parametric feature-space analysis technique for locating the maxima of a density function, a so-called mode-seeking algorithm.

## Median

The median is the value separating the higher half of a data sample, a population, or a probability distribution, from the lower half.
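
For a finite sample, one simple way to find the median is to sort the values and take the middle one, averaging the two middle values when the count is even. This sketch uses a hypothetical helper name; Python's standard library offers `statistics.median` for the same purpose.

```python
def median_of(values):
    """Middle value of a non-empty sample (mean of the two middle values if even)."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("sample must not be empty")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_of([3, 1, 2]))     # 2
print(median_of([4, 1, 3, 2]))  # 2.5
```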

## Medical imaging

Medical imaging is the technique and process of creating visual representations of the interior of a body for clinical analysis and medical intervention, as well as visual representation of the function of some organs or tissues (physiology).

## Medicine

Medicine is the science and practice of the diagnosis, treatment, and prevention of disease.

## Message passing

In computer science, message passing is a technique for invoking behavior (i.e., running a program) on a computer.

## Metabolic pathway

In biochemistry, a metabolic pathway is a linked series of chemical reactions occurring within a cell.

## Metric (mathematics)

In mathematics, a metric or distance function is a function that defines a distance between each pair of elements of a set.

## Multi-objective optimization

Multi-objective optimization (also known as multi-objective programming, vector optimization, multicriteria optimization, multiattribute optimization or Pareto optimization) is an area of multiple criteria decision making, that is concerned with mathematical optimization problems involving more than one objective function to be optimized simultaneously.

## Multidimensional scaling

Multidimensional scaling (MDS) is a means of visualizing the level of similarity of individual cases of a dataset.

## Multimodal distribution

In statistics, a bimodal distribution is a continuous probability distribution with two different modes.

## Multivariate normal distribution

In probability theory and statistics, the multivariate normal distribution or multivariate Gaussian distribution is a generalization of the one-dimensional (univariate) normal distribution to higher dimensions.

## Mutual information

In probability theory and information theory, the mutual information (MI) of two random variables is a measure of the mutual dependence between the two variables.
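
As an illustration, the mutual information of two discrete variables can be estimated from paired samples by plugging empirical frequencies into the sum of p(x, y) log(p(x, y) / (p(x) p(y))). The function name is hypothetical, the result is in nats, and this plug-in estimate is only a sketch, not a bias-corrected estimator.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in nats from paired discrete samples."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))
    count_x = Counter(xs)
    count_y = Counter(ys)
    # p(x, y) / (p(x) p(y)) simplifies to c * n / (count_x[x] * count_y[y]).
    return sum(
        (c / n) * math.log(c * n / (count_x[x] * count_y[y]))
        for (x, y), c in joint.items()
    )


# Identical variables share all their information; independent ones share none.
print(mutual_information([0, 0, 1, 1], [0, 0, 1, 1]))
print(mutual_information([0, 0, 1, 1], [0, 1, 0, 1]))
```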

## Nearest neighbor search

Nearest neighbor search (NNS), as a form of proximity search, is the optimization problem of finding the point in a given set that is closest (or most similar) to a given point.
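
The simplest approach is a linear scan over all candidates, sketched below under the assumption of Euclidean distance (`math.dist` requires Python 3.8 or later). The function name is illustrative; index structures are typically used instead when the point set is large.

```python
import math


def nearest_neighbor(points, query):
    """Return the point in `points` closest to `query` by Euclidean distance."""
    if not points:
        raise ValueError("points must not be empty")
    return min(points, key=lambda p: math.dist(p, query))


print(nearest_neighbor([(0, 0), (5, 5), (2, 1)], (2, 2)))  # (2, 1)
```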

## Neighbourhood components analysis

Neighbourhood components analysis is a supervised learning method for classifying multivariate data into distinct classes according to a given distance metric over the data.

## Neural network

The term neural network was traditionally used to refer to a network or circuit of neurons.

## New product development

In business and engineering, new product development (NPD) covers the complete process of bringing a new product to market.

## Normal distribution

In probability theory, the normal (or Gaussian or Gauss or Laplace–Gauss) distribution is a very common continuous probability distribution.

## NP-hardness

NP-hardness (non-deterministic polynomial-time hardness), in computational complexity theory, is the defining property of a class of problems that are, informally, "at least as hard as the hardest problems in NP".

## Numerical taxonomy

Numerical taxonomy is a classification system in biological systematics which deals with the grouping by numerical methods of taxonomic units based on their character states.

## OPTICS algorithm

Ordering points to identify the clustering structure (OPTICS) is an algorithm for finding density-based clusters in spatial data.

## Outline of object recognition

This outline is an overview of and topical guide to object recognition, a technology in the field of computer vision for finding and identifying objects in an image or video sequence.

## Overfitting

In statistics, overfitting is "the production of an analysis that corresponds too closely or exactly to a particular set of data, and may therefore fail to fit additional data or predict future observations reliably".

## Parallel coordinates

Parallel coordinates are a common way of visualizing high-dimensional geometry and analyzing multivariate data.

## Pattern recognition

Pattern recognition is a branch of machine learning that focuses on the recognition of patterns and regularities in data, although it is in some cases considered to be nearly synonymous with machine learning.

## Personality psychology

Personality psychology is a branch of psychology that studies personality and its variation among individuals.

## Phylogenetic tree

A phylogenetic tree or evolutionary tree is a branching diagram or "tree" showing the evolutionary relationships among various biological species or other entities—their phylogeny—based upon similarities and differences in their physical or genetic characteristics.

## Plant

Plants are mainly multicellular, predominantly photosynthetic eukaryotes of the kingdom Plantae.

## Population

In biology, a population is all the organisms of the same group or species, which live in a particular geographical area, and have the capability of interbreeding.

## Positioning (marketing)

Positioning refers to the place that a brand occupies in the mind of the customer and how it is distinguished from products from competitors.

## Positron emission tomography

Positron-emission tomography (PET) is a nuclear medicine functional imaging technique that is used to observe metabolic processes in the body as an aid to the diagnosis of disease.

## Precision and recall

In pattern recognition, information retrieval and binary classification, precision (also called positive predictive value) is the fraction of relevant instances among the retrieved instances, while recall (also known as sensitivity) is the fraction of relevant instances that have been retrieved over the total amount of relevant instances.
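
The two fractions can be computed directly from the retrieved and relevant sets, as in this minimal sketch. The function name is hypothetical, and returning 0.0 when a denominator is empty is an assumed convention.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a retrieved set against a relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant instances that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# 4 items retrieved, 3 items relevant, 2 in common.
precision, recall = precision_recall([1, 2, 3, 4], [2, 4, 6])
print(precision, recall)
```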

## Principal component analysis

Principal component analysis (PCA) is a statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components.

## Probability distribution

In probability theory and statistics, a probability distribution is a mathematical function that provides the probabilities of occurrence of different possible outcomes in an experiment.

## R-tree

R-trees are tree data structures used for spatial access methods, i.e., for indexing multi-dimensional information such as geographical coordinates, rectangles or polygons.

## Rand index

The Rand index or Rand measure (named after William M. Rand) in statistics, and in particular in data clustering, is a measure of the similarity between two data clusterings.
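
One way to compute it is to count the pairs of elements on which the two clusterings agree, meaning both place the pair in the same cluster or both separate it, and divide by the total number of pairs. The sketch below assumes clusterings given as label sequences; the function name is illustrative, and returning 1.0 when there are fewer than two elements is an assumed convention.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of element pairs on which two clusterings agree."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have equal length")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        return 1.0  # assumed convention for fewer than two elements
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


# Same partition with renamed labels: perfect agreement.
print(rand_index([0, 0, 1, 1], [1, 1, 0, 0]))  # 1.0
```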

## Raymond Cattell

Raymond Bernard Cattell (20 March 1905 – 2 February 1998) was a British and American psychologist, known for his psychometric research into intrapersonal psychological structure.

## Recommender system

A recommender system or a recommendation system (sometimes replacing "system" with a synonym such as platform or engine) is a subclass of information filtering system that seeks to predict the "rating" or "preference" a user would give to an item.

## Robert Tryon

Robert Choate Tryon (September 4, 1901 – September 27, 1967) was an American behavioral psychologist, who pioneered the study of hereditary trait inheritance and learning in animals.

## Sørensen–Dice coefficient

The Sørensen–Dice index, also known by several other names, is a statistic used for comparing the similarity of two samples.
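
For two finite sets, the coefficient is twice the size of the intersection divided by the sum of the two set sizes. This is a minimal sketch with a hypothetical function name; returning 1.0 for two empty sets is an assumed convention.

```python
def dice_coefficient(a, b):
    """Twice the intersection size divided by the sum of the set sizes."""
    a, b = set(a), set(b)
    total = len(a) + len(b)
    if total == 0:
        return 1.0  # assumed convention when both sets are empty
    return 2 * len(a & b) / total


print(dice_coefficient({1, 2, 3}, {2, 3, 4}))
```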

## Self-organizing map

A self-organizing map (SOM) or self-organizing feature map (SOFM) is a type of artificial neural network (ANN) that is trained using unsupervised learning to produce a low-dimensional (typically two-dimensional), discretized representation of the input space of the training samples, called a map, and is therefore a method to do dimensionality reduction.

## Sequence analysis

In bioinformatics, sequence analysis is the process of subjecting a DNA, RNA or peptide sequence to any of a wide range of analytical methods to understand its features, function, structure, or evolution.

## Sequence clustering

In bioinformatics, sequence clustering algorithms attempt to group biological sequences that are somehow related.

## SIGKDD

SIGKDD is the Association for Computing Machinery's (ACM) Special Interest Group (SIG) on Knowledge Discovery and Data Mining.

## Silhouette (clustering)

Silhouette refers to a method of interpretation and validation of consistency within clusters of data.

## Single-linkage clustering

In statistics, single-linkage clustering is one of several methods of hierarchical clustering.

## Social network

A social network is a social structure made up of a set of social actors (such as individuals or organizations), sets of dyadic ties, and other social interactions between actors.

## Software evolution

Software evolution is the term used in software engineering (specifically software maintenance) to refer to the process of developing software initially, then repeatedly updating it for various reasons.

## Spectral clustering

In multivariate statistics and the clustering of data, spectral clustering techniques make use of the spectrum (eigenvalues) of the similarity matrix of the data to perform dimensionality reduction before clustering in fewer dimensions.

## Statistical classification

In machine learning and statistics, classification is the problem of identifying to which of a set of categories (sub-populations) a new observation belongs, on the basis of a training set of data containing observations (or instances) whose category membership is known.

## Statistical physics

Statistical physics is a branch of physics that uses methods of probability theory and statistics, and particularly the mathematical tools for dealing with large populations and approximations, in solving physical problems.

## Statistics

Statistics is a branch of mathematics dealing with the collection, analysis, interpretation, presentation, and organization of data.

## Stock keeping unit

In the field of inventory management, a stock keeping unit (SKU) is a distinct type of item for sale, such as a product or service, and all attributes associated with the item type that distinguish it from other item types.

## Structured data analysis (statistics)

Structured data analysis is the statistical data analysis of structured data.

## SUBCLU

SUBCLU is an algorithm for clustering high-dimensional data by Karin Kailing, Hans-Peter Kriegel and Peer Kröger.

## Supervised learning

Supervised learning is the machine learning task of learning a function that maps an input to an output based on example input-output pairs.

## Survey methodology

A field of applied statistics of human research surveys, survey methodology studies the sampling of individual units from a population and associated techniques of survey data collection, such as questionnaire construction and methods for improving the number and accuracy of responses to surveys.

## Systematics

Biological systematics is the study of the diversification of living forms, both past and present, and the relationships among living things through time.

## Tissue (biology)

In biology, tissue is a cellular organizational level between cells and a complete organ.

## Topological index

In the fields of chemical graph theory, molecular topology, and mathematical chemistry, a topological index, also known as a connectivity index, is a type of molecular descriptor that is calculated based on the molecular graph of a chemical compound.

## Transcriptomics technologies

Transcriptomics technologies are the techniques used to study an organism’s transcriptome, the sum of all of its RNA transcripts.

## Unsupervised learning

Unsupervised machine learning is the machine learning task of inferring a function that describes the structure of "unlabeled" data (i.e. data that has not been classified or categorized).

## UPGMA

UPGMA (Unweighted Pair Group Method with Arithmetic Mean) is a simple agglomerative (bottom-up) hierarchical clustering method.

## Variation of information

In probability theory and information theory, the variation of information or shared information distance is a measure of the distance between two clusterings (partitions of elements).

## Voronoi diagram

In mathematics, a Voronoi diagram is a partitioning of a plane into regions based on distance to points in a specific subset of the plane.

## World Wide Web

The World Wide Web (abbreviated WWW or the Web) is an information space where documents and other web resources are identified by Uniform Resource Locators (URLs), interlinked by hypertext links, and accessible via the Internet.

## Yippy

Yippy (formerly Clusty) is a metasearch engine that offers clusters of results. It was developed by Vivísimo, which was later acquired by IBM and renamed IBM Watson Explorer.

## Youden's J statistic

Youden's J statistic (also called Youden's index) is a single statistic that captures the performance of a dichotomous diagnostic test.
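
The statistic is defined as sensitivity plus specificity minus one. The sketch below assumes the four confusion-matrix counts are known and that both classes are present; the function name and argument order are illustrative.

```python
def youden_j(tp, fn, tn, fp):
    """Sensitivity + specificity - 1 for a dichotomous test."""
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return sensitivity + specificity - 1


# 90% sensitivity and 80% specificity give J of about 0.7.
print(youden_j(tp=90, fn=10, tn=80, fp=20))
```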

