String metric

In mathematics and computer science, a string metric (also known as a string similarity metric or string distance function) is a metric that measures distance ("inverse similarity") between two text strings for approximate string matching or comparison and in fuzzy string searching. [1]

42 relations: Approximate string matching, Bhattacharyya distance, Computer science, Damerau–Levenshtein distance, Data analysis techniques for fraud detection, Data deduplication, Data integration, Data mining, Database, Distance, Fingerprint, Genetic testing, Hamming distance, Hellinger distance, Image analysis, Incremental search, Information integration, Jaccard index, Jaro–Winkler distance, JavaScript, Jensen–Shannon divergence, Kendall tau distance, Knowledge integration, Kullback–Leibler divergence, Levenshtein distance, Lexical analysis, Machine learning, Mathematics, Metric (mathematics), Most frequent k characters, Ontology merging, Overlap coefficient, Plagiarism detection, Sørensen–Dice coefficient, Scala (programming language), Simple matching coefficient, String (computer science), String-searching algorithm, Taxicab geometry, Tf–idf, Triangle inequality, Tversky index.

Approximate string matching

In computer science, approximate string matching (often colloquially referred to as fuzzy string searching) is the technique of finding strings that match a pattern approximately (rather than exactly).

Bhattacharyya distance

In statistics, the Bhattacharyya distance measures the similarity of two probability distributions.
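
For discrete distributions the distance is commonly written as the negative logarithm of the Bhattacharyya coefficient, the sum of sqrt(p_i * q_i). The following is a minimal sketch under that assumption; the function name `bhattacharyya_distance` and the list-of-probabilities input are illustrative choices, not taken from the source.

```python
import math

def bhattacharyya_distance(p, q):
    """Bhattacharyya distance between two discrete distributions.

    p and q are equal-length sequences of probabilities (assumed to sum to 1).
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of outcomes")
    # Bhattacharyya coefficient: overlap between the two distributions.
    coefficient = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    if coefficient == 0:
        return math.inf  # no overlap at all
    # Clamp to 1.0 so rounding error cannot produce a negative distance.
    return -math.log(min(coefficient, 1.0))
```

Identical distributions give a distance of 0, and distributions with no shared support give infinity.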

Computer science

Computer science deals with the theoretical foundations of information and computation, together with practical techniques for the implementation and application of these foundations.

Damerau–Levenshtein distance

In information theory and computer science, the Damerau–Levenshtein distance (named after Frederick J. Damerau and Vladimir I. Levenshtein) is a string metric for measuring the edit distance between two sequences.
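
The distance counts insertions, deletions, substitutions, and transpositions of two adjacent characters. The sketch below implements the restricted variant usually called optimal string alignment, which never edits the same substring twice; this is a simplification chosen for brevity, and the name `osa_distance` is hypothetical.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance (restricted Damerau-Levenshtein)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters of a
    for j in range(cols):
        d[0][j] = j  # insert all j characters of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Transposition of two adjacent characters.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]
```

With this variant `osa_distance("ab", "ba")` is 1, while `osa_distance("ca", "abc")` is 3; the unrestricted Damerau–Levenshtein distance for the second pair is 2.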

Data analysis techniques for fraud detection

Fraud is a billion-dollar business and it is increasing every year.

Data deduplication

In computing, data deduplication is a specialized data compression technique for eliminating duplicate copies of repeating data.

Data integration

Data integration involves combining data residing in different sources and providing users with a unified view of them.

Data mining

Data mining is the process of discovering patterns in large data sets involving methods at the intersection of machine learning, statistics, and database systems.

Database

A database is an organized collection of data, stored and accessed electronically.

Distance

Distance is a numerical measurement of how far apart objects are.

Fingerprint

A fingerprint in its narrow sense is an impression left by the friction ridges of a human finger.

Genetic testing

Genetic testing, also known as DNA testing, allows the determination of bloodlines and the genetic diagnosis of vulnerabilities to inherited diseases.

Hamming distance

In information theory, the Hamming distance between two strings of equal length is the number of positions at which the corresponding symbols are different.
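
The definition translates almost directly into code. A minimal sketch, with `hamming_distance` as an illustrative name:

```python
def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires strings of equal length")
    return sum(x != y for x, y in zip(a, b))
```

For example, "karolin" and "kathrin" differ in three positions, so their Hamming distance is 3.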

Hellinger distance

In probability and statistics, the Hellinger distance (closely related to, although different from, the Bhattacharyya distance) is used to quantify the similarity between two probability distributions.
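
For discrete distributions the Hellinger distance is usually computed from the differences of the square roots of the probabilities, scaled so that the result lies between 0 and 1. The sketch below assumes that form; `hellinger_distance` is a hypothetical name.

```python
import math

def hellinger_distance(p, q):
    """Hellinger distance between two discrete distributions (range 0 to 1)."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of outcomes")
    total = sum((math.sqrt(pi) - math.sqrt(qi)) ** 2 for pi, qi in zip(p, q))
    return math.sqrt(total / 2)
```

Identical distributions give 0, and distributions with disjoint support give 1.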

Image analysis

Image analysis is the extraction of meaningful information from images; mainly from digital images by means of digital image processing techniques.

Incremental search

In computing, incremental search, incremental find or real-time suggestions is a user interface interaction method to progressively search for and filter through text.

Information integration

Information integration (II) is the merging of information from heterogeneous sources with differing conceptual, contextual and typographical representations.

Jaccard index

The Jaccard index, also known as Intersection over Union and the Jaccard similarity coefficient (originally coined coefficient de communauté by Paul Jaccard), is a statistic used for comparing the similarity and diversity of sample sets.
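
As the name Intersection over Union suggests, the index is the size of the intersection divided by the size of the union. To apply it to strings, one common approach (an assumption here, not something the source prescribes) is to compare sets of character bigrams. The helper names `jaccard_index` and `bigrams` are illustrative.

```python
def jaccard_index(a: set, b: set) -> float:
    """Size of the intersection divided by the size of the union."""
    if not a and not b:
        return 1.0  # convention assumed here: two empty sets are identical
    return len(a & b) / len(a | b)

def bigrams(text: str) -> set:
    """Set of adjacent character pairs in text."""
    return {text[i:i + 2] for i in range(len(text) - 1)}
```

For "night" and "nacht" the bigram sets share only "ht" out of seven distinct bigrams, so `jaccard_index(bigrams("night"), bigrams("nacht"))` is 1/7.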

Jaro–Winkler distance

In computer science and statistics, the Jaro–Winkler distance is a string metric for measuring the edit distance between two sequences.
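
The measure is usually computed as a similarity between 0 and 1, with the distance taken as one minus that value. The sketch below follows the commonly described formulation: Jaro similarity from matching characters and transpositions, then a bonus for a shared prefix of up to four characters with scaling factor 0.1. Those two defaults and all function names are assumptions for illustration.

```python
def jaro_similarity(a: str, b: str) -> float:
    """Jaro similarity based on matching characters and transpositions."""
    if a == b:
        return 1.0
    if not a or not b:
        return 0.0
    # Characters match only if they are equal and close enough in position.
    window = max(max(len(a), len(b)) // 2 - 1, 0)
    a_matched = [False] * len(a)
    b_matched = [False] * len(b)
    matches = 0
    for i, ch in enumerate(a):
        for j in range(max(0, i - window), min(len(b), i + window + 1)):
            if not b_matched[j] and b[j] == ch:
                a_matched[i] = b_matched[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    a_seq = [c for c, m in zip(a, a_matched) if m]
    b_seq = [c for c, m in zip(b, b_matched) if m]
    # Half the number of matched characters that appear in a different order.
    transpositions = sum(x != y for x, y in zip(a_seq, b_seq)) // 2
    return (matches / len(a) + matches / len(b)
            + (matches - transpositions) / matches) / 3

def jaro_winkler_similarity(a: str, b: str, scaling: float = 0.1) -> float:
    """Jaro similarity boosted for a common prefix of up to 4 characters."""
    jaro = jaro_similarity(a, b)
    prefix = 0
    for x, y in zip(a[:4], b[:4]):
        if x != y:
            break
        prefix += 1
    return jaro + prefix * scaling * (1 - jaro)

def jaro_winkler_distance(a: str, b: str) -> float:
    return 1 - jaro_winkler_similarity(a, b)
```

For "MARTHA" and "MARHTA" all six characters match with one transposition, giving a Jaro similarity of about 0.944 and a Jaro–Winkler similarity of about 0.961.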

JavaScript

JavaScript, often abbreviated as JS, is a high-level, interpreted programming language.

Jensen–Shannon divergence

In probability theory and statistics, the Jensen–Shannon divergence is a method of measuring the similarity between two probability distributions.
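
The divergence is typically defined as the average Kullback–Leibler divergence of each distribution from their midpoint, which makes it symmetric and finite. A minimal sketch for discrete distributions using natural logarithms; the helper `_kl` and the name `jensen_shannon_divergence` are hypothetical.

```python
import math

def _kl(p, q):
    """Kullback-Leibler divergence; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jensen_shannon_divergence(p, q):
    """Average KL divergence of p and q from their midpoint distribution."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of outcomes")
    midpoint = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * _kl(p, midpoint) + 0.5 * _kl(q, midpoint)
```

With natural logarithms the result ranges from 0 for identical distributions to ln 2 for distributions with disjoint support.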

Kendall tau distance

The Kendall tau rank distance is a metric that counts the number of pairwise disagreements between two ranking lists.
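
Counting pairwise disagreements can be sketched directly: for every pair of items, check whether the two rankings order them differently. The quadratic-time version below assumes both rankings contain the same distinct items; `kendall_tau_distance` is an illustrative name.

```python
from itertools import combinations

def kendall_tau_distance(ranking_a, ranking_b) -> int:
    """Number of item pairs that the two rankings order differently."""
    pos_a = {item: i for i, item in enumerate(ranking_a)}
    pos_b = {item: i for i, item in enumerate(ranking_b)}
    if len(pos_a) != len(ranking_a) or len(pos_b) != len(ranking_b):
        raise ValueError("rankings must not contain duplicates")
    if pos_a.keys() != pos_b.keys():
        raise ValueError("rankings must contain the same items")
    return sum(
        1
        for x, y in combinations(ranking_a, 2)
        # A negative product means the pair appears in opposite orders.
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )
```

For the rankings [1, 2, 3, 4, 5] and [3, 4, 1, 2, 5] the discordant pairs are (1, 3), (1, 4), (2, 3), and (2, 4), so the distance is 4.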

Knowledge integration

Knowledge integration is the process of synthesizing multiple knowledge models (or representations) into a common model (representation).

Kullback–Leibler divergence

In mathematical statistics, the Kullback–Leibler divergence (also called relative entropy) is a measure of how one probability distribution diverges from a second, expected probability distribution.
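
For discrete distributions the divergence of P from Q is the sum of p_i * log(p_i / q_i). A minimal sketch with natural logarithms; the name `kl_divergence` and the handling of zero probabilities are conventions assumed here.

```python
import math

def kl_divergence(p, q) -> float:
    """Kullback-Leibler divergence D(P || Q) for discrete distributions."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of outcomes")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue  # 0 * log(0 / q) is taken as 0
        if qi == 0:
            return math.inf  # P puts mass where Q has none
        total += pi * math.log(pi / qi)
    return total
```

The measure is not symmetric: `kl_divergence(p, q)` and `kl_divergence(q, p)` generally differ, which is why it is called a divergence and does not qualify as a metric.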

Levenshtein distance

In information theory, linguistics and computer science, the Levenshtein distance is a string metric for measuring the difference between two sequences.
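
The distance is the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into the other. A minimal dynamic-programming sketch that keeps only two rows of the table; `levenshtein` is an illustrative name.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits that turn a into b."""
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]
```

For example, "kitten" becomes "sitting" with two substitutions and one insertion, so the distance is 3.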

Lexical analysis

In computer science, lexical analysis, lexing or tokenization is the process of converting a sequence of characters (such as in a computer program or web page) into a sequence of tokens (strings with an assigned and thus identified meaning).

Machine learning

Machine learning is a subset of artificial intelligence in the field of computer science that often uses statistical techniques to give computers the ability to "learn" (i.e., progressively improve performance on a specific task) with data, without being explicitly programmed.

Mathematics

Mathematics (from Greek μάθημα máthēma, "knowledge, study, learning") is the study of such topics as quantity, structure, space, and change.

Metric (mathematics)

In mathematics, a metric or distance function is a function that defines a distance between each pair of elements of a set.
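
Stated as conditions, a function d on a set X is a metric if the following standard axioms hold for all x, y, z in X (the symbols d and X are notation introduced here):

```latex
\begin{align*}
  d(x, y) &\ge 0                     && \text{(non-negativity)} \\
  d(x, y) &= 0 \iff x = y            && \text{(identity of indiscernibles)} \\
  d(x, y) &= d(y, x)                 && \text{(symmetry)} \\
  d(x, z) &\le d(x, y) + d(y, z)     && \text{(triangle inequality)}
\end{align*}
```

Not every measure commonly called a string metric satisfies all four conditions; the triangle inequality in particular fails for some of them.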

Most frequent k characters

In information theory, MostFreqKDistance is a string metric technique for quickly estimating how similar two ordered sets or strings are.

Ontology merging

Ontology merging defines the act of bringing together two conceptually divergent ontologies or the instance data associated to two ontologies.

Overlap coefficient

The overlap coefficient, or Szymkiewicz–Simpson coefficient, is a similarity measure that measures the overlap between two sets.
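
The coefficient divides the size of the intersection by the size of the smaller set. A minimal sketch; the name `overlap_coefficient` and the value returned for an empty set are assumptions.

```python
def overlap_coefficient(a: set, b: set) -> float:
    """Size of the intersection divided by the size of the smaller set."""
    if not a or not b:
        return 0.0  # convention assumed here for empty input
    return len(a & b) / min(len(a), len(b))
```

If one set is a subset of the other, the coefficient is 1 regardless of how much larger the other set is.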

Plagiarism detection

Plagiarism detection is the process of locating instances of plagiarism within a work or document.

Sørensen–Dice coefficient

The Sørensen–Dice index, also known by several other names, is a statistic used for comparing the similarity of two samples.
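
For two sets the index is twice the size of the intersection divided by the sum of the two set sizes. The sketch below applies it to character bigrams, a common way to use it on strings; that choice and the names `dice_coefficient` and `char_bigrams` are assumptions for illustration.

```python
def dice_coefficient(a: set, b: set) -> float:
    """Twice the intersection size divided by the sum of the set sizes."""
    if not a and not b:
        return 1.0  # convention assumed here: two empty sets are identical
    return 2 * len(a & b) / (len(a) + len(b))

def char_bigrams(text: str) -> set:
    """Set of adjacent character pairs in text."""
    return {text[i:i + 2] for i in range(len(text) - 1)}
```

"night" and "nacht" each have four bigrams and share one ("ht"), so the coefficient is 2 * 1 / 8 = 0.25.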

Scala (programming language)

Scala is a general-purpose programming language providing support for functional programming and a strong static type system.

Simple matching coefficient

The simple matching coefficient (SMC) or Rand similarity coefficient is a statistic used for comparing the similarity and diversity of sample sets.
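
For two equal-length vectors of binary attributes, the coefficient is the number of positions where the values agree (both 0 or both 1) divided by the total number of positions. A minimal sketch; `simple_matching_coefficient` is an illustrative name.

```python
def simple_matching_coefficient(x, y) -> float:
    """Fraction of positions where two equal-length binary vectors agree."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    if not x:
        raise ValueError("vectors must not be empty")
    return sum(a == b for a, b in zip(x, y)) / len(x)
```

Unlike the Jaccard index, shared absences (positions where both vectors are 0) count as agreement.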

String (computer science)

In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable.

String-searching algorithm

In computer science, string-searching algorithms, sometimes called string-matching algorithms, are an important class of string algorithms that try to find a place where one or several strings (also called patterns) are found within a larger string or text.

Taxicab geometry

A taxicab geometry is a form of geometry in which the usual distance function or metric of Euclidean geometry is replaced by a new metric in which the distance between two points is the sum of the absolute differences of their Cartesian coordinates.
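
The taxicab distance follows directly from that description. A minimal sketch, with `taxicab_distance` as an illustrative name:

```python
def taxicab_distance(p, q):
    """Sum of the absolute differences of the Cartesian coordinates."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    return sum(abs(a - b) for a, b in zip(p, q))
```

For the points (1, 2) and (4, 6) the distance is |1 - 4| + |2 - 6| = 7, compared with a Euclidean distance of 5.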

Tf–idf

In information retrieval, tf–idf or TFIDF, short for term frequency–inverse document frequency, is a numerical statistic that is intended to reflect how important a word is to a document in a collection or corpus.
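
Many weighting variants exist. The sketch below uses one simple form, assumed here for illustration: relative term frequency within the document multiplied by the natural logarithm of the number of documents divided by the number of documents containing the term. Documents are lists of tokens, and `tf_idf` is a hypothetical name.

```python
import math
from collections import Counter

def tf_idf(term, document, corpus) -> float:
    """tf-idf weight of term in document, relative to corpus.

    document is a list of tokens; corpus is a list of such documents.
    """
    if not document:
        return 0.0
    term_frequency = Counter(document)[term] / len(document)
    documents_with_term = sum(1 for doc in corpus if term in doc)
    if documents_with_term == 0:
        return 0.0  # term never occurs, so it carries no weight
    inverse_document_frequency = math.log(len(corpus) / documents_with_term)
    return term_frequency * inverse_document_frequency
```

A word that appears in every document gets a weight of 0, while a word concentrated in few documents scores higher.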

Triangle inequality

In mathematics, the triangle inequality states that for any triangle, the sum of the lengths of any two sides must be greater than or equal to the length of the remaining side.

Tversky index

The Tversky index, named after Amos Tversky, is an asymmetric similarity measure on sets that compares a variant to a prototype.
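
The index weights the elements unique to each set separately, which is what makes it asymmetric. A minimal sketch of the usual formula, |X ∩ Y| / (|X ∩ Y| + alpha * |X - Y| + beta * |Y - X|); the name `tversky_index` and the result for a zero denominator are assumptions.

```python
def tversky_index(x: set, y: set, alpha: float, beta: float) -> float:
    """Tversky index of variant x against prototype y.

    alpha weights elements only in x; beta weights elements only in y.
    """
    common = len(x & y)
    denominator = common + alpha * len(x - y) + beta * len(y - x)
    if denominator == 0:
        return 1.0  # convention assumed here: nothing distinguishes the sets
    return common / denominator
```

Setting alpha = beta = 1 reproduces the Jaccard index, and alpha = beta = 0.5 reproduces the Sørensen–Dice coefficient. With unequal weights, swapping the arguments can change the result.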

Redirects here:

List of string metrics, String Metrics, String distance, String distance function, String distance measure, String distance metric, String metrics, String similarity, String similarity function, String similarity measure, String similarity metric.

References

[1] https://en.wikipedia.org/wiki/String_metric
