Data deduplication

In computing, data deduplication is a specialized data compression technique for eliminating duplicate copies of repeating data. [1]
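
As a rough illustration of the idea, the following sketch splits a byte string into fixed-size chunks and stores each distinct chunk only once. The function names (`deduplicate`, `restore`), the 4-byte chunk size, and the use of SHA-256 as the chunk identifier are assumptions made for this example; real systems differ in chunking strategy and indexing.

```python
import hashlib

def deduplicate(data: bytes, chunk_size: int = 4):
    """Split data into fixed-size chunks and keep one copy of each distinct chunk."""
    store = {}   # digest -> chunk bytes (each unique chunk stored once)
    recipe = []  # ordered digests needed to rebuild the original data
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # only the first occurrence is stored
        recipe.append(digest)            # later occurrences become references
    return store, recipe

def restore(store, recipe) -> bytes:
    """Rebuild the original data by following the references in order."""
    return b"".join(store[digest] for digest in recipe)

data = b"ABCDABCDABCDWXYZ"
store, recipe = deduplicate(data)
print(len(recipe), "chunks referenced,", len(store), "chunks stored")
```

Here the repeated `ABCD` chunk is kept once and referenced three times, which is where the space saving comes from.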

36 relations: Analysis, Archive, Backup, Capacity optimization, Cloud storage, Collision (computer science), Computing, Content-addressable storage, Convergent encryption, Copy-on-write, Cryptographic hash function, Data compression, Data corruption, Data differencing, Delta encoding, Email, Hard link, IEEE Spectrum, Institute of Electrical and Electronics Engineers, Linked data, LOCKSS, LZ77 and LZ78, Megabyte, Network-attached storage, Non-volatile random-access memory, Pigeonhole principle, Pointer (computer programming), Record linkage, SHA-1, SHA-2, Single-instance storage, Snapshot (computer storage), Storage Networking Industry Association, Virtual tape library, WAN optimization, Write Anywhere File Layout.


Analysis

Analysis is the process of breaking a complex topic or substance into smaller parts in order to gain a better understanding of it.


Archive

An archive is an accumulation of historical records or the physical place they are located.


Backup

In information technology, a backup, or the process of backing up, refers to the copying into an archive file of computer data so it may be used to restore the original after a data loss event.

Capacity optimization

Capacity optimization is a general term for technologies used to improve storage use by shrinking stored data.

Cloud storage

Cloud storage is a model of computer data storage in which the digital data is stored in logical pools.

Collision (computer science)

Collision is used in two slightly different senses in theoretical computer science and telecommunications.


Computing

Computing is any goal-oriented activity requiring, benefiting from, or creating computers.

Content-addressable storage

Content-addressable storage, also referred to as associative storage or abbreviated CAS, is a mechanism for storing information that can be retrieved based on its content, not its storage location.
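
A minimal sketch of retrieval by content rather than location, assuming an in-memory dictionary as the backing store and SHA-256 as the address function. The `ContentStore` class and its `put`/`get` methods are hypothetical names chosen for this example.

```python
import hashlib

class ContentStore:
    """Toy content-addressable store: the address is derived from the content."""

    def __init__(self):
        self._blobs = {}  # address (hex digest) -> content

    def put(self, content: bytes) -> str:
        address = hashlib.sha256(content).hexdigest()
        self._blobs[address] = content  # storing identical content again is a no-op
        return address

    def get(self, address: str) -> bytes:
        return self._blobs[address]

cas = ContentStore()
first = cas.put(b"hello")
second = cas.put(b"hello")  # same content, therefore the same address
print(first == second)
```

Because identical content always maps to the same address, storing a duplicate does not consume additional space, which is why this mechanism pairs naturally with deduplication.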

Convergent encryption

Convergent encryption, also known as content hash keying, is a cryptosystem that produces identical ciphertext from identical plaintext files.
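
The property described above can be sketched by deriving the key from a hash of the plaintext itself. The cipher below is a toy XOR keystream built from SHA-256, used only because it needs no third-party library; it is an assumption for illustration and is not a secure construction. The function names are likewise hypothetical.

```python
import hashlib

def _keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 of key plus a counter (illustration only, not secure)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def convergent_encrypt(plaintext: bytes):
    """Derive the key from the content, so equal plaintexts give equal ciphertexts."""
    key = hashlib.sha256(plaintext).digest()
    stream = _keystream(key, len(plaintext))
    ciphertext = bytes(p ^ s for p, s in zip(plaintext, stream))
    return key, ciphertext

def convergent_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    stream = _keystream(key, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))

key_a, ct_a = convergent_encrypt(b"same file contents")
key_b, ct_b = convergent_encrypt(b"same file contents")
print(ct_a == ct_b)  # identical plaintext yields identical ciphertext
```

Since two users encrypting the same file produce the same ciphertext, a storage provider can deduplicate the encrypted copies without seeing the plaintext.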


Copy-on-write

Copy-on-write (CoW or COW), sometimes referred to as implicit sharing or shadowing, is a resource-management technique used in computer programming to efficiently implement a "duplicate" or "copy" operation on modifiable resources.
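
A small sketch of the technique, assuming a hypothetical `CowBuffer` class: a copy shares the underlying storage with the original, and the real duplication is deferred until one side writes.

```python
class CowBuffer:
    """A buffer whose copies share storage until one of them is written to."""

    def __init__(self, data: bytearray):
        self._data = data
        self._shared = False

    def copy(self) -> "CowBuffer":
        clone = CowBuffer(self._data)         # share the same bytearray, no copying yet
        self._shared = clone._shared = True   # both sides must copy before writing
        return clone

    def write(self, index: int, value: int) -> None:
        if self._shared:
            # First write after sharing: take a private copy now.
            # (Conservative: the other side keeps its flag and may copy once too.)
            self._data = bytearray(self._data)
            self._shared = False
        self._data[index] = value

    def read(self, index: int) -> int:
        return self._data[index]

original = CowBuffer(bytearray(b"abc"))
clone = original.copy()
shared_before_write = original._data is clone._data
clone.write(0, ord("X"))
print(chr(original.read(0)), chr(clone.read(0)))
```

The "copy" itself is cheap; the cost of duplicating the data is only paid if and when a modification actually happens.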

Cryptographic hash function

A cryptographic hash function is a special class of hash function that has certain properties which make it suitable for use in cryptography.

Data compression

In signal processing, data compression, source coding, or bit-rate reduction involves encoding information using fewer bits than the original representation.

Data corruption

Data corruption refers to errors in computer data that occur during writing, reading, storage, transmission, or processing, which introduce unintended changes to the original data.

Data differencing

In computer science and information theory, data differencing or differential compression is producing a technical description of the difference between two sets of data – a source and a target.

Delta encoding

Delta encoding is a way of storing or transmitting data in the form of differences (deltas) between sequential data rather than complete files; more generally this is known as data differencing.
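
For a sequence of numbers, the idea can be sketched as storing the first value followed by successive differences. The function names and the sample readings are assumptions for this example.

```python
from itertools import accumulate

def delta_encode(values):
    """Keep the first value, then store each value as a difference from its predecessor."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]

def delta_decode(deltas):
    """Running sums of the deltas reproduce the original sequence."""
    return list(accumulate(deltas))

readings = [1000, 1002, 1005, 1005, 1009]
deltas = delta_encode(readings)
print(deltas)  # [1000, 2, 3, 0, 4]
```

When consecutive values are close together, the deltas are small numbers that can be stored or compressed more compactly than the full values.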


Email

Electronic mail (email or e-mail) is a method of exchanging messages ("mail") between people using electronic devices.

Hard link

In computing, a hard link is a directory entry that associates a name with a file on a file system.
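
The following sketch creates a second directory entry for the same file with `os.link` and shows that both names refer to one underlying file. It assumes a file system that supports hard links (for example, a typical Linux temporary directory); the file names and the `hard_link_demo` helper are hypothetical.

```python
import os
import tempfile

def hard_link_demo():
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "original.txt")
        alias = os.path.join(tmp, "alias.txt")
        with open(original, "w") as f:
            f.write("shared content")

        os.link(original, alias)                     # second name for the same file
        same = os.path.samefile(original, alias)     # both names resolve to one file
        link_count = os.stat(original).st_nlink      # number of names pointing at it

        os.remove(original)                          # removes one name, not the data
        with open(alias) as f:
            remaining = f.read()
    return same, link_count, remaining

print(hard_link_demo())
```

Because the data survives until its last name is removed, hard links offer a simple file-level way to keep one stored copy under several names.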

IEEE Spectrum

IEEE Spectrum is a magazine edited by the Institute of Electrical and Electronics Engineers.

Institute of Electrical and Electronics Engineers

The Institute of Electrical and Electronics Engineers (IEEE) is a professional association with its corporate office in New York City and its operations center in Piscataway, New Jersey.

Linked data

In computing, linked data (often capitalized as Linked Data) is a method of publishing structured data so that it can be interlinked and become more useful through semantic queries.


LOCKSS

The LOCKSS ("Lots of Copies Keep Stuff Safe") project, under the auspices of Stanford University, is a peer-to-peer network that develops and supports an open source system allowing libraries to collect, preserve and provide their readers with access to material published on the Web.

LZ77 and LZ78

LZ77 and LZ78 are the two lossless data compression algorithms published in papers by Abraham Lempel and Jacob Ziv in 1977 and 1978.


Megabyte

The megabyte is a multiple of the unit byte for digital information.

Network-attached storage

Network-attached storage (NAS) is a file-level computer data storage server connected to a computer network providing data access to a heterogeneous group of clients.

Non-volatile random-access memory

Non-volatile random-access memory (NVRAM) is random-access memory that retains its information when power is turned off.

Pigeonhole principle

In mathematics, the pigeonhole principle states that if n items are put into m containers, with n > m, then at least one container must contain more than one item.
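
The principle explains why any fixed-size hash can collide: there are more possible inputs than hash values. The sketch below makes this concrete with a deliberately tiny hash (the first byte of a SHA-256 digest, so only 256 possible values) applied to 257 distinct inputs. The `tiny_hash` and `find_collision` names are assumptions for this example.

```python
import hashlib

def tiny_hash(data: bytes) -> int:
    """Deliberately small hash: the first byte of SHA-256, so only 256 possible values."""
    return hashlib.sha256(data).digest()[0]

def find_collision(n_items: int = 257):
    """With 257 distinct items and 256 hash values, two items must share a value."""
    seen = {}  # hash value -> first item that produced it
    for i in range(n_items):
        item = str(i).encode()
        h = tiny_hash(item)
        if h in seen:
            return seen[h], item
        seen[h] = item
    return None  # unreachable when n_items > 256

pair = find_collision()
print(pair is not None)
```

Real deduplication systems use far longer digests, which makes an accidental collision extremely unlikely but, by the same argument, never strictly impossible.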

Pointer (computer programming)

In computer science, a pointer is a programming language object that stores the memory address of another value located in computer memory.

Record linkage

Record linkage (RL) is the task of finding records in a data set that refer to the same entity across different data sources (e.g., data files, books, websites, and databases).


SHA-1

In cryptography, SHA-1 (Secure Hash Algorithm 1) is a cryptographic hash function which takes an input and produces a 160-bit (20-byte) hash value known as a message digest, typically rendered as a hexadecimal number 40 digits long.
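
The digest sizes stated above can be checked with Python's standard `hashlib` module; the input `b"abc"` is an arbitrary choice for this example.

```python
import hashlib

h = hashlib.sha1(b"abc")
raw = h.digest()            # the digest as raw bytes
hex_digest = h.hexdigest()  # the same digest rendered in hexadecimal

print(len(raw) * 8, "bits,", len(raw), "bytes,", len(hex_digest), "hex digits")
```

Each byte is written as two hexadecimal digits, which is why a 20-byte digest is 40 digits long.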


SHA-2

SHA-2 (Secure Hash Algorithm 2) is a set of cryptographic hash functions designed by the United States National Security Agency (NSA).

Single-instance storage

Single-instance storage (SIS) is a system's ability to keep one copy of content that multiple users or computers share.

Snapshot (computer storage)

In computer systems, a snapshot is the state of a system at a particular point in time.

Storage Networking Industry Association

The Storage Networking Industry Association (SNIA) is an association of producers and consumers of computer data storage networking products.

Virtual tape library

A virtual tape library (VTL) is a data storage virtualization technology used typically for backup and recovery purposes.

WAN optimization

WAN optimization is a collection of techniques for increasing data transfer efficiencies across wide-area networks (WANs).

Write Anywhere File Layout

The Write Anywhere File Layout (WAFL) is a file system that supports large, high-performance RAID arrays, quick restarts without lengthy consistency checks in the event of a crash or power failure, and growing the file system's size quickly.

Redirects here:

Data de-duplication, Data duplication, De-duplication, Storage de-duplication.


[1] https://en.wikipedia.org/wiki/Data_deduplication
