Character encoding

Character encoding is used to represent a repertoire of characters by some kind of encoding system. [1]

156 relations: Abstraction layer, Addison-Wesley, Alt code, ANSEL, Arabic numerals, ASCII, Émile Baudot, Backward compatibility, Bacon's cipher, Baudot code, BCD (character encoding), Big5, Binary Ordered Compression for Unicode, Bitstream, Braille, Byte, Byte order mark, Byte-oriented protocol, C++, CCSID, Character (computing), Character encodings in HTML, Chinese telegraph code, CJK characters, Code, Code page, Code page 437, Code page 720, Code page 737, Code page 850, Code page 852, Code page 855, Code page 857, Code page 858, Code page 860, Code page 861, Code page 862, Code page 863, Code page 865, Code page 866, Code page 869, Code page 930, Code page 932 (Microsoft Windows), Code page 950, Code point, Code word, Comparison of Unicode encodings, Computation, Computer data storage, Computer science, ..., Content sniffing, Control character, Cross-platform, Cygwin, Cyrillic script, Data, Diacritic, EBCDIC, EBCDIC 037, EBCDIC 1047, Endianness, Escape sequence, Extended Unix Code, Fieldata, File (command), Function (mathematics), GB 18030, GB 2312, GBK (character encoding), Glyph, Greek alphabet, Guobiao standards, Hans Schjellerup, Hong Kong Supplementary Character Set, Hypertext Transfer Protocol, IBM, IBM 1401, IBM 1620, IBM 700/7000 series, Iconv, Indian Script Code for Information Interchange, Integer, International Components for Unicode, International maritime signal flags, ISO/IEC 2022, ISO/IEC 646, ISO/IEC 6937, ISO/IEC 8859, ISO/IEC 8859-1, ISO/IEC 8859-10, ISO/IEC 8859-11, ISO/IEC 8859-13, ISO/IEC 8859-14, ISO/IEC 8859-15, ISO/IEC 8859-16, ISO/IEC 8859-2, ISO/IEC 8859-3, ISO/IEC 8859-4, ISO/IEC 8859-5, ISO/IEC 8859-6, ISO/IEC 8859-7, ISO/IEC 8859-8, ISO/IEC 8859-9, JIS X 0208, JIS X 0213, KOI-7, KOI8-R, KOI8-U, KS X 1001, Latin alphabet, Legacy system, Luit, Mac OS Roman, Microsoft Windows, MIK (character set), MIME, Mojibake, Mojikyo, Morse code, Mozilla, Number, Octet (computing), Plane (Unicode), Punycode, Shift JIS, SIL International, Standard 
Compression Scheme for Unicode, String (computer science), Tamil Script Code for Information Interchange, Telegraph key, Telegraphy, Transcoding, TRON (encoding), Typographic ligature, Unicode, Universal Character Set characters, Universal Coded Character Set, Unix-like, UTF-16, UTF-32, UTF-8, Variable-width encoding, VSCII, Web browser, Windows code page, Windows-1250, Windows-1251, Windows-1252, Windows-1253, Windows-1254, Windows-1255, Windows-1256, Windows-1257, Windows-1258, Writing system, XML.

Abstraction layer

In computing, an abstraction layer or abstraction level is a way of hiding the implementation details of a particular set of functionality, allowing the separation of concerns to facilitate interoperability and platform independence.

Addison-Wesley

Addison-Wesley is a publisher of textbooks and computer literature.

Alt code

On IBM compatible personal computers, many characters not directly associated with a key can be entered using the Alt Numpad input method or Alt code: pressing and holding the Alt key while typing the number identifying the character with the keyboard's numeric keypad.

ANSEL

ANSEL, the American National Standard for Extended Latin Alphabet Coded Character Set for Bibliographic Use, was a character set used in text encoding.

Arabic numerals

Arabic numerals, also called Hindu–Arabic numerals, are the ten digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, based on the Hindu–Arabic numeral system, the most common system for the symbolic representation of numbers in the world today.

ASCII

ASCII, abbreviated from American Standard Code for Information Interchange, is a character encoding standard for electronic communication.
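
As a small illustration (a sketch, not part of the original entry), Python's built-in `ord` and `chr` expose the numeric codes that ASCII assigns to characters. The sample string is an arbitrary choice.

```python
# Map characters to their ASCII code numbers and back.
text = "Hi!"
codes = [ord(ch) for ch in text]            # one integer per character
encoded = text.encode("ascii")              # the same values as raw bytes
restored = "".join(chr(c) for c in codes)   # back to the original string

# ASCII is a 7-bit code, so every value is below 128.
assert all(c < 128 for c in codes)
print(codes, encoded, restored)
```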

Émile Baudot

Jean-Maurice-Émile Baudot (11 September 1845 – 28 March 1903), French telegraph engineer and inventor of the Baudot code, the first means of digital communication, was one of the pioneers of telecommunications.

Backward compatibility

Backward compatibility is a property of a system, product, or technology that allows for interoperability with an older legacy system, or with input designed for such a system, especially in telecommunications and computing.

Bacon's cipher

Bacon's cipher or the Baconian cipher is a method of steganography (a method of hiding a secret message as opposed to just a cipher) devised by Francis Bacon in 1605.

Baudot code

The Baudot code, invented by Émile Baudot, is a character set predating EBCDIC and ASCII.

BCD (character encoding)

BCD ("Binary-Coded Decimal"), also called alphanumeric BCD, alphameric BCD, BCD Interchange Code, or BCDIC, is a family of representations of numerals, uppercase Latin letters, and some special and control characters as six-bit character codes.

Big5

Big-5 or Big5 is a Chinese character encoding method used in Taiwan, Hong Kong, and Macau for Traditional Chinese characters.

Binary Ordered Compression for Unicode

Binary Ordered Compression for Unicode (BOCU) is a MIME-compatible Unicode compression scheme.

Bitstream

A bitstream (or bit stream), also known as binary sequence, is a sequence of bits.

Braille

Braille is a tactile writing system used by people who are visually impaired.

Byte

The byte is a unit of digital information that most commonly consists of eight bits, representing a binary number.

Byte order mark

The byte order mark (BOM) is a Unicode character, U+FEFF, whose appearance as a magic number at the start of a text stream can signal several things to a program consuming the text.
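
One thing a leading BOM can signal is the encoding and byte order of the stream. The following is a minimal sketch of such a check using the BOM constants in Python's standard `codecs` module. The helper name `sniff_bom` and the sample data are assumptions made for illustration, and the check covers only the UTF-8, UTF-16 and UTF-32 marks.

```python
import codecs

def sniff_bom(data: bytes):
    """Return a codec name suggested by a leading BOM, or None if absent."""
    # UTF-32 is checked before UTF-16 because the UTF-32-LE mark
    # begins with the same two bytes as the UTF-16-LE mark.
    candidates = [
        (codecs.BOM_UTF8, "utf-8-sig"),
        (codecs.BOM_UTF32_LE, "utf-32"),
        (codecs.BOM_UTF32_BE, "utf-32"),
        (codecs.BOM_UTF16_LE, "utf-16"),
        (codecs.BOM_UTF16_BE, "utf-16"),
    ]
    for bom, name in candidates:
        if data.startswith(bom):
            return name
    return None

# U+FEFF written explicitly, then encoded as little-endian UTF-16.
sample = "\ufeffhello".encode("utf-16-le")
detected = sniff_bom(sample)
text = sample.decode(detected)   # the "utf-16" codec consumes the BOM
print(detected, text)
```

The returned names are codecs that strip the mark while decoding, so the decoded text does not start with U+FEFF.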

Byte-oriented protocol

A byte-oriented framing protocol is "a communications protocol in which full bytes are used as control codes."

C++

C++ ("see plus plus") is a general-purpose programming language.

CCSID

CCSID is an abbreviation used by IBM to mean "Coded Character Set Identifier".

Character (computing)

In computer and machine-based telecommunications terminology, a character is a unit of information that roughly corresponds to a grapheme, grapheme-like unit, or symbol, such as in an alphabet or syllabary in the written form of a natural language.

Character encodings in HTML

HTML (Hypertext Markup Language) has been in use since 1991, but HTML 4.0 (December 1997) was the first standardized version where international characters were given reasonably complete treatment.

Chinese telegraph code

The Chinese telegraph code, Chinese telegraphic code, or Chinese commercial code is a four-digit decimal code (character encoding) for electrically telegraphing messages written with Chinese characters.

CJK characters

In internationalization, CJK is a collective term for the Chinese, Japanese, and Korean languages, all of which include Chinese characters and derivatives (collectively, CJK characters) in their writing systems.

Code

In communications and information processing, code is a system of rules to convert information—such as a letter, word, sound, image, or gesture—into another form or representation, sometimes shortened or secret, for communication through a communication channel or storage in a storage medium.

Code page

In computing, a code page is a table of values that describes the character set used for encoding a particular set of characters, usually combined with a number of control characters.
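
To make the "table of values" idea concrete, the sketch below decodes one and the same byte under three code pages that ship with Python's standard codecs. The byte value 0xE9 is an arbitrary choice for illustration.

```python
# The same byte value is looked up in three different code page tables.
raw = bytes([0xE9])

decoded = {cp: raw.decode(cp) for cp in ("cp437", "cp866", "cp1252")}

for cp, ch in decoded.items():
    print(f"{cp}: U+{ord(ch):04X} {ch}")
```

Each table yields a different character (a Greek letter, a Cyrillic letter and an accented Latin letter), which is why text must be decoded with the code page it was written in.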

Code page 437

Code page 437 is the character set of the original IBM PC (personal computer), or DOS.

Code page 720

Code page 720 (also known as CP 720, IBM 00720, OEM 720) is a code page used under DOS to write Arabic.

Code page 737

Code page 737 (also known as CP 737, IBM 00737, OEM 737, MS-DOS Greek) is a code page used under DOS to write the Greek language.

Code page 850

Code page 850 (also known as CP 850, IBM 00850, OEM 850, DOS Latin 1) is a code page used under DOS and Psion’s EPOC16 operating systems in Western Europe.

Code page 852

Code page 852 (also known as CP 852, IBM 00852, OEM 852 (Latin II), MS-DOS Latin 2) is a code page used under DOS to write Central European languages that use Latin script (such as Bosnian, Croatian, Czech, Hungarian, Polish, Romanian, Serbian, Slovak or Slovene).

Code page 855

Code page 855 (also known as CP 855, IBM 00855, OEM 855, MS-DOS Cyrillic) is a code page used under DOS to write Cyrillic script.

Code page 857

Code page 857 (also known as CP 857, IBM 00857, OEM 857, MS-DOS Turkish) is a code page used under DOS to write Turkish.

Code page 858

Code page 858 (also known as CP 858, IBM 00858, OEM 858) is a code page used under DOS to write Western European languages.

Code page 860

Code page 860 (also known as CP 860, IBM 00860, OEM 860, DOS Portuguese) is a code page used under DOS to write Portuguese; it is also suitable for writing Spanish and Italian.

Code page 861

Code page 861 (also known as CP 861, IBM 00861, OEM 861, DOS Icelandic) is a code page used under DOS to write the Icelandic language (as well as other Nordic languages).

Code page 862

Code page 862 (also known as CP 862, IBM 00862, OEM 862 (Hebrew), MS-DOS Hebrew) is a code page used under DOS for Hebrew.

Code page 863

Code page 863 (also known as CP 863, IBM 00863, OEM 863, MS-DOS French Canada) is a code page used under DOS to write the French language (mainly in Quebec), although it lacks the letters Æ, æ, Œ, œ, Ÿ and ÿ.

Code page 865

Code page 865 (also known as CP 865, IBM 00865, OEM 865, DOS Nordic) is a code page used under DOS to write Nordic languages (except Icelandic, for which code page 861 is used).

Code page 866

Code page 866 (CP 866; Альтернативная кодировка) is a code page used under DOS and OS/2 to write Cyrillic script.

Code page 869

Code page 869 (CP 869, IBM 869, OEM 869) is a code page used under DOS to write the Greek language.

Code page 930

CCSID 930 (sometimes known as CP930 or codepage 930) is one of several Japanese EBCDIC code pages created by IBM for representation of Japanese text.

Code page 932 (Microsoft Windows)

Microsoft Windows code page 932 (abbreviated MS932, Windows-932 or ambiguously CP932), also called Windows-31J amongst other names, is the Microsoft Windows code page for the Japanese language; it is an extended variant of the Shift JIS Japanese character encoding.

Code page 950

Code page 950 is Microsoft's implementation of the de facto standard Big5.

Code point

In character encoding terminology, a code point or code position is any of the numerical values that make up the code space.
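
For Unicode, the code points of individual characters can be inspected directly in Python. The sketch below is illustrative only: the helper `code_point_label` is a hypothetical name, and the three sample characters (a Latin letter, the euro sign and an emoji) are arbitrary.

```python
import sys

def code_point_label(ch: str) -> str:
    """Format a character's code point in the usual U+XXXX notation."""
    return f"U+{ord(ch):04X}"

labels = [code_point_label(ch) for ch in "A\u20ac\U0001F600"]
print(labels)

# The largest code point in the Unicode code space, as seen by Python.
print(hex(sys.maxunicode))
```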

Code word

In communication, a code word is an element of a standardized code or protocol.

Comparison of Unicode encodings

This article compares Unicode encodings.

Computation

Computation is any type of calculation that includes both arithmetical and non-arithmetical steps and follows a well-defined model, for example an algorithm.

Computer data storage

Computer data storage, often called storage or memory, is a technology consisting of computer components and recording media that are used to retain digital data.

Computer science

Computer science deals with the theoretical foundations of information and computation, together with practical techniques for the implementation and application of these foundations.

Content sniffing

Content sniffing, also known as media type sniffing or MIME sniffing, is the practice of inspecting the content of a byte stream to attempt to deduce the file format of the data within it.

Control character

In computing and telecommunication, a control character or non-printing character is a code point (a number) in a character set, that does not represent a written symbol.

Cross-platform

In computing, cross-platform software (also multi-platform software or platform-independent software) is computer software that is implemented on multiple computing platforms.

Cygwin

Cygwin is a Unix-like environment and command-line interface for Microsoft Windows.

Cyrillic script

The Cyrillic script is a writing system used for various alphabets across Eurasia (particularly in Eastern Europe, the Caucasus, Central Asia, and North Asia).

Data

Data is a set of values of qualitative or quantitative variables.

Diacritic

A diacritic – also diacritical mark, diacritical point, diacritical sign, or an accent – is a glyph added to a letter, or basic glyph.

EBCDIC

Extended Binary Coded Decimal Interchange Code (EBCDIC) is an eight-bit character encoding used mainly on IBM mainframe and IBM midrange computer operating systems.

EBCDIC 037

IBM code page 37 is an EBCDIC code page with the full Latin-1 character set used in IBM mainframes.
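
Python's standard codecs include this code page under the name `cp037`, which allows a quick comparison with ASCII. The sample text below is an arbitrary choice and the snippet is only a sketch.

```python
text = "A1"

ebcdic = text.encode("cp037")        # EBCDIC code page 37 bytes
ascii_bytes = text.encode("ascii")   # ASCII bytes for the same text

# The byte values differ even though the characters are the same.
print(ebcdic.hex(), ascii_bytes.hex())

# Decoding with the matching code page restores the text.
round_trip = ebcdic.decode("cp037")
print(round_trip)
```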

EBCDIC 1047

Code page 1047 is an EBCDIC code page with the full Latin-1 character set.

Endianness

Endianness refers to the sequential order in which bytes are arranged into larger numerical values when stored in memory or when transmitted over digital links.
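
In character encoding, byte order matters for multi-byte code units such as those of UTF-16. The sketch below packs a single 16-bit value both ways with Python's `struct` module and compares the results with the explicit-order UTF-16 codecs. The value 0x0041 (the letter "A") is chosen only for illustration.

```python
import struct

code_unit = 0x0041  # the UTF-16 code unit for "A"

big = struct.pack(">H", code_unit)      # most significant byte first
little = struct.pack("<H", code_unit)   # least significant byte first

# The UTF-16 codecs with an explicit byte order produce the same bytes.
assert "A".encode("utf-16-be") == big
assert "A".encode("utf-16-le") == little
print(big, little)
```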

Escape sequence

An escape sequence is a series of characters used to change the state of computers and their attached peripheral devices, rather than to be displayed or printed as regular data bytes would be.

Extended Unix Code

Extended Unix Code (EUC) is a multibyte character encoding system used primarily for Japanese, Korean, and simplified Chinese.

Fieldata

FIELDATA (also written as Fieldata) was a pioneering computer project run by the US Army Signal Corps in the late 1950s that intended to create a single standard (as defined in MIL-STD-188A/B/C) for collecting and distributing battlefield information.

File (command)

file is a standard Unix program for recognizing the type of data contained in a computer file.

Function (mathematics)

In mathematics, a function was originally the idealization of how a varying quantity depends on another quantity.

GB 18030

GB 18030 is a Chinese government standard, described as Information technology — Chinese coded character set, that defines the required language and character support necessary for software in China.

GB 2312

GB2312 is the registered internet name for a key official character set of the People's Republic of China, used for simplified Chinese characters.

GBK (character encoding)

GBK is an extension of the GB2312 character set for simplified Chinese characters, used in the People's Republic of China.

Glyph

In typography, a glyph is an elemental symbol within an agreed set of symbols, intended to represent a readable character for the purposes of writing.

Greek alphabet

The Greek alphabet has been used to write the Greek language since the late 9th or early 8th century BC.

Guobiao standards

GB standards are the Chinese national standards issued by the Standardization Administration of China (SAC), the Chinese National Committee of the ISO and IEC.

Hans Schjellerup

Hans Carl Frederik Christian Schjellerup (February 8, 1827 – November 13, 1887) was a Danish astronomer.

Hong Kong Supplementary Character Set

The Hong Kong Supplementary Character Set (commonly abbreviated to HKSCS) is a set of Chinese characters (4,702 in total in the initial release) used in Cantonese, as well as when writing the names of some places in Hong Kong (whether in written Cantonese or standard written Chinese sentences).

Hypertext Transfer Protocol

The Hypertext Transfer Protocol (HTTP) is an application protocol for distributed, collaborative, and hypermedia information systems.

IBM

The International Business Machines Corporation (IBM) is an American multinational technology company headquartered in Armonk, New York, United States, with operations in over 170 countries.

IBM 1401

The IBM 1401 is a variable wordlength decimal computer that was announced by IBM on October 5, 1959.

IBM 1620

The IBM 1620 was announced by IBM on October 21, 1959, and marketed as an inexpensive "scientific computer".

IBM 700/7000 series

The IBM 700/7000 series is a series of large-scale (mainframe) computer systems that were made by IBM through the 1950s and early 1960s.

Iconv

In Unix-like operating systems, iconv (an abbreviation of internationalization conversion) is a command-line program and a standardized application programming interface (API) used to convert between different character encodings.

Indian Script Code for Information Interchange

Indian Script Code for Information Interchange (ISCII) is a coding scheme for representing various writing systems of India.

Integer

An integer (from the Latin integer, meaning "whole") is a number that can be written without a fractional component. The word's first literal meaning in Latin is "untouched", from in ("not") plus tangere ("to touch").

International Components for Unicode

International Components for Unicode (ICU) is an open source project of mature C/C++ and Java libraries for Unicode support, software internationalization, and software globalization.

International maritime signal flags

International maritime signal flags refers to various flags used to communicate with ships.

ISO/IEC 2022

ISO/IEC 2022, Information technology — Character code structure and extension techniques, is an ISO standard (equivalent to the ECMA standard ECMA-35).

ISO/IEC 646

ISO/IEC 646 is the name of a set of ISO standards, described as Information technology — ISO 7-bit coded character set for information interchange and developed in cooperation with ASCII at least since 1964.

ISO/IEC 6937

ISO/IEC 6937:2001, Information technology — Coded graphic character set for text communication — Latin alphabet, is a multibyte extension of ASCII, or rather of ISO/IEC 646-IRV.

ISO/IEC 8859

ISO/IEC 8859 is a joint ISO and IEC series of standards for 8-bit character encodings.

ISO/IEC 8859-1

ISO/IEC 8859-1:1998, Information technology — 8-bit single-byte coded graphic character sets — Part 1: Latin alphabet No. 1, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings.

ISO/IEC 8859-10

ISO/IEC 8859-10:1998, Information technology — 8-bit single-byte coded graphic character sets — Part 10: Latin alphabet No. 6, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings.

ISO/IEC 8859-11

ISO/IEC 8859-11:2001, Information technology — 8-bit single-byte coded graphic character sets — Part 11: Latin/Thai alphabet, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings, first edition published in 2001.

ISO/IEC 8859-13

ISO/IEC 8859-13:1998, Information technology — 8-bit single-byte coded graphic character sets — Part 13: Latin alphabet No. 7, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings.

ISO/IEC 8859-14

ISO/IEC 8859-14:1998, Information technology — 8-bit single-byte coded graphic character sets — Part 14: Latin alphabet No. 8 (Celtic), is part of the ISO/IEC 8859 series of ASCII-based standard character encodings.

ISO/IEC 8859-15

ISO/IEC 8859-15:1999, Information technology — 8-bit single-byte coded graphic character sets — Part 15: Latin alphabet No. 9, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings.

ISO/IEC 8859-16

ISO/IEC 8859-16:2001, Information technology — 8-bit single-byte coded graphic character sets — Part 16: Latin alphabet No. 10, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings.

ISO/IEC 8859-2

ISO/IEC 8859-2:1999, Information technology — 8-bit single-byte coded graphic character sets — Part 2: Latin alphabet No. 2, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings.

ISO/IEC 8859-3

ISO/IEC 8859-3:1999, Information technology — 8-bit single-byte coded graphic character sets — Part 3: Latin alphabet No. 3, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings.

ISO/IEC 8859-4

ISO/IEC 8859-4:1998, Information technology — 8-bit single-byte coded graphic character sets — Part 4: Latin alphabet No. 4, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings.

ISO/IEC 8859-5

ISO/IEC 8859-5:1999, Information technology — 8-bit single-byte coded graphic character sets — Part 5: Latin/Cyrillic alphabet, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings, first edition published in 1988.

ISO/IEC 8859-6

ISO/IEC 8859-6:1999, Information technology — 8-bit single-byte coded graphic character sets — Part 6: Latin/Arabic alphabet, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings, first edition published in 1987.

ISO/IEC 8859-7

ISO/IEC 8859-7:2003, Information technology — 8-bit single-byte coded graphic character sets — Part 7: Latin/Greek alphabet, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings, first edition published in 1987.

ISO/IEC 8859-8

ISO/IEC 8859-8, Information technology — 8-bit single-byte coded graphic character sets — Part 8: Latin/Hebrew alphabet, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings.

ISO/IEC 8859-9

ISO/IEC 8859-9:1999, Information technology — 8-bit single-byte coded graphic character sets — Part 9: Latin alphabet No. 5, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings.

JIS X 0208

JIS X 0208 is a 2-byte character set specified as a Japanese Industrial Standard, containing 6879 graphic characters suitable for writing text, place names, personal names, and so forth in the Japanese language.

JIS X 0213

JIS X 0213 is a Japanese Industrial Standard defining coded character sets for encoding the characters used in Japan.

KOI-7

KOI-7 (КОИ-7) is a 7-bit character encoding, designed to cover Russian, which uses the Cyrillic alphabet.

KOI8-R

KOI8-R (RFC 1489) is an 8-bit character encoding, designed to cover Russian, which uses a Cyrillic alphabet.

KOI8-U

KOI8-U (RFC 2319) is an 8-bit character encoding, designed to cover Ukrainian, which uses a Cyrillic alphabet.

KS X 1001

KS X 1001 (Korean Graphic Character Set for Information Interchange), formerly called KS C 5601, is a South Korean coded character set standard to represent hangul and hanja characters on a computer.

Latin alphabet

The Latin alphabet or the Roman alphabet is a writing system originally used by the ancient Romans to write the Latin language.

Legacy system

In computing, a legacy system is an old method, technology, computer system, or application program, "of, relating to, or being a previous or outdated computer system." Often a pejorative term, referencing a system as "legacy" means that it paved the way for the standards that would follow it.

Luit

luit is a utility program used to translate the character set of a computer program so that its output can be displayed correctly on a terminal emulator that uses a different character set.

Mac OS Roman

Mac OS Roman is a character encoding primarily used by the classic Mac OS to represent text.

Microsoft Windows

Microsoft Windows is a group of several graphical operating system families, all of which are developed, marketed, and sold by Microsoft.

MIK (character set)

MIK (МИК) is an 8-bit Cyrillic code page used with DOS.

MIME

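Multipurpose Internet Mail Extensions (MIME) is an Internet standard that extends the format of email to support text in character sets other than ASCII, non-text attachments, message bodies with multiple parts, and header information in non-ASCII character sets.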

Mojibake

Mojibake (文字化け) is the garbled text that is the result of text being decoded using an unintended character encoding.
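
The effect is easy to reproduce. In this sketch, text encoded as UTF-8 is decoded as ISO/IEC 8859-1 (Python's `latin-1` codec), which was not the intended encoding. The sample word is an arbitrary choice.

```python
original = "caf\u00e9"                    # "café"

wire = original.encode("utf-8")           # the accented letter becomes two bytes
garbled = wire.decode("latin-1")          # wrong decoder: each byte becomes a character
print(garbled)

# Reversing the wrong step recovers the text, because latin-1 maps
# every byte to exactly one character and back without loss.
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired)
```

The repair works here only because the wrong decoder was lossless. With other mismatched encodings some bytes may be dropped or replaced, and the original cannot always be recovered.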

Mojikyo

Mojikyo is a set of computer software and fonts for enhanced logogram word-processing.

Morse code

Morse code is a method of transmitting text information as a series of on-off tones, lights, or clicks that can be directly understood by a skilled listener or observer without special equipment.

Mozilla

Mozilla (stylized as moz://a) is a free software community founded in 1998 by members of Netscape.

Number

A number is a mathematical object used to count, measure and also label.

Octet (computing)

The octet is a unit of digital information in computing and telecommunications that consists of eight bits.

Plane (Unicode)

In the Unicode standard, a plane is a continuous group of 65,536 (2^16) code points.
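
Because each plane holds 65,536 code points, the plane number of a character is its code point divided by 65,536 with the remainder discarded. A minimal sketch follows; the helper name `plane_of` and the sample characters are assumptions for illustration.

```python
PLANE_SIZE = 1 << 16  # 65,536 code points per plane

def plane_of(ch: str) -> int:
    """Return the number of the Unicode plane containing the character."""
    return ord(ch) // PLANE_SIZE

samples = ("A", "\U0001F600", "\U00020000")
planes = [plane_of(ch) for ch in samples]
print(planes)

# With the highest code point at 0x10FFFF, the code space spans this many planes.
total_planes = 0x10FFFF // PLANE_SIZE + 1
print(total_planes)
```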

Punycode

Punycode is a representation of Unicode with the limited ASCII character subset used for Internet host names.
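
Python's standard library ships a `punycode` codec and an `idna` codec that applies Punycode to host name labels, so the mapping can be tried directly. The label and the `.example` host name below are illustrative choices, not taken from the entry.

```python
label = "b\u00fccher"                        # "bücher"

encoded = label.encode("punycode")           # ASCII-only representation
decoded = encoded.decode("punycode")         # back to the Unicode label
print(encoded, decoded)

# The idna codec adds the "xn--" prefix used in Internet host names.
hostname = "b\u00fccher.example".encode("idna")
print(hostname)
```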

Shift JIS

Shift JIS (Shift Japanese Industrial Standards, also SJIS, MIME name Shift_JIS) is a character encoding for the Japanese language, originally developed by a Japanese company called ASCII Corporation in conjunction with Microsoft and standardized as JIS X 0208 Appendix 1.

SIL International

SIL International (formerly known as the Summer Institute of Linguistics) is a U.S.-based, worldwide, Christian non-profit organization, whose main purpose is to study, develop and document languages, especially those that are lesser-known, in order to expand linguistic knowledge, promote literacy, translate the Christian Bible into local languages, and aid minority language development.

Standard Compression Scheme for Unicode

The Standard Compression Scheme for Unicode (SCSU) is a Unicode Technical Standard for reducing the number of bytes needed to represent Unicode text, especially if that text uses mostly characters from one or a small number of per-language character blocks.

String (computer science)

In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable.

New!!: Character encoding and String (computer science) · See more »

Tamil Script Code for Information Interchange

Tamil Script Code for Information Interchange (TSCII) is a coding scheme for representing the Tamil script.

New!!: Character encoding and Tamil Script Code for Information Interchange · See more »

Telegraph key

A telegraph key is a switching device used primarily to send Morse code.

New!!: Character encoding and Telegraph key · See more »

Telegraphy

Telegraphy (from Greek: τῆλε têle, "at a distance" and γράφειν gráphein, "to write") is the long-distance transmission of textual or symbolic (as opposed to verbal or audio) messages without the physical exchange of an object bearing the message.

New!!: Character encoding and Telegraphy · See more »

Transcoding

Transcoding is the direct digital-to-digital conversion of one encoding to another, such as for movie data files (e.g., PAL, SECAM, NTSC), audio files (e.g., MP3, WAV), or character encoding (e.g., UTF-8, ISO/IEC 8859).
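
For the character encoding case, transcoding can be sketched as decoding bytes from the source encoding and re-encoding them in the target one. The function name `transcode` is hypothetical; the sample text is only an illustration.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Convert encoded text from one character encoding to another."""
    # Decode to an abstract string first, then encode in the target encoding.
    return data.decode(source).encode(target)


latin1_bytes = "caf\u00e9".encode("iso-8859-1")  # "café", 4 bytes in ISO/IEC 8859-1
utf8_bytes = transcode(latin1_bytes, "iso-8859-1", "utf-8")

print(latin1_bytes)  # b'caf\xe9'
print(utf8_bytes)    # b'caf\xc3\xa9'
```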

New!!: Character encoding and Transcoding · See more »

TRON (encoding)

TRON Code is a multi-byte character encoding used in the TRON project.

New!!: Character encoding and TRON (encoding) · See more »

Typographic ligature

In writing and typography, a ligature occurs where two or more graphemes or letters are joined as a single glyph.

New!!: Character encoding and Typographic ligature · See more »

Unicode

Unicode is a computing industry standard for the consistent encoding, representation, and handling of text expressed in most of the world's writing systems.

New!!: Character encoding and Unicode · See more »

Universal Character Set characters

No description.

New!!: Character encoding and Universal Character Set characters · See more »

Universal Coded Character Set

The Universal Coded Character Set (UCS) is a standard set of characters defined by the International Standard ISO/IEC 10646, Information technology — Universal Coded Character Set (UCS) (plus amendments to that standard), which is the basis of many character encodings.

New!!: Character encoding and Universal Coded Character Set · See more »

Unix-like

A Unix-like (sometimes referred to as UN*X or *nix) operating system is one that behaves in a manner similar to a Unix system, while not necessarily conforming to or being certified to any version of the Single UNIX Specification.

New!!: Character encoding and Unix-like · See more »

UTF-16

UTF-16 (16-bit Unicode Transformation Format) is a character encoding capable of encoding all 1,112,064 valid code points of Unicode.
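
The figure of 1,112,064 follows from the 0x110000 code points of the Unicode code space minus the 2,048 surrogate values. As a sketch, code points above U+FFFF are written as a pair of 16-bit surrogates; the helper name `to_surrogates` is hypothetical.

```python
from typing import Tuple


def to_surrogates(code_point: int) -> Tuple[int, int]:
    """Split a supplementary code point (U+10000 to U+10FFFF) into a surrogate pair."""
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)   # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)  # bottom 10 bits of the offset
    return high, low


high, low = to_surrogates(0x1F600)
print(hex(high), hex(low))  # 0xd83d 0xde00

# The built-in codec produces the same two 16-bit units.
print("\U0001F600".encode("utf-16-be").hex())  # d83dde00
```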

New!!: Character encoding and UTF-16 · See more »

UTF-32

UTF-32 stands for Unicode Transformation Format in 32 bits.
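
A short sketch of the fixed-width property that the 32-bit format implies: every code point occupies exactly four bytes. The sample string is only an illustration.

```python
text = "a\u20ac\U0001F600"  # "a", the euro sign, and an emoji

# "utf-32-be" is big-endian UTF-32 without a byte order mark.
encoded = text.encode("utf-32-be")

print(len(text), len(encoded))  # 3 code points, 12 bytes
print(encoded[8:].hex())        # 0001f600: the code point value itself
```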

New!!: Character encoding and UTF-32 · See more »

UTF-8

UTF-8 is a variable width character encoding capable of encoding all 1,112,064 valid code points in Unicode using one to four 8-bit bytes.
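
The one-to-four-byte range can be observed directly with a few sample characters (chosen here only for illustration):

```python
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]  # A, é, €, and an emoji

# Number of 8-bit bytes UTF-8 needs for each sample character.
lengths = [len(ch.encode("utf-8")) for ch in samples]
print(lengths)  # [1, 2, 3, 4]

for ch in samples:
    print(f"U+{ord(ch):04X}", ch.encode("utf-8").hex())
```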

New!!: Character encoding and UTF-8 · See more »

Variable-width encoding

A variable-width encoding is a type of character encoding scheme in which codes of differing lengths are used to encode a character set (a repertoire of symbols) for representation in a computer.
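
As an illustration, Shift JIS is one such encoding: ASCII letters take one byte while kana take two. This sketch assumes Python's built-in `shift_jis` codec; the sample characters are arbitrary.

```python
ascii_char = "a"
kana_char = "\u3042"  # HIRAGANA LETTER A

one = ascii_char.encode("shift_jis")
two = kana_char.encode("shift_jis")

print(len(one), one.hex())  # 1 61
print(len(two), two.hex())  # 2 82a0

# A mixed string therefore has more bytes than characters.
mixed = (ascii_char + kana_char).encode("shift_jis")
print(len(ascii_char + kana_char), len(mixed))  # 2 3
```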

New!!: Character encoding and Variable-width encoding · See more »

VSCII

VSCII, also known as TCVN 5712:1993, ISO-IR-180, and Vietnamese Standard Code for Information Interchange, is a set of three Vietnamese national standard character encodings for using the Vietnamese language with computers.

New!!: Character encoding and VSCII · See more »

Web browser

A web browser (commonly referred to as a browser) is a software application for accessing information on the World Wide Web.

New!!: Character encoding and Web browser · See more »

Windows code page

Windows code pages are sets of characters or code pages (known as character encodings in other operating systems) used in Microsoft Windows from the 1980s and 1990s.

New!!: Character encoding and Windows code page · See more »

Windows-1250

Windows-1250 is a code page used under Microsoft Windows to represent texts in Central European and Eastern European languages that use Latin script, such as Polish, Czech, Slovak, Hungarian, Slovene, Bosnian, Croatian, Serbian (Latin script), Romanian (before 1993 spelling reform) and Albanian.

New!!: Character encoding and Windows-1250 · See more »

Windows-1251

Windows-1251 is an 8-bit character encoding, designed to cover languages that use the Cyrillic script such as Russian, Bulgarian, Serbian Cyrillic and other languages.

New!!: Character encoding and Windows-1251 · See more »

Windows-1252

Windows-1252 or CP-1252 (code page 1252) is a single-byte character encoding of the Latin alphabet, used by default in the legacy components of Microsoft Windows in English and some other Western languages (other languages use different default encodings).
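
A brief sketch of what a single-byte encoding means in practice, assuming Python's built-in `cp1252` and `latin-1` codecs: each byte maps to one character. Windows-1252 is often confused with ISO/IEC 8859-1; the two agree on bytes 0xA0 to 0xFF but differ in the range 0x80 to 0x9F, where Windows-1252 assigns printable characters.

```python
byte = b"\x80"

# Windows-1252 assigns the euro sign to 0x80; ISO/IEC 8859-1 has a
# control character there.
as_cp1252 = byte.decode("cp1252")
as_latin1 = byte.decode("latin-1")
print(repr(as_cp1252), repr(as_latin1))  # '€' '\x80'

# The upper range 0xA0 to 0xFF decodes identically in both encodings.
upper = bytes(range(0xA0, 0x100))
print(upper.decode("cp1252") == upper.decode("latin-1"))  # True
```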

New!!: Character encoding and Windows-1252 · See more »

Windows-1253

Windows-1253 is a Windows code page used to write modern Greek.

New!!: Character encoding and Windows-1253 · See more »

Windows-1254

Windows-1254 is a code page used under Microsoft Windows to write Turkish.

New!!: Character encoding and Windows-1254 · See more »

Windows-1255

Windows-1255 is a code page used under Microsoft Windows to write Hebrew.

New!!: Character encoding and Windows-1255 · See more »

Windows-1256

Windows-1256 is a code page used to write Arabic (and possibly some other languages that use Arabic script, like Persian and Urdu) under Microsoft Windows. This code page is not compatible with ISO 8859-6 and MacArabic encodings.

New!!: Character encoding and Windows-1256 · See more »

Windows-1257

Windows-1257 (Windows Baltic) is a single byte code page used to support the Estonian, Latvian and Lithuanian languages under Microsoft Windows.

New!!: Character encoding and Windows-1257 · See more »

Windows-1258

Windows-1258 is a code page used in Microsoft Windows to represent Vietnamese texts.

New!!: Character encoding and Windows-1258 · See more »

Writing system

A writing system is any conventional method of visually representing verbal communication.

New!!: Character encoding and Writing system · See more »

XML

In computing, Extensible Markup Language (XML) is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable.

New!!: Character encoding and XML · See more »

Redirects here:

CDRA, Character Data Representation Architecture, Character Set, Character code, Character coding, Character coding system, Character encoding form, Character encoding scheme, Character encoding system, Character encodings, Character repertoire, Character set, Character sets, Charset, Charsets, Code character, Code unit, Coded Character Set, Coded character, Coded character set, Codeset, Convmv, File encoding, File encodings, IBM CDRA, IBM Character Data Representation Architecture, International character set, Legacy character set, Legacy encoding, Symbol set, Text encoding, Text encodings.

References

[1] https://en.wikipedia.org/wiki/Character_encoding
