Table of Contents
460 relations: Abugida, Acute accent, Adlam script, Adobe Inc., African Reference Alphabet, Ahom script, Alchemical symbol, Allograph, Alphabet, Alphabetic Presentation Forms, Anatolian hieroglyphs, Ancient North Arabian, Ancient South Arabian script, ANSI escape code, Apple Advanced Typography, Apple Inc., Apple Type Services for Unicode Imaging, April Fools' Day Request for Comments, Arabic script, Arabic script in Unicode, Armenian alphabet, Arrows (Unicode block), ASCII, Avestan alphabet, Azerbaijani alphabet, ß, İ, Balinese script, Bamum script, Base64, Basic Latin (Unicode block), Bassa Vah alphabet, Batak script, Baybayin, Bengali alphabet, Bhaiksuki script, Bidirectional text, Big5, Binary Ordered Compression for Unicode, Bitcoin, Block Elements, Bob Belleville, Bopomofo, Box Drawing, Brahmi script, Brahmic scripts, Braille, Buhid script, Burmese alphabet, Burmese numerals, ...
Abugida
An abugida (from Ge'ez: አቡጊዳ), sometimes also called alphasyllabary, neosyllabary, or pseudo-alphabet, is a segmental writing system in which consonant–vowel sequences are written as units; each unit is based on a consonant letter, and vowel notation is secondary, similar to a diacritical mark.
Acute accent
The acute accent (U+0301, combining acute accent) is a diacritic used in many modern written languages with alphabets based on the Latin, Cyrillic, and Greek scripts.
Adlam script
The Adlam script is a script used to write Fulani.
Adobe Inc.
Adobe Inc., formerly Adobe Systems Incorporated, is an American computer software company based in San Jose, California.
African Reference Alphabet
The African Reference Alphabet is a largely defunct continent-wide guideline for the creation of Latin alphabets for African languages.
See Unicode and African Reference Alphabet
Ahom script
The Ahom script or Tai Ahom script is an abugida used to write the Ahom language, a dormant Tai language now undergoing revival. The language was spoken until the late 18th century by the Ahom people, who established the Ahom kingdom and ruled the eastern part of the Brahmaputra valley between the 13th and the 18th centuries.
Alchemical symbol
Alchemical symbols were used to denote chemical elements and compounds, as well as alchemical apparatus and processes, until the 18th century.
See Unicode and Alchemical symbol
Allograph
In graphemics and typography, the term allograph is used of a glyph that is a design variant of a grapheme, such as a letter, a number, an ideograph, a punctuation mark or other typographic symbol.
Alphabet
An alphabet is a standard set of letters written to represent particular sounds in a spoken language.
Alphabetic Presentation Forms
Alphabetic Presentation Forms is a Unicode block containing standard ligatures for the Latin, Armenian, and Hebrew scripts.
See Unicode and Alphabetic Presentation Forms
Anatolian hieroglyphs
Anatolian hieroglyphs are an indigenous logographic script native to central Anatolia, consisting of some 500 signs.
See Unicode and Anatolian hieroglyphs
Ancient North Arabian
Ancient North Arabian (ANA) is a collection of scripts and a language or family of languages under the North Arabian languages branch along with Old Arabic that were used in north and central Arabia and south Syria from the 8th century BCE to the 4th century CE.
See Unicode and Ancient North Arabian
Ancient South Arabian script
The Ancient South Arabian script (Old South Arabian: 𐩣𐩯𐩬𐩵; modern الْمُسْنَد) branched from the Proto-Sinaitic script in about the late 2nd millennium BCE.
See Unicode and Ancient South Arabian script
ANSI escape code
ANSI escape sequences are a standard for in-band signaling to control cursor location, color, font styling, and other options on video text terminals and terminal emulators.
See Unicode and ANSI escape code
Apple Advanced Typography
Apple Advanced Typography (AAT) is Apple Inc.'s computer technology for advanced font rendering, supporting internationalization and complex features for typographers, a successor to Apple's little-used QuickDraw GX font technology of the mid-1990s.
See Unicode and Apple Advanced Typography
Apple Inc.
Apple Inc. is an American multinational corporation and technology company headquartered in Cupertino, California, in Silicon Valley.
Apple Type Services for Unicode Imaging
The Apple Type Services for Unicode Imaging (ATSUI) is the set of services for rendering Unicode-encoded text introduced in Mac OS 8.5 and carried forward into Mac OS X. It replaced the WorldScript engine for legacy encodings.
See Unicode and Apple Type Services for Unicode Imaging
April Fools' Day Request for Comments
A Request for Comments (RFC), in the context of Internet governance, is a type of publication from the Internet Engineering Task Force (IETF) and the Internet Society (ISOC), usually describing methods, behaviors, research, or innovations applicable to the working of the Internet and Internet-connected systems.
See Unicode and April Fools' Day Request for Comments
Arabic script
The Arabic script is the writing system used for Arabic and several other languages of Asia and Africa.
Arabic script in Unicode
Many scripts in Unicode, such as Arabic, have special orthographic rules that require certain combinations of letterforms to be combined into special ligature forms.
See Unicode and Arabic script in Unicode
Armenian alphabet
The Armenian alphabet (Հայոց գրեր, Hayocʼ grer or Հայոց այբուբեն, Hayocʼ aybuben) or, more broadly, the Armenian script, is an alphabetic writing system developed for Armenian and occasionally used to write other languages.
See Unicode and Armenian alphabet
Arrows (Unicode block)
Arrows is a Unicode block containing line, curve, and semicircle symbols terminating in barbs or arrows.
See Unicode and Arrows (Unicode block)
ASCII
ASCII, an acronym for American Standard Code for Information Interchange, is a character encoding standard for electronic communication. Both Unicode and ASCII are character encoding standards.
Avestan alphabet
The Avestan alphabet (Avestan: 𐬛𐬍𐬥 𐬛𐬀𐬠𐬌𐬭𐬫𐬵, transliteration: dīn dabiryªh; Middle Persian transliteration: dyn' dpywryh, transcription: dēn dēbīrē) is a writing system developed during Iran's Sasanian era (226–651 CE) to render the Avestan language.
See Unicode and Avestan alphabet
Azerbaijani alphabet
The Azerbaijani alphabet (Azərbaycan əlifbası, آذربایجان اَلیفباسؽ, Азəрбајҹан әлифбасы) has three versions, based on the Arabic, Latin, and Cyrillic alphabets.
See Unicode and Azerbaijani alphabet
ß
In German orthography, the letter ß, called Eszett or scharfes S ("sharp S"), represents the /s/ phoneme in Standard German when following long vowels and diphthongs.
See Unicode and ß
İ
İ, or i, called dotted I or i-dot, is a letter used in the Latin-script alphabets of Azerbaijani, Crimean Tatar, Gagauz, Kazakh, Tatar, and Turkish.
See Unicode and İ
Balinese script
The Balinese script, natively known as Aksarä Bali and Hanacaraka, is an abugida used on the island of Bali, Indonesia, commonly for writing the Austronesian Balinese language, Old Javanese, and the liturgical language Sanskrit.
See Unicode and Balinese script
Bamum script
The Bamum scripts are an evolutionary series of six scripts created for the Bamum language by Ibrahim Njoya, King of Bamum (now western Cameroon). Unicode and the Bamum script both relate to digital typography.
Base64
In computer programming, Base64 is a group of binary-to-text encoding schemes that transforms binary data into a sequence of printable characters, limited to a set of 64 unique characters.
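A minimal sketch of the idea, assuming Python's standard `base64` module and UTF-8 as the text encoding (both are illustrative choices, not something this entry prescribes). The helper names `encode_text` and `decode_text` are hypothetical.

```python
import base64
import string

# The 64 characters of the standard Base64 alphabet, plus "=" used for padding.
BASE64_ALPHABET = set(string.ascii_letters + string.digits + "+/=")


def encode_text(text: str) -> str:
    """Encode text as UTF-8 bytes, then represent those bytes in Base64."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def decode_text(encoded: str) -> str:
    """Reverse encode_text: Base64 characters -> bytes -> UTF-8 text."""
    return base64.b64decode(encoded).decode("utf-8")


encoded = encode_text("ß")  # "ß" is the two bytes C3 9F in UTF-8
print(encoded)
print(decode_text(encoded))
```

Every character of the encoded form comes from the fixed printable alphabet, which is why the result can pass through channels that only handle plain ASCII text.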
Basic Latin (Unicode block)
The Basic Latin Unicode block, sometimes informally called C0 Controls and Basic Latin, is the first block of the Unicode standard, and the only block which is encoded in one byte in UTF-8.
See Unicode and Basic Latin (Unicode block)
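The one-byte property can be checked directly. The following is a small sketch using Python's built-in UTF-8 codec; the function name `utf8_length` is hypothetical.

```python
def utf8_length(ch: str) -> int:
    """Return how many bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


# Every code point in Basic Latin (U+0000 to U+007F) takes exactly one byte.
basic_latin_is_single_byte = all(utf8_length(chr(cp)) == 1 for cp in range(0x80))
print(basic_latin_is_single_byte)

# Characters outside the block need two or more bytes.
for ch in ["A", "ß", "€", "😀"]:
    print(f"U+{ord(ch):04X} -> {utf8_length(ch)} byte(s)")
```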
Bassa Vah alphabet
Bassa Vah, also known as simply Vah ('throwing a sign' in Bassa), is an alphabetic script for writing the Bassa language of Liberia.
See Unicode and Bassa Vah alphabet
Batak script
The Batak script (natively known as Surat Batak, Surat na Sampulu Sia ("the nineteen letters"), or Sisiasia) is a writing system used to write the Austronesian Batak languages spoken by several million people on the Indonesian island of Sumatra.
Baybayin
Baybayin (also formerly known as alibata) is a Philippine script.
Bengali alphabet
The Bengali script or Bangla alphabet (Bangla bôrṇômala; বেঙ্গলি ময়েক, Bengali mayek) is the alphabet, based on the Bengali-Assamese script, that is used to write the Bengali language; it has historically been used to write Sanskrit within Bengal.
See Unicode and Bengali alphabet
Bhaiksuki script
Bhaiksuki (Sanskrit: भैक्षुकी, Bhaiksuki) is a Brahmi-based script that was used around the 11th and 12th centuries CE.
See Unicode and Bhaiksuki script
Bidirectional text
A bidirectional text contains two text directionalities, right-to-left (RTL) and left-to-right (LTR). Unicode and bidirectional text both relate to character encoding.
See Unicode and Bidirectional text
Big5
Big-5 or Big5 is a Chinese character encoding method used in Taiwan, Hong Kong, and Macau for traditional Chinese characters.
See Unicode and Big5
Binary Ordered Compression for Unicode
Binary Ordered Compression for Unicode (BOCU) is a MIME-compatible Unicode compression scheme.
See Unicode and Binary Ordered Compression for Unicode
Bitcoin
Bitcoin (abbreviation: BTC; sign: ₿) is the first decentralized cryptocurrency.
Block Elements
Block Elements is a Unicode block containing square block symbols of various fill and shading.
See Unicode and Block Elements
Bob Belleville
Robert L. Belleville is an American computer engineer who was an early head of engineering at Apple from 1982 until 1985.
See Unicode and Bob Belleville
Bopomofo
Bopomofo, also called Zhuyin Fuhao, or simply Zhuyin, is a transliteration system for Standard Chinese and other Sinitic languages.
Box Drawing
Box Drawing is a Unicode block containing characters for compatibility with legacy graphics standards that contained characters for making bordered charts and tables, i.e. box-drawing characters.
Brahmi script
Brahmi (ISO: Brāhmī) is a writing system of ancient India.
Brahmic scripts
The Brahmic scripts, also known as Indic scripts, are a family of abugida writing systems.
See Unicode and Brahmic scripts
Braille
Braille is a tactile writing system used by people who are visually impaired. Unicode and Braille both relate to character encoding and digital typography.
Buhid script
Surat Buhid is an abugida used to write the Buhid language.
Burmese alphabet
The Burmese alphabet (မြန်မာအက္ခရာ myanma akkha.ya) is an abugida used for writing Burmese.
See Unicode and Burmese alphabet
Burmese numerals
Burmese numerals (မြန်မာ ကိန်းဂဏန်းများ) are a set of numerals traditionally used in the Burmese language, although Arabic numerals are also used.
See Unicode and Burmese numerals
Byte
The byte is a unit of digital information that most commonly consists of eight bits.
See Unicode and Byte
Byte order mark
The byte-order mark (BOM) is a particular usage of the special Unicode character code, U+FEFF, whose appearance as a magic number at the start of a text stream can signal several things to a program reading the text.
See Unicode and Byte order mark
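One thing a leading BOM (the character U+FEFF) can signal is which Unicode encoding and byte order a stream uses. The sketch below assumes the BOM constants from Python's standard `codecs` module; the function name `detect_bom` and the returned labels are hypothetical, and a real reader would need a fallback for streams that carry no BOM.

```python
import codecs
from typing import Optional

# UTF-32 must be checked before UTF-16, because the UTF-32 little-endian BOM
# (FF FE 00 00) begins with the UTF-16 little-endian BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data: bytes) -> Optional[str]:
    """Return the encoding suggested by a leading BOM, or None if absent."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


sample = "\ufeffhi".encode("utf-16-le")  # U+FEFF followed by "hi"
print(detect_bom(sample))
print(detect_bom(b"plain bytes"))
```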
Byzantine music
Byzantine music (Vyzantiné mousiké) originally consisted of the songs and hymns composed for the courtly and religious ceremonial of the Byzantine Empire and continued, after the fall of Constantinople in 1453, in the traditions of the sung Byzantine chant of Eastern Orthodox liturgy.
See Unicode and Byzantine music
C0 and C1 control codes
The C0 and C1 control code or control character sets define control codes for use in text by computer systems that use ASCII and derivatives of ASCII.
See Unicode and C0 and C1 control codes
Canadian Aboriginal syllabics
Canadian syllabic writing, or simply syllabics, is a family of writing systems used in a number of Indigenous Canadian languages of the Algonquian, Inuit, and (formerly) Athabaskan language families.
See Unicode and Canadian Aboriginal syllabics
Carian alphabets
The Carian alphabets are a number of regional scripts used to write the Carian language of western Anatolia.
See Unicode and Carian alphabets
Caucasian Albanian script
The Caucasian Albanian script was an alphabetic writing system used by the Caucasian Albanians, one of the ancient Northeast Caucasian peoples whose territory comprised parts of the present-day Republic of Azerbaijan and Dagestan.
See Unicode and Caucasian Albanian script
Chakma script
The Chakma script, also called Ajhā pāṭh, Ojhapath, Ojhopath, or Aaojhapath, is an abugida used for the Chakma language, and recently for the Pali language.
Cham script
The Cham script is a Brahmic abugida used to write Cham, an Austronesian language spoken by some 245,000 Chams in Vietnam and Cambodia.
Character encoding
Character encoding is the process of assigning numbers to graphical characters, especially the written characters of human language, allowing them to be stored, transmitted, and transformed using digital computers. Unicode and character encoding both relate to digital typography.
See Unicode and Character encoding
Charis SIL
Charis SIL is a slab serif typeface developed by SIL International based on Bitstream Charter, one of the first fonts designed for laser printers.
Cherokee syllabary
The Cherokee syllabary is a syllabary invented by Sequoyah in the late 1810s and early 1820s to write the Cherokee language.
See Unicode and Cherokee syllabary
Chinese Character Code for Information Interchange
The Chinese Character Code for Information Interchange or CCCII is a character set developed by the Chinese Character Analysis Group in Taiwan. Unicode and the Chinese Character Code for Information Interchange both relate to character encoding.
See Unicode and Chinese Character Code for Information Interchange
Chinese character description languages
Several systems have been proposed for describing the internal structure of Chinese characters, including their strokes, components, and the stroke order, and the location of each in the character's ideal square.
See Unicode and Chinese character description languages
Chinese character radicals
A radical, or indexing component, is a visually prominent component of a Chinese character under which the character is traditionally listed in a Chinese dictionary.
See Unicode and Chinese character radicals
Chinese characters
Chinese characters are logographs used to write the Chinese languages and others from regions historically influenced by Chinese culture.
See Unicode and Chinese characters
CJK characters
In internationalization, CJK characters is a collective term for graphemes used in the Chinese, Japanese, and Korean writing systems, which each include Chinese characters.
See Unicode and CJK characters
CJK Radicals Supplement
CJK Radicals Supplement is a Unicode block containing alternative, often positional, forms of the Kangxi radicals.
See Unicode and CJK Radicals Supplement
CJK Unified Ideographs
The Chinese, Japanese and Korean (CJK) scripts share a common background, collectively known as CJK characters.
See Unicode and CJK Unified Ideographs
Cocoa text system
The Cocoa text system (formerly known simply by the primary class name NSText) is the linked network of classes, protocols, interfaces and objects that provide typography and text-field editing capabilities to Cocoa applications on Apple's macOS, where it is the primary text-handling system.
See Unicode and Cocoa text system
Code page
In computing, a code page is a character encoding: a specific association of a set of printable characters and control characters with unique numbers. Unicode and code pages both relate to character encoding.
Code point
A code point, codepoint or code position is a particular position in a table, where the position has been assigned a meaning. Unicode and code points both relate to character encoding.
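As an illustration, Python's built-in `ord` and `chr` convert between a character and its Unicode code point. The sketch below formats code points in the conventional `U+XXXX` hexadecimal notation; the helper names `code_point_label` and `char_from_label` are hypothetical.

```python
def code_point_label(ch: str) -> str:
    """Format a character's code point as U+ followed by at least four hex digits."""
    return f"U+{ord(ch):04X}"


def char_from_label(label: str) -> str:
    """Turn a label such as 'U+20BF' back into the character at that position."""
    return chr(int(label[2:], 16))


for ch in ["A", "ß", "₿", "😀"]:
    print(ch, code_point_label(ch))
```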
Combining character
In digital typography, combining characters are characters that are intended to modify other characters.
See Unicode and Combining character
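A brief sketch of how a combining character attaches to a base character, assuming Python's standard `unicodedata` module. The example pairs the letter "e" with U+0301 (combining acute accent) and then normalizes the pair to its precomposed form; `count_combining` is a hypothetical helper.

```python
import unicodedata

decomposed = "e\u0301"  # base letter "e" followed by a combining acute accent
composed = unicodedata.normalize("NFC", decomposed)  # single precomposed character


def count_combining(text: str) -> int:
    """Count characters whose canonical combining class is non-zero."""
    return sum(1 for ch in text if unicodedata.combining(ch) != 0)


print(len(decomposed), count_combining(decomposed))
print(len(composed), count_combining(composed))
```

The two strings usually render identically but compare unequal as raw sequences, which is one reason text is often normalized before comparison.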
Comparison of Unicode encodings
This article compares Unicode encodings in two types of environments: 8-bit-clean environments, and environments that forbid the use of byte values with the high bit set.
See Unicode and Comparison of Unicode encodings
ConScript Unicode Registry
The ConScript Unicode Registry is a volunteer project to coordinate the assignment of code points in the Unicode Private Use Areas (PUA) for the encoding of artificial scripts, such as those for constructed languages.
See Unicode and ConScript Unicode Registry
Coptic script
The Coptic script is the script used for writing the Coptic language, the most recent development of Egyptian.
Core Text
Core Text is a Core Foundation style API in macOS, first introduced in Mac OS X 10.4 Tiger, made public in Mac OS X 10.5 Leopard, and introduced for the iPad with iPhone SDK 3.2.
COVID-19 pandemic
The COVID-19 pandemic (also known as the coronavirus pandemic), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), began with an outbreak of COVID-19 in Wuhan, China, in December 2019.
See Unicode and COVID-19 pandemic
Cuneiform
Cuneiform is a logo-syllabic writing system that was used to write several languages of the Ancient Near East.
Currency Symbols (Unicode block)
Currency Symbols is a Unicode block containing characters for representing unique monetary signs.
See Unicode and Currency Symbols (Unicode block)
Cypriot syllabary
The Cypriot or Cypriote syllabary (also Classical Cypriot Syllabary) is a syllabic script used in Iron Age Cyprus, from about the 11th to the 4th centuries BCE, when it was replaced by the Greek alphabet.
See Unicode and Cypriot syllabary
Cypro-Minoan syllabary
The Cypro-Minoan syllabary (CM), more commonly called the Cypro-Minoan Script, is an undeciphered syllabary used on the island of Cyprus and at its trading partners during the late Bronze Age and early Iron Age (c. 1550–1050 BC).
See Unicode and Cypro-Minoan syllabary
Cyrillic (Unicode block)
Cyrillic is a Unicode block containing the characters used to write the most widely used languages with a Cyrillic orthography.
See Unicode and Cyrillic (Unicode block)
Cyrillic script
The Cyrillic script, Slavonic script or simply Slavic script is a writing system used for various languages across Eurasia.
See Unicode and Cyrillic script
Dave Opstad
David G. Opstad is a retired American computer scientist who specialized in computer typography and information processing (focusing on character encodings), work that led to several breakthroughs.
Deseret alphabet
The Deseret alphabet is a phonemic English-language spelling reform developed between 1847 and 1854 by the board of regents of the University of Deseret under the leadership of Brigham Young, the second president of the Church of Jesus Christ of Latter-day Saints (LDS Church).
See Unicode and Deseret alphabet
Devanagari
Devanagari (देवनागरी) is an Indic script used in the northern Indian subcontinent.
Dhives Akuru
Dhives Akuru, later called Dhivehi Akuru (meaning Maldivian letters), is a script formerly used for the Maldivian language.
Diminishing returns
In economics, diminishing returns are the decrease in marginal (incremental) output of a production process as the amount of a single factor of production is incrementally increased, holding all other factors of production equal (ceteris paribus).
See Unicode and Diminishing returns
DIN 91379
The DIN standard DIN 91379: "Characters and defined character sequences in Unicode for the electronic processing of names and data exchange in Europe, with CD-ROM" defines a normative subset of Unicode Latin characters, sequences of base characters and diacritic signs, and special characters for use in names of persons, legal entities, products, addresses etc.
Dingbat
In typography, a dingbat (sometimes more formally known as a printer's ornament or printer's character) is an ornament, specifically, a glyph used in typesetting, often employed to create box frames (similar to box-drawing characters), or as a dinkus (section divider).
DirectWrite
DirectWrite is a text layout and glyph rendering API by Microsoft.
Dogri script
The Dogri script is a writing system originally used for writing the Dogri language in Jammu and Kashmir in the northern part of the Indian subcontinent.
Domain Name System
The Domain Name System (DNS) is a hierarchical and distributed name service that provides a naming system for computers, services, and other resources on the Internet or other Internet Protocol (IP) networks.
See Unicode and Domain Name System
Dominoes
Dominoes is a family of tile-based games played with gaming pieces.
Dot (diacritic)
When used as a diacritic mark, the term dot refers to the glyphs "combining dot above" (U+0307) and "combining dot below" (U+0323), which may be combined with some letters of the extended Latin alphabets.
See Unicode and Dot (diacritic)
Dotless I
I, or ı, called dotless i, is a letter used in the Latin-script alphabets of Azerbaijani, Crimean Tatar, Gagauz, Kazakh, Tatar and Turkish.
Duplicate characters in Unicode
Unicode has a certain amount of duplication of characters.
See Unicode and Duplicate characters in Unicode
Duployan shorthand
The Duployan shorthand, or Duployan stenography (Sténographie Duployé), was created by Father Émile Duployé in 1860 for writing French.
See Unicode and Duployan shorthand
E
E, or e, is the fifth letter and the second vowel letter of the Latin alphabet, used in the modern English alphabet, the alphabets of other western European languages and others worldwide.
See Unicode and E
EBCDIC
Extended Binary Coded Decimal Interchange Code (EBCDIC) is an eight-bit character encoding used mainly on IBM mainframe and IBM midrange computer operating systems.
Egyptian hieroglyphs
Egyptian hieroglyphs were the formal writing system used in Ancient Egypt for writing the Egyptian language.
See Unicode and Egyptian hieroglyphs
Elbasan alphabet
The Elbasan alphabet is a mid 18th-century alphabetic script created for the Albanian language Elbasan Gospel Manuscript, also known as the Anonimi i Elbasanit ("the Anonymous of Elbasan"), which is the only document written in it.
See Unicode and Elbasan alphabet
Elymaic
The Elymaic alphabet is a right-to-left, non-joining abjad.
Emoji
An emoji (plural emoji or emojis; 絵文字) is a pictogram, logogram, ideogram, or smiley embedded in text and used in electronic messages and web pages.
Emoticon
An emoticon, short for emotion icon, is a pictorial representation of a facial expression using characters (usually punctuation marks, numbers, and letters) to express a person's feelings, mood, or reaction, without needing to describe it in detail.
Endianness
In computing, endianness is the order in which bytes within a word of digital data are transmitted over a data communication medium or addressed (by rising addresses) in computer memory, counting only byte significance compared to earliness. The term was coined from Jonathan Swift's novel Gulliver's Travels.
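The effect of byte order on encoded text can be sketched with Python's built-in UTF-16 codecs. The helper name `utf16_bytes` is hypothetical; the example encodes the letter "A" (U+0041) in both orders.

```python
def utf16_bytes(ch: str, byteorder: str) -> bytes:
    """Encode one character as UTF-16 in the given byte order ('big' or 'little')."""
    codec = "utf-16-be" if byteorder == "big" else "utf-16-le"
    return ch.encode(codec)


big = utf16_bytes("A", "big")        # most significant byte first: 00 41
little = utf16_bytes("A", "little")  # least significant byte first: 41 00
print(big.hex(), little.hex())

# The same ordering applies to plain integers.
print((0x0041).to_bytes(2, "big").hex(), (0x0041).to_bytes(2, "little").hex())
```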
Euro sign
The euro sign is the currency sign used for the euro, the official currency of the eurozone and adopted, although not required to, by Kosovo and Montenegro.
European Committee for Standardization
The European Committee for Standardization (CEN, Comité Européen de Normalisation) is a public standards organization whose mission is to foster the economy of the European Single Market and the wider European continent in global trading, the welfare of European citizens and the environment by providing an efficient infrastructure to interested parties for the development, maintenance and distribution of coherent sets of standards and specifications.
See Unicode and European Committee for Standardization
Extended ASCII
Extended ASCII is a repertoire of character encodings that include (most of) the original 96-character ASCII set, plus up to 128 additional characters.
See Unicode and Extended ASCII
Extended Unix Code
Extended Unix Code (EUC) is a multibyte character encoding system used primarily for Japanese, Korean, and simplified Chinese (characters).
See Unicode and Extended Unix Code
ʼPhags-pa script
The Phagspa script or Phags-pa script is an alphabet designed by the Tibetan monk and State Preceptor (later Imperial Preceptor) Drogön Chögyal Phagpa (1235-1280) for Kublai Khan, the founder of the Yuan dynasty (1271-1368) in China, as a unified script for the written languages within the Yuan.
See Unicode and ʼPhags-pa script
Fallback font
A fallback font is a reserve typeface containing symbols for as many Unicode characters as possible.
File Transfer Protocol
The File Transfer Protocol (FTP) is a standard communication protocol used for the transfer of computer files from a server to a client on a computer network.
See Unicode and File Transfer Protocol
Fitzpatrick scale
The Fitzpatrick scale (also Fitzpatrick skin typing test; or Fitzpatrick phototyping scale) is a numerical classification schema for human skin color.
See Unicode and Fitzpatrick scale
Font
In metal typesetting, a font is a particular size, weight and style of a typeface.
See Unicode and Font
Font substitution
Font substitution is the process of using one typeface in place of another when the intended typeface either is not available or does not contain glyphs for the required characters. Unicode and font substitution both relate to digital typography.
See Unicode and Font substitution
Fraser script
The Fraser or Old Lisu script is an artificial abugida for the Lisu language invented around 1915 by Sara Ba Thaw, a Karen preacher from Myanmar, and improved by the missionary James O. Fraser.
FreeBSD
FreeBSD is a free and open-source Unix-like operating system descended from the Berkeley Software Distribution (BSD).
Garay alphabet
The Garay alphabet was designed in 1961, as a transcription system " African sociolinguistic characteristics" according to its inventor, Assane Faye.
See Unicode and Garay alphabet
Gardiner's sign list
Gardiner's sign list is a list of common Egyptian hieroglyphs compiled by Sir Alan Gardiner.
See Unicode and Gardiner's sign list
GB 18030
GB 18030 is a Chinese government standard, described as "Information Technology — Chinese coded character set", which defines the language and character support required for software in China.
Geʽez script
Geʽez (Gəʽəz) is a script used as an abugida (alphasyllabary) for several Afro-Asiatic and Nilo-Saharan languages of Ethiopia and Eritrea.
General Punctuation
General Punctuation is a Unicode block containing punctuation, spacing, and formatting characters for use with all scripts and writing systems.
See Unicode and General Punctuation
Geometric Shapes (Unicode block)
Geometric Shapes is a Unicode block of 96 symbols at code point range U+25A0–25FF.
See Unicode and Geometric Shapes (Unicode block)
Georgian lari
The lari (ლარი; ISO 4217: GEL) is the currency of Georgia.
Georgian scripts
The Georgian scripts are the three writing systems used to write the Georgian language: Asomtavruli, Nuskhuri and Mkhedruli.
See Unicode and Georgian scripts
Glagolitic script
The Glagolitic script (glagolitsa) is the oldest known Slavic alphabet.
See Unicode and Glagolitic script
Glyph
A glyph is any kind of purposeful mark.
Gmail
Gmail is the email service provided by Google.
GNOME
GNOME, originally an acronym for GNU Network Object Model Environment, is a free and open-source desktop environment for Linux and other Unix-like operating systems.
GNU Compiler Collection
The GNU Compiler Collection (GCC) is a collection of compilers from the GNU Project that support various programming languages, hardware architectures and operating systems.
See Unicode and GNU Compiler Collection
Gondi writing
Gondi has typically been written in Devanagari script or Telugu script, but native scripts are in existence.
Google
Google LLC is an American multinational corporation and technology company focusing on online advertising, search engine technology, cloud computing, computer software, quantum computing, e-commerce, consumer electronics, and artificial intelligence (AI).
Gothic alphabet
The Gothic alphabet is an alphabet used for writing the Gothic language.
See Unicode and Gothic alphabet
Grantha script
The Grantha script (Granta eḻuttu; granthalipi) was a classical South Indian Brahmic script, found particularly in Tamil Nadu and Kerala.
See Unicode and Grantha script
Grapheme
In linguistics, a grapheme is the smallest functional unit of a writing system.
Graphite (smart font technology)
Graphite is a programmable Unicode-compliant smart font technology and rendering system developed by SIL International as free software, distributed under the terms of the GNU Lesser General Public License and the Common Public License.
See Unicode and Graphite (smart font technology)
Greek alphabet
The Greek alphabet has been used to write the Greek language since the late 9th or early 8th century BC.
See Unicode and Greek alphabet
Greek and Coptic
Greek and Coptic is the Unicode block for representing modern (monotonic) Greek.
See Unicode and Greek and Coptic
Greek Extended
Greek Extended is a Unicode block containing the accented vowels necessary for writing polytonic Greek.
See Unicode and Greek Extended
GTK
GTK (formerly GIMP ToolKit and GTK+) is a free software cross-platform widget toolkit for creating graphical user interfaces (GUIs).
See Unicode and GTK
Gujarati script
The Gujarati script (ગુજરાતી લિપિ) is an abugida for the Gujarati language, Kutchi language, and various other languages.
See Unicode and Gujarati script
Gunjala Gondi script
The Gunjala Gondi lipi or Gunjala Gondi script is a script used to write the Gondi language, a Dravidian language spoken by the Gond people of northern Telangana, eastern Maharashtra, southeastern Madhya Pradesh, and Chhattisgarh.
See Unicode and Gunjala Gondi script
Gurmukhi
Gurmukhī (ਗੁਰਮੁਖੀ; Shahmukhi: گُرمُکھی) is an abugida used to write the Punjabi language.
Halfwidth and Fullwidth Forms (Unicode block)
Halfwidth and Fullwidth Forms is the name of a Unicode block U+FF00–FFEF, provided so that older encodings containing both halfwidth and fullwidth characters can have lossless translation to/from Unicode.
See Unicode and Halfwidth and Fullwidth Forms (Unicode block)
Han unification
Han unification is an effort by the authors of Unicode and the Universal Character Set to map multiple character sets of the Han characters of the so-called CJK languages into a single set of unified characters. Unicode and Han unification both relate to character encoding.
See Unicode and Han unification
Hanazono University
Hanazono University is a private university in Kyoto, Japan, that belongs to the Rinzai sect (specifically the Myōshin-ji temple complex, which it is next to).
See Unicode and Hanazono University
Hangul
The Korean alphabet, known as Hangul or Hangeul in South Korea and Chosŏn'gŭl in North Korea, is the modern writing system for the Korean language.
Hanifi Rohingya script
The Hanifi Rohingya script is a unified script for the Rohingya language.
See Unicode and Hanifi Rohingya script
Hanunoo script
Hanunoo, also rendered Hanunó'o, is one of the scripts indigenous to the Philippines and is used by the Mangyan peoples of southern Mindoro to write the Hanunó'o language.
See Unicode and Hanunoo script
Hatran Aramaic
Hatran Aramaic (Aramaic of Hatra, Ashurian or East Mesopotamian) designates a Middle Aramaic dialect that was used in the region of Hatra and Assur in northeastern parts of Mesopotamia (modern Iraq), approximately from the 3rd century BCE to the 3rd century CE.
See Unicode and Hatran Aramaic
Hausa language
Hausa (Harshen/Halshen Hausa; Ajami: هَرْشٜىٰن هَوْسَا) is a Chadic language that is spoken by the Hausa people in the northern parts of Nigeria, Ghana, Cameroon, Benin and Togo, and the southern parts of Niger, and Chad, with significant minorities in Ivory Coast.
See Unicode and Hausa language
Hebrew alphabet
The Hebrew alphabet (אָלֶף־בֵּית עִבְרִי), known variously by scholars as the Ktav Ashuri, Jewish script, square script and block script, is traditionally an abjad script used in the writing of the Hebrew language and other Jewish languages, most notably Yiddish, Ladino, Judeo-Arabic, and Judeo-Persian.
See Unicode and Hebrew alphabet
Hentaigana
In the Japanese writing system, hentaigana are variant forms of hiragana.
Hexadecimal
In mathematics and computing, the hexadecimal (also base-16 or simply hex) numeral system is a positional numeral system that represents numbers using a radix (base) of sixteen.
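Unicode code points are conventionally written in hexadecimal with a "U+" prefix and at least four digits. A minimal sketch of that convention follows; the function names are illustrative assumptions, not a standard API.

```python
def code_point_label(ch: str) -> str:
    """Format a single character as U+XXXX, zero-padded to at least four hex digits."""
    return f"U+{ord(ch):04X}"


def parse_code_point_label(label: str) -> str:
    """Turn a label such as 'U+20B9' back into the character it names."""
    if not label.upper().startswith("U+"):
        raise ValueError(f"not a code point label: {label!r}")
    return chr(int(label[2:], 16))  # radix 16 = hexadecimal


print(code_point_label("A"))
print(parse_code_point_label("U+20B9"))
```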
Hexagram (I Ching)
The I Ching book consists of 64 hexagrams.
See Unicode and Hexagram (I Ching)
High-level programming language
In computer science, a high-level programming language is a programming language with strong abstraction from the details of the computer.
See Unicode and High-level programming language
Hindko
Hindko (ہندکو) is a cover term for a diverse group of Lahnda dialects spoken by several million people of various ethnic backgrounds in several areas in northwestern Pakistan, primarily in the provinces of Khyber Pakhtunkhwa and northwestern regions of Punjab.
Hiragana
Hiragana is a Japanese syllabary, part of the Japanese writing system, along with katakana as well as kanji.
Homoglyph
In orthography and typography, a homoglyph is one of two or more graphemes, characters, or glyphs with shapes that appear identical or very similar but may have differing meaning.
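In Unicode terms, homoglyphs are typically distinct code points that render alike. The sketch below, which assumes Python's standard unicodedata module, compares Latin "a" with Cyrillic "а". The script check is a rough heuristic invented for this illustration: it uses the first word of each character name, which is not an official script property.

```python
import unicodedata

LATIN_A = "\u0061"     # LATIN SMALL LETTER A
CYRILLIC_A = "\u0430"  # CYRILLIC SMALL LETTER A


def describe(ch: str) -> str:
    """Return the code point and formal Unicode name of a character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def name_prefixes(word: str) -> set:
    """Heuristic only: collect the first word of each letter's Unicode name."""
    return {unicodedata.name(ch).split()[0] for ch in word if ch.isalpha()}


print(describe(LATIN_A))
print(describe(CYRILLIC_A))
print(LATIN_A == CYRILLIC_A)          # visually alike, but different characters
print(name_prefixes("p\u0430ypal"))   # a word mixing the two
```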
Hong Kong
Hong Kong is a special administrative region of the People's Republic of China.
HTML
Hypertext Markup Language (HTML) is the standard markup language for documents designed to be displayed in a web browser.
See Unicode and HTML
HTTP
HTTP (Hypertext Transfer Protocol) is an application layer protocol in the Internet protocol suite model for distributed, collaborative, hypermedia information systems.
See Unicode and HTTP
IBM
International Business Machines Corporation (using the trademark IBM), nicknamed Big Blue, is an American multinational technology company headquartered in Armonk, New York and present in over 175 countries.
See Unicode and IBM
Ideographic Research Group
The Ideographic Research Group (IRG), formerly called the Ideographic Rapporteur Group, is a subgroup of Working Group 2 (WG2) of ISO/IEC JTC1 Subcommittee 2 (SC2), which is the committee responsible for developing the Universal Coded Character Set (ISO/IEC 10646).
See Unicode and Ideographic Research Group
Imperial Aramaic
Imperial Aramaic is a linguistic term, coined by modern scholars in order to designate a specific historical variety of Aramaic language.
See Unicode and Imperial Aramaic
Indian rupee sign
The Indian rupee sign ⟨₹⟩ is the currency symbol for the Indian rupee (ISO 4217: INR), the official currency of India.
See Unicode and Indian rupee sign
Indian Script Code for Information Interchange
Indian Script Code for Information Interchange (ISCII) is a coding scheme for representing various writing systems of India.
See Unicode and Indian Script Code for Information Interchange
Indic Siyaq Numbers
Indic Siyaq Numbers is a Unicode block containing a specialized subset of the Arabic script that was used for accounting in India under the Mughals by the 17th century through the middle of the 20th century.
See Unicode and Indic Siyaq Numbers
Indo-Aryan languages
The Indo-Aryan languages (or sometimes Indic languages) are a branch of the Indo-Iranian languages in the Indo-European language family.
See Unicode and Indo-Aryan languages
Injective function
In mathematics, an injective function (also known as injection, or one-to-one function) is a function f that maps distinct elements of its domain to distinct elements; that is, x1 ≠ x2 implies f(x1) ≠ f(x2).
See Unicode and Injective function
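For a finite mapping, injectivity can be checked directly: no two inputs may share an output. The sketch below is an illustration under that assumption, with a hypothetical is_injective helper. It contrasts UTF-8 encoding of a few characters with a lossy ASCII conversion that collapses different characters onto "?".

```python
def is_injective(mapping: dict) -> bool:
    """Return True if no two keys of the finite mapping share a value."""
    values = list(mapping.values())
    return len(set(values)) == len(values)


sample = "a\u00e9\u00e8\u20ac"  # a, é, è, €

# Each character gets its own byte sequence under UTF-8.
utf8_map = {ch: ch.encode("utf-8") for ch in sample}

# Replacing unencodable characters with "?" sends é, è and € to the same value.
lossy_map = {ch: ch.encode("ascii", "replace") for ch in sample}

print(is_injective(utf8_map))
print(is_injective(lossy_map))
```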
Inscriptional Pahlavi
Inscriptional Pahlavi is the earliest attested form of Pahlavi scripts, and is evident in clay fragments that have been dated to the reign of Mithridates I (r. 171–138 BC).
See Unicode and Inscriptional Pahlavi
Inscriptional Parthian
Inscriptional Parthian is a script used to write the Parthian language on coins of Parthia from the time of Arsaces I (250 BC).
See Unicode and Inscriptional Parthian
Insular script
Insular script is a medieval script system originating from Ireland that spread to England and continental Europe under the influence of Irish Christianity.
See Unicode and Insular script
International Components for Unicode
International Components for Unicode (ICU) is an open-source project of mature C/C++ and Java libraries for Unicode support, software internationalization, and software globalization. Unicode and International Components for Unicode both fall under digital typography.
See Unicode and International Components for Unicode
Internationalized domain name
An internationalized domain name (IDN) is an Internet domain name that contains at least one label displayed in software applications, in whole or in part, in non-Latin script or alphabet or in the Latin alphabet-based characters with diacritics or ligatures.
See Unicode and Internationalized domain name
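A short illustration of how a label with diacritics maps to its ASCII-compatible form. This sketch relies on Python's built-in "idna" codec, which implements the older IDNA 2003 rules; current registries may apply newer rules, so treat the output as illustrative. The helper names and the example domain are assumptions.

```python
def to_ascii(domain: str) -> str:
    """Convert a Unicode domain name to its ASCII-compatible form (IDNA 2003)."""
    return domain.encode("idna").decode("ascii")


def to_unicode(domain: str) -> str:
    """Convert an ASCII-compatible domain name back to its Unicode form."""
    return domain.encode("ascii").decode("idna")


ascii_form = to_ascii("b\u00fccher.example")  # bücher.example
print(ascii_form)
print(to_unicode(ascii_form))
```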
Internet Engineering Task Force
The Internet Engineering Task Force (IETF) is a standards organization for the Internet and is responsible for the technical standards that make up the Internet protocol suite (TCP/IP).
See Unicode and Internet Engineering Task Force
Internet Explorer
Internet Explorer (formerly Microsoft Internet Explorer and Windows Internet Explorer, commonly abbreviated as IE or MSIE) is a retired series of graphical web browsers developed by Microsoft that were used in the Windows line of operating systems.
See Unicode and Internet Explorer
IPA Extensions
IPA Extensions is a block (U+0250–U+02AF) of the Unicode standard that contains full size letters used in the International Phonetic Alphabet (IPA).
See Unicode and IPA Extensions
ISO/IEC 14755
ISO/IEC 14755 is a joint International Organization for Standardization (ISO) and International Electrotechnical Commission (IEC) standard for input methods to enter characters defined in ISO/IEC 10646, the international standard corresponding to the Unicode Standard.
ISO/IEC 2022
ISO/IEC 2022 Information technology—Character code structure and extension techniques, is an ISO/IEC standard in the field of character encoding.
ISO/IEC 8859
ISO/IEC 8859 is a joint ISO and IEC series of standards for 8-bit character encodings.
ISO/IEC 8859-1
ISO/IEC 8859-1:1998, Information technology — 8-bit single-byte coded graphic character sets — Part 1: Latin alphabet No. 1, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings.
See Unicode and ISO/IEC 8859-1
ISO/IEC 8859-9
ISO/IEC 8859-9:1999, Information technology — 8-bit single-byte coded graphic character sets — Part 9: Latin alphabet No. 5, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings.
See Unicode and ISO/IEC 8859-9
ISO/IEC JTC 1/SC 2
ISO/IEC JTC 1/SC 2 Coded character sets is a standardization subcommittee of the Joint Technical Committee ISO/IEC JTC 1 of the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC), that develops and facilitates standards within the field of coded character sets.
See Unicode and ISO/IEC JTC 1/SC 2
Japan
Japan is an island country in East Asia, located in the Pacific Ocean off the northeast coast of the Asian mainland.
Java virtual machine
A Java virtual machine (JVM) is a virtual machine that enables a computer to run Java programs as well as programs written in other languages that are also compiled to Java bytecode.
See Unicode and Java virtual machine
Javanese script
The Javanese script (natively known as Aksara Jawa, Hanacaraka, Carakan, and Dentawyanjana) is one of Indonesia's traditional scripts developed on the island of Java.
See Unicode and Javanese script
JIS X 0208
JIS X 0208 is a 2-byte character set specified as a Japanese Industrial Standard, containing 6879 graphic characters suitable for writing text, place names, personal names, and so forth in the Japanese language.
Joe Becker (Unicode)
Joseph D. Becker is an American computer scientist and one of the co-founders of the Unicode project, and a Technical Vice President Emeritus of the Unicode Consortium.
See Unicode and Joe Becker (Unicode)
Jurchen script
The Jurchen script (Jurchen) was the writing system used to write the Jurchen language, the language of the Jurchen people who created the Jin Empire in northeastern China in the 12th–13th centuries.
See Unicode and Jurchen script
Kaithi
Kaithi, also called Kayathi or Kayasthi, is a historical Brahmic script that was used widely in parts of Northern and Eastern India, primarily in the present-day states of Uttar Pradesh, Jharkhand and Bihar.
Kangxi radical
The 214 Kangxi radicals, also known as Zihui radicals, were collated in the 18th-century Kangxi Dictionary to aid categorization of Chinese characters.
See Unicode and Kangxi radical
Kanji
Kanji are the logographic Chinese characters adapted from the Chinese script used in the writing of Japanese.
Kannada script
The Kannada script (IAST: Kannaḍa lipi; obsolete: Kanarese or Canarese script in English) is an abugida of the Brahmic family, used to write Kannada, one of the Dravidian languages of South India especially in the state of Karnataka.
See Unicode and Kannada script
Katakana
Katakana is a Japanese syllabary, one component of the Japanese writing system along with hiragana, kanji and in some cases the Latin script (known as rōmaji).
Kawi script
The Kawi script (aksara kawi, aksara carakan kuna), or Old Javanese script, is a Brahmic script found primarily in Java and used across much of Maritime Southeast Asia between the 8th century and the 16th century (Aditya Bayu Perdana and Ilham Nurwansah 2020). The script is an abugida, meaning that characters are read with an inherent vowel.
Kayah Li alphabet
The Kayah Li alphabet (Kayah Li) is used to write the Kayah languages Eastern Kayah Li and Western Kayah Li, which are members of Karenic branch of the Sino-Tibetan language family.
See Unicode and Kayah Li alphabet
KDE
KDE is an international free software community that develops free and open-source software.
See Unicode and KDE
Ken Lunde
Ken Roger Lunde (born 12 August 1965 in Madison, Wisconsin; Lunde, 2008) is an American specialist in information processing for East Asian languages.
Kharosthi
The Kharoṣṭhī script, also known as the Gāndhārī script, was an ancient Indic script used by various peoples from the north-western outskirts of the Indian subcontinent (present-day Pakistan) to Central Asia via Afghanistan.
Khema script
The Khema script, also known as Gurung Khema, Khema Phri, Khema Lipi, is used to write the Gurung language.
Khitan large script
The Khitan large script was one of two writing systems used for the now-extinct Khitan language (the other was the Khitan small script).
See Unicode and Khitan large script
Khitan small script
The Khitan small script was one of two writing systems used for the now-extinct Khitan language.
See Unicode and Khitan small script
Khmer script
Khmer script (អក្សរខ្មែរ; Huffman, Franklin) is an abugida used to write the Khmer language, the official language of Cambodia.
Khojki script
Khojkī, Khojakī, or Khwājā Sindhī (خوجڪي (Arabic script) खोजकी (Devanagari)), is a script used formerly and almost exclusively by the Khoja community of parts of the Indian subcontinent, including Sindh, Gujarat, and Punjab.
Khudabadi script
Khudabadi (देवदेन/ Devden) was a script used to write the Sindhi language, generally used by some Sindhi Hindus even in the present-day.
See Unicode and Khudabadi script
Kirat Rai
Kirat Rai (also called Khambu Rai, Rai Barṇamālā and Kirat Khambu Rai) is a left-to-right abugida (a type of segmental writing system), based on the Sumhung Lipi of the 1920s, used to write the Bantawa language in the Indian state of Sikkim.
Klingon scripts
The Klingon scripts are fictional alphabetic scripts used in the Star Trek movies and television shows to write the Klingon language.
See Unicode and Klingon scripts
Kyrgyz som
The som (Kyrgyz: сом; ISO code: KGS; sign: ⃀) is the currency of Kyrgyzstan.
Lamedh
Lamedh or lamed is the twelfth letter of the Semitic abjads, including Hebrew lāmeḏ ל, Aramaic lāmaḏ 𐡋, Syriac lāmaḏ ܠ, Arabic lām ل, and Phoenician lāmd 𐤋.
Lao script
Lao script or Akson Lao (ອັກສອນລາວ) is the primary script used to write the Lao language and other minority languages in Laos.
Latin Extended Additional
Latin Extended Additional is a Unicode block.
See Unicode and Latin Extended Additional
Latin Extended-A
Latin Extended-A is a Unicode block and is the third block of the Unicode standard.
See Unicode and Latin Extended-A
Latin Extended-B
Latin Extended-B is the fourth block (U+0180–U+024F) of the Unicode Standard.
See Unicode and Latin Extended-B
Latin script
The Latin script, also known as the Roman script, is a writing system based on the letters of the classical Latin alphabet, derived from a form of the Greek alphabet which was in use in the ancient Greek city of Cumae in Magna Graecia.
Latin-1 Supplement
The Latin-1 Supplement (also called C1 Controls and Latin-1 Supplement) is the second Unicode block in the Unicode standard.
See Unicode and Latin-1 Supplement
Leading zero
A leading zero is any 0 digit that comes before the first nonzero digit in a number string in positional notation.
Lee Collins (Unicode)
Lee Collins is a software engineer and co-founder of the Unicode Consortium.
See Unicode and Lee Collins (Unicode)
Lepcha script
The Lepcha script, or Róng script, is an abugida used by the Lepcha people to write the Lepcha language.
Letterlike Symbols
Letterlike Symbols is a Unicode block containing 80 characters which are constructed mainly from the glyphs of one or more letters.
See Unicode and Letterlike Symbols
Ligature (writing)
In writing and typography, a ligature occurs where two or more graphemes or letters are joined to form a single glyph.
See Unicode and Ligature (writing)
Limbu script
The Limbu script (also Sirijanga script) is used to write the Limbu language.
Linear A
Linear A is a writing system that was used by the Minoans of Crete from 1800 BC to 1450 BC.
Linear B
Linear B is a syllabic script that was used for writing in Mycenaean Greek, the earliest attested form of the Greek language.
Linux distribution
A Linux distribution (often abbreviated as distro) is an operating system made from a software collection that includes the Linux kernel and often a package management system.
See Unicode and Linux distribution
List of binary codes
This is a list of some binary codes that are (or have been) used to represent text as a sequence of binary digits "0" and "1".
See Unicode and List of binary codes
List of Hangul jamo
This is the list of Hangul jamo (Korean alphabet letters which represent consonants and vowels in Korean) including obsolete ones.
See Unicode and List of Hangul jamo
List of typefaces
This is a list of typefaces, which are separated into groups by distinct artistic differences.
See Unicode and List of typefaces
List of Unicode characters
As of Unicode version 15.1, there are 149,878 characters with code points, covering 161 modern and historical scripts, as well as multiple symbol sets.
See Unicode and List of Unicode characters
List of XML and HTML character entity references
In SGML, HTML and XML documents, the logical constructs known as character data and attribute values consist of sequences of characters, in which each character can manifest directly (representing itself), or can be represented by a series of characters called a character reference, of which there are two types: a numeric character reference and a character entity reference.
See Unicode and List of XML and HTML character entity references
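The two kinds of reference described above can be compared side by side. This is a minimal sketch assuming Python's standard html module; to_numeric_reference is a hypothetical helper written for this illustration.

```python
import html


def to_numeric_reference(ch: str, hexadecimal: bool = False) -> str:
    """Build a numeric character reference for one character."""
    code_point = ord(ch)
    return f"&#x{code_point:X};" if hexadecimal else f"&#{code_point};"


# One character entity reference and two numeric character references for é.
print(html.unescape("&eacute; &#233; &#xE9;"))
print(to_numeric_reference("\u00e9"))
print(to_numeric_reference("\u00e9", hexadecimal=True))
```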
Lithuanian language
Lithuanian is an East Baltic language belonging to the Baltic branch of the Indo-European language family.
See Unicode and Lithuanian language
Lontara script
The Lontara script, also known as the Bugis script, Bugis-Makassar script, or Urupu Sulapa’ Eppa’ "four-cornered letters", is one of Indonesia's traditional scripts developed in the South Sulawesi and West Sulawesi region.
See Unicode and Lontara script
Lotus Multi-Byte Character Set
The Lotus Multi-Byte Character Set (LMBCS) is a proprietary multi-byte character encoding originally conceived in 1988 at Lotus Development Corporation with input from Bob Balaban and others. Unicode and Lotus Multi-Byte Character Set both fall under character encoding.
See Unicode and Lotus Multi-Byte Character Set
Lycian alphabet
The Lycian alphabet was used to write the Lycian language of the Asia Minor region of Lycia.
See Unicode and Lycian alphabet
Lydian alphabet
Lydian script was used to write the Lydian language.
See Unicode and Lydian alphabet
MacOS
macOS, originally Mac OS X, previously shortened as OS X, is an operating system developed and marketed by Apple since 2001.
Macron (diacritic)
A macron is a diacritical mark: it is a straight bar placed above a letter, usually a vowel.
See Unicode and Macron (diacritic)
Mahajani
Mahajani is a Laṇḍā mercantile script that was historically used in northern India for writing accounts and financial records in Marwari, Hindi and Punjabi.
Mahjong
Mahjong is a tile-based game that was developed in the 19th century in China and has spread throughout the world since the early 20th century.
Makassarese language
Makassarese (basa Mangkasara or basa Mangkasarak), sometimes called Makasar, Makassar, or Macassar, is a language of the Makassarese people, spoken in South Sulawesi province of Indonesia.
See Unicode and Makassarese language
Malayalam script
Malayalam script (മലയാള ലിപി) is a Brahmic script used commonly to write Malayalam, which is the principal language of Kerala, India, spoken by 45 million people in the world.
See Unicode and Malayalam script
Mandaic alphabet
The Mandaic alphabet is a writing system primarily used to write the Mandaic language.
See Unicode and Mandaic alphabet
Manichaean script
The Manichaean script is an abjad-based writing system rooted in the Semitic family of alphabets and associated with the spread of Manichaeism from southwest to central Asia and beyond, beginning in the third century CE.
See Unicode and Manichaean script
Mark Davis (Unicode)
Mark Edward Davis (born September 13, 1952) is an American specialist in the internationalization and localization of software and the co-founder and chief technical officer of the Unicode Consortium, previously serving as its president until 2022.
See Unicode and Mark Davis (Unicode)
Markup language
A markup language is a text-encoding system which specifies the structure and formatting of a document and potentially the relationship between its parts.
See Unicode and Markup language
Mathematical Operators (Unicode block)
Mathematical Operators is a Unicode block containing characters for mathematical, logical, and set notation.
See Unicode and Mathematical Operators (Unicode block)
Maya numerals
The Mayan numeral system was the system to represent numbers and calendar dates in the Maya civilization.
Medefaidrin
Medefaidrin (Medefidrin), or Obɛri Ɔkaimɛ, is a constructed language and script created as a Christian sacred language by an Ibibio congregation in 1930s Nigeria.
Medieval Unicode Font Initiative
In digital typography, the Medieval Unicode Font Initiative (MUFI) is a project which aims to coordinate the encoding and display of special characters in medieval texts written in the Latin alphabet or in runes, which are not otherwise encoded as part of Unicode. Unicode and Medieval Unicode Font Initiative both fall under digital typography.
See Unicode and Medieval Unicode Font Initiative
Meitei script
The Meitei script (ꯃꯩꯇꯩ ꯃꯌꯦꯛ, Meitei mayek), also known as the Kanglei script (ꯀꯪꯂꯩ ꯃꯌꯦꯛ, Kanglei mayek) or the Kok Sam Lai script (ꯀꯣꯛ ꯁꯝ ꯂꯥꯏ ꯃꯌꯦꯛ, Kok Sam Lai mayek) after its first three letters, is an abugida in the Brahmic scripts family used to write the Meitei language, the official language of Manipur, Assam and one of the 22 official languages of India.
Mende Kikakui script
The Mende Kikakui script is a syllabary used for writing the Mende language of Sierra Leone.
See Unicode and Mende Kikakui script
Meroitic script
The Meroitic script consists of two alphasyllabic scripts developed to write the Meroitic language at the beginning of the Meroitic Period (3rd century BC) of the Kingdom of Kush.
See Unicode and Meroitic script
Meta Platforms
Meta Platforms, Inc., doing business as Meta, and formerly named Facebook, Inc., and TheFacebook, Inc., is an American multinational technology conglomerate based in Menlo Park, California.
See Unicode and Meta Platforms
Michael Everson
Michael Everson (born January 1963) is an American and Irish linguist, script encoder, typesetter, type designer and publisher.
See Unicode and Michael Everson
Microsoft
Microsoft Corporation is an American multinational corporation and technology company headquartered in Redmond, Washington.
Microsoft Layer for Unicode
The Microsoft Layer for Unicode (MSLU) is a software library for legacy versions of Windows, simplifying the creation of Unicode-aware programs on Windows 9x (Windows 95, Windows 98, and Windows Me).
See Unicode and Microsoft Layer for Unicode
Microsoft Windows
Microsoft Windows is a product line of proprietary graphical operating systems developed and marketed by Microsoft.
See Unicode and Microsoft Windows
MIME
Multipurpose Internet Mail Extensions (MIME) is a standard that extends the format of email messages to support text in character sets other than ASCII, as well as attachments of audio, video, images, and application programs.
See Unicode and MIME
Ministry of Endowments and Religious Affairs (Oman)
The Ministry of Awqaf and Religious Affairs (MARA) is the governmental body in the Sultanate of Oman responsible for overseeing all matters related to awqaf and religious affairs.
See Unicode and Ministry of Endowments and Religious Affairs (Oman)
Miscellaneous Symbols
Miscellaneous Symbols is a Unicode block (U+2600–U+26FF) containing glyphs representing concepts from a variety of categories: astrological, astronomical, chess, dice, musical notation, political symbols, recycling, religious symbols, trigrams, warning signs, and weather, among others.
See Unicode and Miscellaneous Symbols
Miscellaneous Technical
Miscellaneous Technical is a Unicode block ranging from U+2300 to U+23FF, which contains various common symbols which are related to and used in the various technical, programming language, and academic professions.
See Unicode and Miscellaneous Technical
Modi script
Modi (मोडी) is a script used to write the Marathi language, which is the primary language spoken in the state of Maharashtra, India.
Mojibake
Mojibake (文字化け, "character transformation") is the garbled or gibberish text that is the result of text being decoded using an unintended character encoding. Unicode and Mojibake both fall under character encoding.
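The effect is easy to reproduce. In this sketch, text encoded as UTF-8 is decoded as ISO/IEC 8859-1 (Latin-1), a common real-world mismatch chosen here as an assumption for illustration. Because Latin-1 assigns a character to every byte, the damage can be reversed by applying the inverse steps; that is not true of every encoding pair.

```python
def garble(text: str) -> str:
    """Encode as UTF-8, then decode with the wrong encoding (Latin-1)."""
    return text.encode("utf-8").decode("latin-1")


def repair(garbled: str) -> str:
    """Undo garble() by reversing the two steps."""
    return garbled.encode("latin-1").decode("utf-8")


broken = garble("caf\u00e9")  # café
print(broken)
print(repair(broken))
```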
Mon alphabet
The Mon alphabet (အက္ခရ်မန်; မွန်အက္ခရာ; อักษรมอญ) is a Brahmic abugida used for writing the Mon language.
Mongolian script
The traditional Mongolian script, also known as the Hudum Mongol bichig, was the first writing system created specifically for the Mongolian language, and was the most widespread until the introduction of Cyrillic in 1946.
See Unicode and Mongolian script
Mru script
The Mru script (Mru) is an indigenous, messianic script for the Mru language.
Multani script
Multani is a Brahmic script originating in the Multan region of Punjab and in northern Sindh, Pakistan.
See Unicode and Multani script
Multilingualism
Multilingualism is the use of more than one language, either by an individual speaker or by a group of speakers.
See Unicode and Multilingualism
Mundari Bani
Mundari Bani (Mundari: Mundari Bani 'Mundari alphabet', also known as Mundari Bani Hisir Hisir 'writing', Nag Mundari, or the Mundari alphabet) is the writing system created for the Mundari language, spoken in eastern India.
Musical notation
Musical notation is any system used to visually represent music.
See Unicode and Musical notation
N'Ko script
NKo (ߒߞߏ), also spelled N'Ko, is an alphabetic script devised by Solomana Kanté in 1949, as a modern writing system for the Manding languages of West Africa.
Nabataean script
The Nabataean script is an abjad (consonantal alphabet) that was used to write Nabataean Aramaic and Nabataean Arabic from the second century BC onwards.
See Unicode and Nabataean script
Nandinagari
Nandināgarī is a Brahmic script derived from the Nāgarī script which appeared in the 7th century AD.
Natural language processing
Natural language processing (NLP) is an interdisciplinary subfield of computer science and artificial intelligence.
See Unicode and Natural language processing
Nüshu
Nüshu is a syllabic script derived from Chinese characters that was used exclusively among ethnic Yao women in Jiangyong County in Hunan province of southern China before going extinct in the early 21st century.
Netflix
Netflix is an American subscription video on-demand over-the-top streaming service.
New Tai Lue alphabet
New Tai Lue script, also known as Xishuangbanna Dai and Simplified Tai Lue, is an abugida used to write the Tai Lü language.
See Unicode and New Tai Lue alphabet
Newline
A newline (frequently called line ending, end of line (EOL), next line (NEL) or line break) is a control character or sequence of control characters in character encoding specifications such as ASCII, EBCDIC, Unicode, etc.
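Several different control characters and sequences can act as a line ending. As a rough illustration, Python's str.splitlines recognizes LF, CR LF, CR, NEL (U+0085), and the Unicode line and paragraph separators, along with a few other separators such as vertical tab and form feed. The normalize_newlines helper is a hypothetical name used only for this sketch.

```python
def normalize_newlines(text: str) -> str:
    """Rewrite every line boundary that str.splitlines recognizes as a single LF."""
    return "\n".join(text.splitlines())


# LF, CR LF, CR, NEL (U+0085) and LINE SEPARATOR (U+2028) in one string.
mixed = "one\ntwo\r\nthree\rfour\x85five\u2028six"

print(mixed.splitlines())
print(repr(normalize_newlines(mixed)))
```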
NeXT
NeXT, Inc. (later NeXT Computer, Inc. and NeXT Software, Inc.) was an American technology company headquartered in Redwood City, California that specialized in computer workstations for higher education and business markets, and later developed web software.
See Unicode and NeXT
Number Forms
Number Forms is a Unicode block containing Unicode compatibility characters that have specific meaning as numbers, but are constructed from other characters.
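The numeric meaning and the "constructed from other characters" aspect can both be queried from the Unicode Character Database. This is a minimal sketch assuming Python's standard unicodedata module, using two characters from the block: U+216B ROMAN NUMERAL TWELVE and U+2153 VULGAR FRACTION ONE THIRD.

```python
import unicodedata

ROMAN_TWELVE = "\u216B"  # Ⅻ
ONE_THIRD = "\u2153"     # ⅓

for ch in (ROMAN_TWELVE, ONE_THIRD):
    name = unicodedata.name(ch)
    value = unicodedata.numeric(ch)                   # numeric value property
    expanded = unicodedata.normalize("NFKC", ch)      # compatibility expansion
    print(f"U+{ord(ch):04X} {name}: value={value}, expands to {expanded!r}")
```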
Numidian language
Numidian was a language spoken in ancient Numidia.
See Unicode and Numidian language
Nyiakeng Puachue Hmong
Nyiakeng Puachue Hmong (RPA: Ntawv Nyiajkeeb Puajtxwm Hmoob) is an alphabet script devised for White Hmong and Green Hmong in the 1980s by Reverend Chervang Kong for use within his United Christians Liberty Evangelical Church.
See Unicode and Nyiakeng Puachue Hmong
Odia script
The Odia script is a Brahmic script used primarily to write the Odia language and others, including Sanskrit and other regional languages.
Ogham
Ogham (also ogum, ogom, later ogam) is an Early Medieval alphabet used primarily to write the early Irish language (in the "orthodox" inscriptions, 4th to 6th centuries AD), and later the Old Irish language (scholastic ogham, 6th to 9th centuries).
Ogonek
The ogonek (Polish for "little tail", diminutive of ogon) is a diacritic hook placed under the lower right corner of a vowel in the Latin alphabet used in several European languages, and directly under a vowel in several Native American languages.
Ol Chiki script
The Ol Chiki (ᱚᱞ ᱪᱤᱠᱤ) script, also known as Ol Chemetʼ (ᱚᱞ ᱪᱮᱢᱮᱫ; ol 'writing', chemetʼ 'learning'), Ol Ciki, Ol, and sometimes as the Santhali alphabet invented by Pandit Raghunath Murmu in 1925, is the official writing system for Santhali, an Austroasiatic language recognized as an official regional language in India.
See Unicode and Ol Chiki script
Ol Onal
The Ol Onal, also known as Bhumij Lipi or Bhumij Onal, is an alphabetic writing system for the Bhumij language.
Old Hungarian script
The Old Hungarian script or Hungarian runes (Székely-magyar rovás, 'székely-magyar runiform', or rovásírás) is an alphabetic writing system used for writing the Hungarian language.
See Unicode and Old Hungarian script
Old Italic scripts
The Old Italic scripts are a family of ancient writing systems used in the Italian Peninsula between about 700 and 100 BC, for various languages spoken in that time and place.
See Unicode and Old Italic scripts
Old Permic script
The Old Permic script (Важ Перым гижӧм, Važ Perym gižöm), sometimes known by its initial two characters as Abur or Anbur, is a "highly idiosyncratic adaptation" of the Cyrillic script once used to write medieval Komi (a member of the Permic branch of Finno-Ugric languages).
See Unicode and Old Permic script
Old Persian cuneiform
Old Persian cuneiform is a semi-alphabetic cuneiform script that was the primary script for Old Persian.
See Unicode and Old Persian cuneiform
Old Turkic script
The Old Turkic script (also known variously as the Göktürk script, Orkhon script, Orkhon-Yenisey script, or Turkic runes) was the alphabet used by the Göktürks and other early Turkic khanates from the 8th to 10th centuries to record the Old Turkic language.
See Unicode and Old Turkic script
Old Uyghur alphabet
The Old Uyghur alphabet was a Turkic script used for writing Old Uyghur, a variety of Old Turkic spoken in Turpan and Gansu that is the ancestor of the modern Western Yugur language.
See Unicode and Old Uyghur alphabet
Open-source Unicode typefaces
There are Unicode typefaces which are open-source and designed to contain glyphs of all Unicode characters, or at least a broad selection of Unicode scripts.
See Unicode and Open-source Unicode typefaces
OpenType
OpenType is a format for scalable computer fonts. Unicode and OpenType both fall under digital typography.
Osage script
The Osage script is a new script promulgated in 2006 and revised 2012–2014 for the Osage language.
Osmanya alphabet
The Osmanya alphabet (Farta Cismaanya, 𐒍𐒖𐒇𐒂𐒖 𐒋𐒘𐒈𐒑𐒛𐒒𐒕𐒖), also known as Far Soomaali (𐒍𐒖𐒇 𐒘𐒝𐒈𐒑𐒛𐒘, "Somali writing") and, in Arabic, as al-kitābah al-ʿuthmānīyah (الكتابة العثمانية; "Osman writing"), is an alphabetic script created to transcribe the Somali language.
See Unicode and Osmanya alphabet
Outlook.com
Outlook.com, formerly Hotmail, is a free personal email service offered by Microsoft.
Pahawh Hmong
Pahawh Hmong (RPA: Phaj hauj Hmoob; known also as Ntawv Pahawh, Ntawv Keeb, Ntawv Caub Fab, Ntawv Soob Lwj) is an indigenous semi-syllabic script, invented in 1959 by Shong Lue Yang, to write two Hmong languages: Hmong Daw (Hmoob Dawb, White Miao) and Hmong Njua, also known as Hmong Leng (Moob Leeg, Green Miao).
Pali
Pāli, also known as Pali-Magadhi, is a Middle Indo-Aryan liturgical language on the Indian subcontinent.
See Unicode and Pali
Palmyrene alphabet
The Palmyrene alphabet was a historical Semitic alphabet used to write Palmyrene Aramaic.
See Unicode and Palmyrene alphabet
Pango
Pango (stylized as Παν語) is a text (i.e. glyph) layout engine library which works with the HarfBuzz shaping engine for displaying multi-language text.
PARC (company)
SRI Future Concepts Division (formerly Palo Alto Research Center, PARC and Xerox PARC) is a research and development company in Palo Alto, California.
See Unicode and PARC (company)
Pau Cin Hau script
The Pau Cin Hau scripts, known as Pau Cin Hau lai ('Pau Cin Hau script'), or Zo tual lai ('Zo indigenous script') in Zomi, are two scripts, a logographic script and an alphabetic script created by Pau Cin Hau, a Zomi religious leader from Chin State, Burma.
See Unicode and Pau Cin Hau script
Percent-encoding
URL encoding, officially known as percent-encoding, is a method to encode arbitrary data in a uniform resource identifier (URI) using only the US-ASCII characters legal within a URI.
See Unicode and Percent-encoding
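For non-ASCII text, percent-encoding is normally applied to the bytes of the UTF-8 form. A minimal sketch using Python's standard urllib.parse module follows; passing safe="" is a choice made for this example so that "/" is escaped as well.

```python
from urllib.parse import quote, unquote

original = "a b/\u00e9"  # contains a space, a slash and é

# quote() encodes the string as UTF-8 and escapes each byte that is not allowed.
encoded = quote(original, safe="")
print(encoded)

# unquote() reverses the process, decoding the bytes as UTF-8 by default.
print(unquote(encoded))
```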
Phaistos Disc
The Phaistos Disc or Phaistos Disk is a disk of fired clay from the island of Crete, Greece, possibly from the middle or late Minoan Bronze Age (second millennium BC), bearing a text in an unknown script and language.
Philippines
The Philippines, officially the Republic of the Philippines, is an archipelagic country in Southeast Asia.
Phoenician alphabet
The Phoenician alphabet is an abjad (consonantal alphabet) used across the Mediterranean civilization of Phoenicia for most of the 1st millennium BC.
See Unicode and Phoenician alphabet
Plan 9 from Bell Labs
Plan 9 from Bell Labs is a distributed operating system which originated from the Computing Science Research Center (CSRC) at Bell Labs in the mid-1980s and built on UNIX concepts first developed there in the late 1960s.
See Unicode and Plan 9 from Bell Labs
Plane (Unicode)
In the Unicode standard, a plane is a contiguous group of 65,536 (2^16) code points.
See Unicode and Plane (Unicode)
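Because each plane holds 2^16 code points, the plane number of a code point can be read off by discarding its low 16 bits. The helper name plane_of below is hypothetical and used only for illustration.

```python
def plane_of(code_point: int) -> int:
    """Return the plane number (0 to 16) that contains a code point."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    # Each plane spans 2**16 code points, so shifting away the low
    # 16 bits leaves the plane index.
    return code_point >> 16


print(plane_of(ord("A")))   # 0
print(plane_of(0x1F600))    # 1
print(plane_of(0x10FFFF))   # 16
```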
Playing card
A playing card is a piece of specially prepared card stock, heavy paper, thin cardboard, plastic-coated paper, cotton-paper blend, or thin plastic that is marked with distinguishing motifs.
Pollard script
The Pollard script, also known as Pollard Miao or Miao, is an abugida loosely based on the Latin alphabet and invented by Methodist missionary Sam Pollard.
See Unicode and Pollard script
Pracalit script
Prachalit, also known as Newa, Newar, Newari, or Nepāla lipi is a type of abugida script developed from the Nepalese scripts, which are a part of the family of Brahmic scripts descended from Brahmi script.
See Unicode and Pracalit script
Precomposed character
A precomposed character (alternatively composite character or decomposable character) is a Unicode entity that can also be defined as a sequence of one or more other characters.
See Unicode and Precomposed character
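A minimal sketch of this relationship, using Python's standard unicodedata module: the precomposed letter é (U+00E9) decomposes into a base letter followed by a combining accent, and the sequence can be composed back. The choice of sample character is an assumption made for illustration.

```python
import unicodedata

precomposed = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE

# NFD replaces the precomposed character with its canonical decomposition.
decomposed = unicodedata.normalize("NFD", precomposed)
print([f"U+{ord(c):04X}" for c in decomposed])  # ['U+0065', 'U+0301']

# NFC composes the sequence back into the single precomposed code point.
recomposed = unicodedata.normalize("NFC", decomposed)
print(recomposed == precomposed)  # True
```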
Private Use Areas
In Unicode, a Private Use Area (PUA) is a range of code points that, by definition, will not be assigned characters by the Unicode Consortium.
See Unicode and Private Use Areas
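The three Private Use Areas are U+E000 to U+F8FF in the Basic Multilingual Plane, plus most of planes 15 and 16. The sketch below tests membership by range; the names PRIVATE_USE_RANGES and is_private_use are hypothetical. Python's unicodedata module reports the general category "Co" for these code points, which serves as a cross-check.

```python
import unicodedata

# (first, last) code point of each Private Use Area.
PRIVATE_USE_RANGES = (
    (0xE000, 0xF8FF),      # Private Use Area (plane 0)
    (0xF0000, 0xFFFFD),    # Supplementary Private Use Area-A (plane 15)
    (0x100000, 0x10FFFD),  # Supplementary Private Use Area-B (plane 16)
)


def is_private_use(ch: str) -> bool:
    """Return True if the single character lies in a Private Use Area."""
    cp = ord(ch)
    return any(first <= cp <= last for first, last in PRIVATE_USE_RANGES)


print(is_private_use("\ue000"))         # True
print(is_private_use("A"))              # False
print(unicodedata.category("\ue000"))   # Co
```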
Proof of concept
Proof of concept (POC or PoC), also known as proof of principle, is a realization of a certain idea, method or principle in order to demonstrate its feasibility, or viability, or a demonstration in principle with the aim of verifying that some concept or theory has practical potential.
See Unicode and Proof of concept
Psalter Pahlavi
Psalter Pahlavi is a cursive abjad that was used for writing Middle Persian on paper; it is thus described as one of the Pahlavi scripts.
See Unicode and Psalter Pahlavi
Punjabi language
Punjabi, sometimes spelled Panjabi, is an Indo-Aryan language native to the Punjab region of Pakistan and India.
See Unicode and Punjabi language
Punycode
Punycode is a representation of Unicode with the limited ASCII character subset used for Internet hostnames.
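As an illustration, Python ships a "punycode" codec and an "idna" codec in its standard library; the latter adds the xn-- prefix used in hostnames. The sample label is an assumption chosen for this sketch.

```python
label = "m\u00fcnchen"  # sample label containing one non-ASCII letter

# Raw Punycode: the ASCII letters first, then a delimiter and the
# encoded position and value of the non-ASCII character.
ascii_form = label.encode("punycode")
print(ascii_form)  # b'mnchen-3ya'

# Decoding restores the original Unicode label.
restored = ascii_form.decode("punycode")
print(restored == label)  # True

# The idna codec produces the form used in hostnames (with the xn-- prefix).
hostname_label = label.encode("idna")
print(hostname_label)  # b'xn--mnchen-3ya'
```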
Python (programming language)
Python is a high-level, general-purpose programming language.
See Unicode and Python (programming language)
Quoted-printable
Quoted-Printable, or QP encoding, is a binary-to-text encoding system using printable ASCII characters (alphanumerics and the equals sign).
See Unicode and Quoted-printable
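A short sketch using Python's standard quopri module shows the idea: printable ASCII bytes pass through unchanged, and every other byte becomes an equals sign followed by two hexadecimal digits. The sample text is an assumption.

```python
import quopri

# UTF-8 bytes of a short sample word; the last letter needs two bytes.
raw = "caf\u00e9".encode("utf-8")

# Bytes outside printable ASCII are written as =XX.
encoded = quopri.encodestring(raw)
print(encoded)  # b'caf=C3=A9'

# Decoding restores the original byte sequence.
decoded = quopri.decodestring(encoded)
print(decoded == raw)  # True
```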
Reiwa era
The Reiwa era is the current and 232nd era of the official calendar of Japan.
Rejang alphabet
The Rejang script is an abugida of the Brahmic family that is related to other scripts of the region, such as the Batak and Lontara scripts.
See Unicode and Rejang alphabet
Religious and political symbols in Unicode
Unicode contains a number of characters that represent various cultural, political, and religious symbols.
See Unicode and Religious and political symbols in Unicode
Research Libraries Group
The Research Libraries Group (RLG) was a U.S.-based library consortium that existed from 1974 until its merger with the OCLC library consortium in 2006.
See Unicode and Research Libraries Group
Romanization
In linguistics, romanization is the conversion of text from a different writing system to the Roman (Latin) script, or a system for doing so.
Rongorongo
Rongorongo (Rapa Nui: roŋoroŋo) is a system of glyphs discovered in the 19th century on Easter Island that has the appearance of writing or proto-writing.
Roozbeh Pournader
Roozbeh Pournader (روزبه پورنادر) is a free software activist and expert on Unicode text encoding, text rendering, and fonts, especially for bidirectional text.
See Unicode and Roozbeh Pournader
Round-trip format conversion
The term round-trip is used in document conversion particularly involving markup languages such as XML and SGML.
See Unicode and Round-trip format conversion
Ruble sign
The ruble sign, ₽, is the currency sign used for the Russian ruble, the official currency of Russia.
Rune
A rune is a letter in a set of related alphabets known as runic alphabets native to the Germanic peoples.
See Unicode and Rune
Samaritan script
The Samaritan Hebrew script, or simply Samaritan script is used by the Samaritans for religious writings, including the Samaritan Pentateuch, writings in Samaritan Hebrew, and for commentaries and translations in Samaritan Aramaic and occasionally Arabic.
See Unicode and Samaritan script
SAP
SAP SE is a German multinational software company based in Walldorf, Baden-Württemberg.
See Unicode and SAP
Saurashtra script
The Saurashtra script is an abugida script that is used by Saurashtrians of Tamil Nadu to write the Saurashtra language.
See Unicode and Saurashtra script
Scribal abbreviation
Scribal abbreviations, or sigla (singular: siglum), are abbreviations used by ancient and medieval scribes writing in various languages, including Latin, Greek, Old English and Old Norse.
See Unicode and Scribal abbreviation
Script (Unicode)
In Unicode, a script is a collection of letters and other written signs used to represent textual information in one or more writing systems.
See Unicode and Script (Unicode)
Seed7
Seed7 is an extensible general-purpose programming language designed by Thomas Mertes.
Shape context
Shape context is a feature descriptor used in object recognition.
Sharada script
The Śāradā, Sarada or Sharada script is an abugida writing system of the Brahmic family of scripts.
See Unicode and Sharada script
Shavian alphabet
The Shavian alphabet (also known as the Shaw alphabet) is a constructed alphabet conceived as a way to provide simple, phonemic orthography for the English language to replace the inefficiencies and difficulties of conventional spelling using the Latin alphabet.
See Unicode and Shavian alphabet
Shift JIS
Shift JIS (also SJIS, MIME name Shift_JIS, known as PCK in Solaris contexts) is a character encoding for the Japanese language, originally developed by the Japanese company ASCII Corporation in conjunction with Microsoft and standardized as JIS X 0208 Appendix 1.
Sic
The Latin adverb sic (thus, so, and in this manner) inserted after a quotation indicates that the quoted matter has been transcribed or translated as found in the source text, including erroneous, archaic, or unusual spelling, punctuation, and grammar.
See Unicode and Sic
Siddhaṃ script
Siddhaṃ, also known in its later evolved form as Siddhamātṛkā, is a medieval Brahmic abugida, derived from the Gupta script and ancestral to the Nāgarī, Eastern Nagari, Tirhuta, Odia and Nepalese scripts.
See Unicode and Siddhaṃ script
SignWriting
Sutton SignWriting, or simply SignWriting, is a system of writing sign languages.
SIL International
SIL International (formerly known as the Summer Institute of Linguistics) is an evangelical Christian nonprofit organization whose main purpose is to study, develop and document languages, especially those that are lesser-known, in order to expand linguistic knowledge, promote literacy, translate the Christian Bible into local languages, and aid minority language development.
See Unicode and SIL International
Sinhala script
The Sinhala script (Siṁhala Akṣara Mālāva), also known as Sinhalese script, is a writing system used by the Sinhalese people and most Sri Lankans in Sri Lanka and elsewhere to write the Sinhala language as well as the liturgical languages Pali and Sanskrit.
See Unicode and Sinhala script
Sinosphere
The Sinosphere, also known as the Chinese cultural sphere, East Asian cultural sphere, or the Sinic world, encompasses multiple countries in East Asia and Southeast Asia that were historically heavily influenced by Chinese culture.
Sogdian alphabet
The Sogdian alphabet was originally used for the Sogdian language, a language in the Iranian family used by the people of Sogdia.
See Unicode and Sogdian alphabet
Sorang Sompeng script
The Sorang Sompeng script is used to write Sora, a Munda language with 300,000 speakers in India.
See Unicode and Sorang Sompeng script
Soyombo script
The Soyombo script is an abugida developed by the monk and scholar Zanabazar in 1686 to write Mongolian.
See Unicode and Soyombo script
Spacing Modifier Letters
Spacing Modifier Letters is a Unicode block containing characters for the IPA, UPA, and other phonetic transcriptions.
See Unicode and Spacing Modifier Letters
Specials (Unicode block)
Specials is a short Unicode block of characters allocated at the very end of the Basic Multilingual Plane, at U+FFF0–FFFF.
See Unicode and Specials (Unicode block)
Standard Compression Scheme for Unicode
The Standard Compression Scheme for Unicode (SCSU) is a Unicode Technical Standard for reducing the number of bytes needed to represent Unicode text, especially if that text uses mostly characters from one or a small number of per-language character blocks.
See Unicode and Standard Compression Scheme for Unicode
Standardization Administration of China
The Standardization Administration of China (SAC) is an external name of the State Administration for Market Regulation.
See Unicode and Standardization Administration of China
Standards related to Unicode
There are several standards related to Unicode.
See Unicode and Standards related to Unicode
Star (classification)
Star ratings are a type of rating scale using a star glyph or similar typographical symbol.
See Unicode and Star (classification)
Sun Microsystems
Sun Microsystems, Inc. (Sun for short) was an American technology company that sold computers, computer components, software, and information technology services and created the Java programming language, the Solaris operating system, ZFS, the Network File System (NFS), and SPARC microprocessors.
See Unicode and Sun Microsystems
Sundanese script
Standard Sundanese script (Aksara Sunda Baku) is a writing system which is used by the Sundanese people.
See Unicode and Sundanese script
Sunuwar alphabet
The Sunuwar alphabet (previously the Jenticha script, occasionally Kõits script) is an alphabet developed by Krishna Bahadur Jentich in 1942, to write the Sunwar language, a member of the Kiranti language family spoken in Eastern Nepal, as in Sikkim.
See Unicode and Sunuwar alphabet
Superscripts and Subscripts
Superscripts and Subscripts is a Unicode block containing superscript and subscript numerals, mathematical operators, and letters used in mathematics and phonetics.
See Unicode and Superscripts and Subscripts
Sylheti Nagri
Sylheti Nagri or Sylheti Nāgarī (ꠍꠤꠟꠐꠤ ꠘꠣꠉꠞꠤ), known in classical manuscripts as Sylhet Nagri as well as by many other names, is an Indic script of the Brahmic family.
Syllabary
In the linguistic study of written languages, a syllabary is a set of written symbols that represent the syllables or (more frequently) moras which make up words.
Symbols for Legacy Computing
Symbols for Legacy Computing is a Unicode block containing graphic characters that were used for various home computers from the 1970s and 1980s and in Teletext broadcasting standards.
See Unicode and Symbols for Legacy Computing
Syriac alphabet
The Syriac alphabet (ܐܠܦ ܒܝܬ ܣܘܪܝܝܐ) is a writing system primarily used to write the Syriac language since the 1st century AD.
See Unicode and Syriac alphabet
T.51/ISO/IEC 6937
T.51 / ISO/IEC 6937:2001, Information technology — Coded graphic character set for text communication — Latin alphabet, is a multibyte extension of ASCII, or more precisely ISO/IEC 646-IRV. Unicode and T.51/ISO/IEC 6937 both concern character encoding.
See Unicode and T.51/ISO/IEC 6937
Tagbanwa script
Tagbanwa is one of the scripts indigenous to the Philippines, used by the Tagbanwa and the Palawan people as their ethnic writing system.
See Unicode and Tagbanwa script
Tai Tham script
Tai Tham script (Tham meaning "scripture") is an abugida writing system used mainly for a group of Southwestern Tai languages i.e., Northern Thai, Tai Lü, Khün and Lao; as well as the liturgical languages of Buddhism i.e., Pali and Sanskrit.
See Unicode and Tai Tham script
Tai Viet script
The Tai Viet script (Vietnamese: Chữ Thái Việt; Thai: อักษรไทดำ) is a Brahmic script used by the Tai Dam people and various other Thai people in Vietnam and Thailand.
See Unicode and Tai Viet script
Taiwan
Taiwan, officially the Republic of China (ROC), is a country in East Asia.
Takri script
The Tākri script (sometimes called Tankri) is an abugida writing system of the Brahmic family of scripts.
Tamil script
The Tamil script (தமிழ் அரிச்சுவடி) is an abugida script that is used by Tamils and Tamil speakers in India, Sri Lanka, Malaysia, Singapore, Indonesia and elsewhere to write the Tamil language.
Tangsa language
Tangsa, also known as Tase and Tase Naga, is a Sino-Tibetan language or language cluster spoken by the Tangsa people of Burma and north-eastern India.
See Unicode and Tangsa language
Tangut script
The Tangut script was a logographic writing system, used for writing the extinct Tangut language of the Western Xia dynasty.
Tatsuo Kobayashi
Tatsuo Kobayashi is a Japanese web architect who specializes in international standardization.
See Unicode and Tatsuo Kobayashi
Telugu script
Telugu script (Telugu lipi), an abugida from the Brahmic family of scripts, is used to write the Telugu language, a Dravidian language spoken in the Indian states of Andhra Pradesh and Telangana as well as several other neighbouring states.
Tengwar
The Tengwar script is an artificial script, one of several scripts created by J. R. R. Tolkien, the author of The Lord of the Rings.
Thaana
Thaana, Tãnaa, Taana or Tāna is the present writing system of the Maldivian language spoken in the Maldives.
Thai Industrial Standard 620-2533
Thai Industrial Standard 620-2533, commonly referred to as TIS-620, is the most common character set and character encoding for the Thai language.
See Unicode and Thai Industrial Standard 620-2533
Thai script
The Thai script (อักษรไทย) is the abugida used to write Thai, Southern Thai and many other languages spoken in Thailand.
Tibetan script
The Tibetan script is a segmental writing system, or abugida, derived from Brahmic scripts and the Gupta script, and used to write certain Tibetic languages, including Tibetan, Dzongkha, Sikkimese, Ladakhi, Jirel and Balti.
See Unicode and Tibetan script
Tifinagh
Tifinagh (Berber Latin alphabet: Tifinaɣ) is a script used to write the Berber languages.
Tirhuta script
The Tirhuta or Maithili script was the primary historical script for the Maithili language, as well as one of the historical scripts for Sanskrit.
See Unicode and Tirhuta script
Tittle
The tittle or superscript dot is the dot on top of lowercase i and j. The tittle is an integral part of these glyphs, but diacritic dots can appear over other letters in various languages.
Todhri alphabet
The Todhri alphabet is an 18th-century Albanian alphabetical writing system invented for writing the Albanian language by Theodhor Haxhifilipi, also known as Dhaskal Todhri.
See Unicode and Todhri alphabet
Toto language
Toto (Bengali: তোতো, Toto) is a Sino-Tibetan language spoken by the tribal Toto people in Totopara, West Bengal, on the border of India and Bhutan.
Trojan Source
Trojan Source is the name of a software vulnerability that abuses Unicode's bidirectional characters to display source code differently than the actual execution of the source code.
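One common mitigation is to scan source text for bidirectional control characters before review. The sketch below is a simplified, hypothetical checker (the names BIDI_CONTROLS and find_bidi_controls are assumptions) that flags the embedding, override and isolate controls; it is not a complete defence.

```python
import unicodedata

# Bidirectional formatting controls: embeddings, overrides, isolates
# and their terminators.
BIDI_CONTROLS = {
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",
    "\u2066", "\u2067", "\u2068", "\u2069",
}


def find_bidi_controls(source: str) -> list[tuple[int, str]]:
    """Return (index, character name) for each bidi control in the text."""
    return [
        (i, unicodedata.name(ch))
        for i, ch in enumerate(source)
        if ch in BIDI_CONTROLS
    ]


# A sample line with a right-to-left override hidden inside a string literal.
sample = 'role = "user\u202e"'
print(find_bidi_controls(sample))  # [(12, 'RIGHT-TO-LEFT OVERRIDE')]
```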
TRON (encoding)
TRON Code is a multi-byte character encoding used in the TRON project.
See Unicode and TRON (encoding)
TrueType
TrueType is an outline font standard developed by Apple in the late 1980s as a competitor to Adobe's Type 1 fonts used in PostScript. Unicode and TrueType both concern digital typography.
Turkish alphabet
The Turkish alphabet (Türk alfabesi) is a Latin-script alphabet used for writing the Turkish language, consisting of 29 letters, seven of which (Ç, Ğ, I, İ, Ö, Ş and Ü) have been modified from their Latin originals for the phonetic requirements of the language.
See Unicode and Turkish alphabet
Turkish lira sign
The Turkish lira sign (symbol: ₺) is the currency symbol used for the Turkish lira, the official currency of Turkey and Northern Cyprus.
See Unicode and Turkish lira sign
Typeface
A typeface (or font family) is a design of letters, numbers and other symbols, to be used in printing or for electronic display.
Ugaritic alphabet
The Ugaritic writing system is a cuneiform abjad (consonantal alphabet) with syllabic elements used from around either 1400 BCE or 1300 BCE for Ugaritic, an extinct Northwest Semitic language.
See Unicode and Ugaritic alphabet
Unicode
Unicode, formally The Unicode Standard, is a text encoding standard maintained by the Unicode Consortium designed to support the use of text in all of the world's writing systems that can be digitized.
Unicode alias names and abbreviations
In Unicode, characters can have a unique name.
See Unicode and Unicode alias names and abbreviations
Unicode block
A Unicode block is one of several contiguous ranges of numeric character codes (code points) of the Unicode character set that are defined by the Unicode Consortium for administrative and documentation purposes.
Unicode collation algorithm
The Unicode collation algorithm (UCA) is an algorithm defined in Unicode Technical Report #10, which is a customizable method to produce binary keys from strings representing text in any writing system and language that can be represented with Unicode.
See Unicode and Unicode collation algorithm
Unicode Consortium
The Unicode Consortium (legally Unicode, Inc.) is a 501(c)(3) non-profit organization incorporated and based in Mountain View, California, U.S. Its primary purpose is to maintain and publish the Unicode Standard which was developed with the intention of replacing existing character encoding schemes that are limited in size and scope, and are incompatible with multilingual environments.
See Unicode and Unicode Consortium
Unicode control characters
Many Unicode characters are used to control the interpretation or display of text, but these characters themselves have no visual or spatial representation.
See Unicode and Unicode control characters
Unicode equivalence
Unicode equivalence is the specification by the Unicode character encoding standard that some sequences of code points represent essentially the same character.
See Unicode and Unicode equivalence
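The standard distinguishes canonical equivalence from the looser compatibility equivalence. A minimal sketch with Python's unicodedata module compares strings after normalizing both sides; the two helper names are hypothetical.

```python
import unicodedata


def canonically_equivalent(a: str, b: str) -> bool:
    """Compare two strings after canonical decomposition (NFD)."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


def compatibility_equivalent(a: str, b: str) -> bool:
    """Compare two strings after compatibility decomposition (NFKD)."""
    return unicodedata.normalize("NFKD", a) == unicodedata.normalize("NFKD", b)


# U+212B ANGSTROM SIGN and U+00C5 LATIN CAPITAL LETTER A WITH RING ABOVE
# are canonically equivalent.
print(canonically_equivalent("\u212b", "\u00c5"))   # True

# The ligature U+FB01 matches "fi" only under compatibility equivalence.
print(canonically_equivalent("\ufb01", "fi"))       # False
print(compatibility_equivalent("\ufb01", "fi"))     # True
```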
Unicode symbol
In computing, a Unicode symbol is a Unicode character which is not part of a script used to write a natural language, but is nonetheless available for use as part of a text.
See Unicode and Unicode symbol
Uniform Resource Identifier
A Uniform Resource Identifier (URI), formerly Universal Resource Identifier, is a unique sequence of characters that identifies an abstract or physical resource, such as resources on a webpage, mail addresses, phone numbers, books, real-world objects such as people and places, and concepts.
See Unicode and Uniform Resource Identifier
Uniscribe
Uniscribe is the Microsoft Windows set of services for rendering Unicode-encoded text, supporting complex text layout.
United States
The United States of America (USA or U.S.A.), commonly known as the United States (US or U.S.) or America, is a country primarily located in North America.
Universal Character Set characters
The Unicode Consortium and the ISO/IEC JTC 1/SC 2/WG 2 jointly collaborate on the list of the characters in the Universal Coded Character Set.
See Unicode and Universal Character Set characters
Universal Coded Character Set
The Universal Coded Character Set (UCS, Unicode) is a standard set of characters defined by the international standard ISO/IEC 10646, Information technology — Universal Coded Character Set (UCS) (plus amendments to that standard), which is the basis of many character encodings, improving as characters from previously unrepresented typing systems are added.
See Unicode and Universal Coded Character Set
University of California, Berkeley
The University of California, Berkeley (UC Berkeley, Berkeley, Cal, or California) is a public land-grant research university in Berkeley, California.
See Unicode and University of California, Berkeley
University of Cambridge
The University of Cambridge is a public collegiate research university in Cambridge, England.
See Unicode and University of Cambridge
University of Edinburgh
The University of Edinburgh (Scots: University o Edinburgh; Scottish Gaelic: Oilthigh Dhùn Èideann; abbreviated as Edin. in post-nominals) is a public research university based in Edinburgh, Scotland.
See Unicode and University of Edinburgh
Unix-like
A Unix-like (sometimes referred to as UN*X or *nix) operating system is one that behaves in a manner similar to a Unix system, although not necessarily conforming to or being certified to any version of the Single UNIX Specification.
URL
A uniform resource locator (URL), colloquially known as an address on the Web, is a reference to a resource that specifies its location on a computer network and a mechanism for retrieving it.
See Unicode and URL
UTF-1
UTF-1 is a method of transforming ISO/IEC 10646/Unicode into a stream of bytes.
UTF-16
UTF-16 (16-bit Unicode Transformation Format) is a character encoding capable of encoding all 1,112,064 valid code points of Unicode (in fact this number of code points is dictated by the design of UTF-16). Unicode and UTF-16 both concern character encoding.
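The figure 1,112,064 follows from 17 planes of 65,536 code points minus the 2,048 surrogate code points that UTF-16 reserves for pairs. The sketch below shows that arithmetic and the surrogate-pair calculation for a code point above U+FFFF; the function name to_surrogate_pair is hypothetical.

```python
def to_surrogate_pair(code_point: int) -> tuple[int, int]:
    """Split a supplementary code point into UTF-16 high and low surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF need a surrogate pair")
    offset = code_point - 0x10000        # 20-bit value
    high = 0xD800 + (offset >> 10)       # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits
    return high, low


high, low = to_surrogate_pair(0x1F600)
print(f"{high:04X} {low:04X}")  # D83D DE00

# 17 planes of 2**16 code points, minus the 2,048 surrogates.
valid_code_points = 17 * 2**16 - 0x800
print(valid_code_points)  # 1112064
```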
UTF-32
UTF-32 (32-bit Unicode Transformation Format) is a fixed-length encoding used to encode Unicode code points that uses exactly 32 bits (four bytes) per code point (but a number of leading bits must be zero as there are far fewer than 2^32 Unicode code points, needing actually only 21 bits). Unicode and UTF-32 both concern character encoding.
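A brief check in Python, with arbitrarily chosen sample characters, shows the fixed width and the 21-bit upper bound.

```python
# Sample characters from the ASCII range, the rest of the BMP,
# and a supplementary plane.
samples = ["A", "\u20ac", "\U0001F600"]

for ch in samples:
    encoded = ch.encode("utf-32-be")  # big-endian, no byte order mark
    print(f"U+{ord(ch):04X}", encoded.hex(), len(encoded))
# U+0041 00000041 4
# U+20AC 000020ac 4
# U+1F600 0001f600 4

# The largest code point, U+10FFFF, fits in 21 bits.
max_bits = (0x10FFFF).bit_length()
print(max_bits)  # 21
```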
UTF-7
UTF-7 (7-bit Unicode Transformation Format) is an obsolete variable-length character encoding for representing Unicode text using a stream of ASCII characters. Unicode and UTF-7 both concern character encoding.
UTF-8
UTF-8 is a variable-length character encoding standard used for electronic communication. Unicode and UTF-8 both concern character encoding.
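To make "variable-length" concrete, the following sketch encodes four sample characters (chosen as assumptions for illustration) and reports how many bytes each one needs.

```python
# One sample character for each possible UTF-8 length.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

# Map each code point to the number of bytes in its UTF-8 encoding.
byte_lengths = {ord(ch): len(ch.encode("utf-8")) for ch in samples}

for cp, n in byte_lengths.items():
    print(f"U+{cp:04X} -> {n} byte(s)")
# U+0041 -> 1 byte(s)
# U+00E9 -> 2 byte(s)
# U+20AC -> 3 byte(s)
# U+1F600 -> 4 byte(s)
```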
UTF-EBCDIC
UTF-EBCDIC is a character encoding capable of encoding all 1,112,064 valid character code points in Unicode using 1 to 5 bytes (in contrast to a maximum of 4 for UTF-8). Unicode and UTF-EBCDIC both concern character encoding.
Vai syllabary
The Vai syllabary is a syllabic writing system devised for the Vai language by Momolu Duwalu Bukele of Jondu, in what is now Grand Cape Mount County, Liberia.
Variant Chinese characters
Chinese characters may have several variant forms—visually distinct glyphs that represent the same underlying meaning and pronunciation.
See Unicode and Variant Chinese characters
Variant form (Unicode)
A variant form is an alternate glyph for a character, encoded in Unicode through the mechanism of variation sequences: sequences in Unicode that consist of a base character followed by a variation selector character.
See Unicode and Variant form (Unicode)
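As a small illustration of a variation sequence, the sketch below appends VARIATION SELECTOR-15 (text presentation) or VARIATION SELECTOR-16 (emoji presentation) to U+2764 HEAVY BLACK HEART. The helper is_variation_selector is hypothetical and covers only the block U+FE00 to U+FE0F; how each sequence is drawn depends on the font and renderer.

```python
import unicodedata

HEART = "\u2764"  # HEAVY BLACK HEART

# A variation sequence is the base character followed by a selector.
text_style = HEART + "\ufe0e"   # requests text presentation
emoji_style = HEART + "\ufe0f"  # requests emoji presentation


def is_variation_selector(ch: str) -> bool:
    """Return True for the selectors in the block U+FE00 to U+FE0F."""
    return 0xFE00 <= ord(ch) <= 0xFE0F


print(unicodedata.name("\ufe0f"))              # VARIATION SELECTOR-16
print(len(emoji_style))                        # 2
print(is_variation_selector(emoji_style[1]))   # True
```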
Variation Selectors (Unicode block)
Variation Selectors is a Unicode block containing 16 variation selectors used to specify a glyph variant for a preceding character.
See Unicode and Variation Selectors (Unicode block)
Vedic Sanskrit
Vedic Sanskrit, also simply referred as the Vedic language, is an ancient language of the Indo-Aryan subgroup of the Indo-European language family.
See Unicode and Vedic Sanskrit
Vithkuqi alphabet
The Vithkuqi alphabet, also called Büthakukye or Beitha Kukju after the appellation applied to it by German Albanologist Johann Georg von Hahn, was an alphabetic script invented for writing the Albanian language between 1825 and 1845 by Albanian scholar Naum Veqilharxhi.
See Unicode and Vithkuqi alphabet
Wancho script
Wancho script is an alphabet created between 2001 and 2012 by middle school teacher Banwang Losu in Longding district, Arunachal Pradesh for writing the Wancho language.
Warang Citi
Warang Citi (also written Varang Kshiti or Barang Kshiti; IPA: /wɐrɐŋ ʧɪt̪ɪ/) is a writing system invented by Lako Bodra for the Ho language spoken in East India.
Web browser
A web browser is an application for accessing websites.
Web Open Font Format
The Web Open Font Format (WOFF) is a font format for use in web pages. Unicode and Web Open Font Format both concern digital typography.
See Unicode and Web Open Font Format
Web page
A web page (or webpage) is a document on the Web that is accessed in a web browser.
Wide character
A wide character is a computer character datatype that generally has a size greater than the traditional 8-bit character. Unicode and wide character both concern character encoding.
See Unicode and Wide character
Windows 10
Windows 10 is a major release of Microsoft's Windows NT operating system.
Windows 11
Windows 11 is the latest major release of Microsoft's Windows NT operating system, released on October 5, 2021.
Windows 2000
Windows 2000 is a major release of the Windows NT operating system developed by Microsoft and oriented towards businesses.
Windows 7
Windows 7 is a major release of the Windows NT operating system developed by Microsoft.
Windows 8
Windows 8 is a major release of the Windows NT operating system developed by Microsoft.
Windows 9x
Windows 9x is a generic term referring to a series of Microsoft Windows computer operating systems produced from 1995 to 2000, which were based on the Windows 95 kernel and its underlying foundation of MS-DOS, both of which were updated in subsequent versions.
Windows Glyph List 4
Windows Glyph List 4, or more commonly WGL4 for short, also known as the Pan-European character set, is a character repertoire on Microsoft operating systems comprising 657 Unicode characters, two of them private use. Unicode and Windows Glyph List 4 both concern character encoding and digital typography.
See Unicode and Windows Glyph List 4
Windows NT
Windows NT is a proprietary graphical operating system produced by Microsoft as part of its Windows product line, the first version of which, Windows NT 3.1, was released on July 27, 1993.
Windows NT 4.0
Windows NT 4.0 is a major release of the Windows NT operating system developed by Microsoft and oriented towards businesses.
See Unicode and Windows NT 4.0
Windows Vista
Windows Vista is a major release of the Windows NT operating system developed by Microsoft.
Windows XP
Windows XP is a major release of Microsoft's Windows NT operating system.
Windows-1252
Windows-1252 or CP-1252 (Windows code page 1252) is a legacy single-byte character encoding that is used by default (as the "ANSI code page") in Microsoft Windows throughout the Americas, Western Europe, Oceania, and much of Africa.
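Because Windows-1252 maps every byte to one character, UTF-8 text read with the wrong codec turns into mojibake instead of raising an error. The sketch below uses Python's built-in cp1252 codec; the sample character is an assumption.

```python
# A single byte maps to a single character: 0x80 is the euro sign.
euro = b"\x80".decode("cp1252")
print(euro)  # €

# UTF-8 bytes misread as Windows-1252 produce mojibake.
original = "\u00e9"  # é, two bytes in UTF-8
mojibake = original.encode("utf-8").decode("cp1252")
print(mojibake)  # Ã©

# Reversing the wrong step restores the text when no byte was lost.
repaired = mojibake.encode("cp1252").decode("utf-8")
print(repaired == original)  # True
```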
Wireless
Wireless communication (or just wireless, when the context allows) is the transfer of information (telecommunication) between two or more points without the use of an electrical conductor, optical fiber or other continuous guided medium for the transfer.
Wolof language
Wolof (Wolof làkk, وࣷلࣷفْ لࣵکّ) is a Niger–Congo language spoken by the Wolof people in much of the West African subregion of Senegambia, which is split between the countries of Senegal, Mauritania, and the Gambia.
See Unicode and Wolof language
Word processor
A word processor (WP) is a device or computer program that provides for input, editing, formatting, and output of text, often with some additional features.
See Unicode and Word processor
World Wide Web
The World Wide Web (WWW or simply the Web) is an information system that enables content sharing over the Internet through user-friendly ways meant to appeal to users beyond IT specialists and hobbyists.
See Unicode and World Wide Web
World Wide Web Consortium
The World Wide Web Consortium (W3C) is the main international standards organization for the World Wide Web.
See Unicode and World Wide Web Consortium
Writing system
A writing system comprises a particular set of symbols, called a script, as well as the rules by which the script represents a particular language.
See Unicode and Writing system
Xerox
Xerox Holdings Corporation is an American corporation that sells print and digital document products and services in more than 160 countries.
Xerox Character Code Standard
The Xerox Character Code Standard (XCCS) is a historical 16-bit character encoding that was created by Xerox in 1980 for the exchange of information between elements of the Xerox Network Systems Architecture. Unicode and Xerox Character Code Standard both concern character encoding.
See Unicode and Xerox Character Code Standard
XHTML
Extensible HyperText Markup Language (XHTML) is part of the family of XML markup languages which mirrors or extends versions of the widely used HyperText Markup Language (HTML), the language in which Web pages are formulated.
Xiangqi
Xiangqi, commonly known as Chinese chess or elephant chess, is a strategy board game for two players.
XML
Extensible Markup Language (XML) is a markup language and file format for storing, transmitting, and reconstructing arbitrary data.
See Unicode and XML
Yahoo! Mail
Yahoo! Mail (also written as Yahoo Mail) is an email service offered by the American company Yahoo, Inc. The service is free for personal use, with an optional monthly fee for additional features.
Yi script
The Yi scripts (Yi: ꆈꌠꁱꂷ nuosu bburma) are two scripts used to write the Yi languages; Classical Yi (an ideogram script), and the later Yi syllabary.
Zanabazar square script
Zanabazar's square script is a horizontal Mongolian square script (Hevtee Dörvöljin bichig or label), an abugida developed by the monk and scholar Zanabazar based on the Tibetan alphabet to write Mongolian.
See Unicode and Zanabazar square script
Znamenny chant
Znamenny Chant (знаменное пение, знаменный распев) is a singing tradition used by some in the Russian Eastern Orthodox Church.
See Unicode and Znamenny chant
.NET Framework
The .NET Framework (pronounced as "dot net") is a proprietary software framework developed by Microsoft that runs primarily on Microsoft Windows.
See Unicode and .NET Framework
16-bit computing
16-bit microcomputers are microcomputers that use 16-bit microprocessors.
See Unicode and 16-bit computing
32-bit computing
In computer architecture, 32-bit computing refers to computer systems with a processor, memory, and other major system components that operate on data in 32-bit units.
See Unicode and 32-bit computing
References
Also known as Brakcet, Bulldog Award, History of unicode, MES-1, MES-2, MES-3, Multilingual European subset, Multilingual European subsets, Script Encoding Initiative, The Unicode Bulldog Award, The Unicode Standard, U+, Uni-code, Unicode 1, Unicode 1.0, Unicode 1.0.0, Unicode 1.0.1, Unicode 1.1, Unicode 1.1.0, Unicode 1.1.5, Unicode 10, Unicode 10.0, Unicode 10.0.0, Unicode 11, Unicode 11.0, Unicode 11.0.0, Unicode 12, Unicode 12.0, Unicode 12.0.0, Unicode 12.1, Unicode 12.1.0, Unicode 13, Unicode 13.0, Unicode 13.0.0, Unicode 14, Unicode 14.0, Unicode 14.0.0, Unicode 15, Unicode 15.0, Unicode 15.0.0, Unicode 2, Unicode 2.0, Unicode 2.0.0, Unicode 2.1, Unicode 2.1.0, Unicode 2.1.2, Unicode 2.1.5, Unicode 2.1.8, Unicode 2.1.9, Unicode 3, Unicode 3.0, Unicode 3.0.0, Unicode 3.0.1, Unicode 3.1, Unicode 3.1.0, Unicode 3.1.1, Unicode 3.2, Unicode 3.2.0, Unicode 4, Unicode 4.0, Unicode 4.0.0, Unicode 4.0.1, Unicode 4.1, Unicode 4.1.0, Unicode 5, Unicode 5.0, Unicode 5.0.0, Unicode 5.1, Unicode 5.1.0, Unicode 5.2, Unicode 5.2.0, Unicode 6, Unicode 6.0, Unicode 6.0.0, Unicode 6.1, Unicode 6.1.0, Unicode 6.2, Unicode 6.2.0, Unicode 6.3, Unicode 6.3.0, Unicode 7, Unicode 7.0, Unicode 7.0.0, Unicode 8, Unicode 8.0, Unicode 8.0.0, Unicode 88, Unicode 9, Unicode 9.0, Unicode 9.0.0, Unicode Bulldog Award, Unicode Character Set, Unicode Pipeline, Unicode Standard, Unicode Transformation Format, Unicode Transformation Formats, Unicode Version History, Unicode Versions, Unicode alias, Unicode anomaly, Unicode code point, Unicode code points, Unicode codepoint, Unicode notation, Unicode roadmap, Unicode.org, Yunicode.
Nabataean script, Nandinagari, Natural language processing, Nüshu, Netflix, New Tai Lue alphabet, Newline, NeXT, Number Forms, Numidian language, Nyiakeng Puachue Hmong, Odia script, Ogham, Ogonek, Ol Chiki script, Ol Onal, Old Hungarian script, Old Italic scripts, Old Permic script, Old Persian cuneiform, Old Turkic script, Old Uyghur alphabet, Open-source Unicode typefaces, OpenType, Osage script, Osmanya alphabet, Outlook.com, Pahawh Hmong, Pali, Palmyrene alphabet, Pango, PARC (company), Pau Cin Hau script, Percent-encoding, Phaistos Disc, Philippines, Phoenician alphabet, Plan 9 from Bell Labs, Plane (Unicode), Playing card, Pollard script, Pracalit script, Precomposed character, Private Use Areas, Proof of concept, Psalter Pahlavi, Punjabi language, Punycode, Python (programming language), Quoted-printable, Reiwa era, Rejang alphabet, Religious and political symbols in Unicode, Research Libraries Group, Romanization, Rongorongo, Roozbeh Pournader, Round-trip format conversion, Ruble sign, Rune, Samaritan script, SAP, Saurashtra script, Scribal abbreviation, Script (Unicode), Seed7, Shape context, Sharada script, Shavian alphabet, Shift JIS, Sic, Siddhaṃ script, SignWriting, SIL International, Sinhala script, Sinosphere, Sogdian alphabet, Sorang Sompeng script, Soyombo script, Spacing Modifier Letters, Specials (Unicode block), Standard Compression Scheme for Unicode, Standardization Administration of China, Standards related to Unicode, Star (classification), Sun Microsystems, Sundanese script, Sunuwar alphabet, Superscripts and Subscripts, Sylheti Nagri, Syllabary, Symbols for Legacy Computing, Syriac alphabet, T.51/ISO/IEC 6937, Tagbanwa script, Tai Tham script, Tai Viet script, Taiwan, Takri script, Tamil script, Tangsa language, Tangut script, Tatsuo Kobayashi, Telugu script, Tengwar, Thaana, Thai Industrial Standard 620-2533, Thai script, Tibetan script, Tifinagh, Tirhuta script, Tittle, Todhri alphabet, Toto language, Trojan Source, TRON (encoding), 
TrueType, Turkish alphabet, Turkish lira sign, Typeface, Ugaritic alphabet, Unicode, Unicode alias names and abbreviations, Unicode block, Unicode collation algorithm, Unicode Consortium, Unicode control characters, Unicode equivalence, Unicode symbol, Uniform Resource Identifier, Uniscribe, United States, Universal Character Set characters, Universal Coded Character Set, University of California, Berkeley, University of Cambridge, University of Edinburgh, Unix-like, URL, UTF-1, UTF-16, UTF-32, UTF-7, UTF-8, UTF-EBCDIC, Vai syllabary, Variant Chinese characters, Variant form (Unicode), Variation Selectors (Unicode block), Vedic Sanskrit, Vithkuqi alphabet, Wancho script, Warang Citi, Web browser, Web Open Font Format, Web page, Wide character, Windows 10, Windows 11, Windows 2000, Windows 7, Windows 8, Windows 9x, Windows Glyph List 4, Windows NT, Windows NT 4.0, Windows Vista, Windows XP, Windows-1252, Wireless, Wolof language, Word processor, World Wide Web, World Wide Web Consortium, Writing system, Xerox, Xerox Character Code Standard, XHTML, Xiangqi, XML, Yahoo! Mail, Yi script, Zanabazar square script, Znamenny chant, .NET Framework, 16-bit computing, 32-bit computing.