15 relations: Byte order mark, Compound document, Endianness, Interlinear gloss, International Committee for Information Technology Standards, ISO/IEC 8859-1, ISO/IEC JTC 1/SC 2, Mojibake, Plane (Unicode), Text file, Unicode, Unicode Consortium, Unicode control characters, UTF-8, Windows-1252.
The byte order mark (BOM) is a Unicode character, U+FEFF, whose appearance as a magic number at the start of a text stream can signal several things to a program consuming the text.
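As a rough illustration of how a program might use the BOM as a magic number, the sketch below checks the first bytes of a stream against the BOM constants in Python's standard `codecs` module. The helper name `detect_bom` and the returned labels are assumptions made for this example, not part of any standard.

```python
import codecs

# Longer signatures are listed first: the UTF-32 LE mark (FF FE 00 00)
# begins with the same two bytes as the UTF-16 LE mark (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data: bytes):
    """Return (encoding label, BOM length), or (None, 0) if no BOM is found.

    Hypothetical helper for illustration only.
    """
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


name, size = detect_bom(b"\xef\xbb\xbfhello")
print(name, size)
```

A stream without a BOM is not necessarily in any particular encoding, so a caller would still need a fallback for the `(None, 0)` case.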
In computing, a compound document is a document type, typically produced using word processing software, in which regular text is intermingled with non-text elements such as spreadsheets, pictures, digital videos, digital audio, and other multimedia features.
Endianness refers to the sequential order in which bytes are arranged into larger numerical values when stored in memory or when transmitted over digital links.
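A small sketch may make the byte ordering concrete. It serializes one 32-bit value in both orders with Python's built-in `int.to_bytes`, and shows that U+FEFF comes out as different byte pairs under the two UTF-16 byte orders. The variable names are chosen for this example only.

```python
value = 0x12345678

# Big-endian stores the most significant byte first,
# little-endian stores the least significant byte first.
big = value.to_bytes(4, byteorder="big")
little = value.to_bytes(4, byteorder="little")

print(big.hex())
print(little.hex())

# Reading the bytes back requires knowing which order was used.
restored = int.from_bytes(little, byteorder="little")

# The same code point, U+FEFF, in the two UTF-16 byte orders.
bom_be = "\ufeff".encode("utf-16-be")
bom_le = "\ufeff".encode("utf-16-le")
```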
In linguistics and pedagogy, an interlinear gloss is a gloss (series of brief explanations, such as definitions or pronunciations) placed between lines (inter- + linear), such as between a line of original text and its translation into another language.
The InterNational Committee for Information Technology Standards (INCITS), pronounced "insights", is an ANSI-accredited standards development organization composed of information technology developers.
ISO/IEC 8859-1:1998, Information technology — 8-bit single-byte coded graphic character sets — Part 1: Latin alphabet No. 1, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings.
ISO/IEC JTC 1/SC 2 Coded character sets is a standardization subcommittee of the Joint Technical Committee ISO/IEC JTC 1 of the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC) that develops and facilitates standards within the field of coded character sets.
Mojibake (文字化け) is the garbled text that is the result of text being decoded using an unintended character encoding.
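Mojibake is easy to reproduce deliberately. The sketch below encodes a short string as UTF-8 and then decodes the bytes as Windows-1252, one common mismatch. The sample word is arbitrary, and the repair step only works here because every byte involved is defined in Windows-1252.

```python
original = "café"

# Encode correctly, then decode with the wrong encoding.
garbled = original.encode("utf-8").decode("cp1252")
print(garbled)

# Reversing the two steps recovers the text in this particular case.
repaired = garbled.encode("cp1252").decode("utf-8")
print(repaired)
```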
In the Unicode standard, a plane is a contiguous group of 65,536 (2^16) code points.
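Because each plane holds 2^16 code points, the plane number of a code point is simply its value divided by 65,536, discarding the remainder. The helper name `plane_of` below is an assumption made for this sketch.

```python
def plane_of(code_point: int) -> int:
    """Return the plane number (0 to 16) that contains the code point."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    # Shifting right by 16 bits is integer division by 65,536.
    return code_point >> 16


print(plane_of(ord("A")))    # Basic Latin letter
print(plane_of(0xFFFD))      # replacement character
print(plane_of(0x1F600))     # an emoji outside plane 0
```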
A text file (sometimes spelled "textfile"; an old alternative name is "flatfile") is a kind of computer file that is structured as a sequence of lines of electronic text.
Unicode is a computing industry standard for the consistent encoding, representation, and handling of text expressed in most of the world's writing systems.
The Unicode Consortium (Unicode Inc.) is a 501(c)(3) non-profit organization that coordinates the development of the Unicode standard, based in Mountain View, California.
Many Unicode control characters are used to control the interpretation or display of text, but these characters themselves have no visual or spatial representation.
UTF-8 is a variable-width character encoding capable of encoding all 1,112,064 valid code points in Unicode using one to four 8-bit bytes.
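The one-to-four-byte range can be observed directly by encoding a few characters of increasing code point value. The figure 1,112,064 can also be checked with a line of arithmetic: the full range U+0000 to U+10FFFF minus the 2,048 surrogate code points (U+D800 to U+DFFF), which UTF-8 does not encode. The sample characters are arbitrary choices for this sketch.

```python
samples = ["A", "é", "€", "😀"]

# Number of bytes each character occupies in UTF-8.
lengths = [len(ch.encode("utf-8")) for ch in samples]
print(lengths)

# Total code points minus the surrogate range.
total_code_points = 0x10FFFF + 1
surrogates = 0xDFFF - 0xD800 + 1
valid_code_points = total_code_points - surrogates
print(valid_code_points)
```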
Windows-1252 or CP-1252 (code page 1252) is a single-byte character encoding of the Latin alphabet, used by default in the legacy components of Microsoft Windows in English and some other Western languages (other languages use different default encodings).
0xFFFD, Black diamond question mark, Black diamond with white question mark, Black question mark in white diamond, FFFC, Question mark in black diamond, Question mark in red diamond, Red diamond question mark, Red diamond with white question mark, Replacement character, Replacement glyph, Specials Unicode block, U+FFFD, U+FFFE, U+FFFF, Unicode Specials, Unicode replacement character, Unrecognized character, White diamond with black question mark, White question mark in black diamond, White question mark in red diamond, Ï¿½, ￼.
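Several of the names above refer to the replacement character, U+FFFD. As a hedged sketch of where it typically comes from, the snippet below decodes invalid UTF-8 with Python's `errors="replace"` option, and then shows how the character's own UTF-8 bytes turn into a three-character sequence when misread as Windows-1252, which matches the mis-decoded entry near the end of the list apart from the capitalization of its first letter.

```python
REPLACEMENT = "\ufffd"

# 0xFF can never appear in valid UTF-8, so the decoder substitutes U+FFFD.
decoded = b"abc\xffdef".decode("utf-8", errors="replace")
print(decoded)

# U+FFFD is EF BF BD in UTF-8; read as Windows-1252 those bytes
# become three separate Latin characters.
misread = REPLACEMENT.encode("utf-8").decode("cp1252")
print(misread)
```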