86 relations: Abstract Meaning Representation, Ancient Greek, Annotation, Arabic, Blood bank, Bulgarian language, Case grammar, Catalan language, Chinese language, Classical Arabic, Classical Armenian, Combinatory categorial grammar, Computational linguistics, Corpus linguistics, Creative Commons license, Croatian language, Czech language, Danish language, Dependency grammar, Discourse representation theory, Dutch language, Empirical evidence, English language, Estonian language, Finnish language, Formal semantics (linguistics), FrameNet, French language, Geoffrey Leech, German language, GNU General Public License, GNU Lesser General Public License, Gothic language, Greek language, Head-driven phrase structure grammar, Hebrew language, Hindi, History of English, History of French, History of Portuguese, Hungarian language, Icelandic language, Italian language, Japanese language, Kais Dukes, Korean language, Latin, Lexical functional grammar, Linguistic Data Consortium, Linguistics, ..., Logical form, Norwegian language, Old Church Slavonic, Old East Slavic, Parsing, Part-of-speech tagging, Persian language, Phrase structure grammar, Phrase structure rules, Polish language, Portuguese language, PropBank, Psycholinguistics, Quranic Arabic Corpus, Romanian language, Russian language, Russian National Corpus, Seed bank, Semantics, Sentence (linguistics), Slovene language, Spanish language, Swedish language, Syntax, Tatoeba, Text corpus, Thai language, Theoretical linguistics, Tree structure, Turkish language, Ukrainian language, Universal Conceptual Cognitive Annotation, University of Groningen, Urdu, Vietnamese language, XML. Expand index (36 more) »
Abstract Meaning Representation
Abstract Meaning Representation (AMR) is a semantic representation language.
New!!: Treebank and Abstract Meaning Representation · See more »
Ancient Greek
The Ancient Greek language includes the forms of Greek used in ancient Greece and the ancient world from around the 9th century BC to the 6th century AD.
New!!: Treebank and Ancient Greek · See more »
Annotation
An annotation is a metadatum (e.g. a post, explanation, markup) attached to location or other data.
New!!: Treebank and Annotation · See more »
Arabic
Arabic (العَرَبِيَّة) or (عَرَبِيّ) or) is a Central Semitic language that first emerged in Iron Age northwestern Arabia and is now the lingua franca of the Arab world. It is named after the Arabs, a term initially used to describe peoples living from Mesopotamia in the east to the Anti-Lebanon mountains in the west, in northwestern Arabia, and in the Sinai peninsula. Arabic is classified as a macrolanguage comprising 30 modern varieties, including its standard form, Modern Standard Arabic, which is derived from Classical Arabic. As the modern written language, Modern Standard Arabic is widely taught in schools and universities, and is used to varying degrees in workplaces, government, and the media. The two formal varieties are grouped together as Literary Arabic (fuṣḥā), which is the official language of 26 states and the liturgical language of Islam. Modern Standard Arabic largely follows the grammatical standards of Classical Arabic and uses much of the same vocabulary. However, it has discarded some grammatical constructions and vocabulary that no longer have any counterpart in the spoken varieties, and has adopted certain new constructions and vocabulary from the spoken varieties. Much of the new vocabulary is used to denote concepts that have arisen in the post-classical era, especially in modern times. During the Middle Ages, Literary Arabic was a major vehicle of culture in Europe, especially in science, mathematics and philosophy. As a result, many European languages have also borrowed many words from it. Arabic influence, mainly in vocabulary, is seen in European languages, mainly Spanish and to a lesser extent Portuguese, Valencian and Catalan, owing to both the proximity of Christian European and Muslim Arab civilizations and 800 years of Arabic culture and language in the Iberian Peninsula, referred to in Arabic as al-Andalus. Sicilian has about 500 Arabic words as result of Sicily being progressively conquered by Arabs from North Africa, from the mid 9th to mid 10th centuries. Many of these words relate to agriculture and related activities (Hull and Ruffino). Balkan languages, including Greek and Bulgarian, have also acquired a significant number of Arabic words through contact with Ottoman Turkish. Arabic has influenced many languages around the globe throughout its history. Some of the most influenced languages are Persian, Turkish, Spanish, Urdu, Kashmiri, Kurdish, Bosnian, Kazakh, Bengali, Hindi, Malay, Maldivian, Indonesian, Pashto, Punjabi, Tagalog, Sindhi, and Hausa, and some languages in parts of Africa. Conversely, Arabic has borrowed words from other languages, including Greek and Persian in medieval times, and contemporary European languages such as English and French in modern times. Classical Arabic is the liturgical language of 1.8 billion Muslims and Modern Standard Arabic is one of six official languages of the United Nations. All varieties of Arabic combined are spoken by perhaps as many as 422 million speakers (native and non-native) in the Arab world, making it the fifth most spoken language in the world. Arabic is written with the Arabic alphabet, which is an abjad script and is written from right to left, although the spoken varieties are sometimes written in ASCII Latin from left to right with no standardized orthography.
New!!: Treebank and Arabic · See more »
Blood bank
A blood bank is a center where blood gathered as a result of blood donation is stored and preserved for later use in blood transfusion.
New!!: Treebank and Blood bank · See more »
Bulgarian language
No description.
New!!: Treebank and Bulgarian language · See more »
Case grammar
Case grammar is a system of linguistic analysis, focusing on the link between the valence, or number of subjects, objects, etc., of a verb and the grammatical context it requires.
New!!: Treebank and Case grammar · See more »
Catalan language
Catalan (autonym: català) is a Western Romance language derived from Vulgar Latin and named after the medieval Principality of Catalonia, in northeastern modern Spain.
New!!: Treebank and Catalan language · See more »
Chinese language
Chinese is a group of related, but in many cases mutually unintelligible, language varieties, forming a branch of the Sino-Tibetan language family.
New!!: Treebank and Chinese language · See more »
Classical Arabic
Classical Arabic is the form of the Arabic language used in Umayyad and Abbasid literary texts from the 7th century AD to the 9th century AD.
New!!: Treebank and Classical Arabic · See more »
Classical Armenian
Classical Armenian (grabar, Western Armenian krapar, meaning "literary "; also Old Armenian or Liturgical Armenian) is the oldest attested form of the Armenian language.
New!!: Treebank and Classical Armenian · See more »
Combinatory categorial grammar
Combinatory categorial grammar (CCG) is an efficiently parsable, yet linguistically expressive grammar formalism.
New!!: Treebank and Combinatory categorial grammar · See more »
Computational linguistics
Computational linguistics is an interdisciplinary field concerned with the statistical or rule-based modeling of natural language from a computational perspective, as well as the study of appropriate computational approaches to linguistic questions.
New!!: Treebank and Computational linguistics · See more »
Corpus linguistics
Corpus linguistics is the study of language as expressed in corpora (bodies) of "real world" text.
New!!: Treebank and Corpus linguistics · See more »
Creative Commons license
A Creative Commons (CC) license is one of several public copyright licenses that enable the free distribution of an otherwise copyrighted work.
New!!: Treebank and Creative Commons license · See more »
Croatian language
Croatian (hrvatski) is the standardized variety of the Serbo-Croatian language used by Croats, principally in Croatia, Bosnia and Herzegovina, the Serbian province of Vojvodina and other neighboring countries.
New!!: Treebank and Croatian language · See more »
Czech language
Czech (čeština), historically also Bohemian (lingua Bohemica in Latin), is a West Slavic language of the Czech–Slovak group.
New!!: Treebank and Czech language · See more »
Danish language
Danish (dansk, dansk sprog) is a North Germanic language spoken by around six million people, principally in Denmark and in the region of Southern Schleswig in northern Germany, where it has minority language status.
New!!: Treebank and Danish language · See more »
Dependency grammar
Dependency grammar (DG) is a class of modern grammatical theories that are all based on the dependency relation (as opposed to the constituency relation) and that can be traced back primarily to the work of Lucien Tesnière.
New!!: Treebank and Dependency grammar · See more »
Discourse representation theory
In formal linguistics, discourse representation theory (DRT) is a framework for exploring meaning under a formal semantics approach.
New!!: Treebank and Discourse representation theory · See more »
Dutch language
The Dutch language is a West Germanic language, spoken by around 23 million people as a first language (including the population of the Netherlands where it is the official language, and about sixty percent of Belgium where it is one of the three official languages) and by another 5 million as a second language.
New!!: Treebank and Dutch language · See more »
Empirical evidence
Empirical evidence, also known as sensory experience, is the information received by means of the senses, particularly by observation and documentation of patterns and behavior through experimentation.
New!!: Treebank and Empirical evidence · See more »
English language
English is a West Germanic language that was first spoken in early medieval England and is now a global lingua franca.
New!!: Treebank and English language · See more »
Estonian language
Estonian (eesti keel) is the official language of Estonia, spoken natively by about 1.1 million people: 922,000 people in Estonia and 160,000 outside Estonia.
New!!: Treebank and Estonian language · See more »
Finnish language
Finnish (or suomen kieli) is a Finnic language spoken by the majority of the population in Finland and by ethnic Finns outside Finland.
New!!: Treebank and Finnish language · See more »
Formal semantics (linguistics)
In linguistics, formal semantics seeks to understand linguistic meaning by constructing precise mathematical models of the principles that speakers use to define relations between expressions in a natural language and the world that supports meaningful discourse.
New!!: Treebank and Formal semantics (linguistics) · See more »
FrameNet
In computational linguistics, FrameNet is a project housed at the International Computer Science Institute in Berkeley, California which produces an electronic resource based on a theory of meaning called frame semantics.
New!!: Treebank and FrameNet · See more »
French language
French (le français or la langue française) is a Romance language of the Indo-European family.
New!!: Treebank and French language · See more »
Geoffrey Leech
Geoffrey Neil Leech FBA (16 January 1936 – 19 August 2014) was a specialist in English language and linguistics.
New!!: Treebank and Geoffrey Leech · See more »
German language
German (Deutsch) is a West Germanic language that is mainly spoken in Central Europe.
New!!: Treebank and German language · See more »
GNU General Public License
The GNU General Public License (GNU GPL or GPL) is a widely used free software license, which guarantees end users the freedom to run, study, share and modify the software.
New!!: Treebank and GNU General Public License · See more »
GNU Lesser General Public License
The GNU Lesser General Public License (LGPL) is a free software license published by the Free Software Foundation (FSF).
New!!: Treebank and GNU Lesser General Public License · See more »
Gothic language
Gothic is an extinct East Germanic language that was spoken by the Goths.
New!!: Treebank and Gothic language · See more »
Greek language
Greek (Modern Greek: ελληνικά, elliniká, "Greek", ελληνική γλώσσα, ellinikí glóssa, "Greek language") is an independent branch of the Indo-European family of languages, native to Greece and other parts of the Eastern Mediterranean and the Black Sea.
New!!: Treebank and Greek language · See more »
Head-driven phrase structure grammar
Head-driven phrase structure grammar (HPSG) is a highly lexicalized, constraint-based grammar developed by Carl Pollard and Ivan Sag.
New!!: Treebank and Head-driven phrase structure grammar · See more »
Hebrew language
No description.
New!!: Treebank and Hebrew language · See more »
Hindi
Hindi (Devanagari: हिन्दी, IAST: Hindī), or Modern Standard Hindi (Devanagari: मानक हिन्दी, IAST: Mānak Hindī) is a standardised and Sanskritised register of the Hindustani language.
New!!: Treebank and Hindi · See more »
History of English
English is a West Germanic language that originated from Anglo-Frisian dialects brought to Britain in the mid 5th to 7th centuries AD by Anglo-Saxon settlers from what is now northwest Germany, west Denmark and the Netherlands, displacing the Celtic languages that previously predominated.
New!!: Treebank and History of English · See more »
History of French
French is a Romance language (meaning that it is descended primarily from Vulgar Latin) that evolved out of the Gallo-Romance spoken in northern France.
New!!: Treebank and History of French · See more »
History of Portuguese
The Portuguese language developed in the Western Iberian Peninsula from Latin spoken by Roman soldiers and colonists starting in the 3rd century BC.
New!!: Treebank and History of Portuguese · See more »
Hungarian language
Hungarian is a Finno-Ugric language spoken in Hungary and several neighbouring countries. It is the official language of Hungary and one of the 24 official languages of the European Union. Outside Hungary it is also spoken by communities of Hungarians in the countries that today make up Slovakia, western Ukraine, central and western Romania (Transylvania and Partium), northern Serbia (Vojvodina), northern Croatia, and northern Slovenia due to the effects of the Treaty of Trianon, which resulted in many ethnic Hungarians being displaced from their homes and communities in the former territories of the Austro-Hungarian Empire. It is also spoken by Hungarian diaspora communities worldwide, especially in North America (particularly the United States). Like Finnish and Estonian, Hungarian belongs to the Uralic language family branch, its closest relatives being Mansi and Khanty.
New!!: Treebank and Hungarian language · See more »
Icelandic language
Icelandic (íslenska) is a North Germanic language, and the language of Iceland.
New!!: Treebank and Icelandic language · See more »
Italian language
Italian (or lingua italiana) is a Romance language.
New!!: Treebank and Italian language · See more »
Japanese language
is an East Asian language spoken by about 128 million people, primarily in Japan, where it is the national language.
New!!: Treebank and Japanese language · See more »
Kais Dukes
Kais Dukes is a British computer scientist and software developer known for the development of the Quranic Arabic Corpus and JQuranTree.
New!!: Treebank and Kais Dukes · See more »
Korean language
The Korean language (Chosŏn'gŭl/Hangul: 조선말/한국어; Hanja: 朝鮮말/韓國語) is an East Asian language spoken by about 80 million people.
New!!: Treebank and Korean language · See more »
Latin
Latin (Latin: lingua latīna) is a classical language belonging to the Italic branch of the Indo-European languages.
New!!: Treebank and Latin · See more »
Lexical functional grammar
Lexical functional grammar (LFG) is a constraint-based grammar framework in theoretical linguistics.
New!!: Treebank and Lexical functional grammar · See more »
Linguistic Data Consortium
The Linguistic Data Consortium is an open consortium of universities, companies and government research laboratories.
New!!: Treebank and Linguistic Data Consortium · See more »
Linguistics
Linguistics is the scientific study of language, and involves an analysis of language form, language meaning, and language in context.
New!!: Treebank and Linguistics · See more »
Logical form
In philosophy and mathematics, a logical form of a syntactic expression is a precisely-specified semantic version of that expression in a formal system.
New!!: Treebank and Logical form · See more »
Norwegian language
Norwegian (norsk) is a North Germanic language spoken mainly in Norway, where it is the official language.
New!!: Treebank and Norwegian language · See more »
Old Church Slavonic
Old Church Slavonic, also known as Old Church Slavic (or Ancient/Old Slavonic often abbreviated to OCS; (autonym словѣ́ньскъ ѩꙁꙑ́къ, slověnĭskŭ językŭ), not to be confused with the Proto-Slavic, was the first Slavic literary language. The 9th-century Byzantine missionaries Saints Cyril and Methodius are credited with standardizing the language and using it in translating the Bible and other Ancient Greek ecclesiastical texts as part of the Christianization of the Slavs. It is thought to have been based primarily on the dialect of the 9th century Byzantine Slavs living in the Province of Thessalonica (now in Greece). It played an important role in the history of the Slavic languages and served as a basis and model for later Church Slavonic traditions, and some Eastern Orthodox and Eastern Catholic churches use this later Church Slavonic as a liturgical language to this day. As the oldest attested Slavic language, OCS provides important evidence for the features of Proto-Slavic, the reconstructed common ancestor of all Slavic languages.
New!!: Treebank and Old Church Slavonic · See more »
Old East Slavic
Old East Slavic or Old Russian was a language used during the 10th–15th centuries by East Slavs in Kievan Rus' and states which evolved after the collapse of Kievan Rus'.
New!!: Treebank and Old East Slavic · See more »
Parsing
Parsing, syntax analysis or syntactic analysis is the process of analysing a string of symbols, either in natural language, computer languages or data structures, conforming to the rules of a formal grammar.
New!!: Treebank and Parsing · See more »
Part-of-speech tagging
In corpus linguistics, part-of-speech tagging (POS tagging or PoS tagging or POST), also called grammatical tagging or word-category disambiguation, is the process of marking up a word in a text (corpus) as corresponding to a particular part of speech, based on both its definition and its context—i.e., its relationship with adjacent and related words in a phrase, sentence, or paragraph.
New!!: Treebank and Part-of-speech tagging · See more »
Persian language
Persian, also known by its endonym Farsi (فارسی), is one of the Western Iranian languages within the Indo-Iranian branch of the Indo-European language family.
New!!: Treebank and Persian language · See more »
Phrase structure grammar
The term phrase structure grammar was originally introduced by Noam Chomsky as the term for grammar studied previously by Emil Post and Axel Thue (Post canonical systems).
New!!: Treebank and Phrase structure grammar · See more »
Phrase structure rules
Phrase structure rules are a type of rewrite rule used to describe a given language's syntax and are closely associated with the early stages of transformational grammar, being first proposed by Noam Chomsky in 1957.
New!!: Treebank and Phrase structure rules · See more »
Polish language
Polish (język polski or simply polski) is a West Slavic language spoken primarily in Poland and is the native language of the Poles.
New!!: Treebank and Polish language · See more »
Portuguese language
Portuguese (português or, in full, língua portuguesa) is a Western Romance language originating from the regions of Galicia and northern Portugal in the 9th century.
New!!: Treebank and Portuguese language · See more »
PropBank
PropBank is a corpus that is annotated with verbal propositions and their arguments—a "proposition bank".
New!!: Treebank and PropBank · See more »
Psycholinguistics
Psycholinguistics or psychology of language is the study of the psychological and neurobiological factors that enable humans to acquire, use, comprehend and produce language.
New!!: Treebank and Psycholinguistics · See more »
Quranic Arabic Corpus
The Quranic Arabic Corpus is an annotated linguistic resource consisting of 77,430 words of Quranic Arabic.
New!!: Treebank and Quranic Arabic Corpus · See more »
Romanian language
Romanian (obsolete spellings Rumanian, Roumanian; autonym: limba română, "the Romanian language", or românește, lit. "in Romanian") is an East Romance language spoken by approximately 24–26 million people as a native language, primarily in Romania and Moldova, and by another 4 million people as a second language.
New!!: Treebank and Romanian language · See more »
Russian language
Russian (rússkiy yazýk) is an East Slavic language, which is official in Russia, Belarus, Kazakhstan and Kyrgyzstan, as well as being widely spoken throughout Eastern Europe, the Baltic states, the Caucasus and Central Asia.
New!!: Treebank and Russian language · See more »
Russian National Corpus
The Russian National Corpus (English official name; the Russian name is Национальный корпус русского языка, lit. the National Corpus of the Russian language, but as the official English variant the Russian National Corpus is used) is a corpus of the Russian language that has been partially accessible through a query interface online since April 29, 2004.
New!!: Treebank and Russian National Corpus · See more »
Seed bank
Seeds are living creatures and keeping them viable over the long term requires adjusting storage moisture and temperature appropriately.
New!!: Treebank and Seed bank · See more »
Semantics
Semantics (from σημαντικός sēmantikós, "significant") is the linguistic and philosophical study of meaning, in language, programming languages, formal logics, and semiotics.
New!!: Treebank and Semantics · See more »
Sentence (linguistics)
In non-functional linguistics, a sentence is a textual unit consisting of one or more words that are grammatically linked.
New!!: Treebank and Sentence (linguistics) · See more »
Slovene language
Slovene or Slovenian (slovenski jezik or slovenščina) belongs to the group of South Slavic languages.
New!!: Treebank and Slovene language · See more »
Spanish language
Spanish or Castilian, is a Western Romance language that originated in the Castile region of Spain and today has hundreds of millions of native speakers in Latin America and Spain.
New!!: Treebank and Spanish language · See more »
Swedish language
Swedish is a North Germanic language spoken natively by 9.6 million people, predominantly in Sweden (as the sole official language), and in parts of Finland, where it has equal legal standing with Finnish.
New!!: Treebank and Swedish language · See more »
Syntax
In linguistics, syntax is the set of rules, principles, and processes that govern the structure of sentences in a given language, usually including word order.
New!!: Treebank and Syntax · See more »
Tatoeba
Tatoeba.org is a free collaborative online database of example sentences geared towards foreign language learners.
New!!: Treebank and Tatoeba · See more »
Text corpus
In linguistics, a corpus (plural corpora) or text corpus is a large and structured set of texts (nowadays usually electronically stored and processed).
New!!: Treebank and Text corpus · See more »
Thai language
Thai, Central Thai, or Siamese, is the national and official language of Thailand and the first language of the Central Thai people and vast majority Thai of Chinese origin.
New!!: Treebank and Thai language · See more »
Theoretical linguistics
For|the journal|Theoretical Linguistics (journal) Multiple issues| one source|date.
New!!: Treebank and Theoretical linguistics · See more »
Tree structure
A tree structure or tree diagram is a way of representing the hierarchical nature of a structure in a graphical form.
New!!: Treebank and Tree structure · See more »
Turkish language
Turkish, also referred to as Istanbul Turkish, is the most widely spoken of the Turkic languages, with around 10–15 million native speakers in Southeast Europe (mostly in East and Western Thrace) and 60–65 million native speakers in Western Asia (mostly in Anatolia).
New!!: Treebank and Turkish language · See more »
Ukrainian language
No description.
New!!: Treebank and Ukrainian language · See more »
Universal Conceptual Cognitive Annotation
Universal Conceptual Cognitive Annotation (UCCA) is a semantic approach to grammatical representation.
New!!: Treebank and Universal Conceptual Cognitive Annotation · See more »
University of Groningen
The University of Groningen (abbreviated as UG; Rijksuniversiteit Groningen, abbreviated as RUG) is a public research university in the city of Groningen in the Netherlands.
New!!: Treebank and University of Groningen · See more »
Urdu
Urdu (اُردُو ALA-LC:, or Modern Standard Urdu) is a Persianised standard register of the Hindustani language.
New!!: Treebank and Urdu · See more »
Vietnamese language
Vietnamese (Tiếng Việt) is an Austroasiatic language that originated in Vietnam, where it is the national and official language.
New!!: Treebank and Vietnamese language · See more »
XML
In computing, Extensible Markup Language (XML) is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable.
New!!: Treebank and XML · See more »
Redirects here:
Parsed corpus, Penn Treebank, Treebanks.
References
[1] https://en.wikipedia.org/wiki/Treebank