40 relations: Advanced Vector Extensions, Android (operating system), Apple silicon, AVX-512, Bfloat16 floating-point format, BLOOM (language model), C (programming language), C++, Central processing unit, Command-line interface, DBRX, Fabrice Bellard, Fine-tuning (deep learning), Gemini (language model), GitHub, GPT-2, Graphics processing unit, Grok (chatbot), Half-precision floating-point format, Inference engine, Justine Tunney, Large language model, Library (computing), Llama (language model), Machine learning, Mamba (deep learning architecture), MIT License, Mozilla, Open source, OpenAI, PyTorch, Quantization (signal processing), Single-precision floating-point format, SYCL, Tensor (machine learning), Tensor algebra, Vulkan, Web server, Whisper (speech recognition system), X86-64.
Advanced Vector Extensions
Advanced Vector Extensions (AVX, also known as Gesher New Instructions and then Sandy Bridge New Instructions) are SIMD extensions to the x86 instruction set architecture for microprocessors from Intel and Advanced Micro Devices (AMD).
Android (operating system)
Android is a mobile operating system based on a modified version of the Linux kernel and other open-source software, designed primarily for touchscreen mobile devices such as smartphones and tablets.
Apple silicon
Apple silicon refers to a series of system on a chip (SoC) and system in a package (SiP) processors designed by Apple Inc., mainly using the ARM architecture.
AVX-512
AVX-512 is a set of 512-bit extensions to the 256-bit Advanced Vector Extensions SIMD instructions for the x86 instruction set architecture (ISA), proposed by Intel in July 2013 and first implemented in the 2016 Intel Xeon Phi x200 (Knights Landing), and later in a number of AMD and other Intel CPUs.
Bfloat16 floating-point format
The bfloat16 (brain floating point) floating-point format is a computer number format occupying 16 bits in computer memory; it represents a wide dynamic range of numeric values by using a floating radix point.
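As a rough illustration of the format described above, the following Python sketch derives a bfloat16 bit pattern from a float32 by keeping its upper 16 bits (the sign bit, 8 exponent bits, and the top 7 mantissa bits). The function names are hypothetical, and plain truncation is assumed here for simplicity; real implementations often round to nearest instead.

```python
import struct


def float32_to_bfloat16_bits(x: float) -> int:
    """Return a 16-bit bfloat16 pattern for x (assumption: truncation, no rounding)."""
    # Reinterpret the float32 encoding of x as an unsigned 32-bit integer.
    (bits32,) = struct.unpack("<I", struct.pack("<f", x))
    # Keep the sign bit, all 8 exponent bits, and the top 7 mantissa bits.
    return bits32 >> 16


def bfloat16_bits_to_float(bits16: int) -> float:
    """Expand a 16-bit bfloat16 pattern back to a float by zero-filling the low bits."""
    (x,) = struct.unpack("<f", struct.pack("<I", (bits16 & 0xFFFF) << 16))
    return x


bits = float32_to_bfloat16_bits(3.14159)
approx = bfloat16_bits_to_float(bits)
```

Because the exponent field is as wide as in float32, the dynamic range is preserved while precision drops: in this sketch 3.14159 comes back as 3.140625.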
BLOOM (language model)
BigScience Large Open-science Open-access Multilingual Language Model (BLOOM) is a 176-billion-parameter transformer-based autoregressive large language model (LLM). Llama.cpp and BLOOM are both categorized under large language models.
C (programming language)
C (pronounced like the letter c) is a general-purpose programming language.
C++
C++ (pronounced "C plus plus" and sometimes abbreviated as CPP) is a high-level, general-purpose programming language created by Danish computer scientist Bjarne Stroustrup.
Central processing unit
A central processing unit (CPU), also called a central processor, main processor, or just processor, is the most important processor in a given computer.
Command-line interface
A command-line interface (CLI) is a means of interacting with a computer program by entering lines of text called command lines.
DBRX
DBRX is an open-source large language model (LLM) developed by the Mosaic ML team at Databricks, released on March 27, 2024. Llama.cpp and DBRX are both categorized under large language models.
Fabrice Bellard
Fabrice Bellard (born 1972) is a French computer programmer known for writing FFmpeg, QEMU, and the Tiny C Compiler.
Fine-tuning (deep learning)
In deep learning, fine-tuning is an approach to transfer learning in which the parameters of a pre-trained model are trained on new data.
Gemini (language model)
Google Gemini is a family of multimodal large language models developed by Google DeepMind, serving as the successor to LaMDA and PaLM 2. Llama.cpp and Gemini are both categorized under large language models.
GitHub
GitHub is a developer platform that allows developers to create, store, manage and share their code.
GPT-2
Generative Pre-trained Transformer 2 (GPT-2) is a large language model by OpenAI and the second in their foundational series of GPT models. Llama.cpp and GPT-2 are both categorized under large language models.
Graphics processing unit
A graphics processing unit (GPU) is a specialized electronic circuit initially designed for digital image processing and to accelerate computer graphics, being present either as a discrete video card or embedded on motherboards, mobile phones, personal computers, workstations, and game consoles.
Grok (chatbot)
Grok is a generative artificial intelligence chatbot developed by xAI.
Half-precision floating-point format
In computing, half precision (sometimes called FP16 or float16) is a binary floating-point computer number format that occupies 16 bits (two bytes in modern computers) in computer memory.
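To make the storage size concrete, here is a small Python sketch that uses the standard `struct` module, whose `e` format code packs an IEEE 754 half-precision value and whose `f` format code packs a single-precision value. The helper name `round_trip` is hypothetical and only serves the illustration.

```python
import struct

HALF = "<e"    # little-endian IEEE 754 binary16
SINGLE = "<f"  # little-endian IEEE 754 binary32


def round_trip(x: float, fmt: str) -> float:
    """Pack x in the given format and unpack it again, exposing the precision loss."""
    (y,) = struct.unpack(fmt, struct.pack(fmt, x))
    return y


half_size = struct.calcsize(HALF)      # bytes occupied by one half-precision value
single_size = struct.calcsize(SINGLE)  # bytes occupied by one single-precision value
half_value = round_trip(0.1, HALF)     # nearest half-precision value to 0.1
```

A half-precision value takes 2 bytes against 4 for single precision, and 0.1 round-trips to 0.0999755859375 because only 10 mantissa bits are stored.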
Inference engine
In the field of artificial intelligence, an inference engine is a software component of an intelligent system that applies logical rules to the knowledge base to deduce new information.
Justine Tunney
Justine Alexandra Roberts Tunney (born 1984) is an American software developer and a former activist for Occupy Wall Street.
Large language model
A large language model (LLM) is a computational model notable for its ability to achieve general-purpose language generation and other natural language processing tasks such as classification. Llama.cpp is categorized under large language models.
Library (computing)
In computer science, a library is a collection of read-only resources that is leveraged during software development to implement a computer program.
Llama (language model)
Llama (acronym for Large Language Model Meta AI, and formerly stylized as LLaMA) is a family of autoregressive large language models (LLMs) released by Meta AI starting in February 2023. Llama.cpp and Llama are both categorized under large language models.
Machine learning
Machine learning (ML) is a field of study in artificial intelligence concerned with the development and study of statistical algorithms that can learn from data and generalize to unseen data and thus perform tasks without explicit instructions.
Mamba (deep learning architecture)
Mamba is a deep learning architecture focused on sequence modeling.
MIT License
The MIT License is a permissive software license originating at the Massachusetts Institute of Technology (MIT) in the late 1980s.
Mozilla
Mozilla (stylized as moz://a) is a free software community founded in 1998 by members of Netscape.
Open source
Open source is source code that is made freely available for possible modification and redistribution. Llama.cpp is itself free and open-source software.
OpenAI
OpenAI is an American artificial intelligence (AI) research organization founded in December 2015 and headquartered in San Francisco, California.
PyTorch
PyTorch is a machine learning library based on the Torch library, used for applications such as computer vision and natural language processing, originally developed by Meta AI and now part of the Linux Foundation umbrella. Llama.cpp and PyTorch are both categorized under open-source artificial intelligence.
Quantization (signal processing)
Quantization, in mathematics and digital signal processing, is the process of mapping input values from a large set (often a continuous set) to output values in a (countable) smaller set, often with a finite number of elements.
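The mapping described above can be sketched in a few lines of Python. This is a minimal symmetric integer quantizer written only for illustration: the function names and the single shared scale factor are assumptions, and it is not the block-wise scheme that any particular library uses.

```python
from typing import List, Tuple


def quantize(values: List[float], bits: int = 8) -> Tuple[List[int], float]:
    """Map floats onto signed integers in [-qmax, qmax] using one shared scale."""
    qmax = 2 ** (bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    # Avoid dividing by zero when every input is 0.0.
    scale = max_abs / qmax if max_abs else 1.0
    return [round(v / scale) for v in values], scale


def dequantize(q: List[int], scale: float) -> List[float]:
    """Recover approximate floats from the integer codes."""
    return [qi * scale for qi in q]


codes, scale = quantize([-1.0, 0.0, 0.25, 1.0])
restored = dequantize(codes, scale)
```

With the default 8 bits, the inputs above map to the integers [-127, 0, 32, 127] with a scale of 1/127, and each restored value differs from its original by at most half a quantization step.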
Single-precision floating-point format
Single-precision floating-point format (sometimes called FP32 or float32) is a computer number format, usually occupying 32 bits in computer memory; it represents a wide dynamic range of numeric values by using a floating radix point.
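The 32 bits of an IEEE 754 single-precision value are laid out as 1 sign bit, 8 exponent bits, and 23 mantissa bits. The following Python sketch, with a hypothetical helper name, splits a value into those three fields using the standard `struct` module.

```python
import struct
from typing import Tuple


def float32_fields(x: float) -> Tuple[int, int, int]:
    """Return (sign, biased exponent, mantissa) of x encoded as IEEE 754 binary32."""
    # Reinterpret the 4-byte float32 encoding as an unsigned 32-bit integer.
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    sign = bits >> 31               # 1 bit
    exponent = (bits >> 23) & 0xFF  # 8 bits, biased by 127
    mantissa = bits & 0x7FFFFF      # 23 bits, implicit leading 1 not stored
    return sign, exponent, mantissa


fields_one = float32_fields(1.0)
fields_small = float32_fields(0.15625)
```

For example, 1.0 decodes to (0, 127, 0), since its unbiased exponent is 0 and its fraction is empty, while 0.15625 (that is, 1.25 × 2^-3) decodes to (0, 124, 0x200000).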
SYCL
SYCL (pronounced "sickle") is a higher-level programming model to improve programming productivity on various hardware accelerators.
Tensor (machine learning)
Tensor informally refers in machine learning to two different concepts that organize and represent data.
Tensor algebra
In mathematics, the tensor algebra of a vector space V, denoted T(V) or T^•(V), is the algebra of tensors on V (of any rank) with multiplication being the tensor product.
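Written out, and assuming V is a vector space over a field K (a symbol introduced here only for the illustration), the construction is usually stated as the direct sum of all tensor powers of V:

```latex
T(V) \;=\; \bigoplus_{k=0}^{\infty} V^{\otimes k}
     \;=\; K \,\oplus\, V \,\oplus\, (V \otimes V) \,\oplus\, (V \otimes V \otimes V) \,\oplus\, \cdots
```

The product of an element of V^{⊗k} with an element of V^{⊗l} is their tensor product, which lies in V^{⊗(k+l)}.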
Vulkan
Vulkan is a low-level, low-overhead cross-platform API and open standard for 3D graphics and computing.
Web server
A web server is computer software and underlying hardware that accepts requests via HTTP (the network protocol created to distribute web content) or its secure variant HTTPS.
Whisper (speech recognition system)
Whisper is a machine learning model for speech recognition and transcription, created by OpenAI and first released as open-source software in September 2022.
X86-64
x86-64 (also known as x64, x86_64, AMD64, and Intel 64) is a 64-bit version of the x86 instruction set, first announced in 1999.
See also
Large language models
- Auto-GPT
- BERT (language model)
- BLOOM (language model)
- Braina
- Brave Leo
- ChatGPT
- Chinchilla (language model)
- Chroma (vector database)
- Claude (language model)
- Cohere
- DBRX
- Ernie Bot
- GPT-1
- GPT-2
- GPT-3
- GPT-4
- GPT-4o
- GPT-J
- GPT4-Chan
- GPTZero
- Gemini (language model)
- Generative pre-trained transformer
- GigaChat
- Huawei PanGu
- IBM Granite
- IBM Watsonx
- Jais (language model)
- Jamba (language model)
- LaMDA
- LangChain
- Large language model
- Llama (language model)
- Llama.cpp
- MMLU
- PaLM
- Retrieval-augmented generation
- SearchGPT
- Sparrow (chatbot)
- Stochastic parrot
- T5 (language model)
- The Pile (dataset)
- Top-p sampling
- Undetectable.ai
- Vicuna LLM
- VideoPoet
- Waluigi effect
- YandexGPT
- You.com
Open-source artificial intelligence
- AUTOMATIC1111 Stable Diffusion Web UI
- CatBoost
- Chainer
- ComfyUI
- CuckooChess
- Deeplearning4j
- Fooocus
- Horovod (machine learning)
- Hugging Face
- Infer.NET
- Jais (language model)
- LAION
- LightGBM
- Llama.cpp
- ML.NET
- MindSpore
- Neural Network Intelligence
- Open Mind Common Sense
- Open-source artificial intelligence
- OpenCog
- OpenNN
- PyTorch
- Rnn (software)
- Spark NLP
- TensorFlow
Also known as GGUF.

