44 relations: Amazon Elastic Compute Cloud, Apache Hadoop, Apache License, Apache Spark, Big data, CentOS, Cloud computing, Cloudera, Data analysis, Databricks, Deep learning, Firefox, Generalized linear model, Google Chrome, Google Compute Engine, Gradient boosting, Hortonworks, Internet Explorer, Internet Explorer 10, Iterative method, Java (programming language), K-means clustering, Linux, Low-rank approximation, Machine learning, MacOS, MapR, Microsoft Azure, Microsoft Windows, Naive Bayes classifier, Open-source software, OS X Mavericks, Principal component analysis, Procrustes, Python (programming language), R (programming language), Random forest, Red Hat Enterprise Linux, Safari (web browser), Scala (programming language), Statistical learning theory, Stochastic gradient descent, Ubuntu version history, Windows 7.
Amazon Elastic Compute Cloud (EC2) forms a central part of Amazon.com's cloud-computing platform, Amazon Web Services (AWS), by allowing users to rent virtual computers on which to run their own computer applications.
Apache Hadoop is a collection of open-source software utilities that facilitate using a network of many computers to solve problems involving massive amounts of data and computation.
The Apache License is a permissive free software license written by the Apache Software Foundation (ASF).
Apache Spark is an open-source cluster-computing framework.
Big data refers to data sets that are so large and complex that traditional data-processing application software is inadequate to deal with them.
CentOS (from Community Enterprise Operating System) is a Linux distribution that provides a free, enterprise-class, community-supported computing platform functionally compatible with its upstream source, Red Hat Enterprise Linux (RHEL).
Cloud computing is an information technology (IT) paradigm that enables ubiquitous access to shared pools of configurable system resources and higher-level services that can be rapidly provisioned with minimal management effort, often over the Internet.
Cloudera, Inc. is a United States-based software company that provides Apache Hadoop and Apache Spark-based software, support and services, and training to business customers.
Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.
Databricks is a company founded by the creators of Apache Spark that aims to help clients with cloud-based big data processing using Spark.
Deep learning (also known as deep structured learning or hierarchical learning) is part of a broader family of machine learning methods based on learning data representations, as opposed to task-specific algorithms.
Mozilla Firefox (or simply Firefox) is a free and open-source web browser developed by the Mozilla Foundation and its subsidiary, Mozilla Corporation.
In statistics, the generalized linear model (GLM) is a flexible generalization of ordinary linear regression that allows for response variables that have error distribution models other than a normal distribution.
Google Chrome is a freeware web browser developed by Google LLC.
Google Compute Engine (GCE) is the Infrastructure as a Service (IaaS) component of Google Cloud Platform, which is built on the global infrastructure that runs Google’s search engine, Gmail, YouTube and other services.
Gradient boosting is a machine learning technique for regression and classification problems, which produces a prediction model in the form of an ensemble of weak prediction models, typically decision trees.
Hortonworks is a big data software company based in Santa Clara, California.
Internet Explorer (formerly Microsoft Internet Explorer and Windows Internet Explorer, commonly abbreviated IE or MSIE) is a series of graphical web browsers developed by Microsoft and included in the Microsoft Windows line of operating systems, starting in 1995.
Internet Explorer 10 (IE10) is a version of the Internet Explorer web browser released by Microsoft in 2012, and is the default browser in Windows 8.
In computational mathematics, an iterative method is a mathematical procedure that uses an initial guess to generate a sequence of improving approximate solutions for a class of problems, in which the n-th approximation is derived from the previous ones.
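As a hedged illustration of an iterative method, the sketch below refines an initial guess for a square root, deriving each approximation from the previous one. The function name `babylonian_sqrt`, the tolerance, and the iteration cap are assumptions chosen for this example, not part of any library discussed here.

```python
def babylonian_sqrt(a, guess=1.0, tol=1e-12, max_iter=100):
    """Approximate sqrt(a) for a > 0 by repeatedly refining an initial guess."""
    x = guess
    for _ in range(max_iter):
        # The next approximation is computed from the previous one.
        x_next = 0.5 * (x + a / x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    return x


if __name__ == "__main__":
    print(babylonian_sqrt(2.0))
```

The loop stops once two successive approximations agree to within `tol`, which is one common (but not the only) stopping rule.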
Java is a general-purpose computer-programming language that is concurrent, class-based, object-oriented, and specifically designed to have as few implementation dependencies as possible.
k-means clustering is a method of vector quantization, originally from signal processing, that is popular for cluster analysis in data mining.
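A minimal sketch of k-means clustering, assuming the standard assign-then-update procedure (often called Lloyd's algorithm) with caller-supplied starting centroids. The function name `kmeans` and the fixed iteration count are hypothetical choices for illustration.

```python
def kmeans(points, centroids, iterations=10):
    """Cluster tuples of coordinates around the given initial centroids."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its cluster;
        # an empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(col) / len(cluster) for col in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (10, 10), (10, 11)]
    print(kmeans(pts, [(0, 0), (10, 10)]))
```

Results depend on the initial centroids, so practical implementations usually add a smarter initialization and a convergence check.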
Linux is a family of free and open-source software operating systems built around the Linux kernel.
In mathematics, low-rank approximation is a minimization problem, in which the cost function measures the fit between a given matrix (the data) and an approximating matrix (the optimization variable), subject to a constraint that the approximating matrix has reduced rank.
Machine learning is a subset of artificial intelligence in the field of computer science that often uses statistical techniques to give computers the ability to "learn" (i.e., progressively improve performance on a specific task) with data, without being explicitly programmed.
macOS (previously Mac OS X and later OS X) is a series of graphical operating systems developed and marketed by Apple Inc. since 2001.
MapR is a business software company headquartered in Santa Clara, California.
Microsoft Azure (formerly Windows Azure) is a cloud computing service created by Microsoft for building, testing, deploying, and managing applications and services through a global network of Microsoft-managed data centers.
Microsoft Windows is a group of several graphical operating system families, all of which are developed, marketed, and sold by Microsoft.
In machine learning, naive Bayes classifiers are a family of simple "probabilistic classifiers" based on applying Bayes' theorem with strong (naive) independence assumptions between the features.
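To make the independence assumption concrete, the following sketch multiplies a class prior by one per-feature likelihood at a time and then normalizes. The function name `naive_bayes_posterior`, the nested-dictionary layout, and all probabilities in the usage example are hypothetical.

```python
def naive_bayes_posterior(priors, likelihoods, features):
    """Return P(class | features), treating features as independent given the class.

    priors:      {class: P(class)}
    likelihoods: {class: {feature_name: {value: P(value | class)}}}
    features:    {feature_name: observed_value}
    """
    scores = {}
    for label, prior in priors.items():
        score = prior
        for name, value in features.items():
            # Naive assumption: each feature contributes an independent factor.
            score *= likelihoods[label][name][value]
        scores[label] = score
    total = sum(scores.values())
    return {label: s / total for label, s in scores.items()}


if __name__ == "__main__":
    priors = {"spam": 0.5, "ham": 0.5}  # made-up numbers
    likelihoods = {
        "spam": {"offer": {True: 0.8, False: 0.2}},
        "ham": {"offer": {True: 0.2, False: 0.8}},
    }
    print(naive_bayes_posterior(priors, likelihoods, {"offer": True}))
```

In practice the likelihood tables are estimated from training data, and log-probabilities are typically summed instead of multiplying raw probabilities to avoid underflow.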
Open-source software (OSS) is a type of computer software whose source code is released under a license in which the copyright holder grants users the rights to study, change, and distribute the software to anyone and for any purpose.
OS X Mavericks (version 10.9) is the tenth major release of OS X (now named macOS), Apple Inc.'s desktop and server operating system for Macintosh computers.
Principal component analysis (PCA) is a statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components.
In Greek mythology, Procrustes (Προκρούστης Prokroustes) or "the stretcher", also known as Prokoptas or Damastes (Δαμαστής, "subduer"), was a rogue smith and bandit from Attica who attacked people by stretching them or cutting off their legs, so as to force them to fit the size of an iron bed.
Python is an interpreted high-level programming language for general-purpose programming.
R is a programming language and free software environment for statistical computing and graphics that is supported by the R Foundation for Statistical Computing.
Random forests or random decision forests are an ensemble learning method for classification, regression and other tasks, that operate by constructing a multitude of decision trees at training time and outputting the class that is the mode of the classes (classification) or mean prediction (regression) of the individual trees.
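The aggregation step described above (mode of the classes for classification, mean for regression) can be sketched as follows. Only the combining of per-tree outputs is shown; training the trees is out of scope, and the function names are assumptions for this example.

```python
from collections import Counter
from statistics import mean


def forest_classify(tree_votes):
    """Return the most common class among the individual trees' votes."""
    # On a tie, Counter.most_common keeps the class that was seen first.
    return Counter(tree_votes).most_common(1)[0][0]


def forest_regress(tree_predictions):
    """Return the mean of the individual trees' numeric predictions."""
    return mean(tree_predictions)


if __name__ == "__main__":
    print(forest_classify(["cat", "dog", "cat"]))
    print(forest_regress([1.0, 2.0, 3.0]))
```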
Red Hat Enterprise Linux (RHEL) is a Linux distribution developed by Red Hat and targeted toward the commercial market.
Safari is a web browser developed by Apple based on the WebKit engine.
Scala is a general-purpose programming language providing support for functional programming and a strong static type system.
Statistical learning theory is a framework for machine learning drawing from the fields of statistics and functional analysis.
Stochastic gradient descent (often shortened to SGD), also known as incremental gradient descent, is an iterative method for optimizing a differentiable objective function; it is a stochastic approximation of gradient descent optimization.
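A minimal sketch of stochastic gradient descent, assuming a one-parameter model y = w * x with squared-error loss. The function name `sgd_fit_slope`, the learning rate, the epoch count, and the sample data are all hypothetical choices for illustration.

```python
import random


def sgd_fit_slope(data, lr=0.01, epochs=200, seed=0):
    """Fit w in y = w * x by updating on one (x, y) sample at a time."""
    rng = random.Random(seed)
    samples = list(data)
    w = 0.0
    for _ in range(epochs):
        rng.shuffle(samples)  # visit the samples in random order each epoch
        for x, y in samples:
            # Gradient of (w * x - y) ** 2 with respect to w, for this sample only.
            grad = 2 * (w * x - y) * x
            w -= lr * grad
    return w


if __name__ == "__main__":
    print(sgd_fit_slope([(1, 3), (2, 6), (3, 9)]))
```

Each update uses the gradient from a single sample instead of the full data set, which is what makes the method a stochastic approximation of ordinary gradient descent.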
Ubuntu releases are made semiannually by Canonical Ltd, the developers of the Ubuntu operating system, using the year and month of the release as a version number.
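The year-and-month versioning scheme can be illustrated with a short sketch. The helper name `ubuntu_version` is hypothetical; it assumes the familiar two-digit-year, two-digit-month form such as 18.04 for an April 2018 release.

```python
def ubuntu_version(year, month):
    """Format a release year and month as a YY.MM version string."""
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    return f"{year % 100:02d}.{month:02d}"


if __name__ == "__main__":
    print(ubuntu_version(2018, 4))
```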
Windows 7 (codenamed Vienna, formerly Blackcomb) is a personal computer operating system developed by Microsoft.