187 relations: Academic journal, Academic Press, ADVISE, Agent mining, Aggregate function, Analytics, Angoss, Anomaly detection, Artificial intelligence, Artificial neural network, Association for Computing Machinery, Association for the Advancement of Artificial Intelligence, Association rule learning, Automatic summarization, Bayes' theorem, Bayesian network, Behavior informatics, Big data, Bioinformatics, Business intelligence, Buzzword, C++, Cambridge University Press, Carrot2, Chemicalize, Clarabridge, Cluster analysis, Clustering high-dimensional data, Computational complexity theory, Computational science, Computer science, Conference on Information and Knowledge Management, Copyright Directive, CounterPunch, Cross-industry standard process for data mining, Data, Data analysis, Data collection, Data dredging, Data integration, Data management, Data mart, Data Mining and Knowledge Discovery, Data pre-processing, Data set, Data transformation, Data visualization, Data warehouse, Database, Database Directive, ..., DATADVANCE, Decision support system, Decision tree, Decision tree learning, Deep learning, Domain driven data mining, Drug discovery, ECML PKDD, Educational data mining, Edward Snowden, Electronic discovery, ELKI, Ensemble learning, European Commission, Examples of data mining, Exploratory data analysis, Factor analysis, Fair use, Family Educational Rights and Privacy Act, Forrester Research, Fraction of variance unexplained, Gartner, General Architecture for Text Engineering, Genetic algorithm, Global surveillance disclosures (2013–present), GNU Project, Google Book Search Settlement Agreement, Google Scholar, Gregory Piatetsky-Shapiro, Health Insurance Portability and Accountability Act, Hewlett-Packard, IBM, Information extraction, Information integration, Information processing, InformationWeek, Intention mining, Interdisciplinarity, International Journal of Data Warehousing and Mining, International Safe Harbor Privacy Principles, Java 
(programming language), Java Data Mining, Jerome H. Friedman, Jiawei Han, KNIME, KXEN Inc., Learning classifier system, Limitations and exceptions to copyright, LIONsolver, Lua (programming language), Machine learning, Massive Online Analysis, Megaputer Intelligence, Michael Lovell, Microsoft, Microsoft Academic Search, Microsoft Analysis Services, Misnomer, Missing data, MLPACK (C++ library), Morgan Kaufmann Publishers, Multi expression programming, Multilinear subspace learning, Multivariate statistics, Named-entity recognition, National Security Agency, Natural language processing, Natural Language Toolkit, NetOwl, Neural network, Online algorithm, Open access, Open-source model, OpenNN, OpenText, Oracle Corporation, Oracle Data Mining, Orange (software), Overfitting, Personally identifiable information, Philip S. Yu, Predictive analytics, Predictive Model Markup Language, Prentice Hall, Profiling (information science), Programming language, PSeven, Psychometrics, Python (programming language), Qlucore, R (programming language), RapidMiner, Receiver operating characteristic, Regression analysis, Reproducibility, Rexer's Annual Data Miner Survey, Robert Tibshirani, SAS Institute, Scikit-learn, SEMMA, Sequential pattern mining, SIGKDD, SIGMOD, Social media mining, Spatial database, Springer Science+Business Media, SPSS Modeler, Statistica, Statistical classification, Statistical hypothesis testing, Statistical inference, Statistical model, Statistics, StatSoft, Stellar Wind, Structured data analysis (statistics), Support vector machine, Surveillance capitalism, Tanagra (machine learning), Text mining, The American Statistician, The Review of Economic Studies, Time series, Torch (machine learning), Total Information Awareness, Training, test, and validation sets, Trevor Hastie, UBM plc, UIMA, United States Congress, Usama Fayyad, Vertica, VLDB, Web mining, Web scraping, Weka (machine learning), XML.
An academic or scholarly journal is a periodical publication in which scholarship relating to a particular academic discipline is published.
Academic Press is an academic book publisher.
ADVISE (Analysis, Dissemination, Visualization, Insight, and Semantic Enhancement) is a research and development program within the United States Department of Homeland Security (DHS) Threat and Vulnerability Testing and Assessment (TVTA) portfolio.
Agent mining is an interdisciplinary area that synergizes multiagent systems with data mining and machine learning.
In database management an aggregate function is a function where the values of multiple rows are grouped together to form a single value of more significant meaning or measurement such as a set, a bag or a list.
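The idea can be made concrete with a toy sketch (the department names and values below are invented for illustration, not taken from any real schema):

```python
from collections import defaultdict

def aggregate(rows, func):
    """Group (key, value) rows by key and collapse each group to one value."""
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)
    return {key: func(values) for key, values in groups.items()}

# Toy rows: (department, salary).
rows = [("sales", 100), ("sales", 200), ("eng", 300)]
print(aggregate(rows, sum))  # {'sales': 300, 'eng': 300}
print(aggregate(rows, max))  # {'sales': 200, 'eng': 300}
```

This mirrors what `SUM(...) GROUP BY ...` does in SQL: many rows per key go in, one value per key comes out.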
Analytics is the discovery, interpretation, and communication of meaningful patterns in data.
Angoss Software Corporation, headquartered in Toronto, Ontario, Canada, with offices in the United States and UK, is a provider of predictive analytics systems through software licensing and services.
In data mining, anomaly detection (also outlier detection) is the identification of items, events or observations which do not conform to an expected pattern or other items in a dataset.
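One of the simplest detection heuristics flags points whose z-score (distance from the mean in standard deviations) is large; the threshold of 2.0 below is an illustrative choice, not a standard:

```python
from statistics import mean, stdev

def zscore_outliers(values, threshold=2.0):
    """Return values lying more than `threshold` standard deviations
    from the sample mean -- a basic outlier-detection heuristic."""
    mu, sigma = mean(values), stdev(values)
    return [v for v in values if abs(v - mu) / sigma > threshold]

data = [10, 11, 9, 10, 12, 10, 50]  # 50 does not conform to the pattern
print(zscore_outliers(data))  # [50]
```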
Artificial intelligence (AI, also machine intelligence, MI) is intelligence demonstrated by machines, in contrast to the natural intelligence (NI) displayed by humans and other animals.
Artificial neural networks (ANNs) or connectionist systems are computing systems vaguely inspired by the biological neural networks that constitute animal brains.
The Association for Computing Machinery (ACM) is an international learned society for computing.
The Association for the Advancement of Artificial Intelligence (AAAI) is an international, nonprofit, scientific society devoted to promote research in, and responsible use of, artificial intelligence.
Association rule learning is a rule-based machine learning method for discovering interesting relations between variables in large databases.
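The two core measures of an association rule, support and confidence, can be sketched over a toy set of market-basket transactions (invented for illustration):

```python
# Toy transactions: each is the set of items bought together.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, transactions):
    """Fraction of transactions containing every item in the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """How often the consequent appears in transactions that contain the antecedent."""
    return support(antecedent | consequent, transactions) / support(antecedent, transactions)

# Rule {bread} -> {milk}
print(support({"bread", "milk"}, transactions))       # 0.5
print(confidence({"bread"}, {"milk"}, transactions))  # 2/3
```

Algorithms such as Apriori search for all rules whose support and confidence exceed user-set minimums.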
Automatic summarization is the process of shortening a text document with software, in order to create a summary with the major points of the original document.
In probability theory and statistics, Bayes' theorem (alternatively Bayes' law or Bayes' rule, also written as Bayes's theorem) describes the probability of an event, based on prior knowledge of conditions that might be related to the event.
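A classic worked example is disease screening: the theorem converts a test's accuracy and the disease's prevalence into the probability of disease given a positive result (the numbers below are illustrative):

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(disease | positive test) via Bayes' theorem:
    P(D|+) = P(+|D) P(D) / P(+)."""
    p_positive = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / p_positive

# 1% prevalence, 99% sensitivity, 5% false-positive rate.
print(round(posterior(0.01, 0.99, 0.05), 3))  # 0.167
```

Even a highly accurate test yields only about a 1-in-6 posterior here, because the low prior dominates.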
A Bayesian network, Bayes network, belief network, Bayes(ian) model or probabilistic directed acyclic graphical model is a probabilistic graphical model (a type of statistical model) that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG).
Behavior informatics (BI) is the informatics of behaviors so as to obtain behavior intelligence and behavior insights.
Big data refers to data sets that are so large and complex that traditional data-processing application software is inadequate to deal with them.
Bioinformatics is an interdisciplinary field that develops methods and software tools for understanding biological data.
Business intelligence (BI) comprises the strategies and technologies used by enterprises for the data analysis of business information.
A buzzword is a word or phrase, new or already existing, that becomes very popular for a period of time.
C++ ("see plus plus") is a general-purpose programming language.
Cambridge University Press (CUP) is the publishing business of the University of Cambridge.
Carrot² is an open source search results clustering engine.
Chemicalize is an online platform for chemical calculations, search, and text processing.
Clarabridge is an American software company founded in 2006 in Reston, Virginia, United States.
Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense) to each other than to those in other groups (clusters).
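A minimal sketch of one common clustering method, k-means (Lloyd's algorithm), on one-dimensional data — the values and starting centroids are invented for illustration:

```python
def kmeans_1d(values, centroids, iterations=10):
    """Tiny 1-D k-means: alternately assign points to the nearest
    centroid and move each centroid to its cluster's mean."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

centroids, clusters = kmeans_1d([1, 2, 3, 10, 11, 12], [0.0, 5.0])
print(centroids)  # [2.0, 11.0]
print(clusters)   # [[1, 2, 3], [10, 11, 12]]
```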
Clustering high-dimensional data is the cluster analysis of data with anywhere from a few dozen to many thousands of dimensions.
Computational complexity theory is a branch of the theory of computation in theoretical computer science that focuses on classifying computational problems according to their inherent difficulty, and relating those classes to each other.
Computational science (also scientific computing or scientific computation (SC)) is a rapidly growing multidisciplinary field that uses advanced computing capabilities to understand and solve complex problems.
Computer science deals with the theoretical foundations of information and computation, together with practical techniques for the implementation and application of these foundations.
The ACM Conference on Information and Knowledge Management (CIKM) is an annual computer science research conference dedicated to information management (IM) and knowledge management (KM).
The Copyright Directive (officially the Directive 2001/29/EC of the European Parliament and of the Council of 22 May 2001 on the harmonisation of certain aspects of copyright and related rights in the information society, also known as the Information Society Directive or the InfoSoc Directive), is a directive of the European Union enacted to implement the WIPO Copyright Treaty and to harmonise aspects of copyright law across Europe, such as copyright exceptions. The directive was enacted under the internal market provisions of the Treaty of Rome. The directive was subject to unprecedented lobbying and has been cited as a success for copyright industries. The directive gives EU Member States significant freedom in certain aspects of transposition. Member States had until 22 December 2002 to implement the directive into their national laws. However, only Greece and Denmark met the deadline and the European Commission eventually initiated enforcement action against six Member States for non-implementation.
CounterPunch is a magazine published six times per year in the United States that covers politics in a manner its editors describe as "muckraking with a radical attitude".
Cross-industry standard process for data mining, known as CRISP-DM, is an open standard process model that describes common approaches used by data mining experts (Shearer C., "The CRISP-DM model: the new blueprint for data mining", Journal of Data Warehousing (2000); 5:13–22).
Data is a set of values of qualitative or quantitative variables.
Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.
Data collection is the process of gathering and measuring information on targeted variables in an established systematic fashion, which then enables one to answer relevant questions and evaluate outcomes.
Data dredging (also data fishing, data snooping, and p-hacking) is the use of data mining to uncover patterns in data that can be presented as statistically significant, without first devising a specific hypothesis as to the underlying causality.
Data integration involves combining data residing in different sources and providing users with a unified view of them.
Data management comprises all disciplines related to managing data as a valuable resource.
A data mart is a structure / access pattern specific to data warehouse environments, used to retrieve client-facing data.
Data Mining and Knowledge Discovery is a bimonthly peer-reviewed scientific journal focusing on data mining published by Springer Science+Business Media.
Data pre-processing is an important step in the data mining process.
A data set (or dataset) is a collection of data.
In computing, data transformation is the process of converting data from one format or structure into another format or structure.
Data visualization or data visualisation is viewed by many disciplines as a modern equivalent of visual communication.
In computing, a data warehouse (DW or DWH), also known as an enterprise data warehouse (EDW), is a system used for reporting and data analysis, and is considered a core component of business intelligence.
A database is an organized collection of data, stored and accessed electronically.
The Directive 96/9/EC of the European Parliament and of the Council of 11 March 1996 on the legal protection of databases is a directive of the European Union in the field of copyright law, made under the internal market provisions of the Treaty of Rome.
DATADVANCE is a software development company that evolved out of a collaborative research program between Airbus and the Institute for Information Transmission Problems of the Russian Academy of Sciences (IITP RAS).
A decision support system (DSS) is an information system that supports business or organizational decision-making activities.
A decision tree is a decision support tool that uses a tree-like graph or model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility.
Decision tree learning uses a decision tree (as a predictive model) to go from observations about an item (represented in the branches) to conclusions about the item's target value (represented in the leaves).
Deep learning (also known as deep structured learning or hierarchical learning) is part of a broader family of machine learning methods based on learning data representations, as opposed to task-specific algorithms.
Domain driven data mining is a data mining methodology for discovering actionable knowledge and delivering actionable insights from complex data and behaviors in a complex environment.
In the fields of medicine, biotechnology and pharmacology, drug discovery is the process by which new candidate medications are discovered.
ECML PKDD, the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, is one of the leading academic conferences on machine learning and knowledge discovery, held in Europe every year.
Educational data mining (EDM) describes a research field concerned with the application of data mining, machine learning and statistics to information generated from educational settings (e.g., universities and intelligent tutoring systems).
Edward Joseph Snowden (born June 21, 1983) is an American computer professional, former Central Intelligence Agency (CIA) employee, and former contractor for the United States government who copied and leaked classified information from the National Security Agency (NSA) in 2013 without authorization.
Electronic discovery (also e-discovery or ediscovery) refers to discovery in legal proceedings such as litigation, government investigations, or Freedom of Information Act requests, where the information sought is in electronic format (often referred to as electronically stored information or ESI).
ELKI (for Environment for DeveLoping KDD-Applications Supported by Index-Structures) is a knowledge discovery in databases (KDD, "data mining") software framework developed for use in research and teaching originally at the database systems research unit of Professor Hans-Peter Kriegel at the Ludwig Maximilian University of Munich, Germany.
In statistics and machine learning, ensemble methods use multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone.
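The simplest combination rule is a plurality vote over the base models' predictions; the "models" below are just hard-coded output lists for illustration:

```python
from collections import Counter

def majority_vote(predictions):
    """Combine one instance's per-model predictions by plurality vote."""
    return Counter(predictions).most_common(1)[0][0]

# Three hypothetical base classifiers, each predicting three instances.
model_outputs = [
    ["spam", "ham", "spam"],   # model 1
    ["spam", "ham", "ham"],    # model 2
    ["spam", "spam", "ham"],   # model 3
]
ensemble = [majority_vote(votes) for votes in zip(*model_outputs)]
print(ensemble)  # ['spam', 'ham', 'ham']
```

Bagging, boosting, and stacking are more elaborate versions of this same idea: diversify the base learners, then combine their outputs.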
The European Commission (EC) is an institution of the European Union, responsible for proposing legislation, implementing decisions, upholding the EU treaties and managing the day-to-day business of the EU.
Data mining, the process of discovering patterns in large data sets, has been used in many applications.
In statistics, exploratory data analysis (EDA) is an approach to analyzing data sets to summarize their main characteristics, often with visual methods.
Factor analysis is a statistical method used to describe variability among observed, correlated variables in terms of a potentially lower number of unobserved variables called factors.
Fair use is a doctrine in the law of the United States that permits limited use of copyrighted material without having to first acquire permission from the copyright holder.
The Family Educational Rights and Privacy Act of 1974 (FERPA or the Buckley Amendment) is a United States federal law that governs the access of educational information and records to public entities such as potential employers, publicly funded educational institutions, and foreign governments.
Forrester is an American market research company that provides advice on existing and potential impact of technology, to its clients and the public.
In statistics, the fraction of variance unexplained (FVU) in the context of a regression task is the fraction of variance of the regressand (dependent variable) Y which cannot be explained, i.e., which is not correctly predicted, by the explanatory variables X.
Gartner, Inc. is a global research and advisory firm providing insights, advice, and tools for leaders in IT, Finance, HR, Customer Service and Support, Legal and Compliance, Marketing, Sales, and Supply Chain functions across the world.
General Architecture for Text Engineering or GATE is a Java suite of tools originally developed at the University of Sheffield beginning in 1995 and now used worldwide by a wide community of scientists, companies, teachers and students for many natural language processing tasks, including information extraction in many languages.
In computer science and operations research, a genetic algorithm (GA) is a metaheuristic inspired by the process of natural selection that belongs to the larger class of evolutionary algorithms (EA).
Ongoing news reports in the international media have revealed operational details about the United States National Security Agency (NSA) and its international partners' global surveillance of foreign nationals and U.S. citizens.
The GNU Project is a free-software, mass-collaboration project, first announced on September 27, 1983 by Richard Stallman at MIT.
The Google Book Search Settlement Agreement was a proposal between the Authors Guild, the Association of American Publishers, and Google in the settlement of Authors Guild et al. v. Google, a class action lawsuit alleging copyright infringement on the part of Google.
Google Scholar is a freely accessible web search engine that indexes the full text or metadata of scholarly literature across an array of publishing formats and disciplines.
Gregory I. Piatetsky-Shapiro (born 7 April 1958) is a data scientist and the co-founder of the KDD, the Association for Computing Machinery SIGKDD association for Knowledge Discovery and Data Mining.
The Health Insurance Portability and Accountability Act of 1996 (HIPAA) was enacted by the United States Congress and signed by President Bill Clinton in 1996.
The Hewlett-Packard Company (commonly referred to as HP) or shortened to Hewlett-Packard was an American multinational information technology company headquartered in Palo Alto, California.
The International Business Machines Corporation (IBM) is an American multinational technology company headquartered in Armonk, New York, United States, with operations in over 170 countries.
Information extraction (IE) is the task of automatically extracting structured information from unstructured and/or semi-structured machine-readable documents.
Information integration (II) is the merging of information from heterogeneous sources with differing conceptual, contextual and typographical representations.
Information processing is the change (processing) of information in any manner detectable by an observer.
InformationWeek is a digital magazine which conducts corresponding face-to-face events, virtual events, and research.
In data mining, intention mining or intent mining is the problem of determining a user's intention from logs of his or her behavior in interaction with a computer system. Examples include search engines, where user intent and query intent prediction have been researched since 2002 (see Section 7.2.3 of R. Baeza-Yates and B. Ribeiro-Neto, second edition, Addison-Wesley, 2011), and commercial intents expressed in social media posts (Zhiyuan Chen, Bing Liu, Meichun Hsu, Malu Castellanos, and Riddhiman Ghosh).
Interdisciplinarity or interdisciplinary studies involves the combining of two or more academic disciplines into one activity (e.g., a research project).
The International Journal of Data Warehousing and Mining (IJDWM) is a quarterly peer-reviewed academic journal covering data warehousing and data mining.
The International Safe Harbor Privacy Principles or Safe Harbour Privacy Principles were principles developed between 1998 and 2000 in order to prevent private organizations within the European Union or United States which store customer data from accidentally disclosing or losing personal information.
Java is a general-purpose computer-programming language that is concurrent, class-based, object-oriented, and specifically designed to have as few implementation dependencies as possible.
Java Data Mining (JDM) is a standard Java API for developing data mining applications and tools.
Jerome Harold Friedman (born 1939) is an American statistician, consultant and Professor of Statistics at Stanford University, known for his contributions in the field of statistics and data mining.
Jiawei Han (born August 10, 1949) is a Chinese computer scientist and Abel Bliss Professor in the Department of Computer Science at the University of Illinois at Urbana-Champaign.
KNIME, the Konstanz Information Miner, is a free and open-source data analytics, reporting and integration platform.
KXEN was an American software company which existed from 1998 to 2013 when it was acquired by SAP AG.
Learning classifier systems, or LCS, are a paradigm of rule-based machine learning methods that combine a discovery component (e.g. typically a genetic algorithm) with a learning component (performing either supervised learning, reinforcement learning, or unsupervised learning).
Limitations and exceptions to copyright are provisions, in local copyright law or Berne Convention, which allow for copyrighted works to be used without a license from the copyright owner.
LIONsolver is integrated software for data mining, business intelligence, analytics, and modeling, based on the Learning and Intelligent OptimizatioN (LION) and reactive business intelligence approach.
Lua (from meaning moon) is a lightweight, multi-paradigm programming language designed primarily for embedded use in applications.
Machine learning is a subset of artificial intelligence in the field of computer science that often uses statistical techniques to give computers the ability to "learn" (i.e., progressively improve performance on a specific task) with data, without being explicitly programmed.
Massive Online Analysis (MOA) is a free open-source software project specific for data stream mining with concept drift.
Megaputer Intelligence, Inc., is a software company headquartered in Bloomington, Indiana, United States, that provides data and text mining tools along with consulting services.
Michael R. Lovell (born 1967) is an American engineer, educator, and President of Marquette University.
Microsoft Corporation (abbreviated as MS) is an American multinational technology company with headquarters in Redmond, Washington.
Microsoft Academic Search was a research project and academic search engine retired in 2012.
Microsoft SQL Server Analysis Services, SSAS, is an online analytical processing (OLAP) and data mining tool in Microsoft SQL Server.
A misnomer is a name or term that suggests an idea that is known to be wrong.
In statistics, missing data, or missing values, occur when no data value is stored for the variable in an observation.
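One simple (and illustrative, not universally appropriate) way to handle missing values is mean imputation, replacing each gap with the mean of the observed values:

```python
from statistics import mean

def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

print(impute_mean([1.0, None, 3.0, None, 5.0]))  # [1.0, 3.0, 3.0, 3.0, 5.0]
```

Mean imputation preserves the sample mean but shrinks the variance, which is why more careful methods (multiple imputation, model-based approaches) are often preferred.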
mlpack is a machine learning software library for C++, built on top of the Armadillo library.
Morgan Kaufmann Publishers is a Burlington, Massachusetts (San Francisco, California until 2008) based publisher specializing in computer science and engineering content.
Multi Expression Programming (MEP) is a genetic programming variant encoding multiple solutions in the same chromosome.
Multilinear subspace learning is an approach to dimensionality reduction.
Multivariate statistics is a subdivision of statistics encompassing the simultaneous observation and analysis of more than one outcome variable.
Named-entity recognition (NER) (also known as entity identification, entity chunking and entity extraction) is a subtask of information extraction that seeks to locate and classify named entities in text into pre-defined categories such as the names of persons, organizations, locations, expressions of times, quantities, monetary values, percentages, etc.
The National Security Agency (NSA) is a national-level intelligence agency of the United States Department of Defense, under the authority of the Director of National Intelligence.
Natural language processing (NLP) is an area of computer science and artificial intelligence concerned with the interactions between computers and human (natural) languages, in particular how to program computers to process and analyze large amounts of natural language data.
The Natural Language Toolkit, or more commonly NLTK, is a suite of libraries and programs for symbolic and statistical natural language processing (NLP) for English written in the Python programming language.
NetOwl is a suite of multilingual text and entity analytics products that analyze big data in the form of text data – reports, web, social media, etc.
The term neural network was traditionally used to refer to a network or circuit of neurons.
In computer science, an online algorithm is one that can process its input piece-by-piece in a serial fashion, i.e., in the order that the input is fed to the algorithm, without having the entire input available from the start.
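A running mean is a classic online algorithm: it updates its answer after each item without ever storing the stream (a minimal sketch):

```python
class RunningMean:
    """Maintain the mean of a stream one item at a time, in O(1) memory --
    the defining property of an online algorithm."""
    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, x):
        self.count += 1
        self.mean += (x - self.mean) / self.count
        return self.mean

rm = RunningMean()
for x in [2.0, 4.0, 6.0]:
    rm.update(x)
print(rm.mean)  # 4.0
```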
Open access (OA) refers to research outputs which are distributed online and free of cost or other barriers, and possibly with the addition of a Creative Commons license to promote reuse.
The open-source model is a decentralized software-development model that encourages open collaboration.
OpenNN (Open Neural Networks Library) is a software library written in the C++ programming language which implements neural networks, a main area of deep learning research.
OpenText Corporation (also written opentext) is a Canadian company that develops and sells enterprise information management (EIM) software.
Oracle Corporation is an American multinational computer technology corporation, headquartered in Redwood Shores, California.
Oracle Data Mining (ODM) is an option of Oracle Corporation's Relational Database Management System (RDBMS) Enterprise Edition (EE).
Orange is an open-source data visualization, machine learning and data mining toolkit.
In statistics, overfitting is "the production of an analysis that corresponds too closely or exactly to a particular set of data, and may therefore fail to fit additional data or predict future observations reliably".
Personal information, described in United States legal fields as either personally identifiable information (PII), or sensitive personal information (SPI), as used in information security and privacy laws, is information that can be used on its own or with other information to identify, contact, or locate a single person, or to identify an individual in context.
Philip S. Yu (born 1952) is an American computer scientist and Professor in Information Technology at the University of Illinois at Chicago, known for his work in the field of data mining.
Predictive analytics encompasses a variety of statistical techniques from predictive modelling, machine learning, and data mining that analyze current and historical facts to make predictions about future or otherwise unknown events.
The Predictive Model Markup Language (PMML) is an XML-based predictive model interchange format.
Prentice Hall is a major educational publisher owned by Pearson plc.
In information science, profiling refers to the process of construction and application of user profiles generated by computerized data analysis.
A programming language is a formal language that specifies a set of instructions that can be used to produce various kinds of output.
pSeven is a design space exploration software platform developed by DATADVANCE, extending design, simulation and analysis capabilities and assisting in smarter and faster design decisions.
Psychometrics is a field of study concerned with the theory and technique of psychological measurement.
Python is an interpreted high-level programming language for general-purpose programming.
Qlucore is a bioinformatics company from Lund, Sweden, that provides software for the life science and biotech industries.
R is a programming language and free software environment for statistical computing and graphics that is supported by the R Foundation for Statistical Computing.
RapidMiner is a data science software platform developed by the company of the same name that provides an integrated environment for data preparation, machine learning, deep learning, text mining, and predictive analytics.
In statistics, a receiver operating characteristic curve, i.e. ROC curve, is a graphical plot that illustrates the diagnostic ability of a binary classifier system as its discrimination threshold is varied.
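Each point on a ROC curve is the (false-positive rate, true-positive rate) pair at one decision threshold; sweeping the threshold traces the curve. A toy sketch with invented scores and labels:

```python
def roc_point(scores, labels, threshold):
    """True- and false-positive rates of a scored binary classifier
    at a single decision threshold."""
    tp = sum(s >= threshold and y == 1 for s, y in zip(scores, labels))
    fp = sum(s >= threshold and y == 0 for s, y in zip(scores, labels))
    return tp / labels.count(1), fp / labels.count(0)

scores = [0.9, 0.8, 0.4, 0.3]
labels = [1, 0, 1, 0]  # illustrative ground truth
print(roc_point(scores, labels, 0.5))  # (0.5, 0.5)
```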
In statistical modeling, regression analysis is a set of statistical processes for estimating the relationships among variables.
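The simplest case, ordinary least squares with one predictor, has a closed form; a minimal sketch on invented data that lies exactly on a line:

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (single predictor):
    b = cov(x, y) / var(x), a = mean(y) - b * mean(x)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return a, b

a, b = fit_line([1, 2, 3, 4], [2, 4, 6, 8])
print(a, b)  # 0.0 2.0
```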
Reproducibility is the closeness of the agreement between the results of measurements of the same measurand carried out under changed conditions of measurement.
Rexer Analytics’s Annual Data Miner Survey is the largest survey of data mining, data science, and analytics professionals in the industry.
Robert Tibshirani (born July 10, 1956) is a Professor in the Departments of Statistics and Health Research and Policy at Stanford University.
SAS Institute (or SAS, pronounced "sass") is an American multinational developer of analytics software based in Cary, North Carolina.
Scikit-learn (formerly scikits.learn) is a free software machine learning library for the Python programming language.
SEMMA is an acronym that stands for Sample, Explore, Modify, Model, and Assess.
Sequential pattern mining is a topic of data mining concerned with finding statistically relevant patterns between data examples where the values are delivered in a sequence.
SIGKDD is the Association for Computing Machinery's (ACM) Special Interest Group (SIG) on Knowledge Discovery and Data Mining.
SIGMOD is the Association for Computing Machinery's Special Interest Group on Management of Data, which specializes in large-scale data management problems and databases.
Social media mining is the process of representing, analyzing, and extracting actionable patterns and trends from raw social media data.
A spatial database is a database that is optimized for storing and querying data that represents objects defined in a geometric space.
Springer Science+Business Media or Springer, part of Springer Nature since 2015, is a global publishing company that publishes books, e-books and peer-reviewed journals in science, humanities, technical and medical (STM) publishing.
IBM SPSS Modeler is a data mining and text analytics software application from IBM.
Statistica is an advanced analytics software package originally developed by StatSoft which was acquired by Dell in March 2014.
In machine learning and statistics, classification is the problem of identifying to which of a set of categories (sub-populations) a new observation belongs, on the basis of a training set of data containing observations (or instances) whose category membership is known.
A statistical hypothesis test, sometimes called confirmatory data analysis, tests a hypothesis on the basis of observing a process that is modeled via a set of random variables.
Statistical inference is the process of using data analysis to deduce properties of an underlying probability distribution.
A statistical model is a mathematical model that embodies a set of statistical assumptions concerning the generation of some sample data and similar data from a larger population.
Statistics is a branch of mathematics dealing with the collection, analysis, interpretation, presentation, and organization of data.
StatSoft is the original developer of Statistica.
"Stellar Wind" (or "Stellarwind") was the code name of a warrantless surveillance program begun under the George W. Bush administration's President's Surveillance Program (PSP).
Structured data analysis is the statistical data analysis of structured data.
In machine learning, support vector machines (SVMs, also support vector networks) are supervised learning models with associated learning algorithms that analyze data for classification and regression analysis.
Surveillance capitalism is a term, first introduced by John Bellamy Foster and Robert W. McChesney in Monthly Review in 2014 and later popularized by the academic Shoshana Zuboff, denoting a new genus of capitalism that monetizes data acquired through surveillance.
Tanagra is a free suite of machine learning software for research and academic purposes developed by Ricco Rakotomalala at the Lumière University Lyon 2, France.
Text mining, also referred to as text data mining and roughly equivalent to text analytics, is the process of deriving high-quality information from text.
The American Statistician is a quarterly peer-reviewed scientific journal covering statistics published by Taylor & Francis on behalf of the American Statistical Association.
The Review of Economic Studies (also known as RESTUD) is a quarterly peer-reviewed academic journal covering economics.
A time series is a series of data points indexed (or listed or graphed) in time order.
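A common first step in analyzing a time series is smoothing it, for example with a simple moving average over a fixed window. A minimal sketch (names are illustrative):

```python
def moving_average(series, window):
    """Simple moving average over a sliding window of fixed length."""
    return [sum(series[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(series))]

# Daily temperature readings, indexed in time order.
temps = [21, 23, 22, 25, 24, 26]
smoothed = moving_average(temps, 3)
print(smoothed)  # first value is the mean of [21, 23, 22]
```

Note that the smoothed series is shorter than the input by `window - 1` points, since the first full window only closes at the third observation.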
Torch is an open source machine learning library, a scientific computing framework, and a script language based on the Lua programming language.
Total Information Awareness (TIA) was a program of the United States Information Awareness Office that began during the 2003 fiscal year.
In machine learning, a data set is commonly partitioned into training, test, and (sometimes) validation sets: a model is fit on the training set, tuned on the validation set, and evaluated on the held-out test set.
Trevor John Hastie (born 27 June 1953) is a South African and American statistician and computer scientist.
UBM plc is a global business-to-business (B2B) events organiser headquartered in London, United Kingdom.
UIMA, short for Unstructured Information Management Architecture, is an OASIS standard for content analytics, originally developed at IBM.
The United States Congress is the bicameral legislature of the Federal government of the United States.
Usama M. Fayyad (born July 1965) is an American data scientist and a co-founder of the KDD conference series and the ACM SIGKDD association for Knowledge Discovery and Data Mining.
Vertica Systems is an analytic database management software company.
VLDB is an annual conference held by the non-profit Very Large Data Base Endowment Inc., whose mission is to promote and exchange scholarly work in databases and related fields throughout the world.
Web mining is the application of data mining techniques to discover patterns from the World Wide Web.
Web scraping, web harvesting, or web data extraction is data scraping used for extracting data from websites.
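A minimal web-scraping sketch using only Python's standard-library HTML parser; it extracts the target of every hyperlink from a page (the class name and the sample page are illustrative):

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collect the href attribute of every <a> tag in an HTML document."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.links.extend(value for name, value in attrs if name == "href")

page = '<html><body><a href="/about">About</a> <a href="/data">Data</a></body></html>'
parser = LinkExtractor()
parser.feed(page)
print(parser.links)  # ['/about', '/data']
```

In practice the HTML would be fetched over the network (e.g. with `urllib.request`), and sites' terms of service and robots.txt constrain what may be scraped.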
Waikato Environment for Knowledge Analysis (Weka) is a suite of machine learning software written in Java, developed at the University of Waikato, New Zealand.
In computing, Extensible Markup Language (XML) is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable.
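XML's machine-readability is what makes it useful as a data-mining interchange format (PMML, for instance, is XML-based). A small sketch of parsing an XML document with Python's standard library; the document content is invented for illustration:

```python
import xml.etree.ElementTree as ET

doc = """<catalog>
  <item id="1"><name>Weka</name></item>
  <item id="2"><name>ELKI</name></item>
</catalog>"""

root = ET.fromstring(doc)
# Walk the element tree: each <item> carries an attribute and a child element.
names = [item.find("name").text for item in root.findall("item")]
print(names)  # ['Weka', 'ELKI']
```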