41 relations: Annals of Applied Probability, Annals of Statistics, Asymptote, Bayes' theorem, Bulletin of the American Mathematical Society, Clinical trial, Condorcet criterion, Condorcet paradox, Dynamic routing, Gambling, Germany, Gittins index, Greedy algorithm, Herbert Robbins, John C. Gittins, Journal of the Royal Statistical Society, Lecture Notes in Computer Science, Markov decision process, Medical ethics, Michael Katehakis, Nonparametric regression, Open-source model, Operations Research (journal), Optimal stopping, Peter Whittle (mathematician), Pharmaceutical industry, Portfolio (finance), Prisoner's dilemma, Probability distribution, Probability theory, Regret (decision theory), Reinforcement learning, Search theory, SIAM Journal on Computing, Singular-value decomposition, Slot machine, Softmax function, Stochastic scheduling, Thompson sampling, Tikhonov regularization, World War II.
The Annals of Applied Probability is a peer-reviewed mathematics journal published by the Institute of Mathematical Statistics.
The Annals of Statistics is a peer-reviewed statistics journal published by the Institute of Mathematical Statistics.
In analytic geometry, an asymptote of a curve is a line such that the distance between the curve and the line approaches zero as one or both of the x or y coordinates tends to infinity.
In probability theory and statistics, Bayes’ theorem (alternatively Bayes’ law or Bayes' rule, also written as Bayes’s theorem) describes the probability of an event, based on prior knowledge of conditions that might be related to the event.
The Bulletin of the American Mathematical Society is a quarterly mathematical journal published by the American Mathematical Society.
Clinical trials are experiments or observations done in clinical research.
The Condorcet candidate (Condorcet winner) is the person who would win a two-candidate election against each of the other candidates in a plurality vote.
The Condorcet paradox (also known as voting paradox or the paradox of voting) in social choice theory is a situation noted by the Marquis de Condorcet in the late 18th century, in which collective preferences can be cyclic, even if the preferences of individual voters are not cyclic.
Dynamic routing, also called adaptive routing, is a process where a router can forward data via a different route or given destination based on the current conditions of the communication circuits within a system.
Gambling is the wagering of money or something of value (referred to as "the stakes") on an event with an uncertain outcome with the primary intent of winning money or material goods.
Germany (Deutschland), officially the Federal Republic of Germany (Bundesrepublik Deutschland), is a sovereign state in central-western Europe.
The Gittins index is a measure of the reward that can be achieved by a random process bearing a termination state and evolving from its present state onward, under the option of terminating the said process at every later stage with the accrual of the probabilistic expected reward from that stage up to the attainment of its termination state.
A greedy algorithm is an algorithmic paradigm that follows the problem solving heuristic of making the locally optimal choice at each stage with the intent of finding a global optimum.
Herbert Ellis Robbins (January 12, 1915 – February 12, 2001) was an American mathematician and statistician.
John Charles Gittins (born 1938) is a researcher in applied probability and operations research, who is a professor and Emeritus Fellow at Keble College, Oxford University.
The Journal of the Royal Statistical Society is a peer-reviewed scientific journal of statistics.
Springer Lecture Notes in Computer Science (LNCS) is a series of computer science books published by Springer Science+Business Media (formerly Springer-Verlag) since 1973.
Markov decision processes (MDPs) provide a mathematical framework for modeling decision making in situations where outcomes are partly random and partly under the control of a decision maker.
Medical ethics is a system of moral principles that apply values to the practice of clinical medicine and in scientific research.
Michael N. Katehakis (Μιχαήλ Ν. Κατεχάκης; born 1952) is a Professor of Management Science at Rutgers University.
Nonparametric regression is a category of regression analysis in which the predictor does not take a predetermined form but is constructed according to information derived from the data.
The open-source model is a decentralized software-development model that encourages open collaboration.
Operations Research is a bimonthly peer-reviewed academic journal covering operations research that is published by INFORMS.
In mathematics, the theory of optimal stopping or early stopping is concerned with the problem of choosing a time to take a particular action, in order to maximise an expected reward or minimise an expected cost.
Peter Whittle (born 27 February 1927, in Wellington, New Zealand) is a mathematician and statistician, working in the fields of stochastic nets, optimal control, time series analysis, stochastic optimisation and stochastic dynamics. From 1967 to 1994, he was the Churchill Professor of Mathematics for Operational Research at the University of Cambridge.
The pharmaceutical industry (or medicine industry) is the commercial industry that discovers, develops, produces, and markets drugs or pharmaceutical drugs for use as different types of medicine and medications.
In finance, a portfolio is a collection of investments held by an investment company, hedge fund, financial institution or individual.
The prisoner's dilemma is a standard example of a game analyzed in game theory that shows why two completely rational individuals might not cooperate, even if it appears that it is in their best interests to do so.
In probability theory and statistics, a probability distribution is a mathematical function that provides the probabilities of occurrence of different possible outcomes in an experiment.
Probability theory is the branch of mathematics concerned with probability.
In decision theory, on making decisions under uncertainty—should information about the best course of action arrive after taking a fixed decision—the human emotional response of regret is often experienced.
Reinforcement learning (RL) is an area of machine learning inspired by behaviourist psychology, concerned with how software agents ought to take actions in an environment so as to maximize some notion of cumulative reward.
In microeconomics, search theory studies buyers or sellers who cannot instantly find a trading partner, and must therefore search for a partner prior to transacting.
The SIAM Journal on Computing is a scientific journal focusing on the mathematical and formal aspects of computer science.
In linear algebra, the singular-value decomposition (SVD) is a factorization of a real or complex matrix.
A slot machine (American English), known variously as a fruit machine (British English), puggy (Scottish English), the slots (Canadian and American English), poker machine/pokies (Australian English and New Zealand English), or simply slot (American English), is a casino gambling machine with three or more reels which spin when a button is pushed.
In mathematics, the softmax function, or normalized exponential function, is a generalization of the logistic function that "squashes" a -dimensional vector \mathbf of arbitrary real values to a -dimensional vector \sigma(\mathbf) of real values, where each entry is in the range (0, 1, and all the entries add up to 1. The function is given by In probability theory, the output of the softmax function can be used to represent a categorical distribution – that is, a probability distribution over different possible outcomes. In fact, it is the gradient-log-normalizer of the categorical probability distribution. The softmax function is also the gradient of the LogSumExp function. The softmax function is used in various multiclass classification methods, such as multinomial logistic regression (also known as softmax regression), multiclass linear discriminant analysis, naive Bayes classifiers, and artificial neural networks. Specifically, in multinomial logistic regression and linear discriminant analysis, the input to the function is the result of distinct linear functions, and the predicted probability for the 'th class given a sample vector and a weighting vector is: This can be seen as the composition of linear functions \mathbf \mapsto \mathbf^\mathsf\mathbf_1, \ldots, \mathbf \mapsto \mathbf^\mathsf\mathbf_K and the softmax function (where \mathbf^\mathsf\mathbf denotes the inner product of \mathbf and \mathbf). The operation is equivalent to applying a linear operator defined by \mathbf to vectors \mathbf, thus transforming the original, probably highly-dimensional, input to vectors in a -dimensional space \mathbb^K.
Stochastic scheduling concerns scheduling problems involving random attributes, such as random processing times, random due dates, random weights, and stochastic machine breakdowns.
In artificial intelligence, Thompson sampling, named after William R. Thompson, is a heuristic for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem.
Tikhonov regularization, named for Andrey Tikhonov, is the most commonly used method of regularization of ill-posed problems.
World War II (often abbreviated to WWII or WW2), also known as the Second World War, was a global war that lasted from 1939 to 1945, although conflicts reflecting the ideological clash between what would become the Allied and Axis blocs began earlier.
Adversarial bandit, Bandit (machine learning), Bandit model, Bandit problem, Bandit process, Contextual bandit algorithm, E-greedy strategy, Epsilon-greedy strategy, K armed bandit, K-armed bandit, Multi armed bandit, Multi-armed bandit problem, Multi-armed bandits, Multiarmed bandit, Multi–armed bandit, N armed bandit, N-armed bandit, Two armed bandit, Two-armed bandit.