probability

For example, if the risk factor is diabetes and the disease is cholecystitis, a hospital patient without diabetes is more likely to have cholecystitis than a member of the general population, since the patient must have had some non-diabetes (possibly cholecystitis-causing) reason to enter the hospital in the first place. That result will be obtained regardless of whether there is any association between diabetes and cholecystitis in the general population.

An example presented by Jordan Ellenb...
10 hours ago by hellsten
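The hospital-selection argument in the entry above can be checked with exact arithmetic. This is a minimal sketch under assumed numbers, not taken from the bookmarked page: the rates `P_CHOLECYSTITIS` and `P_OTHER` are hypothetical, the three conditions are assumed independent in the general population, and a person is assumed to be admitted if they have at least one of them. Because the calculation conditions on diabetes status, the prevalence of diabetes itself is not needed.

```python
from fractions import Fraction

# Hypothetical population rates, chosen only for illustration.
P_CHOLECYSTITIS = Fraction(1, 20)
P_OTHER = Fraction(1, 20)  # any other reason for being admitted


def p_chole_given_hospital(diabetic: bool) -> Fraction:
    """P(cholecystitis | hospitalised, diabetes status) by enumeration.

    Assumes diabetes, cholecystitis and "other reason" are independent
    in the general population, and that having any one of them leads
    to admission.
    """
    num = Fraction(0)
    den = Fraction(0)
    for chole in (True, False):
        for other in (True, False):
            weight = (P_CHOLECYSTITIS if chole else 1 - P_CHOLECYSTITIS) * (
                P_OTHER if other else 1 - P_OTHER
            )
            if diabetic or chole or other:  # this person is in hospital
                den += weight
                if chole:
                    num += weight
    return num / den


if __name__ == "__main__":
    print("general population:     ", float(P_CHOLECYSTITIS))
    print("hospital, diabetes:     ", float(p_chole_given_hospital(True)))
    print("hospital, no diabetes:  ", float(p_chole_given_hospital(False)))
```

Under these assumptions a diabetic patient has cholecystitis at the population rate, while a non-diabetic patient has it far more often, so diabetes and cholecystitis look negatively associated inside the hospital even though they are independent outside it.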
[1812.00681] Numerical computation of rare events via large deviation theory
"An overview of rare events algorithms based on large deviation theory (LDT) is presented. It covers a range of numerical schemes to compute the large deviation minimizer in various setups, and discusses best practices, common pitfalls, and implementation trade-offs. Generalizations, extensions, and improvements of the minimum action methods are proposed. These algorithms are tested on example problems which illustrate several common difficulties which arise e.g. when the forcing is degenerate or multiplicative, or the systems are infinite-dimensional. Generalizations to processes driven by non-Gaussian noises or random initial data and parameters are also discussed, along with the connection between the LDT-based approach reviewed here and other methods, such as stochastic field theory and optimal control. Finally, the integration of this approach in importance sampling methods using e.g. genealogical algorithms is explored."
to:NB  large_deviations  rar  rare-event_simulation  simulation  computational_statistics  probability  via:rvenkat
2 days ago by cshalizi
Probability in Dice Rolling - Newton and Pepys DataGenetics
The origin of this problem is that Samuel Pepys, apparently a gambling man, asked Isaac Newton which of these three events had the highest probability of occurring:
Game A: Throwing 6 dice and getting at least one six.
Game B: Throwing 12 dice and getting at least two sixes.
Game C: Throwing 18 dice and getting at least three sixes.

Pepys, who had quite a considerable wager on this, thought that Game C was the most likely event, and wrote to Newton asking for advice. A series of letters went back and forth as the problem was discussed. Newton arrived at the correct solution, though historians and mathematicians debate whether his correspondence on the matter contained a logical error in the explanation.

Have a think about the problem yourself for a few seconds before looking at the solution.

Which game do you think is most likely to win?
probability  dice  statistics  Newton  Pepys
3 days ago by Tonti
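The three games in the Newton and Pepys entry above are binomial tail probabilities, so they can be computed exactly. The following is a minimal sketch assuming fair, independent dice; the helper name `prob_at_least` is made up for this example and does not come from the DataGenetics article.

```python
from fractions import Fraction
from math import comb


def prob_at_least(k: int, n: int, p: Fraction = Fraction(1, 6)) -> Fraction:
    """P(at least k successes in n independent trials, each with probability p)."""
    # Sum the probabilities of 0 .. k-1 successes, then take the complement.
    below = sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k))
    return 1 - below


# (minimum number of sixes, number of dice) for each game
games = {"A": (1, 6), "B": (2, 12), "C": (3, 18)}
probs = {name: prob_at_least(k, n) for name, (k, n) in games.items()}

if __name__ == "__main__":
    for name, pr in probs.items():
        print(f"Game {name}: {float(pr):.4f}")
```

Running it shows Game A as the most likely (about 0.665), followed by Game B (about 0.619) and Game C (about 0.597), the opposite of what Pepys expected.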
[1706.04290] A general method for lower bounds on fluctuations of random variables
"There are many ways of establishing upper bounds on fluctuations of random variables, but there is no systematic approach for lower bounds. As a result, lower bounds are unknown in many important problems. This paper introduces a general method for lower bounds on fluctuations. The method is used to obtain new results for the stochastic traveling salesman problem, the stochastic minimal matching problem, the random assignment problem, the Sherrington-Kirkpatrick model of spin glasses, first-passage percolation and random matrices. A long list of open problems is provided at the end."
to:NB  probability  deviation_inequalities  via:vaguery
3 days ago by cshalizi
[1802.00211] Hoeffding's lemma for Markov Chains and its applications to statistical learning
"We extend Hoeffding's lemma to general-state-space and not necessarily reversible Markov chains. Let $\{X_i\}_{i \ge 1}$ be a stationary Markov chain with invariant measure $\pi$ and absolute spectral gap $1-\lambda$, where $\lambda$ is defined as the operator norm of the transition kernel acting on mean zero and square-integrable functions with respect to $\pi$. Then, for any bounded functions $f_i : x \mapsto [a_i, b_i]$, the sum of $f_i(X_i)$ is sub-Gaussian with variance proxy $\frac{1+\lambda}{1-\lambda} \cdot \sum_i \frac{(b_i - a_i)^2}{4}$. This result differs from the classical Hoeffding's lemma by a multiplicative coefficient of $(1+\lambda)/(1-\lambda)$, and simplifies to the latter when $\lambda = 0$. The counterpart of Hoeffding's inequality for Markov chains immediately follows. Our results assume none of countable state space, reversibility and time-homogeneity of Markov chains and cover time-dependent functions with various ranges. We illustrate the utility of these results by applying them to six problems in statistics and machine learning."
in_NB  deviation_inequalities  probability  stochastic_processes  markov_models
7 days ago by cshalizi
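To make the variance proxy in the Hoeffding entry above concrete, here is a small sketch that evaluates it and the tail bound it implies. The function names are invented for this example, and the tail bound is the generic Chernoff-style bound exp(-t^2 / (2 * proxy)) that holds for any sub-Gaussian sum; it is an assumption that this matches the form of the inequality stated in the paper.

```python
import math


def variance_proxy(ranges, lam=0.0):
    """(1 + lam) / (1 - lam) * sum((b - a)^2) / 4 for ranges [(a, b), ...].

    lam is the operator norm from the abstract; lam = 0 recovers the
    classical (independent) Hoeffding proxy.
    """
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must lie in [0, 1)")
    return (1 + lam) / (1 - lam) * sum((b - a) ** 2 for a, b in ranges) / 4


def tail_bound(t, ranges, lam=0.0):
    """Generic sub-Gaussian bound on P(sum - E[sum] >= t)."""
    return math.exp(-(t**2) / (2 * variance_proxy(ranges, lam)))


if __name__ == "__main__":
    ranges = [(0.0, 1.0)] * 100  # 100 functions, each bounded in [0, 1]
    for lam in (0.0, 0.5, 0.9):
        print(lam, variance_proxy(ranges, lam), tail_bound(10.0, ranges, lam))
```

The loop shows how the bound weakens as `lam` approaches 1, that is, as the chain mixes more slowly.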
[1805.10721] Bernstein's inequality for general Markov chains
"We prove a sharp Bernstein inequality for general-state-space and not necessarily reversible Markov chains. It is sharp in the sense that the variance proxy term is optimal. Our result covers the classical Bernstein's inequality for independent random variables as a special case."
in_NB  deviation_inequalities  probability  stochastic_processes  markov_models  re:almost_none
7 days ago by cshalizi
Seeing Theory
This really is one of our favourite things on the web, a visual and interactive journey