Probability Primer - YouTube
A series of videos giving an introduction to some of the basic definitions, notation, and concepts one would encounter in a 1st year graduate probability course.
probability  math  statistics  video  elearning  towatch 
11 hours ago by lena
Dashboard Gelijke Kansen | Onderwijsmonitor | OCW in cijfers
This is the dashboard for equal opportunities in education. It monitors transitions across the entire educational career for different groups of pupils and students, and gives insight into how equal opportunities in education are developing.
education  nl  statistics 
7 days ago by lena
CRAN - Package reclin
Functions to assist in performing probabilistic record linkage and deduplication: generating pairs, comparing records, an EM algorithm for estimating m- and u-probabilities, and forcing one-to-one matching. Can also be used for pre- and post-processing for machine learning methods for record linkage.
r  statistics  survey 
7 days ago by lena
Explained Visually
Explained Visually (EV) is an experiment in making hard ideas intuitive, inspired by the work of Bret Victor's Explorable Explanations.

Regression, PCA, Eigenvalues, Pi, Sine/Cosine, Markov chains, Probability
math  programming  statistics  probability  visualization  pca  matrix  markov 
5 weeks ago by lena
Harper's Index | Harper's Magazine
Many fun or interesting simple statistics, usually each contrasted with some other statistic. Paywalled.
statistics  media 
6 weeks ago by lena
SNStatComp/awesome-official-statistics-software: An awesome list of statistical software packages useful for creating and accessing official statistics.
An item on this list is awesome because

it is free, open source, and available for download;
it is confirmed to be used in the production of official statistics by at least one institute, or
it provides access to official statistics publications.

We prefer packages that are reasonably easy to install and use, that have at least one stable version, and that are actively maintained.
statistics  resources 
9 weeks ago by lena
The European Statistical System (ESS) website is your single entry point to relevant information on the organization and activities of the ESS, both as a whole and for its individual partners.

The ESS website welcome page offers the latest news concerning life in the ESS partners. In addition, the news feeds' page provides news in RSS format, such as press releases, also from all ESS partners.
statistics  europe 
9 weeks ago by lena
Cleaning data with the statistical software R
Statistical Data Cleaning with applications in R
books  statistics  r  survey 
9 weeks ago by lena
Data Design: Visualising Quantities, Locations, Connections: Per Mollerup: 9781408191873: Books
Data Design: Visualising quantities, locations, connections is a lively and comprehensive introduction to data visualisation, illustrated with 199 instructive data displays. The book is for designers, journalists, editors, writers and anyone concerned with presenting factual information in a clear and effective way.
data  visualization  statistics  books  graphs  charts  plots 
9 weeks ago by lena
Methods of Comparison, Compared / Observable
Methods of Comparison, Compared
Log ratios are often used when considering growth, as with investment returns. For example, if a stock doubles and then halves, you’re back where you started: $\log(\tfrac{2}{1}) + \log(\tfrac{1}{2}) = 0$. On the other hand if a stock goes up by fifty percent then down by fifty percent, you’ve lost twenty-five percent of your investment: $(1 \times 0.5) - (1.5 \times 0.5) = -0.25$. This is why log scales are commonly used in stock price charts, such as change line charts and index charts.
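The arithmetic in this excerpt can be checked with a short sketch. The function names are illustrative and do not come from the Observable notebook; growth is expressed as a multiplicative factor (2.0 for doubling, 0.5 for halving).

```python
import math

def total_log_return(factors):
    """Sum of log growth factors; a result of 0 means the value is back where it started."""
    return sum(math.log(f) for f in factors)

def total_simple_return(factors):
    """Net fractional change after applying each growth factor in turn."""
    value = 1.0
    for f in factors:
        value *= f
    return value - 1.0

# Doubling then halving: the log returns cancel.
print(total_log_return([2.0, 0.5]))

# Up fifty percent, then down fifty percent: a net loss of a quarter.
print(total_simple_return([1.5, 0.5]))
```

Log returns add across periods while simple percentage changes do not, which is the property the excerpt relies on.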
maps  statistics  visualization  comparison 
9 weeks ago by lena
Counterfactuals and Causal Inference: Methods and Principles for Social Research (Analytical Methods for Social Research) (9781107694163): Stephen L. Morgan, Christopher Winship: Books
books  causality  statistics 
9 weeks ago by lena
If correlation doesn’t imply causation, then what does? | DDI
I often wonder how many people with real decision-making power – politicians, judges, and so on – are making decisions based on statistical studies, and yet they don’t understand even basic things like Simpson’s paradox.
causality  statistics  probability  research  science 
9 weeks ago by lena
ESRA | Deliverables & publications
ESRA (E-SURVEY OF ROAD USERS’ ATTITUDES) is a joint international initiative of 26 research centres and road safety institutes; the project has surveyed road users in 38 countries on 5 continents. The purpose of this network is to collect comparable data on the opinions, attitudes, and behaviour of road users concerning road safety and mobility, and to provide scientific evidence for policy making at the national and international levels. Vias institute initiated the project, and the first edition of the ESRA survey was launched in 2015. The institute continues to coordinate this fast-evolving endeavour, and the next edition will be in 2018.
traffic  safety  data  research  statistics  psychology  survey 
9 weeks ago by lena
Trevor Hastie - Publications
Books and papers about statistics, including The Elements of Statistical Learning.
statistics  books 
10 weeks ago by lena
Emulating R plots in Python – A Journey in Data
One of the simplest R commands that doesn’t have a direct equivalent in Python is plot() for linear regression models (it wraps plot.lm() when fed linear models). While Python has a vast array of plotting libraries, its more hands-on approach means some work is needed to replicate R’s plot(), which creates a group of diagnostic plots (residual, qq, scale-location, leverage) to assess model performance when applied to a fitted linear regression model.
python  r  statistics  plots 
10 weeks ago by lena
Practical Guide to Cluster Analysis in R (book)
Although there are several good books on unsupervised machine learning/clustering and related topics, we felt that many of them are either too high-level, theoretical or too advanced. Our goal was to write a practical guide to cluster analysis, elegant visualization and interpretation.

The main parts of the book include:

distance measures,
partitioning clustering,
hierarchical clustering,
cluster validation methods, as well as,
advanced clustering methods such as fuzzy clustering, density-based clustering and model-based clustering.
books  statistics  pca  r 
11 weeks ago by lena
PCA : Interpretation Examples — Stats366 / Stats 166 Course Notes
Short tutorial with nice examples using dudi.pca, based on the turtles and Olympic data.
pca  statistics 
11 weeks ago by lena
PCA - Principal Component Analysis Essentials - Articles - STHDA
Detailed PCA plot examples with FactoMineR/factoextra.
pca  r  statistics 
11 weeks ago by lena
Our World in Data
Living conditions around the world are changing rapidly. Explore how and why.
data  statistics  visualization 
11 weeks ago by lena
Summarising data using dot plots | R-bloggers
A dot plot is a type of display that compares counts, frequencies, totals or other summary measures for a series of categories.
plots  r  statistics 
june 2018 by lena
Bioconductor - pcaMethods
Vignettes (each available as PDF and R script): Data with outliers; Introduction; Missing value imputation. Also a PDF reference manual.
pca  r  statistics 
june 2018 by lena
Summated Rating Scale Construction | SAGE Publications Ltd
$20 ebook; recommended by a statistician.

The goal for any social scientist doing a survey is to develop a rating on some attitude, value or opinion - a summated rating scale. Aimed at helping researchers construct more effective scales, Spector shows how to determine the number of items necessary, the appropriate amount of response categories and the most productive wording of items, how to sort good items from bad (including item-remainder coefficients and Cronbach's alpha) and how to validate a scale, including dimensional validity from factor analysis. Written in a user-friendly manner, the book concludes with a step-by-step account of how to develop a summated rating scale based on classical test theory.
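As a rough illustration of one statistic named in this description, here is a minimal sketch of Cronbach's alpha using the standard formula α = k/(k−1) · (1 − Σ item variances / variance of total scores). The function name and the response data are made up for illustration and are not taken from Spector's book.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    items: one list of scores per item, all the same length
    (position i in every list belongs to respondent i).
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    # Total scale score for each respondent.
    totals = [sum(scores) for scores in zip(*items)]
    item_var_sum = sum(variance(scores) for scores in items)
    return (k / (k - 1)) * (1 - item_var_sum / variance(totals))

# Hypothetical responses: 3 items answered by 5 respondents.
responses = [
    [2, 4, 3, 5, 1],
    [3, 4, 3, 5, 2],
    [2, 5, 4, 4, 1],
]
print(round(cronbach_alpha(responses), 3))
```

Items that move together push the variance of the totals well above the sum of the item variances, so alpha approaches 1; unrelated items push it toward 0.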
statistics  psychology  books 
june 2018 by lena
Chances Are - The New York Times
The improbable thrills of probability theory.
bayes  statistics 
june 2018 by lena
Biostatistics for Biomedical Research (479 page pdf)
Lots of general info about what kind of tests to use when, and pitfalls to avoid.
statistics  books 
june 2018 by lena
Uses of the logarithm transformation in regression and forecasting
Change in natural log ≈ percentage change: The natural logarithm and its base number e have some magical properties, which you may remember from calculus (and which you may have hoped you would never meet again). For example, the function e^X is its own derivative, and the derivative of LN(X) is 1/X. But for purposes of business analysis, its great advantage is that small changes in the natural log of a variable are directly interpretable as percentage changes, to a very close approximation. The reason for this is that the graph of Y = LN(X) passes through the point (1, 0) and has a slope of 1 there, so it is tangent to the straight line whose equation is Y = X-1 (drawn as a dashed line in the plot on the source page).
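A small sketch of the approximation described here, comparing the change in natural log with the plain fractional change for a few step sizes. The helper names and the base value of 100 are arbitrary choices for illustration.

```python
import math

def pct_change(old, new):
    """Ordinary fractional change, e.g. 0.05 for a 5% increase."""
    return (new - old) / old

def log_change(old, new):
    """Change in the natural log of the variable."""
    return math.log(new) - math.log(old)

base = 100.0
for step in (0.01, 0.05, 0.10, 0.50):
    new = base * (1.0 + step)
    print(f"pct change {pct_change(base, new):.4f}   log change {log_change(base, new):.4f}")
```

The two columns agree closely for small steps and drift apart as the step grows, because LN(X) is only tangent to Y = X-1 at X = 1.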
math  statistics  regression 
june 2018 by lena
Index of /~skim43/stat550/Data
Exercise datasets for Johnson's multivariate statistical analysis book.
statistics  datasets 
may 2018 by lena
GitHub - timkpaine/lantern: Data exploration kit
Jupyter extension: An orchestration layer for plots and tables, dummy datasets, research, reports, and anything else a data scientist might need.
data  datascience  python  jupyter  statistics  plots 
may 2018 by lena
Principal Component Analysis (PCA) - A.B. Dufour - course2.pdf
Interesting tutorial, with 3D plots that explain the effects of scaling/centering. Uses dudi.pca R code: "dudi.pca deals with the variables and/or the individuals whereas princomp and prcomp deal with the individuals only."
pca  r  statistics 
may 2018 by lena
PCA-2016.pages - pcaTutorial.pdf
Nine-page introduction with basic R code and references.
pca  statistics 
may 2018 by lena
In short, it will often be ok to treat an ordinal variable as though it had linear effects. The greater parsimony that results from doing so may be enough to offset any disadvantages that result. But, there are ways to formally test whether the assumption of linearity is justified.
regression  statistics 
may 2018 by lena
Is two-tailed testing for directional research hypotheses tests legitimate? - ScienceDirect
This paper demonstrates that there is currently a widespread misuse of two-tailed testing for directional research hypotheses tests. One probable reason for this overuse of two-tailed testing is the seemingly valid beliefs that two-tailed testing is more conservative and safer than one-tailed testing. However, the authors examine the legitimacy of this notion and find it to be flawed. A second and more fundamental cause of the current problem is the pervasive oversight in making a clear distinct...
may 2018 by lena
Why is the squared difference so commonly used? - Cross Validated
Considering alternative losses opens up a rich set of possibilities: quantile regression, M-estimators, robust statistics, and much more can all be framed in this decision-theoretic way and justified using alternative loss functions. For a simple example, see Percentile Loss Functions.
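To make the idea of an alternative loss concrete, here is a minimal sketch of the quantile ("pinball") loss that underlies quantile regression. It is not taken from the linked answer; the function names and the toy data are assumptions for illustration, and the search is a brute-force scan rather than a real fitting routine.

```python
def pinball_loss(y_true, prediction, tau):
    """Average quantile (pinball) loss of one constant prediction at level tau."""
    total = 0.0
    for y in y_true:
        diff = y - prediction
        # Under-predictions cost tau per unit, over-predictions cost (1 - tau).
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(y_true)

def best_constant(y_true, tau):
    """Scan the observed values for the constant with the lowest pinball loss."""
    return min(y_true, key=lambda c: pinball_loss(y_true, c, tau))

data = [1, 2, 3, 4, 100]          # toy data with one outlier
print(best_constant(data, 0.5))   # tau = 0.5 targets the median
print(best_constant(data, 0.9))   # tau = 0.9 targets an upper quantile
```

With tau = 0.5 the best constant is the median, which is far less sensitive to the outlier than the mean that squared loss would select.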
may 2018 by lena
