**lena : statistics**
Probability Primer - YouTube

11 hours ago by lena

A series of videos giving an introduction to some of the basic definitions, notation, and concepts one would encounter in a 1st year graduate probability course.

probability
math
statistics
video
elearning
towatch
Graphic presentation

7 days ago by lena

1939 book on presenting data

graphics
history
visualization
books
statistics
Dashboard Gelijke Kansen | Onderwijsmonitor | OCW in cijfers

education
nl
statistics

This is the equal opportunities in education dashboard. For various groups of pupils and students, it monitors the transitions across the entire educational career and gives insight into how equal opportunity in education is developing.

CRAN - Package reclin

7 days ago by lena

Functions to assist in performing probabilistic record linkage and deduplication: generating pairs, comparing records, em-algorithm for estimating m- and u-probabilities, forcing one-to-one matching. Can also be used for pre- and post-processing for machine learning methods for record linkage.
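The m- and u-probabilities mentioned above feed into a per-field match weight. The following is a minimal Python sketch of the standard Fellegi-Sunter weighting, not the reclin API; the function name, field names, and probability values are hypothetical and chosen only for illustration.

```python
import math

def match_weight(agreements, m, u):
    """Sum log2 agreement/disagreement weights over the compared fields.

    agreements: dict field -> bool (True if the two records agree on the field)
    m: dict field -> P(fields agree | records are a true match)
    u: dict field -> P(fields agree | records are not a match)
    """
    total = 0.0
    for field, agrees in agreements.items():
        if agrees:
            # Agreement is evidence for a match when m > u.
            total += math.log2(m[field] / u[field])
        else:
            # Disagreement is evidence against a match when m > u.
            total += math.log2((1 - m[field]) / (1 - u[field]))
    return total

# Hypothetical probabilities (in practice these would be estimated, e.g. by EM).
m_probs = {"last_name": 0.95, "birth_year": 0.90}
u_probs = {"last_name": 0.01, "birth_year": 0.05}

both_agree = match_weight({"last_name": True, "birth_year": True}, m_probs, u_probs)
both_differ = match_weight({"last_name": False, "birth_year": False}, m_probs, u_probs)
print(both_agree, both_differ)
```

Pairs with a high total weight are candidate matches; the threshold for accepting them is a separate decision not shown here.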

r
statistics
survey
RISQ - Representative Indicators for Survey Quality - Cathie Marsh Institute for Social Research - The University of Manchester

22 days ago by lena

risq-project.eu representativity indicators for survey quality.

survey
statistics
tools
r
Explained Visually

5 weeks ago by lena

Explained Visually (EV) is an experiment in making hard ideas intuitive, inspired by the work of Bret Victor's Explorable Explanations.

Regression, PCA, Eigenvalues, Pi, Sine/Cosine, Markov chains, Probability

math
programming
statistics
probability
visualization
pca
matrix
markov
Harper's Index | Harper's Magazine

6 weeks ago by lena

many fun or interesting simple statistics that are usually contrasted with some other statistic. paywalled.

statistics
media
SNStatComp/awesome-official-statistics-software: An awesome list of statistical software packages useful for creating and accessing official statistics.

9 weeks ago by lena

An item on this list is awesome because

it is free, open source, and available for download;

it is confirmed to be used in the production of official statistics by at least one institute, or

it provides access to official statistics publications.

We prefer packages that are reasonably easy to install and use, that have at least one stable version, and that are actively maintained.

statistics
resources
Home

9 weeks ago by lena

The European Statistical System (ESS) website is your single entry point to relevant information on the organization and activities of the ESS, both as a whole and for its individual partners.

The ESS website welcome page offers the latest news concerning life in the ESS partners. In addition, the news feeds' page provides news in RSS format, such as press releases, also from all ESS partners.

statistics
europe
Data cleaning with the statistical software R

9 weeks ago by lena

Statistical Data Cleaning with applications in R

books
statistics
r
survey
Data Design: Visualising Quantities, Locations, Connections: Per Mollerup: 9781408191873: Amazon.com: Books

9 weeks ago by lena

Data Design: Visualising quantities, locations, connections is a lively and comprehensive introduction to data visualisation, illustrated with 199 instructive data displays. The book is for designers, journalists, editors, writers and anyone concerned with presenting factual information in a clear and effective way.

data
visualization
statistics
books
graphs
charts
plots
Creating More Effective Graphs: Naomi B. Robbins: 9780985911126: Amazon.com: Books

9 weeks ago by lena

Also covers trellis graphs

statistics
visualization
plots
charts
graphs
books
Methods of Comparison, Compared / Observable

9 weeks ago by lena

Methods of Comparison, Compared

Log ratios are often used when considering growth, as with investment returns. For example, if a stock doubles and then halves, you’re back where you started: $\log(\tfrac{2}{1}) + \log(\tfrac{1}{2}) = 0$. On the other hand, if a stock goes up by fifty percent then down by fifty percent, you’ve lost twenty-five percent of your investment: $(1 \times 0.5) - (1.5 \times 0.5) = -0.25$. This is why log scales are commonly used in stock price charts, such as this change line chart and index chart.
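The arithmetic in the excerpt above can be checked with a short Python sketch. The helper name `log_return` is an assumption made for this example and does not come from the linked notebook.

```python
import math

def log_return(start, end):
    """Log ratio of an ending price to a starting price."""
    return math.log(end / start)

# Doubling then halving: the two log ratios cancel.
double_then_halve = log_return(1, 2) + log_return(2, 1)

# Up fifty percent, then down fifty percent: the percentages do not cancel.
price = 1.0
price *= 1.5   # +50%
price *= 0.5   # -50%
net_change = price - 1.0

# The same path in log terms: the sum equals log(0.75), matching the 25% loss.
up_then_down = log_return(1.0, 1.5) + log_return(1.5, 0.75)

print(double_then_halve, net_change, up_then_down)
```

Because log ratios add along a path, equal and opposite moves show up as equal and opposite distances on a log scale.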

maps
statistics
visualization
comparison
Amazon.com: Counterfactuals and Causal Inference: Methods and Principles for Social Research (Analytical Methods for Social Research) (9781107694163): Stephen L. Morgan, Christopher Winship: Books

9 weeks ago by lena

books
causality
statistics
If correlation doesn’t imply causation, then what does? | DDI

9 weeks ago by lena

I often wonder how many people with real decision-making power – politicians, judges, and so on – are making decisions based on statistical studies, and yet they don’t understand even basic things like Simpson’s paradox.

causality
statistics
probability
research
science
Statistics – accidents data - European Commission

9 weeks ago by lena

Not really datasets, but pdf documents with data

datasets
europe
traffic
safety
statistics
ESRA | Deliverables & publications

9 weeks ago by lena

ESRA (E-SURVEY OF ROAD USERS’ ATTITUDES) is a joint international initiative of 26 research centres and road safety institutes; the project has surveyed road users in 38 countries on 5 continents. The purpose of this network is to collect comparable data on the opinions, attitudes, and behaviour of road users concerning road safety and mobility, and to provide scientific evidence for policy making at the national and international levels. Vias institute initiated the project, and the first edition of the ESRA survey was launched in 2015. The institute continues to coordinate this fast-evolving endeavour, and the next edition will be in 2018.

traffic
safety
data
research
statistics
psychology
survey
Trevor Hastie - Publications

10 weeks ago by lena

Books and papers about statistics. Elements of statistical learning and others.

statistics
books
Emulating R plots in Python – A Journey in Data

10 weeks ago by lena

One of the simplest R commands that doesn’t have a direct equivalent in Python is plot() for linear regression models (it wraps plot.lm() when fed linear models). While Python has a vast array of plotting libraries, its more hands-on approach necessitates some intervention to replicate R’s plot(), which creates a group of diagnostic plots (residual, qq, scale-location, leverage) to assess model performance when applied to a fitted linear regression model.
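As a rough illustration of what those four diagnostic plots display, here is a plain-Python sketch that computes the underlying quantities for a one-predictor regression. It draws nothing and is not the code from the linked post; the function name and the data are hypothetical.

```python
import math

def simple_ols_diagnostics(x, y):
    """Fit y = a + b*x by least squares and return the plotted quantities."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean

    fitted = [intercept + slope * xi for xi in x]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    # Residual standard error with n - 2 degrees of freedom.
    sigma = math.sqrt(sum(r ** 2 for r in residuals) / (n - 2))
    # Leverage (hat-matrix diagonal) for a single predictor plus intercept.
    leverage = [1 / n + (xi - x_mean) ** 2 / sxx for xi in x]
    standardized = [r / (sigma * math.sqrt(1 - h))
                    for r, h in zip(residuals, leverage)]
    return {
        "fitted": fitted,                      # x-axis of residuals-vs-fitted
        "residuals": residuals,                # y-axis of residuals-vs-fitted
        "standardized": standardized,          # Q-Q plot and residuals-vs-leverage
        "scale_location": [math.sqrt(abs(s)) for s in standardized],
        "leverage": leverage,                  # x-axis of residuals-vs-leverage
    }

# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5, 6]
y = [1.2, 1.9, 3.2, 3.8, 5.1, 6.3]
diag = simple_ols_diagnostics(x, y)
print(diag["leverage"])
```

Each returned list can be handed to any plotting library; the Q-Q plot additionally needs the sorted standardized residuals paired with normal quantiles.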

python
r
statistics
plots
Practical Guide to Cluster Analysis in R (book)

11 weeks ago by lena

Although there are several good books on unsupervised machine learning/clustering and related topics, we felt that many of them are either too high-level, theoretical or too advanced. Our goal was to write a practical guide to cluster analysis, elegant visualization and interpretation.

The main parts of the book include:

distance measures,

partitioning clustering,

hierarchical clustering,

cluster validation methods, as well as,

advanced clustering methods such as fuzzy clustering, density-based clustering and model-based clustering.

books
statistics
pca
r
PCA : Interpretation Examples — Stats366 / Stats 166 Course Notes

11 weeks ago by lena

Short tutorial, nice examples with dudi.pca. Turtles and olympic data

pca
statistics
RPubs - ggplot theme for publication ready figures

11 weeks ago by lena

much better looking plots

r
statistics
visualization
plots
PCA - Principal Component Analysis Essentials - Articles - STHDA

11 weeks ago by lena

detailed pca plot examples with FactoMineR/factoextra

pca
r
statistics
Our World in Data

11 weeks ago by lena

Living conditions around the world are changing rapidly. Explore how and why.

data
statistics
visualization
Lattice - Multivariate Data Visualization with R - Figures and Code

june 2018 by lena

Super useful, many examples.

visualization
r
statistics
plots
Summarising data using dot plots | R-bloggers

june 2018 by lena

A dot plot is a type of display that compares counts, frequencies, totals or other summary measures for a series of categories.

plots
r
statistics
Bioconductor - pcaMethods

june 2018 by lena

Vignettes (PDF and R script): Data with outliers; Introduction; Missing value imputation. Reference manual (PDF).

pca
r
statistics
Summated Rating Scale Construction | SAGE Publications Ltd

june 2018 by lena

$20 ebook, was recommended by statistician.

The goal for any social scientist doing a survey is to develop a rating on some attitude, value or opinion - a summated rating scale. Aimed at helping researchers construct more effective scales, Spector shows how to determine the number of items necessary, the appropriate number of response categories and the most productive wording of items, how to sort good items from bad (including item-remainder coefficients and Cronbach's alpha) and how to validate a scale, including dimensional validity from factor analysis. Written in a user-friendly manner, the book concludes with a step-by-step account of how to develop a summated rating scale based on classical test theory.
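For orientation, Cronbach's alpha as mentioned in the blurb can be computed with the usual formula, alpha = k/(k-1) * (1 - sum of item variances / variance of the total score). This is a minimal Python sketch, not taken from the book; the function name and the response data are hypothetical.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: list of k lists, each holding one item's scores per respondent."""
    k = len(items)
    # Summated score per respondent (one value per row across items).
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

# Hypothetical 5-point responses from five respondents on three items.
item_scores = [
    [4, 3, 5, 2, 4],
    [5, 3, 4, 2, 4],
    [4, 2, 5, 3, 5],
]
alpha = cronbach_alpha(item_scores)
print(round(alpha, 3))
```

Items that move together make the variance of the total large relative to the item variances, which pushes alpha toward 1.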

statistics
psychology
books
Chances Are - The New York Times

june 2018 by lena

The improbable thrills of probability theory.

bayes
statistics
Introductory Probability and Statistics - YouTube

june 2018 by lena

By Barbara Oakley

statistics
video
Biostatistics for Biomedical Research (479 page pdf)

june 2018 by lena

Lots of general info about what kind of tests to use when, and pitfalls to avoid.

statistics
books
Uses of the logarithm transformation in regression and forecasting

june 2018 by lena

Change in natural log ≈ percentage change: The natural logarithm and its base number e have some magical properties, which you may remember from calculus (and which you may have hoped you would never meet again). For example, the function e^X is its own derivative, and the derivative of LN(X) is 1/X. But for purposes of business analysis, its great advantage is that small changes in the natural log of a variable are directly interpretable as percentage changes, to a very close approximation. The reason for this is that the graph of Y = LN(X) passes through the point (1, 0) and has a slope of 1 there, so it is tangent to the straight line whose equation is Y = X-1 (the dashed line in the plot below):
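A quick numeric check of that approximation, as a Python sketch. The helper names and the base value of 100 are assumptions made for this example.

```python
import math

def log_change(old, new):
    """Difference in natural logs between two positive values."""
    return math.log(new) - math.log(old)

def pct_change(old, new):
    """Ordinary proportional change, e.g. 0.05 for +5%."""
    return (new - old) / old

base = 100.0
for pct in (0.01, 0.05, 0.20, 0.50):
    new = base * (1 + pct)
    gap = pct_change(base, new) - log_change(base, new)
    print(f"{pct:.0%} change: log change {log_change(base, new):.4f}, gap {gap:.4f}")
```

For a 1% change the two measures agree to about four decimal places; by 50% the log change is only about 0.405, so the shortcut holds for small changes only.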

math
statistics
regression
PythonDataScienceHandbook/05.09-Principal-Component-Analysis.ipynb at master · jakevdp/PythonDataScienceHandbook

june 2018 by lena

Mostly examples about PCA for images: recognize faces after dimension reduction.

jupyter
pca
statistics
Index of /~skim43/stat550/Data

may 2018 by lena

exercise datasets for Johnson multivariate statistical analysis book

statistics
datasets
GitHub - timkpaine/lantern: Data exploration kit

may 2018 by lena

Jupyter extension: An orchestration layer for plots and tables, dummy datasets, research, reports, and anything else a data scientist might need.

data
datascience
python
jupyter
statistics
plots
Principal Component Analysis (PCA) - A.B. Dufour - course2.pdf

may 2018 by lena

Interesting tutorial, with 3D plots that explain effects of scaling/centering. Uses dudi.pca R code: "dudi.pca deals with the variables and/or the individuals whereas princomp…"

pca
r
statistics
PCA-2016.pages - pcaTutorial.pdf

may 2018 by lena

9-page introduction with basic R code and references

pca
statistics
OrdinalIndependent.pdf

may 2018 by lena

In short, it will often be ok to treat an ordinal variable as though it had linear effects

regression
statistics
Is two-tailed testing for directional research hypotheses tests legitimate? - ScienceDirect

may 2018 by lena

This paper demonstrates that there is currently a widespread misuse of two-tailed testing for directional research hypotheses tests. One probable reason for this overuse of two-tailed testing is the seemingly valid belief that two-tailed testing is more conservative and safer than one-tailed testing. However, the authors examine the legitimacy of this notion and find it to be flawed. A second and more fundamental cause of the current problem is the pervasive oversight in making a clear distinct...

statistics
Why is the squared difference so commonly used? - Cross Validated

may 2018 by lena

Considering alternative losses opens up a rich set of possibilities: quantile regression, M-estimators, robust statistics, and much more can all be framed in this decision-theoretic way and justified using alternative loss functions. For a simple example, see Percentile Loss Functions.
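To make the decision-theoretic framing concrete, the sketch below searches a grid for the single summary value that minimizes each loss. It is a brute-force Python illustration, not code from the linked answer; the data, the grid, and the function names are hypothetical.

```python
def squared_loss(c, data):
    return sum((x - c) ** 2 for x in data)

def absolute_loss(c, data):
    return sum(abs(x - c) for x in data)

def pinball_loss(c, data, tau):
    """Quantile (percentile) loss: under-prediction costs tau, over-prediction 1 - tau."""
    return sum(tau * (x - c) if x >= c else (1 - tau) * (c - x) for x in data)

def argmin_on_grid(loss, grid):
    """Return the grid value with the smallest loss (brute-force search)."""
    return min(grid, key=loss)

data = [1, 2, 3, 4, 100]                    # one large outlier
grid = [i / 10 for i in range(0, 1001)]     # 0.0, 0.1, ..., 100.0

best_squared = argmin_on_grid(lambda c: squared_loss(c, data), grid)
best_absolute = argmin_on_grid(lambda c: absolute_loss(c, data), grid)
best_q90 = argmin_on_grid(lambda c: pinball_loss(c, data, 0.9), grid)
print(best_squared, best_absolute, best_q90)
```

Squared loss lands on the mean (22.0, pulled toward the outlier), absolute loss on the median (3.0), and the pinball loss with tau = 0.9 on an upper quantile (100.0), which is one way to see why the choice of loss matters.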

statistics
Wilcoxon-Mann-Whitney Test Calculator

may 2018 by lena

calculates exact p-values

statistics
calculator
nonparametric
