tsuomela : data-science   131

CRAN - Package jsonlite
"A fast JSON parser and generator optimized for statistical data and the web. Started out as a fork of 'RJSONIO', but has been completely rewritten in recent versions. The package offers flexible, robust, high performance tools for working with JSON in R and is particularly powerful for building pipelines and interacting with a web API. The implementation is based on the mapping described in the vignette (Ooms, 2014). In addition to converting JSON data from/to R objects, 'jsonlite' contains functions to stream, validate, and prettify JSON data. The unit tests included with the package verify that all edge cases are encoded and decoded consistently for use with dynamic data in systems and applications."
r  package  json  data-science 
14 days ago by tsuomela
Introduction
"The Turing Way is a lightly opinionated guide to reproducible data science."
data-science  guide  howto 
may 2019 by tsuomela
First Python Notebook — First Python Notebook 1.0 documentation
"A step-by-step guide to analyzing data with Python and the Jupyter Notebook."
data-science  python  programming  notebook 
december 2018 by tsuomela
SciServer – Collaborative data-driven science
"SciServer is a revolutionary new approach to doing science by bringing the analysis to the data. SciServer consists of data hosting services coupled with integrated Tools that work together to create a full-featured system."
data-science  big-data 
december 2018 by tsuomela
[1710.00027v1] Toward a System Building Agenda for Data Integration
"In this paper we argue that the data management community should devote far more effort to building data integration (DI) systems, in order to truly advance the field. Toward this goal, we make three contributions. First, we draw on our recent industrial experience to discuss the limitations of current DI systems. Second, we propose an agenda to build a new kind of DI systems to address these limitations. These systems guide users through the DI workflow, step by step. They provide tools to address the "pain points" of the steps, and tools are built on top of the Python data science and Big Data ecosystem (PyData). We discuss how to foster an ecosystem of such tools within PyData, then use it to build DI systems for collaborative/cloud/crowd/lay user settings. Finally, we discuss ongoing work at Wisconsin, which suggests that these DI systems are highly promising and building them raises many interesting research challenges."
research-data  management  integration  sharing  data-science 
november 2018 by tsuomela
Enigma Labs | Temperature Anomalies
"Every day, the Global Historical Climatology Network collects temperatures from 90,000 weather stations. Dating back as far as the late 1700's, the records provide an incredible source of insight into our changing climate. Using this data, we can determine what the weather is normally like for most places on Earth. We can tell you that the average low temperature in New York City on January 11th is 29°F and that the average high temperature in Los Angeles on July 24th is 80°F. Once we know what temperatures to expect on any given day with a certain degree of confidence, we can sift out the uneventful days, leaving only anomalous weather events."
data-science  demonstration  weather  environment  temperature  climate-change  public-data 
september 2018 by tsuomela
How to Make Better-Looking, More Readable Charts in R | FlowingData
"Defaults are generalized settings to work with many datasets. This is fine for analysis, but data graphics for presentation benefit from context-specific design."
data-science  visualization  r  tutorial 
august 2018 by tsuomela
Data Love - The Seduction and Betrayal of Digital Technologies | Columbia University Press
"Intelligence services, government administrations, businesses, and a growing majority of the population are hooked on the idea that big data can reveal patterns and correlations in everyday life. Initiated by software engineers and carried out through algorithms, the mining of big data has sparked a silent revolution. But algorithmic analysis and data mining are not simply byproducts of media development or the logical consequences of computation. They are the radicalization of the Enlightenment's quest for knowledge and progress. Data Love argues that the "cold civil war" of big data is taking place not among citizens or between the citizen and government but within each of us. Roberto Simanowski elaborates on the changes data love has brought to the human condition while exploring the entanglements of those who—out of stinginess, convenience, ignorance, narcissism, or passion—contribute to the amassing of ever more data about their lives, leading to the statistical evaluation and individual profiling of their selves. Writing from a philosophical standpoint, Simanowski illustrates the social implications of technological development and retrieves the concepts, events, and cultural artifacts of past centuries to help decode the programming of our present."
book  publisher  data-science  data-mining  epistemology 
august 2018 by tsuomela
Data, a first-class research output
"The Make Data Count (MDC) project is funded by the Alfred P. Sloan Foundation to develop and deploy the social and technical infrastructure necessary to elevate data to a first-class research output alongside more traditional products, such as publications. It will run between May 2017 and April 2019. The project will address the significant social as well as technical barriers to widespread incorporation of data-level metrics in the research data management ecosystem through consultation, recommendation, new technical capability, and community outreach. Project work will build upon long-standing partner initiatives supporting research data management and DLM, leverage prior Sloan investments in key technologies such as Lagotto, and enlist the cooperation of the research, library, funder, and publishing stakeholder communities."
research-data  management  metrics  altmetrics  data-science  data  publishing  scholarly-communication 
may 2017 by tsuomela
Data School
"My name is Kevin Markham. I'm a data scientist and a teacher. Previously, I was the lead instructor for General Assembly's 11-week data science course in Washington, DC, as well as an instructor fellow, responsible for training and mentoring new data science instructors. I have over 400 hours of classroom experience teaching data science in Python. Here are testimonials about my teaching."
weblog-individual  courses  statistics  data-science 
april 2017 by tsuomela
CS109 Data Science
"Learning from data in order to gain useful predictions and insights. This course introduces methods for five key facets of an investigation: data wrangling, cleaning, and sampling to get a suitable data set; data management to be able to access big data quickly and reliably; exploratory data analysis to generate hypotheses and intuition; prediction based on statistical methods such as regression and classification; and communication of results through visualization, stories, and interpretable summaries."
data-science  courses  open-education  school(Harvard) 
april 2017 by tsuomela
Welcome to the NEON Data Skills Portal! – NEON Data Skills
"This site contains data lessons, background materials and other resources that support working with large spatio-temporal datasets, like those offered by the NEON project. We welcome any comments and feedback that you have and also materials that support or expand upon what’s available on this site!"
data-science  education  data-sources  tutorials  pedagogy  lessons 
april 2017 by tsuomela
[1502.05256] Cultural Anthropology Through the Lens of Wikipedia - A Comparison of Historical Leadership Networks in the English, Chinese, Japanese and German Wikipedia
"In this paper we study the differences in historical worldview between Western and Eastern cultures, represented through the English, Chinese, Japanese, and German Wikipedia. In particular, we analyze the historical networks of the World's leaders since the beginning of written history, comparing them in the four different Wikipedias."
anthropology  data-science  computational-science 
april 2017 by tsuomela
NSI | Deeper Analyses. Clarifying Insights. Better Decisions.
"NSI is a professional services firm specializing in multidisciplinary data-driven analytics. Our niche is helping clients better and more reliably understand people and their behaviors on critical, complex decision-making problems. Our highly diverse and experienced team of researchers and analysts apply multidisciplinary social science techniques and various analytic methods to create deeper analyses and clarifying insights enabling our clients to make more informed and better decisions."
business  consulting  data-science  social-science 
april 2017 by tsuomela