tsuomela : big-data   235

[1911.02479v1] Algorithms and Statistical Models for Scientific Discovery in the Petabyte Era
"The field of astronomy has arrived at a turning point in terms of size and complexity of both datasets and scientific collaboration. Commensurately, algorithms and statistical models have begun to adapt --- e.g., via the onset of artificial intelligence --- which itself presents new challenges and opportunities for growth. This white paper aims to offer guidance and ideas for how we can evolve our technical and collaborative frameworks to promote efficient algorithmic development and take advantage of opportunities for scientific discovery in the petabyte era. We discuss challenges for discovery in large and complex data sets; challenges and requirements for the next stage of development of statistical methodologies and algorithmic tool sets; how we might change our paradigms of collaboration and education; and the ethical implications of scientists' contributions to widely applicable algorithms and computational modeling. We start with six distinct recommendations that are supported by the commentary following them. This white paper is related to a larger corpus of effort that has taken place within and around the Petabytes to Science Workshops"
astronomy  big-data  data-science 
9 weeks ago by tsuomela
[1904.04736] Cold Storage Data Archives: More Than Just a Bunch of Tapes
"The abundance of available sensor and derived data from large scientific experiments, such as earth observation programs, radio astronomy sky surveys, and high-energy physics already exceeds the storage hardware globally fabricated per year. To that end, cold storage data archives are the---often overlooked---spearheads of modern big data analytics in scientific, data-intensive application domains. While high-performance data analytics has received much attention from the research community, the growing number of problems in designing and deploying cold storage archives has only received very little attention. In this paper, we take the first step towards bridging this gap in knowledge by presenting an analysis of four real-world cold storage archives from three different application domains. In doing so, we highlight (i) workload characteristics that differentiate these archives from traditional, performance-sensitive data analytics, (ii) design trade-offs involved in building cold storage systems for these archives, and (iii) deployment trade-offs with respect to migration to the public cloud. Based on our analysis, we discuss several other important research challenges that need to be addressed by the data management community. "
archives  data-curation  big-data  science  computational-science 
april 2019 by tsuomela
SciServer – Collaborative data-driven science
"SciServer is a revolutionary new approach to doing science by bringing the analysis to the data. SciServer consists of data hosting services coupled with integrated Tools that work together to create a full-featured system."
data-science  big-data 
december 2018 by tsuomela
10 Big Data Trends You Should Know
"Very business centric, but may be interesting."
big-data  trends  business 
september 2018 by tsuomela
Humanity’s Halting Problem, Adam Riggio « Social Epistemology Review and Reply Collective
"Brett Frischmann and Evan Selinger have written Re-Engineering Humanity as a sustained and multifaceted critique of how contemporary trends in internet technology are slowly but surely shrinking the territory of human autonomy. Their work is a warning, as well as a description, of how internet technologies that ostensibly make our lives easier do so by taking control of our lives away from our self-conscious decision-making."
book  review  technology  technology-critique  big-data 
september 2018 by tsuomela
Habeas Data » Melville House Books
"Habeas Data shows how the explosive growth of surveillance technology has outpaced our understanding of the ethics, mores, and laws of privacy. Award-winning tech reporter Cyrus Farivar makes the case by taking ten historic court decisions that defined our privacy rights and matching them against the capabilities of modern technology. It’s an approach that combines the charge of a legal thriller with the shock of the daily headlines. Chapters include: the 1960s prosecution of a bookie that established the “reasonable expectation of privacy” in nonpublic places beyond your home (but how does that ruling apply now, when police can chart your every move and hear your every conversation within your own home — without even having to enter it?); the 1970s case where the police monitored a lewd caller — the decision of which is now the linchpin of the NSA’s controversial metadata tracking program revealed by Edward Snowden; and a 2010 low-level burglary trial that revealed police had tracked a defendant’s past 12,898 locations before arrest — an invasion of privacy grossly out of proportion to the alleged crime, which showed how authorities are all too willing to take advantage of the ludicrous gap between the slow pace of legal reform and the rapid transformation of technology."
book  publisher  surveillance  big-data  computer  culture 
september 2018 by tsuomela
Big data: are we making a big mistake?
Very good description of the problems that big data claims to solve, but may not actually solve.
big-data  statistics  science 
march 2018 by tsuomela
Responsible Data Forum — A series of collaborative events, convened to develop useful tools and strategies for dealing with the ethical, security and privacy challenges facing data-driven advocacy.
"The Responsible Data Forum is a collaborative effort to develop useful tools and strategies for dealing with the ethical, security and privacy challenges facing data-driven advocacy. RDF activities include organizing events; fostering discussion between communities; developing and testing concrete tools; disseminating useful information; and advocating for advocates and their supporters to improve the way they work with data. The Forum is a collaboration between Amnesty International, Aspiration, The Engine Room, Greenhost, HURIDOCS, Leiden University’s Peace Informatics Lab, Open Knowledge and Ushahidi."
big-data  privacy  surveillance  risk  humanitarian  genocide  human-rights  activism 
december 2016 by tsuomela
HRDAG
"The Human Rights Data Analysis Group is a non-profit, non-partisan organization that applies rigorous science to the analysis of human rights violations around the world. As scientists, we work to support our partners—the advocates and human rights defenders who “speak truth to power”—by producing unbiased, scientific results that bring clarity to human rights violence and by ensuring that the “truth” is the most accurate truth possible. "
human-rights  data  big-data  activism  non-profit 
november 2016 by tsuomela
Supporting Ethical Data Research: An Exploratory Study of Emerging Issues in Big Data and Technical Research || Data & Society
"In the era of big data, how do researchers ethically collect, analyze, and store data? danah boyd, Emily F. Keller, and Bonnie Tijerina explore this question and examine issues from how to achieve informed consent from research subjects in big data research to how to store data securely in case of breaches. The primer evolves into a discussion on how libraries can collaborate with computer scientists to examine ethical big data research issues."
big-data  research  ethics  irb 
october 2016 by tsuomela
Data Love | Books | Columbia University Press
"Intelligence services, government administrations, businesses, and a growing majority of the population are hooked on the idea that big data can reveal patterns and correlations in everyday life. Initiated by software engineers and carried out through algorithms, the mining of big data has sparked a silent revolution. But algorithmic analysis and data mining are not simply byproducts of media development or the logical consequences of computation. They are the radicalization of the Enlightenment's quest for knowledge and progress. Data Love argues that the "cold civil war" of big data is taking place not among citizens or between the citizen and government but within each of us. Roberto Simanowski elaborates on the changes data love has brought to the human condition while exploring the entanglements of those who—out of stinginess, convenience, ignorance, narcissism, or passion—contribute to the amassing of ever more data about their lives, leading to the statistical evaluation and individual profiling of their selves. Writing from a philosophical standpoint, Simanowski illustrates the social implications of technological development and retrieves the concepts, events, and cultural artifacts of past centuries to help decode the programming of our present."
book  publisher  data  big-data  technology-critique 
october 2016 by tsuomela
Companies Not Saving Your Data - Schneier on Security
"I believe that all this data isn't nearly as valuable as the big-data people are promising. Now that companies are recognizing that it is also a liability, I think we're going to see more rational trade-offs about what to keep -- and for how long -- and what to discard."
big-data  storage  business  silicon-valley  business-model 
june 2016 by tsuomela
The Ethics of Algorithms
"The Ethics of Algorithms is a combination research and education project. It aims to investigate the ethics and values of the computer scientists, information scientists, and software engineers who create algorithms. This research and education project aims to both bridge silos between philosophical and social scientific approaches to ethics to develop an integrated theoretical approach to ethics. Such a theoretical approach simultaneously identifies the analytical, moral reasoning that is happening during the conceptualization and design phase as well as critically analyzes the interplay between an individual's personal ethics and values and the ethics and values created by aspects of policies, institutional, economic, and cultural contexts. The proposed research furthers the literature on information ethics by taking an upstream approach that focuses on the design process. Finally, by focusing on algorithms, the proposed research will contribute to broader discussions about ethics, values, and big data. Algorithms are the driving technique behind the creation of big data sets yet there is little talk about the decisions and values that shape algorithm design and thus impact big data content. "
ethics  algorithms  big-data  philosophy  computer-science  technology  sts 
may 2016 by tsuomela
CLTC Scenarios – CLTC
"How might individuals function in a world where literally everything they do online will likely be hacked or stolen? How could the proliferation of networked appliances, vehicles, and devices transform what it means to have a “secure” society? What would be the consequences of almost unimaginably powerful algorithms that predict individual human behavior at the most granular scale? These are among the questions considered through a set of five scenarios developed by the Center for Long-Term Cybersecurity (CLTC), a new research and collaboration center founded at UC Berkeley’s School of Information with support from the Hewlett Foundation. These scenarios are not predictions—it’s impossible to make precise predictions about such a complex set of issues. Rather, the scenarios paint a landscape of future possibilities, exploring how emerging and unknown forces could intersect to reshape the relationship between humans and technology—and what it means to be “secure.”"
big-data  ethics  data  research  scenario  scenario-planning  futures  data-curation  online 
may 2016 by tsuomela
Reflection Stories — Responsible Data Forum
"Through the various Responsible Data Forum events over the past couple of years, we’ve heard many anecdotes of responsible data challenges faced by people or organizations. These include potentially harmful data management practices, situations where people have experienced gut feelings that there is potential for harm, or workarounds that people have created to avoid those situations. But we feel that trading in these “war stories” isn’t the most useful way for us to learn from these experiences as a community. Instead, we have worked with our communities to build a set of Reflection Stories: a structured knowledge base on the unforeseen challenges and (sometimes) negative consequences of using technology and data for social change."
data  ethics  big-data  reflection  story 
march 2016 by tsuomela