Analyzing Documents with TF-IDF | Programming Historian
"This lesson focuses on a foundational natural language processing and information retrieval method called Term Frequency - Inverse Document Frequency (tf-idf). This lesson explores the foundations of tf-idf, and will also introduce you to some of the questions and concepts of computationally oriented text analysis."
text-mining  python  digital-humanities  text-analysis  tutorial 
29 days ago by tsuomela
Distant Horizons: Digital Evidence and Literary Change, Underwood
"Just as a traveler crossing a continent won’t sense the curvature of the earth, one lifetime of reading can’t grasp the largest patterns organizing literary history. This is the guiding premise behind Distant Horizons, which uses the scope of data newly available to us through digital libraries to tackle previously elusive questions about literature. Ted Underwood shows how digital archives and statistical tools, rather than reducing words to numbers (as is often feared), can deepen our understanding of issues that have always been central to humanistic inquiry. Without denying the usefulness of time-honored approaches like close reading, narratology, or genre studies, Underwood argues that we also need to read the larger arcs of literary change that have remained hidden from us by their sheer scale. Using both close and distant reading to trace the differentiation of genres, transformation of gender roles, and surprising persistence of aesthetic judgment, Underwood shows how digital methods can bring into focus the larger landscape of literary history and add to the beauty and complexity we value in literature."
book  publisher  digital-humanities  text-mining  evidence 
april 2019 by tsuomela
DH Infrastructure Symposium - HumTech - UCLA
"We enthusiastically invite you to join us at UCLA on November 15, 2018 for the third annual Digital Humanities Infrastructure Symposium. Our focus this year is on some real-world built and building platforms and methods to support DH researchers. We welcome anyone interested in learning from what has been done in practical, infrastructure-building terms — especially technologists, library staff, and those involved or getting started in building DH capacity."
conference  digital-humanities  infrastructure 
december 2018 by tsuomela
Doing Digital Scholarship
"Doing Digital Scholarship offers a self-guided introduction to digital scholarship, designed for digital novices. It allows you to dip a toe into a very large field of practice. It starts with the basics, such as securing web server space, preserving data, and improving your search techniques. It then moves forward to explore different methods used for analyzing data, designing digitally inflected teaching assignments, and creating the building blocks required for publishing digital work."
digital-scholarship  digital-humanities  guide  tutorials 
august 2018 by tsuomela
Too Much Information and the KWIC | SpringerLink
"This paper takes a media archaeology look at the development of the Keyword-in-Context (KWIC) display by Peter Luhn and how the KWIC helped automate ways of disseminating information about information. The paper takes the development of the KWIC as an example of the development of a knowledge technology that frames knowledge in a certain way. The KWIC and other information technologies transform knowledge into information that can be quantified and processed. Developments like the KWIC are the beginning of language engineering—a new way of conceiving of text as information to be manipulated. Finally, the paper proposes a way of reflecting on developments like the KWIC by replicating these early technologies. Replications can take the form of demonstration devices or knowledge things that expose the processes in our infrastructure."
digital-humanities  keywords  methods  history 
may 2018 by tsuomela
Topic Modeling in Python with NLTK and Gensim | DataScience+
"In this post, we will learn how to identify which topic is discussed in a document, called topic modeling. In particular, we will cover Latent Dirichlet Allocation (LDA): a widely used topic modeling technique. And we will apply LDA to convert a set of research papers to a set of topics."
python  statistics  topic-modeling  digital-humanities  methods 
april 2018 by tsuomela
Visualizing Cultural Collections
"At the University of Applied Sciences Potsdam, »Visualizing Cultural Collections« is a cross-disciplinary research theme that started with the research project VIKUS (Visualisierung kultureller Sammlungen) in 2014-2017. The aim of this research has been to study new forms of graphical user interfaces to support the exploration of digital cultural heritage. Researchers and students from various fields such as interface design, informatics, media studies and cultural management have been conceiving, prototyping and evaluating novel visualization techniques that are aimed at enabling interactive examination of cultural objects."
visualization  demonstration  digital-humanities 
march 2018 by tsuomela
Python Programming for the Humanities by Folgert Karsdorp
"The programming language Python is widely used within many scientific domains nowadays and the language is readily accessible to scholars from the Humanities. Python is an excellent choice for dealing with (linguistic as well as literary) textual data, which is so typical of the Humanities. In this book you will be thoroughly introduced to the language and be taught to program basic algorithmic procedures. The book expects no prior experience with programming, although we hope to provide some interesting insights and skills for more advanced programmers as well. The book consists of 10 chapters. Chapter 5 and Chapter 6 are still in draft status and not ready for use."
python  programming  tutorial  digital-humanities 
january 2018 by tsuomela
Home - Matthew Lincoln, PhD
"I am a data research specialist at the Getty Research Institute, where I use computer-aided analysis of cultural datasets to help model long-term trends in iconography, art markets, and the social relations between artists."
weblog-individual  history  art  network-analysis  people  digital-humanities 
november 2017 by tsuomela
Digital History & Argument White Paper – Roy Rosenzweig Center for History and New Media
"This white paper is the product of the Arguing with Digital History Workshop organized by Stephen Robertson and Lincoln Mullen of George Mason University, with funding from the Andrew W. Mellon Foundation. The two-day workshop, which involved twenty-four invited participants at different stages in their careers, working in a variety of fields with a range of digital methods, was conceived with a focus on one particular form of digital history, arguments directed at scholarly audiences and disciplinary conversations. Despite recurrent calls for digital history in this form from digital and analog historians, few examples exist. The original aim of the workshop was to promote digital history that directly engaged with historiographical arguments by producing a white paper that addressed the conceptual and structural issues involved in such scholarship. Input from the participants expanded the scope of the white paper to also elaborate the arguments made by other forms of digital history and address the obstacles to professional recognition of those interpretations. The result was a document that aims to help bridge the argumentative practices of digital history and the broader historical profession. On the one hand, it aims to demonstrate to the wider historical discipline how digital history is already making arguments in different forms than analog scholarship. On the other hand, it aims to help digital historians weave the scholarship they produce into historiographical conversations in the discipline."
digital-humanities  digital  history  methodology 
november 2017 by tsuomela
"(Washington, D.C. and Georgetown, Texas) February 13, 2012—The Council on Library and Information Resources (CLIR) and the National Institute for Technology in Liberal Education (NITLE) announce the formation of Anvil Academic, a digital publisher for the humanities. Anvil will focus on publishing new forms of scholarship that cannot be adequately conveyed in the traditional monograph."
publishing  scholarly-communication  digital-humanities  academic 
july 2017 by tsuomela
Ways to Compute Topics over Time, Part 1 · from data to scholarship
"This is the first in a series of posts which constitute a “lit review” of sorts to document the range of methods scholars are using to compute the distribution of topics over time."
digital-humanities  topic-modeling  methods  tutorial  temporal 
june 2017 by tsuomela
