NovelTM Datasets for English-Language Fiction, 1700-2009 | hc:26955 | Humanities CORE
This report describes a collection of 210,305 volumes of fiction that researchers are encouraged to borrow for their own work. Alternately, readers can simply browse the report as a description of English-language fiction in HathiTrust Digital Library. For instance, how does the proportion of fiction written by British authors or by women change across time? We also divide nineteenth- and twentieth-century fiction into seven subsets with different emphases (for instance, one where men and women are represented equally, and one composed of only the most prominent and widely-held books). Comparing the pictures produced by these different samples allows us to assess the fragility of recent quantitative arguments about literary history. Preprint version of an article to appear in the Journal of Cultural Analytics.
digital-humanities  open-data  natural-language-processing  rather-interesting  to-use  digitization 
14 hours ago by Vaguery
We are ready to hear all the questions you might have on and how to unleash the potentia…
digitization  audiovisual  from twitter_favs
11 days ago by verwinv
Scenes from Day 2 of & week: making a case for trying out 384 kHz for audio…
IASA2019  JTS  digitization  from twitter_favs
12 days ago by verwinv
Arcadia Fund | Protecting endangered culture and nature and promoting open access
Arcadia serves humanity by preserving endangered cultural heritage and ecosystems. We protect complexity and work against the entropy of ravaged and thereby starkly simplified natural environments and globalized cultures. Innovation and change occur best in already complex systems. Once memories, knowledge, skills, variety, and intricacy disappear – once the old complexities are lost – they are hard to replicate or replace. Arcadia aims to return to people both their memories and their natural surroundings.
archives  culture  digitization  funding  libraries  preservation 
19 days ago by kintopp
[1701.07396] LAREX - A semi-automatic open-source Tool for Layout Analysis and Region Extraction on Early Printed Books
A semi-automatic open-source tool for layout analysis on early printed books is presented. LAREX uses a rule-based connected components approach which is very fast, easily comprehensible for the user, and allows an intuitive manual correction if necessary. The PageXML format is used to support integration into existing OCR workflows. Evaluations showed that LAREX provides an efficient and flexible way to segment pages of early printed books.
OCR  digitization  machine-learning  algorithms  image-processing  image-segmentation  digital-humanities  to-understand  to-write-about  to-simulate 
5 weeks ago by Vaguery
