spirit / guess_language — Bitbucket
guess_language – Guess the natural language of a text
language  classification  nlp  detection  python  library 
6 days ago by raphman
Building a charter database 4: agent attributes and relationships | The Making of Charlemagne's Europe
Posted: Dec. 8, 2014, 9:43 a.m. Many charters include explicit information about agents (individuals, groups and institutions) that is of interest to record. via Pocket
classification  dev  models  people  prosopography 
7 days ago by kintopp
Linked Art
Linked Art is a Community working together to create a shared Model based on Linked Open Data to describe Art. We then implement that model in Software and use it to provide valuable content. It is under active development and we welcome additional partners and collaborators. via Pocket
art  classification  models  standards  open  data 
9 days ago by kintopp
Newsmap: A semi-supervised approach to geographical news classification: Digital Journalism: Vol 6, No 3
This paper presents the results of an evaluation of three different types of geographical news classification methods: (1) simple keyword matching, a popular method in media and communications research; (2) geographical information extraction systems equipped with named-entity recognition and place name disambiguation mechanisms (Open Calais and …); and (3) a semi-supervised machine learning classifier developed by the author (Newsmap). Newsmap substitutes manual coding of news stories with dictionary-based labelling in the creation of large training sets to extract large numbers of geographical words without human involvement and it also identifies multi-word names to reduce the ambiguity of the geographical traits fully automatically. The evaluation of classification accuracy of the three types of methods against 5000 human-coded news summaries reveals that Newsmap outperforms the geographical information extraction systems in overall accuracy, while the simple keyword matching suffers from ambiguity of place names in countries with ambiguous place names.
Research  automation  classification  mapping  opencalais 
11 days ago by paulbradshaw
Local administrative unit - Wikipedia
Generally, a local administrative unit (LAU) is a low level administrative division of a country, ranked below a province, region, or state. Not all countries describe their locally governed areas this way, but it can be descriptively applied anywhere to refer to counties, municipalities, etc. via Pocket
classification  gazetteer  places  standards 
12 days ago by kintopp
Nomenclature | Communities
"Robert G. Chenhall’s nomenclature for classifying man-made objects is the standard cataloging tool for thousands of museums and historical organizations across the United States and Canada. Nomenclature’s lexicon of object names, arranged hierarchically within functionally defined categories, has become a de facto standard within the community of history museums in North America."
museum  museology  classification  description  cataloging  metadata  standards 
17 days ago by tsuomela
[1802.01021] DeepType: Multilingual Entity Linking by Neural Type System Evolution
The wealth of structured (e.g. Wikidata) and unstructured data about the world available today presents an incredible opportunity for tomorrow's Artificial Intelligence. So far, integration of these two different modalities is a difficult process, involving many decisions concerning how best to represent the information so that it will be captured or useful, and hand-labeling large amounts of data. DeepType overcomes this challenge by explicitly integrating symbolic information into the reasoning process of a neural network with a type system. First we construct a type system, and second, we use it to constrain the outputs of a neural network to respect the symbolic structure. We achieve this by reformulating the design problem into a mixed integer problem: create a type system and subsequently train a neural network with it. In this reformulation discrete variables select which parent-child relations from an ontology are types within the type system, while continuous variables control a classifier fit to the type system. The original problem cannot be solved exactly, so we propose a 2-step algorithm: 1) heuristic search or stochastic optimization over discrete variables that define a type system informed by an Oracle and a Learnability heuristic, 2) gradient descent to fit classifier parameters. We apply DeepType to the problem of Entity Linking on three standard datasets (i.e. WikiDisamb30, CoNLL (YAGO), TAC KBP 2010) and find that it outperforms all existing solutions by a wide margin, including approaches that rely on a human-designed type system or recent deep learning-based entity embeddings, while explicitly using symbolic information lets it integrate new entities without retraining.
data-fusion  machine-learning  deep-learning  rather-interesting  inference  classification  to-write-about  consider:representation 
20 days ago by Vaguery
