
juliusbeezer : text   58

Authentic or graded? Is there a middle way? | elt-resourceful
It’s also important that students are exposed to different genres of texts, and, especially for the teacher creating materials for their own class, authentic texts provide a relatively easy way to bring something up to date and topical into the classroom. They can provide us with the opportunity to look at the same topic reported in different ways, or give students a starting point from which to follow the news topic as it unfolds, in their own time.

However, in recent years I have been moving away from using unadapted authentic texts. The most obvious problem is the level of the language. When I was first trained, we were taught, ‘grade the task not the text’, but, while this is usually possible, I’m no longer sure that it’s always in the students’ best interests.

Taking this kind of approach is intended to help students develop strategies to deal with texts where a lot of the language is unknown. There is certainly a value in this, but is it as valuable as giving them a text from which they can get so much more? Hu and Nation (2000) concluded that most learners needed to comprehend 98% of words in a text in order to gain ‘adequate comprehension’.
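[A rough way to check a text against the 98% figure quoted above is to measure what share of its running words a learner already knows. The sketch below is only an illustration: the function names, the crude tokenizer and the sample word list are all assumptions, not anything taken from Hu and Nation or from the blog post.]

```python
import re

# Threshold quoted in the excerpt above (Hu and Nation, 2000).
ADEQUATE_COVERAGE = 0.98


def lexical_coverage(text: str, known_words: set[str]) -> float:
    """Return the fraction of running words in `text` found in `known_words`.

    Tokenization is deliberately crude (lowercased runs of letters and
    apostrophes); a real study would lemmatize and handle proper nouns.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    known = sum(1 for token in tokens if token in known_words)
    return known / len(tokens)


def is_adequate(text: str, known_words: set[str]) -> bool:
    """True if coverage reaches the 98% threshold."""
    return lexical_coverage(text, known_words) >= ADEQUATE_COVERAGE


if __name__ == "__main__":
    # Hypothetical learner vocabulary, for illustration only.
    vocabulary = {"the", "cat", "sat", "on"}
    sample = "The cat sat on the mat"
    print(f"coverage: {lexical_coverage(sample, vocabulary):.2%}")
```

[At 98% coverage a reader meets roughly one unknown word in every fifty, which is why unadapted authentic texts so often fall short for lower-level learners.]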
language  learning  writing  teaching  text 
december 2017 by juliusbeezer
Unicode strikethrough text tool for Twitter, Facebook, internationalized domain names, etc.
Create s̶t̶r̶i̶k̶e̶t̶h̶r̶o̶u̶g̶h̶ text on Twitter, Facebook, internationalized domain names, etc.

This little tool generates s̶t̶r̶i̶k̶e̶t̶h̶r̶o̶u̶g̶h̶ text using unicode characters. While the text it generates may look similar to text generated using the <strike> HTML tag or text-decoration:line-through CSS attribute, it isn't*. You can use this script to generate strikethrough text to paste in to Twitter or Facebook, or to register internationalized domain names containing strikethrough characters (ie. s̶t̶r̶i̶k̶e̶.ws).
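[The effect described above can be sketched in a few lines. This assumes the tool works by following each character with U+0336 COMBINING LONG STROKE OVERLAY; the tool's own source is not quoted here, so treat both the code point and the function names as assumptions.]

```python
# Assumption: the combining mark is U+0336 COMBINING LONG STROKE OVERLAY.
COMBINING_LONG_STROKE = "\u0336"


def strikethrough(text: str) -> str:
    """Return `text` with a combining stroke appended after every character."""
    return "".join(ch + COMBINING_LONG_STROKE for ch in text)


def unstrike(text: str) -> str:
    """Remove the combining strokes again, recovering the plain text."""
    return text.replace(COMBINING_LONG_STROKE, "")


if __name__ == "__main__":
    struck = strikethrough("strikethrough")
    print(struck)
    # The result is plain text twice as long as the input, not markup.
    print(len("strikethrough"), len(struck))
```

[Because the stroke is part of the character sequence rather than styling, it survives copy and paste into plain-text fields, which is the difference from the HTML and CSS approaches mentioned above.]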
text  text_tools  twitter 
july 2017 by juliusbeezer
Archive Publications - Editing and Proofreading Services
For over 30 years, Paul Beverley has used his programming ability to complement his writing and editing skills. Latterly he has decided to make his Word macros freely available to other writers and editors and has put them in a free book, Computer Tools for Editors, which you can download from this website. The most powerful macro, FRedit, is available separately (but still free).

For greater benefit from the (over 450) macros, we offer training for large or small groups, from the level of ‘What is a macro?!’, right up to macro power-users.
tools  text  editing 
september 2016 by juliusbeezer
Type Slowly: Word Processing and Literary Composition - The Los Angeles Review of Books
The phrase “word processing” wasn’t coined with literary elegance in mind. It originally applied to a range of technologies and practices, including typewriters, that allowed for delayed inscription, a shift from oral to recorded dictation, copiers, and a restructuring of office domains. As Kirschenbaum explains, research and consulting outfits such as the Word Processing Institute and the American Management Association, as well as manufacturers such as IBM, promoted the adaptation of new products and operations to stanch the proliferation of paperwork and, more pressingly, to cure a “social disease” stemming from secretarial mobility and versatility. Secretaries made up a large percentage of office personnel, and their duties were a hodgepodge: taking letters, typing, filing, making coffee, scheduling meetings, answering phones, booking flights, stocking supplies, buying gifts, and so on. Accordingly, they were hardly bound to their desks, which raised inefficiency flags for management consultants.
writing  text_tools  text 
april 2016 by juliusbeezer
Suleiman Mourad: Riddles of the Book. New Left Review 86, March-April 2014.
[punted to this by a search for Perry Anderson/Suleiman Mourad that pulled up as its top hit, this article linked at its foot. Seems to be of solid stuff...]

When the Great Mosque in Sana‘a was being renovated in the early 1970s, a secret attic was discovered above a false ceiling, containing a mass of old manuscripts. The Middle Eastern tradition (which applies to Christians and Jews as well) is that if a manuscript has the name of God or the name of the Prophet on it, you can’t simply destroy it. The best thing you can do is put it away, or bury it, as with the Dead Sea Scrolls or the Nag Hamadeh texts. You do so not to hide them for hiding’s sake, but to keep them from getting corrupted and thus insulting God. That was the case in San‘a. A German scholar was allowed to study the finds, but she has published very little on them for fear of the political consequences of doing so; it seems the Yemeni government threatened Germany with repercussions if anything embarrassing appeared. But from a few of what are believed to be very early parchments in the cache, using Kufic script, we know that they date to the late seventh or early eighth century, and we can already see one significant difference with the canonical version of the Qur’an. The traditional story tells us there were no serious variations between the different versions assembled by Caliph ‘Uthman around the year 650, though we know that down to the eighth century more popular versions of the Qur’an, without major discrepancies from the canonical text, were retained in certain regions—Iraq or Syria—out of local pride. The Yemeni manuscript, however, contains a very serious divergence. In the canonical Qur’an, there is a verse with the imperative form ‘say’ [qul]—God instructing Muhammad—whereas in the San‘a text, the same verse reads ‘he said’ [qala]. That suggests some early Muslims may have perceived the Qur’an as the word of the Prophet, and it was only some time later that his reported speech became a divine command. There is also some serious variation with respect to the size of some chapters.
religion  text  history  hermeneutics 
april 2016 by juliusbeezer
Sublime Text: The text editor you'll fall in love with
Sublime Text is a sophisticated text editor for code, markup and prose.
You'll love the slick user interface, extraordinary features and amazing performance.
text_tools  programming  software  text  linux 
march 2016 by juliusbeezer
Daring Fireball: Markdown
Thus, “Markdown” is two things: (1) a plain text formatting syntax; and (2) a software tool, written in Perl, that converts the plain text formatting to HTML. See the Syntax page for details pertaining to Markdown’s formatting syntax. You can try it out, right now, using the online Dingus.

The overriding design goal for Markdown’s formatting syntax is to make it as readable as possible. The idea is that a Markdown-formatted document should be publishable as-is, as plain text, without looking like it’s been marked up with tags or formatting instructions. While Markdown’s syntax has been influenced by several existing text-to-HTML filters, the single biggest source of inspiration for Markdown’s syntax is the format of plain text email.
text_tools  text  tools 
march 2016 by juliusbeezer
OCR - Community Help Wiki
The Ubuntu Universe repositories contain the following OCR tools:
tools  ubuntu  text  text_tools 
january 2016 by juliusbeezer
Jeremiah's Vanishing New York: Goldsmith's Capital
I actually stopped being a boring writer almost a decade ago when I got bored of being boring. I’m known for a book called Day, which was a transcription of The New York Times of September 1, 2000 into a 900-page book. That was boring. When the book was reviewed, most people mistakenly thought I had transcribed September 11, 2001. I thought that was a great idea and went ahead and transcribed the 9-11 New York Times—the one that everyone carried to work that day, not the 9-12 newspaper, when you saw the planes crashing into the towers. And as I was doing the transcription, I found my keyboard soaked in tears.
news  goldsmith  poetry  text 
november 2015 by juliusbeezer
What Does An Editor Do? | Redwoods Writes...
This is an all-too-common misconception about editorial work. People inside and outside the industry think that we just correct spellings, grammar and punctuation, and that’s it. Job done. But these are things that are done at the very END of a book’s editorial life, and they are often outsourced to freelance copy-editors and proofreaders AFTER the big, ‘structural’ edit has been done by the commissioning editor...
they know it could be EVEN BETTER if the narrative was re-shaped in certain ways – perhaps a character needs to be drawn out or cut completely, maybe more or less dialogue is needed, perhaps the author could magnify a particular event to make it more dramatic or a non-fiction text needs more factual information to make sense of the point it’s trying to make.

It’s our job to let authors know how we think the text they’ve supplied can be improved, and to deliver that information in the way that allows THEM to make the changes successfully. It’s a collaboration, and if the author disagrees with a note, then it’s their right to resist that change, but maybe suggest another one. I’ve watched TV producers give ‘notes’ to actors and crew on set and realised how similar it is to the editorial process.
editing  text  literature 
october 2015 by juliusbeezer
Planet PDF - Adobe Reader is the de facto Standard, not PDF
A pattern was established in which poorly-structured PDF files were roaming around in the wild, and that problem has worsened over time. As PDF has grown more popular, more and more applications of widely varying quality make bad PDF.

Adobe's solution was to engineer Adobe Reader to handle all the various oddball PDF files out there. It's one of the main reasons why Adobe Reader is a larger application to download and install compared to its rivals. Reader includes lots of code to deal with the thousands of different types of exceptions to "good" PDF that Reader users worldwide can and will encounter on a regular basis.

In their attempt to ensure that even the sloppiest PDF files still worked, Adobe created a situation in which developers could (and have) used Adobe's Reader as the reference implementation for their PDF software.

In 2010, there is still no alternative to Adobe Reader when it comes to validating third-party software.

As the vice-chair of ISO 32000, that bothers me
text_tools  text  copyright  monopoly 
september 2015 by juliusbeezer
Is DOCX really an open standard? | Abhishek Bhatnagar
Microsoft got the proposal fast-tracked in ISO even though reportedly 20 out of the 30 countries involved were not interested in passing it. This however didn’t stop the ISO secretariat Lisa Rachjel from pushing it through anyway after deciding “to move Open XML forward after consulting with staff at the International Technology Task Force”.

So ISO had a new incoming standard, but specific clauses of it still met resistance. To solve this problem, it was proposed that OOXML be split into two sub-standards, namely ISO 29500 Transitional, and ISO 29500 Strict. The Strict version was that which was accepted by ISO, and the Transitional version was fairly granted to Microsoft to allow them to slowly curb out older features from the closed source days. Nothing wrong with this, it’s only fair to their users.

However, the problem arose when Microsoft decided not to fully implement the Strict version of the standard in Office 2010.
tools  LibreOffice  monopoly  text  text_tools  ms_word_critique 
september 2015 by juliusbeezer
How to Install Microsoft Office on Ubuntu Linux | TechSource
"allow me to teach you how to install Microsoft Word on Ubuntu. As some of you may know, I still use MS Word in favor of Writer. So if you are like me or if you have other reasons not to ditch Microsoft Office completely, perhaps you should follow this guide of installing MS Office on Ubuntu or on just about any other Linux distribution."
linux  tools  text  text_tools 
june 2015 by juliusbeezer
I Can Text You A Pile of Poo, But I Can’t Write My Name by Aditya Mukerjee | Model View Culture
My family’s native language, which I grew up speaking, is far from a niche language. Bengali is the seventh most common native language in the world...
The very first version of the Unicode standard did include Bengali. However, it left out a number of important characters. Until 2005, Unicode did not have one of the characters in the Bengali word for “suddenly”. Instead, people who wanted to write this everyday word had to combine three separate, unrelated characters...
Even today, I am forced to do this when writing my own name. My name is not only a common Indian name, but one of the top 1,000 names in the United States as well. But the final letter has still not been given its own Unicode character, so I have to use a substitute.
language  text  text_tools 
march 2015 by juliusbeezer
Authorea | A cheat sheet for web-friendly LaTeX
In this cheat sheet, we discuss some of the basics for writing documents in LaTeX. In particular, we will focus on web documents and introduce a subset of LaTeX which safely works on the web. Why? LaTeX is primarily intended for the printed page, not the web. But more and more scientists are writing content on the web, and they need to use mathematical notation. This cheat sheet presents LaTeX notation which converts easily to HTML and works well on platforms like Authorea.
text  tools  text_tools  web  internet  sciencepublishing 
february 2015 by juliusbeezer
Announcing the Interest Graph API
Traditionally, these articles are manually tagged with interests resulting in a non-canonical, open class of tags, which makes searching for them hard. Having the interest graph automatically tag posts helps maintain a standard that keeps things organized and lends itself well to searching.

Most articles lie at the intersection of multiple interests; for instance, the article above is about both Photography and Vintage. By navigating to these related interests, a reader can stumble upon new interests and discover interesting articles in a unique and serendipitous way.

Find Related Content

You can encourage your readers to explore by showing related articles (e.g. in the sidebar); interest tags make automatically selecting related articles easy. There are various similarity metrics that you can use to find related docs to a given doc.
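[One of the simplest similarity metrics for tagged documents is Jaccard similarity over their tag sets. The sketch below is an assumption about how such a lookup could work; it does not use the Interest Graph API, and the document ids and tags are invented for illustration.]

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def related_docs(doc_id: str, tags_by_doc: dict[str, set[str]], top_n: int = 3) -> list[str]:
    """Return up to `top_n` other documents, most similar tag set first.

    Documents sharing no tag at all with the target are left out.
    """
    target = tags_by_doc[doc_id]
    scored = [
        (jaccard(target, tags), other)
        for other, tags in tags_by_doc.items()
        if other != doc_id
    ]
    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for score, other in scored[:top_n] if score > 0]


if __name__ == "__main__":
    # Hypothetical articles and interest tags.
    articles = {
        "old-cameras": {"photography", "vintage"},
        "travel-shots": {"photography", "travel"},
        "film-bodies": {"photography", "vintage", "cameras"},
        "sourdough": {"cooking"},
    }
    print(related_docs("old-cameras", articles))
```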
tagging  folksonomy  tools  text  text_tools  textmining  web  webdesign 
february 2015 by juliusbeezer
Contact - DocumentCloud
DocumentCloud accounts are all newsroom based. DocumentCloud is available to anyone who reports on primary source documents. For the most part our users are journalists, but if you are doing document based investigative reporting we'd love to have you join us, even if you aren't a newsroom-based journalist in the conventional sense. If you're not in a traditional newsroom, please do show us some of your reporting and tell us a little bit about the kind of documents you're working with.
text_tools  text  textmining  journalism 
january 2015 by juliusbeezer
When Small Dropbear's LaTeX to html tools page goes down
I’ve been using LaTeX for many years, I should say quickly for the freaks out there that it doesn’t mean I’m into vinyl or other strangeness. LaTeX is a document processing system that creates g...
text_tools  text 
january 2015 by juliusbeezer
PLOS ONE: An Efficiency Comparison of Document Preparation Systems Used in Academic Research and Development
Word users were more productive and made fewer errors than LaTeX users, say researchers (study n=40). I note the continuous text was right justified...
text_tools  science  editing  text  ms_word_critique 
january 2015 by juliusbeezer
Pandoc - About pandoc
If you need to convert files from one markup format into another, pandoc is your swiss-army knife. Pandoc can convert documents in markdown, reStructuredText, textile, HTML, DocBook, LaTeX, MediaWiki markup, TWiki markup, OPML, Emacs Org-Mode, Txt2Tags, Microsoft Word docx, EPUB, or Haddock markup to

HTML formats: XHTML, HTML5, and HTML slide shows using Slidy, reveal.js, Slideous, S5, or DZSlides.
Word processor formats: Microsoft Word docx, OpenOffice/LibreOffice ODT, OpenDocument XML
Ebooks: EPUB version 2 or 3, FictionBook2
Documentation formats: DocBook, GNU TexInfo, Groff man pages, Haddock markup
Page layout formats: InDesign ICML
Outline formats: OPML
TeX formats: LaTeX, ConTeXt, LaTeX Beamer slides
PDF via LaTeX
Lightweight markup formats: Markdown, reStructuredText, AsciiDoc, MediaWiki markup, DokuWiki markup, Emacs Org-Mode, Textile
Custom formats: custom writers can be written in lua.
text  text_tools  ebooks 
december 2014 by juliusbeezer
Post-publication review of the PLOS ONE paper comparing MS Word and LaTeX: How not to compare document preparation — The Serial Mentor
I have been using LaTeX for over 20 years, I have written tens of thousands of pages with LaTeX, and I am extremely familiar with its pros and cons. I am also using MS Word on a regular basis, and I regularly make the choice between using LaTeX or MS Word, on a document-by-document case. There are documents I’d rather write in LaTeX, and there are other documents I’d rather write in MS Word. There are also documents (increasingly many, in fact) for which I prefer entirely different approaches, such as Markdown.
text  text_tools  ms_word_critique 
december 2014 by juliusbeezer
Ragged-right or justified alignment? | Kai's Tech Writing Blog
How do you argue for the preference of ragged-right over justified alignment in print? Searching the web, I soon came across pages which mentioned research, but it was harder to actually find it.
text  tools  typography 
december 2014 by juliusbeezer
Justified Text Versus Ragged-Right Text « Adams on Contract Drafting
The reason that text with justified margins looks bad in a single-column Word document is that subtle word-spacing and letter-spacing algorithms are needed to make justified text look “good,” and Word’s aren’t up to the job. So it’s not really the column width that’s the problem, but rather limitations in the software. Many beautiful books are set in single-column justified pages, but they have been properly typeset. Word documents simply should not be justified.
typography  text 
december 2014 by juliusbeezer
Justified vs. Rag Right
If the text is for the web – or any medium that does not allow for complete control over size, line breaks, and hyphenation – it is best to avoid justification entirely.
typography  text 
december 2014 by juliusbeezer
Textproof Chronicle: Readability: Justified vs. Ragged
The key point, then, is that ragged right is consistent with using the same width space between words, whilst justified text must allow the interword space to vary.

In an influential article, Williams (2000, p394) recommends the use of [s]et type intended for extended reading flush left, and ragged right because (p390) [n]on-uniform spacing between words decreases reading speed by as much as 11 percent (Trollip and Sales 1986).
design  text  typography 
december 2014 by juliusbeezer
LaTeX Something Something Darkside | Peter Krautzberger
(when was the last time you went to a library to look at the printed copy of a current journal issue?). What is not obsolete is PDF and TeX is, of course, very good when it comes to generating PDF.

However, this “Portable Document Format” is really quite useless in the one place where people consume more and more information: the web. (I admit I’m of the conviction that the web won’t go away; crazy talk, I know.) And for the web, TeX/LaTeX is the wrong tool.

Turn this around and you’ll realize that the community as a whole has a serious problem: almost nobody writes TeX/LaTeX that way which means almost all TeX/LaTeX will never convert to web formats well. To put it differently, there’s a reason for a large market of blackbox vendors that specialize in TeX to XML/HTML conversion for professional publishers (and this often involves re-keying).

This is, of course, in no way a fault of TeX/LaTeX itself which was designed for print, in 1978.
text  text_tools  tools  dccomment 
november 2014 by juliusbeezer
Authorea | LaTeX was not built for the web
Authorea understands and renders markup languages such as Markdown, and LaTeX. But it does not rely on a compiler which takes TeX and spits out PDF. All the content created on Authorea is web-native. As we create more and more content on the web, we think that scholarly articles, too, should live on the web.

That said, we do enjoy and use LaTeX frequently at Authorea.
tools  text 
november 2014 by juliusbeezer
Scientific writing: the online cooperative : Nature News & Comment
For Authorea, that concept is based on the software-management system Git, used by programmers to keep track of changes on collaborative code-writing projects, and by data scientists to record their analysis workflow. Other tools take different approaches: Google Docs and Fidus Writer allow all users access to the entire file simultaneously, and track changes more or less like Microsoft Word, but Fidus Writer, for example, does not record the detailed history of every single edit
tools  text 
november 2014 by juliusbeezer
le tiers livre, web & littérature : Kenneth Goldsmith | « Recopiez-moi cinq pages »
“With the web, writing has met its photography,” says Goldsmith at the head of his Uncreative Writing.

Kenneth Goldsmith is probably less well known for his contribution to creative writing, but personally I put him in the front rank for that contribution, and for the good his iconoclastic approach does us all.

Iconoclastic? No idle provocation. For several years I have been following very closely what is being built and published in US creative writing, and collecting the books, even if 80% of them turn out to be absolutely disappointing
goldsmith  text  français 
september 2014 by juliusbeezer
Meet Xiki, the Revolutionary Command Shell for Linux and Mac OS X |
Xiki merges shell and GUI concepts. It runs in a text editor, so everything is editable and you can save your Xiki sessions in text files. You can use a mouse in Xiki, insert a command prompt anywhere you want, incrementally filter searches, expand and filter directory contents, open and edit files in place, enter text notes wherever you want, edit, re-order, and re-use command history, and you can do all of this in a natural progressive flow. You can create new commands as you go, browse and replay commands that were run from specific directories, and have menus of favorite commands. You can send Tweets and emails directly from Xiki.
tools  text  linux  unix 
june 2014 by juliusbeezer
About the Project | Argo
The objective of the project is to develop a workbench for analysing (primarily annotating) textual data that meets the following requirements:

ease of combining elementary text-processing components to form meaningful and comprehensive processing workflows
ability to manually intervene in the otherwise automatic process of annotation by correcting or creating new annotations
easy access by providing a web-based interface
user collaboration by providing sharing capabilities for user-owned resources

Argo is being developed by The National Centre for Text Mining at The University of Manchester.
text_tools  textmining  text 
may 2014 by juliusbeezer
PLOS ONE: Revisiting the Estimation of Dinosaur Growth Rates
Blimey, PLOS ONE's graphical rendering of mathematical expressions is ugly (and informationally subtracting). There are ancient historical reasons for this: basically (La)TeX is a pain in the ass to render into SGML or HTML because its locally extensible macros make conversions unpredictable.
text  text_tools  mathematics  web  webdesign 
december 2013 by juliusbeezer
Small Dropbear | Latex to HTML Converters
about the various types of LaTeX to HTML converters out there. It is not an exhaustive list but should help other people looking around for converters.

[see pinboard note added 07/01/15]
text_tools  text  web  webdesign 
december 2013 by juliusbeezer
About writeLaTeX
WriteLaTeX is a free service that lets you create, edit and share your scientific ideas easily online using LaTeX, a comprehensive and powerful tool for scientific writing.
tools  text  text_tools  sciencepublishing 
november 2013 by juliusbeezer
Arcades Awakening | a hypertextual extension of Walter Benjamin's Arcades Project
When I first read Walter Benjamin’s Arcades Project I was struck (as most people probably are) by its beauty and depth, but also its enormity, its density and its difficulty. I wanted so badly to get to the meat of his ideas, but the linearity of reading it on paper made wading through his ideas difficult, when ultimately they were modular — more like a constellation than a simple chain.

And so, through this project I intended to overlay the convolutes with hypertext to sate my urge to read it as a non-linear constellation. I wanted to wander through the work like a flâneur, drifting slowly through on a whim
text  internet  web  theory  literature 
november 2013 by juliusbeezer
Word Processors: Stupid and Inefficient
Preparing printable text using a word processor effectively forces you to conflate two tasks that are conceptually distinct and that, to ensure that people's time is used most effectively and that the final communication is most effective, ought also to be kept practically distinct. The two tasks are: the composition of the text itself... [and the] typesetting of the document... The author of a text should, at least in the first instance, concentrate entirely on the first of these sets of tasks. That is the author's business. Typesetting is the typesetter's business. This division of labour was of course fulfilled in the traditional production of books and articles in the pre-computer age.
text  text_tools  typography  writing 
october 2013 by juliusbeezer
Open Access Toolset Alliance
Some sort of umbrella org for OA OS tools via @irynakuchma
openaccess  opensource  publishing  software  tools  text 
september 2013 by juliusbeezer
My philosophy: Alan Sokal | The Philosophers Magazine
Although that was it with formal philosophy until the Social Text affair, Sokal did have a “philosophically-oriented approach to physics,” which contrasted with the “very pragmatic anti-philosophical point of view” of many of his colleagues, of which “the extreme version is ‘shut up and calculate’: physics is about predicting, experiment and that’s all. I was always opposed to that point of view. It seems to me that physics is about trying to understand the world, and experiments are tools for checking whether your theories about the world are possibly right but they’re not an end in themselves.”
text  postmodernism  science  sciencepublishing  philosophy 
may 2013 by juliusbeezer
Wordcounts are amazing. | The Stone and the Shell
We need to remember that words are actually features of a very, very high-level kind. As a thought experiment, I find it useful to compare text mining to image processing. Take the picture on the right. It’s pretty hard to teach a computer to recognize that this is a picture that contains a face. To recognize that it contains “sitting” and a “baby” would be extraordinarily impressive. And it’s probably, at present, impossible to figure out that it contains a “blanket.”
language  text  textmining  text_tools  semantic  statistics 
february 2013 by juliusbeezer
Uncreative Writing: Redefining Language and Authorship in the Digital Age | Brain Pickings
Goldsmith echoes legendary designer Charles Eames, who famously advised to “innovate only as a last resort”
writing  creativity  literature  poetry  internet  text 
february 2013 by juliusbeezer
Kevin's Word List Page
Collection of dictionary tools for coders
english  text  text_tools  dictionary 
january 2013 by juliusbeezer
A bit more about the Up-Goer Five Text Editor · Splasho
I hacked together an editor which would complain if you used any word not on a particular list. Now, what list to use? ... it depends ... Newspapers ≠ scientific papers ≠ novels... XKCD used the contemporary fiction frequency list available on Wiktionary. I used the Automatically Generated Inflection Database to make sure I had every derivative of these 1000 words – leading to some odd words like ‘themselveses’ being allowed!
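[The core of such an editor is a check of each word against the allowed list. The following is a minimal sketch only: the function name and the tiny word list are invented for illustration, and the real editor's list of 1000 words plus inflections is not reproduced here.]

```python
import re


def disallowed_words(text: str, allowed: set[str]) -> list[str]:
    """Return the words in `text` that are not in `allowed`.

    Words are compared in lowercase and reported once each, in order of
    first appearance, so the editor can complain about them one by one.
    """
    offenders: list[str] = []
    for word in re.findall(r"[A-Za-z']+", text):
        lowered = word.lower()
        if lowered not in allowed and lowered not in offenders:
            offenders.append(lowered)
    return offenders


if __name__ == "__main__":
    # Hypothetical stand-in for the real list of allowed words.
    allowed_words = {"the", "big", "thing", "goes", "up"}
    draft = "The big rocket goes up. The rocket!"
    print(disallowed_words(draft, allowed_words))
```

[Matching against a pre-expanded list of inflected forms, as the excerpt describes, keeps the check a plain set lookup, at the cost of admitting oddities such as ‘themselveses’.]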
tools  text  writing 
january 2013 by juliusbeezer
The Best of Our Gun Debate: Readers Weigh In on Owning Firearms - The Daily Beast
Interesting automated analysis of comments to Daily Beast gun control discussion.
"Now, this isn’t a comprehensive study by any means, but it does provide a lot of food for thought. To help sort through the data, we ran the 1,300 responses through Overview, a clustering algorithm maintained by the Associated Press and supported by the John S. and James L. Knight Foundation, designed to flag key words and phrases that showed up most frequently. For example, over 30 responses included some variant of the word “need”—mostly from non-gun owners like JR from Indiana, who wrote, “I don’t need one. Simple.”

On the other side, gun owners clearly argued for the practicality of gun ownership—the word “hunt” came up often..."
I guess we'd better let the farmers keep blasting away at the rabid skunks with a shotgun...
journalism  text  text_tools  textmining  commenting 
january 2013 by juliusbeezer
Allen H. Renear (personal web page)
Denton Declaration collaborator's research interests include:

"Ontologies for digital objects. Our statements about digital objects make extensive use of idiom, metaphor, and logical fiction. If these sentences are naively transferred into the world of linked data and semantic technologies much unsound (and possibly harmful) inferencing will ensue and many opportunities will be lost. More robust ontologies are needed."

and publications include:

“What is Text, Really?” Steven J. DeRose, David G. Durand, Elli Mylonas, and Allen H. Renear. Journal of Computing in Higher Education 2:1 3-26 (1990). Reprinted in the ACM/SIGDOC Journal of Computer Documentation 21:3 1-24 (1997). [ACM]
opendata  text  text_tools  semantic  ontology  sciencepublishing 
november 2012 by juliusbeezer
Stephen Ramsay / Reading Machines: Toward an Algorithmic Criticism
Computer-based text analysis has been employed for the past several decades as a way of searching, collating, and indexing texts. Despite this, the digital revolution has not penetrated the core activity of literary studies: interpretive analysis of written texts.

Computers can handle vast amounts of data, allowing for the comparison of texts in ways that were previously too overwhelming for individuals, but they may also assist in enhancing the entirely necessary role of subjectivity in critical interpretation.
literature  text 
june 2012 by juliusbeezer
Unilever Centre for Molecular Informatics, Cambridge - What are the formal restrictions on text-mining? « petermr's blog
#oscar4 #okfn #pantonpapers

A little while ago I suggested that we create whitepapers (“Panton Papers”) to help our development of open science. We’ve come up with some titles and I’ve drafted one on text-mining. There’s now a useful response from Todd Vision on the Open-science discussion list:

Peter’s draft whitepaper on text-mining is badly needed and nicely put. I was particularly interested in this passage:

“The provision of journal articles is controlled not only by copyright but also (for most scientists) the contracts signed by the institution.
text  text_tools  openaccess  open  openscience 
april 2011 by juliusbeezer
The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) - Joel on Software
All that stuff about "plain text = ascii = characters are 8 bits" is not only wrong, it's hopelessly wrong... In Unicode, the letter A is a platonic ideal. Every platonic letter in every alphabet is assigned a magic number by the Unicode consortium which is written like this: U+0639. This magic number is called a code point. There is no real limit on the number of letters that Unicode can define and in fact they have gone beyond 65,536 so not every unicode letter can really be squeezed into two bytes.
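[The point about code points and byte counts is easy to check. The sketch below uses only Python built-ins; the helper name and the choice of sample characters are illustrative assumptions, not anything from Joel's article.]

```python
def describe(ch: str) -> tuple[str, int, int]:
    """Return (code point label, UTF-8 byte count, UTF-16 byte count) for one character."""
    label = f"U+{ord(ch):04X}"
    utf8_len = len(ch.encode("utf-8"))
    # "utf-16-le" avoids counting the two-byte byte-order mark.
    utf16_len = len(ch.encode("utf-16-le"))
    return label, utf8_len, utf16_len


if __name__ == "__main__":
    # Latin A, the Arabic letter at U+0639, and a character above 65,536.
    for ch in ["A", "\u0639", "\U0001F4A9"]:
        print(describe(ch))
```

[The last character lies beyond U+FFFF, so even UTF-16 needs four bytes (a surrogate pair) to store it, which is exactly why "one character = two bytes" fails.]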
text  text_tools 
december 2010 by juliusbeezer
The Glass Box And The Commonplace Book
iPad (but not Kindle) locks you out of copying and pasting the text you are reading
internet  copyright  history  technology  text  journalism 
may 2010 by juliusbeezer
