Tag Archives: text extraction

NES Instruction Manuals

For my experiments with OCR software and text encoding, I have decided to switch from pulling text out of comic books (which has proven difficult) to pulling it from databases of video game instruction manuals, specifically those for the NES. There is still a lot of cleanup to do, but the issues are far less taxing […]

Posted in Uncategorized | Comments closed

Uncle Scrooge Experiment Session 2

As far as data sets go, Comichron.com has a detailed table presenting data on Uncle Scrooge comics distributed from 1960 to 1998. This table can be found here: http://www.comichron.com/titlespotlights/unclescrooge.html. I cut and pasted it into Data Wrangler. The issue I found was that the amount of information seemed to overload the application. A table sprang up only […]

Posted in Uncategorized | Comments closed

Uncle Scrooge Experiment Session 1

Uncle Scrooge Corpus Project, Session 1. My first attempt at text extraction was to take a very old Uncle Scrooge comic book and run it through the OCR tools in Adobe Acrobat X. I am not completely unfamiliar with this process, as I had to use this software at a former job. I have experience with […]

Posted in Uncategorized | Comments closed