Monthly Archives: February 2013

Basic XML Exploration and Thoughts on Moving Beyond

I’ve been practicing with XML and XSLT stylesheets. Using this list from the LA Times, I’ve been able to create a very basic data set complete with book covers. I believe this is a very good first step toward creating something engaging with a relevant data set. I’m not sold on this […]

Posted in XML

NES Instruction Manuals

For my experiments with OCR software and text encoding, I have decided to switch from pulling text out of comic books (which has proven difficult) to pulling it from databases of video game instruction manuals, specifically for the NES. There is still a lot of cleanup to do, but the issues are far less taxing […]

Posted in Uncategorized

Uncle Scrooge Experiment Session 2

As far as data sets go, Comichron.com has a detailed table presenting data on Uncle Scrooge comics distributed from 1960 to 1998. This table can be found here: http://www.comichron.com/titlespotlights/unclescrooge.html. I cut and pasted it into Data Wrangler. The issue I found was that the amount of information seemed to overload the application. A table sprang up only […]

Posted in Uncategorized

Uncle Scrooge Experiment Session 1

Uncle Scrooge Corpus Project, Session 1. So my first attempt at text extraction was to take a very old Uncle Scrooge comic book and use the OCR tools in Adobe Acrobat X. I am not completely unfamiliar with this process, as I had to use this software at a former job. I have experience with […]

Posted in Uncategorized