Uncle Scrooge Experiment Session 1

Uncle Scrooge Corpus Project
Session 1

Uncle Scrooge #89

My first attempt at text extraction was to take a very old Uncle Scrooge comic book and run it through the OCR tools in Adobe Acrobat X. I am not completely unfamiliar with this process: I used this software at a former job to turn typewritten manuscripts into editable PDF files. I was curious how well text extraction would work with a comic. My initial results are aggravating.

Page 1:
The software was unable to read more than half of the text. It extracted the first speech bubble and the metadata at the bottom of the page:

1101..0 YOUR IIORS5
OONM.O l WWERE ‘
ARE ‘tOO GOING IN SUCH A HURRY l

iJSTMASTER• Please send nolfce on Form 3579 to Western Publis hing Company, Inc., orth Road, Poughkeepsie, New York 12602
? .
:·s ey UNelE SCROOGE, No. 89, October, 1970. Published bi-monthly by Western P blishlng Company, Inc.,- North Road, Poughkeep
: York 12602. Second-class postage paid at Poughkeepsie, New York. Subscr ipt ion pr ic e in t he U.S.A. $1.00 per year; foreign subscrip
s: 5 to be repr
• ·-:Jt permission of Walt Disney Productions. Authorized edit ion. Printed in U.S.A.
– 1870, 185&, by Walt Disney Productions.
GOLD KEY & DESIGN is a Trademark of Western Publis hing Company, Inc.
CHANGE OF ADDRESS should reach us six weeks in advance of the next issue date. Give both your old and
‘ · . new addnu enclosinif possible your old address label.
” • • :al may not .be sold except by authorized dealers and is sold subject to the conditions that it shall not be sold or distribute
rt of its· cover or markinis removed, nor in a mutilated ondition nor affixed to nor as part of any advertising, literary

Page 2:
I had the same problem here. For some reason the software is picking up only the first speech bubble. I imagine there is something in the settings I can alter to fix the issue…

I WISH ‘t’OU ‘D BE LI KE lt\E, OONALOl I NEVER WASTE MONeY OR
. ANYTHING l
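
One way to put a rough number on how far off this output is would be to compare an extracted line against a hand-typed transcription of the same speech bubble. The sketch below uses Python's standard `difflib` module for that. The `similarity` helper is a hypothetical name, and the `reference` string is only a guess at what the bubble says, inferred from the OCR output above and not checked against the printed page.

```python
import difflib


def similarity(ocr_text: str, reference: str) -> float:
    """Return a 0..1 similarity ratio between OCR output and a reference string."""

    def normalize(text: str) -> str:
        # Uppercase and collapse runs of whitespace so that case and
        # line-break noise alone are not counted as errors.
        return " ".join(text.upper().split())

    matcher = difflib.SequenceMatcher(
        None, normalize(ocr_text), normalize(reference), autojunk=False
    )
    return matcher.ratio()


# OCR output for page 2, copied from above (the two lines joined with a space).
ocr_line = "I WISH ‘t’OU ‘D BE LI KE lt\\E, OONALOl I NEVER WASTE MONeY OR . ANYTHING l"

# ASSUMPTION: a guessed transcription, not verified against the comic itself.
reference = "I WISH YOU'D BE LIKE ME, DONALD! I NEVER WASTE MONEY OR ANYTHING!"

print(f"similarity: {similarity(ocr_line, reference):.2f}")
```

A ratio of 1.0 means the two strings match exactly after normalization, and lower values mean more characters were misread. This only scores the text that was extracted; bubbles the software skipped entirely would have to be counted separately.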

 

FAIR USE NOTICE: This study utilizes copyrighted material the use of which has not been specifically authorized by the copyright owner. I am making my study available in an effort to advance understanding of environmental, political, human rights, economic, democracy, scientific, and social justice issues, etc. I believe this constitutes a ‘fair use’ of any such copyrighted material as provided for in section 107 of the US Copyright Law. In accordance with Title 17 U.S.C. Section 107, the material on this site is distributed without profit to those who have expressed a prior interest in receiving the included information for research and educational purposes. For more information go to: http://www.law.cornell.edu/uscode/17/107.shtml.

If you wish to use copyrighted material from this site for purposes of your own that go beyond ‘fair use’, you must obtain permission from the copyright owner.
