the tesseract library is an OCR (optical character recognition) engine.
basically it searches an image for words or text and gives them back to you as a string.
i've created an ofxTesseract wrapper which is very simple at the moment but should get people started quickly.
originally i found this source by nolan brown, http://github.com/nolanbrown/Tesseract-iPhone-Demo
the tesseract library was already compiled in the iphone xcode project and so i figured it would work in OF on a mac, which it did.
it might work better if the library is compiled again, specifically for mac osx, but this is a bit over my head. it would be great if someone knew how to do this.
something to be aware of:
in the xcode project, go to executables,
double click the executable and a window will come up.
in the Arguments tab, i had to add the following:
TESSDATA_PREFIX = “…/…/…/data/”
this tells the library where the “tessdata” folder is located.
so far i haven’t been able to figure out any other way of specifying the data path.
subscan - basically scans for movie subtitles.
not sure where this side project is going… but it's been a good exercise in openCV.
also nice to be able to transcribe all that movie data in real-time and have it at hand.
if anyone has any cool ideas on what to do with all this movie data, would love to hear them!
Thanks for your work on this! I compiled Tesseract-3.00 for i386 on OS X and have used it in OpenFrameworks. This update seems to have fixed some of the problems with setting data-paths etc., plus solved a mysterious “tesseract crashes 30% of the time” issue I was having (which may or may not have been a problem with my coding).
Anyway, I’ve put an updated version of the tesseractExample online here:
hi robert,
awesome that you got Tesseract-3.00 compiling!
i’ve run your tesseractExample and i'm also getting the same error as stephan.
could there be a step i missed somewhere?
OCR in OF sounds ace! I couldn’t get any of these examples to work though… the first one seems to compile but then quits straight away with an “emptyExample has exited with status 1” comment in the message bar at the bottom of xcode. The project from the second download link fails to build, with these 3 errors in main.cpp:
/Users/SCam/Documents/of_preRelease_v0062_osxSL_FAT/apps/addonsExamples/tesseractV2download/src/main.cpp:13:0 /Users/SCam/Documents/of_preRelease_v0062_osxSL_FAT/apps/addonsExamples/tesseractV2download/src/main.cpp:13: error: expected type-specifier before ‘testApp’
/Users/SCam/Documents/of_preRelease_v0062_osxSL_FAT/apps/addonsExamples/tesseractV2download/src/main.cpp:13:0 /Users/SCam/Documents/of_preRelease_v0062_osxSL_FAT/apps/addonsExamples/tesseractV2download/src/main.cpp:13: error: expected `)’ before ‘testApp’
/Users/SCam/Documents/of_preRelease_v0062_osxSL_FAT/apps/addonsExamples/tesseractV2download/src/main.cpp:13:0 /Users/SCam/Documents/of_preRelease_v0062_osxSL_FAT/apps/addonsExamples/tesseractV2download/src/main.cpp:13: error: cannot convert ‘int*’ to ‘ofBaseApp*’ for argument ‘1’ to ‘void ofRunApp(ofBaseApp*)’
Any pointers as to where I’m going wrong? I’d love to have a play with this!
just tried the first example, and it works fine for me with minor tweaking.
i’m using it with OF-github and started with the opencv example.
then i swapped out the source and added ofxTesseract.
then i got the error “emptyExample has exited with status 1”.
to fix this, i had to create a directory at /usr/local/ called ‘share’, then one called ‘tessdata’ inside share/. then i moved everything from bin/data/tessdata/ into /usr/local/share/tessdata/ and everything worked.
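since the "exited with status 1" quit seems to come down to tesseract not finding its data folder, a quick check before initializing can save some guessing. this is only a rough sketch: `findTessdataDir` is a made-up helper (it isn't part of ofxTesseract or tesseract), it needs a C++17 compiler for `std::filesystem`, and the candidate paths you pass in are whatever you think tesseract might be looking at.

```cpp
#include <filesystem>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical diagnostic helper (not part of ofxTesseract or Tesseract):
// returns the first candidate that is an existing, non-empty directory,
// or an empty string if none of them qualifies.
std::string findTessdataDir(const std::vector<std::string>& candidates) {
    for (const std::string& dir : candidates) {
        std::error_code ec;  // use the non-throwing overloads
        if (fs::is_directory(dir, ec) && !fs::is_empty(dir, ec) && !ec) {
            return dir;
        }
    }
    return "";
}
```

for example, calling it with `{"/usr/local/share/tessdata", "bin/data/tessdata"}` and logging the result (or the fact that it came back empty) should show which location is actually populated on your machine.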
now that i can see it’s working i’m going to try getting the latest tesseract compiled and try that instead. i’d rather not have to create that /usr/local/share/tessdata folder.
@stephanschulz, @julapy – there seems to still be some kind of error with tesseract recognizing the path you give it.
but i looked through the tesseract source and noticed it has an env variable you can set to override everything else (this is in mainblk.cpp at void CCUtil::main_setup).
so if you have your data in bin/data/tessdata, you can just set that variable (TESSDATA_PREFIX, pointing at the folder that contains tessdata/) before tesseract is initialized,
and that will force tesseract to use the right location. kind of a hack, but i’m not sure why it’s broken right now… the secret might be somewhere in the tesseract getpath() function, but it’s a bit messy and hard to decipher.
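the environment-variable override described above could be sketched roughly like this. `setTessdataPrefix` is a made-up helper name, not something from ofxTesseract; it uses the POSIX `setenv` call, and it assumes (as far as i can tell from the tesseract source) that the prefix should be the folder containing tessdata/ and should end with a slash.

```cpp
#include <cassert>
#include <stdlib.h>  // setenv, getenv (POSIX)
#include <string>

// Hypothetical helper (not part of ofxTesseract): set TESSDATA_PREFIX to the
// folder that CONTAINS "tessdata/". Returns true if the variable was set.
bool setTessdataPrefix(std::string parentDir) {
    assert(!parentDir.empty());
    // assumption: tesseract appends "tessdata/" to the prefix,
    // so make sure the prefix ends with a slash
    if (parentDir.back() != '/') {
        parentDir += '/';
    }
    // third argument 1 = overwrite any value that is already set
    return setenv("TESSDATA_PREFIX", parentDir.c_str(), 1) == 0;
}
```

in an OF app, something like `setTessdataPrefix(ofToDataPath("", true));` at the top of `setup()`, before the wrapper initializes tesseract, should be the equivalent for data living in bin/data/tessdata.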
I seem to be having the same problem. I’m running the latest version, however, and still no luck. I don’t see an ofxAutoControlPanel.h in the library, so maybe I am missing something.