Tesseract OCR for Windows?

Hello everyone,
I see there’s a Tesseract OCR addon for Mac, but is there a Tesseract OCR addon for Windows?
If I put the ofxTesseract addon by Kyle McDonald into a Windows version of openFrameworks, will it still work?

thx in advance

Currently @kylemcdonald 's ofxTesseract addon only includes static binary libraries for osx, so if you want to use it with windows, you’ll need to download the source and compile the library for windows.

The instructions included in his README will work well if you are compiling it with MinGW / Codeblocks on windows. If you want to run it with Visual Studio you’ll have to find instructions for how to build visual studio binary libs.

Once you have those binary libraries, make a subfolder in the addon’s libs folder (like the existing osx folder), drop in the static library you compiled and test it out. If it works, issue a Pull Request back to the original repository for others to enjoy your hard work :smile:

Good luck!

@bakercp Hey, thanks a lot. Although I’m not familiar with building static libraries, I’ll definitely try it out.
I’ll keep posting here if I make any significant progress or run into trouble. I think it’s time to let Windows users get some of these sweeties :smiley:

Update:
I followed this link: http://vorba.ch/2014/tesseract-3.03-vs2013.html
and successfully built several files: “libtesseract303-static.lib”, “tesseract.exe”, “tesseract.exp”, and “tesseract.lib”, all located in “bin/Win32/LIB_Release” as mentioned in the link above.

After that, what should I do? I put all these files into “ofxTesseract/lib/Win32”, put the whole ofxTesseract folder into the addons folder of openFrameworks, and created a project using the project generator. However, this gives me the following error:

Error 2 error C1083: Cannot open include file: ‘baseapi.h’: No such file or directory d:\openframeworks\of_v0.8.0_vs_release\addons\ofxtesseractwindows\src\ofxTesseract.h 10 1 testTesseract

It seems that the project generator did not include baseapi.h in my newly created project (I’ve checked the addon folder and baseapi.h is in there). I’m using Visual Studio 2013 Express.

What should I do to fix this? Thanks in advance.

I just realized that Kyle’s addon is a bit out of date.

I just updated it so try my fork here:

https://github.com/bakercp/ofxTesseract/tree/feature-updated-addon-format

Then try placing the generated libtesseract303-static.lib in a folder called vs next to the osx folder here:

https://github.com/bakercp/ofxTesseract/tree/master/libs/tesseract/lib

Project generator won’t recognize ofxTesseract/lib/Win32, but it will recognize:

ofxTesseract/libs/tesseract/lib/vs.

@bakercp
Thx for the update, now “baseapi.h” is correctly included in the project.
However, several other problems appeared. The first one is “setenv : identifier not found”, which I solved with a little bit of hacking:

adding the following code in “ofxTesseract.h”

//edited for setenv : identifier not found problem
int setenv(const char *name, const char *value, int overwrite);

and adding the following code in “ofxTesseract.cpp”

//edited for setenv : identifier not found problem
int ofxTesseract::setenv(const char *name, const char *value, int overwrite)
{
    int errcode = 0;
    if(!overwrite) {
        size_t envsize = 0;
        errcode = getenv_s(&envsize, NULL, 0, name);
        if(errcode || envsize) return errcode;
    }
    return _putenv_s(name, value);
}

reference: link here
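
For anyone hitting the same error, here is a minimal sketch of the same idea written as a free function that also compiles on non-MSVC toolchains. The name `portableSetenv` is hypothetical and not part of ofxTesseract; on MSVC it repeats the `getenv_s` / `_putenv_s` logic from the hack above, and elsewhere it simply forwards to the POSIX `setenv`.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical helper (not part of ofxTesseract): sets an environment
// variable with setenv-like semantics on both MSVC and POSIX toolchains.
inline int portableSetenv(const char* name, const char* value, int overwrite)
{
    assert(name != nullptr && value != nullptr);
#if defined(_MSC_VER)
    if (!overwrite) {
        // Ask only for the required buffer size; a non-zero size means
        // the variable already exists, so leave it untouched.
        size_t envsize = 0;
        const int errcode = getenv_s(&envsize, NULL, 0, name);
        if (errcode || envsize) return errcode;
    }
    return _putenv_s(name, value);
#else
    // POSIX already provides setenv with the same semantics.
    return ::setenv(name, value, overwrite);
#endif
}
```

A free function like this avoids having to declare `setenv` inside the `ofxTesseract` class; the existing `setenv` calls in the addon would then be pointed at it, but that wiring is an assumption and untested here.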

The second one is that I’m getting a lot of linker errors:

error LNK2038: mismatch detected for ‘_MSC_VER’: value ‘1800’ doesn’t match value ‘1700’ in main.obj

error LNK2038: mismatch detected for ‘_ITERATOR_DEBUG_LEVEL’: value ‘0’ doesn’t match value ‘2’ in main.obj

error LNK2038: mismatch detected for ‘RuntimeLibrary’: value ‘MD_DynamicRelease’ doesn’t match value ‘MDd_DynamicDebug’ in main.obj

As I’m totally inexperienced in building static libraries, do these errors have something to do with the version of Visual Studio that I used to create the Tesseract library? (I’m using Visual Studio 2013 to build the Tesseract library and Visual Studio 2012 for the openFrameworks project, as the version I’m using, v0.8.0, does not support Visual Studio 2013.)
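
A quick way to read those three errors: `_MSC_VER` 1700 is Visual Studio 2012 and 1800 is Visual Studio 2013, so the first error says the library and main.obj came from different compilers. `_ITERATOR_DEBUG_LEVEL` 0 versus 2 and `MD_DynamicRelease` versus `MDd_DynamicDebug` say a Release library is being linked into a Debug build. The sketch below is only a diagnostic aid with hypothetical function names; compiling it into each project reports which settings that project actually uses.

```cpp
#include <cassert>
#include <string>

// Maps an _MSC_VER value to the Visual Studio release that defines it.
// Only a few releases are listed; anything else is reported as "unknown".
inline std::string msvcVersionName(int mscVer)
{
    assert(mscVer > 0);
    switch (mscVer) {
        case 1500: return "Visual Studio 2008";
        case 1600: return "Visual Studio 2010";
        case 1700: return "Visual Studio 2012";
        case 1800: return "Visual Studio 2013";
        default:   return "unknown";
    }
}

// Describes the compiler and runtime flavour of the translation unit
// this function is compiled in.
inline std::string describeBuild()
{
    std::string out = "compiler: ";
#if defined(_MSC_VER)
    out += msvcVersionName(_MSC_VER);
#else
    out += "not MSVC";
#endif
#if defined(_DEBUG)
    // _DEBUG is defined by the debug runtimes (/MDd, /MTd).
    out += ", runtime: debug";
#else
    out += ", runtime: release";
#endif
    return out;
}
```

If the library project and the application project report different values, the LNK2038 errors are expected, and the library needs rebuilding with the application's compiler and configuration.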

I’m also less experienced building for Visual studio … but I’d guess that the VS2012 vs VS2013 issue is absolutely an issue based on my limited experience of late :smile:

Is there a way you can compile your libs with VS2012? Currently OF doesn’t support 2013 (afaik), although last I heard we’re working on it.

After searching a bit through Google, that seems to be the case. Well, I guess that leaves me no choice but to find a way to compile it using Visual Studio 2012 or Code::Blocks (though I don’t really like using Code::Blocks).

The link I referenced (link here) is for Visual Studio 2013 only, and my version of openFrameworks (0.8.0) supports Visual Studio 2012 only. I’ve spent almost 2 days on this with nothing to show for it, which is quite frustrating. Nevertheless, I’ll keep trying different methods to make Tesseract work with my project, though I may not be able to provide a proper addon. Will keep posting here.

Thank you very much, I really appreciate all the help.

LATEST UPDATE
@bakercp @kylemcdonald
So I’m still struggling to compile the library. I’m following the official tutorial for compiling in Visual Studio 2008, but instead of VS2008 I’m using VS2012 Express for Windows Desktop.

It produces this error log (sorry, I had to change the extension to cpp):
tesseractErrorlog.cpp (1.7 KB)

I’ve tried running VS2012 Express as administrator to compile the project (WDExpress.exe, right click, run as administrator), but that still didn’t get the job done.

Can anyone who’s familiar with building static libraries (I’m trying to build Tesseract as a library to use in my own project) point me in the right direction? Any help is greatly appreciated.

Update 2:
so after quite a bit of messing around, it now gives me

and I solved this by turning the linker’s “Image Has Safe Exception Handlers” option off (/SAFESEH:NO)

The build produces quite a few files

and I’ve put the lib file into the addon (ofxTesseract\lib\tesseract\lib\vs) and used it with the project generator, but it still gives me a bunch of errors

Here I’m providing the error log and the static library for more investigation
—Download link here, it’s too big to upload to here—

Again, any help would be greatly appreciated.

I just googled “tesseract vs2012” and got this link. Did you try this by chance?

@bakercp
Thank you for providing the source. I tried it and got a few errors

probably because of some linker settings. I’ll try again, as this one seems to give me the fewest errors.

UPDATE
Well, after spending quite a few days and getting no result, I’m giving up on building the library for VS2012.

My workaround is to install Tesseract from the installer exe file, then have my program call the command prompt to run tesseract.exe, which is located in the Tesseract install path.

However, I would like to know: is CreateProcess() not available in openFrameworks? Currently I’m using the system() function to do the trick, but I don’t really want to use it, as it has security vulnerabilities.
reference here
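
If system() has to stay for now, one way to reduce the risk is to build the command line from validated pieces rather than raw user input. The sketch below is an assumption-laden example, not a tested addon: the function names are hypothetical, and it relies only on Tesseract's documented command-line form `tesseract imagename outputbase -l lang`, which writes the recognized text to `outputbase.txt`.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Wraps one argument in double quotes. Returns an empty string when the
// argument is empty or contains a double quote, since that could break
// out of the quoting.
inline std::string quoteArg(const std::string& arg)
{
    if (arg.empty() || arg.find('"') != std::string::npos) return "";
    return "\"" + arg + "\"";
}

// Builds "exe image outputBase [-l lang]" or returns an empty string if
// any piece fails validation. The language code may only contain letters,
// digits and underscores (e.g. "eng", "chi_sim").
inline std::string buildTesseractCommand(const std::string& exePath,
                                         const std::string& imagePath,
                                         const std::string& outputBase,
                                         const std::string& lang)
{
    const std::string exe = quoteArg(exePath);
    const std::string image = quoteArg(imagePath);
    const std::string out = quoteArg(outputBase);
    if (exe.empty() || image.empty() || out.empty()) return "";

    for (std::string::size_type i = 0; i < lang.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(lang[i]);
        if (!std::isalnum(c) && c != '_') return "";
    }

    std::string cmd = exe + " " + image + " " + out;
    if (!lang.empty()) cmd += " -l " + lang;
    return cmd;
}
```

The resulting string can be passed to `system(cmd.c_str())` when it is non-empty. This does not cover every shell metacharacter, and the quoting rules of the shell that system() invokes still apply, so it is worth testing with install paths that contain spaces.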

Currently the ofSystem command is in the process of being upgraded to use Poco::Process. I think that fix will give you more reliable results. If you want to try it out before the upgrade, you can see how we use it in ofSketch here:

Why not use a free online service? I found a free online OCR; it uses Tesseract OCR 3.02.