-
Using Tesseract in MFC
I am struggling to use the Tesseract library in an MFC project under VS2010. I started by trying to compile the Tesseract library from https://github.com/tesseract-ocr/tesseract, but when I tried to generate a solution with CMake, I got the following error:
Code:
CMake Error at CMakeLists.txt:100 (find_package):
Could not find a package configuration file provided by "Leptonica"
(requested version 1.74) with any of the following names:
LeptonicaConfig.cmake
leptonica-config.cmake
Add the installation prefix of "Leptonica" to CMAKE_PREFIX_PATH or set
"Leptonica_DIR" to a directory containing one of the above files. If
"Leptonica" provides a separate development package or SDK, be sure it has
been installed.
Obviously, I need Leptonica, but where can I get this library? I read that it is used in Tesseract projects.
Alternatively, has anyone already compiled Tesseract with VS2010? Can you point me toward a solution? Thank you.
-
Re: Using Tesseract in MFC
I admit I didn't search the internet for this problem :D Thank you for the idea. However, I didn't find anything about the Leptonica library on GitHub.
-
Re: Using Tesseract in MFC
If I compile the Tesseract library in VS2017, could I use the compiled .lib and .dll in a VS2010 project?
-
Re: Using Tesseract in MFC
I ask because I have compiled Leptonica, but when I compile Tesseract I get a bunch of errors, and I guess the VS2010 compiler is the cause:
Code:
3>c:\flaviu\tesseract-master\src\arch\dotproductavx.cpp(30): error C4716: 'tesseract::DotProductAVX' : must return a value
3>
3>Build FAILED.
3>
3>Time Elapsed 00:00:00.36
2>C:\Flaviu\tesseract-master\src\ccutil\host.h(30): fatal error C1083: Cannot open include file: 'cinttypes': No such file or directory
the code:
Code:
namespace tesseract {
double DotProductAVX(const double* u, const double* v, int n) {
  fprintf(stderr, "DotProductAVX can't be used on Android\n");
  abort();
}
}  // namespace tesseract
and
Code:
#include <cinttypes> // PRId32, ...
-
Re: Using Tesseract in MFC
<cinttypes> was introduced with C++11, so you need at least a C++11-compatible compiler. VS2010 does not fully support C++11 and does not ship this header, so you'll need a later version of the compiler to compile this code.
-
Re: Using Tesseract in MFC
So if I compiled this library with, say, VS2017, could I use it inside a VS2010 project?
-
Re: Using Tesseract in MFC
Quote:
Originally Posted by
mesajflaviu
So if I compiled this library with, say, VS2017, could I use it inside a VS2010 project?
Probably, yes.
Just try it!
-
Re: Using Tesseract in MFC
Note that each version of Visual C++ uses its own version of the CRT, so the .dll compiled with VS2017 will use a different version of the CRT from code compiled with VS2010. This may or may not be an issue, but on any computer on which you want to run these programs you'll need to have each of these CRT versions installed.
-
Re: Using Tesseract in MFC
Quote:
Originally Posted by
mesajflaviu
So if I compiled this library with, say, VS2017, could I use it inside a VS2010 project?
Why ask? Just try it.
-
Re: Using Tesseract in MFC
From experience, I knew it was impossible for everything to go right :D
I have installed VS2017 on my machine, but when I tried to create a simple MFC test project, I got the following error: Attachment 35343
OK, so I tried to create a solution file for Tesseract with CMake, and here is the result:
Code:
CMake Error at CMakeLists.txt:41 (project):
Generator
Visual Studio 15 2017
could not find any instance of Visual Studio.
Configuring incomplete, errors occurred!
See also "C:/Flaviu/tesseract-master/bin/CMakeFiles/CMakeOutput.log".
I am getting nowhere ...
-
Re: Using Tesseract in MFC
No, of course not. I have searched for a solution to my problem, but I didn't find anything relevant; I might not have used the right terms.
-
Re: Using Tesseract in MFC
When you installed VS2017, did you tell the installer that you wanted C++/MFC/SDK etc. for desktop? I don't think they are installed by default now.
Run the VS installer, click Modify and, under Installation Details, check what you have installed for Desktop development with C++.
-
Re: Using Tesseract in MFC
Yes, I checked these details and noticed that I hadn't installed all the components I need. I have now installed them and have successfully compiled the Leptonica and Tesseract libraries. Now it comes to using them inside a VS2010 project. I am very curious whether it will work; I will come back with feedback.
-
Re: Using Tesseract in MFC
As soon as I included
Code:
#include <tesseract/baseapi.h>
in my VS2010 project, I got
Code:
1>c:\flaviu\imagetext\tesseract\include\tesseract\publictypes.h(33): error C2144: syntax error : 'int' should be preceded by ';'
which points to:
/** Number of printers' points in an inch. The unit of the pointsize return. */
constexpr int kPointsPerInch = 72;
...
...
fatal error C1083: Cannot open include file: 'cinttypes': No such file or directory
So I have no chance of using the Tesseract library in a VS2010 project, do I?
-
Re: Using Tesseract in MFC
You cannot use the source code of Tesseract in VS2010, but if Tesseract is compiled into a .dll or an .obj, then potentially these could be used by VS2010 C++ code. When you referred to using the Tesseract library within VS2010 previously, this is what I thought you meant.
When you compile a C++ program, this produces .obj file(s), which are then linked to produce the .exe (or .dll) file. These .obj files need not all have been compiled at the same time by the same compiler. Indeed, this is how, say, a program composed of Fortran and C++ sources is compiled and linked: the C++ compiler produces a .obj file, the Fortran compiler produces a .obj file, and the linker combines these to produce the .exe.
If you use VS2017 to produce a .dll from the Tesseract source, then the exported functions can be used by another C++ program by referencing the appropriate .lib file at link time and including an appropriate header for the function declarations. However, if this header for Tesseract requires <cinttypes>, then this avenue is closed off (if you can't get around it) and you're left with linking multiple .obj files.
However, why are you trying to persist with VS2010? It is 8 years old and doesn't even fully support C++11, never mind the latest C++17 standard. Why not just move to the current VS2017?
-
Re: Using Tesseract in MFC
In the end, I have installed VS2017, and I like it. I needed a VS2010-compiled Tesseract library because the project where I intend to use it is a VS2010 project. Now I will have to convert this parent project from VS2010 to VS2017; I hope I will succeed.
I have successfully compiled the Leptonica and Tesseract libraries, but only with VS2017. The next step is to see why these libraries are not working in my project :D
All these libs are compiled as Debug Win32, and my MFC project is compiled as Debug x86, all of them with Multi-threaded Debug DLL (/MDd).
When I tested their sample code:
Code:
tesseract::TessBaseAPI api;
if (0 != api.Init(NULL, _T("eng"), tesseract::OEM_DEFAULT))
{
    m_sState.Format(_T("tesseract initialize error"));
    return FALSE;
}
the app always takes the error branch, but that is a separate issue. Anyway, thank you very much for your time and patience.
-
Re: Using Tesseract in MFC
Are you sure Init failed if it returns any non-zero value?
-
Re: Using Tesseract in MFC
Quote:
Originally Posted by
mesajflaviu
After all, I have installed VS2017, and I like it.
That's great news.
-
Re: Using Tesseract in MFC
Quote:
Originally Posted by
Arjay
That's great news.
A few things surprised me: it doesn't consume a ton of RAM (or disk space), though I admit it consumes more than VS2010, and I can open a VS2010 project, work on it in VS2017 (with its great IDE), and when done, open the project in VS2010 and compile it there. Fantastic!
-
Re: Using Tesseract in MFC
Quote:
Originally Posted by
VictorN
Are you sure Init failed if it returns any non-zero value?
Yes, I am sure. I tried:
Code:
int n = api.Init(NULL, _T("eng"), tesseract::OEM_DEFAULT);
and n is -1. Here is the comment on the Init method:
Code:
* Start tesseract. Returns zero on success and -1 on failure.
* NOTE that the only members that may be called before Init are those
* listed above here in the class definition.
-
Re: Using Tesseract in MFC
I wonder if anyone here has used Tesseract in MFC and could point out what I am doing wrong?
-
Re: Using Tesseract in MFC
Have you tried calling GetLastError after the failed Init call? Btw, if folks here knew what the problem was, they would be posting.
-
Re: Using Tesseract in MFC
Quote:
Originally Posted by
Arjay
Have you tried calling GetLastError after the failed Init call? Btw, if folks here knew what the problem was, they would be posting.
I have tried:
Code:
tesseract::TessBaseAPI api;
int n = api.Init(NULL, _T("eng"), tesseract::OEM_DEFAULT);
DWORD nError = GetLastError();
n is -1 and nError is 3 (ERROR_PATH_NOT_FOUND), which means "The system cannot find the path specified." Strange ...
-
Re: Using Tesseract in MFC
I haven't used Tesseract, but see the documentation at https://zdenop.github.io/tesseract-d...ab1cfc3bc09f3e
Are you compiling as multi-byte (ANSI) or Unicode? From the documentation, the first two parameters are expected to be const char*, i.e. narrow strings. You are using _T(".."), which gives a narrow string in a multi-byte build but a wide (wchar_t) string in a Unicode build. Have you tried setting parameter 2 to NULL, which should default to eng?
There is no indication in the documentation that .Init() sets the last-error code, so GetLastError() may or may not return a meaningful value if .Init() returns -1.
As you are using OEM_DEFAULT, there is no need to use the 3-parameter version; just use the 2-parameter version, which defaults to OEM_DEFAULT.
Have you tried
Code:
int n = api.Init(NULL, NULL);
-
Re: Using Tesseract in MFC
I have tried
Code:
tesseract::TessBaseAPI api;
int n = api.Init(NULL, NULL);
DWORD nError = GetLastError();
with the same result. I have compiled the project both as ASCII (multi-byte) and as Unicode, and the result is the same either way.
-
Re: Using Tesseract in MFC
According to the source, when a failure occurs, information is output via tprintf() to the file named in the variable debug_file or, if none is specified, to stderr. As you are running an MFC program, I guess there is no stderr, so the debug file name needs to be set. I think you just need to assign it, as debug_file is defined as
DLLSYM char* debug_file = "";
Failing that, as you have the source, debugging the .Init() function should show where the problem is.
-
Re: Using Tesseract in MFC
I have put the following line in MyProjectApp.h
Code:
#define DLLSYM char* debug_file = "C:\\Flaviu\\test.txt";
and no data is collected (the file is 0 bytes). My guess is that I didn't put this where it should be.
-
Re: Using Tesseract in MFC
You are redefining the symbol DLLSYM.
Try just
Code:
debug_file = "C:\\Flaviu\\test.txt";
-
Re: Using Tesseract in MFC
I have tried that, but the test.txt file is still empty. I don't get it: I have compiled the Leptonica and Tesseract libraries successfully and included them in my project with no errors, yet the following simple code does not work:
Code:
if (0 != api.Init(NULL, NULL))
{
    m_sState.Format(_T("tesseract initialize error. last error: %lu"), GetLastError());
    ::SendMessage(theApp.m_pMainWnd->GetSafeHwnd(), WM_SETMESSAGESTRING, 0, (LPARAM)(LPCTSTR)m_sState);
}
// tesseract initialize error. last error: 3
Strange. Could it be about configuring Tesseract? The libraries compiled successfully, with no errors, for the same platform and with the same settings.
-
Re: Using Tesseract in MFC
I would suggest that you first try creating a new non-MFC, standard console C++ solution that only tries to initialise Tesseract. If this still fails, then you're going to have to get your hands dirty and debug Tesseract.
If this simple test program does initialise OK, then it's something you are doing in your code that Tesseract doesn't like. Then try the simplest MFC program with Tesseract and see if that works. If not, then there's something about the combination of MFC and Tesseract that will have to be debugged. If this simple MFC test program does initialise Tesseract, then add code and test after each addition until you know which code causes Tesseract to fail to initialise.
-
Re: Using Tesseract in MFC
Very good suggestion: as soon as I ran this code inside a console app, I got useful error messages. Here is the code:
Code:
#include <iostream>   // std::cout
#include <windows.h>  // GetLastError
#include <leptonica/allheaders.h>
#include <tesseract/baseapi.h>
int main()
{
    std::cout << "Hello World!\n";
    tesseract::TessBaseAPI api;
    if (0 != api.Init(NULL, NULL))
    {
        std::cout << "tesseract initialize error\n";
        std::cout << "Last error:" << GetLastError() << std::endl;
    }
}
and here are the messages:
Code:
Hello World!
Error in pixReadMemTiff: function not present
Error in pixReadMem: tiff: no pix returned
Error in pixaGenerateFontFromString: pix not made
Error in bmfCreate: font pixa not made
Error opening data file C:\Program Files (x86)\Tesseract-OCR\eng.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory.
Failed loading language 'eng'
Tesseract couldn't load any languages!
tesseract initialize error
Last error:3
but I don't have any "Tesseract-OCR" folder in "Program Files (x86)" ...
-
Re: Using Tesseract in MFC
Well there's your error 3 from GetLastError(). It was being set by .Init() trying to open a non-existent file.
The 'fix' is also stated in the messages. You need to find the file 'eng.traineddata' on your system and set the TESSDATA_PREFIX environment variable to point to the directory on your system in which this file resides. If this file doesn't exist anywhere then you haven't downloaded and installed the required data files. See https://github.com/tesseract-ocr/tessdata for eng.traineddata file.
PS See also https://github.com/tesseract-ocr for further data/info.
-
Re: Using Tesseract in MFC
That is why I come here: my problems always get solved :)
As soon as I set the TESSDATA_PREFIX system variable correctly, the .Init method worked.
Now I have run into another error while trying to use:
Code:
Pix* pImage = pixRead("C:\\Flaviu\\imagine.png");
from the official API example: https://github.com/tesseract-ocr/tes...iki/APIExample
Code:
int main()
{
    tesseract::TessBaseAPI api;
    if (0 != api.Init(NULL, NULL))
    {
        std::cout << "====tesseract initialize error\n";
        std::cout << "====Last error:" << GetLastError() << std::endl;
    }
    Pix* pImage = pixRead("C:\\Flaviu\\imagine.png"); // <--- pImage has 0
    printf("pImage pointer value: %p\n", pImage);
}
the result is:
Code:
Error in pixReadMemTiff: function not present
Error in pixReadMem: tiff: no pix returned
Error in pixaGenerateFontFromString: pix not made
Error in bmfCreate: font pixa not made
Error in pixReadStreamPng: function not present
Error in pixReadStream: png: no pix returned
Error in pixRead: pix not read
pImage pointer value: 00000000
Naturally, I searched the internet for these error messages, and the advice is to install libtiff before building Leptonica: https://stackoverflow.com/questions/...nclude-libtiff
Code:
Try to install
apt-get install -y libtiff5-dev
Then install leptonica from source
git clone https://github.com/DanBloomberg/leptonica.git leptonica
cd leptonica
./autobuild
./configure --with-libtiff
make -j
make install
And reinstall tesseract after that
What are these commands (git clone, etc.)? I don't have any git application, and I have no need for such a thing ...
-
Re: Using Tesseract in MFC
These are Linux commands! Have you installed/compiled libtiff/Leptonica on your system?
-
Re: Using Tesseract in MFC
Quote:
Originally Posted by
2kaud
These are Linux commands! Have you installed/compiled libtiff/Leptonica on your system?
No, I guess not. I haven't installed anything like that myself; I only compiled Leptonica from GitHub, that is all.
-
Re: Using Tesseract in MFC
I have installed the Leptonica and Tesseract libraries on my PC. I didn't see any libtiff option during the installation/compilation process.