Hello C# developers,
I have been given the task of developing a console program that parses a PDF file and extracts information from it (mainly text).
Does anybody know how to do it?
Thanks a lot.
Google it... and you'll find it...
http://www.codeproject.com/csharp/MgPDFReader.asp
http://www.codeproject.com/cs/samples/pdf2text.asp
Hello again...
I've already used the iTextSharp .NET library to extract text from PDF files, and it works quite well, although some PDF files are unreadable...
Now I have a new challenge: extracting images/pictures from a PDF. Is there any library capable of doing this?
Thanks a lot.
You should really consider a commercial package if you want to read/modify PDF files, such as this one (just googled it).
I've looked at PDFNet before, but it's too costly for me... and I've spent days googling as well.
Maybe somebody in this forum has experience with PDF reading/parsing, or has used iTextSharp before to read through a PDF?
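For reference, here is a minimal sketch of page-by-page text extraction with iTextSharp. It assumes an iTextSharp 5.x assembly is referenced, since PdfTextExtractor lives in the iTextSharp.text.pdf.parser namespace there and older releases may not have it. The class name PdfTextDump and the helper ExtractText are made up for this example. It does not cover image extraction, and it will not make unreadable files readable; it only reports the failure instead of crashing.

```csharp
using System;
using System.Text;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

// Hypothetical example class; not part of iTextSharp.
class PdfTextDump
{
    // Returns the text of every page, one page after another.
    static string ExtractText(string path)
    {
        PdfReader reader = new PdfReader(path);
        try
        {
            StringBuilder text = new StringBuilder();
            // PDF page numbers start at 1, not 0.
            for (int page = 1; page <= reader.NumberOfPages; page++)
            {
                text.AppendLine(PdfTextExtractor.GetTextFromPage(reader, page));
            }
            return text.ToString();
        }
        finally
        {
            // Release the file handle even if extraction fails.
            reader.Close();
        }
    }

    static void Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.Error.WriteLine("usage: PdfTextDump <file.pdf>");
            return;
        }
        try
        {
            Console.WriteLine(ExtractText(args[0]));
        }
        catch (Exception ex)
        {
            // Damaged or unsupported files end up here.
            Console.Error.WriteLine("Could not read " + args[0] + ": " + ex.Message);
        }
    }
}
```

Scanned PDFs that contain only page images will come back as empty text with this approach, because there is no text layer to extract.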