[ask]Reading PDF file using C#


pradana
November 14th, 2007, 08:04 PM
Hello C# developers,

I've been given the task of developing a console program that parses a PDF file and extracts information from it (mainly text).

Does anybody know how to do it?

Thanks a lot.

saktya
November 14th, 2007, 08:19 PM
Google it... and you'll find it...

http://www.codeproject.com/csharp/MgPDFReader.asp

http://www.codeproject.com/cs/samples/pdf2text.asp

pradana
November 19th, 2007, 10:07 PM
Hello again...

I've already used the iTextSharp .NET library to extract text from PDFs, and it works quite well, although some PDF files are unreadable...
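For anyone following along, here is a minimal sketch of this kind of text extraction. It assumes iTextSharp 5.x, where the `PdfTextExtractor` helper in `iTextSharp.text.pdf.parser` is available; older releases may need a different approach. The class name `PdfTextDump` and the command-line handling are made up for illustration, and this is not necessarily the exact code used above.

```csharp
using System;
using System.Text;
using iTextSharp.text.pdf;          // PdfReader
using iTextSharp.text.pdf.parser;   // PdfTextExtractor (assumes iTextSharp 5.x)

// Hypothetical console program: prints the text of every page in a PDF.
class PdfTextDump
{
    static void Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.Error.WriteLine("usage: PdfTextDump <file.pdf>");
            return;
        }

        PdfReader reader = new PdfReader(args[0]);
        try
        {
            StringBuilder text = new StringBuilder();

            // PDF page numbers are 1-based.
            for (int page = 1; page <= reader.NumberOfPages; page++)
            {
                text.AppendLine(PdfTextExtractor.GetTextFromPage(reader, page));
            }

            Console.Write(text.ToString());
        }
        finally
        {
            // Release the file handle even if extraction throws.
            reader.Close();
        }
    }
}
```

Scanned PDFs that contain only page images have no text layer, so a sketch like this returns empty strings for them; that may be one reason some files look unreadable.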

Now I've got a new challenge: extracting images/pictures from a PDF. Is there any library capable of doing this?

Thanks a lot.

Tischnoetentoet
November 20th, 2007, 08:28 AM
If you want to read/modify PDF files, you should really consider a commercial package, such as this one (http://www.pdftron.com/net/index.html) (just googled it).

pradana
November 20th, 2007, 10:50 PM
I've looked at PDFNet before, but it's too costly for me... and I've done days of googling as well.

Maybe there's somebody in this forum who has experience with PDF reading/parsing, or who has used iTextSharp to read through a PDF?