Does anyone know of a good library for reading PDF files? I don't need to display them; I need a general-purpose way of extracting embedded metadata.
I've not yet decided on a programming language and may base the choice on the availability of a good PDF library. I do need this to be portable; I like to keep only the GUI platform-specific. Open source is preferred because it will be used by an open source application.
It turns out the USGS has made all of their topographic maps available as free downloads in "geoPDF" form. These maps cover the entire US in extreme detail and can be displayed using any PDF reader, but they also have metadata inside (hence the name "geoPDF") that describes the exact map projection. I'm planning on pulling this data out of 10,000+ PDF files, then storing it in a DBMS.
I'll write my own PDF parser if I have to, but I thought I'd ask around first.
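To make the goal concrete, here is a rough sketch in Python of the kind of pipeline I have in mind, using only the standard library. It is not a PDF parser: it just scans the raw bytes for dictionary keys that I understand to be associated with georeferencing (`/LGIDict` from TerraGo's GeoPDF, `/Measure` and `/GPTS` from the PDF geospatial features) and records which ones it finds in SQLite. Whether the USGS files actually use these keys is an assumption I still need to verify, and the function and table names (`find_geo_markers`, `record_markers`, `geo_markers`) are made up for illustration.

```python
import sqlite3

# Keys assumed to signal georeferencing data. Which of these the USGS
# files really contain is unverified.
GEO_MARKERS = (b"/LGIDict", b"/Measure", b"/GPTS")


def find_geo_markers(data: bytes) -> list[str]:
    """Return the marker names that occur anywhere in the raw PDF bytes."""
    return [m.decode("ascii") for m in GEO_MARKERS if m in data]


def record_markers(conn: sqlite3.Connection, filename: str, data: bytes) -> None:
    """Store one row per marker found in the given file's bytes."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS geo_markers (filename TEXT, marker TEXT)"
    )
    conn.executemany(
        "INSERT INTO geo_markers (filename, marker) VALUES (?, ?)",
        [(filename, marker) for marker in find_geo_markers(data)],
    )
    conn.commit()
```

A byte scan like this misses anything stored inside compressed object streams and says nothing about the projection values themselves, which is exactly why I'd rather use a real library for the extraction step.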