tshab (4)
#1
Have a PDF of a graphic with text-selectable labels.

Would like to batch-update the labels into hyperlink objects, with the label text appended as a query argument (i.e. label1 gets turned into a hyperlink that still displays "label1" and has an href of: http://www.google.com/search?q=label1 ).

How would this be done in iText?

Thanks!
blowagie (284)
#2
Re: Adding a hyperlink to an existing text label
Please define 'text selectable labels', because I don't know what you mean by that.
Use official PDF terms please (if possible refer to page or section numbers in the PDF Reference manual).
br,
Bruno
tshab (4)
#3
Re: Adding a hyperlink to an existing text label
Probably making this more difficult than it should be.

Imagine a PDF document with a graphic, and several "text objects". These "text objects" can be selected in the PDF Reader application and pasted into another application (Notepad, for example), and the text is accurate to what is shown in the PDF Reader. I'm not sure why, but in some cases this translation doesn't work properly in some PDF documents, so I just wanted to call this out (i.e. that the text can be both selected in a PDF Reader AND correctly pasted to another application).

What I'm trying to do is to take that "text object" and make it into a hypertext link. In other words, when someone clicks on this object, they will be taken to an http URL. I've seen this work in PDF documents before. The trick is that I'd like to convert the text into part of the URL, keeping the display of the original text object the same.

Imagine an image that has three text objects, Object1, Object2 and Object3. I'd like to convert Object1 into a hyperlink that will still display Object1, but will have an href of, for example, "http://www.google.com/search?q=Object1".

How would I do this in iText?

Thanks.
blowagie (284)
#4
Re: Adding a hyperlink to an existing text label
The reason why selecting and copy/pasting text from a PDF doesn't work in some cases is implicitly explained in the chapters about fonts. The reason why your question is unanswerable is explained in chapter 18.
There are very few programs that can programmatically determine the location of some 'selectable text' in a PDF file and add a link at that position. As explained in the book, iText can't do it, and I don't know of any other product that can. This is inherent to the PDF format. There was a long discussion on the PlanetPDF forum about this: a developer desperately needed this functionality and wouldn't believe the experts who said that what he asked was impossible. This was frustrating for both the developer and the experts.
What you need is a PDF renderer that allows you to select the text manually and add the link yourself at this position. Acrobat is such a product.
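
To illustrate one reason the location of a label is hard to find programmatically, here is a minimal Python sketch. The content-stream fragment is hypothetical and hand-written, not taken from any real file; it only shows that a PDF producer may draw a single visible label as several string fragments inside one TJ array, so a plain search for "label1" in the page content finds nothing.

```python
import re

# Hypothetical content-stream fragment: one visible label drawn as three pieces,
# with kerning adjustments (the numbers) between them.
content = "BT /F1 12 Tf 100 700 Td [(lab) -20 (el) 15 (1)] TJ ET"

# A naive search for the whole label fails, because it is never stored as one string.
found_directly = "label1" in content

# Simplified extraction of literal strings; this ignores escape sequences,
# hex strings, and font encodings, all of which real files can use.
fragments = re.findall(r"\(([^)]*)\)", content)
joined = "".join(fragments)

print(found_directly)
print(fragments)
print(joined)
```

Real documents add further complications (custom font encodings, text split across several operators, transformed coordinates), so this is only a sketch of the simplest case.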
tshab (4)
#5
Re: Adding a hyperlink to an existing text label
Thanks for the response.

It appears that a commercial product, the Big Faceless PDF Library, will handle this in batch.
blowagie (284)
#6
Re: Adding a hyperlink to an existing text label
Thank you for the follow-up.
I didn't know Big Faceless could do this.
I'll note this in case the question comes up again.
tshab (4)
#7
Re: Adding a hyperlink to an existing text label
Just wanted to say that my experiments with the Big Faceless Library have not yet been a success.
blowagie (284)
#8
Re: Adding a hyperlink to an existing text label
Thanks again for the feedback. It is a very difficult issue, technically speaking.
PDFs can be generated in 1001 different ways. Your best shot would be to use an OCR tool.
The ideal solution is to change the source document that was used to produce the PDF.
Of course, I realize that the source document isn't available in many cases.
mood (1)
#9
Re: Adding a hyperlink to an existing text label
tshab,

If you are a Perl monk, you might look into the Perl module by Chris Dolan called CAM-PDF-1.10.

You could use the CAM::PDF::Renderer::Dump utility,
capture the output (text node coordinates) to a text file,
then run a Perl script to place annotations over the text at those coordinates.
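
The last step, placing a link annotation over each text node, can be sketched as follows. This is an illustration in Python rather than Perl, under stated assumptions: the `nodes` list stands in for parsed dump output (its format here is hypothetical), `link_annotation` is a made-up helper name, and the search URL is the one from the first post. The function only builds the text of a PDF link-annotation dictionary; writing it into the file and adding it to the page's /Annots array would still be done with a PDF library.

```python
from urllib.parse import quote_plus


def link_annotation(label, rect, base="http://www.google.com/search?q="):
    """Return the text of a PDF link-annotation dictionary covering rect.

    rect is (llx, lly, urx, ury) in PDF user-space units.
    Hypothetical helper for illustration only.
    """
    llx, lly, urx, ury = rect
    # URL-encode the label so spaces and punctuation are safe in the query string.
    uri = base + quote_plus(label)
    # Escape the characters that are special inside a PDF literal string.
    uri = uri.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    return (
        "<< /Type /Annot /Subtype /Link "
        f"/Rect [{llx:g} {lly:g} {urx:g} {ury:g}] "
        "/Border [0 0 0] "
        f"/A << /S /URI /URI ({uri}) >> >>"
    )


# Hypothetical parsed dump output: (label, llx, lly, urx, ury).
nodes = [
    ("Object1", 72, 700, 120, 712),
    ("Object2", 72, 650, 120, 662),
]

annotations = [link_annotation(label, box) for label, *box in nodes]
for annotation in annotations:
    print(annotation)
```

The invisible border (`/Border [0 0 0]`) keeps the original text looking the same, which matches the requirement that the label's display should not change.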
384931 (1)
#10
Re: Adding a hyperlink to an existing text label
blowagie wrote: Thanks again for the feedback. It is a very difficult issue, technically speaking.
PDFs can be generated in 1001 different ways. Your best shot would be to use an online OCR tool.
The ideal solution is to change the source document that was used to produce the PDF.
Of course, I realize that the source document isn't available in many cases.


I agree. An OCR tool can't provide 100% accurate recognition.