Save as PDF...

Ambrosia7177 · Apr 11, 2020

Why is it that when I save a web page as a PDF on my Macs, the hyperlinks do not work?

I just discovered this, and it is devastating if not addressable.

At first I thought it was because my Retina does not have Adobe Acrobat Reader installed, and so I thought maybe it was just an issue with Preview.

But here on my old Mac, there is the same issue.

What is going on?!

Dave Braine · Apr 11, 2020

From a quick experiment, I think it might depend on the web page. On this page:

Mac Basics, Help and Buying Advice

new to mac or not sure where to post? ask any hardware or software question here.

forums.macrumors.com

most links work apart from the Thread titles and the links under the titles.

On this page:

Home - BBC News

Visit BBC News for up-to-the-minute news, breaking news, video, audio and feature stories. BBC News provides trusted World and UK news as well as local and regional perspectives. Also entertainment, business, science, technology and health news.

www.bbc.co.uk

just about everything seems to work.

Juicy Box · Apr 11, 2020

I was able to save the MR front page and the links work.

I saved as PDF and I also tried saving as Web Receipt and the links work fine.

Ambrosia7177 · Apr 11, 2020

Dave Braine said:
From a quick experiment, I think it might depend on the web page. On this page:

Mac Basics, Help and Buying Advice

new to mac or not sure where to post? ask any hardware or software question here.

forums.macrumors.com

most links work apart from the Thread titles and the links under the titles.

On this page:

Home - BBC News

Visit BBC News for up-to-the-minute news, breaking news, video, audio and feature stories. BBC News provides trusted World and UK news as well as local and regional perspectives. Also entertainment, business, science, technology and health news.

www.bbc.co.uk

just about everything seems to work.

Not following what you mean...

Juicy Box · Apr 11, 2020

Texas_Toast said:
Not following what you mean...

I think they mean it depends on the webpage.

Ambrosia7177 · Apr 11, 2020

I did some more testing on my Retina, and in Chrome when I do File > Print > Save As PDF the links seem to work.

But in Firefox, when I do File > Print > Save as PDF I am getting a block solid document with no working links.

Could uMatrix, NoScript, uBlock, or Privacy Badger be to blame?

Fishrrman · Apr 12, 2020

I just tried saving THIS forum page using "export as PDF" (Safari), and the links seem fine.

Why don't you list a few of the web pages that aren't saving properly to pdfs right here, so that we can try doing it for ourselves...?

"But in Firefox, when I do File > Print > Save as PDF I am getting a block solid document with no working links.
Could uMatrix, NoScript, uBlock, or Privacy badger be to blame?"

No way to know without "an example" we can try for ourselves.

But you know, this is kind of like a toolbox with all kinds of different tools inside.
For a specific job, a certain tool may be the right choice.
Another, similar, tool might not do it.

The solution:
Use the tool that works for the job at-hand.

If Firefox won't do it, perhaps other browsers will.
It's up to YOU to find out what works and what doesn't for your specific situations...

Dave Braine · Apr 12, 2020

Texas_Toast said:
But in Firefox, when I do File > Print > Save as PDF I am getting a block solid document with no working links.

It's a shame that you didn't mention Firefox in your first post. I can confirm that there are no clickable links when I do File > Print > Save as PDF from Firefox.

That would point to a problem with Firefox as it works with Safari.

ApfelKuchen · Apr 12, 2020

This is covered in Web Development 101. It's up to the website developers to use programming techniques that are "friendly" to the browser (or a file converter, which is essentially what Save as PDF is), and it's up to browser developers to decide what their app does to display those web pages. In theory, the web follows established standards, but in practice there's nobody enforcing those standards. Website developers are trying to find new and eye-catching methods of presenting information, and browser developers need to differentiate their browser from the next. A website may not look the same when viewed in different browsers - sometimes web developers produce multiple versions of the same site, each optimized for a different browser (this is especially true for mobile vs. desktop browsers).

For argument's sake let's say that the folks at Firefox decide that an exported PDF should not contain active links, because that means the page can then be used again without using Firefox. Then let's say the folks at Google Chrome decide they want to allow PDFs to include active linking for "href" links (the traditional blue underlined text hyperlinks) but do not want to allow linking from the clickable graphics that are commonly used for ads, due to challenges in collecting click-through revenue for those ads. (I'm not saying either scenario/motivation is true, just saying these are the kinds of motivations one encounters in these things.) Both browser-makers are free to make those choices; exporting a web page in another format is an optional feature, not a core capability.

Ambrosia7177 · Apr 13, 2020

@Fishrrman,

You wanted examples so here goes...

Example #1:
wiki - Hello World

Firefox - Links are not visible, and they do not work.
Chrome - Links work, but are hard to distinguish from regular text.

Example #2:
NPR

Firefox - Links are not visible, and they do not work.
Chrome - Links work, but are hard to distinguish. (All I get is a hand when I hover over linked text)

Example #3:
StackExchange

Firefox - Links are shaded blue, but do not work.
Chrome - Links are blue and work. (You used to get underlined text when you hovered over links...)

Example #4:
He Went to Jared

Firefox - Links are not visible, and they do not work.
Chrome - Links work, but are NOT distinguishable from regular text (i.e. you have no clue what is a link?!)

For example, if you go to the 10th paragraph, and hover over "The Los Angeles Times reported..." that is a hyperlink that works, but you'd never know it just visually inspecting things.

In conclusion, I'd say the PDF capability of macOS Sierra is of limited use...

Dave Braine · Apr 13, 2020

Texas_Toast said:
In conclusion, I'd say the PDF capability of macOS Sierra is of limited use...

In conclusion, I'd say it's nothing to do with macOS. Read ApfelKuchen's post again. After that, use Safari.

Ambrosia7177 · Apr 13, 2020

@ApfelKuchen,

So to which specific "web dev 101" code are you referring?

I've never heard of HTML that would impact how links show up when PDF'ed.

ApfelKuchen · Apr 13, 2020

Texas_Toast said:
@ApfelKuchen,

So to which specific "web dev 101" code are you referring?

I've never heard of HTML that would impact how links show up when PDF'ed.

"Web Development 101" is not code. It's an allusion to entry-level courses in web development.

To create a PDF, the app performing the conversion makes decisions as to what to do with the contents of the document being converted. It's not a matter of "HTML that would impact how links show up when PDF'ed." It's a matter of what the conversion app has been programmed to do when it encounters a particular HTML tag.
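
To make that concrete, here is a minimal Python sketch of a hypothetical converter step that only records `<a href="...">` targets. It is not how Firefox, Chrome, or Safari actually work internally; the class name, function name, and sample markup are all made up for illustration. It only shows that anything clickable that isn't a plain anchor with an href (a scripted span, an anchor driven purely by JavaScript) would be invisible to a converter written this way.

```python
from html.parser import HTMLParser


class AnchorOnlyLinkFinder(HTMLParser):
    """Hypothetical converter step: records only <a href="..."> targets."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Only plain anchors with an href are treated as links.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def find_links(html):
    """Return the link targets this (assumed) converter would preserve."""
    finder = AnchorOnlyLinkFinder()
    finder.feed(html)
    finder.close()
    return finder.links


# Made-up markup: three things a reader could click on in a browser.
SAMPLE = """
<a href="https://example.com/story">A plain anchor</a>
<span onclick="location.href='https://example.com/scripted'">A scripted "link"</span>
<a onclick="openThread(42)">An anchor with no href</a>
"""

print(find_links(SAMPLE))
```

Only the first of the three clickable items is reported. A different converter could just as easily choose to handle more cases, or fewer.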

TheIntruder · Apr 13, 2020

Firefox is my go-to browser, and while there are many things I love about it, its printing subsystem is not one of them.

I print web pages to PDF often, and there are some sites/pages that Fx won't render and print properly, but other browsers, like Vivaldi, will.

Each browser relies on its own engine to render and display pages and images, and to pass them on to the OS for printing. macOS provides a ready framework for displaying, manipulating, and distilling PDF documents, but it can't magically include data that the browser doesn't generate and pass along.

Gecko is going to produce different results than Safari/WebKit and Chrome/Vivaldi/Blink (a closely related fork), so if one doesn't work, try another.

Firefox couldn't even display PDFs in page until PDF.js was implemented, and it has had some issues. You may also notice that the PDF files it generates are larger than those generated by the other two.
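
If you want a rough way to check whether a saved PDF carries any link data at all, the Python sketch below scans the raw bytes for link annotations (`/Subtype /Link`) and URI targets (`/URI (...)`), which is how the PDF format stores clickable web links. Treat it as a heuristic under stated assumptions: it will miss anything stored inside compressed object streams, and it does not handle escaped parentheses or hex-encoded strings. The function names and the sample bytes are made up for illustration.

```python
import re

# Matches "/URI (target)" entries as they appear in uncompressed PDF objects.
URI_PATTERN = re.compile(rb"/URI\s*\((.*?)\)", re.DOTALL)
# Matches the marker that identifies a link annotation.
LINK_PATTERN = re.compile(rb"/Subtype\s*/Link\b")


def find_link_uris(pdf_bytes):
    """Return URI targets found in the raw (uncompressed) bytes of a PDF."""
    return [m.decode("latin-1") for m in URI_PATTERN.findall(pdf_bytes)]


def count_link_annotations(pdf_bytes):
    """Count link annotation markers visible in the raw bytes."""
    return len(LINK_PATTERN.findall(pdf_bytes))


# Hand-written fragment resembling one link annotation (not a complete PDF).
SAMPLE = (
    b"<< /Type /Annot /Subtype /Link /Rect [72 700 200 712] "
    b"/A << /S /URI /URI (https://example.com/a) >> >>"
)

print(count_link_annotations(SAMPLE), find_link_uris(SAMPLE))

# For a real file you would read the bytes first, e.g. (hypothetical filename):
#   with open("saved-page.pdf", "rb") as f:
#       print(find_link_uris(f.read()))
```

A count of zero from a given browser's output would be consistent with the browser never passing link data along, though because of the caveats above it is not proof.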

Fishrrman · Apr 14, 2020

A suggestion:

Download the iCab browser.

Open the page you want to save.
Choose "Save As..."
There are a number of different output formats you can save to.
Try several of them and see what works for you.

satcomer · Apr 14, 2020

In Safari I could print PDFs by holding the Option key before clicking the menu item for printing! Does that still work?

Dave Braine · Apr 14, 2020

satcomer said:
In Safari I could print PDFs by holding the Option key before clicking the menu item for printing! Does that still work?

I don't know. Why not give it a try?

Ambrosia7177 · Apr 16, 2020

ApfelKuchen said:
"Web Development 101" is not code. It's an allusion to entry-level courses in web development.

To create a PDF, the app performing the conversion makes decisions as to what to do with the contents of the document being converted. It's not a matter of "HTML that would impact how links show up when PDF'ed." It's a matter of what the conversion app has been programmed to do when it encounters a particular HTML tag.

Right, and so I'm asking which HTML elements - to your knowledge - would be treated differently so that a link on a web page does not work or display properly in a PDF?

There is only one way to code a hyperlink in Web pages that I know and it looks like this...

Code:

<a href="url">link text</a>

So if a web page has a hyperlink, you'd expect any PDF application to treat all links the same.

If not, I'm trying to understand examples of where that wouldn't happen.

Ambrosia7177 · Apr 16, 2020

@TheIntruder,

PDF'ing in modern times has become the bane of my existence!!

The purpose of my OP in this thread was just to figure out why something as simple as hyperlinks weren't working.

However my larger goal is to get back to like 1999...

You see, I am a news junky, and a researcher of sorts, and I write a lot of articles/reports, and so I want an easy way to capture a moment in time on the Internet and save web pages WYSIWYG to preserve information, including images/layout/etc. (I also need working hyperlinks, because one article might link to a dozen more articles, of which I might not have saved all of them, and so I want a way to at least view them via hyperlink.)

Back when I was a Windows-only user in the 1990s, and I had a copy of Adobe Acrobat Professional, in like two clicks I had a perfect copy of whatever web page I was reading, including proper page layout, images present, and working hyperlinks.

Life was good!! 😎

Fast forward to 2020, and this entire endeavor is a real PITA.

Right now, as I read articles throughout the day, I do the following...

1.) I use a short PHP script that someone helped me with that uses file_get_contents to "scrape" a given web page and attempt to give me an HTML file that at least displays the text content and has working hyperlinks, but even that is hit-or-miss.

2.) Then I use Firefox's add-on "Take a Screenshot" to create a fairly close WYSIWYG .png file. This works fairly well for shorter documents, but often chops off or leaves out content on larger articles. And the other problem is that this is just a large screenshot, so forget about hyperlinks!

3.) Next, I use Chrome's add-on "Full Page Screen Capture" which works fairly well and creates a .pdf file. But it often chops off sentences between pages, and the file size is a little larger than I'd like. (While this approach gives me a fairly good "snapshot" (i.e. WYSIWYG) of the original page, it is also just a huge screenshot and not a PDF in a traditional sense, because none of the hyperlinks work - which is a problem if I need to go back and use those links for additional research!!)
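
On step 1: one possible reason (an assumption, since the PHP script itself isn't shown here) that hyperlinks in a scraped copy are hit-or-miss is that many pages use relative links like `/news/story.html`, which stop resolving once the HTML is saved locally. Below is a minimal sketch, in Python rather than PHP, of rewriting href values to absolute URLs before saving. The function name and example URLs are made up, and a regex is a blunt tool for HTML, so treat this as an illustration and not a robust solution.

```python
import re
from urllib.parse import urljoin

# Matches href="..." or href='...' (but not attributes such as data-href).
HREF_PATTERN = re.compile(
    r"""(?<![\w-])(href\s*=\s*)(["'])(.*?)\2""", re.IGNORECASE
)


def absolutize_links(html, base_url):
    """Rewrite href values so they resolve against the page's original URL."""

    def fix(match):
        prefix, quote, target = match.groups()
        return f"{prefix}{quote}{urljoin(base_url, target)}{quote}"

    return HREF_PATTERN.sub(fix, html)


page = '<a href="/news/story.html">Story</a> <a href="other.html">Other</a>'
print(absolutize_links(page, "https://example.com/section/page.html"))
```

Links that are already absolute pass through unchanged, so running this on a page that didn't need it should do no harm.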

Which leads me to this thread...

I (incorrectly) assumed that if I just used macOS Mountain Lion/Sierra's built-in File > Print > Save as PDF option, that at least I would be able to capture all of the textual content and have working hyperlinks, but alas, that is not the case?!

And if that approach worked, then that would be Approach #4 in my long, painful process of saving a web page in multiple versions, so I have all of the things I need (i.e. WYSIWYG, text content, images, and working hyperlinks).

Of course, if I could find a way to get a near perfect WYSIWYG capture, PLUS all of the content, PLUS working hyperlinks all in one reasonable sized document - presumably a .pdf - then I'd probably give someone my first-born as payment!! *LOL*

To buy Adobe Acrobat Professional is like $600+.

If I thought it would work the way it did in the 1990's, then I'd pony up for it in a heartbeat. But I suspect that because the Internet (and web pages) have become such a FUBAR, that even Adobe Acrobat Professional is limited.

Of course, a free/open-source solution would be the best.

So that is sorta where I am...

Back to my OP here, is there an easy way to PDF web pages so I get all of the content AND the hyperlinks work AND you can see what is a hyperlink?

Then if there is a way to accomplish my larger goal, that would be gravy!!

ApfelKuchen · Apr 16, 2020

Texas_Toast said:
Right, and so I'm asking which HTML elements - to your knowledge - would be treated differently so that a link on a web page does not work or display properly in a PDF?

There is only one way to code a hyperlink in Web pages that I know and it looks like this...

Code:

<a href="url">link text</a>

So if a web page has a hyperlink, you'd expect any PDF application to treat all links the same.

If not, I'm trying to understand examples of where that wouldn't happen.

You're making a false assumption. Each converter app is going to be different (otherwise it's plagiarism). Each app developer can make choices as to how it treats the markup code that it encounters. The converter must also be able to recognize variations in how various web sites/developers use HTML.

If you've ever encountered a web site that behaves well in one web browser but does not behave well in another (because the web developer didn't care to debug their site using that other browser)... this is essentially the same thing, only relating to the conversion of that site into a different document format.

Insisting that things ought to be a certain way is pointless. You already see the results yourself. If things were as you believe they ought to be, we wouldn't be here discussing this topic.

TheIntruder · Apr 18, 2020

Texas_Toast said:
Back to my OP here, is there an easy way to PDF web pages so I get all of the content AND the hyperlinks work AND you can see what is a hyperlink?

Then if there is a way to accomplish my larger goal, that would be gravy!!

You could try Paparazzi!, which is a web capture app from one of the former developers of the Camino browser, and see if it meets your needs.

Due to stability issues, I only use it occasionally, but more as a last resort when regular browsers can't generate acceptable PDFs due to the way they're coded.
