
Lyle

In the newest ad ("Watered Down"), we see that when the user touches an image on the web page, Safari zooms in on that image. This part I get; the browser can presumably pick up on the dimensions of the image and scale it accordingly...

... but then in the next shot, the user touches one of the articles on the page and Safari appears to zoom in on that article. How does it "know" how far to zoom? I'm looking at the HTML source for the NY Times home page and it doesn't appear that the individual stories are in their own DIVs, or any other structure that the browser could isolate on the page.

Or is it just magic? :D
 
I noticed that too

...and was wondering the very same thing. I suppose they could have constructed software that would 'parse' the content and try to figure out where the 'boundaries' of the article (or other content) were, but I can't believe that would be very reliable over the jillions of web page formats out there. They are pretty crafty at Apple, and they did build phone number parsing into the iPhone (SJ demonstrated this, and said so, in one of the demos), but that's a lot simpler than trying to do something like this with web content. Didja notice how it not only zooms to the article, but neatly includes the headline as well?

The more I think about it, the more I think you're right: magic. Well, we'll know lots more about it soon...
 
I suppose they could have constructed software that would 'parse' the content and try to figure out where the 'boundaries' of the article (or other content) were, but I can't believe that would be very reliable over the jillions of web page formats out there... Didja notice how it not only zooms to the article, but neatly includes the headline as well?
I wasn't thinking about parsing the article text so much as doing some kind of edge detection -- seeing that there's a big blob of text with some whitespace (the margins) around it, and saying, "Aha! That blob must be a feature that I can zoom in on." If that's what they're doing, that's very cool -- sort of tricky to do in real time.
 
I suppose theoretically it is very easy for a computer (or in this case a phone) to detect a section of text and say "show this bit only please, plus the larger writing above it, because that's going to be the heading." I'm just a little suspicious that they have only shown us this function on the NY Times pages; is this because they've got a bit of a mutual money-free sponsorship, or because this is the only site it works on? I suppose we will find out in time...
 
On the NY Times website, it's very simple for Safari to determine which article is which. Each article is contained within a separate div element with either "summary", "story", or "article" as the class and/or id. All Safari has to do is zoom in on just that div, and bam, there's the article. It won't work as well on sites that aren't coded as well, but remember, you've always got the pinch to let you zoom in manually, which should work fairly well.

jW
 
I doubt Apple would've spent over two years developing the iPhone and then only have the Safari browser interaction optimized for one website.
 
On the NY Times website ... each article is contained within a separate div element with either "summary" "story" or "article" as the class and/or id.
Whoops, you're right. :eek:

I had looked at the source before I posted, but I didn't pick up on that at first. That goes a long way towards explaining it.
 
the browser just zooms in to the same screen resolution that 100% 96dpi would give you on a standard monitor... not rocket science. There is no edge detection... the CPU in the phone is not powerful enough to do that... heck Photoshop CS3 on a quad-core doesn't even always do a good job with that...
 
the browser just zooms in to the same screen resolution that 100% 96dpi would give you on a standard monitor... not rocket science. There is no edge detection... the CPU in the phone is not powerful enough to do that... heck Photoshop CS3 on a quad-core doesn't even always do a good job with that...
That's not what I was getting at. I was wondering how the browser is able to determine the bounding box for the article that you're "pointing at", and zoom in so that that particular element fills the screen. I think Mal is correct that the browser's making some assumptions (based on the use of HTML DIV elements) about which elements on the page belong together. For example, on the NYT home page you can see DIVs with class "story", and those DIVs contain the headline and body for a particular article.

Now, that's not the whole story, of course. Unless Safari is hard-coded to recognize certain class names on certain pages, there's nothing special about the "story" or "summary" class that should automatically clue Safari in to the fact that those DIVs contain complete stories. I mean, NYT could rename the "story" class to "foo" and it would still display the same way. But DIVs at least tell the browser something about the logical markup of the document.
 
divs

the browser just zooms in to the same screen resolution that 100% 96dpi would give you on a standard monitor... not rocket science. There is no edge detection... the CPU in the phone is not powerful enough to do that... heck Photoshop CS3 on a quad-core doesn't even always do a good job with that...

This isn't correct. Look at any of the videos. It zooms to different levels depending on where on the page they are pointing. I would agree with other posters and say it is probably filling the screen according to divs or other type of bounding area.
 
I'll tell you what I'm going to do, being the selfless person that I am. If all of you will chip in to buy an iPhone for me, I will investigate this as thoroughly as possible and report back on what I find. :D
 
zoom and links

It seems a couple of times in the videos they point to something that should be a link, but it does not change pages, just zooms on the area. I wonder if it only uses a link if you are zoomed in all the way.
 
Here's my take on it, based on the "Watered Down" ad. There are 3 zoom clicks in the video.

Click 1 - double-click on an image, it fills the screen. Nothing odd about that.

Click 2 - double-click on the air collision story. The click falls to the left of the story center, but the browser should be able to understand that you want to zoom in on an element and zoom on the center of it. Vertically, the zoom is centered around the click.

Click 3 - double-click on the housing cost story. Since we're zoomed in all the way, double-click pans instead of zooming. We center exactly as above - centered on the element horizontally, and on the click location vertically.

None of the double-clicks are on a link, but even if they were, I assume that double-clicking is reserved for zooming/panning. Some of the scrolls fall on links, but scrolling definitely overrides clicking (otherwise you couldn't scroll a song list in iPod).

All of these seem perfectly logical to me, and they work the way I expect.
 
Click 2 - double-click on the air collision story. The click falls to the left of the story center, but the browser should be able to understand that you want to zoom in on an element and zoom on the center of it. Vertically, the zoom is centered around the click.
Sigh.

No one is arguing whether or not this is how it should work; I think it's very cool. My "mental model" of how this works is that you point at something, and Safari zooms in on that thing.

The questions that I have about it have to do with the implementation model. For us as humans it's straightforward to see that that bit of the page that has a headline with some text underneath it is a "story", and they belong together; but, generally speaking, computers can't just figure that kind of thing out as easily as we can.

neven said:
All of these seem perfectly logical to me, and they work the way I expect.
Oh yeah, definitely. Like I said -- magic.
 
Sigh. The questions that I have about it have to do with the implementation model. For us as humans it's straightforward to see that that bit of the page that has a headline with some text underneath it is a "story", and they belong together; but, generally speaking, computers can't just figure that kind of thing out as easily as we can.

Umm... ok. I don't mean to pull rank on you, but I'm a web software developer and I can imagine the actual implementation very easily.

Double-clicking the image to zoom it is a no-brainer; hopefully I don't have to explain that one. The other two pan-and-scan clicks are on stories, which look like this on nytimes.com (I'll strip the unnecessary tags):

<div class="story">
<h5>Putin Makes His Own Proposal on Missile Defense</h5>
<div class="byline">By SHERYL GAY STOLBERG<span class="timestamp">12:12 PM ET</span></div>
<p class="summary">The Russian president suggested that instead of building radar defenses in the Czech Republic, the U.S. should use an existing system in the former Soviet republic of Azerbaijan.</p>

<ul class="refer">
<li class="free"><div class="inlinePlayer" style="border: none; padding: 6px 0 0 0;"><div class="doubleRule" style="margin: 0; padding: 0;"></div></div></li>
</ul>
</div>

The double-clicks fell inside the <p class="summary">. All the zoom engine has to do is figure out the rendered width of that element - no problem - and the vertical click position. Zoom in to that coordinate at the zoom level that fills the page with 100% of the P element's width (or 1:1 view of the page, whichever comes first) - presto, zoom to story. What part of this is perplexing?
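
For anyone who wants to see the arithmetic, here is a minimal sketch of the zoom calculation described above. It is only a guess at the logic, not Safari's actual code: the names `Rect`, `Viewport`, `ZoomTarget`, and `zoomToElement` are all made up for illustration, and the `maxScale` cap of 1 is an assumption standing in for the "1:1 view of the page, whichever comes first" rule.

```typescript
// Hypothetical types for illustration; none of these names come from Safari or WebKit.
interface Rect { x: number; y: number; width: number; height: number; } // page coordinates
interface Viewport { width: number; height: number; }                   // screen pixels
interface ZoomTarget { scale: number; scrollX: number; scrollY: number; }

// Compute a zoom level and scroll offset for a double-tap inside `el`.
// maxScale = 1 models the assumed "1:1 view of the page" cap.
function zoomToElement(
  el: Rect,
  clickY: number,
  viewport: Viewport,
  maxScale: number = 1
): ZoomTarget {
  // Scale at which the element's rendered width exactly fills the screen width.
  const fitScale = viewport.width / el.width;
  // Never zoom past the cap, even for narrow elements.
  const scale = Math.min(fitScale, maxScale);

  // Size of the visible window, measured in page coordinates, at this scale.
  const visibleWidth = viewport.width / scale;
  const visibleHeight = viewport.height / scale;

  // Center horizontally on the element and vertically on the click position.
  const scrollX = el.x + el.width / 2 - visibleWidth / 2;
  const scrollY = clickY - visibleHeight / 2;

  // Clamp so we never scroll above or left of the page origin.
  return { scale, scrollX: Math.max(0, scrollX), scrollY: Math.max(0, scrollY) };
}
```

If the page is already at the computed scale, the same function still yields a new scroll offset, which would match the "double-click pans instead of zooming" behavior seen in the third click of the ad.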
 
I'm not convinced that it could use the <div>s to identify the section you want to see when you zoom. A typical web page is going to have divs within divs within divs, so it would have to be pretty clever to pick the one you want and then zoom to match its horizontal boundaries.

Of course I'm still trying to figure out what the algorithm is for F9 with Exposé!
 
I'm not convinced that it could use the <div>s to identify the section you want to see when you zoom. A typical web page is going to have divs within divs within divs, so it would have to be pretty clever to pick the one you want and then zoom to match its horizontal boundaries.

Of course I'm still trying to figure out what the algorithm is for F9 with Exposé!

Nested divs present no problem. All I'm saying is that in those ads, Safari zoomed in on the horizontal width of the div that was clicked on. It doesn't matter what the div was contained in; it has a rendered width and that's what was zoomed to.

The more interesting question is what happens if you double-click on, say, a B tag inside a div. My guess is that it won't zoom in to the width of the B, but to the container div. It will probably only zoom to the width of block-type elements.
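
To make that guess concrete, here is a small sketch of "walk up from the clicked element until you hit a block-type element". It is purely illustrative: `PageNode`, `BLOCK_TAGS`, and `nearestBlock` are hypothetical names, and a real browser would decide block-ness from the computed CSS display value rather than from a hard-coded tag list like this one.

```typescript
// Hypothetical, simplified node type; a real DOM node carries far more information.
interface PageNode { tag: string; parent: PageNode | null; }

// Assumption: treat these tags as block-type. Real engines use computed CSS, not tag names.
const BLOCK_TAGS = new Set([
  "div", "p", "ul", "ol", "li", "table", "blockquote",
  "h1", "h2", "h3", "h4", "h5", "h6",
]);

// Starting at the clicked node, climb the parent chain to the first block-type element.
function nearestBlock(node: PageNode): PageNode | null {
  let current: PageNode | null = node;
  while (current !== null && !BLOCK_TAGS.has(current.tag.toLowerCase())) {
    current = current.parent;
  }
  return current; // null if no block-type ancestor exists
}

// Example: a B tag directly inside a div resolves to the div.
const containerDiv: PageNode = { tag: "div", parent: null };
const boldText: PageNode = { tag: "b", parent: containerDiv };
const zoomCandidate = nearestBlock(boldText);
```

Here `zoomCandidate` ends up being `containerDiv`, so the zoom would use the div's rendered width rather than the width of the bold text.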
 
Nested divs present no problem. All I'm saying is that in those ads, Safari zoomed in on the horizontal width of the div that was clicked on. It doesn't matter what the div was contained in; it has a rendered width and that's what was zoomed to.
So you're saying that it would zoom to the smallest containing div. Perhaps that's good enough, since it can't read our minds and has to take an educated guess.

But I can imagine cases, such as a small div'ed area within a column, where that wouldn't be the best guess. Imagine, for example, something like this:
<div>
<div>Part 1</div>
<div>A newspaper-style column with a lot of paragraphs of body text. Blah blah blah blah blah...</div>
<div>Part 2</div>
<div>More paragraphs of body text.</div>
</div>
If the "Part" divs have generous margins to distinguish them visually, and you happen to touch the "Part 2" line, you might get zoomed to just the "Part 2" borders, instead of the column width. But, as I've said, how would it know what you meant?
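
Here is a rough sketch of the "smallest containing div" rule applied to the example above, just to show where it picks the narrow box. Everything in it is an assumption for illustration: the `Box` type, the `innermostBoxAt` function, and the pixel coordinates (a 300px column with "Part" headings inset by 40px margins) are invented, and nothing here is known to match what Safari does.

```typescript
// Hypothetical layout box: a rendered rectangle plus its child boxes.
interface Box {
  name: string;
  x: number; y: number; width: number; height: number;
  children: Box[];
}

function containsPoint(b: Box, px: number, py: number): boolean {
  return px >= b.x && px < b.x + b.width && py >= b.y && py < b.y + b.height;
}

// Return the deepest box whose rectangle contains the touch point, or null if none does.
function innermostBoxAt(root: Box, px: number, py: number): Box | null {
  if (!containsPoint(root, px, py)) return null;
  for (const child of root.children) {
    const hit = innermostBoxAt(child, px, py);
    if (hit !== null) return hit;
  }
  return root; // no child contains the point, so the root is the innermost match
}

// Made-up geometry for the column example: headings are narrower than the body text.
const column: Box = {
  name: "column", x: 0, y: 0, width: 300, height: 400,
  children: [
    { name: "Part 1", x: 40, y: 0, width: 220, height: 20, children: [] },
    { name: "Body 1", x: 0, y: 20, width: 300, height: 180, children: [] },
    { name: "Part 2", x: 40, y: 200, width: 220, height: 20, children: [] },
    { name: "Body 2", x: 0, y: 220, width: 300, height: 180, children: [] },
  ],
};

// Touching the "Part 2" line selects the narrow heading box, not the full column.
const touched = innermostBoxAt(column, 100, 210);
```

With this geometry, `touched` is the 220px-wide "Part 2" box, so a zoom based on its width would cut tighter than the 300px column, which is exactly the mismatch I'm worried about.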
 
So you're saying that it would zoom to the smallest containing div. Perhaps that's good enough, since it can't read our minds and has to take an educated guess.

But I can imagine cases, such as a small div'ed area within a column, where that wouldn't be the best guess. Imagine, for example, something like this:
<div>
<div>Part 1</div>
<div>A newspaper-style column with a lot of paragraphs of body text. Blah blah blah blah blah...</div>
<div>Part 2</div>
<div>More paragraphs of body text.</div>
</div>
If the "Part" divs have generous margins to distinguish them visually, and you happen to touch the "Part 2" line, you might get zoomed to just the "Part 2" borders, instead of the column width. But, as I've said, how would it know what you meant?

How would these two divs be positioned in relation to one another? Side by side, or one below the other?

If they're arranged as vertical columns, then if you double-click a column, you'd zoom in on just that column. That makes sense, semantically. Since breaking up stories into columns isn't common, it wouldn't be a big deal anyway.

If the two divs are stacked, then double-clicking will obviously zoom just the clicked area. That doesn't seem semantically wrong either.

As long as the website is reasonably put together, this should work fine. When I look at, say, nytimes.com or slate.com or apple.com, I'm fairly confident that if I double-clicked to zoom, it would zoom to the area I expected.
 