
Every time the audio on the Apple TV goes silent for a few seconds, my receiver loses its DD signal. Once sound returns, it takes about a second for my receiver to resume playback. The result is that while in menus, the first click I perform makes no sound.

I have my Apple TV connected via an optical cable to a Sherwood RD-8601 receiver and it is the only device that exhibits this problem.

I know that this may seem minor, but the inconsistent UI audio feedback hurts the user experience.

Has anybody else seen this?
 

I know this problem exists, but I'm not sure it's the fault of AppleTV. I need to watch and see if the problem occurs chiefly when the last content played through AppleTV was a Dolby Digital signal. The signal from the UI menu is just PCM. It is possible that the delay is caused by the receiver not picking up on when to switch.

The problem is inherent to the fact that a PCM or Dolby Digital signal is only constant when the source is constantly outputting it, as with a DVD or ATSC source.

The AppleTV, and possibly some devices like it, emit a PCM signal when the menu is navigated, but that signal starts and stops abruptly. The first time through, this may not be enough time for the receiver to detect that a PCM signal is coming to it. If the signal were a continuous feed, then it would work... but I'm not sure what the design considerations would be for emitting a constant digital audio signal from a "live" source (i.e. not prerecorded stream).

Still, I'd mention it to Apple. The only possible way around this I can think of at the moment would be to delay the sound effect for a second as the bitstream starts up, but the problem here is obvious... now you've traded audibility for UI responsiveness.

You generally won't notice this problem on a DVD or a digital TV channel if you aren't really paying attention to that first fraction of a second or so as the processor in the receiver is syncing up. Instead, you'll fixate on the rest of the audio that follows and most likely never notice a gap at the beginning... especially on a DVD, where the authors can master the bitstream to give a second of lead time on a black screen with a corresponding few frames of AC-3 before the audio needs to start.
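To make the "continuous feed" idea concrete, here's a minimal sketch in Python (assuming the sounddevice package and a stub get_pending_ui_sound() helper that stands in for the UI layer; it only illustrates the concept, not what the Apple TV firmware actually does): keep writing silent PCM blocks so the receiver never loses lock, and swap in a UI click whenever one is queued.

    # Keep-alive output: the S/PDIF stream never stops, so the receiver stays
    # locked and the first UI click is audible. Conceptual sketch only.
    import numpy as np
    import sounddevice as sd

    RATE, CHANNELS, BLOCK = 48000, 2, 1024
    silence = np.zeros((BLOCK, CHANNELS), dtype='int16')

    def get_pending_ui_sound(n):
        # Stand-in for the UI layer: return an (n, 2) int16 click block, or
        # None when nothing is queued. Hypothetical helper, not a real API.
        return None

    with sd.OutputStream(samplerate=RATE, channels=CHANNELS, dtype='int16') as out:
        while True:
            click = get_pending_ui_sound(BLOCK)
            out.write(click if click is not None else silence)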
 
That explanation sounds spot on. I'll contact Apple tonight to register the issue.

The Xbox 360 appears to take the approach of sending a continuous signal (even in the menu), but its output is always DD, not PCM.

Technically, the ATV hardware can generate DD in real time. Some of the hacks for DivX 5.1 playback successfully utilize an A52 library for this. That said, I would think the only limitation for Apple would be licensing, but they have shown that they are unwilling to license AC3. It's really a shame, since I would much rather see AAC 6-channel as their means to 5.1 as opposed to pass-through.
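As a rough idea of what that real-time DD generation looks like in practice, here is a sketch using ffmpeg's own AC-3 encoder as a stand-in for the A52 library those hacks use (the file names are placeholders; this is an illustration, not the hack itself):

    # Re-encode a 6-channel AAC track to 448 kbps AC-3 while copying the video.
    # ffmpeg stands in for whatever encoder the Apple TV hacks actually call.
    import subprocess

    subprocess.run([
        "ffmpeg", "-i", "movie_6ch_aac.m4v",   # placeholder source file
        "-c:v", "copy",                        # leave the video stream untouched
        "-c:a", "ac3", "-b:a", "448k",         # transcode the audio to AC-3
        "movie_ac3.mkv",                       # placeholder output
    ], check=True)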
 

What would you do with AAC 6 channel? No receiver can decode it. It's preferable to send via digital transport to the receiver rather than convert it to analog and induce potential generation loss and EM/RF interference along the six analog interconnects.
 

AAC is the standard audio compression format used by MPEG4 and is vastly superior to AC3 in its efficiency and versatility. AC3 was not developed with efficiency in mind, as it was intended for large-bandwidth applications like DVD. AAC, on the other hand, is designed with variable bandwidth constraints in mind and scales in quality accordingly. It's also fairly easy to collapse an AAC 6-channel stream to stereo or matrixed surround in real time.

Receiver compatibility is not needed any more than your receiver would need to understand MP3, but the receiver does need to be able to receive the discrete 6 channels in some way. Most receivers only understand AC3 for this, so the Apple TV would need to generate the DD stream in real time from the 6 channels extracted from the AAC. In fact, the Apple TV software does most of this already. Do you know what the Apple TV does with an AAC 6-channel stream today? You might be surprised to find out that it understands the stream and re-encodes it to a matrixed pseudo-surround output (ProLogic-like) that is compatible with stereo devices. Matrixing the sound is actually more complex than converting to AC3. The hacking community has clearly shown that the hardware can do this with very little overhead.
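For what it's worth, the fold-down itself is just arithmetic. Here is a much-simplified sketch with plain NumPy (standard 3 dB downmix weights, no phase shift or band-limiting of the surround, and no claim that this is what the Apple TV firmware actually does):

    # Collapse six discrete channels to a matrixed stereo pair (Lt/Rt-style).
    # A real Dolby Surround encoder phase-shifts and band-limits the surround;
    # that is omitted here for clarity. Inputs are float arrays scaled to +/-1.
    import numpy as np

    def downmix_to_matrixed_stereo(L, R, C, LFE, Ls, Rs):
        s = 0.707 * (Ls + Rs)            # mono-summed surround
        lt = L + 0.707 * C - 0.707 * s   # surround subtracted into the left...
        rt = R + 0.707 * C + 0.707 * s   # ...and added into the right, so Lt - Rt
                                         # recovers it on a ProLogic-style decoder
        # the LFE channel is normally dropped (or redirected) in a 2-channel mix
        return np.clip(lt, -1.0, 1.0), np.clip(rt, -1.0, 1.0)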

In summary, AAC gives the advantage of better file compression for the same quality, removes the need for a second sound stream, and adheres to the MPEG4 standard better than using AC3. Unless Apple wanted files to play at full quality only on an Apple TV, only licensing was in the way.
 
You might be surprised to find out that it understands the stream and re-encodes it to a matrixed pseudo-surround output (ProLogic-like) that is compatible with stereo devices. Matrixing the sound is actually more complex than converting to AC3. The hacking community has clearly shown that the hardware can do this with very little overhead.

Actually, this is not entirely correct. As I understand it, AppleTV only transmits the first two channels of a six-channel AAC bitstream. Those two channels, if they come from almost any film soundtrack produced in the last 20 years, are sufficient to reconstruct a Dolby Surround analog mix. The reason this works is that the phase-shifted surround is already stereo matrixed into the Left and Right front channels.

As one who produces Dolby Digital content under a Trademark Service Agreement with Dolby Laboratories, I can tell you this for a fact: the majority of film soundtracks carry, for the sake of backward compatibility with Dolby ProLogic receivers, a Dolby Surround mix embedded in the Left and Right front channels.

While it is not impossible to create a Dolby Surround analog mix on the fly, it also requires licensing. AppleTV may or may not contain the hardware or software to do Dolby Surround analog encoding, but a) it doesn't need to and b) they would STILL have to license the technology from Dolby Labs. Furthermore, the only "evidence" I've seen from the hacker community is their supposition as to what seems to be happening. But what I've just described to you is the more plausible and less complicated explanation: The Dolby Surround analog mix is already present in the Left and Right front channels.

That being said, if your thought is to be able to reconstruct the Dolby Surround analog mix, this makes it absolutely unnecessary to encode the audio as six-channel AAC when two-channel AAC will actually contain the same information! Don't believe me? Try it... take a few films and encode them in Handbrake 0.9.2 as AAC + AC-3. Now flip your receiver to Dolby ProLogic mode and set the AppleTV to output only the AAC 2-channel... it's not magic, and it's not the receiver interpolating. It's because Dolby Surround analog is IN that two channel mix. Another example of this can be found in the recordings of Isao Tomita, which can be purchased off iTunes. It's ordinary two-channel AAC... no special encoding required by iTunes or AppleTV. Just play it through as stereo AAC to your Dolby ProLogic capable receiver and it'll do the rest.
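If it helps to see why nothing clever is needed on the AppleTV side, a passive matrix decode is nothing more than sum-and-difference math. A toy sketch (real ProLogic adds steering logic, a surround delay and a 7 kHz low-pass on the surround, all omitted here):

    # Recover center and surround from a two-channel Lt/Rt (Dolby Surround) mix.
    import numpy as np

    def passive_matrix_decode(lt, rt):
        center   = 0.707 * (lt + rt)      # in-phase material folds to the center
        surround = 0.707 * (lt - rt)      # out-of-phase material folds to the surround
        return lt, rt, center, surround   # the front pair passes through unchanged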

As far as the Dolby Digital end of things, it is actually more convoluted to have to transcode the AC-3 into AAC and then back into AC-3 again so a Dolby Digital decoder on a receiver can do something with it. It is even more convoluted to require receivers to be capable of decoding six-channel AAC... and for reasons I will point out below, a step backward in soundtrack reproduction. Of these various options, it is easiest to encapsulate the AC-3 and pass it through to the licensed decoder on a Dolby Digital receiver.

In summary, AAC gives the advantage of better file compression for the same quality, removes the need for a second sound stream, and adheres to the MPEG4 standard better than using AC3. Unless Apple wanted files to play at full quality only on an Apple TV, only licensing was in the way.

While it's true that AAC, developed jointly by Apple, Fraunhofer IIS and Dolby Laboratories, is a direct descendant of AC-3, and is superior to AC-3 at the same bitrates, AAC lacks certain parameters that make AC-3 more efficient and well-suited for film at those bitrates.

These parameters include:

1. Dialnorm - Dialogue normalization. In the encoding process, the mastering lab is advised by Dolby Laboratories to measure the A-weighted average loudness of the soundtrack (in dB relative to full scale, aka dBFS). This value is input into the Dialnorm parameter during the encoding stage and stored as metadata in the AC-3 track. The metadata tells the Dolby Digital decoder what the average loudness of the track is so that, from one Dolby Digital track to the next, the dialogue relative to the foley and music can be normalized. This parameter was developed for digital television applications so that the user would not have to constantly adjust the volume from one channel to the next to compensate for variations in dialogue levels. It has the added advantage of allowing dialogue to remain audible throughout a program no matter how loud the foley and score. (A short decoder-side sketch of how this value is applied appears at the end of this post.)

2. Dynamic Range Compression - There are several presets including Film Standard, Film Light, Music Standard and Music Light compression that are dictated by the sound engineer during mastering. This information, along with the reference monitor type (e.g. X-curve, commonly used in theatrical sound mastering) and peak monitoring level (in dB SPL), are stored as metadata. This metadata instructs the Dolby Digital decoder what profile of dynamic range compression to apply to extend the capabilities of Dolby Digital well past that of standard AAC in its ability to reproduce a wider dynamic range with minimal distortion. That is, to produce a wider range of softest to loudest sounds, while increasing or decreasing the compression above and below the baseline during different parts of the program to compensate for amplitude inconsistencies that would otherwise create distortion or drown out dialogue.

3. Low-pass filter - By default, a 20kHz low-pass filter is applied to completely eliminate frequencies above the A-weighted range (the range of human hearing) to comply with the basic principles of digital sampling. Based on equations defined by Harry Nyquist of Bell Labs in the early 20th century, frequencies above the Nyquist limit can induce aliasing. Therefore, a 20kHz low-pass filter is applied to eliminate content that could produce aliased frequencies and falsely color the soundtrack. This procedure is also standard practice in professional mastering of sound recordings, for the exact same reason... and it is why most of the talk of aliasing amongst audiophiles is a load of nonsense.

There are other features including an RF intermodulation filter, a 120Hz high-pass filter for the LFE channel, a DC offset filter (which removes the constant offset that some hardware introduces into the signal), and so on.

AAC, in short, is not optimized for film soundtrack reproduction in the way AC-3 is. To encode AC-3, rather than six-channel AAC, is therefore preferable. Not only that, but because of the additional parameters mentioned above, the threshold for acoustic transparency is a lower bitrate in AC-3 than it is in AAC. AAC is superior to AC-3 only when not taking these parameters into account.
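To make the dialnorm parameter in item 1 concrete, here is a minimal decoder-side sketch (the DRC profiles and filters would be applied in similar gain-staging steps, omitted here; this is an illustration, not Dolby's implementation):

    # AC-3 carries a dialogue level between -1 and -31 dBFS; the decoder
    # attenuates the programme so dialogue lands at the -31 dBFS reference,
    # which is what keeps levels consistent from one DD source to the next.
    import numpy as np

    def apply_dialnorm(pcm, dialnorm_dbfs):
        attenuation_db = 31.0 + dialnorm_dbfs        # e.g. -27 dBFS -> 4 dB cut
        gain = 10.0 ** (-attenuation_db / 20.0)
        return pcm * gain

    levelled = apply_dialnorm(np.ones(48000), -27.0)  # roughly 0.63x linear gain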
 
There isn't anything to test on the Apple TV, since we know it does not properly support AAC 6-channel. AAC 6-channel contains 6 discrete channels (5.1), not matrixed stereo. The matrixed stereo is done by the Apple TV, since Apple has no licensed means to generate a 5.1 output.

I certainly won't argue about the merits of AC3. It is a very impressive format, but it is very heavy. On HB encodes, the AC3 track is typically 30%-40% of the file size, not including the second stereo mix. The unfortunate truth is that a level of compromise is required when dealing with heavily compressed files. We compress the video by up to 75%, but leave the audio alone. AAC scales well, and I feel it is better suited for downloadable content.
 
There isn't anything to test on the Apple TV, since we know it does not properly support AAC 6-channel. AAC 6-channel contains 6 discrete channels (5.1), not matrixed stereo. The matrixed stereo is done by the Apple TV, since Apple has no licensed means to generate a 5.1 output.

Incorrect again. The stereo matrixed surround format is already IN the source file's first two channels. AppleTV doesn't do anything to it.

I certainly won't argue about the merits of AC3. It is a very impressive format, but it is very heavy. On HB encodes, the AC3 track is typically 30%-40% of the file size, not including the second stereo mix.

What do you mean 30%-40% of what? The total file? Of course it's that large. It's a 5.1 channel bitstream at 448 Kbps. Are you suggesting that AAC should be encoded at a rate fundamentally lower than 96 Kbps per channel? That's right at the threshold of acoustic transparency for BOTH formats. No, AAC will not outperform AC-3 if, say, a six channel file is encoded at half or one-quarter the bitrate of AC-3. AAC at 128 Kbps for two channels is about equivalent to AC-3 at 192 Kbps, NOT counting the metadata parameters that make AC-3 the superior format for film soundtrack purposes.

You'd still need a bitstream of just above 320 Kbps for six-channel AAC to be fundamentally equivalent to 448 Kbps AC-3 in fidelity, without factoring in the metadata and filtering... and that's unrealistic, since every AC-3 track encoded to a DVD is going to possess those parameters and no AAC track will. Your AppleTV cannot ADD those parameters, either, as it is not capable of measuring Leq(A) or knowing the DRC profile that corresponds to the specific mixing setup used by the mastering engineers on the original recording. Therefore, your file is still going to be large, and to no avail, since all the extra workarounds required are like taking the long way around just to get back to the same place you would have been if you had used the original AC-3 file in the first place.

The unfortunate truth is that a level of compromise is required when dealing with heavily compressed files. We compress the video by up to 75%, but leave the audio alone. AAC scales well, and I feel it is better suited for downloadable content.

AAC is a perceptual coding scheme like AC-3, except AAC doesn't have any metadata parameters that allow true maximum efficiency of the format, particularly in theatrical surround applications. Please see my previous comments about the metadata and filters again... these are not present in AAC.
 
I haven't found any information that discusses whether the front 2 channels in AAC are matrixed or not, but you seem to know what you're talking about ;). It's an interesting idea and does make sense; I just hadn't heard it before.

As for the 30%-40%, I meant as compared to just the video encode at 1.5 Mbps (~25% of the total file size). Since most transcoders simply strip the AC3 from the source, a typical 2-hour movie takes 400 MB+ for the 5.1 audio track. The Apple method then spends an additional 100 MB+ of data on the second stereo stream. To top it off, that 400 MB+ audio track is only compatible with an Apple TV. Not even other Apple products can make use of it.
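The arithmetic behind those figures is easy to check. A quick sketch, assuming a 2-hour movie, a 1.5 Mbps video encode, a 448 Kbps AC-3 track and a 128 Kbps stereo AAC track (the rates discussed above):

    # Back-of-the-envelope sizes for a 2-hour encode.
    seconds = 2 * 60 * 60

    def megabytes(kbps):
        return kbps * 1000 * seconds / 8 / 1_000_000

    video, ac3, aac = megabytes(1500), megabytes(448), megabytes(128)
    print(f"video ~{video:.0f} MB, AC-3 ~{ac3:.0f} MB, stereo AAC ~{aac:.0f} MB")
    print(f"AC-3 track vs video: {ac3 / video:.0%}")  # ~30%, in line with the above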

As I said, I agree that AC3 is better. But with regard to standards compliance and efficient use of a limited resource (bandwidth), it's not optimal.
Can you imagine this argument for video? The original MPEG2 encode on your DVD is MUCH better than an H.264 encode from that same disc. I feel that the loss of quality in the video is drastic compared to the loss of metadata on the audio stream. A properly mastered 5.1 source playing on a balanced home theater setup would have little use for this data that couldn't be approximated. (not equal, but close enough to provide an experience very few people could distinguish)

For the discerning individual looking to maximize their experience, a BD disc is a much better way to go than downloading a movie online or ripping a DVD for your library.

I don't mind the Apple TV supporting AC3 pass-through, but as of right now it's the only 5.1 option it supports.
 
A properly mastered 5.1 source playing on a balanced home theater setup would have little use for this data that couldn't be approximated. (not equal, but close enough to provide an experience very few people could distinguish)

Absolutely incorrect. You cannot approximate dialogue normalization without knowing the average loudness of the entire AC-3 track. Software and hardware that do this are considerably expensive (the Dolby LM-1000 meter costs upwards of $3000)... and you cannot approximate the average value on the fly; the entire file must be analyzed first. You cannot approximate Dynamic Range Compression without actually knowing the peak mixing level, speaker type and compression profile intended by the original sound engineer. You cannot approximate RF intermodulation filtering or bandpass... it must either be performed in the original encoding OR done by rather expensive hardware capable of such filtering.
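A small illustration of the "entire file must be analyzed first" point, assuming a headerless 16-bit PCM dump and ignoring the A-weighting filter a real Leq(A) meter applies:

    # The average level is defined over the whole programme, so every sample has
    # to be read before a single dialnorm value can be written. (A real meter
    # would A-weight the signal first; that filter is omitted here.)
    import numpy as np

    def average_level_dbfs(path):
        samples = np.fromfile(path, dtype=np.int16) / 32768.0  # the *entire* track
        rms = np.sqrt(np.mean(samples ** 2))
        return 20.0 * np.log10(max(rms, 1e-9))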

Lastly, the reason these metadata exist is because the home theater is not the same as the theatrical auditorium. It doesn't matter how much money you spend, or how good your room acoustics are. The metadata, in addition to other functions, serve to replicate the theater experience in the acoustical environment and volumetric space of a smaller home theater, as well as compensating for the fact that home theaters do not generally use X-curve reference monitors. In a mixing room and a theater, X-curve monitors have sharp frequency roll off at the upper and lower end of the A-weighted spectrum because the acoustical space of the auditorium creates harmonic effects which alter the dynamics of frequency response at those extremes. You cannot "fake" a larger acoustical space and therefore, the metadata are necessary to compensate for the difference in hardware and room dynamics.

It would be smarter to simply bridge the disconnect between media centers and home theaters by bringing the media centers (e.g. AppleTV) into standards compliance with the motion picture standards, not the other way around. This is an easier approach because it reduces the expenditure for the end user. Rather than having to replace all kinds of audio and video hardware to conform to non-standard audio (no, AAC is not a SMPTE/AES standard format) it would be wiser at this juncture to require merely software upgrades to existing computer hardware to give them the capability of processing standard formats.

Bandwidth and storage are becoming less and less of an issue as the cost of both continues to decline much faster than the cost of upgrading audio and video hardware.
 
If you want an uncompromised solution, use the physical media. Apple could have used MOV or MKV, but they chose MPEG4 and should have abided by the standards.
This is somewhat of an ideological stance, but in the end I'm not convinced that the use of AC3 over AAC is close to being worth sacrificing the portability of the MP4 format. In samples that I have tried, the difference between AC3 and AAC does not enhance the experience for me. What's the point of ISO standards if they just go and break interoperability? The fragmentation of formats like MPEG4 is sloppy and is hurting the uptake of digital distribution to the TV.
There is no question of replacing hardware. It's all in place today. Just store the audio in the recommended format and use a solution like the open source AC3Filter for playback through any DD receiver.

(Yes, I know that MP4 allows private streams and Apple's solution is technically a 'legal' file, but given that no other device can read it, not even QT, it still ends up being a bit of a kludge.)
 
There is no question of replacing hardware. It's all in place today. Just store the audio in the recommended format and use a solution like the open source AC3Filter for playback through any DD receiver.

Doesn't solve the metadata issue. As newer formats emerge there will be convergence with newer hardware, but for the time being a better solution would be either to continue encapsulating the AC-3 file along with a 2-channel AAC file, OR to develop a metadata layer for AAC that corresponds to AC-3, so that multichannel AAC can be converted to AC-3 without losing the original parameters, and then to use a decoder that can convert the AAC with metadata into AC-3 with metadata.
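Just to sketch what such a metadata layer might look like, here is an entirely hypothetical sidecar structure (the field names are invented for illustration; no such AAC extension exists):

    # Hypothetical sidecar carrying the AC-3 parameters alongside a multichannel
    # AAC track, so a converter could regenerate AC-3 without guessing.
    from dataclasses import dataclass

    @dataclass
    class SurroundMetadata:
        dialnorm_dbfs: float     # e.g. -27.0
        drc_profile: str         # e.g. "Film Standard"
        lfe_lowpass_hz: int      # e.g. 120
        mix_level_db_spl: int    # e.g. 85
        monitor_type: str        # e.g. "X-curve"

    track_meta = SurroundMetadata(-27.0, "Film Standard", 120, 85, "X-curve")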

The only problem here is that Dolby has already invested resources in moving toward Dolby Digital Plus and Dolby TrueHD, which are much higher bandwidth formats and render even AAC impractical as a media center solution because of AAC's inability to preserve acoustic transparency of 24-bit multichannel audio.

If AAC had the metadata layer that AC-3 does, it would indeed be superior to AC-3 in every respect... but my contention is that metadata need to be preserved in order for it to be sufficient.
 
The discussion isn't about which format is better, but about the appropriateness of a format for online distribution and bulk libraries through the Apple TV.

No doubt that Dolby Digital Plus and Dolby TrueHD are impressive formats, but they don't address the issue of audio in bandwidth-constrained content or ISO spec adherence. They are designed for physical media, where they can use exponentially more bandwidth. Besides the absurdity of the audio track exceeding the size of the video in an SD DVD rip, hardware-wise I don't think that the Apple TV could handle even basic E-AC-3 through the current optical out.

With rare exceptions in home setups, the encoded DRC in AC3 is simply applied to the original stream at playback. While this has the advantage of allowing more usable bits of data during extremely quiet or loud moments, it does not mean that AAC would sound bad in the same situation. The more advanced compression algorithm can largely mitigate the perceptual loss.
Dialogue Normalization is nice, but it is more advantageous in content that is interrupted by commercials. One movie being slightly louder than another is not a large issue these days.

If the metadata is just applied and recorded into the AAC stream, the sound will be extremely close to the intended AC3 original. It's not perfect, but close enough given the technology constraints.
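As a rough picture of "applying the metadata and recording it into the stream", the decoder-side gains would simply be baked into the PCM once, before the AAC encode. A crude sketch (the two-band gain below is only a stand-in for Dolby's actual DRC curves, which are more sophisticated):

    # Apply dialnorm and a crude dynamic range compression pass to the PCM, then
    # encode the result to AAC so no metadata needs to travel with the file.
    import numpy as np

    def bake_in_metadata(pcm, dialnorm_dbfs=-27.0, boost=0.5, cut=0.5):
        out = pcm * 10.0 ** (-(31.0 + dialnorm_dbfs) / 20.0)    # dialnorm first
        loud = np.abs(out) > 0.5                                # crude split at -6 dBFS
        out[loud]  *= 1.0 - cut * (np.abs(out[loud]) - 0.5)     # tame loud peaks
        out[~loud] *= 1.0 + boost * (0.5 - np.abs(out[~loud]))  # lift quiet passages
        return np.clip(out, -1.0, 1.0)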

I don't mind so much that AC3 is supported via a private stream, but I am disappointed that the MP4 container's standard 5.1 solution is not. I don't want my library to be Apple TV exclusive and would gladly encode my DVD collection to AAC if given the choice.
 