
0002378 (Original poster), May 28, 2017
In my Swift app, I'm using a GCD timer (aka dispatch source timer) to read from a file into a look-ahead buffer at regular intervals (5 secs). This task repeats until either it completes (end of file, no more reading necessary) or the user performs an action (e.g. playing the next track) that "invalidates" the timer, i.e. makes the timer no longer necessary.

Now, in order to "invalidate" the timer, I perform a cancel on the timer source and wait (in a loop) until I can verify that the task has indeed been cancelled. See the code below:

Code:
        // Timer creation ("queue" here is the queue's name string)
        let dispatchQueue = dispatch_queue_create(queue, nil)

        timer = dispatch_source_create(DISPATCH_SOURCE_TYPE_TIMER, 0, 0, dispatchQueue)

        let intervalNanos = ... // some interval

        // Allow a 10% time leeway
        dispatch_source_set_timer(timer!, DISPATCH_TIME_NOW, intervalNanos, intervalNanos / 10)

        dispatch_source_set_event_handler(timer!) { [weak self] in

            // Check two flags before executing the task
            if ((!self!.stopped) && (!self!.paused)) {
                // Do the task
            }
        }

        // Dispatch sources are created suspended; the timer starts firing after this
        dispatch_resume(timer!)

...
    // Stop executing the task
    func stop() {

        // If already stopped, do nothing further
        if (stopped) {
            return
        }
      
        stopped = true

        if (timer != nil) {
            dispatch_source_cancel(timer!)
           
            while (dispatch_source_testcancel(timer!) == 0) {
                // Wait for timer to be cancelled
            }
            // At this point, no task instances should be executing
        }
    }

However, I sometimes hit a race condition where, even after the stop() method above has returned, an instance of the task is still executing (presumably due to sudden disk seeking slowness when reading the file), and I get an EXC_BAD_ACCESS (code=1, address=0x0) in my task code (i.e. the part that reads the file into a buffer, not shown here). I did some research online to determine that this is probably because of dereferencing a pointer to an object that has already been deallocated (presumably as a result of the stop() call).

So, my questions are:
1 - Is my code in stop() correct? At the end of its execution, can I be sure that all already-executing tasks have completed?
2 - If the error is indeed because of dereferencing a dead pointer (or whatever else), what can I do to prevent this bad access?
 
I seem to have found a solution to the above problem. I'm now using NSOperationQueue in conjunction with the GCD timer. I use the timer to run at regular intervals, but all it does is put the task on the operation queue; the operation queue executes it. When I need to kill the whole executor, I call cancelAllOperations() and waitUntilAllOperationsAreFinished() on the operation queue, which seems to have solved the bad access error. I'm not sure if this is the best solution, but it seems to work quite reliably. I performed rigorous testing.
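Roughly, the arrangement looks like this (a simplified sketch, not the actual code from my app; readNextChunk() is a placeholder name):

Code:
// Sketch of the timer + NSOperationQueue arrangement (inside my executor class)
let operationQueue = NSOperationQueue()
// operationQueue.maxConcurrentOperationCount = 1 makes it serial

dispatch_source_set_event_handler(timer!) { [weak self] in
    // The timer only enqueues the work; the operation queue executes it
    self?.operationQueue.addOperationWithBlock {
        self?.readNextChunk()   // placeholder for the actual buffering task
    }
}

func stop() {
    stopped = true
    dispatch_source_cancel(timer!)
    operationQueue.cancelAllOperations()
    // Blocks until any in-flight operation has finished
    operationQueue.waitUntilAllOperationsAreFinished()
}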

However, if anyone has any comments/suggestions, I would still like to hear them.
 
Another option would be to use a dispatch group. You can enter/leave the group in your timer code and then just wait on the group. You'd need to make sure that the very first thing you do in the task is enter the group, and the very last thing is leave the group. So you probably want to keep a strong reference captured into the closure.

Then once entered into the group, you can check the stopped state. (EDIT: Also be sure to be somewhat careful around the use of these flags for state. It should be okay, but if you get into real trouble, you might want to look at the OSAtomic APIs to ensure access is atomic)
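In sketch form (reusing the timer and flags from the earlier post):

Code:
let group = dispatch_group_create()

dispatch_source_set_event_handler(timer!) { [weak self] in
    // Keep a strong reference for the duration of the task
    guard let strongSelf = self else { return }

    dispatch_group_enter(group)            // very first thing: enter
    defer { dispatch_group_leave(group) }  // very last thing: leave

    if !strongSelf.stopped && !strongSelf.paused {
        // Do the task
    }
}

func stop() {
    stopped = true
    dispatch_source_cancel(timer!)
    // Blocks until any in-flight handler has left the group
    dispatch_group_wait(group, DISPATCH_TIME_FOREVER)
}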

EDIT2: I also wonder if a barrier would be useful here too. Cancel the timer, then dispatch a barrier synchronously to the queue the timer was using. That will force the queue into serial mode after the current block(s) have executed. It's a bit goofy, since you'd be dispatching an empty block, but it wouldn't require creating and maintaining the dispatch group. Once the barrier has executed, you should be in the clear. Or you can dispatch the barrier async and do the cleanup inside the barrier block, guaranteed not to be doing things in parallel with a timer block. I haven't tried this technique myself though.
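Sketched out (untried, as I said):

Code:
// Assumes the timer's queue was created concurrent, e.g.:
// let dispatchQueue = dispatch_queue_create(queue, DISPATCH_QUEUE_CONCURRENT)

dispatch_source_cancel(timer!)

// Option A: block the caller until any in-flight timer blocks finish
dispatch_barrier_sync(dispatchQueue) {}

// Option B: don't wait; do the cleanup inside the barrier itself,
// guaranteed not to run in parallel with any timer block
dispatch_barrier_async(dispatchQueue) {
    // ... cleanup here ...
}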

As an aside, I'd write a couple of your ifs like this:

Code:
if let validTimer = timer {
    // Do cleanup with validTimer
}

// Option 1:
if (self?.stopped == false && self?.paused == false) {
    // Do the task
}

// Option 2:
if let realSelf = self, realSelf.stopped != true, realSelf.paused != true {
    // Do the task with realSelf
}

The general issue is that you wind up sprinkling "!" into your code in many places, and the result is that I find the code "alarming" in the places where it gets used. "!" tends to be a flag, when I read it, that the contracts of these variables are not properly defined. In the case of timer, if you know you can't use it improperly, you can make it implicitly unwrapped in the class/struct.

In the checking-stopped case, you can compare an optional against a specific value, and if the optional is nil, the comparison will evaluate to false. This is one way you can do the nil check and the state check at the same time on "self". Another option is to combine the state check with the unwrap of the weak variable.

If I have an optional internal variable, I've gotten into the habit of allocating to a local variable first, doing configuration there, and then assigning to the internal variable to use later. It helps reduce the number of "?"s and "!"s that make it into my code, as well.
 
Last edited:
Another option would be to use a dispatch group.

Wow, dude! Thanks again :) This is a wealth of great information!

I did read about dispatch groups, but didn't try them (went with NSOperationQueue instead). I should branch my code and give dispatch groups a try.

"cleanup inside the barrier block"
What did you mean by "cleanup"? The only cleanup I can think of is actually canceling the executor. Once the executor is canceled, there are no resources to clean up, as such. So, I'm confused about what you meant by cleanup.

Good tip about the atomic variables. Will look into that further to see how I can make my flags more reliable.

I agree that optional unwrapping is ugly.

Thanks again for the great response. I will look into these suggestions over time.

BTW, in my previous post about NSTimer, I asked a question at the end that you perhaps missed. I'll ask it again: what is the intended use of NSTimer? What are its typical applications? Since it cannot be used outside the main thread, it seems to be limited in its potential applications.
 
Nobody but me tries to reinvent the wheel, but I use timers to avoid blocking in the event loop. I wanted to bring back them scrollbars.

[attached screenshot: Capture.jpg]
 
"cleanup inside the barrier block"
What did you mean by "cleanup"? The only cleanup I can think of is actually canceling the executor. Once the executor is canceled, there are no resources to clean up, as such. So, I'm confused about what you meant by cleanup.

It was an assumption I made based on what you said here:

I did some research online to determine that this is probably because of dereferencing a pointer to an object that has already been deallocated (presumably as a result of the stop() call).

What I was assuming was that the deallocation was happening during stop(), on the main thread, and not on the thread the timer was firing on. So using a barrier task can ensure that task is the only thing running in the queue when it does its work, preventing the sort of deallocate/access race conditions that may have been present. It is somewhat difficult to diagnose without more detail in the code, which I understand you can't provide, so I was imagining. :)

BTW, in my previous post about NSTimer, I asked a question at the end that you perhaps missed. I'll ask it again: what is the intended use of NSTimer? What are its typical applications? Since it cannot be used outside the main thread, it seems to be limited in its potential applications.

NSTimer pre-dates GCD. Hell, NSTimer pre-dates OS X. It's old. It was written in an era where threads were special-case tools, not something you can and should be using day-to-day.

When NSTimer is the only thing you have, it is the thing you use. Now that you have GCD and the flexibility to dispatch to other thread queues, there is much less need for it, I'd agree. I think the main reason it hasn't been deprecated is that it'd break legacy Mac apps, and if you aren't going to use GCD, you should still have something you can use as a timer.

As you've seen, going into GCD introduces new problems to solve. Problems you wouldn't really need to deal with if you could do it on the main thread with NSTimer only (especially if the work being done is cheap in CPU time). And in some cases, it is enough.
 
  • Like
Reactions: 0002378
What I was assuming was that the deallocation was happening during stop(), on the main thread, and not on the thread the timer was firing on. So using a barrier task can ensure that task is the only thing running in the queue when it does its work, preventing the sort of deallocate/access race conditions that may have been present.

Yes, I'm assuming the same thing (deallocation happens on the main thread), but I never figured that out. My workaround worked, so I didn't investigate further. Anyway, I'm not sure how the barrier task would help if the rogue task is already executing. Correct me if I'm wrong, but once a task begins executing, isn't it taken off the queue? So the barrier task would have no effect on the rogue task, although it would prevent new tasks from being enqueued. Right? So, it seems like the only solution is to let the rogue task finish (which is not a problem for me; it is harmless), and then proceed.

It is somewhat difficult to diagnose without more detail in the code, which I understand you can't provide, so I was imagining. :)

Ironic that you mention this, because all my code is totally open source, and I enthusiastically advertised it here :) https://forums.macrumors.com/thread...-in-swift-app-bundle-source-included.2053821/. It's also on GitHub (link in thread).

Of course, no pressure at all. But, if you ever have any free time and feel like taking a look at my code, please feel free to download the Git repository and build and run/debug the code. If you decide to take a look, the relevant classes are:
- BufferManager (which initiates the problematic timers),
- Player (which calls into BufferManager, to give you some context as to app functionality),
- ScheduledTaskExecutor (timer wrapper class which didn't work), and
- StoppableScheduledTaskExecutor (timer wrapper class which worked).

Truth be told, I'm totally new to working with OS X (and in particular, CoreAudio/AVFoundation), so I would love for someone more experienced (such as you) to take a look and review my code. Again, this is just a wish; no pressure whatsoever. This is a hobby project anyway, nothing serious.

As you've seen, going into GCD introduces new problems to solve. Problems you wouldn't really need to deal with if you could do it on the main thread with NSTimer only (especially if the work being done is cheap in CPU time). And in some cases, it is enough.

Yup, that's understandable. In fact, I am using NSTimer to update my mp3 player seek bar every second with the calculated playback position (a very cheap CPU computation). And, for that purpose, it works remarkably well.
 
Yes, I'm assuming the same thing (deallocation happens on the main thread), but I never figured that out. My workaround worked, so I didn't investigate further. Anyway, I'm not sure how the barrier task would help if the rogue task is already executing. Correct me if I'm wrong, but once a task begins executing, isn't it taken off the queue? So the barrier task would have no effect on the rogue task, although it would prevent new tasks from being enqueued. Right? So, it seems like the only solution is to let the rogue task finish (which is not a problem for me; it is harmless), and then proceed.

The purpose of a barrier is to help solve threading issues like the reader/writer problem. Shared state where you need to synchronize writing with reading, but reading can be done in parallel safely.

If you insert the barrier into a queue while a task is running, the barrier will wind up not running until the existing task completes. The behavior is that the barrier is a forced serial task in an otherwise parallel queue, so it needs assurance that no tasks are currently active, not just queued. GCD knows about tasks currently in flight, not just the ones in the queue; otherwise it couldn't properly load-balance queued tasks either, which is one of the more important roles of GCD. If you've seen Thread Pools on Windows, it is a similar concept, only more robust (IMO).

So say you have 6 tasks, and a queue that decides to run 3 in parallel. Let's also assume that every task takes an identical amount of time for the sake of argument. Makes the demonstration a bit easier.

I enqueue Tasks 1-6, and then Barrier A. Barrier A cannot execute until all 6 tasks have completed. Otherwise it breaks the guarantee that the barrier will be executed in serial with other tasks. So you will see:

Task 1-3 -> Task 4-6 -> Barrier A.

Now, what about a case where things aren't so neat? Say I enqueue Tasks 1-2, Barrier A, and then 3-6. You will then see:

Task 1-2 -> Barrier A -> Task 3-5 -> Task 6.

Even though the queue can run 3 tasks in parallel and can execute either task 3, or the barrier, the barrier prevents the queue from allowing any more to run until the barrier has executed, and the barrier cannot execute until all tasks before it in the queue have executed.

So if you want to modify shared state, one way you can do it is make sure that writing to the shared state is done using a barrier task in the queue. That way, you don't even have to wait. Throw the barrier in the queue, cancel the timer source, and know that the barrier will do any cleanup of the shared state that you previously did on the main thread.

Honestly, if I am going to be managing data on a background queue, a barrier can be a useful tool for ensuring synchronization when managing shared state.

Of course, this assumes a concurrent queue. If you are using a serial queue (which I think you are as you pass in zero for the attributes for the queue), you don't even have to use a barrier. Serial queues are already guaranteed to have this behavior of doing tasks one at a time, in serial order. It may be a good habit to use a barrier despite that though, since it means the queue could later morph into a concurrent one without disrupting behavior.
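Here's a minimal illustration of that ordering (queue and task names are arbitrary):

Code:
let queue = dispatch_queue_create("demo.concurrent", DISPATCH_QUEUE_CONCURRENT)

for i in 1...6 {
    dispatch_async(queue) { print("Task \(i)") }   // these may run in parallel
}

dispatch_barrier_async(queue) {
    // Runs alone, only after tasks 1-6 have all completed
    print("Barrier A")
}

dispatch_async(queue) { print("Task 7") }          // held back until the barrier finishes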

Yup, that's understandable. In fact, I am using NSTimer to update my mp3 player seek bar every second with the calculated playback position (a very cheap CPU computation). And, for that purpose, it works remarkably well.

Yup, and while that can also be done in GCD, sometimes it is pretty nice to say "You know, every 1s, call this selector on the view controller". Especially in Obj-C where it's a bit more in the style of Obj-C to do things like that, where Swift relies more on closures instead.
 
The purpose of a barrier is to help solve threading issues like the reader/writer problem. Shared state where you need to synchronize writing with reading, but reading can be done in parallel safely.

If you insert the barrier into a queue while a task is running, the barrier will wind up not running until the existing task completes. The behavior is that the barrier is a forced serial task in an otherwise parallel queue, so it needs assurance that no tasks are currently active, not just queued. GCD knows about tasks currently in flight, not just the ones in the queue; otherwise it couldn't properly load-balance queued tasks either, which is one of the more important roles of GCD. If you've seen Thread Pools on Windows, it is a similar concept, only more robust (IMO).

Ahhhh ... ok, I think I finally get what you're proposing. Yes, that would indeed be a very good way to "wait" for my rogue task to finish running: simply enqueue my "barrier" to force the serial behavior. I think where I got lost earlier was in assuming that the barrier's only purpose was to prevent new (unnecessary) tasks from being enqueued, which, by itself, would not be much help, because that one executing rogue task would already do damage (schedule an audio buffer for a song that is no longer being played), while the next valid task (e.g. playing the new track) would be kicked off by the main thread without waiting for the rogue to finish.

To clarify, the following are my two tasks in question, with the second one dependent on the completion of the first:
scheduleBuffer() // Rogue task executing on some dispatch queue
playNextTrack() // New task waiting for rogue task to finish

Currently, playNextTrack() runs on the main thread, totally independent of the async dispatch queue, which is why this problem exists. But if playNextTrack() can itself be the "barrier" task, and be put on the same queue (and it is a serial queue, i.e. maxConcurrentOperationCount = 1), then it cannot possibly start till the rogue task finishes. So, in other words, instead of task 2 observing task 1 from outside the queue, task 2 can simply follow task 1 on the queue.

Yes, I've worked with thread pools extensively, in Java. Haven't done much native coding on either Windows or OS X (this being the first time), but yes, I'm quite familiar with thread pools in general.

Thanks!
 
To clarify, the following are my two tasks in question, with the second one dependent on the completion of the first:
scheduleBuffer() // Rogue task executing on some dispatch queue
playNextTrack() // New task waiting for rogue task to finish

Currently, playNextTrack() runs on the main thread, totally independent of the async dispatch queue, which is why this problem exists. But if playNextTrack() can itself be the "barrier" task, and be put on the same queue (and it is a serial queue, i.e. maxConcurrentOperationCount = 1), then it cannot possibly start till the rogue task finishes. So, in other words, instead of task 2 observing task 1 from outside the queue, task 2 can simply follow task 1 on the queue.

Yes, the way I'd handle it is use a single serial queue if there is work that needs to be done in chunks, serially, but you want to keep it off the main thread. Then you get the barrier task behavior for free at that point.

As an aside, you got me curious about the code. And there are two big things I get from reading the code around buffer management. First, I would definitely use a serial queue to manage the buffer. When it comes time to swap out what you are buffering, those just become new tasks that do what is needed. Second is that I'd probably ditch the timer entirely and do something slightly different.

  • Have a function that handles reading a segment, and return some state on if there is more to read or not.
  • Have a wrapper function that forms the task. It checks to see if the buffer is full. If not, read a segment and schedule it. If the buffer is still not full after reading that segment, and there is more to read, queue a copy of itself.
  • In startPlaybackFromFrame, queue a copy of the read segment task.
  • When calling scheduleBuffer(), use a completion handler that queues a copy of the read segment task after recording that there are one fewer segments scheduled in the AVAudioPlayerNode.
The goal here is to quickly read many segments to fill the in-memory buffers, while the data on disk is still spinning/cached/etc. Once you have a full buffer, you can read "as needed". Using your code, let's say we call a full buffer 25 seconds of audio. Each segment is 5 seconds of audio. So the changes will mean that you read 25 seconds of audio in 5 second chunks in the serial queue and schedule them. Then, once every 5 seconds, you will read another 5 second chunk as a natural consequence of one of the segments finishing playback (see the sketch after this list). A couple nice benefits:
  1. No more timer to manage, and you really just need a buffer count which you can increment/decrement to know how many segments are currently scheduled in the AVAudioPlayerNode.
  2. The serial queue can have things inserted between the various segment tasks. Sometimes you might get doubled-up tasks in the queue in rare circumstances, but the end result is relatively minor and easy to render harmless.
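A rough sketch of what I mean (the names like scheduledSegments and readAndScheduleSegment() are invented for illustration, not taken from the repo):

Code:
// Inside a BufferManager-like class (sketch)
let bufferQueue = NSOperationQueue()   // with maxConcurrentOperationCount = 1, i.e. serial

func enqueueReadTask() {
    bufferQueue.addOperationWithBlock { [weak self] in
        guard let strongSelf = self else { return }
        guard strongSelf.scheduledSegments < strongSelf.maxSegments else { return }

        let moreToRead = strongSelf.readAndScheduleSegment()   // hypothetical
        strongSelf.scheduledSegments += 1

        // Keep filling until the buffer is full or the file is done
        if moreToRead && strongSelf.scheduledSegments < strongSelf.maxSegments {
            strongSelf.enqueueReadTask()
        }
    }
}

// And in scheduleBuffer()'s completion handler:
//     scheduledSegments -= 1
//     enqueueReadTask()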
Also curious why you read the buffers yourself instead of using scheduleSegment(), beyond needing to know the seek time. And I wonder how well Apple handles internal buffering, since it should be possible to do things like schedule a file and then seek within the file while letting Apple do all the buffering on your behalf.

There's quite a bit about the code that made me think that you are expending quite a bit of effort reinventing wheels and the like, but I'm gonna try to avoid nitpicking it to death. Instead, here's one Swift-specific tidbit that you can use to write less code that you may not already know (but relates to what I mention earlier in the thread):

Code:
// This:
if (bufferTimer != nil) {
  bufferTimer!.pause()
}

// Can be written as this:
bufferTimer?.pause()

The behavior of the two is identical, and the second version saves you quite a bit of typing.
 
As an aside, you got me curious about the code. And there are two big things I get from reading the code around buffer management. First, I would definitely use a serial queue to manage the buffer. When it comes time to swap out what you are buffering, those just become new tasks that do what is needed. Second is that I'd probably ditch the timer entirely and do something slightly different.
  • Have a function that handles reading a segment, and return some state on if there is more to read or not.
  • Have a wrapper function that forms the task. It checks to see if the buffer is full. If not, read a segment and schedule it. If the buffer is still not full after reading that segment, and there is more to read, queue a copy of itself.
  • In startPlaybackFromFrame, queue a copy of the read segment task.
  • When calling scheduleBuffer(), use a completion handler that queues a copy of the read segment task after recording that there are one fewer segments scheduled in the AVAudioPlayerNode.

Thanks for taking the time to read my code! I think I get the essence of your suggestion: use completion handlers, instead of timers, to "time" the reading of buffers. And that is a brilliant idea! I totally didn't even think of that one :) I'm sure the code will be much simpler! And pausing/resuming playback will be much simpler, because I won't have to pause/resume any timers; using the completion handler makes the scheduling logic much smarter and more reliable. Will look into refactoring accordingly.
The goal here is to quickly read many segments to fill the in-memory buffers, while the data on disk is still spinning/cached/etc. Once you have a full buffer, you can read "as needed". Using your code, let's say we call a full buffer 25 seconds of audio. Each segment is 5 seconds of audio. So the changes will mean that you read 25 seconds of audio in 5 second chunks in the serial queue and schedule them. Then, once every 5 seconds, you will read another 5 second chunk as a natural consequence of one of the segments finishing playback. A couple nice benefits:

Yes, I get that. The reason I didn't want to "look ahead" that far (25 seconds) is to minimize my memory footprint. I understand that it's best to grab as much as possible from disk, at once, but I had a decision to make, a compromise. Either I grab a bigger bunch from disk and fill up a fair bit of memory, or I could just grab a small chunk every 5 seconds. It's a question of memory footprint vs efficient disk reading, I suppose.

Since I'm always "looking ahead" at least 5 seconds (plus the initial headstart of 5 seconds), there will never be a hiccup in playback, even if the disk read suddenly slows down (unless the slowdown lasts 5 seconds or more). I decided on just having at most 10 seconds of data in memory at any given time.

I could change that to 25 seconds, I suppose, and see how my memory footprint changes. I'll give it a try and see what happens.

Also curious why you read the buffers yourself instead of using scheduleSegment(), beyond needing to know the seek time. And I wonder how well Apple handles internal buffering, since it should be possible to do things like schedule a file and then seek within the file while letting Apple do all the buffering on your behalf.

This is the most important question, of course. And I'm glad you asked! I was using scheduleFile() and scheduleSegment(), till I discovered that they cannot handle mp3 files with incorrect duration metadata, which is to say ... they cannot handle about 90% of my mp3 collection, which was downloaded/converted from various places like YouTube :) Some of them might say, for instance, that there are 2 million frames, while the actual frame count is 1,000 or 100,000 frames less.

When the end of such a file is reached, scheduleFile() and scheduleSegment() think that there are a few more frames to read (because of the incorrect duration metadata), and start allocating audio buffers in a mad frenzy, causing a huge memory leak and a complete system freeze! This is why I wrote that other post asking about using a timer for my daemon task that keeps an eye on memory usage :) When scheduleFile() or scheduleSegment() reaches the end of a "broken" mp3 file, it does not know where to end playback, and I'm shocked to realize that they're not smart enough to compute where the EOF actually is!

I spent countless head-scratching hours trying to figure it out. When I just couldn't figure it out, I decided to experiment with directly working with buffers, and no more problems! I'm able to detect the EOF very easily.

Is this a known issue with scheduleFile() and scheduleSegment()? I am shocked that no one else has encountered this problem playing broken mp3 files! I did so much research and couldn't find a single hit mentioning this issue. I can't be the only one. Of course, I would use scheduleFile() and scheduleSegment() if they worked properly.

There's quite a bit about the code that made me think that you are expending quite a bit of effort reinventing wheels and the like, but I'm gonna try to avoid nitpicking it to death.
Other than me using scheduleBuffer() instead of schFile() or schSegment(), is there any other wheel that you think I reinvented? Please let me know. I'm still getting acquainted with the Swift/CoreAudio frameworks.

Instead, here's one Swift-specific tidbit that you can use to write less code that you may not already know (but relates to what I mention earlier in the thread):

Code:
// This:
if (bufferTimer != nil) {
  bufferTimer!.pause()
}

// Can be written as this:
bufferTimer?.pause()

The behavior of the two is identical, and the second version saves you quite a bit of typing.

Ahhh, ok! I always wondered what the difference between ? and ! is, because they seemed to be the same. Now I know. Thanks, I will refactor my code accordingly!

Thanks, dude! If you have more comments about my code, do let me know.
 
Yes, I get that. The reason I didn't want to "look ahead" that far (25 seconds) is to minimize my memory footprint. I understand that it's best to grab as much as possible from disk, at once, but I had a decision to make, a compromise. Either I grab a bigger bunch from disk and fill up a fair bit of memory, or I could just grab a small chunk every 5 seconds. It's a question of memory footprint vs efficient disk reading, I suppose.

Since I'm always "looking ahead" at least 5 seconds (plus the initial headstart of 5 seconds), there will never be a hiccup in playback, even if the disk read suddenly slows down (unless the slowdown lasts 5 seconds or more). I decided on just having at most 10 seconds of data in memory at any given time.

I could change that to 25 seconds, I suppose, and see how my memory footprint changes. I'll give it a try and see what happens.

Something worth thinking about is actually doing some napkin math on your buffer sizes. If the PCM buffer is 24-bit at 44kHz (for example), each second is about 129KB in size (44,100 samples/sec × 3 bytes ≈ 129KB). Each 5 second buffer is then 645KB. For the 10 seconds you are caching, you are using just under 1.3MB of RAM for buffers as written.

25 seconds in this case is just an example, but unless you wanted to fit this into a very tiny memory footprint, I would be somewhat surprised that <3.2MB is too much buffer space. Since RAM always has to be powered, and Apple tends to ship Macs with a minimum of nearly 1000 times that much memory, I think you may be over-optimizing memory use before understanding where your memory is going. UI resources and the like will very easily catch up to even a 25 second buffer.


I spent countless head-scratching hours trying to figure it out. When I just couldn't figure it out, I decided to experiment with directly working with buffers, and no more problems! I'm able to detect the EOF very easily.

Is this a known issue with scheduleFile() and scheduleSegment()? I am shocked that no one else has encountered this problem playing broken mp3 files! I did so much research and couldn't find a single hit mentioning this issue. I can't be the only one. Of course, I would use scheduleFile() and scheduleSegment() if they worked properly.

No, that's a pretty good reason to not use it. I'm not too familiar with the quirks of this area of AVFoundation (I'm delving more into getting it to handle looped videos properly as a side project, which has different issues).

Not sure exactly what's going on there, but I suspect Apple tends to test against compliant data.

Other than me using scheduleBuffer() instead of schFile() or schSegment(), is there any other wheel that you think I reinvented? Please let me know. I'm still getting acquainted with the Swift/CoreAudio frameworks.

Don't say I didn't warn ya. :)

Mostly it is things that I would say are familiar given your Java background. Creating a StringBuilder wrapper class instead of just adding convenience extension methods onto String itself. Because String is a value type in Swift, there's no real need to have a separate type for mutation. If there's a convenience you want on the type, extensions are a better way to go, and make the code clearer to someone new to the codebase.

Similar thing with Size being a case of over engineering, although I could maybe understand making a custom subtype of UInt64 to represent sizes (type safety) and then get human-readable representations from the type as convenience methods. I personally just had an extension on Integer itself that would let me take an Int of any type and get the human-readable string from it as if it were a byte count. Not as type safe though. But it worked when dealing with random JSON that were byte counts that I wanted to make human readable for UI.

Also don't really need to write your own comparison function for SizeUnit. Make the enum derive from Int and Comparable, and set the values, and you'll get the usual comparison operators from the library for free. Similarly, ShuffleMode can derive from String, and you can use shuffleMode.rawValue and ShuffleMode(rawValue: string) to go back and forth between string and enum representations. The raw value stuff is handled for you by Swift if you derive from a type, reducing how much code you write.
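For example (a sketch; your actual cases will differ, and on older Swift you still supply < once, with <=, >, and >= then coming for free from the standard library's Comparable extensions):

Code:
enum SizeUnit: Int, Comparable {
    case B = 1, KB, MB, GB
}

// One operator is enough; the rest are derived from Comparable
func < (lhs: SizeUnit, rhs: SizeUnit) -> Bool {
    return lhs.rawValue < rhs.rawValue
}

enum ShuffleMode: String {
    case Off = "off"
    case On = "on"
}

let asString = ShuffleMode.On.rawValue         // "on"
let fromString = ShuffleMode(rawValue: "off")  // Optional(ShuffleMode.Off)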

Things like that. If you feel like you are writing boilerplate code for something, there's probably a better way to do it in Swift. One of the reasons I like the language.

I'm also not a huge fan, myself, of some of the forms of decomposition that Java tends to encourage. Stuff like TrackIO isn't really necessary when you can use a value type (struct) for Track in Swift, meaning if I 'let' a track value, I can guarantee that a mutating function is not callable at compile time, so the isolation away from the type isn't really necessary. But in general, using class factories and static mutators tends to rub me the wrong way in Swift. Something like loadTrack() can be written as a failable initializer:

Code:
init?(file: NSURL) {
   // Do I/O that can fail here.

   if (loadFailed) {
      return nil
   }

   self = Track()
   self.setPropertiesFromSource(foo) // this can probably be folded into a non-failable initializer once you have something that "cannot" fail. 
}

Oh, and I'm curious why so many classes derive from NSObject, and so few structs are used. Track, Playlist, and a couple others are ideal for this, since it makes it easier to do multithreading when you can safely mutate a copy (rather than a reference) to a playlist the UI might currently be using for display, and then tell the UI to take a copy of the mutated struct. Because of Swift's copy-on-write behaviors for value types, you shouldn't be generating a whole lot of extra memory use, but still getting the ability to mutate local copies freely without worrying what else could happen.

Unless the type needs to be bridged to AppKit or the like, it doesn't need to use NSObject/@objc for the most part. My projects on this sort of scale mostly only get @objc used for types that are view controllers or custom views. The data model tends to be free of Obj-C compatibility.
 
Something worth thinking about is actually doing some napkin math on your buffer sizes. If the PCM buffer is 24-bit at 44kHz (for example), each second is about 129KB in size (44,100 samples/sec × 3 bytes ≈ 129KB). Each 5 second buffer is then 645KB. For the 10 seconds you are caching, you are using just under 1.3MB of RAM for buffers as written.

Yup, that is a good point. I think the memory leak issue really scared me wrt memory usage. The other uncertainty that led to that decision (5 sec buffer) was that I'm not sure how (or how well) garbage collection works in Swift. I don't yet understand what makes an object eligible for GC in Swift (I need to learn) and how quickly it is cleaned up. I'm scared of the possibility of GC not being able to keep up with allocation. Also, I have noticed some remnants of C-like behavior, which, frankly, terrifies me. That uncertainty definitely pushed me towards the side of using less memory. So, it's not so much the footprint under normal circumstances, but the uncertainties and potential for a runaway/leak, that drove my decision.

I agree that much more memory is used up for other things, in comparison. And, of course, if a leak occurs, it won't matter how small/big the buffers are, but I just wanted to start conservatively. Once I have a better handle on how things work in the Swift world, I will do the kind of math you're suggesting, (and you're right, of course - that makes a lot of sense), and make changes accordingly. Thanks for the tip.


Don't say I didn't warn ya. :)

Mostly it is things that I would say are familiar given your Java background. Creating a StringBuilder wrapper class instead of just adding convenience extension methods onto String itself. [...]

Also don't really need to write your own comparison function for SizeUnit. Make the enum derive from Int and Comparable, and set the values, and you'll get the usual comparison operators from the library for free. Similarly, ShuffleMode can derive from String [...]

Stuff like TrackIO isn't really necessary when you can use a value type (struct) for Track in Swift [...] Something like loadTrack() can be written as a failable initializer [...]

Oh, and I'm curious why so many classes derive from NSObject, and so few structs are used. [...]

Unless the type needs to be bridged to AppKit or the like, it doesn't need to use NSObject/@objc for the most part. [...]

Thanks for the tips.

The only reason for me extending NSObject was that it was necessary for some of my selectors to work.

As for doing track IO in the Track initializer, that comes down to separation of concerns, and I think this has to do with my background in Java. All my life, I've been taught that, for the most part, model objects should be as "dumb" as possible: only concerned with their data fields and operations on those data fields. So, reading track info from a file doesn't agree with that paradigm. That's why I created TrackIO. I noticed the kind of thing you're suggesting is present in the Swift API, in String for example ... you can construct a String from the contents of a file! Wow, I never would have guessed that! There are very few instances of this kind of behavior in Java, from what I remember ... java.util.Properties can be read from a file using a constructor, but that's about it. So, yeah, in short, I wanted to keep the IO logic separate from the "dumb" Track model object.

I didn't yet absorb what you said about mutable structs, let, etc. I think it's because I just woke up :D I will need to read that again.

When I wrote my enums, I looked up "how to convert an enum to String" and came across rawValue, but then discovered that it was not (apparently) available for me in Swift 2.0. I tried using Xcode to suggest a rawValue (String) code completion for my enum, but it didn't. So, I assumed that it was a newer Swift feature, and wrote my own toString function. I will look into Int-based enums, though. I didn't know Swift had "Comparable"!!! I'm sure you can appreciate why it piques my curiosity :D

I like your idea of writing extensions (as opposed to wrapper classes). I don't yet know how they work, but have seen them everywhere in code samples, and will give them a try :)

Can I ask how long you've been programming? What kind of stuff have you programmed?
 
Yup, that is a good point. I think the memory leak issue really scared me wrt memory usage. The other uncertainty that led to that decision (5 sec buffer) was that I'm not sure how (or how well) garbage collection works in Swift. I don't yet understand what makes an object eligible for GC in Swift (I need to learn) and how quickly it is cleaned up. I'm scared of the possibility of GC not being able to keep up with allocation. Also, I have noticed some remnants of C-like behavior, which, frankly, terrifies me. That uncertainty definitely pushed me towards the side of using less memory. So, it's not so much the footprint under normal circumstances, but the uncertainties and potential for a runaway/leak, that drove my decision.

Unlike Java, Swift isn't garbage collected; it uses Automatic Reference Counting (ARC) instead. So there isn't a collector that runs occasionally. If you are familiar with reference counting, or C++'s shared pointers, it's similar to that, but Swift emits the retain/release calls on your behalf. So things will get deallocated the moment the ref count goes to zero.

It has different concerns when handling performance compared to garbage collection, so it's worth looking it up. You won't see heavy surges in memory usage from allocations that wind up waiting. What you will see is performance issues if you allocate a bunch of objects in tight loops that are loop scoped as you run afoul of the implicit retain/release on those objects.
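A tiny demonstration of how deterministic it is:

Code:
class Probe {
    let name: String
    init(name: String) { self.name = name }
    deinit { print("\(name) deallocated") }   // runs the moment the ref count hits zero
}

func demo() {
    let probe = Probe(name: "A")
    print("\(probe.name) is alive")
}

demo()
// Prints "A is alive", then "A deallocated" immediately,
// not at some later collection pass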

The only reason for me extending NSObject was that it was necessary for some of my selectors to work.

Yeah, with Swift, I try to avoid using selectors, and use closures instead, since Swift closures are compatible with Obj-C/GCD blocks. I was more curious why Track and related types were NSObjects, since the separation of concerns made it difficult to understand why you'd need to interact with them from Obj-C directly.

As for doing track IO in the Track initializer, that comes down to separation of concerns, and I think this has to do with my background in Java. All my life, I've been taught that, for the most part, model objects should be as "dumb" as possible: only concerned with their data fields and operations on those data fields. So, reading track info from a file doesn't agree with that paradigm. That's why I created TrackIO. I noticed the kind of thing you're suggesting is present in the Swift API, in String for example ... you can construct a String from the contents of a file! Wow, I never would have guessed that! There are very few instances of this kind of behavior in Java, from what I remember ... java.util.Properties can be read from a file using a constructor, but that's about it. So, yeah, in short, I wanted to keep the IO logic separate from the "dumb" Track model object.

Yeah, it's something where there are probably a couple different ways to go, and I would shrug and say either one could be the "right" one. I tend to skew more towards the fact that related code should be nearby and accessible so that the scope of a thing can be determined when reading the code. Separation of concerns is still compatible with that model, but it does take a bit of finesse to strike the right balance.

This is another area where type extensions come in to help strike the balance, for two reasons:
  1. The extension only has access to public members/initializers.
  2. The extension can live in a separate file, or even module.
So what I tend to do is put convenience initializers like these in extensions. That way the concern of how to create the type itself remains encapsulated away from the code that knows how to read the file. In the case of Track, how I'd personally do it is something like this:

Code:
// Oh hey, AVMetadataItem kinda already conforms to this.
protocol TrackMetadataItem {
   var keyName : String? { get }

   var stringValue : String? { get }
   var numberValue : NSNumber? { get }
   var dateValue : NSDate? { get }
}

struct Track {
   // Some stuff

   init(metadataItems : [TrackMetadataItem]) {
      // Initialize myself with the metadata items, I can check for specific key names and the like with switches/etc.
   }
}

// In some other file, or not, doesn't matter
extension AVMetadataItem : TrackMetadataItem {
   var keyName : String? {
      get {
         return self.commonKey as? String
      }
   }
}

extension Track {
   // May need to use NSURL, not sure when URL stabilized in Swift. But it is bridgeable with NSURL
   init?(fileUrl : URL) {
      let asset = AVAsset(...)
    
      // And some other things

      guard (thisAllWorked) else { return nil }

      self = Track(metadataItems: asset.commonMetadata)
   }
}

It's not a perfect example, but it does show how you can still separate the concerns of interacting with the AVAsset to extract the metadata from the ownership of the Track. While providing a way to construct a Track without having to know anything more than the metadata items being passed in. So the Track doesn't even really know anything about where the metadata came from. And the fileUrl initializer doesn't actually have to know how the Track wants to use the actual metadata items (something your code fails to separate out cleanly).

But note the benefits over the class factory style. Typing "Track(" will offer up the fileUrl version of the init, making it easier as a programmer to know how they can create a track. I also cheat a bit here and make it possible for AVMetadataItem to conform to TrackMetadataItem very simply, so the metadata items from the AVAsset can be passed in as-is since it already is an array of things Track wants when initializing. So not only am I keeping the two parts separated, I also manage to avoid too much boilerplate to glue the two together.

A more interesting thing would be to turn the array into a dictionary of [String: TrackMetadataItem]. The trick here is to do something a little goofy. Extend "Array where Element: TrackMetadataItem", and create a convenience function that returns the dictionary by using keyName as the key, and the TrackMetadataItem as the value. You can get more specific here, but it does get more complicated to get the compiler happy with it. But the idea being that now the version of init that takes a fileUrl can easily create the dictionary from an array of AVMetadataItems, while the Track initializer can now just lookup keys directly.
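Something along these lines (a sketch, building on the hypothetical TrackMetadataItem from the previous block):

Code:
extension Array where Element: TrackMetadataItem {
    // Index the items by key name, skipping any item without one
    func keyedByName() -> [String: Element] {
        var result = [String: Element]()
        for item in self {
            if let key = item.keyName {
                result[key] = item
            }
        }
        return result
    }
}

// In the fileUrl initializer:
//     let byKey = asset.commonMetadata.keyedByName()
// and the Track initializer can then look up keys directly.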

I didn't yet absorb what you said about mutable structs, let, etc. I think it's because I just woke up :D I will need to read that again.

Short reason: Java and Swift are actually somewhat similar here. They both have value types and reference types which have different semantics. Where it gets interesting in Swift is that struct vs class gives you different semantics (I think Java structs are value types too, maybe?), and with copy-on-write, value semantics for complex types (like Track) are a lot cheaper than they would be in Java. It doesn't copy on assign/pass, but rather on mutation (to a point). And passing them between threads via closures is safe. So there's fewer reasons to use classes exclusively, since structs actually provide you a bit more flexibility in some cases. My Swift code tends to skew more structs than classes, depending on what the type represents. But it is a habit that's hard to break from C++/Java/Obj-C.

And Swift has better mechanisms to prevent mutation on value types, which avoids the need for String + StringBuilder/StringBuffer where "String" is immutable, while "StringBuilder" is mutable. Instead, String itself can be made mutable or immutable using let vs var. Fewer types are needed to represent the same set of concepts, since you can declare a particular copy of a struct mutable or not.
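A quick illustration:

Code:
struct TrackInfo {        // value type
    var title: String
}

var a = TrackInfo(title: "First")
var b = a                 // independent copy (copy-on-write underneath)
b.title = "Second"
print(a.title)            // "First" -- a is untouched

let c = TrackInfo(title: "Fixed")
// c.title = "Changed"    // compile error: c is a let constant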

When I wrote my enums, I looked up "how to convert an enum to String" and came across rawValue, but then discovered that it was not (apparently) available for me in Swift 2.0. I tried using Xcode to suggest a rawValue (String) code completion for my enum, but it didn't. So, I assumed that it was a newer Swift feature, and wrote my own toString function. I will look into Int-based enums, though. I didn't know Swift had "Comparable"!!! I'm sure you can appreciate why it piques my curiosity :D

I like your idea of writing extensions (as opposed to wrapper classes). I don't yet know how they work, but have seen them everywhere in code samples, and will give them a try :)

Can I ask how long you've been programming? What kind of stuff have you programmed?

See this on raw values. You mostly have to declare the enum correctly and things light up. Because once the enum is declared correctly, it winds up getting bound to a few different protocol extensions that Swift defines in the library. That's where the ability to set and get via a raw value is defined.

Far, far too long for someone my age (more than 20 years, not even 40 yet). Picked it up early, but I'll say I was pretty rubbish for the longest time in high school and even college. The main thing was how many languages I got exposed to. I could produce usable code out of college, but a decade later and I'm not sure I can say I'm proud of some of the things I wrote back then. I do have a couple contributions in Handbrake and libavcodec from that timeframe though (under a different alias). Still use the Handbrake changes I wrote today (in some form anyways, not sure how much they've modified them for MKV input and the like). I currently work on a project you've probably heard of, and before that, worked on another project you've probably heard of.

Honestly, I'd say that my specialties are in Obj-C, Swift, and C. I know enough C# not to be completely rubbish in it, but they keep changing things drastically for me to realistically keep up. I'm still learning some of the bizarre undocumented C++ behavior (things you don't learn until you work with multiple compilers). But I've dabbled in so many languages beyond these that it kinda makes up for my lack of depth.
 
  • When calling scheduleBuffer(), use a completion handler that queues a copy of the read segment task after recording that there are one fewer segments scheduled in the AVAudioPlayerNode.

Dude, I finally mustered the balls to go ahead and implement this rather significant change in BufferManager, i.e. doing away with the timer and using the scheduleBuffer() completion handler to "look ahead". And man, it was not nearly as straightforward as I thought it would be.

The main problem I faced was that, when calling playerNode.stop(), not only does playback stop, but all completion handlers get invoked, because playerNode can't tell the difference between a buffer actually finishing its playback (reaching the last frame) and stop() being called before that last frame has been read; i.e. it doesn't care, it invokes the completion handler regardless. I think this is poor design. This caused a whole lot of unnecessary problems and resultant complexity in my code, because, when I wanted to seek or play the next track (i.e. invalidate the current playback session), I couldn't just call stop() and forget about all previously scheduled buffers. The stop() call would result in more buffers being scheduled as the completion handlers got invoked.

Then, the next challenge was: how to detect that an invalidated scheduling task, which could potentially be invoked a few seconds later (after a new track has started playing), is, in fact, invalid? The answer was a unique identifier of some sort. I chose to use a timestamp identifying a "playback session" (the duration between play() and stop(), or seek() and stop()). Anytime a buffer is scheduled, a timestamp for its parent playback session is passed into the task. Then, when the task finally executes, seconds later, that timestamp is compared to the current playback session's timestamp (which may have changed), and if they don't match up, I reject the task. I also have a redundant check just before actually scheduling the buffer on the player node, which could be seconds after the task initiated (if the disk is suddenly being slow). This second layer of safety prevents an invalid task from being put on the queue.
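In sketch form, the check looks like this (simplified, with invented names; the real code is in BufferManager on GitHub):

Code:
// A new session timestamp is taken on every play() / seek()
var sessionTimestamp = NSDate()

func enqueueSchedulingTask() {
    let mySession = sessionTimestamp
    operationQueue.addOperationWithBlock { [weak self] in
        guard let strongSelf = self else { return }

        // Reject tasks left over from an old playback session
        if mySession !== strongSelf.sessionTimestamp { return }

        // ... read the next chunk from disk (may take a while) ...

        // Redundant second check, just before scheduling on the player node
        if mySession === strongSelf.sessionTimestamp {
            // playerNode.scheduleBuffer(...)
        }
    }
}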

Finally, the order in which I call (1) playerNode.stop() and (2) operationQueue.waitUntilAllOperationsAreFinished() matters. I figured this out the hard way. Turns out, this is because of those pesky completion handlers triggered by stop(). First, I wait for the queue to drain, and then I call stop() on the player. All the completion handlers then get harmlessly executed and ignored (thanks to my timestamp comparison).

Anyway, the timer is gone, and it works very reliably. I don't know if the code is simpler than before (in ways yes, and in others no), but I'm glad the timer is gone. I've pushed the changes to GitHub. Take a look if you want.

There are two very nice consequences of this change:
1 - Pause/resume now require no actions on the part of BufferManager. No timer to suspend/resume. I was always afraid that somewhere, somehow, that pause/resume state might not get toggled properly and I'd have a player totally out of whack. Now, there's one less thing to worry about.
2 - It will greatly simplify my implementation of variable speed playback. For varispeed, I need to change how frequently buffers are allocated, and with timers, that would have been an ugly thing to have to implement. But, now that I'm relying on completion handlers, it doesn't matter how fast/slow playback occurs, lookahead buffers will be scheduled ahead of time, regardless.

I'm sure you'd find a better way to implement this, but I'm actually happy with it. The next thing to play with might be buffer size. Thanks for suggesting these changes.
 
Yes, I get that. The reason I didn't want to "look ahead" that far (25 seconds) is to minimize my memory footprint. I understand that it's best to grab as much as possible from disk, at once, but I had a decision to make, a compromise. Either I grab a bigger bunch from disk and fill up a fair bit of memory, or I could just grab a small chunk every 5 seconds. It's a question of memory footprint vs efficient disk reading, I suppose.

5 seconds versus 25 seconds of audio buffer is a small difference in memory, still single-digit megabytes. But continuously hitting the disk and keeping it powered up is likely a noticeable drain on the battery. Try profiling for energy with your 5 second buffer, and see if it's an issue. It might guide you in tuning your buffer size.
 
5 seconds versus 25 seconds of audio buffer is a small difference in memory, still single-digit megabytes. But continuously hitting the disk and keeping it powered up is likely a noticeable drain on the battery. Try profiling for energy with your 5 second buffer, and see if it's an issue. It might guide you in tuning your buffer size.

Yes, true. I agree. Thanks for the tip. I mentioned other reasons for not upping the memory usage - mostly to do with my unfamiliarity with memory cleanup by the runtime and the possibility of a runaway/leak. I have yet to do serious profiling and performance testing on my app. I'm still adding features n stuff. I'll revisit all performance-impacting aspects from time to time.
 
Dude, I finally mustered the balls to go ahead and implement this rather significant change in BufferManager, i.e. doing away with the timer and using the scheduleBuffer() completion handler to "look ahead". And man, it was not nearly as straightforward as I thought it would be.

So, another part of this that I was advocating for was getting rid of the need for waiting by leveraging the serial queue you have. If the work to start new playback is part of the queue that does the buffering, you should see the new queued work appear behind these empty work blocks you mention (something I kinda expected to happen occasionally and to need accounting for, just not in this case).

The goal I was aiming for with my suggestion was a scenario where maybe stop() does a little work off queue so that the queue can unclog faster (i.e. stopping the playback and marking the buffer as something you no longer need to fill), but play() should be setting up and configuring itself using a block dispatched to the queue. You've got a serial queue, but you seem to be doing a lot of work to avoid actually treating it like one?

These sort of hacks start to go away when the queue owns the state, things happen serially, and so you don't risk things happening out of order. The neat thing about serial queues is that any state that you let the queue "own" (meaning you don't touch it outside of the queue) doesn't need to be guarded against parallel access.
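To make that concrete, here's the shape I mean, as a minimal sketch (illustrative names, Swift 3 syntax):

Code:
import Foundation

// Minimal sketch of the "queue owns the state" pattern.
// isStopped is only ever read or written on the queue, so it
// needs no locks, atomics, or spin-waiting.
class PlaybackCoordinator {

    private let queue = DispatchQueue(label: "playback.coordinator") // serial by default
    private var isStopped = false

    func stop() {
        // Mutations are dispatched to the queue, never done in place
        queue.async { self.isStopped = true }
    }

    func scheduleWork(_ work: @escaping () -> Void) {
        queue.async {
            // Work enqueued after stop() sees the flag and becomes a no-op
            guard !self.isStopped else { return }
            work()
        }
    }
}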
 
So, another part of this that I was advocating for was getting rid of the need for waiting by leveraging the serial queue you have.

I understand. I did try to put player playback code on the queue, and it had some really weird side-effects: the playerNode's sampleTime (which is used to calculate playback/seek position) suddenly went to zero, in the middle of playing a song. Because I was troubleshooting the other (more serious) issues, I decided to keep the playback code as is for now.

I think there is some sort of implicit constraint that ties the playerNode code to the main thread. It sounds strange, and I don't know why, or even if, this is the case, but as soon as the playerNode calls are put on the queue, I start seeing really weird consequences.

I will give it another try to better understand this behavior.
 
Code:
...   
        dispatch_source_set_event_handler(timer!) { [weak self] in
         
            // Check two flags before executing the task
            if ((!self!.stopped) && (!self!.paused)) {
                // Do the task
            }
        }
...

I didn't follow the whole thread or review all the code, but I do see a problem here.

Since "self" is weak in the handler it may be nil, but you aren't handling that case.
The code should probably be something like this:
Code:
...  
        dispatch_source_set_event_handler(timer!) { [weak self] in
          
            // Check two flags before executing the task
            if let s = self, !s.stopped && !s.paused {
                // Do the task
            }
        }
...
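Or, equivalently, with an early exit, which drops the force unwraps entirely:

Code:
...
        dispatch_source_set_event_handler(timer!) { [weak self] in

            // Bail out if self has already been deallocated
            guard let s = self else { return }

            if !s.stopped && !s.paused {
                // Do the task
            }
        }
...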
 
Something worth thinking about is actually doing some napkin math on your buffer sizes. If the PCM buffer is 24-bit at 44 kHz (for example), each second is about 129 KB in size. Each 5 second buffer is then about 645 KB. For the 10 seconds you are caching, you are using just under 1.3 MB of RAM for buffers as written.

25 seconds in this case is just an example, but unless you wanted to fit this into a very tiny memory footprint, I would be somewhat surprised if <3.2 MB were too much buffer space. Since RAM always has to be powered, and Apple tends to ship Macs with a minimum of nearly 1000 times that much memory, I think you may be over-optimizing memory use before understanding where your memory is going. UI resources and the like will very easily catch up to even a 25 second buffer.

Krevnik, I profiled my app with both 5 second and 30 second buffer sizes. I played the same 44.1 kHz stereo audio file for both tests.

The single buffer footprints were: 1.68 MB and 10.09 MB respectively. This comes out to 1.68/5 = 344 KB/s of data, not 129 KB/s (which makes sense if the processing format is 32-bit float stereo: 44,100 frames/s × 4 bytes × 2 channels ≈ 344 KB/s).

Furthermore, keep in mind that I always have 2 buffers in memory, the one that's playing + the lookahead. So, for a 5 second buffer size, I'm really storing 3.36 MB, and for a 30 sec buffer size, I'm storing 20 MB.

That said, I think that my happy medium is a 15 sec buffer size, which puts two 5.05 MB buffers in memory, for a total of 10.1 MB at any given time.
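For anyone checking similar napkin math: the per-second cost can be estimated from the processing format. Here's a rough helper (my assumption: the usual deinterleaved Float32 processing format, i.e. 4 bytes per sample):

Code:
import AVFoundation

// Rough memory footprint of `seconds` of PCM audio in a given format.
// E.g. 44,100 Hz stereo Float32: 44100 * 4 * 2 ≈ 344 KB per second.
func approxBufferBytes(seconds: Double, format: AVAudioFormat) -> Double {
    let bytesPerSample = 4.0 // Float32 assumption
    return seconds * format.sampleRate * bytesPerSample * Double(format.channelCount)
}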

5 seconds versus 25 seconds of audio buffer is a small difference in memory, still single-digit megabytes. But continuously hitting the disk and keeping it powered up is likely a noticeable drain on the battery. Try profiling for energy with your 5 second buffer, and see if it's an issue. It might guide you in tuning your buffer size.

Mark, interestingly, Instruments doesn't let me profile for energy when the platform is OS X. Did the developers of Instruments forget that Apple makes laptops that run OS X, on batteries?

30 seconds of playback data takes up 10.09 MB vs 1.68 MB for a 5 second buffer. I don't know if that's a small difference. Maybe, in the grand scheme of things (assuming a modern machine with > 4GB of memory). Here is my code snippet that schedules buffers:

Code:
// Seconds of playback
private static let BUFFER_SIZE: UInt32 = 30
...
let sampleRate = playingFile!.processingFormat.sampleRate

// frameCapacity = seconds of playback * frames (samples) per second
let buffer: AVAudioPCMBuffer = AVAudioPCMBuffer(PCMFormat: playingFile!.processingFormat, frameCapacity: AVAudioFrameCount(Double(BufferManager.BUFFER_SIZE) * sampleRate))

[Attachment: 10MBBuffer.png]
 
Krevnik, I profiled my app with both 5 second and 30 second buffer sizes. I played the same 44.1 kHz stereo audio file for both tests.

The single buffer footprints were: 1.68 MB and 10.09 MB respectively. This comes out to 1.68/5 = 344 KB/s of data, not 129 KB/s.

<snip>

30 seconds of playback data takes up 10.09 MB vs 1.68 MB for a 5 second buffer. I don't know if that's a small difference. Maybe, in the grand scheme of things (assuming a modern machine with > 4GB of memory). Here is my code snippet that schedules buffers:

Yup, my napkin math didn't account for stereo audio. Whoops. :)

But I'd say also keep in mind you really have (up to) 30s of buffering happening here. You've just done it by making the active buffer that much bigger. What I was thinking was scheduling a larger number of smaller buffers. So instead of buffering 30s per read (which blocks the core/queue for longer), buffer 5s until you have 30s of lookahead. I've made a sort of tweaked version of QueueBasedBufferManager to demonstrate:

Code:
import Cocoa
import AVFoundation

/*
Manages audio buffer allocation, scheduling, and playback. Provides two main operations - play() and seekToTime(). Works in conjunction with a playerNode to perform playback.

A "playback session" begins when playback is started, as a result of either play() or seekToTime(). It ends when either playback is completed or a new request is received (and stop() is called).
*/
class QueueBasedBufferManager {
  
    // Indicates the beginning of a file, used when starting file playback
    static let FRAME_ZERO = AVAudioFramePosition(0)
  
    // Buffer Constants
    private static let BUFFER_SIZE: UInt32 = 5
    private static let BUFFER_COUNT: Int32 = 6

    // MARK: Dispatch Queue and owned variables
    // The (serial) dispatch queue on which all scheduling tasks will be enqueued
    // TODO: There should probably be a "master" queue for the type.
    // Each instance can have its own child queue, using the master queue as a target.
    // This means you can have more than one of these and they will coalesce the queues
    // into a single thread somewhere.
    private var dispatchQueue: DispatchQueue
  
    // Player node used for actual playback
    private var playerNode: AVAudioPlayerNode
  
    // The currently playing audio file
    // TODO: Should this use "!"? Too many of them in the code.
    private var playingFile: AVAudioFile?
  
    // Current number of buffers
    private var currentBufferCount: Int32 = 0
  
    // MARK: Shared variables across queues
    // TODO: abortBuffering should really be atomic?
    private var abortBuffering: Bool = false
  
    init(playerNode: AVAudioPlayerNode) {
        dispatchQueue = DispatchQueue(label: "Aural.queues.bufferManager")

        // Safe as we can guarantee nothing has yet been dispatched.
        self.playerNode = playerNode
    }
  
    // Start track playback from the beginning
    func play(avFile: AVAudioFile) {
        dispatchQueue.async {
            self.playingFile = avFile
            self.startPlaybackFromFrame(frame: QueueBasedBufferManager.FRAME_ZERO)
        }
    }
  
    // Seeks to a certain position (seconds) in the audio file being played back. Returns the calculated start frame and whether or not playback has completed after this seek (i.e. end of file)
    func seekToTime(seconds: Double) -> (playbackCompleted: Bool, startFrame: AVAudioFramePosition?) {
      
        stop()
      
        // TODO: Accessing playingFile from main queue could be a problem source.
        // This should be fine while immutable, but it's worth making a local reference.
        guard let realPlayingFile = self.playingFile else { return (true, nil) }
      
        let sampleRate = realPlayingFile.processingFormat.sampleRate
      
        //  Multiply your sample rate by the new time in seconds. This will give you the exact frame of the song at which you want to start the player
        let firstFrame = Int64(seconds * sampleRate)
      
        let framesToPlay = realPlayingFile.length - firstFrame
      
        // If not enough frames left to play, consider playback finished
        if framesToPlay > 0 {
          
            // Start playback. Re-set the file on the queue first, since
            // stop() clears playingFile asynchronously on this same queue.
            dispatchQueue.async {
                self.playingFile = realPlayingFile
                self.startPlaybackFromFrame(frame: firstFrame)
            }
          
            // Return the start frame to later determine seek position
            return (false, firstFrame)
          
        } else {
          
            // Nothing to play means playback has completed
          
            // Notify observers about playback completion
            EventRegistry.publishEvent(EventType.PlaybackCompleted, event: PlaybackCompletedEvent.instance)
          
            return (true, nil)
        }
    }
  
    // Stops the scheduling of audio buffers, in response to a request to stop playback (or when seeking to a new position). Waits till all previously scheduled buffers are cleared. After execution of this method, code can assume no scheduled buffers. Marks the end of a "playback session".
    func stop() {
        // TODO: Is there anything playerNode.pause() does that .stop() doesn't?
        abortBuffering = true
        playerNode.stop()

        // Do cleanup once the existing tasks have flushed.
        dispatchQueue.async {
            self.abortBuffering = false
            self.playingFile = nil
        }
    }
  
    // MARK: Do Not Call These Outside of the Queue
  
    // Starts track playback from a given frame position. Marks the beginning of a "playback session".
    private func startPlaybackFromFrame(frame: AVAudioFramePosition) {
      
        // Set the position in the audio file from which reading is to begin
        playingFile!.framePosition = frame
      
        // Schedule buffers until the lookahead is full (or EOF is reached).
        // From here on, each buffer's completion handler tops the pipeline back up.
        var reachedEOF = false
        while !reachedEOF && currentBufferCount < QueueBasedBufferManager.BUFFER_COUNT {
            reachedEOF = scheduleNextBuffer()
        }
      
        // Start playing the file
        playerNode.play()
    }
  
    // Schedules a single audio buffer for playback
    // The timestamp argument indicates which playback session this task was initiated for
    private func scheduleNextBuffer() -> Bool {
        guard !abortBuffering else { return true }
        guard self.currentBufferCount < QueueBasedBufferManager.BUFFER_COUNT else { return false }
        guard let currentPlayingFile = self.playingFile else { return false }
      
        let sampleRate = currentPlayingFile.processingFormat.sampleRate
        let buffer: AVAudioPCMBuffer = AVAudioPCMBuffer(pcmFormat: currentPlayingFile.processingFormat, frameCapacity: AVAudioFrameCount(Double(QueueBasedBufferManager.BUFFER_SIZE) * sampleRate))
      
        do {
            try currentPlayingFile.read(into: buffer)
        } catch let error as NSError {
            NSLog("Error reading from audio file '%@': %@", currentPlayingFile.url.lastPathComponent, error.description)
        }
      
        let readAllFrames = currentPlayingFile.framePosition >= currentPlayingFile.length
        let bufferNotFull = buffer.frameLength < buffer.frameCapacity
      
        // If all frames have been read, OR the buffer is not full, consider track done playing (EOF)
        let reachedEOF = readAllFrames || bufferNotFull
      
        // The completion handler fires on an internal audio engine thread, so hop
        // back onto the serial queue before touching queue-owned state.
        // (Arguably playerNode itself should know what queue it needs to be on.)
        playerNode.scheduleBuffer(buffer, at: nil, options: []) {
            self.dispatchQueue.async {
                self.currentBufferCount -= 1
                if reachedEOF {
                    // Notify observers about playback completion
                    EventRegistry.publishEvent(EventType.PlaybackCompleted, event: PlaybackCompletedEvent.instance)
                } else if !self.abortBuffering {
                    _ = self.scheduleNextBuffer()
                }
            }
        }
        self.currentBufferCount += 1
      
        return reachedEOF
    }
}

Some notes about the above code:
  1. The total size of the buffers in flight at any one time here will be BUFFER_SIZE * BUFFER_COUNT. That includes the one actively being played by the AVAudioEngine, since you don't get called back until after it completes playback. This means you don't have the two-large-buffer issue you point out in your latest post.
    • This means you can actually have a guaranteed 25s look-ahead buffer (+5s active), versus the 15s (+15s active) you have now, in the same memory footprint.
  2. Got rid of the timestamp variable for a simple boolean. Also leverages the serial queue as one might expect to use it in GCD land: Let it own the data, and dispatch writes to the data to the queue.
  3. I've marked TODOs in the code in places where I'm not too familiar with other parts of the code, but seem somewhat concerning to me or present future opportunities to clean up the code.
Now, I haven't tested the code, and some of the things you will hit playing with it are due to the difference between the version of Swift you are using and Swift 3. But I think this demonstrates more what I was going for with my suggestions. Take it or leave it as you see fit.
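For what it's worth, usage from the playback layer would look something like this (illustrative only; someURL is hypothetical, and error handling is minimal):

Code:
import AVFoundation

do {
    // Standard engine/node setup
    let engine = AVAudioEngine()
    let playerNode = AVAudioPlayerNode()
    engine.attach(playerNode)
    engine.connect(playerNode, to: engine.mainMixerNode, format: nil)
    try engine.start()

    let bufferManager = QueueBasedBufferManager(playerNode: playerNode)

    let someURL = URL(fileURLWithPath: "/path/to/track.m4a")
    let file = try AVAudioFile(forReading: someURL)
    bufferManager.play(avFile: file)

    // Later: jump to the 60-second mark
    let result = bufferManager.seekToTime(seconds: 60.0)
    print("Playback completed after seek: \(result.playbackCompleted)")
} catch {
    NSLog("Audio setup failed: \(error)")
}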
 