At the Fall 2010 edition of the Voices That Matter: iPhone Developers Conference in Philadelphia, attendees got an introduction to AV Foundation, the comprehensive new media framework introduced in iOS 4. At its basic level, AV Foundation offers a straightforward option for playback of any iOS-supported media type -- including streaming formats such as HTTP Live Streaming -- and developers who take the next step gain powerful capture, editing, and export features.
In this new session for 2011, we'll recap these abilities and move on to advanced uses of AV Foundation. In addition to capturing audio and video to a file, we'll look at how to use your own code to process the incoming data on the fly. To make your video editor more complete, we'll see how Core Animation ties into AV Foundation to provide titles and effects, and how these can be exported with your finished product. Finally, we'll look at the latest additions to the AV Foundation framework, such as the AVAssetReader and AVAssetWriter APIs, which give you the ability to work at the level of individual media samples, and more.
Voices That Matter: iPhone Developers Conference, April 9-10, 2011, Seattle, WA, USA.
Advanced Media Manipulation with AV Foundation
1. Advanced Media Manipulation
with AV Foundation
Chris Adamson — @invalidname — http://www.subfurther.com/blog
Voices That Matter iPhone Developer Conference — April 10, 2011
Sunday, April 10, 2011
2. The Deal
✤ Slides will be posted to the VTM conference site and http://www.slideshare.com/invalidname
✤ Code will be posted to blog at http://www.subfurther.com/blog
✤ Don’t try to transcribe the code examples
3. No, really
✤ Seriously, don’t try to transcribe the code examples
✤ You will never keep up
✤ AV Foundation has the longest class and method names you have ever seen:
✤ AVMutableVideoCompositionLayerInstruction
✤ AVAssetWriterInputPixelBufferAdaptor
✤ etc.
8. iOS 4 Media Frameworks
Core Audio / OpenAL — low-level audio streaming
Media Player — iPod library search/playback
AV Foundation — audio/video capture, editing, playback, export…
Core Video — Quartz effects on moving images
Core Media — objects for representing media times, formats, buffers
9. Size is relative
              AV Foundation   android.media   QT Kit   QuickTime for Java*
Classes       61              40              24       576
Methods       500+            280             360      >10,000
* – QTJ is used here only as an OO proxy for the procedural QuickTime API
10. How do media frameworks work?
15. “Boom Box” APIs
✤ Simple API for playback, sometimes recording
✤ Little or no support for editing, mixing, metadata, etc.
✤ Example: HTML 5 <audio> and <video> tags, iOS Media Player framework
17. “Streaming” APIs
✤ Use “stream of audio” metaphor
✤ Strong support for mixing, effects, other real-time operations
✤ Example: Core Audio and AV Foundation (capture)
19. “Document” APIs
✤ Use “media document” metaphor
✤ Strong support for editing
✤ Mixing may be a special case of editing
✤ Example: QuickTime and AV Foundation (playback and editing)
20. AV Foundation Classes
✤ Capture
✤ Assets and compositions
✤ Playback, editing, and export
✤ Legacy classes
21. AVAsset
✤ A collection of time-based media data
✤ Sound, video, text (closed captions, subtitles, etc.)
✤ Each distinct media type is contained in a track
✤ An asset represents the arrangement of the tracks. Tracks are pointers to source media, plus metadata (i.e., what parts of the source to use; a gain or opacity to apply, etc.)
✤ Asset ≠ media. Track ≠ media. Media = media.
✤ Asset also contains metadata (where common to all tracks)
22. AVAsset subclasses
✤ AVURLAsset — An asset created from a URL, such as a song or movie file or network document/stream
✤ AVComposition — An asset created from assets in multiple files, used to combine and present media together.
✤ Used for editing
23. AVPlayer
✤ Provides the ability to play an asset
✤ play, pause, seekToTime: methods; currentTime, rate properties
✤ Init with URL or with AVPlayerItem

NSURL *url = [NSURL URLWithString:
    @"http://www.subfurther.com/video/running-start-iphone.m4v"];
AVURLAsset *asset = [AVURLAsset URLAssetWithURL:url options:nil];
AVPlayerItem *playerItem = [AVPlayerItem playerItemWithAsset:asset];
player = [[AVPlayer playerWithPlayerItem:playerItem] retain];
24. AVPlayerLayer (or not)
✤ CALayer used to display video from a player
✤ Check that the media has video

NSArray *visualTracks = [asset tracksWithMediaCharacteristic:
    AVMediaCharacteristicVisual];
if ((!visualTracks) || ([visualTracks count] == 0)) {
    playerView.hidden = YES;
    noVideoLabel.hidden = NO;
}
25. AVPlayerLayer (no really)
✤ If you have video, create AVPlayerLayer from AVPlayer.
✤ Set bounds and video “gravity” (bounds-filling behavior)

else {
    playerView.hidden = NO;
    noVideoLabel.hidden = YES;
    AVPlayerLayer *playerLayer = [AVPlayerLayer playerLayerWithPlayer:player];
    [playerView.layer addSublayer:playerLayer];
    playerLayer.frame = playerView.layer.bounds;
    playerLayer.videoGravity = AVLayerVideoGravityResizeAspect;
}
27. Media Capture
✤ AV Foundation capture classes for audio / video capture, along with still image capture
✤ Programmatic control of white balance, autofocus, zoom, etc.
✤ Does not exist on the simulator. AV Foundation capture apps can only be compiled for and run on the device.
✤ API design is borrowed from QTKit on the Mac
30. Capture basics
✤ Create an AVCaptureSession to coordinate the capture
✤ Investigate available AVCaptureDevices
✤ Create AVCaptureDeviceInput and connect it to the session
✤ Optional: set up an AVCaptureVideoPreviewLayer
✤ Optional: connect AVCaptureOutputs
✤ Tell the session to start recording
31. Getting capture device and input

AVCaptureDevice *videoDevice = [AVCaptureDevice
    defaultDeviceWithMediaType:AVMediaTypeVideo];
if (videoDevice) {
    NSLog (@"got videoDevice");
    AVCaptureDeviceInput *videoInput = [AVCaptureDeviceInput
        deviceInputWithDevice:videoDevice error:&setUpError];
    if (videoInput) {
        [captureSession addInput:videoInput];
    }
}

Note 1: You may also want to check for AVMediaTypeMuxed
Note 2: Do not assume devices based on model (c.f. iPad Camera Connection Kit)
32. Creating a video preview layer

AVCaptureVideoPreviewLayer *previewLayer =
    [AVCaptureVideoPreviewLayer layerWithSession:captureSession];
previewLayer.frame = captureView.layer.bounds;
previewLayer.videoGravity = AVLayerVideoGravityResizeAspect;
[captureView.layer addSublayer:previewLayer];

Keep in mind that the iPhone cameras have a portrait orientation
33. Setting an output

captureMovieOutput = [[AVCaptureMovieFileOutput alloc] init];
if (! captureMovieURL) {
    captureMoviePath = [getCaptureMoviePath() retain];
    captureMovieURL = [[NSURL alloc] initFileURLWithPath:captureMoviePath];
}
NSLog (@"recording to %@", captureMovieURL);
[captureSession addOutput:captureMovieOutput];

We’ll use the captureMovieURL later…
34. Start capturing

[captureSession startRunning];
recordButton.selected = YES;
if ([[NSFileManager defaultManager] fileExistsAtPath:captureMoviePath]) {
    [[NSFileManager defaultManager] removeItemAtPath:captureMoviePath error:nil];
}
// note: must have a delegate
[captureMovieOutput startRecordingToOutputFileURL:captureMovieURL
    recordingDelegate:self];
39. Core Media
✤ C-based framework containing structures that represent media samples and media timing
✤ Opaque types: CMBlockBuffer, CMBufferQueue, CMFormatDescription, CMSampleBuffer, CMTime, CMTimeRange
✤ Handful of convenience functions to work with these
✤ Buffer types provide wrappers around possibly-fragmented memory, time types provide timing at arbitrary precision
40. CMTime
✤ CMTime contains a value and a timescale (similar to QuickTime)
✤ Time scale is how the time is measured: “nths of a second”
✤ Time in seconds = value / timescale
✤ Allows for exact timing of any kind of media
✤ Different tracks of an asset can and will have different timescales
✤ Convert with CMTimeConvertScale()
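The value/timescale arithmetic can be sketched in plain C. This is a minimal sketch, not the real Core Media API: FakeTime, seconds(), and convert_scale() are hypothetical stand-ins for CMTime and CMTimeConvertScale():

```c
#include <stdint.h>

/* Hypothetical stand-in for CMTime: a count of "timescale-ths" of a second. */
typedef struct {
    int64_t value;     /* number of units */
    int32_t timescale; /* units per second */
} FakeTime;

/* Time in seconds = value / timescale */
static double seconds(FakeTime t) {
    return (double)t.value / (double)t.timescale;
}

/* Re-express a time in a new timescale, rounding to the nearest unit
   (a rough analogue of CMTimeConvertScale()). */
static FakeTime convert_scale(FakeTime t, int32_t newScale) {
    FakeTime out;
    out.timescale = newScale;
    out.value = (t.value * newScale + t.timescale / 2) / t.timescale;
    return out;
}
```

For example, 900 units at timescale 600 is 1.5 seconds; converting to an audio track's timescale of 44100 yields value 66150 at timescale 44100 — the same instant, expressed in a different precision.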
42. AVAssetWriter
✤ Introduced in iOS 4.1
✤ Allows you to create samples programmatically and write them to an asset
✤ Used for synthesized media files: screen recording, CGI, synthesized audio, etc.
43. Using AVAssetWriter
✤ Create an AVAssetWriter
✤ Create and configure an AVAssetWriterInput and connect it to the writer
✤ -[AVAssetWriter startWriting]
✤ Repeatedly call -[AVAssetWriterInput appendSampleBuffer:] with CMSampleBufferRefs
✤ Set expectsMediaDataInRealTime appropriately, honor the readyForMoreMediaData property.
44. Example: iOS Screen Recorder
✤ Set up an AVAssetWriter to write to a QuickTime movie file, and an AVAssetWriterInput with codec and other video track metadata
✤ Set up an AVAssetWriterInputPixelBufferAdaptor to simplify converting CGImageRefs into CMSampleBufferRefs
✤ Use an NSTimer to periodically grab the screen image and use the AVAssetWriterInputPixelBufferAdaptor to write to the AVAssetWriterInput
45. Create writer, writer input, and pixel buffer adaptor

assetWriter = [[AVAssetWriter alloc] initWithURL:movieURL
                                        fileType:AVFileTypeQuickTimeMovie
                                           error:&movieError];
NSDictionary *assetWriterInputSettings = [NSDictionary dictionaryWithObjectsAndKeys:
    AVVideoCodecH264, AVVideoCodecKey,
    [NSNumber numberWithInt:FRAME_WIDTH], AVVideoWidthKey,
    [NSNumber numberWithInt:FRAME_HEIGHT], AVVideoHeightKey,
    nil];
assetWriterInput = [AVAssetWriterInput assetWriterInputWithMediaType:AVMediaTypeVideo
                                                      outputSettings:assetWriterInputSettings];
assetWriterInput.expectsMediaDataInRealTime = YES;
[assetWriter addInput:assetWriterInput];
assetWriterPixelBufferAdaptor = [[AVAssetWriterInputPixelBufferAdaptor alloc]
    initWithAssetWriterInput:assetWriterInput
    sourcePixelBufferAttributes:nil];
[assetWriter startWriting];

Settings keys and values are defined in AVAudioSettings.h and AVVideoSettings.h, or AV Foundation Constants Reference
51. AVAssetReader
✤ Introduced in iOS 4.1
✤ Possible uses:
✤ Showing an audio waveform in a timeline
✤ Generating frame-accurate thumbnails
52. Using AVAssetReader
✤ Create an AVAssetReader
✤ Create and configure an AVAssetReaderOutput
✤ Three concrete subclasses: AVAssetReaderTrackOutput, AVAssetReaderAudioMixOutput, and AVAssetReaderVideoCompositionOutput.
✤ Get data with -[AVAssetReaderOutput copyNextSampleBuffer]
53. Example: Convert iPod song to PCM
✤ In iOS 4, the Media Player framework exposes a new metadata property, MPMediaItemPropertyAssetURL, that allows AV Foundation to open the library item as an AVAsset
✤ Create an AVAssetReader to read sample buffers from the song
✤ Create an AVAssetWriter to convert and write PCM samples
54. Coordinated reading/writing
✤ You can provide a block to -[AVAssetWriterInput requestMediaDataWhenReadyOnQueue:usingBlock:]
✤ Only perform your asset reads / writes when the writer is ready.
✤ In this example, AVAssetWriterInput.expectsMediaDataInRealTime is NO
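The pull model can be sketched in plain C. Everything here is a hypothetical stand-in: pull_sample() for -[AVAssetReaderOutput copyNextSampleBuffer], append_sample() for -[AVAssetWriterInput appendSampleBuffer:], and ready_for_more() for the readyForMoreMediaData property:

```c
#define TOTAL_SAMPLES 10

static int next_sample = 0;   /* reader position */
static int written = 0;       /* samples appended so far */

/* Stand-in for -[AVAssetReaderOutput copyNextSampleBuffer]:
   returns a sample index, or -1 when the source is exhausted. */
static int pull_sample(void) {
    return (next_sample < TOTAL_SAMPLES) ? next_sample++ : -1;
}

/* Stand-in for -[AVAssetWriterInput appendSampleBuffer:]. */
static void append_sample(int sample) {
    (void)sample;
    written++;
}

/* Stand-in for readyForMoreMediaData: pretend the writer can
   absorb three samples per "ready" callback. */
static int ready_for_more(int appended_this_round) {
    return appended_this_round < 3;
}

/* Body of the requestMediaDataWhenReadyOnQueue:usingBlock: block:
   append only while the writer is ready; stop when the reader runs
   dry. Returns 1 when the source is exhausted. */
static int on_writer_ready(void) {
    int appended = 0;
    while (ready_for_more(appended)) {
        int sample = pull_sample();
        if (sample < 0)
            return 1;   /* done: mark the writer input finished */
        append_sample(sample);
        appended++;
    }
    return 0;
}
```

Calling on_writer_ready() until it returns 1 moves every sample from the reader to the writer without ever appending while the writer is busy — the same discipline the real block must follow.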
56. Set up writer input

AudioChannelLayout channelLayout;
memset(&channelLayout, 0, sizeof(AudioChannelLayout));
channelLayout.mChannelLayoutTag = kAudioChannelLayoutTag_Stereo;
NSDictionary *outputSettings = [NSDictionary dictionaryWithObjectsAndKeys:
    [NSNumber numberWithInt:kAudioFormatLinearPCM], AVFormatIDKey,
    [NSNumber numberWithFloat:44100.0], AVSampleRateKey,
    [NSNumber numberWithInt:2], AVNumberOfChannelsKey,
    [NSData dataWithBytes:&channelLayout length:sizeof(AudioChannelLayout)], AVChannelLayoutKey,
    [NSNumber numberWithInt:16], AVLinearPCMBitDepthKey,
    [NSNumber numberWithBool:NO], AVLinearPCMIsNonInterleaved,
    [NSNumber numberWithBool:NO], AVLinearPCMIsFloatKey,
    [NSNumber numberWithBool:NO], AVLinearPCMIsBigEndianKey,
    nil];
AVAssetWriterInput *assetWriterInput =
    [[AVAssetWriterInput assetWriterInputWithMediaType:AVMediaTypeAudio
                                        outputSettings:outputSettings] retain];

Note 1: Many of these settings are required, but you won’t know which until you get a runtime error.
Note 2: AudioChannelLayout is from Core Audio
68. Video Editing? On iPhone? Really?

1999: Power Mac G4 500 AGP        2010: iPhone 4
CPU: 500 MHz G4                   CPU: 800 MHz Apple A4
RAM: 256 MB                       RAM: 512 MB
Storage: 20 GB HDD                Storage: 16 GB Flash

Comparison specs from everymac.com
69. AVComposition
✤ An AVAsset that gets its tracks from multiple file-based sources
✤ To create a movie, you typically use an AVMutableComposition
composition = [[AVMutableComposition alloc] init];
74. Multiple video tracks
✤ To combine multiple video sources into one movie, create an AVMutableComposition, then create AVMutableCompositionTracks

// create composition
self.composition = [[AVMutableComposition alloc] init];
// create video tracks a and b
// note: media types are defined in AVMediaFormat.h
[trackA release];
trackA = [self.composition addMutableTrackWithMediaType:AVMediaTypeVideo
                                       preferredTrackID:kCMPersistentTrackID_Invalid];
[trackB release];
trackB = [self.composition addMutableTrackWithMediaType:AVMediaTypeVideo
                                       preferredTrackID:kCMPersistentTrackID_Invalid];
// locate source video track
AVAssetTrack *sourceVideoTrack = [[sourceVideoAsset tracksWithMediaType:AVMediaTypeVideo]
                                     objectAtIndex:0];
75. A/B Roll Editing
✤ Apple recommends alternating between two tracks, rather than using arbitrarily many (e.g., one track per shot)
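A minimal sketch of the bookkeeping this implies (hypothetical helpers, not AV Foundation API): shots alternate between tracks A and B, and each shot after the first starts early by the transition's duration, so the two tracks overlap during each cross-dissolve:

```c
/* Hypothetical helper: with A/B roll editing, shot n goes to
   track A (0) when n is even, track B (1) when n is odd. */
static int track_for_shot(int shotIndex) {
    return shotIndex % 2;
}

/* Hypothetical helper: start time (in seconds) of shot i in the
   composition, when every shot lasts shotDur seconds and consecutive
   shots overlap by `overlap` seconds for the transition. */
static double shot_start(int i, double shotDur, double overlap) {
    return i * (shotDur - overlap);
}
```

With 5-second shots and 1-second dissolves, shot 2 lands on track A starting at 8.0 seconds, overlapping the tail of shot 1 on track B.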
76. Sound tracks
✤ Treat your audio as separate tracks too.

// create music track
trackMusic = [self.composition addMutableTrackWithMediaType:AVMediaTypeAudio
                                           preferredTrackID:kCMPersistentTrackID_Invalid];
CMTimeRange musicTrackTimeRange = CMTimeRangeMake(kCMTimeZero,
                                                  musicTrackAudioAsset.duration);
NSError *trackMusicError = nil;
[trackMusic insertTimeRange:musicTrackTimeRange
                    ofTrack:[musicTrackAudioAsset.tracks objectAtIndex:0]
                     atTime:kCMTimeZero
                      error:&trackMusicError];
77. Empty ranges
✤ Use -[AVMutableCompositionTrack insertEmptyTimeRange:] to account for any part of any track where you won’t be inserting media segments.

CMTime videoTracksTime = CMTimeMake(0, VIDEO_TIME_SCALE);
CMTime postEditTime = CMTimeAdd (videoTracksTime,
    CMTimeMakeWithSeconds(FIRST_CUT_TRACK_A_IN_TIME, VIDEO_TIME_SCALE));
[trackA insertEmptyTimeRange:CMTimeRangeMake(kCMTimeZero, postEditTime)];
videoTracksTime = postEditTime;
79. AVVideoComposition
✤ Describes how multiple video tracks are to be composited together. Mutable version is AVMutableVideoComposition
✤ Not a subclass of AVComposition!
✤ Contains an array of AVVideoCompositionInstructions
✤ The instructions’ time ranges must not overlap or leave gaps, and together must match the duration of the AVComposition
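That constraint can be checked with simple interval arithmetic. This validator is hypothetical (not part of AV Foundation), but it captures the rule: the instruction ranges, sorted by start time, must tile the composition's duration exactly:

```c
typedef struct {
    double start;     /* seconds */
    double duration;  /* seconds */
} Range;

/* Hypothetical validator: instruction time ranges must tile the
   composition's duration with no overlaps and no gaps, and the last
   range must end exactly at `total`. Ranges are assumed sorted by
   start time. Returns 1 if valid, 0 otherwise. */
static int ranges_tile(const Range *r, int count, double total) {
    double cursor = 0.0;
    for (int i = 0; i < count; i++) {
        if (r[i].start != cursor)   /* gap or overlap */
            return 0;
        cursor += r[i].duration;
    }
    return cursor == total;
}
```

A set like {0–4, 4–5, 5–8} tiles an 8-second composition; dropping the middle range leaves a gap and fails the check (and, in the real framework, fails at playback or export time).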
80. AVVideoCompositionInstruction
✤ Represents video compositor instructions for all tracks in one time range
✤ The per-track instructions are carried in its layerInstructions property
✤ Of course, you’ll be creating an AVMutableVideoCompositionInstruction
81. AVVideoCompositionLayerInstruction (yes, really)
✤ Identifies the instructions for one track within an AVVideoCompositionInstruction.
✤ AVMutableVideoCompositionLayerInstruction. I warned you about this back on slide 3.
✤ Currently supports two properties: opacity and affine transform. Animating (“ramping”) these creates fades/cross-dissolves and pushes.
✤ e.g., -[AVMutableVideoCompositionLayerInstruction setOpacityRampFromStartOpacity:toEndOpacity:timeRange:]
84. Titles and Effects
✤ AVSynchronizedLayer gives you a CALayer that gets its timing from an AVPlayerItem, rather than a wall clock
✤ Run the movie slowly or backwards, the animation runs slowly or backwards
✤ Can add other CALayers as sublayers and they’ll all get their timing from the AVPlayerItem
85. Creating a main title layer

// synchronized layer to own all the title layers
AVSynchronizedLayer *synchronizedLayer =
    [AVSynchronizedLayer synchronizedLayerWithPlayerItem:compositionPlayer.currentItem];
synchronizedLayer.frame = [compositionView frame];
[self.view.layer addSublayer:synchronizedLayer];
// main titles
CATextLayer *mainTitleLayer = [CATextLayer layer];
mainTitleLayer.string = NSLocalizedString(@"Running Start", nil);
mainTitleLayer.font = @"Verdana-Bold";
mainTitleLayer.fontSize = videoSize.height / 8;
mainTitleLayer.foregroundColor = [[UIColor yellowColor] CGColor];
mainTitleLayer.alignmentMode = kCAAlignmentCenter;
mainTitleLayer.frame = CGRectMake(0.0, 0.0, videoSize.width, videoSize.height);
mainTitleLayer.opacity = 0.0; // initially invisible
[synchronizedLayer addSublayer:mainTitleLayer];
86. Adding an animation

// main title opacity animation
[CATransaction begin];
[CATransaction setDisableActions:YES];
CABasicAnimation *mainTitleInAnimation =
    [CABasicAnimation animationWithKeyPath:@"opacity"];
mainTitleInAnimation.fromValue = [NSNumber numberWithFloat:0.0];
mainTitleInAnimation.toValue = [NSNumber numberWithFloat:1.0];
mainTitleInAnimation.removedOnCompletion = NO;
mainTitleInAnimation.beginTime = AVCoreAnimationBeginTimeAtZero;
mainTitleInAnimation.duration = 5.0;
[mainTitleLayer addAnimation:mainTitleInAnimation forKey:@"in-animation"];

Nasty gotcha: AVCoreAnimationBeginTimeAtZero is a special value that is used for AVF animations, since 0 would otherwise be interpreted as CACurrentMediaTime()
88. Multi-track audio
✤ AVPlayerItem.audioMix property
✤ AVAudioMix class describes how multiple audio tracks are to be mixed together
✤ Analogous to videoComposition property (AVVideoComposition)
89. Basic Export
✤ Create an AVAssetExportSession
✤ Must set outputURL and outputFileType properties
✤ Inspect possible types with supportedFileTypes property (list of AVFileType… strings in docs)
✤ Begin export with exportAsynchronouslyWithCompletionHandler:
✤ This takes a block, which will be called on completion, failure, cancellation, etc.
90. Advanced Export
✤ AVAssetExportSession takes videoComposition and audioMix parameters, just like AVPlayerItem
✤ To include AVSynchronizedLayer-based animations in an export, use an AVVideoCompositionCoreAnimationTool and set it as the animationTool property of the AVMutableVideoComposition (but only for export)
92. More fun with capture
✤ Can analyze video data coming off the camera with the AVCaptureVideoDataOutput class
✤ Can provide uncompressed frames to your AVCaptureVideoDataOutputSampleBufferDelegate
✤ The callback provides you with a CMSampleBufferRef
✤ See WWDC 2010 AVCam example
98. Only effects are dissolve and push?
How would we do this checkerboard wipe in AV Foundation?
It’s pretty easy in QuickTime!
99. How do you…
✤ Save a composition to work on later?
✤ Even if AVMutableComposition supports NSCopying, what if you’ve got titles in an AVSynchronizedLayer?
✤ Support undo / redo of edits?
✤ Add import/export support for other formats and codecs?
100. AV Foundation Sucks!
✤ Too hard to understand!
✤ Too many classes and methods!
✤ Verbose and obtuse method naming
✤ AVComposition and AVVideoComposition are completely unrelated? WTF, Apple?
103. AV Foundation Rocks!
✤ Addresses a huge range of media functionality
✤ The other guys don’t even try
✤ Same framework used by Apple for iMovie for iPhone/iPad
✤ You can create functionality equivalent to iMovie / Final Cut in a few hundred lines of code
✤ Coming to Mac OS X in 10.7 (Lion)
104. Q&A
Chris Adamson — @invalidname — http://www.subfurther.com/blog
Voices That Matter iPhone Developer Conference — April 10, 2011
105. Also from Pearson!
✤ “Core Audio is serious black arts shit.” — Mike Lee (@bmf)
✤ It’s tangentially related to AV Foundation, so you should totally buy it when it comes out.