WWDC 2017 Notes

source link: https://kevinchen.co/blog/wwdc-2017-notes/

Tuesday, June 6

What’s New in Cocoa Touch (Session 201)

  • Productivity

    • Add a UIDragInteraction to a view to enable drag, and a UIDropInteraction to containers to enable drop
    • Document picker provides both local and cloud files
  • Visual changes

    • Search bar, refresh control become part of the navigation bar
    • Navbar height can change, so read safeAreaInsets to get the size
    • UIScrollView.contentInsetAdjustmentBehavior: control the behavior of automatic scrollable padding
    • Table view swipe actions are now public API
  • API enhancements

    • Codable protocol as a Swift-native alternative to NSCoding; requires no extra work if all the members are also Codable (see the sketch after this list)
    • Block-based KVO: use block to specify what happens when value changes
    • UIFontMetrics helps you scale a normal-sized font based on the user’s dynamic type setting
    • Password autofill appears in the keyboard suggestion bar when it detects a login form
    • For PDF assets, can now save the vector data. This can be used for the accessibility HUD on tab bar icons.
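  • A minimal Codable sketch (Conference is just an example type, not from the session): the compiler synthesizes encoding and decoding because every stored property is itself Codable

      import Foundation

      struct Conference: Codable {
          var name: String
          var year: Int
      }

      // Encoding and decoding are synthesized; force-try only for brevity
      let data = try! JSONEncoder().encode(Conference(name: "WWDC", year: 2017))
      let decoded = try! JSONDecoder().decode(Conference.self, from: data)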

Introducing Drag and Drop (Session 203)

  • Great multi-touch experience: can pass dragged items between hands, and hold onto them while interacting with other UI elements, as if manipulating physical objects
  • Drag and drop objects

    • UIDragItem is the model object for the data you’re dragging
    • UIPasteConfiguration can be used to support both paste and drop with the same code
    • UIDropInteraction for more customization: can provide a delegate that accepts/rejects drops and requests formats (see the sketch after this list)
    • Animations, etc can be customized with the delegates
  • Timeline for a drag and drop interaction

    • Drag starts: app needs to provide the items to be dragged

      • Default image is a snapshot of the dragged view
    • Drop proposal: .cancel, .copy (should be default), .move, .forbidden

      • .move only works within the same app
    • Get another callback when the user lifts finger to complete or fail the drop session
  • Sample code

    • Bulletin board app for dragging around photos
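  • A minimal sketch of the delegate flow above (noteLabel and boardView are placeholder views, not from the sample project)

      import UIKit

      class BoardViewController: UIViewController, UIDragInteractionDelegate, UIDropInteractionDelegate {
          let noteLabel = UILabel()   // drag source (placeholder)
          let boardView = UIView()    // drop target (placeholder)

          override func viewDidLoad() {
              super.viewDidLoad()
              noteLabel.isUserInteractionEnabled = true
              noteLabel.addInteraction(UIDragInteraction(delegate: self))
              boardView.addInteraction(UIDropInteraction(delegate: self))
          }

          // Drag starts: provide the model objects being dragged
          func dragInteraction(_ interaction: UIDragInteraction, itemsForBeginning session: UIDragSession) -> [UIDragItem] {
              let provider = NSItemProvider(object: (noteLabel.text ?? "") as NSString)
              return [UIDragItem(itemProvider: provider)]
          }

          // Drop proposal: .copy is the usual default; .move only works within the same app
          func dropInteraction(_ interaction: UIDropInteraction, sessionDidUpdate session: UIDropSession) -> UIDropProposal {
              return UIDropProposal(operation: .copy)
          }

          // The user lifted their finger: load the dropped items
          func dropInteraction(_ interaction: UIDropInteraction, performDrop session: UIDropSession) {
              session.loadObjects(ofClass: NSString.self) { strings in
                  print("Dropped:", strings)
              }
          }
      }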

From Monroe to NASA (Session 106)

  • Dr. Christine Darden
  • In high school, geometry was her first exposure to math. Ended up studying math and teaching in college out of concern that nobody would hire a black female mathematician.
  • Ended up in graduate school as a research assistant for aerosol physics. Earned a degree in applied math and recruited by NASA.
  • After five years of implementing math on the computer, she pushed management to give her a promotion. Previously, men and women of the same background were assigned different positions: men became researchers and published papers, while the women worked as computers.
  • Researched ways to reduce sonic boom for supersonic aircraft.
  • Went into management at Langley and retired in 2007.

Introducing Core ML (Session 703)

  • CoreML provides inference implementations of many learning algorithms, including deep learning, allowing developers to focus on app features

    • Vision, NLP built on top of CoreML
    • High performance, on-device implementations with Accelerate framework and Metal Performance Shaders, which you can use to build custom models
  • Storing models in CoreML

    • Many models, including feed forward and recurrent neural networks (30+ layer types)
    • Developers aren’t required to work with the models directly — CoreML can handle these “low level details”
    • Models are stored / shared in an open format (.mlmodel) that contains the “function signature,” structure, and weights. The format is optimized for “openness,” and Xcode compiles it into something more performant.
    • Apple provides some models, and you can import from other open source learning frameworks using Core ML Tools (open source)
  • Using models

    • All models have the same interface, like scikit-learn
    • Xcode generates Swift code so that you can write things like myModel.prediction(input: myInput) (see the sketch after this list)
    • You can also access the lower-level MLModel object if you want
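  • A sketch of the generated interface (FlowerClassifier and its input/output names are hypothetical; Xcode names the class after your .mlmodel file)

      import CoreML

      // "FlowerClassifier", "image", and "classLabel" are made-up generated names
      let model = FlowerClassifier()
      let pixelBuffer: CVPixelBuffer = makeInputBuffer()   // assumed helper that sizes a buffer for the model
      let output = try! model.prediction(image: pixelBuffer)
      print(output.classLabel)

      // The generated class also exposes the underlying MLModel for lower-level access
      let rawModel: MLModel = model.model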

Introducing HEIF and HEVC (Session 503)

  • Context: quality and popularity of photos/videos continues to increase, but bandwidth is still expensive
  • High Efficiency Video Coding (HEVC or H.265) uses up to 40% less space compared to H.264 (50% for iOS camera)

    • Variable block sizes allow larger blocks, which provides compression benefits for high-resolution videos
    • Uses the Discrete Sine Transform (DST) as well as the DCT, with variable transform sizes
    • More directions for intra prediction
    • Filters to do sub-pixel (not pixel aligned) motion estimation?
    • Sample adaptive offset filter for deblocking
  • High Efficiency Image File Format (HEIF) is an image container format that can use HEVC as the codec

    • Multiple images: sequences, alpha/depth channel
    • Usually uses HEVC for compression, but can support others
  • HEIF photos on Apple devices

    • Images are encoded as 512x512 tiles to support partial downloading
    • .heic extension indicates the HEIF file was encoded with HEVC (this is always the case)
    • Includes EXIF and thumbnail as usual
    • Hardware decoding: A9 (iPhone 6s and later), Skylake (ex: Touch Bar MacBook Pro)
    • Hardware encoding: A10 (iPhone 7)
  • HEVC movies on Apple devices

    • QuickTime is used as the container
    • Also supports 10-bit color (software encoder available)
    • Hardware decoding: A9, Macs?
    • Hardware encoding: A10, Intel 6th generation Core
  • Decodable vs. playable

    • Decodable: it is possible to decode the movies, such as for a transcoding or non-realtime workflow (true on all devices)
    • Playable: can be played in realtime
  • Sharing

    • Transcode to JPEG or H.264 if you’re not sure about the receiver’s decoding capabilities, or if the server cannot transcode (see the sketch after this list)

      • Ex: Mail
    • Try to avoid transcoding and increased file size with capabilities exchange

      • Ex: AirDrop conditionally encodes depending on the receiving device
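  • A sketch of the fallback path: transcode to an H.264 QuickTime file with AVAssetExportSession when the receiver’s capabilities are unknown (the URLs are placeholders)

      import AVFoundation

      let sourceURL = URL(fileURLWithPath: "clip-hevc.mov")       // assumed HEVC movie
      let destinationURL = URL(fileURLWithPath: "clip-h264.mov")  // H.264 output for maximum compatibility

      // AVAssetExportPresetHighestQuality produces H.264; the new HEVC presets
      // (such as AVAssetExportPresetHEVCHighestQuality) would keep HEVC instead
      let asset = AVAsset(url: sourceURL)
      if let export = AVAssetExportSession(asset: asset, presetName: AVAssetExportPresetHighestQuality) {
          export.outputURL = destinationURL
          export.outputFileType = .mov
          export.exportAsynchronously {
              if export.status == .completed {
                  // hand the H.264 file to Mail, a server that can’t transcode, etc.
              }
          }
      }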

Introducing ARKit: Augmented Reality for iOS (Session 603)

  • Sneak peek: applications of ARKit

    • Tell 3D stories that you can explore by moving the device
    • IKEA puts furniture in your living room
    • Pokémon Go improved tracking by adopting ARKit
  • What’s provided by ARKit?

    • Visual–inertial odometry (physical distances with no camera calibration)
    • Finding surfaces and hit testing (intersecting?) them
    • Lighting estimation
    • Custom AR views to help with rendering
  • ARKit API

    • Create an ARSession object and run it with an ARSessionConfiguration (see the sketch after this list). It will create AVCaptureSession and CMMotionManager for you, then start tracking.
    • Grab frames from the ARSession or get notified via delegate
    • Calling run() again updates the configuration
    • ARFrame: image, tracking points, and scene information
    • ARAnchor represents a real-world position, which you can add and remove
  • Tracking details

    • Find the same feature in multiple frames to triangulate device’s position
    • Pose estimation saves power by fusing high-frequency motion data (relative) with low-frequency camera data (absolute)
    • The transform and camera intrinsics are returned as an ARCamera
  • Plane detection details

    • Finds planes that are horizontal with respect to gravity
    • Multiple similar planes are merged
  • Hit testing

    • Given a ray from the camera to some visible point in the scene, returns all the distances along the ray that intersect with planes in the scene
    • Option to estimate planes by finding coplanar feature points
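  • A minimal sketch of the session flow above, using the class names that shipped in iOS 11 (the beta names in these notes, like ARSessionConfiguration, were renamed before release)

      import ARKit

      class ARController: UIViewController, ARSessionDelegate {
          let session = ARSession()

          override func viewDidLoad() {
              super.viewDidLoad()
              session.delegate = self
              // World tracking provides the 6DOF pose; also look for horizontal planes
              let configuration = ARWorldTrackingConfiguration()
              configuration.planeDetection = .horizontal
              session.run(configuration)   // calling run() again later just updates the configuration
          }

          // Frames arrive via the delegate (or can be read from session.currentFrame)
          func session(_ session: ARSession, didUpdate frame: ARFrame) {
              _ = frame.capturedImage        // camera image
              _ = frame.camera.transform     // device pose from visual-inertial odometry
              _ = frame.anchors              // ARAnchor / ARPlaneAnchor objects
          }
      }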

Wednesday, June 7

Capturing Depth in iPhone Photography (Session 507)

  • Dual camera

    • iPhone 7 Plus sold much better than 6s Plus, and Brad attributes this to the dual camera
    • Dual camera (virtual device) seamlessly matches exposure and compensates for parallax when zooming
  • Depth and disparity

    • Value of depth map at each location indicates distance to that object
    • Stereo rectified: multiple cameras that have the same focal length and parallel optical axes
    • Disparity: the inverse of depth. For the same object, the change in position between the images goes down as the object gets farther away.
    • AVDepthData can be either depth or disparity map, because they’re both depth-related data
    • Holes are represented as NaN. Happens when unable to find features, or points that are not present in both images.
    • Calibration errors (unknown baseline) can happen from OIS, gravity pulling on lens, or focusing. This leads to a constant offset error in the depth data. So we only have relative depth (cannot compare between images).
  • Streaming depth data

    • AVCaptureDepthDataOutput can give raw depth data during preview, or smooth it between frames (see the sketch after this list)
    • Maximum resolution 320x240 at 24 fps
    • AVCaptureDataOutputSynchronizer helps synchronize multiple outputs of a capture session
    • Can opt into receiving camera intrinsics matrix with focal length and optical center (used by ARKit)
  • Capturing depth data

    • In iOS 11, all Portrait Mode photos contain depth data embedded
    • Lenses have distortion because they’re not pinholes — straight lines in the scene don’t necessarily become straight lines in the image. Features in different parts of the image may be warped differently (especially because the cameras have different focal lengths).
    • Depth maps are computed as rectilinear, then distortion is added so they correspond directly to the RGB image. This is good for photo editing, but you need to make it rectilinear for scene reconstruction.
    • AVCameraCalibrationData: intrinsics, extrinsics (camera’s pose), lens distortion center (not always the same as optical axis), radial distortion lookup table
  • Dual photo capture

    • Capturing separate images from both cameras in a single request
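  • A sketch of streaming depth with AVCaptureDepthDataOutput (error handling and preview omitted; assumes a dual-camera device)

      import AVFoundation

      class DepthStreamer: NSObject, AVCaptureDepthDataOutputDelegate {
          let session = AVCaptureSession()
          let depthOutput = AVCaptureDepthDataOutput()

          func start() throws {
              guard let device = AVCaptureDevice.default(.builtInDualCamera, for: .video, position: .back) else { return }
              session.addInput(try AVCaptureDeviceInput(device: device))
              depthOutput.isFilteringEnabled = true   // temporally smooth the depth between frames
              session.addOutput(depthOutput)
              depthOutput.setDelegate(self, callbackQueue: DispatchQueue(label: "depth"))
              session.startRunning()
          }

          // One AVDepthData per frame, at most 320x240 at 24 fps
          func depthDataOutput(_ output: AVCaptureDepthDataOutput, didOutput depthData: AVDepthData,
                               timestamp: CMTime, connection: AVCaptureConnection) {
              _ = depthData.depthDataMap   // CVPixelBuffer of depth or disparity values
          }
      }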

Thursday, June 8

Core ML in depth (Session 710)

  • Core ML abstracts machine learning models as Swift functions

    • Recurrent neural networks can also return a state, which should be passed back to the model for the next prediction (see the sketch after this list)
  • Implementation and optimization

    • Automatically decides GPU (compute-heavy), CPU (memory-heavy), or both for different parts of the model — no knobs to tweak
  • Where do models come from?

    • Download Core ML format models from Apple
    • Core ML Tools to convert other models, including models you trained (pip install coremltools)
    • Python bindings into Core ML let you test whether the converted model has the same output
    • Converter library and model format library makes it easier to add new input formats
    • May need to specify that the input is an image, so you don’t get an N×N×3 array
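  • A hedged sketch of the recurrent-state loop with a hypothetical generated class (all of the names below are made up; the real generated names depend on your model)

      import CoreML

      // Hypothetical generated model "CharPredictor": each prediction returns an updated
      // recurrent state (stateOut) that is fed back in as stateIn on the next call
      let model = CharPredictor()
      var state: MLMultiArray? = nil
      for token in encodedTokens {                  // encodedTokens: assumed input sequence
          let out = try! model.prediction(input: token, stateIn: state)
          state = out.stateOut                      // carry the state forward
      }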

Image Editing with Depth (Session 508)

  • Use depth data to apply different effects to different parts of the scene
  • Preparing depth data

    • Read it into CIImage from the photo’s auxiliary data
    • Upscale with edge-preserving algorithm
    • Normalize by finding min and max, because dual camera can only provide relative depth data
    • Higher quality by using disparity instead of depth (why?)
  • Editing with depth data

    • Common pattern: filter the image, then use a modified depth map as a blend mask between the original and filtered versions (see the sketch after this list)
    • Pass nil as color space when working with disparity maps so Core Image doesn’t try to color manage
    • CIDepthBlurEffect is the same filter used in Portrait Mode. You can adjust the size of the simulated aperture and the focused area (either by rect or facial landmark locations).
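  • A sketch of the blend-mask pattern (the file path is a placeholder; a real edit would also upscale and normalize the disparity into a proper mask first)

      import CoreImage

      let imageURL = URL(fileURLWithPath: "portrait.heic")   // assumed Portrait Mode photo with disparity
      let original = CIImage(contentsOf: imageURL)!
      // Load the embedded disparity map as its own CIImage
      let disparity = CIImage(contentsOf: imageURL, options: [.auxiliaryDisparity: true])!
      let filtered = original.applyingFilter("CIPhotoEffectNoir", parameters: [:])
      // Foreground = filtered image, background = original, mask derived from disparity
      let mask = disparity
      let result = filtered.applyingFilter("CIBlendWithMask", parameters: [
          kCIInputBackgroundImageKey: original,
          kCIInputMaskImageKey: mask,
      ])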

Friday, June 9

What’s new in Apple File System (Session 715)

  • Millions of devices converted to APFS

    • Was the default on iOS, watchOS, and tvOS — now macOS too
    • Dry run conversions in iOS 10.0, 10.1, and 10.2 to test robustness of the conversion process
    • Many devices gained free space because LwVM (volume manager) is no longer needed
    • COW snapshots ensure consistent state when making iCloud backups
  • Unicode normalization

    • Unicode characters can be represented in many ways, such as ñ = n + ~
    • Native normalization for erase-restored iOS 11 devices
    • Runtime normalization (in the filesystem driver?) for 10.3.3 and 11 devices that are not erase-restored
    • Future update to convert all devices to native normalization. Maybe because APFS compares filenames by hash, so they need to redo all the hashes on disk.
  • APFS on macOS

    • System volume will be automatically converted by the High Sierra release installer (it’s optional during the beta). Other volumes can be manually converted. Boot volume must be converted by the installer to be bootable, so don’t do it manually.
    • Multiple volumes on existing drives do not use space sharing — they are converted independently. Suggest adding APFS volumes to an existing container and manually copying the files over.
    • EFI driver embedded into each APFS volume, allowing boot support even for encrypted drives in virtual machines
    • FileVault conversion preserves existing recovery key and passwords. Snapshots are encrypted even if they were taken before enabling FileVault.
    • All Fusion Drive metadata is pinned to the SSD
    • Defragmentation support for spinning hard drives only (never on SSDs)
  • APFS and Time Machine

    • Since Lion, Mobile Time Machine locally caches some backups so you don’t need the external drive all the time
    • Was implemented with 2 daemons, including a filesystem overlay. Lots of complexity: O(10,000) lines of code.
    • In High Sierra, it’s been reimplemented on top of APFS snapshots
    • Make a backup in O(1) time: tmutil snapshot
    • List the volumes with mount. The ones beginning with com.apple.TimeMachine are the hourly APFS snapshots being taken by Time Machine.
    • Unmount all snapshots to put it into a cold state: tmutil unmountLocalSnapshots /. Then try to enter Time Machine and it will load very fast (they are mounted lazily).
    • Local restores are also O(1) because we just make a COW reference to existing blocks on disk
  • APFS and Finder

    • Fast copying using COW clones
    • If a bunch of clones referencing the same blocks are copied to another APFS container, the cloning relationship is preserved
  • APIs

    • NSFileManager: it’s already built in
    • Syscall: renamex_np()
    • C API: copyfile() with the COPYFILE_CLONE option (see the sketch after this list)
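  • A sketch of both clone paths (the paths are placeholders; both calls produce constant-time copy-on-write clones when source and destination are on the same APFS volume)

      import Darwin
      import Foundation

      let src = "/Volumes/Data/big-file.mov"        // placeholder paths
      let dst = "/Volumes/Data/big-file copy.mov"

      // High level: FileManager clones automatically on APFS
      try? FileManager.default.copyItem(atPath: src, toPath: dst)

      // Low level: copyfile() with COPYFILE_CLONE falls back to a normal copy when cloning isn’t possible
      copyfile(src, dst, nil, copyfile_flags_t(COPYFILE_CLONE))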

Introducing Business Chat (Session 240)

  • More efficient way to communicate with customers (vs phone calls)

    • Chat, with persistent history
    • Pay for things using Apple Pay
    • Message buttons prominently featured in Safari, Maps, and Siri
  • Intents

    • Can also create conversations by links on your website
    • Parameters in the URL can be used to store what product they want to buy, or which location / business unit the customer contacted
  • Conversations

    • Customer sends the first message. Business can send messages after that, but the customer can always mute or delete the conversation.
    • Customer is assigned a unique identifier (pseudonymous) until the customer chooses to provide more information.
    • Attachments (photos and video) supported
    • Built-in features even if you don’t have an app: appointment picker, list picker, Apple Pay. Add custom features using iMessage Apps.
  • Testing

    • Works with customer service platforms (CSPs) that have support for talking to Apple’s Business Chat server
    • Beta users can’t see if a business supports chat yet, but you can whitelist employees
    • Apple provides a Business Chat Sandbox for testing if your CSP doesn’t have support yet
    • Register for early access

Working with HEIF and HEVC (Session 511)

  • Capturing HEVC

    • Check for available video codec types (for example, AVCapturePhotoOutput). If HEVC is supported, it will be the first item in the array, so it gets used by default (see the sketch at the end of this section).
    • Get used to dealing with HEVC content
  • Encoding HEVC

    • AVAssetWriter has new presets
  • Hierarchical frame encoding

    • I-frames contain full information. P-frames refer to previous frames, and B-frames can refer to previous and future frames.
    • When dropping frames, a frame can only be dropped if no other frame depends on it
    • If you want to be able to drop a lot of frames, the P-frames must depend on frames far in the past, which makes compression suffer
    • Hierarchical encoding in HEVC makes the referenced frames closer
    • Opt into compatible high frame rate content in Video Toolbox: set the base layer frame rate (30 fps) and expected frame rate (the actual content frame rate, such as 120 fps).
  • What is High Efficiency Image File Format (HEIF)?

    • Proposed 2013, ratified in summer 2015 (only 1.5 years later!)
    • Uses HEVC intra-encoding, so average 2x smaller images than JPEG
    • Supports auxiliary images, such as alpha or depth maps
    • Store image sequences, such as bursts or focus stacks
    • EXIF and XMP for metadata
    • Arbitrarily large files with efficient tile loading, allowing gigapixel panos to be loaded while using O(100 MB) of memory
  • Low level access to HEIF

    • CGImageSource supports HEIF images, and infers type from file extension
    • Tiling support: get metadata from CGImageSourceCopyPropertiesAtIndex, then specify area with CGImage.cropping(). CG will only decode the relevant tiles.
    • CGImageDestinationCreate() will infer format from extension, though it returns nil if the device does not have hardware encoder
  • High level access to HEIF (PhotoKit)

    • Required to render output as JPEG or H.264, regardless of input format
    • Don’t need anything for Live Photo editing, since editing code doesn’t render stuff directly
  • Capturing HEIF

    • Must use AVCapturePhotoOutput to get HEIF
    • It used to deliver a CMSampleBuffer, but unlike JPEG/JFIF, HEIF is a container format that may hold many codecs
    • Introducing AVCapturePhoto, which encapsulates the photo, preview, auxiliary data, settings, and more
    • Every captured image is now automatically put into a container (HEIF, JFIF, TIFF, or DNG)
    • Photos captured during movies will contend for the same HEVC encoder block. Video is given priority, so photos will take longer and be less compressed. Recommendation: JPEG for photos captured during video.
    • HEVC encode takes longer than JPEG, but it’s still fast enough for 10 fps burst. However, faster bursts should use JPEG.
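  • A sketch of requesting HEVC-compressed HEIF from AVCapturePhotoOutput, falling back to JPEG (assumes photoOutput is already attached to a configured, running session)

      import AVFoundation

      class PhotoCapturer: NSObject, AVCapturePhotoCaptureDelegate {
          let photoOutput = AVCapturePhotoOutput()   // assumed to be added to a running AVCaptureSession

          func capture() {
              // If HEVC is supported it is the preferred (first) codec; otherwise fall back to JPEG
              let codec: AVVideoCodecType = photoOutput.availablePhotoCodecTypes.contains(.hevc) ? .hevc : .jpeg
              let settings = AVCapturePhotoSettings(format: [AVVideoCodecKey: codec])
              photoOutput.capturePhoto(with: settings, delegate: self)
          }

          func photoOutput(_ output: AVCapturePhotoOutput, didFinishProcessingPhoto photo: AVCapturePhoto,
                           error: Error?) {
              // AVCapturePhoto wraps the compressed image, preview, depth data, and settings
              let data = photo.fileDataRepresentation()   // HEIF (.heic) or JPEG container bytes
              _ = data
          }
      }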
