Vision in Video Streams

1. Vision at WWDC 2018

Last year iOS 11 introduced the Vision framework, giving developers an easy way to do image recognition. I had hoped this year would bring more image-processing capabilities, but judging from WWDC 2018, Apple has not upgraded Vision in any major way: the feature set is unchanged, and iOS 12 merely adds some revision constants for existing requests (a short sketch after the list shows how they are used), for example:

* VNDetectFaceLandmarksRequestRevision1
* VNDetectFaceLandmarksRequestRevision2
* VNDetectHorizonRequestRevision1

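A minimal sketch of how these constants are used, assuming iOS 12 and the `revision` property that Vision's requests gained there:

```
// Pin a request to a specific algorithm revision instead of letting
// Vision pick the default for the linked SDK (iOS 12+ only).
VNDetectFaceLandmarksRequest *landmarksRequest = [[VNDetectFaceLandmarksRequest alloc] init];
if (@available(iOS 12.0, *)) {
    landmarksRequest.revision = VNDetectFaceLandmarksRequestRevision2;
}
```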
As for using Vision itself, there were only two session segments, covering two scenarios:

* Vision with Core ML

* Object Tracking in Vision

Neither scenario is hard to use; let's look at each through a concrete demo.

2. Calling Core ML from Vision

At the conference Apple showed a demo in which Vision calls a Core ML model to recognize and name objects in the camera's live video stream. Let's build one of our own.

Step 1: Build a camera with AVFoundation

```
- (void)initAVCapturWritterConfig
{
    self.session = [[AVCaptureSession alloc] init];
    // Video input: default camera, with continuous autofocus when supported
    AVCaptureDevice *videoDevice = [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeVideo];
    if (videoDevice.isFocusPointOfInterestSupported && [videoDevice isFocusModeSupported:AVCaptureFocusModeContinuousAutoFocus]) {
        [videoDevice lockForConfiguration:nil];
        [videoDevice setFocusMode:AVCaptureFocusModeContinuousAutoFocus];
        [videoDevice unlockForConfiguration];
    }
    AVCaptureDeviceInput *cameraDeviceInput = [[AVCaptureDeviceInput alloc] initWithDevice:videoDevice error:nil];
    if ([self.session canAddInput:cameraDeviceInput]) {
        [self.session addInput:cameraDeviceInput];
    }
    // Video data output: deliver BGRA frames that we can hand to Vision
    self.videoOutPut = [[AVCaptureVideoDataOutput alloc] init];
    NSDictionary *outputSettings = @{(id)kCVPixelBufferPixelFormatTypeKey : @(kCVPixelFormatType_32BGRA)};
    [self.videoOutPut setVideoSettings:outputSettings];
    if ([self.session canAddOutput:self.videoOutPut]) {
        [self.session addOutput:self.videoOutPut];
    }
    self.videoConnection = [self.videoOutPut connectionWithMediaType:AVMediaTypeVideo];
    self.videoConnection.enabled = NO;
    [self.videoConnection setVideoOrientation:AVCaptureVideoOrientationPortrait];
    // Preview layer for on-screen display
    self.previewLayer = [[AVCaptureVideoPreviewLayer alloc] initWithSession:self.session];
    [self.previewLayer setVideoGravity:AVLayerVideoGravityResizeAspectFill];
}

```
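One thing the listing leaves out is wiring the output to the `captureOutput:didOutputSampleBuffer:fromConnection:` delegate callback used in step 4 and actually starting the session. A minimal sketch, assuming `self` adopts `AVCaptureVideoDataOutputSampleBufferDelegate` (the queue name is mine):

```
// Deliver frames to self on a serial queue; without this the
// captureOutput:didOutputSampleBuffer:fromConnection: callback never fires.
dispatch_queue_t videoQueue = dispatch_queue_create("com.demo.videoQueue", DISPATCH_QUEUE_SERIAL);
[self.videoOutPut setSampleBufferDelegate:self queue:videoQueue];
self.videoConnection.enabled = YES;  // re-enable the connection before capturing
[self.session startRunning];
```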

Step 2: Import the Core ML model

(Image: coreml.png — the MobileNet .mlmodel added to the Xcode project)
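Dragging a `.mlmodel` file such as MobileNet into the project makes Xcode generate a wrapper class of the same name; its `model` property exposes the underlying `MLModel` that Vision wraps in the next step. A quick sketch, assuming Apple's stock MobileNet model:

```
// Xcode auto-generates the MobileNet class from MobileNet.mlmodel;
// its `model` property is what VNCoreMLModel wraps.
NSError *modelError = nil;
VNCoreMLModel *vnModel = [VNCoreMLModel modelForMLModel:[[MobileNet alloc] init].model error:&modelError];
```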

Step 3: Initialize the Vision request

```
    // Object classification: wrap the Core ML model for Vision
    VNCoreMLModel *vnModel = [VNCoreMLModel modelForMLModel:[MobileNet new].model error:nil];
    __weak typeof(self) weakSelf = self;  // self retains the request, so avoid a retain cycle
    self.coreMLRequest = [[VNCoreMLRequest alloc] initWithModel:vnModel completionHandler:^(VNRequest * _Nonnull request, NSError * _Nullable error) {
        // Results come back sorted by confidence; take the top classification
        VNCoreMLRequest *coreR = (VNCoreMLRequest *)request;
        VNClassificationObservation *firstObservation = [coreR.results firstObject];
        dispatch_async(dispatch_get_main_queue(), ^{
            if (firstObservation) {
                weakSelf.googleLabel.text = firstObservation.identifier;
            }
            else {
                weakSelf.googleLabel.text = @"";
            }
        });
    }];
    self.coreMLRequest.imageCropAndScaleOption = VNImageCropAndScaleOptionCenterCrop;

```
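The handler above surfaces whatever classification comes back first. Since Vision sorts results by confidence, a reasonable refinement is to suppress weak guesses; the 0.3 cutoff below is an illustrative choice, not a Vision default:

```
// Inside the completion handler: only show reasonably confident labels.
if (firstObservation && firstObservation.confidence > 0.3) {
    weakSelf.googleLabel.text = [NSString stringWithFormat:@"%@ (%.0f%%)",
                                 firstObservation.identifier,
                                 firstObservation.confidence * 100];
} else {
    weakSelf.googleLabel.text = @"";
}
```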

Step 4: Run the request from the camera callback

```
- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection
{
    // imageFromSampleBuffer:, scaleToSize: and pixelBufferFromCGImage: are
    // category helpers from the demo project
    UIImage *image = [UIImage imageFromSampleBuffer:sampleBuffer];
    UIImage *scaledImage = [image scaleToSize:CGSizeMake(224, 224)];  // MobileNet expects 224x224 input
    CVPixelBufferRef buffer = [image pixelBufferFromCGImage:scaledImage];
    VNImageRequestHandler *handler = [[VNImageRequestHandler alloc] initWithCVPixelBuffer:buffer options:@{}];
    NSError *error;
    [handler performRequests:@[self.coreMLRequest] error:&error];
}
```
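Strictly speaking, the UIImage round-trip and the manual 224x224 scale are unnecessary: the `imageCropAndScaleOption` set in step 3 already tells Vision to crop and scale each frame to the model's input size. A simplified callback under that assumption (this shortcut is mine, not from the original demo, though the face path in section 3 uses the same pattern):

```
// Hand the camera's pixel buffer straight to Vision and let
// imageCropAndScaleOption do the resizing for the model.
- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection
{
    CVPixelBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    VNImageRequestHandler *handler = [[VNImageRequestHandler alloc] initWithCVPixelBuffer:pixelBuffer options:@{}];
    NSError *error = nil;
    [handler performRequests:@[self.coreMLRequest] error:&error];
}
```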

Step 5: Results

Every captured frame is run through the MobileNet model, and the top classification is shown in the label at the bottom left. That completes calling Core ML through Vision.

(GIF: model.gif — live object classification shown in the bottom-left label)

3. Object Tracking with Vision

Step 1: The face request

Instead of VNTrackObjectRequest, we use VNDetectFaceLandmarksRequest here to build a face-tracking sticker effect; the calling pattern is the same. On top of the camera above, create a face detection request:

```
    __weak typeof(self) weakSelf = self;  // self retains the request, so avoid a retain cycle
    self.faceRequest = [[VNDetectFaceLandmarksRequest alloc] initWithCompletionHandler:^(VNRequest * _Nonnull request, NSError * _Nullable error) {
        VNDetectFaceLandmarksRequest *faceRequest = (VNDetectFaceLandmarksRequest *)request;
        VNFaceObservation *firstObservation = [faceRequest.results firstObject];
        dispatch_async(dispatch_get_main_queue(), ^{
            if (firstObservation) {
                // boundingBox is normalized to [0,1] with a bottom-left origin;
                // convert to view points, then flip into UIKit's top-left space
                CGRect boundingBox = [firstObservation boundingBox];
                CGRect rect = VNImageRectForNormalizedRect(boundingBox, weakSelf.realTimeView.frame.size.width, weakSelf.realTimeView.frame.size.height);
                CGRect frame = CGRectMake(weakSelf.realTimeView.frame.size.width - rect.origin.x - rect.size.width, weakSelf.realTimeView.frame.size.height - rect.origin.y - rect.size.height, rect.size.width, rect.size.height);
                weakSelf.maskView.frame = frame;
                weakSelf.maskView.hidden = NO;
            }
            else {
                weakSelf.maskView.hidden = YES;
            }
        });
    }];
```
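The double flip in the frame computation deserves a note: `boundingBox` is normalized to [0,1] with the origin at the image's bottom-left, while UIKit uses a top-left origin, and the portrait camera image is mirrored relative to the preview. A helper that makes the same conversion explicit (the name is mine; exact orientation handling depends on camera position and `videoOrientation`):

```
// Vision rect (normalized, bottom-left origin) -> UIKit frame (points, top-left origin).
static CGRect UIKitRectForVisionRect(CGRect boundingBox, CGSize containerSize) {
    CGRect rect = VNImageRectForNormalizedRect(boundingBox,
                                               (size_t)containerSize.width,
                                               (size_t)containerSize.height);
    return CGRectMake(containerSize.width  - CGRectGetMaxX(rect),
                      containerSize.height - CGRectGetMaxY(rect),
                      rect.size.width, rect.size.height);
}
```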

Step 2: Switching modes in the camera callback

On top of the camera callback above, all we need is a button that switches the request mode.

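A minimal toggle action might look like this (the `switchModeTapped:` name is mine; `coreMlMode` is the BOOL property the callback branches on):

```
// Hypothetical button action: flip the mode flag and reset the overlays.
- (IBAction)switchModeTapped:(UIButton *)sender {
    self.coreMlMode = !self.coreMlMode;
    self.maskView.hidden = YES;
    self.googleLabel.text = @"";
}
```

The callback then branches on that flag: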
```
- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection
{
    if (self.coreMlMode) {
        // Core ML path: scale through UIImage as in step 4 above
        UIImage *image = [UIImage imageFromSampleBuffer:sampleBuffer];
        UIImage *scaledImage = [image scaleToSize:CGSizeMake(224, 224)];
        CVPixelBufferRef buffer = [image pixelBufferFromCGImage:scaledImage];
        VNImageRequestHandler *handler = [[VNImageRequestHandler alloc] initWithCVPixelBuffer:buffer options:@{}];
        NSError *error;
        [handler performRequests:@[self.coreMLRequest] error:&error];
    }
    else {
        // Face path: hand the raw pixel buffer straight to Vision
        CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
        VNImageRequestHandler *handler = [[VNImageRequestHandler alloc] initWithCVPixelBuffer:(CVPixelBufferRef)imageBuffer options:@{}];
        NSError *error;
        [handler performRequests:@[self.faceRequest] error:&error];
    }
}
```

Step 3: Results

Pointing the camera at a colleague's face, the request finds the face's position; setting the pre-made maskView's frame to the matching rect keeps the mask glued to the face, and when no face is detected the mask is hidden. The effect looks like this:

(GIF: face.gif — the mask view tracking a face in the live preview)

4. Wrap-up

The Vision framework wraps a handful of vision-processing scenarios behind a very simple API, but that simplicity is also its limit: common cases work well, while more complex functionality is out of reach. Hopefully Apple will provide richer APIs in the future, such as image style transfer, so our apps can do ever more. The demo is linked here for anyone interested: iOS Vision in Video Streams

