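Metal and Graphics Rendering, Part 5: Implementing a Chained Architecture

Zero. Preface

In the previous articles, every rendering command was a single render pass. When we need to reuse the result of an earlier render, a single pass clearly cannot meet the need, and this is where a chained structure comes in. In a chained structure, the output of one render pass can be used again as an input, and the final result is rendered to the screen. As an example, we again take the case from Metal and Graphics Rendering, Part 2: Rendering Transparent Images, where the goal is to produce a transparent image.

Our earlier implementation did everything in a single render pass:

That approach piles every rendering operation into one pass, so all of the Objective-C (OC) layer code and Metal layer code ends up in one place, which is hard to maintain.

This time, instead of cramming the code into a single pass, we implement the effect with a chained structure:

The chained code is more intuitive and concise. More importantly, whether we later want to reuse the Picture's texture or the texture of some Filter, we only need to attach one more link to that node to reuse it.

Speaking of chained structures, one has to mention the well-known library GPUImage3, which supports rendering once and using the result several times. However, that library is written in Swift. On top of that, GPUImage3 has fatal high-CPU and high-memory problems when processing video: memory blew up before a single video had finished playing. I searched the issues and found that someone had raised the problem back in 2019, but the author's reply was only "We still have a lot of work to do on the inputs and outputs to get this to be ready for regular use."

What a letdown. An open-source library in this state probably cannot cover even the basic requirements for playing effects, and our project still uses OC anyway. With no other choice, I borrowed the ideas of those who came before, hand-rolled a chained framework myself, and filled in the open-source library's pitfalls along the way.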

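One. Basic Architecture

The workflow of the chained structure is shown in the figure below:

The basic components that implement this workflow are:

The base library MetalKit, the render layer Renderer, the texture producer Provider, and the texture consumer Consumer. Their relationship is shown in the figure below.

Two. Rendering Principles and Basic Components

Before introducing the components, it is worth briefly reviewing the flow of a single render operation, that is, how an input source (UIImage) is processed step by step and rendered to the screen: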

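--- Initialization phase ---

  1. Configure the Device, Queue, and MTKView (initialization phase, done only once)
  2. Configure the PipelineState (maps to the functions in the .metal file, done only once)
  3. Create resources and load the MTLTexture (done only once)
  4. Set up the vertex MTLBuffer (preferably only once)

--- Render phase: the drawInMTKView callback, once per frame ---

  1. Get a CommandBuffer from the Queue
  2. Configure a CommandEncoder from the CommandBuffer and a RenderPassDescriptor
  3. Encode the buffers [if needed, Threadgroups can be used to group the data being encoded]

--- Finish: once encoding is complete, submit the command buffer to the GPU ---

  1. Commit to the Queue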


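As we can see, in a single render operation some parts are initialized only once, while others must be created and read frequently.

In this chained structure, for one chained render (from UIImage to MTKView), the things we need to create only once are: Device, CommandQueue, CommandBuffer, Library, and Pipeline.

What is used repeatedly is the CommandEncoder. After several encode passes, when the chain reaches the MTKView, the CommandBuffer that holds all of the encoded operations for this render is committed so that the GPU can render.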
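The "create once, encode many times, commit once" idea can be sketched as follows. This is not the article's HobenMetalKit code: the helper names EncodeClearPass and RunTwoPassesWithOneCommit are hypothetical, and the two passes only clear their targets instead of drawing anything. The sketch is only meant to show that several render encoders can be recorded into the same CommandBuffer before a single commit.

```objc
#import <Foundation/Foundation.h>
#import <Metal/Metal.h>

// Hypothetical helper: records one render pass that only clears `target`.
static void EncodeClearPass(id<MTLCommandBuffer> commandBuffer, id<MTLTexture> target) {
    MTLRenderPassDescriptor *pass = [MTLRenderPassDescriptor renderPassDescriptor];
    pass.colorAttachments[0].texture = target;
    pass.colorAttachments[0].loadAction = MTLLoadActionClear;
    pass.colorAttachments[0].storeAction = MTLStoreActionStore;
    pass.colorAttachments[0].clearColor = MTLClearColorMake(0, 0, 0, 0);

    id<MTLRenderCommandEncoder> encoder =
        [commandBuffer renderCommandEncoderWithDescriptor:pass];
    // A real link in the chain would set a pipeline, buffers and textures here.
    [encoder endEncoding];
}

// Hypothetical demo: two passes, one command buffer, one commit.
// Returns NO when no Metal device is available or the GPU reports a failure.
static BOOL RunTwoPassesWithOneCommit(void) {
    id<MTLDevice> device = MTLCreateSystemDefaultDevice();
    if (!device) {
        return NO;
    }
    id<MTLCommandQueue> queue = [device newCommandQueue];

    MTLTextureDescriptor *desc =
        [MTLTextureDescriptor texture2DDescriptorWithPixelFormat:MTLPixelFormatBGRA8Unorm
                                                           width:64
                                                          height:64
                                                       mipmapped:NO];
    desc.usage = MTLTextureUsageRenderTarget | MTLTextureUsageShaderRead;
    id<MTLTexture> intermediateTarget = [device newTextureWithDescriptor:desc];
    id<MTLTexture> finalTarget = [device newTextureWithDescriptor:desc];

    id<MTLCommandBuffer> commandBuffer = [queue commandBuffer];
    EncodeClearPass(commandBuffer, intermediateTarget); // e.g. a Filter link
    EncodeClearPass(commandBuffer, finalTarget);        // e.g. the last link
    [commandBuffer commit];                             // the only commit
    [commandBuffer waitUntilCompleted];                 // blocking, for the demo only
    return commandBuffer.status == MTLCommandBufferStatusCompleted;
}
```

In a real chain the blocking waitUntilCompleted call would not be used on the render path; it is here only so the demo can report a result.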

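1. The base library MetalKit

MetalKit (the HobenMetalKit class in the code below, not Apple's MetalKit framework) manages and stores the objects that only need to be created once. Almost all of them are lazy-loaded, which avoids creating objects repeatedly during rendering and wasting CPU and memory.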

- (id<MTLDevice>)device {
    if (!_device) {
        _device = MTLCreateSystemDefaultDevice();
    }
    return _device;
}

- (id<MTLCommandQueue>)commandQueue {
    if (!_commandQueue) {
        _commandQueue = [self.device newCommandQueue];
    }
    return _commandQueue;
}

- (id<MTLCommandBuffer>)commandBuffer {
    if (!_commandBuffer) {
        _commandBuffer = self.commandQueue.commandBuffer;
    }
    return _commandBuffer;
}

- (id<MTLLibrary>)library {
    if (!_library) {
        NSString *libPath = [METAL_BUNDLE pathForResource:@"alpha_video_renderer" ofType:@"metallib"];
        if (!libPath) {
            NSAssert(NO, @"[HobenMetalKit] libPath is nil!");
            [CCAlphaVideoUtils handleMetalSetupError:CCAlphaVideoMetalErrorTypeLibLoadError reason:@"libPath is nil"];
            HobenLog(@"[HobenMetalKit] libPath is nil!");
            return nil;
        }
        NSError *error;
        id <MTLLibrary> defaultLibrary = [MTL_DEVICE newLibraryWithFile:libPath error:&error];
        if (error || !defaultLibrary) {
            [CCAlphaVideoUtils handleMetalSetupError:CCAlphaVideoMetalErrorTypeLibLoadError reason:@"defaultLibrary load failed"];
            HobenLog(@"[HobenMetalKit] newLibraryWithFile error: %@", error);
            return nil;
        }
        _library = defaultLibrary;
    }
    return _library;
}

- (NSMutableDictionary<NSString *,id<MTLRenderPipelineState>> *)pipelineDict {
    if (!_pipelineDict) {
        _pipelineDict = [NSMutableDictionary dictionary];
    }
    return _pipelineDict;
}

Pipeline management is also placed in MetalKit, with a cache added, again to avoid creating pipelines repeatedly during rendering.

+ (id <MTLRenderPipelineState>)pipelineStateWithVertexName:(NSString *)vertexName fragmentName:(NSString *)fragmentName {
    NSMutableDictionary *pipelineDict = [HobenMetalKit sharedInstance].pipelineDict;
    NSString *vName = vertexName ?: @"oneInputVertex";
    NSString *fName = fragmentName ?: @"passthroughFragment";
    NSString *key = [NSString stringWithFormat:@"%@_%@", vName, fName];
    id <MTLRenderPipelineState> cachedPipeline = pipelineDict[key];
    if (cachedPipeline) {
        [HobenMetalKit sharedInstance].didLoadMetalLibSuccess = YES;
        return cachedPipeline;
    }
    MTLRenderPipelineDescriptor *pipelineDesc = [MTLRenderPipelineDescriptor new];
    id <MTLLibrary> library = [self sharedLibrary];
    id <MTLFunction> vertexFunction = [library newFunctionWithName:vName];
    id <MTLFunction> fragmentFunction = [library newFunctionWithName:fName];
    if (!vertexFunction || !fragmentFunction) {
        NSAssert(NO, @"function is nil");
        return nil;
    }
    pipelineDesc.vertexFunction = vertexFunction;
    pipelineDesc.fragmentFunction = fragmentFunction;
    pipelineDesc.colorAttachments[0].pixelFormat = MTLPixelFormatBGRA8Unorm;

    NSError *pipelineError = nil;
    // Pass the error pointer so that a failure is actually reported.
    id <MTLRenderPipelineState> pipelineState = [[self sharedDevice] newRenderPipelineStateWithDescriptor:pipelineDesc error:&pipelineError];
    if (pipelineError || !pipelineState) {
        [CCAlphaVideoUtils handleMetalSetupError:CCAlphaVideoMetalErrorTypeLibLoadError reason:@"pipelinestate error"];
        HobenLog(@"[CCAlphaVideoMetalFunctionLoader] pipelinestate error: %@", pipelineError);
        return nil;
    }
    [HobenMetalKit sharedInstance].didLoadMetalLibSuccess = YES;
    // Cache only pipelines that were created successfully.
    pipelineDict[key] = pipelineState;
    return pipelineState;
}

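2. The render layer Renderer

The main job of the render layer is to take the incoming Pipeline, vertex coordinates, buffers, and input textures, encode them, and produce the output texture.

/**
 A single render operation.
 @param pipelineState render pipeline
 @param inputTextures input textures; each object wraps the texture data and its texture coordinates
 @param imageVertices vertex coordinates; pass nil to use the default vertex coordinates
 @param vertexBuffers array of buffers for the vertex shader
 @param fragmentBuffers array of buffers for the fragment shader
 @param loadAction load or clear the previously rendered content; defaults to MTLLoadActionClear
 @param outputTexture output texture; can be reused
 */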
+ (void)renderQuad:(id <MTLRenderPipelineState>)pipelineState
     inputTextures:(NSArray <HobenMetalTexture *> *)inputTextures
     imageVertices:(nullable NSArray *)imageVertices
     vertexBuffers:(nullable NSArray <id<MTLBuffer>> *)vertexBuffers
   fragmentBuffers:(nullable NSArray <id<MTLBuffer>> *)fragmentBuffers
        loadAction:(MTLLoadAction)loadAction
     outputTexture:(id <MTLTexture>)outputTexture {
    
    NSAssert(!imageVertices || imageVertices.count == 8, @"imageVertices.count must be 8");
    
    AUTO_RELEASE_BEGIN
        
    if (!pipelineState) {
        NSAssert(NO, @"pipelineState is nil");
        return;
    }
    NSArray *defaultImageVertices = @[
        @-1.0, @1.0,
        @1.0, @1.0,
        @-1.0, @-1.0,
        @1.0, @-1.0,
    ];
    NSArray *vertice = imageVertices ?: defaultImageVertices;
    float verticeCoordinates[8] = {
        [vertice[0] floatValue], [vertice[1] floatValue],
        [vertice[2] floatValue], [vertice[3] floatValue],
        [vertice[4] floatValue], [vertice[5] floatValue],
        [vertice[6] floatValue], [vertice[7] floatValue],
    };
    id <MTLBuffer> vertexBuffer = [[HobenMetalKit sharedDevice] newBufferWithBytes:verticeCoordinates length:sizeof(verticeCoordinates) options:MTLResourceStorageModeShared];
    
    MTLRenderPassDescriptor *renderPass = [MTLRenderPassDescriptor renderPassDescriptor];
    renderPass.colorAttachments[0].texture = outputTexture;
    renderPass.colorAttachments[0].clearColor = MTLClearColorMake(0, 0, 0, 0);
    renderPass.colorAttachments[0].storeAction = MTLStoreActionStore;
    renderPass.colorAttachments[0].loadAction = loadAction;
    
    id <MTLRenderCommandEncoder> renderEncoder = [MTL_COMMAND_BUFFER renderCommandEncoderWithDescriptor:renderPass];
    [renderEncoder setRenderPipelineState:pipelineState];
    [renderEncoder setVertexBuffer:vertexBuffer offset:0 atIndex:0];
    
    for (NSInteger i = 0; i < vertexBuffers.count; i++) {
        id <MTLBuffer> extraVertexBuffer = vertexBuffers[i];
        [renderEncoder setVertexBuffer:extraVertexBuffer offset:0 atIndex:1 + i];
    }
    
    for (NSInteger i = 0; i < inputTextures.count; i++) {
        HobenMetalTexture *texture = inputTextures[i];
        if (![texture isKindOfClass:[HobenMetalTexture class]]) {
            NSAssert(NO, @"texture class must be HobenMetalTexture");
            [renderEncoder setVertexBuffer:nil offset:0 atIndex:1 + i + vertexBuffers.count];
            [renderEncoder setFragmentTexture:nil atIndex:i];
            continue;
        }
        NSArray *textureCoor = texture.textureCoordinates;
        NSAssert(textureCoor.count == 8, @"textureCoor.count must be 8");
        float textureCoordinates[8] = {
            [textureCoor[0] floatValue], [textureCoor[1] floatValue],
            [textureCoor[2] floatValue], [textureCoor[3] floatValue],
            [textureCoor[4] floatValue], [textureCoor[5] floatValue],
            [textureCoor[6] floatValue], [textureCoor[7] floatValue],
        };
        id <MTLBuffer> textureBuffer = [[HobenMetalKit sharedDevice] newBufferWithBytes:textureCoordinates length:sizeof(textureCoordinates) options:MTLResourceStorageModeShared];
        [renderEncoder setVertexBuffer:textureBuffer offset:0 atIndex:1 + i + vertexBuffers.count];
        [renderEncoder setFragmentTexture:texture.texture atIndex:i];
    }
    
    for (NSInteger i = 0; i < fragmentBuffers.count; i++) {
        id <MTLBuffer> fragmentBuffer = fragmentBuffers[i];
        [renderEncoder setFragmentBuffer:fragmentBuffer offset:0 atIndex:i];
    }
    [renderEncoder drawPrimitives:MTLPrimitiveTypeTriangleStrip vertexStart:0 vertexCount:4];
    [renderEncoder endEncoding];
        
    AUTO_RELEASE_END
}

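3. The texture producer Provider

The producer's main job is to hand the texture obtained from the render layer to the corresponding consumers so that they can carry out the next step. Here we define the protocol that a Provider must conform to: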

@protocol HobenMetalProviderProtocol <NSObject>

- (void)transmitTexture:(id<MTLTexture>)texture
                 target:(id<HobenMetalConsumerProtocol>)target
                  index:(NSInteger)index;

@end

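Next we define MetalOutput, a texture producer that conforms to the Provider protocol. It mainly manages the Consumers it owns (added through the addTarget methods) and notifies the corresponding Consumer at the right moment so that it calls the appropriate method.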

@interface HobenMetalOutput : NSObject <HobenMetalProviderProtocol> {
    id<MTLTexture> outputTexture;
}

#pragma mark - Public Method

- (void)addTarget:(id <HobenMetalConsumerProtocol>)target {
    NSInteger index = 0;
    if ([target respondsToSelector:@selector(nextAvailableTextureIndex)]) {
        index = [target nextAvailableTextureIndex];
    }
    [self addTarget:target atIndex:index];
}

- (void)addTarget:(id <HobenMetalConsumerProtocol>)target atIndex:(NSInteger)index {
    if (!target) {
        return;
    }
    if ([self.targets containsObject:target]) {
        return;
    }
    if ([target respondsToSelector:@selector(textureIndexUnavailable:)]) {
        [target textureIndexUnavailable:index];
    }
    [self.targets addObject:target];
    [self.targetTextureIndices addObject:@(index)];
}

- (void)transmitTextureToAllTargets:(id<MTLTexture>)texture {
    for (id <HobenMetalConsumerProtocol> target in self.targets) {
        NSInteger indexOfObject = [self.targets indexOfObject:target];
        NSInteger textureIndex = [[self.targetTextureIndices objectAtIndex:indexOfObject] integerValue];
        [self transmitTexture:texture target:target index:textureIndex];
    }
}

#pragma mark - HobenMetalProviderProtocol

- (void)transmitTexture:(id<MTLTexture>)texture target:(id<HobenMetalConsumerProtocol>)target index:(NSInteger)index {
    [target newTextureAvailable:texture index:index];
}

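In this architecture, the producers are HobenMetalPicture (obtains a texture from a UIImage), HobenMetalMovieReader (obtains a texture from a CVPixelBufferRef), and HobenMetalFilter (obtains a texture from the previous link in the chain). After obtaining a texture, they process it and pass the output to the next link in the chain.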

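4. The texture consumer Consumer

The consumer's main job is to carry out further operations based on the texture supplied by the Provider. Here we also define the protocol that a Consumer must conform to: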

@protocol HobenMetalConsumerProtocol <NSObject>

- (void)newTextureAvailable:(id <MTLTexture>)texture index:(NSInteger)index;

@optional

- (NSInteger)nextAvailableTextureIndex;

- (void)textureIndexUnavailable:(NSInteger)index;

@end

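In this architecture, the consumers are HobenMetalRenderView (submits the render commands based on the texture it receives) and HobenMetalFilter (performs this link's encoding based on the texture it receives). Their responsibility is to encode at their own link, using the texture provided by the Provider above them.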
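The two optional protocol methods, nextAvailableTextureIndex and textureIndexUnavailable:, are declared above but their implementation is not shown in this article. The following is a minimal sketch of one way a multi-input consumer might implement them, assuming a "lowest free slot first" policy. ToyConsumer and ToyClaimSlot are hypothetical names, and an NSString stands in for the MTLTexture so that the example needs only Foundation.

```objc
#import <Foundation/Foundation.h>

// Hypothetical consumer, used only for illustration; it is not HobenMetalFilter.
@interface ToyConsumer : NSObject
// Records "index:payload" strings in arrival order.
@property (nonatomic, readonly) NSMutableArray<NSString *> *received;
- (NSInteger)nextAvailableTextureIndex;
- (void)textureIndexUnavailable:(NSInteger)index;
- (void)newTextureAvailable:(id)texture index:(NSInteger)index;
@end

@implementation ToyConsumer {
    NSMutableIndexSet *_usedIndices;
}

- (instancetype)init {
    if (self = [super init]) {
        _usedIndices = [NSMutableIndexSet indexSet];
        _received = [NSMutableArray array];
    }
    return self;
}

// Assumed policy: hand out the lowest input slot nobody has claimed yet.
- (NSInteger)nextAvailableTextureIndex {
    NSUInteger index = 0;
    while ([_usedIndices containsIndex:index]) {
        index++;
    }
    return (NSInteger)index;
}

// Called by the producer once it has claimed a slot.
- (void)textureIndexUnavailable:(NSInteger)index {
    [_usedIndices addIndex:(NSUInteger)index];
}

- (void)newTextureAvailable:(id)texture index:(NSInteger)index {
    [_received addObject:[NSString stringWithFormat:@"%ld:%@", (long)index, texture]];
}

@end

// Mirrors what -addTarget: does: ask for a slot, then mark it as taken.
static NSInteger ToyClaimSlot(ToyConsumer *consumer) {
    NSInteger index = [consumer nextAvailableTextureIndex];
    [consumer textureIndexUnavailable:index];
    return index;
}
```

Under this assumed policy, the first producer that adds the consumer as a target gets slot 0 and the second gets slot 1. The order in which textures later arrive then no longer matters, because each delivery carries its slot index.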

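Three. The Producers and Consumers

1. Resource processors

A resource processor is a tool that converts existing resource objects (UIImage, CVPixelBufferRef) into textures. Resource processors are Providers: once the resource is converted, the texture can be supplied to the Consumers further down the chain.

HobenMetalPicture uses the texture-loading method provided by MTKTextureLoader and converts the CGImage into a texture at init time.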

- (instancetype)initWithImage:(UIImage *)newImageSource {
    if (self = [self initWithCGImage:newImageSource.CGImage]) {
        
    }
    return self;
}

- (instancetype)initWithCGImage:(CGImageRef)newImageSource {
    if (self = [super init]) {
        [self renderCGImage:newImageSource];
    }
    return self;
}

- (void)renderCGImage:(CGImageRef)cgImage {
    MTKTextureLoader *loader = [[MTKTextureLoader alloc] initWithDevice:MTL_DEVICE];
    NSDictionary *options = @{
        MTKTextureLoaderOptionSRGB : @(NO),
    };
    self.texture = [loader newTextureWithCGImage:cgImage options:options error:nil];
}

When the developer wants to start passing the created texture down the chain, calling the following method is enough:

- (void)processImage {
    [self transmitTextureToAllTargets:self.texture];
}

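HobenMetalMovieReader needs to define its own YUV conversion matrix and add it to the fragment shader buffers. The principle was covered in Metal and Graphics Rendering, Part 3: Transparent-Channel Video; here the earlier logic is simply extracted to be more concise and readable: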

- (BOOL)renderPixelBuffer:(CVPixelBufferRef)pixelBuffer {
    AUTO_RELEASE_BEGIN
    
    id <MTLTexture> textureY = [self textureWithPixelBuffer:pixelBuffer pixelFormat:MTLPixelFormatR8Unorm planeIndex:0];
    id <MTLTexture> textureUV = [self textureWithPixelBuffer:pixelBuffer pixelFormat:MTLPixelFormatRG8Unorm planeIndex:1];
    [self setupMatrixWithPixelBuffer:pixelBuffer];
    
    if (!textureY || !textureUV || !self.convertMatrix) {
        return NO;
    }
    CVPixelBufferLockBaseAddress(pixelBuffer, kCVPixelBufferLock_ReadOnly);
    NSMutableArray *inputTextureArray = [NSMutableArray array];
    for (id <MTLTexture> texture in @[textureY, textureUV]) {
        HobenMetalTexture *inputTexture = [[HobenMetalTexture alloc] initWithTexture:texture];
        [inputTextureArray addObject:inputTexture];
    }
    
    CVPixelBufferUnlockBaseAddress(pixelBuffer, kCVPixelBufferLock_ReadOnly);
    
    if (!outputTexture) {
        outputTexture = [HobenMetalTexture defaultTextureByWidth:textureY.width height:textureY.height];
    }
        
    [HobenMetalKit renderQuad:MTL_PIPELINE(@"oneInputVertex", @"movieFragment") inputTextures:inputTextureArray imageVertices:nil vertexBuffers:nil fragmentBuffers:@[_convertMatrix] outputTexture:outputTexture];
    
    [self transmitTextureToAllTargets:outputTexture];
    
    AUTO_RELEASE_END
    
    return YES;
}

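2. The middle layer Filter

In the chain diagram we can see a very important middle layer, the Filter. It is both a producer and a consumer: it consumes the texture supplied by the previous link, adds the pipeline, buffers, and coordinates it wants to render with, performs this link's render, and supplies the resulting texture to the next link.

A Filter supports multiple input textures. It can provide its own vertex buffers and texture buffers, plus its own Pipeline, to the render layer, and in the end it produces exactly one output.

Since a Filter is both a producer and a consumer, it follows that it is a class that inherits from HobenMetalOutput and also conforms to HobenMetalConsumerProtocol: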

@interface HobenMetalFilter : HobenMetalOutput <HobenMetalConsumerProtocol>
{
    NSMutableArray <HobenMetalTexture *> *inputTextures;
}

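Because a Filter supports multiple inputs, we must wait until all input sources are ready before performing the render. During rendering, if a texture arrives from an upstream Provider and all textures are ready, processing can begin: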

- (void)newTextureAvailable:(id<MTLTexture>)texture index:(NSInteger)index {
    if (!texture) {
        return;
    }
    NSInteger numberOfInputs = MAX(_numberOfInputs, 1);
    
    HobenMetalTexture *inputTexture = [[HobenMetalTexture alloc] initWithTexture:texture];
    inputTexture.textureIndex = index;
    [inputTextures addObject:inputTexture];
    
    if (inputTextures.count < numberOfInputs) {
        return;
    }
    
    if (!outputTexture) {
        outputTexture = [HobenMetalTexture defaultTextureByWidth:texture.width height:texture.height];
    }
    
    [inputTextures sortUsingComparator:^NSComparisonResult(HobenMetalTexture *obj1, HobenMetalTexture *obj2) {
        if (obj1.textureIndex <= obj2.textureIndex) {
            return NSOrderedAscending;
        } else {
            return NSOrderedDescending;
        }
    }];
    [self renderToTextureWithVertices:nil textureCoordinates:nil];
    [inputTextures removeAllObjects];
}

- (void)renderToTextureWithVertices:(NSArray *)vertices textureCoordinates:(NSArray *)textureCoordinates {
    for (HobenMetalTexture *inputTexture in inputTextures) {
        inputTexture.textureCoordinates = textureCoordinates;
    }
    [HobenMetalKit renderQuad:MTL_PIPELINE(_vertexName, _fragmentName) inputTextures:inputTextures imageVertices:vertices outputTexture:outputTexture];
    
    [self transmitTextureToAllTargets:outputTexture];
}

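Note that creating a texture from an MTLTextureDescriptor is a very CPU-intensive operation, so we create outputTexture only once. (This may be why GPUImage3 shows such high CPU usage when rendering video; it tripped me up for a long time.)

renderToTextureWithVertices:textureCoordinates: is factored out here so that developers can customize the vertex or texture coordinates as needed, or implement their own rendering logic. For example, the crop operation CropFilter used this time is implemented this way: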

- (void)calculateCropTextureCoordinates {
    CGFloat minX = _cropRegion.origin.x;
    CGFloat minY = _cropRegion.origin.y;
    CGFloat maxX = CGRectGetMaxX(_cropRegion);
    CGFloat maxY = CGRectGetMaxY(_cropRegion);
    
    _cropTextureCoordinates = @[
        @(minX), @(minY),
        @(maxX), @(minY),
        @(minX), @(maxY),
        @(maxX), @(maxY),
    ];
}

#pragma mark - Override

- (void)renderToTextureWithVertices:(NSArray *)vertices textureCoordinates:(NSArray *)textureCoordinates {
    [super renderToTextureWithVertices:vertices textureCoordinates:_cropTextureCoordinates];
}

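3. The output view

The output view inherits from MTKView. Its responsibility is to display the texture supplied by the previous link. It is a Consumer, and it is the final node that submits the encoded commands to the GPU. This time we do not let the system call drawInMTKView: every frame; instead we decide when to draw ourselves. The code is as follows: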

@interface HobenMetalRenderView : MTKView <HobenMetalConsumerProtocol>

static const NSUInteger MaxFramesInFlight = 3;

- (void)setup {
    // Set enableSetNeedsDisplay to NO and paused to YES so that the developer decides when to draw
    self.enableSetNeedsDisplay = NO;
    self.paused = YES;
    self.autoResizeDrawable = YES;
    self.device = MTL_DEVICE;
    self.opaque = NO;
    _inFlightSemaphore = dispatch_semaphore_create(MaxFramesInFlight);
}

- (void)newTextureAvailable:(id<MTLTexture>)texture index:(NSInteger)index {
    self.drawableSize = CGSizeMake(texture.width, texture.height);
    self.currentTexture = texture;
    [self draw];
}

- (void)drawRect:(CGRect)rect {
    if (!self.currentTexture) {
        return;
    }
    if (!self.currentDrawable) {
        NSAssert(NO, @"drawable is nil");
        return;
    }
    dispatch_semaphore_wait(_inFlightSemaphore, DISPATCH_TIME_FOREVER);
    
    id <MTLCommandBuffer> commandBuffer = MTL_COMMAND_BUFFER;
    HobenMetalTexture *texture = [[HobenMetalTexture alloc] initWithTexture:self.currentTexture];
    [HobenMetalKit renderQuad:MTL_PASSTHROUGH_PIPELINE inputTextures:@[texture] outputTexture:self.currentDrawable.texture];
    __block dispatch_semaphore_t block_semaphore = _inFlightSemaphore;
    [commandBuffer addCompletedHandler:^(id<MTLCommandBuffer> buffer)
     {
         dispatch_semaphore_signal(block_semaphore);
     }];
    [commandBuffer presentDrawable:self.currentDrawable];
    [commandBuffer commit];
    self.currentTexture = nil;
    [HobenMetalKit resetCommandBuffer];
}

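MTKView's currentDrawable is the canvas for the current screen. Once the render commands are committed, the command buffer containing everything encoded along this chain is submitted to the GPU, and with that the chain is complete.

Note that once the CommandBuffer has been committed, it must be reset. On the next render, a new command buffer is created from the command queue and is used until the MTKView commits the render commands again.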

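Four. Subclassing and Usage in the Business Layer

1. Defining a custom Filter

After this refactoring, the business-layer logic is clearly much simpler. To define a custom Filter, we only need to specify the vertex shader and fragment shader, and, if needed, custom vertex coordinates and texture coordinates. For example, the crop operation CropFilter reduces to the following code: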

- (instancetype)initWithCropRegin:(CGRect)newCropRegion {
    if (self = [super init]) {
        self.cropRegion = newCropRegion;
    }
    return self;
}

- (void)calculateCropTextureCoordinates {
    CGFloat minX = _cropRegion.origin.x;
    CGFloat minY = _cropRegion.origin.y;
    CGFloat maxX = CGRectGetMaxX(_cropRegion);
    CGFloat maxY = CGRectGetMaxY(_cropRegion);
    
    _cropTextureCoordinates = @[
        @(minX), @(minY),
        @(maxX), @(minY),
        @(minX), @(maxY),
        @(maxX), @(maxY),
    ];
}

#pragma mark - Override

- (void)renderToTextureWithVertices:(NSArray *)vertices textureCoordinates:(NSArray *)textureCoordinates {
    [super renderToTextureWithVertices:vertices textureCoordinates:_cropTextureCoordinates];
}

- (void)setCropRegion:(CGRect)newValue {
    NSParameterAssert(newValue.origin.x >= 0 && newValue.origin.x <= 1 &&
                      newValue.origin.y >= 0 && newValue.origin.y <= 1 &&
                      newValue.size.width >= 0 && newValue.size.width <= 1 &&
                      newValue.size.height >= 0 && newValue.size.height <= 1);

    _cropRegion = newValue;
    [self calculateCropTextureCoordinates];
}
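As a worked example of the coordinate calculation above, the sketch below repeats the same arithmetic as a free function so it can be run outside the filter. ToyCropTextureCoordinates is a hypothetical name, not part of the article's code; the corner order is copied from calculateCropTextureCoordinates, and the region is assumed to be normalized to [0, 1] as the setter asserts.

```objc
#import <Foundation/Foundation.h>
#import <CoreGraphics/CoreGraphics.h>

// Hypothetical free-function version of calculateCropTextureCoordinates.
// Corner order: (minX, minY), (maxX, minY), (minX, maxY), (maxX, maxY),
// which lines up with the four vertices of the default quad drawn as a triangle strip.
static NSArray<NSNumber *> *ToyCropTextureCoordinates(CGRect region) {
    CGFloat minX = region.origin.x;
    CGFloat minY = region.origin.y;
    CGFloat maxX = CGRectGetMaxX(region);
    CGFloat maxY = CGRectGetMaxY(region);
    return @[
        @(minX), @(minY),
        @(maxX), @(minY),
        @(minX), @(maxY),
        @(maxX), @(maxY),
    ];
}
```

For the right half of an image, CGRectMake(0.5, 0, 0.5, 1), this gives 0.5, 0, 1, 0, 0.5, 1, 1, 1: the sampled x range runs from 0.5 to 1 while y still covers the full height. The left half, CGRectMake(0, 0, 0.5, 1), gives 0, 0, 0.5, 0, 0, 1, 0.5, 1.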

The mix operation has no need for custom vertex coordinates, so it is even simpler on the OC side:

- (instancetype)init {
    if (self = [super initWithVertexName:@"twoInputVertex" fragmentName:@"mixFragment" numberOfInputs:2]) {
        
    }
    return self;
}

The corresponding .metal file is just the earlier mix operation:

vertex TwoInputVertexIO twoInputVertex(const device packed_float2 *position [[buffer(0)]],
                                       const device packed_float2 *texturecoord [[buffer(1)]],
                                       const device packed_float2 *texturecoord2 [[buffer(2)]],
                                       uint vid [[vertex_id]])
{
    TwoInputVertexIO outputVertices;
    
    outputVertices.position = float4(position[vid], 0, 1.0);
    outputVertices.textureCoordinate = texturecoord[vid];
    outputVertices.textureCoordinate2 = texturecoord2[vid];

    return outputVertices;
}

fragment float4 mixFragment(TwoInputVertexIO fragmentInput [[stage_in]],
                            texture2d<float> inputTexture [[texture(0)]],
                            texture2d<float> inputTexture2 [[texture(1)]])
{
    constexpr sampler quadSampler;
    float4 color1 = inputTexture.sample(quadSampler, fragmentInput.textureCoordinate);
    float4 color2 = inputTexture2.sample(quadSampler, fragmentInput.textureCoordinate2);

    return float4(color1.rgb, color2.r);
}

2. Calling from the business layer

The business layer needs to specify the direction of the chain, and that takes only a handful of very readable calls:

- (void)viewDidLoad {
    [super viewDidLoad];

    if (!_renderView) {
        _renderView = [[HobenMetalRenderView alloc] initWithFrame:CGRectMake(0, 0, self.view.frame.size.width, self.view.frame.size.height)];
    }
    if (!_cropLeftFilter) {
        _cropLeftFilter = [[HobenMetalCropFilter alloc] initWithCropRegin:CGRectMake(0, 0, .5f, 1.f)];
    }
    
    if (!_cropRightFilter) {
        _cropRightFilter = [[HobenMetalCropFilter alloc] initWithCropRegin:CGRectMake(.5f, 0, .5f, 1.f)];
    }

    if (!_mixFilter) {
        _mixFilter = [[HobenMetalMixFilter alloc] init];
    }

    if (!_picture) {
        _picture = [[HobenMetalPicture alloc] initWithImage:[UIImage imageNamed:@"crop_image"]];
    }
    
    [self.view addSubview:_renderView];
    
    [_picture addTarget:_cropLeftFilter];
    [_picture addTarget:_cropRightFilter];
    
    [_cropLeftFilter addTarget:_mixFilter];
    [_cropRightFilter addTarget:_mixFilter];
    
    [_mixFilter addTarget:_renderView];
    
    [_picture processImage];
}

And with that, a chained structure is complete!

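Five. Some Thoughts on Memory and CPU Optimization (and the Pitfalls Along the Way)

My estimate is that GPUImage3's high CPU and memory usage when processing video comes down to the following points:

  1. AutoReleasePool

Apple's official documentation for Metal rendering recommends using an autorelease pool, so our render operations need to add one as well.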
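The AUTO_RELEASE_BEGIN and AUTO_RELEASE_END macros used in the listings above are never defined in this article. A plausible definition, which is an assumption on my part, is a thin wrapper around @autoreleasepool. The sketch below uses that assumed definition in a hypothetical per-frame loop (ToyProcessFrames) to show that temporaries are released at the end of each frame rather than accumulating.

```objc
#import <Foundation/Foundation.h>

// Assumed definitions; the article uses these macros without showing them.
#define AUTO_RELEASE_BEGIN @autoreleasepool {
#define AUTO_RELEASE_END }

// Hypothetical per-frame work: temporaries created inside the pool are
// released when the pool closes, not when the whole loop has finished.
static NSUInteger ToyProcessFrames(NSUInteger frameCount) {
    NSUInteger processed = 0;
    for (NSUInteger frame = 0; frame < frameCount; frame++) {
        AUTO_RELEASE_BEGIN
        // Stand-in for per-frame temporaries such as texture wrappers.
        NSMutableData *scratch = [NSMutableData dataWithLength:1024 * 1024];
        if (scratch.length > 0) {
            processed++;
        }
        AUTO_RELEASE_END
    }
    return processed;
}
```

With this definition, an early return between the two macros is still safe, because leaving an @autoreleasepool block by any path drains the pool.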

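  2. Frequent commits of the CommandBuffer

In GPUImage3's design, whether the node is a Provider, a Consumer, or a Filter, every encoding operation is followed by a commit. In fact, a single render needs only one commit after several encodes, and commit is precisely the bridge between the CPU and the GPU.

According to Apple, a Drawable is a very limited resource (only three) that is scheduled by the system, and the official sample code, Synchronizing CPU and GPU Work, recommends using a semaphore to control commits. GPUImage3's frequent commits probably hurt CPU performance considerably.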

// The maximum number of frames in flight.
static const NSUInteger MaxFramesInFlight = 3;

...

/// Handles view rendering for a new frame.
- (void)drawInMTKView:(nonnull MTKView *)view
{
    // Wait to ensure only `MaxFramesInFlight` number of frames are getting processed
    // by any stage in the Metal pipeline (CPU, GPU, Metal, Drivers, etc.).
    dispatch_semaphore_wait(_inFlightSemaphore, DISPATCH_TIME_FOREVER);

...

    // Add a completion handler that signals `_inFlightSemaphore` when Metal and the GPU have fully
    // finished processing the commands that were encoded for this frame.
    // This completion indicates that the dynamic buffers that were written-to in this frame, are no
    // longer needed by Metal and the GPU; therefore, the CPU can overwrite the buffer contents
    // without corrupting any rendering operations.
    __block dispatch_semaphore_t block_semaphore = _inFlightSemaphore;
    [commandBuffer addCompletedHandler:^(id<MTLCommandBuffer> buffer)
     {
         dispatch_semaphore_signal(block_semaphore);
     }];

    // Finalize CPU work and submit the command buffer to the GPU.
    [commandBuffer commit];
}
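
  3. Frequently creating outputTexture with MTLTextureDescriptor

Doing this for every frame of a video is extremely CPU-intensive. A video has a great many frames, and initializing a texture for each one is not acceptable. Because of this, CPU usage while I was rendering video shot up to about 50%, whereas after the optimization it stays at about 10%, which shows how expensive it is. In fact the texture does not need to be created repeatedly; lazy loading is enough.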
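The helper that creates outputTexture, +[HobenMetalTexture defaultTextureByWidth:height:], is not shown in this article. The following is a minimal sketch of the lazy-creation idea under stated assumptions: ToyOutputTextureHolder is a hypothetical class, the pixel format is assumed to be MTLPixelFormatBGRA8Unorm to match the pipeline's color attachment, and the size check is my own addition (the filter code above only checks whether outputTexture is nil).

```objc
#import <Foundation/Foundation.h>
#import <Metal/Metal.h>

// Hypothetical holder that creates its output texture lazily and reuses it.
@interface ToyOutputTextureHolder : NSObject
// Number of times a texture was actually allocated.
@property (nonatomic, readonly) NSUInteger creationCount;
- (instancetype)initWithDevice:(id<MTLDevice>)device;
- (id<MTLTexture>)outputTextureWithWidth:(NSUInteger)width height:(NSUInteger)height;
@end

@implementation ToyOutputTextureHolder {
    id<MTLDevice> _device;
    id<MTLTexture> _texture;
}

- (instancetype)initWithDevice:(id<MTLDevice>)device {
    if (self = [super init]) {
        _device = device;
    }
    return self;
}

- (id<MTLTexture>)outputTextureWithWidth:(NSUInteger)width height:(NSUInteger)height {
    // Reuse the existing texture unless the requested size has changed.
    if (_texture && _texture.width == width && _texture.height == height) {
        return _texture;
    }
    MTLTextureDescriptor *desc =
        [MTLTextureDescriptor texture2DDescriptorWithPixelFormat:MTLPixelFormatBGRA8Unorm
                                                           width:width
                                                          height:height
                                                       mipmapped:NO];
    // The texture is rendered into by one link and sampled by the next.
    desc.usage = MTLTextureUsageRenderTarget | MTLTextureUsageShaderRead;
    _texture = [_device newTextureWithDescriptor:desc];
    _creationCount++;
    return _texture;
}

@end
```

For a video whose frame size never changes, creationCount stays at 1 no matter how many frames are rendered, which is the behavior the optimization above relies on.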

The figure below shows the CPU and memory peaks while rendering video after the optimization:

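Six. Summary

This chained architecture greatly improves the maintainability and readability of the rendering logic, supports organizing Filter files and .metal files by rendering function, and simplifies business-layer development.

Even when a custom render operation is needed, it is enough to subclass HobenMetalFilter and choose the vertex shader, fragment shader, vertex coordinates, texture coordinates, vertex buffers, and texture buffers, which is very convenient.

The chain follows the producer-consumer pattern: inputs act as producers, outputs act as consumers, and the middle-layer Filter acts as both. As a result, a single CommandBuffer collects the work of multiple CommandEncoders, and finally the MTKView commits the command buffer to the GPU to complete the render.

This chained architecture not only reproduces the logic of the open-source library GPUImage3 in OC, it also resolves the high-memory and high-CPU problems. The process was rather painful, but I learned a great deal. Onward!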
