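0. Preface
The rendering commands covered so far were all single-pass renders. When we need to reuse the result of an earlier render, a single pass is clearly not enough, and that is where a chained structure comes in. In a chain, the output of one render pass is fed back in as the input of the next, and only the final result is drawn to the screen. As an example we will reuse the case from "Metal and Graphics Rendering (2): Rendering Transparent Images", where the goal is to produce an image with transparency.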


Our previous implementation did everything in a single render pass:

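That approach piles every rendering operation into one pass, so all of the Objective-C (OC) layer and Metal layer code ends up in a single place, which is hard to maintain.
This time, instead of piling the code into one pass, we build the effect with a chained structure: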

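The chained code is more direct and concise. More importantly, whenever we later want to reuse the Picture's texture, or the texture of any Filter, we only need to attach one more link to that node.
Any discussion of chained structures has to mention the well-known library GPUImage3, which lets one render result be used multiple times. However, that library is written in Swift, and on top of that GPUImage3 has critical CPU and memory problems when processing video: memory blew up before a single video had finished playing. Searching the issues shows that someone reported this back in 2019, but the author's only reply was "We still have a lot of work to do on the inputs and outputs to get this to be ready for regular use."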


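Frustrating. An open-source library in that state probably cannot cover even basic needs for playing effects, and our project is still on Objective-C anyway. So there was no choice but to borrow the ideas of those who came before, hand-roll a chained framework, and fill in the library's pitfalls along the way.
I. Basic Architecture
The workflow of the chained structure is shown in the figure below: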

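The building blocks that implement this workflow are: the base library MetalKit, the rendering layer Renderer, the texture producer Provider, and the texture consumer Consumer. Their relationships are shown in the figure below.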

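II. Rendering Principles and Building Blocks
Before introducing the building blocks, it is worth briefly reviewing the flow of a single render operation, that is, how one input source (a UIImage) is processed step by step until it reaches the screen: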
--- Initialization phase ---
- Configure the Device, Queue, and MTKView (done once, during initialization)
- Configure the PipelineState (maps to the functions in the .metal file; done once)
- Create resources and load the MTLTexture (done once)
- Set up the vertex MTLBuffer (ideally done once)
--- Render phase: the drawInMTKView callback, once per frame ---
- Get a CommandBuffer from the Queue
- Create a CommandEncoder from the CommandBuffer and a RenderPassDescriptor
- Encode the buffers [if needed, Threadgroups can be used to group the encoded data]
--- Finish: once encoding is done, submit the command buffer to the GPU ---
- Commit to the Queue
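As we can see, within a single render operation some parts are initialized only once, while others have to be created and read frequently.
In this chained structure, for one chained render (from UIImage to MTKView), the things we need to create only once are: Device, CommandQueue, CommandBuffer, Library, and Pipeline.
What gets created repeatedly is the CommandEncoder. After a series of encode operations, once the chain reaches the MTKView, the CommandBuffer holding all of that render's encoded work is committed so the GPU can execute it.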
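The "one command buffer, many encoders, one commit" idea can be sketched as follows. This is a minimal illustration and not the article's code: `SketchEncodePass` and `SketchRenderChainOnce` are hypothetical helpers, and each pass only clears its target, where a real filter would also bind a pipeline, buffers, and textures.

```objc
#import <Foundation/Foundation.h>
#import <Metal/Metal.h>

// Hypothetical helper: encodes one render pass that just clears `target`.
static void SketchEncodePass(id<MTLCommandBuffer> commandBuffer, id<MTLTexture> target) {
    MTLRenderPassDescriptor *pass = [MTLRenderPassDescriptor renderPassDescriptor];
    pass.colorAttachments[0].texture = target;
    pass.colorAttachments[0].loadAction = MTLLoadActionClear;
    pass.colorAttachments[0].storeAction = MTLStoreActionStore;
    pass.colorAttachments[0].clearColor = MTLClearColorMake(0, 0, 0, 0);

    // One encoder per link in the chain; all encoders share the same command buffer.
    id<MTLRenderCommandEncoder> encoder = [commandBuffer renderCommandEncoderWithDescriptor:pass];
    [encoder endEncoding];
}

// Hypothetical helper: encodes one pass per target, then commits exactly once.
// Returns the number of passes encoded before the commit.
static NSUInteger SketchRenderChainOnce(id<MTLCommandQueue> queue, NSArray<id<MTLTexture>> *targets) {
    id<MTLCommandBuffer> commandBuffer = [queue commandBuffer];
    NSUInteger encoded = 0;
    for (id<MTLTexture> target in targets) {
        SketchEncodePass(commandBuffer, target);
        encoded++;
    }
    [commandBuffer commit]; // the only CPU-to-GPU submission for the whole chain
    return encoded;
}
```

In the chain described here, the encoders would come from the Picture, the Filters, and finally the render view, with only the render view performing the commit.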
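1. Base library MetalKit
MetalKit (here meaning this project's HobenMetalKit class, not Apple's MetalKit framework) manages and stores the objects that only need to be created once. Nearly all of them are lazily loaded, which avoids repeatedly creating objects during rendering and wasting CPU and memory.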
- (id<MTLDevice>)device {
if (!_device) {
_device = MTLCreateSystemDefaultDevice();
}
return _device;
}
- (id<MTLCommandQueue>)commandQueue {
if (!_commandQueue) {
_commandQueue = [self.device newCommandQueue];
}
return _commandQueue;
}
- (id<MTLCommandBuffer>)commandBuffer {
if (!_commandBuffer) {
_commandBuffer = self.commandQueue.commandBuffer;
}
return _commandBuffer;
}
- (id<MTLLibrary>)library {
if (!_library) {
NSString *libPath = [METAL_BUNDLE pathForResource:@"alpha_video_renderer" ofType:@"metallib"];
if (!libPath) {
NSAssert(NO, @"[HobenMetalKit] libPath is nil!");
[CCAlphaVideoUtils handleMetalSetupError:CCAlphaVideoMetalErrorTypeLibLoadError reason:@"libPath is nil"];
HobenLog(@"[HobenMetalKit] libPath is nil!");
return nil;
}
NSError *error;
id <MTLLibrary> defaultLibrary = [MTL_DEVICE newLibraryWithFile:libPath error:&error];
if (error || !defaultLibrary) {
[CCAlphaVideoUtils handleMetalSetupError:CCAlphaVideoMetalErrorTypeLibLoadError reason:@"defaultLibrary load failed"];
HobenLog(@"[HobenMetalKit] newLibraryWithFile error: %@", error);
return nil;
}
_library = defaultLibrary;
}
return _library;
}
- (NSMutableDictionary<NSString *,id<MTLRenderPipelineState>> *)pipelineDict {
if (!_pipelineDict) {
_pipelineDict = [NSMutableDictionary dictionary];
}
return _pipelineDict;
}
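The listings call `[HobenMetalKit sharedInstance]`, `[self sharedDevice]`, and macros such as `MTL_DEVICE`, none of which are shown. One plausible shape for them, assuming a `dispatch_once` singleton, is sketched below; `SketchMetalKit` and `SKETCH_MTL_DEVICE` are hypothetical names used only for illustration.

```objc
#import <Foundation/Foundation.h>
#import <Metal/Metal.h>

@interface SketchMetalKit : NSObject
@property (nonatomic, strong, readonly) id<MTLDevice> device;
+ (instancetype)sharedInstance;
+ (id<MTLDevice>)sharedDevice;
@end

@implementation SketchMetalKit
// A readonly property with a hand-written getter is not auto-synthesized.
@synthesize device = _device;

+ (instancetype)sharedInstance {
    static SketchMetalKit *instance;
    static dispatch_once_t onceToken;
    dispatch_once(&onceToken, ^{
        instance = [[self alloc] init];
    });
    return instance;
}

// Lazy getter in the same style as the article's accessors.
- (id<MTLDevice>)device {
    if (!_device) {
        _device = MTLCreateSystemDefaultDevice();
    }
    return _device;
}

+ (id<MTLDevice>)sharedDevice {
    return [self sharedInstance].device;
}
@end

// Assumed convenience macro, analogous to the article's MTL_DEVICE.
#define SKETCH_MTL_DEVICE [SketchMetalKit sharedDevice]
```

`dispatch_once` makes creation of the singleton itself thread-safe, but the lazy getters are not synchronized, so this sketch assumes all rendering calls come from one thread.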
Pipeline management is also placed in MetalKit, with a cache, again to avoid creating pipelines repeatedly during rendering:
+ (id <MTLRenderPipelineState>)pipelineStateWithVertexName:(NSString *)vertexName fragmentName:(NSString *)fragmentName {
NSMutableDictionary *pipelineDict = [HobenMetalKit sharedInstance].pipelineDict;
NSString *vName = vertexName ?: @"oneInputVertex";
NSString *fName = fragmentName ?: @"passthroughFragment";
NSString *key = [NSString stringWithFormat:@"%@_%@", vName, fName];
id <MTLRenderPipelineState> cachedPipeline = pipelineDict[key];
if (cachedPipeline) {
[HobenMetalKit sharedInstance].didLoadMetalLibSuccess = YES;
return cachedPipeline;
}
MTLRenderPipelineDescriptor *pipelineDesc = [MTLRenderPipelineDescriptor new];
id <MTLLibrary> library = [self sharedLibrary];
id <MTLFunction> vertexFunction = [library newFunctionWithName:vName];
id <MTLFunction> fragmentFunction = [library newFunctionWithName:fName];
if (!vertexFunction || !fragmentFunction) {
NSAssert(NO, @"function is nil");
return nil;
}
pipelineDesc.vertexFunction = vertexFunction;
pipelineDesc.fragmentFunction = fragmentFunction;
pipelineDesc.colorAttachments[0].pixelFormat = MTLPixelFormatBGRA8Unorm;
NSError *pipelineError = nil;
// Pass &pipelineError so that a creation failure is actually reported below.
id <MTLRenderPipelineState> pipelineState = [[self sharedDevice] newRenderPipelineStateWithDescriptor:pipelineDesc error:&pipelineError];
if (pipelineError) {
[CCAlphaVideoUtils handleMetalSetupError:CCAlphaVideoMetalErrorTypeLibLoadError reason:@"pipelinestate error"];
HobenLog(@"[CCAlphaVideoMetalFunctionLoader] pipelinestate error: %@", pipelineError);
}
if (pipelineState) {
[HobenMetalKit sharedInstance].didLoadMetalLibSuccess = YES;
// Cache only pipelines that were created successfully.
pipelineDict[key] = pipelineState;
}
return pipelineState;
}
2. Rendering layer Renderer
The rendering layer takes the Pipeline, vertex coordinates, buffers, and input textures passed in, encodes them, and produces an output texture.

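/**
Single render operation
@param pipelineState render pipeline state
@param inputTextures input textures; each object wraps the texture data and its texture coordinates
@param imageVertices vertex coordinates; pass nil to use the default vertex coordinates
@param vertexBuffers array of vertex shader buffers
@param fragmentBuffers array of fragment shader buffers
@param loadAction whether to load or clear the previously rendered content; defaults to MTLLoadActionClear
@param outputTexture output texture, reusable
*/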
+ (void)renderQuad:(id <MTLRenderPipelineState>)pipelineState
inputTextures:(NSArray <HobenMetalTexture *> *)inputTextures
imageVertices:(nullable NSArray *)imageVertices
vertexBuffers:(nullable NSArray <id<MTLBuffer>> *)vertexBuffers
fragmentBuffers:(nullable NSArray <id<MTLBuffer>> *)fragmentBuffers
loadAction:(MTLLoadAction)loadAction
outputTexture:(id <MTLTexture>)outputTexture {
NSAssert(!imageVertices || imageVertices.count == 8, @"imageVertices.count must be 8");
AUTO_RELEASE_BEGIN
if (!pipelineState) {
NSAssert(NO, @"pipelineState is nil");
return;
}
NSArray *defaultImageVertices = @[
@-1.0, @1.0,
@1.0, @1.0,
@-1.0, @-1.0,
@1.0, @-1.0,
];
NSArray *vertice = imageVertices ?: defaultImageVertices;
float verticeCoordinates[8] = {
[vertice[0] floatValue], [vertice[1] floatValue],
[vertice[2] floatValue], [vertice[3] floatValue],
[vertice[4] floatValue], [vertice[5] floatValue],
[vertice[6] floatValue], [vertice[7] floatValue],
};
id <MTLBuffer> vertexBuffer = [[HobenMetalKit sharedDevice] newBufferWithBytes:verticeCoordinates length:sizeof(verticeCoordinates) options:MTLResourceStorageModeShared];
MTLRenderPassDescriptor *renderPass = [MTLRenderPassDescriptor renderPassDescriptor];
renderPass.colorAttachments[0].texture = outputTexture;
renderPass.colorAttachments[0].clearColor = MTLClearColorMake(0, 0, 0, 0);
renderPass.colorAttachments[0].storeAction = MTLStoreActionStore;
renderPass.colorAttachments[0].loadAction = loadAction;
id <MTLRenderCommandEncoder> renderEncoder = [MTL_COMMAND_BUFFER renderCommandEncoderWithDescriptor:renderPass];
[renderEncoder setRenderPipelineState:pipelineState];
[renderEncoder setVertexBuffer:vertexBuffer offset:0 atIndex:0];
for (NSInteger i = 0; i < vertexBuffers.count; i++) {
id <MTLBuffer> extraVertexBuffer = vertexBuffers[i];
[renderEncoder setVertexBuffer:extraVertexBuffer offset:0 atIndex:1 + i];
}
for (NSInteger i = 0; i < inputTextures.count; i++) {
HobenMetalTexture *texture = inputTextures[i];
if (![texture isKindOfClass:[HobenMetalTexture class]]) {
NSAssert(NO, @"texture class must be HobenMetalTexture");
[renderEncoder setVertexBuffer:nil offset:0 atIndex:1 + i + vertexBuffers.count];
[renderEncoder setFragmentTexture:nil atIndex:i];
continue;
}
NSArray *textureCoor = texture.textureCoordinates;
NSAssert(textureCoor.count == 8, @"textureCoor.count must be 8");
float textureCoordinates[8] = {
[textureCoor[0] floatValue], [textureCoor[1] floatValue],
[textureCoor[2] floatValue], [textureCoor[3] floatValue],
[textureCoor[4] floatValue], [textureCoor[5] floatValue],
[textureCoor[6] floatValue], [textureCoor[7] floatValue],
};
id <MTLBuffer> textureBuffer = [[HobenMetalKit sharedDevice] newBufferWithBytes:textureCoordinates length:sizeof(textureCoordinates) options:MTLResourceStorageModeShared];
[renderEncoder setVertexBuffer:textureBuffer offset:0 atIndex:1 + i + vertexBuffers.count];
[renderEncoder setFragmentTexture:texture.texture atIndex:i];
}
for (NSInteger i = 0; i < fragmentBuffers.count; i++) {
id <MTLBuffer> fragmentBuffer = fragmentBuffers[i];
[renderEncoder setFragmentBuffer:fragmentBuffer offset:0 atIndex:i];
}
[renderEncoder drawPrimitives:MTLPrimitiveTypeTriangleStrip vertexStart:0 vertexCount:4];
[renderEncoder endEncoding];
AUTO_RELEASE_END
}
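The `AUTO_RELEASE_BEGIN` / `AUTO_RELEASE_END` macros used above are not defined anywhere in the article. A reasonable guess, and only a guess, is that they wrap the body in an `@autoreleasepool` block so that per-frame temporaries are released as soon as the encode finishes. `SketchCountTemporaries` is a hypothetical function that only demonstrates the macro pairing.

```objc
#import <Foundation/Foundation.h>

// Assumed definitions; the original macros are not shown in the article.
#define AUTO_RELEASE_BEGIN @autoreleasepool {
#define AUTO_RELEASE_END   }

// Simulates `frames` render calls, each creating short-lived objects inside its own pool.
static NSUInteger SketchCountTemporaries(NSUInteger frames) {
    NSUInteger total = 0;
    for (NSUInteger i = 0; i < frames; i++) {
        AUTO_RELEASE_BEGIN
        // Objects autoreleased for this "frame" are released when the pool drains.
        NSArray *perFrame = @[@(i), @(i + 1)];
        total += perFrame.count;
        AUTO_RELEASE_END
    }
    return total;
}
```

With this definition, a `return` placed between the two macros still drains the pool under ARC, which matters for the early-return paths in the listings.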
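3. Texture producer Provider
The producer's main job is to hand the texture obtained from the rendering layer to its consumers so they can carry out the next step. Here we define the protocol a Provider must conform to: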
@protocol HobenMetalProviderProtocol <NSObject>
- (void)transmitTexture:(id<MTLTexture>)texture
target:(id<HobenMetalConsumerProtocol>)target
index:(NSInteger)index;
@end
Next we define MetalOutput, a texture producer that conforms to the Provider protocol. It mainly manages the Consumers it owns (added through addTarget) and notifies each one at the right moment so that it calls the appropriate method.
@interface HobenMetalOutput : NSObject <HobenMetalProviderProtocol> {
id<MTLTexture> outputTexture;
}
#pragma mark - Public Method
- (void)addTarget:(id <HobenMetalConsumerProtocol>)target {
NSInteger index = 0;
if ([target respondsToSelector:@selector(nextAvailableTextureIndex)]) {
index = [target nextAvailableTextureIndex];
}
[self addTarget:target atIndex:index];
}
- (void)addTarget:(id <HobenMetalConsumerProtocol>)target atIndex:(NSInteger)index {
if (!target) {
return;
}
if ([self.targets containsObject:target]) {
return;
}
if ([target respondsToSelector:@selector(textureIndexUnavailable:)]) {
[target textureIndexUnavailable:index];
}
[self.targets addObject:target];
[self.targetTextureIndices addObject:@(index)];
}
- (void)transmitTextureToAllTargets:(id<MTLTexture>)texture {
for (id <HobenMetalConsumerProtocol> target in self.targets) {
NSInteger indexOfObject = [self.targets indexOfObject:target];
NSInteger textureIndex = [[self.targetTextureIndices objectAtIndex:indexOfObject] integerValue];
[self transmitTexture:texture target:target index:textureIndex];
}
}
#pragma mark - HobenMetalProviderProtocol
- (void)transmitTexture:(id<MTLTexture>)texture target:(id<HobenMetalConsumerProtocol>)target index:(NSInteger)index {
[target newTextureAvailable:texture index:index];
}
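In this architecture, the producers are HobenMetalPicture (obtains a texture from a UIImage), HobenMetalMovieReader (obtains textures from a CVPixelBufferRef), and HobenMetalFilter (obtains textures from the previous link in the chain). Once they have a texture they process it and pass the output to the next link.
4. Texture consumer Consumer
The consumer's main job is to take the texture information supplied by a Provider and process it further. Here we also define the protocol a Consumer must conform to: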
@protocol HobenMetalConsumerProtocol <NSObject>
- (void)newTextureAvailable:(id <MTLTexture>)texture index:(NSInteger)index;
@optional
- (NSInteger)nextAvailableTextureIndex;
- (void)textureIndexUnavailable:(NSInteger)index;
@end
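In this architecture, the consumers are HobenMetalRenderView (submits the render commands for the texture it receives) and HobenMetalFilter (encodes its own pass using the textures it receives). Their responsibility is to encode at their own level, based on the texture supplied by the Provider one level up.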
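The article does not show how a consumer implements the two optional methods, so here is a minimal sketch of one possibility. It assumes the consumer hands out input slots in ascending order and marks a slot as taken when `addTarget:` reports it. `SketchConsumer` and `SketchTwoInputConsumer` are hypothetical, and `id` stands in for `id<MTLTexture>` so the sketch does not depend on Metal.

```objc
#import <Foundation/Foundation.h>

// Stand-in for HobenMetalConsumerProtocol with the texture type erased.
@protocol SketchConsumer <NSObject>
- (void)newTextureAvailable:(id)texture index:(NSInteger)index;
@optional
- (NSInteger)nextAvailableTextureIndex;
- (void)textureIndexUnavailable:(NSInteger)index;
@end

// Hypothetical consumer that tracks which input slots are already claimed.
@interface SketchTwoInputConsumer : NSObject <SketchConsumer>
@property (nonatomic, strong) NSMutableIndexSet *usedIndices;
@property (nonatomic, strong) NSMutableDictionary<NSNumber *, id> *received;
@end

@implementation SketchTwoInputConsumer
- (instancetype)init {
    if (self = [super init]) {
        _usedIndices = [NSMutableIndexSet indexSet];
        _received = [NSMutableDictionary dictionary];
    }
    return self;
}

// Lowest slot that no provider has claimed yet.
- (NSInteger)nextAvailableTextureIndex {
    NSInteger index = 0;
    while ([self.usedIndices containsIndex:(NSUInteger)index]) {
        index++;
    }
    return index;
}

// Called by the provider's addTarget:atIndex: to reserve the slot.
- (void)textureIndexUnavailable:(NSInteger)index {
    [self.usedIndices addIndex:(NSUInteger)index];
}

- (void)newTextureAvailable:(id)texture index:(NSInteger)index {
    self.received[@(index)] = texture;
}
@end
```

Under this scheme, two providers that each call `addTarget:` on the same filter end up bound to slots 0 and 1, in the order in which they were attached.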
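III. The Producers and Consumers
1. Resource processors
A resource processor is a tool that converts an existing resource object (UIImage, CVPixelBufferRef) into a texture. Resource processors are Providers: once converted, the texture can be handed to the Consumers further down the chain.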

HobenMetalPicture uses the texture-loading methods provided by MTKTextureLoader and converts the CGImage into a texture as soon as it is initialized.
- (instancetype)initWithImage:(UIImage *)newImageSource {
if (self = [self initWithCGImage:newImageSource.CGImage]) {
}
return self;
}
- (instancetype)initWithCGImage:(CGImageRef)newImageSource {
if (self = [super init]) {
[self renderCGImage:newImageSource];
}
return self;
}
- (void)renderCGImage:(CGImageRef)cgImage {
MTKTextureLoader *loader = [[MTKTextureLoader alloc] initWithDevice:MTL_DEVICE];
NSDictionary *options = @{
MTKTextureLoaderOptionSRGB : @(NO),
};
self.texture = [loader newTextureWithCGImage:cgImage options:options error:nil];
}
When the developer is ready to start passing the created texture down the chain, calling the following method is enough:
- (void)processImage {
[self transmitTextureToAllTargets:self.texture];
}
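HobenMetalMovieReader, on the other hand, has to define its own YUV conversion matrix and add it to the fragment shader buffers. The principle was covered in "Metal and Graphics Rendering (3): Alpha-Channel Video"; here the earlier logic is simply extracted to be more concise and readable: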
- (BOOL)renderPixelBuffer:(CVPixelBufferRef)pixelBuffer {
AUTO_RELEASE_BEGIN
id <MTLTexture> textureY = [self textureWithPixelBuffer:pixelBuffer pixelFormat:MTLPixelFormatR8Unorm planeIndex:0];
id <MTLTexture> textureUV = [self textureWithPixelBuffer:pixelBuffer pixelFormat:MTLPixelFormatRG8Unorm planeIndex:1];
[self setupMatrixWithPixelBuffer:pixelBuffer];
if (!textureY || !textureUV || !self.convertMatrix) {
return NO;
}
CVPixelBufferLockBaseAddress(pixelBuffer, kCVPixelBufferLock_ReadOnly);
NSMutableArray *inputTextureArray = [NSMutableArray array];
for (id <MTLTexture> texture in @[textureY, textureUV]) {
HobenMetalTexture *inputTexture = [[HobenMetalTexture alloc] initWithTexture:texture];
[inputTextureArray addObject:inputTexture];
}
CVPixelBufferUnlockBaseAddress(pixelBuffer, kCVPixelBufferLock_ReadOnly);
if (!outputTexture) {
outputTexture = [HobenMetalTexture defaultTextureByWidth:textureY.width height:textureY.height];
}
[HobenMetalKit renderQuad:MTL_PIPELINE(@"oneInputVertex", @"movieFragment") inputTextures:inputTextureArray imageVertices:nil vertexBuffers:nil fragmentBuffers:@[_convertMatrix] outputTexture:outputTexture];
[self transmitTextureToAllTargets:outputTexture];
AUTO_RELEASE_END
return YES;
}
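The helper `textureWithPixelBuffer:pixelFormat:planeIndex:` is not listed. A common way to implement it is with a `CVMetalTextureCache`, sketched below as a free function. This is an assumption about the implementation, not the author's code: `SketchTextureFromPlane` and its `backingOut` parameter are hypothetical, and the cache is assumed to have been created once with `CVMetalTextureCacheCreate`.

```objc
#import <Foundation/Foundation.h>
#import <CoreVideo/CoreVideo.h>
#import <Metal/Metal.h>

// Wraps one plane of a bi-planar pixel buffer (plane 0 = Y, plane 1 = CbCr) as a Metal texture.
// The caller receives the backing CVMetalTextureRef and should CFRelease it only after the
// GPU has finished with the texture, for example in the command buffer's completed handler.
static id<MTLTexture> SketchTextureFromPlane(CVMetalTextureCacheRef cache,
                                             CVPixelBufferRef pixelBuffer,
                                             MTLPixelFormat pixelFormat,
                                             size_t planeIndex,
                                             CVMetalTextureRef *backingOut) {
    if (!cache || !pixelBuffer || !backingOut) {
        return nil;
    }
    *backingOut = NULL;

    size_t width = CVPixelBufferGetWidthOfPlane(pixelBuffer, planeIndex);
    size_t height = CVPixelBufferGetHeightOfPlane(pixelBuffer, planeIndex);

    CVMetalTextureRef cvTexture = NULL;
    CVReturn status = CVMetalTextureCacheCreateTextureFromImage(kCFAllocatorDefault,
                                                                cache,
                                                                pixelBuffer,
                                                                NULL,
                                                                pixelFormat,
                                                                width,
                                                                height,
                                                                planeIndex,
                                                                &cvTexture);
    if (status != kCVReturnSuccess || !cvTexture) {
        return nil;
    }
    *backingOut = cvTexture;
    return CVMetalTextureGetTexture(cvTexture);
}
```

For the call sites above this would be invoked with `MTLPixelFormatR8Unorm` for plane 0 and `MTLPixelFormatRG8Unorm` for plane 1. The texture shares memory with the pixel buffer, so no per-frame copy is made.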
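2. The middle layer: Filter
In the chain diagram there is one very important middle layer, the Filter. It is both a producer and a consumer: it consumes the textures supplied by the previous level, adds whatever pipeline, buffers, and coordinates it wants to render with, performs its own render pass, and supplies the resulting texture to the next level.
A Filter supports multiple input textures. It can pass several vertex buffers and texture buffers, along with its own Pipeline, to the rendering layer, but it always ends up with exactly one output.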

Because a Filter is both a producer and a consumer, it follows that it is a class that inherits from HobenMetalOutput and conforms to HobenMetalConsumerProtocol:
@interface HobenMetalFilter : HobenMetalOutput <HobenMetalConsumerProtocol>
{
NSMutableArray <HobenMetalTexture *> *inputTextures;
}
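Since a Filter supports multiple inputs, it has to wait until every input source is ready before performing its render. During rendering, when the Provider above delivers a texture and all textures are now available, processing can begin: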
- (void)newTextureAvailable:(id<MTLTexture>)texture index:(NSInteger)index {
if (!texture) {
return;
}
NSInteger numberOfInputs = MAX(_numberOfInputs, 1);
HobenMetalTexture *inputTexture = [[HobenMetalTexture alloc] initWithTexture:texture];
inputTexture.textureIndex = index;
[inputTextures addObject:inputTexture];
if (inputTextures.count < numberOfInputs) {
return;
}
if (!outputTexture) {
outputTexture = [HobenMetalTexture defaultTextureByWidth:texture.width height:texture.height];
}
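[inputTextures sortUsingComparator:^NSComparisonResult(HobenMetalTexture *obj1, HobenMetalTexture *obj2) {
// Three-way comparison so that equal indices report NSOrderedSame.
if (obj1.textureIndex < obj2.textureIndex) {
return NSOrderedAscending;
} else if (obj1.textureIndex > obj2.textureIndex) {
return NSOrderedDescending;
}
return NSOrderedSame;
}];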
[self renderToTextureWithVertices:nil textureCoordinates:nil];
[inputTextures removeAllObjects];
}
- (void)renderToTextureWithVertices:(NSArray *)vertices textureCoordinates:(NSArray *)textureCoordinates {
for (HobenMetalTexture *inputTexture in inputTextures) {
inputTexture.textureCoordinates = textureCoordinates;
}
[HobenMetalKit renderQuad:MTL_PIPELINE(_vertexName, _fragmentName) inputTextures:inputTextures imageVertices:vertices outputTexture:outputTexture];
[self transmitTextureToAllTargets:outputTexture];
}
Note that creating a texture from an MTLTextureDescriptor is a CPU-heavy operation, so we create outputTexture only once. (This may be why GPUImage3 shows such high CPU usage when rendering video; it cost me a lot of time.)
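`+[HobenMetalTexture defaultTextureByWidth:height:]` is not shown, so the following is a sketch of what such a factory might look like, together with a reuse check. It assumes a BGRA8Unorm render target (matching the pixel format set on the pipeline earlier) and adds one step the listing does not have: reallocating if the input size changes. Both function names are hypothetical.

```objc
#import <Foundation/Foundation.h>
#import <Metal/Metal.h>

// Hypothetical stand-in for the article's default output texture factory.
static id<MTLTexture> SketchMakeRenderTarget(id<MTLDevice> device, NSUInteger width, NSUInteger height) {
    MTLTextureDescriptor *desc =
        [MTLTextureDescriptor texture2DDescriptorWithPixelFormat:MTLPixelFormatBGRA8Unorm
                                                           width:width
                                                          height:height
                                                       mipmapped:NO];
    // The texture is written by one pass and sampled by the next link in the chain.
    desc.usage = MTLTextureUsageRenderTarget | MTLTextureUsageShaderRead;
    return [device newTextureWithDescriptor:desc];
}

// Returns `existing` when its size still matches; allocates only on first use or on resize.
static id<MTLTexture> SketchReusableRenderTarget(id<MTLDevice> device,
                                                 id<MTLTexture> existing,
                                                 NSUInteger width,
                                                 NSUInteger height) {
    if (existing && existing.width == width && existing.height == height) {
        return existing;
    }
    return SketchMakeRenderTarget(device, width, height);
}
```

For a video whose frame size is constant, this allocates once and then returns the same texture on every frame, which is the behaviour the `if (!outputTexture)` check achieves.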
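renderToTextureWithVertices:textureCoordinates: is factored out here so that developers can supply custom vertex or texture coordinates as needed, or implement their own rendering logic. For example, the crop operation CropFilter needed this time is implemented this way: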
- (void)calculateCropTextureCoordinates {
CGFloat minX = _cropRegion.origin.x;
CGFloat minY = _cropRegion.origin.y;
CGFloat maxX = CGRectGetMaxX(_cropRegion);
CGFloat maxY = CGRectGetMaxY(_cropRegion);
_cropTextureCoordinates = @[
@(minX), @(minY),
@(maxX), @(minY),
@(minX), @(maxY),
@(maxX), @(maxY),
];
}
#pragma mark - Override
- (void)renderToTextureWithVertices:(NSArray *)vertices textureCoordinates:(NSArray *)textureCoordinates {
[super renderToTextureWithVertices:vertices textureCoordinates:_cropTextureCoordinates];
}
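3. Output view
The output view inherits from MTKView. Its job is to display the texture supplied by the previous level; it is a Consumer, and it is the final node that submits the encoded commands to the GPU. This time we no longer let the system call drawInMTKView: every frame; instead we decide when to draw ourselves. The code is as follows: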
@interface HobenMetalRenderView : MTKView <HobenMetalConsumerProtocol>
static const NSUInteger MaxFramesInFlight = 3;
- (void)setup {
// Set enableSetNeedsDisplay to NO and paused to YES so the developer decides when to draw
self.enableSetNeedsDisplay = NO;
self.paused = YES;
self.autoResizeDrawable = YES;
self.device = MTL_DEVICE;
self.opaque = NO;
_inFlightSemaphore = dispatch_semaphore_create(MaxFramesInFlight);
}
- (void)newTextureAvailable:(id<MTLTexture>)texture index:(NSInteger)index {
self.drawableSize = CGSizeMake(texture.width, texture.height);
self.currentTexture = texture;
[self draw];
}
- (void)drawRect:(CGRect)rect {
if (!self.currentTexture) {
return;
}
if (!self.currentDrawable) {
NSAssert(NO, @"drawable is nil");
return;
}
dispatch_semaphore_wait(_inFlightSemaphore, DISPATCH_TIME_FOREVER);
id <MTLCommandBuffer> commandBuffer = MTL_COMMAND_BUFFER;
HobenMetalTexture *texture = [[HobenMetalTexture alloc] initWithTexture:self.currentTexture];
[HobenMetalKit renderQuad:MTL_PASSTHROUGH_PIPELINE inputTextures:@[texture] outputTexture:self.currentDrawable.texture];
__block dispatch_semaphore_t block_semaphore = _inFlightSemaphore;
[commandBuffer addCompletedHandler:^(id<MTLCommandBuffer> buffer)
{
dispatch_semaphore_signal(block_semaphore);
}];
[commandBuffer presentDrawable:self.currentDrawable];
[commandBuffer commit];
self.currentTexture = nil;
[HobenMetalKit resetCommandBuffer];
}
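MTKView's currentDrawable is the canvas for the current screen. Once the render commands have been committed, all of the command buffer's encoded work for this chain is submitted to the GPU, and that completes the chain.
Note that after the CommandBuffer has been committed it must be reset. On the next render a new command buffer is created from the command queue, and it is used until the MTKView commits the render commands again.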
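`+[HobenMetalKit resetCommandBuffer]` is not listed. Given the lazy `commandBuffer` getter shown earlier, the simplest implementation consistent with it would just drop the cached buffer so the getter creates a fresh one. The sketch below shows that idea on a hypothetical `SketchCommandBufferHolder` class.

```objc
#import <Foundation/Foundation.h>
#import <Metal/Metal.h>

@interface SketchCommandBufferHolder : NSObject
@property (nonatomic, strong) id<MTLCommandQueue> commandQueue;
- (id<MTLCommandBuffer>)commandBuffer;
- (void)resetCommandBuffer;
@end

@implementation SketchCommandBufferHolder {
    id<MTLCommandBuffer> _commandBuffer;
}

// Every encoder in one chained render shares this lazily created buffer.
- (id<MTLCommandBuffer>)commandBuffer {
    if (!_commandBuffer) {
        _commandBuffer = [self.commandQueue commandBuffer];
    }
    return _commandBuffer;
}

// A committed command buffer cannot be encoded into again, so forget it after commit.
- (void)resetCommandBuffer {
    _commandBuffer = nil;
}
@end
```

The reset has to happen after `commit` and before any producer starts encoding the next frame, which is why the render view calls it at the very end of `drawRect:`.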
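IV. Subclassing and Usage in the Business Layer
1. Defining a custom Filter
After this refactoring the business-layer logic is clearly much simpler. To define a custom Filter we only need to specify the vertex shader and fragment shader, and, if needed, custom vertex coordinates and texture coordinates. For example, the crop operation CropFilter reduces to the following code: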
- (instancetype)initWithCropRegin:(CGRect)newCropRegion {
if (self = [super init]) {
self.cropRegion = newCropRegion;
}
return self;
}
- (void)calculateCropTextureCoordinates {
CGFloat minX = _cropRegion.origin.x;
CGFloat minY = _cropRegion.origin.y;
CGFloat maxX = CGRectGetMaxX(_cropRegion);
CGFloat maxY = CGRectGetMaxY(_cropRegion);
_cropTextureCoordinates = @[
@(minX), @(minY),
@(maxX), @(minY),
@(minX), @(maxY),
@(maxX), @(maxY),
];
}
#pragma mark - Override
- (void)renderToTextureWithVertices:(NSArray *)vertices textureCoordinates:(NSArray *)textureCoordinates {
[super renderToTextureWithVertices:vertices textureCoordinates:_cropTextureCoordinates];
}
- (void)setCropRegion:(CGRect)newValue {
NSParameterAssert(newValue.origin.x >= 0 && newValue.origin.x <= 1 &&
newValue.origin.y >= 0 && newValue.origin.y <= 1 &&
newValue.size.width >= 0 && newValue.size.width <= 1 &&
newValue.size.height >= 0 && newValue.size.height <= 1);
_cropRegion = newValue;
[self calculateCropTextureCoordinates];
}
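As a worked example of the coordinate calculation, the sketch below repeats the same arithmetic in a standalone function (`SketchCropCoordinates` is a hypothetical name). The four pairs are in the same order as the default quad vertices in the Renderer: top-left, top-right, bottom-left, bottom-right.

```objc
#import <Foundation/Foundation.h>
#import <CoreGraphics/CoreGraphics.h>

// Converts a normalized crop rectangle into 4 texture-coordinate pairs (8 numbers).
static NSArray<NSNumber *> *SketchCropCoordinates(CGRect region) {
    CGFloat minX = CGRectGetMinX(region);
    CGFloat minY = CGRectGetMinY(region);
    CGFloat maxX = CGRectGetMaxX(region);
    CGFloat maxY = CGRectGetMaxY(region);
    return @[
        @(minX), @(minY), // top-left
        @(maxX), @(minY), // top-right
        @(minX), @(maxY), // bottom-left
        @(maxX), @(maxY), // bottom-right
    ];
}
```

For the right half, `CGRectMake(0.5, 0, 0.5, 1)`, this yields `0.5, 0, 1, 0, 0.5, 1, 1, 1`. One caveat: the assertion in `setCropRegion:` checks origin and size separately, so a region such as `CGRectMake(0.8, 0, 0.5, 1)` passes even though its right edge lies at 1.3, outside the texture.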
The blend (mix) operation needs no custom vertex coordinates, so its Objective-C side is even simpler:
- (instancetype)init {
if (self = [super initWithVertexName:@"twoInputVertex" fragmentName:@"mixFragment" numberOfInputs:2]) {
}
return self;
}
The corresponding .metal file is just the same blend operation as before:
vertex TwoInputVertexIO twoInputVertex(const device packed_float2 *position [[buffer(0)]],
const device packed_float2 *texturecoord [[buffer(1)]],
const device packed_float2 *texturecoord2 [[buffer(2)]],
uint vid [[vertex_id]])
{
TwoInputVertexIO outputVertices;
outputVertices.position = float4(position[vid], 0, 1.0);
outputVertices.textureCoordinate = texturecoord[vid];
outputVertices.textureCoordinate2 = texturecoord2[vid];
return outputVertices;
}
fragment float4 mixFragment(TwoInputVertexIO fragmentInput [[stage_in]],
texture2d<float> inputTexture [[texture(0)]],
texture2d<float> inputTexture2 [[texture(1)]])
{
constexpr sampler quadSampler;
float4 color1 = inputTexture.sample(quadSampler, fragmentInput.textureCoordinate);
float4 color2 = inputTexture2.sample(quadSampler, fragmentInput.textureCoordinate2);
return float4(color1.rgb, color2.r);
}
2. Usage in the business layer
The business layer has to specify how the chain is wired, and that too takes only one very readable block of code:
- (void)viewDidLoad {
[super viewDidLoad];
if (!_renderView) {
_renderView = [[HobenMetalRenderView alloc] initWithFrame:CGRectMake(0, 0, self.view.frame.size.width, self.view.frame.size.height)];
}
if (!_cropLeftFilter) {
_cropLeftFilter = [[HobenMetalCropFilter alloc] initWithCropRegin:CGRectMake(0, 0, .5f, 1.f)];
}
if (!_cropRightFilter) {
_cropRightFilter = [[HobenMetalCropFilter alloc] initWithCropRegin:CGRectMake(.5f, 0, .5f, 1.f)];
}
if (!_mixFilter) {
_mixFilter = [[HobenMetalMixFilter alloc] init];
}
if (!_picture) {
_picture = [[HobenMetalPicture alloc] initWithImage:[UIImage imageNamed:@"crop_image"]];
}
[self.view addSubview:_renderView];
[_picture addTarget:_cropLeftFilter];
[_picture addTarget:_cropRightFilter];
[_cropLeftFilter addTarget:_mixFilter];
[_cropRightFilter addTarget:_mixFilter];
[_mixFilter addTarget:_renderView];
[_picture processImage];
}
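And with that, the chained structure is complete!
V. Some Thoughts on Memory and CPU Optimization (Stepping into Pitfalls)
My best guess is that GPUImage3's high CPU and memory usage when processing video comes down to the following points:
- AutoReleasePool
Apple's official documentation on Metal rendering recommends using an autorelease pool, so our render operations need to be wrapped in one as well.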

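- Committing the CommandBuffer too frequently
In GPUImage3's design, every Provider, Consumer, and Filter performs a commit after each of its encode operations. In fact, a single frame needs only one commit after many encodes, and commit is precisely the bridge over which the CPU and GPU communicate.
According to Apple, a Drawable is a very limited resource (there are only 3) that is scheduled by the system, and the official sample code "Synchronizing CPU and GPU Work" recommends using a semaphore to control commits. GPUImage3's frequent commits probably hurt CPU performance considerably.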
// The maximum number of frames in flight.
static const NSUInteger MaxFramesInFlight = 3;
...
/// Handles view rendering for a new frame.
- (void)drawInMTKView:(nonnull MTKView *)view
{
// Wait to ensure only `MaxFramesInFlight` number of frames are getting processed
// by any stage in the Metal pipeline (CPU, GPU, Metal, Drivers, etc.).
dispatch_semaphore_wait(_inFlightSemaphore, DISPATCH_TIME_FOREVER);
...
// Add a completion handler that signals `_inFlightSemaphore` when Metal and the GPU have fully
// finished processing the commands that were encoded for this frame.
// This completion indicates that the dynamic buffers that were written-to in this frame, are no
// longer needed by Metal and the GPU; therefore, the CPU can overwrite the buffer contents
// without corrupting any rendering operations.
__block dispatch_semaphore_t block_semaphore = _inFlightSemaphore;
[commandBuffer addCompletedHandler:^(id<MTLCommandBuffer> buffer)
{
dispatch_semaphore_signal(block_semaphore);
}];
// Finalize CPU work and submit the command buffer to the GPU.
[commandBuffer commit];
}
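- Frequently creating outputTexture from an MTLTextureDescriptor
Doing this for every frame of a video is extremely CPU-intensive. A video has a great many frames, and initializing a texture for each of them is simply not acceptable: because of this, CPU usage while rendering video shot up to around 50%, whereas after the optimization it stays at around 10%, which shows how expensive it is. The texture does not need to be created repeatedly; lazy loading it is enough.
The figure below shows the peak CPU and memory usage while rendering video after the optimizations: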

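VI. Summary
This chained architecture greatly improves the maintainability and readability of the rendering logic, allows Filter files and .metal files to be organized by rendering function, and simplifies business-layer development.
Even when a custom render operation is needed, it is enough to subclass HobenMetalFilter and choose the vertex shader, fragment shader, vertex coordinates, texture coordinates, vertex buffers, and texture buffers, which is very convenient.
The chain follows the producer-consumer pattern: inputs are producers, outputs are consumers, and the Filters in between are both. As a result, a single CommandBuffer collects multiple CommandEncoders, and at the end the MTKView commits that command buffer to the GPU to complete the render.
This chained architecture not only reproduces the logic of the open-source library GPUImage3 in Objective-C but also solves its high-memory and high-CPU problems. The process was fairly painful, but I learned a great deal. Onward!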
