This series of posts is a complete translation and study of the MetalKit content from http://metalkit.org.
Augmented Reality provides a way of overlaying virtual content on top of the real-world view captured by the camera. Last month at WWDC 2017 we were all excited to see Apple's new ARKit framework, a high-level API that works on A9 or newer devices running iOS 11. Some of the ARKit experiments out there are truly remarkable, such as this one:

There are three distinct layers in an ARKit application:
- Tracking - world tracking using visual-inertial odometry, with no external setup required.
- Scene Understanding - the ability to determine attributes of the scene using plane detection, hit-testing and light estimation.
- Rendering - easy to integrate, since AR view templates are provided by SpriteKit and SceneKit, and rendering can also be customized with Metal. All the pre-rendering is handled by ARKit, which is also responsible for capturing the images and motion data via AVFoundation and CoreMotion.
In this first part of the series we will only look at Rendering in Metal; the other two layers will be discussed in upcoming parts of the series. In an AR app, Tracking and Scene Understanding are handled entirely by the ARKit framework, while rendering can be handled by SpriteKit, SceneKit or Metal:

To get started, we need an ARSession instance, which is created with an ARSessionConfiguration object; we then call the run() function with this configuration. The session manages an AVCaptureSession and a CMMotionManager object, which run simultaneously to capture the images and motion data needed for tracking. Finally, the session outputs the current frame to an ARFrame object:

The ARSessionConfiguration object holds information about the type of tracking the session uses. The base ARSessionConfiguration class provides tracking with three degrees of freedom (device orientation only), while its subclass, ARWorldTrackingSessionConfiguration, provides six degrees of freedom (device position and orientation).

When a device does not support world tracking, we fall back to the base configuration:
if ARWorldTrackingSessionConfiguration.isSupported {
    configuration = ARWorldTrackingSessionConfiguration()
} else {
    configuration = ARSessionConfiguration()
}
An ARFrame contains the captured image, tracking information, and scene information, the latter obtained through ARAnchor objects that carry real-world position and orientation and can easily be added to, updated in, or removed from the session. Tracking is the ability to determine the physical location in real time. World Tracking determines both position and orientation, works with physical distances relative to the starting position, and provides 3D feature points.
The last component of an ARFrame is the ARCamera object, which handles transforms (translation, rotation, scaling) and carries the tracking state and the camera intrinsics. The quality of tracking depends strongly on uninterrupted sensor data and on stable scenes, and it is more accurate when the scene contains plenty of complex texture. The tracking state has three values: Not Available (the camera only has the identity matrix), Limited (the scene has insufficient features or is not stable enough), and Normal (camera data is flowing normally). Session interruptions occur when camera input is unavailable or when tracking stops:
func session(_ session: ARSession, cameraDidChangeTrackingState camera: ARCamera) {
    if case .limited(let reason) = camera.trackingState {
        // Notify user of limited tracking state
    }
}

func sessionWasInterrupted(_ session: ARSession) {
    showOverlay()
}

func sessionInterruptionEnded(_ session: ARSession) {
    hideOverlay()
    // Optionally restart experience
}
Rendering can be done in SceneKit, whose ARSCNView delegate is used to add, update and remove nodes. Similarly, rendering can be done in SpriteKit, where the ARSKView delegate maps SKNodes to ARAnchor objects. Since SpriteKit is 2D and cannot use the real-world camera position, it projects the anchor position into the ARSKView and renders the sprite at that projected position as a billboard (plane), so the sprites always face the camera. With Metal there is no custom AR view, so this responsibility falls on the programmer. To process the rendered image we need to:
- draw the background camera image (generate a texture from the pixel buffer)
- update the virtual camera
- update the lighting
- update the transforms of the geometry
All of this information is available in the ARFrame object. To access the frame there are two options: polling, or via a delegate; we will use the latter. I took the ARKit template for Metal and stripped it down to the bare minimum so I could better understand how it works. The first thing I did was remove all the C dependencies, so bridging is no longer needed. Keeping those shared types and enum constants around could be useful later for sharing them between the API code and the shaders, but it is not needed for this article.
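For reference, the delegate route could look roughly like this (a minimal sketch: session(_:didUpdate:) is the ARSessionDelegate callback, while processFrame(_:) is a hypothetical renderer method; as we will see below, the template actually reads session.currentFrame inside its render loop):

// Sketch only: assumes the view controller conforms to ARSessionDelegate.
func session(_ session: ARSession, didUpdate frame: ARFrame) {
    // ARKit pushes each new ARFrame to us; forward it to the renderer.
    renderer.processFrame(frame)   // processFrame(_:) is hypothetical
}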
Next, on to the ViewController, which will serve as the delegate for both our MTKView and our ARSession. We create a Renderer instance that works with those delegates to update the app in real time:
var session: ARSession!
var renderer: Renderer!

override func viewDidLoad() {
    super.viewDidLoad()
    session = ARSession()
    session.delegate = self
    if let view = self.view as? MTKView {
        view.device = MTLCreateSystemDefaultDevice()
        view.delegate = self
        renderer = Renderer(session: session, metalDevice: view.device!, renderDestination: view)
        renderer.drawRectResized(size: view.bounds.size)
    }
    let tapGesture = UITapGestureRecognizer(target: self, action: #selector(self.handleTap(gestureRecognize:)))
    view.addGestureRecognizer(tapGesture)
}
As you can see, we also add a gesture recognizer that we will use to place virtual content into the view. We first grab the session's current frame, then create a translation that places our object in front of the camera (0.3 meters away in this case), and finally add a new anchor with this transform to the session:
@objc func handleTap(gestureRecognize: UITapGestureRecognizer) {
    if let currentFrame = session.currentFrame {
        var translation = matrix_identity_float4x4
        translation.columns.3.z = -0.3
        let transform = simd_mul(currentFrame.camera.transform, translation)
        let anchor = ARAnchor(transform: transform)
        session.add(anchor: anchor)
    }
}
We use the viewWillAppear() and viewWillDisappear() methods to start and pause the session:
override func viewWillAppear(_ animated: Bool) {
    super.viewWillAppear(animated)
    let configuration = ARWorldTrackingSessionConfiguration()
    session.run(configuration)
}

override func viewWillDisappear(_ animated: Bool) {
    super.viewWillDisappear(animated)
    session.pause()
}
All that is left are the delegate methods that respond to view updates and to session errors and interruptions:
func mtkView(_ view: MTKView, drawableSizeWillChange size: CGSize) {
    renderer.drawRectResized(size: size)
}

func draw(in view: MTKView) {
    renderer.update()
}

func session(_ session: ARSession, didFailWithError error: Error) {}

func sessionWasInterrupted(_ session: ARSession) {}

func sessionInterruptionEnded(_ session: ARSession) {}
Let's now move on to the Renderer.swift file. The first thing to notice is a really useful protocol that gives us access to all the MTKView properties we will need later for our draw calls:
protocol RenderDestinationProvider {
    var currentRenderPassDescriptor: MTLRenderPassDescriptor? { get }
    var currentDrawable: CAMetalDrawable? { get }
    var colorPixelFormat: MTLPixelFormat { get set }
    var depthStencilPixelFormat: MTLPixelFormat { get set }
    var sampleCount: Int { get set }
}
Now you can make the MTKView class conform to this protocol simply by extending it (in the ViewController):
extension MTKView: RenderDestinationProvider {}
For a high-level view of the Renderer class, here is its pseudocode:
init() {
    setupPipeline()
    setupAssets()
}

func update() {
    updateBufferStates()
    updateSharedUniforms()
    updateAnchors()
    updateCapturedImageTextures()
    updateImagePlane()
    drawCapturedImage()
    drawAnchorGeometry()
}
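In the actual template, update() also wraps these calls in command-buffer and in-flight-semaphore management; here is a rough sketch (assuming a commandQueue and a three-slot DispatchSemaphore named inFlightSemaphore created in init()):

func update() {
    // Block until one of the three ring-buffer slots is free again.
    _ = inFlightSemaphore.wait(timeout: .distantFuture)
    guard let commandBuffer = commandQueue.makeCommandBuffer(),
          let frame = session.currentFrame else { return }
    commandBuffer.addCompletedHandler { [weak self] _ in
        self?.inFlightSemaphore.signal()   // release the slot when the GPU finishes
    }
    updateBufferStates()
    updateSharedUniforms(frame: frame)
    updateAnchors(frame: frame)
    updateCapturedImageTextures(frame: frame)
    updateImagePlane(frame: frame)
    if let renderPassDescriptor = renderDestination.currentRenderPassDescriptor,
       let renderEncoder = commandBuffer.makeRenderCommandEncoder(descriptor: renderPassDescriptor),
       let drawable = renderDestination.currentDrawable {
        drawCapturedImage(renderEncoder: renderEncoder)
        drawAnchorGeometry(renderEncoder: renderEncoder)
        renderEncoder.endEncoding()
        commandBuffer.present(drawable)
    }
    commandBuffer.commit()
}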
As before, we first create the pipeline, here in the setupPipeline() function. Then, in setupAssets(), we create the model that will be loaded whenever our tap gesture is recognized. The MTKView delegate calls the update() function whenever a draw call is issued and an update is needed. Let's look at each of these stages more closely. First up is updateBufferStates(), which updates the location to which we write in our dynamically changing buffers for the current frame (in this case, a ring buffer with three slots):
func updateBufferStates() {
    uniformBufferIndex = (uniformBufferIndex + 1) % maxBuffersInFlight
    sharedUniformBufferOffset = alignedSharedUniformSize * uniformBufferIndex
    anchorUniformBufferOffset = alignedInstanceUniformSize * uniformBufferIndex
    sharedUniformBufferAddress = sharedUniformBuffer.contents().advanced(by: sharedUniformBufferOffset)
    anchorUniformBufferAddress = anchorUniformBuffer.contents().advanced(by: anchorUniformBufferOffset)
}
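For context, here is a sketch of how those offsets and buffers might be set up as Renderer properties when it is created (names follow the snippet above; rounding each slot size up to a multiple of 0x100 keeps every offset 256-byte aligned, which Metal requires for buffer offsets):

let maxBuffersInFlight = 3
// Round each per-frame slot size up to a multiple of 256 bytes.
let alignedSharedUniformSize = (MemoryLayout<SharedUniforms>.size & ~0xFF) + 0x100
let alignedInstanceUniformSize = ((MemoryLayout<InstanceUniforms>.size * maxAnchorInstanceCount) & ~0xFF) + 0x100

// One slot per in-flight frame, so the CPU never writes a slot the GPU is still reading.
sharedUniformBuffer = device.makeBuffer(length: alignedSharedUniformSize * maxBuffersInFlight,
                                        options: .storageModeShared)
anchorUniformBuffer = device.makeBuffer(length: alignedInstanceUniformSize * maxBuffersInFlight,
                                        options: .storageModeShared)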
Next, in updateSharedUniforms(), we update the uniforms shared across the frame and set up the lighting for the scene:
func updateSharedUniforms(frame: ARFrame) {
    let uniforms = sharedUniformBufferAddress.assumingMemoryBound(to: SharedUniforms.self)
    uniforms.pointee.viewMatrix = simd_inverse(frame.camera.transform)
    uniforms.pointee.projectionMatrix = frame.camera.projectionMatrix(withViewportSize: viewportSize, orientation: .landscapeRight, zNear: 0.001, zFar: 1000)
    var ambientIntensity: Float = 1.0
    if let lightEstimate = frame.lightEstimate {
        ambientIntensity = Float(lightEstimate.ambientIntensity) / 1000.0
    }
    let ambientLightColor: vector_float3 = vector3(0.5, 0.5, 0.5)
    uniforms.pointee.ambientLightColor = ambientLightColor * ambientIntensity
    var directionalLightDirection: vector_float3 = vector3(0.0, 0.0, -1.0)
    directionalLightDirection = simd_normalize(directionalLightDirection)
    uniforms.pointee.directionalLightDirection = directionalLightDirection
    let directionalLightColor: vector_float3 = vector3(0.6, 0.6, 0.6)
    uniforms.pointee.directionalLightColor = directionalLightColor * ambientIntensity
    uniforms.pointee.materialShininess = 30
}
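Since the shared C header is gone, the SharedUniforms type the code above writes into might be declared in Swift along these lines (a sketch; the same layout would be mirrored by a struct in Shaders.metal):

import simd

struct SharedUniforms {
    // Camera uniforms, shared by all geometry in the frame.
    var projectionMatrix: matrix_float4x4
    var viewMatrix: matrix_float4x4
    // Lighting uniforms, shared by all geometry in the frame.
    var ambientLightColor: vector_float3
    var directionalLightDirection: vector_float3
    var directionalLightColor: vector_float3
    var materialShininess: Float
}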
Next, in updateAnchors(), we update the anchor uniform buffer with the transforms of the current frame's anchors:
func updateAnchors(frame: ARFrame) {
    anchorInstanceCount = min(frame.anchors.count, maxAnchorInstanceCount)
    var anchorOffset: Int = 0
    if anchorInstanceCount == maxAnchorInstanceCount {
        anchorOffset = max(frame.anchors.count - maxAnchorInstanceCount, 0)
    }
    for index in 0..<anchorInstanceCount {
        let anchor = frame.anchors[index + anchorOffset]
        // Flip the Z axis to convert geometry from right-handed to left-handed coordinates.
        var coordinateSpaceTransform = matrix_identity_float4x4
        coordinateSpaceTransform.columns.2.z = -1.0
        let modelMatrix = simd_mul(anchor.transform, coordinateSpaceTransform)
        let anchorUniforms = anchorUniformBufferAddress.assumingMemoryBound(to: InstanceUniforms.self).advanced(by: index)
        anchorUniforms.pointee.modelMatrix = modelMatrix
    }
}
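The per-anchor InstanceUniforms type, also formerly in the shared C header, might be as simple as (a sketch):

struct InstanceUniforms {
    // Model matrix that positions one anchored instance in the world.
    var modelMatrix: matrix_float4x4
}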
Next, in updateCapturedImageTextures(), we create two textures from the captured image of the provided frame (one per plane of the YCbCr pixel buffer):
func updateCapturedImageTextures(frame: ARFrame) {
    let pixelBuffer = frame.capturedImage
    if CVPixelBufferGetPlaneCount(pixelBuffer) < 2 { return }
    capturedImageTextureY = createTexture(fromPixelBuffer: pixelBuffer, pixelFormat: .r8Unorm, planeIndex: 0)!
    capturedImageTextureCbCr = createTexture(fromPixelBuffer: pixelBuffer, pixelFormat: .rg8Unorm, planeIndex: 1)!
}
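The createTexture(fromPixelBuffer:pixelFormat:planeIndex:) helper is not shown in the excerpt above; a sketch of it, assuming a CVMetalTextureCache property named capturedImageTextureCache was created during setup with CVMetalTextureCacheCreate():

func createTexture(fromPixelBuffer pixelBuffer: CVPixelBuffer,
                   pixelFormat: MTLPixelFormat,
                   planeIndex: Int) -> MTLTexture? {
    let width = CVPixelBufferGetWidthOfPlane(pixelBuffer, planeIndex)
    let height = CVPixelBufferGetHeightOfPlane(pixelBuffer, planeIndex)
    // Wrap the requested plane of the pixel buffer in a Metal texture (no copy).
    var texture: CVMetalTexture?
    let status = CVMetalTextureCacheCreateTextureFromImage(nil, capturedImageTextureCache,
                                                           pixelBuffer, nil, pixelFormat,
                                                           width, height, planeIndex, &texture)
    guard status == kCVReturnSuccess, let cvTexture = texture else { return nil }
    return CVMetalTextureGetTexture(cvTexture)
}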
Next, in updateImagePlane(), we update the texture coordinates of the image plane to fit the viewport:
func updateImagePlane(frame: ARFrame) {
    let displayToCameraTransform = frame.displayTransform(withViewportSize: viewportSize, orientation: .landscapeRight).inverted()
    let vertexData = imagePlaneVertexBuffer.contents().assumingMemoryBound(to: Float.self)
    for index in 0...3 {
        let textureCoordIndex = 4 * index + 2
        let textureCoord = CGPoint(x: CGFloat(planeVertexData[textureCoordIndex]), y: CGFloat(planeVertexData[textureCoordIndex + 1]))
        let transformedCoord = textureCoord.applying(displayToCameraTransform)
        vertexData[textureCoordIndex] = Float(transformedCoord.x)
        vertexData[textureCoordIndex + 1] = Float(transformedCoord.y)
    }
}
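The planeVertexData array the loop reads from describes a full-screen quad, four vertices of interleaved position (x, y) and texture coordinate (u, v); as a sketch, it and the imagePlaneVertexBuffer it seeds might be set up like this:

let planeVertexData: [Float] = [
    -1.0, -1.0, 0.0, 1.0,   // bottom left:  position (x, y), texcoord (u, v)
     1.0, -1.0, 1.0, 1.0,   // bottom right
    -1.0,  1.0, 0.0, 0.0,   // top left
     1.0,  1.0, 1.0, 0.0    // top right
]

imagePlaneVertexBuffer = device.makeBuffer(bytes: planeVertexData,
                                           length: planeVertexData.count * MemoryLayout<Float>.size,
                                           options: [])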
Next, in drawCapturedImage(), we draw the camera feed into the scene:
func drawCapturedImage(renderEncoder: MTLRenderCommandEncoder) {
    guard capturedImageTextureY != nil && capturedImageTextureCbCr != nil else { return }
    renderEncoder.pushDebugGroup("DrawCapturedImage")
    renderEncoder.setCullMode(.none)
    renderEncoder.setRenderPipelineState(capturedImagePipelineState)
    renderEncoder.setDepthStencilState(capturedImageDepthState)
    renderEncoder.setVertexBuffer(imagePlaneVertexBuffer, offset: 0, index: 0)
    renderEncoder.setFragmentTexture(capturedImageTextureY, index: 1)
    renderEncoder.setFragmentTexture(capturedImageTextureCbCr, index: 2)
    renderEncoder.drawPrimitives(type: .triangleStrip, vertexStart: 0, vertexCount: 4)
    renderEncoder.popDebugGroup()
}
Finally, in drawAnchorGeometry(), we draw the geometry of the virtual content we created:
func drawAnchorGeometry(renderEncoder: MTLRenderCommandEncoder) {
    guard anchorInstanceCount > 0 else { return }
    renderEncoder.pushDebugGroup("DrawAnchors")
    renderEncoder.setCullMode(.back)
    renderEncoder.setRenderPipelineState(anchorPipelineState)
    renderEncoder.setDepthStencilState(anchorDepthState)
    renderEncoder.setVertexBuffer(anchorUniformBuffer, offset: anchorUniformBufferOffset, index: 2)
    renderEncoder.setVertexBuffer(sharedUniformBuffer, offset: sharedUniformBufferOffset, index: 3)
    renderEncoder.setFragmentBuffer(sharedUniformBuffer, offset: sharedUniformBufferOffset, index: 3)
    for bufferIndex in 0..<mesh.vertexBuffers.count {
        let vertexBuffer = mesh.vertexBuffers[bufferIndex]
        renderEncoder.setVertexBuffer(vertexBuffer.buffer, offset: vertexBuffer.offset, index: bufferIndex)
    }
    for submesh in mesh.submeshes {
        renderEncoder.drawIndexedPrimitives(type: submesh.primitiveType, indexCount: submesh.indexCount, indexType: submesh.indexType, indexBuffer: submesh.indexBuffer.buffer, indexBufferOffset: submesh.indexBuffer.offset, instanceCount: anchorInstanceCount)
    }
    renderEncoder.popDebugGroup()
}
Let's get back to the setupPipeline() function mentioned earlier. It creates two render pipeline state objects: one for the captured image (received from the camera) and one for the anchors we create when placing virtual objects in the scene. As expected, each state object has its own pair of vertex and fragment functions, which brings us to the last file of interest, Shaders.metal. In the first pair of shaders, used for the captured image, the vertex shader passes the image's vertex positions and texture coordinates along:
vertex ImageColorInOut capturedImageVertexTransform(ImageVertex in [[stage_in]]) {
    ImageColorInOut out;
    out.position = float4(in.position, 0.0, 1.0);
    out.texCoord = in.texCoord;
    return out;
}
In the fragment shader, we sample the two textures to get the color at the given texture coordinate, and then return the converted RGB color:
fragment float4 capturedImageFragmentShader(ImageColorInOut in [[stage_in]],
                                            texture2d<float, access::sample> textureY [[ texture(1) ]],
                                            texture2d<float, access::sample> textureCbCr [[ texture(2) ]]) {
    constexpr sampler colorSampler(mip_filter::linear, mag_filter::linear, min_filter::linear);
    const float4x4 ycbcrToRGBTransform = float4x4(float4(+1.0000f, +1.0000f, +1.0000f, +0.0000f),
                                                  float4(+0.0000f, -0.3441f, +1.7720f, +0.0000f),
                                                  float4(+1.4020f, -0.7141f, +0.0000f, +0.0000f),
                                                  float4(-0.7010f, +0.5291f, -0.8860f, +1.0000f));
    float4 ycbcr = float4(textureY.sample(colorSampler, in.texCoord).r, textureCbCr.sample(colorSampler, in.texCoord).rg, 1.0);
    return ycbcrToRGBTransform * ycbcr;
}
In the second pair of shaders, used for the anchor geometry, the vertex shader calculates the position of our vertex in clip space and outputs it for clipping and rasterization, then colors each cube face differently, then calculates the position of our vertex in eye space, and finally rotates the normal into world coordinates:
vertex ColorInOut anchorGeometryVertexTransform(Vertex in [[stage_in]],
                                                constant SharedUniforms &sharedUniforms [[ buffer(3) ]],
                                                constant InstanceUniforms *instanceUniforms [[ buffer(2) ]],
                                                ushort vid [[vertex_id]],
                                                ushort iid [[instance_id]]) {
    ColorInOut out;
    float4 position = float4(in.position, 1.0);
    float4x4 modelMatrix = instanceUniforms[iid].modelMatrix;
    float4x4 modelViewMatrix = sharedUniforms.viewMatrix * modelMatrix;
    out.position = sharedUniforms.projectionMatrix * modelViewMatrix * position;
    ushort colorID = vid / 4 % 6;
    out.color = colorID == 0 ? float4(0.0, 1.0, 0.0, 1.0) // Right face
              : colorID == 1 ? float4(1.0, 0.0, 0.0, 1.0) // Left face
              : colorID == 2 ? float4(0.0, 0.0, 1.0, 1.0) // Top face
              : colorID == 3 ? float4(1.0, 0.5, 0.0, 1.0) // Bottom face
              : colorID == 4 ? float4(1.0, 1.0, 0.0, 1.0) // Back face
              : float4(1.0, 1.0, 1.0, 1.0); // Front face
    out.eyePosition = half3((modelViewMatrix * position).xyz);
    float4 normal = modelMatrix * float4(in.normal.x, in.normal.y, in.normal.z, 0.0f);
    out.normal = normalize(half3(normal.xyz));
    return out;
}
In the fragment shader, we calculate the contribution of the directional light as the sum of the diffuse and specular terms, then compute the final color by multiplying the interpolated face color by the fragment's light contribution, and finally return that color with the face color's alpha channel as the fragment's opacity:
fragment float4 anchorGeometryFragmentLighting(ColorInOut in [[stage_in]],
                                               constant SharedUniforms &uniforms [[ buffer(3) ]]) {
    float3 normal = float3(in.normal);
    float3 directionalContribution = float3(0);
    {
        float nDotL = saturate(dot(normal, -uniforms.directionalLightDirection));
        float3 diffuseTerm = uniforms.directionalLightColor * nDotL;
        float3 halfwayVector = normalize(-uniforms.directionalLightDirection - float3(in.eyePosition));
        float reflectionAngle = saturate(dot(normal, halfwayVector));
        float specularIntensity = saturate(powr(reflectionAngle, uniforms.materialShininess));
        float3 specularTerm = uniforms.directionalLightColor * specularIntensity;
        directionalContribution = diffuseTerm + specularTerm;
    }
    float3 ambientContribution = uniforms.ambientLightColor;
    float3 lightContributions = ambientContribution + directionalContribution;
    float3 color = in.color.rgb * lightContributions;
    return float4(color, in.color.w);
}
If you run the app, you can add cubes to the camera view by tapping the screen, then walk around, get closer to, or circle the cubes to see the different color on each face, like this:

In the next part of this series, we will look deeper into Tracking and Scene Understanding, and see how plane detection, hit-testing, collisions and physics make our experience even better.
The source code is posted on Github.
Until next time!