Types of problems

Time complexity

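When a collection holds only a little data, time complexity seems to have a negligible effect on performance. But if the feature you are building is shared code and you cannot predict how much data callers will pass in, optimizing complexity becomes very important.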

01.png

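The figure above lists the time complexity of various cases; efficient sorting algorithms, for example, are generally O(n log n). Now look at the next figure: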

02.png

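The figure shows that O(n) is a watershed: anything above it has a large potential impact on performance. If it is a public interface, be sure to document the complexity, and keep it in mind when calling it yourself. Better still, optimize the algorithm or use a suitable system API, weighing memory consumption and trading space for time where it pays off.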

Here is an example based on checking whether a collection contains a value:

//O(1)
return array[idx] == value;

//O(n)
for (int i = 0; i < count; i++) {
    if (array[i] == value) {
        return YES;
    }
}
return NO;

//O(n^2) find a duplicate value
for (int i = 0; i < count; i++) {
    for (int j = 0; j < count; j++) {
        if ( i != j && array[i] == array[j]) {
            return YES;
        }
    }
}
return NO;
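As a rough illustration of moving below the O(n^2) mark, the duplicate check can sort a copy of the data first and then compare neighbours, which costs O(n log n) time plus O(n) extra memory. This is only a sketch in plain C; the function names are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Comparison callback for qsort: orders ints ascending. */
static int compare_ints(const void *a, const void *b) {
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Returns true if any value occurs more than once.
   Sorts a private copy (O(n log n)), then scans neighbours (O(n)). */
static bool has_duplicate(const int *values, size_t count) {
    if (count < 2) {
        return false;
    }
    int *copy = malloc(count * sizeof *copy);
    assert(copy != NULL); /* sketch: treat allocation failure as fatal */
    memcpy(copy, values, count * sizeof *copy);
    qsort(copy, count, sizeof *copy, compare_ints);

    bool found = false;
    for (size_t i = 1; i < count; i++) {
        if (copy[i] == copy[i - 1]) { /* equal values are adjacent after sorting */
            found = true;
            break;
        }
    }
    free(copy);
    return found;
}
```

The copy is the space paid for the time saved; the caller's array is left untouched.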

So what is the time complexity of the methods offered by the common Objective-C collection classes?

NSArray / NSMutableArray

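First, note that arrays are ordered and allow duplicate elements. This design means the storage cannot use the elements themselves as hash table keys for fast lookups, so the performance of different methods varies considerably.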

containsObject:, indexOfObject*, and removeObject: traverse the elements looking for a match, so their complexity is O(n) or worse.
objectAtIndex:, firstObject, lastObject, addObject:, and removeLastObject access an element by index or work only at the ends of the array, so they are O(1).
indexOfObject:inSortedRange:options:usingComparator: uses binary search, so it is O(log n).
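To make the O(log n) claim concrete, here is a minimal binary search over a sorted int array in plain C. It is not the Foundation implementation, only a sketch of the same technique; the function name is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the index of value in an ascending-sorted array, or -1 if absent.
   Each step halves the search range, so it runs in O(log n). */
static long sorted_index_of(const int *sorted, size_t count, int value) {
    assert(sorted != NULL || count == 0);
    size_t low = 0;
    size_t high = count; /* search range is [low, high) */
    while (low < high) {
        size_t mid = low + (high - low) / 2;
        if (sorted[mid] == value) {
            return (long)mid;
        }
        if (sorted[mid] < value) {
            low = mid + 1;
        } else {
            high = mid;
        }
    }
    return -1;
}
```

The precondition matters: on unsorted input the result is meaningless, which is why the Foundation method takes a sorted range.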

NSSet / NSMutableSet / NSCountedSet

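These collection types are unordered and hold no duplicate elements, so they can use a hash table for fast operations. For example, addObject:, removeObject:, and containsObject: are all O(1). Note that converting an array to a set merges duplicate elements into one and loses the ordering.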

NSDictionary / NSMutableDictionary

Much like Set, with key-value mapping added. Insertion, removal, and lookup are all O(1). Note that keys must conform to NSCopying.

How containsObject: is implemented differently in arrays and sets

The implementation in the array

- (BOOL) containsObject: (id)anObject
{
  return ([self indexOfObject: anObject] != NSNotFound);
}
- (NSUInteger) indexOfObject: (id)anObject
{
  unsigned  c = [self count];

  if (c > 0 && anObject != nil)
    {
      unsigned  i;
      IMP   get = [self methodForSelector: oaiSel];
      BOOL  (*eq)(id, SEL, id)
    = (BOOL (*)(id, SEL, id))[anObject methodForSelector: eqSel];

      for (i = 0; i < c; i++)
    if ((*eq)(anObject, eqSel, (*get)(self, oaiSel, i)) == YES)
      return i;
    }
  return NSNotFound;
}

As you can see, it walks the elements one by one and returns only once a match is found.

Next, look at how containsObject: is implemented in Set:

- (BOOL) containsObject: (id)anObject
{
  return (([self member: anObject]) ? YES : NO);
}
// member: is implemented in GSSet.m
- (id) member: (id)anObject
{
  if (anObject != nil)
    {
      GSIMapNode node = GSIMapNodeForKey(&map, (GSIMapKey)anObject);
      if (node != 0)
    {
      return node->key.obj;
    }
    }
  return nil;
}

The element is fetched from the map table by key. Because elements in a Set are unique, the hash of the element object can serve as the key, which makes lookup fast.
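The idea of using the element itself as the key can be sketched with a tiny fixed-size hash set of integers. This is not GSIMap; the capacity, hash function, and names are all assumptions chosen to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SET_CAPACITY 64 /* hypothetical fixed size; must be a power of two */

typedef struct {
    int32_t keys[SET_CAPACITY];
    bool used[SET_CAPACITY];
    size_t count;
} IntSet; /* zero-initialize before use: IntSet s = {0}; */

/* Maps a key to its home slot. */
static size_t int_set_slot(int32_t key) {
    return ((uint32_t)key * 2654435761u) & (SET_CAPACITY - 1);
}

/* Adds key; returns false if it was already present (no duplicates). */
static bool int_set_add(IntSet *set, int32_t key) {
    /* Keep at least one empty slot so probing always terminates. */
    assert(set->count < SET_CAPACITY - 1);
    size_t i = int_set_slot(key);
    while (set->used[i]) {
        if (set->keys[i] == key) {
            return false;
        }
        i = (i + 1) & (SET_CAPACITY - 1); /* linear probing */
    }
    set->used[i] = true;
    set->keys[i] = key;
    set->count++;
    return true;
}

/* Average O(1): jumps to the home slot instead of scanning every element. */
static bool int_set_contains(const IntSet *set, int32_t key) {
    size_t i = int_set_slot(key);
    while (set->used[i]) {
        if (set->keys[i] == key) {
            return true;
        }
        i = (i + 1) & (SET_CAPACITY - 1);
    }
    return false;
}
```

Unlike the array version, the lookup cost does not grow with the number of stored elements as long as the table stays sparsely filled.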

Optimizing with GCD

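GCD lets us move time-consuming work off the main thread so the app runs more smoothly and responds faster. When using GCD, though, take care to avoid situations that can cause thread explosion and deadlock. Moving work off the main thread is not a cure-all either: if a task consumes a lot of memory or CPU, GCD cannot help, and the only sound approach is to break the work into steps and spread it over time.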

Handling events asynchronously

03.png

The figure above shows the most typical way to handle an event asynchronously.

Long-running tasks

04.png

Use dispatch_block_create_with_qos_class to assign the QoS class QOS_CLASS_UTILITY to the GCD block. With this QoS the system optimizes for energy efficiency when doing heavy computation, I/O, networking, and complex data processing.

Avoiding thread explosion

Use serial queues
Use the NSOperationQueue concurrency limit, NSOperationQueue.maxConcurrentOperationCount

For example, the following code is risky and may cause thread explosion and deadlock:

for (int i = 0; i < 999; i++) {
    dispatch_async(q, ^{...});
}
dispatch_barrier_sync(q, ^{});

05.png

So how can this be avoided? First, you can use dispatch_apply:

dispatch_apply(999, q, ^(size_t i){...});

Or use dispatch_semaphore:

#define CONCURRENT_TASKS 4
dispatch_semaphore_t sema = dispatch_semaphore_create(CONCURRENT_TASKS);
for (int i = 0; i < 999; i++){
    dispatch_async(q, ^{
        dispatch_semaphore_signal(sema);
    });
    dispatch_semaphore_wait(sema, DISPATCH_TIME_FOREVER);
}

GCD-related crash logs

The manager thread

Thread 1:: Dispatch queue: com.apple.libdispatch-manager
0   libsystem_kernel.dylib   0x00007fff8967e08a kevent_qos + 10
1   libdispatch.dylib        0x00007fff8be05811 _dispatch_mgr_invoke + 251
2   libdispatch.dylib        0x00007fff8be05465 _dispatch_mgr_thread + 52

An idle worker thread

Thread 6:
0   libsystem_kernel.dylib       0x00007fff8967d772 __workq_kernreturn + 10
1   libsystem_pthread.dylib      0x00007fff8fd317d9 _pthread_wqthread + 1283
2   libsystem_pthread.dylib      0x00007fff8fd2ed95 start_wqthread + 13

An active worker thread

Thread 3 Crashed:: Dispatch queue: <queue name>
<my code>
7   libdispatch.dylib        0x07fff8fcfd323 _dispatch_call_block_and_release
8   libdispatch.dylib        0x07fff8fcf8c13 _dispatch_client_callout + 8
9   libdispatch.dylib        0x07fff8fcfc365 _dispatch_queue_drain + 1100
10  libdispatch.dylib        0x07fff8fcfdecc _dispatch_queue_invoke + 202
11  libdispatch.dylib        0x07fff8fcfb6b7 _dispatch_root_queue_drain + 463
12  libdispatch.dylib        0x07fff8fd09fe4 _dispatch_worker_thread3 + 91
13  libsystem_pthread.dylib  0x07fff93c17637 _pthread_wqthread + 729
14  libsystem_pthread.dylib  0x07fff93c1540d start_wqthread + 13

An idle main thread

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   libsystem_kernel.dylib     0x00007fff906614de mach_msg_trap + 10
1   libsystem_kernel.dylib     0x00007fff9066064f mach_msg + 55
2   com.apple.CoreFoundation   0x00007fff9a8c1eb4 __CFRunLoopServiceMachPort
3   com.apple.CoreFoundation   0x00007fff9a8c137b __CFRunLoopRun + 1371
4   com.apple.CoreFoundation   0x00007fff9a8c0bd8 CFRunLoopRunSpecific + 296
...
10  com.apple.AppKit           0x00007fff8e823c03 -[NSApplication run] + 594
11  com.apple.AppKit           0x00007fff8e7a0354 NSApplicationMain + 1832
12  com.example                0x00000001000013b4 start + 52

The main queue

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
<my code>
12  com.apple.Foundation      0x00007fff931157e8 __NSBLOCKOPERATION_IS_CALLING_OUT_TO_A_BLOCK__ + 7
13  com.apple.Foundation      0x00007fff931155b5 -[NSBlockOperation main] + 9
14  com.apple.Foundation      0x00007fff93114a6c -[__NSOperationInternal _start:] + 653
15  com.apple.Foundation      0x00007fff93114543 __NSOQSchedule_f + 184
16  libdispatch.dylib         0x00007fff935d6c13 _dispatch_client_callout + 8
17  libdispatch.dylib         0x00007fff935e2cbf _dispatch_main_queue_callback_4CF + 861
18  com.apple.CoreFoundation  0x00007fff8d9223f9 __CFRUNLOOP_IS_SERVICING_THE_MAIN_DISPATCH_QUEUE__
19  com.apple.CoreFoundation  0x00007fff8d8dd68f __CFRunLoopRun + 2159
20  com.apple.CoreFoundation  0x00007fff8d8dcbd8 CFRunLoopRunSpecific + 296
...
26  com.apple.AppKit          0x00007fff999a1bd3 -[NSApplication run] + 594
27  com.apple.AppKit          0x00007fff9991e324 NSApplicationMain + 1832
28  libdyld.dylib             0x00007fff9480f5c9 start + 1

I/O performance optimization

I/O is a major performance cost: any I/O operation breaks the low-power state, so reducing the number of I/O operations is the key to this optimization. Some ways to achieve that are listed below.

Write scattered small pieces of content as a single unit
Use the appropriate I/O API
Use the appropriate thread
Use NSCache for caching to reduce I/O

NSCache

06.png

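Why not just use a dictionary to achieve what the figure shows?
NSCache offers a dictionary-like key-value interface and also has the following features: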

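Automatically evicts objects to free the memory the cache occupies
NSCache is thread-safe
-(void)cache:(NSCache *)cache willEvictObject:(id)obj; is a delegate callback invoked when a cached object is about to be evicted
evictsObjectsWithDiscardedContent controls whether objects whose content has been discarded are evicted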

So how does NSCache provide these features?

Let's study how NSCache does it. First, NSCache holds an NSMutableDictionary.

@implementation NSCache
- (id) init
{
  if (nil == (self = [super init]))
    {
      return nil;
    }
  _objects = [NSMutableDictionary new];
  _accesses = [NSMutableArray new];
  return self;
}

A cached-object structure is needed to hold some extra information:

@interface _GSCachedObject : NSObject
{
  @public
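  id object; // the cached value
  NSString *key; // the key the value is cached under
  int accessCount; // number of accesses, used for automatic eviction
  NSUInteger cost; // from setObject:forKey:cost:
  BOOL isEvictable; // whether the object may be evicted automatically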
}
@end

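When reading from the cache, an entry whose isEvictable flag is set is moved to the end of the _accesses array, which keeps that array ordered by how recently entries were accessed. The accessCount of the cached object is incremented to prepare for the eviction check later. The implementation is as follows: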

- (id) objectForKey: (id)key
{
  _GSCachedObject *obj = [_objects objectForKey: key];
  if (nil == obj)
    {
      return nil;
    }
  if (obj->isEvictable) // only evictable objects are tracked in _accesses
    {
      // Move obj to the end of the access list
      [_accesses removeObjectIdenticalTo: obj];
      [_accesses addObject: obj];
    }
  obj->accessCount++;
  _totalAccesses++;
  return obj->object;
}

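Each time an entry is added, the cache first checks whether eviction is needed, then creates a cached object recording the key, object, and cost, and adds it to the _objects dictionary and, if it is evictable, to the _accesses array.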

- (void) setObject: (id)obj forKey: (id)key cost: (NSUInteger)num
{
  _GSCachedObject *oldObject = [_objects objectForKey: key];
  _GSCachedObject *newObject;

  if (nil != oldObject)
    {
      [self removeObjectForKey: oldObject->key];
    }
  [self _evictObjectsToMakeSpaceForObjectWithCost: num];
  newObject = [_GSCachedObject new];
  // Retained here, released when obj is dealloc'd
  newObject->object = RETAIN(obj);
  newObject->key = RETAIN(key);
  newObject->cost = num;
  if ([obj conformsToProtocol: @protocol(NSDiscardableContent)])
    {
      newObject->isEvictable = YES;
      [_accesses addObject: newObject];
    }
  [_objects setObject: newObject forKey: key];
  RELEASE(newObject);
  _totalCost += num;
}

So how is the automatic eviction mentioned above implemented?
Automatic eviction needs a trigger and a condition for entering eviction. One trigger is adding content to the cache; another is a memory warning. The condition check is as follows:

// cost is the cost specified when the new value is added
// _costLimit is the value of the totalCostLimit property
if (_costLimit > 0 && _totalCost + cost > _costLimit){
    spaceNeeded = _totalCost + cost - _costLimit;
}
// Evict only if space is needed: the cost limit would be exceeded
// or the number of objects has reached _countLimit
if (count > 0 && (spaceNeeded > 0 || count >= _countLimit))

So NSCache compares totalCostLimit with the running total of the costs of all added entries plus the new entry's cost, and exceeding the limit triggers eviction.

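Frequently accessed objects are not evicted. The mean access count is computed from _totalAccesses and the number of objects, and only objects whose access count is below 20% of that mean plus one are candidates for eviction.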

// _totalAccesses is incremented on every read of any value
NSUInteger averageAccesses = (_totalAccesses / count * 0.2) + 1;
// accessCount is incremented each time this obj is read
if (obj->accessCount < averageAccesses && obj->isEvictable)
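The two checks above are plain arithmetic, so a worked sketch in C may help. The functions below mirror the expressions from the snippets (including the integer division before the multiplication by 0.2); the function and parameter names are assumptions made for this example, not GNUstep API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* How much cost must be freed before adding an entry of new_cost.
   A cost_limit of 0 means "no cost limit", as in the snippet above. */
static uint64_t space_needed(uint64_t total_cost, uint64_t new_cost, uint64_t cost_limit) {
    if (cost_limit > 0 && total_cost + new_cost > cost_limit) {
        return total_cost + new_cost - cost_limit;
    }
    return 0;
}

/* Access-count threshold: 20% of the mean access count, plus one. */
static uint64_t eviction_threshold(uint64_t total_accesses, uint64_t count) {
    assert(count > 0);
    return (uint64_t)((double)(total_accesses / count) * 0.2 + 1.0);
}

/* An object is a candidate only if it is evictable and rarely accessed. */
static bool should_evict(uint64_t access_count, bool is_evictable,
                         uint64_t total_accesses, uint64_t count) {
    return is_evictable && access_count < eviction_threshold(total_accesses, count);
}
```

For example, with 100 total accesses over 10 objects the mean is 10, so the threshold is 3: an evictable object read twice is a candidate, one read three times is kept.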

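Before evicting, some preparation is needed, including clearing isEvictable on the cached object so that it is not picked for eviction again. Objects that meet the condition are placed in an eviction array; once enough space has been freed, no more objects need to be added. Finally the eviction array is traversed and the objects are removed one by one.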

NSUInteger cost = obj->cost;
obj->cost = 0;
// Will not be evicted again
obj->isEvictable = NO;
// Add to the list of keys to remove
if (_evictsObjectsWithDiscardedContent)
{
    [evictedKeys addObject: obj->key];
}
_totalCost -= cost;
// Stop once enough space has been freed
if (cost > spaceNeeded)
{
    break;
}
spaceNeeded -= cost;

The delegate callback runs during eviction, so cached data that needs to be persisted can be handled in the callback.

- (void) removeObjectForKey: (id)key
{
    _GSCachedObject *obj = [_objects objectForKey: key];
    
    if (nil != obj)
    {
        [_delegate cache: self willEvictObject: obj->object];
        _totalAccesses -= obj->accessCount;
        [_objects removeObjectForKey: key];
        [_accesses removeObjectIdenticalTo: obj];
    }
}

For the full implementation, see NSCache.m in GNUstep Base.

Now let's see how SDWebImage uses NSCache:

- (UIImage *)imageFromMemoryCacheForKey:(NSString *)key {
    return [self.memCache objectForKey:key];
}
- (UIImage *)imageFromDiskCacheForKey:(NSString *)key {
    // Check NSCache first
    UIImage *image = [self imageFromMemoryCacheForKey:key];
    if (image) {
        return image;
    }
    // Read from disk
    UIImage *diskImage = [self diskImageForKey:key];
    if (diskImage && self.shouldCacheImagesInMemory) {
        NSUInteger cost = SDCacheCostForImage(diskImage);
        [self.memCache setObject:diskImage forKey:key cost:cost];
    }
    return diskImage;
}

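As you can see, SDWebImage exploits NSCache's automatic eviction by putting images into an NSCache, so that rarely used images are evicted automatically when memory runs short. When reading, an image that has not been evicted is returned directly; only if it has been evicted is I/O performed to read it from disk. This trades memory for fewer disk operations while keeping that memory under control.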

Controlling how often the app wakes the device

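Notifications, VoIP, location, Bluetooth, and so on all wake the device from standby. Waking is relatively expensive and should not happen frequently. For notifications, most of the thinking has to happen at the product level. For location, the APIs below affect performance differently, which helps in choosing a suitable one.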

Continuous location updates

[locationManager startUpdatingLocation]

This method keeps the device active the whole time.

Deferred location updates

[locationManager allowDeferredLocationUpdatesUntilTraveled: timeout:]

An energy-efficient approach: location data is buffered in the location hardware. It suits apps such as running trackers.

Significant location changes

[locationManager startMonitoringSignificantLocationChanges]

Even more energy-efficient. It suits apps that only need a callback when the location changes significantly, such as weather apps.

Region monitoring

[locationManager startMonitoringForRegion:(CLRegion *)]

Another energy-efficient approach. It suits apps such as a museum guide that shows different information as the user moves between regions.

Frequently visited places

// Start monitoring
locationManager.startMonitoringVisits()
// Stop monitoring when no longer needed
locationManager.stopMonitoringVisits()

In short, do not use startUpdatingLocation() unless you really have to, and call stopUpdatingLocation() as soon as possible to stop location updates and save the user's battery.

How memory affects performance

First, reclaiming memory takes time, and a sudden demand for a large amount of memory hurts responsiveness.

How to prevent these performance problems, and is deliberate prevention needed?

Stick to the following principles to avoid some performance problems at the coding stage.

Reduce computational complexity to lower CPU usage
Stop unnecessary work while the app is responding to user interaction
Set an appropriate QoS
Coalesce timer tasks so the CPU spends more time idle

If there is no time to watch for these issues while implementing features, can automated code checks catch them instead?

How to check

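Could we search the code for these problems, writing a tool or using an existing one to automate it? It is possible, but there are too many cases to cover, existing tools support this poorly, and writing your own takes too long. So what works better?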

Monitoring the main thread

First create an observer with CFRunLoopObserverCreate that receives CFRunLoopActivity callbacks, then use CFRunLoopAddObserver to add it to the main thread's run loop, CFRunLoopGetMain(), in kCFRunLoopCommonModes.

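Next create a background thread to do the monitoring, using dispatch_semaphore_wait with a timeout as the monitoring interval. Checking every 16 or 20 milliseconds catches essentially everything that affects responsiveness. A stall is detected when the wait times out while the run loop is still in the BeforeSources or AfterWaiting state, meaning it did not move on within the interval.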
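The decision logic of the monitoring thread can be sketched independently of CoreFoundation. The C code below is a platform-neutral model, not the real observer: the enum stands in for CFRunLoopActivity, the boolean stands in for a timed-out dispatch_semaphore_wait, and requiring several consecutive timeouts before reporting is an assumption added to reduce false positives.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the CFRunLoopActivity values the observer reports. */
typedef enum {
    ACTIVITY_ENTRY,
    ACTIVITY_BEFORE_TIMERS,
    ACTIVITY_BEFORE_SOURCES,
    ACTIVITY_BEFORE_WAITING,
    ACTIVITY_AFTER_WAITING,
    ACTIVITY_EXIT
} LoopActivity;

typedef struct {
    int timeout_count; /* consecutive timed-out waits so far */
    int report_after;  /* assumption: report after this many in a row */
} StallDetector;

/* Call once per monitoring interval.
   timed_out: the semaphore wait expired, so no state change was signalled in time.
   activity:  the last activity recorded by the observer.
   Returns true when a stall should be reported. */
static bool stall_detector_step(StallDetector *d, bool timed_out, LoopActivity activity) {
    assert(d->report_after > 0);
    bool busy = (activity == ACTIVITY_BEFORE_SOURCES ||
                 activity == ACTIVITY_AFTER_WAITING);
    if (timed_out && busy) {
        if (++d->timeout_count >= d->report_after) {
            d->timeout_count = 0;
            return true;
        }
        return false;
    }
    d->timeout_count = 0; /* progress was made, or the loop is idle */
    return false;
}
```

In a real monitor the observer callback would store the activity and signal the semaphore, and the place where this function returns true is where the stack would be captured.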

How to print the stack and preserve the scene

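The overall idea is to get each thread's information, read the thread's state to obtain its frame pointer, walk the stack frames to collect return addresses, and look those addresses up in the symbol table (symbolication) to produce a readable stack trace. The details follow.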

Getting thread information

First get all threads with task_threads:

thread_act_array_t threads; // array of thread ports (integers), e.g. threads[1] = 5635
mach_msg_type_number_t thread_count = 0; // mach_msg_type_number_t is an integer type
const task_t this_task = mach_task_self(); // integer port name
// Get all threads of the current task
kern_return_t kr = task_threads(this_task, &threads, &thread_count);

While iterating, use thread_info to get the details of each thread:

SMThreadInfoStruct threadInfoSt = {0};
thread_info_data_t threadInfo;
thread_basic_info_t threadBasicInfo;
mach_msg_type_number_t threadInfoCount = THREAD_INFO_MAX;

if (thread_info((thread_act_t)thread, THREAD_BASIC_INFO, (thread_info_t)threadInfo, &threadInfoCount) == KERN_SUCCESS) {
    threadBasicInfo = (thread_basic_info_t)threadInfo;
    if (!(threadBasicInfo->flags & TH_FLAGS_IDLE)) {
        threadInfoSt.cpuUsage = threadBasicInfo->cpu_usage / 10;
        threadInfoSt.userTime = threadBasicInfo->user_time.microseconds; // microseconds part of the user time
    }
}

uintptr_t buffer[100];
int i = 0;
NSMutableString *reStr = [NSMutableString stringWithFormat:@"Stack of thread: %u:\n CPU used: %.1f percent\n user time: %d microseconds\n", thread, threadInfoSt.cpuUsage, threadInfoSt.userTime];

Getting the stack frames of a thread

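thread_get_state returns the machine context, which holds the thread's register state, including the frame pointer from which all of the thread's stack frames can be reached.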

_STRUCT_MCONTEXT machineContext; // register state, including the frame pointer
// Use thread_get_state to fill in the machine context with the thread's state
mach_msg_type_number_t state_count = smThreadStateCountByCPU();
kern_return_t kr = thread_get_state(thread, smThreadStateByCPU(), (thread_state_t)&machineContext.__ss, &state_count);

Create a stack frame struct to hold the frame data:

// Generic layout for back-tracing: the frame address holds the previous frame's pointer, followed by the return address
typedef struct SMStackFrame {
    const struct SMStackFrame *const previous;
    const uintptr_t return_address;
} SMStackFrame;

SMStackFrame stackFrame = {0};
// Get the current frame address from the frame (base) pointer
const uintptr_t framePointer = smMachStackBasePointerByCPU(&machineContext);
if (framePointer == 0 || smMemCopySafely((void *)framePointer, &stackFrame, sizeof(stackFrame)) != KERN_SUCCESS) {
    return @"Fail frame pointer";
}
for (; i < 32; i++) {
    buffer[i] = stackFrame.return_address;
    if (buffer[i] == 0 || stackFrame.previous == 0 || smMemCopySafely(stackFrame.previous, &stackFrame, sizeof(stackFrame)) != KERN_SUCCESS) {
        break;
    }
}
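The loop above is a linked-list walk: each frame stores a pointer to the previous frame and a return address. The C sketch below shows the same walk on fabricated frames. Unlike the real code it dereferences pointers directly instead of copying memory safely, which is only acceptable here because the frames are built by the example itself; the names are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same layout as SMStackFrame above: previous frame pointer, then return address. */
typedef struct Frame {
    const struct Frame *previous;
    uintptr_t return_address;
} Frame;

/* Follows the chain starting at frame and stores return addresses in out.
   Stops at a null link, a zero return address, or max_depth.
   Returns the number of addresses stored. */
static size_t collect_return_addresses(const Frame *frame, uintptr_t *out, size_t max_depth) {
    assert(out != NULL);
    size_t depth = 0;
    while (frame != NULL && depth < max_depth && frame->return_address != 0) {
        out[depth++] = frame->return_address;
        frame = frame->previous; /* move to the caller's frame */
    }
    return depth;
}
```

The depth cap plays the same role as the limit of 32 in the original loop: it bounds the work even if the chain is corrupted or cyclic.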

Symbolication

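The main idea of symbolication is to subtract the image's ASLR slide from the stack address to get the address as laid out in the Mach-O file, then use LC_SYMTAB to find the string table and symbol table in the __LINKEDIT segment and pick the closest symbol. The code is as follows: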

info->dli_fname = NULL;
info->dli_fbase = NULL;
info->dli_sname = NULL;
info->dli_saddr = NULL;
// Find which image the address belongs to
const uint32_t idx = smDyldImageIndexFromAddress(address);
if (idx == UINT_MAX) {
    return false;
}
/*
 Header
 ------------------
 Load commands
 Segment command 1 -------------|
 Segment command 2              |
 ------------------             |
 Data                           |
 Section 1 data |segment 1 <----|
 Section 2 data |          <----|
 Section 3 data |          <----|
 Section 4 data |segment 2
 Section 5 data |
 ...            |
 Section n data |
 */
/*----------Mach Header---------*/
// Get the mach_header from the image index
const struct mach_header* machHeader = _dyld_get_image_header(idx);
// Returns the virtual memory slide of the image at image_index, or 0 if image_index is out of range
// When the dynamic linker loads an image, it must map it into an unoccupied range of the process's virtual address space. It does so by adding a value, the virtual memory slide, to the image's base address
const uintptr_t imageVMAddressSlide = (uintptr_t)_dyld_get_image_vmaddr_slide(idx);
/*-----------ASLR slide---------*/
//https://en.wikipedia.org/wiki/Address_space_layout_randomization
const uintptr_t addressWithSlide = address - imageVMAddressSlide;
// Get the segment base address from the image index
// A segment defines a range of bytes in the Mach-O file plus the virtual memory address and protection attributes those bytes get when the dynamic linker loads the app. Segments are therefore always page-aligned. A segment contains zero or more sections.
const uintptr_t segmentBase = smSegmentBaseOfImageIndex(idx) + imageVMAddressSlide;
if (segmentBase == 0) {
    return false;
}
//
info->dli_fname = _dyld_get_image_name(idx);
info->dli_fbase = (void*)machHeader;

/*--------------Mach Segment-------------*/
// The symbol whose address matches best
const nlistByCPU* bestMatch = NULL;
uintptr_t bestDistance = ULONG_MAX;
uintptr_t cmdPointer = smCmdFirstPointerFromMachHeader(machHeader);
if (cmdPointer == 0) {
    return false;
}
// Walk every load command; only LC_SYMTAB is of interest here
for (uint32_t iCmd = 0; iCmd < machHeader->ncmds; iCmd++) {
    const struct load_command* loadCmd = (struct load_command*)cmdPointer;
    /*----------Symbol table of the target image----------*/
    // Besides __TEXT and __DATA there is the __LINKEDIT segment, which holds raw data used by the dynamic linker, such as symbols, strings, and relocation entries.
    // LC_SYMTAB describes where the string table and symbol table are located inside the __LINKEDIT segment
    if (loadCmd->cmd == LC_SYMTAB) {
        // Get the virtual memory offsets of the string table and symbol table.
        const struct symtab_command* symtabCmd = (struct symtab_command*)cmdPointer;
        const nlistByCPU* symbolTable = (nlistByCPU*)(segmentBase + symtabCmd->symoff);
        const uintptr_t stringTable = segmentBase + symtabCmd->stroff;
        
        for (uint32_t iSym = 0; iSym < symtabCmd->nsyms; iSym++) {
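            // If n_value is 0, the symbol refers to an external object
            if (symbolTable[iSym].n_value != 0) {
                // The offsets given are file offsets; subtracting the __LINKEDIT segment's file offset gives the virtual memory offsets of the string and symbol tables
                uintptr_t symbolBase = symbolTable[iSym].n_value;
                uintptr_t currentDistance = addressWithSlide - symbolBase;
                // Look for the smallest distance: addressWithSlide is an instruction address inside some function, so it is at or above that function's entry.
                // The function entry closest below addressWithSlide is the best match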
                if ((addressWithSlide >= symbolBase) && (currentDistance <= bestDistance)) {
                    bestMatch = symbolTable + iSym;
                    bestDistance = currentDistance;
                }
            }
        }
        if (bestMatch != NULL) {
            // Adding the virtual memory offset to the __LINKEDIT segment's virtual memory address gives the in-memory addresses of the string and symbol tables.
            info->dli_saddr = (void*)(bestMatch->n_value + imageVMAddressSlide);
            info->dli_sname = (char*)((intptr_t)stringTable + (intptr_t)bestMatch->n_un.n_strx);
            if (*info->dli_sname == '_') {
                info->dli_sname++;
            }
            // This happens if all symbols have been stripped
            if (info->dli_saddr == info->dli_fbase && bestMatch->n_type == 3) {
                info->dli_sname = NULL;
            }
            break;
        }
    }
    cmdPointer += loadCmd->cmdsize;
}
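Stripped of the Mach-O parsing, the core of the lookup is "remove the slide, then find the closest symbol that starts at or below the address". A minimal C sketch of just that step follows; the Symbol struct is a simplified, hypothetical stand-in for an nlist entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified symbol entry (stands in for an nlist record). */
typedef struct {
    uintptr_t value;  /* link-time address of the symbol; 0 means external */
    const char *name;
} Symbol;

/* Returns the symbol whose start is closest at or below the un-slid address,
   or NULL if no symbol qualifies. */
static const Symbol *nearest_symbol(const Symbol *table, size_t count,
                                    uintptr_t runtime_address, uintptr_t slide) {
    assert(table != NULL || count == 0);
    uintptr_t unslid = runtime_address - slide; /* address as laid out in the file */
    const Symbol *best = NULL;
    uintptr_t best_distance = UINTPTR_MAX;
    for (size_t i = 0; i < count; i++) {
        if (table[i].value == 0 || table[i].value > unslid) {
            continue; /* external symbol, or it starts after the address */
        }
        uintptr_t distance = unslid - table[i].value;
        if (distance <= best_distance) {
            best = &table[i];
            best_distance = distance;
        }
    }
    return best;
}
```

The distance is also the offset into the function, which is the "+ 52" style suffix seen in crash logs.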

Things to watch out for

Note that this code has an expensive call of its own, thread_get_state. The monitor will detect it too, so stacks containing it can be filtered out.

Ways to get more information

More information here means the full hierarchy of method calls and the time each method takes. What is the benefit?

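Time cost can be measured at a finer grain to find the slow methods, and faster interaction means a better user experience. Here are some scenarios worth measuring: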

Responsiveness
Button taps
Gestures
Tab switches
View controller switches and transitions

Set targets for the optimization, for example 60 fps for scrolling and animation and a response to user actions within 100 ms, then find the offenders one by one and fix them.

How do we get more information?

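Hooking objc_msgSend captures every method that is called. Recording the depth yields the tree of method calls, and recording the time before and after each call yields each method's cost, which together give a complete performance profile.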

To hook C functions you can use Facebook's fishhook, and to capture the method call tree you can use InspectiveC. Their implementations are described in detail below.

Getting the method call tree

First design two structs. CallRecord stores the details of a method call, including obj and SEL; ThreadCallStack uses index to record the current depth in the call tree. With the SEL, NSStringFromSelector gives the method name; with obj, object_getClass gives the Class and NSStringFromClass gives the class name.

// Shared structures.
typedef struct CallRecord_ {
  id obj;   // object_getClass gives the Class, then NSStringFromClass gives the class name
  SEL _cmd; // NSStringFromSelector gives the method name
  uintptr_t lr;
  int prevHitIndex;
  char isWatchHit;
} CallRecord;

typedef struct ThreadCallStack_ {
  FILE *file;
  char *spacesStr;
  CallRecord *stack;
  int allocatedLength;
  int index;
  int numWatchHits;
  int lastPrintedIndex;
  int lastHitIndex;
  char isLoggingEnabled;
  char isCompleteLoggingEnabled;
} ThreadCallStack;

Storing and reading ThreadCallStack

pthread_setspecific() attaches private data to the calling thread and pthread_getspecific() reads it back. This makes it possible to bind the ThreadCallStack to its thread and access it at any time. The code is as follows:

static inline ThreadCallStack * getThreadCallStack() {
  ThreadCallStack *cs = (ThreadCallStack *)pthread_getspecific(threadKey); // read
  if (cs == NULL) {
    cs = (ThreadCallStack *)malloc(sizeof(ThreadCallStack));
#ifdef MAIN_THREAD_ONLY
    cs->file = (pthread_main_np()) ? newFileForThread() : NULL;
#else
    cs->file = newFileForThread();
#endif
    cs->isLoggingEnabled = (cs->file != NULL);
    cs->isCompleteLoggingEnabled = 0;
    cs->spacesStr = (char *)malloc(DEFAULT_CALLSTACK_DEPTH + 1);
    memset(cs->spacesStr, ' ', DEFAULT_CALLSTACK_DEPTH);
    cs->spacesStr[DEFAULT_CALLSTACK_DEPTH] = '\0';
    cs->stack = (CallRecord *)calloc(DEFAULT_CALLSTACK_DEPTH, sizeof(CallRecord)); // allocate the default CallRecord storage
    cs->allocatedLength = DEFAULT_CALLSTACK_DEPTH;
    cs->index = cs->lastPrintedIndex = cs->lastHitIndex = -1;
    cs->numWatchHits = 0;
    pthread_setspecific(threadKey, cs); // store
  }
  return cs;
}

Recording method call depth

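Depth has to be recorded, and a method call contains further method calls, so two functions are needed: pushCallRecord records the start of a call and popCallRecord records its end. Depth is incremented at the start and decremented at the end.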

// On entry
static inline void pushCallRecord(id obj, uintptr_t lr, SEL _cmd, ThreadCallStack *cs) {
  int nextIndex = (++cs->index); // increase depth
  if (nextIndex >= cs->allocatedLength) {
    cs->allocatedLength += CALLSTACK_DEPTH_INCREMENT;
    cs->stack = (CallRecord *)realloc(cs->stack, cs->allocatedLength * sizeof(CallRecord));
    cs->spacesStr = (char *)realloc(cs->spacesStr, cs->allocatedLength + 1);
    memset(cs->spacesStr, ' ', cs->allocatedLength);
    cs->spacesStr[cs->allocatedLength] = '\0';
  }
  CallRecord *newRecord = &cs->stack[nextIndex];
  newRecord->obj = obj;
  newRecord->_cmd = _cmd;
  newRecord->lr = lr;
  newRecord->isWatchHit = 0;
}
// On exit
static inline CallRecord * popCallRecord(ThreadCallStack *cs) {
  return &cs->stack[cs->index--]; // decrease depth
}

Inserting calls before and after objc_msgSend

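The last step is to hook objc_msgSend so that pushCallRecord runs before the call and popCallRecord after it. Because code must run after the call, and because it is impossible to write a C function that preserves unknown arguments and jumps to an arbitrary function pointer, this has to be done in assembly.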

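The analysis below is for arm64. arm64 has 31 64-bit integer registers, x0 to x30. The main idea is to push the arguments onto the stack first. The argument registers are x0 - x7; for objc_msgSend, x0 holds the first argument, the receiver, and x1 the second, the selector _cmd. x8 carries the address for an indirectly returned (struct) result, so it is saved as well. Then the registers are shuffled, moving the return-address register lr into x1. pushCallRecord runs first, then the original objc_msgSend, whose return value is saved, and finally popCallRecord. The code is as follows: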

static void replacementObjc_msgSend() {
  __asm__ volatile (
    // sp is the stack pointer register; it always points to the top of the stack.
    // Push {q0-q7} onto the stack
      "stp q6, q7, [sp, #-32]!\n"
      "stp q4, q5, [sp, #-32]!\n"
      "stp q2, q3, [sp, #-32]!\n"
      "stp q0, q1, [sp, #-32]!\n"
    // Push {x0-x8, lr}
      "stp x8, lr, [sp, #-16]!\n"
      "stp x6, x7, [sp, #-16]!\n"
      "stp x4, x5, [sp, #-16]!\n"
      "stp x2, x3, [sp, #-16]!\n"
      "stp x0, x1, [sp, #-16]!\n"
    // Shuffle the arguments.
      "mov x2, x1\n"
      "mov x1, lr\n"
      "mov x3, sp\n"
    // Call preObjc_msgSend with bl label. bl is branch with link; label is the unconditional branch target, encoded as an offset from this instruction in the range -128MB to +128MB
      "bl __Z15preObjc_msgSendP11objc_objectmP13objc_selectorP9RegState_\n"
      "mov x9, x0\n"
      "mov x10, x1\n"
      "tst x10, x10\n"
    // Pop {x0-x8, lr} from the top of the stack
      "ldp x0, x1, [sp], #16\n"
      "ldp x2, x3, [sp], #16\n"
      "ldp x4, x5, [sp], #16\n"
      "ldp x6, x7, [sp], #16\n"
      "ldp x8, lr, [sp], #16\n"
    // Pop {q0-q7}
      "ldp q0, q1, [sp], #32\n"
      "ldp q2, q3, [sp], #32\n"
      "ldp q4, q5, [sp], #32\n"
      "ldp q6, q7, [sp], #32\n"
      "b.eq Lpassthrough\n"
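    // Call the original objc_msgSend with blr xn. blr behaves like bl except that it reads the new PC value from a register. xn is the 64-bit name of the general-purpose register holding the branch address, in the range 0 to 31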
      "blr x9\n"
    // Push {x0-x9}
      "stp x0, x1, [sp, #-16]!\n"
      "stp x2, x3, [sp, #-16]!\n"
      "stp x4, x5, [sp, #-16]!\n"
      "stp x6, x7, [sp, #-16]!\n"
      "stp x8, x9, [sp, #-16]!\n"
    // Push {q0-q7}
      "stp q0, q1, [sp, #-32]!\n"
      "stp q2, q3, [sp, #-32]!\n"
      "stp q4, q5, [sp, #-32]!\n"
      "stp q6, q7, [sp, #-32]!\n"
    // Call the postObjc_msgSend hook.
      "bl __Z16postObjc_msgSendv\n"
      "mov lr, x0\n"
    // Pop {q0-q7}
      "ldp q6, q7, [sp], #32\n"
      "ldp q4, q5, [sp], #32\n"
      "ldp q2, q3, [sp], #32\n"
      "ldp q0, q1, [sp], #32\n"
    // Pop {x0-x9}
      "ldp x8, x9, [sp], #16\n"
      "ldp x6, x7, [sp], #16\n"
      "ldp x4, x5, [sp], #16\n"
      "ldp x2, x3, [sp], #16\n"
      "ldp x0, x1, [sp], #16\n"
      "ret\n"
      "Lpassthrough:\n"
    // br branches unconditionally to the address in a register
      "br x9"
    );
}

Ways to measure time

To record time cost, timestamps have to be taken in pushCallRecord and popCallRecord. Below are several ways to time a stretch of code from start to finish.

Method 1: NSDate, microsecond resolution

NSDate* tmpStartData = [NSDate date];
// code to be measured
double deltaTime = [[NSDate date] timeIntervalSinceDate:tmpStartData];
NSLog(@"cost time: %f s", deltaTime);

Method 2: clock_t, microsecond resolution; clock() measures the CPU time the process has used, in clock ticks

clock_t start = clock();
// code to be measured
clock_t end = clock();
NSLog(@"cost time: %f s", (double)(end - start)/CLOCKS_PER_SEC);

Method 3: CFAbsoluteTime, microsecond resolution

CFAbsoluteTime start = CFAbsoluteTimeGetCurrent();
// code to be measured
CFAbsoluteTime end = CFAbsoluteTimeGetCurrent();
NSLog(@"cost time = %f s", end - start); //s

Method 4: CFTimeInterval, nanosecond resolution

CFTimeInterval start = CACurrentMediaTime();
// code to be measured
CFTimeInterval end = CACurrentMediaTime();
NSLog(@"cost time: %f s", end - start);

Method 5: mach_absolute_time, nanosecond resolution

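// Requires #include <mach/mach_time.h>
mach_timebase_info_data_t timebase;
mach_timebase_info(&timebase); // ratio for converting ticks to nanoseconds
uint64_t start = mach_absolute_time();
// code to be measured
uint64_t end = mach_absolute_time();
// Convert ticks to nanoseconds, then to seconds
double elapsed = (double)(end - start) * timebase.numer / timebase.denom * 1e-9;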

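Only the last two are suitable. The essential difference is this:
The clock behind NSDate and CFAbsoluteTimeGetCurrent() is synchronized with network time, so its value can jump when the clock is adjusted. mach_absolute_time() and CACurrentMediaTime() are based on the built-in monotonic clock. Pick one, take a timestamp in pushCallRecord and another in popCallRecord, and subtract to get the elapsed time.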
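How a timestamp per push and a subtraction per pop fit together can be sketched in portable C. As an assumption for portability, this example uses POSIX clock_gettime(CLOCK_MONOTONIC) in place of mach_absolute_time(), and a fixed-depth stack with made-up names instead of the ThreadCallStack shown earlier.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define MAX_DEPTH 64 /* hypothetical fixed depth for the sketch */

typedef struct {
    uint64_t start_ns[MAX_DEPTH]; /* start time of each call in flight */
    int depth;                    /* number of calls currently in flight */
} TimedCallStack;

/* Monotonic clock in nanoseconds; unaffected by wall-clock adjustments. */
static uint64_t monotonic_ns(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Call on entry: remember when this call started. */
static void timed_push(TimedCallStack *cs) {
    assert(cs->depth < MAX_DEPTH);
    cs->start_ns[cs->depth++] = monotonic_ns();
}

/* Call on exit: returns the elapsed nanoseconds of the innermost call. */
static uint64_t timed_pop(TimedCallStack *cs) {
    assert(cs->depth > 0);
    return monotonic_ns() - cs->start_ns[--cs->depth];
}
```

Because calls nest, the time reported for an outer call always includes the time of the calls made inside it.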

How to hook objc_msgSend

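So how is the C function objc_msgSend hooked? First, understand that dyld binds lazy and non-lazy symbols by updating pointers in particular sections of the __DATA segment of a Mach-O binary. fishhook rebinds these symbols by determining, for each symbol name passed to rebind_symbols, the locations to update and writing the replacement there. The key code is analyzed below: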

Iterating over dyld images

First iterate over all images loaded by dyld, taking each image's header and slide. Note that the first call mainly registers the callback.

if (!_rebindings_head->next) {
    _dyld_register_func_for_add_image(_rebind_symbols_for_image);
} else {
    uint32_t c = _dyld_image_count();
    for (uint32_t i = 0; i < c; i++) {
        _rebind_symbols_for_image(_dyld_get_image_header(i), _dyld_get_image_vmaddr_slide(i));
    }
}

Finding the symbol-table load commands

Next, find the load commands related to the symbol table: the linkedit segment command, the symtab command, and the dysymtab command. The code is as follows:

segment_command_t *cur_seg_cmd;
segment_command_t *linkedit_segment = NULL;
struct symtab_command* symtab_cmd = NULL;
struct dysymtab_command* dysymtab_cmd = NULL;

uintptr_t cur = (uintptr_t)header + sizeof(mach_header_t);
for (uint i = 0; i < header->ncmds; i++, cur += cur_seg_cmd->cmdsize) {
    cur_seg_cmd = (segment_command_t *)cur;
    if (cur_seg_cmd->cmd == LC_SEGMENT_ARCH_DEPENDENT) {
        if (strcmp(cur_seg_cmd->segname, SEG_LINKEDIT) == 0) {
            linkedit_segment = cur_seg_cmd;
        }
    } else if (cur_seg_cmd->cmd == LC_SYMTAB) {
        symtab_cmd = (struct symtab_command*)cur_seg_cmd;
    } else if (cur_seg_cmd->cmd == LC_DYSYMTAB) {
        dysymtab_cmd = (struct dysymtab_command*)cur_seg_cmd;
    }
}

Getting the base and indirect symbol tables

// Find base symbol/string table addresses
uintptr_t linkedit_base = (uintptr_t)slide + linkedit_segment->vmaddr - linkedit_segment->fileoff;
nlist_t *symtab = (nlist_t *)(linkedit_base + symtab_cmd->symoff);
char *strtab = (char *)(linkedit_base + symtab_cmd->stroff);

// Get indirect symbol table (array of uint32_t indices into symbol table)
uint32_t *indirect_symtab = (uint32_t *)(linkedit_base + dysymtab_cmd->indirectsymoff);

Replacing the function

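With the symbol table and the array of rebindings passed in, the pointers reached through the symbol table can be replaced. The implementation is as follows: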

uint32_t *indirect_symbol_indices = indirect_symtab + section->reserved1;
void **indirect_symbol_bindings = (void **)((uintptr_t)slide + section->addr);
for (uint i = 0; i < section->size / sizeof(void *); i++) {
    uint32_t symtab_index = indirect_symbol_indices[i];
    if (symtab_index == INDIRECT_SYMBOL_ABS || symtab_index == INDIRECT_SYMBOL_LOCAL ||
        symtab_index == (INDIRECT_SYMBOL_LOCAL   | INDIRECT_SYMBOL_ABS)) {
        continue;
    }
    uint32_t strtab_offset = symtab[symtab_index].n_un.n_strx;
    char *symbol_name = strtab + strtab_offset;
    if (strnlen(symbol_name, 2) < 2) {
        continue;
    }
    struct rebindings_entry *cur = rebindings;
    while (cur) {
        for (uint j = 0; j < cur->rebindings_nel; j++) {
            if (strcmp(&symbol_name[1], cur->rebindings[j].name) == 0) {
                if (cur->rebindings[j].replaced != NULL &&
                    indirect_symbol_bindings[i] != cur->rebindings[j].replacement) {
                    *(cur->rebindings[j].replaced) = indirect_symbol_bindings[i];
                }
                indirect_symbol_bindings[i] = cur->rebindings[j].replacement;
                goto symbol_loop;
            }
        }
        cur = cur->next;
    }
symbol_loop:;
}
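The essence of the rebinding step is overwriting a pointer slot while keeping the old pointer so the hook can still call the original. The C sketch below models that with an ordinary array of named function pointers. It does not touch Mach-O at all; BindingSlot and the other names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*Handler)(int);

/* Hypothetical stand-in for one lazy/non-lazy pointer slot plus its symbol name. */
typedef struct {
    const char *name;
    Handler target;
} BindingSlot;

/* Points every slot called name at replacement and hands back the previous
   target through original. Returns the number of slots updated. */
static size_t rebind_slots(BindingSlot *slots, size_t count, const char *name,
                           Handler replacement, Handler *original) {
    assert(slots != NULL && name != NULL);
    size_t updated = 0;
    for (size_t i = 0; i < count; i++) {
        if (strcmp(slots[i].name, name) != 0) {
            continue;
        }
        if (original != NULL && slots[i].target != replacement) {
            *original = slots[i].target; /* keep the old pointer so the hook can call through */
        }
        slots[i].target = replacement;
        updated++;
    }
    return updated;
}

/* Two toy implementations used to demonstrate the swap. */
static int add_one(int x) { return x + 1; }
static int add_hundred(int x) { return x + 100; }
```

The check against the replacement mirrors the original code: it avoids overwriting the saved pointer with the hook itself if the same rebinding is applied twice.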

Counting method call frequency

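In some scenarios certain methods are called very frequently, and some of those calls are unnecessary. The frequently called methods have to be found first in order to locate usage that may be wasting performance. How can they be found?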

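The rough idea builds on the call-depth recording described above: save the path of each method call, increment a counter in a database each time the same method is called along the same path, and finally build a view sorted by call count to find the most frequently called methods. The figure below shows the finished result: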

Now for the implementation.

Designing the structure for call-frequency records

Add path, frequency, and similar fields to the earlier time-cost model:

@property (nonatomic, strong) NSString *className;       // class name
@property (nonatomic, strong) NSString *methodName;      // method name
@property (nonatomic, assign) BOOL isClassMethod;        // whether it is a class method
@property (nonatomic, assign) NSTimeInterval timeCost;   // time cost
@property (nonatomic, assign) NSUInteger callDepth;      // call depth
@property (nonatomic, copy) NSString *path;              // path
@property (nonatomic, assign) BOOL lastCall;             // whether it is the last call in its path
@property (nonatomic, assign) NSUInteger frequency;      // call frequency
@property (nonatomic, strong) NSArray <SMCallTraceTimeCostModel *> *subCosts;

Assembling the method path

While traversing the method models recorded by SMCallTrace and each method's sub-calls, assemble the path and record it in the database:

for (SMCallTraceTimeCostModel *model in arr) {
    // Record the method path
    model.path = [NSString stringWithFormat:@"[%@ %@]",model.className,model.methodName];
    [self appendRecord:model to:mStr];
}

+ (void)appendRecord:(SMCallTraceTimeCostModel *)cost to:(NSMutableString *)mStr {
    [mStr appendFormat:@"%@\n path%@\n",[cost des],cost.path];
    if (cost.subCosts.count < 1) {
        cost.lastCall = YES;
    }
    // Record in the database
    [[[SMLagDB shareInstance] increaseWithClsCallModel:cost] subscribeNext:^(id x) {}];
    for (SMCallTraceTimeCostModel *model in cost.subCosts) {
        // Record the path of the method's sub-call
        model.path = [NSString stringWithFormat:@"%@ - [%@ %@]",cost.path,model.className,model.methodName];
        [self appendRecord:model to:mStr];
    }
}

The call-frequency database

Creating the database

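The lastcall column records whether the entry is the last call in its path. Only the last call needs to be shown, because the full path already reveals the parent and originating methods.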

_clsCallDBPath = [PATH_OF_DOCUMENT stringByAppendingPathComponent:@"clsCall.sqlite"];
if ([[NSFileManager defaultManager] fileExistsAtPath:_clsCallDBPath] == NO) {
    FMDatabase *db = [FMDatabase databaseWithPath:_clsCallDBPath];
    if ([db open]) {
        /*
         cid: primary id
         fid: parent id, unused for now
         cls: class name
         mtd: method name
         path: full path identifier
         timecost: time the method took
         calldepth: call depth
         frequency: number of calls
         lastcall: whether this is the last call
         */
        NSString *createSql = @"create table clscall (cid INTEGER PRIMARY KEY AUTOINCREMENT  NOT NULL, fid integer, cls text, mtd text, path text, timecost integer, calldepth integer, frequency integer, lastcall integer)";
        [db executeUpdate:createSql];
    }
}

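Adding a record

Before adding a record, first check whether the database already contains the same method call on the same path. If it does, add one to its frequency field; that is how the call count is kept.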

FMResultSet *rsl = [db executeQuery:@"select cid,frequency from clscall where path = ?", model.path];
if ([rsl next]) {
    //The path already exists, so bump its call frequency
    int fq = [rsl intForColumn:@"frequency"] + 1;
    int cid = [rsl intForColumn:@"cid"];
    [db executeUpdate:@"update clscall set frequency = ? where cid = ?", @(fq), @(cid)];
} else {
    //Otherwise insert a new record
    NSNumber *lastCall = @0;
    if (model.lastCall) {
        lastCall = @1;
    }
    [db executeUpdate:@"insert into clscall (cls, mtd, path, timecost, calldepth, frequency, lastcall) values (?, ?, ?, ?, ?, ?, ?)", model.className, model.methodName, model.path, @(model.timeCost), @(model.callDepth), @1, lastCall];
}
[db close];
[subscriber sendCompleted];
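
The "increment if present, otherwise insert with frequency 1" logic can be sketched without a database. The following is a minimal in-memory version in plain C; CallTable, CallRecord, increase_call, and the size limits are hypothetical names chosen for illustration, not the SMLagDB API.

```c
#include <assert.h>
#include <string.h>

#define MAX_RECORDS 128
#define PATH_LEN 256

/* Hypothetical in-memory stand-in for one row of the clscall table. */
typedef struct {
    char path[PATH_LEN];
    unsigned frequency;
} CallRecord;

typedef struct {
    CallRecord rows[MAX_RECORDS];
    int count;
} CallTable;

/* Increment the row that has this path, or insert it with frequency 1.
   Returns the new frequency, or 0 if the table is full or the path
   does not fit in a row. */
static unsigned increase_call(CallTable *table, const char *path) {
    assert(table != NULL && path != NULL);
    for (int i = 0; i < table->count; i++) {
        if (strcmp(table->rows[i].path, path) == 0) {
            return ++table->rows[i].frequency;   /* same path seen again */
        }
    }
    if (table->count >= MAX_RECORDS || strlen(path) >= PATH_LEN) {
        return 0;
    }
    CallRecord *row = &table->rows[table->count++];
    strcpy(row->path, path);                     /* length checked above */
    row->frequency = 1;
    return 1;
}
```

The lookup here is a linear scan, so each call costs O(n) in the number of distinct paths. In the real table the equivalent cost depends on how SQLite finds the row by path, so an index on that column is worth considering if the table grows large.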

Querying records

When querying, just remember to sort by the frequency field.

FMResultSet *rs = [db executeQuery:@"select * from clscall where lastcall=? order by frequency desc limit ?, 50",@1, @(page * 50)];
NSUInteger count = 0;
NSMutableArray *arr = [NSMutableArray array];
while ([rs next]) {
    SMCallTraceTimeCostModel *model = [self clsCallModelFromResultSet:rs];
    [arr addObject:model];
    count ++;
}
if (count > 0) {
    [subscriber sendNext:arr];
} else {
    [subscriber sendError:nil];
}
[subscriber sendCompleted];
[db close];
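
For readers less familiar with SQLite's "limit offset, count" form, the query above sorts by frequency in descending order and then returns 50 rows starting at offset page * 50. A minimal sketch of the same ordering and paging over an in-memory array follows; CallRow, page_of_rows, and PAGE_SIZE are hypothetical names, and the rows are assumed to be already filtered to lastcall = 1.

```c
#include <assert.h>
#include <stdlib.h>

#define PAGE_SIZE 50

/* Hypothetical row: just the columns that matter for ordering. */
typedef struct {
    int cid;
    unsigned frequency;
} CallRow;

/* qsort comparator: higher frequency first. */
static int by_frequency_desc(const void *a, const void *b) {
    const CallRow *x = a;
    const CallRow *y = b;
    if (x->frequency == y->frequency) {
        return 0;
    }
    return x->frequency < y->frequency ? 1 : -1;
}

/* Sort rows in place by descending frequency, then report the slice that
   belongs to `page`. *start receives the offset of the first row on the
   page; the return value is the number of rows on it (0 past the end). */
static size_t page_of_rows(CallRow *rows, size_t count, size_t page, size_t *start) {
    assert(rows != NULL && start != NULL);
    qsort(rows, count, sizeof rows[0], by_frequency_desc);
    size_t offset = page * PAGE_SIZE;
    if (offset >= count) {
        *start = count;
        return 0;
    }
    *start = offset;
    size_t remaining = count - offset;
    return remaining < PAGE_SIZE ? remaining : PAGE_SIZE;
}
```

An empty page corresponds to the branch in the query code where no rows come back and the subscriber receives an error instead of a result array.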

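Finding the stacks of threads with high CPU usage

The earlier section on printing stacks during lag detection mentioned that thread_info can report the CPU usage of each thread. However, heavy CPU usage off the main thread does not necessarily cause lag, so lag detection misses many cases of high CPU consumption. The only option is to poll the CPU usage of every thread and record those that exceed a threshold, say 70%, in order to track down the methods that drain the battery. The figure below shows the recorded stacks at the time of CPU overload:

With the groundwork above, the implementation is much easier: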

//Poll the CPU usage of every thread
+ (void)updateCPU {
    thread_act_array_t threads;
    mach_msg_type_number_t threadCount = 0;
    const task_t thisTask = mach_task_self();
    kern_return_t kr = task_threads(thisTask, &threads, &threadCount);
    if (kr != KERN_SUCCESS) {
        return;
    }
    for (mach_msg_type_number_t i = 0; i < threadCount; i++) {
        thread_info_data_t threadInfo;
        thread_basic_info_t threadBaseInfo;
        mach_msg_type_number_t threadInfoCount = THREAD_INFO_MAX;
        if (thread_info((thread_act_t)threads[i], THREAD_BASIC_INFO, (thread_info_t)threadInfo, &threadInfoCount) == KERN_SUCCESS) {
            threadBaseInfo = (thread_basic_info_t)threadInfo;
            if (!(threadBaseInfo->flags & TH_FLAGS_IDLE)) {
                //cpu_usage is scaled by TH_USAGE_SCALE (1000), so dividing by 10 gives a percentage
                integer_t cpuUsage = threadBaseInfo->cpu_usage / 10;
                if (cpuUsage > 70) {
                    //Print and record the stack when CPU usage exceeds 70%
                    NSString *reStr = smStackOfThread(threads[i]);
                    //Record it in the database
                    [[[SMLagDB shareInstance] increaseWithStackString:reStr] subscribeNext:^(id x) {}];
                    NSLog(@"CPU usage overload thread stack:\n%@",reStr);
                }
            }
        }
        //Release the thread port right returned by task_threads
        mach_port_deallocate(thisTask, threads[i]);
    }
    //Free the thread list that task_threads allocated in this task
    vm_deallocate(thisTask, (vm_address_t)threads, threadCount * sizeof(thread_act_t));
}
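
The division by 10 in the code above is easy to misread, so here is a small sketch of the conversion on its own. USAGE_SCALE is assumed to mirror TH_USAGE_SCALE from <mach/thread_info.h> and is redefined locally so the snippet compiles off Apple platforms; usage_percent and is_overloaded are hypothetical helper names.

```c
#include <assert.h>

/* Assumed to mirror TH_USAGE_SCALE from <mach/thread_info.h>: a cpu_usage
   of 1000 means 100% of one core. Redefined here so the sketch is portable. */
#define USAGE_SCALE 1000

/* Convert a scaled cpu_usage value to a whole percentage (rounds down). */
static int usage_percent(int cpu_usage) {
    assert(cpu_usage >= 0);
    return cpu_usage * 100 / USAGE_SCALE;
}

/* Non-zero when the thread is strictly above the threshold, matching the
   "cpuUsage > 70" comparison in updateCPU. */
static int is_overloaded(int cpu_usage, int threshold_percent) {
    return usage_percent(cpu_usage) > threshold_percent;
}
```

Because the comparison is strict and the percentage is rounded down, a thread reporting a cpu_usage of 705 (70.5%) is not flagged, while 710 is.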

Demo

The tools have been integrated into my earlier GCDFetchFeed project.

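To detect main-thread lag from a background thread, add [[SMLagMonitor shareInstance] beginMonitor]; where you want monitoring to start.
To trace all method calls, call [SMCallTrace start]; where tracing should begin. To stop tracing and print the results, call stop and save. You can also set a maximum depth and a minimum time cost to filter out information you don't need to see.
To count method call frequency, add [SMCallTrace startWithMaxDepth:3]; where counting should start, and use [SMCallTrace stopSaveAndClean]; to write the records to the database and free the memory they occupy. You can hook a view controller's viewWillAppear and viewWillDisappear: start recording on appear, then on disappear write to the database and clean up. The results are displayed by SMClsCallViewController; push it to see the list.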

References

WWDC

WWDC 2013 224 Designing Code for Performance
WWDC 2013 408 Optimizing Your Code Using LLVM
WWDC 2013 712 Energy Best Practices
WWDC 2014 710 Writing Energy Efficient Code, Part 1
WWDC 2014 712 Writing Energy Efficient Code, Part 2
WWDC 2015 230 Performance on iOS and watchOS
WWDC 2015 707 Achieving All-day Battery Life
WWDC 2015 708 Debugging Energy Issues
WWDC 2015 718 Building Responsive and Efficient Apps with GCD
WWDC 2016 406 Optimizing App Startup Time
WWDC 2016 719 Optimizing I/O for Performance and Battery Life
WWDC 2017 238 Writing Energy Efficient Apps
WWDC 2017 706 Modernizing Grand Central Dispatch Usage

PS:

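I'm currently looking for an intern to join me at Didi and tackle a project on engineering development efficiency. It will definitely be a rewarding experience. If you're interested, send your résumé to daiming@didichuxing.com or contact me on Weibo @戴銘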


An In-Depth Look at iOS Performance Optimization