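Exploring OOM: XNU Memory Status Management

Preface

OOM stands for Out Of Memory. On iOS it refers to an app being terminated because too much memory is in use. It is really a memory-management mechanism of the operating system: a phone's memory is limited and cannot be used without bound, so when memory runs short the system kills lower-priority processes to free memory for higher-priority ones. That kill is what we call an OOM.

So how exactly is an OOM triggered? The material currently available is fairly vague, and none of it gives a clear description.

Exploring the source

Fortunately this part of XNU is open source: the full kernel source can be downloaded from opensource.apple.com. The memory status management code lives mainly in kern_memorystatus.c and kern_memorystatus.h.

Priority queue

The system ranks processes by priority and maintains a single system-wide priority queue.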

#define MEMSTAT_BUCKET_COUNT (JETSAM_PRIORITY_MAX + 1)

typedef struct memstat_bucket {
    TAILQ_HEAD(, proc) list;
    int count;
} memstat_bucket_t;

memstat_bucket_t memstat_bucket[MEMSTAT_BUCKET_COUNT];

kern_memorystatus.c defines the memstat_bucket_t structure. It is simple: count is the number of processes at this priority, and list is a linked list holding those processes. (A linked list is used because insertion and removal are cheap.)

After the structure, the kernel defines an array of memstat_bucket_t that serves as the system's process priority queue. Each priority has one memstat_bucket_t, which holds every process at that priority.
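The bucket array described above can be sketched in user space as follows. This is a minimal illustration, not kernel code: `sketch_proc`, `sketch_buckets`, and the helper functions are hypothetical names, and `sketch_proc` is a stripped-down stand-in for the kernel's `struct proc`. It assumes a `<sys/queue.h>` that provides the TAILQ macros (macOS, the BSDs, glibc).

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

#define SKETCH_PRIORITY_MAX 21                         /* mirrors JETSAM_PRIORITY_MAX */
#define SKETCH_BUCKET_COUNT (SKETCH_PRIORITY_MAX + 1)

/* Hypothetical, simplified stand-in for the kernel's struct proc. */
struct sketch_proc {
    int pid;
    int priority;                     /* jetsam band, 0..SKETCH_PRIORITY_MAX */
    TAILQ_ENTRY(sketch_proc) link;    /* linkage inside one bucket */
};

typedef struct sketch_bucket {
    TAILQ_HEAD(, sketch_proc) list;   /* processes at this priority */
    int count;                        /* number of processes in list */
} sketch_bucket_t;

sketch_bucket_t sketch_buckets[SKETCH_BUCKET_COUNT];

/* Reset every bucket to an empty list. */
void sketch_buckets_init(void) {
    for (int i = 0; i < SKETCH_BUCKET_COUNT; i++) {
        TAILQ_INIT(&sketch_buckets[i].list);
        sketch_buckets[i].count = 0;
    }
}

/* Append a process to the bucket of its band. Returns 0 on success, -1 on a bad band. */
int sketch_insert(struct sketch_proc *p) {
    assert(p != NULL);
    if (p->priority < 0 || p->priority > SKETCH_PRIORITY_MAX) {
        return -1;
    }
    TAILQ_INSERT_TAIL(&sketch_buckets[p->priority].list, p, link);
    sketch_buckets[p->priority].count++;
    return 0;
}

/* Unlink a process that was previously inserted. */
void sketch_remove(struct sketch_proc *p) {
    assert(p != NULL);
    TAILQ_REMOVE(&sketch_buckets[p->priority].list, p, link);
    sketch_buckets[p->priority].count--;
}

/* First process of the lowest non-empty band, or NULL if every bucket is empty. */
struct sketch_proc *sketch_first(void) {
    for (int i = 0; i < SKETCH_BUCKET_COUNT; i++) {
        if (sketch_buckets[i].count > 0) {
            return TAILQ_FIRST(&sketch_buckets[i].list);
        }
    }
    return NULL;
}
```

Walking the array from index 0 upward is what makes "lowest priority first" cheap: no sorting across bands is needed, only a scan for the first non-empty bucket.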

kern_memorystatus.h defines the available priorities:

#define JETSAM_PRIORITY_REVISION                  2

#define JETSAM_PRIORITY_IDLE_HEAD                -2
/* The value -1 is an alias to JETSAM_PRIORITY_DEFAULT */
#define JETSAM_PRIORITY_IDLE                      0
#define JETSAM_PRIORITY_IDLE_DEFERRED         1 /* Keeping this around till all xnu_quick_tests can be moved away from it.*/
#define JETSAM_PRIORITY_AGING_BAND1       JETSAM_PRIORITY_IDLE_DEFERRED
#define JETSAM_PRIORITY_BACKGROUND_OPPORTUNISTIC  2
#define JETSAM_PRIORITY_AGING_BAND2       JETSAM_PRIORITY_BACKGROUND_OPPORTUNISTIC
#define JETSAM_PRIORITY_BACKGROUND                3
#define JETSAM_PRIORITY_ELEVATED_INACTIVE     JETSAM_PRIORITY_BACKGROUND
#define JETSAM_PRIORITY_MAIL                      4
#define JETSAM_PRIORITY_PHONE                     5
#define JETSAM_PRIORITY_UI_SUPPORT                8
#define JETSAM_PRIORITY_FOREGROUND_SUPPORT        9
#define JETSAM_PRIORITY_FOREGROUND               10
#define JETSAM_PRIORITY_AUDIO_AND_ACCESSORY      12
#define JETSAM_PRIORITY_CONDUCTOR                13
#define JETSAM_PRIORITY_HOME                     16
#define JETSAM_PRIORITY_EXECUTIVE                17
#define JETSAM_PRIORITY_IMPORTANT                18
#define JETSAM_PRIORITY_CRITICAL                 19

#define JETSAM_PRIORITY_MAX                      21

/* TODO - tune. This should probably be lower priority */
#define JETSAM_PRIORITY_DEFAULT                  18
#define JETSAM_PRIORITY_TELEPHONY                19

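Foreground is 10 and background is 3, so when memory is tight, background processes are killed first. Normally a foreground process is killed only after every process in the bands below foreground has been killed and memory is still tight.

OOM types

The kill cause enum currently has 11 values (10 real causes plus the invalid placeholder):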

/* Cause */
enum {
    kMemorystatusInvalid            = JETSAM_REASON_INVALID,
    kMemorystatusKilled         = JETSAM_REASON_GENERIC,
    kMemorystatusKilledHiwat        = JETSAM_REASON_MEMORY_HIGHWATER, //high water
    kMemorystatusKilledVnodes       = JETSAM_REASON_VNODE, // vnode
    kMemorystatusKilledVMPageShortage   = JETSAM_REASON_MEMORY_VMPAGESHORTAGE, // vm page shortage
    kMemorystatusKilledVMThrashing      = JETSAM_REASON_MEMORY_VMTHRASHING, // vm thrashing
    kMemorystatusKilledFCThrashing      = JETSAM_REASON_MEMORY_FCTHRASHING, // fc thrashing
    kMemorystatusKilledPerProcessLimit  = JETSAM_REASON_MEMORY_PERPROCESSLIMIT, // per process limit
    kMemorystatusKilledDiagnostic       = JETSAM_REASON_MEMORY_DIAGNOSTIC, // diagnostic
    kMemorystatusKilledIdleExit     = JETSAM_REASON_MEMORY_IDLE_EXIT, // idle exit
    kMemorystatusKilledZoneMapExhaustion    = JETSAM_REASON_ZONE_MAP_EXHAUSTION // map exhaustion
};

Each cause has a corresponding string that is written to the log:

/* For logging clarity */
static const char *memorystatus_kill_cause_name[] = {
    ""                      ,
    "jettisoned"        ,       /* kMemorystatusKilled          */
    "highwater"             ,       /* kMemorystatusKilledHiwat     */ 
    "vnode-limit"           ,       /* kMemorystatusKilledVnodes        */
    "vm-pageshortage"       ,       /* kMemorystatusKilledVMPageShortage    */
    "vm-thrashing"          ,       /* kMemorystatusKilledVMThrashing   */
    "fc-thrashing"          ,       /* kMemorystatusKilledFCThrashing   */
    "per-process-limit"     ,       /* kMemorystatusKilledPerProcessLimit   */
    "diagnostic"            ,       /* kMemorystatusKilledDiagnostic    */
    "idle-exit"             ,       /* kMemorystatusKilledIdleExit      */
    "zone-map-exhaustion"   ,       /* kMemorystatusKilledZoneMapExhaustion */
};
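Because the table above is indexed by the cause value, looking up a reason string is a bounds-checked array access, and going from a logged string back to a cause is a linear search. A minimal sketch follows; `sketch_cause_name` and `sketch_cause_from_name` are hypothetical helpers, and the code assumes the cause values run consecutively from 0 to 10, as the table layout implies.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Strings copied from memorystatus_kill_cause_name[]; the index is the cause value. */
static const char *const sketch_cause_names[] = {
    "",                    /* invalid */
    "jettisoned",
    "highwater",
    "vnode-limit",
    "vm-pageshortage",
    "vm-thrashing",
    "fc-thrashing",
    "per-process-limit",
    "diagnostic",
    "idle-exit",
    "zone-map-exhaustion",
};

#define SKETCH_CAUSE_COUNT (sizeof(sketch_cause_names) / sizeof(sketch_cause_names[0]))

/* Cause value -> reason string; "unknown" for values outside the table. */
const char *sketch_cause_name(unsigned int cause) {
    if (cause >= SKETCH_CAUSE_COUNT) {
        return "unknown";
    }
    return sketch_cause_names[cause];
}

/* Reason string (as seen in a log) -> cause value, or -1 if not recognized. */
int sketch_cause_from_name(const char *name) {
    assert(name != NULL);
    for (size_t i = 1; i < SKETCH_CAUSE_COUNT; i++) {
        if (strcmp(sketch_cause_names[i], name) == 0) {
            return (int)i;
        }
    }
    return -1;
}
```

The reverse lookup is the direction that matters when reading logs, since only the string form is recorded there.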

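When an app triggers an OOM, the system writes a log that can be found on the phone under Settings -> Privacy -> Analytics -> Analytics Data, in a file named JetsamEvent-xxx. Open the file and the reason field of a killed process states the OOM type.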

Here is a JetsamEvent file from my phone:

...
  "largestProcess" : "Boom",
  "genCounter" : 23,
  "processes" : [
  {
    "uuid" : "ebd916c8-96e7-3b8f-985d-027098a13fd6",
    "states" : [
      "daemon",
      "idle"
    ],
    "killDelta" : 1887,
    "genCount" : 0,
    "age" : 200706725,
    "purgeable" : 0,
    "fds" : 50,
    "coalition" : 268,
    "rpages" : 34,
    "reason" : "vm-pageshortage",
    "pid" : 2205,
    "cpuTime" : 0.0030500000000000002,
    "name" : "xpcproxy",
    "lifetimeMax" : 79
  },

...

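Here the process using the most memory is Boom, and the entry shown (xpcproxy) was killed with reason vm-pageshortage.

How an OOM is triggered

An OOM is normally triggered in one of two ways: synchronously or asynchronously. For example, the VMPageShortage type is triggered like this: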

boolean_t memorystatus_kill_on_VM_page_shortage(boolean_t async) {
    if (async) {
        return memorystatus_kill_process_async(-1, kMemorystatusKilledVMPageShortage);
    } else {
        os_reason_t jetsam_reason = os_reason_create(OS_REASON_JETSAM, JETSAM_REASON_MEMORY_VMPAGESHORTAGE);
        if (jetsam_reason == OS_REASON_NULL) {
            printf("memorystatus_kill_on_VM_page_shortage -- sync: failed to allocate jetsam reason\n");
        }

        return memorystatus_kill_process_sync(-1, kMemorystatusKilledVMPageShortage, jetsam_reason);
    }
}

The synchronous path is blunt: it kills the process with the given pid directly. If pid is -1, it kills the lowest-priority process in the priority queue. (If several processes share that priority, the system sorts them by memory usage and kills the one using the most.)

static boolean_t 
memorystatus_kill_process_sync(pid_t victim_pid, uint32_t cause, os_reason_t jetsam_reason) {
    boolean_t res;

    uint32_t errors = 0;

    if (victim_pid == -1) {
        /* No pid, so kill first process */
        res = memorystatus_kill_top_process(TRUE, TRUE, cause, jetsam_reason, NULL, &errors);
    } else {
        res = memorystatus_kill_specific_process(victim_pid, cause, jetsam_reason);
    }
    
    if (errors) {
        memorystatus_clear_errors();
    }

    return res;
}
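The victim selection rule described for the synchronous path can be sketched on a flat array. This is an illustration of the rule as stated in this article (lowest priority first, largest footprint as the tie-breaker), not the kernel's implementation; `sketch_victim` and `sketch_pick_victim` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flat view of a process: just what the selection rule needs. */
struct sketch_victim {
    int      pid;
    int      priority;         /* jetsam band, lower is killed first */
    uint64_t footprint_bytes;  /* memory currently charged to the process */
};

/*
 * Returns the index of the process to kill, or -1 if there is none.
 * victim_pid == -1: pick the lowest-priority process, largest footprint on ties.
 * otherwise:        pick exactly the process with that pid.
 */
int sketch_pick_victim(const struct sketch_victim *procs, size_t n, int victim_pid) {
    assert(procs != NULL || n == 0);
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (victim_pid != -1) {
            if (procs[i].pid == victim_pid) {
                return (int)i;
            }
            continue;
        }
        if (best < 0 ||
            procs[i].priority < procs[best].priority ||
            (procs[i].priority == procs[best].priority &&
             procs[i].footprint_bytes > procs[best].footprint_bytes)) {
            best = (int)i;
        }
    }
    return best;
}
```

With a foreground app at band 10 and two background apps at band 3, a call with pid -1 selects the larger of the two background apps, which matches the ordering described above.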

The asynchronous path instead sets a flag recording that memory is in trouble, then wakes the dedicated memory status thread, which manages memory state, triggers the OOM, kills some processes, and reclaims memory.

static boolean_t 
memorystatus_kill_process_async(pid_t victim_pid, uint32_t cause) {
    /*
     * TODO: allow a general async path
     *
     * NOTE: If a new async kill cause is added, make sure to update memorystatus_thread() to
     * add the appropriate exit reason code mapping.
     */
    if ((victim_pid != -1) || (cause != kMemorystatusKilledVMPageShortage && cause != kMemorystatusKilledVMThrashing &&
                   cause != kMemorystatusKilledFCThrashing && cause != kMemorystatusKilledZoneMapExhaustion)) {
        return FALSE;
    }
    
    kill_under_pressure_cause = cause;
    memorystatus_thread_wake();
    return TRUE;
}
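The asynchronous path boils down to "validate, record the cause, wake the worker". A minimal single-threaded sketch of that handshake follows. The names are hypothetical, the numeric cause values are placeholders chosen for the example, and a counter stands in for `memorystatus_thread_wake()`; real code would need proper synchronization between the caller and the worker thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder cause values for this sketch only. */
enum sketch_cause {
    SKETCH_CAUSE_NONE                = 0,
    SKETCH_CAUSE_VM_PAGESHORTAGE     = 4,
    SKETCH_CAUSE_VM_THRASHING        = 5,
    SKETCH_CAUSE_FC_THRASHING        = 6,
    SKETCH_CAUSE_PER_PROCESS_LIMIT   = 7,
    SKETCH_CAUSE_ZONE_MAP_EXHAUSTION = 10,
};

/* The worker resets the flag to 0, so 0 must mean "nothing pending". */
static_assert(SKETCH_CAUSE_NONE == 0, "a cleared flag must mean no pending cause");

unsigned int sketch_pending_cause = SKETCH_CAUSE_NONE; /* like kill_under_pressure_cause */
unsigned int sketch_wakeups = 0;                       /* stands in for waking the thread */

/* Accepts only "no specific pid" requests for the four causes the async path supports. */
bool sketch_kill_async(int victim_pid, unsigned int cause) {
    bool cause_ok = (cause == SKETCH_CAUSE_VM_PAGESHORTAGE ||
                     cause == SKETCH_CAUSE_VM_THRASHING ||
                     cause == SKETCH_CAUSE_FC_THRASHING ||
                     cause == SKETCH_CAUSE_ZONE_MAP_EXHAUSTION);
    if (victim_pid != -1 || !cause_ok) {
        return false;
    }
    sketch_pending_cause = cause;
    sketch_wakeups++;
    return true;
}

/* What the worker does once it has handled the request: read and clear the flag. */
unsigned int sketch_worker_take_cause(void) {
    unsigned int cause = sketch_pending_cause;
    sketch_pending_cause = SKETCH_CAUSE_NONE;
    return cause;
}
```

The caller never kills anything itself here. It only leaves a note for the worker, which is why the asynchronous path cannot target a specific pid.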

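The memory status thread

The system has a dedicated thread for managing memory status. When something goes wrong with memory or memory pressure gets too high, it kills some apps according to a set policy to reclaim memory.

With the unrelated code removed, the memory status thread looks like this: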

static void
memorystatus_thread(void *param __unused, wait_result_t wr __unused)
{
    static boolean_t is_vm_privileged = FALSE;

    boolean_t post_snapshot = FALSE;
    uint32_t errors = 0;
    uint32_t hwm_kill = 0;
    boolean_t sort_flag = TRUE;
    boolean_t corpse_list_purged = FALSE;
    int jld_idle_kills = 0;

    if (is_vm_privileged == FALSE) {
        /* One-time initialization */
        thread_wire(host_priv_self(), current_thread(), TRUE);
        is_vm_privileged = TRUE;
        
        if (vm_restricted_to_single_processor == TRUE)
            thread_vm_bind_group_add();
        thread_set_thread_name(current_thread(), "VM_memorystatus");
        memorystatus_thread_block(0, memorystatus_thread);
    }
    
    // The actual memory management loop
    while (memorystatus_action_needed()) {
        boolean_t killed;
        int32_t priority;
        uint32_t cause;
        uint64_t jetsam_reason_code = JETSAM_REASON_INVALID;
        os_reason_t jetsam_reason = OS_REASON_NULL;

        cause = kill_under_pressure_cause;
        switch (cause) {
            case kMemorystatusKilledFCThrashing:
                jetsam_reason_code = JETSAM_REASON_MEMORY_FCTHRASHING;
                break;
            case kMemorystatusKilledVMThrashing:
                jetsam_reason_code = JETSAM_REASON_MEMORY_VMTHRASHING;
                break;
            case kMemorystatusKilledZoneMapExhaustion:
                jetsam_reason_code = JETSAM_REASON_ZONE_MAP_EXHAUSTION;
                break;
            case kMemorystatusKilledVMPageShortage:
                /* falls through */
            default:
                jetsam_reason_code = JETSAM_REASON_MEMORY_VMPAGESHORTAGE;
                cause = kMemorystatusKilledVMPageShortage;
                break;
        }

        /* Trigger HIGHWATER-type OOM kills */
        boolean_t is_critical = TRUE;
        if (memorystatus_act_on_hiwat_processes(&errors, &hwm_kill, &post_snapshot, &is_critical)) {
            if (is_critical == FALSE) {
                /*
                 * For now, don't kill any other processes.
                 */
                break;
            } else {
                goto done;
            }
        }

        jetsam_reason = os_reason_create(OS_REASON_JETSAM, jetsam_reason_code);
        if (jetsam_reason == OS_REASON_NULL) {
            printf("memorystatus_thread: failed to allocate jetsam reason\n");
        }
        
        // Core OOM trigger path
        if (memorystatus_act_aggressive(cause, jetsam_reason, &jld_idle_kills, &corpse_list_purged, &post_snapshot)) {
            goto done;
        }

        os_reason_ref(jetsam_reason);

        /* Kill one process from the lowest-priority band */
        killed = memorystatus_kill_top_process(TRUE, sort_flag, cause, jetsam_reason, &priority, &errors);
        sort_flag = FALSE;

        if (killed) {
            /* Jetsam Loop Detection */
            if (memorystatus_jld_enabled == TRUE) {
                if ((priority == JETSAM_PRIORITY_IDLE) || (priority == system_procs_aging_band) || (priority == applications_aging_band)) {
                    jld_idle_kills++;
                } 
            }

            if ((priority >= JETSAM_PRIORITY_UI_SUPPORT) && (total_corpses_count() > 0) && (corpse_list_purged == FALSE)) {
                task_purge_all_corpses();
                corpse_list_purged = TRUE;
            }
            goto done;
        }
        
        if (memorystatus_avail_pages_below_critical()) {
            /*
             * Still under pressure and unable to kill a process - purge corpse memory
             */
            if (total_corpses_count() > 0) {
                task_purge_all_corpses();
                corpse_list_purged = TRUE;
            }

            if (memorystatus_avail_pages_below_critical()) {
                /*
                 * Still under pressure and unable to kill a process - panic
                 */
                panic("memorystatus_jetsam_thread: no victim! available pages:%llu\n", (uint64_t)memorystatus_available_pages);
            }
        }
            
done:       

        /*
         * We do not want to over-kill when thrashing has been detected.
         * To avoid that, we reset the flag here and notify the
         * compressor.
         */
        if (is_reason_thrashing(kill_under_pressure_cause)) {
            kill_under_pressure_cause = 0;
#if CONFIG_JETSAM
            vm_thrashing_jetsam_done();
#endif /* CONFIG_JETSAM */
        } else if (is_reason_zone_map_exhaustion(kill_under_pressure_cause)) {
            kill_under_pressure_cause = 0;
        }

        os_reason_free(jetsam_reason);
    }

    kill_under_pressure_cause = 0;
    
    if (errors) {
        memorystatus_clear_errors();
    }
}

There is a lot of code, so let's go through it step by step.

Entry condition

The real core is the while (memorystatus_action_needed()) loop; memorystatus_action_needed is the key condition for triggering an OOM.

/* Does cause indicate vm or fc thrashing? */
static boolean_t
is_reason_thrashing(unsigned cause)
{
    switch (cause) {
    case kMemorystatusKilledVMThrashing:
    case kMemorystatusKilledFCThrashing:
        return TRUE;
    default:
        return FALSE;
    }
}

/* Is the zone map almost full? */
static boolean_t
is_reason_zone_map_exhaustion(unsigned cause)
{
    if (cause == kMemorystatusKilledZoneMapExhaustion)
        return TRUE;
    return FALSE;
}

static boolean_t
memorystatus_action_needed(void)
{
    return (is_reason_thrashing(kill_under_pressure_cause) ||
            is_reason_zone_map_exhaustion(kill_under_pressure_cause) ||
           memorystatus_available_pages <= memorystatus_available_pages_pressure);
}

The loop is entered when kill_under_pressure_cause is kMemorystatusKilledVMThrashing, kMemorystatusKilledFCThrashing, or kMemorystatusKilledZoneMapExhaustion, or when the currently available memory, memorystatus_available_pages, is less than or equal to the threshold memorystatus_available_pages_pressure.
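The entry condition is easy to restate as a pure function of three inputs. A minimal sketch, with hypothetical names and placeholder cause values, assuming the state is passed in explicitly instead of being read from kernel globals:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder cause values for this sketch only. */
enum sketch_cause {
    SKETCH_CAUSE_NONE                = 0,
    SKETCH_CAUSE_VM_THRASHING        = 5,
    SKETCH_CAUSE_FC_THRASHING        = 6,
    SKETCH_CAUSE_ZONE_MAP_EXHAUSTION = 10,
};

/* The three inputs the entry check looks at. */
struct sketch_mem_state {
    unsigned int pending_cause;    /* like kill_under_pressure_cause */
    unsigned int available_pages;  /* like memorystatus_available_pages */
    unsigned int pressure_pages;   /* like memorystatus_available_pages_pressure */
};

/* True when the jetsam loop should run at least one more iteration. */
bool sketch_action_needed(const struct sketch_mem_state *s) {
    assert(s != NULL);
    bool thrashing = (s->pending_cause == SKETCH_CAUSE_VM_THRASHING ||
                      s->pending_cause == SKETCH_CAUSE_FC_THRASHING);
    bool zone_map  = (s->pending_cause == SKETCH_CAUSE_ZONE_MAP_EXHAUSTION);
    return thrashing || zone_map || s->available_pages <= s->pressure_pages;
}
```

Note the comparison is `<=`: sitting exactly on the threshold already counts as being under pressure.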

high-water

Once inside the loop, execution first reaches memorystatus_act_on_hiwat_processes:

/* Trigger HIGHWATER-type OOM kills */
boolean_t is_critical = TRUE;
if (memorystatus_act_on_hiwat_processes(&errors, &hwm_kill, &post_snapshot, &is_critical)) {
    if (is_critical == FALSE) {
        /*
         * For now, don't kill any other processes.
         */
        break;
    } else {
        goto done;
    }
}

This is the key function for triggering HIGHWATER-type OOMs:

static boolean_t
memorystatus_act_on_hiwat_processes(uint32_t *errors, uint32_t *hwm_kill, boolean_t *post_snapshot, __unused boolean_t *is_critical)
{
    boolean_t killed = memorystatus_kill_hiwat_proc(errors);

    if (killed) {
        *hwm_kill = *hwm_kill + 1;
        *post_snapshot = TRUE;
        return TRUE;
    } else {
        memorystatus_hwm_candidates = FALSE;
    }
    return FALSE;
}

memorystatus_act_on_hiwat_processes calls memorystatus_kill_hiwat_proc directly, and that is where the core logic lives.

static boolean_t
memorystatus_kill_hiwat_proc(uint32_t *errors)
{
    pid_t aPid = 0;
    proc_t p = PROC_NULL, next_p = PROC_NULL;
    boolean_t new_snapshot = FALSE, killed = FALSE;
    int kill_count = 0;
    unsigned int i = 0;
    uint32_t aPid_ep;
    uint64_t killtime = 0;
    clock_sec_t tv_sec;
    clock_usec_t tv_usec;
    uint32_t tv_msec;
    os_reason_t jetsam_reason = OS_REASON_NULL;
    
    jetsam_reason = os_reason_create(OS_REASON_JETSAM, JETSAM_REASON_MEMORY_HIGHWATER);
    proc_list_lock();
    
    next_p = memorystatus_get_first_proc_locked(&i, TRUE);
    while (next_p) {
        uint64_t footprint_in_bytes = 0;
        uint64_t memlimit_in_bytes  = 0;
        boolean_t skip = 0;

        p = next_p;
        next_p = memorystatus_get_next_proc_locked(&i, p, TRUE);
        
        aPid = p->p_pid;
        aPid_ep = p->p_memstat_effectivepriority;
        
        if (p->p_memstat_state  & (P_MEMSTAT_ERROR | P_MEMSTAT_TERMINATED)) {
            continue;
        }
        
        /* skip if no limit set */
        if (p->p_memstat_memlimit <= 0) {
            continue;
        }

        footprint_in_bytes = get_task_phys_footprint(p->task);
        memlimit_in_bytes  = (((uint64_t)p->p_memstat_memlimit) * 1024ULL * 1024ULL);   /* convert MB to bytes */
        skip = (footprint_in_bytes <= memlimit_in_bytes);


#if CONFIG_FREEZE
        if (!skip) {
            if (p->p_memstat_state & P_MEMSTAT_LOCKED) {
                skip = TRUE;
            } else {
                skip = FALSE;
            }               
        }
#endif

        if (skip) {
            continue;
        } else {
            if (memorystatus_jetsam_snapshot_count == 0) {
                /* snapshot initialization elided from this excerpt */
                new_snapshot = TRUE;
            }

            p->p_memstat_state |= P_MEMSTAT_TERMINATED;

            killtime = mach_absolute_time();
            absolutetime_to_microtime(killtime, &tv_sec, &tv_usec);
            tv_msec = tv_usec / 1000;
                
            {
                memorystatus_update_jetsam_snapshot_entry_locked(p, kMemorystatusKilledHiwat, killtime);
                    
                if (proc_ref_locked(p) == p) {
                    proc_list_unlock();

                    /*
                     * memorystatus_do_kill drops a reference, so take another one so we can
                     * continue to use this exit reason even after memorystatus_do_kill()
                     * returns
                     */
                    os_reason_ref(jetsam_reason);

                    killed = memorystatus_do_kill(p, kMemorystatusKilledHiwat, jetsam_reason);

                    /* Success? */
                    if (killed) {
                        proc_rele(p);
                        kill_count++;
                        goto exit;
                    }

                    proc_list_lock();
                    proc_rele_locked(p);
                    p->p_memstat_state &= ~P_MEMSTAT_TERMINATED;
                    p->p_memstat_state |= P_MEMSTAT_ERROR;
                    *errors += 1;
                }

                i = 0;
                next_p = memorystatus_get_first_proc_locked(&i, TRUE);
            }
        }
    }
    
    proc_list_unlock();
    
exit:
    os_reason_free(jetsam_reason);
    return killed;
}

It first calls memorystatus_get_first_proc_locked(&i, TRUE) to take the lowest-priority process from the priority queue. If that process's footprint is within its limit (footprint_in_bytes <= memlimit_in_bytes), it is skipped and the next process is fetched with memorystatus_get_next_proc_locked. If the footprint exceeds the limit, the process is killed via memorystatus_do_kill and the loop ends.
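Stripped of locking, snapshots, and error handling, the high-water scan is a filter over processes in priority order. A minimal sketch under those simplifications; `sketch_hw_proc` and `sketch_find_hiwat_victim` are hypothetical names, and the footprint is passed in as a plain field instead of being read with get_task_phys_footprint.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flat view of a process for the high-water check. */
struct sketch_hw_proc {
    int      pid;
    int32_t  memlimit_mb;      /* <= 0 means "no limit set", like p_memstat_memlimit */
    uint64_t footprint_bytes;  /* stand-in for the task's phys_footprint */
};

/*
 * procs must already be ordered lowest priority first.
 * Returns the pid of the first process over its limit, or -1 if none is.
 */
int sketch_find_hiwat_victim(const struct sketch_hw_proc *procs, size_t n) {
    assert(procs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (procs[i].memlimit_mb <= 0) {
            continue;                          /* no limit, never a high-water victim */
        }
        /* convert MB to bytes, as the kernel code does */
        uint64_t limit_bytes = (uint64_t)procs[i].memlimit_mb * 1024ULL * 1024ULL;
        if (procs[i].footprint_bytes <= limit_bytes) {
            continue;                          /* within its limit */
        }
        return procs[i].pid;
    }
    return -1;
}
```

A process sitting exactly at its limit is skipped, because the skip test uses `<=`; only a footprint strictly above the limit makes it a victim.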

The memory metric used here is phys_footprint. On my own phone, however, I have never seen a high-water OOM. My guess is that the high-water limit is fairly high and hard to hit. Check the OOM types on your own phone, and let me know if you find a high-water one.

normal kill

If no process is over its high-water limit, execution continues into memorystatus_act_aggressive. This is the usual OOM trigger path, and most OOMs happen here.

static boolean_t
memorystatus_act_aggressive(uint32_t cause, os_reason_t jetsam_reason, int *jld_idle_kills, boolean_t *corpse_list_purged, boolean_t *post_snapshot)
{
    if (memorystatus_jld_enabled == TRUE) {

        boolean_t killed;
        uint32_t errors = 0;

        /* Jetsam Loop Detection - locals */
        memstat_bucket_t *bucket;
        int     jld_bucket_count = 0;
        struct timeval  jld_now_tstamp = {0,0};
        uint64_t    jld_now_msecs = 0;
        int     elevated_bucket_count = 0;

        /* Jetsam Loop Detection - statics */
        static uint64_t  jld_timestamp_msecs = 0;
        static int   jld_idle_kill_candidates = 0;  /* Number of available processes in band 0,1 at start */
        static int   jld_eval_aggressive_count = 0;     /* Bumps the max priority in aggressive loop */
        static int32_t   jld_priority_band_max = JETSAM_PRIORITY_UI_SUPPORT;

        microuptime(&jld_now_tstamp);

        jld_now_msecs = (jld_now_tstamp.tv_sec * 1000);

        proc_list_lock();
        switch (jetsam_aging_policy) {
        case kJetsamAgingPolicyLegacy:
            bucket = &memstat_bucket[JETSAM_PRIORITY_IDLE];
            jld_bucket_count = bucket->count;
            bucket = &memstat_bucket[JETSAM_PRIORITY_AGING_BAND1];
            jld_bucket_count += bucket->count;
            break;
        case kJetsamAgingPolicySysProcsReclaimedFirst:
        case kJetsamAgingPolicyAppsReclaimedFirst:
            bucket = &memstat_bucket[JETSAM_PRIORITY_IDLE];
            jld_bucket_count = bucket->count;
            bucket = &memstat_bucket[system_procs_aging_band];
            jld_bucket_count += bucket->count;
            bucket = &memstat_bucket[applications_aging_band];
            jld_bucket_count += bucket->count;
            break;
        case kJetsamAgingPolicyNone:
        default:
            bucket = &memstat_bucket[JETSAM_PRIORITY_IDLE];
            jld_bucket_count = bucket->count;
            break;
        }

        bucket = &memstat_bucket[JETSAM_PRIORITY_ELEVATED_INACTIVE];
        elevated_bucket_count = bucket->count;

        proc_list_unlock();

        if ( (jld_bucket_count == 0) || 
             (jld_now_msecs > (jld_timestamp_msecs + memorystatus_jld_eval_period_msecs))) {
            jld_timestamp_msecs  = jld_now_msecs;
            // First reclaim the very low priority processes: JETSAM_PRIORITY_IDLE, system_procs_aging_band, applications_aging_band. Once they are reclaimed, jld_bucket_count is 0
            jld_idle_kill_candidates = jld_bucket_count;
            *jld_idle_kills      = 0;
            jld_eval_aggressive_count = 0;
            jld_priority_band_max   = JETSAM_PRIORITY_UI_SUPPORT;
        }

        // Normally the processes that can be reclaimed at any time (JETSAM_PRIORITY_IDLE, system_procs_aging_band, applications_aging_band) are reclaimed first; only after that can this branch be entered
        if (*jld_idle_kills > jld_idle_kill_candidates) {
            jld_eval_aggressive_count++;

            if ((jld_eval_aggressive_count == memorystatus_jld_eval_aggressive_count) &&
                (total_corpses_count() > 0) && (*corpse_list_purged == FALSE)) {
                task_purge_all_corpses();
                *corpse_list_purged = TRUE;
            }
            else if (jld_eval_aggressive_count > memorystatus_jld_eval_aggressive_count) {
                if ((memorystatus_jld_eval_aggressive_priority_band_max < 0) ||
                    (memorystatus_jld_eval_aggressive_priority_band_max >= MEMSTAT_BUCKET_COUNT)) {

                } else {
                    jld_priority_band_max = memorystatus_jld_eval_aggressive_priority_band_max;
                }
            }

            // Kill background processes first
            /* Visit elevated processes first */
            while (elevated_bucket_count) {

                elevated_bucket_count--;

                os_reason_ref(jetsam_reason);
                killed = memorystatus_kill_elevated_process(
                    cause,
                    jetsam_reason,
                    jld_eval_aggressive_count,
                    &errors);

                if (killed) {
                    *post_snapshot = TRUE;
                    // Still under pressure, so keep killing apps
                    if (memorystatus_avail_pages_below_pressure()) {
                        /*
                         * Still under pressure.
                         * Find another pinned processes.
                         */
                        continue;
                    } else {
                        return TRUE;
                    }
                } else {
                    break;
                }
            }

            // Then kill foreground processes
            killed = memorystatus_kill_top_process_aggressive(
                kMemorystatusKilledVMThrashing,
                jld_eval_aggressive_count, 
                jld_priority_band_max, 
                &errors);
                
            if (killed) {
                /* Always generate logs after aggressive kill */
                *post_snapshot = TRUE;
                *jld_idle_kills = 0;
                return TRUE;
            } 
        }

        return FALSE;
    }

    return FALSE;
}

There is a lot of logic here, so let's take it slowly.

First there is jld_bucket_count, the number of low-priority processes that can be killed outright.

switch (jetsam_aging_policy) {
case kJetsamAgingPolicyLegacy:
    bucket = &memstat_bucket[JETSAM_PRIORITY_IDLE];
    jld_bucket_count = bucket->count;
    bucket = &memstat_bucket[JETSAM_PRIORITY_AGING_BAND1];
    jld_bucket_count += bucket->count;
    break;
case kJetsamAgingPolicySysProcsReclaimedFirst:
case kJetsamAgingPolicyAppsReclaimedFirst:
    bucket = &memstat_bucket[JETSAM_PRIORITY_IDLE];
    jld_bucket_count = bucket->count;
    bucket = &memstat_bucket[system_procs_aging_band];
    jld_bucket_count += bucket->count;
    bucket = &memstat_bucket[applications_aging_band];
    jld_bucket_count += bucket->count;
    break;
case kJetsamAgingPolicyNone:
default:
    bucket = &memstat_bucket[JETSAM_PRIORITY_IDLE];
    jld_bucket_count = bucket->count;
    break;
}
if ( (jld_bucket_count == 0) || 
     (jld_now_msecs > (jld_timestamp_msecs + memorystatus_jld_eval_period_msecs))) {
    jld_timestamp_msecs  = jld_now_msecs;
    // First reclaim the very low priority processes: JETSAM_PRIORITY_IDLE, system_procs_aging_band, applications_aging_band. Once they are reclaimed, jld_bucket_count is 0
    jld_idle_kill_candidates = jld_bucket_count;
    *jld_idle_kills      = 0;
    jld_eval_aggressive_count = 0;
    jld_priority_band_max   = JETSAM_PRIORITY_UI_SUPPORT;
}

// Normally the processes that can be reclaimed at any time (JETSAM_PRIORITY_IDLE, system_procs_aging_band, applications_aging_band) are reclaimed first; only after that can this branch be entered
if (*jld_idle_kills > jld_idle_kill_candidates) {
// This is where our app's OOMs are usually triggered
}
    
killed = memorystatus_kill_top_process(TRUE, sort_flag, cause, jetsam_reason, &priority, &errors);
if (killed) {
    jld_idle_kills++;
}

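jetsam_aging_policy determines which priority bands hold processes that can be killed outright. Normally the policy is kJetsamAgingPolicyAppsReclaimedFirst or kJetsamAgingPolicySysProcsReclaimedFirst, in which case jld_bucket_count is the sum of the process counts in the JETSAM_PRIORITY_IDLE, system_procs_aging_band, and applications_aging_band buckets.

*jld_idle_kills counts the low-priority processes killed so far; it is incremented each time one is killed. jld_idle_kill_candidates is set to jld_bucket_count whenever the evaluation window resets. The condition if (*jld_idle_kills > jld_idle_kill_candidates) therefore holds only once more low-priority processes have been killed in the current window than were counted in jld_bucket_count at its start, that is, once all of those candidates have been killed.

So when memory runs short, the system first reclaims processes in the JETSAM_PRIORITY_IDLE, system_procs_aging_band, and applications_aging_band bands.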
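The bookkeeping behind that condition can be isolated into a small state machine. This is a minimal sketch of the window reset and the "kills exceed candidates" test only; `sketch_jld`, its fields, and both functions are hypothetical names, and time is passed in as a plain millisecond value instead of being read with microuptime.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Per-window bookkeeping, mirroring the jld_* variables in the excerpt. */
struct sketch_jld {
    uint64_t window_start_ms;       /* like jld_timestamp_msecs */
    int      idle_kill_candidates;  /* idle-band processes counted at window start */
    int      idle_kills;            /* idle-band kills since window start */
    int      aggressive_count;      /* how many times the aggressive branch fired */
};

/* Called by the kill loop whenever the victim came from an idle or aging band. */
void sketch_jld_note_idle_kill(struct sketch_jld *s) {
    assert(s != NULL);
    s->idle_kills++;
}

/*
 * One evaluation, as done at the top of the aggressive path.
 * Returns true when the aggressive branch should run.
 */
bool sketch_jld_evaluate(struct sketch_jld *s, uint64_t now_ms,
                         uint64_t period_ms, int idle_bucket_count) {
    assert(s != NULL);
    /* Start a new window if the idle bands are empty or the old window expired. */
    if (idle_bucket_count == 0 || now_ms > s->window_start_ms + period_ms) {
        s->window_start_ms      = now_ms;
        s->idle_kill_candidates = idle_bucket_count;
        s->idle_kills           = 0;
        s->aggressive_count     = 0;
    }
    /* More idle kills than there were candidates: go aggressive. */
    if (s->idle_kills > s->idle_kill_candidates) {
        s->aggressive_count++;
        return true;
    }
    return false;
}
```

The comparison is strict, so killing exactly as many idle processes as were counted is not enough; one more idle-band kill within the same window is needed before the aggressive branch is taken.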

Now let's look inside the condition:

if (*jld_idle_kills > jld_idle_kill_candidates) {
    jld_eval_aggressive_count++;

    if ((jld_eval_aggressive_count == memorystatus_jld_eval_aggressive_count) &&
        (total_corpses_count() > 0) && (*corpse_list_purged == FALSE)) {
        task_purge_all_corpses();
        *corpse_list_purged = TRUE;
    }
    else if (jld_eval_aggressive_count > memorystatus_jld_eval_aggressive_count) {
        if ((memorystatus_jld_eval_aggressive_priority_band_max < 0) ||
            (memorystatus_jld_eval_aggressive_priority_band_max >= MEMSTAT_BUCKET_COUNT)) {

        } else {
            jld_priority_band_max = memorystatus_jld_eval_aggressive_priority_band_max;
        }
    }

    // Kill background processes first
    /* Visit elevated processes first */
    while (elevated_bucket_count) {

        elevated_bucket_count--;

        os_reason_ref(jetsam_reason);
        killed = memorystatus_kill_elevated_process(
            cause,
            jetsam_reason,
            jld_eval_aggressive_count,
            &errors);

        if (killed) {
            *post_snapshot = TRUE;
            // Still under pressure, so keep killing apps
            if (memorystatus_avail_pages_below_pressure()) {
                /*
                 * Still under pressure.
                 * Find another pinned processes.
                 */
                continue;
            } else {
                return TRUE;
            }
        } else {
            break;
        }
    }

    // Then kill foreground processes
    killed = memorystatus_kill_top_process_aggressive(
        kMemorystatusKilledVMThrashing,
        jld_eval_aggressive_count, 
        jld_priority_band_max, 
        &errors);
        
    if (killed) {
        /* Always generate logs after aggressive kill */
        *post_snapshot = TRUE;
        *jld_idle_kills = 0;
        return TRUE;
    } 
}

It first kills background processes via memorystatus_kill_elevated_process. After each kill it checks memory pressure, again using memorystatus_available_pages:

static boolean_t memorystatus_avail_pages_below_pressure(void) {
    return (memorystatus_available_pages <= memorystatus_available_pages_pressure);
}

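If memorystatus_available_pages is still at or below the threshold, it kills the next process. Once all background processes have been killed and there is still memory pressure, memorystatus_kill_top_process_aggressive kills the lowest-priority remaining process. This is the key to FOOM (foreground OOM): if the foreground app is by then the lowest-priority process, it is the one that gets killed.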
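The "kill one, re-check pressure, repeat" loop can be sketched with a toy memory model in which each kill returns a known number of pages. The names are hypothetical, and the sketch assumes every kill succeeds, whereas the kernel loop also stops when a kill fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy memory model: two page counters. */
struct sketch_pressure {
    unsigned int available_pages;  /* like memorystatus_available_pages */
    unsigned int pressure_pages;   /* like memorystatus_available_pages_pressure */
};

/* Same test as memorystatus_avail_pages_below_pressure(). */
bool sketch_below_pressure(const struct sketch_pressure *m) {
    assert(m != NULL);
    return m->available_pages <= m->pressure_pages;
}

/*
 * Kills background processes in order; elevated_pages[i] is what kill i frees.
 * Stops as soon as pressure is relieved. Returns the number of kills and
 * reports through *relieved whether pressure ended. If *relieved is false
 * afterwards, the caller would fall through to the aggressive kill.
 */
size_t sketch_kill_elevated(struct sketch_pressure *m,
                            const unsigned int *elevated_pages, size_t n,
                            bool *relieved) {
    assert(m != NULL && relieved != NULL);
    assert(elevated_pages != NULL || n == 0);
    size_t killed = 0;
    *relieved = false;
    while (killed < n) {
        m->available_pages += elevated_pages[killed];  /* pages come back */
        killed++;
        if (!sketch_below_pressure(m)) {
            *relieved = true;                          /* done, stop killing */
            break;
        }
    }
    return killed;
}
```

Checking after every single kill is what keeps the loop from over-killing: a large process dying early can spare everything queued behind it.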

Computing memorystatus_available_pages

Whether a FOOM is triggered depends mainly on whether memorystatus_available_pages is at or below the threshold. So how is memorystatus_available_pages computed?

The source contains:

#define VM_CHECK_MEMORYSTATUS do { \
    memorystatus_pages_update(      \
            vm_page_pageable_external_count + \
        vm_page_free_count +        \
            (VM_DYNAMIC_PAGING_ENABLED() ? 0 : vm_page_purgeable_count) \
        ); \
    } while(0)
    
void memorystatus_pages_update(unsigned int pages_avail)
{
    memorystatus_available_pages = pages_avail;
    ...
}

So memorystatus_available_pages = vm_page_pageable_external_count + vm_page_free_count, plus vm_page_purgeable_count when VM_DYNAMIC_PAGING_ENABLED() is false.

  • vm_page_pageable_external_count: on iOS, the number of file-backed pages, which can be reclaimed when memory runs short
  • vm_page_free_count: the number of unused pages
  • vm_page_purgeable_count: the number of purgeable pages
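The macro above reduces to one sum with a conditional third term. A minimal sketch as a plain function; `sketch_vm_counters` and `sketch_available_pages` are hypothetical names, and the dynamic paging state is passed in as a boolean instead of being queried from the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The three counters that feed memorystatus_available_pages. */
struct sketch_vm_counters {
    unsigned int pageable_external;  /* file-backed pages that can be reclaimed */
    unsigned int free_pages;         /* unused pages */
    unsigned int purgeable;          /* purgeable pages */
};

/* Purgeable pages only count as available when dynamic paging is disabled. */
unsigned int sketch_available_pages(const struct sketch_vm_counters *c,
                                    bool dynamic_paging_enabled) {
    assert(c != NULL);
    unsigned int pages = c->pageable_external + c->free_pages;
    if (!dynamic_paging_enabled) {
        pages += c->purgeable;
    }
    return pages;
}
```

For counters of 1000, 500, and 200 pages, the result is 1700 with dynamic paging disabled and 1500 with it enabled.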

In addition, memorystatus_available_pages_pressure works out to 15% of the device's total memory. In other words, an OOM is triggered once available memory drops to 15% of system memory or below.
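To get a feel for what 15% means in pages, here is the arithmetic as a small function. It only reproduces the percentage stated above; it does not show how the kernel derives the threshold from its own tunables. The function name is hypothetical, and the 16 KB page size used in the example is an assumption about the device.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pressure threshold in pages for a device with total_bytes of memory.
 * percent is the share of total memory (15 in this article).
 */
unsigned int sketch_pressure_threshold_pages(uint64_t total_bytes,
                                             uint32_t page_size,
                                             unsigned int percent) {
    assert(page_size > 0 && percent <= 100);
    uint64_t total_pages = total_bytes / page_size;
    return (unsigned int)(total_pages * percent / 100);
}
```

For a hypothetical 4 GiB device with 16 KB pages, this gives 262144 total pages and a threshold of 39321 pages, which is a little over 614 MiB of available memory.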

Logic summary

Taken as a whole, the logic of memorystatus_thread is:

[Figure: memorystatus_thread flowchart]

  1. If kill_under_pressure_cause is kMemorystatusKilledVMThrashing, kMemorystatusKilledFCThrashing, or kMemorystatusKilledZoneMapExhaustion, or if the available memory memorystatus_available_pages is at or below the threshold memorystatus_available_pages_pressure, enter the OOM logic.
  2. Walk the processes and compare each one's phys_footprint with its limit. If one is over its limit, kill it as a high-water OOM.
  3. If processes remain in the JETSAM_PRIORITY_IDLE, system_procs_aging_band, or applications_aging_band buckets, kill one lowest-priority process and go back to the check in step 1.
  4. Once all low-priority processes have been killed, if memorystatus_available_pages is still at or below the threshold, kill background processes first. After each kill, check memorystatus_available_pages again; if it is now above the threshold, stop and return to step 1.
  5. Once all background processes have been killed, call memorystatus_kill_top_process_aggressive to kill the foreground process, then return to step 1.
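The steps above can be condensed into a toy simulation. This sketch keeps only the skeleton: the pressure check, the high-water pass, and "kill the lowest band next". It deliberately leaves out the evaluation window, the corpse purge, and the thrashing causes. All names are hypothetical, and each kill is assumed to succeed and to return a fixed number of pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy process: band, size, and an optional per-process limit. */
struct sim_proc {
    int      pid;
    int      priority;     /* jetsam band, lower is killed first */
    uint32_t pages;        /* pages given back when the process is killed */
    uint32_t limit_pages;  /* 0 means no high-water limit */
    bool     alive;
};

struct sim_system {
    struct sim_proc *procs;
    size_t           nprocs;
    uint32_t         available_pages;
    uint32_t         pressure_pages;
};

/* Lowest-priority live process; optionally only those over their own limit. */
static struct sim_proc *sim_pick(struct sim_system *s, bool over_limit_only) {
    struct sim_proc *best = NULL;
    for (size_t i = 0; i < s->nprocs; i++) {
        struct sim_proc *p = &s->procs[i];
        if (!p->alive) {
            continue;
        }
        if (over_limit_only && (p->limit_pages == 0 || p->pages <= p->limit_pages)) {
            continue;
        }
        if (best == NULL || p->priority < best->priority) {
            best = p;
        }
    }
    return best;
}

/*
 * Runs the loop until pressure is relieved, nothing is left to kill, or
 * cap kills were recorded. Writes victim pids in kill order and returns
 * how many there were.
 */
size_t sim_run(struct sim_system *s, int *killed_pids, size_t cap) {
    assert(s != NULL && killed_pids != NULL);
    size_t n = 0;
    while (s->available_pages <= s->pressure_pages && n < cap) {   /* step 1 */
        struct sim_proc *victim = sim_pick(s, true);                /* step 2 */
        if (victim == NULL) {
            victim = sim_pick(s, false);                            /* steps 3 to 5 */
        }
        if (victim == NULL) {
            break;                                                  /* no victim left */
        }
        victim->alive = false;
        s->available_pages += victim->pages;
        killed_pids[n++] = victim->pid;
    }
    return n;
}
```

In this model the foreground app dies only when everything in the lower bands is already gone and the system is still at or below the threshold, which is the FOOM case described in step 5.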

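Summary

According to the source, a foreground OOM can be triggered in three ways:

  1. A direct synchronous kill, such as a kMemorystatusKilledPerProcessLimit OOM. Explaining this would take another article, so it is outside the scope of this one.
  2. footprint_in_bytes > memlimit_in_bytes, which triggers a high-water OOM. So far I have not seen this type on my own phone.
  3. After all background processes have been killed, memorystatus_available_pages <= memorystatus_available_pages_pressure still holds, so the system kills our app.

OOM monitoring and fixing remain hard problems in the iOS world. For most apps the OOM rate is probably well above the crash rate, because crash monitoring already has mature solutions (you fix the crash from its stack trace), while OOM monitoring still has a long way to go.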
