First, the stack trace:
android.app.RemoteServiceException
Context.startForegroundService() did not then call Service.startForeground(): ServiceRecord{413a500 u0 com.xx.xx/com.xx.baseUi.service.BackgroundKeepService}
android.app.ActivityThread$H.handleMessage(ActivityThread.java:1796)
android.os.Handler.dispatchMessage(Handler.java:106)
android.os.Looper.loop(Looper.java:213)
android.app.ActivityThread.main(ActivityThread.java:6954)
java.lang.reflect.Method.invoke(Native Method)
com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:493)
com.android.internal.os.ZygoteInit.main(ZygoteInit.java:909)
Roughly, this says that startForegroundService() was called but Service.startForeground() was not called afterwards.
Back in the app code, however, BackgroundKeepService does call startForeground() in its onCreate() method. That was puzzling.
Could this be a bug in AOSP? When in doubt, read the source.
In the framework's ActiveServices class, two places produce the "Context.startForegroundService() did not then call Service.startForeground()" message:
void serviceForegroundTimeout(ServiceRecord r) {
    ProcessRecord app;
    synchronized (mAm) {
        if (!r.fgRequired || !r.fgWaiting || r.destroying) {
            return;
        }
        app = r.app;
        if (app != null && app.isDebugging()) {
            // The app's being debugged; let it ride
            return;
        }
        if (DEBUG_BACKGROUND_CHECK) {
            Slog.i(TAG, "Service foreground-required timeout for " + r);
        }
        r.fgWaiting = false;
        stopServiceLocked(r, false);
    }
    if (app != null) {
        final String annotation = "Context.startForegroundService() did not then call "
                + "Service.startForeground(): " + r;
        Message msg = mAm.mHandler.obtainMessage(
                ActivityManagerService.SERVICE_FOREGROUND_TIMEOUT_ANR_MSG);
        SomeArgs args = SomeArgs.obtain();
        args.arg1 = app;
        args.arg2 = annotation;
        msg.obj = args;
        mAm.mHandler.sendMessageDelayed(msg,
                mAm.mConstants.mServiceStartForegroundAnrDelayMs);
    }
}
and
void serviceForegroundCrash(ProcessRecord app, String serviceRecord,
        ComponentName service) {
    mAm.crashApplicationWithTypeWithExtras(
            app.uid, app.getPid(), app.info.packageName, app.userId,
            "Context.startForegroundService() did not then call " + "Service.startForeground(): "
                + serviceRecord, false /*force*/,
            ForegroundServiceDidNotStartInTimeException.TYPE_ID,
            ForegroundServiceDidNotStartInTimeException.createExtrasForService(service));
}
The first path reports an ANR rather than a crash, so it is not what we are after. The analysis therefore focuses on serviceForegroundCrash(). Tracing its caller leads to this code in ActivityManagerService:
case SERVICE_FOREGROUND_CRASH_MSG: {
    SomeArgs args = (SomeArgs) msg.obj;
    mServices.serviceForegroundCrash(
            (ProcessRecord) args.arg1,
            (String) args.arg2,
            (ComponentName) args.arg3);
    args.recycle();
} break;
This path is a crash and matches the stack trace. The SERVICE_FOREGROUND_CRASH_MSG message is sent from the bringDownServiceLocked() method in ActiveServices:
private void bringDownServiceLocked(ServiceRecord r, boolean enqueueOomAdj) {
    //....
    if (r.fgRequired) {
        //...
        mAm.mAppOpsService.finishOperation(AppOpsManager.getToken(mAm.mAppOpsService),
                AppOpsManager.OP_START_FOREGROUND, r.appInfo.uid, r.packageName, null);
        mAm.mHandler.removeMessages(
                ActivityManagerService.SERVICE_FOREGROUND_TIMEOUT_MSG, r);
        if (r.app != null) {
            Message msg = mAm.mHandler.obtainMessage(
                    ActivityManagerService.SERVICE_FOREGROUND_CRASH_MSG);
            SomeArgs args = SomeArgs.obtain();
            args.arg1 = r.app;
            args.arg2 = r.toString();
            args.arg3 = r.getComponentName();
            msg.obj = args;
            mAm.mHandler.sendMessage(msg);
        }
    }
    //...
}
Continuing: ActiveServices calls bringDownServiceLocked() from nine places. After ruling out the call sites tied to app startup, process kills, and similar scenarios, three remain: two in private String bringUpServiceLocked() and one in private final void bringDownServiceIfNeededLocked().
That narrows things down. The two call sites in bringUpServiceLocked() deal with process startup and permissions, so they can be ruled out as well.
That leaves bringDownServiceIfNeededLocked(), which is itself called from three places in ActiveServices:
private void stopServiceLocked(ServiceRecord service, boolean enqueueOomAdj) {
    //...
    bringDownServiceIfNeededLocked(service, false, false, enqueueOomAdj); // here
}
boolean stopServiceTokenLocked(ComponentName className, IBinder token,
        int startId) {
    if (DEBUG_SERVICE) Slog.v(TAG_SERVICE, "stopServiceToken: " + className
            + " " + token + " startId=" + startId);
    ServiceRecord r = findServiceLocked(className, token, UserHandle.getCallingUserId());
    if (r != null) {
        //...
        r.callStart = false;
        final long origId = Binder.clearCallingIdentity();
        bringDownServiceIfNeededLocked(r, false, false, false); // here
        Binder.restoreCallingIdentity(origId);
        return true;
    }
    return false;
}
void removeConnectionLocked(ConnectionRecord c, ProcessRecord skipApp,
        ActivityServiceConnectionsHolder skipAct, boolean enqueueOomAdj) {
    //...
    bringDownServiceIfNeededLocked(s, true, hasAutoCreate, enqueueOomAdj); // here
    //...
}
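The last one is connection-related. This Service has no Binder connections, so it can be ruled out. What remains are the two stop paths, which means the crash is, surprisingly, tied to stopService.

As a first check, I commented out the stopService call in the app and ran Monkey again. Sure enough, the crash was gone.
So why does stopService cause this crash? Let's trace its source, starting with stopService() in ActivityManagerService: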
@Override
public int stopService(IApplicationThread caller, Intent service,
        String resolvedType, int userId) {
    enforceNotIsolatedCaller("stopService");
    // Refuse possible leaked file descriptors
    if (service != null && service.hasFileDescriptors() == true) {
        throw new IllegalArgumentException("File descriptors passed in Intent");
    }
    synchronized(this) {
        return mServices.stopServiceLocked(caller, service, resolvedType, userId);
    }
}
Next, follow mServices.stopServiceLocked() into ActiveServices, to the method int stopServiceLocked(IApplicationThread caller, Intent service, String resolvedType, int userId):
int stopServiceLocked(IApplicationThread caller, Intent service,
        String resolvedType, int userId) {
    if (DEBUG_SERVICE) Slog.v(TAG_SERVICE, "stopService: " + service
            + " type=" + resolvedType);
    final ProcessRecord callerApp = mAm.getRecordForAppLOSP(caller);
    if (caller != null && callerApp == null) {
        throw new SecurityException(
                "Unable to find app for caller " + caller
                + " (pid=" + Binder.getCallingPid()
                + ") when stopping service " + service);
    }
    // If this service is active, make sure it is stopped.
    ServiceLookupResult r = retrieveServiceLocked(service, null, resolvedType, null,
            Binder.getCallingPid(), Binder.getCallingUid(), userId, false, false, false, false);
    if (r != null) {
        if (r.record != null) {
            final long origId = Binder.clearCallingIdentity();
            try {
                stopServiceLocked(r.record, false); // here
            } finally {
                Binder.restoreCallingIdentity(origId);
            }
            return 1;
        }
        return -1;
    }
    return 0;
}
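This lines up with private void stopServiceLocked(ServiceRecord service, boolean enqueueOomAdj) shown earlier, which goes through bringDownServiceIfNeededLocked() and ends up in bringDownServiceLocked(). Looking closely at the logic there: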
// Check to see if the service had been started as foreground, but being
// brought down before actually showing a notification. That is not allowed.
if (r.fgRequired) {
    Slog.w(TAG_SERVICE, "Bringing down service while still waiting for start foreground: "
            + r);
    r.fgRequired = false;
    r.fgWaiting = false;
    synchronized (mAm.mProcessStats.mLock) {
        ServiceState stracker = r.getTracker();
        if (stracker != null) {
            stracker.setForeground(false, mAm.mProcessStats.getMemFactorLocked(),
                    SystemClock.uptimeMillis());
        }
    }
    mAm.mAppOpsService.finishOperation(AppOpsManager.getToken(mAm.mAppOpsService),
            AppOpsManager.OP_START_FOREGROUND, r.appInfo.uid, r.packageName, null);
    mAm.mHandler.removeMessages(
            ActivityManagerService.SERVICE_FOREGROUND_TIMEOUT_MSG, r);
    if (r.app != null) {
        Message msg = mAm.mHandler.obtainMessage(
                ActivityManagerService.SERVICE_FOREGROUND_CRASH_MSG);
        SomeArgs args = SomeArgs.obtain();
        args.arg1 = r.app;
        args.arg2 = r.toString();
        args.arg3 = r.getComponentName();
        msg.obj = args;
        mAm.mHandler.sendMessage(msg);
    }
}
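If r.fgRequired is true, the crash is triggered. So what governs r.fgRequired? Before answering that, let's look at what Service.startForeground() actually does. As usual, read the code. The call goes through ActivityManagerService and ends up in our old friend ActiveServices: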
@GuardedBy("mAm")
public void setServiceForegroundLocked(ComponentName className, IBinder token,
        int id, Notification notification, int flags, int foregroundServiceType) {
    final int userId = UserHandle.getCallingUserId();
    final long origId = Binder.clearCallingIdentity();
    try {
        ServiceRecord r = findServiceLocked(className, token, userId);
        if (r != null) {
            setServiceForegroundInnerLocked(r, id, notification, flags, foregroundServiceType);
        }
    } finally {
        Binder.restoreCallingIdentity(origId);
    }
}
Next, setServiceForegroundInnerLocked:
@GuardedBy("mAm")
private void setServiceForegroundInnerLocked(final ServiceRecord r, int id,
        Notification notification, int flags, int foregroundServiceType) {
    //...
    if (r.fgRequired) {
        if (DEBUG_SERVICE || DEBUG_BACKGROUND_CHECK) {
            Slog.i(TAG, "Service called startForeground() as required: " + r);
        }
        r.fgRequired = false;
        r.fgWaiting = false;
        alreadyStartedOp = stopProcStatsOp = true;
        mAm.mHandler.removeMessages(
                ActivityManagerService.SERVICE_FOREGROUND_TIMEOUT_MSG, r);
    }
    //...
}
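In other words, the line r.fgRequired = false; never ran, and that led to the crash later. But startForeground() is called from the Service's onCreate(), so why would it not have run? And what does that have to do with stopping the Service? Could it be that the Service was stopped before its onCreate() had executed? After thinking it over, that was the only option left.
Because the Service's onCreate() is invoked across processes, this is indeed possible. To verify, I wrote a test that calls stopService immediately after startForegroundService, and the expected crash appeared: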
android.app.RemoteServiceException$ForegroundServiceDidNotStartInTimeException: Context.startForegroundService() did not then call Service.startForeground(): ServiceRecord{1e9c4ea u0 com.example.test/.MyService}
at android.app.ActivityThread.generateForegroundServiceDidNotStartInTimeException(ActivityThread.java:2042)
at android.app.ActivityThread.throwRemoteServiceException(ActivityThread.java:2013)
at android.app.ActivityThread.-$$Nest$mthrowRemoteServiceException(Unknown Source:0)
at android.app.ActivityThread$H.handleMessage(ActivityThread.java:2282)
at android.os.Handler.dispatchMessage(Handler.java:106)
at android.os.Looper.loopOnce(Looper.java:201)
at android.os.Looper.loop(Looper.java:288)
at android.app.ActivityThread.main(ActivityThread.java:8049)
at java.lang.reflect.Method.invoke(Native Method)
at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:548)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:942)
The corresponding code:
startForegroundService(Intent(this@MainActivity,MyService::class.java))
stopService(Intent(this@MainActivity,MyService::class.java))
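The flag bookkeeping traced above can be modeled without Android at all. The following is a minimal sketch, not the AOSP code: ServiceRecordModel, FgRequiredDemo, and their methods are hypothetical names, and the model keeps only the fgRequired flag and a boolean that stands in for SERVICE_FOREGROUND_CRASH_MSG.

```java
// Simplified model of the fgRequired bookkeeping. NOT the AOSP implementation;
// all class and method names here are hypothetical.
final class ServiceRecordModel {
    boolean fgRequired;   // set when startForegroundService() is called
    boolean crashPosted;  // stands in for SERVICE_FOREGROUND_CRASH_MSG being sent

    // Context.startForegroundService(): the system now expects startForeground().
    void startForegroundService() {
        fgRequired = true;
    }

    // Service.startForeground(): the promise is fulfilled, so the flag is cleared.
    void startForeground() {
        fgRequired = false;
    }

    // stopService(): bringing the service down while the promise is still open
    // posts the crash message.
    void stopService() {
        if (fgRequired) {
            fgRequired = false;
            crashPosted = true;
        }
    }
}

final class FgRequiredDemo {
    // Normal order: start, onCreate() calls startForeground(), then stop.
    static boolean normalOrderCrashes() {
        ServiceRecordModel r = new ServiceRecordModel();
        r.startForegroundService();
        r.startForeground();
        r.stopService();
        return r.crashPosted;
    }

    // Racy order: the stop request arrives before onCreate() had a chance to run.
    static boolean stopBeforeCreateCrashes() {
        ServiceRecordModel r = new ServiceRecordModel();
        r.startForegroundService();
        r.stopService();
        r.startForeground(); // too late, the crash message is already posted
        return r.crashPosted;
    }

    public static void main(String[] args) {
        System.out.println("normal order crashes: " + normalOrderCrashes());
        System.out.println("stop before onCreate crashes: " + stopBeforeCreateCrashes());
    }
}
```

Only the second ordering posts the crash, which matches the two-line reproduction above.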
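Why would these two calls happen back to back? It turns out the app has a keep-alive mechanism: just before the app goes to the background, it starts a foreground Service so that the process is less likely to be killed. Foreground/background detection is implemented with ActivityLifecycleCallbacks registered on the Application. When the number of visible pages drops to 0, the app calls startForegroundService; when it rises above 0, the app calls stopService. The catch is that the app implements skin switching by recreating the app, so when the system toggles between dark and light mode the app is recreated, which amounts to a rapid background/foreground switch and triggers the crash.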
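To make the sequence concrete, here is a small sketch of the visible-page counter, written as plain Java so it runs without Android. VisibleActivityTracker and RecreateDemo are hypothetical names. The model assumes the counter is driven by the onActivityStarted/onActivityStopped callbacks and that a recreate stops the old Activity before the new one starts; the app's real callback code is not shown in this article.

```java
import java.util.ArrayList;
import java.util.List;

// Models the foreground/background tracking. Names are hypothetical; in the app
// this logic would live in Application.ActivityLifecycleCallbacks.
final class VisibleActivityTracker {
    private int visibleCount;
    // Records which service call the tracker would make, in order.
    final List<String> serviceCalls = new ArrayList<>();

    void onActivityStarted() {
        visibleCount++;
        if (visibleCount == 1) {
            serviceCalls.add("stopService"); // app came back to the foreground
        }
    }

    void onActivityStopped() {
        visibleCount--;
        if (visibleCount == 0) {
            serviceCalls.add("startForegroundService"); // app went to the background
        }
    }
}

final class RecreateDemo {
    // A theme switch recreates the only visible Activity: stopped, then started again.
    static List<String> callsDuringRecreate() {
        VisibleActivityTracker tracker = new VisibleActivityTracker();
        tracker.onActivityStarted();   // initial launch
        tracker.serviceCalls.clear();  // ignore the launch-time call
        tracker.onActivityStopped();   // old instance goes away
        tracker.onActivityStarted();   // new instance appears right after
        return tracker.serviceCalls;
    }

    public static void main(String[] args) {
        System.out.println(callsDuringRecreate());
    }
}
```

Under these assumptions the recreate yields startForegroundService immediately followed by stopService, the same pair as in the reproduction.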
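So how can this be fixed? Route the stop request through the Service and let it destroy itself with stopSelf(), so that the stop is only handled after onCreate() (and with it startForeground()) has run. The code: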
override fun onStartCommand(intent: Intent?, flags: Int, startId: Int): Int {
    when (intent?.action) {
        ACTION_UPDATE_INFO -> {
        }
        ACTION_STOP_SERVICE -> {
            stopSelf()
        }
    }
    return START_NOT_STICKY
}
The call site:
val intent = Intent(this, BackgroundKeepService::class.java)
intent.action = BackgroundKeepService.ACTION_STOP_SERVICE
startService(intent)
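Why this ordering helps can be sketched with a toy model in plain Java. It assumes that the Service's onCreate() is dispatched to the app's main thread before the onStartCommand() carrying the stop action, and that onCreate() calls startForeground(). FixModel, FixDemo, and all method names are hypothetical, and the queue merely stands in for the app's main-thread message queue.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Toy model of the fix. Not AOSP code; all names are hypothetical.
final class FixModel {
    boolean fgRequired;
    boolean crashed;
    // Stands in for the app's main-thread message queue.
    private final Queue<Runnable> mainThread = new ArrayDeque<>();

    // System side: record the promise, then ask the app to create the service.
    void startForegroundService() {
        fgRequired = true;
        mainThread.add(() -> fgRequired = false); // onCreate() calls startForeground()
    }

    // System side: a direct stop is handled before the app has run anything.
    void stopService() {
        if (fgRequired) {
            crashed = true;
        }
        fgRequired = false;
    }

    // Fix: deliver the stop request as a command, queued behind onCreate().
    void startServiceWithStopAction() {
        mainThread.add(this::stopService); // onStartCommand() -> stopSelf()
    }

    // Run everything the app's main thread has pending, in order.
    void drainMainThread() {
        Runnable task;
        while ((task = mainThread.poll()) != null) {
            task.run();
        }
    }
}

final class FixDemo {
    static boolean crashesWithDirectStop() {
        FixModel m = new FixModel();
        m.startForegroundService();
        m.stopService();
        m.drainMainThread();
        return m.crashed;
    }

    static boolean crashesWithStopAction() {
        FixModel m = new FixModel();
        m.startForegroundService();
        m.startServiceWithStopAction();
        m.drainMainThread();
        return m.crashed;
    }

    public static void main(String[] args) {
        System.out.println("direct stopService crashes: " + crashesWithDirectStop());
        System.out.println("stop action + stopSelf crashes: " + crashesWithStopAction());
    }
}
```

In the model, the stop action is processed only after the flag has been cleared, so the crash condition is never met.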
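That completes the analysis, reproduction, and verification of the crash. It no longer shows up in production.
Summary: this is a restriction AOSP places on startForeground. If the business logic cannot avoid starting and stopping the Service within a short interval, the approach described above sidesteps the crash.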
