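Background: during a lottery campaign, a developer mistakenly called a deprecated legacy interface of the user service. That legacy interface operated on large objects and was very slow. The lottery endpoint also saw a fair amount of concurrency during the campaign, so a large number of threads ended up blocked synchronously at the call site, which nearly took down the whole lottery service. This is the service avalanche effect, also known as the cascading effect.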
1. The code in this post is a simulation: the order service calls the inventory service and then the user service in turn
@ApiOperation(value = "Place order")
@GetMapping("/create/{userId}")
public CommonResponse<Void> createOrder(@PathVariable(value = "userId") Long userId) {
logger.info("---Calling inventory service, checking stock---");
logger.info("---Calling user service, querying user info---");
UserDTO userInfo = accountService.getUserInfo(userId);
logger.info("---Got user info, validating");
logger.info("---Deducting payment, placing order");
return ResultBuilder.buildSuccessResult(null);
}
Make the user service's query endpoint take 5 seconds (simulated) and set ribbon.ReadTimeout=1000 in the order service. Unsurprisingly, the call fails:
Caused by: java.net.SocketTimeoutException: Read timed out
But relying on the timeout error alone not only interrupts the normal business flow, it also leaves a large number of order threads blocked until the timeout fires!!!
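To make the blocking problem concrete, here is a small JDK-only sketch. The class name `BlockingDemo`, the pool size and the delays are all assumptions for illustration and are not taken from the original project; the sleep simply stands in for the slow user-service call.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class BlockingDemo {
    // Hypothetical stand-in for the slow user-service call.
    // Returns true if the full delay elapsed, false if the thread was interrupted.
    static boolean slowUserService(long millis) {
        try {
            Thread.sleep(millis);
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Submits `requests` tasks to a pool of `workers` threads and reports
    // how many of them completed within `waitMillis`.
    static int completedWithin(int workers, int requests, long callMillis, long waitMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        AtomicInteger done = new AtomicInteger();
        for (int i = 0; i < requests; i++) {
            pool.submit(() -> {
                if (slowUserService(callMillis)) {
                    done.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(waitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        int finished = done.get(); // snapshot before cancelling the rest
        pool.shutdownNow();
        return finished;
    }

    public static void main(String[] args) {
        // 2 worker threads, 6 requests, each call blocks for 400 ms.
        int finished = completedWithin(2, 6, 400, 600);
        System.out.println("finished within 600 ms: " + finished + " of 6");
    }
}
```

Every worker thread is held by a slow call, so the remaining requests just sit and wait. That is the situation a circuit breaker is meant to cut short.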
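2. Using Hystrix
A circuit-breaking mechanism is clearly needed. So that Ribbon's own timeout does not interfere with the tests, first raise ribbon.ReadTimeout to 10 seconds.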
<dependency>
<groupId>org.springframework.cloud</groupId>
<artifactId>spring-cloud-starter-netflix-hystrix</artifactId>
</dependency>
Add @EnableCircuitBreaker or @EnableHystrix to the order service's startup class
@Service
public class AccountService {
Logger logger = LoggerFactory.getLogger(getClass());
@Autowired
private AccountClient accountClient;
public UserDTO getUserInfo(Long userId) {
UserIdDTO dto = new UserIdDTO();
dto.setUserId(userId);
CommonResponse<UserDTO> response = accountClient.getUserInfo(dto);
logger.info("---Calling user service, querying user info---{}", response);
if (response == null) {
throw new BizException(StatusCodeConstant.ACCOUNT_SERVICE_CALL_ERROR.getCode(),
StatusCodeConstant.ACCOUNT_SERVICE_CALL_ERROR.getDesc());
}
return response.getData();
}
}
@ApiOperation(value = "Place order")
@GetMapping("/create/{userId}")
@HystrixCommand(fallbackMethod = "getUserInfoFallback", commandProperties = {
@HystrixProperty(name = "execution.isolation.thread.timeoutInMilliseconds", value = "6000")
})
public CommonResponse<UserDTO> createOrder(@PathVariable(value = "userId") Long userId) {
logger.info("---Calling inventory service, checking stock---");
logger.info("---Calling user service, querying user info---");
UserDTO userInfo = accountService.getUserInfo(userId);
logger.info("---Got user info, validating. -> userInfo {}", userInfo);
logger.info("---Deducting payment, placing order");
return ResultBuilder.buildSuccessResult(userInfo);
}
// Note: the fallback method must have the same parameters and return type as the method it protects
public CommonResponse<UserDTO> getUserInfoFallback(Long userId) {
UserDTO userDTO = new UserDTO();
userDTO.setUserId(-1L);
userDTO.setNickname("Guest");
return ResultBuilder.buildSuccessResult(userDTO);
}
## Hystrix commands time out after 1 second by default; on timeout the fallback method runs automatically
hystrix.command.default.execution.isolation.thread.timeoutInMilliseconds=1000
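The idea behind a timeout with fallback can be sketched in plain Java. This is only a conceptual illustration, not how Hystrix is implemented internally: `TimeoutFallbackDemo`, `callWithTimeout` and `slowCall` are hypothetical names, and creating one executor per call is done purely to keep the sketch short.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

class TimeoutFallbackDemo {
    // Runs `primary` on a separate thread and waits at most `timeoutMillis`.
    // On timeout or failure the caller gets the fallback value instead.
    static <T> T callWithTimeout(Supplier<T> primary, Supplier<T> fallback, long timeoutMillis) {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        Callable<T> task = primary::get;
        Future<T> future = worker.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException | ExecutionException e) {
            return fallback.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback.get();
        } finally {
            // Stops accepting new work; a task that is already running is not interrupted.
            worker.shutdown();
        }
    }

    // Hypothetical stand-in for a remote call that takes `millis` to answer.
    static String slowCall(long millis, String value) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return value;
    }

    public static void main(String[] args) {
        String nickname = callWithTimeout(
                () -> slowCall(1500, "real user"), // slower than the timeout
                () -> "Guest",                     // default result, like getUserInfoFallback
                1000);
        System.out.println("nickname = " + nickname);
    }
}
```

The 1000 ms passed in `main` mirrors the default `timeoutInMilliseconds` shown above; the caller is released after the timeout even though the slow call has not returned yet.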
3. Testing
(1) Fallback when the service is unavailable
Kill the user service
---Calling inventory service, checking stock---
YWW-ACCOUNT using LB returned Server: yww.account.cn:9220 for request http:///user/find
The request for /v2/orders/create/100000551 takes 2007 ms.
2020-04-07 16:56:52,699 [hystrix-OrderV2Controller-3] DEBUG c.n.l.reactive.LoadBalancerCommand 314 - Got error java.net.ConnectException: Connection refused: connect when executed on server yww.account.cn:9220
(2) Fallback on timeout
Set the timeout to 2 seconds
@HystrixProperty(name = "execution.isolation.thread.timeoutInMilliseconds", value = "2000")
---Calling inventory service, checking stock---
The request for /v2/orders/create/100000551 takes 2297 ms.
---Calling user service, querying user info---CommonReponse [code=200, message=success, data=UserDTO{userId=100000551, avatar='/avatar/default_avatar.png', nickname='富豪100000551', showVoice='null', sex='null', accountId='test_100000551', token='f733008ba897d2409a1ec8afa997583c'}]
---Got user info, validating. -> userInfo UserDTO{userId=100000551, avatar='/avatar/default_avatar.png', nickname='富豪100000551', showVoice='null', sex='null', accountId='test_100000551', token='f733008ba897d2409a1ec8afa997583c'}
---Deducting payment, placing order
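(3) Circuit breaking on call errors
With a fallback (degradation), the logger.info calls still run and the fallbackMethod result is returned to the caller (even the logic after accountService.getUserInfo carries on asynchronously, as the log in (2) shows: the request returned after 2297 ms, yet the remaining steps were still logged afterwards). With circuit breaking, the createOrder method is not executed at all.
Either way, the caller receives the default result.
## Minimum number of requests in the rolling window before the breaker can trip; it then trips only if the error percentage also exceeds circuitBreaker.errorThresholdPercentage (50% by default). Default 20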
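The "fallback is returned but the method body keeps running" behaviour can be reproduced with nothing but the JDK. The sketch below is an assumption-laden illustration rather than Hystrix itself: `LateCompletionDemo` is a hypothetical name, the sleep stands in for the slow user-service call, and the worker is deliberately never interrupted so that it mirrors what the log in (2) shows.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

class LateCompletionDemo {
    // Returns { fallbackUsed, orderPlacedEventually }.
    static boolean[] run(long callMillis, long timeoutMillis) {
        AtomicBoolean orderPlaced = new AtomicBoolean(false);
        CompletableFuture<String> work = CompletableFuture.supplyAsync(() -> {
            try {
                Thread.sleep(callMillis);  // simulated slow user-service call
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            orderPlaced.set(true);         // the "deduct payment, place order" step
            return "real user";
        });

        boolean fallbackUsed = false;
        try {
            work.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            fallbackUsed = true;           // the caller would get the default result here
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }

        work.join(); // wait only so the demo can observe the late side effect
        return new boolean[] { fallbackUsed, orderPlaced.get() };
    }

    public static void main(String[] args) {
        boolean[] result = run(400, 100);
        System.out.println("fallback used: " + result[0]
                + ", order still placed later: " + result[1]);
    }
}
```

This is why a fallback on a method with side effects needs care: the user may see the default "Guest" response while the order logic still completes in the background.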
hystrix.command.default.circuitBreaker.requestVolumeThreshold=3
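## How long the breaker stays open before it tries to recover; the value is in milliseconds, default 5000 (5s)
hystrix.command.default.circuitBreaker.sleepWindowInMilliseconds=5000
Make the user service throw an error, then call the endpoint 3 times in a row within 5 seconds. Since all three calls fail, the breaker opens and subsequent calls return the fallbackMethod result directly. After 5 seconds the breaker enters the half-open state: one successful call closes it again, while another failed call keeps it open.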
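The closed / open / half-open cycle described above can be sketched as a tiny state machine. Note the simplification: this sketch counts consecutive failures, whereas Hystrix evaluates an error percentage over a rolling window. `SimpleCircuitBreaker` and its constructor parameters are hypothetical, and the clock is injected so the behaviour can be checked without real waiting.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;   // consecutive failures before opening (simplification)
    private final long sleepWindowMillis; // how long to stay open before a trial call
    private final LongSupplier clock;     // injected time source, in milliseconds
    private State state = State.CLOSED;
    private int consecutiveFailures = 0;
    private long openedAt = 0L;

    SimpleCircuitBreaker(int failureThreshold, long sleepWindowMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.sleepWindowMillis = sleepWindowMillis;
        this.clock = clock;
    }

    // Moves OPEN -> HALF_OPEN lazily once the sleep window has elapsed.
    synchronized State state() {
        if (state == State.OPEN && clock.getAsLong() - openedAt >= sleepWindowMillis) {
            state = State.HALF_OPEN;
        }
        return state;
    }

    // Holding the lock during the call keeps the sketch simple; a real breaker would not.
    synchronized <T> T call(Supplier<T> primary, Supplier<T> fallback) {
        if (state() == State.OPEN) {
            return fallback.get(); // short-circuit: primary is never invoked
        }
        try {
            T result = primary.get();
            consecutiveFailures = 0;
            state = State.CLOSED;  // a success (including the half-open trial) closes the breaker
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
                state = State.OPEN;
                openedAt = clock.getAsLong();
            }
            return fallback.get();
        }
    }

    public static void main(String[] args) {
        AtomicLong now = new AtomicLong(0);
        SimpleCircuitBreaker breaker = new SimpleCircuitBreaker(3, 5000, now::get);
        Supplier<String> failing = () -> { throw new IllegalStateException("user service error"); };
        for (int i = 0; i < 3; i++) {
            breaker.call(failing, () -> "Guest");
        }
        System.out.println("after 3 failures: " + breaker.state());
        now.set(5000); // pretend 5 seconds have passed
        System.out.println("after sleep window: " + breaker.state());
        System.out.println("trial call: " + breaker.call(() -> "real user", () -> "Guest"));
        System.out.println("after success: " + breaker.state());
    }
}
```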
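(4) Rate-limiting circuit breaking (thread isolation or semaphores)
With the configuration below, only 2 threads execute the createOrder method concurrently.
If maxQueueSize and queueSizeRejectionThreshold are not configured, then when 3 requests arrive at the same time, one of them is bound to go to the fallback method. This has nothing to do with whether the call times out.
If maxQueueSize and queueSizeRejectionThreshold are configured, the smaller of the two acts as the effective queue size. With the settings below, when 13 requests arrive at once, 2 run immediately, 10 wait in the queue and are consumed gradually, and the remaining one is bound to go to the fallback method (it is rejected outright and createOrder is never called).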
@HystrixCommand(fallbackMethod = "getUserInfoFallback",
groupKey = "orderGroup",
threadPoolKey = "orderThreadPool",
threadPoolProperties = {
@HystrixProperty(name = "coreSize", value = "2"),
@HystrixProperty(name = "maxQueueSize", value = "10"),
@HystrixProperty(name = "queueSizeRejectionThreshold", value = "11")
}
)
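The arithmetic behind "13 requests, 1 rejected" can be checked with a plain `ThreadPoolExecutor`. This is a sketch under stated assumptions and not the Hystrix thread pool itself: `BulkheadDemo` and `countRejected` are hypothetical names, a latch keeps the workers busy so that all submissions count as simultaneous, and a `SynchronousQueue` is used to model the case with no queue configured.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class BulkheadDemo {
    // Submits `requests` tasks at once to a pool of `coreSize` threads with a queue
    // of `queueSize` slots (0 means no queue) and returns how many were rejected.
    static int countRejected(int coreSize, int queueSize, int requests) {
        BlockingQueue<Runnable> queue;
        if (queueSize > 0) {
            queue = new ArrayBlockingQueue<Runnable>(queueSize);
        } else {
            queue = new SynchronousQueue<Runnable>(); // direct hand-off, nothing waits
        }
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                coreSize, coreSize, 0L, TimeUnit.MILLISECONDS, queue);
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        for (int i = 0; i < requests; i++) {
            try {
                pool.execute(() -> {
                    try {
                        release.await(); // keep the worker busy until all requests are submitted
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            } catch (RejectedExecutionException e) {
                rejected++; // this is the point where a fallback method would run
            }
        }
        release.countDown(); // let running and queued tasks drain
        pool.shutdown();
        return rejected;
    }

    public static void main(String[] args) {
        int effectiveQueue = Math.min(10, 11); // min(maxQueueSize, queueSizeRejectionThreshold)
        System.out.println("13 requests, queue configured: rejected = "
                + countRejected(2, effectiveQueue, 13));
        System.out.println("3 requests, no queue: rejected = "
                + countRejected(2, 0, 3));
    }
}
```

Two workers plus ten queue slots accept twelve requests, so the thirteenth has nowhere to go; without a queue, the third request is already one too many.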
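Summary:
To guard against the service avalanche effect (also called the cascading effect), Hystrix can provide fallback on timeout, fallback when a service is unavailable, circuit breaking on call errors, and rate-limiting circuit breaking. A fallback returns the default result while the method logic may still continue asynchronously; circuit breaking returns the default result directly without executing the method logic at all.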