Problem
When a Celery periodic task enqueues delayed tasks, the delayed tasks are executed more than once.
For example:
Periodic task:
'project_schedule_task': {
    'task': 'apps.project.tasks.project_status_monitor',
    'schedule': crontab(minute=0, hour=2),
}
When project_status_monitor runs, it in turn creates several delayed tasks:
send_company_project_daily_remind_msg.apply_async(
    (company.name,
     date_str,
     company.push_title,
     company.push_content,
     None if company.push_weather_enable is False else city_weather_map,
     project_list,
     exec_time),
    countdown=company.push_time*60)
If the configured countdown is too long, the delayed task is executed multiple times.
Cause and solution
- Cause
The Celery documentation has a section on this in its Redis broker notes: https://docs.celeryproject.org/en/latest/getting-started/brokers/redis.html
Visibility timeout
If a task isn’t acknowledged within the Visibility Timeout the task will be redelivered to another worker and executed.
This causes problems with ETA/countdown/retry tasks where the time to execute exceeds the visibility timeout; in fact if that happens it will be executed again, and again in a loop.
So you have to increase the visibility timeout to match the time of the longest ETA you’re planning to use.
Note that Celery will redeliver messages at worker shutdown, so having a long visibility timeout will only delay the redelivery of ‘lost’ tasks in the event of a power failure or forcefully terminated workers.
Periodic tasks won’t be affected by the visibility timeout, as this is a concept separate from ETA/countdown.
You can increase this timeout by configuring a transport option with the same name:
app.conf.broker_transport_options = {'visibility_timeout': 43200}
The value must be an int describing the number of seconds.
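The cause is that, with the Redis broker, ETA/countdown tasks are subject to the visibility timeout. A worker receives such a task right away but only acknowledges it when the task actually runs. If the task has not been acknowledged by the time the visibility timeout elapses, it is redelivered to another worker. So when a task's ETA is longer than visibility_timeout, every time visibility_timeout elapses Celery assumes the task was not completed and hands it to another worker, and the task ends up running multiple times.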
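As a rough illustration of the condition above, the sketch below flags countdowns that exceed the visibility timeout. The helper name risky_countdowns and the sample push_time values are hypothetical; the 3600-second default is the one Celery documents for the Redis transport.

```python
# Celery's documented default visibility timeout for the Redis transport, in seconds.
DEFAULT_VISIBILITY_TIMEOUT = 3600


def risky_countdowns(countdowns, visibility_timeout=DEFAULT_VISIBILITY_TIMEOUT):
    """Return the countdowns (in seconds) that are longer than the visibility timeout.

    Hypothetical helper: tasks scheduled with these countdowns are the ones
    that can be redelivered before they run.
    """
    return [c for c in countdowns if c > visibility_timeout]


# Hypothetical push_time values in minutes, converted the same way as
# countdown=company.push_time*60 in the task above.
push_times_minutes = [30, 90, 480]
countdowns = [m * 60 for m in push_times_minutes]
print(risky_countdowns(countdowns))  # [5400, 28800]
```

Any value this returns is a countdown for which the timeout needs to be raised.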
- Solution
Raise the visibility timeout so that it is longer than the longest countdown in use:
app.conf.broker_transport_options = {'visibility_timeout': 86400}
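Instead of hard-coding the value, the timeout could be derived from the longest configured delay. This is a minimal sketch under assumptions: visibility_timeout_for, the 600-second margin, and the sample push_time values are all hypothetical. Only the 'visibility_timeout' key comes from the Celery documentation.

```python
def visibility_timeout_for(push_times_minutes, margin_seconds=600):
    """Return a visibility timeout (int seconds) longer than the longest countdown.

    push_times_minutes: iterable of delays in minutes (hypothetical stand-in
    for company.push_time); margin_seconds is an arbitrary safety margin.
    """
    longest = max(push_times_minutes, default=0) * 60
    return int(longest + margin_seconds)


# Hypothetical push_time values in minutes.
broker_transport_options = {
    "visibility_timeout": visibility_timeout_for([30, 90, 480]),
}
print(broker_transport_options)  # {'visibility_timeout': 29400}

# In the Celery app this would be assigned as:
# app.conf.broker_transport_options = broker_transport_options
```

As the quoted documentation notes, a longer timeout also delays redelivery of tasks that are really lost, so the margin should stay modest.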
Commands used during troubleshooting
- Inspect the queue of scheduled (ETA) tasks
celery -A proj inspect scheduled
celery -A proj inspect scheduled |wc -l
- Terminate tasks
celery -A proj control terminate
- Stop workers
pkill -9 -f 'celery worker'
ps auxww | awk '/celery worker/ {print $2}' | xargs kill -9
- Restart workers
celery multi start 1 -A proj -l info -c4 --pidfile=/var/run/celery/%n.pid
celery multi restart 1 --pidfile=/var/run/celery/%n.pid
Reference: https://docs.celeryproject.org/en/stable/userguide/workers.html