Scheduling Concepts
Process Scheduling
Process scheduling selects a process from the ready queue according to some scheduling algorithm and hands it the CPU; its job is to arbitrate the use of the CPU among competing processes. The goal is to make maximum use of CPU time: whenever there is a runnable process, some process should be executing. And as long as there are more processes than processors, at any given moment some processes will inevitably not be running.
Process Switching
A process switch changes the current owner of the CPU: the context (the CPU state, chiefly the registers) is saved into the current process's TCB, and the context of the next process is restored.
Process States
While running, a process is generally in one of three states:
- Ready: the process holds all necessary resources except the CPU, and can execute as soon as it is given the CPU;
- Running: the process is currently executing on the CPU;
- Blocked: a running process that cannot proceed because it is waiting for some resource gives up the CPU and becomes blocked.
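The three states and the legal transitions between them can be captured in a small sketch (hypothetical names for illustration; this is not NuttX's `tstate_e`, which is shown later):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical three-state model, not the NuttX tstate_e */
enum pstate { PS_READY, PS_RUNNING, PS_BLOCKED };

/* Returns true if moving from 'from' to 'to' is a legal transition */
static bool transition_ok(enum pstate from, enum pstate to)
{
  switch (from)
    {
      case PS_READY:   return to == PS_RUNNING;   /* dispatched onto the CPU */
      case PS_RUNNING: return to == PS_READY ||   /* preempted */
                              to == PS_BLOCKED;   /* waits on a resource */
      case PS_BLOCKED: return to == PS_READY;     /* resource became available */
    }
  return false;
}
```

Note that a blocked process never goes directly back to running; it must first re-enter the ready queue and be picked by the scheduler again.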

Data Structures
The scheduling-related data structures are:
- Task descriptor: in NuttX a task is described by struct tcb_s
- Task state: the three process states above plus several more specific states, defined in NuttX by enum tstate_e
- Task queues: lists that hold the tasks in each state
Task Descriptor
FAR struct wdog_s; /* Forward reference */
struct tcb_s
{
/* Fields used to support list management *************************************/
/* Doubly linked list used to chain tasks in the same state into a task queue */
FAR struct tcb_s *flink; /* Doubly linked list */
FAR struct tcb_s *blink;
/* Task Group *****************************************************************/
#ifdef HAVE_TASK_GROUP
FAR struct task_group_s *group; /* Pointer to shared task group data */
#endif
/* Task Management Fields *****************************************************/
/* Task-management related fields */
pid_t pid; /* This is the ID of the thread */
start_t start; /* Thread start function */
entry_t entry; /* Entry Point into the thread */
uint8_t sched_priority; /* Current priority of the thread */
uint8_t init_priority; /* Initial priority of the thread */
#ifdef CONFIG_PRIORITY_INHERITANCE
#if CONFIG_SEM_NNESTPRIO > 0
uint8_t npend_reprio; /* Number of nested reprioritizations */
uint8_t pend_reprios[CONFIG_SEM_NNESTPRIO];
#endif
uint8_t base_priority; /* "Normal" priority of the thread */
#endif
/* The task state; corresponds to the task states described later */
uint8_t task_state; /* Current state of the thread */
#ifdef CONFIG_SMP
uint8_t cpu; /* CPU index if running or assigned */
cpu_set_t affinity; /* Bit set of permitted CPUs */
#endif
uint16_t flags; /* Misc. general status flags */
int16_t lockcount; /* 0=preemptable (not-locked) */
#ifdef CONFIG_SMP
int16_t irqcount; /* 0=interrupts enabled */
#endif
#ifdef CONFIG_CANCELLATION_POINTS
int16_t cpcount; /* Nested cancellation point count */
#endif
#if CONFIG_RR_INTERVAL > 0 || defined(CONFIG_SCHED_SPORADIC)
int32_t timeslice; /* RR timeslice OR Sporadic budget */
/* interval remaining */
#endif
#ifdef CONFIG_SCHED_SPORADIC
FAR struct sporadic_s *sporadic; /* Sporadic scheduling parameters */
#endif
FAR struct wdog_s *waitdog; /* All timed waits use this timer */
/* Each task has its own stack region */
/* Stack-Related Fields *******************************************************/
size_t adj_stack_size; /* Stack size after adjustment */
/* for hardware, processor, etc. */
/* (for debug purposes only) */
FAR void *stack_alloc_ptr; /* Pointer to allocated stack */
/* Need to deallocate stack */
FAR void *adj_stack_ptr; /* Adjusted stack_alloc_ptr for HW */
/* The initial stack pointer value */
/* External Module Support ****************************************************/
#ifdef CONFIG_PIC
FAR struct dspace_s *dspace; /* Allocated area for .bss and .data */
#endif
/* POSIX Semaphore Control Fields *********************************************/
sem_t *waitsem; /* Semaphore ID waiting on */
/* POSIX Signal Control Fields ************************************************/
#ifndef CONFIG_DISABLE_SIGNALS
sigset_t sigprocmask; /* Signals that are blocked */
sigset_t sigwaitmask; /* Waiting for pending signals */
sq_queue_t sigpendactionq; /* List of pending signal actions */
sq_queue_t sigpostedq; /* List of posted signals */
siginfo_t sigunbinfo; /* Signal info when task unblocked */
#endif
/* POSIX Named Message Queue Fields *******************************************/
#ifndef CONFIG_DISABLE_MQUEUE
FAR struct mqueue_inode_s *msgwaitq; /* Waiting for this message queue */
#endif
/* Library related fields *****************************************************/
int pterrno; /* Current per-thread errno */
/* State save areas ***********************************************************/
/* The form and content of these fields are platform-specific. */
struct xcptcontext xcp; /* Interrupt register save area */
#if CONFIG_TASK_NAME_SIZE > 0
char name[CONFIG_TASK_NAME_SIZE+1]; /* Task name (with NUL terminator) */
#endif
};
Within struct tcb_s one structure deserves a closer look: struct task_group_s. It describes a task group, including child exit status, environment variables, file descriptors, sockets, and so on; this maps onto the kinds of information held in a Linux task descriptor.
/* struct task_group_s ***********************************************************/
/* All threads created by pthread_create belong in the same task group (along with
* the thread of the original task). struct task_group_s is a shared structure
* referenced by the TCB of each thread that is a member of the task group.
*
* This structure should contain *all* resources shared by tasks and threads that
* belong to the same task group:
*
* Child exit status
* Environment variables
* PIC data space and address environments
* File descriptors
* FILE streams
* Sockets
* Address environments.
*
* Each instance of struct task_group_s is reference counted. Each instance is
 * created with a reference count of one. The reference count is incremented when each
* thread joins the group and decremented when each thread exits, leaving the
* group. When the reference count decrements to zero, the struct task_group_s
* is free.
*/
#ifdef HAVE_TASK_GROUP
#ifndef CONFIG_DISABLE_PTHREAD
struct join_s; /* Forward reference */
/* Defined in sched/pthread/pthread.h */
#endif
struct task_group_s
{
#if defined(HAVE_GROUP_MEMBERS) || defined(CONFIG_ARCH_ADDRENV)
struct task_group_s *flink; /* Supports a singly linked list */
gid_t tg_gid; /* The ID of this task group */
#endif
#ifdef HAVE_GROUP_MEMBERS
gid_t tg_pgid; /* The ID of the parent task group */
#endif
#if !defined(CONFIG_DISABLE_PTHREAD) && defined(CONFIG_SCHED_HAVE_PARENT)
pid_t tg_task; /* The ID of the task within the group */
#endif
uint8_t tg_flags; /* See GROUP_FLAG_* definitions */
/* Group membership ***********************************************************/
uint8_t tg_nmembers; /* Number of members in the group */
#ifdef HAVE_GROUP_MEMBERS
uint8_t tg_mxmembers; /* Number of members in allocation */
FAR pid_t *tg_members; /* Members of the group */
#endif
#if defined(CONFIG_SCHED_ATEXIT) && !defined(CONFIG_SCHED_ONEXIT)
/* atexit support ************************************************************/
# if defined(CONFIG_SCHED_ATEXIT_MAX) && CONFIG_SCHED_ATEXIT_MAX > 1
atexitfunc_t tg_atexitfunc[CONFIG_SCHED_ATEXIT_MAX];
# else
atexitfunc_t tg_atexitfunc; /* Called when exit is called. */
# endif
#endif
#ifdef CONFIG_SCHED_ONEXIT
/* on_exit support ***********************************************************/
# if defined(CONFIG_SCHED_ONEXIT_MAX) && CONFIG_SCHED_ONEXIT_MAX > 1
onexitfunc_t tg_onexitfunc[CONFIG_SCHED_ONEXIT_MAX];
FAR void *tg_onexitarg[CONFIG_SCHED_ONEXIT_MAX];
# else
onexitfunc_t tg_onexitfunc; /* Called when exit is called. */
FAR void *tg_onexitarg; /* The argument passed to the function */
# endif
#endif
#ifdef CONFIG_SCHED_HAVE_PARENT
/* Child exit status **********************************************************/
#ifdef CONFIG_SCHED_CHILD_STATUS
FAR struct child_status_s *tg_children; /* Head of a list of child status */
#endif
#ifndef HAVE_GROUP_MEMBERS
/* REVISIT: What if parent thread exits? Should use tg_pgid. */
pid_t tg_ppid; /* This is the ID of the parent thread */
#ifndef CONFIG_SCHED_CHILD_STATUS
uint16_t tg_nchildren; /* This is the number active children */
#endif
#endif /* HAVE_GROUP_MEMBERS */
#endif /* CONFIG_SCHED_HAVE_PARENT */
#if defined(CONFIG_SCHED_WAITPID) && !defined(CONFIG_SCHED_HAVE_PARENT)
/* waitpid support ************************************************************/
/* Simple mechanism used only when there is no support for SIGCHLD */
uint8_t tg_nwaiters; /* Number of waiters */
sem_t tg_exitsem; /* Support for waitpid */
int *tg_statloc; /* Location to return exit status */
#endif
#ifndef CONFIG_DISABLE_PTHREAD
/* Pthreads *******************************************************************/
/* Pthread join Info: */
sem_t tg_joinsem; /* Mutually exclusive access to join data */
FAR struct join_s *tg_joinhead; /* Head of a list of join data */
FAR struct join_s *tg_jointail; /* Tail of a list of join data */
uint8_t tg_nkeys; /* Number pthread keys allocated */
#endif
#ifndef CONFIG_DISABLE_SIGNALS
/* POSIX Signal Control Fields ************************************************/
sq_queue_t tg_sigactionq; /* List of actions for signals */
sq_queue_t tg_sigpendingq; /* List of pending signals */
#endif
#ifndef CONFIG_DISABLE_ENVIRON
/* Environment variables ******************************************************/
size_t tg_envsize; /* Size of environment string allocation */
FAR char *tg_envp; /* Allocated environment strings */
#endif
/* PIC data space and address environments ************************************/
/* Logically the PIC data space belongs here (see struct dspace_s). The
 * current logic needs review: There are differences in the way that the
* life of the PIC data is managed.
*/
#if CONFIG_NFILE_DESCRIPTORS > 0
/* File descriptors ***********************************************************/
struct filelist tg_filelist; /* Maps file descriptor to file */
#endif
#if CONFIG_NFILE_STREAMS > 0
/* FILE streams ***************************************************************/
/* In a flat, single-heap build, the stream list is allocated with this
 * structure. But in kernel mode with a kernel allocator, it must be separately
* allocated using a user-space allocator.
*/
#if (defined(CONFIG_BUILD_PROTECTED) || defined(CONFIG_BUILD_KERNEL)) && \
defined(CONFIG_MM_KERNEL_HEAP)
FAR struct streamlist *tg_streamlist;
#else
struct streamlist tg_streamlist; /* Holds C buffered I/O info */
#endif
#endif
#if CONFIG_NSOCKET_DESCRIPTORS > 0
/* Sockets ********************************************************************/
struct socketlist tg_socketlist; /* Maps socket descriptor to socket */
#endif
#ifndef CONFIG_DISABLE_MQUEUE
/* POSIX Named Message Queue Fields *******************************************/
sq_queue_t tg_msgdesq; /* List of opened message queues */
#endif
#ifdef CONFIG_ARCH_ADDRENV
/* Address Environment ********************************************************/
group_addrenv_t tg_addrenv; /* Task group address environment */
#endif
#ifdef CONFIG_MM_SHM
/* Shared Memory **************************************************************/
struct group_shm_s tg_shm; /* Task shared memory logic */
#endif
};
#endif
Two further structures extend struct tcb_s, describing tasks and pthreads respectively:
/* struct task_tcb_s *************************************************************/
/* This is the particular form of the task control block (TCB) structure used by
* tasks (and kernel threads). There are two TCB forms: one for pthreads and
* one for tasks. Both share the common TCB fields (which must appear at the
* top of the structure) plus additional fields unique to tasks and threads.
* Having separate structures for tasks and pthreads adds some complexity, but
* saves memory in that it prevents pthreads from being burdened with the
* overhead required for tasks (and vice versa).
*/
struct task_tcb_s
{
/* Common TCB fields **********************************************************/
struct tcb_s cmn; /* Common TCB fields */
/* Task Management Fields *****************************************************/
#ifdef CONFIG_SCHED_STARTHOOK
starthook_t starthook; /* Task startup function */
FAR void *starthookarg; /* The argument passed to the function */
#endif
/* [Re-]start name + start-up parameters **************************************/
FAR char **argv; /* Name+start-up parameters */
};
/* struct pthread_tcb_s **********************************************************/
/* This is the particular form of the task control block (TCB) structure used by
* pthreads. There are two TCB forms: one for pthreads and one for tasks. Both
* share the common TCB fields (which must appear at the top of the structure)
* plus additional fields unique to tasks and threads. Having separate structures
* for tasks and pthreads adds some complexity, but saves memory in that it
* prevents pthreads from being burdened with the overhead required for tasks
* (and vice versa).
*/
#ifndef CONFIG_DISABLE_PTHREAD
struct pthread_tcb_s
{
/* Common TCB fields **********************************************************/
struct tcb_s cmn; /* Common TCB fields */
/* Task Management Fields *****************************************************/
pthread_addr_t arg; /* Startup argument */
FAR void *joininfo; /* Detach-able info to support join */
/* Clean-up stack *************************************************************/
#ifdef CONFIG_PTHREAD_CLEANUP
/* tos - The index to the next available entry at the top of the stack.
* stack - The pre-allocated clean-up stack memory.
*/
uint8_t tos;
struct pthread_cleanup_s stack[CONFIG_PTHREAD_CLEANUP_STACKSIZE];
#endif
/* POSIX Thread Specific Data *************************************************/
#if CONFIG_NPTHREAD_KEYS > 0
FAR void *pthread_data[CONFIG_NPTHREAD_KEYS];
#endif
};
#endif /* !CONFIG_DISABLE_PTHREAD */
Task States
/* General Task Management Types ************************************************/
/* This is the type of the task_state field of the TCB. NOTE: the order and
* content of this enumeration is critical since there are some OS tables indexed
* by these values. The range of values is assumed to fit into a uint8_t in
* struct tcb_s.
*/
enum tstate_e
{
TSTATE_TASK_INVALID = 0, /* INVALID - The TCB is uninitialized */
TSTATE_TASK_PENDING, /* READY_TO_RUN - Pending preemption unlock */
TSTATE_TASK_READYTORUN, /* READY-TO-RUN - But not running */
#ifdef CONFIG_SMP
TSTATE_TASK_ASSIGNED, /* READY-TO-RUN - Not running, but assigned to a CPU */
#endif
TSTATE_TASK_RUNNING, /* READY_TO_RUN - And running */
TSTATE_TASK_INACTIVE, /* BLOCKED - Initialized but not yet activated */
TSTATE_WAIT_SEM, /* BLOCKED - Waiting for a semaphore */
#ifndef CONFIG_DISABLE_SIGNALS
TSTATE_WAIT_SIG, /* BLOCKED - Waiting for a signal */
#endif
#ifndef CONFIG_DISABLE_MQUEUE
TSTATE_WAIT_MQNOTEMPTY, /* BLOCKED - Waiting for a MQ to become not empty. */
TSTATE_WAIT_MQNOTFULL, /* BLOCKED - Waiting for a MQ to become not full. */
#endif
#ifdef CONFIG_PAGING
TSTATE_WAIT_PAGEFILL, /* BLOCKED - Waiting for page fill */
#endif
NUM_TASK_STATES /* Must be last */
};
typedef enum tstate_e tstate_t;
Task Queues
/* Task Lists ***************************************************************/
/* The state of a task is indicated both by the task_state field of the TCB
* and by a series of task lists. All of these tasks lists are declared
* below. Although it is not always necessary, most of these lists are
* prioritized so that common list handling logic can be used (only the
* g_readytorun, the g_pendingtasks, and the g_waitingforsemaphore lists
* need to be prioritized).
*/
/* This is the list of all tasks that are ready to run. This is a
* prioritized list with head of the list holding the highest priority
 * (unassigned) task. In the non-SMP case, the head of this list is the
* currently active task and the tail of this list, the lowest priority
* task, is always the IDLE task.
*/
volatile dq_queue_t g_readytorun;
#ifdef CONFIG_SMP
/* In order to support SMP, the function of the g_readytorun list changes,
 * The g_readytorun is still used but in the SMP case it will contain only:
*
* - Only tasks/threads that are eligible to run, but not currently running,
* and
* - Tasks/threads that have not been assigned to a CPU.
*
 * Otherwise, the TCB will be retained in an assigned task list,
 * g_assignedtasks. As its name suggests, the g_assignedtasks queue for CPU
* 'n' would contain only tasks/threads that are assigned to CPU 'n'. Tasks/
* threads would be assigned a particular CPU by one of two mechanisms:
*
* - (Semi-)permanently through an RTOS interfaces such as
* pthread_attr_setaffinity(), or
* - Temporarily through scheduling logic when a previously unassigned task
* is made to run.
*
* Tasks/threads that are assigned to a CPU via an interface like
* pthread_attr_setaffinity() would never go into the g_readytorun list, but
* would only go into the g_assignedtasks[n] list for the CPU 'n' to which
* the thread has been assigned. Hence, the g_readytorun list would hold
* only unassigned tasks/threads.
*
 * Like the g_readytorun list in the non-SMP case, each g_assignedtask[] list
* is prioritized: The head of the list is the currently active task on this
* CPU. Tasks after the active task are ready-to-run and assigned to this
* CPU. The tail of this assigned task list, the lowest priority task, is
* always the CPU's IDLE task.
*/
volatile dq_queue_t g_assignedtasks[CONFIG_SMP_NCPUS];
#endif
/* This is the list of all tasks that are ready-to-run, but cannot be placed
* in the g_readytorun list because: (1) They are higher priority than the
* currently active task at the head of the g_readytorun list, and (2) the
* currently active task has disabled pre-emption.
*/
volatile dq_queue_t g_pendingtasks;
/* This is the list of all tasks that are blocked waiting for a semaphore */
volatile dq_queue_t g_waitingforsemaphore;
/* This is the list of all tasks that are blocked waiting for a signal */
#ifndef CONFIG_DISABLE_SIGNALS
volatile dq_queue_t g_waitingforsignal;
#endif
/* This is the list of all tasks that are blocked waiting for a message
* queue to become non-empty.
*/
#ifndef CONFIG_DISABLE_MQUEUE
volatile dq_queue_t g_waitingformqnotempty;
#endif
/* This is the list of all tasks that are blocked waiting for a message
* queue to become non-full.
*/
#ifndef CONFIG_DISABLE_MQUEUE
volatile dq_queue_t g_waitingformqnotfull;
#endif
/* This is the list of all tasks that are blocking waiting for a page fill */
#ifdef CONFIG_PAGING
volatile dq_queue_t g_waitingforfill;
#endif
/* This the list of all tasks that have been initialized, but not yet
* activated. NOTE: This is the only list that is not prioritized.
*/
volatile dq_queue_t g_inactivetasks;
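Most of these lists are priority-ordered so that the head always holds the highest-priority task. A simplified sketch of how a TCB might be inserted in priority order (loosely modeled on NuttX's prioritized list insertion, with a reduced TCB and queue; names here are illustrative, not the real NuttX API):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced TCB with just the fields needed for list handling */
struct mini_tcb
{
  struct mini_tcb *flink;          /* Next (lower or equal priority) */
  struct mini_tcb *blink;          /* Previous (higher priority) */
  uint8_t sched_priority;
};

struct mini_queue
{
  struct mini_tcb *head;
  struct mini_tcb *tail;
};

/* Insert tcb before the first entry with a strictly lower priority,
 * so the head always holds the highest-priority task. Returns true
 * if tcb became the new head (a context switch may then be needed).
 */
static bool add_prioritized(struct mini_tcb *tcb, struct mini_queue *q)
{
  struct mini_tcb *next = q->head;

  /* Skip over all entries of equal or higher priority (equal-priority
   * tasks keep FIFO order among themselves)
   */
  while (next != NULL && next->sched_priority >= tcb->sched_priority)
    {
      next = next->flink;
    }

  tcb->flink = next;
  if (next != NULL)                /* Insert before 'next' */
    {
      tcb->blink = next->blink;
      if (next->blink != NULL) next->blink->flink = tcb;
      else q->head = tcb;
      next->blink = tcb;
    }
  else                             /* Insert at the tail */
    {
      tcb->blink = q->tail;
      if (q->tail != NULL) q->tail->flink = tcb;
      else q->head = tcb;
      q->tail = tcb;
    }

  return q->head == tcb;
}
```

The boolean return value mirrors how sched_addreadytorun()/sched_removereadytorun() report whether the head of the list changed, which is exactly what the scheduler uses to decide whether a context switch is required.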
Scheduling Policies
A scheduling policy (also called a scheduling algorithm) is the resource-allocation algorithm dictated by the system's resource-allocation strategy. In the code, it amounts to moving tasks between the different task queues. NuttX supports the following policies:
- FIFO: first-come, first-served among tasks of equal priority. FIFO can impose large latencies on tasks queued behind long-running ones.
- Round Robin: time-slicing among tasks of equal priority. For example, if each task is given a 200 ms time slice, then within one priority level the current task yields the CPU after running for 200 ms and the next task in the queue is switched in.
- Sporadic: sporadic scheduling, introduced mainly to bound the impact of periodic and aperiodic events on real-time behavior. Compared with RR, it can limit how long a thread may execute within a configured period. When a system handles both periodic and aperiodic events and Rate Monotonic Analysis is applied to it, sporadic scheduling is required.
/* POSIX-like scheduling policies */
#define SCHED_FIFO 1 /* FIFO priority scheduling policy */
#define SCHED_RR 2 /* Round robin scheduling policy */
#define SCHED_SPORADIC 3 /* Sporadic scheduling policy */
#define SCHED_OTHER 4 /* Not supported */
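As a rough sketch of what round-robin bookkeeping on each tick might look like (a hypothetical helper for illustration, not the actual sched_process_timer() logic; RR_INTERVAL_TICKS is an assumed constant):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RR_INTERVAL_TICKS 20   /* Hypothetical: time slice length in ticks */

/* Decrement the running task's remaining slice by one tick. When it
 * reaches zero, reload it and report that the task should rotate to
 * the back of its priority group (i.e. a switch may be needed).
 */
static bool rr_process_tick(int32_t *timeslice)
{
  if (--(*timeslice) > 0)
    {
      return false;                  /* Slice not yet exhausted */
    }

  *timeslice = RR_INTERVAL_TICKS;    /* Reload for the next round */
  return true;                       /* Yield to the next same-priority task */
}
```

In NuttX the remaining slice lives in the TCB's timeslice field (shown earlier in struct tcb_s), and a required rotation ultimately leads to up_reprioritize_rtr() being called from the timer path.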
NuttX supports priority-based preemption in order to provide real-time behavior. The interface below sets the scheduling policy and priority:
/****************************************************************************
 * Name: sched_setscheduler
*
* Description:
* sched_setscheduler() sets both the scheduling policy and the priority
* for the task identified by pid. If pid equals zero, the scheduler of
* the calling task will be set. The parameter 'param' holds the priority
* of the thread under the new policy.
*
* Inputs:
* pid - the task ID of the task to modify. If pid is zero, the calling
* task is modified.
* policy - Scheduling policy requested (either SCHED_FIFO or SCHED_RR)
* param - A structure whose member sched_priority is the new priority.
* The range of valid priority numbers is from SCHED_PRIORITY_MIN
* through SCHED_PRIORITY_MAX.
*
* Return Value:
* On success, sched_setscheduler() returns OK (zero). On error, ERROR
* (-1) is returned, and errno is set appropriately:
*
* EINVAL The scheduling policy is not one of the recognized policies.
* ESRCH The task whose ID is pid could not be found.
*
* Assumptions:
*
****************************************************************************/
int sched_setscheduler(pid_t pid, int policy, FAR const struct sched_param *param)
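A minimal usage sketch of this API. It uses only standard POSIX calls; the portable sched_get_priority_max()/sched_get_priority_min() queries stand in for NuttX's SCHED_PRIORITY_MIN/MAX constants, and the helper name is made up for illustration:

```c
#include <sched.h>

/* Try to put the calling task (pid == 0) under SCHED_FIFO at a
 * mid-range priority. Returns 0 on success, -1 on failure with
 * errno set (e.g. EPERM on systems that restrict real-time
 * policies to privileged callers).
 */
static int set_fifo_midrange(void)
{
  struct sched_param param;
  int maxprio = sched_get_priority_max(SCHED_FIFO);
  int minprio = sched_get_priority_min(SCHED_FIFO);

  if (maxprio < 0 || minprio < 0)
    {
      return -1;
    }

  param.sched_priority = (minprio + maxprio) / 2;
  return sched_setscheduler(0, SCHED_FIFO, &param);
}
```

On a desktop OS this call typically needs elevated privileges; on NuttX, which targets a single trust domain, it is generally available to any task.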
Scheduling Points
A process switch cannot happen at arbitrary moments; it can only take place at certain points. The term "scheduler" invites a misconception: that a scheduler is running somewhere, something like a kernel thread, and that it performs the task switching. In reality, the scheduler is just an interface function: when a task has to give up the CPU under certain conditions, it calls into the schedule interface, which then completes the process switch.
task schedule
Common scheduling points are:
- Round-robin time-slice scheduling: on each system tick, the timer interrupt calls sched_process_timer() to process the tick periodically. When this results in a priority change, up_reprioritize_rtr() is called, which triggers a context switch.
- Preemptive scheduling: waiting on semaphores, signals, or message queues, environment-variable operations, scheduler configuration, task creation and resumption, yield, and so on.
The scheduling points become clear from the code. Taking arm926 as an example, in the directory arch/arm/src/arm there are two functions, up_saveusercontext() and up_fullcontextrestore(), which save and restore the context respectively. Every task switch ultimately ends up calling these two functions.
Under arch/arm/src/arm, four functions call into these context-switch primitives: up_block_task(), up_unblock_task(), up_reprioritize_rtr(), and up_release_pending(). Since up_release_pending() is called from sched_unlock(), we arrive at the four upper-level functions behind every context switch:
- up_block_task()
- up_unblock_task()
- up_reprioritize_rtr()
- sched_unlock()
Whenever one of these four functions is called, a task switch may follow; each such call site is therefore a scheduling point.
凡是調(diào)用到上述四個(gè)函數(shù)的其中的一個(gè),都可能帶來任務(wù)的切換,也就是對(duì)應(yīng)的調(diào)度點(diǎn)。
- Interfaces that call up_block_task():
  - mq_receive() // message receive
  - mq_timedsend()
  - mq_send() // message send
  - mq_timedreceive()
  - sem_wait() // semaphore lock
  - sigsuspend() // signal suspend
  - sigtimedwait() // signal wait
- Interfaces that call up_unblock_task():
  - mq_receive() // message receive
  - mq_timedsend()
  - mq_send() // message send
  - mq_timedreceive()
  - sem_post() // semaphore unlock
  - sig_tcbdispatch() // signal dispatch
  - sem_waitirq()
  - mq_waitirq()
  - sig_timeout()
  - task_activate() // activates a task; called from task_create()/task_vforkstart()
- Interfaces that call sched_unlock():
  - mq_receive() // message receive
  - mq_timedsend()
  - mq_send() // message send
  - mq_timedreceive()
  - mq_notify()
  - sem_reset()
  - sig_deliver()
  - sig_queueaction()
  - sig_findaction()
  - kill()
  - sig_mqnotempty()
  - sigprocmask()
  - sigqueue()
  - sigsuspend()
  - sigtimedwait()
  - sig_unmaskpendingsignal()
  - env_dup() // environment variables
  - getenv()
  - setenv()
  - unsetenv()
  - group_assigngid() // group related
  - sched_getaffinity()
  - sched_getparam()
  - setaffinity()
  - sched_setparam()
  - sched_setscheduler()
  - waitid()
  - atexit()
  - task_signalparent()
  - on_exit()
  - posix_spawn()
  - task_assignpid()
  - thread_schedsetup()
  - task_restart()
  - task_spawn()
  - task_terminate()
  - lpwork_boostpriority()
  - lpwork_restorepriority()
  - work_lpstart()
- Interfaces that call up_reprioritize_rtr():
  - sched_roundrobin_process() // called during RR scheduling
  - sched_running_setpriority() // priority change while running
  - sched_readytorun_setpriority() // priority change while ready-to-run
Context Switching
Taking arm926 as an example, the low-level context-switch code lives under arch/arm/src/arm/, in the two functions up_saveusercontext() and up_fullcontextrestore().
up_saveusercontext() saves all the registers into the TCB's xcptcontext area (tcb->xcp), i.e. it preserves the full execution state.
.text
.globl up_saveusercontext
.type up_saveusercontext, function
up_saveusercontext:
/* On entry, a1 (r0) holds address of struct xcptcontext.
* Offset to the user region.
*/
/* Make sure that the return value will be non-zero (the
* value of the other volatile registers don't matter --
 * r1-r3, ip). This function is called through the
 * normal C calling conventions and the values of these
* registers cannot be assumed at the point of setjmp
* return.
*/
mov ip, #1
str ip, [r0, #(4*REG_R0)]
/* Save the volatile registers (plus r12 which really
* doesn't need to be saved)
*/
add r1, r0, #(4*REG_R4)
stmia r1, {r4-r14}
/* Save the current cpsr */
mrs r2, cpsr /* R2 = CPSR value */
add r1, r0, #(4*REG_CPSR)
str r2, [r1]
/* Finally save the return address as the PC so that we
* return to the exit from this function.
*/
add r1, r0, #(4*REG_PC)
str lr, [r1]
/* Return 0 */
mov r0, #0 /* Return value == 0 */
mov pc, lr /* Return */
.size up_saveusercontext, . - up_saveusercontext
up_fullcontextrestore() restores the register values saved in the TCB's xcptcontext area back into the CPU registers, restoring the execution state.
.globl up_fullcontextrestore
.type up_fullcontextrestore, function
up_fullcontextrestore:
/* On entry, a1 (r0) holds address of the register save area */
/* Recover all registers except for r0, r1, R15, and CPSR */
add r1, r0, #(4*REG_R2) /* Offset to REG_R2 storage */
ldmia r1, {r2-r14} /* Recover registers */
/* Create a stack frame to hold the PC */
sub sp, sp, #4 /* Frame for one register */
ldr r1, [r0, #(4*REG_PC)] /* Fetch the stored pc value */
str r1, [sp] /* Save it in the stack */
/* Now we can restore the CPSR. We wait until we are completely
 * finished with the context save data to do this. Restoring the CPSR
 * may re-enable interrupts and we could be in a context
* where the save structure is only protected by interrupts being
* disabled.
*/
ldr r1, [r0, #(4*REG_CPSR)] /* Fetch the stored CPSR value */
msr cpsr, r1 /* Set the CPSR */
/* Now recover r0 and r1
 * Then return to the address at the top of the stack,
* destroying the stack frame
*/
ldr r1, [r0, #(4*REG_R1)] /* Restore r1 register first */
ldr r0, [r0, #(4*REG_R0)]
ldmia sp!, {r15} /* Return pc value */
.size up_fullcontextrestore, . - up_fullcontextrestore
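Note the convention in the code above: up_saveusercontext() returns 0 on the save path, and the same call site appears to "return" a second time with a non-zero value when the saved context is later restored. This is exactly the setjmp()/longjmp() convention, which the following portable sketch illustrates (standard C only; this is an analogy, not the NuttX implementation):

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf ctx;
static int phase = 0;

static void demo(void)
{
  if (setjmp(ctx) == 0)
    {
      /* "Save" path: context captured, setjmp returned 0 */
      phase = 1;
      longjmp(ctx, 1);   /* "Restore": resume at the setjmp call site, returning 1 */
    }
  else
    {
      /* "Restore" path: execution resumed here via the saved context */
      phase = 2;
    }
}
```

This is why up_saveusercontext() writes a non-zero value into the saved REG_R0 slot before storing the registers: when the context is restored, the restored r0 becomes the apparent return value, letting the caller distinguish the two paths.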
As mentioned above, four upper-level functions invoke the context switch: up_block_task(), up_unblock_task(), up_reprioritize_rtr(), and up_release_pending(). All four are implemented along the same lines; take up_reprioritize_rtr() as an example:
/****************************************************************************
* Name: up_reprioritize_rtr
*
* Description:
* Called when the priority of a running or
* ready-to-run task changes and the reprioritization will
* cause a context switch. Two cases:
*
* 1) The priority of the currently running task drops and the next
 * task in the ready to run list has higher priority.
 * 2) An idle, ready to run task's priority has been raised above
 * the priority of the current, running task and it now has the
* priority.
*
* Inputs:
* tcb: The TCB of the task that has been reprioritized
* priority: The new task priority
*
****************************************************************************/
void up_reprioritize_rtr(struct tcb_s *tcb, uint8_t priority)
{
/* Verify that the caller is sane */
if (tcb->task_state < FIRST_READY_TO_RUN_STATE ||
tcb->task_state > LAST_READY_TO_RUN_STATE
#if SCHED_PRIORITY_MIN > 0
|| priority < SCHED_PRIORITY_MIN
#endif
#if SCHED_PRIORITY_MAX < UINT8_MAX
|| priority > SCHED_PRIORITY_MAX
#endif
)
{
PANIC();
}
else
{
struct tcb_s *rtcb = this_task(); /* Get the head of the g_readytorun queue */
bool switch_needed;
sinfo("TCB=%p PRI=%d\n", tcb, priority);
/* Remove the tcb task from the ready-to-run list.
* sched_removereadytorun will return true if we just
* remove the head of the ready to run list.
*/
switch_needed = sched_removereadytorun(tcb); /* Remove tcb from the g_readytorun queue */
/* Set up the new task priority */
tcb->sched_priority = (uint8_t)priority; /* Set the new priority */
/* Return the task to the specified blocked task list.
* sched_addreadytorun will return true if the task was
* added to the new list. We will need to perform a context
* switch only if the EXCLUSIVE or of the two calls is non-zero
* (i.e., one and only one the calls changes the head of the
* ready-to-run list).
*/
switch_needed ^= sched_addreadytorun(tcb); /* Add it back into the g_readytorun queue */
/* Now, perform the context switch if one is needed */
if (switch_needed)
{
/* If we are going to do a context switch, then now is the right
 * time to add any pending tasks back into the ready-to-run
 * task list.
 */
if (g_pendingtasks.head)
{
sched_mergepending(); /* Before switching, merge the pending queue into the ready-to-run queue */
}
}
/* Update scheduler parameters */
sched_suspend_scheduler(rtcb);
/* Are we in an interrupt handler? */
if (CURRENT_REGS) /* CURRENT_REGS is non-NULL while in an interrupt handler; it is reset to NULL once interrupt handling completes */
{
/* Yes, then we have to do things differently.
* Just copy the CURRENT_REGS into the OLD rtcb.
*/
up_savestate(rtcb->xcp.regs); /* Save the registers into the old rtcb */
/* Restore the exception context of the rtcb at the (new) head
* of the ready-to-run task list.
*/
rtcb = this_task(); /* Find the new tcb we will switch to */
/* Update scheduler parameters */
sched_resume_scheduler(rtcb);
/* Then switch contexts. Any necessary address environment
* changes will be made when the interrupt returns.
*/
up_restorestate(rtcb->xcp.regs); /* Restore the new tcb's register values into the CPU */
}
/* Copy the exception context into the TCB at the (old) head of the
 * ready-to-run task list. If up_saveusercontext returns a non-zero
* value, then this is really the previously running task restarting!
*/
else if (!up_saveusercontext(rtcb->xcp.regs)) /* Not in an interrupt; perform the context switch here */
{
/* Restore the exception context of the rtcb at the (new) head
* of the ready-to-run task list.
*/
rtcb = this_task();
#ifdef CONFIG_ARCH_ADDRENV
/* Make sure that the address environment for the previously
* running task is closed down gracefully (data caches dump,
* MMU flushed) and set up the address environment for the new
* thread at the head of the ready-to-run list.
*/
(void)group_addrenv(rtcb);
#endif
/* Update scheduler parameters */
sched_resume_scheduler(rtcb);
/* Then switch contexts */
up_fullcontextrestore(rtcb->xcp.regs); /* After this restore, execution jumps to the new task */
}
}
}
}
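The `switch_needed ^= sched_addreadytorun(tcb)` line implements the "EXCLUSIVE OR of the two calls" mentioned in the comment: a context switch is needed only if exactly one of the remove/add operations changed the head of the ready-to-run list. A small check of that logic:

```c
#include <assert.h>
#include <stdbool.h>

/* A context switch is needed iff exactly one of the two operations
 * changed the head of the ready-to-run list. If both changed it,
 * the same (reprioritized) task ended up back at the head, so no
 * switch is required; if neither did, only the list interior moved.
 */
static bool need_switch(bool removed_head, bool added_at_head)
{
  bool switch_needed = removed_head;
  switch_needed ^= added_at_head;
  return switch_needed;
}
```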
Finally, here is a picture of the NuttX scheduler-related source files; once you have traced through the flow above, you will find that many of these files have already come up.
Task Schedule source code
Addendum
I used to go around in circles on questions like whether a task switch can happen inside an interrupt, and how a task switch can still happen after interrupts have been disabled. Time to straighten them out.
First, completing a task switch inside a timer interrupt, as shown below:
Task switch completed inside an interrupt
中斷觸發(fā)后的步驟:
- IRQ disable & Mode Switch:進(jìn)入中斷時(shí)處理器是在中斷模式下,會(huì)先切換至SVC模式,并關(guān)掉中斷
- IRQ Context Save:現(xiàn)場保存,將所有的寄存器保存至中斷棧上,并會(huì)將保存的地址
R0傳遞給up_decodeirq,在up_decodeirq中,會(huì)用一個(gè)全局的宏CURRENT_REGS指向傳遞的地址R0- up_savestate():在
IRQ dispatch的過程中,如果遇到了需要task切換的點(diǎn),比如調(diào)到up_reprioritize_rtr()函數(shù),此時(shí)該函數(shù)會(huì)判斷CURRENT_REGS宏是否為NULL,用于確定任務(wù)的切換是否發(fā)生在中斷Handler中。發(fā)生在中斷Handler中時(shí),調(diào)用up_savestate()函數(shù),將CURRENT_REGS中的內(nèi)容保存至task A中。- up_restorestate():將需要切換的
task B中的現(xiàn)場恢復(fù)到CURRENT_REGS中,此時(shí)已經(jīng)完成了內(nèi)容的覆蓋,但是并沒有將值寫入寄存器中,也就是沒有完成真正的任務(wù)切換。- IRQ Context Restore:當(dāng)中斷執(zhí)行完畢后,恢復(fù)現(xiàn)場,此時(shí)會(huì)將之前地址上保存的值恢復(fù)到寄存器里,關(guān)鍵點(diǎn)來了,因?yàn)樵?code>up_restorestate()的時(shí)候,已經(jīng)將新任務(wù)的內(nèi)容覆蓋了原有的,當(dāng)完成現(xiàn)場恢復(fù)后,此時(shí)就跳轉(zhuǎn)到了
task B中去了,也就完成了任務(wù)的切換。
Second, a task switch performed after interrupts have been disabled:
Task switch with interrupts disabled
While reading the source I noticed a task switch performed after enter_critical_section(), which confused me at first: with interrupts disabled, would the new task be unable to respond to interrupts after the switch? It turned out I had overlooked a key point: during TASK_B Context Restore, the CPSR saved in TASK_B's context is restored into the register, so from then on the CPU is running TASK_B's state, which has nothing to do with TASK_A's.
In the figure above, once TASK_B Context Restore finishes, TASK_A is suspended at that point. The next time TASK_A is scheduled, it resumes from that point and continues, so even though enter_critical_section() disabled interrupts earlier, the subsequent leave_critical_section() re-enables them and TASK_A carries on with the rest of its flow.
I will keep updating this as new questions and insights come up...



