Zephyr内核之CPU使用率

Creative Commons
本作品采用知识共享署名

本文分析说明Zephyr中Thread的CPU使用率统计实现方式。

Zephyr LTS2 release后,大约在2021年11月引入了thread usage功能https://github.com/zephyrproject-rtos/zephyr/pull/38876这对调试分析多线程系统性能很有必要。本文分析thread usage的实现方式,同时说明如何使用shell和thread 分析模块来显示thread cpu loading,并分析说明目前显示thread cpu loading的问题和改进办法。

使用率数据收集

线程调度时间的计算

Zephyr内核在切换上下文时记录CPU执行的cycle变化量,该变化量就是被切出线程最近一次的调度时间。
Zephyr通过do_swap切换上下文,向下的调用顺序为:zephyr/kernel/include/kswap.h do_swap->z_sched_usage_switch

1
2
3
4
5
6
7
8
static inline void z_sched_usage_switch(struct k_thread *thread)
{
ARG_UNUSED(thread);
#ifdef CONFIG_SCHED_THREAD_USAGE
z_sched_usage_stop();
z_sched_usage_start(thread);
#endif
}

z_sched_usage_stop计算当前要被切出的线程被调度了多长时间

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
void z_sched_usage_stop(void)
{
k_spinlock_key_t k = k_spin_lock(&usage_lock);

struct _cpu *cpu = _current_cpu;

//usage0在上一次上下文切换时调用z_sched_usage_start中被赋值
uint32_t u0 = cpu->usage0;

if (u0 != 0) {
//此时计算距离上一次上下文切换过去了多少cycles,这个cycles也就是当前要被切出线程被调度的时间
uint32_t cycles = usage_now() - u0;

//将线程被调度的时间更新到当前线程中
if (cpu->current->base.usage.track_usage) {
sched_thread_update_usage(cpu->current, cycles);
}

//将线程被调度的时间更新到当前cpu中
sched_cpu_update_usage(cpu, cycles);
}

//清空usage0记录
cpu->usage0 = 0;
k_spin_unlock(&usage_lock, k);
}

z_sched_usage_switchthread参数为即将被调度的线程,z_sched_usage_start记录被调度线程的起始cycle数

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
void z_sched_usage_start(struct k_thread *thread)
{
#ifdef CONFIG_SCHED_THREAD_USAGE_ANALYSIS
k_spinlock_key_t key;

key = k_spin_lock(&usage_lock);

//记录当前cycle数,也就是thread调度一开始的数量
_current_cpu->usage0 = usage_now(); /* Always update */

//num_windows记录thread被调度的次数
//current记录thread最近一次被调度的时间长度,这里才开始调度因此被清0
if (thread->base.usage.track_usage) {
thread->base.usage.num_windows++;
thread->base.usage.current = 0;
}

k_spin_unlock(&usage_lock, key);
#else
/* One write through a volatile pointer doesn't require
* synchronization as long as _usage() treats it as volatile
* (we can't race with _stop() by design).
*/

_current_cpu->usage0 = usage_now();
#endif
}

调度时间的更新

无论是cpu还是线程都使用zephyr/include/zephyr/kernel/stats.h内的struct k_cycle_stats记录执行

1
2
3
4
5
6
7
8
9
struct k_cycle_stats {
uint64_t total; /* total usage in cycles */
#ifdef CONFIG_SCHED_THREAD_USAGE_ANALYSIS
uint64_t current; /* # of cycles in current usage window */
uint64_t longest; /* # of cycles in longest usage window */
uint32_t num_windows; /* # of usage windows */
#endif
bool track_usage; /* true if gathering usage stats */
};

线程调度时间

线程被调度的时间以cycle为单位记录在下面的结构体关系中

1
struct k_thread->struct _thread_base base->struct k_cycle_stats  usage;

针对线程的usage通过void sched_thread_update_usage(struct k_thread *thread, uint32_t cycles)将线程被调度的cycle数更新到线程的struct k_cycle_stats usage中,对于线程来说,该结构体各个字段的含义如下

  • total 该线程总计执行了多少个cycle
  • current 该线程最近一次调度执行了多少个cycle
  • longest 该线程多次调度中最长执行了的cycle数
  • num_windows 线程被调度的次数
    包括idle线程在内的所有线程都会记录以上数据

CPU时间

CPU执行的时间以cycle为单位记录在下面的结构体中

1
struct _cpu -> struct k_cycle_stats usage;

针对CPU的usage通过void sched_cpu_update_usage(struct _cpu *cpu, uint32_t cycles)将线程被调度的cycle数更新到cpu的struct k_cycle_stats usage中,对于cpu来说,该结构体各个字段的含义如下

  • total 示该cpu总计执行了多少个cycle,不包含idle线程的
  • current 距离上一次idle该cpu执行了多少个cycle
  • longest 两次idle之间该cpu最长执行了的cycle数
  • num_windows 该cpu上Idle多少次调度

运行时获取使用率数据

无论是cpu还是线程都使用zephyr/include/zephyr/kernel/thread.h内的struct k_thread_runtime_stats来存储返回运行时数据

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
typedef struct k_thread_runtime_stats {
#ifdef CONFIG_SCHED_THREAD_USAGE
uint64_t execution_cycles;
uint64_t total_cycles; /* total # of non-idle cycles */
/*
* In the context of thread statistics, [execution_cycles] is the same
* as the total # of non-idle cycles. In the context of CPU statistics,
* it refers to the sum of non-idle + idle cycles.
*/
#endif

#ifdef CONFIG_SCHED_THREAD_USAGE_ANALYSIS
/*
* For threads, the following fields refer to the time spent executing
* as bounded by when the thread was scheduled in and scheduled out.
* For CPUs, the same fields refer to the time spent executing
* non-idle threads as bounded by the idle thread(s).
*/

uint64_t current_cycles; /* current # of non-idle cycles */
uint64_t peak_cycles; /* peak # of non-idle cycles */
uint64_t average_cycles; /* average # of non-idle cycles */
#endif

#ifdef CONFIG_SCHED_THREAD_USAGE_ALL
/*
* This field is always zero for individual threads. It only comes
* into play when gathering statistics for the CPU. In that case it
* represents the total number of cycles spent idling.
*/

uint64_t idle_cycles;
#endif

#if defined(__cplusplus) && !defined(CONFIG_SCHED_THREAD_USAGE) && \
!defined(CONFIG_SCHED_THREAD_USAGE_ANALYSIS) && !defined(CONFIG_SCHED_THREAD_USAGE_ALL)
/* If none of the above Kconfig values are defined, this struct will have a size 0 in C
* which is not allowed in C++ (it'll have a size 1). To prevent this, we add a 1 byte dummy
* variable when the struct would otherwise be empty.
*/
uint8_t dummy;
#endif
} k_thread_runtime_stats_t;

运行时线程数据获取

z_sched_thread_usage用于获取线程的usage,如果是线程自己调用该API,会计算到调用这一刻

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
void z_sched_thread_usage(struct k_thread *thread,
struct k_thread_runtime_stats *stats)
{
struct _cpu *cpu;
k_spinlock_key_t key;

key = k_spin_lock(&usage_lock);
cpu = _current_cpu;

//如果是当前线程,计算从开始调度到这一刻的cycles并进行更新
if (thread == cpu->current) {
uint32_t now = usage_now();
uint32_t cycles = now - cpu->usage0;

/*
* Getting stats for the current thread. Update both the
* current thread stats and its CPU stats as the CPU's
* [usage0] field will also get updated. This keeps all
* that information up-to-date.
*/

if (thread->base.usage.track_usage) {
sched_thread_update_usage(thread, cycles);
}

sched_cpu_update_usage(cpu, cycles);

cpu->usage0 = now;
}

//转换为运行时统计数据
stats->execution_cycles = thread->base.usage.total;
stats->total_cycles = thread->base.usage.total;

/* Copy-out the thread's usage stats */

#ifdef CONFIG_SCHED_THREAD_USAGE_ANALYSIS
stats->current_cycles = thread->base.usage.current;
stats->peak_cycles = thread->base.usage.longest;

//计算平局调度时间
if (thread->base.usage.num_windows == 0) {
stats->average_cycles = 0;
} else {
stats->average_cycles = stats->total_cycles /
thread->base.usage.num_windows;
}
#endif

#ifdef CONFIG_SCHED_THREAD_USAGE_ALL
stats->idle_cycles = 0;
#endif
stats->execution_cycles = thread->base.usage.total;

k_spin_unlock(&usage_lock, key);
}

对于单个线程来说 k_thread_runtime_stats_t各字段的含义如下

  • execution_cycles 该线程总计调度的cycle数
  • total_cycles 该线程总计调度的cycle数
  • current_cycles 该线程最近一次调度的cycle数
  • peak_cycles 线程最长一次调度的cycle数
  • average_cycles 线程调度平均的cycle数
  • idle_cycles 为0,无效

运行时CPU数据获取

z_sched_cpu_usage用于获取CPU的usage,会计算到调用API这一刻

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
void z_sched_cpu_usage(uint8_t cpu_id, struct k_thread_runtime_stats *stats)
{
k_spinlock_key_t key;
struct _cpu *cpu;

key = k_spin_lock(&usage_lock);
cpu = _current_cpu;

//计算从开始调度到这一刻的cycles并进行更新
if (&_kernel.cpus[cpu_id] == cpu) {
uint32_t now = usage_now();
uint32_t cycles = now - cpu->usage0;

/*
* Getting stats for the current CPU. Update both its
* current thread stats and the CPU stats as the CPU's
* [usage0] field will also get updated. This keeps all
* that information up-to-date.
*/

if (cpu->current->base.usage.track_usage) {
sched_thread_update_usage(cpu->current, cycles);
}

sched_cpu_update_usage(cpu, cycles);

cpu->usage0 = now;
}

stats->total_cycles = cpu->usage.total;
#ifdef CONFIG_SCHED_THREAD_USAGE_ANALYSIS
stats->current_cycles = cpu->usage.current;
stats->peak_cycles = cpu->usage.longest;

//计算该CPU上调度的平均执行时间
if (cpu->usage.num_windows == 0) {
stats->average_cycles = 0;
} else {
stats->average_cycles = stats->total_cycles /
cpu->usage.num_windows;
}
#endif

stats->idle_cycles =
_kernel.cpus[cpu_id].idle_thread->base.usage.total;

stats->execution_cycles = stats->total_cycles + stats->idle_cycles;

k_spin_unlock(&usage_lock, key);
}

对于CPU来说 k_thread_runtime_stats_t各字段的含义如下

  • total_cycles cpu总计调度的cycle,不包含idle
  • idle_cycles cpu上的idle线程执总计执行了多少tick
  • execution_cycles cpu总计执行了多少cycles,包含idle和非idle
  • current_cycles 距离上一次idle该cpu执行了多少个cycle
  • peak_cycles 表示cpu上两次idle之间该cpu最长多少个cycle
  • average_cycles 该cpu上平均调度时间

usage的使用

在zephyr中进行如下配置可以启动usage的统计

1
2
CONFIG_SCHED_THREAD_USAGE_ANALYSIS=y
CONFIG_SCHED_THREAD_USAGE_ALL=y

shell中usage的使用

开启如下配置项会有cpu Loading和usage出现

1
2
3
CONFIG_SHELL=y
CONFIG_KERNEL_SHELL=y
CONFIG_THREAD_RUNTIME_STATS=y

执行kernel threads可以看到

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
uart:~$ kernel threads
Scheduler: 38 since last call
Threads:
0x3fc90ac0 temp_task_id
options: 0x0, priority: 3 timeout: 0
state: pending, entry: 0x4201277c
Total execution cycles: 217 (0 %)
Current execution cycles: 217
Peak execution cycles: 217
Average execution cycles: 217
stack size 2048, unused 1864, usage 184 / 2048 (8 %)

0x3fc90568 sys_stress
options: 0x0, priority: 14 timeout: 52
state: suspended, entry: 0x4201121c
Total execution cycles: 2920152707 (92 %)
Current execution cycles: 800216
Peak execution cycles: 802088
Average execution cycles: 255291
stack size 2048, unused 1740, usage 308 / 2048 (15 %)

0x3fc90480 sys_ps
options: 0x0, priority: 0 timeout: 499
state: suspended, entry: 0x4201115e
Total execution cycles: 491107 (0 %)
Current execution cycles: 835
Peak execution cycles: 4225
Average execution cycles: 626
stack size 2048, unused 1688, usage 360 / 2048 (17 %)

0x3fc938d0 input
options: 0x0, priority: 14 timeout: 0
state: pending, entry: 0x4201f8ec
Total execution cycles: 216 (0 %)
Current execution cycles: 216
Peak execution cycles: 216
Average execution cycles: 216
stack size 512, unused 344, usage 168 / 512 (32 %)

线程分析功能

zephyr的debug子系统中有一个线程分析功能zephyr/subsys/debug/thread_analyzer.c,提供的功能和shell命令类似,主要是用于没有开启shell的情况下。
当配置CONFIG_THREAD_ANALYZER=y后就可以调用thread_analyzer_print()显示线程信息,其中报告cpu的usage。当多配置CONFIG_THREAD_ANALYZER_AUTO=y后线程分析会多开一个thread,每隔CONFIG_THREAD_ANALYZER_AUTO_INTERVAL调用一次thread_analyzer_print()来自动显示线程信息,如下

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
Thread analyze:
temp_task_id : STACK: unused 1864 usage 184 / 2048 (8 %); CPU: 0 %
: Total CPU cycles used: 204
- Current Frame: 204; Longest Frame: 204; Average Frame: 204
sys_stress : STACK: unused 1868 usage 180 / 2048 (8 %); CPU: 0 %
: Total CPU cycles used: 13642
- Current Frame: 124; Longest Frame: 322; Average Frame: 128
input : STACK: unused 344 usage 168 / 512 (32 %); CPU: 0 %
: Total CPU cycles used: 204
- Current Frame: 204; Longest Frame: 204; Average Frame: 204
display_task_id : STACK: unused 1880 usage 168 / 2048 (8 %); CPU: 0 %
: Total CPU cycles used: 160
- Current Frame: 160; Longest Frame: 160; Average Frame: 160
check_task_id : STACK: unused 1880 usage 168 / 2048 (8 %); CPU: 0 %
: Total CPU cycles used: 513
- Current Frame: 513; Longest Frame: 513; Average Frame: 513
sysworkq : STACK: unused 840 usage 184 / 1024 (17 %); CPU: 0 %
: Total CPU cycles used: 60
- Current Frame: 60; Longest Frame: 60; Average Frame: 60
shell_uart : STACK: unused 968 usage 1080 / 2048 (52 %); CPU: 0 %
: Total CPU cycles used: 227831
- Current Frame: 21963; Longest Frame: 21963; Average Frame: 240
logging : STACK: unused 456 usage 312 / 768 (40 %); CPU: 0 %
: Total CPU cycles used: 28826
- Current Frame: 143; Longest Frame: 3228; Average Frame: 134
idle : STACK: unused 300 usage 212 / 512 (41 %); CPU: 99 %
: Total CPU cycles used: 169747177
- Current Frame: 40345; Longest Frame: 815888; Average Frame: 218184
ISR0 : STACK: unused 0 usage 2048 / 2048 (100 %)

usage功能的问题和处理

shell命令中线程的cpu loading百分比计算方式如下:

1
2
pcnt = (rt_stats_thread.execution_cycles * 100U) /
rt_stats_all.execution_cycles;

在线程分析中线程的cpu loading百分比计算方式如下:

1
2
info.utilization = (info.usage.execution_cycles * 100U) /
rt_stats_all.execution_cycles;

二者都是以线程的总计被调度cycle去除于cpu的总计执行cycle,也就是说线程的cpu loading是从开机到当前的总计平均情况。如果某个线程在开机后loading一直很低,后来突然增高,shell和线程分析显示该线程的cpu loading只会慢慢增加,在增高一瞬间是看不出来的,此时可以通过观察Current execution cycles,如果该值较平时大说明该线程的瞬时loading上涨,但这样并不直观,我们还是希望能像ps一样得到一个百分比。shell和线程分析的百分比不能反映瞬时的线程loading是因为统计时间太长,因此我们另外写一个cpu loading的计算方法就是缩短统计时间, 以下是伪代码表示方法,1s计算一次loading:

1
2
3
4
5
6
7
8
9
10
11
12
while(1){
//获取当前执行的cycles数量
k_thread_runtime_stats_all_get(&cpu_stats);
k_thread_runtime_stats_get(thread, &thread_stats);
//通过两次的差值计算线程的loading百分比
usage = (thread_stats.execution_cycles - last_thread_stats.execution_cycles)/
(cpu_stats.execution_cycles - last_cpu_stats.execution_cycles);
//保存本次的采样
last_cpu_stats = cpu_stats;
last_thread_stats = thread_stats;
k_msleep(1000);
}

此时就比较贴近当前的CPU Loading了

1
2
3
4
5
6
7
8
9
10
temp_task_id         CPU 0 %
sys_stress CPU 99 %
sys_ps CPU 0 %
input CPU 0 %
display_task_id CPU 0 %
check_task_id CPU 0 %
sysworkq CPU 0 %
shell_uart CPU 0 %
logging CPU 0 %
idle CPU 1 %