RISC-V Debug SOP
REVISION HISTORY
Revision No. | Description | Date |
---|---|---|
1.0 | Initial release | 12/16/2024 |
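1. How to troubleshoot when RISC-V fails to boot
If RISC-V produces no log output at all, first confirm that the image was loaded successfully. The steps are as follows.
1.1. Confirm the RISC-V load address
The image load address can be found in the build configuration file, for example in mak/options_pcupid_riscv_isw.mak: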
    # Feature_Name = uImage load address
    # Description = define uImage load address
    # Option_Selection = ADDRESS
    CONFIG_RTOS_LOAD_ADDR = 0x26800000
This shows that the RISC-V image is loaded from the storage medium (for example Flash or eMMC) into DDR at address 0x26800000.
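When several configurations need to be checked, the address can be read by a script rather than by eye. The following is a minimal sketch, not part of the SDK: the helper name `find_load_addr` is hypothetical, and it assumes the option sits on its own line in the form `CONFIG_RTOS_LOAD_ADDR = 0x...`.

```python
import re

def find_load_addr(mak_text, option="CONFIG_RTOS_LOAD_ADDR"):
    """Return the integer value of `option` found in .mak text, or None."""
    # Accept '=', ':=' and '?=' assignments; comment lines start with '#'
    # and therefore never match the anchored option name.
    pattern = re.compile(
        r"^\s*" + re.escape(option) + r"\s*[:?]?=\s*(0[xX][0-9a-fA-F]+)",
        re.MULTILINE,
    )
    match = pattern.search(mak_text)
    return int(match.group(1), 16) if match else None

# Sample text copied from the configuration fragment in this section.
sample = """\
# Feature_Name = uImage load address
# Description = define uImage load address
# Option_Selection = ADDRESS
CONFIG_RTOS_LOAD_ADDR = 0x26800000
"""

print(hex(find_load_addr(sample)))  # prints 0x26800000
```

In practice the text would be read from the .mak file used for the build in question.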
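1.2. Determine the size of the code section
The RISC-V image contains a code section (text section) and a data section. The code section holds the program's instructions and does not change at run time. The data section holds global variables with initial values and is modified at run time. To confirm that the image was loaded correctly, it is therefore enough to compare the code section. Its size can be obtained on the build server with the following command: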
    readelf -S build/pcupid_riscv_isw/out/pcupid_riscv_isw.elf
    There are 28 section headers, starting at offset 0x2292a4:

    Section Headers:
      [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
      [ 0]                   NULL            00000000 000000 000000 00      0   0  0
      [ 1] XRAM              PROGBITS        10000000 061fcc 000000 00   W  0   0  1
      [ 2] .text             PROGBITS        10000000 001000 041148 00  AX  0   0 256
      [ 3] .rodata           PROGBITS        10041148 042148 01e780 00   A  0   0  8
      [ 4] PREMAIN_INITCALL  PROGBITS        1005f8c8 061fcc 000000 00   W  0   0  1
      [ 5] NORM_INITCALL     PROGBITS        1005f8c8 0608c8 000028 00  WA  0   0  4
      [ 6] APPLICATION_INITC PROGBITS        1005f8f0 0608f0 00001c 00  WA  0   0  4
      [ 7] XRAM0             PROGBITS        1005f90c 06090c 000004 00  WA  0   0  4
      [ 8] .cli_cmd_list     PROGBITS        1005f910 060910 0002a0 00  WA  0   0  4
      [ 9] .cam_dev_list     PROGBITS        1005fbb0 061fcc 000000 00   W  0   0  1
      [10] .data             PROGBITS        1005fbb0 060bb0 001414 00  WA  0   0  8
      [11] RW_STATICBOOT     PROGBITS        10060fc4 061fc4 000008 00  WA  0   0  1
      [12] DEBUG_AREA        PROGBITS        10060fcc 061fcc 000000 00   W  0   0  1
      [13] .bss              NOBITS          10061000 061fcc 00f470 00  WA  0   0 64
      [14] .sys_stack        NOBITS          10070470 061fcc 000500 00  WA  0   0  1
      [15] .data2            NOBITS          101b8000 062000 008000 00  WA  0   0  1
      [16] .debug_info       PROGBITS        00000000 061fcc 0cc52a 00      0   0  1
      [17] .debug_abbrev     PROGBITS        00000000 12e4f6 019483 00      0   0  1
      [18] .debug_loc        PROGBITS        00000000 147979 0475fc 00      0   0  1
      [19] .debug_aranges    PROGBITS        00000000 18ef78 004590 00      0   0  8
      [20] .debug_line       PROGBITS        00000000 193508 055e97 00      0   0  1
      [21] .debug_str        PROGBITS        00000000 1e939f 01d66b 01  MS  0   0  1
      [22] .comment          PROGBITS        00000000 206a0a 000011 01  MS  0   0  1
      [23] .debug_frame      PROGBITS        00000000 206a1c 00f1a0 00      0   0  4
      [24] .debug_ranges     PROGBITS        00000000 215bbc 000ae0 00      0   0  1
      [25] .symtab           SYMTAB          00000000 21669c 009070 10     26 926  4
      [26] .strtab           STRTAB          00000000 21f70c 009a6f 00      0   0  1
      [27] .shstrtab         STRTAB          00000000 22917b 000129 00      0   0  1
    Key to Flags:
      W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
      L (link order), O (extra OS processing required), G (group), T (TLS),
      C (compressed), x (unknown), o (OS specific), E (exclude),
      p (processor specific)
As shown, the size of .text is 0x41148.
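The size can also be picked out of the `readelf -S` text programmatically. The sketch below is a hypothetical helper (`section_size` is not an SDK or binutils function) and assumes the one-line-per-section layout that readelf prints for 32-bit ELF files, as in the output above.

```python
import re

# Row layout assumed: [Nr] Name Type Addr Off Size ...
_ROW = re.compile(
    r"\[\s*\d+\]\s+(\S+)\s+(\S+)\s+([0-9a-fA-F]+)\s+([0-9a-fA-F]+)\s+([0-9a-fA-F]+)"
)

def section_size(readelf_output, name):
    """Return the size in bytes of section `name`, or None if it is absent."""
    for line in readelf_output.splitlines():
        match = _ROW.search(line)
        if match and match.group(1) == name:
            return int(match.group(5), 16)  # fifth captured field is Size
    return None

# Two rows copied from the readelf output in this section.
sample = """\
  [ 2] .text             PROGBITS        10000000 001000 041148 00  AX  0   0 256
  [ 3] .rodata           PROGBITS        10041148 042148 01e780 00   A  0   0  8
"""

print(hex(section_size(sample, ".text")))  # prints 0x41148
```

The returned value is the byte count to pass to the memory dump in the next step.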
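1.3. Confirm that the RISC-V image was loaded correctly
The RISC-V image is loaded during the IPL_CUST stage, so the u-boot command line can be used to check whether it was loaded correctly. After entering the u-boot command line, configure the network correctly, then use a tftp command (tftpput in this example) to dump the DDR contents: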
    SigmaStar # tftpput 0x26800000 0x41148 riscvfw_dump.bin
    Using sstar_emac device
    TFTP to server 10.21.2.10; our IP address is 10.24.16.137; sending through gateway 10.24.16.254
    Filename 'riscvfw_dump.bin'.
    Save address: 0x26800000
    Save size:    0x41148
    Saving: T ##################
             43 KiB/s
    done
    Bytes transferred = 266568 (41148 hex)
    SigmaStar #
Use a comparison tool to check whether the burned image and the dumped file are identical.
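If no graphical comparison tool is at hand, a few lines of script can report the first differing offset. This is a minimal sketch under one important assumption that should be verified for the image format in use: byte 0 of the reference file corresponds to the load address, with no extra header or compression in front of the code. The function names are hypothetical.

```python
def first_mismatch(reference, dump, length):
    """Return the offset of the first differing byte in the first `length`
    bytes, or None when both inputs agree over that range."""
    a, b = reference[:length], dump[:length]
    if len(a) < length or len(b) < length:
        raise ValueError("input shorter than the requested compare length")
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    return None

def compare_files(reference_path, dump_path, length):
    """Compare the first `length` bytes of two files on disk."""
    with open(reference_path, "rb") as f:
        reference = f.read(length)
    with open(dump_path, "rb") as f:
        dump = f.read(length)
    return first_mismatch(reference, dump, length)

# In-memory demonstration with made-up bytes: the third byte differs.
print(first_mismatch(b"\x01\x02\x03\x04", b"\x01\x02\xff\x04", 4))  # prints 2
```

For the dump above, a call would look like `compare_files(reference_path, "riscvfw_dump.bin", 0x41148)`, where `reference_path` points at the burned image. A mismatch offset near the start suggests a wrong load address, while one deep inside the section suggests corruption after loading.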
2. Determine whether RISC-V is running
Use a register read/write tool (for example system tool or riu_r) to read the following register:
Bank | Offset | Bits | Description |
---|---|---|---|
0x1E | 0x35 | [0] | 0: disable 1: enable |
3. Determine whether RISC-V is in TCM mode
Use a register read/write tool (for example system tool or riu_r) to read the following register:
Bank | Offset | Bits | Description |
---|---|---|---|
0x802 | 0x18 | [2:0] | 3'b000: icache mode; 3'b001: reserved; 3'b010: tcm mode |
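Once the raw register value has been read, only bits [2:0] matter. The sketch below shows the masking step. The helper name is hypothetical, and encodings not listed in the table above are reported as such rather than guessed.

```python
# Encodings taken from the register table above.
MODE_NAMES = {0b000: "icache mode", 0b001: "reserved", 0b010: "tcm mode"}

def decode_mode(reg_value):
    """Decode bits [2:0] of the value read from bank 0x802, offset 0x18."""
    field = reg_value & 0b111  # keep only bits [2:0]
    return MODE_NAMES.get(field, "not listed (0b{:03b})".format(field))

print(decode_mode(0x0002))  # prints tcm mode
```

The same mask-and-lookup pattern applies to the single enable bit in section 2, using `reg_value & 0x1`.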
4. How to view thread priority and per-thread CPU loading on RISC-V
Use the CLI command taskstat. It measures the CPU loading of every thread over the 1 s after the command is entered.
ID | PRIO | STAT | CPU | STACK USAGE | NAME | HANDLER |
---|---|---|---|---|---|---|
1 | 111 | B | 0.0 | 776/3072 | SYS_CUST | 0x1002a250 |
2 | 2 | B | 0.0 | 496/2048 | CONSOLE | 0x1002c6b8 |
3 | 2 | X | 0.0 | 1320/4096 | MENU | 0x1002a268 |
4 | 0 | R | 99.9 | 144/2044 | IDLE | 0x00000000 |
5 | 127 | B | 0.0 | 192/2040 | Tmr Svc | 0x1003919c |
6 | 99 | B | 0.0 | 648/1528 | NonSecureWorld | 0x1002e900 |
7 | 64 | B | 0.0 | 776/2040 | rpmsg_dualos | 0x1001b194 |
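When the output is captured in a log, a short script can flag tasks that consume most of the CPU and show how much of each stack is in use. This is a sketch only: it assumes the rows are pipe-separated exactly as rendered in the table above (the raw console layout may differ), and the function names and the 50% threshold are illustrative choices.

```python
def parse_taskstat(text):
    """Parse pipe-separated taskstat rows into a list of dicts."""
    tasks = []
    for line in text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != 7 or not cells[0].isdigit():
            continue  # skip the header, the separator and malformed rows
        used, total = cells[4].split("/")
        tasks.append({
            "id": int(cells[0]), "prio": int(cells[1]), "stat": cells[2],
            "cpu": float(cells[3]),
            "stack_used": int(used), "stack_total": int(total),
            "name": cells[5], "handler": cells[6],
        })
    return tasks

def busy_tasks(tasks, threshold=50.0):
    """Names of non-IDLE tasks whose CPU loading is at or above `threshold`."""
    return [t["name"] for t in tasks
            if t["name"] != "IDLE" and t["cpu"] >= threshold]

# Rows copied from the taskstat table above.
sample = """\
ID | PRIO | STAT | CPU | STACK USAGE | NAME | HANDLER |
---|---|---|---|---|---|---|
1 | 111 | B | 0.0 | 776/3072 | SYS_CUST | 0x1002a250 |
2 | 2 | B | 0.0 | 496/2048 | CONSOLE | 0x1002c6b8 |
4 | 0 | R | 99.9 | 144/2044 | IDLE | 0x00000000 |
"""

tasks = parse_taskstat(sample)
for t in tasks:
    pct = 100.0 * t["stack_used"] / t["stack_total"]
    print("{:10s} prio={:3d} cpu={:5.1f}% stack={:4.1f}%".format(
        t["name"], t["prio"], t["cpu"], pct))
print("busy:", busy_tasks(tasks))
```

In the sample all loading is in IDLE, so no task is flagged.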
5. How to view the status of all interrupts on RISC-V
Use the CLI command intrstat to get each interrupt's trigger count, handling time, and registered name:
INT | Count | MaxTimeUs | AvgTimeUs | Handler | DevId | Affinity | Name |
---|---|---|---|---|---|---|---|
162 | 1 | 7 | 14 | 0x10016000 | 0x10072740 | 0x00000001 | pwm_group3 |
365 | 13 | 22 | 22 | 0x1001B0B0 | 0x00000000 | 0x00000001 | RPMSG_L2R |
105 | 0 | 0 | 4294967297 | 0x1000664C | 0x1006230C | 0x00000001 | bdma7 |
105 | 0 | 0 | 4294967297 | 0x1000664C | 0x1006228C | 0x00000001 | bdma6 |
105 | 0 | 0 | 4294967297 | 0x1000664C | 0x1006220C | 0x00000001 | bdma5 |
105 | 0 | 0 | 4294967297 | 0x1000664C | 0x1006218C | 0x00000001 | bdma4 |
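6. How to debug when the RISC-V serial console hangs and accepts no input
6.1. Cause
Serial input depends on two tasks, CONSOLE and MENU, whose default priority is 2. When the RISC-V CPU is completely taken over by other, higher-priority tasks, the serial console can no longer accept input.
6.2. Common causes
1) Faulty logic in customer code: a time-consuming API is called inside the rtos_application_initcall function. Initcalls run at the priority of SYS_CUST (111).
2) A thread created by the customer through CamOsThreadCreate polls in a while(1){……CamOsMsDelay()} loop. Delay does not release the CPU, so it should be replaced with CamOsMsSleep.
3) System IRQs consume all of the RISC-V CPU loading. This is usually either a bug in the public release, or the customer's interrupt handler performs time-consuming work while interrupts arrive at a high rate, so they cannot be handled in time.
6.3. Troubleshooting method
Read the RISC-V PC (program counter) to see what RISC-V is currently executing. Read it about 10 times in a row, then use addr2line to resolve the PC values to code locations.
6.3.1. How to get the PC
1) If the Arm serial console works normally, use the following command: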
    /customer # ./riux32_r 0x803 0x1
    BANK:0x0803 16bit-offset 0x01 0x1002D630
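2) If the Arm serial console is also hung, read the value through the Debug serial port with SStarSystemTool: select the X32 tab, enter 0x803 in the register field, and check the value at offset 0x1.
6.3.2. How to resolve the PC
1) First locate the RISC-V ELF file built for the environment where the problem occurs:
MOUNRIVER build path: rtk\proj\obj\PCUPID.elf; Linux build path: rtk\proj\build\pcupid_riscv_isw\out\pcupid_riscv_isw.elf
2) On the Linux command line, use the addr2line command to resolve the PC. Multiple addresses can be resolved in one call.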
    riscv64-unknown-elf-addr2line -e xxx.elf 0x1002D630 0x1002D630
or
    addr2line -e xxx.elf 0x1002D630 0x1002D630
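Since the PC is sampled about 10 times, it helps to count how often each address appears before resolving it: an address that dominates the samples points at the loop that is holding the CPU. The sketch below is illustrative. The helper names are hypothetical, and in the sample list only 0x1002D630 comes from this document while 0x1002D634 is a made-up second value.

```python
from collections import Counter

def summarize_pc_samples(samples):
    """Return (address, hit_count) pairs, most frequent address first."""
    return Counter(int(s, 16) for s in samples).most_common()

def addr2line_args(elf_path, addresses, tool="riscv64-unknown-elf-addr2line"):
    """Build the addr2line argument list for the given addresses."""
    return [tool, "-e", elf_path] + ["0x{:08x}".format(a) for a in addresses]

# 0x1002D634 is an invented value, used only to show the counting.
samples = ["0x1002D630", "0x1002D630", "0x1002D634", "0x1002D630"]

summary = summarize_pc_samples(samples)
for address, hits in summary:
    print("0x{:08x} seen {} time(s)".format(address, hits))

print(" ".join(addr2line_args("xxx.elf", [a for a, _ in summary])))
```

The printed command line can be pasted into the shell, or the argument list can be passed to `subprocess.run` on a machine where the toolchain is installed.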
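7. How to troubleshoot the cause of a RISC-V exception
Exception causes fall into the following 4 types:
- DATA ABORT: an illegal memory address was read or written. Check whether the flow that obtains the memory address is wrong.
- UNDEFINED INSTRUCTION: the CPU executed an instruction it cannot recognize. Check whether a function pointer is wrong, or whether the memory holding the function has been corrupted.
- PREFETCH ABORT: the CPU fetched an instruction from an illegal memory address. Check whether a function pointer is wrong.
- SYSTEM ASSERT: the code flow deliberately triggered the exception.
The exception information consists of 3 main parts:
- Exception register info: the main status registers of the RISC-V CPU
- Exception type: the cause of the exception
- Panic message: the backtrace recorded at the time of the exception and the exact location that triggered the assert
Exception example: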
    Exception without dump info
    Exception type: SYSTEM ASSERT (240), Param: 0x100425c4
    Panic at 0x10004d90 (unknown symbol)
    Panic message: Test Assert
    Call Stack Backtrace Begin:
    #0 0x10004d90 (unknown symbol)
    #1 0x1000550a (unknown symbol)
    #2 0x10025038 (unknown symbol)
    #3 0x1002313c (unknown symbol)
    #4 0x1002c5e6 (unknown symbol)
    #5 0x1002316a (unknown symbol)
    #6 0x1002c5e6 (unknown symbol)
    #7 0x1003885a (unknown symbol)
    #8 0x10038854 (unknown symbol)
    #9 0x1002acd8 (unknown symbol)
    #10 0x1002b34e (unknown symbol)
    #11 0x1002be12 (unknown symbol)
    #12 0x1000311c (unknown symbol)
Resolve the panic backtrace using the PC resolution method described in section 6.3.2:
    aarch64-linux-gnu-addr2line -e pcupid_riscv_isw.elf 0x10004d90 0x1000550a 0x10025038 0x1002313c 0x1002c5e6 0x1002316a 0x1002c5e6 0x1003885a 0x10038854 0x1002acd8 0x1002b34e 0x1002be12 0x1000311c
Checking the code at the resolved backtrace locations shows that line 63 of riscv_gpio.c calls CamOsPanic("Test Assert");
    /home/beck.zhang/5_2.3.0_p3p/riscv/kernel/rtk/proj/build/pcupid_riscv_isw/out/riscv_gpio.c:63
    …………………………
    /home/beck.zhang/5_2.3.0_p3p/riscv/kernel/rtk/proj/build/pcupid_riscv_isw/out/core_state.c:160
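Copying thirteen addresses by hand is error-prone, so the frame addresses can be extracted from the captured log instead. The following is a minimal sketch. The helper name is hypothetical, and it assumes each frame is printed as `#N 0x...`, as in the exception example in this section.

```python
import re

def backtrace_addresses(log_text):
    """Return the addresses of '#N 0x...' backtrace frames, in frame order."""
    return re.findall(r"#\d+\s+(0x[0-9a-fA-F]+)", log_text)

# First lines of the exception example from this section.
log = """\
Panic at 0x10004d90 (unknown symbol)
Panic message: Test Assert
Call Stack Backtrace Begin:
#0 0x10004d90 (unknown symbol)
#1 0x1000550a (unknown symbol)
#2 0x10025038 (unknown symbol)
"""

addresses = backtrace_addresses(log)
print("addr2line -e pcupid_riscv_isw.elf " + " ".join(addresses))
```

The "Panic at" line is skipped on purpose because it has no frame number, and its address already appears as frame #0.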