I am working on an embedded Linux system (kernel 5.10.24) with a 32-bit MIPS CPU.

Applications running on the system may perform an invalid memory access. The kernel then kills the process with SIGSEGV and may log something like the following:

[    5.464129] do_page_fault(): sending SIGSEGV to testsegv for invalid read access from 0000001c
[    5.464144] epc = 0041e118 in testsegv[400000+668000]
[    5.464173] ra  = 00661010 in testsegv[400000+668000]
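
The epc and ra lines above each carry an address plus the mapping it falls in, written as name[base+size]. As a minimal sketch (user-space C, not kernel code), a line of that shape could be parsed and turned into an offset within the mapping like this. The struct fault_loc type and both helper names are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for one "epc = ADDR in NAME[BASE+SIZE]" log line. */
struct fault_loc {
    unsigned long addr;   /* faulting pc or return address */
    unsigned long base;   /* start of the mapping */
    unsigned long size;   /* length of the mapping */
    char name[32];        /* name of the mapped file */
};

/* Returns 0 on success, -1 if the line does not have the expected shape. */
static int parse_fault_loc(const char *line, struct fault_loc *out)
{
    char reg[8];
    const char *p = line;

    assert(line && out);
    if (*p == '[') {              /* skip a leading "[    5.464144]" timestamp */
        p = strchr(p, ']');
        if (!p)
            return -1;
        p++;
    }
    /* %31[^[] reads the name up to the '[' that opens base+size. */
    if (sscanf(p, " %7s = %lx in %31[^[][%lx+%lx]",
               reg, &out->addr, out->name, &out->base, &out->size) != 5)
        return -1;
    return 0;
}

/* Offset of the address from the start of its mapping. */
static unsigned long mapping_offset(const struct fault_loc *loc)
{
    return loc->addr - loc->base;
}
```

For the epc line above this gives an offset of 0x1e118 into the testsegv mapping. Whether a symbolizer wants the absolute address or the offset depends on how the binary was linked, so treat the offset as one of the two candidates to try.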

This log is too terse to triage the problem.
So I tried to get a backtrace from the kernel by adding show_regs() to mm/fault.c, and I now get the following log when the error hits.

[  185.408332] do_page_fault(): sending SIGSEGV to segv for invalid write access to 00000000
[  185.418592] epc = 0040065c in segv[400000+1000]
[  185.423642] ra  = 00400654 in segv[400000+1000]
[  185.428349] CPU: 1 PID: 1235 Comm: segv Not tainted 5.10.24 #17
[  185.434760] $ 0   : 00000000 00000001 00000000 00000063
[  185.440325] $ 4   : 77e7953c 00c8a190 ffffffff 00000000
[  185.445742] $ 8   : 00000000 00000000 00000001 68736172
[  185.451338] $12   : 0000000d 00000080 00000000 00000000
[  185.456755] $16   : 7faf6564 77ecaf10 004007a0 00000002
[  185.462340] $20   : 00000000 77ec6508 77ecb408 00400714
[  185.467748] $24   : 00000000 77d38a60
[  185.473200] $28   : 77e7ee30 7faf63d0 7faf63d0 00400654
[  185.478712] Hi    : 00000013
[  185.481717] Lo    : 00000000
[  185.484774] epc   : 0040065c 0x40065c
[  185.488555] ra    : 00400654 0x400654
[  185.492383] Status: 04001c13 USER EXL IE
[  185.496639] Cause : 0880000c (ExcCode 03)
[  185.500797] BadVA : 00000000
[  185.508620] CPU: 1 PID: 1235 Comm: segv Not tainted 5.10.24 #17
[  185.514826] Stack : 80bb0000 80092358 00000000 00000000 80a8ee08 8160f950 81deb528 80095370
[  185.523482]         00000000 00000000 00000000 7b699e72 825dbe6c 00000001 825dbe00 7b699e72
[  185.532132]         00000000 00000000 80a74f30 825dbcc0 000001cf 825dbcd4 00000000 00001388
[  185.540785]         1e50ef51 825dbcd3 ffffffff 00000030 80b90000 80000000 00000000 80a70000
[  185.549432]         00000001 00000001 8160f900 8160f950 00000000 00000000 2000e098 80c40004
[  185.558081]         ...
[  185.560613] Call Trace:
[  185.563149] [<8001f294>] show_stack+0x94/0x12c
[  185.567745] [<8094857c>] dump_stack+0xac/0xe8
[  185.572253] [<8002c714>] do_page_fault+0x2d4/0x510
[  185.577211] [<80032b98>] tlb_do_page_fault_1+0x118/0x120
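
The Cause and Status values in the dump above can be decoded by hand. The sketch below (user-space C, helper names are hypothetical) assumes the MIPS32 layout: ExcCode in Cause bits 6..2, and in Status IE at bit 0, EXL at bit 1 and KSU at bits 4..3. Only the exception codes relevant to memory faults are named.

```c
#include <assert.h>
#include <stdint.h>

/* ExcCode is bits 6..2 of the CP0 Cause register. */
static unsigned int cause_exccode(uint32_t cause)
{
    return (cause >> 2) & 0x1f;
}

/* Names for the TLB and address-error codes; all other codes are lumped together. */
static const char *exccode_name(unsigned int code)
{
    assert(code <= 0x1f);   /* ExcCode is a 5-bit field */
    switch (code) {
    case 1:  return "Mod (TLB modification)";
    case 2:  return "TLBL (TLB miss on load or fetch)";
    case 3:  return "TLBS (TLB miss on store)";
    case 4:  return "AdEL (address error on load or fetch)";
    case 5:  return "AdES (address error on store)";
    default: return "other";
    }
}

/* KSU is bits 4..3 of the CP0 Status register; the value 2 means user mode. */
static int status_is_user(uint32_t status)
{
    return ((status >> 3) & 0x3) == 2;
}
```

With Cause = 0880000c the code is 3, a TLB miss on store, which agrees with the "invalid write access to 00000000" line, and Status = 04001c13 decodes as user mode, matching the USER flag printed by the kernel.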

This shows the kernel-space backtrace, not the user-space one.
Is there a way to get the user-space backtrace from within the Linux kernel in this case? (IIRC, x86 can dump more than this from kernel space.)

Update: here is my test code in the kernel.

I added the following to mm/fault.c:

unsigned long user_unwind_stack(unsigned long *sp,
        unsigned long pc, unsigned long *ra)
{
    struct mips_frame_info info;
//  unsigned long size, ofs;
    int leaf;
    unsigned long stackinst[0x20] = {0xa5};
    /* copy_from_user() returns the number of bytes it could NOT copy */
    unsigned long rc = copy_from_user(stackinst, (const void __user *)(*sp), sizeof(stackinst));
    printk("XXXXXXXXXXXXXXXXXXXXXX %s, %d, sp:%lx, rc=%lu\n", __func__, __LINE__, *sp, rc);
    for (leaf = 0; leaf < ARRAY_SIZE(stackinst); leaf++) {
        if ((leaf % 4) == 0) {
            pr_cont("%08lx: ", *sp + leaf); /* row label: adds leaf, not leaf * sizeof(long) */
        }
        pr_cont("%08lX ", stackinst[leaf]);
        if ((leaf % 4) == 3) {
            pr_cont("\n");
        }
    }
    printk("XXXXXXXXXXXXXXXXXXXXXX %s, %d\n", __func__, __LINE__);
...
}

int mytest_dump_backtrace (struct pt_regs *regs)
{
    unsigned long sp = regs->regs[29];
    unsigned long ra = regs->regs[31];
    unsigned long pc = regs->cp0_epc;
    int count = 0;
    printk("user thread pc%d:0x%lx, sp:0x%lx\n", count++, pc, sp);
    pc = user_unwind_stack(&sp, pc, &ra);

    return 0;
}

    if (user_mode(regs)) {
        mytest_dump_backtrace(regs); /* call my backtrace code */
        tsk->thread.cp0_badvaddr = address;
        tsk->thread.error_code = write;

And this is what I got:

[    9.484609] user thread pc0:0x40065c, sp:0x7fbd5be0
[    9.491495] XXXXXXXXXXXXXXXXXXXXXX user_unwind_stack, 238, sp:7fbd5be0, rc=0
[    9.498835] 7fbd5be0: 00000001 00000000 77E1EE30 77E62F88
[    9.504844] 7fbd5be4: 77C98080 77C6BDCC 00000000 77E64F10
[    9.510726] 7fbd5be8: 7FBD5C08 00400694 00000000 77E61508
[    9.516407] 7fbd5bec: 77E65408 00400714 7FBD5C48 77CA3B30
[    9.522269] 7fbd5bf0: 7FBD5C28 004006F8 7FBD5EB0 7FBD5D74
[    9.527947] 7fbd5bf4: 00000001 77E16BC8 77E1EE30 00400788
[    9.533702] 7fbd5bf8: 7FBD5C48 00400794 00000001 0F6B5934
[    9.539462] 7fbd5bfc: 7FBD5C4C 00000000 77E1EE30 00000000
[    9.545202] XXXXXXXXXXXXXXXXXXXXXX user_unwind_stack, 248
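
One rough way to read a dump like this is to scan the words for values that fall inside the program's text mapping, since saved return addresses must point there. The sketch below is user-space C and only a heuristic: it is not a real unwinder, the function name is hypothetical, and the mapping segv[400000+1000] is taken from the earlier log on the assumption that it is the same binary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The 32 words from the dump above, in the order they were printed. */
static const uint32_t dump_words[32] = {
    0x00000001, 0x00000000, 0x77E1EE30, 0x77E62F88,
    0x77C98080, 0x77C6BDCC, 0x00000000, 0x77E64F10,
    0x7FBD5C08, 0x00400694, 0x00000000, 0x77E61508,
    0x77E65408, 0x00400714, 0x7FBD5C48, 0x77CA3B30,
    0x7FBD5C28, 0x004006F8, 0x7FBD5EB0, 0x7FBD5D74,
    0x00000001, 0x77E16BC8, 0x77E1EE30, 0x00400788,
    0x7FBD5C48, 0x00400794, 0x00000001, 0x0F6B5934,
    0x7FBD5C4C, 0x00000000, 0x77E1EE30, 0x00000000,
};

/* Collect up to max_out words that lie in [text_base, text_base + text_size). */
static size_t scan_text_candidates(const uint32_t *words, size_t nwords,
                                   uint32_t text_base, uint32_t text_size,
                                   uint32_t *out, size_t max_out)
{
    size_t found = 0;

    assert(words && out);
    for (size_t i = 0; i < nwords && found < max_out; i++) {
        /* Unsigned subtraction folds both range checks into one compare. */
        if ((uint32_t)(words[i] - text_base) < text_size)
            out[found++] = words[i];
    }
    return found;
}
```

Scanning dump_words against base 0x400000 and size 0x1000 picks out 00400694, 00400714, 004006F8, 00400788 and 00400794. Not every hit is a return address: 00400714 also appears in a callee-saved register in the earlier register dump, so it may simply be a spilled register value.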

copy_from_user() returned 0, which I took to mean that it failed to copy/read the process stack.

What is wrong with the code?
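
Two details are worth checking when reading the output. First, copy_from_user() returns the number of bytes that could not be copied, so rc=0 means all of the requested bytes were read. Second, the row labels in the output advance by 4 while each row holds four 32-bit words, that is 16 bytes. The user-space sketch below contrasts the two label computations; both helper names are hypothetical and 32-bit words are assumed, as on this MIPS target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Address of the word at index idx in a dump of 32-bit words starting at base. */
static uint32_t word_addr(uint32_t base, size_t idx)
{
    /* Guard against wrapping past the top of the 32-bit address space. */
    assert(idx <= (UINT32_MAX - base) / sizeof(uint32_t));
    return base + (uint32_t)(idx * sizeof(uint32_t));
}

/* The label the posted loop prints: base plus the bare word index. */
static uint32_t posted_label(uint32_t base, size_t idx)
{
    return base + (uint32_t)idx;
}
```

With base 0x7fbd5be0, the second row (word index 4) starts at 0x7fbd5bf0, whereas the posted loop labels it 7fbd5be4. By the same arithmetic the value 00400694 at word index 9 sits at 0x7fbd5c04.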
