Raspberry Pi 4, latest bookworm with all updates, 6.6.47+rpt-rpi-v8 #1 SMP PREEMPT Debian 1:6.6.47-1+rpt1 (2024-09-02) aarch64 GNU/Linux
1. Get and build kmscube
2. Set GALLIUM_HUD to show some performance counters, e.g. as in the code block below.
3. Run kmscube a few times.
Code:
GALLIUM_HUD_VISIBLE=false
GALLIUM_HUD=stdout
GALLIUM_HUD+=,fps
GALLIUM_HUD+=,frametime
GALLIUM_HUD+=,cpu
GALLIUM_HUD+=,samples-passed
GALLIUM_HUD+=,primitives-generated
GALLIUM_HUD+=,PTB-primitives-discarded-outside-viewport
GALLIUM_HUD+=,QPU-total-idle-clk-cycles
GALLIUM_HUD+=,QPU-total-active-clk-cycles-vertex-coord-shading
GALLIUM_HUD+=,QPU-total-active-clk-cycles-fragment-shading
GALLIUM_HUD+=,QPU-total-clk-cycles-executing-valid-instr
GALLIUM_HUD+=,QPU-total-clk-cycles-waiting-TMU
GALLIUM_HUD+=,QPU-total-clk-cycles-waiting-varyings
GALLIUM_HUD+=,QPU-total-instr-cache-hit
GALLIUM_HUD+=,QPU-total-instr-cache-miss
GALLIUM_HUD+=,TMU-total-text-quads-access
GALLIUM_HUD+=,TMU-total-text-cache-miss
GALLIUM_HUD+=,L2T-total-cache-hit
GALLIUM_HUD+=,L2T-total-cache-miss
GALLIUM_HUD+=,QPU-total-clk-cycles-waiting-vertex-coord-shading
GALLIUM_HUD+=,QPU-total-clk-cycles-waiting-fragment-shading
GALLIUM_HUD+=,TLB-partial-quads-written-to-color-buffer
GALLIUM_HUD+=,TMU-active-cycles
GALLIUM_HUD+=,TMU-stalled-cycles
GALLIUM_HUD+=,L2T-TMU-reads
GALLIUM_HUD+=,L2T-TMU-write-miss
GALLIUM_HUD+=,L2T-TMU-read-miss
GALLIUM_HUD+=,TMU-MRU-hits
./build/kmscube
4. After the third or fourth run the kernel panics at some random and often unrelated location. I've seen stacks from drm+v3d, ext4, usb, etc. It's completely arbitrary.
I'm not sure whether a specific counter, or a combination of counters, is causing this. Using just a single counter doesn't seem to crash even after several tries. Enabling 10-20 counters crashes on the second run.
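One way to narrow this down might be to enable only the first N counters from a list and repeat the run a few times for each N. The following is a minimal, untested sketch of that idea, not something I have run on the affected system. The names build_hud, run_with_counters and hud-bisect.log are made up for illustration, the counter list is only a subset of the ones above, and it assumes coreutils `timeout` is available to stop kmscube after 10 seconds.

```shell
# Subset of the V3D counters from the report above; extend as needed.
COUNTERS="PTB-primitives-discarded-outside-viewport
QPU-total-idle-clk-cycles
QPU-total-instr-cache-hit
QPU-total-instr-cache-miss
TMU-total-text-quads-access
TMU-total-text-cache-miss
L2T-total-cache-hit
L2T-total-cache-miss"

# build_hud N: print a GALLIUM_HUD value enabling the first N counters.
# The body runs in a subshell so its variables do not leak into the caller.
build_hud() (
  n=$1
  hud="stdout,fps"
  i=0
  for c in $COUNTERS; do
    [ "$i" -ge "$n" ] && break
    hud="$hud,$c"
    i=$((i + 1))
  done
  printf '%s\n' "$hud"
)

# run_with_counters N [RUNS]: run kmscube RUNS times (default 5) with the
# first N counters enabled. Each attempt is logged and synced to disk first,
# so the last log line shows what was running if the kernel panics.
run_with_counters() (
  n=$1
  runs=${2:-5}
  i=1
  while [ "$i" -le "$runs" ]; do
    echo "counters=$n run=$i" >> hud-bisect.log
    sync
    GALLIUM_HUD_VISIBLE=false GALLIUM_HUD="$(build_hud "$n")" \
      timeout 10 ./build/kmscube
    i=$((i + 1))
  done
)
```

For example, `run_with_counters 4 5` would run kmscube five times with the first four counters from the list. After a reboot, the last line of hud-bisect.log shows which combination was active.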
One seemingly relevant panic:
Code:
[ 391.090878] ------------[ cut here ]------------
[ 391.090891] WARNING: CPU: 0 PID: 1188 at mm/slab_common.c:994 free_large_kmalloc+0x78/0xb8
[ 391.090916] Modules linked in: cmac algif_hash aes_arm64 aes_generic algif_skcipher af_alg bnep brcmfmac_wcc brcmfmac vc4 brcmutil cfg80211 binfmt_misc snd_soc_hdmi_codec hci_uart uvcvideo drm_display_helper btbcm cec bluetooth uvc rpivid_hevc(C) bcm2835_codec(C) drm_dma_helper v3d drm_kms_helper bcm2835_isp(C) v4l2_mem2mem bcm2835_v4l2(C) bcm2835_mmal_vchiq(C) raspberrypi_hwmon videobuf2_vmalloc videobuf2_dma_contig gpu_sched snd_soc_core drm_shmem_helper ecdh_generic ecc videobuf2_memops rfkill videobuf2_v4l2 videodev snd_compress snd_pcm_dmaengine libaes raspberrypi_gpiomem snd_bcm2835(C) vc_sm_cma(C) snd_pcm videobuf2_common snd_timer snd mc nvmem_rmem uio_pdrv_genirq uio drm fuse dm_mod drm_panel_orientation_quirks backlight ip_tables x_tables ipv6 i2c_brcmstb
[ 391.091067] CPU: 0 PID: 1188 Comm: kmscube Tainted: G C 6.6.47+rpt-rpi-v8 #1 Debian 1:6.6.47-1+rpt1
[ 391.091075] Hardware name: Raspberry Pi 4 Model B Rev 1.1 (DT)
[ 391.091079] pstate: 40000005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 391.091085] pc : free_large_kmalloc+0x78/0xb8
[ 391.091092] lr : kfree+0x134/0x140
[ 391.091098] sp : ffffffc0800fbb60
[ 391.091101] x29: ffffffc0800fbb60 x28: ffffffe1ace63b48 x27: ffffffc0800fbcf8
[ 391.091111] x26: ffffffc0800fbcf8 x25: ffffff804174b400 x24: 0000000000000049
[ 391.091121] x23: ffffffc0800fbcf8 x22: ffffffe1acf050a0 x21: ffffffe1acf04a84
[ 391.091130] x20: ffffff8044873820 x19: fffffffe010a8c40 x18: 0000000000000000
[ 391.091140] x17: 0000000000000000 x16: ffffffe2126d4760 x15: 00000000ffeaffc0
[ 391.091149] x14: 0000000000000004 x13: ffffff8044873808 x12: 0000000000000000
[ 391.091158] x11: ffffff804b1b1de8 x10: ffffff804b1b1da8 x9 : ffffffe2126d4894
[ 391.091167] x8 : ffffff804b1b1dd0 x7 : 0000000000000000 x6 : 0000000000000228
[ 391.091175] x5 : 0000000000000000 x4 : 0000000000000000 x3 : fffffffe010a8c40
[ 391.091184] x2 : 0000000000000001 x1 : ffffff8042a31498 x0 : 4000000000000000
[ 391.091193] Call trace:
[ 391.091196] free_large_kmalloc+0x78/0xb8
[ 391.091203] kfree+0x134/0x140
[ 391.091209] v3d_perfmon_put.part.0+0x64/0x90 [v3d]
[ 391.091237] v3d_perfmon_destroy_ioctl+0x54/0x80 [v3d]
[ 391.091254] drm_ioctl_kernel+0xd8/0x190 [drm]
[ 391.091378] drm_ioctl+0x220/0x4c0 [drm]
[ 391.091462] drm_compat_ioctl+0x118/0x140 [drm]
[ 391.091546] __arm64_compat_sys_ioctl+0x158/0x180
[ 391.091558] invoke_syscall+0x50/0x128
[ 391.091567] el0_svc_common.constprop.0+0x48/0xf0
[ 391.091574] do_el0_svc_compat+0x24/0x48
[ 391.091581] el0_svc_compat+0x30/0x88
[ 391.091591] el0t_32_sync_handler+0x98/0x140
[ 391.091595] el0t_32_sync+0x194/0x198
[ 391.091601] ---[ end trace 0000000000000000 ]---
[ 391.091950] object pointer: 0x000000007ee5e213
Statistics: Posted by provod — Wed Sep 25, 2024 11:10 pm