
HDD I/O freezes, system keeps running

Posted: Sun 19 Jul 2020, 10:36
by PichlAlex
Hello everyone,

I currently have a problem on several virtual machines and one physical machine running Ubuntu 18.04 LTS that is proving a bit of a challenge:

Symptom:
At irregular intervals, but always within less than 24 h, 2 virtual machines and one physical machine freeze. All are installed identically and have the same configuration and the same kernel.
The virtual machines run on 2 different ESXi hosts.
--> i.e. I can rule out a hardware problem for now, especially since it appeared on all of them practically simultaneously (i.e. within a few days).

In "top" the system load keeps climbing, but there is no load on the CPU (I think it is waiting on I/O here, which "clutters up" the load):
top - 10:16:37 up 1 day, 23:33, 2 users, load average: 27.00, 27.00, 26.94
Tasks: 286 total, 1 running, 214 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.3 us, 1.4 sy, 0.1 ni, 76.9 id, 21.0 wa, 0.0 hi, 0.4 si, 0.0 st
KiB Mem : 4039312 total, 129500 free, 731164 used, 3178648 buff/cache
KiB Swap: 7811068 total, 7724028 free, 87040 used. 3028580 avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1 root 20 0 159712 7112 5488 S 0.0 0.2 0:17.41 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.06 kthreadd
4 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 kworker/0:0H
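
A load that keeps rising while the CPU stays mostly idle fits tasks stuck in uninterruptible sleep (state "D"), which Linux counts toward the load average. The following is a minimal Python sketch for listing such tasks, assuming the output format of `ps -eo pid,stat,comm`; the function names `parse_blocked` and `list_blocked_now` are hypothetical helpers, not part of any tool mentioned in this thread.

```python
import subprocess


def parse_blocked(ps_output):
    """Return (pid, command) pairs for tasks whose STAT field starts with 'D'."""
    blocked = []
    for line in ps_output.splitlines()[1:]:  # skip the header line
        parts = line.split(None, 2)          # command names may contain spaces
        if len(parts) < 3:
            continue
        pid, stat, comm = parts
        if stat.startswith("D"):             # D = uninterruptible sleep
            blocked.append((int(pid), comm.strip()))
    return blocked


def list_blocked_now():
    """Run ps on the local machine and return the currently blocked tasks."""
    result = subprocess.run(
        ["ps", "-eo", "pid,stat,comm"],
        capture_output=True, text=True, check=True,
    )
    return parse_blocked(result.stdout)
```

Calling `list_blocked_now()` while the load is climbing would show which processes are waiting on I/O, before the kernel logs them as hung after 120 seconds.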
The syslog shows kernel hung-task reports (call traces rather than an actual kernel panic):
Jul 18 10:08:24 vmPlex kernel: [84342.412595] INFO: task kcompactd0:40 blocked for more than 120 seconds.
Jul 18 10:08:24 vmPlex kernel: [84342.412871] Not tainted 4.15.0-111-generic #112-Ubuntu
Jul 18 10:08:24 vmPlex kernel: [84342.413013] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 18 10:08:24 vmPlex kernel: [84342.413170] kcompactd0 D 0 40 2 0x80000000
Jul 18 10:08:24 vmPlex kernel: [84342.413172] Call Trace:
Jul 18 10:08:24 vmPlex kernel: [84342.413193] __schedule+0x24e/0x880
Jul 18 10:08:24 vmPlex kernel: [84342.413195] ? __switch_to_asm+0x35/0x70
Jul 18 10:08:24 vmPlex kernel: [84342.413196] ? __switch_to_asm+0x41/0x70
Jul 18 10:08:24 vmPlex kernel: [84342.413197] schedule+0x2c/0x80
Jul 18 10:08:24 vmPlex kernel: [84342.413198] io_schedule+0x16/0x40
Jul 18 10:08:24 vmPlex kernel: [84342.413204] __lock_page+0xff/0x140
Jul 18 10:08:24 vmPlex kernel: [84342.413207] ? page_cache_tree_insert+0xe0/0xe0
Jul 18 10:08:24 vmPlex kernel: [84342.413210] migrate_pages+0x91f/0xb80
Jul 18 10:08:24 vmPlex kernel: [84342.413212] ? __ClearPageMovable+0x10/0x10
Jul 18 10:08:24 vmPlex kernel: [84342.413213] ? isolate_freepages_block+0x3b0/0x3b0
Jul 18 10:08:24 vmPlex kernel: [84342.413214] compact_zone+0x681/0x950
Jul 18 10:08:24 vmPlex kernel: [84342.413215] kcompactd_do_work+0xfe/0x2a0
Jul 18 10:08:24 vmPlex kernel: [84342.413216] ? __switch_to_asm+0x35/0x70
Jul 18 10:08:24 vmPlex kernel: [84342.413217] ? __switch_to_asm+0x41/0x70
Jul 18 10:08:24 vmPlex kernel: [84342.413219] ? __switch_to_asm+0x35/0x70
Jul 18 10:08:24 vmPlex kernel: [84342.413220] kcompactd+0x86/0x1c0
Jul 18 10:08:24 vmPlex kernel: [84342.413220] ? kcompactd+0x86/0x1c0
Jul 18 10:08:24 vmPlex kernel: [84342.413226] ? wait_woken+0x80/0x80
Jul 18 10:08:24 vmPlex kernel: [84342.413228] kthread+0x121/0x140
Jul 18 10:08:24 vmPlex kernel: [84342.413229] ? kcompactd_do_work+0x2a0/0x2a0
Jul 18 10:08:24 vmPlex kernel: [84342.413229] ? kthread_create_worker_on_cpu+0x70/0x70
Jul 18 10:08:24 vmPlex kernel: [84342.413231] ret_from_fork+0x35/0x40
Jul 18 10:10:25 vmPlex kernel: [84463.245932] INFO: task kcompactd0:40 blocked for more than 120 seconds.
Jul 18 10:10:25 vmPlex kernel: [84463.246215] Not tainted 4.15.0-111-generic #112-Ubuntu
Jul 18 10:10:25 vmPlex kernel: [84463.246389] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 18 10:10:25 vmPlex kernel: [84463.246568] kcompactd0 D 0 40 2 0x80000000
Jul 18 10:10:25 vmPlex kernel: [84463.246570] Call Trace:
Jul 18 10:10:25 vmPlex kernel: [84463.246579] __schedule+0x24e/0x880
Jul 18 10:10:25 vmPlex kernel: [84463.246581] ? __switch_to_asm+0x35/0x70
Jul 18 10:10:25 vmPlex kernel: [84463.246582] ? __switch_to_asm+0x41/0x70
Jul 18 10:10:25 vmPlex kernel: [84463.246583] schedule+0x2c/0x80
Jul 18 10:10:25 vmPlex kernel: [84463.246584] io_schedule+0x16/0x40
Jul 18 10:10:25 vmPlex kernel: [84463.246586] __lock_page+0xff/0x140
Jul 18 10:10:25 vmPlex kernel: [84463.246588] ? page_cache_tree_insert+0xe0/0xe0
Jul 18 10:10:25 vmPlex kernel: [84463.246591] migrate_pages+0x91f/0xb80
Jul 18 10:10:25 vmPlex kernel: [84463.246592] ? __ClearPageMovable+0x10/0x10
Jul 18 10:10:25 vmPlex kernel: [84463.246594] ? isolate_freepages_block+0x3b0/0x3b0
Jul 18 10:10:25 vmPlex kernel: [84463.246595] compact_zone+0x681/0x950
Jul 18 10:10:25 vmPlex kernel: [84463.246596] kcompactd_do_work+0xfe/0x2a0
Jul 18 10:10:25 vmPlex kernel: [84463.246597] ? __switch_to_asm+0x35/0x70
Jul 18 10:10:25 vmPlex kernel: [84463.246598] ? __switch_to_asm+0x41/0x70
Jul 18 10:10:25 vmPlex kernel: [84463.246599] ? __switch_to_asm+0x35/0x70
Jul 18 10:10:25 vmPlex kernel: [84463.246600] kcompactd+0x86/0x1c0
Jul 18 10:10:25 vmPlex kernel: [84463.246601] ? kcompactd+0x86/0x1c0
Jul 18 10:10:25 vmPlex kernel: [84463.246604] ? wait_woken+0x80/0x80
Jul 18 10:10:25 vmPlex kernel: [84463.246606] kthread+0x121/0x140
Jul 18 10:10:25 vmPlex kernel: [84463.246607] ? kcompactd_do_work+0x2a0/0x2a0
Jul 18 10:10:25 vmPlex kernel: [84463.246608] ? kthread_create_worker_on_cpu+0x70/0x70
Jul 18 10:10:25 vmPlex kernel: [84463.246609] ret_from_fork+0x35/0x40
Jul 18 10:10:25 vmPlex kernel: [84463.246629] INFO: task btrfs-transacti:259 blocked for more than 120 seconds.
Jul 18 10:10:25 vmPlex kernel: [84463.246816] Not tainted 4.15.0-111-generic #112-Ubuntu
Jul 18 10:10:25 vmPlex kernel: [84463.247038] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 18 10:10:25 vmPlex kernel: [84463.247392] btrfs-transacti D 0 259 2 0x80000000
Jul 18 10:10:25 vmPlex kernel: [84463.247393] Call Trace:
Jul 18 10:10:25 vmPlex kernel: [84463.247396] __schedule+0x24e/0x880
Jul 18 10:10:25 vmPlex kernel: [84463.247397] ? bit_wait+0x60/0x60
Jul 18 10:10:25 vmPlex kernel: [84463.247398] schedule+0x2c/0x80
Jul 18 10:10:25 vmPlex kernel: [84463.247399] io_schedule+0x16/0x40
Jul 18 10:10:25 vmPlex kernel: [84463.247400] bit_wait_io+0x11/0x60
Jul 18 10:10:25 vmPlex kernel: [84463.247401] __wait_on_bit+0x4c/0x90
Jul 18 10:10:25 vmPlex kernel: [84463.247402] ? bit_wait+0x60/0x60
Jul 18 10:10:25 vmPlex kernel: [84463.247403] out_of_line_wait_on_bit+0x90/0xb0
Jul 18 10:10:25 vmPlex kernel: [84463.247405] ? bit_waitqueue+0x40/0x40
Jul 18 10:10:25 vmPlex kernel: [84463.247445] lock_extent_buffer_for_io+0x100/0x2a0 [btrfs]
Jul 18 10:10:25 vmPlex kernel: [84463.247459] btree_write_cache_pages+0x1b8/0x420 [btrfs]
Jul 18 10:10:25 vmPlex kernel: [84463.247472] btree_writepages+0x5d/0x70 [btrfs]
Jul 18 10:10:25 vmPlex kernel: [84463.247473] do_writepages+0x4b/0xe0
Jul 18 10:10:25 vmPlex kernel: [84463.247483] ? btrfs_free_path.part.32+0x21/0x30 [btrfs]
Jul 18 10:10:25 vmPlex kernel: [84463.247496] ? btrfs_select_ref_head+0xf4/0x120 [btrfs]
Jul 18 10:10:25 vmPlex kernel: [84463.247508] ? merge_state.part.47+0x44/0x130 [btrfs]
Jul 18 10:10:25 vmPlex kernel: [84463.247509] __filemap_fdatawrite_range+0xcf/0x100
Jul 18 10:10:25 vmPlex kernel: [84463.247510] ? __filemap_fdatawrite_range+0xcf/0x100
Jul 18 10:10:25 vmPlex kernel: [84463.247511] filemap_fdatawrite_range+0x13/0x20
Jul 18 10:10:25 vmPlex kernel: [84463.247523] btrfs_write_marked_extents+0x68/0x140 [btrfs]
Jul 18 10:10:25 vmPlex kernel: [84463.247533] btrfs_write_and_wait_marked_extents.constprop.20+0x4f/0x90 [btrfs]
Jul 18 10:10:25 vmPlex kernel: [84463.247561] btrfs_commit_transaction+0x696/0x910 [btrfs]
Jul 18 10:10:25 vmPlex kernel: [84463.247573] ? btrfs_commit_transaction+0x696/0x910 [btrfs]
Jul 18 10:10:25 vmPlex kernel: [84463.247582] ? start_transaction+0x191/0x430 [btrfs]
Jul 18 10:10:25 vmPlex kernel: [84463.247592] transaction_kthread+0x18d/0x1b0 [btrfs]
Jul 18 10:10:25 vmPlex kernel: [84463.247593] kthread+0x121/0x140
Jul 18 10:10:25 vmPlex kernel: [84463.247603] ? btrfs_cleanup_transaction+0x570/0x570 [btrfs]
Jul 18 10:10:25 vmPlex kernel: [84463.247604] ? kthread_create_worker_on_cpu+0x70/0x70
Jul 18 10:10:25 vmPlex kernel: [84463.247606] ret_from_fork+0x35/0x40
Jul 18 10:10:25 vmPlex kernel: [84463.247608] INFO: task systemd-journal:294 blocked for more than 120 seconds.
Jul 18 10:10:25 vmPlex kernel: [84463.247979] Not tainted 4.15.0-111-generic #112-Ubuntu
Jul 18 10:10:25 vmPlex kernel: [84463.248358] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 18 10:10:25 vmPlex kernel: [84463.248760] systemd-journal D 0 294 1 0x00000120
Jul 18 10:10:25 vmPlex kernel: [84463.248762] Call Trace:
Jul 18 10:10:25 vmPlex kernel: [84463.248765] __schedule+0x24e/0x880
Jul 18 10:10:25 vmPlex kernel: [84463.248766] schedule+0x2c/0x80
Jul 18 10:10:25 vmPlex kernel: [84463.248767] schedule_preempt_disabled+0xe/0x10
Jul 18 10:10:25 vmPlex kernel: [84463.248768] __mutex_lock.isra.5+0x276/0x4e0
Jul 18 10:10:25 vmPlex kernel: [84463.248770] ? __switch_to_asm+0x41/0x70
Jul 18 10:10:25 vmPlex kernel: [84463.248771] __mutex_lock_slowpath+0x13/0x20
Jul 18 10:10:25 vmPlex kernel: [84463.248773] ? __mutex_lock_slowpath+0x13/0x20
Jul 18 10:10:25 vmPlex kernel: [84463.248774] mutex_lock+0x2f/0x40
Jul 18 10:10:25 vmPlex kernel: [84463.248790] btrfs_log_inode_parent+0x17a/0xa80 [btrfs]
Jul 18 10:10:25 vmPlex kernel: [84463.248791] ? __schedule+0x256/0x880
Jul 18 10:10:25 vmPlex kernel: [84463.248803] ? wait_current_trans+0x33/0x110 [btrfs]
Jul 18 10:10:25 vmPlex kernel: [84463.248804] ? _cond_resched+0x19/0x40
Jul 18 10:10:25 vmPlex kernel: [84463.248814] ? join_transaction+0x27/0x420 [btrfs]
Jul 18 10:10:25 vmPlex kernel: [84463.248826] btrfs_log_dentry_safe+0x60/0x80 [btrfs]
Jul 18 10:10:25 vmPlex kernel: [84463.248838] btrfs_sync_file+0x375/0x530 [btrfs]
Jul 18 10:10:25 vmPlex kernel: [84463.248849] vfs_fsync_range+0x51/0xb0
Jul 18 10:10:25 vmPlex kernel: [84463.248851] do_fsync+0x3d/0x70
Jul 18 10:10:25 vmPlex kernel: [84463.248852] SyS_fsync+0x10/0x20
Jul 18 10:10:25 vmPlex kernel: [84463.248855] do_syscall_64+0x73/0x130
Jul 18 10:10:25 vmPlex kernel: [84463.248857] entry_SYSCALL_64_after_hwframe+0x41/0xa6
Jul 18 10:10:25 vmPlex kernel: [84463.248859] RIP: 0033:0x7f11f2cb5337
Jul 18 10:10:25 vmPlex kernel: [84463.248859] RSP: 002b:00007ffd5d0e52b0 EFLAGS: 00000293 ORIG_RAX: 000000000000004a
Jul 18 10:10:25 vmPlex kernel: [84463.248861] RAX: ffffffffffffffda RBX: 0000000000000016 RCX: 00007f11f2cb5337
Jul 18 10:10:25 vmPlex kernel: [84463.248862] RDX: 0000000000000000 RSI: 00005555bc028c90 RDI: 0000000000000016
Jul 18 10:10:25 vmPlex kernel: [84463.248862] RBP: 0000000000000001 R08: 00000000000000ca R09: 00007f11ee6be9d0
Jul 18 10:10:25 vmPlex kernel: [84463.248863] R10: 0000000000000000 R11: 0000000000000293 R12: 00007ffd5d0e5410
Jul 18 10:10:25 vmPlex kernel: [84463.248863] R13: 00007ffd5d0e5408 R14: 00005555bc035840 R15: 00007ffd5d0e5648
Jul 18 10:10:25 vmPlex kernel: [84463.248881] INFO: task Plex Media Serv:22366 blocked for more than 120 seconds.
Jul 18 10:10:25 vmPlex kernel: [84463.249304] Not tainted 4.15.0-111-generic #112-Ubuntu
Jul 18 10:10:25 vmPlex kernel: [84463.249739] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 18 10:10:25 vmPlex kernel: [84463.250236] Plex Media Serv D 0 22366 1 0x00000000
Jul 18 10:10:25 vmPlex kernel: [84463.250242] Call Trace:
Jul 18 10:10:25 vmPlex kernel: [84463.250256] __schedule+0x24e/0x880
Jul 18 10:10:25 vmPlex kernel: [84463.250270] schedule+0x2c/0x80
Jul 18 10:10:25 vmPlex kernel: [84463.250271] schedule_preempt_disabled+0xe/0x10
Jul 18 10:10:25 vmPlex kernel: [84463.250273] __mutex_lock.isra.5+0x276/0x4e0
Jul 18 10:10:25 vmPlex kernel: [84463.250274] ? __switch_to_asm+0x41/0x70
Jul 18 10:10:25 vmPlex kernel: [84463.250275] ? __switch_to_asm+0x35/0x70
Jul 18 10:10:25 vmPlex kernel: [84463.250276] __mutex_lock_slowpath+0x13/0x20
Jul 18 10:10:25 vmPlex kernel: [84463.250277] ? __mutex_lock_slowpath+0x13/0x20
Jul 18 10:10:25 vmPlex kernel: [84463.250278] mutex_lock+0x2f/0x40
Jul 18 10:10:25 vmPlex kernel: [84463.250294] btrfs_log_inode_parent+0x412/0xa80 [btrfs]
Jul 18 10:10:25 vmPlex kernel: [84463.250302] ? __switch_to_asm+0x41/0x70
Jul 18 10:10:25 vmPlex kernel: [84463.250315] ? wait_current_trans+0x33/0x110 [btrfs]
Jul 18 10:10:25 vmPlex kernel: [84463.250317] ? _cond_resched+0x19/0x40
Jul 18 10:10:25 vmPlex kernel: [84463.250328] ? join_transaction+0x27/0x420 [btrfs]
Jul 18 10:10:25 vmPlex kernel: [84463.250341] btrfs_log_dentry_safe+0x60/0x80 [btrfs]
Jul 18 10:10:25 vmPlex kernel: [84463.250354] btrfs_sync_file+0x375/0x530 [btrfs]
Jul 18 10:10:25 vmPlex kernel: [84463.250356] vfs_fsync_range+0x51/0xb0
Jul 18 10:10:25 vmPlex kernel: [84463.250358] do_fsync+0x3d/0x70
Jul 18 10:10:25 vmPlex kernel: [84463.250359] SyS_fsync+0x10/0x20
Jul 18 10:10:25 vmPlex kernel: [84463.250361] do_syscall_64+0x73/0x130
Jul 18 10:10:25 vmPlex kernel: [84463.250363] entry_SYSCALL_64_after_hwframe+0x41/0xa6
Jul 18 10:10:25 vmPlex kernel: [84463.250364] RIP: 0033:0x7fb57804bb17
Jul 18 10:10:25 vmPlex kernel: [84463.250364] RSP: 002b:00007fb568b08050 EFLAGS: 00000293 ORIG_RAX: 000000000000004a
Jul 18 10:10:25 vmPlex kernel: [84463.250366] RAX: ffffffffffffffda RBX: 000000000000000b RCX: 00007fb57804bb17
Jul 18 10:10:25 vmPlex kernel: [84463.250366] RDX: 0000000000000000 RSI: 0000000000000002 RDI: 000000000000000b
Jul 18 10:10:25 vmPlex kernel: [84463.250367] RBP: 000000000231b0e8 R08: 0000000000000000 R09: 000000000006d653
Jul 18 10:10:25 vmPlex kernel: [84463.250368] R10: 00007fb56001b200 R11: 0000000000000293 R12: 0000000000000000
Jul 18 10:10:25 vmPlex kernel: [84463.250368] R13: 0000000002311ea8 R14: 0000000000000002 R15: 0000000000000000
Jul 18 10:10:25 vmPlex kernel: [84463.250382] INFO: task Plex Media Serv:23434 blocked for more than 120 seconds.
Jul 18 10:10:25 vmPlex kernel: [84463.250870] Not tainted 4.15.0-111-generic #112-Ubuntu
Jul 18 10:10:25 vmPlex kernel: [84463.251358] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 18 10:10:25 vmPlex kernel: [84463.251873] Plex Media Serv D 0 23434 1 0x00000000
Jul 18 10:10:25 vmPlex kernel: [84463.251874] Call Trace:
Jul 18 10:10:25 vmPlex kernel: [84463.251877] __schedule+0x24e/0x880
Jul 18 10:10:25 vmPlex kernel: [84463.251879] schedule+0x2c/0x80
Jul 18 10:10:25 vmPlex kernel: [84463.251880] schedule_preempt_disabled+0xe/0x10
Jul 18 10:10:25 vmPlex kernel: [84463.251882] __mutex_lock.isra.5+0x276/0x4e0
Jul 18 10:10:25 vmPlex kernel: [84463.251898] ? join_transaction+0x27/0x420 [btrfs]
Jul 18 10:10:25 vmPlex kernel: [84463.251899] __mutex_lock_slowpath+0x13/0x20
Jul 18 10:10:25 vmPlex kernel: [84463.251900] ? __mutex_lock_slowpath+0x13/0x20
Jul 18 10:10:25 vmPlex kernel: [84463.251902] mutex_lock+0x2f/0x40
Jul 18 10:10:25 vmPlex kernel: [84463.251916] btrfs_pin_log_trans+0x1e/0x40 [btrfs]
Jul 18 10:10:25 vmPlex kernel: [84463.251929] btrfs_rename+0x24c/0xdf0 [btrfs]
Jul 18 10:10:25 vmPlex kernel: [84463.251941] btrfs_rename2+0x1d/0x30 [btrfs]
Jul 18 10:10:25 vmPlex kernel: [84463.251949] vfs_rename+0x46e/0x960
Jul 18 10:10:25 vmPlex kernel: [84463.251951] SyS_rename+0x362/0x3c0
Jul 18 10:10:25 vmPlex kernel: [84463.251954] do_syscall_64+0x73/0x130
Jul 18 10:10:25 vmPlex kernel: [84463.251955] entry_SYSCALL_64_after_hwframe+0x41/0xa6
Jul 18 10:10:25 vmPlex kernel: [84463.251956] RIP: 0033:0x7fb5731aeda7
Jul 18 10:10:25 vmPlex kernel: [84463.251957] RSP: 002b:00007fb510ff86a8 EFLAGS: 00000202 ORIG_RAX: 0000000000000052
Jul 18 10:10:25 vmPlex kernel: [84463.251958] RAX: ffffffffffffffda RBX: 00007fb510ff86d0 RCX: 00007fb5731aeda7
Jul 18 10:10:25 vmPlex kernel: [84463.251959] RDX: 00007fb510ff86f0 RSI: 00007fb50c9f92b0 RDI: 00007fb564001b20
Jul 18 10:10:25 vmPlex kernel: [84463.251959] RBP: 00007fb510ff8710 R08: 0000000000000000 R09: 0000000000000000
Jul 18 10:10:25 vmPlex kernel: [84463.251960] R10: 00007fb50c03b2e0 R11: 0000000000000202 R12: 00007fb510ff86e0
Jul 18 10:10:25 vmPlex kernel: [84463.251961] R13: 00007fb510ff8738 R14: 00007fb510ff86f0 R15: 00007fb510ff8700
Jul 18 10:12:26 vmPlex kernel: [84584.099333] INFO: task kcompactd0:40 blocked for more than 120 seconds.
Jul 18 10:12:26 vmPlex kernel: [84584.099933] Not tainted 4.15.0-111-generic #112-Ubuntu
Jul 18 10:12:26 vmPlex kernel: [84584.100256] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 18 10:12:26 vmPlex kernel: [84584.111753] kcompactd0 D 0 40 2 0x80000000
Jul 18 10:12:26 vmPlex kernel: [84584.111755] Call Trace:
Jul 18 10:12:26 vmPlex kernel: [84584.111765] __schedule+0x24e/0x880
Jul 18 10:12:26 vmPlex kernel: [84584.111767] ? __switch_to_asm+0x35/0x70
Jul 18 10:12:26 vmPlex kernel: [84584.111768] ? __switch_to_asm+0x41/0x70
Jul 18 10:12:26 vmPlex kernel: [84584.111769] schedule+0x2c/0x80
Jul 18 10:12:26 vmPlex kernel: [84584.111770] io_schedule+0x16/0x40
Jul 18 10:12:26 vmPlex kernel: [84584.111772] __lock_page+0xff/0x140
Jul 18 10:12:26 vmPlex kernel: [84584.111779] ? page_cache_tree_insert+0xe0/0xe0
Jul 18 10:12:26 vmPlex kernel: [84584.111782] migrate_pages+0x91f/0xb80
Jul 18 10:12:26 vmPlex kernel: [84584.111784] ? __ClearPageMovable+0x10/0x10
Jul 18 10:12:26 vmPlex kernel: [84584.111785] ? isolate_freepages_block+0x3b0/0x3b0
Jul 18 10:12:26 vmPlex kernel: [84584.111786] compact_zone+0x681/0x950
Jul 18 10:12:26 vmPlex kernel: [84584.111788] kcompactd_do_work+0xfe/0x2a0
Jul 18 10:12:26 vmPlex kernel: [84584.111789] ? __switch_to_asm+0x35/0x70
Jul 18 10:12:26 vmPlex kernel: [84584.111790] ? __switch_to_asm+0x41/0x70
Jul 18 10:12:26 vmPlex kernel: [84584.111791] ? __switch_to_asm+0x35/0x70
Jul 18 10:12:26 vmPlex kernel: [84584.111792] kcompactd+0x86/0x1c0
Jul 18 10:12:26 vmPlex kernel: [84584.111792] ? kcompactd+0x86/0x1c0
Jul 18 10:12:26 vmPlex kernel: [84584.111796] ? wait_woken+0x80/0x80
Jul 18 10:12:26 vmPlex kernel: [84584.111798] kthread+0x121/0x140
Jul 18 10:12:26 vmPlex kernel: [84584.111799] ? kcompactd_do_work+0x2a0/0x2a0
Jul 18 10:12:26 vmPlex kernel: [84584.111800] ? kthread_create_worker_on_cpu+0x70/0x70
Jul 18 10:12:26 vmPlex kernel: [84584.111801] ret_from_fork+0x35/0x40
Jul 18 10:12:26 vmPlex kernel: [84584.111823] INFO: task btrfs-transacti:259 blocked for more than 120 seconds.
Jul 18 10:12:26 vmPlex kernel: [84584.112201] Not tainted 4.15.0-111-generic #112-Ubuntu
Jul 18 10:12:26 vmPlex kernel: [84584.112524] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 18 10:12:26 vmPlex kernel: [84584.112859] btrfs-transacti D 0 259 2 0x80000000
Jul 18 10:12:26 vmPlex kernel: [84584.112860] Call Trace:
Jul 18 10:12:26 vmPlex kernel: [84584.112862] __schedule+0x24e/0x880
Jul 18 10:12:26 vmPlex kernel: [84584.112863] ? bit_wait+0x60/0x60
Jul 18 10:12:26 vmPlex kernel: [84584.112864] schedule+0x2c/0x80
Jul 18 10:12:26 vmPlex kernel: [84584.112865] io_schedule+0x16/0x40
Jul 18 10:12:26 vmPlex kernel: [84584.112866] bit_wait_io+0x11/0x60
Jul 18 10:12:26 vmPlex kernel: [84584.112867] __wait_on_bit+0x4c/0x90
Jul 18 10:12:26 vmPlex kernel: [84584.112868] ? bit_wait+0x60/0x60
Jul 18 10:12:26 vmPlex kernel: [84584.112868] out_of_line_wait_on_bit+0x90/0xb0
Jul 18 10:12:26 vmPlex kernel: [84584.112870] ? bit_waitqueue+0x40/0x40
Jul 18 10:12:26 vmPlex kernel: [84584.112909] lock_extent_buffer_for_io+0x100/0x2a0 [btrfs]
Jul 18 10:12:26 vmPlex kernel: [84584.112920] btree_write_cache_pages+0x1b8/0x420 [btrfs]
Jul 18 10:12:26 vmPlex kernel: [84584.112930] btree_writepages+0x5d/0x70 [btrfs]
Jul 18 10:12:26 vmPlex kernel: [84584.112931] do_writepages+0x4b/0xe0
Jul 18 10:12:26 vmPlex kernel: [84584.112937] ? btrfs_free_path.part.32+0x21/0x30 [btrfs]
Jul 18 10:12:26 vmPlex kernel: [84584.112947] ? btrfs_select_ref_head+0xf4/0x120 [btrfs]
Jul 18 10:12:26 vmPlex kernel: [84584.112957] ? merge_state.part.47+0x44/0x130 [btrfs]
Jul 18 10:12:26 vmPlex kernel: [84584.112958] __filemap_fdatawrite_range+0xcf/0x100
Jul 18 10:12:26 vmPlex kernel: [84584.112959] ? __filemap_fdatawrite_range+0xcf/0x100
Jul 18 10:12:26 vmPlex kernel: [84584.112969] filemap_fdatawrite_range+0x13/0x20
Jul 18 10:12:26 vmPlex kernel: [84584.112979] btrfs_write_marked_extents+0x68/0x140 [btrfs]
Jul 18 10:12:26 vmPlex kernel: [84584.112987] btrfs_write_and_wait_marked_extents.constprop.20+0x4f/0x90 [btrfs]
Jul 18 10:12:26 vmPlex kernel: [84584.112995] btrfs_commit_transaction+0x696/0x910 [btrfs]
Jul 18 10:12:26 vmPlex kernel: [84584.113001] ? btrfs_commit_transaction+0x696/0x910 [btrfs]
Jul 18 10:12:26 vmPlex kernel: [84584.113008] ? start_transaction+0x191/0x430 [btrfs]
Jul 18 10:12:26 vmPlex kernel: [84584.113015] transaction_kthread+0x18d/0x1b0 [btrfs]
Jul 18 10:12:26 vmPlex kernel: [84584.113016] kthread+0x121/0x140
Jul 18 10:12:26 vmPlex kernel: [84584.113023] ? btrfs_cleanup_transaction+0x570/0x570 [btrfs]
Jul 18 10:12:26 vmPlex kernel: [84584.113024] ? kthread_create_worker_on_cpu+0x70/0x70
Jul 18 10:12:26 vmPlex kernel: [84584.113025] ret_from_fork+0x35/0x40
Jul 18 10:12:26 vmPlex kernel: [84584.113027] INFO: task systemd-journal:294 blocked for more than 120 seconds.
Jul 18 10:12:26 vmPlex kernel: [84584.113371] Not tainted 4.15.0-111-generic #112-Ubuntu
Jul 18 10:12:26 vmPlex kernel: [84584.113718] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 18 10:12:26 vmPlex kernel: [84584.114078] systemd-journal D 0 294 1 0x00000120
Jul 18 10:12:26 vmPlex kernel: [84584.114079] Call Trace:
Jul 18 10:12:26 vmPlex kernel: [84584.114081] __schedule+0x24e/0x880
Jul 18 10:12:26 vmPlex kernel: [84584.114082] schedule+0x2c/0x80
Jul 18 10:12:26 vmPlex kernel: [84584.114083] schedule_preempt_disabled+0xe/0x10
Jul 18 10:12:26 vmPlex kernel: [84584.114084] __mutex_lock.isra.5+0x276/0x4e0
Jul 18 10:12:26 vmPlex kernel: [84584.114085] ? __switch_to_asm+0x41/0x70
Jul 18 10:12:26 vmPlex kernel: [84584.114086] __mutex_lock_slowpath+0x13/0x20
Jul 18 10:12:26 vmPlex kernel: [84584.114087] ? __mutex_lock_slowpath+0x13/0x20
Jul 18 10:12:26 vmPlex kernel: [84584.114088] mutex_lock+0x2f/0x40
Jul 18 10:12:26 vmPlex kernel: [84584.114100] btrfs_log_inode_parent+0x17a/0xa80 [btrfs]
Jul 18 10:12:26 vmPlex kernel: [84584.114101] ? __schedule+0x256/0x880
Jul 18 10:12:26 vmPlex kernel: [84584.114109] ? wait_current_trans+0x33/0x110 [btrfs]
Jul 18 10:12:26 vmPlex kernel: [84584.114110] ? _cond_resched+0x19/0x40
Jul 18 10:12:26 vmPlex kernel: [84584.114116] ? join_transaction+0x27/0x420 [btrfs]
Jul 18 10:12:26 vmPlex kernel: [84584.114125] btrfs_log_dentry_safe+0x60/0x80 [btrfs]
Jul 18 10:12:26 vmPlex kernel: [84584.114134] btrfs_sync_file+0x375/0x530 [btrfs]
Jul 18 10:12:26 vmPlex kernel: [84584.114137] vfs_fsync_range+0x51/0xb0
Jul 18 10:12:26 vmPlex kernel: [84584.114138] do_fsync+0x3d/0x70
Jul 18 10:12:26 vmPlex kernel: [84584.114140] SyS_fsync+0x10/0x20
Jul 18 10:12:26 vmPlex kernel: [84584.114141] do_syscall_64+0x73/0x130
Jul 18 10:12:26 vmPlex kernel: [84584.114143] entry_SYSCALL_64_after_hwframe+0x41/0xa6
Jul 18 10:12:26 vmPlex kernel: [84584.114144] RIP: 0033:0x7f11f2cb5337
Jul 18 10:12:26 vmPlex kernel: [84584.114145] RSP: 002b:00007ffd5d0e52b0 EFLAGS: 00000293 ORIG_RAX: 000000000000004a
Jul 18 10:12:26 vmPlex kernel: [84584.114146] RAX: ffffffffffffffda RBX: 0000000000000016 RCX: 00007f11f2cb5337
Jul 18 10:12:26 vmPlex kernel: [84584.114146] RDX: 0000000000000000 RSI: 00005555bc028c90 RDI: 0000000000000016
Jul 18 10:12:26 vmPlex kernel: [84584.114147] RBP: 0000000000000001 R08: 00000000000000ca R09: 00007f11ee6be9d0
Jul 18 10:12:26 vmPlex kernel: [84584.114147] R10: 0000000000000000 R11: 0000000000000293 R12: 00007ffd5d0e5410
Jul 18 10:12:26 vmPlex kernel: [84584.114147] R13: 00007ffd5d0e5408 R14: 00005555bc035840 R15: 00007ffd5d0e5648
Jul 18 10:12:26 vmPlex kernel: [84584.114160] INFO: task Plex Media Serv:22366 blocked for more than 120 seconds.
Jul 18 10:12:26 vmPlex kernel: [84584.114533] Not tainted 4.15.0-111-generic #112-Ubuntu
Jul 18 10:12:26 vmPlex kernel: [84584.114931] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 18 10:12:26 vmPlex kernel: [84584.115334] Plex Media Serv D 0 22366 1 0x00000000
Jul 18 10:12:26 vmPlex kernel: [84584.115335] Call Trace:
Jul 18 10:12:26 vmPlex kernel: [84584.115337] __schedule+0x24e/0x880
Jul 18 10:12:26 vmPlex kernel: [84584.115340] schedule+0x2c/0x80
Jul 18 10:12:26 vmPlex kernel: [84584.115342] schedule_preempt_disabled+0xe/0x10
Jul 18 10:12:26 vmPlex kernel: [84584.115344] __mutex_lock.isra.5+0x276/0x4e0
Jul 18 10:12:26 vmPlex kernel: [84584.115346] ? __switch_to_asm+0x41/0x70
Jul 18 10:12:26 vmPlex kernel: [84584.115347] ? __switch_to_asm+0x35/0x70
Jul 18 10:12:26 vmPlex kernel: [84584.115349] __mutex_lock_slowpath+0x13/0x20
Jul 18 10:12:26 vmPlex kernel: [84584.115350] ? __mutex_lock_slowpath+0x13/0x20
Jul 18 10:12:26 vmPlex kernel: [84584.115352] mutex_lock+0x2f/0x40
Jul 18 10:12:26 vmPlex kernel: [84584.115362] btrfs_log_inode_parent+0x412/0xa80 [btrfs]
Jul 18 10:12:26 vmPlex kernel: [84584.115364] ? __switch_to_asm+0x41/0x70
Jul 18 10:12:26 vmPlex kernel: [84584.115372] ? wait_current_trans+0x33/0x110 [btrfs]
Jul 18 10:12:26 vmPlex kernel: [84584.115374] ? _cond_resched+0x19/0x40
Jul 18 10:12:26 vmPlex kernel: [84584.115381] ? join_transaction+0x27/0x420 [btrfs]
Jul 18 10:12:26 vmPlex kernel: [84584.115389] btrfs_log_dentry_safe+0x60/0x80 [btrfs]
Jul 18 10:12:26 vmPlex kernel: [84584.115398] btrfs_sync_file+0x375/0x530 [btrfs]
Jul 18 10:12:26 vmPlex kernel: [84584.115400] vfs_fsync_range+0x51/0xb0
Jul 18 10:12:26 vmPlex kernel: [84584.115403] do_fsync+0x3d/0x70
Jul 18 10:12:26 vmPlex kernel: [84584.115405] SyS_fsync+0x10/0x20
Jul 18 10:12:26 vmPlex kernel: [84584.115407] do_syscall_64+0x73/0x130
Jul 18 10:12:26 vmPlex kernel: [84584.115409] entry_SYSCALL_64_after_hwframe+0x41/0xa6
Jul 18 10:12:26 vmPlex kernel: [84584.115410] RIP: 0033:0x7fb57804bb17
Jul 18 10:12:26 vmPlex kernel: [84584.115411] RSP: 002b:00007fb568b08050 EFLAGS: 00000293 ORIG_RAX: 000000000000004a
Jul 18 10:12:26 vmPlex kernel: [84584.115412] RAX: ffffffffffffffda RBX: 000000000000000b RCX: 00007fb57804bb17
Jul 18 10:12:26 vmPlex kernel: [84584.115413] RDX: 0000000000000000 RSI: 0000000000000002 RDI: 000000000000000b
Jul 18 10:12:26 vmPlex kernel: [84584.115421] RBP: 000000000231b0e8 R08: 0000000000000000 R09: 000000000006d653
Jul 18 10:12:26 vmPlex kernel: [84584.115423] R10: 00007fb56001b200 R11: 0000000000000293 R12: 0000000000000000
Jul 18 10:12:26 vmPlex kernel: [84584.115424] R13: 0000000002311ea8 R14: 0000000000000002 R15: 0000000000000000
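
With this many repeated traces it can help to condense the log to which tasks were reported as blocked, and how often. The following is a minimal Python sketch, assuming syslog lines in the format shown above; the function name `summarize_hung_tasks` is hypothetical.

```python
import re
from collections import Counter

# Matches the kernel's hung-task report line, e.g.
# "INFO: task kcompactd0:40 blocked for more than 120 seconds."
HUNG_RE = re.compile(
    r"INFO: task (?P<name>.+?):(?P<pid>\d+) blocked for more than \d+ seconds"
)


def summarize_hung_tasks(lines):
    """Count hung-task reports per (task name, pid)."""
    counts = Counter()
    for line in lines:
        match = HUNG_RE.search(line)
        if match:
            counts[(match.group("name"), int(match.group("pid")))] += 1
    return counts
```

Applied to the excerpt above, this gives three reports for kcompactd0 (PID 40), two each for btrfs-transacti (259), systemd-journal (294) and Plex Media Serv (22366), and one for Plex Media Serv (23434).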

What you can do when it occurs: hard power-off and restart.
Current workaround:
VM #1: cron job that reboots the machine every 2 h
VM #2: luckily 4.15.0-109-generic was still installed -> I have now booted that one and will keep observing...


All are Ubuntu 18.04 LTS with kernel 4.15.0-111-generic.

A "btrfs check" reports no errors.


Background on the individual machines:
VM #1: Plex server on ESXi 6.7
installed on RAID 1 / VMFS 5 / VM hardware version 6.0 / Xeon E5-1650 v3 / Supermicro X10SRL-F / ECC RAM

VM #2: Plex server on ESXi 6.7
installed on RAID 1 / VMFS 5 / VM hardware version 6.0 / Intel Atom C2550 / Supermicro A1SAi-2550F / ECC RAM

Physical: Fhem/ioBroker on Udoo x86 Ultra (8 GB RAM; SSD as boot medium) https://www.udoo.org/udoo-x86/
Single-disk Intel SSD D3-S4510 960GB


Ideas?
To me it looks as if the combination of the current kernel 4.15.0-111-generic and btrfs is pretty badly broken...

Best regards
Alexander

Re: HDD I/O freezes, system keeps running

Posted: Tue 21 Jul 2020, 15:17
by PichlAlex
Update: kernel 4.15.0-109-generic has the same problems... :cry:

Current workaround:
crontab -e
55 */3 * * * /sbin/reboot

That way it never gets around to running kcompactd, and so the kernel dump does not occur.

Re: HDD I/O freezes, system keeps running

Posted: Tue 21 Jul 2020, 15:41
by Webbutterfly
Hmmmm... why don't you install a more recent kernel?
The 18.04 repository goes up to 5.4.0-40...

Re: HDD I/O freezes, system keeps running

Posted: Tue 21 Jul 2020, 17:23
by PichlAlex
That was the latest one offered by apt upgrade :-)

I'll give it a try.

Re: HDD I/O freezes, system keeps running

Posted: Wed 22 Jul 2020, 17:03
by PichlAlex
Update: the 2 VMs have been running without problems again for 24 h.

But the physical machine (Udoo x86) is now acting up completely:

Startup / starting the daemons:
https://1drv.ms/u/s!Am3NuLFwEuea9lV7W61J1ZqHcH3A

Login on the shell:
https://1drv.ms/u/s!Am3NuLFwEuea9kk8cAc ... w?e=Vj7cGe

During shutdown:
https://1drv.ms/u/s!Am3NuLFwEuea9j0QnI-mzi1LUGqv

It is no longer reachable over the network...
What is particularly interesting is that from the login screen onward the background turns white and you can only see the "shadows".
Going back to the old kernel is not possible because GRUB is suddenly no longer displayed... and navigating blind did not work either...

After hours of googling I have found nothing so far that matches the symptom "shell no longer readable"...

Any ideas which keywords to search for?

Best regards
Alex

Re: HDD I/O freezes, system keeps running

Posted: Wed 22 Jul 2020, 22:47
by PichlAlex