LXD containers randomly drops/missing /proc/cpuinfo when using ZFS and Linux 6.8 (Noble) #14178

Qubitium · 2024-09-28T11:56:37Z

Required information

Distribution: Ubuntu
Distribution version: 24.04.1
The output of "snap list --all lxd core20 core22 core24 snapd":

Name    Version         Rev    Tracking       Publisher   Notes
core22  20240823        1612   latest/stable  canonical✓  base,disabled
core22  20240904        1621   latest/stable  canonical✓  base
core24  20240528        423    latest/stable  canonical✓  base,disabled
core24  20240710        490    latest/stable  canonical✓  base
lxd     5.21.2-34459c8  29568  5.21/stable    canonical✓  disabled
lxd     5.21.2-2f4ba6b  30131  5.21/stable    canonical✓  -
snapd   2.63            21759  latest/stable  canonical✓  snapd

The output of "lxc info" or if that fails:
- Kernel version: 6.10.9-x64v3-xanmod1
- LXC version: 5.21
- LXD version: 5.21
- Storage backend in use: zfs

config:
  core.https_address: '[::]:8443'
api_extensions:
- storage_zfs_remove_snapshots
- container_host_shutdown_timeout
- container_stop_priority
- container_syscall_filtering
- auth_pki
- container_last_used_at
- etag
- patch
- usb_devices
- https_allowed_credentials
- image_compression_algorithm
- directory_manipulation
- container_cpu_time
- storage_zfs_use_refquota
- storage_lvm_mount_options
- network
- profile_usedby
- container_push
- container_exec_recording
- certificate_update
- container_exec_signal_handling
- gpu_devices
- container_image_properties
- migration_progress
- id_map
- network_firewall_filtering
- network_routes
- storage
- file_delete
- file_append
- network_dhcp_expiry
- storage_lvm_vg_rename
- storage_lvm_thinpool_rename
- network_vlan
- image_create_aliases
- container_stateless_copy
- container_only_migration
- storage_zfs_clone_copy
- unix_device_rename
- storage_lvm_use_thinpool
- storage_rsync_bwlimit
- network_vxlan_interface
- storage_btrfs_mount_options
- entity_description
- image_force_refresh
- storage_lvm_lv_resizing
- id_map_base
- file_symlinks
- container_push_target
- network_vlan_physical
- storage_images_delete
- container_edit_metadata
- container_snapshot_stateful_migration
- storage_driver_ceph
- storage_ceph_user_name
- resource_limits
- storage_volatile_initial_source
- storage_ceph_force_osd_reuse
- storage_block_filesystem_btrfs
- resources
- kernel_limits
- storage_api_volume_rename
- network_sriov
- console
- restrict_devlxd
- migration_pre_copy
- infiniband
- maas_network
- devlxd_events
- proxy
- network_dhcp_gateway
- file_get_symlink
- network_leases
- unix_device_hotplug
- storage_api_local_volume_handling
- operation_description
- clustering
- event_lifecycle
- storage_api_remote_volume_handling
- nvidia_runtime
- container_mount_propagation
- container_backup
- devlxd_images
- container_local_cross_pool_handling
- proxy_unix
- proxy_udp
- clustering_join
- proxy_tcp_udp_multi_port_handling
- network_state
- proxy_unix_dac_properties
- container_protection_delete
- unix_priv_drop
- pprof_http
- proxy_haproxy_protocol
- network_hwaddr
- proxy_nat
- network_nat_order
- container_full
- backup_compression
- nvidia_runtime_config
- storage_api_volume_snapshots
- storage_unmapped
- projects
- network_vxlan_ttl
- container_incremental_copy
- usb_optional_vendorid
- snapshot_scheduling
- snapshot_schedule_aliases
- container_copy_project
- clustering_server_address
- clustering_image_replication
- container_protection_shift
- snapshot_expiry
- container_backup_override_pool
- snapshot_expiry_creation
- network_leases_location
- resources_cpu_socket
- resources_gpu
- resources_numa
- kernel_features
- id_map_current
- event_location
- storage_api_remote_volume_snapshots
- network_nat_address
- container_nic_routes
- cluster_internal_copy
- seccomp_notify
- lxc_features
- container_nic_ipvlan
- network_vlan_sriov
- storage_cephfs
- container_nic_ipfilter
- resources_v2
- container_exec_user_group_cwd
- container_syscall_intercept
- container_disk_shift
- storage_shifted
- resources_infiniband
- daemon_storage
- instances
- image_types
- resources_disk_sata
- clustering_roles
- images_expiry
- resources_network_firmware
- backup_compression_algorithm
- ceph_data_pool_name
- container_syscall_intercept_mount
- compression_squashfs
- container_raw_mount
- container_nic_routed
- container_syscall_intercept_mount_fuse
- container_disk_ceph
- virtual-machines
- image_profiles
- clustering_architecture
- resources_disk_id
- storage_lvm_stripes
- vm_boot_priority
- unix_hotplug_devices
- api_filtering
- instance_nic_network
- clustering_sizing
- firewall_driver
- projects_limits
- container_syscall_intercept_hugetlbfs
- limits_hugepages
- container_nic_routed_gateway
- projects_restrictions
- custom_volume_snapshot_expiry
- volume_snapshot_scheduling
- trust_ca_certificates
- snapshot_disk_usage
- clustering_edit_roles
- container_nic_routed_host_address
- container_nic_ipvlan_gateway
- resources_usb_pci
- resources_cpu_threads_numa
- resources_cpu_core_die
- api_os
- container_nic_routed_host_table
- container_nic_ipvlan_host_table
- container_nic_ipvlan_mode
- resources_system
- images_push_relay
- network_dns_search
- container_nic_routed_limits
- instance_nic_bridged_vlan
- network_state_bond_bridge
- usedby_consistency
- custom_block_volumes
- clustering_failure_domains
- resources_gpu_mdev
- console_vga_type
- projects_limits_disk
- network_type_macvlan
- network_type_sriov
- container_syscall_intercept_bpf_devices
- network_type_ovn
- projects_networks
- projects_networks_restricted_uplinks
- custom_volume_backup
- backup_override_name
- storage_rsync_compression
- network_type_physical
- network_ovn_external_subnets
- network_ovn_nat
- network_ovn_external_routes_remove
- tpm_device_type
- storage_zfs_clone_copy_rebase
- gpu_mdev
- resources_pci_iommu
- resources_network_usb
- resources_disk_address
- network_physical_ovn_ingress_mode
- network_ovn_dhcp
- network_physical_routes_anycast
- projects_limits_instances
- network_state_vlan
- instance_nic_bridged_port_isolation
- instance_bulk_state_change
- network_gvrp
- instance_pool_move
- gpu_sriov
- pci_device_type
- storage_volume_state
- network_acl
- migration_stateful
- disk_state_quota
- storage_ceph_features
- projects_compression
- projects_images_remote_cache_expiry
- certificate_project
- network_ovn_acl
- projects_images_auto_update
- projects_restricted_cluster_target
- images_default_architecture
- network_ovn_acl_defaults
- gpu_mig
- project_usage
- network_bridge_acl
- warnings
- projects_restricted_backups_and_snapshots
- clustering_join_token
- clustering_description
- server_trusted_proxy
- clustering_update_cert
- storage_api_project
- server_instance_driver_operational
- server_supported_storage_drivers
- event_lifecycle_requestor_address
- resources_gpu_usb
- clustering_evacuation
- network_ovn_nat_address
- network_bgp
- network_forward
- custom_volume_refresh
- network_counters_errors_dropped
- metrics
- image_source_project
- clustering_config
- network_peer
- linux_sysctl
- network_dns
- ovn_nic_acceleration
- certificate_self_renewal
- instance_project_move
- storage_volume_project_move
- cloud_init
- network_dns_nat
- database_leader
- instance_all_projects
- clustering_groups
- ceph_rbd_du
- instance_get_full
- qemu_metrics
- gpu_mig_uuid
- event_project
- clustering_evacuation_live
- instance_allow_inconsistent_copy
- network_state_ovn
- storage_volume_api_filtering
- image_restrictions
- storage_zfs_export
- network_dns_records
- storage_zfs_reserve_space
- network_acl_log
- storage_zfs_blocksize
- metrics_cpu_seconds
- instance_snapshot_never
- certificate_token
- instance_nic_routed_neighbor_probe
- event_hub
- agent_nic_config
- projects_restricted_intercept
- metrics_authentication
- images_target_project
- cluster_migration_inconsistent_copy
- cluster_ovn_chassis
- container_syscall_intercept_sched_setscheduler
- storage_lvm_thinpool_metadata_size
- storage_volume_state_total
- instance_file_head
- instances_nic_host_name
- image_copy_profile
- container_syscall_intercept_sysinfo
- clustering_evacuation_mode
- resources_pci_vpd
- qemu_raw_conf
- storage_cephfs_fscache
- network_load_balancer
- vsock_api
- instance_ready_state
- network_bgp_holdtime
- storage_volumes_all_projects
- metrics_memory_oom_total
- storage_buckets
- storage_buckets_create_credentials
- metrics_cpu_effective_total
- projects_networks_restricted_access
- storage_buckets_local
- loki
- acme
- internal_metrics
- cluster_join_token_expiry
- remote_token_expiry
- init_preseed
- storage_volumes_created_at
- cpu_hotplug
- projects_networks_zones
- network_txqueuelen
- cluster_member_state
- instances_placement_scriptlet
- storage_pool_source_wipe
- zfs_block_mode
- instance_generation_id
- disk_io_cache
- amd_sev
- storage_pool_loop_resize
- migration_vm_live
- ovn_nic_nesting
- oidc
- network_ovn_l3only
- ovn_nic_acceleration_vdpa
- cluster_healing
- instances_state_total
- auth_user
- security_csm
- instances_rebuild
- numa_cpu_placement
- custom_volume_iso
- network_allocations
- storage_api_remote_volume_snapshot_copy
- zfs_delegate
- operations_get_query_all_projects
- metadata_configuration
- syslog_socket
- event_lifecycle_name_and_project
- instances_nic_limits_priority
- disk_initial_volume_configuration
- operation_wait
- cluster_internal_custom_volume_copy
- disk_io_bus
- storage_cephfs_create_missing
- instance_move_config
- ovn_ssl_config
- init_preseed_storage_volumes
- metrics_instances_count
- server_instance_type_info
- resources_disk_mounted
- server_version_lts
- oidc_groups_claim
- loki_config_instance
- storage_volatile_uuid
- import_instance_devices
- instances_uefi_vars
- instances_migration_stateful
- container_syscall_filtering_allow_deny_syntax
- access_management
- vm_disk_io_limits
- storage_volumes_all
- instances_files_modify_permissions
- image_restriction_nesting
- container_syscall_intercept_finit_module
- device_usb_serial
- network_allocate_external_ips
- explicit_trust_token
api_status: stable
api_version: "1.0"
auth: trusted
public: false
auth_methods:
- tls
auth_user_name: don
auth_user_method: unix
environment:
  addresses:
  - [REDACTED]
  architectures:
  - x86_64
  - i686
  certificate: |
    -----BEGIN CERTIFICATE-----
    [REDACTED]
    -----END CERTIFICATE-----
  certificate_fingerprint: 89f218ba98182b8477f539d0f8956f0652380c2883ba4c28092c566f9cfc54f1
  driver: qemu | lxc
  driver_version: 8.2.1 | 6.0.0
  instance_types:
  - virtual-machine
  - container
  firewall: nftables
  kernel: Linux
  kernel_architecture: x86_64
  kernel_features:
    idmapped_mounts: "true"
    netnsid_getifaddrs: "true"
    seccomp_listener: "true"
    seccomp_listener_continue: "true"
    uevent_injection: "true"
    unpriv_fscaps: "true"
  kernel_version: 6.10.9-x64v3-xanmod1
  lxc_features:
    cgroup2: "true"
    core_scheduling: "true"
    devpts_fd: "true"
    idmapped_mounts_v2: "true"
    mount_injection_file: "true"
    network_gateway_device_route: "true"
    network_ipvlan: "true"
    network_l2proxy: "true"
    network_phys_macvlan_mtu: "true"
    network_veth_router: "true"
    pidfd: "true"
    seccomp_allow_deny_syntax: "true"
    seccomp_notify: "true"
    seccomp_proxy_send_notify_fd: "true"
  os_name: Ubuntu
  os_version: "24.04"
  project: default
  server: lxd
  server_clustered: false
  server_event_mode: full-mesh
  server_name: slime7
  server_pid: 2156
  server_version: 5.21.2
  server_lts: true
  storage: zfs
  storage_version: 2.2.99-699_g1713aa7b4
  storage_supported_drivers:
  - name: btrfs
    version: 5.16.2
    remote: false
  - name: ceph
    version: 17.2.7
    remote: true
  - name: cephfs
    version: 17.2.7
    remote: true
  - name: cephobject
    version: 17.2.7
    remote: true
  - name: dir
    version: "1"
    remote: false
  - name: lvm
    version: 2.03.11(2) (2021-01-08) / 1.02.175 (2021-01-08) / 4.48.0
    remote: false
  - name: powerflex
    version: 1.16 (nvme-cli)
    remote: true
  - name: zfs
    version: 2.2.99-699_g1713aa7b4
    remote: false

Issue description

lxd container (both host and container are ubuntu 24.04.1) randomly drops /proc/cpuinfo?

cat: /proc/cpuinfo: Transport endpoint is not connected)

I have no idea why this is happening. Force stop container and start container will fix this issue, until it happens next time. The chance of it happening is about once per week.

Want to add this server/container has a single Nvidia 4070 GPU passed to via device=gpu type=gpu.

Steps to reproduce

Happened more than once randomly on different amd single socket servers EPCY 9004 32-core/64 threads. (cpu has no official model id: engineering sample)

Information to attach

Correct cpuinfo:


processor	: 63
vendor_id	: AuthenticAMD
cpu family	: 25
model		: 17
model name	: AMD Eng Sample: 100-000000897-03
stepping	: 0
microcode	: 0xa101020
cpu MHz		: 3494.270
cache size	: 1024 KB
physical id	: 0
siblings	: 64
core id		: 15
cpu cores	: 32
apicid		: 31
initial apicid	: 31
fpu		: yes
fpu_exception	: yes
cpuid level	: 16
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good amd_lbr_v2 nopl xtopology nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba perfmon_v2 ibrs ibpb stibp ibrs_enhanced vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local user_shstk avx512_bf16 clzero irperf xsaveerptr rdpru wbnoinvd amd_ppin cppc arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif x2avic v_spec_ctrl vnmi avx512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq la57 rdpid overflow_recov succor smca fsrm flush_l1d debug_swap amd_lbr_pmc_freeze
bugs		: sysret_ss_attrs spectre_v1 spectre_v2 spec_store_bypass srso
bogomips	: 5092.49
TLB size	: 3584 4K pages
clflush size	: 64
cache_alignment	: 64
address sizes	: 52 bits physical, 57 bits virtual
power management: ts ttp tm hwpstate cpb eff_freq_ro [13] [14]

lxc config show c1 (the container)

architecture: x86_64
config:
  image.architecture: amd64
  image.description: ubuntu 24.04 LTS amd64 (release) (20240911)
  image.label: release
  image.os: ubuntu
  image.release: noble
  image.serial: "20240911"
  image.type: squashfs
  image.version: "24.04"
  volatile.base_image: 788feb0b7c4ab9ae31c74a9751a6d66b77f6f34fb793913dd1423f7c7ad5fc69
  volatile.cloud-init.instance-id: e2f982c9-e0f6-4144-a68e-b54ac31bac09
  volatile.eth0.host_name: veth94a33961
  volatile.eth0.hwaddr: 00:16:3e:9f:8a:3c
  volatile.idmap.base: "0"
  volatile.idmap.current: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":1000000,"Nsid":0,"Maprange":1000000000},{"Isuid":false,"Isgid":true,"Hostid":1000000,"Nsid":0,"Maprange":1000000000}]'
  volatile.last_state.idmap: '[]'
  volatile.last_state.power: RUNNING
  volatile.last_state.ready: "false"
  volatile.uuid: ebe982e7-ab86-412f-b3a8-5c8ab21fe7b1
  volatile.uuid.generation: ebe982e7-ab86-412f-b3a8-5c8ab21fe7b1
devices:
  gpu:
    type: gpu
  root:
    path: /
    pool: big
    type: disk
ephemeral: false
profiles:
- default
stateful: false
description: ""

The text was updated successfully, but these errors were encountered:

Qubitium · 2024-09-29T19:53:23Z

I have upgraded the lxd on hosts to 6.1/stable to see if the new major revision mitigates this issue.

simondeziel · 2024-10-01T18:23:21Z

I tried reproducing this on 5.21/edge and latest/edge to no avail. Here's what I did:

$ lxc launch ubuntu-minimal-daily:24.04 c1
$ while :; do lxc exec c1 -- cat /proc/cpuinfo > /dev/null; done

It didn't result in a Transport endpoint is not connected error after several minutes.

This sounds like a fuse/lxcfs issue at first glance but I wonder if the attached GPU or the ES CPU could have anything to do with it.

@Qubitium it's a long shot but I see many fuse related changes in the next kernel point release: https://cdn.kernel.org/pub/linux/kernel/v6.x/ChangeLog-6.10.10 so you might want to consider upgrading to the latest 6.10.x release.

Qubitium · 2024-10-02T00:29:36Z

@simondeziel I will upgrade to 6.10.10 to see if the problem persists.

To add more info:

The host /proc/cpuinfo had no issue. When the container lost access to /proc/cpuinfo, the host was able to access it just fine.
The loss of access to /proc/cpuinfo happened randomly over a period of time post container startup. So after container startup, cpuinfo access was fine, but after a random extended period of time, the container lost access.

Very strange.

tomponline · 2024-10-02T15:06:28Z

@mihalicyn could this be related to the LXCFS fixes you're working on?

Qubitium · 2024-10-04T05:26:39Z

Reproduced on kernel 6.10.12-x64v4-xanmod1:

Host: Ubuntu 24.04, Kernel 6.10.12, snap lxd 6.1/stable
Container: Centos 7.9.2009

ls -l /proc/*
ls: cannot access /proc/cpuinfo: Transport endpoint is not connected
ls: cannot access /proc/diskstats: Transport endpoint is not connected
ls: cannot access /proc/loadavg: Transport endpoint is not connected
ls: cannot access /proc/meminfo: Transport endpoint is not connected
ls: cannot access /proc/slabinfo: Transport endpoint is not connected
ls: cannot access /proc/stat: Transport endpoint is not connected
ls: cannot access /proc/swaps: Transport endpoint is not connected
ls: cannot access /proc/uptime: Transport endpoint is not connected

This server/host was rebooted yesterday so it happened within 24 hours. Again, it's quite random when/which container it happens to.

I checked syslog and dmesg on host and there is nothing there that would signal host had any errors of any kind.

The host has GPU passed to a separate container.

EDIT: ALL containers on this host lost access relevant /proc/* entries. Not just this container. I checked all containers, about 8-10 and they all have broken /proc/cpuinfo and related access.

Qubitium · 2024-10-04T05:46:34Z

@simondeziel @tomponline @mihalicyn

Found the cause! This is very good news. The lxd daemon had an internal crash related to liblxcfs. Found it after executing sudo snap logs lxd.daemon -n 100

2024-10-03T21:11:01Z lxd.daemon[2267]: *** signal 11
2024-10-03T21:11:01Z lxd.daemon[2267]: Register dump:
2024-10-03T21:11:01Z lxd.daemon[2267]:  RAX: 0000729991eb0df3   RBX: ffffffffffffff80   RCX: 0000729ed0f0f769
2024-10-03T21:11:01Z lxd.daemon[2267]:  RDX: 0000729ec95ffc90   RSI: 0000729991eb0de3   RDI: 0000729991eb0df3
2024-10-03T21:11:01Z lxd.daemon[2267]:  RBP: 0000729ec95ffb20   R8 : 0000729ed0ef4198   R9 : 000000000000000f
2024-10-03T21:11:01Z lxd.daemon[2267]:  R10: 0000729ed0ef4198   R11: 0000000000000000   R12: 0000000000000003
2024-10-03T21:11:01Z lxd.daemon[2267]:  R13: 0000729eb80389a0   R14: 00005f4a1d393dc8   R15: 0000000000000000
2024-10-03T21:11:01Z lxd.daemon[2267]:  RSP: 0000729ec95ffad0
2024-10-03T21:11:01Z lxd.daemon[2267]:  RIP: 0000729ed0ca53fe   EFLAGS: 00010202
2024-10-03T21:11:01Z lxd.daemon[2267]:  CS: 0033   FS: 0000   GS: 0000
2024-10-03T21:11:01Z lxd.daemon[2267]:  Trap: 0000000e   Error: 00000004   OldMask: 00004007   CR2: 91eb0deb
2024-10-03T21:11:01Z lxd.daemon[2267]:  FPUCW: 0000037f   FPUSW: 00000000   TAG: 00000000
2024-10-03T21:11:01Z lxd.daemon[2267]:  RIP: 00000000   RDP: 00000000
2024-10-03T21:11:01Z lxd.daemon[2267]:  ST(0) 0000 0000000000000000   ST(1) 0000 0000000000000000
2024-10-03T21:11:01Z lxd.daemon[2267]:  ST(2) 0000 0000000000000000   ST(3) 0000 0000000000000000
2024-10-03T21:11:01Z lxd.daemon[2267]:  ST(4) 0000 0000000000000000   ST(5) 0000 0000000000000000
2024-10-03T21:11:01Z lxd.daemon[2267]:  ST(6) 0000 0000000000000000   ST(7) 0000 0000000000000000
2024-10-03T21:11:01Z lxd.daemon[2267]:  mxcsr: 1fa0
2024-10-03T21:11:01Z lxd.daemon[2267]:  XMM0:  00000000000000000000000000000000 XMM1:  00000000000000000000000000000000
2024-10-03T21:11:01Z lxd.daemon[2267]:  XMM2:  00000000000000000000000000000000 XMM3:  00000000000000000000000000000000
2024-10-03T21:11:01Z lxd.daemon[2267]:  XMM4:  00000000000000000000000000000000 XMM5:  00000000000000000000000000000000
2024-10-03T21:11:01Z lxd.daemon[2267]:  XMM6:  00000000000000000000000000000000 XMM7:  00000000000000000000000000000000
2024-10-03T21:11:01Z lxd.daemon[2267]:  XMM8:  00000000000000000000000000000000 XMM9:  00000000000000000000000000000000
2024-10-03T21:11:01Z lxd.daemon[2267]:  XMM10: 00000000000000000000000000000000 XMM11: 00000000000000000000000000000000
2024-10-03T21:11:01Z lxd.daemon[2267]:  XMM12: 00000000000000000000000000000000 XMM13: 00000000000000000000000000000000
2024-10-03T21:11:01Z lxd.daemon[2267]:  XMM14: 00000000000000000000000000000000 XMM15: 00000000000000000000000000000000
2024-10-03T21:11:01Z lxd.daemon[2267]: Backtrace:
2024-10-03T21:11:01Z lxd.daemon[2267]: /lib/x86_64-linux-gnu/libc.so.6(free+0x1e)[0x729ed0ca53fe]
2024-10-03T21:11:01Z lxd.daemon[2267]: /snap/lxd/current/lib/liblxcfs.so(do_release_file_info+0x42)[0x729ed0f1a0c8]
2024-10-03T21:11:01Z lxd.daemon[2267]: /snap/lxd/current/lib/liblxcfs.so(proc_release+0x20)[0x729ed0f0f789]
2024-10-03T21:11:01Z lxd.daemon[2267]: lxcfs(+0x2f2c)[0x5f49e2624f2c]
2024-10-03T21:11:01Z lxd.daemon[2267]: lxcfs(+0x3c7f)[0x5f49e2625c7f]
2024-10-03T21:11:01Z lxd.daemon[2267]: /snap/lxd/current/lib/x86_64-linux-gnu/libfuse3.so.3(+0xbb6a)[0x729ed0f64b6a]
2024-10-03T21:11:01Z lxd.daemon[2267]: /snap/lxd/current/lib/x86_64-linux-gnu/libfuse3.so.3(+0x104fa)[0x729ed0f694fa]
2024-10-03T21:11:01Z lxd.daemon[2267]: /snap/lxd/current/lib/x86_64-linux-gnu/libfuse3.so.3(+0x11738)[0x729ed0f6a738]
2024-10-03T21:11:01Z lxd.daemon[2267]: /snap/lxd/current/lib/x86_64-linux-gnu/libfuse3.so.3(+0x1e13f)[0x729ed0f7713f]
2024-10-03T21:11:01Z lxd.daemon[2267]: /snap/lxd/current/lib/x86_64-linux-gnu/libfuse3.so.3(+0x167a7)[0x729ed0f6f7a7]
2024-10-03T21:11:01Z lxd.daemon[2267]: /lib/x86_64-linux-gnu/libc.so.6(+0x94ac3)[0x729ed0c94ac3]
2024-10-03T21:11:01Z lxd.daemon[2267]: /lib/x86_64-linux-gnu/libc.so.6(+0x126850)[0x729ed0d26850]
2024-10-03T21:11:01Z lxd.daemon[2267]: Memory map:
2024-10-03T21:11:01Z lxd.daemon[2267]: 5f49e2622000-5f49e262a000 r-xp 00000000 07:05 45                         /snap/lxd/30130/bin/lxcfs
2024-10-03T21:11:01Z lxd.daemon[2267]: 5f49e262a000-5f49e262b000 r--p 00007000 07:05 45                         /snap/lxd/30130/bin/lxcfs
2024-10-03T21:11:01Z lxd.daemon[2267]: 5f49e262b000-5f49e262c000 rw-p 00008000 07:05 45                         /snap/lxd/30130/bin/lxcfs
2024-10-03T21:11:01Z lxd.daemon[2267]: 5f4a1d387000-5f4a1d406000 rw-p 00000000 00:00 0                          [heap]
2024-10-03T21:11:01Z lxd.daemon[2267]: 729e5c000000-729e5c104000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729e5c104000-729e60000000 ---p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729e64000000-729e6407f000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729e6407f000-729e68000000 ---p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729e68000000-729e68021000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729e68021000-729e6c000000 ---p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729e70000000-729e7006c000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729e7006c000-729e74000000 ---p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729e74000000-729e74021000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729e74021000-729e78000000 ---p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729e7c000000-729e7c047000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729e7c047000-729e80000000 ---p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729e80000000-729e80021000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729e80021000-729e84000000 ---p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729e88000000-729e88160000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729e88160000-729e8c000000 ---p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729e8c000000-729e8c021000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729e8c021000-729e90000000 ---p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729e94000000-729e94021000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729e94021000-729e98000000 ---p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729e98000000-729e98080000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729e98080000-729e9c000000 ---p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ea0000000-729ea007e000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ea007e000-729ea4000000 ---p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ea4000000-729ea418f000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ea418f000-729ea8000000 ---p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729eac000000-729eac175000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729eac175000-729eb0000000 ---p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729eb0000000-729eb0068000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729eb0068000-729eb4000000 ---p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729eb4400000-729eb4401000 ---p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729eb4401000-729eb4c01000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729eb4e00000-729eb4e01000 ---p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729eb4e01000-729eb5601000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729eb6200000-729eb6201000 ---p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729eb6201000-729eb6a01000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729eb7600000-729eb7601000 ---p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729eb7601000-729eb7e01000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729eb8000000-729eb8092000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729eb8092000-729ebc000000 ---p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ebc000000-729ebc06a000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ebc06a000-729ec0000000 ---p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ec02fe000-729ec0400000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ec0e00000-729ec0e01000 ---p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ec0e01000-729ec1601000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ec16fe000-729ec1800000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ec2200000-729ec2201000 ---p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ec2201000-729ec2a01000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ec2c00000-729ec2c01000 ---p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ec2c01000-729ec3401000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ec3600000-729ec3601000 ---p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ec3601000-729ec3e01000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ec4000000-729ec4180000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ec4180000-729ec8000000 ---p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ec82fe000-729ec8400000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ec8400000-729ec8401000 ---p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ec8401000-729ec8c01000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ec8cfe000-729ec8e00000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ec8e00000-729ec8e01000 ---p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ec8e01000-729ec9601000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ec9800000-729ec9801000 ---p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ec9801000-729eca001000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729eca200000-729eca201000 ---p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729eca201000-729ecaa01000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ecaafe000-729ecac00000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ecac00000-729ecac01000 ---p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ecac01000-729ecb401000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ecb4fe000-729ecb600000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ecb600000-729ecb601000 ---p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ecb601000-729ecbe01000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ecc000000-729ecc026000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ecc026000-729ed0000000 ---p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ed0b21000-729ed0c00000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ed0c00000-729ed0c28000 r--p 00000000 07:01 7474                       /usr/lib/x86_64-linux-gnu/libc.so.6
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ed0c28000-729ed0dbd000 r-xp 00028000 07:01 7474                       /usr/lib/x86_64-linux-gnu/libc.so.6
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ed0dbd000-729ed0e15000 r--p 001bd000 07:01 7474                       /usr/lib/x86_64-linux-gnu/libc.so.6
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ed0e15000-729ed0e16000 ---p 00215000 07:01 7474                       /usr/lib/x86_64-linux-gnu/libc.so.6
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ed0e16000-729ed0e1a000 r--p 00215000 07:01 7474                       /usr/lib/x86_64-linux-gnu/libc.so.6
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ed0e1a000-729ed0e1c000 rw-p 00219000 07:01 7474                       /usr/lib/x86_64-linux-gnu/libc.so.6
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ed0e1c000-729ed0e29000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ed0e29000-729ed0ef3000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ed0ef3000-729ed0f26000 r-xp 00000000 07:05 158                        /snap/lxd/30130/lib/liblxcfs.so
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ed0f26000-729ed0f27000 r--p 00032000 07:05 158                        /snap/lxd/30130/lib/liblxcfs.so
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ed0f27000-729ed0f28000 rw-p 00033000 07:05 158                        /snap/lxd/30130/lib/liblxcfs.so
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ed0f28000-729ed0f36000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ed0f36000-729ed0f39000 r--p 00000000 07:01 7524                       /usr/lib/x86_64-linux-gnu/libgcc_s.so.1
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ed0f39000-729ed0f50000 r-xp 00003000 07:01 7524                       /usr/lib/x86_64-linux-gnu/libgcc_s.so.1
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ed0f50000-729ed0f54000 r--p 0001a000 07:01 7524                       /usr/lib/x86_64-linux-gnu/libgcc_s.so.1
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ed0f54000-729ed0f55000 r--p 0001d000 07:01 7524                       /usr/lib/x86_64-linux-gnu/libgcc_s.so.1
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ed0f55000-729ed0f56000 rw-p 0001e000 07:01 7524                       /usr/lib/x86_64-linux-gnu/libgcc_s.so.1
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ed0f56000-729ed0f59000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ed0f59000-729ed0f60000 r--p 00000000 07:05 568                        /snap/lxd/30130/lib/x86_64-linux-gnu/libfuse3.so.3.10.5
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ed0f60000-729ed0f7b000 r-xp 00007000 07:05 568                        /snap/lxd/30130/lib/x86_64-linux-gnu/libfuse3.so.3.10.5
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ed0f7b000-729ed0f85000 r--p 00022000 07:05 568                        /snap/lxd/30130/lib/x86_64-linux-gnu/libfuse3.so.3.10.5
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ed0f85000-729ed0f97000 r--p 0002b000 07:05 568                        /snap/lxd/30130/lib/x86_64-linux-gnu/libfuse3.so.3.10.5
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ed0f97000-729ed0f98000 rw-p 0003d000 07:05 568                        /snap/lxd/30130/lib/x86_64-linux-gnu/libfuse3.so.3.10.5
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ed0f98000-729ed0f99000 r--p 00000000 07:05 541                        /snap/lxd/30130/lib/x86_64-linux-gnu/libSegFault.so
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ed0f99000-729ed0f9c000 r-xp 00001000 07:05 541                        /snap/lxd/30130/lib/x86_64-linux-gnu/libSegFault.so
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ed0f9c000-729ed0f9d000 r--p 00004000 07:05 541                        /snap/lxd/30130/lib/x86_64-linux-gnu/libSegFault.so
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ed0f9d000-729ed0f9e000 r--p 00005000 07:05 541                        /snap/lxd/30130/lib/x86_64-linux-gnu/libSegFault.so
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ed0f9e000-729ed0f9f000 rw-p 00006000 07:05 541                        /snap/lxd/30130/lib/x86_64-linux-gnu/libSegFault.so
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ed0f9f000-729ed0fa1000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ed0fa1000-729ed0fa5000 r--p 00000000 00:00 0                          [vvar]
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ed0fa5000-729ed0fa7000 r-xp 00000000 00:00 0                          [vdso]
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ed0fa7000-729ed0fa9000 r--p 00000000 07:01 7445                       /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ed0fa9000-729ed0fd3000 r-xp 00002000 07:01 7445                       /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ed0fd3000-729ed0fde000 r--p 0002c000 07:01 7445                       /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ed0fde000-729ed0fdf000 rw-p 00000000 00:00 0
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ed0fdf000-729ed0fe1000 r--p 00037000 07:01 7445                       /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
2024-10-03T21:11:01Z lxd.daemon[2267]: 729ed0fe1000-729ed0fe3000 rw-p 00039000 07:01 7445                       /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
2024-10-03T21:11:01Z lxd.daemon[2267]: 7ffd23f81000-7ffd23fa2000 rw-p 00000000 00:00 0                          [stack]
2024-10-03T21:11:01Z lxd.daemon[2267]: ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0                  [vsyscall]

EDIT: looks like an attempt to free an invalid pointer in liblxcfs.so (do_release_file_info)

simondeziel · 2024-10-04T14:21:03Z

@Qubitium thanks that's really useful. I've punted the issue to Aleks whose one of the lxcfs maintainer.

Qubitium · 2024-10-07T06:06:46Z

@simondeziel Should I track this here or will there there be a second issue on github/lxcfs?

simondeziel · 2024-10-07T17:35:27Z

Indeed, that might require a bug in the lxcfs project but I'll let Aleks comment as he might want some more information from you anyway. Thanks

mihalicyn · 2024-10-08T12:05:39Z

Hi @Qubitium,

Thanks a lot for reporting this issue to us!

Maybe my question sounds unrelated but, are you using ZFS? If yes, then your case looks similar to lxc/lxcfs#644
I've spent really a lot of time rechecking LXCFS code and analyzing coredumps and all points out that ZFS kernel driver corrupts kernel memory on all kernels starting from 6.8.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

LXD containers randomly drops/missing /proc/cpuinfo when using ZFS and Linux 6.8 (Noble) #14178

LXD containers randomly drops/missing /proc/cpuinfo when using ZFS and Linux 6.8 (Noble) #14178

Qubitium commented Sep 28, 2024 •

edited

Loading

Qubitium commented Sep 29, 2024

simondeziel commented Oct 1, 2024

Qubitium commented Oct 2, 2024 •

edited

Loading

tomponline commented Oct 2, 2024

Qubitium commented Oct 4, 2024 •

edited

Loading

Qubitium commented Oct 4, 2024 •

edited

Loading

simondeziel commented Oct 4, 2024

Qubitium commented Oct 7, 2024

simondeziel commented Oct 7, 2024

mihalicyn commented Oct 8, 2024 •

edited

Loading

Qubitium commented Oct 8, 2024 •

edited

Loading

mihalicyn commented Oct 8, 2024 •

edited

Loading

Qubitium commented Oct 8, 2024

mihalicyn commented Oct 9, 2024 •

edited

Loading

Qubitium commented Oct 10, 2024

mihalicyn commented Oct 10, 2024

LXD containers randomly drops/missing /proc/cpuinfo when using ZFS and Linux 6.8 (Noble) #14178

LXD containers randomly drops/missing /proc/cpuinfo when using ZFS and Linux 6.8 (Noble) #14178

Comments

Qubitium commented Sep 28, 2024 • edited Loading

Required information

Issue description

Steps to reproduce

Information to attach

Qubitium commented Sep 29, 2024

simondeziel commented Oct 1, 2024

Qubitium commented Oct 2, 2024 • edited Loading

tomponline commented Oct 2, 2024

Qubitium commented Oct 4, 2024 • edited Loading

Qubitium commented Oct 4, 2024 • edited Loading

simondeziel commented Oct 4, 2024

Qubitium commented Oct 7, 2024

simondeziel commented Oct 7, 2024

mihalicyn commented Oct 8, 2024 • edited Loading

Qubitium commented Oct 8, 2024 • edited Loading

mihalicyn commented Oct 8, 2024 • edited Loading

Qubitium commented Oct 8, 2024

mihalicyn commented Oct 9, 2024 • edited Loading

Qubitium commented Oct 10, 2024

mihalicyn commented Oct 10, 2024

Qubitium commented Sep 28, 2024 •

edited

Loading

Qubitium commented Oct 2, 2024 •

edited

Loading

Qubitium commented Oct 4, 2024 •

edited

Loading

Qubitium commented Oct 4, 2024 •

edited

Loading

mihalicyn commented Oct 8, 2024 •

edited

Loading

Qubitium commented Oct 8, 2024 •

edited

Loading

mihalicyn commented Oct 8, 2024 •

edited

Loading

mihalicyn commented Oct 9, 2024 •

edited

Loading