Gentoo Forums
swiotlb full prevents loading nvidia vbios on 5.4.60
krsr
n00b


Joined: 18 Sep 2020
Posts: 2
Location: NJ

Posted: Fri Sep 18, 2020 8:28 pm    Post subject: swiotlb full prevents loading nvidia vbios on 5.4.60

I recently finished my first Gentoo install, and so far most things have been a breeze. However, when I boot the system, the nvidia driver fails to initialize the GPU (it cannot copy the vbios to system memory), evidently because the swiotlb is full. I don’t really understand the dmesg output, but it reports 0 slots in use, so the buffer shouldn’t actually be full (or is the requested size too large?).

After reading similar issues and bugzilla reports, I’ve tried booting with a variety of kernel parameters, including iommu=off, iommu=force, iommu=soft, and a much larger swiotlb size. None had any effect. I also tried configuring the IOMMU as written on this page (building the AMD v2 driver as a module) but got a dmesg line about "v2 not supported on this system". Current settings are:

Code:
# CONFIG_GART_IOMMU is not set
# CONFIG_CALGARY_IOMMU is not set
CONFIG_IOMMU_IOVA=y
CONFIG_IOMMU_API=y
CONFIG_IOMMU_SUPPORT=y
# Generic IOMMU Pagetable Support
# end of Generic IOMMU Pagetable Support
# CONFIG_IOMMU_DEBUGFS is not set
# CONFIG_IOMMU_DEFAULT_PASSTHROUGH is not set
CONFIG_AMD_IOMMU=y
# CONFIG_AMD_IOMMU_V2 is not set
# CONFIG_INTEL_IOMMU is not set
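
For anyone comparing their own settings against the list above, here is a minimal sketch of a helper that reports how a single option is set in a .config-style file. The function name config_state and the default path /usr/src/linux/.config are assumptions for illustration; adjust the path to wherever your kernel sources live.

```shell
# Report how a kernel option is set in a .config-style file.
# Usage: config_state SYMBOL [FILE]
# SYMBOL is given without the CONFIG_ prefix.
# FILE defaults to /usr/src/linux/.config (an assumed location).
config_state() {
    symbol=$1
    file=${2:-/usr/src/linux/.config}
    if grep -q "^CONFIG_${symbol}=" "$file"; then
        # Option is set: print the value after the '=' (y, m, or a literal).
        grep "^CONFIG_${symbol}=" "$file" | head -n 1 | cut -d= -f2
    elif grep -q "^# CONFIG_${symbol} is not set" "$file"; then
        # Option is known but disabled.
        echo "n"
    else
        # Option does not appear in the file at all.
        echo "absent"
    fi
}
```

Something like `config_state AMD_IOMMU` or `config_state AMD_IOMMU_V2` should then print y, m, n, or absent for the options listed above.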


Not sure if relevant, but this system has two NVMe SSDs. One contains the EFI system partition and Gentoo’s root partition, which uses btrfs. The other SSD contains a Windows installation which I can boot into using GRUB. I’ve been using the Windows installation without issue for months, and this GPU works great on that side; the second SSD is a recent addition.

dmesg output:

Code:
[    3.982791] software IO TLB: Memory encryption is active and system is using DMA bounce buffers
[    3.982838] nvidia 0000:10:00.0: swiotlb buffer is full (sz: 327680 bytes), total 32768 (slots), used 0 (slots)
[    3.982841] nvidia 0000:10:00.0: overflow 0x00008007eac00000+327680 of DMA mask 7fffffffffff bus mask 0
[    3.982845] ------------[ cut here ]------------
[    3.982850] WARNING: CPU: 13 PID: 910 at kernel/dma/direct.c:35 report_addr+0x2e/0x50
[    3.982850] Modules linked in: nvidia_modeset(PO+) nvidia(PO) efivarfs
[    3.982855] CPU: 13 PID: 910 Comm: nvidia-smi Tainted: P           O      5.4.60-gentoo #4
[    3.982856] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X570 Steel Legend WiFi ax, BIOS P1.90 09/10/2019
[    3.982858] RIP: 0010:report_addr+0x2e/0x50
[    3.982860] Code: 48 8b 87 28 02 00 00 48 89 34 24 48 85 c0 74 2d 4c 8b 00 b8 fe ff ff ff 49 39 c0 76 14 80 3d 14 b6 40 01 00 0f 84 25 07 00 00 <0f> 0b 48 83 c4 08 c3 48 83 bf 38 02 00 00 00 74 ef eb e0 80 3d f5
[    3.982861] RSP: 0018:ffffa5fe8049f898 EFLAGS: 00010246
[    3.982862] RAX: 0000000000000000 RBX: ffff9a42b93fb800 RCX: 0000000000000000
[    3.982862] RDX: 0000000000000001 RSI: 0000000000000092 RDI: ffffffffa8ac54ac
[    3.982863] RBP: 0000000000000050 R08: 0000000000000001 R09: 00000000000003d8
[    3.982864] R10: 00000000000155c0 R11: 0000000000000001 R12: 0000000000050000
[    3.982865] R13: ffff9a42b9f295c8 R14: 0000000000000001 R15: ffff9a42b9f29630
[    3.982866] FS:  00007f4326a01b80(0000) GS:ffff9a42beb40000(0000) knlGS:0000000000000000
[    3.982867] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    3.982868] CR2: 000055fbe2ed54b0 CR3: 00008007f797e000 CR4: 0000000000340ee0
[    3.982868] Call Trace:
[    3.982872]  dma_direct_map_page+0xdd/0xf0
[    3.983019]  nv_dma_map_pages+0x184/0x3f0 [nvidia]
[    3.983162]  nv_dma_map_alloc+0xd9/0x270 [nvidia]
[    3.983314]  _nv030433rm+0x357/0x480 [nvidia]
[    3.983442]  ? _nv025145rm+0x2b5/0x470 [nvidia]
[    3.983564]  ? _nv026175rm+0x76/0x270 [nvidia]
[    3.983683]  ? _nv026133rm+0x381/0x430 [nvidia]
[    3.983801]  ? _nv026126rm+0xd8/0x610 [nvidia]
[    3.983914]  ? _nv037389rm+0x104/0x180 [nvidia]
[    3.984022]  ? _nv037434rm+0x93a/0x11d0 [nvidia]
[    3.984089]  ? _nv000738rm+0xd06/0x2030 [nvidia]
[    3.984156]  ? rm_init_adapter+0xc5/0xe0 [nvidia]
[    3.984221]  ? nv_request_soc_irq+0x200/0xe60 [nvidia]
[    3.984223]  ? _cond_resched+0x10/0x20
[    3.984287]  ? nv_request_soc_irq+0xc02/0xe60 [nvidia]
[    3.984289]  ? exact_lock+0x8/0x20
[    3.984353]  ? nvidia_frontend_open+0x4e/0x90 [nvidia]
[    3.984353]  ? chrdev_open+0x98/0x1a0
[    3.984354]  ? cdev_put.part.0+0x20/0x20
[    3.984355]  ? do_dentry_open+0x137/0x380
[    3.984357]  ? path_openat+0x58c/0x1560
[    3.984359]  ? security_capable+0x31/0x50
[    3.984361]  ? capable_wrt_inode_uidgid+0x12/0x30
[    3.984362]  ? do_filp_open+0x8c/0x100
[    3.984363]  ? chown_common.isra.0+0x9a/0x150
[    3.984364]  ? do_sys_open+0x17f/0x220
[    3.984365]  ? do_syscall_64+0x43/0x110
[    3.984366]  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
[    3.984367] ---[ end trace bcfec63070b43e43 ]---
[    3.984405] NVRM: GPU 0000:10:00.0: Failed to copy vbios to system memory.
[    3.984541] NVRM: GPU 0000:10:00.0: RmInitAdapter failed! (0x30:0xffff:794)
[    3.984560] NVRM: GPU 0000:10:00.0: rm_init_adapter failed, device minor number 0
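
The numbers in the first swiotlb line can be checked by hand. This is a rough sketch, assuming the 5.4-era constants IO_TLB_SHIFT=11 and IO_TLB_SEGSIZE=128 from include/linux/swiotlb.h (worth verifying against your own tree): each slot is 2 KiB, and a single mapping may span at most 128 contiguous slots. The function names are made up for illustration.

```shell
# Constants as found in include/linux/swiotlb.h on 5.4-era kernels
# (assumed here; check your own kernel sources).
IO_TLB_SHIFT=11      # each slot is 1 << 11 = 2048 bytes
IO_TLB_SEGSIZE=128   # at most 128 contiguous slots per mapping

# Largest single mapping the bounce buffer can serve, in bytes.
swiotlb_max_mapping() {
    echo $(( IO_TLB_SEGSIZE << IO_TLB_SHIFT ))
}

# Total pool size in bytes for a given slot count.
swiotlb_pool_bytes() {
    echo $(( $1 << IO_TLB_SHIFT ))
}

# Succeeds (exit 0) if a request of $1 bytes fits in one mapping.
swiotlb_fits() {
    [ "$1" -le "$(swiotlb_max_mapping)" ]
}
```

Under these assumptions the pool of 32768 slots is 64 MiB, but the limit for a single mapping is 262144 bytes (256 KiB), which is below the 327680 bytes requested in the log. That would explain a "full" buffer with 0 slots in use, and why a larger swiotlb size made no difference.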


CPU & Kernel:

Code:
$ uname -a
Linux 5.4.60-gentoo #4 SMP Thu Sep 17 18:16:51 EDT 2020 x86_64 AMD Ryzen 7 3800X 8-Core Processor AuthenticAMD GNU/Linux


Motherboard:

Code:
$ cat /sys/devices/virtual/dmi/id/board_{vendor,name}
ASRock
X570 Steel Legend WiFi ax


Nvidia hardware, including GPU:

Code:
$ lspci | grep -i nvidia
10:00.0 VGA compatible controller: NVIDIA Corporation TU104 [GeForce RTX 2070 SUPER] (rev a1)
10:00.1 Audio device: NVIDIA Corporation TU104 HD Audio Controller (rev a1)
10:00.2 USB controller: NVIDIA Corporation TU104 USB 3.1 Host Controller (rev a1)
10:00.3 Serial bus controller [0c80]: NVIDIA Corporation TU104 USB Type-C UCSI Controller (rev a1)


Driver package settings:

Code:
x11-drivers/nvidia-drivers-450.66::gentoo was built with the following:
USE="X driver kms (libglvnd) multilib tools -compat -gtk3 -static-libs -uvm -wayland" ABI_X86="32
krsr
n00b


Joined: 18 Sep 2020
Posts: 2
Location: NJ

Posted: Sat Sep 19, 2020 6:51 pm    Post subject: Resolved

The culprit was AMD's Secure Memory Encryption (SME). I rebuilt the kernel with
Code:
# CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT is not set
as described in NVIDIA's README, and I can now start X without issue.

See https://forums.developer.nvidia.com/t/unable-to-start-x-failed-to-initialize-dma/64925/15 for more info.
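
To confirm the change took effect after a reboot, one option is to look for the bounce-buffer message from the dmesg output earlier in the thread. Below is a minimal sketch with two hypothetical helpers that read text on stdin. The second one checks for the mem_encrypt=off boot parameter, which the kernel documents as a way to turn SME off without a rebuild; it is untested on this system.

```shell
# Succeeds (exit 0) if the dmesg text on stdin shows memory encryption
# forcing DMA through the swiotlb bounce buffers.
sme_forces_bounce() {
    grep -q 'Memory encryption is active and system is using DMA bounce buffers'
}

# Succeeds (exit 0) if the kernel command line text on stdin contains
# mem_encrypt=off as a separate parameter.
cmdline_disables_sme() {
    grep -Eq '(^|[[:space:]])mem_encrypt=off([[:space:]]|$)'
}
```

For example, `dmesg | sme_forces_bounce && echo "still bouncing"` should print nothing once SME is off, and `cmdline_disables_sme < /proc/cmdline` shows whether the boot parameter route is in use.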
All times are GMT