A while ago I built a new Ryzen 3000 PC with a Vega 56 GPU. For months now my PC has been freezing randomly. By freezing I mean nothing works anymore: no virtual console, no ping, no kernel SysRq. I have to do a hardware reset. And there is never an error in the kernel log.
I tried everything I could find on the internet (BIOS changes, kernel boot options etc.) but nothing helps. Even the RAM is currently running at the low default clock.
The main problem is that I can't reproduce it. But it always happens while a game is running. It never happens when I watch videos, for example, so I don't think this is the C6 state problem. It even happens in games with low CPU and GPU usage, so it is not related to high power usage.
About 2 weeks ago I watched a video from AdoredTV and he said that this happens on Windows too. Because I thought it was a Linux problem, I never looked into Windows-related search results :(
However, the Windows problem is related to the Vega and Navi GPUs, not the Ryzen CPU.
I really hate to say this, but I'm very close to buying an NVIDIA (2060 Super) card. If this problem does not go away very soon I might even give up gaming on the PC. (Which is a bit of a pity with at least 30 unplayed games on Steam, but I already found a new hobby, building Lego-like sets, and I'm 53.)
Anyway, my question at the moment is not so much how to fix the problem, although I might try if someone has an idea. It is this: are there any known issues with an RTX 2060 Super on Linux at the moment? I don't want to spend over €450 if there are other big problems.
Currently running Manjaro unstable? with the 5.5.7 kernel.
Thanks for reading this,
Bye
I tried this; no connection is possible, not even a ping works anymore.
For testing I had a little script read out the current GPU state from the /sys directory and show it on the other computer every second. When the gaming PC stops working, the updates stop too.
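A minimal sketch of such a once-per-second monitor, for anyone who wants to try the same thing. This is not the script from the post. The function names `snapshot` and `monitor` are made up for illustration, and the file to poll is an assumption: the journal excerpt further down the thread shows `/sys/kernel/debug/dri/0/amdgpu_pm_info` being read, which needs root.

```python
import time
from datetime import datetime
from pathlib import Path


def snapshot(path: Path) -> str:
    """Return one timestamped line with the file's content, or the read error."""
    stamp = datetime.now().isoformat(timespec="seconds")
    try:
        text = path.read_text()
    except OSError as exc:
        return f"{stamp} ERROR {exc}"
    # Collapse all whitespace so every sample fits on a single line.
    return f"{stamp} {' '.join(text.split())}"


def monitor(path: Path, interval: float = 1.0, samples=None) -> None:
    """Print a snapshot every `interval` seconds; run forever if samples is None."""
    count = 0
    while samples is None or count < samples:
        # flush=True matters: when the output goes through ssh or a pipe,
        # the last sample before a freeze must not sit in a buffer.
        print(snapshot(path), flush=True)
        count += 1
        time.sleep(interval)


# Assumed usage, started from the second computer over ssh:
#   monitor(Path("/sys/kernel/debug/dri/0/amdgpu_pm_info"))
```

The last line that reaches the other computer then shows the GPU state and the time of the final sample before the freeze.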
Btw, the other computer is a Dell laptop with a Pentium 3M, 512 MB RAM and a 64 GB PATA SSD, running Mint with the i3 WM. (I thought that was a little bit funny to mention.)
I would ask on the [mesa issues](https://gitlab.freedesktop.org/mesa/mesa/issues) page for help.
EDIT
Give us something to read. Start a game and then post the contents of dmesg.
Tell us about your graphics stack. I don't have Manjaro. What version of Mesa?
What version of libdrm? How about the contents of /var/log/Xorg.0.log? Do you
use ACO as your shader compiler?
Last edited by sr_ls_boy on 6 Mar 2020 at 9:17 pm UTC
I'm not sure how much of the driver is in Mesa (at least the 3D part). Can Mesa even crash the kernel? But it is worth a look there. I can't recall anything Mesa-related coming up in my Google searches.
Thanks
(But keep in mind the problem has been there for months now, and the first entries I found during my Google search are over 2 years old.)
Mesa 20.0.1
libdrm 2.4.100
ACO: I don't think so. The only package I found (mesa-aco) is not installed. I guess it's LLVM then?
The kernel command line: oops=panic udev.log_priority=3 audit=0 amdgpu.ppfeaturemask=0xffffffff amdgpu.vm_debug=1 amdgpu.gpu_recovery=1 processor.max_cstate=3 rcu_nocbs=all
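To check which of these options the kernel actually received, the command line can be split into key/value pairs. This is a rough sketch: `parse_cmdline` is a made-up helper, and it ignores quoted values that contain spaces. On a running system the same text can be read from `/proc/cmdline`.

```python
def parse_cmdline(cmdline: str) -> dict:
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


# The options quoted in the post above.
CMDLINE = (
    "oops=panic udev.log_priority=3 audit=0 "
    "amdgpu.ppfeaturemask=0xffffffff amdgpu.vm_debug=1 "
    "amdgpu.gpu_recovery=1 processor.max_cstate=3 rcu_nocbs=all"
)

params = parse_cmdline(CMDLINE)
# Keep only the options aimed at the amdgpu module.
amdgpu = {k: v for k, v in params.items() if k.startswith("amdgpu.")}
for name, value in sorted(amdgpu.items()):
    print(f"{name} = {value}")
```

Comparing this against `/proc/cmdline` after a reboot shows whether a boot loader change really took effect.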
The Xorg log is too long to post here, but it is mostly Modelines from AMDGPU. Nothing unusual, I would say.
This log has to go to the laptop too.
This looks a little bit odd. At the end of the Xorg log is this:
[ 11880.078] (II) AMDGPU(0): EDID vendor "GSM", prod id 30436
[ 11880.078] (II) AMDGPU(0): Using EDID range info for horizontal sync
[ 11880.078] (II) AMDGPU(0): Using EDID range info for vertical refresh
[ 11880.078] (II) AMDGPU(0): Printing DDC gathered Modelines:
[ 11880.078] (II) AMDGPU(0): Modeline "3440x1440"x0.0 319.75 3440 3488 3520 3600 1440 1443 1453 1481 +hsync -vsync (88.8 kHz eP)
[ 11880.078] (II) AMDGPU(0): Modeline "3440x1440"x0.0 429.80 3440 3584 3680 3880 1440 1448 1452 1476 +hsync -vsync (110.8 kHz e)
[ 11880.078] (II) AMDGPU(0): Modeline "3440x1440"x0.0 157.75 3440 3488 3520 3600 1440 1443 1453 1461 +hsync -vsync (43.8 kHz e)
[ 11880.078] (II) AMDGPU(0): Modeline "2560x1080"x0.0 185.58 2560 2624 2688 2784 1080 1083 1093 1111 -hsync -vsync (66.7 kHz e)
[ 11880.078] (II) AMDGPU(0): Modeline "1280x720"x0.0 74.25 1280 1390 1430 1650 720 725 730 750 +hsync +vsync (45.0 kHz e)
[ 11880.078] (II) AMDGPU(0): Modeline "720x480"x0.0 27.00 720 736 798 858 480 489 495 525 -hsync -vsync (31.5 kHz e)
[ 11880.078] (II) AMDGPU(0): Modeline "1920x1080"x0.0 148.50 1920 2008 2052 2200 1080 1084 1089 1125 +hsync +vsync (67.5 kHz e)
[ 11880.078] (II) AMDGPU(0): Modeline "640x480"x0.0 25.18 640 656 752 800 480 490 492 525 -hsync -vsync (31.5 kHz e)
[ 11880.078] (II) AMDGPU(0): Modeline "1920x1080"x0.0 148.50 1920 2448 2492 2640 1080 1084 1089 1125 +hsync +vsync (56.2 kHz e)
[ 11880.078] (II) AMDGPU(0): Modeline "1280x720"x0.0 74.25 1280 1720 1760 1980 720 725 730 750 +hsync +vsync (37.5 kHz e)
[ 11880.078] (II) AMDGPU(0): Modeline "720x576"x0.0 27.00 720 732 796 864 576 581 586 625 -hsync -vsync (31.2 kHz e)
[ 11880.078] (II) AMDGPU(0): Modeline "800x600"x0.0 40.00 800 840 968 1056 600 601 605 628 +hsync +vsync (37.9 kHz e)
[ 11880.078] (II) AMDGPU(0): Modeline "640x480"x0.0 31.50 640 656 720 840 480 481 484 500 -hsync -vsync (37.5 kHz e)
[ 11880.078] (II) AMDGPU(0): Modeline "1280x1024"x0.0 135.00 1280 1296 1440 1688 1024 1025 1028 1066 +hsync +vsync (80.0 kHz e)
[ 11880.078] (II) AMDGPU(0): Modeline "1024x768"x0.0 78.75 1024 1040 1136 1312 768 769 772 800 +hsync +vsync (60.0 kHz e)
[ 11880.078] (II) AMDGPU(0): Modeline "1024x768"x0.0 65.00 1024 1048 1184 1344 768 771 777 806 -hsync -vsync (48.4 kHz e)
[ 11880.078] (II) AMDGPU(0): Modeline "832x624"x0.0 57.28 832 864 928 1152 624 625 628 667 -hsync -vsync (49.7 kHz e)
[ 11880.078] (II) AMDGPU(0): Modeline "800x600"x0.0 49.50 800 816 896 1056 600 601 604 625 +hsync +vsync (46.9 kHz e)
[ 11880.078] (II) AMDGPU(0): Modeline "1152x864"x0.0 108.00 1152 1216 1344 1600 864 865 868 900 +hsync +vsync (67.5 kHz e)
[ 11880.078] (II) AMDGPU(0): Modeline "1152x864"x60.0 81.75 1152 1216 1336 1520 864 867 871 897 -hsync +vsync (53.8 kHz e)
[ 11880.078] (II) AMDGPU(0): Modeline "1280x1024"x0.0 108.00 1280 1328 1440 1688 1024 1025 1028 1066 +hsync +vsync (64.0 kHz e)
[ 11880.078] (II) AMDGPU(0): Modeline "1600x900"x59.9 118.25 1600 1696 1856 2112 900 903 908 934 -hsync +vsync (56.0 kHz e)
[ 11880.078] (II) AMDGPU(0): Modeline "1680x1050"x0.0 146.25 1680 1784 1960 2240 1050 1053 1059 1089 -hsync +vsync (65.3 kHz e)
[ 11880.078] (II) AMDGPU(0): Modeline "1280x800"x0.0 83.50 1280 1352 1480 1680 800 803 809 831 -hsync +vsync (49.7 kHz e)
Repeated 3 times with different time stamps.
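For reading these lines: in the usual X modeline layout the first number is the pixel clock in MHz, followed by four horizontal and four vertical timing values, where the last value of each group is the total. The sketch below (the helper name `parse_modeline` is made up) derives the horizontal sync rate and the refresh rate from that. My assumption is that the `x0.0` after the mode name is just a refresh rate field the driver has not filled in.

```python
import re

# Mode name, then nine numbers: pixel clock plus eight timing values.
MODELINE_RE = re.compile(
    r'Modeline "(?P<name>[^"]+)"x[\d.]+\s+(?P<nums>(?:[\d.]+\s+){8}[\d.]+)'
)


def parse_modeline(line: str):
    """Return name, hsync (kHz) and refresh (Hz) for a log line, or None."""
    m = MODELINE_RE.search(line)
    if m is None:
        return None
    nums = [float(x) for x in m.group("nums").split()]
    clock_mhz, htotal, vtotal = nums[0], nums[4], nums[8]
    return {
        "name": m.group("name"),
        # Lines per second, in kHz: pixel clock divided by pixels per line.
        "hsync_khz": clock_mhz * 1000.0 / htotal,
        # Frames per second: pixel clock divided by pixels per frame.
        "refresh_hz": clock_mhz * 1e6 / (htotal * vtotal),
    }


LINE = ('[ 11880.078] (II) AMDGPU(0): Modeline "3440x1440"x0.0 319.75 '
        "3440 3488 3520 3600 1440 1443 1453 1481 +hsync -vsync (88.8 kHz eP)")

mode = parse_modeline(LINE)
print(f'{mode["name"]}: {mode["hsync_khz"]:.1f} kHz, {mode["refresh_hz"]:.2f} Hz')
```

The computed horizontal rate for the first line agrees with the 88.8 kHz the driver prints in parentheses, which suggests the values themselves are consistent.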
Searching the Mesa issues site I found this: [Random crash on amdgpu due to temperature missrepoorting](https://gitlab.freedesktop.org/mesa/mesa/issues/1044)
Sounds interesting. I will try what he/she wrote to log this.
Thanks.
Also consider posting dmesg and the Xorg log and use the spoiler tags. I get those modelines as well.
I tried an old GTX 970 I have. Because the freezing is not reproducible, and the RTX 2060 certainly has a different driver, I can't tell from the old card whether the RTX will work or not. I used the GTX 970 for a couple of years in my old Intel-based PC with Linux Mint and Arch Linux and never ran into this kind of problem. And I don't know anybody with an RTX card.
@sr_ls_boy
I tried what was suggested in comment 23 and set GALLIUM_DDEBUG. I played some games yesterday and let some games run in demo mode, but the PC never froze. Not sure if the problem solved itself or not. There is one difference: I'm running the 5.5.8 kernel now (it came with a Manjaro update), and according to the kernel changelog some things were fixed in the AMDGPU driver.
Comment 23 suggested running these commands if the error occurs:
sudo umr -lb > umr_dump
sudo umr -O verbose,use_colour -R gfx[.] >> umr_dump
sudo umr -O halt_waves,use_colour -wa >> umr_dump
I tried this and the 2nd one instantly reboots my PC. (These commands do not work with zsh, by the way.)
However, I don't know how to run this when the PC is frozen.
I'm still not so sure this is a driver problem. I mean, the Windows driver is based on different source code. I'm not sure, but I think the driver developers for Windows and Linux are two different teams.
I will post more if I have more info on this. Not sure how many people with an AMD Vega card are reading this post and have the same problems. According to a poll from Hardware Unboxed, [Can We Still Recommend Radeon GPUs?](https://youtu.be/1uynVO4ZXl0), about 19% of AMD GPU users have problems.
I forgot to post the link to [Still Something Wrong At Radeon](https://youtu.be/_x-QSi_yvoU) from AdoredTV.
Thanks for your help.
Dax
Sure, they use the same driver installation blob, but they are different HW architectures. I would be very surprised if there were no differences in the driver.
So, I just played Minecraft for a while and it happened again. The PC froze. This time I had an ssh connection open the whole time, and it was dead too.
The Xorg log shows only these lines at the end. You can see the time difference.
[ 59.125] (II) AMDGPU(0): Modeline "1280x800"x0.0 83.50 1280 1352 1480 1680 800 803 809 831 -hsync +vsync (49.7 kHz e)
[ 5691.334] (EE) client bug: timer event5 debounce: scheduled expiry is in the past (-0ms), your system is too slow
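The time difference can also be pulled out of a longer log automatically. A small sketch, assuming the number in square brackets at the start of each Xorg log line is a timestamp in seconds; `largest_gap` is a made-up helper name.

```python
import re

# Leading "[  123.456]" timestamp of an Xorg log line.
STAMP_RE = re.compile(r"^\[\s*(\d+\.\d+)\]")


def largest_gap(lines):
    """Return (gap, start, end) for the longest silence between stamped lines."""
    best = (0.0, None, None)
    prev = None
    for line in lines:
        m = STAMP_RE.match(line)
        if m is None:
            continue  # skip lines without a timestamp
        t = float(m.group(1))
        if prev is not None and t - prev > best[0]:
            best = (t - prev, prev, t)
        prev = t
    return best


# The two lines quoted above.
SAMPLE = [
    '[ 59.125] (II) AMDGPU(0): Modeline "1280x800"x0.0 83.50 1280 1352 1480 '
    "1680 800 803 809 831 -hsync +vsync (49.7 kHz e)",
    "[ 5691.334] (EE) client bug: timer event5 debounce: scheduled expiry is "
    "in the past (-0ms), your system is too slow",
]

gap, start, end = largest_gap(SAMPLE)
print(f"{gap:.3f} s of silence between {start} and {end}")
```

For the two lines here the gap is roughly 94 minutes, so the error entry is the only thing Xorg logged between startup and the freeze.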
The journal gives this at the end (the PAM messages are from the monitoring I ran every second):
Mär 08 10:56:03 moritz sudo[52433]: alfred : TTY=pts/2 ; PWD=/home/alfred ; USER=root ; COMMAND=/usr/bin/cat /sys/kernel/debug/dri/0/amdgpu_pm_info
Mär 08 10:56:03 moritz sudo[52433]: pam_unix(sudo:session): session opened for user root by alfred(uid=0)
Mär 08 10:56:03 moritz sudo[52433]: pam_unix(sudo:session): session closed for user root
Mär 08 10:56:04 moritz sudo[52442]: alfred : TTY=pts/2 ; PWD=/home/alfred ; USER=root ; COMMAND=/usr/bin/cat /sys/kernel/debug/dri/0/amdgpu_pm_info
Mär 08 10:56:04 moritz sudo[52442]: pam_unix(sudo:session): session opened for user root by alfred(uid=0)
Mär 08 10:56:04 moritz sudo[52442]: pam_unix(sudo:session): session closed for user root
Mär 08 10:56:05 moritz sudo[52452]: alfred : TTY=pts/2 ; PWD=/home/alfred ; USER=root ; COMMAND=/usr/bin/cat /sys/kernel/debug/dri/0/amdgpu_pm_info
Mär 08 10:56:05 moritz sudo[52452]: pam_unix(sudo:session): session opened for user root by alfred(uid=0)
Mär 08 10:56:05 moritz sudo[52452]: pam_unix(sudo:session): session closed for user root
Mär 08 10:56:06 moritz sudo[52461]: alfred : TTY=pts/2 ; PWD=/home/alfred ; USER=root ; COMMAND=/usr/bin/cat /sys/kernel/debug/dri/0/amdgpu_pm_info
Mär 08 10:56:06 moritz sudo[52461]: pam_unix(sudo:session): session opened for user root by alfred(uid=0)
Mär 08 10:56:06 moritz sudo[52461]: pam_unix(sudo:session): session closed for user root
Mär 08 10:56:07 moritz sudo[52470]: alfred : TTY=pts/2 ; PWD=/home/alfred ; USER=root ; COMMAND=/usr/bin/cat /sys/kernel/debug/dri/0/amdgpu_pm_info
Mär 08 10:56:07 moritz sudo[52470]: pam_unix(sudo:session): session opened for user root by alfred(uid=0)
Mär 08 10:56:07 moritz sudo[52470]: pam_unix(sudo:session): session closed for user root
-- Reboot --
Mär 08 10:58:21 moritz kernel: Linux version 5.5.8-1-MANJARO (builder@216fb1516504) (gcc version 9.2.1 20200130 (Arch Linux 9.2.1+20200130-2)) #1 SMP PREEMPT Thu Mar 5 20:29:51 UTC 2020
Mär 08 10:58:21 moritz kernel: Command line: BOOT_IMAGE=/vmlinuz-5.5-x86_64 root=UUID=7f7d3134-e671-4bf4-b00c-dac4ecf90413 rw oops=panic udev.log_priority=3 audit=0 amdgpu.ppfeaturemask=0xffffffff amdgpu.vm_debug=1 amdgpu.vm_fault_stop=2 amdgpu.gpu_recovery=1 processor.max_cstate=3 rcu_nocbs=all
I think this has cost enough of my (and your) time already. I have spent at least 30 hours on this by now, and every time I think it's working, it happens again. I will order an RTX 2060 Super today.
Putting some time into finding a solution for a problem is not an issue if there are at least some hints about what's going on. But this situation is not what I have in mind when I want to play a game after working the whole day writing software.
(Does anyone want to buy a Sapphire Pulse Vega 56? :)
Thank you all for your support
Dax
The order for a new graphics card went out 10 minutes ago. (Because it is being sent to my mother, I will get it next Saturday.)
It is not because I want an NVIDIA card. When the Vega is working it does a great job; I'm actually happy with the performance, and even the fan is barely noticeable.
If it's not the GPU, then I will buy other components. In the end I might have 3 PCs here and only one working :'(
The reason I chose an AMD GPU is that I don't like NVIDIA's politics, but as I said in my first post, I might end up not using a PC for games at all anymore. I'm not at that point just yet.
Dax
The GPU card manufacturers I have used so far are ELSA, ASUS, Gigabyte, MSI, Sapphire and Palit (there is still a GTX 560 Ti on the shelf). The RTX 2060 Super I ordered is from Palit. (I will get it today.)
The MSI GTX 970 has a bug in the fan control which is known to MSI but has never been fixed. From time to time one fan stops spinning and the other goes up to full speed. I never closed the case of my last PC because I had to give the stopped fan a short nudge to get it spinning again.
What I'm saying is that the manufacturer is not how someone should select PC components. From time to time every company produces a bad component. Checking tests is the best way I know. Of course not all tests are without bias, and some are very bad. A while ago I found the YouTube channel IgorsLab (in German). He uses very high-end equipment to test HW. I had never found anyone else who actually measured the 10 ms peak power consumption of GPUs, which could be a problem for the power supply.
(No, I'm not going to start writing about power supplies, that would end up as a short novel :)
I don't think the problem I have with the Vega is related to Sapphire but to AMD. But it would be interesting to know if some manufacturers are more likely to have this problem. As I wrote, my PC freezes completely. That means the Linux kernel is not running anymore. I'm not sure a driver can actually do this. My hunch is that the GPU is holding the DMA or an IRQ, or puts some bad noise on the power rails, so the CPU stops working.
Thanks for reading,
Dax
PS.
As soon as I have more information I will post it here. I might buy other components to build a 2nd PC for the Vega card to test this. Maybe this is not one problem but a combination of more than one.
PPS.
I work as a SW developer on embedded systems, and yesterday I could finally build the test system to evaluate the new APU board we would like to use in the next generation of our devices. It's an [AMD R1605B](https://en.wikichip.org/wiki/amd/ryzen_embedded/v1605b) APU with Vega graphics. First tests, using Debian unstable, look good. Hopefully this APU does not have the same problems I have. Our devices run 24/7 in industrial production lines.
The 2060 has now been in use for over 2 weeks without a single freeze. I think it's safe to say the Vega has a problem. Whether it is the hardware or the driver is still an open question.
I'm working from home at the moment. My PC is running much longer than usual.