Graphics crash, glitch, freeze
trashy May 23, 2021
Hello good people, I started gaming a bit on Linux with Lutris and Steam Proton. It works most of the time, but approx. once an evening my graphics completely crash. Linux itself seems to be still running (Discord Voice Chat keeps on working) but the display becomes frozen, unresponsive.



► Is there a way to recover the system from this, e.g. restart the graphics stack with a keyboard shortcut so the computer can be used again? Right now, I have to press the physical reset button and sometimes lose my game progress.
► Any ideas how to troubleshoot / fix this? I don't even know where to begin, would be happy about any pointers!

This is pretty much a fresh Ubuntu, I followed the Lutris Docs for Wine, Drivers and Battle.net. This glitch already happened in both Lutris and Steam though.

My System Info

Linux Distribution: Ubuntu 20.04.2 LTS
Desktop Environment: GNOME 3.36.8
Graphics Card: Sapphire Radeon RX 590 Nitro+ SE OC AMD 8GB
GPU Driver Version: OpenGL 4.6 (Core Profile) Mesa 21.0.2 - kisak-mesa PPA (?)

Have you checked for system updates?: There seem to be newer mesa versions (21.1.1) but when I do `apt-get upgrade`, they are listed under `The following packages have been kept back:` so they don't install right now.

Steam system read-out: https://pastebin.com/qqdQev9t
tuxintuxedo May 23, 2021
This is just a guess, but aside from Mesa, you could try a newer kernel.
Can you get to console with Ctrl+Alt+F2? If yes, then you can basically try everything from killing Wine to restarting xorg.
denyasis May 26, 2021
I agree. Try switching to the console and see what's going on or kill the wine/steam/lutris instance.

As for mesa, try:
sudo apt-get dist-upgrade

I hope that helps a bit.
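
For anyone wondering why the packages are "kept back": `apt-get upgrade` never installs new packages or removes installed ones, so an upgrade that needs a new dependency gets skipped, while `apt-get dist-upgrade` is allowed to resolve those. A minimal sketch for listing what is being held, assuming English apt output; `kept_back` is a hypothetical helper name, not an apt command:

```shell
# Print one kept-back package per line from apt-get output read on stdin.
# Assumption: the package list is indented below the "kept back" header,
# which is how apt-get prints it in the C/English locale.
kept_back() {
    awk '
        /^The following packages have been kept back:/ { in_list = 1; next }
        in_list && /^ / { for (i = 1; i <= NF; i++) print $i; next }
        { in_list = 0 }
    '
}

# Usage (simulation only, nothing gets installed):
#   LC_ALL=C apt-get -s upgrade | kept_back
```

If Mesa packages show up in that list, `dist-upgrade` (or installing the listed packages by name) should pull them in.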

Last edited by denyasis on 26 May 2021 at 12:14 am UTC
Koopacabras May 26, 2021
Looks memory-related to me. Are you overclocking your GPU, or did you ever overclock it in the past? I get a similar error, but only when I overclock. Overclocking has always been an issue with the amdgpu driver: it is on Polaris, and it happens on Navi (my hardware) as well.
trashy May 26, 2021
Thank you all! I have upgraded Mesa, and so far I have been lucky and had no crash.

I did no overclocking. I was wondering whether the "OC" in the name means it's overclocked at the factory, but I can't find any info on that.

Def' will test the fullscreen terminal if it happens again, didn't know that, thanks. My plan is to kill the game and/or restart xorg.

ps ux | grep Valheim   # find the process ID
kill -15 [pid]         # SIGTERM: ask the process to shut down gracefully
kill -9 [pid]          # SIGKILL: kill it with fire

sudo systemctl restart display-manager
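
The same kill steps as a small helper, in case it saves typing on the text console. This is only a sketch: `stop_gracefully` is a made-up function name and the 5-second default grace period is an arbitrary assumption. It sends SIGTERM first and falls back to SIGKILL only if the process is still there afterwards.

```shell
# stop_gracefully PID [SECONDS]
# Send SIGTERM, wait up to SECONDS (default 5), then send SIGKILL.
stop_gracefully() {
    sg_pid="$1"
    sg_grace="${2:-5}"
    # If SIGTERM cannot be delivered, the process is already gone.
    kill -TERM "$sg_pid" 2>/dev/null || return 0
    while [ "$sg_grace" -gt 0 ]; do
        # kill -0 sends no signal, it only checks that the PID still exists.
        kill -0 "$sg_pid" 2>/dev/null || return 0
        sleep 1
        sg_grace=$((sg_grace - 1))
    done
    kill -KILL "$sg_pid" 2>/dev/null || true
}

# Usage: list matching PIDs with `pgrep -f Valheim`, then:
#   stop_gracefully 12345
```

If the desktop is still frozen after the game is gone, restarting the display manager as above is the next step.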
trashy Jun 12, 2021
Hey, short update: both Overwatch through Lutris and Valheim through Steam still crash occasionally.

Sometimes I can recover the system with the commands above, but sometimes the Ctrl+Alt+F3 console just gets flooded with errors (Ctrl+Alt+F2 doesn't work on my Ubuntu). https://streamable.com/og7lsc

I found out about `/var/log/kern.log` and this is the full log.

Jun 12 13:42:46 Hitower kernel: [ 2840.133972] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=589788, emitted seq=589790
Jun 12 13:42:46 Hitower kernel: [ 2840.134019] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process Overwatch.exe pid 8382 thread Overwatch.exe pid 8467
Jun 12 13:42:46 Hitower kernel: [ 2840.134024] amdgpu 0000:01:00.0: amdgpu: GPU reset begin!
Jun 12 13:42:46 Hitower kernel: [ 2840.655063] amdgpu: cp is busy, skip halt cp
Jun 12 13:42:46 Hitower kernel: [ 2840.836353] amdgpu: rlc is busy, skip halt rlc
Jun 12 13:42:46 Hitower kernel: [ 2840.837363] amdgpu 0000:01:00.0: amdgpu: GPU BACO reset
Jun 12 13:42:47 Hitower kernel: [ 2841.134864] amdgpu 0000:01:00.0: amdgpu: GPU reset succeeded, trying to resume
Jun 12 13:42:47 Hitower kernel: [ 2841.135330] [drm] PCIE GART of 256M enabled (table at 0x000000F40012C000).
Jun 12 13:42:47 Hitower kernel: [ 2841.135338] [drm] VRAM is lost due to GPU reset!
Jun 12 13:42:47 Hitower kernel: [ 2841.282108] [drm] UVD and UVD ENC initialized successfully.
Jun 12 13:42:47 Hitower kernel: [ 2841.382170] [drm] VCE initialized successfully.
Jun 12 13:42:47 Hitower kernel: [ 2841.388388] [drm] recover vram bo from shadow start
Jun 12 13:42:47 Hitower kernel: [ 2841.394743] [drm] recover vram bo from shadow done
Jun 12 13:42:47 Hitower kernel: [ 2841.394745] [drm] Skip scheduling IBs!
Jun 12 13:42:47 Hitower kernel: [ 2841.394746] [drm] Skip scheduling IBs!
Jun 12 13:42:47 Hitower kernel: [ 2841.394777] amdgpu 0000:01:00.0: amdgpu: GPU reset(2) succeeded!
Jun 12 13:42:47 Hitower kernel: [ 2841.394786] [drm] Skip scheduling IBs! (repeats a lot ...)
Jun 12 13:42:47 Hitower kernel: [ 2841.396830] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125! (repeats endlessly ...)
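
To see which process triggered each hang without scrolling through the repeats, the `Process information` lines can be pulled out of the log. A rough sketch, assuming the line format shown above and process names without spaces; `hung_processes` is a hypothetical helper name:

```shell
# Read kernel log lines on stdin and print the name of the process that
# amdgpu reported for each job timeout (the word after "process").
hung_processes() {
    awk '
        /amdgpu_job_timedout/ && /Process information:/ {
            for (i = 1; i < NF; i++) {
                if ($i == "process") { print $(i + 1); break }
            }
        }
    '
}

# Usage: count hangs per game.
#   hung_processes < /var/log/kern.log | sort | uniq -c
```

If the hangs are spread across several games rather than tied to one, that would point more towards the driver or the hardware than towards a single title.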


With these new error messages I find tons of threads online, but there seems to be no real solution other than trying the latest kernel (I read that in threads going back years, so it doesn't seem like a stable solution to me) or tinkering with kernel options. I don't know, maybe dual booting for gaming is the easiest way.

Last edited by trashy on 12 June 2021 at 4:17 pm UTC
Xpander Jun 13, 2021
Does AMD still not have GPU recovery after a crash? Looks like the graphics driver just crashes the card.
Maybe you can try CoreCtrl to lower the clocks or something, since you said your card is factory OC? Might help, might not.
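
Without extra tools, the amdgpu driver also exposes its clock states through sysfs, which is enough for a quick test. A sketch, assuming the card is `card0` (it can be `card1` on some systems); `active_sclk` is a hypothetical helper name, while `pp_dpm_sclk` and `power_dpm_force_performance_level` are files provided by the amdgpu driver:

```shell
# pp_dpm_sclk lists the available shader clock levels, one per line,
# and marks the active one with a trailing "*". Print that clock.
active_sclk() {
    awk '/\*/ { print $2 }'
}

# Usage:
#   active_sclk < /sys/class/drm/card0/device/pp_dpm_sclk
#
# Force the lowest clocks for a test session, then go back to automatic:
#   echo low  | sudo tee /sys/class/drm/card0/device/power_dpm_force_performance_level
#   echo auto | sudo tee /sys/class/drm/card0/device/power_dpm_force_performance_level
```

Forcing `low` costs a lot of performance, so it is more useful as a diagnostic (do the hangs stop?) than as a permanent setting.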

Probably have to wait the day they implement GPU recovering
tuubi Jun 13, 2021
Quoting: Xpander "Probably have to wait the day they implement GPU recovering"
They've implemented it already, years ago. Doesn't seem to work in this case though.

Might be worth trying a more recent kernel to check if this particular bug was already fixed in the amdgpu kernel driver. I'd recommend Xanmod kernels. They're a breeze to install (and remove) and won't mess up your system in any way. Just follow the link for instructions on how to add the repo and install the kernel.

Another option is Ubuntu's own mainline kernel repo, but as you say you use Proton and Lutris, you might appreciate that Xanmod includes things like the fsync and fsync2 patchsets.
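
After installing another kernel it is worth confirming which one actually booted, since the bootloader does not always pick the new one by default. A small sketch, assuming GNU `sort` with `-V`; `version_ge` is a hypothetical helper and the 5.11 threshold is only an example value, not a version known to contain a fix:

```shell
# version_ge A B: succeed when version A is greater than or equal to B.
# Relies on the version ordering of GNU sort -V.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Usage:
#   uname -r                                  # show the running kernel
#   version_ge "$(uname -r)" 5.11 && echo "running 5.11 or newer"
```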
trashy Jun 13, 2021
Thank you both, I will look into both downclocking my "factory OC" and Xanmod. Time to learn something about the Linux kernel
tuubi Jun 14, 2021
Quoting: trashy "Thank you both, I will look into both downclocking my "factory OC" and Xanmod. Time to learn something about the Linux kernel"
I had an RX580 that kept crashing before my current GPU. Apparently some component or another was faulty in some way, in addition to a broken cooler. I got it to stop overheating and hanging by simply undervolting it a bit using some graphical tool (I think it was WattmanGTK), didn't have to touch the clocks at all. Just dropped all the voltages slightly. That might be worth trying if installing the new kernel isn't sufficient.

I sent that card for RMA when I bought the new one, and the replacement seems to run just fine in my wife's workstation, but she doesn't really play anything heavier than Stardew Valley so who knows.