Title: [Rant]: RX 5700... a frustrating experience
Tuxee 3 Nov 2019
After several generations of NVidia cards I decided to switch back to AMD. Prior to my NVidia period my experiences with AMD cards had been... well, mediocre (proprietary drivers always lagging behind kernel releases, open source drivers laughably bad). But since then the open source drivers are supposed to have gotten into good shape, and with the RX 5700 truly competitive hardware is finally available. What could possibly go wrong...

Got my Sapphire Pulse RX 5700 this October (I wanted a quiet custom design) and I was well aware that I required a 5.3 kernel, manually copied firmware and 19.2 Mesa drivers.
I set up a brand new Ubuntu 19.10 on my rig - an X570 board with a 3700X and 32 GB of RAM - added the firmware (it wasn't there on a default install) and the Oibaf PPA.
Outcome: A constant flow of
amdgpu: [powerplay] Failed to send message ...
or
amdgpu: [powerplay] Failed to export SMU metrics table!

Result: An unusable desktop which crashes hard after a few minutes.

Back to my trusty 18.04, added mainline kernel, added firmware, added Oibaf. This interestingly works. A dmesg shows nothing disturbing. Well, kinda works. It crashes hard every now and then, but not too often. Crashing games I can live with, but a crashing browser is more of an annoyance.
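
For anyone retracing these steps, here is a minimal sketch of how the first two prerequisites (a 5.3+ kernel and the Navi firmware) could be sanity-checked. The function names are made up for illustration, and the `navi10_*` file pattern is an assumption based on the firmware directory name linked later in the thread; it does not check the Mesa version.

```python
import re
from pathlib import Path


def kernel_at_least(release, minimum=(5, 3)):
    """Return True if a release string like '5.3.0-19-generic' is >= minimum."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return (int(match.group(1)), int(match.group(2))) >= minimum


def navi_firmware_files(firmware_dir="/lib/firmware/amdgpu", pattern="navi10_*"):
    """List firmware blobs matching the (assumed) Navi 10 naming pattern.

    Returns an empty list if the directory does not exist.
    """
    return sorted(p.name for p in Path(firmware_dir).glob(pattern))
```

Calling `kernel_at_least(platform.release())` (with `import platform`) checks the running kernel, and an empty list from `navi_firmware_files()` would suggest the firmware still has to be copied over by hand.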

Googling turned up threads like this one:

https://bugs.freedesktop.org/show_bug.cgi?id=111481

I have to heartily agree with one of the posters there, who states

We’re Amd customers like anyone else, support was supposed to be introduced on Mesa 19.2 and improved on Mesa 19.3, so far none of these two versions work properly. Please get your shit together Amd, this is ridiculous.
because I don't get the feeling that this is going to be fixed anytime soon. After all the RX 5700 was introduced about 4 months ago... (He got reprimanded for "name calling" though I found he had put it "rather mildly".)

Next thing: OpenCL. I wasn't aware that the open source drivers lack that entirely. Extracting binaries from the proprietary packages and manually placing them appropriately fixed that too.

I'm all for Open Source but at this point I can only recommend NVidia graphics cards with their proprietary drivers. They just "work". I've always considered the attacks of Windows users claiming "hardware problems with Linux" as unfounded. And I was right - never had them. Till I came across the RX 5700. Now you really "have to compile kernels and drivers" and "do everything in a shell, typing cryptic commands".

On the bright side: Doing some Vulkan benchmarks yielded spectacular frame rates when compared to my previous GTX1060:
  • War Thunder saw a 70% increase (this game now runs more stably with Vulkan than with OpenGL, albeit far from flawlessly)
  • Shadow of Mordor went from 90 FPS to 146
  • Talos Principle exploded from 60 FPS to 280
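
To put the three results on the same scale, here is a small sketch that converts the before/after frame rates into percentage increases. The function name is made up for illustration, and War Thunder is left out because only the percentage was given, not the raw frame rates.

```python
def percent_increase(before, after):
    """Relative gain of `after` over `before`, in percent."""
    return (after - before) / before * 100.0


# Frame rates quoted in the post (GTX 1060 -> RX 5700).
results = {
    "Shadow of Mordor": (90, 146),
    "Talos Principle": (60, 280),
}

for game, (old_fps, new_fps) in results.items():
    print(f"{game}: +{percent_increase(old_fps, new_fps):.0f}%")
```

That works out to roughly +62% for Shadow of Mordor and +367% for Talos Principle, next to the reported +70% for War Thunder.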
sub 3 Nov 2019
Damn, what a bummer. :/

I can only hope that this is just the transition
to the new architecture and will be sorted out soon.

So far I'm more than pleased with AMD's policies and support.
Yet, I don't have one of the new cards.

I can fully understand your anger.

[freedesktop.org is down?]
tuxintuxedo 3 Nov 2019
I somewhat understand where you are coming from, but note that Linux users are still less important than Windows users from the viewpoint of any kind of company (Nvidia included).
AMD truly made a huge turnaround regarding their Linux drivers, but they can be expected to lag behind somewhat. Although I don't have problems with the AMD cards I have or with the ones my friends own, I wouldn't recommend one that is less than half a year old because of the support. That's sad, but currently not much can be done. AMD's Linux team has great people, but upper management and communication inside the company are still lacking in this regard.
So the most I can suggest is to try the "flawless experience" again sometime later, when it is ready, as it is for older AMD cards.

Last edited by tuxintuxedo on 3 Nov 2019 at 12:24 pm UTC
sub 3 Nov 2019
Quoting: tuxintuxedo
I somewhat understand where you are coming from, but note that Linux users are still less important than Windows users from the viewpoint of any kind of company (Nvidia included).
AMD truly made a huge turnaround regarding their Linux drivers, but they can be expected to lag behind somewhat. Although I don't have problems with the AMD cards I have or with the ones my friends own, I wouldn't recommend one that is less than half a year old because of the support. That's sad, but currently not much can be done. AMD's Linux team has great people, but upper management and communication inside the company are still lacking in this regard.
So the most I can suggest is to try the "flawless experience" again sometime later, when it is ready, as it is for older AMD cards.
The Linux team surely has great people, and I absolutely
love the FOSS drivers / documentation available.

I currently see only one major issue,
which might cause problems like those described by the OP.

Mesa has a release cycle. So unless you're able to build the stack yourself,
this sometimes means waiting 3-6 months for a release that (properly) supports the new hardware,
in particular if there is a major architecture change.
Now, why not simply push the new code earlier?
Well, products that have not been released (maybe announced) and have no official specs yet are understandably a problem for AMD: they don't want to expose details early, because that gives the competition a major advantage, even if only by letting them have an existing product ready that's at least competitive in the same bang-for-buck segment.

While I'm sure that AMD has an internal review process for the patches,
they have to undergo another review when they are pushed to Mesa.
Hence, this can't be simply done right before a release.

If you think exposing information early isn't a problem, then you have missed the many
news sites (not Linux related) that tried to speculate about upcoming AMD GPUs based on nothing but
some commits to Mesa. So this is a problem.
And while this stuff sometimes looks hard to read, people who are really interested
in those bits (read: competitors) DO KNOW how to extract valuable information from it.

This is a true dilemma.
It could perhaps only be resolved if Mesa allowed private code reviews (including NDAs),
which they probably won't consider for other good reasons.

Last edited by sub on 3 Nov 2019 at 3:05 pm UTC
Tuxee 3 Nov 2019
I've been on Linux long enough to be aware of how little it matters to manufacturers. However, when hardware is sold in packaging that states "supported operating systems include Linux, Windows 7 and Windows 10", I expect a working driver. There ARE packages provided by AMD, too (the ones I used to get my OpenCL support), but it is unclear whether their packages dating from August work any more reliably than my current setup (be it the closed source or open source variant).

The power consumption of 30W when idling should be addressed by kernel 5.4. Hopefully...

Phoronix had benchmarks up and running by the end of July (yes, everything was bleeding edge then)
https://www.phoronix.com/scan.php?page=article&item=rx-5700-july&num=1
but three months later I don't think I'm asking too much when I ask for a somewhat stable driver. (Or at least guidelines which will yield a stable system.)

With the Oibaf PPA I am on the most recent Mesa stack. But there are several firmware versions floating around and the kernel has to be recent enough (with an 18.04 Ubuntu you will get the official 5.3 kernel next spring).

And then we look to the Green Side: Get an RTX 2060, get pretty much any of their drivers readily available in repositories (they had RTX support in their drivers the same day they released the hardware), and - it works.

Whether withholding information will yield competitive advantages is questionable. For fellow Linux users I just can't recommend AMD cards. By the time the issues for the RX5700 are ironed out, the next generation is ready to launch. And the reports about bugs and problems might also bleed into the Windows world, fueling a general perception of AMD being sub-par.

Well, I just got carried away. I was pissed. Writing these posts helped :)
tuubi 3 Nov 2019
Quoting: sub
This is a true dilemma.
It could perhaps only be resolved if Mesa allowed private code reviews (including NDAs),
which they probably won't consider for other good reasons.
Nothing stops AMD from developing and beta-testing their hardware support internally before it's ever officially submitted to Mesa for inclusion. Of course they do this already, but obviously big chunks hadn't even been started when Navi hardware was already on the shelves. In any case, Mesa's licensing and development model do not force them to release any information or source code early, so you're barking up the wrong tree.

Mesa's code reviews aren't the reason Navi still has major bugs and missing functionality that were basically only discovered when a customer tried running a popular game or application on a popular distribution. AMD simply doesn't have a big enough Linux driver team to properly test and support a new consumer hardware architecture in a timely fashion.

All that said, my next GPU will most likely be AMD again. I'm rarely in a hurry to buy the latest and greatest anyway. I have been thinking about the 5700, but I don't think I'll take a serious look until Christmas.
TobyGornow 3 Nov 2019
I feel you and agree 100% with you. If I had known, I would've dished out 100 bucks more and got myself a 2070 Super instead of the 5700 XT.

I'm not a Linux power user, so installing the latest kernel RC, then finding that it lacks the firmware and installing that, building the latest mesa-git (the one that is bugged with the game you're actually playing), filing a bug, waiting for the patch to be merged, and re-building Mesa just to actually use your hardware is really, really tedious for a product that is almost 5 months old. And you're right when you say it's exactly how Windows users describe us.

Heck, if I'd gone the Nvidia way: unplug the old card, plug in the new one, better than Windows.

I'm rendering some stuff in Blender and like you I need OpenCL, so right now my GTX 970 is better with its old working CUDA cores.

Right now I'm satisfied, it's working well, and I'm sorry to learn that you're crashing because I don't have this problem. But tomorrow? I feel like it's working, but with duct tape and zip ties holding everything together, and one wrong update or upgrade could mess it all up really quickly.

Last edited by TobyGornow on 3 Nov 2019 at 6:48 pm UTC
Shmerl 4 Nov 2019
5.3 is required, but far from enough. For an at least moderately stable experience, use 5.4-rc6. Plus Mesa master to run games. Too many games hang with radeonsi on older Mesa. radv is OK for the most part.
Shmerl 4 Nov 2019
Quoting: Tuxee
The power consumption of 30W when idling should be addressed by kernel 5.4. Hopefully...
Depending on your resolution and refresh rate, it's not addressed.

See: https://bugs.freedesktop.org/show_bug.cgi?id=111482

There is conflicting info on that. Some claim that at high resolution / high refresh rate, GDDR6 needs 30W (and they observe it on Windows as well), so it's not a bug; it simply differs from HBM2, which can handle it with less.

However, others say that it's possible to run it with less (see the suspend/resume sequence example). I tested that and indeed it's possible. So something is clearly missing here.

See also #amd-navi-linux chat room on matrix.org for active discussion.
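
For comparing idle power figures, here is a minimal sketch of reading the value the driver reports through hwmon. It assumes the amdgpu hwmon file `power1_average` is present and reports microwatts; the sysfs glob and the helper names are illustrative assumptions, not something confirmed in this thread.

```python
from pathlib import Path

# Assumed location of the amdgpu hwmon power reading (value in microwatts).
HWMON_GLOB = "card*/device/hwmon/hwmon*/power1_average"


def parse_microwatts(text):
    """Convert the raw file content (microwatts) to watts."""
    return int(text.strip()) / 1_000_000


def read_gpu_power_watts(drm_root="/sys/class/drm"):
    """Return {sysfs path: watts} for every matching hwmon file found."""
    readings = {}
    for path in sorted(Path(drm_root).glob(HWMON_GLOB)):
        readings[str(path)] = parse_microwatts(path.read_text())
    return readings
```

Note that posts later in this thread associate reading GPU statistics with the powerplay errors, so polling this in a tight loop on an affected card may be unwise.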

Last edited by Shmerl on 4 Nov 2019 at 5:21 am UTC
damarrin 4 Nov 2019
At least we will now have this thread to point people to when they ask what gfx card to get for their Linux machine.

I have a computer with an AMD GPU. It hangs hard any time from 5 seconds to 5 minutes after booting the system. I don't use it much.
Tuxee 4 Nov 2019
Quoting: Shmerl
5.3 is required, but far from enough. For an at least moderately stable experience, use 5.4-rc6. Plus Mesa master to run games. Too many games hang with radeonsi on older Mesa. radv is OK for the most part.
Well 5.4 should be out in two or three weeks. I'll try it once it becomes stable. After all, I have to work on the machine, too...
Shmerl 4 Nov 2019
Quoting: Tuxee
Well 5.4 should be out in two or three weeks. I'll try it once it becomes stable. After all, I have to work on the machine, too...
Works pretty well for me. I.e. if you already have Navi, it's not going to be stable with 5.3 for sure, so get 5.4 if you need to work on it. Otherwise, don't use Navi until it comes out.

Last edited by Shmerl on 4 Nov 2019 at 2:01 pm UTC
Tuxee 4 Nov 2019
Quoting: Shmerl
I.e. if you already have Navi, it's not going to be stable with 5.3 for sure, so get 5.4 if you need to work on it. Otherwise, don't use Navi until it comes out.
No. Doesn't work. 5.4-rc6 can't even finish booting to the desktop.

Nov  4 17:38:04 leia kernel: [   12.814605] amdgpu: [powerplay] failed send message: TransferTableSmu2Dram (18)         param: 0x00000006 response 0xffffffc2
Nov  4 17:38:04 leia kernel: [   12.814607] amdgpu: [powerplay] Failed to export SMU metrics table!
Nov  4 17:38:06 leia kernel: [   15.026804] amdgpu: [powerplay] failed send message: SetDriverDramAddrHigh (14)         param: 0x00000080 response 0xffffffc2
Nov  4 17:38:09 leia kernel: [   17.241224] amdgpu: [powerplay] failed send message: TransferTableSmu2Dram (18)         param: 0x00000006 response 0xffffffc2
Nov  4 17:38:09 leia kernel: [   17.241224] amdgpu: [powerplay] Failed to export SMU metrics table!
Nov  4 17:38:11 leia kernel: [   19.747899] amdgpu: [powerplay] failed send message: SetDriverDramAddrHigh (14)         param: 0x00000080 response 0xffffffc2
Nov  4 17:38:14 leia kernel: [   22.254581] amdgpu: [powerplay] failed send message: SetDriverDramAddrHigh (14)         param: 0x00000080 response 0xffffffc2
Nov  4 17:38:14 leia kernel: [   22.254583] amdgpu: [powerplay] Failed to export SMU metrics table!
Nov  4 17:38:14 leia kernel: [   22.328723] kauditd_printk_skb: 30 callbacks suppressed
Nov  4 17:38:14 leia kernel: [   22.328724] audit: type=1400 audit(1572885494.198:42): apparmor="DENIED" operation="open" profile="/usr/sbin/mysqld" name="/sys/devices/system/node/" pid=2037 comm="mysqld" requested_mask="r" denied_mask="r" fsuid=0 ouid=0
Nov  4 17:38:14 leia kernel: [   22.342333] audit: type=1400 audit(1572885494.210:43): apparmor="DENIED" operation="capable" profile="/usr/sbin/mysqld" pid=2037 comm="mysqld" capability=2  capname="dac_read_search"
Nov  4 17:38:14 leia kernel: [   22.362019] audit: type=1400 audit(1572885494.230:44): apparmor="DENIED" operation="open" profile="/usr/sbin/mysqld" name="/sys/devices/system/node/" pid=2052 comm="mysqld" requested_mask="r" denied_mask="r" fsuid=121 ouid=0
Nov  4 17:38:16 leia kernel: [   24.734193] amdgpu: [powerplay] failed send message: SetDriverDramAddrHigh (14)         param: 0x00000080 response 0xffffffc2
Nov  4 17:38:19 leia kernel: [   27.211635] amdgpu: [powerplay] failed send message: SetDriverDramAddrHigh (14)         param: 0x00000080 response 0xffffffc2
Nov  4 17:38:19 leia kernel: [   27.211637] amdgpu: [powerplay] Failed to export SMU metrics table!
Nov  4 17:38:21 leia kernel: [   29.688247] amdgpu: [powerplay] failed send message: SetDriverDramAddrHigh (14)         param: 0x00000080 response 0xffffffc2
Nov  4 17:38:24 leia kernel: [   32.165118] amdgpu: [powerplay] failed send message: SetDriverDramAddrHigh (14)         param: 0x00000080 response 0xffffffc2
Nov  4 17:38:24 leia kernel: [   32.165119] amdgpu: [powerplay] Failed to export SMU metrics table!
Nov  4 17:38:24 leia kernel: [   32.165883] igb 0000:06:00.0 enp6s0: igb: enp6s0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
Nov  4 17:38:26 leia kernel: [   34.328291] igb 0000:06:00.0: exceed max 2 second
Nov  4 17:38:26 leia kernel: [   34.328468] IPv6: ADDRCONF(NETDEV_CHANGE): enp6s0: link becomes ready
Nov  4 17:38:27 leia kernel: [   35.244024] amdgpu: [powerplay] failed send message: GetMaxDpmFreq (31)         param: 0x00000000 response 0xffffffc2


Then I turned it off.
Maybe next time...
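
To see which SMU messages dominate a log like the one above, here is a minimal sketch that tallies the failures by message name. The regular expression is written against the line format shown in this post only; other kernel versions print the message differently, and the function name is made up for illustration.

```python
import re
from collections import Counter

# Matches the 5.4-rc6 style lines from the post, for example:
#   amdgpu: [powerplay] failed send message: TransferTableSmu2Dram (18)  param: ... response ...
FAILED_MSG = re.compile(
    r"\[powerplay\] failed send message:\s+(?P<name>\w+)\s+\((?P<id>\d+)\)"
)


def tally_failed_messages(lines):
    """Count failed SMU messages per message name in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = FAILED_MSG.search(line)
        if match:
            counts[match.group("name")] += 1
    return counts


sample = [
    "amdgpu: [powerplay] failed send message: TransferTableSmu2Dram (18)         param: 0x00000006 response 0xffffffc2",
    "amdgpu: [powerplay] Failed to export SMU metrics table!",
    "amdgpu: [powerplay] failed send message: SetDriverDramAddrHigh (14)         param: 0x00000080 response 0xffffffc2",
]
print(tally_failed_messages(sample).most_common())
```

On the excerpt above that comes to seven SetDriverDramAddrHigh failures, two TransferTableSmu2Dram failures and one GetMaxDpmFreq failure.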
Shmerl 4 Nov 2019
The breaking powerplay errors went away for me around rc5 or so. Maybe your firmware is not up to date?

It could also be a VBIOS issue with a specific card model. Feel free to comment on this in the bug: https://bugs.freedesktop.org/show_bug.cgi?id=111481

Powerplay issues are one of the subtopics in it.

Last edited by Shmerl on 4 Nov 2019 at 4:51 pm UTC
Tuxee 4 Nov 2019
Doubt that. I got these:

https://people.freedesktop.org/~agd5f/radeon_ucode/navi10/

And yes, I follow the discussion on freedesktop. I first ended up there when googling for the powerplay issues. The discussion there must be quite amusing for someone not affected. "Try another kernel", "try this Mesa version", "have you applied this patch?", "set these boot parameters", "maybe it's a PCIe 4 issue", "...could be NVMe related", "the crashes went away", "they are back - just not that often", "it definitely happens when something wants to get statistics from the GPU"...

It's always good to learn that you are not the only one affected. But in this very case I can't help the impression that everybody (including me, of course) is quite clueless.

I don't get ANY powerplay issues on my 5.3 kernel on Ubuntu 18.04. I get hundreds of them on my 5.3 kernel on 19.10. Mesa is from Oibaf in both cases.

On 18.04

gregor@leia:/lib/firmware/amdgpu$ dmesg | grep amdgpu
[    2.926479] [drm] amdgpu kernel modesetting enabled.
[    2.926599] amdgpu 0000:0c:00.0: remove_conflicting_pci_framebuffers: bar 0: 0xe0000000 -> 0xefffffff
[    2.926600] amdgpu 0000:0c:00.0: remove_conflicting_pci_framebuffers: bar 2: 0xf0000000 -> 0xf01fffff
[    2.926600] amdgpu 0000:0c:00.0: remove_conflicting_pci_framebuffers: bar 5: 0xfcb00000 -> 0xfcb7ffff
[    2.926602] fb0: switching to amdgpudrmfb from EFI VGA
[    2.926663] amdgpu 0000:0c:00.0: vgaarb: deactivate vga console
[    2.952688] amdgpu 0000:0c:00.0: No more image in the PCI ROM
[    2.952738] amdgpu 0000:0c:00.0: VRAM: 8176M 0x0000008000000000 - 0x00000081FEFFFFFF (8176M used)
[    2.952739] amdgpu 0000:0c:00.0: GART: 512M 0x0000000000000000 - 0x000000001FFFFFFF
[    2.952806] [drm] amdgpu: 8176M of VRAM memory ready
[    2.952807] [drm] amdgpu: 8176M of GTT memory ready.
[    3.738225] amdgpu: [powerplay] SMU is initialized successfully!
[    3.951684] fbcon: amdgpudrmfb (fb0) is primary device
[    4.037119] amdgpu 0000:0c:00.0: fb0: amdgpudrmfb frame buffer device
[    4.056080] amdgpu 0000:0c:00.0: ring 0(gfx_0.0.0) uses VM inv eng 4 on hub 0
[    4.056081] amdgpu 0000:0c:00.0: ring 1(gfx_0.1.0) uses VM inv eng 5 on hub 0
[    4.056082] amdgpu 0000:0c:00.0: ring 2(comp_1.0.0) uses VM inv eng 6 on hub 0
[    4.056083] amdgpu 0000:0c:00.0: ring 3(comp_1.1.0) uses VM inv eng 7 on hub 0
[    4.056083] amdgpu 0000:0c:00.0: ring 4(comp_1.2.0) uses VM inv eng 8 on hub 0
[    4.056084] amdgpu 0000:0c:00.0: ring 5(comp_1.3.0) uses VM inv eng 9 on hub 0
[    4.056084] amdgpu 0000:0c:00.0: ring 6(comp_1.0.1) uses VM inv eng 10 on hub 0
[    4.056085] amdgpu 0000:0c:00.0: ring 7(comp_1.1.1) uses VM inv eng 11 on hub 0
[    4.056086] amdgpu 0000:0c:00.0: ring 8(comp_1.2.1) uses VM inv eng 12 on hub 0
[    4.056086] amdgpu 0000:0c:00.0: ring 9(comp_1.3.1) uses VM inv eng 13 on hub 0
[    4.056087] amdgpu 0000:0c:00.0: ring 10(kiq_2.1.0) uses VM inv eng 14 on hub 0
[    4.056088] amdgpu 0000:0c:00.0: ring 11(sdma0) uses VM inv eng 15 on hub 0
[    4.056088] amdgpu 0000:0c:00.0: ring 12(sdma1) uses VM inv eng 16 on hub 0
[    4.056089] amdgpu 0000:0c:00.0: ring 13(vcn_dec) uses VM inv eng 4 on hub 1
[    4.056089] amdgpu 0000:0c:00.0: ring 14(vcn_enc0) uses VM inv eng 5 on hub 1
[    4.056090] amdgpu 0000:0c:00.0: ring 15(vcn_enc1) uses VM inv eng 6 on hub 1
[    4.056091] amdgpu 0000:0c:00.0: ring 16(vcn_jpeg) uses VM inv eng 7 on hub 1
[    4.056214] [drm] Initialized amdgpu 3.33.0 20150101 for 0000:0c:00.0 on minor 0


on 19.10 it looks like this...

Nov  4 13:01:15 leia kernel: [    3.769236] [drm] Initialized amdgpu 3.33.0 20150101 for 0000:0c:00.0 on minor 0
Nov  4 13:03:32 leia kernel: [  140.916869] amdgpu: [powerplay] Failed to send message 0xe, response 0xfffffffb param 0x80
Nov  4 13:03:32 leia kernel: [  140.916873] amdgpu: [powerplay] Failed to export SMU metrics table!
Nov  4 13:04:17 leia kernel: [  185.932919] amdgpu: [powerplay] Failed to send message 0x12, response 0xfffffffb param 0x6
Nov  4 13:04:17 leia kernel: [  185.932923] amdgpu: [powerplay] Failed to export SMU metrics table!
Nov  4 13:04:32 leia kernel: [  200.936818] amdgpu: [powerplay] Failed to send message 0xf, response 0xfffffffb, param 0xfd6000
Nov  4 13:04:42 leia kernel: [  210.938902] amdgpu: [powerplay] Failed to send message 0xe, response 0xfffffffb, param 0x80
Nov  4 13:04:42 leia kernel: [  210.938910] amdgpu: [powerplay] Failed to send message 0xf, response 0xfffffffb param 0xfd6000
Nov  4 13:04:42 leia kernel: [  210.938913] amdgpu: [powerplay] Failed to export SMU metrics table!
Nov  4 13:04:47 leia kernel: [  215.940047] amdgpu: [powerplay] Failed to send message 0x12, response 0xfffffffb param 0x6
Nov  4 13:04:47 leia kernel: [  215.940050] amdgpu: [powerplay] Failed to export SMU metrics table!
Nov  4 13:05:07 leia kernel: [  235.943394] amdgpu: [powerplay] Failed to send message 0xe, response 0xfffffffb, param 0x80
Nov  4 13:05:07 leia kernel: [  235.943527] amdgpu 0000:0c:00.0: [mmhub] VMC page fault (src_id:0 ring:174 vmid:0 pasid:0)
Nov  4 13:05:07 leia kernel: [  235.943530] amdgpu 0000:0c:00.0:   at page 0x0000000000fd6000 from 18
Nov  4 13:05:07 leia kernel: [  235.943532] amdgpu 0000:0c:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x0004115C
Nov  4 13:05:10 leia kernel: [  238.703395] amdgpu: [powerplay] Failed to send message 0x12, response 0xffffffc2 param 0x6
Nov  4 13:05:10 leia kernel: [  238.703399] amdgpu: [powerplay] Failed to export SMU metrics table!
Nov  4 13:05:15 leia kernel: [  243.695494] amdgpu: [powerplay] Failed to send message 0xe, response 0xffffffc2, param 0x80
Nov  4 13:05:15 leia kernel: [  243.697307] amdgpu: [powerplay] Failed to send message 0xe, response 0xffffffc2, param 0x80
Nov  4 13:05:18 leia kernel: [  246.443323] amdgpu: [powerplay] Failed to send message 0xe, response 0xffffffc2 param 0x80
Nov  4 13:05:18 leia kernel: [  246.443327] amdgpu: [powerplay] Failed to export SMU metrics table!
Nov  4 13:05:18 leia kernel: [  246.443817] [drm:amdgpu_dm_commit_planes.constprop.0 [amdgpu]] *ERROR* Waiting for fences timed out or interrupted!
Nov  4 13:05:18 leia kernel: [  246.443902] [drm:amdgpu_dm_commit_planes.constprop.0 [amdgpu]] *ERROR* Waiting for fences timed out or interrupted!
Nov  4 13:05:18 leia kernel: [  246.448969] amdgpu: [powerplay] Failed to send message 0xe, response 0xffffffc2 param 0x80
Nov  4 13:05:18 leia kernel: [  246.448974] amdgpu: [powerplay] Failed to export SMU metrics table!
Nov  4 13:05:20 leia kernel: [  248.699575] amdgpu: [powerplay] Failed to send message 0xe, response 0xffffffc2, param 0x80
Nov  4 13:05:20 leia kernel: [  248.700054] amdgpu: [powerplay] Failed to send message 0xe, response 0xffffffc2, param 0x80
Nov  4 13:05:23 leia kernel: [  251.451212] amdgpu: [powerplay] Failed to send message 0xe, response 0xffffffc2 param 0x80
Nov  4 13:05:23 leia kernel: [  251.451216] amdgpu: [powerplay] Failed to export SMU metrics table!
Nov  4 13:05:23 leia kernel: [  251.454606] amdgpu: [powerplay] Failed to send message 0xe, response 0xffffffc2 param 0x80
Nov  4 13:05:23 leia kernel: [  251.454609] amdgpu: [powerplay] Failed to export SMU metrics table!
Nov  4 13:05:25 leia kernel: [  253.693905] amdgpu: [powerplay] Failed to send message 0xe, response 0xffffffc2, param 0x80
Nov  4 13:05:25 leia kernel: [  253.699553] amdgpu: [powerplay] Failed to send message 0xe, response 0xffffffc2, param 0x80
Nov  4 13:05:28 leia kernel: [  256.446574] amdgpu: [powerplay] Failed to send message 0xe, response 0xffffffc2 param 0x80
Nov  4 13:05:28 leia kernel: [  256.446577] amdgpu: [powerplay] Failed to export SMU metrics table!
Nov  4 13:05:28 leia kernel: [  256.452207] amdgpu: [powerplay] Failed to send message 0xe, response 0xffffffc2 param 0x80
Nov  4 13:05:28 leia kernel: [  256.452210] amdgpu: [powerplay] Failed to export SMU metrics table!
Nov  4 13:05:30 leia kernel: [  258.700728] amdgpu: [powerplay] Failed to send message 0xe, response 0xffffffc2, param 0x80
Nov  4 13:05:30 leia kernel: [  258.701184] amdgpu: [powerplay] Failed to send message 0xe, response 0xffffffc2, param 0x80
Nov  4 13:05:33 leia kernel: [  261.452346] amdgpu: [powerplay] Failed to send message 0xe, response 0xffffffc2 param 0x80
Nov  4 13:05:33 leia kernel: [  261.452350] amdgpu: [powerplay] Failed to export SMU metrics table!
Nov  4 13:05:33 leia kernel: [  261.455480] amdgpu: [powerplay] Failed to send message 0xe, response 0xffffffc2 param 0x80
Nov  4 13:05:33 leia kernel: [  261.455483] amdgpu: [powerplay] Failed to export SMU metrics table!
Nov  4 13:05:35 leia kernel: [  263.716466] amdgpu: [powerplay] Failed to send message 0xe, response 0xffffffc2, param 0x80
Nov  4 13:05:35 leia kernel: [  263.729459] amdgpu: [powerplay] Failed to send message 0xe, response 0xffffffc2, param 0x80
Nov  4 13:05:38 leia kernel: [  266.491442] amdgpu: [powerplay] Failed to send message 0xe, response 0xffffffc2 param 0x80
Nov  4 13:05:38 leia kernel: [  266.491445] amdgpu: [powerplay] Failed to export SMU metrics table!
Nov  4 13:05:38 leia kernel: [  266.503670] amdgpu: [powerplay] Failed to send message 0xe, response 0xffffffc2 param 0x80
Nov  4 13:05:38 leia kernel: [  266.503674] amdgpu: [powerplay] Failed to export SMU metrics table!
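
The 19.10 log prints message IDs and responses as raw hex, which makes it hard to compare with the 5.4-rc6 log posted earlier. Here is a minimal sketch that decodes them. The message names are taken from that earlier log; reading the response field as a negative Linux errno is an assumption, not something stated in the thread.

```python
def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed one."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


# Names as printed next to the decimal IDs in the 5.4-rc6 log.
MESSAGE_NAMES = {
    14: "SetDriverDramAddrHigh",
    18: "TransferTableSmu2Dram",
    31: "GetMaxDpmFreq",
}

# Linux errno numbers (assumption: the response is a negated errno).
ERRNO_NAMES = {5: "EIO", 62: "ETIME"}


def describe(message_hex, response_hex):
    """Turn e.g. ('0xe', '0xfffffffb') into a readable summary."""
    msg_id = int(message_hex, 16)
    code = as_signed32(int(response_hex, 16))
    name = MESSAGE_NAMES.get(msg_id, "unknown")
    err = ERRNO_NAMES.get(-code, "unknown")
    return f"message {msg_id} ({name}), response {code} ({err})"


print(describe("0xe", "0xfffffffb"))
print(describe("0x12", "0xffffffc2"))
```

Under that reading, 0xe and 0x12 are the same SetDriverDramAddrHigh (14) and TransferTableSmu2Dram (18) messages that fail on 5.4-rc6, and the response changes from -5 to -62 partway through the log. Message 0xf (15) has no name in either log, so it stays "unknown".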
Shmerl 4 Nov 2019
Can't say anything about Ubuntu, I'm using Debian testing. Until a while ago, powerplay was causing problems (stalls) when sensors were read concurrently. Then it was fixed. I still see some errors in dmesg when using lm-sensors with amdgpu, but at least they aren't breaking anything.

Last edited by Shmerl on 4 Nov 2019 at 6:23 pm UTC
Tuxee 4 Nov 2019
Anyway, thanks for your input.
Pangaea 4 Nov 2019
First time I heard about these problems was in the "my PSU fried" thread. No reviews or tests or anything mention this -- which I find odd given how common they appear to be. Thanks for making the thread, it's good to get more info about this out there.

For quite a while I've planned to fork out a big pile of money for a new rig based on AMD GPU and CPU. But seriously, huge stability issues with totally basic stuff like using Firefox and Nemo? That's a huge turn-off, and totally unacceptable. Although Nvidia is overpriced, I've started to look at them instead. So far I've not had any problems with them, either on Linux or back in the Windows days. Closed drivers or not, they simply work.

All this talk about needing to build kernels, this or that driver, patches -- it's a problem for people like me who aren't super confident and knowledgeable. And as mentioned already, this is exactly what the Windows people say about Linux. Which incidentally isn't how the experience has been so far. Generally Linux simply works, and that's that. AMD need to get their shit together -- and fast. The products may be great, especially on the CPU side, but when the PC becomes terribly unstable, then it's a no-go.
Shmerl 4 Nov 2019
Quoting: Pangaea
All this talk about needing to build kernels, this or that driver, patches -- it's a problem for people like me who aren't super confident and knowledgeable. And as mentioned already, this is exactly what the Windows people say about Linux. Which incidentally isn't how the experience has been so far. Generally Linux simply works, and that's that. AMD need to get their shit together -- and fast. The products may be great, especially on the CPU side, but when the PC becomes terribly unstable, then it's a no-go.
If I remember the previous iteration with Vega correctly, it was also quite unstable for at least one kernel release cycle. When I switched to Vega, it was already rock solid, so I didn't encounter that period at all, but others mentioned it.

So it seems it could be a pattern. AMD releases support in kernel a.b. If you want a stable out-of-the-box experience and aren't interested in building the kernel and the like, wait until at least kernel a.b+1 before using that hardware. Stick to the older one until then. I.e. in the case of Navi, initial support is 5.3, so stabilized support would be 5.4 at the earliest.
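
The rule of thumb above can be written down as a tiny check. This is only a sketch of the heuristic as described: the table holds just the Navi entry given in this thread, and the names are invented for illustration.

```python
# Kernel release with initial support, as stated in the thread.
INITIAL_SUPPORT = {"navi": (5, 3)}


def kernel_advice(family, running):
    """Classify a running kernel (major, minor) against the a.b / a.b+1 rule."""
    first = INITIAL_SUPPORT[family]
    if running < first:
        return "no support yet"
    if running == first:
        return "initial support, expect instability"
    return "at least one release cycle past initial support"


print(kernel_advice("navi", (5, 3)))
print(kernel_advice("navi", (5, 4)))
```

Comparing (major, minor) tuples rather than adding 1 to the minor number also covers a major version bump, where the release after a.b is not literally a.b+1.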

In order to avoid this period gap, AMD would need to beef up their support team.

For me, it's surely no reason to ever go back to Nvidia.

Last edited by Shmerl on 4 Nov 2019 at 7:00 pm UTC
YoRHa-2B 7 Nov 2019
  • DXVK
Quoting: Tuxee
I'm all for Open Source but at this point I can only recommend NVidia graphic cards with their proprietary drivers.
The same is true for Windows as well. I only use that OS to run game benchmarks these days, but the sheer number of issues I've had with the graphics drivers in the past four months (on an RX 480, not even Navi!) is beyond silly. I hardly ever had problems before.

They managed to break D3D9 to the point where some games that were working fine before now only run with D9VK. They released a driver advertising support for The Outer Worlds which [broke The Outer Worlds](https://www.reddit.com/r/Amd/comments/dmxstl/amd_the_outer_worlds_driver_is_bugged_check_for/). Their official Vulkan drivers are still a mess and Red Dead Redemption 2 apparently doesn't even render correctly on Navi GPUs. Hell, they even managed to break the Windows 10 login screen at some point.

No problems with Polaris on Linux right now - which, by the way, I got a mere two months after it launched and was usable right away, even on stable kernel and Mesa versions (there were a few bugs, but nothing too terrible) - but it's impossible to recommend AMD GPUs at the moment. They are doing the best they can to live up to the memes of their drivers being shit, and if they don't get it together some time next year, I'll have no choice but to jump ship again.

Last edited by YoRHa-2B on 7 Nov 2019 at 11:42 pm UTC
Shmerl 7 Nov 2019
Quoting: YoRHa-2B
No problems with Polaris on Linux right now - thank god - but it's impossible to recommend AMD GPUs at the moment. They are doing the best they can to live up to the memes of their drivers being shit, and if they don't get it together some time next year, I'll have no choice but to jump ship again.
Didn't AMD report increased profits? I hope at least some of that will translate into better support.