While doing some comparative benchmarks between my RX 470 and GTX 1060 on a Ryzen 1700 CPU and an i7-2700k CPU, I encountered odd behaviour with Shadow of Mordor.

At 1080p on the high preset, this benchmark is almost exclusively CPU-bound on both a Ryzen 1700 (3.75 GHz) and an i7-2700k (4.2 GHz). So when I got 30 to 40% better performance on the i7 than on the Ryzen with the GTX 1060, I was shocked and began to investigate what was causing such a performance drop on Ryzen.

Interestingly, on Ryzen, the performance of the GTX 1060 and the RX 470 was identical in the CPU-bound parts of the benchmark, even though AMD’s open source driver (Mesa 17.2-git in this case) still has a significantly higher CPU overhead than Nvidia's proprietary driver. This pointed to a driver-independent bottleneck in the game itself.

With that information, I started suspecting a thread allocation problem, either from the Linux kernel (4.12rc1) or from the game (if it forces the scheduling through CPU affinity).

You see, Ryzen has a specific architecture, quite different from Intel's i5 and i7. Ryzen is a bit like CPU Lego, with the CCX as the base building block. A CCX (core complex) comprises 4 CPU cores with SMT (simultaneous multithreading) and the associated memory caches (level 1 to 3). A mainstream Ryzen CPU is made of 2 CCXes linked by AMD’s Infinity Fabric (a high speed communication channel). Even the 4-core Ryzen parts are built this way (on these CPUs, two cores are disabled in each CCX).
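On Linux, this grouping can usually be checked from sysfs. The following is a minimal sketch, assuming that each CCX has its own level 3 cache (as described above) and that the kernel exposes the usual `cache/index*/level` and `shared_cpu_list` files under `/sys/devices/system/cpu`. The function names `parse_cpu_list` and `l3_groups` are illustrative only.

```python
import glob
import os


def parse_cpu_list(text):
    """Expand a kernel CPU list such as '0-3,8-11' into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def l3_groups(sysfs_root="/sys/devices/system/cpu"):
    """Return the distinct groups of logical CPUs that share a level 3 cache."""
    groups = set()
    pattern = os.path.join(sysfs_root, "cpu[0-9]*", "cache", "index*", "level")
    for level_path in glob.glob(pattern):
        with open(level_path) as f:
            if f.read().strip() != "3":
                continue  # skip L1 and L2 entries
        shared_path = os.path.join(os.path.dirname(level_path), "shared_cpu_list")
        with open(shared_path) as f:
            groups.add(tuple(parse_cpu_list(f.read())))
    return sorted(groups)


if __name__ == "__main__":
    # Assumption: one L3 group corresponds to one CCX on this CPU.
    for number, cpus in enumerate(l3_groups()):
        print(f"L3 group {number}: logical CPUs {list(cpus)}")
```

If the assumptions hold, a two-CCX Ryzen should report two groups, and the first group is the set of logical CPUs to keep the game on.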

If you’re interested in the subject, you can find more in-depth information here: Anandtech.com review of Ryzen

So how does this all relate to Shadow of Mordor? Well, AMD’s architecture is made to scale efficiently to high core counts (up to 32), but it has a drawback: communication between CPU cores that are not on the same CCX is slower because it has to go through the Infinity Fabric.

On a lot of workloads this won’t be a problem, because threads don’t need to communicate much (for example in video encoding, or serving web pages), but in games threads often need to synchronize with each other. So it’s better if interdependent threads are scheduled on the same CCX.

This is not happening with Shadow of Mordor, so performance takes a huge hit, as you can see in the graph below.
[Graph: FPS during the Shadow of Mordor benchmark, default vs. manual scheduling]
This graph shows the FPS observed on a Ryzen 1700 @ 3.75 GHz with an RX 470 during the automated benchmark of Shadow of Mordor. The blue line shows the FPS with the default scheduling and the red line with the game forced onto the first CCX. The yellow line shows the performance increase (in %) going from default to manual scheduling.

As you can see, manual scheduling yields roughly a 30% performance improvement in CPU-bound parts of the benchmark. Quite nice, eh?
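For anyone repeating the comparison with their own logs, the percentage gain is simply the difference between the two runs relative to the default run. A minimal sketch follows; the FPS samples in it are hypothetical values for illustration, not the data behind the graph.

```python
def percent_gain(default_fps, pinned_fps):
    """Per-sample gain of the pinned run over the default run, in percent."""
    return [(pinned - default) * 100.0 / default
            for default, pinned in zip(default_fps, pinned_fps)]


# Hypothetical samples, for illustration only (not the article's measurements).
default_fps = [60.0, 80.0, 100.0]
pinned_fps = [78.0, 104.0, 110.0]

for default, gain in zip(default_fps, percent_gain(default_fps, pinned_fps)):
    print(f"default {default:.0f} FPS -> gain {gain:.1f}%")
```

This assumes both runs were sampled at the same points of the benchmark, so that the samples line up one to one.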

So how does one manually schedule Shadow of Mordor on a Ryzen CPU?

It’s quite simple really. Just edit the launch options of the game in Steam like this:
taskset -c 0-7 %command%
This command forces the game onto logical cores 0-7, which are all located on the first CCX.
Note: due to SMT, there are twice as many logical cores as physical cores. This is because SMT allows two threads to run simultaneously on each physical core (though not both at full speed).

The above command is for an 8-core / 16-thread Ryzen CPU (model 1700 and higher).
On a 6-core Ryzen (models 1600/1600X), the command would be taskset -c 0-5 %command%, and on a 4-core Ryzen (models 1400/1500X), taskset -c 0-3 %command%

Caveat: on a 4-core Ryzen, limiting the game to the first CCX will only give it 2 cores / 4 threads to work with. This may prove insufficient and counter-productive compared to running the game with the default scheduling. You’ll have to try it for yourself to see which option gives the best performance.
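The CPU ranges above follow from simple arithmetic: half of the active cores sit on each CCX, and each core contributes two logical cores. Here is a minimal sketch of that calculation. It assumes the cores are split evenly across two CCXes and that the kernel numbers the first CCX's logical cores first, as described above; the function names are illustrative only.

```python
def first_ccx_cpu_list(total_cores, ccx_count=2, threads_per_core=2):
    """Return a taskset-style CPU list covering the first CCX.

    Assumes an even split of active cores across CCXes and that the
    logical cores of the first CCX are numbered starting from 0.
    """
    if total_cores <= 0 or total_cores % ccx_count != 0:
        raise ValueError("cores must split evenly across the CCXes")
    logical_cores = (total_cores // ccx_count) * threads_per_core
    return f"0-{logical_cores - 1}"


def steam_launch_option(total_cores):
    """Build the Steam launch option for a Ryzen with `total_cores` cores."""
    return f"taskset -c {first_ccx_cpu_list(total_cores)} %command%"


for cores in (8, 6, 4):
    print(f"{cores} cores: {steam_launch_option(cores)}")
```

It is worth checking the actual numbering on your own system before relying on these ranges, since the kernel or firmware may enumerate logical cores differently.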

Due to its specific architecture, Ryzen needs special care in thread scheduling from the OS and games. If you think a game is not performing as well as it should, you can try forcing the scheduling onto the first CCX and see if it improves performance. In my (admittedly limited) experience though, Shadow of Mordor is the only game where manual scheduling mattered. The Linux scheduler usually does a pretty good job.
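For a game that is already running, the same restriction can be applied after the fact. The sketch below is a Linux-only illustration, assuming `/proc` is mounted and that you are allowed to change the target process; the helper name `pin_all_threads` is made up for this example. It walks the process's threads and sets the affinity of each one, which is roughly what taskset does with its -a and -p options.

```python
import os


def pin_all_threads(pid, cpus):
    """Restrict every thread of process `pid` to the logical cores in `cpus`.

    Returns the number of threads whose affinity was updated.
    """
    updated = 0
    for tid in os.listdir(f"/proc/{pid}/task"):
        try:
            # On Linux, sched_setaffinity accepts a thread ID as well as a PID.
            os.sched_setaffinity(int(tid), cpus)
            updated += 1
        except ProcessLookupError:
            pass  # the thread exited while we were iterating
    return updated


if __name__ == "__main__":
    # Harmless demonstration: re-apply the current affinity to this process.
    allowed = os.sched_getaffinity(0)
    count = pin_all_threads(os.getpid(), allowed)
    print(f"updated {count} thread(s), allowed logical cores: {sorted(allowed)}")
```

To try it on a game, you would call something like `pin_all_threads(game_pid, range(0, 8))` with the game's process ID. Threads created afterwards inherit the affinity of the thread that spawns them.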
Comments

TheRiddick 27 May 2017 at 8:33 pm UTC
Interesting, I have a Ryzen 1600 system I have yet to build (waiting on ITX mobos), so this will be handy info for when the time comes. I think games like this need to start including these things in their launch scripts so people don't suffer, because most people will not know about such issues or how to fix them.
Samsai 27 May 2017 at 8:33 pm UTC
This is a fantastic piece of research right here. Mad respect for the guest writer!
Xpander 27 May 2017 at 8:35 pm UTC
nothing new
this has been the fix for stupidly programmed games/programs with FX and is now with ryzen as well

but great to see some in depth details about it


Last edited by Xpander at 27 May 2017 at 8:37 pm UTC
cip91sk 27 May 2017 at 8:56 pm UTC
Thanks for the heads up! However, I remember reading that the Infinity Fabric's speed depends on RAM frequency, and that at ~3200+ MHz games weren't affected as much, so can you tell us what configuration you are using? And, if possible, can you try again at a higher RAM frequency?
Naib 27 May 2017 at 9:20 pm UTC
This is interesting but a worry...
This would point towards applications needing to be RYZEN friendly which sure is a POSSIBILITY for applications that can still be updated.
There is an awful lot that are not. Equally applications should NOT be made to function on a universal state machine (AKA generic CPU), the universal state machine is meant to accept commands.

This sort of thing needs to finally be resolved by AMD be it via fixing the die (shame for those with ryzen...), fixing ucode, patches to kernel.
Ehvis 27 May 2017 at 9:24 pm UTC
Is this really necessary on a 4 core Ryzen? I would have thought they'd use only one CCX on that one. That would actually have been the big advantage.
Hal_Kado 27 May 2017 at 10:21 pm UTC
Quoting Ehvis: Is this really necessary on a 4 core Ryzen? I would have thought they'd use only one CCX on that one. That would actually have been the big advantage.

I agree, the lower core count Ryzens seem a little less attractive because of this design. Although I suppose it depends on the workload, and I would assume future game titles may optimize for this better.
mad_mesa 27 May 2017 at 10:32 pm UTC
Quoting Ehvis: Is this really necessary on a 4 core Ryzen? I would have thought they'd use only one CCX on that one. That would actually have been the big advantage.

The current quad core parts seem like a stop gap, since the SOC chips are almost certain to be single CCX and it's doubtful AMD has that many chips that have two cores fail on each CCX.
I won't be surprised if they simply retire the quad core version of the Ryzen in favor of just offering the SOC chips at those price points, or if we see something like the low-end socket FM2 Athlons that were just the APUs with the GPU portion of the chip disabled.
berillions 27 May 2017 at 11:46 pm UTC
So ...
I would like to buy a Ryzen 5 1400, but with this problem, which is the better deal:
- To buy this processor despite this issue in this game, and maybe in future games, even if a fix exists?
- To buy an Intel processor?

There are a lot of problems with Ryzen processors on Linux; I don't know what I should do ... Intel or AMD processor ...
F.Ultra 27 May 2017 at 11:53 pm UTC
What's the output from something like cat /proc/cpuinfo |egrep "processor|physical id|core id" | sed 's/^processor/\nprocessor/g' on a Ryzen or FX? Just curious how they tell the system which cores share the same CCX, which is important to know if you want to optimize your code for architectures such as this.