i.e.:

sCCX | dCCX | sCPU | dCPU | Result
-----|------|------|------|-------------------------
  0  |  0   |  0   |  16  | Fast - expected
  0  |  0   |  0   |  1   | Fast - expected
  0  |  1   |  0   |  8   | Slow - problem
  0  |  1   |  0   |  24  | Slow - problem
  0  |  1   |  0   |  12  | Slower - expected though
  0  |  1   |  0   |  28  | Slower - expected though
I can go into more detail about the test itself, but it's basically a timed busy loop in bash. CPU threads set realtime, nohz, etc. - everything I could do to isolate them. While not scientific (+/- 40 ms), it doesn't have to be when one is looking at a 480 ms difference.
I also own a 1950X, with the ASRock Fatal1ty X399 Professional Gaming motherboard.
I haven't had any problems at all so far with everything I have thrown at this monster :P
I am running it at 3.6 GHz on all cores, memory at 3200 MHz.
(My only problem is that it gets a bit hot when compiling stuff - compiling Unreal Engine, for example, takes it up to 85°C with a water cooler; idle ranges from 35-50°C.)
I am also using the latest BIOS from ASRock:
http://www.asrock.com/mb/AMD/Fatal1ty%20X399%20Professional%20Gaming/index.asp#BIOS
Here is an inxi; in case you need me to test something, let me know.
#!/bin/bash
sysp=/sys/bus/cpu/devices

# Pin a CPU's minimum frequency to its hardware maximum (force full clock).
fast() {
    cpu=cpu$1
    read -r ttfreq < "$sysp/$cpu/cpufreq/cpuinfo_max_freq"
    echo "$ttfreq" > "$sysp/$cpu/cpufreq/scaling_min_freq"
}

# Drop a CPU's minimum frequency back to its hardware minimum.
slow() {
    cpu=cpu$1
    read -r ttfreq < "$sysp/$cpu/cpufreq/cpuinfo_min_freq"
    echo "$ttfreq" > "$sysp/$cpu/cpufreq/scaling_min_freq"
}

# Pin a PID ($1) to a CPU ($2) and raise its priority.
priority() {
    taskset -pc "$2" "$1"
    renice -20 -p "$1"    # was "renice -20 -p $2": renice takes the PID, not the CPU
}

# Report the clocks we're pinned at, then burn time in a busy loop.
doit() {
    read -r ttcpu < "$sysp/cpu${1}/cpufreq/scaling_min_freq"
    read -r tpcpu < "$sysp/cpu${2}/cpufreq/scaling_min_freq"
    printf "OUT: Parent:%s:%s Target:%s:%s\n" "$pcpu" "$tpcpu" "$tcpu" "$ttcpu"
    # for i in {1..200000}
    for i in {1..50000}
    do
        :
    done
}

usage() {
    printf "%s" "
$@ Required:
    --target
    --source
"
    exit 1
}

## --target [ fast | slow ]:TargetCPU
## --source [ fast | slow ]:SourceCPU
get_args() {
    while [[ "$1" ]]; do
        case "$1" in
            "--target") tcpu=${2/*:/}; tcmd=${2/:*/} ;;
            "--source") pcpu=${2/*:/}; pcmd=${2/:*/} ;;
            *) printf "Unknown option: %s\n" "$1"; usage ;;
        esac
        shift 2
    done
}

# Unused positional variant of get_args, kept for reference.
zz() {
    tcpu=${1/*:/}
    tcmd=${1/:*/}
    shift
    pcpu=${1/*:/}
    pcmd=${1/:*/}
}

if [ $# -eq 4 ]    # was "-le 4", which also let an empty command line through
then
    printf "## %s\n" "$*"
    get_args "$@" || exit $?
    $tcmd "$tcpu"             # set the target CPU's clock (fast/slow)
    priority $$ "$tcpu"       # pin this script to the target CPU
    $pcmd "$pcpu"             # set the source (parent) CPU's clock
    priority $PPID "$pcpu"    # pin the parent shell to the source CPU
    doit "$tcpu" "$pcpu"
    slow "$pcpu"              # reset clocks when done
    slow "$tcpu"
else usage
fi
Invocation:

Local
Start on CPU4 @ slowest speed
Test on CPU2 @ slowest speed
perf stat -d -d -d ./child.sh --target slow:2 --source slow:4 2>&1 | egrep '(OUT:|task-clock)' | xargs

CCX jump
Start on CPU4 @ slowest speed
Test on CPU25 @ slowest speed
perf stat -d -d -d ./child.sh --target slow:25 --source slow:4 2>&1 | egrep '(OUT:|task-clock)' | xargs
OUT: Parent:4:2200000 Target:25:2200000 278.624873 task-clock (msec) # 0.998 CPUs utilized

Same CCX jump, faster clock
Start on CPU4 @ slowest speed
Test on CPU25 @ fast speed
perf stat -d -d -d ./child.sh --target fast:25 --source slow:4 2>&1 | egrep '(OUT:|task-clock)' | xargs
OUT: Parent:4:2200000 Target:25:4100000 140.126839 task-clock (msec) # 0.997 CPUs utilized
Example output:
OUT: Parent:4:2200000 Target:2:2200000 130.223193 task-clock (msec) # 0.994 CPUs utilized

Longer version:
# perf stat -d -d -d ./child.sh --target slow:2 --source slow:4 2>&1 | egrep '(OUT:|task-clock)' | xargs
OUT: Parent:4:2200000 Target:2:2200000 130.223193 task-clock (msec) # 0.994 CPUs utilized
# perf stat -d -d -d ./child.sh --target slow:25 --source slow:4 2>&1 | egrep '(OUT:|task-clock)' | xargs
OUT: Parent:4:2200000 Target:25:2200000 278.624873 task-clock (msec) # 0.998 CPUs utilized
# perf stat -d -d -d ./child.sh --target fast:25 --source slow:4 2>&1 | egrep '(OUT:|task-clock)' | xargs
OUT: Parent:4:2200000 Target:25:4100000 140.126839 task-clock (msec) # 0.997 CPUs utilized

## do all the things...
for spd in fast slow; do for target in $spd:{0..31}; do perf stat -d -d -d ./child.sh --target $target --source slow:4 2>&1 | egrep '(OUT:|task-clock)' | xargs ; done ; done
Explanation:
OUT: Parent:4:2200000 - CPU and clock we're on now
Target:2:2200000 - CPU and clock we're testing
130.223193 task-clock (msec) # 0.994 CPUs utilized - how long it took
High variance in the time is what I'm looking for - there shouldn't be much.
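For a concrete read on those numbers: at the same pinned 2.2 GHz clock, the cross-CCX run takes roughly twice as long as the intra-CCX one. A quick sanity calculation, just reusing the two task-clock figures from the perf output above:

```shell
# Intra-CCX (130.2 ms) vs cross-CCX (278.6 ms) task-clock
# at the same 2.2 GHz pinned clock, from the runs above.
intra=130.223193
cross=278.624873
awk -v a="$intra" -v b="$cross" \
    'BEGIN { printf "penalty: %.1f ms (%.2fx)\n", b - a, b / a }'
# -> penalty: 148.4 ms (2.14x)
```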
There are a lot of assumptions made that aren't scripted, since it was a quick test. I'm assuming, for example, that the current cpufreq governor is ondemand, though I've seen the same with conservative. Especially if you test with fast but find the clock drops back down, it could be the governor, or throttling (neither of which is accounted for atm).
Though I try to pin the test as much as I can, some CPUs are isolated on boot. GRUB line:
nohz_full=0,16,1,17,8,24,9,25,10,26,11,27
rcu_nocbs=0,16,1,17,8,24,9,25,10,26,11,27
isolcpus=0,16,1,17,8,24,9,25,10,26,11,27
That is intentional as I pin vms. It also doesn't really affect the test from what I can tell.
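If anyone wants to check their own setup, the isolation can be verified at runtime too - these are standard procfs/sysfs paths, nothing board-specific:

```shell
# Confirm the boot-time isolation flags actually took effect.
if [ -r /sys/devices/system/cpu/isolated ]; then
    cat /sys/devices/system/cpu/isolated
fi
tr ' ' '\n' < /proc/cmdline | grep -E 'isolcpus|nohz_full|rcu_nocbs' \
    || echo "no isolation flags on the kernel command line"
```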
Some other things to note:
- X399 AORUS Gaming 7 board
- There is _zero_ scripted thermal monitoring
- BIOS supports setting custom power states
- c6 is disabled thus 2.2GHz is the lowest clock for me, ymmv.
- I can hit faster OC but not needed to validate the test
Clocks are intentionally reset to the lowest atm due to the way Ryzen works - not all cores can run at full OC/XFR, ymmv. That shouldn't be a problem with most governors unless you've pinned them higher; just something to be aware of. I should check what they were before changing, but I'm lazy.
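On the "should check what they were before changing" point, here's a small sketch of snapshotting every scaling_min_freq up front and restoring the originals on exit (standard cpufreq sysfs layout; needs root to actually write back):

```shell
#!/bin/bash
# Snapshot each CPU's scaling_min_freq, restore the originals on exit.
declare -A saved
for f in /sys/bus/cpu/devices/cpu*/cpufreq/scaling_min_freq; do
    if [ -r "$f" ]; then
        read -r "saved[$f]" < "$f"
    fi
done
restore() {
    local f
    for f in "${!saved[@]}"; do
        echo "${saved[$f]}" > "$f" 2>/dev/null || true
    done
}
trap restore EXIT
# ...run the fast/slow test here; the originals come back automatically...
```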
What kinda makes this worse is that it's running in UMA / "Creator mode", _NOT_ NUMA / "Gaming".
perf stat -d -d -d ./child.sh --target slow:2 --source slow:4 2>&1 | egrep '(OUT:|task-clock)' | xargs
OUT: Parent:4:2200000 Target:2:2200000 99.580689 task-clock (msec) # 0.740 CPUs utilized
perf stat -d -d -d ./child.sh --target slow:15 --source slow:4 2>&1 | egrep '(OUT:|task-clock)' | xargs
OUT: Parent:4:2200000 Target:15:2200000 102.475220 task-clock (msec) # 0.995 CPUs utilized
perf stat -d -d -d ./child.sh --target fast:15 --source slow:4 2>&1 | egrep '(OUT:|task-clock)' | xargs
OUT: Parent:4:2200000 Target:15:3700000 96.033168 task-clock (msec) # 0.992 CPUs utilized
But maybe your issue is Threadripper-specific.
0-> {0..3,8..15}
1-> {4..7,16..23}
It's interesting you still see a few ms gain though.
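For reference, those {0..3,8..15}-style lists can be read straight from sysfs, which is where the kernel publishes each node's CPU set:

```shell
# Print each NUMA node's cpulist as the kernel sees it.
for n in /sys/devices/system/node/node*; do
    if [ -r "$n/cpulist" ]; then
        echo "${n##*/} -> $(cat "$n/cpulist")"
    fi
done
```

On a machine in UMA mode this collapses to a single node0 entry covering every CPU.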
lstopo is a neat tool - never heard of it before :)
Assuming it's:

   CCX0    |    CCX1
-------------------------
 0  1  2  3 |  4  5  6  7
 8  9 10 11 | 12 13 14 15
-------------------------
lstopo/hwloc is quite handy indeed. It misses some things, like identifying nvme drives (it lists the bus of course), but you can export to XML and add whatever you want. I have VMs mapped, for example. Almost like porn on massive servers :P
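For anyone wanting to try the XML route, a sketch of the round-trip (flags as documented in hwloc's lstopo; the filename is arbitrary):

```shell
# Round-trip the topology through XML so you can annotate it by hand.
if command -v lstopo >/dev/null; then
    lstopo --of xml topo.xml                 # dump the live topology to XML
    # ...edit topo.xml: add VM pins, which nvme is which, etc...
    lstopo --input topo.xml --of console     # render the annotated copy
    rm -f topo.xml
fi
```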
Don't know why the forum is eating that ascii. It looks fine on preview but post gets garbled.. hm.
I've seen something quite similar with Samsung NVMe drives. The latency remains high because the drive stays in a lower power state.
From ioping: the drop comes after starting dd in another terminal.
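If it's the same effect, the drive's APST (autonomous power state transition) behaviour is worth a look. A sketch assuming nvme-cli is installed and /dev/nvme0 is your drive:

```shell
# Inspect the NVMe power state table and the APST feature (id 0x0c).
if command -v nvme >/dev/null && [ -e /dev/nvme0 ]; then
    nvme id-ctrl /dev/nvme0 | grep -A8 '^ps '   # power states the drive offers
    nvme get-feature /dev/nvme0 -f 0x0c -H      # current APST settings
fi
# A blunt test knob: boot with nvme_core.default_ps_max_latency_us=0
# to keep the drive out of deep power states entirely.
```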
I ended up setting a 6 MiB offset for partitions to account for some potentially weird erase block sizes (like the 1536 KiB one). And fio produces some of [these results](https://www.reddit.com/r/linuxquestions/comments/8hzz20/how_to_configure_samsung_evo_970_for_optimal/).
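The 6 MiB figure works out because it's an exact multiple of that odd 1536 KiB erase block (4 x 1536 KiB = 6 MiB), so it stays aligned for the usual 512 KiB / 1 MiB sizes too. Quick check:

```shell
# Verify a 6 MiB partition start is aligned to a 1536 KiB erase block.
offset=$((6 * 1024 * 1024))     # 6 MiB partition start, in bytes
eb=$((1536 * 1024))             # the odd 1536 KiB erase block size
echo $(( offset % eb ))         # -> 0, i.e. aligned
```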
See:
* https://flashdba.com/2014/06/20/understanding-flash-blocks-pages-and-program-erases/
* https://superuser.com/questions/1243559/is-partition-alignment-to-ssd-erase-block-size-pointless
* https://forums.anandtech.com/threads/samsung-tlc-erase-block-sizes.2448833/