Latest Comments by F.Ultra
Ubuntu flavours to drop Flatpak by default and stick to Snaps
23 Feb 2023 at 1:14 pm UTC Likes: 4

Quoting: whizseI can't say I have a stake in this, but it's interesting to ponder that Ubuntu seems to have a habit of betting on the wrong horse, Mir and upstart, for example.
I wish that people would stop bringing up Upstart in discussions like these. Upstart predates systemd by four years and was, at the time, the best candidate to replace the ancient SysVinit, which is why many, including Red Hat and Chromebooks, moved over to Upstart.

Shader cache downloads being a nuisance? Valve may have solved it
16 Feb 2023 at 8:20 am UTC

Anyone know how this all works with games like Callisto Protocol, where Steam downloads shader caches only for the game to rebuild the shaders anyway on launch? It sounds like the two systems are fighting each other, and perhaps the Steam shader cache should be disabled for games like this. Or does it still help in some way that I don't understand?

You may want to run system updates, after a recent sudo security flaw
16 Feb 2023 at 8:07 am UTC

For Ubuntu, fixed packages for this were released on 2023-01-16, more info: https://ubuntu.com/security/CVE-2023-22809

Red Hat released patched versions on 2023-01-23, https://access.redhat.com/security/cve/CVE-2023-22809

Debian released patched versions on 2023-01-23, https://security-tracker.debian.org/tracker/CVE-2023-22809

I could not find any info for Arch on https://security.archlinux.org/ but, judging from their package database, they released patched versions around 2023-02-10 and 2023-02-15.

Most others probably follow the releases above, as they are usually based on Debian or Ubuntu.

AMD reveal Ryzen 7000 X3D processors, desktop 65W CPUs and new mobile chips
10 Jan 2023 at 12:56 pm UTC

Quoting: Shmerl
Quoting: F.UltraThe problem space here is that both cores are high performance, just in different ways. I mean trying to determine if your thread/application would benefit from a higher clock or a larger cache is something that takes endless long benchmarks in numerous runs for application developers today (to determine which cpu to recommend to the enterprise to run the system on).
I wonder if AI can help with a scheduler for that. It feels like a prediction problem based on some moving sample input.

AMD are now adding AI chips to some of their APUs, so maybe this can even be hardware accelerated in the future.
Possibly

AMD reveal Ryzen 7000 X3D processors, desktop 65W CPUs and new mobile chips
9 Jan 2023 at 8:08 pm UTC

Quoting: Shmerl
Quoting: F.UltraHow? I cannot think of how a scheduler can know which thread needs larger L3 vs higher clock frequency.
That's my thought too, but I think this so-called "big little" issue has existed for a while (it started with ARM?) and maybe there was some prior work on that front for Linux in regards to asymmetric cache?

I also wonder what will happen if the scheduler is unaware of any of that and scheduling is effectively random. I.e. which will perform better, the 7950X or the 7950X3D? Some thorough benchmarks comparing them will certainly be needed.
ARM big.LITTLE is an easy problem in this complex problem space. There the scheduler has the "simple" choice of whether a thread should run on a high-performance CPU or a low-performance one, and it makes that choice from collected metrics (which, after all these years, are still far from perfect). Alder Lake has a very similar design but adds metric collection in hardware (Intel Thread Director), though AFAIK this is still far from perfect even in Windows 11, where compile jobs sometimes get scheduled onto the E-cores and take 55 minutes to complete instead of 17.

The problem space here is that both types of core are high performance, just in different ways. I mean, determining whether your thread/application would benefit more from a higher clock or a larger cache is something that today takes application developers endlessly long benchmarks over numerous runs (to determine which CPU to recommend for the enterprise to run the system on).

My guess is that Microsoft (at this point in time AMD has only talked to Microsoft, AFAIK) will simply, if they do anything at all, try to detect whether the application is a game and, if so, run it on the larger-cache cores while running everything else on the higher-boost cores.

To really benefit here, app/game developers would have to benchmark this individually and set the thread affinity themselves, but given the number of combinations, and given that this will probably be a niche CPU, I have a hard time seeing that being done.
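For anyone curious what "setting the thread affinity" looks like in practice, here is a minimal Python sketch. The core-to-CCD mapping and the `pick_ccd` heuristic are made-up assumptions for illustration; on real hardware the core numbering varies by BIOS and kernel, so check `lscpu -e` or /sys/devices/system/cpu first.

```python
import os

# Hypothetical core layout for a dual-CCD part; the exact IDs are an
# assumption, not real 7950X3D numbering.
CACHE_CCD = frozenset(range(0, 8))    # cores attached to the extra L3
FREQ_CCD = frozenset(range(8, 16))    # cores with the higher boost clocks

def pick_ccd(workload: str) -> frozenset:
    """Crude heuristic like the one speculated above: games get the
    extra cache, everything else gets the higher clocks."""
    return CACHE_CCD if workload == "game" else FREQ_CCD

def pin_current_process(workload: str) -> None:
    # os.sched_setaffinity(0, cpus) pins the calling process (Linux only);
    # this raises OSError if the machine has fewer cores than the mask names.
    os.sched_setaffinity(0, pick_ccd(workload))
```

The hard part is not the API call but deciding which set a given thread belongs in, which is exactly the benchmarking burden described above.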

I think the most telling thing of all is that neither AMD nor Intel has any plans whatsoever to implement either of these strategies for the server market.

Google open sourced CDC File Transfer from the ashes of Stadia
8 Jan 2023 at 3:19 pm UTC Likes: 4

Quoting: MayeulCI'm surprised, I thought rsync already used rolling hashes.

There's also casync in that space: https://github.com/systemd/casync/ (the blog post is quite nice IIRC).

If it's similar but better than rsync, I feel like these improvements should be folded into rsync.
Perhaps semantics, but rsync uses a rolling checksum (a variant of Adler-32) combined with a strong hash. Casync looks really promising, but unfortunately it needs a prepare stage, which is probably why it hasn't seen wide adoption so far; implementing that in rsync would require a total rewrite, so that will probably never happen either (plus I don't think the rsync devs want a need to prepare files before transfer). The CDC improvements, however, should be a great contender for a new version of the rsync protocol/algorithm; it's just unfortunate that Google decided to go NIH instead of proposing the change upstream to rsync.
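To illustrate what makes the rolling (weak) checksum cheap: it can slide the window one byte at a time in O(1) instead of recomputing the whole block. A minimal sketch of that recurrence in Python, simplified and not rsync's exact code (real rsync works mod 2^16 with offsets and pairs it with a strong hash):

```python
M = 1 << 16  # weak sums kept mod 2^16, as in rsync's checksum

def weak_checksum(block: bytes) -> tuple[int, int]:
    """Compute the two running sums over a whole block from scratch."""
    a = sum(block) % M
    b = sum((len(block) - i) * x for i, x in enumerate(block)) % M
    return a, b

def roll(a: int, b: int, out_byte: int, in_byte: int, n: int) -> tuple[int, int]:
    """Slide an n-byte window one byte forward: drop out_byte, append
    in_byte, updating both sums in O(1)."""
    a = (a - out_byte + in_byte) % M
    b = (b - n * out_byte + a) % M
    return a, b

data = b"the quick brown fox jumps over the lazy dog"
n = 16
a, b = weak_checksum(data[:n])
a, b = roll(a, b, data[0], data[n], n)
assert (a, b) == weak_checksum(data[1 : n + 1])  # O(1) roll == full recompute
```

CDC then flips the framing around: instead of fixed-size blocks on one side, the content itself decides chunk boundaries, so an insertion only shifts nearby chunks rather than every block after it.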

Anyone interested should contact Wayne at [email protected]

AMD reveal Ryzen 7000 X3D processors, desktop 65W CPUs and new mobile chips
5 Jan 2023 at 11:49 pm UTC

Quoting: dpanter
Quoting: F.Ultrathere will now be a scheduler problem that is worse than on Alder Lake
Well, a potential problem at least. If it even is a problem at launch I expect it to be addressed quickly, much like Alder Lake was.
How? I cannot think of how a scheduler can know which thread needs larger L3 vs higher clock frequency.

AMD reveal Ryzen 7000 X3D processors, desktop 65W CPUs and new mobile chips
5 Jan 2023 at 8:53 pm UTC Likes: 1

One huge problem is that on the 7900X3D and the 7950X3D the extra 3D V-Cache L3 is only connected to one of the CCDs, so there will now be a scheduler problem worse than on Alder Lake: it has to decide which threads get the extra L3 and which get the extra potential CPU boost frequency, which of course is not something the scheduler can know (a thread could very well switch between the two needs as well).

Linux kernel 6.1 is out now
14 Dec 2022 at 8:04 pm UTC Likes: 1

Quoting: slaapliedje
Quoting: F.Ultra
Quoting: slaapliedje
Quoting: F.Ultra
Quoting: Guest
Btrfs file system performance improvements.
Is long mounting of large HDD partitions fixed now?
What counts as large partitions and a long time? I use BTRFS on several servers, each with 153TB per partition, and mounting is sub-second and has been for many years.

edit: that said, one of the listed items is improved mount times on large systems:

Hi,

please pull the following updates for btrfs. There's a bunch of
performance improvements, most notably the FIEMAP speedup, the new block
group tree to speed up mount on large filesystems, more io_uring
integration, some sysfs exports and the usual fixes and core updates.

Thanks.

---

Performance:

- outstanding FIEMAP speed improvement
- algorithmic change how extents are enumerated leads to orders of
magnitude speed boost (uncached and cached)
- extent sharing check speedup (2.2x uncached, 3x cached)
- add more cancellation points, allowing to interrupt seeking in files
with large number of extents
- more efficient hole and data seeking (4x uncached, 1.3x cached)
- sample results:
256M, 32K extents: 4s -> 29ms (~150x)
512M, 64K extents: 30s -> 59ms (~550x)
1G, 128K extents: 225s -> 120ms (~1800x)

- improved inode logging, especially for directories (on dbench workload
throughput +25%, max latency -21%)

- improved buffered IO, remove redundant extent state tracking, lowering
memory consumption and avoiding rb tree traversal

- add sysfs tunable to let qgroup temporarily skip exact accounting when
deleting snapshot, leading to a speedup but requiring a rescan after
that, will be used by snapper

- support io_uring and buffered writes, until now it was just for direct
IO, with the no-wait semantics implemented in the buffered write path
it now works and leads to speed improvement in IOPS (2x), throughput
(2.2x), latency (depends, 2x to 150x)

- small performance improvements when dropping and searching for extent
maps as well as when flushing delalloc in COW mode (throughput +5MB/s)

User visible changes:

- new incompatible feature block-group-tree adding a dedicated tree for
tracking block groups, this allows a much faster load during mount and
avoids seeking unlike when it's scattered in the extent tree items
- this reduces mount time for many-terabyte sized filesystems
- conversion tool will be provided so existing filesystem can also be
updated in place
- to reduce test matrix and feature combinations requires no-holes
and free-space-tree (mkfs defaults since 5.15)

- improved reporting of super block corruption detected by scrub

- scrub also tries to repair super block and does not wait until next
commit

- discard stats and tunables are exported in sysfs
(/sys/fs/btrfs/FSID/discard)

- qgroup status is exported in sysfs (/sys/fs/btrfs/FSID/qgroups/)

- verify that super block was not modified when thawing filesystem

Fixes:

- FIEMAP fixes
- fix extent sharing status, does not depend on the cached status where
merged
- flush delalloc so compressed extents are reported correctly

- fix alignment of VMA for memory mapped files on THP

- send: fix failures when processing inodes with no links (orphan files
and directories)

- fix race between quota enable and quota rescan ioctl

- handle more corner cases for read-only compat feature verification

- fix missed extent on fsync after dropping extent maps

Core:

- lockdep annotations to validate various transactions states and state
transitions

- preliminary support for fs-verity in send

- more effective memory use in scrub for subpage where sector is smaller
than page

- block group caching progress logic has been removed, load is now
synchronous

- simplify end IO callbacks and bio handling, use chained bios instead
of own tracking

- add no-wait semantics to several functions (tree search, nocow,
flushing, buffered write)

- cleanups and refactoring

MM changes:

- export balance_dirty_pages_ratelimited_flags
I wonder how long it would take me to fill up 153TB with my Steam Library on my 2gbit fiber line...
Well, if you start at 0, manage to fully saturate that 2Gbps line of yours, pay for all the games, are able to keep the line saturated throughout, and we assume the line is dedicated to just downloading said games and not to other Internet usage (such as e.g. visiting the Steam store), then it would take approximately 8 days.
Haha! I mean, I likely have enough in my Steam library that I could fill up that much. Too many games are 100GB+ these days... hell, the libraries of most of the 16/32-bit era aren't even close to 100GB.
Imagine how much worse it will become once they decide to ship textures for 8K gaming...
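The 8-day figure above is easy to reproduce; a quick back-of-the-envelope sketch, assuming 153 TiB and a perfectly saturated 2 Gbit/s link (no protocol overhead, no store browsing):

```python
def transfer_days(size_tib: float, link_gbps: float) -> float:
    """Days to move size_tib tebibytes over a fully saturated link."""
    bits = size_tib * 2**40 * 8          # TiB -> bits
    seconds = bits / (link_gbps * 1e9)   # Gbit/s -> bits per second
    return seconds / 86400               # seconds -> days

print(round(transfer_days(153, 2), 1))  # prints 7.8, i.e. roughly 8 days
```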

Linux kernel 6.1 is out now
13 Dec 2022 at 11:51 pm UTC

Quoting: slaapliedje
Quoting: F.Ultra
Quoting: Guest
Btrfs file system performance improvements.
Is long mounting of large HDD partitions fixed now?
What counts as large partitions and a long time? I use BTRFS on several servers, each with 153TB per partition, and mounting is sub-second and has been for many years.

edit: that said, one of the listed items is improved mount times on large systems (the full btrfs pull request text is quoted above).
I wonder how long it would take me to fill up 153TB with my Steam Library on my 2gbit fiber line...
Well, if you start at 0, manage to fully saturate that 2Gbps line of yours, pay for all the games, are able to keep the line saturated throughout, and we assume the line is dedicated to just downloading said games and not to other Internet usage (such as e.g. visiting the Steam store), then it would take approximately 8 days.