It’s becoming easy to see why Linus didn’t merge anything from bcachefs for 6.17. And Kent isn’t gaining himself any supporters by tearing down other filesystems in his tantrum.
The problem is 100% Kent. Linus and the rest of the main contributors have a certain way they like to run and operate. Kent has again and again shown that he doesn’t like working that way and keeps sneaking stuff into patchsets.
You can be a 500% genius, but if you’re working as a team member (which anyone doing a sizeable contribution to the kernel is), then you have to learn how to play in the sandbox.
I can’t see any possible future where BCacheFS stays in the kernel. Kent is starting a fight he cannot win. If he doesn’t want to play nice, then his FS will have to be maintained as a kernel patch, which will forever be a limiting factor in its adoption. It’s too bad he doesn’t just swallow his pride and play by the rules.
btrfs is no perfect piece of software either, so it’s good to know there are alternatives out there.
I’ll say it again and again. The problem is neither Linus, nor Kent, but the lack of resources for independent developers to do the kind of testing that is expected of the big corporations.
Like, one of the issues that Linus yelled at Kent about was that bcachefs would fail on big endian machines. You could spend your limited time and energy setting up an emulator of the PowerPC architecture, or you could buy real hardware at pretty absurd prices: I checked eBay, and it was $2000 for 8 GB of RAM…
But the big corpos are different. They have these massive CI/CD systems, which automatically build and test Linux on every architecture under the sun. Then they have an extra, internal review process for these patches. And then they push.
But Linux isn’t like that for independent developers. What they do is just compile the software on their own machine, boot into the kernel, and if it works, it works. This is how some of the Asahi developers would do it: they would just boot into their new kernel on their Macs. And it’s how I’m assuming Overstreet does it, too. Maybe there is some minimal testing involved.
So Overstreet gets confused when he’s yelled at for not having tested on big endian architectures, because where is he supposed to get a big endian machine that he can afford and that can actually compile the Linux kernel in less than 10 years? And even if you do buy or emulate a big endian CPU, then you’ll just get hit with “yeah, your patch has issues on machines with 2 terabytes or more of RAM”, and so on.
One option is to drop standards. The Asahi developers were allowed to just merge code without being subjected to the scrutiny that Overstreet has been subjected to. This was in part due to having their stuff in Rust, under the Rust subsystem, which gave them a lot more control over the parts of Linux they could merge into. The other part was being specific to MacBooks: no point testing the MacBook-specific patches on non-Mac CPUs.
But a better option is to make the testing resources that these corporations use available to everybody. I think the Linux Foundation should spin up a CI/CD service, so people like Kent Overstreet can test their patches on architectures and setups they don’t have at home, and get them reviewed before they are dumped onto the mailing list, exactly like what happens at the corporations who contribute to the Linux kernel.
> You could spend your limited time and energy setting up an emulator of the PowerPC architecture, or you could buy real hardware at pretty absurd prices: I checked eBay, and it was $2000 for 8 GB of RAM…
You’re acting as if setting up a ppc64 VM requires insane amounts of effort, when in reality it’s really trivial. It took me like a weekend to figure out how to set up a PowerPC QEMU VM and install FreeBSD in it, and I’m not at all an expert when it comes to VMs or QEMU or PowerPC. I still use it to test software for big endian machines:
`start.sh`:

```sh
#!/usr/bin/env sh

if [ "$(id -u)" -ne 0 ]; then
    printf "Must be run as root.\n"
    exit 1
fi

# Note: The "-netdev" parameter forwards the guest's port 22 to port 10022
# on the host. This allows you to access the VM by SSHing the host on
# port 10022.
#
# For the initial install, also pass the installation media and boot
# from it:
#     -cdrom /path/to/installation_image.iso \
#     -boot d
qemu-system-ppc64 \
    -cpu power9 \
    -smp 8 \
    -m 3G \
    -device e1000,netdev=net0 \
    -netdev user,id=net0,hostfwd=tcp::10022-:22 \
    -nographic \
    -hda /path/to/disk_image.img
```
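Once the VM is up, you can log into it from the host with something like `ssh -p 10022 user@localhost`, thanks to the port forward in the `-netdev` line.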
Also, you don’t usually compile stuff inside VMs (unless there is no other way). You use cross-compilation toolchains, which run just as fast as native toolchains, except they emit machine code for the architecture you’re compiling for. Testing on real hardware is only really necessary if you’re developing a device driver, say, or if the hardware has certain quirks that just aren’t there in VMs.
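For example, here’s a minimal sketch of that workflow, assuming a Debian-style host (the package names below are Debian/Ubuntu’s; other distros name them differently):

```sh
# Install a big-endian ppc64 cross-compiler and QEMU's user-mode emulator.
sudo apt install gcc-powerpc64-linux-gnu qemu-user

# A trivial test program, standing in for whatever you actually want to test.
cat > hello.c <<'EOF'
#include <stdio.h>
int main(void) { printf("Hello from big-endian PowerPC!\n"); return 0; }
EOF

# The cross-compiler runs at native speed on your machine but emits
# big-endian PowerPC code. Statically linked so the emulator needs no
# guest libraries.
powerpc64-linux-gnu-gcc -static -o hello hello.c

# Run the foreign binary directly on the host via user-mode emulation.
qemu-ppc64 ./hello
```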
> Like, one of the issues that Linus yelled at Kent about was that bcachefs would fail on big endian machines. You could spend your limited time and energy setting up an emulator of the PowerPC architecture, or you could buy real hardware at pretty absurd prices: I checked eBay, and it was $2000 for 8 GB of RAM…
It’s not that BCacheFS would fail on big endian machines, it’s that it would fail to even compile, and therefore it impacted everyone who had it enabled in their build. And you don’t need actual big endian hardware to compile something for that arch: just now it took me a few minutes to figure out what tools to install for cross-compilation, download the latest kernel, and compile it for a big endian arch with BCacheFS enabled. Surely a more talented developer than I could easily do the same, and save everyone else the trouble of broken builds.
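For reference, the kind of cross-build being described looks roughly like this; a sketch assuming the same Debian-style powerpc64 cross toolchain as in the comment above:

```sh
# Grab the latest kernel source.
git clone --depth 1 https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
cd linux

# Start from the big-endian 64-bit PowerPC default config, enable
# bcachefs, and let Kconfig resolve any dependencies.
make ARCH=powerpc CROSS_COMPILE=powerpc64-linux-gnu- ppc64_defconfig
./scripts/config --enable BCACHEFS_FS
make ARCH=powerpc CROSS_COMPILE=powerpc64-linux-gnu- olddefconfig

# Build. Any compile error that only appears on big-endian targets
# shows up right here, no PowerPC hardware required.
make ARCH=powerpc CROSS_COMPILE=powerpc64-linux-gnu- -j"$(nproc)"
```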
ETA: And as pointed out in the email thread, Overstreet had bypassed the linux-next tree, which would have allowed other people to test his code before it got pulled into mainline. So he had multiple options that did not necessitate the purchase of expensive hardware.
> One option is to drop standards. The Asahi developers were allowed to just merge code without being subjected to the scrutiny that Overstreet has been subjected to. This was in part due to having their stuff in Rust, under the Rust subsystem, which gave them a lot more control over the parts of Linux they could merge into. The other part was being specific to MacBooks: no point testing the MacBook-specific patches on non-Mac CPUs.
It does not sound to me like standards were dropped for Asahi, nor that their use of Rust had any influence on the standards that were applied to them. It is simply as you said: what’s the point of testing code on architectures that it explicitly does not and cannot support? As long as changes that touch generic code are tested, there is no problem, and that is probably the minority of changes introduced by the Asahi developers.
Bah, we did much better testing in the pre-CI era than the corpos do now. The main difference was we couldn’t push new releases whenever a bad bug turned up, so there was more willingness to debug first and ship after, instead of the other way around.
- Ignore engineering best practices and piss off community.
- Act like all you did was follow best practices and build community.
Huh, so I take it from the other comments here that Kent isn’t entirely correct? Is there a summary of the back and forth so far?
I might not have everything, but here’s the best summary I can put together:
This back and forth has been going on for a while now. The main complaints recently have been the timing of his pull requests and, more generally, his attitude and his cooperation with others. The most recent spat was because he submitted a feature around rc3, whereas only bug fixes are supposed to go in at that point (the merge window for new features closes at rc1). The feature in question was a journal rewind function, which would essentially move the filesystem back in time, and which could fix an issue that had cropped up in the testing phase. He saw it as a workaround for an issue that had arisen, and so, despite it technically being a feature, he put it in the category of bug fixes. That caused major disagreements, as did the way he talked with others. And now his pull request for the 6.17 merge window has simply been ignored by Linus.
Where Kent is coming from: he wants a rock-solid filesystem, and he’s following a bit of a take-no-prisoners approach to reach that goal. He seems to get most of his income from his following on Patreon, and so his focus is squarely on the users. With that focus, he seems to lose sight of other things, especially cooperation with others on the kernel team. In fact, a number of people he has sparred with have expressed decent respect for his code recently, saying the problem is really the cooperative aspect. One of the main motivations for bcachefs is also the lack of a proper CoW filesystem in the Linux kernel that doesn’t have the kinds of problems btrfs has. And the fact that he states this, and talks about the lessons he’s learned from btrfs’s shortcomings, rubs a number of people the wrong way.
Now here’s some stuff I read into this personally: I have the impression that Kent looks up to Linus in a way. And they’re actually both kind of similar: both are extremely talented engineers, both saw something missing in the software landscape and said “fuck it, I’ll make it myself”, and both can be pretty serious dicks. I mean, Linus managed to get suspended from his own damn project for being a dick; now that’s an achievement. He’s older now and somewhat calmer, but even recently he had quite the outburst on the mailing list. And the impression I get is that Kent (probably subconsciously?) has an attitude of “if he can do it, so can I”. Which would be fair (even though it is poisonous), the only problem being that Linus holds the longer lever.
Then there’s the aspect of his mental health. He has said multiple times that his mental health has been suffering, which honestly doesn’t surprise me. And if you look at his responses in different places, there seems to be quite an up and down. In some cases, he’s very respectful to Linus, and in some cases he’s pretty nasty (yes, Linus level nasty, but still). As far as I can tell, he needs a break and therapy. The only problem being, bcachefs has quite the momentum currently, and it wouldn’t exactly be great for the project to lose that momentum, either. (Mind you, probably still better than being kicked from the kernel)
That makes it a bit more clear. Thanks so much for writing that up!
Please consider adding paragraph breaks to your posts; a wall of text like this is not pleasant to read.
OK, weird, I had them in there, but I added a second newline per paragraph and it seems to look better now.
Thanks! And yeah, with Markdown you need an empty line for it to actually add a paragraph break.
Though I just learned that you can also end a line with two spaces or a `\` to get a line break.
Tire Fire.