Hacker News | mgaunard's comments

It's 2026 and I'm still defining my own messaging and wire protocols.

Plain C structs that fit in a UDP datagram and that you can reinterpret_cast are still best. You can still provide schemas and UUIDs for that, and dynamically transcode to JSON or whatever.
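A minimal sketch of what that looks like, with hypothetical field names (the `schema_id` field stands in for the schema UUID mentioned above; a `memcpy` into the struct is shown instead of a raw `reinterpret_cast` to sidestep alignment and aliasing pitfalls):

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical fixed-layout wire message. pack(1) makes the in-memory
// layout exactly the wire layout, with no padding.
#pragma pack(push, 1)
struct PriceUpdate {
    uint32_t schema_id;   // identifies the struct layout (stand-in for a UUID)
    uint64_t instrument;
    int64_t  price_ticks;
    uint32_t quantity;
};
#pragma pack(pop)
static_assert(sizeof(PriceUpdate) == 24, "must match the wire layout");

// Receiving side: view the datagram bytes as the struct.
PriceUpdate decode(const char* datagram, size_t len) {
    assert(len >= sizeof(PriceUpdate));
    PriceUpdate m;
    std::memcpy(&m, datagram, sizeof m);  // compiles to the same thing as a cast
    return m;
}
```

The sender just does the reverse `memcpy` from the struct into the datagram buffer before calling `sendto`.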


Until you have to work with big- and little-endian systems. There is other weirdness about how different computers represent things as well: UTF-8 / UTF-16 strings (or other code pages). Not all floats are IEEE 754. Still, when you can ignore all those issues, what you did is really easy and often works.

I disagree. Big endian is long dead and not worth worrying about, and code pages too. What is more important is dealing with schema changes, when you add new fields to requests and responses.

There are niches where those matter.

But yes, schema changes are what's most likely to get you today.


Provided that:

    - you agree never to care about endianness (can probably get away with this now)

    - you don't want to represent anything complicated or variable length, including strings

You can have strings by using relative pointers ("string starts 123 bytes before this").

You can also just use an array which sets a max capacity, and either use a null-terminator or a separate size field.

In practice you probably want to have both, and choose what's most practical based on the message.
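A sketch of both variants, with illustrative names; the relative-pointer form stores an offset from the field's own address, so it stays valid when the whole buffer is copied or mapped at a different address:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

#pragma pack(push, 1)
// Variant 1: fixed-capacity inline string (NUL-terminated or padded).
struct InlineName {
    char name[16];
};

// Variant 2: relative "pointer": byte offset measured from the field's
// own address, plus an explicit length.
struct RelString {
    int32_t  offset;  // bytes from &offset to the first character
    uint32_t len;
};
#pragma pack(pop)

std::string read_rel(const RelString& s) {
    const char* base = reinterpret_cast<const char*>(&s.offset);
    return std::string(base + s.offset, s.len);
}
```

Negative offsets work too ("string starts 123 bytes before this"), since the offset is signed.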


If you decide to use UDP, do you ignore transmission errors or write the handling layer on your own?

I handle it in different ways by topic.

For topics which are sending the state of something, a gap naturally self-recovers so long as you keep sending the state even if it doesn't change.

For message buses that need to be incremental, you need to have a separate snapshot system to recover state. That's usually pretty rare outside of things like order books (I work in low-latency trading).

For request/response, I find it's better to tell the requester their request was not received rather than transparently re-send it, since by the time you re-send it it might be stale already. So what I do at the protocol level is just have ack logic, but no retransmit. It's also datagram-oriented rather than byte-oriented, so overall much nicer guarantees than TCP (as long as all your messages fit in one UDP payload).
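A rough sketch of what such a header and gap detection could look like (field names are hypothetical, not a description of any specific protocol):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-topic datagram header: the sequence number lets the
// receiver detect gaps; ack_seq lets the sender tell a requester that a
// request was lost, instead of retransmitting it.
#pragma pack(push, 1)
struct DgramHeader {
    uint64_t seq;      // per-sender, per-topic sequence number
    uint64_t ack_seq;  // highest seq this side has received from its peer
};
#pragma pack(pop)

struct GapDetector {
    uint64_t expected = 0;
    // Returns the number of datagrams lost before this one (0 = in order).
    uint64_t on_receive(uint64_t seq) {
        uint64_t lost = (seq > expected) ? seq - expected : 0;
        expected = (seq >= expected) ? seq + 1 : expected;
        return lost;
    }
};
```

On a state topic, a detected gap needs no action (the next state update self-heals); on a request topic, it maps to "your request was not received".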


What you use is perfect for short-range communication (an application and a child process talking over shared memory), but not good for long-range communication (over the Internet), because an old client can be talking to a new version of a server, so you will have to add version numbers and write code to parse outdated formats. Protobuf, on the other hand, has compatibility built in, and you do not need to write anything to support outdated clients. Protobuf also uses tricks like varints to compress data and use less network traffic. So it is clearly made for long-range communication, which you probably do not have, and you are sending 7 zero bytes for every small number.

TL;DR protobuf has version compatibility and compact number encoding.
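For reference, protobuf's varint encoding is base-128: 7 payload bits per byte, with the high bit marking continuation, so small numbers take one byte instead of eight. A minimal sketch:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Protobuf-style base-128 varint encoding.
std::vector<uint8_t> encode_varint(uint64_t v) {
    std::vector<uint8_t> out;
    while (v >= 0x80) {
        out.push_back(static_cast<uint8_t>(v) | 0x80);  // more bytes follow
        v >>= 7;
    }
    out.push_back(static_cast<uint8_t>(v));  // final byte, high bit clear
    return out;
}

uint64_t decode_varint(const std::vector<uint8_t>& in) {
    uint64_t v = 0;
    int shift = 0;
    for (uint8_t b : in) {
        v |= static_cast<uint64_t>(b & 0x7f) << shift;
        shift += 7;
        if (!(b & 0x80)) break;  // high bit clear: last byte
    }
    return v;
}
```

For example, 300 encodes as the two bytes 0xAC 0x02 rather than a fixed-width 8-byte field.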


I already said you can use UUIDs and schemas, and even do dynamic conversion between mismatched schemas.

Doing plain C structs doesn't prevent any of this.


It requires extra effort to write a conversion algorithm for each older version of a data structure.

I find that Conan 2 is mostly painful with ABI. Binaries from GCC are all backwards compatible, as are C++ standard versions. The exception is the C++11 ABI break.

And yet it will insist on only giving you binaries that match exactly. Thankfully there are experimental extensions that allow it to automatically fall back.


Zero mention of s3fs which already did this for decades.


This is pretty different from s3fs. s3fs is a FUSE file system that is backed by S3.

This means that all of the non-atomic operations that you might want to do on S3 (including edits to the middle of files, renames, etc.) run on the machine running s3fs. As a result, if your machine crashes, it's not clear what will show up in your S3 bucket, or whether things would be corrupted.

As a result, s3fs is also slow, because the next stop after your machine is S3 itself, which isn't suitable for many file-based applications.

What AWS has built here is different: using EFS as the middle layer means there's a safe, durable place for your file system operations to go while they're being assembled into object operations. It also means the performance should be much better than s3fs (it's talking to SSDs where data is 1 ms away instead of HDDs where data is 30 ms away).


It also means that you need to pay for EFS, which is outrageously expensive, to use S3, whose whole purpose is to be cheap.


Of course, you don't need to, this is just a way to opt-in to getting file semantics on top of S3.

The purpose of S3 isn't to be cheap, it's to be simple.


You can also use something like JuiceFS to make using S3 as a shared filesystem more sane, but you're moving all the metadata to a shared database.


Or ZeroFS, which doesn’t require a 3rd-party database, just an S3 bucket!

https://github.com/Barre/ZeroFS


ZeroFS isn't a shared redundant filesystem.


It's definitely shared, and can be redundant.

A more solid (especially when it comes to caching) solution would be appreciated.

I thought that would be their https://github.com/awslabs/mountpoint-s3 . But no mention of this one either.

S3 files does have the advantage of having a "shared" cache via EFS, but then that would probably also make the cache slower.


I'd assume you can still have local cache in addition to that.


I was thinking: "No way this has existed for decades". But the earliest I can find it existing is 2008. Strictly speaking not decades but much closer to it than I expected.


There's also https://github.com/kahing/goofys, a Go equivalent. A bit of a dead project these days.


Yeah, that blog post was written as if sliced bread had been invented again.

Reading through it, I was only thinking "is this distinguished engineer TOC 2M aware that people have been doing this since forever?".


Most people looking for performance will reach for the spinlock.

The expectation is that the kernel should somehow detect applications that are spinning, and avoid preempting them early.


Well, that seems like an unreasonable expectation, no? Also, isn't the point of spinlocks that they get released before the kernel does anything? Otherwise you could just use a futex... which maybe you should do anyway...

https://matklad.github.io/2020/01/04/mutexes-are-faster-than...
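For concreteness, here's what a minimal userspace spinlock looks like; the pathology under discussion is that if the holder is preempted by the kernel mid-critical-section, every waiter burns its whole timeslice spinning on a lock that cannot be released:

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Minimal userspace spinlock built on std::atomic_flag.
class SpinLock {
    std::atomic_flag flag = ATOMIC_FLAG_INIT;
public:
    void lock() {
        while (flag.test_and_set(std::memory_order_acquire)) {
            // A futex-based mutex would sleep here; a spinlock just burns CPU.
        }
    }
    void unlock() { flag.clear(std::memory_order_release); }
};
```

Real implementations typically add a pause instruction and backoff in the loop, and fall back to a futex after a bounded number of spins, which is exactly the hybrid the linked post argues for.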


The scheduling is based on how much the LWP made use of its previous time slices. A spinning program is clearly using every cycle it's given without yielding, so you can tell preemption should be minimized.


If you are spinning so long that it requires preemption, you're doing something wrong, no?


It doesn't matter, it's a long tail thing: on average user spinlocks can work, and even appear to be beneficial on benchmarks (for many reasons, Andy alludes to some above). But if you have enough users, some of them will experience the apocalyptic long tail, no matter what you do: that's why user spinlocks are unacceptable. RSEQ is the first real answer for this, but it's still not a guarantee: it is not possible to disable SCHED_OTHER preemption in userspace.

If I make something 1% faster on average, but now a random 0.000001% of its users see a ten-second stall every day, I lose.

It is tempting to think about it as a latency/throughput tradeoff. But it isn't that simple, the unbounded thrashing can be more like a crash in terms of impact to the system.


Yeah, I'm very familiar with the thrashing thing from OOM scenarios... by far the most common Linux "crash" that I experience (at least monthly, sometimes daily, depending on what I'm doing)... I've waited overnight a few times and the OOM killer still didn't activate.


Well, you can always pin to a core and move other threads out of that core.

That's what you'd do if manually scheduling. Ideally the dynamic scheduler would do that on its own.


Sure. But if you squint even that isn't good enough, you'll still take interrupts on that core in the critical section sometimes when somebody else wants the lock.

The other problem with spin-wait is that it overshoots, especially with an increasing backoff. Part of the overhead of sleeping is paid back by being woken up immediately.

When it's made to work, the backoff is often "overfit" in that very slight random differences in kernel scheduler behavior can cause huge apparent regressions.


The truth is that the NHS is very bad not due to funding, but for structural reasons.

The fact I can't even see a GP I'm not registered with (not even an option to pay extra) is ridiculous. You have absolutely no control over your health at all.

With private, you get exactly what you want, whenever you want it.


> With private, you get exactly what you want, whenever you want it.

In the US this isn't how it works. You can't see whoever you want unless you have a really, really good plan. Otherwise, you need referrals. And lots of specialists won't see you without a referral anyway.

And, the wait is often on the order of months. I know that's something people complain about in the UK but I assure you, it happens that way in the US too even though we're paying 10x as much.

I know private in the UK is quite good. What you need to understand is that the only reason it's any good at all is because of the NHS. It has to remain competitive. If you go full private, then it very quickly decays.


A specialist also requires a referral in the UK. There are also many more medicines that are prescription-only than in the US.

That's why in practice we have all these (private) services to get easy GP appointments via phone, video or even online forms. While everyone knows those appointments can't realistically do any real medical work, they serve to give you prescriptions and referrals.

It's just a gatekeeping mechanism, that you can more easily bypass if you have money. The more you pay, the more they care about your user experience and how streamlined it is.


In the US if I want to see my primary care doctor I need to wait 2 months for the appointment.

I pay $500 per month for the privilege (and a $50 copay)

So I’m paying $1000 in the time period where I’m getting no service.


Where in the US are you? I was able to book a visit with my primary the very next day less than a month ago.


Not the person you replied to but I'm in North Texas and I just recently had to reschedule my physical. And yup, the next appointment is 2 months out.

I also had cancer in the past and you might think that that would mean I get faster appointments. I do not.

And I have a very, very, very good PPO plan.


> I also had cancer in the past and you might think that that would mean I get faster appointments. I do not.

Sadly you do not, maybe because lower life expectancy -> lower return on treatment "investment".


That was my thinking... even for specialists, I can generally get into a new one within a few weeks.

My SO is on state Medicaid (cancer) and does experience the kinds of waits mentioned above... so I guess it does follow similarly for government/state backed healthcare, where I'm mostly out of pocket.

But even when I had relatively typical coverage, I didn't have issues getting into a doctor more often than not. I think getting my sleep study was the longest wait I had for anything, they were months backed up with appointments... but my kidney and retina specialists were somewhat easy to get started with.


As usual when people say "the US", we're papering over the fact that the United States is really 50 countries in a trench coat.


> the United States is really 50 countries in a trench coat.

Appropriate attire... when you're in a trench :)


You can absolutely see a GP you’re not registered with if you are travelling and need to. I have done it multiple times. I have been offered it same or next day after calling 111.


You can call any GP surgery to get emergency treatment for up to 14 days if you're not registered with a GP surgery or are away from home. https://www.nhs.uk/nhs-services/gps/gp-appointments-and-book...


While away from home, my son (5 yo) cut his finger and was in need of disinfectant and a bandage (steri strips).

Pharmacy was useless, no medical skills or knowledge of their own products. Asked me to figure out myself what I needed and put it on my son myself.

Local GP surgery sent us away: no registration, no visit. Me saying this was an emergency just made them suggest A&E.

A&E is where we ended up, and while that definitely works, going to the emergency services of a large hospital for every little thing is not only a waste of my time but also of resources. It seems however to be the NHS way: whenever the littlest of troubles arise, just go to the hospital, or even call an ambulance.


Sorry for being too American to understand, but why would you need to talk to any medical professional to put a bandaid on your kid? Is this about the NHS paying for the bandaid? About medical expertise to apply a bandaid?


Not all disinfectants are child safe, and the wound was serious enough to require steri-strips (an alternative to sutures) -- it was not a matter of a bandaid.


1. Water is child safe

2. Steri-strips are available over the counter at any supermarket or pharmacy (in the U.S.)


You would have been best served by a Minor Injury Unit but not every town has one, so A&e is not excessive. The great majority of people going there do not need the full capabilities of it (resuscitation etc).


The chemist can sell you the right stuff for that.


You call 111 if you don't want to bother the 999 guys. 111 will tell you what you need to do, including "go to A&E".

What is wrong with going to A&E for an (as you said yourself) emergency?

A pharmacist dispenses medications and should know about their safe usage. They won't tell you how to bandage a wound.


Getting a minor wound bandaged up is not what A&E is meant for, it's for life-threatening injuries.

Going to A&E and waiting there also means you're losing 4 to 6 hours.

With 111 you just get some robot asking you a never-ending list of inane questions before someone tells you to either self-care at home or go to A&E.

A pharmacist should be able to administer the supplies they sell, particularly wound dressing and care. It's a requirement in some other European countries like France (where pharmacists are doctors), but in the UK the reality is that most are unable to do so.


111 do a lot more than that, they will get you a GP visit even if the GP claims to not have slots, and they will get medical professionals to come to you.

If it was in hours, I'm surprised they didn't get a nurse appointment for the cut.


Pharmacy. Not pharmacist.

Pharmacies provide much more than just medicine.


A pharmacist is someone who is a chemical practitioner though?

“Man, these cryptographers didn’t know a thing about tailwind. Useless!”


And 111 was unable to help, even the GP they assigned to you turned you away?


With private, you get exactly what you want, whenever you want it... If you can afford it.


Pending availability of specialists, willingness to travel, etc.


If you're in a major metro area it's generally not too bad.


Compared to the system of no access


Same system, for the not-wealthy.


So naive. Private only works this way in Britain because it doesn’t have to be responsible for anything. It’s a luxury good and works accordingly.

We have insurance, it’s amazing! But it’s fake. If you want to know how a whole system of this would work, look at the US


* if available in your area or within your means of travel, which may include flying to another state


If only there were some system where the incentives could freely flow through and permeate every level of the sector. Where those organisations that provide sub-standard care die and those that excel receive outsized funding...


Unfortunately, a system with these qualities doesn't exist in practice. You just end up with the same too-big-to-fail macro organization minimizing their point-of-care labor spend and maximizing their management spend either way.


Many people opt for off-shore bonds (which have a number of advantages) which means paying normal tax instead of capital gains, so the capital gains figure doesn't really capture investment as a whole.


These local models are far behind the capabilities of latest Gemini Pro, Claude Opus or GPT.

Why waste time with subpar AI?


They will eventually catch up, that’s the hope to avoid a techno feudalism in which too much power is in too few hands.


Yes, but you don’t always want the power/expense of these models for the task at hand. A hammer is good enough to drive a nail into a wall; save the nail gun for when you are building a house.


They’re not far behind, unless you mean for “vibe coding”. And for probably 85% of queries that people use LLMs for, you can’t even really perceive the difference between frontier and local.


It's a trade off.


Reduced to 80km/h since 2018.


Oh right, I totally forgot! I mean, even then, for so many of those roads I'd never consider driving that fast haha


Practically people go much faster than 80km/h.


Paris' périphérique is nowadays limited to 50 km/h.


It's fundamentally different; Rust entirely rejects the notion of a stable ABI, and simply builds everything from source.

C and C++ are usually stuck in the antiquated thinking that you should build a module, package it into some libraries, install/export the library binaries and associated assets, then import those in other projects. That makes everything slow, inefficient, and wildly dangerous.

There are of course good ways of building C++, but those are the exception rather than the standard.


"Stable ABI" is a joke in C++ because you can't keep ABI and change the implementation of a templated function, which blocks improvements to the standard library.

In C, ABI = API because the declaration of a function contains the name and arguments, which is all the info needed to use it. You can swap out the definition without affecting callers.

That's why Rust allows a stable C-style ABI; the definition of a function declared in C doesn't have to be in C!

But in a C++-style templated function, the caller needs access to the definition to do template substitution. If you change the definition, you need to recompile calling code i.e. ABI breakage.

If you don't recompile calling code and link with other libraries that are using the new definition, you'll violate the one-definition rule (ODR).

This is bad because duplicate template functions are pruned at link-time for size reasons. So it's a mystery as to what definition you'll get. Your code will break in mysterious ways.

This means the C++ committee can never change the implementation of a standardized templated class or function. The only time they did was a minor optimization to std::string in 2011 and it was such a catastrophe they never did it again.

That is why Rust will not support stable ABIs for any of its features relying on generic types. It is impossible to keep the ABI stable and optimize an implementation.


It's not true that Rust rejects "the notion of a stable ABI". Rust rejects the C++ solution of freeze everything and hope because it's a disaster, it's less stable than some customers hoped and yet it's frozen in practice so it disappoints others. Rust says an ABI should be a promise by a developer, the way its existing C ABI is, that you can explicitly make or not make.

Rust is interested in having a properly thought out ABI that's nicer than the C ABI it supports today. It'd be nice to have, say, an ABI for slices. But "freeze everything and hope" isn't that; it means every user of your language into the unforeseeable future has to pay for every mistake made by the language designers. That's already a sizeable price for C++ to pay ("ABI: Now or never" spells some of that out), and we don't want to join them.


> It'd be nice to have say, ABI for slices for example.

The de-facto ABI for slices involves passing/storing pointer and length separately and rebuilding the slice locally. It's hard to do better than that other than by somehow standardizing a "slice" binary representation across C and C-like languages. And then you'll still have to deal with existing legacy code that doesn't agree with that strict representation.


If Rust makes no progress towards choosing an ABI and decides that freezing things is bad, then Rust is de facto rejecting the notion of a stable ABI.


Rust is just a bit less than 11 years old; C++ was 13 years old when it screwed up the std::string ABI, so I think Rust has a few years yet to do less badly.

Obviously it's easier to provide a stable ABI for say &'static [T] (a reference which lives forever to an immutable slice of T) or Option<NonZeroU32> (either a positive 32-bit unsigned integer, or nothing) than for String (amortized growable UTF-8 text) or File (an open file somewhere on the filesystem, whatever that means), and it will never be practical to provide some sort of "stable ABI" for arbitrary things like IntoIterator -- but that's exactly why the C++ choice was a bad idea.

In practice of course the internal guts of things in C++ are not frozen, that would be a nightmare for maintenance teams - but in theory there should be no observable effect from such changes, and so that discrepancy leads to endless bugs where a user found some obscure way to depend on what you'd hidden inside some implementation detail. The letter of the ISO document says your change is fine but the practice of C++ development says it is a breaking change - and the resulting engineering overhead at C++ vendors is made even worse by all the UB in real C++ software.

This is the real reason libc++ still shipped Quicksort as its unstable sort when Biden was President, many years after this was in theory prohibited by the ISO standard.† Fixing the sort breaks people's code, and they'd rather it was technically faulty and practically slower than have their crap code stop working.

† Tony's Quicksort algorithm on its own is worse than O(n log n) for some inputs, you should use an introspective comparison sort aka introsort here, those existed almost 30 years ago but C++ only began to require them in 2011.


No? Rust has the 11 years C++ got to pick an ABI and then all the intervening years to see what they did wrong.


> C and C++ are usually stuck in that antiquated thinking that you should build a module, package it into some libraries, install/export the library binaries and associated assets, then import those in other projects. That makes everything slow, inefficient, and widely dangerous.

It seems to me the "convenient" options are the dangerous ones.

The traditional method is for third party code to have a stable API. Newer versions add functions or fix bugs but existing functions continue to work as before. API mistakes get deprecated and alternatives offered but newly-deprecated functions remain available for 10+ years. With the result that you can link all applications against any sufficiently recent version of the library, e.g. the latest stable release, which can then be installed via the system package manager and have a manageable maintenance burden because only one version needs to be maintained.

Language package managers have a tendency to facilitate breaking changes. You "don't have to worry" about removing functions without deprecating them because anyone can just pull in the older version of the code. Except the older version is no longer maintained.

Then you're using a version of the code from a few years ago because you didn't need any of the newer features and it hadn't had any problems, until it picks up a CVE. Suddenly you have vulnerable code running in production but fixing it isn't just a matter of "apt upgrade" because no one else is going to patch the version only you were using, and the current version has several breaking changes so you can't switch to it until you integrate them into your code.


This is all wishful thinking disconnected from practicalities.

First you confuse API and ABI.

Second there is no practical difference between first and third-party for any sufficiently complex project.

Third you cannot have multiple versions of the same thing in the same program without very careful isolation and engineering. It's a bad idea and a recipe for ODR violations.

In any non-trivial project there will be complex dependency webs across different files and subprojects, and humans are notoriously bad at packaging pieces of code into sensible modules, libraries or packages with well-defined and maintained boundaries. Maintaining ABI compatibility, deprecating things while introducing replacements, etc. is massive engineering work, and it simply makes people much less likely to change the way things are done, even if they are broken or not ideal. That's an effort you'll do for a kernel (and only at specific boundaries), but not for the average program.


> First you confuse API and ABI.

I'm not confusing API with ABI. If you don't have a stable ABI then you essentially forfeit the traditional method of having every program on the system use the same copy (and therefore version) of that library, which in turn encourages them to each use a different version and facilitates API instability by making the bad thing easier.

> Second there is no practical difference between first and third-party for any sufficiently complex project.

Even when you have a large project, making use of curl or sqlite or openssl does not imply that you would like to start maintaining a private fork.

There are also many projects that are not large enough to absorb the maintenance burden of all of their external dependencies.

> Third you cannot have multiple versions of the same thing in the same program without very careful isolation and engineering.

Which is all the more reason to encourage every program on the system to use the same copy by maintaining a stable ABI. What do you do after you've encouraged everyone to include their own copy of their dependencies and therefore not care if there are many other incompatible versions, and then two of your dependencies each require a different version of a third?

> In any non-trivial project there will be complex dependency webs across different files and subprojects, and humans are notoriously bad at packaging pieces of code into sensible modules, libraries or packages, with well-defined and maintained boundaries.

This feels like arguing that people are bad at writing documentation so we should we should reduce their incentive to write it, instead of coming up with ways to make doing the good thing easier.


>There are of course good ways of building C++, but those are the exception rather than the standard.

What are the good ways?


"Do not do it" looks like the winning one nowadays.


Build everything from source within a single unified workspace, cache whatever artifacts were already built with content-addressable storage so that you don't need to build them again.

You should also avoid libraries, as they reduce granularity and needlessly complicate the logic.

I'd also argue you shouldn't have any kind of declaration of dependencies and simply deduce them transparently based on what the code includes, with some logic to map header to implementation files.


>Build everything from source within a single unified workspace, cache whatever artifacts were already built with content-addressable storage so that you don't need to build them again.

Which tool do you use for content-addressable storage in your builds?

>You should also avoid libraries, as they reduce granularity and needlessly complexify the logic.

This isn't always feasible though.

What's the best practice when one cannot avoid a library?


You can use S3 or equivalent; a normal filesystem (networked or not) also works well.

You hash all the inputs that go into building foo.cpp, and then that gives you /objs/<hash>.o. If it exists, you use it; if not, you build it first. Then if any other .cpp file ever includes foo.hpp (directly or indirectly), you mark that it needs to link /objs/<hash>.o.

You expand the link requirements transitively, and you have a build system. 200 lines of code. Your code is self-describing and you never need to write any build logic again; your build system is reliable, strictly builds only what it needs while sharing artifacts across the team, and never leads to ODR violations.
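A toy sketch of the hashing side of such a scheme (FNV-1a stands in for a real content hash like SHA-256; the function names and the /objs/ layout just follow the description above):

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// FNV-1a as a stand-in content hash. The key must cover every input:
// the source text, the text of all transitively included headers, and
// the compiler command line / toolchain identity.
uint64_t fnv1a(const std::string& data, uint64_t h = 1469598103934665603ull) {
    for (unsigned char c : data) {
        h ^= c;
        h *= 1099511628211ull;
    }
    return h;
}

// Maps a translation unit's inputs to its cached object path.
std::string object_path(const std::vector<std::string>& inputs) {
    uint64_t h = fnv1a("cmdline-and-toolchain-id");  // illustrative seed
    for (const auto& in : inputs) h = fnv1a(in, h);
    char buf[64];
    std::snprintf(buf, sizeof buf, "/objs/%016llx.o", (unsigned long long)h);
    return buf;
}
```

Any change to any input produces a different path, so the cache lookup is just "does the file exist"; unchanged inputs hit the shared artifact store.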


Interesting, thanks.


The problem is doing this requires a team to support it that is realistically as large as your average product team. I know Bazel is the solution here but as someone who has used C++, modified build systems and maintained CI for teams for years, I have never gotten it to work for anything more than a toy project.


I have several times built my own system to do just that when it wasn't even my main job. Doesn't take more than a couple of days.

Bazel is certainly not the solution; it's arguably closer to being the problem. The worst build system I have ever seen was Bazel-based.


> I have several times built my own system to do just that when it wasn't even my main job. Doesn't take more than a couple of days.

Really? I'd love a link to even something that works as a toy project

> Bazel is certainly not the solution; it's arguably closer to being the problem. The worst build system I have ever seen was Bazel-based.

I agree


It usually ends up somewhat non-generic, with project-specific decisions hardcoded rather than specified in a config file.

I usually make it so that it's fully integrated with wherever we store artifacts (for CAS), source (to download specific revisions as needed), remote running (which depending on the shop can be local, docker, ssh, kubernetes, ...), GDB, IDEs... All that stuff takes more work for a truly generic solution, and it's generally more valuable to have tight integration for the one workflow you actually use.

Since I also control the build image and toolchain (that I build from source) it also ends up specifically tied to that too.

In practice, I find that regardless of what generic tool you use like cmake or bazel, you end up layering your own build system and workflow scripts on top of those tools anyway. At some point I decided the complexity and overhead of building on top of bazel was more trouble than it was worth, while building it from scratch is actually quite easy and gives you all the control you could possibly need.


This is all great, but it doesn’t sound simple or like 200 lines of code.


I would suggest importing binaries and metadata is going to be faster than compiling all the source for that.


You'd be wrong. If the build system has full knowledge of how to build the whole thing, it can do a much better job. Caching the outputs of the build is trivial.

If you import some ready-made binaries, you have no way to guarantee they are compatible with the rest of your build or contain the features you need. If anything needs updating and you actually bother to do it for correctness (most would just hope it's compatible), your only option is usually to rebuild the whole thing, even if your usage only needed one file.


"That makes everything slow, inefficient, and widely dangerous."

There is nothing faster or more efficient than building C programs. I'm also not sure what is dangerous about having libraries. C++ is quite different, though.


Of course there is. Raw machine code is the gold standard, and everything else is an attempt to achieve _something_ at the cost of performance, C included, and that's even when considering whole-program optimization and ignoring the overhead introduced by libraries. Other languages with better semantics frequently outperform C (slightly) because the compiler is able to assume more things about the data and instructions being manipulated, generating tighter optimizations.


I was talking about build time, not run time. But regarding run time: no other language outperforms C in practice. Your argument about "better semantics" has a grain of truth in it, but it does not apply to any existing language I know of; at least not to Rust, which in practice is still, for the most part, slower than C.


ODR violations are very easy to trigger unless you build the whole thing from source, and are ill-formed, no diagnostic required (worse than UB).


Neither "ODR violations" nor IFNDR exist in C. Incompatibility across translation units can cause undefined behavior in C, but this can easily be avoided.


C simply has less wording for it because less work has been put into it.

The same problems exist.


The ODR problem is much more benign in C. Undefined behavior at translation time (~IFNDR) still exists in C, but for C2y we have removed most of it already.


You can't fundamentally solve the issue of what happens if you call a function in another TU that takes a T but the caller and the callee have a different definition of T. Whether you call that IFNDR or UB doesn't make much of a difference.

C++ mitigates that issue with its name mangling (which checks that the type name is the same); Rust goes the extra mile and puts a hash of the whole definition of the arguments in the symbol name.

C has the most unsafe solution (no mitigation at all).


C++'s ODR requires the different definitions to consist of the same tokens, and if they don't, the program is IFNDR. Name mangling catches some of this, but that becomes less relevant today with more things being generated via templates from headers.

In C, it is UB when the types are not compatible, which is more robust. In practice it is also easy to avoid with the same solution as in C++, i.e. a single header which declares the object. But even if not, tooling can check consistency across TUs; it is just not required by the ISO standard (which Rust does not have, so the comparison makes no sense). In practice, a GCC LTO build detects inconsistencies.


Different parts of the build seeing inconsistent definitions of the same name is a clear consequence of building things piecemeal rather than as a single project -- which is precisely the problem I described higher up in this thread.

Things being built piecemeal also likely won't be using LTO (even if fat LTO allows this, no static library packages in a distro are built with it).


Not sure what you are trying to say. Inconsistent definitions are a consequence of being able to build things separately, which is a major feature of C and C++, although in C++ it no longer works well. People often avoid building things as one unit because LTO is more expensive, but occasionally running an LTO build will catch violations. When libraries get split up, the interface is defined via headers and there is little risk of an inconsistency. So I really do not think there is a major problem in C.


In my experience, no one does build systems right; Cargo included.

The standard was initially meant to standardize existing practice. There is no good existing practice: very large institutions that depend heavily on C++ systematically fail to manage their builds properly, despite plenty of third-party licenses and dedicated build teams.

With AI, how you build and integrate fragmented code bases together is even more important, but no one has yet designed a real industry-wide solution.


Speedy convenience beats absolute correctness any day. Humans are not immortal and have a finite amount of time for life and work. If convenience didn't matter, we would all still be coding in assembly or toggling hardware switches.


C++ builds are extremely slow because they are not correct.

I'm doing a migration of a large codebase from local builds to remote execution and I constantly have bugs with mystery shared library dependencies implicitly pulled from the environment.

This is extremely tricky because if you run an executable without its shared library, you get "file not found" with no explanation. Even AI doesn't understand this error.


The dynamic linker can clearly tell you where it looks for files and in which order, and where it finds them if it does.

You can also very easily harden this if you somehow don't want to capture libraries from outside certain paths.

You can even build the compiler in such a way that every binary it produces has a built-in RPATH if you want to force certain locations.


That is what I'm doing so I can get distributed builds working. It sucks and has taken me days of work.


It's pretty simple and works reliably as specified.

I can only infer that your lack of familiarity was what made it take so long.

Rebuilding GCC with specs does take forever, and building GCC is in general quite painful, but you could also use patchelf to modify the binary after the fact (which is what a lot of build systems do).


> I can only infer that your lack of familiarity was what made it take so long

Pretty much.

Trying to convert an existing build that doesn't explicitly declare object dependencies is painful. Rust does it properly by default.

For example, I'm discovering our clang toolchain has a transitive dependency on a gcc toolchain.


Clang cannot bootstrap in the same way GCC can; you need GCC (or another clang) to build it. You can obviously build it twice to have it be built by itself (bear in mind some of the clang components already do this, because they have to be built by clang).

In general though, a clang install will still depend on libstdc++, libgcc, GCC crtbegin.o and binutils (at least on Linux), which is typically why it will refer to a specific GCC install even after being built.

There are of course ways to use clang without any GCC runtime, but that's more involved and non-standard (unless you're on Mac).

There is also the libc dependency (and all sysroot aspects in general); while that is usually considered completely separate from GCC, the filesystem location and how it is found are often tied to how GCC is configured.


The Mars Polar Lander and Mars Climate Orbiter missions would beg to differ.

(And "absolute" or other adjectives don't qualify "correctness"... it simply is or isn't.)

