Aircraft do not have a singular unique identifier that is time invariant.
While it is true that aircraft have serial numbers issued to their airframe, by itself, aircraft serial numbers are not unique.
The only unique identifier for an aircraft across its lifecycle from production to end of life is a combination of the manufacturer, make and serial number.
I know this because I am on (for better or worse) the patent that involves defining that as a unique identifier for aircraft.
The combination of ICAO aircraft type designator + serial number is approximately the most permanent identifier for an airframe, and even that can change if the airframe is modified so heavily that it is no longer the previous type.
Personally, it boggled my mind that something as big as an aircraft did not have a simple time invariant unique identifier.
P.S. For those who might ask - aircraft registration numbers are like license plates, so they change - tail numbers can be ambiguous and misinterpreted depending on what is painted on the aircraft where, and ICAO 24-bit aircraft addresses are tied to ADS-B transponder boxes, which technically can be moved and reprogrammed between aircraft also.
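To make the composite-key idea concrete, here is a minimal Python sketch. The class name `AirframeKey`, its field names (taken from the wording above), and every sample value are illustrative assumptions, not anything from the patent or a real registry schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AirframeKey:
    """Hypothetical composite identity: no single field is unique on its own."""
    manufacturer: str
    make: str
    serial_number: str


# Hypothetical registry keyed by the composite; the registration is just a
# mutable attribute, like a license plate.
registry: dict[AirframeKey, dict[str, str]] = {}

a = AirframeKey("Cessna", "172", "1234")   # made-up serial
b = AirframeKey("Piper", "PA-28", "1234")  # same serial, different airframe

registry[a] = {"registration": "N12345"}
registry[b] = {"registration": "G-ABCD"}

# Re-registering changes an attribute, not the identity.
registry[a]["registration"] = "C-FXYZ"

print(len(registry))  # 2: two distinct airframes despite the shared serial
```

Because the dataclass is frozen, it is hashable and can serve directly as a dictionary key; the serial number alone could not.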
Go work at a big company. The patent lawyers come around and ask what you've been working on, and a month or two later, your name's on 10 patents, none of which make any sense whatsoever. If you're very lucky you might get a dollar bill for each.
For a while at google you would get $5k per patent submission and $10k for each approved(?) one. Given how easy it was, I could have matched my annual salary. It's depressing how easy it is to get a system architecture (unimplemented) patented at bigco.
You bury this simple idea in pages and pages of obfuscated tedium, and that's good enough that everyone is happy. Patent office gets their fee, lawyers get paid, company can say it has a supercharged patented innovation.
I was wondering the same thing. I've had to derive unique identifiers from hundreds of different data sets over the years. What makes it special when it's a plane?
> And the solution is almost always “model, make, and serial number.”
If you've ever spent time in old car forums, you learn that even this isn't enough because of production-line sloppiness.
Serial number re-use is rare, but it happens. Usually because a product had something detected that resulted in remanufacturing, but sometimes other things slip.
I know of systems that had two types of serial numbers which ought to have been the same but weren't, because they had been programmed at different end-of-line (EOL) stages when daylight saving time kicked in. One of the systems ran in UTC, the other in local time, and the date was part of the serial.
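A rough sketch of how that kind of mismatch can arise. The serial format, the `make_serial` helper, and the fixed UTC+2 offset standing in for local summer time are all assumptions for illustration; the real systems are not described in enough detail to reproduce.

```python
from datetime import datetime, timedelta, timezone


def make_serial(unit_number: int, programmed_at: datetime) -> str:
    """Hypothetical serial format: date stamp of programming plus a unit counter."""
    return f"{programmed_at:%Y%m%d}-{unit_number:05d}"


# One single instant, late in the evening UTC.
instant = datetime(2024, 3, 31, 22, 30, tzinfo=timezone.utc)

# Assumed local zone: a fixed UTC+2 offset standing in for summer time.
local_summer_time = timezone(timedelta(hours=2))

serial_utc = make_serial(42, instant)
serial_local = make_serial(42, instant.astimezone(local_summer_time))

print(serial_utc)    # 20240331-00042
print(serial_local)  # 20240401-00042
```

Same unit, same instant, two different serials, purely because the two stations disagreed about what day it was.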
I'm only joking a little. Funny thing, surnames aren't actually that old for Europeans. Most of history there'd be maybe two people with the same name. They solved it back then very much the same way we solve it now.
Serial Number = was supposed to be the order the things were made in (i.e. the number in the serial order), but this is often obfuscated or repeated [1].
In cars, make would be like Ford. Model would be like Focus, serial number would be VIN (vehicle identification number - in cars, those are generally unique!).
Ford Focus + VIN, basically.
There is a theoretical concept of a unique identifier for everything... including people from ISO under ISO 8000.. combining a natural location identifier (eNLI)[2] and an ISO8601 timestamp - to represent "where and when a thing is considered to be born" - a point in time and space the thing is considered to come into existence.
I think the idea is called "natural person identifier" for humans.
This ID has to be assigned but I think you can see the idea at least.
I suppose this doesn't include make/manufacturer but realistically that isn't needed for uniqueness in this scheme, only as descriptive metadata for things that have one.
[1] This is related to the fact that if serial numbers were truly serial, one could estimate the rate and quantity of production which is considered sensitive information by most manufacturers. This relates to "the German tank problem" - during WWII the allies were able to accurately estimate the production of German tanks by analyzing the serial numbers off captured tanks.
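For the curious, the textbook frequentist estimate for the German tank problem is m + m/k - 1, where m is the largest serial seen and k is the number of samples. A minimal sketch, assuming serials run from 1 to N with no gaps and are sampled uniformly without replacement (the function name and the sample serials are made up):

```python
def estimate_total_production(observed_serials: list[int]) -> float:
    """Frequentist German-tank estimate: m + m/k - 1.

    Assumes serials run 1..N with no gaps and that the sample is drawn
    uniformly without replacement.
    """
    k = len(observed_serials)
    if k == 0:
        raise ValueError("need at least one observed serial number")
    m = max(observed_serials)
    return m + m / k - 1


print(estimate_total_production([19, 40, 42, 60]))  # 74.0
```

This is exactly why manufacturers who care about hiding production volume avoid truly sequential serials.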
The full Theseus treatment would need you to take [part of] the airframe that first plane discarded, then recertify it for use under its original serial number.
The way the Aircraft of Theseus is generally resolved is there’s a piece of metal called the “data plate”. This is the airplane as far as the FAA is concerned. I’ve been in a vintage biplane that was completely rebuilt from the data plate up. I think they got it for $40k.
It was worth it because without that, a home built airplane would have an experimental certificate and you couldn’t sell rides in it.
Does the data plate not limit the scope of what can be built around it?
In other words had Virgin Galactic built the VSS Enterprise around the data plate of a Cessna 172, would it then no longer have been an experimental aircraft?
It does limit, but I suspect a lot less for the rebuilt from ground up biplane, than for a certified for airline service aircraft (a commercial airliner).
> Personally, it boggled my mind that something as big as an aircraft did not have a simple time invariant unique identifier.
It boggles my mind that despite not having some sort of universal system things work as well as they do.
Aviation grew up relatively insular, and each country that had any sort of aircraft manufacturing did things their own way until fairly recently. Arguably, the first half of the history of aviation is a kind of free-for-all. The fact that we now have a globalized airline industry that mostly follows some kind of standards is the mind-blowing part to me. And I suspect if we weren't mostly down to a dozen or so manufacturers for the vast majority of airliners, even that wouldn't be the case.
Yeah but at some point countries started buying larger planes from only one or two manufacturers. At that point the manufacturers could standardize things.
I don't think it even matters. If what you're doing doesn't piss off any potential enforcers you're good to go. If whatever you're doing does then you're screwed (or will be tied up in court and paying tons of lawyers fees) regardless.
Engines are actually changed fairly frequently because they're a wear component on most airplanes. They are also sometimes updated to a newer version or even an entirely different manufacturer. And often it's faster and cheaper to swap in a new engine that's ready to go rather than wait for the one that's attached to be overhauled, so the same engine might see service on multiple airframes.
Most cars don't operate for >12 hours a day every day. Last time I randomly checked the flight history of a Ryanair 737 on flightradar24, it had spent over 18 and a half of the previous 24 hours airborne.
And many commercial airliners are sold without engines at all.
The operators, such as Delta, do not actually own the engines on the aircraft they fly, even though they own the aircraft. The engines are rented from e.g. Pratt & Whitney along with a maintenance contract. That said, the engines are in fact installed at the factory.
As usual with these lists, they would much benefit from more in-depth explanation. This list at least deigns to link to examples for many of the claims (like a flight that leaves on time but arrives 40 hours late [1]), but doesn't explain what happened.
Having said that, many of the links are very informative. For example the crater on Mars that has an ICAO airport code [2]: "On 19 April 2021, Ingenuity performed the first powered flight on Mars from Jezero, which received the commemorative ICAO airport code JZRO."
This is often for boring reasons - the two week flight was a Google Balloon, the flight was delayed for 40 hours due to bad weather, ADS-B is set by the pilot and many pilots simply set it wrong, and so on.
My favorite falsehood was from ca. 17-18 years ago: that altitude is always positive. There are airports in the mountains where approaching altitude can be below the runway and thus reported as negative.
A lot of these so called "falsehoods" are just design failures on the part of programmers. Someone did it badly first, and it stuck, and a second person came in later and is surprised by the bad design. That's not really interesting, it happens all the time in software. So much so that seasoned engineers have come to expect poor design until proven otherwise.
Things like flight numbers not having reasonable semantics, or conceptual pollution of what a flight is to include multiple take offs and landings are bad design, plain and simple. Just model the problem correctly e.g. maybe a Trip is multiple Flights, or Flights have multiple Legs. This isn't aviation specific. These are generic problems that programmers can and should get right.
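One way the "Flights have multiple Legs" idea could look, as a rough Python sketch. The class names, the `stops` helper, and the flight number and airport codes are all invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Leg:
    """One takeoff and one landing."""
    origin: str
    destination: str


@dataclass
class Flight:
    """A marketed flight number that may cover several legs."""
    flight_number: str
    legs: list[Leg] = field(default_factory=list)

    @property
    def stops(self) -> list[str]:
        """Every place the aircraft touches, in order."""
        if not self.legs:
            return []
        return [self.legs[0].origin] + [leg.destination for leg in self.legs]


# Made-up flight number and routing, purely for illustration.
flight = Flight("XX123", [Leg("AAA", "BBB"), Leg("BBB", "CCC")])
print(flight.stops)  # ['AAA', 'BBB', 'CCC']
```

Keeping takeoffs and landings on `Leg` means a flight number with an intermediate stop needs no special case.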
Some of it is intrinsic to the domain, like flights not all having gates, or not landing at airports. That was a new tidbit for me.
The claim isn't that programmers go around literally believing falsehoods about a given domain. The whole point of the "falsehoods programmers believe about X" genre is a tongue-in-cheek way of listing the kind of bad design assumptions that happen in a given domain, and I believe that is very interesting indeed.
The fact remains that software that models real-life events or information is making normative assumptions about what can and cannot happen in the domain, due to the very nature of software, and these assumptions are knowingly-or-not being introduced by programmers. If for any given domain we had hundreds of human notaries, scribes or typists managing information instead of software, their mistaken assumptions wouldn't matter—they would simply go "Oh, that's odd", make the necessary adjustments, and learn from the experience. But as long as software is a prescriptive model of what it is representing, it will be valuable to highlight the "falsehoods" that its creators may accidentally prescribe into it.
the point of the article (just as with the one about names) is that there are "reasonable defaults" many people would believe - that don't work in practice and become gotchas
just because you know enough to see that an assumption is unreasonable doesn't mean it won't seem reasonable to many others
It does. The distinction is between a misunderstanding of aviation and a misunderstanding of which of many possible models is in place.
I'm interested in misunderstandings of aviation. How did aviators think about it before software ever entered the picture? Which of their concepts are coherent, and why do they use them? What reasoning power do those concepts get them? It's interesting when programmers misunderstand that.
It's less interesting when programmers set up a poor system for keeping track of things. It could be buses or trains or parts in a factory. It's not aviation specific, and it's run-of-the-mill bad design when it's done badly. Apparently the first software engineers to model aviation were really bad at it.
And as programmers, we inevitably spend most of our time dealing with these weird edge cases, because the stuff that makes sense is generally incorporated into our initial modelling and becomes a solved problem.
Many of the points in the article, including some in your examples, pre-date programmers and programming entirely. Still more emerged before widespread use of computer automation in aviation.
Like, sure, a better data model would be great. But switching to one is largely a human system migration effort, not a software problem.
I develop software for flight data analysis at a company that makes flight data recorders. Our focus is mainly helicopters, but some fixed wing. Dealing with aircraft that may take off or land at a base, hospital, roof, parking lot, football field, airport, golf course, etc., I feel like most of my days are spent on all sorts of falsehoods about aviation.
Funny how the common thread through many of these 'Falsehoods...' posts is that many programmers think that systems designed by humans, for humans, and kept running by humans will rigidly adhere to a set of rules and don't have edge cases.
Us programmers like to distill everything down to rigid sets of rules because that's how our mind operates. The fewer probabilistic "analog" parameters, the better. Of course the real world doesn't work this way.
It is by no means specific to programmers. Ask someone who is learning French, for instance. Rules with too many arbitrary exceptions.
What is specific to programmers is that their tool performs at its best with simpler rules, so their job is to find the necessary and sufficient set of rules, and they will dismiss most of the cases pointed out by this article as unimportant exceptions the software won't handle.
Natural languages are kinda weird about this because most people don't remember their rules as rules, they learn by example, by finding patterns and kinda extrapolating them.
English is a foreign language to me. But I somehow managed to learn it without learning the rules. I can say things correctly-ish without being able to explain why I used this particular grammar.
> Ask someone who is learning French, for instance. Rules with too many arbitrary exceptions.
I took French in middle school, and it was always a running joke that the teacher spent the first 5 minutes on the rule, and the next 40 minutes on the exceptions.
In the end the data has to fit into structures or tables that can be processed by some algorithms. If the system is not rigid to a certain degree it would become unmaintainable or full of bugs or both.
Not really. It's just that software by definition must create a model of the domain it attempts to handle. And a model is, in the end, a set of rules. With an absence of rules, the software can't really do anything, and would be pretty pointless for actually solving any problem. The alternative is to hand the users Notepad and say "knock yourselves out".
I'd argue that programmers are indeed much more aware of how many exceptions and edge cases most real world domains have. Ask a lay person about such a simple thing as leap seconds, for instance, and they'll often believe you're making shit up.
The profession of programming is fundamentally about the interface between squishy human systems and rigid rule-based machines. No surprise that keeps coming up.
It is the classic scenario of confusing the map with the territory.
In the map everything is clear. It is clear what a "plane" is, what "airports" are, and what their relationship is. And transferring that into a computer program is straightforward.
In the territory everything is fuzzy. None of the definitions are without edge cases and the expected relationships are often violated in surprising ways.
Aviation isn't unique here, every system suffers from the distinction between its actual function and the abstract description of that system.
I think it's more, "you might think as a programmer writing software that models the world of aviation that you could assume the following things— but alas."
Software unfortunately follows rigid rules so the challenge is finding a set of rigid rules that can encompass reality. It would be pretty natural if you were writing a database schema that a flight would have a departing airport and an arriving airport— but alas.
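A rough sketch of a record that does not bake in "departs from an airport, arrives at an airport". All type names, fields, and coordinates here are assumptions for illustration, not a real schema.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Airport:
    icao: str


@dataclass(frozen=True)
class AdHocSite:
    """Anything that is not an airport: a rooftop, a field, a river."""
    description: str
    latitude: float
    longitude: float


Location = Union[Airport, AdHocSite]


@dataclass
class FlightRecord:
    departure: Location
    arrival: Optional[Location] = None  # unknown until the aircraft is down


def describe(location: Optional[Location]) -> str:
    if location is None:
        return "unknown"
    if isinstance(location, Airport):
        return location.icao
    return f"{location.description} ({location.latitude:.3f}, {location.longitude:.3f})"


# Made-up coordinates for a hypothetical hospital helipad.
record = FlightRecord(Airport("KDEN"), AdHocSite("hospital helipad", 39.7, -105.0))
print(describe(record.departure))  # KDEN
print(describe(record.arrival))    # hospital helipad (39.700, -105.000)
```

The price is that every consumer has to handle three cases (airport, ad hoc site, unknown) instead of one.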
Well what I'm speaking to more is that most systems that you model, most of the model is already assumptions. So natural or not, that database schema is already invoking assumptions which may or may not be false. Especially when dealing with any system where humans are directly involved in it. For many things, there's no exhaustive list of rules that will cover all the cases. As they say, if you make something idiot-proof, they'll invent a better idiot.
And this is why I much prefer surrogate values for primary keys over natural values, and why I've gravitated to using UUID values for surrogates, not integer identities.
A theme running through the article is "this value is unique" and "this value does not change". And of course those are both wrong.
So when designing databases now I assume "everything changes, nothing is unique" (even when the domain "expert" professes it is).
This approach solves so many problems and saves so much time later on when it turns out that the "absolutely, positively, unique for ever" natural key isn't.
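A minimal sketch of the surrogate-key approach, using an in-memory dict in place of a database. The `Aircraft` class and the sample registrations are hypothetical.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Aircraft:
    # Surrogate key: carries no meaning, so nothing in the domain can force it to change.
    id: uuid.UUID = field(default_factory=uuid.uuid4)
    # "Natural" attributes are ordinary mutable columns.
    registration: str = ""
    serial_number: str = ""


fleet: dict[uuid.UUID, Aircraft] = {}

plane = Aircraft(registration="N12345", serial_number="1234")  # made-up values
fleet[plane.id] = plane
original_id = plane.id

# The registration changes; references held elsewhere by id stay valid.
plane.registration = "G-ABCD"

print(fleet[original_id].registration)  # G-ABCD
```

Nothing that refers to the aircraft by `id` has to be touched when a supposedly permanent attribute turns out not to be.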
The tradeoff you’re making is performance, sometimes a lot depending on your RDBMS and table size. For smaller tables, under 10,000,000 rows or so, you won’t really notice much, but in the hundreds of millions or billions, you definitely do.
A UUID is at best 2x larger than even a BIGINT, thus the index size is 2x larger. If you aren’t using v1 or v7, it’s also not k-sortable. But most importantly for MySQL (and optionally SQL Server) if the table contains things related to a common entity, like a user’s purchases, the rows are now scattered around the clustering index’s B+tree. That incurs a huge amount of I/O on large tables, and short of a covering index and/or partitioning (which only masks the problem by shrinking the search space), there is no way to improve it. If instead the PK was (user_id, some_other_identifier), all records for a given user are physically co-located.
Size is in play, yes, but the 8 extra bytes per row is likely negligible compared to the row size.
Is there a case where the size matters? Sure. But you can't discuss the key's space cost for 10 billion rows without comparing it to the total space cost of those 10 billion rows.
SQL Server lets you cluster by any index, so if your child record table will benefit from clustering by ParentGuid then go for it.
MySQL stores a copy of the PK in every secondary index, so it can start adding up quite a bit. I agree that the overall size of 10 billion rows would dwarf that, but since you're presumably doing some decent indexing on a table of that size, index size matters more IMO.
For any RDBMS (I assume... I don't know a lot about SQL Server or Oracle), the binpacking for pages also impacts query speed for queries where there are many results.
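Back-of-the-envelope arithmetic for the PK-copy-per-secondary-index point. This is a sketch that counts raw key bytes only; it assumes the UUID is stored in 16 bytes, and it ignores page overhead, fill factor, and compression, so real numbers will differ.

```python
BIGINT_BYTES = 8
UUID_BYTES = 16  # assumes a 16-byte binary column; a 36-char text column would be larger


def extra_key_bytes(rows: int, secondary_indexes: int) -> int:
    """Extra raw key bytes from a 16-byte PK versus an 8-byte one.

    Counts the clustered index once plus one PK copy per secondary index.
    """
    copies = 1 + secondary_indexes
    return rows * (UUID_BYTES - BIGINT_BYTES) * copies


rows = 10_000_000_000
print(extra_key_bytes(rows, 0) / 1e9)  # 80.0 (GB, clustered index only)
print(extra_key_bytes(rows, 4) / 1e9)  # 400.0 (GB, with four secondary indexes)
```

Whether that is negligible depends entirely on how wide the rows are and how much of the working set has to fit in memory.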
But it doesn't help much, as the surrogate only lives in your system.
So now some information comes in from outside the system that something happened with a plane, and you still have to find which surrogate id that plane has in your system.
You may decide two things happened to two different planes whereas another system considers it the same plane both times, and vice versa.
The uuid keys make it easy to change some value, but won’t solve the issue of keeping a record of historical changes.
UUID keys PLUS some form of versioning with creation dates will let you change an airport name and let you know what the airport name was on some arbitrary date in the past. Useful for backfills and debugging
You don’t need all that; any candidate key (even natural) with the addition of a datetime would work. What was the definition of Airport X before Datetime Y? And after? Etc.
> But if your natural key is the thing that changed you’d never know that airport x was renamed to airport y.
Correct, which is why you need the addition of a DATETIME to indicate when that identifier is valid.
> And when you renamed the airport, you’d need to add new entries in all the other tables that used airport name as a foreign key
No, because you wouldn't use the name in the key, you'd use a code like ICAO, though there are pseudo-ICAO codes for some aerodromes, so whether or not you want to be pedantic about naming is a personal choice. Then use FK constraints. Example:
-- One row per physical aerodrome, independent of its codes and names
CREATE TABLE airport_physical (
    id INTEGER PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
    coordinates POINT NOT NULL,
    country_code_alpha2 CHARACTER(2) NOT NULL, -- ideally this would be a FK to an ISO 3166-1 alpha-2 table
    opened_at DATE NOT NULL,
    closed_at DATE DEFAULT 'infinity'
);
-- Code assignments over time; the same ICAO code can later belong to another airport
CREATE TABLE airport_code (
    icao TEXT NOT NULL CHECK (length(icao) <= 16), -- reasonable length, but can easily be changed
    iata CHAR(3) DEFAULT NULL,
    airport_physical_id INTEGER NOT NULL REFERENCES airport_physical(id) ON UPDATE CASCADE ON DELETE RESTRICT,
    effective_date DATE NOT NULL,
    end_date DATE NOT NULL DEFAULT 'infinity',
    PRIMARY KEY (icao, effective_date)
);
-- Names over time for a given physical aerodrome
CREATE TABLE airport_name (
    airport_physical_id INTEGER NOT NULL REFERENCES airport_physical(id) ON UPDATE CASCADE ON DELETE RESTRICT,
    name TEXT NOT NULL CHECK (length(name) <= 126),
    effective_date DATE NOT NULL,
    end_date DATE DEFAULT 'infinity',
    PRIMARY KEY (airport_physical_id, effective_date, name)
);
This would let you model edge cases like John F. Kennedy International Airport, née Idlewild Airport, which had the ICAO "KIDL" from its opening (I mean, probably before that as well, but for the sake of argument assume you care when it was operating) on 1948-07-01 until 1964-01-01, but its name was changed to John F. Kennedy on 1963-12-24. It also allows you to model the reuse of ICAO codes, since Indianola Municipal Airport received the ICAO code "KIDL" following its release by JFK.
Is this easier to do than surrogate keys? Not really, no, but IMO it's easier for a human to understand when presented with temporal changes, allows for edge cases like an airport's designator or name changing while flights are en route, and for flights (which would be the largest table), they can use `icao` and their departure/arrival datetime (which the table would need to model anyway) to effectively link to the other tables.
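The "code plus date" lookup that flights would rely on can be sketched in a few lines of Python. The names here are hypothetical, `date.max` stands in for 'infinity', and the start date of the second assignment is an assumption made only so there is something to query; I don't know when the code was actually reassigned.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class CodeAssignment:
    icao: str
    airport: str          # stand-in for airport_physical_id
    effective_date: date  # inclusive
    end_date: date        # exclusive; date.max plays the role of 'infinity'


def airport_for(assignments: list[CodeAssignment], icao: str, on: date) -> Optional[str]:
    """Return the airport that held `icao` on the given date, if any."""
    for a in assignments:
        if a.icao == icao and a.effective_date <= on < a.end_date:
            return a.airport
    return None


# First row uses the dates given above; the second row's start date is assumed.
assignments = [
    CodeAssignment("KIDL", "Idlewild / JFK", date(1948, 7, 1), date(1964, 1, 1)),
    CodeAssignment("KIDL", "Indianola Municipal", date(1964, 1, 1), date.max),
]

print(airport_for(assignments, "KIDL", date(1960, 6, 1)))  # Idlewild / JFK
print(airport_for(assignments, "KIDL", date(1990, 6, 1)))  # Indianola Municipal
print(airport_for(assignments, "KIDL", date(1940, 1, 1)))  # None
```

The half-open interval (start inclusive, end exclusive) avoids a code resolving to two airports on the handover date.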
> Flights that depart from a gate only leave their gate once
I went on a vacation to New Zealand just before new visa requirements took effect, so while I entered legally, at some point, I couldn't reenter. As I was going through passport control to exit, I asked what would happen if the plane had a mechanical issue and I had to spend the ~night, but couldn't reenter. The border agent said they can undo an exit.
I always look at these "Falsehoods Programmers Believe..." lists as a source of tests. Each item should spawn a number of unit or integration tests that will help to uproot any of these assumptions that were incorrectly baked into your software.
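As a rough illustration of turning list items into tests: the `validate_flight` function below is a hypothetical stand-in for whatever your system does, and each test pins down one assumption that must not creep back in.

```python
from typing import Optional


def validate_flight(origin: str, destination: Optional[str],
                    gate: Optional[str], altitude_ft: float) -> list[str]:
    """Hypothetical validator: returns a list of problems, empty if acceptable."""
    problems = []
    if not origin:
        problems.append("origin is required")
    # Deliberately no checks that destination or gate exist, or that altitude >= 0.
    return problems


def test_flight_without_gate_is_valid():
    assert validate_flight("AAA", "BBB", gate=None, altitude_ft=1000) == []


def test_negative_altitude_is_valid():
    assert validate_flight("AAA", "BBB", gate="A1", altitude_ft=-10) == []


def test_flight_without_destination_is_valid():
    assert validate_flight("AAA", None, gate=None, altitude_ft=0) == []


for test in (test_flight_without_gate_is_valid,
             test_negative_altitude_is_valid,
             test_flight_without_destination_is_valid):
    test()
print("all falsehood tests passed")
```

If someone later adds a "sanity check" that altitude must be positive, the second test fails and the falsehood gets caught before production does it for you.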
Each of these did indeed spawn tests. I used to work there and at the time there were over a thousand, ranging from humdrum to David Blaine skydiving. They're a crowd who really put a focus on good engineering.
I find this list strange. I have only a passing interest in aviation and I would not believe very many of these.
What made the corresponding lists for names and time interesting were that it was genuinely surprising to realise that their statements were actually false. I don't get that feeling with these.
Like the top level comment about identifiers for airplanes -- why would they have them? That sounds baffling to me. With ownership changes, continuous upgrades, extending airframes, repurposing etc. I would be surprised if there was a stable identity.
Reading through this list of “false assumptions” is like a trip down memory lane for anyone who’s ever tried to wrangle real-world data. I can’t count the number of times I’ve been bitten by something that “should” be true but absolutely isn’t in practice. The idea that even basic things like flight numbers, departure times, or runway assignments can be so fluid is both hilarious and a little terrifying.
It really drives home the idea that when building systems that rely on data like this, the real challenge isn’t in writing code to handle the “happy path.” It’s in anticipating the weird edge cases, the partial data, the inconsistencies that show up because real life doesn’t always follow the rules we expect.
My impression is that every single older (pre-2010) computer system that manages Brazilian aviation fell for that and fixed it with a hack.
> Airports never move
Also, runways never move. Also, if runways move, they don't change direction. Also, if airports or runways move, there will be some construction work beforehand.
I'd add "aircraft only land on runways" there too. And "ok, aircraft only land on runways and heliports".
I would assume it's somewhat speaking to the prevalence of many informal landing strips, and also that river landings are probably fairly common too. I'd have to imagine places like Alaska might also have to deal with that, especially if you have small local 'airlines' (which are probably just a handful of bush planes really) that operate from an actual registered airport.
Even the Eastern US has to deal with river and water landings all the time. You can book a scheduled flight from the East River right in-between Manhattan and Brooklyn to Martha's Vineyard, or the Hamptons or a number of other destinations. Not to mention those happen in the middle of arguably the most complex commercial airspace in the world.
It's pretty cool to be on a ferry and see a plane land basically next to you in the middle of the river.
I've written various types of aviation support software on and off since the early 1980s.
Among my favourite planes were the Grumman Mallards still owned and operated by Paspaley Pearling out of Mungalalu Truscott and other Kimberley airbases.
They're classic 1950s twin-engined amphibious aircraft that landed anywhere up and down the Kimberley Coast for pearling transfers.
Here's one that is only kind of mentioned: there are actually different altitudes. If you use ADS-B data you will only get the barometric altitude, which is not calibrated to the ground pressure level. For example, if you watch ADS-B data of flights into Denver it appears that every aircraft is crashing down ~5000ft during landing.
Above ground level and above sea level are two easily confusable ways of reporting altitude. Reporting a value above sea level on ADS-B so that it doesn't change is probably the right thing to do, especially since airfields report their local pressure and have their elevation recorded.
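A rough sketch of how the three numbers relate. The 27 ft per hPa factor is a rule-of-thumb approximation near sea level, not a standard constant, and the 5,400 ft field elevation is a round number assumed for a Denver-like airport; function names are made up.

```python
STANDARD_PRESSURE_HPA = 1013.25
FEET_PER_HPA = 27.0  # rule-of-thumb approximation, assumed for illustration


def msl_from_pressure_altitude(pressure_altitude_ft: float, qnh_hpa: float) -> float:
    """Approximate altitude above mean sea level from pressure altitude and local QNH."""
    return pressure_altitude_ft + (qnh_hpa - STANDARD_PRESSURE_HPA) * FEET_PER_HPA


def agl_from_msl(msl_ft: float, field_elevation_ft: float) -> float:
    """Height above the ground at a field of known elevation."""
    return msl_ft - field_elevation_ft


field_elevation_ft = 5400.0  # assumed round figure for a high-elevation airport

# An aircraft sitting on the runway on a standard-pressure day:
msl = msl_from_pressure_altitude(5400.0, qnh_hpa=1013.25)
print(msl)                                    # 5400.0
print(agl_from_msl(msl, field_elevation_ft))  # 0.0

# Same reported pressure altitude on a high-pressure day:
print(msl_from_pressure_altitude(5400.0, qnh_hpa=1023.25))  # 5670.0
```

A tracker that treats the reported number as height above ground will be off by the whole field elevation, plus or minus the pressure correction.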
Expecting every runway to be above sea level is another falsehood. AMS has runways 3 meters below sea level, and I'm sure there's even lower ones on the east side of Africa.
>The only unique identifier for an aircraft across its lifecycle from production to end of life is a combination of the manufacturer, make and serial number.
>I know this because I am on (for better or worse) the patent that involves defining that as a unique identifier for aircraft.
Isn't that blindingly obvious? If so, how did it get to be a patent? And is someone now extracting rent from it?
Does it matter if something is obvious or not for getting a patent granted? From some casual looks at various US patents, it seems to be "whoever first writes an obtuse patent about thing X gets it granted"; it doesn't really matter if the thing is "novel" or not, just that no one tried to submit it before.
Legally it's supposed to matter, yes. Non-novel or obvious ideas are according to the law not eligible to be patented. In practice the mechanism to decide both of these is broken.
I see, that's really not visible in practice. Silly example perhaps, but US5443036A comes to mind which just shows how broken the system is:
> A method for inducing cats to exercise consists of directing a beam of invisible light produced by a hand-held laser apparatus onto the floor or wall or other opaque surface in the vicinity of the cat, then moving the laser so as to cause the bright pattern of light to move in an irregular way fascinating to cats, and to any other animal with a chase instinct.
How on earth is anyone supposed to be able to take the patent system as a whole seriously when there are 100s (if not 1000s) of examples like that, which obviously shouldn't be approved if "novel" and "non-obvious" ideas are required?
The US patent system seems profoundly broken. Given that the patent system seems much less broken in other developed countries and the vast wealth and resources of the US, I assume it is broken on purpose?
Obviously it doesn't actually work. I submitted the application as an experiment to see how hard it would be to sneak this past the patent office. The answer turns out to be: not hard at all. In fact, there's pretty much an algorithm for it:
1. Write up a half-assed patent application.
2. Submit it and wait for it to be rejected (which it almost certainly will be).
3. Read the rejection notice and tweak the application to address every individual point that was made.
4. Go to step 2. Repeat until the patent office capitulates and issues your patent.
In my experience (my name is on six patents) it has never been necessary to do more than one iteration.
The reason this works is that the patent office is required by law to give specific reasons for rejecting a patent application. They are not allowed to simply say, "This is obviously stupid." If they see that you are going to persist, it's a lot easier for them to just give you the damn patent (it's no skin off their nose) than to keep doing your homework for you.
With AI, following this procedure becomes borderline trivial. In fact, I'm a little surprised that the patent office isn't being overwhelmed by AI-generated patent applications. (Or maybe they are and it just hasn't made it into my news feed.)
My understanding is that quantum entanglement can't be used to transmit information. Given this fact and the fact that Einstein started out as a patent clerk, the great man must be turning in his grave!
I thought I would have to disguise the bogosity under some plausible-sounding pseudo-science, and to be fair, it's actually quite tricky to figure out why my invention doesn't work (though the fact that it doesn't work, and can't possibly work, should be pretty obvious to a patent examiner). But at this point I think I could probably get a patent for summoning dragons to slay my enemies. (Hm, there's an idea...)
It's not just the technology, it's the employment of it too. In 1993 this was a new way to use lasers, which a decade before were too expensive, delicate and power hungry to use as such.
Put another way, the change can be incremental. Building upon what is. Without this, pretty much all incremental science would lose funding, for the moment you invent, regardless of cost, it'd just be copied.
If you've ever done hardware, even a toy, it's not simple.
Extensive prototypes, testing for drops, hand fit, assembly at the factory, and more.
Devs today can't even conceive of making a 100% stable product to be shipped on floppy and never updated. Reshipping for bugfixes could break a company in the old days.
Now try that with hardware!
And all those tweaks, fixes, tests can be copied in a second without patents.
I think separating software and hardware patent discussions would be better here, because hardware patents are required.
> In 1993 this was a new way to use lasers, which a decade before were too expensive, delicate and power hungry to use as such.
I think your timescale is slightly off, but I don't know enough about laser history to say definitely. But judging by what I could find, in 1981 Popular Science seems to have run an ad for laser pointer devices, aimed (no pun intended) towards consumers:
> It wasn’t until the 1980s that lasers became small enough, and required so little energy, that they finally became cheap enough to be used in consumer electronics — take this funky laser pointer from the early 1980s, for example. The November 1981 edition of Popular Science features a Lasers Unlimited advertisement for an assortment of laser pointing devices, including a ruby laser ray gun, a visible red laser lightgun, multi-color lasers and laser light shows, all of which were selling for less than $15 (equivalent to about $42 today) - https://melmagazine.com/en-us/story/a-dazzling-history-of-th...
So if they became usable by consumers in the 1980s, I'm about 99% confident at least one individual used one for playing with their cats.
But since the author of the patent just happened to have spent the time (10 years later) to write the patent, they got it awarded to them.
Well, as a senior software engineer and commercial pilot ... I am left confused.
Not all the things in the list, because I am aware of those. I might have missed the runway numbers changing based on shifting magnetic field of the earth, but that's a thing too. Runway 22? That's now Runway 21.
But why programmers specifically would believe this, as opposed to ... any other profession that is not aviation?
There is a genre of articles that list similar non-intuitive facts about various domains (people's names, music, etc). The relation to programmers is that they are often creating software systems where some of these facts come into play, e.g., by using some values as primary keys, foreign keys, etc.
The article isn't meant to imply that only programmers would believe these. It's just a little niche of 'Falsehoods that Programmers believe about XYZ' sort of articles that became popular because programmers tend to write software that ends up interacting with real world systems that have edge cases many programmers would not consider if they're not dealing with the problem space for a while.
Programmers are rarely genuine experts in the domain, we depend on the subject matter experts to define the parameters for the system under construction. Some SMEs have deep knowledge and would know these, but many, while very knowledgeable, have not encountered the specific edge cases nor heard about them. As a result, the design incorporates these inaccurate assumptions, and for a long time it works fine. Until it doesn't. And by the way this is why waterfall and big design up front fails as a development methodology.
tl;dr: you can't know absolutely everything ahead of time.
Back when I was designing an app for air navigation, I came up with an alternative color scheme for various types of color blindness only to be told the target users were not allowed to be color blind (it was in France, much stricter than elsewhere it seems).
They can legally land in any navigable body of water (some restrictions in some areas for national preserves) but some lakes have defined water runways.
- A private pilot who departs their local airport without filing a flight plan and flies around for a while.
- A charter jet that departs whenever the passengers show up.
- A medevac helicopter departs a hospital to return to its base. While en route, it is rerouted by dispatch to pick up a patient at a different hospital.
Haha, nice! My head as a programmer explodes while reading this list, because I feel like these are all reasonable assumptions and I feel how they are painfully discovered late into the implementation.
Also, feeling myself stupid very quickly. Very nice summary, bravo!
>If an aircraft diverts to another destination, it won’t divert again.
Hehe, I was once told we couldn't land at our destination A, so we got diverted to B; while on our way to B we were told we are actually going to C; and, while on our way to C, A became available again so the plane did a U-turn and we flew back to A, landing with a ~3 hour delay.
Honestly I am surprised by some of the points. But after reading all of it, now I am wondering as an outsider, what the hell is a "flight" if there's basically no good abstraction for this mess? What does it mean when a new flight is created, or what does the existence of any single flight mean?
Flight is a concept which is at least a body moving without support to an underlying surface. Everything else are human plugins added to help us exploit the concept in various circumstances. Any additional constraints on the concept are valid only in the circumstances they were invented for. Enumeration of the circumstances should take into account the participants of the communication context where the word "flight" is used.
I had known about some of these, and I had thought that some others are at least possible.
I know that there is an ICAO code on Mars (since I had read about it before).
I think there are some airports that have an ICAO code but no IATA code and vice versa, and some have a "pseudo-ICAO" code with letters and numbers together.
Perhaps useful to produce a list of true constraints in contrast to false ones. Perhaps that would result in too many “except for”, “apart from” and “subject to” statements.
> Sounds like a list of edge cases just like any other area.
That's exactly the point. The famous example (Falsehoods Programmers Believe About Names) has examples I have encountered in medical databases. If a programmer somewhere didn't fall into the trap, patient names in a medical database would have been better managed and may have avoided duplication, lost records, etc.
To the best of my recollection, the only way to tell a ship from a boat is to watch it make a "high" speed turn: ships lean out, boats lean in. But this is probably incorrect, just like all of my education was.
Submarines are considered "boats". The leaning thing is only about the position of the rudder in relation to the center of gravity while moving. Small boats that ride high in the water will lean into turns because the rudder is well below the center of mass. But a hydrofoil "ship" will do the same. Many small sailboats will lean out of turns too, even though they are "boats".
On the ADS-B receiver side, I'd add "Each ADS-B packet will be clearly heard by the on-ground receiver, there will be no other radio station sending an ADS-B packet when another station is actively transmitting" and "Only actual ADS-B stations use the 1090 MHz frequency, no one will attempt to maliciously jam the entire band".
It's like receiving some API documentation that confidently declares some field as an ENUM, and then a few hundred million rows later you discover that was more like a suggestion and it's actually more like a free text field... sigh
Now you have to specify whether or not it’s moved during queries (and what if it moves again?) There’s probably a more elegant way I’m not thinking of, but standard created_at and updated_at fields would work: if a given date is <= the move date, it’s the original airport, else the new one. Rinse and repeat if it moves again.
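A rough sketch of that idea, assuming explicit validity windows (`valid_from` / `valid_to`) instead of plain created_at/updated_at. The airport code "XXXX", the coordinates and the move date are all invented for illustration:

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class AirportLocation:
    code: str
    lat: float
    lon: float
    valid_from: date           # first day this location applies
    valid_to: Optional[date]   # last day it applies; None means "still current"


def location_on(history: List[AirportLocation], code: str, day: date) -> Optional[AirportLocation]:
    """Return the row whose validity window contains `day`, or None."""
    for row in history:
        if row.code != code:
            continue
        if row.valid_from <= day and (row.valid_to is None or day <= row.valid_to):
            return row
    return None


# Hypothetical airport "XXXX" that moved on 2020-06-01.
history = [
    AirportLocation("XXXX", 10.0, 20.0, date(1990, 1, 1), date(2020, 5, 31)),
    AirportLocation("XXXX", 10.5, 20.5, date(2020, 6, 1), None),
]

before_move = location_on(history, "XXXX", date(2019, 1, 1))
after_move = location_on(history, "XXXX", date(2021, 1, 1))
```

If it moves again, you close the current row's window and append another one; queries don't change.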
Bit of a rant: what annoys me about these lists is how they just give off a huge "you are dumb for making any assumptions, how could you not think of <extremely obscure edge case>" vibe. I'd be interested to see what the effects are of these assumptions failing, because often they are pretty reasonable assumptions for a reasonable subset of the universe. Software is imperfect and you can't cover every possibility. Like ok technically 10 flights with the same number could leave the same gate at the same time, but if 99.99% of the time they don't and you assume that, what is the real impact to people?
Reminds me of a list that came up ages ago that presented an assumption of "X code always runs" with the counterpoint that you could unplug the computer. Ok sure, but then why write software at all? Clearly no point assuming any code will ever run since you can just terminate the program at any random time.
I don't agree that this list has the attitude you describe--if anything, they just seem proud that they have many fewer of these corner case bugs than anyone else--so it is difficult to work with your example of the flight number. These are, in fact, misconceptions made by programmers, often without having the in-depth knowledge of this specific area that comes from being an actual expert (the kind that often people don't allocate for in their budgets), and this list isn't an over-the-top portrayal of such: it feels weird to become offended?
That said, I do appreciate some of these lists--which maybe has put you on edge to the paradigm--do have an edge to them... but, in all honesty, I think they should? The bugs and edge cases that these lists tend to expose aren't random glitches that equally affect every user: they usually segment users into the ones whose lives "follow the happy path" (which often just means "are intuitive and familiar to the culture near the developer") and the users who get disproportionately (or even continually!) screwed every time they dare interact with a computer.
And like, it is actually a problem that the other side of this is almost always a developer who doesn't really give a shit and considers that user's (or even an entire region/country's) existence to somehow be a negligible statistic not worth their time or energy, and I really do think that they deserve to take some flak for that (the same way I try to not get offended if someone points out how my being a cis-het white male blinds me to stuff: I think I deserve to get held to task harder by frustrated minorities rather than force them to be nice all the time in a world that penalizes them).
I don't disagree with you at all. My point was more like what another commenter said, that software adheres to a strict and very finite set of rules, the real world is way more complicated than that. It's so trivially easy to find real world counterexamples to just about any software that it's a barely interesting exercise (IMO). So you define a reasonable subset and work with that. And the reasonable subset is probably defined by positive/negative outcomes.
It would have been cool if the blog post discussed those outcomes so we can reason about it properly, otherwise it's just a list of claims at face value. If the programmer making an assumption means a screen at a gate says the wrong boarding time when there's a human there controlling the boarding, then not the end of the world. But if the programmer making an assumption causes 1/10000 flights to crash, then that's interesting and worthwhile calling out. It's just endless speculation without a proper outcome to tie it down.
At a general level I think these lists make developers more aware of uniqueness and constraints.
When designing data I think these questions (skepticisms) should be front of mind;
1) natural values are not unique.
2) things identified by number are best stored as a string. If you're not going to do math on it, it's not a number. That "customer number" should be treated as "customer id" and as a string.
3) be careful constraining data. Those "helpful checks" to make sure the "zip code is valid" are harmful not helpful.
4) those tiny edge cases may "almost never happen" but they will end up consuming your support department. Challenge your own assumptions at every possible opportunity. Never assume anything you "know" is true.
It's hard to measure time saved, and problems avoided, with good design. But it's easy to see bad design as it plays out over decades.
And (especially today) never optimize design for "size". Y2K showed that folly once and for all.
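To make point 2 concrete, a tiny sketch with made-up "customer numbers": once they go through an integer type, distinct identifiers collapse into one and the original form can't be recovered.

```python
# Hypothetical customer numbers as they arrive from an upstream system.
raw_ids = ["00123", "0123", "123"]

# Treated as numbers: leading zeros vanish and three customers become one.
as_int = {int(x) for x in raw_ids}

# Treated as opaque strings: all three stay distinct.
as_str = set(raw_ids)

# Round-tripping through int does not give the original identifier back.
round_tripped = str(int("00123"))
```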
This implies denormalization, which is rarely needed for performance, despite what so many believe. Now you’ve introduced referential integrity issues, and have taken a huge performance hit at scale.
> 3)
I mean, maybe don’t try to use a regex on an email address beyond “is there a local and domain portion,” but a ZIP code, as in U.S. only, seems pretty straightforward to check. I would much rather have to update a check constraint if proven wrong than to risk bad data the rest of the time.
> never optimize for size
Optimize for size when it doesn’t introduce other issues. Anyone working on 2-digit years could have and likely did see that issue, but opted to ignore it for various reasons (“not my problem,” etc.). But for example, _especially_ since Postgres has a native type for IP addresses, there is zero reason to store them as strings in dotted quad. Even if you have MySQL, store them as a UINT32, and use its built-in functions to cast back and forth.
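For the IP address case, here's a small sketch of the dotted-quad to 32-bit round trip using Python's standard `ipaddress` module. It only illustrates the conversion; in MySQL the equivalents would be `INET_ATON()` / `INET_NTOA()`, and in Postgres you'd just use the `inet` type. The address is an arbitrary example.

```python
import ipaddress
import struct


def ipv4_to_uint32(text: str) -> int:
    """Parse a dotted-quad string into its unsigned 32-bit integer value."""
    return int(ipaddress.IPv4Address(text))


def uint32_to_ipv4(value: int) -> str:
    """Format an unsigned 32-bit integer back into dotted-quad notation."""
    return str(ipaddress.IPv4Address(value))


addr = "192.168.1.10"
packed = ipv4_to_uint32(addr)

# 4 bytes as an integer column vs. up to 15 characters as text.
binary_size = len(struct.pack("!I", packed))
worst_case_text_size = len("255.255.255.255")
```

As a bonus, parsing rejects garbage like "999.1.1.1" up front, which a plain string column happily stores.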
>It's so trivially easy to find real world counterexamples to just about any software that it's a barely interesting exercise (IMO).
These lists hopefully make programmers aware that a lot of their assumptions about the real world might be wrong, or at least questionable.
Examples are assumptions on the local part of email addresses without checking the appropriate RFCs. Which then get enshrined in e.g. JavaScript libraries which everyone copies. I've been annoyed for the last 30 years by websites where the local part is expected to be composed of only [a-z0-9_-] although the plus sign (and many other characters) are valid constituents of a local part.
Or assumptions on telephone numbers. Including various ways (depending on local culture) of structuring their notation, e.g. "123 456 789" versus "12-3456-89" where software is too dumb to just ignore spaces or dashes, or even a stray whitespace character copied by accident with the mouse.
And those forms where you have to enter a credit card (or bank account) number in fields of n characters each, which makes cut/copy/paste difficult because your notes contain it in the "wrong" format.
So while some examples may count as "just usability", it all stems from naive assumptions by programmers who think one size fits all (it doesn't).
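For what it's worth, being forgiving here takes only a few lines. A minimal sketch (function and pattern names are made up, and the separator set is an assumption, not a standard): strip separators before validating, and don't reject email local parts just because they contain a plus sign.

```python
import re

# The kind of overly strict local-part pattern complained about above.
NAIVE_LOCAL_PART = re.compile(r"^[a-z0-9_-]+$")


def normalize_digits(raw: str) -> str:
    """Drop whitespace and common separators so '123 456 789' == '123456789'."""
    return re.sub(r"[\s\-().]", "", raw)


def looks_like_email(raw: str) -> bool:
    """Only check that there is a non-empty local part and a non-empty domain."""
    local, at, domain = raw.strip().rpartition("@")
    return bool(local) and bool(at) and bool(domain)


phone_a = normalize_digits("123 456 789")
phone_b = normalize_digits("12-3456-89 ")
card = normalize_digits(" 4111 1111 1111 1111\n")
```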
I disagree, in my view they do not inherently give off such vibes at all. In this post for example, they specifically broach the topic like so:
> There are a lot of assumptions one could make when designing data types and schemas for aviation data that turn out to be inaccurate.
Sounds like a pretty explicit acknowledgement of the notion that these are otherwise reasonable assumptions that just happen to fail when put to the test, I'd say.
It's very easy to self-deprecate, especially if one has insecurities. But that doesn't mean that articles like this actually mean to do so. I think it's worthwhile for everyone involved to always evaluate whether the feeling is actually coming from the source you're looking at, or if that source just happened to trigger it inside you. More often than not, in my anecdotal experience, it's the latter.
I'd also find it interesting to learn what happens when these falsehoods nonetheless make it into an implementation though.
> I'd be interested to see what the effects are of these assumptions failing
Mostly confusion, but the combination of aviation and confusion can be dangerous and even deadly. Not directly related to this list, but I'm reminded of [1]: no one entity has set out to inconvenience the hapless traveler, but the combination of history and practice are a constant source of irritation, and at the times of heightened tensions and security might even lead to scary incidents. All because of the name.
https://www.airnavradar.com/data/airlines/tmw is a good example of some of these (depending on what time you check that link -- if it's night-time in the Maldives it's going to show you nothing)
1. I'll never need to learn a falsehood list, so I can skip it.
2. A falsehood list is complete at the time of writing.
3. OK, but it will surely get updated with new falsehoods and clarifications.
4. Skimming the falsehood list is all I need to do to learn it.
5. OK, but surely I'll remember to recheck the falsehood list once I actually need to, right?
6. If a falsehood doesn't immediately make sense to me, there must be something wrong with it, despite the author having domain expertise that I don't.
Literally had to point out just last night how UTC is not sufficient in all scenarios. I swear it happens every 6 mos on Reddit.
"Falsehoods "falsehoods programmers believe about falsehoods" blog posters believe about "falsehoods programmers believe about "falsehoods programmers believe about logic" blog" falsehoods"
Day by day it feels less and less like regular data modeling and more like a debate with Jordan Peterson where you argue for ten hours what a "name" is.
Eventually you end up having to make choices and deal with the consequences. Otherwise Jordan Peterson would have you chasing your tail for days about what a "choice" is, and nothing would ever get done.
tl;dr: just make your best guess and always include an extra "notes" column where things can get leaky.
Not days necessarily, but I think quite a bit of time should be spent data modeling, yes. Before you’ve ever touched the keyboard, it’s very helpful to attempt to model the problem on paper or a whiteboard. You quickly find problems with your initial guess that way.
Notes / data / extra et al. columns are the worst, as a DBRE. People inevitably shove various shit into them over time instead of making an effort to properly fix past mistakes, and at some point, they practically contain their own table.
> patent that involves defining that as a unique identifier for aircraft.
Now I'm mighty curious what makes this novel enough to be a patent.
Almost comical that it happened 1 day after this was posted.
Plus year of production if necessary.
I’ve seen programmers attempt to deduplicate humans by language spoken.
If you've ever spent time in old car forums, you learn that even this isn't enough because of production-line sloppiness.
Serial number re-use is rare, but it happens. Usually because a product had something detected that resulted in remanufacturing, but sometimes other things slip.
(No racist intentions here, but you bring up both points and I thought that to be interesting)
The son of John who is a smith
I'm only joking a little. Funny thing, surnames aren't actually that old for Europeans. Most of history there'd be maybe two people with the same name. They solved it back then very much the same way we solve it now.
https://en.wikipedia.org/wiki/Category:Occupational_surnames
https://en.wikipedia.org/wiki/Patronymic
Model = what is the type of the thing
Serial Number = was supposed to be what order the things were made in (e.g. the number of the serial order), but this is often obfuscated or often repeats [1].
In cars, make would be like Ford. Model would be like Focus, serial number would be VIN (vehicle identification number - in cars, those are generally unique!).
Ford Focus + VIN, basically.
There is a theoretical concept of a unique identifier for everything... including people from ISO under ISO 8000.. combining a natural location identifier (eNLI)[2] and an ISO8601 timestamp - to represent "where and when a thing is considered to be born" - a point in time and space the thing is considered to come into existence.
I think the idea is called "natural person identifier" for humans.
This ID has to be assigned but I think you can see the idea at least.
I suppose this doesn't include make/manufacturer but realistically that isn't needed for uniqueness in this scheme, only as descriptive metadata for things that have one.
[1] This is related to the fact that if serial numbers were truly serial, one could estimate the rate and quantity of production which is considered sensitive information by most manufacturers. This relates to "the German tank problem" - during WWII the allies were able to accurately estimate the production of German tanks by analyzing the serial numbers off captured tanks.
[2] https://eccma.org/enli-eccma-natural-location-identifier/
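To illustrate footnote [1]: the usual textbook estimator for the German tank problem is m + m/k - 1, where m is the largest serial number observed and k is the number of samples. The sketch below assumes serials run 1..N with no gaps and that the captured ones are a uniform random sample, which real serial schemes often deliberately violate. The sample values are arbitrary.

```python
from typing import Sequence


def estimate_total(serials: Sequence[int]) -> float:
    """Estimate total production from observed serial numbers (1..N assumed)."""
    k = len(serials)   # number of observed items
    m = max(serials)   # largest serial number seen
    return m + m / k - 1


# Four captured items with these serial numbers.
estimate = estimate_total([19, 40, 42, 60])
```

With those four samples the estimate comes out at 74, which is why manufacturers who care about this scramble or skip serial numbers.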
Like Mary, first daughter of Henry VIII
How is that supposed to help? If two people have the same name, it's overwhelmingly likely that they also speak the same language.
Is that allowed?
It was worth it because without that, a home built airplane would have an experimental certificate and you couldn’t sell rides in it.
In other words had Virgin Galactic built the VSS Enterprise around the data plate of a Cessna 172, would it then no longer have been an experimental aircraft?
It boggles my mind that despite not having some sort of universal system things work as well as they do.
Aviation grew up relatively insular, and each country that had any sort of aircraft manufacturing did things their own way until fairly recently. Arguably, the first half of the history of aviation is a kind of free-for-all. The fact that we now have a globalized airline industry that mostly follows some kind of standards is the mind-blowing part to me. And I suspect if we weren't mostly down to a dozen or so manufacturers for the vast majority of airliners, even that wouldn't be the case.
What if a new aircraft were made 50/50 from the parts of two older aircraft
A modern turbofan can run for 20,000+ hours on-wing before removal for overhaul. That's longer than many car engines last in total.
The operators, such as Delta, do not actually own the engines on the aircraft they fly, even though they own the aircraft. The engines are rented from e.g. Pratt & Whitney along with a maintenance contract. That said, the engines are in fact installed at the factory.
Having said that, many of the links are very informative. For example the crater on Mars that has an ICAO airport code [2]: "On 19 April 2021, Ingenuity performed the first powered flight on Mars from Jezero, which received the commemorative ICAO airport code JZRO."
[1] https://www.flightaware.com/live/flight/PDT5965/history/2025...
[2] https://en.wikipedia.org/wiki/Jezero_(crater)
Things like flight numbers not having reasonable semantics, or conceptual pollution of what a flight is to include multiple take offs and landings are bad design, plain and simple. Just model the problem correctly e.g. maybe a Trip is multiple Flights, or Flights have multiple Legs. This isn't aviation specific. These are generic problems that programmers can and should get right.
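A minimal sketch of the "Flights have multiple Legs" shape, with hypothetical class and field names. Nothing here is how any real aviation system models it; the only point is that planned and actual destinations are separate and a flight is a list of legs.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Leg:
    origin: str                                # any place identifier, not necessarily an airport
    planned_destination: Optional[str]         # None: no destination declared up front
    actual_destination: Optional[str] = None   # filled in once the leg ends


@dataclass
class Flight:
    ident: str                                 # flight number or callsign; not assumed unique
    legs: List[Leg] = field(default_factory=list)

    def diverted_legs(self) -> List[Leg]:
        """Legs that ended somewhere other than planned."""
        return [
            leg for leg in self.legs
            if leg.actual_destination is not None
            and leg.actual_destination != leg.planned_destination
        ]


# A medevac helicopter heading back to base gets rerouted to another hospital.
medevac = Flight("MED1", [
    Leg("HOSPITAL-A", "BASE", "HOSPITAL-B"),
    Leg("HOSPITAL-B", "HOSPITAL-A", "HOSPITAL-A"),
])
diverted = medevac.diverted_legs()
```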
Some of it is intrinsic to the domain, like flights not all having gates, or not landing at airports. That was a new tidbit for me.
The fact remains that software that models real-life events or information is making normative assumptions about what can and cannot happen in the domain, due to the very nature of software, and these assumptions are knowingly-or-not being introduced by programmers. If for any given domain we had hundreds of human notaries, scribes or typists managing information instead of software, their mistaken assumptions wouldn't matter—they would simply go "Oh, that's odd", make the necessary adjustments, and learn from the experience. But as long as software is a prescriptive model of what it is representing, it will be valuable to highlight the "falsehoods" that its creators may accidentally prescribe into it.
it doesn't matter whose failure it is
the point of the article (just as with the one about names) is that there are "reasonable defaults" many people would believe - that don't work in practice and become gotchas
whether you have enough knowledge to know that something is unreasonable doesn't mean it doesn't seem reasonable for many others
It does. The distinction between a misunderstanding of aviation, or a misunderstanding of which of many possible models is in place.
I'm interested in misunderstandings of aviation. How did aviators think about it before software ever entered the picture? Which of their concepts are coherent, and why do they use them? What reasoning power do those concepts get them? It's interesting when programmers misunderstand that.
It's less interesting when programmers set up a poor system for keeping track of things. It could be buses or trains or parts in a factory. It's not aviation specific, and it's run-of-the-mill bad design when it's done badly. Apparently the first software engineers to model aviation were really bad at it.
Many of the points in the article, including some in your examples, pre-date programmers and programming entirely. Still more emerged before widespread use of computer automation in aviation.
Like, sure, a better data model would be great. But switching to one is largely a human system migration effort, not a software problem.
It is by no means specific to programmers. Ask someone who is learning French, for instance. Rules with too many arbitrary exceptions.
What is specific to programmers is that their tool performs at its best with simpler rules, so their job is to find the necessary and sufficient set of rules - and they will dismiss most of the cases pointed out by this article as unimportant exceptions the software won't handle.
English is a foreign language to me. But I somehow managed to learn it without learning the rules. I can say things correctly-ish without being able to explain why I used this particular grammar.
I took French in middle school, and it was always a running joke that the teacher spent the first 5 minutes on the rule, and the next 40 minutes on the exceptions.
I'd argue that programmers are indeed much more aware of how many exceptions and edge cases most real world domains have. Ask a lay person about such a simple thing as leap seconds, for instance, and they'll often believe you're making shit up.
In the map everything is clear. It is clear what a "plane" is, what "airports" are, and what their relationship is. And transferring that into a computer program is straightforward.
In the territory everything is fuzzy. None of the definitions are without edge cases and the expected relationships are often violated in surprising ways.
Aviation isn't unique here, every system suffers from the distinction between its actual function and the abstract description of that system.
Software unfortunately follows rigid rules so the challenge is finding a set of rigid rules that can encompass reality. It would be pretty natural if you were writing a database schema that a flight would have a departing airport and an arriving airport— but alas.
A theme running through the article is "this value is unique" and "this value does not change". And of course those are both wrong.
So when designing databases now I assume "everything changes, nothing is unique" (even when the domain "expert" professes it is).
This approach solves so many problems and saves some time later on when it turns out that the "absolutely, positively, unique forever" natural key isn't.
A UUID is at best 2x larger than even a BIGINT, thus the index size is 2x larger. If you aren’t using v1 or v7, it’s also not k-sortable. But most importantly for MySQL (and optionally SQL Server) if the table contains things related to a common entity, like a user’s purchases, the rows are now scattered around the clustering index’s B+tree. That incurs a huge amount of I/O on large tables, and short of a covering index and/or partitioning (which only masks the problem by shrinking the search space), there is no way to improve it. If instead the PK was (user_id, some_other_identifier), all records for a given user are physically co-located.
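A toy illustration of both points, not a benchmark. Sorting stands in for clustering order here, and the "purchases" rows are invented: a binary UUID is 16 bytes versus 8 for a BIGINT, and ordering by a random UUIDv4 generally interleaves users while ordering by (user_id, seq) keeps each user's rows together.

```python
import random
import struct
import uuid

# Storage size per key.
uuid_bytes = len(uuid.uuid4().bytes)        # binary UUID
bigint_bytes = len(struct.pack(">q", 1))    # signed 64-bit integer
uuid_text_chars = len(str(uuid.uuid4()))    # if someone stores it as text

# Hypothetical purchases: (user_id, purchase_seq) for 3 users, 4 rows each.
rows = [(user_id, seq) for user_id in (1, 2, 3) for seq in range(4)]

# PK = random UUIDv4: physical order has nothing to do with user_id.
random.seed(0)
by_uuid = sorted(rows, key=lambda r: uuid.UUID(int=random.getrandbits(128), version=4).bytes)

# PK = (user_id, purchase_seq): each user's rows are adjacent.
by_composite = sorted(rows)


def runs(ordered):
    """Count contiguous runs of the same user_id; 3 means perfectly co-located."""
    users = [u for u, _ in ordered]
    return 1 + sum(1 for a, b in zip(users, users[1:]) if a != b)
```

`runs(by_composite)` is 3 by construction; `runs(by_uuid)` depends on the random keys but can never be lower than that.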
SQL Server lets you cluster by any index, so if your child record table will benefit from clustering by ParentGuid then go for it.
For any RDBMS (I assume... I don't know a lot about SQL Server or Oracle), the binpacking for pages also impacts query speed for queries where there are many results.
So now some information comes in from outside the system that something happened with a plane, and you still have to find which surrogate id that plane has in your system.
You may decide two things happened to two different planes whereas another system consider it the same plane both times, and vice versa.
UUID keys PLUS some form of versioning with creation dates will let you change an airport name and let you know what the airport name was on some arbitrary date in the past. Useful for backfills and debugging
And when you renamed the airport, you’d need to add new entries in all the other tables that used airport name as a foreign key
Correct, which is why you need the addition of a DATETIME to indicate when that identifier is valid.
> And when you renamed the airport, you’d need to add new entries in all the other tables that used airport name as a foreign key
No, because you wouldn't use the name in the key, you'd use a code like ICAO, though there are pseudo-ICAO codes for some aerodromes, so whether or not you want to be pedantic about naming is a personal choice. Then use FK constraints. Example:
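Something along these lines, as a rough sketch. Table and column names are made up for illustration, it runs on SQLite via Python purely so it's self-contained, and `valid_to` is treated as exclusive, which is my assumption:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE aerodrome_code (
    icao        TEXT NOT NULL,
    valid_from  TEXT NOT NULL,   -- ISO 8601 date the code assignment starts
    valid_to    TEXT,            -- exclusive end; NULL = still assigned
    PRIMARY KEY (icao, valid_from)
);
CREATE TABLE aerodrome_name (
    icao            TEXT NOT NULL,
    code_valid_from TEXT NOT NULL,
    name            TEXT NOT NULL,
    named_from      TEXT NOT NULL,
    PRIMARY KEY (icao, code_valid_from, named_from),
    FOREIGN KEY (icao, code_valid_from)
        REFERENCES aerodrome_code (icao, valid_from)
);
""")
conn.execute("INSERT INTO aerodrome_code VALUES ('KIDL', '1948-07-01', '1964-01-01')")
conn.executemany(
    "INSERT INTO aerodrome_name VALUES (?, ?, ?, ?)",
    [
        ("KIDL", "1948-07-01", "Idlewild Airport", "1948-07-01"),
        ("KIDL", "1948-07-01", "John F. Kennedy International Airport", "1963-12-24"),
    ],
)


def name_on(icao, day):
    """Name of whatever aerodrome held `icao` on `day` (ISO date string), or None."""
    row = conn.execute(
        """
        SELECT n.name
        FROM aerodrome_code c
        JOIN aerodrome_name n
          ON n.icao = c.icao AND n.code_valid_from = c.valid_from
        WHERE c.icao = ?
          AND c.valid_from <= ?
          AND (c.valid_to IS NULL OR ? < c.valid_to)
          AND n.named_from <= ?
        ORDER BY n.named_from DESC
        LIMIT 1
        """,
        (icao, day, day, day),
    ).fetchone()
    return row[0] if row else None
```

A later reuse of the same code by a different aerodrome is just another `aerodrome_code` row with a later `valid_from`.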
This would let you model edge cases like John F. Kennedy International Airport, née Idlewild Airport, which had the ICAO "KIDL" from its opening (I mean, probably before that as well, but for the sake of argument assume you care when it was operating) on 1948-07-01 until 1964-01-01, but its name was changed to John F. Kennedy on 1963-12-24. It also allows you to model the reuse of ICAO codes, since Indianola Municipal Airport received the ICAO code "KIDL" following its release by JFK.

Is this easier to do than surrogate keys? Not really, no, but IMO it's easier for a human to understand when presented with temporal changes, allows for edge cases like an airport's designator or name changing while flights are en route, and for flights (which would be the largest table), they can use `icao` and their departure/arrival datetime (which the table would need to model anyway) to effectively link to the other tables.
I went on a vacation to New Zealand just before new visa requirements took effect, so while I entered legally, at some point, I couldn't reenter. As I was going through passport control to exit, I asked what would happen if the plane had a mechanical issue and I had to spend the ~night, but couldn't reenter. The border agent said they can undo an exit.
* Programmers believe they are handling all possible configurations of the universe when putting something into production.
* Programmers don't handle all possible configurations of the universe when putting code into production because they don't know any better.
Falsehoods people believe about the universe:
* There exists a constant.
* SI units are constant at all times or everywhere.
* When a new corner case appears, it is easy to adjust the program to handle it.
What made the corresponding lists for names and time interesting were that it was genuinely surprising to realise that their statements were actually false. I don't get that feeling with these.
Like the top level comment about identifiers for airplanes -- why would they have them? That sounds baffling to me. With ownership changes, continuous upgrades, extending airframes, repurposing etc. I would be surprised if there was a stable identity.
https://www.flightaware.com/squawks/view/1/7_days/popular_ne...
https://www.youtube.com/watch?v=jfOUVYQnuhw
including (attempts at) a few in-depth reasons for why these quirks exist
It really drives home the idea that when building systems that rely on data like this, the real challenge isn’t in writing code to handle the “happy path.” It’s in anticipating the weird edge cases, the partial data, the inconsistencies that show up because real life doesn’t always follow the rules we expect.
My impression is that every single older (pre-2010) computer system that manages Brazilian aviation fell for that and fixed it with a hack.
> Airports never move
Also, runways never move. Also, if runways move, they don't change direction. Also, if airports or runways move, there will be some construction work beforehand.
I'd add "aircraft only land on runways" there too. And "ok, aircraft only land on runways and heliports".
Can you elaborate more?
It's pretty cool to be on a ferry and see a plane land basically next to you in the middle of the river.
Some of my favourite planes were the Grumman Mallards still owned and operated by Paspaley Pearling out of Mungalalu Truscott and other Kimberley airbases.
They're classic 1950s twin-engined amphibious aircraft that landed anywhere up and down the Kimberley Coast for pearling transfers.
* https://en.wikipedia.org/wiki/Grumman_G-73_Mallard
* https://en.wikipedia.org/wiki/Mungalalu_Truscott_Airbase
[1] https://www.ncei.noaa.gov/news/airport-runway-names-shift-ma...
Expecting every runway to be above sea level is another falsehood. AMS has runways 3 meters below sea level, and I'm sure there's even lower ones on the east side of Africa.
-64m
Isn't that blindingly obvious? If so, how did it get to be a patent? And is someone now extracting rent from it?
> A method for inducing cats to exercise consists of directing a beam of invisible light produced by a hand-held laser apparatus onto the floor or wall or other opaque surface in the vicinity of the cat, then moving the laser so as to cause the bright pattern of light to move in an irregular way fascinating to cats, and to any other animal with a chase instinct.
How on earth is anyone supposed to be able to take the patent system as a whole seriously when there are 100s (if not 1000s) of examples like that, which obviously shouldn't be approved if "novel" or "non-obvious" ideas are required?
https://patents.google.com/patent/US6360693B1/en
The US patent system seems profoundly broken. Given that the patent system seems much less broken in other developed countries and the vast wealth and resources of the US, I assume it is broken on purpose?
https://patents.google.com/patent/US7126691B2/en
Obviously it doesn't actually work. I submitted the application as an experiment to see how hard it would be to sneak this past the patent office. The answer turns out to be: not hard at all. In fact, there's pretty much an algorithm for it:
1. Write up a half-assed patent application.
2. Submit it and wait for it to be rejected (which it almost certainly will be).
3. Read the rejection notice and tweak the application to address every individual point that was made.
4. Go to step 2. Repeat until the patent office capitulates and issues your patent.
In my experience (my name is on six patents) it has never been necessary to do more than one iteration.
The reason this works is that the patent office is required by law to give specific reasons for rejecting a patent application. They are not allowed to simply say, "This is obviously stupid." If they see that you are going to persist, it's a lot easier for them to just give you the damn patent (it's no skin off their nose) than to keep doing your homework for you.
With AI, following this procedure becomes borderline trivial. In fact, I'm a little surprised that the patent office isn't being overwhelmed by AI-generated patent applications. (Or maybe they are and it just hasn't made it into my news feed.)
It reminds me slightly of:
https://en.wikipedia.org/wiki/Sokal_affair
I've done my own minor pranking as well:
https://successfulsoftware.net/2007/08/16/the-software-award...
My understanding is that quantum entanglement can't be used to transmit information. Given this fact and the fact that Einstein started out as a patent clerk, the great man must be turning in his grave!
https://patents.google.com/patent/US6368227B1/en
I thought I would have to disguise the bogosity under some plausible-sounding pseudo-science, and to be fair, it's actually quite tricky to figure out why my invention doesn't work (though the fact that it doesn't work, and can't possibly work, should be pretty obvious to a patent examiner). But at this point I think I could probably get a patent for summoning dragons to slay my enemies. (Hm, there's an idea...)
It's not just the technology, it's the employment of it too. In 1993 this was a new way to use lasers, which a decade before were too expensive, delicate and power hungry to use as such.
Put another way, the change can be incremental. Building upon what is. Without this, pretty much all incremental science would lose funding, for the moment you invent, regardless of cost, it'd just be copied.
If you've ever done hardware, even a toy, it's not simple.
Extensive prototypes, testing for drops, hand fit, assembly at the factory, and more.
Devs today can't even conceive of making a 100% stable product to be shipped on floppy and never updated. Reshipping for bugfixes could break a company in the old days.
Now try that with hardware!
And all those tweaks, fixes, tests can be copied in a second without patents.
I think separating software and hardware patent discussions would be better here, because hardware patents are required.
I think your timescale is slightly off, but I don't know enough about laser history to say definitely. But judging by what I could find, in 1981 Popular Science seems to have run an ad for laser pointer devices, aimed (no pun intended) towards consumers:
> It wasn’t until the 1980s that lasers became small enough, and required so little energy, that they finally became cheap enough to be used in consumer electronics — take this funky laser pointer from the early 1980s, for example. The November 1981 edition of Popular Science features a Lasers Unlimited advertisement for an assortment of laser pointing devices, including a ruby laser ray gun, a visible red laser lightgun, multi-color lasers and laser light shows, all of which were selling for less than $15 (equivalent to about $42 today) - https://melmagazine.com/en-us/story/a-dazzling-history-of-th...
So if they became usable by consumers in the 1980s, I'm about 99% confident at least one individual used it for playing with their cats.
But since the author of the patent just happened to have spent the time (10 years later) to write the patent, they got it awarded to them.
Not all the things in the list, because I am aware of those. I might have missed the runway numbers changing based on shifting magnetic field of the earth, but that's a thing too. Runway 22? That's now Runway 21.
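The renumbering follows from how runways are named: the designator is the magnetic heading rounded to the nearest 10 degrees with the last digit dropped, and north is written 36 rather than 00. A minimal sketch of that rule, assuming a hypothetical helper `runway_designator` and made-up headings of 224 and 214 degrees; the zero-padded output is also an assumption, since some countries drop the leading zero:

```python
def runway_designator(magnetic_heading_deg: float) -> str:
    """Return a two-digit runway number for a magnetic heading in degrees."""
    # Round half up to the nearest 10 degrees, then drop the last digit.
    number = int((magnetic_heading_deg % 360 + 5) // 10)
    # A heading that rounds to 0 is designated 36, not 00.
    if number == 0:
        number = 36
    return f"{number:02d}"


# Same strip of asphalt, surveyed years apart (headings are illustrative).
print(runway_designator(224))  # 22
print(runway_designator(214))  # 21
```

So a schema that uses the runway number as a permanent key will eventually point at a runway that has been renamed without moving at all.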
But why programmers specifically would believe this, as opposed to ... any other profession that is not aviation?
I don't read it as programmers specifically believing that; it's that they're specifically treating these things as invariants in their projects.
tl;dr: you can't know absolutely everything ahead of time.
"Flights have schedules".
Don't they all have schedules?
- A private pilot who departs their local airport without filing a flight plan and flies around for a while.
- A charter jet that departs whenever the passengers show up.
- A medevac helicopter departs a hospital to return to its base. While en route, it is rerouted by dispatch to pick up a patient at a different hospital.
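In schema terms, the three cases above mean the schedule, and even the destination, have to be optional. A rough sketch, assuming a hypothetical `Flight` record whose field names are invented for illustration:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Flight:
    aircraft_id: str
    origin: str
    # Unknown at departure, or changed en route (the medevac case).
    destination: Optional[str] = None
    # Absent for the private pilot and the charter jet.
    scheduled_departure: Optional[datetime] = None
    actual_departure: Optional[datetime] = None

    def departure_delay_minutes(self) -> Optional[float]:
        """Delay in minutes, or None when there is no schedule to compare against."""
        if self.scheduled_departure is None or self.actual_departure is None:
            return None
        delta = self.actual_departure - self.scheduled_departure
        return delta.total_seconds() / 60


# A local flight with no flight plan: a delay simply does not exist.
local = Flight("EXAMPLE-1", "AAA", actual_departure=datetime(2024, 1, 1, 10, 0))
print(local.departure_delay_minutes())  # None
```

Returning `None` instead of 0 keeps "on time" and "no schedule" distinguishable downstream.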
Also, I felt stupid very quickly. Very nice summary, bravo!
Hehe, I was once told we couldn't land at our destination A, so we got diverted to B; while on our way to B we were told we are actually going to C; and, while on our way to C, A became available again so the plane did a U-turn and we flew back to A, landing with a ~3 hour delay.
The cause was snow and wind.
I know that there is an ICAO code on Mars (since I had read about it before).
I think there are some airports that have an ICAO code but no IATA code and vice-versa, and some have a "pseudo-ICAO" code with letters and numbers together.
Aside: is there a notation for such constraints?
Myths programmers believe about cars:
Cars in the same lane always travel in the same direction.
Each street has a name.
Each street has a unique name.
Each street has only one name.
Cars have four wheels.
Cars never move vertically.
Roads never move.
Roads never cross water without bridges.
When two roads cross, they do so at an intersection.
Take any field in human experience and one can make such a list.
All boats float. Ships are bigger than boats. Boats are slower than airplanes. Boats only travel on water.
That's exactly the point. The famous example (Falsehoods Programmers Believe About Names) has examples I have encountered in medical databases. If a programmer somewhere didn't fall into the trap, patient names in a medical database would have been better managed and may have avoided duplication, lost records, etc.
https://news.ycombinator.com/item?id=18567548
to the best of my recollection, the only way to tell a ship from a boat is to watch it make a "high" speed turn, ships lean out, boats lean in. But this is probably incorrect, just like all of my education was.
I can imagine them going "I had a perfect database schema that covered every edge case, and then..." with each bullet point.
This had never happened before.
Like, you don't even _change_ the IATA code of a live airport. To switch them was a huuuuuuuuuge assumption breaker for the industry.
Reminds me of a list that came up ages ago that presented an assumption of "X code always runs" with the counterpoint that you could unplug the computer. Ok sure, but then why write software at all? Clearly no point assuming any code will ever run since you can just terminate the program at any random time.
That said, I do appreciate some of these lists--which maybe has put you on edge to the paradigm--do have an edge to them... but, in all honesty, I think they should? The bugs and edge cases that these lists tend to expose aren't random glitches that equally affect every user: they usually segment users into the ones whose lives "follow the happy path" (which often just means "are intuitive and familiar to the culture near the developer") and the users who get disproportionately (or even continually!) screwed every time they dare interact with a computer.
And like, it is actually a problem that the other side of this is almost always a developer who doesn't really give a shit and considers that user's (or even an entire region/country's) existence to somehow be a negligible statistic not worth their time or energy, and I really do think that they deserve to take some flak for that (the same way I try to not get offended if someone points out how my being a cis-het white male blinds me to stuff: I think I deserve to get held to task harder by frustrated minorities rather than force them to be nice all the time in a world that penalizes them).
It would have been cool if the blog post discussed those outcomes so we can reason about it properly, otherwise it's just a list of claims at face value. If the programmer making an assumption means a screen at a gate says the wrong boarding time when there's a human there controlling the boarding, then not the end of the world. But if the programmer making an assumption causes 1/10000 flights to crash, then that's interesting and worthwhile calling out. It's just endless speculation without a proper outcome to tie it down.
When designing data I think these questions (skepticisms) should be front of mind;
1) natural values are not unique.
2) things identified by number are best stored as a string. If you're not going to do math on it, it's not a number. That "customer number" should be treated as "customer id" and as a string.
3) be careful constraining data. Those "helpful checks" to make sure the "zip code is valid" are harmful not helpful.
4) those tiny edge cases may "almost never happen" but they will end up consuming your support department. Challenge your own assumptions at every possible opportunity. Never assume anything you "know" is true.
It's hard to measure time saved, and problems avoided, with good design. But it's easy to see bad design as it plays out over decades.
And (especially today) never optimize design for "size". Y2K showed that folly once and for all.
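Point 2 is easy to demonstrate. A small sketch, assuming a made-up customer number "00123" and a hypothetical `load_customer_id` helper:

```python
def load_customer_id(raw: str) -> str:
    """Keep an identifier exactly as received, apart from surrounding whitespace."""
    return raw.strip()


raw_value = "00123"

as_number = int(raw_value)                   # the leading zeros are gone for good
as_identifier = load_customer_id(raw_value)  # nothing is lost

print(as_number)                    # 123
print(as_identifier)                # 00123
print(str(as_number) == raw_value)  # False: the round trip does not survive
```

The integer version also fails outright the day someone issues an id like "A-17", whereas the string version does not care.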
This implies denormalization, which is rarely needed for performance, despite what so many believe. Now you’ve introduced referential integrity issues, and have taken a huge performance hit at scale.
> 3)
I mean, maybe don’t try to use a regex on an email address beyond “is there a local and domain portion,” but a ZIP code, as in U.S. only, seems pretty straightforward to check. I would much rather have to update a check constraint if proven wrong than to risk bad data the rest of the time.
> never optimize for size
Optimize for size when it doesn’t introduce other issues. Anyone working on 2-digit years could have and likely did see that issue, but opted to ignore it for various reasons (“not my problem,” etc.). But for example, _especially_ since Postgres has a native type for IP addresses, there is zero reason to store them as strings in dotted quad. Even if you have MySQL, store them as a UINT32, and use its built-in functions to cast back and forth.
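For anyone who hasn't seen the integer form: an IPv4 address is just a 32-bit number, and the dotted quad is only a display format. A sketch using Python's standard `ipaddress` module, which does roughly what MySQL's `INET_ATON()` and `INET_NTOA()` do; the helper names here are made up:

```python
import ipaddress


def ipv4_to_int(dotted: str) -> int:
    """Convert a dotted-quad string to its 32-bit integer value."""
    return int(ipaddress.IPv4Address(dotted))


def int_to_ipv4(value: int) -> str:
    """Convert a 32-bit integer back to dotted-quad notation."""
    return str(ipaddress.IPv4Address(value))


n = ipv4_to_int("192.0.2.1")
print(n)               # 3221225985
print(int_to_ipv4(n))  # 192.0.2.1
```

Four bytes instead of up to fifteen characters, and malformed input such as "192.0.2.999" raises an error on the way in rather than sitting in the table.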
These lists hopefully make programmers aware that a lot of their assumptions about the real world might be wrong, or at least questionable.
Examples are assumptions on the local part of email addresses without checking the appropriate RFCs. Which then get enshrined in e.g. JavaScript libraries which everyone copies. I've been annoyed for the last 30 years by websites where the local part is expected to be composed of only [a-z0-9_-] although the plus sign (and many other characters) are valid constituents of a local part.
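To make the complaint concrete, here is a sketch comparing that over-strict character class with a check that only asks whether there is something on both sides of the last "@". Both function names are made up, and the minimal check is an illustration, not a full RFC validator:

```python
import re

# The over-strict pattern described above: only a-z, 0-9, underscore, hyphen.
NAIVE_LOCAL_PART = re.compile(r"^[a-z0-9_-]+$")


def naive_accepts(address: str) -> bool:
    local, _sep, _domain = address.rpartition("@")
    return bool(NAIVE_LOCAL_PART.match(local))


def minimal_accepts(address: str) -> bool:
    # Only require a non-empty local part and a non-empty domain part.
    local, sep, domain = address.rpartition("@")
    return bool(sep) and bool(local) and bool(domain)


address = "user+tag@example.com"
print(naive_accepts(address))    # False: a valid address is rejected
print(minimal_accepts(address))  # True
```

The only real test of an address is whether mail sent to it arrives, so the permissive check plus a confirmation mail usually beats a clever regex.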
Or assumptions on telephone numbers. Including various ways (depending on local culture) of structuring their notation, e.g. "123 456 789" versus "12-3456-789" where software is too dumb to just ignore spaces or dashes, or even a stray whitespace character copied by accident with the mouse.
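Ignoring the separators takes a couple of lines. A sketch, assuming a hypothetical `normalize_phone` helper; which separators to strip is a judgment call, and this version drops whitespace, dashes, dots and parentheses:

```python
import re


def normalize_phone(raw: str) -> str:
    """Strip common visual separators; keep digits and any leading plus sign."""
    return re.sub(r"[\s\-.()]", "", raw)


# Two notations of the same number, one with a stray trailing space.
print(normalize_phone("123 456 789"))   # 123456789
print(normalize_phone("12-3456-789 "))  # 123456789
```

Normalize on input, store the cleaned form, and format for display separately; the user never needs to know which grouping the form "expected".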
And those forms where you have to enter a credit card (or bank account number) in fields of n characters each, which makes cut/copy/paste difficult because your notes contain it in the "wrong" format.
So while some examples may count as "just usability" it all stems from naive assumptions by programmers who think one size fits all (it doesn't).
> There are a lot of assumptions one could make when designing data types and schemas for aviation data that turn out to be inaccurate.
Sounds like a pretty explicit acknowledgement of the notion that these are otherwise reasonable assumptions that just happen to fail when put to the test, I'd say.
It's very easy to self-deprecate, especially if one has insecurities. But that doesn't mean that articles like this actually mean to do so. I think it's worthwhile for everyone involved to always evaluate whether the feeling is actually coming from the source you're looking at, or if that source just happened to trigger it inside you. More often than not, in my anecdotal experience, it's the latter.
I'd also find it interesting to learn what happens when these falsehoods nonetheless make it into an implementation though.
Mostly confusion, but the combination of aviation and confusion can be dangerous and even deadly. Not directly related to this list, but I'm reminded of [1]: no one entity has set out to inconvenience the hapless traveler, but the combination of history and practice are a constant source of irritation, and at the times of heightened tensions and security might even lead to scary incidents. All because of the name.
[1] https://travel.stackexchange.com/questions/149323/my-name-ca...
1. I'll never need to learn a falsehood list, so I can skip it.
2. A falsehood list is complete at the time of writing.
3. OK, but it will surely get updated with new falsehoods and clarifications.
4. Skimming the falsehood list is all I need to do to learn it.
5. OK, but surely I'll remember to recheck the falsehood list once I actually need to, right?
6. If a falsehood doesn't immediately make sense to me, there must be something wrong with it, despite the author having domain expertise that I don't.
Literally had to point out just last night how UTC is not sufficient in all scenarios. I swear it happens every 6 mos on Reddit.
From that dead comment quoting a chat bot that clearly did not understand the question at all, I think maybe we can extract a single bullet point:
* “Edge cases” live only at the edges; they never creep into the middle.
But that's not much to build a post with.
Eventually you end up having to make choices and deal with the consequences. Otherwise Jordan Peterson would have you chasing your tail for days about what a "choice" is, and nothing would ever get done.
tl;dr: just make your best guess and always include an extra "notes" column where things can get leaky.
Notes / data / extra et al. columns are the worst, as a DBRE. People inevitably shove various shit into them over time instead of making an effort to properly fix past mistakes, and at some point, they practically contain their own table.
We detached this comment from https://news.ycombinator.com/item?id=44207171 and marked it off-topic.