Retrofitting spatial safety to lines of C++

(security.googleblog.com)

84 points | by jandeboevrie 344 days ago

12 comments

  • titzer 343 days ago
    > We’ve begun by enabling hardened libc++, which adds bounds checking to standard C++ data structures, eliminating a significant class of spatial safety bugs.

    Well, it's 2024 and I remember arguing this 20+ years ago. Programs have bugs that bounds checking catches. And making it a language built-in exposes it to compiler optimizations specifically targeting bounds checks, eliminating many and bringing the dynamic cost down immensely. Just turning them on in libraries doesn't necessarily expose all the compiler optimizations, but it's a start. Safety checks should really be built into the language.
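
As a rough illustration of what eliminating bounds checks can mean (a sketch, not a claim about what any particular compiler emits): once the loop bound has been validated, every per-element check inside the loop is redundant. The names `checked_get`, `sum_naive` and `sum_hoisted` are invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Per-element check: every access tests the index before reading.
inline int checked_get(const std::vector<int>& v, std::size_t i) {
    if (i >= v.size()) std::abort();  // trap instead of reading out of bounds
    return v[i];
}

// Sum of the first n elements, with a check on every access.
inline long sum_naive(const std::vector<int>& v, std::size_t n) {
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) total += checked_get(v, i);
    return total;
}

// Same result with a single check hoisted out of the loop:
// n <= v.size() implies that every i < n is in range.
inline long sum_hoisted(const std::vector<int>& v, std::size_t n) {
    if (n > v.size()) std::abort();
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) total += v[i];
    return total;
}
```

Both functions abort on an out-of-range `n`; the hoisted version simply does so before touching any element. An optimizer that can see the check is free to make the same transformation on its own.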

    • pjmlp 343 days ago
      Before C++98, this used to be pretty much table stakes in C++ compiler frameworks, e.g. Turbo Vision, AppToolbox, OWL, MFC,....

      I still don't get why the standard library went the other way, other than starting the tradition of standardised wrong defaults.

      • IshKebab 343 days ago
        The C++ standards committee is still under the illusion people can read, understand and remember the entire spec, and write code without making mistakes. All these bugs are the fault of the people making mistakes, not C++.
        • tialaramex 343 days ago
          I don't think there's any such illusion. You do not see, for example, WG21 members who are confident that they understand the entire C++ language (on the contrary they'll often accept corrections about the language from other committee members) and it's not infrequent that a committee member will agree with the statement that C++ is too large and sprawling for any individual to attain such comprehensive understanding. [Today I would guess maybe Sean Baxter, who wrote his own compiler, has the best individual understanding and I believe Sean is not a member of the committee]

          Instead WG21 has very clearly (but without ever admitting it and that's important) taken the path of maintaining a legacy language. Even as debate carried on about whether in future C++ could end up like COBOL, the committee has acted exactly as though it is for some years now. Compatibility is King, no price is too high for compatibility, everything must be sacrificed to make that happen and that's how you end up like COBOL.

          Three important opportunities to divert and pick other ways forward should be highlighted here. P1863 "ABI: Now or Never" by Titus Winters in 2020; P2137 "Goals and priorities for C++" also in 2020 but with a long list of authors and P1818 "Epochs" from 2019 by Vittorio Romeo.

          In all these cases WG21 chose the "hope the problem goes away" path, preferring not only not to address the critical problem highlighted and take a new route forward, but to specifically ignore the problem and press on anyway.

          "Hope the problem goes away" is also, quietly, the preferred strategy by WG21 for the safety problem.

          There's a reason (albeit a terrible one) to prefer the C++ ISO document's language over the approach of Rust. These are both general purpose languages (I might also write separately in this thread about a non-general purpose language which Google should use more, if I have time) and so must wrestle with Rice's Theorem. Rust's solution is to require the compiler to be conservative. This is very difficult and indeed there are known bugs in the code doing this conservative check in the Rust official compiler. But C++ has a much easier (but IMO fatal) path, it says that's the job of the programmer and when the programmer writes C++ software which is nonsense as a result that's their fault, not the compiler's fault for failing to reject the program.

          It would be extremely difficult to explain how a "standards conforming" Rust compiler can correctly accept all the programs Rust's actual compiler accepts and reject all those it rejects without essentially having a black box where the compiler implementation sits. We can explain the purpose of such rules without, but their detailed behaviour not so much.

          Take borrow checking. All the easy scoped borrows (which is all that worked in Rust say eight years ago) can be explained without too much trouble, but today a lot fancier (but to a human obviously correct) borrowing will compile, because the checker is smarter - now, how do you express, not in Rust source code but in the English language, all the checks to be performed, and neither miss things out nor unknowingly accept programs a real Rust compiler will reject ?

          C++ just needn't do that, in effect the ISO document says. "Don't do borrows that last longer than the thing borrowed, if you do, that's not C++ but your compiler won't notice so the result is arbitrary nonsense"

          • kibwen 343 days ago
            > how do you express, not in Rust source code but in the English language, all the checks to be performed, and neither miss things out nor unknowingly accept programs a real Rust compiler will reject ?

            I think this is unintentionally stuck in the mindset of "the purpose of a language specification document is to enable armchair language lawyers to flame each other on Usenet about whether or not such-and-such degenerate edge case is technically valid". But a specification doesn't need to be written in English, it can be written as a formal proof, and indeed I would expect a theoretical Rust spec to specify the behavior of the borrow checker as just such a proof. Rust's borrow checking may no longer be as simple as the lexically-scoped model that existed as of Rust 1.0, but it's not like the extensions that have been added since then are ad-hoc; they're all still designed to result in a model that is provably sound.

            • almostgotcaught 343 days ago
              > But a specification doesn't need to be written in English, it can be written as a formal proof, and indeed I would expect a theoretical Rust spec to specify the behavior of the borrow checker as just such a proof.

              Did you miss the part where the person you're responding to mentioned Rice's theorem? Do you know Rice's theorem and hence understand what they're implying?

              • kibwen 343 days ago
                Rice's theorem isn't relevant here. The goal is not to create a system that produces no false positives, it's perfectly fine to do a conservative syntactic analysis that allows false positives but disallows false negatives, and it's then possible to produce a formal proof that this analysis is sound. It is this formal proof that I would expect to be included in a specification in lieu of English prose.
                • tialaramex 342 days ago
                  Sure, there's a reason I said "Extremely difficult" not "Impossible". Defying Rice without losing generality is mathematically impossible.

                  This is on that continuum where it's definitely neither impossible nor easy enough that we can just let some bored grad student knock out the answer, and so now somebody who wants this must do lots of hard work.

                  I think a specification which says e.g. here's the semantic requirement, here's a rule for scoped borrows which works, you must do at least that, but you can do more however you must not allow anything which violates the semantic requirement - would be great, but if you had that rule in your standard then people can write conforming Rust programs which don't compile - they need a yet-to-be-written smarter compiler to figure out why they're legal, which is kinda annoying as a language feature.

              • Ygg2 343 days ago
                Rice theorem doesn't say anything about humans.
                • almostgotcaught 343 days ago
                  No clue what this means
                  • tialaramex 343 days ago
                    Some people believe that the Church-Turing intuition doesn't tell us anything about humans, that what humans are doing isn't computation but something more powerful. In my experience their lack of evidence for this belief just makes them believe it even harder, and they often write whole books which are in effect the argument from incredulity but expanded to book form.
                    • carbotaniuman 343 days ago
                      There is no proof that humans are just glorified Turing machines and even as a nonreligious person, I find such a statement to be as lacking in evidence as those that claim humanity has some soul or similar that cannot be replicated.

                      The actual logic of gggp's statement also doesn't make any sense. We as humans also under- and overestimate the soundness of programs.

                      Sometimes, a perfectly fine solution is massaged to better adhere to best practices because we can't convince ourselves that it's correct. Rust requires that we convince the compiler, and then we know it's correct via the compiler's proofs, instead of requiring us to do the proof all the time.

                      • IshKebab 343 days ago
                        > I find such a statement to be as lacking in evidence as those that claim humanity has some soul or similar that cannot be replicated.

                        It doesn't need evidence; it is the null hypothesis.

                        Brains clearly compute, and it appears that computation is sufficient to produce the observed behaviour of brains. All our experience of the universe and physics suggests that there is no magic or metaphysics or souls or whatever.

                        So the onus is on you to show that there's something more going on. It isn't a 50:50 "is it heads or tails", it's more like "I claim that the tooth fairy exists" vs "I'm pretty sure it's your mum".

                    • Ygg2 342 days ago
                      It's simpler than this. Turing machines are a beautiful abstraction. Whatever happens in humans is much, much, much, much, much, much messier, on the account of it being subject to laws of evolution and working on a scale where various micro-effects can be felt (radiation, Brownian motion and quantum effects anyone?).

                      So even if the Turing machine model is correct (and we don't know that), it's overly simplified.

                  • Ygg2 343 days ago
                    Humans are not Turing machines. I'm not talking about how we work on a fundamental level.

                    I'm saying we don't obey the axioms of the Turing machine model. So neither Rice's theorem nor Gödel's theorem can apply to unsafe code written by humans.

                    Even if borrow checker is limited by the Rice theorem, you can create either safe abstractions provably or unsound abstractions provably or potentially unsound abstractions, which humans can reject or accept.

          • IshKebab 342 days ago
            This doesn't really have anything to do with Rice's theorem. It's difficult to specify the behaviour of the borrow checker simply because it is very complex behaviour.

            You absolutely could do it, but it would be a ridiculous effort for nebulous benefit.

        • pjmlp 343 days ago
          Spot on.
    • flohofwoe 343 days ago
      Yeah. FWIW, we shipped PC games since the early 2000s written in C++ where the C++ stdlib was banned (for various reasons, not just memory safety), and our custom container classes were bounds checked via custom asserts which stayed in the code for the shipped game (and the rest of the code also peppered with asserts).

      ...and then you still had to argue with some circles of the C++ community why the game and engine code doesn't use the stdlib. It's crazy that it takes decades to convince some people that a bad idea is simply a bad idea.
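
A minimal sketch of the kind of container described above, assuming a hypothetical always-on assert macro (`GAME_ASSERT`) and a hypothetical `FixedArray` class; neither name comes from any real engine. Unlike the standard `assert`, the macro ignores `NDEBUG`, so the checks stay in shipped builds.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Hypothetical always-on assert: reports the failed condition and aborts,
// regardless of whether NDEBUG is defined.
#define GAME_ASSERT(cond)                                        \
    do {                                                         \
        if (!(cond)) {                                           \
            std::fprintf(stderr, "assert failed: %s (%s:%d)\n",  \
                         #cond, __FILE__, __LINE__);             \
            std::abort();                                        \
        }                                                        \
    } while (0)

// Hypothetical fixed-capacity array whose accessors are always range checked.
template <typename T, std::size_t Capacity>
class FixedArray {
public:
    std::size_t size() const { return size_; }

    void push_back(const T& value) {
        GAME_ASSERT(size_ < Capacity);  // refuse to write past the storage
        items_[size_++] = value;
    }

    T& operator[](std::size_t i) {
        GAME_ASSERT(i < size_);  // checked in debug and release alike
        return items_[i];
    }

    const T& operator[](std::size_t i) const {
        GAME_ASSERT(i < size_);
        return items_[i];
    }

private:
    T items_[Capacity] = {};
    std::size_t size_ = 0;
};
```

Indexing a `FixedArray<int, 4>` holding two elements at position 2 would print the failed condition and abort instead of reading stale storage.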

      • pjmlp 343 days ago
        Which is kind of ironic, given how performance minded the game industry is, and then we have those circles with such attitude.
        • flohofwoe 342 days ago
          Well, I did measure performance overhead of all those asserts of course (not just the range checks). It was somewhere around 2% of the frame budget which isn't nothing, but also not enough to justify removing the asserts.
          • pjmlp 342 days ago
            That is already a big difference to those that oppose on principle, never having measured anything.
  • WalterBright 343 days ago
    Dlang added array bounds checking 20 years ago. It's a huge win. As evidenced by the article noting that 40% of the memory safety bugs were spatial.

    I used to have all kinds of problems with array overflows. I didn't make them very often, but when I did, they took a long time to track down. They've been gone for 20 years now.

    Note that it would be easy to add it to C/C++:

    https://www.digitalmars.com/articles/C-biggest-mistake.html

    It would be the most useful and cost-effective enhancement ever.
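
The idea of an array that carries its own length can be approximated in today's C++ as a pointer and length pair. This is only a sketch of the concept, not the syntax the linked article proposes; `Slice`, `make_slice` and `try_read` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical "fat pointer": the pointer travels together with its length.
template <typename T>
struct Slice {
    T* data;
    std::size_t length;
};

// Build a slice from a real array. The length is deduced from the array type,
// so it cannot drift out of sync with the storage.
template <typename T, std::size_t N>
Slice<T> make_slice(T (&array)[N]) {
    return Slice<T>{array, N};
}

// Checked read: returns false rather than touching memory outside the slice.
template <typename T>
bool try_read(Slice<T> s, std::size_t index, T& out) {
    if (index >= s.length) return false;
    out = s.data[index];
    return true;
}
```

A function taking `Slice<int>` instead of `int*` no longer needs a separate length parameter that callers can get wrong.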

    • lpapez 343 days ago
      Thanks for sharing, I enjoy reading your posts about how far ahead of its time Dlang was in adopting these improvements.

      I wanted to ask: did you ever consider what was missing from Dlang to achieve widespread adoption? Clearly it was not features, so I'm wondering what that would be from your perspective.

      • WalterBright 343 days ago
        The marketing department was what was missing. I've always had that problem. Borland was brilliant at marketing an inferior compiler. Philippe Kahn is an amazing businessman. (He's also a very fun person to talk to.)

        For example, Borland at one point decided to include the source code to some of its runtime library for free. At a compiler roundup in the magazine, this was hailed as a great advance forward by the reviewer. Meanwhile, Datalight C was also in the roundup, and had always included 100% of the runtime library source code. No mention was made of this.

      • dataflow 343 days ago
        > what was missing from Dlang to achieve widespread adoption

        This: https://godbolt.org/z/s49qzPn81

    • dataflow 343 days ago
      They have it already, it's called std::span.
      • pjmlp 343 days ago
        No they didn't, if you care about security, gsl::span is the answer.
        • coffeeaddict1 343 days ago
          It's quite obvious to me that the C++ folks running the committee didn't care about safety much. How can they standardise `std::span` knowing it's unsafe?

          They care now (well they pretend at least) because Rust is going to take significant market share in domains where C++ is still king.

          • 3836293648 343 days ago
            They didn't just standardise it when it was unsafe. They got a proposal for a safe span and demanded that safety be removed before they'd accept it
          • pjmlp 343 days ago
            Rust isn't the reason, rather governments are now serious about security, just like in any other industry.
            • coffeeaddict1 343 days ago
              Well yes and no. Rust is the reason because it's a real memory-safe alternative for system programming. If it didn't exist, governments would give C and C++ a "pass" for being memory unsafe.
              • pjmlp 342 days ago
                Except, Rust is never mentioned alone on cybersecurity advisories, the anti safety folks are the ones doing that.
            • aninteger 342 days ago
              Meh.. I kind of don't think so. The reason you and I keep ending up on haveibeenpwned has likely nothing to do with Rust or C++.
              • pjmlp 342 days ago
                Indeed it has to do with lack of liability, but it will come.

                And then evolution will take care of which programming ecosystems are less expensive in terms of lawsuits or invalidated insurance policies.

        • dataflow 343 days ago
          > No they didn't, if you care about security, gsl::span is the answer.

          https://godbolt.org/z/Pda9Me45P ?

          • pjmlp 343 days ago
            Unless you use .at() it isn't portable to assume code safety.
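
For reference, a small sketch of the portable behaviour `.at()` gives, shown here with `std::vector` because its `at()` is required by the standard to throw `std::out_of_range` on a bad index. The helper name `read_with_at` is made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns true and stores the element when index is valid.
// Returns false when at() rejects the index by throwing.
inline bool read_with_at(const std::vector<int>& v, std::size_t index, int& out) {
    try {
        out = v.at(index);  // throws std::out_of_range when index >= v.size()
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```

By contrast, `v[index]` with an out-of-range index is undefined behaviour unless the standard library in use was built or configured with its own checks.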
            • dataflow 343 days ago
              So what? Just pass the command-line flag to enable the code safety in your toolchain. The same way you pass it to enable optimizations in your toolchain.
              • coffeeaddict1 343 days ago
                > The same way you pass it to enable optimizations in your toolchain.

                No, it's not the same. I never enable optimisations by manually passing in flags to the compiler. It's always a `cmake -DCMAKE_BUILD_TYPE=...`. There is no such easily accessible equivalent for bounds checking.

                • dataflow 343 days ago
                  • coffeeaddict1 343 days ago
                    What flag can I pass to CMAKE_CXX_FLAGS to enable bounds checking on all platforms regardless of the compiler used? I can do that for optimisations with `CMAKE_BUILD_TYPE`.
                    • dataflow 343 days ago
                      (How) do you have no control over the environment variables you call CMake with?
                      • coffeeaddict1 343 days ago
                        I don't quite get what you mean. Of course, I could get CMake to pass a specific compiler flag at the configuration stage, but that misses the point. What I'm saying is that there is a super-easy way to configure CMake to enable optimisations (CMAKE_BUILD_TYPE=Release), but this cannot be said for bounds checking. Note that my own code does have bounds checking enabled for Clang, GCC and MSVC. What I'm arguing is that setting up the latter is significantly more effort than enabling optimisations. I'm not arguing that it isn't possible or that one shouldn't do that.
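
To make the portability point concrete, here is a hedged sketch that reports which standard-library checking macros are visible at compile time. The macro names are the ones I understand the three major libraries to use (`_LIBCPP_HARDENING_MODE` for libc++, `_GLIBCXX_ASSERTIONS` for libstdc++, `_ITERATOR_DEBUG_LEVEL` for the MSVC STL); treat the mapping as an assumption and check each library's documentation. The function name `stdlib_check_status` is hypothetical.

```cpp
#include <cassert>
#include <string>

// Reports which library-specific checking macros are defined in this
// translation unit. A defined macro does not by itself prove that checks are
// active: a library may define it with a default value that means "off".
inline std::string stdlib_check_status() {
    std::string status;
#if defined(_LIBCPP_HARDENING_MODE)
    status += "libc++: _LIBCPP_HARDENING_MODE is defined; ";
#endif
#if defined(_GLIBCXX_ASSERTIONS)
    status += "libstdc++: _GLIBCXX_ASSERTIONS is defined; ";
#endif
#if defined(_ITERATOR_DEBUG_LEVEL)
    status += "MSVC STL: _ITERATOR_DEBUG_LEVEL = " +
              std::to_string(_ITERATOR_DEBUG_LEVEL) + "; ";
#endif
    if (status.empty()) status = "no known checking macro defined";
    return status;
}
```

Each library needs its own spelling, which is why a build system has to branch per toolchain instead of setting one switch.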
                        • dataflow 343 days ago
                          > I don't quite get what you mean. [...] What I'm saying is that there is a super-easy way to configure CMake to enable optimisations (CMAKE_BUILD_TYPE=Release), but this cannot be said for bounds checking

                          Maybe I'm not getting what you mean.

                          You are saying you already run

                            cmake ...
                          
                          So I am saying you can just change that to

                            CXXFLAGS="-Dblah" cmake -U CMAKE_CXX_FLAGS ...
                          
                          That genuinely seems pretty darn easy to me.

                          In any case, any beef you have is clearly with CMake here. You'd have the same issue(s) with any other flag, for any language, if you use CMake.

                • almostgotcaught 343 days ago
                  > I never enable optimisations by manually passing in flags to the compiler

                  Lol then you don't use your compiler/toolchain correctly. How is that anyone's problem but yours?

                  • coffeeaddict1 343 days ago
                    How exactly am I not using my toolchain correctly? What's the "correct" way?
              • WalterBright 343 days ago
                We had this debate early on with D. The resolution was checking was on by default. In order to get array bounds turned off, you had to throw a switch and it only happened for code marked @system.

                This turned out to be the right move.

                • dataflow 343 days ago
                  That wasn't something I was even debating here. People derailed this whole discussion.

                  All I was doing here was saying that the fix for your "C's biggest mistake" (your T arr[..] proposal) is already in C++ and you can get it today: it's called std::span, and it was explicitly designed to let you get bounds-checking, with just a different syntax. It needs a compiler flag, and so do optimizations. You already pass one, so pass the other too, and get what you wanted.

                  That was all I was saying. But this being HN, everyone insisted on derailing this into an argument about whether safe-by-default is better than fast-by-default, when that had nothing to do with my point, and when I was certainly not trying to argue one is better than the other.

                  • throw16180339 341 days ago
                    > That wasn't something I was even debating here. People derailed this whole discussion.

                    If you propose something with blatantly obvious flaws here, you'll usually get called out.

                    You suggested that people use an interface without bounds checking and jump through a hoop to enable bounds checking with it. Other people disagreed that this is a solution. You kept digging deeper after that while ignoring their responses, but that's on you.

                    • dataflow 341 days ago
                      > If you propose something with blatantly obvious flaws here, you'll usually get called out. You suggested that people use an interface without bounds checking and jump through a hoop to enable bounds checking with it. Other people disagreed that this is a solution. You kept digging deeper after that while ignoring their responses, but that's on you.

                      The problem you don't seem to understand is that, with this being HN, if I'd told people to use gsl::span, then I would have had a similar barrage of people "calling me out" for it having the "obvious flaws" of (1) destroying performance for users who don't want it, and/or (2) being nonstandard and in no way equivalent to the dlang.org proposal, this is why C++ sucks, blah blah. I might as well have just told them to write their own configurable wrappers at that point.

                      So I proposed std::span because it was literally the standard solution that was explicitly designed to let people get bounds checking without those problems... so that they can have their cake and eat it however they want, without an immediate performance loss. I frankly thought that was obvious, but this being HN, I was greeted with people "calling me out". It's like it's impossible to tell people something useful here without writing a comprehensive dissertation on the general topic. Makes me regret trying to help people.

              • pjmlp 343 days ago
                Which command line option from ISO C++23?

                It is not in the standard; it is neither portable nor guaranteed to exist.

      • coffeeaddict1 343 days ago
        std::span is not bounds checked by default.
        • dataflow 343 days ago
          Optimizations aren't enabled by default either, and yet everyone passes a flag to optimize, and nobody argues C++ sucks just because you need to pass a flag to enable optimizations. Is it so hard to pass another flag to enable bounds checking?
          • saagarjha 343 days ago
            Yes.
            • dataflow 343 days ago
              Eh? How/why?
              • saagarjha 343 days ago
                Because turning on optimizations makes your code faster and turning on bounds checks makes your code slower. Hence, one gets used far more than the other.
                • dataflow 343 days ago
                  > Because turning on optimizations makes your code faster and turning on bounds checks makes your code slower. Hence, one gets used far more than the other.

                  The question was "is it so hard to pass a command line flag". You said "yes" when you clearly don't see any difficulty with actually passing the flag. Instead you're apparently answering a totally different question: "why do people lack the motivation to do this." Which had nothing to do with the point you replied to.

                  It's not like opt-out vs. opt-in somehow changes the performance characteristics. People who want maximum performance will turn it off. People who want safety will turn it on.

                  • saagarjha 343 days ago
                    Is it so hard to shoot someone? It's just pressing the trigger. When you say it's hard to kill people you're really just answering a different question, one that is about the psychological or legal or moral cost of doing so. Maybe your overly literal interpretation is not the one people actually want.
                    • dataflow 343 days ago
                      > Is it so hard to shoot someone? It's just pressing the trigger. When you say it's hard to kill people you're really just answering a different question, one that is about the psychological or legal or moral cost of doing so. Maybe your overly literal interpretation is not the one people actually want.

                      You don't feel you're missing the point of the discussion?

                      The whole discussion started with: "if you want bounds checking in your own code". Notice the "if". That's the premise.... it by definition assumes you've already accepted the performance impact of getting the safety you want, and thus it's not a problem for you.

                      The only remaining question at this point is, how hard is it to get you that safety. Asking you "is it so much harder to pass -foo like the -bar you already pass" and expecting you to address the physical difficulty of adding a flag isn't taking an "overly literal" reading of the question, it's literally asking the most obvious and only remaining question.

                      If you want to go back to the premise and argue about the psychological hurdle of taking a performance loss, that's fine and all, but then you're completely changing the topic of the thread you replied to.

                      P.S. comparing passing an extra command-line flag to shooting someone is a rather insane comparison. Honestly, all this is really making me regret trying to share a tip to help people make their code safer.

                      • saagarjha 343 days ago
                        You're regretting it because you keep telling people to use an interface that explicitly was designed to not provide bounds checking and claiming that this is the solution to make their code safer, while in reality you have to look up some nonportable flag to enable it for your STL if it even offers the functionality at all. Maybe people would be a lot more reasonable if you didn't post intentional bait in the first place.
                        • dataflow 343 days ago
                          > You're regretting it because

                          No, I'm regretting it because having to spend hours replying to comments that ignore the premise is a complete waste of my time.

                          > you keep telling people to use an interface that explicitly was designed to not provide bounds checking

                          As a matter of fact it was very intentionally and specifically designed to allow bounds-checking to be configured at build time: "As an example, in the current reference implementation, violating a range-check results by default in a call to terminate() but can also be configured via build-time mechanisms to continue execution (albeit with undefined behavior from that point on)." [1]

                          Calling that "explicitly designed not to provide bounds checking" is quite a deceptively misleading way to paint it. It's not an accident that you can enable bounds-checking, it's very much by design and intended that you do so. They just didn't happen to standardize the flag name, just like they never standardized the optimization flag names.

                          > and claiming that this is the solution to make their code safer, while in reality you have to look up some nonportable flag to enable it for your STL if even offers the functionality at all.

                          Like I said, this is literally the same as optimization flags. Everybody passes them and nobody bashes C++ for it. You're making a big deal out of something incredibly tiny just to win an internet argument on the wrong thread.

                          [1] https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p01...

                          • saagarjha 343 days ago
                            You're the one misunderstanding here. The reference implementation that they provided (which I actually believe is gsl::span) allows configuration. The design for the standard, as you have mentioned elsewhere in this discussion, does not provide bounds checking. I am making a big deal out of this because it is a problem that affects real codebases, not something hypothetical that you can wave away with your idea of how things work. The fact is that people who care about security ship non-bounds-checked spans because this is not the default option.
                      • orf 343 days ago
                        Safety by default, opt-in to unsafety.

                        It’s not hard to grok.

                        • dataflow 343 days ago
                          > Safety by default, opt-in to unsafety. It’s not hard to grok.

                          Nobody was ever saying that unsafe-by-default is somehow better. That just wasn't the question being asked.

                          • orf 343 days ago
                            > The question was "is it so hard to pass a command line flag"

                            Can your position not be summed up as “unsafe by default doesn’t matter, because changing the default is easy”?

                            If so, there’s an obvious flaw in that thinking.

                            • dataflow 343 days ago
                              > Can your position not be summed up as “unsafe by default doesn’t matter, because changing the default is easy”?

                              No.

                              >> Nobody was ever saying that unsafe-by-default is somehow better.

  • vblanco 343 days ago
    Game developers have been doing this since forever, it's one of their main reasons to avoid the STL.

    EASTL has this as a feature by default, and unreal engine container library has the boundchecks enabled on most games. The performance cost of those boundchecks in practice is well worth the reduction of bugs even on performance sensitive code.

    • pjmlp 343 days ago
      Which is yet another reason to assert (pun intended) how far from reality the anti-bounds check folks are, when even the game industry takes them seriously.
  • omoikane 343 days ago
    > Hardening libc++ resulted in an average 0.30% performance impact

    Maybe what really happened is that compiler technology has improved such that they are able to remove most redundant checks, such that it only costs 0.30% today. I can imagine things going the opposite direction 20 years ago, as in "we removed some bounds checks and gained X% of performance".

    • panstromek 343 days ago
      Probably yes, and branch prediction improved a lot since then, too. Bounds checks are easily predictable.
      • Gibbon1 343 days ago
        Bounds checking feels to me like low hanging fruit for a processor designer. A low cost operation that can run in parallel or be tossed away as the stream from the instruction decoder gets optimized and scheduled.

        Meanwhile the guys on the standards committee think of fixed width RISC instructions being executed by jungle logic and the ALU.

      • adgjlsfhk1 343 days ago
        the hard part about bounds checks is you need very specific semantics for bounds errors to prevent them from preventing vectorization. specifically, you don't want to promise that they are thrown when they are executed
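
One way to read "not promising that errors are thrown when they are executed" is deferred reporting: the loop records a violation and keeps going, and the error surfaces once after the loop. The sketch below shows only that semantic choice; whether a given compiler vectorizes it is not something the example demonstrates. `gather_sum` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sums data[idx[k]] over all k. A bad index does not stop the loop at once;
// it is remembered and reported once, after the loop has finished.
// On failure, out is left unchanged.
inline bool gather_sum(const std::vector<int>& data,
                       const std::vector<std::size_t>& idx,
                       long& out) {
    bool all_in_bounds = true;
    long total = 0;
    const std::size_t n = data.size();
    for (std::size_t k = 0; k < idx.size(); ++k) {
        const bool ok = idx[k] < n;
        all_in_bounds = all_in_bounds && ok;  // record the violation, no early exit
        total += ok ? data[idx[k]] : 0;       // never read outside the vector
    }
    if (!all_in_bounds) return false;
    out = total;
    return true;
}
```

The loop body has no exit that depends on the check, which is the property the comment above asks of the error semantics.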
        • almostgotcaught 343 days ago
          > preventing vectorization

          No one that wants to emit vectorized code is relying on auto-vectorization to emit that code.

    • cma 343 days ago
      Unfortunately for many use cases like gamedev, debug builds need to be fast too. So hopefully more of the improvement is from branch prediction.
    • masklinn 343 days ago
      Bounds checks are trivially predictable though, I would hope code density was the issue rather than branch prediction.

      And as others note, bounds checking was the norm before the STL.

  • alserio 343 days ago
    > We first enabled hardened libc++ in our tests over a year ago. This allowed us to identify and fix hundreds of previously undetected bugs in our code and tests.

    That's something

  • dzogchen 343 days ago
    To “lines of C++” and to “hundreds of millions of lines of C++” is quite a different title.
  • TinkersW 343 days ago
    I wonder if Google really never had this turned on before? Like this has been available in the C++ standard library for decades (normally as a debug feature to catch errors in development, but some implementations such as MS support it in release also).

    Might explain why they claimed 70% of exploits were memory related..

    • alpire 343 days ago
      The hardening mode we enabled was added to libc++ quite recently. It was proposed in 2022: https://discourse.llvm.org/t/rfc-c-buffer-hardening/65734. It was designed to run in prod, so it's quite fast. Previous debug modes I've seen came with much higher costs, and therefore weren't (usually) enabled in prod.
  • DLoupe 343 days ago
    > The safety checks have uncovered over 1,000 bugs

    In most implementations of the standard library, safety checks can be enabled with a simple #define. In some, it's the default behavior in DEBUG mode. I wonder what this library improves on that and why these bugs have not been discovered before.

    • pjmlp 343 days ago
      Being actually enforced, even in release.

      Most folks don't use those #defines, and many still haven't learned about them.

    • dataflow 343 days ago
      It's a great question (_LIBCPP_DEBUG was already a thing in libc++), and AFAIK the answer is supposedly "it used to be too costly to enable these in production with libc++, and it no longer is." I have no first-hand insight as to how accurate this perception is.
      • alpire 343 days ago
        That's exactly right. We've had extra hardening enabled in tests, and that does catch many issues. But tests can't exercise every potential out-of-bounds issue, which is why enabling it in prod enabled us to find & fix additional issues.
    • saagarjha 343 days ago
      They turned those on and 1. checked that the software using it didn't break and 2. made sure it didn't tank performance.

      Source: I worked on this apparently

  • dataflow 343 days ago
    PSA: Perhaps this is stating the obvious, but if you want bounds checking in your own code, start replacing T* with std::span<T> or std::span<T>::iterator whenever the target is an array.
    • jpc0 343 days ago
      std::span is not bounds checked.

      gsl::span is

      • dataflow 343 days ago
        > std::span is not bounds checked. gsl::span is

        https://godbolt.org/z/Pda9Me45P ?

        • debugnik 343 days ago
          You've compiled with _LIBCPP_HARDENING_MODE_FAST, which still adds some extra checks not required by the standard.[1] You can also tell it's nonstandard because it doesn't really throw out_of_range, it just traps.

          > Fast mode, which contains a set of security-critical checks that can be done with relatively little overhead in constant time and are intended to be used in production.

          > Using std::span as an example, setting the hardening mode to fast will always enable the valid-element-access checks when accessing elements via a std::span object, but whether dereferencing a std::span iterator does the equivalent check depends on the ABI configuration.

          1: https://libcxx.llvm.org/Hardening.html

          • dataflow 343 days ago
            > You've compiled with _LIBCPP_HARDENING_MODE_FAST, which still adds some extra checks not required by the standard.

            The standard doesn't require any checks to begin with.

            It also doesn't require optimizations.

            • debugnik 343 days ago
              It does, on explicitly bounds-checked accessors like .at, which span is gaining for C++26.

              But you originally implied using span was sufficient, you didn't mention LLVM's libc++ hardening. (You even mentioned iterators which, I just quoted, might not be bounds-checked on fast mode either.)
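
              A small sketch of what "explicitly bounds-checked accessor" means, using std::vector::at, which has been in the standard for a long time. The helper name is made up for illustration. Unlike operator[], whose behavior on a bad index is undefined unless the library is built in a hardened mode, .at is required to throw std::out_of_range.

              ```cpp
              #include <cstddef>
              #include <stdexcept>
              #include <vector>

              // Hypothetical helper: reports whether v.at(i) rejects the index.
              inline bool at_rejects(const std::vector<int>& v, std::size_t i) {
                  try {
                      (void)v.at(i);  // checked accessor: a bad index throws, by specification
                      return false;
                  } catch (const std::out_of_range&) {
                      return true;
                  }
              }
              ```

              For a three-element vector, index 2 is accepted and index 3 is rejected.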

              • dataflow 343 days ago
                > It does, on explicitly bounds-checked accessors like .at, which span is gaining for C++26.

                When I said "the standard doesn't require this" I clearly was not referring to C++26, which does not even exist yet. In any case, I'm not sure what the point of this pedantry is. I'm pretty sure the point was clear.

                > But you originally implied using span was sufficient, you didn't mention LLVM's libc++ hardening.

                Because this isn't LLVM-specific, every major STL has bounds checking. You just gotta enable it for your toolchain. Sorry I didn't list every single flag, I guess?

                > (You even mentioned iterators which, I just quoted, might not be bound-checked on fast mode either.)

                Which is why I had _LIBCPP_ABI_BOUNDED_ITERATORS, right? I'm not on HN to write comprehensive documentation for every toolchain, I'm just writing a quick tip for people to look into.

                All this pedantic quibbling over "this isn't required by the standard by default" is just pointless arguing for the sake of arguing on the internet. For all the performance freaks who really care about this: no language I know of guarantees optimizations in the standard, so if you're relying on optimized performance, you're already doing nonstandard stuff.

                And practically every major compiled language you love or hate has a way to enable or disable bounds checking, letting you violate their "standard" one way or another. D itself has -boundscheck, C++ has toolchain-specific flags, Go has -gcflags=-B, etc...

                • debugnik 343 days ago
                  So your first answer to being told your initial suggestion is insufficient for bounds-checking was to share a godbolt link without elaborating on where the checking was actually coming from; and when I elaborate, for other readers' sake, not yours, on your solution and other comparable ones, you get defensive and repeatedly call me a pedant. Ok, but you know, these discussions are for everyone else to read and maybe learn something too, not just us.

                  As for the bounds-checked accessors, I mentioned them because they already exist in current C++ for other collections, they're coming to the one you suggested using, and I thought them relevant to a discussion about C++ lacking spatial safety.

                • carbotaniuman 343 days ago
                  I've used vendor-specific C++ compilers with no bounds checking and a barely conforming stdlib, so by your logic C++ has zero bounds checking... Defaults matter!
                  • dataflow 343 days ago
                    > I've used vendor-specific C++ compilers with no bounds checking and a barely conforming stdlib, so by your logic C++ has zero bounds checking...

                    I literally said exactly that: "The standard doesn't require any checks to begin with."

                    > Defaults matter!

                    Sigh... nobody claimed otherwise. You're really missing the point of the thread.

                    All I did was give people a tip on how to improve their code security. The exact sentence I wrote was:

                    >> "If you want bounds checking in your own code, start replacing T* with std::span<T> or std::span<T>::iterator whenever the target is an array."

                    "BUT DEFAULTS MATTER!!!", you rebut! Well OK, then I guess keep your raw pointers in and don't migrate your code? Sorry I tried to help!

                    • carbotaniuman 343 days ago
                      Cool, let me know how to improve the code security on my vendor compiler then, I'll be waiting.
                      • dataflow 343 days ago
                        > Cool, let me know how to improve the code security on my vendor compiler then, I'll be waiting.

                        Switch to std::span and add 1 line to std::span::operator[] to check your bounds...

                        • carbotaniuman 343 days ago
                          I don't think std::span is bounds checked. Try again.
                          • dataflow 343 days ago
                            > I don't think std::span is bounds checked. Try again.

                            That's why I said add 1 line to std::span::operator[] to check your bounds.

                            I'm telling you to modify the STL header. It's a text file. Add 1 line to make it bounds-checked.

        • jpc0 343 days ago
          I do believe the comment thread made the point I would have used here...

          Use gsl::span or write your own bounded span.

          std::span is not bounds checked...
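
          For the "write your own bounded span" option, a minimal sketch might look like the following. The checked_span name and its interface are hypothetical, it is neither std::span nor gsl::span, and it only covers indexing. Throwing on a bad index is a choice made here so the failure is easy to observe; a hardened library would more likely trap.

          ```cpp
          #include <cstddef>
          #include <stdexcept>

          // Hypothetical non-owning view over a contiguous range, with checked indexing.
          template <typename T>
          class checked_span {
          public:
              checked_span(T* data, std::size_t size) : data_(data), size_(size) {}

              // Deduce the length from a built-in array so callers cannot get it wrong.
              template <std::size_t N>
              checked_span(T (&arr)[N]) : data_(arr), size_(N) {}

              std::size_t size() const { return size_; }

              // The one extra line: reject any index outside [0, size).
              T& operator[](std::size_t i) const {
                  if (i >= size_) throw std::out_of_range("checked_span: index out of range");
                  return data_[i];
              }

          private:
              T* data_;
              std::size_t size_;
          };
          ```

          Passing a checked_span<int> instead of an int* plus a separate length keeps the bound attached to the pointer, which is the point of the original tip regardless of which span type provides the check.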

  • Animats 344 days ago
    New buzzword for old thing alert.
    • aseipp 343 days ago
      People (both practitioners & researchers) have been using the terms "temporal" and "spatial" to refer to different classes of C++ vulnerabilities for at least 12 years, back when I was actually writing exploits for a job. It is not new at all, and anyone in the field within the past 6-7 years and worth their salt will instantly recognize them.
      • tom_ 343 days ago
        For whatever it's worth, I've been doing this stupid shit - writing C++, that is - for 25 years, and this is the first time I've heard this term. (This is a data point rather than a complaint. But for a fee, it can become a complaint if you would like.)
        • aseipp 343 days ago
          I meant security engineers/exploiters actually, but yeah, I can see how most working C++ programmers who aren't security specialists might not be as familiar with it.
    • pizlonator 343 days ago
      Nah, "spatial safety" is a term of art among security folks and among PL folks who work on security.

      It's the part of memory safety that's just about bounds. You can also call it "bounds safety" and folks will understand what you mean, but "spatial safety" is the more commonly used jargon.

    • epage 343 days ago
      This term is coming up more frequently in the C++ community as they discuss Rust's safety features, so as to add more nuance to the discussion and focus on subsets of the problem to solve.

      Note that there are some more heated takes on where these terms are being used. I tried to be as generous as possible in my description.

    • vintagedave 344 days ago
      I'll say.

      > Attackers regularly exploit spatial memory safety vulnerabilities, which occur when code accesses a memory allocation outside of its intended bounds

      Isn't that... 'out of bounds memory access'?

  • userbinator 343 days ago
    [flagged]
    • panstromek 343 days ago
      Interestingly, the original quote is "Those who would give up ESSENTIAL Liberty, to purchase a little TEMPORARY Safety, deserve neither Liberty nor Safety." which has quite a different meaning. Otherwise you never have any safety, as safety is always at the expense of some freedom.

      Arguably, this kind of work is the opposite - giving up non-essential freedom (in how you write code) for non-temporary (persistent) security.

      • userbinator 343 days ago
        [flagged]
        • SkiFire13 343 days ago
          What does that have to do with bound checks?
          • saagarjha 343 days ago
            The person you are replying to holds the perspective that keeping software insecure allows for a glorious overthrow of our digital overlords when appropriate. A bit like a digital 2nd amendment, really: the hackers will all band together and use their guns^H^H^H^H vulnerabilities to ensure freedom from tyranny. Is that position stupid? Well, they stop replying when you point out what those security holes are overwhelmingly used for. So you be the judge.
            • userbinator 343 days ago
              It's amazing to see how much the mass media along with Big Tech has brainwashed people with their alarmism. The dystopia is rapidly approaching where you cannot do anything without their explicit approval, and it will all be "for your security".

              Some people are more prescient than most, but they've been canceled for telling others about it.

              • saagarjha 343 days ago
                No, you just refuse to do anything about it, because you dream of that dystopian future so you can brush off your hacker skills and save humanity. Why don’t you actually try to fix the problem instead of mandating that loopholes exist for you (and basically nobody else) to use?
                • userbinator 342 days ago
                  Look what happened to Stallman.

                  > Why don’t you actually try to fix the problem

                  People have been trying for decades. I voted and hope 45/47 will give them something to be scared of. I fight Big Tech's authoritarianism every day. What did you do besides spreading more pro-corporate FUD and helping my enemy?

                  Why is it that the MSM gets all riled up about the people using encryption against the government, but when megacorps use encryption against the population, they're strangely silent? The only true freedom is insecurity. And we will fight against having that freedom taken away, no matter what.

                  • saagarjha 342 days ago
                    I did watch what happened to Stallman. I don’t think the attack on his personal character was well substantiated but it’s pretty clear that he isn’t fit to lead a movement as important as the one fighting for software freedom. Not only is he seemingly incapable of understanding that his role is one of dealing with people, not code, but he has slowly fallen out of relevance regardless by fighting for freedoms which are largely not useful as more of the world comes online. Being able to recompile software or read its source code is little solace if you don’t actually know how to use that. Knowing free software is out there is not helpful if you’re required to use proprietary software for your job, to interact with your government, or just be part of modern society. The viewpoint of a guy who uses Trisquel but really relies on dozens of other people to actually function (seriously, look up what Stallman writes: he basically isolates himself from proprietary software by asking other people to do things for him and then sending the effects to him via some inconvenient mechanism that he feels happier consuming) is inappropriate and out of touch for someone who is seeking to fight for the rights of the average person.

                    As for who you voted for president: I guess you at least learned from last time to include it in your response. But I’m still not seeing any real action there? The new administration has said they’ll take on big tech but I don’t see software freedom driving any of their decisions. Most of them involve large companies having too much control over speech or raising the prices on services, but the solutions do not seem forthcoming. I’m actually concerned because this guy’s right hand man is the world’s richest man who built his fortune on proprietary platforms, and every other big tech CEO also seems to be running to support this guy because they think it will help them become even more entrenched. I can’t say what will happen in the next four years but looking at the previous tenure and spoken statements I see dismantling of anti-monopoly policies, regulation, and a general strengthening of corporate power. None of these seem conducive to an environment where free software can thrive.

                    As for myself most of my projects are some form of GPL. Many of them are focused on interoperability, specifically interoperability of proprietary platforms, so those who use them can become familiar with free software and have an easier time leaving them when they can. But the whole effort is a lot more holistic than focusing on one specific area of computer security, which I say even though that is my expertise and area of employment. Like it or not jailbreaks and exploits are neat but do not seem to be resilient drivers of software freedom.

          • panstromek 343 days ago
            It's a big bound check conspiracy, actually. Basically, Google will collaborate with US government to gather all the bounds, via Android. FBI will break into your house and look for all the `i < length`, and if they don't find them, they will terminate your `i++` on the spot. This is just the beginning. Next time, they will go even for `int = 0`. You'll own nothing, just `for(;;)` and you'll be happy.
  • andrewstuart 343 days ago
    >> spatial safety vulnerabilities represent 40% of in-the-wild memory safety exploits

    Rust advocates tend to turn stats like this into “40% of all security issues are memory safety”, which sounds very similar but is false.

    • kibwen 343 days ago
      > Rust advocates tend to turn stats like this into “40% of all security issues are memory safety”, which sounds very similar but is false.

      You're right that it's false. Historically it's been a much more damning 70% of vulnerabilities that were rooted in memory-unsafety.

      According to the Google Security Blog, in a post linked to from the OP:

      We’ll also share updated data on how the percentage of memory safety vulnerabilities in Android dropped from 76% to 24% over 6 years as development shifted to memory safe languages. [...] The percent of vulnerabilities caused by memory safety issues continues to correlate closely with the development language that’s used for new code. Memory safety issues, which accounted for 76% of Android vulnerabilities in 2019, and are currently 24% in 2024, well below the 70% industry norm, and continuing to drop.

      https://security.googleblog.com/2024/09/eliminating-memory-s...

      • andrewstuart 343 days ago
        You’re still not getting the point.

        OWASP's top ten security vulnerabilities are not memory safety.

        • pornel 343 days ago
          Because most applications aren't written in C++.

          People don't write web apps in C++, because they would have to deal with memory safety issues in addition to all the other issues related to auth, injections, etc.

        • jerf 343 days ago
          So, maybe you can answer a question I've really had a hard time understanding, that I've posted about before: https://news.ycombinator.com/item?id=39542875

          Why are you offended at the idea that languages should be memory safe by default? What code are you writing that you constantly need memory unsafety, constantly available, without being able to write any sort of "unsafe" keyword? Who cares about whether or not it's the #1 problem in OWASP when it's clearly and undeniably been a massive problem for decades? It is sufficient, after all, that it crashes a program or produces incorrect results for it to be a problem worth pursuing, but it is also extremely well known to produce massive security vulnerabilities regardless of what some list says.

          Why is this a hill you are willing to die on? What are you getting out of it? Is your programming life going to be easier? Are you better off when debugging something to not be able to just know that it's not a memory safety problem, and thus to still have to consider it?

          What actual engineering benefit do those rare few of you who seem to be crusading against memory safety fear disappearing?

          When I got into programming in the late 1990s, I was there to catch the last few holdouts of the "everyone should just write in assembler" opinion. I at least understood their arguments around performance and efficiency, and I understood their arguments around "not needing high level languages" even though I disagree with them both then and now. I think on the net they were wrong, but they did have some legitimate benefits to argue on their side, even if they were already outweighed by the costs then and even more so outweighed today.

          But I don't get what you folk furious about memory safety are looking for. "Using" memory unsafety is already an invalid program. It's already pretty much automatically a bug, if not worse. You're not losing anything to simply have memory safety, and you're not gaining anything except bugs and sharp corners by insisting on going without it. And when you absolutely, positively need unsafety, which I'd call "exceptionally rare but definitely non-zero", it's still there in one form or another of "unsafe". I don't see any benefits at all.

          (And let me reiterate and forestall the usual, memory safety does not mean "Rust". Memory safety is every major language on the market today except C and C++.)

          • addaon 343 days ago
            > Why are you offended at the idea that languages should be memory safe by default?

            Why are you okay with languages that are not overflow-safe, or unit-safe, or infinite-loop-safe, or safe against bit flips? Memory safety violations are a major chunk of bugs. Writing code to avoid them is about as hard as writing code to avoid other major classes of bugs. In either case, it’s fallible. Static analysis and testing then give confidence that the system is safe, by multiple metrics. Memory safety isn’t special enough to demand a different approach here; quality code requires a coherent approach to quality across multiple bug classes.

            • jerf 341 days ago
              Memory safety doesn't require a special approach. We have abundant experience that says it does not interfere with writing code. It is only C and C++ that lack it. Nobody else in any other language is running around saying "Oh, no, if only I could have memory unsafety back!" No other language community is rushing to put it back into their language. Nobody else even wants it back.

              You argue like we live in some hypothetical universe where only some bizarre academic language has recently invented the idea in a world where nobody else has even heard of the idea, and it's solving a problem we don't generally have. But the truth is, we already have memory safety... everywhere except C and C++. Those languages stand alone now. They are the only ones where it's an issue. And they have demonstrated in as concrete an engineering way as it can be demonstrated that it is a problem, on numerous levels.

              You're not arguing against some newfangled idea that has no evidence. You're arguing against something that is completely normal engineering practice in place almost everywhere, and the rest of us look at you arguing against it as if you're arguing that source control is a stupid idea for people who can't keep track of the changes they've made, by gosh, just sticking random prefixes and suffixes on my files is enough for me and it ought to be enough for everyone. We're not hypothesizing about it. We've been living it for decades. We're not asking the world to change to be memory safe... it already has. Except C and C++.

    • IshKebab 343 days ago
      I think you're forgetting about temporal safety (use after free). Presumably that brings it up to the 70% of security issues being related to memory safety, which many studies have shown - remarkably consistently.
    • pjmlp 343 days ago
      First of all it is 70%, and secondly even if people like to FUD Rust, it is all security advocates that state this, including those of us that would like a better attitude towards safety in C++ world.

      We got too many C refugees that spoiled the soup.

      • andrewstuart 343 days ago
        The top security issues do not relate to memory safety.

        Rust advocates like to muddy the water and make it sound like memory safety is the biggest issue in security. It isn’t.

        • pjmlp 343 days ago
          You mean advocates like Microsoft Security Response Center and Google Project Zero?

          Or advocates like NSA and FBI?

          Security FUD, name calling Rust any time someone raises security issues, is quite impressive.