My first patch to the Linux kernel

(pooladkhay.com)

128 points | by pooladkhay 2 days ago

12 comments

  • ozgrakkurt 2 minutes ago
    Great blog post. Using _BitInt typedefs for integers is a good option for anyone starting a fresh c project. It has worked well for me so far. _BitInt integers don’t promote to signed automatically like regular integers in c
  • fonheponho 4 minutes ago
    Everybody seems to be missing the forest for the trees on this.

    There is absolutely no "sign extension" in the C standard (go ahead, search it). "Sign extension" is a feature of some assembly instructions on some architectures, but C has nothing to do with it.

    Citing integer promotion from the standard is justified, but it's just one part (perhaps even the smaller part) of the picture. The crucial bit is not quoted in the article: the specification of "Bitwise shift operators". Namely

    > The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand. [...]

    > The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1×2^E2, reduced modulo one more than the maximum value representable in the result type. If E1 has a signed type and nonnegative value, and E1×2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.

    What happens here is that "base2" (of type uint8_t, which is "unsigned char" in this environment) gets promoted to "int", and then left-shifted by 24 bits. You get undefined behavior because, while "base2" (after promotion) has a signed type ("int") and nonnegative value, E1×2^E2 (i.e., base2 × 2^24) is NOT representable in the result type ("int").

    What happens during the conversion to "uint64_t" afterwards is irrelevant; even the particulars of the sign bit of "int", and how you end up with a negative "int" from the shift, are irrelevant; you got your UB right inside the invalid left-shift. How said UB happens to materialize on this particular C implementation may perhaps be explained in terms of sign extension of the underlying ISA -- but do that separately; be absolutely clear about what is what.

    The article fails to mention the root cause (violating the rules for the bitwise left-shift operator) and fails to name the key consequence (undefined behavior); instead, it leads with not-a-thing ("sign-extension bug in C"). I'm displeased.

    BTW this bug (invalid left shift of a signed integer) is common, sadly.

  • ashwinnair99 3 hours ago
    The first one always takes way longer than the code itself deserves. Most of the work is figuring out the unwritten rules, not writing the patch.
    • fooker 2 hours ago
      This is a big problem in open source that seems taboo to discuss.

      In my opinion, unwritten rules are for gatekeeping. And if a new person follows all the unwritten rules, magically there's no one willing to review.

      I think this is how large BFDL-style open source projects slowly become less and less relevant over the next few decades.

      • cromka 2 hours ago
        Agreed. The level of aggressive gatekeepers is just crazy, take Linux ARM mailing list for example. I found the Central and Eastern Europeans particularly aggressive there and I'm saying this as on myself. They sure do like to feel special there, with very little soft skills.
      • tossandthrow 58 minutes ago
        This will likely be alleviated when Ai first projects take over as important OSS projects.

        Fir these projects everything "tribal" has to be explicitly codified.

        On a more general note: this is likely going to have a rather big impact on software in general - the "engineer to company can not afford to loose" is likely loosing their moat entirely.

    • yu3zhou4 3 hours ago
      Can confirm that it also happens in other complex systems! Still a lot of good time and the novelty factor helps with pushing through
    • seb1204 3 hours ago
      Sand that after so many years these rules are still not written down.
  • siddyboi 10 minutes ago
    Huge congrats on tracking that down and getting your first Linux kernel patch merged!
  • NotCamelCase 25 minutes ago
    Lovely article with a happy ending!

    One thing that I am glad to have been taught early on in my career when it comes to debugging, especially anything involving HW, is to `make no assumptions'. Bugs can be anywhere and everywhere.

  • dingensundso 24 minutes ago
    Nice blogpost. Was an really interesting read. Would be interesting to read about the experience of getting the patch accepted and merged.

    One thing I noticed: The last footnote is missing.

  • ngburke 2 hours ago
    Sign extension bugs are the worst. Silent for ages then suddenly everything is on fire. Spent a lot of time in C doing low-level firmware work and ran into the same class of issue more than once. Nice writeup, congrats on the patch.
  • foltik 2 hours ago
    Well done and great writeup! Any idea why the bug hadn’t shown up sooner, like when running self tests?
  • knorker 50 minutes ago
    Integer promotion rules in C are so deceptive.

    I don't believe there's anybody who can reason about them at code skimming speeds. It's probably the best place to hide underhanded code.

  • yu3zhou4 3 hours ago
    Congrats and happy for you, you had a lot of fun and did something genuinely interesting
  • mbana 3 hours ago
    I love these kind of posts.
  • algolint 1 hour ago
    [dead]