> Context switching is virtually free, comparable to a function call.
If you’re counting that low, then you need to count carefully.
A coroutine switch, however well implemented, inevitably breaks the branch predictor’s idea of your return stack, but the effect of mispredicted returns will be smeared over the target coroutine’s execution rather than concentrated at the point of the switch. (Similar issues exist with e.g. measuring the effect of blowing the cache on a CPU migration.) I’m actually not sure if Zig’s async design even uses hardware call/return pairs when a (monomorphized-as-)async function calls another one, or if every return just gets translated to an indirect jump. (This option affords what I think is a cleaner design for coroutines with compact frames, but it is much less friendly to the CPU.)
So a foolproof benchmark would require one to compare the total execution time of a (compute-bound) program that constantly switches between (say) two tasks to that of an equivalent program that not only does not switch but (given what little I know about Zig’s “colorless” async) does not run under an async executor(?) at all. Those tasks would also need to yield on a non-trivial call stack each time. Seems quite tricky all in all.
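For what it's worth, a minimal skeleton of that measurement could look like the sketch below. Everything here is hypothetical: `yieldToOther` stands in for whatever switch primitive is under test (the real harness would ping-pong between two live tasks), and the baseline swaps in a no-op while keeping the same call structure.

```zig
const std = @import("std");

// Hypothetical stand-ins: wire `yieldToOther` to the coroutine library's
// actual switch primitive; the baseline keeps the identical call shape.
fn yieldToOther() void {}
fn noYield() void {}

// Build a non-trivial call stack before every yield point.
fn work(depth: usize, comptime yield: fn () void) u64 {
    if (depth == 0) {
        yield();
        return 1;
    }
    return depth +% work(depth - 1, yield);
}

pub fn main() !void {
    var timer = try std.time.Timer.start();
    var acc: u64 = 0;
    for (0..1_000_000) |_| acc +%= work(32, yieldToOther);
    const switching_ns = timer.lap();
    for (0..1_000_000) |_| acc +%= work(32, noYield);
    const baseline_ns = timer.lap();
    // Print acc so the work can't be optimized away.
    std.debug.print("switching: {} ns, baseline: {} ns (acc={})\n", .{ switching_ns, baseline_ns, acc });
}
```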
Even so. You're talking about storing and loading at least ~16 8-byte registers, including the instruction pointer, which is essentially a jump. Even to L1 that takes some time; more than a simple function call (a jump plus a pushed return address).
Only the stack and instruction pointers are explicitly restored. The rest is handled by the compiler: instead of depending on the C calling convention, it can avoid having things in registers during a yield.
See this for more details on how stackful coroutines can be made much faster:
https://photonlibos.github.io/blog/stackful-coroutine-made-f...
On ARM64, only fp, sp and pc are explicitly restored; and on x86_64 only rbp, rsp, and rip. For everything else, the compiler is just informed that the registers will be clobbered by the call, so it can optimize allocation to avoid having to save/restore them from the stack when it can.
If this were done the classical C way, you would always have to stack-save a number of registers, even if they are not really needed. The only difference here is that the compiler will do the save for you, in whatever way fits the context best. Sometimes it will stack-save; sometimes it will decide to use a different option. It's always strictly better than explicitly saving/restoring N registers unaware of the context. Keep in mind that in Zig, the compiler always knows the entire code base. It does not work on object/function boundaries. That leads to better optimizations.
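To make the contrast concrete, here's a rough sketch of the shape such a switch can take (x86_64 SysV; details hypothetical and simplified — this is not the actual std.Io implementation, and seeding a fresh coroutine's stack is omitted). The interesting part is the clobber list, which replaces the fixed save/restore sequence of a C-convention switch:

```zig
// Sketch only: save/restore just rsp, rbp, and a resume address; declare
// every other register clobbered, so the compiler spills exactly what is
// live at this call site and nothing more.
noinline fn switchStacks(save_slot: *usize, restore_slot: *const usize) void {
    asm volatile (
        \\leaq 1f(%%rip), %%rax   # resume address of the old context
        \\pushq %%rax
        \\pushq %%rbp
        \\movq %%rsp, (%%rdi)     # park the old stack
        \\movq (%%rsi), %%rsp     # adopt the new stack
        \\popq %%rbp
        \\retq                    # pops the new context's resume address
        \\1:
        :
        : [save] "{rdi}" (save_slot),
          [restore] "{rsi}" (restore_slot),
        : "rax", "rbx", "rcx", "rdx",
          "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15",
          "memory", "cc"
    );
}
```

The classical C approach would unconditionally push/pop all the callee-saved registers here; with the clobber list, the compiler saves a register only if something is actually live in it at the yield.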
You are right that the statement was overblown; however, when I was testing with a "trivial" load between yields (a synchronized ping-pong between coroutines), I was getting numbers that I had trouble believing when comparing them to other solutions.
I am still mystified as to why callback-based async seems to have become the standard. What this and e.g. libtask[1] do seems so much cleaner to me.
The Rust folks adopted async with callbacks, and they were essentially starting from scratch so had no need to do it that way, and they are smarter than I (both individually and collectively), so I'm sure they have a reason; I just don't know what it is.
1: https://swtch.com/libtask/
The research Microsoft engineers did on stackful vs. stackless coroutines for the C++ standard, I think, swayed this toward being "the way" to implement it for something targeting the systems level: significantly less memory overhead (you only pay for what you use), and the implementation details of the executor are offloaded (there are lots of different design choices that can be made).
Mostly out of curiosity: a read on a TCP connection could easily block for a month, so what does the I/O timeout interface look like? E.g., if you want to send an application-level heartbeat when a read has blocked for 30 seconds.
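For reference (not this library's API, which may differ): at the syscall level, the classic shape of the answer — which any runtime's timeout interface ultimately wraps — is a bounded wait on the socket. A minimal sketch using Zig's std.posix, with `sendHeartbeat` as a hypothetical application-level ping:

```zig
const std = @import("std");

// Hypothetical application-level ping.
fn sendHeartbeat(sock: std.posix.socket_t) !void {
    _ = try std.posix.write(sock, "PING\r\n");
}

// Wait at most 30s for data; on timeout, heartbeat and keep waiting.
fn readWithHeartbeat(sock: std.posix.socket_t, buf: []u8) !usize {
    while (true) {
        var fds = [_]std.posix.pollfd{
            .{ .fd = sock, .events = std.posix.POLL.IN, .revents = 0 },
        };
        const ready = try std.posix.poll(&fds, 30_000);
        if (ready == 0) {
            try sendHeartbeat(sock); // the read has been blocked for 30s
            continue;
        }
        return std.posix.read(sock, buf);
    }
}
```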
Isn't this a bad time to be embracing Zig? It's currently going through an intrusive upheaval of its I/O model. My impression is that it was going to take a few years for things to shake out. Is that wrong?
> My impression is that it was going to take a few years for things to shake out. Is that wrong?
I had that very impression in early 2020 after some months of Zigging (and being burned by constant breaking changes), and left, deciding "I'll check it out again in a few years."
I had some intuition it might be one of these forever-refactoring eternal-tinker-and-rewrite fests and here I am 5 years later, still lurking for that 1.0 from the sidelines, while staying in Go or C depending on the nature of the thing at hand.
That's not to say it'll never get there, it's a vibrant project prioritizing making the best design decisions rather than mere Shipping Asap. For a C-replacement that's the right spirit, in principle. But whether there's inbuilt immunity to engineers falling prey to their forever-refine-and-resculpt I can't tell. I find it a great project to wait for leisurely (=
It kind of is a bad idea. Even the author's library is not using the latest Zig I/O features and is planning for big changes with 0.16. From the readme of the repo:
> Additionally, when Zig 0.16 is released with the std.Io interface, I will implement that as well, allowing you to use the entire standard library with this runtime.
Unrelated to this library, I plan to do lots of IO with Zig and will wait for 0.16. Your intuition may decide otherwise and that’s ok.
Nobody is denying that? Andrew Kelley and the Zig team have been extremely clear that they are okay making breaking changes. So if you're choosing to use it in large projects, as some have, you're taking that risk.
I think it speaks volumes that these projects chose to use it, and speak very highly of it, despite the fact that it's still pre 1.0.
Yes, your opinion. I run it in production and everything I've built with it has been rock solid (aside from my own bugs). I haven't touched a few of my projects in a few years and they work fine, but if I wanted to update them to the latest version of Zig I'd have a bit of work ahead of me. That's it.
It really depends on what you are doing, but if it's something related to I/O and you embrace the buffered reader/writer interfaces introduced in Zig 0.15, I think not much is going to change. You might need changes on how you get those interfaces, but the core of your code is unchanged.
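For example (per the 0.15 release notes), getting a writer now looks roughly like this: you own the buffer, and the rest of your code only ever sees the `*std.Io.Writer` interface, which is the part that should stay stable:

```zig
const std = @import("std");

pub fn main() !void {
    // You supply the buffer; the interface does the buffering.
    var buf: [4096]u8 = undefined;
    var file_writer = std.fs.File.stdout().writer(&buf);
    const out: *std.Io.Writer = &file_writer.interface;

    try out.print("hello {d}\n", .{42});
    try out.flush(); // buffered: nothing reaches the fd until flushed
}
```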
IMO, it's very wrong. The Zig language is not drastically changing; it's adding a new, *very* powerful API. Similar to how most everything in Zig passes an allocator as a function param, functions that want to do IO will soon accept an object that provides the desired abstraction, so that callers can decide on the ideal implementation.
In other words, the only reason not to use Zig is if you detest upgrading or improving your code. Code you write today will still work tomorrow. Code you write tomorrow will likely use the new Io interface, because you'll want that standard abstraction. But if you don't want to use it, all your existing code will still work.
Just like today, if you want to alloc but don't want to pass an `Allocator`, you can call std.heap.page_allocator.alloc from anywhere. But because that abstraction is so useful, and Zig supports it so ergonomically, everyone writes code that provides that improved API.
Side note: I was worried about upgrading all my code to interface with the new Reader/Writer API that's already mostly stable in 0.15.2, but in the end I only had to add a few lines in many existing projects to upgrade. I now find myself optionally choosing to refactor a lot of functions, because the new API results in code that is SO much better, both in readability and in performance. Do I have to refactor? No, the old API works flawlessly, but the new API is simply more ergonomic, more performant, and easier to read and reason about. I'm doing it because I want to, not because I have to.
Everyone knows a red diff is the best diff, and the new std.Io API exposes an easier way to do things. Still, like everything in Zig, it allows you to write the code that you want to write. But if you'd rather do it yourself, that's fully supported too!
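A hypothetical sketch of that pattern (the std.Io interface is still settling pre-0.16, so every signature here is illustrative rather than the final API): the capability arrives as a parameter, exactly like an `Allocator` does:

```zig
const std = @import("std");

// Illustrative only — names and signatures are guesses at the pre-0.16 shape.
fn firstBytes(io: std.Io, dir: std.Io.Dir, path: []const u8, buf: []u8) !usize {
    const file = try dir.openFile(io, path, .{});
    defer file.close(io);
    return file.read(io, buf);
}
```

The caller decides what `io` actually is (a thread pool, green threads, a single-threaded event loop), the same way it decides which allocator to pass.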
Haha, no! Zig makes breaking changes in the stdlib in every release. I can guarantee you won't be able to update a non-trivial project between any of the latest 10 versions and beyond without changing your code, often substantially, and the next release is changing pretty much all code doing any kind of IO. I know because I keep track of this in a project and can see the diffs between each of the latest versions; that allows me to modify other code much more easily.
But TBH, in 0.15 only `zig build` broke for me, IIRC. Then again, I believe I just didn't happen to use some of the things that changed.
This isn't quite accurate. If you look at the new IO branch[1] you'll see that (for example) most of the std.fs functions are gone, and most of what's left is deprecated. The plan is for all file/network access, mutexes, etc. to be accessible only through the Io interface. It'll be a big migration once 0.16 drops.
[1]: https://github.com/ziglang/zig/blob/init-std.Io/lib/std/fs.z...
> Do I have to refactor? No, the old API works flawlessly
The old API was deleted though? If you're saying it's possible to copy/paste the old stdlib into your project and maintain the old abstractions forward through the ongoing language changes, sure that's possible, but I don't think many people will want to fork std. I copy/pasted some stuff temporarily to make the 0.15 migration easier, but maintaining it forever would be swimming upstream for no reason.
Even the basic stuff like `openFile` is deprecated. I don't know what else to tell you. Zig won't maintain two slightly different versions of the fs functions in parallel. Once something is deprecated, that means it's going away. https://github.com/ziglang/zig/blob/init-std.Io/lib/std/fs/D...
Oh, I guess that's a fair point. I didn't consider the change from `std.fs.openFile` to `std.Io.Dir.openFile` to be meaningful, but I guess that is problematic for some reason?
You're of course correct here, but I thought it was reasonable to omit changes that I would describe as namespace changes. Now, considering the audience, I regret doing so. (It now does require the Io object as well, so "namespace change" isn't accurate here.)
That is literally a breaking change, so your old code will, by definition, not work flawlessly. Maybe the migration overhead is low, but it's not zero, as your comment implies.
I really need to play with Zig. I got really into Rust a few months ago, and I was actually extremely impressed by Tokio, so if this library also gives me Go-style concurrency without having to rely on a garbage collector, then I am likely to enjoy it.
Go has tricks that you can't replicate elsewhere, like infinitely growable stacks; that's only possible thanks to the garbage collector. But I did enjoy working on this, and I'm continually impressed with Zig and how nice, high-level-looking APIs are possible in such a low-level language.
Pre-1.0 Rust used to have infinitely growing stacks, but they were abandoned due to (among other things) performance reasons. (IIRC the stack segments were not collected by Rust's GC[1], but rather freed on return; the deepest function calls may happen in tight loops, and if you are allocating and freeing a stack segment on every iteration of a tight loop, oops!)
1: Yes, pre-1.0 Rust had a garbage collector.
If you succeed in creating a generic async primitive, it doesn't really matter what the original task was (as long as it's something that requires async), no? That's an implication of it being generic?
Zig no longer has async in the language (and hasn't for quite some time). The OP implemented task switching in user-space.
Yep, the frame pointer as well, if you're using it. This is exactly how it's implemented in user-space in Zig's WIP std.Io branch green-threading implementation: https://github.com/ziglang/zig/blob/ce704963037fed60a30fd9d4...
> DO NOT USE FIBERS!
It's completely feasible to stick to something that works for you, and only update/port/rewrite when it makes sense.
What matters is the overall cost.
All of these projects are great, but we can't ignore that Zig has not yet entered a phase where stable API compatibility can be guaranteed.
How about you go to the Zig GitHub and check the progress of the language?
It's literally there: still in beta, not fit for production, let alone backed by a mature ecosystem.
uhhh.... huh? you and I must be using very different definitions for the word most.
> The old API was deleted though?
To be completely fair, you're correct: the old deprecated writer that was available in 0.15 (https://ziglang.org/documentation/0.15.2/std/#std.Io.Depreca...) has been removed; the master branch doesn't provide it anymore.
Edit: lmao, the about text in your profile is hilarious, I appreciate the laugh!
This looks interesting, but I'm not familiar with NATS.
https://en.wikipedia.org/wiki/All_your_base_are_belong_to_us