> Context switching is virtually free, comparable to a function call.
If you’re counting that low, then you need to count carefully.
A coroutine switch, however well implemented, inevitably breaks the branch predictor’s idea of your return stack, but the effect of mispredicted returns will be smeared over the target coroutine’s execution rather than concentrated at the point of the switch. (Similar issues exist with e.g. measuring the effect of blowing the cache on a CPU migration.) I’m actually not sure if Zig’s async design even uses hardware call/return pairs when a (monomorphized-as-)async function calls another one, or if every return just gets translated to an indirect jump. (This option affords what I think is a cleaner design for coroutines with compact frames, but it is much less friendly to the CPU.)
So a foolproof benchmark would require one to compare the total execution time of a (compute-bound) program that constantly switches between (say) two tasks to that of an equivalent program that not only does not switch but (given what little I know about Zig’s “colorless” async) does not run under an async executor(?) at all. Those tasks would also need to yield on a non-trivial call stack each time. Seems quite tricky all in all.
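For what it's worth, a minimal skeleton of that measurement could look like the sketch below. Everything here is hypothetical: `yieldToOther` stands in for whatever switch primitive is under test (the real harness would ping-pong between two live tasks), and the baseline swaps in a no-op while keeping the same call structure.

```zig
const std = @import("std");

// Hypothetical stand-ins: wire `yieldToOther` to the coroutine library's
// actual switch primitive; the baseline keeps the identical call shape.
fn yieldToOther() void {}
fn noYield() void {}

// Build a non-trivial call stack before every yield point.
fn work(depth: usize, comptime yield: fn () void) u64 {
    if (depth == 0) {
        yield();
        return 1;
    }
    return depth +% work(depth - 1, yield);
}

pub fn main() !void {
    var timer = try std.time.Timer.start();
    var acc: u64 = 0;
    for (0..1_000_000) |_| acc +%= work(32, yieldToOther);
    const switching_ns = timer.lap();
    for (0..1_000_000) |_| acc +%= work(32, noYield);
    const baseline_ns = timer.lap();
    // Print acc so the work can't be optimized away.
    std.debug.print("switching: {} ns, baseline: {} ns (acc={})\n", .{ switching_ns, baseline_ns, acc });
}
```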
Even so. You're talking about storing and loading at least ~16 8-byte registers, including the instruction pointer, which is essentially a jump. Even to L1 that takes some time; more than a simple function call (a jump plus a pushed return address).
Only the stack and instruction pointers are explicitly restored. The rest is handled by the compiler: instead of depending on the C calling convention, it can avoid having things in registers during a yield.
See this for more details on how stackful coroutines can be made much faster:
https://photonlibos.github.io/blog/stackful-coroutine-made-f...
On ARM64, only fp, sp and pc are explicitly restored; and on x86_64 only rbp, rsp, and rip. For everything else, the compiler is just informed that the registers will be clobbered by the call, so it can optimize allocation to avoid having to save/restore them from the stack when it can.
If this were done the classical C way, you would always have to stack-save a number of registers, even if they are not really needed. The only difference here is that the compiler will do the save for you, in whatever way fits the context best. Sometimes it will stack-save; sometimes it will decide to use a different option. It's always strictly better than explicitly saving/restoring N registers unaware of the context. Keep in mind that in Zig, the compiler always knows the entire code base. It does not work on object/function boundaries. That leads to better optimizations.
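To make the contrast concrete, here's a rough sketch of the shape such a switch can take (x86_64 SysV; details hypothetical and simplified — this is not the actual std.Io implementation, and seeding a fresh coroutine's stack is omitted). The interesting part is the clobber list, which replaces the fixed save/restore sequence of a C-convention switch:

```zig
// Sketch only: save/restore just rsp, rbp, and a resume address; declare
// every other register clobbered, so the compiler spills exactly what is
// live at this call site and nothing more.
noinline fn switchStacks(save_slot: *usize, restore_slot: *const usize) void {
    asm volatile (
        \\leaq 1f(%%rip), %%rax   # resume address of the old context
        \\pushq %%rax
        \\pushq %%rbp
        \\movq %%rsp, (%%rdi)     # park the old stack
        \\movq (%%rsi), %%rsp     # adopt the new stack
        \\popq %%rbp
        \\retq                    # pops the new context's resume address
        \\1:
        :
        : [save] "{rdi}" (save_slot),
          [restore] "{rsi}" (restore_slot),
        : "rax", "rbx", "rcx", "rdx",
          "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15",
          "memory", "cc"
    );
}
```

The classical C approach would unconditionally push/pop all the callee-saved registers here; with the clobber list, the compiler saves a register only if something is actually live in it at the yield.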
You are right that the statement was overblown; however, when I was testing with a "trivial" load between yields (a synchronized ping-pong between coroutines), I was getting numbers that I had trouble believing when comparing them to other solutions.
I am still mystified as to why callback-based async seems to have become the standard. What this and e.g. libtask[1] do seems so much cleaner to me.
The Rust folks adopted async with callbacks, and they were essentially starting from scratch so had no need to do it that way, and they are smarter than I (both individually and collectively), so I'm sure they have a reason; I just don't know what it is.
1: https://swtch.com/libtask/
The research Microsoft engineers did on stackful vs. stackless coroutines for the C++ standard, I think, swayed this toward being "the way" to implement it for something targeting the systems level: significantly less memory overhead (you only pay for what you use), and the implementation details of the executor are offloaded (there are lots of different design choices that can be made).
Mostly out of curiosity: a read on a TCP connection could easily block for a month, so what does the I/O timeout interface look like? E.g., if you want to send an application-level heartbeat when a read has blocked for 30 seconds.
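For reference (not this library's API, which may differ): at the syscall level, the classic shape of the answer — which any runtime's timeout interface ultimately wraps — is a bounded wait on the socket. A minimal sketch using Zig's std.posix, with `sendHeartbeat` as a hypothetical application-level ping:

```zig
const std = @import("std");

// Hypothetical application-level ping.
fn sendHeartbeat(sock: std.posix.socket_t) !void {
    _ = try std.posix.write(sock, "PING\r\n");
}

// Wait at most 30s for data; on timeout, heartbeat and keep waiting.
fn readWithHeartbeat(sock: std.posix.socket_t, buf: []u8) !usize {
    while (true) {
        var fds = [_]std.posix.pollfd{
            .{ .fd = sock, .events = std.posix.POLL.IN, .revents = 0 },
        };
        const ready = try std.posix.poll(&fds, 30_000);
        if (ready == 0) {
            try sendHeartbeat(sock); // the read has been blocked for 30s
            continue;
        }
        return std.posix.read(sock, buf);
    }
}
```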
Isn't this a bad time to be embracing Zig? It's currently going through an intrusive upheaval of its I/O model. My impression is that it was going to take a few years for things to shake out. Is that wrong?
> My impression is that it was going to take a few years for things to shake out. Is that wrong?
I had that very impression in early 2020 after some months of Zigging (and being burned by constant breaking changes), and left, deciding "I'll check it out again in a few years."
I had some intuition it might be one of these forever-refactoring eternal-tinker-and-rewrite fests and here I am 5 years later, still lurking for that 1.0 from the sidelines, while staying in Go or C depending on the nature of the thing at hand.
That's not to say it'll never get there, it's a vibrant project prioritizing making the best design decisions rather than mere Shipping Asap. For a C-replacement that's the right spirit, in principle. But whether there's inbuilt immunity to engineers falling prey to their forever-refine-and-resculpt I can't tell. I find it a great project to wait for leisurely (=
It kind of is a bad idea. Even the author's library is not using the latest Zig I/O features and is planning for big changes with 0.16. From the readme of the repo:
> Additionally, when Zig 0.16 is released with the std.Io interface, I will implement that as well, allowing you to use the entire standard library with this runtime.
Unrelated to this library, I plan to do lots of IO with Zig and will wait for 0.16. Your intuition may decide otherwise and that’s ok.
Nobody is denying that? Andrew Kelley and the Zig team have been extremely clear that they are okay making breaking changes. So if you're choosing to use it in large projects, as some have, you're taking that risk.
I think it speaks volumes that these projects chose to use it, and speak very highly of it, despite the fact that it's still pre 1.0.
Yes, your opinion. I run it in production and everything I've built with it has been rock solid (aside from my own bugs). I haven't touched a few of my projects in a few years and they work fine, but if I wanted to update them to the latest version of Zig I'd have a bit of work ahead of me. That's it.
It really depends on what you are doing, but if it's something related to I/O and you embrace the buffered reader/writer interfaces introduced in Zig 0.15, I think not much is going to change. You might need changes on how you get those interfaces, but the core of your code is unchanged.
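For example (per the 0.15 release notes), getting a writer now looks roughly like this: you own the buffer, and the rest of your code only ever sees the `*std.Io.Writer` interface, which is the part that should stay stable:

```zig
const std = @import("std");

pub fn main() !void {
    // You supply the buffer; the interface does the buffering.
    var buf: [4096]u8 = undefined;
    var file_writer = std.fs.File.stdout().writer(&buf);
    const out: *std.Io.Writer = &file_writer.interface;

    try out.print("hello {d}\n", .{42});
    try out.flush(); // buffered: nothing reaches the fd until flushed
}
```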
IMO, it's very wrong. The Zig language is not drastically changing; it's adding a new, *very* powerful API. Similar to how most everything in Zig passes an allocator as a function param, functions that want to do IO will soon accept an object that provides the desired abstraction, so that callers can decide on the ideal implementation.
In other words, the only reason not to use Zig is if you detest upgrading or improving your code. Code you write today will still work tomorrow. Code you write tomorrow will likely use the new Io interface, because you'll want that standard abstraction. But if you don't want to use it, all your existing code will still work.
Just like today, if you want to alloc but don't want to pass an `Allocator`, you can call std.heap.page_allocator.alloc from anywhere. But because that abstraction is so useful, and Zig supports it so ergonomically, everyone writes code that provides that improved API.
Side note: I was worried about upgrading all my code to interface with the new Reader/Writer API that's already mostly stable in 0.15.2, but in the end I only had to add a few lines in many existing projects to upgrade. I now find myself optionally choosing to refactor a lot of functions, because the new API results in code that is SO much better, both in readability and in performance. Do I have to refactor? No, the old API works flawlessly, but the new API is simply more ergonomic, more performant, and easier to read and reason about. I'm doing it because I want to, not because I have to.
Everyone knows a red diff is the best diff, and the new std.Io API exposes an easier way to do things. Still, like everything in Zig, it allows you to write the code that you want to write. But if you'd rather do it yourself, that's fully supported too!
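A hypothetical sketch of that pattern (the std.Io interface is still settling pre-0.16, so every signature here is illustrative rather than the final API): the capability arrives as a parameter, exactly like an `Allocator` does:

```zig
const std = @import("std");

// Illustrative only — names and signatures are guesses at the pre-0.16 shape.
fn firstBytes(io: std.Io, dir: std.Io.Dir, path: []const u8, buf: []u8) !usize {
    const file = try dir.openFile(io, path, .{});
    defer file.close(io);
    return file.read(io, buf);
}
```

The caller decides what `io` actually is (a thread pool, green threads, a single-threaded event loop), the same way it decides which allocator to pass.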
Haha, no! Zig makes breaking changes in the stdlib in every release. I can guarantee you won't be able to update a non-trivial project between any of the latest 10 versions and beyond without changing your code, often substantially, and the next release is changing pretty much all code doing any kind of IO. I know because I keep track of this in a project and can see the diffs between each of the latest versions; that allows me to modify other code much more easily.
But TBH, in 0.15 only `zig build` broke for me, IIRC. Then again, I believe I just didn't happen to use some of the things that changed.
This isn't quite accurate. If you look at the new IO branch[1] you'll see that (for example) most of the std.fs functions are gone, and most of what's left is deprecated. The plan is for all file/network access, mutexes, etc. to be accessible only through the Io interface. It'll be a big migration once 0.16 drops.
[1]: https://github.com/ziglang/zig/blob/init-std.Io/lib/std/fs.z...
> Do I have to refactor? No, the old API works flawlessly
The old API was deleted though? If you're saying it's possible to copy/paste the old stdlib into your project and maintain the old abstractions forward through the ongoing language changes, sure that's possible, but I don't think many people will want to fork std. I copy/pasted some stuff temporarily to make the 0.15 migration easier, but maintaining it forever would be swimming upstream for no reason.
Even the basic stuff like `openFile` is deprecated. I don't know what else to tell you. Zig won't maintain two slightly different versions of the fs functions in parallel. Once something is deprecated, that means it's going away. https://github.com/ziglang/zig/blob/init-std.Io/lib/std/fs/D...
Oh, I guess that's a fair point. I didn't consider the change from `std.fs.openFile` to `std.Io.Dir.openFile` to be meaningful, but I guess that is problematic for some reason?
You're of course correct here, but I thought it was reasonable to omit changes that I would describe as namespace changes. Now, considering the audience, I regret doing so. (It now does require the Io object as well, so "namespace change" isn't accurate here.)
That is literally a breaking change, so your old code will, by definition, not work flawlessly. Maybe the migration overhead is low, but it's not zero, as your comment implies.
I really need to play with Zig. I got really into Rust a few months ago, and I was actually extremely impressed by Tokio, so if this library also gives me Go-style concurrency without having to rely on a garbage collector, then I am likely to enjoy it.
Go has tricks that you can't replicate elsewhere, like infinitely growable stacks; that's only possible thanks to the garbage collector. But I did enjoy working on this, and I'm continually impressed with Zig and how nice, high-level-looking APIs are possible in such a low-level language.
Pre-1.0 Rust used to have infinitely growing stacks, but they were abandoned due to (among other things) performance reasons. (IIRC the stack segments were not collected by Rust's GC[1], but rather freed on return; the deepest function calls may happen in tight loops, and if you are allocating and freeing a stack segment on every iteration of a tight loop, oops!)
1: Yes, pre-1.0 Rust had a garbage collector.
If you succeed in creating a generic async primitive, it doesn't really matter what the original task was (as long as it's something that requires async), no? That's an implication of it being generic?
Zig no longer has async in the language (and hasn't for quite some time). The OP implemented task switching in user-space.
Yep, the frame pointer as well, if you're using it. This is exactly how it's implemented in user-space in Zig's WIP std.Io branch green-threading implementation: https://github.com/ziglang/zig/blob/ce704963037fed60a30fd9d4...
> DO NOT USE FIBERS!
It's completely feasible to stick to something that works for you, and only update/port/rewrite when it makes sense.
What matters is the overall cost.
All of these projects are great, but we can't ignore that Zig has not yet entered a phase where stable API compatibility can be guaranteed.
How about you go to the Zig GitHub and check the progress of the language?
It's literally there: still in beta, not fit for production, let alone backed by a mature ecosystem.
uhhh.... huh? you and I must be using very different definitions for the word most.
> The old API was deleted though?
To be completely fair, you're correct: the old deprecated writer that was available in 0.15 (https://ziglang.org/documentation/0.15.2/std/#std.Io.Depreca...) has been removed; the master branch doesn't provide it anymore.
Edit: lmao, the about text in your profile is hilarious, I appreciate the laugh!
This looks interesting, but I'm not familiar with NATS.
https://en.wikipedia.org/wiki/All_your_base_are_belong_to_us