Can't believe the sense of entitlement in this thread. I guess people think bandwidth grows on trees.
For residential usage, unless you're in an apartment tower where all your neighbors are software engineers and you're all behind a CGNAT, you can still do a pull here and there for learning and other hobbyist purposes, which for Docker is a marketing expense to encourage uptake in commercial settings.
If you're in an office, you have an employer, and you're using the registry for commercial purposes, you should be paying to help keep your dependencies running. If you don't expect your power plant to give you electricity for free, why would you expect a commercial company to give you containers for free?
My startup pays Docker for their registry hosting services, for our private registry. However, some of our production machines are not set up to authenticate towards our account, because they are only running public containers.
Because of this change, we now need to either make sure that every machine is authenticated, or take the risk of a production outage in case we do too many pulls at once.
If we had instead simply mirrored everything into a registry at a big cloud provider, we would never have paid docker a cent for the privilege of having unplanned work foisted upon us.
However, if you are using docker's registry without authentication and you don't want to go through the effort of adding the credentials you already have, you are essentially relying on a free service for production already, which may be pulled any time without prior notice. You are already taking the risk of a production outage. Now it's just formalized that your limit is 10 pulls per IP per hour. I don't really get how this can shift your evaluation from using (and paying for) docker's registry to paying for your own registry. It seems orthogonal to the evaluation itself.
The big problem is that the docker client makes it nearly impossible to audit a large deployment to make sure it’s not accidentally talking to docker hub.
This is by design, according to docker.
I’ve never encountered anyone at any of my employers that wanted to use docker hub for anything other than a one-time download of a base image like Ubuntu or Alpine.
I’ve also never seen a CD deployment that doesn’t repeatedly accidentally pull in a docker hub dependency, and then occasionally have outages because of it.
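For anyone trying to do that audit by hand, here is a minimal sketch of the usual rule for deciding whether an image reference falls through to Docker Hub: the first path component only counts as a registry host if it contains a dot or a port, or is "localhost". The function names are made up for illustration, and this only covers references you can already see as strings (compose files, manifests, CI configs), not what the daemon resolves at runtime.

```python
# Host names that all mean Docker Hub for the purposes of this audit.
HUB_HOSTS = {"docker.io", "index.docker.io", "registry-1.docker.io"}


def registry_host(image_ref: str) -> str:
    """Return the registry host an image reference resolves to."""
    # Drop a trailing digest before inspecting the name.
    name = image_ref.split("@", 1)[0]
    first, sep, _rest = name.partition("/")
    # Without a slash there is no host part, so "ubuntu:22.04" is a Hub image
    # and the colon is a tag separator, not a port.
    if sep and ("." in first or ":" in first or first == "localhost"):
        return "docker.io" if first in HUB_HOSTS else first
    return "docker.io"


def hub_references(image_refs):
    """Return the references that would be pulled from Docker Hub."""
    return [ref for ref in image_refs if registry_host(ref) == "docker.io"]


if __name__ == "__main__":
    refs = ["alpine", "quay.io/x/y", "someuser/app:2", "localhost:5000/app"]
    print(hub_references(refs))
```

Feeding it every image string found in a deployment gives a list of the places that still implicitly depend on Docker Hub.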
I have a vague memory of reading something to that effect on their bug tracker, but I always thought the reasoning was ok. IIRC it was something to the effect that the goal was to keep things simple for first time users. I think that's a disservice to users, because you end up with many refusing to learn how things actually work, but I get the sentiment.
> I’ve also never seen a CD deployment that doesn’t repeatedly accidentally pull in a docker hub dependency, and then occasionally have outages because of it.
There's a point where developers need to take responsibility for some of those issues. The core systems don't prevent anyone from setting up durable build pipelines. Structure the build like this [1]. Set up a local container registry for any images that are required by the build and pull/push those images into a hosted repo. Use a pull through cache so you aren't pulling the same image over the internet 1000 times.
Basically, gate all registry access through something like Nexus. Don't set up the pull through cache as a mirror on local clients. Use a dedicated host name. I use 'xxcr.io' for my local Nexus and set up subdomains for different pull-through upstreams; 'hub.xxcr.io/ubuntu', 'ghcr.xxcr.io/group/project', etc.
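As a rough illustration of the dedicated-host-name idea, the sketch below rewrites upstream references to the matching pull-through host. The mapping is hypothetical and simply mirrors the 'hub.xxcr.io' and 'ghcr.xxcr.io' examples; whether a given cache expects a 'library/' prefix for official Hub images is an assumption you would need to check against your own setup.

```python
# Hypothetical mapping from upstream registry to pull-through host name,
# modelled on the 'hub.xxcr.io' / 'ghcr.xxcr.io' examples above.
CACHE_HOSTS = {
    "docker.io": "hub.xxcr.io",
    "ghcr.io": "ghcr.xxcr.io",
}


def split_ref(image_ref: str):
    """Split a reference into (registry host, remaining path)."""
    first, sep, rest = image_ref.partition("/")
    # The first component is a host only if it looks like one.
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first, rest
    # No explicit host means the reference defaults to Docker Hub.
    return "docker.io", image_ref


def to_cache_ref(image_ref: str) -> str:
    """Point a reference at the pull-through cache for its upstream."""
    host, path = split_ref(image_ref)
    cache_host = CACHE_HOSTS.get(host)
    if cache_host is None:
        # No pull-through upstream configured for this host: leave it alone.
        return image_ref
    return f"{cache_host}/{path}"


if __name__ == "__main__":
    for ref in ["ubuntu", "ghcr.io/group/project", "quay.io/a/b"]:
        print(ref, "->", to_cache_ref(ref))
```

References that already point at a cache host pass through unchanged, so running the rewrite twice is harmless.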
Beyond having control over all the build infrastructure, it's also something that would have been considered good netiquette, at least 15-20 years ago. I'm always surprised to see people shocked that free services disappear when the status quo seems to be to ignore efficiency as long as the cost of inefficiency is externalized to a free service somewhere.
> I'm always surprised to see people shocked that free services disappear when the status quo seems to be to ignore efficiency as long as the cost of inefficiency is externalized to a free service somewhere.
Same. The “I don’t pay for it, why do I care” attitude is abundant, and it drives me nuts. Don’t bite the hand that feeds you, and make sure, regularly, that you’re not doing that by mistake. Else, you might find the hand biting you back.
That will most likely fail, since the daemon tries to connect to the registry with SSL and your registry will not have the same SSL certificate as Docker Hub. I don't know if a proxy could solve this.
This is supported in the client/daemon. You configure your client to use a self-hosted registry mirror (e.g. docker.io/distribution or zot) with your own TLS cert (or insecure without if you must) as pull-through cache (that's your search key word). This way it works "automagically" with existing docker.io/ image references now being proxied and cached via your mirror.
You would put this as a separate registry and storage from your actual self-hosted registry of explicitly pushed example.com/ images.
It's an extremely common use-case and well-documented if you try to RTFM instead of just throwing your hands in the air before speculating and posting about how hard or impossible this supposedly is.
You could fall back to DNS rewrite and front with your own trusted CA but I don't think that particular approach is generally advisable given how straightforward a pull-through cache is to set up and operate.
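For reference, the client-side half of the mirror setup is the "registry-mirrors" key in the daemon configuration (commonly /etc/docker/daemon.json on Linux). The sketch below merges a mirror into an existing config without dropping other settings; the mirror URL is a placeholder, and the helper name is invented for this example. Note the mirror setting only applies to Docker Hub images.

```python
import json


def with_mirror(existing_json: str, mirror_url: str) -> str:
    """Return daemon config JSON with mirror_url added to registry-mirrors."""
    # An empty or missing file is treated as an empty config.
    config = json.loads(existing_json) if existing_json.strip() else {}
    mirrors = config.setdefault("registry-mirrors", [])
    # Avoid duplicate entries so the helper can be run repeatedly.
    if mirror_url not in mirrors:
        mirrors.append(mirror_url)
    return json.dumps(config, indent=2, sort_keys=True)


if __name__ == "__main__":
    # "https://mirror.example.com" is a placeholder for your own cache.
    print(with_mirror('{"log-driver": "journald"}', "https://mirror.example.com"))
```

The daemon has to be restarted or reloaded after the file changes before the mirror takes effect.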
All the large objects in the OCI world are identified by their cryptographic hash. When you’re pulling things when building a Dockerfile or preparing to run a container, you are doing one of two things:
a) resolving a name (like ubuntu:latest or whatever)
b) downloading an object, possibly a quite large object, by hash
Part b may recurse in the sense that an object can reference other objects by hash.
In a sensible universe, we would describe the things we want to pull by name, pin hashes via a lockfile, and download the objects. And the only part that requires any sort of authentication of the server is the resolution of a name that is not in the lockfile to the corresponding hash.
Of course, the tooling doesn’t work like this, there usually aren’t lockfiles, and there is no effort made AFAICT to allow pulling an object with a known hash without dealing with the almost entirely pointless authentication of the source server.
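To make the "sensible universe" concrete, here is a small sketch of the two halves: checking a downloaded object against its digest, and only consulting a trusted source for names that are not already pinned. The lockfile is just a dict here and `resolve_remote` is a stand-in for whatever authenticated lookup you trust; none of this is how existing tooling behaves.

```python
import hashlib


def verify_blob(data: bytes, digest: str) -> bool:
    """Check a blob against a digest of the form 'sha256:<hex>'."""
    algorithm, _, expected = digest.partition(":")
    if algorithm != "sha256":
        raise ValueError(f"unsupported digest algorithm: {algorithm}")
    return hashlib.sha256(data).hexdigest() == expected


def resolve(name: str, lockfile: dict, resolve_remote) -> str:
    """Return the pinned digest for name, resolving remotely only if unpinned."""
    if name in lockfile:
        # Already pinned: no trusted lookup needed, any server can supply bytes.
        return lockfile[name]
    # Only this step depends on trusting the server that answers.
    digest = resolve_remote(name)
    lockfile[name] = digest
    return digest


if __name__ == "__main__":
    blob = b"example layer contents"
    digest = "sha256:" + hashlib.sha256(blob).hexdigest()
    lock = {}
    pinned = resolve("ubuntu:latest", lock, lambda _name: digest)
    print(verify_blob(blob, pinned))
```

Once a digest is pinned, the bytes can come from any mirror or cache, because a mismatch is detectable locally.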
Right but then you notice the failing CI job and fix it to correctly pull from your artifact repository. It's definitely doable. We require using an internal repo at my work where we run things like vulnerability scanners.
> since the daemon tries to connect to the registry with SSL
If you rewrite DNS, you should of course also have a custom CA trusted by your container engine as well as appropriate certificates and host configurations for your registry.
You'll always need to take these steps if you want to go the rewrite-DNS path for isolation from external services because some proprietary tool forces you to use those services.
> I don't really get how this can shift your evaluation from using (and paying for) docker's registry to paying for your own registry
Announcing a new limitation that requires rolling out changes to prod with 1 week notice should absolutely shift your evaluation of whether you should pay for this company's services.
At Docker, our mission is to empower development teams by providing the tools they need to ship secure, high-quality apps — FAST. Over the past few years, we’ve continually added value for our customers, responding to the evolving needs of individual developers and organizations alike. Today, we’re excited to announce significant updates to our Docker subscription plans that will deliver even more value, flexibility, and power to your development workflows.
We’ve listened closely to our community, and the message is clear: Developers want tools that meet their current needs and evolve with new capabilities to meet their future needs.
That’s why we’ve revamped our plans to include access to ALL the tools our most successful customers are leveraging — Docker Desktop, Docker Hub, Docker Build Cloud, Docker Scout, and Testcontainers Cloud. Our new unified suite makes it easier for development teams to access everything they need under one subscription with included consumption for each new product and the ability to add more as they need it. This gives every paid user full access, including consumption-based options, allowing developers to scale resources as their needs evolve. Whether customers are individual developers, members of small teams, or work in large enterprises, the refreshed Docker Personal, Docker Pro, Docker Team, and Docker Business plans ensure developers have the right tools at their fingertips.
These changes increase access to Docker Hub across the board, bring more value into Docker Desktop, and grant access to the additional value and new capabilities we’ve delivered to development teams over the past few years. From Docker Scout’s advanced security and software supply chain insights to Docker Build Cloud’s productivity-generating cloud build capabilities, Docker provides developers with the tools to build, deploy, and verify applications faster and more efficiently.
Sorry, where in this hyped up marketingspeak walloftext does it say "WARNING we are rugging your pulls per IPv4"?
That's some cherry-picking right there. That is a small part of the announcement.
Right at the top of the page it says:
> consumption limits are coming March 1st, 2025.
Then further in the article it says:
> We’re introducing image pull and storage limits for Docker Hub.
Then at the bottom in the summary it says again:
> The Docker Hub plan limits will take effect on March 1, 2025
I think like everyone else is saying here, if you rely on a service for your production environments it is your responsibility to stay up to date on upcoming changes and plan for them appropriately.
If I were using a critical service, paid or otherwise, that said "limits are coming on this date" and it wasn't clear to me what those limits were, I certainly would not sit around waiting to find out. I would proactively investigate and plan for it.
The whole article is PR bs that makes it sound like they are introducing new features in the commercial plans and hiking up their prices accordingly to make up for the additional value of the plans.
I mean just starting with the title:
> Announcing Upgraded Docker Plans: Simpler, More Value, Better Development and Productivity
Wow great it's simpler, more value, better development and productivity!
Then somewhere in the middle of the 1500-word (!) PR fluff there is a paragraph with bullet points:
> With the rollout of our unified suites, we’re also updating our pricing to reflect the additional value. Here’s what’s changing at a high level:
> • Docker Business pricing stays the same but gains the additional value and features announced today.
> • Docker Personal remains — and will always remain — free. This plan will continue to be improved upon as we work to grant access to a container-first approach to software development for all developers.
> • Docker Pro will increase from $5/month to $9/month and Docker Team prices will increase from $9/user/month to $15/user/mo (annual discounts). Docker Business pricing remains the same.
And at that point if you're still reading this bullet point is coming:
> We’re introducing image pull and storage limits for Docker Hub. This will impact less than 3% of accounts, the highest commercial consumers.
Ah cool I guess we'll need to be careful how much storage we use for images pushed to our private registry on Docker Hub and how much we pull them.
Well it's an utter and complete lie because even non-commercial users are affected.
————
This super long article (1500 words) intentionally buries the lede because they are afraid of a backlash. But you can't reasonably say “I told u so” when you only mentioned in a bullet point somewhere in a PR article that there will be limits that impact the top 3% of commercial users, then 4 months later give a one week notice that image pulls will be capped to 10 pulls per hour LOL.
The least they could do is to introduce random pull failures with an increasing probability rate over time until it finally entirely fails. That's what everyone does with deprecated APIs. Some people are in for a big surprise when a production incident will cause all their images to be pulled again which will cascade in an even bigger failure.
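A brownout like that is simple to sketch. The version below assumes a linear ramp from 0% to 100% failures between a start and end time; the linear shape and the function names are my own choices, the only requirement from the idea above is that the probability increases over time.

```python
import random


def failure_probability(now: float, start: float, end: float) -> float:
    """Probability of rejecting a request at time now (linear ramp)."""
    if now <= start:
        return 0.0
    if now >= end:
        return 1.0
    return (now - start) / (end - start)


def should_reject(now: float, start: float, end: float, rng=random.random) -> bool:
    """Decide whether to fail this request during the deprecation window."""
    # rng() returns a value in [0, 1), so probability 1.0 always rejects
    # and probability 0.0 never does.
    return rng() < failure_probability(now, start, end)


if __name__ == "__main__":
    # Timestamps are arbitrary units here: halfway through the window.
    print(failure_probability(15, 10, 20))
```

Callers see occasional failures early on, which gives them a warning signal well before the hard cutoff.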
None of this takes away from my point that the facts are in the article, if you read it.
If the PR stuff isn't for you, fine, ignore that. Take notes on the parts that do matter to you, and then validate those in whatever way you need to in order to assure the continuity of your business based on how you rely on Docker Hub.
Simply the phrase "consumption limits" should be a pretty clear indicator that you need to dig into that and find out more, if you rely on Docker in production.
I don't get everyone's refusal here to be responsible for their own shit, like Docker owes you some bespoke explanation or solution, when you are using their free tier.
How you chose to interpret the facts they shared, and what assumptions you made, and if you just sat around waiting for these additional details to come out, is on you.
They also link to an FAQ (to be fair we don't know when that was published or updated) with more of a Q&A format and the same information.
It's intentionally buried. The FAQ is significantly different in November; it does say that unauthenticated pulls will experience rate limits, but the documentation for the rate limits given doesn't offer the limit of 10/hour but instead talks about how to authenticate, how to read limits using API, etc.
The snippets about rate limiting give the impression that they're going to be at rates that don't affect most normal use. Lots of docker images have 15 layers; doesn't this mean you can't even pull one of these? In effect, there's not really an unauthenticated service at all anymore.
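On reading the limits through the API: as far as I recall from Docker's documentation, the registry reports them in response headers such as "ratelimit-limit" and "ratelimit-remaining", with values shaped like "100;w=21600" (a count plus a window in seconds). The sketch below only parses that value format; the sample values are illustrative, not what Hub returns today, and fetching the headers (token request plus a manifest request) is left out.

```python
def parse_rate_limit(value: str):
    """Parse a header value like '100;w=21600' into (count, window_seconds)."""
    count, _, params = value.partition(";")
    window = None
    for param in params.split(";"):
        key, _, val = param.strip().partition("=")
        if key == "w" and val:
            window = int(val)
    # window stays None when the header carries no 'w=' parameter.
    return int(count), window


def pulls_per_hour(count: int, window_seconds: int) -> float:
    """Convert a count-per-window limit into an hourly rate."""
    return count * 3600 / window_seconds


if __name__ == "__main__":
    limit, window = parse_rate_limit("100;w=21600")
    print(limit, window, round(pulls_per_hour(limit, window), 2))
```

Converting to an hourly rate makes it easier to compare the old per-6-hour figures with a per-hour cap.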
> “But the plans were on display…”
> “On display? I eventually had to go down to the cellar to find them.”
> “That’s the display department.”
> “With a flashlight.”
> “Ah, well, the lights had probably gone.”
> “So had the stairs.”
> “But look, you found the notice, didn’t you?”
> “Yes,” said Arthur, “yes I did. It was on display in the bottom of a locked filing cabinet stuck in a disused lavatory with a sign on the door saying ‘Beware of the Leopard’.”
I'm certainly not trying to argue or challenge anyone's interpretations of motive or assumptions of intent (no matter how silly I find them - we're all entitled to our opinions).
I am saying that when change is coming, particularly ambiguous or unclear change like many people feel this is, it's no one's responsibility but yours to make sure your production systems are not negatively affected by the change.
That can mean everything from confirming data with the platform vendor, to changing platforms if you can't get the assurances you need.
Y'all seem to be fixated on complaining about Docker's motives and behaviour, but none of that fixes a production system that's built on the assumption that these changes aren't happening.
> but none of that fixes a production system that's built on the assumption that these changes aren't happening.
Somebody's going to have the same excuse when Google graveyards GCP. Till this change, was it obvious to anyone that you had to audit every PR fluff piece for major changes to the way Docker does business?
> was it obvious to anyone that you had to audit every PR fluff piece for major changes to the way Docker does business?
You seem(?) to be assuming this PR piece, that first announced the change back in Sept 2024, is the only communication they put out until this latest one?
That's not an assumption I would make, but to each their own.
This isn't exactly the same lesson, but I swore off Docker and friends ages ago, and I'm a bit allergic to all not-in-house dependencies for reasons like this. They always cost more than you think, so I like to think carefully before adopting them.
“But Mr Dent, the plans have been available in the local planning office for the last nine months.”
“Oh yes, well as soon as I heard I went straight round to see them, yesterday afternoon. You hadn’t exactly gone out of your way to call attention to them, had you? I mean, like actually telling anybody or anything.”
“But the plans were on display …”
“On display? I eventually had to go down to the cellar to find them.”
“That’s the display department.”
“With a flashlight.”
“Ah, well the lights had probably gone.”
“So had the stairs.”
“But look, you found the notice didn’t you?”
“Yes,” said Arthur, “yes I did. It was on display in the bottom of a locked filing cabinet stuck in a disused lavatory with a sign on the door saying ‘Beware of the Leopard’.”
> I don't get everyone's refusal here to be responsible for their own shit
No kidding. Clashes with the “gotta hustle always” culture, I guess.
Or it means that they can’t hide their four full-time jobs from each of the four employers as easily while they fix this at all four places at the same time.
The “I am owed free services” mentality needs to be shot in the face at close range.
I haven't gone through my emails, but I assume there was email communication somewhere along the way. It's safe to assume there's been a good 2-3 months of communication, though it may not have been as granular or targeted as some would have liked.
I mean, there has never not been some issue with Docker Desktop that I have to remember to work around. We're all just collectively cargo culting that Docker containers are "the way" and putting up with these troubles is the price to pay.
If you offer a service, you have some responsibility towards your users. One of those responsibilities is to give enough notice about changes. IMO, this change doesn't provide enough notice. Why not make it a year, or at least a couple of months? Probably because they don't want people to have enough notice to force their hand.
You don't. You have responsibility towards your owners/shareholders. You only have to worry about your customers if they are going to leave. Non-paying users not so much - you're just cutting costs now that ZIRP isn't a thing.
If this was a public company I would put on my tin foil hat and believe that it's a quick buck scheme to boost CEO pay. A short sighted action that is not in the shareholders' interest. But I guess that's not the case? Who knows...
At this stage of the product lifecycle, free users are unlikely to ever give you money without some further "incentives". This shouldn't be news by now, especially on HN.
If your production service is relying on a free tier someone else provides, you must have some business continuity built in. These are not philanthropic organisations.
What principle are you using to suggest that responsibility comes from?
I have a blog, do I have to give my readers notice before I turn off the service because I can't afford the next hosting charge?
Isn't this almost exclusively going to affect engineers? Isn't it more of the engineer's responsibility not to allow their mission critical software to have such a fragile single point of failure?
> Probably because they don't want people to have enough notice to force their hand.
It's bait and switch that has the stakes of "adopt our new policy, that makes us money, that you never signed up for, or your business fails." That's a gun to the head.
Not an acceptable interaction. This will be the end of Docker Hub if they don't walk back.
Yes. But they are paying for this bandwidth, authenticated or not. This is just busy work, and I highly doubt it will make much of a difference. They should probably just charge more.
> take the risk of a production outage in case we do too many pulls at once.
And the exact time you have some production emergency is probably the exact time you have a lot of containers being pulled as every node rolls forward/back rapidly...
And then docker.io rate limits you and suddenly your 10 minute outage becomes a 1 hour outage whilst someone plays a wild goose chase trying to track down every docker hub reference and point it at some local mirror/cache.
> If we had instead simply mirrored everything into a registry at a big cloud provider, we would never have paid docker a cent for the privilege of having unplanned work foisted upon us.
Indeed, you’d be paying the big cloud provider instead, most likely more than you pay today. Go figure.
They should have provided more notice. Your case is simply prioritizing work that you would have wanted to complete anyway. As a paying customer you could check if your unauthenticated requests can go via specific outbound IP addresses that they can then whitelist? I’m not sure but they may be inclined to provide exceptions for paying customers - hopefully.
> It's busy-work that provides no business benefit, but-for our supplier's problems.
I dunno, if I were paying for a particular quality-of-service I'd want my requests authenticated so I can make claims if that QoS is breached. Relying on public pulls negates that.
Making sure you can hold your suppliers to contract terms is basic due diligence.
It is a trade-off. For many services I would absolutely agree with you, but for hosting public open-source binaries, well, that really should just work, and there's value in keeping our infrastructure simpler.
This sounds like it's only talking about authenticated pulls:
> We’re introducing image pull and storage limits for Docker Hub. This will impact less than 3% of accounts, the highest commercial consumers. For many of our Docker Team and Docker Business customers with Service Accounts, the new higher image pull limits will eliminate previously incurred fees.
So it goes. You're a business, pay to make the changes. It's a business expense. Docker ain't doing anything that their agreements/licenses say they can't do.
It's not fair, people shout. Neither are second homes when people don't even have their first but that doesn't seem to be a popular opinion on here.
Wouldn't they get a choice as to what type of authentication they want to use then? I'd assume they could limit access in multiple ways, vs just the dockerhub way.
I just cannot imagine going out in public and saying roughly the equivalent of "I want free unlimited bandwidth because I'm too lazy to do the very basics of managing my own infra."
> If we had instead simply mirrored everything into a registry at a big cloud provider, we would never have paid docker a cent for the privilege of having unplanned work foisted upon us.
I mean, if one is unwilling to bother to login to docker on their boxes, is this really even an actual option? Hm.
Docker is a company, sure, and they’re entitled to compensation for their services, sure. That said, bandwidth is actually really cheap. Especially at that scale. Docker has publicly been struggling for cash for years. If they’re stuck on expensive clouds from a bygone VC era, that’s on them. Affordable bandwidth is available.
My main complaint is:
They built open source tools used all over the tech world. And within those tools they privileged their own container registry, and provided a decade or more of endless and free pulls. Countless other tools and workflows and experiences have been built on that free assumption of availability. Similarly, Linux distros have had built-in package management with free pulling for longer than I’ve been alive. To get that rug-pull for open-source software is deeply disappointing.
Not only that, but the actual software hosted on the platform is other people’s software. Being distributed for free. And now they’re rent-seeking on top of it and limiting access to it.
I assume most offices and large commercial businesses have caches and other tools built into their workflows, but for indie developers and small businesses, storage of a ton of binary blobs starts to add up. That’s IF they can even get the blobs the first time, since I imagine they could experience contention and queuing if they use many packages.
And many people use docker who aren’t even really aware of what they’re doing - plenty of people (myself included) have a NAS or similar system with docker-wrapping GUI pre-installed. My NAS doesn’t even give me the opportunity to login to docker hub when pulling packages. It’s effectively broken now if I’m on a CGNAT.
You can always rebuild the dockerfile, or create an account increasing your pull limit, or just host all the free infrastructure they’ve already built, or use competitors providing the same service with the risk of them doing exactly the same in the future.
The difference between docker hub and just mirroring a linux repo is that docker made it really easy for people, so they don’t need to get into the infra weeds, but the complexity was always there.
I don't see how that particular issue is relevant here.
Add a port number to the reference, problem solved.
The reality is, DockerHub (though originally called the Docker Index), was the first Docker image registry to even exist, and it was the only one to exist when image references were created.
Now, I would say there are definitely some issues you could have referenced here that would be more relevant (e.g. mirrors only working for DockerHub).
They're OCI images now, and Docker was largely a stolen idea from UNIXes* (unices?), including the term containers. As much as I like what Podman did to open it up using Containerfiles and not defaulting to this, it might as well go even further and tweak the standard a bit - provide something like Dockerfile that's less golang-inspired and more linux-inspired, and improve the format for images - so the industry can move on from Docker lock-in.
That's what makes it approachable (though I don't agree it has any semblance to Go other than "FROM <ref>" being fully qualified (minus the carve-out for Hub), but even then it can absolutely act more like a local import if you have an image with that ref locally (or you can even override it in your build command in a couple of different ways)).
Also note, Docker can build whatever format you want, it just defaults to the Dockerfile format, but you can give it whatever syntax parser you want.
> Countless other tools and workflows and experiences have been built on that free assumption of availability
Cannot help but notice that, had Microsoft offered such a sweet deal, this place would've been ablaze with cries of "Embrace, extend, extinguish" and suchlike. (This still regularly happens, e.g., when new Github features are announced). Perhaps even justifiably so, but the community has failed to apply that kind of critical thinking to any other company involved in open source. If your workflow is not agnostic wrt where you pull images from, it is kind of silly to blame it on Docker Inc.
Having said that, it is definitely a problem for many. I work at a technical university and I am sure colleges/research institutes will hit the limit repeatedly and easily.
Not a new concept. If you aren't paying you are the product. Increasingly capitalism demands the product also pay. Buckle up, late stage capitalism continues.
Sometimes you're the product, other times you're the raw material ready to be crushed up into a fine powder and mixed with 20 additives known only by numbers, and thoroughly homogenized to make the product.
Hmm yes but if it is limited to 10 in an hour that could even be an issue for hobbyists if you update multiple dockers at the same time. For example the excellent matrix ansible playbook pulls numerous dockers in a single update run because every little feature is in a separate container. Same with home assistant add-ons. It's pretty easy to reach 10 in an hour. Even though you may not pull any for a whole month afterwards. I only do this once a month because most matrix bridges only get updates at that rate.
I have to say though, 90% of the dockers I use aren't on docker hub anymore. Most of them reside on the github docker repo now (ghcr.io). I don't know where the above playbook pulls from though as it's all automated in ansible.
And really docker is so popular because of its ecosystem. There are many other container management platforms. I think that they are undermining their own value this way. Hobbyists will never pay for docker pulls but they do generate a lot of goodwill as most of us also work in IT. This works the other way around too. If we get frustrated with docker and start finding alternatives it's only a matter of time until we adopt them at work too.
If they have an issue with bandwidth costs they could just use the infrastructure of the many public mirrors available that also host most Linux distros etc. I'm sure they'd be happy to add publicly available dockers.
Ironically, it's the paid rates that are being reduced more (though they don't have hourly limits still, so more flexibility, but the fair use thing might come up), as they were infinite previously, now Pro is 34 pulls/hour (on average, which is less than authenticated), Team is 138 pulls/hour (or 4 times Pro) and Business 1380 pulls/hour (40 times pro, 10 times team).
My feeling is that this is trying to get more people to create docker accounts, so the upsell can be more targeted.
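For a sanity check on those hourly averages, the sketch below converts them back into monthly totals. It assumes a 30-day month (720 hours), which is my assumption rather than anything stated in the plan descriptions, and the per-hour figures are the ones quoted above.

```python
HOURS_PER_MONTH = 30 * 24  # assumption: a 30-day month


def implied_monthly_pulls(pulls_per_hour: int) -> int:
    """Monthly pull total implied by an average hourly rate."""
    return pulls_per_hour * HOURS_PER_MONTH


# Average hourly rates as quoted in the comment above.
RATES = {"Pro": 34, "Team": 138, "Business": 1380}

if __name__ == "__main__":
    for plan, rate in RATES.items():
        print(plan, implied_monthly_pulls(rate))
    # Ratios between plans, for comparison with "4x", "40x" and "10x".
    print(round(RATES["Team"] / RATES["Pro"], 2))
    print(round(RATES["Business"] / RATES["Pro"], 2))
    print(RATES["Business"] / RATES["Team"])
```

Under that assumption the totals come out at 24,480, 99,360 and 993,600 pulls per month, so the hourly figures look like rounded-down averages of monthly allowances rather than hard hourly caps.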
This means there is a market for a docker proxy. Just install it, in a Docker container of course, and it caches the most common containers you use locally!
Yes, and we can agree I'm not going to participate and if they want to take away their service that's their business decision.
They're entitled to do what they want and implement any business model they want. They're not entitled to any business, to my data, nor their business model working.
The entitlement of ... the VC powered powerplant that reinvented and reimagined electricity, gave out free electricity and put all the competitors out of business, succeeded in monopolizing electricity, then came looking for a payday so they can pad the accounts and 'exit', passing it off to the next round of suckers. Truly unbelievable.
That's business, baby. We're all involved in it, like it or not. And especially American tech/developer culture seems to thrive on fyigm gated community stuff.
I couldn't give 2 shits whatever Docker does. They're a service, if I wanna use it I'll pay, if not then I'll use something else. Ez pz
> Does this commercial company expect volunteers to give them images for free which give their paid subscriptions value?
Yes, to an extent, because it costs money to store and serve data, no matter what kind of data it is or its associated IP rights/licensing/ownership. Regardless, this isn't requiring people to buy a subscription or otherwise charging anyone to access the data. It's not even preventing unauthenticated users from accessing the data. It's reducing the rate at which that data can be ingested without ID/Auth to reduce the operational expense of making that data freely (as in money) and publicly available. Given the explosion in traffic (demand) and the ability to make those demands thanks to automation and AI relative to the operational expense of supplying it, rate limiting access to free and public data egress is not in and of itself unreasonable. Especially if those that are responsible for that increased OpEx aren't respecting fair use (legally or conceptually) and even potentially abusing the IP rights/licensing of "images [given] for free" to the "Library built on the back of volunteers".
To what extent that's happening, how relevant it is to docker, and how effective/reasonable Docker's response to it are all perfectly reasonable discussions to have. The entitlement is referring to those that explicitly or implicitly expect or demand such a service should be provided for free.
Note: you mentioned you don't use docker. A single docker pull can easily be hundreds of MB (the official psql image is ~150MB, for example) or even in some cases over a GB worth of network transfer, depending on the image. Additionally, there is no restriction by docker/dockerhub that prevents or discourages people from linking to source code or alternative hosts of the data. Furthermore, you don't have to do a pull every time you wish to use an image, and caching/redistributing them within your LAN/cluster is easy. It should also be mentioned that Docker Hub is more than just a publicly accessible storage endpoint for a specific kind of data, and their subscription services provide more than just hosting/serving that data.
It's like GitHub limiting the number of checkouts you can do each hour on public repos.
Unless you pay a sub to get rid of the limit.
So, yeah, they're kind of taking advantage of people putting their work on DH to try to sell subs.
But nobody has to put their images on DH.
And to be honest, I don't think the discoverability factor is as important on DH as it is on GitHub.
So if people want to pay for their own registry to make it available for free for everyone, it's less of an issue than hosting your repo on your own GitLab/Gitea instance.
There’s plenty of folks behind a CGNAT, sometimes shared with thousands of others. And this is more common in regions where actually paying for this service is often too expensive.
I’ve also seen plenty of docker-compose files which pull this many images (typically small images).
I’m not saying that Docker Inc should provide free bandwidth, but let’s not also pretend that this won’t be an issue for a lot of users.
He could also hit the limit downloading a single image, if I'm understanding his situation.
If I'm an infrequent tinkerer that occasionally needs docker images, I'm not going to pay a monthly cost to download e.g. 1 image/month that happens to be hosted on Docker's registry.
(It sounds like you can create an account and do authenticated pulls; which is fine and pretty workable for a large subset of my above scenario; I'm just pointing out a reason paying dollars for occasional one-off downloads is unpopular)
From a security and reproducibility perspective, you shouldn’t want to pull directly. I’ve used Artifactory in the past as a pass-through cache that can “promote” images, making them available to test and production environments as they go through whatever validation process is required. Then you know the images (or packages, or gems, or modules, or whatever you are deploying) have at least been tested and an unpinned dependency isn’t going to surprise you in production.
This is what I've seen (and done) at every place that used containers at any kind of scale. I'm frankly impressed with the people who can sleep at night with their production hosts pointed directly at docker hub.
Agreed, it seems like a bunch of people in this thread are scared of having to set up authentication and monitoring, but are not scared of a supply chain attack in the latest docker image they never even looked at.
No one is mad merely because there is a capped free service and an unlimited paid service offering.
The ire is because of the rug pull. (I presume) you know that. It’s predatory behavior to build an entire ecosystem around your free offering (on the backs of OSS developers) then do the good old switcheroo.
I think Docker started the bloated image mess. Have you ever seen a project with <100MB in size?
Guess packing everything with gzip isn't a good idea when size matters.
Docker Hub has a traffic problem, and so does every intranet image registry. It's slow. The culprit is Docker (and maybe people who won't bother to optimize).
Is there an easy way of changing the default repository that's pulled from when you issue a 'docker pull <whatever>' command, or do you always have to make sure to execute 'docker pull <mycustomrepo.io/whatever>' explicitly?
> do you always have to make sure to execute 'docker pull <mycustomrepo.io/whatever>' explicitly
I started using explicit repository names for everything including Docker Hub 5+ years ago and I don't regret it. I haven't thought about mirrors since, and I find it easier to reason about everything. I use pull-through caches with dedicated namespaces for popular upstream registries.
- hub.example.com/ubuntu --> ubuntu from Docker Hub
- ghcr.example.com/org/projectA --> project from GHCR
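A rough sketch of why shorthand names are ambiguous: the Docker client expands a short name into a Docker Hub reference, while an explicit name carries its registry with it. The function below mimics that expansion rule as I understand it (the first path component counts as a registry host only if it contains a `.` or `:` or is `localhost`). `fully_qualify` is a hypothetical helper name, and `hub.example.com` is the placeholder host from the list above.

```python
def fully_qualify(ref: str) -> str:
    """Expand a Docker image reference the way the client's defaults would."""
    first, sep, _rest = ref.partition("/")
    # The first component is a registry host only if it looks like one.
    if sep and ("." in first or ":" in first or first == "localhost"):
        return ref
    # No registry given: the client assumes Docker Hub.
    if not sep:
        # Single-component names live in the "library" namespace.
        return "docker.io/library/" + ref
    return "docker.io/" + ref


print(fully_qualify("ubuntu"))                  # docker.io/library/ubuntu
print(fully_qualify("org/projectA"))            # docker.io/org/projectA
print(fully_qualify("hub.example.com/ubuntu"))  # hub.example.com/ubuntu
```

With explicit names the third case is the only one you ever write, so nothing depends on what the client or a mirror config decides the default is.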
I tried using mirrors at first, but it was a disaster with the shorthand notation because you can have namespace collisions. Consider:
That's not useful because your definitions are still ambiguous unless you go look at the mappings, so all you've done is add external config vs explicitly declaring the namespace.
Plus, you can set up a pull-through cache everywhere it makes sense.
I'd be interested to hear about scenarios where mirrors are more than a workaround for failing to understand the power of Docker's namespacing and defaulting to the shorthand notation for everything.
It’s basically your apartment building example (esp. something like the STEM dorms)
When this stuff breaks in the hours leading up to a homework assignment being due, it’s going to discourage the next generation of engineers from using it.
You're absolutely right, but explaining the cost to the employer and/or the client and getting approvals to even use Docker will be a PITA.
Currently, for smaller clients of the software house I work for, we (normal employees) have been able to use Docker whenever we felt like it, without a manager's approval, to optimize the deployment and maintenance costs on our side.
> why would you expect a commercial company to give you containers for free
Because they did. But you're right—they have no obligation to continue doing so. Now that you mention it, it also reminds me that GitHub has no such obligation either.
In a way, expecting free container images is similar to how we can download packages from non-profit Linux distributions or how those distributions retrieve the kernel tarball from its official website. So, I’m not sure whether it’s better for everyone to start paying Docker Hub for bandwidth individually or for container images to be hosted by a non-profit, supported by donations from those willing to contribute.
If the electricity were generated by thousands of volunteers pedalling in their basement, then yes, I would expect the utility company not to be too greedy.
Not a huge fan of Docker as a company in general, but this is spot on- the DockerHub free tier is still quite generous for private/hobby usage actually - if you are a professional user, well you should very well be having your own professional solution, either your own internal registry or a commercial SaaS registry.
How many new distinct docker images do you get daily? I'd expect fewer than one, on average, with an occasional peak when you do exploration.
There is always a commercial subscription. You need only a single $9/mo account and you get 25,000 pulls/month.
And if you are not willing to pay $9/mo, then you should be OK with using a free personal account for experiments, or with spreading out your experiments over a longer timeline.
If Docker explicitly offers a service for free, then users are well within their rights to use it for free. That’s not entitlement, that’s simply accepting an offer as it stands.
Of course, Docker has every right to change their pricing model at any time. But until they do, users are not wrong for expecting to continue using the service as advertised.
I've seen this "sense of entitlement" argument come up before, and to be clear: users expecting a company to honor its own offer isn’t entitlement, it’s just reasonable.
There's already a rate limit on pulls. All this does is make that rate limit more inconvenient by making it hourly instead of allowing you to amortize it over 6 hours.
10 per hour is slightly lower than 100 per 6 hours, but not in any meaningful way from a bandwidth perspective, especially since image size isn't factored into these rate limits in any way.
If bandwidth is the real concern, why change to a more inconvenient time period for the rate limit rather than just lowering the existing rate limit to 60 per 6 hours?
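To make the difference concrete, here is a small sketch comparing the two limits. It assumes simple fixed windows and a made-up burst size; Docker's real accounting may differ.

```python
def allowed_in_burst(burst: int, limit: int) -> int:
    """Pulls that succeed when `burst` pulls land inside one rate-limit window."""
    return min(burst, limit)


# Sustained ceiling over 6 hours under each scheme.
per_six_hours_old = 100       # 100 pulls per 6-hour window
per_six_hours_new = 10 * 6    # 10 pulls per hour, six windows

# A one-off burst of 30 pulls within a single hour (hypothetical workload).
burst = 30
ok_old = allowed_in_burst(burst, limit=100)
ok_new = allowed_in_burst(burst, limit=10)

print(per_six_hours_old, per_six_hours_new)  # 100 60
print(ok_old, ok_new)                        # 30 10
```

The sustained ceiling drops from 100 to 60 per 6 hours, but the burst case changes far more: the same 30 pulls go from fully allowed to two thirds rejected.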
Until you have this one weird buildpack thing that for some unfathomable reason keeps downloading all the toolchain layers all the time for each build of the app.
Then again, it's good that this measure forces fixing this bad behaviour, but as a buildpack user you don't always know how to fix it.
It kind of depends. To a degree you are right, but not entirely. For the past two months, for instance, I've been making a huge push to de-cloud-ify myself entirely and self-host everything. I do have the bandwidth and I do have the hardware that is needed. Having said that, I am not doing this whole thing little by little but in bursts, whenever I have time. There were times when I was pulling 30 images/hour and it's clearly a one-off thing. While corporations are certainly abusing docker's generosity, in practice, the number of people that pull hundreds of images on an hourly basis is astronomically low - most commonly it's one-off things, much like what I am doing. I've worked in similar environments and the abusers are the exception rather than the rule. The bottom line is, this genuinely feels like some half-assed biz-dev decision, promising to cut costs by 20%. Been there, done that. In the long run, those quick cuts ended up costing a lot more.
How much bandwidth do you suppose DockerHub uses? I can't see it being any less than 10gigabit, probably more like 100gigabit. Just the cost of that transit is likely in the $600-6,000/mo range. Then you need to factor in the additional costs for storage and compute to serve it, switching gear, and management and maintenance. That's probably at least as much as transit.
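The figures above imply a transit price, which can be backed out with quick arithmetic. This only restates the comment's own guesses (10 to 100 gigabit, $600 to $6,000 per month); none of it is known data about Docker Hub.

```python
def implied_price_per_mbps(monthly_cost_usd: float, gbps: float) -> float:
    """Monthly transit price per Mbps implied by a cost and a link size."""
    return monthly_cost_usd / (gbps * 1000)


def petabytes_per_month(gbps: float, days: int = 30) -> float:
    """Data moved in a month if the link ran flat out the whole time."""
    bytes_per_second = gbps * 1e9 / 8
    return bytes_per_second * 86_400 * days / 1e15


print(implied_price_per_mbps(600, 10))     # 0.06
print(implied_price_per_mbps(6000, 100))   # 0.06
print(round(petabytes_per_month(10), 2))   # 3.24
```

Both ends of the range work out to the same assumed price of about six cents per Mbps per month, and a saturated 10 gigabit link moves a bit over 3 PB in 30 days.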
They aren't likely able to go for peering arrangements ("free" bandwidth) because their traffic is likely very asymmetric, and that doesn't save the management/storage/compute costs.
I don't know what Docker's financials are, but I can imagine, as a business owner myself, situations where it was lean enough that that sort of cost could mean the difference between running the service and not.
The bigger the service, the more financial incentive they have to be smart and not pay absurd prices for things, since they can give themselves higher profit margins by controlling their costs.
So yeah you can say it's entitlement but if you build your business in one way and then change the fundamental limits AFTER you've gotten market saturation you really shouldn't be shocked at complaints. It's their fault because they fostered the previous user behavior.
People understand that bandwidth costs money but that seems to have been priced in to their previous strategy or they did it knowingly as a loss leader to gain market share. If they knew this was a fundamental limitation they should have addressed it years ago.
Perhaps they should have started by putting "we will enforce limits soon" in all documentation.. and in a few years, starting enforcement but with pretty high limits? and then slowly dialing limits down over a few years?
That's exactly what they did. I remember setting up the docker proxy 4 years ago when we started getting first "rate limit" errors. And if someone was ignoring the news for all that time.. Well, tough for them, there was definitely enough notice.
We are all getting a little tired of “Come Into My Parlor,” Said the Spider to the Fly.
Or if you want it a little more colorfully: capturing a market with free candy to get us into your van.
Or more accurately, this “free until we need something from you” is the moral equivalent of a free meal or ski trip but you are locked into a room to watch a timeshare marketing deck.
Open Source is built on a gift economy. When you start charging you break the rules of that socioeconomic model. So most of us make tools that we have a hope of keeping running the way they were designed. Some of us stay at the edges because we know we won’t still be interested in this tool in two years and we don’t want to occupy other people’s head space unfairly or dishonestly. Some of us believe we can persevere and then we find there’s a new programming language that we like so much more than this one that we fuck off and leave a vacuum behind us that others have to scramble to fill (I’m talking about you, TJ).
And then there’s these kinds of people who hoover up giant sections of the mindshare using VC money and don’t ever find a new revenue stream, like Mozilla managed. And it’s getting really fucking old.
One of the problems with XML is that the schemas aren’t cached by default, and so a build system or a test harness that scans dozens or hundreds of these files per run, or twenty devs running their sandboxes in a tight loop, can eat up ridiculous amounts of bandwidth. But the important difference is that they architected a way to prevent that, it’s just that it is fiddly to get set up and nobody reads the instructions. I found an email chain with the W3c.org webmaster complaining about it, and myself and a couple other people tried to convince him that they needed to add a 100ms delay to all responses.
My reasoning was that a human loading a single XML file would never notice this increase, but a human running 100s of unit tests definitely would want to know why they suddenly got slower, and doing something about it wouldn’t just get back that extra 20 seconds, they’d get back 40 (or in our case 5-10 minutes) by making the call one more time and putting it into a PR. We only noticed we were doing it because our builds stopped working one day when the internet went down.
There’s no build tool I know of that will deal with being 429’d at 10 requests. Especially if you know anything about how layers work. There are tons that would work just fine being traffic shaped to a server with a lower SLA. Dedicate half of your cluster to paying customers and half to the free tier. Or add 250ms delay. It’ll sort itself out. People will install Artifactory or some other caching proxy and you’ll still have the mindshare at a lower cost per use.
Who knew that cramming 6 GB of Debian dependencies over the wire to run a Python script was a terrible idea? Who could have seen this coming? Maybe people wouldn't think bandwidth grows on trees if literally everyone in Silicon Valley hadn't developed software that way for the last 15 years.
But idk maybe Docker shouldn't have pulled a bait-and-switch, which is also classically known as a "dick move".
Literally every company offering a free service pulls a bait-and-switch. Fool me once, shame on you. Fool me three dozen times... you can't get fooled again. This was not unexpected. This was completely predictable from the moment anyone in the team asked "what does this command do?" and the answer was "it downloads something from Docker Hub"
Bandwidth is cheap, especially at scale, unless you're in one of the large clouds that make a shitload of money gouging their customers on egress fees.
I don't say that Docker Inc should foot the bill for other multibillion dollar companies, but the fact that even after 8 years it still is impossible to use authentication in the registry-mirrors option is mind-boggling.
Entitlement like building a business on top of free open source tech, relying on free open source to build and support your userbase, then using your cheapest line item as a cudgel to coerce subscriptions?
I host OSS images there, and I see no notice about how they will be affected. If they limit access to my published images, then it will be an issue. In that case the benefit and thus incentive for many of the projects which have made docker and docker hub pervasive goes away. Without that adoption, there would probably be no docker hub today.
This should help people understand a bit better why this feels a bit underhanded. The images are free, and I and many other OSS devs have used docker hub in partnership to provide access to software, often paying for the ability to publish there. In this case, any burden of extra cost was on the producer side.
Turning this into a way to "know" every user and extract some value from them is their prerogative, but it does not feel like it is good faith. It also feels a bit creepy in the sense of "the user is the product".
Most of the OSS projects I use seem to either have moved to the GitHub container registry or some other (smaller) equivalent. Some have even set up their own registries behind Cloudflare.
One of the first things I did was move to Quay.io which is unlimited everything for OSS projects. I was reaching a point where I had 1M+ pulls a month (I suspect some kind of DDoS, accidental or otherwise, for a project with just 1.7k stars) - and not having to even think about the bandwidth or anything was wonderful. It's nice to be supported by Red Hat which I generally consider more benevolent towards OSS as opposed to Docker Hub.
They do have special provisions for OSS projects hosting their images on DH.
I don't know all the details, but you should be able to find it in the docs.
This has been the standard practice for all tech companies. Make it free to capture the market and snuff out all competition. Once they have secured the whole market, then it's time to start making money to pay back the millions they borrowed from VCs for decades.
It’s like playing Plague Inc. (reverse version of Pandemic the board game where you play as the disease): to win, develop all possible methods of spreading first; only then develop symptoms, and do it fast before anyone has time to react
I find it surprising that people notice the part about symptoms[1], and despite this happening repeatedly we do relatively little against the part about spreading.
Part of it is perhaps by definition, “spreading” already assumes success. Still, I’d welcome some regulation; or at least awareness; e.g. a neologism for companies in that stage, growing at cost and only getting ready to develop symptoms.
Dockerhub isn't vetted either. Dockerhub is a major compliance risk. Too many images of questionable maintenance status and sometimes questionable build. Aside from maybe some base images I wouldn't pull anything from there for enterprise use. (For toying/experimenting around it's slightly different.)
One can't rely on library updates being done, thus one has to have a build chain for many images.
I feel that dockerhub can no longer be the steward for the default docker repo because of this and the limitations they have previously implemented. It is time for them to hand over the baton to someone else, or for the notion of a default repo to be removed altogether.
For years, people have been trying to add a “override the default registry because docker hub is a single point of failure” option to the docker client.
Upstream has blocked it. A fork over this one little feature is long overdue.
Fully qualified is indeed the way to go. Unfortunately, a lot of tutorials, manifests, and other materials have short names, and changing the default registry just opens up a load of typo squatting attacks.
That's a very good argument, but people have gotten used to dropping the domain prefix because it's always been optional.
Giving people the option to configure a default repository via the daemon.json would alleviate that issue, but I'm not sure if that's really enough to fork.
Oh great, so now it's a hidden default that might be different system-to-system. Disjoint from the actual deployment config. Please, dear god, no. I'm sorry, this is a bad idea.
The web is absolutely littered with docker tutorials and a huge proportion of them (not operated or maintained by docker themselves) would no longer be valid, I'm sure.
That said, they also maintain a list of aliases for a bunch of container images. https://github.com/containers/shortnames . My distro's podman package then drops in that file into /etc/containers/registries.conf.d/ via one of the package's dependencies.
Sure, digests are a good tool when needed, but there's a huge gulf between the risk factors here, and they don't really solve the "trust across repos" problem.
The top level image hashes tend to change between repos (because the image name changes in the manifest and is included in the hash).
So you'd have to go through and verify each layer sha.
Good tool for selecting an exact image in a repo, not a replacement for trust at the naming level (it's a bit like the difference between owning a domain and cert pinning with hpkp).
Manifests are tacked on afterwards, and have a lot of complexity that I'm not sure most folks have actually thought through.
Ex - lots of refs are to "multi-arch" images. Except... there's no such thing as a multi-arch image; the entire identifier is just a reference to a manifest that then points to a list of images (or other manifests) by arch, and the actual resolved artifact is a single entry in that list.
But it means the manifest needs to be able to reference and resolve other names, and that means including... names.
Basically - the digests weren't intended to support image verification across repos, and the tool doesn't treat them that way. The digest was intended to allow tighter specification than a tag (precisely because a publisher might push a different image to the same tag later).
The name is not being set in the manifests, though.
The only reference between objects are digests.
The only place an image name is stored is actually independent of any manifest, it is an endpoint in the registry that resolves the given name to a digest.
Once you have that digest the rest is a DAG.
not sure about the old docker image format, but most modern tools use OCI image format, and that doesn't embed the image name in the manifest, just digests, so it's totally portable everywhere.
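A small sketch of the point about digests, assuming an OCI-style manifest: the digest is a hash of the manifest bytes, and those bytes reference the config and layers only by digest, so the same manifest yields the same digest under any repository name. The manifest below is a trimmed, made-up example and the two repository names are placeholders.

```python
import hashlib
import json


def digest(blob: bytes) -> str:
    """Content address in the registry's `sha256:<hex>` form."""
    return "sha256:" + hashlib.sha256(blob).hexdigest()


layer = b"pretend this is a layer tarball"
config = b'{"architecture": "amd64", "os": "linux"}'

# The manifest points at its children by digest only; no image name appears.
manifest = json.dumps({
    "schemaVersion": 2,
    "mediaType": "application/vnd.oci.image.manifest.v1+json",
    "config": {"digest": digest(config), "size": len(config)},
    "layers": [{"digest": digest(layer), "size": len(layer)}],
}, sort_keys=True).encode()

# A registry's name -> digest mapping lives outside the manifest.
tags = {
    "docker.io/library/example:1.0": digest(manifest),
    "hub.example.com/example:1.0": digest(manifest),
}
print(len(set(tags.values())))  # 1
```

Whether a given tool preserves the manifest bytes exactly when copying between registries is a separate question; if it re-serializes them, the digest changes even though the content is equivalent.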
`latest` also has many different definitions. Some people think it's tip of the dev branch (ie, pulling from `main` git branch), some people think it's most recent stable release (ie, some released tag).
One way to get around this is to just not use `latest` at all, and only push docker tags that perfectly mirror the corresponding git branches/tags.
that really depends on your thoughts of the trustworthiness of the 'owners' of those tags vs the security bugs in the sha256 you pinned and then didn't keep an eye on...
Huh? If you don't like having back-up registries, just specify one. You can also always use a fully qualified image name if you want to source from a specific registry.
This gets at something I've never quite understood about Docker. And it might be a dumb question: why do we need a dedicated host for Docker images in the first place?
I can see the use case for base images: they're the canonical, trusted source of the image.
But for apps that are packaged? Not as much. I mean, if I'm using a PaaS, why can't I just upload my Docker image to them, and then they store it off somewhere and deploy it to N nodes? Why do I have to pay (or stay within a free tier) to host the blob? Many PaaS providers I've seen are happy to charge a few more bucks a month just to host Docker images.
I'm not seeing any sort of value added here (and maybe that's the point).
So, on a technical level, obviously you don't. "docker image import" allows images to be stored anywhere you want.
But obviously the real problem is that you're asking the wrong question. We don't "need" a centralized image repository. We WANT one, because the feature that Docker provides that "just use a tarball" doesn't (in addition to general ease-of-use, of course) is authentication, validation and security. And that's valuable, which is why people here are so pissed off that it's being locked behind a paywall.
But given that it has value... sorry folks, someone's got to pay for it. You can duplicate it yourself, but that is obviously an engineering problem with costs.
Just write the check if you're a heavy user. It's an obvious service with an obvious value proposition. It just sucks if it wasn't part of your earlier accounting.
> It just sucks if it wasn't part of your earlier accounting.
I generally agree with you, but to be fair to the complainers, what sucks is that Docker didn't make it clear up front that it should be part of your accounting. I don't know if they always intended to monetize this way (if so, we'd call that a bait and switch) or if they sincerely had other plans that just didn't pan out, but either way the problem is the same: There's a trend across all of technology of giving your stuff away for free until you become the obvious choice for everything, then suddenly altering the deal and raising prices.
That kind of behavior has in the past been deemed anticompetitive and outlawed, because it prevents fair competition between solutions on their merits and turns it into a competition for who has the deepest war chest to spend on customer-acquisition-by-free-stuff.
Everyone on here cries about this behaviour unless they're the ones with stakes in a biz and their stocks going up up up. Then suddenly it's just business.
Business isn't static. Costs and operations are not static.
At one point Docker may well have had an authentic intent for a free service, but costs along the way changed the reality of operations, and long-term cash flow and the success of the business required making changes. Maybe the cash saved from bandwidth is what makes the next project possible that helps them grow the bottom line.
Further what was once a positive value proposition 18 months ago can turn into a losing proposition today, and a company should be allowed to adapt to new circumstances and be allowed to make new decisions without being anchored and held back by historical decisions (unless under contract).
As fun as it is to hold executives to unrealistic standards, they're not fortune-tellers that can predict the future and they're making the best possible decisions they can given the constraints they're under. And I don't begrudge them if those decisions are in their own best interest, as is their responsibility.
I'll give Docker the benefit of the doubt that this wasn't a bait-and-switch, that they never expected it to become so successful, and that costs outpaced their ability to monetize the success and were eating into cash reserves faster than planned. I think the current outcome isn't so bad, and that we're still getting a considerable amount of value for free. It's unfortunate that some people are only finding out now, and are now under pressure to address an issue they didn't sign up for.
If there was no value provided to customers, then why are people in this thread so angry? They're angry because something valuable they used to get for free now requires money!
Standing up your own registry is trivial at the kind of scales (dozens-to-hundreds of image pulls per day!) that we're talking about. It's just expensive, so people want Docker, Inc. to do it for free. Well...
Because it's a classic EEE-style technique. You force (or very strongly encourage) customers to use your system instead of any competing system. Then, only when all the customers are using your system, and all the competitors are out of business because they don't get customers any more, you rug-pull the customers for money. This wouldn't be a problem if you always had to specify a system when pulling an image, because Docker would be on equal ground to everyone else.
This sucks for individuals and open source. For folks that have a heavy reliance on dockerhub, here are some things that may help (not all are applicable to all use cases):
1. Set up a pull-through mirror. Google Artifact Registry has decent limits and good coverage for public images. This requires just one config change and can be very useful to mitigate rate limits if you're using popular images cached in GAR.[1]
2. Set up a private pull-through image registry for private images. This will require renaming all the images in your build and deployment scripts and can get very cumbersome.
3. Get your IPs allowlisted by Docker, especially if you can't have docker auth on the servers. The pricing for this can be very high. Rough numbers: $20,000/year for 5 IPs, and it usually goes upwards of $50k/year.
4. Set up a transparent Docker Hub mirror. This is great because no changes need to be made to pipelines except one minor config change (similar to 1). We wrote a blog post about how this can be done using the official docker registry image and AWS.[2] It is very important NOT to pull the official docker registry image [3] straight from Docker Hub, as that itself can get throttled and lead to hairy issues. Host your own fork of the registry image and use that instead.
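For options 1 and 4, the "one config change" is typically the `registry-mirrors` key in the Docker daemon's `daemon.json`. A sketch that writes such a file, assuming a placeholder mirror URL (`https://mirror.example.com`) and a temporary path standing in for the real `/etc/docker/daemon.json`; `write_mirror_config` is a made-up helper name:

```python
import json
import tempfile
from pathlib import Path


def write_mirror_config(path: Path, mirror_url: str) -> dict:
    """Add a Docker Hub mirror to a daemon.json-style file, keeping other keys."""
    config = json.loads(path.read_text()) if path.exists() else {}
    mirrors = config.setdefault("registry-mirrors", [])
    if mirror_url not in mirrors:
        mirrors.append(mirror_url)
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Hypothetical mirror URL; a temp dir stands in for /etc/docker.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "daemon.json"
    cfg = write_mirror_config(target, "https://mirror.example.com")
    written = target.read_text()

print(written)
```

As far as I know the daemon needs a restart to pick this up, and the mirror setting only affects images that would otherwise come from Docker Hub, which is why the fully qualified names in option 2 need separate handling.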
We spent a lot of time researching this for certain use cases while building infrastructure for serving GitHub Actions at WarpBuild.
Register for free and you get a higher limit: 40 pulls is plenty. What do you imagine running that requires more than 40 dockerhub (not local) pulls on an hourly basis?
If I start an EKS cluster in a NAT environment with 10 nodes and 4 daemon sets, I need 40 pulls by default. Lots of tutorials out there do this and will no longer work as well.
By default, anything you need from helm charts will be pulled from Docker Hub, and it's normal to have a storage daemon, networking agents, and loggers on every node, so if you launch enough at once during an autoscale event, you'd trigger this limit.
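The arithmetic behind that, as a sketch: behind a shared NAT address every node's pulls count against the same unauthenticated limit. The autoscale numbers are illustrative, and the limit of 10 is the per-IP hourly figure discussed in this thread.

```python
def pulls_needed(nodes: int, daemonsets: int) -> int:
    """Image pulls when every node fetches every DaemonSet image once."""
    return nodes * daemonsets


limit_per_hour = 10  # unauthenticated per-IP limit discussed in the thread

initial = pulls_needed(nodes=10, daemonsets=4)
print(initial, initial > limit_per_hour)  # 40 True

# Hypothetical autoscale event adding 5 nodes behind the same NAT address.
scale_up = pulls_needed(nodes=5, daemonsets=4)
print(scale_up, scale_up > limit_per_hour)  # 20 True
```

Even a modest scale-up of three nodes would need 12 pulls, which is already over the limit before any application images are counted.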
It seems like a good time to point out that oci images' layer-based caching system is incredibly bandwidth inefficient. A change to a lower layer invalidates all layers above it, regardless of whether there's actually any dependency on the changed data.
With a competent caching strategy (the sort of thing you'd set up with nix or bazel) it's often faster to send the git SHA and build the image on the other end than it is to move built images around. This is because 99% of that image you're downloading or pushing is probably already on the target machine, but the images don't contain enough metadata to tell you where that 1% is. A build tool, by contrast, understands inputs and outputs. If the inputs haven't changed, it can just use the outputs which are still lying around from last time.
The way Dockerfiles work, yes I think it does need to do this. It wouldn't be a matter of "conflicts" but rather of assembling containers with the wrong data. Imagine a Dockerfile like so:
RUN echo 1 > A
RUN echo "$(cat A) + 1" | bc > B
So that's two layers each with one file.
If, in a later version, the first command changes to `echo 3 > A` then the contents of B should become "4", even though the second command didn't change. That is, neither layer can be reused because the layers depend on each other.
But maybe there's no dependency. If your Dockerfile is like this:
RUN echo 1 > A
RUN echo 2 > B
Then the second layer could in theory be re-used when the first layer changes, and not built/pushed/downloaded a second time.
RUN echo 3 > A # new
RUN echo 2 > B # no dependency on layer 1, can be reused
But docker doesn't do this. It plays it safe and unnecessarily rebuilds both layers anyway. And since these files end up with timestamps, the hashes of the layers differ, so both layers are consequently reuploaded and redownloaded.
Build tools like nix and bazel require more of the user. You can't just run commands all willy nilly, you have to tell them more info about which things depend on which other things. But the consequence is that instead of a list of layers you end up with a richer representation of how dependency works in your project (I guess it's a DAG). Armed with this, when you try to build the next version of something, you only have to rebuild the parts that actually depend on the changes.
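A toy model of the two strategies, assuming each step declares its inputs explicitly (which a Dockerfile does not): a chained key folds in the key of the previous layer, so any earlier change invalidates everything after it, while a dependency-aware key depends only on the step and what it reads. The step structure and helper names are made up for illustration and are not how Docker, nix, or bazel actually compute keys.

```python
import hashlib


def h(*parts: str) -> str:
    """Short stable hash of a sequence of strings."""
    return hashlib.sha256("\0".join(parts).encode()).hexdigest()[:12]


def chained_keys(steps):
    """Layer-style: each key depends on the previous layer's key."""
    keys, prev = [], ""
    for cmd, _reads in steps:
        prev = h(prev, cmd)
        keys.append(prev)
    return keys


def dependency_keys(steps):
    """Graph-style: a key depends only on the command and its declared inputs."""
    keys = {}
    for i, (cmd, reads) in enumerate(steps):
        keys[i] = h(cmd, *(keys[j] for j in reads))
    return [keys[i] for i in range(len(steps))]


# (command, indexes of earlier steps whose output this step reads)
v1 = [("echo 1 > A", []), ("echo 2 > B", [])]
v2 = [("echo 3 > A", []), ("echo 2 > B", [])]

print(chained_keys(v1)[1] == chained_keys(v2)[1])        # False: layer 2 rebuilt
print(dependency_keys(v1)[1] == dependency_keys(v2)[1])  # True: layer 2 reused
```

If the second step did read A (declared as `[0]`), its dependency key would change too, so the scheme stays correct in the first Dockerfile example while still reusing work in the second.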
Whether the juice is worth the squeeze is an open question. I think it is.
I'm curious about this regarding GCP as well. I have a few Cloud Run Containers set up pulling their image directly from Docker Hub, then injecting my single config file from a Secrets-Manager-backed volume mount. That way I don't have to maintain my own package in GCP's Package Registry when the upstream project only publishes to Docker Hub
As someone mentioned, GitHub has something to prevent this, but it's unclear (or at least undocumented) what.
We at Depot [0] work around this by guaranteeing that a new runner brought online has a unique public IP address, which avoids the need to log in to Docker to pull anything.
We also give each build in our Docker image build product a unique public IP address, which helps with image builds that pull base images, etc.
OTOH, I don't understand why the big cloud platforms don't support caching, or at least make it easy. Azure pulling container dependencies on every build just feels rude.
I.e., do Docker's terms of service restrict distribution in this way?
Are there any technical constraints?
E.g., does Docker specify no-cache?
I expect Docker doesn't want its images cached and would rather you use its service, turning you into a paying subscriber through limitations on the free tier.
My feeling is the way the naming scheme was defined (and subsequent issues around modifying the default registry), docker wanted to try to lock people into using docker hub over allowing public mirrors to be set up easily. This failed, so they've needed to pivot somewhat to reduce their load.
These platforms do cache quite a bit. It's just that there is a very high volume of traffic and a lot of it does update pretty frequently (or has to check for updates)
Seconding, though it does require some setup at least for self-hosted. Gitlab also has a full container registry built in, so it's not difficult to pull the image you want, push it to gitlab, and reference that going forward.
Yeah I don't get why I have to setup caching myself for this kind of thing. Like wouldn't it be more efficient to do it lower down in their infra anyway?
I can tell you with 100% certainty that ECR definitely has limits, just not "screw you" ones like the blog post. So, while I do think switching to public.ecr.aws/docker/library is awesome, one should not make that switch and then think "no more 429s for me!" because they can still happen. Even AWS is not unlimited at anything.
This is using AWS ECR as a proxy to docker hub, correct?
Edit: Not exactly, it looks like ECR mirrors docker-library (a.k.a. images on Docker Hub not preceded by a namespace), not all of Docker Hub.
Edit 2: I think the example you give there is misleading, as Ubuntu has its own namespace in ECR. If you want to highlight that ECR mirrors docker-library, a more appropriate example might be `docker pull public.ecr.aws/docker/library/ubuntu`.
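A rough Python sketch of that mapping, in case it helps anyone rewriting image references in bulk. It only handles the case described above (official images with no namespace go under `public.ecr.aws/docker/library/`); the function name `to_ecr_mirror` is made up, and anything namespaced is deliberately left alone rather than guessed at.

```python
ECR_LIBRARY_PREFIX = "public.ecr.aws/docker/library/"


def to_ecr_mirror(image):
    """Map an official Docker Hub image reference to its ECR Public path.

    Returns None for anything that is not an un-namespaced official image,
    since the docker-library mirror does not cover those.
    """
    # Drop explicit Docker Hub prefixes if the reference carries them.
    for prefix in ("docker.io/", "index.docker.io/"):
        if image.startswith(prefix):
            image = image[len(prefix):]
    if image.startswith("library/"):
        image = image[len("library/"):]

    # A remaining "/" means a namespace or a different registry host.
    if "/" in image:
        return None
    return ECR_LIBRARY_PREFIX + image


print(to_ecr_mirror("ubuntu"))                  # public.ecr.aws/docker/library/ubuntu
print(to_ecr_mirror("docker.io/library/nginx:1.27"))
print(to_ecr_mirror("grafana/grafana"))         # None
```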
In case someone from Gitlab is watching: there is a long-standing issue that Gitlab Dependency Proxy does not work with containerd rewriting (https://gitlab.com/gitlab-org/gitlab/-/issues/350485), making it impossible to use as a generic Docker Hub mirror.
Yes, but in this case it's not the problem. It's more about not accepting `?ns=docker.io` as a query parameter on an endpoint, so a rather small and isolated technical issue.
The way I figured out this was going on is that the organization I work at MITMs non-allowlisted endpoints. Engineers' pulls started failing at random because Docker Hub was proxying to Cloudflare R2 under the hood. After a while of scratching our heads I thought to check Docker Hub, and there was the news. Builds fail like this because many prominent projects are now changing their storage location under the hood. I can say with reasonable confidence that a lot of prominent projects have moved already. Notably, the Python base image was moved to Cloudflare R2 sometime yesterday, which caused all manner of fun breakages before we implemented a fix.
We fixed the problem by using a pull-through registry.
The 10 pulls per IP per hour isn't my main concern. 40 pulls per hour for an authenticated user may be a little low, if you're trying out something new.
The unauthenticated limit doesn't bother me as much, though I was a little upset when I first saw it. Many businesses don't bother setting up their own registry, even though they should, nor do they care to pay for the service. I suspect that many don't even know that Docker can be used without Docker Hub. These are the freeloaders Docker will be targeting. I've never worked for a company that was serious about Docker/Kubernetes and didn't run its own registry.
One major issue for Docker is that they've always run a publicly available registry, which is the default and just works. So people have just assumed that this is how Docker works, and they've never bothered setting up accounts for developers or production systems.
I dunno, your reasoning could also be applied to dependency management registries. It is not even only about cost, it is a lot of infra to set up authentication with every single external registry with every single automation tool that might need to pull from said registry.
Like, I get it, but it adds considerable work and headaches to thousands (millions?) of people.
We run our own registry for our containers, but we don't for images from docker.io, quay.io, mcr.microsoft.com, etc. Why would we need to? It obviously seems now we do.
To avoid having an image you're actively using being removed from the registry. Arguably it doesn't happen often, but when you're running something in production you should be in control. Below a certain scale it might not make sense to run your own registry and you just run the risk, but if you can afford it, you should "vendor" everything.
Not Docker, but I worked on a project that used certain Python libraries whose author would yank the older versions every time they felt like rewriting everything. This happened multiple times. After the second time we just started running our own Python package registry. That way we were in control of upgrades.
Did this for years at my previous job to defend against the rate limits and against dependencies being deleted out from under us with no warning. (E.g. left-pad.)
Caching, vulnerability scanning, supply chain integrity, insurance against upstream removal. All these things are true for other artifact types as well.
When Docker went hard on subscriptions, my company pivoted to Rancher Desktop as the replacement.
I can't stress enough how much I dislike Rancher. I know we moved to it as a cost saving measure as I am assuming we would have to buy subs for Docker.
Yet there is nothing I found easier to use than Docker proper. Rancher has a Docker compatible mode and it falls down in various ways.
Now that this has happened, I wonder if Rancher is pulling by default from the Docker Hub registry, in which case we'll now need to set up our own registry for the images we use, keep them up to date, etc. That feels like it would be more costly than paying Docker to begin with.
Even if I have, it doesn't matter. Rancher is what we are supposed to use, and not using it means you're out of step with what's supported, which I don't find to be a good place to be.
Do you mean "Rancher" Rancher, or Rancher Desktop? Those are two different things. I have found the latter to be a Just Works™ app that's miles ahead of Podman Desktop. Now, that one is a mess.
Can you elaborate a little on what you don't like about Rancher?
Have been looking at moving my org over to it (30 engineers), seems quite nice so far - built-in k3s is great and it works well on my MacBook.
Its Docker drop-in support is lacking. There are instances where I expect it to do something and it errors or does something bizarre.
The interface is very basic. I had to get plugins for very basic functionality that has been built into Docker Desktop for years, like Logs Explorer.
It seemingly always prompts for Admin Access on the computer, even though Docker long ago stopped doing this and has worked without admin access for some time.
The prompt for enabling admin access is funny. If you don't have it already, it will prompt you to enable it; if you have it enabled, it will pop up another, very similar window whose wording says "Startup Rancher Desktop without administrator access", but it's easy to miss the wording difference because the font is small.
I've had stability issues, containers randomly crashing or the daemon going down out of nowhere. Happened more than once.
It claims to be a drop-in for the Docker CLI, but while I don't have the list handy at the moment, I know this isn't true, particularly with docker-compose.
I could go on, but it's still really rough around the edges.
The storage costs coming in from 1st March feel like they're going to catch a lot of organisations out too. Private repos will cost $10/month per 100GB of storage, something that was previously not charged for. We're in the middle of a clear out because we have several TB of images that we'd rather not pay for on top of the existing subscription costs.
The storage enforcement costs have been delayed until 2026 to give time for new (automated) tooling to be created and for users to have time to adjust.
The pull limits have also been delayed at least a month.
Do you have a source for that? My company was dropping dockerhub this week as we have no way of clearing up storage usage (untagging doesn't work) until this new tooling exists and can't afford the costs of all the untagged images we have made over the last few years.
(I work there)
If you have a support contact or AE they can tell you if you need an official source. Marketing communications should be sent out at some point.
Thanks. It just seems like quite poor handling of the comms around the storage changes, as there is only a week to go and the current public docs make it seem like the only way to not start paying is to delete the repos or, I guess, your whole org.
Yep, agree that comms have a lot of room for improvement. We do have initial delete capabilities of manifests available now, but functionality is fairly basic. It will improve over time, along with automated policies.
When will people pay their engineers to do actual engineering, instead of as a proxy for SaaS spending? Please, dear God, just hire one guy to run a mirror. Then, every time Docker et al turn the screws, we don't have to have these threads.
Some obvious mitigations: don't depend on docker hub for publishing, use mirrors for stuff that does depend on that, use one of the several docker for desktop alternatives, etc. No need to pay anyone. Chances are that you already use a mirror without realizing it if you are using any of the widely used cloud or CI platforms.
Can one of the big tech companies please use their petty cash account to acquire what remains of docker.com? Maybe OSS any key assets and donate Docker Hub, trademarks, etc. to some responsible place like the Linux Foundation, which would be a good fit. This stuff is too widely used to be held hostage by an otherwise unimportant company like Docker. And the drama around this is getting annoying.
MS, Google, AWS, anyone?
Alternatively, let's just stop treating docker.io as a default place where containers live. That's convenient for Docker Inc. but not really necessary otherwise. Docker Inc. is overly dependent on everybody just defaulting to fetching things without an explicit registry host from there. And with these changes, you wouldn't want any of your production environments to be dependent on that anyway, because those 429 errors could really ruin your day. So, any implied defaults should be treated like what they are: a user error.
If most OSS projects stop pushing their docker containers to docker hub and instead spin up independent registries, most of the value of docker hub evaporates. Mostly the whole point of putting containers there was hassle free usage for users. It seems that Docker is breaking that intentionally. It's not hassle free anymore. So, why bother with it at all? Plenty of alternative ways to publish docker containers.
I do feel like it's ... a good thing if we can move away from "stuff is stored on servers and downloaded over and over and over to other servers in an arbitrary region".
I know that places like Circle already do a lot to automatically set up local caches where they can, to avoid redownloading the same thing over and over from the outside world, and I hope that becomes more of the norm.
I suspect uni labs will have the most problems with this.
Teaching people to use Docker is not uncommon. The entire class pulling an image at (roughly) the same time is not uncommon either.
Yes, you can ask people to set up an account (provided you don't have policies against requiring students to sign up for unvetted US-based third-party services and provide personal data to them), but that complicates things.
If you're getting something for free... you should ask who is actually paying for it, and how. Facebook can give you lots of stuff for free... because they can show you ads and use that awesome data for various purposes.
Docker can't really market to machines doing most of the downloads autonomously, and probably can't monetize download data well either, so they want you to start paying them... or go use something else.
If I read these limits correctly, looks like lots of things are going to break on March 1st
So this is more or less a funnel towards getting people either to register and log in, or open their wallets and pay up a bit for increased usage.
That's understandable, but if the claim would be that this is primarily related to the costs of bandwidth, shouldn't the instructions to deploy an image caching solution (e.g. Sonatype Nexus or anything else) be at the forefront?
Like, if the same image gets pulled for some CI process that doesn't have a cache for whatever reason or gets redeployed often, having a self-hosted proxy between the user and Docker Hub would solve it really well with quite limited risks.
My workflow lately has been to establish an operations organization on my local intranet's Forgejo instance. I then pull the images I want once from the internet and shove them into that. From there, I make sure all my compose files and scripts reference my local registry on the Forgejo server.
These dates have been delayed. They will not take effect March 1. Pull limit changes are delayed at least a month, storage limit enforcement is delayed until next year.
I am mainly mentioning this with regard to Azure's and other providers' egress prices.
And in Europe, onprem stuff is expensive if you are peering to other countries.
The last time I had to care professionally about bandwidth pricing for CDN price optimization in the US, wholesale bandwidth pricing was following a pattern similar to Moore’s law, with either bandwidth doubling, or price halving every 18-21 months. This was partly why you could get what looked like good deals from CDN providers for multi year contracts. They knew their prices were just going to fall. Part of what drives this is that we keep finding ways to utilize fiber, so there’s a technical aspect, but a lot of it also comes down to adding more physical connections. There’s even network consolidation happening where 2 companies will do enough data sharing that they will get peering agreements and just add a cat6 patch between servers hosted in the same datacenter and short circuit the network.
It’s been almost a decade so it’s possible things have slowed considerably, or demand has outstripped supply, but given how much data Steam seems to be willing to throw at me, I know pricing is likely nowhere near what it was last I looked (it’s the only metered thing I regularly see and it’s downloading tens of GB daily for a couple of games in my collection).
Using egress pricing is also the wrong metric. You’d be better off looking at data costs between regions/datacenters to get a better idea of wholesale costs, since high egress costs are likely a form of vendor lock-in, while looking at cross-region costs avoids any “free” data transfer through patch cables skewing the numbers.
Not sure about bandwidth between countries; there are different economics there. I’d expect some self-similarity, but laying trunks might be so costly that finding ways to utilize fiber better is the only real way to increase supply.
Azure and the other mega clouds seem to enjoy massive profit margins on bandwidth… why would they willingly drop those prices when they can get away with high prices?
If bandwidth costs are important, there are plenty of options that will let you cut the cost by 10x (or more). Either with a caching layer like an external CDN (if that works for your application), or by moving to any of the mid-tier clouds (if bandwidth costs are an important factor, and caching won’t work for your application).
AWS, GCP, and Azure are the modern embodiment of the phrase “nobody ever got fired for buying IBM.”
Most companies don’t benefit from those big 3 mega clouds nearly as much as they think they do.
So, sure, send a note to your Azure rep complaining about the cost of bandwidth… nothing will change, of course, because companies aren’t willing to switch away from the mega clouds.
> and other providers
Other providers, like Hetzner, OVH, Scaleway, DigitalOcean, Vultr, etc., do not charge anywhere near the same for bandwidth as Azure. I think they are all about 8x to 10x cheaper.
A CDN will increase your bandwidth costs, not lower them.
Eg Fastly prices:
US/Europe $0.10/GB
India $0.28/GB
Not all bandwidth is equal. eg Hetzner will pay for fast traffic into Europe but don't pay the premium that others like AWS do to ensure it gets into Asia uncongested.
BunnyCDN charges significantly less for data that they serve, for example.
I didn’t say all CDNs are cheaper. Some CDNs see an opportunity to charge a premium, and they do!
Fastly sees themselves as far more than just a CDN. They call themselves an “edge cloud platform”, not a CDN.
> Not all bandwidth is equal. eg Hetzner will pay for fast traffic into Europe but don't pay the premium that others like AWS do to ensure it gets into Asia uncongested.
Sure… there are sometimes tradeoffs, but for bandwidth-intensive apps, you’re sometimes (often?) better off deploying regional instances that are closer to your customers, rather than paying a huge premium to have better connectivity at a distance. Or, for CDN-compatible content, you’re probably better off using an affordable CDN that will bring your content closer to your users.
If you absolutely need to use AWS’s backbone for customers in certain geographic regions, there’s nothing stopping you from proxying those users through AWS to your application hosted elsewhere, by choosing the AWS region closest to your application and putting a proxy there. You’ll be paying AWS bandwidth plus your other provider’s bandwidth, but you’ll still be saving tons of money to route the traffic that way if those geographic regions only represent a small percentage of your users… and if they represent a large percentage, then you can host something more directly in their region to make the experience even better.
For many types of applications, having higher latency / lower bandwidth connectivity isn’t even a problem if the data transfer is cheaper and saves money… the application just needs to do better caching on the client side, which is a beneficial thing to do even for clients that are well-connected to the server.
It depends, and I am not convinced there is a one-size-fits-all solution, even if you were to pay through the nose for one of the hyperscalers.
I have plenty of professional experience with AWS and GCP, but I also have professional experience with different degrees of bare metal deployment, and experience with mid-tier clouds. If costs don’t matter, then sure, do whatever.
Transit is cheap (and gets cheaper every year), cloud markups and profit margins are expensive. Like, you can still rack a server and pay peanuts for the networking, but that isn't covered in a Medium post, so nobody knows how to do it anymore.
I love how everyone is arguing about networking costs inside the tiny prison cell that is "the cloud". Because obviously the only way to push bits over the wire is through an AWS Internet Gateway, which was the very first packet-switched routing ever.
With storage now becoming something that is charged for, I hope we can make the case for trying to shrink images.
There is a huge difference between images that are carefully curated, with separate build layers and shipped layers, and the ones that dump in the codebase, install a whole compiler toolchain needed to build the application / wheels / (whatever it's called in Node.js), package it, and then ship off the image.
I was forced into using some kind of Docker thing at my last job, where I looked into the license and it starts out:
"Docker Desktop is free for small businesses (fewer than 250 employees AND less than $10 million in annual revenue), personal use, education, and non-commercial open source projects."
I think that's reasonable, but it's hard for me to believe everyone's paying when they should be. I set up podman instead and I haven't had any major issues.
There are many decent alternatives for macOS too. I had good luck with minikube a few years ago. This article seems decent, based on my previous research:
Yeah I'm on a Mac. Uh, you know I really had a memory of homebrew getting things out of the .app or something, but I really can't find any evidence that was ever the case. I blame sleep deprivation, this is like the 13th blunder I've made this week haha.
Docker Desktop is the only thing that has that license. Every time someone mentions docker I have to be annoying and make sure they didn't mean they installed Docker Desktop.
Docker just needs to be open source software, there's no real revenue model that makes sense, but damn they're trying. Now I guess dockerhub is also just off the table.
I've been using it as a cache for a while locally and it's a solid choice.
---
I guess an edit - it does also have basic TTL, which might cover your GC case, but it's not very configurable or customizable. It's literally just a TTL flag on the proxied image.
I use Harbor at work at $LARGE_COMPANY and it works well. I don't run and maintain it however, I'm just a consumer of it from another team that manages it.
They already set up a URL in harbor that mirrors docker.io containers.
harbor should have enough features and is popular/rising, otherwise Artifactory will do everything you imagine but is quite heavy both on resources and configuration.
Could you spin up something like a Steam cache but for Docker? So when someone in your network pulls an image, it gets cached and served to subsequent pullers of the same image.
It is not immediately clear to me if the limit is per repo/package or globally in the hub. For instance, I fear it will not be possible to add a new kubernetes node to my cluster without hitting the limit as it would need to pull all the individual images.
It's global across the hub. The announcement says they are rolling out the block in March, but this has already been blowing up my k8s clusters since last weekend. I have tried looking for an option to tell k8s to always use authentication tokens for image pulls, but can't find a way, so I'm going to have to add Kyverno to mutate all incoming Deployments to add imagePullSecrets.
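For anyone wondering what that mutation amounts to, here is a small Python sketch of the change applied to a Deployment manifest held as a plain dict. It is not a Kyverno policy, just the shape of the edit; the secret name `dockerhub-creds` and the function name are placeholders I made up.

```python
import copy


def add_image_pull_secret(deployment, secret_name):
    """Return a copy of a Deployment manifest with secret_name in imagePullSecrets."""
    patched = copy.deepcopy(deployment)
    # imagePullSecrets lives on the pod spec, i.e. spec.template.spec for a Deployment.
    pod_spec = patched["spec"]["template"]["spec"]
    secrets = pod_spec.setdefault("imagePullSecrets", [])
    # Stay idempotent: do not add the same secret twice.
    if not any(entry.get("name") == secret_name for entry in secrets):
        secrets.append({"name": secret_name})
    return patched


deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "web"},
    "spec": {"template": {"spec": {"containers": [{"name": "web", "image": "nginx:1.27"}]}}},
}

patched = add_image_pull_secret(deployment, "dockerhub-creds")  # placeholder secret name
print(patched["spec"]["template"]["spec"]["imagePullSecrets"])
```

The secret itself still has to exist in every namespace that pulls, which is a separate chore this sketch does not cover.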
Years ago, when Docker and Docker Hub were starting out, I didn't really understand why the thing wasn't built to allow super easy self-hosting/proxying. It seemed crazy to me that one company was hosting full OS images that needed to be downloaded over and over again, especially for CI environments. Linux distro ISO images always had mirrors, even torrent distribution. Whenever I tried looking at caching Docker Hub images, I gave up and just accepted the wasted bandwidth.
Setting up your own local mirror is pretty easy. You set your Docker to pull from your local network registry, and you configure the registry to pull from the public repo if the requested image isn't found.
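The behaviour described here is a plain pull-through cache. A toy Python model of it, purely to show why it helps with the rate limit: `PullThroughCache` and `fake_hub` are invented names, the upstream is a stand-in function rather than a real registry client, and a real mirror also has to revalidate tags, which this ignores.

```python
class PullThroughCache:
    """Serve images from local storage, fetching from upstream only on a miss."""

    def __init__(self, fetch_upstream):
        self._fetch_upstream = fetch_upstream
        self._store = {}
        self.upstream_pulls = 0  # what would count against the upstream rate limit

    def pull(self, image):
        if image not in self._store:
            self._store[image] = self._fetch_upstream(image)
            self.upstream_pulls += 1
        return self._store[image]


def fake_hub(image):
    # Stand-in for the real network fetch from the public registry.
    return f"<contents of {image}>".encode()


cache = PullThroughCache(fake_hub)
for _ in range(30):          # e.g. 30 machines behind one IP pulling the same image
    cache.pull("python:3.12")
cache.pull("nginx:1.27")

print(cache.upstream_pulls)  # 2
```

Thirty-one client pulls turn into two upstream pulls, which is the whole point.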
10 pulls an hour is wild. There's no way we can wait hours and hours for work clusters to rebuild. Even just daily updates to containers will be over 10.
This forces pretty much everyone to move to a Pro subscription or to put a cache in front of docker.io.
With podman and kube (crio and containerd) you can create a mirror config so that pulls happen from a mirror transparently. Some mirrors also support proxy cache behaviour, so in theory you don't have to preload images (though that might be necessary with the new limits).
Docker Inc. pushed all this work on individuals by being shitty and not supporting adding the ability to add to / change the default registry search. Redhat has been patching Docker engine to let their users do it. It would be trivial if it could be an engine-wide setting ["mydockercache.me", "docker.io"] that would be transparent to everyone's Dockerfile.
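A sketch of what such a search list would do, in Python. This is not Docker's code: the function name `candidate_references` is invented, and the "does the first component look like a host" check (contains a dot or a port, or is `localhost`) is a heuristic I'm assuming for illustration.

```python
def candidate_references(image, search_registries):
    """Expand an image name into the fully qualified references to try, in order."""
    first = image.split("/", 1)[0]
    # Assumed heuristic: a leading component with a dot, a port, or "localhost"
    # is an explicit registry host, so the name is already fully qualified.
    has_host = "/" in image and ("." in first or ":" in first or first == "localhost")
    if has_host:
        return [image]
    return [f"{registry}/{image}" for registry in search_registries]


search = ["mydockercache.me", "docker.io"]
print(candidate_references("ubuntu:22.04", search))
# ['mydockercache.me/ubuntu:22.04', 'docker.io/ubuntu:22.04']
print(candidate_references("ghcr.io/owner/app", search))
# ['ghcr.io/owner/app']
```

The client would try each candidate in order and stop at the first one that answers, so Dockerfiles with bare `FROM ubuntu` lines would not need to change.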
So sometimes when I do docker run I see a bunch of progress bars pulling images and layers. Are all of them counted individually or it all just counts as one?
The inevitable consequence of misconfigured Watchtower disease, I suppose. I pay for Docker because I like all of their products, and their private registry + scout is good, so I can go on misconfiguring all of the things!
There _sort-of_ is. You can pull from registries other than Docker Hub (these can be run by anyone with the will and resources to do so -- GHCR is a popular one), though these may have their own usage restrictions.
You can run your own following Docker's own guide here[0] if you'd like. It's not peer-to-peer in the sense that the lines between clients and servers are blurred, as with torrenting, but it allows for a distributed registry architecture, which I think is the part that matters here.
I don't think it would be possible out of the box: Docker pulls assume a direct HTTP download for the image. It would be pretty cool to build a proxy that acts as a torrent client for images however it would be a lot less ergonomic to use on top of the security risk of tags not being checksums.
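On the "tags are not checksums" point: content fetched from an untrusted peer is only safe if you pin a digest and verify it yourself. A minimal Python sketch of that check, assuming `sha256:<hex>` style digests; the function name and the sample bytes are made up.

```python
import hashlib


def matches_digest(blob, digest):
    """Check raw bytes against a 'sha256:<hex>' digest string."""
    algorithm, _, expected = digest.partition(":")
    if algorithm != "sha256":
        raise ValueError(f"unsupported digest algorithm: {algorithm}")
    return hashlib.sha256(blob).hexdigest() == expected


layer = b"example layer bytes"
pinned = "sha256:" + hashlib.sha256(layer).hexdigest()

print(matches_digest(layer, pinned))                # True
print(matches_digest(layer + b"tampered", pinned))  # False
```

A tag like `latest` gives you nothing to compare against, which is why a torrent-style proxy would have to resolve tags to digests through a trusted source first.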
Is it just me, or is this really low, especially since one Docker Compose can have multiple pulls? This also sounds like it would be impossible to use Docker behind a NAT or a VPN, when unauthenticated.
10 pulls per hour per IP will even impact my homelab deployment if I haven't updated for a few weeks and then bump the version of all the software I run at once.
If your homelab is proper, you likely own a /56 range, also known as 256x /64, which is what they're limiting. I've always known my prefix would come in handy! Now, I only need to work out how to make it work without having to define all 256 network interfaces.
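The 256 figure is easy to check with Python's standard `ipaddress` module. The prefix below is from the 2001:db8::/32 documentation range, not anyone's real allocation.

```python
import ipaddress

# Example /56 from the IPv6 documentation range (an assumption, not a real allocation).
prefix = ipaddress.ip_network("2001:db8::/56")

# Split the /56 into its /64 subnets.
subnets = list(prefix.subnets(new_prefix=64))

print(len(subnets))   # 256
print(subnets[0])     # 2001:db8::/64
print(subnets[-1])    # 2001:db8:0:ff::/64
```

That is 2^(64-56) = 256 separate /64s, each of which would be counted as its own client if the limit really is applied per /64.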
It's hard to call it "abuse" when Docker has been allowing -- and IMO tacitly encouraging -- this usage pattern for most/all of their existence.
I get that bandwidth is expensive, but this feels a bit like the usual "make it free to get lots of users, and then start charging when everyone is locked in" plan.
If they really just want to reduce their own costs, they should be evangelizing the use of a caching proxy, and providing a super easy way for people to set one up, both on the server and client side. (Maybe they already do this; I haven't looked.)
Sure, they were encouraging usage of Docker Hub, but it's been at least a couple of years since they started pushing the other way, when they introduced the first rate limits.
If everybody made fair use of Docker Hub, maybe we wouldn't have the rate limits in the first place? But I think we've all learned that won't happen on the open Internet.
See my comment above for the numbers (https://news.ycombinator.com/item?id=43127004), but the free limits haven't changed in magnitude, rather they've reduced how bursty the requests can be (which is somewhat interesting, in that for the free levels you'd expect the usage to be more bursty, and the more paid levels to be more consistent given more workers and more tooling happening at all hours).
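Back-of-the-envelope version in Python, for anyone who doesn't want to click through. The new figures (10 and 40 per hour) and the old totals (100 and 200) are the ones quoted in this thread; the 6-hour window for the old limits is my assumption and is not stated here.

```python
def per_hour(pulls, window_hours):
    """Sustained pull rate implied by a limit of `pulls` per `window_hours`."""
    return pulls / window_hours


# (pulls, window in hours). The 6-hour old window is an assumption.
old_limits = {"unauthenticated": (100, 6), "authenticated": (200, 6)}
new_limits = {"unauthenticated": (10, 1), "authenticated": (40, 1)}

for tier in old_limits:
    old_pulls, old_window = old_limits[tier]
    new_pulls, new_window = new_limits[tier]
    print(
        f"{tier}: sustained {per_hour(old_pulls, old_window):.1f}/h -> "
        f"{per_hour(new_pulls, new_window):.1f}/h, "
        f"largest burst {old_pulls} -> {new_pulls}"
    )
```

Under that assumption the sustained rate moves from about 16.7 to 10 per hour unauthenticated and from about 33.3 to 40 per hour authenticated, while the largest possible burst shrinks by 10x and 5x respectively. So the same order of magnitude over a day, but far less headroom for a cluster rebuild.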
This is obviously the first time a big Silicon Valley company took back the free lunch and slapped a price tag on it. How could we have ever learned our lesson before this?
Many ISPs provide a /56 or at least a /64 these days, but at any rate you can always get some from cloud providers and use WireGuard to tunnel the rest... There really isn't much excuse for not supporting IPv6 at homelab scale.
With my latest project, we decided to set up a private registry for development, and we are considering setting up a main and a backup registry for prod as well. We are a smaller operation, so I'm not sure how well it would scale to larger business needs.
One thing I haven't seen mentioned at all here is the impact of this change on self-hosting. Updating apps or checking for updates becomes a real challenge with such a small rate limit. Suddenly everybody will have to switch to some mirror/proxy (or self-host one). For people running k8s clusters... good luck.
I understand Docker is paying for the bandwidth, but it's relatively cheap for them at the scale they operate. ghcr.io doesn't impose any rate limit at all (although it isn't really GitHub's main product), which I'd say proves that it's sustainable. In any case, 100 to 10 and 200 to 40 are both huge decreases and are unjustifiable for me.
1. Don't use Docker; you don't need it. People were hosting things successfully long before Docker.
2. Host a mirror.
3. Rack a server and stop paying insane egress costs and padding the profit margins of grifters.
Copyrighted content is not illegal. Anyway, in this case, the initial download of the otherwise redistributable content would be piracy. From the docker TOS:
> 2.4 You may not access or use the Service for the purpose of bringing an intellectual property infringement claim against Docker or for the purpose of creating a product or service competitive with the Service.
Which is a great reason to default to / publish on other registries.
Amazing that HN doesn't realize that transit networking is actually not expensive at all. You can saturate a gigabit connection at a colo for pennies, but nobody blogged about that, so nobody knows it's an option.
> When utilizing the Docker Platform, users should be aware that excessive data transfer, pull rates, or data storage can lead to throttling, or additional charges. To ensure fair resource usage and maintain service quality, we reserve the right to impose restrictions or apply additional charges to accounts exhibiting excessive data and storage consumption.
Well, that's ominous. No mention of what they consider excessive or how much they might charge. They're essentially saying they can send you whatever bill they want.
Back when Docker got popular, maybe 10 years ago, I was behind a slow ADSL connection (less than 2 Mbps) and I couldn't stand anything up with it to save my life. Downloads from Docker Hub just wouldn't complete.
I figured some kind of smart download manager and caching system would save the day but frankly I saw Docker as a step backward because I had been doing a really good job of installing 100+ web services on a single server since 2003 or so. [1] [2]
Looking back at it, I'm sure that a short timeout was a deliberate decision by the people running Docker Hub, as people with slow internet connections, whose telcos choose not to serve us with something better, are unpeople.
[1] Nothing screams "enterprise feature, call sales for pricing" like being able to run your own local hub
[2] My experience with docker is roughly: if you can write a bash script to build your environment, you can write a Dockerfile; the Dockerfile is the gateway to a system that will download 5GB of images when you really want to install 50MB of files, so what's the point? Sure, Docker accelerates your ability to have 7 versions of the JVM and 35 different versions of Python, but is that something to be proud of, really?
> My experience with docker is roughly: if you can write a bash script to build your environment, you can write a Dockerfile
I agree.
> Sure, Docker accelerates your ability to have 7 versions of the JVM and 35 different versions of Python, but is that something to be proud of, really?
No, but it's not my fault that the python packaging ecosystem is broken and requires isolation, and that every Java project relies on a brittle toolchain. At least docker means that nonsense is isolated and doesn't affect the stuff I write.
Okay, so I guess I'm running my own docker repo and builds now. So long and thanks for all the fish.
edit: Oh, per hour. I thought that was per MONTH. Okay, I can survive with this, but it still puts me on notice. Need to leave dockerhub sooner rather than later.
This is just another thing in a laundry list of things from Docker that feel developer-hostile. Does it make sense? Sure, it might, given the old architecture of Docker Hub.
I'm biased (i.e., co-founder of Depot [0]) and don't have the business context around internal Docker things. So this is just my view of the world as we see it today. There are solutions to the egress problem that negate needing to push that down to your users. So, this feels like an attempt to get even more people onto their Docker Desktop business model and not explicitly related to egress costs.
This is why when we release our registry offering, we won't have this kind of rate limiting. There are also solutions to avoiding the rate limits in CI. For example, our GitHub Actions runners come online with a public unique IP address for every job you run, avoiding the need to log in to Docker at all.
> There are solutions to the egress problem that negates needing to push that down to your users.
Please do elaborate on what those are!
There are always lots of comments like this providing extremely vague prescriptions for other people's business needs. I'd love to hear details if you have them, otherwise you're just saying "other companies have found ways to get someone else besides their customers to pay for egress costs" without any context for why those people are willing to pay the costs in those contexts.
Well now it's time to do everything I can to never use dockerhub. This is going to be annoying.
If you don't want to host an OSS repository, just decide to not do that. And this is the first I've heard of it so now it's an emergency to work around this rug pull.
Now for every image I'm going to have to try to find a trustable alternative source (for things like postgres, redis, nginx) or copy and rehost everything.
Then don't use those projects, or build them yourself, or use non-Docker options. I'm aghast that people think they need an expensive, bloated container runtime to run software. It's never been necessary.
Seems reasonable on its face to me given what Docker Hub offers. Unless you’re orchestrating your entire homelab with containers from Docker Hub and doing fresh pulls every time, you’re highly unlikely to hit that limit - and if you do, the personal plan quadruples your allowance for a $0 annual cost.
The only folks likely to feel pain from this change were those either deliberately abusing Docker’s prior generosity or using bad development and deployment practices to begin with. I suspect that for 99% of us regular users, we won’t see or feel a thing.
For residential usage, unless you're in an apartment tower where all your neighbors are software engineers and you're all behind a CGNAT, you can still do a pull here and there for learning and other hobbyist purposes, which for Docker is a marketing expense to encourage uptake in commercial settings.
If you're in an office, you have an employer, and you're using the registry for commercial purposes, you should be paying to help keep your dependencies running. If you don't expect your power plant to give you electricity for free, why would you expect a commercial company to give you containers for free?
My startup pays Docker for their registry hosting services, for our private registry. However, some of our production machines are not set up to authenticate towards our account, because they are only running public containers.
Because of this change, we now need to either make sure that every machine is authenticated, or take the risk of a production outage in case we do too many pulls at once.
If we had instead simply mirrored everything into a registry at a big cloud provider, we would never have paid docker a cent for the privilege of having unplanned work foisted upon us.
However, if you are using docker's registry without authentication and you don't want to go through the effort of adding the credentials you already have, you are essentially relying on a free service for production already, which may be pulled any time without prior notice. You are already taking the risk of a production outage. Now it's just formalized that your limit is 10 pulls per IP per hour. I don't really get how this can shift your evaluation from using (and paying for) docker's registry to paying for your own registry. It seems orthogonal to the evaluation itself.
This is by design, according to docker.
I’ve never encountered anyone at any of my employers that wanted to use docker hub for anything other than a one-time download of a base image like Ubuntu or Alpine.
I’ve also never seen a CD deployment that doesn’t repeatedly accidentally pull in a docker hub dependency, and then occasionally have outages because of it.
It’s also a massive security hole.
Fork it.
I have a vague memory of reading something to that effect on their bug tracker, but I always thought the reasoning was ok. IIRC it was something to the effect that the goal was to keep things simple for first time users. I think that's a disservice to users, because you end up with many refusing to learn how things actually work, but I get the sentiment.
> I’ve also never seen a CD deployment that doesn’t repeatedly accidentally pull in a docker hub dependency, and then occasionally have outages because of it.
There's a point where developers need to take responsibility for some of those issues. The core systems don't prevent anyone from setting up durable build pipelines. Structure the build like this [1]. Set up a local container registry for any images that are required by the build and pull/push those images into a hosted repo. Use a pull through cache so you aren't pulling the same image over the internet 1000 times.
Basically, gate all registry access through something like Nexus. Don't set up the pull through cache as a mirror on local clients. Use a dedicated host name. I use 'xxcr.io' for my local Nexus and set up subdomains for different pull-through upstreams; 'hub.xxcr.io/ubuntu', 'ghcr.xxcr.io/group/project', etc..
Beyond having control over all the build infrastructure, it's also something that would have been considered good netiquette, at least 15-20 years ago. I'm always surprised to see people shocked that free services disappear when the status quo seems to be to ignore efficiency as long as the cost of inefficiency is externalized to a free service somewhere.
1. https://phauer.com/2019/no-fat-jar-in-docker-image/
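A rough sketch of the dedicated-host-name idea from the comment above, for anyone who wants to rewrite image references in build configs. The `xxcr.io` host names come from that comment; the mapping table, the function names, and the choice to fail on unknown registries are assumptions made for illustration. The shorthand rule (a first path component containing `.` or `:`, or equal to `localhost`, is treated as a registry, otherwise Docker Hub is implied) only roughly mirrors how the Docker client resolves names.

```python
# Hypothetical mapping from upstream registry to a dedicated pull-through
# host name. The xxcr.io names follow the comment above; the rest is assumed.
CACHE_HOSTS = {
    "docker.io": "hub.xxcr.io",
    "ghcr.io": "ghcr.xxcr.io",
}


def split_registry(ref: str) -> tuple[str, str]:
    """Split an image reference into (registry, remainder).

    Roughly mirrors Docker's shorthand rule: the first path component is a
    registry only if it contains '.' or ':' or is 'localhost'; otherwise
    the reference implicitly points at Docker Hub (docker.io).
    """
    first, sep, rest = ref.partition("/")
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first, rest
    return "docker.io", ref


def via_cache(ref: str) -> str:
    """Rewrite a reference so it goes through the matching cache host."""
    registry, rest = split_registry(ref)
    host = CACHE_HOSTS.get(registry)
    if host is None:
        # Failing loudly makes accidental new upstreams visible in CI.
        raise ValueError(f"no pull-through cache configured for {registry}")
    return f"{host}/{rest}"


print(via_cache("ubuntu"))                 # shorthand, implies Docker Hub
print(via_cache("ghcr.io/group/project"))  # explicit upstream registry
```

Running the same `split_registry` check over every image reference in a repo is also a cheap way to spot anything that still silently resolves to Docker Hub.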
Same. The “I don’t pay for it, why do I care” attitude is abundant, and it drives me nuts. Don’t bite the hand that feeds you, and make sure, regularly, that you’re not doing that by mistake. Else, you might find the hand biting you back.
This is really not complicated and you're not entitled to unlimited anonymous usage of any service.
You would put this as a separate registry and storage from your actual self-hosted registry of explicitly pushed example.com/ images.
It's an extremely common use-case and well-documented if you try to RTFM instead of just throwing your hands in the air before speculating and posting about how hard or impossible this supposedly is.
You could fall back to DNS rewrite and front with your own trusted CA but I don't think that particular approach is generally advisable given how straightforward a pull-through cache is to set up and operate.
All the large objects in the OCI world are identified by their cryptographic hash. When you’re pulling things when building a Dockerfile or preparing to run a container, you are doing one of two things:
a) resolving a name (like ubuntu:latest or whatever)
b) downloading an object, possibly a quite large object, by hash
Part b may recurse in the sense that an object can reference other objects by hash.
In a sensible universe, we would describe the things we want to pull by name, pin hashes via a lock file, and download the objects. And the only part that requires any sort of authentication of the server is the resolution of a name that is not in the lockfile to the corresponding hash.
Of course, the tooling doesn’t work like this, there usually aren’t lockfiles, and there is no effort made AFAICT to allow pulling an object with a known hash without dealing with the almost entirely pointless authentication of the source server.
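To make the name-versus-hash split concrete, here is a minimal sketch of what a lock-file-driven fetch could look like. It is not how any existing tool works: the `lockfile` dict, the `fetch_pinned` helper, and the placeholder bytes are all made up for the example. The only real convention used is the `sha256:<hex>` digest format.

```python
import hashlib

# Hypothetical lock file: image name -> pinned digest. The bytes below stand
# in for a real manifest, just so the example is self-contained.
blob = b"pretend this is an image manifest"
lockfile = {"ubuntu:latest": "sha256:" + hashlib.sha256(blob).hexdigest()}


def verify(digest: str, data: bytes) -> bool:
    """Return True if data hashes to the given 'sha256:<hex>' digest."""
    algo, _, expected = digest.partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported digest algorithm: {algo}")
    return hashlib.sha256(data).hexdigest() == expected


def fetch_pinned(name: str, untrusted_fetch) -> bytes:
    """Fetch a pinned object from any source and check it against the lock.

    untrusted_fetch is any callable taking a digest and returning bytes;
    the source does not need to be trusted because the hash is checked here.
    """
    digest = lockfile[name]
    data = untrusted_fetch(digest)
    if not verify(digest, data):
        raise ValueError(f"digest mismatch for {name}")
    return data


# Any mirror, cache, or peer can serve the bytes once the digest is pinned.
data = fetch_pinned("ubuntu:latest", lambda digest: blob)
```

The point of the sketch is that only the first resolution of a name to a digest needs a trusted server; everything after that can be checked locally.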
If you rewrite DNS, you should of course also have a custom CA trusted by your container engine as well as appropriate certificates and host configurations for your registry.
You'll always need to take these steps if you want to go the rewrite-DNS path for isolation from external services because some proprietary tool forces you to use those services.
Artifactory and Nexus are the two I've used for work. Harbor is also popular.
I can't think of the name right now, but there are some cool projects doing a p2p/distributed type of cache on the nodes directly too.
Announcing a new limitation that requires rolling out changes to prod with 1 week notice should absolutely shift your evaluation of whether you should pay for this company's services.
https://www.docker.com/blog/november-2024-updated-plans-anno...
At Docker, our mission is to empower development teams by providing the tools they need to ship secure, high-quality apps — FAST. Over the past few years, we’ve continually added value for our customers, responding to the evolving needs of individual developers and organizations alike. Today, we’re excited to announce significant updates to our Docker subscription plans that will deliver even more value, flexibility, and power to your development workflows.
We’ve listened closely to our community, and the message is clear: Developers want tools that meet their current needs and evolve with new capabilities to meet their future needs.
That’s why we’ve revamped our plans to include access to ALL the tools our most successful customers are leveraging — Docker Desktop, Docker Hub, Docker Build Cloud, Docker Scout, and Testcontainers Cloud. Our new unified suite makes it easier for development teams to access everything they need under one subscription with included consumption for each new product and the ability to add more as they need it. This gives every paid user full access, including consumption-based options, allowing developers to scale resources as their needs evolve. Whether customers are individual developers, members of small teams, or work in large enterprises, the refreshed Docker Personal, Docker Pro, Docker Team, and Docker Business plans ensure developers have the right tools at their fingertips.
These changes increase access to Docker Hub across the board, bring more value into Docker Desktop, and grant access to the additional value and new capabilities we’ve delivered to development teams over the past few years. From Docker Scout’s advanced security and software supply chain insights to Docker Build Cloud’s productivity-generating cloud build capabilities, Docker provides developers with the tools to build, deploy, and verify applications faster and more efficiently.
Sorry, where in this hyped up marketingspeak walloftext does it say "WARNING we are rugging your pulls per IPv4"?
Right at the top of the page it says:
> consumption limits are coming March 1st, 2025.
Then further in the article it says:
> We’re introducing image pull and storage limits for Docker Hub.
Then at the bottom in the summary it says again:
> The Docker Hub plan limits will take effect on March 1, 2025
I think like everyone else is saying here, if you rely on a service for your production environments it is your responsibility to stay up to date on upcoming changes and plan for them appropriately.
If I were using a critical service, paid or otherwise, that said "limits are coming on this date" and it wasn't clear to me what those limits were, I certainly would not sit around waiting to find out. I would proactively investigate and plan for it.
I mean just starting with the title:
> Announcing Upgraded Docker Plans: Simpler, More Value, Better Development and Productivity
Wow great it's simpler, more value, better development and productivity!
Then somewhere in the middle of the 1500-word (!) PR fluff there is a paragraph with bullet points:
> With the rollout of our unified suites, we’re also updating our pricing to reflect the additional value. Here’s what’s changing at a high level:
> • Docker Business pricing stays the same but gains the additional value and features announced today.
> • Docker Personal remains — and will always remain — free. This plan will continue to be improved upon as we work to grant access to a container-first approach to software development for all developers.
> • Docker Pro will increase from $5/month to $9/month and Docker Team prices will increase from $9/user/month to $15/user/mo (annual discounts). Docker Business pricing remains the same.
And at that point if you're still reading this bullet point is coming:
> We’re introducing image pull and storage limits for Docker Hub. This will impact less than 3% of accounts, the highest commercial consumers.
Ah cool I guess we'll need to be careful how much storage we use for images pushed to our private registry on Docker Hub and how much we pull them.
Well it's an utter and complete lie because even non-commercial users are affected.
————
This super long article (1500 words) intentionally buries the lede because they are afraid of a backlash. But you can't reasonably say “I told u so” when you only mentioned in a bullet point somewhere in a PR article that there will be limits that impact the top 3% of commercial users, then 4 months later give a one week notice that image pulls will be capped to 10 pulls per hour LOL.
The least they could do is to introduce random pull failures with an increasing probability rate over time until it finally entirely fails. That's what everyone does with deprecated APIs. Some people are in for a big surprise when a production incident causes all their images to be pulled again, which will cascade into an even bigger failure.
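For reference, the "increasing probability of failure" approach is simple to sketch. This is only an illustration of the general brownout idea, not anything Docker has said it does or would do; the linear ramp, the function names, and the dates in the usage lines are assumptions.

```python
import random
from datetime import datetime


def failure_probability(now: datetime, start: datetime, cutoff: datetime) -> float:
    """Linear ramp: 0.0 at or before start, 1.0 at or after cutoff."""
    if now <= start:
        return 0.0
    if now >= cutoff:
        return 1.0
    # Dividing two timedeltas yields a plain float between 0 and 1.
    return (now - start) / (cutoff - start)


def should_reject(now: datetime, start: datetime, cutoff: datetime,
                  rng=random.random) -> bool:
    """Decide whether this request is one of the deliberately failed ones."""
    return rng() < failure_probability(now, start, cutoff)


# Hypothetical ramp window, chosen only for the example.
start = datetime(2025, 1, 1)
cutoff = datetime(2025, 1, 11)
print(failure_probability(datetime(2025, 1, 6), start, cutoff))
```

Halfway through the window roughly half of the requests fail, which tends to get noticed in CI long before the hard cutoff.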
If the PR stuff isn't for you, fine, ignore that. Take notes on the parts that do matter to you, and then validate those in whatever way you need to in order to assure the continuity of your business based on how you rely on Docker Hub.
Simply the phrase "consumption limits" should be a pretty clear indicator that you need to dig into that and find out more, if you rely on Docker in production.
I don't get everyone's refusal here to be responsible for their own shit, like Docker owes you some bespoke explanation or solution, when you are using their free tier.
How you chose to interpret the facts they shared, and what assumptions you made, and if you just sat around waiting for these additional details to come out, is on you.
They also link to an FAQ (to be fair we don't know when that was published or updated) with more of a Q&A format and the same information.
The snippets about rate limiting give the impression that they're going to be at rates that don't affect most normal use. Lots of docker images have 15 layers; doesn't this mean you can't even pull one of these? In effect, there's not really an unauthenticated service at all anymore.
> “But the plans were on display…”
> “On display? I eventually had to go down to the cellar to find them.”
> “That’s the display department.”
> “With a flashlight.”
> “Ah, well, the lights had probably gone.”
> “So had the stairs.”
> “But look, you found the notice, didn’t you?”
> “Yes,” said Arthur, “yes I did. It was on display in the bottom of a locked filing cabinet stuck in a disused lavatory with a sign on the door saying ‘Beware of the Leopard.”
I am saying that when change is coming, particularly ambiguous or unclear change like many people feel this is, it's no one's responsibility but yours to make sure your production systems are not negatively affected by the change.
That can mean everything from confirming data with the platform vendor, to changing platforms if you can't get the assurances you need.
Y'all seem to be fixated on complaining about Docker's motives and behaviour, but none of that fixes a production system that's built on the assumption that these changes aren't happening.
Somebody's going to have the same excuse when Google graveyards GCP. Till this change, was it obvious to anyone that you had to audit every PR fluff piece for major changes to the way Docker does business?
You seem(?) to be assuming this PR piece, that first announced the change back in Sept 2024, is the only communication they put out until this latest one?
That's not an assumption I would make, but to each their own.
This isn't exactly the same lesson, but I swore off Docker and friends ages ago, and I'm a bit allergic to all not-in-house dependencies for reasons like this. They always cost more than you think, so I like to think carefully before adopting them.
“Oh yes, well as soon as I heard I went straight round to see them, yesterday afternoon. You hadn’t exactly gone out of your way to call attention to them, had you? I mean, like actually telling anybody or anything.”
“But the plans were on display …”
“On display? I eventually had to go down to the cellar to find them.”
“That’s the display department.”
“With a flashlight.”
“Ah, well the lights had probably gone.”
“So had the stairs.”
“But look, you found the notice didn’t you?”
“Yes,” said Arthur, “yes I did. It was on display in the bottom of a locked filing cabinet stuck in a disused lavatory with a sign on the door saying ‘Beware of the Leopard’.”
No kidding. Clashes with the “gotta hustle always” culture, I guess.
Or it means that they can’t hide their four full-time jobs from each of the four employers as easily while they fix this at all four places at the same time.
The “I am owed free services” mentality needs to be shot in the face at close range.
https://web.archive.org/web/20241213195423/https://docs.dock...
Here's the January 21st 2025 copy that includes the 10/HR limit.
https://web.archive.org/web/20250122190034/https://docs.dock...
The Pricing FAQ goes back further to December 12th 2024 and includes the 10/HR limit.
https://web.archive.org/web/20241212102929/https://www.docke...
I haven't gone through my emails, but I assume there was email communication somewhere along the way. It's safe to assume there's been a good 2-3 months of communication, though it may not have been as granular or targeted as some would have liked.
Announced in September 2024: https://www.docker.com/blog/november-2024-updated-plans-anno...
At least 6 months of notice.
If your production service is relying on a free-tier someone else provides, you must have some business continuity built in. These are not philanthropic organisations.
I have a blog, do I have to give my readers notice before I turn off the service because I can't afford the next hosting charge?
Isn't this almost exclusively going to affect engineers? Isn't it more of the engineer's responsibility not to allow their mission critical software to have such a fragile single point of failure?
> Probably because they don't want people to have enough notice to force their hand.
He says without evidence, assuming bad faith.
Not an acceptable interaction. This will be the end of Docker Hub if they don't walk back.
Docker doesn't know how to monetize.
And the exact time you have some production emergency is probably the exact time you have a lot of containers being pulled as every node rolls forward/back rapidly...
And then docker.io rate limits you and suddenly your 10 minute outage becomes a 1 hour outage whilst someone plays a wild goose chase trying to track down every docker hub reference and point it at some local mirror/cache.
And yes, you’re still using the free tier even if you pay them, if your usage doesn’t have any connection to your paid account.
Indeed, you’d be paying the big cloud provider instead, most likely more than you pay today. Go figure.
https://gallery.ecr.aws/docker/?page=1
It's busy-work that provides no business benefit, but-for our supplier's problems.
> specific outbound IP addresses that they can then whitelist
And then we have an on-going burden of making sure the list is kept up to date. Too risky, IMO.
I dunno, if I were paying for a particular quality-of-service I'd want my requests authenticated so I can make claims if that QoS is breached. Relying on public pulls negates that.
Making sure you can hold your suppliers to contract terms is basic due diligence.
> We’re introducing image pull and storage limits for Docker Hub. This will impact less than 3% of accounts, the highest commercial consumers. For many of our Docker Team and Docker Business customers with Service Accounts, the new higher image pull limits will eliminate previously incurred fees.
It's not fair, people shout. Neither are second homes when people don't even have their first but that doesn't seem to be a popular opinion on here.
https://cloud.google.com/artifact-registry/docs/pull-cached-...
You would have had to authenticate to access that repo as well.
https://aws.amazon.com/ecr/pricing/?nc1=h_ls
> For unauthenticated customers, Amazon ECR Public supports up to 500GB of data per month. https://docs.aws.amazon.com/AmazonECR/latest/public/public-s...
I don't see how it's better.
> If we had instead simply mirrored everything into a registry at a big cloud provider, we would never have paid docker a cent for the privilege of having unplanned work foisted upon us.
I mean, if one is unwilling to bother to login to docker on their boxes, is this really even an actual option? Hm.
This isn't a counterpoint; it's rewrapping the same point: free services for commercial enterprise is a counterproductive business plan.
My main complaint is:
They built open source tools used all over the tech world. And within those tools they privileged their own container registry, and provided a decade or more of endless and free pulls. Countless other tools and workflows and experiences have been built on that free assumption of availability. Similarly, Linux distros have had built-in package management with free pulling for longer than I’ve been alive. To get that rug-pull for open-source software is deeply disappointing.
Not only that, but the actual software hosted on the platform is other people’s software. Being distributed for free. And now they’re rent-seeking on top of it and limiting access to it.
I assume most offices and large commercial businesses have caches and other tools built into their tools, but for indie developers and small businesses, storage of a ton of binary blobs starts to add up. That’s IF they can even get the blobs the first time, since I imagine they could experience contention and queuing if you use many packages.
And many people use docker who aren’t even really aware of what they’re doing - plenty of people (myself included) have a NAS or similar system with docker-wrapping GUI pre-installed. My NAS doesn’t even give me the opportunity to login to docker hub when pulling packages. It’s effectively broken now if I’m on a CGNAT.
That would be more reasonable if they didn't go out of their way to make doing so painful: https://github.com/moby/moby/issues/7203
The reality is, DockerHub (though originally called the Docker Index), was the first Docker image registry to even exist, and it was the only one to exist when image references were created.
Now, I would say there are definitely some issues you could have referenced here that would be more relevant (e.g. mirrors only working for DockerHub).
* https://www.aquasec.com/blog/a-brief-history-of-containers-f...
Docker wasn't stolen from anything. It built on top of existing things, even worked on new things, and provided a nice abstraction to everything.
> provide something like Dockerfile that's less golang-inspired and more linux-inspired
What? Is the thing that's golang inspired the image references? OK...
The Dockerfile takes from golang IMO, it's intentionally very low on syntax. Just like go's text/template and html/template.
Also note, Docker can build whatever format you want, it just defaults to the Dockerfile format, but you can give it whatever syntax parser you want.
Cannot help but notice that, had Microsoft offered such a sweet deal, this place would've been ablaze with cries of "Embrace, extend, extinguish" and suchlike. (This still regularly happens, e.g., when new Github features are announced). Perhaps even justifiably so, but the community has failed to apply that kind of critical thinking to any other company involved in open source. If your workflow is not agnostic wrt where you pull images from, it is kind of silly to blame it on Docker Inc.
Having said that, it is definitely a problem for many. I work at a technical university and I am sure colleges/research institutes will hit the limit repeatedly and easily.
I have to say though, 90% of the dockers I use aren't on docker hub anymore. Most of them reside on the github docker repo now (ghcr.io). I don't know where the above playbook pulls from though as it's all automated in ansible.
And really docker is so popular because of its ecosystem. There are many other container management platforms. I think that they are undermining their own value this way. Hobbyists will never pay for docker pulls but they do generate a lot of goodwill as most of us also work in IT. This works the other way around too. If we get frustrated with docker and start finding alternatives it's only a matter of time until we adopt them at work too.
If they have an issue with bandwidth costs they could just use the infrastructure of the many public mirrors available that also host most Linux distros etc. I'm sure they'd be happy to add publicly available dockers.
Ironically, it's the paid rates that are being reduced more (though they don't have hourly limits still, so more flexibility, but the fair use thing might come up), as they were infinite previously, now Pro is 34 pulls/hour (on average, which is less than authenticated), Team is 138 pulls/hour (or 4 times Pro) and Business 1380 pulls/hour (40 times pro, 10 times team).
My feeling is that this is trying to get more people to create docker accounts, so the upsell can be more targeted.
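The per-hour averages above are easy to sanity-check. A quick sketch, assuming a 30-day month (my assumption, not Docker's definition) and using the 25,000 pulls/month Pro figure quoted elsewhere in this thread:

```python
HOURS_PER_MONTH = 30 * 24  # assumes a 30-day month


def average_pulls_per_hour(pulls_per_month: int) -> float:
    """Spread a monthly pull quota evenly over every hour of the month."""
    return pulls_per_month / HOURS_PER_MONTH


def pulls_per_month(pulls_per_hour: int) -> int:
    """Upper bound on monthly pulls if an hourly limit is hit every hour."""
    return pulls_per_hour * HOURS_PER_MONTH


print(round(average_pulls_per_hour(25_000), 1))  # 34.7, in line with "34 pulls/hour"
print(pulls_per_month(10))                       # 7200 for the 10/hour limit
```

So a monthly quota is more flexible for bursts, but averaged out it is not dramatically more than what an hourly cap allows.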
They're entitled to do what they want and implement any business model they want. They're not entitled to any business, to my data, nor their business model working.
I couldn't give 2 shits whatever Docker does. They're a service, if I wanna use it I'll pay, if not then I'll use something else. Ez pz
I don't use Docker so I genuinely don't know this...
Is the Docker Library built on the back of volunteers which is then used to sell paid subscriptions?
Does this commercial company expect volunteers to give them images for free which give their paid subscriptions value?
Yes, to an extent, because it costs money to store and serve data, no matter what kind of data it is or its associated IP rights/licensing/ownership. Regardless, this isn't requiring people to buy a subscription or otherwise charging anyone to access the data. It's not even preventing unauthenticated users from accessing the data. It's reducing the rate at which that data can be ingested without ID/Auth to reduce the operational expense of making that data freely (as in money) and publicly available. Given the explosion in traffic (demand) and the ability to make those demands thanks to automation and AI relative to the operational expense of supplying it, rate limiting access to free and public data egress is not in and of itself unreasonable. Especially if those that are responsible for that increased OpEx aren't respecting fair use (legally or conceptually) and even potentially abusing the IP rights/licensing of "images [given] for free" to the "Library built on the back of volunteers".
It's perfectly reasonable to discuss to what extent that's happening, how relevant it is to Docker, and how effective or reasonable Docker's response to it is. The entitlement is referring to those that explicitly or implicitly expect or demand such a service should be provided for free.
Note: you mentioned you don't use docker. A single docker pull can easily be hundreds of MB (the official psql image is ~150MB, for example) or even in some cases over a GB worth of network transfer depending on the image. Additionally, there is no restriction by docker/dockerhub that prevents or discourages people from linking to source code or alternative hosts of the data. Furthermore, you don't have to do a pull every time you wish to use an image, and caching/redistributing them within your LAN/Cluster is easy. It should also be mentioned that Docker Hub is more than just a publicly accessible storage endpoint for a specific kind of data, and their subscription services provide more than just hosting/serving that data.
If you're only looking at Docker Hub as a host of public images, you're only seeing the tip of the iceberg.
Docker Hub subscriptions are primarily for hosting private images, which you can't see from the outside.
IMO, hosting public images with 10 pulls per hour is plenty generous, given how much bandwidth it uses.
So, yeah, they're kind of taking advantage of people putting their work on DH to try to sell subs.
But nobody has to put their images on DH. And to be honest, I don't think the discoverability factor is as important on DH as it is on GitHub.
So if people want to pay for their own registry to make it available for free for everyone, it's less of an issue than hosting your repo on your own GitLab/Gitea instance.
Yes.
> Does this commercial company expect volunteers to give them images for free which give their paid subscriptions value?
Yes.
I’ve also seen plenty of docker-compose files which pull out this amount of images (typically small images).
I’m not saying that Docker Inc should provide free bandwidth, but let’s not also pretend that this won’t be an issue for a lot of users.
Limits per IPv4 address are really, really annoying. All I can do is flick on a VPN... which likely won't work either
https://www.docker.com/blog/docker-hub-registry-ipv6-support...
I don't know enough about IPv6, is this potentially its own problem?
LetsEncrypt AWS Azure GCP Github Actions
Failing to see how they don't mix well.
You could also, you know, pay Docker for the resources you're using.
If I'm an infrequent tinkerer that occasionally needs docker images, I'm not going to pay a monthly cost to download e.g. 1 image/month that happens to be hosted on Docker's registry.
(It sounds like you can create an account and do authenticated pulls; which is fine and pretty workable for a large subset of my above scenario; I'm just pointing out a reason paying dollars for occasional one-off downloads is unpopular)
Someone (maybe the podman folks?) should do what every Linux distribution has done, and set up a network of signed mirrors that can be rsynced.
Replace "apartment tower" with "CS department at a university", and you have a relatively common situation.
The ire is because of the rug pull. (I presume) you know that. It’s predatory behavior to build an entire ecosystem around your free offering (on the backs of OSS developers) then do the good old switcheroo.
I think Docker started the bloated image mess. Have you ever seen a project under 100MB in size?
I guess packing everything with gzip isn't a good idea when size matters.
Docker Hub has a traffic problem, and so does every intranet image registry. It's slow. The culprit is Docker (and maybe people who won't bother to optimize).
I started using explicit repository names for everything including Docker Hub 5+ years ago and I don't regret it. I haven't thought about mirrors since, and I find it easier to reason about everything. I use pull-through caches with dedicated namespaces for popular upstream registries.
I tried using mirrors at first, but it was a disaster with the shorthand notation because you can have namespace collisions. Consider: What happens below? What do you get? How do you know where the mirror admin is pulling from? That only works if you have single source of truth or if you keep a mapping somewhere. Ex: That's not useful because your definitions are still ambiguous unless you go look at the mappings, so all you've done is add external config vs explicitly declaring the namespace.
Plus, you can set up a pull-through cache everywhere it makes sense.
I'd be interested to hear about scenarios where mirrors are more than a workaround for failing to understand the power of Docker's namespacing and defaulting to the shorthand notation for everything.
It’s basically your apartment building example (esp. something like the STEM dorms)
When this stuff breaks in the hours leading up to a homework assignment being due, it’s going to discourage the next generation of engineers from using it.
If the power company gave me free energy for 15 years, I would also be pissed. Rightly? No, but hey, that's not the issue.
Also with docker being the status quo for so long, it does hurt the ecosystem / beginners quite a lot.
Because they did. But you're right—they have no obligation to continue doing so. Now that you mention it, it also reminds me that GitHub has no such obligation either.
In a way, expecting free container images is similar to how we can download packages from non-profit Linux distributions or how those distributions retrieve the kernel tarball from its official website. So, I’m not sure whether it’s better for everyone to start paying Docker Hub for bandwidth individually or for container images to be hosted by a non-profit, supported by donations from those willing to contribute.
There is always a commercial subscription. You need only a single $9/mo account and you get 25,000 pulls/month.
And if you are not willing to pay $9/mo, then you should be OK with using free personal account for experiments, or to spread out your experiments over longer timeline.
If Docker explicitly offers a service for free, then users are well within their rights to use it for free. That’s not entitlement, that’s simply accepting an offer as it stands.
Of course, Docker has every right to change their pricing model at any time. But until they do, users are not wrong for expecting to continue using the service as advertised.
I've seen this "sense of entitlement" argument come up before, and to be clear: users expecting a company to honor its own offer isn’t entitlement, it’s just reasonable.
10 per hour is slightly lower than 100 per 6 hours, but not in any meaningful way from a bandwidth perspective, especially since image size isn't factored into these rate limits in any way.
If bandwidth is the real concern, why change to a more inconvenient time period for the rate limit rather than just lowering the existing rate limit to 60 per 6 hours?
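A rough sketch of that comparison, normalizing both limits to the same 6-hour window. The function name is made up for illustration, the numbers are the ones quoted above, and image size is ignored, just as it is in the rate limits themselves.

```python
def pulls_in_window(limit, period_hours, window_hours):
    """Sustained pulls allowed over window_hours for a limit of `limit` per `period_hours`."""
    return limit * window_hours / period_hours


old_sustained = pulls_in_window(100, 6, 6)  # old limit: 100 pulls per 6 hours
new_sustained = pulls_in_window(10, 1, 6)   # new limit: 10 pulls per hour

# Largest burst that fits inside a single hour under each scheme.
old_burst = 100  # the whole 6-hour allowance could be spent at once
new_burst = 10

print(old_sustained, new_sustained)  # 100.0 60.0
print(old_burst / new_burst)         # 10.0
```

Under these assumptions the sustained allowance drops by 40%, while the allowance for a short burst drops by a factor of 10, which is the part that hurts anything pulling many images at once.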
Then again, it's good that this measure forces fixing this bad behaviour, but as a user of buildpack you don't always know how to fix it.
Adding auth to pulls is easy. Mirroring images internally is easy. Anyone who says otherwise is lazy.
Bandwidth is super cheap if you don't use any fancy public cloud services.
They aren't likely able to go for peering arrangements ("free" bandwidth) because their traffic is likely very asymmetric, and that doesn't save the management/storage/compute costs.
I don't know what Docker's financials are, but I can imagine, as a business owner myself, situations where it was lean enough that that sort of cost could mean the difference between running the service and not.
People understand that bandwidth costs money but that seems to have been priced in to their previous strategy or they did it knowingly as a loss leader to gain market share. If they knew this was a fundamental limitation they should have addressed it years ago.
Perhaps they should have started by putting "we will enforce limits soon" in all documentation.. and in a few years, starting enforcement but with pretty high limits? and then slowly dialing limits down over a few years?
That's exactly what they did. I remember setting up the docker proxy 4 years ago when we started getting first "rate limit" errors. And if someone was ignoring the news for all that time.. Well, tough for them, there was definitely enough notice.
Or if you want it a little more colorfully: capturing a market with free candy to get us into your van.
Or more accurately, this “free until we need something from you” is the moral equivalent of a free meal or ski trip but you are locked into a room to watch a timeshare marketing deck.
Open Source is built on a gift economy. When you start charging you break the rules of that socioeconomic model. So most of us make tools that we have a hope of keeping running the way they were designed. Some of us stay at the edges because we know we won’t still be interested in this tool in two years and we don’t want to occupy other people’s head space unfairly or dishonestly. Some of us believe we can persevere and then we find there’s a new programming language that we like so much more than this one that we fuck off and leave a vacuum behind us that others have to scramble to fill (I’m talking about you, TJ).
And then there’s these kinds of people who hoover up giant sections of the mindshare using VC money and don’t ever find a new revenue stream, like Mozilla managed. And it’s getting really fucking old.
One of the problems with XML is that the schemas aren’t cached by default, and so a build system or a test harness that scans dozens or hundreds of these files per run, or twenty devs running their sandboxes in a tight loop, can eat up ridiculous amounts of bandwidth. But the important difference is that they architected a way to prevent that, it’s just that it is fiddly to get set up and nobody reads the instructions. I found an email chain with the W3c.org webmaster complaining about it, and myself and a couple other people tried to convince him that they needed to add a 100ms delay to all responses.
My reasoning was that a human loading a single XML file would never notice this increase, but a human running 100s of unit tests definitely would want to know why they suddenly got slower, and doing something about it wouldn’t just get back that extra 20 seconds, they’d get back 40 (or in our case 5-10minutes) by making the call one more time and putting it into a PR. We only noticed we were doing it because our builds stopped working one day when the internet went down.
There’s no build tool I know of that will deal with being 429’d at 10 requests. Especially if you know anything about how layers work. There are tons that would work just fine being traffic shaped to a server with a lower SLA. Dedicate half of your cluster to paying customers and half to the free tier. Or add 250ms delay. It’ll sort itself out. People will install Artifactory or some other caching proxy and you’ll still have the mindshare at a lower cost per use.
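For reference, this is roughly what "dealing with a 429" would look like on the client side. It is only a sketch: `fetch` is a hypothetical callable standing in for whatever performs the pull, and it assumes any `Retry-After` header is given in seconds.

```python
import time


def fetch_with_backoff(fetch, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it stops returning HTTP 429 or attempts run out.

    fetch is any callable returning (status_code, headers_dict).
    Returns (final_status, attempts_used).
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        status, headers = fetch()
        if status != 429:
            return status, attempt
        if attempt == max_attempts:
            break
        # Prefer the server's hint if it sent one; otherwise back off exponentially.
        retry_after = headers.get("Retry-After")
        wait = float(retry_after) if retry_after is not None else delay
        sleep(wait)
        delay *= 2
    return 429, max_attempts


# Demo with a fake server that rate-limits the first two requests.
responses = iter([(429, {}), (429, {}), (200, {})])
waits = []
result = fetch_with_backoff(lambda: next(responses), sleep=waits.append)
print(result, waits)  # (200, 3) [1.0, 2.0]
```

Note that a backoff measured in seconds does little against a limit that resets hourly, which is part of why a slower but still-working response would be friendlier to build tools than a hard 429.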
But idk maybe Docker shouldn't have pulled a bait-and-switch, which is also classically known as a "dick move".
Bandwidth is cheap, especially at scale, unless you're in one of the large clouds that make a shitload of money gouging their customers on egress fees.
I don't say that Docker Inc should foot the bill for other multibillion dollar companies, but the fact that even after 8 years it still is impossible to use authentication in the registry-mirrors option is mind-boggling.
[1] https://github.com/moby/moby/issues/30880
Without dockerhub you would have to host your own repository, which would cost money.
This should help people understand a bit better why this feels a bit underhanded. The images are free, and I and many other OSS devs have used docker hub in partnership to provide access to software, often paying for the ability to publish there. In this case, any burden of extra cost was on the producer side.
Turning this into a way to "know" every user and extract some value from them is their prerogative, but it does not feel like it is good faith. It also feels a bit creepy in the sense of "the user is the product".
Part of it is perhaps by definition, “spreading” already assumes success. Still, I’d welcome some regulation; or at least awareness; e.g. a neologism for companies in that stage, growing at cost and only getting ready to develop symptoms.
[1]: The American Dialect Society selected “Enshittification” as its 2023 word of the year, source: https://en.m.wikipedia.org/wiki/Enshittification
One can't rely on library updates being done, thus one has to have a build chain for many images.
Upstream has blocked it. A fork over this one little feature is long overdue.
There's already a widely available way to specify exactly which repo you'd prefer in the docker client...
`docker pull [repo]/[image]:[tag]`
And that address format works across basically all of the tooling.
Changing the defaults doesn't actually make sense, because there's no guarantee that repo1/secure_image === repo2/secure_image.
Seems like a HUGE security issue to default named public images to a repo where the images may not be provided by the same owner.
Giving people the option to configure a default repository via the daemon.json would alleviate that issue, but I'm not sure if that's really enough to fork.
It's just not that hard to go fully qualified.
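To make "fully qualified" concrete, here is a small sketch of how the shorthand expands. It follows the client's defaulting rules as I understand them (no registry host means `docker.io`, no namespace on Docker Hub means `library/`, no tag means `latest`); the function name and the example image names are made up.

```python
DEFAULT_REGISTRY = "docker.io"
DEFAULT_NAMESPACE = "library"
DEFAULT_TAG = "latest"


def qualify(ref):
    """Expand a shorthand image reference into registry/path:tag form."""
    # Split off a digest first, e.g. name@sha256:...
    name, _, digest = ref.partition("@")

    # A tag is a ':' after the last '/', so a registry port is not mistaken for a tag.
    tag = ""
    colon = name.rfind(":")
    if colon > name.rfind("/"):
        name, tag = name[:colon], name[colon + 1:]

    # The first path component is a registry host only if it looks like one.
    first, slash, rest = name.partition("/")
    if slash and ("." in first or ":" in first or first == "localhost"):
        registry, path = first, rest
    else:
        registry, path = DEFAULT_REGISTRY, name

    # Official images on Docker Hub live under the "library" namespace.
    if registry == DEFAULT_REGISTRY and "/" not in path:
        path = f"{DEFAULT_NAMESPACE}/{path}"

    result = f"{registry}/{path}"
    if tag:
        result += f":{tag}"
    elif not digest:
        result += f":{DEFAULT_TAG}"
    if digest:
        result += f"@{digest}"
    return result


print(qualify("ubuntu"))                  # docker.io/library/ubuntu:latest
print(qualify("someorg/someapp:1.2"))     # docker.io/someorg/someapp:1.2
print(qualify("localhost:5000/someapp"))  # localhost:5000/someapp:latest
```

Writing the expanded form everywhere removes the guesswork about which registry a given name will hit.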
With these changes, I can imagine “intro to docker” tutorials breaking.
I suspect that’ll be enough to let a fork/competitor gain significant market share.
I doubt it.
Remember, tags are mutable. `latest` today can be different tomorrow.
And it expands to all other tags. Nothing prevents me from pushing a new 'version' of a container to an existing `v1.0.1` tag.
Tags are not a way of uniquely identifying a container.
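A toy model of why that is, assuming nothing more than "a tag is a mutable pointer to a content digest". `ToyRegistry` is invented for illustration and hashes a JSON dump of a dict, whereas a real registry hashes the exact manifest bytes.

```python
import hashlib
import json


class ToyRegistry:
    """In-memory stand-in for a registry: blobs keyed by digest, tags pointing at digests."""

    def __init__(self):
        self.blobs = {}
        self.tags = {}

    def push(self, tag, manifest):
        data = json.dumps(manifest, sort_keys=True).encode()
        digest = "sha256:" + hashlib.sha256(data).hexdigest()
        self.blobs[digest] = data
        self.tags[tag] = digest  # silently overwrites whatever the tag pointed at before
        return digest

    def resolve(self, ref):
        # A digest resolves to itself; a tag resolves to whatever it points at right now.
        return ref if ref.startswith("sha256:") else self.tags[ref]

    def pull(self, ref):
        return json.loads(self.blobs[self.resolve(ref)])


registry = ToyRegistry()
first = registry.push("v1.0.1", {"layers": ["layer-a"]})
second = registry.push("v1.0.1", {"layers": ["layer-a", "layer-b"]})

print(registry.pull("v1.0.1"))  # {'layers': ['layer-a', 'layer-b']}
print(registry.pull(first))     # {'layers': ['layer-a']}
```

Pulling by tag returns whatever was pushed last, while pulling by digest keeps returning the original content.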
The top level image hashes tend to change between repos (because the image name changes in the manifest and is included in the hash).
So you'd have to go through and verify each layer sha.
Good tool for selecting an exact image in a repo, not a replacement for trust at the naming level (it's a bit like the difference between owning a domain and cert pinning with hpkp).
Ex - lots of refs are to "multi-arch" images, Except... there's no such thing as a multi-arch image, the entire identifier is just a reference to a manifest that then points to a list of images (or other manifests) by arch, and the actual resolved artifact is a single entry in that list.
But it means the manifest needs to be able to reference and resolve other names, and that means including... names.
For a more concrete example, just check https://github.com/moby/moby/issues/44144#issuecomment-12578...
Basically - the digests weren't intended to support image verification across repos, and the tool doesn't treat them that way. The digest was intended to allow tighter specification than a tag (precisely because a publisher might push a different image to the same tag later).
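A minimal sketch of the "multi-arch" indirection described above, using the general shape of an OCI image index. The digests are placeholders, and real entries carry more fields (such as `mediaType` and `size`) than shown here.

```python
# A reference to a "multi-arch image" really resolves to an index like this one.
index = {
    "schemaVersion": 2,
    "mediaType": "application/vnd.oci.image.index.v1+json",
    "manifests": [
        {"digest": "sha256:" + "a" * 64,
         "platform": {"architecture": "amd64", "os": "linux"}},
        {"digest": "sha256:" + "b" * 64,
         "platform": {"architecture": "arm64", "os": "linux"}},
    ],
}


def select_manifest(index, os_name, arch):
    """Return the digest of the single entry that matches the requested platform."""
    for entry in index["manifests"]:
        platform = entry.get("platform", {})
        if platform.get("os") == os_name and platform.get("architecture") == arch:
            return entry["digest"]
    raise LookupError(f"no manifest for {os_name}/{arch}")


amd64_digest = select_manifest(index, "linux", "amd64")
arm64_digest = select_manifest(index, "linux", "arm64")
print(amd64_digest != arm64_digest)  # True
```

The same tag therefore ends up as a different artifact depending on who is pulling, which is one reason the digest of the thing you referenced is not the digest of the thing you run.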
One way to get around this is to just not use `latest` at all, and only push docker tags that perfectly mirror the corresponding git branches/tags.
[1] https://podman.io
So the source of the image can be decided on pull. Some more on this https://www.redhat.com/en/blog/manage-container-registries
It looks like it's ordered by priority.
So you get a concatenation of all those registries and transient network failures are going to change the behavior. I'll take a pass on that one.
(podman is a docker compatible replacement with a number of other nice features besides being able to configure the registry)
that said, you can configure "registry-mirrors" in /etc/docker/daemon.json although it is not the same thing
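For anyone who hasn't touched that file, a sketch of adding a mirror to an existing config without clobbering other settings. The mirror URL is a placeholder and the helper name is made up; only the `registry-mirrors` key comes from the comment above.

```python
import json


def with_mirror(config_text, mirror_url):
    """Return daemon.json text with mirror_url added to "registry-mirrors"."""
    config = json.loads(config_text) if config_text.strip() else {}
    mirrors = config.setdefault("registry-mirrors", [])
    if mirror_url not in mirrors:
        mirrors.append(mirror_url)
    return json.dumps(config, indent=2)


existing = '{"log-driver": "json-file"}'
updated = with_mirror(existing, "https://mirror.example.internal")
print(updated)
```

The result would be written back to /etc/docker/daemon.json, followed by a daemon restart.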
I can see the use case for base images: they're the canonical, trusted source of the image.
But for apps that are packaged? Not as much. I mean, if I'm using a PaaS, why can't I just upload my Docker image to them, and then they store it off somewhere and deploy it to N nodes? Why do I have to pay (or stay within a free tier) to host the blob? Many PaaS providers I've seen are happy to charge a few more bucks a month just to host Docker images.
I'm not seeing any sort of value added here (and maybe that's the point).
But obviously the real problem is that you're asking the wrong question. We don't "need" a centralized image repository. We WANT one, because the feature that Docker provides that "just use a tarball" doesn't (in addition to general ease-of-use, of course) is authentication, validation and security. And that's valuable, which is why people here are so pissed off that it's being locked behind a paywall.
But given that it has value... sorry folks, someone's got to pay for it. You can duplicate it yourself, but that is obviously an engineering problem with costs.
Just write the check if you're a heavy user. It's an obvious service with an obvious value proposition. It just sucks if it wasn't part of your earlier accounting.
I generally agree with you, but to be fair to the complainers, what sucks is that Docker didn't make it clear up front that it should be part of your accounting. I don't know if they always intended to monetize this way (if so, we'd call that a bait and switch) or if they sincerely had other plans that just didn't pan out, but either way the problem is the same: There's a trend across all of technology of giving your stuff away for free until you become the obvious choice for everything, then suddenly altering the deal and raising prices.
That kind of behavior has in the past been deemed anticompetitive and outlawed, because it prevents fair competition between solutions on their merits and turns it into a competition for who has the deepest war chest to spend on customer-acquisition-by-free-stuff.
At one point Docker may well have had an authentic intent for a free service, but costs along the way changed the reality of operations, and long-term cash flow and the success of the business required making changes. Maybe the cash saved from bandwidth is what makes the next project possible that helps them grow the bottom line.
Further what was once a positive value proposition 18 months ago can turn into a losing proposition today, and a company should be allowed to adapt to new circumstances and be allowed to make new decisions without being anchored and held back by historical decisions (unless under contract).
As fun as it is to hold executives to unrealistic standards, they're not fortune-tellers that can predict the future and they're making the best possible decisions they can given the constraints they're under. And I don't begrudge them if those decisions are in their own best interest, such as is their responsibility.
I'll give Docker the benefit of the doubt that this wasn't a bait-and-switch, that they never expected it to become so successful, and that costs outpaced their ability to monetize the success and were eating into cash reserves faster than planned. I think the current outcome isn't so bad, and that we're still getting a considerable amount of value for free. It's unfortunate that some people are only finding out now, and are now under pressure to address an issue they didn't sign up for.
Those are generally solved using SSL, no need for centralized storage.
Standing up your own registry is trivial at the kind of scales (dozens-to-hundreds of images pulls per day!) that we're talking about. It's just expensive, so people want Docker, Inc. to do it for free. Well...
1. Set up a pull-through mirror. Google Artifact Registry has decent limits and good coverage for public images. This requires just one config change and can be very useful to mitigate rate limits if you're using popular images cached in GAR.[1]
2. Set up a private pull-through image registry for private images. This will require renaming all the images in your build and deployment scripts and can get very cumbersome.
3. Get your IPs allowlisted by Docker, especially if you can't have docker auth on the servers. The pricing for this can be very high. Rough numbers: $20,000/year for 5 IPs and usually go upwards of $50k/year.
4. Set up a transparent docker hub mirror. This is great because no changes need to be made to pipelines except one minor config change (similar to 1). We wrote a blog about how this can be done using the official docker registry image and AWS.[2] It is very important to NOT use the official docker registry image [3] as that itself can get throttled and lead to hairy issues. Host your own fork of the registry image and use that instead.
We spent a lot of time researching this for certain use cases while building infrastructure for serving GitHub Actions at WarpBuild.
Hope this helps.
[1] https://cloud.google.com/artifact-registry/docs/pull-cached-...
[2] https://www.warpbuild.com/blog/docker-mirror-setup
[3] https://hub.docker.com/_/registry
By default anything you need from helm charts will be pulled from docker hub, and it's normal to have a storage daemon, networking agents, loggers on every node, so if you launch enough at once during an autoscale event, you'd trigger this limit.
With a competent caching strategy (the sort of thing you'd set up with nix or bazel) it's often faster to send the git SHA and build the image on the other end than it is to move built images around. This is because 99% of that image you're downloading or pushing is probably already on the target machine, but the images don't contain enough metadata to tell you where that 1% is. A build tool, by contrast, understands inputs and outputs. If the inputs haven't changed, it can just use the outputs which are still lying around from last time.
Does it have to? It seems it should be possible to diff the layers and only invalidate if there are conflicts.
If, in a later version, the first command changes to `echo 3 > A` then the contents of B should become "4", even though the second command didn't change. That is, neither layer can be reused because the layers depend on each other.
But maybe there's no dependency. If your Dockerfile is like this:
Then the second layer could in theory be re-used when the first layer changes, and not built/pushed/downloaded a second time. But docker doesn't do this. It plays it safe and unnecessarily rebuilds both layers anyway. And since these files end up with timestamps, the hashes of the layers differ, so both layers are consequently reuploaded and redownloaded.
Build tools like nix and bazel require more of the user. You can't just run commands all willy nilly, you have to tell them more info about which things depend on which other things. But the consequence is that instead of a list of layers you end up with a richer representation of how dependency works in your project (I guess it's a DAG). Armed with this, when you try to build the next version of something, you only have to rebuild the parts that actually depend on the changes.
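A toy model of the difference, not how either Docker or nix/bazel actually computes cache keys. The second command (`echo hello > B`) is a made-up example of a step that doesn't read A, and the helper names are hypothetical.

```python
import hashlib


def digest(*parts):
    return hashlib.sha256("\0".join(parts).encode()).hexdigest()[:12]


def chained_keys(commands):
    """Layer-style keys: each key covers the command plus everything before it."""
    keys, parent = [], ""
    for command in commands:
        parent = digest(parent, command)
        keys.append(parent)
    return keys


def dag_keys(steps):
    """Build-graph keys: each key covers the command plus its declared inputs only.

    steps maps a step name to (command, [names of the steps it depends on]).
    """
    memo = {}

    def key(name):
        if name not in memo:
            command, deps = steps[name]
            memo[name] = digest(command, *(key(d) for d in sorted(deps)))
        return memo[name]

    return {name: key(name) for name in steps}


# Only the first command changes between versions.
v1 = ["echo 1 > A", "echo hello > B"]
v2 = ["echo 3 > A", "echo hello > B"]
chain_reused = chained_keys(v1)[1] == chained_keys(v2)[1]

# Same steps, but B declares that it does not depend on A.
g1 = {"A": ("echo 1 > A", []), "B": ("echo hello > B", [])}
g2 = {"A": ("echo 3 > A", []), "B": ("echo hello > B", [])}
dag_reused = dag_keys(g1)["B"] == dag_keys(g2)["B"]

print(chain_reused, dag_reused)  # False True
```

If B did declare A as an input, its key would change along with A's, which is the case where neither approach can reuse it.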
Whether the juice is worth the squeeze is an open question. I think it is.
A giving person could also set one of these up publicly facing and share it out.
Also it still takes some gymnastics to optionally support docker creds in a workflow https://github.com/orgs/community/discussions/131321
Very hard to find anything definitive still left on the web. This is all I could find...
https://github.com/actions/runner-images/issues/1445#issueco...
https://github.com/actions/runner-images/issues/1445#issueco...
> Very hard to find anything definitive still left on the web
Probably a lot happened behind closed doors so there probably wasn’t much to begin with.
We at Depot [0] work around this by guaranteeing that a new runner brought online has a unique public IP address, thus avoiding the need to log in to Docker to pull anything.
Subsequently, we also do the same unique public IP address for our Docker image build product as well. Which helps with doing image builds where you're pulling from base images, etc.
[0] https://depot.dev
I.e Docker terms of service restrict distribution in this way?
Are there any technical constraints?
I.e Docker specify no-cache
I expect Docker doesn't want their images cached and would want you to use their service, turning you into a paying subscriber through limitations on the free tier.
My feeling is the way the naming scheme was defined (and subsequent issues around modifying the default registry), docker wanted to try to lock people into using docker hub over allowing public mirrors to be set up easily. This failed, so they've needed to pivot somewhat to reduce their load.
If your project can’t afford to pay for servers and someone to maintain them, I think we should stick with local shell scripts and precommit hooks.
My blog post on the same at https://avilpage.com/2025/02/free-dockerhub-alternative-ecr-...
The rate limit for unauthenticated pulls is 1/second/IP, source: https://docs.aws.amazon.com/general/latest/gr/ecr-public.htm...
Edit: Not exactly, it looks like ECR mirrors docker-library (a.k.a. images on docker hub not preceded by a namespace), not all of Docker Hub.
Edit 2: I think the example you give there is misleading, as Ubuntu has its own namespace in ECR. If you want to highlight that ECR mirrors docker-library, a more appropriate example might be `docker pull public.ecr.aws/docker/library/ubuntu`.
Not something we'd encountered before but seems earlier than these changes are meant to come into effect.
We've cloned the base image into ECR now and are deriving from there. This is all for internal authenticated stuff though.
We fixed the problem by using a pull through registry
The unauthenticated limit doesn't bother me as much, though I was a little upset when I first saw it. Many businesses don't bother setting up their own registry, even though they should, nor do they care to pay for the service. I suspect that many don't even know that Docker can be used without Docker Hub. These are the freeloaders Docker will be targeting. I've never worked for a company that was serious about Docker/Kubernetes and didn't run their own registry.
One major issue for Docker is that they've always run a publicly available registry, which is the default and just works. So people have just assumed that this was how Docker works and they've never bothered setting up accounts for developers nor production systems.
Like, I get it, but it adds considerable work and headaches to thousands (millions?) of people.
Not Docker, but I worked on a project that used certain Python libraries, where the author would yank the older versions of the library every time they felt like rewriting everything; this happened multiple times. After it happened the second time we just started running our own Python package registry. That way we were in control of upgrades.
Nexus is very easy to set up.
You should also run your own apt/yum, npm, pypi, maven, whatever else you use, for the same reasons. At a certain scale it's just prudent engineering.
Own your dependency chain.
I can't stress enough how much I dislike Rancher. I know we moved to it as a cost saving measure as I am assuming we would have to buy subs for Docker.
Yet there is nothing I found easier to use than Docker proper. Rancher has a Docker compatible mode and it falls down in various ways.
Now that this has happened, I wonder if Rancher is pulling by default from the Docker Hub registry, in which case we'll now need to set up our own registry for the images we use, keep them up to date, etc. That feels like it would be more costly than paying Docker to begin with.
All this makes me almost miss Vagrant boxes.
Reasonable price for better dev efficiency. Free for personal use.
Neither are drop in replacements for Docker Desktop, that much I am certain about, thus far.
The team that will have to do this won't have it as a priority, and unfortunately that means it'll always lag behind.
Some of this I realize is company quirk specific, but even if we had our own mirror it doesn't negate the problem entirely.
The interface is very basic. I had to get plugins for very basic functionality that has been built into Docker Desktop for years, like Logs Explorer.
It seemingly always prompts for Admin Access on the computer, even though Docker long ago stopped doing this and has worked without admin access for some time.
The prompt for enabling admin access is funny. If you don't have it already, it will prompt you to enable it; if you have it enabled, it will pop up another, very similar window whose wording says "Startup Rancher Desktop without administrator access", but it's easy to miss the wording difference because the font is small.
I've had stability issues, containers randomly crashing or the daemon going down out of nowhere. Happened more than once.
It claims to be a drop in for Docker CLI, but while I don't have the list handy at the moment, I know this isn't true, particularly with docker-compose.
I could go on, but it's still really rough around the edges.
Finally, a use for IPv6!
I assume so anyway, as I think ISPs that support ipv6 will give you multiple IPv6 /64 spaces if requested.
The pull limits have also been delayed at least a month.
Can one of the big tech companies please use their petty cash account to acquire what remains of docker.com? Maybe OSS any key assets and donate docker hub, trademarks, etc. to some responsible place like the Linux Foundation, which would be a good fit. This stuff is too widely used to leave it held hostage by an otherwise unimportant company like Docker. And the drama around this is getting annoying.
MS, Google, AWS, anyone?
Alternatively, let's just stop treating docker.io as a default place where containers live. That's convenient for Docker Inc. but not really necessary otherwise. Docker Inc is overly dependent on everybody just defaulting to fetching things without an explicit registry host from there. And with these changes, you wouldn't want any of your production environments be dependent on that anyway because those 429 errors could really ruin your day. So, any implied defaults should be treated like what they are: a user error.
If most OSS projects stop pushing their docker containers to docker hub and instead spin up independent registries, most of the value of docker hub evaporates. Mostly the whole point of putting containers there was hassle free usage for users. It seems that Docker is breaking that intentionally. It's not hassle free anymore. So, why bother with it at all? Plenty of alternative ways to publish docker containers.
I know that places like Circle already do a lot of stuff to automatically set up local caches as it can to avoid redownloading the same thing over and over from the outside world, and I hope that becomes more of the norm.
This timeline is kinda wild though.
Vote with your feet and your wallets.
Teaching people to use Docker is not uncommon. The entire class pulling an image at (roughly) the same time is not uncommon either.
Yes, you can ask people to set up an account (provided you don't have policies against requiring students to sign up for unvetted US-based third-party services and provide personal data to them), but that complicates things.
Docker can't really market to machines doing most of the downloads autonomously and probably can't monetize download data well either, so they want you to start paying them... or go use something else.
If I read these limits correctly, looks like lots of things are going to break on March 1st
That's understandable, but if the claim would be that this is primarily related to the costs of bandwidth, shouldn't the instructions to deploy an image caching solution (e.g. Sonatype Nexus or anything else) be at the forefront?
Like, if the same image gets pulled for some CI process that doesn't have a cache for whatever reason or gets redeployed often, having a self-hosted proxy between the user and Docker Hub would solve it really well with quite limited risks.
I think a lot of people have misconceptions about how much bandwidth really costs.
It’s been almost a decade so it’s possible things have slowed considerably, or demand has outstripped supply, but given how much data Steam seems to be willing to throw at me, I know pricing is likely nowhere near what it was last I looked (it’s the only metered thing I regularly see and it’s downloading tens of GB daily for a couple games in my collection).
Using egress pricing is also the wrong metric. You’d be better off looking at data costs between regions/datacenters to get a better idea about wholesale costs, since high egress costs are likely a form of vendor lock-in, while looking at cross-region costs avoids any “free” data costs through patch cables skewing the numbers.
Not sure about bandwidth between countries, there’s different economics there. I’d expect some self similarity there, but laying trunks might be so costly that finding ways to utilize fiber better is the only real way to increase supply.
If bandwidth costs are important, there are plenty of options that will let you cut the cost by 10x (or more). Either with a caching layer like an external CDN (if that works for your application), or by moving to any of the mid-tier clouds (if bandwidth costs are an important factor, and caching won’t work for your application).
AWS, GCP, and Azure are the modern embodiment of the phrase “nobody ever got fired for buying IBM.”
Most companies don’t benefit from those big 3 mega clouds nearly as much as they think they do.
So, sure, send a note to your Azure rep complaining about the cost of bandwidth… nothing will change, of course, because companies aren’t willing to switch away from the mega clouds.
> and other providers
Other providers, like Hetzner, OVH, Scaleway, DigitalOcean, Vultr, etc., do not charge anywhere near the same for bandwidth as Azure. I think they are all about 8x to 10x cheaper.
Eg Fastly prices: US/Europe $0.10/GB India $0.28/GB
Not all bandwidth is equal. eg Hetzner will pay for fast traffic into Europe but don't pay the premium that others like AWS do to ensure it gets into Asia uncongested.
I didn’t say all CDNs are cheaper. Some CDNs see an opportunity to charge a premium, and they do!
Fastly sees themselves as far more than just a CDN. They call themselves an “edge cloud platform”, not a CDN.
> Not all bandwidth is equal. eg Hetzner will pay for fast traffic into Europe but don't pay the premium that others like AWS do to ensure it gets into Asia uncongested.
Sure… there are sometimes tradeoffs, but for bandwidth-intensive apps, you’re sometimes (often?) better off deploying regional instances that are closer to your customers, rather than paying a huge premium to have better connectivity at a distance. Or, for CDN-compatible content, you’re probably better off using an affordable CDN that will bring your content closer to your users.
If you absolutely need to use AWS’s backbone for customers in certain geographic regions, there’s nothing stopping you from proxying those users through AWS to your application hosted elsewhere, by choosing the AWS region closest to your application and putting a proxy there. You’ll be paying AWS bandwidth plus your other provider’s bandwidth, but you’ll still be saving tons of money to route the traffic that way if those geographic regions only represent a small percentage of your users… and if they represent a large percentage, then you can host something more directly in their region to make the experience even better.
For many types of applications, having higher latency / lower bandwidth connectivity isn’t even a problem if the data transfer is cheaper and saves money… the application just needs to do better caching on the client side, which is a beneficial thing to do even for clients that are well-connected to the server.
It depends, and I am not convinced there is a one-size-fits-all solution, even if you were to pay through the nose for one of the hyperscalers.
I have plenty of professional experience with AWS and GCP, but I also have professional experience with different degrees of bare metal deployment, and experience with mid-tier clouds. If costs don’t matter, then sure, do whatever.
Egress in the cloud is deliberately expensive as an anti-competitive measure to lock you in and stop you from using competitors' services.
I love how everyone is arguing about networking costs inside the tiny prison cell that is "the cloud". Because obviously the only way to push bits over the wire is through an AWS Internet Gateway, which was the very first packet-switched routing ever.
There is a huge difference between images that are carefully curated, with separate build layers and shipped layers, and the ones that dump in the codebase, install a whole compiler toolchain needed to build the application / wheels / (whatever it's called in Node.js), package it, and then ship off the image.
Clearing your apt cache and removing extraneous packages is peeing in the wind when faced with GB worth of shared objects.
"Docker Desktop is free for small businesses (fewer than 250 employees AND less than $10 million in annual revenue), personal use, education, and non-commercial open source projects."
I think that's reasonable, but it's hard for me to believe everyone's paying when they should be. I set up podman instead and I haven't had any major issues.
https://dev.to/shohams/5-alternatives-to-docker-desktop-46am
I don’t use Windows, but presumably you can just use their built-in Linux environment and the docker CLI.
Docker just needs to be open source software, there's no real revenue model that makes sense, but damn they're trying. Now I guess dockerhub is also just off the table.
I'd like to find something that:
- Can pull and serve private images
- Has UI to show a list of downloaded images, and some statistics on how much storage and bandwidth they use
- Can run periodic GC to delete unused images
- (maybe) Can be set up to pre-download new tags
IIRC Artifactory has some support for Docker images, but that seems like a big hammer for this problem. [1]
[0] https://docs.docker.com/docker-hub/image-library/mirror/
[1] https://jfrog.com/artifactory/
It... does not have a UI or the GC/pre-download stuff, but it absolutely works for private images (see: https://distribution.github.io/distribution/recipes/mirror/#...)
I've been using it as a cache for a while locally and it's a solid choice.
---
I guess an edit - it does also have basic TTL, which might cover your GC case, but it's not very configurable or customizable. It's literally just a TTL flag on the proxied image.
They already set up a URL in harbor that mirrors docker.io containers.
This forces pretty much everyone to move to a Pro subscription or to put a cache in front of docker.io.
Still doable though.
It’s much healthier for the ecosystem to have lots of small registries rather than all depend on a single central one.
I would be happy to give back to the community by hosting a container p2p host.
would that be even possible out of the box?
You can run your own following Docker's own guide here[0] if you'd like. It's not peer-to-peer in the sense that the lines between clients and servers are blurred, as with torrenting, but it allows for a distributed registry architecture, which I think is the part that matters here.
They could just give X Budget to public images and create a status code for 'server overloaded, pls consider buying premium' or whatever.
It would create the same response: either paying or mirroring it yourself, but it wouldn't harm their reputation as much.
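Roughly what that suggestion could look like, as a toy fixed-window budget. Everything here is hypothetical: the class name, the numbers, and the choice of HTTP 429 as the "please consider paying" status are mine, and none of it describes how Docker Hub actually works.

```python
import time


class PullBudget:
    """Fixed-window budget: each client gets `limit` pulls per `window` seconds."""

    def __init__(self, limit: int, window: float = 3600.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self._state = {}  # client -> (window start time, pulls used)

    def try_pull(self, client: str) -> int:
        """Return an HTTP-style status: 200 if within budget, 429 otherwise."""
        now = self.clock()
        entry = self._state.get(client)
        if entry is None or now - entry[0] >= self.window:
            entry = (now, 0)  # first pull, or the previous window has expired
        start, used = entry
        if used >= self.limit:
            self._state[client] = (start, used)
            return 429
        self._state[client] = (start, used + 1)
        return 200


# 203.0.113.7 is a documentation address, used here as a stand-in client.
budget = PullBudget(limit=2)
print([budget.try_pull("203.0.113.7") for _ in range(3)])  # [200, 200, 429]
```

The client-facing outcome is the same as today (pay or mirror); the only thing that changes is how the refusal is worded.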
I switched to podman during their last stunt in 2020 and have been a happy user since.
Going forward, the cheapest (free) container hub today is probably github.
I get that bandwidth is expensive, but this feels a bit like the usual "make it free to get lots of users, and then start charging when everyone is locked in" plan.
If they really just want to reduce their own costs, they should be evangelizing the use of a caching proxy, and providing a super easy way for people to set one up, both on the server and client side. (Maybe they already do this; I haven't looked.)
If everybody made fair use of Docker Hub, maybe we wouldn't have the rate limits in the first place? But I think we've all learned that won't happen on the open Internet.
Setting up a pull-through cache is pretty straight-forward, you can find the instructions in Docker's documentation: https://docs.docker.com/docker-hub/image-library/mirror/
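On the client side, pointing a Docker daemon at such a cache comes down to one entry in daemon.json. A minimal sketch in Python that merges a mirror into an existing daemon.json document: `registry-mirrors` is the key described in the linked docs, while the helper name and the mirror URL are placeholders of mine.

```python
import json


def add_registry_mirror(daemon_json_text: str, mirror_url: str) -> str:
    """Return daemon.json text with mirror_url listed under "registry-mirrors"."""
    config = json.loads(daemon_json_text) if daemon_json_text.strip() else {}
    mirrors = config.setdefault("registry-mirrors", [])
    if mirror_url not in mirrors:  # keep the operation idempotent
        mirrors.append(mirror_url)
    return json.dumps(config, indent=2)


# Hypothetical mirror address; replace with wherever your cache runs.
existing = '{"log-driver": "json-file"}'
print(add_registry_mirror(existing, "https://mirror.example.internal:5000"))
```

The daemon has to be restarted or reloaded to pick the change up, and as far as I know the mirror setting only applies to images that would otherwise come from Docker Hub.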
Something that doesn't require me to go through 50+ container setups and manually move every one of them to use my custom proxy?
If that's not enough, you could tunnel through HE's tunnelbroker and get a /48 which has 65,536 separate subnets for 655,360 pulls per hour.
Though, honestly, for the effort involved you're probably better off just mirroring the images.
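The arithmetic behind those numbers, as a quick sketch. It assumes the limit is counted per /64 for IPv6 and that the unauthenticated budget is 10 pulls per hour; both are taken from the figures in this thread, not checked against Docker's documentation.

```python
import ipaddress

PULLS_PER_HOUR = 10  # assumed budget per /64, as quoted in the thread

# 2001:db8::/48 is the IPv6 documentation prefix, used here as a stand-in.
allocation = ipaddress.ip_network("2001:db8::/48")

# Number of /64 subnets inside a /48: 2 ** (64 - 48)
subnet_count = 2 ** (64 - allocation.prefixlen)

print(subnet_count)                   # 65536
print(subnet_count * PULLS_PER_HOUR)  # 655360
```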
I understand Docker is paying for the bandwidth, but it's relatively cheap for them at the scale they operate. ghcr.io doesn't impose any rate limit at all (although it isn't really GitHub's main product), which I'd say proves that it's sustainable. In any case, 100 to 10 and 200 to 40 are both huge decreases and are unjustifiable for me.
Really, all this networking expertise floating around, and Docker artifacts already being content-addressable, there should be a way to torrent them.
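Content addressing is what would make an untrusted peer safe to download from: a blob is named by the hash of its bytes, so the receiver can check it no matter who served it. A minimal sketch, assuming the `sha256:<hex>` digest form used for image layers; the function names are mine, not any registry client's API.

```python
import hashlib


def blob_digest(data: bytes) -> str:
    """Compute a digest string of the form 'sha256:<hex>' for a blob."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def verify_blob(data: bytes, expected_digest: str) -> bool:
    """Accept a blob from any source only if its bytes hash to the expected digest."""
    return blob_digest(data) == expected_digest


layer = b"pretend this is a layer tarball"
wanted = blob_digest(layer)  # in practice this would come from a trusted manifest

print(verify_blob(layer, wanted))                 # True
print(verify_blob(layer + b" tampered", wanted))  # False
```

The trust then only has to cover the manifest that lists the digests; the bulk bytes could come from anywhere.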
> 2.4 You may not access or use the Service for the purpose of bringing an intellectual property infringement claim against Docker or for the purpose of creating a product or service competitive with the Service.
Which is a great reason to default to / publish on other registries.
https://github.com/uber/kraken
Well, that's ominous. No mention of what they consider excessive or how much they might charge. They're essentially saying they can send you whatever bill they want.
I figured some kind of smart download manager and caching system would save the day but frankly I saw Docker as a step backward because I had been doing a really good job of installing 100+ web services on a single server since 2003 or so. [1] [2]
Looking back at it, I'm sure that a short timeout was a deliberate decision by the people running Docker Hub, as people with slow internet connections because telcos choose not to serve us with something better are unpeople.
[1] Nothing screams "enterprise feature, call sales for pricing" like being able to run your own local hub
[2] My experience with docker is roughly: if you can write a bash script to build your environment, you can write a Dockerfile; the Dockerfile is the gateway to a system that will download 5GB of images when you really want to install 50MB of files, so what's the point? Sure, Docker accelerates your ability to have 7 versions of the JVM and 35 different versions of Python, but is that something to be proud of, really?
I agree.
> Sure, Docker accelerates your ability to have 7 versions of the JVM and 35 different versions of Python, but is that something to be proud of, really?
No, but it's not my fault that the python packaging ecosystem is broken and requires isolation, and that every Java project relies on a brittle toolchain. At least docker means that nonsense is isolated and doesn't affect the stuff I write.
edit: Oh, per hour. I thought that was per MONTH. Okay, I can survive with this, but it still puts me on notice. Need to leave dockerhub sooner rather than later.
I'm biased (i.e., co-founder of Depot [0]) and don't have the business context around internal Docker things. So this is just my view of the world as we see it today. There are solutions to the egress problem that negate the need to push that down to your users. So, this feels like an attempt to get even more people onto their Docker Desktop business model and not explicitly related to egress costs.
This is why when we release our registry offering, we won't have this kind of rate limiting. There are also solutions to avoiding the rate limits in CI. For example, our GitHub Actions runners come online with a unique public IP address for every job you run, avoiding the need to log in to Docker at all.
[0] https://depot.dev
Please do elaborate on what those are!
There are always lots of comments like this providing extremely vague prescriptions for other people's business needs. I'd love to hear details if you have them, otherwise you're just saying "other companies have found ways to get someone else besides their customers to pay for egress costs" without any context for why those people are willing to pay the costs in those contexts.
If you don't want to host an OSS repository, just decide to not do that. And this is the first I've heard of it so now it's an emergency to work around this rug pull.
Now for every image I'm going to have to try to find a trustable alternative source. (things like postgres, redis, nginx) or copy and rehost everything.
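For the rehosting route, most of the chore is rewriting image references. A rough sketch, assuming the usual convention that a reference with no registry host means Docker Hub and that single-name images like `postgres` live under `library/`. The function names and the mirror hostname are placeholders, and it assumes the mirror keeps the same repository paths.

```python
def split_registry(ref: str) -> tuple[str, str]:
    """Split an image reference into (registry host, remainder)."""
    first, sep, rest = ref.partition("/")
    # The first path component counts as a host only if it looks like one.
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first, rest
    return "docker.io", ref


def to_mirror(ref: str, mirror_host: str) -> str:
    """Point Docker Hub references at mirror_host; leave other registries alone."""
    host, remainder = split_registry(ref)
    if host != "docker.io":
        return ref
    if "/" not in remainder:
        remainder = "library/" + remainder  # official images
    return f"{mirror_host}/{remainder}"


images = ["postgres:16", "bitnami/redis", "ghcr.io/owner/app:1.0", "docker.io/library/nginx"]
for image in images:
    print(image, "->", to_mirror(image, "registry.example.internal"))
```

The same `split_registry` check is also a cheap way to list which references in a deployment would end up at Docker Hub at all.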
The bigger problem is when projects only officially ship as docker images for some banal reason.
As opposed to what? SystemD?
The only folks likely to feel pain from this change were those either deliberately abusing Docker’s prior generosity or using bad development and deployment practices to begin with. I suspect that for 99% of us regular users, we won’t see or feel a thing.