I'm a Kagi search/assistant user and advocate, but the "small web" product is a frustrating misnomer.
To me the small web is any little website that was created to be interesting rather than to sell me something. That includes stuff like neocities, "shrine" type sites, single purpose sites, fandom portals, web experiments, etc.
Unfortunately Kagi's definition of "small web" is: blog or webcomic. You must have an RSS feed and it must have recent posts. That rules out so much interesting stuff, I don't understand the point.
Expert/auteur websites like Sheldon Brown's (or, one of my favorites, Ask Aaron https://runamok.tech/AskAaron/FAQ.html) are the pinnacle of what's possible with the small web. Today this kind of info ends up in an ad-ridden hosted wiki or locked away in an unsearchable discord.
There are also novelties like https://www.howmanypeopleareinspacerightnow.com/; this probably hasn't been updated in a decade, but that makes it no less interesting.
Then there are exceptionally cool demos like https://thelongestyard.link/q3a-demo/. This sort of thing just doesn't fit in a "blog" format unless you're writing a blog about how you built it and linking out to it.
If anyone knows of a directory of sites like these (preferably with a shuffle option) I'd love to hear about it (and contribute)!
This website is the small web - self contained. It's a really good example of the Internet we had and apparently some still want. I think of it like computer graphics, where your definition of space can get bigger as you add a bunch of resources, each with their own model space, into the relative context of world space. The small web should define how we do that and discover things, not what or how we build within each specific model space.
This is a fairly recent phenomenon: I'm a longtime Small Web user and even I struggle with this massive influx of AI posts. I'm hopeful it will be addressed.
I am looking for something that would filter for sites that rarely post but have good content. The number one problem with most of these systems is that everything favours frequent posting. Even if I do it manually, I cannot keep tabs on many rarely posting sites - this is an obvious example of a problem that we delegate to computers. Favouring frequent posters creates incentives to do that even if quality worsens.
I'd be fascinated by the economics of this from Google's perspective: specifically the unit economics of generating updated-once-a-year results for queried-once-in-a-million searches.
Tl;dr: I feel like the long-tail web (90s) was better, but economics pushed high-update-frequency more-centralized results.
I could definitely see value in filters for "has RSS" and "has recent posts"—maybe even as the default view—but I absolutely agree that this is much less interesting to me without the wider world of interesting, small sites.
On a similar note, I maintain and grow a manually curated collection of personal blogs with valid RSS feeds: https://minifeed.net/blogs
The criteria are simple: human-written (as much as I can validate myself), in English (for now), with a valid RSS feed, and not a micro-blog (so, more than just a feed of links or short tweet-like messages).
Similar to Kagi's Small Web viewer, or StumbleUpon-style viewer: you can get a random listing of blogs [1] or a random listing of posts from all blogs [2]. Feeds and posts are indexed, so full-text search works across all blogs. When possible and permitted by robots.txt, text is scraped for searching, so even if some text is omitted in the RSS feed by the author, search should work.
Though I do plan to implement a similar "view one random post at source" kind of view, soon.
UPD: Feel free to submit a blog, including your own! [3]
[1] https://minifeed.net/blogs/by/random
[2] https://minifeed.net/global/random
[3] https://minifeed.net/suggest
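For the curious, the robots.txt gating described above takes very little code. A minimal sketch in Python with a made-up user-agent string; minifeed's actual crawler will differ:

```python
# Minimal sketch: fetch a page for full-text indexing only if robots.txt allows it.
from urllib import robotparser
from urllib.parse import urlsplit, urlunsplit
import urllib.request

def fetch_if_allowed(url: str, user_agent: str = "example-feed-bot") -> str | None:
    parts = urlsplit(url)
    rp = robotparser.RobotFileParser()
    rp.set_url(urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", "")))
    rp.read()  # download and parse the site's robots.txt
    if not rp.can_fetch(user_agent, url):
        return None  # disallowed: fall back to whatever text the RSS feed provides
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8", errors="replace")
```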
The implicit criteria (tech/business and adjacent) is an issue with all these lists for me. But it's also a personal list, which is great. I just wish literally anyone in these had a personal interest in anything else reflected in their lists, because I keep checking them and being disappointed.
This topic has come up on here before, with others doing the complaining about a list that I liked for this very reason, since it wasn't top-loaded with tech: https://news.ycombinator.com/item?id=47015676
Jokes aside, it's really nice and I can totally see it becoming addictive. Kudos to the Kagi team for another user-oriented product. (As a side note, I use Kagi daily and I didn't know about this tool.)
Yes, SU was fascinating at the time. I kind of like this style of exploring the web; it gets a bit addictive, you spend hours on it but end up finding interesting content and other stuff that you wouldn't otherwise.
I've been using the Kagi search engine for months now and I'm not impressed. I bought into it because there were a lot of posts saying that it was "just like old Google", but this has not been my experience. It's the same as new Google: you can type in exactly what you're looking for and you'll get random, sort-of-related websites.
I remember when you could half-remember a comment from a website, type that into Google, and get taken to the article you were looking for. That was back in like 2010. To me that's the old, and useful, search engine that I want.
I switched about a year ago. At the time it did seem like a step up from Google results. But there's been an increasing prevalence of low quality results. Blogspam, AI websites, etc. Obviously not blaming Kagi here, web search has gotten hard recently.
Is Kagi still better than Google? Probably, I don't really know because I don't use Google anymore. But at this point I feel like I'm with them out of inertia more than being an avid supporter. One of these days I'll re-evaluate Google and decide whether to switch back or not.
It does occasionally surface interesting results from small sites that you wouldn't get on Google. I do find that to be useful.
Kagi definitely isn't a bad search engine by any means. Honestly if you haven't used it, try the 100 search free trial on one device. Maybe you'll like it. This feels more like a general decline of the open web.
I'm glad to see this comment and the parent comment voted so near the top. I've had the same experience. In my experience, Kagi used to be great... then it became good... and now it's "better than Google".
"Better than Google" and the fact that I can choose websites to exclude from my search results are two features that I remain willing to pay for, however.
I'm extremely confused by these comments. Are we all using the same Google? Just to make sure I wasn't crazy, I just did a search on Google and half the page was a combination of Google AI result and ads. Below that there were 2.5 links visible: one Reddit result and two blogspam.
The exact same search on Kagi ('best llm for coding') nets Reddit, Hacker News, and some other forum results right at the top, followed by a long, dense list of links to various sites (including some of the same blogspam, of course), but overall the results are hugely more rich and varied, and also not at all the same.
How can you possibly say that a site that gives you 50% ads and a bunch of low quality links is remotely "only a little better" than a site that gives you zero ads and a huge number of better quality links?
I'm glad to see this comment and the parent comment and the grandparent comment voted so near the top. I've had the same experience.
I honestly would love to be able to give my Kagi key to the ChatGPT or Claude clients (or more realistically, configure a proxy) just to have it be their primary tool for searches—respecting my site rankings/lists
I’m confused by this comment. The original comments talk about Kagi not living up to the hype. You say you’ve had the same experience and wish you could get LLMs to use Kagi for web searches?
Especially odd as that’s exactly what Kagi assistant already does. Maybe they’d just rather use their key than pay Kagi for LLM based search.
On that note, Kagi research is legit amazing. There have been times I've spent 30 minutes searching for something without success. As a last resort I asked Kagi research and it found why I could not. More than one option, even. Now I intend to use it almost more than normal search.
I've been using it for 2.5 years at this point, and have the same experience. I don't think it's hopeless, but Kagi will need to step up their methods. IMO, there's actually a lot they can do here.
I'm thinking of trying out Kagi, but adding another monthly commitment is what's holding me back.
A single credit top-up and occasional usage until the credits run out sounds good to me.
Also, from the Kagi privacy pass FAQ at https://blog.kagi.com/kagi-privacy-pass#faq:
*Do you plan to allow purchasing privacy pass tokens without having an account?*
Yes, this makes sense. This is possible because technically the extension does not care if you have an account or not. It just needs to be 'loaded' with valid tokens. And you can imagine a mechanism where you could also anonymously purchase them, eg. with monero, without ever creating an account at Kagi. Let us know *here* ( https://kagifeedback.org/d/6163-kagi-privacy-pass ) if you are excited about this, as it will help prioritize it.
Personally I don't like being signed in during searches, so this seems like a good solution.
Everyone has to answer for themselves why they would be OK with Google hoovering up their data in order to deliver substandard results, vs Kagi actively working to remove low-quality results all while collecting no personal data.
Yes. I use both (Google only at work) and Kagi is certainly no worse and comes with the massive benefit of simply not being Google. It's worth paying for for that reason alone, even if the engineers at Google are constantly working on making sure I'm tracked anyway.
It’s definitely not Kagi’s fault. The AI slop is simply taking effect and I feel sorry for them. I never expected them to match Google’s quality, but I was impressed with how close it was when I used it a few years ago.
I've been using Kagi for ~18 months and your description doesn't match my experience at all.
Querying for something like "snowflake json from variant?" in both engines: in Google I get a sort-of-right-but-not-really-that-helpful AI summary about the "parse_json" function. In Kagi I get an actually useful summary with code examples of parse_json, but also the colon-based syntax for accessing values inside nested objects without needing to parse anything.
I very rarely need to go into a page, I use Kagi quick search summary with the "?" suffix and it almost always gives me a useful answer in one-shot.
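For anyone who hasn't touched Snowflake, the two idioms being contrasted look roughly like this. A sketch using the snowflake-connector-python package; the table my_table and VARIANT column v are made up:

```python
# Sketch of the two Snowflake idioms contrasted above (credentials elided).
import snowflake.connector  # pip install snowflake-connector-python

conn = snowflake.connector.connect(account="...", user="...", password="...")
cur = conn.cursor()

# PARSE_JSON: explicitly turn a JSON string into a VARIANT, then traverse it.
cur.execute("""SELECT PARSE_JSON('{"name": {"first": "Ada"}}'):name.first::string""")

# Colon syntax: if the column is already a VARIANT, no parsing step is needed.
cur.execute("SELECT v:name.first::string FROM my_table")
print(cur.fetchall())
```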
First of all, the parent comment's point is that Kagi is often praised for being like so-called old Google[0]. So it's only reasonable to assume they only care about the links, not the LLM summary. What you described is even further from old Google.
Second, if you want this kind of LLM-digested search result, Google AI studio blows everything out of water (including Google search, obviously).
[0] I've never bought into the idea that old Google was so much better. But it seems to be a very popular opinion on HN. ymmv.
I see a problem with this. So some guy does hard work developing some technique or solving some problem. He documents his experience, puts up a tutorial on DO or AWS or somewhere else, and the ads on that document help offset the cost of hosting. Now along comes Kagi, scrapes that data, and presents it to you, their paying customer.
Do you pay LLM providers for agentic features? From your past submissions, you certainly seem to. Do those features make web searches and curl the results?
Were the models underlying those features trained on all available web content, or are they unlike any other enterprise models out there?
At any rate, you should see a bigger problem in what Google does, which you don't seem to.
Try g.ai. It's stupid fast and uses Google's indexes. Kagi sometimes doesn't correctly parse intent; in the Google thing you can just ask for a function that does something and it gives you one, with examples and grounding, extremely fast. I've been paying for Kagi since the beginning, and I guess I'd cancel it because it doesn't give me that much added value.
"I remember when you could half-remember a comment from a website, type that into Google, and get taken to the article you were looking for"
It's funny to me that (to my knowledge) no browser (mainstream, at least?) implements this functionality yet. Seems like a no-brainer to index what the user has actually seen... (Could even be restricted based on viewport - I don't think it's that crazy of an idea.)
I know there are a number of third-party programs that do this, though. Of course, multi-device being the norm complicates things.
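As a rough illustration of how little machinery the idea needs: a sketch of a local full-text index over visited pages using SQLite's FTS5 (the schema and function names are invented; a real browser integration would feed it the text the user actually saw):

```python
# Sketch: a local, private full-text index over pages the user has actually seen.
import sqlite3

db = sqlite3.connect("history_index.db")
# FTS5 ships with the SQLite bundled in CPython on most platforms.
db.execute("CREATE VIRTUAL TABLE IF NOT EXISTS seen USING fts5(url, title, body)")

def record_visit(url: str, title: str, visible_text: str) -> None:
    # A browser could call this with only the text that entered the viewport.
    db.execute("INSERT INTO seen VALUES (?, ?, ?)", (url, title, visible_text))
    db.commit()

def half_remembered(query: str) -> list[tuple[str, str]]:
    # MATCH does the full-text search; bm25() ranks better matches first.
    rows = db.execute(
        "SELECT url, title FROM seen WHERE seen MATCH ? ORDER BY bm25(seen)",
        (query,),
    )
    return rows.fetchall()
```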
>It's funny to me that (to my knowledge) no browser (mainstream, at least?) implements this functionality yet. Seems like a no-brainer to index what the user has actually seen...
The answer to this is complicated.
Both Google Chrome and Microsoft Edge actually implement this. Behind the scenes, both will upload your browser history to the cloud. You can see it in network packet captures. It's implemented in the browser for the vendor, but not for the user.
The choice to not implement this for the user is very deliberate. It's contrary to the vendor's interests if the browser provides this capability directly to users. If a user's browser can take you to a website directly, then you are not using the vendor's search engine, meaning you are not looking at their ads, paid search results, algorithm, etc. It would severely impact their business model.
This is also the reason why browsers have:
- Adopted Google Chrome's "Omnibar" instead of a separate address bar and search bar.
- Implemented only basic hierarchical organization for browser Favorites.
Directly and indirectly, Google is the central nexus of all modern browsers. Aside from Google Chrome, they also:
- Fund the vast majority of Firefox.
- Pay Apple for preferential treatment.
- Provide the same mechanisms to vendors who base their browsers on Chromium (e.g., Microsoft Edge, Brave).
I would love for this to not be the case. There is hope to be found in small independent browser and search companies/projects.
Never thought about this, but it makes sense: they don't want better local search, just for users to rely more on their product. It's messed up - so much time and human potential wasted on poor search and ads.
> Adopted Google Chrome's "Omnibar" instead of a separate address bar and search bar.
On the other hand, the additional tools in the Omnibar (the calculator is the example most should be familiar with) make the bar incredibly useful for random daily tasks.
Also, it seems that there is an "omnibox" API that extensions can use, which allows them to add their own tools to the omnibar/omnibox. Would be interesting as a form of "assistant" in a way.
I'm fairly certain I've caught Firefox doing something similar (regularly sending multiple tens of MB to Google servers in the background).
So fwiw, browsing history shouldn't be anywhere near that big, making it unlikely that's what it was. It compresses well, and if they were to do it, I'm sure they'd do it at regular intervals instead of a year's worth at a time, etc.
And, of course, Firefox is open source and this wouldn’t be kept a secret.
In which case I'd love to know what it was doing sending that much data to Google IPs when I don't use Google services...
I've read all the Mozilla help pages about what automatic connections Firefox makes and it wasn't accounted for there (unless possibly something to do with SafeBrowsing.)
I wonder if the EU could fine them a couple weeks of revenue for this. Seems illegal.
It's not BS for the people who don't understand the dark patterns that guide users to enabling all of this stuff. That's everyone with a Windows PC who didn't bypass the Microsoft account requirement and went with all of the defaults in Microsoft Edge. Everyone using Chrome Enterprise/Education whose Google Workspace admins don't want to get into trouble for not backing up people's stuff (i.e., sharing it all with Google). Same goes for company Windows PCs set up with Microsoft Entra ID. It's everyone with an Android device and a Google account who wants their settings backed up or transfers to a new Android device. It's in the fine print and legalese for all of these products and services.
> Both Google Chrome and Microsoft Edge actually implement this. Behind the scenes, both will upload your browser history to the cloud. You can see it in network packet captures. It's implemented in the browser for the vendor, but not for the user.
Citation needed... (I'm talking about the page *content*, not the metadata like url and title)
There are things like Mymind (SaaS) or Karakeep (selfhosted) that do this, though they require you to explicitly save the pages instead of indexing everything by default
I would really like to see integration between Karakeep and SearXNG so that I could combine online search engine results with my self-hosted bookmarks serice.
Did even Microsoft try something like this? It's of course something you'd only want running locally.
I think the joke is that Microsoft did do something very like this -- they call it Windows Recall -- and it got a lot of angry pushback. (Partly, IIRC, because the specific way they did it initially was very bad in terms of security and privacy, but I think a lot of people quite understandably don't trust them to implement it (a) the way they claim they do or (b) competently, so even after they made a bunch of changes aimed at making it less scary it's still viewed with a lot of hostility.)
https://github.com/openrecall/openrecall
Which company would you trust with this kind of deep surveillance information on you, though?
I guess because it isn't trivial for a web browser to do, indexing every text ever rendered?
That's totally fair, though I personally don't share your experience. It could be that we just use search for slightly different reasons.
One of the reasons I love Kagi is that it respects double-quotes for exact matches. This might seem trivial except I remember being frustrated with both Google and DDG years ago for throwing irrelevant results at me even when I'm querying for an exact match. When Kagi was in beta and I got invited as an early adopter, my feedback to them was that I want a search engine that won't throw crap at me when I'm looking for an exact string match. They've honored that feedback! Even though Kagi doesn't necessarily have the most results, I want double-quotes and things like intitle to actually work as expected.
Another awesome thing about Kagi is how it lets you prioritize certain domain names. Likewise, it's great for blocking domains completely. All of this has made my search results very clean.
To each their own. I'm not saying you're wrong, but to me there's no comparison between Kagi's results and every alternative I've tried.
Oh, another thing I like about Kagi is that it's less censored than Google, Bing, and DDG these days. I used to be a fan of DDG until I noticed that results were sparse or nonexistent for anything even remotely controversial I queried. It became too PG-rated.
I’ve been using kagi for about eight months now as well and at least in Europe it’s a significantly better search engine than Google by a long shot. The results are significantly more accurate. I don’t get listicles I don’t get AI spam. I get what I’m searching for, it’s refreshing.
The assistant is a nice addition but it’s search is superior for me.
The only thing that seems to have gotten a lot worse is the trend of AI articles, which isn't Kagi's fault, but it would be nice if they could figure out how to filter them. They all follow the same pattern: "specific thing you want", with a table of contents full of repeated chapters and unrelated information, spattered with effectively random images.
They're starting to, with their SlopStop initiative. Sites that are mostly AI content get flagged and deranked. Still not perfect, and I think they only just started working on the backlog of reports, so hopefully it keeps helping.
I've been using it for over 2 years now. I'm quite happy with it. I like that I don't see ads and my searches aren't being used to target ads against me.
I really loved Kagi and was a paid customer for close to two years. But sadly this year I won't be renewing my plan.
Kagi made search feel just "right": it was simple, got the job done, and had some really simple but cool search features.
But over time they started doing way too much, and I kept seeing more and more features that I really didn't want. It felt like I was paying for all this while I just wanted to type something into a text box, click search, and see a bunch of results organized according to my filters.
I wish they would just dump all the other nonsense projects like ai and just focus on search only. Or give me an option to pay for search only without any limits.
I do agree, actually, but I'm sticking with them. Their mission of ending slop and their pushing of AI tools seem at odds. On one hand they're marketing to the anti-AI crowd while also joining the AI hype? It's weird.
As I'm not familiar with how Kagi is "pushing ai tools" this is mostly a comment on the framing of your question.
Are you really saying that a company specializing in search - natural language oriented at its core - should not make use of the biggest technological revolution for processing natural language?
This is hard to evaluate, but we cannot replicate the old web search experience (not just Google's, but AltaVista's, Lycos's, or Yahoo's) when most of the web is siloed and increasingly botted, simply because the stuff you see in the siloed internet is actively "protected" out of your control.
Perhaps the best we can do is this "small web" thing, which can be seen as some sort of retrofuturistic solution. But of course the siloed internet is a black hole of content and effort, and of course if the small web gets enough traction, astroturfed generative AI content will target it.
I’ve been on Kagi for over a year and I’m pretty happy with it. At the beginning there were some noticeable differences in results that frustrated me, but at this point I don’t really miss Google except for some of the nice “not web site results” features like calculation and conversion. I mostly go straight to Wolfram Alpha for those now. And for a lot of the “random curiosity satisfaction” stuff where I would have preferred Google results, I’ll now just use ChatGPT or Gemini.
I switched a few years back and it definitely works like pre-AI, pre-index-degradation search for me. But I definitely understand search is very user-specific, based on how you search and what you are targeting.
Throw a question mark on the end to invoke the AI summary results, and I find you can get the thing you're looking for as a reference right away. I've used this to dig up forum posts that are over a decade old multiple times with success. Asking the Kagi Assistant for a list of possible links works pretty well too.
Also on Kagi if you see bad results, you can flag the website to ignore it.
For needle-in-the-haystack searches, I find longer quotes work really well (in Kagi or Google).
Kagi value proposition for me is not the $5 search but the $10 search plus whatever AI chat model you want (I originally did ultimate when I used it for coding). Controllable search and chat satisfies all my one-shot needs.
I can't really blame Kagi for the web getting bad or for the weak market for secondary search. Part of me wonders if they could use the AI search tools now on the market (now getting lots of investment) instead of the human indexes (subject to monopoly control).
People say that, but on the other hand companies like Google have a lot of much better ways of categorizing things now than they did in the past. I'm not sure I buy the excuse of "gosh, it's just too hard for us :(" from this international company worth trillions employing geniuses.
It really feels either intentional or egregiously incompetent. [1]
[1]: [The Man Who Killed Google Search](https://www.wheresyoured.at/the-men-who-killed-google/)
How come yandex.com can show me results that contain my search term? Most egregious example: I am searching for the name of an abandoned Blogspot domain; Yandex shows me 1 result, which is that domain. Google shows me the "no results" fishing monster. Blogspot is a Google service!!!
I think it's completely unreasonable to assume that anyone would beat Google at the search game, by outgoogling them.
The reason that Google is not like it was back in the day is that they are fighting a massive, antagonistic industry designed to game Google. The reason that ChatGPT et al. improve on search is that there's an effective but very expensive compute layer on top, not that they are better at the Google game. (This extra layer works out fine because our time is more valuable and Google always came at an insane discount, also thanks to ads.)
The good news is that Google search results have degraded so much that competitors like Kagi can compete directly. I moved off Google search completely on all devices ~1 year ago and I don't miss it at all, most of the time I forget I have a kagi subscription.
I think that's the problem. I used to find it far superior to google. Now, there are a lot of queries where I am unimpressed with the results and end up trying google just to get better results. (like I used to do with DDG)
I've had a few experiences now where someone is standing over my shoulder asking me to look something up, and I search kagi, find nothing, then search google and find what they asked me to look up. Then when they ask "what was that other search engine you used first?" I don't feel compelled to vouch for kagi :(.
I won't add links so it doesn't look like I'm spamming or promoting a service (though I am, but it seems in line with what you're talking about), but there's a product I've built with my wife which has made things a little bit better in our experience, because it gives you an option to choose different providers/indexes and thus tailor results to your personal preference. You can find it from my personal website (my username . com).
It's hard to judge one's personal experience with "personalized" search engines. I have personalized search turned off for Google so Kagi is a much better experience for me. I'd recommend leaning more into their feature to lower/block sites from your results, which with Google would require an extension for a similar but degraded experience.
> I remember when you could half-remember a comment from a website, type that into Google, and get taken to the article you were looking for.
Is that even possible today considering there is so much more information and pages around today than in 2010? Old google worked with old Internet. The old Internet does not exist.
I typed in my dentist's full business name and location, "<name> family dentistry <city> <state>", and it was still #5 in the results. I still, out of habit, tapped the first link and called that number instead. It's ludicrous. In 2010 that would have been the top hit, next to the Wikipedia page on dentistry.
For over two years I’ve maintained the practice of using Kagi and falling back to Google if I couldn’t find something. I can count the number of successes doing that on one hand. In the meantime I get to support a company which actually respects me as a user and isn’t doing things like tying accounts to browsers, AMP (trying to take over the web), trying to kill adblock, etc.
It probably doesn't help that they're constantly bifurcating their tiny team into new projects. Their browser is essentially nonfunctional for daily use, but they've already moved on to porting it to Linux.
I have had a great experience. I can find what I'm looking for and I can block or down-rank sites that are constantly shite.
I did find that Google over the past few years has sucked, but my Google results were always miles better than most people's until a couple years ago.
It's interesting to hear that you can't find what you wanted easily on Kagi.
I am very impressed. Kagi manages to maintain Google-par quality or better most of the time, whereas DDG became an unusable slop pit a few years ago. I'm a very happy customer and happy to keep paying for both Kagi and Orion, in part on principle and in part because the product actually works very well for me.
I don't even use the AI assistant much, only when there are a lot of disjointed search results and I want a quick summary.
Yep, that was my experience too. It wasn't bad necessarily, but certainly not as reliable / dependable as Google, and not worth paying for.
Could just be that I’m familiar enough with google to always be able to make it work for me, could be a frog in boiling water type situation, but… as much as Kagi gets talked up on HN, I was pretty disappointed when I tried it. I was ready to get blown away, and instead I was underwhelmed.
The first random page it returned to me was this — https://gaultier.github.io/blog/how_to_make_your_own_static_... — which was about building one's own static site generator, which I really liked. I did not realise when I closed that page how hard it would be to find it again, because, of course every new visit to Kagi returns a different page :-)
yeah, same happened to me, the first site I was sent to was a list of people sending in random "sunday thoughts" (or whatever it was called) on (actual physical) postcards which then got scanned and posted. There were some good things in there. Now I can't find that site again because I didn't realize it was randomized...
It refreshes every 5 hours and shows you the most recent blogs published on Kagi. Check it out!
https://kagi.com/smallweb/?url=https://pliutau.com/reading-l...
> This page is auto-generated from Github Actions workflow that runs every day at night and fetches the 5 latest articles from each of my favorite blogs.
I do love the concept, but a little part of me died each time I came across an article with a very strong AI voice. That just feels antithetical to the 'small web' ethos because it obscures the 'neighbor' behind it.
I like the idea, but would like to be able to select a language and see the small web of that language. There are more languages than English, and this tool could make them thrive.
Also, if they are clever, they could somehow use this for those translation systems they are using, but please let us select our own language without forcing automatic translation on us like YouTube does.
I think the problem is that it's hard to curate feeds in a language you don't understand. I've been building an uncurated index of OPML blogrolls, with no language restriction. The OPML blogrolls are curated by their owners, so someone decided they met some inclusion criteria, but the overall list is uncurated.
https://alexsci.com/rss-blogroll-network/
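Since OPML blogrolls are plain XML, aggregating them takes very little code. A minimal sketch with the Python standard library (the function name is invented):

```python
# Sketch: pull feed URLs out of an OPML blogroll (OPML is plain XML).
import urllib.request
import xml.etree.ElementTree as ET

def feeds_from_opml(opml_url: str) -> list[str]:
    with urllib.request.urlopen(opml_url) as resp:
        tree = ET.parse(resp)
    # Each subscription is an <outline> element; the feed URL lives in xmlUrl.
    return [
        node.attrib["xmlUrl"]
        for node in tree.iter("outline")
        if "xmlUrl" in node.attrib
    ]
```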
Does it work for you guys to go to about and then click on the "list" link?
For me it says I'm blocked due to hitting a "secondary" rate limit (don't understand what that means). I don't think I've opened a page on github yet today so clearly it's a lie. Is it the referer that triggers this?
In general, freeloading the "small web" on a Microsoft service is kind of ironic. Being blocked by algorithms that try to detect if you're really human is precisely one of the things one would hope to get away from by using small, personal websites
No scrapers running on my IP address btw, at least not since it was assigned to me ~10 hours ago (I'm in one of those countries where ISPs seem to have agreed amongst each other that IP addresses must change daily so you can't reliably host things)
Yeah, many links in the embedded blog posts don't work either, presumably because the target website doesn't allow embedding. On mobile I always have to open them in a new tab for them to work.
No, it's the website owners setting a specific header (X-Frame-Options to SAMEORIGIN), as it prevents someone else from embedding your website and phishing for user credentials.
No browser prevents that by default, but this tip is found in pretty much every "best practices" hosting tutorial, so it's very common to stumble upon that browser error in the wild.
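For illustration, the kind of thing those hosting tutorials recommend, as a minimal Python sketch (a real deployment would set the header in nginx, Apache, or the hosting platform instead):

```python
# Sketch: sending X-Frame-Options so other origins can't embed your pages.
from http.server import HTTPServer, SimpleHTTPRequestHandler

class NoFramingHandler(SimpleHTTPRequestHandler):
    def end_headers(self) -> None:
        # Browsers will refuse to render these pages inside a cross-origin iframe.
        self.send_header("X-Frame-Options", "SAMEORIGIN")
        super().end_headers()

if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 8000), NoFramingHandler).serve_forever()
```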
Personally my favorite spiritual successor to StumbleUpon has been cloudhiker.net. I found Kagi's to be too personal-blog focused for my tastes. I love that Kagi is doing so much of this out in the open, though.
Same. I got an influx of bots from Singapore (around 50 visits per day) and in figuring out what's up with traffic, I noticed kagi as a reference for the first time.
Weird times. People are training their LLMs on my content yet people are still interested in technical content written by a human being. So I guess you just keep writing, right? I find it disheartening to know I'm training LLMs but I think I'm more encouraged knowing there are still humans reading it.
Ah, this might explain the traffic from Kagi a week or so ago. I've been scratching my head over that one. I just checked, and my wee little blog is listed in smallweb.txt. Neat!
Curious what goes on behind the Next Post and Show Similar buttons.
Bit bummed. The first random page I landed on was a really interesting article for me. The custom cursor (well, why not) had me struggling to follow a link, and instinctively I refreshed the page. I ended up somewhere else in the haystack with ostensibly no way back to that particular article.
Perhaps I'm yelling into the void here, but what would be great is if, when first landing at kagi.com/smallweb, the url query parameter were somehow set, as it is when "Next Post" is clicked.
I think it would, so long as the redirected URL with the search parameter was diarized into browser history. It would however introduce a behavior change that may be undesired (users need to know to press "Next Post" instead of refreshing).
In any case, my Kagi search for the article containing the memorable phrase "rare as rocking-horse s*t" came up empty. Perhaps it's not yet been indexed.
First impressions: My first five pages were stallman.org, a paywalled cybersec newsletter, a German-language blog, an AI-generated blog post ad for a cattle fencing service, and a blog republishing a Disney Parks press release
How do we keep getting surprised by enshittification!?
The worst case scenario is that AI runs everything, we have no skills, and are completely dependent on it...and it shows us crummy commercials and subtly steers us to paid placement with no recourse whatsoever. I hate this possible future, but this is where the money will lead.
> "It would hurt consumers, and we'd have to think about what we'd have to do about that, but that's really the last of our concerns right now."
I think it shows the limits of hand curation. It's a tiny, human-reviewed slice of the "small web", only allowing a subset of blogs... but if you select the "programming" category and click around for a short while, you get a fair amount of obvious AI slop.
I don't think it's Kagi's fault, but I guess it's depressing in a way. A lot of "small web" bloggers dream of being a part of the "big web", and when they get a cheat button, they have no second thoughts about mashing it.
When I first clicked through, I got an inane essay about how people you love are like Bitcoin. At least I knew it wasn't written by an LLM due to the misspellings and simple errors in thought, but I wondered why that article was on the front page of HN.
I run a Hugo blog and I get more interesting referral traffic from Kagi's small web index than from Google at this point. 5,000 curated sites is small enough to be useful; most "indie web" directories are graveyards, unfortunately.
So, basically, a random site from their index of ~30,000 sites.
You can choose similar sites by index.
But what the criteria are for having your site listed here, how it will prevent this from just becoming a massive gamified advertising index, or anything more about "why these?" is not obvious to me.
Can anyone explain what is special about these sites specifically, or where this project is going?
A bit off topic, but I noticed I hardly ever use search anymore. It's just google.com/ai in 99% of cases. I believe that in the future, search engines must go in this direction.
Can we just agree that the internet is broken and no amount of boutique search solutions will save it? Kagi, DDG, Google: they are all trying to search a pile of steaming sh*, in the hope of finding that shining diamond.
Quite possible that people will come up with a solution eventually. Like Samizdat was a solution to censorship and a broken publishing system in USSR.
Heavy Kagi user here, and the idea behind the small web was appealing, but how it's implemented doesn't click with me.
Their rules exclude an absolute gem like https://www.sheldonbrown.com/ which is, to me, the essence of what we could call the "small web".
Each time the topic pops up, I try a few random ones and never find anything interesting.
https://github.com/kagisearch/smallweb/blob/main/smallweb.tx...
There is also Small Comic:
https://kagi.com/smallweb/?comic
https://github.com/kagisearch/smallweb/blob/main/smallcomic....
And Small YouTube:
https://kagi.com/smallweb/?yt
https://github.com/kagisearch/smallweb/blob/main/smallyt.txt
https://hcker.news/?smallweb=true
https://news.ycombinator.com/item?id=46618714 (Ask HN: Share your personal website, 2414 comments)
Feels like your comment saying it was too much effort to cancel Kagi took more effort than cancelling Kagi.
If you don't use the service in a month, they just refund you. This has kept me from unsubscribing for years now. Some months I use it, some I don't.
It's more of a hassle to unsub, and re-sub again when I want.
I've been using Kagi and it's fine. Better than DDG for sure. But sometimes I still go back to Google to find something Kagi is struggling with.
I'm using Qwant now and I feel it's better.
And yes, Google's founders were right that web ads would kill that experience you want.
The main usecase for Kagi is the fact that you can personally uprank/downrank/pin/block sites. And it has a bunch of creature comforts built in like:
- Attempting to detect AI slop, concatenating listicles ("10 best ...") under one search result header
- Attempting to block translated Reddit results
- Custom lenses that search only coding resources or recipes or whatnot
- Redirects (so x.com > xcancel.com), although I feel this should be a browser feature
- Better translate than Google
There's probably a few things I'm forgetting.
Kagi is abysmal at image search though. Just assume you will have to use Google for that.
Previous posts: 7-Sep-2023, https://news.ycombinator.com/item?id=37420281 (185 comments), and 23-Feb-2024, https://news.ycombinator.com/item?id=39476015 (36 comments).
There are a surprising number out there: https://blog.woblick.dev/en/2025/best-stumbleupon-alternativ...
http://cloudhiker.net
https://www.offscopes.com
Newsletter version if you prefer: https://randomdailyurls.com
Curious if there are any statistics on which topics people are writing about.
https://blog.kagi.com/small-web
:-)