I feel like AI has gotten to the point where the message is: If you want to make something (art/code/music/writing) you can do it for your own enjoyment, but you aren't allowed to make money from it anymore; only the large corporations can make money from content. If you do release something creative, it'll just be fed back into the machine to be copied over and over.
As someone who simultaneously makes music professionally, and works in IT professionally, it has been really interesting watching GenAI unfold, and the diverging cultures around it. It is almost like the world is splitting into two "societies":
1. One that loves AI + Big Business + very fast Innovation and disruption
2. One that loves Artisanal work + Small Business + slower but more sustainable innovation
I personally prefer living in #2, but I can totally see both "societies" continuing to exist and develop in their own ways.
Of course, there is always the reality that different societies end up interacting and affecting each other.
I am waiting for the online reification of this split with bated breath so that I can fuck off to society #2 and never have to interact with society #1 again.
I'm not too worried about it because the first segment of society is doomed to be 'good but never great.'
AI lacks the ability to identify greatness because it's trained on the output of the average person who also lacks this ability.
It's going to create a new elite class of people who have good taste and the masses who have bad taste. Many current elites will end up with the masses. They may retain their wealth on paper, but theirs will be a cheap, low-quality existence that they are convinced is luxury.
I think everyone will get what they want, but not everyone will get what they need.
At least for art - I don't think you'll find anyone who actually enjoys art hanging up anything produced by AI on their walls. For these kinds of "customers", they could equally easily frame & hang up a poster of the Mona Lisa. Artists are not under threat; if anything, AI makes original artworks more precious & enjoyable.
My worry is that, at least among the artists I know, many kept themselves afloat early career by doing commercial freelance jobs like illustrations for local events or companies. Those kinds of jobs might largely vanish.
On the other hand, with the internet inevitably becoming swamped by AI generated content, I can definitely see a de-digitalization of art moving into offline spaces. At least for independent work, you don’t necessarily need mass appeal or exposure, but rather access to individuals and small groups with an actual willingness to pay for art.
That's assuming that the only market is stuff people are hanging up. The games industry, one that already takes advantage of its workers, is going to love this to the detriment of really passionate artists who love their craft and industry.
GenAI is going to be great for indie games. Solo productions are much easier to produce and will only get easier as tooling improves. I sort of see this as a Spotify moment, I guess: a democratizing force that will allow many more people to get paid for their art, but with much less job security and often as a second job. Whether that's a good thing is certainly up for debate, but I think as a consumer it's probably good for me.
I imagine it'll take a functional legal body to do this (i.e., maybe Europe), but I think there should be a legally binding set of metadata you can attach to images to specify that they must not be used for training (with real penalties if companies are caught).
This content must not be used for training or refining LLMs. If it is, rest assured that if and when the regulatory environment around training data shifts in any country in which we have legal standing, we reserve the right to sue.
Maybe even with a class action element: any lawsuit stemming from a violation of this license shall cover all other violations at the same time.
I don't understand the endgame here. Websites let Google crawl their content in exchange for traffic. If Google cuts that out completely, what incentive do websites have to not block the Google crawlers?
I understand that Google is feeling an existential threat from other AI products that provide answers directly. But they must also understand their symbiotic relationship with the web.
The end game is the consumer no longer leaving Google and the web becoming synonymous with Google for them. Why shop on some random website when you can have Gemini buy it for you? Why look for information on Wikipedia when… you get the idea.
I think the coming years will be pivotal for the web. Facebook attempted a similar strategy back when their apps got traction, but they ultimately failed. Let’s hope Google fails too.
What I really don't understand is where the next generation of training material will come from. If websites stop being published and/or crawled, how will the machine continue to be fed?
> If Google cuts that out completely, what incentive do websites have to not block the Google crawlers?
Completely, yes, that destroys the incentive. But they can reduce it 80% or 90% or so, to the point that it's just barely worthwhile to allow their crawlers.
You will be kept inside the Google ecosystem the same way people are kept inside Facebook.
I’m curious how they plan to generate new content in the future, because it seems obvious that simple web pages will become obsolete and eventually stop being filled with fresh data.
It will probably end with a warning every time you click a link, something like: “You are leaving to an external unsafe site.”
The impression I get from Google's own marketing material is that Google doesn't believe in "the web". And it hasn't believed in the web for years.
Think about it. Pretty much every time they show a search box, it's someone asking for directions to a physical place, what hours it's open, etc.
The greatest thing about the internet is that it has removed distances around the whole world, but Google's major value proposition seems to be that... it can accurately index and query information about local businesses?
We ceded the job of getting traffic to our websites to Google long ago, mostly because Google was so good at it that the alternatives became significantly less useful.
Now that Google is focusing on becoming 'self contained', so to speak, we should find a better way to drive traffic to websites. Ideally one that's not under the control of a single corporation.
I know this is likely to do with the nature of the problem, but that hasn’t stopped us from getting some wildly-unsuitable decentralised nonsense in the past.
Does a move like this give more power / value to websites like reddit? A link aggregator that is organized is much more useful for finding new websites.
As a website owner I have seen major upticks in viewership myself, but it really hits hard when you see an AI summary that is wrong and your site's there. The whole AI-for-everything push will unfortunately downskill the world, I fear, and nothing can be done about it.
I feel this. I asked a developer today a question about how our product is programmed to handle something, and he just sent me a summary from the internal AI assistant they've started using.
He used to provide really good, thoughtful answers, but now it's just copy/paste from the AI.
> He used to provide really good, thoughtful answers
This hits hard. There’s a senior engineer at my job who is known for well written proposals. Today he shared a doc that had the typical AI formatting, was hard to read, and clearly not his style.
On the other hand, if others use AI to summarize stuff, does it matter anymore?
I have a co-worker who does this now. He's very smart, very capable, very experienced and it's clear that he's just a frontend for Claude now. It's tragic.
Maybe organize to give these workers more equity or rev share instead of just a wage so they care more for quality results instead of the behaviors they’re evaluated on and you’ll find them more pleasant to work with.
Nearly every iteration of Google's innovation has made the web worse: from websites chasing SEO with low-quality garbage, to sites plagued with AdSense ads & popovers, to now stealing from websites and selling that data. The web was more readable as a link ring on GeoCities.
These kinds of declarations rarely make sense to me because they don't seem to model the issues in the way that I see them. I have dual roles: one as a person who writes a blog (a "content producer" in our present parlance) and one as a user. As a user, I want my browser user agent to act on my behalf to display web pages, and I want my search agent to extract information from numerous sources and synthesize them with appropriate sourcing.
One could argue that my content production being a hobby lets me be pretty blasé about being intermediated by a platform. That is somewhat true. If I relied upon this as a living, I would probably also conclude that actions that harm my way of living are a war on "the web", though realistically any neutral party observing must conclude that if it is a war, it's one on my kind of participation in the web - content creation for the purpose of revenue / notoriety / some other reward.
As a user, I don't actually care very much for each website and its creator. The information contained therein is useful to me, but the heterogeneity of these sites is mostly an obstacle to the information. I am much happier when my search and summarization agents are able to accurately synthesize what these websites say, in so much as such a synthesis allows me to model reality more accurately.
So I could be convinced that this change from Google makes it less likely for accurate content to be created and that I'll be misled more often. But this is a tool, and my world-model will frequently be tested by reality. If the search-and-synthesis machine fails to produce useful outcomes, I will know. And I'll have to adjust the way I treat knowledge I obtain through it so that I don't get catastrophic outcomes. But that's the same already. I don't really know that Google's search results are not planted ones calibrated to change my opinion. And I don't know that they don't collude with the Internet Archive (with whom they have a pre-existing relationship) to make it look like their constructed consensus is real.
As a user, I have to make a lot of decisions already, and having to painstakingly read search results to synthesize them myself is far less useful than using an agent. So if there is a war on the web, then I am glad to join it, on the side against the web.
I...have to agree about siding against the web...An optimistic part of me sees this as a move that pushes in the same direction that the "web" has already been going in for a long time - preventing users from getting the right information in an honest and efficient manner, preserving their attention budget, and choice. Until now, it was through increasing the noise to push monetary incentives, and now it's by cutting the noise to push monetary incentives. Why optimistic: up till now, there was no single enemy, and it was hard to fight a (somewhat) disjointed system; now, Google is positioning itself to push things further for the worse, with them (and a small number of other companies) being the clear target.
My hope is that this will help overflow the proverbial glass for an increasing number of people and we'll start pushing back towards the "old" web before Google and ad networks have transformed it, or find new modalities of interacting more freely with each other, and the content.
It's not going to be a small or easy fight, though...to a large extent, it's a fight against the current state of capitalism itself, and winning back our attention, critical thinking and choice.
I would feel more sad about this if the web wasn’t so rotten to begin with. On average, any random site is just trying to throw ads at you and harass you to subscribe and such.
I have a particular disdain for “subscribe to our newsletter” modals. Especially when I’ve spent a sum total of less than 3 seconds looking at the webpage.
How such modals aren’t considered pop-ups is beyond me.
That rot was the direct result of the ad economy that made Google all of its money. Now maybe if they hadn't done it then somebody else would have, but they did do it, and poisoned the well we all drink from.
Out of my countless WWW experiments, the website I made for myself turned out to be the most enjoyable. Technically it is a blog with links, quotes, categories, tags and search. Sometimes I download all the pages it links to (tens of thousands).
Google dropped it from the index long ago. I had a fun discussion with some Google folk where they kept arguing my website was designed wrong and that some pages had too many links.
Basically, if you write an article about the largest banana companies, you have to choose which to link to!
The 10 best movies article is better than the best 100. If you make a list of all the movies you've seen your page gradually turns into something really bad. Others will be punished for linking to it but only if you add the nth entry.
As the website is just for me, it is clearly their loss, not mine. No way I'm going to consider linking to only a subset of patents or research papers.
At one point the web was drowned in "listicles," low-effort sites made to match queries like "best movies from the 90s" or "new music in 2023". Google attempted to downrank these sorts of sites because they were in general very very low quality and were just designed to catch a lot of traffic and display ads alongside low-effort content. (Think one page per list entry, each page transition is a whole new set of ads.) Users disliked these. It sounds like your site was misclassified as a low-effort listicle site.
I'm sure it is really hard to run a search engine that size. I have ideas for how they could improve, but it isn't my job. They chose to populate results with big websites, which is probably good enough for most users. The problem is that there is now no point in creating websites, which is terrible for Google. If it picks up the domain and (against all odds) deems it worthy of traffic, it can be blacklisted at any time.
I'm not even sure this is bad anymore. The web is so overrun with SEO crap that it could probably use the cleansing that comes with Google's abandonment, Usenet-style.
> De-googlifying your mental apparatus becomes more urgent today. Find other search engines, don’t use the Chrome browser. Or wake up in a slopified AOL kind of environment where your access to information is limited to what Google’s synthetic text extruders deem relevant.
Everything is probably re-traceable fairly easily because Google Analytics is on nearly every web page.
But I understand maintaining your own source of archives, videos, documents, etc.
Sounds like a good vibe coding project actually.. to try and keep it all organized offline.
I guess the extra insult is that the summaries still suck. I feel like every time I google a technical question, I get something wrong which references a youtube video watched by 30 people about an unrelated subject.
It is interesting to look at the past predictions on here of AI search/answer companies like Perplexity possibly dethroning Google search and comparing to the reactions of Google just doing the same thing themselves.
Why would it be good if Perplexity does it, but bad if Google does it? What are the principles at play here?
Perplexity does not control who gets traffic on the internet. They don't own a significant percentage of the mobile OS, browser and online search market share. They can't force the industry in one way or the other, consequences be damned.
People don't like Google because it's bad. If competition wins, maybe they'll stop being so bad. But if they become badder themselves, that's not good.
I'm confused about how the strategy works in the long run. If fewer people are incentivized to build websites on novel topics, there will be less content in general and less training data... plus AI overview results see fewer ad conversions and therefore less ad revenue. What's the long game? I get that the paradigm is changing, but this seems like it's not going to help them maintain their dominance.
A) Google will do a good job of this, people will find their summaries more useful, and the web will evolve into a more closed system that better serves its users
or ...
B) Their gated AI community will suck, and people will start using a different search engine that better serves its users.
My money isn't on A), but they do have a lot of clout so I wouldn't rule it out.
Kind of curious how it would pan out, if there was a government enforced meta tag one could add to signal what the data could be used for - for example “no-ai”.
That would allow people to still let Google access their site, but restrict its usage. Similar for open source projects on GitHub, etc.
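For what it's worth, here is a minimal sketch of how a well-behaved crawler might check such a tag before using a page for training. Everything about the tag is an assumption: no standard `<meta name="usage">` exists, and both the name `usage` and the `no-ai` token are invented here purely for illustration.

```python
from html.parser import HTMLParser


class UsageDirectiveParser(HTMLParser):
    """Collects directives from a hypothetical <meta name="usage" content="..."> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {key: (value or "") for key, value in attrs}
        # "usage" is an invented meta name, not part of any real standard.
        if attr_map.get("name", "").lower() == "usage":
            for token in attr_map.get("content", "").split(","):
                token = token.strip().lower()
                if token:
                    self.directives.add(token)


def may_train_on(html: str) -> bool:
    """Return False if the page carries the hypothetical "no-ai" directive."""
    parser = UsageDirectiveParser()
    parser.feed(html)
    return "no-ai" not in parser.directives


page = '<html><head><meta name="usage" content="index, no-ai"></head><body>Hi</body></html>'
print(may_train_on(page))                            # False
print(may_train_on("<html><body>Hi</body></html>"))  # True
```

The parsing is the easy part. Like any opt-out signal, the tag only matters if crawlers choose to read it or a regulator penalizes those that don't.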
The tech giants already violated existing copyright laws when scraping for AI content and faced very few consequences. So far the government has shown an inability to enforce anything.
The thing everyone needs to ask before advocating for something "government enforced" is "what would happen if this was in the hands of a hostile government?"
And then remember that (a) just because it's not hostile to you today, doesn't mean it won't be tomorrow, and (b) one man's "hostile" is another man's "utopia."
The thing everyone needs to ask before advocating for laissez faire is "what does a hostile and monopolistic search engine giant like Google gain from us doing nothing?"
And then remember that just because Google is not hostile to you today, doesn't mean it won't be tomorrow.
Step 1: Be really lax in enforcing compliance with it so that nobody complies with it.
Step 2: Abruptly switch to iron-fist enforcement where suddenly people get jail time for violations, but only for entities that have been critical of the government.
This is by no means the only or most likely way, just what I could come up with in 30 seconds. There may be much better "evil government" strategies.
If Google stops driving traffic to websites, won't those websites stop allowing Google to crawl their pages? The pendulum might be in motion, but it seems like there should still be some natural equilibrium that it's heading to.
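Mechanically, "stop allowing Google to crawl" usually means a robots.txt rule. A small sketch using Python's standard `urllib.robotparser`, assuming a site that disallows the `Googlebot` user agent and allows everyone else (the `example.com` URL and `SomeOtherBot` are placeholders):

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that blocks Googlebot everywhere and allows all other crawlers.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

print(parser.can_fetch("Googlebot", "https://example.com/post"))     # False
print(parser.can_fetch("SomeOtherBot", "https://example.com/post"))  # True
```

Note that robots.txt is advisory: it only works against crawlers that choose to honor it, which is part of why the equilibrium question is not purely technical.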
They don't think they really need any more content outside of a few deals they can cut directly with publishers. And they already have YouTube, which produces limitless free content for them to use as they see fit. My blog from 20 years ago, or indeed all of our blogs today, are not something Google feels will be any loss to their product.
Someone will search for "Kylie Jenner" and they will get some kind of shopping opportunity (with Google getting a commission) and links to her profile on YouTube. And maybe some publisher content on the subject. In all cases they'll probably want to angle to get more of an "aGeNtIc" experience, where Google just reads you the story or buys the lipstick for you, without you leaving google.com.
There won't be "websites" anymore, it will all just be Google. Other behemoths that generate original content (that aren't AI) like sports, news, entertainment will either be big enough to sign individual deals on pain of litigation or just force-scraped (as is happening now) by bots that are indistinguishable from human users.
I thought this was going to be about having to use your corporate approved phone to scan reCAPTCHA QR codes. Was just able to opt out of my first one but obviously won’t be able to forever.
I don't think people cared all that much about whether or not the content was citable. You can't cite Wikipedia, and that's not going anywhere.
Facebook may well fail when people don't enjoy it. But all Google ever promised was information, of variously dubious quality, and that's still their draw.
> The goal is to take away the web and guide people into Google’s abstraction on top of it. An abstraction they control and moderate. It’s about monopolizing access to information.
Google’s Vision since they were founded:
Google's mission is to organize the world's information and make it universally accessible and useful
They told everyone what they were doing the whole time
Google declared war on blogs and other content a long time ago, when it used our websites to harvest data to target readers with ads across the entire internet. We used to have (for twenty years!) a medical technology website for MDs. How can we compete with short unrelated YouTube videos or other spam content that serve Google ads targeting doctors? How do you think the entire creative blogosphere of the early 2000s collapsed into nothingness?
I don't think people much liked the pre-search-engine era. They used lousy search engines when they became available, and when a good one started they liked it so much that they verbed its name.
I don't know if it's Google A/B testing something, but the summaries below the usual search result entries (the non-AI ones) are unbelievably bad today. Sometimes the link is a Reddit or SO post, but the summary is taken from a reply/answer with no votes that contradicts the highest-voted ones.
It's conspiratorial, but it feels like Google is actively making the usual search worse so everyone will use AI overview more.
Don’t worry when I track down most AI answers it is usually just some Redditor’s comment, which is quite scary when you think about it and Redditors in general.
But I want Redditors' comments. It's almost my only use case for Google now. What I'm complaining about is that Google search can't even summarize the right Reddit comments.
It is not just about replacing search results with text blurbs generated on Alphabet's premises either. They're making it so that unless you have an Android-certified (or Apple) smartphone, you will not be a human being; you will be assumed to be a bot and blocked by their captchas.
Passkeys are a big part of this future, too. The spec has device attestation built in, so if passkeys gain traction, they could lock it down so only approved software is allowed to log in to services. If that happens, it means your ability to log in to services will be mediated by one of 3 US big tech companies. "For security," of course.
Honestly the bigger problem for me. I use SearXNG, but DDG is acceptable, or people like Kagi.
But if reCAPTCHA won't consider me human unless I have a certified phone, having search alternatives doesn't matter -- the websites themselves are just gonna block me.
You may use an alternative search engine, but 90% won’t. If people accept the new way of searching, meaning, no longer visiting websites, there will no longer be any websites that could show you captchas.
At the end of the day, is it really all that different to provide a list of links, versus an answer or overview of a few paragraphs with links to lots of different higher-quality sources?
I follow those source links all the time. Not just to "check sources" but because they provide a ton more detail. And the links are usually much better than what I'll get with regular keyword search results.
> It’s about monopolizing access to information.
Not as long as there are competitors like OpenAI and Anthropic. In fact, LLMs have provided Google with stronger competition than it's ever had before. ChatGPT and Claude are doing what Bing was never able to.
We’ve gone from Only links to the source -> Mostly links to the source, with a short summary picked almost verbatim from the source -> AI summary that mangles several sources’ information together and gets top billing -> Only the AI summary with some footnotes linking to the source.
Google has been slowly turning up the temperature of the pot and we’re only a few degrees away from a full boil. Let’s not pretend or be naive enough to think that’s not what’s happening.
Ask any publisher and you will get a resounding "yes, it is very different." On average they're able to attribute about a 33% decrease (globally) in traffic to google's (or others') AI answers. [1]
You're right that there are competitors, but those competitors are doing the same thing: hoovering up content and then not giving anything back for it. There are deals in place for some of the largest publishers [2] [3], but that leaves a ton of content out in the cold. That's going to decrease the amount of content that's out there, which will decrease the quality of AI search. I don't know where that ends, but given how leveraged the economy is in AI it seems like a good idea for somebody to figure it out.
A lot of the time, the answer itself is good, but the links are spam blogs and Tiktok videos. I don't think there's a real connection between how the text is generated and what "references" are picked for it. I just searched for a math history topic and the reference was a literal TikTok video that's an advertisement for a sketchy mobile calculator app?
So yeah, these references are boosting web content, but it has nothing to do with the high-quality sources used to train the LLMs in the first place.
Glad I haven’t used anything Google for more than a decade. For internet searches, you can host a SearXNG instance and use it. Other services are self-hostable too, some even far better than Google's.
Welcome to the third-party internet. Unless every micro-decision you make while browsing can be stripped down, packaged into neat data points, and sold, you're not welcome here.
the cool thing, google is much like meta, the kids see it as something boomers are using. my daughter is 12, whenever I say “google it” she says “that’s very, very funny Dad, you are fun guy.” it’ll take some time until boomers are off google as well (my usage of google is probably at 30% of where it used to be) but their days of “this is where you go to ‘search’” are numbered
I've got a half-thought-out idea that maybe we need something like AMP back. I hated AMP. I'm glad it's dead. But you could use it to define things, and you were at least advised that they would be shown in the Google UI and carousel. I feel like we need a guarantee from the LLMs that if we provide some kind of metadata in our source material, they'll honor it. Like showing our advertisers, so we still get some revenue from our content being shown on the LLM site.
Totally vibed version of this:
```
{
  "version": "https://agent-source.org/v1",
  "canonical_url": "https://ninjasandrobots.com/the-cone",
  "title": "The Real Reason Nobody Moved the Cone",
  "source_name": "Ninjas and Robots",
  "author": "Nathan Kontny",
  "summary": "An essay about embarrassment, public action, and why obvious fixes go undone.",
  "preferred_citation": "Ninjas and Robots",
  "source_card": {
    "headline": "The Real Reason Nobody Moved the Cone",
    "description": "People avoid obvious public actions not because they are lazy, but because being seen trying is embarrassing.",
    "image": "https://ninjasandrobots.com/images/cone-card.jpg",
    "cta": "Read the full essay"
  },
  "allowed_excerpt": {
    "max_chars": 500,
    "preferred_excerpt": "People often avoid obvious public action because embarrassment feels more immediate than danger."
  },
  "commercial_terms": {
    "ads_allowed": true,
    "sponsor_card_url": "https://ninjasandrobots.com/.well-known/sponsor-card.json",
    "licensing_contact": "hello@ninjasandrobots.com"
  }
}
```
But something to get our original source honored better in the LLM. Maybe if one of the LLMs does this, we'd give it more loyalty? Maybe the government needs to compel this kind of behavior? No idea. It does suck, though, that our content is just turned into AI's own tokens and we're left with a tiny "source" link if we're lucky.
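To make the idea a bit more concrete, here is a rough sketch of what "honoring" the metadata could look like on the consumer side, assuming an answer engine that reads the (entirely hypothetical) JSON format above. The function name `build_excerpt` is made up for illustration; only the field names come from the example.

```python
import json

# A trimmed copy of the hypothetical source metadata from the example above.
SOURCE_CARD = json.loads("""
{
  "canonical_url": "https://ninjasandrobots.com/the-cone",
  "source_name": "Ninjas and Robots",
  "allowed_excerpt": {"max_chars": 500}
}
""")


def build_excerpt(card: dict, text: str) -> tuple[str, str]:
    """Trim text to the publisher's max_chars and build an attribution string."""
    # If the publisher set no limit, quote nothing rather than everything.
    limit = card.get("allowed_excerpt", {}).get("max_chars", 0)
    excerpt = text[:limit]
    attribution = f'{card["source_name"]} <{card["canonical_url"]}>'
    return excerpt, attribution


excerpt, attribution = build_excerpt(SOURCE_CARD, "x" * 1200)
print(len(excerpt))  # 500
print(attribution)   # Ninjas and Robots <https://ninjasandrobots.com/the-cone>
```

Defaulting to a zero-length excerpt when no limit is given is a deliberate choice in this sketch: the publisher has to opt in to being quoted.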
Given that these platforms are increasingly intermediating experiences between websites/companies/etc and end-users, I suspect we’ll soon see a strong push back in that direction to adopt more things like schema markup to get more control back in some sense. Things are only going to get worse though.
It is not a war on the web, but on how it was traditionally used (and abused). And that "traditional" way was shaped by google too.
You want a cookie, so I seat you at a table with a napkin, serve you a bag of cookies, and hope that you find the one you want, all while you hear my music, watch my ads, and get pushed more of the food and other services that I sell. Sometimes that is the experience you are looking for. But many people just want a cookie.
That is what a conversational and maybe agentic interface can give you. Does someone have a blueberry cookie? Then it gives it to you, and also gives pointers to restaurants that sometimes offer a more complete experience (while others may try to scam you). It is a shortcut, but it also doesn't hide the traditional way of getting there from you.
They are not saints, but neither is everyone on the other side. And the new way of accessing the relevant information you want, in a form that you can use, has its own value.
Google isn't a search company, and hasn't been ever since they bought DoubleClick. Their core business is advertising.
They're trying to pivot into AI because they have gobs of "evidence" that the vast majority of people have been typing natural language questions into Google instead of looking for specific terms
Google pre 2010 was perfectly functional. No realtime search suggestions, advanced search parameters that were actually working, possibility of doing an exact string search if needed.
The technology for indexing the web was already mature enough by then.
I agree that much of the downward spiral was caused by google itself, tho.
I predict mixtapes, with the operative word being tapes, make a big comeback.
Needs to be inverted.
Tax excess tech profits that derive from the efforts of others and use the proceeds to fund living artists.
Vaguely analogous to levies on blank cassettes that went to offset piracy. Give the money directly to actual artists, not labels/publishers, though.
You’re describing a social revolution. Otherwise there is no way that leaders whose power over us corrupts them would want to put that into law.
The cassette reference was a tax on consumers to send money upward. What you’re describing is the complete inverse.
No, it is exactly the same thing. The tax on cassettes raised money that was given to artists.
At least for art - I don't think you'll find anyone who actually enjoys art hanging up anything produced by AI on their walls. For these kinds of "customers", they could equally easily frame & hang up a poster of the Mona Lisa. Artists are not at threat, if anything, AI makes original artworks more precious & enjoyable.
My worry is that, at least among the artists I know, many kept themselves afloat early career by doing commercial freelance jobs like illustrations for local events or companies. Those kinds of jobs might largely vanish.
On the other hand, with the internet inevitably becoming swamped by AI generated content, I can definitely see a de-digitalization of art moving into offline spaces. At least for independent work, you don’t necessarily need mass appeal or exposure, but rather access to individuals and small groups with an actual willingness to pay for art.
That's assuming that the only market is stuff people are hanging up. The games industry, one that already takes advantage of its workers, is going to love this to the detriment of really passionate artists who love their craft and industry.
Lots of illustrator jobs for businesses too
genAI is going to be great for indie games. Solo productions are much easier to produce and will only get easier as tooling improves. I sort of see this as a spotify moment I guess. A democratizing force that will allow many more people to get paid for their art but with much less job security and often as a second job. Whether that's a good thing is certainly up for debate but I think as a consumer it's probably good for me.
I imagine it'll take a functional legal body to do this, i.e. maybe Europe, but I think there should be a legally binding set of metadata you can attach to images to specify that they must not be used for training (with real penalties if companies are caught).
No money and no audience.
Recognition and gratitude keeps me going. Money pays the bills, but if that was the only concern, I'd still be a software developer.
Anonymously feeding the slop machine is nothing like it.
I’m itching for some sort of no-training license:
This content must not be used for training or refining LLMs. If it is, rest assured that if and when the regulatory environment around training data shifts in any country in which we have legal standing, we reserve the right to sue.
Maybe even with a class action element: any lawsuit stemming from a violation of this license shall cover all other violations at the same time.
A big corporation using LLMs to pump out lazy “art” gets the exact same scrutiny from me.
I don't understand the endgame here. Websites let Google crawl their content in exchange for traffic. If Google cuts that out completely, what incentive do websites have to not block the Google crawlers?
I understand that Google is feeling an existential threat from other AI products that provide answers directly. But they must also understand their symbiotic relationship with the web.
The end game is the consumer no longer leaving Google and the web becoming synonymous to Google for them. Why shop on some random website when you can have Gemini buy it for you? Why look for information on Wikipedia when… you get the idea.
I think the coming years will be pivotal for the web. Facebook attempted a similar strategy back when their apps got traction, but they ultimately failed. Let’s hope Google fails too.
We're going back to the CompuServe/AOL/Prodigy model
What I really don't understand is where the next generation of training material will come from. If websites stop being published and/or crawled, how will the machine continue to be fed.
Current executives think it's a problem for the future executives.
Either Google is ignoring that, or crossing their fingers and hoping that one LLM can produce data to train another one.
The long-run doesn't matter as much as the short-term gains for those in power.
The web is going to become China, which is a collection of walled gardens
> If Google cuts that out completely, what incentive do websites have to not block the Google crawlers?
Completely, yes, that destroys the incentive. But they can reduce it 80% or 90% or so, to the point that it's just barely worthwhile to allow their crawlers.
Google ignores robots.txt and botnets residential addresses to crawl anyway? (LLM startups already do this.)
You will be kept inside the Google ecosystem the same way people are kept inside Facebook.
I’m curious how they plan to generate new content in the future, because it seems obvious that simple web pages will become obsolete and eventually stop being filled with fresh data.
It will probably end with a warning every time you click a link, something like: “You are leaving to an external unsafe site.”
The impression I get from Google's own marketing material is that Google doesn't believe in "the web". And it hasn't believed in the web for years.
Think about it. Pretty much every time they show a search box with someone asking for directions to reach a physical place, what hours is it open, etc.
The greatest thing about the internet is that it has removed distances around the whole world, but Google's major value proposition seems to be that... it can accurately index and query information about local businesses?
We abrogated getting traffic to our websites to Google long ago. Mostly because Google was so good at it that the alternatives became significantly less useful.
Now that Google is focusing on becoming 'self contained', so to speak, we should find a better way to drive traffic to websites. Ideally one that's not under the control of a single corporation.
Anyone miss StumbleUpon?
An open way to trade, store, and export lists of websites in a way that works seamlessly on desktop and mobile browsers would be pretty neat.
It feels strange there’s no decentralised search.
I know this is likely to do with the nature of the problem, but that hasn’t stopped us from getting some wildly-unsuitable decentralised nonsense in the past.
There is, YaCy, it just isn’t very good as it suffers from lack of attention/interest.
I don't see how being decentralized helps search. It makes it quite a bit harder, if the fediverse is any indication.
Does a move like this give more power / value to websites like reddit? A link aggregator that is organized is much more useful for finding new websites.
But Reddit also doesn't want you visiting new websites.
There is also old-fashioned marketing. Go find your audience to be heard.
(sorry, nit pick, but I don't your usage of 'abrogate' is quite correct here, you can't abrogate to something)
> but I don't your usage
If we're nitpicking, you don't what their usage?
> If we're nitpicking, you don't what their usage?
Abrogate their usage.
He may have meant abdicated
As a website owner I have seen major upticks in viewership myself, but it really hits hard when you see an AI summary that is wrong and your site's there. The whole AI-for-everything push unfortunately will downskill the world, I fear, and nothing can be done about it.
> downskill the world
I feel this. I asked a developer today a question about how our product is programmed to handle something, and he just sent me a summary from the internal AI assistant they've started using.
He used to provide really good, thoughtful answers, but now it's just copy/paste from the AI.
> He used to provide really good, thoughtful answers
This hits hard. There’s a senior engineer at my job who is known for well written proposals. Today he shared a doc that had the typical AI formatting, was hard to read, and clearly not his style.
On the other hand, if others use AI to summarize stuff, does it matter anymore?
I have a co-worker who does this now. He's very smart, very capable, very experienced and it's clear that he's just a frontend for Claude now. It's tragic.
Maybe organize to give these workers more equity or rev share instead of just a wage so they care more for quality results instead of the behaviors they’re evaluated on and you’ll find them more pleasant to work with.
That will only encourage this behavior
I'm not his boss; I'm on a different team. But we're a very small company with very good compensation and revenue share in the company.
That ain't it.
Nearly every iteration of google's innovation has made the web worse. Websites chasing SEO with low quality garbage sites, to sites plagued with adsense digital ads & popovers, to now stealing from websites and selling that data. The web was more readable as a link ring on geocities
These kinds of declarations rarely make sense to me because they don't seem to model the issues in the way that I see them. I have dual roles: one as a person who writes a blog (a "content producer" in our present parlance) and as a user. As a user, I want my browser user agent to act on my behalf to display web pages, and I want my search agent to extract information from numerous sources and synthesize them with appropriate sourcing.
One could argue that my content production being a hobby lets me be pretty blasé about being intermediated by a platform. That is somewhat true. If I relied upon this as a living, I would probably also conclude that actions that harm my way of living are a war on "the web", though realistically any neutral party observing must conclude that if it is a war, it's one on my kind of participation in the web - content creation for the purpose of revenue / notoriety / some other reward.
As a user, I don't actually care very much for each website and its creator. The information contained therein is useful to me, but the heterogeneity of these sites is mostly an obstacle to the information. I am much happier when my search and summarization agents are able to accurately synthesize what these websites say, in so much as such a synthesis allows me to model reality more accurately.
So I could be convinced that this change from Google makes it less likely for accurate content to be created and that I'll be misled more often. But this is a tool, and my world-model will frequently be tested by reality. If the search-and-synthesis machine fails to produce useful outcomes, I will know. And I'll have to adjust the way I treat knowledge I obtain through it so that I don't get catastrophic outcomes. But that's the same already. I don't really know that Google's search results are not planted ones calibrated to change my opinion. And I don't know that they don't collude with the Internet Archive (with whom they have a pre-existing relationship) to make it look like their constructed consensus is real.
As a user, I have to make a lot of decisions already, and having to painstakingly read search results to synthesize them myself is far less useful than using an agent. So if there is a war on the web, then I am glad to join it, on the side against the web.
I...have to agree about siding against the web...An optimistic part of me sees this as a move that pushes in the same direction that the "web" has already been going in for a long time - preventing users from getting the right information in an honest and efficient manner, preserving their attention budget, and choice. Until now, it was through increasing the noise to push monetary incentives, and now it's by cutting the noise to push monetary incentives. Why optimistic: up till now, there was no single enemy, and it was hard to fight a (somewhat) disjointed system; now, Google is positioning itself to push things further to the worse, with them (and small number of other companies) being the clear target.
My hope is that this will help overflow the proverbial glass for an increasing amount of people and we'll start pushing back towards the "old" web before Google and ad networks have transformed it, or find new modalities of interacting more freely with each other, and the content.
It's not going to be a small or easy fight, though...to a large extent, it's a fight against the current state of capitalism itself, and winning back our attention, critical thinking and choice.
I would feel more sad about this if the web wasn’t so rotten to begin with. On average, any random site is just trying to throw ads at you and harass you to subscribe and such.
I have a particular disdain for “subscribe to our newsletter” modals. Especially when I’ve spent a sum total of less than 3 seconds looking at the webpage.
How such modals aren’t considered pop-ups is beyond me.
So you want websites to rely on traffic from Google instead of building their own newsletter? Interesting.
Do you trust Google to do a better job?
That rot was the direct result of the ad economy that made Google all of its money. Now maybe if they hadn't done it then somebody else would have, but they did do it, and poisoned the well we all drink from.
While they seem against being scraped themselves: https://serpapi.com/blog/google-v-serpapi-motion-to-dismiss-...
Good thing they took "Do no evil" out of their manifesto years ago
Out of my countless www experiments, the website I made for myself turned out the most enjoyable. Technically it is a blog with links, quotes, categories, tags and search. Sometimes I download all the pages it links to (tens of thousands).
Google dropped it from the index long ago. I had a fun discussion with some Google folk where they kept arguing my website was designed wrong and that some pages had too many links.
Basically, if you write an article about the largest banana companies you have to choose which to link to!
The 10 best movies article is better than the best 100. If you make a list of all the movies you've seen your page gradually turns into something really bad. Others will be punished for linking to it but only if you add the nth entry.
As the website is just for me, it is clearly their loss, not mine. No way I'm going to consider linking a subset of patents or research papers.
At one point the web was drowned in "listicles," low-effort sites made to match queries like "best movies from the 90s" or "new music in 2023". Google attempted to downrank these sorts of sites because they were in general very very low quality and were just designed to catch a lot of traffic and display ads alongside low-effort content. (Think one page per list entry, each page transition is a whole new set of ads.) Users disliked these. It sounds like your site was misclassified as a low-effort listicle site.
I'm sure it is really hard to run a search engine that size. I have ideas for how they could improve, but it isn't my job. They chose to populate results with big websites, which is probably good enough for most users. The problem is that there is now no point in creating websites, which is terrible for Google. If it picks up the domain and (against all odds) deems it worthy of traffic, it can be blacklisted at any time.
I'm not even sure this is bad anymore. The web is so overrun with SEO crap that it could probably use the cleansing that comes with Google's abandonment, Usenet-style.
> De-googlifying your mental apparatus becomes more urgent today. Find other search engines, don’t use the Chrome browser. Or wake up in a slopified AOL kind of environment where your access to information is limited to what Google’s synthetic text extruders deem relevant.
Everything is probably re-traceable fairly easily because Google Analytics is on nearly every web page.
But I understand maintaining your own source of archives, videos, documents, etc.
Sounds like a good vibe coding project actually.. to try and keep it all organized offline.
I guess the extra insult is that the summaries still suck. I feel like every time I google a technical question, I get something wrong which references a youtube video watched by 30 people about an unrelated subject.
It is interesting to look at the past predictions on here of AI search/answer companies like Perplexity possibly dethroning Google search and comparing to the reactions of Google just doing the same thing themselves.
Why would it be good if Perplexity does it, but bad if Google does it? What are the principles at play here?
Perplexity does not control who gets traffic on the internet. They don't own a significant percentage of the mobile OS, browser and online search market share. They can't force the industry in one way or the other, consequences be damned.
People don't like Google because it's bad. If competition wins, maybe they'll stop being so bad. But if they become badder themselves, that's not good.
I'm confused about how the strategy works in the long run. If fewer people are incentivized to build websites on novel topics, there will be less content in general and less training data... plus AI overview results see fewer ad conversions and therefore less ad revenue. What's the long game? I get that the paradigm is changing, but this seems like it's not going to help them maintain their dominance.
Ah, that's where you're wrong. There is no long term. Investors want results now. "Later" is for the greater fools.
What if there is no long game? Just people at Google optimising for their current KPIs.
To me it seems either ...
A) Google will do a good job of this, people will find their summaries more useful, and the web will evolve into a more closed system that better serves its users
or ...
B) Their gated AI community will suck, and people will start using a different search engine that better serves its users.
My money isn't on A), but they do have a lot of clout so I wouldn't rule it out.
A) has plenty of dystopian followups.
Kind of curious how it would pan out, if there was a government enforced meta tag one could add to signal what the data could be used for - for example “no-ai”.
That would allow people to still let Google access their site, but restrict its usage. Similar for open source projects on GitHub, etc.
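To make the idea concrete, here is a rough sketch of what the crawler side of such a tag could look like. Everything here is an assumption for illustration: "no-ai" is not a standardized directive that I know of, and reusing the `robots` meta tag for it is just one possible design. The function name `may_train_on` is made up.

```python
from html.parser import HTMLParser


class UsageTagParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Attribute values can be None, so fall back to an empty string.
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def may_train_on(html: str) -> bool:
    """Return False if the page carries the hypothetical "no-ai" directive."""
    parser = UsageTagParser()
    parser.feed(html)
    return "no-ai" not in parser.directives


# The page stays indexable ("index") but opts out of training ("no-ai").
page = '<html><head><meta name="robots" content="index, no-ai"></head></html>'
print(may_train_on(page))
```

The parsing is the easy part. The tag only means something if a crawler that ignores it faces a penalty, which is the "government enforced" half of the idea.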
The tech giants already violated existing copyright laws when scraping for AI content and faced very few consequences. So far the government has shown an inability to enforce anything.
So far, yeah. The courts shrugged and said it was allowed under current law.
So the solution to that would be “change the law”.
> government enforced
The thing everyone needs to ask before advocating for something "government enforced" is "what would happen if this was in the hands of a hostile government?"
And then remember that (a) just because it's not hostile to you today, doesn't mean it won't be tomorrow, and (b) one man's "hostile" is another man's "utopia."
The thing everyone needs to ask before advocating for laissez faire is "what does a hostile and monopolistic search engine giant like Google gain from us doing nothing?"
And then remember that just because Google is not hostile to you today, doesn't mean it won't be tomorrow.
Well, when I said “I’m curious” it was true. I’m actually curious.
So how do you think a meta noai tag would be used by a hostile government?
It would be something the website owner set.
Step 1: Be really lax in enforcing compliance with it so that nobody complies with it.
Step 2: Abruptly switch to iron-fist enforcement where suddenly people get jail time for violations, but only for entities that have been critical of the government.
This is by no means the only or most likely way, just what I could come up with in 30 seconds. There may be much better "evil government" strategies.
If Google stops driving traffic to websites, won't those websites stop allowing Google to crawl their pages? The pendulum might be in motion, but it seems like there should still be some natural equilibrium that it's heading to.
They don't think they really need any more content outside of a few deals they can cut directly with publishers. And they already have YouTube, which produces limitless free content for them to use as they see fit. My blog from 20 years ago, or indeed all of our blogs today, are not something Google feels will be any loss to their product.
Someone will search for "Kylie Jenner" and they will get some kind of shopping opportunity (with Google getting a commission) and links to her profile on YouTube. And maybe some publisher content on the subject. In all cases they'll probably want to angle to get more of an "aGeNtIc" experience, where Google just reads you the story or buys the lipstick for you, without you leaving google.com.
There won't be "websites" anymore, it will all just be Google. Other behemoths that generate original content (that aren't AI) like sports, news, entertainment will either be big enough to sign individual deals on pain of litigation or just force-scraped (as is happening now) by bots that are indistinguishable from human users.
We got to that point a while ago. Many of the major social media sites are essentially uncrawlable.
Communities have moved from public forums to private Discords. Most of the major social media sites are unviewable without an account.
I thought this was going to be about having to use your corporate-approved phone to scan reCAPTCHA QR codes. I was just able to opt out of my first one, but obviously won’t be able to forever.
It looks like Google has taken a note out of Facebook's "lose trust" playbook.
Facebook had a huge opportunity in the post-AI world: real humans.
Instead of focusing on connections, they've been optimizing their properties for doomscrolling.
Google, similarly, has lost the plot on what made them trustworthy in the first place: navigating to citable content.
Both companies started on this trend well before AI, but this might be the final nail in their respective coffins[0].
[0]Yes they'll likely still be profitable for a long time, but the Bell Labs-esque downfall has begun (imo).
I don't think people cared all that much about whether or not the content was citable. You can't cite Wikipedia, and that's not going anywhere.
Facebook may well fail when people don't enjoy it. But all Google ever promised was information, of variously dubious quality, and that's still their draw.
Fair, citable is probably the wrong term.
This is a problem Google has been battling forever, with all the SEO click spam.
In either case, Google was the tool that many people used to find "trustworthy" information (citable or not), compared to the other tools online.
> The goal is to take away the web and guide people into Google’s abstraction on top of it. An abstraction they control and moderate. It’s about monopolizing access to information.
Google’s Vision since they were founded:
Google's mission is to organize the world's information and make it universally accessible and useful
They told everyone what they were doing the whole time
You're right, I think we didn't realize these implied parts: "make it universally accessible [to Google] and useful [to Google's financial interests]."
A nice, terse, little rant. I agree completely.
I'm surprised, however, that it didn't describe phase 2 of the disaster, wherein the models no longer have fresh www content to train on.
It's hard to understand the long term vision of this strategy...
Google declared war on blogs and other content a long time ago, when it used our websites to harvest data to target readers with ads across the entire internet. We used to have (for twenty years!) a medical technology website for MDs. How can we compete with short unrelated YouTube videos or other spam content that serve Google ads targeting doctors? How do you think the entire creative blogosphere of the early 2000s collapsed into nothingness?
Nobody is stopping you from publishing on the net.
Nobody is stopping you from blocking bot traffic.
You don't need search engines - you can just link between sites or have webrings. Like we used to, pre-2000.
Nobody is stopping you from not using ads on the net.
Nobody can force you to use non-essential cookies (and thus: a cookie-banner).
Imagine there was a war going on, and no-one was showing up.
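For the "blocking bot traffic" part, a minimal sketch of what that looks like with robots.txt, checked with Python's standard `urllib.robotparser`. As far as I know, `GPTBot` and `Google-Extended` are the tokens OpenAI and Google publish for opting out of AI training use, but treat them as assumptions and check the current docs. `example.com` is a placeholder.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that keeps ordinary search crawling but opts out of two
# AI-related user agent tokens (token names are assumptions, see above).
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""

rules = RobotFileParser()
rules.parse(ROBOTS_TXT.splitlines())

url = "https://example.com/my-post"  # placeholder URL
for agent in ("GPTBot", "Google-Extended", "Googlebot"):
    verdict = "allowed" if rules.can_fetch(agent, url) else "blocked"
    print(f"{agent}: {verdict}")
```

This is honor-system only. It tells well-behaved crawlers what you want; a crawler that ignores robots.txt has to be blocked at the server or network level instead.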
I don't think people much liked the pre-search-engine era. They used lousy search engines when they became available, and when a good one started they liked it so much that they verbed its name.
Well, they are kind of desperate after missing both cloud and AI.
I would blame trash like Discord more though. Alternative search engines are available, but the crappy little web chat hides info inside.
> I would blame trash like Discord more though. Alternative search engines are available, but the crappy little web chat hides info inside.
Well, we had the same problem with IRC. There's value to be had in not everything being discoverable in 5 seconds with a google search.
I don't know if it's Google AB-testing something, but the summaries below usual search result entries (the non-AI ones) are unbelievably bad today. Sometimes the link is a Reddit or SO post, but the summary is from a reply/answer with no vote contradicting the highest-voted ones.
It's conspiracy, but it feels like Google is actively making the usual search worse so everyone will use AI overview more.
Don’t worry when I track down most AI answers it is usually just some Redditor’s comment, which is quite scary when you think about it and Redditors in general.
But I want Redditors' comments. It's almost my only use case for Google now. What I'm complaining about is that Google search can't even summarize the right Reddit comments.
It is not just about replacing search results with text blurbs generated on Alphabet premise either. They're making it so that unless you have an Android certified (Or Apple) smartphone you will not be a human being, you will be assumed to be a bot and blocked by their captchas.
Passkeys are a big part of this future, too. The spec has device attestation built in, so if passkeys gain traction, they could lock it down so only approved software is allowed to log in to services. If that happens, it means your ability to log in to services will be mediated by one of 3 US big tech companies. "For security," of course.
Honestly the bigger problem for me. I use SearXNG, but DDG is acceptable, or people like Kagi.
But if reCAPTCHA won't consider me human unless I have a certified phone, having search alternatives doesn't matter -- the websites themselves are just gonna block me.
You may use an alternative search engine, but 90% won’t. If people accept the new way of searching, meaning, no longer visiting websites, there will no longer be any websites that could show you captchas.
The AI answers provide tons of source links.
At the end of the day, is it really all that different to provide a list of links, versus an answer or overview of a few paragraphs with links to lots of different higher-quality sources?
I follow those source links all the time. Not just to "check sources" but because they provide a ton more detail. And the links are usually much better than what I'll get with regular keyword search results.
> It’s about monopolizing access to information.
Not as long as there are competitors like OpenAI and Anthropic. In fact, LLMs have provided Google with stronger competition than it's ever had before. ChatGPT and Claude are doing what Bing was never able to.
> I follow those source links all the time.
The vast majority of people don’t.
We’ve gone from Only links to the source -> Mostly links to the source, with a short summary picked almost verbatim from the source -> AI summary that mangles several sources’ information together and gets top billing -> Only the AI summary with some footnotes linking to the source.
Google has fairly slowly been turning up the temperature of the pot and we’re only a few degrees away from a full boil. Let’s not pretend or be naive enough to think that’s not what’s happening.
Ask any publisher and you will get a resounding "yes, it is very different." On average they're able to attribute about a 33% decrease (globally) in traffic to google's (or others') AI answers. [1]
You're right that there are competitors, but those competitors are doing the same thing: hoovering up content and then not giving anything back for it. There are deals in place for some of the largest publishers [2] [3], but that leaves a ton of content out in the cold. That's going to decrease the amount of content that's out there, which will decrease the quality of AI search. I don't know where that ends, but given how leveraged the economy is in AI it seems like a good idea for somebody to figure it out.
[1] https://pressgazette.co.uk/media-audience-and-business-data/...
[2] https://futureweek.com/a-complete-list-of-publishers-strikin...
[3] https://digiday.com/media/a-timeline-of-the-major-deals-betw...
> is it really all that different to provide a list of links
Probably not, but I don't like change.
> The AI answers provide tons of source links.
A lot of the time, the answer itself is good, but the links are spam blogs and Tiktok videos. I don't think there's a real connection between how the text is generated and what "references" are picked for it. I just searched for a math history topic and the reference was a literal TikTok video that's an advertisement for a sketchy mobile calculator app?
So yeah, these references are boosting web content, but it has nothing to do with the high-quality sources used to train the LLMs in the first place.
Most people don't look at the sources even though the sources often contradict the statements.
I've stopped using Google and find I'm not missing anything
Glad I haven’t used anything Google for more than a decade. For internet searches, you can host a SearXNG instance and use it. Other services are self-hostable too, and some are even far better than Google's.
You host your own global search engine? That's impressive.
Welcome to the third-party internet. Unless every micro-decision you make while browsing can be stripped down, packaged into neat data points, and sold, you're not welcome here.
This war was already declared a decade ago. By many interests. And victory followed.
I think though a big part of this was YouTube replaced blogs. It's a generational thing.
How far along the curve do you think TikTok is to replacing YouTube?
the cool thing, google is much like meta, the kids see it as something boomers are using. my daughter is 12, whenever I say “google it” she says “that’s very, very funny Dad, you are fun guy.” it’ll take some time until boomers are off google as well (my usage of google is probably at 30% of where it used to be) but their days of “this is where you go to ‘search’” are numbered
I've got a half-thought-out idea that maybe we need something like AMP back. I hated AMP. I'm glad it's dead. But you could use it to define things, and you were at least advised that they would be shown in the Google UI and carousel. I feel like we need a guarantee from the LLMs that if we provide some kind of metadata in our source material, they'll honor it. Like showing our advertisers, so we still get some revenue from you showing our content on your LLM site.
Totally vibed version of this:
```
{
  "version": "https://agent-source.org/v1",
  "canonical_url": "https://ninjasandrobots.com/the-cone",
  "title": "The Real Reason Nobody Moved the Cone",
  "source_name": "Ninjas and Robots",
  "author": "Nathan Kontny",
  "summary": "An essay about embarrassment, public action, and why obvious fixes go undone.",
  "preferred_citation": "Ninjas and Robots",
  "source_card": {
    "headline": "The Real Reason Nobody Moved the Cone",
    "description": "People avoid obvious public actions not because they are lazy, but because being seen trying is embarrassing.",
    "image": "https://ninjasandrobots.com/images/cone-card.jpg",
    "cta": "Read the full essay"
  },
  "allowed_excerpt": {
    "max_chars": 500,
    "preferred_excerpt": "People often avoid obvious public action because embarrassment feels more immediate than danger."
  },
  "commercial_terms": {
    "ads_allowed": true,
    "sponsor_card_url": "https://ninjasandrobots.com/.well-known/sponsor-card.json",
    "licensing_contact": "hello@ninjasandrobots.com"
  }
}
```
But something to get our original source honored better in the LLM. Maybe if one of the LLMs do this, we'd give it more loyalty? Maybe the government needs to compel this kind of behavior? No idea. It does suck though our content is just turned into AI's own tokens and we're left with a tiny "source" link if we're lucky.
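For what it's worth, honoring a descriptor like that on the consuming side would not be much work. A rough sketch, assuming the field names from the JSON above (which is itself made up, there is no such standard as far as I know); `build_citation` is a hypothetical helper:

```python
import json

# Hypothetical descriptor, trimmed down from the JSON example above.
DESCRIPTOR = json.loads("""
{
  "canonical_url": "https://ninjasandrobots.com/the-cone",
  "preferred_citation": "Ninjas and Robots",
  "allowed_excerpt": {"max_chars": 500},
  "commercial_terms": {"ads_allowed": true}
}
""")


def build_citation(descriptor: dict, excerpt: str) -> dict:
    """Clip an excerpt to the publisher's limit and attach attribution."""
    # A missing limit is treated as "no excerpt allowed" (limit 0).
    limit = descriptor.get("allowed_excerpt", {}).get("max_chars", 0)
    terms = descriptor.get("commercial_terms", {})
    return {
        "excerpt": excerpt[:limit],
        "cite_as": descriptor.get("preferred_citation", ""),
        "link": descriptor.get("canonical_url", ""),
        "show_sponsor": terms.get("ads_allowed", False),
    }


# An 800-character excerpt gets clipped to the publisher's 500-character cap.
card = build_citation(DESCRIPTOR, "x" * 800)
print(len(card["excerpt"]), card["cite_as"], card["show_sponsor"])
```

So the hard part isn't technical. It's getting anyone with an LLM product to agree to read the file and respect it.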
Given that these platforms are increasingly intermediating experiences between websites/companies/etc and end-users, I suspect we’ll soon see a strong push back in that direction to adopt more things like schema markup to get more control back in some sense. Things are only going to get worse though.
If it's so bad, people won't use it. If it's good, why be against it?
You don't write posts to reach the largest number of people, you do it because you're passionate, and ultimately you get people following you.
If the average Joe doesn't go to your website, what's the big deal?
I think this feature will be very useful for fighting back against the SEO-optimized hell that we currently have.
Everyone goes through Live Nation/Ticketmaster. Would you say they provide a good experience?
"If Nestle were so bad, people wouldn't buy their products."
It is not a war on the web, but on how it was traditionally used (and abused). And that "traditional" way was shaped by google too.
You want a cookie, so I seat you at a table, hand you a napkin, serve you a bag of cookies and hope that you find the cookie you want, all while you hear my music, watch my ads, and get pushed more of the food and other services that I sell. And sometimes, that is the experience you are searching for. But also, many just want a cookie.
That is what a conversational and maybe agentic interface can give you. Does someone have a blueberry cookie? Then it gives it to you, and sometimes also gives pointers to restaurants that offer a more complete experience (while others may try to scam you). It is a shortcut, but it also doesn't hide the traditional way to access that from you.
They are not saints, but neither are all the ones on the other side. But the new way to access the relevant information you want, in a form that you can use, has its own value.
Google isn't a search company, and hasn't been ever since they bought DoubleClick. Their core business is advertising.
They're trying to pivot into AI because they have gobs of "evidence" that the vast majority of people have been typing natural language questions into Google instead of looking for specific terms.
Google pre-2010 was perfectly functional. No realtime search suggestions, advanced search parameters that actually worked, and the possibility of doing an exact string search if needed.
The technology for indexing the web was already mature enough by then.
I agree that much of the downward spiral was caused by google itself, tho.