A blueprint for how Google organizes everything on the web
May 31, 2024 11:43 AM

Leaked Documents Reveal How Google Search Gatekeeps the Internet: This week, a 2,500-page leak, first reported by search engine optimization (SEO) veteran Rand Fishkin, gave the world an insight into the 26-year-old mystery of Google Search.
posted by heyitsgogi (29 comments total) 28 users marked this as a favorite
 
Search Engine Land has had a lot of good contextual coverage on this, see here for a starting point.

To me it's no surprise that Google "gatekeeps the Internet". What is a surprise is the specific techniques they are using. And how at odds what they say they do internally is with what they tell the outside world.
posted by Nelson at 11:51 AM on May 31 [5 favorites]


I do love Chrome but, man, the further I read into this, the more I realize this is a bad thing to keep around.
posted by JoeZydeco at 11:53 AM on May 31 [12 favorites]


It's interesting to get a peek inside and I can certainly agree that Google should be a lot more transparent about how it ranks results, but none of this seems surprising at all. Reading through this, it seems obvious that the highlighted metrics are in use.
posted by ssg at 12:13 PM on May 31 [3 favorites]


So it's "be evil" all the way down
posted by chavenet at 12:19 PM on May 31 [12 favorites]


Because ads have become Google's lifeblood, they've (maybe intentionally, maybe not) set about downranking the long tail of content into oblivion. Some maybe-existing site like Ted's Fixit Blog, which has very specific instructions on how to repair a Panasonic CX-705 (model name made up) tape player, doesn't have ads or Google tracking code, and that's why you can do a Google Search for "Repair Panasonic CX-705" and that blog is literally nowhere in the results.
posted by tclark at 12:22 PM on May 31 [28 favorites]


Yeah, I think you've got to remember that the people who are all worked up about this are SEO people. They have exactly one job, and that entire job is subverting the search engines to put their information in front of you instead of other information. That doesn't make them necessarily evil; maybe they're trying to uprank puppy shelters and unicorn conservation societies, who knows. But the entire purpose of SEO, whether they're good or ill, is to break search. Google became popular because it was the first engine that tried to account for SEO games and present stuff that might actually be useful. They kind of suck now, but wow, it could be so much worse. This is like bank robbers complaining that banks are hiding the details of their security protocols.
posted by phooky at 12:27 PM on May 31 [32 favorites]




just here for the upranking of unicorn conservation societies, was not disappointed
posted by HearHere at 12:35 PM on May 31 [6 favorites]


There was a time when SEO was divided -- including by its practitioners -- into white-hat and black-hat. White-hat SEO was the kind of "use structural HTML markup, chunk your page sensibly, be kind to mobile, be accessible" stuff that I was happy to teach my students. Black-hat was manipulative bullshit like invisible-to-humans text intended to manipulate the search index.

Google used to punish black-hat SEO, I swear they did! But they stopped -- or at minimum, they stopped whack-a-moling new black-hat techniques. We're all seeing where that led.

Also, get offa mah lawn.
posted by humbug at 1:07 PM on May 31 [21 favorites]


How is this reveal new and different from the filter bubbles we've known and loved? /hamburger font
posted by infini at 1:21 PM on May 31


It'd be nice if they identified and deranked AI-generated content.

The Washington Post Tells Staff It’s Pivoting to AI

We need more people using ad blockers based upon DNS, and maybe to explore IP-based ad blocking more.
posted by jeffburdges at 1:25 PM on May 31 [11 favorites]


Yeah, I think you've got to remember that the people who are all worked up about this are SEO people. They have exactly one job, and that entire job is subverting the search engines to put their information in front of you instead of other information. That doesn't make them necessarily evil; maybe they're trying to uprank puppy shelters and unicorn conservation societies, who knows. But the entire purpose of SEO, whether they're good or ill, is to break search. Google became popular because it was the first engine that tried to account for SEO games and present stuff that might actually be useful. They kind of suck now, but wow, it could be so much worse. This is like bank robbers complaining that banks are hiding the details of their security protocols.

I get what you're saying but you're working off of two assumptions. The first is that there exists, even if only theoretically, an ideal state of Google search rankings: one in which, every time a person enters a search, they get the result that would make them happiest, serve their needs the most, generate the most utility for them, whatever. That might be true. The second is that Google's object in designing their search and ranking is making the Google Search product and its results as close to that state as possible. That is radically false; Google wants your queries to lead to as many Google-provided ad views and clicks as possible. In the face of the reality of Google as a company and how it manages Google Search as a product, SEO is a necessary self-preservative.
posted by Pope Guilty at 1:51 PM on May 31 [10 favorites]


Pope Guilty: "The second is that Google's object in designing their search and ranking is making the Google Search product and its results as close to that state as possible."

While radically untrue now, this is in fact what Google originally was famous for and why they became popular. Then they let the ad team start being in charge of the search team. It's truly sad and at the same time utterly predictable.
posted by caution live frogs at 1:59 PM on May 31 [13 favorites]


Meanwhile, Google removes third-party cookies, so they are the only ones who can track users across sites (via Chrome). It may be time to switch to the Brave browser.
posted by tedwhite at 4:18 PM on May 31




Do not switch to the Brave Browser. Brave just replaces the ads and tracking the publishers put in with their own ad scheme.

Firefox is a viable alternative though. And a good ad blocker like uBlock Origin. It's likely but not certain uBO will still work fine in Chrome even after the upcoming Manifest v3 change.
posted by Nelson at 4:24 PM on May 31 [30 favorites]


I’ve used Edge exclusively since they switched to Chromium. It supports all the usual extensions and divorces my search activities from all of Google’s tracking cookies. Granted, Bing comes with its own set of cookies but they are far less prevalent in the wild.

AI-generated content is swiftly engulfing search results. Soon it will be turtles all the way down.
posted by simra at 4:32 PM on May 31 [2 favorites]


Google became popular because it was the first engine that tried to account for SEO games and present stuff that might actually be useful.

When Google was getting started (before they became a company even) I was attending classes at our town's community college. I remember spending a lot of time in their computer room browsing the web. It was an age before internet ubiquity.

There were already several search engines by that point. Long-forgotten names like Excite and AltaVista. And the thing to remember is that all of them accepted people paying them for placement. There was a lot of commentary around that time, about what the internet, and the World Wide Web, was going to turn out to be, about staying true to its founding principles of openness. A lot of discussion was had about how bad it was that those engines were outright letting money distort their rankings.

So many people talk about how Google became popular because of how their results were, but it feels like I'm the only person who remembers the mollifying effect of their "Don't Be Evil" motto. It was the knockout punch for their competitors. Google even got the idealistic techno-utopianists on their side with that. It was a statement that appeared to show people: you see? Capitalism is compatible with this weird and radically open thing we're making. Ha ha! Ha.

Where did it all start to go wrong? I have a theory. It's not just Google that changed. I think it was that the userbase of the internet, at last, as long hoped, finally did expand to most of the population. That was caused mostly by the rise of social media (especially Facebook), mobile internet devices, and other mass-market uses like streaming video and (to an extent) MMORPGs.

Once essentially everyone became an internet user, its resistance to change went way, way up. Before, if Google had turned obviously evil, most users would have found out about it, and were savvy enough to look for alternatives. Now, even if a company's service goes way downhill, it takes years for people in general to notice, and a sizeable portion of those won't care, or are stuck using it for various reasons, like Google's tight integration with Chrome and Android. That can be plenty enough for that company to continue, in zombie mode, for decades.

The thing is, the seeds of all of these things were visible from the start. Google saw them, and tried very hard to do it all right from the very beginning, and they still succumbed to all the monetary pressures that ruin everything else. It reveals something deeply telling about the nature of capitalism, and it's one of the reasons I've thrown my lot in with the Fediverse. For all of its flaws, and there are many, corporatism isn't among them.
posted by JHarris at 4:37 PM on May 31 [40 favorites]


Yeah, I think you've got to remember that the people who are all worked up about this are SEO people. They have exactly one job, and that entire job is subverting the search engines to put their information in front of you instead of other information.

I don't think that's what the article says. This seems like it's relevant not just to SEO wonks, but to everyone:

“It is problematic to me that Google is providing no context on critical items in the data such as ‘isElectionAuthority’ or ‘isCovidLocalAuthority.’ How is Google defining an authority in these critical domains?” Ruby said in an emailed statement. “I should not have to guess at what the answer is. Google should be forthcoming and tell me what the answer is.”

We all use these search engines, and transparency is important. Especially when we have supposedly gotten transparency before and it seems to be contradicted directly by this new info.
posted by limeonaire at 6:29 PM on May 31 [2 favorites]


Also I'm sure I don't have to remind folks on this site, SEO and Google's undocumented changes to how sites are ranked can mean the difference between a business (or your favorite website) failing or thriving.
posted by limeonaire at 6:49 PM on May 31 [16 favorites]


I have one weird trick for searching for health info on the internet these days: "health question" + NHS.

It would be neat to get a search extension that basically said 'do you want us to prioritise results from the following boring but truthful places' and gave you a checklist. I already have one to weed out AI, but it'd be nice to have one that emphasizes the solid choices.
posted by dorothyisunderwood at 7:18 PM on May 31 [7 favorites]
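
A rough sketch of the "boring but truthful places" checklist idea from the comment above, assuming the search engine honors the `site:` operator. This is a stricter variant of the "+ NHS" trick (it filters by domain rather than adding a keyword), and the site list and the function name `build_trusted_query` are made up for illustration:

```python
# Hypothetical checklist of "boring but truthful" sources; edit to taste.
TRUSTED_SITES = {
    "NHS": "nhs.uk",
    "CDC": "cdc.gov",
}


def build_trusted_query(question, chosen=("NHS",)):
    """Restrict a search query to the checked sites using `site:` filters."""
    domains = [TRUSTED_SITES[name] for name in chosen]
    if not domains:
        # Nothing ticked on the checklist: leave the query alone.
        return question
    filters = " OR ".join(f"site:{domain}" for domain in domains)
    if len(domains) > 1:
        # Parentheses keep the OR group separate from the question terms.
        filters = f"({filters})"
    return f"{question} {filters}"


print(build_trusted_query("persistent cough"))
print(build_trusted_query("persistent cough", chosen=("NHS", "CDC")))
```

The resulting string can be pasted into a search box as-is; an actual extension would also need a UI for the checklist, which this sketch leaves out.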


There are many browsers based on Chromium, like ungoogled-chromium, so some will likely retain Manifest v2 for as long as doing so benefits uBlock Origin, etc.

I still wish Servo had become a serious contender. Development actually remains fairly active, though it is directed more towards local web apps.
posted by jeffburdges at 8:19 PM on May 31 [1 favorite]


“It is problematic to me that Google is providing no context on critical items in the data such as ‘isElectionAuthority’ or ‘isCovidLocalAuthority.’ How is Google defining an authority in these critical domains?” Ruby said in an emailed statement. “I should not have to guess at what the answer is. Google should be forthcoming and tell me what the answer is.”

After watching the past few years in which (what I perceive to be) reasonable election and covid authorities were harassed by anti-vax zealots and election deniers, maybe providing a list of them is the wrong thing.
posted by coberh at 9:15 PM on May 31 [3 favorites]


Interestingly, Igalia now mostly controls Servo development. They rock! See also, this interesting Servo book section.
posted by jeffburdges at 5:26 AM on June 1 [1 favorite]


Maybe it's just me being naive, but when I read e.g. ElectionAuthority I don't think "person or group who gets to be authoritative on election news, an inherently controversial choice" but "ah yes, the Electoral Commission, or whatever local equivalent". Same for CovidLocalAuthority - the CDC, health minister, whatever the local authorities wrt Covid are.

I guess it depends on if you read "authority" as expert, or as government.
posted by Dysk at 5:48 AM on June 1 [5 favorites]


Tauri looks interesting too. It supports Servo in theory, but not yet in practice since Servo lacks sandboxing. At some point we could have an all-Rust front end by using Dioxus or similar too.
posted by jeffburdges at 7:12 AM on June 1 [1 favorite]


SEO people...have exactly one job, and that entire job is subverting the search engines to put their information in front of you instead of other information.

"Subverting"? WTF? There are many SEOs that have instead been trying to help people follow the rules of the search engines. (H2s come below H1s, remember to make things easy to read by dividing them up by subheads, and oh yeah, if you are writing about the thing, then NAME THE THING YOU ARE WRITING ABOUT. I advised a sports site for a while, and they wrote LEGIONS of articles about teams in which...they never named the team! Just the city! As you can imagine, that affected their SEO rather a lot!)

However, and long before the ad people were completely in charge, the rules got super-convoluted and mysterious over time. And there are things that happen such as "yes, you may have the best resource on the Web for this topic, but you don't update it 20 times a year, so we are going to put this crappy page that updates daily in first place." If you create convoluted rules, it becomes easier for bad actors to manipulate them.

I was doing this stuff back in 2006, when the sponsored results had a brightly colored background behind them and were labeled "Advertisement." Amazingly, a bunch of the people that I worked with thought these were highlighted because they were the best result! Over time they have really reduced the sponsorship labeling (although maybe that's a good thing, given the misinterpretation).
posted by rednikki at 11:54 AM on June 2 [6 favorites]


When I saw the term "gatekeeping" in this post, I thought it was going to be about missing search results, not about rankings. Some people these days are saying that they use DuckDuckGo to find results that were not indexed or displayed by Google in the searches. Thoughts?
posted by TreeHugger at 7:19 AM on June 3


DuckDuckGo, as demonstrated when Bing went down recently, depends heavily on it for their results.

Also, at the scale that Google operates at, having a low ranking is practically the same as going missing. If a good page isn't offered or if it's on page 13 of results, either way no one's going to click on it.
posted by JHarris at 2:26 AM on June 5




This thread has been archived and is closed to new comments