Google recently updated the documentation of its Google-Extended web crawler user agent, reflecting changes in product naming and clarifying the impact on search, which may be a concern for those who choose to block the crawler. The updated documentation offers clearer guidance on controlling content access for use in AI model training.
Google-Extended User Agent
Introduced on September 28, 2023, Google-Extended offers web publishers a user agent that can be used to control how their sites are crawled. Publishers can allow or disallow the Google-Extended user agent using the Robots Exclusion Protocol, giving them a way to opt out of having their content scraped and included in AI training datasets.
Google describes Google-Extended as a “standalone product token,” which is non-standard terminology; publishers generally think of such tokens as user agents.
The original announcement described the new user agent:
“Today we’re announcing Google-Extended, a new control that web publishers can use to manage whether their sites help improve Bard and Vertex AI generative APIs, including future generations of models that power those products.
By using Google-Extended to control access to content on a site, a website administrator can choose whether to help these AI models become more accurate and capable over time.”
To block Google-Extended, add a rule for the “Google-Extended” user agent to robots.txt:
User-agent: Google-Extended
Disallow: /
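For publishers who want to sanity-check such a rule before deploying it, the following is a minimal sketch using Python's standard-library robots.txt parser. The helper name `is_allowed` and the `example.com` URL are illustrative assumptions, and the result reflects how Python's parser reads the rule, which may not match Google's own parser in every edge case.

```python
from urllib.robotparser import RobotFileParser

# The rule from the article, split onto two lines as robots.txt expects.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if user_agent may fetch url under the given robots.txt text.

    Hypothetical helper for illustration; it relies only on the standard library.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/article"  # placeholder URL (assumption)
    for agent in ("Google-Extended", "Googlebot"):
        status = "allowed" if is_allowed(ROBOTS_TXT, agent, page) else "blocked"
        print(f"{agent}: {status}")
```

Under this rule the check should report Google-Extended as blocked and Googlebot as allowed, since no rule in the file names Googlebot or the `*` wildcard.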
Google Changelog
Google keeps a changelog of important updates made to guidance and communication with web publishers and the search marketing community. The changelog of Google’s developer pages announced a change to the Google-Extended documentation.
The revision follows the renaming of Bard to Gemini Apps and specifies that content accessible to Google-Extended now contributes to Gemini Apps and Vertex AI generative APIs. The new wording also reassures publishers that blocking Google-Extended does not affect Google Search, addressing concerns about the implications of opting out of AI data collection.
What Changed?
Google’s changelog clarifies that Google-Extended applies to Gemini Apps and has no impact on Google Search.
The Changelog advises:
“Updated the description of the Google-Extended product token
What: With the name change of Bard to Gemini Apps, we clarified that Gemini Apps is affected by Google-Extended, and, based on publisher feedback, we specified that Google-Extended doesn’t affect Google Search.”
The updated guidance no longer uses the Bard brand name, replacing it with Gemini. The following sentence was also added:
“Google-Extended does not impact a site’s inclusion or ranking in Google Search.”
Read Google’s updated crawler overview:
Overview of Google crawlers and fetchers (user agents)