05/07/2025 / By Willow Tohi
In a federal antitrust trial that could reshape the future of online search, Google admitted it continues to use web content to train its AI-powered search features – even when publishers have explicitly opted out of AI training.
The revelation came during testimony from Eli Collins, a vice president at Google DeepMind. He confirmed that while the company respects opt-outs for general AI training, its search division operates under different rules. The case, unfolding in Washington, D.C., underscores growing tensions between tech giants and publishers over who controls – and profits from – online content.
During cross-examination by Department of Justice (DOJ) attorney Diana Aguilar, Collins acknowledged that once Google’s Gemini AI model is integrated into its search engine, the company can train it on data publishers sought to block.
“Once you take the Gemini [AI model] and put it inside the search org, the search org has the ability to train on the data that publishers had opted out of training, correct?” Aguilar asked. Collins replied: “Correct – for use in search.”
This distinction has alarmed publishers, who argue that Google’s AI-generated summaries – displayed above traditional search results – divert traffic from their sites, eroding ad revenue. To fully block AI training, publishers must opt out of Google Search indexing entirely via the robots.txt protocol, a move that would effectively render their content invisible in search results.
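For illustration, a minimal robots.txt sketch of the choice publishers face, assuming a site uses Google's documented "Google-Extended" token to decline AI training while leaving the regular Googlebot crawler allowed. Per the testimony above, that opt-out does not reach Gemini models operating inside search; the only way to withhold content entirely is to block Googlebot itself, which drops the site from results.

    # Decline use of content for Gemini/AI training (Google-Extended token)
    User-agent: Google-Extended
    Disallow: /

    # Keep ordinary search crawling enabled so pages stay in results
    User-agent: Googlebot
    Allow: /

    # Blocking search crawling as well would withhold content entirely,
    # but also removes the site from Google's search results:
    # User-agent: Googlebot
    # Disallow: /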
The testimony emerged during a pivotal antitrust trial before Judge Amit Mehta, who ruled in 2024 that Google unlawfully monopolized the search market. The DOJ is now pushing for drastic remedies, including forcing Google to divest its Chrome browser and prohibiting payments to secure default search status on devices.
Internal documents revealed that Google filtered out 80 billion of 160 billion content "tokens" due to publisher opt-outs. Even so, the company retained vast datasets drawn from user search sessions and YouTube videos.
The DOJ’s broader efforts also include proposals to prevent Google from dominating future AI developments. Regulators may require Google to open its search indexes, data and AI models to competitors and restrict agreements that limit rivals’ access to web content. Additionally, the DOJ suggests allowing websites to opt out of AI training without sacrificing search visibility – a policy that could redefine digital consent.
When pressed about whether Google’s search dominance unfairly advantaged its AI, Collins conceded that DeepMind CEO Demis Hassabis had explored using search rankings to enhance AI performance, though no such model was confirmed to exist.
The case highlights a Catch-22 for publishers: Allow Google to scrape content for AI training or vanish from search rankings altogether. As AI Overviews increasingly replace click-throughs, smaller outlets face existential threats.
Earlier this year, education platform Chegg sued Google, alleging AI summaries decimated its revenue. A Google spokesperson defended the policy, stating publishers could use robots.txt to block indexing – ignoring the collateral damage to their visibility.
Meanwhile, OpenAI faces similar legal challenges, with lawsuits accusing it of training AI models on “stolen private information” without consent. Authors and publishers argue that AI firms exploit copyrighted material, raising questions about fair use in the age of machine learning. Google’s recent privacy policy updates may be an attempt to preempt legal action, but experts warn that courts have yet to establish clear boundaries on AI data usage.
The trial underscores a broader debate over data ownership in the AI era. While Google frames its practices as innovation, critics see a pattern of leveraging monopoly power to override consent. As Judge Mehta weighs remedies, the outcome could set a precedent for how tech giants balance progress with publisher rights. For now, the message to content creators is clear: In Google’s ecosystem, opting out of AI training may mean opting out of the internet itself.