Robots.txt files help keep search engine crawlers (like Googlebot) away from unimportant pages on your site.
Google’s John Mueller responded to a question on LinkedIn about the unsupported noindex directive in the robots.txt file of his personal website. He explained the pros and cons of search engine support for the directive and offered insights into Google’s internal discussions about supporting it. John Mueller’s Robots.txt: Mueller’s robots.txt has … Read more
The robots.txt file on the personal blog of Google’s John Mueller became a focus of interest when someone on Reddit claimed that the blog had been hit by the Helpful Content system and subsequently deindexed. The truth turned out to be less dramatic, but it was still a little weird. SEO Subreddit Post … Read more
Robots.txt is a useful and powerful tool to instruct search engine crawlers on how you want them to crawl your website. Managing this file is a key component of good technical SEO. It is not all-powerful – in Google’s own words, “it is not a mechanism for keeping a web page out of Google” – … Read more
Did you know that Google Search checks about four billion host names each and every day for robots.txt purposes? Gary Illyes said in the December Search Off The Record podcast, “we have about four billion host names that we check every single day for robots.txt.” He said this at the 20:31 mark in the video. … Read more
Who said robots.txt files had to be boring? Check out the easter egg in YouTube’s robots.txt file: This is what I call “Marketer Bait.” In short, you can create things with the intent to get marketers to promote them, like what I’m doing right now. I saw something interesting and now I’m sharing it. As … Read more
Google announced a new robots.txt report within Google Search Console and at the same time said it will sunset the old robots.txt tester tool. The new tool shows which robots.txt files Google found for the top 20 hosts on your site, the last time they were crawled, and any warnings or errors encountered. Google also … Read more
Google has released a new robots.txt report within Google Search Console. Google also made relevant information about robots.txt available from within the Page indexing report in Search Console. Finally, Google has decided to sunset the robots.txt tester. The new robots.txt report. Google’s new robots.txt report gives you some information Google has on your robots.txt file … Read more
Google today announced a new “standalone product token”, Google-Extended, that lets you control whether Bard and Vertex AI can access the content on your site. This seems to be the end result of a “public discussion” Google initiated in July, when the company promised to gather “voices from across web publishers, civil society, academia and … Read more
OpenAI has relaunched the Browse with Bing feature in ChatGPT, which lets ChatGPT draw on the Bing Search index. OpenAI had turned the feature off after it was caught accessing content behind paywalls. OpenAI has said on X, “ChatGPT can now browse the internet to provide you with current and … Read more