Super HN - Super Hacker News

Robots.txt Is a Suicide Note (wiki.archiveteam.org) 13 points by rafram 19 minutes ago

Not sure the emotive language is warranted. The message appears to be: “if you use robots.txt AND archive sites honor it AND you are dumb enough to delete your data without a backup THEN you won’t have a way to recover and you’ll be sorry”.

It also presumes that dealing with automated traffic is a solved problem, which, given the volume of LLM scraping going on, is simply not true for hobbyist setups. by bonaldi 1 minute ago

I think this is kind of misguided: it ignores the main reason sites use robots.txt, which is to keep irrelevant, old, or non-human-readable pages that still need to stay online out of search engine indexes. Still, it's an interesting look at Archiveteam's rationale. by rafram 16 minutes ago
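
For illustration, here is a rough sketch of the use case described in the comment above: a robots.txt that asks crawlers to skip old and machine-oriented pages while leaving the rest of the site open. The paths, the "ExampleBot" user agent, and the allowed_paths helper are all made up for this example; it uses Python's standard urllib.robotparser and only shows how a well-behaved crawler would read the rules, not how any particular search engine or archive behaves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; the paths are invented for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /old/
Disallow: /export/
"""


def allowed_paths(robots_txt, user_agent, paths):
    """Return a dict mapping each path to whether user_agent may fetch it."""
    parser = RobotFileParser()
    # parse() takes the file's lines; no network request is made here.
    parser.parse(robots_txt.splitlines())
    return {path: parser.can_fetch(user_agent, path) for path in paths}


result = allowed_paths(
    ROBOTS_TXT,
    "ExampleBot",  # hypothetical crawler name
    ["/", "/old/2009-notes.html", "/export/data.xml"],
)
for path, ok in result.items():
    print(path, "allowed" if ok else "disallowed")
```

Note that the excluded pages stay online and reachable by anyone with the URL; the file is only a request that cooperating crawlers choose to honor.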
