
Reading the title I thought you meant the opposite.

That is, an ai.txt file that disallows AI from training on or using your data, similar to robots.txt (but for cases when you still want to be crawled, just not extrapolated from).
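A rough sketch of how that could look, assuming an opt-out ai.txt simply borrowed robots.txt syntax. No such standard exists as far as I know; the file contents, the bot name "ExampleBot", and the example.com URL are all made up for illustration. Python's standard-library robots.txt parser is reused here only because the assumed syntax is identical.

```python
from urllib.robotparser import RobotFileParser

# Ordinary robots.txt: crawling is allowed everywhere.
ROBOTS_TXT = """\
User-agent: *
Allow: /
"""

# Hypothetical ai.txt (an assumption, not a real standard):
# same syntax, but interpreted as "may this content be used for training?"
AI_TXT = """\
User-agent: *
Disallow: /
"""


def load_rules(text):
    """Parse robots.txt-style rules from a string."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


crawl_rules = load_rules(ROBOTS_TXT)
training_rules = load_rules(AI_TXT)

url = "https://example.com/blog/post"
may_crawl = crawl_rules.can_fetch("ExampleBot", url)     # crawling permission
may_train = training_rules.can_fetch("ExampleBot", url)  # training permission

print(f"crawl allowed: {may_crawl}, training allowed: {may_train}")
```

The point is just that the two questions get answered by two separate files, so a site can stay indexable while opting out of training. Whether any crawler would honor it is a different matter.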



Feels like an enhancement to a sitemap.xml could be a better way to go here.

https://developers.google.com/search/docs/crawling-indexing/...


I thought the exact same. Creating a new type of robots.txt but making it do the opposite does not make sense.


I've been (slowly) writing a new type of OSS license around this exact concept so it's easier to (legally) stop LLMs hoovering up IP [1] (under "derivative works not permitted").

[1] https://github.com/cheatcode/joystick/blob/development/LICEN...


They've been ingesting "all rights reserved" content because they think copyright doesn't apply. Licenses won't help.


We'll see. I think courts will end up treating it the same way they treat music that samples other music. In effect that's all it is: a remix of existing information.


I guess the good part is that in ai.txt you can talk to the AI. So if you want, you can tell it not to crawl, or make other agreements with it, just in plain English. What a time to be alive.



