Protecting User Choice – Using WordPress and Tumblr to train AI models
-
Very recently, Automattic published the following announcement regarding the use of WordPress and Tumblr sites for training AI models. The problem with this announcement is that it refers to the Privacy section of the General Settings. My blog is missing the part of that section that exposes the checkbox “Prevent third party sharing for xyz.com”.
I realize that my site is hosted externally, which means I have a WordPress.org account, but I also have a WordPress.com account in order to post here and ask questions like this.
Is the Privacy section of General Settings different between sites hosted by WordPress.com and externally hosted sites?
The blog I need help with is: (visible only to logged in users)
-
I forgot to include the relevant links:
Protecting User Choice
More Control Over the Content You Share
-
Hi there, Your self-hosted WordPress website does not have a Privacy setting for third-party sharing. The Protecting User Choice post has been updated to include the following:
- We will only share public content that’s hosted on WordPress.com and Tumblr, and only from sites that haven’t opted out.
- We are not including content from sites hosted elsewhere even if they use Automattic plugins like Jetpack or WooCommerce.
I hope that helps.
-
Thank you, @justjennifer, that clarifies part of my concern, but the language in the updated post still confuses me. It’s not clear what is meant by “We are not including content from sites hosted elsewhere…” One interpretation is that externally hosted sites are excluded from Protecting User Choice. Could you clarify this?
-
Hello again, This new Privacy setting is only available to WordPress.com (and Tumblr) sites because those Privacy settings determine what is included in the site’s robots.txt file. Most WordPress.com site owners do not have direct access to the robots.txt file the way self-hosted website owners do.
On your self-hosted site, you would set up your own robots.txt file to allow or disallow bots from crawling your site. There are a number of resources out there to help you; search for “robots.txt” in any search engine. Here’s the original https://www.robotstxt.org/, a help page from Yoast (NAYY), and search results on WordPress.org.
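As a rough sketch, a robots.txt that turns away some of the better-known AI-training crawlers could look like the following (the user-agent names below are only examples, not a complete or authoritative list, and new crawlers appear regularly):

    # Block some known AI-training crawlers from the whole site
    User-agent: GPTBot
    Disallow: /

    User-agent: CCBot
    Disallow: /

    User-agent: Google-Extended
    Disallow: /

    # All other bots may continue to crawl normally
    User-agent: *
    Disallow:

Keep in mind that robots.txt is only a request; well-behaved crawlers honor it, but nothing forces a bot to comply.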
For more help with your WordPress site, head to the support forums at https://wordpress.org/support/forums/
Of course, another option is to move your current WordPress site to WordPress.com for hosting (https://wordpress.com/support/import/). If you have questions about that, I’d recommend starting a new forum thread.
-
Hello again, I just came across this article in my reading, which might help you in crafting your standalone WordPress site’s robots.txt file.
Block the Bots that Feed “AI” Models by Scraping Your Website
Again, no affiliation.
Best wishes.
-
@justjennifer, thank you very much. The link you posted on March 5 was very useful. I was able to follow the subsequent links, install the Yoast SEO (free) plugin, create my robots.txt file, and revise my .htaccess file.
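In case it helps anyone who finds this thread later, the kind of .htaccess addition that article describes looks roughly like this on an Apache server (the user-agent list is just an illustration and would need to be kept current):

    # Return 403 Forbidden to selected AI-training crawlers, matched by User-Agent
    <IfModule mod_rewrite.c>
      RewriteEngine On
      RewriteCond %{HTTP_USER_AGENT} (GPTBot|CCBot|Google-Extended) [NC]
      RewriteRule .* - [F,L]
    </IfModule>

Unlike robots.txt, this blocks the listed bots at the server level rather than merely asking them to stay away.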
-
- The topic ‘Protecting User Choice – Using WordPress and Tumblr to train AI models’ is closed to new replies.