Internet Archive Copyright Infringement

  • I deleted my site [redacted address] about three weeks ago. Recently I found that Archive.org’s Wayback Machine had copied entire articles (which were the result of hours of research), contrary to the Copyright Notice on my site itself.

    This can be viewed here:
    [redacted]

    The Internet Archive has been opaque and unwilling to cooperate over email. Is there anything WordPress can do to remedy the situation? Are there any technical avenues that could lead to the removal of the archived site (such as a robots.txt file, etc.)?

  • Hi,

    Although the Internet Archive is not affiliated with WordPress, and support for matters like this is limited in these forums, I will still tag this post for staff attention to see what the best solution is in cases like this.

    Best :)
    Sunil Chauhan

  • Hi there,

    It’s up to Archive.org to decide when and how to include/remove this page from search results; WordPress.com doesn’t have any control over that. For more information you can read our FAQ about search engines here:

    http://en.support.wordpress.com/search-engines/

    Have you reached out to them via (email visible only to moderators and staff) regarding removing your content?

    Let me know if you have any questions.

    Thanks!

  • I have sent several DMCA Notices to Archive.org and they refuse to do anything about the situation. They literally copied entire articles despite a copyright notice on my site itself saying “No part of this publication may be reproduced, distributed, or transmitted in any form or by any means…except in the case of BRIEF QUOTATIONS…” etc.

    I think the problem is that I’m having trouble proving that I own the site, since my earlier account was deleted. Is there any way to restore the account? The username was [redacted] and the site was [redacted].

  • I also specifically disallowed search engines from indexing my site, but somehow Archive.org thought it would be okay to steal entire articles that were the results of hundreds of hours of research. This situation is reprehensible.

  • Hi there,

    To restore that account, you need to reply to the account closure email we have sent you.

    Please note that the email associated with that account is not the same as the one you’re using for your current account.

  • Dear Friends,

    My name is Mark Graham.

    I manage the Wayback Machine at the Internet Archive. I also manage our Patron Services team.

    Regarding the claim above “The Internet Archive has been very untransparent and unwilling to cooperate over email.”

    To the best of my knowledge we did get an email from you, which we responded to, but have not heard back from you yet.

    Please direct any requests to (email visible only to moderators and staff) and please feel free to CC me at (email visible only to moderators and staff)

    Providing responsive and responsible service to our Patrons is our goal.

    – Mark

  • Hmmm…. I see the email addresses above were redacted… not what I expected.

    Let’s see if this works:

    info(at)archive(dot)org
    mark(at)archive(dot)org

  • @markjgraham

    Thanks for chiming in – yes, we automatically redact email addresses unless they are bracketed with code tags.

  • @markjgraham,

    So from what I understand now, it is no longer possible to verify the fact that I own the previously deleted site. This is why I changed my request from excluding the entire URL to simply deleting the two specific pages that linked to complete articles, as the email I received from info(at)archive(dot)org directed me to. This is because the copyright notice itself on the original site stated that only “brief quotations” could be used under any circumstances.

    Regardless of whether I can prove I own the site, this is still a case of copyright infringement.


    @fstat,

    I no longer have access to the email address associated with the previous account. I know I changed the email address a few times while the account was active, do you still have logs of those email addresses?

  • I no longer have access to the email address associated with the previous account. I know I changed the email address a few times while the account was active, do you still have logs of those email addresses?

    We do have a log of any email that has been associated with the account, but only the current email can be used as a valid proof of ownership. Please reach out to The Account Recovery Team to discuss potential options in recovering your account: passwordhelp@wordpress.com.

  • staff-doublebassd Thanks, I’ve sent them an email.

  • To anyone that can help:

    staff-doublebassd, @fstat,

    I’m also a bit puzzled as to how the Internet Archive was even able to index my old site to begin with. My site was public, but I explicitly remember checking the box to prohibit search engines from indexing my site.

  • Robots.txt settings do not prohibit search engines; if that’s the message that came across to you, we’ll need to address that. I’ll file a request to make that clearer.

    Also I’m not sure that archive.org really counts as a search engine.

    If the site is public, anyone with a link can access it. Did you share links to the site publicly? Or, might anyone you shared with have shared the link?
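To illustrate why a robots.txt setting is a request rather than a block, here is a minimal Python sketch of the check a well-behaved crawler might run before fetching a page. The robots.txt body, the `ExampleBot` user agent, the example.com URL, and the `is_allowed` helper are all hypothetical; nothing here describes how WordPress.com or the Wayback Machine actually behave.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body asking every crawler to stay out of the whole site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/some-article/"  # placeholder URL
    # A polite crawler asks first and skips the page when the answer is False.
    print(is_allowed(ROBOTS_TXT, "ExampleBot", url))
```

The check runs entirely on the crawler's side. A crawler that never performs it can still request any public page, so only making the site private or taking it down actually prevents access.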

  • @supernovia,

    Thanks for the reply. I did share links to the site publicly, which likely explains why the site was able to be indexed. But in the past, when I ran a different site years ago, checking the “Do not index” box was sufficient to prevent all indexing. With that site, nothing on archive.org shows up except for a link that contains a “robots.txt” in it.

    I understand that robots.txt does not guarantee that a site will not be indexed, but it seems rather inconsiderate to me for someone to ignore a file like that.

    Are there any technical ways for WordPress to intervene and secure the removal of the site?

  • You’d need to check in more with archive.org, but again I don’t think it’s necessarily just a search engine so I wouldn’t count on that search engines option to work there.

    I have filed a request to make sure that option is clearer.

    Hopefully you’ll be able to work with accounts recovery to regain access and sort out any issues you have. Good luck!

  • @supernovia Thanks again for the response. Can WordPress write an official notice stating that I am the owner of [redacted] or something? I just need written documentation that I own the archived content.

  • @markjgraham,

    Since no one at the Internet Archive seemed to care to respond to my last email, which I sent as you directed, I’m posting this publicly.

    Let me make this clear: if the content wasn’t mine, I wouldn’t have bothered spending hours upon hours writing emails, trying to recover my old account, finding documentation, etc, to get TWO URLs removed. And if the content wasn’t mine, I wouldn’t be so frustrated either.

    Seriously. TWO URLs. That’s all. They’re not bringing commercial profit to your company (or so I’m told), and removing them isn’t going to affect the integrity of the other archives.

    Just remove these TWO URLs, and I’ll stop bothering you. It’ll also save your time, mine, and that of the WordPress staff trying to recover my account.

    Links:
    [redacted]

    Thank you.

  • [redacted name],

    We have a process in place to respond to takedown requests. Part of that process includes requiring that people demonstrate to us that they are the rights owner of the material in question.

    It is my understanding (of course I could be mistaken) that you have still not met that test with my team.

    Having said that, I trust you, and want to do the “right thing”. As such I have asked that the two URLs you cite above be excluded from the Wayback Machine. I am confident that will happen before this time tomorrow.

    Please know that helping to make the Web more useful and reliable is our goal.

    Thank you for reaching out the way you did, and for giving my team and me the opportunity to satisfy your request.

    Please don’t hesitate to contact me directly if you ever have any additional requests.

    Be safe and well.

    – Mark Graham, Director, the Wayback Machine at the Internet Archive

  • This issue has been resolved.

    Can this thread be deleted?

  • The topic ‘Internet Archive Copyright Infringement’ is closed to new replies.