Yesterday I was being a smart ass with the following tweet:
Considering releasing my own version of huckleberry Finn – where the only diff is every instance of the characters name is fuckleberry hinn
I figured this would be a really quick edit: I could make the changes, upload the result to Scribd, and then write a blog post about the irrational behavior of both sides of the recent Huckleberry Finn debate. I downloaded the Project Gutenberg edition of the book (URL is below), opened up MS Word, and did the find and replace. Three minutes later I had Huckleberry Finn – The Fuckleberry Edition. I listed it as this edition after the title. I also placed my screen name after Twain as an author (it is a derivative work, not an original work by me, so both names were appropriate).
I then uploaded the file to Scribd, which, if you are not familiar with it, is an online document sharing service, probably the largest of its kind. I uploaded my document and perused it. I then clicked the embed button to get the code to place it on my site, in preparation for writing the blog post announcing it. I waited a second for the code, and it came back with "this work has been removed." What the heck. I clicked the back button; the work was still there. I clicked refresh; the work was still there. I clicked the embed button again: this work has been removed. Hmmmm. Maybe I had tried to embed it too quickly and would need to wait a minute before that would work. Then I noticed the email sitting in Thunderbird (full email at the end of this post):
We have removed your document “Adventures of Huckleberry Finn – Fuckleberry Edition” (id: 46416259) because our text matching system determined that it was very similar to a work that has been marked as copyrighted and not permitted on Scribd.
So what they are saying is that a public domain book is not permitted on Scribd, because they have a fingerprint for it that someone else claims to own. This means that Scribd signed off on the fingerprint. Really? Really? How hard would it be for Scribd to regularly take the full Gutenberg archive, fingerprint all those books, and reject any copyright claim that matches those fingerprints? It is services like this (and the companies claiming the copyright) that confuse people and take away rights that individuals have in the public domain. The average person doesn’t know or understand the public domain, and this makes them understand it even less. It makes some people who get these notices go, “Oops, I was mistaken and Twain must own the copyright.” This mis-educates them and erodes knowledge in an area that Scribd should be trumpeting. I have quite a few things (mostly turn of the century sheet music) uploaded to Scribd. Every item I have uploaded is in the public domain, or at least I was under a good faith impression that it was.
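To show that the "known clean" idea is not hard, here is a minimal Python sketch of it. Scribd has not published how its matcher works, so everything here is my assumption: the word-shingle fingerprint, the 0.8 threshold, and the names `fingerprint`, `overlap`, and `should_remove` are all made up for illustration. The point is only the order of the checks: compare an upload against the public domain fingerprints first, and only then against the works someone has claimed.

```python
from __future__ import annotations

import hashlib
import re


def fingerprint(text: str, k: int = 5) -> set[str]:
    """Hash every run of k consecutive words (a 'shingle') in the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return set()
    if len(words) < k:
        shingles = [" ".join(words)]
    else:
        shingles = [" ".join(words[i:i + k]) for i in range(len(words) - k + 1)]
    return {hashlib.sha1(s.encode("utf-8")).hexdigest() for s in shingles}


def overlap(upload: set[str], reference: set[str]) -> float:
    """Fraction of the upload's shingles that also appear in the reference."""
    if not upload:
        return 0.0
    return len(upload & reference) / len(upload)


def should_remove(
    upload_text: str,
    claimed: dict[str, set[str]],
    public_domain: dict[str, set[str]],
    threshold: float = 0.8,  # hypothetical cutoff, not Scribd's
) -> bool:
    """Return True only if the upload matches a claimed work and no clean one."""
    upload = fingerprint(upload_text)
    # A match against a known public domain text overrides any claim.
    if any(overlap(upload, fp) >= threshold for fp in public_domain.values()):
        return False
    return any(overlap(upload, fp) >= threshold for fp in claimed.values())
```

With something like this, a copy of Huckleberry Finn with one word swapped would still match the Gutenberg fingerprint closely enough to be waved through, no matter who had claimed the text.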
Here is the email I sent Scribd about all of this with their original email at the end:
Dear Scribd,
I received the notice attached at the end of this email about violating a copyright on Scribd, and the work was removed. I understand the need to police this on your site, especially since it allows you to maintain safe harbor provisions under the DMCA. The problem is that the work in question was a rudimentarily edited copy of Huckleberry Finn. Unless another work triggered this, I am assuming that someone is trying to claim that Huckleberry Finn is under copyright, which is an issue.
This file was edited from the Project Gutenberg edition at the following URL:
http://www.gutenberg.org/ebooks/76
Since I removed all references to Project Gutenberg I have removed any worry of trademark infringement as per their license. So if this is an issue about a certain edition / edits – my source file is from Project Gutenberg, and you should once again take it up with the original person claiming copyright over this work. The main problem that this calls into question is what other public domain works are being banned from your site? Huckleberry Finn is fairly high profile, what about lesser known public domain works?
I did say my edition was a crude edit; it was done to coincide with a blog post pointing out that anyone can legally do anything to a public domain book (except on Scribd). Given the recent controversy about the Huckleberry Finn edition coming out that removes the N word and Injun, changing Huckleberry to Fuckleberry was a crude but important commentary. I can thank you now for giving me a new blog post to write instead of that other one.
Received Copyright Notice:
Subject: Your Document Has Been Removed
Hello, creeva2912 –
We have removed your document “Adventures of Huckleberry Finn – Fuckleberry Edition” (id: 46416259) because our text matching system determined that it was very similar to a work that has been marked as copyrighted and not permitted on Scribd.
Like all automated matching systems, our system is not perfect and occasionally makes mistakes. If you believe that your document is not infringing, please contact us at copyright@scribd.com and we will investigate the matter.
As stated in our terms of use, repeated incidents of copyright infringement will result in the deletion of your Scribd.com account and prohibit you from uploading material to Scribd.com in the future. To prevent us from having to take these steps, please delete from scribd.com any material you have uploaded to which you do not own the necessary rights and refrain from uploading any material you are not entitled to upload. For more information about Scribd.com’s copyright policy, please read the Terms of Use located at http://www.scribd.com/terms.
Best regards, Scribd Support Team Questions? http://scribd.com/faq
UPDATE: Received a reply:
Subject: Scribd Copyright/DMCA request received: 94540 / Copyright Violation for Huckleberry Finn (ticket #94540)
I’ll update this again when I get back a real response. Until then we’ll wait on Jason.
UPDATE – Jason Replies
Your request (#94540) has been deemed solved.
To review, comment and reopen the request, follow the link below:
http://support.scribd.com/tickets/94540
Jason, Jan-06 11:44 am (PST):
Thank you for contacting Scribd Support.
I’m sorry that our automated copyright protection system misidentified your document as infringing. We try very hard to protect the rights of authors, and sometimes our copyright robot gets a little oversensitive.
I’ve restored your document and removed all references from your account.
Cheers,
Scribd Support
copyright@scribd.com
Questions? http://scribd.com/faq
Now, a normal person who doesn’t understand the implications of the mis-education caused by this type of behavior (seriously, I wasn’t trying to sound smart there) would let this go. I am curious, though, where this falls on the “lessons learned” chart. What other things are they going to claim are under copyright? Why can’t we submit “known good” fingerprints of works? The auto-detection behind YouTube’s music and video matching might not be explorable, but a text-only search sure is.
Here was my response:
So while my case is fixed, what about other public domain books? How is this going to be fixed from a long-term standpoint? Should I start taking popular public domain books and submitting them one by one to see if they are erroneously flagged?
I’m curious now what Scribd’s solution is for handling this going forward, instead of on a case-by-case basis. When I checked, Scribd already had many copies of Huckleberry Finn, so I am confused why they were given a pass and mine was flagged. I would also like to know why Scribd can’t have a known list of public domain books (Huck Finn would easily be in the top 100-500) and have those fingerprinted as “clean”.
Waiting
UPDATE -
Reply from Jason:
We are continuously improving our copyright management system to safeguard against false positives across the board, not just on a case-by-case basis. However, details of our copyright systems are not available to the public.
You are more than welcome to test the system with other public domain books. I look forward to your findings.
Best,
Jason
I know what I’m doing tonight.
![scribd-logo[1]](http://creeva.com/wp-content/uploads/2011/01/scribd-logo1.jpg)
