Steganography and Piracy in the Age of Digital Distribution
It’s clear that, now more than ever, content is distributed digitally. For example, Amazon recently revealed that it sold more books on the Kindle last quarter than hardcovers. Unfortunately for copyright holders, digital content is extremely easy to pirate. And although the ethics of the situation are unclear (as a society, do we even want to enforce copyright?), this post proceeds from the premise that we have a digital property and want to protect it.
DRM (“digital rights management”) is out. People hate it (the closed iTunes ecosystem might suggest otherwise, but it may be an exception due to the dominance of Apple’s delivery infrastructure). In general, people want their files to be unencrypted so they can use them how they want and when they want. So how do you distribute your written content in a way that deters piracy? One method is to have your ebook distributor automatically add a stamp to every copy of your PDF. This is really easy to do, with only a few lines of code. Unfortunately, it’s also really easy to subvert: it takes only a few lines of code to paste a big black block over every stamp.
You could password-protect the file instead, but the password can easily be distributed along with it. The problem, from an abstract perspective, is that a password only gates access: once someone has decrypted the file, they have all of the information and can redistribute it however they want. What you need is to encode a unique, identifiable signature into the content itself, without shifting its appearance enough to alert the viewer. In other words, the text itself (or the information content itself) must be structured so that extra information can be inferred from it.
My first proposed solution? Modifying the kerning of the text in a way that subtly, yet uniquely, encodes a key. The problem with my solution is that the text can be scanned and normalized, and then re-output as either a separate PDF or even a text file. The normalization process is difficult, though: to parse the PDF text, the pirate is forced to use some sort of OCR technology. The weakness can be mitigated further with a sufficiently randomized font, such that the person stripping the mark is forced to rewrite the OCR algorithm each time a new text is to be interpreted. This is a decent deterrent, but not foolproof.
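To make the kerning idea concrete, here is a minimal sketch. Everything in it is my own assumption for illustration: the function names, the 32-bit buyer ID, and the offset size `delta` are hypothetical, and it only computes the per-gap offsets rather than writing them into a real PDF. Each gap between adjacent characters is nudged slightly wider or narrower depending on one bit of the ID, and the ID is repeated across the text so it can be recovered by majority vote.

```python
def id_to_bits(buyer_id: int, width: int = 32) -> list[int]:
    """Least-significant-bit-first list of the buyer ID's bits."""
    return [(buyer_id >> i) & 1 for i in range(width)]


def kerning_offsets(text: str, buyer_id: int, width: int = 32,
                    delta: float = 0.02) -> list[float]:
    """One offset per gap between adjacent characters.

    A 1 bit widens the gap by `delta`, a 0 bit narrows it. The ID's
    bits repeat cyclically for as long as the text lasts.
    """
    bits = id_to_bits(buyer_id, width)
    gaps = max(len(text) - 1, 0)
    return [delta if bits[i % width] else -delta for i in range(gaps)]


def recover_id(offsets: list[float], width: int = 32) -> int:
    """Majority vote per bit position over the measured offsets."""
    votes = [0] * width
    for i, offset in enumerate(offsets):
        votes[i % width] += 1 if offset > 0 else -1
    return sum(1 << i for i, vote in enumerate(votes) if vote > 0)


text = "The quick brown fox jumps over the lazy dog, twice over."
offsets = kerning_offsets(text, buyer_id=123456)
recovered = recover_id(offsets)
```

The text needs at least `width + 1` characters for every bit to get a vote. Repeating the ID is a deliberate choice: it lets the mark survive a few mis-measured gaps, though not the full scan-and-normalize attack described above.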
Another solution: make sure that the aesthetics of the presentation of the information are a significant value-add. That way, anyone who normalizes the text loses part of what makes the product worth having.
Another solution: embed some useful diagrams in your document, and use a steganography technique like LSB (least significant bit) encoding to hide the user’s key in them. Unfortunately, it’s also easy for a would-be pirate to write software that applies a wash of random pixel noise over the entire page, distorting the hidden information.
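For anyone who hasn’t seen LSB encoding, here is a rough sketch of both the technique and the attack. It assumes the diagram is available as a flat list of 8-bit grayscale pixel values; the function names and the `b"user42"` key are made up for illustration, and a real tool would read and write an actual image format.

```python
import random


def embed_lsb(pixels: list[int], key: bytes) -> list[int]:
    """Hide `key` in the lowest bit of the first len(key) * 8 pixels."""
    bits = [(byte >> (7 - i)) & 1 for byte in key for i in range(8)]
    if len(bits) > len(pixels):
        raise ValueError("image too small for key")
    out = list(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & ~1) | bit  # clear the low bit, then set it
    return out


def extract_lsb(pixels: list[int], key_len: int) -> bytes:
    """Read `key_len` bytes back out of the low bits."""
    out = bytearray()
    for b in range(key_len):
        byte = 0
        for i in range(8):
            byte = (byte << 1) | (pixels[b * 8 + i] & 1)
        out.append(byte)
    return bytes(out)


def noise_wash(pixels: list[int], rng: random.Random) -> list[int]:
    """The pirate's counter-move: replace every low bit with a random one."""
    return [(p & ~1) | rng.getrandbits(1) for p in pixels]


original = list(range(256))            # stand-in for real image data
stego = embed_lsb(original, b"user42")
key = extract_lsb(stego, 6)            # recovers the hidden key
washed = noise_wash(stego, random.Random(0))
```

No pixel moves by more than one brightness level in either direction, which is why the mark is invisible, and also why the wash is just as invisible while scrambling the key.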
So for god’s sake, play your cards close to your chest and don’t let people know what’s up. But I’m working on some software to automatically add these sorts of subtle protections to ebooks without disrupting the user’s ability to fully own and enjoy their purchase. Contact me if interested.
AwesomenessReminders is owned and operated by me, Zachary Burt.
