Before I respond, I should disclose that we are currently working on and selling projects that use DLP tools.
That said, what I think Richard misses about DLP is the fingerprinting and discovery aspect. DLP solutions provide radically enhanced methods of fingerprinting and finding 'unstructured' data, beyond comparing hashes or strings. Unstructured data is data that doesn't follow a fixed, programmatically checkable pattern. Credit card numbers, for example, are structured data: they need to conform to certain guidelines, so it's fairly easy to find and detect them. Unstructured data, on the other hand, is things like spreadsheets, documents, presentations, podcasts and movies. Even then, those are just containers for the data, and the same information can be copied from a Word document to a spreadsheet, for example. DLP tools provide a way of fingerprinting the underlying information and then detecting it across the organisation.
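To make the idea of fingerprinting the information rather than the container a bit more concrete, here is a minimal sketch using hashed word shingles. This is an illustration only, not how any particular DLP product works; the function names, the five-word window and the sample text are all my own assumptions.

```python
import hashlib
import re


def shingles(text, k=5):
    """Return the set of hashed k-word shingles for a piece of text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < k:
        return set()
    return {
        hashlib.sha1(" ".join(words[i:i + k]).encode("utf-8")).hexdigest()
        for i in range(len(words) - k + 1)
    }


def containment(fingerprint, candidate_text, k=5):
    """Fraction of the protected document's shingles found in the candidate."""
    if not fingerprint:
        return 0.0
    return len(fingerprint & shingles(candidate_text, k)) / len(fingerprint)


# Hypothetical protected text (stand-in for a set of board minutes).
minutes = (
    "The board approved the acquisition of the northern subsidiary "
    "and agreed to defer the dividend until the third quarter."
)
fingerprint = shingles(minutes)

# The same text pasted into a different container, with extra text around it.
spreadsheet_cell = "FYI from yesterday: " + minutes + " Please keep this quiet."
unrelated = "Lunch menu for the third quarter offsite has been approved by the team."

copied_score = containment(fingerprint, spreadsheet_cell)
unrelated_score = containment(fingerprint, unrelated)
```

Because the score is computed over overlapping word windows rather than a whole-file hash, the copy still matches after being moved into a different document, which is the property a plain hash comparison lacks.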
For example, one could fingerprint the board minutes on a PA's laptop, then examine all mailboxes, databases and file servers to locate copies of them. Or one could do the same for customer records and work out which systems are storing customer personal information. Alternatively, one could work out which systems are in scope for PCI DSS compliance (or descoping) because they contain cardholder data. Then, much later, one could monitor communication channels, flash sticks and printers, and block any instances of the classified information being distributed outside of designated groups.
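The PCI DSS scoping case is the easiest one to sketch, because card numbers are structured. The following is a rough illustration, assuming documents are already available as text: it finds 13 to 19 digit candidates, checks them with the Luhn checksum, and counts hits per system. The store names and contents are made up, and a real scan would need far more care around formats and false positives.

```python
import re

# Candidate card numbers: 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits):
    """Return True if a string of digits passes the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:  # double every second digit, counting from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_card_numbers(text):
    """Return digit strings in text that look like card numbers and pass Luhn."""
    hits = []
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits


def systems_in_scope(stores):
    """Map each store name to the number of its documents containing card data."""
    scope = {}
    for name, documents in stores.items():
        count = sum(1 for doc in documents if find_card_numbers(doc))
        if count:
            scope[name] = count
    return scope


# Hypothetical stores; 4111 1111 1111 1111 is a widely used test card number.
stores = {
    "crm-db": ["Customer paid with 4111 1111 1111 1111 on Monday."],
    "wiki": ["Order reference 1234 5678 9012 3456 shipped."],
    "mail-archive": ["Card 4111-1111-1111-1111 declined", "See you at lunch"],
}
in_scope = systems_in_scope(stores)
```

In this toy data set the wiki drops out of scope because its 16-digit order reference fails the checksum, which is the kind of descoping evidence the discovery phase is meant to produce.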
The reason this sort of stuff is important is that organisations aren't very good at knowing where their important data is. People who've done 'information classification' projects before will tell you they took a long time because the business people knew what data was important, but not how or where it was stored, and the IT people knew which systems the business people thought were important, but not which parts of the information in those systems were important. Being able to do this sort of fingerprinting and discovery makes the task of mapping these to each other much easier. Additionally, being able to fingerprint a blob of data and assign the whole blob specific properties makes life easier. You don't have to classify each paragraph of the board meeting minutes; you can fingerprint the whole document and assign a single policy to all of it.
The second part of a DLP solution, the enforcement, is the bit Richard was talking about. If we look at previous information classification projects again, even if you did come up with a decent data/system/comms map and classification scheme, you couldn't do much more than write policies or put a bit more effort into securing the systems holding the important data. DLP tools let security teams start putting controls around the actual data, not its format or the system it lives on, and provide a method to enforce that policy. In implementing this part, it's easy to alert on everything and end up with an unmanageable and unwatched list of alerts. Initially, key policies should be expressed as block rules. Assuming you aren't an unrealistic rule-nazi, this will allow you to define rules for very confidential information or high-risk leaks (e.g. 1 million customer records, or a single set of board minutes). However, once you've got that tweaked and usable, all the stuff in the middle may need a more nuanced approach in the form of logs and alerts. It's my personal belief that security analysts can't do that part; I've tried, and it's just way too much work. The communication context is something the data owner needs to comment on, and it takes an analyst too much time to work out. This is where I think the workflow component comes in, as described in my other blog entry on the topic.
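As a rough sketch of that tiering (block the obvious high-risk cases, route the middle ground to the data owner, log the rest), something like the following captures the shape of it. The class names, classifications and thresholds are hypothetical and chosen to mirror the examples above; real products express this in their own policy languages.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    BLOCK = "block"
    ALERT_OWNER = "alert_owner"
    LOG = "log"
    ALLOW = "allow"


@dataclass
class Event:
    classification: str  # e.g. "board_minutes", "customer_record", "internal"
    match_count: int     # number of fingerprinted items found in the transfer
    leaves_group: bool   # True if the recipient is outside the designated group


# Hypothetical thresholds: classification -> match count at which to block.
BLOCK_THRESHOLDS = {
    "board_minutes": 1,
    "customer_record": 1_000_000,
}


def decide(event):
    """Pick an action for one detected transfer."""
    if not event.leaves_group or event.match_count == 0:
        return Action.ALLOW
    threshold = BLOCK_THRESHOLDS.get(event.classification)
    if threshold is None:
        # Not a class we have a block rule for: keep a record, nothing more.
        return Action.LOG
    if event.match_count >= threshold:
        return Action.BLOCK
    # Sensitive class below the block threshold: the data owner, not the
    # security analyst, is asked to judge the communication context.
    return Action.ALERT_OWNER


decisions = [
    decide(Event("board_minutes", 1, True)),
    decide(Event("customer_record", 50, True)),
    decide(Event("internal", 3, True)),
    decide(Event("board_minutes", 1, False)),
]
```

Keeping the block list short and explicit is the point: only the cases nobody would argue about are blocked outright, and everything else becomes a question for whoever owns the data.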
Then (almost) finally, I think DLP has the potential to allow an organisation with an immature security posture to fairly quickly put controls around high-risk data, and to start working out where their high-risk data is stored and where their biggest leaks are. Those last two will help them prioritise their security efforts better than the other risk assessments consultancies like mine are famous for overcharging for ;)
I do agree that DLP tools aren't going to provide a foolproof way of detecting all attempts at smuggling data out. I've tested a couple, and while steganography works all the time, in some cases just bzip2'ing the data worked too. I don't think only stupid people will get detected by the DLP tool (although given the number of "mistakes" you end up seeing, blocking stupid is useful), as the tools do go quite far in picking up things like copy-pasting snippets of text into other documents or inserting random text between paragraphs. But in the end it won't kill that werewolf for you.
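The bzip2 case is easy to reproduce in miniature. The sketch below is my own toy, not a model of any product I tested: a scanner that only looks at raw bytes misses compressed content, while one that tries to unpack the payload first catches it. It does nothing about steganography, and the sample string is invented.

```python
import bz2

# Hypothetical protected content.
SECRET = b"acquisition price agreed at the March board meeting"


def naive_scan(payload, needle=SECRET):
    """Look for the protected bytes in the payload exactly as sent."""
    return needle in payload


def unpacking_scan(payload, needle=SECRET, max_depth=3):
    """Look for the protected bytes, unpacking bzip2 layers up to max_depth."""
    if needle in payload:
        return True
    if max_depth == 0:
        return False
    try:
        inner = bz2.decompress(payload)
    except (OSError, ValueError):
        # Not valid bzip2 data (or truncated), so there is nothing to unpack.
        return False
    return unpacking_scan(inner, needle, max_depth - 1)


plain = b"Minutes: " + SECRET
packed = bz2.compress(plain)
double_packed = bz2.compress(packed)

results = {
    "naive_plain": naive_scan(plain),
    "naive_packed": naive_scan(packed),
    "unpack_packed": unpacking_scan(packed),
    "unpack_double": unpacking_scan(double_packed),
}
```

The depth limit matters in practice, since a scanner that unpacks without bound is itself an easy target, and every additional encoding an attacker can pick from is another unpacker the tool has to carry.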