A recent Wall Street Journal article on Nielsen scraping data from patientslikeme.com has drawn attention to the privacy debate regarding personal health information. It’s important to note that Patients Like Me (PLM) encourages patients who suffer from specific conditions to actively record and share their health information with the PLM community.
I have used this blog to express my opinions about social media and privacy. In an earlier post, I declared, “your stuff isn’t worth anything.” My comment was a generalization aimed at the picture of a cat wearing a turtleneck that someone posts on Facebook or the 140-character “ode to cheesecake” shared on Twitter. I stand by that statement; “they” aren’t interested in ownership of gems of that ilk.
Health information is a different matter. There is value to public health when social health data is compiled into a composite form. For example, what if:
- A group of patients using a specific drug suddenly report a side effect that wasn’t caught in clinical trials
- The entire population of Springfield complains about a local hospital being dropped by an insurance company and threatens to find another insurance company
- A diabetes blogger suggests a new feature for an insulin pump and the entire online diabetes community weighs in
- Digitally active physicians chime in and correct misleading information about a disease when that misinformation is proliferating on the internet
What’s alarming is that users may be contributing information to a site that can be combined with other public data and used to identify who they are, what they suffer from, where they live, etc. In other words, someone can scrape data from across the internet to piece together your personal puzzle, looking something like: Alphonse Brownstone, 123 Main Street, chronic earlobe fungus, likes chocolate-covered tater tots, 1987 conviction for public drunkenness. Needless to say, this is a real issue for privacy wonks, and you should be worried about it too.
Whether we like it or not, data scraping is what makes the internet useful and free. Turn that off and all of our favorite sites will have two choices: charge for service or shut down. Some digitally engaged patients already expect their information to be analyzed; they just want to be treated appropriately. Some research firms are asking for an open conversation about privacy.
So what we really need is some sort of Declaration of User-Generated Content that spells out exactly what everyone’s responsibilities are. Some examples:
- Organizations that own digital properties that encourage User-Generated Content must allow users to control who their content is shared with
- Organizations that scrape the internet for data must make every effort to avoid personally identifiable information and publish data only in composite views
- Individuals who publish their own content on the internet must assume that the information is visible to everyone unless they have made a clear effort to block that content from the public view
I’d put my Phil Hancock on that. Would you? If something is missing, tell us in the comments…