When you think of using social media to conduct groundbreaking customer research, a hospital probably isn’t the first place you’d look. But the Computational Epidemiology Group in Boston Children’s Hospital’s Computational Health Informatics Program (CHIP) is a leader in mining social channels for health-related insights. The group is now looking to sentiment analysis on Twitter to complement more traditional research methods and help hospital administrators better track patient experience metrics.
The team at Boston Children’s Hospital first attracted attention a few years ago when they successfully mined opinions from reviews on the popular rating website Yelp. Their analysis showed that the ingredients people described in reviews mentioning food poisoning closely matched the ingredients that public health officials later reported were involved in food-borne illness outbreaks during the same period.
It was an important finding since it showed that socially shared information potentially carried useful health insights. Dr. John Brownstein, BCH’s Chief Innovation Officer and a leader of the research team, noted, “It’s hard to make people come to you. People aren’t engaged necessarily in public health.” But if you can tap into their online voices, he said, “you can actually get a huge amount of information that would not come from another vehicle.”
Could Twitter provide a faster, more useful picture of hospital quality?
Jared Hawkins, a faculty member at Boston Children’s Hospital and a researcher on Brownstein’s team, is an active developer of new tools and methods that extend the usefulness of sentiment analysis. We recently spoke with Hawkins about research he’s published using Twitter sentiment as a proxy for hospital quality metrics.
As Hawkins explained, quality metrics like the Hospital Consumer Assessment of Healthcare Providers and Systems (HCAHPS) survey are extremely important to hospital administrators. They reflect the quality of care being provided and drive everything from high-profile rankings to federal government reimbursement rates. And yet, these reports often lag today’s events by 12 or more months because of the need to survey patients and tabulate findings. Many administrators also complain about low response rates and insufficient granularity in the data. And, of course, the surveys sometimes overlook non-clinical (but still important) parts of the patient experience—parking, food, traffic, and so forth.
What if, Hawkins wondered, commentary on Twitter could serve as a proxy for these official quality scores or serve as a new satisfaction metric? Even if potentially less accurate, could the timeliness and granularity of a Twitter-powered sentiment score still make it useful for reporting and decision-making?
“Given that social media data are close to real time,” he sums up, “we wanted to see if we could capture this discussion and if the content is useful.”
With Twitter’s support, Hawkins and his colleagues began by developing a collection of hundreds of thousands of tweets that explicitly mentioned hospitals in the 12-month period between late 2012 and late 2013.
They first weeded out tweets that had nothing to do with patient experience. Hawkins is proud of their efforts here to identify and isolate specific topics; they pushed the envelope of artificial intelligence and machine learning to automatically eliminate tweets related to, say, fundraising, open jobs, or clinical research. It’s easier said than done at scale.
The team then trained a computer algorithm to parse the remaining tweets and watch for expressions of anger, happiness, and nearly every emotion in between. Reading each tweet by hand to determine meaning and sentiment would never have worked given the size of their dataset. Hawkins’ team instead relied again on sophisticated algorithms to do the work for them.
Importantly, Hawkins’ experience had taught him that sentiment analysis can be tricky, since people communicate in confusing and inconsistent ways. Consider the difficulty of teaching an algorithm to differentiate between “THEY would not let my dog stay in this hospital” and “I would not let my dog stay in this hospital.” Slang, abbreviations, emojis, foreign languages, and spelling mistakes added to the challenges that Hawkins’ team faced. Both open-source and internally developed tools helped get the job done with surprising accuracy.
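To see why the dog example is hard, consider a minimal sketch of a word-list sentiment scorer. This is not the team's method, which the article does not describe in detail; the tiny `LEXICON`, the `tokenize` helper, and the `naive_score` function are all hypothetical and exist only for illustration. Because the scorer looks at words in isolation, the two dog sentences differ by a single pronoun that carries no sentiment weight, so they receive exactly the same score.

```python
import re

# Hypothetical toy lexicon (an assumption for illustration only).
# Real systems use much larger vocabularies or trained models.
LEXICON = {
    "love": 1, "great": 1, "kind": 1, "thank": 1,
    "rude": -1, "awful": -1, "dirty": -1, "hate": -1,
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def naive_score(text):
    """Sum the lexicon weights of the words; unknown words count as 0."""
    return sum(LEXICON.get(token, 0) for token in tokenize(text))


they_version = "THEY would not let my dog stay in this hospital"
i_version = "I would not let my dog stay in this hospital"

# The only difference between the sentences is the subject pronoun,
# which a word-list scorer cannot interpret.
differing_words = set(tokenize(they_version)) ^ set(tokenize(i_version))

print(naive_score(they_version), naive_score(i_version), differing_words)
```

Telling the two sentences apart requires knowing who is acting and what that implies, which is one reason a trained model or hand-built rules are usually needed on top of simple word counts.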
Hawkins’ Twitter-powered research succeeds, so what’s next?
Ultimately, Hawkins’ research showed that his hypothesis was mostly right. After eliminating hospitals that didn’t attract enough tweets to offer reliable data, he found an interesting correlation between some important hospital quality outcomes and his Twitter sentiment scores.
Importantly, though, the team did not see a relationship between tweet sentiment and the gold-standard HCAHPS experience data.
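The filter-then-correlate step can be sketched roughly as follows. The article does not say which statistic or tweet threshold the team used, so the Pearson correlation, the `min_tweets` cutoff of 50, the field names, and the sample rows are all assumptions made up for illustration; none of them are the study's data or results.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    # Note: this divides by zero if either list has no variation.
    return cov / (spread_x * spread_y)


def sentiment_vs_outcome(hospitals, min_tweets=50):
    """Drop hospitals with too few tweets, then correlate what is left.

    `min_tweets` is a hypothetical cutoff, not a value from the study.
    """
    kept = [h for h in hospitals if h["n_tweets"] >= min_tweets]
    sentiments = [h["mean_sentiment"] for h in kept]
    outcomes = [h["outcome"] for h in kept]
    return len(kept), pearson(sentiments, outcomes)


# Made-up rows for illustration only; "outcome" stands in for any
# hospital quality measure.
hospitals = [
    {"name": "A", "n_tweets": 120, "mean_sentiment": 0.1, "outcome": 10.0},
    {"name": "B", "n_tweets": 80, "mean_sentiment": 0.2, "outcome": 20.0},
    {"name": "C", "n_tweets": 60, "mean_sentiment": 0.3, "outcome": 30.0},
    {"name": "D", "n_tweets": 5, "mean_sentiment": 0.9, "outcome": 0.0},
]

n_kept, r = sentiment_vs_outcome(hospitals)
print(n_kept, round(r, 3))
```

Hospital D is excluded because five tweets are too few to trust its average sentiment. The same function could be run against any outcome column, which is how one measure might show a relationship while another, such as a survey score, does not.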
“This is a brand new way of using Twitter data,” Hawkins notes. “It may be that we have to be cautious about using tweet sentiment to understand quality.”
But obsessing about perfection misses the point, he argues, if the speed of getting a “good enough” insight can be substantially improved. Research that could further improve the accuracy of his findings is already underway.
Hawkins went on to rate hospitals based on his findings, naming Seattle Children’s Hospital, the Mayo Clinic, and Boston Children’s Hospital as the early leaders in Twitter-based public sentiment. They’re all highly regarded institutions so it shouldn’t be a surprise that they’re also much beloved on Twitter by their patients.
Looking to the future, Hawkins envisions a number of enhancements. He’d first like to better understand the strength of the sentiment being communicated. Was the patient furious or merely unhappy? It’d also be useful, he thinks, if the cause of the sentiment could be identified. Was it the food that made the patient unhappy? Or was it a surly physician? Daily, weekly, or even real-time reports are on his roadmap as well, and a larger dataset and additional data sources could also help. Taking a cue from J. D. Power and Associates, Hawkins can envision social metrics one day powering awards and recognition. The team is looking to commercialize some of these ideas and has launched the service CrowdClinical to explore possibilities.
Even with these improvements, it’s important to note that Hawkins doesn’t see social sentiment metrics ever replacing conventional survey tools. At best, he thinks they could be a useful complement that helps administrators quickly understand patient sentiment and then make informed plans to improve their experience.
An extended version of this article was originally published by Connective DX in their Connective Thinking newsletter.
James A. Gardner, @jamesagardner, is a Boston-area sales and marketing professional with a passion for consumer technologies and all things health, web & social. He started his career with Procter & Gamble before earning his MBA at Northwestern University's Kellogg School of Management. Since then, James has served senior clients as a consultant with McKinsey & Company, led complex digital projects with Boston-area agencies, and built several high-performing marketing teams. He’s also been published and quoted in multiple professional publications, most recently CMSWire, CIO.com, eHealthcare Strategy & Trends, and MedTech Boston.