As coronavirus infections exploded in the spring of 2020, everyone was looking for ideas about how to fight what had become a full-blown pandemic. The Wallace H. Coulter Department of Biomedical Engineering put out the call for ideas and offered faculty members seed funding to pursue them.
Turns out, Cassie Mitchell already was on the case.
“We’d been working for a few weeks on something, since the White House started asking data scientists to do analysis on old SARS data, but also on the emerging Covid dataset that was being updated weekly,” said Mitchell, assistant professor in Coulter BME who specializes in using data to forecast disease. “We had already started adapting our text-mining architecture for Covid-19.”
Text mining is what it sounds like: an artificial intelligence process that analyzes large volumes of existing text for useful data that could lead to new discoveries. Mitchell’s lab received a $10,000 seed grant to use her tools to dig through millions of peer-reviewed articles, seeking hidden patterns relevant to Covid-19, such as patient risk factors or existing drugs that might be repurposed to treat the disease.
Using Covid-19 as a test case, the lab adapted a process called link prediction, an important tool in artificial intelligence and machine learning that predicts the existence of a link between two entities. It’s kind of like filling in the blanks after identifying the blanks.
Link prediction is at play when your social media platform suggests a new friend for you, or when an online marketplace predicts which customers will buy what products.
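To make the friend-suggestion case concrete, here is a minimal sketch of one simple link prediction heuristic. It is an illustration only, not the lab’s method: the names, the toy friendship graph, and the choice of the Jaccard score (shared friends divided by all friends of either person) are assumptions made for the example.

```python
from itertools import combinations

# Hypothetical friendship graph: each person maps to a set of existing friends.
friends = {
    "ana": {"ben", "cho"},
    "ben": {"ana", "cho", "dev"},
    "cho": {"ana", "ben", "dev"},
    "dev": {"ben", "cho", "eli"},
    "eli": {"dev"},
}


def jaccard(a, b):
    """Score a pair by shared friends divided by all friends of either person."""
    union = friends[a] | friends[b]
    return len(friends[a] & friends[b]) / len(union) if union else 0.0


def predict_links(top_n=3):
    """Rank pairs that are not yet linked; a higher score means a likelier link."""
    candidates = [
        (jaccard(a, b), a, b)
        for a, b in combinations(sorted(friends), 2)
        if b not in friends[a]  # only consider the "blanks" in the graph
    ]
    # Highest score first; ties broken alphabetically for a stable order.
    candidates.sort(key=lambda t: (-t[0], t[1], t[2]))
    return candidates[:top_n]


suggestions = predict_links()
for score, a, b in suggestions:
    print(f"{a} - {b}: {score:.2f}")
```

In this toy graph the strongest suggestion is the pair with the most friends in common. Production systems generally rely on richer signals and learned models rather than a single overlap score.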
“Though it’s used for other things, we adapted it to biomedical text — as you might imagine, a biomedical application is more difficult than dealing with customer segmentation data,” Mitchell said.
Mitchell’s team excavated information from the articles and built a “knowledge graph, or network that links symptoms, drugs, antecedent diseases, genes, proteins, and much more to Covid-19 or similar coronaviruses,” she said.
The team ranked relationships with the coronavirus to find the most promising research paths, with the intent of expediting translational research. They highlighted thousands of drugs as candidates for repurposing and further research.
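The article does not say how the relationships were scored, so the following is only a rough sketch of the general idea: store extracted relationships as a small graph, then rank the entities linked to a target. The placeholder entity names, the article counts, and ranking by raw count are all assumptions for illustration, not the lab’s data or method.

```python
from collections import defaultdict

# Hypothetical records: (entity, entity_type, target, supporting_article_count).
# Every name and number here is a placeholder, not a real finding.
mentions = [
    ("drug_A", "drug", "covid-19", 12),
    ("drug_B", "drug", "covid-19", 3),
    ("gene_X", "gene", "covid-19", 7),
    ("symptom_Y", "symptom", "covid-19", 20),
    ("drug_A", "drug", "sars", 5),
]


def build_graph(records):
    """Build target -> {entity: total count} plus a lookup of entity types."""
    graph = defaultdict(dict)
    types = {}
    for entity, kind, target, count in records:
        graph[target][entity] = graph[target].get(entity, 0) + count
        types[entity] = kind
    return graph, types


def rank(graph, types, target, kind):
    """Rank entities of one type linked to a target, strongest evidence first."""
    scored = [(e, c) for e, c in graph[target].items() if types[e] == kind]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


graph, types = build_graph(mentions)
top_drugs = rank(graph, types, "covid-19", "drug")
print(top_drugs)
```

Counting supporting articles is the simplest possible ranking signal. A link prediction model would also score pairs that never co-occur in any article, which is where unexpected repurposing candidates would come from.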