Document Type
Article
Publication Date
4-5-2020
Abstract
The year 2020 has seen an unprecedented COVID-19 pandemic due to the outbreak of a novel strain of coronavirus in 180 countries. In a desperate effort to discover new drugs and vaccines for COVID-19, many scientists are working around the clock. Their time and effort may benefit from computer-based mining of the large volume of health science literature, which is a treasure trove of information. In this paper, we develop a graph-based model using the abstracts of 10,683 scientific articles to find key information on three topics: transmission, drug types, and genome research related to coronavirus. A subgraph is built for each of the three topics to extract more topic-focused information. Within each subgraph, we use betweenness centrality to rank the importance of keywords related to drugs, diseases, pathogens, hosts of pathogens, and biomolecules. The results reveal intriguing information about antiviral drugs (Chloroquine, Amantadine, Dexamethasone), pathogen hosts (pigs, bats, macaque, cynomolgus), viral pathogens (Zika, dengue, malaria, and several viruses in the Coronaviridae family), and proteins and therapeutic mechanisms (oligonucleotide, interferon, glycoprotein) in connection with the core topic of coronavirus. The categorical summary of these keywords and topics may be a useful reference to expedite research and to recommend new and alternative directions for COVID-19 research.
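The ranking step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the graph or the topic subgraphs are constructed, so this sketch assumes that keywords are nodes, that two keywords are linked when they occur in the same abstract, and that a topic subgraph keeps only keywords from abstracts mentioning the topic term. The toy keyword lists and the helper names (`build_cooccurrence_graph`, `topic_subgraph`, `betweenness_centrality`, `rank_keywords`) are hypothetical. Betweenness centrality is computed with Brandes' algorithm for unweighted graphs.

```python
from collections import defaultdict, deque
from itertools import combinations


def build_cooccurrence_graph(docs):
    """Assumed construction: link two keywords if they share a document."""
    graph = defaultdict(set)
    for keywords in docs:
        for a, b in combinations(sorted(set(keywords)), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def topic_subgraph(graph, docs, topic):
    """Assumed rule: keep keywords from documents that mention the topic."""
    keep = set()
    for keywords in docs:
        if topic in keywords:
            keep.update(keywords)
    return {node: graph.get(node, set()) & keep for node in keep}


def betweenness_centrality(graph):
    """Brandes' algorithm, unweighted and undirected, unnormalized scores."""
    score = dict.fromkeys(graph, 0.0)
    for source in graph:
        order = []                        # nodes in non-decreasing distance
        preds = {node: [] for node in graph}
        sigma = dict.fromkeys(graph, 0)   # number of shortest paths from source
        sigma[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:                      # breadth-first search
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(graph, 0.0)
        for w in reversed(order):         # accumulate dependencies backwards
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Every undirected pair was counted once from each endpoint.
    return {node: s / 2 for node, s in score.items()}


def rank_keywords(scores):
    """Sort keywords by descending centrality, breaking ties by name."""
    return sorted(scores, key=lambda k: (-scores[k], k))


# Hypothetical keyword lists standing in for processed abstracts.
docs = [
    ["coronavirus", "transmission", "bats"],
    ["coronavirus", "transmission", "pigs"],
    ["coronavirus", "chloroquine"],
    ["chloroquine", "malaria"],
]
graph = build_cooccurrence_graph(docs)
full_scores = betweenness_centrality(graph)
topic_scores = betweenness_centrality(topic_subgraph(graph, docs, "transmission"))
print(rank_keywords(full_scores))
```

In this toy graph, a keyword scores highly when many shortest paths between other keywords pass through it, which is the sense in which betweenness centrality identifies bridging terms. A real pipeline would also need keyword extraction and entity typing (drug, host, pathogen, biomolecule), which this sketch leaves out.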
Recommended Citation
S. Ahamed, M. Samad, "Information Mining for COVID-19 Research From a Large Volume of Scientific Literature," arXiv (2020). https://arxiv.org/abs/2004.02085