Marc Zimmermann, Juliane Fluck, Le T. Thi, Corinna Kolarik, Kai Kumpf and Martin Hofmann
Pages 785-796 (12)
Information extraction approaches have been successfully applied to mine the scientific literature in biology and medicine. So far, research and development in this domain has focused mainly on the recognition and extraction of gene and protein names in the context of molecular biology and genome research, and of disease names and other medical terms in the context of clinical research. Like biology and the medical sciences, medicinal chemistry, pharmacology and toxicology are descriptive sciences. However, information extraction approaches in these disciplines encounter a number of problems specific to the fact that these fields are essentially centred on chemical compounds and their structures. In this review, we give a short overview of general information extraction strategies in the life sciences and introduce new approaches for applying information extraction to the domains of pharmacology, medicinal chemistry and toxicology. Finally, we emphasize how information extraction approaches will support public and commercial research in medicinal chemistry, pharmacology and toxicology by linking information on chemical structures to biological information.
information extraction, named entity recognition, chemical structure reconstruction, biological effects, semantic mediation
Fraunhofer-Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, D-53754 St. Augustin, Germany