Jian Zhang*, Yu Zhang, Yanlin Li, Song Guo and Guifu Yang
Pages 1888-1897 (10)
Objective: Cancer is one of the most serious diseases affecting human health. Early diagnosis and control significantly increase the chances of a cure. Detecting cancer biomarkers in body fluids is therefore attracting growing attention among oncologists. In silico prediction of body fluid-related proteins, which can serve as cancer biomarkers, helps narrow down candidates before labor-intensive and time-consuming biochemical experiments.
Methods: In this work, we propose a novel method for high-throughput identification of cancer biomarkers in human body fluids. We incorporate physicochemical properties into the weighted observed percentages (WOP) and position-specific scoring matrix (PSSM) profiles to strengthen the features that reflect the evolutionary conservation of body fluid-related proteins. The least absolute shrinkage and selection operator (LASSO) is then used as a feature selection strategy to generate the optimal feature subset.
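As an illustration of how LASSO can act as a feature selector, the following is a minimal sketch using cyclic coordinate descent in plain Python. It is not the authors' implementation: the abstract does not state the solver, the penalty strength, or the feature dimensions, so the function names, the value of `lam`, and the synthetic data are all assumptions made for the example. Features whose weight is shrunk exactly to zero are discarded; the rest form the selected subset.

```python
import random


def soft_threshold(rho, lam):
    """Soft-thresholding operator used by the LASSO coordinate update."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/2n)*||y - Xw||^2 + lam*||w||_1 by cyclic coordinate descent.

    X is a list of rows (features assumed roughly standardised), y a list of targets.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # a constant-zero column carries no information
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_wj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_wj - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_wj
    return w


def select_features(w, tol=1e-8):
    """Indices of features whose LASSO weight is non-zero."""
    return [j for j, wj in enumerate(w) if abs(wj) > tol]


# Synthetic stand-in data (assumption): 6 features, only columns 0 and 3 matter.
rng = random.Random(0)
X = [[rng.gauss(0.0, 1.0) for _ in range(6)] for _ in range(200)]
y = [2.0 * row[0] - 1.5 * row[3] + rng.gauss(0.0, 0.1) for row in X]

weights = lasso_coordinate_descent(X, y, lam=0.3)
selected = select_features(weights)
print("selected feature indices:", selected)
```

In practice the penalty strength would be tuned, for example by cross-validation, and the inputs would be the WOP- and PSSM-derived descriptors rather than random numbers.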
Results: Ten-fold cross-validation on the training datasets demonstrates the accuracy of the proposed model. We also evaluate the method on independent testing datasets and apply it to identify potential cancer biomarkers in human body fluids.
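The ten-fold cross-validation protocol mentioned above can be sketched as follows. This is a hedged illustration only: the abstract does not name the classifier, so a nearest-centroid rule is used as a placeholder, and the helper names and the two-cluster synthetic data are assumptions introduced for the example.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def nearest_centroid_fit(X, y):
    """Placeholder classifier (assumption): one mean vector per class label."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def nearest_centroid_predict(centroids, x):
    """Return the label of the centroid closest to x in squared Euclidean distance."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def cross_validate(X, y, k=10, seed=0):
    """Train on k-1 folds, test on the held-out fold, and return per-fold accuracy."""
    folds = k_fold_indices(len(X), k, seed)
    accuracies = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        model = nearest_centroid_fit([X[i] for i in train], [y[i] for i in train])
        correct = sum(nearest_centroid_predict(model, X[i]) == y[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return accuracies


# Synthetic stand-in data (assumption): two well-separated 2-D clusters.
rng = random.Random(1)
X = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(50)]
X += [[rng.gauss(5.0, 1.0), rng.gauss(5.0, 1.0)] for _ in range(50)]
y = [0] * 50 + [1] * 50

fold_accuracies = cross_validate(X, y, k=10)
mean_accuracy = sum(fold_accuracies) / len(fold_accuracies)
print("mean ten-fold accuracy:", mean_accuracy)
```

Each sample is held out exactly once, so the mean of the per-fold accuracies estimates performance on unseen data drawn from the same distribution as the training set; the independent testing datasets mentioned above provide a separate check.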
Conclusion: The testing results suggest that our approach generalizes well.
Keywords: Cancer biomarkers, Body fluid, Evolutionary conservation, Physicochemical properties, LASSO, PSSM.
School of Computer and Information Technology, Xinyang Normal University, Xinyang; Information Engineering College, Huanghuai University, Zhumadian; School of Computer and Information Technology, Xinyang Normal University, Xinyang; School of Computer and Information Technology, Xinyang Normal University, Xinyang; College of Information Science and Technology, Northeast Normal University, Changchun