ECCB 2016 main conference Proteins

PT27 – AUCpreD: proteome-level protein disorder prediction by AUC-maximized Deep Convolutional Neural Fields


Mississippi September 6, 2016 2:00 pm - 2:20 pm



Proceeding talk – Theme: Proteins.

Abstract

Motivation: Intrinsically disordered regions (IDRs) of proteins play an important role in many biological processes. Two key properties of IDRs are that (i) they occur proteome-wide and (ii) only about 6% of residues are disordered, which makes accurate prediction of IDRs difficult. Most IDR prediction methods use sequence profiles to improve accuracy, which prevents their application to proteome-wide prediction because sequence profiles are time-consuming to generate. On the other hand, methods that do not use sequence profiles fare much worse than those that do. Method: This paper formulates IDR prediction as a sequence labelling problem and employs a new machine learning method called Deep Convolutional Neural Fields (DeepCNF) to solve it. To deal with the highly imbalanced order/disorder ratio, instead of training DeepCNF by the widely used maximum-likelihood approach, we develop a novel approach that trains it by maximizing AUC (Area Under the ROC Curve), which is an unbiased measure for class-imbalanced data. Availability: http://raptorx2.uchicago.edu/StructurePropertyPred/predict/.
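The abstract's point about class imbalance can be illustrated with a small sketch. This is not the AUCpreD training procedure or any part of DeepCNF. It is a toy example, under the assumption of 100 residues with 6 disordered (matching the roughly 6% ratio quoted above), showing why thresholded accuracy can look good for an uninformative predictor while AUC does not. The function names `auc` and `accuracy`, the 0.5 threshold, and all scores are hypothetical and chosen only for illustration.

```python
def auc(scores, labels):
    """AUC as the fraction of (disordered, ordered) residue pairs in which
    the disordered residue gets the higher score; ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one residue of each class")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


def accuracy(scores, labels, threshold=0.5):
    """Fraction of residues labelled correctly after thresholding the scores."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Toy data (assumption): 6 disordered residues (label 1), 94 ordered (label 0).
labels = [1] * 6 + [0] * 94

# Predictor A: gives every residue the same score, so it carries no information.
trivial_scores = [0.0] * 100

# Predictor B: ranks every disordered residue above every ordered one,
# but all of its scores fall below the 0.5 threshold.
ranked_scores = [0.4] * 6 + [0.1] * 94

print(accuracy(trivial_scores, labels), auc(trivial_scores, labels))
print(accuracy(ranked_scores, labels), auc(ranked_scores, labels))
```

Both toy predictors reach the same thresholded accuracy because each labels every residue as ordered, yet their AUC values differ: the constant predictor sits at chance level while the perfectly ranking one reaches the maximum. That gap is the kind of signal an accuracy-style or likelihood-style objective can miss on imbalanced data.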


Authors

Sheng Wang, Department of Human Genetics, University of Chicago, United States
Jianzhu Ma, Toyota Technological Institute at Chicago, United States
Jinbo Xu, Toyota Technological Institute at Chicago, United States