Abstract
Membrane proteins are critical mediators of tumor progression and hold enormous therapeutic potential. Although gene profiling can identify their cancer-specific signatures, systematic correlations between protein functions and tumor-related mechanisms remain unclear. We present the CrMP-Sol database (https://biogateway.aigene.org.cn/g/CrMP), which aims to bridge the gap between the two. Machine learning was used to extract key functional descriptions for visualizing proteins in 3D space, where their spatial distributions provide function-based predictive connections between proteins and cancer types. CrMP-Sol also presents QTY-enabled water-soluble designs to facilitate studies of native membrane proteins despite their natural hydrophobicity. Five examples from different categories, with varying transmembrane helices, were used to demonstrate feasibility. Native and redesigned proteins exhibited highly similar characteristics, predicted structures, and binding pockets, with slightly different docking poses against known ligands, although task-specific designs are still required for proteins more susceptible to internal hydrogen bond formation. The database can accelerate therapeutic development and biotechnological applications of cancer-related membrane proteins.
Keywords: Membrane protein, Protein design, QTY code, Machine learning, Protein function, Cancer, Bioinformatics