TY - JOUR
T1 - DeepLoc 2.1
T2 - multi-label membrane protein type prediction using protein language models
AU - Ødum, Marius Thrane
AU - Teufel, Felix
AU - Thumuluri, Vineet
AU - Almagro Armenteros, José Juan
AU - Johansen, Alexander Rosenberg
AU - Winther, Ole
AU - Nielsen, Henrik
N1 - © The Author(s) 2024. Published by Oxford University Press on behalf of Nucleic Acids Research.
PY - 2024/7/5
Y1 - 2024/7/5
AB - DeepLoc 2.0 is a popular web server for the prediction of protein subcellular localization and sorting signals. Here, we introduce DeepLoc 2.1, which additionally classifies the input proteins into the membrane protein types Transmembrane, Peripheral, Lipid-anchored and Soluble. Leveraging pre-trained transformer-based protein language models, the server utilizes a three-stage architecture for sequence-based, multi-label predictions. Comparative evaluations with other established tools on a test set of 4933 eukaryotic protein sequences, constructed following stringent homology partitioning, demonstrate state-of-the-art performance. Notably, DeepLoc 2.1 outperforms existing models, with the larger ProtT5 model exhibiting a marginal advantage over the ESM-1B model. The web server is available at https://services.healthtech.dtu.dk/services/DeepLoc-2.1.
KW - Internet
KW - Membrane Proteins/chemistry
KW - Protein Sorting Signals
KW - Sequence Analysis, Protein
KW - Software
UR - http://www.scopus.com/inward/record.url?scp=85198056537&partnerID=8YFLogxK
U2 - 10.1093/nar/gkae237
DO - 10.1093/nar/gkae237
M3 - Journal article
C2 - 38587188
SN - 0305-1048
VL - 52
SP - W215-W220
JO - Nucleic Acids Research
JF - Nucleic Acids Research
IS - W1
ER -