ESM-TFpredict

an explainable method for transcription factor prediction from protein sequences

In this study, we propose ESM-TFpredict, a model that uses a pre-trained protein language model to encode amino acid sequences and then applies 1-D convolutional neural networks to the resulting embeddings for transcription factor (TF) prediction.

The architecture of the ESM-TFpredict model
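
As a rough illustration of the pipeline described above, the sketch below runs a 1-D convolution over per-residue embeddings, pools over the sequence, and maps the result to a TF probability. It is a minimal stand-in, not the published implementation: the random vectors take the place of real protein language model embeddings, and the filter count, kernel width, ReLU activation, global max pooling, and single sigmoid output are all assumptions made for the example.

```python
import math
import random


def conv1d(embeddings, kernels, bias):
    """Valid 1-D convolution along the sequence axis, followed by ReLU.

    embeddings: list of L vectors of dimension D (one vector per residue)
    kernels:    list of F filters, each a list of K vectors of dimension D
    bias:       list of F floats
    returns:    list of (L - K + 1) vectors of dimension F
    """
    width = len(kernels[0])
    feature_map = []
    for start in range(len(embeddings) - width + 1):
        window = embeddings[start:start + width]
        row = []
        for filt, b in zip(kernels, bias):
            # Dot product between the filter and the current window of residues.
            s = b + sum(w * x
                        for w_vec, x_vec in zip(filt, window)
                        for w, x in zip(w_vec, x_vec))
            row.append(max(0.0, s))  # ReLU
        feature_map.append(row)
    return feature_map


def global_max_pool(feature_map):
    """Keep the strongest response of each filter across the sequence."""
    return [max(column) for column in zip(*feature_map)]


def predict_tf_probability(embeddings, kernels, bias, out_weights, out_bias):
    """Convolution, pooling, then a logistic output unit."""
    pooled = global_max_pool(conv1d(embeddings, kernels, bias))
    logit = out_bias + sum(w * p for w, p in zip(out_weights, pooled))
    return 1.0 / (1.0 + math.exp(-logit))


# Toy sizes (hypothetical): 12 residues, 4-dim embeddings, 3 filters of width 5.
rng = random.Random(0)
L, D, F, K = 12, 4, 3, 5
embeddings = [[rng.uniform(-1, 1) for _ in range(D)] for _ in range(L)]
kernels = [[[rng.uniform(-0.5, 0.5) for _ in range(D)] for _ in range(K)]
           for _ in range(F)]
bias = [0.0] * F
out_weights = [rng.uniform(-1, 1) for _ in range(F)]

probability = predict_tf_probability(embeddings, kernels, bias, out_weights, 0.0)
print(f"TF probability: {probability:.3f}")
```

In a real setting the embeddings would come from the pre-trained language model and the convolution would be a trained layer in a deep learning framework; the plain-Python version only makes the data flow explicit.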

To elucidate the model’s decision-making, we employ the integrated gradients technique to highlight the features that drive TF identification. The experimental results show that TF-related regions have a dominant influence on the TF prediction task.

TF prediction attribution scores for Zinc finger and BTB domain-containing protein 32 (zinc finger positions: 373-395, 401-423, and 428-450)
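
To make the attribution step concrete, here is a small sketch of integrated gradients on a toy model. Integrated gradients attributes a prediction to each input feature by integrating the model's gradient along a straight path from a baseline input to the actual input, then scaling by the difference between the two. Everything specific here is an assumption for illustration: the logistic model (used because its gradient has a closed form), the all-zeros baseline, the weights, and the midpoint Riemann sum with 200 steps. The actual model would need automatic differentiation instead of a hand-written gradient.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def model(x, weights, bias):
    """Toy differentiable model: logistic regression on the input features."""
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))


def model_gradient(x, weights, bias):
    """Analytic gradient of the toy model with respect to each input feature."""
    p = model(x, weights, bias)
    return [p * (1.0 - p) * w for w in weights]


def integrated_gradients(x, baseline, weights, bias, steps=200):
    """Approximate the path integral of the gradient with a midpoint Riemann sum."""
    total = [0.0] * len(x)
    for step in range(steps):
        alpha = (step + 0.5) / steps
        # Point on the straight line between the baseline and the input.
        point = [b + alpha * (v - b) for v, b in zip(x, baseline)]
        grad = model_gradient(point, weights, bias)
        total = [t + g for t, g in zip(total, grad)]
    # Scale the average gradient by the input-baseline difference.
    return [(v - b) * t / steps for v, b, t in zip(x, baseline, total)]


# Hypothetical weights and input; the third feature has zero weight.
weights = [2.0, -1.0, 0.0, 0.5]
bias = -0.5
x = [1.0, 0.5, 3.0, -1.0]
baseline = [0.0, 0.0, 0.0, 0.0]

attributions = integrated_gradients(x, baseline, weights, bias)
for i, a in enumerate(attributions):
    print(f"feature {i}: {a:+.4f}")

# Completeness check: attributions sum to the change in model output.
delta = model(x, weights, bias) - model(baseline, weights, bias)
print(f"sum of attributions: {sum(attributions):.4f}, output change: {delta:.4f}")
```

The completeness check is a useful sanity test in practice: if the attributions do not add up to the difference between the prediction at the input and at the baseline, the number of integration steps is probably too small. For a sequence model, per-residue scores like those in the figure would typically be obtained by aggregating the attributions over each residue's embedding dimensions, though the exact aggregation used is not specified here.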

References

2023

  1. Explainable Transcription Factor Prediction with Protein Language Models (accepted)
    Liyuan Gao, Kyler Shu, Jun Zhang, and Victor S. Sheng
    In 2023 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2023