Prediction of DNA Binding Sites Bound to Specific Transcription Factors by the SVM Algorithm
DOI:
https://doi.org/10.24996/ijs.2022.63.11.37
Keywords:
SVM, DNA sequences, Transcription Factors (TFs), Kernel Ridge Regression (KRR), Kernel Logistic Regression (KLR)
Abstract
Transcription factors (TFs) play a key role in gene regulation. During DNA transcription, genetic information is transmitted from DNA to messenger RNA. In this step, a transcription factor binds to a segment of the DNA sequence known as a transcription factor binding site (TFBS). TFs are regulatory molecules that bind to particular sequence motifs in a gene to induce or repress transcription of the targeted gene. The goal of this study is to build a model that predicts whether or not a given DNA binding site binds to a specific TF. Two classification methods are used: support vector machine (SVM) and kernel logistic regression (KLR). The KLR algorithm in turn depends on a regression algorithm, kernel ridge regression (KRR). Discovering the binding sites of a transcription factor can help determine which genes it regulates, analyze its functions, understand regulation in living organisms, recognize disease-causing variants, and, most importantly, develop pharmaceutical drugs.
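To make the relationship between KLR and KRR concrete, the following is a minimal sketch, not the method of the study itself. It assumes that KLR is fitted by iteratively reweighted least squares (IRLS), a common formulation in which every iteration reduces to solving a weighted KRR system. The k-mer spectrum kernel, the toy sequences, the regularization value `lam`, and all function names are illustrative assumptions that do not come from the abstract. An SVM could be trained on the same Gram matrix but is omitted here to keep the example dependency-free.

```python
import math
from collections import Counter


def kmer_counts(seq, k=3):
    """Count the overlapping k-mers of a DNA sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(a, b, k=3):
    """Spectrum kernel (assumed here): dot product of k-mer count vectors."""
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    return float(sum(n * cb[m] for m, n in ca.items()))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_klr(K, y, lam=1.0, n_iter=25):
    """Fit KLR by IRLS for labels y in {0, 1}; returns dual coefficients.

    Each iteration solves the weighted KRR system
    (K + lam * W^-1) alpha = z, where W and z come from the current fit.
    """
    n = len(y)
    alpha = [0.0] * n
    for _ in range(n_iter):
        f = [sum(K[i][j] * alpha[j] for j in range(n)) for i in range(n)]
        p = [sigmoid(v) for v in f]
        w = [max(pi * (1.0 - pi), 1e-6) for pi in p]      # IRLS weights
        z = [f[i] + (y[i] - p[i]) / w[i] for i in range(n)]  # working response
        A = [[K[i][j] + (lam / w[i] if i == j else 0.0) for j in range(n)]
             for i in range(n)]
        alpha = solve(A, z)
    return alpha


def predict_proba(train_seqs, alpha, query, k=3):
    """Estimated probability that `query` is bound by the TF."""
    score = sum(a * spectrum_kernel(s, query, k)
                for a, s in zip(alpha, train_seqs))
    return sigmoid(score)


if __name__ == "__main__":
    # Made-up toy data: "bound" sequences contain a TATA-like motif.
    seqs = ["GGTATAAC", "CTATAAGG", "ATATATGC",
            "GCGCGGCC", "CCGGGCGC", "GGCCGCGG"]
    labels = [1, 1, 1, 0, 0, 0]
    K = [[spectrum_kernel(a, b) for b in seqs] for a in seqs]
    alpha = fit_klr(K, labels)
    for query in ("ACTATATG", "CGCGGGCC"):
        print(query, round(predict_proba(seqs, alpha, query), 3))
```

In this sketch the model has no bias term and the toy data set is far too small to be meaningful; in practice the kernel, the value of `lam`, and the k-mer length would be chosen by cross-validation on real TFBS data.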