STRUCTURAL CLASSIFICATION OF PROTEINS USING IMAGE BASED MACHINE LEARNING

thesis
posted on 23.05.2021, 13:16 by Daniel Franklin
Protein classification is an important area of research that enables better grouping of proteins by function, evolutionary similarity, or structural makeup. This thesis focuses on structural classification. We use visualizations of proteins to build a machine learning class prediction model that successfully classifies proteins under the Structural Classification of Proteins (SCOP) framework. SCOP is a well-researched classification, and many existing approaches represent a protein's secondary structure as a linear chain of structures. This thesis takes a novel approach: it renders a three-dimensional visualization of the protein itself and then applies image-based machine learning to determine the protein's SCOP classification. The resulting convolutional neural network (CNN) method achieves average accuracies in the range 78-87% on the 25PDB dataset, which is better than or equal to existing methods.

History

Language

eng

Degree

Master of Science

Program

Computer Science

Granting Institution

Ryerson University

LAC Thesis Type

Thesis
