Computer Science Faculty Research

Application of Subspace Clustering in DNA Sequence Analysis

Tim Wallace, Tennessee State University
Ali Sekmen, Tennessee State University
Xiaofei Wang, Tennessee State University

Document Type

Article

Publication Date

9-29-2015

Abstract

Identification and clustering of orthologous genes play an important role in developing evolutionary models, such as validating convergent and divergent phylogeny and predicting functional proteins in newly sequenced species with unverified nucleotide-protein mappings. Here, we introduce an application of subspace clustering to orthologous gene sequences and discuss the initial results. The working hypothesis is that genetic changes between nucleotide sequences coding for proteins among selected species and groups may lie within a union of subspaces for clusters of the orthologous groups. Estimates for the subspace dimensions were computed for a small population sample. A series of experiments was performed to cluster randomly selected sequences. The experimental design allows for both false positives and false negatives, and estimates for the statistical significance are provided. The clustering results are consistent with the main hypothesis. A simple random mutation binary tree model is used to simulate speciation events and to show how the subspace rank depends on time and mutation rate. The simple mutation model is found to be largely consistent with the singular values observed in the subspace clustering experiments. Our study indicates that the subspace clustering method may be applied in orthology analysis.
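
The random mutation binary tree idea mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the one-hot encoding of bases, the substitution-only mutation step, the function names (`mutate`, `simulate_tree`, `one_hot`, `numerical_rank`), and the rank tolerance are all assumptions made here for illustration. It simulates a binary tree of speciation events and reads an approximate subspace rank off the singular values of the leaf sequences.

```python
import numpy as np

def mutate(seq, rate, rng):
    """Copy seq (ints 0..3 standing for A, C, G, T) and substitute each site with probability `rate`."""
    seq = seq.copy()
    hits = rng.random(seq.size) < rate
    # Shift by 1..3 modulo 4 so that a hit always changes the base.
    seq[hits] = (seq[hits] + rng.integers(1, 4, size=int(hits.sum()))) % 4
    return seq

def simulate_tree(root, generations, rate, rng):
    """Every lineage splits in two each generation; return the 2**generations leaves."""
    level = [root]
    for _ in range(generations):
        level = [mutate(s, rate, rng) for s in level for _ in range(2)]
    return level

def one_hot(seq):
    """Encode a length-n sequence as a 4n-dimensional 0/1 vector (assumed encoding)."""
    return np.eye(4)[seq].ravel()

def numerical_rank(leaves, tol=1e-8):
    """Count singular values above tol times the largest one (tolerance is an assumption)."""
    X = np.stack([one_hot(s) for s in leaves], axis=1)  # columns are sequences
    s = np.linalg.svd(X, compute_uv=False)
    return int((s > tol * s[0]).sum()), s

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    root = rng.integers(0, 4, size=200)  # hypothetical 200-base ancestral sequence
    for rate in (0.0, 0.01, 0.05):
        rank, _ = numerical_rank(simulate_tree(root, 4, rate, rng))
        print(f"mutation rate {rate}: numerical rank {rank}")
```

With a mutation rate of zero every leaf equals the root, so the rank is 1. Increasing the rate or the number of generations lets the leaves diverge, and the rank can grow up to the number of leaves, which is the kind of rank versus time and mutation rate dependence the abstract refers to.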

Recommended Citation

Tim Wallace, Ali Sekmen, and Xiaofei Wang. Journal of Computational Biology. Oct 2015. 940-952. http://doi.org/10.1089/cmb.2015.0084

Included in

Computational Biology Commons
