A Multi-metric Algorithm for Hierarchical Clustering of Same-Length Protein Sequences - Artificial Intelligence Applications and Innovations (AIAI 2018)
Conference paper, 2018

Abstract

The identification of meaningful groups of proteins has long been a major area of interest in structural and functional genomics. Successful protein clustering can yield significant insight, assisting both in tracing the evolutionary history of the respective molecules and in identifying potential functions and interactions of novel sequences. Here we propose a clustering algorithm for same-length sequences that allows the construction of a subset hierarchy and facilitates the identification of the underlying patterns for any given subset. The proposed method uses sequence identity and amino-acid similarity simultaneously as direct measures. The algorithm was applied to a real-world dataset of clonotypic immunoglobulin (IG) sequences from chronic lymphocytic leukemia (CLL) patients, showing promising results.
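
As a rough illustration of the two measures named in the abstract, the sketch below computes position-wise sequence identity and a group-based amino-acid similarity for same-length sequences, then averages each over a subset. It is a minimal sketch under stated assumptions, not the authors' implementation: the residue grouping in SIMILARITY_GROUPS and the function names identity, similarity, and mean_pairwise are hypothetical choices made for this example, and the abstract does not specify how the paper defines similarity or builds the hierarchy.

```python
from itertools import combinations

# Hypothetical amino-acid similarity groups (an assumption for illustration,
# not the grouping used in the paper).
SIMILARITY_GROUPS = [
    set("AVLIMC"),  # aliphatic / hydrophobic
    set("FWYH"),    # aromatic
    set("STNQ"),    # polar, uncharged
    set("KR"),      # positively charged
    set("DE"),      # negatively charged
    set("GP"),      # glycine and proline
]
GROUP_OF = {aa: i for i, group in enumerate(SIMILARITY_GROUPS) for aa in group}


def _check_same_length(a, b):
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")


def identity(a, b):
    """Fraction of positions at which the two sequences carry the same residue."""
    _check_same_length(a, b)
    return sum(x == y for x, y in zip(a, b)) / len(a)


def similarity(a, b):
    """Fraction of positions whose residues are equal or share a similarity group."""
    _check_same_length(a, b)

    def similar(x, y):
        if x == y:
            return True
        gx, gy = GROUP_OF.get(x), GROUP_OF.get(y)
        return gx is not None and gx == gy

    return sum(similar(x, y) for x, y in zip(a, b)) / len(a)


def mean_pairwise(seqs, metric):
    """Average of `metric` over all unordered pairs in a subset of sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(metric(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    subset = ["CARDY", "CAKDY", "CAEDY"]  # toy same-length sequences
    print("mean identity:  ", mean_pairwise(subset, identity))
    print("mean similarity:", mean_pairwise(subset, similarity))
```

Scoring a subset with both measures at once shows why they are complementary: two sequences that differ at a position can still count as similar there if the residues fall in the same group, so similarity is never lower than identity for a given pair.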
Main file: 468652_1_En_18_Chapter.pdf (1.16 MB)
Origin: files produced by the author(s)

Dates and versions

hal-01821300, version 1 (22-06-2018)

Cite

Sotirios-Filippos Tsarouchis, Maria Th. Kotouza, Fotis E. Psomopoulos, Pericles A. Mitkas. A Multi-metric Algorithm for Hierarchical Clustering of Same-Length Protein Sequences. 14th IFIP International Conference on Artificial Intelligence Applications and Innovations (AIAI), May 2018, Rhodes, Greece. pp.189-199, ⟨10.1007/978-3-319-92016-0_18⟩. ⟨hal-01821300⟩