Evaluating Semantic Similarity and Centrality on Gene Annotation


  • Aishwarya AV, Anooja Ali, Vishwanath R Hulipalled, Akshita Srikanth, Aishwarya Gajanana Naik


Gene Ontology (GO) is a controlled vocabulary used in bioinformatics to describe the functions of genes and proteins. This dynamic vocabulary describes function along three aspects: cellular component, biological process, and molecular function. Several methods exist for evaluating the semantic similarity between GO annotations, each based on a different approach. In this paper we use a jackknife methodology to evaluate five popular similarity measures. A protein-protein interaction (PPI) network is constructed from these similarity values, leading to the formation of clusters of identical or similar protein complexes. Various methods are available in the literature for detecting essential proteins, which are the hub nodes of the network. To form clusters in this network, we apply several centrality measures to identify the most influential nodes. The resulting clusters make it easier to identify the category of protein complex to which a protein belongs. Disease pathways are fragmented and embedded within the PPI network. Research on discovering disease pathways over a set of predefined gene annotations can therefore support further advances in disease gene discovery.
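
The idea of building a network from similarity values and then ranking nodes by centrality can be illustrated with a minimal sketch. It is not the method used in the paper: the protein names, the similarity scores, and the 0.6 threshold are hypothetical, and degree centrality stands in for whichever centrality measures are actually applied.

```python
def build_network(similarity, threshold):
    """Link two proteins when their similarity score meets the threshold."""
    adjacency = {}
    for (a, b), score in similarity.items():
        # Register both proteins so isolated nodes still appear in the network.
        adjacency.setdefault(a, set())
        adjacency.setdefault(b, set())
        if score >= threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def degree_centrality(adjacency):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(adjacency)
    if n < 2:
        return {node: 0.0 for node in adjacency}
    return {node: len(neighbours) / (n - 1)
            for node, neighbours in adjacency.items()}


# Hypothetical pairwise semantic similarity scores (illustrative values only).
similarity = {
    ("P1", "P2"): 0.9,
    ("P1", "P3"): 0.8,
    ("P1", "P4"): 0.7,
    ("P2", "P3"): 0.3,
    ("P2", "P4"): 0.2,
    ("P3", "P4"): 0.75,
}

network = build_network(similarity, threshold=0.6)  # threshold is an assumption
centrality = degree_centrality(network)
hub = max(centrality, key=centrality.get)  # candidate hub (most connected node)
```

In this toy network, `P1` is linked to every other protein and is therefore selected as the hub. Other centrality measures, such as betweenness or closeness, could be substituted for `degree_centrality` without changing the rest of the sketch.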