Please use this identifier to cite or link to this item: http://hdl.handle.net/10773/37778
Title: Classifying and discovering genomic sequences in metagenomic repositories
Author: Silva, Jorge Miguel; Almeida, João Rafael; Oliveira, José Luís
Keywords: Taxonomic Classification; Organism Identification; Compression; Web Portal; Data Aggregation; Genomic Catalogue
Issue Date: 2023
Publisher: Elsevier
Abstract: The taxonomic and functional composition of microbial communities from environmental, agricultural, and therapeutic settings is increasingly being studied using metagenomic methodologies in large-scale genomic applications. This has led to exponential growth in the field and has had an impact on healthcare, pharmacology and biotechnology. However, with the current methodologies, it is sometimes difficult to obtain conclusive identification of an organism. In addition, the growth of the metagenomic field has led to the creation of large amounts of data held by different hosts, which characterize data differently and make analysis difficult. Therefore, correct data aggregation and classification improve and facilitate the discovery of repositories of interest. This paper tackles these issues by proposing a methodology for organism identification, data aggregation and content characterization, visualization and selection. We propose a three-step pipeline for organism identification that uses compression-based metrics, an aggregation mechanism for content characterization, and a web database catalogue for data exposition and visualization.
Peer review: yes
URI: http://hdl.handle.net/10773/37778
DOI: 10.1016/j.procs.2023.01.441
ISSN: 1877-0509
Appears in Collections: DETI - Artigos; IEETA - Artigos

Files in This Item:
File: Classifying-and-discovering-genomic-sequences-in-meta_2023_Procedia-Computer.pdf
Size: 540.53 kB
Format: Adobe PDF



Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.