%0 Generic
%A Urhan, Aysun
%A Cosma, Bianca-Maria
%A Earl, Ashlee M.
%A Manson, Abigail L.
%A Abeel, Thomas
%D 2024
%T SAFPredDB: Bacterial synteny database
%U 
%R 10.4121/ac84802e-853f-46f1-9786-b9d29c0f7557.v2
%K bioinformatics
%K microbial genomics
%K genomics
%K protein language model
%K bacterial genomics
%K comparative genomics
%K protein embeddings
%K sequence analysis
%K bacterial synteny
%X <p>SAFPredDB is a bacterial synteny database built for the gene function prediction tool SAFPred (Synteny Aware Function Predictor). The database is a collection of conserved synteny and operons found across the bacterial kingdom. First, we formulated a synteny model based on experimentally known operons and the genomic features common in bacteria. We then designed a bottom-up, purely computational approach to build the database from this synteny model, using complete bacterial genome assemblies from the Genome Taxonomy Database (GTDB).</p><p><br></p><p>Although we initially built SAFPredDB only for our prediction tool, it can be used for other purposes where such a catalog is needed. As a standalone database, it can be queried to mine information about conserved genomic patterns in bacteria. In addition, it can be updated as newer assemblies are added to GTDB.</p>
%I 4TU.ResearchData