%0 Generic
%A Urhan, Aysun
%A Cosma, Bianca-Maria
%A Earl, Ashlee M.
%A Manson, Abigail L.
%A Abeel, Thomas
%D 2024
%T SAFPredDB: Bacterial synteny database
%U
%R 10.4121/ac84802e-853f-46f1-9786-b9d29c0f7557.v2
%K bioinformatics
%K microbial genomics
%K genomics
%K protein language model
%K bacterial genomics
%K comparative genomics
%K protein embeddings
%K sequence analysis
%K bacterial synteny
%X <p>SAFPredDB is a bacterial synteny database built for the gene function prediction tool SAFPred (Synteny Aware Function Predictor). The database is a collection of conserved synteny and operons found across the bacterial kingdom. First, we formulated a synteny model based on experimentally known operons and genomic features common in bacteria. We then designed a bottom-up, purely computational approach to build the database from this synteny model, using complete bacterial genome assemblies from the Genome Taxonomy Database (GTDB).</p><p>Although we initially built SAFPredDB only for our prediction tool, it can be used for other purposes where such a catalog is needed. As a standalone database, it can be queried to mine information about conserved genomic patterns in bacteria. In addition, it can be updated as newer assemblies are added to GTDB.</p>
%I 4TU.ResearchData