cff-version: 1.2.0
abstract: "
This dataset contains genome-wide CADD (Combined Annotation Dependent Depletion) scores for chicken and turkey, generated as part of research on predicting the deleteriousness of genetic variants in non-model species. The study aimed to develop and apply a generic, species-agnostic pipeline that computes CADD scores from only a high-quality reference genome, the corresponding gene annotation, and a multi-species alignment (MSA) used to infer ancestral sequences. The work was computational and involved no experimental sample collection: genomic reference assemblies, available functional annotations, and an evolutionary MSA were used as input features to train a machine learning model that assigns PHRED-like CADD scores to all possible single-nucleotide variants across the genome. The resulting data consist of per-chromosome, tab-delimited files containing CADD scores for chicken (chr{chr}.tsv.gz) and turkey (Turkey_chr{chr}.tsv.gz), which can be used for comparative genomics, evolutionary analyses, and prioritization of candidate variants in genomic and breeding studies. The work is described in the publication “A generic pipeline for CADD Score generation: chickenCADD and turkeyCADD”, accepted in G3.