Title of the dataset: Data underlying the publication: Cytochrome P450 CYP712G1 catalyzes the double oxidation of orobanchol en route to the rhizosphere signaling strigolactone, solanacol

Creators: Robin van Velzen

Related publication: Cytochrome P450 CYP712G1 catalyzes the double oxidation of orobanchol en route to the rhizosphere signaling strigolactone, solanacol

Description: De novo transcriptome assemblies from root RNA-seq data of Nicotiana tabacum, Astragalus sinicus, and Cosmos bipinnatus, together with a protein sequence alignment of the cytochrome P450 CYP712 gene family and the associated phylogenetic gene tree.

Keywords:
Root transcriptome assembly
Cosmos bipinnatus
Astragalus sinicus
Nicotiana tabacum
Strigolactones
Solanacol
Medicaol
Gene family analysis
Phylogenetic analysis
Protein sequence alignment
Cytochrome P450 CYP712
Evolution

This dataset contains the following files:
Astragalus_sinicus_SRR13286078.Trinity.fasta
Cosmos_bipinnatus_SRR354676.Trinity.fasta
Nicotiana_tabacum_SRR7540357.Trinity.fasta
CYP712_alignment.fasta
CYP712_gene_tree.newick

Explanation of variables:
Trinity.fasta files contain transcriptome assemblies in FASTA format; the SRR codes in the file names refer to the underlying read data, which are publicly available in the NCBI Sequence Read Archive (SRA).
CYP712 files contain the sequence alignment (in FASTA format) and the phylogenetic gene tree (in Newick format); sequence names include GenBank accession numbers.

Methods, materials and software:
Root transcriptomes were assembled de novo using the Trinity pipeline, version 2.12. Multiple sequence alignments were generated with MAFFT v7.450 using automatic algorithm selection, a gap open penalty of 1.26, an offset value of 0.123, and the BLOSUM62 scoring matrix. The optimal model of sequence evolution, as determined using ModelTest-NG v0.1.7, was CPREV+I+G.
The gene tree was reconstructed in a Bayesian framework using MrBayes v3.2.6, as implemented in Geneious Prime, with a chain length of 2.2 million generations, sampling every 1000th generation, and 4 heated chains with a temperature of 0.2. The first 200,000 generations were discarded as burn-in.

This dataset is published under the CC BY license. This license allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator.
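
Example of reading the FASTA files: the sketch below is not part of the original workflow. It is a minimal, standard-library-only illustration of how the FASTA files in this dataset could be loaded and how an alignment such as CYP712_alignment.fasta could be checked for equal row lengths. The function names read_fasta and alignment_length, and the toy sequences, are assumptions made for this illustration.

```python
from io import StringIO
from typing import Dict, Iterable


def read_fasta(handle: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA text into {header: sequence}; wrapped lines are joined."""
    records: Dict[str, str] = {}
    name = None
    for raw in handle:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            name = line[1:]
            if name in records:
                raise ValueError(f"duplicate header: {name}")
            records[name] = ""
        elif name is None:
            raise ValueError("sequence data before first header")
        else:
            records[name] += line
    return records


def alignment_length(records: Dict[str, str]) -> int:
    """Return the shared column count, or raise if rows differ in length."""
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) != 1:
        raise ValueError(f"unequal row lengths: {sorted(lengths)}")
    return lengths.pop()


# Toy two-sequence alignment (made-up names and residues, not from the dataset).
demo = StringIO(">seqA\nMK-LV\nAT\n>seqB\nMKTLV\nA-\n")
demo_records = read_fasta(demo)
print(len(demo_records), alignment_length(demo_records))
```

To apply the same check to the deposited alignment, open CYP712_alignment.fasta with open() and pass the file handle to read_fasta. The Trinity.fasta assemblies can be read the same way, but their sequences are unaligned, so alignment_length is expected to raise an error for them.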