Series GSE76252
Status Public on Apr 27, 2016
Title A robust (re-)annotation approach to generate unbiased mapping references for RNA-seq-based analyses of differential expression across closely related species
Organisms Drosophila mauritiana; Drosophila melanogaster; Drosophila simulans
Experiment type Expression profiling by high throughput sequencing
Summary Background: RNA-seq based on short reads generated by next-generation sequencing technologies has become the main approach to study differential gene expression. Until now, the main applications of this technique have been to study the variation of gene expression in a whole organism, tissue or cell type under different conditions or at different developmental stages. However, RNA-seq also has great potential to be used in evolutionary studies to investigate gene expression divergence in closely related species. Since the more reliable statistical methods for differential gene expression inference are based on the use of raw read count data, the reference genomes of the species to be compared need to be highly comparable.
Results: We show that the published genomes and annotations of the three closely related Drosophila species, D. melanogaster, D. simulans and D. mauritiana, have limitations for inter-specific gene expression studies. This is due to missing gene models in at least one of the genome annotations, unclear orthology assignments and significant length differences of gene models between the species. We propose that published reference genomes should be re-annotated before using them as references for RNA-seq experiments, to include as many genes as possible and to account for a potential length bias. To this end, we present a straightforward reciprocal re-annotation pipeline that allows reliable comparison of expression for nearly all genes annotated in D. melanogaster. We carried out an RNA-seq experiment in combination with quantitative real-time PCR to confirm that the newly generated gene sets do not result in the high number of false positives observed with references that still show a clear length difference of gene models between species.
Conclusions: We conclude that our reciprocal re-annotation of previously published genomes facilitates the analysis of significantly more genes in an inter-specific differential gene expression study. We propose that the established pipeline can easily be applied to re-annotate other genomes of closely related animals and plants to improve comparative expression analyses.
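The length bias mentioned in the summary can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the gene, the counts and the lengths below are invented toy values, and the helper names `reads_per_kilobase` and `log2_fold_change` are hypothetical. It only shows why a gene model that is annotated longer in one species can look differentially expressed when raw counts are compared directly.

```python
import math


def reads_per_kilobase(count, length_bp):
    """Scale a raw read count by gene-model length (reads per kilobase)."""
    return count / (length_bp / 1000.0)


def log2_fold_change(a, b):
    """Return the log2 ratio of two positive expression values."""
    return math.log2(a / b)


# Toy gene (assumed values, not taken from this series): the same transcript
# abundance in both species, but the gene model is annotated as 2000 bp in
# species A and only 1000 bp in species B, so A collects twice as many reads.
count_a, length_a = 200, 2000
count_b, length_b = 100, 1000

# Comparing raw counts suggests a two-fold difference that is purely an
# annotation artifact.
raw_lfc = log2_fold_change(count_a, count_b)

# After accounting for gene-model length the apparent difference disappears.
norm_lfc = log2_fold_change(
    reads_per_kilobase(count_a, length_a),
    reads_per_kilobase(count_b, length_b),
)

print(f"raw log2 fold change:        {raw_lfc:.2f}")
print(f"length-scaled log2 fold change: {norm_lfc:.2f}")
```

Because count-based statistical methods work on raw counts rather than on length-scaled values, making the gene models themselves comparable in length, as the re-annotation described above aims to do, addresses the problem at its source.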
 
Overall design mRNA profiles of larval eye-antennal imaginal discs (late L3) of three Drosophila species (D. melanogaster OreR, D. simulans YVF and D. mauritiana TAM16) were generated by deep sequencing on an Illumina HiSeq 2000. For D. melanogaster, 3 biological replicates were generated and sequenced as 50 bp single-end reads. For D. simulans, 3 biological replicates were generated and sequenced as 100 bp paired-end reads. For D. mauritiana, 6 biological replicates were generated: 3 were sequenced as 50 bp single-end reads and the other 3 as 100 bp paired-end reads.
 
Contributor(s) Torres-Oliva M, Almudi I, McGregor AP, Posnien N
Citation(s) 27220689
Submission date Dec 22, 2015
Last update date May 15, 2019
Contact name Nico Posnien
E-mail(s) nico.posnien@googlemail.com
Organization name University of Göttingen
Department Department of Developmental Biology
Street address Justus-von-Liebig-Weg 11
City Göttingen
ZIP/Postal code 37077
Country Germany
 
Platforms (3)
GPL13304 Illumina HiSeq 2000 (Drosophila melanogaster)
GPL13306 Illumina HiSeq 2000 (Drosophila simulans)
GPL21269 Illumina HiSeq 2000 (Drosophila mauritiana)
Samples (12)
GSM1977846 Dmel_OreR_repA
GSM1977847 Dmel_OreR_repB
GSM1977848 Dmel_OreR_repC
Relations
BioProject PRJNA306729
SRA SRP067685

Download family Format
SOFT formatted family file(s) SOFT
MINiML formatted family file(s) MINiML
Series Matrix File(s) TXT

Supplementary file Size Download File type/resource
GSE76252_Counts_Length_Direct_DmauDmel.txt.gz 212.3 Kb (ftp)(http) TXT
GSE76252_Counts_Length_Direct_DmauDsim.txt.gz 238.2 Kb (ftp)(http) TXT
GSE76252_Counts_Length_Published_DmauDmel.txt.gz 183.5 Kb (ftp)(http) TXT
GSE76252_Counts_Length_Published_DmauDsim.txt.gz 181.0 Kb (ftp)(http) TXT
GSE76252_Counts_Length_Reciprocal_DmauDmel.txt.gz 210.6 Kb (ftp)(http) TXT
GSE76252_Counts_Length_Reciprocal_DmauDsim.txt.gz 237.4 Kb (ftp)(http) TXT
GSE76252_DmelCDS_with_ReanDsim_GeneSet_1to1orth.fa.gz 5.9 Mb (ftp)(http) FA
GSE76252_DmelCDS_with_ReanDsim_exoutput_1to1orth.gff.gz 519.2 Kb (ftp)(http) GFF
GSE76252_PubDmau_with_ReanDsim_GeneSet_1to1orth.fa.gz 5.9 Mb (ftp)(http) FA
GSE76252_PubDmau_with_ReanDsim_exoutput_1to1orth.gff.gz 1.0 Mb (ftp)(http) GFF
GSE76252_PubDsim_with_DmelCDS_GeneSet.fa.gz 6.4 Mb (ftp)(http) FA
GSE76252_Readme.txt.gz 513 b (ftp)(http) TXT
GSE76252_ReanDsim_with_ReanDmau_GeneSet_1to1orth.fa.gz 5.9 Mb (ftp)(http) FA
GSE76252_ReanDsim_with_ReanDmau_exoutput_1to1orth.gff.gz 497.6 Kb (ftp)(http) GFF
GSE76252_dmel-allgenes-CDS-r5.55-longest_norepeats_FBgn.fa.gz 6.2 Mb (ftp)(http) FA
Processed data are available on Series record
Raw data are available in SRA
