GEO Accession viewer

Sample GSM4471389

Status

Public on Jan 31, 2021

Title

RNA-seq_DMSO1h_rep1

Sample type

SRA

Source name

K562 erythroleukemia cells

Organism

Homo sapiens

Characteristics

method protocol: TT-seq, RNA-seq (Schwalb et al., 2016; Wachutka et al., 2019)
treatment condition: 37 °C, DMSO for 1 h, 500 µM 4sU (Carbosynth) labeling during the last 10 minutes.

Treatment protocol

Cells at a density of 300,000 cells/mL were treated with DMSO (1:20,000 dilution, Sigma, D2438) as solvent control or with 1 µM Pladienolide-B from a DMSO-resuspended 20 mM stock (1:20,000 dilution, Santa Cruz, sc-391691).

Growth protocol

Human K562 cells were obtained from DSMZ (DSMZ no.: ACC-10) and cultured in antibiotic-free RPMI 1640 medium (Thermo Fisher Scientific, 31870–074) supplemented with 10% heat-inactivated fetal bovine serum (Thermo Fisher Scientific, 10500–064) and 2 mM GlutaMAX (Thermo Fisher Scientific, 35050087) at 37 °C and 5% CO2. Cells were verified to be free of mycoplasma contamination using Plasmo Test Mycoplasma Detection Kit (InvivoGen, rep-pt1). K562 cells were authenticated at the DSMZ Identification Service according to standards for STR profiling (ASN-0002). Biological replicates were grown independently.

Extracted molecule

total RNA

Extraction protocol

Ovation Universal RNA-Seq System (NuGEN)

Library strategy

RNA-Seq

Library source

transcriptomic

Library selection

cDNA

Instrument model

Illumina NextSeq 500

Description

Total fragmented RNA

Data processing

TT-seq and RNA-seq: Paired-end 75 and 150 bp reads with additional 6 bp of barcodes were obtained for each group of samples. Two replicates were treated with Pladienolide B (Pla-B) and two replicates were treated with a control solvent (DMSO). Reads were aligned to the hg20/hg38 (GRCh38) genome assembly (Human Genome Reference Consortium) using STAR 2.6.0 (Dobin et al., 2013), with the following specifications: outFilterMismatchNmax 2, outFilterMultimapScoreRange 0 and alignIntronMax 500000. Bam files were filtered with Samtools (Li et al., 2009) to remove alignments with MAPQ smaller than 7 (-q 7) and only proper pairs (-f2) were selected. Read counts for different features were calculated with HTSeq (Anders et al., 2015). Further data processing was carried out using the R/Bioconductor environment. Expressed transcripts were defined as possessing more than 50 read counts per kilobase (RPK) in two summarized replicates of TT-seq solvent control (DMSO). Prior to quantification, data were normalized using added RNA spike-ins as described previously (Schwalb et al., 2016).
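The expression cutoff described above can be sketched in a few lines of Python. This is only an illustration, not the submitters' R/Bioconductor code: the function names, the transcript IDs, the counts and the lengths are all hypothetical, and the sketch assumes the input counts have already been summarized over the two DMSO replicates.

```python
def reads_per_kilobase(read_count, length_bp):
    """Read count divided by the feature length expressed in kilobases."""
    return read_count / (length_bp / 1000.0)


def expressed_transcripts(counts, lengths_bp, threshold=50.0):
    """Return IDs whose RPK is strictly greater than the threshold.

    counts: transcript ID -> read count (assumed already summarized
            over the replicates).
    lengths_bp: transcript ID -> feature length in base pairs.
    """
    return [
        tid
        for tid, count in counts.items()
        if reads_per_kilobase(count, lengths_bp[tid]) > threshold
    ]


# Hypothetical example values, not taken from the dataset.
counts = {"tx_a": 400, "tx_b": 90, "tx_c": 100}
lengths_bp = {"tx_a": 2000, "tx_b": 3000, "tx_c": 2000}
print(expressed_transcripts(counts, lengths_bp))
```

Here tx_c sits exactly at 50 RPK and is excluded, because the text defines expressed transcripts as having more than 50 RPK.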
mNET-seq: Paired-end 42 bp reads with additional 6 bp of barcodes were obtained for each of the samples. Reads were trimmed for adapter content with Cutadapt (1.18, RRID:SCR_011841) (Martin, 2011) with -O 12 -m 25 -a TGGAATTCTCGG -A GATCGTCGGACT. mNET-seq data was normalized using S. cerevisiae RNA spike-ins. To this end, a combined genome was generated using the Ensembl genome assembly for both human hg38 (GRCh38) and S. cerevisiae (R64-1-1), against which the reads were mapped using STAR (2.6.0, RRID:SCR_015899) (Dobin et al., 2013). Around 80% and 20% of reads mapped to the human and S. cerevisiae genomes, respectively. Bam files were filtered with Samtools (1.3.1, RRID:SCR_002105) (Li et al., 2009) to remove alignments with MAPQ smaller than 7 (-q 7) and only proper pairs (-f2) were selected. Read counts for different features were calculated with HTSeq (0.6.1.p1, RRID:SCR_005514) (Anders et al., 2015). Further data processing was carried out using the R/Bioconductor environment. Antisense bias ratio was determined using positions in regions without antisense annotation with a coverage of at least 100 according to the defined major isoforms. Data was normalized using added S. cerevisiae RNA spike-ins. mNET-seq coverage was normalized with a median of ratios method (Love et al., 2014) using the antisense corrected counts for S. cerevisiae transcripts with an RPK of 100 or higher in two summarized replicates of mNET-seq solvent control (DMSO).
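A minimal sketch of a median-of-ratios size factor calculation, in the spirit of Love et al. (2014), may help make the spike-in normalization concrete. It is a plain-Python illustration rather than the DESeq2 implementation or the submitters' code. The sample names and counts are hypothetical, and the input is assumed to hold antisense-corrected counts for the already selected S. cerevisiae spike-in transcripts.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: sample name -> list of counts, same transcript order in
            every sample. Transcripts with a zero count in any sample
            are skipped, since their geometric mean is zero.
    """
    samples = list(counts)
    n_features = len(counts[samples[0]])

    # Log of the geometric mean of each transcript across samples.
    log_geo_means = []
    for i in range(n_features):
        values = [counts[s][i] for s in samples]
        if all(v > 0 for v in values):
            log_geo_means.append(sum(math.log(v) for v in values) / len(values))
        else:
            log_geo_means.append(None)

    # Size factor = median ratio of a sample's counts to the geometric means.
    factors = {}
    for s in samples:
        log_ratios = [
            math.log(counts[s][i]) - g
            for i, g in enumerate(log_geo_means)
            if g is not None
        ]
        factors[s] = math.exp(median(log_ratios))
    return factors


# Hypothetical spike-in counts: sample_b was sequenced twice as deeply.
spike_counts = {"sample_a": [100, 200, 400], "sample_b": [200, 400, 800]}
factors = size_factors(spike_counts)
```

Dividing each sample's coverage by its size factor would then put the samples on a common scale. In this example the factor for sample_b is twice that of sample_a.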
ChIP-seq: Paired-end 42 or 75 bp reads with additional 6 bp of barcodes were obtained for each of the samples. Reads were aligned using Bowtie 2 (2.3.5, RRID:SCR_005476) (Langmead and Salzberg, 2012) to both human hg38 (GRCh38) and D. melanogaster (BDGP6.28). Bam files were filtered with Samtools (1.3.1, RRID:SCR_002105) (Li et al., 2009) to remove alignments with MAPQ smaller than 7 (-q 7) and only proper pairs (-f2) were selected. Further data processing was carried out using the R/Bioconductor environment. ChIP-seq coverages were obtained from piled-up counts for every genomic position, using physical coverage, that is, counting both sequenced bases covered by reads and unsequenced bases spanned between proper mate-pair reads. Data was normalized using added D. melanogaster RNA spike-ins. Normalization factors were obtained by dividing the total D. melanogaster read counts for each sample by the total read counts of the sample with the lowest read counts. ChIP-seq coverages were divided by the respective normalization factors.
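The spike-in scaling described above amounts to two divisions. The following Python sketch illustrates the arithmetic only. The function names, sample names and read totals are hypothetical, and the coverage is represented as a simple list of per-position values.

```python
def spikein_factors(spike_totals):
    """Divide each sample's total spike-in read count by the lowest total."""
    lowest = min(spike_totals.values())
    return {sample: total / lowest for sample, total in spike_totals.items()}


def normalize_coverage(coverage, factor):
    """Divide per-position coverage by the sample's normalization factor."""
    return [value / factor for value in coverage]


# Hypothetical total D. melanogaster read counts per sample.
spike_totals = {"chip_dmso": 2_000_000, "chip_plab": 1_000_000}
factors = spikein_factors(spike_totals)

# Hypothetical coverage vector for the DMSO sample.
normalized = normalize_coverage([10, 40, 0], factors["chip_dmso"])
```

With these numbers the sample with the fewest spike-in reads gets a factor of 1.0 and is left unchanged, while the other sample's coverage is halved.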
Major isoform annotation: Salmon (0.13.1, RRID:SCR_017036) (Patro et al., 2017) was used to select the major isoforms present in our dataset. RNA-seq samples for 1 h DMSO or 1 µM Pla-B treatments were mapped against curated RefSeq annotated protein-coding isoforms (UCSC RefSeq GRCh38, downloaded in April 2019). For each gene, the major isoform was determined as the one with maximum mean Transcripts Per Million (TPM) value across all RNA-seq samples. Major isoforms were excluded from further analysis if they represented less than 70% of the gene isoforms based on the calculated mean TPM value. Additionally, major isoforms associated with overlapping genes as well as isoforms located on chromosomes X, Y and M were discarded from further analysis. The final major isoform annotation includes 6,694 isoforms containing 65,976 exons and 59,282 introns. A total of 5,535 major transcript isoforms of protein-coding genes with RPK >= 50 of TT-seq solvent control (DMSO) were included in the analysis.
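The major isoform selection can be sketched as follows. This is an illustration under stated assumptions, not the submitters' code: the 70% criterion is read here as the major isoform's share of the gene's summed mean TPM, and the overlapping-gene and chromosome filters are left out. All isoform names, gene names and TPM values are hypothetical.

```python
from statistics import mean


def select_major_isoforms(tpm, gene_of, min_fraction=0.70):
    """Pick one major isoform per gene.

    tpm: isoform ID -> list of TPM values, one per RNA-seq sample.
    gene_of: isoform ID -> gene ID.
    A gene is dropped when its top isoform carries less than
    min_fraction of the gene's summed mean TPM (assumed interpretation).
    """
    by_gene = {}
    for isoform, values in tpm.items():
        by_gene.setdefault(gene_of[isoform], []).append((mean(values), isoform))

    major = {}
    for gene, isoforms in by_gene.items():
        total = sum(m for m, _ in isoforms)
        best_mean, best_isoform = max(isoforms)
        if total > 0 and best_mean / total >= min_fraction:
            major[gene] = best_isoform
    return major


# Hypothetical TPM values across two samples.
tpm = {
    "iso_a": [90, 70],  # mean 80, 80% of gene_1
    "iso_b": [10, 30],  # mean 20
    "iso_c": [50, 50],  # mean 50, about 56% of gene_2
    "iso_d": [50, 30],  # mean 40
}
gene_of = {"iso_a": "gene_1", "iso_b": "gene_1", "iso_c": "gene_2", "iso_d": "gene_2"}
print(select_major_isoforms(tpm, gene_of))
```

In this example gene_1 keeps iso_a, while gene_2 is excluded because no single isoform dominates.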
Intronless genes annotation: Intronless genes were defined as RefSeq annotated genes comprising one single isoform with one single exon. To avoid effects from neighboring intron-containing genes, only intronless genes at least 1 kb distant from the neighboring intron-containing annotated transcripts (strand independent) were included in the analysis. Moreover, because a long 3′ UTR has been recently reported to be alternatively spliced in an intronless gene (François et al., 2018), only intronless genes with UTRs <= 100 bp were included. A total of 51 expressed protein-coding intronless genes were included in the analysis.
Genome_build: hg20/hg38 (GRCh38) genome assembly (Human Genome Reference Consortium http://hgdownload.soe.ucsc.edu/downloads.html#human)
Supplementary_files_format_and_content: bigwig coverage files

Submission date

Apr 09, 2020

Last update date

Jan 31, 2021

Contact name

Sara Patricia Monteiro Martins

E-mail(s)

sara.martins@mpibpc.mpg.de

Organization name

MPI for biophysical chemistry

Department

Molecular Biology

Lab

Cramer

Street address

Fassberg 11

City

Göttingen

ZIP/Postal code

37077

Country

Germany

Platform ID

GPL18573

Series (1)

GSE148433

Efficient RNA polymerase II pause release requires U2 snRNP function

Relations

BioSample

SAMN14570400

SRA

SRX8092784

Supplementary file	Size	Download	File type/resource
GSM4471389_RNA-seq_DMSO1h_rep1_minus.bw	81.3 Mb	(ftp)(http)	BW
GSM4471389_RNA-seq_DMSO1h_rep1_plus.bw	85.2 Mb	(ftp)(http)	BW
Raw data are available in SRA
Processed data provided as supplementary file
Processed data are available on Series record