Status |
Public on Dec 16, 2013 |
Title |
SU-DHL4_ChIPSeq |
Sample type |
SRA |
|
|
Source name |
B cells
|
Organism |
Homo sapiens |
Characteristics |
disease state: diffuse large B cell lymphoma (DLBCL)
dlbcl subtype: GCB
cell line: SU-DHL4
chip antibody: STAT3 (Santa Cruz, sc-482X)
|
Treatment protocol |
Cells were suspended in their growing media at 1e6 per mL and crosslinked with formaldehyde at a final concentration of 1% for 10 minutes at room temperature, followed by quenching with glycine in PBS at a final concentration of 125 mM for 5 minutes.
|
Growth protocol |
Cell lines OCI-Ly7, OCI-Ly19, SU-DHL2, SU-DHL4, SU-DHL6, SU-DHL10, and U-2932 were cultured in suspension in RPMI 1640, supplemented with 15% heat-inactivated FBS. Cell lines OCI-Ly3 and OCI-Ly10 were cultured in suspension in IMDM, supplemented with 15% heat-inactivated FBS and 55 µM beta-mercaptoethanol.
|
Extracted molecule |
genomic DNA |
Extraction protocol |
Crosslinked cell pellets were dounce homogenized to collect nuclei. Nuclear lysate was sonicated, then immunoprecipitated. Antibody-protein-DNA complexes were collected using Protein A agarose beads. Libraries were prepared according to Illumina's instructions using non-Illumina enzymes and kits. Briefly, DNA was end-repaired using a combination of T4 DNA polymerase, E. coli DNA Pol I large fragment (Klenow polymerase) and T4 polynucleotide kinase (End-It Repair Kit, Epicenter-Illumina). The blunt ends were treated with Klenow fragment (3' to 5' exo minus) and dATP to yield a protruding 3' 'A' base for ligation of Illumina's adapters. DNA was then gel purified and size selected for 150-300 bp fragments to exclude unligated adapters, PCR amplified with Illumina primers using Phusion high-fidelity DNA polymerase for 15 cycles, and size selected again from an agarose gel for 150-300 bp fragments. The purified DNA was captured on an Illumina flow cell for cluster generation. Libraries were sequenced on the Genome Analyzer IIx following the manufacturer's protocols.
|
|
|
Library strategy |
ChIP-Seq |
Library source |
genomic |
Library selection |
ChIP |
Instrument model |
Illumina Genome Analyzer IIx |
|
|
Description |
Sample 7
|
Data processing |
Base calls were performed using Illumina ELAND.
ChIP-Seq reads were aligned to hg19 using Bowtie v0.12.5, parameters "-p 8 -S -q -v 2 -m 1 --best --strata encodeHg19Male" (or encodeHg19Female).
The SPP peak caller v1.10.1 was used with a relaxed peak calling threshold (FDR = 0.9) to obtain a large number of peaks (maximum of 300,000) that span true signal as well as noise (false identifications).
The Irreproducible Discovery Rate (IDR) framework was used to identify high-confidence, reliable regions of enrichment (peaks) in the ChIP-Seq datasets. Specifically, we followed the ENCODE uniform processing pipeline as outlined at http://anshul.kundaje.net/projects/idr. The number of peaks with IDR scores better than 0.02 (2%) was used as the cross-replicate peak rank threshold.
Overlapping or abutting high-confidence STAT3 peaks were merged into broader “binding regions” to facilitate comparison between cell lines (output file: ChIPSeq_STAT3BindingRegions.txt). The ChIP-Seq data for each cell line were then rescored to determine how many fragments mapped to each binding region. Binding regions that occurred in only one cell line were eliminated from further analysis.
Replicates were normalized using DESeq. The effective library size of each sample was estimated based on the pooled count data. First, a reference sample was defined in which the reference count of each binding region is its geometric mean over all samples. Second, for each sample, a vector was calculated as the ratios of the read counts over the reference counts for all the binding regions. Third, the median of these ratios across all the binding regions was defined as the “size factor” of each sample. Lastly, each sample was normalized by dividing its observed counts by its size factor.
The negative binomial model in DESeq was applied to assess the significance of differential STAT3 binding.
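The four normalization steps described above (reference sample, per-sample ratios, median ratio as size factor, division by the size factor) can be illustrated with a minimal sketch. This is not the DESeq implementation itself, which is an R/Bioconductor package; the function names `size_factors` and `normalize_counts` and the toy counts are hypothetical, and the handling of zero counts is an assumption of this sketch.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: dict mapping sample name -> list of raw fragment counts,
            one entry per binding region, same region order in every sample.
    """
    samples = list(counts)
    n_regions = len(counts[samples[0]])

    # Step 1: reference sample = geometric mean of each region over all samples.
    # Assumption of this sketch: regions with a zero count in any sample are
    # skipped, because their geometric mean is zero and the ratio is undefined.
    reference = []
    for i in range(n_regions):
        values = [counts[s][i] for s in samples]
        if min(values) > 0:
            log_mean = sum(math.log(v) for v in values) / len(values)
            reference.append(math.exp(log_mean))
        else:
            reference.append(None)

    # Steps 2 and 3: per-sample ratios to the reference, then their median.
    factors = {}
    for s in samples:
        ratios = [counts[s][i] / reference[i]
                  for i in range(n_regions) if reference[i] is not None]
        factors[s] = median(ratios)
    return factors


def normalize_counts(counts):
    """Step 4: divide each sample's counts by its size factor."""
    factors = size_factors(counts)
    return {s: [c / factors[s] for c in counts[s]] for s in counts}


# Toy data (hypothetical): rep2 was sequenced twice as deeply as rep1.
toy = {"rep1": [10, 20, 30], "rep2": [20, 40, 60]}
normalized = normalize_counts(toy)
```

Because the two toy replicates differ only in sequencing depth, their normalized counts come out equal, which is the behavior the size factor is meant to produce.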
The dispersion parameter in the model is estimated from the data by examining the relationship between the mean and variance of read counts across all the binding regions (BRs). To contrast two conditions, we used the parameterized negative binomial model for each gene and obtained the p-values. To correct for multiple comparisons, we adjusted the p-values with the Benjamini-Hochberg procedure, which controls the false discovery rate. We compared 24 GCB replicates versus 11 ABC replicates. Cell line OCI-Ly19 was excluded from the analysis: RNA-Seq data showed that its gene expression clustered in between the subtypes, probably due to its EBV+ status.
STAT3 binding regions (BRs) were associated with genes that they might regulate via the Genomic Regions Enrichment of Annotations Tool (GREAT) v2.0.2 using its default settings (http://great.stanford.edu). To determine whether a BR and a given gene are linked, GREAT first determines a putative regulatory domain for every gene. This consists of a basal regulatory domain (BRD) from 5 kb upstream to 1 kb downstream of the transcription start site, plus an extended domain calculated by elongating the BRD both upstream and downstream for 1000 kb or until reaching another gene’s BRD, whichever occurs first. Once this set of extended regulatory domains is established, GREAT associates the list of ChIP-Seq STAT3 binding regions with all of the genes whose regulatory domains they overlap.
Genome_build: hg19
Supplementary_files_format_and_content: *ChIPSeq_STAT3_IDRpeaks.bed: BED files were generated using the IDR pipeline (http://anshul.kundaje.net/projects/idr). Files are in the UCSC-supported ENCODE narrowPeak format (http://genome.ucsc.edu/FAQ/FAQformat.html#format12). NOTE: BED column 4 reports ranks, as the IDR analysis pipeline uses rank as an important measurement of peak significance.
Supplementary_files_format_and_content: ChIPSeq_STAT3BindingRegions.txt: File was generated by a custom script. "IDRpeaks.bed" files for all 9 cell lines were concatenated.
Any overlapping or abutting regions were merged together to form one larger "binding region" (BR). Regions that occurred in only one cell line were discarded. Headers are "chr/start/end/numCellLines", where "numCellLines" is "number of cell lines in which this peak is found". Linked as supplementary file on Series record.
Supplementary_files_format_and_content: ChIPSeq_ReadCountsByReplicate.txt: Abundance measurements. The total fragment count for 10,337 STAT3 binding regions in each of the 35 replicates, after normalization. Linked as supplementary file on Series record.
Supplementary_files_format_and_content: ChIPSeq_StatisticalResults.txt: Mean fragment count for 10,337 STAT3 binding regions in each subtype (ABC, GCB) after normalization; calculated fold change; p-value and FDR for differential gene expression between the two subtypes. Linked as supplementary file on Series record.
Supplementary_files_format_and_content: ChIPSeq_AssociatedGenes.txt: Location-based association between 10,337 STAT3 binding regions and annotated RefSeq genes (analyzed via GREAT). In many cases, STAT3 BRs fall in regions where the regulatory domains of two genes overlap; consequently, GREAT associates these BRs with both genes. For purposes of downstream comparison, we treated these multiple associations as separate table entries, to allow the greatest sensitivity in detecting associations with gene expression data. Linked as supplementary file on Series record.
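The construction of ChIPSeq_STAT3BindingRegions.txt described above (concatenate peaks from all cell lines, merge overlapping or abutting intervals, discard regions seen in only one cell line) can be sketched as follows. The submitters' custom script is not part of this record, so this is only an illustration under stated assumptions: the function name `merge_binding_regions` and the toy peaks are hypothetical, and coordinates are assumed to be half-open BED intervals, so two peaks abut when one starts exactly where the other ends.

```python
from collections import defaultdict


def merge_binding_regions(peaks_by_line):
    """Merge peaks across cell lines into binding regions.

    peaks_by_line: dict mapping cell line name -> list of (chrom, start, end).
    Returns a list of (chrom, start, end, numCellLines) tuples, keeping only
    regions supported by at least two cell lines.
    """
    # Concatenate all peaks, remembering which cell line each came from.
    by_chrom = defaultdict(list)
    for cell_line, peaks in peaks_by_line.items():
        for chrom, start, end in peaks:
            by_chrom[chrom].append((start, end, cell_line))

    regions = []
    for chrom in sorted(by_chrom):
        current = None  # [start, end, set of cell lines]
        for start, end, cell_line in sorted(by_chrom[chrom]):
            if current is not None and start <= current[1]:
                # Overlapping or abutting: extend the open region.
                current[1] = max(current[1], end)
                current[2].add(cell_line)
            else:
                if current is not None:
                    regions.append((chrom, current[0], current[1], len(current[2])))
                current = [start, end, {cell_line}]
        if current is not None:
            regions.append((chrom, current[0], current[1], len(current[2])))

    # Discard regions that occur in only one cell line.
    return [r for r in regions if r[3] >= 2]


# Toy peaks (hypothetical coordinates, not taken from the deposited files).
toy_peaks = {
    "SU-DHL4": [("chr1", 100, 200), ("chr1", 500, 600)],
    "OCI-Ly7": [("chr1", 200, 300)],
    "U-2932": [("chr2", 10, 20)],
}
binding_regions = merge_binding_regions(toy_peaks)
```

In the toy input, the two chr1 peaks that abut at position 200 merge into a single region supported by two cell lines, while the remaining peaks each come from a single cell line and are dropped.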
|
|
|
Submission date |
Sep 10, 2013 |
Last update date |
May 15, 2019 |
Contact name |
Jennifer Marion Hardee |
E-mail(s) |
jenn.hardee@gmail.com
|
Organization name |
Stanford University
|
Street address |
300 Pasteur Dr., M-344
|
City |
Stanford |
State/province |
California |
ZIP/Postal code |
94305 |
Country |
USA |
|
|
Platform ID |
GPL10999 |
Series (2) |
GSE50723 |
Whole genome mapping of STAT3 binding sites in the two major subtypes of diffuse large B cell lymphoma |
GSE50724 |
Correlation between STAT3 binding presence and gene expression levels in subtypes of diffuse large B cell lymphoma |
|
Relations |
BioSample |
SAMN02351679 |
SRA |
SRX347427 |
Supplementary file | Size | Download | File type/resource |
GSM1227210_SU-DHL4_ChIPSeq_STAT3_IDRpeaks.bed.gz | 145.9 Kb | (ftp)(http) | BED |
SRA Run Selector |
Raw data are available in SRA |
Processed data provided as supplementary file |
Processed data are available on Series record |