Download Sequence and Track Data
Download FASTA and GenBank flat file
You can download sequence and other data from the graphical viewer by accessing the Download menu on the toolbar.
You can download the FASTA-formatted sequence of the visible range, of all markers created on the sequence, or of all selections made on the sequence. To make a highlighted selection, click and drag; to make multiple selections, hold down the Ctrl key while selecting.
You can also download annotation and sequence data from the sequence track in GenBank flat file format.
Download Track Data
The Download Track Data dialog allows you to download portions of track data in tabular formats for further analysis.
Currently, you can download data from gene annotation and selected feature tracks, SNP annotation tracks (NCBI data), remotely loaded variation tracks such as European Variation Archive (EVA) tracks, and user-provided VCF tracks. Only tracks added to the graphical view are shown in the download dialog. Every track with data available for download has a download icon at the track's right corner in the graphical view. Some SNP annotation data cannot be downloaded from the graphical viewer, including legacy SNP tracks older than NCBI SNP build 151.
The "Download Track Data" dialog in the graphical viewer does not permit downloads containing more than 30 million features or SNPs. Complete NCBI SNP release VCF files can be found on the SNP FTP site.
Please refer to the NCBI genomes FTP site or the NCBI Datasets page to obtain NCBI full genome annotation data.
Please contact us if you would like to download a track or track format that we do not currently offer in this dialog.
Data Formats
GFF3
Please refer to the GFF3 file format description. This format is currently only available for gene annotation tracks.
When the "Include RNA and CDS features" box is checked, RNAs, CDS, exons, and other features (if any) annotated on the gene track will be included in the downloaded file. When this box is unchecked, only the gene feature rows will be included in the file.
If the requested range starts or stops in the middle of a feature, the reported start or stop coordinate will match the requested coordinate. The range will not be extended to encompass the full range of the feature. Rows for truncated features will contain the attribute "partial=true" in column 9.
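As an illustration of the truncation behavior described above, the following is a minimal Python sketch that scans a downloaded GFF3 file for rows carrying "partial=true" in column 9. It assumes the standard nine tab-separated GFF3 columns; the function names and the sample rows (sequence ID "SEQ1", gene "EXAMPLE1") are hypothetical and are not taken from the viewer.

```python
from urllib.parse import unquote


def parse_gff3_attributes(column9):
    """Split a GFF3 column-9 string ("key=value;key=value") into a dict."""
    attributes = {}
    for field in column9.strip().split(";"):
        if not field:
            continue
        key, _, value = field.partition("=")
        # GFF3 percent-encodes reserved characters in attribute values.
        attributes[unquote(key)] = unquote(value)
    return attributes


def partial_features(gff3_lines):
    """Yield (type, start, end, attributes) for rows flagged partial=true."""
    for line in gff3_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        columns = line.rstrip("\n").split("\t")
        if len(columns) != 9:
            continue  # not a well-formed feature row
        attributes = parse_gff3_attributes(columns[8])
        if attributes.get("partial") == "true":
            yield columns[2], int(columns[3]), int(columns[4]), attributes


# Hypothetical sample rows, for illustration only.
sample = [
    "##gff-version 3",
    "SEQ1\texample\tgene\t1000\t2000\t.\t+\t.\tID=gene1;Name=EXAMPLE1;partial=true",
    "SEQ1\texample\tgene\t1200\t1800\t.\t-\t.\tID=gene2;Name=EXAMPLE2",
]

for feature_type, start, end, attrs in partial_features(sample):
    print(feature_type, start, end, attrs.get("Name"))
```

For a real download, the same function can be given an open file handle instead of the sample list, since it only iterates over lines.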
CSV
The CSV format is currently only available for gene annotation tracks. The CSV (comma-separated value) table includes columns reporting the gene feature start and stop coordinates, symbol, strand/orientation, NCBI Gene database ID, and official gene name. The start and stop coordinates in the table correspond to the full range of the gene feature and may extend past the requested range coordinates. Gene names may not be available for some gene features.
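Because the CSV coordinates may extend past the requested range, it can be useful to flag such genes after download. The sketch below shows one way to do this in Python. The header names "start", "stop", and "symbol" are assumptions made for illustration; check the header row of your downloaded file and pass the actual column names.

```python
import csv
import io


def genes_extending_past(csv_text, range_start, range_stop,
                         start_col="start", stop_col="stop"):
    """Return CSV rows whose gene range extends outside the requested range.

    start_col and stop_col are assumed header names, not confirmed ones.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    flagged = []
    for row in reader:
        start, stop = int(row[start_col]), int(row[stop_col])
        if start < range_start or stop > range_stop:
            flagged.append(row)
    return flagged


# Hypothetical table contents, for illustration only.
example_csv = "start,stop,symbol\n900,1500,EXAMPLE1\n1200,1400,EXAMPLE2\n"

for row in genes_extending_past(example_csv, 1000, 2000):
    print(row["symbol"], row["start"], row["stop"])
```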
BED
The BED file reports six columns (accession, start, stop, gene or feature name, score, strand).
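The six-column layout can be read with a few lines of Python. This is a minimal sketch that assumes tab-separated fields in the order listed above. The BED specification defines start as 0-based and stop as exclusive, so the sketch assumes the downloaded file follows that convention when computing lengths. The record type and the sample lines are hypothetical.

```python
from typing import NamedTuple


class BedRecord(NamedTuple):
    accession: str
    start: int   # 0-based, per the BED specification
    stop: int    # exclusive, per the BED specification
    name: str
    score: str
    strand: str

    @property
    def length(self):
        """Feature length under the half-open BED convention."""
        return self.stop - self.start


def parse_bed6(lines):
    """Parse six-column BED lines, skipping comments and header lines."""
    records = []
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # not a six-column row
        records.append(BedRecord(fields[0], int(fields[1]), int(fields[2]),
                                 fields[3], fields[4], fields[5]))
    return records


# Hypothetical sample lines, for illustration only.
bed_sample = [
    "SEQ1\t999\t2000\tEXAMPLE1\t0\t+",
    "SEQ1\t1199\t1800\tEXAMPLE2\t0\t-",
]

for record in parse_bed6(bed_sample):
    print(record.name, record.strand, record.length)
```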
Currently, the "Include RNA and CDS features" option is not supported for the CSV and BED file format options. Therefore, these file formats cannot be generated for tracks that only include RNA and CDS features, e.g. CCDS Features tracks.
VCF
NCBI SNP tracks and uploaded or streamed variation (VCF) tracks can be downloaded in VCF format. To obtain VCF files of whole genome NCBI SNP annotation, please go to the NCBI SNP FTP site at ftp://ftp.ncbi.nlm.nih.gov/snp/.
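A downloaded VCF file can be inspected without special tooling. The sketch below reads the first five fixed columns defined by the VCF specification (CHROM, POS, ID, REF, ALT) and skips header lines that begin with "#". The function name and the sample records are hypothetical; for larger files a dedicated VCF library is likely a better choice.

```python
def read_vcf_records(lines):
    """Yield one dict per VCF data line, using the first five fixed columns."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, meta-information, and the header row
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # not a well-formed data line
        chrom, pos, variant_id, ref, alt = fields[:5]
        yield {
            "chrom": chrom,
            "pos": int(pos),        # VCF positions are 1-based
            "id": variant_id,
            "ref": ref,
            "alt": alt.split(","),  # multiple ALT alleles are comma-separated
        }


# Hypothetical sample records, for illustration only.
vcf_sample = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "SEQ1\t1050\tvar1\tA\tG\t.\t.\t.",
    "SEQ1\t1300\tvar2\tC\tT,G\t.\t.\t.",
]

for record in read_vcf_records(vcf_sample):
    print(record["chrom"], record["pos"], record["ref"], record["alt"])
```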
Please refer to this page for more information on downloading image data as PDF or SVG files.