    SON: SON DNA and RNA binding protein [Homo sapiens (human)]

    Gene ID: 6651, updated on 10-Dec-2024

    Summary

    Official Symbol
    SON (provided by HGNC)
    Official Full Name
    SON DNA and RNA binding protein (provided by HGNC)
    Primary source
    HGNC:HGNC:11183
    See related
    Ensembl:ENSG00000159140; MIM:182465; AllianceGenome:HGNC:11183
    Gene type
    protein coding
    RefSeq status
    REVIEWED
    Organism
    Homo sapiens
    Lineage
    Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini; Catarrhini; Hominidae; Homo
    Also known as
    SON3; BASS1; DBP-5; NREBP; TOKIMS; C21orf50
    Summary
    This gene encodes a protein that contains multiple simple repeats. The encoded protein binds RNA and promotes pre-mRNA splicing, particularly of transcripts with poor splice sites. The protein also recognizes a specific DNA sequence found in the human hepatitis B virus (HBV) and represses HBV core promoter activity. There is a pseudogene for this gene on chromosome 1. Alternative splicing results in multiple transcript variants. [provided by RefSeq, Jul 2013]
    Expression
    Ubiquitous expression in bone marrow (RPKM 43.7), lymph node (RPKM 33.4) and 25 other tissues

    Genomic context

    Location:
    21q22.11
    Exon count:
    15
    Annotation release   Status              Assembly                          Chr   Location
    RS_2024_08           current             GRCh38.p14 (GCF_000001405.40)     21    NC_000021.9 (33543038..33577481)
    RS_2024_08           current             T2T-CHM13v2.0 (GCF_009914755.1)   21    NC_060945.1 (31924877..31959322)
    RS_2024_09           previous assembly   GRCh37.p13 (GCF_000001405.25)     21    NC_000021.8 (34915344..34949787)
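
    The Location column gives 1-based, inclusive start..end coordinates on each chromosome accession. As a minimal sketch (the helper names parse_location and span_length are illustrative, not part of any NCBI tool), the genomic span of SON on each assembly can be computed from those strings:

```python
import re

# Location strings copied from the Genomic context table (1-based, inclusive).
LOCATIONS = {
    "GRCh38.p14": "NC_000021.9 (33543038..33577481)",
    "T2T-CHM13v2.0": "NC_060945.1 (31924877..31959322)",
    "GRCh37.p13": "NC_000021.8 (34915344..34949787)",
}


def parse_location(text):
    """Split 'ACCESSION (start..end)' into (accession, start, end)."""
    match = re.fullmatch(r"(\S+) \((\d+)\.\.(\d+)\)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized location: {text!r}")
    return match.group(1), int(match.group(2)), int(match.group(3))


def span_length(start, end):
    """Length in bases of an inclusive, 1-based interval."""
    return end - start + 1


if __name__ == "__main__":
    for assembly, text in LOCATIONS.items():
        accession, start, end = parse_location(text)
        print(f"{assembly}\t{accession}\t{span_length(start, end)} bp")
```

    Under these assumptions the GRCh38.p14 and GRCh37.p13 intervals have the same length, while the T2T-CHM13v2.0 interval is two bases longer.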

    Chromosome 21 - NC_000021.9
    Neighboring genes:
    • DnaJ heat shock protein family (Hsp40) member C28
    • phosphoribosylglycinamide formyltransferase, phosphoribosylglycinamide synthetase, phosphoribosylaminoimidazole synthetase
    • BRD4-independent group 4 enhancer GRCh37_chr21:34903402-34904601
    • basic transcription factor 3 pseudogene 6
    • ATAC-STARR-seq lymphoblastoid silent region 13260
    • NANOG-H3K27ac-H3K4me1 hESC enhancer GRCh37_chr21:34914753-34915484
    • ATAC-STARR-seq lymphoblastoid silent region 13261
    • microRNA 6501
    • H3K27ac hESC enhancer GRCh37_chr21:34960679-34961178
    • DNA replication fork stabilization factor DONSON
    • crystallin zeta like 1
    • ATAC-STARR-seq lymphoblastoid active region 18382
    • ATAC-STARR-seq lymphoblastoid active region 18383

    Genomic regions, transcripts, and products

    Expression

    • Project title: Tissue-specific circular RNA induction during human fetal development
    • Description: 35 human fetal samples from 6 tissues (3 - 7 replicates per tissue) collected between 10 and 20 weeks gestational time were sequenced using Illumina TruSeq Stranded Total RNA
    • BioProject: PRJNA270632
    • Publication: PMID 26076956
    • Analysis date: Mon Apr 2 22:54:59 2018

    Bibliography

    GeneRIFs: Gene References Into Functions

    Phenotypes

    Associated conditions

    Description Tests
    ZTTK syndrome
    MedGen: C4310696 OMIM: 617140 GeneReviews: Not available

    Copy number response

    Description
    Copy number response
    Triplosensitivity

    No evidence available (Last evaluated 2019-04-24)

    ClinGen Genome Curation Page
    Haploinsufficiency

    Sufficient evidence for dosage pathogenicity (Last evaluated 2019-04-24)

    ClinGen Genome Curation Page, PubMed

    HIV-1 interactions

    Protein interactions

    Protein     Gene   Interaction                                                                                    Pubs
    Pr55(Gag)   gag    HIV-1 Gag interacts with SON as demonstrated by proximity dependent biotinylation proteomics   PubMed

    Interactions

    General gene information

    Markers

    Clone Names

    • FLJ21099, FLJ33914, KIAA1019

    Gene Ontology (provided by GOA)

    Function                   Evidence Code                                       Pubs
    enables DNA binding        IEA (Inferred from Electronic Annotation)
    enables RNA binding        HDA                                                 PubMed
    enables RNA binding        IBA (Inferred from Biological aspect of Ancestor)
    enables RNA binding        IDA (Inferred from Direct Assay)                    PubMed
    enables protein binding    IPI (Inferred from Physical Interaction)            PubMed

    Component                  Evidence Code                                       Pubs
    located_in nuclear speck   IDA (Inferred from Direct Assay)                    PubMed

    General protein information

    Preferred Names
    protein SON
    Names
    Bax antagonist selected in Saccharomyces 1
    NRE-binding protein
    SON DNA binding protein
    negative regulatory element-binding protein

    NCBI Reference Sequences (RefSeq)

    RefSeqs maintained independently of Annotated Genomes

    These reference sequences exist independently of genome builds.

    These reference sequences are curated independently of the genome annotation cycle, so their versions may not match the RefSeq versions in the current genome build. Identify version mismatches by comparing the version of the RefSeq in this section to the one reported in Genomic regions, transcripts, and products above.
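
    The version comparison described above can be sketched in a few lines. RefSeq identifiers have the form ACCESSION.VERSION, so a mismatch is the same accession carrying different version numbers in the two sections. The function names below are illustrative, and the "annotated" list in the example is hypothetical input, not data taken from this record:

```python
def split_accession(identifier):
    """Split 'NM_138927.4' into ('NM_138927', 4)."""
    base, sep, version = identifier.partition(".")
    if not sep or not version.isdigit():
        raise ValueError(f"not an ACCESSION.VERSION identifier: {identifier!r}")
    return base, int(version)


def version_mismatches(curated, annotated):
    """Return (accession, curated_version, annotated_version) for each
    accession that appears in both lists with different versions."""
    annotated_versions = dict(split_accession(item) for item in annotated)
    mismatches = []
    for item in curated:
        base, version = split_accession(item)
        other = annotated_versions.get(base)
        if other is not None and other != version:
            mismatches.append((base, version, other))
    return mismatches


if __name__ == "__main__":
    # Curated identifiers as listed in this section of the record.
    curated = ["NM_138927.4", "NM_032195.3", "NM_001291412.3"]
    # Hypothetical versions standing in for the genome-build annotation.
    annotated = ["NM_138927.4", "NM_032195.2", "NM_001291412.3"]
    for base, cur, ann in version_mismatches(curated, annotated):
        print(f"{base}: curated version {cur}, annotated version {ann}")
```

    Accessions that appear in only one of the two lists are ignored here; reporting them separately would be a reasonable extension.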

    Genomic

    1. NG_052981.2 RefSeqGene

      Range
      5002..39445

    mRNA and Protein(s)

    1. NM_001291411.2 → NP_001278340.2  protein SON isoform E

      Status: REVIEWED

      Description
      Transcript Variant: This variant (e) lacks multiple 3' coding exons and contains an alternate 3' exon, resulting in a distinct 3' coding region and 3' UTR, compared to variant f. It encodes isoform E which is shorter and has a distinct C-terminus, compared to isoform F.
      Source sequence(s)
      AP000303
      Consensus CDS
      CCDS74784.1
      Related
      ENSP00000371095.4, ENST00000381679.8
      Conserved Domains (4) summary
      PHA03247
      Location: 170..460
      PHA03247; large tegument protein UL36; Provisional
      PHA03379
      Location: 340..673
      PHA03379; EBNA-3A; Provisional
      PRK10811
      Location: 1289..1481
      rne; ribonuclease E; Reviewed
      NF000535
      Location: 730..903
      MSCRAMM_SdrC; MSCRAMM family adhesin SdrC
    2. NM_001291412.3 → NP_001278341.1  protein SON isoform H

      Status: REVIEWED

      Description
      Transcript Variant: This variant (h) represents the allele encoded by the GRCh38 reference genome and encodes isoform (H).
      Source sequence(s)
      AP000303, AP000304
      Consensus CDS
      CCDS77624.1
      UniProtKB/TrEMBL
      A0A994J4Y9, J3QSZ5
      Related
      ENSP00000371111.2, ENST00000381692.6
      Conserved Domains (2) summary
      pfam01585
      Location: 333..376
      G-patch; G-patch domain
      cl00054
      Location: 398..441
      DSRM; Double-stranded RNA binding motif. Binding is not sequence specific but is highly specific for double stranded RNA. Found in a variety of proteins including dsRNA dependent protein kinase PKR, RNA helicases, Drosophila staufen protein, E. coli RNase III, ...
    3. NM_001412132.1 → NP_001399061.1  protein SON isoform I

      Status: REVIEWED

      Description
      Transcript Variant: This variant (i) uses the same exon combination as variant h but represents the allele encoded by the T2T genome assembly. The encoded isoform (I) has a slightly different sequence in the C-terminal region compared to isoform H.
      Source sequence(s)
      CP068257
      UniProtKB/TrEMBL
      A0A994J4Y9, Q6ZRV7
    4. NM_001412133.1 → NP_001399062.1  protein SON isoform J

      Status: REVIEWED

      Description
      Transcript Variant: This variant (j) uses the same exon combination as variant f but represents the allele encoded by the T2T genome assembly. The encoded isoform (J) has a slightly different sequence in the C-terminal region compared to isoform F.
      Source sequence(s)
      CP068257
    5. NM_032195.3 → NP_115571.3  protein SON isoform B

      Status: REVIEWED

      Description
      Transcript Variant: This variant (b) lacks multiple 3' coding exons and contains an alternate 3' exon, resulting in a distinct 3' coding region and 3' UTR, compared to variant f. The encoded isoform (B) is shorter and has a distinct C-terminus, compared to isoform F.
      Source sequence(s)
      AP000303, AP000304
      Consensus CDS
      CCDS13631.1
      Related
      ENSP00000300278.2, ENST00000300278.8
      Conserved Domains (4) summary
      PHA03247
      Location: 170..460
      PHA03247; large tegument protein UL36; Provisional
      PHA03379
      Location: 340..673
      PHA03379; EBNA-3A; Provisional
      PRK10811
      Location: 1289..1481
      rne; ribonuclease E; Reviewed
      NF000535
      Location: 730..903
      MSCRAMM_SdrC; MSCRAMM family adhesin SdrC
    6. NM_138927.4 → NP_620305.3  protein SON isoform F

      Status: REVIEWED

      Description
      Transcript Variant: This variant (f) represents the allele encoded by the GRCh38 reference genome and encodes isoform (F).
      Source sequence(s)
      AF380184, AK307612, AP000303
      Consensus CDS
      CCDS13629.1
      UniProtKB/Swiss-Prot
      D3DSF5, D3DSF6, E7ETE8, E7EU67, E7EVW3, E9PFQ2, O14487, O95981, P18583, Q14120, Q6PKE0, Q9H7B1, Q9P070, Q9P072, Q9UKP9, Q9UPY0
      Related
      ENSP00000348984.4, ENST00000356577.10
      Conserved Domains (6) summary
      PHA03247
      Location: 170..460
      PHA03247; large tegument protein UL36; Provisional
      PHA03379
      Location: 340..673
      PHA03379; EBNA-3A; Provisional
      PRK10811
      Location: 1289..1481
      rne; ribonuclease E; Reviewed
      NF000535
      Location: 730..903
      MSCRAMM_SdrC; MSCRAMM family adhesin SdrC
      pfam01585
      Location: 2305..2349
      G-patch; G-patch domain
      cl00054
      Location: 2369..2419
      DSRM_SF; double-stranded RNA binding motif (DSRM) superfamily

    RNA

    1. NR_103797.2 RNA Sequence

      Status: REVIEWED

      Description
      Transcript Variant: This variant (c) contains an alternate internal exon and uses an alternate splice site at the 3' exon, compared to variant f. This variant is represented as non-coding because the use of the 5'-most expected translational start codon, as used in variant f, renders the transcript a candidate for nonsense-mediated mRNA decay (NMD).
      Source sequence(s)
      AP000303, AP000304
      Related
      ENST00000455528.5

    RefSeqs of Annotated Genomes: GCF_000001405.40-RS_2024_08

    The following sections contain reference sequences that belong to a specific genome build.

    Reference GRCh38.p14 Primary Assembly

    Genomic

    1. NC_000021.9 Reference GRCh38.p14 Primary Assembly

      Range
      33543038..33577481

    Alternate T2T-CHM13v2.0

    Genomic

    1. NC_060945.1 Alternate T2T-CHM13v2.0

      Range
      31924877..31959322

    Suppressed Reference Sequence(s)

    The following Reference Sequences have been suppressed.

    1. NM_003103.5: Suppressed sequence

      Description
      NM_003103.5: This RefSeq was permanently suppressed because currently there is insufficient support for the transcript and the protein.
    2. NM_138925.1: Suppressed sequence

      Description
      NM_138925.1: This RefSeq was permanently suppressed because currently there is insufficient support for the transcript and the protein.
    3. NR_103796.1: Suppressed sequence

      Description
      NR_103796.1: This RefSeq was permanently suppressed because it is now thought that this transcript variant does encode a protein.
    4. NR_103798.1: Suppressed sequence

      Description
      NR_103798.1: This RefSeq was temporarily suppressed because currently there is not sufficient data to support this transcript.