Conserved domains on  [gi|1907114401|ref|XP_036015519|]

AT-rich interactive domain-containing protein 2 isoform X1 [Mus musculus]

List of domain hits

Name Accession Description Interval E-value
RFX_DNA_binding pfam02257
RFX DNA-binding domain; RFX is a regulatory factor which binds to the X box of MHC class II ...
372-454 3.71e-30

RFX DNA-binding domain; RFX is a regulatory factor which binds to the X box of MHC class II genes and is essential for their expression. The DNA-binding domain of RFX is the central domain of the protein and binds ssDNA as either a monomer or homodimer. It recognizes X-boxes (DNA of the sequence 5'-GTNRCC(0-3N)RGYAAC-3', where N is any nucleotide, R is a purine and Y is a pyrimidine) using a highly conserved 76-residue DNA-binding domain (DBD).
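The X-box consensus quoted above can be turned into a regular expression for scanning a DNA string. This is a minimal sketch: the function names `xbox_pattern` and `find_xboxes` are hypothetical, only the forward strand is scanned, and the spacer is read as zero to three arbitrary nucleotides.

```python
import re

# IUPAC nucleotide codes used in the consensus (R = purine, Y = pyrimidine, N = any).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}


def xbox_pattern():
    """Compile 5'-GTNRCC(0-3N)RGYAAC-3' into a regular expression."""
    left = "".join(IUPAC[c] for c in "GTNRCC")
    right = "".join(IUPAC[c] for c in "RGYAAC")
    spacer = IUPAC["N"] + "{0,3}"  # the (0-3N) spacer
    return re.compile(left + spacer + right)


def find_xboxes(seq):
    """Return (start, matched_site) pairs for forward-strand X-box matches."""
    return [(m.start(), m.group()) for m in xbox_pattern().finditer(seq.upper())]


if __name__ == "__main__":
    # Made-up test sequence, not taken from any genome.
    print(find_xboxes("ttGTAACCATAGCAACgg"))
```

Scanning the reverse complement as well would be needed to find sites on the opposite strand; that step is omitted here.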



Pssm-ID: 460512  Cd Length: 79  Bit Score: 114.57  E-value: 3.71e-30
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401  372 EKFACQWLNAHFEVNPDCSVSRAEMYSEYLSTCSKLARgGILTSTGFYKCLRTVFPNHTVKRVEdstSSGQAHIHVIGVK 451
Cdd:pfam02257    1 QVFALQWLNANYEVAEGSSVPRSRVYAHYLSFCSKLGI-KPLNAASFGKLIRQVFPNLKTRRLG---TRGQSKYHYCGIR 76

                   ...
gi 1907114401  452 RRA 454
Cdd:pfam02257   77 LKA 79
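Alignment rows like the ones above pair query residues with model residues column by column. The sketch below counts identical positions between two rows. It assumes that lowercase letters and dashes mark unaligned or gapped columns and skips them; the function name `count_identity` is hypothetical.

```python
def count_identity(query_row, model_row):
    """Count identical and aligned columns between two alignment rows.

    Only columns where both rows carry an uppercase residue are treated as
    aligned; lowercase residues and '-' gaps are skipped (an assumption about
    the display convention).
    """
    aligned = 0
    identical = 0
    for q, m in zip(query_row, model_row):
        if q.isupper() and m.isupper():
            aligned += 1
            if q == m:
                identical += 1
    return identical, aligned


if __name__ == "__main__":
    # First five columns of the RFX_DNA_binding alignment shown above.
    identical, aligned = count_identity("EKFAC", "QVFAL")
    print(f"{identical}/{aligned} identical")
```

Passing only the residue strings, without the leading identifier and coordinate columns, keeps the two rows in register.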
SP1-4_N super family cl41773
N-terminal domain of transcription factor Specificity Proteins (SP) 1-4; Specificity Proteins ...
527-1003 3.03e-07

N-terminal domain of transcription factor Specificity Proteins (SP) 1-4; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. There are many SPs in vertebrates (9 SPs in humans and mice, 7 SPs in chicken, and 11 SPs in teleost fish), but arthropods only have 3 SPs. SPs belong to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. SP1-4 have similar N-terminal transactivation domains characterized by glutamine-rich regions, which, in most cases, have adjacent serine/threonine-rich regions. This model represents the N-terminal domain of SP1-4.


The actual alignment was detected with superfamily member cd22540:

Pssm-ID: 425404 [Multi-domain]  Cd Length: 511  Bit Score: 54.93  E-value: 3.03e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401  527 TVAQTVSRIPPNPSVHTHQQQNSPVTVIQNKAPIPCEVV-KATVIQNSVPQTAVPVSISVGGAPAQ--------NSVGQN 597
Cdd:cd22540     17 STTQDSQPSPLALLAATCSKIGPPAVEAAVTPPAPPQPTpRKLVPIKPAPLPLGPGKNSIGFLSAKgniiqlqgSQLSSS 96
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401  598 HSAGPQPVTVVNSQTLLHHPSVMPQPSPLHTVVPGQvpsgtpvtviqQTVPQSRMFGRVQSIPACTSTVSQGQQlITTSP 677
Cdd:cd22540     97 APGGQQVFAIQNPTMIIKGSQTRSSTNQQYQISPQI-----------QAAGQINNSGQIQIIPGTNQAIITPVQ-VLQQP 164
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401  678 QPMHTSSQQTAAGSQPQDTVIIAPPQYVT----TSASNIVSATSVQNFQVA--TGQVVTIAGVPSPQPSRVGFQ--NIAP 749
Cdd:cd22540    165 QQAHKPVPIKPAPLQTSNTNSASLQVPGNviklQSGGNVALTLPVNNLVGTqdGATQLQLAAAPSKPSKKIRKKsaQAAQ 244
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401  750 KPLPSQQVSPSVVQQPIQQPQQPAQQSVVIVSQPaqqGQAYAPAIHQIVLANPAALPagQTVQLTGQPnitpSSSPSPVP 829
Cdd:cd22540    245 PAVTVAEQVETVLIETTADNIIQAGNNLLIVQSP---GTGQPAVLQQVQVLQPKQEQ--QVVQIPQQA----LRVVQAAS 315
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401  830 PTNNQVPTamSSSSTLQSQGPPPTVSQMLSVKRQQQQQHSPAAPAQQVQVQVQQPQQVQVQVQPQQPSAGVGQPApneSS 909
Cdd:cd22540    316 ATLPTVPQ--KPLQNIQIQNSEPTPTQVYIKTPSGEVQTVLLQEAPAATATPSSSTSTVQQQVTANNGTGTSKPN---YN 390
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401  910 LIKQLLLPKRGPStpGGKLILPAPQIpppnnarAPSPQVVYQVANNqaaGFGVQGQTPAQQLLVGQQNvqlvqsamppaG 989
Cdd:cd22540    391 VRKERTLPKIAPA--GGIISLNAAQL-------AAAAQAIQTININ---GVQVQGVPVTITNAGGQQQ-----------L 447
                          490
                   ....*....|....
gi 1907114401  990 GVQTVPISNLQILP 1003
Cdd:cd22540    448 TVQTVSSNNLTISG 461
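The summary line that precedes each alignment (Pssm-ID, Cd Length, Bit Score, E-value) can be parsed, and the model coordinates on the `Cdd:` rows give the fraction of the model covered by the hit. This is a sketch only: `parse_stats` and `model_coverage` are hypothetical helper names, and the regular expression assumes the field order shown in this listing.

```python
import re

STATS_RE = re.compile(
    r"Pssm-ID:\s*(\d+).*?Cd Length:\s*(\d+)\s+"
    r"Bit Score:\s*([\d.]+)\s+E-value:\s*(\S+)"
)


def parse_stats(line):
    """Extract the numeric fields from a 'Pssm-ID: ...' summary line."""
    m = STATS_RE.search(line)
    if m is None:
        raise ValueError("not a Pssm-ID summary line")
    return {
        "pssm_id": int(m.group(1)),
        "cd_length": int(m.group(2)),
        "bit_score": float(m.group(3)),
        "e_value": float(m.group(4)),
    }


def model_coverage(model_start, model_end, cd_length):
    """Fraction of the model spanned by the aligned interval (inclusive ends)."""
    return (model_end - model_start + 1) / cd_length


if __name__ == "__main__":
    stats = parse_stats(
        "Pssm-ID: 425404 [Multi-domain]  Cd Length: 511  "
        "Bit Score: 54.93  E-value: 3.03e-07"
    )
    # Model positions 17 and 461 are the first and last Cdd:cd22540 coordinates above.
    print(stats, round(model_coverage(17, 461, stats["cd_length"]), 3))
```

The span is measured from the first to the last model coordinate, so gaps inside the alignment are not subtracted.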
PTZ00395 super family cl33180
Sec24-related protein; Provisional
1129-1426 1.60e-04

Sec24-related protein; Provisional


The actual alignment was detected with superfamily member PTZ00395:

Pssm-ID: 185594 [Multi-domain]  Cd Length: 1560  Bit Score: 46.61  E-value: 1.60e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401 1129 SESELQVGSLLNGRKYSDSSLPPSNSGklQSETSQCS------------------LISNGPSLELGENGAPGKQNS---- 1186
Cdd:PTZ00395   228 NEGDVQKTNPWQGKQGNSATSPPANEN--NAVTLSCSndqqrgassaaesgyahhRGSNIASHTPNDNIMHAANNPlnnt 305
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401 1187 EPVDMQDVKGDLKKALVNGICDFDKGD--------GSH------LSKNIPNHKTSNHVGNGEISPVEP--QGTSGATQQD 1250
Cdd:PTZ00395   306 NDAQRNAIQGDLVRGAPNDKNSFDRGNektyqiygGFHdgspnaASAGAPFNGLGNQADGGHINQVHPdaRGAWAGGPHS 385
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401 1251 TAKGDQlERVSNGPVLTLGGS-------PSTSSMQEAPSVATPPLSGTDLPNGPlasslNSDVPQQRPSVVVSPHSTAPV 1323
Cdd:PTZ00395   386 NASYNC-AAYSNAAQSNAAQSnagfsnaGYSNPGNSNPGYNNAPNSNTPYNNPP-----NSNTPYSNPPNSNPPYSNLPY 459
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401 1324 IQghqviaVPHSGPRVT--PSALSSDARSTNGTA-ECKTVKRPAEDNDRDTVPGIPNKVGVRIVTISDPNnagCSATMVA 1400
Cdd:PTZ00395   460 SN------TPYSNAPLSnaPPSSAKDHHSAYHAAyQHRAANQPAANLPTANQPAANNFHGAAGNSVGNPF---ASRPFGS 530
                          330       340
                   ....*....|....*....|....*.
gi 1907114401 1401 VPAGADPSTVAKvaiESAAQQKQQHP 1426
Cdd:PTZ00395   531 APYGGNAATTAD---PNGIAKREDHP 553
 
Name Accession Description Interval E-value
RFX_DNA_binding pfam02257
RFX DNA-binding domain; RFX is a regulatory factor which binds to the X box of MHC class II ...
372-454 3.71e-30

RFX DNA-binding domain; RFX is a regulatory factor which binds to the X box of MHC class II genes and is essential for their expression. The DNA-binding domain of RFX is the central domain of the protein and binds ssDNA as either a monomer or homodimer. It recognizes X-boxes (DNA of the sequence 5'-GTNRCC(0-3N)RGYAAC-3', where N is any nucleotide, R is a purine and Y is a pyrimidine) using a highly conserved 76-residue DNA-binding domain (DBD).


Pssm-ID: 460512  Cd Length: 79  Bit Score: 114.57  E-value: 3.71e-30
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401  372 EKFACQWLNAHFEVNPDCSVSRAEMYSEYLSTCSKLARgGILTSTGFYKCLRTVFPNHTVKRVEdstSSGQAHIHVIGVK 451
Cdd:pfam02257    1 QVFALQWLNANYEVAEGSSVPRSRVYAHYLSFCSKLGI-KPLNAASFGKLIRQVFPNLKTRRLG---TRGQSKYHYCGIR 76

                   ...
gi 1907114401  452 RRA 454
Cdd:pfam02257   77 LKA 79
SP2_N cd22540
N-terminal domain of transcription factor Specificity Protein (SP) 2; Specificity Proteins ...
527-1003 3.03e-07

N-terminal domain of transcription factor Specificity Protein (SP) 2; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP2 contains the least conserved DNA-binding domain within the SP subfamily of proteins, and its DNA sequence specificity differs from the other SP proteins. It localizes primarily within subnuclear foci associated with the nuclear matrix, and can activate, or in some cases, repress expression from different promoters. The transcription factor SP2 serves as a paradigm for indirect genomic binding. It does not require its DNA-binding domain for genomic DNA binding and occupies target promoters independently of whether they contain a cognate DNA-binding motif. SP2 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. SP1-4 have similar N-terminal transactivation domains characterized by glutamine-rich regions, which, in most cases, have adjacent serine/threonine-rich regions. This model represents the N-terminal domain of SP2.


Pssm-ID: 411776 [Multi-domain]  Cd Length: 511  Bit Score: 54.93  E-value: 3.03e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401  527 TVAQTVSRIPPNPSVHTHQQQNSPVTVIQNKAPIPCEVV-KATVIQNSVPQTAVPVSISVGGAPAQ--------NSVGQN 597
Cdd:cd22540     17 STTQDSQPSPLALLAATCSKIGPPAVEAAVTPPAPPQPTpRKLVPIKPAPLPLGPGKNSIGFLSAKgniiqlqgSQLSSS 96
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401  598 HSAGPQPVTVVNSQTLLHHPSVMPQPSPLHTVVPGQvpsgtpvtviqQTVPQSRMFGRVQSIPACTSTVSQGQQlITTSP 677
Cdd:cd22540     97 APGGQQVFAIQNPTMIIKGSQTRSSTNQQYQISPQI-----------QAAGQINNSGQIQIIPGTNQAIITPVQ-VLQQP 164
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401  678 QPMHTSSQQTAAGSQPQDTVIIAPPQYVT----TSASNIVSATSVQNFQVA--TGQVVTIAGVPSPQPSRVGFQ--NIAP 749
Cdd:cd22540    165 QQAHKPVPIKPAPLQTSNTNSASLQVPGNviklQSGGNVALTLPVNNLVGTqdGATQLQLAAAPSKPSKKIRKKsaQAAQ 244
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401  750 KPLPSQQVSPSVVQQPIQQPQQPAQQSVVIVSQPaqqGQAYAPAIHQIVLANPAALPagQTVQLTGQPnitpSSSPSPVP 829
Cdd:cd22540    245 PAVTVAEQVETVLIETTADNIIQAGNNLLIVQSP---GTGQPAVLQQVQVLQPKQEQ--QVVQIPQQA----LRVVQAAS 315
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401  830 PTNNQVPTamSSSSTLQSQGPPPTVSQMLSVKRQQQQQHSPAAPAQQVQVQVQQPQQVQVQVQPQQPSAGVGQPApneSS 909
Cdd:cd22540    316 ATLPTVPQ--KPLQNIQIQNSEPTPTQVYIKTPSGEVQTVLLQEAPAATATPSSSTSTVQQQVTANNGTGTSKPN---YN 390
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401  910 LIKQLLLPKRGPStpGGKLILPAPQIpppnnarAPSPQVVYQVANNqaaGFGVQGQTPAQQLLVGQQNvqlvqsamppaG 989
Cdd:cd22540    391 VRKERTLPKIAPA--GGIISLNAAQL-------AAAAQAIQTININ---GVQVQGVPVTITNAGGQQQ-----------L 447
                          490
                   ....*....|....
gi 1907114401  990 GVQTVPISNLQILP 1003
Cdd:cd22540    448 TVQTVSSNNLTISG 461
PTZ00395 PTZ00395
Sec24-related protein; Provisional
1129-1426 1.60e-04

Sec24-related protein; Provisional


Pssm-ID: 185594 [Multi-domain]  Cd Length: 1560  Bit Score: 46.61  E-value: 1.60e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401 1129 SESELQVGSLLNGRKYSDSSLPPSNSGklQSETSQCS------------------LISNGPSLELGENGAPGKQNS---- 1186
Cdd:PTZ00395   228 NEGDVQKTNPWQGKQGNSATSPPANEN--NAVTLSCSndqqrgassaaesgyahhRGSNIASHTPNDNIMHAANNPlnnt 305
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401 1187 EPVDMQDVKGDLKKALVNGICDFDKGD--------GSH------LSKNIPNHKTSNHVGNGEISPVEP--QGTSGATQQD 1250
Cdd:PTZ00395   306 NDAQRNAIQGDLVRGAPNDKNSFDRGNektyqiygGFHdgspnaASAGAPFNGLGNQADGGHINQVHPdaRGAWAGGPHS 385
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401 1251 TAKGDQlERVSNGPVLTLGGS-------PSTSSMQEAPSVATPPLSGTDLPNGPlasslNSDVPQQRPSVVVSPHSTAPV 1323
Cdd:PTZ00395   386 NASYNC-AAYSNAAQSNAAQSnagfsnaGYSNPGNSNPGYNNAPNSNTPYNNPP-----NSNTPYSNPPNSNPPYSNLPY 459
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401 1324 IQghqviaVPHSGPRVT--PSALSSDARSTNGTA-ECKTVKRPAEDNDRDTVPGIPNKVGVRIVTISDPNnagCSATMVA 1400
Cdd:PTZ00395   460 SN------TPYSNAPLSnaPPSSAKDHHSAYHAAyQHRAANQPAANLPTANQPAANNFHGAAGNSVGNPF---ASRPFGS 530
                          330       340
                   ....*....|....*....|....*.
gi 1907114401 1401 VPAGADPSTVAKvaiESAAQQKQQHP 1426
Cdd:PTZ00395   531 APYGGNAATTAD---PNGIAKREDHP 553
PHA03247 PHA03247
large tegument protein UL36; Provisional
477-948 3.50e-04

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 45.70  E-value: 3.50e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401  477 AVADLS-PTPSPAGIPHGPqAAGNHFQRTPVTNQSSNLTAT---QMSFPVQGIHTVAQTVSRIPPNPSVHTHQQQNSPVT 552
Cdd:PHA03247  2562 AAPDRSvPPPRPAPRPSEP-AVTSRARRPDAPPQSARPRAPvddRGDPRGPAPPSPLPPDTHAPDPPPPSPSPAANEPDP 2640
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401  553 VIQNKAPIPCEVVKATviqnSVPQTAVPVSISVGGAPAQNSV---GQNHSAGPQPVTVVNSQTLLHHPSVMPQPSPLHTV 629
Cdd:PHA03247  2641 HPPPTVPPPERPRDDP----APGRVSRPRRARRLGRAAQASSppqRPRRRAARPTVGSLTSLADPPPPPPTPEPAPHALV 2716
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401  630 vpgqvpSGTPVTVIQQTVPQSRMFGRVQSIPACTSTVSQGQQLITTSPQPMHTSSQQTAAGsqPQDTViiAPPQYVTTSA 709
Cdd:PHA03247  2717 ------SATPLPPGPAAARQASPALPAAPAPPAVPAGPATPGGPARPARPPTTAGPPAPAP--PAAPA--AGPPRRLTRP 2786
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401  710 SNIVSATSVQNFQVATGQVVTIAGVPSPQPSRVGFQNIA---PKPLPSQQVSPSVvQQPIQQPQQPAQQSVVIVSQPAQQ 786
Cdd:PHA03247  2787 AVASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPAgplPPPTSAQPTAPPP-PPGPPPPSLPLGGSVAPGGDVRRR 2865
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401  787 GQAYAPAihqivlANPAAlPAGQTVQLTGQPnitpssspspvpptnnqvptAMSSSSTLQSQGPPPTVSQMLSVKRQQQQ 866
Cdd:PHA03247  2866 PPSRSPA------AKPAA-PARPPVRRLARP--------------------AVSRSTESFALPPDQPERPPQPQAPPPPQ 2918
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401  867 QHSPAAPAQQVQVQVQQPQQVQVQVQPQQPSAGVGQPAPNESSLIKQLLLPKRGPS----TPGGKLILPAPQIPPPNNAR 942
Cdd:PHA03247  2919 PQPQPPPPPQPQPPPPPPPRPQPPLAPTTDPAGAGEPSGAVPQPWLGALVPGRVAVprfrVPQPAPSREAPASSTPPLTG 2998

                   ....*.
gi 1907114401  943 APSPQV 948
Cdd:PHA03247  2999 HSLSRV 3004
Med15 pfam09606
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ...
538-976 4.10e-04

ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators use the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.


Pssm-ID: 312941 [Multi-domain]  Cd Length: 732  Bit Score: 45.00  E-value: 4.10e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401  538 NPSVHTHQQQNSPVTVIQNKAPIPCEVVKATVIQNSVPQTAVPVSISVGGAPAQNSVGqnhsagpqPVTVVNSQTLLHHP 617
Cdd:pfam09606   59 QQQQPQGGQGNGGMGGGQQGMPDPINALQNLAGQGTRPQMMGPMGPGPGGPMGQQMGG--------PGTASNLLASLGRP 130
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401  618 S----VMPQPSPLHTVVPGQVPSGT-----PVTVIQQTVPQSRMFGRVQSIPACTSTVSQGQQLITTSPQPM-------- 680
Cdd:pfam09606  131 QmpmgGAGFPSQMSRVGRMQPGGQAggmmqPSSGQPGSGTPNQMGPNGGPGQGQAGGMNGGQQGPMGGQMPPqmgvpgmp 210
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401  681 -------HTSSQQTAAGSQPQDTVIIAPPQYVTTSASNIVSATSVQnFQVATGQVVTIA-GVPSPQPSRVGFQNIAPkpl 752
Cdd:pfam09606  211 gpadagaQMGQQAQANGGMNPQQMGGAPNQVAMQQQQPQQQGQQSQ-LGMGINQMQQMPqGVGGGAGQGGPGQPMGP--- 286
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401  753 PSQQvsPSVVQQPIQQPQQPAQQSVVIVSQPAQQGQAYAPAIHQIVlaNPAALPAGQTVQLTGQpnitpssSPSPVPPTN 832
Cdd:pfam09606  287 PGQQ--PGAMPNVMSIGDQNNYQQQQTRQQQQQQGGNHPAAHQQQM--NQSVGQGGQVVALGGL-------NHLETWNPG 355
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401  833 NQVPTAMSSsstlQSQGPPPTVSQ----MLSVKRQQQQQHSPAApaqqvqvqvqqpqqvqvqvqpqqpsagvgQPAPNES 908
Cdd:pfam09606  356 NFGGLGANP----MQRGQPGMMSSpspvPGQQVRQVTPNQFMRQ-----------------------------SPQPSVP 402
                          410       420       430       440       450       460
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 1907114401  909 SlikqlllpKRGPSTPGGKLILPAPqIPPPNNARAPSPQVvyqvannqaagfgvqGQTPAQQLLVGQQ 976
Cdd:pfam09606  403 S--------PQGPGSQPPQSHPGGM-IPSPALIPSPSPQM---------------SQQPAQQRTIGQD 446


Pssm-ID: 312941 [Multi-domain]  Cd Length: 732  Bit Score: 45.00  E-value: 4.10e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401  538 NPSVHTHQQQNSPVTVIQNKAPIPCEVVKATVIQNSVPQTAVPVSISVGGAPAQNSVGqnhsagpqPVTVVNSQTLLHHP 617
Cdd:pfam09606   59 QQQQPQGGQGNGGMGGGQQGMPDPINALQNLAGQGTRPQMMGPMGPGPGGPMGQQMGG--------PGTASNLLASLGRP 130
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401  618 S----VMPQPSPLHTVVPGQVPSGT-----PVTVIQQTVPQSRMFGRVQSIPACTSTVSQGQQLITTSPQPM-------- 680
Cdd:pfam09606  131 QmpmgGAGFPSQMSRVGRMQPGGQAggmmqPSSGQPGSGTPNQMGPNGGPGQGQAGGMNGGQQGPMGGQMPPqmgvpgmp 210
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401  681 -------HTSSQQTAAGSQPQDTVIIAPPQYVTTSASNIVSATSVQnFQVATGQVVTIA-GVPSPQPSRVGFQNIAPkpl 752
Cdd:pfam09606  211 gpadagaQMGQQAQANGGMNPQQMGGAPNQVAMQQQQPQQQGQQSQ-LGMGINQMQQMPqGVGGGAGQGGPGQPMGP--- 286
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401  753 PSQQvsPSVVQQPIQQPQQPAQQSVVIVSQPAQQGQAYAPAIHQIVlaNPAALPAGQTVQLTGQpnitpssSPSPVPPTN 832
Cdd:pfam09606  287 PGQQ--PGAMPNVMSIGDQNNYQQQQTRQQQQQQGGNHPAAHQQQM--NQSVGQGGQVVALGGL-------NHLETWNPG 355
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401  833 NQVPTAMSSsstlQSQGPPPTVSQ----MLSVKRQQQQQHSPAApaqqvqvqvqqpqqvqvqvqpqqpsagvgQPAPNES 908
Cdd:pfam09606  356 NFGGLGANP----MQRGQPGMMSSpspvPGQQVRQVTPNQFMRQ-----------------------------SPQPSVP 402
                          410       420       430       440       450       460
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 1907114401  909 SlikqlllpKRGPSTPGGKLILPAPqIPPPNNARAPSPQVvyqvannqaagfgvqGQTPAQQLLVGQQ 976
Cdd:pfam09606  403 S--------PQGPGSQPPQSHPGGM-IPSPALIPSPSPQM---------------SQQPAQQRTIGQD 446
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
545-853 7.95e-04

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 44.37  E-value: 7.95e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401  545 QQQNSPVTVIQNKAPIPCE--VVKATVIQNSVPQTAVPvSISVGGAPAQNSVGQNHSAGPQPVTVVNSQTLLHHPSVmpq 622
Cdd:pfam03154  167 LQTQPPVLQAQSGAASPPSppPPGTTQAATAGPTPSAP-SVPPQGSPATSQPPNQTQSTAAPHTLIQQTPTLHPQRL--- 242
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401  623 PSPlHTVVPGQVPSGTPVTVIQQTVPQSRMFGRVQSIP-ACTSTVSQGQQLITTSPQPMHTSSQQTAAGSQPQDTVIIAP 701
Cdd:pfam03154  243 PSP-HPPLQPMTQPPPPSQVSPQPLPQPSLHGQMPPMPhSLQTGPSHMQHPVPPQPFPLTPQSSQSQVPPGPSPAAPGQS 321
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401  702 PQYVTTSASNivsatsvqnfqvatgqvvtiAGVPSPQPSRVGFQNIAPKPLPSQQVSPSVVQQPIQQPQQPAQQSVVIVS 781
Cdd:pfam03154  322 QQRIHTPPSQ--------------------SQLQSQQPPREQPLPPAPLSMPHIKPPPTTPIPQLPNPQSHKHPPHLSGP 381
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401  782 QPAQQGQAYAP----------AIHQIVLANPAAL---PAGQTVQLT-GQPNITPSSSPSPVPPTNNQVPtamSSSSTLQS 847
Cdd:pfam03154  382 SPFQMNSNLPPppalkplsslSTHHPPSAHPPPLqlmPQSQQLPPPpAQPPVLTQSQSLPPPAASHPPT---SGLHQVPS 458

                   ....*.
gi 1907114401  848 QGPPPT 853
Cdd:pfam03154  459 QSPFPQ 464
DUF5585 pfam17823
Family of unknown function (DUF5585); This is a family of unknown function found in Chordata.
485-853 6.07e-03

Family of unknown function (DUF5585); This is a family of unknown function found in Chordata.


Pssm-ID: 465521 [Multi-domain]  Cd Length: 506  Bit Score: 41.10  E-value: 6.07e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401  485 PSPAGIPHGPQAAgnHFQRTPVTNQSSNlTATQMSFPVQGIHTVAQTVSRIPPNPSvhthqqQNSPVTVIQNkapIPCEV 564
Cdd:pfam17823   67 PAPVTLTKGTSAA--HLNSTEVTAEHTP-HGTDLSEPATREGAADGAASRALAAAA------SSSPSSAAQS---LPAAI 134
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401  565 VKATVIQNSVPQTAVP-VSISVGGAPAQNSVGQNHSAGPQPVTVVNSQTLLHHPSVMPQP------SPLHTVVPGQ---- 633
Cdd:pfam17823  135 AALPSEAFSAPRAAACrANASAAPRAAIAAASAPHAASPAPRTAASSTTAASSTTAASSApttaasSAPATLTPARgist 214
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401  634 --VPSGTPV--TVIQQ-----TVPQSRMFGRVQSIPACTSTVSQGQQLITTSPQPMHTSS----QQTAAGSQPQDTVIIA 700
Cdd:pfam17823  215 aaTATGHPAagTALAAvgnssPAAGTVTAAVGTVTPAALATLAAAAGTVASAAGTINMGDpharRLSPAKHMPSDTMARN 294
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401  701 PpqyVTTSASNIVSATSvqnfQVATGQ-VVTIAGVPSPQPSRVGFQNIAPKPLPSqqvspsvvqqpiqqpqqpaQQSVVI 779
Cdd:pfam17823  295 P---AAPMGAQAQGPII----QVSTDQpVHNTAGEPTPSPSNTTLEPNTPKSVAS-------------------TNLAVV 348
                          330       340       350       360       370       380       390
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 1907114401  780 VSQPAQqgqayapaihqivlanpAALPAGQTVqltgqpnitpssspsPVPPTnNQVPTAMSSSSTLQSQGPPPT 853
Cdd:pfam17823  349 TTTKAQ-----------------AKEPSASPV---------------PVLHT-SMIPEVEATSPTTQPSPLLPT 389
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
571-760 6.15e-03

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 41.29  E-value: 6.15e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401  571 QNSVPQTAVPVSISVGGAPAQNSVG-----QNHSAGPQPVTvvnsqtllhhPSVMPQPSPLHTVVPGQ-VPSGTPVTVIQ 644
Cdd:pfam03154  163 QQQILQTQPPVLQAQSGAASPPSPPppgttQAATAGPTPSA----------PSVPPQGSPATSQPPNQtQSTAAPHTLIQ 232
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907114401  645 QTVPQsrmfgRVQSIPACTSTVSQgqqlITTSPQPMHTSSQQTaagSQPQDTVIIAP-PQYVTTSASNIVSATSVQNFQV 723
Cdd:pfam03154  233 QTPTL-----HPQRLPSPHPPLQP----MTQPPPPSQVSPQPL---PQPSLHGQMPPmPHSLQTGPSHMQHPVPPQPFPL 300
                          170       180       190
                   ....*....|....*....|....*....|....*..
gi 1907114401  724 aTGQVVTIAGVPSPQPSRVGFQNIAPKPLPSQQVSPS 760
Cdd:pfam03154  301 -TPQSSQSQVPPGPSPAAPGQSQQRIHTPPSQSQLQS 336
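The paired query/model rows above can be summarized numerically. The following is a minimal Python sketch, not part of the CD-Search output, that counts aligned columns and identical residues for the final block of the second Atrophin-1 hit (query residues 724-760). It assumes that `-` marks a gap and that lowercase letters mark residues not aligned to the model; the function name `alignment_stats` is hypothetical.

```python
from typing import Tuple


def alignment_stats(query: str, subject: str) -> Tuple[int, int]:
    """Return (aligned_columns, identical_columns) for two alignment rows.

    Assumption: '-' is a gap and lowercase letters are unaligned residues,
    so columns containing either are skipped.
    """
    if len(query) != len(subject):
        raise ValueError("alignment rows must have the same length")
    aligned = 0
    identical = 0
    for q, s in zip(query, subject):
        if q == "-" or s == "-" or q.islower() or s.islower():
            continue  # gap or unaligned residue: not a scored column
        aligned += 1
        if q == s:
            identical += 1
    return aligned, identical


# Rows copied from the last block of the pfam03154 alignment (query 724-760).
QUERY = "aTGQVVTIAGVPSPQPSRVGFQNIAPKPLPSQQVSPS"
SUBJECT = "-TPQSSQSQVPPGPSPAAPGQSQQRIHTPPSQSQLQS"

if __name__ == "__main__":
    aligned, identical = alignment_stats(QUERY, SUBJECT)
    print(f"{identical}/{aligned} identical aligned columns")
```

Under these assumptions the block has 36 aligned columns, 10 of them identical, which is in keeping with the weak E-value reported for this hit.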
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd; Low complexity filter: no; Composition Based Adjustment: yes; E-value threshold: 0.01
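To illustrate how the E-value threshold shapes the hit list, here is a minimal Python sketch that re-filters the hits shown in this part of the listing with a stricter cutoff. The hit values are copied from the entries above; the `DomainHit` type and `filter_hits` function are hypothetical names, and treating the threshold as inclusive (`<=`) is an assumption, not documented CD-Search behavior.

```python
from typing import List, NamedTuple


class DomainHit(NamedTuple):
    name: str
    accession: str
    start: int      # first query residue of the hit interval
    end: int        # last query residue of the hit interval
    evalue: float


# Values copied from the domain hits listed above.
HITS = [
    DomainHit("PHA03247", "PHA03247", 477, 948, 3.50e-04),
    DomainHit("Med15", "pfam09606", 538, 976, 4.10e-04),
    DomainHit("Atrophin-1", "pfam03154", 545, 853, 7.95e-04),
    DomainHit("DUF5585", "pfam17823", 485, 853, 6.07e-03),
    DomainHit("Atrophin-1", "pfam03154", 571, 760, 6.15e-03),
]


def filter_hits(hits: List[DomainHit], max_evalue: float = 0.01) -> List[DomainHit]:
    """Keep hits at or below max_evalue, sorted from most to least significant."""
    kept = [h for h in hits if h.evalue <= max_evalue]
    return sorted(kept, key=lambda h: h.evalue)


if __name__ == "__main__":
    for hit in filter_hits(HITS, max_evalue=1e-3):
        print(f"{hit.name}\t{hit.accession}\t{hit.start}-{hit.end}\t{hit.evalue:.2e}")
```

With the default cutoff of 0.01 all five hits are retained; tightening it to 1e-3 drops the DUF5585 hit and the second, shorter Atrophin-1 hit.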

References:

  • Wang J et al. (2023), "The conserved domain database in 2023", Nucleic Acids Res. 51(D1):D384-D388.
  • Lu S et al. (2020), "The conserved domain database in 2020", Nucleic Acids Res. 48(D1):D265-D268.
  • Marchler-Bauer A et al. (2017), "CDD/SPARCLE: functional classification of proteins via subfamily domain architectures", Nucleic Acids Res. 45(D1):D200-D203.