Conserved domains on  [gi|2181405596|ref|NP_001387096|]

protein transport protein Sec31A isoform 15 [Homo sapiens]

List of domain hits

Name Accession Description Interval E-value
WD40 super family cl29593
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
13-332 3.24e-24

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel β-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.


The actual alignment was detected with superfamily member cd00200:

Pssm-ID: 475233 [Multi-domain]  Cd Length: 289  Bit Score: 104.34  E-value: 3.24e-24
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596   13 AWSPAQNhpiYLATGtsaqqldatfSTNASLEIFELD-------LSDPSLDMKSCATFSSSHRyhkliwgpykmdskgdv 85
Cdd:cd00200     16 AFSPDGK---LLATG----------SGDGTIKVWDLEtgellrtLKGHTGPVRDVAASADGTY----------------- 65
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596   86 sgvLIAGGENGNIILYDPSKiiaGDKEVVIAQndkHTGPVRALDvniFQTN--LVASGANESEIYIWDLNNFATPMTPGA 163
Cdd:cd00200     66 ---LASGSSDKTIRLWDLET---GECVRTLTG---HTSYVSSVA---FSPDgrILSSSSRDKTIKVWDVETGKCLTTLRG 133
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  164 KTQPpedISCIAWNrQVQHILASASPSGRATVWDLRKNEPIIKVSDHSNRMHCsgLAWHPDvATQMVLASEDDrlpVIQM 243
Cdd:cd00200    134 HTDW---VNSVAFS-PDGTFVASSSQDGTIKLWDLRTGKCVATLTGHTGEVNS--VAFSPD-GEKLLSSSSDG---TIKL 203
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  244 WDLRfASSPLRVLENHARGILAIAWSmADPELLLSCGKDAKILCSNPNTGEVLYELPTNTQWCFDIQWCPRNPAVLSaAS 323
Cdd:cd00200    204 WDLS-TGKCLGTLRGHENGVNSVAFS-PDGYLLASGSEDGTIRVWDLRTGECVQTLSGHTNSVTSLAWSPDGKRLAS-GS 280

                   ....*....
gi 2181405596  324 FDGRISVYS 332
Cdd:cd00200    281 ADGTIRIWD 289
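
The paired rows above can be compared column by column. The following is a minimal Python sketch, assuming that lowercase letters mark unaligned residues and that `-` marks a gap; that is the usual reading of this display, but it is treated here as an assumption. `aligned_identity` is a hypothetical helper, not part of any NCBI tool.

```python
def aligned_identity(query: str, subject: str) -> tuple:
    """Return (identical, aligned) counts for two equal-length alignment rows."""
    if len(query) != len(subject):
        raise ValueError("alignment rows must have equal length")
    identical = 0
    aligned = 0
    for q, s in zip(query, subject):
        # Skip gap columns and lowercase residues (assumed to be unaligned).
        if q == "-" or s == "-" or q.islower() or s.islower():
            continue
        aligned += 1
        if q == s:
            identical += 1
    return identical, aligned


# Final block of the WD40 alignment: query residues 324-332 vs. cd00200 281-289.
print(aligned_identity("FDGRISVYS", "ADGTIRIWD"))  # (3, 9)
```

Summing the two counts over every block of a hit gives a rough percent identity for that hit. The counts say nothing about the position-specific scoring behind the reported bit score.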
Atrophin-1 super family cl38111
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
737-1090 1.16e-10

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


The actual alignment was detected with superfamily member pfam03154:

Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 66.33  E-value: 1.16e-10
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  737 QPNIMQLRDRLCRAQGEPVAGHESPKIPYEKQQLPKGRPGPVAGHHQMPRVQTQQYYPHVRIAPTVTTWSNKTPT---AL 813
Cdd:pfam03154  170 QPPVLQAQSGAASPPSPPPPGTTQAATAGPTPSAPSVPPQGSPATSQPPNQTQSTAAPHTLIQQTPTLHPQRLPSphpPL 249
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  814 PSHPPAASPSDTQGENPPPPgfIMHGNVNPnAAGQLPTSPGHMHTQV-----PPYPQPQPYQPAQPYPFGTGGSAMYRPQ 888
Cdd:pfam03154  250 QPMTQPPPPSQVSPQPLPQP--SLHGQMPP-MPHSLQTGPSHMQHPVppqpfPLTPQSSQSQVPPGPSPAAPGQSQQRIH 326
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  889 QPVA--------PPTSNAYP----NTPYISSASSyTGQSQLYAAQ------HQASSPTSSPATSFPPPPS----SGASFQ 946
Cdd:pfam03154  327 TPPSqsqlqsqqPPREQPLPpaplSMPHIKPPPT-TPIPQLPNPQshkhppHLSGPSPFQMNSNLPPPPAlkplSSLSTH 405
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  947 HGGPGAPP------SSSAYALPPGTTGTLPAASELPASQRTGPqngwnDPPALNRVPKKKKMPEN-FMP--PVPITSPIM 1017
Cdd:pfam03154  406 HPPSAHPPplqlmpQSQQLPPPPAQPPVLTQSQSLPPPAASHP-----PTSGLHQVPSQSPFPQHpFVPggPPPITPPSG 480
                          330       340       350       360       370       380       390
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 2181405596 1018 NPLGDPQSQMLQQQPSApvpLSSQSSFPQPHLPGGQ--PFHGVQQPLGQTGMPPSFSKPNIEGAPGAPIGNTFQH 1090
Cdd:pfam03154  481 PPTSTSSAMPGIQPPSS---ASVSSSGPVPAAVSCPlpPVQIKEEALDEAEEPESPPPPPRSPSPEPTVVNTPSH 552
ACE1-Sec16-like super family cl14807
Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; COPII coat ...
534-657 4.72e-09

Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; the COPII coat complex plays an important role in vesicular traffic of newly synthesized proteins from the endoplasmic reticulum (ER) to the Golgi apparatus by mediating the formation of transport vesicles. COPII consists of an outer coat, made up of the scaffold proteins Sec31 and Sec13, and the cargo adaptor complex, Sec23 and Sec24, which are recruited by the small GTPase Sar1. Sec16 is involved in the early steps of the assembly process. Sec16 forms elongated heterotetramers with Sec13, Sec13-(Sec16)2-Sec13. It interacts with Sec13 by insertion of a single beta-blade to close the six-bladed beta propeller of Sec13. Sec13 interacts in the same way with Sec31 and Nup145C, a nuclear pore protein; all of these contain a structurally related ancestral coatomer element 1 (ACE1). Sec16 is believed to be a key component in maintaining the integrity of the ER exit site.


The actual alignment was detected with superfamily member cd09233:

Pssm-ID: 449359 [Multi-domain]  Cd Length: 314  Bit Score: 59.19  E-value: 4.72e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  534 ITQALLTGNFESAVDLCLhDNRM-ADAIILAIAGGQELLARTQKKyFAKSQSKIT---RLITAVVMKNWKEIVESC---- 605
Cdd:cd09233     69 FRNLLLTGNRKEALELAL-DNGLwAHALLLASSLGKETWAEVVSR-FARSESKLNdplQTLYQLFSGNSPEAITELadnp 146
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|..
gi 2181405596  606 -----DLKNWREALAAVLTYAKPD-EFSALCDLlgtrleneGDSLLQTQ----ACLCYICAG 657
Cdd:cd09233    147 aeaewALGNWREHLAIILSNRTSNlDLEALVEL--------GDLLAQRGlveaAHICYLLAG 200
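
Each hit above carries an "Interval E-value" line and a "Pssm-ID ..." statistics line. The following is a minimal Python sketch for reading both, assuming the field labels appear exactly as they do in this listing. `parse_stats` and `parse_interval` are hypothetical helper names, and the pattern is not an official CDD output specification.

```python
import re

# Pattern for the per-hit statistics line as it appears in this listing.
# The bracketed flag (e.g. "[Multi-domain]") is optional.
STATS_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<flag>[^\]]+)\])?"
    r"\s*Cd Length:\s*(?P<cd_length>\d+)"
    r"\s*Bit Score:\s*(?P<bit_score>[0-9.]+)"
    r"\s*E-value:\s*(?P<e_value>[0-9.eE+-]+)"
)


def parse_stats(line: str) -> dict:
    """Parse a 'Pssm-ID: ... E-value: ...' line into typed fields."""
    m = STATS_RE.search(line)
    if m is None:
        raise ValueError(f"not a statistics line: {line!r}")
    return {
        "pssm_id": int(m["pssm_id"]),
        "multi_domain": m["flag"] == "Multi-domain",
        "cd_length": int(m["cd_length"]),
        "bit_score": float(m["bit_score"]),
        "e_value": float(m["e_value"]),
    }


def parse_interval(line: str) -> tuple:
    """Parse an 'Interval E-value' line such as '534-657 4.72e-09'."""
    span, e_value = line.split()
    start, end = (int(x) for x in span.split("-"))
    return start, end, float(e_value)


start, end, _ = parse_interval("534-657 4.72e-09")
print(end - start + 1)  # 124 query residues covered by the ACE1-Sec16-like hit
```

Making the bracketed flag optional lets the same pattern read hits such as Sec16_C, whose statistics line has no "[Multi-domain]" marker.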
 
Name Accession Description Interval E-value
WD40 cd00200
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
13-332 3.24e-24

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel β-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.


Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 104.34  E-value: 3.24e-24
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596   13 AWSPAQNhpiYLATGtsaqqldatfSTNASLEIFELD-------LSDPSLDMKSCATFSSSHRyhkliwgpykmdskgdv 85
Cdd:cd00200     16 AFSPDGK---LLATG----------SGDGTIKVWDLEtgellrtLKGHTGPVRDVAASADGTY----------------- 65
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596   86 sgvLIAGGENGNIILYDPSKiiaGDKEVVIAQndkHTGPVRALDvniFQTN--LVASGANESEIYIWDLNNFATPMTPGA 163
Cdd:cd00200     66 ---LASGSSDKTIRLWDLET---GECVRTLTG---HTSYVSSVA---FSPDgrILSSSSRDKTIKVWDVETGKCLTTLRG 133
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  164 KTQPpedISCIAWNrQVQHILASASPSGRATVWDLRKNEPIIKVSDHSNRMHCsgLAWHPDvATQMVLASEDDrlpVIQM 243
Cdd:cd00200    134 HTDW---VNSVAFS-PDGTFVASSSQDGTIKLWDLRTGKCVATLTGHTGEVNS--VAFSPD-GEKLLSSSSDG---TIKL 203
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  244 WDLRfASSPLRVLENHARGILAIAWSmADPELLLSCGKDAKILCSNPNTGEVLYELPTNTQWCFDIQWCPRNPAVLSaAS 323
Cdd:cd00200    204 WDLS-TGKCLGTLRGHENGVNSVAFS-PDGYLLASGSEDGTIRVWDLRTGECVQTLSGHTNSVTSLAWSPDGKRLAS-GS 280

                   ....*....
gi 2181405596  324 FDGRISVYS 332
Cdd:cd00200    281 ADGTIRIWD 289
WD40 COG2319
WD40 repeat [General function prediction only];
89-333 4.90e-23

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 103.07  E-value: 4.90e-23
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596   89 LIAGGENGNIILYDpskiIAGDKEvvIAQNDKHTGPVRALDVNiFQTNLVASGANESEIYIWDLNNFATPMTPGAKTQPp 168
Cdd:COG2319    177 LASGSDDGTVRLWD----LATGKL--LRTLTGHTGAVRSVAFS-PDGKLLASGSADGTVRLWDLATGKLLRTLTGHSGS- 248
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  169 edISCIAWNRQVQHiLASASPSGRATVWDLRKNEPIIKVSDHSNRMHcsGLAWHPDvATQMVLASEDDRlpvIQMWDLRf 248
Cdd:COG2319    249 --VRSVAFSPDGRL-LASGSADGTVRLWDLATGELLRTLTGHSGGVN--SVAFSPD-GKLLASGSDDGT---VRLWDLA- 318
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  249 ASSPLRVLENHARGILAIAWSmADPELLLSCGKDAKILCSNPNTGEVLYELPTNTQWCFDIQWCPRNPAVLSaASFDGRI 328
Cdd:COG2319    319 TGKLLRTLTGHTGAVRSVAFS-PDGKTLASGSDDGTVRLWDLATGELLRTLTGHTGAVTSVAFSPDGRTLAS-GSADGTV 396

                   ....*
gi 2181405596  329 SVYSI 333
Cdd:COG2319    397 RLWDL 401
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
737-1090 1.16e-10

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 66.33  E-value: 1.16e-10
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  737 QPNIMQLRDRLCRAQGEPVAGHESPKIPYEKQQLPKGRPGPVAGHHQMPRVQTQQYYPHVRIAPTVTTWSNKTPT---AL 813
Cdd:pfam03154  170 QPPVLQAQSGAASPPSPPPPGTTQAATAGPTPSAPSVPPQGSPATSQPPNQTQSTAAPHTLIQQTPTLHPQRLPSphpPL 249
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  814 PSHPPAASPSDTQGENPPPPgfIMHGNVNPnAAGQLPTSPGHMHTQV-----PPYPQPQPYQPAQPYPFGTGGSAMYRPQ 888
Cdd:pfam03154  250 QPMTQPPPPSQVSPQPLPQP--SLHGQMPP-MPHSLQTGPSHMQHPVppqpfPLTPQSSQSQVPPGPSPAAPGQSQQRIH 326
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  889 QPVA--------PPTSNAYP----NTPYISSASSyTGQSQLYAAQ------HQASSPTSSPATSFPPPPS----SGASFQ 946
Cdd:pfam03154  327 TPPSqsqlqsqqPPREQPLPpaplSMPHIKPPPT-TPIPQLPNPQshkhppHLSGPSPFQMNSNLPPPPAlkplSSLSTH 405
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  947 HGGPGAPP------SSSAYALPPGTTGTLPAASELPASQRTGPqngwnDPPALNRVPKKKKMPEN-FMP--PVPITSPIM 1017
Cdd:pfam03154  406 HPPSAHPPplqlmpQSQQLPPPPAQPPVLTQSQSLPPPAASHP-----PTSGLHQVPSQSPFPQHpFVPggPPPITPPSG 480
                          330       340       350       360       370       380       390
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 2181405596 1018 NPLGDPQSQMLQQQPSApvpLSSQSSFPQPHLPGGQ--PFHGVQQPLGQTGMPPSFSKPNIEGAPGAPIGNTFQH 1090
Cdd:pfam03154  481 PPTSTSSAMPGIQPPSS---ASVSSSGPVPAAVSCPlpPVQIKEEALDEAEEPESPPPPPRSPSPEPTVVNTPSH 552
PHA03247 PHA03247
large tegument protein UL36; Provisional
712-1108 2.09e-10

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 65.73  E-value: 2.09e-10
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  712 SQYANLLAAQGSIAAALAFLPDNTNQPNIMQLRDRlCRAQGEPVAGHESPKIPYekqqlPKGRPGPVAGHHQMPRVQTQQ 791
Cdd:PHA03247  2632 SPAANEPDPHPPPTVPPPERPRDDPAPGRVSRPRR-ARRLGRAAQASSPPQRPR-----RRAARPTVGSLTSLADPPPPP 2705
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  792 YYPhvriAPTVTTWSNKTPTAL-PSHPPAASPSDTQGENPPPP--GFIMHGNVNPNAAGQLPTSPGH-MHTQVPPYPQPQ 867
Cdd:PHA03247  2706 PTP----EPAPHALVSATPLPPgPAAARQASPALPAAPAPPAVpaGPATPGGPARPARPPTTAGPPApAPPAAPAAGPPR 2781
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  868 PYQPAQPYPFGTGGSAMYRPQQPVAPPTSNAYPNTPYISSASSYTGQSQLYAAQhqaSSPTSSPATSFPPPPSSGASFQH 947
Cdd:PHA03247  2782 RLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQ---PTAPPPPPGPPPPSLPLGGSVAP 2858
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  948 GGPGA--PPSSSAYALPpgTTGTLPAASELPAsqrtgpqngwndpPALNRVPKKKKMPenfmPPVPITSPIMNPLGDPQS 1025
Cdd:PHA03247  2859 GGDVRrrPPSRSPAAKP--AAPARPPVRRLAR-------------PAVSRSTESFALP----PDQPERPPQPQAPPPPQP 2919
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596 1026 QMLQQQPSAPVPLSSQSSFPQPHLPGGQPFHGVQQPLGQTGMPpsfskPNIEGAPGAPIGNTFQHVQSLPTKKITKKPIP 1105
Cdd:PHA03247  2920 QPQPPPPPQPQPPPPPPPRPQPPLAPTTDPAGAGEPSGAVPQP-----WLGALVPGRVAVPRFRVPQPAPSREAPASSTP 2994

                   ...
gi 2181405596 1106 DEH 1108
Cdd:PHA03247  2995 PLT 2997
ACE1-Sec16-like cd09233
Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; COPII coat ...
534-657 4.72e-09

Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; the COPII coat complex plays an important role in vesicular traffic of newly synthesized proteins from the endoplasmic reticulum (ER) to the Golgi apparatus by mediating the formation of transport vesicles. COPII consists of an outer coat, made up of the scaffold proteins Sec31 and Sec13, and the cargo adaptor complex, Sec23 and Sec24, which are recruited by the small GTPase Sar1. Sec16 is involved in the early steps of the assembly process. Sec16 forms elongated heterotetramers with Sec13, Sec13-(Sec16)2-Sec13. It interacts with Sec13 by insertion of a single beta-blade to close the six-bladed beta propeller of Sec13. Sec13 interacts in the same way with Sec31 and Nup145C, a nuclear pore protein; all of these contain a structurally related ancestral coatomer element 1 (ACE1). Sec16 is believed to be a key component in maintaining the integrity of the ER exit site.


Pssm-ID: 187750 [Multi-domain]  Cd Length: 314  Bit Score: 59.19  E-value: 4.72e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  534 ITQALLTGNFESAVDLCLhDNRM-ADAIILAIAGGQELLARTQKKyFAKSQSKIT---RLITAVVMKNWKEIVESC---- 605
Cdd:cd09233     69 FRNLLLTGNRKEALELAL-DNGLwAHALLLASSLGKETWAEVVSR-FARSESKLNdplQTLYQLFSGNSPEAITELadnp 146
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|..
gi 2181405596  606 -----DLKNWREALAAVLTYAKPD-EFSALCDLlgtrleneGDSLLQTQ----ACLCYICAG 657
Cdd:cd09233    147 aeaewALGNWREHLAIILSNRTSNlDLEALVEL--------GDLLAQRGlveaAHICYLLAG 200
Sec16_C pfam12931
Sec23-binding domain of Sec16; Sec16 is a multi-domain vesicle coat protein. The C-terminal ...
534-728 5.57e-07

Sec23-binding domain of Sec16; Sec16 is a multi-domain vesicle coat protein. The C-terminal region is the part that binds to Sec23, a COPII vesicle coat protein. This association is part of the transport vesicle coat structure.


Pssm-ID: 432884  Cd Length: 279  Bit Score: 52.56  E-value: 5.57e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  534 ITQALLTGNFESAVDLCLhDNRM-ADAIILAIAGGQELLARTQKKY----FAKSQSKITRLItAVVMK----NWKEIVE- 603
Cdd:pfam12931    1 IRALLLTGDREKALWLAL-DKKLwAHALLIASTLGKEKWKEVVQEFvrseFKGSNNKSGESL-AALYQvfagNSEEAVDe 78
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  604 --------SCDLKNWREALAAVLTYAKPDEFSALCDlLGTRLENEGdslLQTQACLCYICAG---NVEKLVACWTKAQDG 672
Cdd:pfam12931   79 lvppsknaLWALDNWRETLALVLSNRSPGDVEALLA-LGDLLAQYG---RTEAAHICFLLAGlplSQTVLLGADHVRFPS 154
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  673 SHPLSLQDLI--EkvvILRKAVQLTqAMDTSTVGV--LLAAKMsQYANLLAAQGSIAAAL 728
Cdd:pfam12931  155 TFGNDLESILltE---IYEYALSLS-PPQPPFVGLphLLPYKL-QHAAVLAEYGLVSEAQ 209
PLN00181 PLN00181
protein SPA1-RELATED; Provisional
203-333 7.54e-04

protein SPA1-RELATED; Provisional


Pssm-ID: 177776 [Multi-domain]  Cd Length: 793  Bit Score: 43.92  E-value: 7.54e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  203 PIIKVSdhsNRMHCSGLAWHPDVATQMVLASEDDrlpVIQMWDLrfASSPLRV-LENHARGILAIAWSMADPELLLSCGK 281
Cdd:PLN00181   525 PVVELA---SRSKLSGICWNSYIKSQVASSNFEG---VVQVWDV--ARSQLVTeMKEHEKRVWSIDYSSADPTLLASGSD 596
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|..
gi 2181405596  282 DAKILCSNPNTGEVLYELPTNTQWCFdIQWCPRNPAVLSAASFDGRISVYSI 333
Cdd:PLN00181   597 DGSVKLWSINQGVSIGTIKTKANICC-VQFPSESGRSLAFGSADHKVYYYDL 647
 
Name Accession Description Interval E-value
WD40 COG2319
WD40 repeat [General function prediction only];
89-333 1.59e-21

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 98.44  E-value: 1.59e-21
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596   89 LIAGGENGNIILYDpskiIAGDKEVVIAQNdkHTGPVRALDvniFQTN--LVASGANESEIYIWDLNNFATPMTPGAKTQ 166
Cdd:COG2319    135 LASGSADGTVRLWD----LATGKLLRTLTG--HSGAVTSVA---FSPDgkLLASGSDDGTVRLWDLATGKLLRTLTGHTG 205
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  167 PpedISCIAWNRQvQHILASASPSGRATVWDLRKNEPIIKVSDHSNRMHCsgLAWHPDvATQMVLASEDDRlpvIQMWDL 246
Cdd:COG2319    206 A---VRSVAFSPD-GKLLASGSADGTVRLWDLATGKLLRTLTGHSGSVRS--VAFSPD-GRLLASGSADGT---VRLWDL 275
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  247 RfASSPLRVLENHARGILAIAWSmADPELLLSCGKDAKILCSNPNTGEVLYELPTNTQWCFDIQWCPRNPAVLSaASFDG 326
Cdd:COG2319    276 A-TGELLRTLTGHSGGVNSVAFS-PDGKLLASGSDDGTVRLWDLATGKLLRTLTGHTGAVRSVAFSPDGKTLAS-GSDDG 352

                   ....*..
gi 2181405596  327 RISVYSI 333
Cdd:COG2319    353 TVRLWDL 359
WD40 COG2319
WD40 repeat [General function prediction only];
121-336 2.08e-19

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 92.28  E-value: 2.08e-19
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  121 HTGPVRALDVNiFQTNLVASGANESEIYIWDLnnfATPMTPGAKTQPPEDISCIAWNRQvQHILASASPSGRATVWDLRK 200
Cdd:COG2319     77 HTAAVLSVAFS-PDGRLLASASADGTVRLWDL---ATGLLLRTLTGHTGAVRSVAFSPD-GKTLASGSADGTVRLWDLAT 151
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  201 NEPIIKVSDHSNRMHCsgLAWHPDvATQMVLASEDDRlpvIQMWDLRfASSPLRVLENHARGILAIAWSmADPELLLSCG 280
Cdd:COG2319    152 GKLLRTLTGHSGAVTS--VAFSPD-GKLLASGSDDGT---VRLWDLA-TGKLLRTLTGHTGAVRSVAFS-PDGKLLASGS 223
                          170       180       190       200       210
                   ....*....|....*....|....*....|....*....|....*....|....*.
gi 2181405596  281 KDAKILCSNPNTGEVLYELPTNTQWCFDIQWCPrNPAVLSAASFDGRISVYSIMGG 336
Cdd:COG2319    224 ADGTVRLWDLATGKLLRTLTGHSGSVRSVAFSP-DGRLLASGSADGTVRLWDLATG 278
WD40 cd00200
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
171-337 1.72e-16

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel β-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.


Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 81.23  E-value: 1.72e-16
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  171 ISCIAWNRQvQHILASASPSGRATVWDLRKNEPIIKVSDHSNRMhcSGLAWHPDvATQMVLASEDDrlpVIQMWDLRfAS 250
Cdd:cd00200     12 VTCVAFSPD-GKLLATGSGDGTIKVWDLETGELLRTLKGHTGPV--RDVAASAD-GTYLASGSSDK---TIRLWDLE-TG 83
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  251 SPLRVLENHARGILAIAWSmADPELLLSCGKDAKILCSNPNTGEVLYELPTNTQWCFDIQWCPrNPAVLSAASFDGRISV 330
Cdd:cd00200     84 ECVRTLTGHTSYVSSVAFS-PDGRILSSSSRDKTIKVWDVETGKCLTTLRGHTDWVNSVAFSP-DGTFVASSSQDGTIKL 161

                   ....*..
gi 2181405596  331 YSIMGGS 337
Cdd:cd00200    162 WDLRTGK 168
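
Each hit on this page carries a one-line statistics summary ("Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ..."). The following is a minimal Python sketch for pulling those fields out of such a line. The regular expression and the function name `parse_stat_line` are illustrative assumptions based only on the layout visible on this page; they are not part of any NCBI tool or API.

```python
import re

# Assumed layout of the per-hit statistics line, inferred from this page only.
# The "[Multi-domain]" tag is optional (it is absent for some hits).
STAT_LINE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<tag>[^\]]+)\])?"
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_stat_line(line):
    """Return the statistics of one hit as a dict, or None if the line does not match."""
    match = STAT_LINE.search(line)
    if match is None:
        return None
    return {
        "pssm_id": int(match.group("pssm_id")),
        "multi_domain": match.group("tag") == "Multi-domain",
        "cd_length": int(match.group("cd_length")),
        "bit_score": float(match.group("bit_score")),
        "e_value": float(match.group("e_value")),
    }


# Statistics line of the cd00200 hit covering residues 171-337.
hit = parse_stat_line(
    "Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 81.23  E-value: 1.72e-16"
)
print(hit)
```

Returning `None` for non-matching lines lets the function be applied to every line of a saved page, keeping only the lines that are statistics summaries.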
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
737-1090 1.16e-10

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 66.33  E-value: 1.16e-10
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  737 QPNIMQLRDRLCRAQGEPVAGHESPKIPYEKQQLPKGRPGPVAGHHQMPRVQTQQYYPHVRIAPTVTTWSNKTPT---AL 813
Cdd:pfam03154  170 QPPVLQAQSGAASPPSPPPPGTTQAATAGPTPSAPSVPPQGSPATSQPPNQTQSTAAPHTLIQQTPTLHPQRLPSphpPL 249
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  814 PSHPPAASPSDTQGENPPPPgfIMHGNVNPnAAGQLPTSPGHMHTQV-----PPYPQPQPYQPAQPYPFGTGGSAMYRPQ 888
Cdd:pfam03154  250 QPMTQPPPPSQVSPQPLPQP--SLHGQMPP-MPHSLQTGPSHMQHPVppqpfPLTPQSSQSQVPPGPSPAAPGQSQQRIH 326
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  889 QPVA--------PPTSNAYP----NTPYISSASSyTGQSQLYAAQ------HQASSPTSSPATSFPPPPS----SGASFQ 946
Cdd:pfam03154  327 TPPSqsqlqsqqPPREQPLPpaplSMPHIKPPPT-TPIPQLPNPQshkhppHLSGPSPFQMNSNLPPPPAlkplSSLSTH 405
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  947 HGGPGAPP------SSSAYALPPGTTGTLPAASELPASQRTGPqngwnDPPALNRVPKKKKMPEN-FMP--PVPITSPIM 1017
Cdd:pfam03154  406 HPPSAHPPplqlmpQSQQLPPPPAQPPVLTQSQSLPPPAASHP-----PTSGLHQVPSQSPFPQHpFVPggPPPITPPSG 480
                          330       340       350       360       370       380       390
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 2181405596 1018 NPLGDPQSQMLQQQPSApvpLSSQSSFPQPHLPGGQ--PFHGVQQPLGQTGMPPSFSKPNIEGAPGAPIGNTFQH 1090
Cdd:pfam03154  481 PPTSTSSAMPGIQPPSS---ASVSSSGPVPAAVSCPlpPVQIKEEALDEAEEPESPPPPPRSPSPEPTVVNTPSH 552
PHA03247 PHA03247
large tegument protein UL36; Provisional
712-1108 2.09e-10

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 65.73  E-value: 2.09e-10
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  712 SQYANLLAAQGSIAAALAFLPDNTNQPNIMQLRDRlCRAQGEPVAGHESPKIPYekqqlPKGRPGPVAGHHQMPRVQTQQ 791
Cdd:PHA03247  2632 SPAANEPDPHPPPTVPPPERPRDDPAPGRVSRPRR-ARRLGRAAQASSPPQRPR-----RRAARPTVGSLTSLADPPPPP 2705
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  792 YYPhvriAPTVTTWSNKTPTAL-PSHPPAASPSDTQGENPPPP--GFIMHGNVNPNAAGQLPTSPGH-MHTQVPPYPQPQ 867
Cdd:PHA03247  2706 PTP----EPAPHALVSATPLPPgPAAARQASPALPAAPAPPAVpaGPATPGGPARPARPPTTAGPPApAPPAAPAAGPPR 2781
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  868 PYQPAQPYPFGTGGSAMYRPQQPVAPPTSNAYPNTPYISSASSYTGQSQLYAAQhqaSSPTSSPATSFPPPPSSGASFQH 947
Cdd:PHA03247  2782 RLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQ---PTAPPPPPGPPPPSLPLGGSVAP 2858
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  948 GGPGA--PPSSSAYALPpgTTGTLPAASELPAsqrtgpqngwndpPALNRVPKKKKMPenfmPPVPITSPIMNPLGDPQS 1025
Cdd:PHA03247  2859 GGDVRrrPPSRSPAAKP--AAPARPPVRRLAR-------------PAVSRSTESFALP----PDQPERPPQPQAPPPPQP 2919
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596 1026 QMLQQQPSAPVPLSSQSSFPQPHLPGGQPFHGVQQPLGQTGMPpsfskPNIEGAPGAPIGNTFQHVQSLPTKKITKKPIP 1105
Cdd:PHA03247  2920 QPQPPPPPQPQPPPPPPPRPQPPLAPTTDPAGAGEPSGAVPQP-----WLGALVPGRVAVPRFRVPQPAPSREAPASSTP 2994

                   ...
gi 2181405596 1106 DEH 1108
Cdd:PHA03247  2995 PLT 2997
WD40 COG2319
WD40 repeat [General function prediction only];
89-247 3.44e-09

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 60.31  E-value: 3.44e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596   89 LIAGGENGNIILYDpskiIAGDKEVVIAQNdkHTGPVRALDVNiFQTNLVASGANESEIYIWDLNNFATPMTPGAKTqpp 168
Cdd:COG2319    261 LASGSADGTVRLWD----LATGELLRTLTG--HSGGVNSVAFS-PDGKLLASGSDDGTVRLWDLATGKLLRTLTGHT--- 330
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  169 EDISCIAWNRQVQhILASASPSGRATVWDLRKNEPIIKVSDHSNRMHcsGLAWHPD---VATqmvlASEDDRlpvIQMWD 245
Cdd:COG2319    331 GAVRSVAFSPDGK-TLASGSDDGTVRLWDLATGELLRTLTGHTGAVT--SVAFSPDgrtLAS----GSADGT---VRLWD 400

                   ..
gi 2181405596  246 LR 247
Cdd:COG2319    401 LA 402
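
The paired rows in each alignment block (query "gi ..." above, domain model "Cdd:..." below) can be compared column by column. The sketch below counts identical residues for one such pair of rows. The helper names `parse_row` and `count_identities` are hypothetical, and skipping lowercase letters as unaligned residues is an assumption about the display convention used on this page.

```python
def parse_row(line):
    """Split one alignment row into (label, start, residues, end).

    The last three whitespace-separated tokens are taken as start coordinate,
    residue string, and end coordinate; everything before them is the label.
    """
    parts = line.split()
    end = int(parts[-1])
    residues = parts[-2]
    start = int(parts[-3])
    label = " ".join(parts[:-3])
    return label, start, residues, end


def count_identities(query_row, subject_row):
    """Return (identical, aligned) column counts for a pair of alignment rows."""
    _, _, query, _ = parse_row(query_row)
    _, _, subject, _ = parse_row(subject_row)
    if len(query) != len(subject):
        raise ValueError("rows of one alignment block must have equal length")
    identical = aligned = 0
    for q, s in zip(query, subject):
        # Skip gap columns and (assumed) unaligned lowercase residues.
        if q == "-" or s == "-" or q.islower() or s.islower():
            continue
        aligned += 1
        if q == s:
            identical += 1
    return identical, aligned


# Final block of the COG2319 hit covering residues 89-247.
print(count_identities("gi 2181405596  246 DLR 247", "Cdd:COG2319    401 DLA 402"))
# prints (2, 3)
```

Summing the two counts over all blocks of a hit would give an overall identity fraction for that hit, under the same assumptions.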
ACE1-Sec16-like cd09233
Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; COPII coat ...
534-657 4.72e-09

Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; COPII coat complex plays an important role in vesicular traffic of newly synthesized proteins from the endoplasmic reticulum (ER) to the Golgi apparatus by mediating the formation of transport vesicles. COPII consists of an outer coat, made up of the scaffold proteins Sec31 and Sec13, and the cargo adaptor complex, Sec23 and Sec24, which are recruited by the small GTPase Sar1. Sec16 is involved in the early steps of the assembly process. Sec16 forms elongated heterotetramers with Sec13, Sec13-(Sec16)2-Sec13. It interacts with Sec13 by insertion of a single beta-blade to close the six-bladed beta propeller of Sec13. In the same way Sec13 interacts with Sec31 and Nup145C, a nuclear pore protein; all of these contain a structurally related ancestral coatomer element 1 (ACE1). Sec16 is believed to be a key component in maintaining the integrity of the ER exit site.


Pssm-ID: 187750 [Multi-domain]  Cd Length: 314  Bit Score: 59.19  E-value: 4.72e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  534 ITQALLTGNFESAVDLCLhDNRM-ADAIILAIAGGQELLARTQKKyFAKSQSKIT---RLITAVVMKNWKEIVESC---- 605
Cdd:cd09233     69 FRNLLLTGNRKEALELAL-DNGLwAHALLLASSLGKETWAEVVSR-FARSESKLNdplQTLYQLFSGNSPEAITELadnp 146
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|..
gi 2181405596  606 -----DLKNWREALAAVLTYAKPD-EFSALCDLlgtrleneGDSLLQTQ----ACLCYICAG 657
Cdd:cd09233    147 aeaewALGNWREHLAIILSNRTSNlDLEALVEL--------GDLLAQRGlveaAHICYLLAG 200
Sec16_C pfam12931
Sec23-binding domain of Sec16; Sec16 is a multi-domain vesicle coat protein. The C-terminal ...
534-728 5.57e-07

Sec23-binding domain of Sec16; Sec16 is a multi-domain vesicle coat protein. The C-terminal region is the part that binds to Sec23, a COPII vesicle coat protein. This association is part of the transport vesicle coat structure.


Pssm-ID: 432884  Cd Length: 279  Bit Score: 52.56  E-value: 5.57e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  534 ITQALLTGNFESAVDLCLhDNRM-ADAIILAIAGGQELLARTQKKY----FAKSQSKITRLItAVVMK----NWKEIVE- 603
Cdd:pfam12931    1 IRALLLTGDREKALWLAL-DKKLwAHALLIASTLGKEKWKEVVQEFvrseFKGSNNKSGESL-AALYQvfagNSEEAVDe 78
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  604 --------SCDLKNWREALAAVLTYAKPDEFSALCDlLGTRLENEGdslLQTQACLCYICAG---NVEKLVACWTKAQDG 672
Cdd:pfam12931   79 lvppsknaLWALDNWRETLALVLSNRSPGDVEALLA-LGDLLAQYG---RTEAAHICFLLAGlplSQTVLLGADHVRFPS 154
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  673 SHPLSLQDLI--EkvvILRKAVQLTqAMDTSTVGV--LLAAKMsQYANLLAAQGSIAAAL 728
Cdd:pfam12931  155 TFGNDLESILltE---IYEYALSLS-PPQPPFVGLphLLPYKL-QHAAVLAEYGLVSEAQ 209
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
895-1108 5.74e-07

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 54.00  E-value: 5.74e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  895 TSNAYPNTPYISSASSYTGQSQlyaaQHQASSPTSSPATSFPPPPSSGASFQHGGPGAPPSSSAYALPPGTTgtlPAASE 974
Cdd:pfam03154  144 TSPSIPSPQDNESDSDSSAQQQ----ILQTQPPVLQAQSGAASPPSPPPPGTTQAATAGPTPSAPSVPPQGS---PATSQ 216
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  975 LPASQRT--GPQNGWNDPPALN--RVPK--------KKKMPENFMPPVPITSPIMNPLGDPQSQMLQQQPS--------A 1034
Cdd:pfam03154  217 PPNQTQStaAPHTLIQQTPTLHpqRLPSphpplqpmTQPPPPSQVSPQPLPQPSLHGQMPPMPHSLQTGPShmqhpvppQ 296
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596 1035 PVPLSSQSS------FPQPHLPgGQPFHGVQQPLGQTgMPPSFSKPNIEGAPGAPIgnTFQHVQSLPTKKITKKPIPDEH 1108
Cdd:pfam03154  297 PFPLTPQSSqsqvppGPSPAAP-GQSQQRIHTPPSQS-QLQSQQPPREQPLPPAPL--SMPHIKPPPTTPIPQLPNPQSH 372
PHA03247 PHA03247
large tegument protein UL36; Provisional
764-1105 2.99e-06

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 51.86  E-value: 2.99e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  764 PYEKQQLPKGRPGP------VAGHHQMPRVQTQQYYPHVRIAPtvttwsnKTPTALPShPPAASPSDTQGENPPPPGFIM 837
Cdd:PHA03247  2562 AAPDRSVPPPRPAPrpsepaVTSRARRPDAPPQSARPRAPVDD-------RGDPRGPA-PPSPLPPDTHAPDPPPPSPSP 2633
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  838 HGNvNPNAAGQLPTSPGHMHTQVPPYPQPQPYQPAQPYPFGTGGSA-MYRPQQPVAPPTsnaypntpyissASSYTGQSQ 916
Cdd:PHA03247  2634 AAN-EPDPHPPPTVPPPERPRDDPAPGRVSRPRRARRLGRAAQASSpPQRPRRRAARPT------------VGSLTSLAD 2700
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  917 LYAAQHQASSPTSSPATSFPPPPSSGAsfqhGGPGAPPSSSAYALPPgttgtLPAASELPASQR---------------- 980
Cdd:PHA03247  2701 PPPPPPTPEPAPHALVSATPLPPGPAA----ARQASPALPAAPAPPA-----VPAGPATPGGPArparppttagppapap 2771
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  981 -----TGPQNGWNDPPALNRVPKKKKMPENFMP-----PVPITSPIMNPLGDPQSQMLQQQPSAPVPLSSQSSFPQPHLP 1050
Cdd:PHA03247  2772 paapaAGPPRRLTRPAVASLSESRESLPSPWDPadppaAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPPPSLP 2851
                          330       340       350       360       370
                   ....*....|....*....|....*....|....*....|....*....|....*.
gi 2181405596 1051 GGqpfhGVQQPLGQTG-MPPSFSKPNIEGAPGAPigntfqHVQSLPTKKITKKPIP 1105
Cdd:PHA03247  2852 LG----GSVAPGGDVRrRPPSRSPAAKPAAPARP------PVRRLARPAVSRSTES 2897
WD40 cd00200
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
252-336 3.70e-06

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel β-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.


Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 50.03  E-value: 3.70e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  252 PLRVLENHARGILAIAWSmADPELLLSCGKDAKILCSNPNTGEVLYELPTNTQWCFDIQWCPRNPAVLSaASFDGRISVY 331
Cdd:cd00200      1 LRRTLKGHTGGVTCVAFS-PDGKLLATGSGDGTIKVWDLETGELLRTLKGHTGPVRDVAASADGTYLAS-GSSDKTIRLW 78

                   ....*
gi 2181405596  332 SIMGG 336
Cdd:cd00200     79 DLETG 83
PRK10263 PRK10263
DNA translocase FtsK; Provisional
942-1177 3.45e-04

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 45.08  E-value: 3.45e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  942 GASFQHGGPGAPPSSSAYAlppgttgTLPAASELPASQR---TGPQNGWNDPPALNRV---PKKKKMPENFMPPV--PIT 1013
Cdd:PRK10263   677 GEQYQHDVPVNAEDADAAA-------EAELARQFAQTQQqrySGEQPAGANPFSLDDFefsPMKALLDDGPHEPLftPIV 749
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596 1014 SPIMNPLGDPQSQMLQQQPSAPVPLSSQSSFPQPHLPGGQPFHGVQQPLGQtgmPPSFSKPNIEGAPGAPIGNTFQHVQS 1093
Cdd:PRK10263   750 EPVQQPQQPVAPQQQYQQPQQPVAPQPQYQQPQQPVAPQPQYQQPQQPVAP---QPQYQQPQQPVAPQPQYQQPQQPVAP 826
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596 1094 LPTKKITKKPI---PDEHLILKTTFEDLIQRCLSSATDPQTKrklddaskrLEFLYDKLREqtLSPTITSGLHNIARSIE 1170
Cdd:PRK10263   827 QPQYQQPQQPVapqPQDTLLHPLLMRNGDSRPLHKPTTPLPS---------LDLLTPPPSE--VEPVDTFALEQMARLVE 895
                          250
                   ....*....|....*...
gi 2181405596 1171 TR-----------NYSEG 1177
Cdd:PRK10263   896 ARladfrikadvvNYSPG 913
PRK14959 PRK14959
DNA polymerase III subunits gamma and tau; Provisional
939-1052 5.84e-04

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 184923 [Multi-domain]  Cd Length: 624  Bit Score: 43.90  E-value: 5.84e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  939 PSSGASFQHGGPGAP-PSSSAYAL--PPGTTGTLPAAselPASQRTgPQNGWNDPPALNRVPkKKKMPENFMPPVPITSP 1015
Cdd:PRK14959   373 PSGGGASAPSGSAAEgPASGGAATipTPGTQGPQGTA---PAAGMT-PSSAAPATPAPSAAP-SPRVPWDDAPPAPPRSG 447
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|....*.
gi 2181405596 1016 IMnPLGDPQSQMLQQQPSAPVPLSSQS---------SFPQPHLPGG 1052
Cdd:PRK14959   448 IP-PRPAPRMPEASPVPGAPDSVASASdapptlgdpSDTAEHTPSG 492
PLN00181 PLN00181
protein SPA1-RELATED; Provisional
203-333 7.54e-04

protein SPA1-RELATED; Provisional


Pssm-ID: 177776 [Multi-domain]  Cd Length: 793  Bit Score: 43.92  E-value: 7.54e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  203 PIIKVSdhsNRMHCSGLAWHPDVATQMVLASEDDrlpVIQMWDLrfASSPLRV-LENHARGILAIAWSMADPELLLSCGK 281
Cdd:PLN00181   525 PVVELA---SRSKLSGICWNSYIKSQVASSNFEG---VVQVWDV--ARSQLVTeMKEHEKRVWSIDYSSADPTLLASGSD 596
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|..
gi 2181405596  282 DAKILCSNPNTGEVLYELPTNTQWCFdIQWCPRNPAVLSAASFDGRISVYSI 333
Cdd:PLN00181   597 DGSVKLWSINQGVSIGTIKTKANICC-VQFPSESGRSLAFGSADHKVYYYDL 647
Med15 pfam09606
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ...
752-1147 1.52e-03

ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators use the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.


Pssm-ID: 312941 [Multi-domain]  Cd Length: 732  Bit Score: 42.69  E-value: 1.52e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  752 GEPVAGHESPKIPYEKQQLPKGRP----GPVAGHHQMPRVQTQQyyPHVRIAPTVTTWSNKTPTALPSHPPAASPSDTQG 827
Cdd:pfam09606  107 GGPMGQQMGGPGTASNLLASLGRPqmpmGGAGFPSQMSRVGRMQ--PGGQAGGMMQPSSGQPGSGTPNQMGPNGGPGQGQ 184
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  828 ENPPPPGFIMH-GNVNPNAAGQlPTSPGHMHTQVPPYPQPQPYQPAQPYPFGTGGSAMYRPQQPVAPptsnaypnTPYIS 906
Cdd:pfam09606  185 AGGMNGGQQGPmGGQMPPQMGV-PGMPGPADAGAQMGQQAQANGGMNPQQMGGAPNQVAMQQQQPQQ--------QGQQS 255
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  907 SASSYTGQSQLYAAQHQASSPTSSPATSFPPPPSS-GASFQHGGPGAPPSSSAYALPP---GTTGTLPAASELPASQ--- 979
Cdd:pfam09606  256 QLGMGINQMQQMPQGVGGGAGQGGPGQPMGPPGQQpGAMPNVMSIGDQNNYQQQQTRQqqqQQGGNHPAAHQQQMNQsvg 335
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  980 ---RTGPQNGWNDPPALNRV--------PKKKKMPENFMPPVPITSPIMNPLGDPQSQMLQQQPSAPVPLSSQSSFPQPH 1048
Cdd:pfam09606  336 qggQVVALGGLNHLETWNPGnfgglganPMQRGQPGMMSSPSPVPGQQVRQVTPNQFMRQSPQPSVPSPQGPGSQPPQSH 415
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596 1049 LPGGQPF-HGVQQPLGQTGMPPSFSKPNIEGAPGAPIGNTFQHVQSLPTKKITKKPIPDEHLILKTTFEDLIQRCLSSAT 1127
Cdd:pfam09606  416 PGGMIPSpALIPSPSPQMSQQPAQQRTIGQDSPGGSLNTPGQSAVNSPLNPQEEQLYREKYRQLTKYIEPLKRMIAKMEN 495
                          410       420
                   ....*....|....*....|
gi 2181405596 1128 DPQTKRKLDDASKRLEFLYD 1147
Cdd:pfam09606  496 DPGDIDKMNKMKRLLEILSN 515
PHA02682 PHA02682
ORF080 virion core protein; Provisional
887-1128 2.16e-03

ORF080 virion core protein; Provisional


Pssm-ID: 177464 [Multi-domain]  Cd Length: 280  Bit Score: 41.39  E-value: 2.16e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  887 PQQPvAPPTSNAYPNTPYISSASSYTGQSQLyAAQHQASSPTSSPATSFPPPPSSGASFQHGGPgAPPS--------SSA 958
Cdd:PHA02682    37 PAAP-CPPDADVDPLDKYSVKEAGRYYQSRL-KANSACMQRPSGQSPLAPSPACAAPAPACPAC-APAApapavtcpAPA 113
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  959 YALPPGTTGTLPAASELPASQRTGPQNgwndPPALNRVPKKkkmpenfmPPVPITSPImnplgdpqsqmlqqqPSAPvPL 1038
Cdd:PHA02682   114 PACPPATAPTCPPPAVCPAPARPAPAC----PPSTRQCPPA--------PPLPTPKPA---------------PAAK-PI 165
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596 1039 SSQSSFPQPHLPGGqpfhgvqqplgqtgmppsfSKPNIEGAPGApigntfqhvqslptKKITKKPIPDEHLILKTTFEDL 1118
Cdd:PHA02682   166 FLHNQLPPPDYPAA-------------------SCPTIETAPAA--------------SPVLEPRIPDKIIDADNDDKDL 212
                          250
                   ....*....|
gi 2181405596 1119 IQRCLSSATD 1128
Cdd:PHA02682   213 IKKELADIAD 222
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
886-1071 4.27e-03

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 41.40  E-value: 4.27e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  886 RPQQPVAPPTSNAYPNTPYISSASSYTGQSQLYAAQHQASSPTSSPATSFPPPPSSGASFQHGGPGAPPSSSAYALPPGT 965
Cdd:PRK12323   401 APPAAPAAAPAAAAAARAVAAAPARRSPAPEALAAARQASARGPGGAPAPAPAPAAAPAAAARPAAAGPRPVAAAAAAAP 480
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  966 TGTLPAASELPASQRTGPqngWNDPPALNRVPKKKKMPENFMPPV--PITSPIMNPLGDPQSQMLQQQPSAPVPlsSQSS 1043
Cdd:PRK12323   481 ARAAPAAAPAPADDDPPP---WEELPPEFASPAPAQPDAAPAGWVaeSIPDPATADPDDAFETLAPAPAAAPAP--RAAA 555
                          170       180
                   ....*....|....*....|....*...
gi 2181405596 1044 FPQPHLPGGQPfhgvqqPLGQTGMPPSF 1071
Cdd:PRK12323   556 ATEPVVAPRPP------RASASGLPDMF 577
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
878-1083 4.69e-03

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 41.01  E-value: 4.69e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  878 GTGGSAMYRP--QQPVAPPTSNAYPNTPYISSASSYTGQSQLYAAQHQASSPTSSPATSFPPPPSSGASFQHGGPGAPPS 955
Cdd:PRK12323   367 QSGGGAGPATaaAAPVAQPAPAAAAPAAAAPAPAAPPAAPAAAPAAAAAARAVAAAPARRSPAPEALAAARQASARGPGG 446
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  956 SSAyalPPGTTGTLPAASELPASQRTGPqngwndPPALNRVPKKKKMPENFMPPVPITSPIMNPL-GDPQSQMLQQQPSA 1034
Cdd:PRK12323   447 APA---PAPAPAAAPAAAARPAAAGPRP------VAAAAAAAPARAAPAAAPAPADDDPPPWEELpPEFASPAPAQPDAA 517
                          170       180       190       200
                   ....*....|....*....|....*....|....*....|....*....
gi 2181405596 1035 PVPLSSQSSFPQPHLPGGQPFHGVQQPLGQTGMPPSFSKPNIEGAPGAP 1083
Cdd:PRK12323   518 PAGWVAESIPDPATADPDDAFETLAPAPAAAPAPRAAAATEPVVAPRPP 566
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
749-889 5.95e-03

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 40.79  E-value: 5.95e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  749 RAQGEPVAGHESPKIPYEKQQLPKGRPGPVAGHHQMPRVQTQQYYPHVRIAPTVTTWSNKTPTALPSHPPAASPS--DTQ 826
Cdd:pfam09770  210 PAQQPAPAPAQPPAAPPAQQAQQQQQFPPQIQQQQQPQQQPQQPQQHPGQGHPVTILQRPQSPQPDPAQPSIQPQaqQFH 289
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|...
gi 2181405596  827 GENPPPPGFIMHGNVNPNAAGQlPTSPGHMHTQVPPYPQPQPYQPAQPYPFGTGGSAMYRPQQ 889
Cdd:pfam09770  290 QQPPPVPVQPTQILQNPNRLSA-ARVGYPQNPQPGVQPAPAHQAHRQQGSFGRQAPIITHPQQ 351
SSDP pfam04503
Single-stranded DNA binding protein, SSDP; This is a family of eukaryotic single-stranded DNA ...
819-1083 8.36e-03

Single-stranded DNA binding protein, SSDP; This is a family of eukaryotic single-stranded DNA binding proteins with specificity to a pyrimidine-rich element found in the promoter region of the alpha2(I) collagen gene.


Pssm-ID: 461334 [Multi-domain]  Cd Length: 293  Bit Score: 39.55  E-value: 8.36e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  819 AASPSDTQGENPPPPGfiMHGNvnPNAAGQLPTSPGHMHTQVPPYPQPQPYQPAQPYpfGTGGSAMYRPQQPVAPPTSNA 898
Cdd:pfam04503   18 AAAPSPVMGQMPPGDG--MPGG--PMPPGFFQSPPSHPSSQPSPHAQPPPHNPATMM--GPHSQPFMGPRYPGGPRPSVR 91
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  899 YPNTPyiSSASSYTGQSQLYAAQHQASSPTSSPATSFPPPPSSGASFQHGGPGAPPSSSA--YALPPGTT---GTLPAAS 973
Cdd:pfam04503   92 MPQQG--NDFNGPPGQQPMMPNSMDPTRPGGHPNMGGPMQRMNPPRGPGMGPMGPQSYGPgmRGPPPNSTdgpGGMPPMN 169
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405596  974 ELPASQRTGPQngwndPPALNRVPKKKKMPENFMPP-----VPITSPIMNPLGDP--QSQMLQQQPSAPVPLSSQSSFPQ 1046
Cdd:pfam04503  170 MGPGGRRPWPQ-----PNASNPLPYSSSSPGSYGGPpggggPPGPTPIMPSPQDStnSGENMYTLMNPVGPGGNRANFPM 244
                          250       260       270       280
                   ....*....|....*....|....*....|....*....|....*
gi 2181405596 1047 ------PHLPGGQPFHGVQQPLGQTGMPPSFSKPN--IEGAPGAP 1083
Cdd:pfam04503  245 gpglegPMGPNGMEPHHSNGSLGSGDMDGMKNSPAnvLSNGPGTP 289
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd; Low complexity filter: no; Composition Based Adjustment: yes; E-value threshold: 0.01
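
The E-value threshold listed in the preset options acts as a cutoff on reported hits. As a rough illustration, the Python sketch below applies such a cutoff to a few hits copied from this page and sorts the survivors by E-value. The tuple layout, the function name `filter_hits`, and the choice of an inclusive comparison are assumptions made for this example, not a description of how CD-Search itself applies the threshold.

```python
# (accession, start, end, e_value) for a few hits copied from this page.
hits = [
    ("pfam04503", 819, 1083, 8.36e-03),
    ("cd00200", 171, 337, 1.72e-16),
    ("COG2319", 89, 247, 3.44e-09),
    ("pfam03154", 737, 1090, 1.16e-10),
]


def filter_hits(hits, max_e_value=0.01):
    """Keep hits whose E-value is at or below the cutoff, most significant first."""
    kept = [hit for hit in hits if hit[3] <= max_e_value]
    return sorted(kept, key=lambda hit: hit[3])


# With the page's threshold of 0.01 all four example hits are retained.
for accession, start, end, e_value in filter_hits(hits):
    print(f"{accession}\t{start}-{end}\t{e_value:.2e}")

# A stricter, purely illustrative cutoff drops the weakest hit.
strict = filter_hits(hits, max_e_value=1e-5)
print([hit[0] for hit in strict])
```

Lowering `max_e_value` is one simple way to focus on the more significant hits, for example the WD40 models, when reviewing a long result list like this one.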

References:

  • Wang J et al. (2023), "The conserved domain database in 2023", Nucleic Acids Res. 51(D)384-8.
  • Lu S et al. (2020), "The conserved domain database in 2020", Nucleic Acids Res. 48(D)265-8.
  • Marchler-Bauer A et al. (2017), "CDD/SPARCLE: functional classification of proteins via subfamily domain architectures", Nucleic Acids Res. 45(D)200-3.