| Name | Accession | Description | Interval | E-value |
| Ten_N super family | cl24184 | Teneurin Intracellular Region; This family is found in the intracellular N-terminal region of ... | 192-351 | 7.72e-79 |
|
Teneurin Intracellular Region; This family is found in the intracellular N-terminal region of the Teneurin family of proteins. These proteins are 'pair-rule' genes and are involved in tissue patterning, specifically probably neural patterning. The intracellular domain is cleaved in response to homophilic interaction of the extracellular domain, and translocates to the nucleus. Here it probably carries out some transcriptional regulatory activity. The length of this region and the conservation suggest that there may be two structural domains here (personal obs:C Yeats). The actual alignment was detected with superfamily member pfam06484:
Pssm-ID: 461932 [Multi-domain] Cd Length: 367 Bit Score: 266.46 E-value: 7.72e-79
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 192 DPGKLAAPGSAP--HGHGEGARRQEQPSNNPGQPTLQ----PLPPSHKQHpAQHHPSITSLNRNSLTNRRNQSPAPPAAL 265
Cdd:pfam06484 203 DPDEEFSPNSYLvrTGSGPQSAPSEQPPNFQNHSRLRtpppPLPPPHKQN-QHHHPSINSLNRSSLTNRRNPSPAPTASL 281
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 266 PAELQTTPESVQLQDSWVLGSNVPLESRHFLFKTGTGTTPLFSTATPGYTMASGSVYSPPTRPLPRNTLSRSAFKFKKSS 345
Cdd:pfam06484 282 PAELQSTQESVQLQDSWVLNSNVPLETRHFLFKTGTGTTPLFCTASPGYPLTSGTVYSPPPRPLPRNTFSRPAFKLKKPY 361
|
....*.
gi 1907188904 346 KYCSWR 351
Cdd:pfam06484 362 KYCSWK 367
|
|
| NHL super family | cl18310 | NHL repeat unit of beta-propeller proteins; The NHL(NCL-1, HT2A and LIN-41)-repeat is found in ... | 1204-1556 | 1.17e-40 |
|
NHL repeat unit of beta-propeller proteins; The NHL(NCL-1, HT2A and LIN-41)-repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures. The repeats have a catalytic activity in Peptidyl-glycine alpha-amidating monooxygenase; proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localized to the repeats. Tripartite motif-containing protein 32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats. The actual alignment was detected with superfamily member cd14953:
Pssm-ID: 302697 [Multi-domain] Cd Length: 323 Bit Score: 154.23 E-value: 1.17e-40
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1204 VSSIMGNGRRrsiscpscnGQADGNKLLA----PVALACGIDGSLYVGDF--NYVRRIFPSGNVTSVL----ELSSN--- 1270
Cdd:cd14953 1 VSTVAGSGTA---------GFSGGGGTAArfnsPSGVAVDAAGNLYVADRgnHRIRKITPDGVVTTVAgtgtAGFADggg 71
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1271 PAHRYY----LATDPvTGDLYVSDTNTRRIYRpksLTGAKDLTknaeVVAGTGEqclpfdeARCGDGGKAVEATLMSPKG 1346
Cdd:cd14953 72 AAAQFNtpsgVAVDA-AGNLYVADTGNHRIRK---ITPDGVVS----TLAGTGT-------AGFSDDGGATAAQFNYPTG 136
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1347 MAIDKNGLIYFVDGT--MIRKVDQNGIISTLLGSNDLTSAR--PLTcdtsmhisQVRLEWPTDLAINPMDNsIYVLD--N 1420
Cdd:cd14953 137 VAVDAAGNLYVADTGnhRIRKITPDGVVTTVAGTGGAGYAGdgPAT--------AAQFNNPTGVAVDAAGN-LYVADrgN 207
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1421 NVVLQITENRQVRIAAGRPmhcqvpGVEYPVGKHAVQTTLESATAIAVSYSGVLYITETDEkkiNRIRQVTTDGEISLVA 1500
Cdd:cd14953 208 HRIRKITPDGVVTTVAGTG------TAGFSGDGGATAAQLNNPTGVAVDAAGNLYVADSGN---HRIRKITPAGVVTTVA 278
|
330 340 350 360 370
....*....|....*....|....*....|....*....|....*....|....*..
gi 1907188904 1501 GIPSEcdckndancdcyQSGD-GYAKDAKLNAPSSLAASPDGTLYIADLGNIRIRAV 1556
Cdd:cd14953 279 GGGAG------------FSGDgGPATSAQFNNPTGVAVDAAGNLYVADTGNNRIRKI 323
|
|
| Tox-GHH | pfam15636 | GHH signature containing HNH/Endo VII superfamily nuclease toxin; A predicted toxin of the HNH ... | 2673-2750 | 7.89e-40 |
|
GHH signature containing HNH/Endo VII superfamily nuclease toxin; A predicted toxin of the HNH/Endonuclease VII fold present in bacterial polymorphic toxin systems with a characteristic sG[HQ]H signature motif. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 6, type 7 or TcdB/TcaC-type secretion system. The metazoan teneurin proteins possess an inactive version of this domain at their C-terminus.
Pssm-ID: 464783 Cd Length: 78 Bit Score: 142.75 E-value: 7.89e-40
10 20 30 40 50 60 70
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 1907188904 2673 EEKARILEQARQRALARAWAREQQRVRDGEEGARLWTEGEKRQLLSAGKVQGYDGYYVLSVEQYPELADSANNIQFLR 2750
Cdd:pfam15636 1 EERKRLLEHAKKRAVREAWHRERQLLRNGLPGSRDWTDEEKEELLSTGSVPGYDGEYIHPVEQYPELADDPSNIRFRK 78
|
|
| RhsA | COG3209 | Uncharacterized conserved protein RhsA, contains 28 RHS repeats [General function prediction ... | 1510-2450 | 1.05e-34 |
|
Uncharacterized conserved protein RhsA, contains 28 RHS repeats [General function prediction only];
Pssm-ID: 442442 [Multi-domain] Cd Length: 1103 Bit Score: 146.44 E-value: 1.05e-34
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1510 NDANCDCYQSGDGYAKDAKLNAPSSLAASPDGTLYIADLGNIRIRAVSKNKPLLNSMNFYEVASPTDQELYIFDINGTHQ 1589
Cdd:COG3209 109 AAATASAGRLVSTGAGAGGTVTAATGGTLGATAGSATTGSTDGGRGGVAVTGLAGGGASAYGLTLGGAAAGPATGVGTGA 188
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1590 YTVSLVTGDYLYNFSYSNDNDVTAVTDSNGNTLRIRRDPNRMPVRVVSPDNQVIWLTIGTNGCLKSMTAQGLELVLFTYH 1669
Cdd:COG3209 189 VTLATGLAGSALLALGSGAILGGLAGAYSGSATTATGTALGTPASVAATVTGSATGAAGAGAAVATAATTLGGTTGAGTG 268
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1670 GNSGLLATK---SDETGWTTFFDYDSEGRLTNVTFPTGVVTNLHGDMDKAITVDIESSSREEDVSITSNLSSIDSFYTMV 1746
Cdd:COG3209 269 ASGAGLDAStgtGGAGGSNAAATAGGLGGAGLGSGGAGGGGTAGGTTTAAGTTGTAAVSGAADAGTTTTTGTGTGGTTTT 348
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1747 QDQLRNSYQIGYDGSLRIFYASGLDSHYQTEPHVLAGTANPTVAKRNMTLPGENGQNLVEWRFRKEQAQGKVNVFGRKLR 1826
Cdd:COG3209 349 VGGGGSLTLGGYGAAGGLTTSVGAGGGGSTSGSTTTVGGGGTATGSGGGSSTTGVGAGTTTTSTTGGDGGPATAAGALTA 428
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1827 VNGRNLLSVDFDRTTKTEKIYDDHRKFLLRIAYDTSGHPTLWLPSSKLMAVNVTYSSTGQIASIQRGTTSEKVDYDSQGR 1906
Cdd:COG3209 429 GGTATGTGTGGGGTTAGTDATTTTGGAGASGTLTTTGGAATGATTGGGTEAGTGGGTLTSGSAGATTLGTDTTLDDTLGG 508
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1907 IVSRVFADGKTWSYTYLEKSMVLLLHSQRQYIFEYDMWDRLSAITMPSVARHTMQTIRSIGYYRNIYNPPESNASIITDY 1986
Cdd:COG3209 509 TTTTTAGARGLVVTTGTTLTLGTTTTATLSATDATGTGDTTTTGTVGTGTSTGTGGTGTVTTTGDGTGGASTTTGTTGGT 588
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1987 NEEGLLLQTAFLGTSRRVLFKYRRQTRLSEILYDSTRVSFTYDETAGVLKTVNLQSDGFicTIRYRQIGPLIDRQIFRFS 2066
Cdd:COG3209 589 ATTTTVTTTTTTSTAGTTTTTTSGYTRAGLTLTLGTGTASGLERATASTGSTTGGTTGT--GVTTTGTTTTRATGTTGTG 666
|
570 580 590 600 610 620 630 640
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 2067 EDGMVNARFDYSYDNSFRVTSMQGVINETPLPIDLYQFDDISGKVEQFGKFGVIYYDINQIISTAVMTYTKHFDAHGRIK 2146
Cdd:COG3209 667 TGVTAGLTTLATGGTTVGGGTGTTSTATTGATTGGTETGTTVTTLAGGTTTRLGTTTTGGGGGTTTDGTGTGGTTGTLTT 746
|
650 660 670 680 690 700 710 720
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 2147 EIQYEifRSLMYWITIQYDNMGRVTKREIKIGPFANTTKYAYEYDVDGQLQTVYLNEKIMWRYNYDLNGNLHLLNPSSSA 2226
Cdd:COG3209 747 TSTTT--TTTAGALTYTYDALGRLTSETTPGGVTQGTYTTRYTYDALGRLTSVTYPDGETVTYTYDALGRLTSVITVGSG 824
|
730 740 750 760 770 780 790 800
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 2227 RLTPL-----RYDLRDRITRLgdvqyrldEDGFLRQRGTEIFEYSSKGLLTRVYSKGSGWTviYRYDGLGRRVSSKTSLG 2301
Cdd:COG3209 825 GGTDLqdrtyTYDAAGNITSI--------TDALRAGTLTQTYTYDALGRLTSATDPGTTES--YTYDANGNLTSRTDGGT 894
|
810 820 830 840 850 860 870 880
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 2302 QHLQFFYADLtyPTRITHvynhSSSEITSLYYDLQGHlfameissgdefyiaSDNTGTPLAVFSSNGLMLKQIQYTAYGE 2381
Cdd:COG3209 895 TTYTYDALGR--LVSVTK----PDGTTTTYTYDALGH---------------TDHLGSVRALTDASGQVVWRYDYDPFGN 953
|
890 900 910 920 930 940
....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 1907188904 2382 IYFDSNVDFQLVIGFHGGLYDPLTKLIHFGERDYDILAGRWTTPDieiwkRIGKDPAPfNLYMFRNNNP 2450
Cdd:COG3209 954 LLAETSGAAANPLRFTGQEYDAETGLYYNGARYYDPALGRFLSPD-----PIGLAGGL-NLYAYVGNNP 1016
|
|
| acid_disulf_rpt | NF033662 | acidic double-disulfide repeat; The acidic double-disulfide repeat is an Asp-rich repeat with ... | 834-864 | 9.71e-08 |
|
acidic double-disulfide repeat; The acidic double-disulfide repeat is an Asp-rich repeat with four nearly invariant Cys residues in a repeat length of about 35 amino acids.
Pssm-ID: 411265 [Multi-domain] Cd Length: 32 Bit Score: 49.82 E-value: 9.71e-08
10 20 30
....*....|....*....|....*....|.
gi 1907188904 834 AMETLCTDSKDNEGDGLIDCMDPDCCLQSSC 864
Cdd:NF033662 2 ATDTTCSDGIDNDGDGLTDCADPDCAGNPVC 32
|
|
| DUF5885 super family | cl44670 | Family of unknown function (DUF5885); This is a family of uncharacterized proteins of unknown ... | 565-750 | 1.77e-07 |
|
Family of unknown function (DUF5885); This is a family of uncharacterized proteins of unknown function found in viruses. The actual alignment was detected with superfamily member pfam19232:
Pssm-ID: 437064 Cd Length: 265 Bit Score: 55.01 E-value: 1.77e-07
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 565 CHGNGECvsGTCH-CFPGFLGPDCSraACPVLCSGNGQ----------YSKGRC----LCFSGwkgTECDVPTTQCI-DP 628
Cdd:pfam19232 34 CTTDAQC--GTCMtCVAGACTPKAS--CCGGVTCGAGQtcdaktntcvYVKGYCsadhPCPSG---SACDTAKNACIaQP 106
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 629 QCG---GRGiCIMG-------------------SCACNSGYK-GENCE--------EADCLDP---------------GC 662
Cdd:pfam19232 107 PYGpdsGKG-CVRGfgawiweldpatnsgvwrcRCANGSLYNsAHECSpladqtlcAAENLDPnalvpassvpafaayGW 185
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 663 SNHGVCIH-------------GECHCNPGWGGSNCEILKTmcadqCSGHGTYLQESGSCTCDPNWTGpdcsneicsvdcg 729
Cdd:pfam19232 186 GNQPVLINkstagaavpsplaGVCPCKPGWAGGSCTEDRT-----CNGRGTWNETTGQCACNIDFSG------------- 247
|
250 260
....*....|....*....|....
gi 1907188904 730 shgvcmGGSCRCEEG---WTGPAC 750
Cdd:pfam19232 248 ------HNSCGDDNNctsWTGPRC 265
|
|
| DSL super family | cl19567 | Delta serrate ligand; This family has been redefined to correspond to the EGF-like domain ... | 741-784 | 2.04e-05 |
|
Delta serrate ligand; This family has been redefined to correspond to the EGF-like domain defined by structure. The actual alignment was detected with superfamily member pfam01414:
Pssm-ID: 473190 Cd Length: 46 Bit Score: 43.77 E-value: 2.04e-05
10 20 30 40
....*....|....*....|....*....|....*....|....*..
gi 1907188904 741 CEEGWTGPACNqRACHPRCAE--HGTC-KDGKCECSQGWNGEHCTIA 784
Cdd:pfam01414 1 CDENYYGSTCS-KFCRPRDDKfgHYTCdANGNKVCLPGWTGPYCDKP 46
|
|
| EGF_CA | cd00054 | Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ... | 802-831 | 4.64e-03 |
|
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Pssm-ID: 238011 Cd Length: 38 Bit Score: 36.85 E-value: 4.64e-03
10 20 30
....*....|....*....|....*....|
gi 1907188904 802 PGLCNSNGRCTLDQNGWHCVCQPGWRGAGC 831
Cdd:cd00054 8 GNPCQNGGTCVNTVGSYRCSCPPGYTGRNC 37
|
|
|
| Name | Accession | Description | Interval | E-value |
| Ten_N | pfam06484 | Teneurin Intracellular Region; This family is found in the intracellular N-terminal region of ... | 192-351 | 7.72e-79 |
|
Teneurin Intracellular Region; This family is found in the intracellular N-terminal region of the Teneurin family of proteins. These proteins are 'pair-rule' genes and are involved in tissue patterning, specifically probably neural patterning. The intracellular domain is cleaved in response to homophilic interaction of the extracellular domain, and translocates to the nucleus. Here it probably carries out some transcriptional regulatory activity. The length of this region and the conservation suggest that there may be two structural domains here (personal obs:C Yeats).
Pssm-ID: 461932 [Multi-domain] Cd Length: 367 Bit Score: 266.46 E-value: 7.72e-79
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 192 DPGKLAAPGSAP--HGHGEGARRQEQPSNNPGQPTLQ----PLPPSHKQHpAQHHPSITSLNRNSLTNRRNQSPAPPAAL 265
Cdd:pfam06484 203 DPDEEFSPNSYLvrTGSGPQSAPSEQPPNFQNHSRLRtpppPLPPPHKQN-QHHHPSINSLNRSSLTNRRNPSPAPTASL 281
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 266 PAELQTTPESVQLQDSWVLGSNVPLESRHFLFKTGTGTTPLFSTATPGYTMASGSVYSPPTRPLPRNTLSRSAFKFKKSS 345
Cdd:pfam06484 282 PAELQSTQESVQLQDSWVLNSNVPLETRHFLFKTGTGTTPLFCTASPGYPLTSGTVYSPPPRPLPRNTFSRPAFKLKKPY 361
|
....*.
gi 1907188904 346 KYCSWR 351
Cdd:pfam06484 362 KYCSWK 367
|
|
| NHL_like_1 | cd14953 | Uncharacterized NHL-repeat domain in bacterial proteins; This bacterial family of NHL-repeat ... | 1204-1556 | 1.17e-40 |
|
Uncharacterized NHL-repeat domain in bacterial proteins; This bacterial family of NHL-repeat domains is found in a variety of domain architectures. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.
Pssm-ID: 271323 [Multi-domain] Cd Length: 323 Bit Score: 154.23 E-value: 1.17e-40
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1204 VSSIMGNGRRrsiscpscnGQADGNKLLA----PVALACGIDGSLYVGDF--NYVRRIFPSGNVTSVL----ELSSN--- 1270
Cdd:cd14953 1 VSTVAGSGTA---------GFSGGGGTAArfnsPSGVAVDAAGNLYVADRgnHRIRKITPDGVVTTVAgtgtAGFADggg 71
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1271 PAHRYY----LATDPvTGDLYVSDTNTRRIYRpksLTGAKDLTknaeVVAGTGEqclpfdeARCGDGGKAVEATLMSPKG 1346
Cdd:cd14953 72 AAAQFNtpsgVAVDA-AGNLYVADTGNHRIRK---ITPDGVVS----TLAGTGT-------AGFSDDGGATAAQFNYPTG 136
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1347 MAIDKNGLIYFVDGT--MIRKVDQNGIISTLLGSNDLTSAR--PLTcdtsmhisQVRLEWPTDLAINPMDNsIYVLD--N 1420
Cdd:cd14953 137 VAVDAAGNLYVADTGnhRIRKITPDGVVTTVAGTGGAGYAGdgPAT--------AAQFNNPTGVAVDAAGN-LYVADrgN 207
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1421 NVVLQITENRQVRIAAGRPmhcqvpGVEYPVGKHAVQTTLESATAIAVSYSGVLYITETDEkkiNRIRQVTTDGEISLVA 1500
Cdd:cd14953 208 HRIRKITPDGVVTTVAGTG------TAGFSGDGGATAAQLNNPTGVAVDAAGNLYVADSGN---HRIRKITPAGVVTTVA 278
|
330 340 350 360 370
....*....|....*....|....*....|....*....|....*....|....*..
gi 1907188904 1501 GIPSEcdckndancdcyQSGD-GYAKDAKLNAPSSLAASPDGTLYIADLGNIRIRAV 1556
Cdd:cd14953 279 GGGAG------------FSGDgGPATSAQFNNPTGVAVDAAGNLYVADTGNNRIRKI 323
|
|
| Tox-GHH | pfam15636 | GHH signature containing HNH/Endo VII superfamily nuclease toxin; A predicted toxin of the HNH ... | 2673-2750 | 7.89e-40 |
|
GHH signature containing HNH/Endo VII superfamily nuclease toxin; A predicted toxin of the HNH/Endonuclease VII fold present in bacterial polymorphic toxin systems with a characteristic sG[HQ]H signature motif. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 6, type 7 or TcdB/TcaC-type secretion system. The metazoan teneurin proteins possess an inactive version of this domain at their C-terminus.
Pssm-ID: 464783 Cd Length: 78 Bit Score: 142.75 E-value: 7.89e-40
10 20 30 40 50 60 70
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 1907188904 2673 EEKARILEQARQRALARAWAREQQRVRDGEEGARLWTEGEKRQLLSAGKVQGYDGYYVLSVEQYPELADSANNIQFLR 2750
Cdd:pfam15636 1 EERKRLLEHAKKRAVREAWHRERQLLRNGLPGSRDWTDEEKEELLSTGSVPGYDGEYIHPVEQYPELADDPSNIRFRK 78
|
|
| RhsA | COG3209 | Uncharacterized conserved protein RhsA, contains 28 RHS repeats [General function prediction ... | 1510-2450 | 1.05e-34 |
|
Uncharacterized conserved protein RhsA, contains 28 RHS repeats [General function prediction only];
Pssm-ID: 442442 [Multi-domain] Cd Length: 1103 Bit Score: 146.44 E-value: 1.05e-34
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1510 NDANCDCYQSGDGYAKDAKLNAPSSLAASPDGTLYIADLGNIRIRAVSKNKPLLNSMNFYEVASPTDQELYIFDINGTHQ 1589
Cdd:COG3209 109 AAATASAGRLVSTGAGAGGTVTAATGGTLGATAGSATTGSTDGGRGGVAVTGLAGGGASAYGLTLGGAAAGPATGVGTGA 188
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1590 YTVSLVTGDYLYNFSYSNDNDVTAVTDSNGNTLRIRRDPNRMPVRVVSPDNQVIWLTIGTNGCLKSMTAQGLELVLFTYH 1669
Cdd:COG3209 189 VTLATGLAGSALLALGSGAILGGLAGAYSGSATTATGTALGTPASVAATVTGSATGAAGAGAAVATAATTLGGTTGAGTG 268
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1670 GNSGLLATK---SDETGWTTFFDYDSEGRLTNVTFPTGVVTNLHGDMDKAITVDIESSSREEDVSITSNLSSIDSFYTMV 1746
Cdd:COG3209 269 ASGAGLDAStgtGGAGGSNAAATAGGLGGAGLGSGGAGGGGTAGGTTTAAGTTGTAAVSGAADAGTTTTTGTGTGGTTTT 348
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1747 QDQLRNSYQIGYDGSLRIFYASGLDSHYQTEPHVLAGTANPTVAKRNMTLPGENGQNLVEWRFRKEQAQGKVNVFGRKLR 1826
Cdd:COG3209 349 VGGGGSLTLGGYGAAGGLTTSVGAGGGGSTSGSTTTVGGGGTATGSGGGSSTTGVGAGTTTTSTTGGDGGPATAAGALTA 428
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1827 VNGRNLLSVDFDRTTKTEKIYDDHRKFLLRIAYDTSGHPTLWLPSSKLMAVNVTYSSTGQIASIQRGTTSEKVDYDSQGR 1906
Cdd:COG3209 429 GGTATGTGTGGGGTTAGTDATTTTGGAGASGTLTTTGGAATGATTGGGTEAGTGGGTLTSGSAGATTLGTDTTLDDTLGG 508
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1907 IVSRVFADGKTWSYTYLEKSMVLLLHSQRQYIFEYDMWDRLSAITMPSVARHTMQTIRSIGYYRNIYNPPESNASIITDY 1986
Cdd:COG3209 509 TTTTTAGARGLVVTTGTTLTLGTTTTATLSATDATGTGDTTTTGTVGTGTSTGTGGTGTVTTTGDGTGGASTTTGTTGGT 588
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1987 NEEGLLLQTAFLGTSRRVLFKYRRQTRLSEILYDSTRVSFTYDETAGVLKTVNLQSDGFicTIRYRQIGPLIDRQIFRFS 2066
Cdd:COG3209 589 ATTTTVTTTTTTSTAGTTTTTTSGYTRAGLTLTLGTGTASGLERATASTGSTTGGTTGT--GVTTTGTTTTRATGTTGTG 666
|
570 580 590 600 610 620 630 640
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 2067 EDGMVNARFDYSYDNSFRVTSMQGVINETPLPIDLYQFDDISGKVEQFGKFGVIYYDINQIISTAVMTYTKHFDAHGRIK 2146
Cdd:COG3209 667 TGVTAGLTTLATGGTTVGGGTGTTSTATTGATTGGTETGTTVTTLAGGTTTRLGTTTTGGGGGTTTDGTGTGGTTGTLTT 746
|
650 660 670 680 690 700 710 720
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 2147 EIQYEifRSLMYWITIQYDNMGRVTKREIKIGPFANTTKYAYEYDVDGQLQTVYLNEKIMWRYNYDLNGNLHLLNPSSSA 2226
Cdd:COG3209 747 TSTTT--TTTAGALTYTYDALGRLTSETTPGGVTQGTYTTRYTYDALGRLTSVTYPDGETVTYTYDALGRLTSVITVGSG 824
|
730 740 750 760 770 780 790 800
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 2227 RLTPL-----RYDLRDRITRLgdvqyrldEDGFLRQRGTEIFEYSSKGLLTRVYSKGSGWTviYRYDGLGRRVSSKTSLG 2301
Cdd:COG3209 825 GGTDLqdrtyTYDAAGNITSI--------TDALRAGTLTQTYTYDALGRLTSATDPGTTES--YTYDANGNLTSRTDGGT 894
|
810 820 830 840 850 860 870 880
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 2302 QHLQFFYADLtyPTRITHvynhSSSEITSLYYDLQGHlfameissgdefyiaSDNTGTPLAVFSSNGLMLKQIQYTAYGE 2381
Cdd:COG3209 895 TTYTYDALGR--LVSVTK----PDGTTTTYTYDALGH---------------TDHLGSVRALTDASGQVVWRYDYDPFGN 953
|
890 900 910 920 930 940
....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 1907188904 2382 IYFDSNVDFQLVIGFHGGLYDPLTKLIHFGERDYDILAGRWTTPDieiwkRIGKDPAPfNLYMFRNNNP 2450
Cdd:COG3209 954 LLAETSGAAANPLRFTGQEYDAETGLYYNGARYYDPALGRFLSPD-----PIGLAGGL-NLYAYVGNNP 1016
|
|
| Vgb | COG4257 | Streptogramin lyase [Defense mechanisms]; | 1226-1504 | 3.54e-10 |
|
Streptogramin lyase [Defense mechanisms];
Pssm-ID: 443399 [Multi-domain] Cd Length: 270 Bit Score: 63.50 E-value: 3.54e-10
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1226 DGNKLLAPVALACGIDGSLYVGDF--NYVRRIFPSGNVTSVLELSSNPAHRYYLATDPvTGDLYVSDTNTRRIYRpksLT 1303
Cdd:COG4257 54 PLGGGSGPHGIAVDPDGNLWFTDNgnNRIGRIDPKTGEITTFALPGGGSNPHGIAFDP-DGNLWFTDQGGNRIGR---LD 129
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1304 gakdlTKNAEVVAGTgeqcLPFDEARcgdggkaveatlmsPKGMAIDKNGLIYFVD--GTMIRKVD-QNGIISTLLGSND 1380
Cdd:COG4257 130 -----PATGEVTEFP----LPTGGAG--------------PYGIAVDPDGNLWVTDfgANAIGRIDpDTGTLTEYALPTP 186
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1381 LTSarpltcdtsmhisqvrlewPTDLAINPmDNSIYVLD--NNVVLQITEnrqvriAAGRpmhcqvpgveypVGKHAVQT 1458
Cdd:COG4257 187 GAG-------------------PRGLAVDP-DGNLWVADtgSGRIGRFDP------KTGT------------VTEYPLPG 228
|
250 260 270 280
....*....|....*....|....*....|....*....|....*.
gi 1907188904 1459 TLESATAIAVSYSGVLYITETDekkINRIRQVTTDGEISLVAgIPS 1504
Cdd:COG4257 229 GGARPYGVAVDGDGRVWFAESG---ANRIVRFDPDTELTEYV-LPS 270
|
|
| Rhs_assc_core | TIGR03696 | RHS repeat-associated core domain; This model represents a conserved unique core sequence ... | 2376-2450 | 2.78e-09 |
|
RHS repeat-associated core domain; This model represents a conserved unique core sequence shared by large numbers of proteins. It is occasional in the Archaea (Methanosarcina barkeri) but common in bacteria and eukaryotes. Most fall into two large classes. One class consists of long proteins in which two classes of repeats are abundant: an FG-GAP repeat (pfam01839) class, and an RHS repeat (pfam05593) or YD repeat (TIGR01643). This class includes secreted bacterial insecticidal toxins and intercellular signalling proteins such as the teneurins in animals. The other class consists of uncharacterized proteins shorter than 400 amino acids, where this core domain of about 75 amino acids tends to occur in the N-terminal half. Over twenty such proteins are found in Pseudomonas putida alone; little sequence similarity or repeat structure is found among these proteins outside the region modeled by this domain.
Pssm-ID: 274730 [Multi-domain] Cd Length: 77 Bit Score: 55.58 E-value: 2.78e-09
10 20 30 40 50 60 70
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 1907188904 2376 YTAYGEIYFDSNVDFQLvIGFHGGLYDPLTKLIHFGERDYDILAGRWTTPDieiwkRIGKDpAPFNLYMFRNNNP 2450
Cdd:TIGR03696 1 YDPYGEVLSESGAAPNP-LRFTGQYYDAETGLYYNGARYYDPELGRFLSPD-----PIGLG-GGLNLYAYVGNNP 68
|
|
| acid_disulf_rpt | NF033662 | acidic double-disulfide repeat; The acidic double-disulfide repeat is an Asp-rich repeat with ... | 834-864 | 9.71e-08 |
|
acidic double-disulfide repeat; The acidic double-disulfide repeat is an Asp-rich repeat with four nearly invariant Cys residues in a repeat length of about 35 amino acids.
Pssm-ID: 411265 [Multi-domain] Cd Length: 32 Bit Score: 49.82 E-value: 9.71e-08
10 20 30
....*....|....*....|....*....|.
gi 1907188904 834 AMETLCTDSKDNEGDGLIDCMDPDCCLQSSC 864
Cdd:NF033662 2 ATDTTCSDGIDNDGDGLTDCADPDCAGNPVC 32
|
|
| DUF5885 | pfam19232 | Family of unknown function (DUF5885); This is a family of uncharacterized proteins of unknown ... | 565-750 | 1.77e-07 |
|
Family of unknown function (DUF5885); This is a family of uncharacterized proteins of unknown function found in viruses.
Pssm-ID: 437064 Cd Length: 265 Bit Score: 55.01 E-value: 1.77e-07
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 565 CHGNGECvsGTCH-CFPGFLGPDCSraACPVLCSGNGQ----------YSKGRC----LCFSGwkgTECDVPTTQCI-DP 628
Cdd:pfam19232 34 CTTDAQC--GTCMtCVAGACTPKAS--CCGGVTCGAGQtcdaktntcvYVKGYCsadhPCPSG---SACDTAKNACIaQP 106
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 629 QCG---GRGiCIMG-------------------SCACNSGYK-GENCE--------EADCLDP---------------GC 662
Cdd:pfam19232 107 PYGpdsGKG-CVRGfgawiweldpatnsgvwrcRCANGSLYNsAHECSpladqtlcAAENLDPnalvpassvpafaayGW 185
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 663 SNHGVCIH-------------GECHCNPGWGGSNCEILKTmcadqCSGHGTYLQESGSCTCDPNWTGpdcsneicsvdcg 729
Cdd:pfam19232 186 GNQPVLINkstagaavpsplaGVCPCKPGWAGGSCTEDRT-----CNGRGTWNETTGQCACNIDFSG------------- 247
|
250 260
....*....|....*....|....
gi 1907188904 730 shgvcmGGSCRCEEG---WTGPAC 750
Cdd:pfam19232 248 ------HNSCGDDNNctsWTGPRC 265
|
|
| C_rich_MXAN6577 | NF041328 | MXAN_6577-like cysteine-rich domain; | 616-770 | 5.48e-06 |
|
MXAN_6577-like cysteine-rich domain;
Pssm-ID: 469225 [Multi-domain] Cd Length: 145 Bit Score: 48.60 E-value: 5.48e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 616 TECDVPTTQCIDPQ--CGGRGICIMGScACNSGykgeNCEEAdcldpgCSNHGVCIHGECHCNPGwggsnceilKTMCAD 693
Cdd:NF041328 12 AGCPEPGAVCPEGLsvCGGACVDLRSD-PSNCG----ACGVA------CGAGQTCVAGACGCGPG---------TVACGG 71
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 694 QCSGHGTylqesgsctcDPNWTGPdcsneiCSVDCGSHGVCMGGSCR--CEEGWT--GPAC--------NQRACHPRCAE 761
Cdd:NF041328 72 ACVDTAS----------DPAHCGA------CGAACAPGQVCEGGACReaCSEGLTrcGGACvdlatdplHCGACGVACDP 135
|
....*....
gi 1907188904 762 HGTCKDGKC 770
Cdd:NF041328 136 GESCRGGAC 144
|
|
| DSL | pfam01414 | Delta serrate ligand; This family has been redefined to correspond to the EGF-like domain ... | 741-784 | 2.04e-05 |
|
Delta serrate ligand; This family has been redefined to correspond to the EGF-like domain defined by structure.
Pssm-ID: 460202 Cd Length: 46 Bit Score: 43.77 E-value: 2.04e-05
10 20 30 40
....*....|....*....|....*....|....*....|....*..
gi 1907188904 741 CEEGWTGPACNqRACHPRCAE--HGTC-KDGKCECSQGWNGEHCTIA 784
Cdd:pfam01414 1 CDENYYGSTCS-KFCRPRDDKfgHYTCdANGNKVCLPGWTGPYCDKP 46
|
|
| PLN02919 | PLN02919 | haloacid dehalogenase-like hydrolase family protein | 1277-1560 | 3.91e-05 |
|
haloacid dehalogenase-like hydrolase family protein
Pssm-ID: 215497 [Multi-domain] Cd Length: 1057 Bit Score: 49.46 E-value: 3.91e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1277 LATDPVTGDLYVSDTNTRRIYrpksltgAKDLTKNAEV-VAGTGEQCL---PFDEArcgdggkaveaTLMSPKGMAID-K 1351
Cdd:PLN02919 573 LAIDLLNNRLFISDSNHNRIV-------VTDLDGNFIVqIGSTGEEGLrdgSFEDA-----------TFNRPQGLAYNaK 634
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1352 NGLIYFVD--GTMIRKVD-QNGIISTLLGS----NDLTSARPLTcdtsmhiSQVrLEWPTDLAINPMDNSIYV------- 1417
Cdd:PLN02919 635 KNLLYVADteNHALREIDfVNETVRTLAGNgtkgSDYQGGKKGT-------SQV-LNSPWDVCFEPVNEKVYIamagqhq 706
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1418 ------LD-------------------------------------NNVVLQITENRQVR-----------IAAGRPMhcq 1443
Cdd:PLN02919 707 iweyniSDgvtrvfsgdgyernlngssgtstsfaqpsgislspdlKELYIADSESSSIRaldlktggsrlLAGGDPT--- 783
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1444 VPGVEYPVGKH---AVQTTLESATAIAVSYSGVLYITETDEKKINRIRQVTtdGEISLVAGIPsecdckndancdcyQSG 1520
Cdd:PLN02919 784 FSDNLFKFGDHdgvGSEVLLQHPLGVLCAKDGQIYVADSYNHKIKKLDPAT--KRVTTLAGTG--------------KAG 847
|
330 340 350 360
....*....|....*....|....*....|....*....|..
gi 1907188904 1521 --DGYAKDAKLNAPSSLAASPDGTLYIADLGNIRIRAVSKNK 1560
Cdd:PLN02919 848 fkDGKALKAQLSEPAGLALGENGRLFVADTNNSLIRYLDLNK 889
|
|
| NHL | cd05819 | NHL repeat unit of beta-propeller proteins; The NHL(NCL-1, HT2A and LIN-41)-repeat is found in ... | 1525-1651 | 5.73e-05 |
|
NHL repeat unit of beta-propeller proteins; The NHL(NCL-1, HT2A and LIN-41)-repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures. The repeats have a catalytic activity in Peptidyl-glycine alpha-amidating monooxygenase; proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localized to the repeats. Tripartite motif-containing protein 32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats.
Pssm-ID: 271320 [Multi-domain] Cd Length: 269 Bit Score: 47.31 E-value: 5.73e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1525 KDAKLNAPSSLAASPDGTLYIADLGNIRIRAVSKN-KPLLN-------SMNFYE---VASPTDQELYI----------FD 1583
Cdd:cd05819 3 GPGELNNPQGIAVDSSGNIYVADTGNNRIQVFDPDgNFITSfgsfgsgDGQFNEpagVAVDSDGNLYVadtgnhriqkFD 82
|
90 100 110 120 130 140 150
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 1907188904 1584 INGTHQYTVSlVTGDYLYNFSY------SNDNDVtAVTDSNGNtlRIrrdpnrmpvRVVSPDNQVIwLTIGTNG 1651
Cdd:cd05819 83 PDGNFLASFG-GSGDGDGEFNGprgiavDSSGNI-YVADTGNH--RI---------QKFDPDGEFL-TTFGSGG 142
|
|
| C_rich_MXAN6577 | NF041328 | MXAN_6577-like cysteine-rich domain; | 724-825 | 4.81e-04 |
|
MXAN_6577-like cysteine-rich domain;
Pssm-ID: 469225 [Multi-domain] Cd Length: 145 Bit Score: 42.82 E-value: 4.81e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 724 CSVDCGSHGVCMGGSCRCEEGWT--GPAC--------NQRACHPRCAEHGTCKDGKCecsqgwngehctiahyldkivkd 793
Cdd:NF041328 45 CGVACGAGQTCVAGACGCGPGTVacGGACvdtasdpaHCGACGAACAPGQVCEGGAC----------------------- 101
|
90 100 110
....*....|....*....|....*....|....*....
gi 1907188904 794 kigyKEGCP-GLCNSNGRCT-LDQNGWHC-----VCQPG 825
Cdd:NF041328 102 ----REACSeGLTRCGGACVdLATDPLHCgacgvACDPG 136
|
|
| EGF_CA | cd00054 | Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ... | 656-685 | 3.77e-03 |
|
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Pssm-ID: 238011 Cd Length: 38 Bit Score: 37.23 E-value: 3.77e-03
10 20 30
....*....|....*....|....*....|....*
gi 1907188904 656 DCLDPG-CSNHGVCIHGE----CHCNPGWGGSNCE 685
Cdd:cd00054 4 ECASGNpCQNGGTCVNTVgsyrCSCPPGYTGRNCE 38
|
|
| EGF_CA |
cd00054 |
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ... |
802-831 |
4.64e-03 |
|
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non-calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Pssm-ID: 238011 Cd Length: 38 Bit Score: 36.85 E-value: 4.64e-03
10 20 30
....*....|....*....|....*....|
gi 1907188904 802 PGLCNSNGRCTLDQNGWHCVCQPGWRGAGC 831
Cdd:cd00054 8 GNPCQNGGTCVNTVGSYRCSCPPGYTGRNC 37
|
|
| RHS_repeat |
pfam05593 |
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be ... |
1673-1704 |
6.63e-03 |
|
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. The 3D structure of an RHS-repeat-containing protein (the B and C components of an ABC toxin complex) has been determined. The RHS repeats form an extended strip of beta-sheet that spirals around to form a hollow shell, encapsulating the variable C-terminal domain.
Pssm-ID: 461685 [Multi-domain] Cd Length: 37 Bit Score: 36.42 E-value: 6.63e-03
10 20 30
....*....|....*....|....*....|..
gi 1907188904 1673 GLLATKSDETGWTTFFDYDSEGRLTNVTFPTG 1704
Cdd:pfam05593 5 GRLTSVTDPDGRVTTYTYDAAGRLTAVTDPDG 36
|
|
|
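Each hit above carries a one-line summary of the form "Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...". As a minimal sketch for readers who want to tabulate those values, the summary line could be parsed as follows. The function name `parse_summary` and the returned field names are hypothetical and chosen for illustration; the pattern assumes the line layout seen in this listing, with the "[Multi-domain]" flag being optional.

```python
import re

# Pattern for the per-hit summary line as it appears in this listing, e.g.
# "Pssm-ID: 461685 [Multi-domain] Cd Length: 37 Bit Score: 36.42 E-value: 6.63e-03"
# The bracketed flag is optional (EGF_CA hits omit it).
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<flag>[^\]]+)\])?"
    r"\s*Cd Length:\s*(?P<cd_length>\d+)"
    r"\s*Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s*E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_summary(line):
    """Return the summary fields as a dict, or None if the line does not match."""
    m = SUMMARY_RE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m.group("pssm_id")),
        "multi_domain": m.group("flag") == "Multi-domain",
        "cd_length": int(m.group("cd_length")),
        "bit_score": float(m.group("bit_score")),
        "e_value": float(m.group("e_value")),
    }


if __name__ == "__main__":
    print(parse_summary(
        "Pssm-ID: 461685 [Multi-domain] Cd Length: 37 Bit Score: 36.42 E-value: 6.63e-03"
    ))
```

Lines that do not follow this layout return None, so the function can be applied to every line of the listing and the non-matching ones filtered out.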
|
Name |
Accession |
Description |
Interval |
E-value |
| Ten_N |
pfam06484 |
Teneurin Intracellular Region; This family is found in the intracellular N-terminal region of ... |
192-351 |
7.72e-79 |
|
Teneurin Intracellular Region; This family is found in the intracellular N-terminal region of the Teneurin family of proteins. These proteins are 'pair-rule' genes and are involved in tissue patterning, probably neural patterning in particular. The intracellular domain is cleaved in response to homophilic interaction of the extracellular domain, and translocates to the nucleus. There it probably carries out some transcriptional regulatory activity. The length of this region and its conservation suggest that there may be two structural domains here (personal obs: C Yeats).
Pssm-ID: 461932 [Multi-domain] Cd Length: 367 Bit Score: 266.46 E-value: 7.72e-79
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 192 DPGKLAAPGSAP--HGHGEGARRQEQPSNNPGQPTLQ----PLPPSHKQHpAQHHPSITSLNRNSLTNRRNQSPAPPAAL 265
Cdd:pfam06484 203 DPDEEFSPNSYLvrTGSGPQSAPSEQPPNFQNHSRLRtpppPLPPPHKQN-QHHHPSINSLNRSSLTNRRNPSPAPTASL 281
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 266 PAELQTTPESVQLQDSWVLGSNVPLESRHFLFKTGTGTTPLFSTATPGYTMASGSVYSPPTRPLPRNTLSRSAFKFKKSS 345
Cdd:pfam06484 282 PAELQSTQESVQLQDSWVLNSNVPLETRHFLFKTGTGTTPLFCTASPGYPLTSGTVYSPPPRPLPRNTFSRPAFKLKKPY 361
|
....*.
gi 1907188904 346 KYCSWR 351
Cdd:pfam06484 362 KYCSWK 367
|
|
| NHL_like_1 |
cd14953 |
Uncharacterized NHL-repeat domain in bacterial proteins; This bacterial family of NHL-repeat ... |
1204-1556 |
1.17e-40 |
|
Uncharacterized NHL-repeat domain in bacterial proteins; This bacterial family of NHL-repeat domains is found in a variety of domain architectures. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.
Pssm-ID: 271323 [Multi-domain] Cd Length: 323 Bit Score: 154.23 E-value: 1.17e-40
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1204 VSSIMGNGRRrsiscpscnGQADGNKLLA----PVALACGIDGSLYVGDF--NYVRRIFPSGNVTSVL----ELSSN--- 1270
Cdd:cd14953 1 VSTVAGSGTA---------GFSGGGGTAArfnsPSGVAVDAAGNLYVADRgnHRIRKITPDGVVTTVAgtgtAGFADggg 71
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1271 PAHRYY----LATDPvTGDLYVSDTNTRRIYRpksLTGAKDLTknaeVVAGTGEqclpfdeARCGDGGKAVEATLMSPKG 1346
Cdd:cd14953 72 AAAQFNtpsgVAVDA-AGNLYVADTGNHRIRK---ITPDGVVS----TLAGTGT-------AGFSDDGGATAAQFNYPTG 136
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1347 MAIDKNGLIYFVDGT--MIRKVDQNGIISTLLGSNDLTSAR--PLTcdtsmhisQVRLEWPTDLAINPMDNsIYVLD--N 1420
Cdd:cd14953 137 VAVDAAGNLYVADTGnhRIRKITPDGVVTTVAGTGGAGYAGdgPAT--------AAQFNNPTGVAVDAAGN-LYVADrgN 207
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1421 NVVLQITENRQVRIAAGRPmhcqvpGVEYPVGKHAVQTTLESATAIAVSYSGVLYITETDEkkiNRIRQVTTDGEISLVA 1500
Cdd:cd14953 208 HRIRKITPDGVVTTVAGTG------TAGFSGDGGATAAQLNNPTGVAVDAAGNLYVADSGN---HRIRKITPAGVVTTVA 278
|
330 340 350 360 370
....*....|....*....|....*....|....*....|....*....|....*..
gi 1907188904 1501 GIPSEcdckndancdcyQSGD-GYAKDAKLNAPSSLAASPDGTLYIADLGNIRIRAV 1556
Cdd:cd14953 279 GGGAG------------FSGDgGPATSAQFNNPTGVAVDAAGNLYVADTGNNRIRKI 323
|
|
| Tox-GHH |
pfam15636 |
GHH signature containing HNH/Endo VII superfamily nuclease toxin; A predicted toxin of the HNH ... |
2673-2750 |
7.89e-40 |
|
GHH signature containing HNH/Endo VII superfamily nuclease toxin; A predicted toxin of the HNH/Endonuclease VII fold present in bacterial polymorphic toxin systems with a characteristic sG[HQ]H signature motif. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 6, type 7 or TcdB/TcaC-type secretion system. The metazoan teneurin proteins possess an inactive version of this domain at their C-terminus.
Pssm-ID: 464783 Cd Length: 78 Bit Score: 142.75 E-value: 7.89e-40
10 20 30 40 50 60 70
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 1907188904 2673 EEKARILEQARQRALARAWAREQQRVRDGEEGARLWTEGEKRQLLSAGKVQGYDGYYVLSVEQYPELADSANNIQFLR 2750
Cdd:pfam15636 1 EERKRLLEHAKKRAVREAWHRERQLLRNGLPGSRDWTDEEKEELLSTGSVPGYDGEYIHPVEQYPELADDPSNIRFRK 78
|
|
| NHL_like_1 |
cd14953 |
Uncharacterized NHL-repeat domain in bacterial proteins; This bacterial family of NHL-repeat ... |
1277-1557 |
1.55e-38 |
|
Uncharacterized NHL-repeat domain in bacterial proteins; This bacterial family of NHL-repeat domains is found in a variety of domain architectures. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.
Pssm-ID: 271323 [Multi-domain] Cd Length: 323 Bit Score: 148.06 E-value: 1.55e-38
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1277 LATDPvTGDLYVSDTNTRRIYRpksltgakdLTKNAEV--VAGTGEqclpfdEARCGDGGKAveATLMSPKGMAIDKNGL 1354
Cdd:cd14953 28 VAVDA-AGNLYVADRGNHRIRK---------ITPDGVVttVAGTGT------AGFADGGGAA--AQFNTPSGVAVDAAGN 89
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1355 IYFVDGT--MIRKVDQNGIISTLLGsndlTSARPLTCDTSMhiSQVRLEWPTDLAINPMDNsIYVLD--NNVVLQITENR 1430
Cdd:cd14953 90 LYVADTGnhRIRKITPDGVVSTLAG----TGTAGFSDDGGA--TAAQFNYPTGVAVDAAGN-LYVADtgNHRIRKITPDG 162
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1431 QVRIAAGRPmhcqVPGveYPVGKHAVQTTLESATAIAVSYSGVLYITETDEkkiNRIRQVTTDGEISLVAGIPSEcdckn 1510
Cdd:cd14953 163 VVTTVAGTG----GAG--YAGDGPATAAQFNNPTGVAVDAAGNLYVADRGN---HRIRKITPDGVVTTVAGTGTA----- 228
|
250 260 270 280
....*....|....*....|....*....|....*....|....*..
gi 1907188904 1511 dancdcYQSGDGYAKDAKLNAPSSLAASPDGTLYIADLGNIRIRAVS 1557
Cdd:cd14953 229 ------GFSGDGGATAAQLNNPTGVAVDAAGNLYVADSGNHRIRKIT 269
|
|
| RhsA |
COG3209 |
Uncharacterized conserved protein RhsA, contains 28 RHS repeats [General function prediction ... |
1510-2450 |
1.05e-34 |
|
Uncharacterized conserved protein RhsA, contains 28 RHS repeats [General function prediction only];
Pssm-ID: 442442 [Multi-domain] Cd Length: 1103 Bit Score: 146.44 E-value: 1.05e-34
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1510 NDANCDCYQSGDGYAKDAKLNAPSSLAASPDGTLYIADLGNIRIRAVSKNKPLLNSMNFYEVASPTDQELYIFDINGTHQ 1589
Cdd:COG3209 109 AAATASAGRLVSTGAGAGGTVTAATGGTLGATAGSATTGSTDGGRGGVAVTGLAGGGASAYGLTLGGAAAGPATGVGTGA 188
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1590 YTVSLVTGDYLYNFSYSNDNDVTAVTDSNGNTLRIRRDPNRMPVRVVSPDNQVIWLTIGTNGCLKSMTAQGLELVLFTYH 1669
Cdd:COG3209 189 VTLATGLAGSALLALGSGAILGGLAGAYSGSATTATGTALGTPASVAATVTGSATGAAGAGAAVATAATTLGGTTGAGTG 268
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1670 GNSGLLATK---SDETGWTTFFDYDSEGRLTNVTFPTGVVTNLHGDMDKAITVDIESSSREEDVSITSNLSSIDSFYTMV 1746
Cdd:COG3209 269 ASGAGLDAStgtGGAGGSNAAATAGGLGGAGLGSGGAGGGGTAGGTTTAAGTTGTAAVSGAADAGTTTTTGTGTGGTTTT 348
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1747 QDQLRNSYQIGYDGSLRIFYASGLDSHYQTEPHVLAGTANPTVAKRNMTLPGENGQNLVEWRFRKEQAQGKVNVFGRKLR 1826
Cdd:COG3209 349 VGGGGSLTLGGYGAAGGLTTSVGAGGGGSTSGSTTTVGGGGTATGSGGGSSTTGVGAGTTTTSTTGGDGGPATAAGALTA 428
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1827 VNGRNLLSVDFDRTTKTEKIYDDHRKFLLRIAYDTSGHPTLWLPSSKLMAVNVTYSSTGQIASIQRGTTSEKVDYDSQGR 1906
Cdd:COG3209 429 GGTATGTGTGGGGTTAGTDATTTTGGAGASGTLTTTGGAATGATTGGGTEAGTGGGTLTSGSAGATTLGTDTTLDDTLGG 508
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1907 IVSRVFADGKTWSYTYLEKSMVLLLHSQRQYIFEYDMWDRLSAITMPSVARHTMQTIRSIGYYRNIYNPPESNASIITDY 1986
Cdd:COG3209 509 TTTTTAGARGLVVTTGTTLTLGTTTTATLSATDATGTGDTTTTGTVGTGTSTGTGGTGTVTTTGDGTGGASTTTGTTGGT 588
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1987 NEEGLLLQTAFLGTSRRVLFKYRRQTRLSEILYDSTRVSFTYDETAGVLKTVNLQSDGFicTIRYRQIGPLIDRQIFRFS 2066
Cdd:COG3209 589 ATTTTVTTTTTTSTAGTTTTTTSGYTRAGLTLTLGTGTASGLERATASTGSTTGGTTGT--GVTTTGTTTTRATGTTGTG 666
|
570 580 590 600 610 620 630 640
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 2067 EDGMVNARFDYSYDNSFRVTSMQGVINETPLPIDLYQFDDISGKVEQFGKFGVIYYDINQIISTAVMTYTKHFDAHGRIK 2146
Cdd:COG3209 667 TGVTAGLTTLATGGTTVGGGTGTTSTATTGATTGGTETGTTVTTLAGGTTTRLGTTTTGGGGGTTTDGTGTGGTTGTLTT 746
|
650 660 670 680 690 700 710 720
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 2147 EIQYEifRSLMYWITIQYDNMGRVTKREIKIGPFANTTKYAYEYDVDGQLQTVYLNEKIMWRYNYDLNGNLHLLNPSSSA 2226
Cdd:COG3209 747 TSTTT--TTTAGALTYTYDALGRLTSETTPGGVTQGTYTTRYTYDALGRLTSVTYPDGETVTYTYDALGRLTSVITVGSG 824
|
730 740 750 760 770 780 790 800
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 2227 RLTPL-----RYDLRDRITRLgdvqyrldEDGFLRQRGTEIFEYSSKGLLTRVYSKGSGWTviYRYDGLGRRVSSKTSLG 2301
Cdd:COG3209 825 GGTDLqdrtyTYDAAGNITSI--------TDALRAGTLTQTYTYDALGRLTSATDPGTTES--YTYDANGNLTSRTDGGT 894
|
810 820 830 840 850 860 870 880
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 2302 QHLQFFYADLtyPTRITHvynhSSSEITSLYYDLQGHlfameissgdefyiaSDNTGTPLAVFSSNGLMLKQIQYTAYGE 2381
Cdd:COG3209 895 TTYTYDALGR--LVSVTK----PDGTTTTYTYDALGH---------------TDHLGSVRALTDASGQVVWRYDYDPFGN 953
|
890 900 910 920 930 940
....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 1907188904 2382 IYFDSNVDFQLVIGFHGGLYDPLTKLIHFGERDYDILAGRWTTPDieiwkRIGKDPAPfNLYMFRNNNP 2450
Cdd:COG3209 954 LLAETSGAAANPLRFTGQEYDAETGLYYNGARYYDPALGRFLSPD-----PIGLAGGL-NLYAYVGNNP 1016
|
|
| NHL_like_1 |
cd14953 |
Uncharacterized NHL-repeat domain in bacterial proteins; This bacterial family of NHL-repeat ... |
1314-1557 |
6.93e-32 |
|
Uncharacterized NHL-repeat domain in bacterial proteins; This bacterial family of NHL-repeat domains is found in a variety of domain architectures. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.
Pssm-ID: 271323 [Multi-domain] Cd Length: 323 Bit Score: 128.80 E-value: 6.93e-32
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1314 VVAGTGeqclpfdeARCGDGGKAVEATLMSPKGMAIDKNGLIYFVDGT--MIRKVDQNGIISTLLG------SNDLTSAr 1385
Cdd:cd14953 3 TVAGSG--------TAGFSGGGGTAARFNSPSGVAVDAAGNLYVADRGnhRIRKITPDGVVTTVAGtgtagfADGGGAA- 73
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1386 pltcdtsmhisqVRLEWPTDLAINPMDNsIYVLD--NNVVLQITENRQVRIAAGrpmhcqVPGVEYPVGKHAVQTTLESA 1463
Cdd:cd14953 74 ------------AQFNTPSGVAVDAAGN-LYVADtgNHRIRKITPDGVVSTLAG------TGTAGFSDDGGATAAQFNYP 134
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1464 TAIAVSYSGVLYITETDEkkiNRIRQVTTDGEISLVAGIPSEcdckndancdcYQSGDGYAKDAKLNAPSSLAASPDGTL 1543
Cdd:cd14953 135 TGVAVDAAGNLYVADTGN---HRIRKITPDGVVTTVAGTGGA-----------GYAGDGPATAAQFNNPTGVAVDAAGNL 200
|
250
....*....|....
gi 1907188904 1544 YIADLGNIRIRAVS 1557
Cdd:cd14953 201 YVADRGNHRIRKIT 214
|
|
| NHL |
cd05819 |
NHL repeat unit of beta-propeller proteins; The NHL(NCL-1, HT2A and LIN-41)-repeat is found in ... |
1223-1554 |
5.31e-19 |
|
NHL repeat unit of beta-propeller proteins; The NHL(NCL-1, HT2A and LIN-41)-repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures. The repeats have a catalytic activity in Peptidyl-glycine alpha-amidating monooxygenase; proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localized to the repeats. Tripartite motif-containing protein 32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats.
Pssm-ID: 271320 [Multi-domain] Cd Length: 269 Bit Score: 89.69 E-value: 5.31e-19
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1223 GQADGnKLLAPVALACGIDGSLYVGDF--NYVRRIFPSGN-VTSVLELSSNPAHRYY---LATDPvTGDLYVSDTNTRRI 1296
Cdd:cd05819 1 GTGPG-ELNNPQGIAVDSSGNIYVADTgnNRIQVFDPDGNfITSFGSFGSGDGQFNEpagVAVDS-DGNLYVADTGNHRI 78
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1297 YRpksltgakdLTKNAEVVAGTGeqclpfdearcGDGGKAVEatLMSPKGMAIDKNGLIYFVDgTM---IRKVDQNGIIS 1373
Cdd:cd05819 79 QK---------FDPDGNFLASFG-----------GSGDGDGE--FNGPRGIAVDSSGNIYVAD-TGnhrIQKFDPDGEFL 135
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1374 TLLGSNDLTSARpltcdtsmhisqvrLEWPTDLAINPmDNSIYVLDnnvvlqiTENRQVRI--AAGRPMHcQVPGVEYPV 1451
Cdd:cd05819 136 TTFGSGGSGPGQ--------------FNGPTGVAVDS-DGNIYVAD-------TGNHRIQVfdPDGNFLT-TFGSTGTGP 192
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1452 GKhavqttLESATAIAVSYSGVLYITETDEkkiNRIRQVTTDGEISlvagipsecdckndancdcYQSGDGYAKDAKLNA 1531
Cdd:cd05819 193 GQ------FNYPTGIAVDSDGNIYVADSGN---NRVQVFDPDGAGF-------------------GGNGNFLGSDGQFNR 244
|
330 340
....*....|....*....|...
gi 1907188904 1532 PSSLAASPDGTLYIADLGNIRIR 1554
Cdd:cd05819 245 PSGLAVDSDGNLYVADTGNNRIQ 267
|
|
| NHL |
cd05819 |
NHL repeat unit of beta-propeller proteins; The NHL(NCL-1, HT2A and LIN-41)-repeat is found in ... |
1222-1487 |
7.09e-16 |
|
NHL repeat unit of beta-propeller proteins; The NHL(NCL-1, HT2A and LIN-41)-repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures. The repeats have a catalytic activity in Peptidyl-glycine alpha-amidating monooxygenase; proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localized to the repeats. Tripartite motif-containing protein 32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats.
Pssm-ID: 271320 [Multi-domain] Cd Length: 269 Bit Score: 80.44 E-value: 7.09e-16
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1222 NGQADGNkLLAPVALACGIDGSLYVGDF--NYVRRIFPSGNVTSVLELSSNPAHRYY----LATDPvTGDLYVSDTNTRR 1295
Cdd:cd05819 47 FGSGDGQ-FNEPAGVAVDSDGNLYVADTgnHRIQKFDPDGNFLASFGGSGDGDGEFNgprgIAVDS-SGNIYVADTGNHR 124
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1296 IYRpksltgakdLTKNAEVVAGTGeqclpfdearcgdGGKAVEATLMSPKGMAIDKNGLIYFVDGT--MIRKVDQNGIIS 1373
Cdd:cd05819 125 IQK---------FDPDGEFLTTFG-------------SGGSGPGQFNGPTGVAVDSDGNIYVADTGnhRIQVFDPDGNFL 182
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1374 TLLGSNDLTSARpltcdtsmhisqvrLEWPTDLAINPMDNsIYVLD--NNVVLQITENRQVRIAAGRPMhCQVPGVEYPV 1451
Cdd:cd05819 183 TTFGSTGTGPGQ--------------FNYPTGIAVDSDGN-IYVADsgNNRVQVFDPDGAGFGGNGNFL-GSDGQFNRPS 246
|
250 260 270
....*....|....*....|....*....|....*.
gi 1907188904 1452 GkhavqttlesataIAVSYSGVLYITETDEKKINRI 1487
Cdd:cd05819 247 G-------------LAVDSDGNLYVADTGNNRIQVF 269
|
|
| Vgb |
COG4257 |
Streptogramin lyase [Defense mechanisms]; |
1226-1504 |
3.54e-10 |
|
Streptogramin lyase [Defense mechanisms];
Pssm-ID: 443399 [Multi-domain] Cd Length: 270 Bit Score: 63.50 E-value: 3.54e-10
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1226 DGNKLLAPVALACGIDGSLYVGDF--NYVRRIFPSGNVTSVLELSSNPAHRYYLATDPvTGDLYVSDTNTRRIYRpksLT 1303
Cdd:COG4257 54 PLGGGSGPHGIAVDPDGNLWFTDNgnNRIGRIDPKTGEITTFALPGGGSNPHGIAFDP-DGNLWFTDQGGNRIGR---LD 129
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1304 gakdlTKNAEVVAGTgeqcLPFDEARcgdggkaveatlmsPKGMAIDKNGLIYFVD--GTMIRKVD-QNGIISTLLGSND 1380
Cdd:COG4257 130 -----PATGEVTEFP----LPTGGAG--------------PYGIAVDPDGNLWVTDfgANAIGRIDpDTGTLTEYALPTP 186
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1381 LTSarpltcdtsmhisqvrlewPTDLAINPmDNSIYVLD--NNVVLQITEnrqvriAAGRpmhcqvpgveypVGKHAVQT 1458
Cdd:COG4257 187 GAG-------------------PRGLAVDP-DGNLWVADtgSGRIGRFDP------KTGT------------VTEYPLPG 228
|
250 260 270 280
....*....|....*....|....*....|....*....|....*.
gi 1907188904 1459 TLESATAIAVSYSGVLYITETDekkINRIRQVTTDGEISLVAgIPS 1504
Cdd:COG4257 229 GGARPYGVAVDGDGRVWFAESG---ANRIVRFDPDTELTEYV-LPS 270
|
|
| Rhs_assc_core |
TIGR03696 |
RHS repeat-associated core domain; This model represents a conserved unique core sequence ... |
2376-2450 |
2.78e-09 |
|
RHS repeat-associated core domain; This model represents a conserved unique core sequence shared by large numbers of proteins. It is occasional in the Archaea (Methanosarcina barkeri) but common in bacteria and eukaryotes. Most fall into two large classes. One class consists of long proteins in which two classes of repeats are abundant: an FG-GAP repeat (pfam01839) class, and an RHS repeat (pfam05593) or YD repeat (TIGR01643). This class includes secreted bacterial insecticidal toxins and intercellular signalling proteins such as the teneurins in animals. The other class consists of uncharacterized proteins shorter than 400 amino acids, where this core domain of about 75 amino acids tends to occur in the N-terminal half. Over twenty such proteins are found in Pseudomonas putida alone; little sequence similarity or repeat structure is found among these proteins outside the region modeled by this domain.
Pssm-ID: 274730 [Multi-domain] Cd Length: 77 Bit Score: 55.58 E-value: 2.78e-09
10 20 30 40 50 60 70
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 1907188904 2376 YTAYGEIYFDSNVDFQLvIGFHGGLYDPLTKLIHFGERDYDILAGRWTTPDieiwkRIGKDpAPFNLYMFRNNNP 2450
Cdd:TIGR03696 1 YDPYGEVLSESGAAPNP-LRFTGQYYDAETGLYYNGARYYDPELGRFLSPD-----PIGLG-GGLNLYAYVGNNP 68
|
|
| NHL_PKND_like |
cd14952 |
NHL repeat domain of the protein kinase PknD; PknD is a mycobacterial transmembrane protein ... |
1277-1553 |
4.65e-09 |
|
NHL repeat domain of the protein kinase PknD; PknD is a mycobacterial transmembrane protein with a cytosolic kinase domain and an extracellular sensor domain that contains NHL repeats. It plays a key role in the development of central nervous system tuberculosis, by mediating the invasion of host brain endothelia. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.
Pssm-ID: 271322 [Multi-domain] Cd Length: 247 Bit Score: 59.53 E-value: 4.65e-09
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1277 LATDPvTGDLYVSDTNTRRIYRpksltgakdltknaeVVAGTGEQC-LPFDEarcgdggkaveatLMSPKGMAIDKNGLI 1355
Cdd:cd14952 15 VAVDA-AGNVYVADSGNNRVLK---------------LAAGSTTQTvLPFTG-------------LYQPQGVAVDAAGTV 65
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1356 YFVDGtmirkvDQNGIISTLLGSNDLTsARPLTcdtsmhisqvRLEWPTDLAINPMDNsIYVLDNnvvlqiTENRQVRIA 1435
Cdd:cd14952 66 YVTDF------GNNRVLKLAAGSTTQT-VLPFT----------GLNDPTGVAVDAAGN-VYVADT------GNNRVLKLA 121
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1436 AGRPMHCQVPgveypvgkhavQTTLESATAIAVSYSGVLYITETDEkkiNRIRQvttdgeisLVAGipsecdckndANCD 1515
Cdd:cd14952 122 AGSNTQTVLP-----------FTGLSNPDGVAVDGAGNVYVTDTGN---NRVLK--------LAAG----------STTQ 169
|
250 260 270
....*....|....*....|....*....|....*...
gi 1907188904 1516 CYQSGDGyakdakLNAPSSLAASPDGTLYIADLGNIRI 1553
Cdd:cd14952 170 TVLPFTG------LNSPSGVAVDTAGNVYVTDHGNNRV 201
|
|
| NHL_like_2 |
cd14957 |
Uncharacterized NHL-repeat domain in bacterial and archaeal proteins; The NHL (NCL-1, HT2A and ... |
1341-1651 |
1.19e-08 |
|
Uncharacterized NHL-repeat domain in bacterial and archaeal proteins; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.
Pssm-ID: 271327 [Multi-domain] Cd Length: 280 Bit Score: 58.82 E-value: 1.19e-08
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1341 LMSPKGMAIDKNGLIYFVD--GTMIRKVDQNGIISTLLGSNDltsarpltcdtsmhISQVRLEWPTDLAINPMDNsIYVL 1418
Cdd:cd14957 17 FNTPRGIAVDSAGNIYVADtgNNRIQVFTSSGVYSYSIGSGG--------------TGSGQFNSPYGIAVDSNGN-IYVA 81
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1419 DNNvvlqitENR-QVRIAAGrpmhcqvpGVEYPVGKHAVQTT-LESATAIAVSYSGVLYITETDEkkiNRIRQVTTDGEI 1496
Cdd:cd14957 82 DTD------NNRiQVFNSSG--------VYQYSIGTGGSGDGqFNGPYGIAVDSNGNIYVADTGN---HRIQVFTSSGTF 144
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1497 slvagipsecdckndancdCYQSGDGYAKDAKLNAPSSLAASPDGTLYIADLGNIRIRavsknkpllnsmnfyevasptd 1576
Cdd:cd14957 145 -------------------SYSIGSGGTGPGQFNGPQGIAVDSDGNIYVADTGNHRIQ---------------------- 183
|
250 260 270 280 290 300 310
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 1907188904 1577 qelyIFDINGTHQYTV-SLVTGDYLynFSYSNDNDVtavtDSNGNTLRIRRDPNRmpVRVVSPDNqVIWLTIGTNG 1651
Cdd:cd14957 184 ----VFTSSGTFQYTFgSSGSGPGQ--FSDPYGIAV----DSDGNIYVADTGNHR--IQVFTSSG-AYQYSIGTSG 246
|
|
| acid_disulf_rpt |
NF033662 |
acidic double-disulfide repeat; The acidic double-disulfide repeat is an Asp-rich repeat with ... |
834-864 |
9.71e-08 |
|
acidic double-disulfide repeat; The acidic double-disulfide repeat is an Asp-rich repeat with four nearly invariant Cys residues in a repeat length of about 35 amino acids.
Pssm-ID: 411265 [Multi-domain] Cd Length: 32 Bit Score: 49.82 E-value: 9.71e-08
10 20 30
....*....|....*....|....*....|.
gi 1907188904 834 AMETLCTDSKDNEGDGLIDCMDPDCCLQSSC 864
Cdd:NF033662 2 ATDTTCSDGIDNDGDGLTDCADPDCAGNPVC 32
|
|
| DUF5885 |
pfam19232 |
Family of unknown function (DUF5885); This is a family of uncharacterized proteins of unknown ... |
565-750 |
1.77e-07 |
|
Family of unknown function (DUF5885); This is a family of uncharacterized proteins of unknown function found in viruses.
Pssm-ID: 437064 Cd Length: 265 Bit Score: 55.01 E-value: 1.77e-07
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 565 CHGNGECvsGTCH-CFPGFLGPDCSraACPVLCSGNGQ----------YSKGRC----LCFSGwkgTECDVPTTQCI-DP 628
Cdd:pfam19232 34 CTTDAQC--GTCMtCVAGACTPKAS--CCGGVTCGAGQtcdaktntcvYVKGYCsadhPCPSG---SACDTAKNACIaQP 106
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 629 QCG---GRGiCIMG-------------------SCACNSGYK-GENCE--------EADCLDP---------------GC 662
Cdd:pfam19232 107 PYGpdsGKG-CVRGfgawiweldpatnsgvwrcRCANGSLYNsAHECSpladqtlcAAENLDPnalvpassvpafaayGW 185
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 663 SNHGVCIH-------------GECHCNPGWGGSNCEILKTmcadqCSGHGTYLQESGSCTCDPNWTGpdcsneicsvdcg 729
Cdd:pfam19232 186 GNQPVLINkstagaavpsplaGVCPCKPGWAGGSCTEDRT-----CNGRGTWNETTGQCACNIDFSG------------- 247
|
250 260
....*....|....*....|....
gi 1907188904 730 shgvcmGGSCRCEEG---WTGPAC 750
Cdd:pfam19232 248 ------HNSCGDDNNctsWTGPRC 265
|
|
| NHL_PKND_like |
cd14952 |
NHL repeat domain of the protein kinase PknD; PknD is a mycobacterial transmembrane protein ... |
1230-1484 |
1.03e-06 |
|
NHL repeat domain of the protein kinase PknD; PknD is a mycobacterial transmembrane protein with a cytosolic kinase domain and an extracellular sensor domain that contains NHL repeats. It plays a key role in the development of central nervous system tuberculosis, by mediating the invasion of host brain endothelia. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.
Pssm-ID: 271322 [Multi-domain] Cd Length: 247 Bit Score: 52.60 E-value: 1.03e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1230 LLAPVALACGIDGSLYVGDF--NYVRRIFPSGNVTSVLELS--SNPAHryyLATDPVtGDLYVSDTNTRRIyrpksltga 1305
Cdd:cd14952 51 LYQPQGVAVDAAGTVYVTDFgnNRVLKLAAGSTTQTVLPFTglNDPTG---VAVDAA-GNVYVADTGNNRV--------- 117
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1306 kdltknAEVVAGTGEQC-LPFdearcgdggkaveATLMSPKGMAIDKNGLIYFVDGtmirkvDQNGIISTLLGSNDLTsA 1384
Cdd:cd14952 118 ------LKLAAGSNTQTvLPF-------------TGLSNPDGVAVDGAGNVYVTDT------GNNRVLKLAAGSTTQT-V 171
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1385 RPLTCDTSmhisqvrlewPTDLAINPMDNsIYVLDNNvvlqitENRQVRIAAGRPMHCQVP--GVEYPVGkhavqttles 1462
Cdd:cd14952 172 LPFTGLNS----------PSGVAVDTAGN-VYVTDHG------NNRVLKLAAGSTTPTVLPftGLNGPLG---------- 224
|
250 260
....*....|....*....|..
gi 1907188904 1463 ataIAVSYSGVLYITETDEKKI 1484
Cdd:cd14952 225 ---VAVDAAGNVYVADRGNDRV 243
|
|
| NHL_like_3 |
cd14956 |
Uncharacterized NHL-repeat domain in bacterial proteins; The NHL (NCL-1, HT2A and LIN-41) ... |
1284-1558 |
1.20e-06 |
|
Uncharacterized NHL-repeat domain in bacterial proteins; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.
Pssm-ID: 271326 [Multi-domain] Cd Length: 274 Bit Score: 52.67 E-value: 1.20e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1284 GDLYVSDTNTRRIyrpksltgakdltknaEVVAGTGEQCLPFDEARCGDGGkaveatLMSPKGMAIDKNGLIYFVDGT-- 1361
Cdd:cd14956 24 DNVYVADARNGRI----------------QVFDKDGTFLRRFGTTGDGPGQ------FGRPRGLAVDKDGWLYVADYWgd 81
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1362 MIRKVDQNGIISTLLGSNdltSARPLTCDTsmhisqvrlewPTDLAINPmDNSIYVLD--NNVVLQITENRQVRIAAGRP 1439
Cdd:cd14956 82 RIQVFTLTGELQTIGGSS---GSGPGQFNA-----------PRGVAVDA-DGNLYVADfgNQRIQKFDPDGSFLRQWGGT 146
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1440 mhcqvpgvEYPVGKhavqttLESATAIAVSYSGVLYITETdekKINRIRQVTTDGEISLVAGIPSecdckndancdcyqS 1519
Cdd:cd14956 147 --------GIEPGS------FNYPRGVAVDPDGTLYVADT---YNDRIQVFDNDGAFLRKWGGRG--------------T 195
|
250 260 270
....*....|....*....|....*....|....*....
gi 1907188904 1520 GDGyakdaKLNAPSSLAASPDGTLYIADLGNIRIRAVSK 1558
Cdd:cd14956 196 GPG-----QFNYPYGIAIDPDGNVFVADFGNNRIQKFTA 229
|
|
| Vgb |
COG4257 |
Streptogramin lyase [Defense mechanisms]; |
1275-1554 |
1.94e-06 |
|
Streptogramin lyase [Defense mechanisms];
Pssm-ID: 443399 [Multi-domain] Cd Length: 270 Bit Score: 51.94 E-value: 1.94e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1275 YYLATDPvTGDLYVSDTNTRRIYRpksltgakdltknaeVVAGTGEqclpFDEARCGDGGkaveatlmSPKGMAIDKNGL 1354
Cdd:COG4257 20 RDVAVDP-DGAVWFTDQGGGRIGR---------------LDPATGE----FTEYPLGGGS--------GPHGIAVDPDGN 71
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1355 IYFVDGT--MIRKVD-QNGIISTLLGSNDLTSarpltcdtsmhisqvrlewPTDLAINPmDNSIYVLD--NNVVLQIT-E 1428
Cdd:COG4257 72 LWFTDNGnnRIGRIDpKTGEITTFALPGGGSN-------------------PHGIAFDP-DGNLWFTDqgGNRIGRLDpA 131
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1429 NRQVRiaagrpmhcqvpgvEYPVGKHAVQTTlesatAIAVSYSGVLYITETdekKINRIRQVTTD-GEISLvagipsecd 1507
Cdd:COG4257 132 TGEVT--------------EFPLPTGGAGPY-----GIAVDPDGNLWVTDF---GANAIGRIDPDtGTLTE--------- 180
|
250 260 270 280
....*....|....*....|....*....|....*....|....*..
gi 1907188904 1508 ckndancdcyqsgdgYAKDAKLNAPSSLAASPDGTLYIADLGNIRIR 1554
Cdd:COG4257 181 ---------------YALPTPGAGPRGLAVDPDGNLWVADTGSGRIG 212
|
|
| C_rich_MXAN6577 |
NF041328 |
MXAN_6577-like cysteine-rich domain; |
616-770 |
5.48e-06 |
|
MXAN_6577-like cysteine-rich domain;
Pssm-ID: 469225 [Multi-domain] Cd Length: 145 Bit Score: 48.60 E-value: 5.48e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 616 TECDVPTTQCIDPQ--CGGRGICIMGScACNSGykgeNCEEAdcldpgCSNHGVCIHGECHCNPGwggsnceilKTMCAD 693
Cdd:NF041328 12 AGCPEPGAVCPEGLsvCGGACVDLRSD-PSNCG----ACGVA------CGAGQTCVAGACGCGPG---------TVACGG 71
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 694 QCSGHGTylqesgsctcDPNWTGPdcsneiCSVDCGSHGVCMGGSCR--CEEGWT--GPAC--------NQRACHPRCAE 761
Cdd:NF041328 72 ACVDTAS----------DPAHCGA------CGAACAPGQVCEGGACReaCSEGLTrcGGACvdlatdplHCGACGVACDP 135
|
....*....
gi 1907188904 762 HGTCKDGKC 770
Cdd:NF041328 136 GESCRGGAC 144
|
|
| DSL |
pfam01414 |
Delta serrate ligand; This family has been redefined to correspond to the EGF-like domain ... |
741-784 |
2.04e-05 |
|
Delta serrate ligand; This family has been redefined to correspond to the EGF-like domain defined by structure.
Pssm-ID: 460202 Cd Length: 46 Bit Score: 43.77 E-value: 2.04e-05
10 20 30 40
....*....|....*....|....*....|....*....|....*..
gi 1907188904 741 CEEGWTGPACNqRACHPRCAE--HGTC-KDGKCECSQGWNGEHCTIA 784
Cdd:pfam01414 1 CDENYYGSTCS-KFCRPRDDKfgHYTCdANGNKVCLPGWTGPYCDKP 46
|
|
| PLN02919 |
PLN02919 |
haloacid dehalogenase-like hydrolase family protein |
1277-1560 |
3.91e-05 |
|
haloacid dehalogenase-like hydrolase family protein
Pssm-ID: 215497 [Multi-domain] Cd Length: 1057 Bit Score: 49.46 E-value: 3.91e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1277 LATDPVTGDLYVSDTNTRRIYrpksltgAKDLTKNAEV-VAGTGEQCL---PFDEArcgdggkaveaTLMSPKGMAID-K 1351
Cdd:PLN02919 573 LAIDLLNNRLFISDSNHNRIV-------VTDLDGNFIVqIGSTGEEGLrdgSFEDA-----------TFNRPQGLAYNaK 634
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1352 NGLIYFVD--GTMIRKVD-QNGIISTLLGS----NDLTSARPLTcdtsmhiSQVrLEWPTDLAINPMDNSIYV------- 1417
Cdd:PLN02919 635 KNLLYVADteNHALREIDfVNETVRTLAGNgtkgSDYQGGKKGT-------SQV-LNSPWDVCFEPVNEKVYIamagqhq 706
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1418 ------LD-------------------------------------NNVVLQITENRQVR-----------IAAGRPMhcq 1443
Cdd:PLN02919 707 iweyniSDgvtrvfsgdgyernlngssgtstsfaqpsgislspdlKELYIADSESSSIRaldlktggsrlLAGGDPT--- 783
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1444 VPGVEYPVGKH---AVQTTLESATAIAVSYSGVLYITETDEKKINRIRQVTtdGEISLVAGIPsecdckndancdcyQSG 1520
Cdd:PLN02919 784 FSDNLFKFGDHdgvGSEVLLQHPLGVLCAKDGQIYVADSYNHKIKKLDPAT--KRVTTLAGTG--------------KAG 847
|
330 340 350 360
....*....|....*....|....*....|....*....|..
gi 1907188904 1521 --DGYAKDAKLNAPSSLAASPDGTLYIADLGNIRIRAVSKNK 1560
Cdd:PLN02919 848 fkDGKALKAQLSEPAGLALGENGRLFVADTNNSLIRYLDLNK 889
|
|
| NHL |
cd05819 |
NHL repeat unit of beta-propeller proteins; The NHL (NCL-1, HT2A and LIN-41) repeat is found in ... |
1525-1651 |
5.73e-05 |
|
NHL repeat unit of beta-propeller proteins; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures. The repeats have catalytic activity in Peptidyl-glycine alpha-amidating monooxygenase; proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localized to the repeats. Tripartite motif-containing protein 32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats.
Pssm-ID: 271320 [Multi-domain] Cd Length: 269 Bit Score: 47.31 E-value: 5.73e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1525 KDAKLNAPSSLAASPDGTLYIADLGNIRIRAVSKN-KPLLN-------SMNFYE---VASPTDQELYI----------FD 1583
Cdd:cd05819 3 GPGELNNPQGIAVDSSGNIYVADTGNNRIQVFDPDgNFITSfgsfgsgDGQFNEpagVAVDSDGNLYVadtgnhriqkFD 82
|
90 100 110 120 130 140 150
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 1907188904 1584 INGTHQYTVSlVTGDYLYNFSY------SNDNDVtAVTDSNGNtlRIrrdpnrmpvRVVSPDNQVIwLTIGTNG 1651
Cdd:cd05819 83 PDGNFLASFG-GSGDGDGEFNGprgiavDSSGNI-YVADTGNH--RI---------QKFDPDGEFL-TTFGSGG 142
|
|
| NHL_like_1 |
cd14953 |
Uncharacterized NHL-repeat domain in bacterial proteins; This bacterial family of NHL-repeat ... |
1519-1559 |
8.37e-05 |
|
Uncharacterized NHL-repeat domain in bacterial proteins; This bacterial family of NHL-repeat domains is found in a variety of domain architectures. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.
Pssm-ID: 271323 [Multi-domain] Cd Length: 323 Bit Score: 47.52 E-value: 8.37e-05
10 20 30 40
....*....|....*....|....*....|....*....|.
gi 1907188904 1519 SGDGYAKDAKLNAPSSLAASPDGTLYIADLGNIRIRAVSKN 1559
Cdd:cd14953 12 FSGGGGTAARFNSPSGVAVDAAGNLYVADRGNHRIRKITPD 52
|
|
| DSL |
pfam01414 |
Delta serrate ligand; This family has been redefined to correspond to the EGF-like domain ... |
710-752 |
1.19e-04 |
|
Delta serrate ligand; This family has been redefined to correspond to the EGF-like domain defined by structure.
Pssm-ID: 460202 Cd Length: 46 Bit Score: 41.84 E-value: 1.19e-04
10 20 30 40
....*....|....*....|....*....|....*....|....*.
gi 1907188904 710 CDPNWTGPDCSNEiCSV--DCGSHGVC-MGGSCRCEEGWTGPACNQ 752
Cdd:pfam01414 1 CDENYYGSTCSKF-CRPrdDKFGHYTCdANGNKVCLPGWTGPYCDK 45
|
|
| C_rich_MXAN6577 |
NF041328 |
MXAN_6577-like cysteine-rich domain; |
724-825 |
4.81e-04 |
|
MXAN_6577-like cysteine-rich domain;
Pssm-ID: 469225 [Multi-domain] Cd Length: 145 Bit Score: 42.82 E-value: 4.81e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 724 CSVDCGSHGVCMGGSCRCEEGWT--GPAC--------NQRACHPRCAEHGTCKDGKCecsqgwngehctiahyldkivkd 793
Cdd:NF041328 45 CGVACGAGQTCVAGACGCGPGTVacGGACvdtasdpaHCGACGAACAPGQVCEGGAC----------------------- 101
|
90 100 110
....*....|....*....|....*....|....*....
gi 1907188904 794 kigyKEGCP-GLCNSNGRCT-LDQNGWHC-----VCQPG 825
Cdd:NF041328 102 ----REACSeGLTRCGGACVdLATDPLHCgacgvACDPG 136
|
|
| YvrE |
COG3386 |
Sugar lactone lactonase YvrE [Carbohydrate transport and metabolism]; Sugar lactone lactonase ... |
1236-1372 |
6.10e-04 |
|
Sugar lactone lactonase YvrE [Carbohydrate transport and metabolism]; Sugar lactone lactonase YvrE is part of the Pathway/BioSystem: Non-phosphorylated Entner-Doudoroff pathway.
Pssm-ID: 442613 [Multi-domain] Cd Length: 266 Bit Score: 44.11 E-value: 6.10e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1236 LACGIDGSLYVGDFNYVR------RIFPSGNVTSVLE--LSSN-----PAHRYylatdpvtgdLYVSDTNTRRIYR-PKS 1301
Cdd:COG3386 98 GVVDPDGRLYFTDMGEYLptgalyRVDPDGSLRVLADglTFPNgiafsPDGRT----------LYVADTGAGRIYRfDLD 167
|
90 100 110 120 130 140 150
....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 1907188904 1302 LTGAkdLTkNAEVVAgtgeqclpfdEARCGDGGkaveatlmsPKGMAIDKNGLIY--FVDGTMIRKVDQNGII 1372
Cdd:COG3386 168 ADGT--LG-NRRVFA----------DLPDGPGG---------PDGLAVDADGNLWvaLWGGGGVVRFDPDGEL 218
|
|
| EGF_2 |
pfam07974 |
EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins. |
565-587 |
8.48e-04 |
|
EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins.
Pssm-ID: 400365 Cd Length: 26 Bit Score: 38.87 E-value: 8.48e-04
|
| NHL_like_6 |
cd14962 |
Uncharacterized NHL-repeat domain in bacterial proteins; The NHL (NCL-1, HT2A and LIN-41) ... |
1334-1553 |
1.01e-03 |
|
Uncharacterized NHL-repeat domain in bacterial proteins; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.
Pssm-ID: 271332 [Multi-domain] Cd Length: 271 Bit Score: 43.73 E-value: 1.01e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1334 GKAVEATLMSPKGMAIDKNGLIYFVDGT--MIRKVDQNGIISTLLGSNDLtsarpltcdtsmhisQVRlewPTDLAINPM 1411
Cdd:cd14962 49 GNAGPNRFVSPIGVAIDANGNLYVSDAElgKVFVFDRDGKFLRAIGAGAL---------------FKR---PTGIAVDPA 110
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1412 DNSIYVLDnnvvlqiTENRQVRI--AAGRPMHcQVPgveyPVGKHAVQttLESATAIAVSYSGVLYITETDEKKINRI-- 1487
Cdd:cd14962 111 GKRLYVVD-------TLAHKVKVfdLDGRLLF-DIG----KRGSGPGE--FNLPTDLAVDRDGNLYVTDTMNFRVQIFda 176
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1488 --RQVTTDGEISLVAG---IPSECDCKNDAN---CDCYQS---------------GDGYAKDAKLNAPSSLAASPDGTLY 1544
Cdd:cd14962 177 dgKFLRSFGERGDGPGsfaRPKGIAVDSEGNiyvVDAAFDnvqifnpegellltvGGPGSGPGEFYLPSGIAIDKDDRIY 256
|
....*....
gi 1907188904 1545 IADLGNIRI 1553
Cdd:cd14962 257 VVDQFNRRI 265
|
|
| NHL_like_5 |
cd14963 |
Uncharacterized NHL-repeat domain in bacterial proteins; The NHL (NCL-1, HT2A and LIN-41) ... |
1229-1479 |
1.40e-03 |
|
Uncharacterized NHL-repeat domain in bacterial proteins; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.
Pssm-ID: 271333 [Multi-domain] Cd Length: 268 Bit Score: 43.05 E-value: 1.40e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1229 KLLAPVALACGIDGSLYVGDFnYVRRI--F-PSGNVTSVLelssnpAHRYYL--ATDPVT-----GDLYVSDTNTRRIYr 1298
Cdd:cd14963 54 EFKYPYGIAVDSDGNIYVADL-YNGRIqvFdPDGKFLKYF------PEKKDRvkLISPAGlaiddGKLYVSDVKKHKVI- 125
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1299 pksltgakdltknaeVVAGTGEQCLPFdearcGDGGKAvEATLMSPKGMAIDKNGLIYFVD--GTMIRKVDQNG-IISTL 1375
Cdd:cd14963 126 ---------------VFDLEGKLLLEF-----GKPGSE-PGELSYPNGIAVDEDGNIYVADsgNGRIQVFDKNGkFIKEL 184
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1376 LGSNDLTSArpltcdtsmhisqvrLEWPTDLAINPmDNSIYVLDN--NVVLQITENRQVRIAAGRpmhcqvPGVEypvgk 1453
Cdd:cd14963 185 NGSPDGKSG---------------FVNPRGIAVDP-DGNLYVVDNlsHRVYVFDEQGKELFTFGG------RGKD----- 237
|
250 260
....*....|....*....|....*.
gi 1907188904 1454 havQTTLESATAIAVSYSGVLYITET 1479
Cdd:cd14963 238 ---DGQFNLPNGLFIDDDGRLYVTDR 260
|
|
| YD_repeat_2x |
TIGR01643 |
YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular ... |
1881-1921 |
1.63e-03 |
|
YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular repeat found in Gram-negative, Gram-positive, and animal proteins. The repeat is named for a YD dipeptide, the most strongly conserved motif of the repeat. These repeats appear in general to be involved in binding carbohydrate; the chicken teneurin-1 YD-repeat region has been shown to bind heparin.
Pssm-ID: 273728 [Multi-domain] Cd Length: 42 Bit Score: 38.34 E-value: 1.63e-03
10 20 30 40
....*....|....*....|....*....|....*....|..
gi 1907188904 1881 YSSTGQ-IASIQRGTTSEKVDYDSQGRIVSRVFADGKTWSYT 1921
Cdd:TIGR01643 1 YDAAGRlTGSTDADGTTTRYTYDAAGRLVEITDADGGSTRYE 42
|
|
| EGF_2 |
pfam07974 |
EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins. |
662-684 |
1.79e-03 |
|
EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins.
Pssm-ID: 400365 Cd Length: 26 Bit Score: 37.71 E-value: 1.79e-03
|
| EGF_2 |
pfam07974 |
EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins. |
695-719 |
1.81e-03 |
|
EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins.
Pssm-ID: 400365 Cd Length: 26 Bit Score: 37.71 E-value: 1.81e-03
|
| Keratin_B2 |
pfam01500 |
Keratin, high sulfur B2 protein; High sulfur proteins are cysteine-rich proteins synthesized ... |
652-770 |
2.70e-03 |
|
Keratin, high sulfur B2 protein; High sulfur proteins are cysteine-rich proteins synthesized during the differentiation of hair matrix cells, and form hair fibres in association with hair keratin intermediate filaments. This family has been divided up into four regions, with the second region containing 8 copies of a short repeat. This family is also known as B2 or KAP1.
Pssm-ID: 366678 [Multi-domain] Cd Length: 161 Bit Score: 40.93 E-value: 2.70e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 652 CEEADCLDPGCSNHGVCihGECHCNPGWGGSNCeILKTMCADQCSGHGTYLQESGSCTCDPNwTGPDCSNEICSVDCGSH 731
Cdd:pfam01500 4 CGTSFCGFPTCSTGGTC--GSGCCQPCCCQSSC-CRPSCCQTSCCQPTTFQSSCCRPTCQPC-CQTSCCQPTCCQTSSCQ 79
|
90 100 110
....*....|....*....|....*....|....*....
gi 1907188904 732 GVCMGGSCRCEEGWTGPACNQRACHPRCAEHGTCKDGKC 770
Cdd:pfam01500 80 TGCGGIGYGQEGSSGAVSSRTRWCRPDCRVEGTCLPPCC 118
|
|
| NHL_like_5 |
cd14963 |
Uncharacterized NHL-repeat domain in bacterial proteins; The NHL (NCL-1, HT2A and LIN-41) ... |
1528-1624 |
3.01e-03 |
|
Uncharacterized NHL-repeat domain in bacterial proteins; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.
Pssm-ID: 271333 [Multi-domain] Cd Length: 268 Bit Score: 42.28 E-value: 3.01e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1907188904 1528 KLNAPSSLAASPDGTLYIADLGNIRIRAVSKN----------KPLLNSM----------NFYeVASPTDQELYIFDINGT 1587
Cdd:cd14963 54 EFKYPYGIAVDSDGNIYVADLYNGRIQVFDPDgkflkyfpekKDRVKLIspaglaiddgKLY-VSDVKKHKVIVFDLEGK 132
|
90 100 110 120
....*....|....*....|....*....|....*....|..
gi 1907188904 1588 HQYTVSLVtGDYLYNFSYSN----DNDVT-AVTDSNGNtlRI 1624
Cdd:cd14963 133 LLLEFGKP-GSEPGELSYPNgiavDEDGNiYVADSGNG--RI 171
|
|
| EGF_CA |
cd00054 |
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ... |
656-685 |
3.77e-03 |
|
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function, and calcium-binding sites have been found at the N-terminus of particular EGF-like domains; calcium binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non-calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Pssm-ID: 238011 Cd Length: 38 Bit Score: 37.23 E-value: 3.77e-03
10 20 30
....*....|....*....|....*....|....*
gi 1907188904 656 DCLDPG-CSNHGVCIHGE----CHCNPGWGGSNCE 685
Cdd:cd00054 4 ECASGNpCQNGGTCVNTVgsyrCSCPPGYTGRNCE 38
|
|
| EGF_CA |
cd00054 |
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ... |
802-831 |
4.64e-03 |
|
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function, and calcium-binding sites have been found at the N-terminus of particular EGF-like domains; calcium binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non-calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Pssm-ID: 238011 Cd Length: 38 Bit Score: 36.85 E-value: 4.64e-03
10 20 30
....*....|....*....|....*....|
gi 1907188904 802 PGLCNSNGRCTLDQNGWHCVCQPGWRGAGC 831
Cdd:cd00054 8 GNPCQNGGTCVNTVGSYRCSCPPGYTGRNC 37
|
|
| RHS_repeat |
pfam05593 |
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be ... |
1673-1704 |
6.63e-03 |
|
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. The 3D structure of an RHS-repeat-containing protein (the B and C components of an ABC toxin complex) has been determined. The RHS repeats form an extended strip of beta-sheet that spirals around to form a hollow shell, encapsulating the variable C-terminal domain.
Pssm-ID: 461685 [Multi-domain] Cd Length: 37 Bit Score: 36.42 E-value: 6.63e-03
10 20 30
....*....|....*....|....*....|..
gi 1907188904 1673 GLLATKSDETGWTTFFDYDSEGRLTNVTFPTG 1704
Cdd:pfam05593 5 GRLTSVTDPDGRVTTYTYDAAGRLTAVTDPDG 36
|
|
| YD_repeat_2x |
TIGR01643 |
YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular ... |
1669-1707 |
8.68e-03 |
|
YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular repeat found in Gram-negative, Gram-positive, and animal proteins. The repeat is named for a YD dipeptide, the most strongly conserved motif of the repeat. These repeats appear in general to be involved in binding carbohydrate; the chicken teneurin-1 YD-repeat region has been shown to bind heparin.
Pssm-ID: 273728 [Multi-domain] Cd Length: 42 Bit Score: 36.41 E-value: 8.68e-03
10 20 30
....*....|....*....|....*....|....*....
gi 1907188904 1669 HGNSGLLATKSDETGWTTFFDYDSEGRLTNVTFPTGVVT 1707
Cdd:TIGR01643 1 YDAAGRLTGSTDADGTTTRYTYDAAGRLVEITDADGGST 39
|
|
|