Medical College of Georgia, School of Dentistry, Department of Oral Biology and Maxillofacial Pathology, 1120 15th Street, Augusta, GA 30912; ddickins@mail.mcg.edu
(I) Introduction (A) OVERVIEW OF PEPTIDASES (B) OVERVIEW OF CPS (C) REGULATION OF CPS AND OTHER PEPTIDASES (II) Clan CA (A) EVOLUTION OF CLAN CA ENZYMES (B) GENERAL PROPERTIES OF PAPAIN-RELATED (FAMILY C1A) CPS (1) Structure and activity (2) The proregion (C) BIOCHEMICAL PROPERTIES, EXPRESSION, AND NORMAL FUNCTIONS OF MAMMALIAN CPS OF SUBFAMILY C1A (1) Widely expressed cathepsins (a) Cathepsins B, H, and L (i) Properties and tissue distribution (ii) Functions of cathepsins B, H, and L (iii) Cathepsins B, H, and L in the oral cavity (b) Dipeptidyl peptidase I (DPP I) (i) Properties and tissue distribution (ii) Functions of DPP I and its role in pre-pubertal periodontitis and Papillon-Lefèvre and Haim-Munk syndromes (c) Cathepsins O and X (i) Properties and tissue distribution (ii) Functions (d) Cathepsin F (i) Properties and tissue distribution (ii) Functions of cathepsin F (2) Tissue-specific cathepsins (a) Cathepsin S (i) Properties and tissue distribution (ii) Functions of cathepsin S and its role in antigen presentation (iii) Potential roles for cathepsin S in oral tissues and the oral cavity (b) Cathepsin K (i) Properties and tissue distribution (ii) Potential functions of cathepsin K in oral tissues (c) Lymphopain (d) Cathepsin V and other cathepsin L-like sequences (i) Properties and tissue distribution (ii) Functions (e) Cathepsins J, P, CLRP, M, Q, R, and the testins (i) Properties and tissue distribution (ii) Functions (III) Calpains (IV) Mammalian Peptidases of Clan CD (A) THE LEGUMAINS (B) THE CASPASES (i) Properties (ii) Function of caspases in inflammation and apoptosis (iii) Apoptosis and oral tissues (V) Mechanisms of Exposure of Tissues to Host CPs and the Consequences (i) RELEASE OF MATURE ENZYMES (ii) RELEASE OF PROENZYMES (iii) CONTROL OF RELEASED ENZYMES (VI) The Proteolytic Cascade: Interactions between Enzymes and Their Inhibitors (VII) Interaction of Host CPs with the Immune System (VIII) CPs and Sjögren's Syndrome (IX) CPs and Periodontal Disease (X) CPs in the Oral Cavity (XI) CPs and Tooth Development and Movement (XII) CPs and Cancer (XIII) CPs and Arthritis (XIV) Summary and Conclusions REFERENCES
| Abstract |
|---|
Key words. Cysteine protease, peptidase, evolution, cathepsin, oral tissues, human
| (I) Introduction |
|---|
The number of identified mammalian CPs has grown considerably in the past few years. The purpose of this review is to summarize the salient features of these enzymes, concentrating on known (or potential) functions, and to indicate relevance to oral tissues. In addition, the potential for interactions with other peptidases (and their inhibitors) to form proteolytic networks will be examined. It is important to note that while some endogenous CPs have been shown to be involved in processes such as inflammation, antigen presentation, bone remodeling, and cancer, in only a few cases has their role been examined in oral tissues. Further, there are several CPs whose function remains to be established, but whose properties warrant investigation of potential oral sources. Exogenous sources of CPs (e.g., from pathogens) and the cystatins themselves will be the subjects of a separate review.
(A) OVERVIEW OF PEPTIDASES
Enzymatic cleavage of peptide bonds is fundamental to almost every aspect of life, and peptidases represent about 2% of all gene products. Examples can be found in digestion, blood coagulation and fibrinolysis, processing of preproproteins such as collagen, immune function, development, and apoptosis (reviewed in Twining, 1994). Not surprisingly, peptidase genes are found in the genomes of all cellular organisms (and several types of virus), and there are arrays of proteolytic enzymes distributed in cellular and tissue compartments. Peptide bond scission proceeds via a nucleophilic attack on the carbonyl carbon, followed by a general acid-base hydrolysis. The general term peptidase is preferred for enzymes that catalyze this reaction, although protease and proteinase are commonly found in the literature. Peptidases are generally grouped into five major types (cysteine, serine, threonine, aspartate, and metalloproteinase) according to the mechanism used to generate the nucleophile in the active site (reviewed in Barrett et al., 1998). In the CPs, an activated cysteine residue is used as the nucleophile and a histidine residue as the proton donor. In some enzymes, a third residue serves to orient the His residue.
The CPs comprise a complex set of enzymes (reviewed in Rawlings and Barrett, 1994; Kirschke et al., 1995; Chapman et al., 1997; Barrett et al., 1998). CPs of various physical and biochemical types are found in all kingdoms of organisms, indicating that they are among the most ancient proteins. Phylogenetic analysis is proving to be a powerful tool for elucidating the relationships between and among the large numbers of CP-like protein sequences that have been identified thus far, and phylogeny now provides the foundation for classifying the CPs (and other peptidases) (Rawlings and Barrett, 1993, 1994; see Barrett et al., 1998, for a detailed compilation of peptidases using this scheme, and detailed descriptions of their enzymatic properties). However, in the literature, descriptions of the different groups of CPs and their relationships to each other are complicated by differences in usage of the hierarchical phylogenetic terms class, superfamily, family, and subfamily. For example, the latter three terms have all been applied to papain-like cysteine peptidases (Karrer et al., 1993; Rawlings and Barrett, 1993; Berti and Storer, 1995). Used in a phylogenetic context (that is, where evolutionary relationships are considered), these terms implicitly mean a grouping based on the position of a node on a phylogenetic tree that links all members of the group, and reflects the degree of divergence of that group from other groups. Strictly, a family of proteins (or matching domains of chimeric proteins) should be a monophyletic group; that is, the members share a most recent common ancestor that is not an ancestor of one or more proteins not included in the group. The term subfamily is often used to denote clear divisions within a family, such as functional differences in proteins of otherwise similar sequences. 
The term superfamily is often used to denote groups of proteins that have relatively low (but detectable) sequence similarity to each other, but which have structural and functional properties consistent with a common evolutionary origin. In the system of Rawlings and Barrett, the term family is used to denote recognizable groups such as the papain-like enzymes. In a family, every member has a statistically significant relationship to at least one other member of the family (at least in the sequence of residues involved in catalytic activity), which implies evolution from a common ancestral peptidase. Deep divergences within the phylogenetic tree generally warrant the designation of the main branches as subfamilies. The term clan denotes groups of families for which there are indications of evolutionary relationships (e.g., active site arrangements, common three-dimensional structures) but which lack statistically significant similarities in sequence. CP clans are designated CA, CD, etc., and families by C+number (e.g., C12).
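As a small illustration of the naming convention just described (a catalytic-type letter plus a family number, optionally followed by a subfamily letter as in C1A), the sketch below splits such an identifier into its parts. The function name `parse_family` and the `PeptidaseFamily` record are hypothetical helpers introduced here for illustration only; they are not part of any published classification software, and the accepted pattern is an assumption based solely on the examples given in the text.

```python
import re
from typing import NamedTuple, Optional


class PeptidaseFamily(NamedTuple):
    """Parts of a family identifier such as 'C12' or 'C1A' (hypothetical record)."""
    catalytic_type: str       # e.g. "C" for cysteine peptidases
    family_number: int        # e.g. 12 in "C12"
    subfamily: Optional[str]  # e.g. "A" in "C1A"; None when no subfamily is given


# Assumed pattern: one capital letter, a number, and an optional subfamily letter.
_FAMILY_RE = re.compile(r"^([A-Z])(\d+)([A-Z])?$")


def parse_family(identifier: str) -> PeptidaseFamily:
    """Split a family identifier into catalytic type, number, and subfamily."""
    match = _FAMILY_RE.match(identifier)
    if match is None:
        # Clan names such as "CA" or "CD" carry no family number and are rejected.
        raise ValueError(f"not a family identifier: {identifier!r}")
    letter, number, subfamily = match.groups()
    return PeptidaseFamily(letter, int(number), subfamily)
```

Clan designations such as CA or CD are deliberately rejected by this sketch, since they name groups of families rather than a single family.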
(B) OVERVIEW OF CPS
From sequence and structural comparisons, it is clear that several types of CPs had independent origins, and there are at least six distinct clans consisting of 43 families, of which over half are from viruses (Rawlings and Barrett, 2000). Most and perhaps all clans represent convergent evolution, since in the majority of cases the enzymes have been shown to have a distinctly different organization of catalytic residues. Enzymes in clan CA (papain-related) have the catalytic residues in the order Cys...His...Asn/Asp, while enzymes in clan CD (legumain-related) have an active site catalytic dyad with a His-Gly-spacer-Ala-Cys motif (Chen JM et al., 1998). In the oral cavity and the nasopharynx, the substantial majority of CPs encountered, whether from the host, from viruses, bacteria, parasitic protozoa or helminths, or the diet, are from these two clans.
CPs are best known from members of the CA clan that were purified by classic biochemical techniques: the plant enzyme papain and the related mammalian lysosomal cathepsins B, H, L, and DPP I. (It should be noted that the term cathepsin is a general term for a peptidase, especially lysosomal, with an acidic pH optimum that is involved in protein degradation, and is not restricted to peptidases of any one catalytic type. For example, cathepsin D is an aspartic peptidase.) More recently, molecular cloning has been used to identify other related mammalian CP cathepsins (primarily human), and now the human genome is known to contain at least 11 related, but distinct, CP cathepsins: B, F, H, K, L, O, S, V, X, DPP I, and lymphopain. At this time, a search of the near-complete human genome database does not reveal any additional functional genes (data not shown), and thus these cathepsins may represent the entire human complement of enzymes most closely related to papain. Most of the 11 CP cathepsins evolved relatively early and are present in all mammals. The literature concerning these proteins has been greatly complicated by the use of the same letter to designate different proteins, and different letters to designate the same protein. Some of the various designations, and presently accepted or recommended ones, are summarized in the Table, together with the human chromosomal locations. Amino acid identity between and among these mammalian cathepsins is high in the vicinity of the active site, but overall levels of similarity generally range from 20 to 60% (Wiederanders et al., 1992). However, the recently evolved cathepsin V shares 78% similarity with cathepsin L (see below). Other cathepsin activities have been described in the literature, but their relationship to the 11 listed above is uncertain (Kirschke et al., 1995). 
In addition to other, evolutionarily independent, types of CPs in the human genome (e.g., clan CD enzymes), there are other CPs, such as cancer procoagulant, that remain to be characterized.
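Percentages such as the 20 to 60% and 78% figures quoted above depend on how the comparison is scored. The following is a minimal sketch of one common convention, percent identity over the aligned columns in which neither sequence has a gap. The function name `percent_identity` is hypothetical, and the choice of denominator is an assumption: published values may instead use alignment length or the shorter sequence length, and "similarity" scores that count conservative substitutions will be higher than identity.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where either sequence has a gap are skipped, so the denominator
    is the number of columns in which both sequences have a residue.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # gapped column: not counted
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared
```

For example, the toy alignment `"ACDE"` versus `"ACDF"` scores 75% under this convention (three of four compared columns match).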
(C) REGULATION OF CPS AND OTHER PEPTIDASES
It is axiomatic that peptidases represent potentially dangerous enzyme activities that must be subject to strict control and containment within appropriate compartments, and that a failure in these controls could lead to pathology. Several mechanisms are used to regulate peptidase activity, in addition to transcriptional and post-transcriptional controls (Twining, 1994; Chapman et al., 1997). Most enzymes are synthesized as inactive zymogens that must be activated by proteolytic cleavage. This may be autocatalytic under specific conditions, such as low pH. Release of peptidases from a cell is generally a controlled process. The CPs are readily inactivated by oxidation of the active site cysteine and require a reducing environment for full activity. Many human CPs are unstable at neutral pH and require an acidic pH for full activity. Once activated, enzyme activity can be lost by degradation. A major control governing peptidase activity is the presence of protein inhibitors that bind tightly to the enzyme, blocking substrate binding. It must be emphasized that, in many basic tissue reactions (e.g., inflammation), CPs probably do not function in isolation. Rather, they are components of complex networks constituting representatives of different types of peptidase and their respective inhibitors. Within these networks, cross-activation of zymogens and cross-inactivation of inhibitors can provide an amplification of an initial perturbation.
Numerous low-molecular-weight CP inhibitors have been developed. One of the most widely used is E-64 [L-trans-epoxysuccinyl-leucylamino(4-guanidino)butane], a potent irreversible inhibitor of many (but not all) CPs that forms a thioether with the active site cysteine (Barrett et al., 1982). The vinylsulfone N-morpholinurea-leucine-homophenylalanine-vinylsulfone-phenyl (LHVS) is a potent inhibitor of cathepsin S, and quite effective against cathepsin F (Shi et al., 2000).
| (II) Clan CA |
|---|
(A) EVOLUTION OF CLAN CA ENZYMES
The papain-related enzyme family C1 is grouped into two subfamilies: C1A, comprising "papain-like" enzymes (in a general sense; i.e., statistically similar in sequence and related in structure to papain), and C1B, consisting of intracellular bleomycin hydrolases and related bacterial aminopeptidases. These subfamilies display significant similarity only in the vicinity of the active-site residues, and represent the earliest evolutionary divergence in the C1 enzymes (Berti and Storer, 1995). Clan CA enzymes arose in bacteria in the Archean (4000 to 2500 million years ago, Ma), and papain-related CPs are phylogenetically ubiquitous (reviewed in Tort et al., 1999).
Several phylogenetic analyses of the C1A enzymes have been reported, often in conjunction with structural considerations, although not all of these studies have included a statistical analysis of branch support (e.g., Wiederanders et al., 1992; Berti and Storer, 1995; Santamaria et al., 1999; Tort et al., 1999; Wex et al., 1999; Rawlings and Barrett, 2000). An alignment of vertebrate cathepsins and related proteins and two representative plant enzymes is shown in Fig. 1, and a phylogenetic tree based on this alignment in Fig. 2. Each branch on the tree represents either a duplication of a common ancestor (leading to paralogous genes) or speciation (leading to orthologs). The various trees are generally consistent, regardless of the methodologies used, and the C1A family of "papain-like" enzymes can be divided into two major, ancient groups (subfamilies), which have been referred to as Branch A and Branch B (Tort et al., 1999). These branches arose by duplication of a common ancestor over 2700 Ma, before the eukaryote-prokaryote divergence. A few other enzymes from parasitic and free-living protozoa and metazoa, as well as the recently described human cathepsin O, do not strongly localize to either branch (although they appear to group weakly with Branch B), and the timing of their divergence from the ancestors of the two major groups is uncertain.
The mammalian cathepsin L-like CPs (cathepsins L, V, K, and S and the rodent M, J/P, Q, R, CLRP subgroup) form a distinct, well-supported (99% bootstrap) group (Fig. 2). Cathepsin F and lymphopain are also related to each other, although with weaker support (65% in this analysis). Significantly, gene pairs with a later common ancestor have tended to remain together on the chromosome (see Table). Thus, the cathepsin pairs F and lymphopain co-localize to 11q13.1-13.3, K and S to 1q21, and L and V to 9q21-22. Members of each pair also have intron/exon organizations that are similar to each other. Other cathepsins map to unique positions in the human genome and tend to have unique intron/exon arrangements. Cathepsins L and V show 78% sequence similarity, consistent with a relatively recent gene duplication of an ancestral cathepsin L gene (Brömme et al., 1999; Itoh et al., 1999). They are sufficiently similar as to suggest that cathepsin V evolved after the mammalian radiation: efforts to find a mouse ortholog of cathepsin V have failed, and thus the distribution of cathepsin V in mammals may be limited (Brömme et al., 1999). In general, vertebrate cathepsins representing phylogenetically ancient types (cathepsins B, X, L, H, O, F, and DPP I) are ubiquitously expressed in mammalian tissues, although often at quite different levels in different sites (see below). It should be noted that this does not preclude specific functions in certain tissues, as has been shown for cathepsins B, L, and DPP I (see below). However, within Branch B, other cathepsins have evolved tissue-specific patterns of expression (see below), primarily associated with cells of the immune system.
(B) GENERAL PROPERTIES OF PAPAIN-RELATED (FAMILY C1A) CPS
(1) Structure and activity
α-helices and β-pleated sheets, and most relative insertions and deletions between the proteins occur in the loops and turns linking these elements, consistent with a common ancestry. The molecules are bi-lobed. The two domains are designated L and R, and the L domain consists of the majority of the N-terminal half of the protein, while the R domain consists of the C-terminal half of the molecule and the most N-terminal residues. The catalytic site is located in a cleft between the lobes: the catalytic Cys on the L domain, the His opposite on the R domain. Typically, the structure is stabilized by three disulfide bonds. In contrast to many other peptidases, in no case has a CP cathepsin been shown to have a single specific substrate, although they do differ considerably in their preferred cleavage site. Polypeptide substrates bind along the cleft in an extended conformation. Binding sites for substrate residues N-terminal to the cleaved peptide bond are designated as S1, S2, etc.; those C-terminal are designated as S1', S2', etc. (where S1 and S1' are proximal to the cleaved bond). Similarly, P1, P2, etc., and P1', P2', etc., are used to designate the corresponding substrate residues (Barrett, 1994). Only the S2 subsite is a real pocket: the other sites are shallow indentations on the surface of the enzyme. Most enzymes favor a hydrophobic residue at the P2 site, whose side chain projects down into a hydrophobic S2 pocket, but papain-like CPs vary widely in their accommodation of an aromatic residue in this position. Some (e.g., cathepsin B, cruzipain) will accept an arginine at this position by forming a salt bridge. Interactions with the substrate's P3 and P2' residues involve the side chains, while P2, P1, and P1' involve both main and side-chain contacts (reviewed in Turk et al., 1998; McGrath, 1999). 
Most enzymes are endopeptidases, but cathepsin B has strong carboxypeptidase activity, whereas cathepsin H has strong aminopeptidase and limited endopeptidase activity (reviewed in Kirschke et al., 1995). Cathepsin B has an 18-residue insertion proximal to the active-site cleft that forms an occluding loop. This restricts access to potential substrates in the prime sites and helps provide an anchor for the C-terminal carboxyl group. Flexibility in the loop facilitates endopeptidase activity (reviewed in Mort and Buttle, 1997; McGrath, 1999). In cathepsin H, an 8-residue segment of the proregion, the minichain, remains attached via a disulfide bond and restricts access to the non-primed sites (reviewed in Turk et al., 1997; McGrath, 1999). The plant aminopeptidases aleurain and oryzain share sequence similarity with cathepsin H, and presumably a similar structure-function relationship (Kirschke et al., 1995; reviewed in Turk et al., 1997, 1998). The cystatins are CP inhibitors that can place N-terminal residues along the active site in the same orientation as a substrate, thereby occupying the S2 subsite, and also place hairpin loop regions in the active site (Turk et al., 1997). Just as the CP loops can interfere with the binding of polypeptide substrates, they can also interfere with the binding of cystatins. For example, cystatin C inhibits cathepsin B less well than cathepsin L. There are also proteins with homology to papain that lack peptidase activity due to substitutions in the active site (e.g., the testins, see below).
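The subsite nomenclature described above (P1, P2, ... on the N-terminal side of the cleaved bond and P1', P2', ... on the C-terminal side, each numbered outward from the bond) can be made concrete with a short sketch. The function name `label_subsites`, its `scissile_index` parameter, and the example peptide are all hypothetical and serve only to illustrate the numbering.

```python
from typing import Dict


def label_subsites(peptide: str, scissile_index: int) -> Dict[str, str]:
    """Label substrate residues relative to the cleaved (scissile) bond.

    `scissile_index` is the number of residues N-terminal to the cleaved bond,
    i.e. the bond lies between peptide[scissile_index - 1] and
    peptide[scissile_index].
    """
    if not 0 < scissile_index < len(peptide):
        raise ValueError("the scissile bond must lie between two residues")
    labels: Dict[str, str] = {}
    # Non-primed positions: count outward (toward the N-terminus) from the bond.
    for offset, residue in enumerate(reversed(peptide[:scissile_index]), start=1):
        labels[f"P{offset}"] = residue
    # Primed positions: count outward (toward the C-terminus) from the bond.
    for offset, residue in enumerate(peptide[scissile_index:], start=1):
        labels[f"P{offset}'"] = residue
    return labels
```

For the made-up peptide `"AFRGL"` cleaved after the third residue, this yields P3 = A, P2 = F, P1 = R, P1' = G, and P2' = L, so the hydrophobic Phe would be the residue presented to the S2 pocket.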
The overall conservation of cathepsin CP structure is maintained in both the ancient A and B branches. However, sensitivity to inhibition by cystatins is not uniformly distributed. In particular, cathepsin B is only poorly inhibited, while cathepsin X is resistant. Taking into account the ancient origins of DPP I, and the more recent mammalian origin of salivary cystatins (Dickinson, manuscript in preparation), it seems more probable that the target(s) for salivary cystatins is a member of branch B.
(2) The proregion
Typically, Clan CA enzymes are synthesized as inactive preproenzymes with a signal peptide and a multifunctional N-terminal proregion. The C1A subfamily proregion varies between 38 and 251 residues, depending on the enzyme and, to a smaller extent, on the species. Branch B-type CPs have a proregion around 60-100 residues in length. Cathepsin X is unusual in having a very short proregion (38 or 41 residues, depending on the prepeptide cleavage site), while DPP I has a long, 206-residue (human) prosegment due to an N-terminal extension. Alignments of the proregions of the 11 known human cathepsins, novel rodent cathepsins, and those of papain and aleurain are shown in Fig. 3. It can be seen that the definitive human Branch B-type CPs (cathepsins L, V, S, K, and H), as well as the rodent placental cathepsins (M, P and Q), have related proregions with similarities to those of papain and aleurain (Fig. 3A). The rodent proteins CTLA-2 and the testins also have homology to the Branch B proregions. As noted above, the origins of the related cathepsins F and lymphopain, and the more distantly related cathepsin O, are not well-resolved with respect to Branches A and B. However, cathepsin F and lymphopain show clear similarity to other Branch B enzymes in their proregions, and they have quite good matches to three conserved motifs within the Branch B-type prodomain (see below), consistent with the closer relationship to Branch B suggested by the tree constructed with the use of the mature enzyme sequences (Fig. 2). The cathepsin O proregion has a modest overall similarity, and partial matches to the three motifs. In contrast, the proregions of Branch A-type CPs (cathepsins B, X, and DPP I) are distinct from those of Branch B (Fig. 3), and lack matches to the conserved motifs. The proregion of cathepsin X shows some similarity to that of cathepsin B (Fig. 3B), while that of DPP I appears to be unrelated to other CPs (Fig. 3C). This fits with the results summarized above from phylogenetic analyses of larger numbers of enzymes from diverse species that identify these two distinct major branches. Despite the low sequence similarity of the proregions of cathepsins B, L, and K, analysis of their crystal structures shows that they have a common fold (Coulombe et al., 1996; LaLonde et al., 1999). Most likely this represents high divergence of ancestrally related sequences, but convergent evolution of ancestrally unrelated domains cannot be excluded at present.
The N-terminal propeptide segment has been shown to be a potent and relatively specific inhibitor that serves to maintain the precursor in an inactive state until cleaved (Carmona et al., 1996; reviewed in Turk et al., 1997). The proregions bind to the active site in a linear, extended conformation, but in the reverse orientation to normal substrates. This binding is distinct from the interaction of CPs with cystatins, and likely provides a combination of a good fit to the site with resistance to proteolysis. The inhibitory activities of the cathepsin L and B proregions are sharply pH-dependent; for cathepsin L, the Ki is less than 0.5 nM at a pH above 4.3, but rises to 3.0 nM at pH 4.0, consistent with a low pH-dependent autoactivation of the enzyme (Fox et al., 1992; Carmona et al., 1996). Construction of cathepsin L proregion peptides containing various N- and C-terminal deletions allowed the inhibitory domain to be localized to a stretch of 30 residues located just following the peptide lysosome targeting motif (Carmona et al., 1996). Deletion of this domain caused a more than 200-fold increase in the Ki value. This domain contains the so-called ERFNIN motif: the conserved propeptide sequence E X3 R X2 (I/V) F/W X2 N X3 I X3 N previously identified in Branch B enzymes, but not other members of clan CA, from a phylogenetically diverse group of organisms (Karrer et al., 1993) (see Fig. 3). It is located within an alpha helix, and the conserved residues are in contact with the surface of the enzyme. Interestingly, this motif is also found in the mouse cytotoxic T-lymphocyte antigen-2 (CTLA-2) α and β gene products (Karrer et al., 1993; see Fig. 3A). These proteins show significant similarity to the L-type proregions (Denizot et al., 1989), and the CTLA-2β protein has been shown to be a good inhibitor of cathepsin L (Ki = 24 nM), H (IC50 = 67 nM), and papain (Ki = 25 nM), but not cathepsin B (Delaria et al., 1994). It is likely that the CTLA-2 genes evolved by duplication of an ancestral L-type cathepsin gene and subsequent deletion of the enzyme-coding region. However, it should be noted that CTLA-2β is a relatively non-specific inhibitor and can exist as a dimer or tetramer. Thus, it may be functionally distinct from the L-type proregions. The species distribution of CTLA-2-like proteins has not been explored. Inhibitory regions of the 56-amino-acid rat cathepsin B proregion have also been examined (Chen et al., 1996). Two regions were identified that caused 150- and 625-fold increases in the Ki. Alanine scanning identified W-24p and C-42p (rat procathepsin B numbering) as the most important residues within these regions.
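To give a feel for what the Ki values quoted above imply, the sketch below computes the fraction of enzyme held in the inhibited complex for a simple reversible 1:1 inhibitor, together with the fold change between two Ki values. It rests on stated assumptions that the source does not make: free propeptide added in trans (not the covalently tethered propeptide of the zymogen), inhibitor in large excess over enzyme so that free and total inhibitor are approximately equal, and no competing substrate. For tight-binding inhibitors with sub-nanomolar Ki, the excess assumption often fails and a tight-binding treatment would be needed instead. The function names are hypothetical, and the 0.5 nM figure is only the upper bound given in the text.

```python
def fraction_inhibited(inhibitor_nM: float, ki_nM: float) -> float:
    """Fraction of enzyme in the EI complex, [EI]/[E]total = [I] / ([I] + Ki).

    Assumes simple reversible 1:1 binding, inhibitor in large excess over
    enzyme, and no competing substrate.
    """
    if inhibitor_nM < 0 or ki_nM <= 0:
        raise ValueError("concentration must be >= 0 and Ki must be > 0")
    return inhibitor_nM / (inhibitor_nM + ki_nM)


def fold_change(ki_after_nM: float, ki_before_nM: float) -> float:
    """Ratio of two Ki values; values above 1 mean weaker inhibition afterwards."""
    if ki_after_nM <= 0 or ki_before_nM <= 0:
        raise ValueError("Ki values must be > 0")
    return ki_after_nM / ki_before_nM


# Illustration with the cathepsin L propeptide values quoted in the text,
# taking 0.5 nM (an upper bound above pH 4.3) and 3.0 nM (pH 4.0),
# at a hypothetical 3 nM of free propeptide.
occupancy_above_pH_4_3 = fraction_inhibited(3.0, 0.5)  # about 0.86 or more
occupancy_at_pH_4_0 = fraction_inhibited(3.0, 3.0)     # 0.5
ki_shift = fold_change(3.0, 0.5)                       # at least 6-fold
```

Under these assumptions, the same concentration of propeptide that occupies most of the enzyme above pH 4.3 occupies only half of it at pH 4.0, which is the direction of change expected if acidification favors release of the propeptide.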
In general, the N-terminal propeptide must be removed proteolytically for activation (but see below). pH-dependent auto-catalysis is believed to proceed in trans (reviewed in Turk et al., 2000). For the lysosomal cathepsins, the acidic conditions of the lysosome promote autocatalytic cleavage and dissociation of the propeptide (Mason and Massey, 1992; Ishidoh et al., 1998). Removal of the proregion can also occur by cleavage by other peptidases. However, the propeptide also serves to stabilize the enzyme against inactivation by denaturation under neutral or alkaline conditions (Mason et al., 1987; Yamaguchi et al., 1990). A conserved GXNXFXD motif (the GNFD motif) that is involved in both autoactivation and appropriate folding of the enzyme (Vernet et al., 1995) was identified within L-type but not other clan CA enzymes. Site-directed mutagenesis of the conserved residues and expression in yeast (which does not process the wild-type propapain) indicated that the negative charge of this region is involved in triggering processing at low pH. It will be of interest to examine the function of a highly conserved, negatively charged region that follows the GNFD motif (Fig. 3A). Processing of procathepsin L can be considerably enhanced by polyanions such as dextran sulfate and glycosaminoglycans (GAGs) (Mason and Massey, 1992; Ishidoh and Kominami, 1995). There is no information on whether polyanions interact with the proregion.
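The two propeptide motifs discussed above, ERFNIN (E X3 R X2 (I/V) F/W X2 N X3 I X3 N) and GNFD (GXNXFXD), can be written as regular expressions for scanning one-letter protein sequences. The sketch below is one literal reading of those consensus strings, with X taken as any single residue; the names `ERFNIN`, `GNFD`, and `find_motif` are hypothetical, and the test sequence is synthetic rather than a real cathepsin propeptide. Real family members carry conservative substitutions, so a strict pattern like this would likely miss genuine matches that an alignment-based method would find.

```python
import re
from typing import List, Pattern

# Literal transcription of the consensus strings; [A-Z] stands for "X" (any residue).
ERFNIN: Pattern[str] = re.compile(
    r"E[A-Z]{3}R[A-Z]{2}[IV][FW][A-Z]{2}N[A-Z]{3}I[A-Z]{3}N"
)
GNFD: Pattern[str] = re.compile(r"G[A-Z]N[A-Z]F[A-Z]D")


def find_motif(pattern: Pattern[str], sequence: str) -> List[int]:
    """Return 0-based start positions of all matches, including overlapping ones."""
    # A zero-width lookahead lets re.finditer report overlapping occurrences.
    lookahead = re.compile(f"(?={pattern.pattern})")
    return [m.start() for m in lookahead.finditer(sequence.upper())]


# Synthetic sequence built to contain one copy of each motif (not a real protein).
toy_propeptide = "MK" + "EAAARAAIFAANAAAIAAAN" + "TT" + "GANAFAD"
erfnin_hits = find_motif(ERFNIN, toy_propeptide)  # [2]
gnfd_hits = find_motif(GNFD, toy_propeptide)      # [24]
```

Reporting start positions rather than a yes/no answer makes it easier to check that a hit falls within the proregion and not the mature enzyme.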
The propeptides of the clan CA enzymes generally share no structural similarity with the cystatins. It is therefore interesting that a region of the N-terminal propeptide extension of cathepsin F has been shown to display similarity to a cystatin domain (Nagler et al., 1999a). The level of sequence similarity is weak (it is not detected by a BLAST search of Genbank at http://www.ncbi.nlm.nih.gov), but molecular modeling predicted that the propeptide would have a structure similar to that of chicken egg white cystatin. Consistent with this conclusion, searches of a non-redundant peptide database with cystatin sequences revealed proteins in Japanese flounder, Drosophila, and Caenorhabditis elegans that contain an N-terminal domain with homology to cystatins and a C-terminal domain with significant homology to mammalian cathepsin F (Fig. 4). Thus, there appears to be an ancient CP lineage within the papain-related subfamily in which the propeptide contains a cystatin-like inhibitory domain. Whether this cystatin domain is a functional inhibitor remains to be established.
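(C) BIOCHEMICAL PROPERTIES, EXPRESSION, AND NORMAL FUNCTIONS OF MAMMALIAN CPS OF SUBFAMILY C1A
(1) Widely expressed cathepsins
(a) Cathepsins B, H, and L
(i) Properties and tissue distribution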
Cathepsin B, H, and L proteins and activities (reviewed in Barrett and Kirschke, 1981; Howie et al., 1985; Kirschke et al., 1995; Xing et al., 1998) and mRNAs (Qian et al., 1989; Söderström et al., 1999) have been detected in all tissues and cells examined. Consistent with this ubiquitous pattern of expression, these genes lack the TATA box motifs normally found in highly regulated genes but frequently absent from constitutively expressed genes (Ishidoh et al., 1989a,b; Qian et al., 1991). However, there is significant variation in the levels of these enzymes and their ratios in different tissues and cells (e.g., Qian et al., 1989; Gong et al., 1993; Katunuma et al., 1993; Söderström et al., 1999). Cathepsin B is the most abundant and widely expressed cathepsin and is found at high levels in macrophages. At the mRNA level, the highest levels are found in non-skeletal tissues. Cathepsin B levels in skeletal tissues are not greatly lower than those of non-skeletal tissues, while cathepsin H skeletal tissue mRNA levels are very low. Cathepsin L levels are generally higher in tissues that turn over more rapidly, such as the liver and ovary, and in phagocytic cells such as stimulated macrophages. In the rat, high levels of cathepsin B and L mRNA are found in the kidney.
In addition to transcriptional regulation, there is evidence that cathepsin levels may be governed by post-transcriptional processing and differences in translation rates of alternative transcripts (Chauhan et al., 1993; Gong et al., 1993; Yan et al., 1998). The level of expression of cathepsin L in fibroblasts is increased by several growth factors (e.g., epidermal growth factor, fibroblast growth factor, and platelet-derived growth factor), phorbol esters, and by oncogene-mediated transformation in vitro (reviewed in Ishidoh and Kominami, 1998). Cathepsin L has also been shown to be induced in granulosa cells of growing follicles by follicle-stimulating hormone, and in pre-ovulatory follicles in response to luteinizing hormone in a progesterone receptor-dependent manner (Robker et al., 2000). Cathepsin B levels may be associated with cell differentiation (reviewed in Yan et al., 1998). Cathepsin B was not detected immunohistochemically in normal minor salivary glands (Steinfeld et al., 2000). However, minor glands in organotypic culture expressed significant levels of cathepsin B, primarily in the ducts, and these levels were substantially increased by treatment with prolactin.
(ii) Functions of cathepsins B, H, and L
Until recently, the ubiquitous lysosomal distribution of cathepsins B, H, and L had led them to be considered primarily "housekeeping" enzymes essential to the normal protein turnover of cells. Consistent with this view, broad CP inhibitors block up to 40% of cellular protein turnover (Shaw and Dean, 1980; reviewed in Barrett and Kirschke, 1981). However, regulation by growth factors and variation in expression levels imply duties beyond those of "housekeeping" and raise the possibility of tissue-specific functions. A powerful approach to the study of function in vivo is the production of homozygous null mutants by gene knockout in mice. Surprisingly, homozygous cathepsin-B-deficient mice have an apparently normal phenotype (Deussing et al., 1998). This suggests functional redundancy but raises the question of why a redundant gene has been so conserved throughout vertebrate evolution. Cathepsin-L-deficient mice have periodic shedding of fur and abnormal skin morphology but are otherwise viable (Nakagawa et al., 1998). Significantly, these mice also have a defect in major histocompatibility complex (MHC) class-II-mediated antigen presentation. In antigen-presenting cells (APCs), extracellular foreign proteins are internalized via endocytosis or phagocytosis and degraded to peptides in the endocytic pathway. MHC class II molecules bind derived antigenic peptides and present them on the cell surface to CD4+ T-helper cells. Intracellular trafficking of the MHC class II molecules and binding of antigen are regulated processes (reviewed in Wolf and Ploegh, 1995). In the endoplasmic reticulum, class II α and β chains form a heterodimer, and three αβ heterodimers associate with an invariant (Ii) chain trimer. This nonamer is then transported to the Golgi apparatus and sorted to the endocytic pathway by a signal in the Ii chain cytoplasmic domain, preventing it from entering the constitutive secretory pathway. In the endocytic compartment, the MHC class II molecules can encounter the foreign peptides. However, the Ii chain binds to the peptide-binding domain, blocking this interaction until it is removed by sequential proteolytic cleavage. The Iip10 fragment is the smallest that retains the N-terminal endosome-targeting sequence and a C-terminal extension in the peptide-binding groove. Further cleavage of the Iip10 fragment causes dissociation of the nonamer and release of αβ heterodimers bound to the CLIP fragment of the Ii chain, which occupies the peptide-binding site until the heterodimer interacts with another class-II-like chaperone molecule (HLA-DM in humans). This causes release of CLIP and allows peptide binding to occur. If the Ii chain is not cleaved, the nonamers can be targeted to the lysosome by the Ii chain cytoplasmic tail and degraded. Thus, cleavage of Iip10 is an important regulatory step, and the sequence and timing of Ii cleavage events likely determine the antigenic peptides presented. With a transgenic mouse knockout, cathepsin L has been shown to be essential for the degradation of the invariant (Ii) chain and cleavage of Iip10 to produce the CLIP fragment during MHC class-II-restricted antigen presentation in cortical thymic epithelial cells, but not in bone-marrow-derived antigen-presenting cells, which instead use cathepsin S (Nakagawa et al., 1998). Interestingly, the p41 form of the invariant chain contains a 64-amino-acid fragment with a thyroglobulin type 1 domain (Lenarcic et al., 1997) that binds and inhibits cathepsin L (Ki 1.7 pM), but not cathepsin S (Guncar et al., 1999). 
It may therefore be involved in regulation of Ii degradation, and in production of antigenic epitopes in endosomes (Fineschi et al., 1996).
Although lysosomal peptidases, including CPs, are undoubtedly involved in peptide antigen processing, the exact role of individual enzymes remains equivocal (Villadangos and Ploegh, 2000). The use of inhibitors "specific" for individual cathepsins has provided evidence for a role for cathepsins B (a CP) and D (an aspartyl peptidase) in antigen processing both in vitro and in vivo. For example, treatment of a mouse T-cell clonal line with a cathepsin B inhibitor suppressed processing of an ovalbumin antigenic epitope, and treatment of mice immunized with ovalbumin with this inhibitor suppressed the Th2 response and IgE production (Katunuma et al., 1998). Similarly, treatment of mice experimentally infected with Leishmania major with a cathepsin B inhibitor causes a switch in the immune response from Th2 to Th1, possibly reflecting a change in antigen processing (Maekawa et al., 1998). However, the use of inhibitors in these studies is complicated by the potential lack of complete specificity, and by the fact that the various cathepsins are involved in transprocessing of each other (e.g., Ishidoh et al., 1999). Cathepsin-B-deficient mice show no evidence for a role of cathepsin B in MHC class-II-mediated antigen presentation (including ovalbumin), indicating either that cathepsin B is not involved in this process, or that there is redundancy in the proteolytic system (Deussing et al., 1998).
Cathepsin L and, to a lesser extent, cathepsin B have been implicated in normal tissue-remodeling events. Hormonal regulation of cathepsin L levels in the granulosa cells of follicles suggests that it may be involved in the degradation of the follicle wall that leads to release of the mature oocyte (Robker et al., 2000). Cathepsin B mRNA levels rise in the apoptotic lumenal epithelial cells of regressing prostate and mammary glands, consistent with a role in degradation of the basement membrane, an early event in cell death (Guenette et al., 1994). Cathepsin CPs have been implicated in various stages of embryogenesis. The supply of amino acids to the developing mouse embryo prior to development of the chorioallantoic placenta is mediated by proteolysis of proteins in the visceral yolk sac, and levels of active cathepsin L are relatively high in this tissue at this time, in comparison both with later times and with the placenta (prior to parturition), as well as with cathepsin B (Sol-Church et al., 1999b). During implantation of the embryo, the embryonic trophoblasts invade the uterine stroma in a controlled manner, degrading the extracellular matrix (ECM). The endometrial connective tissue cells respond with the decidual reaction, which involves an enlargement of the cells and remodeling of the ECM. This provides a barrier to uncontrolled trophoblast invasion, and facilitates formation of an immunologically privileged site. As the placenta forms, decidual cells adjacent to the embryo undergo apoptosis and are phagocytized by the trophoblasts. The mouse placenta expresses substantially higher levels of cathepsin L mRNA relative to tissues such as the liver and kidney, and these levels are at their highest during implantation, suggesting a possible role in this process (Hamilton et al., 1991; Sol-Church et al., 1999b). The placenta also secretes procathepsin L, which may have proteolytic activity under certain circumstances, as well as other activities (see below).
Injection of higher doses of E-64 into pregnant mice during the period of blastocyst attachment leads to a complete failure of implantation. Lower doses result in stunted embryos and a reduced decidual reaction (Afonso et al., 1997). These results suggest that CPs are essential for normal embryo development and decidualization of the uterus. Previously, it was suggested that cathepsins B and L were important in these processes (Afonso et al., 1997). However, the subsequent construction of cathepsin-B- and L-deficient mice (see above), which appear to grow and develop normally during gestation, makes this possibility seem less likely, although there could easily be redundancy in the enzyme systems. The recent discovery of placental-specific CPs (see below) might lead to clarification of these issues in the future. Placental cathepsin L mRNA levels also rise prior to parturition, possibly related to the degeneration of tissue around the placenta in preparation for birth (Hamilton et al., 1991). The role of cathepsin CPs in human implantation is unknown.
Thus far, discussion of the functions of cathepsins B, H, and L has primarily been confined to a lysosomal or endosomal location: the degradation of proteins trafficking through the endosomal system. However, it is also now clear that cathepsins B, H, and L are not purely lysosomal, and that they can be released from cells under various circumstances (see below). In the presence of thiol compounds, cathepsin B is active in the pH range of 5-6, while cathepsin L is active at pH 3-6.5, and cathepsin H has an optimum of 6.5-6.8 (Kirschke et al., 1995). In these pH ranges, cathepsins B and L and, to a lesser extent, cathepsin H can degrade a variety of components of the extracellular matrix, such as proteoglycans, laminin, and collagens II, IX, and XI (Maciewicz et al., 1990a; Buck et al., 1992; reviewed in Kirschke et al., 1995). Cathepsin L is a potent elastase at the optimal pH (5.5), where it is almost as active as pancreatic elastase, and significantly more than neutrophil elastase (both serine peptidases) (Chapman et al., 1994). In contrast, cathepsin B is 100-fold less active than cathepsin L against this substrate (Mason et al., 1986). Cathepsins B, H, and L are unstable at neutral pH, and are irreversibly inactivated above pH 7 (Barrett and Kirschke, 1981). Cathepsin L has a half-life of only about 1 minute at pH 7.2 and 37°C (Wang B et al., 1998), while cathepsin B is about 15-fold more stable (Turk et al., 1995). The rate of auto-degradation of cathepsin B at neutral pH is reduced in the presence of alternative substrates (Buck et al., 1992). Such instability would be expected to limit the extracellular degradative activity of these enzymes severely. Further, the concentration of cystatin C in vivo is sufficiently high to provide rapid and effective inhibition of cathepsin L and cathepsin B (even though the latter is less-well-inhibited by this cystatin), provided it remains in molar excess (Turk et al., 1995). 
However, various conditions can arise to enhance the stability of these enzymes (see below), and in contrast to the active enzymes, the proenzymes (which can also be released (see below)) are stable at neutral pH, as is a complex of mature cathepsin B and the proregion.
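The practical consequence of these half-lives can be illustrated with a short calculation. The following is a minimal sketch, assuming that inactivation at neutral pH follows simple first-order decay and that "about 15-fold more stable" translates into a half-life of roughly 15 minutes for cathepsin B; neither assumption is stated in the studies cited above, and the function name and the 10-minute time point are chosen purely for illustration.

```python
def fraction_remaining(elapsed_min: float, half_life_min: float) -> float:
    """Fraction of enzyme still active after elapsed_min minutes.

    Assumes simple first-order inactivation (an illustrative assumption,
    not a kinetic model taken from the cited studies).
    """
    return 0.5 ** (elapsed_min / half_life_min)


# Half-life of cathepsin L at pH 7.2 and 37 C, as quoted in the text.
CATHEPSIN_L_HALF_LIFE_MIN = 1.0
# Assumed value: "about 15-fold more stable" read as a 15-fold longer half-life.
CATHEPSIN_B_HALF_LIFE_MIN = 15.0 * CATHEPSIN_L_HALF_LIFE_MIN

if __name__ == "__main__":
    elapsed = 10.0  # minutes of exposure to neutral pH (arbitrary example)
    for name, half_life in (
        ("cathepsin L", CATHEPSIN_L_HALF_LIFE_MIN),
        ("cathepsin B", CATHEPSIN_B_HALF_LIFE_MIN),
    ):
        remaining = fraction_remaining(elapsed, half_life)
        print(f"{name}: {remaining:.2%} active after {elapsed:g} min")
```

Under these assumptions, less than 0.1% of cathepsin L activity would survive 10 minutes at neutral pH, whereas well over half of the cathepsin B activity would remain, which is why stabilizing conditions and release as proenzymes matter for any extracellular role.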
(iii) Cathepsins B, H, and L in the oral cavity
As lysosomal enzymes, cathepsins B, H, and L likely function in normal protein turnover of intracellular and endocytosed proteins in oral as in other tissues. Cathepsin B has been immunolocalized to granular duct cells in the rat submandibular gland and co-localized with renin in secretory granules, suggesting a role in processing secreted proteins (Sano et al., 1993). The role of these cathepsins, either intracellular or extracellular, in normal remodeling of oral tissues has not been addressed to any great extent. Cathepsins B and L have been localized to gingival fibroblasts, and this source may have a role to play in periodontal disease (discussed in detail below). Interestingly, phenytoin and cyclosporin A suppress the expression of cathepsin L (as well as of MMP-1 and TIMP-1), but not cathepsin B, in cultured gingival fibroblasts. Both these drugs induce gingival overgrowth, suggesting that some of this overgrowth is the result of impaired extracellular matrix degradation involving cathepsin L (Yamada et al., 2000).
It is axiomatic that the immune system is central to the maintenance of oral health, and the progression from gingivitis to periodontal disease. Therefore, the involvement of cathepsins B, H, and L in the function of the immune system described above also applies to the oral cavity. As lysosomal enzymes, they also function in phagocytosis and can be released extracellularly by immune cells, where they can be involved in remodeling (or damaging) the extracellular matrix and tissues as outlined above. However, these released cathepsins can also participate in more powerful proteolytic cascades. This area, and the number of studies which have examined the activities of cathepsins B, H, and L in gingival fluids with respect to periodontal disease, are discussed below. Their potential role in Sjögren's syndrome is also discussed in a separate section.
(b) Dipeptidyl peptidase I (DPP I)
(i) Properties and tissue distribution
Dipeptidyl peptidase I (DPP I) is the accepted nomenclature for an enzyme previously called cathepsin C, among other names (e.g., cathepsin J). DPP I is a Branch A enzyme most closely related to cathepsin B, and is likely to be phylogenetically widely distributed. It is a lysosomal CP with a pH optimum of 5-6 that primarily cleaves dipeptides from the N-terminus of polypeptides, although it also has endopeptidase activity (Kirschke et al., 1995). It does not cleave substrates with N-terminal Arg, Lys, or Pro, or with Pro in the penultimate position. It differs from other lysosomal CPs in several respects: it has a long prosegment (206 residues in the human enzyme) with an N-terminal extension relative to the papain-related CPs, it forms oligomers of around 200 kDa, and it requires halide ions for maximal activity. The enzyme is inhibited by stefins A and B and chicken cystatin, but only weakly by E-64, and is unstable above pH 7.5 (Nikawa et al., 1992; Dolenc et al., 1996).
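The substrate restrictions described above can be expressed as a simple rule. The following is a minimal sketch, assuming single-letter amino acid codes and reading "Pro in the penultimate position" as Pro at the second residue from the N-terminus; that reading, the function names, and the example sequences are illustrative assumptions rather than details taken from the cited studies. The sketch models only the exopeptidase (dipeptide-removing) activity and ignores the endopeptidase activity.

```python
# N-terminal residues that block cleavage according to the text: Arg, Lys, Pro.
BLOCKING_N_TERMINAL = {"R", "K", "P"}


def dpp1_can_cleave(peptide: str) -> bool:
    """Return True if the N-terminal dipeptide could be removed.

    Assumption: at least three residues are needed so that something
    remains after the dipeptide is released.
    """
    if len(peptide) < 3:
        return False
    if peptide[0] in BLOCKING_N_TERMINAL:
        return False
    # Assumed reading of "penultimate": the second residue from the N-terminus.
    if peptide[1] == "P":
        return False
    return True


def dpp1_digest(peptide: str) -> tuple[list[str], str]:
    """Remove N-terminal dipeptides one at a time until a rule blocks cleavage."""
    released = []
    remaining = peptide
    while dpp1_can_cleave(remaining):
        released.append(remaining[:2])
        remaining = remaining[2:]
    return released, remaining


if __name__ == "__main__":
    # Hypothetical sequences chosen only to exercise each rule.
    for sequence in ("AGKPLV", "APGG", "GFAARL"):
        dipeptides, rest = dpp1_digest(sequence)
        print(sequence, "->", dipeptides, "+", rest)
```

In this sketch, processing of the hypothetical peptide AGKPLV stops after one dipeptide because the newly exposed N-terminal Lys blocks further cleavage, which illustrates how such stop signals could limit DPP I to trimming a defined N-terminal segment.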
In the mouse, Western blot analysis demonstrated DPP I in the majority of tissues examined (Pham et al., 1997): The highest levels were found in the spleen, lung, liver, and small and large intestines, while very low levels were found in the heart and brain. Comparable results were found for the mRNA distribution (Rao et al., 1997). DPP I is also present in various immune cells, including neutrophils, lymphocytes, and macrophages, and treatment of lymphocytes with interleukin-2 (IL-2) was shown to cause a significant increase in DPP I mRNA levels (Rao et al., 1997).
(ii) Functions of DPP I and its role in pre-pubertal periodontitis and Papillon-Lefèvre and Haim-Munk syndromes
In mammals, multiple functions have been ascribed to DPP I. It is thought to have a role in general protein degradation and turnover. More specific functions have been suggested, such as activation of platelet factor XIII. Recently, DPP I was shown to be required for the activation of granzymes (serine peptidases important in cytotoxic lymphocyte granule-mediated apoptosis) and could be involved in activation of other serine peptidase zymogens such as neutrophil elastase (Pham and Ley, 1999).
Missense mutations in the DPP I gene, located at 11q14, have very recently been shown to be responsible for one recessive form of pre-pubertal periodontitis, a rapidly progressing, heritable form of the disease that affects the primary dentition (Hart et al., 2000). Two distinct autosomal-recessive palmoplantar keratoderma disorders, Papillon-Lefèvre syndrome and Haim-Munk syndrome, characterized by hyperkeratosis of specific epithelial areas, particularly the hands and feet, are also characterized by severe early-onset periodontitis, resulting in the loss of the primary and secondary dentition. Papillon-Lefèvre syndrome is usually first diagnosed by dentists. Both syndromes have now been shown to result from mutations in the DPP I gene (Hart et al., 1999, 2000; Toomes et al., 1999). Why loss of this widely distributed lysosomal enzyme should preferentially affect these tissues is unknown, although Chediak-Higashi syndrome, which also affects lysosomes, is also associated with immune dysfunction and severe early-onset periodontal disease (Tempel et al., 1972; Introne et al., 1999). The association of DPP I with these disorders illustrates that a ubiquitously expressed cathepsin can have tissue-specific functions, and need not be confined to a housekeeping function.
(c) Cathepsins O and X
(i) Properties and tissue distribution
Little is known about cathepsin O. It was originally cloned from breast tumor tissue by the polymerase chain reaction (PCR) with primers directed to conserved CP sequences (Velasco et al., 1994). It has a predicted prodomain of a typical length (84 residues) but with only partial matches to the three consensus sequences discussed above (see Fig. 3A). Northern analysis demonstrated mRNA in all tissues, with the highest levels in the ovary, kidney, liver, and placenta and the lowest in the thymus and skeletal muscle. The native protein has not been purified, although a recombinant protein has been obtained by expression in E. coli. No enzymatic properties (pH profile, stability, inhibition) have been reported.
Another novel human cathepsin has been independently characterized by three groups, who used identification of novel ESTs in the database, followed by screening of cDNA libraries. It was variously designated as cathepsin X (Nagler and Menard, 1998), cathepsin Z (Santamaria et al., 1998a), and cathepsin P (Pungercar and Ivanovski, 2000). Phylogenetic analysis (see Fig. 2) indicates that a CP designated as cathepsin Y cloned from rat spleen (Sakamoto et al., 1999) is the rat ortholog of cathepsin X. Cathepsin X is unusual in having a very short proregion (38 or 41 residues, depending on the prepeptide cleavage site) that is even smaller than that of cathepsin B. It completely lacks the N-terminal region that contains the lysosomal targeting consensus and the ERNF/WNIN motif found in the cathepsin L group. The role of this short proregion in folding, inhibition, and stabilization at different pHs remains to be determined. However, it does contain a cysteine residue in a position similar to that of a cysteine in cathepsin B that has been shown to be important in inhibition by the proregion (Chen et al., 1996; see above). Two potential N-glycosylation sites are present in the mature protein that could serve to target it to the lysosome. Interestingly, the proregion also contains an RGD integrin-binding motif.
Recombinant human procathepsin X was obtained by expression in Pichia pastoris (Nagler et al., 1999b). Unlike other cathepsins, it did not activate auto-catalytically at low pH, but cathepsin L was found to convert the proenzyme efficiently to the active form. Cathepsin X was found to be a very good carboxypeptidase, with a pH optimum around 5.0, and a relatively poor endopeptidase. The 3D structure of human procathepsin X has been determined. A Cys residue in the proregion is covalently bound to the active-site Cys, and a 3-residue "mini-loop" insertion between the Gln of the oxyanion hole and the active-site cysteine (predicted by primary sequence alignment algorithms) partially occludes the S2' subsite, providing an explanation for the carboxypeptidase activity (Sivaraman et al., 2000). It is not inhibited by human cystatin C.
Northern blot and RT-PCR analysis demonstrated ubiquitous expression of cathepsin X, although the levels varied considerably between tissues (Nagler and Menard, 1998; Santamaria et al., 1998a; Deussing et al., 2000; Pungercar et al., 2000). Ubiquitous expression in the mouse and human was consistent with the characterization of the promoter as housekeeping-type (Deussing et al., 2000). Cathepsin X was also highly expressed in a variety of cancer cell lines, and may therefore be up-regulated with malignant transformation (Santamaria et al., 1998a; Pungercar et al., 2000). Cathepsin X was immunolocalized in human hepatocytes and Kupffer cells, and in the epithelial cells of distal tubules (Pungercar et al., 2000). It showed a diffuse, mostly peri-membranous distribution, in contrast to the punctated, granular distribution shown by cathepsin B, which was also primarily localized to the proximal tubules of the kidney. This suggests that cathepsin X may be localized to the membrane or the adjacent extracellular space. An examination of expression in oral tissues has not been reported.
(ii) Functions
The physiological functions of cathepsin X are unknown. The rat enzyme was initially identified based on its ability to produce bradykinin-potentiating peptide from plasma (Sakamoto et al., 1999). In equimolar amounts, this peptide increases the activity of bradykinin seven-fold, and in two-fold excess, by 23-fold. The precursor protein for this peptide is unknown. Bradykinin has been shown to synergize with IL-1 or TNFα to stimulate IL-6 production by human gingival fibroblasts (Modéer et al., 1998). Therefore, cathepsin X activity could contribute to the pathogenesis of periodontal disease by increasing the effect of this pro-inflammatory mediator.
(d) Cathepsin F
(i) Properties and tissue distribution
Cathepsin F was independently cloned by three groups either by using PCR and degenerate oligonucleotides directed to conserved CP regions or by identifying novel ESTs in the database (Wang B et al., 1998; Nagler et al., 1999a; Santamaria et al., 1999). The proregion is very large (251 residues), due to an N-terminal extension with an N-terminal region which has similarity to a cystatin domain, followed by a 50-residue flexible linker peptide (Nagler et al., 1999a). The following C-terminal segment of this proregion has overall similarity to the Branch B-like group, although it is most similar to lymphopain. Like lymphopain, it contains a peptide lysosome-targeting motif, followed by a partial match to the ERWNIN motif, ERFNAQ, consistent with these enzymes forming a phylogenetically distinct subgroup (Wex et al., 1999). The proprotein contains 5 potential N-glycosylation sites. Transient expression in Cos-7 cells localized the protein to vesicles, most likely lysosomes (Wang B et al., 1998).
Cathepsin F has been expressed in Pichia pastoris (Wang B et al., 1998). The enzyme autocatalytically activated at an acidic pH, and was shown to have a level of activity toward synthetic substrates similar to that of cathepsin L, with a broad pH optimum between 5.2 and 6.8. The catalytic efficiency (kcat/Km) was comparable with that of cathepsin L, which is the most active lysosomal CP cathepsin. Like cathepsins K, L, and S, cathepsin F prefers a bulky hydrophobic or aromatic residue at the P2 position. The enz