Mining methods and typical structural mechanisms of terpene cyclases

Terpenoids, formed by cyclization and/or rearrangement of isoprene units, are the most diverse and abundant class of natural products and have a broad range of significant functions. One critical enzyme family in terpenoid biosynthesis is the terpene cyclases (TCs), also known as terpene synthases (TSs), which form the ring structures that serve as backbones of functionally diverse terpenoids. With recent advances in biotechnology, research on terpene cyclases has gradually shifted from the genomic mining of novel enzyme resources to the analysis of their structures and mechanisms. In this review, we summarize both the new methods for genomic mining and the structural mechanisms of some typical terpene cyclases, which should help in the discovery, engineering and application of additional and novel TCs.


Biosynthesis of terpenoids
Terpenoids are among the most diverse classes of natural compounds. They include terpenes built from varying numbers of isoprene (C5) units, as well as C5 polymers bearing phosphate, hydroxyl, carboxyl, aldehyde and other functional groups (Dickschat 2016; Helfrich et al. 2019). According to the Dictionary of Natural Products (http://dnp.chemnetbase.com), the number of recorded terpenoids has reached 80,000 (Chen et al. 2020a). Terpenoids play important roles in higher plants, fungi, bacteria, insects and marine organisms, for example as plant hormones, as carotenoids in photosynthesis, as steroids in cell membranes, and as quinones in electron transfer. In addition to their everyday use in perfumes, resins and pigments, terpenoids have broad application prospects in medicine. A typical example is paclitaxel (e.g., Taxol®), a diterpenoid isolated from the bark of the Pacific yew (Taxus brevifolia) or from its endophytic fungi. Owing to its unique anti-cancer mechanism and excellent efficacy, paclitaxel is widely used to treat breast cancer, ovarian cancer and lung cancer, among others (Miele et al. 2012). In recent years, more functions of terpenoids have been reported, such as treating diabetes by inhibiting α-glycosidase activity (Valdes et al. 2020) and overcoming multidrug resistance in tumor treatment by acting as ABC transporter modulators (Goncalves et al. 2020). These achievements indicate the broad application prospects of terpenoids, which in turn stimulate researchers' interest in their biosynthetic pathways.
Terpenoids originate from two common precursors: isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP). IPP and DMAPP are synthesized through two different pathways: in most prokaryotes and in plant plastids, they are produced through the 2-C-methyl-D-erythritol 4-phosphate (MEP) pathway; in most eukaryotes, archaea and some prokaryotes, they are produced through the mevalonate (MVA) pathway (Daletos et al. 2020). IPP can be isomerized to DMAPP by isopentenyl pyrophosphate isomerase (Hampel et al. 2005).
Fig. 1
Biosynthesis pathway of different terpenoids. OPP: pyrophosphate group

Monoterpenes (C10H16) are formed from two isoprene units and may be cyclic or acyclic. Common monoterpenes and monoterpenoids are typical volatile compounds that can be isolated from the essential oils of various herbs and citrus fruits (Davis and Croteau 2000). Sesquiterpenes (C15H24) are composed of three isoprene units and are usually found in essential oils and extracts; examples include santalol, caryophyllene and humulene. Diterpenes (C20H32) are formed from four isoprene units. In general, diterpenoids have a ring structure and are subsequently oxidized into alcohols, aldehydes or acids (Keeling and Bohlmann 2006), such as the diterpene resin acids that are abundant in conifers. Triterpenes (C30H48), composed of six isoprene units, are a class of compounds that includes squalene and sterol precursors. Tetraterpenes (C40H64), composed of eight isoprene units, are mainly carotenes and carotenoids. The higher polyisoprene compounds are the structural basis of natural rubber and latex.
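The relationship between the number of isoprene units and the hydrocarbon formulas quoted above can be checked with a short calculation. The following is a minimal sketch, assuming an unmodified terpene hydrocarbon of formula (C5H8)n; the function name and the class table are illustrative and not part of the original review.

```python
# Illustrative sketch: formula of an unmodified terpene hydrocarbon (C5H8)n.
# The class names follow the text; the helper itself is hypothetical.
TERPENE_CLASSES = {
    2: "monoterpene",
    3: "sesquiterpene",
    4: "diterpene",
    6: "triterpene",
    8: "tetraterpene",
}


def terpene_formula(n_units: int) -> str:
    """Return the molecular formula for n isoprene (C5H8) units."""
    if n_units < 1:
        raise ValueError("at least one isoprene unit is required")
    return f"C{5 * n_units}H{8 * n_units}"


if __name__ == "__main__":
    for n, name in TERPENE_CLASSES.items():
        print(f"{name}: {n} isoprene units -> {terpene_formula(n)}")
```

Oxidized terpenoids (alcohols, aldehydes, acids) deviate from these formulas, so the helper only covers the parent hydrocarbons.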

The catalytic mechanism and classification of terpene cyclases
Terpenoids range from simple linear hydrocarbon chains to complicated cyclic structures. The complex structures are formed by the cyclization of linear, achiral polyisoprene substrates. In terpenoid biosynthesis, cyclization is usually the first step, followed by hydroxylation or other modifications to produce terpene hydrocarbons or terpene alcohols. This vital cyclization reaction is catalyzed by TCs. Interestingly, some TCs may have multiple active centers and can thereby construct multiple rings and chiral centers in a single-step reaction. For cyclases, a small change in the protein structure may have a great impact on activity, and changes at some key sites may produce entirely new catalytic activities (Keeling et al. 2008).
TC-catalyzed reactions usually start with the formation of a carbocation, so TCs can be classified according to how the carbocation is formed (Wendt and Schulz 1998): ionization-dependent TCs (class I, Fig. 2a) and protonation-dependent TCs (class II, Fig. 2b). Class I terpene cyclases initiate cyclization by ionization of the pyrophosphate group. Their active center contains a conserved aspartate-rich motif, 'DDXXD', which binds divalent metal ions (e.g., Mg2+); the metal ions coordinate the pyrophosphate group of the substrate and promote its departure, which triggers cyclization of the linear molecule. Class II terpene cyclases lack this aspartate-rich motif and instead initiate cyclization by protonation through the conserved N-terminal motif 'DXDD' (Dickschat 2016).
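As a rough illustration of how the two signature motifs can be used to sort candidate sequences, the sketch below scans a protein sequence for 'DDXXD' and 'DXDD' with regular expressions. It is a simplification under stated assumptions: motif presence alone does not prove a class, the function name and labels are hypothetical, and the toy sequences are invented for the example.

```python
import re

# 'X' in the motifs means any residue, written as '.' in the regex.
CLASS_I_MOTIF = re.compile(r"DD..D")   # aspartate-rich 'DDXXD'
CLASS_II_MOTIF = re.compile(r"D.DD")   # 'DXDD'


def classify_terpene_cyclase(seq: str) -> str:
    """Label a sequence by which signature motif(s) it contains."""
    seq = seq.upper()
    has_class_i = CLASS_I_MOTIF.search(seq) is not None
    has_class_ii = CLASS_II_MOTIF.search(seq) is not None
    if has_class_i and has_class_ii:
        return "both motifs"
    if has_class_i:
        return "class I candidate"
    if has_class_ii:
        return "class II candidate"
    return "no signature motif"


if __name__ == "__main__":
    # Hypothetical toy sequences, not real enzymes.
    for toy in ("MKTLLADDIYDVYGSW", "MSAWGLVDIDDTAMAF", "MKTAYIAKQR"):
        print(toy, "->", classify_terpene_cyclase(toy))
```

In practice such a scan would only be a first filter ahead of domain prediction and experimental validation.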
Although terpene cyclases have relatively low KM values, their extremely low kcat values severely limit the efficient production of terpenes (Table 1). Understanding the structure and catalytic mechanism of terpene cyclases is the key to solving this problem. Boutanaev et al. (2015) investigated the basis of terpenoid-scaffold diversity by analyzing multiple sequenced plant genomes and found that terpene cyclases are its primary drivers. Initially, researchers usually obtained terpene cyclases by extraction from plants or by PCR with degenerate primers (Hezari et al. 1995; Kawaide et al. 1997). With the rapid development of high-throughput sequencing technology, whole microbial genomes have become an important source of terpene cyclases (Hou and Dickschat 2020). Here we summarize several common methods for mining terpene cyclases to support the discovery of new enzymes.

Fig. 2
Different formation mechanisms of carbocation in class I and class II terpene cyclases. a Ionization-dependent reaction catalyzed by class I terpene cyclases with GPP as the substrate. b Protonation-dependent reaction catalyzed by class II terpene cyclases with GGPP as the substrate

Direct cloning
In this method, degenerate primers are designed based on highly conserved sequences identified by homology analysis. Sequence fragments of related enzymes are amplified by PCR, and the target enzyme genes are then cloned by constructing a cDNA library (Fig. 3). Kawaide et al. (1997) cloned the first fungal diterpene cyclase gene from the cDNA library of Phaeosphaeria sp. L487 by designing primers based on the conserved sequence of copalyl diphosphate (CPP) synthase in plants and amplifying the gene by PCR. The diterpene cyclase genes from Gibberella fujikuroi (Toyomasu et al. 2000) and Phoma betae (Oikawa et al. 2001) were also successfully cloned in the same way. Basyuni et al. (2007) cloned four multifunctional triterpene cyclases from Bruguiera gymnorrhiza leaves and Rhizophora stylosa using the homologous cloning method and successfully verified their substrates.
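Degenerate primers are usually written with IUPAC ambiguity codes, and their degeneracy (the number of distinct oligonucleotides in the mix) is a practical design constraint. The following sketch, under the assumption of standard IUPAC codes, computes the degeneracy and expands a short primer; the example primer GAYGAY (two Asp codons, as in an aspartate-rich motif) is illustrative only and is not taken from any of the cited studies.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    total = 1
    for base in primer.upper():
        total *= len(IUPAC[base])
    return total


def expand(primer: str) -> list[str]:
    """List every concrete sequence encoded by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    primer = "GAYGAY"  # hypothetical primer fragment encoding Asp-Asp
    print(degeneracy(primer), expand(primer))
```

Expanding is only sensible for short primers, since the number of sequences grows multiplicatively with each ambiguous position.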
The cloning method based on sequence homology is direct and convenient and is usually regarded as the first choice in gene cloning, but it has certain limitations. For example, when the enzymes have low homology or unusual structures, direct cloning is often unsuitable. For smaller diterpene cyclases, in addition to using probes for gene fishing in the cDNA library, the full-length target gene can also be obtained by first cloning the core fragments in conserved sequences and then extending to the complete sequence using RACE (rapid amplification of cDNA ends). For instance, Ye et al. (2018) used reverse transcription-polymerase chain reaction (RT-PCR) and RACE to clone the gene As-Ses TPS (a sesquiterpene cyclase from Aquilaria sinensis) from the total RNA of A. sinensis. As-Ses TPS encodes a germacrene D synthase, which is located in the cytoplasm and expressed only in the scent-forming part of A. sinensis.

Genome sequencing
With the rapid development of DNA sequencing technology, the genomes of many species beyond yeast, Escherichia coli, mice and other model organisms have been sequenced and annotated. To provide more comprehensive sets of candidate genes, many databases predict putative coding sequences while recording the genome information. For example, 362 fungal genomes are listed by the Joint Genome Institute (JGI, http://genome.jgi-psf.org), which aims to gather 1000 sequenced genomes from all fungal families in the next few years. Through similar websites, the genomes of target species and their gene annotations can be obtained (Fig. 4). Shibuya et al. (2009) predicted 13 homologous genes of oxidosqualene cyclases (OSCs) from the Arabidopsis genome sequence and obtained the corresponding cDNAs by RT-PCR. Screening, expression and substrate validation were carried out in a Saccharomyces cerevisiae strain deficient in lanosterol synthase. Matsuda et al. (2015) sequenced the genome of Emericella variecolor, which can produce a variety of terpenoids. Candidate genes were selected by domain prediction and by alignment of domains and conserved sequences. A number of terpene cyclases were successfully identified, and the closely related P450 monooxygenases of the same biosynthetic pathways were found in the gene clusters of these terpene cyclases.
Huang et al. Bioresour. Bioprocess. (2021) 8:66

Gene cluster retrieval
In the genomes of fungi and plants, the genes responsible for the biosynthesis of some secondary metabolites, such as polyketides, non-ribosomal peptides and terpenoids, are often organized in clusters (Bills and Gloer 2016). Gene cluster retrieval is therefore a useful way to discover novel enzymes of a given biosynthetic pathway: starting from the location of a known pathway gene, the adjacent genes are mined and characterized. This method has been widely used for mining terpene cyclases from eukaryotic (especially fungal) sources (Fig. 4).
GGPP synthase (GGPPS) is an essential enzyme for the synthesis of diterpene compounds in fungi, and its gene is usually located in the biosynthetic gene cluster of the diterpene. Therefore, by cloning the GGPPS gene and analyzing the surrounding gene cluster, the adjacent diterpene cyclase genes can be obtained. Oikawa et al. (2001) used this method to identify the diterpene cyclase of aphidicolin biosynthesis in P. betae. Toyomasu et al. (2008) applied it to Phomopsis amygdali and obtained a labdane-related diterpene cyclase. Compared with direct gene cloning, gene cluster retrieval requires neither the construction of a cDNA library nor homology among cyclase genes. It is an effective strategy for discovering novel terpene cyclase genes.
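The idea of retrieving neighbours of an anchor gene such as a GGPP synthase can be sketched as a simple coordinate query over an annotated genome. The example below is a minimal illustration, not the procedure used in the cited studies: the Gene record, the 20 kb window and all gene entries are hypothetical assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    contig: str
    start: int
    end: int
    annotation: str


def neighbours(genes: list[Gene], anchor_name: str, window_bp: int = 20_000) -> list[Gene]:
    """Return genes on the anchor's contig that overlap a window around it."""
    anchor = next((g for g in genes if g.name == anchor_name), None)
    if anchor is None:
        raise ValueError(f"anchor gene {anchor_name!r} not found")
    lo, hi = anchor.start - window_bp, anchor.end + window_bp
    return [
        g for g in genes
        if g.contig == anchor.contig
        and g.name != anchor.name
        and g.end >= lo
        and g.start <= hi
    ]


# Hypothetical annotation records for illustration only.
GENES = [
    Gene("ggs", "c1", 50_000, 51_200, "GGPP synthase"),
    Gene("orf2", "c1", 52_000, 54_500, "putative terpene cyclase"),
    Gene("orf3", "c1", 46_000, 47_500, "cytochrome P450"),
    Gene("orf4", "c1", 120_000, 121_000, "hypothetical protein"),
    Gene("orf5", "c2", 50_500, 51_500, "transporter"),
]

if __name__ == "__main__":
    for gene in neighbours(GENES, "ggs"):
        print(gene.name, gene.annotation)
```

The window size is a tunable guess; real clusters vary in extent, so candidates found this way still need functional characterization.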

Gene screening based on biosynthesis pathway
In E. coli engineered with the MVA pathway, accumulation of IPP is toxic and inhibits cell growth. If genes encoding IPP-consuming, terpene-synthesizing enzymes are introduced into the engineered strain, the growth inhibition may be relieved. For instance, Withers et al. (2007) introduced FPP synthase and sesquiterpene cyclase into E. coli, screened a cDNA library in a high-throughput mode, and obtained a gene encoding a hemiterpene cyclase that relieves the growth inhibition. As a new way to find terpene cyclases, gene screening based on the biosynthesis pathway does not depend on sequence homology and is suitable for finding terpene cyclase genes for which no genetic information is available.
Another screening method based on the biosynthesis pathway is to monitor the expression level of target proteins and the related terpenoid yield after adding an inducer. Xu et al. (2013) cloned three full-length cDNAs (ASS1, ASS2, ASS3) of sesquiterpene cyclases, which may be related to aroma formation, from a library of A. sinensis and expressed them in E. coli. After treatment with the plant hormone methyl jasmonate (MeJA), the expression of the ASSs was significantly induced and the yield of sesquiterpenes increased accordingly, so that sesquiterpene cyclases using FPP as the substrate were successfully identified.

Transcriptome analysis
The production of terpenoids increases after an inducer is added, and candidate genes for the relevant enzymes in the biosynthetic pathway of the target compound can be obtained by analyzing the changes in the transcriptome before and after induction (Lange and Ahkami 2013). Besides mining new enzymes, this method can also provide a more complete understanding of the biosynthetic pathway of terpenoids. Misra et al. (2014) measured the changes in gene transcription in sweet basil treated with the secondary metabolism elicitor MeJA and identified 388 candidate unique transcripts that respond to MeJA. Transcript analysis indicated that, in addition to controlling its own biosynthesis and the stress response, MeJA also up-regulates transcripts of various secondary metabolic pathways, including those of terpenoids and phenylpropanoids/flavonoids. In addition, the combination of transcript and metabolite analysis revealed the MeJA-induced biosynthesis of medically significant ursane-type and oleanane-type pentacyclic triterpenes. By transcript analysis, two MeJA-responsive oxidosqualene cyclases (ObAS1 and ObAS2), composed of 761 and 765 amino acid residues, respectively, were successfully identified.
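The comparison of transcript levels before and after induction can be illustrated with a log2 fold-change filter. This is a deliberately simplified sketch: it assumes one normalized expression value per condition, a pseudocount of 1 and a twofold threshold, whereas real analyses rely on replicates and statistical testing. Transcript names and values are invented.

```python
import math


def log2_fold_change(control: float, treated: float, pseudocount: float = 1.0) -> float:
    """log2 ratio of treated to control expression, stabilized by a pseudocount."""
    return math.log2((treated + pseudocount) / (control + pseudocount))


def induced_transcripts(
    counts: dict[str, tuple[float, float]], min_log2fc: float = 1.0
) -> dict[str, float]:
    """Return transcripts whose log2 fold change meets the threshold, strongest first."""
    hits = {}
    for transcript_id, (control, treated) in counts.items():
        lfc = log2_fold_change(control, treated)
        if lfc >= min_log2fc:
            hits[transcript_id] = lfc
    return dict(sorted(hits.items(), key=lambda item: item[1], reverse=True))


# Hypothetical (control, induced) expression values for illustration only.
TOY_COUNTS = {
    "tx_OSC_like": (15.0, 255.0),
    "tx_housekeeping": (100.0, 110.0),
    "tx_repressed": (63.0, 7.0),
}

if __name__ == "__main__":
    print(induced_transcripts(TOY_COUNTS))
```

Candidates passing such a filter would then be checked against annotation (for example, predicted cyclase domains) before cloning.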

Class I terpene cyclases
Class I terpene cyclases contain the DDXXD and NSE metal-binding motifs and occur with α, αβ and αβγ domain architectures (Gennadios et al. 2009; Janke et al. 2014; Koksal et al. 2011; Whittington et al. 2002). They initiate the cyclization reaction by ionization of the pyrophosphate group. Here, we introduce the structures and cyclization mechanisms of some class I terpene cyclases to improve the understanding of this class.
Bornyl diphosphate synthase (BPPS)
The first reaction catalyzed by bornyl diphosphate synthase from Salvia officinalis (SoBPPS) (Whittington et al. 2002) is the formation of an allylic carbocation intermediate by ionization of the pyrophosphate group (Fig. 5); the pyrophosphate group is then recaptured at the C3 atom, forming (3R)-linalyl diphosphate (LPP). Once the C2-C3 bond has rotated to the cis-configuration, the pyrophosphate group is ionized again and the subsequent cyclization proceeds (Fig. 5). This change of the substrate's C2-C3 bond from the trans- to the cis-configuration is common in terpene cyclases (Hyatt et al. 2007; Kampranis et al. 2007; Morehouse et al. 2017; Rudolph et al. 2016), although some cyclases do not need it, such as selinadiene synthase, in which the C2-C3 bond remains in the trans-configuration (Baer et al. 2014). BPPS (Fig. 6a) exists as a dimer (βα:αβ) whose active site is located in the α domain. It carries the characteristic class I cyclase sequences DDXXD and (N,D)DXX(S,T)XXXE. After binding three Mg2+ ions and the pyrophosphate group, BPPS changes from an open to a closed conformation (Fig. 6b, c), which excludes solvent and prevents the carbocation from being quenched prematurely by solvent molecules during cyclization. F578 and W323 stabilize the reaction intermediates through cation-π interactions. In the closed conformation, water molecule #110 forms hydrogen bonds with Y426, S451 and the pyrophosphate group, which plays an important role in controlling the molecular conformation (Whittington et al. 2002). The three basic residues R314, R493 and K512 help Mg2+ bind the pyrophosphate group and, together with the metal ions, form the pyrophosphate recognition motif. Despinasse et al. (2017) isolated a new bornyl diphosphate synthase (LaBPPS) from Lavandula angustifolia, which shares 50% protein sequence identity with the BPPS originally isolated from S. officinalis (SoBPPS). LaBPPS produces more by-products, such as α-pinene, β-pinene, camphene and linalool. By comparing the sequences and structures of LaBPPS and SoBPPS, the key amino acid residues that change the product distribution, or even the molecular structure of the product, may be found. In recent years, computational methods have also been applied to the study of terpene cyclases. Methods such as TerDockin (O'Brien et al. 2018) and EnzyDock (Das et al. 2019) can predict the orientation of substrates and carbocation intermediates, showing the importance of computational approaches in the study of terpene cyclase structures.

Limonene synthase (LS)
Limonene is a chiral monoterpene with two enantiomers, (4R)-(+)-limonene and (4S)-(−)-limonene. Limonene has biofuel potential (Beller et al. 2015), antibacterial and antioxidant activities (Fahim et al. 2019) and a cancer-preventive effect. Oral limonene and other limonene-based drugs for the treatment of breast cancer are in clinical trials (Silva et al. 2019; Singh and Sharma 2015).
(+)-Limonene and (−)-limonene are synthesized by specific limonene synthases. Hyatt et al. (2007) obtained the crystal structure of (−)-LS from Mentha spicata (PDB ID: 2ONG & 2ONH). Morehouse et al. (2017) obtained the crystal structure of (+)-LS from Citrus sinensis (PDB ID: 5UV0). Similar to BPPS, LS has two domains (α & β), and its catalytically active site is located in the α domain. Although the identity between the two LS sequences is only 44.7%, the amino acid residues in their active sites are highly conserved. Kumar et al. (2017) examined the origin of the stereoselectivity of the two LSs and found that the residues M458/I450 and N345/I336, located in the active sites of (−)-LS and (+)-LS, determine the binding orientation of the substrate in the enzyme: when the substrate binds to (−)-LS in the wrong orientation, the M458 residue sterically clashes with it, so the substrate can only adopt the orientation suitable for (−)-limonene synthesis. The I336 residue of (+)-LS plays a similar role in the synthesis of (+)-limonene.
Compared with other monoterpene cyclases, the LS-catalyzed cyclization is simpler: limonene is obtained by a one-step deprotonation after the configuration change, and the existence of the intermediate LPP can be further verified with the fluorinated substrate analogue 8,9-difluorogeranyl diphosphate (DFGPP) (Morehouse et al. 2019). With sufficient information on the structure-activity relationship between a monoterpene cyclase and its substrate, limonene synthase might be used to catalyze the synthesis of more monoterpene skeletons by mutating key residues (Fig. 7) (Srividya et al. 2015; Xu et al. 2018). Xu et al. (2017) successfully converted (−)-LS from M. spicata into a pinene synthase (N345A/L423A/S454A) and a phellandrene synthase (N345I) by protein engineering, revealing the plasticity of the LS active center and providing a reference for the redesign of monoterpene cyclases.
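Variant labels such as N345A/L423A/S454A follow the usual "wild-type residue, position, new residue" notation. The sketch below parses such labels and applies them to a sequence string, checking that the stated wild-type residue is actually present. It is an illustrative helper only: the function names are hypothetical, positions are assumed to be 1-based, and the five-residue test sequence is a toy rather than a real limonene synthase.

```python
import re

# One substitution: wild-type residue, 1-based position, replacement residue.
MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_variant(label: str) -> list[tuple[str, int, str]]:
    """Split a label like 'N345A/L423A' into (wild type, position, new) tuples."""
    mutations = []
    for token in label.replace(" ", "").split("/"):
        match = MUTATION.match(token)
        if match is None:
            raise ValueError(f"cannot parse mutation {token!r}")
        mutations.append((match.group(1), int(match.group(2)), match.group(3)))
    return mutations


def apply_variant(sequence: str, label: str) -> str:
    """Return the mutated sequence, verifying each wild-type residue first."""
    residues = list(sequence)
    for wild_type, position, new in parse_variant(label):
        index = position - 1
        if not 0 <= index < len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[index] != wild_type:
            raise ValueError(f"expected {wild_type} at {position}, found {residues[index]}")
        residues[index] = new
    return "".join(residues)


if __name__ == "__main__":
    print(parse_variant("N345A/L423A/S454A"))
    print(apply_variant("MNLSA", "N2I"))  # toy sequence
```

The wild-type check is a cheap guard against numbering mismatches, for example when a construct lacks a transit peptide.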

1,8-Cineole synthase (CS)
1,8-Cineole, also known as eucalyptol, is the main component of eucalyptus oil and has anti-inflammatory (Li et al. 2016), antiviral, anti-microbial and antioxidant (Seol and Kim 2016) activities. During the CS-catalyzed reaction, the α-terpinyl cation intermediate that commonly appears in monoterpene cyclase-catalyzed reactions is also formed (Croteau et al. 1994). Finally, the carbocation is quenched by a water molecule to form α-terpineol. This shows that solvent molecules have an important influence on the stability of reaction intermediates and on the quenching of carbocations in the reactions catalyzed by terpene cyclases (Fig. 8).
The crystal structure of SfCS from Salvia fruticosa (PDB ID: 2J5C) showed that it has αβ domains (Kampranis et al. 2007). Recently, the crystal structure of ScCS from Streptomyces clavuligerus ATCC 27064, the first monoterpene cyclase from bacteria, was solved (Karuppiah et al. 2017). The bacterial ScCS was found to contain only an α domain, which differs significantly from the cineole synthases of plants.
In SfCS, N338 is a residue that is conserved in the CSs from S. officinalis and Arabidopsis thaliana, probably because its side chain can form hydrogen bonds with a water molecule. N338 may therefore play a key role in positioning the water molecule, quenching the carbocation intermediate and producing α-terpineol. After mutation of N338, the variants no longer produced cineole; instead they produced sabinene, limonene and small amounts of α/β-pinene and myrcene. This further confirmed the important role of N338 (Kampranis et al. 2007). In ScCS, W58 and N305 position a water molecule, which may be crucial for the subsequent quenching of the carbocation.
Why a terpene cyclase binds substrates of a specific chain length is closely related to the structure of its substrate-binding pocket. The N338A variant of SfCS has an expanded substrate pocket, which shifts its substrate from GPP (C10) to the universal sesquiterpene substrate FPP (C15) and results in 49% trans-α-bergamotene (Kampranis et al. 2007). Karuppiah et al. (2017) compared the structure of ScCS with those of a sesquiterpene cyclase, aristolochene synthase from Aspergillus terreus (AtAS, PDB ID: 4KUX), and a selinadiene synthase (SdS, PDB ID: 4OKZ), and found that the two residues F77 and F179 limit the size of the substrate-binding pocket of ScCS. Linalool/nerolidol synthase (ScLinS, PDB ID: 5NX4), from the same source as ScCS, produces linalool with GPP as the substrate and nerolidol with FPP as the substrate. The residues of ScLinS at the positions corresponding to F77 and F179 of ScCS are T75 and C177, which are less sterically demanding. These two cases show that the substrate-binding pocket of a terpene cyclase restricts the size of the substrate, so that only substrates of suitable chain length can enter the active center. Structural modification of the substrate pocket may therefore yield new cyclases that accommodate substrates of different chain lengths.
α-Bisabolol synthase (α-BOS)
The terpene cyclases corresponding to the four isomers of α-bisabolol have been discovered (Albertti et al. 2018; Attia et al. 2012; Muangphrom et al. 2019; Nakano et al. 2011). Li et al. (2013) obtained the crystal structure (PDB ID: 4FJQ) of an α-bisabolol synthase (AaBOS) from Artemisia annua. AaBOS has αβ domains, and its active center is located in the α domain. The characteristic DDXXD sequence is located in the D helix and the (N,D)DXX(S,T)XXXE metal-binding motif in the H helix, except that G replaces S/T in AaBOS. Previous studies have shown that G can retain the ability of S/T to bind metal ions in plant terpene cyclases (Zhou and Peters 2009).
AaBOS catalyzes the cyclization of FPP (Fig. 9), and the carbocation intermediate is finally quenched by a water molecule to produce the hydroxylated product α-bisabolol. Based on the crystal structure, a five-site mutant, AaBOS-Mut (V373N/L381A/I395V/N398I/L399T), that produces γ-humulene as the major product (68.8%) was identified. Investigation of the structure-activity relationship of the mutant indicated that the mutation L399T is essential for the production of γ-humulene.
Amorphadiene synthase (AaADS) from A. annua, which catalyzes the first step of artemisinin biosynthesis, shares 82% sequence identity with AaBOS. The study of the structural mechanism of AaBOS therefore provides a reference for research on AaADS. As the crystal structure of amorpha-4,11-diene synthase is not available, AaBOS can be used as an excellent template for homology modeling of AaADS. Recently, (+)-α-bisabolol synthases from Artemisia kurramensis and Artemisia maritima have been found. These two BOSs produce (+)-α-bisabolol as their only product, revealing high enantioselectivity and product specificity. Enantiomerically pure (+)-α-bisabolol can be synthesized de novo in a yeast expression system, with a yield of about 83 mg/L (Muangphrom et al. 2016).
How terpene cyclases control water molecules is currently unclear. The final catalytic step of α-bisabolol synthase is the quenching of the carbocation by a water molecule, whereas α-bisabolene synthase yields a non-hydroxylated product through deprotonation. The crystal structure of α-bisabolene synthase (AgBOeS) from Abies grandis has been obtained (McAndrew et al. 2011); it has an αβγ three-domain structure, with kcat/KM = 38 M−1 s−1. If co-crystal structures of AaBOS and AgBOeS with alternative substrates were obtained, structural analysis and comparison might reveal how AaBOS controls water molecules, which could inform the study of water control in terpene cyclases more generally.
Bisabolane, the saturated alkane corresponding to α-bisabolol, is a biofuel. It can be produced from α-bisabolol or α-bisabolene, but the cost of biological synthesis is too high. If the catalytic efficiency of the cyclase can be improved by protein engineering, the cost will be reduced. For this reason, research on the interaction and structure-activity relationship between enzyme and substrate is very important.

epi-Isozizaene synthase (EIZS)
Albaflavenone is synthesized by the EIZS-catalyzed cyclization of FPP followed by two consecutive oxidation steps catalyzed by CYP170A1 (Fig. 10) (Zhao et al. 2008). After the formation of the bisabolyl cation, the EIZS-catalyzed cyclization proceeds through hydride transfer and methyl transfer steps, finally forming epi-isozizaene. The crystal structure of EIZS from Streptomyces coelicolor A3(2) (Fig. 11) has been solved (PDB ID: 3KB9), indicating that EIZS has only an α domain. While being coordinated by Mg2+, the bound pyrophosphate group also forms hydrogen bonds with R194, K247, R338 and Y339. These interactions stabilize the closed conformation of EIZS and promote the ionization of the pyrophosphate group. Comparison of the crystal structure of ligand-free EIZS D99N with that of the EIZS complex with pyrophosphate shows obvious conformational changes in the helices and loops of the active center, which expel solvent molecules from the active pocket and prevent premature quenching of the carbocation intermediates. Using the benzyltriethylammonium cation (BTAC) to mimic the reaction intermediate bisabolyl cation, it was found that the cation intermediate is stabilized by electrostatic interaction with the pyrophosphate anion and by cation-π interactions with F95, F96 and F198. Mutation of these aromatic residues breaks these stabilizing effects and may cause premature quenching of the carbocations, thereby producing other cyclized molecules.
The contour of the active-site pocket of a terpene cyclase closely resembles the structure of its product. When this contour does not match the product molecule well, more by-products are produced. The EIZS-catalyzed reaction produces 80% epi-isozizaene and minor amounts of other sesquiterpene products, indicating a certain degree of promiscuity of the cyclase. By replacing some hydrophobic residues with polar amino acids, the contour of the active pocket can be changed (Blank et al. 2017; Li et al. 2014), allowing the enzyme to produce more valuable cyclized molecules (Fig. 12). The product of variant F95H (Li et al. 2014) was β-curcumene (50% purity), which can be converted into the biofuel bisabolane by hydrogenation. The measured kcat/KM of this variant was 2600 M−1 s−1, about 70 times that of the α-bisabolene synthase mentioned before (see "α-Bisabolol synthase (α-BOS)" section), indicating the high catalytic efficiency of this mutant enzyme. Analysis of the crystal structure of the F95Q mutant (PDB ID: 6OFV) showed that its active pocket is enlarged, which makes the binding of FPP more flexible and thereby weakens the precise control that the original enzyme exerts over the cyclization reaction (Blank et al. 2019).
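The comparison of catalytic efficiencies quoted above can be reproduced with a one-line ratio. The sketch below uses only the two kcat/KM values given in this review (2600 M−1 s−1 for EIZS F95H and 38 M−1 s−1 for AgBOeS); the variable and function names are illustrative.

```python
def fold_difference(variant_efficiency: float, reference_efficiency: float) -> float:
    """Ratio of two catalytic efficiencies (kcat/KM) given in the same units."""
    if reference_efficiency <= 0:
        raise ValueError("reference efficiency must be positive")
    return variant_efficiency / reference_efficiency


# kcat/KM values in M^-1 s^-1, as quoted in the text.
EIZS_F95H_EFFICIENCY = 2600.0
AGBOES_EFFICIENCY = 38.0

if __name__ == "__main__":
    ratio = fold_difference(EIZS_F95H_EFFICIENCY, AGBOES_EFFICIENCY)
    print(f"EIZS F95H vs AgBOeS: {ratio:.1f}-fold")
```

The ratio comes out at roughly 68, which is consistent with the approximately 70-fold difference stated in the text.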

Fig. 12
Predominant cyclization products of EIZS mutants. Main products of the corresponding mutants were summarized from Blank et al. (2017) and Li et al. (2014)

The polarity change at this site also lets new water molecules enter the active pocket. In the crystal structure, the conformation of the benzyltriethylammonium cation, which mimics the reaction intermediate bisabolyl cation, is changed by the influence of water molecules.

Aristolochene synthase (AS)
The bicyclic sesquiterpene aristolochene is the precursor for the synthesis of gigantenone, sporogen-AO1, bipolaroxin, PR-toxin and other mycotoxins. The crystal structures of the aristolochene synthases from Penicillium roqueforti (PrAS, PDB ID: 1DI1) (Caruthers et al. 2000) and A. terreus (AtAS, PDB ID: 2OA6) (Shishova et al. 2006) have been solved. The identity of the two protein sequences is 61%. The amino acids in the active site are highly conserved, and the differing residues are located mainly on the protein surface.
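Sequence identity values such as the 61% quoted for PrAS and AtAS are computed from a pairwise alignment. The following sketch shows one common convention, assumed here for illustration: identical positions divided by the number of aligned (non-gap) columns. Other conventions use the full alignment length or the shorter sequence as the denominator, so reported values can differ slightly. The aligned strings are toy data, not the real enzymes.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical pre-aligned toy sequences.
    print(percent_identity("MKT-LDDIYD", "MRTALDD-YE"))
```

The function expects sequences that are already aligned; producing the alignment itself is a separate step handled by dedicated alignment tools.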
PrAS is a class I terpene cyclase with a single α domain (Caruthers et al. 2000). The isoprenoid chain of the substrate reacts at the hydrophobic interface formed by F112, F178, L108, L111 and G205. The residues R200, K251, Y341 and R340 interact with the pyrophosphate group, the metal ions and the DDXXD/E motif through hydrogen bonds or salt bridges to stabilize the binding of the pyrophosphate group, which is very important for the conformational transition of the enzyme (Faraldos et al. 2012). This stabilization effect is common in terpene cyclases, such as BPPS (as mentioned in "Bornyl diphosphate synthase (BPPS)" section) and EIZS (as mentioned in "epi-Isozizaene synthase (EIZS)" section).
The template role of the enzyme throughout the cyclization reaction is very important. Site-directed mutagenesis has revealed the roles of multiple residues: Y92 steers FPP into a conformation that favors cyclization and folds it toward (S)-(−)-germacrene A (Calvert et al. 2002); W334 and F112 stabilize the carbocation intermediates through cation-π interactions (Faraldos et al. 2011a; Forcat and Allemann 2006); F178 promotes the cyclization of FPP and the formation of the eudesmane cation, and also interacts with F112 to co-stabilize the transition state of the cyclization (Forcat and Allemann 2006); and L108, Y92 and F112 jointly guide FPP into the correct conformation (Faraldos et al. 2011b). PrAS can also convert the unnatural substrate 7-methylene farnesyl diphosphate to aristolochene and a small amount of valencene (Faraldos et al. 2016), showing that PrAS has potential for protein engineering, in which mutation of key catalytic residues gives access to a variety of products (Calvert et al. 2002; Faraldos et al. 2011b; Felicetti and Cane 2004; Forcat and Allemann 2006).
AtAS is also a class I terpene cyclase with a single α domain (Shishova et al. 2006), and its residues R175, K226, R314 and Y315 play the same roles as the corresponding residues of PrAS. AtAS crystallizes as a tetramer, although gel analysis verified the presence of dimeric enzyme. Earlier work found that the activity of AtAS decreased when its concentration was higher than 27 nM (Felicetti and Cane 2004), from which it was inferred that the tetrameric form may inhibit enzyme activity. Chen et al. (2013) obtained co-crystal structures of AtAS with substrates and carbocation intermediate analogs, giving more insight into the structure-function relationship between the enzyme and the substrate. Van der Kamp et al. (2013) proposed a dynamic process for the AtAS conformational change based on molecular dynamics (MD) simulation: the substrate binds first, followed by two Mg2+ ions; the enzyme then changes from the open to the closed conformation; finally the last Mg2+ ion binds and the cyclization reaction starts. When the reaction is finished, the product and two Mg2+ ions are released first, and the pyrophosphate and the remaining Mg2+ ion are released last. Water molecules are present at the active site of AtAS in the reaction state, but the final product is not hydroxylated, indicating that these water molecules are held in relatively fixed positions by hydrogen bonds to surrounding residues. They do not participate directly in the cyclization reaction but are part of the network that controls molecular conformation. Zhang et al. (2016a, b) studied AtAS, farnesyl diphosphate synthase (FPPS) and 5-epi-aristolochene synthase (EAS) using QM(DFT)/MM and MD simulation, and concluded that the protonation state of the pyrophosphate group differs among these enzymes and is closely related to the enzyme structure and its microenvironment.
In AtAS, the pyrophosphate group was presumed to be in a deprotonated state (PPi3−); in FPPS, it was suggested to be singly protonated (PPi2−); and in EAS, doubly protonated (PPi−). This finding allows a better understanding of the mechanism of pyrophosphate cleavage.

5-epi-Aristolochene synthase (or epi-aristolochene synthase, EAS)
Capsidiol is a phytoalexin that protects against fungal and bacterial infections. It is synthesized from FPP by EAS and 5-epi-aristolochene hydroxylase (EAH) (Li et al. 2015). EAS from Nicotiana tabacum was the first plant terpene cyclase for which a crystal structure was obtained. The structure shows that the enzyme is a monomer with an αβ domain architecture and contains the class I terpene cyclase metal-binding motifs DDXXD and (N,D)DXX(S,T)XXXE. Its active center was determined by analyzing the crystal structure of the enzyme in complex with an FPP analog (Starks et al. 1997).
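The two class I metal-binding motifs named above can be written as simple sequence patterns. The sketch below is only an illustration of how such motifs might be located in a protein sequence with regular expressions, reading X as any residue. The function name and the toy sequence are hypothetical; the two motif instances embedded in it (DDMAD and NDTKTYQAE) are the ones this review quotes for taxadiene synthase, but the flanking residues are invented.

```python
import re
from typing import List, Tuple

# X = any residue; (N,D) and (S,T) are alternatives at one position.
# Lookaheads are used so that overlapping occurrences are all reported.
CLASS_I_MOTIFS = {
    "DDXXD": re.compile(r"(?=(DD..D))"),
    "(N,D)DXX(S,T)XXXE": re.compile(r"(?=([ND]D..[ST]...E))"),
}


def find_motifs(sequence: str) -> List[Tuple[str, int, str]]:
    """Return (motif name, 1-based start position, matched residues)."""
    seq = sequence.upper()
    hits = []
    for name, pattern in CLASS_I_MOTIFS.items():
        for match in pattern.finditer(seq):
            hits.append((name, match.start() + 1, match.group(1)))
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    # Hypothetical toy sequence with invented flanking residues.
    toy = "GAVLI" + "DDMAD" + "GSGSGS" + "NDTKTYQAE" + "KLM"
    for name, position, residues in find_motifs(toy):
        print(name, position, residues)
```

Short patterns like these also occur by chance, so a hit is a hint rather than proof of function; a pattern scan of this kind is best treated as a first filter before profile-based or structural analysis.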
R264 of EAS interacts directly with the phosphate group and holds the substrate in position. R266 interacts with Y527 and T528 to bring the enzyme into a semi-closed state, and then R266 flips to form a salt bridge with E531, which closes the conformation completely (Gennadios et al. 2009; Rising et al. 2015; Zhang et al. 2016a). The N-terminal β domain stabilizes the helices and loops of the active site; without it the R266-E531 salt bridge is broken and the closed conformation cannot form. W273 and Y527 in the active pocket play an important role in positioning and stabilizing the substrate. When the substrate and Mg2+ enter the pocket, W273 stacks through a π-π interaction on the C6,7 double bond of the substrate, which makes the substrate more reactive and promotes the C1,10 ring closure. Y527 stabilizes the reaction intermediate through a cation-π interaction and also assists the conformational change of the enzyme. Mutation of Y527 makes the active site more flexible, so that the closed conformation cannot form (Zhang et al. 2016a).
Compared with the reaction catalyzed by aristolochene synthase (AS), the EAS product has a different configuration, which shows that the active pocket of a terpene cyclase is an important template for substrate folding. Y520 of EAS acts as a proton donor that protonates the intermediate (R)-(+)-germacrene A (Starks et al. 1997). Y92 in PrAS is inferred to have the same function, but Y92 of PrAS does not align with Y520 of EAS, which may favor the formation of products with different configurations (Caruthers et al. 2000). O'Brien et al. (2016) combined quantum mechanics with computational docking to construct all-atom models of all possible reaction intermediates in the active pocket of EAS, which is of great significance for enzyme structure analysis and protein engineering. Zhang et al. (2019) studied the fidelity of AtAS and the promiscuity of EAS using QM/MM and found that the D444-Y520 dyad acts as an additional acid-base pair that increases the promiscuity of EAS. In AtAS, F81 and F147 stabilize the carbocation intermediate through steric hindrance and cation-π interactions, which enhances the fidelity of AtAS.
N. tabacum 5-epi-aristolochene synthase and Hyoscyamus muticus premnaspirodiene synthase share 75% sequence identity. O'Maille et al. (2008) found 9 key residues that may affect the final product of these enzymes, and EAS was successfully converted into a premnaspirodiene synthase by replacing the amino acid residues at these 9 positions. A comparison of the structures of N. tabacum 5-epi-aristolochene synthase and H. muticus premnaspirodiene synthase with those of their mutants (Koo et al. 2016) showed that the catalytic promiscuity of a terpene cyclase is related to the stability of the enzyme itself: increased stability reduces turnover and decreases the diversity of by-products. This is because the increased rigidity of the enzyme reduces the flexibility of conformational changes and slows product release, which strengthens the enzyme's template effect on the cyclization reaction and reduces the generation of by-products. These studies deepen our understanding of the catalytic promiscuity of terpene cyclases and provide new ideas for their protein engineering. EAS can also accept unnatural substrates: it uses both the unnatural substrate (cis,trans)-FPP and the natural substrate (trans,trans)-FPP to produce epi-aristolochene. EAS can also cyclize anilinogeranyl diphosphate to form a new macrocyclic alkaloid, 3,7-dimethyl-trans,trans-3,7-azaparacyclophane-diene (geraniline) (Rising et al. 2015). These studies show the flexibility of these proteins and suggest that they may be good starting points for extensive molecular evolution.

Diterpene cyclases
Taxadiene synthase (TS)
Paclitaxel is a tetracyclic diterpenoid bearing multiple oxygen-containing functional groups. It was the first tubulin stabilizer: by promoting tubulin assembly and stabilizing microtubules it arrests cells in the G2-M phase, which leads to apoptosis and underlies its anti-cancer activity (Yang and Horwitz 2017). In 1992, paclitaxel was approved by the U.S. Food and Drug Administration (FDA) for the treatment of ovarian cancer, and it was subsequently used in the treatment of breast cancer, head and neck cancer, lung cancer and other cancers, becoming one of the most widely used anti-cancer drugs in the world.
Taxadiene, the skeleton of paclitaxel, is generated by cyclization of GGPP (Fig. 13). The reaction involves ionization of the pyrophosphate group, proton transfer, and deprotonation to eliminate the carbocation (Koksal et al. 2011; Lin and Hezari 1996; Schrepfer et al. 2016). Koksal et al. (2011) solved the crystal structure of TS from T. brevifolia in complex with Mg2+ and either 13-aza-13,14-dihydrocopalyl diphosphate (ACP) or 2-fluorogeranylgeranyl diphosphate (FGP) (TS-ACP, PDB ID: 3P5P; TS-FGP, PDB ID: 3P5R). This was the first crystal structure of a diterpene cyclase, and it contains α, β and γ domains (Fig. 14a). The catalytic site is located in the α domain and contains the typical class I terpene cyclase metal-binding motifs, D613DMAD and N757DTKTYQAE. However, the crystallized protein is inactive, because the transit sequence was removed and an additional 27 amino acid residues were truncated.
[Figure caption fragment] c Structure of EAS complexed with the substrate analog farnesyl hydroxyphosphonate and Mg2+ (PDB ID: 5EAT). d Metal-binding motifs of taxadiene synthase. e Effector triad of taxadiene synthase, located in the helix shown in magenta. f Key aromatic and polar residues of taxadiene synthase. Mg2+ is in green; the α, β and γ domains are in white, light blue and wheat, respectively; the N-terminal segment that 'caps' the active site is shown in red
As with the N-terminal segments of BPPS (Whittington et al. 2002) and EAS (Starks et al. 1997) (Fig. 14b, c), the truncated N-terminal sequence lies above the active site and can cover it, converting the enzyme into the closed conformation that is necessary for the cyclization reaction. Because the crystal structure obtained so far does not contain this N-terminal sequence, it represents only an inactive polypeptide rather than the native enzyme. Its conformation is not the closed state required for the catalytic reaction but a half-closed state (Koksal et al. 2011), which makes it difficult to study the structure-function relationship between enzyme and substrate. Therefore, the structural mechanism of taxadiene synthase has mostly been elucidated by homology modeling, quantum mechanical simulation and molecular dynamics calculations. Schrepfer et al. (2016) used the crystal structures of BPPS and of its complex with Mg2+ and 3-azageranyl diphosphate (PDB ID: 1N20) (Whittington et al. 2002) as templates to construct a model of TS in the closed conformation. By combining energy minimization and MD simulation, they analyzed the conformational changes and key residues involved in the entire cyclization reaction of TS. The results of the simulations were verified by site-directed mutagenesis.
Through this model, the effector triad of taxadiene synthase (i.e., the pyrophosphate receptor R754, the linker S713, and the effector V714-O) was found to be closely related to the ionization of the pyrophosphate group. The receptor R754 shifts first to form a hydrogen bond with the pyrophosphate group of the substrate, while the linker S713 also shifts and forms a hydrogen bond with V714-O. This conformational change causes the carbonyl oxygen of the effector V714 (V714-O) to flip into the active center, which is important for the formation of the closed conformation and the ionization of the pyrophosphate group. This effector triad is widely present in class I terpene cyclases, but its spatial position differs among enzymes from different sources (Baer et al. 2014). Based on the earlier model, Van Rijn et al. (2019) further explored the cyclization mechanism of taxadiene synthase using the QM/MM method and found that the carbocation rearrangements catalyzed by TS are highly sensitive to changes in the microenvironment, owing to the position, orientation and conformation of the cation, the electrostatic interactions with the active site, and the influence of water molecules. Freud et al. (2017) clarified the proton transfer step of the TS-catalyzed cyclization using the QM/MM method. Energetically, this work showed that direct transfer from the verticillen-12-yl cation to the verticillen-8-yl cation in one step is more favorable than the indirect route through the verticillen-4-yl cation.
Aromatic residues of the protein are essential for stabilizing the carbocation intermediates during the cyclization reaction. For example, W753, F834, Y835 and Y841 can stabilize carbocations through cation-π interactions. Mutating these residues can terminate the cyclization cascade early to give the corresponding partially cyclized products, which is of great significance for mechanistic studies and for the design of synthetic methods for terpenoid cyclization. The mutants W753H and Y841F give cembrene A and verticillia-3,7,12(13)-triene, respectively, as their main products (Schrepfer et al. 2016). Ansbacher et al. (2018) constructed a structural model of the W753H mutant and showed that, energetically, the mutant favors deprotonation of the intermediate to form cembrene A. The polar residues S587, Q609, Y684, Y688, C719 and C830 may act as catalytic bases for the deprotonation that yields taxadiene (Koksal et al. 2011). Edgar et al. (2017) examined the effects of these residues and found that the mutant Q609G gives verticillia-3,7,12(13)-triene as the sole product, while the mutant Y688L produces taxa-4(20),11(12)-diene, which is more readily hydroxylated by taxadiene-5α-hydroxylase (CYP725A4) (Jennewein et al. 2004), the enzyme that catalyzes the first step in converting taxadiene into the anticancer molecule paclitaxel. Pemberton et al. (2017) investigated the function of the β and γ domains of TS and found that TS loses its cyclization ability when both domains are removed, that the catalytic activity is greatly reduced when only the γ domain is removed, and that the β domain also influences the fidelity of the enzyme, indicating that the functional domains of the enzyme interact with each other.

Cyclooctat-9-en-7-ol synthase (CotB2)
Janke et al. (2014) obtained the crystal structure of CotB2 from Streptomyces melanosporofaciens (PDB ID: 4OMG), the first crystal structure of a bacterial diterpene cyclase, but this protein is inactive, with 15 N-terminal residues and 10 C-terminal residues truncated (Fig. 15a). Subsequently, Driller et al. (2018) obtained the co-crystal structure (PDB ID: 6GGI) of CotB2 with Mg2+ and a substrate analogue, 2-fluoro-3,7,18-dolabellatriene. This structure shows the closed conformation of the reaction state, in which the C-terminal residues that were truncated in the earlier structure cover the active center of the enzyme like a lid (Fig. 15b).
CotB2 is a class I terpene cyclase, but its metal-binding motif DDXD differs from the typical DDXXD sequence: a kink is introduced at a proline residue near the third aspartic acid of this motif, which makes α-helix D, the helix containing the aspartate-rich motif, shorter than in other class I terpene cyclases (Janke et al. 2014). CotB2 also has an effector triad (receptor R177, linker D180 and effector G182-O), but the triad has less influence on the enzyme conformation (Tomita et al. 2017). This may be because R177 and D180 are already connected by a salt bridge in the open conformation, which reduces the conformational change upon substrate binding (Driller et al. 2019).
The CotB2-catalyzed reaction involves an unusual carbon rearrangement (Meguro et al. 2015): C8,9 rearrange during the cyclization (Fig. 15c), which shows the special synthetic ability and precise stereochemical control of this terpene cyclase. The hydroxylated product of CotB2 is obtained when a water molecule quenches the carbocation. During the reaction, the active pocket of the terpene cyclase contains several water molecules, which can stabilize the intermediates by forming hydrogen bonds. Water molecules are also able to quench carbocations, but those held by hydrogen bonds to surrounding amino acids do not do so. Site-directed mutagenesis of these residues may inactivate the enzyme or lead to additional products (Fig. 16) (Görner et al. 2013; Janke et al. 2014; Tomita et al. 2017). With the help of computational methods, the reaction mechanisms of CotB2 and its mutants have been well studied, pointing the way for future terpene cyclase design (Raz et al. 2020; Tang et al. 2020).

Sesterterpene cyclase
Sesterterpene cyclase from Streptomyces mobaraensis (SmTS1) with multiple products
Huang et al. Bioresour. Bioprocess. (2021) 8:66
Recently, Hou and Dickschat (2020) reported the first bacterial geranylfarnesyl diphosphate (GFPP) synthase (SmGFPPS) together with a multiproduct sesterterpene synthase (StTS), SmTS1, both from Streptomyces mobaraensis. SmTS1 converts GFPP into a mixture of sesterterpene hydrocarbons and one sesterterpene alcohol (Fig. 17). The reaction is initiated by protonation of GFPP. Isotope labeling experiments revealed that different modes of hydrogen and methyl migration lead to the different products, and that quenching of a cation by a water molecule gives the sesterterpene alcohol. Starting from the ophiobolin F synthase from Aspergillus clavatus, a few StTSs from fungi (Chiba et al. 2013; Matsuda et al. 2015, 2016; Mitsuhashi et al. 2017; Okada et al. 2016; Qin et al. 2016; Ye et al. 2015) and plants (Huang et al. 2017, 2018; Shao et al. 2017) have been discovered in recent years. In plants, clustered genes encoding two discrete enzymes, GFPPS and sesterterpene synthase, are found, while in fungi sesterterpene biosynthesis is always carried out by bifunctional enzymes containing a C-terminal trans-prenyltransferase (PT) domain and an N-terminal terpene synthase (TPS) domain. The PT domain produces the universal C25 sesterterpene precursor GFPP, which is then cyclized by the TPS domain to form diverse scaffolds. Although most StTSs produce their sesterterpene backbones via a class I cyclization mechanism, AtTPS06 from A. thaliana uses a class II cyclization mechanism (Chen et al. 2020b). Sesterterpenoids are among the rarest of all isoprenoids, with approximately 1300 compounds known; they are widely distributed in terrestrial fungi, cyanobacteria, lichens, higher plants, insects and various marine organisms, and exhibit noteworthy biological activities (Li and Gustafson 2020; Liu et al. 2007; Wang et al. 2013).
Apart from their remarkable biological activities, the complex molecular structures of sesterterpenoids attract many chemists and biologists to explore their potential (Guo et al. 2021).

Class II terpene cyclases
Class II terpenoid cyclases initiate catalysis by protonation of a carbon-carbon π bond or epoxide moiety in an isoprenoid substrate to yield a carbocation. These enzymes contain the characteristic sequence motif DXDD which is unrelated to the aspartate-rich motif of class I terpene synthases. The β and γ domains compose the active site of class II terpene synthases. Here, we introduce several class II terpene synthases to enhance the understanding of enzyme structure and function.
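Because the class I (DDXXD) and class II (DXDD) motifs are unrelated, their presence or absence gives a crude first guess at the cyclization class of an uncharacterized sequence. The sketch below illustrates that idea only; it is not a method described in this review. The function name, the returned labels and the toy sequences are assumptions made for illustration, and the motif instances DDMAD and DIDD are those quoted in this review for taxadiene synthase and AtCPS, respectively.

```python
import re

DDXXD = re.compile(r"DD..D")  # class I aspartate-rich motif, X = any residue
DXDD = re.compile(r"D.DD")    # class II motif


def guess_cyclase_class(sequence: str) -> str:
    """Crude motif-based guess; real annotation needs far more evidence."""
    seq = sequence.upper()
    has_class_i = DDXXD.search(seq) is not None
    has_class_ii = DXDD.search(seq) is not None
    if has_class_i and has_class_ii:
        return "both motifs"
    if has_class_i:
        return "class I motif"
    if has_class_ii:
        return "class II motif"
    return "no motif"


if __name__ == "__main__":
    # Hypothetical toy sequences with invented flanking residues.
    print(guess_cyclase_class("GAVLIDDMADGSGS"))
    print(guess_cyclase_class("GSGSDIDDGSGS"))
```

A sequence carrying both motifs is not necessarily bifunctional, since a motif can be present in an inactive domain, and some enzymes carry variant motifs (CotB2, for example, has DDXD). Such a scan is therefore best treated as a triage step.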

ent-Copalyl diphosphate synthase (ent-CPS)
Enantiomeric copalyl diphosphate (ent-CPP, Fig. 18a) is an essential precursor of gibberellin. The ent-copalyl diphosphate synthase from A. thaliana (AtCPS) was the first class II diterpene cyclase reported with a crystal structure. The enzyme was cocrystallized with a substrate analog, (S)-15-aza-14,15-dihydrogeranylgeranyl thiolodiphosphate (PDB ID: 3PYA), or with a product analogue, 13-aza-13,14-dihydrocopalyl diphosphate (PDB ID: 3PYB), at resolutions of 2.25 Å and 2.75 Å, respectively (Koeksal et al. 2011). AtCPS initiates the reaction by protonation of a carbon-carbon double bond, promoted by an aspartic acid residue in its characteristic D377IDD sequence. The enzyme has three domains (α, β, γ), with the active center located at the interface of the β and γ domains. The reaction ends with elimination of the carbocation by deprotonation.
Fig. 16 Products of CotB2 mutants. Products of the corresponding mutants were summarized from refs. (Görner et al. 2013; Janke et al. 2014; Tomita et al. 2017)
D379 in the characteristic D377IDD sequence protonates the carbon-carbon double bond distal to the pyrophosphate group to initiate the cyclization cascade, while N425 mainly plays a stabilizing role. The residue N332 is conserved, and W369 and W505 may stabilize the carbocation during cyclization through cation-π interactions; the water molecules in the active center also play an important role in catalysis. In AtCPS, however, the metal-binding sequence E199DEND is far from the pyrophosphate group of the substrate analogues, which may be because the Mn2+ used in crystallization cannot completely replace Mg2+, or because the low pH of the system weakens the binding of metal ions (Koeksal et al. 2011). The low resolution made it difficult to analyze the structure-function relationship. Therefore, the crystallization conditions were optimized and Mg2+ was introduced, and a crystal structure with higher resolution was obtained (PDB ID: 4LIX, Fig. 18b). In the previous crystal structure the pyrophosphate group had two orientations (Fig. 18c), while in the optimized structure (Fig. 18d) it has a single orientation, which reflects the positioning effect of the Mg2+-pyrophosphate interaction. However, because the low pH of the crystallization conditions affects the binding of metal ions, a crystal structure of the Mg2+-bound enzyme was not obtained (Koksal et al. 2014). Independent of the substrate channel, the enzyme also has a proton channel composed of multiple polar amino acids and water molecules, which facilitates the protonation of the carbon-carbon double bond by D379 (Koksal et al. 2014).
The carbocation in the AtCPS-catalyzed reaction is finally quenched by deprotonation to form a double bond. A water molecule acts as the general base in this step and forms hydrogen bonds with H263 and N322. Site-directed mutagenesis showed that the mutants H263A, N322A and H263A/N322A produce 8-hydroxy-ent-CPP (Fig. 18a), indicating that H263 and N322 position the water molecule. After mutation to Ala, the hydrogen bonding around the water molecule is reduced, so that the water molecule can contact the carbocation directly and give hydroxylated products. When H263 and N322 were substituted with aromatic amino acids, the cyclization cascade of the mutants H263F and H263Y was altered: (−)-kolavenyl diphosphate [(−)-KPP] was formed through hydrogen transfer and methyl transfer (Fig. 18a) (Potter et al. 2016). Szymczyk et al. (2020) investigated the properties of AtCPS mutants by computer-aided methods and identified 6 single-site mutants. Combining the 6 single-site mutations gave mutant enzymes with improved stability and ligand affinity, which enriches the library of AtCPS mutants and reveals key amino acid residues affecting stability and ligand affinity. Rudolf et al. (2016) solved the structure of the ent-CPS from Streptomyces platensis CB00739 (SpCPS, PDB ID: 5BP8, Fig. 18b), the first structure of a bacterial class II diterpene cyclase. SpCPS was discovered in the biosynthetic pathways of platensimycin (PTM) and platencin (PTN); unlike the plant enzyme AtCPS, it contains only two domains, β and γ, and lacks the α domain. This discovery enriches our knowledge of the structure, catalytic mechanism and evolutionary relationships of bacterial diterpene synthases.
Fig. 17 Products of SmTS1
Copalyl diphosphate (CPP) occurs in several configurations, including CPP, ent-CPP, syn-CPP and syn-ent-CPP, which can be paired with different terpene synthases to form diverse labdane-related diterpenoids. Peters and co-workers have summarized the related work (Peters 2010; Zi et al. 2014). These results show that more complex ring structures are produced when class I and class II terpene cyclases act together, which suggests that terpene synthases have a rich product repertoire and the ability to form new molecules.
The crystal structure (Fig. 19b) of SHC from Alicyclobacillus acidocaldarius was first solved in 1997 at 2.9 Å resolution (Wendt et al. 1997), and the resolution was later extended to 2.0 Å with the help of a new crystal form (Wendt et al. 1999). The enzyme contains eight so-called QW-sequence repeats, while oxidosqualene cyclase has five. The QW sequences have been suggested to shield the cyclases against the enthalpy released by the highly exergonic reaction (Wendt et al. 1997, 1999). SHCs can activate functionalities other than the traditional terminal isoprene C=C group and are compatible with a wide range of nucleophiles beyond the 'ene-functionality'. Thus, squalene-hopene cyclases show great potential as a toolbox for general Brønsted acid catalysis (Hammer et al. 2013). For example, Eichhorn et al. (2018) obtained mutants of A. acidocaldarius SHC with improved (E,E)-homofarnesol conversion. After optimization of the reaction conditions, they presented a whole-cell biotransformation process in which E. coli cells producing an improved SHC variant convert 125 g/L (E,E)-homofarnesol to (−)-Ambrox in 72 h. In addition, SHC can produce many novel cyclic molecules from different substrates through protein engineering or other methods, which demonstrates the promiscuity and application prospects of SHC (Bastian et al. 2017; Eichhorn et al. 2018; Fukuda et al. 2018; Hammer et al. 2015; Ideno et al. 2018; Kuhnel et al. 2017; Nakano et al. 2019; Siedenburg et al. 2013).

Bifunctional terpene cyclases
Bifunctional terpene synthases contain two terpene synthase active sites. Examples include geosmin synthase (class I-class I), abietadiene synthase (class I-class II) and fusicoccadiene synthase (class I-class I). Here, we take abietadiene synthase as a typical example to introduce the structure and function of bifunctional terpene synthases.
N451 plays a role in catalyzing the protonation of the carbon-carbon double bond in the class II cyclase domain. The N451A mutation significantly reduced the KM and kcat values of the class II cyclase activity compared with the wild type, while the class I active site was not significantly affected. Molecular dynamics simulations of the class II active center of AgAS show that the structural difference from other class II cyclases lies mainly in loop 482-492 (Fig. 20c), which affects the GGPP conformation (Zhou et al. 2012).
This protein can be engineered to produce other cyclized molecules (Fig. 20a). For example, the A723S mutation converts abietadiene synthase into a pimaradiene cyclase (Wilderman and Peters 2007). H348 acts as a catalytic base that deprotonates the carbocation to form a double bond, producing (+)-CPP, while in the H348D mutant the final carbocation is quenched by a water molecule, producing 8α-hydroxy-CPP (Schalk et al. 2012).
Brodelius et al. (2002) constructed a fusion protein of farnesyl diphosphate synthase and EAS, an artificial bifunctional cyclase that converts FPP into epi-aristolochene more efficiently. In recent years, the crystal structures of fusicoccadiene synthase (PDB ID: 5ERN) from P. amygdali and geosmin synthase (PDB ID: 5DZ2) from S. coelicolor (Harris et al. 2015) have been solved. If in-depth structure-activity relationship analysis can be conducted on the basis of such bifunctional enzyme structures, it may be possible to create more bifunctional cyclases and to further improve the catalytic efficiency of terpene cyclases.

Conclusion
Terpene cyclases cyclize GPP, FPP, GGPP and other linear substrates using three or fewer structural domains (α, β and γ), forming monoterpenes, sesquiterpenes, diterpenes and other products with different structures. This shows the ability of terpene cyclases to control the cyclization reaction precisely. This control acts on the generation, migration and elimination of carbocations, and is mainly determined by the contour of the active pocket and by aromatic or polar residues that can interact with carbocations. Elimination of the carbocation marks the end of the entire cyclization reaction. There are two main elimination routes. In one, a carbon atom adjacent to the cationic center is deprotonated by a polar residue and/or the pyrophosphate group, giving an olefin product (Hyatt et al. 2007; Morehouse et al. 2017); in the other, the carbocation reacts with a water molecule remaining in the active pocket to form a hydroxylated product (Croteau et al. 1994; Li et al. 2013), or reacts with another group (Whittington et al. 2002). These different elimination routes further increase the molecular diversity of the reactions catalyzed by terpene cyclases. By mutating the key residues that stabilize the carbocation, or by replacing hydrophobic amino acid residues in the active pocket, the cyclization reaction can be terminated early or even redirected to obtain more cyclization products, or to enhance the specificity and catalytic efficiency of the enzyme toward the targeted product.
When the enzyme catalyzes the cyclization reaction, its conformation changes from the open to the closed state. This conformational change expels a large number of water molecules from the active pocket so that the cyclization can proceed smoothly, but not all water molecules are excluded. Some are retained; they are essential for maintaining the hydrogen bond network of the active pocket and have an important influence on product formation (Christianson 2017; Kampranis et al. 2007; Koksal et al. 2014; Li et al. 2013; Van Rijn et al. 2019; Whittington et al. 2002). Terpene cyclases have relatively low KM values, showing strong affinity for their substrates, but their extremely low kcat values severely limit the efficient production of terpenes (Table 1), which points to a new direction and a great challenge for protein engineering of these terpene cyclases.
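The trade-off between a low KM and a low kcat can be made concrete with the standard Michaelis-Menten rate law, v = kcat[E][S]/(KM + [S]). The sketch below is a minimal illustration; the function names and all numerical values are hypothetical and are not taken from Table 1.

```python
def michaelis_menten_rate(kcat_per_s: float, km_uM: float,
                          enzyme_uM: float, substrate_uM: float) -> float:
    """Initial rate in uM/s from v = kcat * [E] * [S] / (KM + [S])."""
    return kcat_per_s * enzyme_uM * substrate_uM / (km_uM + substrate_uM)


def catalytic_efficiency(kcat_per_s: float, km_uM: float) -> float:
    """kcat/KM in M^-1 s^-1 (KM is converted from micromolar to molar)."""
    return kcat_per_s / (km_uM * 1e-6)


if __name__ == "__main__":
    # Hypothetical values: a tight-binding but slow enzyme.
    kcat, km, enzyme = 0.05, 5.0, 1.0  # s^-1, uM, uM
    for substrate in (5.0, 50.0, 5000.0):
        rate = michaelis_menten_rate(kcat, km, enzyme, substrate)
        print(f"[S] = {substrate:7.1f} uM -> v = {rate:.4f} uM/s")
    print(f"kcat/KM = {catalytic_efficiency(kcat, km):.3g} M^-1 s^-1")
```

With a low KM the enzyme is already close to saturation at modest substrate concentrations, so the rate approaches the ceiling kcat[E] and adding more substrate barely helps. Under this model, raising kcat, rather than improving affinity further, is what increases output.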
Both the "role of water" and the "catalytic reaction with low kcat" require extensive study of the exact catalytic mechanism at the atomic level, which could be achieved with promising technologies such as isotope labeling experiments and computational modelling, especially QM/MM simulations.
In this review, the methods for genomic mining of terpene cyclases were classified and summarized, and several typical crystal structures of cyclases were introduced in detail. According to current research, there are more studies on the cyclases of monoterpenes and sesquiterpenes and fewer on those of diterpenoids. As the number of isoprene units increases, the cyclase must regulate the substrate more precisely. Strengthening the study of the cyclases of diterpenes and of more complex or bioactive terpenes is expected to further elucidate how enzymes regulate terpene cyclization. In recent years, with the aid of computational biology, the methods for studying the structure and mechanism of cyclases have become more diverse (Freud et al. 2017; Muangphrom et al. 2019; Szymczyk et al. 2020; Van Rijn et al. 2019). It is expected that more enzymes with novel structures and properties will be discovered, and that more active terpenoid molecules will be synthesized through extensive or intensive exploration of these cyclases. Some terpene synthases, such as SHC (see the "Squalene-hopene cyclases" section), can accept non-natural substrate analogues. This remarkable capability attracts chemists to use them as catalysts. Harms et al. (2020) reviewed recent advances on terpene synthases that can convert non-natural substrates. This suggests that the chemical diversity of terpenoids still has great potential for further development. Future research is expected to continue to reveal the diversity of terpene synthase structure and function, which will play an increasingly significant role in the production of terpenoids by synthetic biology methods.