Sequence similarity network analysis
A sequence similarity network is a powerful method for the functional classification of large numbers of protein sequences (Atkinson et al. 2009). Each protein sequence is represented by a node, and an edge is drawn between a pair of nodes only when their BlastP e-value is more stringent than a chosen cut-off. To construct a sequence similarity network of LldR homologues, a BLAST search was carried out with PLldR from the P. aeruginosa XMG strain as the query, and a total of 425 sequences sharing >30 % sequence identity were retrieved. The retrieved sequences include PdhR from E. coli, which senses pyruvate and regulates the expression of the pyruvate dehydrogenase (PDH) multienzyme complex (Ogasawara et al. 2007). The e-value threshold was set to 10⁻⁷⁰, which is just stringent enough to separate the two functionally diverged E. coli proteins, LldR and PdhR, into different clusters (Fig. 1a). Notably, at this threshold PLldR from P. aeruginosa XMG also falls into a separate cluster in which most sequences are from Pseudomonas, implying functional divergence between the Pseudomonas LldRs and ELldR. This is consistent with the fact that LldR from Pseudomonas senses both L-lactate and D-lactate, while ELldR senses only L-lactate. At a more relaxed e-value threshold of 10⁻⁶⁰, the LldRs from E. coli and P. aeruginosa remain in different clusters, while the cluster containing PdhR merges with that containing LldR from P. aeruginosa (Fig. 1b), suggesting that LldR from Pseudomonas is evolutionarily more closely related to PdhR than to LldR from E. coli.
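The thresholding step described above can be illustrated with a minimal sketch. It is not the software used to produce Fig. 1. It only shows how an e-value cut-off turns pairwise BlastP hits into clusters (connected components). The function name and the three pairwise e-values are hypothetical, chosen so that the toy network behaves qualitatively like the one described in the text.

```python
from collections import defaultdict


def cluster_by_evalue(nodes, pair_evalues, threshold):
    """Return clusters (connected components) of a similarity network.

    An edge joins two nodes only when their pairwise e-value is at or
    below the threshold, i.e. at least as stringent as the cut-off.
    """
    parent = {n: n for n in nodes}

    def find(x):
        # Follow parent links to the cluster representative,
        # shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), evalue in pair_evalues.items():
        if evalue <= threshold:
            parent[find(a)] = find(b)  # merge the two clusters

    clusters = defaultdict(set)
    for n in nodes:
        clusters[find(n)].add(n)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical e-values for illustration only (not taken from the study).
nodes = ["ELldR", "PLldR", "PdhR"]
pair_evalues = {
    ("PLldR", "PdhR"): 1e-65,
    ("ELldR", "PLldR"): 1e-55,
    ("ELldR", "PdhR"): 1e-50,
}

strict = cluster_by_evalue(nodes, pair_evalues, 1e-70)
relaxed = cluster_by_evalue(nodes, pair_evalues, 1e-60)
print(strict)   # three separate clusters
print(relaxed)  # PdhR and PLldR merge; ELldR stays apart
```

With these assumed values, relaxing the cut-off from 10⁻⁷⁰ to 10⁻⁶⁰ admits only the PLldR/PdhR edge, which mirrors the merging behaviour reported for Fig. 1b.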
Sequence homology analysis and structure prediction
A multiple sequence alignment was performed using PLldR and its homologues, including ELldR and CLldR. As shown in Fig. 2, PLldR shares 42 % sequence identity with ELldR and 29 % sequence identity with CLldR. According to the crystal structure of CLldR from C. glutamicum in complex with its target operator DNA, four conserved amino acid residues are indispensable for DNA binding, and these residues are also conserved in FadR. The corresponding residues in PLldR from P. aeruginosa were identified as R38, R48, R52, and G69. In addition, the four putative Zn²⁺-binding residues of PLldR (D152, H156, H205, and H227) are indicated; they are completely conserved among PLldR from P. aeruginosa and its homologues. This suggests that Zn²⁺ binding is a common structural feature of the regulatory domain of LldRs (Gao et al. 2008).
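As a minimal sketch of how a pairwise identity figure such as those quoted above can be derived from two rows of an alignment, the helper below counts identical residues over the columns in which both sequences have a residue. The denominator convention is an assumption (alignment tools differ on whether gapped columns are counted), and the two fragments are toy strings, not LldR sequences.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue  # skip gapped columns (assumed convention)
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * matches / compared


# Toy aligned fragments for illustration only (hypothetical, not real data).
toy_a = "MKR-LSDQV"
toy_b = "MKRALTDQ-"
identity = percent_identity(toy_a, toy_b)
print(f"{identity:.1f} % identity")
```

Here seven columns are compared and six match. Restricting the same calculation to a sub-range of columns gives region-specific values, such as an identity for a single motif.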
To further analyze the LldR homologues, the secondary structures of PLldR from P. aeruginosa and ELldR from E. coli were predicted using the secondary structure prediction program PSIPRED (Fig. 3a, b). The amino acid residues of PLldR and ELldR predicted to form α-helices or β-sheets were then marked on the protein sequence alignment in magenta or yellow, respectively. The amino acid residues of CLldR from C. glutamicum, whose crystal structure has been solved, were labeled in the same way according to their secondary structures. The comparison shown in Fig. 3c indicates that the LldRs from P. aeruginosa and E. coli share a similar secondary structure with LldR from C. glutamicum, and that all three consist of ten α-helices and two β-sheets. As in CLldR, the N-terminal domain of PLldR, which comprises α1, α2, α3, β1, and β2, contains a typical prokaryotic helix-turn-helix (HTH) DNA-binding motif. This is consistent with the common feature of the HTH family of transcription factors. As shown in Fig. 2, this HTH motif contains the most conserved amino acid residues. The residues of the HTH motif of PLldR from P. aeruginosa show 50 % sequence identity with those from C. glutamicum and 64 % sequence identity with those from E. coli, indicating that the HTH motif is more conserved than other parts of these three LldR proteins. The predicted secondary structure of the C-terminal region of PLldR consists of seven α-helices (α4–α10), as is also the case for CLldR. This region is presumed to be a regulatory domain that plays an important role in ligand binding and dimerization (Gao et al. 2008).
The crystal structure of CLldR has been reported (Fig. 4a) (Gao et al. 2008), whereas the structures of PLldR and ELldR are still unavailable. The three-dimensional structures of PLldR and ELldR were therefore predicted using the I-TASSER server; the results are shown in Fig. 4b, c. The three-dimensional structures of the three LldR proteins were then superimposed and compared. As shown in Fig. 4d, the overall structures of the three LldR proteins are quite similar. The root-mean-square deviation (RMSD) between CLldR and PLldR was 0.467 Å for 131 aligned Cα atoms, that between ELldR and CLldR was 0.433 Å for 136 aligned Cα atoms, and that between ELldR and PLldR was 0.448 Å for 176 aligned Cα atoms. The N-terminal domains of the three LldRs, which are responsible for DNA binding, superimpose well. This result is consistent with the high sequence conservation of this domain. In contrast, several loops in the C-terminal domains of PLldR, ELldR, and CLldR differ, which might be due to differences in the ligands they recognize. For instance, PLldR can associate with both L-lactate and D-lactate, while ELldR recognizes only L-lactate.
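For readers unfamiliar with the RMSD values quoted above, the following minimal sketch shows the underlying calculation for two sets of paired Cα coordinates that have already been superimposed. The superposition itself (typically a least-squares fit) is not shown, and the text does not state which program the authors used for it. The coordinates are toy values, not taken from any LldR model.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (Å) between paired 3D coordinates.

    Assumes the two structures are already superimposed and that the
    i-th atom in coords_a is aligned with the i-th atom in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy coordinates for illustration only: the second set is the first
# shifted by 0.3 Å along z, so every atom pair is 0.3 Å apart.
toy_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
toy_b = [(0.0, 0.0, 0.3), (1.0, 0.0, 0.3), (0.0, 1.0, 0.3)]
value = rmsd(toy_a, toy_b)
print(f"RMSD = {value:.3f} Å")
```

Because the value depends on which atoms are paired, an RMSD is only meaningful together with the number of aligned Cα atoms, which is why both are reported for each comparison.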
Crystallization and X-ray crystallographic analysis of PLldR
To perform a crystallographic analysis of PLldR, the recombinant plasmid pET-28a-lldR was constructed and transformed into the E. coli strain BL21 (DE3). The full-length PLldR protein from P. aeruginosa was expressed as an N-terminally His-tagged protein (theoretical molecular weight of ~28 kDa) and purified by Ni²⁺-affinity and gel-filtration chromatography. In gel-filtration chromatography, PLldR eluted as a protein of approximately 60 kDa, roughly twice the theoretical monomer mass, indicating that PLldR exists as a dimer in solution.
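The dimer assignment above rests on a simple ratio, sketched below with the two masses given in the text. The function name is hypothetical, and the estimate is only approximate because the apparent mass from gel filtration also depends on molecular shape.

```python
def estimated_subunits(apparent_kda, monomer_kda):
    """Return the mass ratio and the nearest whole number of subunits."""
    ratio = apparent_kda / monomer_kda
    return ratio, round(ratio)


# Values from the text: ~60 kDa apparent mass, ~28 kDa theoretical monomer.
ratio, subunits = estimated_subunits(60.0, 28.0)
print(f"ratio = {ratio:.2f}, nearest oligomeric state = {subunits}")
```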
Crystallization screening and further optimization yielded rod-shaped PLldR crystals. A PLldR crystal obtained under the optimized crystallization condition [2 % (w/v) PEG 8000, 0.1 M Tris–HCl, pH 8.5] is shown in Fig. 5b. The crystal diffracted to 2.55 Å resolution (Fig. 6) and belonged to the trigonal space group P3, with unit-cell parameters a = b = 68.5 Å and c = 237.0 Å. Diffraction data were collected and processed with a final Rmerge value of 9.2 % (87.2 % for the highest resolution shell). The data completeness, data multiplicity, and average I/σ(I) of the collected dataset were 99.7 %, 5.7, and 18.1, respectively (99.6 %, 5.7, and 2.2, respectively, for the highest resolution shell).
Based on the calculation of the Matthews coefficient, it is estimated that there are four molecules of PLldR in each asymmetric unit. In this case, the Matthews coefficient is 2.77 Å³ Da⁻¹, which corresponds to a solvent content of 55.6 % (Matthews 1968). Further work towards structure determination is underway, and selenomethionine-substituted PLldR protein is being prepared. For a better understanding of the lactate-binding modes and the regulatory mechanism, co-crystallization of PLldR with the substrates (L-lactate and D-lactate), or soaking of the PLldR crystals with them, is also in progress. This study should help to reveal the mechanisms of the FadR family of regulators, which control many important microbial metabolic processes.
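The Matthews coefficient estimate above can be retraced with a minimal sketch. It assumes the standard relations V_M = V_cell / (Z × n × M), with Z = 3 symmetry operations for space group P3 and n molecules per asymmetric unit, and solvent fraction = 1 − 1.23 / V_M. The exact molecular weight used by the authors is not stated (the text gives only ~28 kDa), so the 29 kDa below is a hypothetical stand-in for illustration.

```python
import math


def trigonal_cell_volume(a, c):
    """Cell volume (Å^3) for a = b, alpha = beta = 90°, gamma = 120°."""
    return a * a * c * math.sin(math.radians(120.0))


def matthews_coefficient(cell_volume, sym_ops, molecules_per_asu, mw_da):
    """V_M in Å^3 per Da: cell volume divided by total protein mass in the cell."""
    return cell_volume / (sym_ops * molecules_per_asu * mw_da)


def solvent_fraction(vm):
    """Approximate solvent fraction from V_M (Matthews 1968)."""
    return 1.0 - 1.23 / vm


# Unit-cell parameters from the text; P3 has 3 symmetry operations.
volume = trigonal_cell_volume(68.5, 237.0)

# ASSUMPTION: hypothetical molecular weight; the text gives only ~28 kDa.
assumed_mw_da = 29000.0
vm = matthews_coefficient(volume, 3, 4, assumed_mw_da)

# Solvent content implied by the reported V_M of 2.77 Å^3/Da.
reported_solvent = solvent_fraction(2.77)

print(f"cell volume = {volume:.0f} Å^3")
print(f"V_M (assumed MW) = {vm:.2f} Å^3/Da")
print(f"solvent content for V_M = 2.77: {100 * reported_solvent:.1f} %")
```

Repeating the calculation with other values of n shows how the number of molecules per asymmetric unit is chosen: the preferred n is the one that places V_M in the range commonly observed for protein crystals.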