Prediction of 19F NMR Chemical Shifts in Labeled Proteins: Computational Protocol and Case Study
William C. Isley, III,* Andrew K. Urick, William C. K. Pomerantz,* and Christopher J. Cramer
Abstract
The structural analysis of ligand complexation in biomolecular systems is important in the design of new medicinal therapeutic agents; however, monitoring subtle structural changes in a protein’s microenvironment is a challenging and complex problem. In this regard, the use of protein-based (19)F NMR for screening low-molecular-weight molecules (i.e., fragments) can be an especially powerful tool to aid in drug design. Resonance assignment of the protein’s (19)F NMR spectrum is necessary for structural analysis. Here, a quantum chemical method has been developed as an initial approach to facilitate the assignment of a fluorinated protein’s (19)F NMR spectrum. The epigenetic “reader” domain of protein Brd4 was taken as a case study to assess the strengths and limitations of the method. The overall modeling protocol predicts chemical shifts for residues in rigid proteins with good accuracy; proper accounting for explicit solvation of fluorinated residues by water is critical.
Keywords: 19F NMR; DFT; NMR; bromodomain; fluorine; screening.
■ INTRODUCTION
Epigenetic proteins regulate the expression of genetic information through addition, removal, or molecular recognition of post-translational modifications of DNA or DNA-associated proteins. Bromodomains are epigenetic protein modules that bind to N-ε-acetyl groups on lysine side-chains including those of acetylated histone proteins. Small-molecule chemical probes for these proteins are in high demand for their potential therapeutic regulation of disease.1 Since the first reports in 2010 of two nanomolar inhibitors for BET bromodomains Brd2, 3, 4 and T,2,3 18 clinical trials have been initiated to test the efficacy of BET bromodomain inhibition in the areas of cancer and inflammation.1,4 There are many other bromodomains, however, which lack specific chemical probes to evaluate their role in both health and disease.5
We recently reported a protein-based NMR method for bromodomain ligand discovery, using fluorine-labeled aromatic amino acids.6,7 Since inception, this method has been used for screening libraries of low-complexity, small molecules termed fragments,7−10 as well as higher complexity molecules based on kinase inhibitor scaffolds.11 In these experiments, proteins are expressed in the presence of fluorine-labeled amino acids (e.g., 3-fluorotyrosine, or 3FY), resulting in global replacement of the nonfluorinated aromatic amino acid. A feature of this method is the sensitivity of the 19F nucleus to different chemical environments, typically leading to rapidly obtained and well-resolved 1D 19F NMR spectra. In the case of the first bromodomain of Brd4, we replaced all seven tyrosine residues with 3FY, resulting in dispersed resonances spanning over 12 ppm (see Figure 1).6
A notable challenge when conducting protein-observed 19F NMR experiments (or PrOF NMR) is the initial assignment of the NMR resonances. This is most often facilitated by site-directed mutagenesis experiments.12 In these experiments, a particular amino acid that is labeled with fluorine is mutated to a different amino acid, and the 19F NMR spectrum of the mutant protein is compared to that of the wild-type protein. The absence of a single resonance is then used to assign the NMR spectrum. However, this method is not generally applicable, as not all mutant proteins may express well, or multiple resonances may be perturbed in the NMR spectrum. In the case of 3FY-labeled Brd4, several of the mutant proteins were expressed at low levels, and some were more susceptible to degradation, thus complicating the NMR analysis, particularly for the 3FY resonances for Y118 and Y119.6 To help overcome challenges of mutagenic protein expression, we reasoned that a computational approach to NMR chemical shift prediction would function as a useful tool in facilitating the 19F NMR assignment, motivating this study. Protein-based NMR Ensemble (NMRE) simulations for 1H, 13C, and 14N have been able to utilize the wealth of available NMR data in the literature to create a predictive protocol from machine learning algorithms; however, the database on 19F NMR in proteins is significantly smaller.13,14 In general, for modeling of NMR chemical shifts, we note that sampling of the protein’s conformational ensemble has also been shown to significantly improve prediction accuracy compared to simply using the X-ray crystal structure.15,16
Chemical shift predictions for fluorinated proteins have highlighted several challenges for theory. Using the 5-fluorotryptophan-labeled galactose-binding protein, Pearson et al. predicted the 19F NMR spectrum with reasonable agreement with experimental data.17 Additionally, Sternberg et al.18 developed a semiempirical protocol for prediction of fluorotryptophan 19F NMR for a solid-state membrane-bound protein, gramicidin A, which gave reasonable agreement with more rigorous levels of theory. Conversely, despite systematic analysis of local electrostatic effects and short-range contacts, Lau and Gerig were unable to reliably predict the 19F NMR spectrum of the more dynamic 6-fluorotryptophan-labeled dihydrofolate reductase.19 These mixed results for fluorotryptophan illustrate the challenges associated with making accurate predictions. The challenge is enhanced by the even narrower spectral range of ≈2 ppm for 5-fluorotryptophan-labeled Brd4. Additionally, recent work by Kasireddy et al.20 demonstrates the challenges and potential utility of 19F NMR predictions on fluorohistidine molecules. By contrast, the prediction of the chemical shifts within 3FY-substituted proteins has not been previously attempted.
In this report, we propose that the wider spectral range of 3FY provides a novel and significantly more accessible platform to predictively assign the 19F NMR spectrum of a 3FY-labeled bromodomain, Brd4. Despite this enhanced utility, 3FY is more challenging to model than other fluorinated aromatics (4-fluorophenylalanine or 5-fluorotryptophan) given the increased conformational flexibility and additional hydrogen bonding interactions provided by the phenol and asymmetric substitution. We also demonstrate that automation of spectral predictions for 19F NMR in bromodomain-containing proteins is feasible, and that a cluster-based method for prediction of 19F NMR chemical shifts shows great promise for further development. While traditional predictions of protein-based 1H NMR involve dynamic simulations to sample the large ensemble of configurations, the relative rigidity and availability of X-ray crystal structures for bromodomains, and the limited number of side-chain resonances requiring assignment, render an exploratory cluster-based model feasible. Due to the high conservation of aromatic amino acids in the majority of the 61 bromodomains,6 this method may find general utility for these proteins.
■ THEORETICAL METHODS
In this section, we summarize our overall protocol to arrive at 19F chemical shift predictions starting from a Protein Data Bank (PDB) file for an unlabeled protein. The solution of the crystal structure of 3FY-labeled Brd4 (PDB 4QZS) was shown to have minimal structural perturbation (RMSD 0.089 Å) vs the unlabeled protein (PDB 3MXF).3,6 We include full details of file manipulation and software employed; scripts developed for process automation are also publicly available.21 This work utilizes the overall scheme below, with details described in the following sections.
1. The protein structure is taken from the PDB file.
2. To obtain an initial solvation environment around the target residues, a molecular dynamic optimization of water molecules on a frozen protein is performed.
3. From the solvated protein, clusters are excised around each 3FY. The NMR ab initio calculations of the full protein are prohibitively expensive, so protein fragments are used.
4. Geometries sampling the fluorine and phenol conformers are generated, and the target 3FY is optimized within a frozen protein fragment using density functional theory.
5. NMR chemical shifts are predicted using a Boltzmann average over the cluster conformers.
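Steps 4 and 5 correspond to two quantum-chemical jobs per conformer: a partial geometry optimization and an NMR shielding calculation. The sketch below shows one way such inputs might be assembled for Gaussian. The helper names, the default charge and multiplicity, and the exact route strings are illustrative assumptions rather than the inputs used in this work; the route lines simply spell the levels of theory named in the following sections with standard Gaussian keywords (PBE0 is requested as PBE1PBE). Frozen atoms carry a −1 flag after the element symbol, which excludes them from optimization.

```python
# Illustrative sketch only: route strings and helper names are assumptions,
# not the authors' published scripts.

# Partial optimization: M06-L / def2-SVP with aqueous SMD.
OPT_ROUTE = "# M06L/Def2SVP Opt SCRF=(SMD,Solvent=Water)"
# Shielding calculation: PBE0 / EPR-II with aqueous SMD.
NMR_ROUTE = "# PBE1PBE/EPR-II NMR SCRF=(SMD,Solvent=Water)"


def atom_line(symbol, xyz, frozen):
    """Format one Cartesian atom record; flag -1 freezes the atom, 0 relaxes it."""
    flag = -1 if frozen else 0
    x, y, z = xyz
    return f"{symbol:<2s} {flag:2d} {x:12.6f} {y:12.6f} {z:12.6f}"


def build_input(route, title, atoms, charge=0, multiplicity=1):
    """Assemble a minimal input: route, title, charge/multiplicity, atoms.

    `atoms` is a list of (symbol, (x, y, z), frozen) tuples.
    The charge and multiplicity defaults are placeholders.
    """
    lines = [route, "", title, "", f"{charge} {multiplicity}"]
    lines += [atom_line(sym, xyz, frozen) for sym, xyz, frozen in atoms]
    lines.append("")  # the molecule block must end with a blank line
    return "\n".join(lines) + "\n"
```

In a cluster job, only the atoms of the target 3FY (and, where relevant, nearby waters) would be passed with `frozen=False`; every other atom of the fragment would be frozen.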
Positions. We used the online Molecular Dynamics on Web (MDWeb) toolset,22 initially going through the following steps for a given protein structure:
1. Select action “Prepare Structure Topology for AMBER ParmFF99SB* (Hornak & Simmerling, including Best & Hummer psi modification)”23,24
2. Select action for structural optimization of 50 or more exterior water molecules using the Classical Molecular Interaction Potentials (CMIP)25
3. Select action to energetically minimize hydrogen atom positions using NAMD26
4. Select action to export PDB structure
5. File I/O: The resulting PDB exported from MDWeb lacks the column at the end of a regularly formatted PDB file that specifies the actual atom designation for each ATOM type. Opening the PDB file in OpenBabel,27 and choosing to convert PDB → PDB fixes this issue.
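The PDB → PDB round trip in step 5 can also be run from the command line rather than through the graphical interface. The following is a minimal sketch that assumes the `obabel` executable from Open Babel is installed and on the PATH; the function names and file names are hypothetical.

```python
import shutil
import subprocess


def obabel_roundtrip_command(src, dst):
    """Build the obabel command that reads a PDB file and rewrites it as PDB."""
    return ["obabel", "-ipdb", src, "-opdb", "-O", dst]


def rewrite_pdb(src, dst):
    """Run the round trip; raises if obabel is missing or exits with an error."""
    if shutil.which("obabel") is None:
        raise RuntimeError("obabel executable not found on PATH")
    subprocess.run(obabel_roundtrip_command(src, dst), check=True)
```

For example, `rewrite_pdb("mdweb_export.pdb", "fixed.pdb")` would write a copy in which Open Babel has regenerated the element column.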
Cluster Generation. The final, properly formatted PDB file serves as input to the cluster generation script. Additional input includes specification of a target residue number and a cutoff radius to be used for cluster generation. Waters solvating the exterior of the protein are optimized with the AMBER force field during step 3, which samples the solvation configurational space rapidly and serves as a platform for cluster exploration.
The script generates a local cluster by identifying other protein residues within the cutoff radius of the target residue’s non-hydrogen atoms. Once all such surrounding residues have been found, the script keeps them in position, caps all open backbone termini with acetyl (N terminus) or N-methylamino (C terminus) groups, and eliminates all remaining atoms. Note that any crystallographically conserved water molecules, counterions, or small molecules in the PDB file are removed during the prepare topology step; water molecules are added back to the structure, however, in the next step. The interior water molecules are added during cluster generation, and they are manually inserted so as to match the phenol−water distance determined from X-ray diffraction (XRD), with the oxygen atom oriented at the same angle from the aromatic plane as in the XRD structure. The orientations of the hydrogen atoms on the water relative to the phenol (not available from XRD) are chosen so that the water can participate in hydrogen bonding (either as a donor or acceptor depending on orientation and nearby side chain functionality). 3FY residues exposed to the protein surface are solvated with one to three water molecules. The coordinates of these waters are allowed to relax, as discussed further below. Once a cluster (with no fluorine atoms) has been generated, four conformers are created manually, corresponding to the four relative orientations of fluorine and the phenolic proton (see Figure 3).
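The residue-selection rule described above (any residue with a non-hydrogen atom within the cutoff of a non-hydrogen atom of the target) can be illustrated with a short sketch. This is not the published cluster script; the function name and the simple tuple-based atom representation are assumptions made for illustration, and capping of the backbone termini is not shown.

```python
import math


def residues_within_cutoff(atoms, target_resid, cutoff):
    """Return sorted residue ids with a heavy atom within `cutoff` (in Å)
    of any heavy atom of the target residue.

    `atoms` is a list of (resid, element, (x, y, z)) tuples.
    Hydrogen atoms are ignored on both sides of the distance test.
    """
    heavy = [a for a in atoms if a[1] != "H"]
    target_xyz = [xyz for resid, _, xyz in heavy if resid == target_resid]
    if not target_xyz:
        raise ValueError(f"no heavy atoms found for residue {target_resid}")

    selected = {target_resid}
    for resid, _, xyz in heavy:
        if resid in selected:
            continue
        if any(math.dist(xyz, t) <= cutoff for t in target_xyz):
            selected.add(resid)
    return sorted(selected)
```

Because hydrogens are excluded from the distance test, a hydrogen-bonded partner is only captured once the cutoff reaches the heavy-atom separation, which is why somewhat larger cutoffs are needed to pick up water molecules.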
Optimization. The geometry of the target residue is then optimized with Density Functional Theory, within a frozen cluster (i.e., only the fluorinated residue is optimized within its otherwise fixed cluster framework). This step employs the M06-L28 density functional and the def2-SVP basis set29 on all atoms. The aqueous SMD30 continuum solvation model is also employed, as early surveys of gas-phase results were found to give poor structures and also to suffer from convergence difficulties in clusters having local charge separations.
19F NMR Chemical Shift Prediction. Following restrained optimization of the cluster, the 19F NMR chemical shift is predicted. Trifluoroacetate (δexp = −76.55 ppm) is employed to compute a reference chemical shielding (σref), with computations of 3FY chemical shifts δpred then being determined as
$$\delta_{\mathrm{pred}} = \sigma_{\mathrm{ref}} - \sigma_{\mathrm{pred}} + \delta_{\mathrm{exp}}$$
where σpred is the shielding predicted for 3FY in the cluster and δexp is the experimental shift of trifluoroacetate. All chemical shifts are reported relative to CFCl3 (set to 0.0 ppm). To predict chemical shifts most accurately, a linear regression of predicted values on experimental measurements is a common practice. We benchmarked several computational protocols on a training set of 14 molecules containing aryl−fluorine bonds (Table S1 of Supporting Information), evaluating such model parameters as (1) gas phase vs SMD implicit solvation, (2) density functional choice (B3LYP vs PBE0), and (3) basis set size (double-ζ vs triple-ζ quality). Most protocols performed well over the benchmark set, and PBE0 with implicit solvation provided the highest accuracy for prediction of the chemical shift of 3-fluorotyrosine. We adopted PBE0/SMD with the EPR-II basis set as our recommended protocol, as it offers an optimal combination of accuracy and computational efficiency. A corresponding linear regression is applied to chemical shifts predicted at this level of theory. Additional analysis is provided below in Results and Discussion.
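The referencing and regression steps can be sketched in a few lines. The sketch assumes the usual convention in which a shift is the reference shielding minus the computed shielding, offset by the experimental shift of the reference compound (−76.55 ppm for trifluoroacetate, as stated above). The regression coefficients for the recommended protocol are not reproduced here; `fit_line` is a generic least-squares helper, and `correct_shift` reflects one reading of "regression of predicted values on experimental measurements" (predicted = slope × experimental + intercept, inverted to correct a new prediction). All function names are hypothetical.

```python
TFA_SHIFT_PPM = -76.55  # experimental trifluoroacetate shift quoted in the text


def shielding_to_shift(sigma_pred, sigma_ref, delta_ref=TFA_SHIFT_PPM):
    """Convert a computed shielding (ppm) to a shift relative to CFCl3."""
    return sigma_ref - sigma_pred + delta_ref


def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally sized samples with at least 2 points")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def correct_shift(delta_pred, slope, intercept):
    """Invert predicted = slope * experimental + intercept for a new prediction."""
    return (delta_pred - intercept) / slope
```

In use, `fit_line` would be called with the experimental and predicted shifts of the training set, and the resulting coefficients passed to `correct_shift` for each cluster prediction.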
In many instances, a 3FY residue can adopt multiple poses within its associated cluster, and the phenol group leads further to multiple possible rotamers. To account for an equilibrium distribution of structures, Boltzmann weighted chemical shifts were computed as
$$\delta_{\mathrm{tot}} = \frac{\sum_i \delta_i \, e^{-E_i / k_{\mathrm{B}} T}}{\sum_i e^{-E_i / k_{\mathrm{B}} T}}$$
where δi and Ei are the chemical shift and relative energy of conformer i.
Software. All optimization and chemical shift computations were accomplished using the Gaussian09 Rev D.01 suite of electronic structure programs.31
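The Boltzmann weighting over conformers can be written as a small helper. This is a sketch under stated assumptions: relative energies are taken to be in kcal/mol and the temperature defaults to 298.15 K, neither of which is specified in the text, and the function name is hypothetical.

```python
import math

R_KCAL_PER_MOL_K = 0.0019872041  # gas constant in kcal mol^-1 K^-1


def boltzmann_average(shifts, rel_energies, temperature=298.15):
    """Boltzmann-weighted mean of conformer shifts (ppm).

    `rel_energies` are conformer energies in kcal/mol (assumed unit);
    only differences matter, so any common offset cancels.
    """
    if len(shifts) != len(rel_energies) or not shifts:
        raise ValueError("shifts and energies must be non-empty and equal length")
    e_min = min(rel_energies)  # shift energies to avoid overflow in exp()
    rt = R_KCAL_PER_MOL_K * temperature
    weights = [math.exp(-(e - e_min) / rt) for e in rel_energies]
    total = sum(weights)
    return sum(w * d for w, d in zip(weights, shifts)) / total
```

Two isoenergetic conformers contribute equally, whereas a conformer lying several kcal/mol above the minimum contributes almost nothing, so errors in relative conformer energies translate directly into errors in the averaged shift.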
■ RESULTS AND DISCUSSION
No benchmarking study for the accuracy of 19F NMR chemical shift predictions for biomolecular moieties at the density functional (DFT) or Hartree−Fock (HF) level of theory was available. To validate our own protocol, we explored various options over a 14-molecule training set as outlined in the theoretical methods. This protocol was employed to predict composite 19F chemical shifts using local clusters and accounting for conformational flexibility. We next address the modeling challenges associated with our protocol and the physical insights into the effects of chemical environment on 19F NMR shifts in proteins that it provides.
Optimizing 19F NMR Prediction Protocol. Prior to cluster generation, a method to predict the 19F NMR of aryl fluorine resonances in a training set of different fluorinated aromatic rings spanning about the same spectral frequency range as 3FY (−125 to −145 ppm) was optimized.6 The training set and experimental NMR data are reported in Table S1; data for a subset are shown in Figure 2. The phenols in the training set all include one explicit water molecule acting as a hydrogen bond acceptor to represent the solvation shell. Addition of water to 3FY shifts δpred downfield by 8 ppm, from −150 ppm to −142 ppm. The protocols tested include molecular optimization with M06-L/def2-SVP in either the gas phase or with an aqueous SMD solvation model. After these structures were obtained, NMR predictions were performed using either the PBE032 or B3LYP33 density functionals in combination with either the EPR-II or EPR-III basis sets.34 Additionally, NMR predictions were made at the Hartree−Fock level of theory using the 6-311++G(2d,2p)35,36 basis set and the aqueous SMD solvation model.
Results for selected regressions of computed training set 19F NMR data on experimental measurements are shown in Table 1. Plotted in Figure 2, the correlation between experiment and theory selected for further use was found for the protocol combining the aqueous SMD solvation model,30 the PBE0 density functional,32 and the EPR-II basis set.34 A similar statistical performance was obtained with the larger basis PBE0/EPR-III/SMD model, but we chose to continue with PBE0/EPR-II/SMD given its better performance for 3FY (see Table S1). Although some prior research suggested that 19F chemical shifts for fluorobenzenes were accurately predicted at the Hartree−Fock level of theory,37 we found density functional methods to be much more accurate over our biologically motivated training set. 19F NMR has been found to be more sensitive to changes in the electrostatic potential,38 and it has been shown that including exact Hartree−Fock exchange in hybrid density functionals increasingly degrades the prediction of nuclear shieldings as nuclei become heavier.39 Compared to 1H and 13C nuclei, modeling of 19F NMR chemical shifts in other systems has found greater errors for HF predictions than for those at the DFT or MP2 levels.38−40
Cluster Radial Convergence. Accurate modeling of the fluorine-19 isotope’s magnetic behavior requires that cluster models reproduce the local environment derived from the full protein. The fluorinated resonance of Y65 (3FY65) of the apo form of the Brd4 bromodomain (PDB ID: 4IOR) was selected to evaluate the convergence of predicted chemical shift with respect to cluster size. Residue 3FY65 serves as a sensitive test case because it is a solvent-exposed amino acid which might be expected to sample quite different environments in different rotamer states (e.g., protein interior vs exterior directed fluorine). The relative energies and chemical shifts for each conformer are included in Table 2. During cluster generation, a radial cutoff of at least 2.75 Å is required in order to include the residue having a carbonyl group that can serve as a hydrogen bond acceptor for the phenol. A cutoff of 3.25 Å is required to encompass surrounding water molecules because hydrogen atoms are not used in cluster generation. After these previously absent possible phenolic hydrogen bond acceptors (and/or donors, although 3FY is generally a better acid than a base in this regard) have been included, the convergence of the predicted fluorine NMR chemical shift improves. However, the relative stability of each conformer as a function of cluster size is not as well converged. Taking into consideration the computational expense of the chemical shift prediction step, clusters generated with a radial cutoff of greater than 4.00 Å were found to be too large to be conveniently employed in the chemical shift calculation step. In general, to balance computational efficiency and accuracy in modeling the local environment, we recommend employing a cutoff distance between 3.25 and 4.00 Å. Unless otherwise noted, results reported below are for a cutoff distance of 4.00 Å.
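A convergence check of this kind can be automated once shifts have been predicted at several cutoffs. The sketch below returns the smallest cutoff whose predicted shift agrees, within a tolerance, with every prediction at larger cutoffs. The function name, the 1 ppm default tolerance, and the numbers in the usage example are all assumptions; the example values are placeholders, not results from this work.

```python
def first_converged_cutoff(shift_by_cutoff, tol_ppm=1.0):
    """Return the smallest cutoff (Å) whose shift stays within `tol_ppm`
    of the shifts at all larger cutoffs, or None if no cutoff qualifies.

    `shift_by_cutoff` maps cutoff radius (Å) to predicted shift (ppm).
    The largest cutoff is never returned because nothing lies beyond it.
    """
    cutoffs = sorted(shift_by_cutoff)
    for i, cutoff in enumerate(cutoffs[:-1]):
        reference = shift_by_cutoff[cutoff]
        if all(abs(shift_by_cutoff[c] - reference) <= tol_ppm
               for c in cutoffs[i + 1:]):
            return cutoff
    return None


# Placeholder numbers for illustration only (not data from this study).
example_scan = {2.50: -148.0, 2.75: -141.0, 3.25: -137.0, 3.50: -136.6, 4.00: -136.4}
converged_at = first_converged_cutoff(example_scan)
```

The same check could be applied to relative conformer energies, which the discussion above notes converge more slowly than the shifts themselves.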
3-Fluorotyrosine Conformer Weights. One challenge from a chemical modeling standpoint is the accurate sampling of all thermodynamically relevant conformers. For the 3FY systems considered here, there are nearly always four relevant conformers (Figure 3). These conformers account for the internal orientation of the phenol hydrogen relative to the fluorine and the relative orientation of the fluorine to the protein tertiary structure. The number of accessible conformers can increase if additional phenol hydrogen-bond acceptors are present (non-hydrogen bonded conformers are generally much higher in energy). These different conformers expose the sensitive fluorine probes to different magnetic environments.
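The four conformers follow from two binary choices: the phenol proton orientation relative to fluorine (s-cis or s-trans) and the fluorine orientation relative to the protein (interior or exterior). A minimal enumeration, using the conformer labels that appear in this discussion (the constant and function names are hypothetical), might look as follows.

```python
from itertools import product

PHENOL_ORIENTATIONS = ("s-cis", "s-trans")   # OH proton relative to fluorine
FLUORINE_ORIENTATIONS = ("in", "ex")         # fluorine toward interior or exterior


def conformer_labels():
    """Enumerate the four 3FY conformer labels, e.g. 's-trans-ex'."""
    return [f"{phenol}-{fluorine}"
            for phenol, fluorine in product(PHENOL_ORIENTATIONS,
                                            FLUORINE_ORIENTATIONS)]
```

Additional phenol hydrogen-bond acceptors would extend the first tuple with further proton orientations, which is how the number of accessible conformers can grow beyond four.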
Considering the phenol’s conformational effects on the 19F chemical shift, there are two key observations. In the absence of other external groups, s-cis (H,F) conformers are slightly lower in energy than s-trans conformers, which reflects the favorable electrostatic interaction expected in the absence of alternative hydrogen-bonding opportunities (e.g., with coordinating water molecules). There is a large difference in the chemical shifts for the two fluorine−hydrogen orientations: s-trans conformers have shifts near −130 ppm, whereas the lower-energy s-cis conformers have shifts near −150 ppm. A strong upfield shift is consistent with significantly higher nuclear shielding from a dipolar interaction with the hydroxyl group. This result is also consistent with computations and experimental results from Dalvit et al., who identified highly shielded fluorine nuclei in close proximity to hydrogen-bond donors.41 If external hydrogen-bond acceptors for the hydroxyl group are present, however, this disparity in chemical shifts is substantially reduced. These various effects are shown in Figure 4 for residue 3FY65, for which there are indeed four accessible conformers and for which two phenolic hydrogen-bond acceptors are observed: (1) external water and (2) an interior peptide backbone carbonyl. In the case of 3FY65, as predictions converge with increasing cluster size, it is apparent that the most favorable conformer involves externally oriented fluorine, with the s-trans phenolic proton hydrogen-bonded to the interior peptide carbonyl. In an effort to experimentally assess predicted conformer weighting, PrOF NMR spectra of 3FY-labeled Brd4 were acquired at 15, 25, and 35 °C to see if our model could accurately replicate the spectra at different temperatures. However, because the small changes in chemical shift with temperature (avg. change = 0.13 ppm) are much smaller than the current error in our method, we were unable to draw conclusions from these experiments (Table S3).
Accurately Modeling the Phenolic Environment. Accurate modeling of the phenolic proton environment is a critical challenge for the accurate prediction of the 3FY 19F NMR, as a phenolic hydrogen-bond acceptor has a significant impact on the 19F chemical shift. Furthermore, inclusion of only the hydrogen-bond acceptor can be problematic if the acceptor itself is involved in additional strong interactions (e.g., a charged residue interacting with adjacent ionic residues). This happens in clusters where the target 3FY has a phenolic hydrogen bond to a negatively charged aspartate or glutamate.
The anionic hydrogen-bond acceptor, in the absence of a positive counterion, overdelocalizes electronic density onto fluorine, leading to an erroneous upfield shift. For example, a cluster of 3FY118 with a cutoff radius of 3.00 Å involves a hydrogen bond to a glutamate carboxylate and leads to a chemical shift prediction of −148.5 ppm. An increased cluster size, however, ultimately includes ion-pairing of the glutamate’s carboxylate with the guanidinium group of arginine 113 (see Figure 4), and the predicted chemical shift becomes −140.8 ppm, which is substantially reduced in magnitude (see Table 4).
Assessing Physical Contributions to Chemical Shifts. Although it would be difficult to partition contributions from chemically intuitive sources such as van der Waals forces, electrostatic charges, and hydrogen bonds to the 19F chemical shift in the full protein environment, we have performed a series of calculations designed to better assess them in appropriate model systems. In particular, we have predicted the 19F chemical shift and fluorine Mulliken population in fluorobenzene as an argon atom, a sodium cation, a fluoride anion, and a water molecule are adjusted along the C−F axis over a range of lengths (with continuum aqueous solvation; Figure 5). These four probes interact predominantly through dispersion, positive charge, negative charge, and hydrogen bonding, respectively. Ar and F− have a very similar effect on the chemical shift; a significant deshielding effect is predicted at smaller distances. The effect is as large as 21 ppm at a distance of 2.5 Å, but it reduces to 2 ppm by 3.5 Å. The behavior with respect to Ar is consistent with 1H deshielding observed in sterically compressed organic complexes.42 We note that Ar does not affect the population density on fluorine. The effect of fluoride is further discussed below.
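Geometries for a probe scan of this kind are simple to generate: the probe is placed on the line through the carbon and fluorine atoms, beyond fluorine, at the desired separation. The sketch below assumes the quoted distances are measured from the fluorine atom to the probe (for water, to its oxygen) and uses a 0.25 Å step; both the step size and the function names are illustrative assumptions.

```python
import math


def place_probe(carbon_xyz, fluorine_xyz, distance):
    """Return the point `distance` Å beyond F along the C->F direction."""
    axis = [f - c for c, f in zip(carbon_xyz, fluorine_xyz)]
    norm = math.sqrt(sum(a * a for a in axis))
    if norm == 0.0:
        raise ValueError("carbon and fluorine positions coincide")
    return tuple(f + distance * a / norm for f, a in zip(fluorine_xyz, axis))


def scan_positions(carbon_xyz, fluorine_xyz, start=2.5, stop=3.5, step=0.25):
    """List (distance, probe_xyz) pairs from `start` to `stop` inclusive."""
    n_steps = int(round((stop - start) / step))
    return [(start + i * step,
             place_probe(carbon_xyz, fluorine_xyz, start + i * step))
            for i in range(n_steps + 1)]
```

Each generated position would then be combined with the fluorobenzene coordinates and submitted as a single-point shielding calculation with continuum solvation.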
Explicit hydrogen bonding from a water molecule to the fluorine atom results in a smaller increase in deshielding, ranging from 11 ppm at 2.5 Å (O−F distance, somewhat shorter than expected for a typical hydrogen bond) to 0 ppm at 3.5 Å. In evaluating hydrogen bonding effects on the fluorine chemical shift, Dalvit and Vulpetti found that fluorines participating in hydrogen bonds exhibit a range of shieldings but are typically more shielded than those in hydrophobic environments.41 One difference in our model is that we have found that our models require explicit water to obtain the best match with experimental measurements. Although we do not observe a strong shielding effect nor a large accumulation of charge on fluorine via water interactions, the net shielding and increased charge accumulation relative to fluoride or argon are consistent with the findings of Dalvit and Vulpetti. The sodium cation has the opposite effect on the 19F chemical shift, significantly shielding the fluorine atom, with the effect ranging from 18 ppm at 2.5 Å to 2.5 ppm at 3.5 Å. These results do not show the same behavior exhibited by alkali metal fluoride materials.43 Population analysis of the fluorine atom’s effective charge shows that the amount of electronic density localized onto the fluorine tracks with the trends in both fluoride and sodium. Table S4 shows that as the sodium cation gets closer, the amount of electrons on fluorine increases; the opposite trend is seen for fluoride. The increased (or decreased) shielding can be explained by an induced dipole moment on the fluorobenzene ring in response to the electrostatic charge getting closer.
19F NMR Predictions for 3-Fluorotyrosine Mutant Brd4. Using a cutoff distance of 4.0 Å, we predicted the 19F NMR spectrum for the entire fluorinated mutant protein (Table 3). First we considered optimizing only each target residue within a rigid surrounding framework. We found this protocol to be sufficient for a subset of residues, namely, 3FY98 and 3FY118, but not to be sufficient for the entire protein. Closer examination of clusters with large discrepancies revealed that explicit water molecules directly interact with the 3FY phenol group in other instances. Even though the water positions are optimized during the classical molecular dynamics portion of the cluster generation protocol, the sensitivity of the 19F chemical shifts in the training set to local solvation led us to hypothesize that further optimization of adjacent water molecule positions might improve our chemical shift predictions (Figure 6). Data when nearby water molecules are included in the partial geometry optimization step are shown in Table 4. 3FY65 has two water molecules directly participating in hydrogen bonding interactions. We see that optimization of water shifts the low-energy s-trans-ex conformation by 3 ppm (−132.5 vs −135.5 ppm), and it dramatically improves the agreement of the Boltzmann weighted δtot with experiment. We also note that for the solvent-optimized protocol, both exterior fluorine orientations predict chemical shifts quite similar to one another (δfit(s-trans-ex) = −135.5 and δfit(s-cis-ex) = −136.6 ppm) and to experiment (δexp = −137.4 ppm), while interior fluorine conformers are likely not to contribute (δfit(s-trans-in) = −121.5 and δfit(s-cis-in) = −129.7 ppm).
3FY97 has the strongest upfield shift measured at δexp = −140.1 ppm. We note that this moiety is on the exterior of the protein and has three water molecules that directly participate in hydrogen bonding with the phenol and fluorine. The predicted chemical shift for the s-cis-in conformer (Figure 7) is the most upfield shift we predict, at δfit(s-cis-in) = −138.6 ppm, and matches quite closely with experimental measurements.
3FY98 has one water molecule participating in a hydrogen bond with the phenol. Models for 3FY98 exhibit different behavior than experimental measurements. For a cutoff radius of 4.0 Å, only one water molecule is found near the residue, compared to the protein X-ray crystal structure where a cluster of six waters is adjacent (only one being <3.5 Å away, see Figure 8). Given the possibility of an extended hydrogen bonding network, further exploration was performed on 3FY98 with additional water moieties. Results showed negligible change from H2O·3FY98. The phenoxide form of 3FY, with a pKa over 8.4, would provide an insufficient shift under neutral conditions to account for the discrepancy in magnitude or population.44 Given that an extended solvation sphere does not account for the difference, we postulate that the discrepancy could be due to a dynamic change in configuration that is not taken into account by our modeling protocol. Consistent with this hypothesis, we note that the backbone carbonyl hydrogen-bond acceptor of K102 is very flexible, as measured by its B-factor. Any structural perturbations of the backbone carbonyl would not be taken into account by this protocol.
3FY118 has one water molecule directly participating in a hydrogen bond with the phenol. This hydrogen bond slightly reduces the upfield shift by accepting additional negative charge density. For this residue, which has the second furthest upfield shift measured at δexp = −138.0 ppm, we see qualitative and quantitative agreement with the chemical shift predicted at δfit(s-trans-ex) = −137.4 ppm. We note that 3FY118 does not have an “interior” fluorine configuration, as the interior fluorine clashes directly with the carboxylate side chain, resulting in an extremely high energy configuration.
3FY119 is measured to have the furthest downfield chemical shift and is largely shielded from the protein surface by two adjacent α-helices. We find 3FY119 participates in a hydrogen bond with an adjacent carboxylate of aspartate 127. This carboxylate has an ion pair with arginine 122 and has one hydrogen-bound water molecule (which does not participate in hydrogen bonding with 3FY119). The s-cis-ex configuration is predicted to have a chemical shift of δfit(s-cis-ex) = −124.4 ppm, and with the exception of previously mentioned 3FY98, it is the most downfield residue. We note that the s-trans-ex configuration has no hydrogen acceptor for the phenolic proton, and with a low barrier for rotation of the phenol, all initial calculations converge to s-cis-ex (Figure 7).
3FY137 and 3FY139 are both solvent-accessible residues on the exterior of the protein. We find that during molecular dynamic optimization, 3FY137 has one sodium ion and two waters directly interacting with the phenol. Given the close proximity of sodium to the residue and the high mobility of sodium cations, sodium was also relaxed during solvent optimization.
The s-cis-ex configuration is the only configuration that results in a chemical shift in the typical range for an external residue, at −134.1 ppm as compared to a measured −134.0 ppm. 3FY139 has two waters and a sodium ion near the residue, with water participating in hydrogen bonding to the phenol. Similar behavior is noted here, where the s-cis-ex configuration predicts a chemical shift at −137.9 ppm as compared to a measured chemical shift at −136.6 ppm. We note that for both 3FY137 and 3FY139, DFT optimization of solvent waters was critical to improve the accuracy of the NMR predictions, with changes of 6 to 10 ppm.
The predicted most stable configurations for each residue are shown in Figure 7. If we examine this set of residues, we see that the s-trans-ex conformer is predicted to be the most stable conformer for all residues except 3FY97 and 3FY98, where the s-cis-in conformer is more stable. However, if we compare the predicted chemical shifts to their respective measured values, we notice that the s-cis isomers show a much smaller deviation than s-trans isomers for five out of seven residues. The residues where water does not act as the phenolic hydrogen bond acceptor (3FY97, 3FY118, 3FY119) have the most reliable predictions compared to measured values (MSE 2 ppm). In applying this protocol to future systems, we note two points. First, the assignment of new resonances can be eased by eliminating conformers that have predicted chemical shifts well downfield of observed 3FY chemical shifts. Second, the resonances where the phenol directly interacts with a well conserved carboxylate residue are most reliable.
We then examined the effect that optimizing water has on the chemical shift of the fluorinated tyrosine. If 3FY98, which shows divergent behavior that is not fully understood, is excluded, the best-match protocol that relaxes only the target residue has a mean unsigned deviation (MUD) from experimental data of 3.5 ppm (see Table S2). Including optimized water networks with the target residue leads to improvements in the protocol when the residue is directly solvated by water. When explicit water solvent is optimized, the MUD for the same protocol improves to 1.3 ppm. This demonstrates that explicit solvent optimization dramatically improves predictions of the fluorine environment in bromodomain-containing proteins. Future work should likely focus further on the accurate prediction of relative conformer energetics as they influence averaged chemical shifts.
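The mean unsigned deviation is the average absolute difference between predicted and measured shifts. As a worked illustration, the sketch below evaluates it for the four residues whose single-conformer prediction and measured value are both quoted in the preceding paragraphs (3FY97, 3FY118, 3FY137, 3FY139), assuming the comparison value quoted for 3FY137 is the measured shift. This is a subset chosen for illustration and is not expected to reproduce the 1.3 ppm figure, which refers to the full data set in the Supporting Information; the function and variable names are hypothetical.

```python
def mean_unsigned_deviation(predicted, measured):
    """Average of |predicted - measured| over paired values (ppm)."""
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - m) for p, m in zip(predicted, measured)) / len(predicted)


# (predicted, measured) shifts in ppm, as quoted in the text above.
quoted_pairs = {
    "3FY97": (-138.6, -140.1),
    "3FY118": (-137.4, -138.0),
    "3FY137": (-134.1, -134.0),
    "3FY139": (-137.9, -136.6),
}
predicted = [pair[0] for pair in quoted_pairs.values()]
measured = [pair[1] for pair in quoted_pairs.values()]
subset_mud = mean_unsigned_deviation(predicted, measured)  # about 0.875 ppm
```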
■ CONCLUSIONS
This work has taken a first step toward automated prediction of 19F NMR spectra for bromodomains. The machinery has been built and tested on Brd4 to take the protein crystal structure, extract clusters based on target residues and desired cluster sizes, and predict chemical shifts. We have shown that water plays a significant role in the prediction of the 19F chemical shifts of 3FY residues and that models must take account of water’s influence on the tyrosine phenol group in order to accurately predict chemical shifts. This makes 3FY much more challenging to model than 4-fluorophenylalanine-labeled proteins but can provide significant insight into the local structure of hydrogen bonding networks. Further work on accurately sampling the thermodynamically accessible ensemble of configurations (e.g., through sampling of molecular dynamics (MD) trajectories as evaluated by Lehtivarjo et al.)16 could improve this first-generation protocol, albeit at a considerably higher computational cost associated with performing MD simulations. As a reasonable first approximation to this challenge of thermodynamic sampling, our first-generation model uses simply the four conformers associated with the orientation of the phenolic proton and the fluorine atom. Further consideration of more sophisticated techniques for ensemble averaging is warranted in future studies. Based on results from Brd4, this method should be able to predict accurate chemical shifts for 3-fluorotyrosine residues where water does not directly participate as a hydrogen bonding partner; for microsolvated residues, further attention to alternative possibilities may be necessary. This protocol shows promise as a tool to facilitate assignment of challenging 19F NMR spectra in labeled proteins.
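To illustrate how relative conformer energies could enter an averaged chemical shift over the four conformers, the sketch below applies a standard Boltzmann weighting. This is an assumed averaging scheme, not necessarily the one used in this work. The function name, the temperature, and all numerical shifts and energies are made-up placeholders rather than results from this study, and the label s-trans-in is assumed by analogy with the three conformer labels that appear in the text.

```python
import math

K_B_KCAL = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def boltzmann_average(shifts_ppm, rel_energies_kcal, temperature_k=298.15):
    """Return (population-weighted shift, populations) for a set of conformers.

    shifts_ppm and rel_energies_kcal map the same conformer labels to a
    predicted shift (ppm) and a relative energy (kcal/mol), respectively.
    """
    beta = 1.0 / (K_B_KCAL * temperature_k)
    e_min = min(rel_energies_kcal.values())
    # Unnormalized Boltzmann factors relative to the lowest-energy conformer.
    factors = {c: math.exp(-beta * (e - e_min))
               for c, e in rel_energies_kcal.items()}
    z = sum(factors.values())
    populations = {c: f / z for c, f in factors.items()}
    average = sum(populations[c] * shifts_ppm[c] for c in populations)
    return average, populations


# Placeholder inputs for illustration only (NOT values from this study).
shifts_ppm = {"s-cis-ex": -134.0, "s-cis-in": -130.0,
              "s-trans-ex": -140.0, "s-trans-in": -138.0}
rel_energies_kcal = {"s-cis-ex": 0.5, "s-cis-in": 1.5,
                     "s-trans-ex": 0.0, "s-trans-in": 1.0}

avg_shift, pops = boltzmann_average(shifts_ppm, rel_energies_kcal)
for conformer, p in sorted(pops.items()):
    print(f"{conformer:11s} population = {p:.3f}")
print(f"Boltzmann-averaged shift = {avg_shift:.2f} ppm")
```

Under such a scheme, an error of a few tenths of a kcal/mol in the relative energies shifts the populations appreciably, so the averaged shift can be only as reliable as the predicted conformer energetics.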
■ REFERENCES
(1) Huston, A.; Arrowsmith, C. H.; Knapp, S.; Schapira, M. Probing the epigenome. Nat. Chem. Biol. 2015, 11 (8), 542−545.
(2) Nicodeme, E.; Jeffrey, K. L.; Schaefer, U.; Beinke, S.; Dewell, S.; Chung, C. W.; Chandwani, R.; Marazzi, I.; Wilson, P.; Coste, H.; White, J.; Kirilovsky, J.; Rice, C. M.; Lora, J. M.; Prinjha, R. K.; Lee, K.; Tarakhovsky, A. Suppression of inflammation by a synthetic histone mimic. Nature 2010, 468 (7327), 1119−1123.
(3) Filippakopoulos, P.; Qi, J.; Picaud, S.; Shen, Y.; Smith, W. B.; Fedorov, O.; Morse, E. M.; Keates, T.; Hickman, T. T.; Felletar, I.; Philpott, M.; Munro, S.; McKeown, M. R.; Wang, Y. C.; Christie, A. L.; West, N.; Cameron, M. J.; Schwartz, B.; Heightman, T. D.; La Thangue, N.; French, C. A.; Wiest, O.; Kung, A. L.; Knapp, S.; Bradner, J. E. Selective inhibition of BET bromodomains. Nature 2010, 468 (7327), 1067−1073.
(4) Filippakopoulos, P.; Knapp, S. Targeting bromodomains: epigenetic readers of lysine acetylation. Nat. Rev. Drug Discovery 2014, 13 (5), 337−356.
(5) Bamborough, P.; Chung, C. W. Fragments in bromodomain drug discovery. MedChemComm 2015, 6 (9), 1587−1604.
(6) Mishra, N. K.; Urick, A. K.; Ember, S. W. J.; Schönbrunn, E.; Pomerantz, W. C. Fluorinated Aromatic Amino Acids Are Sensitive 19F NMR Probes for Bromodomain-Ligand Interactions. ACS Chem. Biol. 2014, 9 (12), 2755−2760.
(7) Gee, C. T.; Koleski, E. J.; Pomerantz, W. C. K. Fragment Screening and Druggability Assessment for the CBP/p300 KIX Domain through Protein-Observed 19F NMR Spectroscopy. Angew. Chem., Int. Ed. 2015, 54 (12), 3735−3739.
(8) Leung, E. W. W.; Yagi, H.; Harjani, J. R.; Mulcair, M. D.; Scanlon, M. J.; Baell, J. B.; Norton, R. S. 19F NMR as a Probe of Ligand Interactions with the iNOS Binding site of SPRY Domain-Containing SOCS Box Protein 2. Chem. Biol. Drug Des. 2014, 84 (5), 616−625.
(9) Pomerantz, W. C.; Wang, N.; Lipinski, A. K.; Wang, R.; Cierpicki, T.; Mapp, A. K. Profiling the Dynamic Interfaces of Fluorinated Transcription Complexes for Ligand Discovery and Characterization. ACS Chem. Biol. 2012, 7 (8), 1345−1350.
(10) Gee, C. T.; Arntson, A. E.; Urick, A. K.; Mishra, N. K.; Pomerantz, W. C. K. Protein-Observed 19F NMR for Fragment Screening, Affinity Quantification, and Druggability Assessment. Nat. Protoc. 2016, Accepted.
(11) Urick, A. K.; Hawk, L. M. L.; Cassel, M. K.; Mishra, N. K.; Liu, S.; Adhikari, N.; Zhang, W.; dos Santos, C. O.; Hall, J. L.; Pomerantz, W. C. K. Dual Screening of BPTF and Brd4 Using Protein-Observed Fluorine NMR Uncovers New Bromodomain Probe Molecules. ACS Chem. Biol. 2015, 10 (10), 2246−2256.
(12) Kitevski-LeBlanc, J. L.; Prosser, R. S. Current applications of 19F NMR to studies of protein structure and dynamics. Prog. Nucl. Magn. Reson. Spectrosc. 2012, 62, 1−33.
(13) Han, B.; Liu, Y.; Ginzinger, S. W.; Wishart, D. S. SHIFTX2: significantly improved protein chemical shift prediction. J. Biomol. NMR 2011, 50 (1), 43−57.
(14) Fluorine Chemical Shift Database. Laboratory of Dr Carl Frieden, Washington University School of Medicine. Website: http://biochem.wustl.edu/bmbnmr/Fluorine.html.
(15) Karp, J. M.; Eryilmaz, E.; Cowburn, D. Correlation of chemical shifts predicted by molecular dynamics simulations for partially disordered proteins. J. Biomol. NMR 2015, 61 (1), 35−45.
(16) Lehtivarjo, J.; Tuppurainen, K.; Hassinen, T.; Laatikainen, R.; Peräkylä, M. Combining NMR ensembles and molecular dynamics simulations provides more realistic models of protein structures in solution and leads to better chemical shift prediction. J. Biomol. NMR 2012, 52 (3), 257−267.
(17) Pearson, J. G.; Oldfield, E.; Lee, F. S.; Warshel, A. Chemical Shifts in Proteins - A Shielding Trajectory Analysis of the Fluorine Nuclear Magnetic Resonance Spectrum of the Escherichia coli Galactose binding protein using a multipole shielding polarizability local reaction field molecular dynamics approach. J. Am. Chem. Soc. 1993, 115 (15), 6851−6862.
(18) Sternberg, U.; Klipfel, M.; Grage, S. L.; Witter, R.; Ulrich, A. S. Calculation of fluorine chemical shift tensors for the interpretation of oriented 19F-NMR spectra of gramicidin A in membranes. Phys. Chem. Chem. Phys. 2009, 11 (32), 7048−7060.
(19) Lau, E. Y.; Gerig, J. T. Origins of fluorine NMR chemical shifts in fluorine-containing proteins. J. Am. Chem. Soc. 2000, 122 (18), 4408−4417.
(20) Kasireddy, C.; Bann, J. G.; Mitchell-Koch, K. R. Demystifying fluorine chemical shifts: electronic structure calculations address origins of seemingly anomalous 19F-NMR spectra of fluorohistidine isomers and analogues. Phys. Chem. Chem. Phys. 2015, 17 (45), 30606−30612.
(21) Isley, W. C., III Comp-Chem-Tools. Website: https://github.com/william-isley-3rd/Comp-Chem-Tools.
(22) Hospital, A.; Andrio, P.; Fenollosa, C.; Cicin-Sain, D.; Orozco, M.; Gelpí, J. L. MDWeb and MDMoby: an integrated web-based platform for molecular dynamics simulations. Bioinformatics 2012, 28 (9), 1278−1279.
(23) Hornak, V.; Abel, R.; Okur, A.; Strockbine, B.; Roitberg, A.; Simmerling, C. Comparison of multiple Amber force fields and development of improved protein backbone parameters. Proteins: Struct., Funct., Genet. 2006, 65 (3), 712−725.
(24) Best, R. B.; Hummer, G. Optimized Molecular Dynamics Force Fields Applied to the Helix−Coil Transition of Polypeptides. J. Phys. Chem. B 2009, 113 (26), 9004−9015.
(25) Gelpí, J. L.; Kalko, S. G.; Barril, X.; Cirera, J.; de la Cruz, X.; Luque, F. J.; Orozco, M. Classical molecular interaction potentials: Improved setup procedure in molecular dynamics simulations of proteins. Proteins: Struct., Funct., Genet. 2001, 45 (4), 428−437.
(26) Phillips, J. C.; Braun, R.; Wang, W.; Gumbart, J.; Tajkhorshid, E.; Villa, E.; Chipot, C.; Skeel, R. D.; Kalé, L.; Schulten, K. Scalable molecular dynamics with NAMD. J. Comput. Chem. 2005, 26 (16), 1781−1802.
(27) O’Boyle, N.; Banck, M.; James, C.; Morley, C.; Vandermeersch, T.; Hutchison, G. Open Babel: An open chemical toolbox. J. Cheminf. 2011, 3 (1), 33.
(28) Zhao, Y.; Truhlar, D. G. A new local density functional for main-group thermochemistry, transition metal bonding, thermochemical kinetics, and noncovalent interactions. J. Chem. Phys. 2006, 125 (19), 194101.
(29) Weigend, F.; Ahlrichs, R. Balanced basis sets of split valence, triple zeta valence and quadruple zeta valence quality for H to Rn: Design and assessment of accuracy. Phys. Chem. Chem. Phys. 2005, 7 (18), 3297−3305.
(30) Marenich, A. V.; Cramer, C. J.; Truhlar, D. G. Universal solvation model based on solute electron density and on a continuum model of the solvent defined by the bulk dielectric constant and atomic surface tensions. J. Phys. Chem. B 2009, 113 (18), 6378−6396.
(31) Frisch, M. J.; Trucks, G. W.; Schlegel, H. B.; Scuseria, G. E.; Robb, M. A.; Cheeseman, J. R.; Scalmani, G.; Barone, V.; Mennucci, B.; Petersson, G. A.; Nakatsuji, H.; Caricato, M.; Li, X.; Hratchian, H. P.; Izmaylov, A. F.; Bloino, J.; Zheng, G.; Sonnenberg, J. L.; Hada, M.; Ehara, M.; Toyota, K.; Fukuda, R.; Hasegawa, J.; Ortiz, J. V.; Cioslowski, J.; Fox, D. J. Gaussian 09, Revision C.1; Gaussian, Inc.: Wallingford, CT, 2009.
(32) Adamo, C.; Barone, V. Toward reliable density functional methods without adjustable parameters: The PBE0 model. J. Chem. Phys. 1999, 110 (13), 6158−6170.
(33) Becke, A. D. Density-functional thermochemistry. III. The role of exact exchange. J. Chem. Phys. 1993, 98 (7), 5648.
(34) Chong, D. P. Recent advances in density functional methods; World Scientific: Singapore, 1995.
(35) McLean, A. D.; Chandler, G. S. Contracted Gaussian basis sets for molecular calculations. I. Second row atoms, Z = 11−18. J. Chem. Phys. 1980, 72 (10), 5639−5648.
(36) Krishnan, R.; Binkley, J. S.; Seeger, R.; Pople, J. A. Self-consistent molecular orbital methods. XX. A basis set for correlated wave functions. J. Chem. Phys. 1980, 72 (1), 650−654.
(37) Sanders, L. K.; Oldfield, E. Theoretical Investigation of 19F NMR Chemical Shielding Tensors in Fluorobenzenes. J. Phys. Chem. A 2001, 105 (34), 8098−8104.
(38) Oldfield, E. Chemical Shifts in Amino Acids, Peptides, and Proteins: From Quantum Chemistry to Drug Design. Annu. Rev. Phys. Chem. 2002, 53 (1), 349−378.
(39) Allen, M. J.; Keal, T. W.; Tozer, D. J. Improved NMR chemical shifts in density functional theory. Chem. Phys. Lett. 2003, 380 (1−2), 70−77.
(40) Liu, P.; Goddard, J. D.; Arsenault, G.; Gu, J.; McAlees, A.; McCrindle, R.; Robertson, V. Theoretical studies of the conformations and 19F NMR spectra of linear and a branched perfluorooctanesulfonamide (PFOSAmide). Chemosphere 2007, 69 (8), 1213−1220.
(41) Dalvit, C.; Invernizzi, C.; Vulpetti, A. Fluorine as a Hydrogen-Bond Acceptor: Experimental Evidence and Computational Calculations. Chem. – Eur. J. 2014, 20 (35), 11058−11068.
(42) Marchand, A. P.; Rose, J. E. Bridge-proton absorptions in the nuclear magnetic resonance spectra of norbornene and related systems. J. Am. Chem. Soc. 1968, 90 (14), 3724−3731.
(43) Cai, S.-H.; Chen, Z.; Wan, H.-L. Theoretical Investigation of 19F NMR Chemical Shielding of Alkaline-Earth-Metal and Alkali-Metal Fluorides. J. Phys. Chem. A 2002, 106 (6), 1060−1066.
(44) Seyedsayamdost, M. R.; Reece, S. Y.; Nocera, D. G.; Stubbe, J. Mono-, Di-, Tri-, and Tetra-Substituted Fluorotyrosines: New Probes for Enzymes That Use Tyrosyl Radicals in Catalysis. J. Am. Chem. Soc. 2006, 128 (5), 1569−1579.