Open Access

Structural Bioinformatics Studies of Integral Transmembrane Enzymes pMMO Complex, C560, CYB, and DHSD and their AlphaFold3-Predicted Water-Soluble QTY Variants

Jan 28, 2025


Introduction

On planet Earth, nature has evolved a set of twenty amino acids consisting of 10 hydrophilic and 10 hydrophobic amino acids (Figure 1, Figure S1) (1, 2, 3, 4). Hydrophilic amino acids form hydrogen bonds with water molecules and are thus readily water-soluble. They include Aspartic acid (D), Glutamic acid (E), Asparagine (N), Glutamine (Q), Lysine (K), Arginine (R), Serine (S), Threonine (T), Histidine (H) and Tyrosine (Y). Conversely, hydrophobic amino acids do not form hydrogen bonds with water molecules and are thus water-insoluble. They include Leucine (L), Isoleucine (I), Valine (V), Phenylalanine (F), Methionine (M), Tryptophan (W) and Alanine (A). Cysteine (C) and Glycine (G) are only weakly hydrophilic and hydrophobic, respectively. Proline (P) is a unique case: the nitrogen of Proline is part of the ring structure formed by the side chain. Proline is frequently found at the N-terminus of helices, but if it occurs in the middle of an alpha-helix, it forces the helix to bend (4).

Figure 1.

Experimental 1.5Å-resolution X-ray electron density maps of 20 amino acids arranged by size. This figure is provided by Dr. Mike Sawaya (UCLA), and used with permission in order to show the individual amino acid electron density maps at high resolution. The density maps demonstrate similar shapes of V and T; L, D, N, E and Q, and F and Y. Please see Dr. Mike Sawaya's original website: http://people.mbi.ucla.edu/sawaya/m230d/Modelbuilding/modelbuilding.html (courtesy of Dr. Michael R. Sawaya of University of California, Los Angeles, CA, USA).

Proteins can generally be divided into two classes: Class I is hydrophilic and Class II is hydrophobic (1, 2, 3, 4). Class I hydrophilic proteins include water-soluble proteins that reside in the cytoplasm, such as hemoglobin, as well as metabolic enzymes and circulating proteins outside cells, such as growth factors, hormones, and antibodies. On the other hand, Class II hydrophobic proteins comprise the integral membrane proteins that are embedded in cellular and other membranes. They include G protein-coupled receptors (GPCRs), membrane transporters and ion channels as well as the photosynthesis machinery. The Class I proteins are generally water-soluble, and the Class II proteins, as integral membrane proteins, are generally water-insoluble. To solubilize Class II proteins, various detergents or surfactants are required during isolation from their lipid-bilayer membrane environment and for stabilization after removal (5, 6, 7).

Alpha-helices can be classified into three chemically distinct types (see Table S1). Although they differ significantly in chemical properties and water solubility, they have nearly identical molecular structures, namely: i) a 1.5 Å rise per amino acid, ii) a 100° rotation per amino acid, iii) 3.6 amino acids per helical turn, iv) a 5.4 Å rise per helical turn, and v) the key feature, i.e., each NH group of an amino acid forms an H-bond with the C=O group of the amino acid 4 residues away. This repeated i, i + 3, i + 4 H-bonding pattern is the most prominent characteristic of an alpha-helix (1, 2, 3, 4).
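The helix parameters listed above are mutually consistent, which a few lines of arithmetic make explicit. The following is a minimal sketch, not part of the original study: it uses only the 1.5 Å rise and 100° rotation quoted in the text, and the function names `helix_length` and `wheel_angle` are hypothetical helpers introduced for illustration.

```python
# Ideal alpha-helix geometry, using only the two values quoted in the text.
RISE_PER_RESIDUE = 1.5        # Å per amino acid
ROTATION_PER_RESIDUE = 100.0  # degrees per amino acid

# Derived quantities: residues per turn and rise per turn (pitch).
RESIDUES_PER_TURN = 360.0 / ROTATION_PER_RESIDUE      # 3.6
PITCH = RESIDUES_PER_TURN * RISE_PER_RESIDUE          # about 5.4 Å


def helix_length(n_residues: int) -> float:
    """Approximate axial length (Å) of an ideal helix with n_residues."""
    return n_residues * RISE_PER_RESIDUE


def wheel_angle(k: int) -> float:
    """Angular position (degrees) of residue k on a helical wheel, residue 0 at 0°."""
    return (k * ROTATION_PER_RESIDUE) % 360.0


if __name__ == "__main__":
    print(f"residues per turn: {RESIDUES_PER_TURN}")
    print(f"pitch: {PITCH:.1f} Å")
    print(f"length of a 20-residue helix: {helix_length(20):.1f} Å")
    print("wheel angles of residues 0-4:", [wheel_angle(k) for k in range(5)])
```

The helical-wheel angle is what determines whether hydrophilic and hydrophobic residues fall on opposite faces of a helix, as in the amphiphilic Type III helices.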

The Type I alpha-helix is composed mostly of hydrophilic amino acids on the surface, including D, E, N, H, Q, K, R, S, T and Y, and is commonly found on the outer layer of water-soluble globular proteins and, in some cases, in the inner layer of membrane helices, away from the lipid bilayer. The Type II alpha-helix is composed mostly of hydrophobic amino acids, including L, I, V, F, M, A, W and P, and is commonly found in the helical transmembrane segments of membrane proteins and buried in the interior of water-soluble proteins. The Type III amphiphilic alpha-helix is composed almost equally of hydrophilic and hydrophobic amino acids that are partitioned into a hydrophobic face and a hydrophilic face. The Type III alpha-helix is sometimes attached to the surface of membrane lipid bilayers, or partially buried in the hydrophobic core and partially exposed on the surface of water-soluble globular proteins, or found in integral membrane pores that face away from the hydrophobic lipid bilayer (Table S1). Glycine (G) is referred to as hydrophobic because its side chain hydrogen does not engage in H-bonds, but it is only weakly hydrophobic.

Over time, nature has evolved alpha-helices of three chemically distinct types. Having classified them, we can apply a simple QTY code to convert one type of alpha-helix into another, i.e., a hydrophobic alpha-helix into a hydrophilic one, and, vice versa, apply the reverse QTY code (rQTY code) to convert a hydrophilic alpha-helix into a hydrophobic one. The QTY code systematically replaces the hydrophobic amino acids L, V, I and F with the hydrophilic Q, T, and Y. Here, we demonstrate the QTY code conversion of integral membrane proteins to their water-soluble counterparts.

Methanol holds potential as an alternative energy source to petroleum and coal because it can be burned as a fuel. It can be produced by the oxidation of methane, a greenhouse gas commonly found in the atmosphere. No enzymatic process currently performs this reaction efficiently at the industrial scale, but naturally occurring enzymes are known to carry out this function in methanotrophs. In particular, the integral transmembrane enzyme particulate methane monooxygenase (pMMO), a complex composed of pMOA, pMOB, and pMOC found in methanotrophic bacteria, is known to perform this oxidation.

The membrane-bound pMMO is the predominant catalyst in nature for converting methane to methanol and is a three-protein complex that requires copper to carry out catalysis (8, 9, 10, 11, 12). Its X-ray crystal structures (8) and cryoEM structures (9) have been elucidated, and its 2D array structures have also been determined by cryo-electron tomography (10). pMMO is substantially more effective than the natural soluble MMO (sMMO), the other such naturally occurring catalyst, since it has a much higher affinity for molecular oxygen (O2) and methane (CH4) in catalyzing the conversion of methane to methanol. pMMO has a lower apparent KM than sMMO (12, 13, 14), most likely because the high-density pMMO arrays act synergistically to capture methane (10) and carry out the subsequent enzymatic catalysis. The pMMO catalytic assay has been well established (12, 13, 14, 15). In some methanotrophs under copper starvation conditions, pMMO can be expressed at ~20% of total protein (16), and 80% of the pMMO is in the membranes that form high-density arrays (10, 17).

However, it is extremely difficult to express, purify and study membrane proteins like pMMO because they require detergents to stabilize them once they are removed from the cell membrane (18). It is currently not feasible to produce membrane proteins at an industrial scale of kilograms. Thus, innovative methods must be found to express and purify membrane proteins at large scale.

One of us (S. Zhang) conceived and invented a simple QTY code to directly convert a hydrophobic alpha-helix to a hydrophilic alpha-helix (19). It was demonstrated that the QTY code can convert membrane proteins, including G protein-coupled receptors and cytokine receptors, into water-soluble analogs that retain their biological function, namely ligand-binding activities similar to those of the native receptors (20, 21, 22, 23). The water-soluble QTY analog receptor CXCR4QTY was further used to design a highly sensitive detection device (24). Furthermore, we carried out structural bioinformatic studies of diverse integral membrane proteins using AlphaFold2. These studies show that the water-soluble QTY analogs superposed very well with the corresponding native membrane proteins, often with RMSDs below 2 Å, for proteins including GPCRs (25), glucose transporters (26), SLC solute carrier transporters (27), ABC transporters (28), glutamate transporters (29) and monoamine transporters (30).

We recently demonstrated that a membrane-bound bacterial enzyme histidine kinase was converted to its water-soluble QTY analog, whereby the QTY enzyme histidine kinase retains its 4 different enzymatic activities (31). We therefore believe that we can also design pMMO to transform it into its water-soluble analog pMMOQTY that may retain its enzymatic catalytic activity to convert methane into methanol.

Structural similarities exist between subunits of the particulate methane monooxygenase and cytochrome c oxidase II (terminal complex in the respiratory chain). Both contain a copper center, and such a comparative study could yield important insights into electron transfer in oxidation (9, 31). Specifically, the subunits CYB, C560, and DHSD are known to be involved in electron transfer. CYB is a subunit of the ubiquinol-cytochrome c reductase complex (cytochrome b-c1 complex) responsible for transferring electrons to cytochrome c complex, and C560 and DHSD are both subunits of the receiving respiratory complex II succinate dehydrogenase (cytochrome c complex) (32, 33). Their structures have also been experimentally determined through cryoEM (33) and X-ray crystallography (34, 35). However, similar problems regarding the difficulty of studying transmembrane regions exist. Thus, solubilizing the methane monooxygenase with the QTY code can offer a target for further studies on electron transfer during oxidation.

AlphaFold2 and its latest version, AlphaFold3, are online programs developed by Google DeepMind (36, 37, 38) that use deep learning to predict protein structures from FASTA sequences. Compared with traditional experimental structure determination, AlphaFold3 offers a remarkable increase in efficiency while maintaining fairly high accuracy. Despite known limitations in flexible loops and the persisting need for experimental verification, it significantly accelerates both protein structural studies and de novo protein design.

Here, we present bioinformatic studies of six transmembrane proteins whose structures were previously determined experimentally by others, namely pMOA, pMOB, pMOC, CYB, C560, and DHSD, and of their AlphaFold3-predicted water-soluble QTY variants. We provide superpositions of the insoluble native proteins and their soluble QTY variants. We also compare the hydrophobic native structures with their hydrophilic QTY variants.

Results and Discussion
Rationale of the QTY code

The QTY code systematically replaces the hydrophobic amino acids leucine (L), isoleucine (I), valine (V), and phenylalanine (F) with the hydrophilic amino acids glutamine (Q), threonine (T), and tyrosine (Y). Since the electron density maps of L vs Q, I/V vs T, and F vs Y are highly similar, such conversion can preserve the functional structure of proteins while the transmembrane domain loses its hydrophobic character.
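The substitution rule can be illustrated with a short sketch. This is not the authors' software: the function name `apply_qty`, the example sequence, and the transmembrane boundaries are all hypothetical, and real boundaries would have to come from a topology annotation.

```python
# QTY code as described in the text: L -> Q, I -> T, V -> T, F -> Y.
QTY_CODE = {"L": "Q", "I": "T", "V": "T", "F": "Y"}


def apply_qty(sequence: str, tm_regions: list[tuple[int, int]]) -> str:
    """Apply the QTY code inside the given transmembrane regions only.

    tm_regions holds 1-based, inclusive (start, end) residue ranges.
    Residues outside these ranges are left unchanged.
    """
    residues = list(sequence)
    for start, end in tm_regions:
        for idx in range(start - 1, end):
            residues[idx] = QTY_CODE.get(residues[idx], residues[idx])
    return "".join(residues)


if __name__ == "__main__":
    # Hypothetical 13-residue sequence with one assumed TM segment (residues 4-11).
    native = "MKDLIVFAGLLSR"
    print(native)
    print(apply_qty(native, [(4, 11)]))
```

Restricting the substitution to annotated transmembrane ranges reflects the text's description that the code is applied to the transmembrane domain, which is why overall variation is much lower than transmembrane variation.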

Protein sequence alignments, isoelectric points (pI) and molecular weights

We aligned the protein sequences of the native pMOA, pMOB, pMOC, CYB, C560, and DHSD and their QTY variants (Figure 2). Despite substitutions across the overall protein sequence (3.66–22.89%) and substantial variation in the transmembrane domain (33.33–86.61%), the isoelectric points (pI) of the native proteins and the engineered QTY variants remain alike, with differences of 0–0.10 pH units. This is because the amino acids Q, T, and Y carry no charge at neutral pH, which minimizes the impact on a protein's pI. Such differences are insignificant with regard to surface charge and are unlikely to disrupt delicate structures and functions.
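As an illustration of how overall and transmembrane variation percentages of this kind can be computed, here is a minimal sketch. It assumes that variation means the fraction of substituted positions within the region considered; the text does not spell out the exact definition used by the authors, and the function name `variation_percent` and the example sequences are hypothetical.

```python
def variation_percent(native, variant, regions=None):
    """Percentage of positions that differ between two equal-length sequences.

    regions is an optional list of 1-based, inclusive (start, end) ranges;
    when given, only positions inside those ranges are counted.
    """
    if len(native) != len(variant):
        raise ValueError("sequences must have equal length")
    if regions is None:
        positions = list(range(len(native)))
    else:
        positions = [i for start, end in regions for i in range(start - 1, end)]
    if not positions:
        raise ValueError("no positions to compare")
    changed = sum(1 for i in positions if native[i] != variant[i])
    return 100.0 * changed / len(positions)


if __name__ == "__main__":
    # Hypothetical native sequence and its QTY variant (TM segment assumed at 4-11).
    native = "MKDLIVFAGLLSR"
    variant = "MKDQTTYAGQQSR"
    print(f"overall variation: {variation_percent(native, variant):.2f}%")
    print(f"TM variation: {variation_percent(native, variant, [(4, 11)]):.2f}%")
```

Because the QTY code only substitutes residues and never inserts or deletes them, a position-by-position comparison is sufficient and no gapped alignment is needed.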

Figure 2.

Protein sequence alignments of six native protein sequences with their water-soluble QTY variants. The symbols | and * indicate whether amino acids are identical or different, respectively. The alpha helices (blue) are shown above the sequences. Alignments shown include a) pMOA vs pMOAQTY, b) pMOB vs pMOBQTY, c) pMOC vs pMOCQTY, d) CYB vs CYBQTY, e) C560 vs C560QTY, f) DHSD vs DHSDQTY.

Superposition of native proteins and their water-soluble QTY variants

After predicting the structures of the water-soluble QTY variants with AlphaFold3, we superposed the molecular structures of the six CryoEM-determined native proteins and their respective QTY variants in order to compare them directly. The molecular structures of the native proteins have already been experimentally determined: pMOA (PDB: 7EV9), pMOB (PDB: 7EV9), pMOC (PDB: 7S4I), CYB (PDB: 8IOG), C560 (PDB: 8GS8), and DHSD (PDB: 8GS8). Thus, the superpositions were carried out for pMOA vs pMOAQTY, pMOB vs pMOBQTY, pMOC vs pMOCQTY, C560 vs C560QTY, CYB vs CYBQTY, and DHSD vs DHSDQTY.

The experimentally determined native structures and the AlphaFold3-predicted QTY structures superposed surprisingly well, with root mean square deviation (RMSD) values all below 1.00 Å, ranging from 0.302 Å to 0.595 Å for the six protein pairs (Table 1 and Figure 3). Specifically, their RMSD values are as follows: pMOA vs pMOAQTY (0.302 Å), pMOB vs pMOBQTY (0.460 Å), pMOC vs pMOCQTY (0.590 Å), CYB vs CYBQTY (0.595 Å), C560 vs C560QTY (0.489 Å), and DHSD vs DHSDQTY (0.343 Å). Despite 33.33–86.61% amino acid substitutions in the transmembrane domain using the QTY code, these results demonstrate that the 3-dimensional folds are well preserved between the native proteins and their QTY variants.
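For readers unfamiliar with the metric, RMSD over N paired atoms is the square root of the mean squared distance between corresponding atoms. The sketch below computes it for two coordinate sets that are assumed to be already superposed and paired atom by atom; it does not perform the superposition itself, which in this study was done in PyMOL. The function name `rmsd` and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD (same units as the input) between two paired, pre-superposed
    lists of (x, y, z) coordinates."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("coordinate lists must not be empty")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


if __name__ == "__main__":
    # Toy example: every atom in the second set is displaced by 1 Å along z.
    native = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
    variant = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]
    print(f"RMSD: {rmsd(native, variant):.3f} Å")
```

Values reported by structure-alignment software can differ from this plain formula, because such tools may first optimize the fit and exclude poorly matching atom pairs.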

Table 1.

RMSD and protein characteristic comparison between six selected native proteins and their QTY variants.

Protein      RMSD (Å)  pI    MW (kDa)  TM variation (%)  Overall variation (%)
pMOACryoEM   -         6.96  28.4252   -                 -
pMOAQTY      0.302     6.95  28.7047   44.44             22.67
pMOBCryoEM   -         6.03  42.7860   -                 -
pMOBQTY      0.460     6.03  42.8397   33.33             3.66
pMOCCryoEM   -         5.67  28.8514   -                 -
pMOCQTY      0.590     5.67  29.1388   41.98             22.00
CYBCryoEM    -         7.82  42.7176   -                 -
CYBQTY       0.595     7.76  43.3619   51.48             22.89
C560CryoEM   -         9.30  14.9700   -                 -
C560QTY      0.489     9.20  15.1600   86.61             19.12
DHSDCryoEM   -         7.75  10.8887   -                 -
DHSDQTY      0.343     7.73  11.1701   35.38             22.33

Abbreviations: RMSD = root mean square deviation; pI = isoelectric point; MW = molecular weight; TM = transmembrane; and (-) = not applicable.

Figure 3.

Superpositions of six CryoEM-determined structures of native oxidation enzymes and their AlphaFold3-predicted water-soluble QTY variants. The CryoEM-determined structures were obtained from the Protein Data Bank (PDB). The CryoEM structures (magenta) are superposed with their QTY variants (cyan) predicted by AlphaFold3. These superposed structures show that the native enzymes and their QTY variants have very similar structures. For clarity of direct comparison, unstructured loops in the CryoEM structures were removed in the QTY variants. a) pMOA vs pMOAQTY (0.302 Å), b) pMOB vs pMOBQTY (0.460 Å), c) pMOC vs pMOCQTY (0.590 Å), d) CYB vs CYBQTY (0.595 Å), e) C560 vs C560QTY (0.489 Å), and f) DHSD vs DHSDQTY (0.343 Å).

Superpositions of AlphaFold3-predicted native proteins and their water-soluble QTY variants

We also asked how well the AlphaFold3-predicted native structures would superpose with the AlphaFold3-predicted structures of their water-soluble QTY variants. We therefore superposed the following pairs of protein structures (Figure 4), which yielded relatively low RMSD values: pMOAAF3 vs pMOAQTY, pMOBAF3 vs pMOBQTY, pMOCAF3 vs pMOCQTY, CYBAF3 vs CYBQTY, C560AF3 vs C560QTY, and DHSDAF3 vs DHSDQTY. These results suggest that both the AlphaFold3-predicted and the experimentally determined native structures are similar to those of their QTY variants, although the QTY variant structures need to be experimentally verified. We also superposed the AlphaFold3-predicted native structures with their experimentally obtained CryoEM structures and calculated their RMSD to further test AlphaFold3's accuracy (Figure S8 in Supplementary Materials).

Figure 4.

Superpositions of AlphaFold3-predicted structures of native and their QTY variants. Color code: green = AlphaFold3-predicted native structures; cyan = AlphaFold3-predicted water-soluble QTY variants. a) C560AlphaFold3 vs C560QTY, b) CYBAlphaFold3 vs CYBQTY, c) DHSDAlphaFold3 vs DHSDQTY, d) pMOAAlphaFold3 vs pMOAQTY, e) pMOBAlphaFold3 vs pMOBQTY, and f) pMOCAlphaFold3 vs pMOCQTY.

Superpositions of CryoEM structures with AlphaFold3-predicted native proteins and their water-soluble QTY variants

We also asked how the superposition would appear if we superposed i) the experimentally determined CryoEM structures with ii) the AlphaFold3-predicted native proteins and iii) the AlphaFold3-predicted water-soluble QTY variant proteins. The superpositions of these three structures are shown in Figure 5. The structures appear to align well, with minor differences in their residues and folding. To gain a clearer view of potential structural differences that would affect the proteins' abilities to function as transmembrane proteins, we also superposed only the transmembrane domains of the structures (Figure S9 in Supplementary Materials).

Figure 5.

Superpositions of CryoEM structures with AlphaFold3-predicted native enzymes and their water-soluble QTY variants. Superposition of i) the experimentally determined CryoEM structures (magenta) with ii) AlphaFold3-predicted native proteins (green) and iii) AlphaFold3-predicted water-soluble QTY variant proteins (cyan). a) C560CryoEM vs C560AlphaFold3 vs C560QTY, b) CYBCryoEM vs CYBAlphaFold3 vs CYBQTY, c) DHSDCryoEM vs DHSDAlphaFold3 vs DHSDQTY, d) pMOACryoEM vs pMOAAlphaFold3 vs pMOAQTY, e) pMOBCryoEM vs pMOBAlphaFold3 vs pMOBQTY, and f) pMOCCryoEM vs pMOCAlphaFold3 vs pMOCQTY.

Analysis of the hydrophobic surface of native proteins and the water-soluble QTY variants

To validate the effectiveness of our protein engineering, we mapped the hydrophobicity of Cryo-EM-determined structures against AlphaFold3-predicted structures with the QTY code applied (Figure 6). The original proteins possess 2–8 transmembrane alpha-helical domains directly embedded in phospholipid bilayers, and their structures contain the highly hydrophobic amino acids L, I, V, and F. As desired, hydrophobic patches were largely reduced after systematic replacement of amino acids L, I, V, F with the more hydrophilic Q, T, and Y using the QTY code in said transmembrane regions, demonstrating the desired conversion.
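The reduction in hydrophobicity can also be illustrated at the sequence level. The sketch below computes the mean hydropathy of a helix before and after QTY substitution using the Kyte-Doolittle scale. The choice of scale is an assumption made here for illustration (the text does not state which scale underlies the surface rendering), and the function name `mean_hydropathy` and the example helix are hypothetical.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

QTY_CODE = {"L": "Q", "I": "T", "V": "T", "F": "Y"}


def mean_hydropathy(sequence: str) -> float:
    """Average Kyte-Doolittle hydropathy over all residues of a sequence."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


if __name__ == "__main__":
    # Hypothetical transmembrane helix made only of L, I, V and F.
    native_helix = "LIVFLIVF"
    qty_helix = "".join(QTY_CODE.get(aa, aa) for aa in native_helix)
    print(f"{native_helix}: {mean_hydropathy(native_helix):+.3f}")
    print(f"{qty_helix}: {mean_hydropathy(qty_helix):+.3f}")
```

A sequence-level average of this kind ignores which residues are actually exposed, so it complements rather than replaces the structure-based surface maps in Figure 6.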

Figure 6.

Hydrophobic surface of six native proteins and their water-soluble QTY variants. The native oxidation enzymes have many hydrophobic residues L, I, V, and F in the transmembrane helices. After Q, T, and Y substitutions of L, I and V, and F respectively, the hydrophobic surface patches (yellow) in the transmembrane helices become more hydrophilic (cyan). a) pMOA vs pMOAQTY, b) pMOB vs pMOBQTY, c) pMOC vs pMOCQTY, d) CYB vs CYBQTY, e) C560 vs C560QTY, f) DHSD vs DHSDQTY.

AlphaFold3 predictions

Structural biology researchers have sought to predict posttranslational protein folding accurately and efficiently for almost 70 years. With the advent of the machine learning tools AlphaFold2 in 2021 (36, 37) and AlphaFold3 in 2024 (38), we are now able to study the structures of previously inaccessible proteins, especially those with integral embedded transmembrane regions. Rather than undergoing the difficult process of gene expression, protein production, detergent selection, purification, detergent exchange, maintaining stability and avoiding aggregation, scientists can generate fairly accurate predictions within hours, or even minutes, depending on protein size.

In collaboration with the European Bioinformatics Institute (EBI), DeepMind has made over 214 million predicted protein structures freely available online (https://alphafold.ebi.ac.uk) within the three years since AlphaFold2's release. This speed and accuracy are unprecedented in the field of structural biology (36, 37, 38).

Rationale for selecting enzymes in this study

Particulate methane monooxygenase (pMMO) is one of the three enzymatic complexes known to carry out methane hydroxylation (8, 9, 10, 11, 12, 13, 14, 15). The enzyme complex is very important for removing methane from the air and mitigating climate change. Its mechanism is not fully understood but is thought to be related to that of the structurally similar cytochrome c oxidase II complex, which also contains functionally important copper ions. The cytochrome c oxidase II complex found in the mitochondrial membrane is also involved in human diseases including mitochondrial encephalomyopathy, hypertrophic cardiomyopathy (HCM), sporadic mitochondrial myopathy (MM), pheochromocytoma syndrome, gastric stromal sarcoma, and nuclear type 3 mitochondrial complex II deficiency. We were thus interested in carrying out a structural bioinformatic study of six proteins, comprising the pMMO subunits and representative parts of the respiratory complex with known CryoEM structures, to further validate the QTY code.

The water-soluble QTY variants may be useful in a variety of scenarios, including i) using the purified soluble proteins to carry out ligand-binding studies for drug discovery, ii) determining the currently unknown detailed mechanisms and functions of methane monooxygenase and similar copper-center-containing proteins, and iii) potentially achieving industrial-scale use of methane as a biofuel by expressing soluble pMMOs in filamentous algae, which allows direct secretion of the protein into the surrounding culture media without the need for cell lysis.

We previously showed that the QTY code is also reversible because of its simple pairwise changes, L <=> Q, I/V <=> T, and F <=> Y. This is analogous to base pairing in the DNA double helix, A <=> T and G <=> C, and vice versa T <=> A and C <=> G. These QTY or reverse QTY changes can result from a single base pair mutation (Figure S1).

Karagöl et al. reported that several glutamate transporters (29) and serotonin, dopamine and norepinephrine transporters (30) have undergone such L <=> Q, I <=> T, and F <=> Y mutations. Some of these mutations result in disease and others are silent, depending on the specific location of the mutation.

Membrane transporter proteins are conserved from bacteria to mammals (39). This remarkable conservation makes them ideal subjects for evolutionary studies. For instance, evolutionary profiling potentially allows a better understanding of systematic amino acid substitutions, including the QTY code (29, 30). The phenomenon perhaps also took place in other transmembrane helical proteins including pMOA, pMOB, pMOC, CYB, C560, and DHSD. When the same idea is applied to the transmembrane helices of the pMMO complex and others, QTY code-based studies are likely to provide insights into the biological foundations of chemically distinct alpha-helices. Understanding alpha-helical evolution has the potential to provide crucial information on protein molecular evolution and sequence-based structural predictions. Moreover, by combining this information with variant-based approaches, we analyzed to what extent Darwinian natural selection and mutationally favored biases affected the diversification of QTY-code pairs (L/Q, I/T, V/T, and F/Y).

The chemical properties of the helical structures are governed by the genetic code, as the second position of a codon is a determinant of the chemical nature of the amino acid (1, 2, 3, 4, 19). For instance, i) amino acids with U at the second position are hydrophobic (Phe, Leu, Ile, Val, and Met); ii) amino acids with C at the second position are less hydrophobic (Pro and Ala) or carry a hydroxyl -OH group (Ser and Thr); iii) amino acids with A at the second position are hydrophilic and water-soluble (Asp, Glu, Asn, Gln, Lys, His and Tyr), and this group also includes the 2 stop codons Ochre (UAA) and Amber (UAG); iv) among amino acids with G at the second position, Arg and Ser are water-soluble, Cys is partially water-soluble, and Gly is achiral and has a hydrogen as its side chain (1, 2, 3, 4, 19). The third stop codon, UGA, also has G at the second position. In general, the pyrimidines U and C at the second position confer hydrophobicity; in contrast, the purines A and G at the second position confer hydrophilicity (Figure S1).
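The grouping by second codon position can be reproduced directly from the standard genetic code. The following is a minimal sketch, with the helper name `amino_acids_by_second_base` introduced only for illustration; it uses one-letter amino acid codes and `*` for stop codons.

```python
from collections import defaultdict
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons enumerated in UCAG order (UUU, UUC, UUA, ...).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def amino_acids_by_second_base():
    """Map each base to the sorted amino acids whose codons carry it at position 2."""
    groups = defaultdict(set)
    for codon, aa in CODON_TABLE.items():
        groups[codon[1]].add(aa)
    return {base: "".join(sorted(groups[base])) for base in BASES}


if __name__ == "__main__":
    for base, members in amino_acids_by_second_base().items():
        print(f"second position {base}: {members}")
```

Serine appears in two groups because its six codons are split between C and G at the second position, and tryptophan (UGG) falls in the G group alongside the residues named in the text.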

The variations all occur in the second position of the codon and include transition mutations, i.e., purine to purine (A->G, G->A) and pyrimidine to pyrimidine (C->U, U->C), and transversion mutations (U->A, U->G, C->A, C->G, A->U, A->C, G->U, G->C). Consider the L->Q, I->T, and F->Y substitutions. i) For L (leucine), two of the codons are CUA and CUG, and for Q (glutamine), the two codons are CAA and CAG; in these cases, U at the second position is mutated to A, which is a transversion mutation (U->A). ii) For I (isoleucine), the three codons are AUU, AUC, and AUA, and for T (threonine), the four codons are ACU, ACC, ACA, and ACG; here U is mutated to C, which is a transition mutation (U->C). iii) For F (phenylalanine), the two codons are UUU and UUC, and for Y (tyrosine), the two codons are UAU and UAC; U at the second position is mutated to A, which is a transversion mutation (U->A). The same logic applies to the reverse QTY mutations Q->L, T->I, and Y->F. Transitions are expected to be more frequent than transversions, with a much higher ratio in coding regions. The F<=>Y and L<=>Q changes, which require the same mutations (U->A and A->U), were observed in similar ratios.
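The codon-level reasoning above can be checked by enumerating every single-base change that converts one amino acid into another under the standard genetic code. This is an illustrative sketch; the function names `mutation_type` and `single_base_paths` are hypothetical helpers, not part of the original analysis.

```python
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
PURINES = {"A", "G"}


def mutation_type(old_base: str, new_base: str) -> str:
    """Transition if both bases are purines or both pyrimidines, else transversion."""
    same_class = (old_base in PURINES) == (new_base in PURINES)
    return "transition" if same_class else "transversion"


def single_base_paths(aa_from: str, aa_to: str):
    """List (codon, mutated codon, 1-based position, type) for each single-base
    change that turns a codon of aa_from into a codon of aa_to."""
    paths = []
    for codon, aa in CODON_TABLE.items():
        if aa != aa_from:
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutated = codon[:pos] + base + codon[pos + 1:]
                if CODON_TABLE[mutated] == aa_to:
                    paths.append((codon, mutated, pos + 1, mutation_type(codon[pos], base)))
    return paths


if __name__ == "__main__":
    for pair in [("L", "Q"), ("I", "T"), ("V", "T"), ("F", "Y")]:
        print(pair, single_base_paths(*pair))
```

Under the standard code, this enumeration returns only second-position changes for L->Q, I->T and F->Y, and returns no single-base path for V->T, since valine codons (GUN) and threonine codons (ACN) differ at both the first and second positions.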

Conclusion

Over billions of years, nature has evolved a set of 20 amino acids with diverse chemical properties, some of which share remarkable structural similarities. These similarities are especially striking for L vs D, N, E and Q; V/I vs T; and F vs Y. These amino acids are shared by all living systems on Earth (and perhaps by living systems on planets in other solar systems in the universe). Understanding these fine structural similarities allows us to interchange them pairwise while keeping the protein's structural integrity and retaining its biological function.

Applying the simple QTY code, we bioengineered six water-soluble QTY variants of the pMMO complex and cytochrome c oxidase II proteins by systematically converting hydrophobic alpha-helices to hydrophilic ones. We then employed in silico bioinformatics tools to analyze the characteristics of the native proteins and their QTY variants. After superposition in PyMOL, the structures of the QTY variants showed a high overall similarity to both the experimentally determined CryoEM structures and the AlphaFold3-predicted structures of the native proteins, suggesting a high likelihood that their functions would be retained. To further verify the reliability of these results, we calculated various characteristics of the two groups, including root mean square deviation, isoelectric points, molecular weight, sequence variation and hydrophobicity. Despite significant changes to the amino acids in the integral transmembrane domain and a reduction in hydrophobic patches, the bioinformatic results demonstrate that the structures of the native proteins and their QTY variants remain highly similar. This suggests the viability of the QTY code as an approach to modeling water-soluble variants of various membrane proteins. We believe the bioengineered soluble QTY proteins may have potential real-world applications not only for structural and mechanistic studies of membrane enzymes, but also for removing methane from the air to combat climate change and for developing a more sustainable alternative energy source.

Methods
Protein sequence alignments and other characteristics

The native protein sequences for pMOA, pMOB, pMOC, CYB, C560, and DHSD were obtained from UniProt (https://www.uniprot.org). The sequences of the QTY variants were aligned using the same methods as previously described. The MWs and pI values of the proteins were calculated using Expasy (https://web.expasy.org/compute_pi/).

AlphaFold3 predictions

The protein structures of the QTY variants were predicted using AlphaFold3 (https://github.com/sokrypton/ColabFold) by following the instructions at the website. PDB files for the predicted native protein structures were obtained from the EBI (https://alphafold.ebi.ac.uk), which contains all AlphaFold3-predicted structures for native proteins. The UniProt website (https://www.uniprot.org) provided the protein ID, entry name, description, and FASTA sequence for each native protein. The QTY code can be applied to FASTA sequences by manually replacing amino acids in the TM domains (found on the Protter 2D diagram-plotting website, http://wlab.ethz.ch/protter/start/), or it can be applied through the QTY method website (https://pss.sjtu.edu.cn/). The website also provides MWs, pI values, TM variation, and overall variation.

Superposed structures

PDB files for the native protein structures experimentally determined by CryoEM, including pMOA, pMOB, pMOC, CYB, C560, and DHSD, were taken from the PDB. Predictions for the QTY variants were carried out using the AlphaFold3 program, which can be found at https://github.com/sokrypton/ColabFold. These structures were superposed and the RMSDs were calculated using PyMOL (https://pymol.org/3/). For detailed method information, please see references 25–30.

Structure visualization

PyMOL (https://pymol.org/3/) was used to superpose the native protein structure and the QTY variant. UCSF Chimera (https://www.rbvi.ucsf.edu/chimera) was used to render each protein model with hydrophobicity patches. For detailed method information, please see references 25–30.