OPENPichia: licence-free Komagataella phaffii chassis strains and toolkit for protein expression

Claes, Katrien; Van Herpe, Dries; Vanluchene, Robin; Roels, Charlotte; Van Moer, Berre; Wyseure, Elise; Vandewalle, Kristof; Eeckhaut, Hannah; Yilmaz, Semiramis; Vanmarcke, Sandrine; Çıtak, Erhan; Fijalkowska, Daria; Grootaert, Hendrik; Lonigro, Chiara; Meuris, Leander; Michielsen, Gitte; Naessens, Justine; van Schie, Loes; De Rycke, Riet; De Bruyne, Michiel; Borghgraef, Peter; Callewaert, Nico

doi:10.1038/s41564-023-01574-w

Resource
Open access
Published: 04 March 2024

Volume 9, pages 864–876, (2024)

Abstract

The industrial yeast Komagataella phaffii (formerly named Pichia pastoris) is commonly used to synthesize recombinant proteins, many of which are used as human therapeutics or in food. However, the basic strain, named NRRL Y-11430, from which all commercial hosts are derived, is not available without restrictions on its use. Comparative genome sequencing leaves little doubt that NRRL Y-11430 is derived from a K. phaffii type strain deposited in the UC Davis Phaff Yeast Strain Collection in 1954. We analysed four equivalent type strains in several culture collections and identified the NCYC 2543 strain, from which we started to develop an open-access Pichia chassis strain that anyone can use to produce recombinant proteins to industry standards. NRRL Y-11430 is readily transformable, which we found to be due to a HOC1 open-reading-frame truncation that alters cell-wall mannan. We introduced the HOC1 open-reading-frame truncation into NCYC 2543, which increased the transformability and improved secretion of some but not all of our tested proteins. We provide our genome-sequenced type strain, the hoc1^tr derivative that we named OPENPichia as well as a synthetic, modular expression vector toolkit under liberal end-user distribution licences as an unencumbered OPENPichia resource for the microbial biotechnology community.

Main

Recombinant proteins are predominantly produced by just a few different host cells. Escherichia coli is the main prokaryotic host used for the production of simple stable proteins that have few or no disulphide bonds. Human HEK293 cells or hamster CHO cells are used to produce more complex eukaryotic proteins that require, among other things, the formation and isomerization of disulphide bonds^1,2,3 and complex-type N-glycosylation. Taking up the intermediate position, the methylotrophic yeast Pichia pastoris (reclassified as Komagataella phaffii) combines the easy cultivation, fast growth and highly scalable robust bioreactor processes of a microbial host with the capabilities of a eukaryotic secretory system.

In 1954 H. Phaff deposited a methylotrophic yeast strain that he isolated from a black oak tree (Quercus kelloggii) in the Yosemite region⁴. This isolate was stored in the culture collection of the University of California at Davis and named UCD-FST K-239, with formally equivalent type-strain deposits in other culture collections named NRRL YB-4290, NRRL Y-7556, CBS 2612, NCYC 2543 and MUCL 46514. In the 1950s UCD-FST K-239 could not be distinguished from other methylotrophic yeast strains isolated in 1919 by A. Guilliermond, and Phaff categorized all isolates together as a new species named P. pastoris (the genus Pichia was established half a century before, in 1904, by E. C. Hansen⁵; Fig. 1a). P. pastoris was reclassified into the genus Komagataella in 1995. The two distinctly evolved isolates from Phaff and Guilliermond were later (2005) divided into two separate species and renamed K. phaffii and Komagataella pastoris by C. Kurtzman⁶ based on sequencing of 26S ribosomal DNA. Consequently, the Phaff strain (UCD-FST K-239, NRRL YB-4290, NRRL Y-7556, CBS 2612, NCYC 2543 and MUCL 46514) is considered the type strain of the species K. phaffii, whereas the Guilliermond strain (CBS 704 and NRRL Y-1603) is the type strain of K. pastoris.

In the 1970s Komagataella yeast species, which can utilize methanol as a sole carbon source^7,8,9, attracted the interest of the Phillips Petroleum Company. They had a vast supply of methane gas, which is produced during oil refinement and can be readily oxidized to methanol. The Phillips Petroleum Company isolated a P. pastoris strain that fermented methanol to form a single-cell protein source for animal feed and patented this application in 1980 (with a priority date of 12 April 1979)¹⁰. Patenting included a requirement for strain deposition, with the patented strain being named NRRL Y-11430 (known as CBS 7435 in a different culture collection). The Phillips Petroleum Company contracted the Salk Institute Biotechnology/Industrial Associates in the 1980s to develop NRRL Y-11430 for recombinant protein production. NRRL Y-11430-derived strains were generated by nitrosoguanidine mutagenesis, resulting in (among other things) the GS115 strain, which is a HIS4 auxotrophic mutant¹¹, and the X-33 strain, which is a HIS4-complemented GS115 produced by Invitrogen^11,12,13. Phillips Petroleum sold the patent rights for their Pichia system to Research Corporation Technologies (RCT; https://pichia.com/) in 1993. Surprisingly, NRRL Y-11430 (Agricultural Research Service Culture Collection, ARS-NRRL) is not distributed anymore by NRRL and the same holds for the equivalent CBS 7435 deposit (Westerdijk Fungal Biodiversity Institute, CBS). To our knowledge, the NRRL Y-11430 parental industrial strain can only be obtained at the American Type Culture Collection (ATCC 76273) under a restrictive material transfer agreement (MTA) precluding third-party distribution and use for product manufacturing. Derivative industrial strains (GS115 and X-33) have similar restrictions when licensed from the providing companies. Royalty payments are imposed on products manufactured in them.

Hence, despite the expiry of the associated patent more than 20 yr ago, socio-economic utilization of the parental NRRL Y-11430 strain and its derivatives in this way remains monopolized through a commercial licensing scheme. The lack of freedom to distribute the result of synthetic biology efforts in academia and industry alike to enhance the capabilities of Pichia strains greatly impedes progress with this cornerstone system of recombinant protein biotechnology. An equivalent open-access alternative is hence long overdue.

Researchers in academia and industry ideally need to use the same parental Pichia strain lineage that has already been commercialized because regulatory agencies are familiar with this strain. To achieve this goal, we and others have recently turned to genome sequencing of the K. phaffii type strains that are present in culture collections throughout the world to try to identify the original isolate from nature that the Phillips Petroleum Company researchers used in their derivation of NRRL Y-11430, as the basis from which an open-access system could be built¹⁴. Here we resequenced genomes of four type strains and selected the NCYC 2543 deposit for development as a chassis strain. This equivalent deposit of the Phaff UCD-FST K-239 type strain is genomically near-identical to NRRL Y-11430, consistent with it being the parent type strain, and the NCYC collection provides liberal distribution and commercial use licences. We exhaustively compare the biological features of NCYC 2543 to the NRRL Y-11430 industrial strain, and engineer an optimized derived ‘OPENPichia’ strain that performs as well as the industrial strain.

We present OPENPichia together with a modular protein expression vector toolkit completely built from synthetic DNA, free of third-party MTAs, that is compatible with toolkits from other Pichia developer laboratories¹⁵ as a resource for the global microbial metabolic engineering and synthetic biology communities.

Results

Genome resequencing of K. phaffii strains

We resequenced (average of 180× genome coverage) NRRL YB-4290, NRRL Y-7556, CBS 2612, NCYC 2543 and the NRRL Y-11430 industrial strain. The reads were mapped against the reference genome (CBS 7435), which includes the mitochondrial genome and two K. phaffii linear killer-like plasmids¹² (Supplementary Tables 1–3).

The proportion of reads originating from the two killer-like plasmids varied between 0% and 9% (Supplementary Table 3). K. phaffii killer-like plasmids are linear autonomously replicating DNA fragments with a length of 9.5 and 13.1 kilobases (kb)¹² that place a biosynthetic load on cells and also encode exotoxins that can kill yeast cells^12,16, which might conceivably reduce culture viability. Killer-like plasmids were absent from CBS 2612 and NCYC 2543 but present in NRRL YB-4290, Y-7556 and Y-11430 (Supplementary Table 4). The NRRL YB-4290 and CBS 2612 strains were deposited by Phaff, whereas the NRRL Y-7556 strain was a re-deposit of CBS 2612 by D. Yarrow (CBS; Fig. 1a). Given that NRRL Y-7556 has killer-like plasmids but CBS 2612 does not, it is clear that killer-like plasmids can be lost frequently in vitro simply by propagation and single-clone purification.

A phylogenetic tree of resequenced strains (this study) and previously published K. phaffii genomes^11,14,17 showed that K. phaffii type strains are clustered with NRRL Y-11430, CBS 7435 and close relatives (Fig. 1b). Our data support the previously published hypothesis^14,18 that all deposited K. phaffii strains are derived from the Phaff isolate¹⁷.

To identify an equivalent type strain to NRRL Y-11430, we identified single nucleotide polymorphisms (SNPs) and short insertion–deletions (indels) in our resequenced strains (Supplementary Table 4). We detected approximately 20 intergenic/intronic/silent exonic differences between NRRL Y-11430 and CBS 7435. Note that the type-strain deposits of the different culture collections (NRRL YB-4290, NCYC 2543, CBS 2612 and NRRL Y-7556) also differ from one another, each at one other coding sequence-altering genomic position and a few non-coding ones, probably reflecting drift due to the background mutational rate during strain propagation (Fig. 1c).

We focused on protein-coding alterations that consistently distinguish the industrial strain NRRL Y-11430 from these equivalent type-strain deposits. Three coding sequence-altering mutations (in SEF1, RSF2 and HOC1) were shared by all type strains but were absent from the industrial strain NRRL Y-11430. We re-analysed raw sequencing reads from a previous characterization of NRRL YB-4290 and NRRL Y-7556, and confirmed the presence of SEF1, RSF2 and HOC1 mutations¹⁴. As all three mutations are shared by the type strains, we conclude that they represent the original K. phaffii isolate and that NRRL Y-11430 is mutated at these loci.

SEF1, RSF2 and HOC1 genotypes in NRRL Y-11430 and CBS 7435

SEF1 encodes a putative transcription factor (UniProt ID F2QV09). The SNP causes a S315C mutation in NRRL Y-11430. RSF2 encodes a transcription factor that is involved in methanol- and biotin-starvation (UniProt ID F2QW29). The SNP introduces a stop codon (W748*) in NRRL Y-11430, resulting in a carboxy (C)-terminal deletion of 183 amino acids. Full-length Rsf2p is similar to a Saccharomyces cerevisiae homologue¹⁹, providing support for the idea that this was the original genomic state, as previously reported¹⁴. HOC1 (OCH1 homologue) encodes an α-1,6-mannosyltransferase (UniProt ID F2QVW2) involved in the synthesis of cell-wall mannan and is part of the mannan polymerase II complex²⁰. The industrial strain NRRL Y-11430 has a single base pair (bp) deletion in a poly-A stretch (at position 755 of the 1,191 bp coding sequence). This is predicted to result in a C-terminally truncated protein (274 versus 398 amino acids), with the last 22 codons after the frameshift and before the first-occurring stop codon coding for an altered C-terminal peptide. We confirmed the indel in the homopolymer using Sanger sequencing (Supplementary Data 1). In parallel, the same mutation was identified in K. Wolfe’s laboratory (UC Dublin) as a quantitative trait locus mutation that yielded between two- and three-fold higher secretion of a β-glucosidase that was used as a secretion reporter protein¹⁸.
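The effect of a single-base deletion in a homopolymer stretch can be illustrated with a minimal Python sketch. The toy coding sequence below is invented for illustration and is not the HOC1 sequence; only the standard genetic code and the position arithmetic are taken as given. It shows how removing one A from a poly-A run shifts the reading frame, changes the downstream residues and brings a premature stop codon into frame. It also locates the codon that contains nucleotide 755 of a coding sequence.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def translate(cds):
    """Translate a coding sequence up to (not including) the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        first, second, third = (BASES.index(base) for base in cds[i:i + 3])
        residue = AMINO_ACIDS[16 * first + 4 * second + third]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def delete_base(cds, position):
    """Return the sequence with the base at a 1-based position removed."""
    return cds[:position - 1] + cds[position:]


def codon_number(position):
    """Return the 1-based codon that contains a 1-based nucleotide position."""
    return (position - 1) // 3 + 1


# Hypothetical toy sequence with a poly-A run (not the real HOC1 gene).
toy_cds = "ATGAAAAAACTGACCTAA"
wild_type = translate(toy_cds)
truncated = translate(delete_base(toy_cds, 4))

print("wild type:", wild_type)
print("after 1 bp deletion:", truncated)
print("codon containing nucleotide 755:", codon_number(755))
```

In the toy example the frameshifted product is shorter than the wild type and ends in a residue that the wild type does not contain, which mirrors the altered C-terminal peptide described for the truncated Hoc1p.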

Growth rate and protein production differences between strains

Next, we compared characteristics that are important for the use of K. phaffii in recombinant protein production and focused on NCYC 2543 given the availability of open-access distribution options by the NCYC culture collection (https://www.ncyc.co.uk/licences). We compared the growth rates of NRRL Y-11430, GS115, NCYC 2543 and a NCYC 2543 HIS4-knockout mutant (NCYC 2543 Δhis4; Fig. 2). GS115 grew significantly slower, as reported earlier¹⁴. Given that the NCYC 2543 Δhis4 strain did not grow slower than its wild-type counterpart, the slower growth of GS115 is not, or at least not only, due to histidine auxotrophy.

**Fig. 2: Comparison of the maximal growth rate of NRRL Y-11430, NCYC 2543, the two NCYC 2543 *hoc1*^tr mutants, GS115 and NCYC 2543 Δ*his4*.**

We expressed a selection of proteins (Supplementary Table 5) in NRRL Y-11430 and NCYC 2543 to evaluate how well NCYC 2543 expressed recombinant proteins. We chose four proteins that exemplify the different protein types produced by biotechnology companies: a cytokine (GM-CSF), a redox enzyme (GaOx), a VHH-hFcα fusion (Cdiff-VHH-IgA) and a VHH-hFcγ fusion (CovidVHH-IgG). We tested two promoters, namely the glyceraldehyde 3-phosphate dehydrogenase promoter (PGAP; constitutive) and the alcohol oxidase I promoter (PAOX1; methanol-inducible). Protein expression in K. phaffii is prone to clonal variations that can interfere with the comparison of expression capabilities between strains, mostly due to the integration site and the copy number of the construct²¹. To overcome this problem, a single copy of the cloned gene was targeted to specific promoter regions in the genome. We confirmed copy number and integration sites by quantitative and integration-site-specific PCR, and two independent clones that expressed each of the four proteins were cultured in triplicate. Similar amounts of proteins were produced by both the PGAP and PAOX1 constructs (Extended Data Fig. 1), with the exception of the VHH-hFcγ fusion, where NRRL Y-11430 outperformed NCYC 2543. However, NCYC 2543 harbouring PGAP constructs grew to higher densities than NRRL Y-11430 harbouring PGAP constructs (Extended Data Fig. 2), whereas this was not the case for PAOX1 constructs. In addition, NRRL Y-11430 harbouring PGAP constructs (in limiting glucose) produced more host cell proteins than NCYC 2543 harbouring PGAP constructs (Extended Data Fig. 1). We hypothesize that a low level of cell lysis or protein leakage occurs in NRRL Y-11430 cultured on glucose.

HOC1 truncation restores NCYC 2543 transformation efficiency

The transformation efficiency of NCYC 2543 was only 16% (95% confidence interval, 13–19%) and 3% (95% confidence interval, 2–7%) of that of NRRL Y-11430 for PAOX1 and PGAP expression constructs, respectively (Fig. 3a), which is consistent with the low transformation efficiency of the type strains that was reported previously^14,18. As the S. cerevisiae Hoc1p orthologue is an α-1,6-mannosyltransferase that functions to produce the outermost layer of the ascomycete cell wall, we hypothesized that a reduced diffusional/charge barrier, due to reduced mannan/mannosylphosphate density, might explain the superior transformability of NRRL Y-11430. Using the split-marker method, we introduced a single base pair deletion in HOC1 of NCYC 2543 (Extended Data Fig. 3a) to produce NCYC 2543 hoc1^tr-1 and a larger deletion to remove 115 bp downstream of the novel stop codon to produce NCYC 2543 hoc1^tr-2 (Extended Data Fig. 3b). We used quantitative PCR with reverse transcription (RT–qPCR) to measure the production of HOC1 messenger RNA and found that HOC1 transcription was downregulated in the strains with a premature stop codon (Fig. 3b). The NCYC 2543 hoc1^tr-1 strain, in which the premature stop codon is separated from the canonical stop codon by 405 nucleotides (nt), produced the lowest level of transcripts. The NRRL Y-11430 and NCYC 2543 hoc1^tr-2 strains have 371 and 290 nt, respectively, between the premature and the canonical stop codon, which correlates with transcript abundance. In conclusion, HOC1-truncated strains lack part of the C-terminal catalytic domain and probably also contain less Hoc1p in the mannan polymerase complex. We compared the transformation efficiency of both wild-type strains and hoc1^tr mutants and found that the HOC1 truncation strongly increased the transformation efficiency of the type strain, which even surpassed that of NRRL Y-11430 by 1.5–3-fold (Fig. 3a).

**Fig. 3: Effect of the *HOC1* truncations on plasmid transformation efficiency and *HOC1* mRNA abundance in the resulting strains.**

Cell walls of NRRL Y-11430, NCYC 2543 and NCYC 2543 hoc1^tr

We characterized cell-wall mannoprotein N-glycans using capillary electrophoresis²² after growth on glucose or glycerol (Extended Data Fig. 4). All four strains had very similar profiles, indicating that the pathway of synthesis of the mannan core was intact. This capillary electrophoresis method is unsuited to detailed profiling of higher-polymerized mannan N-glycans. Most mannosylphosphates are added to the mannan side branches of these long chains, which makes them bind to cationic dyes such as Alcian blue. Hence, we compared the Alcian blue staining intensity of NCYC 2543, NRRL Y-11430 and the two type-strain hoc1^tr mutants (Fig. 4a). Reduced Alcian blue staining of the latter was consistent with that of published S. cerevisiae hoc1 strains^23,24.

**Fig. 4: Characterization of the cell walls of NRRL Y-11430, NCYC 2543 and the two NCYC 2543 *hoc1*^tr mutants.**

The resistance of strains to Congo red and Calcofluor white was analysed to assess their cell-wall integrity (Fig. 4b)^14,18. The type strain was more resistant than NRRL Y-11430 to both dyes but this difference was absent in the HOC1-truncated mutants, which shows that Hoc1p contributes to cell-wall integrity. Transmission electron microscopy using a freeze substitution technique (which draws OsO₄ membrane-staining contrast reagent and fixatives through the cell wall during the dehydration of cells) revealed increased electron scattering by the outermost cell-wall layer of the wild-type NCYC 2543 strain compared with the HOC1-truncated strains (Fig. 4c). This is probably caused by OsO₄ accumulation in the mannan layer of the cell wall during freeze substitution. Scanning electron microscopy analyses indicated that all four strains were structurally similar (Fig. 4c), indicating the absence of gross malformations. We conclude that the hoc1^tr mutation results in a mild deficiency in cell-wall integrity, which increases transformability and in some cases increases the production or secretion of recombinant proteins¹⁸.

Protein production by NCYC 2543 hoc1^tr

The growth rates and protein production capacities of the NCYC 2543 hoc1^tr strains were compared with NRRL Y-11430 and NCYC 2543. No significant difference in growth rate was observed (Fig. 2). We tested the PGAP- and PAOX1-based production of GBP and CovidVHH-IgG in the supernatant using SDS–polyacrylamide gel electrophoresis (SDS–PAGE) and enzyme-linked immunosorbent assay (ELISA; GBP only). We screened 24 clones of each strain with the exception of the type strain, where only 11 and 9 transformants were obtained for PAOX1- and PGAP-constructs, respectively (Fig. 5a,b).

**Fig. 5: Overview of the strain performance of NRRL Y-11430, NCYC 2543 and the two NCYC 2543 *hoc1*^tr mutants.**

The NCYC 2543 hoc1^tr strains outperformed both NRRL Y-11430 and NCYC 2543 in the production of GBP protein from the PGAP promoter, although the differences were small and clonal distributions overlapped. These results are consistent with published data¹⁸. No statistically significant differences in PAOX1-GBP protein production were observed between strains. For PAOX1-CovidVHH-IgG production and secretion, NCYC 2543 produced reduced yields compared with the hoc1^tr strains (Fig. 5a). However, we observed two classes of production levels, raising the question of copy number effect. We determined that single-copy insertions resulted in higher CovidVHH-IgG production than double-copy insertions (Fig. 5a; each asterisk represents one copy in the tested clone). Given that this was observed for all four tested strains, it is likely to be an effect of this specific protein rather than the host.

Next, the expression and surface display of an amino (N)-terminal FLAG-tagged and C-terminal V5-tagged human lysozyme were evaluated. By detecting the tags on both sides of the protein (Fig. 5c and Extended Data Figs. 5, 6), we observed a similar intensity of detection in NRRL Y-11430 and NCYC 2543 hoc1^tr-1, and reduced detection in NCYC 2543, showing that the truncated hoc1 allele is beneficial for protein surface display and/or the ease of detection of the displayed protein using antibody detection reagents.

To evaluate whether HOC1-truncated NCYC 2543 can be cultured to a high cell density in bioreactors, we compared NCYC 2543 hoc1^tr-1 and NCYC 2543 in a fermentation experiment at a 3 l scale. We chose CovidVHH-IgG as the target protein, under control of PGAP. As a result of the higher transformation efficiency, the HOC1-truncated strain had a double-copy insertion, whereas the type-strain NCYC 2543 only had a single-copy insertion. In this experiment both strains had comparable growth (Extended Data Fig. 7). The batch phase length, oxygen demand, growth rate and final cell density were very similar. NCYC 2543 hoc1^tr-1 produced almost double the amount of protein produced by the parental strain. We concluded that truncation of the HOC1 gene does not negatively influence the performance of K. phaffii in a bioreactor.

In conclusion, the HOC1 truncation did not have a negative effect on protein production in any of our experiments and sometimes yielded better production. As reported recently by Brady et al.¹⁴, issues with transformability, which make it laborious to generate multicopy integration clones, were the key reason they opted for continued use of NRRL Y-11430-based strains.

We have solved this problem, and hereby rename NCYC 2543 hoc1^tr-1 as OPENPichia.

OPENPichia modular protein expression vector toolkit

Commercial K. phaffii expression kits containing NRRL Y-11430-derived strains as well as expression vectors are commonplace because they are convenient and work well. The conditions of sale of these kits are legally restrictive and forbid further distribution and reutilization of both the strains and the vectors included in them, including use in commercial production. Importantly, commercial applications require licensing from the kit provider, which can take time and incur costs. To also overcome issues with these proprietary DNA constructs, we used de novo synthesis combined with rapid cloning methods²⁵. The development of a robust genetic toolkit with ‘freedom to operate’ is still expensive and time-consuming.

We provide a genetic toolkit and cloning framework to the community (Fig. 6 and Extended Data Fig. 8)²⁶. We used a modular build based on Golden Gate cloning, similar to other toolkits^15,27,28,29,30,31,32,33,34,35. Golden Gate assembly is based on the use of Type IIS restriction endonucleases that cut outside their recognition sites, which allows users to flank DNA fragments of interest with customizable 4 nt overhangs, enabling directional multi-insert cloning in a single reaction. The MoClo system takes this concept a step further as it standardizes Golden Gate assembly by designating a priori all DNA elements of a desired vector, which are typically referred to as ‘parts’, to a particular ‘part type’ (for example, promoter, coding sequence and so on) and flanking each part type by unique 4 nt overhangs and Type IIS restriction sites³⁵. The MoClo system is comprised of eight part types, of which Part 3 (coding sequence) and Part 4 (terminator) can be split up to allow additional modularity, for example to incorporate N- and C-terminal fusion partners for the protein of interest. In practice, parts are derived from PCR fragments or synthetic constructs, which are first subcloned in entry vectors, also known as ‘Level 0’ vectors (Fig. 6). The vectors of interest can then be assembled into expression vectors, which are termed ‘Level 1’ vectors. By providing proper connector sequences with additional Type IIS restriction sites, the resulting Level 1 vectors can then be further assembled to obtain multigene or ‘Level 2’ vectors, which is the top level in the hierarchy of the system. In the current toolkit, all 4 nt overhangs were adopted to ensure a high degree of compatibility with existing yeast toolkits^15,28,32 and ensure a near 100% predicted ligation fidelity³⁶. As this toolkit is essentially derived from the S. cerevisiae MoClo system, it shares the restriction enzymes (BsmBI and BsaI), most of the 4 nt overhangs as well as the number and design of the individual part types²⁸. An overview of the part types and the parts that are provided in our OPENPichia toolkit is presented in Extended Data Fig. 8. Part sequences are presented in Supplementary Data 2 and materials can be obtained from the Belgian Coordinated Collections of Microorganisms (BCCM)/GeneCorner Plasmid Collection²⁶. We custom-built an MTA in collaboration with GeneCorner to enable the use of all of these plasmids, thereby making royalty-free commercial manufacturing possible.
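The logic of directional assembly through unique 4 nt overhangs can be sketched in a few lines of Python. This is an illustrative check only. The part names and overhang sequences are hypothetical placeholders rather than the overhangs used in the OPENPichia toolkit, and the function `check_assembly` is not part of any published software. The sketch verifies three properties that a one-pot circular assembly relies on: every overhang is 4 nt long and non-palindromic, each part's right overhang matches the left overhang of the next part, and no junction is reused (directly or as a reverse complement).

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def check_assembly(parts):
    """Check an ordered list of (name, left_overhang, right_overhang) tuples.

    The parts are assumed to form a circular product, so the last part must
    connect back to the first. Returns a list of problem descriptions.
    """
    problems = []
    for name, left, right in parts:
        for overhang in (left, right):
            if len(overhang) != 4 or set(overhang) - set("ACGT"):
                problems.append(f"{name}: '{overhang}' is not a 4 nt DNA overhang")
            elif overhang == reverse_complement(overhang):
                problems.append(f"{name}: '{overhang}' is palindromic and can self-ligate")

    # Neighbouring parts must share the overhang at their junction.
    for i, (name, _, right) in enumerate(parts):
        next_name, next_left, _ = parts[(i + 1) % len(parts)]
        if right != next_left:
            problems.append(f"{name} -> {next_name}: '{right}' does not match '{next_left}'")

    # Each junction must be unique, also when read on the opposite strand.
    seen = set()
    for _, left, _ in parts:
        if left in seen or reverse_complement(left) in seen:
            problems.append(f"junction '{left}' is not unique")
        seen.add(left)
    return problems


# Hypothetical overhangs chosen for illustration only.
good_design = [
    ("promoter", "AACG", "TATG"),
    ("coding sequence", "TATG", "GGCT"),
    ("terminator", "GGCT", "CCAT"),
    ("backbone", "CCAT", "AACG"),
]
bad_design = [
    ("promoter", "AACG", "TATG"),
    ("coding sequence", "TATG", "GGCT"),
    ("terminator", "TATG", "CCAT"),  # wrong left overhang
    ("backbone", "CCAT", "AACG"),
]

print(check_assembly(good_design))
print(check_assembly(bad_design))
```

A check of this kind does not predict ligation fidelity, which depends on the measured behaviour of each overhang pair; it only confirms that a design is internally consistent before parts are ordered.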

**Fig. 6: Overview of the available OPENPichia strains and the different parts of the MoClo toolbox.**

Discussion

K. phaffii (formerly known as P. pastoris) is an important protein production host in both academia and industry but the most common industrially developed strains are still distributed with restrictive MTAs and/or commercial licensing, despite the associated patents having expired decades ago. To facilitate academic and commercial host strain development for recombinant protein expression and enable distribution throughout the biotechnology community, we derived an OPENPichia strain and OPENPichia vector cloning kit that enables royalty-free commercial manufacture of K. phaffii products. The OPENPichia strains are distributed by our non-profit research organization, VIB (OPENPichia.com) in an arrangement with the NCYC culture collection. A one-time fee is charged to cover expenses as well as continued resource maintenance and development, following which any use is allowed, including royalty-free commercial product manufacture and onward distribution of further-engineered OPENPichia-derived strains. The OPENPichia vector cloning materials are openly distributed for any purpose by the BCCM (http://bccm.belspo.be/about-us/bccm-genecorner).

Our OPENPichia strain (HOC1-truncated K. phaffii type strain) is almost identical to the former patent-deposit NRRL Y-11430 strain. Only a handful of mutations were identified in comparative genome analyses, of which only four alter the protein code (SNPs and indels). OPENPichia does not harbour killer-like plasmids and its maximum growth rate is the same as that of NRRL Y-11430. With respect to protein production, small differences can occur between the K. phaffii type strain and NRRL Y-11430 but there is no consistently better performing strain, considering the variety of proteins tested in our study. Brady and colleagues¹⁴ previously reported that NRRL Y-11430 had the highest levels of protein expression compared with other K. phaffii strains but none of the type strains from which NRRL Y-11430 was derived were included in their study. Due to the increased cell-wall robustness and reduced transformation efficiencies of type strains, they were excluded from the protein expression experiments performed by Brady and colleagues¹⁴. We indeed observed that the transformation efficiency of the type strain is reduced compared with NRRL Y-11430 but we overcame this through the introduction of a frameshift mutation in HOC1 of the type strain, which resulted in improved transformation efficiency compared with NRRL Y-11430.

Using PGAP-controlled gene expression, NRRL Y-11430 has somewhat more host cell proteins in its culture supernatant and grows to a lower cell culture density (in shake flasks) compared with the type strain. We hypothesize that both observations are related and due to slightly increased cell lysis in NRRL Y-11430, which can have an impact on the need for additional purification steps. A similar observation was made for the HOC1-truncated type strains, although the differences were very small.

Our study shows how to build ‘generic’, robust, validated and openly available biotechnological platforms after patents on foundational strains expire, rather like the development of more affordable ‘generic/biosimilar’ medicines. We previously reported a similar effort for the HEK293 cell lineage³⁷ that is used for viral vector and vaccine manufacturing and hope that others will join us in open science endeavours to develop different synthetic biology chassis systems. For now, we invite all K. phaffii researchers and users to contribute to, and benefit from, our OPENPichia resource.

Methods

Strains and media

The wild-type K. phaffii strains NRRL YB-4290, NRRL Y-7556 and NRRL Y-11430 were obtained from the Agricultural Research Service, CBS 2612 was obtained from the Westerdijk Institute (Netherlands) and NCYC 2543 was obtained from the National Collection of Yeast Cultures. All mentioned strains were cultured and maintained on YPD or YPD agar.

All entry and expression vectors were propagated and are available in the E. coli DH5α strain. MC1061 and MC1061λ strains were also successfully used and generally showed higher transformation efficiency as well as easier green–white or red–white screening than was the case for DH5α. All E. coli strains were cultured and maintained on Luria–Bertani (LB) agar.

The following antibiotics were used at a concentration of 50 µg ml⁻¹ for the selection in E. coli: Zeocin, nourseothricin, hygromycin, kanamycin, chloramphenicol and carbenicillin. The following antibiotics were used at a concentration of 100 µg ml⁻¹ for the selection in K. phaffii: Zeocin, nourseothricin, hygromycin, geneticin and blasticidin.

Several media were used: LB (1% tryptone, 0.5% yeast extract and 0.5% NaCl), yeast extract peptone dextrose (YPD; 1% yeast extract, 2% peptone and 2% d-glucose), yeast extract peptone glycerol (YPG; 1% yeast extract, 2% peptone and 1% glycerol), BMY (1% yeast extract, 2% peptone, 1.34% yeast nitrogen base without amino acids and 100 mM potassium phosphate buffer pH 6), buffered minimal glycerol yeast extract medium (BMGY; BMY with 1% glycerol), BMDY (BMY with 2% d-glucose), buffered methanol-complex medium (BMMY; BMY with 1% methanol) and limiting glucose (1% yeast extract, 2% peptone, 100 mM phosphate buffer pH 6, 50 g l⁻¹ Enpresso EnPump substrate and 5 ml l⁻¹ Enpresso EnPump enzyme solution). For plates, 1.5% agar was added to the LB media and 2% to the YPD media; when Zeocin selection was used, the media were set to pH 7.5.
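As a practical aid, the percentage recipes above can be converted to weighed amounts with a short Python sketch. It assumes that the percentages of solid components are weight per volume (1% = 10 g per litre), which is the usual convention but is not stated explicitly in the text. Liquid components such as glycerol and methanol are left out because their percentages may be volume per volume. The dictionary and function names are illustrative.

```python
# Solid components of two media from the recipe list, in % (assumed w/v).
MEDIA_PERCENT_WV = {
    "LB": {"tryptone": 1.0, "yeast extract": 0.5, "NaCl": 0.5},
    "YPD": {"yeast extract": 1.0, "peptone": 2.0, "d-glucose": 2.0},
}


def grams_needed(medium, volume_litres):
    """Return grams of each solid component for the requested volume.

    Assumes 1% (w/v) corresponds to 10 g per litre.
    """
    recipe = MEDIA_PERCENT_WV[medium]
    return {name: percent * 10.0 * volume_litres for name, percent in recipe.items()}


for component, grams in grams_needed("YPD", 0.5).items():
    print(f"{component}: {grams:.1f} g")
```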

All oligonucleotides and synthetic DNA fragments were ordered from Integrated DNA Technologies. All synthetic DNA fragments (gBlocks and Genes) were designed and adapted for synthesis using the Codon Optimization Tool and gBlocks Gene Fragments Entry Tool available at the website of Integrated DNA Technologies Europe.

Illumina sequencing

The strains were cultured overnight in YPD medium and the genomic DNA (gDNA) was extracted using an Epicentre MasterPure Yeast DNA Purification Kit. Sample preparation (DNA fragmentation, adaptor ligation, size selection and amplification) and next-generation sequencing (5 × 10⁶ 150-bp paired-end reads) were done by Eurofins using Illumina technology. The raw sequence reads were uploaded to the NCBI database under the accession number PRJNA909165. The reads were checked for quality using FastQC³⁸, from which the %GC and number of reads were obtained. From the number of reads, the average overall coverage was calculated using the formula \(\text{coverage} = \frac{\text{reads} \times \text{read length (bp)}}{\text{length of genomic DNA} + \text{mitochondrial DNA (bp)}}\).
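As a worked illustration of the coverage formula, the following minimal sketch computes the average coverage from a read count, read length and reference length. The read count and read length follow the text; the two genome sizes and the function name are illustrative assumptions, not values reported in this section.

```python
def average_coverage(n_reads: int, read_length_bp: int, reference_length_bp: int) -> float:
    """Average coverage = (reads x read length) / (nuclear + mitochondrial DNA length)."""
    if reference_length_bp <= 0:
        raise ValueError("reference length must be positive")
    return n_reads * read_length_bp / reference_length_bp


# Read count and read length follow the text; both genome sizes are
# illustrative assumptions, not values reported in this section.
ASSUMED_NUCLEAR_BP = 9_400_000
ASSUMED_MITOCHONDRIAL_BP = 36_000

coverage = average_coverage(5_000_000, 150, ASSUMED_NUCLEAR_BP + ASSUMED_MITOCHONDRIAL_BP)
print(f"Estimated average coverage: {coverage:.1f}x")
```

Whether the read count refers to individual reads or read pairs changes the result by a factor of two, so the input should match what the quality-control report counts.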

Next-generation sequencing analysis

The reads were trimmed using Trimmomatic³⁹ to remove adaptors, leading and trailing low-quality bases (cut off quality of three), low-quality reads (four-base sliding window quality of <15) and reads below 100 bp. Next, the reads were aligned to a reference and the mutations were identified using Breseq⁴⁰ in consensus mode. The genome sequence published by Sturmberger et al.¹² was used as a reference. The reference sequences for killer-like plasmids and the mitochondrial DNA were obtained from Sturmberger et al.¹² and Brady et al.¹⁶, respectively. The reported coverage depth was calculated using the Breseq algorithm. This is done by fitting a negative binomial distribution to the read-coverage depth observed at unique reference positions. The mean of this binomial fit is used as the coverage depth. The copy number of killer-like plasmids was estimated by comparing their coverage depth with the average of the four chromosomes. The coverage depth for each molecule was calculated as the mean of a binomial fit for the coverage depth for each reference position.
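The copy-number estimate for the killer-like plasmids reduces to a ratio of coverage depths. A minimal sketch, assuming the per-molecule depths have already been obtained from the fitted coverage distributions (the function name and the numbers are hypothetical):

```python
from statistics import mean


def plasmid_copy_number(plasmid_depth: float, chromosome_depths: list[float]) -> float:
    """Estimate copies per genome as plasmid depth / mean chromosomal depth."""
    if not chromosome_depths:
        raise ValueError("at least one chromosome depth is required")
    return plasmid_depth / mean(chromosome_depths)


# Hypothetical depths for the four chromosomes and one plasmid
chromosomes = [98.0, 102.0, 100.0, 100.0]
print(plasmid_copy_number(250.0, chromosomes))
```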

Phylogenetic tree

To generate a phylogenetic tree, the sequencing data from this study were combined with the previously published raw reads¹⁴ and also aligned as described above. From the predicted mutations of both datasets, a whole genome alignment was constructed, from which a phylogenetic tree was calculated using the Mega X⁴¹ software package. A maximum likelihood algorithm was used with a Hasegawa–Kishino–Yano substitution matrix.

Creation of the NCYC 2543 Δhis4 strain

The NCYC 2543 Δhis4 strain was generated using the split-marker method that was described previously by Heiss and colleagues⁴². The homology arms of the HIS4 gene were selected from Näätsaari et al.⁴³ and the reference genome of the CBS 7435 strain. First, a construct containing the two homology arms with a floxed nourseothricin acetyltransferase marker was created. Two fragments overlapping by 594 bp, each containing one of the homology arms and a part of the antibiotic marker, were then generated by PCR using Taq polymerase (Promega). These fragments were purified through phenol–chloroform precipitation. Briefly, following the addition of an equal volume of phenol:chloroform:isoamyl alcohol (25:24:1), the solution was mixed, centrifuged (5 min at 12,000g) and the liquid phase was isolated by decanting. A one-tenth volume of 3 M sodium acetate pH 5.5 and two volumes of 100% ethanol were added to the sample, which was then mixed and centrifuged (15 min at 12,000g). Finally, the pellet containing the amplified DNA was washed with 70% ethanol, air-dried and resuspended in water.

Both purified fragments were transformed into NCYC 2543 competent cells by electroporation, and the transformants were streaked to single clone onto YPD plates containing nourseothricin and cultured at room temperature for 2 days. The resulting clones were replica plated onto CSM-his plates for growth screening and cultured for 2 days at room temperature. Strict non-growers were checked by colony PCR for replacement of the HIS4 gene with the antibiotic marker cassette.

The nourseothricin acetyltransferase marker was finally removed by transient expression of a Cre-recombinase. This gene was cloned into a plasmid with an autonomously replicating sequence⁴⁴ and a Zeocin-resistance cassette, which was then transformed into the Δhis4 strain. The transformants were incubated overnight on a YPD plate containing Zeocin and the resulting colonies were transferred to YPD plates without antibiotics. The removal of the antibiotic cassettes of the plasmid and HIS4 knockout was verified with replica plating on YPD containing the respective antibiotics and double-checked via colony PCR.

Creation of the NCYC 2543 hoc1^tr strains

The NCYC 2543 hoc1^tr strains were generated using the split-marker method described in the previous section. The left homology arm of the HOC1 gene was chosen such that it contained about 1 kb upstream of the premature stop codon. K. phaffii gDNA was used as the PCR template. The right homology arm was chosen so that it contained about 1 kb downstream of the premature stop codon. The left and right homology arms were respectively fused by PCR to the first and last two-thirds of the floxed nourseothricin acetyltransferase marker. The PCR fragments were gel purified and the DNA was recovered using a Wizard SV Gel and PCR Clean-Up System (Promega) according to the manufacturer’s instructions. Both purified fragments were transformed into NCYC 2543 competent cells by electroporation, and the transformants were streaked to single clone onto YPD plates containing nourseothricin and cultured at room temperature for 2 days. The resulting clones were screened through colony PCR using a forward primer that annealed upstream of the left homology arm and a reverse primer that annealed to the nourseothricin selection marker. The nourseothricin acetyltransferase marker was removed by transient expression of a Cre-recombinase as described in the previous section. The engineered HOC1 locus was confirmed for both strategies by colony PCR and Sanger sequencing. The sequences for the PCR primers and split-marker cassettes are in Supplementary Tables 6 and 7.

Growth analysis

The different K. phaffii strains were cultured on YPD agar for 2 days, inoculated in triplicate into a 5 ml preculture in test tubes containing BMDY and cultured overnight at 28 °C with shaking at 225 rpm. The optical density at 600 nm (OD₆₀₀) of each culture was measured and 250 ml BMDY was inoculated at a starting OD₆₀₀ of 0.05. Samples of 1 ml were immediately isolated from each culture to measure and check the starting OD₆₀₀. Next, the cultures were grown in shake flasks at 28 °C with shaking at 225 rpm; samples of 1 ml were isolated every 2 h for 22 h and again after 26 and 29 h. All samples were diluted accordingly and measured within an OD₆₀₀ range of 0.05–1.00.
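The inoculation step is a simple dilution calculation. A minimal sketch, assuming OD₆₀₀ scales linearly with cell density and ignoring the small volume added by the inoculum itself (the function name and the preculture OD are hypothetical):

```python
def inoculum_volume_ml(preculture_od: float, target_od: float, culture_volume_ml: float) -> float:
    """Volume of preculture needed to start a culture at target_od (C1*V1 = C2*V2)."""
    if preculture_od <= 0:
        raise ValueError("preculture OD must be positive")
    return target_od * culture_volume_ml / preculture_od


# Hypothetical preculture at OD600 = 10, used to start 250 ml at OD600 = 0.05
print(inoculum_volume_ml(10.0, 0.05, 250.0))
```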

Recombinant protein expression

The expression vectors were made using a MoClo toolkit, based on Golden Gate cloning as described in this paper (Supplementary Data 2). Briefly, the protein-coding sequences were ordered synthetically with Part 3b-type BsaI overhangs (NEB, R3733) and cloned into the entry vector with BsmBI (NEB, R0739). Next, expression vectors were made by assembly of the Level 0 parts.

The cloning procedure was as follows: 1 µl T4 DNA ligase (400 U; NEB, M0202), 2 µl T4 DNA ligase buffer (NEB, M0202) and 1 µl restriction enzyme (20 U) were added to 20 fmol backbone (pPTK081 for entry vectors or any P8 backbone for destination vectors). An excess of insert (>1,000 fmol PCR amplicon or synthetic gene, or 10 pmol annealed oligonucleotides) was added for a BsmBI assembly, whereas equimolar amounts (20 fmol) of each entry vector were added for a BsaI assembly. BsmBI assembly mixtures were incubated according to the following protocol: >25 cycles of 42 °C for 2 min (digest) and 16 °C for 5 min (ligation), followed by 60 °C for 10 min (final digest) and 80 °C for 10 min (heat inactivation step). BsaI assembly mixtures were incubated similarly, except that the digestion steps were performed at 37 °C.
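Because the assembly amounts are given in fmol, it can help to convert them to a mass that can be pipetted. The sketch below uses the common approximation of 650 g mol⁻¹ per base pair of double-stranded DNA; the plasmid lengths and the function name are hypothetical and not taken from the toolkit.

```python
AVERAGE_BP_MASS_G_PER_MOL = 650.0  # approximation for double-stranded DNA


def fmol_to_ng(amount_fmol: float, length_bp: int) -> float:
    """Convert an amount of double-stranded DNA from fmol to ng."""
    if length_bp <= 0:
        raise ValueError("length must be positive")
    # fmol * (g/mol per bp) * bp gives femtograms; 1 ng = 1e6 fg
    return amount_fmol * AVERAGE_BP_MASS_G_PER_MOL * length_bp / 1e6


# Hypothetical 3,000-bp backbone and 4,500-bp entry vector, 20 fmol each
for name, length in [("backbone", 3000), ("entry vector", 4500)]:
    print(f"{name}: {fmol_to_ng(20, length):.1f} ng")
```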

K. phaffii electrocompetent cells were generated using the previously described lithium acetate method⁴⁵. Briefly, precultures were inoculated in 5 ml YPD and cultured overnight in an incubator at 28 °C with rotation at 250 rpm. The precultures were diluted and cultured to an OD₆₀₀ of approximately 1.5. Cells were harvested by centrifugation (1,519g for 5 min at 4 °C) from 50 ml of the culture, resuspended in 200 ml of a lithium acetate and dithiothreitol solution (100 mM lithium acetate, 10 mM dithiothreitol, 0.6 M sorbitol and 10 mM Tris–HCl pH 7.5) and incubated at 28 °C for 30 min with rotation at 100 rpm. The cells were then collected by centrifugation (1,519g for 5 min at 4 °C), washed twice with 1 M ice-cold sorbitol and finally resuspended in 1.875 ml of 1 M ice-cold sorbitol. DNA (0.5–1 µg) was added to aliquots of 80 µl and electroshocked (1.5 kV, 200 Ω and 25 µF). A 1 ml volume of 1 M sorbitol was immediately added to the samples and the suspension was incubated at 28 °C for 2–5 h. Next, the cells were plated on YPD agar containing the appropriate antibiotic and colonies were isolated after 2 days of incubation at 30 °C.

To enable the comparison of expression levels, only colonies with single-copy integration of the construct were selected. The copy number was determined by quantitative PCR on a LightCycler 480 system (Roche) using primers that bind PAOX1 and PGAP. The genes OCH1 and ALG9 were used as references. NCYC 2543 gDNA was included as a single-copy positive control. A single-copy plasmid integration will yield one additional copy and more than two copies would be the result of multiple plasmid integrations. Amplification efficiencies were determined using serial dilutions of gDNA samples. Reactions were set up in 10 μl with final concentrations of 300 nM forward primer, 300 nM reverse primer, 1×SensiFast SYBR no-ROX mastermix (Bioline), 10 ng gDNA and the following cycling conditions: 3 min at 95 °C, followed by 45 cycles of 95 °C for 3 s, 60 °C for 30 s at a ramp rate 2.5 °C s⁻¹ and 72 °C for 1 s, and ending with 0.11 °C s⁻¹ from 65 °C to 95 °C for melting curve determination (5 acquisitions s⁻¹). Copy numbers were calculated using the ΔΔC_t method⁴⁶.
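The copy-number calculation can be sketched as follows. This is a simplified ΔΔC_t illustration that assumes one shared amplification efficiency (2.0 means perfect doubling per cycle) instead of the measured per-primer efficiencies, and all C_t values and names are hypothetical. Because the promoter is also present once in the untransformed genome, a result of about 2 corresponds to a single integrated copy.

```python
from statistics import mean


def relative_copy_number(
    ct_target_sample: float,
    ct_refs_sample: list[float],
    ct_target_control: float,
    ct_refs_control: list[float],
    efficiency: float = 2.0,
) -> float:
    """Promoter copies in a clone relative to the single-copy control strain."""
    delta_sample = ct_target_sample - mean(ct_refs_sample)
    delta_control = ct_target_control - mean(ct_refs_control)
    delta_delta = delta_sample - delta_control
    return efficiency ** (-delta_delta)


# Hypothetical Ct values: target promoter plus two reference genes
copies = relative_copy_number(19.0, [20.0, 20.0], 20.0, [20.0, 20.0])
print(copies)
```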

The different strains expressing the recombinant proteins were cultured on YPD agar plates for 2 days, inoculated in triplicate into a 5 ml preculture of BMDY and cultured at 28 °C overnight with shaking at 225 rpm. Next, the cultures for PAOX1-driven expression were inoculated in 2 ml BMDY, cultured for 24 h in a microtiter plate, transferred to 2 ml BMMY and incubated for 48 h in a microtiter plate. After 24 h in BMMY, an extra 1% methanol was added. The cultures for PGAP-driven expression were instead inoculated in 2 ml limiting glucose medium and incubated for 48 h in a microtiter plate. The OD₆₀₀ was measured for all cultures and the supernatant was collected by centrifugation (2,500g for 5 min). The samples were incubated with EndoH (produced in-house) to remove N-glycans and analysed by SDS–PAGE.

ELISA-based quantification of GBP

Each well of a Nunc MaxiSorp 96-well plate was coated with 75 ng anti-penta-His (Qiagen, 34660) in PBS solution and incubated overnight at 4 °C. The wells were washed three times with 200 µl wash buffer (PBS + 0.05% Tween-20) and any residual liquid was removed. The samples were blocked with 100 µl Reagent Diluent (1% Probumin (Millipore, 82-045-1) in PBS pH 7.2) for 2 h. This was followed by three washes with 200 µl wash buffer and the removal of any residual liquid. Dilutions of the yeast supernatant were prepared in 96-deep-well plates, and 100 µl of a 100,000-fold dilution was applied to each well, followed by incubation for 1 h with gentle shaking in a table-top plate shaker. The wells were washed three times with 200 µl wash buffer and the residual liquid was removed. The samples were provided with 100 µl of 250 ng ml⁻¹ MonoRab rabbit anti-camelid VHH coupled to horseradish peroxidase in Reagent Diluent and incubated for 1 h with gentle shaking in a table-top plate shaker. Each well was washed three times with 200 µl wash buffer and the residual liquid was removed. 3,3′,5,5′-Tetramethylbenzidine substrate was prepared according to the manufacturer’s instructions (BD OptEIA) and 100 µl was applied to each well, followed by a 10 min incubation. Finally, 50 µl stop solution (2 N H₂SO₄) was added to each well and the plate was read at 450 nm using a plate reader. The absorbance units were background corrected. All strains were compared in a Kruskal–Wallis omnibus test (two-sided), followed by a pairwise (two-sided) comparison corrected with Dunn’s multiple comparison procedure.
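For readers unfamiliar with the omnibus test, the sketch below computes the Kruskal–Wallis H statistic (with tie correction) from background-corrected absorbances grouped by strain. It is a plain-Python illustration only: it does not compute the P value or Dunn's post hoc comparisons, it is not the software used in the study, and the absorbance values are invented.

```python
def kruskal_wallis_h(groups: list[list[float]]) -> float:
    """Kruskal-Wallis H statistic with correction for tied values."""
    pooled = sorted(value for group in groups for value in group)
    n = len(pooled)
    rank_of = {}
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        # Tied values share the average of the ranks i+1 .. j
        rank_of[pooled[i]] = (i + 1 + j) / 2
        ties = j - i
        tie_term += ties**3 - ties
        i = j
    correction = 1 - tie_term / (n**3 - n)
    if correction == 0:
        raise ValueError("all values are identical")
    rank_sum_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_sum_term - 3 * (n + 1)
    return h / correction


# Invented background-corrected absorbances for three strains
absorbances = [[0.41, 0.45, 0.43], [0.62, 0.66, 0.60], [0.88, 0.91, 0.85]]
print(kruskal_wallis_h(absorbances))
```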

Flow cytometry to compare surface display of human lysozyme

Electroporation of a surface display plasmid to multiple K. phaffii strains (NRRL Y-11430, NCYC 2543 and OPENPichia) was performed using the lithium acetate method described in the ‘Recombinant protein expression’ section. We chose the previously reported⁴⁷ pPSD-FLAG-hLYZ-V5-Sag1 plasmid as a test case. It expresses the wild-type human lysozyme protein flanked by an N-terminal FLAG tag and a C-terminal V5 tag, and is fused at the C-terminal end to a Sag1 anchor under the control of the AOX1 promoter. The copy number of the surface display construct in the resulting strains was determined as described earlier. Clones that were determined to have one integrated copy of the surface display construct were inoculated in BMGY supplemented with 50 µg ml⁻¹ Zeocin in technical triplicates and cultured for 24 h at 28 °C with shaking at 200 rpm. The cultures were then transferred to BMMY supplemented with 50 µg ml⁻¹ Zeocin, set to 10 OD₆₀₀ units ml⁻¹ and further cultured at 28 °C for 24 h with shaking at 200 rpm. After 12 h, the cultures were spiked with an additional 1% methanol. After induction, the cells were harvested by centrifugation at 1,500g for 5 min and washed three times with ice-cold washing buffer (PBS containing 1 mM EDTA pH 7.2 and one cOmplete Inhibitor EDTA-free tablet (Roche) per 50 ml buffer). The cells were kept on ice during the entire staining procedure. Unstained controls, single-stain controls and an empty vector control were included.

The cells (at an OD₆₀₀ of two) were stained with mouse monoclonal anti-V5 (1/500; AbD Serotec, MCA2892) and rabbit polyclonal anti-FLAG (1/200; Sigma-Aldrich, F7425) in ice-cold staining buffer (wash buffer containing 0.5 mg ml⁻¹ BSA) for 1 h at 4 °C. They were then washed three times with ice-cold staining buffer and stained with goat anti-mouse AF568 (1/250; Thermo Fisher Scientific, A-11031), goat anti-rabbit AF488 (1/500; Thermo Fisher Scientific, A11008) and Live/Dead stain eFluor506 (1/1,000; Thermo Fisher Scientific) for 1 h at 4 °C. This was followed by three washes with ice-cold staining buffer before analysis on a BD FACSMelody instrument. The data were analysed using the FlowJo software. The gating strategy is shown in Extended Data Fig. 6.

Comparison of NCYC 2543 and NCYC 2543 hoc1^tr in a fed-batch process

Fermentations were conducted using a SciVario Twin 3 l fermenter (Eppendorf) containing 800 ml basal salts medium as described in the Pichia Fermentation Process Guidelines (Invitrogen Corporation, 2002). Yeast extract (Neogen, NCM0218A) was further added at a concentration of 10 g l⁻¹ to supplement the batch medium.

To prepare the inoculum seed culture, a 1 l baffled flask containing 100 ml of BMGY (1% yeast extract, 2% peptone, 1.34% yeast nitrogen base, 1% glycerol and 100 mM potassium phosphate pH 6.0), supplemented with 4 × 10⁻⁵% biotin, was inoculated with the expression clone of interest at an initial OD₆₀₀ of 0.1. The culture in the flask was incubated at 28 °C with agitation at 200 rpm for 20–24 h until the OD₆₀₀ reached the range of 20–30.

The batch phase of the fermenter was initiated by inoculating batch medium with inoculum seed at an initial OD of one. The cultivation temperature was maintained at 25 °C with an airflow rate of 1 vvm. The pH was automatically controlled at 6.0 by the addition of 25% wt/wt ammonium hydroxide as required. The dissolved oxygen levels were maintained at 30% saturation through control of agitation (600–1,200 rpm) and the addition of pure oxygen. Foam formation was prevented by the addition of an antifoam solution (Struktol, J673A).

Once the initial glycerol (40 g l⁻¹) was fully consumed, marked by a rapid increase in the percentage of dissolved oxygen, the fed-batch phase commenced with the introduction of a 50% glucose solution (wt/wt) supplemented with 12 ml l⁻¹ PTM1 solution. The feed rate was adjusted to 20 ml h⁻¹ l⁻¹ batch volume and linearly increased to 40 ml h⁻¹ l⁻¹ batch volume over a duration of 48 h to introduce 1 l of feed solution. All process parameters were maintained at the levels established during the batch phase throughout the entire fermentation process.
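The linear feed ramp can be written down explicitly. In this minimal sketch the rate rises linearly from the start rate to the end rate over the ramp duration, and the total feed is the area under that line. The default batch volume of 1 l used for scaling is an assumption made for illustration, and the function names are hypothetical.

```python
def feed_rate_ml_per_h(
    t_h: float,
    start_rate: float = 20.0,   # ml per h per litre batch volume
    end_rate: float = 40.0,     # ml per h per litre batch volume
    ramp_h: float = 48.0,
    batch_volume_l: float = 1.0,  # assumed scaling volume
) -> float:
    """Feed rate (ml/h) at time t_h on the linear ramp."""
    if not 0.0 <= t_h <= ramp_h:
        raise ValueError("time must lie within the ramp")
    per_litre = start_rate + (end_rate - start_rate) * t_h / ramp_h
    return per_litre * batch_volume_l


def total_feed_ml(
    start_rate: float = 20.0,
    end_rate: float = 40.0,
    ramp_h: float = 48.0,
    batch_volume_l: float = 1.0,
) -> float:
    """Total feed volume (ml): mean rate times duration times batch volume."""
    return (start_rate + end_rate) / 2.0 * ramp_h * batch_volume_l


print(feed_rate_ml_per_h(24.0), total_feed_ml())
```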

RT–qPCR analysis of HOC1 mRNA

The four strains were inoculated in BMGY medium, in triplicate, from an overnight preculture and cultured for 20 h at 28 °C and 200 rpm. The cells (10 OD₆₀₀ units) were pelleted and washed with RNase-free water. Total RNA was prepared using a RiboPure-Yeast Kit (Invitrogen, AM1926), followed by a DNase treatment using a TURBO DNA-free Kit (Invitrogen, AM1907) according to the manufacturer’s instructions. Complementary DNA was then prepared using an iScript cDNA Synthesis Kit (BioRad, 1708891). The RT–qPCR reaction was performed for technical triplicates of each biological replicate using the following conditions: activation for 5 min at 95 °C, followed by 40 cycles of 10 s at 95 °C, 15 s at 55 °C and 20 s at 72 °C, and a final elongation step for 40 s at 72 °C. The transcript level variance of eight reference genes for normalization (UCB6, TDH3, QCR9, ALG9, PGK1, TAF10, ACT1 and TPI1) was analysed using the geNorm algorithm, as implemented in the qbase+ software⁴⁸, to identify the genes whose transcript levels were least affected under the experimental conditions used. Based on these data (not shown), the HOC1 transcript levels were normalized using the geometric mean of the genes QCR9 and ALG9. The levels of HOC1 transcript were determined using two primer pairs. Determination of amplification efficiencies and conversion of raw C_q values to calibrated normalized relative quantity was performed using the qbase+ software. Statistical analysis of the calibrated normalized relative quantities was done using the GraphPad Prism 9 software package. All primers used are listed in Supplementary Table 6.
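Normalization against the geometric mean of several reference genes can be sketched as below. This is a simplified illustration, not the qbase+ implementation: it assumes one shared amplification efficiency for all primer pairs, uses a single calibrator sample, and all C_q values and names are hypothetical.

```python
from statistics import geometric_mean


def normalized_relative_quantity(
    cq_target: float,
    cq_target_calibrator: float,
    cq_refs: list[float],
    cq_refs_calibrator: list[float],
    efficiency: float = 2.0,
) -> float:
    """Target quantity relative to a calibrator, normalized to reference genes."""
    if len(cq_refs) != len(cq_refs_calibrator) or not cq_refs:
        raise ValueError("reference Cq lists must be non-empty and of equal length")
    rq_target = efficiency ** (cq_target_calibrator - cq_target)
    rq_refs = [
        efficiency ** (cal - sample)
        for sample, cal in zip(cq_refs, cq_refs_calibrator)
    ]
    return rq_target / geometric_mean(rq_refs)


# Hypothetical Cq values: HOC1 plus the two reference genes QCR9 and ALG9
nrq = normalized_relative_quantity(22.0, 24.0, [20.0, 21.0], [20.0, 21.0])
print(nrq)
```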

Transformation efficiency testing

Competent cells were prepared using the lithium acetate method described in the ‘Recombinant protein expression’ section. Each strain was transformed with 200 ng linearized plasmid and several dilutions of the transformation mix were plated on either non-selective YPD agar or YPD agar containing 100 µg ml⁻¹ Zeocin. For each transformation, colonies were counted from the plates where clear individual colonies could be observed after incubation at 30 °C for 2 days. Both the selective and non-selective plates were counted to correct for a potential difference in the number of competent cells per transformation.

A linear model (estimated using ordinary least squares) was fitted in the statistical software R⁴⁹. The log-transformed normalized transformation efficiency (natural logarithm of the number of transformants per million clones) was used as the outcome variable, and the strain and promoter type, including an interaction effect, were used as the predictor variables. The model explains a statistically significant and substantial proportion of variance (coefficient of determination (R²) = 0.94, F(7,38) = 81.33, P < 0.001 and adjusted R² = 0.93). Model-predicted group means with 95% confidence intervals were obtained using the ggeffects package with heteroscedasticity-consistent variance estimators from the sandwich package (vcovHC, type HC0)⁵⁰,⁵¹.
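The outcome variable of the model can be derived from the plate counts as sketched below. The function name, the use of dilution factors to scale the counts and the example numbers are assumptions made for illustration; the sketch covers only the normalization and log transformation, not the model fit itself.

```python
import math


def log_transformation_efficiency(
    selective_colonies: int,
    selective_dilution: float,
    nonselective_colonies: int,
    nonselective_dilution: float,
) -> float:
    """Natural log of transformants per million viable clones."""
    if selective_colonies <= 0 or nonselective_colonies <= 0:
        raise ValueError("colony counts must be positive")
    transformants = selective_colonies * selective_dilution
    viable_clones = nonselective_colonies * nonselective_dilution
    per_million = transformants / viable_clones * 1e6
    return math.log(per_million)


# Hypothetical counts: 50 colonies at a 10-fold dilution on Zeocin,
# 100 colonies at a 100,000-fold dilution on non-selective YPD
print(log_transformation_efficiency(50, 10, 100, 1e5))
```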

Capillary gel electrophoresis-laser induced fluorescence detection-based glycan analysis of cell-wall mannoproteins

Strains were inoculated in YPD or YPG medium, from their respective precultures, at an OD₆₀₀ of 0.05 and cultured overnight at 28 °C and 200 rpm. The next day, 500 OD₆₀₀ units per strain were pelleted (10 min at 1,500g) and the mannoproteins were isolated as follows. The pellets were washed three times with Milli-Q water, after which 20 mM citrate buffer pH 6.6 was added at 1 ml per 150 µg of wet cell weight. The resuspended cells were autoclaved for 1.5 h at 120 °C in cryovials and then centrifuged for 10 min at 16,000g. Three volumes of ice-cold methanol were added to the supernatant fractions and the vials were incubated for 15 min at 20 °C. The mannoproteins were spun down at 16,000g for 10 min and the pellets were left to dry until transparent. The pellets were resuspended in 50 µl RCM buffer (8 M urea, 360 mM Tris–HCl pH 8.6 and 3.2 mM EDTA) and stored at 4 °C until further analysis.

N-linked oligosaccharides were prepared from the purified mannoproteins following blotting to polyvinylidene fluoride membrane in the wells of 96-well membrane plates and analysed by capillary electrophoresis with laser-induced fluorescence detection using an ABI 3130 capillary DNA sequencer as described previously²².

Alcian blue assay

The assay was performed as described previously²³, with the following adaptations. Briefly, Alcian blue was prepared in 0.02 N HCl at a concentration of 63 µg ml⁻¹ and the solution was centrifuged to remove insoluble precipitates. An overnight culture of each strain was cultured in YPD medium at 28 °C and 200 rpm. The next day, the cells were pelleted and the supernatant was removed. The cells were washed with 0.02 N HCl and the pellet was resuspended in 0.02 N HCl to 10 OD₆₀₀ units ml⁻¹. The cells (100 µl; 1 OD₆₀₀) were transferred to a 96-well V-bottomed plate, to which 100 µl of the Alcian blue solution was added. Following incubation at room temperature for 15 min, the plate was centrifuged at 3,220g for 15 min, after which the pellets were visually checked.

Congo red and Calcofluor white test

The test was performed as described elsewhere⁵², with slight adaptations. Briefly, the strains were cultured overnight in BMGY. The next day, dilutions were made to obtain between 1 × 10⁵ and 10 cells in 5 µl BMGY. Drops of 5 µl were spotted on the different plates, which were incubated at 30 °C for 3 days. Congo red (Sigma, C6767) and Calcofluor white (Fluka, 18909) were present at final concentrations of 75 µg ml⁻¹ and 10 µg ml⁻¹, respectively.

Electron microscopy

Transmission electron microscopy

The strains were cultured overnight in BMGY at 28 °C and 200 rpm. High-pressure freezing, as described previously⁵³, was carried out in a high-pressure freezer (Leica EM ICE). The cells were pelleted and frozen as a paste in 150 µm copper carriers. High-pressure freezing was followed by quick freeze substitution as described previously⁵⁴. Briefly, the carriers were placed on top of the frozen FS solution inside a cryovial containing 1% double-distilled water, 1% OsO₄ and 0.5% glutaraldehyde in dried acetone. After reaching 4 °C for 30 min, the samples were infiltrated stepwise over 3 days at 0–4 °C in Spurr’s resin and embedded in capsules. The polymerization was performed at 70 °C for 16 h. Ultrathin sections of a gold interference colour were cut using an ultramicrotome (Leica EM UC6), followed by post staining, in a Leica EM AC20 system, with uranyl acetate at 20 °C for 40 min and lead at 20 °C for 10 min.

The sections were collected on formvar-coated copper slot grids. The grids were viewed using a JEM-1400Plus transmission electron microscope (JEOL) operating at 60 kV.

Scanning electron microscopy

The strains were cultured overnight in BMGY at 28 °C and 200 rpm. The cells were fixed overnight in 1.5% paraformaldehyde and 3% glutaraldehyde in 0.05 M sodium cacodylate buffer pH 7.4. The fixed cells were centrifuged for 2 min at 1,000g between each of the following steps. First, the cells were washed three times with 0.1 M sodium cacodylate buffer pH 7.4 and then incubated for 30 min in 0.1 M sodium cacodylate pH 7.4 containing 2% OsO₄. The osmicated samples were washed three times with Milli-Q water before a stepwise ethanol dehydration (50%, 70%, 90% and 2 × 100%). This was followed by two incubations in hexamethyldisilazane solution (Sigma-Aldrich), as a final dehydration step, after which the samples were spotted on silicon grids (Ted Pella) and air-dried overnight at room temperature. Finally, the samples were coated with 5 nm platinum in a Q150T ES sputter coater (Quorum Technologies) and placed in a Gemini 2 Cross beam 540 microscope (Zeiss) for scanning electron microscopy imaging at 1.50 kV using an SE2 detector.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

All raw reads of the genomes sequenced in this study have been submitted to the NCBI and can be found under the accession number PRJNA909165. The CBS 7435 reference genome can be found under the NCBI accession number GCA_900235035.2. Source data are provided. All other data supporting the findings of this study are available from the corresponding authors. OPENPichia is available from VIB (OPENPichia.com). The expression vector construction toolkit can be obtained from the BCCM at https://bccm.belspo.be/catalogues/plasmid-sets/openpichia.

References

Karbalaei, M., Rezaee, S. A. & Farsiani, H. Pichia pastoris: a highly successful expression system for optimal synthesis of heterologous proteins. J. Cell. Physiol. https://doi.org/10.1002/jcp.29583 (2020).
Adivitiya, Dagar, V. K. & Khasa, Y. P. in Yeast Diversity in Human Welfare (eds Satyanarayana, T. & Kunze, G.) 215–250 (Springer, 2017).
Yang, Z. & Zhang, Z. Engineering strategies for enhanced production of protein and bio-products in Pichia pastoris: a review. Biotechnol. Adv. 36, 182–195 (2018).
Phaff, H. J., Miller, M. W. & Shifrine, M. The taxonomy of yeasts isolated from Drosophila in the Yosemite region of California. Antonie van Leeuwenhoek 22, 145–161 (1956).
Phaff, H. J. A proposal for amendment of the diagnosis of the genus Pichia hansen. Antonie van Leeuwenhoek 22, 113–116 (1956).
Kurtzman, C. P. Description of Komagataella phaffii sp. nov. and the transfer of Pichia pseudopastoris to the methylotrophic yeast genus Komagataella. Int. J. Syst. Evol. 55, 973–976 (2005).
Ogata, K., Nishikawa, H. & Ohsugi, M. A yeast capable of utilizing methanol. Agric. Biol. Chem. 33, 1519–1520 (1969).
Tani, Y., Miya, T., Nishikawa, H. & Ogata, K. The microbial metabolism of methanol. Part I. Formation and crystallization of methanol-oxidizing enzyme in a methanol-utilizing yeast, Kloeckera sp. no. 2201. Agric. Biol. Chem. 36, 68–83 (1972).
Tani, Y., Miya, T. & Ogata, K. The microbial metabolism of methanol part II. Properties of crystalline alcohol oxidase from Kloeckera sp. no. 2201. Agric. Biol. Chem. 36, 76–83 (1972).
Wegner, E. H. A process for producing single cell protein material and culture. European patent EP0017853B2 (1980).
De Schutter, K. et al. Genome sequence of the recombinant protein production host Pichia pastoris. Nat. Biotechnol. 27, 561–566 (2009).
Sturmberger, L. et al. Refined Pichia pastoris reference genome sequence. J. Biotechnol. 235, 121–131 (2016).
Mattanovich, D. et al. Open access to sequence: browsing the Pichia pastoris genome. Microb. Cell Fact. 8, 53 (2009).
Brady, J. R. et al. Comparative genome‐scale analysis of Pichia pastoris variants informs selection of an optimal base strain. Biotechnol. Bioeng. 117, 543–555 (2020).
Prielhofer, R. et al. GoldenPiCS: a Golden Gate-derived modular cloning system for applied synthetic biology in the yeast Pichia pastoris. BMC Syst. Biol. 11, 123 (2017).
Love, K. R. et al. Comparative genomics and transcriptomics of Pichia pastoris. BMC Genomics 17, 550 (2016).
Braun-Galleani, S. et al. Genomic diversity and meiotic recombination among isolates of the biotech yeast Komagataella phaffii (Pichia pastoris). Microb. Cell Fact. 18, 211 (2019).
Offei, B. et al. Identification of genetic variants of the industrial yeast Komagataella phaffii (Pichia pastoris) that contribute to increased yields of secreted heterologous proteins. PLoS Biol. 20, e3001877 (2022).
Lu, L., Roberts, G. G., Oszust, C. & Hudson, A. P. The YJR127C/ZMS1 gene product is involved in glycerol-based respiratory growth of the yeast Saccharomyces cerevisiae. Curr. Genet. 48, 235–246 (2005).
Jungmann, J. & Munro, S. Multi-protein complexes in the cis Golgi of Saccharomyces cerevisiae with α-1,6-mannosyltransferase activity. EMBO J. 17, 423–434 (1998).
Vogl, T., Gebbie, L., Palfreyman, R. W. & Speight, R. Effect of plasmid design and type of integration event on recombinant protein expression in Pichia pastoris. Appl. Environ. Microbiol. 84, e02712-17 (2018).
Laroy, W., Contreras, R. & Callewaert, N. Glycome mapping on DNA sequencing equipment. Nat. Protoc. 1, 397–405 (2006).
Conde, R., Pablo, G., Cueva, R. & Larriba, G. Screening for new yeast mutants affected in mannosylphosphorylation of cell wall mannoproteins. Yeast 20, 1189–1211 (2003).
Friis, J. & Ottolenghi, P. The genetically determined binding of alcian blue by a minor fraction of yeast cell walls. C. R. Trav. Lab. Carlsberg 37, 327–341 (1970).
Casini, A., Storch, M., Baldwin, G. S. & Ellis, T. Bricks and blueprints: methods and standards for DNA assembly. Nat. Rev. Mol. Cell Biol. 16, 568–576 (2015).
OPENPichia Plasmid Set. Belgian Coordinated Collections of Microorganisms https://bccm.belspo.be/catalogues/plasmid-sets/openpichia (2022).
Moore, S. J. et al. EcoFlex: a multifunctional MoClo kit for E. coli synthetic biology. ACS Synth. Biol. 5, 1059–1069 (2016).
Lee, M. E., DeLoache, W. C., Cervantes, B. & Dueber, J. E. A highly characterized yeast toolkit for modular, multipart assembly. ACS Synth. Biol. 4, 975–986 (2015).
van Dolleweerd, C. J. et al. MIDAS: a modular DNA assembly system for synthetic biology. ACS Synth. Biol. 7, 1018–1029 (2018).
Hernanz-Koers, M. et al. FungalBraid: a GoldenBraid-based modular cloning platform for the assembly and exchange of DNA elements tailored to fungal synthetic biology. Fungal Genet. Biol. 116, 51–61 (2018).
Sarrion-Perdigones, A. et al. GoldenBraid: an iterative cloning system for standardized assembly of reusable genetic modules. PLoS ONE 6, e21622 (2011).
Obst, U., Lu, T. K. & Sieber, V. A modular toolkit for generating Pichia pastoris secretion libraries. ACS Synth. Biol. 6, 1016–1025 (2017).
Andreou, A. I. & Nakayama, N. Mobius assembly: a versatile Golden-Gate framework towards universal DNA assembly. PLoS ONE 13, e0189892 (2018).
Engler, C. et al. A Golden Gate modular cloning toolbox for plants. ACS Synth. Biol. 3, 839–843 (2014).
Weber, E., Engler, C., Gruetzner, R., Werner, S. & Marillonnet, S. A modular cloning system for standardized assembly of multigene constructs. PLoS ONE 6, e16765 (2011).
Potapov, V. et al. Comprehensive profiling of four base overhang ligation fidelity by T4 DNA ligase and application to DNA assembly. ACS Synth. Biol. 7, 2665–2674 (2018).
Lin, Y.-C. et al. Genome dynamics of the human embryonic kidney 293 lineage in response to cell biology manipulations. Nat. Commun. 5, 4767 (2014).
Andrews, S. FastQC: a quality control tool for high throughput sequence data v.0.11.9 (Babraham Bioinformatics, 2019); http://www.bioinformatics.babraham.ac.uk/projects/fastqc/
Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).
Deatherage, D. E. & Barrick, J. E. in Engineering and Analyzing Multicellular Systems, Vol. 1151 (eds Sun, L. & Shou, W.) 165–188 (Springer, 2014).
Kumar, S., Stecher, G., Li, M., Knyaz, C. & Tamura, K. MEGA X: molecular evolutionary genetics analysis across computing platforms. Mol. Biol. Evol. 35, 1547–1549 (2018).
Heiss, S., Maurer, M., Hahn, R., Mattanovich, D. & Gasser, B. Identification and deletion of the major secreted protein of Pichia pastoris. Appl. Microbiol. Biotechnol. 97, 1241–1249 (2013).
Näätsaari, L. et al. Deletion of the Pichia pastoris KU70 homologue facilitates platform strain generation for gene expression and synthetic biology. PLoS ONE 7, e39720 (2012).
Weninger, A., Hatzl, A.-M., Schmid, C., Vogl, T. & Glieder, A. Combinatorial optimization of CRISPR/Cas9 expression enables precision genome engineering in the methylotrophic yeast Pichia pastoris. J. Biotechnol. 235, 139–149 (2016).
Wu, S. & Letchworth, G. J. High efficiency transformation by electroporation of Pichia pastoris pretreated with lithium acetate and dithiothreitol. BioTechniques 36, 152–154 (2004).
Livak, K. J. & Schmittgen, T. D. Analysis of relative gene expression data using real-time quantitative PCR and the \(2^{-\Delta\Delta{\mathrm{C}}_{\mathrm{T}}}\) method. Methods 25, 402–408 (2001).
Boone, M. et al. Massively parallel interrogation of protein fragment secretability using SECRiFY reveals features influencing secretory system transit. Nat. Commun. 12, 6414 (2021).
Vandesompele, J. et al. Accurate normalization of real-time quantitative RT-PCR data by geometric averaging of multiple internal control genes. Genome Biol. 3, research0034.1 (2002).
R Core Team. R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, 2009).
Lüdecke, D. ggeffects: Tidy data frames of marginal effects from regression models. J. Open Source Softw. 3, 772 (2018).
Zeileis, A., Köll, S. & Graham, N. Various versatile variances: an object-oriented implementation of clustered covariances in R. J. Stat. Softw. 95, 1–36 (2020).
Ram, A. F. J. & Klis, F. M. Identification of fungal cell wall mutants using susceptibility assays based on Calcofluor white and Congo red. Nat. Protoc. 1, 2253–2256 (2006).
Arendt, P. et al. An endoplasmic reticulum-engineered yeast platform for overproduction of triterpenoids. Metab. Eng. 40, 165–175 (2017).
McDonald, K. L. & Webb, R. I. Freeze substitution in 3 hours or less. J. Microsc. 243, 227–233 (2011).

Acknowledgements

K.C. was supported by an Innovation Mandate of VLAIO (HBC.2021.0249). D.V.H. was supported by a Baekeland mandate of VLAIO (Flanders Innovation & Entrepreneurship fund) in collaboration with Inbiose NV, and is now an employee of Inbiose NV. R.V. was supported by a Strategic Basic Research fellowship from the Fund for Scientific Research and otherwise supported by Ghent University. C.R., B.V.M. and J.N. are supported by Strategic Basic Research fellowships of the Fund for Scientific Research Flanders (FWO). E.W. and S.V. are supported by grants from Ghent University. S.Y. and E.C. are supported by a grant from the Bill and Melinda Gates Foundation (INV-037592). H.E. is supported by a Fundamental Research fellowship of the Fund for Scientific Research Flanders (FWO). H.G. was a post-doctoral fellow funded by Ghent University and VIB. C.L. was supported by a Strategic Basic Research fellowship of the Fund for Scientific Research Flanders (FWO). G.M. was supported by VIB. L.v.S. is a VIB post-doctoral fellow and supported by grants from the Industrial Research Fund of Ghent University, VLAIO and the European Commission (HERA-Pilot). Research in the Callewaert laboratory is supported by grants from UGent, the Fund for Scientific Research Flanders (FWO) and core resources from VIB. We thank the staff of the VIB Flow Core Ghent for providing access to flow cytometry equipment and for their technical assistance. We also thank J. Beauprez for valuable discussions; M. Arslan, A. V. Hecke and S. Devos for their assistance with some of the experiments; and R. A. Symakani for the design of Extended Data Fig. 8. We thank J. Cregg for his careful reading of the originally submitted introductory section, which appears in this paper in an abbreviated version.

Author information

These authors contributed equally: Katrien Claes, Dries Van Herpe, Robin Vanluchene.

Authors and Affiliations

Center for Medical Biotechnology, VIB, Ghent, Belgium
Katrien Claes, Dries Van Herpe, Robin Vanluchene, Charlotte Roels, Berre Van Moer, Elise Wyseure, Kristof Vandewalle, Hannah Eeckhaut, Semiramis Yilmaz, Sandrine Vanmarcke, Erhan Çıtak, Daria Fijalkowska, Hendrik Grootaert, Chiara Lonigro, Leander Meuris, Gitte Michielsen, Justine Naessens, Loes van Schie & Nico Callewaert
Department of Biochemistry and Microbiology, Ghent University, Ghent, Belgium
Katrien Claes, Dries Van Herpe, Robin Vanluchene, Charlotte Roels, Berre Van Moer, Elise Wyseure, Kristof Vandewalle, Hannah Eeckhaut, Semiramis Yilmaz, Sandrine Vanmarcke, Erhan Çıtak, Daria Fijalkowska, Hendrik Grootaert, Chiara Lonigro, Leander Meuris, Gitte Michielsen, Justine Naessens, Loes van Schie & Nico Callewaert
Inbiose NV, Ghent, Belgium
Dries Van Herpe
Department of Biomedical Molecular Biology, Ghent University, Ghent, Belgium
Riet De Rycke & Michiel De Bruyne
BioImaging Core, VIB, Ghent, Belgium
Riet De Rycke, Michiel De Bruyne & Peter Borghgraef

Contributions

This work was originally conceived and initiated by K.C., D.V.H., K.V. and N.C. K.C., D.V.H., R.V., K.V., H.E., S.Y., S.V., E.C., E.W., B.V.M., D.F., H.G., C.L., G.M., L.M., J.N., C.R. and L.v.S. performed experiments and contributed to data analysis and/or result presentation. R.D.R., M.D.B. and P.B. performed the electron microscopy. D.V.H., R.V., K.C. and N.C. co-wrote the manuscript. K.C. and N.C. supervised the work.

Corresponding authors

Correspondence to Katrien Claes or Nico Callewaert.

Ethics declarations

Competing interests

D.V.H. is now an employee of Inbiose NV. R.V. is now an employee of Those Vegan Cowboys. K.V. was a VIB post-doctoral fellow and is now an employee of Inbiose NV. H.G. is now an employee of Eurofins. C.L. is now an employee of the Council of Europe. G.M. now works at Animab.

Peer review

Peer review information

Nature Microbiology thanks Jiazhang Lian, Laura Navone and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Expression comparison between NCYC 2543 and NRRL Y-11430.

The proteins were expressed using the GAP or AOX1 promoter (three biological replicates per strain, promoter and protein). As controls, both wild-type strains were grown and analysed as well. Supernatant samples were treated with EndoH to remove N-glycans and analysed by SDS–PAGE. EndoH is also visible on the gels at around 30 kDa. The arrows indicate the expected location of the expressed protein band, based on the theoretical size.

Source data

Extended Data Fig. 2 Summary of the end-ODs of the pGAP- and pAOX1-based cultivations at harvest.

For both strains, NCYC 2543 and NRRL Y-11430, the end-ODs for the three model protein cultivations are depicted. Data points for the biological replicates (n = 2) were determined from four technical replicates. Box plot elements: centre line, median; bottom and top lines, lower and upper quartiles; whiskers, maximally 1.5× the interquartile range, or less when no data points are outside this distance.
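
As an illustration of the whisker convention described in this legend, the following minimal Python sketch computes the box plot elements for a list of values. The function name `box_plot_elements`, the example values and the choice of the 'inclusive' quantile method are assumptions made here for illustration; they are not taken from the analysis performed in this study.

```python
from statistics import quantiles


def box_plot_elements(values):
    """Return quartiles, whisker ends and outliers using the 1.5x IQR rule."""
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two data points")
    # Quartiles by linear interpolation between data points (assumed method).
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers stop at the most extreme data points inside the fences, so they
    # are shorter than 1.5x IQR when no point lies that far from the box.
    inside = [x for x in data if lower_fence <= x <= upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": [x for x in data if x < lower_fence or x > upper_fence],
    }


# Hypothetical values, not data from this study.
example = box_plot_elements([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
```

In this example the upper whisker ends at the largest value inside the fence, and the single value beyond it is reported as an outlier.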

Source data

Extended Data Fig. 3 Overview of the HOC1 genome engineering strategy.

a. Alignment of a part of the HOC1 gene as present in NRRL Y-11430 vs. NCYC 2543, showing the frameshift resulting in a premature stop codon in NRRL Y-11430. b. Resulting genomic HOC1 sequence upon split-marker-based gene editing. Two strategies were followed: either the single-base-pair deletion (indicated with *) resulting in the Hoc1p truncation is introduced together with a Lox72 scar downstream of the stop codon, or an additional 115 bp deletion downstream of the resulting stop codon is introduced along with the Lox72 scar.

Extended Data Fig. 4 CGE-LIF profiles of the cell wall mannoproteins of the different strains grown on YPD or YPG.

N-glycan profiles of the cell wall mannoproteins when the strains were grown on YPD (left panels) vs. on YPG (right panels). The predominant peaks are Man₈GlcNAc₂ (M8), Man₉GlcNAc₂ (M9), and Man₁₀GlcNAc₂ (M10).

Extended Data Fig. 5 Surface display phenotype of Pichia pastoris strains NRRL Y-11430, NCYC 2543 and OPENPichia.

Human lysozyme was fused to the C-terminal part of Sag1p (which contains a GPI anchor) and with an N-terminal FLAG tag and a C-terminal V5 tag for detection in flow cytometry. The resulting fusion protein was expressed using the AOX1 promoter. Copy number was determined and two clones with a copy number of 1 were selected for each strain. Cells were plotted by a 5% quantile contour plot with outliers presented as dots. Quadrant gates were set using non-stained and single-stained controls. The number of technical replicates is 3. The gating strategy is shown in Extended Data Fig. 6.

Extended Data Fig. 6 Gating strategy of the yeast surface display experiment.

At least 10,000 events are captured on a FACSMelody instrument and data are analysed with the FlowJo software. First, debris is gated out, then single cells are gated using both the side scatter (SSC) and the forward scatter (FSC), and finally living cells are gated by using the live/dead stain eFluor506. The resulting living single Pichia cells are analysed based on FLAG and V5 signal. The gates of IV and V were determined based on non-stained and single-stained controls.
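
The sequential gating described in this legend can be sketched in a few lines of Python. This is only an illustration of the gate order. The field names, the threshold values and the height-to-area ratio used for singlet discrimination are hypothetical and do not reproduce the FlowJo gates used in this study.

```python
from collections import Counter

# Hypothetical thresholds (arbitrary units); real gates are set on controls.
FSC_MIN, SSC_MIN = 20_000, 5_000   # events below either value count as debris
SINGLET_RATIO = (0.8, 1.2)         # accepted height/area ratio for single cells
VIABILITY_MAX = 1_000              # dead cells take up the viability dye
FLAG_MIN, V5_MIN = 500, 500        # positivity cut-offs for the two tags


def is_singlet(height, area):
    """Single cells have a pulse height roughly proportional to pulse area."""
    lo, hi = SINGLET_RATIO
    return area > 0 and lo <= height / area <= hi


def gate(events):
    """Apply debris, singlet and live gates, then count FLAG/V5 quadrants."""
    counts = Counter()
    for e in events:
        if e["fsc_a"] < FSC_MIN or e["ssc_a"] < SSC_MIN:
            continue  # debris
        if not (is_singlet(e["fsc_h"], e["fsc_a"])
                and is_singlet(e["ssc_h"], e["ssc_a"])):
            continue  # doublets or aggregates
        if e["viability"] >= VIABILITY_MAX:
            continue  # dead cells
        flag = "FLAG+" if e["flag"] >= FLAG_MIN else "FLAG-"
        v5 = "V5+" if e["v5"] >= V5_MIN else "V5-"
        counts[flag + "/" + v5] += 1
    return counts


def make_event(**overrides):
    """Build a hypothetical live, single, double-positive event."""
    event = {"fsc_a": 50_000, "fsc_h": 50_000, "ssc_a": 10_000,
             "ssc_h": 10_000, "viability": 100, "flag": 2_000, "v5": 3_000}
    event.update(overrides)
    return event


example_counts = gate([
    make_event(fsc_a=1_000, fsc_h=1_000),  # debris
    make_event(fsc_h=25_000),              # doublet (low height/area ratio)
    make_event(viability=5_000),           # dead cell
    make_event(),                          # FLAG+/V5+
    make_event(v5=100),                    # FLAG+/V5-
])
```

Only events that pass all three gates are assigned to a quadrant, which mirrors the order in which the gates are applied in the legend.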

Extended Data Fig. 7 Comparative analysis of NCYC 2543 and NCYC 2543 hoc1^tr strains expressing CovidVHH-IgG in a 3 l fermenter.

Both strains, expressing the protein from the GAP promoter, were compared in a fed-batch production in a 3 l fermenter. Note that NCYC 2543 had a single-copy insertion, while NCYC 2543 hoc1^tr had a double-copy insertion of the CovidVHH-IgG expression cassette. They have comparable growth kinetics (batch phase length, oxygen demand and biomass formation), but NCYC 2543 hoc1^tr produced more protein. a. Profile of oxygen uptake rate (OUR), carbon exchange rate (CER) and respiratory quotient (RQ). The end of the batch phase and start of the feed phase is indicated with a vertical dotted line. b. Biomass and protein production kinetics. Biomass data points are the mean of triplicate technical repeat measurements of the same fermentation run. Protein concentrations are single measurements as determined from the ÄKTA A260nm measurements. c. SDS–PAGE analysis of time-course samples collected from the fermentation runs. Samples are labelled with hours elapsed since the start of the fermentation process. Equivalent volumes of supernatant were loaded into each well for analysis. d. Comparison of ProteinA elution peaks from the ÄKTA chromatograms of harvest samples from both fermentation runs (a single replicate per strain is shown). To verify the successful capture of the entire product from the supernatants, purification fractions (load, flow through (FT), and elution) were analysed with SDS–PAGE.

Source data

Extended Data Fig. 8 The modular cloning or MoClo principle.

In Level -1, source DNA, such as PCR fragments or synthetic DNA, is flanked with the proper Type IIS restriction sites and 4 nt overhangs, and is then accommodated in a Level 0 Entry vector through BsmBI digest and T4 DNA ligation. Then, selected Level 0 vectors are assembled into a Level 1 Expression vector by means of a BsaI digest and T4 DNA ligation. Finally, the system allows the assembly of multiple transcription units (promoter, CDS, terminator) from the individual Level 1 vectors into a higher order Level 2 vector, provided that the assembly connector sequences were properly selected during the assembly of the Level 1 vectors. Note that the Part 3 Coding Sequence can be split up in a Part 3a and Part 3b, to allow additional modularity. Likewise, the Part 4 Terminator can be split up in a Part 4a and a Part 4b.
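
The overhang-directed ordering that underlies the Level 0 to Level 1 assembly can be illustrated with a short Python sketch. It models each digested part only as a body flanked by two 4 nt overhangs and chains parts whose overhangs match; it does not simulate the restriction digest or the ligation itself. The overhang sequences, part bodies and names below are hypothetical placeholders and are not the OPENPichia toolkit sequences.

```python
from typing import NamedTuple


class Part(NamedTuple):
    name: str
    left: str    # 4 nt overhang exposed upstream after Type IIS digestion
    body: str    # sequence between the two overhangs
    right: str   # 4 nt overhang exposed downstream


def assemble(parts, start, end):
    """Order parts from the vector overhang `start` to the vector overhang `end`.

    Each part is joined to the part whose left overhang equals the current
    right overhang. `start` and `end` are assumed to be different.
    """
    by_left = {}
    for part in parts:
        if len(part.left) != 4 or len(part.right) != 4:
            raise ValueError(f"{part.name}: overhangs must be 4 nt long")
        if part.left in by_left:
            raise ValueError(f"overhang {part.left} is used by two parts")
        by_left[part.left] = part
    order, sequence, current = [], start, start
    while current != end:
        if current not in by_left:
            raise ValueError(f"no part available for overhang {current}")
        part = by_left.pop(current)
        order.append(part.name)
        sequence += part.body + part.right
        current = part.right
    if by_left:
        raise ValueError(f"unused parts: {sorted(p.name for p in by_left.values())}")
    return order, sequence


# Hypothetical parts, deliberately listed out of order.
example_parts = [
    Part("terminator", "ATCC", "taagcctgaa", "GCTG"),
    Part("promoter", "AACG", "ttgacagcta", "TATG"),
    Part("cds", "TATG", "gctaaaggtt", "ATCC"),
]
example_order, example_sequence = assemble(example_parts, start="AACG", end="GCTG")
```

Because every junction is defined by a unique overhang, the input order of the parts does not matter: the sketch returns promoter, coding sequence and terminator in that order, and raises an error if a junction is missing or ambiguous.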

Supplementary information

Supplementary Information

Supplementary Tables 1–7.

Reporting Summary

Supplementary Table

Supplementary Tables 1–7.

Supplementary Data 1

Sanger sequencing of the hoc1^tr region.

Supplementary Data 2

FASTA files of all MoClo Parts.

Supplementary Data 3

GenBank file of all model proteins.

Source data

Source Data Fig. 2

Source data.

Source Data Fig. 3

Source data.

Source Data Fig. 5

Unprocessed SDS–PAGE gels for Fig. 5a and source data of Fig. 5b.

Source Data Extended Data Fig. 1

Source data.

Source Data Extended Data Fig. 2

Source data.

Source Data Extended Data Fig. 7

Unprocessed SDS–PAGE gels.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Claes, K., Van Herpe, D., Vanluchene, R. et al. OPENPichia: licence-free Komagataella phaffii chassis strains and toolkit for protein expression. Nat Microbiol 9, 864–876 (2024). https://doi.org/10.1038/s41564-023-01574-w

Received: 23 March 2023
Accepted: 01 December 2023
Published: 04 March 2024
Issue Date: March 2024
DOI: https://doi.org/10.1038/s41564-023-01574-w
Springer Nature Limited
