Zhi-Chao Lei§ abc, Xinchang Wang§ d, Liulin Yang *a, Hang Qu a, Yibin Sun e, Yang Yang a, Wei Li bc, Wen-Bin Zhang e, Xiao-Yu Cao a, Chunhai Fan f, Guohong Li bc, Jiarui Wu ghi and Zhong-Qun Tian *a
a State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China. E-mail: zqtian@xmu.edu.cn; llyang@xmu.edu.cn
b National Laboratory of Biomacromolecules, CAS Center for Excellence in Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, P. R. China
c University of Chinese Academy of Sciences, Beijing 100049, P. R. China
d School of Electronic Science and Engineering, State Key Laboratory of Physical Chemistry of Solid Surfaces, Xiamen University, Xiamen 361005, P. R. China
e Beijing National Laboratory for Molecular Sciences, Key Laboratory of Polymer Chemistry & Physics of Ministry of Education, Center for Soft Matter Science and Engineering, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, P. R. China
f School of Chemistry and Chemical Engineering, Frontiers Science Center for Transformative Molecules and National Center for Translational Medicine, Shanghai Jiao Tong University, Shanghai 200240, P. R. China
g Key Laboratory of Systems Biology, Center for Excellence in Molecular Cell Science, Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, Shanghai, 200031, P. R. China
h School of Life Science and Technology, ShanghaiTech University, Shanghai, 201210, P. R. China
i Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Hangzhou, 310024, P. R. China
First published on 17th January 2024
Molecular assembly is the process of organizing individual molecules into larger structures and complex systems. The self-assembly approach is predominantly utilized in creating artificial molecular assemblies, and was believed to be the primary mode of molecular assembly in living organisms as well. However, it has been shown that the assembly of many biological complexes is “catalysed” by other molecules, rather than relying solely on self-assembly. In this review, we summarize these catalysed-assembly (catassembly) phenomena in living organisms and systematically analyse their mechanisms. We then expand on these phenomena and discuss related concepts, including catalysed-disassembly and catalysed-reassembly. Catassembly proves to be an efficient and highly selective strategy for synergistically controlling and manipulating various noncovalent interactions, especially in hierarchical molecular assemblies. Overreliance on self-assembly may, to some extent, hinder the advancement of artificial molecular assembly with powerful features. Furthermore, inspired by the biological catassembly phenomena, we propose guidelines for designing artificial catassembly systems and developing characterization and theoretical methods, and review pioneering works along this new direction. Overall, this approach may broaden and deepen our understanding of molecular assembly, enabling the construction and control of intelligent assembly systems with advanced functionality.
However, the artificial materials obtained through self-assembly of building blocks are still too primitive in both structure and function compared to the multicomponent, hierarchical assemblies in living organisms. As illustrated in Fig. 1, increasing complexity generally leads to increasing functionality, and even to the emergence of novel properties at new levels of complexity. The self-assembly approach seems to have limitations in constructing such complex assemblies, and over-reliance on it may hinder the advancement of molecular assembly research (vide infra). These concerns have motivated a shift in the research paradigm of molecular assembly, with a need to explore alternative assembly methods beyond the self-assembly of building blocks.12,13
Molecular assembly in living organisms was traditionally thought to occur mainly through self-assembly.14 However, a growing number of discoveries have challenged this view.15,16 Studies have shown that in living organisms, many macromolecular complexes and even organisms themselves are assembled through “catalysis” by other molecules rather than by self-assembly. These “catalysing” molecules efficiently assist the assembly of molecular building blocks in a controllable manner and do not remain in the final structures. Moreover, research has shown that certain molecules can facilitate the disassembly or reshaping of existing molecular assemblies.17,18 These findings resonate with the previously proposed concept of “catassembly” (catalysed-assembly), in which helper species called “catassemblers” accelerate molecular assembly in a manner similar to, but more complicated than, catalysts in chemical reactions.12,19,20 In this review, we summarize these biological examples of catassembly and discuss the underlying mechanisms. Drawing inspiration from nature, we suggest potential design principles and research methodologies for artificial catassembly systems, with the aim of advancing functional, complex molecular assembly.
Most of these molecular assemblies are constructed by self-assembly of building blocks, which has two major steps: first, the rational design of building blocks and the interactions between them; second, the mixing of these building blocks under specific conditions, usually in solution, where they assemble into thermodynamically stable products.9 The resulting molecular assemblies have either a finite number of subunits, such as molecular knots31 and host–guest complexes,32 or an infinite number of subunits, such as self-assembled monolayers33,34 and supramolecular polymers.35–37 Self-assembly is a simple, efficient and highly productive strategy, making it the most widely used approach for molecular assembly. It has led to the development of many functional materials for applications in drug delivery,38 light harvesting,39 electronics,40 and sensing.41,42
First, the processes of complex and hierarchical molecular assemblies are usually controlled by kinetics with many energy barriers in the potential energy surface, and easily get stuck in some intermediate states. For example, the autonomous assembly of DNA and RNA is often impeded by high energy barriers; therefore annealing or a “catalyst” is required to allow the system to cross the barrier.26,43,44
Second, from the perspective of thermodynamics, the autonomous molecular assembly can only occur when the change in Gibbs free energy is negative. However, many molecular assembly processes involve a positive change in Gibbs free energy and require energy input. Such examples are abundant in living organisms. For instance, the assembly of microtubules requires the consumption of GTP.45
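The thermodynamic criterion in this paragraph can be written compactly. The relations below are standard textbook thermodynamics rather than formulas taken from the cited work, and the subscripts "asm" (assembly) and "fuel" (for example, GTP hydrolysis) are labels assumed here for illustration.

```latex
% Requires amsmath. Subscripts "asm" and "fuel" are illustrative labels.
\begin{align}
\Delta G_{\mathrm{asm}} &= \Delta H_{\mathrm{asm}} - T\,\Delta S_{\mathrm{asm}} < 0
  && \text{(autonomous assembly)} \\
\Delta G_{\mathrm{asm}} + \Delta G_{\mathrm{fuel}} &< 0,
  \qquad \Delta G_{\mathrm{asm}} > 0,\ \ \Delta G_{\mathrm{fuel}} < 0
  && \text{(fuel-coupled assembly)}
\end{align}
```

In the second case the assembly step alone is unfavourable, and it proceeds only because it is coupled to a sufficiently exergonic process.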
Third, the construction of hierarchical molecular assemblies with multiple components involved poses significant challenges for self-assembly. When the assembly occurs in a single pot, undesired interactions between subunits from different modules can result in incorrect assemblies. To prevent such promiscuous interactions, the design of suitable subunits with orthogonal interactions is necessary. However, that could be extremely difficult, especially in multicomponent assembly systems. As illustrated in Fig. 2, it is extremely hard to obtain multicomponent, hierarchical molecular assemblies solely by self-assembly of building blocks, and almost impossible to achieve the emergence of advanced properties.
To overcome the limitations of self-assembly, alternative strategies have been proposed, such as assisted assembly,8,12 controlled self-assembly,46 and multistep assembly, expanding the toolbox for molecular assembly.13 Assisted assembly employs external factors to regulate and control the assembly process. External assistance can be classified into two categories: physical and chemical ones. Physical assistance includes methods such as heating, stirring, ultrasonication, light, electrical field, and magnetic field. Chemical assistance includes methods such as adding or removing solvents, enzymes, templates, and chiral centres. These assistance strategies are devised and employed both deliberately and inadvertently. For a particular assembly process, multiple types of assistance can be coordinated to create a more efficient and effective assembly process. Nevertheless, most researchers still habitually refer to these assisted methods as “assisted self-assembly”. This term is quite confusing and fails to clearly distinguish the relative contribution of self-assembly versus assisted assembly.12
Unlike self-assembly, which relies on pre-existing components spontaneously arranging into assemblies, catassembly employs catassemblers to actively drive and direct the assembly process (Fig. 2). Analogous to catalysis in covalent synthesis, catassembly enhances controllability over the assembly system.12 Compared to self-assembly and other assisted assembly strategies, catassembly offers several key advantages. First, the catassembler can change the potential energy surface to lower the energy barrier and speed up the assembly process (Movie S1 in the ESI‡). Second, the catassembler is reusable, allowing for the assembly of multiple subunits with just one equivalent of the catassembler. Third, catassembly is expected to be highly controllable. A catassembler can select specific subunits and guide their assembly in a multi-component system. In an assembly system with multiple pathways, a catassembler can direct a specific pathway leading to the target product. Moreover, using different catassemblers with the same subunits may result in products with different structures. Collectively, these advantages make catassembly a promising strategy for more efficient and controllable molecular assembly.
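The first advantage, a lowered energy barrier, can be given a rough quantitative feel. The sketch below uses the Eyring equation from transition-state theory with a transmission coefficient of 1; applying it to a single effective barrier is a simplifying assumption for multistep assembly, and the function names and barrier values are hypothetical, chosen only for illustration.

```python
import math

R = 8.314462618      # gas constant, J/(mol*K)
K_B = 1.380649e-23   # Boltzmann constant, J/K
H = 6.62607015e-34   # Planck constant, J*s


def eyring_rate(barrier_kj_per_mol, temperature_k=298.15):
    """First-order rate constant (1/s) from the Eyring equation.

    Assumes a single effective barrier and a transmission coefficient of 1.
    """
    prefactor = K_B * temperature_k / H
    return prefactor * math.exp(-barrier_kj_per_mol * 1e3 / (R * temperature_k))


def acceleration(barrier_without, barrier_with, temperature_k=298.15):
    """Rate ratio k_with / k_without for two barrier heights given in kJ/mol."""
    delta_j_per_mol = (barrier_without - barrier_with) * 1e3
    return math.exp(delta_j_per_mol / (R * temperature_k))


if __name__ == "__main__":
    # Hypothetical barriers: 100 kJ/mol unassisted, 80 kJ/mol with a catassembler.
    print(f"rate enhancement: {acceleration(100.0, 80.0):.0f}x")
```

Under these assumed numbers, lowering the barrier by 20 kJ/mol at room temperature speeds the step up by a factor on the order of 10^3, which illustrates why modest changes to the potential energy surface can decide whether an assembly pathway is accessible.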
In the current state of the art of the field of molecular assembly, a few examples of catassembly have been reported.47–50 In contrast, catassembly has been widely found to play vital roles in living organisms, especially in the construction of many essential biomacromolecular complexes.15,16 In the following section, we will present several typical biological catassembly cases as examples.
Molecular chaperones assist proteins in folding correctly and do not remain in the final structures. They can be divided into two categories: ATP-independent and ATP-dependent.51,53 Most ATP-independent chaperones act as “holdases”, binding to proteins and protecting them from forming aggregates.53 ATP-dependent chaperones include GroEL–GroES, Hsp70, and Hsp90, which can all assist the folding of proteins when fuelled by ATP.51
GroEL–GroES (Hsp60–Hsp10) is the most well-known ATP-dependent chaperone (Fig. 3).54–56 Through biochemical and structural biology methods, the detailed working mechanisms have been unveiled. Newly synthesized peptides are unfolded and have many hydrophobic residues exposed. GroEL with ADP bound has a hydrophobic cavity that can accommodate these newly synthesized hydrophobic peptides. After the allosteric binding of ATP and GroES to GroEL, the cavity changes from hydrophobic to hydrophilic, providing an optimal environment for protein folding and exposing its hydrophilic surface. Afterwards, ATP is hydrolysed to ADP, causing the cavity to change back to hydrophobic and therefore the dissociation of GroES. The folded protein is then released, and the catassembler GroEL–GroES starts a new cycle. This process is a typical example of catassembly. With the assistance of catassemblers, the folding pathway leading to the native state is favoured, and the pathways leading to aggregates are hindered.
It has been shown that prokaryotes have one chaperone network to play the roles of assisting the newly synthesized peptide to fold and stress-denatured protein to refold, whereas eukaryotes have two distinct chaperone networks: one is linked with the translation process and has a function in de novo protein folding, and the other is stress-inducible and has a role in stress-denatured protein refolding.57 For example, in S. cerevisiae, the CLIPS chaperone network containing SSB1/2 has been identified to function in de novo protein folding, whereas the Hsp chaperone network containing Hsp82 and SSA-family members functions in the refolding of stress-denatured proteins.57 That means molecular chaperones can function in both “good” and “bad” times for cells.58
Currently, four chaperone–usher machineries have been identified for different types of pili.63 They share similar catassembly mechanisms (Fig. 4B).63 The chaperone molecule binds the subunit and helps to fold it, preventing the formation of aggregates. The chaperone then delivers the subunit to the usher, a pore protein located on the outer membrane. With the assistance of the usher, the subunit on the usher replaces the chaperone because the interactions between subunits are stronger than those between the subunit and the chaperone. This displacement leads to the assembly of a new subunit to the pili and the release of the chaperone. The chaperone then diffuses to the inner membrane to fetch another subunit and begin a new assembly cycle (Fig. 4B). Through this mechanism, the pili grow into an organelle that contains thousands of subunits outside the membrane. Without the protection and assistance of catassemblers, only aggregates will be obtained.
When RbcL and RbcS from plants are expressed in E. coli, no active RuBisCO complexes are formed, only random aggregates.66 That means these building blocks cannot form the RuBisCO complex by self-assembly alone. Recent research found that the assembly of RuBisCO is a typical example of catassembly that needs the participation of a series of catassemblers. Only when several chaperones are co-expressed can a fully active RuBisCO complex be assembled.67–70
With the determination of the structures of a series of intermediates, the detailed catassembly mechanisms are revealed (Fig. 5).65,69 Catassemblers Raf1 or RbcX can bind and stabilize the RbcL subunits, facilitating the formation of either RbcL₂Raf1₂ or RbcL₂RbcX₄ intermediates. Without these catassemblers, the formation of the intermediates is not possible, and the subsequent recognition of other subunits cannot occur. Following their formation, these intermediates further assemble into the complex of RbcL₈Raf1₈ or RbcL₈RbcX₁₆. The binding sites of RbcS to RbcL are different from those of Raf1 and RbcX. The binding of RbcS to RbcL₈Raf1₈ or RbcL₈RbcX₁₆ will finally form RuBisCO and induce structural changes in RbcL, leading to the allosteric release of Raf1 or RbcX. The catassemblers Raf1 or RbcX will then assist in the assembly of other subunits. Throughout the process, the catassemblers function as indispensable components of the intermediates. Without them, the aggregate pathways rather than the correct assembly pathway would be chosen.
Chromatin is one of the most important molecular assemblies in cells, and its assembly occurs primarily through catassembly. Each nucleosome contains two copies of histones H2A, H2B, H3, and H4. However, direct mixing of histone octamers with DNA in vitro yields precipitates rather than chromatin, owing to promiscuous and irregular histone–DNA and histone–histone interactions. In 1978, Laskey et al. first discovered that the addition of Nap1 enables chromatin formation, and Nap1 is not present in established chromatin.76 Nap1 is the first molecular chaperone ever discovered.
Since then, more than 30 histone chaperones have been identified.77–79 In the current model, histone chaperones CAF1, ASF1, FACT, and MCM2 work synergistically to form an assembly line for building new chromatin after DNA replication. They cooperate to recycle the old histones, deliver new histones to the replication fork, and deposit them into the new chromatin (Fig. 7). CAF1 and ASF1 function in the assembly of new histones, whereas MCM2 and FACT work in the recycling of old histones. Most histone chaperones have no similarities in sequence but share the ability to bind histones. This binding shields the positively charged histones and protects them from forming improper interactions with DNA and other proteins. The histone chaperones then deliver the histones and assist their assembly with DNA to form nucleosomes.
Histone chaperones are not permanent components of chromatin, but rather play a role in building, restoring, and adjusting chromatin structures. Some histone chaperones, such as Nap1 and Hsp90, can bind to all histones, while others show specific selectivity for certain histones. For example, CAF1 only selects the H3.1–H4 tetramer, while HIRA can only bind to the H3.3–H4 tetramer.80,81 This selectivity means that a specific chaperone can build the nucleosome with a specific structure, which in turn affects gene transcription.
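Process | Products | Catassemblers | Mechanisms |
---|---|---|---|
Assembly | Chaperone–usher pili62,63 | PapD–PapH, FimC–FimD, Caf1M–Caf1A, CfaA–CfaC | Protect unstable subunits, deliver subunits, assist assembly |
Assembly | Type IV pili60 | PilQ, TcpC, PilM, PilN, PilO, PilP, TcpR, TcpD, TcpS, PilC, PilG, TcpE | Protect unstable subunits, deliver subunits, assist assembly |
Assembly | Chromatin78 | Nap1, ASF1, MCM2, CAF1, NASP, HJURP, SET, NCL, HIRA, FACT, DAXX–ATRX, etc. | Protect unstable subunits, deliver subunits, assist assembly |
Assembly | Phage tail83–85 | gp57a, gp38, gpG, gpGT, gp17.5, gp17.5* | Protect unstable subunits, deliver subunits, assist assembly |
Assembly | MHC-I86,87 | Tapasin, TAPBPR | Protect unstable subunits, deliver subunits, assist assembly |
Assembly | Myosin88,89 | UNC-45 | Protect unstable subunits, deliver subunits, assist assembly |
Assembly | Outer membrane proteins90,91 | BAM complex | Protect unstable subunits, deliver subunits, assist assembly |
Assembly | RuBisCO65 | Cpn60, Cpn20, RbcX, Raf1, Raf2, BSD2 | Facilitate the formation of key intermediates |
Assembly | Proteasome82,92,93 | pba1, pba2, pba3, pba4, ump1 | Facilitate the formation of key intermediates |
Assembly | HIV94,95 | IP6 | Facilitate the formation of key intermediates |
Assembly | snRNPs96 | SMN complex | Facilitate the formation of key intermediates |
Assembly | N2OR73 | NosL, NosD | ATP-fuelled assembly/folding |
Assembly | Type IV pili60 | PilB, PilF, TcpT | ATP-fuelled assembly/folding |
Folding | Protein51,56 | GroEL–GroES, Hsp90, Hsp70 | ATP-fuelled assembly/folding |
Folding | RNA97,98 | DEAD-box proteins | ATP-fuelled assembly/folding |
Folding | Protein53,99 | Spy, SurA, Skp, SecB, Hsp27, ttHsp40, PAT | Binding induced folding |
Folding | RNA98 | Hfq, CspA, NCp | Binding induced folding |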
Different types of catassemblers can have different roles in the assembly process. Some may act passively to avoid incorrect subunit–subunit interactions, while others play an active role. The binding of catassemblers to subunits may be allosteric or use the same binding sites as other subunits. As mentioned above, catassemblers may function through three mechanisms. First, catassemblers can bind and protect the immature subunit. Second, catassemblers are essential for the formation of key intermediates. Third, catassemblers may act as “workers” on the assembly line fuelled by external energy.
One catassembler can have multiple functions. For example, a histone chaperone can either protect histones or function as a “worker” on the assembly line. Raf1 can both stabilize RbcL and participate in key intermediates.65 On the other hand, the formation of biomacromolecular complexes might need the cooperation of multiple catassemblers. For instance, the assembly of the proteasome requires the coordination of 10 chaperones.82
One may wonder why such complex mechanisms are necessary, and why self-assembly is insufficient in these cases. The complexity of the subunits, the assembly pathways, and the environment may provide some clues.
First, the initial structures of the subunits are unstable, and they do not adopt their final stable conformations until they are incorporated into the complexes. Therefore, catassemblers are needed to stabilize the subunits and make them compatible for assembly. Otherwise, the self-assembly of these immature subunits may lead to unwanted products. For example, the structure of RbcL is unstable until it binds to the catassemblers, and it adopts its final conformation only when the RuBisCO complex is formed.65
Second, the biological assembly process involves numerous subunits and complex assembly pathways, in which specific subunits must be added at the right time and position. Such complex processes cannot be achieved solely by self-assembly. The catassemblers act like protecting groups in covalent synthesis, ensuring that the correct building blocks are added to the complex in the right order to give the correct final product. For instance, in the catassembly of the proteasome, catassemblers pba1–4 can bind to the intermediates of the core particle and prevent them from interacting with another module called the lid before the core particle matures.82
Finally, the assembly process of biological macromolecules takes place in a complex and crowded environment, where they are exposed to other irrelevant molecules and subunits. These surrounding molecules or subunits may interfere with the interactions between the intended subunits. In such complex conditions, self-assembly alone cannot achieve targeted molecular assembly with desired functions, and catassemblers are required. For example, chaperonin molecules like GroEL–GroES can isolate the newly synthesized peptides from the complex environment and allow them to fold inside their cavity.51
Although these processes are not typical assembly processes, they share the same physico-chemical nature in the formation, disruption, and rearrangement of noncovalent interactions by catassemblers. These catassemblers bring indispensable controllability and new functions to the assembly. Without them, these complicated processes cannot be achieved. Accordingly, they could be called noncanonical catassemblers, and the corresponding processes can be called catalysed-disassembly, catalysed-reassembly or noncanonical catassembly.
In cells, the degradation of proteins occurs through the ubiquitin–proteasome system. Unwanted proteins are tagged with ubiquitin and caged into the proteasome for degradation. However, some stable proteins can only be degraded once they are unfolded. The unfolding breaks noncovalent interactions and costs energy. In S. cerevisiae, Cdc48 has been found to collaborate with Ufd1 and Npl4 to unfold proteins fuelled by ATP (Fig. 8).17,100 Cdc48 is a hexameric complex that has a double-ring structure with a pore at the centre. The ubiquitin attached to the protein to be unfolded is first recognized by Cdc48. The Cdc48 hexamer has 12 ATPase domains, and the hydrolysis of ATP makes the ring structure act as a “conveyor belt” that drags the ubiquitin inside the ring. The continuous rotation of Cdc48 makes the protein pass through the small pore and unfold. The unfolded protein is then released and transferred to the proteasome for further degradation. Therefore, Cdc48 is a noncanonical catassembler for catalysed-disassembly.
Fig. 8 Cdc48 and Ufd1–Npl4 can function as noncanonical catassemblers that unfold proteins by continuous rotation fuelled by ATP.
The remodeler-assisted sliding of the histone octamer on DNA is a typical example of ATP-fuelled reshaping of assemblies.18,101,102 There are four families of remodelers, ISWI, SWI/SNF, CHD, and INO80, all of which share an ATPase domain and can remodel nucleosomes. One important function of the remodelers is to make histone octamers slide on DNA. This process consumes ATP and may have two effects on chromatin: first, it makes the nucleosome-occupied DNA available and creates space for the binding of other factors; second, sliding of the nucleosome in a condensing direction compacts chromatin (Fig. 9).
Fig. 9 A remodeler is a noncanonical catassembler that can facilitate the sliding of the histone octamer on DNA, which reshapes the assembly state of chromatin, e.g., makes the chromatin much more compact.
The structures of a series of remodelers have been solved, revealing a possible shared twist-based mechanism for their actions.102,103 The target nucleosome is bound by a remodeler in an open state. The binding of ATP to the remodeler changes it to a closed state and induces a twist defect at the entry site of DNA, corresponding to a slight disassembly. Upon hydrolysis of ATP, the remodeler returns to the open state, and the twist defect is transferred to the exit side of DNA, accompanied by a reassembly process. Each cycle of this action results in the sliding of the histone octamer by 1 bp of DNA. Through this elaborate mechanism, catassemblers reshape the chromatin structures.
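The open/closed cycle described above can be pictured as a small state machine. The sketch below is a deliberately coarse toy model, not a representation of any published simulation: the class name, field names, and the strict one-ATP-per-base-pair bookkeeping are assumptions made here for illustration.

```python
from dataclasses import dataclass


@dataclass
class Remodeler:
    """Toy two-state remodeler bound to one nucleosome (illustrative names)."""
    state: str = "open"            # "open" or "closed"
    twist_defect: bool = False     # True while a defect sits at the DNA entry side
    octamer_position_bp: int = 0   # net sliding of the histone octamer, in bp
    atp_consumed: int = 0

    def bind_atp(self):
        # ATP binding closes the remodeler and creates the twist defect
        # (a slight local disassembly).
        if self.state != "open":
            raise RuntimeError("ATP binds only to the open state in this sketch")
        self.state = "closed"
        self.twist_defect = True

    def hydrolyse_atp(self):
        # Hydrolysis reopens the remodeler; the defect propagates to the exit
        # side and is resolved (reassembly), shifting the octamer by 1 bp.
        if self.state != "closed":
            raise RuntimeError("no bound ATP to hydrolyse")
        self.state = "open"
        self.twist_defect = False
        self.octamer_position_bp += 1
        self.atp_consumed += 1


def slide(remodeler, cycles):
    """Run the given number of ATP cycles and return the net displacement in bp."""
    for _ in range(cycles):
        remodeler.bind_atp()
        remodeler.hydrolyse_atp()
    return remodeler.octamer_position_bp
```

The point of the sketch is the bookkeeping: the remodeler returns to its starting state after every cycle, as a catalyst would, while the assembly it acts on ends up in a different configuration.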
The structures of both INO80–nucleosome and SWR1–nucleosome have been resolved, revealing part of the mechanisms involved.106–108 INO80 binding to the nucleosome disassembles 15 bp DNA at SHL6 and disrupts DNA–H2A interactions. SWR1 binding to the nucleosome is similar to other remodelers, as it distorts 1 bp DNA and changes the nucleosome structure. The binding of both INO80 and SWR1 destabilizes the nucleosomes and speeds up the replacement of subunits. However, detailed mechanisms for removal and addition are still unclear.
RecA is an important protein for the homologous recombination repair of DNA.109,110 As shown in Fig. 10B, RecA acts as a noncanonical catassembler to accelerate DNA strand exchange. RecA binds to ssDNA with higher affinity when ATP is present. When the ATP–RecA–ssDNA complex encounters dsDNA with the same sequence, a complex with a triple helix is formed. The invading strand gradually displaces the original strand. After the hydrolysis of ATP, RecA and the original strand dissociate.
Fig. 11 During the migration of RNA polymerase on DNA, the nucleosomes form a natural hurdle. Single-molecule studies have shown that FACT acts as a noncanonical catassembler to disassemble and reassemble nucleosomes during this process. Reproduced with permission from Ref. 112. Copyright 2018, Elsevier.
Chen et al. used single-molecule magnetic tweezers to show that FACT can facilitate both the disassembly and reassembly of nucleosomes.112,113 When bound by FACT, the nucleosome can be unfolded at a lower force (5 pN) than the 10–18 pN typically required. More importantly, FACT preserves the integrity of the nucleosome, preventing its destruction during stretching. These results demonstrate that FACT is a unique catassembler that can loosen the nucleosome while maintaining its integrity, indicating that it can facilitate the passage of RNA polymerase. Further research using cryo-TEM confirms that FACT can facilitate nucleosome disassembly in front of the replication fork and restoration after the replication fork.114–116
Catassemblers | Functions | Mechanisms |
---|---|---|
Cdc48, Ufd1, Npl417,100 | Unfolding protein | ATP-fuelled unfolding |
PilT60 | Disassembly | ATP-fuelled disassembly |
Chromatin remodeler18,103 | Reshaping nucleosome | ATP-fuelled reshaping |
INO80, SWR1106–108 | Histone exchange | ATP-fuelled subunit replacement |
RecA109,110 | DNA strand displacement | ATP-fuelled subunit replacement |
FACT111,112 | Hold nucleosome during transcription | Hold subunits |
Fig. 12 The advancement of molecular assembly requires concerted efforts in the development of novel subunits, assembly methods, characterization and theoretical methods.
As discussed above, catassembly is a widely existing strategy for complex molecular assembly in living organisms. It has demonstrated high efficiency in selecting assembly pathways, controlling assembly with multiple subunits, and modulating the functionality of complex assemblies. However, only a few examples of artificial catassembly have been reported. In the following section, we propose possible models for artificial catassembly, highlight some early works, and discuss the challenges in system design, characterization techniques, and theoretical approaches. We also anticipate the potential applications of catassembly in constructing functional complex assemblies.
In many biological molecular assemblies, the building blocks of complex structures are sometimes initially unstable and require protection until the final complex forms. When a similar issue arises in artificial assembly systems, catassemblers can be designed or screened to stabilize unstable subunits.
Therefore, in the first suggested model (Fig. 13A), the catassembler prevents subunit A from self-associating through irregular interactions and promotes the accurate binding of A and B. Subunit B then replaces the catassembler and releases it from the complex.
Catassemblers can act as activators for molecular assembly in the second model (Fig. 13B). The direct assembly of subunits A and B is hindered due to the concealment of their interaction sites. The introduction of a catassembler initiates the assembly process. The catassembler binds to subunit A in an allosteric manner, thereby exposing the binding sites on subunit A. Subsequently, the activated subunit A interacts with subunit B, forming the complex AB. The binding of A and B weakens the interactions between A and the catassembler, resulting in the release of the catassembler. The catassembler then proceeds to locate a new subunit A and starts a new cycle. This system is a responsive molecular assembly system. It can also be designed as a sensing system, where molecules to be detected act as catassemblers.
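The cycle of the second model can be sketched as a minimal mass-action scheme: A + C ⇌ AC (activation), AC + B → AB + C (assembly with release of the catassembler), plus a slow direct route A + B → AB. This is an illustrative simplification rather than a model from the literature; the function name, the rate constants, and the explicit Euler integration are all assumptions chosen only to show catalytic turnover.

```python
def simulate(a0, b0, c0, k_on=1.0, k_off=0.1, k_asm=1.0, k_uncat=1e-4,
             dt=0.01, steps=20000):
    """Integrate the toy catassembly scheme with explicit Euler steps.

    Concentrations and rate constants are in arbitrary, hypothetical units.
    Returns the final concentrations of A, B, C (free catassembler),
    AC (activated complex) and AB (product).
    """
    a, b, c, ac, ab = a0, b0, c0, 0.0, 0.0
    for _ in range(steps):
        bind = k_on * a * c          # A + C -> AC
        unbind = k_off * ac          # AC -> A + C
        assemble = k_asm * ac * b    # AC + B -> AB + C (catassembler released)
        direct = k_uncat * a * b     # slow uncatalysed A + B -> AB
        a += dt * (-bind + unbind - direct)
        b += dt * (-assemble - direct)
        c += dt * (-bind + unbind + assemble)
        ac += dt * (bind - unbind - assemble)
        ab += dt * (assemble + direct)
    return {"A": a, "B": b, "C": c, "AC": ac, "AB": ab}


if __name__ == "__main__":
    with_cat = simulate(a0=1.0, b0=1.0, c0=0.05)
    without_cat = simulate(a0=1.0, b0=1.0, c0=0.0)
    print("AB with catassembler:   ", round(with_cat["AB"], 3))
    print("AB without catassembler:", round(without_cat["AB"], 3))
```

With these assumed parameters, a catassembler present at 5% of the subunit concentration converts most of A and B into AB within the simulated time, whereas the uncatalysed route yields very little product. Because the total amount of catassembler (C + AC) is conserved, each catassembler molecule must turn over many times, which is the reuse property the model relies on.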
Furthermore, catassembly can be powered by external energy in the third model. This requires that the catassemblers be responsive to energy inputs. The energy inputs could be in the form of chemical energy, light, electrical energy, etc. Assuming a catassembler exists in a deactivated state, as shown in Fig. 13C, the energy input can activate the catassembler, initiating a catassembly cycle like Fig. 13B. After the assembly finishes, the deactivated catassembler is released from the product and can be reactivated by further energy input.
It should be noted that catassemblers differ from co-assembly components in that they do not become a part of the final product and can be reused, making catassembly more efficient. The leaving and recycling of catassemblers are crucial in this process. We propose four possible leaving and recycling mechanisms for catassemblers. (1) The catassembler shares the binding site with other subunits and can be replaced by them at the end of the assembly process (Fig. 13A). (2) The completion of the assembly weakens the binding between the catassembler and the subunits, eventually causing the catassembler to detach (Fig. 13B). (3) The catassembler and the final assembly product undergo phase separation (for instance, precipitation), with the product separated into another phase and the catassembler remaining with residual subunits. (4) The recycling of the catassembler is powered by energy input.
In the construction of complex molecular assemblies involving multiple components and hierarchical structures, catassembly is increasingly necessary. Catassemblers can protect the subunits from undesired interactions, much like protecting groups in covalent synthesis. They can be removed at the appropriate time once the target intermediates or products are formed. All in all, catassemblers ensure that specific subunits join the assembly at the right time and in the right manner, making it possible to assemble complex molecular architectures in a single pot like a single cell.
One important aspect of molecular assembly that has long been overlooked is mass transport. In biological science, the term “recruitment” is often used to describe the result of mass transport of specific species, but not the process itself. Mass transport within cells is distinct from that in artificial assembly systems, which are mainly in dilute solution. Cells are compartmentalized into various organelles and membrane-less organelles, and are loaded with other biomacromolecules, glucose, and salts, making the interior of cells a crowded environment.119–121 Molecular assembly relying on molecular diffusion and/or Brownian motion is inefficient in such a crowded environment.122
Many biological catassemblers can deliver subunits effectively. For example, in the case of pili assembly, catassemblers can deliver subunits from the inner membrane to usher proteins on the outer membrane.63 Many histone chaperones have also been reported as carriers that deliver histones to replication forks or other target positions.78 Many “transporter” proteins for RNA have also been found.123 In this way, catassemblers can assist molecular assembly by accelerating mass transport. However, due to the lack of appropriate techniques, this function of catassemblers is poorly understood and deserves more attention.
By learning from biological assembly and proposing three preliminary models of catassembly, one can gain some initial insights into how to advance artificial assembly by introducing catassembly. The more complex the assembly, the greater the demand for using catassembly, and the more types of catassemblers are required.
Currently, when self-assembly fails to produce desired products, researchers tend to redesign the structure of subunits rather than investigate the assembly process to identify the cause of failure. By studying the assembly pathways, kinetics, and mechanisms, we can gain a deeper and broader understanding of the construction and function of molecular assembly. Catassembly systems are usually highly complex, featuring hierarchical structures, multi-site long-range interactions, and feedback networks. These features make the investigation of the catassembly mechanism challenging. To reveal the physicochemical landscape of these systems, characterization techniques with high temporal, spatial, and energy resolution, as well as theoretical approaches, are highly anticipated.
Due to the weak interactions involved in catassembly, characterization methods with high energy resolution are required to monitor the formation and breakage of non-covalent interactions from the atomic level to the molecular level and the macromolecular level. Each level of the catassembly process requires different characterization methods. At the atomic level, nuclear magnetic resonance (NMR) is a high-resolution, in situ method with various hardware innovations and pulse sequences that have been widely applied to study biological systems.124,125 At the molecular level, spectroscopic methods based on vibrational and electronic effects, for instance fluorescence, circular dichroism, Raman, and UV-Vis spectroscopy, can provide valuable information. At the macromolecular level, characterization methods mainly involve the use of light, magnetism, and microscopes, such as optical or magnetic tweezers.126,127 Some characterization methods, for instance mass spectrometry, can provide information about assembly processes at multiple levels. The choice of the characterization method depends on the information desired. For example, single-molecule magnetic tweezers can measure the force inside biological molecular complexes at a resolution of 1 pN, and have been used in the study of catassembly.112,128–130
In addition, time resolution is also an important aspect of characterization techniques, especially for the study of kinetics. Catassembly typically features multiple pathways and complex kinetic processes. During the assembly process, different pathways create transient intermediates with low content and complex structures. These features of intermediates demand characterization methods that can deal with complex samples. To reveal the details of the process, characterization techniques with both high structural resolution and high time resolution are needed. NMR has been applied to characterize the transient, high-energy state complex of a protein ensemble whose content is as low as 0.5% of the protein population.131 Moreover, NMR can reveal information on spatial distance and dynamics of intermediates, which is important for structural elucidation and mechanistic explanation. The combination of microfluidic and NMR techniques significantly improves the time resolution of the NMR method, enabling NMR to study the fast kinetic processes of supramolecular systems with high structural and time resolution.132–134 Cryo-TEM combined with advanced particle-picking methods provides insights into key intermediates of biomacromolecular complexes.135 Moreover, time-resolved cryo-TEM has also been developed to study the dynamic process of biological macromolecules.136–138 Other techniques, including mass spectrometry, stopped-flow spectroscopy, and single-molecule fluorescence techniques, can also be helpful.
Complex catassembly systems usually present the following characteristics: multiple components, multiple interactions at multiple sites, coupling of short- and long-range interactions, complex assembly pathways, and processes that are entropy-driven or in which entropy and enthalpy complement each other.139,140 In particular, catassembly systems in living organisms are more complex and always operate on larger temporal and spatial scales than chemical reactions.141 All these features pose great challenges to the current theoretical foundations of catassembly and to computational science. To address these challenges, developing new physicochemical and computing methods is crucial for the theoretical study of catassembly.
The theoretical framework of physical chemistry should emphasize the synergy and feedback of non-covalent interactions in multiple sites, which often leads to nonlinear changes in key physicochemical parameters in catassembly processes. In addition, mass and energy transfer in catassembly processes might occur at the same temporal and spatial scales, especially in living systems. Therefore, current reaction-diffusion models should be improved to explain the coupling mechanism of mass transfer and catassembly processes. In particular, special mass transfer methods, such as active transport122 and laminar and turbulent flow in microfluidics,142,143 might provide new insights into the design of catassembly systems coupled with mass transfer processes. More importantly, most catassembly systems in living organisms undergo non-equilibrium pathways and often reside in a complex crowded environment, requiring a breakthrough in the non-equilibrium statistical theory to fully understand these systems.144,145
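As a point of reference for the coupling discussed above, the following is a minimal sketch of a conventional one-dimensional reaction-diffusion model, i.e. the kind of baseline the text argues needs to be extended. It assumes a single subunit species that diffuses freely and is consumed by first-order assembly at one fixed site. The function name, the grid, and all parameter values are hypothetical and chosen only for illustration.

```python
def simulate_reaction_diffusion(diffusion=1.0, k_assembly=5.0, n_cells=50,
                                length=10.0, dt=0.005, t_end=5.0):
    """Explicit finite-difference sketch: subunits diffuse along a line and
    are consumed by first-order assembly in cell 0 (the assembly site).

    All parameters are illustrative assumptions, in arbitrary units.
    Returns the final concentration profile and the assembled amount.
    """
    dx = length / n_cells
    ratio = diffusion * dt / dx ** 2
    if ratio > 0.5:
        raise ValueError("time step too large for a stable explicit scheme")

    u = [1.0] * n_cells   # free subunit concentration in each cell
    product = 0.0         # total amount assembled at the site

    for _ in range(int(round(t_end / dt))):
        new_u = u[:]
        for i in range(n_cells):
            # Reflecting (no-flux) boundaries at both ends of the domain.
            left = u[i - 1] if i > 0 else u[i]
            right = u[i + 1] if i < n_cells - 1 else u[i]
            new_u[i] += ratio * (left - 2.0 * u[i] + right)
        # Assembly removes free subunits only at the assembly site.
        consumed = k_assembly * u[0] * dt
        new_u[0] -= consumed
        product += consumed * dx
        u = new_u
    return u, product


if __name__ == "__main__":
    for d in (0.1, 1.0):
        _, assembled = simulate_reaction_diffusion(diffusion=d)
        print(f"D = {d}: assembled amount = {assembled:.3f}")
```

In this toy model the assembled amount at a fixed time grows with the diffusion coefficient, which is one simple way to see that assembly at a localized site can be limited by mass transport rather than by the binding step itself. Crowding, active transport, and non-equilibrium driving are deliberately left out.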
From the perspective of computing methods, developing new density functional approximations is vital for analysing the synergy of non-covalent interactions.146,147 Furthermore, the integration of quantum mechanics, all-atom, and coarse-grained models into a theoretical framework is necessary to construct a multi-scale model with both precision and efficiency.148,149 This model might further be connected with kinetic Monte Carlo and grand-canonical Monte Carlo methods to search for the structure of active intermediates at different scales.150–152
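To make the kinetic Monte Carlo idea concrete, here is a minimal sketch of a Gillespie-type stochastic simulation for a single reversible non-covalent association, A + B ⇌ AB. It is far simpler than the multi-scale model envisioned above and is not a structure-search method. The function name, copy numbers, and rate constants are assumptions made for illustration only.

```python
import math
import random


def gillespie_binding(n_a=100, n_b=100, k_on=0.01, k_off=0.1,
                      t_end=50.0, seed=1):
    """Gillespie-type kinetic Monte Carlo for A + B <-> AB.

    Copy numbers and rate constants are hypothetical (arbitrary units).
    Returns the final copy numbers (n_a, n_b, n_ab).
    """
    rng = random.Random(seed)
    t = 0.0
    n_ab = 0
    while True:
        rate_on = k_on * n_a * n_b    # propensity of association
        rate_off = k_off * n_ab       # propensity of dissociation
        total = rate_on + rate_off
        if total == 0.0:
            break
        # Waiting time to the next event is exponentially distributed.
        t += -math.log(1.0 - rng.random()) / total
        if t > t_end:
            break
        # Choose which event fires, weighted by its propensity.
        if rng.random() * total < rate_on:
            n_a, n_b, n_ab = n_a - 1, n_b - 1, n_ab + 1
        else:
            n_a, n_b, n_ab = n_a + 1, n_b + 1, n_ab - 1
    return n_a, n_b, n_ab


if __name__ == "__main__":
    print(gillespie_binding())
```

With these illustrative parameters the deterministic equilibrium lies near 73 bound pairs, and individual stochastic runs fluctuate around that value. Extending such a scheme to many species, sites, and scales is where the coupling with quantum-mechanical, all-atom, and coarse-grained models would come in.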
The most abundant examples of artificial catassemblies come from the realm of DNA assembly. Most DNAs are assembled using annealing (self-assembly) to obtain desired products.26,43 Turberfield et al. were the first to demonstrate the isothermal formation of the DNA structure through the catassembly of hairpin subunits.48 The hairpin structure contains two domains: the loop domain, whose hybridization is fuel for the assembly; and the stem domain, which provides an energy barrier for the process.
These catassembly processes rely on the same mechanism. For example, in the work by Yin et al., the hairpin structures of subunits provide a high energy barrier that hinders the interaction between them (Fig. 14A).153 The catassembler strand can bind to subunit A and open the hairpin by strand displacement. This removes the kinetic barrier and exposes the interaction sites, leading to the sequential binding of subunits B and C. Subunit C then competes with the catassembler strand and induces the release of the catassembler strand. The released catassembler can start a new cycle. Through this method, the isothermal assembly of the DNA three-way junction is achieved. The catassembler acts as an activator for the system. This catassembly case was further applied in building DNA tetrahedra, and used as a detection system for other molecules.153–155
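The cycle described above (open subunit A, add B, add C, release the catassembler) can be sketched as a simple mass-action kinetic model. This is only an illustration under stated assumptions: every step is treated as an irreversible bimolecular reaction with the same hypothetical rate constant, the uncatalysed background assembly is set to zero, and the concentrations are not taken from ref. 153.

```python
def simulate_catalytic_assembly(cat0=0.01, sub0=1.0, k=1.0,
                                dt=0.05, t_end=2000.0):
    """Euler integration of a three-step catalytic assembly cycle.

    Steps (all with the same assumed rate constant k):
      A + Cat      -> A.Cat          (hairpin opened by the catassembler)
      A.Cat + B    -> A.B.Cat        (subunit B binds)
      A.B.Cat + C  -> ABC + Cat      (subunit C displaces the catassembler)
    Concentrations are in arbitrary units (hypothetical values).
    Returns the product concentration and the total catassembler.
    """
    sub_a = sub_b = sub_c = sub0
    cat = cat0
    a_cat = ab_cat = product = 0.0
    for _ in range(int(round(t_end / dt))):
        r_open = k * sub_a * cat
        r_grow = k * a_cat * sub_b
        r_release = k * ab_cat * sub_c
        sub_a -= r_open * dt
        sub_b -= r_grow * dt
        sub_c -= r_release * dt
        cat += (r_release - r_open) * dt
        a_cat += (r_open - r_grow) * dt
        ab_cat += (r_grow - r_release) * dt
        product += r_release * dt
    return product, cat + a_cat + ab_cat


if __name__ == "__main__":
    product, cat_total = simulate_catalytic_assembly()
    print(f"product = {product:.3f}, turnovers = {product / 0.01:.0f}")
```

Because the catassembler is regenerated in the last step, its total amount stays constant while the amount of product formed far exceeds it. This turnover is the feature that distinguishes a catassembler from a stoichiometric co-assembly component.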
Fig. 14 Catassemblers to accelerate the assembly process. (A) The catassembly of a three-way junction DNA from hairpin subunits A, B, and C. Reproduced with permission from Ref. 153. Copyright 2008 Springer Nature. (B) The catassembly of the DNA double strand by several steps of strand displacement. Reproduced with permission from Ref. 162. Copyright 2010 Oxford University Press. (C) Electrons act as catassemblers for the recognition of host and guest molecules. Reproduced with permission from Ref. 167. Copyright 2022 Springer Nature.
One system that has been explored is the “hybridization chain reaction” (HCR), which is a process of molecular assembly rather than a reaction.156,157 In this process, two DNA subunits with hairpin-like structures can form supramolecular polymers by self-assembly. However, the long stem of the subunits creates a spatial obstacle that slows down the assembly. To overcome this, an initiator strand is added that opens up one subunit and triggers sequential polymerization of the other subunits. Although the initiator strands remain in the final product, one equivalent of initiator can catalyse the assembly of thousands of subunits. Therefore, the initiator strand acts as a catassembler in the HCR. This strategy has been successfully applied in the amplification of signals,158 construction of hydrogels,159,160 and solving mazes.161
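The stoichiometric argument above can be illustrated with a toy simulation, assuming that hairpins add only to initiated chains, that the two hairpin types (called H1 and H2 here, hypothetical labels) alternate strictly, and that there is no background polymerization. The pool sizes and the number of initiators are arbitrary illustrative values, and the model ignores real kinetics.

```python
import random


def simulate_hcr(n_h1=5000, n_h2=5000, n_initiators=5, seed=0):
    """Toy chain-growth model of the hybridization chain reaction.

    Each initiator opens an H1 hairpin, which exposes a site for H2,
    which exposes a site for H1, and so on. Returns the number of
    hairpins incorporated per initiator.
    """
    rng = random.Random(seed)
    pool = {1: n_h1, 2: n_h2}           # free hairpins of each type
    lengths = [0] * n_initiators        # hairpins per growing chain
    next_type = [1] * n_initiators      # hairpin type each chain needs next
    active = list(range(n_initiators))  # chains that may still grow
    while active:
        chain = rng.choice(active)
        needed = next_type[chain]
        if pool[needed] == 0:
            # The hairpin this chain needs is exhausted: growth stops.
            active.remove(chain)
            continue
        pool[needed] -= 1
        lengths[chain] += 1
        next_type[chain] = 2 if needed == 1 else 1
    return lengths


if __name__ == "__main__":
    lengths = simulate_hcr()
    print("hairpins per initiator:", lengths)
```

In this sketch every hairpin ends up in a chain, so the average number of subunits assembled per initiator is simply the total number of hairpins divided by the number of initiators, which can reach thousands when the initiator is present in small amounts.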
Zhang et al. developed a DNA catassembly process using strand displacement.49,162 In the example shown in Fig. 14B, the assembly of strands a and d is prevented by strands b and c. When the catassembler strand binds to strand a, strand c is released and binding sites for strand d are exposed. Strand d then competes with strand b and the catassembler strand, leading to their gradual release. Finally, the catassembler strand binds to strand a with only a few base pairs and releases autonomously. This strategy has been used in DNA circuits, constructing DNA nanotubes, building multiple enzyme systems and assembly of nanocrystals.163–166
Jiao et al. recently reported that electrons can be used as a catassembler to accelerate the recognition of host and guest molecules (Fig. 14C).167 In their system, both the host and guest molecules are radical cations. Coulombic interactions between cations make molecular recognition kinetically unfavourable. The addition of catalytic amounts of electrons partially reduces the host or guest molecules, lowering the energy barrier and accelerating their assembly. This smart mechanism seems to be a versatile strategy that can be widely used.168
Zhao et al. reported that the molecule TTC4L has poor solubility at pH levels below 7.0 and will self-assemble into amorphous precipitates. However, the addition of β-CD greatly enhances the solubility of TTC4L. Furthermore, the encapsulation of TTC4L by β-CD reduces collisions between TTC4L molecules. Although this slows down the assembly process, it also alters the assembly pathway of TTC4L, resulting in the formation of a micelle structure (Fig. 15A). These micelles eventually form precipitates and release the catassembler β-CD.169
Fig. 15 Catassemblers assist in choosing the assembly pathway. (A) The molecule TTC4L forms microspheres with the assistance of β-CD. (B) The chiral cage CAAA-1 functions as a catassembler for the assembly of right-handed TPPS supramolecular polymers. Reproduced from Ref. 170 with permission from the Royal Society of Chemistry. (C) The catassembler α-CD can help the molecule C4AZO to choose the assembly pathway leading to right-handed supramolecular polymers. Reproduced from Ref. 171 with permission from the American Chemical Society.
Catassemblers can also participate in the assembly of supramolecular polymers to enhance their enantioselectivity. For example, the self-assembly of TPPS molecules occurs slowly and produces both left- and right-handed supramolecular polymers. However, Wang et al. reported that the addition of a chiral cage, CAAA-1, greatly increases the assembly rate of TPPS and results in the formation of only right-handed polymers (Fig. 15B).170 The catassembler interacts strongly with TPPS monomers to regulate their chirality and has a weaker interaction with the polymer, allowing it to be released autonomously. Similarly, Zhi et al. found that α-CD can facilitate the formation of right-handed C4AZO supramolecular polymers. While the self-assembly of C4AZO produces enantiomers, the addition of α-CD accelerates the assembly rate of right-handed polymers and decreases the assembly rate of left-handed polymers, resulting in the right-handed product (Fig. 15C).171
As a linear biological polymer, RNA can misfold into structures with incorrect conformations. In nature, protein chaperones are available to correct misfolded RNA. Recently, we demonstrated that RNA can also function as a chaperone to solve the misfolding problem of RNA (Fig. 16A).172 When assembly proceeds alongside synthesis, RNA can easily form misfolded products due to complex intramolecular interactions. Without assistance, the misfolded product takes several hours to weeks to refold into the native product. With the assistance of a designed RNA strand as a chaperone, the refolding process can be accelerated by about 1000-fold. This research provides evidence that RNA chaperones may exist in the RNA world. Similarly, many kinds of artificial molecular chaperones for proteins have also been developed.173,174
Xie et al. found that cytosine can self-assemble into a glassy state on the surface of Au(111).175 The cytosine molecules mainly form T-junction structures and contain several types of dimers, which are not the most energetically favourable structures. However, when the catassembler, water, is deposited onto the system, the established molecular assemblies reform into extended molecular-chain arrays containing several stable hydrogen-bonded dimers. DFT calculations show that water can break the weak interactions in the glassy state and facilitate reassembly into stable chain structures. Water molecules have low adsorption energy and can easily desorb from the system.
Similarly, Li et al. reported that CO2 can function as a catassembler on surfaces (Fig. 16B).176 On the surface of Au(111), Fe atoms and C3PC molecules can self-assemble into Sierpiński triangle structures, with each Fe atom coordinated with three C3PC molecules. However, this is not the most stable compound that Fe and C3PC can form. The introduction of CO2 to the system disrupts the formed structures and allows them to reassemble into a chain structure, with each Fe atom coordinated with four C3PC molecules. The interaction between CO2 and Fe is weak, so a large dosage is needed. CO2 can easily desorb from the surface and is not present in the final structure.
Living beings are integrated systems featuring multi-component, multi-pathway, and multi-level structures. Due to these “multiples,” there is a significant “difference” between living organisms and artificial molecular assembly systems in terms of complexity and diversity of structures and functions. In living organisms, different molecules serve different purposes and work synergistically without interfering with each other as an integrated smart system. Some molecules are responsible for energy harvesting, others serve as carriers, writers, and erasers for information, some work as sensors, and others are involved in the formation and breakage of covalent bonds. Moreover, the operation and maintenance of life require continuous input and output of matter, energy, and information. As a result, assembly in living organisms cannot depend solely on thermodynamically spontaneous processes, but also relies on other processes such as catassembly and dissipative processes, thus generating higher functions with hierarchical structures. This is a typical manifestation of “more is different”.
The ultimate goal of artificial molecular assembly is to construct molecular systems that can function like living organisms and play a key role in bionics.179–181 From the perspective of complexity science, the whole is greater than the sum of its parts. In current molecular assembly systems, adding controlling components beyond building blocks will greatly increase the system's complexity. This may promote the emergence of novel functions unachievable through self-assembly alone. For example, introducing catassemblers into the system provides extra controllability over when, where and how molecular assembly occurs. These machineries can collectively promote correct noncovalent interactions among subunits, eliminate incorrect interactions, and restore subunits to a correct assembly state.
It can be expected that as artificial molecular assemblies become more complex, catassembly will become an increasingly efficient and important assembly strategy. Catassemblers can function as “builders” constructing functional assemblies; as “sentries” watching and reacting to external signals; as “transporters” assisting in mass transport; as “coordinators” controlling systems spatiotemporally; or even as “doctors” repairing damaged parts. For example, in the tower bridge analogy depicted in Fig. 17, “builders” can build complex and functional assembly systems, and “coordinators” can reshape the structure of the tower bridge, switch its function when a ship (external signal) approaches, and restore the structure and function after that. Overall, catassemblers can play multiple indispensable roles in a well-run molecular assembly system with high complexity.
The study of catassembly may also initiate reforms in conventional catalysis theory especially in biobased complex systems. Similar to catassembly systems, the interactions between enzymes and substrates, as well as transition states, also involve multiple binding sites and synergistic weak interactions, along with feedback mechanisms.5,182 A profound understanding of the fundamental principles governing these interactions is also crucial for advancing catalytic methodologies, such as organic catalysis and supramolecular catalysis.183–185
Catassembly and catalysis can also be coupled coherently, so that their advantages complement each other, to efficiently build multi-level structures of assemblies.12 For example, the formation of the capsid of bacteriophage HK97 is a typical process of catassembly coupled with chemical reactions.186 As the coupled reactions continuously consume the assembled products, they push the assembly equilibrium in the positive direction. The assembly provides the sites required for coupling, further promoting the coupling process. More importantly, the stepwise coupling of catassembly and catalysis can efficiently and precisely construct flexible and stable assembly systems with hierarchical structures.12 Furthermore, the rational integration of chemical reactions (including spontaneous reactions, field-assisted reactions, and catalysis) and molecular assembly (including self-assembly, field-assisted assembly, and catassembly) can construct highly advanced functional molecular systems that involve many components and complex processes.
It should be emphasized that over the past few years, the advent of the artificial intelligence (AI) era has offered unprecedented opportunities to revolutionize the landscape of molecular assembly. With breakthroughs in algorithms and increasing computational power, AI is enabling the design of building blocks and prediction of their interactions. More importantly, AI can elucidate the pathways and mechanisms of molecular assembly processes, which are highly complex and remain poorly understood.187,188 Of particular interest is the potential of AI to understand and predict how catassemblers control and regulate molecular assembly processes.
On the one hand, a series of machine learning (ML) approaches based on neural networks, including AlphaFold, AlphaFold2 and RoseTTAFold, have achieved accurate prediction of protein structures from their sequences, and have been used to design proteins as assembly subunits.187,189–191 Furthermore, methods like AlphaFold-Multimer8 have been developed to identify protein docking sites and produce accurate structures of heterodimeric protein complexes.192–194 These tools could be applied in the design of catassemblers and catassembly processes. In addition, advances in ML-based data mining methods enable the discovery of hidden patterns from heterogeneous big data with implicit noise, thereby enhancing resolution and relieving chemists of time-consuming tasks.195,196 Introducing deep learning methods into multiscale molecular modelling could also optimize the parameters of force fields, extract equilibrium properties of systems containing rare events and extend the accessible time and spatial scales of atomistic simulations.197–199
On the other hand, large language models (LLMs) are anticipated to extract chemical information from literature reports and provide comprehensive insights into the design of molecular assembly systems.200–202 Leveraging LLMs alongside big data techniques could uncover commonalities among seemingly irrelevant catassembly systems in living organisms. Furthermore, data-driven experimentation by high-throughput automation and real-time kinetic monitoring is of utmost importance in navigating the complexities of molecular assembly, as it can generate precise and reliable data for training models.203–205 Meanwhile, optimization methods, such as Bayesian optimization and evolutionary algorithms, could act as a “brain” to expedite the exploration of optimal experimental conditions and facilitate error-free autonomous workflows.203,206–208 AI-assisted and AI-driven methods will offer new avenues for transforming the paradigm of molecular assembly and catassembly research. The synergy between the emergent nature of AI and molecular assembly systems holds great promise for creating advanced functionality and even for the emergence of intelligent molecular systems that transcend the capabilities of living organisms. This will also deepen our understanding of complexity science, leading to better ways to manage and control complex systems.
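As a rough illustration of how an optimization method could steer the search for experimental conditions, the sketch below runs a simple elitist evolutionary algorithm (not Bayesian optimization) over two conditions, temperature and catassembler-to-subunit ratio. The response surface `mock_score` is entirely hypothetical and stands in for a measured assembly outcome. Its optimum at 37 °C and a ratio of 0.05 is an arbitrary assumption, not a value from the literature.

```python
import random


def mock_score(temperature_c, cat_ratio):
    """Hypothetical smooth response surface standing in for an experiment.

    The optimum (37 C, ratio 0.05) is an arbitrary illustrative choice.
    """
    return (1.0
            - ((temperature_c - 37.0) / 30.0) ** 2
            - ((cat_ratio - 0.05) / 0.1) ** 2)


def evolve_conditions(generations=40, population=20, seed=0):
    """Elitist evolutionary search over (temperature, catassembler ratio)."""
    rng = random.Random(seed)
    pop = [(rng.uniform(4.0, 90.0), rng.uniform(0.0, 0.5))
           for _ in range(population)]
    for _ in range(generations):
        # Keep the best quarter of the candidates unchanged (elitism).
        pop.sort(key=lambda cond: mock_score(*cond), reverse=True)
        parents = pop[:population // 4]
        children = []
        while len(parents) + len(children) < population:
            temp, ratio = rng.choice(parents)
            # Mutate a parent and clamp to the allowed experimental range.
            temp = min(90.0, max(4.0, temp + rng.gauss(0.0, 3.0)))
            ratio = min(0.5, max(0.0, ratio + rng.gauss(0.0, 0.02)))
            children.append((temp, ratio))
        pop = parents + children
    return max(pop, key=lambda cond: mock_score(*cond))


if __name__ == "__main__":
    best = evolve_conditions()
    print("best conditions:", best, "score:", mock_score(*best))
```

In an autonomous workflow, `mock_score` would be replaced by a call that runs and analyses a real experiment, so each generation corresponds to one batch of high-throughput measurements. A Bayesian optimizer would play the same role but choose new conditions from a surrogate model rather than by random mutation.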
Catassembly has been shown to be an efficient and highly selective strategy for synergistically controlling and manipulating various noncovalent interactions, especially in hierarchical molecular assemblies. We emphasize that canonical catassemblers can either accelerate assembly or guide assembly pathways; on the other hand, noncanonical catassemblers play a pivotal role in enabling reassembly and (partial) disassembly to achieve smart assembly with highly responsive and versatile properties. Catassemblers can play multiple roles, such as builders, transporters, and coordinators, making catassembly an indispensable strategy in constructing and regulating complex assembly systems.
Accordingly, we propose design principles for artificial catassembly and discuss corresponding characterization and theoretical methods. We summarize three basic models of catassembly, including (1) catassemblers preventing undesirable aggregation and promoting correct assembly; (2) catassemblers activating building blocks to facilitate assembly; and (3) energy-powered catassembly processes. Along this direction, we propose four pathways for catassembler departure, including competitive replacement by building blocks, allosteric effects, phase separation and dissipative recycling of catassemblers.
We further discuss representative catassembly works in which catassemblers accelerate assembly, guide assembly pathways, and reshape assemblies. By learning from biological assembly and advanced artificial assembly, it is evident that the more complex the assembly, the greater the demand for catassembly, and the more types of catassemblers are required. These features are distinctively different from those of self-assembly as the most widely used strategy in molecular assembly for decades. Moreover, the rational combination of self-assembly with catassembly, one of the important branches of assisted-assembly, could be necessary to advance the new generations of artificial molecular assembly and its relevant materials and devices.
Although the concept of catassembly was originally inspired by catalysis chemistry about ten years ago, we have now realized, through studying catassembly in living organisms, that catassembly and catalysis belong to one big family but may follow some distinct principles, in the sense of the famous essay “More Is Different” by P. W. Anderson. Catassembly could be a much more complicated process based on noncovalent interactions, featuring synergistic multivalent and multicomponent interactions, and long-range feedback. These features are essential for constructing hierarchical structures that can eventually generate new properties and functions, which have yet to become the focus of chemical catalysis. Therefore, it is highly desirable to develop and establish fundamental principles, a theoretical framework and characterization tools for catassembly. This approach may also initiate reforms in conventional catalysis theory and lead to strategies integrating covalent synthesis and noncovalent assembly to construct complex molecular systems.
Undoubtedly, the advancement of molecular assembly into complexity science will require more and more people from different fields to assemble and work synergistically to bridge the gap between chemistry and life science. There is good reason to be optimistic about the new era of AI. It offers unprecedented opportunities to revolutionize the landscape of molecular assembly. Large language models will extract chemical and biological information from the literature and then uncover commonalities among seemingly irrelevant or unworkable catassembly systems, such as those found in living organisms. More importantly, AI can act as a “catassembler” to greatly improve the communication language and skills of chemists, biologists and physicists from diverse research fields and disciplines to advance complexity science.
Footnotes
† This review is dedicated to the memory of Prof. Haojun Liang.
‡ Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d3cs00634d
§ These authors contributed equally.
This journal is © The Royal Society of Chemistry 2024