PaperHub
5.5
/10
Poster · 4 reviewers
Min 3 · Max 3 · Std 0.0
3
3
3
3
ICML 2025

Steering Protein Language Models

OpenReview · PDF
Submitted: 2025-01-23 · Updated: 2025-08-07
TL;DR

An activation editing method for Protein Language Models to steer sequence generation toward desired properties, and its downstream application to protein optimization.

Abstract

Keywords
Protein Language Model · Steering · Protein Engineering

Reviews and Discussion

Review
3

This paper presents a method to control the output of protein language models, inspired by the Activation Steering approach in the LLM domain, allowing the generated sequences to exhibit a given property. This method does not require retraining the protein language model and can be directly applied during the inference stage. The paper validates the effectiveness of this approach by manipulating properties such as thermostability and solubility.

Questions for Authors

I don't have other questions.

Claims and Evidence

The paper experimentally demonstrates that the proposed method can generate sequences with improved performance on a given property, supporting its claim.

Methods and Evaluation Criteria

The method proposed in this paper is reasonable.

However, I believe the evaluation criteria used in this paper have potential issues. During the generation process of ESM-2, the model's behavior (such as activation steering or certain activation parameters) directly depends on the structure and inference process of ESM-2, and the generated protein is then evaluated for properties such as thermostability using ESM-2. If the activation of the model is guided or modified during the generation process, then the evaluation may exhibit a certain level of circular dependency with the generation process, potentially causing the generated protein to perform exceptionally well in the ESM-2 evaluation, because the model has already been adjusted during the generation. This could potentially be considered an attack on ESM-2. In this part of the experiment (Table 1), we also observe that ESM-2 shows the best performance. Therefore, the paper needs to clarify whether this good performance is due to such an attack.

Theoretical Claims

This paper is more "application-oriented," and therefore does not include many theoretical claims.

Experimental Design and Analysis

I have reviewed the experimental design section of the paper, and I believe more testing metrics need to be included. The paper primarily tests how well the generated sequences from the protein language model align with the target properties and measures diversity and novelty, but I believe an important metric for evaluating PLM-generated results is testing the authenticity of these sequences, which the authors have overlooked. It is necessary for the paper to include metrics that assess the authenticity of the generated sequences, such as pLDDT and sequence likelihood.

Supplementary Material

This paper does not provide supplementary materials.

Relation to Prior Work

The paper proposes a more novel method for controlling PLM output, which is insightful for protein design.

Missing Important References

I did not find any important missing references.

Other Strengths and Weaknesses

I am curious whether the paper's method performs well for modeling more niche properties. The model's performance was primarily validated in terms of thermostability, solubility, and fluorescence brightness, but these are already well-studied properties, and many protein sequences with these properties are included in the PLM training data. However, for protein design tasks involving more niche properties, I am uncertain whether this method would still perform well, as it fundamentally relies on the internal knowledge of PLM.

Other Comments or Suggestions

I don't have other comments.

Author Response

We sincerely thank the reviewer for providing valuable feedback. We detail our response below point by point. Please kindly let us know whether you have any further concerns.


Q1: "the evaluation may exhibit a certain level of circular dependency. ... This could potentially be considered an attack on ESM-2. In this part of the experiment (Table 1), we also observe that ESM-2 shows the best performance. "

We thank the reviewer for raising the important issue of potential circular dependency in our evaluation. Using ESM-2 both as the base generative model and as part of the downstream predictor could potentially introduce biases. However, we emphasize the following points to address this concern:

  • While using ESM-2 as the base model achieves superior performance in Table 1 compared with other base models, by comparing the generation performance with Activation Steering and without Activation Steering (Original Model), we can observe a significant performance gain. This validates that the improvements are attributable to the steering mechanism rather than inherent biases in ESM-2.

  • Besides, our proposed Activation Steering method demonstrates consistent and significant improvements over both Fine-tuning and the Original Model across all three base models (ESM-2, ESM-3, and ProLLaMA). This cross-model consistency suggests the improvements stem from the method itself rather than inherent biases in ESM-2.


Q2: " It is necessary for the paper to include metrics that assess the authenticity of the generated sequences, such as pLDDT and sequence likelihood."

To address the concern about assessing the authenticity of generated sequences, we incorporate the pLDDT metric, evaluated using ESMFold, to measure the structural integrity of both initial and optimized protein sequences. The results, presented in the tables below, demonstrate that the pLDDTs remain consistent before and after applying activation steering. This consistency confirms that our method effectively guides protein generation towards desired functionalities while maintaining the authenticity of the sequences.
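
For reference, a minimal sketch of how such pLDDT values can be computed with ESMFold through the open-source fair-esm package is shown below; the sequence is a placeholder and this is an illustration, not our exact evaluation script.

```python
# Hedged sketch: mean pLDDT from ESMFold, following the public fair-esm usage.
# The sequence is a placeholder, not one of the generated proteins.
import torch
import esm
import biotite.structure.io as bsio

model = esm.pretrained.esmfold_v1().eval()  # add .cuda() if a GPU is available

sequence = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"  # placeholder sequence

with torch.no_grad():
    pdb_str = model.infer_pdb(sequence)  # PDB string with per-residue pLDDT in the B-factor column

with open("prediction.pdb", "w") as f:
    f.write(pdb_str)

# Mean pLDDT over all atoms, read back from the B-factor field.
struct = bsio.load_structure("prediction.pdb", extra_fields=["b_factor"])
print(f"mean pLDDT: {struct.b_factor.mean():.2f}")
```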

Table 1: pLDDT for protein generation.

| Base Model | Method | Thermostability | Solubility |
|---|---|---|---|
| ProLLaMA | Original Model | 40.02 | 40.02 |
| ProLLaMA | Fine-tuning | 37.79 | 37.68 |
| ProLLaMA | Activation Steering | 39.51 | 39.52 |
| ESM2 | Original Model | 74.09 | 69.17 |
| ESM2 | Fine-tuning | 72.62 | 68.16 |
| ESM2 | Activation Steering | 73.02 | 68.36 |
| ESM3 | Original Model | 76.14 | 73.54 |
| ESM3 | Fine-tuning | 75.50 | 72.71 |
| ESM3 | Activation Steering | 76.18 | 72.61 |

Table 2: pLDDT for protein optimization regarding thermostability.

| Method | Medium difficulty | Hard difficulty |
|---|---|---|
| Before Optimization | 79.75 | 79.99 |
| AdaLead | 51.06 | 31.15 |
| ESM2 + ASPO | 76.75 | 79.59 |
| ESM3 + ASPO | 81.00 | 78.91 |

Table 3: pLDDT for protein optimization regarding solubility.

| Method | Medium difficulty | Hard difficulty |
|---|---|---|
| Before Optimization | 77.92 | 75.94 |
| AdaLead | 36.87 | 35.89 |
| ESM2 + ASPO | 78.38 | 77.13 |
| ESM3 + ASPO | 78.15 | 77.98 |

Q3: "I am curious whether the paper's method performs well for modeling more niche properties... as it fundamentally relies on the internal knowledge of PLM."

  • We appreciate the reviewer's interest in the applicability of our method to more niche properties. We would first note that the challenge with niche properties is the scarcity of labeled data, which can hinder the training of effective predictors to estimate the performance.

  • We conducted an additional experiment focusing on the niche property of hydrolysis activity at 30°C for polyethylene terephthalate (PET). We utilized the dataset from (Seo et al. 2025), comprising 184 annotated samples.

    • To estimate the performance, we train a predictor on this dataset, achieving a Pearson correlation coefficient (R) of 0.57 in 5-fold cross-validation, indicating a reasonable prediction capability under data constraints.
    • We design optimization tasks of varying difficulty:
      • Medium Difficulty Task: Optimizing 69 samples with an initial average predicted activity of 1.61.
      • Hard Difficulty Task: Optimizing 67 samples with an initial average predicted activity of 175.62.
  • The results post-optimization using our ASPO method combined with ESM3 are as follows:

| Task Difficulty | Initial Average Activity | Post-Optimization Average Activity |
|---|---|---|
| Medium | 1.61 | 5.00 |
| Hard | 175.62 | 236.91 |

  • These results demonstrate that our ASPO method can effectively optimize even niche protein properties, such as hydrolysis activity in PET, underlining the versatility and potential of our method in broader protein design applications.

Reference: Seo, Hogyun, et al. Landscape profiling of PET depolymerases using a natural sequence cluster framework. Science 2025.

Reviewer Comment

Thanks for your rebuttal and I have raised my score to 3.

Review
3

This paper introduces a method for steering Protein Language Models (PLMs) towards outputs with desirable properties. It is based on a technique called 'Activation Steering' used for LLMs, where internal activation vectors of LLMs are modified to shift them towards desired behaviour.

The authors show how this method can be applied to both Autoencoder PLMs like ESM2 and ESM3 as well as Autoregressive PLMs like ProLLaMA to produce proteins with higher ‘fitness’, i.e. higher values of desired properties like thermostability and solubility. In line with previous work in this field, oracle models are created by training CNN models based on relatively large datasets.

The proposed method shows performance improvements over existing methods, while maintaining diversity and novelty of generated proteins.

Questions for Authors

Most of my questions are listed in the sections above.

Claims and Evidence

  1. The paper claims to enable property-specific protein generation without the requirement for retraining models, thus providing a scalable alternative to other resource intensive techniques.
  2. The paper introduces a new iterative optimization method which involves choosing which tokens to mutate based on an estimate of their individual fitness score.
  3. An empirical evaluation is performed on models with 3 different architectures to show the universal effectiveness of the method and its performance in comparison to some existing methods in the field.

Claim (1) is supported by empirical evidence, at least for the models included in the paper. Activation steering is indeed less resource intensive than directed evolution/reinforcement learning based methods as well as fine-tuning, though it is unclear whether this holds in comparison with the other benchmarks used in the paper.

Some of the technical claims made in the paper are unconvincing to me, such as “magnitude of a representation’s projection onto [the direction of the steering vector] can indicate its relatedness to the property”, which I have detailed in the methods/questions section. The method itself also makes certain assumptions which I don’t think are true. Again, I have detailed these in later sections.

With respect to claim (2) – the paper does introduce a new optimization procedure to choose the tokens most weakly associated with the property of interest and update the internal representations of those tokens. It is unclear to me if the method of determining whether a token contributes positively or negatively to the fitness function is completely correct, which follows from the point made above about projection onto the steering vector. Moreover, it is not clear to me from a theoretical standpoint how simply adding a common steering vector to the layer representation of any token, no matter where it is in relation to the regions of positive and negative fitness values, would improve the overall fitness of the protein. Again, I have elaborated on this in the question section.

For claim (3), empirical evidence is shown on 3 different architectures. With the lack of any theoretical discussion or explanation, I’m unconvinced that this method is universally applicable to all PLMs. Is there a fundamental property of models on which this method is based? Latent spaces of different models behave differently – some may be amenable to post-training modifications while some may not.

Overall, I think this paper may still be valuable since it details a method which empirically shows better results than existing methods for a real-world task with meaningful utility. However, there are multiple methodological leaps and assumptions which mean that it doesn’t convince me in terms of being an actual novel technical contribution.

Methods and Evaluation Criteria

Methods:

I’m unsure about certain parts of the method, which I have detailed below:

  1. Firstly, the assumption that “PLMs inherently encapsulate intrinsic knowledge about specific protein properties” is only true if the training dataset has sufficiently large numbers of samples exhibiting the negative to positive range of values you want to generate at inference time. Without a meaningfully high number of samples, the model would not gain a semantic understanding of the required property and would not encode this in the structure of its latent space. This is not true for fine-tuning based approaches, where you can fine-tune on an additional dataset with relatively few samples and get desired behaviour. Thus, this method is an alternative to fine-tuning based methods only when the property of interest is already in the dataset the model is trained on.
  2. Secondly, I’m unsure whether just adding a vector to the latent representation of a deep learning model will always lead to meaningful results. For example, adding a random vector to the latent representation of the model is likely to lead to gibberish outputs, as demonstrated in many papers on adversarial attacks. What if adding the steering vector to the latent representation takes the model to an unseen region, i.e. a region which doesn’t exist in the training set?
  3. Even if you assume that the model is well behaved under the steering transformation, it is not strictly true that adding the steering vector to the latent representation of a particular sample will move it towards the desired property. Concepts like functional continuity, linearity, and other properties necessary for this to be true are not necessarily naturally developed by the model. As a simple thought experiment, assume that there is a spherical region in the middle of the latent space that corresponds to positive fitness. Then, in order to transform samples outside the positive space to lie within the positive space, I need to add vectors to all samples pointing inward towards the positive region, whereas the steering vector would always point in the same direction. You are assuming that a translational transform in the latent space of each layer can shift any token towards the desired property, which is not true.
  4. “the magnitude of a representation’s projection onto [the direction of the steering vector] can indicate its relatedness to the property”. This is not strictly true. For example, you can infinitely scale a vector and its projection onto the steering vector will keep increasing. That doesn’t necessarily mean its relatedness to the property will keep increasing.
  5. One question I have with this method is how you maintain existing properties of proteins not necessarily related to the fitness property, especially for protein optimization. This is especially possible with the iterative optimization where you repeatedly replace the tokens least associated with the property to optimize for. For example, say I have a protein with all the required properties but solubility. In optimizing for solubility using this method, do I lose other properties? This is important for protein optimization because I would generally start with a protein with certain properties and try to optimize for others. If the optimized protein is unrecognizable from the protein I start from, then this is just a conditional protein generation method, not an optimization method.


Evaluation:

  1. In line with previous works, this paper uses a trained model as an oracle. This model is trained on thousands (~40000) labeled examples and shows medium to high accuracy/correlation on a test set. While this is not optimal, it is in line with previous work in the area published at similar venues. In the absence of actual ground-truth values for the properties of interest, this evaluation criterion, though not perfect, is the best that is possible.
  2. Apart from fitness, evaluation criteria such as diversity and novelty are well chosen to assess the effectiveness of the generated sequences. One thing that is missing, as mentioned previously, is the maintenance of existing properties apart from the property being optimized.
  3. I’m unclear on the Dist_init and Dist_high metrics – are large or small values of these metrics favourable? I’m not sure how these metrics contribute to telling us about the quality of the output. There’s also no clear pattern in Tables 2, 3, 4 that tells me which models are better.

Theoretical Claims

No major theoretical claims are made, the paper is mostly based on empirical results. It would be good to see some theory on major claims, such as whether adding steering vectors to intermediate layers always moves the output closer to the high fitness region, and whether this happens in a linear or smooth manner from the initial point to the final point as you perturb the activations more and more.

Experimental Design and Analysis

Experiments are well designed to support the claims of the paper. Results are shown on major PLM structures (though only 3) and for three different properties.

Supplementary Material

I reviewed the parts about how the oracle that gets solubility/thermostability values for optimized proteins is trained. While not optimal, I don't see any other way short of wet lab experiments to get ground truth values of fitness for the optimized proteins. I also reviewed the description of measures used to evaluate the models and have mentioned some concerns in the methods and evaluation section.

Relation to Prior Work

The paper is in line with previous work on protein fitness optimization.

Missing Important References

NA

Other Strengths and Weaknesses

Strengths:

  1. The paper addresses a topic of importance in protein engineering.
  2. The method is conceptually simple and seemingly shows good results despite its simplicity.
  3. The method is demonstrated on both AE-PLMs and AR-PLMs, implying some sort of generalizability. However, this is not surprising given that the method is based on a fundamental concept of semantic organization in neural network latent spaces, which could be counted as a strength of the method.

Weaknesses:

I have outlined multiple weaknesses related to the method in the 'Methods and Evaluation' section. I will briefly summarize them here:

  1. Optimizing for properties which are not reflected in the original training set. This is only a criticism because the method is compared to fine-tuning, which can be used to generate proteins with properties not in the original set using a small, new dataset.
  2. Assumptions about the steering transformation being a coherent transformation path in the feature space of the model, in terms of semantic continuity of the space, linearity, etc. Theoretical grounding would be useful here.
  3. Concerns about the magnitude of the projection of a vector reflecting its relatedness to the property. Counterexample given in the 'Methods and Evaluation' section (point 4).
  4. Questions about maintaining existing properties unrelated to the property to optimize for.
  5. Model evaluation is based on oracle classification models without actual ground truth. However, as I mentioned, there's no real way to get around this short of actual experimental validation.

Other Comments or Suggestions

I will rate this paper as a 'weak accept' but I think certain issues need to be addressed, mainly points 2 and 3 listed in the 'Methods and Evaluation' section above.

Author Response

Thank you for your thoughtful and positive feedback. We have provided a detailed explanation for your concerns as follows. Please feel free to let us know if you have any additional concerns or questions.

W2: Assumptions .. continuity, linearity...Theoretical grounding

  • Our method is based on two hypotheses: the linear representation hypothesis and the superposition hypothesis (T. Adly 2024). The linear representation hypothesis suggests that neural networks encode meaningful concepts as directions in their activation spaces, while the superposition hypothesis extends this by suggesting that networks utilize almost-orthogonal directions in high-dimensional spaces to represent features, embodying properties of additivity and homogeneity. These hypotheses underpin the design of many existing algorithms.

Ref: T. Adly. Scaling monosemanticity: Extracting interpretable features from Claude 3 sonnet. Anthropic, 2024.

  • However, the theoretical validation of these hypotheses is ongoing. As such, the theoretical grounding of our method remains a subject for future research.

M1: the assumption .. only true if dataset has large numbers of samples

  • Our method assumes that the pretrained PLM has already encapsulated comprehensive knowledge of protein properties, given its pretraining on a massive number of both naturally occurring and synthetically designed protein sequences. This extensive pretraining dataset supports our assumption that the PLM possesses a universal understanding of various protein properties, making it suitable for steering the PLM to explore specific properties.

  • However, we acknowledge that if the PLM lacks prior knowledge of a target property, our method may not be effective, which is a limitation.

  • Regarding the reviewer's point on fine-tuning, it is important to clarify that FT also relies on the pretrained model having some foundational knowledge of the desired properties. Without this, FT on a small dataset risks overfitting and poor generalization, similar to the limitations faced by our proposed method.

M2: unsure whether just adding a vector

  • We appreciate the reviewer's concern regarding the potential risks of modifying latent representations. Indeed, adding arbitrary vectors can disrupt model outputs, akin to adversarial attacks. However, our method carefully defines the steering vector to ensure meaningful modifications aligned with desired properties.

  • To illustrate, consider a simple case with a positive sample $z_p$ and a negative sample $z_n$. To shift $z_n$ towards $z_p$, the intuitive direction for modification is $v = z_p - z_n$. Extending this, if we have sets of positive and negative samples, the steering vector can be defined as the mean difference between these sets, aligning changes with the observed data distribution (a minimal code sketch of this construction is given after this list).

  • Our method assumes that the latent space adheres to the linear representation hypothesis and superposition hypothesis, allowing these mean-based modifications to navigate towards desired properties effectively. Empirical results confirm that steering the latent representation in this way leads to the desired enhancements.
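
As referenced above, a minimal sketch of this construction under the stated assumptions (illustrative tensor shapes and model interface, not our released implementation):

```python
# Minimal sketch: steering vector as the mean difference between positive and
# negative activations at one layer, applied by shifting that layer's hidden
# states during inference. Model/layer names and shapes are assumptions.
import torch

def steering_vector(pos_acts: torch.Tensor, neg_acts: torch.Tensor) -> torch.Tensor:
    """pos_acts, neg_acts: (num_samples, hidden_dim) pooled activations from one layer."""
    return pos_acts.mean(dim=0) - neg_acts.mean(dim=0)

def add_steering_hook(layer: torch.nn.Module, v: torch.Tensor, alpha: float = 1.0):
    """Register a hook that shifts every token's hidden state at this layer by alpha * v."""
    def hook(_module, _inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output  # (batch, seq, hidden)
        steered = hidden + alpha * v.to(hidden.device, hidden.dtype)
        return (steered, *output[1:]) if isinstance(output, tuple) else steered
    return layer.register_forward_hook(hook)

# Hypothetical usage: one hook per transformer layer, removed after generation.
# handles = [add_steering_hook(layer, v[i], alpha) for i, layer in enumerate(model.layers)]
# ... run masked prediction / generation ...
# for h in handles:
#     h.remove()
```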

M3: not strictly true that adding the steering vector .. move it towards the desired property

  • The effectiveness of the proposed method relies on the linear representation hypothesis and superposition hypothesis. The thought experiment designed by the reviewer, where the latent space exhibits non-linear characteristics, poses a challenge to our assumptions. In this case, the proposed method does not work.

M4&W3: Concerns about the magnitude .. reflecting relatedness to the property

  • Normalization techniques like LayerNorm and RMSNorm in transformers constrain vector magnitudes, ensuring they don’t scale infinitely. This addresses the counterexample.

  • Additionally, the magnitude is indicative of the importance of the token. Our relatedness score takes the magnitude of the token representations into consideration, following the attention mechanisms in transformers, which compute attention scores via dot products between queries and tokens.

M5&W4: maintain unrelated properties

  • We evaluated both thermostability and solubility to ensure our method maintains unrelated properties during optimization. Due to length limitations, we only present solubility experiments. As shown in the following tables, steering for solubility has minimal impact on the unrelated property thermostability.

Table: Protein generation

| Base Model | Method | sol (target) | therm (unrelated) |
|---|---|---|---|
| ProLLaMA | Original | .23 | 56 |
| ProLLaMA | FT | .24 | 57 |
| ProLLaMA | AS | .28 | 56 |
| ESM2 | Original | .33 | 57 |
| ESM2 | FT | .41 | 56 |
| ESM2 | AS | .44 | 57 |
| ESM3 | Original | .32 | 54 |
| ESM3 | FT | .39 | 55 |
| ESM3 | AS | .49 | 54 |

Table: Protein optimization

| Method | sol (target), Medium | therm (unrelated), Medium | sol (target), Hard | therm (unrelated), Hard |
|---|---|---|---|---|
| Before Opt | .28 | 54 | .09 | 54 |
| AdaLead | .62 | 50 | .53 | 51 |
| ESM2+ASPO | .51 | 53 | .35 | 54 |
| ESM3+ASPO | .65 | 53 | .4 | 53 |

Review
3

This paper adapts activation steering techniques from LLMs to Protein Language Models (PLMs) to guide protein generation toward desired properties without retraining. It introduces an Activation Steering-based Protein Optimization (ASPO) framework that outperforms existing methods on thermostability, solubility, and GFP brightness tasks.

Questions for Authors

The single objective design is quite limited. How do you deal with multi-property optimization?

Claims and Evidence

The key claims are well-supported by experimental evidence across multiple PLM architectures. Results show significant improvements in target properties while maintaining diversity and novelty compared to fine-tuning and other baselines.

However, the worse fine-tuning results might be due to insufficient data and large numbers of learnable parameters. Do you use parameter-efficient fine-tuning, like LoRA?

Methods and Evaluation Criteria

The methodology is sound, with appropriate evaluation metrics (fitness, diversity, novelty, distance measures) and comprehensive comparison against established baselines on relevant protein engineering tasks.

Theoretical Claims

N/A

Experimental Design and Analysis

The number of use cases is small though.

In the ablation studies, it's better to show the trend of both properties and basic protein generation qualities with respect to the ablated variables.

Supplementary Material

All.

Relation to Prior Work

It should be easy for people in the scientific community to adapt this method with pre-trained large models.

Missing Important References

N/A

Other Strengths and Weaknesses

See above.

Other Comments or Suggestions

See above.

Author Response

We appreciate very much your constructive comments on our paper. We have provided a detailed explanation for your questions as follows. Please feel free to let us know if you have any additional concerns or questions.


Q1: "How do you deal with multi-property optimization?"

Thank you for your insightful question regarding our method for multi-property optimization.

  • We propose a simple solution to perform steering on multiple properties by computing a composite steering vector that incorporates the steering vectors of individual properties. Specifically, if $v_\ell^{therm}$ represents the steering vector for thermostability and $v_\ell^{sol}$ for solubility, the combined steering vector used for multi-property optimization is given by $v_\ell = v_\ell^{therm} + v_\ell^{sol}$ (a minimal sketch of this composition is given after this list).

  • The remaining settings are the same as single property optimization.
  • We empirically test this method in scenarios involving both protein generation and protein optimization. The results, as detailed in the tables below, demonstrate that our method effectively improves multiple properties, albeit with a slight trade-off compared to optimizing a single property.
  • We plan to further refine our method to better manage these trade-offs and explore more sophisticated methods for steering vector combination and weighting for multi-property optimization in future work.
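
As referenced above, a minimal sketch of this composition; the optional per-property weights are an illustrative extension in the spirit of the future work mentioned in the last item, not a reported experiment.

```python
# Minimal sketch: composite steering vector for two properties. With the default
# weights this reproduces v_l = v_l^therm + v_l^sol; the weights themselves are an
# illustrative assumption, not a configuration evaluated in the rebuttal tables.
import torch

def combine_steering_vectors(v_therm: torch.Tensor, v_sol: torch.Tensor,
                             w_therm: float = 1.0, w_sol: float = 1.0) -> torch.Tensor:
    """v_therm, v_sol: (hidden_dim,) steering vectors for one layer."""
    return w_therm * v_therm + w_sol * v_sol
```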

Table 1: Multi-Property Steering for Protein Generation

| Base Model | Method | Thermostability | Solubility |
|---|---|---|---|
| ESM2 | Original Model | 56.77 | 0.327 |
| ESM2 | Activation Steering | 75.29 | 0.409 |
| ESM3 | Original Model | 60.70 | 0.312 |
| ESM3 | Activation Steering | 67.82 | 0.449 |

Table 2: Multi-Property Steering for Protein Optimization on Thermostability Medium Difficulty Task

| Method | Thermostability | Solubility |
|---|---|---|
| Before Optimization | 59.78 | 0.299 |
| ESM2 + ASPO | 76.19 | 0.401 |
| ESM3 + ASPO | 76.64 | 0.332 |

Table 3: Multi-Property Steering for Protein Optimization on Thermostability Hard Difficulty Task

| Method | Thermostability | Solubility |
|---|---|---|
| Before Optimization | 46.38 | 0.237 |
| ESM2 + ASPO | 78.11 | 0.278 |
| ESM3 + ASPO | 74.99 | 0.311 |

Q2: "In the ablation studies, it's better to show the trend of both properties and basic protein generation qualities with respect to the ablated variables."

Thank you for your valuable suggestion. We include trends for diversity, distance to the initial set, and distance to the high-fitness set in the following tables, and will reflect these trends in the revised figures to provide a clearer visualization of how the ablated variables impact both properties and basic protein generation qualities.

Table 4: Sensitivity to number of samples for steering vectors extraction in protein thermostability optimization (Fig. 4(a))

| Number of samples | 10 | 25 | 50 | 100 | 250 |
|---|---|---|---|---|---|
| ESM2-Medium-Diversity | 6.21 | 7.69 | 7.86 | 7.71 | 7.71 |
| ESM2-Medium-Dist_init | 6.15 | 6.75 | 7.49 | 7.709 | 8.15 |
| ESM2-Medium-Dist_high | 10.87 | 10.77 | 10.54 | 10.63 | 10.51 |
| ESM2-Hard-Diversity | 4.82 | 6.04 | 5.97 | 6.09 | 6.13 |
| ESM2-Hard-Dist_init | 6.66 | 7.21 | 7.02 | 7.29 | 7.30 |
| ESM2-Hard-Dist_high | 10.21 | 10.44 | 9.91 | 8.825 | 9.10 |
| ESM3-Medium-Diversity | 6.96 | 6.96 | 6.92 | 6.94 | 6.94 |
| ESM3-Medium-Dist_init | 6.08 | 6.09 | 6.09 | 6.00 | 6.00 |
| ESM3-Medium-Dist_high | 10.53 | 10.31 | 10.08 | 9.71 | 10.12 |
| ESM3-Hard-Diversity | 6.93 | 6.95 | 6.91 | 6.92 | 6.932 |
| ESM3-Hard-Dist_init | 7.71 | 7.60 | 7.60 | 7.57 | 7.68 |
| ESM3-Hard-Dist_high | 9.94 | 9.74 | 9.46 | 9.25 | 9.27 |

Table 5: Sensitivity of α in protein thermostability optimization (Fig. 6(a))

| α | 0.05 | 0.25 | 0.5 | 1 | 2 | 5 | 20 |
|---|---|---|---|---|---|---|---|
| ESM2-Medium-Diversity | 7.01 | 7.68 | 7.70 | 7.71 | 7.29 | 7.79 | 7.74 |
| ESM2-Medium-Dist_init | 7.67 | 7.68 | 7.71 | 7.71 | 8.00 | 7.88 | 7.89 |
| ESM2-Medium-Dist_high | 10.43 | 10.45 | 10.52 | 10.63 | 11.42 | 11.46 | 11.39 |
| ESM2-Hard-Diversity | 6.83 | 6.05 | 6.07 | 6.09 | 6.348 | 6.370 | 6.348 |
| ESM2-Hard-Dist_init | 6.43 | 7.68 | 7.71 | 7.29 | 7.37 | 7.36 | 7.36 |
| ESM2-Hard-Dist_high | 8.88 | 8.88 | 8.86 | 8.83 | 9.19 | 9.15 | 9.19 |
| ESM3-Medium-Diversity | 7.01 | 7.25 | 7.14 | 6.94 | 6.91 | 6.91 | 6.91 |
| ESM3-Medium-Dist_init | 6.00 | 6.00 | 6.00 | 6.00 | 6.01 | 6.01 | 6.01 |
| ESM3-Medium-Dist_high | 9.66 | 9.63 | 9.66 | 9.71 | 10.06 | 10.06 | 10.06 |
| ESM3-Hard-Diversity | 7.15 | 7.14 | 7.02 | 6.92 | 6.83 | 6.83 | 6.83 |
| ESM3-Hard-Dist_init | 7.59 | 7.58 | 7.58 | 7.57 | 7.57 | 7.57 | 7.57 |
| ESM3-Hard-Dist_high | 9.38 | 9.32 | 9.28 | 9.25 | 9.84 | 9.84 | 9.84 |

Q3: "the worse fine-tuning results might be due to insufficient data and large numbers of learnable parameters. Do you use parameter-efficient fine-tuning, like LoRA?"

  • We appreciate your comment regarding parameter efficiency in fine-tuning. Indeed, we employ LoRA, specifically with a rank of 8, which we found optimal in our experiments after evaluating various ranks [2, 4, 8, 12, 16]. The rank of 8 outperformed others, with rank 4 being the next best. Our choice of hyperparameter alpha is 16.
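
For concreteness, a hedged sketch of such a configuration with the Hugging Face peft library is shown below; the base checkpoint, dropout value, and target modules are illustrative assumptions rather than details reported in the paper.

```python
# Hedged sketch: LoRA fine-tuning setup with r=8, alpha=16 as described above.
# The checkpoint, lora_dropout, and target_modules are illustrative assumptions.
from peft import LoraConfig, get_peft_model
from transformers import EsmForMaskedLM

base = EsmForMaskedLM.from_pretrained("facebook/esm2_t33_650M_UR50D")
lora_cfg = LoraConfig(
    r=8,                 # rank found to work best among [2, 4, 8, 12, 16]
    lora_alpha=16,
    lora_dropout=0.05,   # illustrative value, not reported in the paper
    target_modules=["query", "key", "value"],
)
model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()
```
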
Review
3

This paper introduces activation steering, which is a technique adapted from large language models, to control protein language models for generating and optimizing protein sequences with targeted properties (e.g., thermostability, solubility, fluorescence). The method modifies internal model activations using steering vectors derived from contrastive representations of proteins with desired and undesired properties. For optimization tasks, the authors propose ASPO (activation steering-based protein optimization) and identify critical mutation sites via projection onto steering vectors. Experiments across autoencoder (i.e., ESM2, ESM3) and autoregressive (i.e., ProLLaMA) protein LMs demonstrate significant improvements in target properties without model retraining, outperforming fine-tuning and traditional optimization baselines.

Questions for Authors

  1. ESM2/3 models both demonstrate the scalability that a larger scale model is able to obtain better performance in various tasks in their paper. Therefore, could the performance of steering be further boosted by enlarging the model scale (such as, using ESM2 3B/15B or ESM3 7B/98B)?
  2. In Figure 3, for AR-PLM, as the number of samples for extracting steering vectors increases, the relevant property values consistently decline. The line graph in the figure starts at 10 samples, and I am curious about the performance when the number of samples is further reduced (e.g., 1 sample). If the final trend shows that performance is always decreasing as the number of samples increases, does this mean that for AR-PLMs, users should use as few samples as possible for steering to obtain the best result?
  3. Considering that the proposed method is training-free, will the proposed steering approach be compatible with other protein steering or optimization methods?

Claims and Evidence

Mostly supported.

Methods and Evaluation Criteria

Yes.

Theoretical Claims

Not applicable since there are no theoretical contributions.

Experimental Design and Analysis

Yes.

Supplementary Material

Yes. Experiment Details

Relation to Prior Work

n/a

Missing Important References

[PPLM] Plug and Play Language Models: A Simple Approach to Controlled Text Generation. In ICLR 2020.

Other Strengths and Weaknesses

Strengths

  1. The proposed activation steering is a training-free powerful method to generate protein sequences with target properties.
  2. The authors also propose a novel Activation Steering-based Protein Optimization (ASPO) framework to improve the protein optimization performance and identify the mutation site.
  3. Experimental results demonstrate that the proposed approach is able to significantly improve the steering generation performance on target property across various protein language models (ESM2, ESM3, ProLLaMA).

Weaknesses

  1. In Section 4.1, the ESM3 model is able to generate a full sequence from all-mask states, while ESM2 cannot. Therefore, for the ESM3 model, the authors should also provide the results of directly generating full sequences in addition to revising based on a reference sequence.
  2. The authors should provide more details about the experiments, such as the number of iterations T and the length distribution of generated sequences.
  3. Clarity issue. Could you please further explain the lines 189-194: "For practical implementation, [...], A linear classifier to distinguish the representations from the desired and undersired sets."? Does this mean that you only steer the activation of the layer with the highest validation accuracy, instead of all layers, during inference?
  4. Missing discussion of PPLM, an important related work on steering pre-trained LMs for conditional generation.

Other Comments or Suggestions

Minor:

Typo: line 361, "Sensitivityive".

Author Response

We sincerely thank the reviewer for providing valuable feedback. We detail our response below point by point.

W1: the ESM3 model ... generate a full sequence from all-mask states

We conduct experiments on full sequence generation using ESM3, maintaining the same settings as described in Section 4.1. The results are presented in the table below, demonstrating that Activation Steering significantly outperforms the baseline methods, thereby confirming its effectiveness.

| Method | Thermostability | Solubility |
|---|---|---|
| Original Model | 52.8 | 0.376 |
| Fine-tuning | 65.5 | 0.412 |
| Activation Steering | 79.6 | 0.466 |

W2: more details about experiments

  • The number of iterations T and the value of K
    • Protein optimization: thermostability: T=8, K=4; solubility: T=4, K=2; GFP: T=4, K=2
  • Length distribution of generated sequences
    • Protein generation
      • thermostability: 60~256, mean: 194.1
      • solubility: 47~256, mean: 168.5
    • Protein optimization
      • thermostability
        • medium difficulty: 60~256, mean: 183.9
        • hard: 102-256, mean: 208.2
      • solubility
        • medium: 71~256, mean: 180.4
        • hard: 47~253, mean: 158.9
      • GFP: length of all sequences is 237

W3: further explain the lines 189-194

Our method first involves computing a relatedness score for each token to identify which should be mutated, as discussed prior to the mentioned lines. In lines 189-194, we propose to determine the layer for computing the relatedness score as the one with the highest validation accuracy. This ensures that the most informative layer is utilized for token selection.

After determining the tokens for mutation, we replace these tokens with a mask token. Importantly, activation steering is applied during inference across all layers, except the input layer, to steer the model's prediction on the masked tokens towards the desired property.
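
To make this concrete, a rough sketch of the token-selection step as described above (our illustrative reconstruction with placeholder shapes, not the actual implementation):

```python
# Minimal sketch: projection-based relatedness score and selection of the K least
# related tokens at the layer with the highest validation accuracy. The outer
# optimization loop and regeneration step are indicated only as comments, since
# they depend on the specific PLM interface.
import torch

def relatedness_scores(hidden_states: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    """hidden_states: (seq_len, hidden_dim) at the selected layer; v: (hidden_dim,)."""
    return hidden_states @ (v / v.norm())   # signed projection onto the steering direction

def select_tokens_to_mutate(hidden_states: torch.Tensor, v: torch.Tensor, k: int) -> torch.Tensor:
    scores = relatedness_scores(hidden_states, v)
    return torch.topk(scores, k, largest=False).indices  # positions of the K least related tokens

# Hypothetical outer loop (T iterations):
#   h = hidden states of the current sequence at the best layer
#   idx = select_tokens_to_mutate(h, v_best_layer, K)
#   mask those positions and re-predict them with steering hooks active on all layers
```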

We will revise lines 189-194 to enhance clarity on these points.

W4: Missing discussion of PPLM

Thank you for pointing out the omission of PPLM. PPLM indeed pioneers the concept of steering by modifying key-value pairs in the model's attention mechanism, guided by gradients from an attribute model. In contrast, our method employs a simpler activation steering approach that directly manipulates activations and does not require training an additional attribute model or updating steering vectors.

We will include discussion of PPLM in the related work to highlight these distinctions.

Q1: Could the performance of steering be further boosted by enlarging the model scale

  • We conducted experiments using ESM2-3B for protein sequence generation, maintaining the same settings as in Sec 4.1. The results are summarized in the table below.
    • Compared to ESM2-650M, the proposed method Activation Steering shows similar performance in thermostability and significantly improves solubility.
    • In contrast, fine-tuning performance decreases with larger ESM2, likely due to the need for more data to achieve optimal results.

| Method | Thermostability | Solubility |
|---|---|---|
| Original Model | 56.1 | 0.298 |
| FT | 64.2 | 0.385 |
| AS | 80.5 | 0.631 |

Q2: the performance when the number of samples is further reduced

  • We appreciate the reviewer's interest in the performance of AR-PLM with fewer samples. To address this, we conducted additional experiments using ProLLaMA with sample sizes ranging from 1 to 10. Each configuration was tested 10 times to mitigate randomness. The results, summarized in the table below, indicate that the optimal number of samples varies depending on the property being optimized. For thermostability, the best performance occurs with 8 samples, while for solubility, it peaks at 3 samples. Notably, the performance does not consistently improve with fewer samples; the lowest number of samples (1 sample) did not yield the best results.
  • This suggests that while reducing the number of samples can sometimes enhance performance, likely by focusing the generation on a narrower subcluster of proteins with desired properties, there is a trade-off in terms of robustness. Performance becomes less predictable and can vary significantly depending on the specific samples used to compute steering vectors.
  • In conclusion, while fewer samples can sometimes be beneficial, the optimal number of samples depends on the specific application and desired property, balancing performance with robustness.

| Number of samples | 1 | 2 | 3 | 5 | 8 | 10 |
|---|---|---|---|---|---|---|
| Thermostability | 64.3 | 61.8 | 57.4 | 71.8 | 74.5 | 73.5 |
| Solubility | 0.344 | 0.491 | 0.507 | 0.492 | 0.446 | 0.302 |

Q3: compatible with other protein steering or optimization methods?

In this paper, we focus on studying steering for PLMs and apply PLM steering exclusively within the proposed ASPO method for protein optimization, positioning ASPO as a competitor to, rather than a complement of, existing methods. Integration with other protein optimization strategies remains an interesting direction for future research.

Reviewer Comment

I really appreciate authors' efforts in addressing my concerns. I accordingly raise my score to 3. Please do incorporate all the discussion above in the final version.

Final Decision

The paper adapts activation steering from LLMs to PLMs for protein sequence generation and optimization, showcasing a training-free method and the ASPO framework with experimental improvements over baselines. Reviewers initially had concerns like lack of certain experiment details, unclear method descriptions, theoretical assumptions, and issues with evaluation criteria. The authors responded with additional experiments (e.g., full sequence generation for ESM3, niche property experiments), clarifications (such as method steps and related work like PPLM), and evidence to address these concerns, like using LoRA for fine-tuning, showing multi-property optimization solutions, and mitigating risks of latent representation modifications. Reviewers reassessed positively, with some raising their scores. Considering the research's significance, novelty of the approach, and the authors' effective responses to feedback, the paper is recommended for acceptance, provided the authors incorporate all relevant discussions and clarifications in the final version.