Rating: 6.0/10 · Poster · 4 reviewers

Lowest 4, highest 8, standard deviation 1.4

Confidence: 3.5

Soundness: 2.8 · Contribution: 2.5 · Presentation: 3.0

NeurIPS 2024

Invariant Tokenization of Crystalline Materials for Language Model Enabled Generation

Keqiang Yan,Xiner Li,Hongyi Ling,Kenna Ashen,Carl Edwards,Raymundo Arroyave,Marinka Zitnik,Heng Ji,Xiaofeng Qian,Xiaoning Qian,Shuiwang Ji


Submitted: 2024-05-15 · Updated: 2024-11-06

Keywords

tokenization of crystals, language models, materials generation

Reviews and Discussion

Review

Rating: 6 · Confidence: 3 · 2024-07-10

This paper presents a novel approach for generating crystal materials using language models. The key innovation lies in the Mat2Seq method, which converts 3D crystal structures into 1D sequences while ensuring SE(3) and periodic invariance. This approach addresses the challenge of representing crystal structures in a way that is unique and invariant under different mathematical descriptions.

Strengths

The paper is well-written, providing clear descriptions of the problem background and proposed methods.
It tackles an interesting problem by using language models to generate crystal structures and introduces the consideration of SE(3) and periodic invariance for the first time.
The experiments are comprehensive, covering standard benchmarks and attempting to discover crystals with specific properties, demonstrating the robustness and versatility of the method.
Although the paper ensures the invariance of the generated structures, enforcing this invariance at the tokenizer level might limit the diversity of the generated structures to a small subspace. To obtain structures with different rotations, additional post-processing may be required, unlike models based directly in the spatial domain which can generate various rotational variants naturally.

Weaknesses

The paper should discuss the computational complexity differences between language models and specialized crystal models, as language models generally have higher parameter counts and computational demands.
The comparison of language models is limited. The experiments only compare with CrystalLLM, despite mentioning other methods.
The rationale for comparing only with CrystalLLM in Section 4.2 and not with other methods from Section 4.1 should be clarified.

Questions

For crystalline materials, isn't the original data already the most primitive unit? If we first apply Niggli cell reduction before using CIF files to describe unit cells, can we ensure a consistent representation?
While generating continuous crystal structures in 3D space seems natural, I am curious whether this token generation method can ensure that the reconstructed 3D crystal structure is continuous. Specifically, can the generated sequences always be converted back into a meaningful crystal representation?

Limitations

When performing conditional generation, it appears that generating structures for each specific property requires fine-tuning under specific conditions. This could result in high complexity in practical applications.

Author Response

2024-08-07

Dear Reviewer vCYe,

Thank you for your recognition that our approach addresses the challenge of representing crystal structures in a unique and invariant way under different mathematical descriptions. For your concerns and questions, we provide point-to-point responses below.

The paper should discuss the computational complexity differences between language models and specialized crystal models, as language models generally have higher parameter counts and computational demands.

Thank you very much for your suggestion; we agree this is important to point out. We have added comparisons of computational complexity and generation efficiency against specialized crystal models such as CDVAE and DiffCSP in Table 11, attached to the general response above. As the table shows, the LLM-based Mat2Seq does have more parameters, but with a similar amount of GPU resources (following the running-efficiency comparison in DiffCSP), Mat2Seq is significantly faster at generation.

We additionally show that shrinking the model by a factor of 8 changes the RMSE only slightly and further improves running efficiency. The smaller model achieves an RMSE of 0.039, significantly better than DiffCSP's RMSE of 0.049, and is more than 3 times faster in generation.

The comparison of language models is limited. The experiments only compare with CrystalLLM, despite mentioning other methods.

Thank you for your question. We use Table 1 to directly show the failures of two other LLMs for crystal generation. crystalLLM from the Meta team is fine-tuned from pre-trained large language models rather than trained from scratch, so it is not directly comparable under a fair training setup. We believe the crystal structure prediction task established by DiffCSP and CrystaLLM is a reasonable and strong benchmark to compare on, whereas other LLM-based crystal generation methods are either not directly comparable or provide no evaluation for this task.

The rationale for comparing only with CrystalLLM in Section 4.2 and not with other methods from Section 4.1 should be clarified.

Thank you very much for this question. The reason for not including CDVAE and DiffCSP in Section 4.2 is simple: we want to compare with baselines fairly. The tasks defined in Section 4.2 require the model to be trained on the CrystaLLM dataset of 2.3 million crystal structures, while CDVAE and DiffCSP have only been trained on much smaller datasets such as MP20 and MPTS52, with fewer than 50k training samples. Comparing with them would be unfair, so we do not include them in Section 4.2.

Additionally, the major contribution of this work, just as you mentioned before, is addressing the challenge of representing crystal structures in a unique and invariant way under different mathematical descriptions. To show this, we comprehensively compare our Mat2Seq with CrystaLLM that does not address this.

Furthermore, we tend to believe the performance gains beyond CDVAE and DiffCSP are already clearly shown by comparisons in Section 4.1.

For crystalline materials, isn't the original data already the most primitive unit? If we first apply Niggli cell reduction before using CIF files to describe unit cells, can we ensure a consistent representation?

Unfortunately, no. Although the original data is usually a most primitive unit cell, the same crystal structure admits many different most primitive unit cells, as shown in Figure 1, for example by shifting the periodic boundaries.

For your second question, actually the baseline method CrystaLLM uses Niggli cell reduction before using CIF files, but as we can see in Table 1, it fails to maintain a consistent representation for the same crystal structure before and after periodic transformations that will not change the crystal structure at all.

While generating continuous crystal structures in 3D space seems natural, I am curious whether this token generation method can ensure that the reconstructed 3D crystal structure is continuous. Specifically, can the generated sequences always be converted back into a meaningful crystal representation?

To answer this, it is best to examine the validity and stability of the crystals generated by Mat2Seq. We have provided additional validity and stability evaluations in Table 8, attached to the general response above. Mat2Seq achieves a competitive validity ratio and remarkable stability ratios (around 50% of the generated structures fall into the energy hull region of E_{hull} < 0.1 eV, following the FlowMM pipeline).

For your second question, Table 8 shows that none of CDVAE, DiffCSP, or the most recent FlowMM (published after the NeurIPS deadline) can guarantee a 100% validity rate, which means that the crystals generated by these SOTA methods are not always meaningful.

When performing conditional generation, it appears that generating structures for each specific property requires fine-tuning under specific conditions. This could result in high complexity in practical applications.

Yes, currently we use the model pretrained on the 2.3M-crystal dataset because it is available and potentially more powerful than training from scratch. This kind of fine-tuning is likely to introduce less complexity than training from scratch. We will release the pretrained model for others to use.

Additionally, a foundation model covering various properties requires a huge number of crystal structures with corresponding property labels. Such a dataset is very expensive to build and is currently out of our scope, but it is a promising direction for future work.

Thank you again for your questions. If you have any other questions, we are more than willing to answer.

Yours sincerely, Authors.

2024-08-10

The authors' rebuttal has addressed my concerns, and I have no further questions.

Comment: Thank you for your responses

2024-08-10

Dear Reviewer vCYe,

Thank you for your recognition and responses!

Your suggestions undoubtedly help us enhance the clarity of this work, and we are glad that your concerns have been addressed.

Yours sincerely, Authors

Review

Rating: 6 · Confidence: 5 · 2024-07-12

In this paper, the authors focus on the application of language models in the field of material generation. Starting from CIF files that represent crystal unit cell structures, they primarily utilize the Niggli reduction method to organize structures under different translations, rotations, and unit cell expansions into a unique representation. This representation is then used for crystal generation and conditional generation tasks. Compared to CrystaLLM, the proposed method demonstrates better generalization capabilities across multiple tasks.

Strengths

The authors introduce a new process, Mat2Seq, which uniquely represents crystal structures using Niggli reduction. This approach ensures SE(3) invariance, periodic invariance, and completeness, avoiding the data augmentation problem and shortening the token length in the language model.
The paper showcases Mat2Seq’s ability to generalize to novel crystal structures that were not seen during training. This is a crucial aspect for practical applications and demonstrates the robustness of the method.

Weaknesses

The authors utilized a unique representation method for crystal structures, which theoretically promises to become an essential step in data preprocessing. This representation method primarily aims to improve the model's ability to recognize equivariant transformations in materials. However, the evaluation does not demonstrate this method's generalization capability to equivariant structures. It remains unclear from the authors' validation whether this method helps the language model learn from SE(3) equivariance or whether the improved performance is due to the new token representation.
In the experimental validation in Section 4.2 and the conditional generation in Section 4.3, the evaluation of capabilities is very limited. The authors only provided the proportion of generated structures without detailing whether new structures can be generated, how many of the generated structures are valid, and accuracy of these generated structures. The generated structures need detailed explanation and evaluation.

Questions

In Section 3.1, the authors list two conditions for uniqueness and three types of transformations: translation, rotation, and unit cell expansion. It is suggested that Figure 1 should clearly illustrate all these transformations. Additionally, the "Change Lattice" operation in Figure 1 is confusing. Which type of transformation does this correspond to? Simply changing the length in the a-direction should not alter the atomic positions. Clarification is needed on whether the authors intended to illustrate unit cell expansion or some other operation.
In Table 1, the authors demonstrate Mat2Seq's ability to recognize crystal uniqueness, but the specific evaluation details are not provided. How were the values obtained, especially the 30% proportion? The methodology and criteria for this evaluation should be clearly explained.
In Section 4.3, while band gap is indeed an important property for semiconductors, it should be noted that whether the band gap is zero is a classification task distinguishing metals from non-metals. The authors' use of <0.5 as a threshold is inappropriate, especially when they mention that "values from 0 to 0.5 are marked as 0," which exacerbates the issue. This oversimplification could lead to misleading results.
In Section 4.2, in evaluating the ability to generate structures of new materials, the authors only indicate whether the generated structures match the given chemical formulas. It is recommended to include an evaluation of the atomic distance differences between the real and generated structures using RMSE to better assess the generation accuracy.
In Section 4.3, compared to commonly used machine learning models for material screening, generative models have the potential to produce out-of-distribution (OOD) structures, aiding researchers in discovering new structures with desired properties. However, the authors do not inform readers about the chemical formula repetitions, structure repetitions, or generation errors in the generated structures. A standard should be defined to evaluate the generated structures based on validity, uniqueness, and novelty.

Limitations

The authors utilize an existing unique unit cell representation method in the data preprocessing stage for language models. This method holds promise as a necessary step for future work. However, the evaluation does not provide sufficient evidence to demonstrate its application value. Additionally, the conditional generation applications mentioned by the authors are also quite limited, based on the evaluations provided in the paper.

Author Response

2024-08-07

Dear Reviewer Sr8M,

Thank you very much for your time invested in reviewing our work. For your concerns and questions, we provide point-to-point responses below.

On whether Mat2Seq helps the LM learn from SE(3) equivariance or whether the improved performance is due to the new token representation.

Thank you for this question. The only major difference between Mat2Seq and CrystaLLM is the SE(3)-invariant 1D crystal sequence; the tokenization process is similar. Specifically, we used exactly the same model settings as CrystaLLM and a very similar tokenization process, with the aim of showing the importance of SE(3)-invariant and periodic-invariant sequence representations. CrystaLLM therefore naturally serves as an ablation of the SE(3) and periodic invariance that Mat2Seq adds, and the experimental results in the original manuscript (Tables 2 and 3) and in the attachment to the global rebuttal (Table 12) show that Mat2Seq sequences yield better performance.

Thus, by using CrystaLLM as a solid baseline and ablation, we believe the importance of SE(3) invariance and periodic invariance sequence representations for materials is clearly demonstrated.

We have grouped your questions and concerns about the evaluation in Section 4.2 (Weakness 2, Question 4) and Section 4.3 (Weakness 2, Question 5), and respond to them below.

First of all, we agree that Section 4.2 would benefit from an evaluation of the atomic distance differences between the real and generated structures using RMSE, to better assess generation accuracy. We provide detailed hit rates and RMSE values in the new Table 12, attached to the general response above. The table shows that Mat2Seq performs significantly better than CrystaLLM in terms of both Hit Rate (whether the generated structure matches the literature structure) and RMSE.

Additionally, your suggestion to include metrics such as uniqueness, validity, and novelty in Section 4.3 is indeed very important for helping researchers discover new structures with desired properties. We followed your suggestion and evaluated the crystals generated when conditioning on lower or higher band gap values; the results are shown in Table 9, attached to the general response above. Table 9 shows that Mat2Seq achieves a validity of 88% for the lower band gap condition and 90% for the higher band gap condition, with uniqueness of 98% and 92% and novelty of 86% and 99%, respectively.
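As a rough illustration of how validity, uniqueness, and novelty ratios of this kind can be computed, here is a minimal sketch. The conventions are assumptions rather than the authors' protocol: each structure is reduced to a hashable canonical string, uniqueness is measured over valid samples, and novelty over unique valid samples. The function name `generation_metrics` and the `is_valid` predicate are hypothetical.

```python
def generation_metrics(generated, training_set, is_valid):
    """Return validity, uniqueness and novelty ratios for generated samples.

    Assumed conventions (not taken from the paper):
      validity   = valid samples / all generated samples
      uniqueness = distinct valid samples / valid samples
      novelty    = distinct valid samples absent from training / distinct valid samples
    """
    valid = [g for g in generated if is_valid(g)]
    unique = set(valid)
    novel = unique - set(training_set)
    return {
        "validity": len(valid) / len(generated),
        "uniqueness": len(unique) / len(valid) if valid else 0.0,
        "novelty": len(novel) / len(unique) if unique else 0.0,
    }


# Toy data: strings stand in for canonical crystal sequences.
generated = ["A", "A", "B", "C", "bad"]
training = {"B"}
metrics = generation_metrics(generated, training, is_valid=lambda s: s != "bad")
print(metrics)
```

Other works normalize these ratios differently (for example, novelty over all generated samples), so numbers are only comparable when the same convention is used.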

About figure 1 and "Change Lattice" operation.

Thank you for your suggestion. Currently we show two types of periodic transformations that alter the unit cell structure significantly, in order to demonstrate the failures of previous methods. We will update Figure 1 to include more illustrative examples of all transformations that can result in different unit cell structures once we are able to update the paper.

For your second question, the "Change Lattice" operation does not correspond to cell expansion. A simple example illustrates this. Consider a 2D material with an atom at the origin of the cell and lattice vectors (0, 1) and (1, 0). A change-lattice operation means that the lattice vectors can be changed to (1, 0) and (1, 1) without changing the area (or volume, for 3D structures) of the cell at all. It is still a minimum unit cell, but with a different lattice structure. We will also add this explanation to the appendix once we have the chance, to enhance clarity.
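The 2D example above can be checked numerically. The sketch below is illustrative and not code from the paper; the helper names are hypothetical, and integer lattice vectors are assumed so that the arithmetic is exact. It verifies that both cells have the same area and that the new vectors are an integer combination of the old ones with determinant ±1, which is the condition for two bases to generate the same lattice.

```python
from fractions import Fraction


def det2(u, v):
    """Signed area of the parallelogram spanned by 2D vectors u and v."""
    return u[0] * v[1] - u[1] * v[0]


def coefficients(b, a1, a2):
    """Solve b = m * a1 + n * a2 exactly (Cramer's rule) and return (m, n)."""
    d = Fraction(det2(a1, a2))
    return Fraction(det2(b, a2)) / d, Fraction(det2(a1, b)) / d


def generates_same_lattice(old, new):
    """True if `new` is an integer change of basis of `old` with determinant +-1."""
    a1, a2 = old
    rows = [coefficients(b, a1, a2) for b in new]
    all_integer = all(c.denominator == 1 for row in rows for c in row)
    return all_integer and abs(det2(rows[0], rows[1])) == 1


old_cell = ((0, 1), (1, 0))      # lattice vectors from the example
new_cell = ((1, 0), (1, 1))      # after the "Change Lattice" operation
doubled_cell = ((0, 2), (1, 0))  # a cell expansion, for contrast

print(abs(det2(*old_cell)), abs(det2(*new_cell)))       # areas of the two cells
print(generates_same_lattice(old_cell, new_cell))       # same lattice, different cell
print(generates_same_lattice(old_cell, doubled_cell))   # expansion changes the area
```

The doubled cell is included only to contrast a lattice change with a cell expansion, which multiplies the area.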

How were the values obtained, especially the 30% proportion?

Thank you for your question. The evaluation is conducted as follows: (1) For every crystal structure in the MP20 dataset, we transform it into a different unit cell representation without changing the crystal structure at all, e.g., by shifting the periodic boundaries as shown in Figure 1, to obtain a second representation of the same crystal. (2) We convert the original and the second representation into the 1D sequence representations of the different methods, and then compare the two outputs for the same crystal structure. If the sequences mismatch, e.g., the resulting coordinates or type of the first atom differ, it counts as a failure. (3) We go through the whole dataset and calculate the success rate. A value of 30% means that, with CrystaLLM, only 30% of the structures have the same sequence representation before and after the transformation.
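The three-step procedure can be sketched on toy data as follows. This is an illustration under stated assumptions, not the authors' evaluation code: crystals are lists of (element, fractional coordinates) with exact `Fraction` values, the only transformation tested is a shift of the periodic boundary, `naive_sequence` stands in for a serializer that writes atoms as given, and `toy_canonical_sequence` is a hypothetical translation-invariant serializer that is not Mat2Seq.

```python
from fractions import Fraction as F


def shift(crystal, t):
    """Shift the periodic boundary: add t to every fractional coordinate, modulo 1."""
    return [(el, tuple((x + dx) % 1 for x, dx in zip(pos, t))) for el, pos in crystal]


def naive_sequence(crystal):
    """Write atoms in the given order with the given coordinates."""
    return " ".join(f"{el} " + " ".join(str(x) for x in pos) for el, pos in crystal)


def toy_canonical_sequence(crystal):
    """Try every atom as the origin, sort the atoms, keep the smallest string."""
    candidates = []
    for _, origin in crystal:
        moved = shift(crystal, tuple(-x for x in origin))
        candidates.append(naive_sequence(sorted(moved)))
    return min(candidates)


def success_rate(dataset, to_sequence, t):
    """Fraction of crystals whose sequence is unchanged by the boundary shift."""
    hits = sum(to_sequence(c) == to_sequence(shift(c, t)) for c in dataset)
    return hits / len(dataset)


dataset = [
    [("Na", (F(0), F(0), F(0))), ("Cl", (F(1, 2), F(1, 2), F(1, 2)))],
    [("Cu", (F(0), F(0), F(0)))],
]
t = (F(1, 4), F(0), F(0))

print(success_rate(dataset, naive_sequence, t))
print(success_rate(dataset, toy_canonical_sequence, t))
```

A full check would also cover rotations and lattice changes, which this sketch leaves out.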

Whether the band gap is zero is a classification task distinguishing metals from non-metals. This oversimplification could lead to misleading results.

Thank you for your question. However, we respectfully disagree on this point. The band gap of a material is the energy difference between the top of the valence band and the bottom of the conduction band, computed by subtracting the energy of the valence band maximum (E_v) from the energy of the conduction band minimum (E_c). We treat materials with Eg = E_c - E_v < 0.5 eV as one group, which does include both true metals/semimetals (Eg = 0) and small-gap materials (0 < Eg < 0.5 eV), but we do not regard this group as a pure "metal" group. Similarly, we treat materials with 0.5 <= Eg < 1.0 eV, 1.0 <= Eg < 1.5 eV, and so on as individual groups. This is simply our current grouping protocol. For the applications you are interested in, the materials can easily be divided into different groups, e.g., separating Eg = 0 as its own group for metal electrode applications and 0 < Eg < 0.5 eV as another group for mid-infrared and thermoelectric applications.
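The grouping protocol described above amounts to binning band gaps into 0.5 eV ranges labeled by their lower edge. A minimal sketch follows; the function names, the label strings, and the `tol` threshold for treating a gap as zero are all assumptions made for illustration, not details from the paper.

```python
import math


def range_label(eg, width=0.5):
    """Lower edge of the band-gap range containing eg, e.g. 0.3 -> 0.0, 1.2 -> 1.0."""
    if eg < 0:
        raise ValueError("band gap must be non-negative")
    return math.floor(eg / width) * width


def split_metal_label(eg, width=0.5, tol=0.0):
    """Alternative grouping that keeps Eg = 0 (within tol) as its own 'metal' group."""
    if eg <= tol:
        return "metal"
    low = range_label(eg, width)
    return f"{low:.1f}-{low + width:.1f} eV"


for eg in (0.0, 0.3, 0.5, 1.2):
    print(eg, range_label(eg), split_metal_label(eg))
```

With a regression-based band gap predictor, `tol` would have to be chosen with the predictor's error in mind, since predicted gaps are rarely exactly zero.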

With extensive additional experimental results and further clarifications, we hope we addressed your concerns and questions. If you have any other questions, we are more than willing to answer.

Yours sincerely, Authors.

2024-08-13

Thank you to the authors for the detailed response and the additional experimental evaluations. These supplements have enriched the paper's experimental assessment, particularly the added evaluations on the validity, uniqueness, and novelty of the generated structures, which have improved the overall quality of the paper. As a result, I have decided to adjust my score to 6.

However, there is still a point worth discussing. Regarding the authors' evaluation of conditional generation in the last reply, although the authors have explained their approach of merging materials with zero and near-zero band gaps, I still have some concerns. In the MP database, the ratio of materials with a zero band gap to those with a non-zero band gap is approximately 1:1, and in the MP20 dataset, the proportion of materials with a zero band gap is even higher. When evaluating the conditional generation capability, the authors chose to merge materials with Eg = 0 and near-zero band gaps into the range 0 ≤ Eg < 0.5 eV. This approach, applied to a dataset where the proportion of Eg = 0 is very high and the proportion of 0 < Eg < 0.5 eV is very low, seems confusing to me and does not adequately validate the conditional generation capability. I recommend a separate discussion of this aspect to provide a clearer validation of the conditional generation capability.

Once again, thank you to the authors for their hard work and for the thorough response to our feedback.

Comment: Further responses from authors

2024-08-13

Dear Reviewer Sr8M,

Thank you for your responses and recognition of our work. We appreciate your insightful questions regarding the evaluation of the conditional generation ability of the proposed method. Below, we provide further clarifications and discussions which we will also include in the manuscript.

From our understanding, your question mainly concerns the high proportion of materials with zero band gap and the low proportion of materials with a band gap value between 0 and 0.5 eV. You suggest that this may not adequately validate the model's conditional generative ability, as the proportion of materials with 0 ≤ Eg < 0.5 eV is substantial. In other words, if the ratio of a group of materials in a dataset is large enough, an unconditional model might also generate a significant proportion of materials that satisfy the given group condition.

To clarify this, let us begin with the dataset distribution. The training set we used for generating materials with high or low band gap values contains 61,541 crystals in total, with 46,933 crystals (76.3%) having 0 ≤ Eg < 0.5 eV, 9,814 crystals (15.9%) with 0.5 ≤ Eg ≤ 3 eV, and 4,794 crystals (7.8%) with Eg > 3 eV. As you mentioned in your comments, the ratio of crystals with 0 ≤ Eg < 0.5 eV is indeed large in the training set. Therefore, to better demonstrate the model's conditional generation capacity, we show not only the success rate of generating crystals with low band gap values (< 0.5 eV) but also the success rate of generating crystals with high band gap values (> 3 eV). This approach highlights the ability of the proposed method to significantly alter the distribution from the training set. As shown in Table 4 of the paper and Table 9 in the general response, Mat2Seq can shift the share of crystals with Eg > 3 eV from 7.8% in the training data to more than 90% in the conditional generation distribution, with a remarkable 92.2% uniqueness ratio and 89.8% validity ratio. Therefore, we believe that the conditional generation capacity of Mat2Seq is well demonstrated.
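The group shares can be recomputed directly from the raw counts quoted above, together with a lower bound on how strongly conditioning enriches the high band gap group. This is plain arithmetic on the reported numbers; the variable names are illustrative.

```python
# Counts quoted in the author response (training set for band-gap conditioning).
counts = {
    "0 <= Eg < 0.5 eV": 46_933,
    "0.5 <= Eg <= 3 eV": 9_814,
    "Eg > 3 eV": 4_794,
}
total = sum(counts.values())
shares = {group: 100 * n / total for group, n in counts.items()}

# "More than 90%" of conditionally generated crystals are reported to have
# Eg > 3 eV; 90.0 is used as a lower bound, so the enrichment is one as well.
generated_share_lower_bound = 90.0
enrichment = generated_share_lower_bound / shares["Eg > 3 eV"]

for group, share in shares.items():
    print(f"{group}: {share:.1f}%")
print(f"enrichment of Eg > 3 eV: at least {enrichment:.1f}x")
```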

Furthermore, another reason we do not group crystals with a specific band gap value (e.g., 0 eV) into a single group but rather use a range of band gap values for this task is that current state-of-the-art machine learning-based band gap predictors (such as ComFormer, ALIGNN, and others) treat the prediction of band gap values as a regression task, rather than first classifying materials as metal or non-metal and then performing regression for non-metal materials. As a result, it is challenging to determine whether a generated material satisfies a 0 eV band gap because the mean absolute error (MAE) can be as large as 0.122 eV.

Thank you again for your recognition and insightful suggestions and comments. If you have any additional questions, we would be more than happy to answer them.

Yours sincerely, The Authors

Comment: Email notification was down on OpenReview

2024-08-13

Dear Reviewer Sr8M,

The email notification system on OpenReview was down for a period of time, and we posted our further responses to your insightful additional question during that period.

Thank you for your recognition, as well as your insightful suggestions and comments. If you have any additional questions, we would be more than happy to answer them.

Yours sincerely, The Authors

Review

Rating: 4 · Confidence: 3 · 2024-07-12

There are several challenges when developing LMs for materials: 1) each crystal structure consists of an infinite number of atoms and a unique and invariant unit cell must therefore be selected for each crystal 2) the unit cell can be represented in a one-dimensional (1D) sequence that maintains invariance under arbitrary rotations and ensures completeness. To tackle these challenges, the paper proposed a method called Mat2Seq that systematically transforms 3D crystal structures into 1D sequences. This is achieved by first identifying SO(3) equivariant unit cells and subsequently converting these into SE(3) invariant sequences. The experimental results in crystal structure prediction and crystal discovery with desired properties validate the efficacy of Mat2Seq.

Strengths

I understand why it is important for the inherent symmetry of materials to be reflected in the text when designing a language model for material generation.
The importance of unique representation and completeness from the structure of the material to the sequence is emphasized.

Weaknesses

Overall, the novelty and contribution of the paper are low. Instead of proposing new algorithms, they simply apply technologies that are already widely used in materials science to the process of representing materials as 1d sequence text. More specifically, the method for determining the equivariant unit cell is derived from Niggli cell reduction rather than being an original contribution of this paper. This concept is already widely used in the field of materials science, and the authors’ application of it to the text representation of materials does not seem like a significant contribution.
It is unclear what Section 3.3 is trying to convey. The authors claim to achieve text representation reflecting SE(3) symmetry following SO(3) symmetry, but it is doubtful whether this is well conveyed in this section.
The evaluation metrics used are limited. To demonstrate efficiency more effectively, the paper should present values for various metrics such as Validity, Stability, and S.U.N., which are commonly addressed in other studies.

Questions

Could you provide a more detailed explanation with examples regarding the input for the language model (LM)?
Why is the target property value inserted at the beginning of the text input when generating materials for a specific target property?
Does “irreducible atom sets” refer to the atom sets contained within the minimum unit cell?

Limitations

Please refer to the Weakness section.

Author Response

2024-08-06

Dear Reviewer 5Kst,

Thank you for your time invested in reviewing this work. We provide point-to-point responses to your questions and concerns.

Weakness 1

Thank you very much for raising this question. However, we respectfully disagree, and we feel there may be a misunderstanding of our proposed method.

Let's begin with a question: can Niggli cell reduction algorithms, or other widely used 1D crystal representations including CIF files, achieve uniqueness or SE(3) invariance? The answer is no. More specifically, Niggli cell reduction only yields a set of lattice vectors and cannot determine a unique crystal unit cell. Moreover, the previous work CrystaLLM already uses Niggli cell reduction when converting crystal structures into 1D sequences, yet, as we show in Table 1, CrystaLLM cannot achieve unique 1D crystal sequence representations.

Furthermore, let's look at how our proposed method Mat2Seq achieves the desired properties of uniqueness, completeness, and SE(3) invariance. Niggli cell reduction is only used in the first step, to uniquely determine a set of lattice vectors. After that, we need to uniquely determine a corresponding unit cell. Then we need to determine a unique ordering of the atoms in the cell, and the features used to completely represent the 3D unit cell structure of the crystal. Niggli cell reduction is only an initial step of our proposed method, and to the best of our knowledge, Mat2Seq is the first work in materials science that achieves uniqueness, completeness, and SE(3) invariance when converting 3D crystal structures into 1D sequences. Thus, we respectfully disagree that this problem is solved by existing Niggli cell reduction algorithms or any other widely used methods in materials science.

Last but not least, we want to point out that all previous LLM based crystal generation methods fail to achieve uniqueness and SE(3) invariance. It is valuable to propose a method to address this limitation of the current process in this direction.

Weakness 2

We appreciate your question. To clarify, as mentioned at the beginning of Section 3, we start with the requirements for ideal crystal sequence representations in Section 3.1, then show how to determine an SO(3) equivariant unit cell in Section 3.2, and finally use Section 3.3 to show how to convert an SO(3) equivariant unit cell into an SE(3) invariant 1D sequence.

Specifically, in Section 3.3, we show that given the determined SO(3) equivariant and periodic invariant unit cells $\mathbf{M}=(\mathbf{A}_u, \mathbf{P}_u, \mathbf{L}_u)$, we represent them by SE(3) and periodic invariant sequences that are complete, guaranteeing the full reconstruction of crystal structures.

More specifically, the SO(3) equivariant unit cell obtained in Section 3.2 cannot be used directly as input for LLMs, because a given crystal structure can have an infinite number of SO(3) equivariant unit cells that differ by a rotation.

Additionally, to show that the converted 1D sequence satisfies all the requirements, we provide detailed proofs in Section 3.4.

With these being said, we will appreciate any additional specific suggestions you have to improve this section, and we will revise the main paper accordingly once we have the chance.

Weakness 3

Thank you very much for your valuable question.

Following your suggestion, we adopted the protocol of FlowMM, a very recent work at ICML 24 published after the NeurIPS submission deadline, to establish a fair comparison in terms of Validity, Stability, and S.U.N. It is worth noting that computing stability and S.U.N. is very expensive (it requires extensive DFT calculations) and has only been done by very recent works, some published after the NeurIPS deadline. We show the results in Table 8, attached to the general response above. The table shows that Mat2Seq achieves competitive results in terms of validity, stability, and S.U.N. (stable, unique, and novel), and is 28% better than FlowMM in terms of S.U.N.

Question 1

Sure, we'd like to mention that a concrete example of the Mat2Seq input for a crystal structure is shown in Figure 2 in the main paper. The whole Mat2Seq sequence for that crystal structure is "formula Ag 4 Hg 2 I 8 \n space_group_symbol I-4 \n lattice_parameters \n a 6.5361 b 6.5361 c 13.1629 \n alpha 90.0000 beta 90.0000 gamma 90.0000 \n Ag 2 0.0000 0.0000 0.0000 \n Ag 2 0.0000 0.5000 0.7500 \n Hg 2 0.0000 0.5000 0.2500 \n I 8 0.2400 0.7596 0.6200 \n\n", where all these symbols and numbers are mapped to int values by a mapping dictionary.
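To make the last step concrete, the sketch below maps the quoted sequence to integer IDs and back. It is only an illustration: the token granularity (whitespace-separated symbols plus one token per newline) and the vocabulary built from this single sequence are assumptions, and the actual mapping dictionary used by Mat2Seq may differ, for example by splitting numbers into digits.

```python
import re

# The Mat2Seq sequence quoted above for Ag4Hg2I8.
SEQ = (
    "formula Ag 4 Hg 2 I 8 \n space_group_symbol I-4 \n lattice_parameters \n "
    "a 6.5361 b 6.5361 c 13.1629 \n alpha 90.0000 beta 90.0000 gamma 90.0000 \n "
    "Ag 2 0.0000 0.0000 0.0000 \n Ag 2 0.0000 0.5000 0.7500 \n "
    "Hg 2 0.0000 0.5000 0.2500 \n I 8 0.2400 0.7596 0.6200 \n\n"
)


def tokenize(text):
    """Split into newline tokens and whitespace-separated symbols (assumed granularity)."""
    return re.findall(r"\n|\S+", text)


def build_vocab(tokens):
    """Hypothetical mapping dictionary: one integer per distinct token."""
    return {tok: i for i, tok in enumerate(sorted(set(tokens)))}


tokens = tokenize(SEQ)
vocab = build_vocab(tokens)
ids = [vocab[tok] for tok in tokens]

inverse = {i: tok for tok, i in vocab.items()}
decoded = [inverse[i] for i in ids]

print(len(tokens), len(vocab))
print(ids[:8])
```

The round trip recovers the token list, not the exact spacing of the original string.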

Question 2

Thank you for your valuable question. Autoregressive language models are defined by the conditional probability of each token given its predecessors, $p(C_i|\theta)=\prod_{j=1}^{n_i}p(c_j | c_{1:j-1}; \theta)$. Thus, to establish a conditional distribution $p_\theta (\cdot|s)$ that generates 3D crystal structures possessing the property $s$, it is natural to place the target property values at the beginning of the text rather than anywhere else.
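One way to spell out this argument, under the assumption that the property $s$ is encoded as a prefix that precedes all crystal tokens, is the following factorization. The notation $n$ for the sequence length and the explicit conditioning on $s$ are illustrative additions, not formulas taken from the paper.

```latex
p_\theta(C \mid s) \;=\; \prod_{j=1}^{n} p\left(c_j \mid s,\, c_{1:j-1};\, \theta\right)
```

Because a causal model only conditions on earlier tokens, placing $s$ first means every factor sees it. If $s$ were instead inserted at some later position $k$, the tokens $c_1, \dots, c_{k-1}$ would be generated without access to the target property.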

Question 3

No, the irreducible atom set is a subset of the atoms in the minimum unit cell. For example, in the structure of Ag4Hg2I8 in Figure 2 of the main paper, there are 14 atoms in the minimum unit cell, but the irreducible atom set contains only 2 Ag, 1 Hg, and 1 I. This is because the positions of the other seven I atoms can be fully recovered from "I 0.2400 0.7596 0.6200" and the symmetry operations of the I-4 space group, which is exactly why the number 8 follows the symbol I in "I 8 0.2400 0.7596 0.6200".
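The counting in this answer can be checked with a short sketch. It reads the four irreducible atom lines from the example sequence, sums their multiplicities, and compares the result with the formula line. The helper names `parse_sites`, `atoms_per_element`, and `parse_formula` are hypothetical; the sketch does not apply the I-4 symmetry operations, it only verifies the bookkeeping.

```python
from collections import Counter

# Irreducible atom lines from the example: symbol, multiplicity, fractional coordinates.
IRREDUCIBLE_LINES = [
    "Ag 2 0.0000 0.0000 0.0000",
    "Ag 2 0.0000 0.5000 0.7500",
    "Hg 2 0.0000 0.5000 0.2500",
    "I 8 0.2400 0.7596 0.6200",
]
FORMULA = "Ag 4 Hg 2 I 8"

def parse_sites(lines):
    """Parse each line into (symbol, multiplicity, (x, y, z))."""
    sites = []
    for line in lines:
        symbol, mult, x, y, z = line.split()
        sites.append((symbol, int(mult), (float(x), float(y), float(z))))
    return sites

def atoms_per_element(sites):
    """Sum multiplicities per element to get the full unit cell contents."""
    counts = Counter()
    for symbol, mult, _ in sites:
        counts[symbol] += mult
    return counts

def parse_formula(formula):
    """Parse 'Ag 4 Hg 2 I 8' into element counts."""
    parts = formula.split()
    return Counter({parts[i]: int(parts[i + 1]) for i in range(0, len(parts), 2)})

sites = parse_sites(IRREDUCIBLE_LINES)
expanded = atoms_per_element(sites)
total_atoms = sum(expanded.values())
irreducible_count = len(sites)
matches_formula = expanded == parse_formula(FORMULA)
```

Four stored sites thus stand in for all 14 atoms in the cell, with the single I site accounting for 8 of them.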

With these clarifications, we hope we have addressed your questions and concerns. If you have any other questions, we are more than willing to answer them.

Yours sincerely, Authors.

Comment - We appreciate the chance to answer any additional questions

2024-08-13

Dear Reviewer 5Kst,

Thank you again for the valuable time you invested in reviewing this work. In response to your previous questions and concerns, we have provided detailed clarifications along with extensive additional experiments. We believe these clarifications and additional experiments thoroughly address your concerns and questions.

As the author-reviewer discussion period is closing soon, we would greatly appreciate the chance to answer any additional questions you may have.

Yours sincerely, The Authors

Comment - Thanks for the rebuttal

2024-08-14

Thanks for the detailed rebuttal and response.

1: I am not arguing that the contribution is weak solely because the proposed method only uses Niggli reduction. However, I still question whether the methods applied after the initial Niggli reduction algorithm (determining the origin, assigning a unique ordering of atoms, and representing lattice vectors in terms of lengths and angles, which are also widely used in materials science) are strong enough to be the main contribution.
2: This issue has been resolved.
3: Thanks for conducting the experiment. Could you explain why the structural validity is low compared to the other baselines in this case? Additionally, it would be beneficial to include the results of CrystalLLM in the final version.

4, 5, 6: The unclear parts have been clarified.

Some of my concerns have been resolved, but some still remain. I will raise my score from 3 to 4. Thanks.

Comment - Thank you for joining the discussion

2024-08-14

Dear Reviewer 5Kst,

Thank you very much for joining the discussion. Following your responses, we would like to provide further clarification on your remaining questions, specifically questions 1 and 3.

For your first question, we respectfully disagree with your point and would like to emphasize that Mat2Seq, to the best of our knowledge, is the first work that can achieve the unique and SE(3)-invariant conversion from 3D crystal structures to 1D sequences. If this is not the case, could you please point us to related works that have solved this problem before? Additionally, if this problem has been widely studied and solved, why have previous LLM-based methods, including your cited CrystalLLM from Meta and the other two works, failed to maintain such uniqueness and SE(3) invariance for the 1D crystal sequence representation?

We also respectfully disagree with judging the novelty of a method solely by whether its components are easy or not. A method can be novel and effective even if each step is straightforward, as long as it solves challenges that previous studies have not addressed and that are important to solve rather than ignore.

Furthermore, let's consider the novelty of this work by evaluating what the field will be like with or without it. Currently, none of the previous studies on LLM-based crystal generation have achieved unique and SE(3)-invariant 1D sequence representations. Without this work, researchers in this field might remain confused and continue using similar approaches, like CIF files used by CrystalLLM, which are far from unique. With this work, we not only demonstrate how to achieve uniqueness and SE(3) invariance—eliminating the need for extensive data augmentation—but also show that this approach enhances the model's performance in generating crystals.

For your third question:

As we can see from the results, the structural validity is actually similar to FlowMM when the generation temperature is set to 1.35, and it can be further improved to achieve higher structural validity by lowering the temperature. Additionally, the combined validity, where the generated crystal is both structurally valid and compositionally valid, is 83.4%, which is higher than FlowMM’s 80.6% and DiffCSP’s 83.3%, as measured by the structural validity ratio multiplied by the composition validity ratio.
The reason we do not directly compare with CrystalLLM (Meta) for generation tasks is simply that CrystalLLM from Meta uses pre-trained LLMs trained on a vast amount of text data, while Mat2Seq and CrystaLLM from the UK team are trained from scratch solely for crystal sequences.

Thank you again for joining the discussion. We are glad that some of your concerns have been addressed, and we appreciate the opportunity to further discuss the remaining issues. If you have any additional questions, we are more than willing to answer them.

Yours sincerely, The Authors

Review

Rating: 8, Confidence: 3, 2024-07-16

This article proposes a novel method, Mat2Seq, to tackle the challenge of representing crystal structures for language models. Mat2Seq converts 3D crystal structures into 1D sequences and ensures that different mathematical descriptions of the same crystal are represented by a single unique sequence, thereby provably achieving SE(3) and periodic invariance. Experimental results show that, with language models, Mat2Seq achieves promising performance in crystal structure generation compared with prior methods. Overall, this work gives new insights and should be accepted for the conference after minor revisions.

Strengths

Mat2Seq converts 3D crystal structures into 1D sequences.
It reports a framework for creating unique and complete crystal sequence representations, followed by the construction of a material LLM capable of generating novel crystal structures with desired properties of interest.

Weaknesses

Lack of experimental data validation for the generated data.

Questions

1. The authors need to verify the generated data with more experimental results, as it currently has multiple limitations.
2. The authors need to compare the accuracy with existing CIF data of crystals.

Limitations

The limitations of the current Mat2Seq include: (1) it cannot be directly used for other atomic systems, like molecules and proteins; (2) the extension to model disordered materials remains a challenging frontier; and (3) large-scale training with more stable crystal structures can potentially enhance the robustness and performance when more computational resources are available.

Author Response

2024-08-06

Dear Reviewer JJwt,

Thank you very much for your recognition of our work in terms of insights and contributions. For your questions and concerns, we provide point-by-point responses as follows.

Lack of experimental data validation for the generated data. Need to verify the generated data with more experimental results, as it currently has multiple limitations. Need to compare the accuracy with existing CIF data of crystals.

Thank you for your insightful comments. We totally agree that the comparison with experimental data (or in other words, crystal structures that have been experimentally observed) is important to demonstrate the ability of the proposed method for real world applications, beyond synthetic crystal structures obtained from pure DFT calculations.

To demonstrate this, in Section 4.2 we use Mat2Seq to recover crystal structures that were recently discovered experimentally and reported in the literature. The visual comparison between the Mat2Seq-generated structures and the experimentally observed crystal structures is shown in Figure 4, which demonstrates the method's ability to handle experimentally observed crystal structures. We also add an experiment reporting the match rate and RMSE in Table 12 in the general response above.

Additionally, the datasets used in Section 4.1 also contain a large number of experimentally observed crystal structures. For example, the MP-20 test set contains 3819 experimentally observed crystal structures. To show the accuracy of Mat2Seq on experimental data, we additionally evaluate on these 3819 crystal structures, with results shown in Table 10 in the general response above. It is worth noting that Mat2Seq achieves a 65.2% match rate with an RMSE of 0.042 on these 3819 crystal structures.

For current limitations listed in the paper.

Thank you for your valuable question.

(1) Designing an LLM-based generative method for all atomic systems that satisfies uniqueness and completeness is highly nontrivial and remains an open question. We intend to address this problem in future work.

(2) Unlike other generative methods, such as diffusion-based or flow-based methods, LLM-based methods generate crystal structures autoregressively. This difference makes many techniques used by previous diffusion-based or flow-based methods inapplicable to LLM-based methods. This is an open frontier, as also discussed in previous LLM-based work. In this paper, we propose a potential solution to this problem, and we believe there is room for future work to improve on it.

(3) As widely discussed in the ML community, more data usually yields more powerful ML models. However, generating a large-scale crystal structure dataset is very expensive and currently beyond our scope, so we list this as a potential direction for further improving ML methods for crystal structure generation.

We hope we addressed your concerns. If you have additional questions or concerns, we are more than willing to answer.

Yours sincerely, Authors.

Author Response

2024-08-07

Dear Reviewers, ACs, and PCs,

We thank all reviewers for your time invested in reviewing our work, and appreciate your valuable suggestions.

For all reviewers' questions and concerns, we provide detailed clarifications with additional experimental results. Some of these experiments, such as S.U.N. and stability (which require extensive DFT calculations), are quite expensive to run. We attach detailed experimental results in the PDF file here for you to view.

Specifically, in Table 8, we add a comparison with FlowMM, a very recent SOTA method published at ICML 2024 after the NeurIPS submission deadline, to establish a fair comparison in terms of validity, stability, and S.U.N. (stable, unique, and novel). We follow the FlowMM pipeline and generate 1k materials to obtain these results.

In Table 9, we add the unique ratio, validity ratio, and novelty ratio of generated materials conditioned towards low and high band gap values detailed in Section 4.3.

In Table 10, we add the match rate (%) and RMSE for experimentally observed crystal structures in the MP-20 test set.

In Table 11, we add comparisons of generation speed and model complexity with CDVAE and DiffCSP.

In Table 12, we add the hit rate (whether the generated structure matches recently experimentally observed crystals from the literature) and RMSE for the 10 challenge crystal systems detailed in Section 4.2, compared with the previous SOTA method CrystaLLM.

With these extensive experiments and detailed further clarifications, we hope we addressed all your concerns and questions thoroughly. If you have any other questions or concerns, we are more than willing to answer.

Yours sincerely, Authors

Final Decision: Accept (poster)

2024-09-25

The paper presents a new representation of crystalline materials as 1D sequences, which can then be used to train an LLM. While prior works have already demonstrated methods to do this, the novelty of the current paper is in designing a unique and invariant representation for crystals. Through experiments on commonly used datasets, the authors show that this new method, called Mat2Seq, is competitive with other methods in this space.

Strengths:

New approach for representing crystals in a unique and invariant manner.
Competitive results on benchmark datasets.

Weaknesses:

Limited novelty, compared to the well-known Niggli reduction.
The authors do not compare the method to state-of-the-art LLM approaches to this problem like Gruver et al. 2024.

Recommendation: Despite the limitations, I vote to accept the paper as it presents a new and interesting method for representing crystal structures. The use of LLMs for material science has been growing rapidly in recent times, and the Mat2Seq representation could be valuable for this line of work.