Permissioned LLMs: Enforcing Access Control in Large Language Models
We propose a new class of fine-tuned LLMs, Permissioned LLMs, that enforce access control on responses to queries, thus protecting sensitive training/tuning data from unauthorized queries.
Abstract
Reviews and Discussion
This paper addresses a critical and underexplored challenge in deploying LLMs within enterprise or multi-tenant environments: how to enforce fine-grained access control when LLMs are fine-tuned on data from multiple security domains. The authors propose Permissioned LLMs (PermLLMs), a class of models that associate subsets of LoRA adapters with specific data domains. They present three mechanisms—Activate, Merge, and Union—to enforce domain-specific access control during both fine-tuning and inference. To evaluate the enforcement strength, they introduce two novel empirical metrics: Domain Distinguishability Index (DDI) and Utility Gap Index (UGI). The paper is well-motivated, formally grounded, and supported by empirical results on multiple datasets.
Strengths and Weaknesses
Strengths:
- The paper tackles a real-world security problem that is likely to become increasingly important as LLMs are deployed across enterprise and governmental settings. It introduces a formal framework for modeling access control in LLMs and provides a rigorous definition of access advantage.
- The experimental evaluation is comprehensive in scope, covering multiple datasets with varying characteristics and two different base models. The use of membership inference attacks to construct the DDI metric is innovative and provides a principled way to assess domain separability. The inclusion of multiple utility metrics for UGI offers different perspectives on access control effectiveness.
Weaknesses:
- The most significant limitation of this work is its scalability constraints. The Union mechanism, which provides the best performance, requires up to 2^n adapters for n domains, creating a combinatorial explosion that quickly becomes impractical. While the authors acknowledge this limitation, they provide insufficient analysis of when scalability becomes prohibitive or guidance on the maximum practical number of domains.
Questions
- How does the system handle noisy domain boundaries or cross-domain data leakage (e.g., data records shared across silos)?
- Hyperparameter sensitivity: Given that only the default LoRA rank (64) was tested with limited ablation studies, how sensitive are the results to key hyperparameters such as adapter rank, learning rate schedules, and the choice of layers for adapter placement?
Limitations
Yes
Justification for Final Rating
This paper addresses a critical and underexplored challenge in deploying LLMs within enterprise or multi-tenant environments: how to enforce fine-grained access control when models are fine-tuned on data from multiple security domains. The authors present a simple yet conceptually meaningful solution to this problem. While the proposed approach has certain limitations in terms of scalability and generality, the initial solution provides a useful starting point for future work in this space. Overall, I believe the paper makes a valuable contribution to the community. My current score reflects both my appreciation of the paper’s central insight and the clarifications provided during the rebuttal. I will maintain my original score.
Formatting Concerns
No
We thank the reviewer for their detailed feedback and for recognizing the merits of our access control formalism and evaluation framework and their relevance to enterprise settings. Below we respond to the questions and concerns raised in the review:
Guidance on usability and scalability
Regarding the comment "...insufficient analysis of when scalability becomes prohibitive or guidance on the maximum practical number of domains.": Usability is subjective and depends on dataset sizes, model sizes, and the compute resources available. We expect the mechanisms to scale well to a few dozen domains for models similar in size to Llama-8B and Mistral-7B. Our experiments cover 3 to 10 security domains across different datasets. In practice, the access combinations in enterprise settings are few and sparse, so Union does not need all possible subsets of domains. For instance, enterprises are organized hierarchically, and not every employee gets every possible access combination: employees typically only have access to levels at or below their designation. In such settings, the number of combinations could be on the order of n log n for a tree-type hierarchical organization structure. We leave these scalability questions to future work.
Handling domain segregation
In response to the question "How does the system handle noisy domain boundaries or cross-domain data leakage (e.g., data records shared across silos)?": We assume the separation of data across security domains has happened before our access control mechanisms come into play. Most enterprise settings already have strict separation/segregation of data across different silos or organizations. Any individual that is part of any two organizations would naturally get access to data from both domains. The actual handling of separation of noisy cross-domain boundaries is beyond the scope of this work.
Hyperparameter sensitivity
In response to the question "Given that only default LoRA rank (64) was tested with limited ablation studies, how sensitive are the results to key hyperparameters like adapter rank, learning rate schedules, and the choice of layers for adapter placement?": We tried adapters with different LoRA ranks ranging from 2 to 64, and found it to have limited impact on the metrics, and hence set the default value to 64. We also tried different learning rates between 1e-5 and 5e-4 using grid search in our initial experiments and set the best values for all our experiments. We have not experimented with choosing specific layers for adapter placement and went with the standard LoRA implementation of using adapters for all layers. We will add this discussion in the Detailed Experiment Setup section in the appendix.
We thank the reviewer for their time and would like to know if we answered their queries. If there are any remaining concerns or questions, we would be happy to respond to them.
This work studies the problem of using enterprise datasets in which different parts carry different access control permissions. The paper proposes a new framework called PermLLM to utilize such datasets for fine-tuning LLMs while respecting the access control permissions. The core idea is to ensure that an LLM fine-tuned on data from multiple "security domains" only generates responses based on the domains a specific user is authorized to access. The paper formalizes this problem, proposes a generalized metric called access advantage along with an auditor process to measure it empirically, proposes three methods to achieve PermLLMs, and experimentally evaluates them on four different datasets (WMDP, GPQA, SimpleQA and RCV1).
The three proposed methods are variations on training and combining PEFT adapters for the relevant "security domains" (a hedged sketch of such adapter handling follows the list):
- Activate: Each adapter is trained on a single "security domain" and activations are combined during inference.
- Merge: Each adapter is trained on a single "security domain" and adapters are combined for inference.
- Union: Each adapter is trained on an existing combination of "security domains".
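Not part of the paper's artifacts, but for intuition, here is a minimal sketch of how Activate- and Merge-style adapter handling might look with the Hugging Face `peft` library; the model name, adapter paths, and equal merge weights are illustrative assumptions:

```python
# Hedged sketch: per-user adapter handling with Hugging Face peft.
# Assumes one LoRA adapter per security domain, saved under ./adapters/<domain>.
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")

# Load one adapter per security domain the user is authorized for.
model = PeftModel.from_pretrained(base, "./adapters/d1", adapter_name="d1")
model.load_adapter("./adapters/d2", adapter_name="d2")

# Activate-style: enable only the authorized adapters at inference time.
# (Multi-adapter activation lives on the underlying LoraModel in recent
# peft releases; the exact API varies by version.)
model.base_model.set_adapter(["d1", "d2"])

# Merge-style: combine the authorized adapters' weights into a single
# adapter ahead of inference.
model.add_weighted_adapter(
    adapters=["d1", "d2"], weights=[0.5, 0.5],
    adapter_name="d1_d2", combination_type="linear",
)
model.set_adapter("d1_d2")

# Union-style needs no combination step: a dedicated adapter is trained
# directly on the union of the authorized domains' data.
```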
The paper also proposes two metrics to measure how effective these methods are: the Domain Distinguishability Index (DDI), based on membership inference attacks, and the Utility Gap Index (UGI), based on LLM utility evaluation. They find that these techniques work well in the simplified access control model presented in the paper with a small number of domains.
Strengths and Weaknesses
Strengths:
- Paper is clearly written and easy to understand and follow.
- The work is thorough as it provides a framework for theoretically understanding this problem as well as metrics to empirically evaluate the implementations.
Weaknesses:
- The provided methods for PermLLMs seem fairly simple and combine already existing methods.
- The methods provided do not scale well with an increasing number of domains. Activate and Merge both suffer from low utility as the number of domains increases, and Union requires exponential space and training resources. Furthermore, the access control model is very simple, with no hierarchies, and appears to assume the policy does not change over time.
Questions
- Given that the construction is private by construction as each user can only use an adapter from security domains they have access to, it's not clear how much value all the empirical metrics are adding to understanding the privacy of the mechanisms. If the question is about auditing, would it be easier and more effective to audit that the adapters themselves are trained on the correct data instead?
- Can you elaborate on how UGI makes sense as a metric? From what I can tell, it doesn't tell us much about utility and can only tell us about privacy leakage in a very roundabout way.
Limitations
Yes
Justification for Final Rating
The authors clarified most of the questions I had in their response but as the other reviewers also mentioned, the current set of results have a few weaknesses:
- The empirical auditing metrics rely either on the assumption that utility loss is expected when the query issuer does not have access to the correct domain, or on MIAs that might not work well in practice.
- The methods provided to achieve Permissioned LLMs do not seem to scale to the complex access control policies such systems will usually need, e.g., a larger number of domains.
Based on this, I will choose to keep my initial rating.
Formatting Concerns
no concerns
We thank the reviewer for their detailed feedback and for acknowledging the merits of our access control formalism and evaluation framework. We would like to take the opportunity to respond to the questions and concerns raised in the review:
Regarding the auditing process
In response to the question "Given that the construction is private by construction as each user can only use an adapter from security domains it has access to, its not clear how much value all the empirical metrics are adding to understanding the privacy of the mechanisms? If the question is about auditing, would it be easier and more effective to audit that the adapters themselves are trained on the correct data instead?":
Our access control evaluation is with respect to an auditor having access to all the domain adapters and domain datasets who wants to verify if the access control mechanism is implemented correctly. Hence the auditor chooses a query from a target domain and evaluates the access advantage when querying the target domain adapter (i.e., "in" adapter) instead of any other adapter (i.e., "out" adapter). This also answers how much information a user can gain for a target domain query when they have access to the target domain versus when they do not, thereby bounding the leakage about an individual query/record.
The above problem can also be reformulated by fixing an adapter and querying it with "in" and "out" domain data, and our experiments already capture this. The UGI metrics reported in Figures 2 and 3 and Table 3 show the average gaps across all domain adapters and domain datasets, which implicitly captures this scenario. For example, given two adapters, A1 and A2, and two security domains, D1 and D2, the UGI metric finds the average utility gap between "in" and "out" adapters across both domains: ((U(A1, D1) - U(A2, D1)) + (U(A2, D2) - U(A1, D2))) / 2. The alternate problem formulation is equivalent: ((U(A1, D1) - U(A1, D2)) + (U(A2, D2) - U(A2, D1))) / 2.
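To make the averaging concrete, here is a toy sketch (illustrative numbers and shapes, not the paper's evaluation code) that computes this gap for the two-domain example and generalizes it to n domains, where adapter i is the "in" adapter for domain i:

```python
import numpy as np

# U[i, j] = utility of adapter A_{i+1} evaluated on domain D_{j+1}.
U = np.array([[0.82, 0.31],   # A1 on D1, A1 on D2 (illustrative values)
              [0.28, 0.79]])  # A2 on D1, A2 on D2

def ugi(U: np.ndarray) -> float:
    n = U.shape[0]
    gaps = []
    for d in range(n):
        in_util = U[d, d]                          # "in" adapter on its own domain
        out_util = np.mean(np.delete(U[:, d], d))  # mean "out" adapter utility
        gaps.append(in_util - out_util)
    return float(np.mean(gaps))

print(ugi(U))  # ((0.82 - 0.28) + (0.79 - 0.31)) / 2 = 0.51
```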
Finally, both DDI and UGI are agnostic to the underlying implementation of access control. Whether the system is built using PEFT or other future techniques, our metrics remain applicable. This makes them broadly useful for auditing the effectiveness of any access control enforcement mechanism, not just the specific construction used in this work.
Regarding the UGI metric
In response to the question "Can you elaborate on how UGI makes sense as a metric? From what I can tell, it doesn't tell us much about utility and can only tell us about privacy leakage in a very roundabout way.": We note that the UGI metric quantifies the difference in utility between the cases when a user has access to a security domain and when they do not have access to that domain. This pertains to how much information a user can gain for a target domain query when they have access to the target domain versus when they do not. UGI does not quantify the model utility itself.
Regarding the scalability and simplicity of the approach
Regarding the comment "The provided methods for PermLLMs seem fairly simple and combine already existing methods.": While the access control mechanisms are simple and straightforward, we would like to highlight the effectiveness of these approaches in enforcing access control as shown in our experiments. Additionally, we also implemented a naive prompt-based access control baseline which fails to achieve any meaningful access control, while also being susceptible to potential jailbreaking attacks. We will include these new results and a thorough discussion in the paper's revision.
Regarding the comment "The methods provided do not scale well with increasing number of domains...": Since the problem of access control in LLMs has not been formalized before and ours is the first work in this effort, the major goals of our work are to formalize access control enforcement in LLMs and introduce empirical metrics to evaluate the effectiveness. As we stated in our limitations section, the scalability of our approach to large number of domains (order of thousands) is not feasible at the moment. For large number of domains or permission groups, we can cluster similar permission groups into hierarchical clusters and train the PEFT modules on the clusters. We leave this extension for future work.
We thank the reviewer for their time and would like to know if we answered their queries. If there are any remaining concerns or questions, we would be happy to respond to them.
I think that answers all of my questions. I will keep my rating.
This paper proposes Permissioned LLMs (PermLLMs), a new design for enforcing access control on the outputs of LLMs through efficient fine-tuning. To assess access control in LLMs, the authors also propose a formal framework for quantifying the relevance and utility of access control, together with two metrics, the Domain Distinguishability Index (DDI) and the Utility Gap Index (UGI). The paper presents experiments on four public datasets to demonstrate the efficacy of PermLLMs.
Strengths and Weaknesses
Strengths:
- The work is well-motivated and well-written, with a research question of clear real-world application potential. The problem of tailoring LLM outputs based on user access is an interesting and important one.
- The paper includes a formal treatment of the LLM access control problem, which quantifies how strictly the model follows domain-level access control, and a game-based threat (audit) model (Sec. 2.3) for evaluating the system.
Weaknesses:
- One particular weakness in the framework is the design of DDI, which utilizes membership inference attacks for distinguishing model outputs. A downside inherent in this design is, naturally, that the index is only as good as the strongest attack, which remains an open problem for MIAs against LLMs, especially in black-box scenarios. A recent critique of MIA methods for LLMs is given by Meeus et al. [1]; in short, post-hoc collected shadow sets for MIAs are not good classifiers of training-sample membership.
- The notion of Access Advantage (Def. 2.2) and UGI (Def. 4.2) needs more clarification. From a user perspective, one would certainly expect that an LLM should not suffer utility loss on any query. Suppose the model can generalize on a particular domain based on other pre-trained data alone; should the model be penalized for its output even though it has not observed this domain?

References
[1] Matthieu Meeus, Igor Shilov, Shubham Jain, Manuel Faysse, Marek Rei, and Yves-Alexandre de Montjoye. SoK: Membership Inference Attacks on LLMs are Rushing Nowhere (and How to Fix It). In SatML, 2025.
Questions
Can you clarify why the relevance score function is the utility value itself (Sec. 4.2)? It seems to be a central component to the entire framework, and it would certainly be beneficial to see discussions on it. Continuing the discussion on UGI, relevance to a specific domain and utility do not seem to be simply equivalent.
Limitations
The authors discussed some limitations of the proposed method.
Justification for Final Rating
My main concerns have been addressed, and the authors will include results and discussion mentioned in the rebuttal.
Formatting Concerns
none
We thank the reviewer for their detailed feedback and for acknowledging the practical significance of the problem and our evaluation framework. Below we would like to clarify the concerns and questions raised:
On the reliance on MIA success
The reviewer raised a valid concern that the DDI evaluation is only as good as the best MIA in their comment "One particular weakness in the framework is the design of DDI, which utilizes membership inference attacks...".
We agree with the reviewer that the empirical success of DDI is dependent on the strongest membership inference attacks, which could get trickier for LLMs as the training set sizes increase in the general LLM training setting. However, in our scenario the security domains are distinct from each other in the data distribution sense, and as such even with larger domains the MIAs would be successful in distinguishing the domains.
Moreover, DDI is designed to be modular with respect to the choice of MIA. In our experiments, we already evaluate DDI using five different MIA techniques, demonstrating its flexibility and robustness across a range of attack strategies. As the field progresses and stronger and more refined MIAs are developed, they can be directly incorporated into the DDI calculation without modification to the framework. This design ensures that DDI remains adaptable and relevant in assessing access control effectiveness.
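As an illustration of this modularity (a sketch under assumed interfaces, not our implementation: `model.nll` stands in for an average-token negative log-likelihood exposed by some model wrapper), any MIA scoring function can be dropped into the same AUC pipeline:

```python
import zlib
from sklearn.metrics import roc_auc_score

def loss_score(model, text: str) -> float:
    # Higher score should mean "more likely a member"; negated loss works.
    return -model.nll(text)

def zlib_score(model, text: str) -> float:
    # Loss normalized by the text's zlib-compressed length.
    return -model.nll(text) / len(zlib.compress(text.encode()))

def ddi_auc(model, members, non_members, score_fn) -> float:
    scores = [score_fn(model, t) for t in members + non_members]
    labels = [1] * len(members) + [0] * len(non_members)
    return roc_auc_score(labels, scores)

# A stronger future MIA plugs in as just another score_fn:
# ddi_auc(adapter_model, in_domain_texts, out_domain_texts, zlib_score)
```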
On the clarification of UGI metric
Regarding the comment "The notion of Access Advantage (Def. 2.2) and UGI (Def. 4.2) needs more clarification. From a user perspective, one would certainly expect that an LLM should not suffer utility loss on any query. Suppose the model can generalize on a particular domain based on other pre-trained data alone; should the model be penalized for its output even though it has not observed this domain?": We note that there is no penalization on model output/utility with Permissioned LLMs. If the model generalizes well with just its pre-training data, then it will perform well on queries related to domains the user does not have access to. However, this does not imply the access control does not work as intended. The access advantage definition only bounds the gap in model utility between the case where the user has access to a domain and the case where the user has no access to that domain, which includes any model utility obtained from learnings without the model gleaning the data from the domain the user doesn't have access to. We will clarify this in the revision.
Regarding the comment "Can you clarify why the relevance score function is the utility value itself (Sec. 4.2)?...": For UGI, we are using the gap in utility, which is why the relevance score defined in Def 2.2 is the utility for the target domain for the UGI instantiation. Other access advantage instantiations could use different relevance score metrics, for instance DDI uses AUC and TPR@(low)FPR as relevance score metrics.
Thank you for your detailed rebuttal, which has addressed most of my concerns. I am satisfied with the explanation of Access Advantage (Def. 2.2) and UGI (Def. 4.2). However, I still maintain that using MIAs in the DDI evaluation is a weakness yet to be addressed: the empirical success of MIAs does not offer a privacy guarantee for the domain data. I would again point to Meeus et al., 2025, which demonstrated that distribution shifts between members and non-members make predicted membership status unreliable. Following the authors' discussion of disjoint domains, this would also imply a fundamental difficulty in adapting to more complex domain distributions. At this stage, the authors' response can only be interpreted as hoping a perfect MIA will exist in the future. For this reason, I am inclined to maintain my current rating.
We appreciate the reviewer's concerns about the data-distribution-related limitations of the DDI approach for evaluating PermLLM mechanisms. While we agree with the reviewer that DDI is only as effective as the best MIA available, we believe there is value in testing it on a real dataset. To that end, we have recently conducted experiments on the PubMedQA dataset. PubMedQA contains approximately 200K medical articles formatted as <Context + Question + "\n" + Answer>. We encoded these articles using the GTE sentence encoder and applied k-means clustering to the resulting embeddings to derive 10 non-overlapping security domains. While clustering enforces semantic similarity within each domain and dissimilarity across domains, the underlying data distribution remains the same, since all samples originate from the same dataset. This design allows us to test DDI in a setting where domain semantics vary but there is no distribution shift between members and non-members, thereby directly addressing the concern.
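For reference, the domain-construction step can be sketched as follows; the GTE checkpoint name, `sentence-transformers`/scikit-learn library choices, and the `load_pubmedqa_texts` loader are our assumptions for illustration, not the exact pipeline:

```python
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

# Hypothetical loader returning "<Context + Question>\n<Answer>" strings.
articles = load_pubmedqa_texts()

# Embed articles with a GTE sentence encoder (checkpoint name assumed).
encoder = SentenceTransformer("thenlper/gte-large")
embeddings = encoder.encode(articles, batch_size=256)

# Partition the embeddings into 10 non-overlapping security domains.
kmeans = KMeans(n_clusters=10, random_state=0, n_init="auto")
domain_ids = kmeans.fit_predict(embeddings)

domains = {d: [a for a, lbl in zip(articles, domain_ids) if lbl == d]
           for d in range(10)}
```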
The DDI results on PubMedQA are shown below:
LLaMA model results:
| MIA | AUC_ROC (std) | TPR@1%FPR (std) | TPR@5%FPR (std) |
|---|---|---|---|
| loss | 0.81 (0.07) | 0.16 (0.11) | 0.36 (0.15) |
| zlib | 0.77 (0.07) | 0.10 (0.05) | 0.30 (0.13) |
| mink | 0.86 (0.05) | 0.25 (0.12) | 0.48 (0.15) |
| mink++ | 0.90 (0.02) | 0.31 (0.08) | 0.57 (0.08) |
| Ref | 1.00 (0.00) | 0.98 (0.02) | 1.00 (0.00) |
Mistral model results:
| MIA | AUC_ROC (std) | TPR@1%FPR (std) | TPR@5%FPR (std) |
|---|---|---|---|
| loss | 0.95 (0.03) | 0.51 (0.21) | 0.75 (0.14) |
| zlib | 0.88 (0.05) | 0.32 (0.17) | 0.57 (0.15) |
| mink | 0.98 (0.01) | 0.75 (0.14) | 0.91 (0.07) |
| mink++ | 0.99 (0.01) | 0.93 (0.07) | 0.98 (0.02) |
| Ref | 1.00 (0.00) | 1.00 (0.00) | 1.00 (0.00) |
While some datasets may exhibit the limitations of MIAs the reviewer pointed out, the above results (across semantically disjoint but distributionally consistent domains), along with the other scenarios in the paper, demonstrate that in practice there will be many datasets and workloads where DDI captures access control enforcement. We will add these new results to the camera-ready version of the paper.
Thank you for addressing my concern. I encourage you to include the above results and relevant discussion on the potential limitations of the MIA approach in the revised draft. I am happy to update my rating accordingly.
Yes, we will absolutely add the above results and the relevant discussion in the paper, along with notes on potential limitations of MIA. We greatly appreciate the reviewer's rating update.
This paper introduces Permissioned LLMs, a novel class of large language models designed to enforce organizational data access control structures directly on their query responses in enterprise settings. The core problem addressed is the breakdown of traditional access control when LLMs, fine-tuned on siloed and protected data, serve requests from individuals with varying access privileges.
优缺点分析
Strengths: The paper provides a strong formalization of the problem and the proposed solution, which is crucial for a security-related domain. The development of the "access advantage" metric and its instantiations (DDI, UGI) appears theoretically sound and practical for evaluating the efficacy of PermLLM mechanisms. The mechanisms built on PEFT leverage existing robust techniques. The extensive experiments on diverse datasets provide solid empirical validation of both the mechanisms and the metrics.
Weaknesses: The paper primarily focuses on domain-level segregation; it would be beneficial to discuss how PermLLMs scale to extremely fine-grained access control policies and to the complex hierarchical or attribute-based access control structures common in enterprises. Additionally, the paper does not explicitly detail the performance overhead (e.g., inference latency, training time, memory footprint) introduced by the PermLLM mechanisms compared to standard fine-tuning or other LLM deployment strategies. While PEFT is efficient, quantifying the added cost of access control enforcement would be valuable for practical deployment.
Questions
- How do the proposed PermLLM mechanisms scale to very large numbers of users, permission groups, or highly granular access control policies, which are common in real-world enterprise environments? Could the authors discuss the computational or architectural implications?
- While the UGI metric evaluates utility, could the authors provide a more detailed analysis or discussion on the inherent trade-offs between strict access control enforcement and the LLM's overall utility or helpfulness for authorized users, especially for complex query types?
Limitations
see questions and weaknesses
Formatting Concerns
NA
We thank the reviewer for their detailed feedback and for acknowledging the merits of our access control formalism and evaluation framework. We would like to clarify the concerns and questions raised in the review:
Scalability and applicability to complex settings
The reviewer raised valid concerns about the scalability of the approach in the comments "...it would be beneficial to discuss how PermLLMs scale..." and "How do the proposed PermLLM mechanisms scale to very large numbers of users...".
A major goal of our work was to formalize access control enforcement in LLMs and introduce empirical metrics to evaluate its effectiveness. In this paper, we focus on domain-level segregation as a foundational step. Our access control algorithms (Activate, Merge, and Union), though simple, are concrete mechanisms that achieve domain-level segregation of LLM responses based on user access privileges. More complex scenarios, such as scalability, fine-grained access, and hierarchical domains, require additional research and are left for future work. That said, we believe that finer-grained or attribute-based access control policies can, in principle, be reduced to controlled filtering or masking of training inputs within the PermLLM framework. For example, filtering specific attribute values before adapter training enables extension toward such policies without fundamental changes to the framework.
As stated in our limitations section, scaling our approach to a large number of domains (on the order of thousands) is not feasible at the moment. For large numbers of users or permission groups, we can cluster similar permission groups into hierarchical clusters and train the PEFT modules on the clusters. We leave this extension for future work.
Performance overhead of Permissioned LLMs
Regarding the comment "...the paper does not explicitly detail the performance overhead...": We can add the additional memory footprint data for our permissioned LLMs in terms of additional PEFT model parameters in the revision. We do not observe any increase in training cost for Active, for Merge the cost is nominal and independent of training data size or training time. For Union, we require training the models on the union of domains which scales the training time with the number of domain sets and their sizes. Although this scaling can inflate to 2^n sets of domains in principle, in practice, we expect the sets of domains to be much lower in many important application settings.
Clarification on utility–access control trade-offs
Regarding the comment "...discussion on the inherent trade-offs between strict access control enforcement and the LLM's overall utility...": The UGI metric measures the gap in model utility between a domain a user has access to and a domain the user doesn't have access to. It thus impacts the utility for unauthorized users by design. It does not, however, trade-off the utility for users with legitimate access to security domains even for complex queries as long as they have access to the domains required to answer the query. The UGI could however be upper-bounded by the model generalizability across different security domains in cases where the domains have correlations and/or when learnings from one security domain help answer queries for another security domain. In such scenarios, a user with no access to a target domain could still get relevant answers from a model that has generalized the learnings from other domains the user has access to, and the access controls are not violated as long as the model does not glean over the data from target domain to answer the user's query. We do not consider any cross-domain correlations in our experiment setup to keep the analysis straightforward.
We thank the reviewer for their time and would like to know if we answered their queries. If there are any remaining concerns or questions, we would be happy to respond to them.
The paper introduces Permissioned LLMs, a new framework to enforce access control on the outputs of fine-tuned language models. The authors propose mechanisms based on Parameter-Efficient Fine-Tuning to segregate data access. They also introduce two novel metrics, the Domain Distinguishability Index and the Utility Gap Index, to gauge the effectiveness of the access control systems.
Strengths: Tackles an important security problem for enterprise LLMs. Formalises the problem and introduces an auditable evaluation framework. Empirical validation is thorough across multiple datasets and models.
Weaknesses: Scalability of the proposed mechanisms (raised by L9eS, mrTF, z3PK), and the DDI metric's reliance on membership inference attacks. The Union method is computationally impractical for a large number of access domains. 1HWR correctly noted that the DDI metric is only as effective as the best available MIA (and current MIAs have clear limitations).
The decision to accept is based on the paper's novelty and importance. It provides the first formal treatment of access control for fine-tuned LLMs, which is a crucial step for building trustworthy AI systems. The authors were highly responsive during the rebuttal: they convincingly addressed the most serious criticism about the DDI metric by conducting new experiments on the PubMedQA dataset. The extra work significantly strengthened the paper and convinced 1HWR to support acceptance. Scalability remains a valid concern; however, it is an acceptable limitation for a foundational paper that opens up a new field of inquiry.