FracFace: Breaking The Visual Clues—Fractal-Based Privacy-Preserving Face Recognition
This work uses fractal features to reduce visual clues that are critical for privacy leakage while preserving high face recognition accuracy.
Abstract
Reviews and Discussion
The paper proposes FracFace, a privacy-preserving face-recognition (PPFR) framework that (i) first prunes low-utility frequency channels via a Frequency-Channel-Refining (FCR) block and then (ii) remaps the retained spectrum into a recursive fractal index space through a Frequency-Fractal-Mapping (FFM) module. The authors argue that FCR alleviates the energy sparsity that leaves visual cues in prior PPFR methods, while FFM introduces a non-linear, non-invertible permutation that further obfuscates spatial regularities. Experiments on six benchmarks (LFW, CelebA, AgeDB, CFP-FP, CALFW, CPLFW) and two attack models (U-Net, StyleGAN) show that FracFace retains ≥95% verification accuracy on LFW while lowering the SSIM of reconstructed faces by 25–60% relative to MinusFace, PartialFace, and FaceObfuscator. The authors claim (a) a state-of-the-art privacy–utility trade-off and (b) 15–60% higher resistance to white- and black-box reconstruction.
Strengths and Weaknesses
The empirical section is competently executed, featuring six public benchmarks, two reconstruction attackers (U-Net and StyleGAN2), and comparisons with eight recent PPFR baselines, which provide the reader with a reasonably broad picture of utility–privacy trade-offs. Implementation details are explicit, and reproduced numbers for ArcFace and PartialFace align with those reported in prior work, suggesting sound experimental hygiene.
The paper follows a logical flow (threat model → method → experiments → analysis); equations are typeset cleanly, and most figures (e.g., the t-SNE visual in Fig. 2 and qualitative reconstructions in Fig. 3) are easy to digest. Template security in face recognition remains a pressing problem; a lightweight transform that can be embedded at the sensor level would be valuable for edge deployments, making the topic well-motivated.
Despite the fractal rhetoric, the core technical novelty reduces to a fixed permutation of the retained DCT channels layered on top of a pruning heuristic already explored by MinusFace and PartialFace; no new learning objective, loss function, or cryptographically grounded mechanism is introduced. Because the permutation is deterministic and global, an attacker who uncovers or guesses it can revert the mapping, collapsing FracFace to a vanilla sparsity mask. The paper offers neither an information-theoretic bound nor a key-management strategy, leaving its “non-invertible” claim largely rhetorical.
The reconstruction gap over prior art (≈ 0.03–0.06 LPIPS on LFW) is marginal and reported only for two dated attackers; stronger diffusion-based or adaptive white-box attacks are absent, so it is unclear whether the method would hold up in practice. There is no ablation on fractal depth or pruning ratio, making it hard to judge robustness. Dataset coverage is narrow (frontal, high-resolution faces); low-resolution surveillance and varying pose scenarios, where frequency scrambling could behave differently, are untested.
Questions
- Can you provide a quantitative argument (e.g., entropy loss, one-way permutation hardness) that FFM is non-invertible even when its parameters are known?
- How does FracFace fare against a reconstruction adversary that knows the fractal mapping and trains an end-to-end diffusion or GAN inversion accordingly?
- What is the privacy–utility trade-off when varying fractal layers and pruning ratios?
Limitations
No. The manuscript lacks an explicit “Limitations & Societal Impact” section and therefore omits several critical issues. I recommend that the authors:
- discuss how privacy degrades if the fractal mapping becomes public, and outline mitigation (e.g., key rotation, hardware obfuscation).
- reflect on how the technique might be deployed for mass-surveillance systems that further erode civil liberties, and suggest policy or technical safeguards (opt-in consent, on-device storage only, revocation).
- include template-linkability, gradient leakage in federated settings, and adaptive diffusion-based reconstruction as open challenges.
Final Justification
The authors’ rebuttal offers several clarifications and additions that partially address the raised concerns, but key issues remain unresolved. Regarding the non-invertibility of the proposed Fractal Feature Mapping (FFM), the authors argue that exponential index growth leads to aliasing via modulo projection, rendering the mapping non-injective. While this supports collision-based non-invertibility, it falls short of establishing cryptographic hardness or one-wayness, as no complexity analysis or entropy bounds are provided.
On robustness against adaptive attacks, the authors incorporate PGDiff and StyleGAN-based attacks, assuming the transformation is known, which is a step forward. However, these attacks are not optimized for the transformed domain, and the rebuttal concedes that adversarial performance could improve with targeted retraining, leaving concerns about inversion robustness unresolved. The inclusion of ablation studies on fractal depth and pruning is sufficient and directly responds to prior requests, offering interpretable privacy–utility trade-offs. In terms of novelty, the authors reiterate FFM's structural aliasing mechanism and random kernel instantiation but introduce no new learning paradigms or theoretical constructs beyond existing permutation-based privacy-preserving methods.
Overall, although the rebuttal improves the paper by providing ablations and a stronger, albeit still generic, attack baseline, the fundamental concerns about deterministic mapping, robustness under adaptive attacks, and limited methodological novelty are only partially addressed. Therefore, I stand by my original recommendation.
Formatting Issues
No
Dear Reviewer qSTt,
We sincerely thank you for your valuable comments and suggestions. We respond to your questions and comments below:
Response to Q1. On the Non-Invertibility of FFM
Thank you for this insightful question. The non-invertibility of the FFM is proved in [S1]. We conducted a quantitative analysis to support our claim, drawing from both entropy-based evidence and one-way permutation hardness.
The mapping defined by FFM is:
$
\mathcal{F}^{[k]}[i,j] = M_0[i,j] + \sum_{\ell=1}^{k} (b_\ell - 1) \cdot \beta_\ell, \quad \text{where } \beta_\ell = \prod_{s=1}^{\ell - 1} b_s,\ \beta_1 = 1.
$
This mapping grows rapidly with depth $k$, and the output is projected into a fixed index space of size $C$. Notably, each element of $\mathcal{F}^{[k]}$ increases proportionally with $\beta_\ell$, implying that the range of $\mathcal{F}^{[k]}$ expands exponentially with $k$. To project the resulting indices into a fixed channel space, we define a modulo-based index mapping $\psi[i,j] = \big(\mathcal{F}^{[k]}[i,j] - 1\big) \bmod C + 1$.
However, due to the rapid growth of $\mathcal{F}^{[k]}$, the modulo projection inevitably causes aliasing:
$
\exists\ (i_1,j_1) \neq (i_2,j_2) \quad \text{s.t.} \quad \psi[i_1,j_1] = \psi[i_2,j_2].
$
This proves that $\psi$ is not injective, and thus the inverse is not uniquely defined.
This is not incidental, but structurally guaranteed, because:
1) the maximum index in $\mathcal{F}^{[k]}$ grows exponentially with the depth $k$;
2) the output space of $\psi$ is fixed at size $C$;
3) collisions are therefore guaranteed. For example, when $C = 81$, $k = 2$, and $b_1 = b_2 = 3$, $\mathcal{F}^{[2]}$ yields values in $[1, 162]$, which the modulo projection folds into $[1, 81]$.
The FFM transformation is non-invertible even under known parameters, due to its exponential growth in index space and subsequent modulo projection. This structural non-injectivity ensures that channel remapping cannot be deterministically reversed by an adversary, which underpins FFM's privacy-preserving design.
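To make the aliasing argument concrete, the following toy sketch (our own illustration, not the paper's exact fractal kernel: we assume position-dependent bases $b_\ell$ drawn at random, and a modulo projection of the form $\psi = \big(\mathcal{F}^{[k]} - 1\big) \bmod C + 1$) counts how many distinct outputs survive the projection:

```python
import numpy as np

# Toy illustration of FFM-style index aliasing -- NOT the paper's exact kernel.
# We assume each spatial position carries its own base sequence b_1..b_k
# (mimicking a position-dependent fractal kernel) and that the projection into
# the fixed channel space is psi = ((F - 1) mod C) + 1.
rng = np.random.default_rng(0)
H = W = 9       # toy spatial grid, so H * W = 81 input positions
C = 81          # size of the fixed output channel space
k = 2           # fractal depth
M0 = rng.integers(1, C + 1, size=(H, W))      # base index matrix in [1, C]
bases = rng.integers(2, 5, size=(H, W, k))    # per-position bases in {2, 3, 4}

def ffm_index(i: int, j: int) -> int:
    """F^[k][i,j] = M0[i,j] + sum_l (b_l - 1) * beta_l, with beta_l = prod_{s<l} b_s."""
    idx, beta = int(M0[i, j]), 1
    for ell in range(k):
        b = int(bases[i, j, ell])
        idx += (b - 1) * beta
        beta *= b
    return idx

raw = np.array([[ffm_index(i, j) for j in range(W)] for i in range(H)])
psi = (raw - 1) % C + 1   # modulo projection back into [1, C]

# Non-injectivity: fewer distinct outputs than inputs means collisions, so the
# mapping cannot be uniquely inverted even with full knowledge of M0 and bases.
n_inputs = H * W
n_outputs = len(np.unique(psi))
print(f"{n_inputs} inputs -> {n_outputs} distinct outputs")
assert n_outputs < n_inputs
```

With 81 indices folded into an 81-channel space, birthday-style collisions are overwhelmingly likely, which matches the structural argument above.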
References
[S1] Double parameters fractal sorting matrix and its application in image encryption.
Response to Q2. Robustness Against Diffusion/GAN-based Reconstruction
Thank you for raising this important question. We extended our evaluation to stronger adversaries by including U-Net, StyleGAN, and Diffusion-based inversion models. As shown in Table R1, we benchmarked these attacks on SSIM, MSE, and IDS.
Table R1: Comparison of Reconstruction Attacks Against FracFace
| Attack Model | SSIM ↓ | MSE ↓ | IDS ↓ | Inference Time |
|---|---|---|---|---|
| U-Net | 0.3613 | 0.0853 | 0.3028 | ~1.2 sec |
| StyleGAN | 0.3151 | 0.0962 | 0.4936 | ~5.3 sec |
| Diffusion (PGDiff, NeurIPS 2023) | 0.3809 | 0.0623 | 0.4325 | ~300 sec |
While diffusion models can generate finer textures, they do not significantly outperform U-Net or StyleGAN in terms of privacy leakage under our setting. In fact, they slightly underperform in IDS and SSIM, and incur substantially higher computational cost (e.g., ~300s per image even with only 20 denoising steps).
Importantly, this experiment demonstrates FracFace’s resilience even against strong reconstruction adversaries with full knowledge of the fractal mapping. Despite that, FracFace still achieves low identity leakage and reconstruction quality, validating its effectiveness across diverse threat models.
Due to the time limit, we used the SOTA pre-trained diffusion model []. It may be possible to train an end-to-end diffusion or GAN inversion model that performs better than the pre-trained model, but we believe the improvement would not be significant, because FracFace fundamentally removes the detailed visual clues (see Fig. 3) that reconstruction models rely on.
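For readers who want to reproduce metrics of the kind reported in Table R1, a minimal sketch follows. It computes MSE and a single-window (global) SSIM on [0, 1] grayscale arrays; the paper presumably uses the standard windowed SSIM, so treat this as an approximation for illustration only:

```python
import numpy as np

# Minimal sketch of reconstruction-quality metrics (MSE and a global SSIM).
# This single-window SSIM is a simplification of the usual windowed variant.
def mse(x, y):
    return float(np.mean((x - y) ** 2))

def global_ssim(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return float(((2 * mx * my + c1) * (2 * cov + c2))
                 / ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2)))

rng = np.random.default_rng(0)
orig = rng.random((112, 112))   # stand-in for an original 112x112 face image
recon = np.clip(orig + 0.3 * rng.normal(size=orig.shape), 0, 1)  # noisy "reconstruction"

print(f"SSIM={global_ssim(orig, recon):.3f}  MSE={mse(orig, recon):.4f}")
assert mse(orig, orig) == 0.0                      # identical images: zero error
assert abs(global_ssim(orig, orig) - 1.0) < 1e-9   # identical images: SSIM ~ 1
```

Lower SSIM and higher MSE against the original indicate weaker reconstructions, which is the direction reported for FracFace in Table R1.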
Response to Q3. Privacy–Utility Trade-off under Varying Fractal Layers and Pruning Ratios
We performed ablation studies to investigate how the number of fractal layers $k$ and the pruning ratio affect the privacy–utility trade-off.
(1) Impact of Fractal Layers $k$
We varied the number of FFM iterations $k$ and evaluated both recognition accuracy and privacy robustness under StyleGAN-based reconstruction. As shown in Table R2, increasing $k$ leads to stronger privacy protection (LPIPS ↑, SSIM ↓), but accuracy begins to decline beyond $k=2$. We identify $k=2$ as the optimal point, offering a 20% LPIPS gain over $k=1$ with only a 0.02% drop in accuracy. Visualization results will be included in the final version, per policy.
Table R2: Effect of Fractal Depth $k$ on Privacy–Utility Trade-off
| $k$ | Accuracy (%) | SSIM ↓ | LPIPS ↑ |
|---|---|---|---|
| 1 | 99.71 | 0.5227 | 0.5291 |
| 2 | 99.69 | 0.4015 | 0.6353 |
| 3 | 96.46 | 0.3729 | 0.7925 |
| 4 | 92.13 | 0.2580 | 0.8357 |
(2) Impact of FBA Pruning Ratios
To evaluate sensitivity to pruning, we ranked DCT channels by global energy and retained the top-N at five pruning levels: 20%, 40%, 50%, 60%, and 80%. As shown in Table R3, stronger pruning improves privacy but degrades accuracy—especially beyond 50%. The 50% level provides the best trade-off: LPIPS improves by 35% over the 20% case, while accuracy remains high (99.69%).
Table R3: Effect of Pruning Ratios on Privacy–Utility Trade-off
| Pruning Ratios | Accuracy (%) | SSIM ↓ | LPIPS ↑ |
|---|---|---|---|
| 20% | 99.83 | 0.7857 | 0.3184 |
| 40% | 99.71 | 0.6291 | 0.4833 |
| 50% | 99.69 | 0.3012 | 0.6839 |
| 60% | 89.26 | 0.3109 | 0.7294 |
| 80% | 87.24 | 0.2793 | 0.8605 |
These results confirm a clear privacy–utility trade-off. The setting with $k=2$ and 50% pruning achieves strong privacy protection while keeping recognition accuracy high. We will include these findings and full visualizations in the final version.
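As a concrete reading of the pruning procedure above, the sketch below ranks DCT channels by global energy over a calibration batch and retains the top-N; the tensor shapes and the 192-channel layout are our assumptions for illustration, not the paper's exact pipeline.

```python
import numpy as np

# Hedged sketch of energy-based channel pruning as we read Table R3's setup:
# rank all DCT channels by total energy over a calibration batch, keep the
# top-N. The (batch, 192, 14, 14) shape is an assumption for illustration.
rng = np.random.default_rng(1)
feats = rng.normal(size=(32, 192, 14, 14))   # stand-in frequency-channel features

def prune_by_energy(x: np.ndarray, keep_ratio: float):
    """Return the top-`keep_ratio` channels by global energy, plus their indices."""
    energy = (x ** 2).sum(axis=(0, 2, 3))              # total energy per channel
    n_keep = int(round(keep_ratio * x.shape[1]))
    keep = np.sort(np.argsort(energy)[::-1][:n_keep])  # kept channel indices, ascending
    return x[:, keep], keep

pruned, kept = prune_by_energy(feats, keep_ratio=0.5)  # the 50% setting favored above
print(pruned.shape, kept.size)  # (32, 96, 14, 14) 96
```

A fixed `keep_ratio` corresponds to the fixed pruning strategy the authors adopt; sweeping it reproduces the privacy–utility curve that Table R3 tabulates.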
Response to Weakness 1
We thank the reviewer for raising this critical question regarding the novelty and security of our approach.
Firstly, prior works such as MinusFace and PartialFace have explored coefficient pruning in the frequency domain through static filtering. However, they do not address the deeper issue of frequency-domain sparsity, a structural vulnerability that makes frequency maps prone to reconfiguration and inversion. In contrast, FracFace explicitly targets this limitation through a novel fractal-based remapping mechanism, called FFM. As shown in our ablation results (Fig. 6 and Table 3), pruning alone is insufficient to disrupt reconstructive cues, and it is the FFM module that fundamentally improves privacy while preserving recognition performance.
Furthermore, FFM operates through recursive fractal expansion followed by a modulo projection, which inherently leads to non-injective index mappings—that is, multiple inputs may map to the same output. This structural aliasing ensures that even when the mapping function is fully known, it cannot be uniquely inverted. We provide a quantitative argument of this non-invertibility in Q1.
Finally, we operate under a strong white-box threat model, where the adversary is assumed to know the FFM mechanism. However, they do not know the specific instance of the random Fractal Kernel and Fractal Lattice, which vary across deployments and are not recoverable from outputs alone (see Q2).
We will include these analyses in the final version to clarify that FracFace innovatively applies fractal remapping to move beyond simple pruning and to achieve provable non-invertibility even under strong threat models.
Response to Weakness 2
We address each concern as follows:
First, while LPIPS improvements may appear numerically small, privacy leakage is assessed through multiple complementary metrics: in addition to LPIPS, we report IDS, SSIM, and MSE, which together offer a fuller picture of reconstruction quality and identity preservation (see Table 2 in our paper).
Next, we further evaluated FracFace under stronger adversaries, including U-Net, StyleGAN, and a diffusion-based inversion model; detailed results and discussion are provided in Q2. Moreover, we have conducted ablation studies on fractal depth and pruning ratio (see Q3).
In terms of dataset coverage, FracFace has been evaluated on 8 public datasets, including AgeDB, CFP-FP, IJB-B, and IJB-C, which introduce challenges such as pose variation, lighting, and partial occlusion. In particular, CelebA, IJB-B, and IJB-C reflect unconstrained and non-frontal conditions. FracFace consistently outperforms baselines under these scenarios (Table 2), validating its robustness beyond frontal high-quality settings.
Finally, regarding low-resolution scenarios, we note that degraded images often provide inherent privacy benefits by reducing recoverable visual details, which aligns with our core goal of obscuring identity while preserving utility.
Response to Limitations
Thank you for raising this important point. While a brief discussion of limitations is included in Appendix A.6 of the main paper, we agree that a more explicit and thorough treatment of the suggested issues would strengthen the work. We will incorporate these discussions in the final version. Due to the word limit of this response, we are unable to elaborate in detail here but would be happy to provide further clarification during the next review phase.
Looking forward to further discussion and feedback.
Dear Reviewer qSTt,
We sincerely appreciate your valuable time and effort in reviewing our paper and helping to improve its quality.
Since you have provided a mandatory acknowledgement without additional comments, we would like to kindly ask if there are any remaining concerns or unresolved questions that we could further clarify or address. We would be sincerely grateful if you could provide a response.
Best Regards
The Authors
This paper proposes FracFace, a novel PPFR framework that conceals visual clues of face images by operating in the frequency domain. The method introduces two main components: a frequency channel refining module that reduces frequency domain sparsity by suppressing non-identity-related components, and a frequency fractal mapping module that remaps refined features into a fractal structure to hinder reconstruction. Extensive experimental results suggest that FracFace achieves strong resistance to both U-Net and StyleGAN-based attacks while preserving recognition accuracy.
Strengths and Weaknesses
Strengths
- Privacy-preserving face recognition is an important topic with societal impact. It is worthy of further investigation.
- Though frequency-based PPFR is not new, the method targets frequency sparsity, a clear and under-addressed vulnerability in prior PPFR designs.
- The method outperforms prior works in resiliency to both AE- and GAN-based reconstruction attacks, and achieves satisfactory recognition accuracy.
Weaknesses
- The motivation behind fractal kernel construction could be more intuitive; some components are overly complex without clear ablation.
- The reliance on heuristic pruning rules in frequency channel refining seems to lack adaptivity.
Questions
- The paper presents incremental novelty: frequency-based PPFR approaches have long been explored in prior work. However, the authors do introduce new techniques to effectively reduce feature sparsity in the frequency domain, which is believed to be critical for defending against reconstruction attacks. The proposed modules are thoughtfully integrated into the frequency-based PPFR framework and demonstrate strong empirical performance.
- Among different options that could obfuscate reconstruction, what is the motivation for using fractal structures specifically? Could similar gains be achieved via other non-linear mappings or learned index transformations?
- How sensitive is FracFace to the choice of pruning strategy in FCR? Have the authors considered adaptively learning the channel selection or making it adaptive to image content?
Limitations
Yes.
Final Justification
I appreciate the authors' response. After considering their rebuttal and the feedback from my fellow reviewers, I have decided to maintain my original score.
Formatting Issues
N/A.
Dear Reviewer NRe7,
Thank you for your positive evaluation and encouraging remarks! We carefully investigated each question and weakness, trying to address them to the best of our ability within the time frame. Our responses are as follows:
Response to Q1. Novelty
The paper presents incremental novelty: frequency-based PPFR approaches have long been explored in prior work. However, the authors do introduce new techniques to effectively reduce feature sparsity in the frequency domain, which is believed to be critical for defending against reconstruction attacks. The proposed modules are thoughtfully integrated into the frequency-based PPFR framework and demonstrate strong empirical performance.
We greatly appreciate your recognition of the technical contributions in our work, especially regarding the challenge of frequency domain sparsity. As you rightly pointed out, frequency-based PPFR pipelines have been widely explored. However, most existing approaches focus on filtering or masking coefficients, without explicitly tackling the structural sparsity in the frequency domain. This sparsity inherently permits reconfigurability and reversibility, posing a serious vulnerability under reconstruction attacks. In this work, we identify frequency domain sparsity as a key limitation that compromises the balance between privacy and usability. As shown in our ablation studies (Fig. 6 and Table 3 in our main paper), improvements in the frequency domain, for example through the FCR module, alone are limited in addressing the core challenge. This motivates the need for a novel approach. To this end, we innovatively leverage the inherent self-similarity and recursive locality of fractal structures, and design the FFM module to effectively mitigate frequency-domain sparsity. FFM enables FracFace to systematically overcome frequency sparsity and achieve robust privacy protection without sacrificing usability.
Response to Q2. Motivation & Performance
Among different options that could obfuscate reconstruction, what is the motivation for using fractal structures specifically? Could similar gains be achieved via other non-linear mappings or learned index transformations?
We thank the reviewer for this insightful question. The motivation for designing fractal structures is to defend against reconstruction attacks by disrupting the structure of frequency representations exploited by generative models. In developing this mechanism, we explored various alternatives, such as randomized nonlinear mappings and learned index transformations. We ultimately chose fractal structures because they offer a rare combination of effective reconstruction resistance and preservation of recognition performance.
Specifically, our choice of fractal-based frequency remapping is motivated by two intrinsic properties of fractals:
1. Local Recursion: Fractal mappings expand spatially through recursive integer rules, generating complex and multiscale index patterns that are not easily invertible. This recursive structure increases entropy in the transformed space and helps prevent reconstruction, even when the adversary has full white box access to the approach.
2. Self-Similarity: Despite structural complexity, fractals preserve local correlations across scales, which helps maintain the semantic consistency of identity-related signals. This allows recognition models to extract usable embeddings, preserving utility even under aggressive obfuscation.
These two properties complement each other and together enable us to tackle a core problem in PPFR: balancing privacy and usability. As shown in our results (Table 1 and 2), FracFace significantly boosts resistance to reconstruction attacks (e.g., 60% LPIPS increase), while keeping recognition accuracy almost unchanged (e.g., only 0.04% drop on LFW), highlighting its effectiveness in achieving this balance.
We explored several alternative transformations, including:
- Random or pseudo-random channel permutations, which are easy to reverse if known and often degrade utility due to lack of structural regularity;
- Learned index transformations, which introduce additional parameters and may suffer from overfitting or instability under unseen conditions;
- Frequency-space scrambling based on fixed priors, which often fail to generalize across identities and domains.
While these alternatives had some potential, they failed to meet two key requirements of our approach: structural non-invertibility and semantic preservation. In contrast, fractal mappings are deterministic yet chaotic, simple yet expressive. This unique combination allows us to build a defense that is both mathematically analyzable and practically effective. We will add this analysis.
Response to Q3. On the Sensitivity to FCR Pruning Strategy
How sensitive is FracFace to the choice of pruning strategy in FCR? Have the authors considered adaptively learning the channel selection or making it adaptive to image content?
Thank you for your thoughtful question. In our work, we carefully analyzed the sensitivity of pruning strategies. Our choice was guided by both empirical results and prior studies, which show that overly aggressive or overly conservative pruning often leads to unstable or suboptimal performance. To understand this behavior in greater detail, we focused our analysis on the 40%–60% pruning range, where the transition became particularly evident. As shown in Table R1, we found that retaining approximately 50% of the DCT channels (i.e., 81 out of 192) provided the best trade-off: it significantly improved visual obfuscation (LPIPS increased from 0.3184 to 0.6839) while maintaining high recognition accuracy (99.69%).
Table R1: Effect of Refining Ratio on Privacy–Utility Trade-off
| Pruning | Accuracy (%) | SSIM↓ | LPIPS↑ |
|---|---|---|---|
| 20% | 99.83 | 0.7857 | 0.3184 |
| 40% | 99.71 | 0.6291 | 0.4833 |
| 50% | 99.69 | 0.3012 | 0.6839 |
| 60% | 89.26 | 0.3109 | 0.7294 |
| 80% | 87.24 | 0.2793 | 0.8605 |
In fact, the process of selecting a pruning strategy can itself be seen as a form of adaptive learning, in that it evaluates how channel selection affects privacy and utility.
In practical applications, adapting the pruning strategy to each image introduces non-negligible computational overhead and deployment cost—particularly since FracFace operates on high-resolution images (112×112). For inputs with consistent pixel dimensions, our experiments show that a fixed pruning strategy offers greater efficiency and stability. Therefore, we adopt a fixed pruning approach and recommend cropping images to a consistent resolution when using FracFace.
We appreciate your suggestion and will add the above in the final version.
Response to W1. Motivation & Ablation
The motivation behind fractal kernel construction could be more intuitive; some components are overly complex without clear ablation.
Thank you for the valuable comment. We would like to clarify the intuition behind the fractal kernel and will include it in the final version. As detailed in our response to Q2, the design is motivated by optimizing the trade-off between privacy and utility, leveraging the unique properties of fractals. We have also examined ablation effects related to FCR pruning sensitivity in Q3. Moreover, we conducted an additional ablation study, varying the number of fractal iterations $k$, to assess both recognition accuracy and privacy robustness under a StyleGAN-based reconstruction attack. As summarized in Table R2, increasing $k$ consistently enhances privacy (LPIPS ↑ from 0.53 to 0.84; SSIM ↓ from 0.52 to 0.26), indicating stronger visual obfuscation. However, recognition accuracy starts to decline beyond $k=2$. We find $k=2$ strikes the best balance, achieving a 20% LPIPS improvement over $k=1$ with only a negligible 0.02% accuracy drop. We will add these findings to the final version to support the discussion.
Table R2: Impact of Fractal Depth $k$ on Privacy–Utility Trade-off
| $k$ | Accuracy (%) | SSIM ↓ | LPIPS ↑ |
|---|---|---|---|
| 1 | 99.71 | 0.5227 | 0.5291 |
| 2 | 99.69 | 0.4015 | 0.6353 |
| 3 | 96.46 | 0.3729 | 0.7925 |
| 4 | 92.13 | 0.2580 | 0.8357 |
Response to W2. The Reliance on Heuristic Pruning Rules in Frequency Channel Refining Seems to Lack Adaptivity
The reliance on heuristic pruning rules in frequency channel refining seems to lack adaptivity.
Thank you for the insightful comment. We have discussed this issue in Q3, where we evaluated the trade-off between pruning strength and performance, and highlighted that a fixed strategy achieves stable results across typical input resolutions (e.g., 112×112). While content-aware refining could offer finer control, it may also introduce additional computational cost. We will clarify this rationale in the final version and include adaptive refinement as a potential direction for future work.
We hope we have addressed your questions. Please let us know your further concerns during discussion.
Dear Reviewer NRe7,
We appreciate your continued engagement with our work.
If there are any aspects that you feel could benefit from further clarification, we would be happy to address them as we finalize the paper. We remain committed to improving the clarity and overall quality of the submission.
We truly value your insights and would greatly appreciate any further suggestions you may have. We will be staying online and responsive in the coming days, should there be anything we can clarify or improve further.
I appreciate the authors' response. After considering their rebuttal and the feedback from my fellow reviewers, I have decided to maintain my original score.
This study proposes FracFace, a fractal-based privacy-preserving face recognition framework that disrupts the spatial structure of frequency-domain features to weaken reconstruction attacks while retaining identity-critical information. It consists of two modules: Frequency Channels Refining (FCR): attenuates non-identity-related frequency bands (e.g., skin color, lighting, expression) to reduce feature sparsity and frequency interference; and Frequency Fractal Mapping (FFM): leverages fractal self-similarity and local chaos to remap the refined frequency channels into an irreversible fractal structure, imposing structured perturbations on deep representations. Evaluated on common benchmarks, the method delivers lower SSIM and higher LPIPS under U-Net and StyleGAN attacks while maintaining high recognition accuracy, outperforming six state-of-the-art privacy protection methods.
Strengths and Weaknesses
Strengths:
- Originality: Introduced a fractal-based paradigm (FFM) to handle PPFR, bringing fresh insights into how self-similar, multi-scale structures can obscure frequency-domain cues.
- Comprehensive experiments: Evaluated privacy and recognition trade-offs against two different reconstruction models (U-Net and StyleGAN), demonstrating consistent robustness improvements across both white-box and black-box scenarios.
- Quantitative privacy gains: Reported clear improvements—15% to 60% better attack resistance—in SSIM and LPIPS metrics, clearly showing reduced visual recoverability of facial features.
- High recognition accuracy: Despite aggressive obfuscation, the method maintained top-tier face-verification performance on multiple public benchmarks (LFW, CelebA, AgeDB, CFP-FP, CALFW, CPLFW), underlining balanced privacy-utility performance.
- Thorough comparative study: Benchmarked against six state-of-the-art PPFR methods, providing a clear picture of where FracFace stands relative to existing cryptographic and non-cryptographic techniques.
Weaknesses:
- The impact of different iteration depths k in the FFM module on both privacy protection and recognition accuracy is not fully explored—no systematic study shows how varying k levels shift the trade-off.
- The paper does not analyze how varying the attenuation intensity in the Frequency Band Attenuation step affects the balance between residual visual leakage and identity retention.
- Although the FCR module is designed to suppress non-identity frequency bands, the manuscript lacks a detailed channel-level study identifying which exact bands most strongly impact privacy leakage versus recognition performance. In particular, the brief mention in [line 158] of “predefining three frequency ranges” is too cursory—additional experiments should systematically evaluate each individual band’s contribution to both reconstruction vulnerability (e.g., SSIM/LPIPS changes when that band alone is masked or retained) and recognition utility, thereby providing a clear guidance on optimal frequency-wise pruning.
- The paper does not specify how the "Protection (%)" values in Table 1 are computed, making the metric unreproducible and somewhat subjective. [lines 206–207] state: "Green denotes the proportion associated with the privacy protection level." However, it remains unclear whether this refers to the proportion of original frequency channels (or visual cues) that are effectively concealed by each method.
- In Fig. 1’s pipeline, the depiction of the “Frequency Band Attenuation” shows a pruning space P that does not appear to match the formula [line 159] P = P₁ ∪ P₂ ∪ P₃ given in the text. Furthermore, in the “Refined Frequency Channel” box there is no indication of how S(1) and S(2) are separated. The paper needs to include a more detailed explanation of these points.
Questions
- Ablation on fractal depth k
- Please include a systematic study of how varying the number of FFM iterations (k) affects both privacy (e.g. SSIM/LPIPS under your reconstruction attacks) and recognition accuracy.
- Actionable Guidance: Report curves for k=1 through k=N (or the relevant range), and identify the “sweet spot” where additional iterations no longer yield meaningful privacy gains or begin to hurt accuracy. And show the results on StyleGAN reconstruction.
- Sensitivity to FBA pruning strength
- Can you quantify the impact of different low-frequency attenuation levels on privacy vs. utility? For instance, evaluate the performance under different levels.
- Actionable Guidance: Provide metrics on feature sparsity, SSIM/LPIPS, and recognition accuracy for each level, and recommend an optimal range.
- Evaluation Criterion: Demonstrating a clear privacy–utility curve with an identified optimal level band would strengthen the paper’s practical relevance and raise the score.
- Exact Definition & Computation of Protection (%)
- Please provide the precise formula or algorithm you used to compute the “Protection (%)” entries in Table 1. Specifically, state what quantities appear in the numerator and denominator (e.g. number of masked frequency channels over total channels, or number of pixels over total pixels), and include a short numerical example.
- Explain Separation of S(1) vs. S(2) in “Refined Frequency Channel”
- The box labeled “Refined Frequency Channel” shows outputs S(1) and S(2) but gives no indication how they are computed or split. Please add a concise algorithmic description (pseudocode or bullet steps) or a small sub‐diagram indicating which operations produce S(1) vs. S(2).
- Per-channel frequency sensitivity analysis
- Which specific frequency bands contribute most to privacy leakage or recognition performance? A “mask-only” vs. “retain-only” experiment per band would clarify this.
- Actionable Guidance: For each grouped band (e.g. low/mid/high), measure the change in SSIM/LPIPS and accuracy when that band alone is kept or pruned. Present results as a heatmap.
- Evaluation Criterion: If you identify and justify a minimal subset of channels that capture identity while minimizing leak risk, the methodological clarity and impact would improve.
We suggest consulting Wang et al. [1*]—in particular Sections 4.1 (“White-box Attacking Experiments”) and 4.2 (“Black-box Attacking Experiments”)—for both qualitative and quantitative reconstruction results across diverse attack scenarios. Additionally, see the ablation study in Wang et al. [2*], which examines the relationship between mask ratio and recognition performance.
[1*] Yinggui Wang, Jian Liu, Man Luo, Le Yang, and Li Wang, “Privacy-preserving face recognition in the frequency domain,” AAAI 2022.
[2*] K. Wang et al., “FaceMAE: Privacy-preserving face recognition via masked autoencoders,” arXiv:2205.11090, 2022.
Limitations
Yes
Final Justification
The authors have addressed all of my concerns and questions. After considering the comments from other reviewers and their communications with the authors, I have finally decided to raise my rating.
Formatting Issues
All looks fine.
Dear Reviewer qzRH,
Thank you for your positive evaluation and encouraging remarks! We carefully investigated each question and weakness, trying to address them to the best of our capability within the time frame. Our responses are as follows:
Response to Q1. Ablation on Fractal Depth
Please include a systematic study of how varying the number of FFM iterations (k) affects both privacy and recognition accuracy.
Thank you for this constructive suggestion. We conducted the requested ablation study by varying the number of fractal iterations k, evaluating both recognition accuracy and privacy robustness under the StyleGAN-based reconstruction attack.
The results are summarized in Table R1 below. As k increases, privacy improves consistently—LPIPS increases from 0.53 to 0.84, and SSIM drops from 0.52 to 0.26—indicating stronger obfuscation. However, recognition accuracy begins to degrade beyond k=2. Notably, k=2 emerges as the optimal trade-off, offering a 20% LPIPS gain over k=1 with a minimal 0.02% drop in accuracy. The StyleGAN reconstruction images cannot be uploaded in this response; we will include them in the final version.
Table R1: Impact of Fractal Depth on Privacy–Utility Trade-off
| k | Accuracy (%) | SSIM ↓ | LPIPS ↑ |
|---|---|---|---|
| 1 | 99.71 | 0.5227 | 0.5291 |
| 2 | 99.69 | 0.4015 | 0.6353 |
| 3 | 96.46 | 0.3729 | 0.7925 |
| 4 | 92.13 | 0.2580 | 0.8357 |
Response to Q2. Sensitivity to FBA Pruning Strength
“Can you quantify the impact of different low-frequency attenuation levels on privacy vs. utility?”
Thank you for the insightful suggestion. To assess the sensitivity of our system to FBA pruning strength, we conducted additional experiments varying the refining ratio in the FCR module using an energy-guided channel pruning strategy. Specifically, we ranked DCT channels by global energy and retained the top-N channels corresponding to five pruning levels: 20%, 40%, 50%, 60%, and 80% (inspired by FaceMAE). We evaluated the effect of each level on feature sparsity, privacy robustness (SSIM / LPIPS), and recognition accuracy. The results are presented in Table R2 below.
Table R2: Effect of FBA Pruning Strength on Privacy–Utility Trade-off
| Refining Ratio | Accuracy (%) | SSIM ↓ | LPIPS ↑ |
|---|---|---|---|
| 20% | 99.83 | 0.7857 | 0.3184 |
| 40% | 99.71 | 0.6291 | 0.4833 |
| 50% | 99.69 | 0.3012 | 0.6839 |
| 60% | 89.26 | 0.3109 | 0.7294 |
| 80% | 87.24 | 0.2793 | 0.8605 |
We observe a clear privacy–utility trade-off curve: as refining becomes stronger (i.e., more channels are removed), privacy consistently improves (e.g., LPIPS: 0.32 → 0.86), but recognition utility drops, especially beyond 50%. Notably, 50% pruning yields the best balance, increasing LPIPS by 35% over the 20% case while maintaining near-perfect recognition accuracy (99.69%). Beyond that point, privacy gains plateau while accuracy continues to fall (e.g., 60% → 89.26%, 80% → 87.24%). These results highlight that moderate refining (40–50%) forms an optimal operating range, effectively improving privacy with minimal impact on utility, which justifies our default choice of 50% refinement in the main paper. We will include this ablation and its discussion in the final version. The privacy–utility curve figure cannot be uploaded in this response; we will include it in the final version. We thank the reviewer for encouraging this more comprehensive analysis.
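The energy-guided channel pruning described above can be sketched as follows. This is a minimal illustration: the function name `prune_channels_by_energy` and the synthetic `energy` values are ours, while the 192-channel count and the 50% refining ratio follow the text.

```python
import numpy as np

def prune_channels_by_energy(channel_energy, keep_ratio):
    """Return indices of the top-N DCT channels ranked by global energy.

    channel_energy: 1-D array with one aggregate energy value per channel.
    keep_ratio: fraction of channels to retain (e.g. 0.5 for 50% refining).
    """
    n_keep = int(round(len(channel_energy) * keep_ratio))
    # argsort ascending; the last n_keep indices are the highest-energy channels
    ranked = np.argsort(channel_energy)
    return np.sort(ranked[-n_keep:])

# Toy example: 192 channels (12x16 DCT) with synthetic energies.
rng = np.random.default_rng(0)
energy = rng.random(192)
kept = prune_channels_by_energy(energy, keep_ratio=0.5)
print(len(kept))  # 96 channels retained at the 50% level
```

Every retained channel has at least as much energy as every pruned one, which is the property the refining ratio sweep in Table R2 varies.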
Response to Q3. Definition and Computation of Protection (%)
“Specifically, state what quantities appear in the numerator and denominator, and include a short numerical example.”
Thank you for your insightful question. We agree that the computation of Protection (%) warrants a precise definition and will add it to the final version along with a numerical example.
Definition: Protection (%) represents the share of frequency-domain channels that are either (i) filtered out by FCR or (ii) structurally disrupted by FFM, as a percentage of the total number of channels.
Protection (%) = (N_FCR + N_FFM) / N_total × 100
Where:
- N_total: total number of DCT frequency channels (e.g., 192 for 12×16 DCT).
- N_FCR: number of low-energy channels pruned by FCR.
- N_FFM: number of remaining channels remapped by FFM.
Here is a numerical example illustrating the calculation of Protection (%):
Table R3: Example calculation
| Method | Total channels | Pruned by FCR | Remapped by FFM | Protection (%) |
|---|---|---|---|---|
| PartialFace | 192 | 130 | 0 | 67.7 |
| FaceObfuscator | 128 | 112 | 0 | 87.5 |
| FracFace (Ours) | 192 | 111 | 81 | 100.0 |
This unified formulation allows for consistent, interpretable comparison of visual disruption levels across methods, regardless of the specific masking or remapping mechanism.
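Under this definition, the entries of Table R3 can be reproduced with a one-line computation (the function and variable names `protection_pct`, `n_total`, `n_fcr`, `n_ffm` are ours, matching the quantities defined above):

```python
def protection_pct(n_total, n_fcr, n_ffm):
    """Share of channels either pruned by FCR or remapped by FFM."""
    return 100.0 * (n_fcr + n_ffm) / n_total

# Rows of Table R3
print(round(protection_pct(192, 130, 0), 1))   # PartialFace -> 67.7
print(round(protection_pct(128, 112, 0), 1))   # FaceObfuscator -> 87.5
print(round(protection_pct(192, 111, 81), 1))  # FracFace -> 100.0
```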
Response to Q4. Pseudocode
The box labeled “Refined Frequency Channel” shows outputs S(1) and S(2) but gives no indication how they are computed or split.
Thank you for the helpful suggestion. We acknowledge that the calculation process of S(1) and S(2) lacked clarity and appreciate the opportunity to improve our explanation. To construct the two refined frequency channel groups S(1) and S(2), we apply an S-shaped traversal over the frequency index grid. The pseudocode is as follows:

```python
# Step 1: Create the frequency index grid (M x N)
M, N = 12, 12
F = [[i * N + j for j in range(N)] for i in range(M)]

# Step 2: S-shaped traversal (alternating row direction)
freq_list = []
for i in range(M):
    row = F[i]
    if i % 2 == 0:
        freq_list.extend(row)        # left-to-right
    else:
        freq_list.extend(row[::-1])  # right-to-left

# Step 3: Split the traversal order into two groups
S1 = freq_list[0:80]
S2 = freq_list[80:161]  # Python clips the slice at the end of the list
```
The indexing method is illustrated in Fig. 8. We will include the pseudocode and a sub-diagram showing how S(1) and S(2) are generated to improve clarity in the final version.
Response to Q5. The Sensitivity Analysis of Different Frequency Bands
Which specific frequency bands contribute most to privacy leakage or recognition performance?
Thank you for the insightful suggestion. Following your advice, we conducted a per-band frequency sensitivity analysis to investigate which frequency ranges contribute most to privacy leakage and recognition performance. As shown in Table R4, we conducted both “retain-only” and “mask-only” experiments on three grouped frequency bands (Low, Mid, High):
- Low-frequency channels are strongly correlated with visual appearance and structure. Masking them significantly improves privacy (↓SSIM, ↑LPIPS) without hurting recognition (99.41% accuracy), confirming their dominant role in visual leakage.
- Mid-frequency channels are most critical for identity representation. Retaining only mid-bands achieves 97.69% accuracy, while masking them causes a notable drop to 92.15%, indicating their contribution to discriminative identity cues.
- High-frequency channels provide limited standalone utility but help encode textures. Masking them has little impact on accuracy but slightly reduces privacy performance. (Clarity: We provide the result data in Table R4. We are unable to provide the heatmap visualization under the rebuttal policy, but we will include it in the final version.)
Table R4: The Sensitivity Analysis of Different Frequency Bands
| Group | Mode | Accuracy (%) ↑ | SSIM ↓ | LPIPS ↑ |
|---|---|---|---|---|
| Low | Retain | 93.58 | 0.7526 | 0.3583 |
| Low | Mask | 99.41 | 0.4253 | 0.5840 |
| Mid | Retain | 97.69 | 0.6291 | 0.4269 |
| Mid | Mask | 92.15 | 0.5832 | 0.4503 |
Response to W1. Ablation on Fractal Depth k
Please see our response to Q1 for detailed analysis.
Response to W2. Sensitivity to FBA Refining Strength
Please see our response to Q2 for detailed analysis.
Response to W3. The Sensitivity Analysis of Different Frequency Bands
Please see our response to Q5 for detailed analysis.
Response to W4. Definition and Computation of Protection (%)
Please see our response to Q3 for detailed analysis.
Response to W5. Clarifying “P = P₁ ∪ P₂ ∪ P₃” and the generation of S(1)/S(2)
Thank you for pointing out this important issue. We confirm that our implementation follows the formula P = P₁ ∪ P₂ ∪ P₃ as stated in line 159. After converting the input image to the frequency domain, we retain a three-layer structure, where each Pᵢ corresponds to a group of channels ordered by energy. Specifically, P₁ captures dominant identity-relevant signals, while P₂ and P₃ reflect more fine-grained appearance traits such as illumination and texture. Our FCR module applies layer-wise frequency refining to remove appearance-related components while preserving discriminative features for recognition. Regarding the separation of the refined channels into S(1) and S(2), we apply an S-shaped scanning pattern across the frequency grid and split the resulting index sequence evenly into two subsets. This design ensures spatial and frequency diversity across both groups (see our response to Q4 for more detail). We appreciate this helpful suggestion and will revise Fig. 1 in the final version to clearly reflect both the layered pruning structure and the generation of S(1) and S(2).
We hope we have addressed your questions. Please let us know your further concerns during discussion.
Dear Reviewer qzRH,
We sincerely appreciate your recognition of the strengths of our work, as well as your constructive comments. We have carefully considered all of your feedback and have provided detailed responses to each of your concerns.
We would greatly appreciate it if you could take a moment to review our responses. If our response does not fully resolve your concerns, we would be happy to engage in further discussion. We are fully committed to improving the clarity and quality of our submission.
We will remain online and responsive during the discussion period, and we truly value any additional feedback you may have.
Thank you again for your time and consideration.
Best Regards
The Authors
I appreciate the authors’ rebuttal, which has addressed all of the questions and concerns I previously raised. I have also reviewed the issues raised by other reviewers and the subsequent discussions between the reviewers and the authors. I find no remaining concerns and have no further questions.
Dear Reviewer qzRH,
We thank you for your reply and for the time and effort you devoted to reviewing our paper. Your comments have been invaluable in helping us improve the work, and we greatly appreciate your recognition of our contributions.
If there are any remaining issues that might still affect your evaluation, we would be grateful to learn of them so that we can address them promptly. If there are no further concerns, we would appreciate if you could adjust your score.
We will remain online and responsive during the discussion period, and we truly value any additional feedback you may have.
Thank you again for your constructive feedback and thoughtful engagement.
Best regards,
The Authors
Dear Reviewer qzRH,
We sincerely thank you for confirming that our rebuttal has addressed all your previous questions and concerns. We greatly value your recognition of our efforts and are encouraged by your positive assessment. Your supportive feedback further strengthens our confidence in the contribution and clarity of this work.
Best regards,
The Authors
The paper introduces FracFace, a privacy-preserving face recognition framework that aims to reduce visual clues in facial features, making it more resistant to reconstruction attacks by generative models. The framework leverages frequency domain processing and fractal-based mapping to obfuscate identity features while maintaining recognition accuracy.
Strengths and Weaknesses
Strength: FracFace introduces a fractal-based transformation to disrupt the spatial structure in facial data, combining frequency domain refinement and fractal mapping.
Weaknesses:
- The idea of using the frequency domain to do privacy-preserving face recognition is very common in this field. The novelty of this paper is limited.
- The fractal mapping (FFM) approach might not be the most efficient way to obfuscate features. The introduction of randomized structural perturbations could lead to potential issues in preserving important features needed for accurate recognition, especially in more complex face images or in challenging lighting and pose variations.
- There may still be residual privacy risks in certain cases where low-frequency features or mild visual cues remain detectable by adversarial models. Could you make some illustrations?
- The proposed method primarily evaluates the framework against synthetic attacks (U-Net and StyleGAN). While these models represent typical reconstruction threats, the paper does not explore how FracFace performs against more realistic adversarial attacks in a dynamic, uncontrolled real-world setting.
Questions
- The results show a slight drop in recognition accuracy due to privacy protection mechanisms. How does FracFace perform in real-world systems where even a small drop in accuracy could significantly impact its usability in critical applications?
- Could FracFace be combined with other privacy-preserving face recognition techniques, such as homomorphic encryption or federated learning?
- Even with FCR and FFM, are there any residual visual clues that could still be exploited by an adversary? For instance, could subtle attributes like age, ethnicity, or gender still be inferred from the transformed facial features?
Limitations
- The study focuses primarily on attacks from U-Net and StyleGAN, but there may be other types of adversarial or black-box attacks that could exploit new vulnerabilities. The model might not be as robust against emerging attack strategies.
- FracFace’s performance was evaluated on standard face recognition datasets (LFW, CelebA), which consist of relatively high-quality images. The approach may not perform as well with low-resolution or noisy images.
- The privacy evaluation is primarily based on metrics such as SSIM, MSE, and LPIPS, which focus on visual quality and similarity. However, these metrics do not capture all dimensions of privacy, such as identity inference risk or psychological privacy risks.
Final Justification
Overall, the authors have adequately addressed most of my concerns, particularly regarding the privacy–utility trade-off, compatibility with other techniques (HE and FL), and residual visual cues. However, their response on novelty could be more robust, and their defense of FFM’s efficiency would benefit from more detailed experiments under extreme or edge-case conditions. While their treatment of adversarial threat models and privacy risks is strong, there is room for improvement in discussing robustness against black-box and emerging attacks, as well as performance on noisy or low-resolution images. In summary, the authors have made a good effort to address my feedback, but some areas would still benefit from more detailed explanation or additional experiments.
Formatting Issues
na
Dear Reviewer sFx1,
Thank you for your valuable comments and suggestions. Our detailed point-by-point responses are provided below.
Response to Q1. On the Privacy–Utility Trade-off
We would like to clarify the concern as follows. In practice, minor reductions in recognition accuracy are acceptable, since modern face recognition systems adjust thresholds based on deployment context, with stricter settings for border control and more lenient ones for surveillance.
This trade-off is a well-known challenge in the literature, where enhanced protection often comes at the cost of utility. For example, FaceObfuscator [S1] explicitly discusses the usability–privacy trade-off in real-world deployments, while PPFR-FD [S2] demonstrates that recognition accuracy degrades significantly only when over 94% of frequency channels are removed. Our method retains 50% of these channels, offering both structural protection and utility, as shown in Table R1.
Table R1: Minimal Utility Drop of FracFace Compared to ArcFace
| Dataset | ArcFace (%) | FracFace (%) | Results |
|---|---|---|---|
| LFW | 99.73 | 99.69 | –0.04% |
| CelebA | 95.35 | 95.91 | +0.56% |
| AgeDB | 97.19 | 96.96 | –0.23% |
| CFP-FP | 96.83 | 96.14 | –0.69% |
| IJB-C (1e-4) | 92.26 | 92.66 | +0.40% |
| IJB-C (1e-5) | 93.58 | 93.43 | –0.15% |
References
[S1] FaceObfuscator. USENIX Security 2024.
[S2] Privacy-Preserving Face Recognition in the Frequency Domain. AAAI 2022.
Response to Q2. Compatibility with HE and FL
The suggestion is interesting! As a modular model-agnostic preprocessing layer operating at the feature level, FracFace is inherently well-suited for combination with HE and FL, offering complementary protection across different stages of the pipeline. Specifically, FracFace enhances input-level visual privacy by obfuscating frequency and spatial cues prior to downstream processing.
| Method | Role of FracFace | Benefits |
|---|---|---|
| HE | Acts as a lightweight preprocessing module before encryption | Suppresses redundant visual features → lowers embedding entropy → mitigates HE’s high inference cost while enhancing visual obfuscation |
| FL | Obfuscates facial structures locally before feature extraction | Reduces gradient inversion risk → improves privacy under shared model updates; Frequency pruning reduces redundancy → speeds up local training |
Response to Q3. Can residual visual cues reveal age, gender, or ethnicity?
Thank you for this thoughtful question. FracFace is specifically designed to suppress visual cues that contribute to face reconstruction. While it is possible that certain soft-biometric attributes may partially persist after transformation, these signals alone are generally insufficient for reliable identity inference by an adversary.
To examine this concern, we conducted a user study on 20 FracFace-processed celebrity images. Participants (54 responses) were asked whether they recognized the person and which cues they used. As shown in Table R2, most relied on features like face shape and hairstyle rather than soft biometrics. Specifically:
Table R2: User Reliance on Soft-Biometric Attributes for Identity Inference
| Attribute | Reported Usage (%) |
|---|---|
| Ethnicity | 0.0% |
| Age | 3.7% |
| Gender | 12.3% |
Over 76% of participants reported relying on guesswork or vague impressions, citing a lack of clear visual cues for recognition.
Response to W1. Novelty
Thank you for the comment.
While frequency-domain processing is common in PPFR, we address a novel challenge: the sparsity-driven reconstructibility of DCT features. Our FCR module prunes channels based on semantic relevance rather than fixed frequency bands, reducing redundancy. However, as shown in Fig. 6 and Table 3, this alone is insufficient. To go beyond frequency space, we propose a novel fractal-based mechanism that remaps DCT features into a recursively constructed, non-invertible fractal space. To our knowledge, this is the first use of fractal geometry to disrupt structural coherence in the frequency domain (see Appendix A.2).
Response to W2. On the Efficiency of FFM, Challenging Scenarios
We thank the reviewer for raising this valuable point.
We emphasize that this work aims to address the core challenge of balancing privacy and utility in PPFR, which requires disrupting spatial regularities (to resist reconstruction) while preserving identity-related features to support recognition. We therefore evaluated efficiency with respect to face recognition accuracy and privacy preservation (measured by SSIM, IDS, MSE, and LPIPS). Compared with six SOTA feature-obfuscation approaches (Tables 1 and 2), our experiments show that FFM is the most efficient for this specific purpose. Some existing works, e.g., FaceObfuscator and MinusFace, can obfuscate features, but their results can be easily reverted and are thus unsuitable here.
Regarding computational efficiency, we have provided the FFM complexity calculation in Appendix A.5 (Training Details).
The recognition accuracy drop remains negligible (0.04%–2% in our experiments) under practical deployment thresholds, and FracFace performs robustly across challenging conditions: datasets such as IJB-B, IJB-C, and CelebA feature real-world variations in pose, lighting, and resolution, yet FracFace consistently preserves high recognition accuracy while enhancing privacy protection.
Response to W3. Residual Privacy Risks and Visual Clues
Thank you for the insightful question. We consider a threat model where an adversary aims to reconstruct and identify faces from obfuscated images, assuming knowledge of FracFace parameters (e.g., fractal depth, expansion schedule). While visual cues cannot be fully removed due to recognition needs, FracFace greatly reduces them compared to prior work. Soft-biometric traits like face shape may remain but are beyond this work’s primary scope. Crucially, our transformation is non-invertible and does not introduce new privacy risks.
In detail, two components, the initial fractal kernel and the index lattice, are secret and randomized per deployment, ensuring unique, secure instances. In addition, the fractal index mapping is defined recursively and then projected modulo the channel dimension. Because the index space grows exponentially with the fractal depth k, many different positions collide onto the same index under this modulo projection. The mapping is therefore non-injective by design, and no inverse mapping exists: even if the adversary learns the full mapping rule, they cannot resolve the original positions or channel correspondences without knowing the randomized kernel and lattice.
These design choices ensure even subtle visual cues are unrecoverable under strong white-box attacks. The non-invertibility of the mapping, together with the randomness of the kernel and lattice, establishes a robust privacy boundary, making FracFace highly resistant to reconstruction.
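The pigeonhole argument behind this non-invertibility can be illustrated with a toy modulo projection. This is only an illustration of why such a projection is non-injective; the base-4 growth and the specific `channel_dim` are made-up values, not the paper’s actual fractal mapping.

```python
channel_dim = 96           # number of retained frequency channels (toy value)
depth = 6                  # fractal depth k (toy value)
index_space = 4 ** depth   # index range that grows exponentially with depth

# Project every fractal index into [0, channel_dim) with a modulo map.
projected = [i % channel_dim for i in range(index_space)]

# By pigeonhole, distinct positions collapse onto the same channel index,
# so the projection cannot be inverted without extra secret information.
collisions = index_space - len(set(projected))
print(collisions > 0)  # True: the modulo projection is non-injective
```

With 4⁶ = 4096 indices mapped onto 96 channels, each output index has dozens of preimages, which is why recovering the original positions requires the secret kernel and lattice.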
Response to W4. Adversarial Threat Model
We thank the reviewer for the concern. While our experiments use generative reconstruction attacks (e.g., StyleGAN inversion, U-Net), these simulate worst-case white-box threats (see W3), not just synthetic scenarios. Our assumed attacker has full access to model details and data priors—far beyond typical real-world adversaries. Yet, FracFace remains highly resistant to reconstruction, achieving:
- Over 60% drop in SSIM and 80% increase in LPIPS compared to unprotected images;
- Effective removal of visual cues even when all system components (excluding secret seeds) are known.
This threat model subsumes real-world conditions like lighting, pose, and domain shifts. Since FracFace withstands such strong attacks, its robustness extends to more typical, weaker adversaries.
Response to L1. Attack Diversity
We additionally evaluated FracFace against a recent diffusion-based reconstruction method.
Table R3: Comparison of Reconstruction Attacks Against FracFace
| Attack Model | SSIM ↓ | MSE ↓ | IDS ↓ | Inference Time |
|---|---|---|---|---|
| U-Net | 0.3613 | 0.0853 | 0.3028 | ~1.2 sec |
| StyleGAN | 0.3151 | 0.0962 | 0.4936 | ~5.3 sec |
| Diffusion | 0.3809 | 0.0623 | 0.4325 | ~300 sec |
Response to L2. Low-Resolution/ Noisy Inputs
See W2.
Response to L3. Identity Inference/Psychological Privacy Risks
To assess broader privacy, we conducted a user study (see Q3). 76% of participants guessed incorrectly, indicating weak visual cues. This suggests FracFace effectively limits identity inference and psychological privacy risks.
Looking forward to further discussion and feedback.
Dear Reviewer sFx1,
Thank you very much for your thoughtful and constructive comments. We have carefully considered all of your feedback and have provided detailed responses to each of your concerns.
We would greatly appreciate it if you could take a moment to review our responses. If our response does not fully resolve your concerns, we would be happy to engage in further discussion. We are fully committed to improving the clarity and quality of our submission.
We will remain online and responsive during the discussion period, and we truly value any additional feedback you may have.
Thank you again for your time and consideration.
Best Regards
The Authors
Dear Reviewer sFx1,
We sincerely appreciate your valuable time and effort in reviewing our paper and helping to improve its quality.
Since you have provided a mandatory acknowledgement without additional comments, we would like to kindly ask if there are any remaining concerns or unresolved questions that we could further clarify or address. We would be sincerely grateful if you could provide a response.
Best Regards
The Authors
The paper proposes a new method for privacy-preserving face recognition based on the principle of obfuscating soft biometrics features. While frequency components are widely-used in privacy-preserving approaches, the authors propose a non-invertible remapping strategy making use of fractal kernels. This provides robustness to reconstruction attacks while maintaining high recognition performance, which is a challenging task.
The paper focuses on the obfuscation-based perspective of privacy-preserving face recognition and engages less with the crypto-based perspective. While this is not a negative point, the authors might clarify this scope to avoid mismatched expectations regarding deterministic mappings. The rebuttal improves the paper by providing ablations and a stronger (generic) attack baseline.
Finally, the paper contributes techniques with potential practical impact in the privacy-preserving ML community, with comprehensive and solid validation that can be published.