PaperHub
7.2 / 10
Spotlight · 4 reviewers
Ratings: 3, 4, 4, 4 (min 3, max 4, std 0.4)
ICML 2025

LotteryCodec: Searching the Implicit Representation in a Random Network for Low-Complexity Image Compression

OpenReview · PDF
Submitted: 2025-01-24 · Updated: 2025-08-16

Abstract

Keywords
Implicit neural representation, source coding, overfitted image compression, lottery codec hypothesis, low-complexity image codec.

Reviews & Discussion

Review
Rating: 3

This paper investigates the lottery ticket hypothesis for implicit-representation-based image compression.

It proposes to overfit a binary mask and modulation vectors to the source image, and then leverages a randomly initialized neural network to generate the reconstruction.

The proposed LotteryCodec achieves state-of-the-art performance among overfitted image codecs designed for single-image compression at a reduced computational cost.

Additionally, LotteryCodec can adjust its decoding complexity by varying the mask ratio, providing flexible solutions for diverse computational and performance needs.

Questions for Authors

This is a good paper in terms of ideas.

There are some concerns regarding experiments and metrics, listed as (1)–(5) in the previous sections. I am ready to increase the score if my concerns are resolved.

Claims and Evidence

Yes. The manuscript is well written and logical.

Methods and Evaluation Criteria

(1) Only bpp–PSNR is compared. It is suggested to add MS-SSIM as an evaluation metric.

(2) MACs/pixel is not a reliable metric for the real running complexity of neural networks. For neural applications, IO may account for most of the latency. The decoding pipeline of the proposed method is more complex than that of the original INRs, as shown in Figure 4. It would be better to compare latency against the baseline INR codec C3 on a BD-rate vs. decoding-latency curve.

(3) Encoding latency should be compared with the baseline C3.

Theoretical Claims

No theoretical claims.

Experimental Design and Analyses

The experiment is well designed, and the analysis is clear and thorough.

Supplementary Material

I did not read the source code provided in the supplementary material.

(4) More visual comparisons could be included in the appendix.

Relation to Prior Literature

Related to many lottery ticket papers, which are already properly discussed in the manuscript.

Missing Essential References

Missing important citation: Choi, Hee Min, et al. "Is Overfitting Necessary for Implicit Video Representation?" ICML 2023.

(5) This previous ICML paper investigates the same topic, the lottery ticket hypothesis for implicit representations. The differences should be properly discussed.

Other Strengths and Weaknesses

None.

Other Comments or Suggestions

L178: "networt" (typo for "network").

Author Response

We appreciate the reviewer's valuable comments and thank the reviewer for highlighting Choi's ICML 2023 paper. Our detailed responses to each comment are as follows:

  • (1). Following the suggestion, we have conducted additional MS-SSIM experiments on the Kodak dataset. The results, presented in Table 4.1, demonstrate that our method consistently outperforms VTM, achieving a BD-rate of -43.81% and closely matching ELIC's performance. We note that previous overfitted codecs focus only on PSNR, and direct MS-SSIM optimization is unstable for those baselines (as reported in C3 and its extension work; see Fig. 6 and line 3 in [1]).

    [1] Ballé, Jona, et al. "Good, Cheap, and Fast: Overfitted Image Compression with Wasserstein Distortion."

Table 4.1 MS-SSIM / bpp performance on Kodak dataset

| Model | MS-SSIM₁ / bpp₁ | MS-SSIM₂ / bpp₂ | MS-SSIM₃ / bpp₃ | MS-SSIM₄ / bpp₄ | BD-rate vs. VTM (%) |
|---|---|---|---|---|---|
| VTM | 13.10 / 0.212 | 14.31 / 0.287 | 16.75 / 0.492 | 18.56 / 0.704 | 0 |
| LotteryCodec | 13.85 / 0.153 | 16.78 / 0.275 | 19.46 / 0.473 | 22.70 / 0.853 | -43.81 |
| ELIC | 12.22 / 0.091 | 15.86 / 0.215 | 18.83 / 0.394 | 21.67 / 0.667 | -44.60 |
| MLIC+ | 14.91 / 0.148 | 16.53 / 0.214 | 18.20 / 0.307 | 19.77 / 0.425 | -52.75 |
  • (2-3). We have compared the encoding (NVIDIA L40S) and decoding latency of LotteryCodec, and its BD-rate, against other alternatives (see Table 4.2, with structured pruning at a masking ratio of 0.8). Additional latency results across resolutions are reported in Table 2.1 (Reviewer bWTB), and a coding example is given in Table 3.2 (Reviewer itee). Given the fast decoding speed of overfitted codecs, we evaluate all overfitted codecs on an Intel Xeon CPU. Overall, our method achieves faster decoding with slightly higher encoding time compared with other overfitted codecs. Additional analysis is provided in our response to Reviewer itee. We note that real-world latency is affected by many uncontrollable factors in a lab setting and can be significantly reduced through various optimization techniques, making fair coding-speed comparisons difficult. For example, [2] reported that Cool-chic achieves a 100 ms latency using a C API for binary arithmetic coding, while our implementation is slower due to the lack of such optimization. Nonetheless, we recognize the importance of real-world latency and provide these evaluations for a practical perspective. To ensure fairness, all reported results are based on the same unoptimized decoding implementations, with no method using C API optimizations. We expect similar speedups across all methods with these techniques.

    [2] Blard, Théophile, et al. "Overfitted image coding at reduced complexity." 2024.

Table 4.2 Coding time for Kodak images

| Models | Encoding time | Decoding time | BD-rate |
|---|---|---|---|
| *Traditional codec* | CPU (s) | CPU (ms) | - |
| VTM | 85.53 | 352.52 | 0 |
| *AE-based codec* | GPU (ms) | GPU (ms) | - |
| EVC (S/M/L) | 20.23 / 32.21 / 51.35 | 18.82 / 23.73 / 32.56 | 3.3% / -0.8% / -1.9% |
| MLIC+ | 205.60 | 271.31 | -13.19% |
| *Overfitted codec* | GPU (s / 1k steps) | CPU (ms) | - |
| LotteryCodec (d=8/16/24) | 13.86 / 14.64 / 14.92 | 261.3 / 267.5 / 278.3 | -3.64% |
| C3 (d=12/18/24) | 13.10 / 13.98 / 14.32 | 272.1 / 284.6 / 295.0 | +3.24% |
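For reference, the BD-rate figures above follow the standard Bjøntegaard method. A minimal illustrative implementation (a sketch of the standard cubic-fit procedure, not our exact evaluation script) is:

```python
import numpy as np

def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
    """Bjontegaard delta rate: average % bitrate change at equal quality.

    Inputs are the rate-distortion points of each codec; a negative
    result means the test codec needs fewer bits than the anchor.
    """
    # Fit cubic polynomials of log-rate as a function of PSNR.
    poly_a = np.polyfit(psnr_anchor, np.log(rate_anchor), 3)
    poly_t = np.polyfit(psnr_test, np.log(rate_test), 3)
    # Integrate both fits over the overlapping PSNR range.
    lo = max(min(psnr_anchor), min(psnr_test))
    hi = min(max(psnr_anchor), max(psnr_test))
    int_a, int_t = np.polyint(poly_a), np.polyint(poly_t)
    avg_a = (np.polyval(int_a, hi) - np.polyval(int_a, lo)) / (hi - lo)
    avg_t = (np.polyval(int_t, hi) - np.polyval(int_t, lo)) / (hi - lo)
    return (np.exp(avg_t - avg_a) - 1.0) * 100.0
```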
  • (4). We have conducted extensive additional ablation studies in the rebuttal (as shown in the tables here) and will include visualizations of these results to support our analysis in the revised manuscript, covering (a) the impact of each component (Table 3.1, response to Reviewer itee), (b) training latency vs. performance (Table 3.2, response to Reviewer itee), and (c) a visual comparison for the MS-SSIM results (Table 4.1 here). For clarity, we provide the corresponding numerical results in tabular form in the rebuttal above.
  • (5). We will add a discussion of Choi et al.'s ICML 2023 paper in the revised manuscript. Here, we highlight the key differences between their approach and ours: while both studies leverage the Lottery Ticket Hypothesis (LTH) for INRs, Choi et al. apply LTH to video representation using image-wise encoding with multiple supermask overlays and unpruned biases, boosting representation at the cost of increased complexity and bit rate. In contrast, our LotteryCodec adopts a pixel-wise model and focuses on the low-complexity image compression problem. We introduce mechanisms such as Fourier initialization and rewind modulation to enhance rate-distortion performance, distinguishing our approach from Choi's. Although Choi's method is a novel contribution to video representation, it still falls short of state-of-the-art compression techniques. We will also add a paragraph discussing the potential extension of our work to video compression (see our response to Reviewer bWTB for more details).
  • (6). We have proofread the manuscript again and corrected the typos.
Reviewer Comment

Thanks for the reply; I have raised my score. Please include these important new results in the manuscript.

Review
Rating: 4

This paper introduces LotteryCodec, a novel, low-complexity image compression scheme based on overfitting. LotteryCodec effectively overfits a binary mask of an over-parameterized, randomly initialized network to an image, achieving high-performance compression. To enhance its performance, techniques such as Fourier initialization and rewind modulation are proposed. Extensive experimental results demonstrate LotteryCodec's high compression ratio and low decoding complexity.

Update after rebuttal

As discussed below, I maintain my positive score.

Questions for Authors

I have no additional questions.

Claims and Evidence

I have no concern on this part.

Methods and Evaluation Criteria

I have no concern on this part.

Theoretical Claims

I have no concern on this part.

Experimental Design and Analyses

  1. While LotteryCodec achieves low decoding MACs, MACs alone do not fully represent decoding complexity. In practice, factors such as peak memory usage and, especially, decoding speed play crucial roles in determining complexity. A comparison with other schemes (e.g., C3, EVC, and ELIC) on these factors would provide a more comprehensive understanding.

  2. Beyond decoding complexity, encoding complexity also requires clarification. Compared to other overfitted image codecs (e.g., C3 or COOL-CHIC), does LotteryCodec require more or less encoding time?

Supplementary Material

I reviewed the full appendix.

Relation to Prior Literature

I have no concern on this part.

Missing Essential References

I have no concern on this part.

Other Strengths and Weaknesses

Strengths:

  1. The idea of overfitting a binary mask of an over-parameterized, randomly initialized network to an image is novel, introducing a new paradigm for overfitted image compression.

  2. Experimental results effectively validate the LotteryCodec hypothesis.

  3. The compression performance is excellent for a low-complexity overfitted image codec.

Weaknesses:

  1. The discussion on complexity is insufficient, as noted in 'Experimental Designs and Analyses'.

  2. While experiments support the LotteryCodec hypothesis, the paper lacks a qualitative analysis explaining why LotteryCodec is superior to previous overfitted codecs like C3.

  3. Compared to C3, LotteryCodec differs in both the synthesis network and ModNet. An ablation study on removing ModNet could help clarify the impact of each modification.

Other Comments or Suggestions

I have no other comments.

Author Response

We appreciate the reviewer's valuable comments. Our responses to the reviewer's main concerns are as follows:

  • Encoding/decoding complexity. Practical encoding/decoding time and peak memory usage across images of various resolutions are reported in Table 2.1 (see our response to Reviewer bWTB), with additional coding-speed results reported in Table 4.2 (response to Reviewer ynP1) and Table 3.2 (this response). All of these results are based on unoptimized research code and current hardware, and can be significantly improved with proper engineering optimization (e.g., a C API, optimized wavefront decoding). Overall, our method has a slightly longer encoding time than other overfitted codecs due to the additional gradient-based mask-learning process, but it offers greater flexibility and faster decoding. Notably, the lottery codec hypothesis (LCH) enables potential parallel encoding by re-parameterizing distinct network optimizations into batch-wise mask learning, highlighting its scalability for efficient large-scale image encoding.

  • Qualitative analysis. In addition to experimental evidence, we provide a rough analysis of the LCH to explain why it is likely to hold (see our response to Reviewer bWTB). Based on the LCH, we can intuitively justify why the proposed LotteryCodec outperforms previous overfitted codecs in terms of rate-distortion performance. The rate formulations for overfitted codecs and our LotteryCodec are given in Eqs. (2) and (5), respectively. They show that the rate of overfitted codecs depends on $\{\hat{z}, \hat{\psi}, \hat{W}\}$, while that of our method is determined by $\{\hat{z}, \hat{\psi}, \tau, \hat{\theta}\}$. According to the LCH, to achieve the same level of distortion, we can find a pair $(\hat{z}, \tau)$ such that the bit cost for $\hat{z}$ and $\hat{\psi}$ equals that of overfitted codecs. While each quantized parameter in $\hat{W}$ typically requires over 13 bits, our binary mask $\tau$ uses just 1 bit per entry. Despite its higher dimensionality, $\tau$ contributes significantly less to the total rate. Moreover, since $\hat{\theta}$ is lightweight, the combined rate of $\tau$ and $\hat{\theta}$ remains lower than that of $\hat{W}$, resulting in a lower compression rate and improved RD performance.
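To make this asymmetry concrete, consider the following back-of-the-envelope calculation (all sizes below are hypothetical, chosen only for illustration, and are not our configuration):

```python
# Hypothetical sizes for illustration only (not the paper's configuration).
BITS_PER_WEIGHT = 13        # typical cost of one quantized weight in W-hat
BITS_PER_MASK_ENTRY = 1     # one binary supermask entry in tau

n_weights = 20_000          # parameters of a trained synthesis network
n_mask = 4 * n_weights      # 4x over-parameterized random network
n_theta = 2_000             # lightweight ModNet parameters (theta-hat)

rate_overfitted = n_weights * BITS_PER_WEIGHT
rate_lottery = n_mask * BITS_PER_MASK_ENTRY + n_theta * BITS_PER_WEIGHT
print(rate_overfitted, rate_lottery)  # 260000 vs. 106000 bits
```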

  • Ablation study. We conducted additional ablation studies to clarify the impact of each component in our design. As shown in Table 3.1 below, removing the Supermask network and using only the modulation network increases BD-rate by +12.45%, highlighting the importance of the random network. Removing ModNet and directly feeding z{z} into the random network (with different overparameterization configurations) results in a performance drop of up to +14.99% due to high overparameterization costs. Additional ablation studies and visualizations of other components are provided in Table 4 of Appendix C in our original paper.

Table 3.1 BD-rate change due to removal of individual components from LotteryCodec

| LotteryCodec | w/o SuperMask | Random network w/o ModNet: (4,32) / (4,48) / (4,64) |
|---|---|---|
| 0 | +12.45% | +13.02% / +11.98% / +14.99% |

Table 3.2 Encoding cost for a 2K image as an example

(Size: 1292 × 1945, “davide-ragusa-716” in CLIC2020; optimal PSNR: 37.18 at bpp 0.196; $d = 24$, ratio = 0.2; peak memory: 5.64 GB. 10–20k steps can yield decent performance.)

| Training Steps | Training Time (s) | bpp | PSNR (dB) |
|---|---|---|---|
| 5k | 678 | 0.24 | 36.51 |
| 10k | 1347 | 0.22 | 36.92 |
| 20k | 2685 | 0.21 | 37.02 |
| 30k | 4026 | 0.20 | 37.10 |
| 50k | 6733 | 0.199 | 37.14 |
Reviewer Comment

I thank the authors for the rebuttal. My concerns have been well addressed. Please ensure these results are included in the camera-ready version.

Review
Rating: 4

The paper presents LotteryCodec, a new method for single-image compression that builds on the idea that large, randomly initialized neural networks contain subnetworks capable of matching the performance of fully trained networks. Concretely, instead of training and transmitting all synthesis network parameters for each image, LotteryCodec transmits only a binary mask (to identify a subnetwork inside a frozen, random network) and a small latent representation. This approach encodes the image’s statistics primarily into the network’s structure (the mask) rather than its weights.

A key contribution is the “lottery codec hypothesis”, which posits that for any standard, overfitted compression model, there exists a subnetwork within a sufficiently large, randomly initialized neural network that can reconstruct the image to a similar distortion at the same or lower bit-rate. The authors reinforce this concept with a “rewind modulation” mechanism that merges a learned latent representation with hierarchical modulations at multiple layers, helping the subnetwork capture image details more effectively.

Questions for Authors

  1. In line 215–219 (right column), the paper states that the loss function (4) omits the rate terms for ψ, θ, and τ because their lightweight architectures contribute negligibly to the overall bit rate. Could you provide empirical evidence—such as detailed bit consumption measurements for each of these components—to support this claim?

  2. Regarding the binary mask, is its overhead significant, and how is the mask data compressed in practice?

  3. For network architecture, why is the maximum network width set to 128? What would be the impact of increasing the width further (e.g., to 256 or 512)? For instance, if a 50% mask ratio with a (4,128) network achieves the best performance, what outcome would you expect if a 25% mask ratio is used with a (4,256) network?

  4. Could you provide more formal insights or theoretical bounds to support the lottery codec hypothesis beyond the empirical results?

  5. What are the computational costs or encoding times for the per-image optimization process, and how do these compare with existing overfitted codecs and autoencoder-based codecs? Clarification on encoding speed is important for assessing the method's real-world practicality.

  6. How does the proposed approach scale to ultra-high-resolution images (e.g., 4K or 8K), particularly in terms of memory usage and the complexity of subnetwork search?

Claims and Evidence

Overall, the main claims in the paper are supported by consistent empirical evidence and ablation studies. However, as with many works extending the “lottery ticket” idea, there is no fully rigorous proof of the underlying “lottery codec hypothesis”. While the authors reference existing theory on the strong lottery ticket hypothesis, the paper itself relies on empirical demonstrations rather than a formal proof.

Methods and Evaluation Criteria

Yes. The paper targets the domain of single-image compression, a space where it is standard practice to evaluate models on well-established datasets like Kodak and CLIC. The authors adopt widely used, transparent metrics—PSNR for measuring distortion and BD-rate to compare rate–distortion trade-offs. They also report decoding complexity in terms of multiply-accumulate (MAC) operations per pixel, which directly addresses deployment feasibility on resource-constrained hardware.

Theoretical Claims

There is no detailed derivation or proof in the submission.

Experimental Design and Analyses

The experiments in the paper are generally designed and analyzed in a manner consistent with standard practices in neural image compression, and they align with expectations for single-image compression research. Here are the main observations regarding experimental soundness:

  • They use popular datasets (Kodak and CLIC2020) and standard metrics (PSNR, BD-rate, MACs/pixel) for evaluation.
  • They compare LotteryCodec with traditional codecs (e.g., VTM, HEVC) and overfitted approaches (e.g., C3, COOL-CHIC).
  • They vary the rate–distortion parameter and mask ratios to explore how performance scales with different network depths and widths.
  • They perform ablations on initialization, modulation methods, and architecture to highlight each component’s contribution.

Supplementary Material

I reviewed the supplementary material. It contains the source code of the proposed method.

Relation to Prior Literature

The paper builds on the lottery ticket hypothesis (Frankle & Carbin, 2019) by applying it to image compression. It leverages ideas from overfitted codecs (e.g., COIN, COOL-CHIC, C3) to encode images with minimal parameters. The work also incorporates insights on untrained subnetworks (Ramanujan et al., 2020) and uses Fourier initialization to mitigate low-frequency bias in MLPs. Overall, it integrates established concepts from compression, deep learning, and network pruning to reduce decoding complexity while maintaining high performance.

Missing Essential References

No essential references are omitted.

Other Strengths and Weaknesses

Strengths

  • Leverages the lottery ticket hypothesis to utilize untrained subnetworks for image compression.
  • Achieves state-of-the-art rate–distortion performance while drastically lowering the number of operations at decoding time.
  • Adjustable mask ratios allow the method to balance compression performance with computational cost.

Weaknesses

  • Lacks a formal theoretical proof or detailed bounds, relying mainly on empirical evidence.
  • The per-image optimization required for encoding may result in high encoding times, which is not fully addressed.
  • Searching for an optimal subnetwork in a highly over-parameterized network might become challenging for ultra-high-resolution images.
  • The method is demonstrated for single-image compression, with limited discussion on extending it to video or other signals.

Other Comments or Suggestions

  • Consider adding a dedicated paragraph (or section) discussing the limitations of per-image optimization speed, especially in practical scenarios.
  • Including a brief discussion on potential extensions to video or multi-view compression could help contextualize broader applications.
Author Response

We thank the reviewer for valuable comments. We first respond to the reviewer's main concerns:

  • W1: Proof of the Lottery Codec Hypothesis (LCH). Although a rigorous bound supporting the LCH is not available, we can provide a rough validation based on existing proofs of the Strong Lottery Ticket Hypothesis (SLTH). Suppose a codec $g_{W}(z)$ is overfitted to an image $S$ with distortion $\sigma$. According to the SLTH, for any $\epsilon > 0$, there exists a subnetwork within a sufficiently overparameterized network $g_{W'}$, defined by a supermask $\tau$, such that $d(g_{W}(z), g_{W' \odot \tau}(z)) \le \epsilon$ (assuming $d$ is a distortion evaluated pixel by pixel). By the triangle inequality, reconstructing the image $S$ using $g_{W' \odot \tau}(z)$ then yields a distortion of at most $\sigma + \epsilon$. We can further decrease the distortion by optimizing the latent vector over the set of $z'$ satisfying $H(z') = H(z)$, along with the supermask $\tau$. Since $\epsilon$ can be made arbitrarily small, it is highly likely that we can find a pair $(\tau', z')$ such that $d(S, g_{W' \odot \tau'}(z')) \le \sigma$.
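In compact form (assuming $d$ satisfies the triangle inequality):

```latex
d\bigl(S,\, g_{W' \odot \tau}(z)\bigr)
  \le d\bigl(S,\, g_{W}(z)\bigr) + d\bigl(g_{W}(z),\, g_{W' \odot \tau}(z)\bigr)
  \le \sigma + \epsilon .
```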
  • W2: We will add the following discussion about the encoding time: "LotteryCodec's low and flexible decoding cost is particularly beneficial in multi-user streaming scenarios, where encoding can be done once and offline to support many users decoding the same content. While high encoding complexity remains a key bottleneck for all overfitted codecs, including ours, potential acceleration strategies include meta-learning, mixed-precision training, and neural architecture search. Notably, LotteryCodec also enables parallel encoding for overfitted codecs by reparameterizing distinct network learning processes into a batch of mask-learning processes."
  • W3: To address the reviewer's concern about ultra-high-resolution images, we provide Table 2.1 detailing the training and inference cost across various resolutions (with mask ratio $0.8$ and an ARM model with $d = 16$). An additional example of 2K image encoding is shown in Table 3.2 (response to Reviewer itee).
  • W4: We will add the following paragraph to discuss the potential extension to video compression: "LotteryCodec can be extended as a flexible alternative for video coding. By sharing modulation across adjacent groups of frames (GoF) and applying distinct/weighted masks, it can additionally encode temporal information into the network structure, potentially yielding a lower bit cost. Moreover, video coding enables adaptive mask ratio selection across GoF, offering greater flexibility in both computational complexity and rate control."

Table 2.1: Encoding time for different images. OM means out of memory ($> 32$ GB).

| Input Resolution | GPU Encoding (s/1k steps): LotteryCodec vs. C3 | CPU Decoding (ms): LotteryCodec vs. C3 | Encoding Peak Memory (GB): LotteryCodec vs. C3 vs. MLIC+ |
|---|---|---|---|
| 512 × 512 | 10.71 vs. 10.43 | 232.46 vs. 228.43 | 0.56 vs. 0.31 vs. 1.98 |
| 1024 × 1024 | 56.81 vs. 38.54 | 565.22 vs. 576.51 | 2.15 vs. 1.24 vs. 3.61 |
| 1536 × 1536 | 136.81 vs. 84.79 | 984.01 vs. 1086.92 | 4.82 vs. 2.78 vs. 9.15 |
| 2048 × 2048 | 257.93 vs. 155.02 | 1595.86 vs. 1807.35 | 8.53 vs. 4.95 vs. 24.37 |
| 2560 × 2560 | 407.68 vs. 237.45 | 3003.24 vs. 3269.02 | 13.36 vs. 7.72 vs. OM |
| 3840 × 2160 | 446.09 vs. 301.56 | 4014.21 vs. 4216.11 | 16.89 vs. 9.84 vs. OM |

Responses to questions:

  • Q1&2: We refer the reviewer to Fig. 13 in Appendix E, which details the cost of each component. In the high-rate regime, the total cost of $\psi$, $\theta$, and $\tau$ accounts for less than 5%. The binary mask $\tau$ is compressed via range coding (range-coder on PyPI) with a static distribution.
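As a rough illustration of the static-model coding cost (hypothetical numbers; a range coder approaches this ideal code length, though our exact coder setup may differ):

```python
import numpy as np

def ideal_mask_bits(mask, p_keep):
    """Ideal code length (bits) of a binary mask under a static
    Bernoulli(p_keep) model -- the rate a range coder approaches."""
    m = np.asarray(mask, dtype=float)
    return float(-(m * np.log2(p_keep) + (1 - m) * np.log2(1 - p_keep)).sum())

rng = np.random.default_rng(0)
mask = rng.random(100_000) < 0.2               # hypothetical 20% keep ratio
print(ideal_mask_bits(mask, 0.2) / mask.size)  # ~0.72 bits per entry
```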
  • Q3: To validate the lottery codec hypothesis, the (4,128) setting suffices. While wider networks (e.g., (4,256)) can reduce distortion without increasing the bit cost for $z$ (and hence can validate the hypothesis), they also raise the bit cost for the mask $\tau$ and introduce greater training overhead, which often reduces overall compression efficiency. (An example with different configurations can be seen in the last column of Table 3.1, response to Reviewer itee.) This motivates our design of the modulation mechanism.
  • Q4. See our response to W1.
  • Q5&6. We present the coding cost of various schemes in Table 4.2 of our response to Reviewer ynP1, and report resolution-dependent coding costs in Table 2.1 above. The proposed method scales to ultra-high-resolution images, albeit with increased coding time. Note that significant speedups can be achieved through engineering and hardware optimizations; for example, we can accelerate the method via the ONNX/DeepSparse libraries to reduce the decoding time to 20–80 ms on a CPU. Additional techniques, such as symmetric/separable kernels, filter-based upsampling, and wavefront decoding, can further enhance the speed of overfitted codecs.
Reviewer Comment

Thank you for the response and the additional results. This is a solid paper, and I will raise my rating to 4.

Review
Rating: 4

The paper introduces the Lottery Codec hypothesis based on the Lottery Ticket hypothesis and implements an image codec, LotteryCodec, which achieves strong performance and outperforms the best INR-based image codec while maintaining low complexity.

Questions for Authors

  • Does the reported MACs/pixel include the masked parameters?
  • In Figures 7a and 7b, is the setting always < 2K MACs/pixel?
  • Both the proposed model and C3 use a set of adaptive settings. In the main experiments (e.g., Figure 7), does the proposed model always use a network and entropy model size that is not larger than C3?

Claims and Evidence

Some claims are not clearly elaborated: For example, in line 260, why can LotteryCodec achieve a lower overall rate compared to overfitted codecs?

Methods and Evaluation Criteria

  • The proposed methods and evaluation criteria are reasonable. The implementation based on the Lottery hypothesis is simple yet effective. It is lightweight while remaining comparable to SOTA models. The evaluation datasets follow common practice in the field.
  • More ablation studies would be helpful. For example, the use of latent variables as modulation vectors differs significantly from C3, where many techniques originate. The authors should compare the performance of the proposed model with modulation removed to clarify whether the performance gain comes from modulation or the lottery ticket-based masking network.
  • The actual encoding/decoding time is not provided. The real runtime of the model is an important factor for the practical application of the proposed codec.

Theoretical Claims

The work is more of an application rather than a theoretical contribution, with few theoretical claims. The only issue is that the reason why LotteryCodec outperforms overfitted codecs is not well elaborated (line 260).

Experimental Design and Analyses

The experimental designs and analyses are valid. For example, different hyperparameters (mask ratios) are used for validating the hypothesis.

Supplementary Material

I have reviewed all parts of the supplementary material.

Relation to Prior Literature

The work is mainly related to the neural compression literature, but also to the INR literature. It proposes a new SOTA INR-based image codec.

Missing Essential References

The lottery ticket hypothesis has been used for video representation/compression, which is highly related to INR-based image compression: Choi, Hee Min, et al. "Is Overfitting Necessary for Implicit Video Representation?"

The proposed model is also highly similar to: Mehta, Ishit, et al. "Modulated Periodic Activations for Generalizable Local Functional Representations."

Other Strengths and Weaknesses

Overall, I find the work novel and of high quality. The lottery ticket-based codec is lightweight and achieves SOTA performance. The experiments provide a thorough evaluation of the method.

Other Comments or Suggestions

Author Response

We thank the reviewer for the valuable comments and for recommending two interesting papers. We will add (a) proper discussions of both papers, and (b) the suggested ablation studies and tables to the revised manuscript.

For a discussion of Choi et al. (2023), please refer to our response to Reviewer [ynP1]. Regarding Mehta et al. (2021), while their dual-MLP framework also uses a modulation and synthesis network, it targets multi-instance representation rather than compression. Key differences include: (1) our synthesis network is based on the Lottery Ticket Hypothesis; and (2) our ModNet introduces rewind modulation to the synthesis network via concatenation for greater flexibility. Additional ablations over different modulation methods can be seen in Table 4 and Fig. 12 of our original paper.
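For intuition, a minimal sketch of a supermask layer in this spirit (frozen random weights, trainable mask scores binarized with a straight-through estimator, following the edge-popup idea of Ramanujan et al.; a simplified illustration, not our released implementation):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SupermaskLinear(nn.Module):
    """Linear layer with frozen random weights and a learned binary mask.

    Only the real-valued scores are trained; the forward pass keeps the
    top `keep_ratio` fraction of weights, with straight-through gradients.
    """
    def __init__(self, in_features, out_features, keep_ratio=0.5):
        super().__init__()
        self.weight = nn.Parameter(
            torch.randn(out_features, in_features), requires_grad=False)
        self.scores = nn.Parameter(torch.randn(out_features, in_features))
        self.keep_ratio = keep_ratio

    def forward(self, x):
        n = self.scores.numel()
        k = int(n * self.keep_ratio)
        # Threshold below which scores are masked out (keep top-k).
        threshold = self.scores.flatten().kthvalue(n - k).values
        hard = (self.scores > threshold).float()
        # Straight-through: binary mask forward, identity gradient backward.
        mask = hard + self.scores - self.scores.detach()
        return F.linear(x, self.weight * mask)
```

In LotteryCodec terms, transmitting such a layer then amounts to sending the binary mask (plus a shared random seed for the frozen weights) rather than the weights themselves.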

Responses to the remaining comments:

  • Modifications for clarification: "As shown in Eqs. (2) and (5), the rate of overfitted codecs depends on $\{\hat{z}, \hat{\psi}, \hat{W}\}$, while the rate of our method is determined by $\{\hat{z}, \hat{\psi}, \tau, \hat{\theta}\}$. According to the Lottery Codec Hypothesis (LCH), our bit cost for $\hat{z}$ and $\hat{\psi}$ matches that of standard overfitted codecs. While each quantized parameter in $\hat{W}$ typically requires over 13 bits, our binary mask $\tau$ uses just 1 bit per entry. Despite its higher dimensionality, $\tau$ contributes significantly less rate. Moreover, since $\hat{\theta}$ is lightweight, the combined rate of $\tau$ and $\hat{\theta}$ remains lower than that of $\hat{W}$, resulting in improved compression efficiency."

  • Ablation study. We conducted additional ablation studies to assess the contribution of each component. See Table 3.1 and its discussion in our response to Reviewer [itee].

  • Coding speed. We report coding speed for different baselines and resolutions in Table 4.2 (response to Reviewer [ynP1]) and Table 2.1 (response to Reviewer [bWTB]).

  • Detailed BD-rate results are provided and will be included in the manuscript:

Table 1.1. Detailed BD-rate data points

| Dataset | LotteryCodec | C3 | MLIC+ | CST | COOL-CHIC v2 |
|---|---|---|---|---|---|
| Kodak | -3.64% | +3.24% | -13.19% | 3.78% | 31.65% |
| CLIC2020 | -5.89% | -2.85% | -12.56% | 11.70% | 29.30% |
  • We will cite the C3 source in the revised manuscript. We would also like to clarify that using either a re-implemented version or the original C3 code does not affect the validation of the LCH. The goal of the experiment (Fig. 6) is to demonstrate that, in an overfitted codec setting, the synthesis network can be replaced by a subnetwork of a randomly initialized network while maintaining comparable distortion performance. To ensure a fair comparison, only the synthesis network is replaced, with all other components kept identical to the target overfitted codec structure (see Fig. 8).

Answers to questions:

  • Q1. The current figure reports the theoretical minimum complexity, excluding the effect of masked parameters (similar evaluations can be seen in [1-2]). This theoretical lower bound can be approached using sparsity-aware implementations (cuSparse/DeepSparse) on compatible hardware. We adopt this metric to estimate decoding complexity because both practical MACs and run-time for unstructured sparse networks are heavily influenced by engineering factors. Following the reviewer's suggestion, we have decided to also report coding time with a simple structured pruning strategy (see Table 2.1 in our response to Reviewer [bWTB]), showing our decoding efficiency, especially on high-resolution images. Additionally, we provide both theoretical upper and lower bounds on complexity (Table 1.2; a minimal sketch of the bound computation follows the table), where the upper bound includes all operations without any pruning. The practical complexity lies between these bounds, depending on the implementation. Note that, compared to the C3 baseline, even an unpruned LotteryCodec achieves better BD results (-0.1% vs. 3.24%) with comparable complexity (2822 vs. 2626 MACs/pixel). We will revise the figure using a dashed region to clearly illustrate this range, and clarify it in Fig. 1 and Fig. 14 as well. By presenting both the theoretical complexity and the measured run-times, we aim to offer a comprehensive evaluation of our flexible decoding complexity.

    [1]. Han, Song, et al. "Learning both weights and connections for efficient neural network"

    [2]. Han, Song, et al. "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding"

Table 1.2 Flexible BD-rate vs. MACs/pixel region over Kodak

| BD-rate | -3.64% | -1.8% | -0.1% |
|---|---|---|---|
| Lower bound, optimal (MACs/pixel) | 3083 | 2513 | 2022 |
| Upper bound, non-pruned (MACs/pixel) | 3732 | 3112 | 2822 |
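A minimal sketch of how such MACs/pixel bounds can be computed for a masked MLP (the layer widths below are hypothetical, not our exact synthesis architecture):

```python
# Hypothetical per-pixel MACs bounds for a masked MLP (illustration only).
layers = [(32, 64), (64, 64), (64, 64), (64, 3)]  # (in, out) widths per pixel
keep_ratio = 0.5                                  # fraction of unmasked weights

upper = sum(i * o for i, o in layers)   # non-pruned: every weight fires
lower = int(upper * keep_ratio)         # optimal: only unmasked weights fire

print(f"upper bound: {upper} MACs/pixel")  # 10432
print(f"lower bound: {lower} MACs/pixel")  # 5216
```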
  • Q2. Experiments in Figs. 7(a)–(b) do not impose such constraints; model architectures follow their original papers.
  • Q3. Our network is roughly half the size of C3, and our entropy model uses $d = \{8, 16, 24, 32\}$ vs. C3's $d = \{12, 18, 24\}$. (See Table 1 in our paper for more details.)
Final Decision

This manuscript proposes a compression method using a lottery ticket scheme with per-scene optimization based on implicit neural representations. The algorithm improves modulation mechanisms, leading to better RD performance compared to VTM and baselines.

All the reviewers agreed on the novelty and soundness of this paper. Reviewer W9j1 found the algorithm designed to be lightweight while taking a SOTA position. Reviewers bWTB and itee acknowledged that the idea of this paper adequately applies the lottery ticket hypothesis to compression problems. Moreover, all the reviewers noted that the proposed algorithm lowers computational complexity. The remaining concerns about (1) the reliability of MACs/pixel as a complexity metric, (2) credit to previous research, and (3) clarity were well resolved during the rebuttal period, after which all reviewers were satisfied with the authors' responses and converged on acceptance of this paper to ICML.

Therefore, I recommend the acceptance of this paper for publication.