No rating data available
ICLR 2024
PatchCraft: Learning Optimized Image Patch for Enhanced Visual Attention of CLIP
TL;DR
We propose a method for learning an optimized image patch that enhances the visual attention of vision-language models such as CLIP.
Abstract
Keywords
vision-language models, explainable AI, transformer-based models
Reviews and Discussion
No reviews yet