W5: User study & evaluation on third-party bench

The user study of image-driven style transfer is presented in table below:

metric	AdaAttN	EFDM	StyTR-2	CAST	InST	StyleID	StyleShot
human style	0.075	0.112	0.129	0.135	0.093	0.040	0.416

StyleShot achieves the highest style alignment score.

The third-party bench in [2] is a painting-dominant benchmark biased to ten painting styles such as impressionism and expressionism, while our StyleBench covers 73 distinct styles, ranging from paintings, flat illustrations, 3D rendering to sculptures with varying materials.

As shown in Fig. R4, Fig. R5, StyleShot achieves the stable and superior performance on this third-party bench. We also provide the quantitative results in the tables below for reference only:

metric	AdaAttN	EFDM	StyTR-2	CAST	InST	StyleID	StyleShot
clip image	0.470	0.428	0.464	0.539	0.490	0.551	0.603
style loss	13.140	14.832	12.749	13.117	14.122	14.881	15.415

metric	DEADiff	DeamBooth	InST	Style-Aligned	Style Crafter	StyleDrop	StyleShot
clip image	0.552	0.761	0.606	0.698	0.721	0.558	0.602
clip text	0.232	0.171	0.207	0.204	0.189	0.232	0.210
style loss	24.237	7.668	14.570	6.330	4.056	5.279	4.412

[1] Domain Enhanced Arbitrary Image Style Transfer via Contrastive Learning, ACM SIGGRAPH

[2] A Comprehensive Evaluation of Arbitrary Image Style Transfer Methods, IEEE TVCG.