CVPR 2026

Composite-Attribute Person Re-Identification via Pose-Guided Disentanglement

Kartik Patwari¹*, Noranart Vesdapunt², Chien-Yi Wang², Dawei Li², Cong Phuoc Huynh², Ning Zhou², Chen-Nee Chuah¹, Kah Kuen Fu²

¹University of California, Davis ²Amazon

*This work was done at Amazon.

Problem. Existing multimodal Re-ID methods perform well with full descriptions, but their accuracy drops substantially for short attribute queries. CA-ReID studies retrieval conditioned on a reference image and a short or composite attribute query, while requiring both identity consistency and attribute satisfaction.

Abstract

Recent advances in vision-language models have enabled multimodal person re-identification (Re-ID), where the system takes both an image and a text query to identify matching individuals. While previous state-of-the-art methods perform well with detailed, sentence-level descriptions, we find that their Recall@1 drops by half with short, keyword-based queries because of ambiguity, training biases, and under-represented attributes. Yet short queries provide a more natural and efficient user experience: they require less effort and allow iterative refinement. To address this limitation, we introduce a new problem setting, Composite-Attribute Person Re-ID (CA-ReID), along with a fine-grained composite-attribute dataset whose queries span varying levels of ambiguity. We further propose two methods: a Dense Disentangling Loss that promotes attribute-specific embeddings, and Part-Aware Representations that use pose estimation to align textual attributes with the relevant body regions. Our method sets a new state of the art on the new CA-ReID benchmark (up to +17% Recall@1) and performs on par with prior methods on existing clothes-changing Re-ID (CC-ReID) benchmarks.

Method Overview

Pose estimation groups image patches into body regions, and the text attributes are projected into matching region slots.
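
The grouping step in the caption can be pictured with a small sketch. This is an illustration under loud assumptions, not the paper's implementation: the nearest-keypoint assignment rule, the mean pooling, and the name `group_patches_by_pose` are all hypothetical choices made here for clarity.

```python
from math import hypot


def group_patches_by_pose(patch_centers, patch_feats, keypoints):
    """Assign each patch to the region of its nearest keypoint, then mean-pool.

    patch_centers: list of (x, y) patch centers in normalized image coordinates.
    patch_feats:   list of equal-length feature vectors, one per patch.
    keypoints:     dict mapping region name -> non-empty list of (x, y) keypoints.
    Returns a dict mapping region name -> pooled feature vector, or None if no
    patch was assigned to that region.
    """
    dim = len(patch_feats[0])
    sums = {region: [0.0] * dim for region in keypoints}
    counts = {region: 0 for region in keypoints}

    for (px, py), feat in zip(patch_centers, patch_feats):
        # Hypothetical rule: a patch belongs to the region whose closest
        # keypoint is nearest to the patch center.
        region = min(
            keypoints,
            key=lambda r: min(hypot(px - kx, py - ky) for kx, ky in keypoints[r]),
        )
        counts[region] += 1
        sums[region] = [s + f for s, f in zip(sums[region], feat)]

    return {
        region: [s / counts[region] for s in sums[region]] if counts[region] else None
        for region in keypoints
    }
```

Each pooled vector plays the role of one region slot. A text attribute projected into the same slot could then be compared only against the image evidence for its own body region.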

CA-ReID Dataset

Composite Attributes

We separate attributes into five semantic body-part regions: head, top, bottom, feet, and other. The other category consists of belongings, context, and accessories.
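
As a rough sketch, the taxonomy can be written as a lookup table. The five region names and the three sub-categories of other come from the description above. The example attribute phrases and the helper `region_of` are invented here for illustration and are not taken from the dataset.

```python
# Sub-categories grouped under the "other" region, as described in the text.
OTHER_SUBCATEGORIES = ("belongings", "context", "accessories")

# Region -> example attribute phrases. The phrases are made up for illustration.
ATTRIBUTE_REGIONS = {
    "head": ["short black hair"],
    "top": ["red jacket"],
    "bottom": ["blue jeans"],
    "feet": ["white sneakers"],
    "other": ["backpack", "standing outdoors", "sunglasses"],
}


def region_of(attribute):
    """Return the region an attribute phrase is filed under, or None if unknown."""
    for region, phrases in ATTRIBUTE_REGIONS.items():
        if attribute in phrases:
            return region
    return None
```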

Query Difficulty

We create condition texts at three difficulty levels: easy (E) queries contain the full target description with all attributes, medium (M) queries contain two to three attributes, and hard (H) queries contain a single attribute and are therefore the most ambiguous.
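
The three difficulty levels can be sketched as a small query builder. This is a sketch under assumptions, not the dataset's actual generation pipeline: the function name `build_query`, the random sampling of attributes, and the comma-joined output format are all hypothetical.

```python
import random


def build_query(attributes, difficulty, full_description="", rng=random):
    """Build a condition text at one of three difficulty levels.

    attributes:       list of attribute phrases for the target image.
    difficulty:       "easy", "medium", or "hard".
    full_description: optional sentence-level description, used for easy queries.
    rng:              object with choice() and sample(), e.g. random.Random(seed).
    """
    if difficulty == "easy":
        # Easy: the full description together with every attribute.
        parts = ([full_description] if full_description else []) + list(attributes)
        return ", ".join(parts)
    if difficulty == "medium":
        # Medium: two to three attributes, capped by how many are available.
        k = min(len(attributes), rng.choice([2, 3]))
    elif difficulty == "hard":
        # Hard: a single attribute, the most ambiguous case.
        k = min(len(attributes), 1)
    else:
        raise ValueError(f"unknown difficulty: {difficulty!r}")
    return ", ".join(rng.sample(list(attributes), k))
```

For example, `build_query(["red jacket", "blue jeans", "backpack"], "hard")` returns one of the three phrases, which may match many gallery identities.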

Results

Quantitative Results

Our method improves retrieval accuracy over InstructReID at every query difficulty, with the largest gains on hard queries: on Celeb-ReID-L, hard-query R@1 rises from 41.6 to 58.6.

CA-ReID benchmark summary

Results on the proposed CA-ReID setting.

Method	Query	Celeb-ReID-L R@1	Celeb-ReID-L R@5	Celeb-ReID-L mAP	COCAS+Real2 R@1	COCAS+Real2 R@5	COCAS+Real2 mAP
DIFFER*	E	23.6	25.2	12.5	31.6	39.2	14.9
InstructReID	E	81.8	95.6	20.8	82.7	93.5	36.4
InstructReID	M	74.0	81.8	19.0	51.8	78.2	17.9
InstructReID	H	41.6	71.0	14.7	44.0	67.9	19.9
CA-ReID (Ours)	E	83.1	97.8	24.5	83.9	94.7	38.3
CA-ReID (Ours)	M	78.9	86.2	23.3	55.1	82.4	21.2
CA-ReID (Ours)	H	58.6	79.4	20.4	50.4	74.2	21.3

*Image-only ReID.
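
For readers unfamiliar with the metrics in the table, the following sketch shows the standard definitions of Recall@K (R@K) and mean average precision (mAP) over ranked retrieval lists. It assumes each gallery item has already been marked as a match or not. The exact evaluation protocol is not given on this page, so the function names and input format here are illustrative.

```python
def recall_at_k(ranked_match_flags, k):
    """Fraction of queries with at least one correct gallery item in the top k.

    ranked_match_flags: one list per query, holding booleans in ranked order
    (True where the gallery item at that rank is a correct match).
    """
    hits = sum(1 for flags in ranked_match_flags if any(flags[:k]))
    return hits / len(ranked_match_flags)


def mean_average_precision(ranked_match_flags):
    """Mean over queries of the average precision of each ranked list."""
    aps = []
    for flags in ranked_match_flags:
        hits, precisions = 0, []
        for rank, is_match in enumerate(flags, start=1):
            if is_match:
                hits += 1
                # Precision measured at the rank of each correct match.
                precisions.append(hits / rank)
        aps.append(sum(precisions) / len(precisions) if precisions else 0.0)
    return sum(aps) / len(aps)
```

In the CA-ReID setting described above, a gallery item would count as a match only if it shows the same identity as the reference image and also satisfies the attribute query.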

Hard queries by attribute region

Celeb-ReID-L hard-query results for CA-ReID (Ours), with the other region reported separately as accessories, belongings, and context.

Region	R@1	R@5	mAP
Head	59.4	79.3	18.4
Top	62.4	85.2	23.5
Bottom	60.3	83.2	21.9
Feet	58.3	80.2	21.0
Accessories	55.4	76.1	19.5
Belongings	58.2	77.8	19.9
Context	56.2	74.1	18.4

Standard CC-ReID benchmark comparisons

Comparison on LTCC and PRCC benchmarks.

Method	Venue	LTCC Top1	LTCC mAP	PRCC Top1	PRCC mAP
TransReID	CVPR'21	46.6	44.8	34.4	17.1
CAL	CVPR'22	55.2	55.8	40.1	18.0
AIM	CVPR'23	57.9	58.3	40.6	19.1
LDF	ACM'23	58.4	58.6	32.9	15.4
3DInv	ICCV'23	40.9	18.9	56.5	57.2
CCFA	CVPR'23	45.3	22.1	61.2	58.4
CLIP3D	CVPR'24	42.1	22.9	61.8	58.3
InstructReID	CVPR'24	66.7	46.7	54.2	52.3
DIFFER	CVPR'25	68.5	64.7	58.2	31.6
CA-ReID (Ours)	CVPR'26	63.8	53.7	55.2	43.4

Qualitative Results

BibTeX

Citation

@inproceedings{patwari2026composite,
  title={Composite-Attribute Person Re-Identification via Pose-Guided Disentanglement},
  author={Patwari, Kartik and Vesdapunt, Noranart and Wang, Chien-Yi and Li, Dawei and Huynh, Cong Phuoc and Zhou, Ning and Chuah, Chen-Nee and Fu, Kah Kuen},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={13812--13823},
  year={2026}
}

Contact: kpatwari@ucdavis.edu