Problem. Existing multimodal Re-ID methods perform well with full descriptions, but their accuracy drops substantially for short attribute queries. CA-ReID studies retrieval conditioned on a reference image and a short or composite attribute query, while requiring both identity consistency and attribute satisfaction.
A New Setting
The query consists of a reference person image and a short attribute condition.
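One way to picture such a query is as a small record holding the reference image and one short phrase per body region. The sketch below is purely illustrative: the class name `CAQuery`, its fields, and the lower-cased region names (borrowed from the per-region results table) are assumptions, not the benchmark's actual data format.

```python
from dataclasses import dataclass, field

# Region names taken from the per-region results table; treating them as
# the attribute vocabulary is an assumption made for this sketch.
REGIONS = ("head", "top", "bottom", "feet", "accessories", "belongings", "context")


@dataclass
class CAQuery:
    """Hypothetical container for a reference image plus attribute condition."""

    reference_image: str                            # path or ID of the reference image
    attributes: dict = field(default_factory=dict)  # region name -> short phrase

    def __post_init__(self):
        unknown = set(self.attributes) - set(REGIONS)
        if unknown:
            raise ValueError(f"unknown regions: {sorted(unknown)}")

    @property
    def is_composite(self):
        # A composite query constrains more than one region at once.
        return len(self.attributes) > 1

    def to_text(self):
        # Join the phrases in a fixed region order to form the short text query.
        return ", ".join(self.attributes[r] for r in REGIONS if r in self.attributes)


query = CAQuery("ref.jpg", {"feet": "white sneakers", "top": "red jacket"})
```

Here `query.to_text()` yields the phrases in region order, and `query.is_composite` distinguishes single-attribute from composite conditions.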
Method
Part-Aware Representations (PAR): pose groups image patches into body regions, and text is projected into matching slots.
DDL separates identity cues from attribute edits and reduces cross-part leakage.
The reference image and attribute query are fused, then matched against the gallery.
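The three steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the paper's implementation: region assignment is taken as given (one integer region ID per patch), part slots are plain per-region averages, fusion is a convex combination with a hypothetical weight `alpha`, and matching uses cosine similarity. The function names are invented for this sketch, and the DDL objective is not modeled.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    # Normalize along the last axis; all-zero rows stay zero.
    return x / np.maximum(np.linalg.norm(x, axis=-1, keepdims=True), eps)


def pool_parts(patch_feats, region_ids, num_regions):
    """Average patch features (P, D) into one slot per body region (R, D)."""
    slots = np.zeros((num_regions, patch_feats.shape[1]))
    for r in range(num_regions):
        mask = region_ids == r
        if mask.any():  # regions without patches keep a zero slot
            slots[r] = patch_feats[mask].mean(axis=0)
    return slots


def fuse_query(image_slots, text_slots, alpha=0.5):
    """Assumed fusion: per-slot convex combination, then flatten to one vector."""
    fused = alpha * l2_normalize(image_slots) + (1.0 - alpha) * l2_normalize(text_slots)
    return l2_normalize(fused).reshape(-1)


def rank_gallery(query_vec, gallery_vecs):
    """Return gallery indices sorted by descending cosine similarity, plus scores."""
    scores = l2_normalize(gallery_vecs) @ l2_normalize(query_vec)
    return np.argsort(-scores, kind="stable"), scores


# Toy example: three patches, two regions, 2-D features.
patches = np.array([[1.0, 0.0], [3.0, 0.0], [0.0, 2.0]])
image_slots = pool_parts(patches, np.array([0, 0, 1]), num_regions=2)
text_slots = np.array([[2.0, 0.0], [0.0, 2.0]])  # stand-in for projected text
query_vec = fuse_query(image_slots, text_slots)
gallery = np.array([[1.0, 0.0, 0.0, 1.0], [1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 1.0, 0.0]])
order, scores = rank_gallery(query_vec, gallery)
```

Keeping the slots aligned between image and text is what lets a single dot product compare region to region; any learned projection or attention-based fusion would replace `fuse_query` here.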
Benchmark
Results
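CA-ReID improves R@1, R@5, and mAP over Inst-ReID on every query split (E, M, H) of both datasets, with the largest gains on the H split: R@1 rises from 41.6 to 58.6 on Celeb-ReID-L and from 44.0 to 50.4 on COCAS+Real2.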
Results on the proposed CA-ReID setting.
| Method | Query | Celeb-ReID-L R@1 | Celeb-ReID-L R@5 | Celeb-ReID-L mAP | COCAS+Real2 R@1 | COCAS+Real2 R@5 | COCAS+Real2 mAP |
|---|---|---|---|---|---|---|---|
| DIFFER [38]* | E | 23.6 | 25.2 | 12.5 | 31.6 | 39.2 | 14.9 |
| Inst-ReID [21] | E | 81.8 | 95.6 | 20.8 | 82.7 | 93.5 | 36.4 |
| Inst-ReID [21] | M | 74.0 | 81.8 | 19.0 | 51.8 | 78.2 | 17.9 |
| Inst-ReID [21] | H | 41.6 | 71.0 | 14.7 | 44.0 | 67.9 | 19.9 |
| CA-ReID (Ours) | E | 83.1 | 97.8 | 24.5 | 83.9 | 94.7 | 38.3 |
| CA-ReID (Ours) | M | 78.9 | 86.2 | 23.3 | 55.1 | 82.4 | 21.2 |
| CA-ReID (Ours) | H | 58.6 | 79.4 | 20.4 | 50.4 | 74.2 | 21.3 |
*Image-only ReID.
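For readers unfamiliar with the metrics in the table, the sketch below shows the standard definitions of R@k and mAP over ranked retrieval lists. It assumes binary relevance per gallery item and skips queries with no relevant item; the benchmark's exact evaluation protocol (for example, how identity consistency and attribute satisfaction are jointly checked) is not specified here, so treat this as a generic reference rather than the official scorer.

```python
import numpy as np


def recall_at_k(relevance, k):
    """Fraction of queries with at least one relevant item in the top k.

    relevance: (Q, G) array of 0/1, each row already sorted by descending score.
    """
    relevance = np.asarray(relevance)
    return float((relevance[:, :k].sum(axis=1) > 0).mean())


def mean_average_precision(relevance):
    """Mean over queries of the average precision of each ranked list."""
    aps = []
    for row in np.asarray(relevance):
        hits = np.flatnonzero(row)  # 0-based ranks of the relevant items
        if hits.size == 0:
            continue  # assumption: queries without any relevant item are skipped
        # Precision at the i-th hit = i / (rank of that hit, 1-based).
        precision_at_hits = np.arange(1, hits.size + 1) / (hits + 1)
        aps.append(precision_at_hits.mean())
    return float(np.mean(aps)) if aps else 0.0


# Two toy queries over a four-item gallery.
ranked_relevance = [[1, 0, 0, 1],
                    [0, 0, 1, 0]]
r1 = recall_at_k(ranked_relevance, 1)
r5 = recall_at_k(ranked_relevance, 5)
m_ap = mean_average_precision(ranked_relevance)
```

R@k only asks whether any correct match appears early, while mAP rewards ranking all correct matches high, which is why the two can diverge sharply in the table.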
Per-region Celeb-ReID-L results for CA-ReID (Ours).
| Region | R@1 | R@5 | mAP |
|---|---|---|---|
| Head | 59.4 | 79.3 | 18.4 |
| Top | 62.4 | 85.2 | 23.5 |
| Bottom | 60.3 | 83.2 | 21.9 |
| Feet | 58.3 | 80.2 | 21.0 |
| Accessories | 55.4 | 76.1 | 19.5 |
| Belongings | 58.2 | 77.8 | 19.9 |
| Context | 56.2 | 74.1 | 18.4 |
Comparison on LTCC and PRCC benchmarks.
| Method | Venue | LTCC Top1 | LTCC mAP | PRCC Top1 | PRCC mAP |
|---|---|---|---|---|---|
| TransReID | CVPR'21 | 46.6 | 44.8 | 34.4 | 17.1 |
| CAL | CVPR'22 | 55.2 | 55.8 | 40.1 | 18.0 |
| AIM | CVPR'23 | 57.9 | 58.3 | 40.6 | 19.1 |
| LDF | ACM'23 | 58.4 | 58.6 | 32.9 | 15.4 |
| 3DInv | ICCV'23 | 40.9 | 18.9 | 56.5 | 57.2 |
| CCFA | CVPR'23 | 45.3 | 22.1 | 61.2 | 58.4 |
| CLIP3D | CVPR'24 | 42.1 | 22.9 | 61.8 | 58.3 |
| Inst-ReID | CVPR'24 | 66.7 | 46.7 | 54.2 | 52.3 |
| DIFFER | CVPR'25 | 68.5 | 64.7 | 58.2 | 31.6 |
| CA-ReID (Ours) | CVPR'26 | 63.8 | 53.7 | 55.2 | 43.4 |
Examples
Short-query retrieval examples from the benchmark.
BibTeX
@InProceedings{patwari2026careid,
title = {Composite-Attribute Person Re-Identification via Pose-Guided Disentanglement},
author = {Patwari, Kartik and Vesdapunt, Noranart and Wang, Chien-Yi and Li, Dawei and
Huynh, Cong Phuoc and Zhou, Ning and Chuah, Chen-Nee and Fu, Kah Kuen},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2026}
}