Shape My Moves
Text-Driven Shape-Aware Synthesis of Human Motions

Adobe Research    University of Maryland, College Park

CVPR 2025

Motion Synthesis on Various Body Shapes

We showcase results of our model, ShapeMove, applied to synthesize the same actions for different body shapes. In the videos below, we pair the same motion description with different shape descriptions to demonstrate how our method captures the natural variations that arise when different body types perform the same action. The first row of videos illustrates four sample body types, and the Action buttons choose which motion these four body types perform; a hypothetical sketch of such prompt pairings follows the grid.


Body Type 1 · Body Type 2 · Body Type 3 · Body Type 4
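As a concrete illustration of this pairing, the snippet below combines one motion description with several shape descriptions. The prompt wording and format are assumptions for illustration, not the exact inputs ShapeMove consumes.

    # Hypothetical prompt pairings: same action, different body shapes.
    motion_description = "a person runs forward, then slows to a walk"
    shape_descriptions = [
        "a tall, slim person",
        "a short, stocky person",
        "a person with a heavy build",
        "an average-height, athletic person",
    ]

    for shape in shape_descriptions:
        prompt = f"{shape}. {motion_description}."
        print(prompt)  # each prompt drives the same motion on a different body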




Method

At inference, our model predicts motion and shape tokens from the text input. We de-quantize these tokens using FSQ and project the resulting shape feature into shape parameters with the Projector. We then concatenate the shape and motion features and decode them into the generated motion sequence with the Motion Decoder. Our model thus synthesizes both shape parameters and shape-aware motions that reflect the physical form and actions described in the input text.
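To make the data flow concrete, here is a minimal PyTorch sketch of this inference path. Every module name, signature, and tensor shape below is an illustrative assumption, not the released ShapeMove implementation.

    import torch
    import torch.nn as nn

    class ShapeMoveInference(nn.Module):
        """Sketch: text -> discrete tokens -> continuous features -> motion.
        All submodules here are hypothetical stand-ins."""

        def __init__(self, text_encoder, token_predictor, fsq, projector, decoder):
            super().__init__()
            self.text_encoder = text_encoder        # text -> conditioning embedding
            self.token_predictor = token_predictor  # embedding -> motion/shape tokens
            self.fsq = fsq                          # FSQ codec (sketched below)
            self.projector = projector              # shape feature -> shape parameters
            self.decoder = decoder                  # fused features -> motion sequence

        @torch.no_grad()
        def forward(self, text):
            cond = self.text_encoder(text)
            motion_tokens, shape_tokens = self.token_predictor(cond)
            # De-quantize the discrete tokens back into continuous features.
            motion_feat = self.fsq.dequantize(motion_tokens)  # (T, D)
            shape_feat = self.fsq.dequantize(shape_tokens)    # (1, D)
            beta = self.projector(shape_feat)                 # predicted shape parameters
            # Concatenate shape and motion features, then decode the motion.
            fused = torch.cat(
                [shape_feat.expand(motion_feat.shape[0], -1), motion_feat], dim=-1
            )                                                 # (T, 2D)
            return self.decoder(fused), beta

The FSQ de-quantization step can likewise be illustrated with a minimal sketch of finite scalar quantization: each latent dimension is bounded and rounded to a small number of levels, and de-quantization rescales the integer codes back to a normalized continuous range. The per-dimension level counts below are assumptions; ShapeMove's actual FSQ configuration is not given on this page.

    import torch

    levels = torch.tensor([8, 8, 8, 5, 5])  # assumed levels per latent dimension

    def fsq_quantize(z: torch.Tensor) -> torch.Tensor:
        # Bound each dimension to [-(L-1)/2, (L-1)/2] with tanh, then round to
        # the nearest integer level (straight-through trick omitted: inference only).
        half = (levels - 1) / 2
        return torch.round(half * torch.tanh(z))

    def fsq_dequantize(codes: torch.Tensor) -> torch.Tensor:
        # Rescale integer codes back to a normalized feature in [-1, 1].
        half = (levels - 1) / 2
        return codes / half

    z = torch.randn(4, 5)           # (num tokens, latent dim)
    codes = fsq_quantize(z)         # discrete token values
    feat = fsq_dequantize(codes)    # continuous feature fed onward to the decoder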



Comparisons with Baselines for Given Body Shapes

We showcase comparative results against the baselines T2M-GPT [1], MotionDiffuse [2], MotionGPT [3], MLD [4], and MDM [5]. The ground-truth motion and body shape appear in the first column for reference. This page contains 30 samples grouped by action. Selecting a particular Action, e.g., 'Run', shows the list of sample videos for that action type; each video in the list can then be selected and viewed.







Ground Truth · Ours · MotionDiffuse · T2M-GPT · Ground Truth Body Shape · MotionGPT · MLD · MDM



References
[1] Jianrong Zhang, Yangsong Zhang, Xiaodong Cun, Shaoli Huang, Yong Zhang, Hongwei Zhao, Hongtao Lu, and Xi Shen. T2M-GPT: Generating human motion from textual descriptions with discrete representations. In CVPR, 2023.
[2] Mingyuan Zhang, Zhongang Cai, Liang Pan, Fangzhou Hong, Xinying Guo, Lei Yang, and Ziwei Liu. MotionDiffuse: Text-driven human motion generation with diffusion model. arXiv preprint, 2022.
[3] Biao Jiang, Xin Chen, Wen Liu, Jingyi Yu, Gang Yu, and Tao Chen. MotionGPT: Human motion as a foreign language. In NeurIPS, 2023.
[4] Xin Chen, Biao Jiang, Wen Liu, Zilong Huang, Bin Fu, Tao Chen, and Gang Yu. Executing your commands via motion diffusion in latent space. In CVPR, 2023.
[5] Guy Tevet, Sigal Raab, Brian Gordon, Yoni Shafir, Daniel Cohen-Or, and Amit Haim Bermano. Human motion diffusion model. In ICLR, 2023.

BibTeX

 TBU