My research aims to develop robotic systems that achieve human-like intelligence and dexterity
and operate in complex and evolving environments with safety, robustness, and trustworthiness. 🤖
We introduce Distribution Contractive Reinforcement Learning (DICE-RL), a framework that uses
reinforcement learning (RL) as a "distribution contraction" operator to refine pretrained
generative robot policies. DICE-RL turns a pretrained behavior prior into a high-performing "pro"
policy by amplifying high-success behaviors from online feedback. It enables mastery of complex
long-horizon manipulation skills directly from high-dimensional pixel inputs, both in simulation
and on a real robot.
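A minimal one-dimensional sketch of the "distribution contraction" idea: a broad pretrained behavior prior proposes actions, online rewards re-weight them, and the refined policy is fit to the re-weighted samples. The exponential-tilting rule, temperature, and toy reward below are illustrative assumptions, not DICE-RL's exact objective.

```python
import numpy as np

rng = np.random.default_rng(0)

def contract(prior_samples, reward_fn, temperature=0.2):
    """Amplify high-success behaviors: weight each sampled action by
    exp(reward / temperature), then resample to form the refined policy."""
    rewards = reward_fn(prior_samples)
    weights = np.exp((rewards - rewards.max()) / temperature)
    weights /= weights.sum()
    idx = rng.choice(len(prior_samples), size=len(prior_samples), p=weights)
    return prior_samples[idx]

# Broad behavior prior over a scalar action; reward peaks at action = 1.0.
prior = rng.normal(0.0, 1.0, size=10_000)
reward = lambda a: -(a - 1.0) ** 2

refined = contract(prior, reward)
# The refined distribution is tighter than the prior and centered
# nearer the reward optimum -- the "contraction" effect.
```

Repeating this operator with fresh online feedback is what gradually turns the generalist prior into the specialized "pro" policy.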
We introduce Latent Policy Barrier, a framework for robust visuomotor policy learning.
LPB treats the latent embeddings of expert demonstrations as an implicit barrier separating safe,
in-distribution states from unsafe, out-of-distribution (OOD) ones. Our approach decouples the roles
of precise expert imitation and OOD recovery, assigning them to a base diffusion policy and a dynamics model, respectively.
At inference time, the dynamics model predicts future latent states and optimizes them to stay
within the expert distribution.
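A toy sketch of the inference-time step: treat distance to the set of expert latent embeddings as an implicit barrier, and nudge a predicted future latent back toward the expert distribution by gradient descent on that distance. The expert embeddings, the nearest-neighbor barrier, and the step size are all invented for illustration.

```python
import numpy as np

# Stand-in expert latent embeddings (2-D for visualization).
expert_latents = np.array([[0.0, 0.0], [1.0, 0.2], [0.5, -0.1]])

def barrier_grad(z):
    """Gradient of squared distance to the nearest expert latent."""
    diffs = z - expert_latents
    nearest = diffs[np.argmin((diffs ** 2).sum(axis=1))]
    return 2.0 * nearest

def project_to_expert(z, steps=50, lr=0.1):
    """Optimize a predicted future latent to stay within the expert set."""
    for _ in range(steps):
        z = z - lr * barrier_grad(z)
    return z

z_pred = np.array([3.0, 3.0])   # OOD latent predicted by the dynamics model
z_safe = project_to_expert(z_pred)
```

In the full method the dynamics model rolls latents forward in time, so this correction steers the robot away from OOD states before it reaches them.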
We introduce a method that automatically generates reward functions for agents to learn new tasks
using only a text description of the task goal and the agent’s visual observations, by leveraging
feedback from vision language foundation models (VLMs).
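The loop can be sketched as follows: the VLM is asked how well the current visual observation matches the text goal, and its score becomes the RL reward. `query_vlm` below is a placeholder for a real VLM call (the actual model, prompt format, and scoring scheme are not specified here), faking a score from a toy observation dict.

```python
def query_vlm(observation, goal_text):
    # Placeholder: a real implementation would send the image plus a prompt
    # such as "On a scale of 0-1, how close is the scene to: {goal_text}?"
    # Here we simply read a fake score from a toy observation dict.
    return observation.get("progress", 0.0)

def vlm_reward(observation, goal_text):
    """Turn VLM feedback into a dense reward for the learning agent."""
    score = query_vlm(observation, goal_text)
    return max(0.0, min(1.0, score))   # clamp to a valid reward range
```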
We introduce a method that leverages both vision and force modalities for robot-assisted dressing.
Our method combines a vision-based policy trained in simulation with a force dynamics model
learned in the real world, achieving better dressing performance and safety for the user.
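One way the two modalities can be combined at action-selection time, sketched below with invented stand-ins for both models: the vision policy proposes candidate actions, the learned force dynamics model predicts the force each would apply to the person, and candidates predicted to exceed a safety threshold are rejected before the best-scoring remainder is executed.

```python
def select_action(candidates, force_model, policy_scores, max_force=5.0):
    """Pick the highest-scoring candidate whose predicted force is safe.

    candidates    -- actions proposed by the vision-based policy
    force_model   -- maps an action to a predicted force on the user
    policy_scores -- the vision policy's preference for each candidate
    """
    safe = [i for i, a in enumerate(candidates) if force_model(a) <= max_force]
    if not safe:
        return None   # no safe action: stop and replan
    return candidates[max(safe, key=lambda i: policy_scores[i])]
```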
We develop, for the first time, a robot-assisted dressing system able to dress different
garments on people with diverse body shapes and poses from partial point cloud observations,
using a single reinforcement learning policy.
We propose a dedicated algorithm and accelerator co-design framework dubbed ViTCoD to accelerate ViTs.
On the algorithm level, we prune and polarize the attention maps into either denser or sparser patterns.
On the hardware level, we develop an accelerator that coordinates the denser/sparser workloads for higher hardware utilization.
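A toy illustration of the polarization step: prune small attention scores, then reorder tokens so that rows with many surviving entries form a denser block and the rest a sparser block, which a coordinated accelerator can schedule separately. The threshold and the density-sorting rule are illustrative, not ViTCoD's actual algorithm.

```python
import numpy as np

def polarize(attn, keep=0.3):
    """Prune an attention map and sort rows from densest to sparsest.

    attn -- a (tokens x tokens) attention score matrix
    keep -- approximate fraction of entries to retain
    """
    mask = attn >= np.quantile(attn, 1.0 - keep)   # prune low scores
    density = mask.sum(axis=1)
    order = np.argsort(-density)                   # dense rows first
    return mask[order], order
```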
We discover, for the first time, that both efficient DNNs and their lottery subnetworks can be identified
from a supernet, and we propose a two-in-one training scheme that jointly performs architecture search
and parameter pruning to identify them.
We propose a method that leverages human guidance for high-DOF robot motion planning in partially observable
environments. We project the robot's continuous configuration space onto a discrete task model and use
inverse RL to learn motion-level guidance from human critiques.
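A minimal sketch of learning from critiques: each critique marks one motion as preferred over another, and a linear reward over motion features is nudged to rank the preferred one higher. This simple preference-based update rule and the feature vectors are invented for illustration, not the paper's exact IRL formulation.

```python
import numpy as np

def update_reward(w, feat_preferred, feat_critiqued, lr=0.1):
    """One critique step: move the reward weights toward the features of
    the human-preferred motion and away from the critiqued one."""
    return w + lr * (feat_preferred - feat_critiqued)
```

Accumulated over many critiques, the learned reward provides motion-level guidance that biases the planner toward human-approved behavior.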
Education
Carnegie Mellon University Master of Science in Robotics (MSR)