I'm a Master's student in computer science at Stanford University,
where I research robot learning, foundation models, and computer vision, advised by Professor Jiajun Wu.
I have interned multiple times at Apple working on computer vision. I've contributed to health and
fitness features, gaze and gesture on Apple Vision Pro, and novel view synthesis / neural rendering
on Apple Vision Pro.
Our work on memristor arrays is published in Nature Machine Intelligence.
Jan 2017
Invited Talk @ TEDxDeerfield "Understanding AI and Its Future" [YouTube]
Research
I'm broadly interested in robot learning, with the long-term goal of building generalist robots for
the household.
Specifically, my research focuses on 1) applying foundation models to robotics and 2) learning from
human demonstrations.
BLADE is a framework for long-horizon robotic manipulation that integrates imitation learning and
model-based planning.
BLADE leverages language-annotated demonstrations, extracts abstract action knowledge from large
language models (LLMs),
and constructs a library of structured, high-level action representations.
SfA is a framework to discover 3D part geometry and joint parameters of unseen articulated objects
via a sequence of inferred interactions. We show that 3D interaction and perception should
be considered in conjunction to construct 3D articulated CAD models.
This work demonstrates energy-efficient, memristor-based convolutional networks that achieve high
accuracy using weight-sharing techniques and fewer parameters, highlighting their potential for
future edge AI.
Apple Inc. | US Patent US20240041354A1 | Filed 2023 | Patent link
The embodiments describe a method for tracking caloric expenditure using a camera by analyzing face
tracking data, motion sensor data, and environmental factors to estimate energy expenditure.
Apple Inc. | US Patent US20230096949A1 | Filed 2022 | Patent link
The embodiments describe a method for monitoring posture and motion using mobile devices by combining
motion sensor data and skeletal data to estimate and classify a user's body pose through machine learning.