Roboticist at MIT and TRI
(Foundation models for dexterous manipulation)
Russ Tedrake
MIT, EECS/CSAIL
russt@mit.edu
DARPA Robotics Challenge, 2015
large language models
visually-conditioned language models
large behavior models
\(\sim\) VLA (vision-language-action)
\(\sim\) EFM (embodied foundation model)
vision encoder
language encoder
action decoder
robot joint encoder
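The block labels above (vision encoder, language encoder, robot joint encoder, action decoder) can be read as a data flow. The following is only a toy sketch of that flow, not the architecture used in the talk: every dimension, the random linear maps standing in for trained networks, and the mean-pooling that replaces a transformer trunk are assumptions made for illustration, and the names `linear` and `policy` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 32          # shared token width (assumed)
ACTION_DIM = 7  # e.g. a 7-DoF arm command (assumed)


def linear(in_dim, out_dim):
    """Return a random linear map standing in for a trained network."""
    W = rng.normal(scale=in_dim ** -0.5, size=(in_dim, out_dim))
    return lambda x: x @ W


# One stand-in module per block label on the slide.
vision_encoder = linear(64, D)
language_encoder = linear(16, D)
robot_joint_encoder = linear(ACTION_DIM, D)
action_decoder = linear(D, ACTION_DIM)


def policy(image_patches, text_tokens, joint_positions):
    """Encode each modality into tokens, fuse them, and decode an action."""
    tokens = np.concatenate([
        vision_encoder(image_patches),                   # (n_patches, D)
        language_encoder(text_tokens),                   # (n_words, D)
        robot_joint_encoder(joint_positions[None, :]),   # (1, D)
    ])
    fused = tokens.mean(axis=0)  # mean-pool stand-in for a transformer trunk
    return action_decoder(fused)


action = policy(
    rng.normal(size=(10, 64)),  # 10 fake image-patch features
    rng.normal(size=(5, 16)),   # 5 fake word embeddings
    np.zeros(ACTION_DIM),       # current joint positions
)
print(action.shape)
```

The point of the sketch is only the interface: all three inputs land in one shared token space, and the output is a robot command rather than text.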
Why actions (for dexterous manipulation) could be different:
Should we expect similar generalization and scaling laws?
Robotics: Science and Systems, 2023
\(\Rightarrow\) Many new startups (some low-cost, some humanoids)
\(\Rightarrow\) Major new investments by tech giants
One problem: we don't (yet) have internet-scale robot data
Big data
Big transfer
Small data
No transfer
robot teleop
(the "transfer learning bet")
Open-X
simulation rollouts
novel devices
NVIDIA selected Drake and MuJoCo (for potential inclusion in Omniverse)
(Establishing faith in)
http://manipulation.mit.edu