Building the foundation for the next generation of intelligent systems through world models and multimodal perception. 通过世界模型和多模态感知,奠基下一代智能系统。
About 关于我们
We are a dynamic research collective at the forefront of artificial intelligence,
dedicated to solving the fundamental challenges of embodied intelligence. Our vision is to create
self-evolving agents capable of perceiving, reasoning, and acting in complex, real-world
environments—agents that continuously learn and adapt to seamlessly bridge the digital and physical
worlds.
Our mission is to establish a complete pipeline for these agents, forming a closed loop of efficient
data preparation, multimodal perception, spatiotemporal decision-making, and continuous learning. By
integrating cutting-edge research in 3D vision, world models, and large language models, we are
building the foundation for the next generation of intelligent systems.
We are always looking for motivated PhD students, postdocs, and research assistants who share our
vision. Check out the Join Us section and follow us on WeChat.
我们是一个走在人工智能前沿的充满活力的研究集体,致力于解决具身智能的根本挑战。我们的愿景是创造能够在复杂的现实世界环境中感知、推理和行动的自我进化智能体——能够持续学习和适应,无缝连接数字与物理世界的智能体。
我们的使命是为这些智能体建立一个完整的创建流程,形成高效数据制备、多模态感知、时空决策和持续学习的闭环。通过整合3D视觉、世界模型和大型语言模型领域的最前沿研究,我们正在为下一代智能系统奠定基础。
我们随时欢迎有共同愿景的博士生、博士后和研究助理加入我们。请查看我们的加入我们栏目并关注我们的微信公众号。
We develop innovative methods for efficient collection, annotation, and preprocessing of embodied AI data. We focus on creating high-quality datasets that enable robust robot learning in real-world environments. 我们开发创新的方法以前沿、高效的方式收集、标注和处理具身AI数据。我们专注于创建高质量数据集,以在现实环境中实现鲁棒的机器人学习。
We build comprehensive world models through multimodal sensory fusion and understanding. Our research enables agents to perceive and reason about complex 3D environments using multiple modalities. 我们通过多模态感知融合建立全面的世界模型。我们的研究使智能体能够使用多模态感知和推理复杂的3D环境。
We train robot policies that can make intelligent decisions in complex spatiotemporal environments. Our work bridges the gap between simulation and real-world deployment through robust policy learning. 我们致力于训练能够在复杂时空环境中做出智能决策的机器人策略模型。我们的研究通过鲁棒策略学习弥合了仿真与现实部署的差距。
We enable agents to continuously learn and adapt to new environments and tasks without catastrophic forgetting. Our research addresses the fundamental challenges of lifelong learning in embodied AI. 我们让智能体能够不断学习并适应新环境和新任务。我们的研究应对了具身AI的持续学习的基础挑战,从而避免灾难性遗忘。
AnchorGen: Multi-View Geometric Anchoring for Keyframe-Aware Embodied Video Generation
Under Review
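AnchorGen is a keyframe-aware, geometry-anchored video generation framework that improves the 3D consistency of robot action-conditioned videos. It uses self-supervised 2D-3D contrastive learning to automatically discover important keyframes, such as contacts and state changes, and injects sparse geometric features as structured conditioning into a multimodal diffusion model, significantly improving generation quality and spatial consistency on real robot data.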
Associate Professor, Tsinghua University Shenzhen International Graduate School 副教授,清华大学深圳国际研究生院
Personal Website 个人主页 | Email: wangzhi@sz.tsinghua.edu.cn 邮箱: wangzhi@sz.tsinghua.edu.cn
Researcher 研究员
Personal Website 个人主页 | Email: jiangjingyanjlu@gmail.com 邮箱: jiangjingyanjlu@gmail.com
We are always looking for passionate and talented students to join our team. If you are interested in shaping the future of embodied AI, we encourage you to apply!
我们一直在寻找有热情和才华的学生加入我们的团队。如果您对塑造具身人工智能的未来感兴趣,我们鼓励您申请。
Students with a strong background in math, AI, programming, or robotics are welcome.
欢迎数理或计算机基础扎实的同学加入课题组参与科研。
📩 Contact Yuzhi Huang 📩 联系 黄誉之
Please include "Prospective Student" in your email subject line and attach your CV and sample code or papers.
发送简历及代表作,邮件标题请注 “Prospective Student”。
📩 Contact Prof. Zhi Wang 📩 联系 王智教授
We invite bright minds worldwide for cutting-edge collaborative research in embodied AI.
诚邀海内外优秀学者合作交流,共同推进具身智能前沿研究。
📩 Contact Jingyan Jiang 📩 联系 姜婧妍
Open to robotics deployment and industrial technology collaborations.
开放探讨大模型工业落地、前沿算法研发与核心技术转化合作。
📩 Contact Prof. Zhi Wang 📩 联系 王智教授