Before LLM Emergence
- Perception:
  - Scene Understanding: Utilizes algorithms for object detection, semantic segmentation, and scene reconstruction from 2D image data and LiDAR/RGB-D point clouds.
  - State Estimation and SLAM: Integrates pose estimation and mapping, leveraging vision-based approaches, LiDAR, and IMU data (a toy mapping sketch follows this list).
  - Learning-Based Methods: Employs supervised and self-supervised techniques to enhance pose tracking and reconstruction.
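To make the mapping side of state estimation and SLAM concrete, here is a minimal sketch that accumulates simulated LiDAR returns into a 2D occupancy grid, assuming the robot pose is already known. The grid size, resolution, and scan layout are made-up toy values, not parameters from any particular system; only NumPy is required.

```python
import numpy as np

def update_occupancy_grid(grid, pose, ranges, angles, resolution=0.1, max_range=10.0):
    """Accumulate LiDAR hits into a 2D occupancy grid, given a known robot pose.

    grid       -- 2D array of hit counts (rows = y, cols = x)
    pose       -- (x, y, theta) of the robot in world coordinates
    ranges     -- range measurement of each beam (metres)
    angles     -- beam angles relative to the robot heading (radians)
    resolution -- metres per grid cell
    """
    x, y, theta = pose
    for r, a in zip(ranges, angles):
        if r >= max_range:                      # no return: skip this beam
            continue
        # End point of the beam in world coordinates.
        ex = x + r * np.cos(theta + a)
        ey = y + r * np.sin(theta + a)
        # Convert to grid indices and record an occupancy "hit".
        i, j = int(ey / resolution), int(ex / resolution)
        if 0 <= i < grid.shape[0] and 0 <= j < grid.shape[1]:
            grid[i, j] += 1
    return grid

# Toy usage: one scan with constant 5 m returns across a 90-degree field of view.
grid = np.zeros((100, 100))
angles = np.linspace(-np.pi / 4, np.pi / 4, 90)
ranges = np.full_like(angles, 5.0)
update_occupancy_grid(grid, pose=(2.0, 5.0, 0.0), ranges=ranges, angles=angles)
```

A real SLAM front end would also estimate the pose itself (e.g. via scan matching or visual odometry) and clear the cells each beam passes through; this sketch shows only the map-update step.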
											
- Planning:
  - Search-Based Planning: Uses heuristics and graph search to compute robot trajectories, exemplified by pathfinding in discrete spaces (see the A* sketch after this list).
  - Sampling-Based Planning: Randomly samples points in the configuration space to connect paths or generate states; well suited to high-dimensional, continuous spaces.
  - Task Planning: Utilizes object-level abstractions and symbolic reasoning for planning in discrete domains.
  - Reinforcement Learning: Applies end-to-end formulations to tasks such as visual navigation and driving behaviors.
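As a concrete instance of search-based planning, the sketch below runs A* with a Manhattan-distance heuristic on a 4-connected occupancy grid. The toy map, unit step costs, and grid representation are assumptions chosen for illustration, not part of any specific planner.

```python
import heapq

def astar(grid, start, goal):
    """A* search on a 4-connected grid: 0 = free cell, 1 = obstacle."""
    def h(cell):                                  # Manhattan-distance heuristic
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_set = [(h(start), 0, start)]             # entries are (f = g + h, g, cell)
    came_from = {}
    best_g = {start: 0}

    while open_set:
        _, g, current = heapq.heappop(open_set)
        if current == goal:                       # reconstruct path back to start
            path = [current]
            while current in came_from:
                current = came_from[current]
                path.append(current)
            return path[::-1]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (current[0] + dr, current[1] + dc)
            if (0 <= nxt[0] < len(grid) and 0 <= nxt[1] < len(grid[0])
                    and grid[nxt[0]][nxt[1]] == 0):
                ng = g + 1                        # unit cost per grid step
                if ng < best_g.get(nxt, float("inf")):
                    best_g[nxt] = ng
                    came_from[nxt] = current
                    heapq.heappush(open_set, (ng + h(nxt), ng, nxt))
    return None                                   # goal unreachable

# Toy map and query.
grid = [[0, 0, 0, 0],
        [1, 1, 0, 1],
        [0, 0, 0, 0],
        [0, 1, 1, 0]]
print(astar(grid, start=(0, 0), goal=(3, 3)))
```

Sampling-based planners such as RRT replace the explicit grid with random samples in the configuration space, which scales better to high-dimensional problems.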
											
- Control:
  - PID Control Loops: The fundamental method for maintaining desired operational states through feedback (a minimal sketch follows this list).
  - Model Predictive Control (MPC): Uses optimization-based methods to generate action sequences, particularly in dynamic environments.
  - Imitation Learning: Learns control policies by mimicking expert demonstrations, applied in contexts such as urban driving and drone acrobatics.
  - Reinforcement Learning: Optimizes control policies through accumulated rewards, generating actions directly from sensory data without relying on a dynamics model.
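Below is a minimal PID sketch that closes the loop around a toy first-order plant; the gains, time step, and plant time constant are arbitrary illustration values rather than numbers from any particular robot.

```python
class PID:
    """Minimal PID controller: u = Kp*e + Ki*integral(e) + Kd*de/dt."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Toy usage: regulate a first-order plant (e.g. wheel speed) toward a 1.0 setpoint.
pid = PID(kp=2.0, ki=2.0, kd=0.1, dt=0.01)
speed = 0.0
for _ in range(500):
    u = pid.update(setpoint=1.0, measurement=speed)
    speed += (u - speed) * 0.01        # Euler step of a plant with a 1 s time constant
print(round(speed, 3))                  # settles near the 1.0 setpoint
```

MPC, by contrast, would optimize a whole sequence of future actions against a dynamics model at every control step instead of reacting to the instantaneous error.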
											
 
```mermaid
graph LR;
    A[Robotics]
    A --> B[Perception]
    A --> C[Planning]
    A --> D[Control]
    B --> B1[Scene Understanding]
    B1 --> B1a[Object Detection]
    B1 --> B1b[Semantic Segmentation]
    B1 --> B1c[Scene Reconstruction using 2D]
    B1 --> B1d[Scene Reconstruction using LiDAR/RGB-D]
    B --> B2[State Estimation and SLAM]
    B2 --> B2a[Pose Estimation]
    B2 --> B2b[Mapping with Vision-based Approaches]
    B2 --> B2c[Mapping with LiDAR]
    B2 --> B2d[Mapping with IMU]
    B --> B3[Learning-Based Methods]
    B3 --> B3a[Supervised Techniques for Pose-Tracking]
    B3 --> B3b[Supervised Techniques for Reconstruction]
    B3 --> B3c[Self-Supervised Techniques for Pose-Tracking]
    B3 --> B3d[Self-Supervised Techniques for Reconstruction]
    C --> C1[Search-Based Planning]
    C1 --> C1a[Heuristics for Pathfinding]
    C1 --> C1b[Graphs for Pathfinding in Discrete Spaces]
    C --> C2[Sampling-Based Planning]
    C2 --> C2a[Random Sampling in Configuration Spaces]
    C2 --> C2b[Path Connections]
    C2 --> C2c[Suitable for High-Dimensional Spaces]
    C2 --> C2d[Suitable for Continuous Spaces]
    C --> C3[Task Planning]
    C3 --> C3a[Object-Level Abstractions]
    C3 --> C3b[Symbolic Reasoning in Discrete Domains]
    C --> C4[Reinforcement Learning]
    C4 --> C4a[End-to-End Formulations for Visual Navigation]
    C4 --> C4b[End-to-End Formulations for Driving Behaviors]
    D --> D1[PID Control Loops]
    D1 --> D1a[Fundamental Method for Operational State Maintenance]
    D --> D2["Model Predictive Control (MPC)"]
    D2 --> D2a[Optimization-Based Action Sequence Generation]
    D2 --> D2b[Application in Dynamic Environments]
    D --> D3[Imitation Learning]
    D3 --> D3a[Mimicking Expert Demonstrations]
    D3 --> D3b[Application in Urban Driving]
    D3 --> D3c[Application in Drone Acrobatics]
    D --> D4[Reinforcement Learning]
    D4 --> D4a[Optimizes Control Policies through Accumulated Rewards]
    D4 --> D4b[Direct Action Generation from Sensory Data]
    D4 --> D4c[No Reliance on Dynamics Models]
```
LLM-Based General-Purpose Robotics
Robots are designed to perform a wide range of tasks in various environments. The tasks can be as simple as picking up an object and placing it in a different location, or as complex as conducting detailed inspection and maintenance work in an underground mine. The environments can be structured, like a factory floor, or unstructured, like a disaster site. General-purpose robots are expected to handle this full range of tasks and environments without reprogramming or reconfiguration.
LLM advantages:
- Pre-trained Foundation Models: Large-scale models trained on extensive and diverse datasets to acquire a broad knowledge base; they can generalize across a wide range of tasks and domains.
- Interpretation of Natural Language Instructions: Enables robots to understand and respond to human commands in natural language, improving user interaction and accessibility (see the instruction-to-plan sketch after this list).
- Multi-Modal Sensory Data Interpretation: Allows robots to interpret and respond to sensory data in natural language, improving the explainability of task planning and action generation.
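As a sketch of how natural-language instructions could be grounded in robot actions, the snippet below prompts an LLM to emit a JSON plan over a fixed set of skill primitives and validates the result before execution. The skill names, prompt format, and `call_llm` stub are illustrative assumptions, not the API of any particular model or robot stack.

```python
import json

# Hypothetical skill primitives the robot already knows how to execute.
SKILLS = {"move_to", "pick", "place", "inspect"}

PROMPT_TEMPLATE = """You control a mobile manipulator with these skills: {skills}.
Translate the user's instruction into a JSON list of steps, each of the form
{{"skill": "<name>", "target": "<object or location>"}}. Output only the JSON.

Instruction: {instruction}
"""

def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM call (any chat-completion API or local model)."""
    raise NotImplementedError

def instruction_to_plan(instruction: str) -> list:
    """Ask the LLM for a plan, then validate it against the known skill set."""
    prompt = PROMPT_TEMPLATE.format(skills=sorted(SKILLS), instruction=instruction)
    plan = json.loads(call_llm(prompt))
    for step in plan:
        if step.get("skill") not in SKILLS:
            raise ValueError(f"LLM proposed an unknown skill: {step.get('skill')}")
    return plan

# e.g. instruction_to_plan("Pick up the red box and place it on the shelf")
# might yield: [{"skill": "pick", "target": "red box"},
#               {"skill": "place", "target": "shelf"}]
```

Validating the LLM's output against the robot's actual capabilities is what keeps the natural-language interface from issuing unexecutable actions.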