- Reviewed existing RAG methods
    - embedding model
- Reviewed existing RLHF methods
    - reward model
- Reviewed existing fine-tuning methods
    - prompt/instruction tuning
    - parameter tuning
 
RAG

- Keep the LLM frozen; focus on RAG and RLHF
- RAG: help the LLM generate in-context responses
    - main questions:
        - how and what to retrieve:
            - similarity search for unstructured data
            - SQL for relational data
            - Cypher for graph data
        - when to retrieve:
            - retrieve after the scope is obtained
            - let the LLM decide
    - other questions:
        - how to preprocess
            - system prompt: set up roles, tasks, workflow, restrictions
        - how to prompt
        - how to pass context
        - how to post-process
            - JSON, reference list, etc.
        - how to verify
            - accuracy, relevance to the knowledge base
- RLHF: mitigate hallucinations
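
A rough sketch of the "how and what to retrieve" and "how to pass context" questions above, using only the Python standard library. Everything named here is an assumption for illustration: the toy two-dimensional embeddings, the `state_logs` table and its columns, the sensor names, and the prompt layout. The Cypher query is only built, not executed, since running it would need a graph database.

```python
import math
import sqlite3

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve_unstructured(query_vec, chunks, top_k=2):
    """Unstructured data: rank (text, embedding) pairs by similarity."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]

def retrieve_relational(conn, sensor):
    """Relational data: parameterised SQL over a hypothetical state_logs table."""
    rows = conn.execute(
        "SELECT step, value FROM state_logs WHERE sensor = ? ORDER BY step",
        (sensor,),
    ).fetchall()
    return [f"step {step}: {sensor}={value}" for step, value in rows]

def cypher_request(name):
    """Graph data: build (not run) a parameterised Cypher query."""
    query = "MATCH (n {name: $name})-[r]->(m) RETURN type(r), m.name LIMIT 10"
    return query, {"name": name}

def build_prompt(system_prompt, context, question):
    """Pass context as a numbered list so answers can cite entries."""
    numbered = "\n".join(f"[{i}] {c}" for i, c in enumerate(context, 1))
    return f"{system_prompt}\n\nContext:\n{numbered}\n\nQuestion: {question}"

# Toy knowledge base with made-up 2-d embeddings.
chunks = [
    ("Driving commands move the robot base.", [0.9, 0.1]),
    ("Camera commands capture images.", [0.1, 0.9]),
    ("Distance sensors report range in mm.", [0.5, 0.5]),
]

# Toy relational store for state logs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE state_logs (step INTEGER, sensor TEXT, value REAL)")
conn.executemany(
    "INSERT INTO state_logs VALUES (?, ?, ?)",
    [(1, "front", 420.0), (2, "front", 380.0), (1, "left", 150.0)],
)

context = retrieve_unstructured([1.0, 0.0], chunks) + retrieve_relational(conn, "front")
prompt = build_prompt(
    "You are a robot assistant. Answer only from the context.",
    context,
    "How far is the front obstacle?",
)
print(prompt)
```

Numbering the context entries is one possible way to support the reference-list post-processing idea: the model can be asked to cite entry numbers, which are then easy to check.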

RLHF

RL guiding LLM

The LLM is the agent that controls the robot.
- prompt tuning
- parameter tuning

LLM guiding RL

The RL policy is the agent that controls the robot.
- convert human feedback to a reward policy
- automatically generate a reward policy based on tasks/scenarios
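
One common way to turn human feedback into a reward signal is to fit a reward model on pairwise preferences (a Bradley-Terry style model). The sketch below is an illustration of that idea, not a method chosen in these notes. The trajectory features (final distance to goal, number of collisions), the preference pairs, and the learning rate are all made up.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def reward(weights, features):
    """Linear reward: weighted sum of trajectory features."""
    return sum(w * f for w, f in zip(weights, features))

def fit_reward(preferences, n_features, lr=0.1, epochs=200):
    """Fit weights so that preferred trajectories get the higher reward.

    Each preference is (preferred_features, rejected_features). The model
    assumes P(preferred beats rejected) = sigmoid(r(preferred) - r(rejected)).
    """
    weights = [0.0] * n_features
    for _ in range(epochs):
        for preferred, rejected in preferences:
            diff = [p - r for p, r in zip(preferred, rejected)]
            # Probability the current model assigns to the human's choice.
            p_agree = sigmoid(reward(weights, diff))
            # Gradient ascent step on log(p_agree).
            weights = [w + lr * (1.0 - p_agree) * d for w, d in zip(weights, diff)]
    return weights

# Hypothetical feedback: features are (distance to goal in m, collisions).
preferences = [
    ((0.2, 0), (1.5, 2)),
    ((0.5, 1), (0.6, 3)),
    ((0.1, 0), (0.1, 1)),
]
weights = fit_reward(preferences, n_features=2)
print(weights)
```

With this toy data both weights come out negative, so the learned reward penalises distance to goal and collisions. The RL agent could then be trained against `reward` instead of a hand-written reward.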

Framework Brainstorming

Robot + LLM

```mermaid
graph TD;
    R((Robot))-->|Prompt|L[LLM]
    L-->|Response|R
```

Robot + LLM + Retriever

```mermaid
graph TD;
    R((Robot))-->|Query|A[Retriever]
    R-->|Update|D([State Logs])
    A-->|Retrieve|C([Knowledge Base])
    A-->|Retrieve|D
    C-->E[Post-retriever]
    D-->E
    E-->|Prompt|L[LLM]
    L-->|Response|R
```
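
The Robot + LLM + Retriever flow could look roughly like the loop below. This is a minimal sketch under several assumptions: the retriever is a plain keyword match, `stub_llm` stands in for a real model call and always returns the same JSON, and the knowledge base entries, sensor readings, and the `action`/`references` response format are invented for illustration.

```python
import json

def retriever(query, knowledge_base, state_logs, max_logs=3):
    """Keyword match over the knowledge base plus the most recent state logs."""
    words = set(query.lower().split())
    docs = [d for d in knowledge_base if words & set(d.lower().split())]
    return docs, state_logs[-max_logs:]

def post_retriever(docs, logs, query):
    """Merge both sources into one numbered context and build the prompt."""
    context = docs + logs
    numbered = "\n".join(f"[{i}] {c}" for i, c in enumerate(context, 1))
    prompt = (
        f"Context:\n{numbered}\n\nTask: {query}\n"
        "Answer as JSON with keys 'action' and 'references'."
    )
    return prompt, context

def stub_llm(prompt):
    """Placeholder for a real model call; always returns the same JSON."""
    return json.dumps({"action": "turn_left", "references": [1, 2]})

def parse_response(raw, context):
    """Post-process: parse the JSON and keep only references that exist."""
    data = json.loads(raw)
    valid = [r for r in data.get("references", []) if 1 <= r <= len(context)]
    return data["action"], valid

knowledge_base = [
    "turn left when the front obstacle is near",
    "camera returns rgb images",
]
state_logs = []

for step, reading in enumerate([900, 400, 150]):
    # Robot -> State Logs (Update)
    state_logs.append(f"step {step}: front distance {reading} mm")
    # Robot -> Retriever -> Knowledge Base / State Logs
    query = "front obstacle"
    docs, logs = retriever(query, knowledge_base, state_logs)
    # Post-retriever -> LLM -> Robot
    prompt, context = post_retriever(docs, logs, query)
    action, refs = parse_response(stub_llm(prompt), context)
    print(step, action, refs)
```

Dropping references that do not point at a context entry is a very cheap form of the "how to verify" check; a real check of accuracy or relevance would need more than this.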
    

Robot + LLM + Retriever + RLHF

```mermaid
graph TD;
    R((Robot))-->|Query|A[Retriever]
    R-->|Update|D([State Logs])
    A-->|Retrieve|C([Knowledge Base])
    A-->|Retrieve|D
    C-->E[Post-retriever]
    D-->E
    E-->|Prompt|L[LLM]
    L-->|Response|RLHF
    RLHF-->|Response|R
    R-->|Reward|RLHF
    RLHF-->|Reward|L
```
    

Use Case Brainstorming

- Eyesim
    - Assignments
    - API
    - State Logs: data collected during the simulation
- Vehicle
    - Document:
    - State Logs:
- Arm
    - Document:
    - State Logs:
- Humanoid
    - Document:
    - State Logs: