[UCLA RL-LLM] Chapter 0: Course outline and prologue

[UCLA RL-LLM] Chapter 2.1: NLP foundations, language modeling, RNNs

[UCLA RL-LLM] Chapter 1.1: MDP foundations, imitation learning, and value iteration

[UCLA RL-LLM] Chapter 1.2: Deep policy evaluation

[UCLA RL-LLM] Chapter 1.4: Deep policy gradient methods (PPO, GRPO)

[UCLA RL-LLM] Chapter 3.1: Reinforcement learning from human feedback (PPO, DPO)

[UCLA RL-LLM] Chapter 1.3: Deep policy gradient methods (A3C)

[UCLA RL-LLM] Chapter 1.5: AlphaGo, test-time compute, and expert iteration

[Guest Lecture at UCLA RL Course, Spring 2025] Inverse Reinforcement Learning Meets LLM Alignment

Recording of the guest lecture for [

[UCLA RL-LLM] Chapter 3.2: Reinforcement learning with verifiable rewards (RLVR)

[UCLA RL-LLM] Chapter 2.3: Transformers II (modern transformers updates and sampling methods)

Stanford CS229 I Machine Learning I Building Large Language Models (LLMs)

For more information about Stanford's Artificial Intelligence programs visit: https://stanford.io/ai This lecture provides a concise ...

[UCLA RL-LLM] Chapter 2.4: In-context learning and instruction fine-tuning
