LLM fine-tuning techniques I'd learn if I wanted to customize LLMs. Bookmark this.

1. LoRA
2. QLoRA
3. Prefix Tuning
4. Adapter Tuning
5. Instruction Tuning
6. P-Tuning
7. BitFit
8. Soft Prompts
9. RLHF
10. RLAIF (RL with AI Feedback)
11. DPO (Direct Preference Optimization)
12. GRPO (Group Relative Policy Optimization)
13. Multi-Task Fine-Tuning
14. Federated Fine-Tuning

My favourite is GRPO for building reasoning models. What about you?

I've shared my full tutorial on GRPO in the replies. Minimal sketches of LoRA and GRPO are below.
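
To make the first item concrete, here is a minimal LoRA sketch using Hugging Face's peft library. The library choice, base model, target modules, and hyperparameters are my own illustrative assumptions, not a prescribed setup.

```python
# Minimal LoRA sketch with Hugging Face peft (pip install transformers peft).
# Model name, target modules, and hyperparameters are illustrative choices.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("gpt2")

lora_config = LoraConfig(
    r=8,                        # rank of the low-rank update matrices
    lora_alpha=16,              # scaling applied to the LoRA update
    target_modules=["c_attn"],  # GPT-2's fused attention projection
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the small adapter matrices train
# From here, train `model` with your usual Trainer or training loop;
# the frozen base weights stay untouched.
```

And since GRPO is the favourite here, a tiny sketch of what a GRPO run can look like with the trl library, assuming a recent trl release that ships GRPOTrainer. The model, dataset, and toy length-based reward are placeholders for illustration, not the recipe from the full tutorial.

```python
# Minimal GRPO sketch with trl (pip install trl datasets).
# Assumes a recent trl version that includes GRPOTrainer; names are placeholders.
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

dataset = load_dataset("trl-lib/tldr", split="train")  # any dataset with a "prompt" column

def reward_len(completions, **kwargs):
    # Toy reward: prefer completions close to 20 characters.
    # For reasoning models you would instead score answer correctness.
    return [-abs(20 - len(c)) for c in completions]

training_args = GRPOConfig(output_dir="Qwen2-0.5B-GRPO", logging_steps=10)
trainer = GRPOTrainer(
    model="Qwen/Qwen2-0.5B-Instruct",
    reward_funcs=reward_len,
    args=training_args,
    train_dataset=dataset,
)
trainer.train()  # samples groups of completions per prompt and optimizes relative advantage
```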