LLM fine-tuning techniques I'd learn if I wanted to customize models:
Bookmark this.
1. LoRA
2. QLoRA
3. Prefix Tuning
4. Adapter Tuning
5. Instruction Tuning
6. P-Tuning
7. BitFit
8. Soft Prompts
9. RLHF
10. RLAIF (RL with AI Feedback)
11. DPO (Direct Preference Optimization)
12. GRPO (Group Relative Policy Optimization)
13. Multi-Task Fine-Tuning
14. Federated Fine-Tuning
My favourite is GRPO for building reasoning models. What about you?
I've shared my full tutorial on GRPO in the replies.
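
For a quick feel of the "group relative" part of GRPO, here is a minimal sketch of the advantage step only, not a full trainer. It assumes you sample several completions for the same prompt, score each with some reward, and normalize the rewards within that group. The function name, the `eps` value, and the use of population standard deviation are my assumptions for illustration; real implementations may differ on these details.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of completions for the same prompt.

    Each advantage is (reward - group mean) / (group std + eps), so a
    completion is judged relative to its siblings rather than by a learned
    value function. `eps` (an assumed constant) avoids division by zero
    when every completion in the group gets the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled answers to one prompt
# (e.g. 1.0 if the final answer was correct, 0.0 otherwise).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)

# A group where every answer scored the same carries no learning signal.
flat = group_relative_advantages([1.0, 1.0, 1.0])
```

Completions that beat their group average get a positive advantage and the rest get a negative one. That is why no separate critic model is needed for this step.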