Apologies if this seems like a survey post. I’m just learning about tuning and want to get a lay of the land. I don’t think I have the money to tune locally so might have to rent some VRAM, but curious how much better tuning is vs something like RAG.

What model? What was your use case? What tuning tool did you use? What is your hardware setup? How large was your training set, and how did you create it? How effective was the model at tasks pre- and post-tuning?

Thanks!

  • pyr0ball@lemmy.world
    5 days ago

    Yeah, I've done two separate things in this space.

    Cover letter fine-tuning: Llama-3.2-3B-Instruct as the base, QLoRA via Unsloth (rank 16, 10 epochs). Trained on ~62 of my own cover letters, exported to GGUF, loaded into Ollama. Fits comfortably on 8GB VRAM with 4-bit quantisation. Noticeably more consistent than a prompted generic model at matching my voice and style.
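    A rough sketch of what the data-prep step for that kind of run might look like: packaging each letter as an instruction/response pair in JSONL, which is the shape most QLoRA tooling (Unsloth included) can consume. The field names and prompt wording here are my own illustration, not the poster's actual pipeline:

    ```python
    import json

    def build_record(job_title: str, company: str, letter_text: str) -> dict:
        """One training example: a synthetic prompt plus the real letter as the target.
        (Field names 'instruction'/'output' are an assumption; match your trainer's schema.)"""
        return {
            "instruction": f"Write a cover letter for the {job_title} role at {company}.",
            "output": letter_text,
        }

    def write_jsonl(records, path):
        """Write one JSON object per line, the usual fine-tuning dataset format."""
        with open(path, "w", encoding="utf-8") as f:
            for r in records:
                f.write(json.dumps(r, ensure_ascii=False) + "\n")
    ```

    With only ~62 examples, keeping the prompt template identical across every record matters a lot; the model learns the mapping from that fixed framing to your voice.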

    Email classification: completely different story. Classifier models for routing emails into categories (rejection, interview scheduled, offer, etc.) don’t need a GPU at all. DeBERTa-small runs on CPU in milliseconds. The hard part is the labeling pipeline. We bootstrapped with deterministic heuristics to auto-label high-confidence cases, then routed uncertain ones to a human review queue. Around 2,000 labeled examples was enough for meaningful accuracy.

    vs RAG: for classification, fine-tuning wins cleanly. RAG is better when you need to reason over retrieved documents. If you’re making a consistent categorical judgment, you want it baked into the weights, not reconstructed from context at inference time.


    I build local-first process pipeline tooling at circuitforge.tech

    • venusaur@lemmy.worldOP
      link
      fedilink
      English
      arrow-up
      1
      ·
      4 days ago

      Oh that’s really interesting! I’m also interested in the classification case. Can you tell me more, or direct me to where to learn more about DeBERTa? Do you train it the same way, with prompt-and-response sets? Does it work on any open-source model? I can only run up to 4B right now.