Great series of posts. I went down a similar path recently for a slightly different use case. I did not use axolotl, though; I was worried that its abstractions would keep me from understanding some of the details. It's great to see documentation on how others tackle similar problems. I documented the process I went through here: https://atredis.com/blog/2024/6/3/how-to-train-your-large-la...
There's a ton of abstraction in axolotl, for sure, but so far I haven't found that it gets in the way. The main competitor in that space seems to be Unsloth, but that only works on single-GPU machines, so it didn't fit my purposes. I'll dive into your blog post. Thanks for posting!
I used Unsloth; I was only using a single GPU for testing. Looking forward to the follow-up posts.
For tasks like data extraction, are people doing full finetunes or training a LoRA? Is it any different for classification?
LoRA is perfect for that. Where it fails is when you want to teach a new domain.
Unless you have GPUs available, LoRA is the only accessible option. That being said, if your task is simple enough, you can skip the problem entirely and just pick a small base model.
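To put rough numbers on why LoRA is the accessible option, here is a back-of-the-envelope sketch in plain Python (no library involved). It compares the trainable parameters of one square weight matrix under a full finetune with the two low-rank factors LoRA trains instead. The hidden size of 4096 and rank of 8 are illustrative assumptions, not values from anyone's setup in this thread.

```python
def full_params(d_in: int, d_out: int) -> int:
    """Trainable parameters when the whole d_out x d_in weight matrix is updated."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters in the LoRA factors A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


# Hypothetical sizes, chosen only for illustration.
hidden = 4096
rank = 8

full = full_params(hidden, hidden)
lora = lora_params(hidden, hidden, rank)
ratio = lora / full

print(f"full finetune: {full:,} trainable parameters per matrix")
print(f"LoRA (rank {rank}): {lora:,} trainable parameters per matrix")
print(f"LoRA trains {ratio:.4%} of the full count")
```

With these assumed sizes, LoRA trains 1/256 of the parameters in that matrix, which is where most of the savings in gradient and optimizer memory come from. The frozen base weights still have to fit in memory, so this is only part of the picture.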
When you get good enough at filtering the dataset for training, do you still need an AI, or do you understand the problem domain and can use a deterministic system?