Article Details

Fine-tune a Mistral-7b model with Direct Preference Optimization | by Maxime Labonne

Retrieved on: 2024-01-01 18:38:57

Excerpt

Pre-trained Large Language Models (LLMs) only perform next-token prediction, which makes them unable to answer questions or follow instructions without further fine-tuning.
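To make the excerpt's point concrete, here is a minimal sketch (not taken from the article) of plain next-token prediction with a base, non-instruct checkpoint using the `transformers` library. The model name and prompt are illustrative assumptions; a base model typically continues the text rather than answering the question directly.

```python
# Minimal sketch of next-token prediction with a pre-trained base model.
# Model name and prompt are illustrative assumptions, not from the article.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "mistralai/Mistral-7B-v0.1"  # assumed base (non-instruct) checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16, device_map="auto"
)

prompt = "What is the capital of France?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# The base model simply appends the most likely next tokens; since it has not been
# fine-tuned on instruction/answer pairs, the continuation is often not a direct answer.
output = model.generate(**inputs, max_new_tokens=40, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Techniques such as supervised fine-tuning and Direct Preference Optimization (the subject of the article) are what turn such a base model into an assistant that answers questions.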

Article found on: towardsdatascience.com

View Original Article: https://towardsdatascience.com/fine-tune-a-mistral-7b-model-with-direct-preference-optimization-708042745aac
