Article Details
Retrieved on: 2024-06-29 16:15:54
Excerpt
Kullback-Leibler divergence has been widely used in Knowledge Distillation (KD) to compress Large Language Models (LLMs).
Article found on: ui.adsabs.harvard.edu
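For context on the excerpt's topic, below is a minimal, generic sketch of a KL-divergence distillation loss in PyTorch. It is not the method from the article itself; the temperature value, the forward direction KL(teacher || student), and the T^2 gradient scaling are common conventions assumed here for illustration.

# Generic sketch of a KL-based knowledge-distillation loss (illustrative only).
import torch
import torch.nn.functional as F

def kd_kl_loss(student_logits: torch.Tensor,
               teacher_logits: torch.Tensor,
               temperature: float = 2.0) -> torch.Tensor:
    """Forward KL divergence KL(teacher || student) on temperature-softened logits."""
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    # F.kl_div expects log-probabilities as input and probabilities as target.
    loss = F.kl_div(student_log_probs, teacher_probs, reduction="batchmean")
    # Scale by T^2 so gradient magnitudes stay comparable across temperatures.
    return loss * temperature ** 2

# Example usage with random logits over a vocabulary of 32 tokens.
if __name__ == "__main__":
    student = torch.randn(4, 32)
    teacher = torch.randn(4, 32)
    print(kd_kl_loss(student, teacher).item())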