NLP Assignment
This site accompanies our course project for CS 613: Natural Language Processing at IIT Gandhinagar.
Our project is titled "Beyond Naive Merging: Enhancing LLM Compression via Alpha Optimization, Task-Specific Similarity, and Neural Alignment".
Project overview
Large Language Models (LLMs) are powerful but expensive to deploy. We build on the "Pruning via Merging" (MKA) method and study how to make layer-merging-based compression more principled and reliable. Concretely, we:
- Treat the merge weight \(\alpha\) as a trainable parameter and optimize it with data-driven methods rather than a fixed similarity heuristic (see the first sketch after this list).
- Show that layer similarity is task- and language-dependent by analyzing similarity heatmaps across MMLU domains and Global MMLU languages.
- Propose an "align-then-merge" pipeline that first aligns neurons and then merges layers, improving stability at higher compression ratios (see the second sketch after this list).
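To make the first point concrete, here is a minimal sketch of the data-driven \(\alpha\) optimization idea: two adjacent layers are interpolated as \(\alpha W_i + (1-\alpha) W_{i+1}\) and \(\alpha\) is fit on a small calibration set. This is an illustration under our own simplifications, not the project's exact training code; `forward_fn` is a hypothetical helper that runs a layer from a weight dict, and `calib_batches` is assumed to be a small list of (input, target) pairs.

```python
import torch

def merge_layers(layer_a: dict, layer_b: dict, alpha: torch.Tensor) -> dict:
    """Interpolate matching weight tensors of two layers with weight alpha."""
    return {name: alpha * layer_a[name] + (1 - alpha) * layer_b[name]
            for name in layer_a}

def fit_alpha(layer_a, layer_b, forward_fn, calib_batches, steps=100, lr=1e-2):
    """Learn alpha by minimizing the merged layer's output error on calibration data.

    forward_fn(weights, x) runs the layer with the given weight dict
    (hypothetical helper); calib_batches is a list of (x, target) pairs,
    e.g. with target being the original two-layer output to approximate.
    """
    # Parameterize alpha through a sigmoid so it stays in (0, 1).
    logit = torch.zeros(1, requires_grad=True)
    opt = torch.optim.Adam([logit], lr=lr)
    for _ in range(steps):
        for x, target in calib_batches:
            alpha = torch.sigmoid(logit)
            merged = merge_layers(layer_a, layer_b, alpha)
            loss = torch.nn.functional.mse_loss(forward_fn(merged, x), target)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return torch.sigmoid(logit).item()
```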
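The align-then-merge pipeline can be sketched in the same spirit. Our project aligns neurons via manifold alignment (as the repository name indicates); the sketch below substitutes a simpler activation-correlation matching solved with the Hungarian algorithm, purely to illustrate the align-first, merge-second ordering.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def align_then_merge(acts_a, acts_b, W_a, W_b, alpha=0.5):
    """Permute layer B's neurons to match layer A, then interpolate weights.

    acts_a, acts_b: (num_tokens, hidden) activations on a calibration set.
    W_a, W_b: (hidden, hidden_out) weight matrices of the two layers.
    """
    # Correlation between every pair of neurons across the calibration tokens.
    a = (acts_a - acts_a.mean(0)) / (acts_a.std(0) + 1e-8)
    b = (acts_b - acts_b.mean(0)) / (acts_b.std(0) + 1e-8)
    corr = a.T @ b / len(a)                   # (hidden, hidden)
    # Hungarian matching: permutation of B's neurons maximizing total correlation.
    row, col = linear_sum_assignment(-corr)
    W_b_aligned = W_b[col]                    # reorder B's rows to match A's neurons
    # Merge only after alignment, so corresponding neurons are interpolated.
    return alpha * W_a + (1 - alpha) * W_b_aligned
```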
The main experimental results and plots are available via the navigation links:
- Neural Alignment – prompt browser and alignment-based comparisons.
- Similarity – similarity visualizations and correlation tables.
- Alpha Values – analysis of learned \(\alpha\) values and their effect on accuracy under compression.
Our code is available at: https://github.com/Jain-Laksh/Layer-Merging-via-Manifold-Alignment.
Acknowledgements
- Course: CS 613 – Natural Language Processing, IIT Gandhinagar
- Instructor: Prof. Mayank Singh
- Teaching Assistant: Sailesh Panda
- Team: "Dropout Squad" (third year B.Tech, IIT Gandhinagar)
- Aditya Borate
- Aryan Solanki
- Laksh Jain
- Nishchay Bhutoria
- Parthiv Patel
- Rudra Pratap Singh
- Soham Gaonkar