Similarity

This page summarizes how layer-redundancy patterns behave across tasks and languages, based on the correlation matrices reported below.

1. Redundancy patterns depend on the task domain

The task-level analysis shows that redundancy patterns are strongly structured by the underlying task.

  • Across MMLU tasks, redundant layers are concentrated toward the final layers (approximately layers 25 to 31), which supports the idea that merging should primarily target later layers.
  • Redundancy patterns are not uniform across tasks, indicating clear task dependence.
  • Tasks that are conceptually similar tend to have more similar redundancy patterns (for example, Math and Computer Science, or Legal and Humanities).
  • Task-level similarity is generally high; the Math and Computer Science pair is the strongest, with a correlation of 0.951.
            medical    legal      math       cs         humanities
medical     1.000000   0.887484   0.966250   0.940329   0.931864
legal       0.887484   1.000000   0.884862   0.899014   0.862810
math        0.966250   0.884862   1.000000   0.951172   0.903280
cs          0.940329   0.899014   0.951172   1.000000   0.885816
humanities  0.931864   0.862810   0.903280   0.885816   1.000000
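The entries in the matrix above are pairwise Pearson correlations between per-layer redundancy profiles. A minimal sketch of that computation, using hypothetical placeholder profiles (the variable names and synthetic data are illustrative, not the page's actual measurements):

```python
import numpy as np

# Hypothetical per-layer redundancy scores for two tasks over 32 layers.
# Real profiles would come from the layer-redundancy analysis; here we
# synthesize two noisy versions of a shared trend (redundancy rising
# toward the later layers) purely for illustration.
rng = np.random.default_rng(0)
base = np.linspace(0.1, 0.9, 32)
math_profile = base + rng.normal(0.0, 0.05, 32)
cs_profile = base + rng.normal(0.0, 0.05, 32)

# Pearson correlation between the two profiles, as tabulated above.
r = np.corrcoef(math_profile, cs_profile)[0, 1]
print(round(r, 3))
```

Tasks that share a redundancy trend (like the two synthetic profiles here) yield correlations close to 1, which is the pattern the Math/Computer Science pair shows in the table.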

MMLU heatmaps (tasks)

2. Language has an even stronger effect than task

The language-level analysis shows that varying the language, even with the task held fixed, has a larger impact on redundancy than varying the task.

  • When the task is held fixed (medical) but the language changes, layer similarity varies much more strongly than it does across tasks.
  • Redundancy patterns are less consistent across languages.
  • Cross language correlations are lower (for example, Spanish and Chinese have a correlation of 0.730), whereas correlations across tasks stay above 0.86.
  • These trends suggest that language has a stronger influence on layer redundancy than the task domain.
              Spanish (es)  Chinese (zh)  French (fr)
Spanish (es)  1.000000      0.729526      0.836475
Chinese (zh)  0.729526      1.000000      0.775130
French (fr)   0.836475      0.775130      1.000000
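The claim that language matters more than task can be checked directly from the two matrices: every off-diagonal cross-task correlation exceeds every cross-language one. A quick sketch using the values reported above:

```python
# Off-diagonal correlations copied from the two matrices above.
task_corrs = [0.887484, 0.966250, 0.940329, 0.931864,
              0.884862, 0.899014, 0.862810,
              0.951172, 0.903280, 0.885816]
lang_corrs = [0.729526, 0.836475, 0.775130]

# The lowest cross-task correlation (legal/humanities, 0.863) still
# exceeds the highest cross-language correlation (es/fr, 0.836),
# consistent with language influencing redundancy more than task.
print(min(task_corrs) > max(lang_corrs))
```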

GMMLU heatmaps (languages)