Your Unembedding Matrix Is Secretly a Feature Lens for Text Embeddings

EmbedFilter improves LLM-based text embeddings by filtering out the "edge spectrum" subspace that encodes uninformative high-frequency tokens.

Why do LLM-based text embeddings often perform poorly on zero-shot tasks, and how can we filter the unembedding matrix to improve them?

Large language models (LLMs) often perform poorly as zero-shot embedding models because their internal representations are biased toward frequent, semantically irrelevant tokens. This "representation collapse" forces embeddings into a narrow, anisotropic region of the vector space, masking nuanced semantic features. The authors identify that the model's unembedding matrix acts as a "feature lens" that actively writes these high-frequency tokens into the embedding space. They introduce EmbedFilter, a linear transformation that filters out this specific "edge spectrum" subspace to unmask the underlying semantic representations. This post-processing step consistently improves zero-shot performance across multiple LLM backbones, achieving up to a 14.1% gain on the Massive Text Embedding Benchmark (MTEB) while simultaneously enabling efficient dimensionality reduction.

Paper Primer

EmbedFilter works by isolating the "edge spectrum"—the singular vectors associated with the largest and smallest singular values of the unembedding matrix—and removing them from the embedding. This is like a noise-canceling filter that identifies the specific frequency band where "average" token bias lives and mutes it, leaving only the semantic bulk of the signal.

EmbedFilter significantly boosts zero-shot embedding performance without training.

Evaluations on the MTEB benchmark across Qwen, Llama, and Mistral backbones.

Why does the unembedding matrix cause this bias in the first place?

The unembedding matrix is designed to map hidden states to vocabulary probabilities; the authors show it encodes a latent subspace corresponding to an "average" token—a frequency-weighted representation of the training corpus—which pulls raw embeddings toward common, uninformative tokens.

How does this differ from standard dimensionality reduction or whitening techniques?

Unlike whitening, which typically requires a calibration dataset to compute statistics, EmbedFilter is a training-free, heuristic post-processing step that leverages the model's existing unembedding matrix parameters to identify and remove the biased subspace.

Researchers can now treat the LLM unembedding matrix as a diagnostic tool to refine embeddings, enabling high-performance, low-dimensional retrieval without the need for additional fine-tuning or complex prompt engineering.

Paper Primer

The abstract identifies noisy high‑frequency token bias and proposes EmbedFilter to clean LLM embeddings.

Large language models excel at zero‑shot tasks but underperform as off‑the‑shelf embedding generators because their vectors over‑represent high‑frequency tokens. By recognizing that the unembedding matrix actively injects these noisy components, the authors propose EmbedFilter—a simple linear projection that excises the edge spectrum subspace, thereby sharpening semantic features, shrinking dimensionality, and accelerating retrieval without sacrificing quality.

https://github.com/CentreChen/EmbFilter

Introduction

LLMs embed noisy high‑frequency components, and a simple filter can recover cleaner semantics.

Large language models excel on many benchmarks, yet when their hidden states are used directly as text embeddings they fall short of the quality needed for downstream retrieval and similarity tasks. Prior attempts such as prompt engineering only provide fragile, modest improvements, leaving a systematic gap between LLM capabilities and embedding performance.

Our analysis, powered by the Logit Lens, reveals that raw embeddings are biased toward high‑frequency tokens that carry little semantic content. This “representation collapse” is consistent across model families, suggesting a universal hidden subspace—named the Edge Spectrum Subspace—that steers vectors toward an average, frequency‑weighted token. By linearly projecting out this subspace (the EmbedFilter), we can recover more faithful semantic features with negligible overhead.

LLM embeddings contain a hidden component that pulls them toward common, semantically weak tokens, obscuring the true meaning of the input.

Mechanistic Interpretability Tools

Background: key tools for probing embeddings and their semantics.

We first review how large language models produce text embeddings and the interpretability lenses we use to inspect them.

The unembedding matrix is the final linear layer that maps hidden states to vocabulary logits, i.e., it decides which token each hidden vector would predict.

The Logit Lens projects any intermediate hidden representation directly into the vocabulary space, showing which tokens that layer would predict if it were the final layer.
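A minimal sketch of the Logit Lens idea, assuming a toy vocabulary and a hypothetical `(vocab_size, d)` unembedding matrix. Real models typically apply a final normalization layer before the unembedding step; that is omitted here, and all names and numbers are illustrative.

```python
import numpy as np

def logit_lens(hidden, unembed, vocab, top_k=3):
    """Decode a hidden vector through the unembedding matrix.

    hidden:  (d,) hidden state or text embedding
    unembed: (vocab_size, d) unembedding matrix, one row per token
    """
    logits = unembed @ hidden                  # (vocab_size,)
    logits = logits - logits.max()             # stabilise the softmax
    probs = np.exp(logits) / np.exp(logits).sum()
    top = np.argsort(-probs)[:top_k]
    return [(vocab[i], float(probs[i])) for i in top]

# Toy data (hypothetical): 4 tokens, 3-dimensional hidden states.
vocab = ["the", "of", "lens", "logit"]
unembed = np.array([[2.0, 0.0, 0.0],
                    [1.5, 0.0, 0.0],
                    [0.0, 1.0, 0.0],
                    [0.0, 0.0, 1.0]])
hidden = np.array([1.0, 0.2, 0.1])

top_tokens = logit_lens(hidden, unembed, vocab)
```

In this toy setup the frequent-looking tokens "the" and "of" dominate the decoded distribution because the hidden vector has most of its mass along the direction they share.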

**Figure 1.** Logit Lens applied to text embeddings from three LLM backbones. Word clouds show the top-aligned tokens with the highest decoding probabilities, which are primarily high-frequency yet semantically uninformative. The input text, encoded by the text embeddings, is given as: "We call this a ‘lens’ because it is one way of extracting information from GPT’s internal activations. I imagine there is other information present in the activations that cannot be understood by looking at logits over tokens. The logit lens show us some of what is going on, not all of it." This corresponds to the official notation of the logit lens.

Discovery of the Edge Spectrum

We expose the edge‑spectrum subspace that drives high‑frequency token bias in LLM embeddings.

LLM embeddings are highly anisotropic, clustering in a narrow subspace, and they tend to align with frequent, low‑semantic tokens. This suggests that the dominant subspace encodes the edge‑spectrum of high‑frequency tokens, motivating its isolation.
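One common proxy for anisotropy is the mean pairwise cosine similarity of a batch of embeddings. The sketch below uses that proxy on synthetic data; it is an illustrative measure, not necessarily the one used in the paper, and the shared offset is a hypothetical stand-in for a dominant common direction.

```python
import numpy as np

def mean_pairwise_cosine(x):
    """Average cosine similarity over all distinct pairs of rows."""
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    sims = x @ x.T
    n = len(x)
    return (sims.sum() - np.trace(sims)) / (n * (n - 1))

rng = np.random.default_rng(0)
isotropic = rng.normal(size=(200, 32))
collapsed = isotropic + 5.0   # shared offset mimics a dominant common direction

iso_score = mean_pairwise_cosine(isotropic)   # close to 0
col_score = mean_pairwise_cosine(collapsed)   # close to 1
```

A score near 1 means most vectors point the same way, which is the narrow-cone behavior described above.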

We evaluate models from Qwen‑2.5 (0.5 B) to Mistral‑v0.3‑Instruct (7 B) and Llama‑3.1‑Instruct (8 B), using the RedPajama corpus to approximate the true word‑frequency distribution $p$ with $\hat{p}$.

With $U$ the unembedding matrix and $W$ the projection, the logits for a token satisfy $\mathbf{h}W^{\top}U = \log(\mathbf{q}) + b$, where $\mathbf{q}$ is the predicted token distribution and $b$ is a constant offset. Using the Moore–Penrose pseudo-inverse $W^{+}$, we solve for the hidden state that would produce the empirical frequency $\hat{p}$, yielding the average token embedding $\hat{\mathbf{h}} = \log(\hat{p})\,W^{+}U$.
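The pseudo-inverse step can be sketched numerically. The example below is a simplification under stated assumptions: it collapses the map from hidden state to logits into a single hypothetical matrix `E` of shape `(vocab_size, d)` with `logits = h @ E.T`, ignores the constant offset, and uses random numbers in place of real weights and corpus counts.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, d = 50, 8

E = rng.normal(size=(vocab_size, d))             # toy unembedding matrix
counts = rng.integers(1, 1000, size=vocab_size)  # toy token counts
p_hat = counts / counts.sum()                    # toy unigram frequencies

# Least-squares solution of  h @ E.T ≈ log(p_hat)  via the pseudo-inverse.
h_avg = np.log(p_hat) @ np.linalg.pinv(E.T)      # shape (d,)

# The same vector from an explicit least-squares solve, as a cross-check.
h_lstsq, *_ = np.linalg.lstsq(E, np.log(p_hat), rcond=None)
```

Because the vocabulary is much larger than the hidden size, the system is overdetermined and the pseudo-inverse returns the hidden state whose logits best match the log-frequencies in the least-squares sense.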

The edge spectrum is a thin band of singular directions at the extremes of the $WU$ spectrum that disproportionately amplifies the logits of frequent tokens while contributing little to semantic variation.

As a toy illustration, take $\mathbf{h} = [1,\,2,\,3,\,4]$ and suppose the edge spectrum is spanned by the standard basis vectors $e_3$ and $e_4$. Projection onto the edge spectrum: $\mathbf{p}= (\mathbf{h}\!\cdot\!e_3)\,e_3 + (\mathbf{h}\!\cdot\!e_4)\,e_4 = [0,\,0,\,3,\,4]$.

Removing the edge component: $\mathbf{h}^{\text{filtered}} = \mathbf{h} - \mathbf{p} = [1,\,2,\,0,\,0]$.

Suppose a frequent token's unembedding direction is $e_3 + e_4$, so that its logit is the dot product of that direction with the embedding. The original logit is $3+4=7$; after filtering it drops to $0$, illustrating the edge spectrum's boost.

Eliminating the edge directions removes the disproportionate boost given to high-frequency tokens while preserving the core semantic components (the first two dimensions).
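The toy numbers above can be checked in a few lines. The sketch assumes $\mathbf{h} = [1, 2, 3, 4]$, standard basis vectors for $e_3$ and $e_4$, and a frequent token whose direction is $e_3 + e_4$; these are illustrative choices, not values from a real model.

```python
import numpy as np

h = np.array([1.0, 2.0, 3.0, 4.0])
e3 = np.array([0.0, 0.0, 1.0, 0.0])
e4 = np.array([0.0, 0.0, 0.0, 1.0])

# Component of h that lies in the (assumed) edge spectrum.
p = (h @ e3) * e3 + (h @ e4) * e4
h_filtered = h - p

# Assumed unembedding direction of a frequent token.
u_frequent = e3 + e4
logit_before = h @ u_frequent
logit_after = h_filtered @ u_frequent
```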

**Figure 2.** $\Delta\pi$ distribution for Qwen, Llama and Mistral.

The EmbedFilter Method

EmbedFilter linearly removes the edge spectrum to sharpen text embeddings.

After identifying the edge spectrum subspace, we now describe how EmbedFilter excises it with a single linear step.

EmbedFilter projects each embedding onto the middle singular vectors of the unembedding matrix, discarding the extreme directions that carry noisy, high‑frequency token information.

How does EmbedFilter differ from a standard low‑rank PCA projection that also drops extreme components?

PCA would keep the top singular vectors, whereas EmbedFilter deliberately discards them and retains the middle spectrum; this flips the usual variance‑preserving intuition to suppress high‑frequency token influence instead of preserving it.
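A minimal sketch of the filtering step, assuming the bulk is a contiguous window of right singular vectors of a `(vocab_size, d)` unembedding matrix. The centred default window and the function names are hypothetical; the source does not specify here how $l_{\tau}$ and $r_{\tau}$ are chosen, so `left` is exposed as a parameter.

```python
import numpy as np

def embed_filter_basis(unembed, tau, left=None):
    """Return the (d, d // tau) bulk basis V[l:r] of the unembedding matrix.

    unembed: (vocab_size, d) matrix; its right singular vectors span R^d.
    tau:     reduction factor; d // tau directions are kept.
    left:    first kept index l; defaults to a centred window (assumption).
    """
    d = unembed.shape[1]
    keep = d // tau
    # Rows of vt are right singular vectors, sorted by descending singular value.
    _, _, vt = np.linalg.svd(unembed, full_matrices=False)
    if left is None:
        left = (d - keep) // 2
    right = left + keep
    return vt[left:right].T

def embed_filter(embeddings, basis):
    """Project (n, d) embeddings onto the bulk basis, giving (n, d // tau)."""
    return embeddings @ basis

rng = np.random.default_rng(0)
unembed = rng.normal(size=(200, 16))   # toy stand-in for a real unembedding matrix
embeddings = rng.normal(size=(5, 16))

basis = embed_filter_basis(unembed, tau=2)
filtered = embed_filter(embeddings, basis)
```

The basis depends only on the model weights, so it can be computed once and reused for every embedding; no calibration data is involved.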

As a toy example, let $V = I_{6}$, keep indices 2 through 4, and take the embedding $e = (1,\,2,\,3,\,4,\,5,\,6)$. Extract the bulk matrix $V[l_{\tau}:r_{\tau}] = \begin{bmatrix}0&0&0\\1&0&0\\0&1&0\\0&0&1\\0&0&0\\0&0&0\end{bmatrix}$ (columns 2-4 of $I_{6}$).

Form the projector $\Phi_{\tau}=V[l_{\tau}:r_{\tau}]\,V[l_{\tau}:r_{\tau}]^{\top}$, which zeroes out dimensions 1, 5, 6 and leaves dimensions 2-4 unchanged.

Apply $\hat{e}=e\Phi_{\tau}^{\top}$: the result is $(0,\,2,\,3,\,4,\,0,\,0)$, so the edge dimensions are removed.

Compute the distance between two toy embeddings $e^{(1)}=(1,2,3,4,5,6)$ and $e^{(2)}=(6,5,4,3,2,1)$ after filtering: $\|e^{(1)}\Phi_{\tau}^{\top}-e^{(2)}\Phi_{\tau}^{\top}\|_{2}= \sqrt{(2-5)^2+(3-4)^2+(4-3)^2}= \sqrt{9+1+1}= \sqrt{11}$, identical to the distance computed directly on the retained dimensions.

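EmbedFilter removes only the edge dimensions. Distances between filtered embeddings equal the distances between their bulk coordinates exactly, so the reduced $d/\tau$-dimensional coordinates can stand in for the full projected vectors in similarity-based downstream tasks without changing any result.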
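The toy projector can be verified numerically. The sketch assumes $V = I_6$ with columns 2-4 (1-indexed) kept as the bulk, and the two toy embeddings $(1,2,3,4,5,6)$ and $(6,5,4,3,2,1)$.

```python
import numpy as np

V = np.eye(6)
V_tau = V[:, 1:4]            # columns 2-4 (1-indexed) of I_6, shape (6, 3)
Phi = V_tau @ V_tau.T        # 6x6 projector onto the bulk

e1 = np.array([1, 2, 3, 4, 5, 6], dtype=float)
e2 = np.array([6, 5, 4, 3, 2, 1], dtype=float)

e1_hat = e1 @ Phi.T          # edge dimensions 1, 5, 6 are zeroed

# Distance in the full projected space vs. in the 3 retained coordinates.
dist_projected = np.linalg.norm(e1 @ Phi.T - e2 @ Phi.T)
dist_reduced = np.linalg.norm(e1 @ V_tau - e2 @ V_tau)
```

Both distances equal $\sqrt{11}$, which is why storing only the three bulk coordinates loses nothing relative to the six-dimensional projected vectors.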

**Figure 3.** Re-running the logit lens analysis in Section 1 with text embeddings refined by EmbedFilter. The top-6 tokens from the logit lens are displayed, with colored entries indicating tokens that have literal connections with the input text. EmbedFilter suppresses the expression of frequent tokens and enhances the semantic richness of text embeddings.

By excising the edge spectrum, EmbedFilter yields embeddings that are both cleaner and amenable to aggressive dimensionality reduction without sacrificing task performance.

Experimental Setup

We evaluate EmbedFilter on the MTEB benchmark across three LLM backbones.

We built the evaluation pipeline on the official MTEB implementation, reporting each task’s standard metric.

The benchmark aggregates a suite of downstream tasks—semantic similarity, classification, clustering, and retrieval—to measure how well a text embedding captures useful semantics.

We run EmbedFilter on three popular LLM backbones, varying the reduction factor $τ$ to 2, 4, and 8, which scales the output dimensionality to $1/τ$ of the original size.

On Qwen-2.5-0.5B, EmbedFilter yields the largest relative gain at $τ\!=\!8$, improving the overall MTEB score by +9.0%.

Table 1 shows a +9.0% increase (from 68.81 to 74.41) for the $\tau$=8 column, the highest improvement among all $\tau$ settings for this model.

Llama‑3.1‑8B‑Instruct benefits most at $τ\!=\!2$, where EmbedFilter raises the MTEB score by +7.8%.

The $\tau$=2 column in Table 1 records a +7.8% jump (from 53.47 to 57.70), the strongest lift for the Llama backbone.

For Mistral‑7B‑Instruct‑V0.3, the $\tau$=4 setting delivers a +5.8% improvement over the vanilla baseline.

In Table 1 the $\tau$=4 column rises from 51.43 to 57.32, a +5.8% gain that stands out among the Mistral results.

Main Results

EmbedFilter removes edge‑spectrum noise, yielding stronger text embeddings.

EmbedFilter removes the edge‑spectrum subspace from the unembedding matrix, yielding cleaner semantic embeddings.

EmbedFilter boosts the MTEB average score by up to 14% across all evaluated setups.

Table 1 reports the performance gains for PromptEOL and ECHO configurations.

Ablation Studies

We examine how the filtering ratio and design choices affect EmbedFilter’s performance.

The hyperparameter $τ$ controls the filtering ratio: embeddings are shrunk to $1/τ$ of their original size, which cuts index storage by the same factor and yields an approximate $τ$‑fold speedup in similarity search.
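The storage claim is simple arithmetic. The sketch below assumes a hypothetical index of one million 4096-dimensional float32 vectors; the corpus size, dimension, and precision are illustrative, not figures from the paper.

```python
def index_bytes(num_vectors, dim, bytes_per_value=4):
    """Raw size of a dense vector index (float32 by default)."""
    return num_vectors * dim * bytes_per_value

num_vectors = 1_000_000   # hypothetical corpus size
dim = 4096                # hypothetical embedding dimension

# Index size in bytes for each reduction factor tau.
sizes = {tau: index_bytes(num_vectors, dim // tau) for tau in (1, 2, 4, 8)}

for tau, size in sizes.items():
    print(f"tau={tau}: dim={dim // tau}, index={size / 1e9:.2f} GB")
```

Brute-force dot-product search does work proportional to the dimension, so the same factor applies to its compute cost; approximate-nearest-neighbor indexes may scale differently.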

Even at a high filtering ratio $τ$ = 8, EmbedFilter remains competitive on the Massive Text Embedding Benchmark (MTEB), confirming that the gains are not solely due to dimensionality reduction.

Applying EmbedFilter to Llama-3.1-8B-Instruct reduces the representation size while improving downstream scores, surpassing strong baselines such as SimCSE and coCondenser despite the smaller dimension.

We ablate five filtering configurations on Qwen2.5‑0.5B with PromptEOL ($\tau$ = 2). Config 1 truncates the first half of dimensions (Matryoshka), and Config 2 randomly drops half; both underperform the baseline.

Configs 3–5 target singular subspaces: dominant, secondary, and bulk. EmbedFilter (which removes the secondary subspace) yields the best downstream results, while the inverse operation (Config 5) performs worst. Config 4 (bulk filtering) markedly outperforms Config 5, aligning with the $\Delta\pi$ analysis that the secondary subspace encodes frequent tokens more strongly than the dominant one.

Comparison and Discussion

We compare EmbedFilter against whitening and other calibration baselines on MTEB.

Recall that the unembedding matrix contains an edge spectrum subspace that injects noisy high‑frequency components; EmbedFilter projects away this subspace to yield cleaner embeddings.

Whitening decorrelates embedding dimensions and rescales them to unit variance, turning an anisotropic embedding space into an isotropic one.
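For contrast, a minimal sketch of a whitening baseline, assuming a standard eigendecomposition-based (PCA) whitening fitted on a calibration set. The function names and synthetic data are hypothetical, and the exact whitening variant compared in the paper may differ.

```python
import numpy as np

def fit_whitening(calibration):
    """Estimate mean and whitening transform from (n, d) calibration embeddings."""
    mean = calibration.mean(axis=0, keepdims=True)
    cov = np.cov(calibration - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Scale each eigenvector (column) by 1/sqrt(eigenvalue); tiny eps avoids division by zero.
    transform = eigvecs / np.sqrt(eigvals + 1e-12)
    return mean, transform

def whiten(x, mean, transform):
    """Centre, rotate, and rescale embeddings to unit covariance."""
    return (x - mean) @ transform

rng = np.random.default_rng(0)
mixing = rng.normal(size=(4, 4))
calibration = rng.normal(size=(1000, 4)) @ mixing + 3.0   # correlated, shifted toy data

mean, transform = fit_whitening(calibration)
whitened = whiten(calibration, mean, transform)
```

The dependence on `calibration` is the key difference: the statistics must be estimated from sample embeddings, whereas a filter derived from the unembedding matrix needs only the model weights.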

EmbedFilter outperforms whitening without any calibration data. The contrast is like sharpening a blurry photo with no reference image, whereas whitening needs a sample palette to calibrate its adjustment.

**Table 6.** MTEB results for EmbedFilter and whitening. Best results are highlighted in bold.

Appendices

Supplementary details, prompts, figures, and proofs that support the main text.

We examined why large language models underperform on zero‑shot text‑embedding tasks, identified an “edge spectrum” subspace in the unembedding matrix that injects high‑frequency noise, and introduced EmbedFilter to linearly remove that subspace. Applying EmbedFilter consistently improves zero‑shot performance and reduces embedding dimensionality, which cuts storage costs and speeds up retrieval.

This work was funded by Lenovo Group; we thank Ang Lv for writing suggestions, Yuhan Liu and Yankai Lin for computational resources, and the anonymous KDD reviewers for their constructive feedback.


Section A details the experimental protocol: we evaluate every task in the Massive Text Embedding Benchmark (MTEB), including STS, Classification, Clustering, Pair Classification, Re-ranking, Retrieval, and Summarization. For each model we use the PromptEOL and ECHO templates shown in the prompt listing, and for Mistral-7B-Instruct-V0.3 we offset the indices until $l_{\tau} = 128$.

Section B provides the equivalence-transformation proof. Starting from the projection matrix $\Phi_{\tau}=V[l_{\tau}:r_{\tau}]\,V[l_{\tau}:r_{\tau}]^{\top}$, we define $V_{\tau}=V[l_{\tau}:r_{\tau}]$ and show that $\|x\Phi_{\tau}^{\top}-y\Phi_{\tau}^{\top}\|^{2}=\|xV_{\tau}-yV_{\tau}\|^{2}$ by letting $z=x-y$ and using the identity $V_{\tau}^{\top}V_{\tau}=I$, which holds because the columns of $V_{\tau}$ are orthonormal.
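Written out, the argument takes only a few lines. The derivation below is a sketch that assumes row-vector embeddings and orthonormal columns in $V_{\tau}$ (as holds for singular vectors), so that $V_{\tau}^{\top}V_{\tau}=I$ and $\Phi_{\tau}$ is symmetric; $z = x - y$ as above.

$$
\begin{aligned}
\|x\Phi_{\tau}^{\top}-y\Phi_{\tau}^{\top}\|^{2}
&= \|z\,V_{\tau}V_{\tau}^{\top}\|^{2} \\
&= z\,V_{\tau}V_{\tau}^{\top}V_{\tau}V_{\tau}^{\top}z^{\top} \\
&= z\,V_{\tau}V_{\tau}^{\top}z^{\top} \\
&= \|z\,V_{\tau}\|^{2} = \|xV_{\tau}-yV_{\tau}\|^{2}.
\end{aligned}
$$

The third line is where $V_{\tau}^{\top}V_{\tau}=I$ is used; everything else is expansion of the squared norm.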

### PromptEOL

**Qwen** `Summarize the sentence: "{text}" in one word:`

**Llama** `Summarize the sentence: "{text}" in one word:`

**Mistral** `This sentence: "{text}" means [MASK]`

**Figure 4.** $\Delta\pi$ distribution for high-frequency, low-frequency and randomly sampled tokens on the Qwen model.

**Table 7.** Evaluation metrics used for MTEB tasks.

This table maps various task categories to their corresponding evaluation metrics.
