
Embeddings LoRA & TP #3091

Merged

BenjaminBossan merged 53 commits into huggingface:main from michaelbenayoun:embeddings_lora_tp on Mar 24, 2026

Conversation

@michaelbenayoun (Member) commented Mar 11, 2026

To be merged after #3079.

This PR adds support for LoRA embedding layers with tensor parallelism (TP).

cc @3outeille

@HuggingFaceDocBuilderDev commented:

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@michaelbenayoun michaelbenayoun marked this pull request as ready for review March 12, 2026 15:50
@BenjaminBossan (Member) left a comment:

Thanks for extending TP support to LoRA embedding layers. This PR mostly looks good as is; just a few comments, please check. Since the updated tests are not included in the PR CI, I ran them on my machine and they passed.

# (used by the hooks), make the hooks use this fake module, and then add the hooks to the original
# `_embed` method acting as the forward pass lora_embedding_A.
tp_layer = copy.deepcopy(ALL_PARALLEL_STYLES[tp_plan])
mod = SimpleNamespace()
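For context, here is a minimal, hypothetical illustration of the fake-module trick in the snippet above (not the PR's actual code): the TP hooks only need an object exposing a `weight` attribute, which a bare `SimpleNamespace` can provide, but it captures the exact Parameter object it was created with.

```python
import torch
from types import SimpleNamespace

# Hypothetical stand-in for the module the TP hooks operate on. A bare
# SimpleNamespace works as long as the hooks only read `.weight`, but it
# pins the Parameter object it was created with: if the layer later swaps
# its parameter, the namespace still points at the old tensor.
weight = torch.nn.Parameter(torch.randn(8, 16))
mod = SimpleNamespace(weight=weight)

print(mod.weight is weight)  # the namespace holds this exact Parameter
```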
@BenjaminBossan (Member):

Ah damn. Do you think the usage of SimpleNamespace is robust? Otherwise, we could consider using nn.utils.parametrize.

@michaelbenayoun (Member, Author):

Parametrize will not do much here; what we need is a module holding the weights so that the input and output functions can use it.

I replaced SimpleNamespace, which is very basic, with something a bit more elaborate: we create an nn.Module holding the lora_embedding_A ParameterDict instead of individual weights. This way, even if a parameter is replaced, we still get the proper weight for a given adapter.

I think it is cleaner than the first approach.
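A sketch of what such a wrapper might look like (the class name and structure are my assumption for illustration, not necessarily the code that was pushed): an nn.Module that holds the `lora_embedding_A` ParameterDict and resolves the weight for a given adapter on every access, so a Parameter replaced in the dict is picked up automatically.

```python
import torch
import torch.nn as nn

class LoraEmbeddingProxy(nn.Module):
    """Hypothetical wrapper: holds the shared lora_embedding_A
    ParameterDict rather than an individual weight, so TP hooks that
    expect a module with a `.weight` attribute always see the current
    Parameter for the chosen adapter."""

    def __init__(self, lora_embedding_A: nn.ParameterDict, adapter_name: str):
        super().__init__()
        self.lora_embedding_A = lora_embedding_A
        self.adapter_name = adapter_name

    @property
    def weight(self) -> nn.Parameter:
        # Resolved on every access: a replaced Parameter is still found.
        return self.lora_embedding_A[self.adapter_name]

params = nn.ParameterDict({"default": nn.Parameter(torch.randn(8, 100))})
proxy = LoraEmbeddingProxy(params, "default")
params["default"] = nn.Parameter(torch.randn(8, 100))  # swap the weight
print(proxy.weight is params["default"])  # proxy tracks the new Parameter
```

The design point mirrors the comment above: holding the dict instead of a weight makes the lookup late-bound, which a `SimpleNamespace(weight=...)` snapshot cannot do.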

@BenjaminBossan (Member):

> I changed SimpleNamespace, which is very basic and simple, with something a bit more elaborate. We create a nn.Module, holding the lora_embedding_A Parameter dict instead of individual weights.

That sounds better, but I think you didn't push that change yet.

Also, instead of monkey-patching lora_module._embed here, I think it's better to go into peft.tuners.lora.Embedding._embed and add branching code in there to check for tp_plan and call the corresponding helper functions there.
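A hedged sketch of that suggestion (the function and attribute names here are assumptions for illustration, not the actual peft source): `_embed` checks whether a TP plan is set and dispatches accordingly, instead of being monkey-patched from the outside.

```python
import torch
import torch.nn.functional as F

def embed_with_optional_tp(module, input_ids, weight):
    """Illustrative only: mirrors the idea of branching inside
    peft.tuners.lora.Embedding._embed on the presence of a TP plan."""
    tp_plan = getattr(module, "tp_plan", None)
    if tp_plan is not None:
        # Hypothetical TP helper: in the real PR this would route the
        # lookup through the parallel style's input/output functions.
        return module.tp_embed(input_ids, weight, tp_plan)
    # Default, non-TP path: a plain embedding lookup.
    return F.embedding(input_ids, weight)

class _NoTP:
    tp_plan = None  # stand-in module without a TP plan

weight = torch.randn(10, 4)
ids = torch.tensor([1, 3])
out = embed_with_optional_tp(_NoTP(), ids, weight)
print(out.shape)  # torch.Size([2, 4])
```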

@michaelbenayoun (Member, Author):

I just pushed that change. And about your second suggestion: great idea, I pushed it as well.

@BenjaminBossan (Member) left a comment:

Thanks for the refactor. This looks like the more robust solution to me. PR looks good now.

@BenjaminBossan BenjaminBossan merged commit 58169b5 into huggingface:main Mar 24, 2026
10 checks passed