A few things I know about LoRA

Oscar Nevarez
4 min read · Mar 12, 2023


I wanted to write down my thoughts and very basic understanding of LoRA and some related concepts I now have more clarity on compared to, say, when I started studying just a few months ago. If you feel confident with most of the terminology, you can skip the background part and jump directly into the chapter that interests you the most.

Disclaimer: this is not a comprehensive guide about training or the use of LoRA files, nor is it a scientific/academic article. I do believe, however, that it could help shed some light on the topic using lay terms and a code-based approach.

I’ll try to lay down the terms as simply as I would’ve liked them to be presented to me: a regular, somewhat knowledgeable person with a technical understanding of things.

Finally, although LoRA can be applied to any neural network, this post will focus on its use with Stable Diffusion.

Background

To understand LoRA we first need some vague understanding of what neural networks and LLMs are and how they work.

The easiest way I’ve found to put it is this:

A neural network is composed of multiple layers. Each layer performs a mathematical operation called matrix multiplication: the layer’s weight matrix is multiplied by the input vector, producing an output vector that is passed on to the next layer.
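
To make that concrete, here is a tiny sketch of a single layer’s matrix multiplication in PyTorch; the shapes below are made up purely for illustration:

import torch

# A toy "layer": a weight matrix W and an input vector x.
# Shapes are arbitrary, chosen only for illustration.
W = torch.randn(4, 8)   # 4 outputs, 8 inputs
x = torch.randn(8)      # input vector

# The layer's core operation: matrix multiplication,
# usually followed by a non-linearity such as ReLU.
h = torch.relu(W @ x)
print(h.shape)  # torch.Size([4])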

Another introduction I like is the video Neural Networks Explained in 5 Minutes by IBM.

What is LoRA?

LoRA stands for Low-Rank Adaptation of Large Language Models, the original paper is available here.

How does LoRA work?

LoRA works by freezing the pre-trained model weights and injecting trainable rank-decomposition matrices into layers of the Transformer architecture, typically, though not exclusively, the attention layers. This greatly reduces the number of trainable parameters for downstream tasks, while still performing on par with, or better than, full fine-tuning in terms of model quality.
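
To make that more concrete, here is a minimal sketch of the idea from the paper: the pre-trained weight W0 stays frozen, while a pair of small matrices, a down-projection A and an up-projection B of rank r, are the only trainable parameters. The dimensions and values below are made up for illustration, not taken from any particular implementation:

import torch

d_out, d_in, r = 320, 320, 4      # illustrative dimensions; r is the LoRA rank
alpha = 4.0                        # scaling factor, usually stored alongside the weights

W0 = torch.randn(d_out, d_in)      # pre-trained weight: frozen, never updated
A  = torch.randn(r, d_in) * 0.01   # "lora_down": trainable, maps d_in -> r
B  = torch.zeros(d_out, r)         # "lora_up": trainable, maps r -> d_out (starts at zero)

x = torch.randn(d_in)

# Original layer output plus the low-rank update, scaled by alpha / r.
h = W0 @ x + (alpha / r) * (B @ (A @ x))
print(h.shape)  # torch.Size([320])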

What’s in a LoRA file?

A LoRA file contains weights. Each weight tensor is stored under a layer key that follows a specific naming convention, and these weights interact with the model’s own weights during inference.

Some training methods reorganize the layers in a way that is easier to access, may change the prefixes and suffixes, or may also include complementary information in the file, aka metadata.

Training

What are the available libraries to fine-tune a model and get LoRA files from it?

Why do some LoRA files not work?

It’s very rare that a LoRA file is inherently broken; more often, it simply has a different structure from the one expected by the consumer.

If you leave this post with only a few takeaways, let the following be one of them:

There is no such thing as a LoRA format

There is a LoRA algorithm, or training method, but the mode, configuration, and hence the output structure depend on the given input and the training implementation.

Some of the reasons your file doesn’t work, or produces no difference when used, are the following:

  • The model used for the training is different from the one used during inference.
  • The layers follow a different naming convention from the one expected, or are simply structured differently (a small illustration follows below).
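
As a toy illustration of the second point, here are two made-up key names for what could be the same underlying weight; a consumer doing a plain string match would fail to map one to the other even though the tensors themselves are compatible:

# Illustrative only: two naming conventions for the same kind of weight.
expected_keys = {
    "lora_unet_down_blocks_0_attentions_0_proj_in.lora_down.weight",
}
file_keys = {
    "unet.down_blocks.0.attentions.0.proj_in.lora.down.weight",  # hypothetical variant
}

unmatched = file_keys - expected_keys
print(f"{len(unmatched)} key(s) the consumer cannot map:")
for k in sorted(unmatched):
    print(" ", k)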

Understanding key differences

Let’s inspect the structure of a random LoRA file taken from CivitAI. I’ll use the safetensors library for this, guided by the file extension (.safetensors):

from safetensors import safe_open

# Download your favorite LoRA file from civitai
# Using: https://civitai.com/models/6526/studio-ghibli-style-lora
# device="cpu" is enough just to list the keys
with safe_open("lora.safetensors", framework="pt", device="cpu") as f:
    for k in f.keys():
        print(f"key={k}")

The output of the above code can be found here, and a Colab notebook here.

Most, if not all, of the keys found follow a prefix pattern of the following nature:

  • lora_te_text_model_encoder_layers_N_mlp_fcN
  • lora_unet_down_blocks_N_attentions_N_proj_in
  • lora_unet_up_blocks_N_attentions_N_proj_in

While the suffixes match one of:

  • lora_down.weight
  • lora_up.weight
  • alpha

Although this kind of structure is the one most often used, it is not a standard; be prepared to find multiple different formats out there.
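
One exploratory way to see those prefix/suffix patterns is to group the keys of the same lora.safetensors file by suffix. This is just a small extension of the earlier snippet, not part of any official tooling:

from collections import Counter
from safetensors import safe_open

suffix_counts = Counter()
with safe_open("lora.safetensors", framework="pt", device="cpu") as f:
    for k in f.keys():
        # Keys look like "<prefix>.<suffix>", e.g.
        # "lora_unet_up_blocks_1_attentions_0_proj_in.lora_down.weight"
        prefix, _, suffix = k.partition(".")
        suffix_counts[suffix] += 1

for suffix, count in suffix_counts.most_common():
    print(f"{suffix}: {count} tensors")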

Take another example: a Pokémon LoRA file trained using the official Diffusers implementation. The generated file is not a safetensors file but a PyTorch binary file, and its structure, as we might expect, looks different, more closely aligned with the transformers architecture. Read more about this here and here.
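
To peek into that kind of file, the equivalent of the safetensors snippet is a plain torch.load. The filename below is only a placeholder for whatever the training script actually produced:

import torch

# Placeholder filename; use whatever your training run actually wrote out.
state_dict = torch.load("pytorch_lora_weights.bin", map_location="cpu")

# Print every key together with the shape of its tensor.
for key, tensor in state_dict.items():
    print(key, tuple(tensor.shape))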

Inference

Where to find LoRA files

How does Automatic1111 use LoRA files?

This is a tricky question and the source of many open issues in the A1111 repositories. There is no single answer here; A1111 dropped its own LoRA training support, and what it does is a mix of existing techniques used together.

A1111 will try a few things before you can activate your LoRA file:

  • Read the metadata, if available, to detect the LoRA’s origin and training settings.
  • Detect the naming convention by reading the file keys.
  • Convert from the old naming convention to the new one.
  • Reconcile any weight mismatches.
  • Ignore unmatched layers.
  • Inject the LoRA network into the current model for use.

If the above succeeds, then A1111 will handle the calculations needed to use your LoRA file during inference, including the prompt weight and LoRA ranks.
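
Conceptually, the final calculation boils down to folding each LoRA pair back into the corresponding base weight, scaled by alpha over the rank and by the strength given in the prompt. This is only a rough sketch of that idea under those assumptions, not A1111’s actual code:

import torch

def merge_lora_weight(W0, lora_down, lora_up, alpha, multiplier=1.0):
    # W0:         frozen base weight, shape (d_out, d_in)
    # lora_down:  shape (r, d_in)
    # lora_up:    shape (d_out, r)
    # alpha:      scaling stored in the file; effective scale is alpha / r
    # multiplier: user-facing strength, e.g. the number in a <lora:name:0.8> prompt tag
    r = lora_down.shape[0]
    scale = multiplier * alpha / r
    return W0 + scale * (lora_up @ lora_down)

# Toy shapes for illustration only.
W0 = torch.randn(320, 320)
down, up = torch.randn(4, 320), torch.randn(320, 4)
W = merge_lora_weight(W0, down, up, alpha=4.0)
print(W.shape)  # torch.Size([320, 320])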

Conclusion

In this post we covered the basic functionality of the LoRA algorithm, how we can use it with Stable Diffusion, and what the key implementation differences are. Hopefully 🤞 this helps someone.
