Cloud LoRA

Spin up your Llama LoRA and generate millions of tokens in minutes.

Here's the deal:

Get scalable inference for your LoRAs as soon as they're done training. Get started with a one-line change to your code.
Sign up for our private alpha here and get your first million tokens free.

    # create your Llama model and apply your LoRA adapter
    peft_model = ...

    # create a cloud LoRA model
    cloud_model = CloudLora.create(peft_model)

    # scalable remote inference
    cloud_model.remote().get_completion(...)
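The `peft_model = ...` placeholder above is left for you to fill in. As a minimal sketch, assuming you trained your adapter with the Hugging Face `transformers` and `peft` libraries, one common way to build it looks like this. The helper name `load_peft_model` and its two parameters are hypothetical, chosen for illustration; they are not part of Cloud LoRA.

```python
def load_peft_model(base_model_id: str, adapter_path: str):
    """Load a base causal LM and attach a trained LoRA adapter to it.

    Hypothetical helper for illustration only; not part of Cloud LoRA.
    """
    # Imported inside the function so that defining the helper does not
    # require transformers or peft to be installed.
    from transformers import AutoModelForCausalLM
    from peft import PeftModel

    # Load the base Llama weights (Hub ID or local directory).
    base_model = AutoModelForCausalLM.from_pretrained(base_model_id)

    # Wrap the base model with the saved LoRA adapter weights.
    return PeftModel.from_pretrained(base_model, adapter_path)
```

You would then call it with your own base model ID and adapter directory, for example `peft_model = load_peft_model(base_model_id, adapter_path)`, and pass the result to `CloudLora.create` as shown above.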