Using Self Hosted LLM from your Smartphone with acai.so and Let's Encrypt
This guide picks up where the last one left off. It assumes you have a locally hosted OpenAI endpoint accessible on the network.
Self Hosting OpenAI Chat Endpoint with GPU-accelerated MistralOrca 7B 8K (GGUF) and Llama CPP Python Server
Serving models as an emulated OpenAI endpoint enables a few important benefits, e.g. application start-up is decoupled from model initialization (faster code iteration).
Let’s say your endpoint is accessible on the network at http://192.168.1.123:8180/v1
To make practical use of this endpoint from my smartphone, I am going to use acai.so. acai.so is an AI-first in-browser chat experience that keeps all your chat and note data locally in your browser. Total privacy! However, by default you really need to use OpenAI for GPT completions… Not so private! Let’s secure that self hosted endpoint with a free SSL certificate and connect from acai.so.
Step 1: Prepare Nginx and Certbot
(TODO: link to a github with all this code in it)
The last guide offered examples for both Linux and Windows. In my implementation, I have a dedicated 24/7 Linux machine I use for all kinds of nonsense like Plex and PiHole DNS, etc. I will assume you have an Ubuntu machine running on your local network with Docker installed. This can be the same machine hosting your endpoint or in my case, I have my Windows gaming computer hosting the endpoint and my Ubuntu machine hosting the HTTPS proxy.
On with the code…
Create a docker-compose.yaml:
```yaml
version: '3.8'
services:
  nginx:
    image: nginx:latest
    volumes:
      - ./nginx-conf.d:/etc/nginx/conf.d
      - ./letsencrypt:/etc/letsencrypt
    ports:
      # 44301 avoids conflict with my k3s HTTPS; use "443:443"
      # if you prefer, but "44301:443" will work for you too
      - "44301:443"
```
This creates an Nginx container that we will keep running to proxy SSL traffic to your endpoint.
Next, create a docker-compose-certbot.yaml:
```yaml
version: '3.8'
services:
  certbotdnsmanual:
    image: certbot/certbot
    volumes:
      - ./letsencrypt:/etc/letsencrypt
    environment:
      - CERT_DOMAIN
    command: -d ${CERT_DOMAIN} --manual --preferred-challenges dns certonly
```
We will use this to automate SSL certificate creation.
And finally, I made a little convenience script, cert-gen.sh:
```bash
#!/usr/bin/env bash
set -e
test -n "$CERT_DOMAIN" || { echo "ERROR: CERT_DOMAIN not set in env"; exit 1; }
mkdir -vp nginx-conf.d

# Generate the certs
docker compose \
  -f docker-compose-certbot.yaml \
  run --rm -it \
  certbotdnsmanual

# Generate the VHOST
VHOST_CONF="vhost.conf" # TODO: multi domain??
cat << EOF > "nginx-conf.d/${VHOST_CONF}"
server {
    # "listen ... ssl" enables TLS; the old "ssl on" directive was
    # removed in recent Nginx releases and would break nginx:latest
    listen 443 ssl;
    server_name _; # Replace with your desired hostname if multiple

    # fullchain.pem includes the intermediate cert, which clients need
    ssl_certificate /etc/letsencrypt/live/${CERT_DOMAIN}/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/${CERT_DOMAIN}/privkey.pem;

    location / {
        # Replace with your endpoint machine's IP address and port
        proxy_pass http://192.168.1.123:8180;
    }
}
EOF
```
You will need to update the proxy_pass target in this script to point at the correct endpoint IP and port.
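One optional tweak: chat completions are usually streamed as server-sent events, and Nginx's default response buffering can make streamed tokens arrive in bursts rather than one at a time. If you notice that, a sketch of the adjustment (same location block as above; the proxy_pass target is my example IP) is:

```nginx
location / {
    proxy_pass http://192.168.1.123:8180;
    # Stream SSE responses through as they arrive instead of buffering them
    proxy_buffering off;
    # Forward the original Host header to the backend
    proxy_set_header Host $host;
}
```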
Step 2: Generate SSL and Virtual Host
At this point, we are ready to generate an SSL certificate and virtual host. For this example to work, you will need to own a domain and control its DNS. I chose to use llama.bios.dev; wherever it appears in my examples, substitute your preferred subdomain.
With the above files in place, you should be able to run this command:
```bash
CERT_DOMAIN=llama.bios.dev bash cert-gen.sh
```
It will ask you some questions about owning the domain and opting into email contact; I leave you to answer those as you will. Finally, it will prompt you to set a TXT record for the provided subdomain.
For example:
```
...
Please deploy a DNS TXT record under the name:
_acme-challenge.llama.bios.dev.
with the following value:
q2gpz8c9Gq-XrJ2lcST29Id9nTq9JoCEIZfsbl1t4
...
```
So you will want to create a TXT record for _acme-challenge.llama with the value q2gpz8c9Gq-XrJ2lcST29Id9nTq9JoCEIZfsbl1t4 in your DNS settings. Deeper instruction for accomplishing that step is outside the scope of this guide.
Once the record is in place (you can confirm it has propagated with dig +short TXT _acme-challenge.llama.bios.dev before continuing), hit ENTER.
This only gives you an SSL certificate valid for 90 days, but you can repeat the process when it is up for renewal. I will update this guide or write a fresh one later to explain a permanent auto-renewal solution. This script also generates the necessary Nginx virtual host to be picked up by our Nginx container definition.
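Until auto-renewal is in place, a quick way to see how long your certificate has left is to read its expiry date with openssl. This is just a sketch; the certificate path in the usage comment assumes the ./letsencrypt volume layout created above, and `date -d` is the GNU form available on Ubuntu:

```shell
# Print the number of whole days until a certificate expires.
cert_days_left() {
  local cert_path="$1"
  local expiry expiry_epoch now_epoch
  # Pull the notAfter date out of the certificate
  expiry=$(openssl x509 -enddate -noout -in "$cert_path" | cut -d= -f2)
  expiry_epoch=$(date -d "$expiry" +%s)
  now_epoch=$(date +%s)
  echo $(( (expiry_epoch - now_epoch) / 86400 ))
}
# e.g. cert_days_left "letsencrypt/live/${CERT_DOMAIN}/cert.pem"
```

Anything under 30 days is a reasonable cue to re-run cert-gen.sh.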
You must also set an A record in your DNS to point your subdomain to the private network address where your Nginx container will be running (in my case, the Ubuntu box rather than the gaming PC).
Step 3: Start it up and plug into acai.so
If the above worked, you now have a valid SSL for your subdomain and the subdomain should be pointing to the machine that will run the Nginx container. Let’s start it!
```bash
docker compose up -d
```
That’s it! Nginx should be up, and if you visit https://llama.bios.dev:44301/docs it should route to the hosting machine and proxy the endpoint. Let’s plug it into acai.so!
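Before wiring up acai.so, it can also help to confirm the proxy is answering from the command line. A minimal check (the hostname and port in the usage comment are from my setup; substitute yours) is to request the endpoint's model list and look at the HTTP status:

```shell
# Print the HTTP status code returned by an OpenAI-style endpoint's
# /v1/models route; 200 means the proxy and backend are both up.
check_endpoint() {
  # -k skips CA verification in case your client lacks the LE roots;
  # drop it once you have confirmed the full chain is being served
  curl -sk -o /dev/null -w '%{http_code}\n' "${1}/v1/models"
}
# e.g. check_endpoint https://llama.bios.dev:44301
```

A 502 here usually means Nginx is up but the proxy_pass target is wrong or the model server is down.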
Visit https://acai.so and, under Settings > Access Configuration, replace the OpenAI API Base URL with your new HTTPS endpoint (eg, https://llama.bios.dev:44301/v1) and put in anything for the OpenAI API Key (eg, lmaounlimitedtokens). It doesn’t matter what you choose for the key; it just can’t be blank.
Hit Submit to save the change and you should be able to chat with your local model from any device on the network!!
Conclusion
This is a very raw methodology to deliver local model access via acai.so which could be improved in a few ways in the future:
Link a GitHub repo with all that code I told you to paste
Organize the prerequisites sooner (eg “you need a domain” and “you need to know DNS”)
Update the Certbot process to keep the container running for fully automatic SSL renewal (right now you will need to re-run it every 90 days).
The next guide I want to write in this series is about how to make that SSL endpoint accessible to your smartphone ANYWHERE IN THE WORLD
Did you find this helpful? Did you get stuck? Did I miss something? Reach out on X.