Commit ee97955 (parent 75cd9d0): Add support for 13B and 70B models, workflow, readme

Showing 9 changed files with 217 additions and 1 deletion.
New file: GitHub Actions workflow (+32 lines)

```yaml
name: Build Docker images on master push

on:
  push:
    branches:
      - master

jobs:
  build_api:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@c85c95e3d7251135ab7dc9ce3241c5835cc595a9 # v3.5.3
      - run: docker login --username "${{ github.actor }}" --password ${{ secrets.GITHUB_TOKEN }} ghcr.io
      - run: docker buildx create --use
      # 7B
      - run: docker buildx build --platform linux/amd64,linux/arm64 -f api/Dockerfile --tag ghcr.io/getumbrel/llama-gpt-api-llama-2-7b-chat:${{ github.sha }} --push .
      - run: docker buildx build --platform linux/amd64,linux/arm64 -f api/Dockerfile --tag ghcr.io/getumbrel/llama-gpt-api-llama-2-7b-chat:latest --push .
      # 13B
      - run: docker buildx build --platform linux/amd64,linux/arm64 -f api/13B.Dockerfile --tag ghcr.io/getumbrel/llama-gpt-api-llama-2-13b-chat:${{ github.sha }} --push .
      - run: docker buildx build --platform linux/amd64,linux/arm64 -f api/13B.Dockerfile --tag ghcr.io/getumbrel/llama-gpt-api-llama-2-13b-chat:latest --push .
      # 70B
      - run: docker buildx build --platform linux/amd64,linux/arm64 -f api/70B.Dockerfile --tag ghcr.io/getumbrel/llama-gpt-api-llama-2-70b-chat:${{ github.sha }} --push .
      - run: docker buildx build --platform linux/amd64,linux/arm64 -f api/70B.Dockerfile --tag ghcr.io/getumbrel/llama-gpt-api-llama-2-70b-chat:latest --push .

  build_ui:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@c85c95e3d7251135ab7dc9ce3241c5835cc595a9 # v3.5.3
      - run: docker login --username "${{ github.actor }}" --password ${{ secrets.GITHUB_TOKEN }} ghcr.io
      - run: docker buildx create --use
      - run: docker buildx build --platform linux/amd64,linux/arm64 -f ui/Dockerfile --tag ghcr.io/getumbrel/llama-gpt-ui:${{ github.sha }} --push .
      - run: docker buildx build --platform linux/amd64,linux/arm64 -f ui/Dockerfile --tag ghcr.io/getumbrel/llama-gpt-ui:latest --push .
```
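One caveat on the login steps above: passing the token via `--password` puts it on the process command line, where other processes on the runner can read it. `docker login` also accepts the documented `--password-stdin` flag, so an equivalent step (a sketch using the same secrets as above) would be:

```yaml
      - run: echo "${{ secrets.GITHUB_TOKEN }}" | docker login --username "${{ github.actor }}" --password-stdin ghcr.io
```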
File renamed without changes.
New file: README.md (+84 lines)
<p align="center">
  <a href="https://apps.umbrel.com/app/llama-gpt">
    <img width="150" height="150" src="https://i.imgur.com/F0h1k80.png" alt="LlamaGPT" />
  </a>
</p>
<p align="center">
  <h1 align="center">LlamaGPT</h1>
  <p align="center">
    A self-hosted, offline, ChatGPT-like chatbot, powered by Llama 2. 100% private, with no data leaving your device.
    <br />
    <a href="https://umbrel.com"><strong>umbrel.com »</strong></a>
    <br />
    <br />
    <a href="https://twitter.com/umbrel">
      <img src="https://img.shields.io/twitter/follow/umbrel?style=social" />
    </a>
    <a href="https://t.me/getumbrel">
      <img src="https://img.shields.io/badge/community-chat-%235351FB" />
    </a>
    <a href="https://reddit.com/r/getumbrel">
      <img src="https://img.shields.io/reddit/subreddit-subscribers/getumbrel?style=social" />
    </a>
    <a href="https://community.umbrel.com">
      <img src="https://img.shields.io/badge/community-forum-%235351FB" />
    </a>
  </p>
</p>

## Demo

https://github.com/getumbrel/llama-gpt/assets/10330103/71521963-6df2-4ffb-8fe1-f079e80d6a8b

## How to install

### Install LlamaGPT on your umbrelOS home server

Installing LlamaGPT on an [umbrelOS](https://umbrel.com) home server takes a single click: simply install it from the [Umbrel App Store](https://apps.umbrel.com/app/llama-gpt).

<!-- Todo: update badge link after launch -->

[![LlamaGPT on Umbrel App Store](https://apps.umbrel.com/app/nostr-relay/badge-dark.svg)](https://apps.umbrel.com/app/llama-gpt)
### Install LlamaGPT anywhere else

You can run LlamaGPT on any x86 or arm64 system. Make sure you have Docker installed.

Then, clone this repo and `cd` into it:

```
git clone https://github.com/getumbrel/llama-gpt.git
cd llama-gpt
```

You can now run LlamaGPT with any of the following models, depending on your hardware:

| Model size | Model used                          | Minimum RAM required | How to start LlamaGPT                            |
| ---------- | ----------------------------------- | -------------------- | ------------------------------------------------ |
| 7B         | Nous Hermes Llama 2 7B (GGML q4_0)  | 8GB                  | `docker compose up -d`                           |
| 13B        | Nous Hermes Llama 2 13B (GGML q4_0) | 16GB                 | `docker compose -f docker-compose-13b.yml up -d` |
| 70B        | Meta Llama 2 70B Chat (GGML q4_0)   | 48GB                 | `docker compose -f docker-compose-70b.yml up -d` |

You can access LlamaGPT at `http://localhost:3000`.
To stop LlamaGPT, run:

```
docker compose down
```
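The compose files in this commit publish the UI on host port 3000. If that port is already taken, one option is Compose's standard override mechanism; the sketch below assumes the default setup (an override file named `docker-compose.override.yml` is picked up automatically when `docker compose up` is run without `-f` flags, and the host port 8080 is an arbitrary choice for illustration):

```yaml
# docker-compose.override.yml (hypothetical; merged automatically by `docker compose up`)
services:
  llama-gpt-ui:
    ports:
      - 8080:3000  # also publish the UI on host port 8080
```

Note that Compose merges `ports` lists when overriding, so the UI would be reachable on both ports.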
## Acknowledgements

A massive thank you to the following developers and teams for making LlamaGPT possible:

- [Mckay Wrigley](https://github.com/mckaywrigley) for building [Chatbot UI](https://github.com/mckaywrigley/chatbot-ui).
- [Andrei](https://github.com/abetlen) for building the [Python bindings for llama.cpp](https://github.com/abetlen/llama-cpp-python).
- [NousResearch](https://nousresearch.com) for [fine-tuning the Llama 2 7B and 13B models](https://huggingface.co/NousResearch).
- [Tom Jobbins](https://huggingface.co/TheBloke) for [quantizing the Llama 2 models](https://huggingface.co/TheBloke/Nous-Hermes-Llama-2-7B-GGML).
- [Meta](https://ai.meta.com/llama) for releasing Llama 2 under a permissive license.

---

[![License](https://img.shields.io/github/license/getumbrel/llama-gpt?color=%235351FB)](https://github.com/getumbrel/llama-gpt/blob/master/LICENSE.md)

[umbrel.com](https://umbrel.com)
New file: api/13B.Dockerfile (+26 lines)

```dockerfile
# Define the image argument and provide a default value
ARG IMAGE=ghcr.io/abetlen/llama-cpp-python:latest

# Define the model file name and download URL
ARG MODEL_FILE=llama-2-13b-chat.bin
ARG MODEL_DOWNLOAD_URL=https://huggingface.co/TheBloke/Nous-Hermes-Llama2-GGML/resolve/main/nous-hermes-llama2-13b.ggmlv3.q4_0.bin

FROM ${IMAGE}

# ARGs declared before FROM go out of scope at FROM, so re-declare them for this stage
ARG MODEL_FILE
ARG MODEL_DOWNLOAD_URL

# Download the model file
RUN apt-get update -y && \
    apt-get install --yes curl && \
    mkdir -p /models && \
    curl -L -o /models/${MODEL_FILE} ${MODEL_DOWNLOAD_URL}

WORKDIR /app

COPY . .

EXPOSE 8000

# Run the server start script
CMD ["/bin/sh", "/app/run.sh"]
```
New file: api/70B.Dockerfile (+26 lines)

```dockerfile
# Define the image argument and provide a default value
ARG IMAGE=ghcr.io/abetlen/llama-cpp-python:latest

# Define the model file name and download URL
ARG MODEL_FILE=llama-2-70b-chat.bin
ARG MODEL_DOWNLOAD_URL=https://huggingface.co/TheBloke/Llama-2-70B-Chat-GGML/resolve/main/llama-2-70b-chat.ggmlv3.q4_0.bin

FROM ${IMAGE}

ARG MODEL_FILE
ARG MODEL_DOWNLOAD_URL

# Download the model file
RUN apt-get update -y && \
    apt-get install --yes curl && \
    mkdir -p /models && \
    curl -L -o /models/${MODEL_FILE} ${MODEL_DOWNLOAD_URL}

WORKDIR /app

COPY . .

EXPOSE 8000

# Run the server start script
CMD ["/bin/sh", "/app/run.sh"]
```
New file: docker-compose-13b.yml (+16 lines)

```yaml
version: '3.6'

services:
  llama-gpt-api:
    image: 'ghcr.io/getumbrel/llama-gpt-api-llama-2-13b-chat:latest'
    environment:
      MODEL: '/models/llama-2-13b-chat.bin'

  llama-gpt-ui:
    image: 'ghcr.io/getumbrel/llama-gpt-ui:latest'
    ports:
      - 3000:3000
    environment:
      - 'OPENAI_API_KEY=sk-XXXXXXXXXXXXXXXXXXXX'
      - 'OPENAI_API_HOST=http://llama-gpt-api:8000'
      - 'DEFAULT_MODEL=/models/llama-2-13b-chat.bin'
```
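If the prebuilt image above is not yet available in the registry (for example, before the workflow in this commit has run), Compose's standard `build` keys can point at the Dockerfile added here instead. A sketch, where the `context` and `dockerfile` values mirror the workflow's `docker buildx build -f api/13B.Dockerfile … .` invocation:

```yaml
services:
  llama-gpt-api:
    image: 'ghcr.io/getumbrel/llama-gpt-api-llama-2-13b-chat:latest'
    build:
      context: .
      dockerfile: api/13B.Dockerfile  # `docker compose build` falls back to a local build
```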
New file: docker-compose-70b.yml (+16 lines)

```yaml
version: '3.6'

services:
  llama-gpt-api:
    image: 'ghcr.io/getumbrel/llama-gpt-api-llama-2-70b-chat:latest'
    environment:
      MODEL: '/models/llama-2-70b-chat.bin'

  llama-gpt-ui:
    image: 'ghcr.io/getumbrel/llama-gpt-ui:latest'
    ports:
      - 3000:3000
    environment:
      - 'OPENAI_API_KEY=sk-XXXXXXXXXXXXXXXXXXXX'
      - 'OPENAI_API_HOST=http://llama-gpt-api:8000'
      - 'DEFAULT_MODEL=/models/llama-2-70b-chat.bin'
```