Serverless picoLLM: LLMs Running in AWS Lambda!

Code for the Serverless LLM article on picovoice.ai, which you can find here: picoLLM on Lambda.

The Demo in Action

Disclaimer

THIS DEMO EXCEEDS AWS FREE TIER USAGE. YOU WILL BE CHARGED BY AWS IF YOU DEPLOY THIS DEMO.

Prerequisites

You will need the following in order to deploy and run this demo:

  1. A Picovoice Console account with a valid AccessKey.

  2. An AWS account.

  3. AWS SAM CLI installed and set up. Follow the official guide completely.

  4. A valid Docker installation.

Setup

  1. Clone the serverless-picollm repo:
git clone https://github.com/Picovoice/serverless-picollm.git
  2. Download a Phi-2-based .pllm model from the picoLLM section of the Picovoice Console.

Tip

Other models will work as long as they are chat-enabled and fit within the AWS Lambda code size and memory limits. You will also need to update the Dialog object in client.py to the appropriate class.

For example, if using Llama3 with the llama-3-8b-instruct-326 model, the line in client.py should be updated to:

dialog = picollm.Llama3ChatDialog(history=3)
  3. Place the downloaded .pllm model in the models/ directory.

  4. Replace "${YOUR_ACCESS_KEY_HERE}" inside the src/app.py file with your AccessKey obtained from Picovoice Console.
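
Before deploying, it can be worth sanity-checking your AccessKey and model locally. The snippet below is only a rough sketch using the picoLLM Python SDK (pip install picollm), not code from this repo; the model filename is a placeholder, and the Dialog class should match the model you downloaded, as described in the Tip above.

import picollm

# Illustrative local check -- not part of the repo.
pllm = picollm.create(
    access_key="${YOUR_ACCESS_KEY_HERE}",   # the same AccessKey you put in src/app.py
    model_path="models/your-model.pllm")    # placeholder; point at the .pllm you downloaded

# Use the Dialog class that matches your model (e.g. Phi2ChatDialog for Phi-2).
dialog = picollm.Phi2ChatDialog(history=3)
dialog.add_human_request("What is the capital of France?")

res = pllm.generate(prompt=dialog.prompt())
dialog.add_llm_response(res.completion)
print(res.completion)

If this prints a completion, the same AccessKey and model should work once baked into the Lambda container.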

Deploy

  1. Use AWS SAM CLI to build the app:
sam build
  2. Use AWS SAM CLI to deploy the app, following the guided prompts:
sam deploy --guided
  3. At the end of the deployment, AWS SAM CLI will print an Outputs section. Make note of the WebSocketURI; it should look something like this:
CloudFormation outputs from deployed stack
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Outputs
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Key                 HandlerFunctionFunctionArn
Description         HandlerFunction function ARN
Value               arn:aws:lambda:us-west-2:000000000000:function:picollm-lambda-HandlerFunction-ABC123DEF098

Key                 WebSocketURI
Description         The WSS Protocol URI to connect to
Value               wss://ABC123DEF098.execute-api.us-west-2.amazonaws.com/Prod
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
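
If you need the WebSocketURI again later, the stack outputs can be read back from CloudFormation. Below is a minimal sketch using boto3; the stack name picollm-lambda is a placeholder, use whatever name you chose during sam deploy --guided.

import boto3

# Look up the stack created by `sam deploy --guided`.
cloudformation = boto3.client("cloudformation")
stack = cloudformation.describe_stacks(StackName="picollm-lambda")["Stacks"][0]

# Print every output, including WebSocketURI.
for output in stack["Outputs"]:
    print(f'{output["OutputKey"]}: {output["OutputValue"]}')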

Note

If you make any changes to the model, the Dockerfile, or app.py, you will need to repeat these deployment steps.

Chat!

  1. Run client.py, passing in the WebSocketURI copied from the deployment step:
python client.py -u <WebSocket URL>
  2. Once connected, the client will give you a prompt. Type in your chat message and picoLLM will stream back a response from the Lambda!
> What is the capital of France?
< The capital of France is Paris.

< [Completion finished @ `6.35` tps]
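
Under the hood, client.py essentially just relays text over the WebSocket. The sketch below is illustrative only and assumes the websockets package; the real client, including the exact message format, lives in client.py.

import asyncio
import sys

import websockets  # pip install websockets


async def chat(uri: str) -> None:
    # Connect to the WSS endpoint printed by `sam deploy` (WebSocketURI).
    # The real client.py takes the URL via -u; a positional arg is used here for brevity.
    async with websockets.connect(uri) as ws:
        while True:
            message = input("> ")
            await ws.send(message)              # the real client may wrap this in JSON
            while True:
                reply = await ws.recv()          # tokens / status lines streamed back
                print(reply, end="", flush=True)
                if "Completion finished" in reply:   # illustrative end-of-turn check
                    print()
                    break


if __name__ == "__main__":
    asyncio.run(chat(sys.argv[1]))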

Important

When you first send a message, you may get the following response: < [Lambda is loading & caching picoLLM. Please wait...]. This means picoLLM is loading the model as Lambda streams it from the Elastic Container Registry. Because of the nature and limitations of AWS Lambda, this process may take upwards of a few minutes. Subsequent messages and connections will not take as long to load, as Lambda will cache the layers.