In this tutorial we will package and deploy a simple model that exposes an HTTP API and serves predictions from a device managed by Synpse.
What is Lightning Flash?
Flash is a high-level deep learning framework for fast prototyping, baselining, finetuning, and solving deep learning problems. It features a set of tasks you can use for inference and finetuning out of the box, and an easy-to-implement API for customizing every step of the process for full flexibility.
Flash is built for beginners, with a simple API that requires very little deep learning background, and for data scientists, Kagglers, applied ML practitioners, and deep learning researchers who want a quick way to get a deep learning baseline with the advanced features PyTorch Lightning offers.
You can view Lightning Flash's quick start here: https://lightning-flash.readthedocs.io/en/latest/quickstart.html.
Prerequisites
- Synpse account - free for up to 5 devices
- At least one device registered to your Synpse account
- Docker
- Python environment
Building the model
The Lightning Flash public repository has plenty of serving examples here: https://github.com/PyTorchLightning/lightning-flash/tree/master/flash_examples/serve. I decided to go with image_classification, as it was important to me to have some kind of service that could differentiate between ants and bees. You can read more about the model in the image classification section of the Flash documentation.
The repository with the code for this tutorial can be found here: https://github.com/synpse-hq/synpse-lightning-flash-example.
Step 1: Clone example repo
git clone https://github.com/synpse-hq/synpse-lightning-flash-example.git
Running the server locally
What surprised me a lot was how easy it is to start serving with Flash. Open the image_classifier.py file in your favorite editor:
from flash.image import ImageClassifier
# Our downloaded weights
model = ImageClassifier.load_from_checkpoint("./image_classification_model.pt")
# Binding to all interfaces (we will need that so it works in Docker container)
model.serve(host="0.0.0.0")
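The script expects the checkpoint file to sit next to it; if you cloned the example repo, it should already be there. If not, here is a minimal sketch for fetching pretrained weights first. Note that the URL is the one used by the upstream flash_examples at the time of writing and is an assumption here, so verify it still resolves:

# Hypothetical helper: download the pretrained checkpoint before serving.
# The URL comes from the upstream flash_examples and may have moved.
import urllib.request
from pathlib import Path

CKPT_URL = "https://flash-weights.s3.amazonaws.com/image_classification_model.pt"
CKPT_PATH = Path("./image_classification_model.pt")

if not CKPT_PATH.exists():
    print(f"Downloading weights to {CKPT_PATH} ...")
    urllib.request.urlretrieve(CKPT_URL, str(CKPT_PATH))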
Now, to start it, we will need to install several dependencies that will help with image classification and serving:
pip install -r requirements.txt
To start the server locally:
python image_classifier.py
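The first start can take a little while as the serving stack spins up. If you are scripting these steps, a small readiness probe helps; this is a hypothetical helper (not part of the repo), assuming the server listens on the default port 8000:

# Sketch: wait until the Flash server accepts TCP connections on port 8000.
import socket
import time

def wait_for_server(host="127.0.0.1", port=8000, timeout=60):
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1):
                return True
        except OSError:
            time.sleep(1)
    return False

print("server ready" if wait_for_server() else "server did not come up")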
Step 2: Trying out with the client
While the server is running, use the client to make HTTP requests:
import base64
from pathlib import Path

import requests

with (Path("./assets") / "ant.jpg").open("rb") as f:
    imgstr = base64.b64encode(f.read()).decode("UTF-8")

body = {"session": "UUID", "payload": {"inputs": {"data": imgstr}}}
resp = requests.post("http://127.0.0.1:8000/predict", json=body)
print(resp.json())
To run it:
python client.py
{'session': 'UUID', 'result': {'outputs': 'ants'}}
I have added both bee.jpg and ant.jpg files to the assets/ directory so feel free to try both :)
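If you would rather not edit client.py between runs, here is a small variation (a sketch, not part of the repo) that classifies both sample images in one go:

# Sketch: classify both bundled sample images against the running server.
import base64
from pathlib import Path

import requests

for name in ("ant.jpg", "bee.jpg"):
    imgstr = base64.b64encode((Path("./assets") / name).read_bytes()).decode("UTF-8")
    body = {"session": "UUID", "payload": {"inputs": {"data": imgstr}}}
    resp = requests.post("http://127.0.0.1:8000/predict", json=body)
    print(name, "->", resp.json()["result"]["outputs"])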
Step 3: Building and publishing Docker images with models
We can simply embed the code together with the model weights into a Docker image. You can build your own with:
docker build -t <your docker username>/synpse-lighning-flash:latest -f Dockerfile .
docker push <your docker username>/synpse-lighning-flash:latest
Or you can just use the one I have built and published: karolisr/synpse-lighning-flash:latest.
Step 4: Preparing the device
For several years now I have been using a combination of Raspberry Pis and an Intel NUC to run various background services such as Home Assistant, Node-RED, Drone, and Pi-hole. The NUC is a very quiet machine and actually performs really well:
Installation instructions can be found here but the short version is:
- Create a project in Synpse main page
- Go to Devices
- Click on Provision Device
- Copy and paste the command to your device
This command will figure out the device architecture, download the correct binary, and register your device to the service. Once registered, your device will appear in the list:
Click on the device menu or device details and add a label "type: controller". You can put anything you like here, but it will have to match the application specification later on.
Step 5: Deploy Flash serving to the device
Deploying will feel familiar if you have used Docker Compose files before, as the application spec is similar:
name: synpse-flash
scheduling:
  type: Conditional
  # Selecting my device. This could be targeting hundreds or thousands
  # of devices in industrial applications
  selectors:
    type: controller
spec:
  containers:
    - name: classifier
      # Docker image name
      image: karolisr/synpse-lighning-flash:latest
      # Optionally create a secret with your DockerHub password. I need this
      # as my IP pulls a lot of images. This can also be used if you are using
      # a private registry
      auth:
        username: karolisr
        fromSecret: dockerPassword
      # This time, we are exposing a custom port and not the default 8000
      ports:
        - 9950:8000
Once deployed, you can view application status and logs:
Step 6: Run prediction against the deployed model
One way would be to call the model directly on the device's IP address at port 9950, which we exposed in the spec above. Alternatively, Synpse can proxy the port to your own machine. To open a proxy tunnel, run in one terminal:
synpse device proxy nuc 8000:9950
Then you can run the client again and make predictions as if the model were running on your own machine.
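Because the tunnel maps local port 8000 to port 9950 on the device, client.py works unchanged. If you want one script that can target either endpoint, here is a hypothetical tweak (the PREDICT_URL environment variable is my own addition, not part of the example repo):

# Sketch: make the prediction endpoint configurable via an environment variable.
import base64
import os
from pathlib import Path

import requests

url = os.environ.get("PREDICT_URL", "http://127.0.0.1:8000/predict")
imgstr = base64.b64encode((Path("./assets") / "ant.jpg").read_bytes()).decode("UTF-8")
body = {"session": "UUID", "payload": {"inputs": {"data": imgstr}}}
print(requests.post(url, json=body).json())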
We can see predictions being logged in the Synpse dashboard too:
Next steps
Feel free to check out other features of Lightning Flash, like the new Flash Zero; it's surprisingly easy to use (after spending some time with TensorFlow, I really appreciate the simplicity).
For model training, you should definitely check out grid.ai. There are some good articles out there on setting up the environment; they offer free credits, as well as sessions that come with things like GPU dependencies preinstalled.