# Deploy models through CRD

## Deploy the model
- Create one `Model` CRD to rule them all.
**What is CRD?**

CRD stands for Custom Resource Definition, a Kubernetes mechanism that extends the Kubernetes API by allowing users to define their own resource types. Services called Operators and Controllers manage these custom resources to deploy, manage, and monitor applications in a Kubernetes cluster.

Ollama Operator manages the deployment and operation of large language models through a CRD with the API version `ollama.ayaka.io/v1` and the kind `Model`:
```yaml
apiVersion: ollama.ayaka.io/v1
kind: Model
metadata:
  name: phi
spec:
  image: phi
```
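The same `Model` manifest can also be built and submitted programmatically. Below is a minimal sketch using the official Kubernetes Python client (`pip install kubernetes`); the resource plural `models` and the `default` namespace are assumptions on my part — confirm the plural with `kubectl api-resources`.

```python
def phi_model_manifest() -> dict:
    """Build the same Model manifest shown in the YAML above."""
    return {
        "apiVersion": "ollama.ayaka.io/v1",
        "kind": "Model",
        "metadata": {"name": "phi"},
        "spec": {"image": "phi"},
    }


def create_phi_model(namespace: str = "default") -> None:
    """Submit the manifest via the Kubernetes Python client.

    Requires `pip install kubernetes` and a working kubeconfig.
    """
    from kubernetes import client, config

    config.load_kube_config()
    api = client.CustomObjectsApi()
    api.create_namespaced_custom_object(
        group="ollama.ayaka.io",
        version="v1",
        namespace=namespace,
        plural="models",  # assumption: verify with `kubectl api-resources`
        body=phi_model_manifest(),
    )
```

Calling `create_phi_model()` is equivalent to `kubectl apply`-ing the YAML above against the current kubeconfig context.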
**Working with `kind`?**

The default provisioned `StorageClass` in `kind` is `standard`, which only supports the `ReadWriteOnce` access mode. Therefore, if you need to run the operator with `kind`, you should specify `persistentVolume` with `accessMode: ReadWriteOnce` in the `Model` CRD:
```yaml
apiVersion: ollama.ayaka.io/v1
kind: Model
metadata:
  name: phi
spec:
  image: phi
  persistentVolume:
    accessMode: ReadWriteOnce
```
Copy the following command to create a `phi` `Model` CRD file:

```shell
cat <<EOF > ollama-model-phi.yaml
apiVersion: ollama.ayaka.io/v1
kind: Model
metadata:
  name: phi
spec:
  image: phi
  persistentVolume:
    accessMode: ReadWriteOnce
EOF
```
or you can create your own file:

```yaml
apiVersion: ollama.ayaka.io/v1
kind: Model
metadata:
  name: phi
spec:
  image: phi
```
- Apply the `Model` CRD to your Kubernetes cluster:

  ```shell
  kubectl apply -f ollama-model-phi.yaml
  ```
- Wait for the `Model` to be ready:

  ```shell
  kubectl wait --for=jsonpath='{.status.readyReplicas}'=1 deployment/ollama-model-phi
  ```
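The `kubectl wait` condition above watches a single JSONPath field, `.status.readyReplicas`. As a minimal sketch of what that check does, here is the same test expressed in Python over the JSON that `kubectl get deployment ollama-model-phi -o json` would print:

```python
import json


def is_ready(deployment_json: str, want_replicas: int = 1) -> bool:
    """True once .status.readyReplicas reaches the desired count.

    Mirrors the JSONPath condition used by the `kubectl wait` command.
    """
    status = json.loads(deployment_json).get("status", {})
    return status.get("readyReplicas", 0) >= want_replicas
```

For example, `is_ready('{"status": {"readyReplicas": 1}}')` is `True`, while a deployment whose status has no `readyReplicas` yet is not ready.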
- Ready! Now let's forward the ports to access the model:

  ```shell
  kubectl port-forward svc/ollama-model-phi 11434:11434
  ```
- Interact with the model:

  ```shell
  ollama run phi
  ```

  or use the OpenAI API compatible endpoint:

  ```shell
  curl http://localhost:11434/v1/chat/completions -H "Content-Type: application/json" -d '{
    "model": "phi",
    "messages": [
      {
        "role": "user",
        "content": "Hello!"
      }
    ]
  }'
  ```
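If you prefer to call the OpenAI-compatible endpoint from code, here is a minimal standard-library Python sketch of the same request as the curl example above. It assumes the port-forward is active on `localhost:11434`; the helper names are illustrative, not part of any API.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/v1/chat/completions"


def chat_request(prompt: str, model: str = "phi") -> urllib.request.Request:
    """Build the same chat-completion request as the curl example."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )


def ask(prompt: str) -> str:
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(chat_request(prompt)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the port-forward running, `print(ask("Hello!"))` prints the model's reply.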