# Deploy models through CRD

## Deploy the model

- Create one `Model` CRD to rule them all.
> **What is CRD?**
>
> CRD stands for Custom Resource Definition, a mechanism that extends the Kubernetes API by allowing users to define their own resource types. Services called Operators and Controllers manage these custom resources to deploy, manage, and monitor applications in a Kubernetes cluster.
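For intuition, a CRD that registers such a `Model` resource type could look roughly like the following. This is a simplified, hypothetical sketch for illustration only, not the operator's actual manifest; it models just the `image` field:

```yaml
# Hypothetical, simplified sketch of a CRD registering a Model resource type.
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  # Name must be <plural>.<group>.
  name: models.ollama.ayaka.io
spec:
  group: ollama.ayaka.io
  names:
    kind: Model
    plural: models
    singular: model
  scope: Namespaced
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                image:
                  type: string
```

Once a CRD like this is installed, `kubectl` accepts `Model` objects just like built-in resources, and the operator's controller reconciles them into running workloads.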
Ollama Operator manages the deployment and operation of large language models through a CRD with API version `ollama.ayaka.io/v1` and kind `Model`:
```yaml
apiVersion: ollama.ayaka.io/v1
kind: Model
metadata:
  name: phi
spec:
  image: phi
```

> **Working with kind?**
>
> The default provisioned StorageClass in kind is `standard`, which only supports the `ReadWriteOnce` access mode. Therefore, if you need to run the operator with kind, specify `persistentVolume` with `accessMode: ReadWriteOnce` in the `Model` CRD:
```yaml
apiVersion: ollama.ayaka.io/v1
kind: Model
metadata:
  name: phi
spec:
  image: phi
  persistentVolume:
    accessMode: ReadWriteOnce
```

Copy the following command to create a phi `Model` CRD:
```shell
cat <<EOF >> ollama-model-phi.yaml
apiVersion: ollama.ayaka.io/v1
kind: Model
metadata:
  name: phi
spec:
  image: phi
  persistentVolume:
    accessMode: ReadWriteOnce
EOF
```

or you can create your own file:
```yaml
apiVersion: ollama.ayaka.io/v1
kind: Model
metadata:
  name: phi
spec:
  image: phi
```

- Apply the `Model` CRD to your Kubernetes cluster:
```shell
kubectl apply -f ollama-model-phi.yaml
```

- Wait for the `Model` to be ready:
```shell
kubectl wait --for=jsonpath='{.status.readyReplicas}'=1 deployment/ollama-model-phi
```

- Ready! Now let's forward the ports to access the model:
```shell
kubectl port-forward svc/ollama-model-phi ollama
```

- Interact with the model:
```shell
ollama run phi
```

or use the OpenAI API-compatible endpoint:
```shell
curl http://localhost:11434/v1/chat/completions -H "Content-Type: application/json" -d '{
  "model": "phi",
  "messages": [
    {
      "role": "user",
      "content": "Hello!"
    }
  ]
}'
```
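A hand-typed JSON body is easy to break. As a small sketch (assuming `python3` is available locally), you can keep the payload in a shell variable and validate it before sending the request:

```shell
# Store the request body in a variable so it can be checked and reused.
BODY='{
  "model": "phi",
  "messages": [
    { "role": "user", "content": "Hello!" }
  ]
}'

# json.tool exits non-zero on malformed JSON, catching typos before any request.
echo "$BODY" | python3 -m json.tool > /dev/null && echo "payload is valid JSON"

# Then send the same payload to the forwarded port:
# curl http://localhost:11434/v1/chat/completions \
#   -H "Content-Type: application/json" -d "$BODY"
```

This keeps the curl invocation short and lets you tweak the prompt in one place.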