How to Deploy Kimi K2 on AWS ECS or EKS (Kubernetes) - Full Step-by-Step Guide
Option 1: Deploy Kimi K2 on AWS EKS (Elastic Kubernetes Service)
Prerequisites:
AWS account
IAM role with EKS and EC2 permissions
GPU-enabled EC2 instance types (e.g., p3.2xlarge)
kubectl, eksctl, and Helm installed
Docker & Git installed locally
Step 1: Create an EKS Cluster with GPU Nodes
eksctl create cluster \
  --name kimi-k2-cluster \
  --region us-east-1 \
  --nodegroup-name gpu-nodes \
  --node-type p3.2xlarge \
  --nodes 2 \
  --nodes-min 1 \
  --nodes-max 3 \
  --managed
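The deployment below requests an `nvidia.com/gpu` resource, which Kubernetes only advertises once the NVIDIA device plugin is running on the GPU nodes. If your node group's AMI does not already include it, install the DaemonSet. This is a sketch: the version tag and manifest path below are assumptions and may change between releases, so check the NVIDIA/k8s-device-plugin README for the current install command.

```shell
# Install the NVIDIA device plugin so pods can request nvidia.com/gpu
# (v0.14.1 and the manifest path are examples; use the latest release)
kubectl apply -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.14.1/nvidia-device-plugin.yml

# Verify GPUs now appear as allocatable resources on the nodes
kubectl get nodes -o jsonpath='{.items[*].status.allocatable.nvidia\.com/gpu}'
```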
Step 2: Build and Push Docker Image for Kimi K2
git clone https://huggingface.co/moonshotai/Kimi-K2-Instruct
cd Kimi-K2-Instruct
Dockerfile:
FROM nvidia/cuda:12.1.1-cudnn8-devel-ubuntu20.04
RUN apt update && apt install -y python3 python3-pip git
RUN pip3 install torch torchvision transformers accelerate huggingface_hub
WORKDIR /app
COPY . .
CMD ["python3", "app.py"]

Build the image and push it to Amazon ECR (replace <account-id> with your AWS account ID):

aws ecr create-repository --repository-name kimi-k2
aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin <account-id>.dkr.ecr.us-east-1.amazonaws.com
docker build -t kimi-k2 .
docker tag kimi-k2:latest <account-id>.dkr.ecr.us-east-1.amazonaws.com/kimi-k2:latest
docker push <account-id>.dkr.ecr.us-east-1.amazonaws.com/kimi-k2:latest
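The Dockerfile's CMD expects an `app.py`, but the cloned Hugging Face repo ships only model weights and config files, so you need to supply one. Below is a hedged minimal sketch using Gradio (you would add `gradio` to the `pip3 install` line); the loading arguments, dtype, and serving details are assumptions to tune for your hardware, not the model's official serving setup.

```python
# Hypothetical app.py: a minimal Gradio chat front end for the model.
# Heavy imports and model loading stay inside main() so the file can be
# inspected on a machine without a GPU.
import os


def main():
    import torch
    import gradio as gr
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # The Dockerfile's `COPY . .` places the cloned weights in /app,
    # so we load from the current directory.
    tokenizer = AutoTokenizer.from_pretrained(".", trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        ".", torch_dtype=torch.bfloat16, device_map="auto",
        trust_remote_code=True,
    )

    def chat(message, history):
        # Single-turn prompt for brevity; a real deployment would apply
        # the model's chat template and include the conversation history.
        inputs = tokenizer(message, return_tensors="pt").to(model.device)
        output = model.generate(**inputs, max_new_tokens=256)
        return tokenizer.decode(
            output[0][inputs["input_ids"].shape[1]:],
            skip_special_tokens=True,
        )

    # Serve on 0.0.0.0:7860 to match containerPort in the manifests.
    gr.ChatInterface(chat).launch(server_name="0.0.0.0", server_port=7860)


# Only start the server when the model files are actually present,
# as they are inside the container's /app directory.
if __name__ == "__main__" and os.path.exists("config.json"):
    main()
```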
Step 3: Create Kubernetes YAML Deployment Files
deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kimi-k2
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kimi-k2
  template:
    metadata:
      labels:
        app: kimi-k2
    spec:
      containers:
        - name: kimi-k2
          image: <your-ecr-repo-url>
          resources:
            limits:
              nvidia.com/gpu: 1
          ports:
            - containerPort: 7860
service.yaml:
apiVersion: v1
kind: Service
metadata:
  name: kimi-k2-service
spec:
  type: LoadBalancer
  selector:
    app: kimi-k2
  ports:
    - protocol: TCP
      port: 80
      targetPort: 7860

Apply both manifests:

kubectl apply -f deployment.yaml
kubectl apply -f service.yaml
Step 4: Access Kimi K2 on Public IP
kubectl get svc kimi-k2-service
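The LoadBalancer service's EXTERNAL-IP column shows the ELB hostname AWS provisions (it can take a few minutes to appear). A sketch for fetching it and probing the endpoint, assuming the manifests above were applied unchanged:

```shell
# Read the ELB hostname assigned to the LoadBalancer service
EXTERNAL_IP=$(kubectl get svc kimi-k2-service \
  -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')

# Port 80 forwards to the container's 7860, so the app answers here
curl -I "http://${EXTERNAL_IP}/"
```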
Option 2: Deploy Kimi K2 on AWS ECS (Elastic Container Service)
Prerequisites:
AWS CLI configured
IAM roles for ECS + ECR
Docker installed
ECS Fargate or EC2 cluster created
ECR repository created
Step 1: Build Docker Image
Same steps as above. Push image to Amazon ECR.
Step 2: Create ECS Task Definition
task-definition.json:
{
  "family": "kimi-k2-task",
  "containerDefinitions": [
    {
      "name": "kimi-k2",
      "image": "<account-id>.dkr.ecr.us-east-1.amazonaws.com/kimi-k2:latest",
      "memory": 30720,
      "cpu": 2048,
      "essential": true,
      "portMappings": [
        {
          "containerPort": 7860,
          "hostPort": 7860
        }
      ]
    }
  ],
  "requiresCompatibilities": ["EC2"],
  "networkMode": "bridge",
  "cpu": "2048",
  "memory": "30720"
}

Register the task definition:

aws ecs register-task-definition --cli-input-json file://task-definition.json
Step 3: Run Task on ECS Cluster
aws ecs run-task \
  --cluster kimi-k2-cluster \
  --launch-type EC2 \
  --task-definition kimi-k2-task \
  --count 1
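`run-task` returns immediately, before the container has started. A quick sketch for confirming the task reaches RUNNING (cluster name as above; assumes at least one task has been launched):

```shell
# Grab the newest task ARN on the cluster
TASK_ARN=$(aws ecs list-tasks --cluster kimi-k2-cluster \
  --query 'taskArns[0]' --output text)

# Check its lifecycle status (PENDING -> RUNNING when healthy)
aws ecs describe-tasks --cluster kimi-k2-cluster --tasks "$TASK_ARN" \
  --query 'tasks[0].lastStatus'
```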
Final Thoughts
Deploying Kimi K2 on EKS or ECS gives you the power to scale open-source LLMs efficiently in the cloud.
Kubernetes adds autoscaling, GPU scheduling, and production-grade LLM APIs, all while keeping you in control of your infrastructure.
Need enterprise-grade deployment or DevOps help? Contact our AI DevOps experts at OneClick IT Consultancy.