Litellm

  • 简要说明

    • LiteLLM 是一种创新型代理,通过提供符合 OpenAI API 规范的统一标准,简化了将各种大型语言模型(LLM)集成到应用程序中的过程
    • 它允许在不同的 LLM 提供商(如 Azure OpenAI、Ollama、OpenAI、Cohere 和 Gemini)之间无缝切换
    • 这种灵活性支持各种用例,包括自定义检索增强生成(RAG)、代理任务和普通聊天交互,提高了部署 LLM 驱动的解决方案的效率和适应性
  • 主要特点

    • 统一的API接口:用户可以通过单一的接口访问和管理多种LLM服务
    • 负载均衡:自动分配请求到不同的LLM服务,确保系统稳定性和响应速度
    • 消费跟踪:实时监控LLM服务的使用情况,帮助用户控制成本
    • 自定义日志和限制:用户可以根据项目需求自定义日志记录和设置预算、速率限制
    • 重试和回退逻辑:在多个部署(如Azure/OpenAI)之间提供重试和回退机制,确保服务连续性
  • 部署架构

    • 说明
    litellm + postgress + redis
    • 官方参考
    https://docs.litellm.ai/docs/proxy/deploy
    https://github.com/BerriAI/litellm/tree/main

9.1 Docker

  • 操作如下

    • docker-compose.yaml , 仅限测试用
    version: "3.5"
    services:
    litellm:
      build:
        context: .
        args:
          target: runtime
      image: ghcr.io/berriai/litellm:main-stable
      ports:
        - "4000:4000"
      environment:
          DATABASE_URL: "postgresql://llmproxy:dbpassword9090@db:5432/litellm"
          STORE_MODEL_IN_DB: "True"
      volumes:
        - ./config.yaml:/app/config.yaml
      env_file:
        - env_config
    
    db:
      image: postgres
      restart: always
      environment:
        POSTGRES_DB: litellm
        POSTGRES_USER: llmproxy
        POSTGRES_PASSWORD: dbpassword9090
      healthcheck:
        test: ["CMD-SHELL", "pg_isready -d litellm -U llmproxy"]
        interval: 1s
        timeout: 5s
        retries: 10
    • env_config
    ############
    # Secrets
    # YOU MUST CHANGE THESE BEFORE GOING INTO PRODUCTION
    ############
    
    LITELLM_MASTER_KEY="sk-1234"
    
    ############
    # Database - You can change these to any PostgreSQL database that has logical replication enabled.
    ############
    
    # DATABASE_URL="your-postgres-db-url" 
    
    ############
    # User Auth - SMTP server details for email-based auth for users to create keys 
    ############
    
    # SMTP_HOST = "fake-mail-host"
    # SMTP_USERNAME = "fake-mail-user"
    # SMTP_PASSWORD="fake-mail-password"
    # SMTP_SENDER_EMAIL="fake-sender-email"
    • config.yaml
    # 配置redis信息
      router_settings:
        redis_host: 192.168.4.252 
        redis_password: 26eef3LjoNF
        redis_port: 6389
    • 创建容器
    docker-compose up -d

9.2 Kubernetes

  • 操作如下

    • configmap-litellm.yaml
    apiVersion: v1
    kind: ConfigMap
    metadata:
    name: litellm-config
    namespace: dify
    data:
    config.yaml: |
      router_settings:
        redis_host: 192.168.4.252 
        redis_password: 26eef3LjoNF
        redis_port: 6389
    • secrets.yaml
    apiVersion: v1
    kind: Secret
    metadata:
    name: dify-credentials
    namespace: dify
    data:
    # check Base64 online https://base64.us/ 
    # DATABASE_URL postgresql://postgres:difyai123456@192.168.4.252:5434/litellm
    litellm-pg-url: cG9zdGdyZXNxbDovL3Bvc3RncmVzOmRpZnlhaTEyMzQ1NkAxOTIuMTY4LjQuMjUyOjU0MzQvbGl0ZWxsbQ==
    # LITELLM_MASTER_KEY sk-ctv2pgOFXT
    litellm-master-key: c2stY3R2MnBnT0ZYVA==
    type: Opaque
    • deployment-litellm.yaml
    apiVersion: apps/v1
    kind: Deployment
    metadata:
    name: dify-litellm
    namespace: dify
    spec:
    replicas: 1
    selector:
      matchLabels:
        app: litellm
    strategy:
      rollingUpdate:
        maxSurge: 50%
        maxUnavailable: 0
      type: RollingUpdate
    template:
      metadata:
        labels:
          app: litellm
      spec:
        imagePullSecrets:
        - name: public-docker-registry
        containers:
          - name: dify-litellm
            image: registry.dachensky.com/litellm:main-stable
            imagePullPolicy: Always
            env:
              - name: STORE_MODEL_IN_DB
                value: 'true'
              - name: LITELLM_MASTER_KEY
                valueFrom:
                  secretKeyRef:
                    name: dify-credentials
                    key: litellm-master-key
              - name: DATABASE_URL
                valueFrom:
                  secretKeyRef:
                    name: dify-credentials
                    key: litellm-pg-url
            args:
              - "--config"
              - "/app/config.yaml"
            volumeMounts:
              - name: config-volume
                mountPath: /app
                readOnly: true
            resources:
              requests:
                cpu: 100m
                memory: 256Mi
              limits:
                cpu: 200m
                memory: 512Mi
            ports:
            - name: http
              containerPort: 4000
            livenessProbe:
              httpGet:
                path: /health/liveliness
                port: http
              initialDelaySeconds: 120
              periodSeconds: 15
              successThreshold: 1
              failureThreshold: 3
              timeoutSeconds: 10
            readinessProbe:
              httpGet:
                path: /health/readiness
                port: http
              initialDelaySeconds: 120
              periodSeconds: 15
              successThreshold: 1
              failureThreshold: 3
              timeoutSeconds: 10
        volumes:  # Define volume to mount proxy_config.yaml
          - name: config-volume
            configMap:
              name: litellm-config 
    
    ---
    
    apiVersion: v1
    kind: Service
    metadata:
    name: dify-litellm
    namespace: dify
    spec:
    selector:
      app: litellm
    ports:
      - protocol: TCP
        port: 4000
        targetPort: 4000
    type: ClusterIP
    clusterIP: None
    • 至此,已完成整个litellm在k8s集群交付