This document explains how LLM service providers can dynamically register their service instances with the LLM Gateway through a Nacos registry. When you follow these guidelines, the gateway can automatically discover your service and apply appropriate routing, retry, and fallback strategies based on the metadata you provide.
The core mechanism of service discovery is that your LLM service registers as a Nacos instance and provides a specific set of metadata upon registration. The LLM Gateway listens for service changes in Nacos, reads this metadata, and dynamically converts it into a fully functional gateway endpoint configuration.
A basic Nacos registration request includes the following key information:
ServiceName: The name of your service collection (e.g., deepseek-service).
Ip & Port: The network address where your service instance listens for traffic.
Metadata: A collection of key-value pairs, which is crucial for configuring all gateway behaviors.
metadata Configuration Fields
All gateway-specific configurations are passed through the metadata field of the Nacos instance. Below are all the supported metadata keys and their descriptions.
cluster (string): Name of the cluster in the gateway that this endpoint should belong to. The gateway aggregates service instances with the same cluster name based on this value. E.g., "deepseek_cluster".
id (string): Identifier of this endpoint within its cluster. E.g., "deepseek-primary-instance-1".
ip (string): [...] 0.0.0.0. E.g., "203.0.113.55".
port (string representing an integer): Port registered by the Nacos instance itself. Its usage is similar to the ip field. E.g., "9000".
name (string): E.g., "DeepSeek V2 Chat (Primary)".
address (string): E.g., "api.deepseek.com".
llm-meta.fallback (string, "true" or "false"): E.g., "false".
llm-meta.api_key (string)
llm-meta.retry_policy.name (string): E.g., "NoRetry".
llm-meta.retry_policy.config (string, JSON format)
You can configure retry behavior by using a combination of llm-meta.retry_policy.name and llm-meta.retry_policy.config.
CountBased (llm-meta.retry_policy.name: "CountBased"): config takes the times field.
    times (integer): Number of retry attempts.
    Example:
        llm-meta.retry_policy.name: "CountBased"
        llm-meta.retry_policy.config: {"times": 2}
ExponentialBackoff (llm-meta.retry_policy.name: "ExponentialBackoff"): config takes the times, initialInterval, maxInterval, and multiplier fields.
    times (integer): Number of retries.
    initialInterval (string): Initial wait duration (e.g., "200ms").
    maxInterval (string): Maximum wait duration (e.g., "5s").
    multiplier (float): The multiplier factor for the delay.
    Example:
        llm-meta.retry_policy.name: "ExponentialBackoff"
        llm-meta.retry_policy.config: {"times": 3, "initialInterval": "200ms", "maxInterval": "5s", "multiplier": 2.0}
NoRetry (llm-meta.retry_policy.name: "NoRetry"): does not use the config field.
For detailed usage, please refer to our official samples.
The Pixiu configuration file must enable the llmregistrycenter adapter, as follows:
adapters:
  - id: test
    name: dgp.adapter.llmregistrycenter
    config:
      registries:
        nacos:
          protocol: nacos
          address: "127.0.0.1:8848"
          timeout: "5s"
          group: test_llm_registry_group
          namespace: public
This example demonstrates how to register a fully-featured LLM service instance using the Nacos Go SDK. The instance will be configured as the primary endpoint in the deepseek_cluster, using an exponential backoff retry policy, and will fall back to the next service in the cluster upon failure.
package main

import (
	"encoding/json"
	"log"

	"github.com/nacos-group/nacos-sdk-go/vo"
)

func main() {
	// ... (Code for creating the Nacos client is omitted here)
	// client, err := createNacosClient()

	// 1. Prepare the JSON configuration for the retry policy
	retryConfig := map[string]interface{}{
		"times": 3,
		"initialInterval": "200ms",
		"maxInterval": "8s",
		"multiplier": 2.5,
	}
	retryConfigJSON, _ := json.Marshal(retryConfig)

	// 2. Construct the metadata containing all gateway configurations
	metadata := map[string]string{
		// --- Core Endpoint Configuration ---
		"cluster": "deepseek_cluster",
		"id": "deepseek-primary",
		"name": "DeepSeek V2 Chat (Primary)",
		// Optional (use ip+port or address): The instance's IP and Port
		"ip": "203.0.113.55",
		"port": "9000",
		// Optional (use ip+port or address): address field
		"address": "api.deepseek.com",
		// --- LLM-Specific Metadata ---
		"llm-meta.fallback": "true",
		// API key as a plain string; the key name follows the metadata field list (llm-meta.api_key)
		"llm-meta.api_key": "key-xxxxxxxx",
		// --- Retry Policy Configuration ---
		"llm-meta.retry_policy.name": "ExponentialBackoff",
		"llm-meta.retry_policy.config": string(retryConfigJSON),
	}

	// 3. Register the Nacos instance
	// Note: The Ip and Port here are the actual listening addresses of the service instance,
	// while the ip and port in the metadata are the addresses you want the gateway to access.
	_, err := client.RegisterInstance(vo.RegisterInstanceParam{
		Ip: "192.168.1.10", // The service's internal IP
		Port: 8001, // The service's internal port
		ServiceName: "deepseek-service",
		GroupName: "DEFAULT_GROUP",
		Ephemeral: true,
		Healthy: true,
		Weight: 10,
		Metadata: metadata,
	})
	if err != nil {
		log.Fatalf("Failed to register service instance: %v", err)
	}
	log.Println("Service instance registered successfully!")
	// ...
}
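Because every metadata value is a plain string, a typo in a value such as the fallback flag or the retry config would only surface once the gateway reads it. The sketch below shows one way a provider might sanity-check the metadata map before calling RegisterInstance. The checkMetadata function is a hypothetical helper, not part of the gateway or the Nacos SDK. It only checks the value formats described in the field list and makes no assumption about which keys are required.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// checkMetadata validates the format of the keys that are present.
// It is an illustrative helper; the gateway may apply different or stricter rules.
func checkMetadata(meta map[string]string) error {
	// port is documented as a string representing an integer.
	if v, ok := meta["port"]; ok {
		if _, err := strconv.Atoi(v); err != nil {
			return fmt.Errorf("port %q is not an integer string", v)
		}
	}
	// llm-meta.fallback is documented as the string "true" or "false".
	if v, ok := meta["llm-meta.fallback"]; ok && v != "true" && v != "false" {
		return fmt.Errorf(`llm-meta.fallback must be "true" or "false", got %q`, v)
	}
	// Only the three documented retry policy names are accepted here.
	if name, ok := meta["llm-meta.retry_policy.name"]; ok {
		switch name {
		case "CountBased", "ExponentialBackoff", "NoRetry":
		default:
			return fmt.Errorf("unknown retry policy %q", name)
		}
	}
	// llm-meta.retry_policy.config is documented as a JSON string.
	if cfg, ok := meta["llm-meta.retry_policy.config"]; ok && !json.Valid([]byte(cfg)) {
		return fmt.Errorf("llm-meta.retry_policy.config is not valid JSON")
	}
	return nil
}

func main() {
	meta := map[string]string{
		"cluster":                      "deepseek_cluster",
		"port":                         "9000",
		"llm-meta.fallback":            "true",
		"llm-meta.retry_policy.name":   "ExponentialBackoff",
		"llm-meta.retry_policy.config": `{"times": 3, "initialInterval": "200ms", "maxInterval": "8s", "multiplier": 2.5}`,
	}
	if err := checkMetadata(meta); err != nil {
		fmt.Println("invalid metadata:", err)
		return
	}
	fmt.Println("metadata looks well-formed")
}
```

A check like this could run just before the registration call, so that a malformed value fails fast on the provider side instead of producing an unexpected endpoint configuration in the gateway.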