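Observability for Baseten with Litefuse

This guide shows how to integrate Baseten with Litefuse. Baseten's inference API is fully compatible with OpenAI's client libraries, so we can use Litefuse's OpenAI drop-in replacement to trace every model call your application makes.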

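What is Baseten? Baseten is an inference platform that lets developers deploy and scale machine learning models in production. It provides fast, reliable model inference through an OpenAI-compatible API and supports popular open-source models.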

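What is Litefuse? Litefuse is an open-source observability and evaluation platform for AI agents. It helps teams trace API calls, monitor performance, and debug issues in their AI applications.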

Step 1: Install Dependencies

Make sure the required Python packages are installed:

%pip install openai langfuse -q
 

Step 2: Set Up Environment Variables

import os
 
# Get keys for your project from the project settings page: https://litefuse.cloud
os.environ["LANGFUSE_PUBLIC_KEY"] = "pk-lf-..." 
os.environ["LANGFUSE_SECRET_KEY"] = "sk-lf-..." 
os.environ["LANGFUSE_BASE_URL"] = "https://litefuse.cloud"
 
# Get your Baseten API key from https://app.baseten.co/settings/api_keys
os.environ["BASETEN_API_KEY"] = "..."
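Before moving on, it can help to confirm that none of these variables is missing or still set to a placeholder. The following is a minimal sketch, not part of either SDK: the helper name find_unset_vars is hypothetical, and it assumes that placeholder values end with "..." as they do in this guide.

```python
import os

# Variable names taken from the setup code in this guide.
REQUIRED_VARS = (
    "LANGFUSE_PUBLIC_KEY",
    "LANGFUSE_SECRET_KEY",
    "LANGFUSE_BASE_URL",
    "BASETEN_API_KEY",
)


def find_unset_vars(env=None, required=REQUIRED_VARS):
    """Return the names that are missing, empty, or still placeholders."""
    env = os.environ if env is None else env
    problems = []
    for name in required:
        value = env.get(name, "").strip()
        # Assumption: a value ending in "..." is an unedited placeholder.
        if not value or value.endswith("..."):
            problems.append(name)
    return problems


unset = find_unset_vars()
if unset:
    print("Set these variables before continuing:", ", ".join(unset))
```

The check only looks at whether values are present. It does not verify that the keys are accepted by Litefuse or Baseten.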
 

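Step 3: Litefuse OpenAI Drop-in Replacement

In this step, we use the native OpenAI drop-in replacement by importing it with from langfuse.openai import openai.

To use Baseten through the OpenAI client library, pass your Baseten API key to the api_key option and change base_url to https://inference.baseten.co/v1: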

# instead of import openai:
from langfuse.openai import openai
 
client = openai.OpenAI(
  api_key=os.environ.get("BASETEN_API_KEY"),
  base_url="https://inference.baseten.co/v1",
)

Step 4: Run an Example

The following code calls a Baseten chat model through the traced OpenAI client. Litefuse traces every API call automatically.

response = client.chat.completions.create(
  model="zai-org/GLM-4.6",
  messages=[
    {"role": "system", "content": "You are a travel agent. Be descriptive and helpful."},
    {"role": "user", "content": "Tell me the top 3 things to do in San Francisco"},
  ],
  name="baseten-example-trace"
)
 
print(response.choices[0].message.content)

Step 5: View the Trace in Litefuse

After running the example model call, you can view the trace in Litefuse. It shows detailed information about the Baseten API call, including:

  • Request parameters (model, messages, temperature, etc.)
  • Response content
  • Token usage statistics
  • Latency metrics

Example trace in Litefuse

Link to a public example trace in Litefuse

Interoperability with the Python SDK

You can use this integration together with the Litefuse SDKs to add additional attributes to the observation.

The @observe() decorator provides a convenient way to automatically wrap your instrumented code and add additional attributes to the observation.

from langfuse import observe, propagate_attributes, get_client
 
langfuse = get_client()
 
@observe()
def my_llm_pipeline(input):
    # Add additional attributes (user_id, session_id, metadata, version, tags) to all spans created within this execution scope
    with propagate_attributes(
        user_id="user_123",
        session_id="session_abc",
        tags=["agent", "my-observation"],
        metadata={"email": "user@litefuse.ai"},
        version="1.0.0"
    ):
 
        # YOUR APPLICATION CODE HERE
        result = call_llm(input)
 
        return result
 
# Run the function
my_llm_pipeline("Hi")

Learn more about using the Decorator in the Langfuse SDK instrumentation docs.

Troubleshooting

No observations appearing

First, enable debug mode in the Python SDK:

export LANGFUSE_DEBUG="True"
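In a notebook, where a shell export is not available, the same variable can be set from Python in the way the other environment variables in this guide are set. This is a small sketch that assumes the SDK reads LANGFUSE_DEBUG when the client is first created, so it should run before the langfuse import and before any client is initialized.

```python
import os

# Assumption: the SDK picks this up at client initialization,
# so set it before importing langfuse or creating a client.
os.environ["LANGFUSE_DEBUG"] = "True"
```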

Then run your application and check the debug logs:

  • OTel observations appear in the logs: Your application is instrumented correctly but observations are not reaching Litefuse. To resolve this:
    1. Call langfuse.flush() at the end of your application to ensure all observations are exported.
    2. Verify that you are using the correct API keys and base URL.
  • No OTel spans in the logs: Your application is not instrumented correctly. Make sure the instrumentation runs before your application code.

Unwanted observations in Litefuse

The Langfuse SDK is based on OpenTelemetry. Other libraries in your application may emit OTel spans that are not relevant to you. These still count toward your billable units, so you should filter them out. See Unwanted spans in Litefuse for details.

Missing attributes

Some attributes may be stored in the metadata object of the observation rather than being mapped to the Litefuse data model. If a mapping or integration does not work as expected, please raise an issue on GitHub.

Next Steps

Once you have instrumented your code, you can manage, evaluate and debug your application.
