🧠 Building an Anthropic API Relay with Python + Docker (Field Notes)

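In one sentence

Put a thin, transparent proxy in front of an upstream LLM API and run it in Docker, so it is close at hand for your own use and gives you one place to manage keys and add logging.

I put this service in front of MiniMax's Anthropic-compatible endpoint. Two years ago I built a heavyweight adapter middleware just to connect to Claude;
most scenarios need nothing that heavy. FastAPI + httpx + Docker is enough. I am writing it down for future reuse.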

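What it's for

This is not meant to be a "product". It is a transparent forwarding layer for my own use. Typical scenarios:

Claude Code cannot reach the official endpoint (network reasons)
You want API keys kept in one place on an internal server instead of scattered across every dev machine
You want to insert a layer that records request latency and usage without changing the API
You want one external base_url, so switching model vendors needs no client code changes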

Architecture

Claude Code / curl
        |
   Nginx (your domain / SSL termination)
        |
   FastAPI (Docker, internal :5066)
        |
   api.minimaxi.com/anthropic (upstream)

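Nginx handles SSL and reverse proxying; FastAPI only passes traffic through. Nginx is not strictly required,
but with it in place, switching domains or configuring certificates never touches the Docker container.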

Two files do the job

app.py

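from fastapi import FastAPI, Request
from fastapi.responses import PlainTextResponse, Response, StreamingResponse
from starlette.background import BackgroundTask
import httpx

app = FastAPI()

UPSTREAM = "https://api.minimaxi.com/anthropic"

# One shared client so upstream connections are reused. timeout=None because
# a streamed completion can stay open for a long time.
client = httpx.AsyncClient(timeout=None)


@app.get("/")
async def index():
    return PlainTextResponse("test")


@app.api_route("/v1/{path:path}", methods=["POST", "GET", "OPTIONS"])
async def proxy(path: str, request: Request):
    body = await request.body()
    headers = dict(request.headers)
    # Drop headers that httpx has to set itself for the upstream connection.
    headers.pop("host", None)
    headers.pop("content-length", None)
    headers.pop("connection", None)

    url = f"{UPSTREAM}/v1/{path}"

    req = client.build_request(
        method=request.method,
        url=url, content=body,
        headers=headers, params=request.query_params
    )
    # stream=True returns once the upstream headers arrive, so chunks can be
    # relayed as they come in. client.request() would buffer the whole body.
    resp = await client.send(req, stream=True)
    content_type = resp.headers.get("content-type")
    if resp.status_code >= 400:
        # Error bodies are small: read fully and pass through unchanged.
        error_body = await resp.aread()
        await resp.aclose()
        return Response(
            error_body, status_code=resp.status_code, media_type=content_type
        )
    return StreamingResponse(
        resp.aiter_bytes(),
        status_code=resp.status_code,
        media_type=content_type,
        background=BackgroundTask(resp.aclose)  # release the connection afterwards
    )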

Dockerfile

FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY app.py .
EXPOSE 5066
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "5066"]

There are only three dependencies, listed in requirements.txt: fastapi, uvicorn, httpx.
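
The proxy above forwards whatever credentials the client sends. For the "keys in one place" scenario, one option is to swap in a server-side key before forwarding. The following is a minimal sketch, not part of the original service: the helper name upstream_headers and the environment variable UPSTREAM_API_KEY are hypothetical, and it assumes the upstream accepts the key in an x-api-key header, as the Anthropic API does.

```python
import os
from typing import Dict, Mapping

# Headers that should not be copied upstream, plus any client-side
# credentials that the relay replaces with its own key.
_DROP = {"host", "content-length", "connection", "authorization", "x-api-key"}


def upstream_headers(incoming: Mapping[str, str], api_key: str) -> Dict[str, str]:
    """Copy client headers, drop connection-specific ones, attach the server-side key."""
    headers = {k.lower(): v for k, v in incoming.items() if k.lower() not in _DROP}
    headers["x-api-key"] = api_key  # assumption: upstream reads the key from x-api-key
    return headers


# Hypothetical variable name; it could be passed with `docker run -e UPSTREAM_API_KEY=...`.
API_KEY = os.environ.get("UPSTREAM_API_KEY", "")
```

In proxy(), the three headers.pop(...) lines would then become headers = upstream_headers(request.headers, API_KEY). A relay that injects its own key should only be reachable from a network you trust, since anyone who can reach it can spend that key.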

Build and run

docker build -t mini-proxy .
docker run -d --name mini-proxy -p 5066:5066 --restart always mini-proxy

On the Nginx side, add a location block:

location / {
    proxy_pass http://127.0.0.1:5066;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_buffering off;          # must be off for SSE
    chunked_transfer_encoding on;
}

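A few things to watch

SSE streaming requires proxy_buffering off

This is the most common pitfall. Nginx buffers upstream responses by default, so the client may see nothing of an SSE stream until the buffer fills or the stream ends, and that is no longer streaming.
With buffering off, each chunk is forwarded as soon as it arrives.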
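
A quick way to check that buffering is really off is to look at when the chunks arrive on the client side. The sketch below is a rough heuristic, not a rigorous test: the function names and the 0.05 s tolerance are my own assumptions. It timestamps each chunk of any byte iterator and flags the stream as buffered if everything landed in a single burst.

```python
import time
from typing import Iterable, List


def arrival_times(chunks: Iterable[bytes]) -> List[float]:
    """Consume the iterator and return seconds-since-start for each chunk's arrival."""
    start = time.monotonic()
    return [time.monotonic() - start for _ in chunks]


def looks_buffered(times: List[float], tolerance: float = 0.05) -> bool:
    """True if all chunks arrived within `tolerance` seconds of each other,
    which is what a fully buffered response looks like from the client."""
    return len(times) > 1 and (times[-1] - times[0]) < tolerance
```

Against the relay, the iterator could come from httpx.stream("POST", ...) on a streaming /v1/messages request, passing response.iter_bytes() to arrival_times. A very short completion can also arrive in one burst, so use a prompt that produces a long answer.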

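Errors pass straight through

Errors are not wrapped. Whatever the upstream returns is returned as-is, with no extra JSON envelope added locally.
The client therefore sees the same status code and error body as on a direct connection and needs no second round of adaptation.

Why port 5066

No special reason; it simply does not clash with other services on the internal network.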

Logging design

If you add logging, I suggest recording only structured metadata:

Field	Example
path	/v1/messages
method	POST
status	200
latency_ms	320
user_agent	claude-code/1.0

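Do not log full request bodies or SSE content. Even on an internal network, the token stream may contain sensitive conversations. Path plus latency is enough.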
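
The fields in the table can be collected without ever touching a body. The following is a sketch under my own assumptions, not the logging the original service ran: the class name MetaLogMiddleware and the logger name proxy.access are hypothetical. It is written as a plain ASGI middleware using only the standard library, so it wraps the response without buffering the stream.

```python
import json
import logging
import time

logger = logging.getLogger("proxy.access")  # hypothetical logger name


class MetaLogMiddleware:
    """Plain ASGI middleware that logs request metadata and never reads bodies."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        start = time.perf_counter()
        status = {"code": 0}  # stays 0 if the app fails before responding

        async def send_wrapper(message):
            # Only the status line is inspected; body chunks pass through untouched.
            if message["type"] == "http.response.start":
                status["code"] = message["status"]
            await send(message)

        try:
            await self.app(scope, receive, send_wrapper)
        finally:
            headers = dict(scope.get("headers") or [])
            record = {
                "path": scope["path"],
                "method": scope["method"],
                "status": status["code"],
                "latency_ms": round((time.perf_counter() - start) * 1000),
                "user_agent": headers.get(b"user-agent", b"").decode("latin-1"),
            }
            logger.info(json.dumps(record))
```

In app.py this would be attached with app.add_middleware(MetaLogMiddleware), plus something like logging.basicConfig(level=logging.INFO) so the records are actually emitted. Note that latency_ms here covers the whole response, so for a streamed completion it is the time until the last chunk, not the time to the first token.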

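Summary

This pass-through proxy took three iterations to settle. The first version added authentication, rate limiting, and usage accounting, which was too heavy;
once I stripped them out, it actually ran more reliably.

The thinner the relay layer, the better: thin enough that you forget it exists.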