Structured Outputs

What problem does it solve

Forcing the LLM to output JSON that conforms to a specified schema

The old approach

In the prompt, ask the LLM to output JSON and list the name and type of each field, for example:

Output a JSON object with the following fields:
- name: string
- age: number
- isFemale: boolean
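
The field list above does not have to be typed by hand. As a minimal sketch, it can be generated from a small field specification. The names FieldType and buildFormatPrompt are hypothetical and used only for illustration; they are not part of any library.

```typescript
// Hypothetical helper, for illustration only (not a library API).
type FieldType = "string" | "number" | "boolean";

// Turn a field specification into the format instructions used in the prompt.
function buildFormatPrompt(fields: Record<string, FieldType>): string {
  const lines = Object.entries(fields).map(([name, type]) => `- ${name}: ${type}`);
  return ["Output a JSON object with the following fields:", ...lines].join("\n");
}

const prompt = buildFormatPrompt({ name: "string", age: "number", isFemale: "boolean" });
console.log(prompt);
```

The generated text is appended to the system or user message. It only asks the model to follow the format; it does not guarantee that it will.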

LangChain provides output parsers that can generate this kind of prompt for you.

Problem 1

Even so, the LLM may still output something that is not valid JSON, or return fields other than the expected ones.
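
Because of this, code built on the prompt-only approach usually has to parse the reply defensively. A minimal sketch, assuming the reply arrives as a plain string (tryParseJsonObject is a hypothetical name):

```typescript
// Hypothetical helper: returns the parsed object, or null if the reply is not a JSON object.
function tryParseJsonObject(text: string): Record<string, unknown> | null {
  try {
    const value: unknown = JSON.parse(text);
    // Accept only plain objects, not arrays, numbers, strings or null.
    if (typeof value === "object" && value !== null && !Array.isArray(value)) {
      return value as Record<string, unknown>;
    }
    return null;
  } catch {
    // JSON.parse throws on anything that is not valid JSON, e.g. a chatty preamble.
    return null;
  }
}

const good = tryParseJsonObject('{"name": "Ann", "age": 30, "isFemale": true}');
const bad = tryParseJsonObject('Sure! Here is the JSON: {"name": "Ann"}');
console.log(good !== null, bad === null);
```

A null result is the signal to retry the request or report an error.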

The next solution

The OpenAI API introduced the "json_object" mode, which forces the LLM to return valid JSON.
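
As a sketch, a Chat Completions request body that enables this mode might look like the following. The message contents are invented for illustration, and nothing is sent here. Note that in this mode the messages themselves are expected to mention JSON.

```typescript
// Request body only; the message texts are illustrative assumptions.
const jsonModeRequest = {
  model: "gpt-4o-mini",
  messages: [
    {
      role: "system",
      content: "Reply with a JSON object with the fields name, age and isFemale.",
    },
    { role: "user", content: "Ann is a 30-year-old woman." },
  ],
  // JSON mode: the reply is valid JSON, but its fields are not constrained.
  response_format: { type: "json_object" },
};

console.log(JSON.stringify(jsonModeRequest, null, 2));
```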

Problem 2

The reply is now valid JSON, but the fields returned by the LLM may still not be the expected ones.
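
So even with JSON mode, the caller still has to check the fields. A minimal sketch of such a check for the name/age/isFemale example (the Person interface and isPerson guard are hypothetical names):

```typescript
// Hypothetical shape matching the fields requested in the prompt.
interface Person {
  name: string;
  age: number;
  isFemale: boolean;
}

// Type guard: true only if every expected field is present with the expected type.
function isPerson(value: unknown): value is Person {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.name === "string" &&
    typeof v.age === "number" &&
    typeof v.isFemale === "boolean"
  );
}

const okReply: unknown = JSON.parse('{"name": "Ann", "age": 30, "isFemale": true}');
const wrongReply: unknown = JSON.parse('{"name": "Ann", "age": "30"}');
console.log(isPerson(okReply), isPerson(wrongReply));
```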

The latest solution

OpenAI's Structured Outputs feature lets you pass an explicit JSON Schema in an API field, so that the LLM outputs data in exactly that format.

response_format: { "type": "json_schema", "json_schema": { "strict": true, "schema": … } }

You specify the format in the json_schema field, for example:

{
        type: "json_schema",
        json_schema: {
            name: "math_response",
            schema: {
                type: "object",
                properties: {
                    steps: {
                        type: "array",
                        items: {
                            type: "object",
                            properties: {
                                explanation: { type: "string" },
                                output: { type: "string" }
                            },
                            required: ["explanation", "output"],
                            additionalProperties: false
                        }
                    },
                    final_answer: { type: "string" }
                },
                required: ["steps", "final_answer"],
                additionalProperties: false
            },
            strict: true
        }
}

In Node.js there is a simpler way:

Define the schema

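import OpenAI from "openai";
import { z } from "zod";
import { zodResponseFormat } from "openai/helpers/zod";

// The client reads OPENAI_API_KEY from the environment by default.
const openai = new OpenAI();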

const Step = z.object({
  explanation: z.string(),
  output: z.string(),
});

const MathResponse = z.object({
  steps: z.array(Step),
  final_answer: z.string(),
});

Pass it in the response_format field

const completion = await openai.beta.chat.completions.parse({
  model: "gpt-4o-mini",
  messages: [
    { role: "system", content: "You are a helpful math tutor. Guide the user through the solution step by step." },
    { role: "user", content: "how can I solve 8x + 7 = -23" },
  ],
  response_format: zodResponseFormat(MathResponse, "math_response"),
});
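
Once the call returns, the typed result is read from the message. The sketch below assumes the message carries the `parsed` and `refusal` fields that the SDK's parse helper exposes. ParsedMessageLike and unwrapParsed are hypothetical names, and the sample message is hand-written rather than a real API reply.

```typescript
// Structural stand-in for the message returned by the parse helper (assumption).
interface ParsedMessageLike<T> {
  parsed: T | null;
  refusal?: string | null;
}

// Hypothetical helper: return the parsed value or throw with a clear reason.
function unwrapParsed<T>(message: ParsedMessageLike<T>): T {
  if (message.refusal) {
    throw new Error(`Model refused: ${message.refusal}`);
  }
  if (message.parsed === null) {
    throw new Error("No parsed content in the response");
  }
  return message.parsed;
}

// Hand-written sample in the shape of MathResponse, for illustration only.
const sampleMessage: ParsedMessageLike<{ final_answer: string }> = {
  parsed: { final_answer: "x = -3.75" },
  refusal: null,
};
console.log(unwrapParsed(sampleMessage).final_answer);
```

With the real response this would be called as unwrapParsed(completion.choices[0].message).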

Very convenient.

How to use it in LangChain

import { ChatOpenAI } from "@langchain/openai";
import { HumanMessage, SystemMessage } from "@langchain/core/messages";
import { StringOutputParser } from "@langchain/core/output_parsers";
import { z } from "zod";
import { zodResponseFormat } from "openai/helpers/zod";

export async function main() {
  const CalendarEvent = z.object({
    name: z.string(),
    date: z.string(),
    participants: z.array(z.string()),
  });

  const model = new ChatOpenAI({
    model: "gpt-4o-mini",
    // Add the response format here
    modelKwargs: {
      response_format: zodResponseFormat(CalendarEvent, "event"),
    },
  });
  const messages = [
    new SystemMessage("Extract the event information."),
    new HumanMessage("Xiao Ming and I are attending a wedding."),
  ];
  const parser = new StringOutputParser();

  const chain = model.pipe(parser);
  const resp = await chain.invoke(messages);
  console.log(resp);
}

Reference

Official website