
Structured Outputs

What problem does it solve?

It forces the LLM to output JSON that matches a specified schema.

The old solution

In the prompt, ask the LLM to output JSON and spell out the name and type of each field, for example:

Output a JSON object with the following fields:
- name: string
- age: number
- isFemale: boolean
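The prompt above is asking for a shape like the following TypeScript type (Person is a hypothetical name chosen for illustration):

```typescript
// The shape the prompt asks the model to produce.
interface Person {
  name: string;
  age: number;
  isFemale: boolean;
}

// An object matching the requested shape:
const example: Person = { name: "Tom", age: 3, isFemale: false };
console.log(JSON.stringify(example));
```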

LangChain provides output parsers (such as StructuredOutputParser) that can generate this kind of prompt text for you.

Problem 1

However, the LLM may still output something that is not valid JSON at all, or JSON whose fields are not what you expected.
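Without any enforcement, callers end up defensively parsing and validating the reply themselves. A minimal sketch (tryParsePerson and the sample reply strings are hypothetical):

```typescript
// Defensively parse a prompt-only JSON reply and check its fields.
function tryParsePerson(reply: string): { name: string; age: number; isFemale: boolean } | null {
  let obj: any;
  try {
    obj = JSON.parse(reply);
  } catch {
    return null; // not valid JSON at all
  }
  if (typeof obj.name !== "string" || typeof obj.age !== "number" || typeof obj.isFemale !== "boolean") {
    return null; // fields missing or of the wrong type
  }
  return obj;
}

// The model might wrap the JSON in prose, which breaks JSON.parse:
console.log(tryParsePerson('Sure! {"name": "Tom", "age": 3}')); // null
// Or return valid JSON whose fields have the wrong type:
console.log(tryParsePerson('{"name": "Tom", "age": "three", "isFemale": false}')); // null
```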

A later solution

OpenAI's API then introduced json_object mode, which forces the LLM to return valid JSON.

Problem 2

The fields the LLM returns may still not be the ones you expect.
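In other words, json_object mode only guarantees the syntax: a reply can parse cleanly and still have the wrong shape. A small sketch (the reply string is a hypothetical model output):

```typescript
// json_object mode guarantees valid JSON, not the field names.
const reply = '{"fullName": "Tom", "years": 3}'; // hypothetical model output
const obj = JSON.parse(reply); // succeeds: it is valid JSON
console.log("name" in obj); // false: the field we asked for is missing
```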

The latest solution

OpenAI's Structured Outputs feature lets you pass an explicit JSON Schema in the response_format field of the API, so the LLM outputs exactly the specified format.

response_format: { "type": "json_schema", "json_schema": { …, "strict": true } }

You can specify the format in json_schema, for example:

{
  type: "json_schema",
  json_schema: {
    name: "math_response",
    schema: {
      type: "object",
      properties: {
        steps: {
          type: "array",
          items: {
            type: "object",
            properties: {
              explanation: { type: "string" },
              output: { type: "string" }
            },
            required: ["explanation", "output"],
            additionalProperties: false
          }
        },
        final_answer: { type: "string" }
      },
      required: ["steps", "final_answer"],
      additionalProperties: false
    },
    strict: true
  }
}
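With strict: true, the API guarantees the reply conforms to the schema, so a hand-rolled check like the following (a sketch mirroring the math_response schema above, not part of any library) always passes on the parsed reply:

```typescript
// A hand-rolled check mirroring the math_response schema above.
function isMathResponse(obj: any): boolean {
  return (
    typeof obj === "object" && obj !== null &&
    Array.isArray(obj.steps) &&
    obj.steps.every(
      (s: any) => typeof s.explanation === "string" && typeof s.output === "string"
    ) &&
    typeof obj.final_answer === "string"
  );
}

console.log(isMathResponse({
  steps: [{ explanation: "Subtract 7 from both sides", output: "8x = -30" }],
  final_answer: "x = -30/8",
})); // true
console.log(isMathResponse({ answer: 42 })); // false
```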

In Node.js, the official SDK makes this even easier.

First, define the schema with Zod:

import { z } from "zod";
import { zodResponseFormat } from "openai/helpers/zod";

const Step = z.object({
  explanation: z.string(),
  output: z.string(),
});

const MathResponse = z.object({
  steps: z.array(Step),
  final_answer: z.string(),
});

Then pass it in the response_format field:

const completion = await openai.beta.chat.completions.parse({
  model: "gpt-4o-mini",
  messages: [
    { role: "system", content: "You are a helpful math tutor. Guide the user through the solution step by step." },
    { role: "user", content: "how can I solve 8x + 7 = -23" },
  ],
  response_format: zodResponseFormat(MathResponse, "math_response"),
});

Very convenient.

How to use it in LangChain

import { ChatOpenAI } from "@langchain/openai";
import { HumanMessage, SystemMessage } from "@langchain/core/messages";
import { StringOutputParser } from "@langchain/core/output_parsers";
import { z } from "zod";
import { zodResponseFormat } from "openai/helpers/zod";

export async function main() {
  const CalendarEvent = z.object({
    name: z.string(),
    date: z.string(),
    participants: z.array(z.string()),
  });

  const model = new ChatOpenAI({
    model: "gpt-4o-mini",
    // Add the response_format here
    modelKwargs: {
      response_format: zodResponseFormat(CalendarEvent, "event"),
    },
  });
  const messages = [
    new SystemMessage("Extract the event information."),
    new HumanMessage("Xiaoming and I are attending a wedding"),
  ];
  const parser = new StringOutputParser();

  const chain = model.pipe(parser);
  const resp = await chain.invoke(messages);
  console.log(resp);
}
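Note that StringOutputParser hands back the raw JSON string, so you still JSON.parse it yourself (the reply string below is a hypothetical example of what the chain might return):

```typescript
// The chain returns the JSON as a string; parse it to get the event object.
const resp = '{"name": "wedding", "date": "unknown", "participants": ["me", "Xiaoming"]}'; // hypothetical reply
const parsedEvent = JSON.parse(resp);
console.log(parsedEvent.participants.length); // 2
```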

References

Official website