Reasoning

Access the inner thoughts and chain-of-thought process of advanced reasoning models like DeepSeek R1 and OpenAI o1/o3.

Added in v1.7.0

Table of contents

  1. Configuring Thinking
    1. Setting Effort Level
    2. Per-Request Configuration
  2. Accessing Thinking Results
    1. Streaming Thinking
  3. Backward Compatibility (Deprecated)
  4. Supported Capabilities

NodeLLM provides a unified way to access the “thinking” or “reasoning” process of models like DeepSeek R1, OpenAI o1/o3, and Claude 3.7/4. Many models now expose their internal chain of thought or allow configuring the amount of effort spent on reasoning.


Configuring Thinking

You can control the reasoning behavior using the .withThinking() or .withEffort() methods. This is particularly useful for models like o3-mini or claude-3-7-sonnet.
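For example, a minimal sketch of enabling thinking on the builder itself, assuming .withThinking() accepts the same options object that ask() takes in Per-Request Configuration below:

import { NodeLLM } from "@node-llm/core";

const chat = NodeLLM.chat("claude-3-7-sonnet")
  .withThinking({ budget: 16000 }); // assumed option shape, mirroring the per-request example below

const response = await chat.ask("Walk me through the proof step by step.");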

Setting Effort Level

Effort levels (low, medium, high) let you trade speed and cost against reasoning depth.

import { NodeLLM } from "@node-llm/core";

const chat = NodeLLM.chat("o3-mini")
  .withEffort("high"); // Options: "low", "medium", "high"

const response = await chat.ask("Solve this complex architecture problem...");

Per-Request Configuration

If you prefer a stateless approach, or want to override the configuration for a single request, you can pass the thinking configuration directly to ask() or stream().

const response = await chat.ask("Solve this puzzle", {
  thinking: { budget: 16000 } // token budget for the model's thinking
});
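
The same options object can also be passed to stream(). A minimal sketch, assuming stream() accepts options as its second argument in the same way ask() does:

for await (const chunk of chat.stream("Solve this puzzle", {
  thinking: { budget: 16000 }
})) {
  if (chunk.content) {
    process.stdout.write(chunk.content);
  }
}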

Accessing Thinking Results

The results of thinking are available via the .thinking property on the response object. This unified object contains the text, tokens used, and any cryptographic signatures provided by the model.

const response = await chat.ask("Prove that the square root of 2 is irrational.");

// High-level access via response.thinking
if (response.thinking) {
  console.log("Thought Process:", response.thinking.text);
  console.log("Tokens Spent:", response.thinking.tokens);
  console.log("Verification Signature:", response.thinking.signature);
}

// Show the final answer
console.log("Answer:", response.content);

Streaming Thinking

When using .stream(), thinking content is emitted in chunks. You can capture it by checking chunk.thinking.

const chat = NodeLLM.chat("deepseek-reasoner");

for await (const chunk of chat.stream("Explain quantum entanglement")) {
  if (chunk.thinking?.text) {
    process.stdout.write(`[Thinking] ${chunk.thinking.text}`);
  }
  if (chunk.content) {
    process.stdout.write(chunk.content);
  }
}
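
If you need the complete thought process and final answer once the stream has finished, you can accumulate the chunks yourself. This sketch uses only the chunk fields shown above:

let thoughts = "";
let answer = "";

for await (const chunk of chat.stream("Explain quantum entanglement")) {
  if (chunk.thinking?.text) thoughts += chunk.thinking.text;
  if (chunk.content) answer += chunk.content;
}

console.log("Full thought process:", thoughts);
console.log("Final answer:", answer);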

Backward Compatibility (Deprecated)

Previously, reasoning text was accessed via the response.reasoning property. It is still supported for backward compatibility, but we recommend migrating to the structured response.thinking.text API.
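
Migrating is a one-line change:

// Deprecated: flat string property
console.log(response.reasoning);

// Preferred: structured thinking object
console.log(response.thinking?.text);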


Supported Capabilities

Currently, the following models have enhanced reasoning support in NodeLLM:

| Model ID | Provider | Support Level |
| --- | --- | --- |
| deepseek-reasoner | DeepSeek | Full text extraction |
| o1-*, o3-* | OpenAI | Effort configuration & token tracking |
| claude-3-7-*, claude-*-4-* | Anthropic | Budget-based thinking & full text extraction |
| gemini-2.0-flash-thinking-* | Gemini | Full thinking text extraction |