Ollama
Run Large Language Models locally on your machine with full support for vision, tools, and embeddings while maintaining total data sovereignty.
This provider lets you run large language models locally using Ollama.
Configuration
Standard configuration for local inference (defaults to http://localhost:11434/v1):
import { createLLM } from "@node-llm/core";
// Defaults to http://localhost:11434/v1
const llm = createLLM({ provider: "ollama" });
Custom URL
If your Ollama instance is running on a different machine or port:
const llm = createLLM({
provider: "ollama",
ollamaApiBase: "http://192.168.1.10:11434/v1" // Note the /v1 suffix
});
Specific Parameters
You can pass Ollama/OpenAI-compatible parameters using .withParams().
const chat = llm.chat("llama3").withParams({
temperature: 0.7,
seed: 42,
num_ctx: 8192 // Ollama-specific context size
});
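The configured chat can then be used for a normal request. A minimal sketch, assuming ask() resolves to the model's reply as in the vision example below:
const reply = await chat.ask("Explain what num_ctx controls in one sentence.");
console.log(reply);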
Features
- Models: Supports any model pulled via ollama pull.
- Vision: Use vision-capable models like llama3.2-vision or llava.
- Tools: Fully supported for models with tool-calling capabilities (e.g., llama3.1).
- Embeddings: High-performance local vector generation (see the sketch after this list).
- Model Discovery: Inspect your local library and model metadata via llm.listModels().
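As an illustration of local embedding generation, here is a minimal sketch. The embed() method name, its options, and the return shape are assumptions for illustration rather than confirmed @node-llm/core API; nomic-embed-text is one embedding model available through Ollama.
// Hypothetical sketch: generate embeddings with a locally pulled embedding model.
// embed() and its options are illustrative assumptions, not confirmed API.
const embedder = createLLM({ provider: "ollama" });
const vectors = await embedder.embed(
  ["local-first AI", "data sovereignty"],
  { model: "nomic-embed-text" } // any embedding model pulled via ollama pull
);
console.log(vectors.length); // expect one vector per input string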
Multimodal (Vision)
Pass local image files alongside your prompt when using a vision-capable model:
const visionChat = llm.chat("llama3.2-vision");
const response = await visionChat.ask("Describe this image", {
files: ["./image.png"]
});
Model Discovery
List all models currently pulled in your Ollama library to inspect their context windows and features:
const models = await llm.listModels();
console.table(models);
Limitations
The following features are not supported natively by Ollama’s OpenAI-compatible API:
- Transcription (Whisper): Not available via the /v1/audio endpoint.
- Image Generation: Not available via the /v1/images endpoint.
- Moderation: Not supported.
For full feature parity locally, consider using LocalAI and connecting via the OpenAI Provider with a custom openaiApiBase.
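A minimal sketch of that fallback, assuming a LocalAI instance listening on its default port 8080 (the port is an assumption about your setup; openaiApiBase is the documented option):
// Point the OpenAI provider at a local LocalAI server instead of the OpenAI cloud.
const localai = createLLM({
provider: "openai",
openaiApiBase: "http://localhost:8080/v1" // LocalAI's OpenAI-compatible endpoint
});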