✨ New Plugin Alert ✨ SleekRank is now available with €50 launch discount
✨ New Plugin Alert ✨ SleekRank is now available with €50 launch discount
✨ New Plugin Alert ✨ SleekRank is now available with €50 launch discount
✨ New Plugin Alert ✨ SleekRank is now available with €50 launch discount
✨ New Plugin Alert ✨ SleekRank is now available with €50 launch discount
✨ New Plugin Alert ✨ SleekRank is now available with €50 launch discount
✨ New Plugin Alert ✨ SleekRank is now available with €50 launch discount
✨ New Plugin Alert ✨ SleekRank is now available with €50 launch discount
✨ New Plugin Alert ✨ SleekRank is now available with €50 launch discount
✨ New Plugin Alert ✨ SleekRank is now available with €50 launch discount

AI Chatbot with Ollama on WordPress

SleekAI talks to Ollama's OpenAI-compatible endpoint on http://localhost:11434/v1 or whatever LAN address you expose, so a chatbot grounded in your WordPress content can run on the same machine, a homelab box, or a private cloud VM. Bring your own model and key.

♾️ Lifetime License available

SleekAI chatbot for Ollama

Ollama is the easiest way to host a chatbot locally

Ollama bundles model download, quantization, and inference into a single binary that runs on a Mac mini, a Linux box, or a small Hetzner VM. Pull llama3.1:70b, qwen2.5:32b, or mistral-nemo with one command and it exposes an OpenAI-compatible API on port 11434. That is exactly the shape SleekAI's provider settings expect.

SleekAI plugs straight into that endpoint. Set the base URL to http://localhost:11434/v1 on a single-server WordPress install, or to a private LAN IP if Ollama runs on a separate machine. Paste any string as the API key - Ollama ignores it but the SleekAI form requires a non-empty value. Pick the exact model tag you pulled and the chatbot is live. The same display conditions, mapped variables, and Multibot setup that work on OpenAI work here.

Because the request never leaves your network, visitor questions, system prompts, and the documents you inject into the context all stay inside the perimeter. Conversations land in wp_sleek_ai_chats on the WordPress database, and the log records which Ollama model produced each reply. The combination is what a lot of self-hosted WordPress teams have been quietly waiting for: a real chat experience that does not require a third-party API contract.

Workflow

Run SleekAI on Ollama in four steps

1

Install Ollama and pull a model

Install Ollama on the box that will host the model, then run ollama pull llama3.1:70b-instruct or whichever tag matches your hardware. Confirm it answers with a quick curl to /v1/chat/completions.
2

Add the provider in SleekAI

In provider settings choose OpenAI-compatible, set the base URL to http://localhost:11434/v1 or your LAN URL, and put any string in the API key field. Ollama ignores the key but SleekAI needs a value.
3

Create a bot with the model tag

Pick the new provider, paste the exact model tag you pulled, and set the system prompt. Choose which post types and meta keys flow into context just like with any other provider.
4

Watch logs and tune

Each reply in wp_sleek_ai_chats records the Ollama model tag, token counts, and origin page. Filter by failed responses to catch model timeouts and tune the prompt where the bot drifts.

Try it now

Ask the Ollama demo bot

This bot is wired to a hypothetical Ollama server running Llama 3.1 in a homelab. Ask how SleekAI handles the localhost endpoint, model tags, and offline workflows.

Comparison

Generic chatbot vs SleekAI for Ollama

Generic chatbot

  • Hard-coded to OpenAI or Anthropic cloud APIs only
  • No way to set a localhost or LAN base URL
  • Cannot use model tags like llama3.1:70b directly
  • Routes every prompt through a vendor relay
  • No per-bot model selection across the same install

SleekAI chatbot

  • Native support for Ollama's OpenAI-compatible endpoint
  • Use any pulled tag: llama3.1, qwen2.5, mistral, gemma
  • Runs fully offline once the model is downloaded
  • Multibot can mix Ollama, OpenAI, and Anthropic in one site
  • Logs the exact Ollama model used for each reply

Features

What SleekAI gives you for Ollama

Localhost-friendly

Point SleekAI at http://localhost:11434/v1 when WordPress and Ollama share a box, or at a LAN IP when they don't. The provider field accepts any HTTPS or HTTP URL that speaks the OpenAI chat schema.

Any Ollama model tag

Drop the exact tag you pulled - llama3.1:70b, qwen2.5:32b-instruct, mistral-nemo, deepseek-r1 - into the model field. SleekAI passes it through unchanged so the right weights answer each request.

Mix and match per bot

Run a small fast Ollama model for marketing chat and a big cloud model for the support bot, all from the same WordPress install. Multibot scopes each bot to a post type, role, or URL pattern.

Use cases

Where Ollama plus SleekAI fits best

Privacy-first sites

Membership, health, or legal sites where customer questions cannot leave the perimeter. Ollama on a single VM answers grounded in WordPress content with zero outbound API calls.

Edge and offline

Conferences, ships, factories, and rural deployments. Pull the model once over a connection and keep answering questions from the on-site WordPress install while offline.

Predictable cost

Trade per-token pricing for a fixed monthly VM bill. High-volume documentation bots that would burn through OpenAI credits often run cheaper on a single GPU box with Ollama.

The bigger picture

Why Ollama plus WordPress is the new homelab default

A year ago, hosting your own chatbot meant a Python notebook, a CUDA install, and a long weekend. Ollama collapsed all of that into a single binary that downloads quantized weights and serves an OpenAI-compatible API on port 11434. That one decision made local inference accessible to anyone who can run brew install or apt install.

The bottleneck moved from infrastructure to integration, and the integration into WordPress is exactly where most marketing sites and membership communities live. SleekAI closes that loop by treating the Ollama endpoint as a normal provider. There is no special adapter, no proprietary protocol, no Sleek-controlled relay.

The same widget that powers a cloud-backed chatbot can run entirely on a Mac mini under a desk, on a Hetzner VM, or on a GPU server in a private cloud, with no outbound network call required after the model is pulled. For privacy-conscious teams, edge deployments, and anyone tired of metered token pricing, that combination turns the WordPress site into a perfectly capable AI surface without giving up control of the data or the bill.

Questions

Common questions about SleekAI for Ollama

No. Ollama is free open-source software. Install it on macOS, Linux, or Windows, run ollama pull llama3.1:70b, and the OpenAI-compatible server is available on port 11434 with no signup. SleekAI just needs the URL and any string in the key field.

 

For a small Mac mini or a 16GB VPS, qwen2.5:7b-instruct or llama3.1:8b are sensible defaults. With a 24GB GPU you can run llama3.1:70b-instruct-q4 comfortably for a marketing chatbot. Always match the tag in SleekAI to the exact tag you pulled in Ollama.

 

Yes. Set OLLAMA_HOST=0.0.0.0:11434 on the Ollama machine so it listens on all interfaces, then put it behind a Cloudflare Tunnel, Tailscale, or a Wireguard mesh. Point SleekAI at the resulting URL with a bearer token enforced at the tunnel layer.

 

Yes, and SleekAI uses it. Replies stream into the chat widget token by token through the same Server-Sent Events flow used for OpenAI, so the experience feels identical to a cloud-hosted bot once the first token arrives.

 

Multibot lets each chatbot pick its own provider, model, and prompt. You can have one bot on llama3.1:70b for docs and another on qwen2.5:7b for the homepage, both pointing at the same Ollama server with different tags.

 

On a recent Mac with M2 Max or an Nvidia RTX 4090, an 8B model answers in well under a second to first token, comparable to gpt-4o-mini. Bigger 70B quantized models add a few seconds of latency but still feel responsive in a chat UI.

 

Ollama exposes embeddings endpoints for models like nomic-embed-text and mxbai-embed-large. Point SleekAI's embeddings provider at the same base URL and pick the embedding model tag to build a fully local retrieval pipeline.

 

Yes. Most teams put Caddy, Nginx, or Cloudflare Tunnel in front of Ollama to terminate TLS and add auth, then set the SleekAI base URL to https://ollama.example.com/v1. The OpenAI-compatible path stays the same regardless of the proxy in front of it.

 

Pricing

More than 1000+
happy customers

Explore our flexible licensing options tailored to your needs. Upgrade your license anytime to access more features, or opt for a lifetime license for ongoing value, including lifetime updates and lifetime support. Our hassle-free upgrade process ensures that our platform can grow with you, starting from whichever plan you choose.

Starter

€79

EUR

per year

  • 3 websites
  • 1 year of updates
  • 1 year of support

Pro

€149

EUR

per year

  • Unlimited websites
  • 1 year of updates
  • 1 year of support

Lifetime ♾️

Most popular

€249

EUR

once

  • Unlimited websites
  • Lifetime updates
  • Lifetime support

...or get the Bundle Deal
and save €250 🎁

The Bundle (unlimited sites)

Pay once, own it forever

Elevate your WordPress site with our exclusive plugin bundle that includes all of our premium plugins in one package. Enjoy lifetime updates and lifetime support. Save significantly compared to buying plugins individually.

What’s included

  • SleekAI

  • SleekByte

  • SleekMotion

  • SleekPixel

  • SleekRank

  • SleekView