✨ New Plugin Alert ✨ SleekRank is now available with €50 launch discount

Document-Trained AI Chatbot for WordPress (PDF and DOCX)

Drop PDFs, DOCX files, and plain text into SleekAI. The plugin extracts and chunks them, stores embeddings in your WordPress database, and the chatbot retrieves passages at chat time using your OpenAI, Anthropic, Google, or OpenRouter key.

♾️ Lifetime License available


Half of your useful knowledge sits in files, not posts

Almost every business has a document library that is more useful than its website: a product manual, a sales playbook, a benefits PDF, a research paper, a compliance booklet. These files do not have URLs, they do not get indexed by search engines, and they certainly do not get read by chatbots crawling your sitemap. Making them queryable by a chatbot used to mean uploading to a vendor's vector store with its own dashboard, its own billing, and its own data residency story to argue about with security.

SleekAI's document mode keeps the documents and the index inside WordPress. Upload a PDF or DOCX through the plugin's Documents tab and it is parsed, chunked, embedded with your configured embedding model, and stored in a SleekAI table alongside source filename, page number, and chunk position. At chat time the retriever ranks chunks against the user's question and the top ones are inserted into the system prompt with their source markers. Citations link back to the document and the specific page, so when the bot says it answered from page 14 of the benefits guide, the user can verify.

The non-obvious part is metadata. SleekAI lets you tag each document with categories, restrict it by user role, and assign it to specific bots. A benefits PDF can be available only to logged-in employees through an internal bot; a public product brochure can be shared with the marketing bot. The same plugin runs both because document scope is part of the bot configuration, not a separate platform. When you replace a document with a new version, the old chunks are removed and the new ones reindexed without touching the bots that referenced the file.

Workflow

How document training runs end to end

1

Upload and tag

Drop files into the Documents tab, give each a category (HR, product, legal), set a role allowlist if needed, and assign it to one or more bots. The ingestion queue picks up the file in the background.

2

Extract and chunk

SleekAI extracts text, runs OCR on scanned PDFs, and chunks the result by your configured size and overlap. Each chunk records its filename, page number, and section heading for citation purposes.

3

Embed and store

Chunks are embedded with your configured embedding model (OpenAI, Voyage, Cohere, or self-hosted) and stored in a SleekAI table with the metadata. Embeddings can also be mirrored to Pinecone or Qdrant if you prefer an external vector store.

4

Retrieve and answer

At chat time the user message is used as the retrieval query, the top chunks are inserted into the system prompt, and the model is told to answer from them with citations. Sources are rendered under each reply.
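
For readers who want to picture the retrieve-and-answer step, here is a minimal sketch in Python. It is not SleekAI's code: cosine similarity as the ranking measure, the dictionary layout of a chunk, and the function names `cosine`, `top_chunks`, and `build_system_prompt` are all assumptions made for illustration.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_chunks(query_vec: list[float], chunks: list[dict], k: int = 3) -> list[dict]:
    """Return the k chunks whose embeddings are closest to the query embedding."""
    ranked = sorted(
        chunks,
        key=lambda c: cosine(query_vec, c["embedding"]),
        reverse=True,
    )
    return ranked[:k]


def build_system_prompt(chunks: list[dict]) -> str:
    """Assemble a system prompt that carries each passage with its source marker."""
    parts = ["Answer only from the passages below and cite the source of each claim."]
    for c in chunks:
        # Hypothetical marker format, modeled on the citation style described on this page.
        parts.append(f"[source: {c['filename']}, page {c['page']}]\n{c['text']}")
    return "\n\n".join(parts)
```

In this sketch the user's message would be embedded with the same model used at ingestion, passed to `top_chunks`, and the result handed to `build_system_prompt` before the chat request is sent.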

Try it now

A chat over uploaded documents

Employee asks a benefits question, the bot retrieves the relevant page from the uploaded PDF and answers from it.

Comparison

Generic chatbot vs SleekAI for document training

Generic chatbot

  • Documents must be uploaded to a third-party vendor's store
  • No fine-grained scope per document, per bot, or per role
  • Citations point to a vendor URL, not your file and page
  • Replacing a document leaves stale chunks in the index
  • Compliance teams have to approve another data processor

SleekAI chatbot

  • PDF, DOCX, TXT, and Markdown ingestion built into the plugin
  • Chunks stored in WordPress with page number and filename metadata
  • Per-document role and bot scope, not just account-wide access
  • Reupload replaces chunks atomically, no stale fragments left
  • Citations include filename and page number, linkable to the source

Features

What SleekAI gives you for a document-trained chatbot

Native file ingestion

Drop PDFs, DOCX, TXT, and Markdown into the Documents tab. SleekAI extracts text, chunks by configurable size and overlap, and stores chunks alongside filename, page number, and section heading metadata for retrieval.
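
To make "size and overlap" concrete, the sketch below splits one page of extracted text into overlapping character windows and keeps the filename and page number with each piece. It is an illustration only: the `Chunk` fields, the `chunk_page` name, and the choice to measure size in characters are assumptions, not the plugin's actual settings or schema.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    filename: str
    page: int
    position: int  # index of the chunk within its page
    text: str


def chunk_page(filename: str, page: int, text: str,
               size: int = 200, overlap: int = 40) -> list[Chunk]:
    """Split one page of text into windows of `size` characters that overlap by `overlap`."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for position, start in enumerate(range(0, len(text), step)):
        chunks.append(Chunk(filename, page, position, text[start:start + size]))
        if start + size >= len(text):
            break  # this window already reached the end of the page
    return chunks
```

The overlap means a sentence that straddles a window boundary still appears whole in at least one chunk, at the cost of storing some text twice.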

Per-document scoping

Each document carries a category, a role allowlist, and a list of bots that can read it. A benefits PDF can stay invisible to public visitors and only feed the internal employee bot, without juggling two vector stores.
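
The scoping rule can be pictured as a filter that runs before retrieval. The sketch below is a simplified model under stated assumptions: the `Document` fields, the convention that an empty role allowlist means "no restriction", and the bot names are hypothetical and not taken from SleekAI.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    filename: str
    category: str
    bots: set[str]
    # Assumption for this sketch: an empty allowlist means any visitor may read it.
    roles: set[str] = field(default_factory=set)


def visible_documents(docs: list[Document], bot: str,
                      user_roles: set[str]) -> list[Document]:
    """Keep documents assigned to this bot whose role allowlist admits the user."""
    return [
        d for d in docs
        if bot in d.bots and (not d.roles or d.roles & user_roles)
    ]
```

With a benefits PDF assigned to an internal bot and restricted to an employee role, an anonymous visitor gets an empty list from the internal bot, while the public brochure stays visible through the marketing bot.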

Atomic reupload

Upload a new version of the same document and SleekAI removes the previous chunks and embeddings before inserting the new ones. Citations always reflect the live version. No half-updated index, no stale facts.
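
One common way to get this all-or-nothing behavior is a database transaction. The sketch below uses Python's built-in `sqlite3` as a stand-in for the WordPress database; the `chunks` table layout and the `create_schema` and `replace_chunks` names are assumptions for illustration, not SleekAI's schema.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    """Create a minimal, hypothetical chunks table."""
    with conn:
        conn.execute(
            "CREATE TABLE chunks ("
            "filename TEXT NOT NULL, page INTEGER NOT NULL, text TEXT NOT NULL)"
        )


def replace_chunks(conn: sqlite3.Connection, filename: str,
                   new_chunks: list[tuple[int, str]]) -> None:
    """Swap all chunks for one document inside a single transaction."""
    with conn:  # commits on success, rolls back if any statement raises
        conn.execute("DELETE FROM chunks WHERE filename = ?", (filename,))
        conn.executemany(
            "INSERT INTO chunks (filename, page, text) VALUES (?, ?, ?)",
            [(filename, page, text) for page, text in new_chunks],
        )
```

If inserting the new version fails partway through, the rollback also restores the deleted rows, so readers of the table see either the old version or the new one and never a mixture.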

Use cases

Where document training earns its budget

HR and benefits questions

Employees ask the same benefits and policy questions every year. A bot trained on the benefits PDF answers them instantly with the page number for verification.

Product manual lookups

Customers ask install or troubleshooting questions answered in the 200-page manual. The bot finds the right page and quotes it instead of asking them to download and search the PDF.

Research and reports

Consultancies upload past research deliverables and let staff query them in natural language. Citations point back to the original report and page so the lineage is clear.

The bigger picture

Why document training belongs on your own server

A document is often a more honest source than a web page. It is the version that legal signed off on, the manual the engineers actually maintain, the playbook the sales team prints out before the quarterly offsite. When a chatbot answers from those documents, the answer carries the weight of the source.

When a chatbot answers from a vendor's crawl of your marketing site, the answer carries the weight of whoever last updated that page. The strongest case for document training is content that intentionally does not live on the public web: internal benefits guides, NDA-bound research, paid product manuals, regulated compliance booklets.

Sending those to a vendor's vector store invites a long compliance review and an annual security questionnaire. Keeping them inside WordPress lets you reuse the access controls and the backup story you already have. Roles and capabilities already determine who sees which content; documents inherit the same model.

Backups already cover wp-content and the database; document chunks and embeddings are inside both. The chatbot becomes a feature of the site rather than a separate service to govern. That is the difference between a tool you can deploy in a week and one that lives in legal review for a quarter.

Questions

Common questions about SleekAI for document-trained chatbots

Which file formats are supported?

PDF (text and scanned with OCR), DOCX, DOC, TXT, Markdown, and HTML are supported out of the box. CSV is supported for tabular data with optional column-aware chunking. Scanned PDFs go through OCR before chunking, which is configurable per upload to balance speed against accuracy.

 

Where are documents and embeddings stored?

Documents are stored in wp-content/uploads/sleek-ai/documents with WordPress's standard private-file protections. Chunks and embeddings are stored in dedicated SleekAI tables in your WordPress database. Optionally, embeddings can be pushed to an external vector store (Pinecone, Qdrant), but the default keeps everything inside WordPress.

 

Is there a limit on document size?

There is no hard limit, but practical performance starts to degrade past a few thousand pages per document because chunking and embedding take time. SleekAI runs the ingestion job in the background via WordPress cron or an action scheduler queue, so large uploads do not block the admin.

 

Can different bots use different documents?

Yes. Each bot's settings include a document scope: all documents, a category, an explicit allowlist, or a tag-based filter. A common pattern is a public marketing bot scoped to brochures and a private employee bot scoped to HR and policy documents on the same site.

 

How do I update a document?

Upload a new file with the same identifier (filename or document slug) and SleekAI replaces the chunks and embeddings atomically. The bot starts citing the new version on the next chat turn. There is no separate reindex step to remember; the upload action triggers it.

 

Does the bot cite its sources?

Yes. Every chunk carries filename, page number, and section heading metadata. The system prompt instructs the model to cite the source inline like 'source: benefits-2026.pdf, page 12' and the widget renders a Sources block under each reply. Clicking the citation opens the file at the relevant page when the browser supports PDF page anchors.
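
As a rough illustration of how a citation might be turned into a label and a page-anchored link, the sketch below uses the `#page=N` fragment that PDF viewers in most browsers honor. The `citation_label` and `citation_url` helpers and the base URL are hypothetical; a real link to a protected file would also have to pass the site's access checks.

```python
from urllib.parse import quote


def citation_label(filename: str, page: int) -> str:
    """Inline citation text in the style quoted above."""
    return f"source: {filename}, page {page}"


def citation_url(base_url: str, filename: str, page: int) -> str:
    """Link to the file with a #page=N fragment so the viewer can jump to the page."""
    # quote() percent-encodes spaces and other unsafe characters in the filename.
    return f"{base_url.rstrip('/')}/{quote(filename)}#page={page}"
```

For example, `citation_url("https://example.com/docs/", "benefits-2026.pdf", 12)` produces a link ending in `benefits-2026.pdf#page=12`.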

 

What happens when the answer is not in the documents?

The system prompt instructs the model to say it cannot find the answer in the uploaded documents rather than guessing. You can configure the fallback per bot: a generic 'contact support' message, a handoff to a contact form, or a soft retry that searches related categories before giving up.
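
The soft-retry idea can be sketched in a few lines. This is an assumption-heavy illustration rather than the plugin's logic: it treats "no passages retrieved" as the trigger for the fallback, and the `search` callable, the category names, and both function names are hypothetical.

```python
from typing import Callable

# Hypothetical search callable: (question, category) -> list of matching passages.
Search = Callable[[str, str], list[str]]


def retrieve_with_soft_retry(search: Search, question: str,
                             category: str, related: list[str]) -> list[str]:
    """Search the bot's own category first, then each related category in order."""
    for candidate in [category, *related]:
        passages = search(question, candidate)
        if passages:
            return passages
    return []


def passages_or_fallback(passages: list[str], fallback_message: str):
    """Return the passages for the prompt, or the configured fallback when there are none."""
    return passages if passages else fallback_message
```

A bot scoped to HR documents could list policy documents as a related category, so a question with no HR match still gets one more search before the visitor sees the fallback message.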

 

Does it work with scanned PDFs?

Yes. SleekAI uses Tesseract by default for OCR with options to swap in a different OCR backend through a filter. Per-upload settings let you choose whether to OCR the file, which language model to use, and whether to keep both the OCR text and the original scanned image bytes accessible from the document's metadata page.

 

Pricing

More than 1,000
happy customers

Explore our flexible licensing options tailored to your needs. Upgrade your license anytime to access more features, or opt for a lifetime license for ongoing value, including lifetime updates and lifetime support. Our hassle-free upgrade process ensures that our platform can grow with you, starting from whichever plan you choose.

Starter

€79

EUR

per year

  • 3 websites
  • 1 year of updates
  • 1 year of support

Pro

€149

EUR

per year

  • Unlimited websites
  • 1 year of updates
  • 1 year of support

Lifetime ♾️

Most popular

€249

EUR

once

  • Unlimited websites
  • Lifetime updates
  • Lifetime support

...or get the Bundle Deal
and save €250 🎁

The Bundle (unlimited sites)

Pay once, own it forever

Elevate your WordPress site with our exclusive plugin bundle that includes all of our premium plugins in one package. Enjoy lifetime updates and lifetime support. Save significantly compared to buying plugins individually.

What’s included

  • SleekAI

  • SleekByte

  • SleekMotion

  • SleekPixel

  • SleekRank

  • SleekView