Duckie AI and AI transforms

Duckle ships a local AI assistant (Duckie) plus six AI transforms; three are pure-local, three call any OpenAI-compatible API.

Duckie, the local assistant

Duckie is powered by Qwen 2.5 Coder 1.5B via llama.cpp, downloaded once (~1.1 GB) and run entirely on your CPU as a llama-server subprocess on 127.0.0.1. Describe a pipeline in English; Duckie streams back valid Duckle pipeline JSON; an "Insert into canvas" button drops positioned, wired nodes.

  • Truly local - no API key, works offline.
  • Streamed responses - watch the pipeline JSON appear token by token.
  • One-click insert - the "Insert into canvas" button drops positioned, wired nodes.
  • Bring-your-own-model - point baseUrl at Ollama, OpenAI, llama.cpp, Cohere, Voyage, anything OpenAI-shaped.
  • Sandboxed - the model has no filesystem, network, or tool access; it only emits text.
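To make the streaming behavior concrete, here is a minimal sketch of how a client could reassemble streamed pipeline JSON. It assumes the stream follows the OpenAI-style server-sent-events format that llama-server's chat endpoint emits ("data: {...}" lines ending with "data: [DONE]"). How Duckie consumes the stream internally is not documented here, and the {"nodes": []} shape in the usage note is a hypothetical placeholder, not the real Duckle pipeline schema.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE chat-completion lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not a data event
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


def collect_pipeline_json(lines: Iterable[str]) -> dict:
    """Join every streamed fragment, then parse the result as one JSON document."""
    return json.loads("".join(iter_stream_text(lines)))
```

Feeding it a captured stream such as 'data: {"choices":[{"delta":{"content":"{\"nodes\": []}"}}]}' followed by 'data: [DONE]' yields the parsed dictionary. Parsing only after the stream ends is deliberate here: partial JSON is not valid JSON, so token-by-token display and final parsing are separate steps.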

AI transforms

Component                      | What it does                                                       | Local?
Embeddings (xf.ai.embed)       | OpenAI-compatible /v1/embeddings                                   | Needs API
LLM Transform (xf.ai.llm)      | Per-row chat completion with {column} templates                    | Needs API
Classify (xf.ai.classify)      | LLM-backed, constrains to N user categories, normalizes to UNKNOWN | Needs API
Text Chunker (xf.ai.chunk)     | RAG-ready splitter                                                 | Pure local
PII Redact (xf.ai.pii)         | Regex redaction of emails, phones, SSNs, cards                     | Pure local
Semantic Dedupe (xf.ai.dedupe) | Cosine over precomputed embeddings                                 | Pure local
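Two of the API-backed transforms do some purely local work around the model call: filling {column} placeholders per row, and mapping a free-form model answer onto a fixed category list. The sketch below illustrates both ideas. The function names, the case-insensitive matching, and the punctuation trimming are assumptions made for illustration, not Duckle's actual implementation.

```python
import re

# Matches {column_name} placeholders in a prompt template.
_PLACEHOLDER = re.compile(r"\{(\w+)\}")


def render_prompt(template: str, row: dict) -> str:
    """Replace each {column} placeholder with that column's value from the row."""
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in row:
            raise KeyError(f"template references missing column: {name}")
        return str(row[name])

    return _PLACEHOLDER.sub(substitute, template)


def normalize_label(raw_answer: str, categories: list[str]) -> str:
    """Return the matching category, or UNKNOWN when the answer is outside the list."""
    cleaned = raw_answer.strip().strip(".\"'").casefold()
    for category in categories:
        if cleaned == category.casefold():
            return category
    return "UNKNOWN"
```

For example, render_prompt("Summarize: {title}", {"title": "Ducks"}) produces "Summarize: Ducks", and normalize_label(" Billing. ", ["billing", "support"]) returns "billing", while an answer that names no listed category falls back to "UNKNOWN".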

Clean data before it reaches your AI

Garbage in, garbage out applies doubly to retrieval and model context. Duckle gives you the data-quality toolkit to put clean, deduplicated, validated rows in front of your AI before you spend a single embedding token. The same canvas that builds your ETL also builds your RAG ingestion.

  • Deduplicate with exact Distinct, Uniqueness, Fuzzy Deduplicate, and Record Match.
  • Semantic dedupe to collapse near-duplicate text by meaning.
  • Profile and describe columns up front so you know what you are feeding the model.
  • Validate and route failures to a reject port instead of poisoning your index.
  • Normalize types, encodings, and casing.
  • Redact PII before embedding.
  • Chunk and embed for RAG.
  • Classify rows with an LLM.
  • Hybrid retrieval locally with Vector Similarity Search (cosine, L2, inner product via vss) and Full-Text Search (BM25 via fts), no model API.
  • Land it in pgvector, Pinecone, Qdrant, Weaviate, or Milvus.
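As an illustration of the "redact PII before embedding" step, here is a minimal regex-based sketch. The patterns are simplified assumptions (US-style phone numbers and SSNs, 13 to 16 digit card numbers with optional separators) and are not the patterns that xf.ai.pii ships with; the replacement tokens are likewise hypothetical.

```python
import re

# Order matters: card numbers are the longest digit runs, so they are replaced first.
PII_PATTERNS = [
    ("[CARD]", re.compile(r"\b\d(?:[ -]?\d){12,15}\b")),
    ("[SSN]", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    ("[PHONE]", re.compile(
        r"(?<!\d)(?:\+?1[ .-]?)?\(?\d{3}\)?[ .-]?\d{3}[ .-]?\d{4}(?!\d)")),
    ("[EMAIL]", re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")),
]


def redact(text: str) -> str:
    """Replace each PII match with a placeholder token, one pattern at a time."""
    for token, pattern in PII_PATTERNS:
        text = pattern.sub(token, text)
    return text
```

Running redact on "Mail ann@example.com or call 415-555-0100" gives "Mail [EMAIL] or call [PHONE]". Regex redaction is cheap and fully local, but it only catches the formats it is written for, so it is worth profiling a sample of the output before embedding.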

Example: RAG ingestion

src.s3 (markdown)
  -> xf.ai.chunk (chunkSize=1500, overlap=150)
  -> xf.ai.pii (redact)
  -> xf.ai.embed (model=text-embedding-3-small)
  -> xf.ai.dedupe (threshold=0.95)
  -> snk.pgvector (table=docs)
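
To show what the chunk and dedupe parameters in the pipeline above mean, here is a rough standard-library sketch. It assumes chunkSize and overlap are measured in characters and that dedupe keeps the first row of each near-duplicate group; Duckle's real units and tie-breaking may differ, and the "embedding" key is a hypothetical column name.

```python
import math


def chunk_text(text: str, chunk_size: int = 1500, overlap: int = 150) -> list[str]:
    """Split text into fixed-size windows where each window repeats `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def semantic_dedupe(rows: list[dict], threshold: float = 0.95) -> list[dict]:
    """Keep a row only if its embedding is below the threshold against every kept row."""
    kept = []
    for row in rows:
        if all(cosine(row["embedding"], k["embedding"]) < threshold for k in kept):
            kept.append(row)
    return kept
```

With chunk_size=4 and overlap=1, chunk_text("abcdefghij") returns ["abcd", "defg", "ghij"]: each chunk starts with the last character of the previous one. The dedupe loop compares each row against all kept rows, which is quadratic in the worst case and therefore only suitable as an illustration for small batches.
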

Going further: The same OpenAI-compatible plumbing connects Claude or any LLM to Duckle via the MCP server. See Scheduling, MCP & deploy.