Duckie AI and AI transforms

Duckle ships a local AI assistant, Duckie, plus six AI transforms: three run purely locally and three call any OpenAI-compatible API.

Duckie, the local assistant

Duckie is powered by Qwen 2.5 Coder 1.5B via llama.cpp. The model is downloaded once (~1.1 GB) and runs entirely on your CPU as a llama-server subprocess bound to 127.0.0.1. Describe a pipeline in plain English and Duckie streams back valid Duckle pipeline JSON; an "Insert into canvas" button then drops the positioned, wired nodes onto the canvas.

Truly local - no API key, works offline.
Streamed responses - watch the pipeline JSON appear token by token.
One-click insert - the "Insert into canvas" button drops positioned, wired nodes.
Bring-your-own-model - point baseUrl at Ollama, OpenAI, llama.cpp, Cohere, Voyage, anything OpenAI-shaped.
Sandboxed - the model has no filesystem, network, or tool access; it only emits text.

AI transforms

Component	What it does	Local?
Embeddings (`xf.ai.embed`)	OpenAI-compatible `/v1/embeddings`	Needs API
LLM Transform (`xf.ai.llm`)	Per-row chat completion with `{column}` templates	Needs API
Classify (`xf.ai.classify`)	LLM-backed; constrains output to N user-defined categories and normalizes anything else to `UNKNOWN`	Needs API
Text Chunker (`xf.ai.chunk`)	RAG-ready splitter	Pure local
PII Redact (`xf.ai.pii`)	Regex redaction of emails, phones, SSNs, cards	Pure local
Semantic Dedupe (`xf.ai.dedupe`)	Cosine similarity over precomputed embeddings	Pure local
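
Two behaviors in the table are easy to picture in code: filling `{column}` placeholders per row, and forcing a free-form model reply into a fixed category set. The following is a minimal sketch of both ideas under stated assumptions. The function names are hypothetical, and the matching rules (case-insensitive exact match, unknown placeholders left untouched) are guesses rather than Duckle's documented behavior.

```python
import re
from typing import Mapping, Sequence

_PLACEHOLDER = re.compile(r"\{(\w+)\}")


def render_prompt(template: str, row: Mapping[str, object]) -> str:
    """Replace each {column} with that column's value from the row.

    Placeholders naming a column the row lacks are left as-is (assumption).
    """
    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        return str(row[name]) if name in row else match.group(0)

    return _PLACEHOLDER.sub(substitute, template)


def normalize_label(raw: str, categories: Sequence[str]) -> str:
    """Map a raw model reply onto one allowed category, else 'UNKNOWN'."""
    cleaned = raw.strip().strip(".\"'").casefold()
    for category in categories:
        if cleaned == category.casefold():
            return category
    return "UNKNOWN"


row = {"title": "Refund not received", "lang": "en"}
prompt = render_prompt("Classify this ticket ({lang}): {title}", row)
label = normalize_label(" Billing. ", ["billing", "shipping"])
```

Normalizing to a single sentinel value keeps the output column a closed set, so downstream filters and joins never see free-form model text.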

Clean data before it reaches your AI

Garbage in, garbage out applies doubly to retrieval and model context. Duckle gives you the data-quality toolkit to put clean, deduplicated, validated rows in front of your AI before you spend a single embedding token. The same canvas that builds your ETL also builds your RAG ingestion.

Deduplicate with exact Distinct, Uniqueness, Fuzzy Deduplicate, and Record Match.
Semantic dedupe to collapse near-duplicate text by meaning.
Profile and describe columns up front so you know what you are feeding the model.
Validate and route failures to a reject port instead of poisoning your index.
Normalize types, encodings, and casing.
Redact PII before embedding.
Chunk and embed for RAG.
Classify rows with an LLM.
Run hybrid retrieval locally with Vector Similarity Search (cosine, L2, inner product via vss) and Full-Text Search (BM25 via fts), with no model API required.
Land it in pgvector, Pinecone, Qdrant, Weaviate, or Milvus.
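
Two of the cleaning steps in the list above, regex PII redaction and cosine-based semantic dedupe, can be sketched in a few lines. This is an illustration only: the patterns are deliberately simple US-style examples, and the placeholder tokens, the `embedding` key, and the keep-first strategy are assumptions rather than Duckle's actual rules.

```python
import math
import re
from typing import Dict, List, Sequence

# Order matters: long card numbers are redacted before shorter digit patterns.
_PII_PATTERNS = [
    ("[EMAIL]", re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")),
    ("[CARD]", re.compile(r"\b\d(?:[ -]?\d){12,15}\b")),   # 13 to 16 digits
    ("[SSN]", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    ("[PHONE]", re.compile(r"\b\d{3}[-. ]\d{3}[-. ]\d{4}\b")),
]


def redact_pii(text: str) -> str:
    """Replace emails, card numbers, SSNs, and phone numbers with tokens."""
    for token, pattern in _PII_PATTERNS:
        text = pattern.sub(token, text)
    return text


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_dedupe(rows: List[Dict], threshold: float = 0.95) -> List[Dict]:
    """Keep a row only if it is below the threshold against every kept row."""
    kept: List[Dict] = []
    for row in rows:
        if all(cosine(row["embedding"], k["embedding"]) < threshold for k in kept):
            kept.append(row)
    return kept


rows = [
    {"id": 1, "embedding": [1.0, 0.0]},
    {"id": 2, "embedding": [0.99, 0.05]},  # near-duplicate of row 1
    {"id": 3, "embedding": [0.0, 1.0]},
]
unique_ids = [r["id"] for r in semantic_dedupe(rows)]
clean = redact_pii("Mail jane@example.com or call 555-123-4567, SSN 123-45-6789.")
```

The pairwise comparison is quadratic in the number of kept rows, which is fine for a sketch but would need an index for large tables.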

Example: RAG ingestion

src.s3 (markdown)
  -> xf.ai.chunk (chunkSize=1500, overlap=150)
  -> xf.ai.pii (redact)
  -> xf.ai.embed (model=text-embedding-3-small)
  -> xf.ai.dedupe (threshold=0.95)
  -> snk.pgvector (table=docs)
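
To make the `chunkSize=1500, overlap=150` settings concrete, here is a minimal sliding-window sketch. It assumes sizes are counted in characters and that splits ignore sentence or heading boundaries; the real chunker may count tokens or split on separators, so treat `chunk_text` as a hypothetical stand-in.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 1500, overlap: int = 150) -> List[str]:
    """Split text into windows of chunk_size characters sharing `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # each window starts this far after the previous one
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


# A 4000-character document yields windows starting at 0, 1350, and 2700.
document = "".join(chr(97 + i % 26) for i in range(4000))
chunks = chunk_text(document)
```

With these settings each window advances 1350 characters, so the last 150 characters of one chunk reappear at the start of the next, which helps keep a sentence that straddles a boundary retrievable from either side.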

Going further: The same OpenAI-compatible plumbing connects Claude or any LLM to Duckle via the MCP server. See Scheduling, MCP & deploy.