Duckie AI and AI transforms
Duckle ships a local AI assistant (Duckie) plus six AI transforms; three are pure-local, three call any OpenAI-compatible API.
Duckie, the local assistant
Duckie is powered by Qwen 2.5 Coder 1.5B via llama.cpp. The model is downloaded once (~1.1 GB) and runs entirely on your CPU as a llama-server subprocess bound to 127.0.0.1. Describe a pipeline in English and Duckie streams back valid Duckle pipeline JSON; an "Insert into canvas" button then drops the positioned, wired nodes onto the canvas.
- Truly local - no API key, works offline.
- Streamed responses - watch the pipeline JSON appear token by token.
- One-click insert - the "Insert into canvas" button drops positioned, wired nodes.
- Bring-your-own-model - point `baseUrl` at Ollama, OpenAI, llama.cpp, Cohere, Voyage, anything OpenAI-shaped.
- Sandboxed - the model has no filesystem, network, or tool access; it only emits text.
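
As a rough illustration of what "OpenAI-shaped" means in practice, the sketch below builds (without sending) a request to an `/v1/embeddings` endpoint. The helper name `build_embeddings_request`, the choice to append `/v1/embeddings` to the base URL, and the example URL and model name are assumptions for illustration, not Duckle's documented behavior; check how your provider expects the base URL to be written.

```python
import json
import urllib.request


def build_embeddings_request(base_url, model, texts, api_key=None):
    """Build (but do not send) a POST to an OpenAI-compatible embeddings endpoint.

    Assumption: base_url is the server root and the path /v1/embeddings
    is appended here. Some tools expect the /v1 suffix in the base URL instead.
    """
    url = base_url.rstrip("/") + "/v1/embeddings"
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Local servers often need no key, so the header is optional.
        headers["Authorization"] = f"Bearer {api_key}"
    body = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Hypothetical local endpoint and model name, used only as an example.
request = build_embeddings_request(
    "http://localhost:11434", "nomic-embed-text", ["hello world"]
)
print(request.full_url)
```

Switching providers in this sketch only changes the base URL, the model name, and whether a key is supplied; the request body keeps the same shape.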
AI transforms
| Component | What it does | Local? |
|---|---|---|
| Embeddings (`xf.ai.embed`) | Calls an OpenAI-compatible `/v1/embeddings` endpoint | Needs API |
| LLM Transform (`xf.ai.llm`) | Per-row chat completion with `{column}` templates | Needs API |
| Classify (`xf.ai.classify`) | LLM-backed; constrains output to N user-defined categories and normalizes anything else to UNKNOWN | Needs API |
| Text Chunker (`xf.ai.chunk`) | RAG-ready splitter | Pure local |
| PII Redact (`xf.ai.pii`) | Regex redaction of emails, phones, SSNs, cards | Pure local |
| Semantic Dedupe (`xf.ai.dedupe`) | Cosine similarity over precomputed embeddings | Pure local |
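
To make the `{column}` templating and the UNKNOWN fallback in the table concrete, here is a minimal sketch of both ideas. The function names `render_template` and `normalize_label`, the case-insensitive matching, and the punctuation stripping are assumptions for illustration; Duckle's actual rules may differ.

```python
import re


def render_template(template, row):
    """Replace each {column} placeholder with that column's value from the row."""
    def lookup(match):
        name = match.group(1)
        if name not in row:
            raise KeyError(f"template references unknown column: {name}")
        return str(row[name])

    return re.sub(r"\{(\w+)\}", lookup, template)


def normalize_label(raw, categories, fallback="UNKNOWN"):
    """Map a raw model answer onto one of the allowed categories.

    Assumption: matching ignores case, surrounding whitespace, and
    trailing punctuation. Anything outside the category list becomes
    the fallback label.
    """
    cleaned = raw.strip().strip(".\"'").casefold()
    for category in categories:
        if cleaned == category.casefold():
            return category
    return fallback


row = {"title": "Refund not received", "body": "I was charged twice."}
prompt = render_template("Classify this ticket: {title}. {body}", row)
label = normalize_label(" Billing. ", ["billing", "support", "sales"])
print(prompt)
print(label)
```

Normalizing off-list answers to a single fallback keeps the output column to a closed set of values, which makes downstream filtering and routing predictable.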
Clean data before it reaches your AI
Garbage in, garbage out applies doubly to retrieval and model context. Duckle gives you the data-quality toolkit to put clean, deduplicated, validated rows in front of your AI before you spend a single embedding token. The same canvas that builds your ETL also builds your RAG ingestion.
- Deduplicate with exact Distinct, Uniqueness, Fuzzy Deduplicate, and Record Match.
- Semantic dedupe to collapse near-duplicate text by meaning.
- Profile and describe columns up front so you know what you are feeding the model.
- Validate and route failures to a reject port instead of poisoning your index.
- Normalize types, encodings, and casing.
- Redact PII before embedding.
- Chunk and embed for RAG.
- Classify rows with an LLM.
- Hybrid retrieval locally with Vector Similarity Search (cosine, L2, inner product via `vss`) and Full-Text Search (BM25 via `fts`), no model API.
- Land it in pgvector, Pinecone, Qdrant, Weaviate, or Milvus.
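
Semantic dedupe by cosine similarity over precomputed embeddings can be sketched as a greedy pass that keeps a row only if it is not too similar to any row already kept. This is one simple strategy, assumed here for illustration and not confirmed as Duckle's algorithm; the `embedding` field name and the function names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def semantic_dedupe(rows, threshold=0.95):
    """Keep the first row of each near-duplicate group.

    A row is dropped when its similarity to any already-kept row
    reaches the threshold. This greedy pass is quadratic in the
    number of kept rows, which is fine for a sketch.
    """
    kept = []
    for row in rows:
        if all(cosine(row["embedding"], k["embedding"]) < threshold for k in kept):
            kept.append(row)
    return kept


rows = [
    {"id": 1, "embedding": [1.0, 0.0]},
    {"id": 2, "embedding": [0.99, 0.05]},  # nearly parallel to row 1
    {"id": 3, "embedding": [0.0, 1.0]},    # orthogonal to row 1
]
print([r["id"] for r in semantic_dedupe(rows)])
```

A higher threshold removes only very close paraphrases; a lower one collapses looser matches at the risk of dropping distinct rows.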
Example: RAG ingestion
src.s3 (markdown)
-> xf.ai.chunk (chunkSize=1500, overlap=150)
-> xf.ai.pii (redact)
-> xf.ai.embed (model=text-embedding-3-small)
-> xf.ai.dedupe (threshold=0.95)
-> snk.pgvector (table=docs)
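
The first two steps of this pipeline can be approximated in plain Python. The sketch below is a character-based chunker using the same `chunkSize=1500` and `overlap=150` values, followed by regex redaction. The function names, the character-based splitting, the specific patterns, and the `[EMAIL]`-style placeholders are all assumptions for illustration, not Duckle's implementation; real PII detection needs broader patterns than these.

```python
import re


def chunk_text(text, chunk_size=1500, overlap=150):
    """Split text into fixed-size character windows that overlap."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# Applied in this order so longer digit runs (cards) are masked first.
PII_PATTERNS = {
    "CARD": re.compile(r"\b(?:\d[ -]?){12,15}\d\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[ .-]\d{3}[ .-]\d{4}\b"),
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
}


def redact_pii(text):
    """Replace each matched span with a bracketed placeholder such as [EMAIL]."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


document = "Contact jane.doe@example.com for access. " * 100
for chunk in chunk_text(document):
    print(len(redact_pii(chunk)))
```

The overlap means the tail of each chunk is repeated at the head of the next, so a sentence that straddles a boundary still appears whole in at least one chunk when it is shorter than the overlap.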
Going further: The same OpenAI-compatible plumbing connects Claude or any LLM to Duckle via the MCP server. See Scheduling, MCP & deploy, the Learn hub, and Use cases.