Component reference
Duckle ships 329 components across six kinds: 104 sources (src.*), 128 transforms (xf.*), 59 sinks (snk.*), 12 data-quality validators (qa.*), 19 control-flow nodes (ctl.*), and 7 code runners (code.*). Every component has Basic, Schema, Preview, Advanced, and Validation tabs, plus a live preview.
104 sources
Files, lakehouse, databases, warehouses, object storage, streaming, SaaS APIs, NoSQL, vector DBs and more, all under src.*.
128 transforms
Fields, rows, aggregates, joins, windows, strings, dates, JSON, arrays, CDC/SCD, AI and search, all under xf.*.
59 sinks
Files, databases, warehouses, object storage, vector DBs, messaging and comms, all under snk.* with universal write modes.
Looking for a connector-only list grouped by system? See the integrations directory. New to Duckle? Start with install & quickstart.
Sources src.*
Sources read data into a pipeline. Connection-backed sources reuse saved, encrypted connections; file sources accept paths, globs and object-storage URIs.
| Group | Items |
|---|---|
| Files | CSV, TSV, Parquet, JSON, JSONL/NDJSON, Excel .xlsx, YAML, TOML, Fixed-width, XML, Apache Avro |
| Geospatial files | GeoJSON, Shapefile, GeoPackage, KML, GPX, GML (via spatial extension) |
| Lakehouse | Apache Iceberg, Delta Lake, DuckLake |
| Embedded DBs | SQLite, DuckDB |
| Network relational | PostgreSQL, MySQL, MariaDB, CockroachDB, SQL Server, Oracle, ClickHouse |
| Object storage | Amazon S3, Google Cloud Storage, Azure Blob, HTTP(S), MinIO, Cloudflare R2, Backblaze B2 |
| Warehouses | MotherDuck, Snowflake, BigQuery, Redshift, Databricks SQL, Azure Synapse, DuckDB Quack |
| Streaming | Apache Kafka/Redpanda, NATS JetStream, GCP Pub/Sub, RabbitMQ, AWS Kinesis |
| SaaS REST | Salesforce, HubSpot, Stripe, Shopify, Notion, Airtable, GitHub, GitLab, Jira, Slack, Zendesk and more, plus generic src.rest / src.graphql |
| API protocols | OData v4, SOAP/XML |
| NoSQL & search | MongoDB, Cassandra/ScyllaDB, Elasticsearch/OpenSearch, Redis, CouchDB, DynamoDB |
| Vector/AI DBs | pgvector, Qdrant, Weaviate, Milvus |
| File transfer | FTP/FTPS, SFTP |
| Mailbox | IMAP |
| Webhook listener | Inbound webhook endpoint |
| Desktop clipboard | Read from the system clipboard |
| Git repo | Read files from a Git repository |
Transforms xf.*
Transforms reshape, combine and enrich rows between sources and sinks. The visual Map node (xf.map) joins a main input to up to 3 lookups with per-output expressions and a filter.
| Group | Items |
|---|---|
| Fields | Map (visual mapper joining a main input to up to 3 lookups with per-output expressions + filter), Project/Select, Cast, Rename, Add/Drop/Reorder Column, Coalesce, UUID v4 |
| Rows | Filter (with reject port), Distinct, Sample, Top N/Limit, Sort, Skip, Top N per Group, Forward/Backward/Constant Fill |
| Aggregate | Group By, Rollup, Cube, Count, Window Aggregate, Cumulative, Approx Quantile (t-digest), Approx Count Distinct (HyperLogLog) |
| Join | Inner, Left, Right, Full Outer, Cross, Lookup, Semi, Anti, Spatial |
| Set ops | Union, Union All, Intersect, Except |
| Window | Row Number, Rank, Dense Rank, Lead, Lag, First Value, Last Value, NTile |
| Strings | Regex Replace/Extract/Match, Split, Concat, Trim, Case, Length, Substring, Format, Hash (md5/sha1/sha256), IP Parse, URL Parse, Text Similarity, Base64, Pad |
| Date/Time | Parse, Format, Extract, Diff/Add, Truncate, Timezone, Time Bin, Now, Epoch |
| Numeric | Round, Modulo, Abs, Log, Power, Sqrt, Bucketize, Z-Score, Clamp, Sign |
| JSON/nested | Parse, Stringify, Flatten, JSONPath, Merge, Array Aggregate |
| Array | Explode/Unnest, Collect List, Element At, Contains, Distinct, Length |
| Pivot/shape | Pivot, Unpivot, Denormalize, Normalize, Transpose |
| CDC/SCD | Incremental Load (watermark), Diff Detect, SCD Type 1, SCD Type 2, Merge/Upsert (with delete propagation), DuckLake CDC reader, Row Hash, Audit Stamp |
| AI/Search | Vector Similarity Search (vss), Full-Text Search BM25 (fts), Embeddings, LLM Transform, Classify, Text Chunker, PII Redact, Semantic Dedupe |
| Geospatial | Spatial Distance, Buffer, Intersects |
| Debug | Log Rows, Assert |
| dbt | xf.dbt runs dbt models (dbt-duckdb / Fusion) |
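To make the Map node's shape concrete, here is a minimal Python sketch of the idea: one main input, up to 3 keyed lookups, per-output expressions and a filter. It is not Duckle's implementation or API. The function `map_rows`, the context-dict layout, the left-join behaviour for missing lookup keys, and applying the filter before the output expressions are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple

Row = Dict[str, object]
# Hypothetical lookup spec: (key column on the main row, table keyed by that value).
Lookup = Tuple[str, Dict[object, Row]]
Expr = Callable[[Dict[str, Row]], object]


def map_rows(
    main: List[Row],
    lookups: Dict[str, Lookup],
    outputs: Dict[str, Expr],
    keep: Optional[Callable[[Dict[str, Row]], bool]] = None,
) -> List[Row]:
    """Join each main row to its lookups, filter, then evaluate output expressions."""
    if len(lookups) > 3:
        raise ValueError("this sketch mirrors the documented limit of 3 lookups")
    result: List[Row] = []
    for row in main:
        ctx: Dict[str, Row] = {"main": row}
        for name, (key_column, table) in lookups.items():
            # Assumption: a missing key behaves like a left join (empty lookup row).
            ctx[name] = table.get(row.get(key_column), {})
        if keep is not None and not keep(ctx):
            continue
        result.append({column: expr(ctx) for column, expr in outputs.items()})
    return result


orders = [
    {"order_id": 1, "customer_id": "c1", "amount": 50},
    {"order_id": 2, "customer_id": "c9", "amount": 5},
]
customers = {"c1": {"name": "Ada"}}
mapped = map_rows(
    orders,
    lookups={"cust": ("customer_id", customers)},
    outputs={
        "order_id": lambda c: c["main"]["order_id"],
        "customer": lambda c: c["cust"].get("name", "unknown"),
    },
    keep=lambda c: c["main"]["amount"] >= 10,
)
```

With the filter in place only the first order survives; dropping `keep` would also emit the second order with the fallback customer name.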
Sinks snk.*
Sinks write pipeline output to a destination. All sinks support the same write modes.
| Group | Items |
|---|---|
| Files | CSV, TSV, Parquet, JSON/JSONL, Excel, XML, Avro, Spatial, Iceberg |
| Databases | Postgres, CockroachDB, MySQL, MariaDB, SQLite, DuckDB |
| Warehouses | Snowflake, BigQuery, Redshift, MotherDuck, Databricks, DuckLake, Quack |
| Object storage | S3, GCS, Azure Blob (Parquet/CSV/JSON) |
| NoSQL | MongoDB |
| Vector DBs | pgvector, Pinecone, Qdrant, Weaviate, Milvus |
| Messaging | Kafka, NATS, Pub/Sub, RabbitMQ |
| Comms | Email (via SMTP) |
| File transfer | FTP/FTPS/SFTP |

| Write mode | Behavior |
|---|---|
| overwrite | Replace the destination contents |
| append | Add rows to existing data |
| truncate | Empty the target, then write |
| upsert | Universal MERGE with optional delete propagation |
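The four write modes can be compared at row level with a small in-memory Python sketch. This is an illustration of the semantics described in the table, not Duckle code. The function `apply_write_mode`, the `key` column, and the reading of "delete propagation" as "remove target rows whose key is absent from the incoming batch" are assumptions.

```python
from typing import Dict, List

Row = Dict[str, object]


def apply_write_mode(
    target: List[Row],
    incoming: List[Row],
    mode: str,
    key: str = "id",
    propagate_deletes: bool = False,
) -> List[Row]:
    """Return the target's rows after writing `incoming` with the given mode."""
    if mode in ("overwrite", "truncate"):
        # At row level both leave only the incoming rows. Any difference in how
        # the destination object or its schema is handled is not modelled here.
        return list(incoming)
    if mode == "append":
        return list(target) + list(incoming)
    if mode == "upsert":
        merged = {row[key]: row for row in target}
        for row in incoming:
            merged[row[key]] = row  # update on key match, insert otherwise
        if propagate_deletes:
            # Assumption: keys missing from the incoming batch are deleted.
            incoming_keys = {row[key] for row in incoming}
            merged = {k: v for k, v in merged.items() if k in incoming_keys}
        return list(merged.values())
    raise ValueError(f"unknown write mode: {mode}")


existing = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}]
batch = [{"id": 2, "v": "B"}, {"id": 3, "v": "c"}]
upserted = apply_write_mode(existing, batch, "upsert")
synced = apply_write_mode(existing, batch, "upsert", propagate_deletes=True)
```

Here `upserted` keeps row 1, updates row 2 and adds row 3, while `synced` additionally drops row 1 because it is absent from the batch.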
Data quality qa.*
Validators split their input: passing rows continue on the main port, while failures route to a reject port for quarantine or re-processing.
| Group | Items |
|---|---|
| Validators | Not Null, Range, Regex, Uniqueness, Schema Validate |
| Profiling | Column Profile, Describe, Histogram |
| Cleansing & matching | Standardize, Fuzzy Deduplicate (Jaro-Winkler/Levenshtein), Record Match |
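The main-port / reject-port split that validators perform can be sketched in plain Python as follows. This is a conceptual illustration only, not the qa.* implementation. The helper names (`not_null`, `in_range`, `matches`, `split_rows`) and the rule that a row is rejected if any check fails are assumptions.

```python
import re
from typing import Callable, Dict, Iterable, List, Tuple

Row = Dict[str, object]
Check = Callable[[Row], bool]


def not_null(column: str) -> Check:
    return lambda row: row.get(column) is not None


def in_range(column: str, low: float, high: float) -> Check:
    def check(row: Row) -> bool:
        value = row.get(column)
        return isinstance(value, (int, float)) and low <= value <= high
    return check


def matches(column: str, pattern: str) -> Check:
    compiled = re.compile(pattern)

    def check(row: Row) -> bool:
        value = row.get(column)
        return isinstance(value, str) and compiled.fullmatch(value) is not None
    return check


def split_rows(rows: Iterable[Row], checks: List[Check]) -> Tuple[List[Row], List[Row]]:
    """Return (passed, rejected); a row is rejected if any check fails."""
    passed: List[Row] = []
    rejected: List[Row] = []
    for row in rows:
        (passed if all(check(row) for check in checks) else rejected).append(row)
    return passed, rejected


rows = [
    {"email": "ada@example.com", "age": 36},
    {"email": None, "age": 41},
    {"email": "not-an-email", "age": 29},
    {"email": "bob@example.com", "age": 230},
]
checks = [not_null("email"), matches("email", r"[^@\s]+@[^@\s]+"), in_range("age", 0, 130)]
passed, rejected = split_rows(rows, checks)
```

Only the first row passes; the other three land in `rejected`, where a pipeline could quarantine or re-process them.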
Control flow ctl.*
Control-flow nodes orchestrate execution, branching, concurrency and error handling across a pipeline (19 total).
| Group | Items |
|---|---|
| Orchestration | Run Job (call a child pipeline with context vars), Iterate, For Each, Schedule trigger |
| Concurrency | Parallelize (concurrent in-pipeline branches, auto-detected from CPU cores) |
| Branching | Switch (route rows to case outputs), Try/fallback |
| Signals | Wait, Die, Warn, Log, and more |
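As a rough picture of what running concurrent branches with a CPU-derived worker count looks like, consider this Python sketch. It uses the standard library's thread pool and is not how ctl.* nodes are implemented. The function `run_branches` and the choice to cap workers at the smaller of the branch count and `os.cpu_count()` are assumptions.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")


def run_branches(
    branches: List[Callable[[], T]],
    max_workers: Optional[int] = None,
) -> List[T]:
    """Run each branch concurrently and return results in branch order."""
    if not branches:
        return []
    # Assumption: default the pool size to the CPU core count, capped at the
    # number of branches so no idle workers are created.
    workers = max_workers or min(len(branches), os.cpu_count() or 1)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(branch) for branch in branches]
        # result() re-raises any exception from the branch, so a failing
        # branch surfaces to the caller instead of being silently dropped.
        return [future.result() for future in futures]


results = run_branches([lambda: 1 + 1, lambda: "loaded", lambda: [3, 4]])
```

Results come back in the order the branches were declared, regardless of which branch finishes first.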
Code code.*
Code runners drop down to custom logic when a visual node is not enough (7 total).
| Runner | Notes |
|---|---|
| Custom SQL | code.sql |
| SQL template | Parameterized SQL |
| Shell | code.shell |
| JavaScript | boa engine |
| WebAssembly | wasmi |
| Python-style routines | Scripted row logic |