Docker
Minimal, production-ready container image to run FastBCP, a high-performance bulk copy utility designed for data integration and automation workflows.
This setup targets FastBCP ≥ 0.28.0, which supports passing the license inline via --license "<content>".
Image Overview
- Base image: debian:trixie-slim
- Entrypoint: /usr/local/bin/FastBCP
- Repository: FastBCP-Image on GitHub
- DockerHub: arpeio/fastbcp
- Published automatically via GitHub Actions for each new release and weekly security updates
Prerequisites
- Docker 24+ (or Podman)
- FastBCP Linux x64 ≥ 0.28.0 binary (for build only)
- Optional: FastBCP_Settings.json to mount/copy into /config for custom logging settings
Using the Prebuilt Image from DockerHub
You can use a prebuilt image from DockerHub that already includes the FastBCP binary. You must provide your own license at runtime.
Available Tags
- Version-specific tags are aligned with FastBCP releases (e.g., v0.28.3)
- The latest tag always points to the most recent FastBCP version
Automatic Updates
- New releases: Images are automatically built when new FastBCP versions are released
- Security updates: The latest release of each minor branch (e.g., v0.27.x, v0.28.x, v0.29.x) is automatically rebuilt weekly (every Monday) with the latest base image and security patches
- This ensures that all actively used versions remain secure without breaking compatibility
- Example: If you use v0.28.8 (the latest of the 0.28.x branch), it gets security updates even after v0.29.0 is released
Pull the Image
# Latest version
docker pull arpeio/fastbcp:latest
# Specific version
docker pull arpeio/fastbcp:v0.28.3
Basic Commands
# Get command line help
docker run --rm arpeio/fastbcp:latest
# Check version
docker run --rm arpeio/fastbcp:latest --version
License Requirement
Since version 0.28.0, pass the license content directly via --license "…".
export licenseContent=$(cat ./FastBCP.lic)
# Use $licenseContent in your docker run commands
docker run --rm arpeio/fastbcp:latest \
--license "$licenseContent" \
[other parameters...]
Prefer --env-file, Docker/Compose/Kubernetes secrets, or managed identities for cloud credentials. Avoid leaving the license content in shell history.
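For example, cloud credentials can be supplied through --env-file rather than individual -e flags, and the license can be read from a file so it never has to be typed on the command line (a minimal sketch; the fastbcp.env file name is illustrative):
# fastbcp.env (illustrative name) contains one KEY=value pair per line, e.g.:
#   AWS_ACCESS_KEY_ID=...
#   AWS_SECRET_ACCESS_KEY=...
#   AWS_REGION=...
docker run --rm \
  --env-file ./fastbcp.env \
  arpeio/fastbcp:latest \
  --license "$(cat ./FastBCP.lic)" \
  [other parameters...]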
Usage Examples
The Docker image uses the FastBCP binary as its entrypoint, so you can run it directly with parameters as defined in the FastBCP documentation.
Example 1: SQL Server → Parquet on S3
export licenseContent=$(cat ./FastBCP.lic)
docker run --rm \
-e AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID} \
-e AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY} \
-e AWS_REGION=${AWS_REGION} \
arpeio/fastbcp:latest \
--connectiontype "mssql" \
--server "host.docker.internal,1433" \
--user "FastUser" \
--password "FastPassword" \
--database "tpch_test" \
--query "SELECT * FROM dbo.orders WHERE year(o_orderdate)=1998" \
--fileoutput "orders.parquet" \
--directory "s3://arpeioftoutput/dockertest/" \
--paralleldegree 12 \
--parallelmethod "Ntile" \
--distributekeycolumn "o_orderkey" \
--merge false \
--license "$licenseContent"
Example 2: SQL Server → CSV with Custom Logging
export licenseContent=$(cat ./FastBCP.lic)
docker run --rm \
-v fastbcp-config:/config \
-v fastbcp-data:/data \
-v fastbcp-logs:/logs \
arpeio/fastbcp:latest \
--settingsfile "/config/FastBCP_Settings_Logs_To_Files.json" \
--connectiontype "mssql" \
--server "host.docker.internal,1433" \
--user "FastUser" \
--password "FastPassword" \
--database "tpch_test" \
--query "SELECT * FROM dbo.orders WHERE year(o_orderdate)=1998" \
--fileoutput "orders.csv" \
--directory "/data/orders/csv" \
--delimiter "|" \
--decimalseparator "." \
--dateformat "yyyy-MM-dd HH:mm:ss" \
--paralleldegree 12 \
--parallelmethod "Ntile" \
--distributekeycolumn "o_orderkey" \
--merge false \
--license "$licenseContent"
Example 3: PostgreSQL → Parquet on Azure Data Lake Storage (ADLS)
export licenseContent=$(cat ./FastBCP.lic)
export adlscontainer="arpeioadlseu"
docker run --rm \
-e AZURE_CLIENT_ID=${AZURE_CLIENT_ID} \
-e AZURE_TENANT_ID=${AZURE_TENANT_ID} \
-e AZURE_CLIENT_SECRET=${AZURE_CLIENT_SECRET} \
arpeio/fastbcp:latest \
--connectiontype "pgcopy" \
--server "host.docker.internal:15432" \
--user "FastUser" \
--password "FastPassword" \
--database "tpch" \
--sourceschema "tpch_10" \
--sourcetable "orders" \
--query "SELECT * FROM tpch_10.orders WHERE o_orderdate >= '1998-01-01' AND o_orderdate < '1999-01-01'" \
--fileoutput "orders.parquet" \
--directory "abfss://${adlscontainer}.dfs.core.windows.net/fastbcpoutput/testdfs/orders" \
--paralleldegree -2 \
--parallelmethod "Ctid" \
--license "$licenseContent"
Example 4: Oracle → Parquet on Google Cloud Storage (GCS)
export licenseContent=$(cat ./FastBCP.lic)
export gcsbucket="arpeio-gcs-bucket"
export GOOGLE_APPLICATION_CREDENTIALS_JSON=$(cat ./gcp-credentials.json)
docker run --rm \
-e GOOGLE_APPLICATION_CREDENTIALS="${GOOGLE_APPLICATION_CREDENTIALS_JSON}" \
arpeio/fastbcp:latest \
--connectiontype "oraodp" \
--server "host.docker.internal:1521/FREEPDB1" \
--user "TPCH_IN" \
--password "TPCH_IN" \
--database "FREEPDB1" \
--sourceschema "TPCH_IN" \
--sourcetable "ORDERS" \
--fileoutput "orders.parquet" \
--directory "gs://${gcsbucket}/fastbcpoutput/testgs/orders" \
--parallelmethod "Rowid" \
--paralleldegree -2 \
--license "$licenseContent"
Volumes
The Docker image declares several volumes to organize data and configuration:
VOLUME ["/config", "/data", "/work", "/logs"]
Volume Configuration and Access Modes
| Volume | Purpose | Access Mode | Notes |
|---|---|---|---|
| /config | Contains user-provided configuration files (e.g., Serilog settings) | Read-only / Read-many | Shared across multiple containers; not modified |
| /data | Input/output data directory | Read-many / Write-many | Stores imported or exported data files |
| /work | Temporary working directory (container WORKDIR) | Read-many / Write-many | Used internally for temporary processing |
| /logs | Log output directory (per-run or aggregated logs) | Read-many / Write-many | Stores runtime and execution logs |
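If you use named volumes (as in Example 2 above), they can be created up front; a minimal sketch:
# Pre-create the named volumes referenced in Example 2
docker volume create fastbcp-config
docker volume create fastbcp-data
docker volume create fastbcp-logs
# Inspect where a volume's data lives on the host
docker volume inspect fastbcp-config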
Configuring FastBCP Logging
Available starting from version v0.28.3
FastBCP supports custom logging configuration through an external Serilog settings file in JSON format. This allows you to control how and where logs are written — to the console, to files, or dynamically per run.
You can download an example logging settings file directly from GitHub: FastBCP_Settings_Logs_To_Files.json
Custom settings files must be mounted into the container under the /config directory.
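For example, a downloaded settings file can be exposed through a read-only bind mount of a local directory onto /config (a minimal sketch; the ./config host directory is illustrative, and the complete named-volume example appears further below):
# Assumes the example settings file has been downloaded to ./config on the host
docker run --rm \
  -v "$(pwd)/config:/config:ro" \
  arpeio/fastbcp:latest \
  --settingsfile "/config/FastBCP_Settings_Logs_To_Files.json" \
  --license "$licenseContent" \
  [other parameters...]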
Example: Logging to Console, Airflow, and Dynamic Log Files
The following configuration is recommended for most production or Airflow environments. It writes:
- Logs to the console for real-time visibility
- Run summary logs to /airflow/xcom/return.json for Airflow integration
- Per-run logs under /logs, automatically named with {LogTimestamp} and {TraceId}
{
"Serilog": {
"Using": [
"Serilog.Sinks.Console",
"Serilog.Sinks.File",
"Serilog.Enrichers.Environment",
"Serilog.Enrichers.Thread",
"Serilog.Enrichers.Process",
"Serilog.Enrichers.Context",
"Serilog.Formatting.Compact"
],
"WriteTo": [
{
"Name": "Console",
"Args": {
"outputTemplate": "{Timestamp:yyyy-MM-ddTHH:mm:ss.fff zzz} -|- {Application} -|- {runid} -|- {Level:u12} -|- {fulltargetname} -|- {Message}{NewLine}{Exception}",
"theme": "Serilog.Sinks.SystemConsole.Themes.ConsoleTheme::None, Serilog.Sinks.Console",
"applyThemeToRedirectedOutput": false
}
},
{
"Name": "File",
"Args": {
"path": "/airflow/xcom/return.json",
"formatter": "Serilog.Formatting.Compact.CompactJsonFormatter, Serilog.Formatting.Compact"
}
},
{
"Name": "Map",
"Args": {
"to": [
{
"Name": "File",
"Args": {
"path": "/logs/{logdate}/{sourcedatabase}/log-{filename}-{LogTimestamp}-{TraceId}.json",
"formatter": "Serilog.Formatting.Compact.CompactJsonFormatter, Serilog.Formatting.Compact",
"rollingInterval": "Infinite",
"shared": false,
"encoding": "utf-8"
}
}
]
}
}
],
"Enrich": [
"FromLogContext",
"WithMachineName",
"WithProcessId",
"WithThreadId"
],
"Properties": {
"Application": "FastBCP"
}
}
}
- If a target directory (such as /logs or /airflow/xcom) does not exist, FastBCP automatically creates it.
- The file /airflow/xcom/return.json is designed to provide run summaries compatible with Airflow's XCom mechanism (see the sketch below).
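Outside of Airflow, the run summary can still be collected by bind-mounting a host directory at /airflow/xcom (a sketch; the ./xcom host path is illustrative):
# Capture the Airflow-compatible run summary on the host
mkdir -p ./xcom
docker run --rm \
  -v fastbcp-config:/config \
  -v "$(pwd)/xcom:/airflow/xcom" \
  arpeio/fastbcp:latest \
  --settingsfile "/config/FastBCP_Settings_Logs_To_Files.json" \
  [other parameters...] \
  --license "$licenseContent"
# The summary is then available at ./xcom/return.json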
Available Tokens for Path or Filename Formatting
You can use the following placeholders to dynamically generate log file names or directories:
| Token | Description |
|---|---|
| {logdate} | Current date in yyyy-MM-dd format |
| {logtimestamp} | Full timestamp of the log entry |
| {sourcedatabase} | Name of the source database |
| {sourceschema} | Name of the source schema |
| {sourcetable} | Name of the source table |
| {filename} | Name of the file being processed |
| {runid} | Run identifier provided in the command line |
| {traceid} | Unique trace identifier generated at runtime |
Mounting a Custom Settings File
Your Serilog configuration file (for example, FastBCP_Settings_Logs_To_Files.json) must be placed in /config, either by mounting a local directory or by using a Docker named volume.
Example with named volumes:
# First, copy your config file to a volume location
cp ~/FastBCP_Settings_Logs_To_Files.json /volumes/fastbcp-config/
# Then run FastBCP with mounted volumes
export licenseContent=$(cat ./FastBCP.lic)
docker run --rm \
-v fastbcp-config:/config \
-v fastbcp-data:/data \
-v fastbcp-logs:/logs \
arpeio/fastbcp:latest \
--settingsfile "/config/FastBCP_Settings_Logs_To_Files.json" \
--connectiontype "mssql" \
--server "host.docker.internal,1433" \
--user "FastUser" \
--password "FastPassword" \
--database "tpch_test" \
--query "SELECT * FROM dbo.orders" \
--fileoutput "orders.csv" \
--directory "/data" \
--paralleldegree 12 \
--parallelmethod "Ntile" \
--distributekeycolumn "o_orderkey" \
--merge false \
--license "$licenseContent"
If the --settingsfile argument is not provided, FastBCP will use its built-in default logging configuration.