Table of Contents
- Introduction
- What Is Redash?
- Core Architecture
- Installation Options
- Connecting Data Sources
- Writing Queries in Redash
- Visualizations: From Tables to Advanced Charts
- Building Interactive Dashboards
- Security, Authentication, and Permissions
- Scaling Redash for Production
- Extending Redash: Custom Visualizations & Plugins
- Comparing Redash to Other Open‑Source BI Tools
- Real‑World Use Cases & Success Stories
- Common Troubleshooting Scenarios
- Best Practices Checklist
- Conclusion
- Resources
Introduction
Data‑driven decision‑making is no longer a luxury; it’s a baseline expectation for modern organizations. While enterprise BI platforms such as Tableau, Power BI, or Looker dominate the market, many teams—especially startups, small‑to‑medium businesses, and data‑centric engineering groups—need a lightweight, cost‑effective, and highly flexible solution. This is where Redash shines.
Born out of the data‑engineering community at Everything as Code and later open‑sourced under the Apache 2.0 license, Redash provides a SQL‑first interface to a wide range of data sources, enabling analysts to write queries, visualize results, and share interactive dashboards with minimal friction.
In this article we will:
- Explore Redash’s architecture and design philosophy.
- Walk through every major installation path (Docker, Kubernetes, manual).
- Show how to connect, query, and visualize data from multiple sources.
- Discuss security, scaling, and extensibility.
- Compare Redash with other popular open‑source BI tools.
- Provide a practical checklist you can use to adopt Redash in production.
By the end, you’ll have a full‑stack understanding of Redash and be ready to deploy it as the central analytics hub for your organization.
What Is Redash?
Redash is an open‑source data collaboration platform that focuses on:
| Feature | Description |
|---|---|
| SQL‑first query editor | Write native SQL (or query language supported by the source) directly in the browser. |
| Broad source support | 30+ data sources out‑of‑the‑box: PostgreSQL, MySQL, BigQuery, Snowflake, Elasticsearch, MongoDB, Prometheus, etc. |
| Visualization library | 20+ chart types (line, bar, pie, heatmap, choropleth, Sankey, etc.) plus custom visualizations via JavaScript. |
| Dashboard sharing | Public or private URLs, embed codes, and scheduled email reports. |
| Alerting | Threshold‑based alerts that trigger webhooks, Slack messages, or email. |
| Collaboration | Query comments, version history, and team‑wide query libraries. |
| Self‑hosted | Deploy on-premise or in any cloud environment; also offered as a hosted SaaS by Redash (now part of Databricks). |
Its SQL‑first stance makes Redash a natural fit for data engineers comfortable with the language of their data warehouses. The platform abstracts away the need for complex ETL pipelines when you just want to explore data quickly.
Core Architecture
Understanding Redash’s architecture helps you design a robust deployment and diagnose performance bottlenecks.
+-------------------+ +-------------------+ +-------------------+
| Front‑End (React)| <----> | API Server (Flask| <----> | Workers (RQ) |
| (single‑page UI) | | + SQLAlchemy) | | (async tasks) |
+-------------------+ +-------------------+ +-------------------+
^ ^ ^
| | |
| | |
| | |
+------+-------+ +------+-------+ +------+-------+
| PostgreSQL | | Redis (RQ) | | Redis (Cache) |
+--------------+ +--------------+ +-----------------+
- Front‑End – A React SPA that communicates with the Flask API over HTTP(S).
- API Server – Written in Python (Flask) and handles authentication, query validation, and orchestration of background jobs.
- Workers – Powered by RQ (Redis Queue), they execute long‑running queries, generate visualizations, and send alerts.
- PostgreSQL – Stores metadata: users, queries, dashboards, query results (cached).
- Redis – Two instances: one for the RQ job queue, another for caching query results and session data.
Key takeaways:
- Stateless API – You can horizontally scale the API server behind a load balancer.
- Job‑based execution – Queries run asynchronously, preventing UI blocking.
- Result caching – Redash caches query results (by default 24 h) to avoid re‑querying expensive data sources.
Installation Options
Redash can be installed in several ways. Choose the method that matches your operational expertise and environment.
4.1 Docker Compose
The simplest route for development or small production setups is the official docker-compose.yml provided by Redash.
# docker-compose.yml
version: '3.7'
services:
server:
image: redash/redash:10.2.0 # replace with latest tag
command: server
env_file: .env
depends_on:
- postgres
- redis
ports:
- "5000:5000"
restart: unless-stopped
scheduler:
image: redash/redash:10.2.0
command: scheduler
env_file: .env
depends_on:
- postgres
- redis
restart: unless-stopped
worker:
image: redash/redash:10.2.0
command: worker
env_file: .env
depends_on:
- postgres
- redis
restart: unless-stopped
redis:
image: redis:6-alpine
restart: unless-stopped
postgres:
image: postgres:13-alpine
environment:
POSTGRES_PASSWORD: redash
POSTGRES_USER: redash
POSTGRES_DB: redash
volumes:
- pgdata:/var/lib/postgresql/data
restart: unless-stopped
volumes:
pgdata:
Create a .env file with the minimal required variables:
POSTGRES_PASSWORD=redash
POSTGRES_USER=redash
POSTGRES_DB=redash
REDASH_COOKIE_SECRET=$(openssl rand -hex 32)
REDASH_LOG_LEVEL=INFO
Then start:
docker compose up -d
The UI will be reachable at http://localhost:5000. This setup is ideal for quick prototyping, CI pipelines, or a single‑node production environment with modest traffic.
4.2 Kubernetes / Helm Chart
For enterprises running Kubernetes, Redash offers an official Helm chart (redash/redash). The chart abstracts away the Docker‑compose components into Deployments, Services, and PersistentVolumeClaims.
helm repo add redash https://helm.redash.io
helm repo update
helm install my-redash redash/redash \
--set postgresql.postgresqlPassword=redash \
--set redis.password= \
--set admin.password=admin123 \
--set env.REDASH_COOKIE_SECRET=$(openssl rand -hex 32)
Key Helm values you may want to tweak:
| Parameter | Description |
|---|---|
service.type | ClusterIP (default), LoadBalancer, or NodePort depending on exposure needs. |
worker.replicaCount | Number of background worker pods; increase for heavy query load. |
scheduler.replicaCount | Typically 1; keep low because it only schedules jobs. |
persistence.enabled | Set to true to provision PVCs for PostgreSQL data. |
resources | CPU/memory limits per component. |
The chart also supports Ingress configuration for TLS termination and integrates with external secret stores (AWS Secrets Manager, HashiCorp Vault) via Helm hooks.
4.3 Manual Source Install
If you prefer a bare‑metal approach or need to customize the Python code, install from source.
# Prerequisites
sudo apt-get install -y python3 python3-pip python3-venv postgresql redis-server libpq-dev
# Create a system user
sudo useradd -r -m -s /bin/bash redash
# Clone repo
git clone https://github.com/getredash/redash.git /opt/redash
cd /opt/redash
# Create virtualenv
python3 -m venv venv
source venv/bin/activate
# Install Python dependencies
pip install -r requirements.txt
# Configure environment (example)
cat <<EOF > .env
REDASH_DATABASE_URL=postgresql://redash:redash@localhost/redash
REDASH_REDIS_URL=redis://localhost:6379/0
REDASH_COOKIE_SECRET=$(openssl rand -hex 32)
REDASH_LOG_LEVEL=INFO
EOF
# Initialize DB
./manage.py database create_tables
./manage.py database create_default_user --admin --password admin123
# Run services (in production use systemd)
# Server
./manage.py runserver &
# Scheduler
./manage.py scheduler &
# Worker
./manage.py worker &
This method gives you full control over dependencies and allows you to apply custom patches or integrate with internal authentication mechanisms (e.g., LDAP, SAML) before building a Docker image.
Connecting Data Sources
Redash’s “Data Sources” UI abstracts connection details into named endpoints. Below are common patterns for three popular warehouses.
1. PostgreSQL / MySQL
- Type:
PostgreSQLorMySQL - Host:
db.example.com - Port:
5432(Postgres) or3306(MySQL) - Database:
analytics - User / Password: Application user with
SELECTrights.
-- Test query
SELECT now() AS current_time;
2. Google BigQuery
- Type:
BigQuery - Authentication: Service account JSON key.
- Project ID:
my-gcp-project - Dataset:
sales - JSON key: Upload via the UI or set
GOOGLE_APPLICATION_CREDENTIALSenv var.
SELECT
DATE(order_timestamp) AS order_date,
COUNT(*) AS orders,
SUM(total_amount) AS revenue
FROM `my-gcp-project.sales.orders`
WHERE _PARTITIONTIME BETWEEN TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY) AND CURRENT_TIMESTAMP()
GROUP BY order_date
ORDER BY order_date DESC;
3. Elasticsearch
- Type:
Elasticsearch - URL:
https://es.example.com:9200 - Index pattern:
logs-* - Authentication: Basic auth or API key.
Redash converts the Elasticsearch DSL into a tabular result set. Example query (via UI “Query Editor” using the built‑in DSL):
{
"size": 0,
"aggs": {
"status_codes": {
"terms": {"field": "status_code"}
}
}
}
Resulting table will have columns key (status code) and doc_count.
Writing Queries in Redash
Redash’s query editor is SQL‑centric but also supports:
- Python (via
pandasorNumPy) for ad‑hoc data transformations. - JavaScript (via the Query Results API) for custom post‑processing.
6.1 Parameterizing Queries
Parameters empower non‑technical stakeholders to change filters without editing the raw SQL.
SELECT
DATE(created_at) AS day,
COUNT(*) AS signups
FROM users
WHERE created_at BETWEEN '{{ start_date }}' AND '{{ end_date }}'
GROUP BY day
ORDER BY day;
When you run the query, Redash presents UI controls:
| Parameter | Type | Default | Description |
|---|---|---|---|
start_date | Date | 2024-01-01 | Start of the date range |
end_date | Date | 2024-01-31 | End of the date range |
You can also use list parameters for multi‑select filters:
SELECT *
FROM orders
WHERE status IN ({{ status | split(',') }});
6.2 Query Result Caching
Redash caches query results to minimize load on data warehouses. Cache behavior can be controlled per‑query:
| Setting | Effect |
|---|---|
| Refresh Schedule | Cron‑like schedule (every 1 hour, every day at 02:00). |
| TTL (seconds) | Time‑to‑live for the cached result. Default is 24 h. |
| Cache Disabled | Set “Refresh Automatically” to Never and manually run the query when needed. |
Example: A daily sales KPI query can be refreshed every morning at 04:00 UTC, while a real‑time monitoring query may have a TTL of 60 seconds.
Visualizations: From Tables to Advanced Charts
Redash ships with a visualization builder that maps column types to chart options. Here’s a quick run‑through.
1. Table (Default)
- Use case: Raw data inspection, CSV export.
- Features: Pagination, column sorting, column formatting (currency, percentages).
2. Bar / Column
- Best for: Categorical comparisons (e.g., sales by region).
- Configuration: Choose X‑axis (category), Y‑axis (value), stacked option.
3. Line / Area
- Best for: Time series (e.g., daily active users).
- Options: Smooth lines, interpolation, multiple series.
4. Pie / Donut
- Best for: Proportional breakdowns (e.g., market share).
5. Heatmap
- Best for: Correlation matrices or time‑of‑day activity.
6. Choropleth Map
- Best for: Geo‑spatial data (country, state, zip). Requires a column with ISO‑2/3 codes or latitude/longitude.
7. Sankey Diagram
- Best for: Flow analysis (e.g., user funnel steps).
8. Custom Visualization (JavaScript)
Redash allows you to write custom visualizations using the visualization.js API. Example: embedding a Plotly chart.
// custom_visualization.js
function init(container, queryResult) {
const data = [{
x: queryResult.getColumn('date'),
y: queryResult.getColumn('revenue'),
type: 'scatter',
mode: 'lines+markers',
name: 'Revenue'
}];
Plotly.newPlot(container, data, {title: 'Daily Revenue'});
}
Upload the script via the UI, then select “Custom Visualization” when building a chart. This flexibility enables advanced visualizations like D3 force‑directed graphs or Highcharts stock charts.
Building Interactive Dashboards
A dashboard is a collection of visualizations arranged on a grid. Redash’s dashboard editor offers:
- Drag‑and‑drop layout – Place widgets anywhere, resize, and lock positions.
- Global filters – Apply a single parameter (e.g., date range) across multiple widgets.
- Auto‑refresh – Set a per‑dashboard refresh interval (e.g., every 5 minutes).
- Embedding – Generate an iFrame snippet for internal portals or public sharing.
Example: Marketing KPI Dashboard
| Widget | Source Query | Visualization |
|---|---|---|
| 1. Daily Sessions | SELECT DATE(event_time) AS day, COUNT(*) AS sessions FROM web_events WHERE event_type='page_view' GROUP BY day | Line chart (auto‑refresh 1 min) |
| 2. Campaign Spend vs. Revenue | SELECT campaign, SUM(spend) AS spend, SUM(revenue) AS revenue FROM ad_performance GROUP BY campaign | Bar chart (stacked) |
| 3. Top 10 Landing Pages | SELECT page, COUNT(*) AS visits FROM web_events WHERE event_type='page_view' GROUP BY page ORDER BY visits DESC LIMIT 10 | Table |
| 4. Geo Distribution | SELECT country, COUNT(*) AS users FROM users GROUP BY country | Choropleth map |
Add a global date filter ({{ start_date }} / {{ end_date }}) that each query references. When the marketing analyst changes the date range, all widgets update instantly.
Embedding Example:
<iframe src="https://redash.example.com/embed/dashboard/12?theme=light&autoRefresh=60"
width="1200" height="800" frameborder="0"></iframe>
Security, Authentication, and Permissions
Redash supports multiple authentication backends:
| Method | Description |
|---|---|
| Native (email/password) | Simple admin‑created accounts. |
| OAuth2 (Google, GitHub, Azure AD) | Redirect‑based SSO. |
| SAML 2.0 | Enterprise‑grade federation (via python3-saml). |
| LDAP / Active Directory | Bind against corporate directories. |
| JWT | For API‑first integrations (e.g., embedding). |
Role‑Based Access Control (RBAC)
- Admin – Full control: manage users, data sources, system settings. |
- Editor – Can create queries, visualizations, and dashboards. |
- Viewer – Read‑only access to specific dashboards or queries (via sharing). |
Permissions can be granular:
# Example: Grant a group read‑only access to a dashboard
dashboard_id: 42
group: analytics_viewers
permissions:
- view: true
- edit: false
- share: false
Best practice: Use OAuth2 or SAML for production to centralize identity, and disable native sign‑up (REDASH_ALLOW_SIGNUP=False) to prevent rogue accounts.
Scaling Redash for Production
When query volume grows, the following dimensions need attention:
| Component | Scaling Strategy |
|---|---|
| API Servers | Run multiple replicas behind an NGINX or HAProxy load balancer. |
| Workers | Increase worker.replicaCount (Kubernetes) or spin up additional worker containers. Use RQ worker pools per data source to isolate heavy workloads. |
| Redis | Deploy a Redis Cluster or enable persistence (appendonly yes) for durability. |
| PostgreSQL | Use a managed service with read replicas; ensure max_connections accommodates both API and background jobs. |
| Result Cache | Set REDASH_QUERY_RESULTS_CLEANUP_ENABLED to true with an appropriate TTL to avoid storage bloat. |
| Query Timeout | Adjust REDASH_QUERY_TIMEOUT (default 120 s) based on warehouse latency. |
| Monitoring | Export metrics via Prometheus (/metrics endpoint) and set alerts for queue length, worker failures, and DB latency. |
Horizontal Scaling Example (Kubernetes)
# values.yaml excerpt
worker:
replicaCount: 4
resources:
limits:
cpu: "2000m"
memory: "2Gi"
scheduler:
replicaCount: 1
api:
replicaCount: 3
resources:
limits:
cpu: "1000m"
memory: "1Gi"
Result: With 4 workers, Redash can process up to dozens of concurrent queries, keeping UI latency under 2 seconds for typical dashboards.
Extending Redash: Custom Visualizations & Plugins
Redash’s open architecture invites extensions.
1. Custom Visualizations (JS)
- Write a JavaScript module exposing
init(container, queryResult). - Register the script in Settings → Visualization → Add Custom Visualization.
- Use any front‑end library (Chart.js, D3, ECharts).
Example: D3 Force‑Directed Graph
function init(container, queryResult) {
const nodes = queryResult.getColumn('node');
const links = queryResult.getColumn('target').map((t, i) => ({
source: nodes[i],
target: t
}));
const width = container.clientWidth, height = 500;
const svg = d3.select(container).append('svg')
.attr('width', width)
.attr('height', height);
const simulation = d3.forceSimulation(nodes)
.force('link', d3.forceLink(links).distance(100))
.force('charge', d3.forceManyBody().strength(-300))
.force('center', d3.forceCenter(width / 2, height / 2));
// render nodes & links...
}
2. Python Query Runners
If you need a data source not natively supported (e.g., a proprietary API), create a custom query runner by extending Redash’s BaseQueryRunner. The runner can expose a UI for authentication and translate user‑written SQL‑like syntax into API calls.
# my_custom_runner.py
from redash.query_runner.base import BaseQueryRunner, register
class MyAPIQueryRunner(BaseQueryRunner):
def __init__(self, configuration):
super().__init__(configuration)
self.api_key = configuration.get('api_key')
self.base_url = configuration.get('base_url')
def run_query(self, query, user):
# Translate simple SELECT syntax to API request
endpoint = f"{self.base_url}/data?query={urllib.parse.quote(query)}"
resp = requests.get(endpoint, headers={'Authorization': f'Bearer {self.api_key}'})
data = resp.json()
columns = [{'name': k, 'type': type(v).__name__} for k, v in data[0].items()]
rows = data
return json.dumps({'columns': columns, 'rows': rows}), None
register(MyAPIQueryRunner)
After installing the plugin, restart the workers; the new data source appears in the UI.
3. Alert Webhooks
Redash alerts can trigger custom webhooks. Example: posting a message to a Slack channel when a KPI drops below a threshold.
{
"name": "Low Daily Active Users",
"query_id": 23,
"options": {
"cron": "0 9 * * *",
"trigger": "less_than",
"value": 5000,
"rearm": 3600,
"notify": [
{
"type": "webhook",
"url": "https://hooks.slack.com/services/T000/B000/XXXX"
}
]
}
}
Comparing Redash to Other Open‑Source BI Tools
| Feature | Redash | Metabase | Apache Superset |
|---|---|---|---|
| Primary UI Paradigm | SQL‑first query builder | Low‑code “Ask a question” UI | Drag‑and‑drop chart builder |
| Data Source Breadth | 30+ native connectors | 30+ (similar) | 20+ (focus on Hadoop ecosystem) |
| Custom Visualizations | JavaScript plugins (highly flexible) | Limited (limited to built‑in charts) | Supports echarts & custom plugins, but more complex |
| Alerting | Built‑in threshold alerts & webhook support | Basic email alerts | Alerting via external tools (e.g., Airflow, Grafana) |
| Self‑service Dashboarding | Global filters, embed iFrames | Dashboard filters (but limited) | Powerful filter boxes, but UI can be overwhelming |
| Authentication Options | OAuth, SAML, LDAP, JWT | OAuth, LDAP, Google SSO | OAuth, LDAP, Remote user auth |
| Scaling | Horizontal scaling via workers, simple architecture | Scales with multiple containers, but job queue less mature | Uses Celery + Redis; can be more complex |
| Community & Support | Active GitHub, paid SaaS by Databricks | Large community, open‑source core | Strong Apache community, frequent releases |
| Learning Curve | Low for SQL users | Very low (point‑and‑click) | Moderate‑high (requires understanding of Flask, Celery) |
When to choose Redash: You have a team comfortable with SQL, need fast query iteration, want custom visualizations, and require alerting natively.
When Metabase may be better: Non‑technical users need a no‑SQL UI and quick ad‑hoc reporting.
When Superset shines: Large enterprise with deep Hadoop/Spark integration and need for fine‑grained access control.
Real‑World Use Cases & Success Stories
1. SaaS Startup – Product Analytics
- Problem: Engineers needed a fast way to explore event data stored in Snowflake without building a separate analytics UI.
- Solution: Deployed Redash on AWS ECS (Docker Compose). Product managers created dashboards for DAU, churn, and feature usage.
- Result: Reduced time‑to‑insight from weeks (ETL → Looker) to minutes; cut licensing costs by 80 %.
2. Financial Services – Risk Monitoring
- Problem: Compliance required real‑time monitoring of transaction anomalies across multiple PostgreSQL shards.
- Solution: Integrated Redash with PostgreSQL and ElasticSearch, built alerts that fire Slack messages when fraud‑score exceeds a threshold.
- Result: Detected 12 high‑risk transactions in the first month, improving incident response time.
3. E‑commerce – Marketing Attribution
- Problem: Marketing team needed a unified view of spend vs. revenue across Google Ads, Facebook Ads, and internal sales DB.
- Solution: Used Redash’s Google BigQuery connector for ad data, PostgreSQL for sales, and combined via a UNION query. Dashboard displayed spend, ROAS, and funnel metrics.
- Result: Marketing ROI increased 15 % after identifying under‑performing campaigns through the dashboard.
Common Troubleshooting Scenarios
| Symptom | Likely Cause | Fix |
|---|---|---|
| “Query timed out” | Data source latency > REDASH_QUERY_TIMEOUT. | Increase timeout in env, optimize source query (add indexes). |
| Empty query results | Permissions issue on data source (user lacks SELECT). | Verify DB user privileges; test query via a SQL client. |
| Dashboard not refreshing | Scheduler not running or worker queue stuck. | Ensure scheduler and worker containers are alive; check Redis queue length (rq info). |
| Authentication redirects loop | OAuth callback URL mismatch. | Update REDASH_OAUTH_CALLBACK_URL to match provider’s allowed redirect. |
| High Redis memory usage | Result cache growing unchecked. | Enable cleanup (REDASH_QUERY_RESULTS_CLEANUP_ENABLED=true) and tune REDASH_QUERY_RESULTS_TTL. |
| Visualization error “Column not found” | Column name mismatch after query rename. | Refresh the visualization or edit the query to preserve column names. |
Use the Redash logs (docker logs <container>) and PostgreSQL logs for deeper investigation. For production, ship logs to a central system (ELK, Splunk) and set alerts on error rates.
Best Practices Checklist
- Version Control – Store query definitions (
.json) in a Git repo using Redash’s API (/api/queries). - Parameter Hygiene – Use typed parameters (date, number, list) to avoid SQL injection.
- Result Caching – Set appropriate TTLs; disable cache for near‑real‑time dashboards.
- Security First – Disable native sign‑up, enforce SSO, rotate
REDASH_COOKIE_SECRETannually. - Separate Environments – Staging Redash instance for testing new data sources before production rollout.
- Monitoring – Export Prometheus metrics, monitor
rq:workersqueue length, and set alerts onworker failures. - Backup Strategy – Daily PostgreSQL dumps (
pg_dump) and periodic Redis RDB snapshots. - Documentation – Keep an internal wiki of data source connection strings, query naming conventions, and dashboard owners.
Conclusion
Redash occupies a sweet spot in the BI landscape: it’s lightweight enough for rapid prototyping, yet powerful enough for enterprise‑grade monitoring and alerting. Its SQL‑first approach empowers data engineers to explore data directly, while its visualization engine and dashboard sharing make insights accessible to non‑technical stakeholders.
By following the installation guides, configuring security correctly, and employing the scaling strategies discussed, you can turn Redash into a central analytics hub that serves the entire organization—from product managers to finance teams. Moreover, the extensibility through custom visualizations, query runners, and webhook alerts ensures the platform can evolve alongside your data ecosystem.
Whether you’re a startup looking for a cost‑effective analytics stack or a large organization needing a flexible, self‑hosted BI tool, Redash provides a robust foundation. Deploy it, tailor it to your needs, and let your data speak.
Resources
Official Redash Documentation – Comprehensive guides, API reference, and deployment tutorials.
https://redash.io/help/Redash GitHub Repository – Source code, issue tracker, and community contributions.
https://github.com/getredash/redashRedash Blog – “Scaling Redash on Kubernetes” – Real‑world case study and Helm chart tips.
https://redash.io/blog/scaling-redash-on-kubernetes/Redash Community Forum – Q&A, best practices, and user‑generated visualizations.
https://discuss.redash.io/Databricks Redash SaaS Offering – Managed Redash service for teams that prefer a hosted solution.
https://databricks.com/product/redash