Self-hosting on AWS
This guide provides AWS-specific information for deploying Membrane in your own environment.
S3 Storage Configuration
Create the necessary S3 buckets:
resource "aws_s3_bucket" "tmp" {
bucket = "${var.environment}-integration-app-tmp"
}
resource "aws_s3_bucket" "connectors" {
bucket = "${var.environment}-integration-app-connectors"
}
resource "aws_s3_bucket" "static" {
bucket = "${var.environment}-integration-app-static"
}
# Lifecycle rules for tmp bucket
resource "aws_s3_bucket_lifecycle_configuration" "tmp" {
bucket = aws_s3_bucket.tmp.id
rule {
id = "cleanup"
status = "Enabled"
filter {
prefix = ""
}
expiration {
days = 7
}
}
}
# CORS configuration for static bucket
resource "aws_s3_bucket_cors_configuration" "static" {
bucket = aws_s3_bucket.static.id
cors_rule {
allowed_headers = ["*"]
allowed_methods = ["GET"]
allowed_origins = ["*"]
max_age_seconds = 3000
}
}
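Optionally, you can also block direct public access to these buckets. The following is a minimal hardening sketch, assuming the bucket resources defined above; it is not part of the required setup. The static bucket remains reachable through the CloudFront origin access control configured in the next step, and the application reaches the buckets with IAM credentials rather than public URLs.
# Optional: block public access to all three buckets.
resource "aws_s3_bucket_public_access_block" "buckets" {
  for_each = {
    tmp        = aws_s3_bucket.tmp.id
    connectors = aws_s3_bucket.connectors.id
    static     = aws_s3_bucket.static.id
  }

  bucket                  = each.value
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}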
Create a CloudFront distribution for the static bucket:
resource "aws_cloudfront_origin_access_control" "static" {
name = "${var.environment}-static-oac"
description = "OAC for static S3 bucket"
origin_access_control_origin_type = "s3"
signing_behavior = "always"
signing_protocol = "sigv4"
}
resource "aws_cloudfront_distribution" "static" {
enabled = true
default_root_object = "index.html"
aliases = ["static.${var.environment}.${var.hosted_zone_name}"]
origin {
domain_name = aws_s3_bucket.static.bucket_regional_domain_name
origin_id = aws_s3_bucket.static.id
origin_access_control_id = aws_cloudfront_origin_access_control.static.id
}
default_cache_behavior {
allowed_methods = ["GET", "HEAD"]
cached_methods = ["GET", "HEAD"]
target_origin_id = aws_s3_bucket.static.id
viewer_protocol_policy = "redirect-to-https"
forwarded_values {
query_string = false
cookies {
forward = "none"
}
}
}
price_class = "PriceClass_100"
restrictions {
geo_restriction {
restriction_type = "none"
}
}
viewer_certificate {
acm_certificate_arn = aws_acm_certificate.cloudfront.arn
ssl_support_method = "sni-only"
minimum_protocol_version = "TLSv1.2_2021"
}
tags = {
Service = "core"
}
}
resource "aws_s3_bucket_policy" "static_cloudfront" {
bucket = aws_s3_bucket.static.id
policy = jsonencode({
Version = "2012-10-17",
Statement = [
{
Effect = "Allow",
Principal = {
Service = "cloudfront.amazonaws.com"
},
Action = "s3:GetObject",
Resource = "${aws_s3_bucket.static.arn}/*",
Condition = {
StringEquals = {
"AWS:SourceArn" = aws_cloudfront_distribution.static.arn
}
}
}
]
})
}
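The distribution above references an ACM certificate (aws_acm_certificate.cloudfront) and a static.* hostname that are not defined in this guide. Below is a rough sketch of those missing pieces, assuming your DNS lives in a Route 53 hosted zone matching var.hosted_zone_name; the resource names are illustrative and the DNS validation records for the certificate are omitted. Note that CloudFront only accepts certificates issued in us-east-1, hence the provider alias.
# CloudFront requires ACM certificates from us-east-1.
provider "aws" {
  alias  = "us_east_1"
  region = "us-east-1"
}

resource "aws_acm_certificate" "cloudfront" {
  provider          = aws.us_east_1
  domain_name       = "static.${var.environment}.${var.hosted_zone_name}"
  validation_method = "DNS"
}

data "aws_route53_zone" "main" {
  name = var.hosted_zone_name
}

# Point the static hostname at the CloudFront distribution.
resource "aws_route53_record" "static" {
  zone_id = data.aws_route53_zone.main.zone_id
  name    = "static.${var.environment}.${var.hosted_zone_name}"
  type    = "A"

  alias {
    name                   = aws_cloudfront_distribution.static.domain_name
    zone_id                = aws_cloudfront_distribution.static.hosted_zone_id
    evaluate_target_health = false
  }
}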
IAM Role Configuration
Membrane containers support AWS IAM role-based access to S3 and other AWS services. This is preferred over providing an explicit access key and secret key.
Container IAM Configuration
To use IAM roles instead of access keys:
- Create an IAM role with appropriate S3 permissions
- Assign this role to your ECS tasks, EKS pods, or EC2 instances
- Omit the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables
When running in a properly configured AWS environment, the containers will automatically use the IAM role credentials.
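As an illustration for ECS, the task role and its S3 policy could look roughly like the following; the role name and the exact set of S3 actions are assumptions, so scope them to what your deployment actually needs. On EKS you would use IRSA or Pod Identity instead of this trust policy, and on EC2 an instance profile.
resource "aws_iam_role" "membrane_task" {
  name = "${var.environment}-membrane-task"

  # Trust policy allowing ECS tasks to assume the role.
  assume_role_policy = jsonencode({
    Version = "2012-10-17",
    Statement = [
      {
        Effect    = "Allow",
        Principal = { Service = "ecs-tasks.amazonaws.com" },
        Action    = "sts:AssumeRole"
      }
    ]
  })
}

resource "aws_iam_role_policy" "membrane_s3" {
  name = "s3-access"
  role = aws_iam_role.membrane_task.id

  # Grant access to the buckets created earlier (actions are illustrative).
  policy = jsonencode({
    Version = "2012-10-17",
    Statement = [
      {
        Effect = "Allow",
        Action = ["s3:GetObject", "s3:PutObject", "s3:DeleteObject", "s3:ListBucket"],
        Resource = [
          aws_s3_bucket.tmp.arn,
          "${aws_s3_bucket.tmp.arn}/*",
          aws_s3_bucket.connectors.arn,
          "${aws_s3_bucket.connectors.arn}/*",
          aws_s3_bucket.static.arn,
          "${aws_s3_bucket.static.arn}/*"
        ]
      }
    ]
  })
}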
MongoDB on AWS
While AWS DocumentDB is technically compatible with our application, we recommend using native MongoDB (either self-hosted on EC2 or a managed MongoDB Atlas cluster) instead.
Some customers have encountered edge cases with DocumentDB due to differences in MongoDB API implementation. If you choose to use DocumentDB, be prepared for potential compatibility issues.
Redis on AWS
For Redis, you can use Amazon ElastiCache. Keep in mind that Redis is only used as a cache in our application and can be safely restarted or cleared. There's no persistent data stored in Redis that isn't recoverable from other sources.
ElastiCache TLS Configuration
If you're using ElastiCache with in-transit encryption enabled, set the following environment variable:
REDIS_DISABLE_TLS_VERIFICATION=true
This is required because ElastiCache uses self-signed certificates that won't pass standard TLS verification.
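For reference, here is a minimal Terraform sketch of an ElastiCache Redis replication group with in-transit encryption enabled; the subnet group, security group, and node sizing are placeholders for your environment.
resource "aws_elasticache_replication_group" "membrane" {
  replication_group_id = "${var.environment}-membrane-redis"
  description          = "Redis cache for Membrane"
  engine               = "redis"
  node_type            = "cache.t4g.small"
  num_cache_clusters   = 1
  port                 = 6379

  # In-transit encryption; requires REDIS_DISABLE_TLS_VERIFICATION=true
  # on the Membrane containers as described above.
  transit_encryption_enabled = true

  subnet_group_name  = aws_elasticache_subnet_group.membrane.name
  security_group_ids = [aws_security_group.redis.id]
}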
EKS Deployment
When deploying Membrane on Amazon EKS, you may encounter startup issues due to DNS resolution delays that are common in Kubernetes environments. This section provides recommended configurations and troubleshooting guidance.
Common Startup Issues
CrashLoopBackOff due to health check failures: During initial deployment or image upgrades, pods may fail health checks because MongoDB or Redis connections time out before DNS resolution completes.
Symptoms:
- Pods entering CrashLoopBackOff state shortly after startup
- Health check endpoint returning failures
- Logs showing connection timeouts to MongoDB or Redis
Recommended EKS Configuration
Add these environment variables to your deployment to handle slower DNS resolution in Kubernetes:
env:
  # Increase MongoDB server selection timeout (default: 30000ms)
  - name: MONGO_SERVER_SELECTION_TIMEOUT_MS
    value: '60000'
  # Increase Redis connection timeout (default: 10000ms for standalone)
  - name: REDIS_CONNECT_TIMEOUT_MS
    value: '60000'
  # Configure health check retry behavior for transient failures
  - name: HEALTH_CHECK_RETRIES
    value: '5' # Number of retries (default: 3)
  - name: HEALTH_CHECK_RETRY_DELAY_MS
    value: '2000' # Initial delay between retries in ms (default: 1000)
  - name: HEALTH_CHECK_MAX_RETRY_DELAY_MS
    value: '30000' # Maximum delay between retries in ms (default: 10000)
  # Required for ElastiCache with in-transit encryption
  - name: REDIS_DISABLE_TLS_VERIFICATION
    value: 'true'
  # Optional: Skip specific health checks if needed
  # - name: SKIP_HEALTH_CHECKS
  #   value: "mongo,redis" # Or "all" to skip all checks
Health Check Retry Configuration
Health checks include built-in retry logic with exponential backoff to handle transient network issues common in Kubernetes environments.
| Variable | Default | Description |
|---|---|---|
| HEALTH_CHECK_RETRIES | 3 | Number of retry attempts after initial failure |
| HEALTH_CHECK_RETRY_DELAY_MS | 1000 | Initial delay between retries (milliseconds) |
| HEALTH_CHECK_MAX_RETRY_DELAY_MS | 10000 | Maximum delay between retries (milliseconds) |
The retry mechanism uses exponential backoff with jitter:
- First retry: ~1 second delay
- Second retry: ~2 seconds delay
- Third retry: ~4 seconds delay
- And so on, up to the maximum delay
For EKS environments with slow DNS resolution, we recommend increasing these values:
env:
  - name: HEALTH_CHECK_RETRIES
    value: '5'
  - name: HEALTH_CHECK_RETRY_DELAY_MS
    value: '2000'
  - name: HEALTH_CHECK_MAX_RETRY_DELAY_MS
    value: '30000'
Health Check Skip Options
If retry logic is insufficient, you can skip specific health checks using the SKIP_HEALTH_CHECKS environment variable:
| Value | Description |
|---|---|
| all | Skip all health checks (MongoDB, Redis, storage, custom code) |
| mongo | Skip only MongoDB connectivity check |
| redis | Skip only Redis connectivity check |
| storage | Skip only cloud storage bucket check |
| custom_code | Skip only custom code runner check |
| mongo,redis | Skip multiple checks (comma-separated) |
Note: Skipping health checks is recommended only as a last resort. The retry mechanism should handle most transient failures.
Kubernetes Probe Configuration
In addition to the environment variables, consider adjusting your Kubernetes probe settings:
startupProbe:
  httpGet:
    path: /
    port: 5000
  initialDelaySeconds: 10
  periodSeconds: 5
  failureThreshold: 60 # Allow up to 5 minutes for startup
livenessProbe:
  httpGet:
    path: /
    port: 5000
  initialDelaySeconds: 30
  periodSeconds: 10
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /
    port: 5000
  initialDelaySeconds: 5
  periodSeconds: 5
  failureThreshold: 3
Troubleshooting
1. Check pod logs for connection errors:
kubectl logs <pod-name> -n <namespace>
Look for timeout errors related to MongoDB or Redis connections.
2. Verify DNS resolution from within the cluster:
kubectl run -it --rm debug --image=busybox --restart=Never -- nslookup <your-mongodb-host>
3. Test connectivity to external services:
kubectl run -it --rm debug --image=mongo:latest --restart=Never -- mongosh "mongodb+srv://<connection-string>" --eval "db.runCommand({ping:1})"
4. If issues persist after configuration changes:
- Ensure security groups allow traffic between EKS nodes and your MongoDB/Redis instances (see the sketch after this list)
- Verify VPC peering or PrivateLink configurations if using cross-VPC connections
- Check that IAM roles have appropriate permissions for S3 bucket access
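As an illustration of the security group point above, a self-managed MongoDB instance and an ElastiCache cluster could allow ingress from the EKS nodes as follows; the security group references and the EKS module output are assumptions about your setup.
# Allow the EKS node security group to reach MongoDB and Redis.
resource "aws_security_group_rule" "mongo_from_eks" {
  type                     = "ingress"
  from_port                = 27017
  to_port                  = 27017
  protocol                 = "tcp"
  security_group_id        = aws_security_group.mongodb.id
  source_security_group_id = module.eks.node_security_group_id
}

resource "aws_security_group_rule" "redis_from_eks" {
  type                     = "ingress"
  from_port                = 6379
  to_port                  = 6379
  protocol                 = "tcp"
  security_group_id        = aws_security_group.redis.id
  source_security_group_id = module.eks.node_security_group_id
}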