In Part IIa, we detailed the challenges we faced when automating the deployment of a secure static website using S3, CloudFront, and WAF. Service interdependencies, eventual consistency, error handling, and AWS API complexity all presented hurdles. This post details the actual implementation journey.
We didn’t start with a fully fleshed-out solution that just worked. We had to “lather, rinse and repeat”. In the end, we built a resilient automation script robust enough to deploy secure, private websites across any organization.
The first takeaway: the importance of logging and visibility. While logging wasn't the first thing we actually tackled, it was what eventually turned a mediocre automation script into something worth publishing.
run_command()
While automating this infrastructure, the output of one command often feeds the next, and each step can fail. We need to capture output for use in later steps and capture errors to help debug the process. Automation without visibility is like trying to discern the elephant from the shadows on the cave wall. Without a robust solution for capturing output and errors, we paid for it: when AWS CLI calls failed, we found ourselves staring at the terminal trying to reconstruct what went wrong. Debugging was guesswork.
The solution was our first major building block: run_command().

run_command() {
echo "Running: $*" >&2
echo "Running: $*" >>"$LOG_FILE"
# Create a temp file to capture stdout
local stdout_tmp
stdout_tmp=$(mktemp)
# Detect whether stdout is a terminal (i.e., the output is not being captured)
if [[ -t 1 ]]; then
# Not capturing → Show stdout live
"$@" > >(tee "$stdout_tmp" | tee -a "$LOG_FILE") 2> >(tee -a "$LOG_FILE" >&2)
else
# Capturing → Don't show stdout live; just log it and capture it
"$@" >"$stdout_tmp" 2> >(tee -a "$LOG_FILE" >&2)
fi
local exit_code=${PIPESTATUS[0]}
# Append stdout to log file
cat "$stdout_tmp" >>"$LOG_FILE"
# Capture stdout content into a variable
local output
output=$(<"$stdout_tmp")
rm -f "$stdout_tmp"
if [ $exit_code -ne 0 ]; then
echo "ERROR: Command failed: $*" >&2
echo "ERROR: Command failed: $*" >>"$LOG_FILE"
echo "Check logs for details: $LOG_FILE" >&2
echo "Check logs for details: $LOG_FILE" >>"$LOG_FILE"
echo "TIP: Since this script is idempotent, you can re-run it safely to retry." >&2
echo "TIP: Since this script is idempotent, you can re-run it safely to retry." >>"$LOG_FILE"
exit 1
fi
# Output stdout to the caller without adding a newline
if [[ ! -t 1 ]]; then
printf "%s" "$output"
fi
}
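To see why the terminal check matters, here is one way the script might populate $AWS_ACCOUNT (used later in the bucket policy). The call itself is a hypothetical example, but it shows the pattern: because the command substitution captures stdout, run_command() logs the output and returns it to the caller instead of echoing it live.

# Capture the output of an AWS CLI call for use in a later step;
# if the call fails, run_command logs the error and exits the script.
AWS_ACCOUNT=$(run_command $AWS sts get-caller-identity --query "Account" --output text --profile $AWS_PROFILE)
echo "Deploying into AWS account: $AWS_ACCOUNT"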
This not-so-simple wrapper gave us logging of stdout and stderr for every command. run_command() became the workhorse for capturing our needed inputs to other processes and our eyes into failures.

We didn't arrive at run_command() fully formed. We learned it the hard way: capturing stdout took fine-tuning.

The point of this whole exercise is to host content, and for that, we need an S3 bucket. This seemed like a simple first task - until we realized it wasn't. This is where we first collided with a concept that would shape the entire script: idempotency.
S3 bucket names are globally unique. If you try to create one that exists, you fail. Worse, AWS error messages can be cryptic:
Our naive first attempt just created the bucket. Our second attempt checked for it first:
create_s3_bucket() {
# Check for the bucket with the CLI directly: run_command() exits the script on any
# non-zero status, but a missing bucket is the expected case here, not a failure.
if $AWS s3api head-bucket --bucket "$BUCKET_NAME" --profile $AWS_PROFILE 2>/dev/null; then
echo "Bucket $BUCKET_NAME already exists."
return
fi
run_command $AWS s3api create-bucket \
--bucket "$BUCKET_NAME" \
--create-bucket-configuration LocationConstraint=$AWS_REGION \
--profile $AWS_PROFILE
}
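As noted later, the bucket is created with public access blocked. A sketch of how that step could be wrapped - the flags shown are the standard Block Public Access settings; the published script may differ:

# Block all public access - the site will be served only through CloudFront
run_command $AWS s3api put-public-access-block \
--bucket "$BUCKET_NAME" \
--public-access-block-configuration \
"BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true" \
--profile $AWS_PROFILE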
Making the script “re-runnable” was essential - unless, of course, we could guarantee we did everything right and things worked the first time. When has that ever happened? Of course, we then wrapped the creation of the bucket in run_command(), because every AWS call still had the potential to fail spectacularly.
And so, we learned: If you can’t guarantee perfection, you need idempotency.
Configuring a CloudFront distribution using the AWS Console offers a streamlined setup with sensible defaults. But we needed precise control over CloudFront behaviors, cache policies, and security settings - details the console abstracts away. Automation via the AWS CLI gave us that control - but there's no free lunch. Prepare yourself to handcraft deeply nested JSON payloads, get jiggy with jq, and manage the dependencies between S3, CloudFront, ACM, and WAF. This is the path we would need to take to build a resilient, idempotent deployment script - and crucially, to securely serve private S3 content using Origin Access Control (OAC).
Why do we need OAC?
Since our S3 bucket is private, we need CloudFront to securely retrieve content on behalf of users without exposing the bucket to the world.
Why not OAI?
AWS has deprecated Origin Access Identity in favor of Origin Access Control (OAC), offering tighter security and more flexible permissions.
Why do we need jq?
In later steps we create a WAF Web ACL to firewall our CloudFront distribution. In order to associate the WAF Web ACL with our distribution, we need to invoke the update-distribution API, which requires a fully fleshed out JSON payload updated with the Web ACL id.
GOTCHA: Attaching a WAF Web ACL to an existing CloudFront distribution requires that you use the update-distribution API, not associate-web-acl as one might expect.
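The OAC referenced by $OAC_ID in the template below has to exist first. A minimal sketch of creating it - the create_oac name and $OAC_NAME variable are ours for illustration, not necessarily the published script's:

create_oac() {
# Create an Origin Access Control so CloudFront signs its requests to the private bucket
run_command $AWS cloudfront create-origin-access-control \
--origin-access-control-config \
"Name=$OAC_NAME,OriginAccessControlOriginType=s3,SigningBehavior=always,SigningProtocol=sigv4" \
--profile $AWS_PROFILE
}
# The Id field of the response becomes $OAC_ID in the distribution config
OAC_ID=$(create_oac | jq -r '.OriginAccessControl.Id')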
Here’s the template for our distribution configuration (some of the Bash variables used will be evident when you examine the completed script):
{
"CallerReference": "$CALLER_REFERENCE",
$ALIASES
"Origins": {
"Quantity": 1,
"Items": [
{
"Id": "S3-$BUCKET_NAME",
"DomainName": "$BUCKET_NAME.s3.amazonaws.com",
"OriginAccessControlId": "$OAC_ID",
"S3OriginConfig": {
"OriginAccessIdentity": ""
}
}
]
},
"DefaultRootObject": "$ROOT_OBJECT",
"DefaultCacheBehavior": {
"TargetOriginId": "S3-$BUCKET_NAME",
"ViewerProtocolPolicy": "redirect-to-https",
"AllowedMethods": {
"Quantity": 2,
"Items": ["GET", "HEAD"]
},
"ForwardedValues": {
"QueryString": false,
"Cookies": {
"Forward": "none"
}
},
"MinTTL": 0,
"DefaultTTL": $DEFAULT_TTL,
"MaxTTL": $MAX_TTL
},
"PriceClass": "PriceClass_100",
"Comment": "CloudFront Distribution for $ALT_DOMAIN",
"Enabled": true,
"HttpVersion": "http2",
"IsIPV6Enabled": true,
"Logging": {
"Enabled": false,
"IncludeCookies": false,
"Bucket": "",
"Prefix": ""
},
$VIEWER_CERTIFICATE
}
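The shell variables in this template have to be expanded before the file is handed to the CLI. The published script may do this differently, but envsubst (from gettext) is one simple way - a sketch, assuming the template is saved as distribution-config.tmpl:

# Render the distribution config template into the JSON file passed to the CLI
export CALLER_REFERENCE ALIASES BUCKET_NAME OAC_ID ROOT_OBJECT DEFAULT_TTL MAX_TTL ALT_DOMAIN VIEWER_CERTIFICATE
envsubst <distribution-config.tmpl >"$CONFIG_JSON"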
The create_cloudfront_distribution()
function is then used to create
the distribution.
create_cloudfront_distribution() {
# Snippet for brevity; see full script
run_command $AWS cloudfront create-distribution --distribution-config file://$CONFIG_JSON
}
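Since run_command() hands back stdout when captured, the distribution ID needed by the WAF and bucket-policy steps can be pulled straight from the response - a sketch, assuming the standard create-distribution output shape:

# Create the distribution and record its ID and domain name for later steps
DISTRIBUTION_JSON=$(create_cloudfront_distribution)
DISTRIBUTION_ID=$(echo "$DISTRIBUTION_JSON" | jq -r '.Distribution.Id')
DISTRIBUTION_DOMAIN=$(echo "$DISTRIBUTION_JSON" | jq -r '.Distribution.DomainName')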
Key lessons:

- Use update-distribution, not associate-web-acl, for CloudFront distributions
- Use jq to modify the existing configuration to add the WAF Web ACL id

Cool. We have a CloudFront distribution! But it's wide open to the world. We needed to restrict access to our internal VPC traffic - without exposing the site publicly. AWS WAF provides this firewall capability using Web ACLs. Here's what we need to do:

1. Find the public IP of our VPC's NAT gateway.
2. Create a WAF IPSet containing that IP as an allow list.
3. Create a WAF Web ACL whose rule references the IPSet.
4. Attach the Web ACL to the CloudFront distribution via update-distribution.
Keep in mind that CloudFront is designed to serve content to the public internet. When clients in our VPC access the distribution, their traffic needs to exit through a NAT gateway with a public IP. We’ll use the AWS CLI to query the NAT gateway’s public IP and use that when we create our allow list of IPs (step 1).
find_nat_ip() {
run_command $AWS ec2 describe-nat-gateways --filter "Name=tag:Environment,Values=$TAG_VALUE" --query "NatGateways[0].NatGatewayAddresses[0].PublicIp" --output text --profile $AWS_PROFILE
}
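The result is captured and sanity-checked before we go any further (illustrative; --output text prints None when the query matches nothing):

NAT_IP=$(find_nat_ip)
if [ -z "$NAT_IP" ] || [ "$NAT_IP" = "None" ]; then
echo "ERROR: no NAT gateway found with tag Environment=$TAG_VALUE" >&2
exit 1
fi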
We take this IP and build our first WAF component: an IPSet. This becomes the foundation for the Web ACL we’ll attach to CloudFront.
The firewall we create will be composed of an allow list of IP addresses (step 2)…
create_ipset() {
run_command $AWS wafv2 create-ip-set \
--name "$IPSET_NAME" \
--scope CLOUDFRONT \
--region us-east-1 \
--addresses "$NAT_IP/32" \
--ip-address-version IPV4 \
--description "Allow NAT Gateway IP"
}
…that form the rules for our WAF Web ACL (step 3).
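The Web ACL rule below references the IPSet by its ARN, which we can lift from the create-ip-set response - a sketch:

# The Summary block of the create-ip-set response carries the ARN referenced by the Web ACL rule
IPSET_ARN=$(create_ipset | jq -r '.Summary.ARN')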
create_web_acl() {
run_command $AWS wafv2 create-web-acl \
--name "$WEB_ACL_NAME" \
--scope CLOUDFRONT \
--region us-east-1 \
--default-action Block={} \
--rules '[{"Name":"AllowNAT","Priority":0,"Action":{"Allow":{}},"Statement":{"IPSetReferenceStatement":{"ARN":"'$IPSET_ARN'"}},"VisibilityConfig":{"SampledRequestsEnabled":true,"CloudWatchMetricsEnabled":true,"MetricName":"AllowNAT"}}]' \
--visibility-config SampledRequestsEnabled=true,CloudWatchMetricsEnabled=true,MetricName="$WEB_ACL_NAME"
}
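Like the bucket, this step has to tolerate re-runs: creating a Web ACL whose name is already taken fails. A sketch of looking it up first and only creating it when missing:

# Reuse an existing Web ACL with this name, otherwise create it (idempotency)
WEB_ACL_ARN=$(run_command $AWS wafv2 list-web-acls --scope CLOUDFRONT --region us-east-1 \
| jq -r --arg NAME "$WEB_ACL_NAME" '.WebACLs[] | select(.Name == $NAME) | .ARN')
if [ -z "$WEB_ACL_ARN" ]; then
WEB_ACL_ARN=$(create_web_acl | jq -r '.Summary.ARN')
fi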
This is where our earlier jq
surgery becomes critical - attaching
the Web ACL requires updating the entire CloudFront distribution
configuration. And that’s how we finally attach that Web ACL to our
CloudFront distribution (step 4).
DISTRIBUTION_CONFIG=$(run_command $AWS cloudfront get-distribution-config --id $DISTRIBUTION_ID)
# The ETag returned by get-distribution-config is required by update-distribution (--if-match)
ETAG=$(echo "$DISTRIBUTION_CONFIG" | jq -r '.ETag')
# Use jq to inject the WebACLId into the config JSON
UPDATED_CONFIG=$(echo "$DISTRIBUTION_CONFIG" | jq --arg ACL_ARN "$WEB_ACL_ARN" '.DistributionConfig | .WebACLId = $ACL_ARN')
# Pass the updated config back into update-distribution
echo "$UPDATED_CONFIG" > updated-config.json
run_command $AWS cloudfront update-distribution --id $DISTRIBUTION_ID --if-match "$ETAG" --distribution-config file://updated-config.json
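CloudFront changes can take several minutes to deploy. If a later step or a smoke test depends on the change being live, the CLI's built-in waiter can pause the script until the distribution reports Deployed (optional, but handy):

# Wait until the distribution status returns to "Deployed"
run_command $AWS cloudfront wait distribution-deployed --id $DISTRIBUTION_ID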
At this point, our CloudFront distribution is no longer wide open. It is protected by our WAF Web ACL, restricting access to only traffic coming from our internal VPC NAT gateway.
For many internal-only sites, this simple NAT IP allow list is enough. WAF can handle more complex needs like geo-blocking, rate limiting, or request inspection - but those weren’t necessary for us. Good design isn’t about adding everything; it’s about removing everything that isn’t needed. A simple allow list was also the most secure.
When we set up our bucket, we blocked public access - an S3-wide security setting that prevents any public access to the bucket’s contents. However, this also prevents CloudFront (even with OAC) from accessing S3 objects unless we explicitly allow it. Without this policy update, requests from CloudFront would fail with Access Denied errors.
At this point, we need to allow CloudFront to access our S3
bucket. The update_bucket_policy()
function will apply the policy
shown below.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": "cloudfront.amazonaws.com"
},
"Action": "s3:GetObject",
"Resource": "arn:aws:s3:::$BUCKET_NAME/*",
"Condition": {
"StringEquals": {
"AWS:SourceArn": "arn:aws:cloudfront::$AWS_ACCOUNT:distribution/$DISTRIBUTION_ID"
}
}
}
]
}
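A minimal sketch of what update_bucket_policy() might look like, assuming the policy above is stored as a template named bucket-policy.json (the file name and the use of envsubst are our choices, not necessarily the published script's):

update_bucket_policy() {
# Substitute the bucket, account, and distribution IDs into the policy template,
# then attach the rendered policy to the bucket
export BUCKET_NAME AWS_ACCOUNT DISTRIBUTION_ID
envsubst <bucket-policy.json >bucket-policy-rendered.json
run_command $AWS s3api put-bucket-policy \
--bucket "$BUCKET_NAME" \
--policy file://bucket-policy-rendered.json \
--profile $AWS_PROFILE
}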
Modern OAC best practice is to use the AWS:SourceArn condition to ensure only requests from your specific CloudFront distribution are allowed.
It’s more secure because it ties bucket access directly to a single distribution ARN, preventing other CloudFront distributions (or bad actors) from accessing your bucket.
"Condition": {
"StringEquals": { "AWS:SourceArn": "arn:aws:cloudfront::$AWS_ACCOUNT:distribution/$DISTRIBUTION_ID" }
}
With this policy in place, we’ve completed the final link in the security chain. Our S3 bucket remains private but can now securely serve content through CloudFront - protected by OAC and WAF.
We are now ready to wrap a bow around these steps in an idempotent Bash script:

- Create the private S3 bucket.
- Create the CloudFront distribution with OAC.
- Restrict access with WAF, attaching the Web ACL with a jq patch and update-distribution.
- Update the bucket policy so only our distribution can read the bucket.

Each segment of our script is safe to rerun. Each is wrapped in run_command(), capturing results for later steps and ensuring errors are logged. We now have a script we can commit and re-use with confidence whenever we need a secure static site. Together, these steps form a robust, idempotent deployment pipeline for a secure S3 + CloudFront website - every time.
You can find the full script here.
A hallmark of a production-ready script is an ‘-h’ option. Oh wait - your script has no help or usage? I’m supposed to RTFC? It ain’t done skippy until it’s done.
Scripts should include the ability to pass options that make it a flexible utility. We may have started out writing a “one-off” but recognizing opportunities to generalize the solution turned this into another reliable tool in our toolbox.
Be careful though - not every one-off needs to be a Swiss Army knife. Just because aspirin is good for a headache doesn't mean you should take the whole bottle.
Our script now supports the necessary options to create a secure, static website with a custom domain and certificate. We even added the ability to include additional IP addresses for your allow list in addition to the VPC’s public IP.
Now, deploying a private S3-backed CloudFront site is as easy as:
Example:
./s3-static-site.sh -b my-site -t dev -d example.com -c arn:aws:acm:us-east-1:cert-id
Inputs:

- -b - the S3 bucket name
- -t - the Environment tag value used to locate the VPC's NAT gateway
- -d - the custom domain for the site
- -c - the ACM certificate ARN for that domain
This single command now deploys an entire private website - reliably and repeatably. It only takes a little longer to do it right!
The process of working with ChatGPT to construct a production ready script that creates static websites took many hours. In the end, several lessons were reinforced and some gotchas discovered. Writing this blog itself was a collaborative effort that dissected both the technology and the process used to implement it. Overall, it was a productive, fun and rewarding experience. For those not familiar with ChatGPT or who are afraid to give it a try, I encourage you to explore this amazing tool.
Here are some of the things I took away from this adventure with ChatGPT.
With regard to the technology, some lessons were reinforced and some new knowledge was gained:

- Use the update-distribution API call, not associate-web-acl, when adding WAF ACLs to your distribution!

Thanks to ChatGPT for being an ever-present back seat driver on this journey. Real AWS battle scars + AI assistance = better results.
In Part III we wrap it all up as we learn more about how CloudFront and WAF actually protect your website.
This post was drafted with the assistance of ChatGPT, but born from real AWS battle scars.
If you like this content, please leave a comment or consider following me. Thanks.