Ever locked yourself out of your own S3 bucket? That’s like asking a golfer if he’s ever landed in a bunker. We’ve all been there.
Scenario:
A sudden power outage knocks out your internet. When service resumes, your ISP has assigned you a new IP address. Suddenly, the S3 bucket you so carefully protected with that fancy bucket policy that restricts access by IP… is protecting itself from you. Nice work.
And here’s the kicker: you can’t change the policy because…you can’t access the bucket! Time to panic? Read on…
This post will cover:
S3 bucket policies are powerful and absolute. A common security pattern is to restrict access to a trusted IP range, often your home or office IP. That’s fine, but what happens when those IPs change without prior notice?
That’s the power outage scenario in a nutshell.
Suddenly (and without warning), I couldn’t access my own bucket. Worse, there was no easy way back in because the bucket policy itself was blocking my attempts to update it. Whether you go to the console or drop to a command line, you’re still hitting that same brick wall—your IP isn’t in the allow list.
At that point, you have two options, neither of which you want to rely on in a pinch: sign in as the root user, or open a case with AWS support.
The root account is a last resort (as it should be), and AWS support can take time you don’t have.
Once you regain access to the bucket, it’s time to build a policy that includes an emergency backdoor from a trusted environment. We’ll call that the “safe room”. Your safe room is your AWS VPC.
While your home IP might change with the weather, your VPC is rock solid. If you allow access from within your VPC, you always have a way to manage your bucket policy.
Even if you rarely touch an EC2 instance, having that backdoor in your pocket can be the difference between a quick fix and a day-long support ticket.
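To make the idea concrete, here is a minimal sketch of the kind of policy the “safe room” implies - not the exact policy our script generates. The bucket name, home IP, and VPC ID are placeholders, and the aws:SourceVpc condition only matches traffic that reaches S3 through an S3 gateway VPC endpoint in that VPC:

#!/bin/bash
# Hypothetical "safe room" policy: deny bucket access unless the request comes either
# from a trusted home IP or from inside the VPC (via an S3 gateway endpoint).
HOME_IP="203.0.113.25"            # placeholder - your current public IP
VPC_ID="vpc-0123456789abcdef0"    # placeholder - your VPC
BUCKET="my-bucket"                # placeholder - your bucket

cat >safe-room-policy.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyUnlessTrustedIpOrVpc",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": ["arn:aws:s3:::$BUCKET", "arn:aws:s3:::$BUCKET/*"],
      "Condition": {
        "NotIpAddress": { "aws:SourceIp": "$HOME_IP/32" },
        "StringNotEquals": { "aws:SourceVpc": "$VPC_ID" }
      }
    }
  ]
}
EOF

# Both conditions must match for the Deny to apply, so a request from either the
# trusted IP or the VPC gets through (subject to your IAM permissions).
aws s3api put-bucket-policy --bucket "$BUCKET" --policy file://safe-room-policy.json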
A script to implement our safe room approach must at least:
This script helps you recover from lockouts and prevents future ones by ensuring your VPC is always a reliable access point.
Our script is light on dependencies, but you will need curl and the aws CLI installed on your EC2 instance.
A typical use of the command requires only your new IP address and the bucket name. The aws CLI will try credentials from the environment, your ~/.aws config, or an instance profile - so you only need -p if you want to specify a different profile. Here’s the minimum you’d need to run the command if you are executing the script in your VPC:
./s3-bucket-unlock.sh -i <your-home-ip> -b <bucket-name>
Options:

-i  Your current public IP address (e.g., your home IP).
-b  The S3 bucket name.
-v  (Optional) VPC ID; auto-detected if not provided.
-p  (Optional) AWS CLI profile (defaults to $AWS_PROFILE or default).
-n  Dry run (show policy, do not apply).

Example with dry run:
./s3-bucket-unlock.sh -i 203.0.113.25 -b my-bucket -n
The dry run option lets you preview the generated policy before making any changes—a good habit when working with S3 policies.
Someone once said that we learn more from our failures than from our successes. At this rate I should be on the AWS support team soon…lol. Well, I probably need a lot more mistakes under my belt before they hand me a badge. In any event, ahem, we learned something from our power outage. Stuff happens - best be prepared. Here’s what this experience reinforced:
Sometimes it’s not a mistake - it’s a failure to realize how fragile access is. My home IP was fine…until it wasn’t.
Our script will help us apply a quick fix. The process of writing it was a reminder that security balances restrictions with practical escape hatches.
Next time you set an IP-based bucket policy, ask yourself:
Thanks to ChatGPT for being an invaluable backseat driver on this journey. Real AWS battle scars + AI assistance = better results.
In Part IIa, we detailed the challenges we faced when automating the deployment of a secure static website using S3, CloudFront, and WAF. Service interdependencies, eventual consistency, error handling, and AWS API complexity all presented hurdles. This post details the actual implementation journey.
We didn’t start with a fully fleshed-out solution that just worked. We had to “lather, rinse and repeat”. In the end, we built a resilient automation script robust enough to deploy secure, private websites across any organization.
The first takeaway - the importance of logging and visibility. While logging wasn’t the first thing we actually tackled, it was what eventually turned a mediocre automation script into something worth publishing.
run_command()
While automating the process of creating this infrastructure, we need to feed the output of one command into the next, and each step, of course, can fail. We need to both capture the output for input to later steps and capture errors to help debug the process. Automation without visibility is like trying to discern the elephant by looking at the shadows on the cave wall. Without a robust solution for capturing output and errors, when AWS CLI calls failed we found ourselves staring at the terminal trying to reconstruct what went wrong. Debugging was guesswork.
The solution was our first major building block: run_command().
echo "Running: $*" >&2
echo "Running: $*" >>"$LOG_FILE"
# Create a temp file to capture stdout
local stdout_tmp
stdout_tmp=$(mktemp)
# Detect if we're capturing output (not running directly in a terminal)
if [[ -t 1 ]]; then
# Not capturing → Show stdout live
"$@" > >(tee "$stdout_tmp" | tee -a "$LOG_FILE") 2> >(tee -a "$LOG_FILE" >&2)
else
# Capturing → Don't show stdout live; just log it and capture it
"$@" >"$stdout_tmp" 2> >(tee -a "$LOG_FILE" >&2)
fi
local exit_code=${PIPESTATUS[0]}
# Append stdout to log file
cat "$stdout_tmp" >>"$LOG_FILE"
# Capture stdout content into a variable
local output
output=$(<"$stdout_tmp")
rm -f "$stdout_tmp"
if [ $exit_code -ne 0 ]; then
echo "ERROR: Command failed: $*" >&2
echo "ERROR: Command failed: $*" >>"$LOG_FILE"
echo "Check logs for details: $LOG_FILE" >&2
echo "Check logs for details: $LOG_FILE" >>"$LOG_FILE"
echo "TIP: Since this script is idempotent, you can re-run it safely to retry." >&2
echo "TIP: Since this script is idempotent, you can re-run it safely to retry." >>"$LOG_FILE"
exit 1
fi
# Output stdout to the caller without adding a newline
if [[ ! -t 1 ]]; then
printf "%s" "$output"
fi
}
This not-so-simple wrapper gave us stdout and stderr for every command. run_command() became the workhorse for capturing our needed inputs to other processes and our eyes into failures.
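A quick, illustrative example of the capture pattern (the variable name here is hypothetical; $AWS and $LOG_FILE come from the surrounding script):

# Capture a command's stdout for a later step; everything is still appended to $LOG_FILE.
NAT_IP=$(run_command $AWS ec2 describe-nat-gateways \
  --query "NatGateways[0].NatGatewayAddresses[0].PublicIp" --output text)
echo "NAT gateway public IP: $NAT_IP"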
We didn’t arrive at run_command() fully formed. We learned it the hard way - getting stdout capture right took fine-tuning.

The point of this whole exercise is to host content, and for that, we need an S3 bucket. This seemed like a simple first task - until we realized it wasn’t. This is where we first collided with a concept that would shape the entire script: idempotency.
S3 bucket names are globally unique. If you try to create one that already exists, you fail. Worse, AWS error messages can be cryptic.
Our naive first attempt just created the bucket. Our second attempt checked for it first:
create_s3_bucket() {
if run_command $AWS s3api head-bucket --bucket "$BUCKET_NAME" --profile $AWS_PROFILE 2>/dev/null; then
echo "Bucket $BUCKET_NAME already exists."
return
fi
run_command $AWS s3api create-bucket \
--bucket "$BUCKET_NAME" \
--create-bucket-configuration LocationConstraint=$AWS_REGION \
--profile $AWS_PROFILE
}
Making the script “re-runnable” was essential - unless of course we could guarantee we did everything right and things worked the first time. When has that ever happened? Of course, we then wrapped the creation of the bucket in run_command() because every AWS call still had the potential to fail spectacularly.
And so, we learned: If you can’t guarantee perfection, you need idempotency.
Configuring a CloudFront distribution using the AWS Console offers a streamlined setup with sensible defaults. But we needed precise control over CloudFront behaviors, cache policies, and security settings - details the console abstracts away. Automation via the AWS CLI gave us that control - but there’s no free lunch. Prepare yourself to handcraft deeply nested JSON payloads, get jiggy with jq, and manage the dependencies between S3, CloudFront, ACM, and WAF. This is the path we would need to take to build a resilient, idempotent deployment script - and crucially, to securely serve private S3 content using Origin Access Control (OAC).
Why do we need OAC?
Since our S3 bucket is private, we need CloudFront to securely retrieve content on behalf of users without exposing the bucket to the world.
Why not OAI?
AWS has deprecated Origin Access Identity in favor of Origin Access Control (OAC), offering tighter security and more flexible permissions.
Why do we need jq?
In later steps we create a WAF Web ACL to firewall our CloudFront distribution. In order to associate the WAF Web ACL with our distribution we need to invoke the update-distribution API, which requires a fully fleshed out JSON payload updated with the Web ACL id.
GOTCHA: Attaching a WAF Web ACL to an existing CloudFront distribution requires that you use the update-distribution API, not associate-web-acl as one might expect.
Here’s the template for our distribution configuration (some of the Bash variables used will be evident when you examine the completed script):
{
"CallerReference": "$CALLER_REFERENCE",
$ALIASES
"Origins": {
"Quantity": 1,
"Items": [
{
"Id": "S3-$BUCKET_NAME",
"DomainName": "$BUCKET_NAME.s3.amazonaws.com",
"OriginAccessControlId": "$OAC_ID",
"S3OriginConfig": {
"OriginAccessIdentity": ""
}
}
]
},
"DefaultRootObject": "$ROOT_OBJECT",
"DefaultCacheBehavior": {
"TargetOriginId": "S3-$BUCKET_NAME",
"ViewerProtocolPolicy": "redirect-to-https",
"AllowedMethods": {
"Quantity": 2,
"Items": ["GET", "HEAD"]
},
"ForwardedValues": {
"QueryString": false,
"Cookies": {
"Forward": "none"
}
},
"MinTTL": 0,
"DefaultTTL": $DEFAULT_TTL,
"MaxTTL": $MAX_TTL
},
"PriceClass": "PriceClass_100",
"Comment": "CloudFront Distribution for $ALT_DOMAIN",
"Enabled": true,
"HttpVersion": "http2",
"IsIPV6Enabled": true,
"Logging": {
"Enabled": false,
"IncludeCookies": false,
"Bucket": "",
"Prefix": ""
},
$VIEWER_CERTIFICATE
}
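One way to render a template like this (a sketch - the full script may do it differently) is to export the variables and let envsubst, from GNU gettext, fill them in:

# Sketch: render the distribution template with envsubst.
# Assumes the JSON above is saved as distribution-template.json and that the referenced
# variables (BUCKET_NAME, OAC_ID, ROOT_OBJECT, ALIASES, VIEWER_CERTIFICATE, ...) are set.
export CALLER_REFERENCE="deploy-$(date +%s)"
export BUCKET_NAME OAC_ID ROOT_OBJECT DEFAULT_TTL MAX_TTL ALT_DOMAIN ALIASES VIEWER_CERTIFICATE
CONFIG_JSON="distribution-config.json"
envsubst <distribution-template.json >"$CONFIG_JSON"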
The create_cloudfront_distribution() function is then used to create the distribution.
create_cloudfront_distribution() {
# Snippet for brevity; see full script
run_command $AWS cloudfront create-distribution --distribution-config file://$CONFIG_JSON
}
Key lessons:
- Use the update-distribution API, not associate-web-acl, for CloudFront distributions
- Use jq to modify the existing configuration to add the WAF Web ACL id

Cool. We have a CloudFront distribution! But it’s wide open to the world. We needed to restrict access to our internal VPC traffic - without exposing the site publicly. AWS WAF provides this firewall capability using Web ACLs. Here’s what we need to do:
Keep in mind that CloudFront is designed to serve content to the public internet. When clients in our VPC access the distribution, their traffic needs to exit through a NAT gateway with a public IP. We’ll use the AWS CLI to query the NAT gateway’s public IP and use that when we create our allow list of IPs (step 1).
find_nat_ip() {
run_command $AWS ec2 describe-nat-gateways --filter "Name=tag:Environment,Values=$TAG_VALUE" --query "NatGateways[0].NatGatewayAddresses[0].PublicIp" --output text --profile $AWS_PROFILE
}
We take this IP and build our first WAF component: an IPSet. This becomes the foundation for the Web ACL we’ll attach to CloudFront.
The firewall we create will be composed of an allow list of IP addresses (step 2)…
create_ipset() {
run_command $AWS wafv2 create-ip-set \
--name "$IPSET_NAME" \
--scope CLOUDFRONT \
--region us-east-1 \
--addresses "$NAT_IP/32" \
--ip-address-version IPV4 \
--description "Allow NAT Gateway IP"
}
…that form the rules for our WAF Web ACL (step 3).
create_web_acl() {
run_command $AWS wafv2 create-web-acl \
--name "$WEB_ACL_NAME" \
--scope CLOUDFRONT \
--region us-east-1 \
--default-action Block={} \
--rules '[{"Name":"AllowNAT","Priority":0,"Action":{"Allow":{}},"Statement":{"IPSetReferenceStatement":{"ARN":"'$IPSET_ARN'"}},"VisibilityConfig":{"SampledRequestsEnabled":true,"CloudWatchMetricsEnabled":true,"MetricName":"AllowNAT"}}]' \
--visibility-config SampledRequestsEnabled=true,CloudWatchMetricsEnabled=true,MetricName="$WEB_ACL_NAME"
}
This is where our earlier jq surgery becomes critical - attaching the Web ACL requires updating the entire CloudFront distribution configuration. And that’s how we finally attach that Web ACL to our CloudFront distribution (step 4).
DISTRIBUTION_CONFIG=$(run_command $AWS cloudfront get-distribution-config --id $DISTRIBUTION_ID)

# Use jq to inject WebACLId into config JSON
UPDATED_CONFIG=$(echo "$DISTRIBUTION_CONFIG" | jq --arg ACL_ARN "$WEB_ACL_ARN" '.DistributionConfig | .WebACLId=$ACL_ARN')

# Pass updated config back into update-distribution
echo "$UPDATED_CONFIG" > updated-config.json
run_command $AWS cloudfront update-distribution --id $DISTRIBUTION_ID --if-match "$ETAG" --distribution-config file://updated-config.json
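One detail the snippet glosses over: update-distribution needs the distribution’s current ETag, and get-distribution-config returns it alongside the config. A sketch of capturing both from a single call (variable names are illustrative):

# get-distribution-config returns { "ETag": ..., "DistributionConfig": {...} }
RESPONSE=$(run_command $AWS cloudfront get-distribution-config --id "$DISTRIBUTION_ID")
ETAG=$(echo "$RESPONSE" | jq -r '.ETag')
UPDATED_CONFIG=$(echo "$RESPONSE" | jq --arg ACL_ARN "$WEB_ACL_ARN" '.DistributionConfig | .WebACLId=$ACL_ARN')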
At this point, our CloudFront distribution is no longer wide open. It is protected by our WAF Web ACL, restricting access to only traffic coming from our internal VPC NAT gateway.
For many internal-only sites, this simple NAT IP allow list is enough. WAF can handle more complex needs like geo-blocking, rate limiting, or request inspection - but those weren’t necessary for us. Good design isn’t about adding everything; it’s about removing everything that isn’t needed. A simple allow list was also the most secure.
When we set up our bucket, we blocked public access - an S3-wide security setting that prevents any public access to the bucket’s contents. However, this also prevents CloudFront (even with OAC) from accessing S3 objects unless we explicitly allow it. Without this policy update, requests from CloudFront would fail with Access Denied errors.
At this point, we need to allow CloudFront to access our S3 bucket. The update_bucket_policy() function will apply the policy shown below.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": "cloudfront.amazonaws.com"
},
"Action": "s3:GetObject",
"Resource": "arn:aws:s3:::$BUCKET_NAME/*",
"Condition": {
"StringEquals": {
"AWS:SourceArn": "arn:aws:cloudfront::$AWS_ACCOUNT:distribution/$DISTRIBUTION_ID"
}
}
}
]
}
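The full update_bucket_policy() function isn’t reproduced here; a minimal sketch, assuming the JSON above has already been rendered into bucket-policy.json with the variables filled in, would be:

update_bucket_policy() {
  # Apply the rendered policy so the CloudFront distribution (via OAC) can read the bucket.
  run_command $AWS s3api put-bucket-policy \
    --bucket "$BUCKET_NAME" \
    --policy file://bucket-policy.json \
    --profile "$AWS_PROFILE"
}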
Modern OAC best practice is to use the AWS:SourceArn condition to ensure only requests from your specific CloudFront distribution are allowed.
It’s more secure because it ties bucket access directly to a single distribution ARN, preventing other CloudFront distributions (or bad actors) from accessing your bucket.
"Condition": {
"StringEquals": { "AWS:SourceArn": "arn:aws:cloudfront::$AWS_ACCOUNT:distribution/$DISTRIBUTION_ID" }
}
With this policy in place, we’ve completed the final link in the security chain. Our S3 bucket remains private but can now securely serve content through CloudFront - protected by OAC and WAF.
We are now ready to wrap a bow around these steps in an idempotent Bash script:

- Create the S3 bucket
- Create the CloudFront distribution with OAC
- Restrict access with WAF, attaching the Web ACL via the jq and update-distribution patch
- Update the bucket policy so CloudFront can read the bucket

Each segment of our script is safe to rerun. Each is wrapped in run_command(), capturing results for later steps and ensuring errors are logged. We now have a script we can commit and re-use with confidence whenever we need a secure static site. Together, these steps form a robust, idempotent deployment pipeline for a secure S3 + CloudFront website - every time.
You can find the full script here.
A hallmark of a production-ready script is an ‘-h’ option. Oh wait - your script has no help or usage? I’m supposed to RTFC? It ain’t done skippy until it’s done.
Scripts should include the ability to pass options that make it a flexible utility. We may have started out writing a “one-off” but recognizing opportunities to generalize the solution turned this into another reliable tool in our toolbox.
Be careful though - not every one-off needs to be a Swiss Army knife. Just because aspirin is good for a headache doesn’t mean you should take the whole bottle.
Our script now supports the necessary options to create a secure, static website with a custom domain and certificate. We even added the ability to include additional IP addresses for your allow list in addition to the VPC’s public IP.
Now, deploying a private S3-backed CloudFront site is as easy as:
Example:
./s3-static-site.sh -b my-site -t dev -d example.com -c arn:aws:acm:us-east-1:cert-id
Inputs:
This single command now deploys an entire private website - reliably and repeatably. It only takes a little longer to do it right!
The process of working with ChatGPT to construct a production ready script that creates static websites took many hours. In the end, several lessons were reinforced and some gotchas discovered. Writing this blog itself was a collaborative effort that dissected both the technology and the process used to implement it. Overall, it was a productive, fun and rewarding experience. For those not familiar with ChatGPT or who are afraid to give it a try, I encourage you to explore this amazing tool.
Here are some of the things I took away from this adventure with ChatGPT.
With regard to the technology, some lessons were reinforced, some new knowledge was gained:
Use the update-distribution API call, not associate-web-acl, when adding WAF Web ACLs to your distribution!

Thanks to ChatGPT for being an ever-present backseat driver on this journey. Real AWS battle scars + AI assistance = better results.
In Part III we wrap it all up as we learn more about how CloudFront and WAF actually protect your website.
This post was drafted with the assistance of ChatGPT, but born from real AWS battle scars.
If you like this content, please leave a comment or consider following me. Thanks.
After designing a secure static website on AWS using S3, CloudFront, and WAF as discussed in Part I of this series, we turned our focus to automating the deployment process. While AWS offers powerful APIs and tools, we quickly encountered several challenges that required careful consideration and problem-solving. This post explores the primary difficulties we faced and the lessons we learned while automating the provisioning of this infrastructure.
A key challenge when automating AWS resources is managing service dependencies. Our goal was to deploy a secure S3 website fronted by CloudFront, secured with HTTPS (via ACM), and restricted using WAF. Each of these services relies on others, and the deployment sequence is critical:
Missteps in the sequence can result in failed or partial deployments, which can leave your cloud environment in an incomplete state, requiring tedious manual cleanup.
AWS infrastructure often exhibits eventual consistency, meaning that newly created resources might not be immediately available. We specifically encountered this when working with ACM and CloudFront:
Handling these delays requires building polling mechanisms into your automation or using backoff strategies to avoid hitting API limits.
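A minimal polling sketch (illustrative, not taken from our deployment script) - wait for a CloudFront distribution to report “Deployed”, backing off between checks:

wait_for_distribution() {
  # Poll the distribution status with a simple linear backoff; CloudFront can take many minutes.
  local id="$1" status attempt
  for attempt in $(seq 1 30); do
    status=$(aws cloudfront get-distribution --id "$id" \
      --query 'Distribution.Status' --output text)
    [ "$status" = "Deployed" ] && return 0
    sleep $((attempt * 10))
  done
  echo "Timed out waiting for distribution $id" >&2
  return 1
}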
Reliable automation is not simply about executing commands; it requires designing for resilience and repeatability:
Additionally, logging the execution of deployment commands proved to be an unexpected challenge. We developed a run_command
function that captured both stdout and stderr while logging the output to a file. However, getting this function to behave correctly without duplicating output or interfering with the capture of return values required several iterations and refinements. Reliable logging during automation is critical for debugging failures and ensuring transparency when running infrastructure-as-code scripts.
While the AWS CLI and SDKs are robust, they are often verbose and require a deep understanding of each service:
Throughout this process, we found that successful AWS automation hinges on the following principles:
Automating AWS deployments unlocks efficiency and scalability, but it demands precision and robust error handling. Our experience deploying a secure S3 + CloudFront website highlighted common challenges that any AWS practitioner is likely to face. By anticipating these issues and applying resilient practices, teams can build reliable automation pipelines that simplify cloud infrastructure management.
Next up, Part IIb where we build our script for creating our static site.
This post was drafted with the assistance of ChatGPT, but born from real AWS battle scars.
If you like this content, please leave a comment or consider following me. Thanks.
While much attention is given to dynamic websites, there are still many uses for the good ol’ static website. Whether for hosting documentation, internal portals, or lightweight applications, static sites remain relevant. In my case, I wanted to host an internal CPAN repository for storing and serving Perl modules. AWS provides all of the necessary components for this task, but choosing the right approach and configuring it securely and automatically can be a challenge.
Whenever you make an architectural decision, various approaches are possible. It’s a best practice to document that decision in an Architectural Decision Record (ADR). This type of documentation justifies your design choice, spelling out precisely how each approach either meets or fails to meet functional or non-functional requirements. In the first part of this blog series we’ll discuss the alternatives and why we ended up choosing our CloudFront-based approach. This is our ADR.
| | Description | Notes |
|---|---|---|
| 1. | HTTPS website for hosting a CPAN repository | Will be used internally but we would like secure transport |
| 2. | Controlled Access | Can only be accessed from within a private subnet in our VPC |
| 3. | Scalable | Should be able to handle increasing storage without reprovisioning |
| 4. | Low-cost | Ideally less than $10/month |
| 5. | Low-maintenance | No patching or maintenance of application or configurations |
| 6. | Highly available | Should be available 24x7, content should be backed up |
Now that we’ve defined our functional and non-functional requirements let’s look at some approaches we might take in order to create a secure, scalable, low-cost, low-maintenance static website for hosting our CPAN repository.
This solution at first glance seems like the quickest shot on goal. While S3 does offer a static website hosting feature, it doesn’t support HTTPS by default, which is a major security concern and does not match our requirements. Additionally, website-enabled S3 buckets do not support private access controls - they are inherently public if enabled. Had we been able to accept an insecure HTTP site and public access this approach would have been the easiest to implement. If we wanted to accept public access but required secure transport we could have used CloudFront with the website enabled bucket either using CloudFront’s certificate or creating our own custom domain with its own certificate.
Since our goal is to create a private static site, we can however use CloudFront as a secure, caching layer in front of S3. This allows us to enforce HTTPS, control access using Origin Access Control (OAC), and integrate WAF to restrict access to our VPC. More on this approach later…
Pros:
Cons:
Analysis:
While using an S3 website-enabled bucket is the easiest way to host static content, it fails to meet security and privacy requirements due to public access and lack of HTTPS support.
Perhaps the obvious approach to hosting a private static site is to deploy a dedicated Apache or Nginx web server on an EC2 instance. This method involves setting up a lightweight Linux instance, configuring the web server, and implementing a secure upload mechanism to deploy new content.
Pros:
Cons:
Analysis:
Using a dedicated web server is a viable alternative when additional flexibility is needed, but it comes with added maintenance and cost considerations. Given our requirements for a low-maintenance, cost-effective, and scalable solution, this may not be the best approach.
A common approach I have used to securely serve static content from an S3 bucket is to use an internal proxy server (such as Nginx or Apache) running on an EC2 instance within a private VPC. In fact, this is the approach I have used to create my own private yum repository, so I know it would work effectively for my CPAN repository. The proxy server retrieves content from an S3 bucket via a VPC endpoint, ensuring that traffic never leaves AWS’s internal network. This approach requires managing an EC2 instance, handling security updates, and scaling considerations. Let’s look at the cost of an EC2 based solution.
The following cost estimates are based on AWS pricing for us-east-1:
| Item | Pricing |
|---|---|
| Instance type: t4g.nano (cheapest ARM-based instance) | Hourly cost: $0.0052/hour |
| Monthly usage: 730 hours (assuming 24/7 uptime) | 0.0052 x 730 = $3.80/month |
Pros:
Cons:
Analysis:
If predictable costs and full server control are priorities, EC2 may be preferable. However, this solution requires maintenance and may not scale with heavy traffic. Moreover, to create an HA solution would require additional AWS resources.
As alluded to before, CloudFront + S3 might fit the bill. To create a secure, scalable, and cost-effective private static website, we chose to use Amazon S3 with CloudFront (sprinkling in a little AWS WAF for good measure). This architecture allows us to store our static assets in an S3 bucket while CloudFront acts as a caching and security layer in front of it. Unlike enabling public S3 static website hosting, this approach provides HTTPS support, better scalability, and fine-grained access control.
CloudFront integrates with Origin Access Control (OAC), ensuring that the S3 bucket only allows access from CloudFront and not directly from the internet. This eliminates the risk of unintended public exposure while still allowing authorized users to access content. Additionally, AWS WAF (Web Application Firewall) allows us to restrict access to only specific IP ranges or VPCs, adding another layer of security.
Let’s look at costs:
| Item | Cost | Capacity | Total |
|---|---|---|---|
| Data Transfer Out | First 10TB is $0.085 per GB | 25GB/month of traffic | 25 x 0.085 = $2.13 |
| HTTP Requests | $0.0000002 per request | 250,000 requests/month | 250,000 x 0.0000002 = $0.05 |

Total CloudFront cost: $2.13 (Data Transfer) + $0.05 (Requests) = $2.18/month
Pros:
Cons:
Analysis:
And the winner is…CloudFront + S3!
Using just a website-enabled S3 bucket fails to meet the basic requirements, so let’s eliminate that solution right off the bat. If predictable costs and full server control are priorities, using an EC2 either as a proxy or as a full-blown web server may be preferable. However, for a low-maintenance, auto-scaling solution, CloudFront + S3 is the superior choice. EC2 is slightly more expensive but avoids CloudFront’s external traffic costs. Overall, our winning approach is ideal because it scales automatically, reduces operational overhead, and provides strong security mechanisms without requiring a dedicated EC2 instance to serve content.
Now that we have our agreed upon approach (the “what”) and documented our “architectural decision”, it’s time to discuss the “how”. How should we go about constructing our project? Many engineers would default to Terraform for this type of automation, but we had specific reasons for thinking this through and looking at a different approach. We’d like:
While Terraform is a popular tool for infrastructure automation, it introduces several challenges for this specific project. Here’s why we opted for a Bash script over Terraform:
State Management Complexity
Terraform relies on state files to track infrastructure resources, which introduces complexity when running and re-running deployments. State corruption or mismanagement can cause inconsistencies, making it harder to ensure a seamless idempotent deployment.
Slower Iteration and Debugging
Making changes in Terraform requires updating state, planning, and applying configurations. In contrast, Bash scripts execute AWS CLI commands immediately, allowing for rapid testing and debugging without the need for state synchronization.
Limited Control Over Execution Order
Terraform follows a declarative approach, meaning it determines execution order based on dependencies. This can be problematic when AWS services have eventual consistency issues, requiring retries or specific sequencing that Terraform does not handle well natively.
Overhead for a Simple, Self-Contained Deployment
For a relatively straightforward deployment like a private static website, Terraform introduces unnecessary complexity. A lightweight Bash script using AWS CLI is more portable, requires fewer dependencies, and avoids managing an external Terraform state backend.
Handling AWS API Throttling
AWS imposes API rate limits, and handling these properly requires implementing retry logic. While Terraform has some built-in retries, it is not as flexible as a custom retry mechanism in a Bash script, which can incorporate exponential backoff or manual intervention if needed.
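For illustration, a retry wrapper with exponential backoff might look like this (a sketch, not code from our script):

retry() {
  # Retry a command up to $max times, doubling the delay after each failure.
  local max=5 delay=2 attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      echo "Giving up after $max attempts: $*" >&2
      return 1
    fi
    echo "Attempt $attempt failed; retrying in ${delay}s..." >&2
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}

# Usage: retry aws s3api head-bucket --bucket my-bucket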
Less Direct Logging and Error Handling
Terraform’s logs require additional parsing and interpretation, whereas a Bash script can log every AWS CLI command execution in a simple and structured format. This makes troubleshooting easier, especially when dealing with intermittent AWS errors.
Although Bash was the right choice for this project, Terraform is still useful for more complex infrastructure where:
For our case, where the goal was quick, idempotent, and self-contained automation, Bash scripting provided a simpler and more effective approach. This approach gave us the best of both worlds - automation without complexity, while still ensuring idempotency and security.
This post was drafted with the assistance of ChatGPT, but born from real AWS battle scars.
If you like this content, please leave a comment or consider following me. Thanks.
This is the last in our three part series where we discuss the creation of a private, secure, static website using Amazon S3 and CloudFront.
Amazon S3 and CloudFront are powerful tools for hosting static websites, but configuring them securely can be surprisingly confusing - even for experienced AWS users. After implementing this setup for my own use, I discovered a few nuances that others often stumble over, particularly around CloudFront access and traffic routing from VPC environments. This post aims to clarify these points and highlight a potential gap in AWS’s offering.
The typical secure setup for hosting a static website using S3 and CloudFront looks like this:
This setup ensures that even if someone discovers your S3 bucket URL, they won’t be able to retrieve content directly. All access is routed securely through CloudFront.
For many AWS users, especially those running workloads inside a VPC, the first head-scratcher comes when internal clients access the CloudFront-hosted website. You might notice that this traffic requires a NAT gateway, and you’re left wondering:
Here’s the key realization:
CloudFront is a public-facing service. Even when your CloudFront distribution is serving content from a private S3 bucket, your VPC clients are accessing CloudFront through its public endpoints.
This distinction is not immediately obvious, and it can be surprising to see internal traffic going through a NAT gateway and showing up with a public IP.
For my use case, I wasn’t interested in CloudFront’s global caching or latency improvements; I simply wanted a secure, private website hosted on S3, with a custom domain and HTTPS. AWS currently lacks a streamlined solution for this. A product offering like “S3 Secure Website Hosting” could fill this gap by combining:
To restrict access to your CloudFront-hosted site, you can use AWS WAF with an IPSet containing your NAT gateway’s public IP address. This allows only internal VPC clients (routing through the NAT) to access the website while blocking everyone else.
The S3 + CloudFront setup is robust and secure - once you understand the routing and public/private distinction. However, AWS could better serve users needing simple, secure internal websites by acknowledging this use case and providing a more streamlined solution.
Until then, understanding these nuances allows you to confidently deploy secure S3-backed websites without surprises.
This post was drafted with the assistance of ChatGPT, but born from real AWS battle scars.
If you like this content, please leave a comment or consider following me. Thanks.
We’ve all been there. Someone bursts into your office (or sends you a message on Teams) with their hair on fire. “Everything is broken! The system is down! Customers are complaining!”
Your adrenaline spikes. Your brain starts racing. Over the years, I’ve learned that the best leaders don’t rush to react. They don’t speed up; they slow down, assess, and respond with clarity. Ever hear of the OODA loop? Did you know that late commitment is an important element of agility?
My experience handling many a supposed Chernobyl meltdown has led me to what I call Rob’s Rule of Three. It’s my personal framework that I’ve used for years to successfully cut through the noise and make decisions under stress. You too can be a successful executive IF you can just…
…keep your head when all about you are losing theirs and blaming it on you…
To remind myself, my team, and my organization of the importance of these rules, I wrote them on my whiteboard. When someone brought me a “problem” they were reminded of them.
Let’s break this down.
Panic distorts reality. Panic creates a distorted projection of the future. Our imaginings of possible negative outcomes never really match what actually happens, do they?
Many issues that feel urgent in the moment turn out to be noise or minor hiccups. Before reacting, help the person and the organization explain:
Many times, the fire is extinguished under the blanket of scrutiny.
As a leader, you can’t and shouldn’t solve everything. Sometimes your role is to delegate, coach, or empower others to handle it. Yes, you probably climbed the ladder to your current perch by personally handling many a crisis. People look at you as a hero. But that’s history. You’re not in the hero business anymore. You are a leader of heroes. Resist the urge to be the hero. It’s a trap.

- Who owns this system or process?
- Who is closest to the problem?
- Is this something no one else is equipped to handle?
If it’s not your problem, hand it off and move on. And sometimes, when there is no one else capable of handling the issue, you have identified an organizational hole - or an opportunity to mentor your replacement!
Not everything is a five-alarm fire. Some problems can wait. The universe changes every second, Padawan. Feel the force.

- Will delaying action cause harm?
- Can we mitigate the issue for now in some way and plan a proper fix later?
Most issues can be scheduled into normal workstreams. Reminding people of the process reinforces calm. Reserve your immediate energy for true emergencies.
Scott Adams, creator of Dilbert, once joked about Wally’s Rule - that after three days, most requests are irrelevant. While it’s a comic exaggeration, there’s truth in it: Some problems evaporate if you simply wait.
When you have a cold, if you get lots of rest, drink plenty of fluids, and have some chicken soup, it will go away in 5-7 days. If you do nothing it might take a week.
When panic strikes, leaders are judged by their ability to stay calm and make sound decisions. The Rule of Three cuts through the noise, reduces reactionary decisions, and reinforces trust within your team.
I have often reminded my team and our internal customers of the team’s track record when there is panic in the air:
“When was the last time we didn’t solve a customer problem? When did we ever leave a system broken? Stop. Think. We’re still here. The business is still running. You are the A-Team! Together we’ll solve this one too!”
Calm is faster. Calm is smarter. Slow is fast, less is more.
This is my personal addendum. “Take an emergency lightly” doesn’t mean ignoring it. It means approaching it with the confidence that you and your team will handle it. Because “that’s sort of what we do”.
So, next time someone runs in with their hair on fire, stop, drop, but don’t roll. Remember the Rule of Three. And take it lightly, while applying a little self affirmation.
…let’s just move on shall we ;-)
So you want to be the guy, the one who swoops into the shop that has been saddled with the legacy Perl application because you’ve been doing Perl since the last century? You know that shop: they have a Perl application and a bunch of developers who only do Python and have suddenly become allergic to learning something new (to them). From my own experience, here are some of the technologies you’ll encounter and should be familiar with to be the guy.
mod_perl
FastCGI
Moose
HTML::Template
Mason
Template::Toolkit
I checked off the things I’ve encountered in my last three jobs.
Of course, the newer Perl based frameworks are good to know as well:
docker
cpanm
carton
make
bash
…and of these, I think the most common thing you’ll encounter on sites that run Perl applications is mod_perl.
Well, maybe not gold, but certainly higher rates and salaries for experienced Perl developers. You’re a unicorn! Strut your stuff. Don’t back down and go cheap. Every day someone leaves the ranks of Perl development only to become one of the herd leaving you to graze alone.
Over the last three years I’ve earned over a half-million dollars in salary and consulting fees. Some of you are probably earning more. Some less. But here’s the bottom line: your skills are becoming scarcer and scarcer. And here’s the kicker…these apps aren’t going away. Companies are loath to touch some of their cash cows or invest in any kind of “rewrite”. And here’s why…
And here’s what they want you to do for a big pile of their cash:
perl
According to the “interweb”, the average salary for an experienced Perl developer is around $50/hour, or about $100K a year. I’m suspicious of those numbers to be honest. Your mileage may vary, but here’s what I’ve been able to get in my last few jobs:
…and I’m not a great negotiator. I do have over 20 years of experience with Perl and over 40 years of experience in IT. I’m not shy about promoting the value of that experience either. I did turn down a job for $155K/year that would have required some technical leadership, a position I think should have been more like $185k/year to lead a team of Perl developers across multiple time zones.
Even if you decide to leave a job or are done with an assignment, don’t burn bridges. Be willing to help them with a transition. Be polite, and ask for a recommendation if appropriate. If they’re not planning on rehiring, they may be willing to contract with you for spot assignments.
In my last blog I introduced AppRunner a relatively new service from AWS that helps application developers concentrate on their applications instead of infrastructure. Similar to Elastic Beanstalk, AppRunner is an accelerator that gets your web application deployed in the most efficient way possible.
One of my concerns has been whether Amazon is actually committed to enhancing and extending this promising service. Searching the 2023 re:Invent announcements, I was disappointed to find no news about new features for AppRunner. However, it was encouraging to see that they did include a typical promotional seminar on AppRunner.
The video is definitely worth watching, but the “case for AppRunner” is a bit tedious. They seem to be trying to equate AppRunner with modernization and reduction of technical debt. If hiding the complexities of deploying a scalable web application to the cloud (specifically AWS) is modernization then ok? I guess?
But let’s be honest here. It’s magic. You give them a container, they give you a scalable web application. I’m not sure that’s modernization or reducing technical debt. It sounds more like going “all in”. Which, by the way, is totally cool with me. For my money, if you are going to leverage the cloud, then you damn well ought to leverage it. Don’t be shy. Take advantage of these services that reduce friction for developers and help you create value for your customers.
Since I referenced a re:Invent webinar I should mention that I’ve attended re:Invent 5 times. It was an amazing and fun experience. However, the last time I attended it was such a cluster f*ck that I decided it just wasn’t worth the effort and haven’t been back since. Their content is online now (thank you AWS!) and I can pick and choose what to attend online instead of trying to figure out how to manage my limited time and the demand they have for specific seminars. If you do go, plan on standing in line (a lot).
The straw that broke this camel’s back was picking up some kind of virus at the Venetian on day 1. Oh, the humanity! They make you walk through the casino to get to where you need to go. I have no doubt that I picked up some nasty bug somewhere between a craps and a blackjack table. This was way before COVID, but I wouldn’t even dream of going there without an N95 mask.
Unfortunately, I spent the first few days in bed missing most of the conference. I literally almost died. To this day I’m not sure how I got on a plane on Friday and made it home. After I nearly hit my head on the porcelain throne as I returned everything I happened to have eaten in Las Vegas to their water recycling plant, I passed out. When I woke up on the polished Venetian bathroom floor I decided that, as cool as the swag was and as great as it was to come home with more T-shirts than I would ever need, it just wasn’t worth the energy required to attend re:Invent. Speaking of cool…if you do happen to pass out in a Venetian bathroom, the marble floors are soothingly cool and you will get a good night’s rest.
Do not underestimate the amount of energy you need to attend re:Invent! Prepare yourself. To really experience re:Invent you need to wake at 6am, join the herd of people that parade to breakfast, plan your attack and move like a ninja through the venues. Seriously, start your day with their amazing breakfast.
I am partial to the Venetian, so that’s where I tried to stay by booking early. The Venetian can accommodate 15,000 people for breakfast and they do an amazing job. Gluten free? Yup. Veggie? Yup. You will not go hungry.
re:Invent now hosts over 50,000 attendees. The first year I went to re:Invent there were only about 10,000 in attendance. Honestly, it has become a complete mess. Buses take attendees between venues, but don’t count on getting to your seminar on time. And if you are late, tough luck. Your spot will be given to the stand-bys.
Enough about re:Invent…but if you do go, get yourself invited to some vendor event - they are awesome! And don’t forget re:Play!
In my last blog I mentioned a technical issue I had with AppRunner. Well, it turns out I’m not crazy. Their documentation was wrong (and lacking) and here’s the explanation I got from AWS support.
Hello Rob,
Thank you for your continued patience while working on this case with
me. I am reaching out to you with an update on the issue of
associating custom domain with the AppRunner service using AWS CLI. To
recap, I understand that you wanted to use AWS CLI to link custom
domain with AppRunner service, so that you could use www subdomain
with the custom domain. For that, as mentioned in the AppRunner
documentation at [1] we tried using the associate-custom-domain AWS
CLI command [2] and we noticed that the command was returning only the
status of the link and the CertificateValidationRecord objects were
not returned as a part of the output.
For this, I reached out the internal team since as per the
documentation, the CertificateValidationRecord objects should have
been returned. Upon working with the internal team, we realized that
we need to run describe-custom-domains AWS CLI command [3] together
with the associate-custom-domain AWS CLI command to get the
CertificateValidationRecord objects and then we need to add these
records manually to the custom domain in Route53 as a CNAME record
with the record name and value obtained from the
describe-custom-domains AWS CLI command. We have to perform the manual
actions even if the Route53 and AppRunner is in the same account when
working with AWS CLI. I am also providing the step by step details
below:
1. Run the associate-custom-domain AWS CLI command:
"aws apprunner associate-custom-domain --service-arn <AppRunner-Service-ARN> --domain-name <Custom-Domain> --enable-www-subdomain"
This will return the output as follows:
# Output:
{
"DNSTarget": "xxxxxxxxxx.us-east-1.awsapprunner.com",
"ServiceArn": "AppRunner-Service-ARN",
"CustomDomain": {
"DomainName": "Custom-Domain",
"EnableWWWSubdomain": true,
"Status": "creating"
}
}
2. Now, run the describe-custom-domains AWS CLI command a few seconds after running the associate-custom-domain AWS CLI command:
"aws apprunner describe-custom-domains --service-arn <AppRunner-Service-ARN>"
This will return an output with CertificateValidationRecords objects as follows:
# Output:
{
"DNSTarget": "xxxxxxxxxx.us-east-1.awsapprunner.com",
"ServiceArn": "AppRunner-Service-ARN",
"CustomDomains": [
{
"DomainName": "Custom-Domain",
"EnableWWWSubdomain": true,
"CertificateValidationRecords": [
{
"Name": "_5bf3e29fca6c29d29fc2b6e023bcaee3.apprunner-sample.com.",
"Type": "CNAME",
"Value": "_3563838161b023d78b951b036072e510.mhbtsbpdnt.acm-validations.aws.",
"Status": "PENDING_VALIDATION"
},
{
"Name": "_7f20ef08b12fbdddb670d0c7fb3c8076.www.apprunner-sample.com.",
"Type": "CNAME",
"Value": "_e1b6f670fac2f42ce30d160c2e3d92ea.mhbtsbpdnt.acm-validations.aws.",
"Status": "PENDING_VALIDATION"
},
{
"Name": "_14fc6b4f0d6b6a5524e7c3147eaec89d.2a57j78h5fsbzb7ey72hbx9c01pbxcf.apprunner-sample.com.",
"Type": "CNAME",
"Value": "_baecf356e1894de83dfca1b51cd8999f.mhbtsbpdnt.acm-validations.aws.",
"Status": "PENDING_VALIDATION"
}
],
"Status": "pending_certificate_dns_validation"
}
]
}
3. In Route 53, you need to manually add the records with the record type as CNAME with the corresponding names and values.
*I realize that the additional step of using describe-custom-domains
AWS CLI command is not mentioned in the documentation and for that I
have updated the internal team and the documentation team to get this
information added to the documentation.* Also, I tested the above steps
and I can confirm that the above workflow is validated and the domain
was associated successfully using AWS CLI.
Now, coming to the actual query of associating custom domain in a
different account from the AppRunner service, the internal team has
confirmed that currently the console custom domain functionality only
works if the Route 53 domain and App Runner service are in the same
account. The same is mentioned in Step 5 of the AppRunner
documentation at [1].
However, in case of AWS CLI, you need to perform the same steps as
above and you need to manually add the CertificateValidationRecords to
the account owning the Route 53 domain(s). You can view the
certificate validation record via the CLI using the
describe-custom-domain command as mentioned above.
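For completeness, adding one of those validation records by hand with the CLI might look like this (the hosted zone ID is a placeholder; the name and value come from the sample describe-custom-domains output above):

aws route53 change-resource-record-sets \
  --hosted-zone-id Z0123456789EXAMPLE \
  --change-batch '{
    "Changes": [{
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "_5bf3e29fca6c29d29fc2b6e023bcaee3.apprunner-sample.com.",
        "Type": "CNAME",
        "TTL": 300,
        "ResourceRecords": [{"Value": "_3563838161b023d78b951b036072e510.mhbtsbpdnt.acm-validations.aws."}]
      }
    }]
  }'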
So, I’m happy to report that the issue was resolved which gives me more confidence that AppRunner has a future.
For my application, since AppRunner still does not support EFS or mounting external file systems, I will need to identify how I am using my EFS session directory and remove that dependency.
Looking at my application, I can see a path using S3. Using S3 as a session store will not be particularly difficult. S3 will not have the performance characteristics of EFS, but I’m not sure that matters. Deleting session objects becomes a bit more complex since we can’t just delete a “directory”.
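For example, if sessions were stored under a per-session prefix (an assumption about how I might lay this out, not existing code), “deleting the directory” becomes deleting every object that shares the prefix:

# Hypothetical layout: s3://$BUCKET/sessions/<session-id>/...
BUCKET="my-app-sessions"     # placeholder
SESSION_ID="abc123"          # placeholder

# The closest S3 gets to "rm -rf <dir>": remove every object under the prefix.
aws s3 rm "s3://${BUCKET}/sessions/${SESSION_ID}/" --recursive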
Another intriguing use for AppRunner is to use it to implement services, either RESTful APIs or seldom-invoked back-end services.
APIs are definitely one of the target uses for this service, as discussed in the re:Invent video. Triggering a task is also a use case I want to explore. Currently, I use a CloudWatch event to trigger a Lambda that invokes a Fargate task for doing things like a nightly backup. That dance seems like it can be replaced (somehow) by using AppRunner…hmmm…need to noodle this some more…
So far, I luv me some AppRunner.
In May of 2021, AWS released AppRunner to the public.
AWS App Runner is an AWS service that provides a fast, simple, and cost-effective way to deploy from source code or a container image directly to a scalable and secure web application in the AWS Cloud. You don’t need to learn new technologies, decide which compute service to use, or know how to provision and configure AWS resources.
App Runner connects directly to your code or image repository. It provides an automatic integration and delivery pipeline with fully managed operations, high performance, scalability, and security.
What makes AppRunner so compelling are these important features:
Back in 2012, I started a SaaS application (Treasurer’s Briefcase) for providing recordkeeping services for small non-profit organizations like PTOs, PTAs and Booster clubs. Back then, I cobbled together the infrastructure using the console, then started to explore CloudFormation and eventually re-architected everything using Terraform.
The application is essentially based on a LAMP stack - well sort of since I use a different templating web framework rather than PHP. The stack consists of an EC2 that hosts the Apache server, an EC2 that hosts some backend services, an ALB, a MySQL RDS instance and a VPC. There are a few other AWS services used like S3, SQS and EFS, but essentially the stack is relatively simple. Even so, provisioning all of that infrastructure using Terraform alone and creating a development, test, and production environments was a bit daunting but a great learning experience.
Starting with the original infrastructure, I reverse engineered it using terraforming and then expanded it using terraform.
The point being, it wasn’t necessarily easy to get it all right the first time. Keeping up with Terraform was also a challenge as it evolved over the years too. Moreover, maintaining infrastructure was just another task that provided no incremental value to the application. Time spent on that task took away from creating new features and enhancements that could provide more value to customers.
Enter AppRunner…with the promise of taking all of that work and chucking it out the window. Imagine creating a Docker container with your application and handing it to AWS and saying “host this for me, make it scalable, create and maintain an SSL certificate for me, create a CI/CD pipeline to redeploy the application when I make changes and make it cheap.” I’m in.
AppRunner has evolved over the years and has become much more mature. However, it still has some warts and pimples that might make you think twice about using it. Back in 2021 it was an interesting new service, an obvious evolutionary step from Fargate Tasks which provide some of the same features as AppRunner. Applications that utilized Fargate Tasks as the basis for running their containerized web applications still had to provision a VPC, load balancers, and manage scaling on their own. AppRunner bundles all of those capabilities and creates a compelling argument for moving Fargate based apps to AppRunner.
Prior to October 2022, AppRunner did not support the ability to access resources from within a VPC. That made it impossible, for example, to use a non-publicly accessible RDS instance. With that addition in October of 2022, it became possible to have a web application that could access your RDS in your VPC.
The fall of 2023 has seen several changes that make AppRunner even more compelling:
| Change | Description | Date |
|---|---|---|
| Release: App Runner adds support for AWS Europe (Paris), AWS Europe (London), and AWS Asia Pacific (Mumbai) Regions | AWS App Runner now supports AWS Europe (Paris), AWS Europe (London), and AWS Asia Pacific (Mumbai) Regions. | November 8, 2023 |
| Release: App Runner adds dual stack support for incoming network traffic | AWS App Runner now adds dual stack support for incoming traffic through public endpoints. | November 2, 2023 |
| Release: App Runner automates Route 53 domain configuration for your services | AWS App Runner automates Route 53 domain configuration for your App Runner service web applications. | October 4, 2023 |
| Release: App Runner adds support for monorepo source-code based services | AWS App Runner now supports the deployment and maintenance for monorepo source-code based services. | September 26, 2023 |
| Release: App Runner adds more features to auto scaling configuration management | AWS App Runner enhances auto scaling configuration management features. | September 22, 2023 |
Some of the limitations of AppRunner currently include:

- No support for mounting external file systems such as EFS
- Incomplete support for associating custom domains via the CLI
The first limitation is a bit of a show-stopper for more than a few web applications that might rely on mounted file systems to access assets or provide a stateful storage environment. For my application I use EFS to create session directories for logged-in users. Using EFS I can be assured that each EC2 in my web farm accesses the user’s session regardless of which EC2 serves the request. Without EFS, I will be forced to re-think how to create a stateful storage environment for my web app. I could use S3 as storage (and probably should) but EFS provided a “quick-shot-on-goal” at the time.
The second limitation was just frustrating, as associating a custom domain only sort of kinda works. When I associated a domain managed by AWS (in the same account as my AppRunner application), I was able to get the TLD to resolve and work as expected. AppRunner was able to associate my application with the domain AND provide an SSL certificate. It will redirect any http request to https. Unfortunately, I could not associate the www sub-domain using the CLI as documented. In fact, I could not even get the CLI to work without trying to enable the www sub-domain. Working with AWS support confirmed my experience and I still have a ticket pending with support on this issue. I’m confident that will be resolved soon(?) so it should not limit my ability to use this service in the future.
AppRunner is an exciting new service that will make application development and deployment seamless allowing developers to focus on the application not the infrastructure.
You can find the AppRunner roadmap and current issues here.
Every development project ultimately has a goal of providing some kind of value to the organization that has decided to initiate a software development project.
The bottom line of any software development project is the bottom line. Does the cost of the project AND the maintenance of the project create a profit?
I know what you are thinking. Not all software applications are designed to produce profit. Untrue. Even applications we call “internal” create value or contribute to the creation of value.
Let’s talk about and characterize failure first, because it’s much easier to define (as anyone who has had the misfortune of working with a product development team that cannot define “done” knows). And I’ve been told that most software development projects fail.
The project is canceled.
This is the “first order broke” condition of projects. It took too long, it went over budget and looked to continue to be a money pit (someone understood the fallacy of sunk costs), the environment changed making the application moot or a new CEO decided to replace all internal applications with some SaaS, PaaS, or his own pet project.
The application was launched and did not meet the goals of the project.
This can mean a lot of things: the project does not solve enough of the business problems to justify the continued cost of maintenance. Or perhaps the application did not generate enough revenue to justify its existence because of poor market acceptance. People just hate using it.
The project is in use, people use it, but the ROI is too far in the future or perhaps indeterminate.
The project becomes a drag on the organization. No one wants to pull the plug because they have no alternative (or believe they don’t). There’s no appetite to rewrite, refactor or reimagine the application. It becomes a huge boat anchor that a handful of engineers keep running by kicking it in the ass whenever it stalls.
The project launches on time and under budget.
Keep in mind that this is (mostly) a necessary, but insufficient, condition for success. Yes, there are some successful projects that are over budget or late, but it’s sort of like starting Monopoly owing everyone money. You need to catch up and catch up fast.
The application completely solves the business problem.
Again, a necessary but insufficient condition for success. If the application is difficult to maintain and requires constant attention that costs more than it saves or produces, it’s not a success.
The application just works
…and is a critical component in a complex workflow - without it nothing else would work - its cost to develop and maintain is easily justified by the nature of its job. It successfully completes its mission every single day.
Oh yeah, Agile. I read articles about Agile and people’s experience with it all the time. I suspect most opinions are based on few data points and mostly from one person’s negative (or rarely positive) experience with Agile. My opinions (and that’s all they are…YMMV) are based on working with some fairly large clients that I am not at liberty to divulge. One FANG, one Fortune 50 company, one major manufacturer of phones and multiple companies with more than 5000 employees. I’m not opining based on one ride on the merry-go-round. I’m the kind of person that always believes that I just don’t get it, and I need to learn more, read more and accept more to overcome my ignorance and lack of experience. It’s a viewpoint that has allowed me to grow in my career and learn a lot of very useful things that have conspired to make me, if not wealthy, not concerned about money.
I am now having a lot fun going back to my roots of being a software developer. While I have been on the management side of projects employing the Agile process I am now in the belly of the beast. It smells bad, feels wrong and kills productivity. But, again, YMMV.
Product Owners - All “product owners” are not created equal. They have varying degrees of understanding of their own domain. Some even believe developers have ESP. To be fair, some expect developers (and rightly so) to “ask questions”. The problem is, what happens when the developer does not understand the domain? What questions should they ask? They are clueless.
Product owners should assume nothing (in my opinion) and determine the level of domain expertise developers have. It is their responsibility to make that assessment - if they don’t they must be explicit with requirements, otherwise you’ll almost certainly end up with a project or feature that does not meet your needs.
So, here’s the bottom line. Any idea worth something greater than 0 that also has a wee bit of marketing behind it quickly becomes an opportunity for gypsies, tramps and thieves to exploit the ignorant masses. Take Christianity for example. Need I say more? Agile has become the Christianity of corporate America. No one dare mention that it doesn’t solve our problems or make us feel any better. Fuck Agile, the ceremonies, the training, the roles, the practice…it is the most unproductive environment one can devise for developing software. Look it up…Bill Gates wrote an entire BASIC interpreter and shoved it into 4K of a ROM. He then worked on a 32K version that was essentially a complete OS. He didn’t need Agile to do that.
So, let’s be clear. Agile is social engineering. An attempt to organize human beings in order to create something that no one of them could do alone (or so it goes). Somehow I don’t think Agile works. Some will say, yeah, well not every project should use Agile. Yes, that’s true, but the sad fact is that corporate America is not nuanced. They are binary. They want single solutions to complex problems and do not want to hear…it depends. And so they consume the entire bottle of aspirin.
There will be a day when people look back at the unproductive waste and utter insanity that is “Agile”. They will marvel at the way that a single, possibly good idea for some things was transformed into a dogma that haunted software development for a decade.
I’m hopeful however that really smart companies know that instituting things like Agile is the bellwether of their demise. They will avoid trying to fit round pegs into square holes. They will embrace the idea that you can plan things properly, and that plans can change, without embracing a chaotic, highly disorganized process that actually masquerades as a structured protocol.
You have been warned. When some consultant you hire to justify the outsourcing of your development team says that they can replace your current processes with an Agile team from Elbonia and a scrum master from Bumblefuck…be afraid…be very afraid. There is no free lunch.
One final thought…why is software development so hard? And why do we struggle so to create applications?
It’s not a hard question actually. The goal of software development is to codify a solution to a problem. But first…and here is the reveal…you have to define the problem. That is, in and of itself, the most difficult thing in the development process. Missed requirements are, in my experience, the biggest reason for “re-work”. Note I did not say “bugs” or “defects”. Most maintenance on systems is because of missed requirements, not because programmers make mistakes. Oh, for sure, they do. But really? Think. Look back at your tickets and do a root cause analysis.
There are other reasons software development is hard. First, people do not communicate well. They do not communicate precisely and they do not communicate accurately. Next, the tools to express the solutions to our problems are complex and incomplete. Better ingredients make better pizzas. Papa Johns!
Okay, I have to wrap this up…Agile sucks. I hate Agile. I want to mute myself when I’m in stand-ups just to say every day “Oh, I was on mute.” and torture everyone that thinks this ceremony is useful.
Oh, I’m having issues with my internet so I may have to drop soon….open the pod bay doors, Hal?