In my last blog I introduced AppRunner, a relatively new service from AWS that helps application developers concentrate on their applications instead of infrastructure. Similar to Elastic Beanstalk, AppRunner is an accelerator that gets your web application deployed in the most efficient way possible.
One of my concerns has been whether Amazon is actually committed to enhancing and extending this promising service. Searching the 2023 re:Invent announcements, I was disappointed to find no news about new features for AppRunner. However, it was encouraging to see that they did include a typical promotional seminar on AppRunner.
The video is definitely worth watching, but the “case for AppRunner” is a bit tedious. They seem to be trying to equate AppRunner with modernization and reduction of technical debt. If hiding the complexities of deploying a scalable web application to the cloud (specifically AWS) is modernization, then ok? I guess?
But let’s be honest here. It’s magic. You give them a container, they give you a scalable web application. I’m not sure that’s modernization or reducing technical debt. It sounds more like going “all in”. Which, by the way is totally cool with me. For my money, if you are going to leverage the cloud, then you damn well ought to leverage it. Don’t be shy. Take advantage of these services that reduce friction for developers and help you create value for your customers.
Since I referenced a re:Invent webinar I should mention that I’ve attended re:Invent 5 times. It was an amazing and fun experience. However, the last time I attended it was such a cluster f*ck that I decided it just wasn’t worth the effort and haven’t been back since. Their content is available on-line now (thank you AWS!), so I can pick and choose sessions instead of trying to figure out how to manage my limited time against the demand for specific seminars. If you do go, plan on standing in line (a lot).
The straw that broke this camel’s back was picking up some kind of virus at the Venetian on day 1. Oh, the humanity! They make you walk through the casino to get to where you need to go. I have no doubt that I picked up some nasty bug somewhere between a craps table and a blackjack table. This was way before COVID, but I wouldn’t even dream of going there without an N95 mask.
Unfortunately, I spent the first few days in bed missing most of the conference. I literally almost died. To this day I’m not sure how I got on a plane on Friday and made it home. After I nearly hit my head on the porcelain throne as I returned everything I happened to have eaten in Las Vegas to their water recycling plant, I passed out. When I woke up on the polished Venetian bathroom floor I decided that as cool as the swag was and how great it was to come home with more T-shirts than I would ever need, it just wasn’t worth the energy required to attend re:Invent. Speaking of cool…if you do happen to pass out in a Venetian bathroom, the marble floors are soothingly cool and you will get a good night’s rest.
Do not underestimate the amount of energy you need to attend re:Invent! Prepare yourself. To really experience re:Invent you need to wake at 6am, join the herd of people that parade to breakfast, plan your attack and move like a ninja through the venues. Seriously, start your day with their amazing breakfast.
I am partial to the Venetian, so that’s where I tried to stay by booking early. The Venetian can accommodate 15,000 people for breakfast and they do an amazing job. Gluten free? Yup. Veggie? Yup. You will not go hungry.
re:Invent now hosts over 50,000 attendees. The first year I went to re:Invent there were fewer than 10,000 in attendance. Honestly, it has become a complete mess. Buses take attendees between venues, but don’t count on getting to your seminar on time. And if you are late, tough luck. Your spot will be given to the stand-bys.
Enough about re:Invent…but if you do go, get yourself invited to some vendor event - they are awesome! And don’t forget re:Play!
In my last blog I mentioned a technical issue I had with AppRunner. Well, it turns out I’m not crazy. Their documentation was wrong (and lacking) and here’s the explanation I got from AWS support.
Hello Rob,
Thank you for your continued patience while working on this case with
me. I am reaching out to you with an update on the issue of
associating custom domain with the AppRunner service using AWS CLI. To
recap, I understand that you wanted to use AWS CLI to link custom
domain with AppRunner service, so that you could use www subdomain
with the custom domain. For that, as mentioned in the AppRunner
documentation at [1] we tried using the associate-custom-domain AWS
CLI command [2] and we noticed that the command was returning only the
status of the link and the CertificateValidationRecord objects were
not returned as a part of the output.
For this, I reached out the internal team since as per the
documentation, the CertificateValidationRecord objects should have
been returned. Upon working with the internal team, we realized that
we need to run describe-custom-domains AWS CLI command [3] together
with the associate-custom-domain AWS CLI command to get the
CertificateValidationRecord objects and then we need to add these
records manually to the custom domain in Route53 as a CNAME record
with the record name and value obtained from the
describe-custom-domains AWS CLI command. We have to perform the manual
actions even if the Route53 and AppRunner is in the same account when
working with AWS CLI. I am also providing the step by step details
below:
1. Run the associate-custom-domain AWS CLI command:
"aws apprunner associate-custom-domain --service-arn <AppRunner-Service-ARN> --domain-name <Custom-Domain> --enable-www-subdomain"
This will return the output as follows:
Output:
{
"DNSTarget": "xxxxxxxxxx.us-east-1.awsapprunner.com",
"ServiceArn": "AppRunner-Service-ARN",
"CustomDomain": {
"DomainName": "Custom-Domain",
"EnableWWWSubdomain": true,
"Status": "creating"
}
}
2. Now, run the describe-custom-domains AWS CLI command a few seconds after running the associate-custom-domain AWS CLI command:
"aws apprunner describe-custom-domains --service-arn <AppRunner-Service-ARN>"
This will return an output with CertificateValidationRecords objects as follows:
Output:
{
"DNSTarget": "xxxxxxxxxx.us-east-1.awsapprunner.com",
"ServiceArn": "AppRunner-Service-ARN",
"CustomDomains": [
{
"DomainName": "Custom-Domain",
"EnableWWWSubdomain": true,
"CertificateValidationRecords": [
{
"Name": "_5bf3e29fca6c29d29fc2b6e023bcaee3.apprunner-sample.com.",
"Type": "CNAME",
"Value": "_3563838161b023d78b951b036072e510.mhbtsbpdnt.acm-validations.aws.",
"Status": "PENDING_VALIDATION"
},
{
"Name": "_7f20ef08b12fbdddb670d0c7fb3c8076.www.apprunner-sample.com.",
"Type": "CNAME",
"Value": "_e1b6f670fac2f42ce30d160c2e3d92ea.mhbtsbpdnt.acm-validations.aws.",
"Status": "PENDING_VALIDATION"
},
{
"Name": "_14fc6b4f0d6b6a5524e7c3147eaec89d.2a57j78h5fsbzb7ey72hbx9c01pbxcf.apprunner-sample.com.",
"Type": "CNAME",
"Value": "_baecf356e1894de83dfca1b51cd8999f.mhbtsbpdnt.acm-validations.aws.",
"Status": "PENDING_VALIDATION"
}
],
"Status": "pending_certificate_dns_validation"
}
]
}
3. In Route 53, you need to manually add the records with the record type as CNAME and the corresponding names and values.
*I realize that the additional step of using describe-custom-domains
AWS CLI command is not mentioned in the documentation and for that I
have updated the internal team and the documentation team to get this
information added to the documentation.* Also, I tested the above steps
and I can confirm that the above workflow is validated and the domain
was associated successfully using AWS CLI.
Now, coming to the actual query of associating custom domain in a
different account from the AppRunner service, the internal team has
confirmed that currently the console custom domain functionality only
works if the Route 53 domain and App Runner service are in the same
account. The same is mentioned in Step 5 of the AppRunner
documentation at [1].
However, in case of AWS CLI, you need to perform the same steps as
above and you need to manually add the CertificateValidationRecords to
the account owning the Route 53 domain(s). You can view the
certificate validation record via the CLI using the
describe-custom-domain command as mentioned above.
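For the record, here is roughly what that manual Route 53 step looks like from the CLI. This is only a sketch using the first validation record from the example output above; the hosted zone ID is a placeholder you would replace with your own, and you would repeat the call for each CertificateValidationRecord returned by describe-custom-domains.
aws route53 change-resource-record-sets \
  --hosted-zone-id Z0123456789EXAMPLE \
  --change-batch '{
    "Changes": [
      {
        "Action": "UPSERT",
        "ResourceRecordSet": {
          "Name": "_5bf3e29fca6c29d29fc2b6e023bcaee3.apprunner-sample.com.",
          "Type": "CNAME",
          "TTL": 300,
          "ResourceRecords": [
            { "Value": "_3563838161b023d78b951b036072e510.mhbtsbpdnt.acm-validations.aws." }
          ]
        }
      }
    ]
  }'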
So, I’m happy to report that the issue was resolved which gives me more confidence that AppRunner has a future.
For my application, since AppRunner still does not support EFS or mounting external file systems, I will need to identify how I am using my EFS session directory and remove that dependency.
Looking at my application, I can see a path using S3. Using S3 as a session store will not be particularly difficult. S3 will not have the performance characteristics of EFS, but I’m not sure that matters. Deleting session objects becomes a bit more complex since we can’t just delete a “directory”.
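To make that concrete, here’s a rough sketch of what an S3-backed session store might look like using the Amazon::S3 module. The bucket name and the sessions/<session-id>/ key layout are hypothetical, not what the application actually uses.
use strict;
use warnings;

use Amazon::S3;

# credentials are assumed to come from the environment
my $s3 = Amazon::S3->new(
  { aws_access_key_id     => $ENV{AWS_ACCESS_KEY_ID},
    aws_secret_access_key => $ENV{AWS_SECRET_ACCESS_KEY},
  }
);

my $bucket     = $s3->bucket('my-session-bucket');  # hypothetical bucket name
my $session_id = 'deadbeef';                        # placeholder session id

# write the serialized session state
$bucket->add_key( "sessions/$session_id/state.json", '{"logged_in":1}' );

# read it back on a subsequent request
my $obj   = $bucket->get_key("sessions/$session_id/state.json");
my $state = $obj ? $obj->{value} : undef;

# "deleting the directory" really means listing the prefix and
# deleting every object under it
my $listing = $bucket->list( { prefix => "sessions/$session_id/" } );
$bucket->delete_key( $_->{key} ) for @{ $listing->{keys} };
That last stanza is the extra bookkeeping alluded to above: S3 has no real directories, so cleaning up a session means enumerating and deleting every key under its prefix.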
Another intriguing use for AppRunner is implementing services, either RESTful APIs or seldom-invoked back-end services.
APIs are definitely one of the target uses for this service as discussed in the re:Invent video. Triggering a task is also a use case I want to explore. Currently, I use a CloudWatch event to trigger a Lambda that invokes a Fargate task for doing things like a nightly backup. That dance seems like it can be replaced (somehow) by using AppRunner…hmmm…need to noodle this some more…
So far, I luv me some AppRunner.
In May of 2021, AWS released AppRunner to the public.
AWS App Runner is an AWS service that provides a fast, simple, and cost-effective way to deploy from source code or a container image directly to a scalable and secure web application in the AWS Cloud. You don’t need to learn new technologies, decide which compute service to use, or know how to provision and configure AWS resources.
App Runner connects directly to your code or image repository. It provides an automatic integration and delivery pipeline with fully managed operations, high performance, scalability, and security.
What makes AppRunner so compelling are these important features:
- deploy directly from source code or a container image
- automatic scaling, load balancing and a managed TLS certificate
- a built-in CI/CD pipeline that redeploys the application when you push changes
- pay only for the compute you use
Back in 2012, I started a SaaS application (Treasurer’s Briefcase) for providing recordkeeping services for small non-profit organizations like PTOs, PTAs and Booster clubs. Back then, I cobbled together the infrastructure using the console, then started to explore CloudFormation and eventually re-architected everything using Terraform.
The application is essentially based on a LAMP stack - well, sort of, since I use a different templating web framework rather than PHP. The stack consists of an EC2 that hosts the Apache server, an EC2 that hosts some backend services, an ALB, a MySQL RDS instance and a VPC. There are a few other AWS services used like S3, SQS and EFS, but essentially the stack is relatively simple. Even so, provisioning all of that infrastructure using Terraform alone and creating development, test, and production environments was a bit daunting but a great learning experience.
Starting with the original infrastructure, I reverse engineered it using terraforming and then expanded it using terraform.
The point being, it wasn’t necessarily easy to get it all right the first time. Keeping up with Terraform was also a challenge as it evolved over the years too. Moreover, maintaining infrastructure was just another task that provided no incremental value to the application. Time spent on that task took away from creating new features and enhancements that could provide more value to customers.
Enter AppRunner…with the promise of taking all of that work and chucking it out the window. Imagine creating a Docker container with your application and handing it to AWS and saying “host this for me, make it scalable, create and maintain an SSL certificate for me, create a CI/CD pipeline to redeploy the application when I make changes and make it cheap.” I’m in.
AppRunner has evolved over the years and has become much more mature. However, it still has some warts and pimples that might make you think twice about using it. Back in 2021 it was an interesting new service, an obvious evolutionary step from Fargate Tasks which provide some of the same features as AppRunner. Applications that utilized Fargate Tasks as the basis for running their containerized web applications still had to provision a VPC, load balancers, and manage scaling on their own. AppRunner bundles all of those capabilities and creates a compelling argument for moving Fargate based apps to AppRunner.
Prior to October 2022, AppRunner did not support the ability to access resources from within a VPC. That made it impossible, for example, to use a non-publicly accessible RDS instance. With that addition in October of 2022, it became possible to have a web application that could access the RDS instance in your VPC.
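For anyone who missed that release: outbound VPC access is configured through a VPC connector. A sketch of the CLI calls looks something like this (the connector name, subnet and security group IDs are placeholders):
aws apprunner create-vpc-connector \
  --vpc-connector-name my-app-vpc-connector \
  --subnets subnet-0abc1234 subnet-0def5678 \
  --security-groups sg-0123456789abcdef0

# then point the service's egress at the connector
aws apprunner update-service \
  --service-arn <AppRunner-Service-ARN> \
  --network-configuration 'EgressConfiguration={EgressType=VPC,VpcConnectorArn=<VPC-Connector-ARN>}'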
The fall of 2023 has seen several changes that make AppRunner even more compelling:
Change | Description | Date |
---|---|---|
Release: App Runner adds supports for AWS Europe (Paris), AWS Europe (London), and AWS Asia Pacific (Mumbai) Regions | AWS App Runner now supports AWS Europe (Paris), AWS Europe (London), and AWS Asia Pacific (Mumbai) Regions. | November 8, 2023 |
Release: App Runner adds dual stack support for incoming network traffic | AWS App Runner now adds dual stack support for incoming traffic through public endpoints. | November 2, 2023 |
Release: App Runner automates Route 53 domain configuration for your services | AWS App Runner automates Route 53 domain configuration for your App Runner service web applications. | October 4, 2023 |
Release: App Runner adds support for monorepo source-code based services | AWS App Runner now supports the deployment and maintenance for monorepo source-code based services. | September 26, 2023 |
Release: App Runner adds more features to auto scaling configuration management | AWS App Runner enhances auto scaling configuration management features. | September 22, 2023 |
Some of the limitations of AppRunner currently include:
- no support for EFS or other mounted file systems
- custom domain association that only sort of works, especially from the CLI
The first limitation is a bit of a show-stopper for more than a few web applications that might rely on mounted file systems to access assets or provide a stateful storage environment. For my application I use EFS to create session directories for logged-in users. Using EFS I can be assured that each EC2 in my web farm accesses the user’s session regardless of which EC2 serves the request. Without EFS, I will be forced to re-think how to create a stateful storage environment for my web app. I could use S3 as storage (and probably should) but EFS provided a “quick-shot-on-goal” at the time.
The second limitation was just frustrating, as associating a custom domain sort of kinda works. If I associate a domain managed by AWS (in the same account as my AppRunner application) then I was able to get the TLD to resolve and work as expected. AppRunner was able to associate my application to the domain AND provide an SSL certificate. It will redirect any http request to https. Unfortunately, I could not associate the www sub-domain using the CLI as documented. In fact, I could not even get the CLI to work without trying to enable the www sub-domain. Working with AWS support confirmed my experience and I still have a ticket pending with support on this issue. I’m confident that will be resolved soon(?) so it should not limit my ability to use this service in the future.
AppRunner is an exciting new service that will make application development and deployment seamless, allowing developers to focus on the application, not the infrastructure.
You can find the AppRunner roadmap and current issues here.
Every development project ultimately has a goal of providing some kind of value to the organization that has decided to initiate a software development project.
The bottom line of any software development project is the bottom line. Does the cost of the project AND the maintenance of the project create a profit?
I know what you are thinking. Not all software applications are designed to produce profit. Untrue. Even applications we call “internal” create value or contribute to the creation of value.
Let’s talk about and characterize failure first, because it’s much easier to define (as anyone who has had the misfortune of working with a product development team that cannot define “done” knows). And I’ve been told that most software development projects fail.
The project is canceled.
This is the “first order broke” condition of projects. It took too long, it went over budget and looked to continue to be a money pit (someone understood the fallacy of sunk costs), the environment changed making the application moot or a new CEO decided to replace all internal applications with some SaaS, PaaS, or his own pet project.
The application was launched and did not meet the goals of the project.
This can mean a lot of things: the project does not solve enough of the business problems to justify the continued cost of maintenance. Or perhaps the application did not generate enough revenue to justify its existence because of poor market acceptance. People just hate using it.
The project is in use, people use it, but the ROI is too far in the future or perhaps indeterminate.
The project becomes a drag on the organization. No one wants to pull the plug because they have no alternative (or believe they don’t). There’s no appetite to rewrite, refactor or reimagine the application. It becomes a huge boat anchor that a handful of engineers keep running by kicking it in the ass whenever it stalls.
The project launches on time and under budget.
Keep in mind that this is (mostly) a necessary, but insufficient, condition for success. Yes, there are some successful projects that are over budget or late, but it’s sort of like starting Monopoly owing everyone money. You need to catch up and catch up fast.
The application completely solves the business problem.
Again, a necessary but insufficient condition for success. If the application is difficult to maintain and requires constant attention that costs more than it saves or produces, it’s not a success.
The application just works
…and is a critical component in a complex workflow - without it nothing else would work - its cost to develop and maintain is easily justified by the nature of its job. It successfully completes its mission every single day.
Oh yeah, Agile. I read articles about Agile and people’s experience with it all the time. I suspect most opinions are based on few data points and mostly from one person’s negative (or rarely positive) experience with Agile. My opinions (and that’s all they are…YMMV) are based on working with some fairly large clients that I am not at liberty to divulge. One FANG, one Fortune 50 company, one major manufacturer of phones and multiple companies with more than 5000 employees. I’m not opining based on one ride on the merry-go-round. I’m the kind of person that always believes that I just don’t get it, and I need to learn more, read more and accept more to overcome my ignorance and lack of experience. It’s a viewpoint that has allowed me to grow in my career and learn a lot of very useful things that have conspired to make me, if not wealthy, not concerned about money.
I am now having a lot of fun going back to my roots of being a software developer. While I have been on the management side of projects employing the Agile process, I am now in the belly of the beast. It smells bad, feels wrong and kills productivity. But, again, YMMV.
Product Owners - All “product owners” are not created equal. They have varying degrees of understanding of their own domain. Some even believe developers have ESP. To be fair, some expect developers (and rightly so) to “ask questions”. The problem is, what happens when the developer does not understand the domain? What questions should they ask? They are clueless.
Product owners should assume nothing (in my opinion) and determine the level of domain expertise developers have. It is their responsibility to make that assessment - if they don’t, they must be explicit with requirements; otherwise you’ll almost certainly end up with a project or feature that does not meet your needs.
So, here’s the bottom line. Any idea worth something greater than 0 that also has a wee bit of marketing behind it quickly becomes an opportunity for gypsies, tramps and thieves to exploit the ignorant masses. Take Christianity for example. Need I say more? Agile has become the Christianity of corporate America. No one dare mention that it doesn’t solve our problems or make us feel any better. Fuck Agile, the ceremonies, the training, the roles, the practice…it is the most unproductive environment one can devise for developing software. Look it up…Bill Gates wrote an entire BASIC interpreter and shoved it into 4K of ROM. He then worked on a 32K version that was essentially a complete OS. He didn’t need Agile to do that.
So, let’s be clear. Agile is social engineering. An attempt to organize human beings in order to create something that no one of them could do alone (or so it goes). Somehow I don’t think Agile works. Some will say, yeah, well not every project should use Agile. Yes, that’s true, but the sad fact is that corporate America is not nuanced. They are binary. They want single solutions to complex problems and do not want to hear…it depends. And so they consume the entire bottle of aspirin.
There will be a day when people look back at the unproductive waste and utter insanity that is “Agile”. They will marvel at the way that a single, possibly good idea for some things was transformed into a dogma that haunted software development for a decade.
I’m hopeful however that really smart companies know that instituting things like Agile is the bellwether of their demise. They will avoid trying to fit round pegs into square holes. They will embrace the idea that you can plan things properly, and that plans can change, without embracing a chaotic, highly disorganized process that masquerades as a structured protocol.
You have been warned. When some consultant you hire to justify the outsourcing of your development team says that they can replace your current processes with an Agile team from Elbonia and a scrum master from Bumblefuck…be afraid…be very afraid. There is no free lunch.
One final thought…why is software development so hard? And why do we struggle so to create applications?
It’s not a hard question actually. The goal of software development is to codify a solution to a problem. But first…and here is the reveal…you have to define the problem. That is, in and of itself the most difficult thing in the development process. Missed requirements are, in my experience, the biggest reason for “re-work”. Note I did not say “bugs” or “defects”. Most maintenance on systems is because of missed requirements, not because programmers make mistakes. Oh, for sure, they do. But really? Think. Look back at your tickets and do a root cause analysis.
There are other reasons software development is hard. First, people do not communicate well. They do not communicate precisely and they do not communicate accurately. Next, the tools to express the solutions to our problems are complex and incomplete. Better ingredients make better pizzas. Papa Johns!
Okay, I have to wrap this up…Agile sucks. I hate Agile. I want to mute myself when I’m in stand-ups just to say every day “Oh, I was on mute.” and torture everyone that thinks this ceremony is useful.
Oh, I’m having issues with my internet so I may have to drop soon…open the pod bay doors, HAL?
Flavio Poletti recently blogged about
using Barcode::ZBar
for reading QR codes. In a previous life I have
written applications for workflows that involved reading QR codes from
documents and have also used Barcode::ZBar
very effectively…highly
recommended.
Back then, IIRC, I had quite a struggle satisfying all of the dependencies required to get it to actually work. I spent a little time after reading his article to see if I could successfully build it once more, this time in a Docker container. The result is this project:
It was a little difficult piecing together the required packages, but in the end it works!
Next, I may create a webservice from the container and install it on a
Fargate cluster. The service will allow you to upload .png
files and
scan them for QR codes. Stay tuned!
map
The code below violates more than a few best practices making the intent of the code difficult to understand. Let’s get started…
User
Can you refactor this snippet of Perl code?
if (@multi_elems) {
my $delim_qr1 = qr/\d+$group_delim1/o;
#my $delim_qr2 = $group_delim2;
foreach my $entry ( @log_data{@EIDs} ) {
@{$entry}{@multi_elems}
= map {
my $x = $_;
$x && ( $x =~ s/$delim_qr1//go )
? [
# XXX: if $_ =~ m/^($group_delim2)+$/
# the result is ()
# at least on perl 5.8.0
split /$group_delim2/, $x
]
: []
} @{$entry}{@multi_elems};
}
#print STDERR "$err\n" if $err;
}
ChatGPT
Here’s the refactored version of your provided Perl code. I’ve added comments to explain what the code is doing and made some minor improvements for readability:
# Check if there are any multi_elems
if (@multi_elems) {
# Define the regex delimiter for group_delim1
my $delim_qr1 = qr/\d+$group_delim1/o;
# Iterate through each entry in the log_data hash
foreach my $entry (@log_data{@EIDs}) {
# Update each multi_elem in the current entry
@{$entry}{@multi_elems} = map {
my $x = $_;
# Check if the value matches the delimiter regex
$x && ($x =~ s/$delim_qr1//go) ?
[
# Split the value by group_delim2
split /$group_delim2/, $x
]
: [] # If the value doesn't match, return an empty array reference
} @{$entry}{@multi_elems};
}
}
The refactored code is more readable and has comments explaining the purpose of each block. Note that the code logic has not been changed, and it should function the same as the original snippet.
User
The use of the ternary operator obfuscates the meaning
ChatGPT
I understand your concern. Here’s the refactored version of the code
using an if
statement instead of the ternary operator to improve
readability:
# Check if there are any multi_elems
if (@multi_elems) {
# Define the regex delimiter for group_delim1
my $delim_qr1 = qr/\d+$group_delim1/o;
# Iterate through each entry in the log_data hash
foreach my $entry (@log_data{@EIDs}) {
# Update each multi_elem in the current entry
@{$entry}{@multi_elems} = map {
my $x = $_;
my $result;
# Check if the value matches the delimiter regex
if ($x && ($x =~ s/$delim_qr1//go)) {
# Split the value by group_delim2
$result = [split /$group_delim2/, $x];
} else {
# If the value doesn't match, return an empty array reference
$result = [];
}
$result;
} @{$entry}{@multi_elems};
}
}
In this version, I replaced the ternary operator with an if-else
statement, which assigns the result to the $result
variable. This
should make the code more understandable and easier to follow.
User
Excellent but I don’t like cuddled elses
ChatGPT
Sure, I’ve updated the code to uncuddle the else
statement:
# Check if there are any multi_elems
if (@multi_elems) {
# Define the regex delimiter for group_delim1
my $delim_qr1 = qr/\d+$group_delim1/o;
# Iterate through each entry in the log_data hash
foreach my $entry (@log_data{@EIDs}) {
# Update each multi_elem in the current entry
@{$entry}{@multi_elems} = map {
my $x = $_;
my $result;
# Check if the value matches the delimiter regex
if ($x && ($x =~ s/$delim_qr1//go)) {
# Split the value by group_delim2
$result = [split /$group_delim2/, $x];
}
else {
# If the value doesn't match, return an empty array reference
$result = [];
}
$result;
} @{$entry}{@multi_elems};
}
}
Now, the else
statement is on a new line, making the code more
aligned with your preferred style.
User
We are using Perl 5.16. Are there any changes you might suggest?
ChatGPT
Perl 5.16 doesn’t introduce any significant changes that would impact the provided code snippet. However, I can suggest a few general improvements for better readability and maintainability:
1. Add the use strict; and use warnings; pragmas.
2. Use a foreach loop with a lexical iterator variable to avoid potential conflicts with global variables:
for my $entry (@log_data{@EIDs}) {
    # ...
}
With these suggestions, the updated code looks like this:
use strict;
use warnings;
# Check if there are any multi_elems
if (@multi_elems) {
# Define the regex delimiter for group_delim1
my $delim_qr1 = qr/\d+$group_delim1/o;
# Iterate through each entry in the log_data hash
for my $entry (@log_data{@EIDs}) {
# Update each multi_elem in the current entry
@{$entry}{@multi_elems} = map {
my $x = $_;
my $result;
# Check if the value matches the delimiter regex
if ($x && ($x =~ s/$delim_qr1//go)) {
# Split the value by group_delim2
$result = [split /$group_delim2/, $x];
}
else {
# If the value doesn't match, return an empty array reference
$result = [];
}
$result;
} @{$entry}{@multi_elems};
}
}
These changes should enhance the code quality and maintainability while keeping the same functionality.
User
My understanding is that the ‘o’ regex modifier may no longer be required.
ChatGPT
You’re correct. The /o
modifier for regular expressions is no longer
needed in most cases, as Perl optimizes the regex automatically if the
pattern doesn’t change during the runtime of the program. The /o
modifier was useful in older versions of Perl (prior to 5.6) when the
regex engine didn’t optimize regexes in the same way. In modern Perl,
the engine is smart enough to recognize when a regex is constant and
cache the compiled version automatically. You can remove the /o
modifier from the regex in your code snippet:
use strict;
use warnings;
# Check if there are any multi_elems
if (@multi_elems) {
# Define the regex delimiter for group_delim1
my $delim_qr1 = qr/\d+$group_delim1/;
# Iterate through each entry in the log_data hash
for my $entry (@log_data{@EIDs}) {
# Update each multi_elem in the current entry
@{$entry}{@multi_elems} = map {
my $x = $_;
my $result;
# Check if the value matches the delimiter regex
if ($x && ($x =~ s/$delim_qr1//g)) {
# Split the value by group_delim2
$result = [split /$group_delim2/, $x];
}
else {
# If the value doesn't match, return an empty array reference
$result = [];
}
$result;
} @{$entry}{@multi_elems};
}
}
The performance and behavior of the code should remain the same after
removing the /o
modifier.
User
Perl best practices indicates that a map function should only contain one statement. Using the map function above seems to reduce readability since the object being iterated over with some complex logic is at the end of the expression.
ChatGPT
I agree with you. Using a map
function with complex logic can reduce
readability. In this case, we can refactor the code to use a for
loop instead of the map
function to improve readability:
use strict;
use warnings;
# Check if there are any multi_elems
if (@multi_elems) {
# Define the regex delimiter for group_delim1
my $delim_qr1 = qr/\d+$group_delim1/;
# Iterate through each entry in the log_data hash
for my $entry (@log_data{@EIDs}) {
# Iterate through each multi_elem in the current entry
for my $multi_elem (@multi_elems) {
my $x = $entry->{$multi_elem};
my $result;
# Check if the value matches the delimiter regex
if ($x && ($x =~ s/$delim_qr1//g)) {
# Split the value by group_delim2
$result = [split /$group_delim2/, $x];
}
else {
# If the value doesn't match, return an empty array reference
$result = [];
}
# Update the multi_elem in the current entry
$entry->{$multi_elem} = $result;
}
}
}
By using a for
loop, the code becomes more readable, and it’s easier
to understand the logic being applied to the elements.
User
I think the else statement in the inner for loop can be removed and we
can assign result to an empty array reference when we define $result
ChatGPT
You’re right. We can simplify the code further by initializing
$result
to an empty array reference and updating it only when the
condition is met. Here’s the updated code:
use strict;
use warnings;
# Check if there are any multi_elems
if (@multi_elems) {
# Define the regex delimiter for group_delim1
my $delim_qr1 = qr/\d+$group_delim1/;
# Iterate through each entry in the log_data hash
for my $entry (@log_data{@EIDs}) {
# Iterate through each multi_elem in the current entry
for my $multi_elem (@multi_elems) {
my $x = $entry->{$multi_elem};
# Initialize the result to an empty array reference
my $result = [];
# Check if the value matches the delimiter regex
if ($x && ($x =~ s/$delim_qr1//g)) {
# Split the value by group_delim2
$result = [split /$group_delim2/, $x];
}
# Update the multi_elem in the current entry
$entry->{$multi_elem} = $result;
}
}
}
By initializing $result
to an empty array reference, we can remove
the else
statement, making the code more concise and easier to read.
User
The $result
variable also now seems unnecessary and we can
further refactor this by directly assigning $entry->{$multi_elem}
to
an empty array and using the next statement to short circuit the if
statement
ChatGPT
You’re right. We can further simplify the code by removing the
$result
variable and directly assigning the value to
$entry->{$multi_elem}
. Here’s the refactored code:
use strict;
use warnings;
# Check if there are any multi_elems
if (@multi_elems) {
# Define the regex delimiter for group_delim1
my $delim_qr1 = qr/\d+$group_delim1/;
# Iterate through each entry in the log_data hash
for my $entry (@log_data{@EIDs}) {
# Iterate through each multi_elem in the current entry
for my $multi_elem (@multi_elems) {
my $x = $entry->{$multi_elem};
# Initialize the multi_elem in the current entry to an empty array reference
$entry->{$multi_elem} = [];
# Check if the value matches the delimiter regex, and short circuit if not
next unless ($x && ($x =~ s/$delim_qr1//g));
# Update the multi_elem in the current entry by splitting the value by group_delim2
$entry->{$multi_elem} = [split /$group_delim2/, $x];
}
}
}
By directly assigning an empty array reference to
$entry->{$multi_elem}
and using the next
statement to short
circuit the loop if the condition is not met, the code becomes even
more concise and easier to read.
User
perl best practices suggests avoiding unless
ChatGPT
You’re right. Using unless
can sometimes make the code less
readable. Instead, we can use an if
statement with a negated
condition. Here’s the updated code:
use strict;
use warnings;
# Check if there are any multi_elems
if (@multi_elems) {
# Define the regex delimiter for group_delim1
my $delim_qr1 = qr/\d+$group_delim1/;
# Iterate through each entry in the log_data hash
for my $entry (@log_data{@EIDs}) {
# Iterate through each multi_elem in the current entry
for my $multi_elem (@multi_elems) {
my $x = $entry->{$multi_elem};
# Initialize the multi_elem in the current entry to an empty array reference
$entry->{$multi_elem} = [];
# Check if the value matches the delimiter regex, and short circuit if not
if (!($x && ($x =~ s/$delim_qr1//g))) {
next;
}
# Update the multi_elem in the current entry by splitting the value by group_delim2
$entry->{$multi_elem} = [split /$group_delim2/, $x];
}
}
}
By using an if
statement with a negated condition instead of
unless
, the code conforms better to Perl best practices and is
easier to understand.
User
How do you think this code would compare to the original in terms of performance under perl 5.16
ChatGPT
The refactored code should have similar performance to the original
code under Perl 5.16. The changes made in the refactored version
mostly deal with readability and best practices, but the overall logic
and flow of the code remain the same. Some improvements in the
refactored code, such as removing the unnecessary else
statement and
the $result
variable, can slightly reduce memory usage and improve
the code’s readability without affecting the performance. The direct
assignment of an empty array reference and the use of the next
statement can make the code more efficient in some cases by avoiding
unnecessary assignments. However, it’s important to note that the
actual performance difference might not be significant, and it would
depend on the specific use case, data being processed, and the overall
complexity of the program. In general, focusing on readability and
maintainability is a good approach, as it will make it easier for you
and others to work with and optimize the code in the future.
Yikes! What the heck does this code do?
User
Please refactor this Perl snippet for readability
@{$_}{ @elem_ids_ref, @elem_ids_noref }
= ( ( map { [] } @elem_ids_ref ), @defaults_noref )
for @log_data{@EIDs};
ChatGPT
Here’s the refactored version of the provided Perl snippet with improved readability:
for my $entry (@log_data{@EIDs}) {
# Initialize array references for elements in @elem_ids_ref
@{$entry}{@elem_ids_ref} = map { [] } @elem_ids_ref;
# Assign the default values for elements in @elem_ids_noref
@{$entry}{@elem_ids_noref} = @defaults_noref;
}
In this refactored version, I’ve replaced the single line of code with
a more readable for
loop that iterates over each entry in
@log_data{@EIDs}
. Within the loop, I’ve separated the assignment of
array references for elements in @elem_ids_ref
and the assignment of
default values for elements in @elem_ids_noref
. This should make the
code easier to understand and maintain.
In this session I have an inkling of what I want to do with this code: remove the next statements and make it more of a condition-based dispatcher.
User
The code below uses the next statement 4 times and uses unless which is not considered to be a best practice. How could we avoid using next in this fashion?
foreach my $elem ( $self->elements() ) {
next unless $elem->table();
next
if ( $elem->{type} eq 'File' || $elem->{type} eq 'Signature' )
&& !$elem->copy_file;
next unless $elem->$allow_hybrid_method;
next unless exists $data->{ $elem->{uniq} };
$elem->{value} = $data->{ $elem->{uniq} };
}
ChatGPT
Here’s a refactored version of the provided Perl code snippet that
avoids using next
and unless
:
foreach my $elem ($self->elements()) {
if ($elem->table()
&& !($elem->{type} eq 'File' || $elem->{type} eq 'Signature') || $elem->copy_file
&& $elem->$allow_hybrid_method
&& exists $data->{$elem->{uniq}}
) {
$elem->{value} = $data->{$elem->{uniq}};
}
}
In this version, I combined all the conditions into a single if
statement, which makes the code more straightforward and easier to
read.
User
That does seem more readable - each of the conditions is essentially a rule that results in assigning a value to a hash member - is there another way to express this functionality?
ChatGPT
Yes, you can create an array of coderefs representing each condition, and then iterate through them to check whether all conditions are met. This approach makes the code more modular and easier to extend or modify in the future:
my @conditions = (
sub { $_[0]->table() },
sub { $_[0]->{type} ne 'File' && $_[0]->{type} ne 'Signature' || $_[0]->copy_file },
sub { $_[0]->$allow_hybrid_method },
sub { exists $_[0]->{uniq} && exists $data->{ $_[0]->{uniq} } },
);
foreach my $elem ($self->elements()) {
my $all_conditions_met = 1;
for my $condition (@conditions) {
unless ($condition->($elem)) {
$all_conditions_met = 0;
last;
}
}
if ($all_conditions_met) {
$elem->{value} = $data->{$elem->{uniq}};
}
}
In this version, we define an array of coderefs (@conditions) that represent each condition. Then, during the iteration over elements, we check whether all conditions are met by iterating through the @conditions array. If all conditions are met, we assign the value to the hash member. This approach makes it easy to add, remove, or modify conditions in the future.
User
Here we go with unless again
ChatGPT
Apologies for that oversight. Let’s rewrite the code snippet without
using unless
:
my @conditions = (
sub { $_[0]->table() },
sub { $_[0]->{type} ne 'File' && $_[0]->{type} ne 'Signature' || $_[0]->copy_file },
sub { $_[0]->$allow_hybrid_method },
sub { exists $_[0]->{uniq} && exists $data->{ $_[0]->{uniq} } },
);
foreach my $elem ($self->elements()) {
my $all_conditions_met = 1;
for my $condition (@conditions) {
if (!$condition->($elem)) {
$all_conditions_met = 0;
last;
}
}
if ($all_conditions_met) {
$elem->{value} = $data->{$elem->{uniq}};
}
}
Now, the unless
keyword has been replaced with a negated if
statement, which makes the code more in line with the best practices.
User
The foreach loop looks very generic and could be a subroutine of its own, with an argument of the conditions
ChatGPT
You’re right, the loop can be extracted into a separate
subroutine. Here’s a refactored version of the code, where the loop is
wrapped in a subroutine called process_elements
:
sub process_elements {
my ($elements, $data, $conditions) = @_;
ELEMENT:
foreach my $elem (@$elements) {
for my $condition (@$conditions) {
next ELEMENT unless $condition->($elem);
}
$elem->{value} = $data->{$elem->{uniq}};
}
}
my @conditions = (
sub { $_[0]->table() },
sub { $_[0]->{type} ne 'File' && $_[0]->{type} ne 'Signature' || $_[0]->copy_file },
sub { $_[0]->$allow_hybrid_method },
sub { exists $_[0]->{uniq} && exists $data->{ $_[0]->{uniq} } },
);
process_elements([$self->elements()], $data, \@conditions);
The process_elements
subroutine takes a list of elements, data, and
an array reference of conditions as its arguments. It iterates through
the elements, and for each element, it checks if all conditions are
met. If all conditions are satisfied, it assigns a value to the
element’s hash member.
User
The calling sequence of process_elements()
is specific to this use
case. How can we make it more generic?
ChatGPT
To make the process_elements
subroutine more generic, you can pass a
code reference for the action to be performed when all conditions are
met. This way, the subroutine can be used in various scenarios, not
just for the specific use case. Here’s the modified version:
sub process_elements {
my ($elements, $conditions, $action) = @_;
ELEMENT:
foreach my $elem (@$elements) {
for my $condition (@$conditions) {
next ELEMENT unless $condition->($elem);
}
$action->($elem);
}
}
my @conditions = (
sub { $_[0]->table() },
sub { $_[0]->{type} ne 'File' && $_[0]->{type} ne 'Signature' || $_[0]->copy_file },
sub { $_[0]->$allow_hybrid_method },
sub { exists $_[0]->{uniq} && exists $data->{ $_[0]->{uniq} } },
);
my $action = sub {
my ($elem) = @_;
$elem->{value} = $data->{$elem->{uniq}};
};
process_elements([$self->elements()], \@conditions, $action);
Now the process_elements
subroutine takes an additional argument,
$action
, which is a code reference for the action to be performed
when all conditions are met. This makes the subroutine more generic
and reusable in different situations.
User
Let’s refactor process_elements()
to receive a hash of arguments.
ChatGPT
To refactor the process_elements
subroutine to receive a hash of
arguments, you can make the following changes:
sub process_elements {
my %args = @_;
my $elements = $args{elements};
my $conditions = $args{conditions};
my $action = $args{action};
ELEMENT:
foreach my $elem (@$elements) {
for my $condition (@$conditions) {
next ELEMENT unless $condition->($elem);
}
$action->($elem);
}
}
my @conditions = (
sub { $_[0]->table() },
sub { $_[0]->{type} ne 'File' && $_[0]->{type} ne 'Signature' || $_[0]->copy_file },
sub { $_[0]->$allow_hybrid_method },
sub { exists $_[0]->{uniq} && exists $data->{ $_[0]->{uniq} } },
);
my $action = sub {
my ($elem) = @_;
$elem->{value} = $data->{$elem->{uniq}};
};
process_elements(
elements => [$self->elements()],
conditions => \@conditions,
action => $action
);
Now the process_elements
subroutine accepts a hash of arguments
instead of a list, which makes it easier to understand the purpose of
each argument when calling the subroutine.
User
Is there another reason this might be a better subroutine?
ChatGPT
Yes, there are several reasons why refactoring the process_elements
subroutine this way might be better:
Overall, the refactored version of the process_elements subroutine is more modular, flexible, and easier to understand, making it a better design choice.
You may recognize the phrase in the title as Damian Conway’s admonition in Perl Best Practices regarding magic values in your programs. It’s good advice to use names for magic numbers and values to make your code more readable for you and those that have to maintain your code.
But what about the don’t use constant
part? His argument for not
using the pragma essentially boils down to the fact that use
constant
creates package scoped constant subroutines at compile time
vs the Readonly
module which executes at runtime. So we are told to
use Readonly
…fair enough…but Perl::Critic
tells us to use
ReadonlyX
because Readonly
is slow and buggy.
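For anyone who hasn’t read that chapter, here’s the difference in a nutshell. The constant name below is just an example, and the Readonly syntax shown is the same under ReadonlyX:
use strict;
use warnings;

# compile time: installs a constant subroutine in the current package;
# it won't interpolate inside double-quoted strings
use constant SECONDS_PER_DAY => 86_400;

# run time: creates a read-only scalar that interpolates like any other variable
use Readonly;
Readonly my $SECONDS_PER_DAY => 86_400;

print 'constant: ', SECONDS_PER_DAY, "\n";
print "Readonly: $SECONDS_PER_DAY\n";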
Here’s the problem though: ReadonlyX
is not exactly the same as
Readonly
and many CPAN modules use Readonly
. Case in point:
File::BOM
.
Let’s start with a little script that uses both ReadonlyX
and
File::BOM
;
use strict;
use warnings;
use ReadonlyX;
use File::BOM;
1;
…let’s compile this and see what happens…
perl -wc foo.pl
Subroutine croak redefined at /home/rclauer/lib/perl5/Readonly.pm line 13.
Prototype mismatch: sub Readonly::Scalar ($;$) vs ($$) at /home/rclauer/lib/perl5/Readonly.pm line 257.
Prototype mismatch: sub Readonly::Scalar ($;$) vs ($$) at /home/rclauer/lib/perl5/Readonly.pm line 343.
Subroutine Scalar redefined at /home/rclauer/lib/perl5/Readonly.pm line 316.
Subroutine Array redefined at /home/rclauer/lib/perl5/Readonly.pm line 346.
Subroutine Hash redefined at /home/rclauer/lib/perl5/Readonly.pm line 364.
Subroutine Clone redefined at /home/rclauer/lib/perl5/Readonly.pm line 387.
Prototype mismatch: sub Readonly::Readonly (\[%@$]$) vs (\[$@%]@) at (eval 7) line 42.
Subroutine Readonly redefined at (eval 7) line 1.
foo.pl syntax OK
Hmmm…that looks bad, but I hear that Readonly and ReadonlyX should be “drop in replacements” for each other? Well then perhaps I can use ReadonlyX by telling File::BOM that Readonly has already been loaded…a technique I have used before when these two modules were conflicting.
use strict;
use warnings;
BEGIN {
use Module::Loaded;
use ReadonlyX;
mark_as_loaded('Readonly');
}
use File::BOM;
1;
…let’s compile this and see if things get better…
perl -wc foo.pl
Useless use of reverse in void context at /home/rclauer/lib/perl5/File/BOM.pm line 204.
Odd number of elements in hash assignment at /home/rclauer/lib/perl5/File/BOM.pm line 187.
Compilation failed in require at foo.pl line 14.
BEGIN failed--compilation aborted at foo.pl line 14.
Nope. It appears that File::BOM
makes use of features or uses
Readonly
in a way that is not compatible with ReadonlyX
. Luckily
my use of ReadonlyX
is compatible with Readonly
, so I can reverse
the technique and use Readonly
and mark ReadonlyX
loaded so that
other modules that use ReadonlyX
will use Readonly
.
use strict;
use warnings;
BEGIN {
use Module::Loaded;
use Readonly;
mark_as_loaded('ReadonlyX');
}
use File::BOM;
1;
Of course, this only works if the downstream modules that use
ReadonlyX
use it in a way that is compatible with Readonly
…and
we’ve just introduced the slow and buggy Readonly
to our stack
which we were trying to avoid by using ReadonlyX
in the first place.
But at least we got our legacy code to work. ;-)
make
Anyone that has seen any of my projects on
GitHub knows that I love me some
Makefile
. I’ve been using make
since the days I started
programming and I doubt I’m going to give it up any time soon.
I know the kids today are haters, but make
, in all its wonky glory is a
true wonder and the
Leatherman
of GNU tools. I continually find new tricks that extend my love and
adoration for this ubiquitous tool.
My latest find is what I’ll term scriptlets. Many times
I’ll find myself creating a Makefile
that requires some
non-trivial manipulation or transformation of data. In those cases
I’ll usually try a Perl one-liner and pipe some data through a Perl
command.
Let’s suppose file foo.txt depends on bar.txt and we need to replace all instances of xyz with abc (yes, sed may be a better choice, but let’s go with this for now…). Your make recipe might look like this:
foo.txt: bar.txt
perl -npe 's/xyz/abc/g;' $< > $@
But what happens when things get a little more complicated and the
Perl one-liner is insufficient for the task? Well, maybe it’s time to
write a little Perl script. Of course, if I do that, I’ll then need to include
that script in my project, make it executable and reference it from my
Makefile
. Hmmm…is there a better way than cluttering the project
with scripts? Enter scriptlets…
Here’s a short example that removes all members of a hash where the
keys do not represent a file with an extension of .pm
.
define create_json =
use JSON::PP;
my $file_list = JSON::PP->new->decode(<>);
my @pm_files = grep { /[.]pm$/ } keys %{$file_list};
my %new_file_list = map { $_ => $file_list->{$_} } @pm_files;
print JSON::PP->new->pretty->encode(\%new_file_list);
endef
export s_create_json = $(value create_json)
foo.json: bar.json
perl -0 -e "$$s_create_json" $< > $@
…and then:
make
You can read more about the GNU make features used above (define/endef, the value function and export) in the GNU make manual.
Hope you find make
as useful as I do…drop me a line if you have a
favorite make
trick or tip.
Thanks for reading.
My new favorite Perl module is a very unassuming chap that makes your help and usage notes for your Perl scripts just a little bit sexier…
When you have a fairly complicated script that requires lots of help and usage information, it will probably scroll past one page of the display when you dump it to STDOUT.
Sure, your users could pipe the output through less
or more
(…and we all know that less
is more right?) but why make them do
that?
Enter IO::Pager
…a clever
little module that’s quite easy to use. The pattern I’m using is to
use it, if it’s available…
my $token = eval {
  require IO::Pager;
  IO::Pager::open( *STDOUT, '|-:utf8', 'Unbuffered' );
};

print {*STDOUT} <<'END_OF_USAGE';
usage: $0 options command

Script to do blah blah blah...

Options
-------
...
END_OF_USAGE
Brian d foy introduced the Perl community to the idea of a script and module in the same file (the so-called modulino concept). The idea is that you can create a script that provides a command line interface to your Perl module. Essentially, the implementation looks something like this:
package Foo;
caller or __PACKAGE__->main();
# package guts
...
sub main {
my $foo = Foo->new;
use Getopt::Long;
# do something interesting here...
exit 0;
}
1;
This model comes in quite handy when writing scripts or just exercising your module during its creation.
I sometimes find it useful to then create a bash
script that invokes
my modulino. Since I might want to point to a development version
of the script, I might set my PERL5LIB
path to my local development
environment. Running the modulino script, I would then expect it to
use the version in my path. Hence the following modulino script.
#!/usr/bin/env bash
# -*- mode: sh; -*-
MODULINO="Amazon::Credentials"
MODULINO_PATH="${MODULINO//::/\/}.pm"
MODULINO_RUN=$(perl -M$MODULINO -e 'print $INC{"'$MODULINO_PATH'"}';)
test -n "$DEBUG" && echo $MODULINO_RUN
if test -z "$MODULINO_RUN"; then
echo "$MODULINO is not installed"
exit 1;
fi
perl $MODULINO_RUN "$@"
…and then
$ amazon-credentials.sh -h
amazon-credentials.sh options

Formats credentials found in env, config, SSO, role

Options
-------
--help, -h     this
--ec2,         get credentials from server IAM profile
--env          get credentials from environment variables
--profile, -p  get credentials from profile in credentials configuration
--role         get credentials from SSO role
--account      use with --role, specify AWS account id

$ amazon-credentials.sh --profile=test
export AWS_ACCESS_KEY_ID=AKI*****************
export AWS_SECRET_ACCESS_KEY=****************************************

$ aws sso login
$ amazon-credentials.sh --role my-sso-role --account 01234567890
More information about modulinos can be found in brian d foy’s writings on the subject.
Although I’ve been programming in Perl for over 25 years, it wasn’t until recently that I’ve had a boost in productivity as great as the one I experienced over the last year. What happened?
Stepping out from a more management oriented role at my former
employer, they needed a technical leader for a legacy Perl web
application migration to the cloud. Somehow I let on that I had some
Perl and AWS experience and so I was tabbed as the technical lead.
The project involved some heavy lifting of years of crufty Perl code
from an on-prem, dedicated Apache server environment to a
containerized AWS environment. Aside from the challenges of trying to
get legacy Perl code running in a Docker container the team had to
mitigate findings from the company’s security scans. This included
perlcritic
findings as well as findings based on other tools that
look for potential security issues.
So, the first hurdle was whittling down the perlcritic
findings and
making sure we weren’t introducing new findings.
perlcritic to the Build
I’m a big fan of make and autotools so naturally our build was based on GNU autotools. This allowed for a repeatable, standards based, reliable and extensible build system that worked locally using docker-compose and worked in our CI/CD pipeline to deploy Docker images for AWS Fargate.
To make sure that we maintained a baseline of low severity findings
from perlcritic
, I added a step in our Makefile
that ran
perlcritic
and errored out if any severity level exceeded 2. This
prevented any code from going into the repository that would trigger
security concerns since all pushes to the repository were scanned.
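The actual build is more involved than this, but the gate itself boils down to a Makefile target along these lines (the file list variable is hypothetical). perlcritic exits non-zero when it finds violations at or above the requested severity, which is enough to fail the build; remember that recipe lines need a leading tab.
PERL_SOURCES = $(wildcard lib/*.pm) $(wildcard bin/*.pl)

.PHONY: critic
critic:
	perlcritic --severity 3 $(PERL_SOURCES)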
My editor of choice has always been Emacs…let the editor wars begin!
I had already added perltidy
as an extension to Emacs so that
perltidy
would be run before any save. Our team standardized on a
set of perltidy
settings and added the .perltidyrc
file to the
project along with the .perlcriticrc
that configures perlcritic
reporting.
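Our actual settings aren’t the point here, but for anyone who hasn’t used it, a .perltidyrc is just a file of perltidy options with # comments. Something like this illustrative example:
# .perltidyrc (illustrative values only)
# maximum line length
-l=78
# indentation level
-i=2
# continuation indentation
-ci=2
# uncuddled elses
-nce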
Most editors today have language-specific syntax highlighting and syntax checking built-in. Flycheck is an Emacs plugin for syntax checking. Flycheck extensions are available for almost any language and even for things like markdown, JSON and SQL.
Syntax checking as you work in your scripts is
another way to move the process of validating your code further
upstream, but what really supercharged my development efforts was
adding perlcritic
checking to Emacs. The combination of Flycheck’s
support for perltidy
, perlcritic
and Perl syntax checking has
helped me develop faster with fewer errors.
Perl Best Practices isn’t just code for “standardization”, although having some standards makes maintenance a heckuva lot easier. No, PBP can highlight potential issues in your code and prevent you from having to debug gnarly problems.
Unfortunately, it’s also a nag.
Visiting an old legacy file that someone wrote back in 2012 (before you and they were woke to PBP) is an eye opening experience. When Flycheck gives up after 1527 issues found you start to question whether you really want to edit that code!
Even code that has relatively few findings presents them with electric highlighting that eventually rattles your nerves. It also becomes a bit of an obsession to clean up old code and get down to 0 findings!
In the end though, the result is better code. Other programmers can read and understand it and the quality will get incrementally better.
Generally speaking then, to supercharge your development efforts move code quality enforcement as far upstream as possible, starting with your editor.