One of the most influential books I have read in my career is The Senior Software Engineer by David Bryant Copeland. In the book, the author discusses many behavioral traits that are required to be a successful senior engineer. Among them is the idea that you shouldn't criticize code without knowing the context in which it was written.
The gist is that it could have been a tired engineer, someone going through personal trauma, administrative pressure, budgetary or time constraints, or a number of other reasons. It's much more involved than dismissing code or even an engineer as hacky or no good.
It's natural to have opinions. However, as I matured, I came to follow the book's advice about not jumping to conclusions about the person who wrote the code in question. If I comment on code, it's from a quality and constructive standpoint. Often, performance is involved in one way or another.
Shifting behaviors
Becoming an engineering manager was the biggest catalyst for my growth in this area. I was tasked with managing a team of three individual contributors while still contributing code myself. For the first time as an engineer, I had people reporting to me, and I was accountable for supporting the technology behind the business unit.
For context, this was for a business unit that produced conferences, expositions, and trade shows. The largest show was a $16M event in South Africa, but there were other multi-million dollar events domestically. The nature of this domain is cyclical. Site traffic spikes in the week or so leading up to the event and peaks during the event. Once the event is over, preparation begins for the next year's event. After the event, the speaker bios, event videos, sponsors, and agenda still hold value for historical record.
Each year, branding for the event changed, and the host city or venue might change as well. The event date and agenda definitely changed. Patterns started to emerge, and the nature of the business started to inform engineering decisions.
At the outset, my team and I lacked context on how to turn the year-to-year turnover into an efficient machine. Admittedly, some of the things we were doing initially could have been better. As I learned more about the business, I produced checklists and documentation for how to handle the turnover from a content perspective. A repeatable process was born.
As time went on, the code got more flexible so it could be applied to a variety of circumstances and different types of events. I created an API of structured data which was consumed by other systems. The API was eventually used to create anything from mobile apps to a printed directory of exhibitors, with the CMS being the single source of truth.
While I was leading that unit, the number of events started to increase, and there was a heavy emphasis on replicating the success of the South African event in different markets.
If something produced by my team didn't fit the vision of efficient, highly reusable code, it was rewritten, and this is where the part covered in Copeland's book really started coming together. I had great engineers on my team, but if they didn't understand the context of everything the business was trying to achieve, the things we built would quickly need refactoring, which placed more burden on the team.
The lesson in all of this was the importance of context within my team.
Present day
Earlier this week, one of my developer colleagues asked me about a 3rd party script for administering user surveys that she had been asked to implement on our sites. The code objectively diverged from best practices: it used a synchronous script tag loaded from a 3rd party domain, which creates a single point of failure (SPOF) for page rendering. Within the synchronous script, there were many calls to document.write().
It would have been easy to dismiss the code and the product it embeds as awful and say that the code is not going on any of our web properties. It works against site performance, which correlates positively with our business objectives.
However, that would have also dismissed the objective of the person who asked for the code to be put on the site in the first place. As an experienced practitioner, I have learned that maybe the embed script is not to my liking, but it is entirely possible that the dashboard and insights from the user research tool are exactly what the stakeholder needs.
Instead of saying "this code sucks, and we can't support it," there are other avenues to explore that are better than shutting down the conversation.
Opening the dialogue
The 3rd party script would introduce risk and performance regressions. That part is an objective assessment of the situation. The survey isn't required for the initial render, so there is no reason it needs to be blocking from a UI perspective.
The first step was reaching out to the vendor to see if there was an alternative way to embed their product that didn't use a blocking script from a 3rd party domain or document.write.
Their response indicated that there is currently no other way to embed the code, so I offered to work with the vendor to improve their product.
This might be above and beyond, but I learned that we paid for the year in advance, and I would rather spend a bit of my time advising and advocating than have something on one of our sites that could drastically diminish the user experience. The state of this is still up in the air, but I hope they take me up on it.
If they say no
If the vendor can't work with me for whatever reason, we're not out of options. We could do some things on our side to make the script less detrimental to the user experience.
Self-host or proxy the script
One option is to proxy the script and self-host it. Our tech stack includes Cloudflare Workers and I have used workers in the past to proxy scripts under the right circumstances. For this use case, I would rather not do this because it would give a vendor access to 1st party cookies.
Since I know this script collects data, it introduces another form of risk, and I don't want to open that can of worms. It could be made less risky if the script were audited and a subresource integrity (SRI) hash were applied to it, but the product builds the script dynamically based on the active survey, so its contents, and therefore its hash, would change whenever the survey does.
Either way, even if we could proxy the script, it would still be blocking JS and have the document.write calls.
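To make the proxy option concrete, here is a minimal sketch of what a Worker-style proxy could look like. It is not our actual configuration: the path prefix, the vendor host, and the names toUpstreamUrl and surveyProxy are all hypothetical. The sketch maps a first-party path onto the vendor's URL and drops the Cookie header before forwarding the request.

```typescript
// Hypothetical values: neither the path nor the vendor host comes from a real setup.
const PROXY_PREFIX = "/vendor/survey/";
const VENDOR_ORIGIN = "https://survey-vendor.example";

// Map a first-party URL to the vendor URL it stands in for.
// Returns null when the request is not for the proxied path.
function toUpstreamUrl(requestUrl: string): string | null {
  const url = new URL(requestUrl);
  if (!url.pathname.startsWith(PROXY_PREFIX)) return null;
  const rest = url.pathname.slice(PROXY_PREFIX.length);
  return `${VENDOR_ORIGIN}/${rest}${url.search}`;
}

// Worker-style handler; with module syntax this object would be the default export.
const surveyProxy = {
  async fetch(request: Request): Promise<Response> {
    const upstream = toUpstreamUrl(request.url);
    // Anything outside the proxied path passes through untouched.
    if (upstream === null) return fetch(request);

    // Copy the headers, then remove cookies so they are not sent to the vendor.
    const headers = new Headers(request.headers);
    headers.delete("cookie");
    return fetch(upstream, { headers });
  },
};
```

Removing the Cookie header only keeps cookies out of the proxied request. Once the script runs on the page, it can still read any cookie that isn't marked HttpOnly, which is part of why this option doesn't appeal to me.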
Use their API
The product has an API, and we could use it to make the product work to our liking. It would require us to write our own logic for how often to display the survey, make sure all question types and validations are supported, and monitor a new service. This does place a burden on the development team, but it would be a seamless integration when complete and work the closest to how the product was likely envisioned.
A hyperlink
The humble <a> tag. We could also link to the survey and serve it from its canonical URL. This would still require a development resource to write the logic for how often the link to the survey displays, but would ease the monitoring burden and require less time than the API option. This is the route that is the quickest to implement unless the vendor offered an alternative embed code.
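The display-frequency logic for the link can be quite small. As a rough sketch, assuming a made-up policy of showing the link at most once every N days (the function name and the policy itself are hypothetical, not something we have decided):

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Decide whether the survey link should be shown on this page view.
// lastShownAtMs is the timestamp of the previous display, or null if never shown.
function shouldShowSurveyLink(
  lastShownAtMs: number | null,
  nowMs: number,
  minDaysBetween: number
): boolean {
  // Never shown before: eligible right away.
  if (lastShownAtMs === null) return true;
  // Otherwise wait until the minimum interval has fully elapsed.
  return nowMs - lastShownAtMs >= minDaysBetween * MS_PER_DAY;
}
```

In the browser, the last-shown timestamp could live in localStorage and the current time could come from Date.now(). Keeping the decision in a pure function makes it easy to test without a page.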
Selective inclusion
It's doubtful that we really need a survey running on every page of the site all the time. We do it that way so the people who run the surveys don't have to put in a development request every time they want to run one. We could change this practice so the survey code runs only on the minimum set of pages where it is needed.
If the survey really is needed on every page view all the time, we could use Cloudflare Workers again to inject the code at the edge for a sample of our site traffic.
We could also combine methods so the code is only served on a specific set of layouts to a sample of the traffic.
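A sketch of the combined approach follows, with the caveat that the layout names, the visitor identifier, and the function names are assumptions for illustration. It hashes a visitor ID into a stable bucket, so the same visitor consistently falls in or out of the sample, and it only considers layouts on an allowlist.

```typescript
// Hypothetical allowlist of layouts that are eligible for the survey.
const SURVEY_LAYOUTS = new Set(["article", "product"]);

// 32-bit FNV-1a over the string's UTF-16 code units.
// Used only to spread visitors evenly; it is not a cryptographic hash.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

// sampleRate is a fraction between 0 (nobody) and 1 (everyone).
function shouldInjectSurvey(
  layout: string,
  visitorId: string,
  sampleRate: number
): boolean {
  if (!SURVEY_LAYOUTS.has(layout)) return false;
  // Map the hash onto [0, 1) so it can be compared with the rate.
  const bucket = fnv1a(visitorId) / 0x100000000;
  return bucket < sampleRate;
}
```

In a Worker, a decision like this could gate an HTMLRewriter transform that appends the survey snippet to the page, so visitors outside the sample never download the script at all.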
A lack of process
One thing became apparent: we ended up in a situation where we selected a vendor and committed to a year of their product because there were no guidelines for 3rd party vendor selection and implementation.
I started to look for policies from other organizations or a rubric to help guide future needs. Initially, I couldn't find what I was looking for, so I reached out to the web performance Slack channel to see if anyone had come across a 3rd party policy in the wild.
While there weren't specific examples, I got some good criteria and helpful tools that can be used to write my own.
I have since come across a few policies that will help guide the one for my company.
This policy document will be a public, living SLA with the business so it can help proactively identify any potential issues before contracts are signed and commitments are made. At the very least, it will provide us with a definition of where the balance lies between impacting user experience and providing our organization with information that helps make our products better.
Summary
Early in my career, I was critical of pieces of code I didn't like. As I became a manager, I began to understand the context within my team and in the business unit I serviced, and that led me to create checklists and processes to make our team more efficient.
Today, my understanding of context reaches beyond my team or a business unit. It touches the entire organization, our users, and outside partners. That led me to create processes that provide balance and work across the organization to make sure we can still do the things that are important to us, while also preserving the user experience.
I understand my own context as well. Because it's not practical for me to review every technical decision that's being made, I amplify my impact with documentation, demos, and education.
Hopefully, this lets other engineers and possibly non-technical resources understand the technology context, so we aren't stuck with a product that is detrimental to website reliability or user experience.