Web Performance Archetypes: Using Machine Learning to Identify Patterns

Uncovering four performance archetypes that explain how page composition, network conditions, and JavaScript complexity shape real-world web performance.

As I’ve analyzed thousands of pages using the Chrome User Experience Report (CrUX) and HTTP Archive, I’ve realized that “slow” isn’t a single state. It’s a set of distinct behaviors, and the reason for the slowness can vary. A site struggling with JavaScript bloat on a high-end device in New York behaves fundamentally differently than a lightweight site crippled by a slow network in a rural area.

I wanted to understand these relationships from a web platform perspective. In this article, I use a machine learning technique called clustering to identify patterns in web performance and page composition.

The dataset

Much of machine learning and data science involves cleaning and preparing the data. Fortunately, I had already done that work for a previous article on web performance and machine learning. The data comes from the HTTP Archive and the CrUX API, and I reused the same dataset for this exploration.

What is clustering?

Clustering is a technique for sorting similar data points into distinct groups. Items in the same cluster are more similar to each other than to items in other clusters. If you’ve ever done a card sorting exercise, you have clustered by hand: you put things into groups based on their similarity.

In machine learning, clustering is done programmatically using algorithms like K-Means. After running the clustering algorithms, we’re left with groups of items that have common characteristics or patterns.
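The programmatic version of that grouping can be sketched as follows. This is a minimal example that assumes scikit-learn and NumPy; the article does not state which library or parameters were actually used. The two columns borrow feature names that appear later in this article (lcp_p75, js_bytes_ratio), but every value is synthetic.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data: two made-up groups of pages.
# Columns: [lcp_p75 in ms, js_bytes_ratio]. Values are illustrative only.
rng = np.random.default_rng(42)
fast_pages = rng.normal(loc=[1500.0, 0.20], scale=[150.0, 0.03], size=(50, 2))
slow_pages = rng.normal(loc=[3500.0, 0.50], scale=[150.0, 0.03], size=(50, 2))
features = np.vstack([fast_pages, slow_pages])

# Standardize so that milliseconds and ratios contribute on the same scale.
scaled = StandardScaler().fit_transform(features)

# n_clusters=2 matches this synthetic data, not the article's dataset.
model = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = model.fit_predict(scaled)

# Count how many pages landed in each cluster.
print(np.bincount(labels))
```

Standardizing first matters because K-Means relies on Euclidean distance. Without it, a feature measured in thousands of milliseconds would drown out a ratio between 0 and 1.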

Standardizing Features and Running K-Means

One consideration when using a clustering algorithm is deciding how many clusters to use. Fortunately, that can also be estimated programmatically. In this data exploration, I used the elbow method, which plots the within-cluster sum of squared distances (inertia) against the number of clusters. The largest bend in that curve indicates a good starting point for the number of clusters.

In some cases, you get a hard elbow where the ideal number of clusters is easily identified by a significant bend in the visualization:

Clustering visualization of the elbow method with a hard elbow at k=3

When I ran the elbow method over this dataset, I got a more gradual curve where the elbow wasn’t as pronounced:

Clustering on the actual dataset where the elbow was not as pronounced at k=4

However, even with this gradual curve, I could (barely) see that four clusters would be a good number to fit the final model.
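The elbow method described in this section can be sketched as below. It assumes scikit-learn and uses synthetic data with three well-separated groups, so the bend is much sharper than the one in the real dataset. The range of k values tried is an arbitrary choice for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data with three well-separated groups (illustrative only).
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
features = np.vstack([rng.normal(c, 0.5, size=(40, 2)) for c in centers])
scaled = StandardScaler().fit_transform(features)

# Inertia is the within-cluster sum of squared distances.
# It generally shrinks as k grows, so we look for where the drop flattens out.
inertias = {}
for k in range(1, 8):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(scaled)
    inertias[k] = model.inertia_

for k, inertia in inertias.items():
    print(f"k={k}: inertia={inertia:.1f}")
```

Plotting these values with k on the x-axis gives the elbow curve. On this toy data the drop flattens sharply after k=3, while real-world data tends to produce the more gradual curve shown above.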

Cluster Visualization

Once the number of clusters is determined, I can use principal component analysis (PCA) to project the high-dimensional feature space into 2D so the clusters can be plotted.

What we can see is three very distinct clusters, and one that is a bit fuzzy (cluster 1) because it is interspersed with other clusters.

Showing 4 distinct clusters for performance archetypes
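The PCA projection behind a plot like this can be sketched as follows, assuming scikit-learn. The feature matrix is random stand-in data with six made-up columns; the real feature set is not listed in this article.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in: 100 pages with 6 made-up numeric features.
rng = np.random.default_rng(1)
features = rng.normal(size=(100, 6))
scaled = StandardScaler().fit_transform(features)

# Keep the two directions of greatest variance for plotting.
pca = PCA(n_components=2)
points_2d = pca.fit_transform(scaled)

# Each row is now an (x, y) point that can be colored by its cluster label.
print(points_2d.shape)

# Fraction of the original variance that survives the projection.
print(pca.explained_variance_ratio_.sum())
```

The explained variance ratio is worth checking before reading too much into the picture. If two components capture only a small share of the variance, clusters that look interspersed in 2D may still be well separated in the full feature space.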

Interpreting the Clusters

To understand the clusters, we examine the average values for key features within each group. This is where my domain expertise in web performance comes into play. After analyzing trends in each cluster, here’s what I noticed:

  • Cluster 0 - (Field-Optimized / Image-Heavy Stable Pages): This is the strongest-performing and largest cluster. It has the best field outcomes overall. These pages have the highest ratio of images relative to the total bytes on the page (img_bytes_ratio ~0.384). However, both lab and field performance are under control, suggesting efficient delivery despite visual richness.
  • Cluster 1 - (Field-Limited / High-Variability Pages): This cluster stands out as the weakest in field performance. It also shows extreme spread and outliers, with large max values in both fcp_p75 and lcp_p75, indicating a highly unstable group. While its JS volume is not the highest in absolute bytes, this cluster appears most constrained by real-user conditions and variability.
  • Cluster 2 - (JS/CSS Heavier but Borderline-Good Field Performance): This group is more resource-heavy than cluster 0 and has the highest JavaScript share of page bytes (js_bytes_ratio ~0.471). Despite this, its field performance is better than cluster 1’s. Its LCP (lcp_p75 ~2453ms) is just inside the “good” range (2,500ms or less), and its TTFB (ttfb_p75 ~900ms) is about 100ms outside the good range (800ms or less). It looks like a “heavier but still mostly controlled” cluster rather than a latency-dominated one.
  • Cluster 3 - (Extreme Lab Slowdowns / Heavy Pages with Layout Instability): This cluster has the worst lab metrics. It also has the largest page weight in JS (bytesJS ~2.49M) and images (bytesImg ~2.84M), the worst visual stability (cls_p75 ~0.172), and the worst interaction metrics (inp_p75 ~297ms). Interestingly, its average field LCP (lcp_p75 ~2546ms) is worse than clusters 0 and 2 but still better than cluster 1, which suggests severe synthetic bottlenecks without being the single worst real-user cluster.
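One way to sanity-check these interpretations is to rate each cluster's mean field values against Google's published "good" and "poor" thresholds. The sketch below does that with the means from the tables later in this article, rounded to one or three decimals. Note the assumption: the thresholds are defined for the 75th percentile of a single page or origin, so applying them to a cluster-wide mean of p75 values is only a rough guide.

```python
# (good_max, poor_min) per metric: at or below good_max is "good",
# above poor_min is "poor", anything in between "needs improvement".
THRESHOLDS = {
    "ttfb_p75": (800, 1800),   # milliseconds
    "fcp_p75": (1800, 3000),   # milliseconds
    "lcp_p75": (2500, 4000),   # milliseconds
    "cls_p75": (0.1, 0.25),    # unitless score
    "inp_p75": (200, 500),     # milliseconds
}

# Cluster means copied (rounded) from the tables in this article.
CLUSTER_MEANS = {
    0: {"ttfb_p75": 590.5, "fcp_p75": 1197.7, "lcp_p75": 1578.0, "cls_p75": 0.053, "inp_p75": 160.6},
    1: {"ttfb_p75": 1116.9, "fcp_p75": 2395.5, "lcp_p75": 3094.5, "cls_p75": 0.114, "inp_p75": 272.2},
    2: {"ttfb_p75": 900.1, "fcp_p75": 1869.8, "lcp_p75": 2453.0, "cls_p75": 0.080, "inp_p75": 243.7},
    3: {"ttfb_p75": 720.7, "fcp_p75": 1643.5, "lcp_p75": 2545.8, "cls_p75": 0.172, "inp_p75": 296.8},
}


def rate(metric: str, value: float) -> str:
    """Classify a value against the thresholds for its metric."""
    good_max, poor_min = THRESHOLDS[metric]
    if value <= good_max:
        return "good"
    if value <= poor_min:
        return "needs improvement"
    return "poor"


ratings = {
    cluster: {metric: rate(metric, value) for metric, value in means.items()}
    for cluster, means in CLUSTER_MEANS.items()
}

for cluster, by_metric in ratings.items():
    print(f"Cluster {cluster}: {by_metric}")
```

Read this way, cluster 0 is the only group whose means are "good" across all five metrics, and no cluster's mean falls into the "poor" band, even though individual pages within each cluster clearly do.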

Cluster Field Metrics Box Plots

We can plot the field performance metrics for each cluster to see how the data is distributed. If you’re not familiar with box plots, they show the median, the quartiles, and the range of the data that falls outside the outliers. That makes them useful for comparing distributions between clusters and for spotting outliers or unusual data points. This is a good article that explains how to interpret box plots.
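To make those box plot terms concrete, here is a minimal sketch that computes them with Python's standard library. The TTFB samples are made up for illustration, and the 1.5 × IQR rule for the whiskers is a common plotting convention that is assumed here rather than confirmed by the charts below.

```python
from statistics import quantiles

# Made-up TTFB samples in milliseconds; not taken from the article's dataset.
ttfb_ms = [400, 520, 580, 610, 700, 760, 820, 900, 5300]

# Quartiles: the box spans q1..q3 and the line inside it is the median.
q1, median, q3 = quantiles(ttfb_ms, n=4, method="inclusive")
iqr = q3 - q1

# A common convention (assumed here) puts the fences 1.5 * IQR beyond the box.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Whiskers reach the most extreme samples inside the fences; the rest are outliers.
inside = [v for v in ttfb_ms if lower_fence <= v <= upper_fence]
outliers = [v for v in ttfb_ms if v < lower_fence or v > upper_fence]

print(f"box: {q1}..{q3}, median: {median}")
print(f"whiskers: {min(inside)}..{max(inside)}")
print(f"outliers: {outliers}")
```

A single extreme value such as the 5300ms sample barely moves the box but shows up as an isolated point, which is the pattern to look for in the max rows of the tables that follow.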

Time to First Byte (TTFB)

TTFB data for each cluster. The table below is the non-visual representation.

Time to First Byte (TTFB) p75 - units in milliseconds
Statistic  Cluster 0    Cluster 1    Cluster 2    Cluster 3
mean       590.539227   1116.870712  900.138824   720.670179
std        319.969492   597.812687   492.832707   510.239135
min        26           178          133          79
25%        368          723          577.5        404
50%        530          999          809.5        581
75%        738          1382         1106         899
max        3534         5634         4010         5320

First Contentful Paint (FCP)

FCP data for each cluster. The table below is the non-visual representation.

First Contentful Paint (FCP) p75 - units in milliseconds
Statistic  Cluster 0    Cluster 1    Cluster 2    Cluster 3
mean       1197.729430  2395.493404  1869.784706  1643.468915
std        511.756519   4612.853907  1042.555435  898.762398
min        272          119          128          357
25%        884          1464         1219.75      1114
50%        1118         1946         1623         1430
75%        1425         2615         2170.25      1928
max        11425        150590       9101         9751

Largest Contentful Paint (LCP)

LCP data for each cluster. The table below is the non-visual representation.

Largest Contentful Paint (LCP) p75 - units in milliseconds
Statistic  Cluster 0    Cluster 1    Cluster 2    Cluster 3
mean       1578.041332  3094.496922  2453.036471  2545.781876
std        717.992229   5181.459280  1456.467192  1794.159260
min        275          119          122          459
25%        1118         1870         1543.75      1539
50%        1450         2545         2085         2087
75%        1880         3446         2967         2853
max        11979        167042       13850        18442

Cumulative Layout Shift (CLS)

CLS data for each cluster. The table below is the non-visual representation.

Cumulative Layout Shift (CLS) p75 - unitless score
Statistic  Cluster 0    Cluster 1    Cluster 2    Cluster 3
mean       0.053291     0.113615     0.079871     0.172276
std        0.136675     0.243097     0.162854     0.290943
min        0            0            0            0
25%        0            0            0            0.01
50%        0            0            0.01         0.06
75%        0.04         0.11         0.09         0.19
max        1.4          2.11         1.35         2.08

Interaction to Next Paint (INP)

INP data for each cluster. The table below is the non-visual representation.

Interaction to Next Paint (INP) p75 - units in milliseconds
Statistic  Cluster 0    Cluster 1    Cluster 2    Cluster 3
mean       160.638729   272.205805   243.747059   296.781876
std        109.736882   297.494588   197.992875   286.211224
min        21           25           32           32
25%        102          129          129          148
50%        135          177          185.5        218
75%        184          288          294          322
max        2200         4214         2335         2962

Conclusion: Beyond “Fast” and “Slow”

By applying clustering to field and lab data, we move beyond the reductive binary of “fast” versus “slow.” This analysis reveals that web performance is a landscape of distinct archetypes, each with its own technical constraints and user experience realities.

Whether a site falls into the Field-Optimized stability of Cluster 0 or the Extreme Lab Slowdown of Cluster 3, the data shows that “page weight” is not destiny. High-performing pages can still be visually rich, and lightweight pages can still be crippled by field variability and network latency.

For developers and stakeholders, these clusters provide a roadmap for optimization:

  • Identify your archetype: Are you struggling with JavaScript execution (Cluster 2) or environmental instability (Cluster 1)?
  • Prioritize the right metrics: Cluster your own data and fix the pages at the intersection of the worst performance and the most traffic.
  • Bridge the lab-field gap: Understanding why some sites perform well in tests but fail in the field (and vice versa) is the key to building resilient digital experiences.

Ultimately, machine learning allows us to see the signatures of performance at scale. By recognizing these patterns, we can stop chasing scores and start solving the specific architectural problems that stand between our users and a seamless experience.

I'm Ethan Gardner. I help organizations turn web performance into a competitive advantage and improve developer velocity with design systems. Interested in consulting, audits, or workshops? Reach out.