The Silent Cost of Broken Links
Let's start with why this actually matters beyond the obvious "404 is bad" intuition.
Google's crawl budget is finite, and every 404 your site returns is a small tax on that budget. For large sites with thousands of pages, or for sites that don't have overwhelming domain authority, this matters. Googlebot will crawl broken URLs repeatedly until it decides to deprioritise the site.
Broken internal links also leak PageRank. If page A links to page B and page B returns a 404, the link equity that should flow from A to B evaporates. Over time, on a site with hundreds of internal links pointing at dead pages, this adds up to a meaningful ranking disadvantage.
For agencies, there's also the client relationship dimension. A client who discovers broken links on their own site before you do is a client who starts wondering what else you're missing. You might tell them the content is their responsibility, but they will still feel that you're not in control of their site. And that's a retention problem, not just a technical one.
And link rot compounds. Research consistently shows around 25% of links break within five years. On dynamic CMS-driven sites with active content teams, the rate is higher. Every page added, every URL restructured, every plugin updated is a potential new source of broken links.
This is one layer in a broader site health picture. If you want a framework for thinking about monitoring holistically, The 7 Levels of Website Monitoring is worth reading; broken link monitoring sits at a specific level that uptime checks simply don't cover.
How Link Rot Happens
Understanding the failure modes helps you anticipate problems rather than just react to them.
URL restructuring without proper redirects is the single most common cause on actively maintained sites. Someone decides to reorganise the blog from `/news/post-title` to `/blog/post-title`, updates the CMS permalink structure, and forgets that a hundred internal links and a dozen external backlinks still point to the old paths.
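To make that failure mode concrete, here is a minimal sketch that builds a redirect map for the `/news/` to `/blog/` move and lists internal links that still point at the old paths. It assumes the restructure is a plain prefix swap; the function names and sample paths are hypothetical, not part of any particular CMS.

```python
def build_redirect_map(old_paths, old_prefix="/news/", new_prefix="/blog/"):
    """Map each old path to its new location after a prefix change."""
    redirects = {}
    for path in old_paths:
        if path.startswith(old_prefix):
            redirects[path] = new_prefix + path[len(old_prefix):]
    return redirects


def find_stale_links(internal_links, redirects):
    """Return internal links that still point at a path that has moved."""
    return sorted(link for link in set(internal_links) if link in redirects)


# Hypothetical sample data: paths before the move, and links found in content.
old_paths = ["/news/launch-day", "/news/pricing-update", "/about"]
redirects = build_redirect_map(old_paths)
stale = find_stale_links(
    ["/news/launch-day", "/blog/pricing-update", "/about"], redirects
)
print(redirects)
print(stale)
```

The redirect map feeds your 301 rules; the stale list is the set of internal links worth updating at the source so they don't rely on the redirect at all.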
Third-party link rot is outside your control but still your problem. External sites go offline, restructure, or paywall content you've cited.
CMS migrations are high-risk events. Moving from WordPress to Statamic, upgrading Magento, or switching headless architectures almost always leaves URL mapping gaps. Even with careful planning, edge cases slip through, especially for dynamically generated URLs like product category pages or tag archives.
Developer errors are more common than anyone admits. A hardcoded staging URL that makes it into production. A typo in an `href` attribute. A relative path that breaks when the page moves to a subdirectory.
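A hardcoded staging URL is one of the easier mistakes to catch automatically. The sketch below uses Python's standard `html.parser` to collect `href` and `src` values and flag any that point at a staging host. The host name `staging.example.com` and the function names are assumptions for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect href and src attribute values from an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def find_staging_links(html, staging_hosts):
    """Return links whose hostname is one of the given staging hosts."""
    collector = LinkCollector()
    collector.feed(html)
    return [
        link for link in collector.links
        if urlparse(link).hostname in staging_hosts
    ]


# Hypothetical page fragment with two staging references and one relative link.
page = (
    '<a href="https://staging.example.com/pricing">Pricing</a>'
    '<a href="/contact">Contact</a>'
    '<img src="https://staging.example.com/logo.png">'
)
print(find_staging_links(page, {"staging.example.com"}))
```

A check like this fits naturally into a build step, where it can fail the deployment before the link reaches production.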
Plugin and module updates in WordPress, Drupal, and Magento can silently change route structures. A plugin that handled custom post type URLs gets updated, changes its slug generation logic, and suddenly a hundred URLs return 404s. Plugins should reserve changes like this for major version releases, but you can only hope they do.
Deleted assets get removed from CDNs or media libraries without anyone checking whether pages still reference them.
How Google Handles 404s and Soft 404s
This is where it gets technically interesting, and where a lot of teams make prioritisation mistakes.
A hard 404 (HTTP 404 Not Found) is straightforward: the server tells Googlebot the page doesn't exist. Google will eventually drop it from the index and stop crawling it. This is actually the correct response for genuinely missing content.
A 410 Gone tells Google the resource has been permanently removed. Google has indicated that it may drop 410s from the index slightly faster than 404s, though in practice the difference tends to be small. For intentionally deleted pages, a 410 is semantically more appropriate than a 404.
Soft 404s are the insidious ones. This is where a page returns 200 OK but renders content that's effectively an error state: a "Product Not Found" message, a blank template, a "This category is empty" page. Googlebot sees a successful response, tries to index the content, eventually recognises it as low-quality or near-duplicate, and flags it as a soft 404 in Search Console. These waste crawl budget longer than hard 404s because they don't trigger immediate exclusion.
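You can approximate soft 404 detection yourself with a simple heuristic: a 200 response whose body is nearly empty or reads like an error message. This is only a sketch; the phrase list and the length threshold are my assumptions, and Google's own classification is not public and certainly more sophisticated.

```python
# Phrases that suggest an error page; extend this list for your own templates.
ERROR_PHRASES = ("product not found", "page not found", "this category is empty")


def looks_like_soft_404(status_code, body_text, min_length=200):
    """Flag 200 responses whose body is nearly empty or reads like an error."""
    if status_code != 200:
        return False  # a real error status is not a *soft* 404
    text = body_text.strip().lower()
    if len(text) < min_length:
        return True  # blank or near-blank template
    return any(phrase in text for phrase in ERROR_PHRASES)


print(looks_like_soft_404(200, "Product Not Found"))
print(looks_like_soft_404(404, "Product Not Found"))
```

Pass in the visible text of the page rather than the raw HTML, otherwise markup alone will push short error pages over the length threshold.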
Google Search Console's Coverage report (now labelled Page indexing) will show you 404s and soft 404s, but only for pages Google has already crawled. It's delayed by days, sometimes weeks. It won't show you outbound broken links at all. And it misses pages that aren't in Google's crawl queue yet. GSC is a useful diagnostic tool; it's not a monitoring solution.
Redirect chains are a related problem. A 301 to a 301 to a 301 dilutes link equity at each step and slows crawl processing. When you're auditing redirects, chains of more than two hops should be collapsed to a direct redirect.
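Collapsing chains is mechanical once you have the redirect rules as a mapping. A minimal sketch, assuming the rules are available as a plain dictionary of source path to target path (the function names are hypothetical):

```python
def resolve_chain(start, redirects, max_hops=10):
    """Follow a redirect mapping from start; return (final_target, hops)."""
    seen = {start}
    current = start
    hops = 0
    while current in redirects:
        current = redirects[current]
        hops += 1
        if current in seen or hops > max_hops:
            raise ValueError(f"redirect loop or overly long chain from {start}")
        seen.add(current)
    return current, hops


def collapse_chains(redirects):
    """Rewrite every redirect so it points straight at its final target."""
    return {source: resolve_chain(source, redirects)[0] for source in redirects}


# Hypothetical three-hop chain: /a -> /b -> /c -> /d
rules = {"/a": "/b", "/b": "/c", "/c": "/d"}
print(resolve_chain("/a", rules))
print(collapse_chains(rules))
```

The loop guard matters in practice: redirect rules accumulated over several migrations can end up pointing back at each other.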
Sitemap integrity matters too. Submitting a sitemap full of URLs that return 404s is a signal to Google that you're not maintaining your site. Sitemap monitoring is something I've built into Vigilant specifically because it's an easy thing to miss until it's a big problem.
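As a rough illustration of what a sitemap check involves (this is a sketch, not how Vigilant implements it), the snippet below extracts the `<loc>` entries from a standard sitemap and compares them against status codes gathered by some other check. The `status_by_url` dictionary is a stand-in for those results, and the sample URLs are made up.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return every <loc> value from a urlset sitemap."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def dead_sitemap_entries(urls, status_by_url):
    """Return sitemap URLs whose last known status is 404 or 410."""
    return [url for url in urls if status_by_url.get(url) in (404, 410)]


SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/old-page</loc></url>
</urlset>"""

urls = sitemap_urls(SAMPLE)
statuses = {"https://example.com/": 200, "https://example.com/old-page": 404}
print(dead_sitemap_entries(urls, statuses))
```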
Manual Approaches (And Why They Don't Scale)
Most teams start with point-in-time tools. There's nothing wrong with them for one-off audits but let's be honest about where they break down.
Google Search Console is free, authoritative, and integrated with Google's actual crawl data. Its limitations: delayed reporting, only covers crawled pages, and requires manual review.
Screaming Frog is the industry standard for crawl-based auditing. It's powerful. It also requires someone to manually initiate a crawl, wait for it to complete, export the results, and actually look at them.
`wget --spider` and `linkchecker` on the command line are genuinely useful for CI/CD integration on individual sites. `linkchecker` in particular is solid for automated checks in a pipeline. The problem is operational: once you're managing separate scripts for 50 client sites, aggregating results, and routing alerts, you've built your own monitoring platform.
Browser extensions like Check My Links are fine for a quick scan of a single page. For a site with 5,000 pages, they're not a tool, they're a joke.
The fundamental limitation of all these approaches is that they're snapshots. A link can break five minutes after your crawl completes. If you're running Screaming Frog monthly, a broken checkout link could go undetected for 29 days.
For agencies managing even a dozen client sites, the operational burden of running manual audits on a useful cadence is unsustainable. I've written more about why uptime monitoring alone isn't enough; broken link detection is a good example of the gap between "site is up" and "site is actually working".
Building a Scalable Broken Link Detection Strategy
Here's the framework I'd apply to get this under control systematically.
Tier your monitoring. Not all pages are equal. Your homepage, top-10 organic landing pages, and conversion paths (checkout, contact, booking) need to be checked frequently: daily or better. Deep blog content from three years ago matters, but a broken link there isn't an incident.
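Tiering can be as simple as a table of check intervals. In this sketch the tier names and the specific intervals are my assumptions; only "daily or better" for critical pages comes from the guidance above.

```python
from datetime import datetime, timedelta

# Assumed intervals; tune these to your own portfolio.
CHECK_INTERVALS = {
    "critical": timedelta(hours=6),  # homepage, top landing pages, conversion paths
    "standard": timedelta(days=7),   # indexed pages with regular traffic
    "archive": timedelta(days=30),   # older, low-traffic content
}


def is_due(tier, last_checked, now):
    """Return True when a page in the given tier is due for another check."""
    return now - last_checked >= CHECK_INTERVALS[tier]


now = datetime(2024, 1, 8, 12, 0)
print(is_due("critical", datetime(2024, 1, 8, 0, 0), now))
print(is_due("archive", datetime(2024, 1, 1, 0, 0), now))
```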
Combine crawl-based and log-based detection. Periodic crawls find broken links proactively. Server log analysis tells you when real users are hitting 404s right now. A spike in 404 log entries is often the first signal that a deployment introduced a broken URL structure, and you want to know about that within minutes, not days.
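On the log side, a minimal sketch of counting 404s per path might look like this. It assumes access logs in the common or combined log format used by default in Apache and nginx; the sample lines are invented, and a different log format would need a different pattern.

```python
import re
from collections import Counter

# Matches the quoted request and the status code that follows it,
# e.g. "GET /path HTTP/1.1" 404
LOG_LINE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')


def count_404_paths(log_lines):
    """Count how often each requested path returned a 404."""
    hits = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if match and match.group("status") == "404":
            hits[match.group("path")] += 1
    return hits


sample_log = [
    '203.0.113.5 - - [10/Oct/2024:13:55:36 +0000] "GET /news/old-post HTTP/1.1" 404 153 "-" "Mozilla/5.0"',
    '203.0.113.9 - - [10/Oct/2024:13:55:40 +0000] "GET /blog/new-post HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '203.0.113.7 - - [10/Oct/2024:13:56:02 +0000] "GET /news/old-post HTTP/1.1" 404 153 "-" "Mozilla/5.0"',
]
print(count_404_paths(sample_log).most_common(5))
```

Sorting by hit count tells you which broken URLs real visitors are actually reaching, which is a better priority signal than crawl order.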
Set 404 rate thresholds. A gradual increase in 404s over weeks is normal link rot. A 404 rate that doubles overnight is a deployment problem. Alerting on threshold spikes gives you fast detection of the latter without alert fatigue from the former.
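A threshold check along these lines separates the two cases. The doubling factor comes from the rule of thumb above; the minimum count and the use of a simple mean as the baseline are my assumptions, and a production system would probably want something more robust.

```python
from statistics import mean


def is_404_spike(history, current, factor=2.0, min_count=20):
    """Return True when the latest 404 count is a spike over the recent baseline.

    history: 404 counts for recent comparable intervals (e.g. previous days).
    current: the 404 count for the latest interval.
    """
    if not history:
        return False  # no baseline yet, so nothing to compare against
    baseline = mean(history)
    # min_count stops tiny sites alerting on a jump from 1 to 3.
    return current >= min_count and current >= factor * baseline


print(is_404_spike([10, 12, 11, 13], 30))  # well above double the baseline
print(is_404_spike([10, 12, 11, 13], 14))  # gradual drift, no alert
```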
Centralise multi-site monitoring. If you're managing a portfolio, you need a single dashboard, not 50 separate tools or scripts. The operational overhead of context-switching between per-site tools is where agency monitoring strategies quietly fail.
Automating Broken Link Monitoring for Agency Portfolios
Continuous monitoring changes the economics of this problem. Catching a broken link within hours of it appearing and discovering it in a quarterly audit aren't just quantitatively different outcomes; they're qualitatively different in terms of impact.
Automated monitoring should cover three categories: internal links (pages on your domain linking to other pages on your domain), outbound links (links to external resources), and resource links (images, scripts, stylesheets - a broken CSS file or missing JavaScript bundle can be more damaging than a 404 on a blog post).
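The three categories can be told apart from the tag a link was found in and the host it resolves to. This is a simplified sketch: the tag set and function name are assumptions, and treating every `<link>` tag as a resource is a shortcut (it also covers things like canonical hints).

```python
from urllib.parse import urljoin, urlparse

# Tags whose URLs are loaded as part of the page rather than navigated to.
RESOURCE_TAGS = {"img", "script", "link"}


def classify_link(page_url, tag, target):
    """Return (category, absolute_url) for a link found on page_url."""
    absolute = urljoin(page_url, target)  # resolves relative paths
    if tag in RESOURCE_TAGS:
        return "resource", absolute
    same_host = urlparse(absolute).hostname == urlparse(page_url).hostname
    return ("internal" if same_host else "outbound"), absolute


page = "https://example.com/blog/post"
print(classify_link(page, "a", "../pricing"))
print(classify_link(page, "a", "https://other.example.org/doc"))
print(classify_link(page, "img", "/media/logo.png"))
```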
For agencies, there's also the client communication layer. You need to be able to show clients that their site's link health is being actively maintained, not just fix things quietly and hope they don't notice. Branded reports that present link health data alongside uptime, performance, and certificate status turn monitoring from an operational cost into a visible, valued service. That's something I've specifically built into Vigilant: the Link Issues feature detects broken links across monitored sites and routes alerts through Slack, Email, or Discord. The white-label Client Pages and branded PDF Reports let agencies present that data to clients professionally, which in practice strengthens the retainer relationship.
The ROI case is straightforward. A single broken checkout flow caught within an hour instead of a week can mean a significant amount of revenue that would otherwise have been lost. One avoided ranking drop caused by wasted crawl budget pays for monitoring tooling many times over.
Fixing Broken Links: A Prioritisation Framework
Detection is only useful if it connects to remediation. Here's how I'd triage the queue.
Fix immediately - broken links on conversion paths, primary navigation, and top-10 organic landing pages. These directly affect revenue and crawlability of your most important pages. Don't batch these; fix them the same day they're detected.
Fix within a week - broken internal links on indexed pages with meaningful organic traffic. These leak link equity and harm crawlability even if they're not on primary conversion paths. A week is a reasonable SLA for links that affect SEO but aren't directly blocking user journeys.
Fix monthly in batch - broken outbound links on blog posts and resource pages. Less urgent, but the accumulated E-E-A-T impact of routinely linking to dead resources is real. A monthly cleanup pass is manageable.
Evaluate and remove - links to resources that simply no longer exist anywhere. Sometimes the right answer isn't finding a replacement link; it's removing the reference entirely and updating the surrounding content. A dead link that you remove is better than a dead link that you replace with a marginally relevant alternative.
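The four tiers above translate into a small decision function. The field names here are hypothetical; in practice they would be filled in from your crawl data and analytics.

```python
from dataclasses import dataclass


@dataclass
class BrokenLink:
    kind: str                             # "internal" or "outbound"
    on_critical_page: bool = False        # conversion path, primary nav, top landing page
    page_has_traffic: bool = False        # indexed page with meaningful organic traffic
    target_exists_elsewhere: bool = True  # a new location or replacement is known


def triage(link):
    """Return the remediation tier for a broken link."""
    if link.on_critical_page:
        return "fix immediately"
    if link.kind == "internal" and link.page_has_traffic:
        return "fix within a week"
    if not link.target_exists_elsewhere:
        return "evaluate and remove"
    return "fix monthly in batch"


print(triage(BrokenLink(kind="internal", on_critical_page=True)))
print(triage(BrokenLink(kind="outbound", target_exists_elsewhere=False)))
```

The order of the checks encodes the priority: a dead link on a checkout page is fixed immediately, whatever else is true about it.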
Use 301 redirects for moved content, 410 Gone for permanently removed content. Don't let deleted pages return soft 404s.
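Expressed as code, that rule is a lookup against two lists you maintain: paths that moved and paths that were deliberately removed. This is a framework-agnostic sketch with hypothetical names; wiring it into a real router or web server config is left to your stack.

```python
def response_for(path, moved_to, removed):
    """Return (status_code, redirect_target) for a path with no live content.

    moved_to: dict of old path -> new path for content that has moved.
    removed:  set of paths that were intentionally and permanently deleted.
    """
    if path in moved_to:
        return 301, moved_to[path]
    if path in removed:
        return 410, None
    return 404, None  # unknown path: a plain 404, never a 200 error page


moved = {"/news/launch-day": "/blog/launch-day"}
gone = {"/promo/summer-2019"}
print(response_for("/news/launch-day", moved, gone))
print(response_for("/promo/summer-2019", moved, gone))
print(response_for("/typo", moved, gone))
```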
Make Broken Link Monitoring a First-Class Concern
Broken links aren't a cosmetic issue. They compound into measurable SEO damage, lost revenue, eroded client trust, and crawl budget waste, none of which announces itself loudly until the damage is already done.
The only approach that works at scale is continuous automated monitoring with intelligent alerting and a clear remediation workflow. Point-in-time audits have their place, but they're a diagnostic tool, not a monitoring strategy. For agencies managing multiple client sites, the centralisation and white-label reporting layer is what makes this operationally viable.
Start today by auditing your highest-traffic sites and identifying the worst offenders; your server logs and Search Console Coverage report will give you a starting point. Then put continuous monitoring in place so you're catching future issues within hours or days, not weeks.
Vigilant includes broken link detection as part of a broader monitoring platform built specifically for agencies - open source, self-hostable, and built to cover the things that matter beyond just "is the site up." If you want to understand the full scope of what it covers, the features page is a good place to start.
A website that's up isn't the same as a website that works. Monitoring is how you know the difference.