Broken links: how to find them and what happens if you don't fix dead 404 pages
People often ask me about broken links, both online and on my social networks. Collected into a single list, the questions look like this:
• Why do broken links appear?
• How to fix or remove broken links?
• How to remove dead links on WordPress and/or Bitrix?
• How do broken links harm the site?
• What happens if you don't fix dead links?
• How to find broken links?
• What programs and plugins are there for finding broken links?
In this post I'll try to answer all of these questions, but first let's start with the basics.
What are broken links?
A broken link is a link on a working page (one that returns a 200 status code) that points to a URL returning a 404 status code, meaning the target page does not exist. A user who follows such a link sees a message like: "Error 404. The page does not exist or it has been deleted."
Why do broken links appear?
There are many different explanations on the Internet, but they all boil down to one thing: someone was careless.
Recently I audited a site with more than 10,000 pages and almost 9,000 broken links. The site and its structure had been changed, and the programmer made a typo in the code that builds the URLs, which produced the errors. The client and everyone else involved did not notice the oversight, and of course nobody went looking for it. They only saw that traffic had dropped and did not know why.
Still, if you break down the most common causes of broken links, you get the following:
• The site structure has changed. Online stores often change where products are nested, for example dropping /product/ in favor of /catalog/.
• The page has been removed. Taking the online store again: you found duplicate products and deleted some of them. But somewhere in the blog, in a review article, there was a link to one of them, and that link automatically becomes broken.
• A typo. This is the most common cause of broken links and of unnecessary 301 redirects.
Some people argue that moving from http to https can produce broken links. To be honest, I have never seen this happen. If the move was done incorrectly, you get 301 redirects, not broken links with a 404 response code.
How to fix or remove broken links?
To fix a broken link, you first need to figure out why it appeared. There are 3 scenarios:
• A typo. If someone mistyped the URL, correct it. This is a minor bug that can be fixed in a matter of minutes.
• The page has been removed. From experience, I don't recommend simply deleting a page. It has link weight and history, other sites may link to it, and so on. If you do want to delete a page, it is better to use the .htaccess file (on Apache servers) to set up a 301 redirect to a similar page, or at least to the parent section.
• The structure has changed or the page has been deleted. This scenario is much more interesting. Again through the .htaccess file, you need to set up page-by-page redirects from the old URLs to the new ones. If the task is specified properly, any programmer will cope in 5 minutes.
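To show what a page-by-page redirect list can look like, here is a minimal Python sketch that turns a table of old and new URLs into Redirect 301 lines for .htaccess. It assumes an Apache server with mod_alias enabled and URL paths without spaces. The function name build_redirect_rules and the lamp URLs are made up for illustration.

```python
from urllib.parse import urlsplit


def build_redirect_rules(url_map):
    """Return mod_alias 'Redirect 301' lines for an old -> new URL mapping."""
    rules = []
    for old_url, new_url in url_map.items():
        # Redirect matches on the URL path, so strip scheme and host if present.
        old_path = urlsplit(old_url).path
        if not old_path.startswith("/"):
            raise ValueError(f"old URL needs an absolute path: {old_url!r}")
        if new_url == old_path:
            continue  # a rule pointing at itself would create a redirect loop
        rules.append(f"Redirect 301 {old_path} {new_url}")
    return rules


# Hypothetical mapping for a shop that moved products from /product/ to /catalog/.
url_map = {
    "/product/blue-lamp/": "/catalog/blue-lamp/",
    "/product/red-lamp/": "/catalog/red-lamp/",
}

print("\n".join(build_redirect_rules(url_map)))
```

The printed lines can be reviewed one by one before they go into .htaccess. When a whole section is renamed and the rest of each URL stays the same, a single pattern rule (for example with RedirectMatch) may be enough instead of a long list.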
How to remove broken links on WordPress and/or Bitrix?
Oddly enough, a good half of the owners of WordPress and Bitrix sites believe that their case calls for some separate solution. It does not. On any content management system (CMS), problems with broken links are solved in the same way.
How do broken links harm the site?
The answer is simple: by their mere presence. Users who keep running into 404 pages leave the site entirely in 70% of cases, rather than returning to the previous page. This means the site loses not only on behavioral factors but also in audience loyalty, which ultimately affects the number of clicks from the search results.
This is why people began to design 404 pages, adding navigation elements and an offer to return to the previous page, along with a friendly graphic telling the user that something went wrong and there is no need to worry.
How much harm is done depends on which metric you look at. A large number of 404 pages will negatively affect the site, that is 100% certain. The only difference is the degree of influence: stronger on some metrics, weaker on others.
• Behavioral factors. Because 70% of people leave the site, behavioral metrics get worse, and search engines take them into account when ranking the site.
• Conversion. As the site's reputation deteriorates in the eyes of users, conversion suffers too: both the number of orders and the average order value.
• Positions in the search results. As behavioral factors deteriorate, the site's positions drop accordingly. Search engines have long estimated how many users should perform which actions on your site, based on average values for a sample of comparable sites.
Some argue that the crawl budget will suffer. I partly agree, with a caveat of my own. But first, a quote from John Mueller of Google:
4xx errors do not lead to a decrease in the crawling budget. The bot re-scans these pages to make sure they are closed, but does so without affecting the rest of the pages.
I partly agree with this. But in the case I observed, the entire crawl budget was being spent on dead links, especially because they were also listed in sitemap.xml after the structure and the URL generation had changed.
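A quick way to see whether sitemap.xml still lists dead URLs is to parse it and check each entry. The sketch below is only an illustration: it assumes a plain urlset sitemap (not a sitemap index), and the get_status argument is a hypothetical callable that returns the HTTP status code for a URL, so any HTTP client can be plugged in.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(sitemap_xml):
    """Return every <loc> value from a urlset sitemap given as a string."""
    root = ET.fromstring(sitemap_xml.strip())
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def dead_sitemap_entries(sitemap_xml, get_status):
    """Return sitemap URLs whose status code is in the 4xx range."""
    return [url for url in sitemap_urls(sitemap_xml) if 400 <= get_status(url) < 500]


# Made-up sitemap and status table, used instead of real network requests.
example_sitemap = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/catalog/blue-lamp/</loc></url>
  <url><loc>https://example.com/product/blue-lamp/</loc></url>
</urlset>
"""
example_statuses = {
    "https://example.com/catalog/blue-lamp/": 200,
    "https://example.com/product/blue-lamp/": 404,
}

print(dead_sitemap_entries(example_sitemap, example_statuses.get))
```

Anything this returns should be either redirected or removed from the sitemap, so that the robot is not invited to dead pages again.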
To keep the crawl budget from being wasted, I recommend configuring Last-Modified and If-Modified-Since. At the very least, this stops the robot from downloading the same unchanged pages on every visit: for a page that has not changed since the last crawl, it receives a "304 Not Modified" response and moves on, which saves crawl budget. Keep in mind that this only works for pages that exist. A dead link still answers 404, not 304, so the broken URLs themselves have to be fixed or redirected and removed from sitemap.xml.
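The logic behind Last-Modified and If-Modified-Since can be sketched in a few lines of Python. This is not the configuration of any particular server or CMS, only the decision a server makes; the function name conditional_response is hypothetical, and last_modified is assumed to be a timezone-aware datetime (or None when the page does not exist). Note that in this sketch a missing page still answers 404: a 304 is only possible for a page that exists and has not changed.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def _parse_http_date(value):
    """Parse an HTTP date header; return None if it is missing or malformed."""
    if not value:
        return None
    try:
        parsed = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def conditional_response(last_modified, if_modified_since=None):
    """Return (status, headers) for a GET request to one page."""
    if last_modified is None:
        return 404, {}  # the page does not exist, so 304 never applies
    # HTTP dates have one-second resolution and are expressed in GMT.
    last_modified = last_modified.astimezone(timezone.utc).replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    since = _parse_http_date(if_modified_since)
    if since is not None and last_modified <= since:
        return 304, headers  # unchanged: the body is not sent again
    return 200, headers


changed = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
print(conditional_response(changed))
print(conditional_response(changed, "Wed, 10 Jan 2024 12:00:00 GMT"))
print(conditional_response(None, "Wed, 10 Jan 2024 12:00:00 GMT"))
```

The first call corresponds to a first visit (200 with a Last-Modified header), the second to a repeat visit to an unchanged page (304), and the third to a dead URL (404).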
How to find broken links?
It is very difficult to search for such things manually, because we do not remember which URLs exist on the site and which ones are used in links. As my practice shows, even URLs like "p-l-ng-z-ferulovoyu-kislotoyu" and "piling-s-ferulovoy-kislotoy" slip past the team working on the site.
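Near-miss URLs like these are hard for a person to spot but easy for a program. As a rough sketch, Python's standard difflib module can suggest which existing slug a mistyped one was probably meant to be. The helper name suggest_fix, the 0.6 similarity cutoff, and every slug in the list except the one from the example above are assumptions for illustration.

```python
import difflib


def suggest_fix(broken_slug, known_slugs, cutoff=0.6):
    """Return the existing slug most similar to broken_slug, or None."""
    matches = difflib.get_close_matches(broken_slug, known_slugs, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# Slugs that really exist on the site (the last two are made up).
known_slugs = ["piling-s-ferulovoy-kislotoy", "kontakty", "dostavka-i-oplata"]

print(suggest_fix("p-l-ng-z-ferulovoyu-kislotoyu", known_slugs))
```

A suggestion like this is only a hint for a human to confirm. A lower cutoff finds more candidates but also more false matches.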
The easiest way to find all links is to use special software.
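To give an idea of what such software does under the hood, here is a simplified Python sketch using only the standard library: it collects the links from one page's HTML and reports the ones that answer 404. It is not how any of the tools below is actually implemented. The names LinkExtractor, http_status and find_broken_links are hypothetical, and some servers reject HEAD requests, so a real checker would need a GET fallback.

```python
from html.parser import HTMLParser
from urllib.error import HTTPError
from urllib.parse import urldefrag, urljoin
from urllib.request import Request, urlopen


class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def http_status(url, timeout=10):
    """Return the final status code for url, or None if the request failed."""
    request = Request(url, method="HEAD", headers={"User-Agent": "link-check-sketch"})
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as error:
        return error.code  # 4xx and 5xx answers arrive as exceptions
    except OSError:
        return None  # DNS failure, refused connection, timeout


def find_broken_links(page_url, html, get_status=http_status):
    """Return the absolute URLs linked from html that answer 404."""
    parser = LinkExtractor()
    parser.feed(html)
    broken, seen = [], set()
    for href in parser.hrefs:
        # Resolve relative links and drop #fragments before checking.
        target = urldefrag(urljoin(page_url, href)).url
        if not target.startswith(("http://", "https://")) or target in seen:
            continue  # skip mailto:, tel:, javascript: and duplicates
        seen.add(target)
        if get_status(target) == 404:
            broken.append(target)
    return broken
```

Calling find_broken_links("https://example.com/blog/", html) with the HTML of that page returns the list of dead targets. Passing a different get_status makes it possible to try the logic without touching the network.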
Broken link search tools
I will not describe how to use each tool, since that is covered on every tool's official website. Besides, by learning it yourself you will discover other functions of these tools, which will be very useful to you in the future!
• Google Search Console
• Xenu's Link Sleuth
• Screaming Frog
• W3C Link Checker
• Online Broken Link Checker
• Netpeak Spider
• Broken Link Checker
• Check My Links
• Dead Link Checker
• Dr. Link Check
Broken links: what else is there to say?
Every second site has broken links, at least one, and this is quite normal. There is even a saying about it:
Only those who do not work are not mistaken.
I recommend auditing the site at least once every 6 months in order to catch all the mistakes and flaws. Do not be afraid of this: search engines really do forgive a lot and let you get away with a lot. If it turns out that your site contains a large number of dead links, that is okay. After the fix, the site will be able to continue to grow and develop.
If you do not know how to run an audit, or whom to entrust it to, write to me. If I cannot recommend a good specialist, I will do the audit myself and discuss the project with you. I promise it will be interesting!