In recent posts, we’ve addressed common reasons web scrapes stop working and best practices for ensuring the scrapes fueling your job board are up and running. Here, we’ll dive deeper into one specific problem many job boards have: sending applicants to 404 pages – i.e., linking from a job posting to a URL that cannot be found.
For job boards, sending applicants to 404 pages not only prevents open jobs from getting filled, it also costs you money if you’re buying traffic and costs your customers if you’re charging them via a CPC model. For applicants, 404 pages are a terrible experience.
Here’s a look at three common reasons job listings link to 404 pages and two ways you can keep from sending interested candidates to broken links.
Job scrapes can break for many reasons, but the most common reasons we see that a listing links to a 404 page are these:
In all these cases, a user clicks an “apply now” or similar button and ultimately discovers that there’s no job on the other end. Our research indicates that candidates drop off the site at that point.
So how big a problem is this for job boards?
We’ve found that about two percent of jobs change every day: they close, they get edited, the tracking parameters used in their URLs get updated, etc. Over a five-day work week, that adds up to about 10 percent of the links you’re sending applicants to that might not work.
So if you’re not doing anything to monitor and correct 404 links in job postings and you’re sending CPC traffic to posts, you can safely assume you’re wasting about 10 percent of that CPC budget.
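To make the arithmetic behind that estimate concrete, here is a minimal sketch in Python. The function names and the $1,000 budget are illustrative assumptions, not figures from our data. The simple additive estimate (two percent per day over five days) gives the 10 percent figure; a compounding estimate, or a seven-day week, gives somewhat different numbers.

```python
def stale_fraction(daily_change_rate: float, days: int, compounding: bool = True) -> float:
    """Estimate the fraction of links that have changed after `days` days."""
    if compounding:
        # Assumes each link independently survives a day with probability (1 - rate).
        return 1 - (1 - daily_change_rate) ** days
    # Simple additive estimate, capped at 100 percent.
    return min(1.0, daily_change_rate * days)


def wasted_spend(cpc_budget: float, stale: float) -> float:
    """Estimate the CPC budget spent on clicks that land on broken links."""
    return cpc_budget * stale


# Compare the additive and compounding estimates at a 2 percent daily change rate.
for days in (1, 5, 7):
    additive = stale_fraction(0.02, days, compounding=False)
    compounded = stale_fraction(0.02, days)
    print(f"{days} day(s): additive {additive:.1%}, compounding {compounded:.1%}")

# Hypothetical weekly CPC budget of $1,000 with 10 percent stale links.
print(f"Estimated wasted spend: ${wasted_spend(1000, 0.10):.2f}")
```

Either way, the estimate assumes every changed listing produces a broken link and that clicks are spread evenly across listings, so treat it as a rough upper bound rather than a measurement.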
Now let’s look at two ways to avoid sending traffic to 404 pages.
Monitoring your scrapes is one way to prevent the expensive problem of sending traffic to 404 pages. Doing that monitoring manually is difficult and, in many cases, impossible.
As we’ve noted in previous blog posts, monitoring amounts to about 80 percent of the work of managing web scrapes. What’s more, while the initial setup of a scrape is a process that has a clear end, monitoring is ongoing: as long as the scrape is live, you have to monitor it to ensure it’s returning valid results. Even with automatic monitoring, someone must review the error messages and correct the errors to avoid sending candidates to 404 pages.
Think of the analogy of doing a big landscaping project in your yard. The initial work is considerable: you have to plan the space, dig, plant, mulch, and so on. But that’s just the beginning; if you want your yard to look good for the long term, you have to maintain it day after day. If you don’t do any maintenance work, such as daily watering or periodic weeding, your yard will revert to its natural state. This is because of the fundamental process of entropy: left unchecked, disorder increases over time.
In the world of web scrapes, something similar happens: we’ve found that, in about six months, almost every source needs some kind of correction. But if you’re monitoring your scrapes on a regular basis, you can make those corrections as things break so that your job board is continually serving valid links and you never have an unwieldy maintenance project on your hands.
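As an illustration of what the simplest automated check might look like, here is a sketch that requests each apply URL and flags the ones that return an error status. It uses only Python's standard library; the function names and the `example.com` URLs are hypothetical, and this is not a description of our own monitoring system. A real monitor would also need retries, rate limiting, and handling for sites that reject `HEAD` requests.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for `url`, or 0 if no response was received."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises 4xx and 5xx responses as exceptions; the code is still available.
        return err.code
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return 0


def find_broken_links(
    urls: Iterable[str], fetch: Callable[[str], int] = fetch_status
) -> Dict[str, int]:
    """Map each URL that failed or returned an error status (>= 400) to its status."""
    broken = {}
    for url in urls:
        status = fetch(url)
        if status == 0 or status >= 400:
            broken[url] = status
    return broken


# Demonstration with a stand-in fetcher so the example runs without network access.
fake_statuses = {
    "https://example.com/jobs/1": 200,
    "https://example.com/jobs/2": 404,
}
print(find_broken_links(fake_statuses, fetch=fake_statuses.get))
```

Passing the fetcher in as a parameter keeps the checking logic testable offline; in production you would call `find_broken_links(urls)` and let it use `fetch_status`. Someone still has to review the flagged links and correct the underlying scrape.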
UTM logistics is a relatively new discipline in the world of digital marketing. It involves monitoring, analyzing, and reporting on performance based on UTM parameters. If you’re using UTM parameters to track job listings, you may be able to identify and fix 404 errors as part of your UTM logistics work.
For example, a job board advertising positions on Facebook might tag its Facebook apply links with a specific UTM parameter. The job board might then send those leads to an employer paying for a listing with additional UTM parameters. When the employer tracks apply rates from various sources (as defined by UTM parameters), it can easily see trends that signal a potential 404 link. If zero percent of leads from one source are completing an application, for example, something may be wrong with the listing, and the employer should check it.
It’s worth noting that, in this context, UTM logistics amounts to a different form of monitoring job listings. It’s not necessarily easier or less labor-intensive than the types of monitoring we do for a standard listing, but if your organization is already engaging in UTM logistics, this may be the simplest way to monitor job postings for 404 errors.
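The two halves of that workflow, tagging apply links and spotting sources where nobody completes an application, can be sketched in a few lines of Python. The function names, the sample numbers, and the `min_clicks` threshold are assumptions made for illustration; `utm_source`, `utm_medium`, and `utm_campaign` are the standard UTM parameter names.

```python
from typing import Dict, List, Tuple
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_utm(url: str, source: str, medium: str, campaign: str) -> str:
    """Return `url` with UTM parameters added, keeping any existing query parameters."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query.update(
        {"utm_source": source, "utm_medium": medium, "utm_campaign": campaign}
    )
    return urlunsplit(parts._replace(query=urlencode(query)))


def suspicious_sources(
    stats: Dict[str, Tuple[int, int]], min_clicks: int = 50
) -> List[str]:
    """List sources with at least `min_clicks` clicks and zero completed applications.

    `stats` maps a UTM source to (clicks, completed_applications). The click
    threshold is an assumed guard against flagging sources with too little traffic.
    """
    return sorted(
        source
        for source, (clicks, completed) in stats.items()
        if clicks >= min_clicks and completed == 0
    )


print(add_utm("https://example.com/jobs/42?ref=abc", "facebook", "paid_social", "spring_hiring"))

# Hypothetical one-day report: source -> (clicks, completed applications).
report = {"facebook": (120, 0), "newsletter": (80, 6), "partner_site": (10, 0)}
print(suspicious_sources(report))
```

A zero completion rate is only a signal, not proof of a 404; the flagged listing still needs to be checked directly.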
Whether you’re doing dedicated monitoring or modifying UTM logistics efforts to keep an eye on your job scrapes, daily monitoring is ideal for most scenarios.
At this frequency, you’re ensuring that no broken links (or other problems) are live for more than 24 hours, which means you’re proactively limiting the problems they might cause (including wasted spend on CPC campaigns).
Compared with more frequent checks, daily monitoring also limits the load required to pull and process data.
In some cases, though, having near real-time accuracy matters to job boards. Those situations call for more frequent monitoring. We’ve found that four times per day is adequate to accommodate most of these situations, though it is possible to monitor much more frequently. The more often you monitor, the greater the load on your servers and other technology.
One way to minimize that load is to outsource scrape maintenance, though you can expect to pay a premium if you want monitoring carried out more frequently than daily.
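The trade-off between freshness and load is easy to quantify. The sketch below assumes every link is checked once per run and that runs are evenly spaced through the day; the function name and the 50,000-link figure are hypothetical.

```python
from typing import Dict


def monitoring_tradeoff(link_count: int, checks_per_day: int) -> Dict[str, float]:
    """Estimate daily request volume and the longest a broken link can stay live."""
    if checks_per_day < 1:
        raise ValueError("checks_per_day must be at least 1")
    return {
        # One request per link per run.
        "requests_per_day": link_count * checks_per_day,
        # With evenly spaced runs, a link that breaks just after a run
        # goes unnoticed until the next one.
        "max_hours_broken": 24 / checks_per_day,
    }


# Hypothetical job board with 50,000 live listings.
for frequency in (1, 4, 24):
    print(frequency, monitoring_tradeoff(50_000, frequency))
```

Moving from daily checks to four per day cuts the worst-case exposure from 24 hours to 6, at four times the request volume.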
Sending job applicants to 404 pages is expensive. The solution is straightforward, but requires investment of a different kind: the time and energy of developers tasked with monitoring scrapes on an ongoing basis, plus the processing power to handle ongoing monitoring efforts.
If you’d like to stop losing money and creating poor visitor experiences by sending applicants to 404 pages, but you don’t have the resources to staff a team of developers dedicated to monitoring, get in touch. We can handle the work of building and monitoring your scrapes, with greater accuracy and for less money than you’d be able to in-house.
We noticed you mentioned scraping Indeed.com.
Just to confirm: Indeed.com prohibits spidering of its content and will block anyone who tries to scrape it.
Normally, our clients ask us to spider jobs from direct employer websites and applicant tracking systems (ATSes).
In some cases we can spider commercial job boards, provided there is a formal agreement between our client and the job board that allows spidering.