Eighty-two percent of employers use job boards to attract talent, and at least 60 percent of job seekers turn to them to find work. To maintain this level of popularity, job boards must remain up to date and accurate. This is why effective web scraping of job postings is crucial for job board operators.
However, maintaining web scrapes in house often requires more time and resources than managers expected or budgeted for. Here are three signs that you should consider outsourcing the web scraping process for your job board.
Setting up a scrape is the easiest part of the web scraping process. In fact, configuring the initial scrape only accounts for about 20 percent of the effort involved. The remaining 80 percent involves supporting and monitoring the scrape.
At the outset of an in-house web scraping project, developers’ main task is to set up scrapes. They don’t have to worry much about what to do with the data they collect, because there isn’t yet much data to deal with. However, scrapes collect live data quickly, and all of that data, which can be pulled from tens of thousands of job listings, needs to be regularly parsed, enhanced, and packaged.
Scrapes also lose effectiveness as websites change or fail. In fact, according to Pingdom, more than 12,000 websites are down at any point in time. Another factor impacting scrape effectiveness? According to our findings, more than 40 percent of career websites are reconfigured every year. Often, scrapes must be updated to accommodate the changes those reconfigurations introduce.
If a scrape isn’t updated, it might fail. This could mean the associated job listing is no longer visible or that it no longer sends applicants to the right page. Either way, if the fix isn’t made within hours, job boards lose money and the candidate experience suffers.
Outsourcing addresses this problem because outsourced solutions include ongoing maintenance and monitoring – often far more than job boards could do in house.
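To make the monitoring described above concrete, here is a minimal sketch of the kind of check a monitoring job might run after each scrape. The `ScrapeRun` record, the site names, and the 50 percent drop threshold are all illustrative assumptions rather than part of any particular tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScrapeRun:
    # All fields are hypothetical; a real pipeline would record its own metrics.
    source: str         # identifier for the career site being scraped
    http_status: int    # status code the site returned
    jobs_found: int     # listings extracted in this run
    previous_jobs: int  # listings extracted in the prior run


def needs_attention(run: ScrapeRun, max_drop: float = 0.5) -> Optional[str]:
    """Return a short reason if the run looks broken, otherwise None."""
    if run.http_status >= 400:
        return f"site returned HTTP {run.http_status}"
    if run.jobs_found == 0 and run.previous_jobs > 0:
        return "no listings extracted; the page layout may have changed"
    if run.previous_jobs > 0 and run.jobs_found < run.previous_jobs * (1 - max_drop):
        return "listing count dropped sharply"
    return None


# Made-up example runs for three fictional career sites.
runs = [
    ScrapeRun("acme-careers", 200, 118, 120),
    ScrapeRun("globex-careers", 200, 0, 45),
    ScrapeRun("initech-careers", 503, 0, 30),
]

alerts = {}
for run in runs:
    reason = needs_attention(run)
    if reason is not None:
        alerts[run.source] = reason

for source, reason in alerts.items():
    print(f"{source}: {reason}")
```

A check like this only flags a problem. Someone still has to reconfigure the scrape, which is where most of the ongoing effort goes.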
Properly maintaining a scrape is expensive. The longer you maintain a scrape, the more time and resources you must funnel into it to ensure it works continuously. The primary costs are monitoring, reconfigurations, and, in the absence of proper monitoring, customer complaints from broken scrapes. Of course, setting up a scrape and failing to monitor it may seem cheap, but it isn’t.
And job boards don’t maintain just one scrape. They maintain hundreds, and sometimes even thousands.
And really, job boards don’t maintain scrapes – the developers who work for them do. The average salary for a software developer is $80,437. Because the work associated with web scraping continues after its initial configuration, most job boards find that, whatever resources they allocated to the initial project, more are required as the workload increases. And when customers complain, developers can be pulled off mission-critical projects to “put out the fire” from a broken scrape.
Bringing on just a few more developers quickly inflates staffing costs: three hires at that average salary add roughly $241,000 a year, and that’s before benefits.
And developers aren’t the only cost. To operate web scraping in house, job boards must also establish and monitor servers to run scripts, store data, and handle data transfers. Even if you outsource this work to the cloud, you’ll still need to pay for cloud usage. And if you’re not familiar with how to optimize cloud usage, cloud costs can easily spiral out of control.
Outsourcing web scraping, on the other hand, often costs less than just one full-time developer. It also frees up your in-house developers to work on other, more pressing tasks that make better use of their skills.
What’s more, outsourcing the often problematic scraping system makes your highly valuable developers happy, reducing attrition and helping keep key development projects on schedule.
Growth is exciting, but it comes with its own challenges. For job boards, mergers and acquisitions (like Indeed’s 2019 acquisition of Syft) mean finding ways to cost-effectively manage more scrapes. Smaller job boards become more attractive to potential acquiring partners when they have an efficient scraping setup, and larger job boards can more profitably acquire smaller boards when they scrape efficiently.
In both cases, outsourcing web scraping is usually the most efficient solution, and not only because it’s a cost-effective way to keep scrapes up and running.
Relying on a third-party web-scraping specialist also ensures that the knowledge required to keep your scrapes up and running doesn’t reside with just one or two people, who could leave your organization at any time. Instead, knowledge, best practices, common issues, and effective resolutions are systematized within the web-scraping provider, both across a team of people and within a sophisticated log and tracking system.
What’s more, outsourced providers can deliver helpful add-ons, such as dashboards for monitoring the status of your scrapes, and they continually improve features and functionality.
Job scraping in house requires a task force of specialized IT professionals and a fully established digital infrastructure. This can cost hundreds of thousands of dollars annually and demands considerable time to maintain.
Even tapping a contractor for the work doesn’t solve the problem: what happens when they get tired of scraping or want to get involved in other projects? All of that knowledge and experience goes away with them.
But outsourcing doesn’t just save costs; it also contributes to a better user experience for the job applicant. When scrapes fail, job candidates suffer. They can be sent to 404 pages or outdated postings, or see a job listing rendered as poorly formatted blocks of plain text. This might drive them to leave the site and sour their opinion of the job board.
That costs job boards in the short term, but it can also lead to fewer users over time, which can make it harder to attract customers.
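The candidate-facing failures described above (dead links, stale postings, unformatted text) can be caught before a listing is published. The following is a rough sketch under stated assumptions: the `Listing` fields, the 30-day staleness window, and the crude "contains no HTML tag" test for plain text are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass
class Listing:
    # Hypothetical record; field names are illustrative only.
    title: str
    apply_status: int      # HTTP status of the apply link, checked elsewhere
    last_seen: date        # last date the job appeared on the employer site
    description_html: str  # description as it would be rendered to candidates


def listing_problems(listing: Listing, today: date, max_age_days: int = 30) -> List[str]:
    """Collect the candidate-facing problems found in a single listing."""
    problems = []
    if listing.apply_status == 404:
        problems.append("dead apply link")
    if today - listing.last_seen > timedelta(days=max_age_days):
        problems.append("possibly outdated")
    # Crude heuristic (an assumption): a description with no tags is plain text.
    if "<" not in listing.description_html:
        problems.append("unformatted plain text")
    return problems


today = date(2024, 6, 1)
good = Listing("Nurse", 200, date(2024, 5, 30), "<p>Care for patients.</p>")
bad = Listing("Welder", 404, date(2024, 3, 1), "Weld things. Requirements: experience.")

for listing in (good, bad):
    print(listing.title, listing_problems(listing, today))
```

Listings that come back with a non-empty problem list could be held back or queued for review instead of being shown to candidates.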
By outsourcing, you can cut costs and direct time and resources toward other initiatives. If you’re looking to optimize spending on your web scraping efforts, contact us here. We offer a free trial.
We noticed you mentioned scraping Indeed.com.
Just to confirm: Indeed.com prohibits spidering of its content and will block anyone trying to scrape it.
Normally, our clients ask us to spider jobs from direct employer websites and ATSes.
In some cases we can spider commercial job boards, provided there is a formal agreement between our client and the job board to allow spidering.