We help our customers power their job boards by scraping jobs from around the internet. But we don’t scrape job boards without permission. One question we hear a lot when we’re explaining our offerings boils down to this: Why not?
In this blog post, we’ll answer that question – and hopefully give you a new appreciation for the ethics of web scraping and what it means to be a good citizen of the internet.
If you’re new to the world of web scraping, here’s your primer: web scraping is the practice of pulling data from websites via code. If you use the internet, you almost certainly benefit from web scraping on a daily basis because search engines function by scraping other websites.
Google, perhaps the most famous web scraper, uses bots called spiders to crawl websites and index their pages so they can appear as links on results pages. When you type “best job board software” into Google, the links you see are there because Google scraped websites looking for signals that those websites had information about the “best job board software.”
Obviously, this is an incredibly valuable example of web scraping: without services like Google, it would be much harder for people to find what they were looking for online. That concept is key to understanding the ethics of web scraping more generally, which we’ll get into shortly.
The service that we offer our customers is a form of web scraping we usually call job scraping or job wrapping: we scrape the web for data about open jobs (aka job listings). We provide this data to our customers who run job boards – which function, as you probably realize, as search engines for people looking for specific types of work.
So how does the concept of ethics come into play? Let’s take a look.
First, let’s make a clear distinction: job scraping is legal. We’re not talking about legal vs. illegal but rather about how to do job scraping “right.” Of course, that’s a much fuzzier question, as any question of ethics is.
Then again, doing the “right” thing in any given situation becomes clearer as you understand more about how that thing works and who it involves. In the case of web scraping (and job scraping specifically), ethical behavior generally boils down to doing the following:
The good news: ethical job scraping isn’t that different from any other type of ethical behavior. However, to perform ethical job scrapes, the people building and maintaining the scrapes must understand what they’re doing well enough to adhere to all of these behaviors.
This brings us to the question we hear so often from potential customers: why don’t we scrape data from job boards?
Briefly, we don’t scrape job boards because that would not constitute an ethical use of web scraping.
Job boards themselves are aggregations of jobs data. If we scraped them to populate another job board, we wouldn’t be adding value – we’d be stealing value. That doesn’t benefit the owner of the job board and it doesn’t benefit those seeking jobs, which means it goes against the “be helpful” principle of ethical job scraping.
This is especially true when you understand the user experience that results from job board listings scraped from other job boards: a user clicks the listing, then they’re redirected to another job board (likely a competitor of the one they just left). From there, they may ultimately get to a job listing, but there’s a good chance that the listing will be expired already. This is one reason we’re big advocates of organic listings on job boards.
Finally, we don’t scrape job boards without permission because so many job boards explicitly prohibit data scraping. When that’s the case, scraping would also violate the “be respectful” principle.
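One common machine-readable way a site signals what automated visitors may fetch is its robots.txt file. The sketch below is illustrative only, not a description of our production tooling: it uses Python's standard `urllib.robotparser` to check a URL against a robots.txt before fetching it. The robots.txt content, the `example-job-bot` user agent, and the `board.example.com` URLs are all made up for the example. Keep in mind that robots.txt is only one signal; a site's terms of service can prohibit scraping even when robots.txt is silent.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an imaginary job board (not a real site).
ROBOTS_TXT = """\
User-agent: *
Disallow: /jobs/
Allow: /about
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://board.example.com/jobs/123",
        "https://board.example.com/about",
    ):
        allowed = may_fetch(ROBOTS_TXT, "example-job-bot", url)
        print(url, "allowed" if allowed else "disallowed")
```

In a real scraper you would load the live file with `set_url()` and `read()` rather than a hard-coded string, and skip any URL for which the check returns `False`.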
Just as important for our customers, though: we also don’t scrape job boards because their data is not as reliable as the primary sources of data that we do scrape, including applicant tracking systems (ATSes) and employer websites. Think about it: have you ever seen a job listing on a job board, clicked “apply,” and learned that the listing is actually closed? That’s the result of the job board having an outdated listing, possibly pulled via an outdated scrape.
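To make the outdated-listing problem concrete, here is a minimal sketch of one way to keep stale listings off a board: record when each listing was last confirmed on its primary source, and drop anything not confirmed recently. The `Listing` class, the `fresh_listings` function, and the one-day threshold are hypothetical names and values chosen for illustration, not a description of our actual pipeline.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Listing:
    title: str
    source_url: str
    last_verified: datetime  # when the listing was last seen on its primary source


def fresh_listings(listings, now, max_age=timedelta(days=1)):
    """Keep only listings confirmed on their source within max_age of now."""
    return [item for item in listings if now - item.last_verified <= max_age]


if __name__ == "__main__":
    now = datetime(2024, 1, 10, 12, 0)
    listings = [
        Listing("Data Analyst", "https://employer.example.com/jobs/1",
                datetime(2024, 1, 10, 6, 0)),   # verified six hours ago
        Listing("Office Manager", "https://employer.example.com/jobs/2",
                datetime(2024, 1, 3, 6, 0)),    # verified a week ago
    ]
    for item in fresh_listings(listings, now):
        print(item.title)
```

A listing scraped secondhand from another job board has no trustworthy `last_verified` value at all, which is part of why primary sources matter.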
In addition to being pulled from primary sources, the job scrapes we provide customers are updated daily (or more often, if that’s important for the type of role). This is valuable for everyone involved:
To summarize: job scraping can be fully ethical if you do it correctly (or work with a firm that does).
Even better: if you’re interested in fueling a job board with scraped job listings from around the web, scraping those listings ethically will create more value for the users of your job board and the sites you scrape, which will help you generate traffic and brand affinity as you grow.
If you still have questions about job scraping or the ethics of job scraping, don’t hesitate to get in touch. We’re passionate about doing the right thing, both for our customers and other citizens of the internet.