Scraping Yell.com Using Python & BeautifulSoup

Web scraping is the process of extracting desired information from a web page. In this article I will show how to scrape a website, using yell.com as the case study.

For web scraping we use Python together with the BeautifulSoup and Requests libraries. First, though, you need some basic knowledge of HTML tags. For more information on HTML tags, please refer to https://www.w3schools.com/tags/.
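
To see why HTML tags matter, here is a minimal sketch of how BeautifulSoup looks elements up by tag name and class. The HTML snippet, the class names and the values in it are made up for illustration and are not taken from yell.com.

```python
from bs4 import BeautifulSoup

# A tiny, made-up HTML snippet (not taken from yell.com).
html = """
<html>
  <body>
    <h1 class="title">Example Cafe</h1>
    <p class="phone">01234 567890</p>
    <a href="https://example.com">Website</a>
  </body>
</html>
"""

# Parse the markup with Python's built-in HTML parser.
soup = BeautifulSoup(html, "html.parser")

# Look elements up by tag name and, where useful, by class attribute.
title = soup.find("h1", class_="title").get_text(strip=True)
phone = soup.find("p", class_="phone").get_text(strip=True)
link = soup.find("a")["href"]

print(title, phone, link)
```

Each lookup names a tag (h1, p, a), which is why knowing the basic tags makes it much easier to pick out the data you want.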

Getting Started

pip install requests
pip install beautifulsoup4

Let’s scrape!

import requests
from bs4 import BeautifulSoup
import pandas as pd

Set the URL and the headers variable

header = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:74.0) Gecko/20100101 Firefox/74.0'}
dict_datas = []
for page in range(0,3):
    page_number = page + 1
    html_datas = requests.get('https://www.yell.com/ucs/UcsSearchAction.do',
                              params={'keywords': 'Restaurants', 'location': 'united+kingdom',
                                      'scrambleSeed': '1558621577', 'pageNum': page_number},
                              headers=header)
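
To make it clearer what each request asks for, the sketch below builds the same kind of query string with the standard library instead of sending anything over the network. The helper name build_search_url is hypothetical, and the assumption is that the params dictionary is form-encoded into the URL in much the same way Requests does it.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.yell.com/ucs/UcsSearchAction.do"


def build_search_url(keywords, location, page_number, scramble_seed="1558621577"):
    """Return the search URL for one results page (hypothetical helper)."""
    params = {
        "keywords": keywords,
        "location": location,
        "scrambleSeed": scramble_seed,
        "pageNum": page_number,
    }
    # urlencode turns the dict into "key=value&key=value" with escaping applied.
    return BASE_URL + "?" + urlencode(params)


# Pages 1 to 3, mirroring the range(0, 3) loop above.
urls = [build_search_url("Restaurants", "united kingdom", page + 1) for page in range(0, 3)]
for url in urls:
    print(url)
```

Note that form encoding writes a space as "+" and a literal "+" as "%2B", so passing 'united kingdom' is what produces location=united+kingdom in the final URL.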

Note: you can look up your own browser's user-agent string online and use it in the header. Now, let's create a Python file named app.py with the following code:
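
A minimal sketch of the parsing step that could go in app.py is shown here. The function name parse_listings and the class names "listing", "listing-name" and "listing-phone" are hypothetical placeholders, not yell.com's real markup, so inspect the live page and substitute the actual ones. The sample HTML stands in for html_datas.text so the sketch runs without a network call.

```python
from bs4 import BeautifulSoup


def parse_listings(html):
    """Extract one dict per business card from a results page.

    The class names used here are hypothetical placeholders; replace them
    with the ones found by inspecting the real page.
    """
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for card in soup.find_all("div", class_="listing"):
        name = card.find("h2", class_="listing-name")
        phone = card.find("span", class_="listing-phone")
        rows.append({
            # Fall back to None when a card lacks one of the fields.
            "name": name.get_text(strip=True) if name else None,
            "phone": phone.get_text(strip=True) if phone else None,
        })
    return rows


# Made-up sample markup standing in for html_datas.text.
sample_html = """
<div class="listing">
  <h2 class="listing-name">Example Bistro</h2>
  <span class="listing-phone">020 0000 0000</span>
</div>
<div class="listing">
  <h2 class="listing-name">Sample Diner</h2>
</div>
"""

dict_datas = []
dict_datas.extend(parse_listings(sample_html))
print(dict_datas)
```

In the real script, the call would sit inside the page loop, for example dict_datas.extend(parse_listings(html_datas.text)), so that the rows from every page end up in one list.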

After you run it, you get the restaurant data in an Excel file.
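
The export step can be sketched with pandas as follows. The rows, the column names, the helper save_to_excel and the file name restaurants.xlsx are all assumptions for illustration; in the real script dict_datas would hold the scraped rows.

```python
import pandas as pd

# Made-up rows standing in for the scraped results.
dict_datas = [
    {"name": "Example Bistro", "phone": "020 0000 0000"},
    {"name": "Sample Diner", "phone": None},
]

# A list of dicts becomes a table with one column per key.
df = pd.DataFrame(dict_datas)


def save_to_excel(frame, path="restaurants.xlsx"):
    """Write the table to an Excel file (hypothetical helper).

    Writing .xlsx files requires an Excel engine such as openpyxl
    to be installed alongside pandas.
    """
    frame.to_excel(path, index=False)


print(df.shape)
```

Calling save_to_excel(df) at the end of the script writes the file; index=False keeps the pandas row index out of the spreadsheet.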

Conclusion

Thanks for reading.
