Tuesday, November 26, 2024

Install Beautiful Soup on Raspberry Pi (Bookworm)

In this short tutorial we will install Beautiful Soup on a Raspberry Pi running Raspberry Pi OS (Bookworm).

1.) Make sure that your Pi OS is up to date via:

sudo apt-get update

sudo apt-get upgrade

2.) Install Python 3 via:

sudo apt install python3

Note: this is often pre-installed

3.) Install Beautiful Soup via:

sudo apt install python3-bs4

4.) For web scraping we also need the requests library, which we install via:

sudo apt install python3-requests

Note: this is often pre-installed
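
Before writing the test script, an optional quick check can confirm that both libraries can be imported. The following is only a minimal sketch: the helper name installed_version is made up for this example and is not part of either library. It can be pasted into an interactive python3 session.

```python
import importlib


def installed_version(module_name):
    """Return the module's version string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    # Not every module exposes __version__, so fall back to a placeholder.
    return getattr(module, "__version__", "unknown")


# bs4 is the import name of Beautiful Soup, requests the HTTP library.
for name in ("bs4", "requests"):
    version = installed_version(name)
    if version is None:
        print(f"{name}: not installed")
    else:
        print(f"{name}: version {version}")
```

If one of the lines reports "not installed", repeating the matching apt install step above should fix it.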

5.) Now we can test whether everything works. For the test we will create a small Python script via:

nano test-script.py

6.) Once nano is open, paste the following into the script:

# imports
import json
import sys

import requests
from bs4 import BeautifulSoup

# define which site to crawl (the timeout avoids waiting forever on a bad connection)
page = requests.get("https://www.google.com", timeout=10)

# check that we got a website and not an error; stop here if the request failed
if page.status_code != 200:
    sys.exit(f"Request failed with status code {page.status_code}")

content = page.content

# parse the content from the site via BeautifulSoup
DOMdocument = BeautifulSoup(content, 'html.parser')

# extract the title
title = DOMdocument.title.string

# save the data
data = {
    "title": title
}

# dump the data to a file
with open('google_title.json', 'w') as json_file:
    json.dump(data, json_file, indent=4)

print("HTML title from the Google main page was exported into a JSON file.")

After that, save the file and close nano (Ctrl+O, Enter, then Ctrl+X).
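
The script looks up DOMdocument.title.string directly, which assumes the page actually has a title tag. As a hedged sketch of a more defensive variant, the lookup can be moved into a small helper that returns None when no title is found. The names extract_title and sample_html are hypothetical and do not appear in the script above. The example parses a hard-coded HTML string, so it can be tried without a network connection.

```python
from bs4 import BeautifulSoup


def extract_title(html):
    """Return the text of the <title> tag, or None if the page has none."""
    soup = BeautifulSoup(html, "html.parser")
    # soup.title is None when there is no <title> tag at all;
    # .string is None when the tag is empty or contains nested tags.
    if soup.title is None or soup.title.string is None:
        return None
    return soup.title.string.strip()


# Hard-coded sample page, used here instead of a real HTTP request.
sample_html = "<html><head><title> Example Page </title></head><body></body></html>"
print(extract_title(sample_html))
```

The same helper could be given page.content from the script in place of the sample string.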

7.) Now we will run the script via:

python3 test-script.py

8.) After the script has finished (which might take some time depending on your device), we can check the content via:

nano google_title.json

The file should contain the text of the <title> tag from the Google main page.
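
As an alternative to opening the file in nano, the exported value can also be read back from Python. This is a minimal sketch, assuming it is run from the same directory as google_title.json. The helper name read_title is made up for this example.

```python
import json
from pathlib import Path


def read_title(path):
    """Load a JSON file written by the test script and return its "title" value."""
    with open(path, encoding="utf-8") as json_file:
        data = json.load(json_file)
    # .get returns None instead of raising if the key is missing.
    return data.get("title")


json_path = Path("google_title.json")
if json_path.exists():
    print("Exported title:", read_title(json_path))
else:
    print("google_title.json not found; run test-script.py first.")
```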
