How to scrape links from a web page using Beautiful Soup

This recipe shows how to scrape the links from a web page with Beautiful Soup (bs4), using its select() method.

Recipe Objective - How to scrape links from a web page using Beautiful Soup?

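Steps to scrape the links from a web page:-

  1. Import the necessary modules (requests and bs4).
  2. Load the HTML document, for example by downloading it with requests.get().
  3. Pass the HTML document to the BeautifulSoup() constructor, together with the name of a parser such as 'html.parser'.
  4. Pass a CSS selector for the link tags to the .select() method, e.g. soup.select('a'). It returns a list of the matching tags.
  5. Use a list comprehension to read the 'href' attribute of each tag, which gives the list of links.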
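
Step 4 depends on the CSS selector handed to .select(). As a minimal offline sketch (the HTML snippet below is hypothetical and exists only for illustration), the following compares the selector 'a', which matches every anchor tag, with 'a[href]', which matches only anchors that actually carry an href attribute:

```python
from bs4 import BeautifulSoup

# Hypothetical HTML snippet, made up for this illustration.
html = """
<html><body>
  <a href="https://www.projectpro.io/">Home</a>
  <a href="/projects">Projects</a>
  <a name="top">Anchor without href</a>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# 'a' matches every anchor tag, including the one with no href.
all_anchors = soup.select("a")

# 'a[href]' matches only anchors that have an href attribute.
anchors_with_href = soup.select("a[href]")

# Indexing with ['href'] is safe here because every selected tag has one.
links = [a["href"] for a in anchors_with_href]

print(len(all_anchors), len(anchors_with_href))
print(links)
```

Reading tag['href'] on an anchor that has no href raises a KeyError, so filtering in the selector (or using tag.get('href')) keeps the list comprehension from failing on such tags.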

Example:-

import requests
from bs4 import BeautifulSoup as bs

# load the projectpro webpage content
r = requests.get('https://www.projectpro.io/')

# convert to beautiful soup, naming the parser explicitly
soup = bs(r.content, 'html.parser')

# printing our web page
print(soup.prettify())

# scraping the links:-
# select only the <a> tags that have an 'href' attribute,
# so that web_link['href'] cannot raise a KeyError
web_links = soup.select('a[href]')
actual_web_links = [web_link['href'] for web_link in web_links]
print(actual_web_links)
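
The href values are returned exactly as they are written in the page, so some of them may be relative paths or in-page fragments rather than full URLs. One possible follow-up, sketched here under assumptions, is to resolve them against the page URL with urljoin from the standard library. The helper name normalize_links and the sample paths are hypothetical and not taken from the actual page:

```python
from urllib.parse import urljoin

# The page URL used in the recipe.
base_url = "https://www.projectpro.io/"

# Hypothetical href values, standing in for actual_web_links.
raw_links = ["/projects", "https://www.projectpro.io/blog", "#top", "/projects"]


def normalize_links(base, hrefs):
    """Return absolute URLs, skipping in-page fragments and duplicates."""
    seen = set()
    result = []
    for href in hrefs:
        # Skip links that only point to a position inside the same page.
        if href.startswith("#"):
            continue
        # Resolve relative paths against the base URL.
        absolute = urljoin(base, href)
        if absolute not in seen:
            seen.add(absolute)
            result.append(absolute)
    return result


absolute_links = normalize_links(base_url, raw_links)
print(absolute_links)
```

Keeping the order of first appearance, rather than converting to a set, makes the output easier to compare with the page itself.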
