How to scrape data from the paragraph tag in Beautiful Soup

This recipe shows how to scrap the text or data from a paragraph tag by using a specific property in Beautiful Soup in Python.

Recipe Objective - How to scrape data from the paragraph tag using the specific property?

Important Modules:-

  1. bs4 - Beautiful Soup(bs4) is a Python data scraping library for pulling data out of HTML and XML files. 
  2. requests -  Requests library allows you to send HTTP/1.1 requests extremely easily. 

Procedure:-

  1. Import modules.
  2. Load an HTML document.
  3. Pass the HTML document into the Beautifulsoup() function.
  4. Use the

    tag to extract paragraphs from the Beautifulsoup object.

  5. Get text from the HTML document with get_text() or select() function.

Explore the Real-World Applications of Recommender Systems

For more related projects:-

https://www.projectpro.io/projects/data-science-projects/data-science-projects-in-python
https://www.projectpro.io/projects/data-science-projects/machine-learning-projects-in-python

Example:-

import requests
from bs4 import BeautifulSoup as bs

# load the projectpro webpage content
r = requests.get('https://www.projectpro.io/')

# convert to beautiful soup
soup = bs(r.content)

# printing our web page
print(soup.prettify())

# using "select" function:-
# "." used for class and "#" for the id's.
text = soup.select('p.card-text a')
text

What Users are saying..

profile image

Gautam Vermani

Data Consultant at Confidential
linkedin profile url

Having worked in the field of Data Science, I wanted to explore how I can implement projects in other domains, So I thought of connecting with ProjectPro. A project that helped me absorb this topic... Read More

Relevant Projects

Deep Learning Project- Real-Time Fruit Detection using YOLOv4
In this deep learning project, you will learn to build an accurate, fast, and reliable real-time fruit detection system using the YOLOv4 object detection model for robotic harvesting platforms.

Multilabel Classification Project for Predicting Shipment Modes
Multilabel Classification Project to build a machine learning model that predicts the appropriate mode of transport for each shipment, using a transport dataset with 2000 unique products. The project explores and compares four different approaches to multilabel classification, including naive independent models, classifier chains, natively multilabel models, and multilabel to multiclass approaches.

Build Regression Models in Python for House Price Prediction
In this Machine Learning Regression project, you will build and evaluate various regression models in Python for house price prediction.

Hands-On Approach to Regression Discontinuity Design Python
In this machine learning project, you will learn to implement Regression Discontinuity Design Example in Python to determine the effect of age on Mortality Rate in Python.

PyCaret Project to Build and Deploy an ML App using Streamlit
In this PyCaret Project, you will build a customer segmentation model with PyCaret and deploy the machine learning application using Streamlit.

Time Series Forecasting Project-Building ARIMA Model in Python
Build a time series ARIMA model in Python to forecast the use of arrival rate density to support staffing decisions at call centres.

Build an Image Segmentation Model using Amazon SageMaker
In this Machine Learning Project, you will learn to implement the UNet Architecture and build an Image Segmentation Model using Amazon SageMaker

Build a Similar Images Finder with Python, Keras, and Tensorflow
Build your own image similarity application using Python to search and find images of products that are similar to any given product. You will implement the K-Nearest Neighbor algorithm to find products with maximum similarity.

Deploy Transformer-BART Model on Paperspace Cloud
In this MLOps Project you will learn how to deploy a Tranaformer BART Model for Abstractive Text Summarization on Paperspace Private Cloud

Time Series Analysis with Facebook Prophet Python and Cesium
Time Series Analysis Project - Use the Facebook Prophet and Cesium Open Source Library for Time Series Forecasting in Python