Scraping a web page from a website built with React and jQuery


I need to extract the product name, price and default color from the following link:

However, every time I run the script below, the information is retrieved differently: sometimes all three values are printed, sometimes only one or two of them, and sometimes none. This happens regardless of WebDriverWait.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.action_chains import ActionChains as AC
#import json
from bs4 import BeautifulSoup as soup
browser = webdriver.Firefox()

wait = WebDriverWait(browser, 100).until(EC.presence_of_all_elements_located)

s = soup(browser.page_source, 'html.parser')
name = s.select('.product-name')[0].getText()
price = s.select('.product-sale')[0].getText()
color = s.select('.colors-info')[0].getText()
print(name, price, color)

Would you please advise how to extract all three elements? If I try to download the page with requests or scrapy the above elements are missing.


A few points:

  1. The page shows an "accept cookies" button, which has to be clicked before you can proceed.

  2. After the cookie button, a modal pop-up appears; its close button also has to be clicked before you can proceed.

  3. Use explicit waits; for this case, wait for visibility of the element.

  4. Prefer CSS selectors over XPath.

  5. Maximize the browser window.
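To illustrate point 3 and why the wait in the question has no effect: `WebDriverWait.until` calls the condition it is given with the driver and returns as soon as the result is truthy. `EC.presence_of_all_elements_located` passed without a locator is not a ready condition, so calling it with the driver should yield a truthy object immediately and `page_source` is then read before the page has finished rendering. The sketch below imitates that behaviour in plain Python so it runs without a browser. `until`, `presence_located` and `FakeDriver` are simplified stand-ins invented for this illustration, not Selenium APIs.

```python
import time


def until(condition, driver, timeout=2.0, poll=0.05):
    """Simplified stand-in for WebDriverWait.until.

    Calls condition(driver) repeatedly and returns (value, polls)
    as soon as the value is truthy.
    """
    deadline = time.monotonic() + timeout
    polls = 0
    while True:
        polls += 1
        value = condition(driver)
        if value:
            return value, polls
        if time.monotonic() >= deadline:
            raise TimeoutError("condition never became truthy")
        time.sleep(poll)


def presence_located(selector):
    """Stand-in for an expected condition: a factory returning a predicate."""
    def predicate(driver):
        return driver.find(selector)
    return predicate


class FakeDriver:
    """Hypothetical driver whose page 'renders' only after a few lookups."""

    def __init__(self, ready_after):
        self.lookups = 0
        self.ready_after = ready_after

    def find(self, selector):
        self.lookups += 1
        return [selector] if self.lookups >= self.ready_after else []


# Misuse, as in the question: the factory itself is passed, with no locator.
# Calling it with the driver just builds a predicate, which is truthy,
# so the "wait" ends on the first poll without ever touching the page.
slow_page = FakeDriver(ready_after=3)
misuse_value, misuse_polls = until(presence_located, slow_page)
print(callable(misuse_value), misuse_polls, slow_page.lookups)

# Correct use: the factory is called with a locator, and the resulting
# predicate is polled until the element is actually there.
slow_page_2 = FakeDriver(ready_after=3)
found, correct_polls = until(presence_located(".product-name"), slow_page_2)
print(found, correct_polls)
```

With real Selenium the equivalent fix is to pass a locator tuple, for example `EC.visibility_of_element_located((By.CLASS_NAME, 'product-name'))`, and to read the text from the element the wait returns.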

Code:

driver = webdriver.Chrome(driver_path)
driver.maximize_window()
wait = WebDriverWait(driver, 20)

# Load the product page with driver.get(...) before the waits below.

# Dismiss the cookie banner first; it has to be accepted to proceed.
print("to accept cookies")
wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "button[id='onetrust-accept-btn-handler']"))).click()

# Then close the modal pop-up that appears after the cookie banner.
print("to close modal pop up windows")
wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "div[class*='closeModal'][class$='confirmacionPais']"))).click()

# Read each value only once its element is visible.
name = wait.until(EC.visibility_of_element_located((By.CLASS_NAME, 'product-name'))).text
price = wait.until(EC.visibility_of_element_located((By.CLASS_NAME, 'product-sale'))).text
color = wait.until(EC.visibility_of_element_located((By.CLASS_NAME, 'colors-info'))).text

print(name, price, color)

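Imports:

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

Output: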

to accept cookies
to close modal pop up windows
Midi satin skirt £39.99 Black
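On the question's last point, requests and scrapy only receive the HTML the server sends and do not execute JavaScript. If the product details are rendered client-side, which the React mention suggests but which I have not verified for this page, the raw response would contain no matching elements at all. A minimal sketch of that difference using only the standard library follows. Both HTML strings and the tag names in them are invented for illustration (the values are taken from the output above), and `count_classes` is a hypothetical helper.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Counts start tags that carry any of the wanted CSS class names."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)
        self.counts = {name: 0 for name in wanted}

    def handle_starttag(self, tag, attrs):
        # The class attribute may be absent or empty.
        classes = (dict(attrs).get("class") or "").split()
        for name in classes:
            if name in self.wanted:
                self.counts[name] += 1


def count_classes(html, wanted):
    """Hypothetical helper: return {class_name: occurrences} for html."""
    parser = ClassCounter(wanted)
    parser.feed(html)
    parser.close()
    return parser.counts


WANTED = ["product-name", "product-sale", "colors-info"]

# Assumed shape of what a plain HTTP client might receive: an empty
# mount point plus a script that builds the page in the browser.
STATIC_SHELL = (
    '<html><body><div id="root"></div>'
    '<script src="/bundle.js"></script></body></html>'
)

# Assumed shape of the DOM after the browser has run the script.
RENDERED_DOM = (
    '<html><body><div id="root">'
    '<h1 class="product-name">Midi satin skirt</h1>'
    '<span class="product-sale">£39.99</span>'
    '<span class="colors-info">Black</span>'
    '</div></body></html>'
)

static_counts = count_classes(STATIC_SHELL, WANTED)
rendered_counts = count_classes(RENDERED_DOM, WANTED)
print(static_counts)
print(rendered_counts)
```

This is why a real browser driven by Selenium, combined with explicit waits, is used here instead of a plain HTTP download.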

Answered By – cruisepandey

This answer was collected from Stack Overflow and is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
