I can't move forward more than one page with Selenium

Issue

I’m trying to scrape a Transfermarkt page with Selenium, but I can’t move forward more than one page. This is the URL of the website:
https://www.transfermarkt.es/transfers/transfertagedetail/statistik/top/land_id_zu/0/land_id_ab/0/leihe//datum/2022-07-06

My code:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

urlb = 'https://www.transfermarkt.es/transfers/transfertagedetail/statistik/top/land_id_zu/0/land_id_ab/0/leihe//datum/2022-07-06'
driver = webdriver.Chrome()
wait = WebDriverWait(driver, 10)
driver.get(urlb)

# To accept Cookies
wait.until(EC.frame_to_be_available_and_switch_to_it((By.XPATH, '//*[@id="sp_message_iframe_575430"]')))
wait.until(EC.element_to_be_clickable((By.XPATH, '//*[@id="notice"]/div[3]/div[2]/button'))).click()
driver.switch_to.default_content()

while True:
    # button next ">"   
    wait.until(EC.presence_of_element_located((By.XPATH, "//a[@title='A la página siguiente']")))
    driver.find_element(By.XPATH, "//a[@title='A la página siguiente']").click() 


With this code I can only reach the second page. I’d like to keep navigating until the last page.

Thanks

Solution

May I suggest using requests with a custom header instead of Selenium?

The following code works:

import requests
from bs4 import BeautifulSoup

headers = {'User-Agent': '...'}
for x in range(1, 18):
    r = requests.get(f'https://www.transfermarkt.es/transfers/transfertagedetail/statistik/top/land_id_zu/0/land_id_ab/0/leihe//datum/2022-07-06/sort//page/{x}', headers=headers)
    soup = BeautifulSoup(r.text, 'html.parser')
    ### find the data you need, save it etc

You can easily find user agents for different browsers, for instance here:
https://www.whatismybrowser.com/guides/the-latest-user-agent/chrome
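
Rather than hard-coding the page count in range(1, 18), the loop could stop once a page no longer contains the "next" link. The following is a minimal sketch, not part of the original answer. It assumes the last page omits the link titled 'A la página siguiente' (the title used in the question's XPath), which I have not verified against the site. The names iter_pages, has_next_page and NextLinkFinder, and the max_pages cap, are made up for illustration; fetch is any function that takes a URL and returns the page's HTML as a string.

```python
from html.parser import HTMLParser

BASE_URL = ('https://www.transfermarkt.es/transfers/transfertagedetail/statistik/top/'
            'land_id_zu/0/land_id_ab/0/leihe//datum/2022-07-06/sort//page/{page}')
# Title of the ">" link, taken from the XPath in the question.
NEXT_TITLE = 'A la página siguiente'


class NextLinkFinder(HTMLParser):
    """Sets .found to True if the HTML contains the "next page" link."""

    def __init__(self):
        super().__init__()
        self.found = False

    def handle_starttag(self, tag, attrs):
        if tag == 'a' and dict(attrs).get('title') == NEXT_TITLE:
            self.found = True


def has_next_page(html):
    """Return True if the page HTML still offers a "next page" link."""
    finder = NextLinkFinder()
    finder.feed(html)
    return finder.found


def iter_pages(fetch, max_pages=100):
    """Yield (page_number, html) until a page has no "next" link.

    max_pages is only a safety cap for this sketch (an assumption,
    not a limit imposed by the site).
    """
    for page in range(1, max_pages + 1):
        html = fetch(BASE_URL.format(page=page))
        yield page, html
        if not has_next_page(html):
            break
```

With requests, fetch could be something like lambda url: requests.get(url, headers=headers).text, and each html string can then be passed to BeautifulSoup as in the loop above.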

Answered By – platipus_on_fire_333

This answer was collected from Stack Overflow and is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
