I have written code in Selenium to scrape Accor’s booking website after the search details have been passed in. With this code I am able to scrape and return the names of all the hotels on the resulting page:
    import time
    from selenium import webdriver

    url = 'https://all.accor.com/ssr/app/accor/hotels/london/index.en.shtml?dateIn=2021-08-20&nights=8&compositions=1&stayplus=false'
    driver = webdriver.Chrome(executable_path='C:\\Users\\conor\\Desktop\\diss\\chromedriver.exe')
    driver.get(url)
    time.sleep(10)  # give the results page time to load

    # One element per hotel on the results page
    working = driver.find_elements_by_class_name('hotel__wrapper')
    for work in working:
        name = work.find_element_by_class_name('title__link').text
        name = name.strip()
        print(name)
This returns all of the hotel names on the page as expected. However, it also returns an extra line after each hotel name containing the hotel’s star rating, which I don’t see in the HTML markup on the page. Here is the output:
    Sofitel London St James
    5 Star rating
    The Savoy
    5 Star rating
    Mercure London Bloomsbury Hotel
    4 Star rating
    Novotel London Waterloo
    4 Star rating
    ibis London Blackfriars
    3 Star rating
    Novotel London Blackfriars
    4 Star rating
    Mercure London Bridge
    4 Star rating
    Novotel London Bridge
    4 Star rating
    ibis Styles London Southwark - near Borough Market
    3 Star rating
    Pullman London St Pancras
    4 Star rating
Is there a way to remove the extra line of rating text that is returned with each hotel name? I only want the hotel names, since I am using them to compare prices across different sites. Any help appreciated, thank you.
The text you get back contains two lines, one with the hotel name and the other with the rating, so you can split the string on the newline and keep only the hotel name part. Here is an example:
    for work in working:
        name_with_rating = work.find_element_by_class_name('title__link').text
        # split() returns a list; the hotel name is its first element
        name = name_with_rating.split("\n")[0].strip()
        print(name)
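If you want to check the splitting logic without opening a browser, here is a minimal sketch that runs on plain strings. The helper name `hotel_name_only` is made up for this example, and the sample strings assume the link text looks like the output in the question, with the name on the first line and the rating on the second.

```python
def hotel_name_only(link_text):
    """Return the first non-empty line of a (possibly multi-line) string."""
    for line in link_text.splitlines():
        line = line.strip()
        if line:
            return line
    return ""  # nothing but whitespace was passed in


# Sample strings modelled on the output shown in the question (assumed layout).
samples = [
    "Sofitel London St James\n5 Star rating",
    "ibis Styles London Southwark - near Borough Market\n3 Star rating",
]
names = [hotel_name_only(s) for s in samples]
print(names)
```

In the Selenium loop you would pass the `.text` of the `title__link` element to the helper. Taking the first non-empty line rather than a fixed index also copes with a stray leading blank line, if the page ever produces one.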
Answered By – theNishant