Documentation: https://selenium-python.readthedocs.io/navigating.html
Cheat sheet: http://allselenium.info/python-selenium-commands-cheat-sheet-frequently-used/
Less complete but cleaner: http://akul.me/blog/2016/selenium-cheatsheet/
#pip install selenium
#!pip install webdriver_manager
#conda install selenium
Collecting webdriver_manager
Downloading webdriver_manager-3.3.0-py2.py3-none-any.whl (16 kB)
Collecting crayons
Downloading crayons-0.4.0-py2.py3-none-any.whl (4.6 kB)
Collecting configparser
Downloading configparser-5.0.2-py3-none-any.whl (19 kB)
Requirement already satisfied: requests in c:\users\utilisateur\miniconda3\lib\site-packages (from webdriver_manager) (2.25.1)
Requirement already satisfied: colorama in c:\users\utilisateur\miniconda3\lib\site-packages (from crayons->webdriver_manager) (0.4.4)
Requirement already satisfied: certifi>=2017.4.17 in c:\users\utilisateur\miniconda3\lib\site-packages (from requests->webdriver_manager) (2020.12.5)
Requirement already satisfied: idna<3,>=2.5 in c:\users\utilisateur\miniconda3\lib\site-packages (from requests->webdriver_manager) (2.10)
Requirement already satisfied: chardet<5,>=3.0.2 in c:\users\utilisateur\miniconda3\lib\site-packages (from requests->webdriver_manager) (3.0.4)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in c:\users\utilisateur\miniconda3\lib\site-packages (from requests->webdriver_manager) (1.25.11)
Installing collected packages: crayons, configparser, webdriver-manager
Successfully installed configparser-5.0.2 crayons-0.4.0 webdriver-manager-3.3.0
# Import libraries
from bs4 import BeautifulSoup
from selenium import webdriver
import time
The website https://webscraper.io provides fake pages for practicing scraping. We'll use the page https://www.webscraper.io/test-sites/e-commerce/static/computers/laptops and extract the product name and price of the six items listed on its first page.
from webdriver_manager.chrome import ChromeDriverManager
driver = webdriver.Chrome(ChromeDriverManager().install())
# Alternative: skip webdriver_manager and pass the path where you
# installed ChromeDriver yourself (see the Selenium installation
# instructions, or get the driver for your system from
# https://sites.google.com/a/chromium.org/chromedriver/downloads)
#driver = webdriver.Chrome('/Users/xxx/Applications/chromedriver')
[WDM] - ====== WebDriver manager ======
[WDM] - Current google-chrome version is 88.0.4324
[WDM] - Get LATEST driver version for 88.0.4324
[WDM] - There is no [win32] chromedriver for browser 88.0.4324 in cache
[WDM] - Get LATEST driver version for 88.0.4324
[WDM] - Trying to download new driver from https://chromedriver.storage.googleapis.com/88.0.4324.96/chromedriver_win32.zip
[WDM] - Driver has been saved in cache [C:\Users\Utilisateur\.wdm\drivers\chromedriver\win32\88.0.4324.96]
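The call above passes the driver path as the first positional argument, which matches the Selenium 3 releases current when this log was produced. Selenium 4 expects the path to be wrapped in a `Service` object instead. The sketch below shows that form, assuming Selenium 4 and webdriver_manager are installed. The helper name `make_chrome_driver` and its `headless` parameter are hypothetical and only serve the illustration.

```python
def make_chrome_driver(headless=False):
    """Start Chrome using the Selenium 4 calling convention.

    Hypothetical helper: assumes selenium>=4 and webdriver_manager are installed.
    """
    # Imported inside the function so that this snippet can be loaded on a
    # machine without Selenium; the imports only run when a driver is requested.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service
    from webdriver_manager.chrome import ChromeDriverManager

    options = Options()
    if headless:
        # Run Chrome without opening a visible browser window.
        options.add_argument("--headless")

    # Selenium 4 receives the driver path through a Service object rather
    # than as the first positional argument of webdriver.Chrome.
    service = Service(ChromeDriverManager().install())
    return webdriver.Chrome(service=service, options=options)
```

Usage would be `driver = make_chrome_driver()` in place of the `webdriver.Chrome(...)` line, followed by `driver.quit()` once scraping is finished so the browser process does not linger.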
url = 'https://www.webscraper.io/test-sites/e-commerce/static/computers/laptops'
driver.get(url)
# Give any JavaScript on the page a moment to render
time.sleep(1)
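# Now we have the page, let BeautifulSoup do the rest!
# Naming the parser explicitly avoids BeautifulSoup's "no parser specified" warning.
soup = BeautifulSoup(driver.page_source, 'html.parser')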
# The title and price of each product are inside a
# div with class "caption".
for caption in soup.find_all(class_='caption'):
    product_name = caption.find(class_='title').text
    price = caption.find(class_='pull-right price').text
    print(product_name, price)
Packard 255 G2 $416.99
Aspire E1-510 $306.99
ThinkPad T540p $1178.99
ProBook $739.99
ThinkPad X240 $1311.99
Aspire E1-572G $581.99
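The fixed `time.sleep(1)` used above either wastes time or is too short on a slow connection. One possible refinement, sketched here under assumptions, is to wait explicitly for the first `caption` element with Selenium's `WebDriverWait`, and to turn the scraped price strings into numbers. The function names `parse_price` and `scrape_laptops` and the `timeout` parameter are hypothetical. The sketch assumes `driver` is an already-started WebDriver.

```python
from decimal import Decimal


def parse_price(text):
    """Convert a scraped price string such as '$416.99' to a Decimal."""
    cleaned = text.strip().lstrip("$").replace(",", "")
    return Decimal(cleaned)


def scrape_laptops(driver, url, timeout=10):
    """Return a list of (product_name, price) pairs from the laptops page."""
    # Third-party imports are kept local so parse_price can be used on its
    # own without Selenium or BeautifulSoup installed.
    from bs4 import BeautifulSoup
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver.get(url)

    # Wait until at least one element with class "caption" is present,
    # for at most `timeout` seconds, instead of sleeping a fixed interval.
    WebDriverWait(driver, timeout).until(
        EC.presence_of_element_located((By.CLASS_NAME, "caption"))
    )

    soup = BeautifulSoup(driver.page_source, "html.parser")
    rows = []
    for caption in soup.find_all(class_="caption"):
        name = caption.find(class_="title").get_text(strip=True)
        price = caption.find(class_="price").get_text(strip=True)
        rows.append((name, parse_price(price)))
    return rows
```

With numeric prices the results can be sorted or compared directly, for example `min(scrape_laptops(driver, url), key=lambda row: row[1])`. For the six items printed above, that would pick the Aspire E1-510 at 306.99.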