I am running apache-airflow locally on Docker. I am not able to import selenium into one of my DAGs, which involves periodically scraping data from the web. I ran the pip install command in the console, which returned "requirement already satisfied". I read the official Apache Airflow documentation on Module Management but still cannot figure it out.
I tried modifying the sys.path variable as shown in the code below, but still got the same import error:
import sys

sys.path.append("C:\\Users\\DELL\\anaconda3\\lib\\site-packages")

from selenium import webdriver
from selenium.webdriver.support.ui import Select, WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
Any hint that would point me in the right direction would be highly appreciated!
Docker is like a "computer" inside your computer. If you are not entering the Docker container first, your pip install is installing selenium on your host machine, not inside the container where Airflow is running.
You can enter the container with docker exec -it container_name bash and then run pip install selenium there. Or, better yet, create a requirements file and wire it into your Airflow docker-compose setup, so the dependency is installed in every Airflow container rather than added by hand each time the container is recreated.
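One common way to use such a requirements file with the official Airflow image is to build a small custom image on top of it; a minimal sketch, where the Airflow version tag and file names are assumptions you should adapt to your setup:

```
# Dockerfile — extends the official Airflow image with extra Python packages
FROM apache/airflow:2.7.3

# requirements.txt is assumed to sit next to this Dockerfile and to contain
# a line "selenium" (plus any other DAG dependencies)
COPY requirements.txt /requirements.txt
RUN pip install --no-cache-dir -r /requirements.txt
```

Then point the build: key of the Airflow service in your docker-compose file at this Dockerfile (instead of using image: directly) and rebuild, so the scheduler, webserver, and workers all get selenium installed.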
Answered By – Javier Lopez Tomas