Running Selenium with Headless Chrome Webdriver


Python Problem Overview


So I'm trying some stuff out with selenium and I really want it to be quick.

So my thought is that running it with headless chrome would make my script faster.

First, is that assumption correct, or does it not matter whether I run my script with a headless driver?

Anyway, I still want to get it running headless, but somehow I can't. I tried different things, and most suggestions said it would work as described in the October update here:

https://stackoverflow.com/questions/46920243/how-to-configure-chromedriver-to-initiate-chrome-browser-in-headless-mode-throug

But when I try that, I get weird console output and it still doesn't seem to work.

Any tips appreciated.

Python Solutions


Solution 1 - Python

To run headless Chrome, just add --headless via chrome_options.add_argument, for example:

from selenium import webdriver 
from selenium.webdriver.chrome.options import Options
chrome_options = Options()
#chrome_options.add_argument("--disable-extensions")
#chrome_options.add_argument("--disable-gpu")
#chrome_options.add_argument("--no-sandbox") # linux only
chrome_options.add_argument("--headless")
# chrome_options.headless = True # also worked on older Selenium releases (deprecated in Selenium 4.8)
driver = webdriver.Chrome(options=chrome_options)
start_url = "https://duckgo.com"
driver.get(start_url)
print(driver.page_source.encode("utf-8"))
# b'<!DOCTYPE html><html xmlns="http://www....
driver.quit()

> So my thought is that running it with headless chrome would make my script faster.

Try using Chrome options like --disable-extensions or --disable-gpu and benchmark it, but I wouldn't count on much improvement.
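
One way to run that benchmark is sketched below. It is a minimal sketch, not a rigorous measurement: make_chrome and time_page_loads are hypothetical helper names, and it assumes chromedriver can be found by Selenium. It times only driver.get(), on a fresh browser session per run.

```python
import time
from statistics import mean


def make_chrome(headless):
    """Start Chrome, optionally headless (hypothetical helper)."""
    # Imported here so the timing helper below can be used without Selenium.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    if headless:
        options.add_argument("--headless")
    return webdriver.Chrome(options=options)


def time_page_loads(make_driver, url, runs=3):
    """Return the mean seconds spent in driver.get(url) over `runs` fresh sessions."""
    timings = []
    for _ in range(runs):
        driver = make_driver()
        try:
            start = time.perf_counter()
            driver.get(url)
            timings.append(time.perf_counter() - start)
        finally:
            # Always close the browser, even if the page load fails.
            driver.quit()
    return mean(timings)
```

Calling time_page_loads(lambda: make_chrome(True), "https://www.example.com") and then the same with make_chrome(False) gives two numbers to compare. Browser startup time is deliberately left out of the measurement; move the start timestamp above make_driver() if you want it included.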


References: headless-chrome

> Note: As of today, when running Chrome headless on Windows, you should include the --disable-gpu flag. See crbug.com/737678

Solution 2 - Python

Install & run containerized Chrome:

docker pull selenium/standalone-chrome
docker run --rm -d -p 4444:4444 --shm-size=2g selenium/standalone-chrome

Connect using webdriver.Remote:

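from selenium import webdriver

# Passing an options object works where DesiredCapabilities.CHROME did; the desired_capabilities argument was removed in newer Selenium 4 releases
driver = webdriver.Remote(command_executor='http://localhost:4444/wd/hub', options=webdriver.ChromeOptions())
driver.set_window_size(1280, 1024)
driver.get('https://www.google.com')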
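
Because docker run -d returns before the server inside the container is up, connecting immediately can fail. The sketch below polls the server until it reports ready. It assumes the container answers the standard WebDriver GET /status request on port 4444 with a JSON body containing value.ready; wait_for_grid and grid_is_ready are hypothetical helper names.

```python
import json
import time
import urllib.request


def grid_is_ready(status_json):
    """Return True if a WebDriver /status response body reports ready."""
    return bool(status_json.get("value", {}).get("ready", False))


def wait_for_grid(base_url="http://localhost:4444", timeout=30.0, interval=0.5):
    """Poll base_url + '/status' until the server is ready or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(base_url + "/status", timeout=interval + 1) as resp:
                if grid_is_ready(json.load(resp)):
                    return True
        except (OSError, ValueError):
            # Connection refused/reset (OSError) or a non-JSON body (ValueError): retry.
            pass
        time.sleep(interval)
    return False
```

Call wait_for_grid() once after starting the container and create the webdriver.Remote session only if it returns True.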

Solution 3 - Python

from time import sleep

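from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service

chrome_options = Options()
chrome_options.add_argument("--headless")

# Selenium 4 takes the driver path via Service; the executable_path argument was removed in 4.10
driver = webdriver.Chrome(service=Service("./chromedriver"), options=chrome_options)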
url = "https://stackoverflow.com/questions/53657215/running-selenium-with-headless-chrome-webdriver"
driver.get(url)

sleep(5)

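# find_element_by_xpath was removed in Selenium 4.3; find_element("xpath", ...) works in old and new releases
h1 = driver.find_element("xpath", "//h1[@itemprop='name']").text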
print(h1)

Then I ran the script on my local machine:

➜ python script.py
Running Selenium with Headless Chrome Webdriver

It works, and it is running with headless Chrome.
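
The fixed sleep(5) in this script can be swapped for an explicit wait, which returns as soon as the element shows up instead of always pausing five seconds. This is a sketch using Selenium's WebDriverWait; heading_text is a hypothetical helper name and the 10 second timeout is an arbitrary choice.

```python
def heading_text(driver, xpath="//h1[@itemprop='name']", timeout=10):
    """Wait up to `timeout` seconds for the element at `xpath`, then return its text."""
    # Imported inside the function so this module loads even without Selenium.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # until() polls the condition and raises TimeoutException if it never holds.
    element = WebDriverWait(driver, timeout).until(
        EC.presence_of_element_located((By.XPATH, xpath))
    )
    return element.text
```

In the script above, print(heading_text(driver)) would then take the place of the sleep(5) and find_element lines.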

Solution 4 - Python

If you are in a Linux environment, you may have to add --no-sandbox as well, along with specific window size settings. The --no-sandbox flag is not needed on Windows if the user container is set up properly.

Use --disable-gpu only on Windows. Other platforms no longer require it. The --disable-gpu flag is a temporary workaround for a few bugs.

// Configure and start headless Chrome (Java)
WebDriverManager.chromedriver().setup();
ChromeOptions chromeOptions = new ChromeOptions();
chromeOptions.addArguments("--no-sandbox");
chromeOptions.addArguments("--headless");
chromeOptions.addArguments("--disable-gpu");
// chromeOptions.addArguments("--window-size=1400,2100"); // enable this on Linux
driver = new ChromeDriver(chromeOptions);
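
For a Python script, the platform rules described in this answer could be sketched as below. headless_chrome_args is a hypothetical helper name, and the default window size is simply the one from the commented-out line above.

```python
import sys


def headless_chrome_args(platform=sys.platform, window_size=(1400, 2100)):
    """Return Chrome command-line flags for a headless run on `platform`."""
    args = ["--headless"]
    if platform.startswith("linux"):
        # Linux: sandbox off plus an explicit window size, as suggested above.
        args.append("--no-sandbox")
        args.append("--window-size={},{}".format(*window_size))
    elif platform.startswith("win"):
        # Windows only: workaround flag for GPU-related bugs.
        args.append("--disable-gpu")
    return args
```

Each returned flag can then be passed to options.add_argument(...) on a Selenium Options object.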

Solution 5 - Python

Once you have Selenium and the web driver installed, the following worked for me with headless Chrome on a Linux cluster:

from selenium import webdriver
options = webdriver.ChromeOptions()
options.add_argument("--headless")
options.add_argument("--disable-extensions")
options.add_argument("--disable-dev-shm-usage")
options.add_argument("--no-sandbox")
options.add_experimental_option("prefs",{"download.default_directory":"/databricks/driver"})
driver = webdriver.Chrome(options=options)  # the chrome_options= keyword is deprecated in favor of options=
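
If the download directory varies between machines, the prefs dictionary can be built by a small helper. This is only a sketch: download_prefs is a hypothetical name, and resolving to an absolute path and creating the directory up front are precautions I am assuming are useful, not documented requirements.

```python
import os


def download_prefs(directory):
    """Return a Chrome 'prefs' dict pointing downloads at `directory`."""
    directory = os.path.abspath(directory)
    # Create the target directory if it does not exist yet.
    os.makedirs(directory, exist_ok=True)
    return {"download.default_directory": directory}
```

It would be used as options.add_experimental_option("prefs", download_prefs("/databricks/driver")).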

Solution 6 - Python

Steps (tested on a headless Debian Linux 9.4 server):

  1. Install Chrome and ChromeDriver:

     # install chrome
     curl -sS -o - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add -
     echo "deb [arch=amd64]  http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google-chrome.list
     apt-get -y update
     apt-get -y install google-chrome-stable
    
     # install chrome driver
     wget https://chromedriver.storage.googleapis.com/77.0.3865.40/chromedriver_linux64.zip
     unzip chromedriver_linux64.zip
     mv chromedriver /usr/bin/chromedriver
     chown root:root /usr/bin/chromedriver
     chmod +x /usr/bin/chromedriver
    
  2. Install selenium:

     pip install selenium
    

    and run this Python code:

     from selenium import webdriver
     from selenium.webdriver.chrome.options import Options
     options = Options()
     options.add_argument("no-sandbox")
     options.add_argument("headless")
     options.add_argument("start-maximized")
     options.add_argument("window-size=1900,1080")
     driver = webdriver.Chrome(options=options)  # chromedriver in /usr/bin is found on PATH
     driver.get("https://www.example.com")
     html = driver.page_source
     print(html)
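
The install steps pin ChromeDriver to 77.0.3865.40 while apt installs whatever google-chrome-stable is current, and ChromeDriver needs to match Chrome's major version. A quick check is sketched below; major_version and versions_match are hypothetical helper names, and the sketch assumes both binaries print a four-part version number when run with --version.

```python
import re
import subprocess


def major_version(version_output):
    """Extract the major version from text like 'Google Chrome 77.0.3865.90'."""
    match = re.search(r"(\d+)\.\d+\.\d+\.\d+", version_output)
    if match is None:
        raise ValueError("no version number found in: %r" % version_output)
    return int(match.group(1))


def versions_match(chrome_cmd="google-chrome", driver_cmd="/usr/bin/chromedriver"):
    """Return True if Chrome and ChromeDriver report the same major version."""
    chrome = subprocess.check_output([chrome_cmd, "--version"], text=True)
    driver = subprocess.check_output([driver_cmd, "--version"], text=True)
    return major_version(chrome) == major_version(driver)
```

If versions_match() returns False, download the ChromeDriver release that corresponds to the installed Chrome version instead of the one hard-coded above.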
    

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Rhynden | View Question on Stackoverflow
Solution 1 - Python | Pedro Lobito | View Answer on Stackoverflow
Solution 2 - Python | Max Malysh | View Answer on Stackoverflow
Solution 3 - Python | Serhii | View Answer on Stackoverflow
Solution 4 - Python | Devdun | View Answer on Stackoverflow
Solution 5 - Python | Nikunj Kakadiya | View Answer on Stackoverflow
Solution 6 - Python | Basj | View Answer on Stackoverflow