Can a website detect when you are using Selenium with chromedriver?
JavascriptPythonGoogle ChromeSeleniumSelenium ChromedriverJavascript Problem Overview
I've been testing out Selenium with Chromedriver and I noticed that some pages can detect that you're using Selenium even though there's no automation at all. Even when I'm just browsing manually just using Chrome through Selenium and Xephyr I often get a page saying that suspicious activity was detected. I've checked my user agent, and my browser fingerprint, and they are all exactly identical to the normal Chrome browser.
When I browse to these sites in normal Chrome everything works fine, but the moment I use Selenium I'm detected.
In theory, chromedriver and Chrome should look literally exactly the same to any webserver, but somehow they can detect it.
If you want some test code try out this:
from pyvirtualdisplay import Display
from selenium import webdriver
display = Display(visible=1, size=(1600, 902))
display.start()
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--disable-extensions')
chrome_options.add_argument('--profile-directory=Default')
chrome_options.add_argument("--incognito")
chrome_options.add_argument("--disable-plugins-discovery");
chrome_options.add_argument("--start-maximized")
driver = webdriver.Chrome(chrome_options=chrome_options)
driver.delete_all_cookies()
driver.set_window_size(800,800)
driver.set_window_position(0,0)
print 'arguments done'
driver.get('http://stubhub.com')
If you browse around stubhub you'll get redirected and 'blocked' within one or two requests. I've been investigating this and I can't figure out how they can tell that a user is using Selenium.
How do they do it?
I installed the Selenium IDE plugin in Firefox and I got banned when I went to stubhub.com in the normal Firefox browser with only the additional plugin.
When I use Fiddler to view the HTTP requests being sent back and forth I've noticed that the 'fake browser's' requests often have 'no-cache' in the response header.
Results like this https://stackoverflow.com/questions/3614472/is-there-a-way-to-detect-that-im-in-a-selenium-webdriver-page-from-javascript suggest that there should be no way to detect when you are using a webdriver. But this evidence suggests otherwise.
The site uploads a fingerprint to their servers, but I checked and the fingerprint of Selenium is identical to the fingerprint when using Chrome.
This is one of the fingerprint payloads that they send to their servers:
{"appName":"Netscape","platform":"Linuxx86_64","cookies":1,"syslang":"en-US","userlang":"en-
US","cpu":"","productSub":"20030107","setTimeout":1,"setInterval":1,"plugins":
{"0":"ChromePDFViewer","1":"ShockwaveFlash","2":"WidevineContentDecryptionMo
dule","3":"NativeClient","4":"ChromePDFViewer"},"mimeTypes":
{"0":"application/pdf","1":"ShockwaveFlashapplication/x-shockwave-
flash","2":"FutureSplashPlayerapplication/futuresplash","3":"WidevineContent
DecryptionModuleapplication/x-ppapi-widevine-
cdm","4":"NativeClientExecutableapplication/x-
nacl","5":"PortableNativeClientExecutableapplication/x-
pnacl","6":"PortableDocumentFormatapplication/x-google-chrome-
pdf"},"screen":{"width":1600,"height":900,"colorDepth":24},"fonts":
{"0":"monospace","1":"DejaVuSerif","2":"Georgia","3":"DejaVuSans","4":"Trebu
chetMS","5":"Verdana","6":"AndaleMono","7":"DejaVuSansMono","8":"LiberationM
ono","9":"NimbusMonoL","10":"CourierNew","11":"Courier"}}
It's identical in Selenium and in Chrome.
VPNs work for a single use, but they get detected after I load the first page. Clearly some JavaScript is being run to detect Selenium.
Javascript Solutions
Solution 1 - Javascript
Basically, the way the Selenium detection works, is that they test for predefined JavaScript variables which appear when running with Selenium. The bot detection scripts usually look anything containing word "selenium" / "webdriver" in any of the variables (on window object), and also document variables called $cdc_
and $wdc_
. Of course, all of this depends on which browser you are on. All the different browsers expose different things.
For me, I used Chrome, so, all that I had to do was to ensure that $cdc_
didn't exist anymore as a document variable, and voilà (download chromedriver source code, modify chromedriver and re-compile $cdc_
under different name.)
This is the function I modified in chromedriver:
File call_function.js:
function getPageCache(opt_doc) {
var doc = opt_doc || document;
//var key = '$cdc_asdjflasutopfhvcZLmcfl_';
var key = 'randomblabla_';
if (!(key in doc))
doc[key] = new Cache();
return doc[key];
}
(Note the comment. All I did I turned $cdc_
to randomblabla_
.)
Here is pseudocode which demonstrates some of the techniques that bot networks might use:
runBotDetection = function () {
var documentDetectionKeys = [
"__webdriver_evaluate",
"__selenium_evaluate",
"__webdriver_script_function",
"__webdriver_script_func",
"__webdriver_script_fn",
"__fxdriver_evaluate",
"__driver_unwrapped",
"__webdriver_unwrapped",
"__driver_evaluate",
"__selenium_unwrapped",
"__fxdriver_unwrapped",
];
var windowDetectionKeys = [
"_phantom",
"__nightmare",
"_selenium",
"callPhantom",
"callSelenium",
"_Selenium_IDE_Recorder",
];
for (const windowDetectionKey in windowDetectionKeys) {
const windowDetectionKeyValue = windowDetectionKeys[windowDetectionKey];
if (window[windowDetectionKeyValue]) {
return true;
}
};
for (const documentDetectionKey in documentDetectionKeys) {
const documentDetectionKeyValue = documentDetectionKeys[documentDetectionKey];
if (window['document'][documentDetectionKeyValue]) {
return true;
}
};
for (const documentKey in window['document']) {
if (documentKey.match(/\$[a-z]dc_/) && window['document'][documentKey]['cache_']) {
return true;
}
}
if (window['external'] && window['external'].toString() && (window['external'].toString()['indexOf']('Sequentum') != -1)) return true;
if (window['document']['documentElement']['getAttribute']('selenium')) return true;
if (window['document']['documentElement']['getAttribute']('webdriver')) return true;
if (window['document']['documentElement']['getAttribute']('driver')) return true;
return false;
};
According to user szx, it is also possible to simply open chromedriver.exe in a hex editor, and just do the replacement manually, without actually doing any compiling.
Solution 2 - Javascript
cdc_
string
Replacing You can use vim
or perl
to replace the cdc_
string in chromedriver
. See answer by @Erti-Chris Eelmaa to learn more about that string and how it's a detection point.
Using vim
or perl
prevents you from having to recompile source code or use a hex-editor.
Make sure to make a copy of the original chromedriver
before attempting to edit it.
Our goal is to alter the cdc_
string, which looks something like $cdc_lasutopfhvcZLmcfl
.
The methods below were tested on chromedriver version 2.41.578706
.
Using Vim
vim /path/to/chromedriver
After running the line above, you'll probably see a bunch of gibberish. Do the following:
- Replace all instances of
cdc_
withdog_
by typing:%s/cdc_/dog_/g
.dog_
is just an example. You can choose anything as long as it has the same amount of characters as the search string (e.g.,cdc_
), otherwise thechromedriver
will fail.
- To save the changes and quit, type
:wq!
and pressreturn
.- If you need to quit without saving changes, type
:q!
and pressreturn
.
- If you need to quit without saving changes, type
Using Perl
The line below replaces all cdc_
occurrences with dog_
. Credit to Vic Seedoubleyew:
perl -pi -e 's/cdc_/dog_/g' /path/to/chromedriver
Make sure that the replacement string (e.g., dog_
) has the same number of characters as the search string (e.g., cdc_
), otherwise the chromedriver
will fail.
Wrapping Up
To verify that all occurrences of cdc_
were replaced:
grep "cdc_" /path/to/chromedriver
If no output was returned, the replacement was successful.
Go to the altered chromedriver
and double click on it. A terminal window should open up. If you don't see killed
in the output, you've successfully altered the driver.
Make sure that the name of the altered chromedriver
binary is chromedriver
, and that the original binary is either moved from its original location or renamed.
My Experience With This Method
I was previously being detected on a website while trying to log in, but after replacing cdc_
with an equal sized string, I was able to log in. Like others have said though, if you've already been detected, you might get blocked for a plethora of other reasons even after using this method. So you may have to try accessing the site that was detecting you using a VPN, different network, etc.
Solution 3 - Javascript
As we've already figured out in the question and the posted answers, there is an anti Web-scraping and a Bot detection service called "Distil Networks" in play here. And, according to the company CEO's interview:
> Even though they can create new bots, we figured out a way to identify > Selenium the a tool they’re using, so we’re blocking Selenium no > matter how many times they iterate on that bot. We’re doing that now > with Python and a lot of different technologies. Once we see a pattern > emerge from one type of bot, then we work to reverse engineer the > technology they use and identify it as malicious.
It'll take time and additional challenges to understand how exactly they are detecting Selenium, but what can we say for sure at the moment:
- it's not related to the actions you take with selenium - once you navigate to the site, you get immediately detected and banned. I've tried to add artificial random delays between actions, take a pause after the page is loaded - nothing helped
- it's not about browser fingerprint either - tried it in multiple browsers with clean profiles and not, incognito modes - nothing helped
- since, according to the hint in the interview, this was "reverse engineering", I suspect this is done with some JS code being executed in the browser revealing that this is a browser automated via selenium webdriver
Decided to post it as an answer, since clearly:
> Can a website detect when you are using selenium with chromedriver?
Yes.
Also, what I haven't experimented with is older selenium and older browser versions - in theory, there could be something implemented/added to selenium at a certain point that Distil Networks bot detector currently relies on. Then, if this is the case, we might detect (yeah, let's detect the detector) at what point/version a relevant change was made, look into changelog and changesets and, may be, this could give us more information on where to look and what is it they use to detect a webdriver-powered browser. It's just a theory that needs to be tested.
Solution 4 - Javascript
A lot have been analyzed and discussed about a website being detected being driven by Selenium controlled ChromeDriver. Here are my two cents:
According to the article Browser detection using the user agent serving different webpages or services to different browsers is usually not among the best of ideas. The web is meant to be accessible to everyone, regardless of which browser or device an user is using. There are best practices outlined to develop a website to progressively enhance itself based on the feature availability rather than by targeting specific browsers.
However, browsers and standards are not perfect, and there are still some edge cases where some websites still detects the browser and if the browser is driven by Selenium controled WebDriver. Browsers can be detected through different ways and some commonly used mechanisms are as follows:
- Implementing [tag:captcha] / [tag:recaptcha] to detect the automatic bots.
>You can find a relevant detailed discussion in How does recaptcha 3 know I'm using selenium/chromedriver?
- Detecting the term HeadlessChrome within headless Chrome UserAgent
>You can find a relevant detailed discussion in Access Denied page with headless Chrome on Linux while headed Chrome works on windows using Selenium through Python
- Using Bot Management service from Distil Networks
>You can find a relevant detailed discussion in Unable to use Selenium to automate Chase site login
- Using Bot Manager service from Akamai
>You can find a relevant detailed discussion in Dynamic dropdown doesn't populate with auto suggestions on https://www.nseindia.com/ when values are passed using Selenium and Python
- Using Bot Protection service from Datadome
>You can find a relevant detailed discussion in Website using DataDome gets captcha blocked while scraping using Selenium and Python
However, using the [tag:user-agent] to detect the browser looks simple but doing it well is in fact a bit tougher.
>Note: At this point it's worth to mention that: it's very rarely a good idea to use user agent sniffing. There are always better and more broadly compatible way to address a certain issue.
Considerations for browser detection
The idea behind detecting the browser can be either of the following:
- Trying to work around a specific bug in some specific variant or specific version of a webbrowser.
- Trying to check for the existence of a specific feature that some browsers don't yet support.
- Trying to provide different HTML depending on which browser is being used.
Alternative of browser detection through UserAgents
Some of the alternatives of browser detection are as follows:
- Implementing a test to detect how the browser implements the API of a feature and determine how to use it from that. An example was Chrome unflagged experimental lookbehind support in regular expressions.
- Adapting the design technique of Progressive enhancement which would involve developing a website in layers, using a bottom-up approach, starting with a simpler layer and improving the capabilities of the site in successive layers, each using more features.
- Adapting the top-down approach of Graceful degradation in which we build the best possible site using all the features we want and then tweak it to make it work on older browsers.
Solution
To prevent the Selenium driven WebDriver from getting detected, a niche approach would include either/all of the below mentioned approaches:
-
Rotating the UserAgent in every execution of your Test Suite using
fake_useragent
module as follows:from selenium import webdriver from selenium.webdriver.chrome.options import Options from fake_useragent import UserAgent options = Options() ua = UserAgent() userAgent = ua.random print(userAgent) options.add_argument(f'user-agent={userAgent}') driver = webdriver.Chrome(chrome_options=options, executable_path=r'C:\WebDrivers\ChromeDriver\chromedriver_win32\chromedriver.exe') driver.get("https://www.google.co.in") driver.quit()
>You can find a relevant detailed discussion in Way to change Google Chrome user agent in Selenium?
-
Rotating the UserAgent in each of your Tests using
Network.setUserAgentOverride
throughexecute_cdp_cmd()
as follows:from selenium import webdriver driver = webdriver.Chrome(executable_path=r'C:\WebDrivers\chromedriver.exe') print(driver.execute_script("return navigator.userAgent;")) # Setting user agent as Chrome/83.0.4103.97 driver.execute_cdp_cmd('Network.setUserAgentOverride', {"userAgent": 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36'}) print(driver.execute_script("return navigator.userAgent;"))
>You can find a relevant detailed discussion in How to change the User Agent using Selenium and Python
-
Changing the property value of
navigator
for webdriver toundefined
as follows:driver.execute_cdp_cmd("Page.addScriptToEvaluateOnNewDocument", { "source": """ Object.defineProperty(navigator, 'webdriver', { get: () => undefined }) """ })
>You can find a relevant detailed discussion in Selenium webdriver: Modifying navigator.webdriver flag to prevent selenium detection
- Changing the values of
navigator.plugins
,navigator.languages
, WebGL, hairline feature, missing image, etc.
>You can find a relevant detailed discussion in Is there a version of selenium webdriver that is not detectable?
- Changing the conventional Viewport
>You can find a relevant detailed discussion in How to bypass Google captcha with Selenium and python?
Dealing with reCAPTCHA
While dealing with [tag:2captcha] and [tag:recaptcha-v3] rather clicking on [tag:checkbox] associated to the text I'm not a robot, it may be easier to get authenticated extracting and using the data-sitekey
.
>You can find a relevant detailed discussion in How to identify the 32 bit data-sitekey of ReCaptcha V2 to obtain a valid response programmatically using Selenium and Python Requests?
tl; dr
You can find a cutting edge solution to evade webdriver detection in:
Solution 5 - Javascript
Example of how it's implemented on wellsfargo.com:
try {
if (window.document.documentElement.getAttribute("webdriver")) return !+[]
} catch (IDLMrxxel) {}
try {
if ("_Selenium_IDE_Recorder" in window) return !+""
} catch (KknKsUayS) {}
try {
if ("__webdriver_script_fn" in document) return !+""
Solution 6 - Javascript
I have checked the chromedriver source code. That injects some javascript files to the browser. Obfuscating JavaScripts result
Every javascript file on this link is injected to the web pages: https://chromium.googlesource.com/chromium/src/+/master/chrome/test/chromedriver/js/
So I used reverse engineering and obfuscated the js files by Hex editing. Now i was sure that no more javascript variable, function names and fixed strings were used to uncover selenium activity. But still some sites and reCaptcha detect selenium!
Maybe they check the modifications that are caused by chromedriver js execution :)
Edit 1:
I discovered there are some parameters in 'navigator' that briefly uncover using of chromedriver. These are the parameters: Chrome 'navigator' parameters modification
- "navigator.webdriver" On non-automated mode it is 'undefined'. On automated mode it's 'true'.
- "navigator.plugins" On headless chrome has 0 length. So I added some fake elements to fool the plugin length checking process.
- "navigator.languages" was set to default chrome value '["en-US", "en", "es"]' .
So what i needed was a chrome extension to run javascript on the web pages. I made an extension with the https://stackoverflow.com/questions/50573455/modifying-the-javascript-navigator-object-with-selenium">js code provided in the article and used https://stackoverflow.com/questions/16511384/using-extensions-with-selenium-python">another article to add the zipped extension to my project. I have successfully changed the values; But still nothing changed!
I didn't find other variables like these but it doesn't mean that they don't exist. Still reCaptcha detects chromedriver, So there should be more variables to change. The next step should be reverse engineering of the detector services that i don't want to do.
Now I'm not sure does it worth to spend more time on this automation process or search for alternative methods!
Solution 7 - Javascript
Try to use selenium with a specific user profile of chrome, That way you can use it as specific user and define any thing you want, When doing so it will run as a 'real' user, look at chrome process with some process explorer and you'll see the difference with the tags.
For example:
username = os.getenv("USERNAME")
userProfile = "C:\\Users\\" + username + "\\AppData\\Local\\Google\\Chrome\\User Data\\Default"
options = webdriver.ChromeOptions()
options.add_argument("user-data-dir={}".format(userProfile))
# add here any tag you want.
options.add_experimental_option("excludeSwitches", ["ignore-certificate-errors", "safebrowsing-disable-download-protection", "safebrowsing-disable-auto-update", "disable-client-side-phishing-detection"])
chromedriver = "C:\Python27\chromedriver\chromedriver.exe"
os.environ["webdriver.chrome.driver"] = chromedriver
browser = webdriver.Chrome(executable_path=chromedriver, chrome_options=options)
chrome tag list here
Solution 8 - Javascript
> partial interface Navigator { readonly attribute boolean webdriver; };
> The webdriver IDL attribute of the Navigator interface must return the value of the webdriver-active flag, which is initially false.
> This property allows websites to determine that the user agent is under control by WebDriver, and can be used to help mitigate denial-of-service attacks.
Taken directly from the 2017 W3C Editor's Draft of WebDriver. This heavily implies that at the very least, future iterations of selenium's drivers will be identifiable to prevent misuse. Ultimately, it's hard to tell without the source code, what exactly causes chrome driver in specific to be detectable.
Solution 9 - Javascript
Firefox is said to set window.navigator.webdriver === true
if working with a webdriver. That was according to one of the older specs (e.g.: archive.org) but I couldn't find it in the new one except for some very vague wording in the appendices.
A test for it is in the selenium code in the file fingerprint_test.js where the comment at the end says "Currently only implemented in firefox" but I wasn't able to identify any code in that direction with some simple grep
ing, neither in the current (41.0.2) Firefox release-tree nor in the Chromium-tree.
I also found a comment for an older commit regarding fingerprinting in the firefox driver b82512999938 from January 2015. That code is still in the Selenium GIT-master downloaded yesterday at javascript/firefox-driver/extension/content/server.js
with a comment linking to the slightly differently worded appendix in the current w3c webdriver spec.
Solution 10 - Javascript
Additionally to the great answer of Erti-Chris Eelmaa - there's annoying window.navigator.webdriver
and it is read-only. Event if you change the value of it to false
it will still have true
. That's why the browser driven by automated software can still be detected.
The variable is managed by the flag --enable-automation
in chrome. The chromedriver launches Chrome with that flag and Chrome sets the window.navigator.webdriver
to true
. You can find it here. You need to add to "exclude switches" the flag. For instance (Go):
package main
import (
"github.com/tebeka/selenium"
"github.com/tebeka/selenium/chrome"
)
func main() {
caps := selenium.Capabilities{
"browserName": "chrome",
}
chromeCaps := chrome.Capabilities{
Path: "/path/to/chrome-binary",
ExcludeSwitches: []string{"enable-automation"},
}
caps.AddChrome(chromeCaps)
wd, err := selenium.NewRemote(caps, fmt.Sprintf("http://localhost:%d/wd/hub", 4444))
}
Solution 11 - Javascript
It works for some websites, remove property webdriver from navigator
from selenium import webdriver
driver = webdriver.Chrome()
driver.execute_cdp_cmd("Page.addScriptToEvaluateOnNewDocument", {
"source":
"const newProto = navigator.__proto__;"
"delete newProto.webdriver;"
"navigator.__proto__ = newProto;"
})
Solution 12 - Javascript
With the availability of Selenium Stealth evading the detection of Selenium driven ChromeDriver initiated [tag:google-chrome] Browsing Context have become much more easier.
selenium-stealth
selenium-stealth is a python package to prevent detection. This programme tries to make python selenium more stealthy. However, as of now selenium-stealth only support Selenium Chrome.
Features that currently selenium-stealth can offer:
-
selenium-stealth with stealth passes all public bot tests.
-
With selenium-stealth selenium can do google account login.
-
selenium-stealth help with maintaining a normal reCAPTCHA v3 score
Installation
Selenium-stealth is available on PyPI so you can install with pip as follows:
$ pip install selenium-stealth
> [tag:selenium4] compatible code
-
Code Block:
from selenium import webdriver from selenium.webdriver.chrome.options import Options from selenium.webdriver.chrome.service import Service from selenium_stealth import stealth options = Options() options.add_argument("start-maximized") # Chrome is controlled by automated test software options.add_experimental_option("excludeSwitches", ["enable-automation"]) options.add_experimental_option('useAutomationExtension', False) s = Service('C:\\BrowserDrivers\\chromedriver.exe') driver = webdriver.Chrome(service=s, options=options) # Selenium Stealth settings stealth(driver, languages=["en-US", "en"], vendor="Google Inc.", platform="Win32", webgl_vendor="Intel Inc.", renderer="Intel Iris OpenGL Engine", fix_hairline=True, ) driver.get("https://bot.sannysoft.com/")
-
Browser Screenshot:
tl; dr
You can find a couple of relevant detailed discussion in:
Solution 13 - Javascript
One more thing I found is that some websites uses a platform that checks the User Agent. If the value contains: "HeadlessChrome" the behavior can be weird when using headless mode.
The workaround for that will be to override the user agent value, for example in Java:
chromeOptions.addArguments("--user-agent=Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36");
Solution 14 - Javascript
It sounds like they are behind a web application firewall. Take a look at modsecurity and OWASP to see how those work.
In reality, what you are asking is how to do bot detection evasion. That is not what Selenium WebDriver is for. It is for testing your web application not hitting other web applications. It is possible, but basically, you'd have to look at what a WAF looks for in their rule set and specifically avoid it with selenium if you can. Even then, it might still not work because you don't know what WAF they are using.
You did the right first step, that is, faking the user agent. If that didn't work though, then a WAF is in place and you probably need to get more tricky.
Point taken from other answer. Make sure your user agent is actually being set correctly first. Maybe have it hit a local web server or sniff the traffic going out.
Solution 15 - Javascript
The bot detection I've seen seems more sophisticated or at least different than what I've read through in the answers below.
EXPERIMENT 1:
- I open a browser and web page with Selenium from a Python console.
- The mouse is already at a specific location where I know a link will appear once the page loads. I never move the mouse.
- I press the left mouse button once (this is necessary to take focus from the console where Python is running to the browser).
- I press the left mouse button again (remember, cursor is above a given link).
- The link opens normally, as it should.
EXPERIMENT 2:
-
As before, I open a browser and the web page with Selenium from a Python console.
-
This time around, instead of clicking with the mouse, I use Selenium (in the Python console) to click the same element with a random offset.
-
The link doesn't open, but I am taken to a sign up page.
IMPLICATIONS:
- opening a web browser via Selenium doesn't preclude me from appearing human
- moving the mouse like a human is not necessary to be classified as human
- clicking something via Selenium with an offset still raises the alarm
Seems mysterious, but I guess they can just determine whether an action originates from Selenium or not, while they don't care whether the browser itself was opened via Selenium or not. Or can they determine if the window has focus? Would be interesting to hear if anyone has any insights.
Solution 16 - Javascript
Even if you are sending all the right data (e.g. Selenium doesn't show up as an extension, you have a reasonable resolution/bit-depth, &c), there are a number of services and tools which profile visitor behaviour to determine whether the actor is a user or an automated system.
For example, visiting a site then immediately going to perform some action by moving the mouse directly to the relevant button, in less than a second, is something no user would actually do.
It might also be useful as a debugging tool to use a site such as https://panopticlick.eff.org/ to check how unique your browser is; it'll also help you verify whether there are any specific parameters that indicate you're running in Selenium.
Solution 17 - Javascript
Some sites are detecting this:
function d() {
try {
if (window.document.$cdc_asdjflasutopfhvcZLmcfl_.cache_)
return !0
} catch (e) {}
try {
//if (window.document.documentElement.getAttribute(decodeURIComponent("%77%65%62%64%72%69%76%65%72")))
if (window.document.documentElement.getAttribute("webdriver"))
return !0
} catch (e) {}
try {
//if (decodeURIComponent("%5F%53%65%6C%65%6E%69%75%6D%5F%49%44%45%5F%52%65%63%6F%72%64%65%72") in window)
if ("_Selenium_IDE_Recorder" in window)
return !0
} catch (e) {}
try {
//if (decodeURIComponent("%5F%5F%77%65%62%64%72%69%76%65%72%5F%73%63%72%69%70%74%5F%66%6E") in document)
if ("__webdriver_script_fn" in document)
return !0
} catch (e) {}
Solution 18 - Javascript
It seems to me the simplest way to do it with Selenium is to intercept the XHR that sends back the browser fingerprint.
But since this is a Selenium-only problem, its better just to use something else. Selenium is supposed to make things like this easier, not way harder.
Solution 19 - Javascript
All I had to do is:
my_options = webdriver.ChromeOptions()
my_options.add_argument( '--disable-blink-features=AutomationControlled' )
Some more info to this: This relates to website skyscanner.com. In the past I have been able to scrape it. Yes, it did detect the browser automation and it gave me a captcha to press and hold a button. I used to be able to complete the captcha manually, then search flights and then scrape. But this time around after completing the captcha I get the same captcha again and again, just can't seem to escape from it. I tried some of the most popular suggestions to avoid automation being detected, but they didn't work. Then I found this article which did work, and by process of elimination I found out it only took the option above to get around their browser automation detection. Now I don't even get the captcha and everything else seems to be working normally.
Versions I am running currently:
-
OS: Windows 7 64bit
-
Browser: Chrome Version 100.0.4896.60 (Official Build) (64-bit)
-
Selenium 4.1.3
-
ChromeDriver 100.0.4896.60 chromedriver_win32.zip 930ff33ae8babeaa74e0dd1ce1dae7ff
Solution 20 - Javascript
Write an html page with the following code. You will see that in the DOM selenium applies a webdriver attribute in the outerHTML
<html>
<head>
<script type="text/javascript">
<!--
function showWindow(){
javascript:(alert(document.documentElement.outerHTML));
}
//-->
</script>
</head>
<body>
<form>
<input type="button" value="Show outerHTML" onclick="showWindow()">
</form>
</body>
</html>
Solution 21 - Javascript
You can try to use the parameter "enable-automation"
var options = new ChromeOptions();
// hide selenium
options.AddExcludedArguments(new List<string>() { "enable-automation" });
var driver = new ChromeDriver(ChromeDriverService.CreateDefaultService(), options);
But, I want to warn that this ability was fixed in ChromeDriver 79.0.3945.16. So probably you should use older versions of chrome.
Also, as another option, you can try using InternetExplorerDriver instead of Chrome. As for me, IE does not block at all without any hacks.
And for more info try to take a look here:
Solution 22 - Javascript
I've found changing the JavaScript "key" variable like this:
//Fools the website into believing a human is navigating it
((JavascriptExecutor)driver).executeScript("window.key = \"blahblah\";");
works for some websites when using Selenium WebDriver along with Google Chrome, since many sites check for this variable in order to avoid being scraped by Selenium.
Solution 23 - Javascript
Answer: YES
Some sites will detect selenium by the browser's fingeprints and other data, other sites will detect selenium based on behavior, not only based on what you do, but what you don't do as well.
Usually with the data that selenium provides is enough to detect it.
you can check the browser fingerprints in sites like this ones
https://bot.sannysoft.com
https://fingerprintjs.github.io/fingerprintjs/
https://antoinevastel.com/bots/
try with your user browser, then try with selenium, you'll see the differences.
You can change some fingerprints with options(), like user agent and others, see the results by yourself.
You can try to avoid this detection by many ways, I recommend using this library:undetected_chromedriver:
https://github.com/ultrafunkamsterdam/undetected-chromedriver
import undetected_chromedriver.v2 as uc
Else you can try using an alternative to selenium. I heard of PhantomJS, but didn't tried.
Solution 24 - Javascript
i have the same problem and solved the issue with the following configuration (in c#)
options.AddArguments("start-maximized");
options.AddArguments("--user-agent=Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36");
options.AddExcludedArgument("enable-automation");//for hiding chrome being controlled by automation..
options.AddAdditionalCapability("useAutomationExtension", false);
//import cookies
options.AddArguments("user-data-dir=" + userDataDir);
options.AddArguments("profile-directory=" + profileDir);