PhantomJS failing to open HTTPS site
Https Problem Overview
I'm using the following code, based on the loadspeed.js example, to open an https:// site which also requires HTTP server authentication.
var page = require('webpage').create(),
    system = require('system'),
    t, address;

page.settings.userName = 'myusername';
page.settings.password = 'mypassword';

if (system.args.length === 1) {
    console.log('Usage: scrape.js <some URL>');
    phantom.exit();
} else {
    t = Date.now();
    address = system.args[1];
    page.open(address, function (status) {
        if (status !== 'success') {
            console.log('FAIL to load the address');
        } else {
            t = Date.now() - t;
            console.log('Page title is ' + page.evaluate(function () {
                return document.title;
            }));
            console.log('Loading time ' + t + ' msec');
        }
        phantom.exit();
    });
}
It fails to load the page every time. What could be wrong here? Do secured sites need to be handled any differently? The site can be accessed successfully from a browser, though.
I'm just starting with Phantom and find it too good to stop playing around with, even though I'm not making progress on this issue.
Https Solutions
Solution 1 - Https
I tried Fred's and Cameron Tinker's answers, but only the --ssl-protocol=any option seemed to help me:
phantomjs --ssl-protocol=any test.js
Also, I think it should be way safer to use --ssl-protocol=any, as you are still using encryption, whereas --ignore-ssl-errors=true will ignore (duh) all SSL errors, including malicious ones.
Solution 2 - Https
The problem is most likely due to SSL certificate errors. If you start phantomjs with the --ignore-ssl-errors=yes option, it should proceed to load the page as it would if there were no SSL errors:
phantomjs --ignore-ssl-errors=yes [phantomOptions] script.js [scriptOptions]
I've seen a few websites with incorrectly configured SSL certificates, expired certificates, and so on. A complete list of command line options for phantomjs is available here: http://phantomjs.org/api/command-line.html. I hope this helps.
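To check whether SSL really is the cause before ignoring errors, it can help to print the underlying resource error rather than just 'FAIL'. The following is a minimal sketch, assuming PhantomJS 1.9 or later (where page.onResourceError is available) and that the URL is passed as the first script argument. formatResourceError is a hypothetical helper name, not part of PhantomJS.

```javascript
// Hypothetical helper (not a PhantomJS API): turn a resource error into one log line.
function formatResourceError(resourceError) {
    return 'Unable to load ' + resourceError.url +
        ' (error ' + resourceError.errorCode + '): ' + resourceError.errorString;
}

// Only wire up the page when actually running under PhantomJS.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    var system = require('system');

    // Called by PhantomJS whenever a resource (including the main page) fails to load.
    page.onResourceError = function (resourceError) {
        console.log(formatResourceError(resourceError));
    };

    // Assumption: the URL to test is the first script argument.
    page.open(system.args[1], function (status) {
        console.log('Status: ' + status);
        phantom.exit(status === 'success' ? 0 : 1);
    });
}
```

If the logged error string mentions SSL or the handshake, the command line options discussed in these answers are the place to look; if it mentions something else, ignoring SSL errors is unlikely to help.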
Solution 3 - Https
Note that as of 2014-10-16, PhantomJS defaults to using SSLv3 to open HTTPS connections. With the POODLE vulnerability recently announced, many servers are disabling SSLv3 support.
To get around that, you should be able to run PhantomJS with:
phantomjs --ssl-protocol=tlsv1
Hopefully, PhantomJS will be updated soon to make TLSv1 the default instead of SSLv3.
Solution 4 - Https
I experienced the same issue. --ignore-ssl-errors=yes was not enough to fix it for me; I had to do two more things:
- change the user-agent
- try all the ssl-protocols; the only one that worked for the page in question was tlsv1
Hope this helps...
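For the user-agent part, one way to do it is through page.settings.userAgent inside the script. This is only a sketch: useBrowserUserAgent is a hypothetical helper name, and the User-Agent string is an example value, so substitute the one sent by a browser that can open the site.

```javascript
// Hypothetical helper (not a PhantomJS API): set a browser-like User-Agent on a page.
function useBrowserUserAgent(page, userAgent) {
    page.settings.userAgent = userAgent;
    return page;
}

// Example value only; replace it with the User-Agent of a browser that works for you.
var EXAMPLE_USER_AGENT = 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 ' +
    '(KHTML, like Gecko) Chrome/37.0.2062.120 Safari/537.36';

// Only create and open a page when actually running under PhantomJS.
if (typeof phantom !== 'undefined') {
    var page = useBrowserUserAgent(require('webpage').create(), EXAMPLE_USER_AGENT);

    // Assumption: the URL to open is the first script argument.
    page.open(require('system').args[1], function (status) {
        console.log('Status: ' + status);
        phantom.exit(status === 'success' ? 0 : 1);
    });
}
```

The user-agent has to be set before page.open is called, otherwise the first request still goes out with the default PhantomJS string.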
Solution 5 - Https
I experienced the same problem (casperjs 1.1.0-beta3/phantomjs 1.9.7). Using --ignore-ssl-errors=yes and --ssl-protocol=tlsv1 solved it. Using only one of the options did not solve it for me.
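If you launch phantomjs from another program, the two options can be combined in one argument list. Below is a rough sketch for Node.js (an assumption, the answer itself does not mention Node); phantomArgs is a hypothetical helper name, and the option object keys are made up for illustration.

```javascript
// Hypothetical helper: build the argument list for a phantomjs invocation.
// Only the two flags discussed in these answers are handled.
function phantomArgs(options, script, scriptArgs) {
    var args = [];
    if (options.ignoreSslErrors) {
        args.push('--ignore-ssl-errors=yes');
    }
    if (options.sslProtocol) {
        // e.g. 'tlsv1' or 'any', as used elsewhere on this page
        args.push('--ssl-protocol=' + options.sslProtocol);
    }
    // PhantomJS options must come before the script name; script arguments come after it.
    return args.concat([script], scriptArgs || []);
}
```

The result could then be passed to something like require('child_process').execFile('phantomjs', phantomArgs({ ignoreSslErrors: true, sslProtocol: 'tlsv1' }, 'test.js', [url]), callback), assuming phantomjs is on your PATH.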
Solution 6 - Https
I was receiving "Error creating SSL context" from PhantomJS (running on CentOS 6.6).
Building from source fixed it for me. Don't forget to use the phantomjs that you built (instead of /usr/local/bin/phantomjs, if you have it).
sudo yum -y install gcc gcc-c++ make flex bison gperf ruby openssl-devel freetype-devel fontconfig-devel libicu-devel sqlite-devel libpng-devel libjpeg-devel
git clone git://github.com/ariya/phantomjs.git
cd phantomjs
git checkout 2.0
./build.sh
cd bin/
./phantomjs <your JS file>
Solution 7 - Https
If someone is using PhantomJS with Sahi, the --ignore-ssl-errors option needs to go in your browser_types.xml file. It worked for me.
<browserType>
<name>phantomjs</name>
<displayName>PhantomJS</displayName>
<icon>safari.png</icon>
<path>/usr/local/Cellar/phantomjs/1.9.2/bin/phantomjs</path>
<options>--ignore-ssl-errors=yes --debug=yes --proxy=localhost:9999 /usr/local/Cellar/phantomjs/phantom-sahi.js</options>
<processName>"PhantomJS"</processName>
<capacity>100</capacity>
<force>true</force>
</browserType>
Solution 8 - Https
What about shebang? If you're using a shebang to execute phantomjs scripts, use the following shebang line:
#!/usr/bin/phantomjs --ignore-ssl-errors=yes
var system = require('system');
var webpage = require('webpage');
// ... rest of your script
Use any of the above answers. I personally like --ignore-ssl-errors=yes, since validating my loopback web servers' self-signed certificate is irrelevant.
Solution 9 - Https
None of the other answers here helped me; it may be that the specific site(s) I was working with were too picky with their HTTP headers. This is what worked:
var page = require('webpage').create();
page.customHeaders = {
    "Connection": "keep-alive"
};
I found out that PhantomJS was using "Keep-Alive" (capitalized), and the connection was not being kept alive. :)
Solution 10 - Https
I was getting "SSL Handshake Failed" yesterday. I tried many combinations of PhantomJS options (--ignore-ssl-errors=yes etc.), but none of them worked.
Upgrading to phantomJS 2.1.1 fixed it.
I used the phantomJS installation instructions at https://gist.github.com/julionc/7476620, changing the phantomJS version to 2.1.1.
Solution 11 - Https
On the machine where you run phantomjs to connect to the remote server, run "openssl ciphers". Copy and paste the ciphers listed into the --ssl-ciphers="" command line option. This tells the web server you are connecting to which ciphers are available for communicating with your client. If you don't set the ones available on your own machine, the default setting applies, which is based on the ciphers that modern browsers use, and the server can pick one of those that your machine does not understand.
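The copy and paste step can also be scripted. Here is a small sketch, assuming you capture the output of "openssl ciphers" yourself (for example from Node.js) and that it is the usual colon-separated list on one line. sslCiphersOption is a hypothetical helper name.

```javascript
// Hypothetical helper: turn the output of `openssl ciphers` into a --ssl-ciphers option.
function sslCiphersOption(opensslOutput) {
    var ciphers = opensslOutput
        .trim()                // drop the trailing newline printed by openssl
        .split(':')            // the list is colon-separated
        .filter(function (name) { return name.length > 0; });
    return '--ssl-ciphers=' + ciphers.join(':');
}
```

When the option is passed as a single element of an argument array (rather than through a shell), the surrounding quotes from the command line form are not needed.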
Solution 12 - Https
phantomjs --web-security=false --ignore-ssl-errors=true scripts.js
Solution 13 - Https
The only thing that worked for me was upgrading phantomjs from 1.9.x to 2.x ;)