Question
Unable to create screenshot of Craigslist with PhantomJS on DigitalOcean
I have a very strange problem. For the past 1-2 weeks I have been unable to create screenshots of Craigslist pages with PhantomJS on DigitalOcean. Does anybody have any idea what the issue could be and why it no longer works?
It always worked fine before, and it still works fine when I run it locally on my notebook or on Linode, where it creates the screenshot within 2-4 seconds. However, running the same thing on DigitalOcean (whether in the same kind of Docker container I use locally and on Linode, or installed directly on the host) keeps loading forever. If I am lucky and wait long enough, I get the screenshot after around 7-10+ minutes.
I tried it on different Droplets (existing and brand new) in different regions (SF and Frankfurt) but always hit the same issue. I have already contacted DigitalOcean about it. They were able to reproduce the issue, but according to them nothing changed on their side, so they also have no idea what could cause it. They blame PhantomJS or Craigslist.
It can be reproduced very easily. On a new Droplet (Ubuntu 14.04), the following commands install PhantomJS:
# Install dependencies
sudo apt-get install -y libicu52 libjpeg8 libfontconfig libwebp5
# Install PhantomJS
cd /usr/local/share && \
curl -L -O https://github.com/bprodoehl/phantomjs/releases/download/v2.0.0-20150528/phantomjs-2.0.0-20150528-u1404-x86_64.zip && \
unzip phantomjs-2.0.0-20150528-u1404-x86_64.zip && \
ln -s /usr/local/share/phantomjs-2.0.0-20150528/bin/phantomjs /usr/local/bin/phantomjs
Here is a very basic example script that creates a screenshot of a product listing on Craigslist. Save it as a file called “test-screenshot.js” with this content:
var page = require('webpage').create();
var url = 'http://vancouver.craigslist.ca/van/ctd/5162270100.html';
page.open(url, function() {
  page.render('craigslist.png');
  phantom.exit();
});
To run the script: “phantomjs test-screenshot.js”.
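To compare runs across hosts (2-4 seconds locally versus 7-10+ minutes on the Droplet), it may help to run the script under a time limit and record how it ended. The following is only a minimal sketch: `timed_run` is a hypothetical helper name, the limit you pass is arbitrary, and it assumes GNU coreutils `timeout` is available (it exits with status 124 when the limit is hit).

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command under a time limit and report how it ended.
# Assumes GNU coreutils `timeout`, which exits with status 124 on timeout.
timed_run() {
  local limit="$1"; shift
  local start end status
  start=$(date +%s)
  timeout "$limit" "$@"
  status=$?
  end=$(date +%s)
  if [ "$status" -eq 124 ]; then
    echo "timed out after ${limit}s"
  else
    echo "exit status $status after $((end - start))s"
  fi
  return "$status"
}
```

For example, `timed_run 120 phantomjs test-screenshot.js` would give up after two minutes instead of waiting indefinitely, which makes it easier to compare the same run on different hosts.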
Does anybody have any idea what is going on?
Thanks!
Answer
I believe Craigslist is blocking DigitalOcean IPs from accessing the site. I’ve heard they do it for AWS as well.
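One way to probe this hypothesis from the Droplet itself, without PhantomJS in the picture, is to request the same page with curl and look at the status code and total time. This is a rough sketch rather than a definitive test: `probe` and `verdict` are hypothetical helper names, the 10-second "slow" threshold is an arbitrary assumption, and the result only hints at whether the server is refusing or stalling requests from that IP address. curl reports `000` as the status code when it received no HTTP response at all.

```shell
#!/usr/bin/env bash
# Hypothetical helper: fetch a URL and print "<http_code> <seconds>".
# $1 = URL, $2 = overall timeout in seconds (default 60).
probe() {
  curl -s -o /dev/null --max-time "${2:-60}" \
       -w '%{http_code} %{time_total}\n' "$1"
}

# Hypothetical helper: turn a status code and duration into a rough hint.
# $1 = HTTP status, $2 = seconds taken, $3 = "slow" threshold (default 10).
verdict() {
  local code="$1" secs="$2" limit="${3:-10}"
  if [ "$code" = "000" ]; then
    echo "no response (timeout or connection failure)"
  elif [ "$code" = "403" ]; then
    echo "refused by server (HTTP 403)"
  elif awk -v s="$secs" -v t="$limit" 'BEGIN { exit !(s + 0 > t + 0) }'; then
    echo "slow response (HTTP $code in ${secs}s)"
  else
    echo "normal response (HTTP $code in ${secs}s)"
  fi
}
```

Running `verdict $(probe 'http://vancouver.craigslist.ca/van/ctd/5162270100.html')` on the Droplet and on a machine where the screenshot works, then comparing the two lines, would show whether the difference already appears at the plain HTTP level.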