By PowerAdSpy
I’m building a tool that collects ad data from public sources every hour. I’ve written the Python script but I’m not sure how to automate it on a DigitalOcean Droplet using cron jobs. What’s the best way to set this up reliably without it crashing?
Hi there,
Cron is the right tool for this. Here is a basic setup:
First, make sure your script runs cleanly from the command line, then open the crontab:
crontab -e
Add this to run every hour:
0 * * * * /usr/bin/python3 /path/to/your/script.py >> /var/log/scraper.log 2>&1
The >> /var/log/scraper.log 2>&1 part appends both standard output and errors to the log file, so you can see what happened if something goes wrong. The user who owns the crontab needs write access to that file.
A few things that make it more reliable:
Use absolute paths everywhere in your script. Cron runs with a minimal environment and starts jobs in the user's home directory rather than the script's folder, so relative paths will break. If you are using a virtual environment, call the Python binary inside it directly:
0 * * * * /path/to/venv/bin/python /path/to/your/script.py >> /var/log/scraper.log 2>&1
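For the absolute-paths point, one way to avoid hardcoding anything is to build every path from the script's own location. This is only a rough sketch: the helper names `script_dir` and `data_path` and the `output/ads.json` file are made up for illustration, so adapt them to however your script is laid out.

```python
from pathlib import Path


def script_dir(script_file: str) -> Path:
    """Return the absolute directory that contains the given script file."""
    return Path(script_file).resolve().parent


def data_path(base: Path, *parts: str) -> Path:
    """Join path components onto an absolute base directory."""
    return base.joinpath(*parts)


# Hypothetical usage inside your scraper:
#   BASE_DIR = script_dir(__file__)
#   out_file = data_path(BASE_DIR, "output", "ads.json")
#   out_file.parent.mkdir(parents=True, exist_ok=True)
```

Because everything hangs off the script's location, it does not matter which directory cron starts the job in.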
Also add a check at the top of your script to make sure only one instance runs at a time. If the scraper ever takes longer than an hour, the next run will start while the previous one is still going and you will end up with overlapping runs.
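A simple way to do the single-instance check is a lock file with `fcntl.flock`, which is part of the Python standard library on Linux. Treat this as a minimal sketch under a few assumptions: the lock path `/tmp/scraper.lock` and the function names `acquire_lock` and `run_exclusively` are placeholders I picked, not anything your script already has.

```python
import fcntl

LOCK_PATH = "/tmp/scraper.lock"  # hypothetical location, pick any writable absolute path


def acquire_lock(path: str = LOCK_PATH):
    """Return an open, locked file handle, or None if another run holds the lock."""
    handle = open(path, "w")
    try:
        # LOCK_NB makes flock fail immediately instead of waiting for the lock.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle


def run_exclusively(job, path: str = LOCK_PATH) -> bool:
    """Run job() only if no other instance holds the lock. Return True if it ran."""
    lock = acquire_lock(path)
    if lock is None:
        print("Previous run still in progress, skipping this one.")
        return False
    try:
        job()
    finally:
        lock.close()  # closing the file releases the lock
    return True
```

In your script you would wrap the scraping entry point, for example `run_exclusively(scrape)` where `scrape` is whatever function does the work. The lock is tied to the open file, so the kernel releases it if the process crashes and you do not get stuck with a stale lock.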
DigitalOcean’s monitoring alerts are worth setting up too so you get notified if the Droplet goes down and your scraper stops running silently: https://docs.digitalocean.com/products/monitoring/