Tutorial

Google Search from Java Program Example

Published on August 3, 2022
author

Pankaj

Google Search from Java Program Example

Sometime back I was looking for a way to search Google using Java Program. I was surprised to see that Google had a web search API but it has been deprecated long back and now there is no standard way to achieve this. Basically google search is an HTTP GET request where query parameter is part of the URL, and earlier we have seen that there are different options such as Java HttpUrlConnection or Apache HttpClient to perform this search. But the problem is more related to parsing the HTML response and get the useful information out of it. That’s why I chose to use jsoup that is an open source HTML parser and it’s capable to fetch HTML from given URL. So below is a simple program to fetch google search results in a java program and then parse it to find out the search results.

package com.journaldev.jsoup;

import java.io.IOException;
import java.util.Scanner;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class GoogleSearchJava {

	public static final String GOOGLE_SEARCH_URL = "https://www.google.com/search";
	public static void main(String[] args) throws IOException {
		//Taking search term input from console
		Scanner scanner = new Scanner(System.in);
		System.out.println("Please enter the search term.");
		String searchTerm = scanner.nextLine();
		System.out.println("Please enter the number of results. Example: 5 10 20");
		int num = scanner.nextInt();
		scanner.close();
		
		String searchURL = GOOGLE_SEARCH_URL + "?q="+searchTerm+"&num="+num;
		//without proper User-Agent, we will get 403 error
		Document doc = Jsoup.connect(searchURL).userAgent("Mozilla/5.0").get();
		
		//below will print HTML data, save it to a file and open in browser to compare
		//System.out.println(doc.html());
		
		//If google search results HTML change the <h3 class="r" to <h3 class="r1"
		//we need to change below accordingly
		Elements results = doc.select("h3.r > a");

		for (Element result : results) {
			String linkHref = result.attr("href");
			String linkText = result.text();
			System.out.println("Text::" + linkText + ", URL::" + linkHref.substring(6, linkHref.indexOf("&")));
		}
	}

}

Below is a sample output from above program, I saved the HTML data into file and opened in a browser to confirm the output and it’s what we wanted. Compare the output with below image. Google Search API Java, java google search, google search java program

Please enter the search term.
journaldev
Please enter the number of results. Example: 5 10 20
20
Text::JournalDev, URL::=https://www.journaldev.com/
Text::Java Interview Questions, URL::=https://www.journaldev.com/java-interview-questions
Text::Java design patterns, URL::=https://www.journaldev.com/tag/java-design-patterns
Text::Tutorials, URL::=https://www.journaldev.com/tutorials
Text::Java servlet, URL::=https://www.journaldev.com/tag/java-servlet
Text::Spring Framework Tutorial ..., URL::=https://www.journaldev.com/2888/spring-tutorial-spring-core-tutorial
Text::Java Design Patterns PDF ..., URL::=https://www.journaldev.com/6308/java-design-patterns-pdf-ebook-free-download-130-pages
Text::Pankaj Kumar (@JournalDev) | Twitter, URL::=https://twitter.com/journaldev
Text::JournalDev | Facebook, URL::=https://www.facebook.com/JournalDev
Text::JournalDev - Chrome Web Store - Google, URL::=https://chrome.google.com/webstore/detail/journaldev/ckdhakodkbphniaehlpackbmhbgfmekf
Text::Debian -- Details of package libsystemd-journal-dev in wheezy, URL::=https://packages.debian.org/wheezy/libsystemd-journal-dev
Text::Debian -- Details of package libsystemd-journal-dev in wheezy ..., URL::=https://packages.debian.org/wheezy-backports/libsystemd-journal-dev
Text::Debian -- Details of package libsystemd-journal-dev in sid, URL::=https://packages.debian.org/sid/libsystemd-journal-dev
Text::Debian -- Details of package libsystemd-journal-dev in jessie, URL::=https://packages.debian.org/jessie/libsystemd-journal-dev
Text::Ubuntu – Details of package libsystemd-journal-dev in trusty, URL::=https://packages.ubuntu.com/trusty/libsystemd-journal-dev
Text::libsystemd-journal-dev : Utopic (14.10) : Ubuntu - Launchpad, URL::=https://launchpad.net/ubuntu/utopic/%2Bpackage/libsystemd-journal-dev
Text::Debian -- Details of package libghc-libsystemd-journal-dev in jessie, URL::=https://packages.debian.org/jessie/libghc-libsystemd-journal-dev
Text::Advertise on JournalDev | BuySellAds, URL::=https://buysellads.com/buy/detail/231824
Text::JournalDev | LinkedIn, URL::=https://www.linkedin.com/groups/JournalDev-6748558
Text::How to install libsystemd-journal-dev package in Ubuntu Trusty, URL::=https://www.howtoinstall.co/en/ubuntu/trusty/main/libsystemd-journal-dev/
Text::[global] auth supported = cephx ms bind ipv6 = true [mon] mon data ..., URL::=https://zooi.widodh.nl/ceph/ceph.conf
Text::UbuntuUpdates - Package "libsystemd-journal-dev" (trusty 14.04), URL::=https://www.ubuntuupdates.org/libsystemd-journal-dev
Text::[Journal]Dev'err - Cursus Honorum - Enjin, URL::=https://cursushonorum.enjin.com/holonet/m/23958869/viewthread/13220130-journaldeverr/post/last

That’s all for google search in a java program, use it cautiously because if there is unusual traffic from your computer, chances are Google will block you.

Thanks for learning with the DigitalOcean Community. Check out our offerings for compute, storage, networking, and managed databases.

Learn more about our products

About the authors
Default avatar
Pankaj

author

While we believe that this content benefits our community, we have not yet thoroughly reviewed it. If you have any suggestions for improvements, please let us know by clicking the “report an issue“ button at the bottom of the tutorial.

Still looking for an answer?

Ask a questionSearch for more help

Was this helpful?
 
JournalDev
DigitalOcean Employee
DigitalOcean Employee badge
October 16, 2020

Hi, thank you for sharing the code, but nothing would show up after asking for the user input and I’m not sure what’s wrong with it

- CL

    JournalDev
    DigitalOcean Employee
    DigitalOcean Employee badge
    October 16, 2020

    I am getting an error: Exception in thread “main” java.lang.NoClassDefFoundError: org/jsoup/Jsoup at pa1.WebCrawler.main(WebCrawler.java:24) and eclipse is telling me to attach source

    - 146

      JournalDev
      DigitalOcean Employee
      DigitalOcean Employee badge
      September 20, 2019

      getting error org.jsoup package does not exist.

      - Sachin

        JournalDev
        DigitalOcean Employee
        DigitalOcean Employee badge
        December 12, 2018

        How can I treat a leap year program in java? Please, share your ideas!

        - Zohidbek

          JournalDev
          DigitalOcean Employee
          DigitalOcean Employee badge
          October 20, 2018

          How can i get this program to only return image URL’s? when i change the GOOGLE_SEARCH_URL to https://www.google.com/images , nothing prints out at the end of the program. My knowledge of java is very basic so if anyone can help me out thank you very much!

          - Aidan

            JournalDev
            DigitalOcean Employee
            DigitalOcean Employee badge
            July 27, 2018

            We are not gettng the full result html as we normally see in the browser. All the sponsored links are not appearing in the result.

            - Manish Mandal

              JournalDev
              DigitalOcean Employee
              DigitalOcean Employee badge
              March 26, 2018

              I am getting this exception Exception in thread “main” java.io.IOException: 400 error loading URL https://www.google.com/search?q=relaxed thoughts&num=2

              - anuja tatpuje

                JournalDev
                DigitalOcean Employee
                DigitalOcean Employee badge
                March 16, 2018

                Hi, Thanks for the tutorial. I have a followup question. What if I want to search multiple keywords in one go in the same way? Please help !

                - Star Apple

                  JournalDev
                  DigitalOcean Employee
                  DigitalOcean Employee badge
                  December 23, 2017

                  It’s easier if you open directly to predefined browser if (Desktop.isDesktopSupported()) try { Desktop.getDesktop().browse(new URI(searchURL)); } catch (IOException e1) { e1.printStackTrace(); } catch (URISyntaxException e1) { e1.printStackTrace(); } and you don’t need the number of results, just the searchTerm String searchURL = GOOGLE_SEARCH_URL + “?q=”+searchTerm

                  - André Vilela

                    JournalDev
                    DigitalOcean Employee
                    DigitalOcean Employee badge
                    November 30, 2017

                    Thanks amigo!

                    - Alex

                      Try DigitalOcean for free

                      Click below to sign up and get $200 of credit to try our products over 60 days!

                      Sign up

                      Join the Tech Talk
                      Success! Thank you! Please check your email for further details.

                      Please complete your information!

                      Featured on Community

                      Get our biweekly newsletter

                      Sign up for Infrastructure as a Newsletter.

                      Hollie's Hub for Good

                      Working on improving health and education, reducing inequality, and spurring economic growth? We'd like to help.

                      Become a contributor

                      Get paid to write technical tutorials and select a tech-focused charity to receive a matching donation.

                      Welcome to the developer cloud

                      DigitalOcean makes it simple to launch in the cloud and scale up as you grow — whether you're running one virtual machine or ten thousand.

                      Learn more
                      Animation showing a Droplet being created in the DigitalOcean Cloud console