Big Data

10 Results
  • Tutorial

    How to Install Hadoop in Stand-Alone Mode on Ubuntu 16.04

    Hadoop is a Java-based programming framework that supports the processing and storage of extremely large datasets on a cluster of inexpensive machines. It was the first major open source project in the big data playin...
    By Melissa Anderson | Tags: Clustering, Big Data, Ubuntu, Ubuntu 16.04
  • Tutorial

    An Introduction to Big Data Concepts and Terminology

    Big data is a blanket term for the non-traditional strategies and technologies needed to gather, organize, process, and derive insights from large datasets. While the problem of working with data that exceeds the com...
    By Justin Ellingwood | Tags: Scaling, Clustering, Big Data, Conceptual
  • Tutorial

    Hadoop, Storm, Samza, Spark, and Flink: Big Data Frameworks Compared

    Big data is a blanket term for the non-traditional strategies and technologies needed to gather, organize, process, and derive insights from large datasets. While the problem of working with data that exceeds the comp...
    By Justin Ellingwood | Tags: Big Data, Conceptual
  • Tutorial

    How to Install Hadoop in Stand-Alone Mode on Ubuntu 18.04

    In this tutorial, you'll learn how to install Hadoop in stand-alone mode on an Ubuntu 18.04 server. You'll also run an example MapReduce program to search for occurrences of a regular expression in text files.
    By Melissa Anderson, Hanif Jetha | Tags: Clustering, Big Data, Ubuntu, Ubuntu 18.04
  • Tutorial

    An Introduction to Hadoop

    Apache Hadoop is one of the earliest and most influential open-source tools for storing and processing the massive amount of readily available digital data that has accumulated with the rise of the World Wide Web. It ...
    By Melissa Anderson | Tags: Clustering, Big Data, Conceptual
  • Tutorial

    How To Spin Up a Hadoop Cluster with DigitalOcean Droplets

    This tutorial will cover setting up a Hadoop cluster on DigitalOcean. The Hadoop software library is an Apache framework that lets you process large data sets in a distributed way across server clusters through levera...
    By Jeremy Morris | Tags: Big Data, Data Analysis, Solutions, Clustering, DigitalOcean, Ubuntu 16.04
  • Tutorial

    User Data Collection: Balancing Business Needs and User Privacy

    Collecting user data is common practice in modern sites and applications as a way of providing creators with more information to make decisions and create better experiences. Among other benefits, data can be used to ...
    By Justin Ellingwood | Tags: Conceptual, Big Data, Data Analysis
  • Tutorial

    How To Install and Use ClickHouse on Debian 9

    ClickHouse is an open-source, column-oriented analytics database created by Yandex (https://yandex.com) for OLAP and big data use cases. In this tutorial, you'll install the ClickHouse database server and client on yo...
    By bsder | Tags: Databases, Data Analysis, Big Data, Debian 9
  • Tutorial

    How to Install Hadoop in Stand-Alone Mode on Debian 9

    In this tutorial, you'll install Hadoop in stand-alone mode on a Debian 9 server. You'll also run an example MapReduce program to search for occurrences of a regular expression in text files.
    By Brian Hogan, Melissa Anderson, Hanif Jetha | Tags: Big Data, Debian 9
  • Tutorial

    How to Set Up the Titan Graph Database with Cassandra and ElasticSearch on Ubuntu 16.04

    Titan is a highly scalable, open-source graph database. A graph database is a type of NoSQL database in which all data is stored as nodes and edges. A graph database is suitable for applications that use highly connected ...
    By Kevin Isaac | Tags: Big Data, Elasticsearch, Ubuntu 16.04