Big Data

42 Results
  • Tutorial

    How to Install Hadoop in Stand-Alone Mode on Ubuntu 16.04

    Hadoop is a Java-based programming framework that supports the processing and storage of extremely large datasets on a cluster of inexpensive machines. It was the first major open source project in the big data playin...
    By Melissa Anderson Clustering Big Data Ubuntu Ubuntu 16.04
  • Tutorial

    How To Install and Use ClickHouse on Ubuntu 20.04

    ClickHouse is an open source, column-oriented analytics database created by Yandex for OLAP and big data use cases. In this tutorial, you'll install the ClickHouse database server and client on your machine. You'll us...
    By bsder Big Data Databases Ubuntu 20.04
  • Tutorial

    An Introduction to Big Data Concepts and Terminology

    Big data is a blanket term for the non-traditional strategies and technologies needed to gather, organize, process, and gather insights from large data sets. While the problem of working with data that exceeds the com...
    By Justin Ellingwood Scaling Clustering Big Data Conceptual
  • Tutorial

    Hadoop, Storm, Samza, Spark, and Flink: Big Data Frameworks Compared

    Big data is a blanket term for the non-traditional strategies and technologies needed to gather, organize, process, and gather insights from large datasets. While the problem of working with data that exceeds the comp...
    By Justin Ellingwood Big Data Conceptual
  • Tutorial

    How To Install Hadoop in Stand-Alone Mode on Ubuntu 18.04

    In this tutorial, you'll learn how to install Hadoop in stand-alone mode on an Ubuntu 18.04 server. You'll also run an example MapReduce program to search for occurrences of a regular expression in text files.
    By Melissa Anderson, Hanif Jetha Clustering Big Data Ubuntu Ubuntu 18.04
  • Tutorial

    An Introduction to Hadoop

    Apache Hadoop is one of the earliest and most influential open-source tools for storing and processing the massive amount of readily-available digital data that has accumulated with the rise of the World Wide Web. It ...
    By Melissa Anderson Clustering Big Data Conceptual
  • Tutorial

    How To Spin Up a Hadoop Cluster with DigitalOcean Droplets

    This tutorial will cover setting up a Hadoop cluster on DigitalOcean. The Hadoop software library is an Apache framework that lets you process large data sets in a distributed way across server clusters through levera...
    By Jeremy Morris Big Data Data Analysis Solutions Clustering DigitalOcean Ubuntu 16.04
  • Tutorial

    User Data Collection: Balancing Business Needs and User Privacy

    Collecting user data is common practice in modern sites and applications as a way of providing creators with more information to make decisions and create better experiences. Among other benefits, data can be used to ...
    By Justin Ellingwood Conceptual Big Data Data Analysis
  • Tutorial

    How To Install and Use ClickHouse on Debian 9

    ClickHouse is an open-source, column-oriented analytics database created by Yandex (https://yandex.com) for OLAP and big data use cases. In this tutorial, you'll install the ClickHouse database server and client on yo...
    By bsder Databases Data Analysis Big Data Debian 9
  • Tutorial

    How to Install Hadoop in Stand-Alone Mode on Debian 9

    In this tutorial, you'll install Hadoop in stand-alone mode on a Debian 9 server. You'll also run an example MapReduce program to search for occurrences of a regular expression in text files.
    By Brian Hogan, Melissa Anderson, Hanif Jetha Big Data Debian 9
  • Tutorial

    How to Set Up the Titan Graph Database with Cassandra and ElasticSearch on Ubuntu 16.04

    Titan is a highly scalable open-source graph database. A graph database is a type of NoSQL database where all data is stored as nodes and edges. A graph database is suitable for applications that use highly connected ...
    By Kevin Isaac Big Data Elasticsearch Ubuntu 16.04
  • Question

    Getting account unblocked

    Hey guys, new here. I've been using DigitalOcean off and on for the last few years, mostly for personal projects, and have really liked it a lot. Recently, though, I started using it for Big Data processes. What I do is spi...
    Accepted Answer: Hey friend, I'd like to explain a bit about the reason for this, and the thoughts behind it. Please know that I'm about to say a lot of things that may not be relevant to you. It isn't necessarily that crypto is again...
    1 By jwalz DigitalOcean Big Data Ubuntu 18.04
  • Question

    What are the most popular Hadoop tools/projects?

    I have a question: what are the most popular Hadoop tools/projects?
    Accepted Answer: Hive is an SQL-like language for data processing, which gets converted into a MapReduce job behind the scenes. Hive is popular because it is written using familiar SQL-like syntax. This is often confusing, because Hiv...
    2 By gulatisneha56 Big Data
  • Question

    Nexii Labs is a leading storage, virtualisation and Cloud service provider in India

    DevOps has changed the way an IT organization works and how it gets things done. DevOps services and offerings connect development, technical operations and quality assurance personnel in such a way that the process...
    Accepted Answer: @ryanpq SPAM!
    1 By nexiilabs Backups Storage Getting Started Open Source Big Data Clustering CoreOS Arch Linux Ubuntu Ubuntu 16.04 Debian
  • Question

    How do I manually calculate average session duration?

    Hello guys, can anyone help me find the exact formula for average session duration? I'm having trouble finding a manual calculation of the average session duration. For example, from the google ana...
    1 By rifulabyssal Data Analysis DigitalOcean Articles Big Data GraphQL
  • Question

    Showing as Read only Disk

    I am running my own mail server, and since I restarted my Droplet I am not able to write or delete anything on it. When I log in to the Droplet using SSH (PuTTY) and try to make any changes, I can see c...
    1 By saddamhussainfea Apache Big Data Backups Arch Linux
  • Question

    Can you send data to a server through cellular?

    Hello all. At my current job we have sensors on one of our building's roofs that send environmental data from the roof to a physical server in the building. This system is proprietary and our building does not allow ...
    1 By csmall9 Applications Databases Open Source Big Data Conceptual
  • Question

    CPU Optimized Droplet works very slow

    Hello, I was using a CPU Optimized Droplet. In the first month it worked very fast with a small amount of data; for example, a process that I developed took a maximum of 2 minutes to deliver results. But the sec...
    2 By AndresRamos95 DigitalOcean Big Data
  • Question

    What is quantum computing?

    What is quantum computing, and what is the future of quantum computing?
    1 By chandu12fvl Big Data Databases DigitalOcean Machine Learning
  • Question

    What are the differences between Volumes and Spaces? Can I switch from Volumes to Spaces or vice versa?

    What are the differences between Volumes and Spaces? Can I switch from Volumes to Spaces or vice versa?
    1 By miguelpeguero30 Big Data Block Storage Arch Linux