By Joshua Hall
While we believe that this content benefits our community, we have not yet thoroughly reviewed it. If you have any suggestions for improvements, please let us know by clicking the “report an issue“ button at the bottom of the tutorial.
Trees are hands-down the most common data structure that you already interact with. From your file system to the DOM, many of the things your computer does is entirely dependent on generating and manipulating trees.
In short, a tree is a collection of nodes/values where each node points to descending nodes and all connect to a single parent node. Let’s look at the tree you’re probably most familiar with: the DOM.
At the root of the tree we have the
document (actually the window but shhh) and every HTML tag we add creates a new child node under it. The main point about trees is that no matter how many nodes we have we could pick anyone at random and always be able to trace our way back to the root or any other element.
Previously, when we created linked lists we were technically already making viable trees because they have a root (the head/tail) and each node was connected to a child with the next/prev pointers. Starting from any node we could always find our way back to a single root, but it’s a bit trivial with just a single chain of nodes.
All of these are valid trees. Most of our different structures based off of trees will be pertaining more with how the data inside the trees are organized, but they’ll all look something like this.
There are some rules we need to follow to have a proper tree. A tree cannot reference its own root, for example if we ran a traversal algorithm on a circular linked list then it would quickly become a recursion nightmare. We also can’t have more than one root or a node with more than one parent.
Finally, a tree must have a directionality to it with everything moving out from the root. A cluster of nodes without a direction is actually its own much more complicated structure, graphs, which we’ll cover in future articles 😉.
We can’t just start linking nodes together and call it a tree, if we break these rules then future data structures and algorithms just won’t be possible.
While pretty boring it’s still necessary to cover the most important terminology you’ll constantly see as you learn more about trees.
Node - Each single object or data point.
Root - The first and uppermost node in the tree from which all other nodes are derived from.
Edge - A connection between two nodes.
Parent - The immediate ancestor of a lower node.
child - The immediate descendant of a higher node.
Siblings - Two nodes on the same depth with the same parent.
Leafs - The bottommost nodes with no children.
Depth - The height of the tree measured in levels with the number of edges away from the root, so level 2 is only two edges away from the root.
Breadth - The width of the tree measured by the number of leafs.
Subtree - A node and its descendants which could be treated as an independent tree. For example, if we created a dictionary as a tree and used a search algorithm that looked at each item alphabetically we could use the node for section of the first letter as the root instead of looking at every item in the tree.
Right now this may seem like a whole lotta fluff but the devil really is in the details. As you progress to more elaborate structures it’ll become increasingly easier to overlook these details and drown yourself in very hard to debug errors.
Thanks for learning with the DigitalOcean Community. Check out our offerings for compute, storage, networking, and managed databases.
Working on improving health and education, reducing inequality, and spurring economic growth? We'd like to help.