When deciding which server architecture to use for your environment, there are many factors to consider, such as performance, scalability, availability, reliability, cost, and management.
In this tutorial, you will learn about commonly used server setups, with a short description of each, including the pros and cons. Keep in mind that all of the concepts covered here can be used in various combinations with one another and that every environment has different requirements, so there is no single correct configuration.
One server setup is when an entire environment resides on a single server. For a typical web application, this includes the web server, application server, and database server. A common variation of this setup is a LAMP stack, which stands for Linux, Apache, MySQL, and PHP, on a single server. An example use case for this is when you want to set up an application quickly. A basic setup like this can be used to test an idea or get a simple web page up and running.
Unfortunately, this offers little in the way of scalability and component isolation. Additionally, the application and the database contend for the same server resources, such as CPU, memory, and I/O. As a result, performance can suffer, and it can be difficult to determine which component is the root cause. A single server is also not readily horizontally scalable. You can learn more about horizontal scaling in our tutorial on Understanding Database Sharding, and more about the LAMP stack in our tutorial on How To Install LAMP on Ubuntu 22.04. The following is a visual representation of using a single server:
The database management system (DBMS) can be separated from the rest of the environment to eliminate the resource contention between the application and the database, and to increase security by removing the database from the DMZ, or public internet.
This setup can still get your application up and running quickly, while preventing the application and database from fighting over the same system resources. You can also vertically scale the application and database tiers separately by adding more resources to whichever server needs increased capacity. Depending on your setup, this may also increase security by removing your database from the DMZ.
This setup is slightly more complex than a single server. Performance issues, like high latency, can arise if the network connection between the two servers is geographically distant from each other. There can also be performance issues if the bandwidth is too low for the amount of data being transferred. You can read more on How To Set Up a Remote Database to Optimize Site Performance with MySQL. The following is a visual representation of using a separate database server:
Load balancers can be added to a server environment to improve performance and reliability by distributing the workload across multiple servers. If one of the load balanced servers fails, the other servers will handle the incoming traffic until the failed server becomes healthy again. A load balancer can also be used to serve multiple applications through the same domain and port by using a layer 7 (application layer) reverse proxy. A few types of software capable of reverse proxy load balancing are HAProxy, Nginx, and Varnish.
An example use case is an environment that requires scaling by adding more servers, also known as horizontal scaling. A load balancer enables you to scale the environment's capacity by adding more servers to it. It can also protect against DDOS attacks by limiting client connections to a sensible amount and frequency.
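For illustration, a minimal reverse proxy load balancing configuration in Nginx (one of the tools mentioned above) might look like the following sketch. The backend addresses and port are assumptions, and a production setup would need additional directives:

# http context
upstream app_backends {
    server 10.0.0.11:8080;
    server 10.0.0.12:8080;
}

server {
    listen 80;

    location / {
        # Distribute requests across the backends (round-robin by default)
        proxy_pass http://app_backends;
    }
}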
Setting up a load balancer can introduce a performance bottleneck if the load balancer does not have enough resources, or if it is configured poorly. It can also present complexities that require additional consideration, such as where to perform SSL termination and how to handle applications that require sticky sessions. Additionally, the load balancer is a single point of failure: if it goes down, your whole service can go down. A high availability (HA) setup is an infrastructure without a single point of failure. To learn how to implement an HA setup, you can read our documentation on Reserved IPs. You can read more in our guide on An Introduction to HAProxy and Load Balancing Concepts as well. The following is a visual representation of setting up a load balancer:
An HTTP accelerator, or caching HTTP reverse proxy, can be used to reduce the time it takes to serve content to a user through a variety of techniques. The main technique employed by an HTTP accelerator is caching responses from a web or application server in memory, so future requests for the same content can be served quickly, with less unnecessary interaction with the web or application servers. A few examples of software capable of HTTP acceleration are Varnish, Squid, and Nginx. An example use case is an environment with content-heavy dynamic web applications or many commonly accessed files.
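As a rough sketch of the caching idea using Nginx (one of the tools listed above), responses from a backend can be stored and reused on later requests. The cache path, zone name, and backend address here are assumptions:

# http context
proxy_cache_path /var/cache/nginx keys_zone=app_cache:10m;

server {
    listen 80;

    location / {
        # Serve cached responses when possible, otherwise forward to the backend
        proxy_cache app_cache;
        proxy_pass http://127.0.0.1:8080;
    }
}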
HTTP acceleration can increase site performance by reducing CPU load on a web server, through caching and compression, thereby increasing user capacity. It can also be used as a reverse proxy load balancer, and some caching software can even protect against DDOS attacks. Unfortunately, it can reduce performance if the cache-hit rate is low, and it requires tuning to get the best performance out of it. The following is a visual representation of setting up an HTTP Accelerator:
One way to improve performance for a database system that performs many more reads than writes, such as a CMS, is to use primary-replica database replication. Replication requires one primary node and one or more replica nodes. In this setup, all updates are sent to the primary node, and reads can be distributed across all nodes. An example use case is increasing the read performance of the database tier of an application. Primary-replica replication improves read performance by spreading reads across the replicas, and improves write performance by dedicating the primary exclusively to updates, so it spends no time serving read requests.
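As a sketch of what this involves with MySQL (assuming MySQL is the DBMS; the file path, server ID, and database name are hypothetical), the primary is configured to write a binary log that replicas read from, and each server in the replication group gets a unique server ID:

# Excerpt from the primary's MySQL configuration (for example, /etc/mysql/mysql.conf.d/mysqld.cnf)
server-id      = 1
log_bin        = /var/log/mysql/mysql-bin.log
binlog_do_db   = example_database

# Each replica uses a different server-id and is pointed at the primary
# (with CHANGE MASTER TO / CHANGE REPLICATION SOURCE TO) before replication starts.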
One of the cons of primary-replica database replication is that the application accessing the database must have a mechanism to determine which database nodes it should send update and read requests to. Also, if the primary fails, no updates can be performed on the database until the issue is corrected; there is no built-in failover for a primary node failure. The following is a visual representation of a primary-replica replication setup, with a single replica node:
It is possible to load balance the caching servers, in addition to the application servers, and use database replication in a single environment. The purpose of combining these techniques is to reap the benefits of each without introducing too many issues or complexity. Here is an example diagram of this type of server environment set up:
For example, imagine a scenario where the load balancer is configured to recognize static requests (like images, CSS, JavaScript, etc.) and send those requests directly to the caching servers, and send other requests to the application servers.
Here is a breakdown of the process when a user sends a request for dynamic content:
When a user requests static content, the following process applies:
This environment still has two single points of failure, the load balancer and the primary database server, but it provides all of the other reliability and performance benefits that were described in previous sections.
Now that you are familiar with some basic server setups, you should have a good idea of what kind of setup you would use for your own application(s). If you are working on improving your own environment, remember that an iterative process is best to avoid introducing too many complexities too quickly.
LVM, or Logical Volume Management, is a storage device management technology that gives users the power to pool and abstract the physical layout of component storage devices for flexible administration. Utilizing the device mapper Linux kernel framework, the current iteration, LVM2, can be used to gather existing storage devices into groups and allocate logical units from the combined space as needed.
The main advantages of LVM are increased abstraction, flexibility, and control. Logical volumes can have meaningful names like “databases” or “root-backup”. Volumes can also be resized dynamically as space requirements change, and migrated between physical devices within the pool on a running system or exported. LVM also offers advanced features like snapshotting, striping, and mirroring.
In this guide, you’ll learn how LVM works and practice basic commands to get up and running quickly on a bare metal machine.
Before diving into LVM administrative commands, it is important to have a basic understanding of how LVM organizes storage devices and some of the terminology it employs.
LVM functions by layering abstractions on top of physical storage devices. The basic layers that LVM uses, starting with the most primitive, are:
- Physical Volumes: The LVM utility prefix for physical volumes is pv.... These are physical block devices or other disk-like devices (for example, other devices created by device mapper, like RAID arrays) that LVM uses as the raw building material for higher levels of abstraction. Physical volumes are regular storage devices. LVM writes a header to the device to allocate it for management.
- Volume Groups: The LVM utility prefix for volume groups is vg.... LVM combines physical volumes into storage pools known as volume groups. Volume groups abstract the characteristics of the underlying devices and function as a unified logical device with the combined storage capacity of the component physical volumes.
- Logical Volumes: The LVM utility prefix for logical volumes is lv... (generic LVM utilities might begin with lvm...). A volume group can be sliced up into any number of logical volumes. Logical volumes are functionally equivalent to partitions on a physical disk, but with much more flexibility. Logical volumes are the primary component that users and applications will interact with.
LVM can be used to combine physical volumes into volume groups to unify the storage space available on a system. Afterwards, administrators can segment the volume group into arbitrary logical volumes, which act as flexible partitions.
Each volume within a volume group is segmented into small, fixed-size chunks called extents. The size of the extents is determined by the volume group. All volumes within the group conform to the same extent size.
The extents on a physical volume are called physical extents, while the extents of a logical volume are called logical extents. A logical volume is a mapping that LVM maintains between logical and physical extents. Because of this relationship, the extent size represents the smallest amount of space that can be allocated by LVM.
Extents are behind much of the flexibility and power of LVM. The logical extents that are presented as a unified device by LVM do not have to map to continuous physical extents. LVM can copy and reorganize the physical extents that compose a logical volume without any interruption to users. Logical volumes can also be expanded or shrunk by adding extents to, or removing extents from, the volume.
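For example, growing a logical volume and its filesystem in one step might look like the following sketch; the volume group and volume names and the size are hypothetical, and --resizefs asks LVM to resize the filesystem along with the volume:

- sudo lvextend -L +5G --resizefs /dev/example_vg/example_lv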
Now that you are familiar with some of the terminology and structures LVM uses, you can explore some common ways to use LVM. You’ll start with a procedure that will use two physical disks to form four logical volumes.
Begin by scanning the system for block devices that LVM can access and manage. You can do this with the following command:
- sudo lvmdiskscan
The output will return all available block devices that LVM can interact with:
Output /dev/ram0 [ 64.00 MiB]
/dev/sda [ 200.00 GiB]
/dev/ram1 [ 64.00 MiB]
. . .
/dev/ram15 [ 64.00 MiB]
/dev/sdb [ 100.00 GiB]
2 disks
17 partitions
0 LVM physical volume whole disks
0 LVM physical volumes
In this example, notice that there are currently two disks and 17 partitions. The partitions are mostly /dev/ram*
partitions that are used in the system as a RAM disk for performance enhancements. The disks in this example are /dev/sda
, which has 200G of space, and /dev/sdb
, which has 100G.
Warning: Make sure to double-check that the devices you intend to use with LVM do not have any important data already written to them. Using these devices within LVM will overwrite the current contents. If you have important data on your server, make backups before proceeding.
Now that you know the physical devices you want to use, mark them as physical volumes within LVM using the pvcreate
command:
- sudo pvcreate /dev/sda /dev/sdb
Output Physical volume "/dev/sda" successfully created
Physical volume "/dev/sdb" successfully created
This will write an LVM header to the devices to indicate that they are ready to be added to a volume group.
Verify that LVM has registered the physical volumes by running pvs
:
- sudo pvs
Output PV VG Fmt Attr PSize PFree
/dev/sda lvm2 --- 200.00g 200.00g
/dev/sdb lvm2 --- 100.00g 100.00g
Note that both of the devices are present under the PV
column, which stands for physical volume.
Now that you have created physical volumes from your devices, you can create a volume group. Most of the time, you only have a single volume group per system for maximum flexibility in allocation. The following volume group example is named LVMVolGroup
. You can name your volume group whatever you’d like.
To create the volume group and add both of your physical volumes to it, run:
- sudo vgcreate LVMVolGroup /dev/sda /dev/sdb
Output Volume group "LVMVolGroup" successfully created
Checking the pvs
output again will indicate that your physical volumes are now associated with the new volume group:
- sudo pvs
Output PV VG Fmt Attr PSize PFree
/dev/sda LVMVolGroup lvm2 a-- 200.00g 200.00g
/dev/sdb LVMVolGroup lvm2 a-- 100.00g 100.00g
List a short summary of the volume group with vgs
:
- sudo vgs
Output VG #PV #LV #SN Attr VSize VFree
LVMVolGroup 2 0 0 wz--n- 299.99g 299.99g
Your volume group currently has two physical volumes, zero logical volumes, and has the combined capacity of the underlying devices.
Now that you have a volume group available, you can use it as a pool to allocate logical volumes from. Unlike conventional partitioning, when working with logical volumes, you do not need to know the layout of the volume since LVM maps and handles this for you. You only need to supply the size of the volume and a name.
In the following example, you’ll create four separate logical volumes out of your volume group:
- A 10G "projects" volume
- A 5G "www" volume for web content
- A 20G "db" volume for a database
- A "workspace" volume that will fill the remaining space
To create logical volumes, use the lvcreate
command. You must pass in the volume group to pull from, and can name the logical volume with the -n
option. To specify the size directly, you can use the -L
option. If, instead, you wish to specify the size in terms of the number of extents, you can use the -l
option.
Create the first three logical volumes with the -L
option:
- sudo lvcreate -L 10G -n projects LVMVolGroup
- sudo lvcreate -L 5G -n www LVMVolGroup
- sudo lvcreate -L 20G -n db LVMVolGroup
Output Logical volume "projects" created.
Logical volume "www" created.
Logical volume "db" created.
You can view the logical volumes and their relationship to the volume group by selecting a custom output from the vgs
command:
- sudo vgs -o +lv_size,lv_name
Output VG #PV #LV #SN Attr VSize VFree LSize LV
LVMVolGroup 2 3 0 wz--n- 299.99g 264.99g 10.00g projects
LVMVolGroup 2 3 0 wz--n- 299.99g 264.99g 5.00g www
LVMVolGroup 2 3 0 wz--n- 299.99g 264.99g 20.00g db
In this example, you added the last two columns to the output. They indicate how much space is allocated to each of your logical volumes.
Now, you can allocate the rest of the space in the volume group to the "workspace"
volume using the -l
flag, which works in extents. You can also provide a percentage and a unit to better communicate your intentions. In this example, allocate the remaining free space, so you can pass in 100%FREE
:
- sudo lvcreate -l 100%FREE -n workspace LVMVolGroup
Output Logical volume "workspace" created.
Checking the volume group information with the custom vgs
command, notice that you have used up all of the available space:
- sudo vgs -o +lv_size,lv_name
Output VG #PV #LV #SN Attr VSize VFree LSize LV
LVMVolGroup 2 4 0 wz--n- 299.99g 0 10.00g projects
LVMVolGroup 2 4 0 wz--n- 299.99g 0 5.00g www
LVMVolGroup 2 4 0 wz--n- 299.99g 0 20.00g db
LVMVolGroup 2 4 0 wz--n- 299.99g 0 264.99g workspace
The workspace
volume has been created and the LVMVolGroup
volume group is completely allocated.
Now that you have logical volumes, you can use them as normal block devices.
The logical devices are available within the /dev
directory like other storage devices. You can access them in two places:
/dev/volume_group_name/logical_volume_name
/dev/mapper/volume_group_name-logical_volume_name
To format your four logical volumes with the Ext4 filesystem, run the following commands:
- sudo mkfs.ext4 /dev/LVMVolGroup/projects
- sudo mkfs.ext4 /dev/LVMVolGroup/www
- sudo mkfs.ext4 /dev/LVMVolGroup/db
- sudo mkfs.ext4 /dev/LVMVolGroup/workspace
Alternatively, you can run the following:
- sudo mkfs.ext4 /dev/mapper/LVMVolGroup-projects
- sudo mkfs.ext4 /dev/mapper/LVMVolGroup-www
- sudo mkfs.ext4 /dev/mapper/LVMVolGroup-db
- sudo mkfs.ext4 /dev/mapper/LVMVolGroup-workspace
After formatting, create mount points:
- sudo mkdir -p /mnt/{projects,www,db,workspace}
Then mount the logical volumes to the appropriate location:
- sudo mount /dev/LVMVolGroup/projects /mnt/projects
- sudo mount /dev/LVMVolGroup/www /mnt/www
- sudo mount /dev/LVMVolGroup/db /mnt/db
- sudo mount /dev/LVMVolGroup/workspace /mnt/workspace
To make the mounts persistent, use your preferred text editor to add them to the /etc/fstab
file. The following example uses nano
:
- sudo nano /etc/fstab
. . .
/dev/LVMVolGroup/projects /mnt/projects ext4 defaults,nofail 0 0
/dev/LVMVolGroup/www /mnt/www ext4 defaults,nofail 0 0
/dev/LVMVolGroup/db /mnt/db ext4 defaults,nofail 0 0
/dev/LVMVolGroup/workspace /mnt/workspace ext4 defaults,nofail 0 0
After editing your file, save and exit. If you’re using nano, press CTRL+X, then Y, then ENTER.
The operating system should now mount the LVM logical volumes automatically at boot.
You now have an understanding of the various components that LVM manages to create a flexible storage system, and how to get storage devices up and running in an LVM setup.
To learn more about working with LVM, check out our guide to using LVM with Ubuntu 18.04.
Using a firewall is as much about making intelligent policy decisions as it is about learning the syntax. Firewalls like iptables
are designed to enforce policies by interpreting rules set by the administrator. However, as an administrator, you need to know what types of rules make sense for your infrastructure.
While other guides focus on the commands needed to get up and running, in this guide, we will discuss some of the decisions you will have to make when implementing a firewall. These choices will affect how your firewall behaves, how locked down your server is, and how it will respond to various conditions that occur. We will be using iptables
as a specific example, but most of the concepts will be broadly applicable.
When constructing a firewall, one of the most important decisions to make is the default policy. This determines what happens when traffic is not matched by any other rules. By default, a firewall can either ACCEPT
any traffic unmatched by previous rules, or DROP
that traffic.
A default policy of ACCEPT
means that any unmatched traffic is allowed to enter the server. This is generally not recommended, because it means that you would need to work backwards from there, blocking all unwanted traffic. Blocklist-type approaches are difficult to manage, because you’d need to anticipate and block every type of unwanted traffic. This can lead to maintenance headaches and is generally prone to mistakes, misconfigurations, and unanticipated holes in the established policy.
The alternative is a default policy of DROP
. This means that any traffic not matched by an explicit rule will not be allowed. Each and every service must be explicitly allowed, which might seem like a significant amount of up-front configuration. However, this means that your policy tends towards security and that you know exactly what is permitted to receive traffic on your server. Also, nearly all preconfigured policies will follow this approach, meaning that you can build on existing defaults.
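As a sketch with iptables, a default-drop INPUT chain with a couple of explicit allowances might look like this (the SSH port is an assumption, and on a live remote session you would add the ACCEPT rules before switching the policy):

- sudo iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
- sudo iptables -A INPUT -p tcp --dport 22 -j ACCEPT
- sudo iptables -P INPUT DROP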
The choice of a default drop policy leads to another subtle decision. With iptables
and other similar firewalls, the default policy can be set using the built-in policy functionality of the firewall, or implemented by adding a catch-all drop rule at the end of the list of rules.
The distinction between these two methods comes down to what happens if the firewall rules are flushed.
If your firewall’s built-in policy function is set to DROP
and your firewall rules are ever flushed (reset), or if certain matching rules are removed, your services will instantly become inaccessible remotely. This is often a good idea when setting policy for non-critical services so that your server is not exposed to malicious traffic if the rules are removed.
The downside to this approach is that your services will be completely unavailable to your clients until you re-establish permissive rules. You could even potentially lock yourself out of the server if you do not have local or web-based remote access as an alternative.
The alternative to setting a drop policy using the built-in policy functionality is to set your firewall’s default policy to ACCEPT
and then implement a DROP
policy with regular rules. You can add a normal firewall rule at the end of your chain that matches and denies all remaining unmatched traffic.
In this case, if your firewall rules are flushed, your services will be accessible but unprotected. Depending on your options for local or alternative access, this might be a necessary evil to ensure that you can re-enter your server if the rules are flushed. If you decide to use this option, ensure that the catch-all rule always remains the last rule in your rule set.
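A sketch of this alternative: the built-in policy stays at ACCEPT and a final catch-all rule does the dropping (any ACCEPT rules for allowed services would need to sit above it):

- sudo iptables -P INPUT ACCEPT
- sudo iptables -A INPUT -j DROP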
There are a few different ways of preventing a packet from reaching its intended destination. The choice between these has an impact on how the client perceives its connection attempt and how quickly they are able to determine that their request will not be served.
The first way that packets can be denied is with DROP
. Drop can be used as a default policy or as a target for match rules. When a packet is dropped, iptables
just throws it away. It sends no response back to the client trying to connect and does not give any indication that it has ever even received the packets in question. This means that clients (legitimate or not) will not receive any confirmation of the receipt of their packets.
For TCP connection attempts (such as connections made by a web browser), the connection will stall until the timeout limit has been reached. The lack of response for UDP clients is even more ambiguous. In fact, not receiving a UDP packet back is often an indication that the packet was accepted. If the UDP client cares about receipt of its packets, it will have to resend them to try to determine whether they were accepted, lost in transit, or dropped. This can increase the amount of time that a malicious actor will have to spend to get information about the state of your server ports, but it could also cause problems with legitimate traffic.
An alternative to dropping traffic is to explicitly reject packets that you do not allow. ICMP, or Internet Control Message Protocol, is a meta-protocol used throughout the internet to send status, diagnostic, and error messages between hosts as an out-of-band channel that does not rely on conventional communication protocols like TCP or UDP. When you use the REJECT
target instead of the DROP
target, the traffic is denied and an ICMP packet is returned to the sender to inform them that their traffic was received but will not be accepted. A status message can also be included to provide a reason.
This has a number of consequences. Assuming that ICMP traffic is allowed to reach the client, they will immediately be informed that their traffic is blocked. For legitimate clients, this means that they can contact the administrator or check their connection options to ensure that they are reaching out to the correct port. For malicious users, this means that they can complete their scans and map out the open, closed, and filtered ports in a shorter period of time.
There is a lot to consider when deciding whether to drop or reject traffic. One important consideration is that most malicious traffic will actually be perpetrated by automated scripts. Since these scripts are typically not supervised, dropping illegitimate traffic will not meaningfully discourage them, and will have negative effects for legitimate users. More on this subject can be found on Peter Benie’s website.
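For illustration, here is how the two behaviors might be expressed as rules for a single port (the port is an assumption, and only one of these would be used in practice, since the first matching rule wins; --reject-with selects the response type):

- sudo iptables -A INPUT -p tcp --dport 25 -j DROP
- sudo iptables -A INPUT -p tcp --dport 25 -j REJECT --reject-with tcp-reset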
The table below shows how a server protected by a firewall will react to different requests depending on the policy being applied to the destination port.
| Client Packet Type | NMap Command | Port Policy | Response | Inferred Port State |
| --- | --- | --- | --- | --- |
| TCP | nmap [-sT \| -sS] -Pn <server> | Accept | TCP SYN/ACK | Open |
| TCP | nmap [-sT \| -sS] -Pn <server> | Drop | (none) | Filtered |
| TCP | nmap [-sT \| -sS] -Pn <server> | Reject | TCP RESET | Closed |
| UDP | nmap -sU -Pn <server> | Accept | (none) | Open or Filtered |
| UDP | nmap -sU -Pn <server> | Drop | (none) | Open or Filtered |
| UDP | nmap -sU -Pn <server> | Reject | ICMP Port Unreachable | Closed |
The first column indicates the packet type sent by the client. The second column contains the nmap
commands that can be used to test each scenario. The third column indicates the port policy being applied to the port. The fourth column is the response the server will send back and the fifth column is what the client can infer about the port based on the response it has received.
As with deciding whether to drop or reject denied traffic, you have the option to accept or reject ICMP packets destined for your server.
ICMP is a protocol used for many things. As mentioned, it is often sent back to give status information about requests using other protocols. One of its most popular functions is to send and respond to network pings to verify connectivity to remote hosts. There are many other uses for ICMP that are not as widely known, but still useful.
ICMP packets are organized by “type” and then further by “code”. A type specifies the general meaning of the message. For instance, Type 3 means that the destination was unreachable. A code is often used to give further information about a type. For example, ICMP Type 3 Code 3 means that the destination port was unavailable, while ICMP Type 3 Code 0 means that the destination network could not be reached.
Some ICMP types are deprecated, so they can be blocked unconditionally. Among these are ICMP source quench (type 4 code 0) and alternate host (type 6). Types 1, 2, 7 and type 15 and above are all deprecated, reserved for future use, or experimental. Many upstream firewall templates will handle this by default.
Some ICMP types are useful in certain network configurations, but should be blocked in others.
For instance, ICMP redirect messages (type 5) can be useful to illuminate bad network design. An ICMP redirect is sent when a better route is directly available to the client. So if a router receives a packet that will have to be routed to another host on the same network, it sends an ICMP redirect message to tell the client to send the packets through the other host in the future.
This is useful if you trust your local network and want to spot inefficiencies in your routing tables during initial configuration. On an untrusted network, a malicious user could potentially send ICMP redirects to manipulate the routing tables on hosts.
Other ICMP types that are useful in some networks and potentially harmful in others are ICMP router advertisement (type 9) and router solicitation (type 10) packets. Router advertisement and solicitation packets are used as part of IRDP (ICMP Internet Router Discovery Protocol), a system that allows hosts, upon booting up or joining a network, to dynamically discover available routers.
In most cases, it is better for a host to have static routes configured for the gateways it will use. These packets should be accepted in the same situations as the ICMP redirect packets. In fact, since the host will not know the preferred route for traffic through any discovered router, redirect messages are often needed directly after discovery. If you are not running a service that sends router solicitation packets or modifies your routes based on advertisement packets (like rdisc
), you can safely block these packets.
ICMP types that are usually safe to allow are below, but you may want to disable them if you want to be extra careful.
The types below can usually be allowed without explicit rules by configuring your firewall to allow responses to requests it has made (by using the conntrack
module to allow ESTABLISHED
and RELATED
traffic).
Blocking all incoming ICMP traffic is still recommended by some security experts; however, many people now encourage intelligent ICMP acceptance policies. These two Stackexchange threads have more information.
For some services and traffic patterns, you may want to allow access only as long as the client is not abusing that access. Two ways of constraining resource usage are connection limiting and rate limiting.
Connection limiting can be implemented using extensions like connlimit
to check how many active connections a client has open. This can be used to restrict the number of connections allowed at one time. If you decide to impose connection limits, you will have some decisions to make:
Connections can be limited on a host-by-host basis, or a limit can be set for a network segment by supplying a network prefix (such as an IP address range for an entire organization). You can also set a global maximum number of connections for a service or the entire machine. Keep in mind that it is possible to mix and match these to create more complex policies to control your connection numbers.
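A sketch of a per-host connection limit using the connlimit extension might look like this; the port and limit are assumptions, and --connlimit-mask could be added to group clients by network prefix instead of by individual address:

- sudo iptables -A INPUT -p tcp --syn --dport 80 -m connlimit --connlimit-above 20 -j REJECT --reject-with tcp-reset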
Rate limiting allows you to construct rules that govern the rate or frequency at which traffic will be accepted by your server. There are a number of different firewall extensions that can be used for rate limiting including limit
, hashlimit
, and recent
. The choice of the extension you use will depend largely on the way that you want to limit traffic.
The limit
extension will cause the rule in question to be matched only until the limit is hit; after that, matching packets fall through to the rules that follow (typically a drop rule or the chain's default policy). A limit like 5/sec will allow 5 packets to match per second, after which the rule no longer matches. This is good for setting a global rate limit for a service. You can also deploy an additional service like Fail2ban to block repeated connection attempts.
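A sketch of this pattern with iptables (the port, rate, and burst values are assumptions): the first rule accepts matching packets while under the limit, and the second drops whatever falls through once the limit is exceeded:

- sudo iptables -A INPUT -p tcp --dport 80 -m limit --limit 5/second --limit-burst 10 -j ACCEPT
- sudo iptables -A INPUT -p tcp --dport 80 -j DROP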
The hashlimit
extension is more flexible, allowing you to specify some of the values that iptables
will hash to evaluate a match. For instance, it can look at the source address, source port, destination address, destination port, or a combination of those four values to evaluate each entry. It can limit by packets or by bytes received. This provides flexible per-client or per-service rate limiting.
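A per-client variant using hashlimit, keyed on the source address, might look like this sketch (the zone name, rate, and port are assumptions):

- sudo iptables -A INPUT -p tcp --dport 80 -m hashlimit --hashlimit-mode srcip --hashlimit-upto 30/minute --hashlimit-burst 30 --hashlimit-name http-limit -j ACCEPT
- sudo iptables -A INPUT -p tcp --dport 80 -j DROP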
The recent
extension dynamically adds client IP addresses to a list or checks against an existing list when the rule matches. This allows you to spread your limiting logic across a number of different rules for complex patterns. It has the ability to specify a hit count and a time range like the other limiters, but can also reset the time range if additional traffic is seen, forcing a client to stop all traffic if they are being limited.
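A sketch of throttling repeated new SSH connections with the recent extension (the list name, time window, and hit count are assumptions); --update refreshes the client's timer each time it matches, so a client that keeps trying stays blocked:

- sudo iptables -A INPUT -p tcp --dport 22 -m conntrack --ctstate NEW -m recent --set --name ssh-attempts
- sudo iptables -A INPUT -p tcp --dport 22 -m conntrack --ctstate NEW -m recent --update --seconds 60 --hitcount 4 --name ssh-attempts -j DROP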
All iptables
and nftables
firewall policy is essentially rooted in extending the built-in chains. For a start, this usually means changing the default policy for the existing chains and adding rules. For more complex firewalls, it is often a good idea to extend the management framework by creating additional chains.
User-created chains are called from the built-in chains (or from other user-created chains) and are inherently tied to the “calling chain” they originate from. User-created chains have no default policy, so if a packet falls through a user-created chain, it will return to the calling chain and continue evaluation. With that in mind, user-created chains are mainly useful for organizational purposes: they make rule matching conditions more maintainable and improve readability by splitting up match conditions.
If you find yourself repeating certain match criteria for a significant number of rules, it might be worthwhile to create a jump rule with the shared match criteria to a new chain. Inside the new chain, you can add that set of rules with the redundant matching criteria omitted.
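For example, if many rules share the same match on web traffic, you could create a chain for it, jump to it once, and keep the web-specific rules there. The chain name and the rules inside it are hypothetical:

- sudo iptables -N WEB
- sudo iptables -A INPUT -p tcp -m multiport --dports 80,443 -j WEB
- sudo iptables -A WEB -m connlimit --connlimit-above 50 -j REJECT --reject-with tcp-reset
- sudo iptables -A WEB -j ACCEPT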
The decision as to whether to lump all of your rules into one of the built-in chains or whether to create and utilize additional chains will depend on how complex your rule set is.
You should now have a better understanding of the decisions you’ll have to make when designing firewall policies for your servers. Usually the time investment involved with firewalls skews heavily towards the initial setup. While it may take some time and experimentation to come up with a policy that best serves your needs, doing so will give you more control over the security of your server.
If you would like to know more about firewalls and iptables
specifically, check out the following articles:
The following guides can help you implement your policies. Choose the guide that matches your firewall to get started:
Setting up a firewall is an essential step to take in securing any modern operating system. Most Linux distributions ship with a few different firewall tools that you can use to configure a firewall. In this guide, we’ll be covering the iptables
firewall.
Iptables is a standard firewall included in most Linux distributions by default. It is a command-line interface to the kernel-level netfilter hooks that can manipulate the Linux network stack. It works by matching each packet that crosses the networking interface against a set of rules to decide what to do.
In this guide, you will review how Iptables works. For a more in-depth approach, you can read A Deep Dive into Iptables and Netfilter Architecture.
First, let’s review some terminology and discuss how iptables works.
The iptables firewall operates by comparing network traffic against a set of rules. The rules define the characteristics that a network packet needs to have to match, and the action that should be taken for matching packets.
There are many options to establish which packets match a specific rule. You can match the packet protocol type, the source or destination address or port, the interface that is being used, its relation to previous packets, and so on.
When the defined pattern matches, the action that takes place is called a target. A target can be a final policy decision for the packet, such as ACCEPT
or DROP
. It can also move the packet to a different chain for processing, or log the encounter. There are many options.
These rules are organized into groups called chains. A chain is a set of rules that a packet is checked against sequentially. When the packet matches one of the rules, it executes the associated action and skips the remaining rules in the chain.
A user can create chains as needed. There are three chains defined by default in the filter table:
- INPUT: handles packets destined for the local server.
- FORWARD: handles packets being routed through the server to another destination.
- OUTPUT: handles packets generated by the server itself.
Each chain can contain zero or more rules, and has a default policy. The policy determines what happens when a packet drops through all of the rules in the chain and does not match any rule. You can either drop the packet or accept the packet if no rules match.
Iptables can also track connections. This means you can create rules that define what happens to a packet based on its relationship to previous packets. This capability is referred to as “state tracking”, “connection tracking”, or implementing the “state machine”.
The netfilter firewall that is included in the Linux kernel keeps IPv4 and IPv6 traffic completely separate. The Iptables tools used to manipulate the tables that contain the firewall rulesets are distinct as well. If you have IPv6 enabled on your server, you will have to configure both tables to address the traffic on your server.
Note: Nftables, a successor to Iptables, integrates handling of IPv4 and IPv6 more closely. The iptables-translate command can be used to migrate Iptables rules to Nftables.
The regular iptables
command is used to manipulate the table containing rules that govern IPv4 traffic. For IPv6 traffic, a companion command called ip6tables
is used. Any rules that you set with iptables
will only affect packets using IPv4 addressing, but the syntax between these commands is the same. The iptables
command will make the rules that apply to IPv4 traffic, and the ip6tables
command will make the rules that apply to IPv6 traffic. Don’t forget to use the IPv6 addresses of your server to craft the ip6tables
rules.
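For instance, equivalent rules for the two protocols use the same syntax with different commands (the port and target here are assumptions):

- sudo iptables -A INPUT -p tcp --dport 80 -j ACCEPT
- sudo ip6tables -A INPUT -p tcp --dport 80 -j ACCEPT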
Now that you know how iptables directs packets that come through its interface – direct the packet to the appropriate chain, check it against each rule until one matches, issue the default policy of the chain if no match is found – you can begin to create rules.
First, you need to make sure that you have rules to keep current connections active if you implement a default drop policy. This is especially important if you are connected to your server through SSH. If you accidentally implement a rule or policy that drops your current connection, you may need to log into your server using a browser-based recovery console.
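A sketch of what that might look like before tightening the INPUT policy; the conntrack rule keeps established sessions (including your current SSH connection) working, and the SSH port is an assumption:

- sudo iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
- sudo iptables -A INPUT -p tcp --dport 22 -m conntrack --ctstate NEW -j ACCEPT
- sudo iptables -P INPUT DROP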
Another thing to keep in mind is that the order of the rules in each chain matter. A packet must not come across a more general rule that it matches if it is meant to match a more specific rule.
Because of this, rules near the top of a chain should have a higher level of specificity than rules at the bottom. You should match specific cases first, and then provide more general rules to match broader patterns. If a packet falls through the entire chain (if it doesn’t match any rules), it will follow the most general rule, i.e., the default policy.
For this reason, a chain’s default policy strongly dictates the types of rules that will be included in the chain. A chain with the default policy of ACCEPT
will contain rules that explicitly drop packets. A chain that defaults to DROP
will contain exceptions for packets that should be specifically accepted.
At this point, you’re ready to implement your own firewall. For this, you should read How To Set Up a Firewall Using Nftables on Ubuntu 22.04. Or, for a more high-level approach, How To Set Up a Firewall with UFW on Ubuntu 22.04. If you’d prefer to run your firewall as a managed service layer, you can also try DigitalOcean’s Cloud Firewalls.
Nginx is a high performance web server that is responsible for handling the load of some of the largest sites on the internet. It is especially good at handling many concurrent connections and excels at forwarding or serving static content. In this guide, we will focus on discussing the structure of an Nginx configuration file along with some guidelines on how to design your files.
This guide will cover the structure of the main Nginx configuration file. The location of this file will depend on how Nginx was installed. On many Linux distributions, the file will be located at /etc/nginx/nginx.conf
. If it does not exist there, it may also be at /usr/local/nginx/conf/nginx.conf
or /usr/local/etc/nginx/nginx.conf
.
One of the first things that you should notice when looking at the main configuration file is that it is organized in a tree-like structure, marked by sets of brackets ({
and }
). In Nginx documentation, the areas that these brackets define are called “contexts” because they contain configuration details that are separated according to their area of concern. These divisions provide an organizational structure along with some conditional logic to decide whether to apply the configurations within.
Because contexts can be layered within one another, Nginx allows configurations to be inherited. As a general rule, if a directive is valid in multiple nested scopes, a declaration in a broader context will be passed on to any child contexts as default values. The child contexts can override these values. It is important to note that an override to any array-type directives will replace the previous value, not add to it.
Directives can only be used in the contexts that they were designed for. Nginx will throw an error when reading a configuration file with directives that are declared in the wrong context. The Nginx documentation contains information about which contexts each directive is valid in, making it a useful reference.
Below, we’ll discuss the most common contexts that you’re likely to come across when working with Nginx.
The first group of contexts that we will discuss are the core contexts that Nginx uses to create a hierarchical tree and separate the discrete configuration blocks. These are the contexts that make up the major structure of an Nginx configuration.
The most general context is the “main” or “global” context. It is the only context that is not contained within the typical context blocks that look like this:
# The main context is here, outside any other contexts
. . .
context {
. . .
}
Any directive that exists entirely outside of these blocks belongs to the “main” context. Keep in mind that if your Nginx configuration is set up in a modular fashion – i.e., with configuration options in multiple files – some files will contain instructions that appear to exist outside of a bracketed context, but will be included within a context when the configuration is loaded together.
The main context represents the broadest environment for Nginx configuration. It is used to configure details that affect the entire application. While the directives in this section affect the lower contexts, many of these cannot be overridden in lower levels.
Some common details that are configured in the main context are the system user and group to run the worker processes as, the number of workers, and the file to save the main Nginx process’s ID. The default error file for the entire application can be set at this level (this can be overridden in more specific contexts).
The “events” context is contained within the “main” context. It is used to set global options that affect how Nginx handles connections at a general level. There can only be a single events context defined within the Nginx configuration.
This context will look like this in the configuration file, outside of any other bracketed contexts:
# main context
events {
# events context
. . .
}
Nginx uses an event-based connection processing model, so the directives defined within this context determine how worker processes should handle connections. Mainly, directives found here are used to either select the connection processing technique to use, or to modify the way these methods are implemented.
Usually, the connection processing method is automatically selected based on the most efficient choice that the platform has available. For Linux systems, the epoll
method is usually the best choice.
Other items that can be configured are the number of connections each worker can handle, whether a worker will only take a single connection at a time or take all pending connections after being notified about a pending connection, and whether workers will take turns responding to events.
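A brief sketch of what these settings can look like; the values shown are illustrative rather than recommendations:

# main context
events {
    # events context
    worker_connections 1024;
    multi_accept off;
}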
Defining an HTTP context is probably the most common use of Nginx. When configuring Nginx as a web server or reverse proxy, the “http” context will hold the majority of the configuration. This context will contain all of the directives and other contexts necessary to define how the program will handle HTTP or HTTPS connections.
The http context is a sibling of the events context, so they should be listed side-by-side, rather than nested. They both are children of the main context:
# main context
events {
# events context
. . .
}
http {
# http context
. . .
}
While lower contexts get more specific about how to handle requests, directives at this level control the defaults for every virtual server defined within. A large number of directives are configurable at this context and below, depending on how you would like the inheritance to function.
Some of the directives that you are likely to encounter control the default locations for access and error logs (access_log
and error_log
), configure asynchronous I/O for file operations (aio
, sendfile
, and directio
), and configure the server’s statuses when errors occur (error_page
). Other directives configure compression (gzip
and gzip_disable
), fine-tune the TCP keep alive settings (keepalive_disable
, keepalive_requests
, and keepalive_timeout
), and the rules that Nginx will follow to try to optimize packets and system calls (sendfile
, tcp_nodelay
, and tcp_nopush
). Additional directives configure an application-level document root and index files (root
and index
) and set up the various hash tables that are used to store different types of data (*_hash_bucket_size
and *_hash_max_size
for server_names
, types
, and variables
). For more information, refer to the Nginx documentation.
The “server” context is declared within the “http” context. This is our first example of nested, bracketed contexts. It is also the first context that allows for multiple declarations.
The general format for server context may look something like this. Remember that these reside within the http context:
# main context
http {
# http context
server {
# first server context
}
server {
# second server context
}
}
You can declare multiple server
contexts, because each instance defines a specific virtual server to handle client requests. You can have as many server blocks as you need, each of which can handle a specific subset of connections.
This context type is also the first in which Nginx must use a selection algorithm. Each client request will be handled according to the configuration defined in a single server context, so Nginx must decide which server context is most appropriate based on details of the request. Two directives are commonly used for this, and they differ in their use of domain names:
- listen: the IP address and port combination that the server block answers on.
- server_name: the domain names matched against the request's “Host” header when multiple server blocks share the same listen values.
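As a sketch, two server blocks on the same port might be distinguished only by their server_name values, with a default_server block catching everything else (the domains are hypothetical):

# http context
server {
    listen 80 default_server;
    server_name _;
    # handles requests that match no other server block
}

server {
    listen 80;
    server_name example.com www.example.com;
    # handles requests whose Host header matches these names
}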
The directives in this context can override many of the directives that may be defined in the http context, including logging, the document root, compression, etc. In addition to the directives that are taken from the http context, we also can configure files to try to respond to requests (try_files
), issue redirects and rewrites (return
and rewrite
), and set arbitrary variables (set
).
The next context that you will deal with regularly is the location context. Location contexts share many relational qualities with server contexts. For example, multiple location contexts can be defined, each location is used to handle a certain type of client request, and each location is selected by matching the location definition against the client request through a selection algorithm.
While the directives that determine whether to select a server block are defined within the server context, the component that decides on a location’s ability to handle a request is located in the location definition (the line that opens the location block).
The general syntax looks like this:
location match_modifier location_match {
. . .
}
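The optional modifier changes how the location_match is interpreted. A few hypothetical examples of location definitions, using the standard Nginx modifiers:

# server context
location = /exact/path {
    # matches only the exact URI /exact/path
}

location ^~ /static/ {
    # preferential prefix match; regex locations are skipped if this matches
}

location ~ \.php$ {
    # case-sensitive regular expression match
}

location /docs/ {
    # standard prefix match
}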
Location blocks live within server contexts and, unlike server blocks, can be nested inside one another. This can be useful for creating a more general location context to catch a certain subset of traffic, and then further processing it based on more specific criteria with additional contexts inside:
# main context
server {
# server context
location /match/criteria {
# first location context
}
location /other/criteria {
# second location context
location nested_match {
# first nested location
}
location other_nested {
# second nested location
}
}
}
While server contexts are selected based on the requested IP address/port combination and the host name in the “Host” header, location blocks further divide up the request handling within a server block by looking at the request URI. The request URI is the portion of the request that comes after the domain name or IP address/port combination.
Say, for example, that a client requests http://www.example.com/blog
on port 80. The http
, www.example.com
, and port 80 components individually would all be used to determine which server block to select. After a server is selected, the /blog
portion (the request URI), would be evaluated against the defined locations to determine which further context should be used to respond to the request.
Many of the directives you are likely to see in a location context are also available at the parent levels. New directives at this level allow you to reach locations outside of the document root (alias
), mark the location as only internally accessible (internal
), and proxy to other servers or locations (using http, fastcgi, scgi, and uwsgi proxying).
While the above examples represent the essential contexts that you will encounter with Nginx, other contexts exist as well. The following contexts are used only in certain circumstances, or they are used for functionality that most people will not be using:
- split_clients: This context is configured to split the clients that the server receives into categories by labeling them with variables based on a percentage. These can then be used to do A/B testing by providing different content to different hosts.
- perl / perl_set: These contexts configure Perl handlers for the location they appear in. This will only be used for processing with Perl.
- map: This context is used to set the value of a variable depending on the value of another variable. It provides a mapping of one variable’s values to determine what the second variable should be set to.
- geo: Like the above context, this context is used to specify a mapping. However, this mapping is specifically used to categorize client IP addresses. It sets the value of a variable depending on the connecting IP address.
- types: This context is again used for mapping. This context is used to map MIME types to the file extensions that should be associated with them. This is usually provided with Nginx through a file that is sourced into the main nginx.conf config file.
- charset_map: This is another example of a mapping context. This context is used to map a conversion table from one character set to another. In the context header, both sets are listed and in the body, the mapping takes place.

The upstream context is used to define and configure “upstream” servers. This context defines a named pool of servers that Nginx can then proxy requests to. This context will likely be used when you are configuring proxies of various types.
The upstream context should be placed within the http context, outside of any specific server contexts. The form looks like this:
# main context
http {
# http context
upstream upstream_name {
# upstream context
server proxy_server1;
server proxy_server2;
. . .
}
server {
# server context
}
}
The upstream context can then be referenced by name within server or location blocks to pass requests of a certain type to the pool of servers that have been defined. The upstream will then use an algorithm (round-robin by default) to determine which specific server to hand the request to. This context gives Nginx the ability to do some load balancing when proxying requests.
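A server or location block can then hand requests to the pool by name. A minimal sketch, reusing the hypothetical upstream_name from the example above:

# server context
location / {
    proxy_pass http://upstream_name;
}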
The “if” context can be established to provide conditional processing of directives. Like an if statement in conventional programming, the if directive in Nginx will execute the instructions contained if a given test returns “true”.
The if context in Nginx is provided by the rewrite module and this is the primary intended use of this context. Since Nginx will test conditions of a request with many other purpose-made directives, if
should not be used for most forms of conditional execution. This is such an important note that the Nginx community has created a page called if is evil.
The problem is that the Nginx processing order can very often lead to unexpected results. The only directives that are considered reliably safe to use inside of these contexts are the return
and rewrite
directives (the ones this context was created for). Another thing to keep in mind when using an if context is that it renders a try_files
directive in the same context useless.
Most often, an if will be used to determine whether a rewrite or return is needed. These will most often exist in location blocks, so the common form will look something like this:
# main context
http {
# http context
server {
# server context
location location_match {
# location context
if (test_condition) {
# if context
}
}
}
}
The limit_except
context is used to restrict the use of certain HTTP methods within a location context. For example, if only certain clients should have access to POST content, but everyone should have the ability to read content, you can use a limit_except
block to define this requirement.
The above example would look something like this:
. . .
# server or location context
location /restricted-write {
# location context
limit_except GET HEAD {
# limit_except context
allow 192.168.1.1/24;
deny all;
}
}
This will apply the directives inside the context (meant to restrict access) when encountering any HTTP methods except those listed in the context header. The result of the above example is that any client can use the GET and HEAD verbs, but only clients coming from the 192.168.1.1/24
subnet are allowed to use other methods.
Now that you have an idea of the common contexts that you are likely to encounter when exploring Nginx configurations, we can discuss some best practices to use when dealing with Nginx contexts.
Many directives are valid in more than one context. For instance, there are quite a few directives that can be placed in the http, server, or location context. This gives us flexibility in setting these directives.
As a general rule, it is usually best to declare directives in the highest context to which they are applicable, and overriding them in lower contexts as necessary. This is possible because of the inheritance model that Nginx implements. There are many reasons to use this strategy.
First of all, declaring at a high level allows you to avoid unnecessary repetition between sibling contexts. For instance, in the example below, each of the locations is declaring the same document root:
http {
server {
location / {
root /var/www/html;
. . .
}
location /another {
root /var/www/html;
. . .
}
}
}
You could move the root out to the server block, or even to the http block, like this:
http {
root /var/www/html;
server {
location / {
. . .
}
location /another {
. . .
}
}
}
Most of the time, the server level will be most appropriate, but declaring at the higher level has its advantages. This not only allows you to set the directive in fewer places, it also allows you to cascade the default value down to all of the child elements, preventing situations where you run into an error by forgetting a directive at a lower level. This can be a major issue with long configurations. Declaring at higher levels provides you with a good default.
When you want to handle requests differently depending on some information that can be found in the client’s request, often users jump to the “if” context to try to conditionalize processing. There are a few issues with this that we touched on briefly earlier.
The first is that the “if” directive often returns results that do not align with the administrator’s expectations. Although the processing will always lead to the same result given the same input, the way that Nginx interprets the environment can be different than can be assumed without heavy testing.
The second reason for this is that there are already optimized, purpose-made directives that are used for many of these purposes. Nginx already uses a well-documented selection algorithm for things like selecting server blocks and location blocks. If possible, it is best to try to move your different configurations into their own blocks so that this algorithm can handle the selection process logic.
For instance, instead of relying on rewrites to get a user supplied request into the format that you would like to work with, you should try to set up two blocks for the request, one of which represents the desired method, and the other that catches messy requests and redirects (and possibly rewrites) them to your correct block.
The result is usually more readable and also has the added benefit of being more performant. Correct requests undergo no additional processing and, in many cases, incorrect requests can get by with a redirect rather than a rewrite, which should execute with lower overhead.
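As a rough sketch of this pattern (the /app and /old-app prefixes below are hypothetical, not taken from any particular configuration), well-formed requests are served directly while legacy requests are answered with a redirect:
# server context
server {
# canonical block: well-formed requests are served here with no extra processing
location /app/ {
try_files $uri $uri/ =404;
}
# catch requests that still use the old prefix and redirect them to the canonical one
location ~ ^/old-app(/.*)$ {
return 301 /app$1;
}
}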
By this point, you should have a good grasp on Nginx’s most common contexts and the directives that create the blocks that define them.
Always check Nginx’s documentation for information about which contexts a directive can be placed in and to evaluate the most effective location. Taking care when creating your configurations will improve maintainability and also often increase performance.
Next, you can learn how to configure password authentication with Nginx.
In popular usage, “Linux” often refers to a group of operating system distributions built around the Linux kernel. In the strictest sense, though, Linux refers only to the presence of the kernel itself. To build out a full operating system, Linux distributions often include tooling and libraries from the GNU project and other sources. More developers have been using Linux recently to build and run mobile applications; it has also played a key role in the development of affordable devices such as Chromebooks, which run an operating system built on the Linux kernel. Within cloud computing and server environments in general, Linux is a popular choice for some practical reasons:
Linux also traces its origins to the free and open-source software movement, and as a consequence some developers choose it for a combination of ethical and practical reasons:
To understand Linux’s role within the developer community (and beyond), this article will outline a brief history of Linux by way of Unix, and discuss some popular Linux distributions.
Linux has its roots in Unix and Multics, two projects that shared the goal of developing a robust multi-user operating system.
Unix developed out of the Multics project iteration at the Bell Laboratories’ Computer Sciences Research Center. The developers working on Multics at Bell Labs and elsewhere were interested in building a multi-user operating system with single-level storage, dynamic linking (in which a running process can request that another segment be added to its address space, enabling it to execute that segment’s code), and a hierarchical file system.
Bell Labs stopped funding the Multics project in 1969, but a group of researchers, including Ken Thompson and Dennis Ritchie, continued working with the project’s core principles. In 1972-3 they made the decision to rewrite the system in C, which made Unix uniquely portable: unlike other contemporary operating systems, it could both move from and outlive its hardware.
Research and development at Bell Labs (later AT&T) continued, with Unix System Laboratories developing versions of Unix, in partnership with Sun Microsystems, that would be widely adopted by commercial Unix vendors. Meanwhile, research continued in academic circles, most notably the Computer Systems Research Group at the University of California, Berkeley. This group produced the Berkeley Software Distribution (BSD), which inspired a range of operating systems, many of which are still in use today. One BSD-derived system of historical note is NeXTSTEP, the operating system pioneered by NeXT, which became the basis for macOS, among other products. Another influential Unix-like system was MINIX, an educational operating system that formed a comparative basis for Linus Torvalds as he developed Linux.
Unix is oriented around principles of clarity, portability, and simultaneity.
Unix raised important questions for developers, but it also remained proprietary in its earliest iterations. The next chapter of its history is thus the story of how developers worked within and against it to create free and open-source alternatives.
Richard Stallman was a central figure among the developers who were inspired to create non-proprietary alternatives to Unix. While working at MIT’s Artificial Intelligence Laboratory, he initiated work on the GNU project (a recursive acronym for “GNU’s Not Unix!”), eventually leaving the Lab in 1984 so he could distribute GNU components as free software. The GNU kernel, known as GNU HURD, became the focus of the Free Software Foundation (FSF), founded in 1985 and currently headed by Stallman.
Meanwhile, another developer was at work on a free alternative to Unix: Finnish undergraduate Linus Torvalds. After becoming frustrated with the licensing of MINIX, Torvalds announced to a MINIX user group on August 25, 1991 that he was developing his own operating system, which resembled MINIX. Though initially developed on MINIX using the GNU C compiler, the Linux kernel quickly became a unique project with a core of developers who released version 1.0 of the kernel with Torvalds in 1994.
Torvalds had been using GNU code, including the GNU C Compiler, with his kernel, and it remains true that many Linux distributions draw on GNU components. Stallman has lobbied to expand the term “Linux” to “GNU/Linux,” which he argues would capture both the role of the GNU project in Linux’s development and the underlying ideals that fostered the GNU project and the Linux kernel. Today, “Linux” is often used to indicate both the presence of the Linux kernel and GNU elements. At the same time, embedded systems on many handheld devices and smartphones often use the Linux kernel with few to no GNU components.
Though the Linux kernel inherited many goals and properties from Unix, it differs from the earlier system in the following ways:
Developers maintain many popular Linux distributions today. Among the longest-standing is Debian, a free and open-source distribution that has 50,000 software packages. Debian inspired another popular distribution, Ubuntu, funded by Canonical Ltd. Ubuntu uses Debian’s deb package format and package management tools, and Ubuntu’s developers push changes back upstream to Debian.
A similar relationship exists between Red Hat, Fedora, and CentOS. Red Hat created a Linux distribution in 1993, and ten years later split its efforts into Red Hat Enterprise Linux and Fedora, a community-based operating system that utilizes the Linux kernel and elements from the GNU Project. Red Hat also has a relationship with the CentOS Project, another popular Linux distribution for web servers. This relationship does not include paid maintenance, however. Like Debian, CentOS is maintained by a community of developers.
In this article, we have covered Linux’s roots in Unix and some of its defining features. If you are interested in learning more about the history of Linux and Unix variations (including FreeBSD), a good step might be our series on FreeBSD. Another option might be to consider our introductory series on getting started with Linux. You can also check out this introduction to the filesystem layout in Linux, this discussion of how to use find and locate to search for files on a Linux VPS, or this introduction to regular expressions on the command line.
Firewalls are an important tool that can be configured to protect your servers and infrastructure. In the Linux ecosystem, iptables
is a widely used firewall tool that works with the kernel’s netfilter
packet filtering framework. Creating reliable firewall policies can be daunting, due to complex syntax and the number of interrelated parts involved.
In this guide, we will dive into the iptables
architecture with the aim of making it more comprehensible for users who need to build their own firewall policies. We will discuss how iptables
interacts with netfilter
and how the various components fit together to provide a comprehensive filtering system.
For many years, the firewall software most commonly used in Linux was called iptables
. In some distributions, it has been replaced by a new tool called nftables
, but iptables
syntax is still commonly used as a baseline. The iptables
firewall works by interacting with the packet filtering hooks in the Linux kernel’s networking stack. These kernel hooks are known as the netfilter
framework.
Every packet that passes through the networking layer (incoming or outgoing) will trigger these hooks, allowing programs to interact with the traffic at key points. The kernel modules associated with iptables
register with these hooks in order to ensure that the traffic conforms to the conditions laid out by the firewall rules.
There are five netfilter hooks that programs can register with. As packets progress through the stack, they will trigger the kernel modules that have registered with these hooks. The hooks that a packet will trigger depend on whether the packet is incoming or outgoing, the packet’s destination, and whether the packet was dropped or rejected at a previous point.
The following hooks represent these well-defined points in the networking stack:
- NF_IP_PRE_ROUTING: This hook will be triggered by any incoming traffic very soon after entering the network stack. This hook is processed before any routing decisions have been made regarding where to send the packet.
- NF_IP_LOCAL_IN: This hook is triggered after an incoming packet has been routed if the packet is destined for the local system.
- NF_IP_FORWARD: This hook is triggered after an incoming packet has been routed if the packet is to be forwarded to another host.
- NF_IP_LOCAL_OUT: This hook is triggered by any locally created outbound traffic as soon as it hits the network stack.
- NF_IP_POST_ROUTING: This hook is triggered by any outgoing or forwarded traffic after routing has taken place and just before being sent out on the wire.
Kernel modules that need to register at these hooks must also provide a priority number to help determine the order in which they will be called when the hook is triggered. This provides the means for multiple modules (or multiple instances of the same module) to be connected to each of the hooks with deterministic ordering. Each module will be called in turn and will return a decision to the netfilter framework after processing that indicates what should be done with the packet.
The iptables
firewall uses tables to organize its rules. These tables classify rules according to the type of decisions they are used to make. For instance, if a rule deals with network address translation, it will be put into the nat
table. If the rule is used to decide whether to allow the packet to continue to its destination, it would probably be added to the filter
table.
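As a rough illustration (the interface name below is hypothetical), a filtering rule and an address translation rule are appended to chains in different tables; the -t flag selects the table, and the filter table is used when -t is omitted:
# filtering decision: lands in the filter table (the default)
sudo iptables -A INPUT -p tcp --dport 80 -j ACCEPT
# address translation decision: explicitly placed in the nat table
sudo iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE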
Within each iptables
table, rules are further organized within separate “chains”. While tables are defined by the general aim of the rules they hold, the built-in chains represent the netfilter
hooks which trigger them. Chains determine when rules will be evaluated.
The names of the built-in chains mirror the names of the netfilter
hooks they are associated with:
- PREROUTING: Triggered by the NF_IP_PRE_ROUTING hook.
- INPUT: Triggered by the NF_IP_LOCAL_IN hook.
- FORWARD: Triggered by the NF_IP_FORWARD hook.
- OUTPUT: Triggered by the NF_IP_LOCAL_OUT hook.
- POSTROUTING: Triggered by the NF_IP_POST_ROUTING hook.
Chains allow the administrator to control where in a packet’s delivery path a rule will be evaluated. Since each table has multiple chains, a table’s influence can be exerted at multiple points in processing. Because certain types of decisions only make sense at certain points in the network stack, not every table will have a chain registered with each kernel hook.
There are only five netfilter
kernel hooks, so chains from multiple tables are registered at each of the hooks. For instance, three tables have PREROUTING
chains. When these chains register at the associated NF_IP_PRE_ROUTING
hook, they specify a priority that dictates what order each table’s PREROUTING
chain is called. Each of the rules inside the highest priority PREROUTING
chain is evaluated sequentially before moving onto the next PREROUTING
chain. We will take a look at the specific order of each chain in a moment.
Let’s step back for a moment and take a look at the different tables that iptables
provides. These represent distinct sets of rules, organized by area of concern, for evaluating packets.
The filter table is one of the most widely used tables in iptables
. The filter
table is used to make decisions about whether to let a packet continue to its intended destination or to deny its request. In firewall parlance, this is known as “filtering” packets. This table provides the bulk of functionality that people think of when discussing firewalls.
The nat
table is used to implement network address translation rules. As packets enter the network stack, rules in this table will determine whether and how to modify the packet’s source or destination addresses in order to impact the way that the packet and any response traffic are routed. This is often used to route packets to networks when direct access is not possible.
The mangle
table is used to alter the IP headers of the packet in various ways. For instance, you can adjust the TTL (Time to Live) value of a packet, either lengthening or shortening the number of valid network hops the packet can sustain. Other IP headers can be altered in similar ways.
This table can also place an internal kernel “mark” on the packet for further processing in other tables and by other networking tools. This mark does not touch the actual packet, but adds the mark to the kernel’s representation of the packet.
The iptables
firewall is stateful, meaning that packets are evaluated in regards to their relation to previous packets. The connection tracking features built on top of the netfilter
framework allow iptables
to view packets as part of an ongoing connection or session instead of as a stream of discrete, unrelated packets. The connection tracking logic is usually applied very soon after the packet hits the network interface.
The raw
table has a very narrowly defined function. Its only purpose is to provide a mechanism for marking packets in order to opt-out of connection tracking.
The security
table is used to set internal SELinux security context marks on packets, which will affect how SELinux or other systems that can interpret SELinux security contexts handle the packets. These marks can be applied on a per-packet or per-connection basis.
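You can inspect each of these tables separately. For example, the following commands (run with root privileges) list the chains and rules currently loaded in each one:
sudo iptables -t filter -L -n -v   # the default table if -t is omitted
sudo iptables -t nat -L -n -v
sudo iptables -t mangle -L -n -v
sudo iptables -t raw -L -n -v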
If three tables have PREROUTING
chains, in which order are they evaluated?
The following table indicates the chains that are available within each iptables
table when read from left-to-right. For instance, we can tell that the raw
table has both PREROUTING
and OUTPUT
chains. When read from top-to-bottom, it also displays the order in which each chain is called when the associated netfilter
hook is triggered.
A few things should be noted. In the representation below, the nat table has been split between DNAT operations (those that alter the destination address of a packet) and SNAT operations (those that alter the source address) in order to display their ordering more clearly. We have also included rows that represent points where routing decisions are made and where connection tracking is enabled in order to give a more holistic view of the processes taking place:
| Tables↓/Chains→ | PREROUTING | INPUT | FORWARD | OUTPUT | POSTROUTING |
| --- | --- | --- | --- | --- | --- |
| (routing decision) | | | | ✓ | |
| raw | ✓ | | | ✓ | |
| (connection tracking enabled) | ✓ | | | ✓ | |
| mangle | ✓ | ✓ | ✓ | ✓ | ✓ |
| nat (DNAT) | ✓ | | | ✓ | |
| (routing decision) | ✓ | | | ✓ | |
| filter | | ✓ | ✓ | ✓ | |
| security | | ✓ | ✓ | ✓ | |
| nat (SNAT) | | ✓ | | | ✓ |
As a packet triggers a netfilter
hook, the associated chains will be processed as they are listed in the table above from top-to-bottom. The hooks (columns) that a packet will trigger depend on whether it is an incoming or outgoing packet, the routing decisions that are made, and whether the packet passes filtering criteria.
Certain events will cause a table’s chain to be skipped during processing. For instance, only the first packet in a connection will be evaluated against the NAT rules. Any nat
decisions made for the first packet will be applied to all subsequent packets in the connection without additional evaluation. Responses to NAT’ed connections will automatically have the reverse NAT rules applied to route correctly.
Assuming that the server knows how to route a packet and that the firewall rules permit its transmission, the following flows represent the paths that will be traversed in different situations:
- Incoming packets destined for the local system: PREROUTING -> INPUT
- Incoming packets destined for another host: PREROUTING -> FORWARD -> POSTROUTING
- Locally generated packets: OUTPUT -> POSTROUTING
If we combine the above information with the ordering laid out in the previous table, we can see that an incoming packet destined for the local system will first be evaluated against the PREROUTING
chains of the raw
, mangle
, and nat
tables. It will then traverse the INPUT
chains of the mangle
, filter
, security
, and nat
tables before finally being delivered to the local socket.
Rules are placed within a specific chain of a specific table. As each chain is called, the packet in question will be checked against each rule within the chain in order. Each rule has a matching component and an action component.
The matching portion of a rule specifies the criteria that a packet must meet in order for the associated action (or “target”) to be executed.
The matching system is very flexible and can be expanded significantly with additional iptables
extensions. Rules can be constructed to match by protocol type, destination or source address, destination or source port, destination or source network, input or output interface, headers, or connection state among other criteria. These can be combined to create complex rule sets to distinguish between different traffic.
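As a sketch of how several match criteria can combine in a single rule (the interface name and subnet below are hypothetical), the following accepts new SSH connections only when they arrive on one interface from one subnet:
sudo iptables -A INPUT -i eth1 -p tcp --dport 22 -s 203.0.113.0/24 -m conntrack --ctstate NEW -j ACCEPT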
A “target” refers to the actions that are triggered when a packet meets the matching criteria of a rule. Targets are generally divided into two categories:
- Terminating targets: These perform an action that ends evaluation within the chain and returns control to the netfilter hook. Depending on the return value provided, the hook might drop the packet or allow the packet to continue to the next stage of processing.
- Non-terminating targets: These perform an action and then allow evaluation to continue within the chain. Although each chain must eventually return a final terminating decision, any number of non-terminating targets can be executed beforehand.
The availability of each target within rules will depend on context. For instance, the table and chain type might dictate the targets available. The extensions activated in the rule and the matching clauses can also affect the availability of targets.
There is also a special class of non-terminating target: the jump target. Jump targets are actions that result in evaluation moving to a different chain for additional processing. We’ve covered the built-in chains which are tied to the netfilter
hooks that call them. However, iptables
also allows administrators to create their own chains for organizational purposes.
Rules can be placed in user-defined chains in the same way that they can be placed into built-in chains. The difference is that user-defined chains can only be reached by “jumping” to them from a rule (they are not registered with a netfilter
hook themselves).
User-defined chains act as extensions of the chain which called them. For instance, in a user-defined chain, evaluation will pass back to the calling chain if the end of the rule list is reached or if a RETURN
target is activated by a matching rule. Evaluation can also jump to additional user-defined chains.
This construct allows for greater organization and provides the framework necessary for more robust branching.
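A minimal sketch of this pattern, using a hypothetical chain name, might look like the following:
sudo iptables -N SSH_CHECKS                              # create the user-defined chain
sudo iptables -A INPUT -p tcp --dport 22 -j SSH_CHECKS   # jump to it from a built-in chain
sudo iptables -A SSH_CHECKS -s 203.0.113.0/24 -j ACCEPT  # rules evaluated inside the chain
sudo iptables -A SSH_CHECKS -j RETURN                    # hand evaluation back to the calling chain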
We introduced the connection tracking system implemented on top of the netfilter
framework when we discussed the raw
table and connection state matching criteria. Connection tracking allows iptables
to make decisions about packets viewed in the context of an ongoing connection. The connection tracking system provides iptables
with the functionality it needs to perform “stateful” operations.
Connection tracking is applied very soon after packets enter the networking stack. The raw
table chains and some sanity checks are the only logic that is performed on packets prior to associating the packets with a connection.
The system checks each packet against a set of existing connections. It will update the state of the connection in its store if needed and will add new connections to the system when necessary. Packets that have been marked with the NOTRACK
target in one of the raw
chains will bypass the connection tracking routines.
Connections tracked by the connection tracking system will be in one of the following states:
- NEW: When a packet arrives that is not associated with an existing connection, but is not invalid as a first packet, a new connection will be added to the system with this label. This happens for both connection-aware protocols like TCP and for connectionless protocols like UDP.
- ESTABLISHED: A connection is changed from NEW to ESTABLISHED when it receives a valid response in the opposite direction. For TCP connections, this means a SYN/ACK, and for UDP and ICMP traffic, this means a response where the source and destination of the original packet are switched.
- RELATED: Packets that are not part of an existing connection, but are associated with a connection already in the system, are labeled RELATED. This could mean a helper connection, as is the case with FTP data transmission connections, or it could be ICMP responses to connection attempts by other protocols.
- INVALID: Packets can be marked INVALID if they are not associated with an existing connection and aren’t appropriate for opening a new connection, if they cannot be identified, or if they aren’t routable, among other reasons.
- UNTRACKED: Packets can be marked as UNTRACKED if they’ve been targeted in a raw table chain to bypass tracking.
- SNAT: This is a virtual state set when the source address has been altered by NAT operations. This is used by the connection tracking system so that it knows to change the source addresses back in reply packets.
- DNAT: This is a virtual state set when the destination address has been altered by NAT operations. This is used by the connection tracking system so that it knows to change the destination address back when routing reply packets.
The states tracked in the connection tracking system allow administrators to craft rules that target specific points in a connection’s lifetime. This provides the functionality needed for more thorough and secure rules.
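For example, stateful rules built on these states often look like the following minimal sketch (chain policies and other rules are assumed to exist elsewhere):
# allow reply and related traffic for connections that are already being tracked
sudo iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
# drop packets the connection tracking system cannot classify
sudo iptables -A INPUT -m conntrack --ctstate INVALID -j DROP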
The netfilter
packet filtering framework and the iptables
firewall are the basis for most firewall solutions on Linux servers. The netfilter
kernel hooks are close enough to the networking stack to provide powerful control over packets as they are processed by the system. The iptables
firewall leverages these capabilities to provide a flexible, extensible method of communicating policy requirements to the kernel. By learning about how these pieces fit together, you can better utilize them to control and secure your server environments.
If you would like to know more about how to choose effective iptables
policies, check out this guide.
These guides can help you get started implementing your iptables
firewall rules:
SQL, or Structured Query Language, can sound intimidating at first. It’s a language primarily used to define, manipulate, and query data held in relational databases — the kind of databases where data is highly organized and structured to fit in well-defined rows and columns.
Note: You can learn more about relational databases, including their history, key concepts, and common uses, by following the Understanding Relational Databases tutorial.
SQL’s popularity has grown since its inception in the 1970s, and the areas where SQL is used have vastly expanded. Now, SQL is a mature and prevalent way of querying and manipulating data in most industries and with the most popular tools. You have a high chance of encountering SQL in your work, whether you are a database administrator, database architect, software engineer, data analyst, or none of the above! Even if you don’t use SQL directly, you can still benefit from understanding its concepts.
This article shows why it’s worth learning SQL and explains where and how you can apply that knowledge.
If you come across any relational database, such as MySQL, PostgreSQL, Microsoft SQL, Oracle SQL, or many others, you will inevitably stumble upon SQL statements. The language is widely used to define, manipulate, and query data in this type of database. It’s the primary way of interacting with the database engine.
Although many database systems provide easy-to-use GUI tools to help work with structure and data, they don’t make SQL obsolete. While the simplest of tasks can be quickly achieved without SQL, more complex data manipulations or querying will inevitably require using SQL to construct queries.
With SQL, you can query the data that you need and do so efficiently, playing into the strengths of the database engine.
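For instance, a simple query against a hypothetical orders table (the table and column names are only illustrative) might look like this:
-- the ten most recent orders for one customer
SELECT order_id, total, created_at
FROM orders
WHERE customer_id = 42
ORDER BY created_at DESC
LIMIT 10;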
By understanding SQL, you’ll be able to quickly get up to speed with virtually any relational database system used today. Your skillset will not be limited to any particular software or employer.
Because of its popularity and use across widely adopted relational database systems, SQL has found its way into other database systems and data analysis tools.
SQL and SQL derivatives are supported across many data storage systems, data analytics engines, business intelligence tools, and data mining tools, including many non-relational databases, analytical (OLAP) databases, and big data solutions.
Whatever software you might encounter for analyzing large datasets, there is a good chance you can use your SQL knowledge to work with data across those tools. You will be able to use a similar approach with different databases, making the work more accessible and more universal across different data sources. As a result, SQL knowledge is a primary and essential tool that is frequently used by data analysts and data scientists.
SQL is even in the top ten most popular programming languages in the TIOBE Programming Community index, an indicator of the popularity of programming languages.
Discussing data can often be challenging, since people may understand everyday language slightly differently. This ambiguity can lead to misunderstandings and errors in communication. Knowing SQL basics, such as understanding how the data is structured and how you could query it yourself, you can be more precise when communicating with your colleagues and team members.
Even if you are not going to write SQL statements yourself, you can use your experience to convey your requirements and expectations precisely. You will also be able to pinpoint issues with data that you receive and provide easy-to-understand and actionable feedback to the data analysts who helped prepare the data for you.
You can think of SQL as a common language universally understood by the people working with data. Even if you do not use SQL directly, referring to concepts from SQL will get your communication on track.
When tasked with designing any database, it’s important to consider what kind of data will be stored in the database and how it will be accessed or manipulated in the future. While it is certainly possible to design databases based exclusively on a good grasp of database design theory, moving from pen and paper to the actual database can be difficult and can often hold surprises.
Whenever any data is retrieved from a database, some kind of SQL query will be used, either written by the data analyst or generated through software. By understanding the desired usage patterns and knowing how to translate them into possible SQL queries, you will grasp how SQL will access the underlying database to retrieve the data, and what the database engine will have to do to respond to such a query.
In turn, you can use that knowledge to design databases fit for the purpose. By taking into account the use cases the database should support, you can choose the database structure that lends itself to simpler and more efficient queries for common scenarios.
You can structure tables thoughtfully as well as use data types, foreign key relationships, and indexes to facilitate data access. In effect, you’ll learn how to design databases better suited for your purpose.
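As a sketch of what this can look like in practice (a hypothetical, PostgreSQL-flavored schema), a foreign key and an index are chosen here with the expected query patterns in mind:
-- orders reference customers, and the index supports
-- "recent orders for one customer" style queries
CREATE TABLE orders (
    order_id    SERIAL PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers (customer_id),
    total       NUMERIC(10, 2) NOT NULL,
    created_at  TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE INDEX orders_customer_created_idx ON orders (customer_id, created_at);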
Modern software development frameworks and popular web frameworks, such as Laravel, Symfony, Django, or Ruby on Rails, often employ data abstraction layers like object-relational mappers to hide the complexity of data access from the developer. In general, you will not use SQL directly to access or manipulate data when working with such frameworks. The easy-to-follow syntax typical to the framework will make things “just work” for you, and the requested data will become seamlessly available.
However, no matter how many shortcuts and simplifications frameworks provide, the underlying database engine will be queried using an SQL statement constructed under the hood from your input.
Understanding how SQL works can help you use the features of your chosen framework to make queries faster and more efficient. Knowing how the framework builds its queries lets you influence how they are constructed and executed so that data is accessed in the most efficient way.
You can also more easily debug any issues with data querying and manipulation. Database engines often return errors that refer to the failing part of the actual SQL query that was executed; without knowing SQL, it can be difficult to trace such an error back to the framework-specific syntax that generated the query. By understanding database errors, you can precisely pinpoint the issue at hand.
Last but not least, you will also be more aware of security dangers that stem from improper use of SQL queries, such as SQL injection attacks, and you will be able to counteract them.
By knowing SQL, you will be fully in control of how you access the data, regardless of whether you use SQL directly or choose to work with software abstractions and ORM tools within software frameworks.
There are many benefits to understanding and applying the Structured Query Language in practice, but the best part is its accessibility for beginners. It is well-defined, and its syntax primarily uses common English words to name operations, filters, and other modifiers. SQL queries can often be read like English sentences and quickly understood even without prior programming experience.
There are more challenging and complex aspects of the language that can prove tricky and require a good deal of work to comprehend and gain experience with. But the essentials of SQL can be understood and learned at a more basic level, and because you can apply it in your daily work, it is convenient to learn against your actual data needs. You can start with the most basic concepts of SQL and extend your grasp of the language whenever you need to retrieve some data in a way you haven’t done before. Experimenting with SQL is straightforward and non-destructive when querying data, making it safe and reassuring.
By learning SQL, you can gain benefits that greatly exceed the time involved, acquire new methods to retrieve and analyze data from multiple sources, become self-sufficient in data analysis, and open new career paths in multiple fields.
Due to its flexibility, ease of use, and applicability within different data-related areas, SQL is a ubiquitous data querying and manipulation language. Learning it can have many benefits, even if your primary job is not directly related to databases or creating software.
To get started using SQL, check out the How To Use SQL series, which covers a range of topics from introductory articles on various SQL concepts and practices to advanced techniques and features of the language. We encourage you to follow this tutorial series to get acquainted with SQL. You can also use the entries in this series for reference while you continue to hone your skills with SQL.
To practice and experiment with SQL without installing and configuring the database server yourself, you can use one of the DigitalOcean Managed Databases (either Managed MySQL or Managed PostgreSQL), which provide a quick path to a fully working database environment.
Linux has robust systems and tooling to manage hardware devices, including storage drives. In this article we’ll cover, at a high level, how Linux represents these devices and how raw storage is made into usable space on the server.
Block storage is another name for what the Linux kernel calls a block device. A block device is a piece of hardware that can be used to store data, like a traditional spinning hard disk drive (HDD), solid state drive (SSD), flash memory stick, and so on. It is called a block device because the kernel interfaces with the hardware by referencing fixed-size blocks, or chunks of space.
In other words, block storage is what you think of as regular disk storage on a computer. Once it is set up, it acts as an extension of the current filesystem tree, and you should be able to write to or read information from each drive interchangeably.
Disk partitions are a way of breaking up a storage drive into smaller usable units. A partition is a section of a storage drive that can be treated in much the same way as a drive itself.
Partitioning allows you to segment the available space and use each partition for a different purpose. This gives a user more flexibility, allowing them to potentially segment a single disk for multiple operating systems, swap space, or specialized filesystems.
While disks can be formatted and used without partitioning, operating systems usually expect to find a partition table, even if there is only a single partition written to the disk. It is generally recommended to partition new drives for greater flexibility.
When partitioning a disk, it is important to know what partitioning format will be used. This generally comes down to a choice between MBR (Master Boot Record) and GPT (GUID Partition Table).
MBR is over 30 years old. Because of its age, it has some serious limitations. For instance, it cannot be used for disks over 2TB in size, and can only have a maximum of four primary partitions.
GPT is a more modern partitioning scheme that resolves some of the issues inherent with MBR. Systems running GPT can have many more partitions per disk. This is usually only limited by the restrictions imposed by the operating system itself. Additionally, the disk size limitation does not exist with GPT and the partition table information is available in multiple locations to guard against corruption. GPT can also write a “protective MBR” for compatibility with MBR-only tools.
In most cases, GPT is the better choice unless your operating system prevents you from using it.
While the Linux kernel can recognize a raw disk, it must be formatted to be used. Formatting is the process of writing a filesystem to the disk and preparing it for file operations. A filesystem is the system that structures data and controls how information is written to and retrieved from the underlying disk. Without a filesystem, you could not use the storage device for any standard filesystem operations.
There are many different filesystem formats, each with trade-offs, including operating system support. They all present the user with a similar representation of the disk, but the features and the platforms that they support can be very different.
Some of the more popular filesystems for Linux are:
Additionally, Windows primarily uses NTFS and exFAT, and macOS primarily uses HFS+ and APFS. It is usually possible to read, and sometimes write, these filesystem formats on other platforms, but doing so may require additional compatibility tools.
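When you are ready to put a filesystem on a new disk, the work usually amounts to writing a partition table and then formatting a partition. A minimal sketch follows (the device name /dev/sda is hypothetical, and these commands are destructive on a disk that already holds data):
# label the disk with GPT and create one partition spanning the whole disk
sudo parted /dev/sda mklabel gpt
sudo parted -a opt /dev/sda mkpart primary ext4 0% 100%
# write an Ext4 filesystem to the new partition
sudo mkfs.ext4 /dev/sda1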
In Linux, almost everything is represented by a file somewhere in the filesystem hierarchy. This includes hardware like storage drives, which are represented on the system as files in the /dev
directory. Typically, files representing storage devices start with sd
or hd
followed by a letter. For instance, the first drive on a server is usually something like /dev/sda.
Partitions on these drives also have files within /dev
, represented by appending the partition number to the end of the drive name. For example, the first partition on the drive from the previous example would be /dev/sda1.
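You can list the block devices and partitions the kernel currently knows about with commands like these (names and output will vary by system):
lsblk              # tree view of disks, partitions, and mount points
ls -l /dev/sda*    # the device files for a specific drive and its partitions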
While the /dev/sd*
and /dev/hd*
device files represent the traditional way to refer to drives and partitions, there is a significant disadvantage to using these values alone. The Linux kernel decides which device gets which name on each boot, so this can lead to confusing scenarios where your devices change device nodes.
To work around this issue, the /dev/disk
directory contains subdirectories corresponding with different, more persistent ways to identify disks and partitions on the system. These contain symbolic links that are created at boot back to the correct /dev/[sh]da*
files. The links are named according to the directory’s identifying trait (for example, by partition label for the /dev/disk/by-partlabel directory). These links will always point to the correct devices, so they can be used as static identifiers for storage spaces.
Some or all of the following subdirectories may exist under /dev/disk
:
- by-label: Most filesystems have a labeling mechanism that allows the assignment of arbitrary user-specified names for a disk or partition. This directory consists of links named after these user-supplied labels.
- by-uuid: UUIDs, or universally unique identifiers, are a long, unique string of letters and numbers that can be used as an ID for a storage resource. These are generally not very human-readable, but are almost always unique, even across systems. As such, it might be a good idea to use UUIDs to reference storage that may migrate between systems, since naming collisions are less likely.
- by-partlabel and by-partuuid: GPT tables offer their own set of labels and UUIDs, which can also be used for identification. This functions in much the same way as the previous two directories, but uses GPT-specific identifiers.
- by-id: This directory contains links generated by the hardware’s own serial numbers and the hardware they are attached to. This is not entirely persistent, because the way that the device is connected to the system may change its by-id name.
- by-path: Like by-id, this directory relies on a storage device’s connection to the system itself. The links here are constructed using the system’s interpretation of the hardware used to access the device. This has the same drawbacks as by-id, as connecting a device to a different port can alter this value.
Usually, by-label or by-uuid are the best options for persistent identification of specific devices.
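For example, you can see these persistent identifiers and how they map back to the kernel device names with commands like the following (the partition name is hypothetical):
ls -l /dev/disk/by-uuid/     # symbolic links from UUIDs back to /dev/sd* names
sudo blkid /dev/sda1         # print the UUID and label of a specific partition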
Note: DigitalOcean block storage volumes control the device serial numbers reported to the operating system. This allows for the by-id
categorization to be reliably persistent on this platform. This is the preferred method of referring to DigitalOcean volumes as it is both persistent and predictable on first boot.
In Linux and other Unix-like operating systems, the entire system, regardless of how many physical devices are involved, is represented by a single unified file tree. When a filesystem on a drive or partition is to be used, it must be hooked into the existing tree. Mounting is the process of attaching a formatted partition or drive to a directory within the Linux filesystem. The drive’s contents can then be accessed from that directory.
Drives are almost always mounted on dedicated empty directories – mounting on a non-empty directory means that the directory’s usual contents will be inaccessible until the drive is unmounted. There are many different mounting options that can be set to alter the behavior of a mounted device. For example, the drive can be mounted in read-only mode to ensure that its contents won’t be altered.
The Filesystem Hierarchy Standard recommends using /mnt
or a subdirectory under it for temporarily mounted filesystems. It makes no recommendations on where to mount more permanent storage, so you can choose whichever scheme you’d like. In many cases, /mnt or its subdirectories are used for more permanent storage as well.
Linux systems use a file called /etc/fstab
(filesystem table) to determine which filesystems to mount during the boot process. Filesystems that do not have an entry in this file will not be automatically mounted unless scripted by some other software.
Each line of the /etc/fstab
file represents a different filesystem that should be mounted. This line specifies the block device, the mount point to attach it to, the format of the drive, and the mount options, as well as a few other pieces of information.
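A single /etc/fstab entry might look like the following sketch, which uses a hypothetical UUID and mount point:
# <block device>                             <mount point>  <type>  <options>  <dump>  <pass>
UUID=3f5e9d18-d105-4a0e-9a6d-000000000000   /mnt/data      ext4    defaults   0       2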
While many use cases will be accommodated by these core features, there are more complex management paradigms available for joining together multiple disks, notably RAID.
RAID stands for redundant array of independent disks. RAID is a storage management and virtualization technology that allows you to group drives together and manage them as a single unit with additional capabilities.
The characteristics of a RAID array depend on its RAID level, which defines how the disks in the array relate to each other. Some of the more common levels are:
If you have a new storage device that you wish to use in your Linux system, this article will guide you through the process of partitioning, formatting, and mounting your new filesystem. This should be sufficient for most use cases where you are mainly concerned with adding additional capacity. To learn how to perform storage administration tasks, check out How To Perform Basic Administration Tasks for Storage Devices in Linux.
An understanding of networking is important for anyone managing a server. Not only is it essential for getting your services online and running smoothly, it also gives you the insight to diagnose problems.
This article will provide an overview of some common networking concepts. We will discuss terminology, common protocols, and the responsibilities and characteristics of the different layers of networking.
This guide is operating system agnostic, but should be very helpful when implementing features and services that utilize networking on your server.
First, we will define some common terms that you will see throughout this guide, and in other guides and documentation regarding networking.
These terms will be expanded upon in the appropriate sections that follow:
Connection: In networking, a connection refers to pieces of related information that are transferred through a network. Generally speaking, a connection is established before data transfer (by following the procedures laid out in a protocol) and may be deconstructed at the end of the data transfer.
Packet: A packet is the smallest unit that is intentionally transferred over a network. When communicating over a network, packets are the envelopes that carry your data (in pieces) from one end point to the other.
Packets have a header portion that contains information about the packet including the source and destination, timestamps, network hops, etc. The main portion of a packet contains the actual data being transferred. It is sometimes called the body or the payload.
Interface: A network interface is a networking communication point for your computer. It may be associated with a physical device, or it may be a representation of a virtual interface. The “loopback” device, which is a virtual interface available in most Linux environments to connect back to the same machine, is an example of this.
LAN: LAN stands for “local area network”. It refers to a network or a portion of a network that is not publicly accessible to the greater internet. A home or office network is an example of a LAN.
WAN: WAN stands for “wide area network”. It means a network that is much more extensive than a LAN. While WAN is the relevant term for large, dispersed networks in general, it usually refers to the internet as a whole.
If an interface is said to be connected to the WAN, it is generally assumed that it is reachable through the internet.
Protocol: A protocol is a set of rules and standards that defines a language devices can use to communicate with one another. Some low-level protocols are TCP, UDP, IP, and ICMP. Some familiar examples of application layer protocols, built on these lower protocols, are HTTP (for accessing web content), SSH, and TLS/SSL.
Port: A port is an address on a single machine that can be tied to a specific piece of software. It is not a physical interface or location, but it allows your server to be able to communicate using more than one application.
Firewall: A firewall is a program that decides whether traffic coming or going from a server should be allowed. A firewall usually works by creating rules for which type of traffic is acceptable on which ports. Generally, firewalls block ports that are not used by a specific application on a server.
NAT: NAT stands for network address translation. It is a way to repackage incoming requests that arrive at a routing server and forward them to the relevant devices or servers on a LAN. This is usually implemented in physical LANs as a way to route requests through one IP address to the necessary backend servers.
VPN: VPN stands for virtual private network. It is a means of connecting separate LANs through the internet, while maintaining privacy. This is used to connect remote systems as if they were on a local network, often for security reasons.
There are many other terms that you will come across, and this list is not exhaustive. We will explain other terms as we need them. At this point, you should understand some high-level concepts that will enable us to better discuss the topics to come.
While networking is often discussed in terms of topology in a horizontal way, between hosts, its implementation is layered in a vertical fashion within any given computer or network.
What this means is that there are multiple technologies and protocols that are built on top of each other in order for communication to function. Each successive, higher layer abstracts the raw data a little bit more.
It also allows you to leverage lower layers in new ways without having to invest the time and energy to develop the protocols and applications that handle those types of traffic.
The language that we use to talk about each of the layering schemes varies significantly depending on which model you use. Regardless of the model used to discuss the layers, the path of data is the same.
As data is sent out of one machine, it begins at the top of the stack and filters downwards. At the lowest level, actual transmission to another machine takes place. At this point, the data travels back up through the layers of the other computer.
Each layer has the ability to add its own “wrapper” around the data that it receives from the adjacent layer, which will help the layers that come after decide what to do with the data when it is handed off.
The TCP/IP model, more commonly known as the Internet protocol suite, is a widely adopted layering model. It defines four separate layers:
Application: In this model, the application layer is responsible for creating and transmitting user data between applications. The applications can be on remote systems, and should appear to the end user as if they were operating locally. This communication is said to take place between peers.
Transport: The transport layer is responsible for communication between processes. This level of networking utilizes ports to address different services.
Internet: The internet layer is used to transport data from node to node in a network. This layer is aware of the endpoints of the connections, but is not concerned with the actual connection needed to get from one place to another. IP addresses are defined in this layer as a way of reaching remote systems in an addressable manner.
Link: The link layer implements the actual topology of the local network that allows the internet layer to present an addressable interface. It establishes connections between neighboring nodes to send data.
As you can see, the TCP/IP model is abstract and fluid. This made it popular to implement and allowed it to become the dominant way that networking layers are categorized.
Interfaces are networking communication points for your computer. Each interface is associated with a physical or virtual networking device.
Typically, your server will have one configurable network interface for each Ethernet or wireless internet card you have.
In addition, it will define a virtual network interface called the “loopback” or localhost interface. This is used as an interface to connect applications and processes on a single computer to other applications and processes. You can see this referenced as the “lo” interface in many tools.
Many times, administrators configure one interface to service traffic to the internet and another interface for a LAN or private network.
In datacenters with private networking enabled (including DigitalOcean Droplets), your VPS will have two networking interfaces. The “eth0” interface will be configured to handle traffic from the internet, while the “eth1” interface will operate to communicate with a private network.
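On most modern Linux systems you can list the configured interfaces, including the loopback interface, with the ip utility (interface names will vary by system):
ip addr show        # list every interface and its addresses, including "lo"
ip link show eth0   # link-level details for a single interface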
Networking works by piggybacking a number of different protocols on top of each other. In this way, one piece of data can be transmitted using multiple protocols encapsulated within one another.
We will start with protocols implemented on the lower networking layers and work our way up to protocols with higher abstraction.
Media access control is a communications protocol that is used to distinguish specific devices. Each device is supposed to get a unique, hardcoded media access control address (MAC address) when it is manufactured, differentiating it from every other device on the internet.
Addressing hardware by the MAC address allows you to reference a device by a unique value even when the software on top may change the name for that specific device during operation.
MAC addressing is one of the only protocols from the low-level link layer that you are likely to interact with on a regular basis.
The IP protocol is one of the fundamental protocols that allow the internet to work. IP addresses are unique on each network and they allow machines to address each other across a network. It is implemented on the internet layer in the TCP/IP model.
Networks can be linked together, but traffic must be routed when crossing network boundaries. This protocol assumes an unreliable network and multiple paths to the same destination that it can dynamically change between.
There are a number of different implementations of the protocol. The most common implementation today is IPv4 addresses, which follow the pattern 123.123.123.123
, although IPv6 addresses, which follow the pattern 2001:0db8:0000:0000:0000:ff00:0042:8329
, are growing in popularity due to the limited number of available IPv4 addresses.
ICMP stands for internet control message protocol. It is used to send messages between devices to indicate their availability or error conditions. These packets are used in a variety of network diagnostic tools, such as ping and traceroute.
Usually ICMP packets are transmitted when a different kind of packet encounters a problem. They are used as a feedback mechanism for network communications.
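For example, both of the following diagnostic commands depend on ICMP replies to do their job (the hostname is just a placeholder):
ping -c 4 example.com      # sends ICMP echo requests and reports the replies
traceroute example.com     # maps each network hop using the error replies it receives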
TCP stands for transmission control protocol. It is implemented in the transport layer of the TCP/IP model and is used to establish reliable connections.
TCP is one of the protocols that encapsulates data into packets. It then transfers these to the remote end of the connection using the methods available on the lower layers. On the other end, it can check for errors, request certain pieces to be resent, and reassemble the information into one logical piece to send to the application layer.
The protocol builds up a connection prior to data transfer using a system called a three-way handshake. This is a way for the two ends of the communication to acknowledge the request and agree upon a method of ensuring data reliability.
After the data has been sent, the connection is torn down using a similar four-way handshake.
TCP is the protocol of choice for many of the most popular uses for the internet, including WWW, SSH, and email.
UDP stands for user datagram protocol. It is a popular companion protocol to TCP and is also implemented in the transport layer.
The fundamental difference between UDP and TCP is that UDP offers unreliable data transfer. It does not verify that data has been received on the other end of the connection. This might sound like a bad thing, and for many purposes, it is. However, it is also extremely important for some functions.
Because it is not required to wait for confirmation that the data was received, nor to resend data that was lost, UDP is much faster than TCP. It does not establish a connection with the remote host; it just sends data without confirmation.
Because it is a straightforward transaction, it is useful for communications like querying for network resources. It also doesn’t maintain a state, which makes it great for transmitting data from one machine to many real-time clients. This makes it ideal for VOIP, games, and other applications that cannot afford delays.
HTTP stands for hypertext transfer protocol. It is a protocol defined in the application layer that forms the basis for communication on the web.
HTTP defines a number of verbs that tell the remote system what you are requesting. For instance, GET, POST, and DELETE all interact with the requested data in a different way. To see an example of the different HTTP requests in action, refer to How To Define Routes and HTTP Request Methods in Express.
DNS stands for domain name system. It is an application layer protocol used to provide a human-friendly naming mechanism for internet resources. It is what ties a domain name to an IP address and allows you to access sites by name in your browser.
SSH stands for secure shell. It is an encrypted protocol implemented in the application layer that can be used to communicate with a remote server in a secure way. Many additional technologies are built around this protocol because of its end-to-end encryption and ubiquity.
There are many other protocols that we haven’t covered that are equally important. However, this should give you a good overview of some of the fundamental technologies that make the internet and networking possible.
At this point, you should be familiar with some networking terminology and be able to understand how different components are able to communicate with each other. This should assist you in understanding other articles and the documentation of your system.
Next, for a high-level, real-world example, you may want to read How To Make HTTP Requests in Go.
JSON, short for JavaScript Object Notation, is a format for sharing data. As its name suggests, JSON is derived from the JavaScript programming language, but it’s available for use by many languages including Python, Ruby, PHP, and Java. JSON is usually pronounced like the name “Jason.”
JSON is also readable and lightweight, offers a good alternative to XML, and requires much less formatting. This informational guide will discuss the data you can use in JSON files and the general structure and syntax of this format.
JSON uses the .json
extension when it stands alone, and when it’s defined in another file format (as in .html
), it can appear inside of quotes as a JSON string, or it can be an object assigned to a variable. This format is commonly used to transmit data between a web server and a client or browser.
A JSON object is a key-value data format that is typically rendered in curly braces. When you’re working with JSON, you’ll likely come across JSON objects in a .json
file, but they can also exist as a JSON object or string within the context of a program.
Here is an example of a JSON object:
{
"first_name" : "Sammy",
"last_name" : "Shark",
"location" : "Ocean",
"online" : true,
"followers" : 987
}
Although this is a short example, and JSON can be many lines long, this demonstrates that the format is generally set up with two curly braces (or curly brackets) that are represented with { }
on either end of it, and with key-value pairs populating the space between. Most data used in JSON ends up being encapsulated in a JSON object.
Key-value pairs have a colon between them as in "key" : "value"
. Each key-value pair is separated by a comma, so the middle of a JSON document reads as follows: "key" : "value", "key" : "value", "key": "value"
. In the previous example, the first key-value pair is "first_name" : "Sammy"
.
JSON keys are on the left side of the colon. They need to be wrapped in double quotation marks, as in "key"
, and can be any valid string. Within each object, keys need to be unique. These key strings can include whitespaces, as in "first name"
, but that can make it harder to access when you’re programming, so it’s best to use underscores, as in "first_name"
.
JSON values are found to the right of the colon. At the granular level, these need to be one of the following six data types:
At the broader level, values can also be made up of the complex data types of JSON object or array, which is discussed in the next section.
Each of the data types that are passed in as values into JSON will maintain their own syntax, meaning strings will be in quotes, but numbers will not be.
With .json
files, you’ll typically get a format expanded over multiple lines, but JSON can also be written all in one line, as in the following example:
{ "first_name" : "Sammy", "last_name": "Shark", "online" : true, }
This is more common within another file type or when you encounter a JSON string.
Writing the JSON format on multiple lines often makes it much more readable, especially when dealing with a large data set. Because JSON ignores whitespace between its elements, you can space out your colons and key-value pairs in order to make the data even more human readable:
{
"first_name" : "Sammy",
"last_name" : "Shark",
"online" : true
}
It is important to keep in mind that though they appear similar, a JSON object is not the same format as a JavaScript object, so while you can use functions as values within JavaScript objects, you cannot use them as values in JSON. The most important attribute of JSON is that it can be readily transferred between programming languages in a format that all of the participating languages can work with. In contrast, JavaScript objects can only be worked with directly through the JavaScript programming language.
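For instance, most languages ship with a parser that converts a JSON string into that language's native data structures. Here is a brief sketch in Python, using its standard json module and the profile object from earlier in this guide:
import json

# A JSON string like the profile shown earlier.
profile = '{ "first_name" : "Sammy", "last_name" : "Shark", "location" : "Ocean", "online" : true, "followers" : 987 }'

# json.loads() parses the JSON text into a Python dictionary.
data = json.loads(profile)
print(data["first_name"])     # Sammy
print(data["followers"] + 1)  # 988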
JSON can become increasingly complex with hierarchies composed of nested objects and arrays. Next, you’ll learn more about these complex structures.
JSON can store nested objects in JSON format in addition to nested arrays. These objects and arrays will be passed as values assigned to keys, and may themselves be composed of key-value pairs as well.
In the following users.json
file, for each of the four users ("sammy"
, "jesse"
, "drew"
, "jamie"
) there is a nested JSON object passed as the value for each of them, with its own nested keys of "username"
and "location"
that relate to each of the users. Each user entry in the following code block is an example of a nested JSON object:
{
"sammy" : {
"username" : "SammyShark",
"location" : "Indian Ocean",
"online" : true,
"followers" : 987
},
"jesse" : {
"username" : "JesseOctopus",
"location" : "Pacific Ocean",
"online" : false,
"followers" : 432
},
"drew" : {
"username" : "DrewSquid",
"location" : "Atlantic Ocean",
"online" : false,
"followers" : 321
},
"jamie" : {
"username" : "JamieMantisShrimp",
"location" : "Pacific Ocean",
"online" : true,
"followers" : 654
}
}
In this example, curly braces are used throughout to form a nested JSON object with associated username and location data for each of the four users. As with any other value, when using objects, commas are used to separate elements.
Data can also be nested within the JSON format by using JavaScript arrays that are passed as a value. JavaScript uses square brackets [ ]
on either end of its array type. Arrays are ordered collections and can contain values of different data types.
For example, you may use an array when dealing with a lot of data that can be grouped together, like when there are various websites and social media profiles associated with a single user.
With the first nested array, a user profile for "Sammy"
may be represented as follows:
{
"first_name" : "Sammy",
"last_name" : "Shark",
"location" : "Ocean",
"websites" : [
{
"description" : "work",
"URL" : "https://www.digitalocean.com/"
},
{
"desciption" : "tutorials",
"URL" : "https://www.digitalocean.com/community/tutorials"
}
],
"social_media" : [
{
"description" : "twitter",
"link" : "https://twitter.com/digitalocean"
},
{
"description" : "facebook",
"link" : "https://www.facebook.com/DigitalOceanCloudHosting"
},
{
"description" : "github",
"link" : "https://github.com/digitalocean"
}
]
}
The "websites"
key and "social_media"
key each use an array to nest information belonging to Sammy’s two website links and three social media profile links. You can identify that those are arrays because of the use of square brackets.
Using nesting within your JSON format allows you to work with more complicated and hierarchical data.
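As a brief illustration, assuming the Sammy profile above is saved to a file named profile.json (a hypothetical filename), nested values can be reached in Python by chaining keys and array indexes:
import json

# Hypothetical file containing the nested profile shown above.
with open("profile.json") as f:
    profile = json.load(f)

# Index into the "websites" array, then into the nested object's keys.
print(profile["websites"][0]["URL"])       # https://www.digitalocean.com/
print(profile["social_media"][2]["link"])  # https://github.com/digitalocean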
XML, or eXtensible Markup Language, is a way to store accessible data that can be read by both humans and machines. The XML format is available for use across many programming languages.
In many ways, XML is similar to JSON, but it requires much more text, making it longer and more time-consuming to read and write. XML must also be parsed with an XML parser, but JSON can be parsed with a standard function. Also, unlike JSON, XML cannot use arrays.
Here’s an example of the XML format:
<users>
<user>
<username>SammyShark</username> <location>Indian Ocean</location>
</user>
<user>
<username>JesseOctopus</username> <location>Pacific Ocean</location>
</user>
<user>
<username>DrewSquid</username> <location>Atlantic Ocean</location>
</user>
<user>
<username>JamieMantisShrimp</username> <location>Pacific Ocean</location>
</user>
</users>
Now, compare the same data rendered in JSON:
{"users": [
{"username" : "SammyShark", "location" : "Indian Ocean"},
{"username" : "JesseOctopus", "location" : "Pacific Ocean"},
{"username" : "DrewSquid", "location" : "Atlantic Ocean"},
{"username" : "JamieMantisShrimp", "location" : "Pacific Ocean"}
] }
JSON is much more compact and does not require end tags, while XML does. Additionally, this XML example does not make use of an array, as the JSON version does (which you can tell through the use of square brackets).
If you are familiar with HTML, you’ll notice that XML is quite similar in its use of tags. While JSON is leaner and less verbose than XML and quick to use in many situations, including AJAX applications, you first want to understand the type of project you’re working on before deciding what data structures to use.
JSON is a lightweight format that enables you to share, store, and work with data. As a format, JSON has been experiencing increased support in APIs, including the Twitter API. JSON is also a natural format to use in JavaScript and has many implementations available for use in various popular programming languages. You can read the full language support on the “Introducing JSON” site.
Because you likely won’t be creating your own .json
files but procuring them from other sources, it is important to think less about JSON’s structure and more about how to best use JSON in your programs. For example, you can convert CSV or tab-delimited data that you may find in spreadsheet programs into JSON by using the open-source tool Mr. Data Converter. You can also convert XML to JSON and vice versa with the Creative Commons-licensed utilities-online.info site.
Finally, when translating other data types to JSON, or creating your own, you can validate your JSON with JSONLint, and test your JSON in a web development context with JSFiddle.
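If you prefer to validate programmatically, most languages report a clear error when given malformed JSON. Here is a short sketch using Python's standard json module:
import json

snippet = '{ "first_name" : "Sammy", "last_name" : "Shark", }'  # trailing comma is invalid JSON

try:
    json.loads(snippet)
except json.JSONDecodeError as err:
    # Reports the position of the syntax error, much like a linter would.
    print(f"Invalid JSON: {err}")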
A proxy, also called a proxy server, is server software that sits as an intermediary between a client and server on the internet. Without a proxy, a client would send a request for a resource directly to a server, and then the server would serve the resource directly back to the client. While this approach is straightforward to understand and implement, adding proxies provides benefits in the form of increased performance, privacy, security, and more. As an additional pass-through layer, a proxy acts as a gatekeeper of the internet between clients and servers.
Generally speaking, the combined package of server hardware with installed proxy software is also often referred to as a proxy server. However, this article will focus on proxies traditionally defined as software, and in the context of web servers. You will get a breakdown of the two main types: a forward proxy and a reverse proxy. Each type has a different use case, and the two are often confused because of their similar naming convention.
This article will provide you with an understanding of what proxies and their subtypes are, and how they are useful in common setups. By reading this article, you will be able to identify the circumstances in which a proxy is beneficial, and choose the correct solution between forward proxy and reverse proxy in any given situation.
A forward proxy, also called an open proxy, acts as a representative for a client that is trying to send a request through the internet to an origin server. In this scenario, all attempts to send requests by the client will instead be sent to the forward proxy. The forward proxy, in the client’s stead, will examine the request. First, it will determine if this client is authorized to send requests through this specific forward proxy. It will then reject the request or forward it to the origin server. The client has no direct access to the internet; it can only reach what the forward proxy allows it to access.
A common use case of forward proxies is to gain increased privacy or anonymity on the internet. A forward proxy accesses the internet in place of a client, and in that process it can use a different IP address than the client’s original IP address.
Depending on how it has been configured, a forward proxy can grant a number of features, allowing you to:
Forward proxies are also used in systems for centralized security and permission based access, such as in a workplace. When all internet traffic passes through a common forward proxy layer, an administrator can allow only specific clients access to the internet filtered through a common firewall. Instead of maintaining firewalls for the client layer that can involve many machines with varying environments and users, a firewall can be placed at the forward proxy layer.
Keep in mind that forward proxies must be manually set up in order to be used, whereas reverse proxies can go unnoticed by the client. Depending on whether the IP address of a client is passed on to the origin server by the forward proxy, privacy and anonymity can be granted or left transparent.
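For example, a client application usually has to be pointed at the forward proxy explicitly. The following Python sketch, using the third-party requests library and a hypothetical proxy address, shows what that client-side configuration might look like:
import requests

# Hypothetical forward proxy address; replace with your own proxy.
proxies = {
    "http": "http://proxy.example.com:3128",
    "https": "http://proxy.example.com:3128",
}

# The request goes to the proxy, which forwards it to the origin server.
response = requests.get("http://www.example.com/", proxies=proxies)
print(response.status_code)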
There are several options to consider for forward proxies:
A reverse proxy acts as a representative of a web server, handling incoming requests from clients on its behalf. This web server can be a single server or multiple servers. Additionally, it can be an application server such as Gunicorn. In either scenario, a request would come in from a client through the internet at large. Normally, this request will go directly to the web server that has the resources the client is requesting. Instead, a reverse proxy acts as an intermediary, isolating the web server from direct interaction with the open internet.
From a client’s perspective, interacting with a reverse proxy is no different from interacting with the web server directly. It is functionally the same, and the client cannot tell the difference. The client requests a resource and then receives it, without any extra configuration required by the client.
Reverse proxies grant features such as:
While centralized security is a benefit of both forward and reverse proxies, reverse proxies provide this to the web server layer and not the client layer. Instead of focusing on maintaining firewalls at the web server layer, which may contain multiple servers with different configurations, the majority of firewall security can be focused at the reverse proxy layer. Additionally, moving the responsibility of interfacing with a firewall and with client requests away from the web servers allows them to focus solely on serving resources.
In the case of multiple servers existing behind a reverse proxy, the reverse proxy also handles directing which requests go to which server. Multiple web servers might be serving the same resource, each serving different kinds of resources, or some combination of the two. These servers can use the HTTP protocol as a conventional web server, but can also include application server protocols such as FastCGI. You can configure a reverse proxy to direct clients to specific servers depending on the resource requested, or to follow certain rules regarding traffic load.
Reverse proxies can also take advantage of their placement in front of web servers by offering caching functionality. Large static assets can be configured with caching rules to avoid hitting web servers on every request, with some solutions offering an option to serve static assets directly without touching the web server at all. Furthermore, the reverse proxy can handle compression of these assets.
The Nginx web server is also a popular reverse proxy solution. While the Apache web server has a reverse proxy feature as well, it is an add-on for Apache, whereas Nginx was originally built for and focuses on reverse proxy functionality.
Because “forward” and “reverse” carry connotations of directionality and invite misleading comparisons with “incoming” and “outgoing” traffic, these labels can be confusing; both kinds of proxies handle requests and responses. Instead, a better way to differentiate between forward and reverse proxies is to examine the needs of the application you’re building.
A reverse proxy is useful when building a solution to serve web applications on the internet. They represent your web servers in any interactions with the internet.
A forward proxy is useful when placed in front of client traffic for your personal use or in a workplace environment. They represent your client traffic in any interactions with the internet.
Differentiating by use case instead of focusing on the similar naming conventions will help you avoid confusion.
This article defined what a proxy is along with the two main types: the forward proxy and the reverse proxy. Practical use cases and an exploration of beneficial features were used to differentiate forward proxies and reverse proxies. If you’d like to explore implementation of proxies, you can check out our guide on how to configure Nginx as a web server and reverse proxy for Apache on one Ubuntu 20.04 Server.
SSH is the de facto method of connecting to a cloud server. It is durable, and it is extensible — as new encryption standards are developed, they can be used to generate new SSH keys, ensuring that the core protocol remains secure. However, no protocol or software stack is totally foolproof, and SSH being so widely deployed across the internet means that it represents a very predictable attack surface or attack vector through which people can try to gain access.
Any service that is exposed to the network is a potential target in this way. If you review the logs for your SSH service running on any widely trafficked server, you will often see repeated, systematic login attempts that represent brute force attacks by users and bots alike. Although you can make some optimizations to your SSH service to reduce the chance of these attacks succeeding to near-zero, such as disabling password authentication in favor of SSH keys, they can still pose a minor, ongoing liability.
Large-scale production deployments for which this liability is completely unacceptable will usually implement a VPN such as WireGuard in front of their SSH service, so that it is impossible to connect directly to the default SSH port 22 from the outside internet without additional software abstraction or gateways. These VPN solutions are widely trusted, but will add complexity, and can break some automations or other small software hooks.
Prior to or in addition to committing to a full VPN setup, you can implement a tool called Fail2ban. Fail2ban can significantly mitigate brute force attacks by creating rules that automatically alter your firewall configuration to ban specific IPs after a certain number of unsuccessful login attempts. This will allow your server to harden itself against these access attempts without intervention from you.
In another tutorial, we discussed How to protect SSH with Fail2ban. In this guide, we’ll discuss in more depth how Fail2ban actually works and how you can use this knowledge to modify or extend the behavior of this service.
The purpose of Fail2ban is to monitor the logs of common services to spot patterns in authentication failures.
When Fail2ban is configured to monitor the logs of a service, it looks at a filter that has been configured specifically for that service. The filter is designed to identify authentication failures for that particular service through the use of complex regular expressions, a common pattern-matching language. Fail2ban collects these regular expression patterns in an internal variable called failregex
.
By default, Fail2ban includes filter files for common services. When a log from any service, like a web server, matches the failregex
in its filter, a predefined action is executed for that service. The action
is a variable that can be configured to do many different things, depending on the preferences of the administrator.
The default action is to ban the offending host/IP address by modifying the local firewall rules. You can expand this action to, for example, send an email to your system administrator.
By default, action will be taken after three authentication failures have been detected within a 10-minute window, and the default ban lasts 10 minutes. Both of these values are configurable.
When using the default iptables
firewall, fail2ban
creates a new set of firewall rules, also called a chain, when the service is started. It adds a new rule to the INPUT chain that sends all TCP traffic directed at port 22 to the new chain. In the new chain, it inserts a single rule that returns to the INPUT chain. The chain and associated rules are removed if the Fail2ban service is stopped.
Fail2ban is configured through several files located within a hierarchy under the /etc/fail2ban/
directory.
The fail2ban.conf
file configures some operational settings like the way the daemon logs info, and the socket and pid file it will use. The main configuration, however, is specified in the files that define the per-application “jails”.
By default, fail2ban ships with a jail.conf
file. However, this can be overwritten in updates, so you should copy this file to a jail.local
file and make adjustments there.
If you already have a jail.local
file, open it using nano
or your favorite text editor:
- sudo nano /etc/fail2ban/jail.local
If you don’t have a jail.local
file already, or the file you opened was blank, copy over the jail.conf
file and then open the new file:
- sudo cp /etc/fail2ban/jail.conf /etc/fail2ban/jail.local
- sudo nano /etc/fail2ban/jail.local
We will take a look at the options available here and see how this file interacts with other configuration files on the system.
The first portion of the file will define the defaults for fail2ban policy. These options can be overridden in each individual service’s configuration section.
With the comments removed, the entirety of the default section looks something like this:
[DEFAULT]
ignoreip = 127.0.0.1/8
bantime = 10m
findtime = 10m
maxretry = 3
backend = auto
usedns = warn
destemail = root@localhost
sendername = Fail2Ban
banaction = iptables-multiport
mta = sendmail
protocol = tcp
chain = INPUT
action_ = %(banaction)s[name=%(__name__)s, port="%(port)s", protocol="%(protocol)s", chain="%(chain)s"]
action_mw = %(banaction)s[name=%(__name__)s, port="%(port)s", protocol="%(protocol)s", chain="%(chain)s"]
%(mta)s-whois[name=%(__name__)s, dest="%(destemail)s", protocol="%(protocol)s", chain="%(chain)s", sendername="%(sendername)s"]
action_mwl = %(banaction)s[name=%(__name__)s, port="%(port)s", protocol="%(protocol)s", chain="%(chain)s"]
%(mta)s-whois-lines[name=%(__name__)s, dest="%(destemail)s", logpath=%(logpath)s, chain="%(chain)s", sendername="%(sendername)s"]
action = %(action_)s
Let’s go over what some of this means:
findtime and maxretry: maxretry sets the number of authentication failures allowed within the findtime window before a ban is instituted.
backend: the auto setting means that fail2ban will try pyinotify, then gamin, and then a polling algorithm based on what’s available. inotify is a built-in Linux kernel feature for tracking when files are accessed, and pyinotify is a Python interface to inotify, used by Fail2ban.
usedns: the warn setting will attempt to look up a hostname and ban that way, but will log the activity for review.
banaction: this setting uses the file in /etc/fail2ban/action.d/ called iptables-multiport.conf. This handles the actual iptables firewall manipulation to ban an IP address. We will look at this later.
The rest of the parameters define different actions that can be specified. They pass in some of the parameters that we’ve defined above using variable substitution within text strings like this:
%(var_name)s
The line above would be replaced with the contents of var_name
. Using this, we can tell that the action
variable is set to the action_
definition by default (ban only, no mail alerts).
This, in turn, is configured by calling the iptables-multiport
action with a list of parameters (service name, port, protocol, and chain) that is needed to perform the ban. The __name__
is substituted with the name of the service as specified by the section headers below.
Beneath the default section, there are sections for specific services that can be used to override the default settings. This follows a convention of only modifying the parameters that differ from the normal values (convention over configuration).
Each section header is specified like this:
[service_name]
Any section that has the line enabled = true
will be read and enabled.
Within each section, the parameters are configured, including the filter file that should be used to parse the logs (minus the file extension) and the location of the log files themselves.
Keeping this in mind, the section that specifies the actions for the SSH service looks like this:
[SSH]
enabled = true
port = ssh
filter = sshd
logpath = /var/log/auth.log
maxretry = 6
This enables this section and sets the port to the default “ssh” port (port 22). It tells Fail2ban to look at the log located at /var/log/auth.log
for this section and to parse the log using the filtering mechanisms defined in the /etc/fail2ban/filter.d
directory in a file called sshd.conf
.
All of the other pieces of information that it needs are taken from the parameters defined in the [DEFAULT]
section. For instance, the action will be set to action_
which will ban the offending IP address using the iptables-multiport
banaction, which references a file called iptables-multiport.conf
found in /etc/fail2ban/action.d
.
As you can see, the actions in the [DEFAULT]
section should be general and flexible. Using parameter substitution along with parameters that provide sensible defaults will make it possible to override definitions when necessary.
In order to understand what is going on in our configuration, we need to understand the filter and action files, which do the bulk of the work.
The filter file will determine the lines that fail2ban will look for in the log files to identify offending characteristics. The action file implements all of the actions required, from building up a firewall structure when the service starts, to adding and deleting rules, and tearing down the firewall structure when the service stops.
Let’s look at the filter file that our SSH service called for in the configuration above:
- sudo nano /etc/fail2ban/filter.d/sshd.conf
[INCLUDES]
before = common.conf
[Definition]
_daemon = sshd
failregex = ^%(__prefix_line)s(?:error: PAM: )?[aA]uthentication (?:failure|error) for .* from <HOST>( via \S+)?\s*$
^%(__prefix_line)s(?:error: PAM: )?User not known to the underlying authentication module for .* from <HOST>\s*$
^%(__prefix_line)sFailed \S+ for .*? from <HOST>(?: port \d*)?(?: ssh\d*)?(: (ruser .*|(\S+ ID \S+ \(serial \d+\) CA )?\S+ %(__md5hex)s(, client user ".*", client host ".*")?))?\s*$
^%(__prefix_line)sROOT LOGIN REFUSED.* FROM <HOST>\s*$
^%(__prefix_line)s[iI](?:llegal|nvalid) user .* from <HOST>\s*$
^%(__prefix_line)sUser .+ from <HOST> not allowed because not listed in AllowUsers\s*$
^%(__prefix_line)sUser .+ from <HOST> not allowed because listed in DenyUsers\s*$
^%(__prefix_line)sUser .+ from <HOST> not allowed because not in any group\s*$
^%(__prefix_line)srefused connect from \S+ \(<HOST>\)\s*$
^%(__prefix_line)sUser .+ from <HOST> not allowed because a group is listed in DenyGroups\s*$
^%(__prefix_line)sUser .+ from <HOST> not allowed because none of user's groups are listed in AllowGroups\s*$
ignoreregex =
The [INCLUDES]
section header specifies other filter files that are read in before or after this file. In our example, the common.conf
file is read in and placed before the other lines in this file. This sets up some parameters that we will be using in our configuration.
Next, we have a [Definition]
section that defines the actual rules for our filter matches. First, we set the name of the daemon we are monitoring by using the _daemon
parameter.
After that, we go through the actual failregex
definition, which sets the patterns that will trigger when a matching line in the log file is found. These are regular expressions that match based on the different errors and failures that can be thrown when a user does not authenticate correctly.
Portions of the line like %(__prefix_line)s
will be substituted with the value of a parameter setup in the common.conf
file that we sourced. This is used to match the different leading information that operating systems write to log files when they use standard methods. For instance, some lines from the /var/log/auth.log
might look something like this:
May 6 18:18:52 localhost sshd[3534]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=101.79.130.213
May 6 18:18:54 localhost sshd[3534]: Failed password for invalid user phil from 101.79.130.213 port 38354 ssh2
May 6 18:18:54 localhost sshd[3534]: Received disconnect from 101.79.130.213: 11: Bye Bye [preauth]
The leading portion of each line (the timestamp, hostname, and process name with its PID) is a standard pattern that the operating system inserts to provide context. After that, there are quite a few different ways that the SSH daemon can record failed authentication attempts in the log.
We see two separate failures in the first two lines above (a PAM authentication error and a password error). The regular expressions defined in the filter are designed to match any of the possible failure lines. You should not have to adjust any of these lines, but you should be aware of the need to catch all of the log entries that signify an unauthorized use error for the application you are trying to protect if you ever have to create a filter file yourself.
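As a rough illustration of what a failregex pattern does, the following Python snippet uses a simplified stand-in (not the exact expression from sshd.conf) to match the “Failed password” line from the sample log above and extract the client address:
import re

# Simplified stand-in for one of the sshd failregex patterns; the real
# expressions in filter.d/sshd.conf are far more thorough.
pattern = re.compile(r"Failed \S+ for (?:invalid user )?\S+ from (?P<host>\S+) port \d+")

line = "May 6 18:18:54 localhost sshd[3534]: Failed password for invalid user phil from 101.79.130.213 port 38354 ssh2"

match = pattern.search(line)
if match:
    # Fail2ban would increment the failure counter for this address.
    print(match.group("host"))  # 101.79.130.213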
At the bottom, you can see an ignoreregex
parameter, which is currently blank. This can be used to exclude more specific patterns that would typically match a failure condition in case you want to negate the failure trigger for fail2ban for certain scenarios. We won’t be adjusting this.
Save and close the file when you are finished examining it.
Now, let’s take a look at the action file. This file is responsible for setting up the firewall with a structure that allows modifications for banning malicious hosts, and for adding and removing those hosts as necessary.
The action that our SSH service invokes is called iptables-multiport
. Open the associated file now:
- sudo nano /etc/fail2ban/action.d/iptables-multiport.conf
With the comments removed, this file looks something like this:
[INCLUDES]
before = iptables-blocktype.conf
[Definition]
actionstart = iptables -N fail2ban-<name>
iptables -A fail2ban-<name> -j RETURN
iptables -I <chain> -p <protocol> -m multiport --dports <port> -j fail2ban-<name>
actionstop = iptables -D <chain> -p <protocol> -m multiport --dports <port> -j fail2ban-<name>
actioncheck = iptables -n -L <chain> | grep -a 'fail2ban-<name>[ \t]'
actionban = iptables -I fail2ban-<name> 1 -s <ip> -j <blocktype>
actionunban = iptables -D fail2ban-<name> -s <ip> -j <blocktype>
[Init]
name = default
port = ssh
protocol = tcp
chain = INPUT
The file starts off by sourcing another action file called iptables-blocktype.conf
that defines the blocktype
parameter, which configures the restriction that will be set when a client is banned. By default the blocktype
is set to reject packets from banned clients and reply with an ICMP message indicating that the port is unreachable. We will use this in our ban rules below.
Next, we get to the rule definitions themselves. The actionstart
action sets up the iptables firewall when the fail2ban service is started. It creates a new chain, adds a rule to that chain to return to the calling chain, and then inserts a rule at the beginning of the INPUT chain that passes traffic matching the correct protocol and port destinations to the new chain.
It does this by using the values we passed in with the action
that we defined in our jail.local
file. The name
is taken from the section header for each service. The chain
, protocol
, and port
are taken from the action
line itself in that file.
Here, all of the parameters that are set by the other file are referenced by including the parameter name in angle brackets:
<param_name>
When we move down to the companion actionstop
definition, we can see that the firewall commands are implementing a reversal of the actionstart
commands. When the Fail2ban service stops, it cleanly removes any firewall rules that it added.
Another action called actioncheck
makes sure that the proper chain has been created prior to attempting to add ban rules.
Next, we get to the actual banning rule, called actionban
. This rule works by adding a new rule to our created chain. The rule matches the source IP address of the offending client – this parameter is read in from the authorization logs when the maxretry
limit is reached. It institutes the block defined by the blocktype
parameter that we sourced in the [INCLUDE]
section at the top of the file.
The actionunban
rule removes this rule. This is done automatically by fail2ban when the ban time has elapsed.
Finally, we get to the [Init]
section. This just provides some defaults in case the action file is called without passing in all of the appropriate values.
Now that we’ve seen the specifics, let’s go over the process that happens when fail2ban starts.
First, the main fail2ban.conf
file is read to determine the conditions that the main process should operate under. It creates the socket, pid, and log files if necessary and begins to use them.
Next, fail2ban reads the jail.conf
file for configuration details. It follows this by reading, in alphabetical order, any files found in the jail.d
directory that end in .conf
. It adds the settings found in these files to its internal configuration, giving new values preference over the values described in the jail.conf
file.
It then searches for a jail.local
file and repeats this process, applying any new values. Finally, it searches the jail.d
directory again, reading in alphabetical order files ending in .local
.
In our case, we only have a jail.conf
file and a jail.local
file. In our jail.local
file, we only need to define the values that differ from the jail.conf
file. The fail2ban process now has a set of directives loaded into memory that represent a combination of all of the files that it found.
It examines each section and searches for an enabled = true
directive. If it finds one, it uses the parameters defined under that section to build a policy and decide what actions are required. Any parameters that are not found in the service’s section use the parameters defined in the [DEFAULT]
section.
Fail2ban looks for an action
directive to figure out what action script to call to implement the banning/unbanning policies. If one is not found, it falls back on the default action determined above.
The action directive consists of the name of the action file(s) that will be read, as well as a key-value dictionary that passes the parameters needed by those files. The values of these often take the form of parameter substitutions by referencing the settings configured in the service’s section. The “name” key is usually passed the value of the special __name__
variable that will be set to the value of the section’s header.
Fail2ban then uses this information to find the associated files in the action.d
directory. It first looks for the associated action file ending in .conf
and then amends the information found there with any settings contained in an accompanying .local
file also found in the action.d
directory.
It parses those files to determine the actions that it needs to take. It reads the actionstart
value to see the actions it should take to set up the environment. This often includes creating a firewall structure to accommodate banning rules in the future.
The actions defined in this file use the parameters passed to it from the action
directive. It will use these values to dynamically create the appropriate rules. If a certain variable wasn’t set, it can look at the default values set in the action file to fill in the blanks.
The parameters for the service in the jail.*
files also include the location of the log file as well as the polling mechanism that should be used to check the file (this is defined by the backend
parameter). It also includes a filter that should be used to determine whether a line in the log represents a failure.
Fail2ban looks in the filter.d
directory to find the matching filter file that ends with .conf
. It reads this file to define the patterns that can be used to match offending lines. It then searches for a matching filter file ending with .local
to see if any of the default parameters were overwritten.
It uses the regular expressions defined in these files as it reads the service’s log file. It tries each failregex
line defined in the filter.d
files against every new line written to the service’s log file.
If the regular expression returns a match, it checks the line against the regular expressions defined by the ignoreregex
. If this also matches, fail2ban ignores it. If the line matches an expression in the failregex
but does not match an expression in the ignoreregex
, an internal counter is incremented for the client that caused the line and an associated timestamp is created for the event.
When the window of time set by the findtime
parameter in the jail.*
files has passed since a given event (as determined by the event’s timestamp), the internal counter is decremented again and the event is no longer considered relevant to the banning policy.
If, over the course of time, additional authentication failures are logged, each attempt increments the counter. If the counter reaches the value set by the maxretry
parameter within the configured window of time, fail2ban institutes a ban by calling the actioncheck
action for the service as defined in the action.d/
files for the service. This is to determine whether the actionstart
action set up the necessary structure. It then calls the actionban
action to ban the offending client. It sets a timestamp for this event as well.
When the amount of time has elapsed that was specified by the bantime
parameter, fail2ban unbans the client by calling the actionunban
action.
By now you have a fairly in-depth understanding of how fail2ban operates. When you deviate from the standard configuration, it is helpful to know how fail2ban functions in order to manipulate its behavior in a predictable way.
To learn about how to protect other services with fail2ban, you can read How To Protect an Nginx Server with Fail2Ban on Ubuntu 22.04.
Data is central to how many of today’s applications and websites function. Comments on a viral video, changing scores in a multiplayer game, and the items you left in a shopping cart on your favorite online store are all bits of information stored somewhere in a database.
This conceptual article serves as an introduction to numerous database topics. It provides a brief overview of what databases are in the context of cloud computing and highlights a few concepts central to their design and function. It also contains links to relevant conceptual and procedural tutorials throughout.
Broadly speaking, a database is any logically modeled collection of information. A database does not necessarily have to be stored on a computer: a stack of patient files in a hospital, a set of contacts in a rolodex, or a file cabinet filled with old invoices all qualify as examples of databases.
In the context of websites and applications, when people refer to a “database” they’re often talking about a computer program that allows them to interact with their database. These programs, known more formally as database management systems (DBMSs), are often installed on a virtual private server and accessed remotely.
Redis, MariaDB, and PostgreSQL are a few examples of open-source DBMSs, but there are many different ones available today. Different DBMSs usually have their own unique features and associated toolsets, but they generally fall into one of two categories: relational and non-relational databases.
Since the 1970s, most DBMSs have been designed around the relational model. The most fundamental elements in the relational model are relations, which users and modern relational DBMSs (RDBMSs or relational databases) recognize as tables. A relation is a set of tuples, or rows in a table, with each tuple sharing a set of attributes, or columns:
You can think of each tuple as a unique instance of whatever type of people, objects, events, or associations the table holds. These instances might be things like employees at a company, sales from an online business, or test results in a medical lab. For example, in a table that holds employee records of teachers at a school, the tuples might have attributes like name
, subjects
, start_date
, and so on.
In the relational model, each table contains at least one column that can be used to uniquely identify each row, called a primary key. Building on the example of a table storing employee records of teachers at a school, the database administrator could create a primary key column named employee_ID
whose values automatically increment. This would allow the DBMS to keep track of each record and return them on an ad hoc basis. In turn, it would mean that the records have no defined logical order, and users have the ability to return their data in whatever order or through whatever filters they wish.
If you have two tables that you’d like to associate with one another, one way you can do so is with a foreign key. A foreign key is essentially a copy of one table’s (the “parent” table) primary key inserted into a column in another table (the “child”). The following example highlights the relationship between two tables, one used to record information about employees at a company and another used to track the company’s sales. In this example, the primary key of the EMPLOYEES
table is used as the foreign key of the SALES
table:
The relational model’s structural elements help to keep data stored in an organized way, but storing data is only useful if you can retrieve it. To retrieve information from an RDBMS, you can issue a query, or a structured request for a set of information. Most relational databases use a language called Structured Query Language — better known as SQL and informally pronounced like “sequel” — to manage and query data. SQL allows you to filter and manipulate query results with a variety of clauses, predicates, and expressions, giving you fine control over what data will appear in the result set.
There are many open-source RDBMSs available today, including the following:
Today, most applications still use the relational model to store and organize data. However, the relational model cannot meet the needs of every application. For example, it can be difficult to scale relational databases horizontally, and though they’re ideal for storing structured data, they’re less useful for storing unstructured data.
These and other limitations of the relational model have led to the development of alternatives. Collectively, these database models are often referred to as non-relational databases. Because these alternative models typically don’t implement SQL for defining or querying data, they are also sometimes referred to as NoSQL databases. This also means that many NoSQL databases implement a unique syntax to insert and retrieve data.
It can be helpful to think of “NoSQL” and “non-relational” as broad umbrella terms, as there are many database models that are labeled as NoSQL, with significant differences between them. The remainder of this section highlights a few of the more commonly used non-relational database models:
Key-value databases, also known as key-value stores, work by storing and managing associative arrays. An associative array, also known as a dictionary or hash table, consists of a collection of key-value pairs in which a key serves as a unique identifier to retrieve an associated value. Values can be anything from simple objects, like integers or strings, to more complex objects, like JSON structures.
Redis is an example of a popular, open-source key-value store.
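As a brief sketch of the key-value model, here is how storing and retrieving a value might look in Python with the third-party redis-py client, assuming a Redis server is running locally on its default port:
import redis

# Connect to a local Redis server (default port 6379).
r = redis.Redis(host="localhost", port=6379)

# Store and retrieve a simple key-value pair.
r.set("greeting", "Hello, Sammy!")
print(r.get("greeting"))  # b'Hello, Sammy!'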
Document-oriented databases, or document stores, are NoSQL databases that store data in the form of documents. Document stores are a type of key-value store: each document has a unique identifier — its key — and the document itself serves as the value. The difference between these two models is that, in a key-value database, the data is treated as opaque and the database doesn’t know or care about the data held within it; it’s up to the application to understand what data is stored. In a document store, however, each document contains some kind of metadata that provides a degree of structure to the data. Document stores often come with an API or query language that allows users to retrieve documents based on the metadata they contain. They also allow for complex data structures, as you can nest documents within other documents.
MongoDB is a widely used document database. The documents you store in a MongoDB database are written in BSON, which is a binary form of JSON.
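To illustrate the document model, the following Python sketch uses the third-party pymongo driver, assuming a MongoDB server is running locally; the database and collection names are made up:
from pymongo import MongoClient

# Connect to a local MongoDB server (default port 27017).
client = MongoClient("mongodb://localhost:27017")
collection = client["app"]["users"]

# Each document is stored as BSON and can hold nested structure.
collection.insert_one({"username": "SammyShark", "location": "Indian Ocean", "online": True})
print(collection.find_one({"username": "SammyShark"}))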
Columnar databases, sometimes called column-oriented databases, are database systems that store data in columns. This may seem similar to traditional relational databases, but rather than grouping columns together into tables, each column is stored in a separate file or region in the system’s storage. The data stored in a columnar database appears in record order, meaning that the first entry in one column is related to the first entry in other columns. This design allows queries to only read the columns they need, rather than having to read every row in a table and discard unneeded data after it’s been stored in memory.
Apache Cassandra is a widely used open-source column store.
By itself, a database management system isn’t very useful. You can use a DBMS to query and interact with a database directly, but in most real-world contexts you’ll likely want to combine it with other tools, since a DBMS cannot serve or display content on its own. When combined with those tools, a database becomes an essential component of a larger application.
There are a number of popular open-source technology stacks that include a DBMS. Here are a few examples:
These technology stacks are often deployed on the same server — an architecture pattern referred to as a monolithic architecture. In such cases, it is fairly trivial to connect the other stack components to a DBMS. Alternatively, you can set up a remote database by installing the DBMS on a remote server. Most DBMSs operate on a dedicated port that you can use to connect an application server to your remote database. For example, MySQL’s default port is 3306
and Redis’s is 6379
. Using a remote database server like this can be a highly scalable solution compared to a monolithic application, as it allows you to scale your database separately from your application.
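For instance, an application server might reach a remote MySQL database over that default port with a few lines of code. The following Python sketch uses the third-party PyMySQL driver with placeholder host and credentials:
import pymysql

# Placeholder host and credentials; MySQL listens on port 3306 by default.
connection = pymysql.connect(
    host="db.example.com",
    port=3306,
    user="app_user",
    password="your_password",
    database="app_db",
)

with connection.cursor() as cursor:
    cursor.execute("SELECT VERSION()")
    print(cursor.fetchone())

connection.close()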
However, setting up a remote database like this increases the attack surface of your application, since it adds more potential entry points for unauthorized users. It also requires your data to be sent from the database server to the application server over a network connection, which means the data packets could be intercepted by malicious actors. To protect your data from sniffing attacks like this, many DBMSs allow you to encrypt your data. Encryption is the process of converting a piece of information from plaintext, the information’s original form, into ciphertext, an unreadable form that can only be read by a person or computer that has the right cipher to decrypt it. If a malicious actor were to intercept a piece of encrypted data, they wouldn’t be able to read it until they’re able to decrypt it.
Many DBMSs allow you to encrypt communications between your database server and whatever clients or applications need access to it by configuring it to require connections that use Transport Layer Security, also known as TLS. Like its predecessor, Secure Sockets Layer (SSL), TLS is a cryptographic protocol that uses certificate-based authentication to encrypt data as it’s transmitted over a network. Note that TLS only encrypts data as it moves over a network, otherwise known as data in transit. Even if you’ve configured your DBMS to require connections to be made with TLS, the static data stored on the database server, called data at rest, will still be unencrypted unless your DBMS offers a form of data-at-rest encryption.
Most database management systems come installed with a command line tool that allows you to interact with the database installation. Examples include the mysql
command line client for MySQL, psql
for PostgreSQL, and the MongoDB Shell. There are also third-party command line clients available for many DBMSs. One such example is Redli, which serves as an alternative to Redis’s default redis-cli
tool and comes with certain added features.
However, managing data through a command line interface may not be intuitive for every user, which is why there are graphical database administration tools available for many open-source DBMSs. Some, like phpMyAdmin or pgAdmin, are browser-based while others, like MySQL Workbench or MongoDB Compass, are meant to connect to a remote database from a local machine.
As an application continues to operate and grow, the data held within the database will require more and more storage, to the point where it could slow down the entire application. There are several common strategies for dealing with issues like this, the two most common of which are replication and sharding.
Replication is the practice of synchronizing data across multiple separate databases. When working with databases, it’s often useful to have multiple copies of your data. This provides redundancy in case one of the database servers fails and can improve a database’s availability and scalability, as well as reduce read latencies. Many DBMSs include replication as a built-in feature, including MongoDB and MySQL. Some like MySQL even provide multiple replication methods for greater flexibility.
Database sharding is the process of splitting up data records that would normally be held in the same table or collection and distributing them across multiple machines, known as shards. Sharding is especially useful in cases where you’re working with large amounts of data, as it allows you to scale your database horizontally by adding more machines that can function as new shards.
By reading this article, you should have a better understanding of what databases are and how you can use them. We encourage you to check out the rest of our databases content.
Tutorials
Oftentimes, databases contain sensitive data. One of our tutorials outlines how to connect to a managed Redis database over a secure tunnel created with Stunnel, using a client such as redis-cli or Redli.
DigitalOcean Products
DigitalOcean offers a variety of Managed Databases, allowing you to provision and use a database server without the need to configure it, perform backups, or update it in the future. Currently, DigitalOcean offers Managed Databases for the following four DBMSs:
You can also find a variety of Database solutions on the DigitalOcean Marketplace. These allow you to launch a database server in just a few clicks, with options like MySQL, MongoDB, RethinkDB, and others.
There are a number of software tools used to create websites. One tool that is fast becoming a mainstay is the Static Site Generator (SSG). An SSG is an application that automatically creates the HTML, CSS, and JavaScript files necessary to render a website. SSGs have gained popularity because of their flexibility: they can be used as stand-alone tools or deeply integrated into a web architecture of your choice.
To outline the underlying web technologies that power SSGs and the features that make them useful, this article will explore various concepts as they relate to ‘static,’ ‘site,’ and ‘generator.’
When a user views a page on a website, a request is made to a web server. The web server responds back with a specific file based on that request.
For example, say a user requests a page on a traditional or dynamic website. The server might need to query a database, send the results to a templating application, then finally generate the HTML file to render to the browser. This page is generated, on-demand, each time a request is made in this architecture.
Conversely, on a static site, the requested files already sit on the web server. This is what is meant by the term static: the files used to render the site are unchanging because they already contain the HTML, CSS, and JavaScript code. There is nothing to convert, generate, or transform on demand because they were generated before the page request.
To be clear, a static site can still be interactive. Things like JavaScript code, CSS animations, video and audio elements, and data formats like JSON are still supported and can run as normal on a static site. The only difference is that the files themselves are generated in advance rather than generated on demand.
You can think of a landing page or a blog post as examples of a static site. On the other hand, a live sports page or a comments section are examples of dynamic websites. For dynamic sites like these, you need a server or an API call to perform additional processing in order to render an element to the user correctly. You can read about the rendering spectrum to better understand how websites use a variety of techniques to reveal content to users.
On any site, you need a way to organize and create content. Static site generators offer a streamlined way to accomplish this task. With an SSG, you can create and organize your site structure, while also having direct access to the content. One of the integral technologies common to many SSGs that helps manage all of this is called a template engine.
A template engine is software that creates templates for common elements that appear across your site. Instead of hard coding repetitive HTML for each page on your website, a template engine assists in creating these elements for all your pages. For example, if you have a header and footer that are required to be on every page on your site, you can write this code once and apply it to the pages with template code. You can also use variables and expressions to build or replace values within a template.
Bear in mind that many SSGs let you choose your templating language, while some come with a frontend framework and templating language already selected for you. Template engines like Nunjucks, Pug, Liquid, and Blade offer you different ways to create templates.
Here is an example of a base template in the Nunjucks template engine:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Your Website</title>
</head>
<body>
<header>
{% block header %}
This is your header content
{% endblock %}
</header>
<div id="content">{% block content %}{% endblock %}</div>
<footer>
{% block footer %}
This is your footer content
{% endblock %}
</footer>
</body>
</html>
You can think of a base template as a container of placeholders. The {% block %}
tags are blocks of template code. These can contain content or be left empty in order to be filled or overridden by child templates. Notice that each {% block %}
tag has a variable name appended. For example, nested inside of the <header>
HTML tag, there’s a {% block header %}
tag that specifies that this block is named header. You can name this whatever you’d like; however, it is best practice to use descriptive variable names in order to avoid confusion.
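To give a sense of how a child template fills those blocks in, here is a minimal Python sketch using Jinja2, the engine that Nunjucks is modeled after (the template names and contents are made up for illustration):
from jinja2 import Environment, DictLoader

# In-memory templates for the sketch; an SSG would load these from a
# layouts or templates directory instead.
templates = {
    "base.html": "<header>{% block header %}This is your header content{% endblock %}</header>\n"
                 "<div id='content'>{% block content %}{% endblock %}</div>",
    # The child template overrides only the block it needs to change.
    "page.html": "{% extends 'base.html' %}\n"
                 "{% block content %}Welcome to the About page.{% endblock %}",
}

env = Environment(loader=DictLoader(templates))
print(env.get_template("page.html").render())
# <header>This is your header content</header>
# <div id='content'>Welcome to the About page.</div>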
Template engines perform some heavy lifting to create these reusable templates that are then turned into HTML. These template files are usually placed in a directory called layouts
or templates
. Because templates are meant to be reusable, you won’t often write content directly within your template files. For content, you can create what are called Markdown files.
Markdown is a markup language that is used to add formatting to plaintext documents before being converted to HTML. Although not a full replacement for writing HTML, Markdown can help you write and structure your content without worrying too much about HTML tags.
For instance, if you need to create an unordered list in HTML, you have to create an <ul>
tag, then nested within, are your list elements wrapped with the <li>
tag. In Markdown, you can create this list with an asterisk *
.
<ul>
<li>Thing 1</li>
<li>Thing 2</li>
<li>Thing 3</li>
</ul>
* Thing 1
* Thing 2
* Thing 3
Many SSGs allow you to write all your content in Markdown apart from the encompassing code base. This way, you can focus on writing content instead of code, although you do have the option to write plain HTML in a Markdown file. These files are usually referred to as content files and are typically stored within a content specific directory. Your content will be rendered accordingly when paired with the power of your templates and the metadata used in front matter.
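As a quick sketch of that conversion step, the third-party markdown package in Python turns Markdown text like the list above into the equivalent HTML fragment:
import markdown

text = "* Thing 1\n* Thing 2\n* Thing 3"

# markdown.markdown() converts Markdown text into an HTML fragment.
print(markdown.markdown(text))
# <ul>
# <li>Thing 1</li>
# <li>Thing 2</li>
# <li>Thing 3</li>
# </ul>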
Info: You can learn how our custom and open-source Markdown engine powers and formats the content on DigitalOcean’s Community site.
Another powerful tool in the SSG toolbox are languages that assist in configuring and formatting metadata. YAML, TOML, and JSON are the languages used to define metadata within the front matter of your content files. Front matter is the structured data that describes or defines attributes about the content. You can also use this data to apply a specific layout from your templates. This data usually sits at the very top of any given content file.
For example, with front matter you can define a title, an author, provide a brief description of your content, and use a specific layout from your templates:
---
title: Front Matter Matters
author: Author Name
description: Write out your description here
layout: base.html
---
This example uses the YAML language to define the page’s metadata. The three hyphens ---
at the top and bottom encapsulate your front matter metadata. Without these hyphens, your metadata will not be functional. In practice, you can include more or fewer metadata elements. Paired with a templating engine and Markdown, your content will be rendered to a page according to how you’ve structured your site.
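Under the hood, an SSG typically splits the front matter away from the body and hands the metadata to a YAML parser. The following Python sketch, using the third-party PyYAML package, is a simplified version of that step:
import yaml

document = """---
title: Front Matter Matters
author: Author Name
layout: base.html
---
The body of the content file goes here.
"""

# Split off the block between the two --- delimiters and parse it as YAML.
_, front_matter, body = document.split("---", 2)
metadata = yaml.safe_load(front_matter)

print(metadata["title"])   # Front Matter Matters
print(metadata["layout"])  # base.html
print(body.strip())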
Structuring your front matter can become increasingly complex as your site grows. Though it is beyond this article to explain the intricacies, you can use front matter to create links, tags, and more to customize your site.
After structuring and creating content for your site, what’s left is to build it. During this build process, all your assets – including things like CSS, images, JavaScript, metadata and more – are entered into a pipeline and pieced together. These assets are typically minified, transpiled, compiled, bundled, then ultimately served to a user as static files.
When your assets are minified, an application removes white space, indentation, new lines, and code comments, and shortens long variable names, in order to keep the file as small as possible. This is sometimes referred to as “uglifying” your code, since it removes much of the formatting that makes it easy to read. Although it may look odd, your code still functions the same when minified.
The following is an example of some CSS code prior to minification:
html {
box-sizing: border-box;
}
*,
*::before,
*::after {
box-sizing: inherit;
margin: 0;
padding: 0;
}
body {
background: seagreen;
font-family: sans-serif;
}
.wrapper {
padding: 1rem;
border-radius: 1rem;
display: flex;
}
Here is the same code, but minified:
html{box-sizing:border-box}*,::after,::before{box-sizing:inherit;margin:0;padding:0}body{background:#2e8b57;font-family:sans-serif}.wrapper{padding:1rem;border-radius:1rem;display:flex}
Minified files are not meant to be edited. The minified code is written to a new file, and that file is the one that gets used on the production site. If you need to make changes, edit the original unminified file, re-save and re-compile the minified file, and replace it on the server to reflect your changes.
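The minification step itself is largely mechanical. The following Python sketch is a toy illustration of the idea (real minifiers, such as the ones wired into SSG build pipelines, do considerably more):
import re

def minify_css(css):
    # Strip comments, collapse runs of whitespace, then remove the
    # remaining whitespace around punctuation.
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.S)
    css = re.sub(r"\s+", " ", css)
    css = re.sub(r"\s*([{};:,])\s*", r"\1", css)
    return css.strip()

print(minify_css("body {\n  background: seagreen;\n  font-family: sans-serif;\n}"))
# body{background:seagreen;font-family:sans-serif;}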
It’s common practice to write code in separate files during development. This allows you to use smaller, more manageable chunks of code rather than a single monolithic file containing all the code for your site.
For example, you might have multiple scripts and modules that call each other when a button is clicked. You might also have multiple CSS stylesheets for the different pages and elements on your site. Furthermore, it may also be true that your scripts and CSS depend on each other for functionality. When your site is in production, having to request all of these resources individually can slow down your site, as each adds a little latency to the rendering process. This is what bundling is designed to solve.
Code bundling is the process of combining and merging code into a single file. For instance, if your JavaScript functions, modules, and components are contained in separate files, bundling merges them into one JavaScript file. The same is true if you write your styles using a CSS preprocessor, like SASS, where the code is separated. Bundling will also compile these files into a single CSS file.
In the end, bundling, like minifying, is an optimization process. Instead of the browser requesting many separate scripts or stylesheets, bundling means it may only need to request a few files.
Different bundlers use different processes to merge your code together. There are quite a few bundlers out there and each SSG is usually integrated with one. rollup.js, Parcel, and webpack are examples of bundlers. SSGs like Gatsby and Next.js use webpack to bundle their assets.
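As a rough illustration of the idea, and not how any particular bundler actually works, the following Python sketch simply concatenates a few source files into one output file. The input paths reuse names from the example project structure shown later in this article, and the output path is hypothetical:

from pathlib import Path

def bundle(source_files, output_file):
    # Concatenate multiple source files into a single bundled file.
    bundled = "\n".join(Path(f).read_text() for f in source_files)
    output = Path(output_file)
    output.parent.mkdir(parents=True, exist_ok=True)
    output.write_text(bundled)

# Real bundlers such as webpack or rollup.js also resolve imports, rewrite
# module syntax, and optimize the result; this only shows the core idea.
bundle(["src/js/module.js", "src/js/script1.js"], "public/bundle.js")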
The transpiling and compiling process essentially turns your code into files that any web browser can read and execute.
In general, a compiler is an application that translates and converts a higher level programming language into a lower level programming language. For example, a compiler could translate C code into machine code, the 1s and 0s that a processor executes, in order to run a program.
A transpiler is a type of compiler that translates code from one programming language to another equivalent programming language. As an example, the transpiling process could mean turning code written in the TypeScript programming language into JavaScript. Since web browsers don’t understand TypeScript code, it needs to be translated into JavaScript in order for a browser to execute it.
The popular JavaScript application called Babel is an example of these concepts in practice. Babel does its best to ensure that JavaScript code is readable in old and modern browsers alike. Another example of this is Autoprefixer. It detects newer CSS features and inserts the proper fallbacks for wider browser support.
Whether you’re working within a legacy code base or with the newest syntax, the transpiling and compiling process interprets the code for the browser to achieve the deepest possible browser support so that users can interact with the page as you intended.
When you’re finished creating your site, you can type your SSG’s specific build command into your terminal. This command is configured to minify, transpile, compile, and bundle your code, as well as perform any other tasks required to serve your static files. For example, if you’re using the Gatsby SSG, you would type gatsby build into your terminal to initiate the build.
During the build process, you may notice the terminal outputting the build’s progress. This includes specific details about how the files are being processed. You may also encounter error messages during this process telling you where the failure occurred. If everything runs successfully, the terminal will tell you after it is finished with the build.
During a successful build, the static HTML, CSS, and JavaScript files needed to render the site are placed in a public or dist directory. The exact name of this folder will depend on your configuration.
Note: If you’re interested in a deeper explanation into this process, Gatsby’s great documentation details what happens during its build process.
After building your site, you may end up with a site structure that is similar to this:
your_project_root
├── node_modules
├── public
├── src
│   ├── css
│   │   └── styles.css
│   ├── js
│   │   ├── script1.js
│   │   └── module.js
│   ├── images
│   │   ├── dog.jpg
│   │   └── cat.jpg
│   ├── _includes
│   │   ├── partials
│   │   │   └── about.html
│   │   └── layouts
│   │       └── base.html
│   ├── posts
│   │   ├── post-1.md
│   │   ├── post-2.md
│   │   └── post-3.md
│   └── index.html
├── .your_SSG_config.js
├── package.json
├── package-lock.json
├── README.md
└── .gitignore
Please note that, depending on the SSG you use, your file structure may include different files and directories. Though there can be a significant amount of work up front when creating a site with an SSG, the payoff is static files that can be served without additional server processes, creating a faster loading experience for your users.
Static site generators by themselves are powerful tools. Coupled with other web technologies, they can blur the line between static and dynamic. SSGs play an integral role in the Jamstack ecosystem. Like a LAMP or MEAN stack, the Jamstack is another way to create and architect a website or web application. It uses the power of SSGs to create static HTML and utilizes JavaScript and API calls to connect to backend services and databases.
An SSG can also work in tandem with a CMS. This model of development is called a headless CMS. With a headless CMS, you have a way to store data, a graphical user interface to interact with for content, and an API endpoint to connect to. In this model, you’re removing the presentation layer — the “head” — from the backend management system. An SSG fills this missing role for the presentation layer. For example, an editor can use the CMS interface to create content, which is then stored in the database. A developer can then access that content via the API endpoint and create a view to display the content to users.
SSGs are also extensible. With different plugins and modules, your SSG can become something a bit more than its initial offering. For example, in Eleventy, you can use a plugin to optimize and resize your images and even utilize serverless functions to create dynamic pages. Depending on your needs, an SSG can grow in complexity and functionality, while still outputting static files.
In this conceptual article, you’ve learned about some of the underlying technologies that static site generators use to create a website. With this information, you now have a better understanding of how static sites are created with an SSG.
If you’d like to know more about creating a site with an SSG, we have a tutorial series on the popular SSG Gatsby.
If you’re interested in Eleventy, we also have a tutorial on How to Create and Deploy Your First Eleventy Website.
When setting up a web site or application under your own domain, your hosting provider may also offer you the option of configuring your own mail server. Although there are many robust open source solutions such as Dovecot, hosting your own mail is often not the best option for many deployments. Because of the relatively complicated way that DNS records, spam filters, and webmail interfaces are implemented, maintaining your own mail server is becoming less popular, and less widely supported by hosting providers. Most people will get more value out of using a hosted mail service. This guide will cover many of the reasons that you may not want to run your own mail server, and offer a few alternatives.
A typical mail server consists of many software components that provide a specific function. Each component must be configured and tuned to work nicely together and provide a fully-functioning mail server. Because they have so many moving parts, mail servers can become complex and difficult to set up.
A mail server requires at least the following components: a Mail Transfer Agent (MTA), a Mail Delivery Agent (MDA), and an IMAP and/or POP3 server.
In addition to those, you will probably want to add these components: a spam filter, an antivirus scanner, and a webmail interface.
While some software packages include the functionality of multiple components, the choice of each component is often left up to you. In addition to the software components, mail servers need a domain name, the appropriate DNS records, and an SSL certificate.
Let’s take a look at each component in more detail.
A Mail Transfer Agent (MTA), which handles Simple Mail Transfer Protocol (SMTP) traffic, has two responsibilities: sending outgoing mail from your users to external mail servers, and accepting incoming mail from external mail servers.
Examples of MTA software include Postfix, Exim, and Sendmail.
Note: As a general rule, even if you are committed to not running a full mail server, an MTA is still relatively straightforward to deploy on its own in order to send alerts or notifications from your software. Sending that mail can still be challenging, however, because some hosting providers (including DigitalOcean) automatically block the default outgoing mail port, 25, to prevent their platforms from being used for spam. To work around this, you can relay outgoing mail through a third-party SMTP server. You can also review How To Install and Setup Postfix.
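For example, if all you need is to send occasional notifications, relaying mail through a third-party SMTP server from a short script is often enough. The following is a minimal sketch using only Python’s standard library; the hostname, port, addresses, and credentials are placeholders for whatever your mail provider supplies:

import smtplib
from email.message import EmailMessage

message = EmailMessage()
message["From"] = "alerts@your_domain"
message["To"] = "sammy@example.com"
message["Subject"] = "Disk usage warning"
message.set_content("Disk usage on web-01 has exceeded 90%.")

# Connect to a third-party SMTP relay instead of running your own MTA.
with smtplib.SMTP("smtp.your_mail_provider.com", 587) as server:
    server.starttls()  # upgrade the connection to TLS
    server.login("smtp_username", "smtp_password")
    server.send_message(message)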
A Mail Delivery Agent (MDA), which is sometimes referred to as the Local Delivery Agent (LDA), retrieves mail from an MTA and places it in the appropriate mail user’s mailbox.
There are a variety of mailbox formats, such as mbox and Maildir. Each MDA supports specific mailbox formats. The choice of mailbox format determines how the messages are actually stored on the mail server which, in turn, affects disk usage and mailbox access performance, as well as import/export compatibility.
Examples of MDA software include Postfix and Dovecot.
IMAP and POP3 are the protocols that mail clients — the software used to read email — use to retrieve mail from the server.
IMAP is the more complex protocol that allows, among other things, multiple clients to connect to an individual mailbox simultaneously. The email messages are copied to the client, and the original message is left on the mail server.
POP3 is simpler, and moves email messages to the mail client’s computer, typically the user’s local computer, by default.
Examples of software that provides IMAP and/or POP3 server functionality include Courier, Dovecot, and Zimbra.
The purpose of a spam filter is to reduce the amount of incoming spam, or junk mail, that reaches users’ mailboxes. Spam filters accomplish this by applying spam detection rules — which consider a variety of factors such as the server that sent the message, the message content, and so forth — to incoming mail. If a message’s “spam level” reaches a certain threshold, it is marked and treated as spam.
Spam filters can also be applied to outgoing mail. This can be useful if a user’s mail account is compromised, to reduce the amount of spam that can be sent using your mail server.
SpamAssassin is a popular open source spam filter.
Antivirus is used to detect viruses, trojans, malware, and other threats in incoming and outgoing mail. ClamAV is a popular open source antivirus engine.
Many users expect their email service to provide webmail access. Webmail, in the context of running a mail server, is a mail client that can be accessed by users via a web browser. Gmail is probably the best-known example of this. The webmail component, which requires a web server such as Nginx or Apache, can run on the mail server itself.
Examples of software that provide webmail functionality include Roundcube and Citadel.
Although having to maintain a stack of four or five different software components in order to provide basic functionality is not ideal, it may not seem so much worse than other deployments in that regard. This, however, does not take into account the significant “trust” issues of running your own mail server.
In many ways, mail server stacks represent a collision between the tools and values of the early internet — self-hosting open source software using well-defined standards and interoperable protocols — and the reality of the modern internet — a few centralized, trusted authorities. More than web servers, database servers, or other cloud software, they have to handle an enormous amount of untrustworthy input, and the trust standards of commercial mail servers are very high as a result. Because mail servers are constantly handling attachments of potentially harmful files, and constantly filtering spam and spam addresses, it can be quite challenging to run a server that actually keeps up with the expectations of modern webmail providers. Many of them will not hesitate to block traffic from a temporarily compromised sender, especially if it is a small, self-hosted operation.
It is not trivial to keep your server off of the various blacklists, also known as DNSBL, blocklists, or blackhole lists. These lists contain the IP addresses of mail servers that were reported to send spam or junk mail (or for having improperly configured DNS records). Many mail servers subscribe to one or more of these blacklists, and filter incoming messages based on whether the mail server that sent the messages is on the list(s). If your mail server gets listed, your outgoing messages may be filtered and discarded before they reach their intended recipients.
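As a rough sketch of how these lookups work, a DNSBL is queried over DNS by reversing the octets of an IP address and appending the list’s zone; if the name resolves, the address is listed. The following Python example uses the Spamhaus zen zone and the standard 127.0.0.2 test address purely for illustration; check a list’s usage policy before querying it from production systems:

import socket

def is_blacklisted(ip_address, dnsbl_zone="zen.spamhaus.org"):
    reversed_ip = ".".join(reversed(ip_address.split(".")))
    query = f"{reversed_ip}.{dnsbl_zone}"
    try:
        socket.gethostbyname(query)
        return True   # the list returned a record, so the address is listed
    except socket.gaierror:
        return False  # no record means the address is not listed

# 127.0.0.2 is a test address that DNSBLs conventionally always list.
print(is_blacklisted("127.0.0.2"))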
When deploying a web server, it is fairly common to experience occasional outages from DNS misconfiguration. There is a substantial ecosystem of CDNs and load balancers whose primary purpose is to prevent these minor outages from otherwise impacting your infrastructure. When it comes to mail servers, however, a minor misconfiguration can make it hard to — literally — restore trust.
If your mail server gets blacklisted, it is often possible to get it unlisted (or removed from the blacklist). You will need to determine the reason for being blacklisted, and resolve the issue. After this, you will need to look up the blacklist removal process for the particular list that your mail server is on, and follow it.
Hosted mail services fall into two broad categories. The first category is comprised of personal webmail providers. These service providers are widely known for their free service tiers, and usually provide paid options for hosting a custom email domain, supporting multiple users of a shared business account, and so on. They usually provide their own webmail interfaces and dedicated mobile apps.
A second category is mail delivery services. These providers are not necessarily in the personal email business, but instead provide API access for any software that needs to send mail in bulk, such as password change notifications or advertising campaigns. Usually, these services include dedicated mail server credentials, the relevant trust and filtering features, and a web dashboard to monitor your mail volume and any related issues. They are typically priced by usage.
This list is not exhaustive, but should provide an overview of the service landscape.
Although email is a fundamental internet technology, many cloud providers are reluctant to support self-hosted mail servers because of their inherent challenges. We generally recommend using an external provider to handle email for your cloud.
If you are determined to run your own mail server, you can see a comprehensive example in How To Configure a Mail Server Using Postfix, Dovecot, MySQL, and SpamAssassin.
A web server’s primary role is to serve web pages for a website. A web page can be rendered from a single HTML file, or a complex assortment of resources fitted together. If you want to host your web application on the internet, in many cases you will need a web server.
One of the most common use cases for web servers is serving files necessary for rendering a website in a browser. When you visit http://www.digitalocean.com, you begin by entering a URL that starts a request over the internet. This request passes through multiple layers, one or more of which will be a web server. This web server generates a response to your request, which in this case is the DigitalOcean website, specifically the homepage. Ideally, this happens quickly and with 24/7 availability.
While any visitor to DigitalOcean’s homepage will experience it as a single web page, in reality most modern web pages today are a combination of many resources. Web servers act as an intermediary between the backend and the frontend, serving up resources ranging from HTML and CSS files to JSON data, all either generated dynamically on the fly or served statically. If you intend to work with websites or online apps in any capacity, it is extremely useful to familiarize yourself with the basics of what a web server is, and how it works.
While the term “web server” can refer to either the software itself or the hardware it exists on, this article refers specifically to web server software. For more details on this difference, check out our introduction to cloud servers.
A web server handles requests on the internet through the HTTP and HTTPS protocols, and is also called an HTTP server. A web server is distinct from other types of servers in that it specializes in handling these HTTP and HTTPS requests, differentiating itself from application servers (e.g. Gunicorn) and servers for other protocols (e.g. WSGI). These other servers work as intermediaries for backend programming languages through external libraries, which is a different level of abstraction than web servers.
Here are some common tasks handled by web servers:
In practical terms, here are some personal projects that would involve a web server:
This list is by no means comprehensive, and a web server is not strictly limited in the data types it can serve to an end user. For example, a web server that serves web API requests often responds with data in a format such as JSON.
Web servers cater to an audience with expectations of speed, availability, reliability, and more. They have a shared purpose of serving content on the internet, and in order to be considered a viable web server solution, the following aspects must be considered:
While web servers can offer different solutions, the solutions they offer stem from attempts to address these same problems. These problems themselves evolve over time along with the needs and expectations of the end user, making this a living and ever evolving list.
The most popular open source web servers are currently Apache and Nginx.
Apache came first, and was built at a time when it was common for multiple websites with their own individual configuration files to all exist on a single web server. Nginx came after Apache, at a time when needs shifted away from serving multiple websites from one server, and instead towards serving one website from one server in an extremely efficient manner under load.
While web servers share the same goals and problems, each solution’s interpretation and implementation will be different. The exact answers to these problems shape the identity of any given web server solution. Nginx and Apache are highlighted here due to their ubiquity, but any web server solution will be opinionated. When selecting a web server, it’s important to keep your own needs in mind for your specific project. That way, even if the landscape of web server offerings change, your method of evaluation stays grounded by your own requirements.
Here are some key differentiators in how web servers attempt to accomplish the goals of a web server:
Web servers store their settings in configuration files. You can customize your web servers by editing these files. The storing and organization of configuration files is an opinionated, structural issue that splits web server products.
The main divide is between centralization and decentralization. Decentralized configuration files allow for a granular level of control at the filesystem level, which stems from a need to host multiple websites on one server. Centralized configurations don’t focus on hosting multiple websites on one server, and instead focus on efficiently serving a single website. These configurations rely on URI pattern matching, which is the matching of URLs to filenames and other unique identifiers, instead of relying on matching against the directory structure of a web server.
Apache’s .htaccess files facilitate decentralized configuration as a feature, and every design decision flows from this focus on the filesystem with a granular level of control. Nginx does not have the same filesystem focus, and instead relies on URI pattern matching and a centralized configuration.
The physical and virtual servers you run web servers on have limited resources, such as RAM and CPU processing power. How your web server fundamentally manages its requests will have the largest impact on how efficiently it uses those resources. A single request can spawn an entire process, or it can be handled on an event-driven basis. The capacity of your web server to handle multiple simultaneous requests efficiently is tied to these fundamental design decisions.
Apache handles requests by spawning processes, which consumes resources at a rate that can become a problem under load. Nginx’s event-driven handling system uses fewer resources and can be more performant under load.
Besides web pages, web servers get requests for other resources such as images, videos, CSS files, and JavaScript files. Since these items are always the same regardless of who requests them, this type of content is referred to as static. Oftentimes a web page itself is just an HTML file that isn’t customized to the person requesting it, and is also treated as static content. Web servers can also compress this static content for better load times.
Nginx excels at serving static content due to its event-driven request handling. Apache can also serve static content, but in most setups not at the same speed and efficiency under load as Nginx.
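To illustrate what serving static content involves at its most basic, the following sketch starts a bare-bones static file server using only Python’s standard library. It is a teaching aid, not a substitute for a production web server such as Nginx or Apache:

from http.server import HTTPServer, SimpleHTTPRequestHandler

# Serve the files in the current directory at http://localhost:8000
server = HTTPServer(("0.0.0.0", 8000), SimpleHTTPRequestHandler)
print("Serving static files on port 8000...")
server.serve_forever()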
When content is changed, processed, and customized depending on who is requesting it, the content is referred to as dynamic. For example, after you log in to a website, the website will often dynamically populate your username in the top navigation bar. Dynamic content adds extra complexity because it forces the web server to process each request uniquely at the time it is received. Content tailored per request cannot be served to everyone, and cannot be universally cached.
Processing dynamic content internally removes an extra layer of abstraction that would normally require handing off the request to an external library. Apache natively implements dynamic content processing, with popular solution stacks such as LAMP (Linux, Apache, MySQL, PHP). Nginx is more language agnostic but requires external processors such as PHP-FPM to act as a similar solution for use cases such as the LAMP stack.
A reverse proxy sits in front of a traditional web server, becoming an intermediary server that routes HTTP request traffic to web servers behind it. A reverse proxy becomes the gateway that directs traffic between web servers and the internet at large, and often is the layer that directly interfaces with a firewall. While most web servers have reverse proxy capability, Nginx was built and optimized from the ground up to be a robust reverse proxy server.
Nginx’s importance in real world usage hinges heavily on its reverse proxy features and efficiency. Many server setups place multiple traditional web servers behind an Nginx reverse proxy, using Nginx to determine which web server to send the request to based on load or rule configuration. This intermediary role allows it to even pair with Apache in some setups, sitting as a reverse proxy in front of a traditional Apache web server.
Nginx and Apache both have strong support from their respective development teams and communities. Being the most popular open source web servers, learning resources for both are plentiful. Apache is supported by the Apache Software Foundation, a non-profit organization, and will always be free to use. Nginx’s core is open source, but some desirable features, such as upstream health checks, session persistence, and advanced monitoring, are locked behind the commercial Nginx Plus offering.
If you want a server that is ready at all times to respond to an incoming HTTP request, then a web server accomplishes this task best. As you stray further from focusing on serving HTTP requests, web servers will become less of an ideal solution. This is especially true for the auxiliary features that web servers provide. For example, features such as caching may be more efficiently handled at the reverse proxy or CDN levels, depending on the setup.
Additionally, as developers have shifted their priorities in dedicating development resources to managing web servers, solutions such as serverless, headless CMS, and Jamstack have emerged in response. These solutions don’t require a self-hosted web server, instead abstracting out the web server layer to external services. For developers who don’t require granular or advanced control of the web server layer, development time can be focused elsewhere. For more, check out this article on Jamstack with headless CMS or implementing full stack Jamstack with DigitalOcean’s App Platform.
In this article, you’ve gone through a basic primer of what web servers are, how they’re used, and the problems they’re trying to solve. Equipped with this knowledge, you dove into the current landscape of web server solutions, and applied your knowledge towards finding the solution that fits your needs specifically. To learn more about how to set up and use a web server, check out the rest of our Cloud Curriculum series on web servers.
A container is a unit of software that packages code with its required dependencies in order to run in an isolated, controlled environment. This allows you to run an application in a way that is predictable, repeatable, and without the uncertainty of inconsistent code execution across diverse development and production environments. A container will always start up and run code the same way, regardless of what machine it exists on.
Containers provide benefits on multiple levels. In terms of physical allocation of resources, containers provide efficient distribution of compute power and memory. However, containers also enable new design paradigms by strengthening the separation of application code and infrastructure code. This design separation also enables specialization within teams of developers, allowing developers to focus on their strengths.
Considering the breadth of what containers bring to the table, this article will bring you up to speed with the common, related terminology. After a general introduction to the benefits of containers, you will learn the nuance between terms such as “container runtimes”, “container orchestration”, and “container engines”. You will then apply this towards understanding the goals of containers, the specific problems they’re trying to solve, and the current solutions available.
Note: This article will abstract the feature sets and benefits of containers away from specific products where it makes sense. However, learning how to use containers will ultimately involve learning how to use Docker. Docker containers are not only one of the most popular stand-alone container solutions in their own right, but are also the de facto standard for many container related technologies.
Additionally, the focus here will only be on containers specialized in isolating single application processes, which is the contemporary implementation of containers most people currently associate with container technology. While container technologies such as LXC existed prior to technologies such as Docker, and even acted as the base of early development for parts of Docker, LXC is different enough in design and practical usage to be considered out of the scope of this article.
Containers are efficient with their consumption of resources — such as CPU compute power and memory — compared to options such as virtual machines. They also offer design advantages both in how they isolate applications and how they abstract the application layer away from the infrastructure layer.
At a project level, this abstraction enables teams and team members to work on their piece of the project without the risk of interfering with other parts of the project. Application developers can focus on application development, while infrastructure engineers can focus on infrastructure. Separating the responsibilities of a team in this case also separates the code, meaning application code won’t break infrastructure code and vice versa.
Containers provide a smoother transition between development and production for teams. For example, if you need to run a Node.js application on a server, one option is to install Node.js directly. This is a straightforward solution when you are a singular developer working on a singular server. However, once you begin collaborating with multiple developers and deploying to multiple environments, that singular installation of Node.js can differ across different team members using even slightly differing development environments.
At a development level, the portability of containers enables infrastructure design that doesn’t have to account for unpredictable application code execution. In this context, containers enable developers to:
These features may make containers appear similar to virtual machines, but the difference is in their underlying design and subsequent efficiency. Modern container technologies are designed specifically to avoid the heavy resource requirements of virtual machines. While containers share the same principles of portability and repeatability, they are designed at a different level of abstraction. Containers skip the virtualization of hardware and the kernel, which are the most resource intensive parts of a virtual machine, and instead rely on the underlying hardware and kernel of the host machine.
As a result, containers are comparatively lightweight, with lower resource requirements. Container technology has subsequently fostered a rich ecosystem to support container centered development, including:
Strictly speaking, these tools and benefits are auxiliary features on the edges of container technology. However, these are often bundled together as complete container solutions due to the ubiquity of their need.
Due to their broad set of use cases, “containers” can refer to multiple things. To help you understand the nuance between the interconnected concepts, here are some key terms that are often confused or used interchangeably in error:
At the OCI runtime level, runc is the most common; it is used by Docker and written in Golang. An alternative is crun from Red Hat, which is written in C and is meant to be faster and more lightweight. At the CRI level, dockershim has steadily been dropped in favor of the more fully featured containerd from Docker, or the alternative CRI-O from Red Hat.
While these terms all refer to specific aspects of container technology, don’t be surprised if an informal discussion of containers confuses some of these hardline definitions. These definitions are important for your learning path, but be aware that people sometimes use these terms more loosely in everyday conversation.
Container solutions share some common problems that must be addressed in order to be successful:
While containers can offer different solutions, the solutions they offer stem from attempts to answer the same problems. These problems themselves evolve over time along with the needs and expectations of the end user, making this list ever evolving.
Realistically speaking, choosing a container is choosing a container engine. Most container engines use a combination of runc as their OCI runtime in conjunction with containerd as their CRI. As previously mentioned, the current landscape is dominated by Docker. Whether or not you choose Docker as your primary deployment solution, your learning path will likely cross with Docker.
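For a sense of what running a container looks like in practice, here is a minimal sketch that uses the Docker SDK for Python. It assumes Docker Engine is installed and running locally, that the docker package has been installed with pip, and the image tag is only an example:

import docker

client = docker.from_env()

# Pulls the image if needed, creates an isolated container, runs the command,
# and returns the command's output once the container exits.
output = client.containers.run("alpine:3.19", ["echo", "hello from a container"])
print(output.decode())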
The closest competitor is Red Hat’s solution stack of Podman, which manages the runtime of containers; Buildah, which builds container images; and Skopeo, which is an interface between container images and image registries.
Ultimately, the differences between these solutions hinge on how they handle root access and their usage of daemons. Docker requires root access in order to function, and this is an extra level of access that widens the plane of potential security concern. Docker also requires a daemon to always be running. The Podman solution stack requires neither of these things.
The next level of container deployment is automation, and container orchestration automates many steps by always adjusting towards an ideal, defined state across a deployment. As a whole orchestration involves provisioning, configuration, scheduling, scaling, monitoring, deployment, and more.
While a full dive into container orchestration is beyond the scope of this article, two prominent players are Docker with Docker Compose and Docker Swarm mode, and Kubernetes. In rough order of complexity, Docker Compose is a container orchestration solution that deals with multi-container deployments on a single host. When there are multiple hosts involved, Docker Swarm mode is required.
Kubernetes is a purpose-built container orchestration solution. Whereas Docker’s orchestration solutions are balanced against their focus on the core container components, Kubernetes is scoped for extensibility and granular control in orchestration. This results in a tradeoff where Kubernetes deployments are more complex.
This article outlined what containers are and what benefits they bring to the table. Container technology is rapidly evolving, and knowing the terminology will be crucial to your learning journey. Likewise, understanding the goals of containers and the landscape of available solutions allows you to make an informed decision on adopting this technology for your specific needs. To learn more about how to set up and use containers, check out the rest of our Cloud Curriculum series on containers.
To set up a cloud server, one of the first things you need to do is install an operating system. In a modern context, this means a Linux operating system almost all of the time. Historically, both Windows servers and other types of Unix were popular in specific commercial contexts, but almost everyone now runs Linux due to its broad support, free or flexible licensing, and overall ubiquity in server computing. There are many Linux distributions available, each with their own maintainers, with some backed by commercial providers and some not. The distributions detailed in the following sections are some of the most popular operating systems for running cloud servers.
Ubuntu is one of the most popular Linux distributions for both servers and desktop computers. New Ubuntu versions are released every six months, and new long-term support versions of Ubuntu are released every two years and supported for five. Most educational content about Linux reflects Ubuntu due to its popularity, and the breadth of available support is a significant point in its favor.
Debian is upstream of Ubuntu, meaning its core architectural decisions usually inform Ubuntu releases, and it uses the same .deb package format and apt package manager that Ubuntu uses. Debian is not as popular for production servers due to its conservative packaging choices and lack of commercial support. However, many users pick Debian due to its portability and its use as a baseline for many other Linux distributions on different platforms, including Raspbian, the most popular Raspberry Pi OS.
Red Hat Enterprise Linux, or RHEL, is the most popular commercially supported Linux distribution. Unlike the Debian family, it uses .rpm packages and a package manager called dnf, along with its own ecosystem of tools. For licensing reasons, Red Hat is only used where there is a commercial support agreement in place.
Rocky Linux is downstream of Red Hat the way that Ubuntu is downstream of Debian, and unlike RHEL is free to use like most other Linux distributions, making it a popular choice for users that have adopted Red Hat tooling but may not be using Red Hat’s commercial support. Previously, a distribution called CentOS filled the same role as Rocky Linux, but its release model is changing. Rocky Linux versions track closely with RHEL versions, and most documentation can be shared between the two.
Fedora Linux is upstream of Red Hat, and like Ubuntu, is used in desktop environments as well as on servers. Fedora is the de facto development home of most RHEL ecosystem packages, as well as of the Gnome desktop environment, which is used as a default by Ubuntu and others.
Arch Linux is another popular desktop-focused Linux distribution which is not a member of either the Debian or the Red Hat Linux family, but provides its own unique packaging format and tools. Unlike the other distributions, it does not use release versioning of any kind — its packages are always the newest available. For this reason, it is not recommended for production servers, but provides excellent documentation, and can be very flexible for knowledgeable users.
Alpine Linux is a minimal Linux distribution which does not provide many common tools by default. Historically there have been many Linux distributions created with this goal in mind. Alpine is commonly used in modern containerized deployments such as Docker, where your software may need a virtualized operating system to run in, but needs to keep its overall footprint as small as possible. You would generally not work directly in Alpine Linux unless trying to prototype a container.
Previously, there were more differences between distributions in their choice of init system, window manager, and other libraries, but nearly all major Linux distributions have now standardized on systemd and other such tools.
There are many other Linux distributions, but most of the others can be currently understood in relation to these seven. As you can tell from this overview, most of your selection criteria for Linux distributions will come down to:
Choosing a distribution is down to preference, but if you are working in the cloud and do not have any production requirements for the Red Hat ecosystem, Ubuntu is a popular default choice. You can also review the available packages for a given distribution from their web-facing package repositories. For example, the Ubuntu 22.04 “Jammy Jellyfish” packages are hosted under the Jammy section of Ubuntu.com.
Most Linux distributions also differ significantly in how third-party packages — packages not available from the repository’s own package sources — are created, discovered, and installed. Red Hat, Fedora, and Rocky Linux generally use only a few popular third party package repositories in addition to their official packages, in keeping with their more authoritative, production-minded approach. One of these is the Extra Packages for Enterprise Linux or EPEL. Because the RHEL ecosystem draws a distinction between packages that are commercially supported and those that aren’t, many common packages that are available out of the box on Ubuntu will require you to configure EPEL to install them on Red Hat. In this and many other cases, which packages are available upstream from your distribution’s own repositories is often a matter of authoritativeness and maintenance responsibility more than anything else. Many third-party package sources are widely trusted, they may just be out of the scope of your distribution’s maintainers.
Ubuntu allows individual users to create PPAs, or personal package archives, to maintain third-party software for others to install. However, using too many PPAs concurrently can cause incompatibility headaches: because Debian and Ubuntu packages are versioned with specific dependency requirements, PPA maintainers need to match Ubuntu’s upstream updates fairly closely. Arch Linux has a single repository for user-submitted packages, fittingly called the Arch User Repository or AUR, and although their approach seems more chaotic by comparison, it can be more convenient in practice if you use dozens of third-party packages.
You can also avoid adding complexity to your system package manager by instead installing third-party software through Homebrew or through Docker. Although “Dockerized” or containerized deployments can be inefficient in terms of disk usage and installation overhead, which is where Alpine Linux comes into consideration, they are portable across distributions and do not impose any versioning requirements on your system. However, any packages not installed by your system package manager may not receive automatic updates by default, which should be another consideration.
In this tutorial, you reviewed some of the most important considerations in choosing a Linux distribution for your cloud. The now-widespread use of Docker and other container engines means that choosing a distribution is not quite as impactful in terms of the software you’re able to run as it was in the past. However, it still factors heavily into how you’ll obtain support for your software, and should be a significant consideration as you scale your infrastructure for production.
To learn more about how to work with the system package manager on different Linux distributions, refer to Package Management Essentials.
Linux is, by definition, a multi-user OS that is based on the Unix concepts of file ownership and permissions to provide security at the file system level. To reliably administer a cloud server, it is essential that you have a decent understanding of how ownership and permissions work. There are many intricacies of dealing with file ownership and permissions, but this tutorial will provide a good introduction.
This tutorial will cover how to view and understand Linux ownership and permissions. If you are looking for a tutorial on how to modify permissions, you can read Linux Permissions Basics and How to Use Umask on a VPS.
Make sure you understand the concepts covered in the prior tutorials in this series:
To follow this tutorial, you will need access to a cloud server. You can follow this guide to creating a DigitalOcean droplet.
As mentioned in the introduction, Linux is a multi-user system. You should understand the fundamentals of Linux users and groups before ownership and permissions, because they are the entities that the ownership and permissions apply to. Let’s get started with what users are.
In Linux, there are two types of users: system users and regular users. Traditionally, system users are used to run non-interactive or background processes on a system, while regular users are used for logging in and running processes interactively. When you first initialize and log in to a Linux system, you may notice that it starts out with many system users already created to run the services that the OS depends on. This is normal.
You can view all of the users on a system by looking at the contents of the /etc/passwd file. Each line in this file contains information about a single user, starting with its username (the name before the first :). You can print the contents of the passwd file with cat:
cat /etc/passwd
Output…
sshd:x:109:65534::/run/sshd:/usr/sbin/nologin
landscape:x:110:115::/var/lib/landscape:/usr/sbin/nologin
pollinate:x:111:1::/var/cache/pollinate:/bin/false
systemd-coredump:x:999:999:systemd Core Dumper:/:/usr/sbin/nologin
lxd:x:998:100::/var/snap/lxd/common/lxd:/bin/false
vault:x:997:997::/home/vault:/bin/bash
stunnel4:x:112:119::/var/run/stunnel4:/usr/sbin/nologin
sammy:x:1001:1002::/home/sammy:/bin/sh
In addition to the two user types, there is the superuser, or root user, that has the ability to override any file ownership and permission restrictions. In practice, this means that the superuser has the rights to access anything on its own server. This user is used to make system-wide changes.
It is also possible to configure other user accounts with the ability to assume “superuser rights”. This is often referred to as having sudo, because users who have permissions to temporarily gain superuser rights do so by preceding admin-level commands with sudo. In fact, creating a normal user that has sudo privileges for system administration tasks is considered to be best practice. This way, you can be more conservative in your use of the root user account.
Groups are collections of zero or more users. A user belongs to a default group, and can also be a member of any of the other groups on a server.
You can view all the groups on the system and their members by looking in the /etc/group file, as you would with /etc/passwd for users. This article does not cover group management.
Now that you know what users and groups are, let’s talk about file ownership and permissions!
In Linux, every file is owned by a single user and a single group, and has its own access permissions. Let’s look at how to view the ownership and permissions of a file.
The most common way to view the permissions of a file is to use ls with the long listing option -l, e.g. ls -l myfile. If you want to view the permissions of all of the files in your current directory, run the command without the myfile argument, like this:
ls -l
Note: If you are in an empty home directory, and you haven’t created any files to view yet, you can follow along by listing the contents of the /etc directory by running this command: ls -l /etc
Here is an example screenshot of ls -l output, with labels of each column of output:
For each file, the mode (which contains its permissions), owner, group, and name are listed. To help explain what all of those letters and hyphens mean, let’s break down the mode column into its components.
To help explain what all the groupings and letters mean, here is a breakdown of the mode metadata of the first file in the above example:
In Linux, there are two types of files: normal and special. The file type is indicated by the first character of the mode of a file — in this guide, this will be referred to as the “file type field”.
Normal files can be identified by a hyphen (-) in their file type fields. Normal files can contain data or anything else. They are called normal, or regular, files to distinguish them from special files.
Special files can be identified by a non-hyphen character, such as a letter, in their file type fields, and are handled by the OS differently than normal files. The character that appears in the file type field indicates the kind of special file a particular file is. For example, a directory, which is the most common kind of special file, is identified by the d character that appears in its file type field (like in the previous screenshot). There are several other kinds of special files.
From the diagram, you can see that the mode column indicates the file type, followed by three triads, or classes, of permissions: user (owner), group, and other. The order of the classes is consistent across all Linux systems.
The three permissions classes work as follows: the user class applies to the owner of the file, the group class applies to the members of the file’s group, and the other class applies to any user that is not the file’s owner or a member of its group.
The next thing to pay attention to are those sets of three characters. They denote the permissions, in symbolic form, that each class has for a given file.
In each triad, read, write, and execute permissions are represented in the following way:
Read: r in the first position
Write: w in the second position
Execute: x in the third position. In some special cases, there may be a different character here
) in the place of one of these characters indicates that the respective permission is not available for the respective class. For example, if the group (second) triad for a file is r--
, the file is “read-only” to the group that is associated with the file.
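If you want to inspect these mode strings programmatically rather than reading ls -l output, Python’s standard library can reproduce them. The following is a small sketch; the file path is just an example:

import os
import stat

# Retrieve the metadata for a file (replace the path with any file on your system).
file_info = os.stat("/etc/passwd")

# stat.filemode() renders the numeric mode as the same symbolic string that
# appears in the first column of ls -l output, for example -rw-r--r--.
print(stat.filemode(file_info.st_mode))

# Individual permission bits can also be tested directly.
print(bool(file_info.st_mode & stat.S_IRGRP))  # does the group have read permission?
print(bool(file_info.st_mode & stat.S_IWOTH))  # do others have write permission?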
Now that you know how to read the permissions of a file, you should know what each of the permissions actually allow users to do. This tutorial will cover each permission individually, but keep in mind that they are often used in combination with each other to allow for useful access to files and directories.
Here is a breakdown of the access that the three permission types grant to users:
For a normal file, read permission allows a user to view the contents of the file.
For a directory, read permission allows a user to view the names of the files in the directory.
For a normal file, write permission allows a user to modify and delete the file.
For a directory, write permission allows a user to delete the directory, modify its contents (create, delete, and rename files in it), and modify the contents of files that the user has write permissions to.
For a normal file, execute permission allows a user to execute (run) a file — the user must also have read permission. Execute permissions must be set for executable programs and shell scripts before a user can run them.
For a directory, execute permission allows a user to access, or traverse into (i.e. cd into), the directory, and to access metadata about the files in it (the information that is listed in an ls -l).
Now that you know how to read the mode of a file, and understand the meaning of each permission, you will see a few examples of common modes, with brief explanations, to bring the concepts together.
-rw-------: A file that is only accessible by its owner
-rwxr-xr-x: A file that is executable by every user on the system. A “world-executable” file
-rw-rw-rw-: A file that is open to modification by every user on the system. A “world-writable” file
drwxr-xr-x: A directory that every user on the system can read and access
drwxrwx---: A directory that is modifiable (including its contents) by its owner and group
drwxr-x---: A directory that is accessible by its group
The owner of a file usually enjoys the most permissions, when compared to the other two classes. Typically, you will see that the group and other classes only have a subset of the owner’s permissions (equivalent or less). This makes sense because files should only be accessible to users who need them for a particular reason.
Another thing to note is that even though many permission combinations are possible, only certain ones make sense in most situations. For example, write or execute access is almost always accompanied by read access, since it’s hard to modify, and impossible to execute, something you can’t read.
You should now have a good understanding of how ownership and permissions work in Linux. To learn how to modify these permissions using chown, chgrp, and chmod, refer to Linux Permissions Basics and How to Use Umask on a VPS.
If you would like to learn more about Linux fundamentals, read the next tutorial in this series, An Introduction to Linux I/O Redirection.
Machine learning is a subfield of artificial intelligence (AI). The goal of machine learning generally is to understand the structure of data and fit that data into models that can be understood and utilized by people.
Although machine learning is a field within computer science, it differs from traditional computational approaches. In traditional computing, algorithms are sets of explicitly programmed instructions used by computers to calculate or problem solve. Machine learning algorithms instead allow for computers to train on data inputs and use statistical analysis in order to output values that fall within a specific range. Because of this, machine learning facilitates computers in building models from sample data in order to automate decision-making processes based on data inputs.
Any technology user today has benefitted from machine learning. Facial recognition technology allows social media platforms to help users tag and share photos of friends. Optical character recognition (OCR) technology converts images of text into movable type. Recommendation engines, powered by machine learning, suggest what movies or television shows to watch next based on user preferences. Self-driving cars that rely on machine learning to navigate may soon be available to consumers.
Machine learning is a continuously developing field. Because of this, there are some considerations to keep in mind as you work with machine learning methodologies, or analyze the impact of machine learning processes.
In this tutorial, we’ll look into the common machine learning methods of supervised and unsupervised learning, and common algorithmic approaches in machine learning, including the k-nearest neighbor algorithm, decision tree learning, and deep learning. We’ll explore which programming languages are most used in machine learning, providing you with some of the positive and negative attributes of each. Additionally, we’ll discuss biases that are perpetuated by machine learning algorithms, and consider what can be kept in mind to prevent these biases when building algorithms.
In machine learning, tasks are generally classified into broad categories. These categories are based on how learning is received or how feedback on the learning is given to the system developed.
Two of the most widely adopted machine learning methods are supervised learning, which trains algorithms based on example input and output data that is labeled by humans, and unsupervised learning, which provides the algorithm with no labeled data in order to allow it to find structure within its input data. Let’s explore these methods in more detail.
In supervised learning, the computer is provided with example inputs that are labeled with their desired outputs. The purpose of this method is for the algorithm to be able to “learn” by comparing its actual output with the “taught” outputs to find errors, and modify the model accordingly. Supervised learning therefore uses patterns to predict label values on additional unlabeled data.
For example, with supervised learning, an algorithm may be fed data with images of sharks labeled as fish and images of oceans labeled as water. By being trained on this data, the supervised learning algorithm should be able to later identify unlabeled shark images as fish and unlabeled ocean images as water.
A common use case of supervised learning is to use historical data to predict statistically likely future events. It may use historical stock market information to anticipate upcoming fluctuations, or be employed to filter out spam emails. In supervised learning, tagged photos of dogs can be used as input data to classify untagged photos of dogs.
In unsupervised learning, data is unlabeled, so the learning algorithm is left to find commonalities among its input data. As unlabeled data are more abundant than labeled data, machine learning methods that facilitate unsupervised learning are particularly valuable.
The goal of unsupervised learning may be as straightforward as discovering hidden patterns within a dataset, but it may also have a goal of feature learning, which allows the computational machine to automatically discover the representations that are needed to classify raw data.
Unsupervised learning is commonly used for transactional data. You may have a large dataset of customers and their purchases, but as a human you will likely not be able to make sense of what similar attributes can be drawn from customer profiles and their types of purchases. With this data fed into an unsupervised learning algorithm, it may be determined that women of a certain age range who buy unscented soaps are likely to be pregnant, and therefore a marketing campaign related to pregnancy and baby products can be targeted to this audience in order to increase their number of purchases.
Without being told a “correct” answer, unsupervised learning methods can look at complex data that is more expansive and seemingly unrelated in order to organize it in potentially meaningful ways. Unsupervised learning is often used for anomaly detection including for fraudulent credit card purchases, and recommender systems that recommend what products to buy next. In unsupervised learning, untagged photos of dogs can be used as input data for the algorithm to find likenesses and classify dog photos together.
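As a toy illustration of unsupervised learning, the following sketch clusters a handful of made-up purchase records with the k-means algorithm from scikit-learn, an assumed dependency. The data, features, and number of clusters are invented for the example:

from sklearn.cluster import KMeans

# Unlabeled data: [purchases per month, average order value]
purchases = [
    [2, 15.0],
    [3, 18.5],
    [2, 14.0],
    [20, 310.0],
    [22, 290.0],
    [19, 305.0],
]

# Ask for two groups; the algorithm is never told what the groups mean.
model = KMeans(n_clusters=2, n_init=10, random_state=42)
labels = model.fit_predict(purchases)

print(labels)  # e.g. [0 0 0 1 1 1], occasional buyers vs. frequent, high spenders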
As a field, machine learning is closely related to computational statistics, so having a background knowledge in statistics is useful for understanding and leveraging machine learning algorithms.
For those who may not have studied statistics, it can be helpful to first define correlation and regression, as they are commonly used techniques for investigating the relationship among quantitative variables. Correlation is a measure of association between two variables that are not designated as either dependent or independent. Regression at a basic level is used to examine the relationship between one dependent and one independent variable. Because regression statistics can be used to anticipate the dependent variable when the independent variable is known, regression enables prediction capabilities.
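As a quick illustration of these two techniques, the following sketch uses NumPy, an assumed dependency, to compute a correlation coefficient and fit a one-variable linear regression on made-up data:

import numpy as np

hours_studied = np.array([1, 2, 3, 4, 5, 6])
exam_score = np.array([52, 57, 61, 68, 74, 79])

# Correlation: the strength of association between the two variables (close to 1 here).
correlation = np.corrcoef(hours_studied, exam_score)[0, 1]
print(round(correlation, 3))

# Regression: fit a line score = slope * hours + intercept, then use it to predict.
slope, intercept = np.polyfit(hours_studied, exam_score, 1)
print(round(slope * 7 + intercept, 1))  # predicted score after 7 hours of study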
Approaches to machine learning are continuously being developed. For our purposes, we’ll go through a few of the popular approaches that are being used in machine learning at the time of writing.
The k-nearest neighbor algorithm is a pattern recognition model that can be used for classification as well as regression. Often abbreviated as k-NN, the k in k-nearest neighbor is a positive integer, which is typically small. In either classification or regression, the input will consist of the k closest training examples within a space.
We will focus on k-NN classification. In this method, the output is class membership. This will assign a new object to the class most common among its k nearest neighbors. In the case of k = 1, the object is assigned to the class of the single nearest neighbor.
Let’s look at an example of k-nearest neighbor. In the diagram below, there are blue diamond objects and orange star objects. These belong to two separate classes: the diamond class and the star class.
When a new object is added to the space — in this case a green heart — we will want the machine learning algorithm to classify the heart to a certain class.
When we choose k = 3, the algorithm will find the three nearest neighbors of the green heart in order to classify it to either the diamond class or the star class.
In our diagram, the three nearest neighbors of the green heart are one diamond and two stars. Therefore, the algorithm will classify the heart with the star class.
Among the most basic of machine learning algorithms, k-nearest neighbor is considered to be a type of “lazy learning” as generalization beyond the training data does not occur until a query is made to the system.
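As a rough sketch of how this might look in code, the following example uses scikit-learn’s KNeighborsClassifier to reproduce the diamond/star scenario with made-up coordinates:
from sklearn.neighbors import KNeighborsClassifier

# Training data: points belonging to the "diamond" and "star" classes
points = [[1, 1], [1, 2], [2, 1], [6, 5], [7, 6], [6, 7]]
labels = ["diamond", "diamond", "diamond", "star", "star", "star"]

model = KNeighborsClassifier(n_neighbors=3)  # k = 3
model.fit(points, labels)

# Classify a new object (the "green heart") by its three nearest neighbors
print(model.predict([[5, 5]]))  # -> ['star']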
For general use, decision trees are employed to visually represent decisions and show or inform decision making. When working with machine learning and data mining, decision trees are used as a predictive model. These models map observations about data to conclusions about the data’s target value.
The goal of decision tree learning is to create a model that will predict the value of a target based on input variables.
In the predictive model, the data’s attributes that are determined through observation are represented by the branches, while the conclusions about the data’s target value are represented in the leaves.
When “learning” a tree, the source data is divided into subsets based on an attribute value test, and this partitioning is repeated recursively on each of the derived subsets. The recursion is complete once the subset at a node all shares the same target value, or once splitting no longer adds value to the predictions.
Let’s look at an example of various conditions that can determine whether or not someone should go fishing. This includes weather conditions as well as barometric pressure conditions.
In the simplified decision tree above, an example is classified by sorting it through the tree to the appropriate leaf node. This then returns the classification associated with that particular leaf, which in this case is either a Yes or a No. The tree classifies a day’s conditions based on whether or not it is suitable for going fishing.
A true classification tree data set would have a lot more features than what is outlined above, but relationships should be straightforward to determine. When working with decision tree learning, several determinations need to be made, including what features to choose, what conditions to use for splitting, and understanding when the decision tree has reached a clear ending.
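As an illustrative sketch, a classifier for the simplified fishing example above could be trained with scikit-learn’s DecisionTreeClassifier; the encoded weather observations below are invented:
from sklearn.tree import DecisionTreeClassifier

# Features: [is_sunny (0/1), pressure_is_high (0/1)]
conditions = [[1, 1], [1, 0], [0, 1], [0, 0]]
go_fishing = ["Yes", "No", "No", "No"]  # observed target value for each day

tree = DecisionTreeClassifier()
tree.fit(conditions, go_fishing)

# Classify a new day: sunny with high barometric pressure
print(tree.predict([[1, 1]]))  # -> ['Yes']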
Deep learning attempts to imitate how the human brain can process light and sound stimuli into vision and hearing. A deep learning architecture is inspired by biological neural networks and consists of an artificial neural network with multiple layers, typically trained on specialized hardware such as GPUs.
Deep learning uses a cascade of nonlinear processing unit layers in order to extract or transform features (or representations) of the data. The output of one layer serves as the input of the successive layer. In deep learning, algorithms can be either supervised and serve to classify data, or unsupervised and perform pattern analysis.
Among the machine learning algorithms that are currently being used and developed, deep learning absorbs the most data and has been able to beat humans in some cognitive tasks. Because of these attributes, deep learning has become an approach with significant potential in the artificial intelligence space.
Computer vision and speech recognition have both realized significant advances from deep learning approaches. IBM Watson is a well-known example of a system that leverages deep learning.
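As a highly simplified sketch of the layered structure described above, the following Keras example stacks several dense layers and trains them on random data; it assumes TensorFlow is installed, and the layer sizes are arbitrary:
import numpy as np
from tensorflow import keras

X = np.random.rand(100, 20)            # 100 samples with 20 features each
y = np.random.randint(0, 2, size=100)  # two classes

model = keras.Sequential([
    keras.Input(shape=(20,)),
    keras.layers.Dense(64, activation="relu"),   # the output of one layer feeds the next
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"), # classification output
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=5, verbose=0)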
When choosing a language to specialize in with machine learning, you may want to consider the skills listed on current job advertisements as well as libraries available in various languages that can be used for machine learning processes.
Python is one of the most popular languages for working with machine learning due to the many available frameworks, including TensorFlow, PyTorch, and Keras. With its readable syntax and its ability to be used as a scripting language, Python proves to be powerful and straightforward both for preprocessing data and working with data directly. The scikit-learn machine learning library is built on top of several existing Python packages that Python developers may already be familiar with, namely NumPy, SciPy, and Matplotlib.
To get started with Python, you can read our tutorial series on “How To Code in Python 3,” or read specifically on “How To Build a Machine Learning Classifier in Python with scikit-learn” or “How To Perform Neural Style Transfer with Python 3 and PyTorch.”
Java is widely used in enterprise programming, and is generally used by front-end desktop application developers who are also working on machine learning at the enterprise level. Usually it is not the first choice for those new to programming who want to learn about machine learning, but is favored by those with a background in Java development to apply to machine learning. In terms of machine learning applications in industry, Java tends to be used more than Python for network security, including in cyber attack and fraud detection use cases.
Among machine learning libraries for Java are Deeplearning4j, an open-source and distributed deep-learning library written for both Java and Scala; MALLET (MAchine Learning for LanguagE Toolkit) allows for machine learning applications on text, including natural language processing, topic modeling, document classification, and clustering; and Weka, a collection of machine learning algorithms to use for data mining tasks.
C++ is the language of choice for machine learning and artificial intelligence in game or robot applications (including robot locomotion). Embedded computing hardware developers and electronics engineers are more likely to favor C++ or C in machine learning applications due to their proficiency in those languages and the level of control they offer. Some machine learning libraries you can use with C++ include the scalable mlpack, Dlib with its wide range of machine learning algorithms, and the modular and open-source Shark.
Although data and computational analysis may make us think that we are receiving objective information, this is not the case; being based on data does not mean that machine learning outputs are neutral. Human bias plays a role in how data is collected, organized, and ultimately in the algorithms that determine how machine learning will interact with that data.
If, for example, the people providing images of “fish” to train an algorithm overwhelmingly select images of goldfish, a computer may not classify a shark as a fish. This would create a bias against sharks, and sharks would not be counted as fish.
When using historical photographs of scientists as training data, a computer may not properly classify scientists who are also people of color or women. In fact, recent peer-reviewed research has indicated that AI and machine learning programs exhibit human-like biases that include race and gender prejudices. See, for example “Semantics derived automatically from language corpora contain human-like biases” and “Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints” [PDF].
As machine learning is increasingly leveraged in business, uncaught biases can perpetuate systemic issues that may prevent people from qualifying for loans, from being shown ads for high-paying job opportunities, or from receiving same-day delivery options.
Because human bias can negatively impact others, it is extremely important to be aware of it, and to also work towards eliminating it as much as possible. One way to work towards achieving this is by ensuring that there are diverse people working on a project and that diverse people are testing and reviewing it. Others have called for regulatory third parties to monitor and audit algorithms, building alternative systems that can detect biases, and ethics reviews as part of data science project planning. Raising awareness about biases, being mindful of our own unconscious biases, and structuring equity in our machine learning projects and pipelines can work to combat bias in this field.
This tutorial reviewed some of the use cases of machine learning, common methods and popular approaches used in the field, suitable machine learning programming languages, and also covered some things to keep in mind in terms of unconscious biases being replicated in algorithms.
Because machine learning is a field that is continuously being innovated, it is important to keep in mind that algorithms, methods, and approaches will continue to change.
In addition to reading our tutorials on “How To Build a Machine Learning Classifier in Python with scikit-learn” or “How To Perform Neural Style Transfer with Python 3 and PyTorch,” you can learn more about working with data in the technology industry by reading our Data Analysis tutorials.
Serverless architecture allows backend web services to be implemented on an as-needed basis. Rather than having to maintain your own server configuration, architecting your software for serverless providers can minimize the overhead involved. Serverless applications are typically deployed from a Git repository into an environment that can scale up or down as needed.
Serverless deployments usually involve microservices. Using microservices is an approach to software architecture that structures an application as a collection of services that are loosely coupled, independently deployable, and independently maintainable and testable. Microservice architectures predate the widespread use of serverless deployments, but they are a natural fit together. Microservices can be used in any context that allows them to be deployed independently and managed by a central process or job server. Serverless implementations abstract away this central process management, leaving you to focus on your application logic.
This tutorial will review some best practices for rearchitecting monolithic applications to use microservices.
Rearchitecting, or refactoring, a monolithic application is often invisible by design. If you plan to significantly rewrite your application logic while introducing no new features, your goal should be to avoid service disruptions to the greatest extent possible. This can entail using some form of blue-green deployment. When implementing microservices, it usually also entails replacing your application’s functionality on a step-by-step basis. This requires you to thoroughly implement unit tests to ensure that your application gracefully handles any unexpected edge cases. It also provides many opportunities to review your application logic and evaluate how to replace existing features with distinct microservices.
Microservices are equally well-supported by almost all major programming languages, and adopting a microservice-driven architecture can facilitate combining multiple different languages or frameworks within the same project. This allows you to adopt the best possible solution for each component of your stack, but can also change the way that you think about code maintenance.
Some architectures are a more natural fit for microservices than others. If your application logic contains multiple sequential steps that all depend on one another, it may not be a good idea to abstract each of them into individual microservices. In that case, you would need a sophisticated controller architecture that could handle and route any mid-stage errors. This is possible with a microservice architecture that uses a framework like Gearman to dispatch subprocesses, but it is more inconvenient when working with serverless deployments and can add complexity without necessarily solving problems.
Instead of delineating microservices between stages of the same input processing pipeline, you could delineate microservices between application state changes, or every time some output is returned to a user. This way, you do not need to pass the same data between public API calls as part of a single process. Handling your application state can be challenging with a microservice architecture, because each microservice will only have access to its own input, rather than to a globally defined scope. Wherever possible, you should create and pass similar data structures to each of your microservices, so that you can make reliable assumptions about the scope available to each of them.
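One way to do this, sketched below, is to define a single input “envelope” that every microservice receives; the field names and URL are hypothetical:
from dataclasses import dataclass, asdict

import requests

@dataclass
class ServiceInput:
    request_id: str
    user_id: str
    payload: dict

def call_service(url, data):
    # Every microservice receives the same envelope, regardless of its role
    return requests.post(url, json=asdict(data))

response = call_service("https://example.com/my/service/v1",
                        ServiceInput(request_id="abc123", user_id="42", payload={"query": "sharks"}))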
Consider creating and maintaining your own application libraries for core logic and functions that are likely to be used in multiple places, and then create microservices which join together unique combinations of this logic. Remember that microservices can scale to zero: there is no penalty from maintaining unused code paths. This way, you can create microservices which do not directly depend on other microservices, because they each include a complete, linear set of application logic, composed of function calls which you maintain in a separate repository.
When working with microservices, you should employ the principles of GitOps as much as possible. Treat Git repositories as a single source of truth for deployment purposes. Most language-specific package managers, such as pip
for Python and npm
for Node.js, provide syntax to deploy packages from your own Git repositories. This can be used in addition to the default functionality of installing from PyPI
, npmjs.com
or other upstream repositories. This way, you can gracefully combine your own in-development functions with third-party libraries without deviating from best practices around maintainability or reproducibility.
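For example, a Python microservice could pin a hypothetical in-house library to a Git tag alongside ordinary PyPI dependencies in its requirements.txt; the repository URL below is a placeholder:
# requirements.txt
requests==2.31.0
my-internal-lib @ git+https://github.com/example-org/my-internal-lib.git@v1.2.0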
Each of your microservices can implement its own API, and depending on the complexity of your application, you can implement another API layer on top of that (and so on, and so on), and plan to only expose the highest-level API to your users. Although maintaining multiple different API routes can add complexity, this complexity can be resolved through good documentation of each of your individual microservices’ API endpoints. Communicating between processes using well-defined API calls, such as HTTP GET
and POST
, adds virtually no overhead and will make your microservices much more reusable than if they used more idiosyncratic interprocess communication.
Adopting microservices may naturally push you toward also adopting more Software-as-a-Service (SaaS) tooling as a drop-in replacement for various parts of your application stack. This is almost always good in principle. While you are under no obligation to replace your own function calls with third-party services, retaining the option to do so will keep your application logic more flexible and more contemporary.
Effectively migrating to microservices requires you to synthesize a number of best practices around software development and deployment.
When rearchitecting an application to use microservices, you should follow the best practices for Continuous Integration and Continuous Delivery to incrementally replace features of your monolithic architecture. For example, you can use branching by abstraction — building an abstraction layer within an existing implementation so that a new implementation can be built out behind the abstraction in parallel — to refactor production code without any disruption to users. You can also use decorators, a language feature of TypeScript and Python, to add more code paths to existing functions. This way, you can progressively toggle or roll back functionality.
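As a sketch of the decorator approach, the following example routes calls to a new implementation only when a feature flag is set, so the change can be toggled or rolled back without touching callers; the flag and function names are hypothetical:
import os

def prefer_new_implementation(new_impl):
    """Route calls to new_impl when the feature flag is set; otherwise keep the old path."""
    def decorator(old_impl):
        def wrapper(*args, **kwargs):
            if os.environ.get("USE_NEW_SEARCH") == "true":
                return new_impl(*args, **kwargs)
            return old_impl(*args, **kwargs)
        return wrapper
    return decorator

def search_v2(query):
    return "results for " + query + " from the new microservice"

@prefer_new_implementation(search_v2)
def search(query):
    return "results for " + query + " from the legacy code path"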
Microservices have become popular at the same time as containerization frameworks like Docker for good reason. They have similar goals and architectural assumptions:
Containers provide process and dependency isolation so that they can be deployed on an individual basis.
Containers allow other applications running in tandem with them to function as a “black box” — they don’t need to share state or any information other than input and output.
Container registries, such as Docker Hub, make it possible to publish and use your own dependencies interchangeably with third-party dependencies.
In theory, your microservices should be equally suited to running in a Docker container or a Kubernetes cluster as they are in a serverless deployment. In practice, there may be significant advantages to one or the other. Highly CPU-intensive microservices such as video processing may not be economical in serverless environments, whereas maintaining a Kubernetes control plane and configuration details requires a significant commitment. However, building with portability in mind is always a worthwhile investment. Depending on the complexity of your architecture, you may be able to support multiple environments merely by creating the relevant .yml metadata declarations and Dockerfiles. Prototyping for both Kubernetes and serverless environments can improve the overall resilience of your architecture.
Generally speaking, you should not need to worry about database concurrency or other storage scaling issues inside of microservices themselves. Any relevant optimizations should be addressed and implemented directly by your database, your database abstraction layer, or your Database-as-a-Service (DBaaS) provider, so that your microservices can perform any create-read-update-delete (CRUD) operations without embellishment. Microservices must be able to concurrently query and update the same data sources, and your database backend should support these assumptions.
When making breaking, non-backwards-compatible updates to your microservices, you should provide new endpoints. For example, you might provide a /my/service/v2 in addition to a preexisting /my/service/v1, and plan to gradually deprecate the /v1 endpoint. This is important because production microservices are likely to become useful and supported outside of their originally intended context. For this reason, many serverless providers will automatically version your URL endpoints to /v1 when deploying new functions.
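A sketch of what this can look like in a small Flask service is shown below; the route names and response shapes are examples only:
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/my/service/v1/greet/<name>", methods=["GET"])
def greet_v1(name):
    # Original response shape, kept until /v1 is formally deprecated
    return jsonify({"greeting": "Hello " + name})

@app.route("/my/service/v2/greet/<name>", methods=["GET"])
def greet_v2(name):
    # Breaking change: the response is nested differently in v2
    return jsonify({"data": {"name": name, "greeting": "Hello " + name}})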
Implementing microservices in your application can replace nested function calls or private methods by promoting them to their own standalone service. Take this example of a Flask application, which performs a Google query based on a user’s input into a web form, then manipulates the result before returning it back to the user:
from flask import Flask, render_template, Markup
from googleapiclient.discovery import build
from waitress import serve  # serve() below is assumed to come from the waitress WSGI server
from config import *

app = Flask(__name__)

def google_query(query, api_key, cse_id, **kwargs):
    query_service = build("customsearch", "v1", developerKey=api_key)
    query_results = query_service.cse().list(q=query, cx=cse_id, **kwargs).execute()
    return query_results['items']

def manipulate_result(input, cli=False):
    search_results = google_query(input, keys["api_key"], keys["cse_id"])
    for result in search_results:
        abc(result)
        …
    return manipulated_text

@app.route('/<string:text>', methods=["GET"])
def get_url(text):
    manipulated_text = manipulate_result(text)
    return render_template('index.html', prefill=text, value=Markup(manipulated_text))

if __name__ == "__main__":
    serve(app, host='0.0.0.0', port=5000)
This application provides its own web endpoint, which includes an HTTP GET
method. Providing a text string to that endpoint calls a function called manipulate_result()
, which first sends the text to another function google_query()
, then manipulates the text from the query results before returning it to the user.
This application could be refactored into two separate microservices, both of which take HTTP GET
parameters as input arguments. The first would return Google query results based on some input, using the googleapiclient
Python library:
from googleapiclient.discovery import build
from config import *  # expected to provide keys["api_key"] and keys["cse_id"]

def main(input_text):
    query_service = build("customsearch", "v1", developerKey=keys["api_key"])
    query_results = query_service.cse().list(q=input_text, cx=keys["cse_id"]).execute()
    return query_results['items']
A second microservice would then manipulate and extract the relevant data to be returned to the user from those search results:
import requests

def main(search_string, standalone=True):
    if standalone == False:
        search_results = requests.get('https://path/to/microservice_1/v1/' + search_string).text
    else:
        search_results = search_string
    for result in search_results:
        abc(result)
        …
    return manipulated_text
In this example, microservice_2.py performs all of the input handling, and calls microservice_1.py directly via an HTTP GET request if an additional argument, standalone=False, has been provided. You could optionally create a separate, third function to join both microservices together, if you preferred to keep them entirely separate but still provide their full functionality with a single API call.
This is a straightforward example, and the original Flask code does not appear to present a significant maintenance burden, but there are still advantages to being able to remove Flask from your stack. If you no longer need to run your own web request handler, you could then return these results to a static site, using a Jamstack environment, rather than needing to maintain a Flask backend.
In this tutorial, you reviewed some best practices for migrating monolithic applications to microservices, and followed a brief example for decomposing a Flask application into two separate microservice endpoints.
Next, you may want to learn more about efficient monitoring of microservice architectures to better understand the optimization of serverless deployments. You may also want to understand how to write a serverless function.
Serverless architecture allows backend web services to be implemented on an as-needed basis. Rather than needing to maintain your own server configuration, architecting your software for serverless providers can minimize the overhead involved. Serverless applications are typically deployed from a Git repository into an environment that can scale up or down as needed.
This means that serverless functions can effectively “scale to zero” – a function or endpoint should consume no resources at all as long as it is not being accessed. However, this also means that serverless functions must be well-behaved, and should become available from an idle state only to provide individual responses to input requests. These responses can be as computationally intensive as needed, but must be invoked and terminated in a predictable manner.
This tutorial will cover some best practices for writing an example serverless function.
To follow this tutorial, you will need:
A local shell environment with a serverless deployment tool installed. Some serverless platforms make use of the serverless
command, while this tutorial will reflect DigitalOcean’s doctl sandbox
tools. Both provide similar functionality. To install and configure doctl
, refer to its documentation.
The version control tool Git available in your development environment. If you are working in Ubuntu, you can refer to installing Git on Ubuntu 20.04.
A complete serverless application can be contained in only two files at a minimum — the configuration file, usually using .yml syntax, which declares necessary metadata for your application to the serverless provider, and a file containing the code itself, e.g. my_app.py, my_app.js, or my_app.go. If your application has any language dependencies, it will typically also declare them using standard language conventions, such as a package.json file for Node.js.
To initialize a serverless application, you can use doctl sandbox init
with the name of a new directory:
- doctl sandbox init myServerlessProject
OutputA local sandbox area 'myServerlessProject' was created for you.
You may deploy it by running the command shown on the next line:
doctl sandbox deploy myServerlessProject
By default, this will create a project with the following directory structure:
myServerlessProject/
├── packages
│ └── sample
│ └── hello
│ └── hello.js
└── project.yml
project.yml
is contained in the top-level directory. It declares metadata for hello.js
, which contains a single function. All serverless applications will follow this same essential structure. You can find more examples, using other serverless frameworks, at the official Serverless Framework GitHub repository, or refer to DigitalOcean’s documentation. You can also create these directory structures from scratch without relying on an init
function, but note that the requirements of each serverless provider will differ slightly.
In the next step, you’ll explore the sample project you initialized in greater detail.
A serverless application can be a single function, written in a language that is interpreted by your serverless computing provider (usually Go, Python, and JavaScript), as long as it can return
some output. Your function can call other functions or load other language libraries, but there should always be a single main function defined in your project configuration file that communicates with the endpoint itself.
Running doctl sandbox init
in the last step automatically generated a sample project for your serverless application, including a file called hello.js
. You can open that file using nano
or your favorite text editor:
- nano myServerlessProject/packages/sample/hello/hello.js
function main(args) {
    let name = args.name || 'stranger'
    let greeting = 'Hello ' + name + '!'
    console.log(greeting)
    return {"body": greeting}
}
This file contains a single function, called main()
, which can accept a set of arguments. This is the default way that serverless architectures manage input handling. Serverless functions do not necessarily need to directly parse JSON or HTTP headers to handle input. On most providers’ platforms, serverless functions will receive input from HTTP requests as a list of arguments that can be unpacked using standard language features.
The first line of the function uses JavaScript’s ||
OR operator to parse a name
argument if it is present, or use the string stranger
if the function is called without any arguments. This is important in the event that your function’s endpoint is queried incorrectly, or with missing data. Serverless functions should always have a code path that allows you to quickly return null
, or return the equivalent of null
in a well-formed HTTP response, with a minimum of additional processing. The next line, let greeting =
, performs some additional string manipulation.
Depending on your serverless provider, you may not have any filesystem or OS-level features available to your function. Serverless applications are not necessarily stateless. However, features that allow serverless applications to record or retain their state between runs are typically proprietary to each provider. The most common exception to this is the ability to log output from your functions. The sample hello.js
app contains a console.log()
function, which uses a built-in feature of JavaScript to output some additional data to a browser console or a local terminal’s stdout
without returning it to the user. Most serverless providers will allow you to retain and review logging output in this way.
The final line of the function is used to return
output from your function. Because most serverless functions are deployed as HTTP endpoints, you will usually want to return an HTTP response. Your serverless environment may automatically scaffold this response for you. In this case, it is only necessary to return a request body
within an object, and the endpoint configuration takes care of the rest.
This function could perform many more steps, as long as it maintained the same baseline expectations around input and output. Alternatively, your application could run multiple serverless functions in a sequence, and they could be swapped out as needed. Serverless functions can be thought of as being similar to microservice-driven architectures: both enable you to construct an application out of multiple loosely-coupled services which are not necessarily dependent on one another, and communicate over established protocols such as HTTP. Not all microservice architectures are serverless, but most serverless architectures implement microservices.
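For comparison, a rough Python equivalent of the sample function might look like the following, assuming a provider that passes request parameters to main() as a dictionary:
def main(args):
    # args holds the request parameters unpacked by the serverless platform
    name = args.get("name", "stranger")
    greeting = "Hello " + name + "!"
    print(greeting)  # most providers capture stdout as retrievable log output
    return {"body": greeting}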
Now that you understand the application architecture, in the next step, you’ll learn some best practices around preparing serverless functions for deployment and deploying serverless functions.
The doctl sandbox
command line tools allow you to deploy and test your application without promoting them to production, and other serverless implementations provide similar functionality. However, nearly all serverless deployment workflows will eventually involve you committing your application to a source control repository such as GitHub, and connecting the GitHub repository to your serverless provider.
When you are ready for a production deployment, you should be able to visit your serverless provider’s console and identify your source repository as a component of an application. Your application may also have other components, such as a static site, or it may just provide the one endpoint.
For now, you can deploy directly to a testing sandbox using doctl sandbox
:
- doctl sandbox deploy myServerlessProject
This will return information about your deployment, including another command that you can run to request your live testing URL:
OutputDeployed '~/Desktop/myServerlessProject'
to namespace 'f8572f2a-swev6f2t3bs'
on host 'https://faas-nyc1-78edc.doserverless.io'
Deployment status recorded in 'myServerlessProject\.nimbella'
Deployed functions ('doctl sbx fn get <funcName> --url' for URL):
- sample/hello
Running this command will return your serverless function’s current endpoint:
- doctl sbx fn get sample/hello --url
Outputhttps://faas-nyc1-78edc.doserverless.io/api/v1/web/f8572f2a-swev6f2t3bs/sample/hello
The paths returned will be automatically generated, but should end in /sample/hello
, based on your function names.
Note: You can review the doctl sandbox
deployment functionality at its source repository.
After deploying in testing or production, you can use cURL to send HTTP requests to your endpoint. For the sample/hello
app developed in this tutorial, you should be able to send a curl
request to your /sample/hello
endpoint:
- curl https://faas-nyc1-78edc.doserverless.io/api/v1/web/f8572f2a-swev6f2t3bs/sample/hello
Output will be returned as the body
of a standard HTTP request:
Output"Hello stranger!"
You can also provide the name
argument to your function as outlined above, by encoding it as an additional URL parameter:
- curl "https://faas-nyc1-78edc.doserverless.io/api/v1/web/f8572f2a-swev6f2t3bs/sample/hello?name=sammy"
Output"Hello sammy!"
After testing and confirming that your application returns the expected responses, you should ensure that sending unexpected input to your endpoint causes it to fail gracefully. You can review best practices around error handling to ensure that input is parsed correctly, but it’s most important to ensure that your application never hangs unexpectedly, as this can cause availability issues for serverless apps, as well as unexpected per-use billing.
Finally, you’ll want to commit your application to GitHub or another source code repository for going to production. If you choose to use Git or GitHub, you can refer to how to use Git effectively for an introduction to working with Git repositories.
After connecting your source code repository to your serverless provider, you will be able to take additional steps to restrict access to your function’s endpoints, or to associate it together with other serverless functions as part of a larger, tagged app.
In this tutorial, you initialized, reviewed, and deployed a sample serverless function. Although each serverless computing platform is essentially proprietary, the various providers follow very similar architectural principles, and the principles in this tutorial are broadly applicable. Like any other web stack, serverless architectures can vary considerably in scale, but ensuring that individual components are self-contained helps keep your whole stack more maintainable.
Next, you may want to learn more about efficient monitoring of microservice architectures to better understand the optimization of serverless deployments. You may also want to learn about some other potential serverless architectures, such as the Jamstack environment.
Most of the time, your main focus will be on getting your cloud applications up and running. As part of your setup and deployment process, it is important to build in robust and thorough security measures for your systems and applications before they are publicly available. Implementing the security measures in this tutorial before deploying your applications will ensure that any software that you run on your infrastructure has a secure base configuration, as opposed to ad-hoc measures that may be implemented post-deploy.
This guide highlights some practical security measures that you can take while you are configuring and setting up your server infrastructure. This list is not an exhaustive list of everything that you can do to secure your servers, but this offers you a starting point that you can build upon. Over time you can develop a more tailored security approach that suits the specific needs of your environments and applications.
SSH, or secure shell, is an encrypted protocol used to administer and communicate with servers. When working with a server, you’ll probably spend most of your time in a terminal session connected to your server through SSH. As an alternative to password-based logins, SSH keys use encryption to provide a secure way of logging into your server and are recommended for all users.
With SSH keys, a private and public key pair are created for the purpose of authentication. The private key is kept secret and secure by the user, while the public key can be shared. This is commonly referred to as asymmetric encryption, a pattern you may see elsewhere.
To configure SSH key authentication, you need to put your public SSH key on the server in the expected location (usually ~/.ssh/authorized_keys
). To learn more about how SSH-key-based authentication works, read Understanding the SSH Encryption and Connection Process.
With SSH, any kind of authentication — including password authentication — is completely encrypted. However, when password-based logins are allowed, malicious users can repeatedly, automatically attempt to access a server, especially if it has a public-facing IP address. Although there are ways of locking out access after multiple failed attempts from the same IP, and malicious users will be limited in practice by how rapidly they can attempt to log in to your server, any circumstance in which a user can plausibly attempt to gain access to your stack by repeated brute force attacks will pose a security risk.
Setting up SSH key authentication allows you to disable password-based authentication. SSH keys contain many more bits of data than a typical password, which means there are far more possible combinations for an attacker to run through, making them much more challenging to brute-force. Some older key algorithms are nevertheless considered crackable given enough computing power, but others, including the default RSA keys generated by modern SSH clients, are not yet plausible to crack.
SSH keys are the recommended way to log into any Linux server environment remotely. A pair of SSH keys can be generated on your local machine using the ssh-keygen command, and you can then transfer the public key to a remote server, for example with the ssh-copy-id utility.
To set up SSH keys on your server, you can follow How To Set Up SSH Keys for Ubuntu, Debian, or CentOS.
For any parts of your stack that require password access, or which are prone to brute force attacks, you can implement a solution like fail2ban on your servers to limit password guesses.
It is a best practice to not allow the root user to log in directly over SSH. Instead, log in as an unprivileged user and then escalate privileges as needed using a tool like sudo. This approach to limiting permissions is known as the principle of least privilege. Once you have connected to your server and created an unprivileged account that you have verified works with SSH, you can disable root logins by setting the PermitRootLogin no directive in /etc/ssh/sshd_config on your server and then restarting the server’s SSH process with a command like sudo systemctl restart sshd.
A firewall is a software or hardware device that controls how services are exposed to the network, and what types of traffic are allowed in and out of a given server or servers. A properly configured firewall will ensure that only services that should be publicly available can be reached from outside your servers or network.
On a typical server, a number of services may be running by default. These can be categorized into the following groups:
Public services that anyone on the internet can reach, often anonymously, such as a web server.
Private services that should only be accessed by a select group of authorized accounts or from certain locations, such as a database control panel.
Internal services that should only be accessible from within the server itself, without being exposed to the outside world.
Firewalls can ensure that access to your software is restricted according to the categories above with varying degrees of granularity. Public services can be left open and available to the internet, and private services can be restricted based on different criteria, such as connection types. Internal services can be made completely inaccessible to the internet. For ports that are not being used, access is blocked entirely in most configurations.
Even if your services implement security features or are restricted to the interfaces you’d like them to run on, a firewall serves as a base layer of protection by limiting connections to and from your services before traffic is handled by an application.
A properly configured firewall will restrict access to everything except the specific services you need to remain open, usually by opening only the ports associated with those services. For example, SSH generally runs on port 22, and HTTP/HTTPS access via a web browser usually run on ports 80 and 443 respectively. Exposing only a few pieces of software reduces the attack surface of your server, limiting the components that are vulnerable to exploitation.
There are many firewalls available for Linux systems, and some are more complex than others. In general, you should only need to make changes to your firewall configuration when you make changes to the services running on your server. Here are some options to get up and running:
UFW, or Uncomplicated Firewall, is installed by default on some Linux distributions like Ubuntu. You can learn more about it in How To Set Up a Firewall with UFW on Ubuntu 20.04
If you are using Red Hat, Rocky, or Fedora Linux, you can read How To Set Up a Firewall Using firewalld to use their default tooling.
Many software firewalls such as UFW and firewalld will write their configured rules directly to a file called iptables
. To learn how to work with the iptables
configuration directly, you can review Iptables Essentials: Common Firewall Rules and Commands
. Note that some other software that implements port rules on its own, such as Docker, will also write directly to iptables
, and may conflict with the rules you create with UFW, so it’s helpful to know how to read an iptables
configuration in cases like this.
Note: Many hosting providers, including DigitalOcean, will allow you to configure a firewall as a service which runs as an external layer over your cloud server(s), rather than needing to implement the firewall directly. These configurations, which are implemented at the network edge using managed tools, are often less complex in practice, but can be more challenging to script and replicate. You can refer to the documentation for DigitalOcean’s cloud firewall.
Be sure that your firewall configuration defaults to blocking unknown traffic. That way any new services that you deploy will not be inadvertently exposed to the Internet. Instead, you will have to allow access explicitly, which will force you to evaluate how a service is run, accessed, and who should be able to use it.
Virtual Private Cloud (VPC) networks are private networks for your infrastructure’s resources. VPC networks provide a more secure connection among resources because the network’s interfaces are inaccessible from the public internet.
Some hosting providers will, by default, assign your cloud servers one public network interface and one private network interface. Disabling your public network interface on parts of your infrastructure will only allow these instances to connect to each other using their private network interfaces over an internal network, which means that the traffic among your systems will not be routed through the public internet where it could be exposed or intercepted.
By conditionally exposing only a few dedicated internet gateways, also known as ingress gateways, as the sole point of access between your VPC network’s resources and the public internet, you will have more control and visibility into the public traffic connecting to your resources. Modern container orchestration systems like Kubernetes have a very well-defined concept of ingress gateways, because they create many private network interfaces by default, which need to be exposed selectively.
Many cloud infrastructure providers enable you to create and add resources to a VPC network inside their data centers.
Note: If you are using DigitalOcean and would like to set up your own VPC gateway, you can follow How to Configure a Droplet as a VPC Gateway guide to learn how on Debian, Ubuntu, and CentOS-based servers.
Manually configuring your own private network can require advanced server configurations and networking knowledge. An alternative to setting up a VPC network is to use a VPN connection between your servers.
A VPN, or virtual private network, is a way to create secure connections between remote computers and present the connection as if it were a local private network. This provides a way to configure your services as if they were on a private network and connect remote servers over secure connections.
For example, DigitalOcean private networks enable isolated communication between servers in the same account or team within the same region.
Using a VPN is a way to map out a private network that only your servers can see. Communication will be fully private and secure. Other applications can be configured to pass their traffic over the virtual interface that the VPN software exposes. This way, only services that are meant to be used by clients on the public internet need to be exposed on the public network.
Using private networks usually requires you to make decisions about your network interfaces when first deploying your servers, and configuring your applications and firewall to prefer these interfaces. By comparison, deploying VPNs requires installing additional tools and creating additional network routes, but can typically be deployed on top of existing architecture. Each server on a VPN must have the shared security and configuration data needed to establish a VPN connection. After a VPN is up and running, applications must be configured to use the VPN tunnel.
If you are using Ubuntu or CentOS, you can follow How To Set Up and Configure an OpenVPN Server on Ubuntu 20.04 tutorial.
Wireguard is another popular VPN deployment. Generally, VPNs follow the same principle of limiting ingress to your cloud servers by implementing a series of private network interfaces behind a few entry points, but where VPC configurations are usually a core infrastructure consideration, VPNs can be deployed on a more ad-hoc basis.
Good security involves analyzing your systems, understanding the available attack surfaces, and locking down the components as best as you can.
Service auditing is a way of knowing what services are running on a given system, which ports they are using for communication, and which protocols those services are speaking. This information can help you configure which services should be publicly accessible, firewall settings, monitoring, and alerting.
Each running service, whether it is intended to be internal or public, represents an expanded attack surface for malicious users. The more services that you have running, the greater the chance of a vulnerability affecting your software.
Once you have a good idea of what network services are running on your machine, you can begin to analyze these services. When you perform a service audit, ask yourself the following questions about each running service:
Should this service be running at all?
Is the service listening on interfaces that it does not need to listen on?
Is access to the service restricted appropriately by your firewall rules?
Who should be able to use this service, and how is it accessed?
This type of service audit should be standard practice when configuring any new server in your infrastructure. Performing service audits every few months will also help you catch any services with configurations that may have changed unintentionally.
To audit network services that are running on your system, use the ss
command to list all the TCP and UDP ports that are in use on a server. An example command that shows the program name, PID, and addresses being used for listening for TCP and UDP traffic is:
- sudo ss -plunt
The p, l, u, n, and t options work as follows:
p shows the specific process using a given socket.
l shows only sockets that are actively listening for connections.
u includes UDP sockets (in addition to TCP sockets).
n shows numerical traffic values.
t includes TCP sockets (in addition to UDP sockets).
You will receive output similar to this:
OutputNetid State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
tcp LISTEN 0 128 0.0.0.0:22 0.0.0.0:* users:(("sshd",pid=812,fd=3))
tcp LISTEN 0 511 0.0.0.0:80 0.0.0.0:* users:(("nginx",pid=69226,fd=6),("nginx",pid=69225,fd=6))
tcp LISTEN 0 128 [::]:22 [::]:* users:(("sshd",pid=812,fd=4))
tcp LISTEN 0 511 [::]:80 [::]:* users:(("nginx",pid=69226,fd=7),("nginx",pid=69225,fd=7))
The main columns that need your attention are the Netid, Local Address:Port, and Process name columns. If the Local Address:Port is 0.0.0.0
, then the service is accepting connections on all IPv4 network interfaces. If the address is [::]
then the service is accepting connections on all IPv6 interfaces. In the example output above, SSH and Nginx are both listening on all public interfaces, on both IPv4 and IPv6 networking stacks.
You could decide if you want to allow SSH and Nginx to listen on both interfaces, or only on one or the other. Generally, you should disable services that are running on unused interfaces.
Keeping your servers up to date with patches is necessary to ensure a good base level of security. Servers that run out of date and insecure versions of software are responsible for a majority of security incidents, but regular updates can mitigate vulnerabilities and prevent attackers from gaining a foothold on your servers. Unattended updates allow the system to update a majority of packages automatically.
Implementing unattended, i.e. automatic, updates lowers the level of effort required to keep your servers secure and shortens the amount of time that your servers may be vulnerable to known bugs. In the event of a vulnerability that affects software on your servers, your servers will be vulnerable for however long it takes for you to run updates. Daily unattended upgrades will ensure that you don’t miss any packages, and that any vulnerable software is patched as soon as fixes are available.
You can refer to How to Keep Ubuntu Servers Updated for an overview of implementing unattended updates on Ubuntu.
Public key infrastructure, or PKI, refers to a system that is designed to create, manage, and validate certificates for identifying individuals and encrypting communication. SSL or TLS certificates can be used to authenticate different entities to one another. After authentication, they can also be used to establish encrypted communication.
Establishing a certificate authority (CA) and managing certificates for your servers allows each entity within your infrastructure to validate the other members’ identities and encrypt their traffic. This can prevent man-in-the-middle attacks where an attacker imitates a server in your infrastructure to intercept traffic.
Each server can be configured to trust a centralized certificate authority. Afterward, any certificate signed by this authority can be implicitly trusted.
Configuring a certificate authority and setting up the other public key infrastructure can involve quite a bit of initial effort. Furthermore, managing certificates can create an additional administration burden when new certificates need to be created, signed, or revoked.
For many users, implementing a full-fledged public key infrastructure will only make sense as their infrastructure needs grow. Securing communications between components using a VPN may be a better intermediate measure until you reach a point where PKI is worth the extra administration costs.
If you would like to create your own certificate authority, you can refer to the How To Set Up and Configure a Certificate Authority (CA) guides depending on the Linux distribution that you are using.
The strategies outlined in this tutorial are an overview of some of the steps that you can take to improve the security of your systems. It is important to recognize that security measures decrease in their effectiveness the longer you wait to implement them. Security should not be an afterthought and must be implemented when you first provision your infrastructure. Once you have a secure base to build upon, you can then start deploying your services and applications with some assurances that they are running in a secure environment by default.
Even with a secure starting environment, keep in mind that security is an ongoing and iterative process. Always be sure to ask yourself what the security implications of any change might be, and what steps you can take to ensure that you are always creating secure default configurations and environments for your software.
Cloud hosting is a method of using online virtual servers that can be created, modified, and destroyed on demand. Cloud servers are allocated resources like CPU cores and memory by the physical server they are hosted on, and can be configured with any operating system and accompanying software. Cloud hosting can be used for hosting websites, distributing web-based applications, or other services.
In this guide, we will go over some of the basic concepts involved in cloud hosting, including how virtualization works, the components in a virtual environment, and comparisons with other common hosting methods.
“The Cloud” is a common term that refers to internet-accessible servers that are available for public use, either through paid leasing or as part of a software or platform service. A cloud-based service can take many forms, including web hosting, file hosting and sharing, and software distribution. “The Cloud” can also refer to cloud computing, i.e., transparently spanning a task across multiple servers. Instead of running a complex process on a single powerful machine, cloud computing distributes the task across many smaller nodes.
Cloud hosting environments are broken down into two main parts: the virtual servers that apps and websites can be hosted on, and the physical hosts that manage the virtual servers. Virtualization makes cloud hosting possible: the relationship between host and virtual server provides flexibility and scaling that are not available through other hosting methods.
The most common form of cloud hosting today is the use of a virtual private server, or VPS. A VPS is a virtual server that acts like a real computer with its own operating system. While virtual servers share resources that are allocated to them by the host, they are totally isolated in practice, so operations on one VPS won’t affect the others.
Virtual servers are deployed and managed by the hypervisor of a physical host. Each virtual server has an operating system installed by the hypervisor that is available to the user. For practical purposes, a virtual server is identical in use to a dedicated physical server, though a virtual server needs to share physical hardware resources with other servers on the same host.
Resources are allocated to a virtual server by the physical server that it is hosted on. This host uses a software layer called a hypervisor to deploy, manage, and grant resources to the virtual servers that are under its control. The term “hypervisor” is also often used to refer to the physical hosts that hypervisors (and their virtual servers) are installed on.
The host is in charge of allocating memory, CPU cores, and a network connection to a virtual server when one is launched. An ongoing duty of the hypervisor is to schedule processes between the virtual CPU cores and the physical ones, since multiple virtual servers may be utilizing the same physical cores. Hypervisors differ from one another in the nuances of process scheduling and resource sharing.
There are a few common hypervisors available for cloud hosts today. These different virtualization methods have some key differences, but they all provide the tools that a host needs to deploy, maintain, move, and destroy virtual servers as needed.
KVM, short for “Kernel-Based Virtual Machine”, is a virtualization infrastructure that is built in to the Linux kernel. When activated, this kernel module turns the Linux machine into a hypervisor, allowing it to begin hosting virtual servers. This method contrasts with how other hypervisors usually work, as KVM does not need to create or emulate kernel components that are used for virtual hosting.
Xen is one of the most common hypervisors. Unlike KVM, Xen uses its own microkernel, which provides the tools needed to support virtual servers without modifying the host’s kernel. Xen supports two distinct methods of virtualization: paravirtualization, which skips the need to emulate hardware but requires special modifications made to the virtual servers’ operating system, and hardware-assisted virtualization (or HVM), which uses special hardware features to efficiently emulate a virtual server so that they can use unmodified operating systems. HVM became widespread on consumer CPUs around 2006, allowing most desktops and laptops to achieve similar performance when running virtual machines or microkernel-based containers (e.g. through Docker).
ESXi is an enterprise-level hypervisor offered by VMware. ESXi is unique in that it doesn’t require the host to have an underlying operating system. This is referred to as a “type 1” hypervisor and is extremely efficient due to the lack of a “middleman” between the hardware and the virtual servers. With type 1 hypervisors like ESXi, no operating system needs to be loaded on the host because the hypervisor itself acts as the operating system.
Hyper-V is one of the most popular methods of virtualizing Windows servers and is available as a system service in Windows Server. This makes Hyper-V a common choice for developers working within a Windows software environment. Hyper-V is included in modern versions of Windows and is also available as a stand-alone server without an existing installation of Windows Server. WSL2, the Windows Subsystem for Linux, is implemented via Hyper-V.
The features offered by virtualization lend themselves well to a cloud hosting environment. Virtual servers can be configured with a wide range of hardware resource allocations, and can often have resources added or removed as needs change over time. Some cloud hosts can move a virtual server from one hypervisor to another with little or no downtime, or duplicate the server for redundancy in case of a node failure.
Developers often prefer to work in a VPS due to the control that they have over the virtual environment. Most virtual servers running Linux offer access to the root (administrator) account or sudo
privileges by default, giving a developer the ability to install and modify whatever software they need.
This freedom of choice begins with the operating system. Most hypervisors are capable of hosting nearly any guest operating system, from open source software like Linux and BSD to proprietary systems like Windows. From there, developers can begin installing and configuring the building blocks needed for whatever they are working on. A cloud server’s configurations might include a web server, database, or an app that has been developed and is ready for distribution.
Cloud servers are very flexible in their ability to scale. Scaling methods fall into two broad categories: horizontal scaling and vertical scaling. Most hosting methods can scale one way or the other, but cloud hosting is unique in its ability to scale both horizontally and vertically. This is due to the virtual environment that a cloud server is built on: as its resources are an allocated portion of a larger physical pool, these resources can be adjusted or duplicated to other hypervisors.
Horizontal scaling, often referred to as “scaling out”, is the process of adding more nodes to a clustered system. This might involve adding more web servers to better manage traffic, adding new servers to a region to reduce latency, or adding more database workers to increase data transfer speed.
Vertical scaling, or “scaling up”, is when a single server is upgraded with additional resources. This might be an expansion of available memory, an allocation of more CPU cores, or some other upgrade that increases that server’s capacity. These upgrades usually pave the way for additional software instances, like database workers, to operate on that server. Before horizontal scaling became cost-effective, vertical scaling was the de facto way to respond to increasing demand.
With cloud hosting, developers can scale depending on their application’s needs — they can scale out by deploying additional VPS nodes, scale up by upgrading existing servers, or do both when server needs have dramatically increased.
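As a rough sketch of what vertical scaling looks like from inside a server, you can confirm a resize with standard Linux tools after the provider applies it (output will vary by provider and plan):
- nproc
- free -h
- df -h
nproc reports the number of CPU cores available to the server, free -h summarizes memory, and df -h shows disk capacity, which lets you verify that an upgrade has actually taken effect.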
By now, you should have an understanding of how cloud hosting works, including the relationship between hypervisors and the virtual servers that they are responsible for, as well as how cloud hosting compares to other common hosting methods. With this information in mind, you can choose the best hosting for your needs.
For a broader view of the overall cloud computing landscape, you can read A General Introduction to Cloud Computing.
Backups are very important for cloud servers. Whether you are running a single project with all of its data stored on a single server, or deploying directly from Git to VMs that are spun up and torn down while retaining a minimum set of logs, you should always plan for a failure scenario. This can mean many different things depending on what applications you are using, how important it is to have immediate failover, and what kind of problems you are anticipating.
In this guide, you’ll explore the different approaches for providing backups and data redundancy. Because different use cases demand different solutions, this article won’t be able to give you a one-size-fits-all answer, but you will learn what is important in different scenarios and what implementations are best suited for your operation.
In the first part of this guide, you’ll look at several backup solutions and review the relative merits of each so that you can choose the approach that fits your environment. In part two, you’ll explore redundancy options.
The definitions of the terms redundant and backup are often overlapping and, in many cases, confused. These are two distinct concepts that are related, but different. Some solutions provide both.
Redundancy in data means that there is immediate failover in the event of a system problem. A failover means that if one set of data (or one host) becomes unavailable, another perfect copy is immediately swapped into production to take its place. This results in almost no perceivable downtime, and the application or website can continue serving requests as if nothing happened. In the meantime, the system administrator (in this case, you) has the opportunity to fix the problem and return the system to a fully operational state.
However, a redundancy solution is usually not also a backup solution. Redundant storage does not necessarily provide protection against a failure that affects the entire machine or system. For instance, if you have a mirrored RAID configured (such as RAID 1), your data is redundant in that if one drive fails, the other will still be available. However, if the machine itself fails, all of your data could be lost.
With redundancy solutions such as MySQL Group Replication, every operation is typically performed on every copy of the data. This includes malicious or accidental operations. By definition, a backup solution should also allow you to restore from a previous point where the data is known to be good.
In general, you need to maintain functional backups for your important data. Depending on your situation, this could mean backing up application or user data, or an entire website or machine. The idea behind backups is that in the event of a system, machine, or data loss, you can restore, redeploy, or otherwise access your data. Restoring from a backup may require downtime, but it can mean the difference between starting from a day ago and starting from scratch. Anything that you cannot afford to lose should, by definition, be backed up.
In terms of methods, there are quite a few different levels of backups. These can be layered as necessary to account for different kinds of problems. For instance, you may back up a configuration file prior to modifying it so that you can revert to your old settings should a problem arise. This is ideal for small changes that you are actively monitoring. However, this setup would fail in the case of a disk failure or anything more complex. You should also have regular, automated backups to a remote location.
Backups by themselves do not provide automatic failover. This means that your failures may not cost you any data (assuming your backups are 100% up-to-date), but they may cost you uptime. This is one reason why redundancy and backups are often used in combination with each other.
One of the most familiar forms of backing up is a file-level backup. This type of backup uses normal filesystem level copying tools to transfer files to another location or device.
In theory, you could back up a Linux machine, like your cloud server, with the cp command. This copies files from one local location to another. On a local computer, you could mount a removable drive, and then copy files to it:
- mount /dev/sdc /mnt/my-backup
- cp -a /etc/* /mnt/my-backup
- umount /dev/sdc
This example mounts a removable disk, sdc, as /mnt/my-backup and then copies the /etc directory to the disk. It then unmounts the drive, which can be stored somewhere else.
A better alternative to cp is the rsync command. Rsync is a powerful tool that provides a wide array of options for replicating files and directories across many different environments, with built-in checksum validation and other features. Rsync can perform the equivalent of the cp operation above like so:
- mount /dev/sdc /mnt/my-backup
- rsync -azvP /etc/* /mnt/my-backup
- umount /dev/sdc
-azvP is a typical set of Rsync options. As a breakdown of what each of these does:
- a enables “Archive Mode” for this copy operation, which preserves file modification times, owners, and so on. It is also the equivalent of providing each of the -rlptgoD options individually (yes, really). Notably, the -r option tells Rsync to recurse into subdirectories to copy nested files and folders as well. This option is common to many other copy operations, such as cp and scp.
- z compresses data during the transfer itself, if possible. This is useful for any transfers over slow connections, especially when transferring data that compresses very effectively, like logs and other text.
- v enables verbose mode, so you can read more details of your transfer while it is in progress.
- P tells Rsync to retain partial copies of any files that do not transfer completely, so that transfers can be resumed later.

You can review other rsync options on its man page.
Of course, in a cloud environment, you would not normally be mounting and copying files to a mounted disk each time. Rsync can also perform remote backups over a network by providing SSH-style syntax. This will work on any host that you can SSH into, as long as Rsync is installed at both ends. Because Rsync is considered a core Linux tool, this is almost always a safe assumption, even if you are working locally on a Mac or Windows machine.
- rsync -azvP /etc/* username@remote_host:/backup/
This will back up the local machine’s /etc directory to a directory on remote_host located at /backup. This will succeed if you have permission to write to this directory and there is available space.
You can also review more information about how to use Rsync to sync local and remote directories.
Although cp and rsync are useful and ubiquitous, they are not a complete solution on their own. To automate backups using Rsync, you would need to create your own automated procedures, backup schedule, log rotation, and so on. While this may be appropriate for some very small deployments which do not want to make use of external services, or very large deployments which have dedicated resources for maintaining very granular scripts for various purposes, many users may want to invest in a dedicated backup offering.
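If you do script it yourself, a minimal sketch of such automation is a cron entry that runs Rsync nightly; the remote host, paths, and log file below are placeholders to adapt to your own environment:
- crontab -e
Then add a line such as:
0 3 * * * rsync -az /etc/ username@remote_host:/backup/etc/ >> /var/log/rsync-backup.log 2>&1
This runs the backup every day at 3:00 AM and appends Rsync’s output to a log file, but it still leaves log rotation, retention, and failure alerting up to you.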
Bacula
Bacula is a complex, flexible solution that works on a client-server model. Bacula is designed with separate concepts of clients, backup locations, and directors (the component that orchestrates the actual backup). It also configures each backup task into a unit called a “job”.
This allows for extremely granular and flexible configuration. You can back up multiple clients to one storage device, one client to multiple storage devices, and modify the backup scheme by adding nodes or adjusting their details. It functions well over a networked environment and is expandable and modular, making it great for backing up a site or application spread across multiple servers.
Duplicity
Duplicity is another open source backup tool. It uses GPG encryption by default for transfers.
The obvious benefit of using GPG encryption for file backups is that the data is not stored in plain text. Only the owner of the GPG key can decrypt the data. This provides some level of security to offset the additional security measures required when your data is stored in multiple locations.
Another benefit that may not be apparent to those who do not use GPG regularly is that each transaction has to be verified to be completely accurate. GPG, like Rsync, enforces hash checking to ensure that there was no data loss during the transfer. This means that when restoring data from a backup, you will be significantly less likely to encounter file corruption.
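As a brief, hedged example of what a Duplicity backup can look like in practice, the following assumes Duplicity is installed, you have SSH access to a remote host, and you have a GPG key to encrypt with; the key ID, host, and paths are placeholders:
- duplicity --encrypt-key YOUR_GPG_KEY_ID /etc sftp://username@remote_host//backup/etc
Duplicity performs a GPG-encrypted backup of /etc to the remote location, and subsequent runs are incremental, transferring only what has changed since the last backup.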
A slightly less common, but important alternative to file-level backups are block-level backups. This style of backup is also known as “imaging” because it can be used to duplicate and restore entire devices. Block-level backups allow you to copy on a deeper level than a file. While a file-based backup might copy file1, file2, and file3 to a backup location, a block-based backup system would copy the entire “block” that those files reside on. Another way of explaining the same concept is to say that block-level backups copy information bit after bit. They do not know about the files that may span those bits.
One advantage of the block-level backups is that they are typically faster. While file-based backups usually initiate a new transfer for each separate file, a block-based backup will transfer blocks, meaning that fewer non-sequential transfers need to be initiated to complete the copying.
The most common method of performing block-level backups is with the dd utility. dd can be used to create entire disk images, and is also frequently used when archiving removable media like CDs or DVDs. This means that you can back up a partition or disk to a single file or a raw device without any preliminary steps.
To use dd, you need to specify an input location and an output location, like so:
- dd if=/path/of/original/device of=/path/to/place/backup
In this scenario, the if= argument specifies the input device or location. The of= argument specifies the output file or location. Be careful not to confuse these, or you could erase an entire disk by mistake.
For example, to back up a partition containing your documents, which is located at /dev/sda3, you can create an image of that partition by providing an output path to an .img file:
- dd if=/dev/sda3 of=~/documents.img
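Restoring works the same way with the arguments reversed. In this sketch, the image created above is written back to the same partition, which assumes the partition is unmounted and that you are certain of the target device:
- dd if=~/documents.img of=/dev/sda3
Because dd overwrites whatever is at the output location, double-check the of= value before running a restore.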
One of the primary motivations for backing up data is being able to restore a previous version of a file in the event of an unwanted change or deletion. While all of the backup mechanisms mentioned so far can deliver this, you can also implement a more granular solution.
For example, a manual way of accomplishing this would be to create a backup of a file prior to editing it in nano:
- cp file1 file1.bak
- nano file1
You could even automate this process by creating timestamped hidden files every time you modify a file with your editor. For instance, you could place this in your ~/.bashrc file, so that every time you execute nano from your bash (i.e. $) shell, it automatically creates a backup stamped with year (%y), month (%m), day (%d), and so on:
- nano() { cp $1 .${1}.`date +%y-%m-%d_%H.%M.%S`.bak; /usr/bin/nano $1; }
This would work to the extent that you edit files manually with nano, but is limited in scope, and could quickly fill up a disk. You can see how it could end up being worse than manually copying files you are going to edit.
An alternative that solves many of the problems inherent in this design is to use Git as a version control system. Although it was developed primarily to focus on versioning plain text, usually source code, line-by-line, you can use Git to track almost any kind of file. To learn more, you can review How to Use Git Effectively.
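As a minimal sketch of this approach, you could track a directory of configuration files in a local Git repository and commit before and after each change (this assumes Git is installed and that you have the necessary permissions on the directory):
- cd /etc
- git init
- git add .
- git commit -m "Snapshot of /etc before changes"
Each subsequent commit records a version you can diff against or roll back to, though the repository itself should still be pushed to a remote or otherwise included in your backups.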
Most hosting providers will also provide their own optional backup functionality. DigitalOcean’s backup function regularly performs automated backups for droplets that have enabled this service. You can turn this on during droplet creation by checking the “Backups” check box:
This will back up your entire cloud server image on a regular basis. This means that you can redeploy from the backup, or use it as a base for new droplets.
For one-off imaging of your system, you can also create snapshots. These work in a similar way to backups, but are not automated. Although it’s possible to take a snapshot of a running system in some contexts, it is not always recommended, depending on how you are writing to your filesystem:
You can learn more about DigitalOcean backups and snapshots from the Containers and Images documentation.
Finally, it is worth noting that there are some circumstances in which you will not necessarily be looking to implement backups on a per-server basis. For example, if your deployment follows the principles of GitOps, you may treat many of your individual cloud servers as disposable, and instead treat remote data sources like Git repositories as the effective source of truth for your data. Complex, modern deployments like this can be more scalable and less prone to failure in many cases. However, you will still want to implement a backup strategy for your data stores themselves, or for a centralized log server that each of these disposable servers may be sending information to. Consider which aspects of your deployment may not need to be backed up, and which do.
In this article, you explored various backup concepts and solutions. Next, you may want to review solutions to enable redundancy.
Optimizing WordPress installations gives the clients and individuals who use your sites the performance, speed, and flexibility they’ve come to expect from WordPress. Whether you’re managing a personal site or a suite of installations for various clients, taking the time to optimize your WordPress installations increases efficiency and performance.
In this tutorial, you’ll explore how to optimize WordPress installations in a way that’s built for scale, including guidance on configuration, speed, and overall performance.
This is a conceptual article sharing different ways to approach optimization of a WordPress installation on Ubuntu 20.04. While this tutorial references the use of a managed solution via our WordPress 1-Click App, there are many different starting points.
Whichever you choose, this tutorial will start with the assumption that you have or are prepared to install a fully-working WordPress installation configured with an administrative user on Ubuntu 20.04.
During the installation and creation of your WordPress installation there are a few variables to take into account, including the location of your potential users, the scope of your WordPress site or suite of sites, and the maintenance and security preferences set that allow your site to be continually optimized. Taking the time to dive into each thoughtfully before building out your site will save time and benefit your WordPress installation as it grows.
The first step in optimizing your WordPress site is to have a deep understanding of how you intend to use and grow your site. Will it be one site, or a network of sites? Is your site a static or dynamic website? Answering these questions before setting up your installation can inform some of your initial decisions regarding hosting, storage size, and performance.
For example, if you’d like to build a personal blog, caching and optimizing images and visual content is important to consider. If you intend to create a community or ecommerce site with concurrent visitors and frequently changing data, considerations for server resources should be made. Being thoughtful about your intention for your WordPress installation from the start will guide the usefulness of security and performance tweaks made to your site, and lend to an overall more efficient installation.
There are a few preferences that are important to consider while installing WordPress that can reduce latency and increase performance on your site.
First, select a hosting provider that provides the latest WordPress, Apache, MySQL, and PHP software with firewall and SSL certificate capabilities. A reliable and modern hosting provider will give you the best start for your LAMP stack installation. With shared hosting, be aware of server usage and customers per server to ensure the best performance for your site. Choosing the right hosting provider for your needs will help you prevent downtime and performance errors.
Be aware of the location of your servers or datacenters when starting a new WordPress installation, and choose the location that best suits the needs of your site and the general location of your visitors and users. Latency, the time it takes for data to be transmitted between your site and users, fluctuates based on location. The WordPress documentation on site analytic tools explains how to track visitor location data, as well as the number of visits to your site. Having an idea from the start about where your visitors are from can help determine where to host your site and provide them with a faster browsing experience.
There are a wide range of available themes that can be used or customized for WordPress. Many themes can be configured with user-friendly drag and drop interfaces, integrated with custom plugins and more. When setting up your WordPress site, it’s a good idea to initially consider only the essential features that you’ll use for the lifecycle of your site, adding more as you grow.
Starting with a lightweight theme can help your installation load more efficiently. A lightweight theme requires fewer database calls, and by keeping your site free of unnecessary code, your users will experience fewer delays in page loading and performance.
For any theme selected, be sure to turn off or disable any features offered with the theme that you won’t need or use. These can be preferences offered in the Appearance section of the WordPress dashboard, typically under Theme Editor or Customize. Turning off features you don’t use reduces the number of requests and calls happening to query for data in the background.
While there are a number of free and paid options for WordPress themes available online, many use page builders that add excess shortcode and unused code that will affect the performance of your site. Consider your use case when deciding whether or not to use a page builder, as they typically include a lot of extra processes that will have an impact on your site’s speed.
WordPress plugins offer extended functionality for WordPress installations through added code that allows users to customize their installations to suit their specific needs. There are over 56,000 currently available plugins, making them an appealing way to add additional features to a WordPress site.
While plugins can increase the efficiency of your site, care should be taken in selecting quality plugins that are maintained and updated regularly. Because many plugins not only add code to your site but entries to your WordPress installation’s database, using too many plugins may cause site speed issues over time.
Once you have installed all of the plugins, widgets, and additional features you’d like to add to your WordPress installation, there are a few more optimization options to try within the WordPress dashboard that could positively impact your site’s speed and performance.
First, be sure to change your site’s login address. Because most WordPress administrative login pages end in /wp-admin, this page is often prone to attacks. There are a number of tools available that enable you to change your login URL, so be sure to select the one that works best for your use case.
Next, consider the Site Health tool, located in the Tools section of your WordPress dashboard:
Consider the results shown, and follow the instructions found in each dropdown on the Status tab to improve security or performance as mentioned within the tabs.
Using the built-in configuration offered in the WordPress dashboard ensures that you’ve covered all of the readily available configuration tweaks for your installation.
Caching can also help improve your WordPress site’s performance and speed. Caching, a core design feature of the HTTP protocol meant to minimize network traffic while improving the perceived responsiveness of the system as a whole, can be used to help minimize load times when implemented on your site. WordPress offers a number of caching plugins that are helpful in maintaining a snapshot of your site to serve static HTML elements, reducing the amount of PHP calls and improving page load speed.
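Once a caching plugin is in place, one hedged way to sanity-check that cached responses are being served is to inspect the HTTP response headers for your pages; the exact header names vary by plugin and host, and your_domain is a placeholder:
- curl -sI https://your_domain/ | grep -i -E 'cache|age'
Seeing a cache-related header on repeat requests suggests that pages are being served from the cache rather than regenerated by PHP on every visit.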
In this tutorial you explored a number of different techniques that you can use to make your WordPress installation on Ubuntu 20.04 faster and more efficient. Following the suggestions in this tutorial will help ensure that your site’s performance isn’t an issue as you grow in users and content on your site.
This article is deprecated and no longer maintained.
We have rewritten our introductory database content.
This article may still be useful as a reference, but may not work or follow best practices. We strongly recommend using a recent article written for the operating system you are using.
Since their earliest days, one of the most heavily needed and relied-upon capabilities of computers has been memory. Although the technicalities and underlying implementation methods differ, most computers come equipped with the hardware necessary to process information and safely keep it for future use whenever needed.
In today’s world, it is almost impossible to think of an application that does not make use of this ability, whether it runs on servers, personal computers, or hand-held devices. From simple games to business tools and websites, certain types of data are processed, recorded, and retrieved with each operation.
Database Management Systems (DBMS) are the higher-level software, working with lower-level application programming interfaces (APIs), that take care of these operations. To help solve different kinds of problems, new kinds of DBMSs have been developed over the decades (e.g. relational, NoSQL, etc.), along with applications implementing them (e.g. MySQL, PostgreSQL, MongoDB, Redis, etc.).
In this DigitalOcean article, we are going to go over the basics of databases and database management systems. We will learn about the logic behind how different databases work and what sets them apart.
Database Management System is an umbrella term that refers to all sorts of completely different tools (i.e. computer programs or embedded libraries), mostly working in different and very unique ways. These applications handle, or heavily assist in handling, collections of information. Since information (or data) itself can come in various shapes and sizes, dozens of DBMSs, along with a great many database applications, have been developed since the second half of the 20th century to help solve different programming and computerisation needs.
Database management systems are based on database models: structures defined for handling the data. Each emerging DBMS, and the applications created to implement its methods, works in very different ways with regards to how that information is defined, stored, and retrieved.
Although there are a large number of solutions that implement different DBMSs, each period in history has seen a relatively small number of choices rapidly become extremely popular and stay in use for a long time, with the most predominant choice over the past couple of decades (or even longer) being Relational Database Management Systems (RDBMS).
Each database system implements a different database model to logically structure the data that is being managed. These models are the first step and the biggest determiner of how a database application will work and handle the information it deals with.
There are quite a few different types of database models which clearly and strictly provide the means of structuring the data, with the most popular probably being the relational model.
Although the relational model and relational databases are extremely powerful and flexible when the programmer knows how to use them, for many use cases there have been issues to work around, or features that these solutions never really offered.
Recently, a series of different systems and applications called NoSQL databases rapidly gained popularity with their promise of solving these problems and offering some very interesting additional functionality. By removing the strictly structured data-keeping style defined within the relational model, these database systems offer a much more freely shaped way of working with information, providing a great deal of flexibility and ease of use, although they come with their own problems, some of them serious given the important and indispensable nature of data.
Introduced in the 1970s, the relational model offers a mathematically grounded way of structuring, keeping, and using data. It expands the earlier flat and network models by introducing relations. Relations bring the benefit of keeping data grouped as constrained collections, whereby data tables containing information in a structured way (e.g. a person’s name and address) relate all the input by assigning values to attributes (e.g. a person’s ID number).
Thanks to decades of research and development, database systems that implement the relational model work extremely efficiently and reliably. Combined with the long experience of programmers and database administrators working with these tools, using relational database applications has become the choice of mission-critical applications which can not afford loss of any information, in any situation – especially due to glitches or gotchas.
Despite their strict nature of forming and handling data, relational databases can become extremely flexible and offer a lot, granted with a little bit of effort.
The NoSQL way of structuring the data consists of getting rid of these constraints, hence liberating the means of keeping, querying, and using information. NoSQL databases, by using an unstructured (or structured-on-the-go) kind of approach, aim to eliminate the limitations of strict relations, and offer many different types of ways to keep and work with the data for specific use cases efficiently (e.g. full-text document storage).
In this article, our aim is to introduce you to the paradigms of some of the most popular and commonly used database solutions. Although it is hard to reach a numeric conclusion, for most the choice lies between a relational database engine and a relatively newer NoSQL one. Before examining the differences between the implementations of each of these systems, let us see what is under the hood.
Relational database management systems take their name from the model they implement: the relational model, which we discussed previously. Currently, and likely for some time to come, they are the popular choice for keeping data reliably and safely, and they are efficient as well.
Relational database management systems require defined and clearly set schemas - which is not to be confused with PostgreSQL’s specific definition for the term - in order to accept data. These user-defined formats shape how the data is contained and used. Schemas are much like tables with columns, representing the number and the type of information that belongs to each record; and rows represent entries.
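For example, a table holding the person records described above could be defined with a fixed set of typed columns. This is only a generic illustration using the MySQL client; the database, table, and column names are arbitrary placeholders:
- mysql -u root -p -e "CREATE DATABASE IF NOT EXISTS example_db; CREATE TABLE example_db.people (id INT PRIMARY KEY, name VARCHAR(100), address VARCHAR(255));"
Every row inserted into this table must then supply values matching these columns and types, which is exactly the kind of structure that NoSQL systems relax.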
Some popular relational database management systems are:
SQLite: A very powerful, embedded relational database management system.
MySQL: The most popular and commonly used RDBMS.
PostgreSQL: The most advanced, SQL-compliant, open source object-relational DBMS.
Note: To learn more about NoSQL database management systems, check out our article on the subject: A Comparison Of NoSQL Database Management Systems.
NoSQL database systems do not come with a model as used (or needed) with structured relational solutions. There are many implementations, each working very differently and serving a specific need. These schema-less solutions either allow nearly unrestricted forming of entries, or take the opposite approach with very simple but extremely efficient and useful key-based value stores.
Unlike traditional relational databases, it is possible to group collections of data together with some NoSQL databases, such as MongoDB. These document stores keep each piece of related data together as a single document in the database. These documents can be represented as singular data objects, similar to JSON, and can still be queried based on their attributes.
NoSQL databases do not have a common way to query the data (i.e. similar to SQL of relational databases) and each solution provides its own query system.
Note: To learn more about relational database management systems, check out our article on the subject: A Comparison Of Relational Database Management Systems.
In order to reach a simpler, understandable conclusion, let us analyse the differences between SQL and NoSQL database management systems:
SQL/Relational databases require a structure with defined attributes to hold the data, unlike NoSQL databases which usually allow free-flow operations.
Regardless of their licences, relational databases all implement the SQL standard to a certain degree and thus, they can be queried using the Structured Query Language (SQL). NoSQL databases, on the other hand, each implement a unique way to work with the data they manage.
Both solutions are easy to scale vertically (i.e. by increasing system resources). However, being more modern (and simpler) applications, NoSQL solutions usually offer much easier means to scale horizontally (i.e. by creating a cluster of multiple machines).
When it comes to data reliability and safety guarantees for performed transactions, SQL databases are still the better bet.
Relational database management systems have a decades-long history. They are extremely popular, and it is very easy to find both free and paid support. If an issue arises, it is therefore much easier to solve than with recently popular NoSQL databases, especially if the solution in question is complex in nature (e.g. MongoDB).
By nature, relational databases are the go-to solution for complex querying and data keeping needs. They are much more efficient and excel in this domain.
SSH, or secure shell, is a secure protocol and the most common way of safely administering remote servers. Using a number of encryption technologies, SSH provides a mechanism for establishing a cryptographically secured connection between two parties, authenticating each side to the other, and passing commands and output back and forth.
In this guide, we will be examining the underlying encryption techniques that SSH employs and the methods it uses to establish secure connections. This information can be useful for understanding the various layers of encryption and the different steps needed to form a connection and authenticate both parties.
In order to secure the transmission of information, SSH employs a number of different types of data manipulation techniques at various points in the transaction. These include forms of symmetrical encryption, asymmetrical encryption, and hashing.
The relationship of the components that encrypt and decrypt data determines whether an encryption scheme is symmetrical or asymmetrical.
Symmetrical encryption is a type of encryption where one key can be used to encrypt messages to the opposite party, and also to decrypt the messages received from the other participant. This means that anyone who holds the key can encrypt and decrypt messages to anyone else holding the key.
This type of encryption scheme is often called “shared secret” encryption, or “secret key” encryption. There is typically only a single key that is used for all operations or a pair of keys where the relationship is discoverable and it’s trivial to derive the opposite key.
Symmetric keys are used by SSH in order to encrypt the entire connection. Contrary to what some users assume, public/private asymmetrical key pairs that can be created are only used for authentication, not encrypting the connection. The symmetrical encryption allows even password authentication to be protected against snooping.
The client and server both contribute toward establishing this key, and the resulting secret is never known to outside parties. The secret key is created through a process known as a key exchange algorithm. This exchange results in the server and client both arriving at the same key independently by sharing certain pieces of public data and manipulating them with certain secret data. This process is explained in greater detail later on.
The symmetrical encryption key created by this procedure is session-based and constitutes the actual encryption for the data sent between server and client. Once this is established, the rest of the data must be encrypted with this shared secret. This is done prior to authenticating a client.
SSH can be configured to use a variety of different symmetrical cipher systems, including Advanced Encryption Standard (AES), Blowfish, 3DES, CAST128, and Arcfour. The server and client can both decide on a list of their supported ciphers, ordered by preference. The first option from the client’s list that is available on the server is used as the cipher algorithm in both directions.
On Ubuntu 20.04, both the client and the server default to the following cipher list:
chacha20-poly1305@openssh.com
aes128-ctr
aes192-ctr
aes256-ctr
aes128-gcm@openssh.com
aes256-gcm@openssh.com
This means that if two Ubuntu 20.04 machines are connecting to each other (without overriding the default ciphers through configuration options), they will always default to using the chacha20-poly1305@openssh.com cipher to encrypt their connection.
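You can check which ciphers your own OpenSSH client supports with the -Q query flag, and, on the server side, inspect the effective configuration (the second command requires root and assumes OpenSSH’s sshd):
- ssh -Q cipher
- sudo sshd -T | grep -i ciphers
The first command lists every cipher the client build supports, while the second prints the cipher list that the running server will actually offer.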
Asymmetrical encryption is different from symmetrical encryption because to send data in a single direction, two associated keys are needed. One of these keys is known as the private key, while the other is called the public key.
The public key can be freely shared with any party. It is associated with its paired key, but the private key cannot be derived from the public key. The mathematical relationship between the public key and the private key allows the public key to encrypt messages that can only be decrypted by the private key. This is a one-way ability, meaning that the public key has no ability to decrypt the messages it writes, nor can it decrypt anything the private key may send it.
The private key should be kept entirely secret and should never be shared with another party. This is a key requirement for the public key paradigm to work. The private key is the only component capable of decrypting messages that were encrypted using the associated public key. By virtue of this fact, any entity capable of decrypting these messages has demonstrated that they are in control of the private key.
SSH uses asymmetric encryption in a few different places. During the initial key exchange process used to set up the symmetrical encryption (used to encrypt the session), asymmetrical encryption is used. In this stage, both parties produce temporary key pairs and exchange the public key in order to produce the shared secret that will be used for symmetrical encryption.
The more well-discussed use of asymmetrical encryption with SSH comes from SSH key-based authentication. SSH key pairs can be used to authenticate a client to a server. The client creates a key pair and then uploads the public key to any remote server it wishes to access. This is placed in a file called authorized_keys within the ~/.ssh directory in the user account’s home directory on the remote server.
After the symmetrical encryption is established to secure communications between the server and client, the client must authenticate to be allowed access. The server can use the public key in this file to encrypt a challenge message to the client. If the client can prove that it was able to decrypt this message, it has demonstrated that it owns the associated private key. Then the server can set up the environment for the client.
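As a brief example of setting this up, you can generate a key pair on the client and copy the public key to the server with OpenSSH’s standard tools; the key type, comment, and remote host are placeholders:
- ssh-keygen -t ed25519 -C "your_email@example.com"
- ssh-copy-id username@remote_host
ssh-keygen creates the private and public key files under ~/.ssh, and ssh-copy-id appends the public key to the remote account’s ~/.ssh/authorized_keys file, so that the challenge-based authentication described above can take place.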
Another form of data manipulation that SSH takes advantage of is cryptographic hashing. Cryptographic hash functions are methods of creating a succinct “signature” or summary of a set of information. Their main distinguishing attributes are that they are never meant to be reversed, they are virtually impossible to influence predictably, and they are practically unique.
Using the same hashing function and message should produce the same hash; modifying any portion of the data should produce an entirely different hash. A user should not be able to produce the original message from a given hash, but they should be able to tell if a given message produced a given hash.
Given these properties, hashes are mainly used for data integrity purposes and to verify the authenticity of communication. The main use in SSH is with HMAC, or hash-based message authentication codes. These are used to ensure the message text that’s received is intact and unmodified.
As part of the symmetrical encryption negotiation outlined previously, a message authentication code (MAC) algorithm is selected. The algorithm is chosen by working through the client’s list of acceptable MAC choices. The first one on this list that the server supports will be used.
Each message sent after the encryption is negotiated must contain a MAC so that the other party can verify the packet integrity. The MAC is calculated from the symmetrical shared secret, the packet sequence number of the message, and the actual message content.
The MAC itself is sent outside of the symmetrically encrypted area as the final part of the packet. Researchers generally recommend this method of encrypting the data first and then calculating the MAC.
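Similar to ciphers, you can list the MAC algorithms your OpenSSH client supports; the algorithm actually negotiated also depends on both the client and server configuration:
- ssh -Q mac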
You probably already have a basic understanding of how SSH works. The SSH protocol employs a client-server model to authenticate two parties and encrypt the data between them.
The server component listens on a designated port for connections. It is responsible for negotiating the secure connection, authenticating the connecting party, and spawning the correct environment if the credentials are accepted.
The client is responsible for beginning the initial transmission control protocol (TCP) handshake with the server, negotiating the secure connection, verifying that the server’s identity matches previously recorded information, and providing credentials to authenticate.
An SSH session is established in two separate stages. The first is to agree upon and establish encryption to protect future communication. The second stage is to authenticate the user and discover whether access to the server should be granted.
When a TCP connection is made by a client, the server responds with the protocol versions it supports. If the client can match one of the acceptable protocol versions, the connection continues. The server also provides its public host key, which the client can use to check whether this was the intended host.
At this point, both parties negotiate a session key using a version of something called the Diffie-Hellman algorithm. This algorithm (and its variants) make it possible for each party to combine their own private data with public data from the other system to arrive at an identical secret session key.
The session key will be used to encrypt the entire session. The public and private key pairs used for this part of the procedure are completely separate from the SSH keys used to authenticate a client to the server.
The basis of this procedure for classic Diffie-Hellman is that both parties agree on a shared prime number and a generator, each party independently generates a private value, each sends the other the public value derived from it, and each then combines the received public value with its own private value to independently arrive at the same shared secret.
This process allows each party to equally participate in generating the shared secret, which does not allow one end to control the secret. It also accomplishes the task of generating an identical shared secret without ever having to send that information over insecure channels. The shared secret encryption that is used for the rest of the connection is called binary packet protocol.
The generated secret is a symmetric key, meaning that the same key used to encrypt a message can be used to decrypt it on the other side. The purpose of this is to wrap all further communication in an encrypted tunnel that cannot be deciphered by outsiders.
After the session encryption is established, the user authentication stage begins.
The next step involves authenticating the user and deciding on access. There are a few methods that can be used for authentication, based on what the server accepts.
The general method is password authentication, which is when the server prompts the client for the password of the account they are attempting to log in with. The password is sent through the negotiated encryption, so it is secure from outside parties.
Even though the password will be encrypted, this method is not generally recommended due to the limitations on the complexity of the password. Automated scripts can break passwords of normal lengths very easily compared to other authentication methods.
The most popular and recommended alternative is the use of SSH key pairs. SSH key pairs are asymmetric keys, meaning that the two associated keys serve different functions.
The public key is used to encrypt data that can only be decrypted with the private key. The public key can be freely shared, because, although it can encrypt for the private key, there is no method of deriving the private key from the public key.
Authentication using SSH key pairs begins after the symmetric encryption has been established as described in the previous section. Broadly, the client sends an ID for the key pair it would like to authenticate with, the server checks the authorized_keys file of the account that the client is attempting to log into for that key ID, and, if a matching public key is found, the server uses it to encrypt a challenge that only the holder of the corresponding private key can answer.

In sum, the asymmetry of the keys allows the server to encrypt messages to the client using the public key. The client can then prove that it holds the private key by decrypting the message correctly. The two types of encryption that are used (symmetric shared secret and asymmetric public/private keys) are each able to leverage their specific strengths in this model.
Learning about the connection negotiation steps and the layers of encryption at work in SSH can help you better understand what is happening when you log in to a remote server. Now you can recognize the relationship between various components and algorithms, and understand how all of these pieces fit together. To learn more about SSH, check out the following guides:
HAProxy, which stands for High Availability Proxy, is a popular open source software TCP/HTTP Load Balancer and proxying solution which can be run on Linux, macOS, and FreeBSD. Its most common use is to improve the performance and reliability of a server environment by distributing the workload across multiple servers (e.g. web, application, database). It is used in many high-profile environments, including: GitHub, Imgur, Instagram, and Twitter.
In this guide, you’ll get a general overview of what HAProxy is, review load-balancing terminology, and examples of how it might be used to improve the performance and reliability of your own server environment.
There are many terms and concepts that are important when discussing load balancing and proxying. You’ll go over commonly used terms in the following subsections.
Before you get into the basic types of load balancing, you should begin with a review of ACLs, backends, and frontends.
In relation to load balancing, ACLs are used to test some condition and perform an action (e.g. select a server, or block a request) based on the test result. Use of ACLs allows flexible network traffic forwarding based on a variety of factors like pattern-matching and the number of connections to a backend, for example.
Example of an ACL:
acl url_blog path_beg /blog
This ACL is matched if the path of a user’s request begins with /blog. This would match a request of http://yourdomain.com/blog/blog-entry-1, for example.
For a detailed guide on ACL usage, check out the HAProxy Configuration Manual.
A backend is a set of servers that receives forwarded requests. Backends are defined in the backend section of the HAProxy configuration. In its most basic form, a backend can be defined by a load balancing algorithm and a list of servers and ports.
A backend can contain one or many servers in it. Generally speaking, adding more servers to your backend will increase your potential load capacity by spreading the load over multiple servers. Increased reliability is also achieved through this manner, in case some of your backend servers become unavailable.
Here is an example of a two backend configuration, web-backend and blog-backend, with two web servers in each, listening on port 80:
backend web-backend
balance roundrobin
server web1 web1.yourdomain.com:80 check
server web2 web2.yourdomain.com:80 check
backend blog-backend
balance roundrobin
mode http
server blog1 blog1.yourdomain.com:80 check
server blog2 blog2.yourdomain.com:80 check
The balance roundrobin line specifies the load balancing algorithm, which is detailed in the Load Balancing Algorithms section.

mode http specifies that layer 7 proxying will be used, which is explained in the Types of Load Balancing section.

The check option at the end of the server directives specifies that health checks should be performed on those backend servers.
A frontend defines how requests should be forwarded to backends. Frontends are defined in the frontend section of the HAProxy configuration. Their definitions are composed of a bind address and port, ACLs, and use_backend rules, which define which backends to use depending on which ACL conditions are matched, and/or a default_backend rule that handles every other case.

A frontend can be configured to handle various types of network traffic, as explained in the next section.
Now that you have an understanding of the basic components that are used in load balancing, you can move into the basic types of load balancing.
A simple web application environment with no load balancing might look like the following:
In this example, the user connects directly to your web server, at yourdomain.com, and there is no load balancing. If your single web server goes down, the user will no longer be able to access your web server. Additionally, if many users are trying to access your server simultaneously and it is unable to handle the load, they may have a slow experience or they may not be able to connect at all.
The simplest way to load balance network traffic to multiple servers is to use layer 4 (transport layer) load balancing. Load balancing this way will forward user traffic based on IP range and port (i.e. if a request comes in for http://yourdomain.com/anything, the traffic will be forwarded to the backend that handles all the requests for yourdomain.com on port 80). For more details on layer 4, check out the TCP subsection of our Introduction to Networking.
Here is a diagram of a simple example of layer 4 load balancing:
The user accesses the load balancer, which forwards the user’s request to the web-backend group of backend servers. Whichever backend server is selected will respond directly to the user’s request. Generally, all of the servers in the web-backend should be serving identical content–otherwise the user might receive inconsistent content. Note that both web servers connect to the same database server.
Another, more complex way to load balance network traffic is to use layer 7 (application layer) load balancing. Using layer 7 allows the load balancer to forward requests to different backend servers based on the content of the user’s request. This mode of load balancing allows you to run multiple web application servers under the same domain and port. For more details on layer 7, check out the HTTP subsection of our Introduction to Networking.
Here is a diagram of a simple example of layer 7 load balancing:
In this example, if a user requests yourdomain.com/blog, they are forwarded to the blog backend, which is a set of servers that run a blog application. Other requests are forwarded to web-backend, which might be running another application. Both backends use the same database server, in this example.
A snippet of the example frontend configuration would look like this:
frontend http
bind *:80
mode http
acl url_blog path_beg /blog
use_backend blog-backend if url_blog
default_backend web-backend
This configures a frontend named http, which handles all incoming traffic on port 80.

acl url_blog path_beg /blog matches a request if the path of the user’s request begins with /blog.

use_backend blog-backend if url_blog uses the ACL to proxy the traffic to blog-backend.

default_backend web-backend specifies that all other traffic will be forwarded to web-backend.
The load balancing algorithm that is used determines which server, in a backend, will be selected when load balancing. HAProxy offers several options for algorithms. In addition to the load balancing algorithm, servers can be assigned a weight parameter to manipulate how frequently the server is selected, compared to other servers.
A few of the commonly used algorithms are as follows:
roundrobin: Round Robin selects servers in turns. This is the default algorithm.

leastconn: Selects the server with the least number of connections. This is recommended for longer sessions. Servers in the same backend are also rotated in a round-robin fashion.

source: This selects which server to use based on a hash of the source IP address that users are making requests from. This method ensures that the same users will connect to the same servers.
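To choose one of these algorithms, you set the balance directive in the relevant backend. The snippet below is only a sketch of the syntax, reusing the hypothetical backend from the earlier example with a different algorithm:

backend web-backend
    balance leastconn
    server web1 web1.yourdomain.com:80 check
    server web2 web2.yourdomain.com:80 check

Swapping leastconn for source would instead pin users to servers based on a hash of their source IP address.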
Some applications require that a user continues to connect to the same backend server. This can be achieved through sticky sessions, using the appsession parameter in the backend that requires it.
HAProxy uses health checks to determine if a backend server is available to process requests. This avoids having to manually remove a server from the backend if it becomes unavailable. The default health check is to try to establish a TCP connection to the server.
If a server fails a health check, and therefore is unable to serve requests, it is automatically disabled in the backend, and traffic will not be forwarded to it until it becomes healthy again. If all servers in a backend fail, the service will become unavailable until at least one of those backend servers becomes healthy again.
For certain types of backends, like database servers, the default TCP health check is not necessarily sufficient to determine whether a server is still healthy.
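For example, HAProxy ships with protocol-aware checks for some services. The snippet below is a sketch for a MySQL backend and assumes you have created a dedicated haproxy_check user in MySQL for the health checks:

backend mysql-backend
    mode tcp
    option mysql-check user haproxy_check
    server mysql1 mysql1.yourdomain.com:3306 check
    server mysql2 mysql2.yourdomain.com:3306 check

The mysql-check option makes HAProxy complete a MySQL handshake rather than just opening a TCP connection, which is a better indicator that the database can actually serve clients.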
The Nginx web server can also be used as a standalone proxy server or load balancer, and is often used in conjunction with HAProxy for its caching and compression capabilities.
The layer 4 and 7 load balancing setups described in this tutorial both use a load balancer to direct traffic to one of many backend servers. However, your load balancer is a single point of failure in these setups; if it goes down or gets overwhelmed with requests, it can cause high latency or downtime for your service.
A high availability (HA) setup is broadly defined as infrastructure without a single point of failure. It prevents a single server failure from being a downtime event by adding redundancy to every layer of your architecture. A load balancer facilitates redundancy for the backend layer (web/app servers), but for a true high availability setup, you need to have redundant load balancers as well.
Here is a diagram of a high availability setup:
In this example, you have multiple load balancers (one active and one or more passive) behind a static IP address that can be remapped from one server to another. When a user accesses your website, the request goes through the external IP address to the active load balancer. If that load balancer fails, your failover mechanism will detect it and automatically reassign the IP address to one of the passive servers. There are a number of different ways to implement an active/passive HA setup. To learn more, read How To Use Reserved IPs.
Now that you have an understanding of load balancing, and know how to make use of HAProxy, you have a solid foundation to get started on improving the performance and reliability of your own server environment.
If you’re interested in storing HAProxy’s output for later viewing, check out How To Configure HAProxy Logging with Rsyslog on CentOS 8 [Quickstart]
If you’re looking to solve an issue, check out Common HAProxy Errors. If even further troubleshooting is needed, take a look at How To Troubleshoot Common HAProxy Errors.
Modern websites and applications must often deliver a significant amount of static content to end users. This content includes images, stylesheets, JavaScript, and video. As these static assets grow in number and size, bandwidth usage swells and page load times increase, deteriorating the browsing experience for your users and reducing your servers’ available capacity.
To dramatically reduce page load times, improve performance, and reduce your bandwidth and infrastructure costs, you can implement a CDN, or content delivery network, to cache these assets across a set of geographically distributed servers.
In this tutorial, we’ll provide a high-level overview of CDNs and how they work, as well as the benefits they can provide for your web applications.
A content delivery network is a geographically distributed group of servers optimized to deliver static content to end users. This static content can be almost any sort of data, but CDNs are most commonly used to deliver web pages and their related files, streaming video and audio, and large software packages.
A CDN consists of multiple points of presence (PoPs) in various locations, each consisting of several edge servers that cache assets from your origin, or host server. When a user visits your website and requests static assets like images or JavaScript files, their requests are routed by the CDN to the nearest edge server, from which the content is served. If the edge server does not have the assets cached or the cached assets have expired, the CDN will fetch and cache the latest version from either another nearby CDN edge server or your origin servers. If the CDN edge does have a cache entry for your assets (which occurs the majority of the time if your website receives a moderate amount of traffic), it will return the cached copy to the end user.
This allows geographically dispersed users to minimize the number of hops needed to receive static content, fetching the content directly from a nearby edge’s cache. The result is significantly decreased latencies and packet loss, faster page load times, and drastically reduced load on your origin infrastructure.
CDN providers often offer additional features such as DDoS mitigation and rate-limiting, user analytics, and optimizations for streaming or mobile use cases at additional cost.
When a user visits your website, they first receive a response from a DNS server containing the IP address of your host web server. Their browser then requests the web page content, which often consists of a variety of static files, such as HTML pages, CSS stylesheets, JavaScript code, and images.
Once you roll out a CDN and offload these static assets onto CDN servers, either by “pushing” them out manually or having the CDN “pull” the assets automatically (both mechanisms are covered in the next section), you then instruct your web server to rewrite links to static content such that these links now point to files hosted by the CDN. If you’re using a CMS such as WordPress, this link rewriting can be implemented using a third-party plugin like CDN Enabler.
Many CDNs provide support for custom domains, allowing you to create a CNAME record under your domain pointing to a CDN endpoint. Once the CDN receives a user request at this endpoint (located at the edge, much closer to the user than your backend servers), it then routes the request to the Point of Presence (PoP) located closest to the user. This PoP often consists of one or more CDN edge servers collocated at an Internet Exchange Point (IxP), essentially a data center that Internet Service Providers (ISPs) use to interconnect their networks. The CDN’s internal load balancer then routes the request to an edge server located at this PoP, which then serves the content to the user.
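For example, if you had created a hypothetical subdomain assets.example.com and pointed it at your CDN with a CNAME record, you could confirm the mapping from the command line with dig; both hostnames below are placeholders rather than real CDN endpoints:
- dig +short assets.example.com CNAME
cdn-endpoint.example-cdn.net.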
Caching mechanisms vary across CDN providers, but generally they work as follows:
1. When the CDN receives its first request for a given asset, it does not yet have a copy cached at that edge location, so it fetches one from another nearby edge server or from your origin. This is a cache miss, often indicated by a response header such as X-Cache: MISS. This initial request will be slower than future requests because after completing this request the asset will have been cached at the edge.
2. Subsequent requests for that asset routed to the same edge location are served directly from cache until the cached copy expires, and are typically indicated by a header such as X-Cache: HIT.
To learn more about how a specific CDN works and has been implemented, consult your CDN provider’s documentation.
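Because many CDNs report this cache status in a response header, you can often check whether a given asset was served from the edge cache with a quick curl request. The asset URL below is a placeholder, and the exact header name varies by provider (some use X-Cache, others CF-Cache-Status or X-Cache-Status):
- curl -sI https://assets.example.com/images/logo.png | grep -i x-cache
x-cache: HIT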
In the next section, we’ll introduce the two popular types of CDNs: push and pull CDNs.
Most CDN providers offer two ways of caching your data: pull zones and push zones.
Pull Zones involve entering your origin server’s address, and letting the CDN automatically fetch and cache all the static resources available on your site. Pull zones are commonly used to deliver frequently updated, small to medium sized web assets like HTML, CSS, and JavaScript files. After providing the CDN with your origin server’s address, the next step is usually rewriting links to static assets such that they now point to the URL provided by the CDN. From that point onwards, the CDN will handle your users’ incoming asset requests and serve content from its geographically distributed caches and your origin as appropriate.
To use a Push Zone, you upload your data to a designated bucket or storage location, which the CDN then pushes out to caches on its distributed fleet of edge servers. Push zones are typically used for larger, infrequently changing files, like archives, software packages, PDFs, video, and audio files.
Almost any site can reap the benefits provided by rolling out a CDN, but generally the core reasons for implementing one are to offload bandwidth from your origin servers onto the CDN servers, and to reduce latency for geographically distributed users.
We’ll go through these and several of the other major advantages afforded by using a CDN below.
If you’re nearing bandwidth capacity on your servers, offloading static assets like images, videos, CSS and JavaScript files will drastically reduce your servers’ bandwidth usage. Content delivery networks are designed and optimized for serving static content, and client requests for this content will be routed to and served by edge CDN servers. This has the added benefit of reducing load on your origin servers, as they then serve this data at a much lower frequency.
If your user base is geographically dispersed, and a non-trivial portion of your traffic comes from a distant geographical area, a CDN can decrease latency by caching static assets on edge servers closer to your users. By reducing the distance between your users and static content, you can more quickly deliver content to your users and improve their experience by boosting page load speeds.
These benefits are compounded for websites serving primarily bandwidth-intensive video content, where high latencies and slow loading times more directly impact user experience and content engagement.
CDNs allow you to handle large traffic spikes and bursts by load balancing requests across a large, distributed network of edge servers. By offloading and caching static content on a delivery network, you can accommodate a larger number of simultaneous users with your existing infrastructure.
For websites using a single origin server, these large traffic spikes can often overwhelm the system, causing unplanned outages and downtime. Shifting traffic onto highly available and redundant CDN infrastructure, designed to handle variable levels of web traffic, can increase the availability of your assets and content.
As serving static content usually makes up the majority of your bandwidth usage, offloading these assets onto a content delivery network can drastically reduce your monthly infrastructure spend. In addition to reducing bandwidth costs, a CDN can decrease server costs by reducing load on the origin servers, enabling your existing infrastructure to scale. Finally, some CDN providers offer fixed-price monthly billing, allowing you to transform your variable monthly bandwidth usage into a stable, predictable recurring spend.
Another common use case for CDNs is DDoS attack mitigation. Many CDN providers include features to monitor and filter requests to edge servers. These services analyze web traffic for suspicious patterns, blocking malicious attack traffic while continuing to allow reputable user traffic through. CDN providers usually offer a variety of DDoS mitigation services, from common attack protection at the infrastructure level (OSI layers 3 and 4), to more advanced mitigation services and rate limiting.
In addition, most CDNs let you configure full SSL, so that you can encrypt traffic between the CDN and the end user, as well as traffic between the CDN and your origin servers, using either CDN-provided or custom SSL certificates.
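If you want to verify which certificate the CDN edge presents to end users once full SSL is configured, you can inspect it directly from the command line; the hostname below is a placeholder:
- openssl s_client -connect assets.example.com:443 -servername assets.example.com </dev/null 2>/dev/null | openssl x509 -noout -subject -issuer -dates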
If your bottleneck is CPU load on the origin server, and not bandwidth, a CDN may not be the most appropriate solution. In this case, local caching using popular caches such as NGINX or Varnish may significantly reduce load by serving assets from system memory.
Before rolling out a CDN, additional optimization steps — like minifying and compressing JavaScript and CSS files, and enabling web server HTTP request compression — can also have a significant impact on page load times and bandwidth usage.
A helpful tool to measure your page load speed and improve it is Google’s PageSpeed Insights. Another helpful tool that provides a waterfall breakdown of request and response times as well as suggested optimizations is Pingdom.
A content delivery network can be a quick and effective solution for improving the scalability and availability of your web sites. By caching static assets on a geographically distributed network of optimized servers, you can greatly reduce page load times and latencies for end users. In addition, CDNs allow you to significantly reduce your bandwidth usage by absorbing user requests and responding from cache at the edge, thus lowering your bandwidth and infrastructure costs.
With plugins and third-party support for major frameworks like WordPress, Drupal, Django, and Ruby on Rails, as well as additional features like DDoS mitigation, full SSL, user monitoring, and asset compression, CDNs can be an impactful tool for securing and optimizing high-traffic web sites.
This article is deprecated and no longer maintained.
We now provide Git setup instructions for each platform individually.
This article may still be useful as a reference but may not follow best practices. We strongly recommend using a more recent article.
Open-source projects that are hosted in public repositories benefit from contributions made by the broader developer community, and are typically managed through Git.
A distributed version control system, Git helps both individuals and teams contribute to and maintain open-source software projects. Free to download and use, Git is an example of an open-source project itself.
This tutorial will discuss the benefits of contributing to open-source projects, and go over installing and setting up Git so that you can contribute to software projects.
Open-source software is software that is freely available to use, redistribute, and modify.
Projects that follow the open-source development model encourage a transparent process that is advanced through distributed peer review. Open-source projects can be updated quickly and as needed, and offer reliable and flexible software that is not built on locked proprietary systems.
Contributing to open-source projects helps ensure that they are as good as they can be and representative of the broad base of technology end-users. When end-users contribute to open-source projects through code or documentation, their diverse perspectives provide added value to the project, the project’s end-users, and the larger developer community.
The best way to begin to contribute to open-source projects is to start by contributing to software that you already use. As a user of a particular tool, you best understand what functionalities would be most valuable to the project. Make sure you read any available documentation about the software first. In fact, many open-source projects will have a CONTRIBUTING.md
file in the root directory, which you should read carefully before you contribute. You may also want to get a sense of the interactions between other developers in the community if there are forums about the project available.
Finally, if you’re starting out with contributing to open-source software, it is a good idea to start with something small — each contribution is valuable. You may want to start with fixing typos, adding comments, or writing clearer documentation.
One of the most popular version control systems for software is Git. Git was created in 2005 by Linus Torvalds, the creator of the Linux kernel, and was originally used for development of the kernel itself. Junio Hamano is the current maintainer of the project.
Many projects maintain their files in a Git repository, and sites like GitHub, GitLab, and Bitbucket have streamlined the process of sharing and contributing to code. Every working directory in Git is a full-fledged repository with complete history and tracking independent of network access or a central server.
Version control has become an indispensable tool in modern software development because these systems allow you to keep track of software at the source level. You and other members of a development team can track changes, revert to previous stages, and branch off from the base code to create alternative versions of files and directories.
Git is so useful for open-source projects because it facilitates the contributions of many developers. Each contributor can branch off from the main or master branch of the code base repository to isolate their own changes, and can then make a pull request to have these changes integrated into the main project.
To use Git to contribute to open-source projects, let’s check if Git is installed, and if it’s not, let’s go through how to install it on your local machine.
First, you will want to check if you have Git command line tools installed on your computer. If you have been making repositories of your own code, then you likely have Git installed on your local machine. Some operating systems also come with Git installed, so it is worth checking before you install.
You can check whether Git is installed and what version you are using by opening up a terminal window in Linux or Mac, or a command prompt window in Windows, and typing the following command:
- git --version
If Git is installed, the command will report the version you have available. However, if Git is not installed, you will receive an error similar to one of the following:
-bash: git: command not found
'git' is not recognized as an internal or external command, operable program, or batch file.
In this case, you should install Git into your machine. Let’s go through installation for several of the major operating systems.
By far the easiest way of getting Git installed and ready to use is by using your version of Linux’s default repositories. Let’s go through how to install Git on your local Linux machine using this method.
You can use the APT package management tools to update your local package index. After, you can download and install the program:
- sudo apt update
- sudo apt install git
While this is the fastest method of installing Git, the version may be older than the newest version. If you need the latest release, consider compiling Git from source by using this guide.
From here, you can continue on to the section on Setting Up Git.
We’ll be using yum
, CentOS’s native package manager, to search for and install the latest Git package available in CentOS’s repositories.
Let’s first make sure that yum is up to date by running this command:
- sudo yum -y update
The -y
flag is used to alert the system that we are aware that we are making changes, preventing the terminal from prompting us to confirm.
Now, we can go ahead and install Git:
- sudo yum install git
While this is the fastest method of installing Git, the version may be older than the newest version. If you need the latest release, consider compiling Git from source by following Option 2 from this guide.
From here, you can continue on to the section on Setting Up Git.
Git packages for Fedora are available through both yum
and dnf
. Introduced in Fedora 18, DNF, or Dandified Yum, has been the default package manager for Fedora since Fedora 22.
From your terminal window, update dnf and install Git:
- sudo dnf update
- sudo dnf install git
If you have an older version of Fedora, you can use the yum
command instead. Let’s first update yum
, then install Git:
- sudo yum update
- sudo yum install git
From here, you can continue on to the section on Setting Up Git.
On a local Macintosh computer, if you type a Git command into your Terminal window (as in git --version
above), you’ll be prompted to install Git if it is not already on your system. When you receive this prompt, you should agree to have Git installed and follow the instructions and respond to the prompts in your Terminal window.
You can also install the most recent version of Git onto your Mac through the binary installer. There is an OS X Git installer maintained and available for download through the official Git website.
Once Git is fully installed, you can continue on to the section on Setting Up Git.
For Windows, the official build is available for you to download through the Git website.
There is also an open-source project called Git for Windows, which is separate from the official Git website. This tool provides both command line and graphical user interface tools for using Git effectively on your Windows machine. For more information about this project and to inspect and download the code, visit the Git for Windows project site.
Once Git is fully installed, you can continue on to the section on Setting Up Git.
Now that you have Git installed, you need to do a few things so that the commit messages that will be generated for you will contain your correct information.
The easiest way of doing this is through the git config
command. Specifically, we need to provide our name and email address because Git embeds this information into each commit we do. We can go ahead and add this information by typing:
- git config --global user.name "Your Name"
- git config --global user.email "youremail@domain.com"
We can review all of the configuration items that have been set by typing:
- git config --list
user.name=Your Name
user.email=youremail@domain.com
As you may notice, this has a slightly different format. The information is stored in your Git configuration file, which you can optionally edit by hand with a text editor, like nano for example:
- nano ~/.gitconfig
[user]
name = Your Name
email = youremail@domain.com
Once you’re done editing your file, you can exit nano by typing the control and x keys, and when prompted to save the file, press y.
There are many other options that you can set, but these are the two essential ones needed to prevent warnings in the future.
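As an example of those additional options, two settings that are commonly configured are your preferred text editor for commit messages and the default branch name used when creating new repositories (the latter is available in Git 2.28 and later); both are optional:
- git config --global core.editor nano
- git config --global init.defaultBranch main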
With Git installed and set up on your local machine, you are now ready to use Git for version control of your own software projects as well as contribute to open-source projects that are open to the public.
Adding your own contributions to open-source software is a great way to become more involved in the broader developer community, and help to ensure that software made for the public is of high quality and fully representative of the end-users.
When you maintain an open-source software repository, you’re taking on a leadership role. Whether you’re the founder of a project who released it to the public for use and contributions, or you’re working on a team and are maintaining one specific aspect of the project, you are going to be providing an important service to the larger developer community.
While open-source contributions through pull requests from the developer community are crucial for ensuring that software is as useful as it can be for end users, maintainers have a real impact on shaping the overall project. Repository maintainers are extremely involved in the open-source projects they manage, from day-to-day organization and development, to interfacing with the public and providing prompt and effective feedback to contributors.
This guide will take you through some tips for maintaining public repositories of open-source software. Being a leader of an open-source project comes with both technical and non-technical responsibilities to help foster a user-base and community around your project. Taking on the role of a maintainer is an opportunity to learn from others, get experience with project management, and watch your project grow and change as your users become invested contributors.
Documentation that is thorough, well-organized, and serves the intended communities of your project will help expand your user base. Over time, your user base will become the contributors to your open-source project.
Since you’ll be thinking through the code you are creating anyway, and may even be jotting down notes, it can be worthwhile to incorporate documentation as part of your development process while it is fresh in your mind. You may even want to consider writing the documentation before the code, following the philosophy of a documentation-driven development approach that documents features first and develops those features after writing out what they will do.
Along with your code, there are a few files of documentation that you’ll want to keep in your top-level directory:
- A README.md file that provides a summary of the project and your goals.
- A CONTRIBUTING.md file with contribution instructions.
Documentation can come in many forms and can target different audiences. As part of your documentation, and depending on the scope of your work, you may decide to produce one or more kinds of documentation.
Your project may be better suited to certain kinds of documentation than others, but providing more than one approach to the software will help your user base better understand how to interact with your work.
When writing documentation, or recording voice for a video, it is important to be as clear as possible. It is best to make no assumptions about the technical ability of your audience. You’ll also want to approach your documentation from the top down — that is, explain what your software does in a general way (e.g., automate server tasks, build a website, animate sprites for game development), before going into details.
Though English has become a universal language in the technology sphere, you’ll still want to consider who your expected users are and how to reach them. English may be the best choice to have access to a broad user base, but you’ll want to keep in mind that many people are approaching your documentation as non-native English speakers, so work to favor accessible language that will not confuse your readers or viewers.
Try to write documentation as though you are writing to a collaborator who needs to be brought up to speed on the current project; after all, you’ll want to encourage potential contributors to make pull requests to the project.
Issues are typically a way to keep track of or report bugs, or to request new features to be added to the code base. Open-source repository hosting services like GitHub, GitLab, and Bitbucket will provide you with an interface for yourself and others to keep track of issues within your repository. When releasing open-source code to the public, you should expect to have issues opened by the community of users. Organizing and prioritizing issues will give you a good road map of upcoming work on your project.
Because any user can file an issue, not all issues will be reporting bugs or be feature requests; you may receive questions via the issue tracker tool, or you may receive requests for smaller enhancements to the user interface, for example. It is best to organize these issues as much as possible and to be communicative to the users who are creating these issues.
Issues should represent concrete tasks that need to be done on the source code, and you will need to prioritize them accordingly. You and your team will have an understanding of the amount of time and energy you or contributors can devote to filed issues, and together you can work collaboratively to make decisions and create an actionable plan. When you know you won’t be able to get to a particular issue within a quick timeframe, you can still comment on the issue to let the user know that you have read the issue and that you’ll get to it when you can, and if you are able to you can provide an expected timeline for when you can review the issue again.
For issues that are feature requests or enhancements, you can ask the person who filed the issue whether they are able to contribute code themselves. You can direct them to the CONTRIBUTING.md
file and any other relevant documentation.
Since questions often do not represent concrete tasks, commenting on the issue to courteously direct the user to relevant documentation can be a good option to keep your interactions professional and kind. If documentation for this question does not exist, now is a great time to add the relevant documentation, and express your thanks to the user for identifying this oversight. If you are getting a lot of questions via issues, you may consider creating a FAQ section of your documentation, or a wiki or forum for others to participate in question-answering.
Whenever a user reports an issue, try to be as kind and gracious as possible. Issues are indicators that users like your software and want to make it better!
Working to organize issues as best you can will keep your project up to date and relevant to its community of users. Remove issues that are outside of the scope of your project or that have become stale, and prioritize the others so that you are able to make continuous progress.
You can improve the efficiency and quality of your project by automating maintenance tasks and testing. Automated maintenance and testing can continuously check the accuracy of your code and provide a more formal process for approving contributor submissions. This helps free up time so you can focus on the most important aspects of your project. The best news is that there are plenty of tools that have already been developed and may fulfill the needs of your project.
Set up automatic testing for incoming contributions by requiring status checks. Make sure to include information about how testing works for your project in a CONTRIBUTING.md
file.
Check out the tools that have been developed to automate maintenance tasks. Several possibilities include automating your releases and code review, or closing issues if an author doesn’t respond when information is requested.
Keep in mind that less is more. Be intentional about the processes and tasks you choose to automate in ways that can optimize efficiency, production, and quality for your project, yourself, and contributors.
The more you welcome contributors to your project and reward their efforts, the more likely you’ll be to encourage more contributions. To get people started, you’ll want to include a CONTRIBUTING.md
file in the top-level of your repository, and a pointer to that file in your README.md
file.
A good file on contributing will outline how to get started working on the project as a developer. You may want to offer a step-by-step guide, or provide a checklist for developers to follow, explaining how to successfully get their code merged into the project through a pull request.
In addition to documentation on how to contribute to the project, don’t forget to keep the code consistent and readable throughout. Code that is easy to understand through comments and clear and consistent usage will go a long way to making contributors feel like they can jump in on the project.
Finally, maintain a list of contributors or authors. You can invite contributors to add themselves to the list no matter what their contribution (even fixing typos is valuable, and can lead to more contributions in the future). This provides a way to recognize contributors for their work on the project in a public-facing way that they can point to, while also making others aware of how well contributors are treated.
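If you want a starting point for such a list, Git can summarize everyone who has committed to the repository; you can then edit the output and let contributors adjust how they are credited:
- git shortlog --summary --numbered --email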
By empowering users through documentation, being responsive to issues, and encouraging them to participate, you are already well on your way to building out the community around your open-source project. Users that you keep happy and who you treat as collaborators will in turn promote your software.
Additionally, you can work to promote your project through various avenues.
You’ll want to tailor your promotion to the scope of your project and the number of active team members and contributors you have working with you.
As your community grows, you can provide more spaces for contributors, users, and maintainers to interact.
Consider your core user base and the scope of your project — including the number of people who are maintaining the project and the resources you have available — before rolling out these potential spaces, and seek feedback from your community about what works for them.
Above all, it is important to be kind and show some love in all of your interactions with your community. Being a gracious maintainer can be difficult, but it will pay off for your project down the line.
Repository maintainers are incredibly important within the larger open-source community. Though it requires significant investment and hard work, it is often a rewarding experience that allows you to grow as a developer and a contributor. Being an approachable and kind maintainer can go a long way to advance the development of a project that you care about.
The Hypertext Transfer Protocol, or HTTP, is an application protocol that has been the de facto standard for communication on the World Wide Web since its invention in 1989. From the release of HTTP/1.1 in 1997 until recently, there have been few revisions to the protocol. But in 2015, a reimagined version called HTTP/2 came into use, which offered several methods to decrease latency, especially when dealing with mobile platforms and server-intensive graphics and videos. HTTP/2 has since become increasingly popular, with some estimates suggesting that around a third of all websites in the world support it. In this changing landscape, web developers can benefit from understanding the technical differences between HTTP/1.1 and HTTP/2, allowing them to make informed and efficient decisions about evolving best practices.
After reading this article, you will understand the main differences between HTTP/1.1 and HTTP/2, concentrating on the technical changes HTTP/2 has adopted to achieve a more efficient Web protocol.
To contextualize the specific changes that HTTP/2 made to HTTP/1.1, let’s first take a high-level look at the historical development and basic workings of each.
Developed by Timothy Berners-Lee in 1989 as a communication standard for the World Wide Web, HTTP is a top-level application protocol that exchanges information between a client computer and a local or remote web server. In this process, a client sends a text-based request to a server by calling a method like GET
or POST
. In response, the server sends a resource like an HTML page back to the client.
For example, let’s say you are visiting a website at the domain www.example.com
. When you navigate to this URL, the web browser on your computer sends an HTTP request in the form of a text-based message, similar to the one shown here:
GET /index.html HTTP/1.1
Host: www.example.com
This request uses the GET
method, which asks for data from the host server listed after Host:
. In response to this request, the example.com
web server returns an HTML page to the requesting client, in addition to any images, stylesheets, or other resources called for in the HTML. Note that not all of the resources are returned to the client in the first call for data. The requests and responses will go back and forth between the server and client until the web browser has received all the resources necessary to render the contents of the HTML page on your screen.
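If you’d like to watch this exchange yourself, a command-line client like curl will print the request headers it sends (prefixed with >) followed by the response headers and HTML it receives (prefixed with <). The User-Agent value below will vary with your curl version, and the response you get from example.com may differ:
- curl -v http://www.example.com/index.html
> GET /index.html HTTP/1.1
> Host: www.example.com
> User-Agent: curl/8.5.0
> Accept: */*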
You can think of this exchange of requests and responses as a single application layer of the internet protocol stack, sitting on top of the transport layer (usually using the Transmission Control Protocol, or TCP) and the networking layer (using the Internet Protocol, or IP).
There is much to discuss about the lower levels of this stack, but in order to gain a high-level understanding of HTTP/2, you only need to know this abstracted layer model and where HTTP figures into it.
With this basic overview of HTTP/1.1 out of the way, we can now move on to recounting the early development of HTTP/2.
HTTP/2 began as the SPDY protocol, developed primarily at Google with the intention of reducing web page load latency by using techniques such as compression, multiplexing, and prioritization. This protocol served as a template for HTTP/2 when the Hypertext Transfer Protocol working group httpbis of the IETF (Internet Engineering Task Force) put the standard together, culminating in the publication of HTTP/2 in May 2015. From the beginning, many browsers supported this standardization effort, including Chrome, Opera, Internet Explorer, and Safari. Due in part to this browser support, there has been a significant adoption rate of the protocol since 2015, with especially high rates among new sites.
From a technical point of view, one of the most significant features that distinguishes HTTP/1.1 and HTTP/2 is the binary framing layer, which can be thought of as a part of the application layer in the internet protocol stack. As opposed to HTTP/1.1, which keeps all requests and responses in plain text format, HTTP/2 uses the binary framing layer to encapsulate all messages in binary format, while still maintaining HTTP semantics, such as verbs, methods, and headers. An application level API would still create messages in the conventional HTTP formats, but the underlying layer would then convert these messages into binary. This ensures that web applications created before HTTP/2 can continue functioning as normal when interacting with the new protocol.
The conversion of messages into binary allows HTTP/2 to try new approaches to data delivery not available in HTTP/1.1, a contrast that is at the root of the practical differences between the two protocols. The next section will take a look at the delivery model of HTTP/1.1, followed by what new models are made possible by HTTP/2.
As mentioned in the previous section, HTTP/1.1 and HTTP/2 share semantics, ensuring that the requests and responses traveling between the server and client in both protocols reach their destinations as traditionally formatted messages with headers and bodies, using familiar methods like GET
and POST
. But while HTTP/1.1 transfers these in plain-text messages, HTTP/2 encodes these into binary, allowing for significantly different delivery model possibilities. In this section, we will first briefly examine how HTTP/1.1 tries to optimize efficiency with its delivery model and the problems that come up from this, followed by the advantages of the binary framing layer of HTTP/2 and a description of how it prioritizes requests.
The first response that a client receives on an HTTP GET
request is often not the fully rendered page. Instead, it contains links to additional resources needed by the requested page. The client discovers that the full rendering of the page requires these additional resources from the server only after it downloads the page. Because of this, the client will have to make additional requests to retrieve these resources. In HTTP/1.0, the client had to break and remake the TCP connection with every new request, a costly affair in terms of both time and resources.
HTTP/1.1 takes care of this problem by introducing persistent connections and pipelining. With persistent connections, HTTP/1.1 assumes that a TCP connection should be kept open unless directly told to close. This allows the client to send multiple requests along the same connection without waiting for a response to each, greatly improving the performance of HTTP/1.1 over HTTP/1.0.
Unfortunately, there is a natural bottleneck to this optimization strategy. Since multiple data packets cannot pass each other when traveling to the same destination, there are situations in which a request at the head of the queue that cannot retrieve its required resource will block all the requests behind it. This is known as head-of-line (HOL) blocking, and is a significant problem with optimizing connection efficiency in HTTP/1.1. Adding separate, parallel TCP connections could alleviate this issue, but there are limits to the number of concurrent TCP connections possible between a client and server, and each new connection requires significant resources.
These problems were at the forefront of the minds of HTTP/2 developers, who proposed to use the aforementioned binary framing layer to fix these issues, a topic you will learn more about in the next section.
In HTTP/2, the binary framing layer encodes requests/responses and cuts them up into smaller packets of information, greatly increasing the flexibility of data transfer.
Let’s take a closer look at how this works. As opposed to HTTP/1.1, which must make use of multiple TCP connections to lessen the effect of HOL blocking, HTTP/2 establishes a single connection object between the two machines. Within this connection there are multiple streams of data. Each stream consists of multiple messages in the familiar request/response format. Finally, each of these messages is split into smaller units called frames.
At the most granular level, the communication channel consists of a bunch of binary-encoded frames, each tagged to a particular stream. The identifying tags allow the connection to interleave these frames during transfer and reassemble them at the other end. The interleaved requests and responses can run in parallel without blocking the messages behind them, a process called multiplexing. Multiplexing resolves the head-of-line blocking issue in HTTP/1.1 by ensuring that no message has to wait for another to finish. This also means that servers and clients can send concurrent requests and responses, allowing for greater control and more efficient connection management.
Since multiplexing allows the client to construct multiple streams in parallel, these streams only need to make use of a single TCP connection. Having a single persistent connection per origin improves upon HTTP/1.1 by reducing the memory and processing footprint throughout the network. This results in better network and bandwidth utilization and thus decreases the overall operational cost.
A single TCP connection also improves the performance of the HTTPS protocol, since the client and server can reuse the same secured session for multiple requests/responses. In HTTPS, during the TLS or SSL handshake, both parties agree on the use of a single key throughout the session. If the connection breaks, a new session starts, requiring a newly generated key for further communication. Thus, maintaining a single connection can greatly reduce the resources required for HTTPS performance. Note that, though HTTP/2 specifications do not make it mandatory to use the TLS layer, many major browsers only support HTTP/2 with HTTPS.
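You can check whether a given server negotiates HTTP/2 from the command line with a reasonably recent build of curl, which can print the protocol version it ended up using. The URL below is a placeholder, and the output depends on what the server supports; it prints 2 when HTTP/2 is negotiated, or 1.1 otherwise:
- curl --silent --http2 --output /dev/null --write-out "%{http_version}\n" https://www.example.com/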
Although the multiplexing inherent in the binary framing layer solves certain issues of HTTP/1.1, multiple streams awaiting the same resource can still cause performance issues. The design of HTTP/2 takes this into account, however, by using stream prioritization, a topic we will discuss in the next section.
Stream prioritization not only solves the possible issue of requests competing for the same resource, but also allows developers to customize the relative weight of requests to better optimize application performance. In this section, we will break down the process of this prioritization in order to provide better insight into how you can leverage this feature of HTTP/2.
As you know now, the binary framing layer organizes messages into parallel streams of data. When a client sends concurrent requests to a server, it can prioritize the responses it is requesting by assigning a weight between 1 and 256 to each stream. The higher number indicates higher priority. In addition to this, the client also states each stream’s dependency on another stream by specifying the ID of the stream on which it depends. If the parent identifier is omitted, the stream is considered to be dependent on the root stream. This is illustrated in the following figure:
In the illustration, the channel contains six streams, each with a unique ID and associated with a specific weight. Stream 1 does not have a parent ID associated with it and is by default associated with the root node. All other streams have some parent ID marked. The resource allocation for each stream will be based on the weight that they hold and the dependencies they require. Streams 5 and 6 for example, which in the figure have been assigned the same weight and same parent stream, will have the same prioritization for resource allocation.
The server uses this information to create a dependency tree, which allows the server to determine the order in which the requests will retrieve their data. Based on the streams in the preceding figure, the dependency tree will be as follows:
In this dependency tree, stream 1 is dependent on the root stream and there is no other stream derived from the root, so all the available resources will allocate to stream 1 ahead of the other streams. Since the tree indicates that stream 2 depends on the completion of stream 1, stream 2 will not proceed until the stream 1 task is completed. Now, let us look at streams 3 and 4. Both these streams depend on stream 2. As in the case of stream 1, stream 2 will get all the available resources ahead of streams 3 and 4. After stream 2 completes its task, streams 3 and 4 will get the resources; these are split in the ratio of 2:4 as indicated by their weights, resulting in a higher chunk of the resources for stream 4. Finally, when stream 3 finishes, streams 5 and 6 will get the available resources in equal parts. This can happen before stream 4 has finished its task, even though stream 4 receives a higher chunk of resources; streams at a lower level are allowed to start as soon as the dependent streams on an upper level have finished.
As an application developer, you can set the weights in your requests based on your needs. For example, you may assign a lower priority for loading an image with high resolution after providing a thumbnail image on the web page. By providing this facility of weight assignment, HTTP/2 enables developers to gain better control over web page rendering. The protocol also allows the client to change dependencies and reallocate weights at runtime in response to user interaction. It is important to note, however, that a server may change assigned priorities on its own if a certain stream is blocked from accessing a specific resource.
In any TCP connection between two machines, both the client and the server have a certain amount of buffer space available to hold incoming requests that have not yet been processed. These buffers offer flexibility to account for numerous or particularly large requests, in addition to uneven speeds of downstream and upstream connections.
There are situations, however, in which a buffer is not enough. For example, the server may be pushing a large amount of data at a pace that the client application is not able to cope with due to a limited buffer size or a lower bandwidth. Likewise, when a client uploads a huge image or a video to a server, the server buffer may overflow, causing some additional packets to be lost.
In order to avoid buffer overflow, a flow control mechanism must prevent the sender from overwhelming the receiver with data. This section will provide an overview of how HTTP/1.1 and HTTP/2 use different versions of this mechanism to deal with flow control according to their different delivery models.
In HTTP/1.1, flow control relies on the underlying TCP connection. When this connection initiates, both client and server establish their buffer sizes using their system default settings. If the receiver’s buffer is partially filled with data, it will tell the sender its receive window, i.e., the amount of available space that remains in its buffer. This receive window is advertised in a signal known as an ACK packet, which is the data packet that the receiver sends to acknowledge that it received the opening signal. If this advertised receive window size is zero, the sender will send no more data until the client clears its internal buffer and then requests to resume data transmission. It is important to note here that using receive windows based on the underlying TCP connection can only implement flow control on either end of the connection.
Because HTTP/1.1 relies on the transport layer to avoid buffer overflow, each new TCP connection requires a separate flow control mechanism. HTTP/2, however, multiplexes streams within a single TCP connection, and will have to implement flow control in a different manner.
HTTP/2 multiplexes streams of data within a single TCP connection. As a result, receive windows on the level of the TCP connection are not sufficient to regulate the delivery of individual streams. HTTP/2 solves this problem by allowing the client and server to implement their own flow controls, rather than relying on the transport layer. The application layer communicates the available buffer space, allowing the client and server to set the receive window on the level of the multiplexed streams. This fine-scale flow control can be modified or maintained after the initial connection via a WINDOW_UPDATE
frame.
Since this method controls data flow on the level of the application layer, the flow control mechanism does not have to wait for a signal to reach its ultimate destination before adjusting the receive window. Intermediary nodes can use the flow control settings information to determine their own resource allocations and modify accordingly. In this way, each intermediary server can implement its own custom resource strategy, allowing for greater connection efficiency.
This flexibility in flow control can be advantageous when creating appropriate resource strategies. For example, the client may fetch the first scan of an image, display it to the user, and allow the user to preview it while fetching more critical resources. Once the client fetches these critical resources, the browser will resume the retrieval of the remaining part of the image. Deferring the implementation of flow control to the client and server can thus improve the perceived performance of web applications.
In terms of flow control and the stream prioritization mentioned in an earlier section, HTTP/2 provides a more detailed level of control that opens up the possibility of greater optimization. The next section will explain another method unique to the protocol that can enhance a connection in a similar way: predicting resource requests with server push.
In a typical web application, the client will send a GET
request and receive a page in HTML, usually the index page of the site. While examining the index page contents, the client may discover that it needs to fetch additional resources, such as CSS and JavaScript files, in order to fully render the page. The client determines that it needs these additional resources only after receiving the response from its initial GET
request, and thus must make additional requests to fetch these resources and complete putting the page together. These additional requests ultimately increase the connection load time.
There are solutions to this problem, however: since the server knows in advance that the client will require additional files, the server can save the client time by sending these resources to the client before it asks for them. HTTP/1.1 and HTTP/2 have different strategies of accomplishing this, each of which will be described in the next section.
In HTTP/1.1, if the developer knows in advance which additional resources the client machine will need to render the page, they can use a technique called resource inlining to include the required resource directly within the HTML document that the server sends in response to the initial GET
request. For example, if a client needs a specific CSS file to render a page, inlining that CSS file will provide the client with the needed resource before it asks for it, reducing the total number of requests that the client must send.
But there are a few problems with resource inlining. Including the resource in the HTML document is a viable solution for smaller, text-based resources, but larger files in non-text formats can greatly increase the size of the HTML document, which can ultimately decrease the connection speed and nullify the original advantage gained from using this technique. Also, since the inlined resources are no longer separate from the HTML document, there is no mechanism for the client to decline resources that it already has, or to place a resource in its cache. If multiple pages require the resource, each new HTML document will have the same resource inlined in its code, leading to larger HTML documents and longer load times than if the resource were simply cached in the beginning.
A major drawback of resource inlining, then, is that the client cannot separate the resource and the document. A finer level of control is needed to optimize the connection, a need that HTTP/2 seeks to meet with server push.
Since HTTP/2 enables multiple concurrent responses to a client’s initial GET
request, a server can send a resource to a client along with the requested HTML page, providing the resource before the client asks for it. This process is called server push. In this way, an HTTP/2 connection can accomplish the same goal of resource inlining while maintaining the separation between the pushed resource and the document. This means that the client can decide to cache or decline the pushed resource separate from the main HTML document, fixing the major drawback of resource inlining.
In HTTP/2, this process begins when the server sends a PUSH_PROMISE
frame to inform the client that it is going to push a resource. This frame includes only the header of the message, and allows the client to know ahead of time which resource the server will push. If it already has the resource cached, the client can decline the push by sending a RST_STREAM
frame in response. The PUSH_PROMISE
frame also saves the client from sending a duplicate request to the server, since it knows which resources the server is going to push.
It is important to note here that the emphasis of server push is client control. If a client needed to adjust the priority of server push, or even disable it, it could at any time send a SETTINGS
frame to modify this HTTP/2 feature.
Although this feature has a lot of potential, server push is not always the answer to optimizing your web application. For example, some web browsers cannot always cancel pushed requests, even if the client already has the resource cached. If the client mistakenly allows the server to send a duplicate resource, the server push can use up the connection unnecessarily. In the end, server push should be used at the discretion of the developer. For more on how to strategically use server push and optimize web applications, check out the PRPL pattern developed by Google. To learn more about the possible issues with server push, see Jake Archibald’s blog post HTTP/2 push is tougher than I thought.
A common method of optimizing web applications is to use compression algorithms to reduce the size of HTTP messages that travel between the client and the server. HTTP/1.1 and HTTP/2 both use this strategy, but there are implementation problems in the former that prohibit compressing the entire message. The following section will discuss why this is the case, and how HTTP/2 can provide a solution.
Programs like gzip have long been used to compress the data sent in HTTP messages, especially to decrease the size of CSS and JavaScript files. The header component of a message, however, is always sent as plain text. Although each header is quite small, the burden of this uncompressed data weighs heavier and heavier on the connection as more requests are made, particularly penalizing complicated, API-heavy web applications that require many different resources and thus many different resource requests. Additionally, the use of cookies can sometimes make headers much larger, increasing the need for some kind of compression.
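To get a rough sense of this overhead, you can ask curl to report how many bytes of plain-text response headers a single HTTP/1.1 request returns; multiply that by the dozens of requests a typical page triggers and the cost becomes clear. The URL below is a placeholder:
- curl --silent --http1.1 --output /dev/null --write-out "%{size_header} bytes of response headers\n" https://www.example.com/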
In order to solve this bottleneck, HTTP/2 uses HPACK compression to shrink the size of headers, a topic discussed further in the next section.
One of the themes that has come up again and again in HTTP/2 is its ability to use the binary framing layer to exhibit greater control over finer detail. The same is true when it comes to header compression. HTTP/2 can split headers from their data, resulting in a header frame and a data frame. The HTTP/2-specific compression program HPACK can then compress this header frame. This algorithm can encode the header metadata using Huffman coding, thereby greatly decreasing its size. Additionally, HPACK can keep track of previously conveyed metadata fields and further compress them according to a dynamically altered index shared between the client and the server. For example, take the following two requests:
Request #1
method: GET
scheme: https
host: example.com
path: /academy
accept: image/jpeg
user-agent: Mozilla/5.0 ...
Request #2
method: GET
scheme: https
host: example.com
path: /academy/images
accept: image/jpeg
user-agent: Mozilla/5.0 ...
The various fields in these requests, such as method, scheme, host, accept, and user-agent, have the same values; only the path field uses a different value. As a result, when sending Request #2, the client can use HPACK to send only the indexed values needed to reconstruct these common fields and newly encode the path field. The resulting header frames will be as follows:
Request #1 header frame
method: GET
scheme: https
host: example.com
path: /academy
accept: image/jpeg
user-agent: Mozilla/5.0 ...
Request #2 header frame
path: /academy/images
Using HPACK and other compression methods, HTTP/2 provides one more feature that can reduce client-server latency.
As you can see from this point-by-point analysis, HTTP/2 differs from HTTP/1.1 in many ways, with some features providing greater levels of control that can be used to better optimize web application performance and other features simply improving upon the previous protocol. Now that you have gained a high-level perspective on the variations between the two protocols, you can consider how such factors as multiplexing, stream prioritization, flow control, server push, and compression in HTTP/2 will affect the changing landscape of web development.
If you would like to see a performance comparison between HTTP/1.1 and HTTP/2, check out this Google demo that compares the protocols for different latencies. Note that when you run the test on your computer, page load times may vary depending on several factors such as bandwidth, client and server resources available at the time of testing, and so on. If you’d like to study the results of more exhaustive testing, take a look at the article HTTP/2 – A Real-World Performance Test and Analysis. Finally, if you would like to explore how to build a modern web application, you could follow our How To Build a Modern Web Application to Manage Customer Information with Django and React on Ubuntu 18.04 tutorial, or set up your own HTTP/2 server with our How To Set Up Nginx with HTTP/2 Support on Ubuntu 20.04 tutorial.
Container images are the primary packaging format for defining applications within Kubernetes. Images are used as the basis for pods and other objects, and play an important role in efficiently leveraging Kubernetes’ features. Well-designed images are secure, highly performant, and focused. They are able to react to configuration data or instructions provided by Kubernetes. They implement endpoints that the deployment uses to understand their internal application state.
In this article, we’ll introduce some strategies for creating high quality images and discuss a few general goals to help guide your decisions when containerizing applications. We will focus on building images intended to be run on Kubernetes, but many of these suggestions apply equally to running containers on other orchestration platforms or in other contexts.
Before we go over specific actions to take when building container images, we will talk about what makes a good container image. What should your goals be when designing new images? Which characteristics and what behavior are most important?
Some qualities to aim for are:
A single, well-defined purpose
Container images should have a single discrete focus. Avoid thinking of container images as virtual machines, where it can make sense to package related functionality together. Instead, treat your container images like Unix utilities, maintaining a strict focus on doing one small thing well. Applications can be coordinated outside of an individual container’s scope to provide more complex functionality.
Generic design with the ability to inject configuration at runtime
Container images should be designed with reuse in mind when possible. For instance, the ability to adjust configuration at runtime is often required to fulfill basic requirements like testing your images before deploying to production. Small, generic images can be combined in different configurations to modify behavior without creating new images.
Small image size
Smaller images have a number of benefits in clustered environments like Kubernetes. They download quickly to new nodes and often have a smaller set of installed packages, which can improve security. Pared down container images make it simpler to debug problems by minimizing the amount of software involved.
Externally managed state
Containers in clustered environments experience a very volatile life cycle including planned and unplanned shutdowns due to resource scarcity, scaling, or node failures. To maintain consistency, aid in recovery and availability of your services, and to avoid losing data, it is critical that you store application state in a stable location outside of the container.
Easy to understand
It is important to try to keep container images as simple and easy to understand as possible. When troubleshooting, being able to directly view configurations and logs, or test container behavior, can help you reach a resolution faster. Thinking of container images as a packaging format for your application instead of a machine configuration can help you strike the right balance.
Follow containerized software best practices
Images should aim to work within the container model instead of acting against it. Avoid implementing conventional system administration practices, like including full init systems and daemonizing applications. Log to stdout so Kubernetes can expose data to administrators instead of using an internal logging daemon. These recommendations largely diverge from best practices for full operating systems.
Fully leverage Kubernetes features
Beyond conforming to the container model, it’s important to understand and reconcile with the tooling that Kubernetes provides. For example, providing endpoints for liveness and readiness checks or adjusting operations based on changes in the configuration or environment can help your applications use Kubernetes’ dynamic deployment environment to their advantage.
Now that we’ve established some of the qualities that define highly functional container images, we can dive deeper into strategies that help you achieve these goals.
We can start by examining the resources that container images are built from: base images. Each container image is built either from a parent image, an image used as a starting point, or from the abstract scratch layer, an empty image layer with no filesystem. A base image is a container image that serves as a foundation for future images by defining the basic operating system and providing core functionality. Images are comprised of one or more image layers built on top of one another to form a final image.
No standard utilities or filesystem are available when working directly from scratch, which means that you only have access to extremely limited functionality. While images created directly from scratch can be very streamlined and minimal, their main purpose is in defining base images. Typically, you want to build your container images on top of a parent image that sets up a basic environment that your applications run in so that you do not have to construct a complete system for every image.
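As a minimal sketch, assuming you have already built a statically linked binary named hello, a scratch-based Dockerfile might contain nothing more than the following:
FROM scratch
# assumes a statically linked binary named "hello" built beforehand
COPY hello /hello
CMD ["/hello"]
Because no shell, package manager, or libraries are present, an image like this is only practical for fully self-contained binaries; most applications are better served by starting from an established parent image.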
While there are base images for a variety of Linux distributions, it’s best to be deliberate about which systems you choose. Each new machine will have to download the parent image and any additional layers you’ve added. For large images, this can consume a significant amount of bandwidth and noticeably lengthen the startup time of your containers on their first run. There is no way to pare down an image that’s used as a parent downstream in the container build process, so starting with a minimal parent is a good idea.
Feature rich environments like Ubuntu allow your application to run in an environment you’re familiar with, but there are some tradeoffs to consider. Ubuntu images (and similar conventional distribution images) tend to be relatively large (over 100MB), meaning that any container images built from them will inherit that weight.
Alpine Linux is a popular alternative for base images because it successfully packages a lot of functionality into a very small base image (~ 5MB). It includes a package manager with sizable repositories and has most of the standard utilities you would expect from a minimal Linux environment.
When designing your applications, it’s a good idea to try to reuse the same parent for each image. When your images share a parent, machines running your containers will download the parent layer only once. Afterwards, they will only need to download the layers that differ between your images. This means that if you have common features or functionality you’d like to embed in each image, creating a common parent image to inherit from might be a good idea. Images that share a lineage help minimize the amount of extra data you need to download on fresh servers.
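As a rough sketch of this approach (the registry path and package choices here are hypothetical), a team might maintain one small shared parent and build every service image on top of it:
# shared parent, built once and pushed as registry.example.com/base:1.0 (hypothetical tag)
FROM alpine:3.18
RUN apk add --no-cache ca-certificates tzdata
# each service Dockerfile then begins with:
# FROM registry.example.com/base:1.0
Nodes that already have the shared parent cached only need to pull the layers that are unique to each service image.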
Once you’ve selected a parent image, you can define your container image by adding additional software, copying files, exposing ports, and choosing processes to run. Certain instructions in the image configuration file (e.g., a Dockerfile if you are using Docker) will add additional layers to your image.
For many of the same reasons mentioned in the previous section, it’s important to be mindful of how you add layers to your images due to the resulting size, inheritance, and runtime complexity. To avoid building large, unwieldy images, it’s important to develop a good understanding of how container layers interact, how the build engine caches layers, and how subtle differences in similar instructions can have a big impact on the images you create.
Docker creates a new image layer each time it executes a RUN, COPY, or ADD instruction. If you build the image again, the build engine will check each instruction to see if it has an image layer cached for the operation. If it finds a match in the cache, it uses the existing image layer rather than executing the instruction again and rebuilding the layer.
This process can significantly shorten build times, but it is important to understand the mechanism used to avoid potential problems. For file copying instructions like COPY and ADD, Docker compares the checksums of the files to see if the operation needs to be performed again. For RUN instructions, Docker checks to see if it has an existing image layer cached for that particular command string.
While it might not be immediately obvious, this behavior can cause unexpected results if you are not careful. A common example of this is updating the local package index and installing packages in two separate steps. We will be using Ubuntu for this example, but the basic premise applies equally well to base images for other distributions:
FROM ubuntu:20.04
RUN apt -y update
RUN apt -y install nginx
. . .
Here, the local package index is updated in one RUN instruction (apt -y update) and Nginx is installed in another operation. This works without issue when it is first used. However, if the Dockerfile is updated later to install an additional package, there may be problems:
FROM ubuntu:20.04
RUN apt -y update
RUN apt -y install nginx php-fpm
. . .
We’ve added a second package to the installation command run by the second instruction. If a significant amount of time has passed since the previous image build, the new build might fail. That’s because the package index update instruction (RUN apt -y update) has not changed, so Docker reuses the image layer associated with that instruction. Since we are using an old package index, the version of the php-fpm package we have in our local records may no longer be in the repositories, resulting in an error when the second instruction is run.
To avoid this scenario, be sure to consolidate any steps that are interdependent into a single RUN instruction so that Docker will re-execute all of the necessary commands when a change occurs. In shell scripting, chaining commands with the logical AND operator &&, which runs each subsequent command only if the previous one succeeded, is a good way to achieve this:
FROM ubuntu:20.04
RUN apt -y update && apt -y install nginx php-fpm
. . .
The instruction now updates the local package cache whenever the package list changes. An alternative would be to RUN an entire shell.sh script that contains multiple lines of instructions, but it would have to be made available to the container first.
The previous example demonstrates how Docker’s caching behavior can subvert expectations, but there are some other things to keep in mind about how RUN instructions interact with Docker’s layering system. As mentioned earlier, at the end of each RUN instruction, Docker commits the changes as an additional image layer. In order to exert control over the scope of the image layers produced, you can clean up unnecessary files by paying attention to the artifacts introduced by the commands you run.
In general, chaining commands together into a single RUN instruction offers a great deal of control over the layer that will be written. For each command, you can set up the state of the layer (apt -y update), perform the core command (apt -y install nginx php-fpm), and remove any unnecessary artifacts to clean up the environment before it’s committed. For example, many Dockerfiles chain rm -rf /var/lib/apt/lists/* to the end of apt commands, removing the downloaded package indexes, to reduce the final layer size:
FROM ubuntu:20.04
RUN apt -y update && apt -y install nginx php-fpm && rm -rf /var/lib/apt/lists/*
. . .
To further reduce the size of the image layers you are creating, trying to limit other unintended side effects of the commands you’re running can be helpful. For instance, in addition to the explicitly declared packages, apt also installs “recommended” packages by default. You can add --no-install-recommends to your apt commands to disable this behavior. You may have to experiment to find out whether you rely on any of the functionality provided by recommended packages.
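Continuing the article’s Nginx and PHP-FPM example, a sketch of the same instruction with this flag added might look like the following:
FROM ubuntu:20.04
RUN apt -y update && apt -y install --no-install-recommends nginx php-fpm && rm -rf /var/lib/apt/lists/*
. . .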
We’ve used package management commands in this section as an example, but these same principles apply to other scenarios. The general idea is to construct the prerequisite conditions, execute the minimum viable command, and then clean up any unnecessary artifacts in a single RUN command to reduce the overhead of the layer you’ll be producing.
Multi-stage builds were introduced in Docker 17.05, allowing developers to more tightly control the final runtime images they produce. Multi-stage builds allow you to divide your Dockerfile into multiple sections representing distinct stages, each with a FROM statement to specify separate parent images.
Earlier sections define images that can be used to build your application and prepare assets. These often contain build tools and development files that are needed to produce the application, but are not necessary to run it. Each subsequent stage defined in the file will have access to artifacts produced by previous stages.
The last FROM statement defines the image that will be used to run the application. Typically, this is a pared down image that installs only the necessary runtime requirements and then copies the application artifacts produced by previous stages.
This system allows you to worry less about optimizing RUN instructions in the build stages, since those container layers will not be present in the final runtime image. You should still pay attention to how instructions interact with layer caching in the build stages, but your efforts can be directed towards minimizing build time rather than final image size. Paying attention to instructions in the final stage is still important in reducing image size, but by separating the different stages of your container build, it’s easier to obtain streamlined images without as much Dockerfile complexity.
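To illustrate, here is a minimal multi-stage Dockerfile sketch, assuming a Go application (the same pattern applies to any language with a build step): the binary is compiled in a build stage, and only the result is copied into a small runtime image.
# build stage: compilers and source code live only in this stage
FROM golang:1.21 AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /app .
# runtime stage: only the compiled binary is carried forward
FROM alpine:3.18
COPY --from=build /app /usr/local/bin/app
CMD ["/usr/local/bin/app"]
The build tools and source code never appear in the final image, so its size depends only on the runtime parent and the application artifact itself.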
While the choices you make regarding container build instructions are important, broader decisions about how to containerize your services often have a more direct impact on your success. In this section, we’ll talk a bit more about how to best transition your applications from a more conventional environment to running on a container platform.
Generally, it is good practice to package each piece of independent functionality into a separate container image.
This differs from common strategies employed in virtual machine environments where applications are frequently grouped together within the same image to reduce the size and minimize the resources required to run the VM. Since containers are lightweight abstractions that don’t virtualize the entire operating system stack, this tradeoff is less compelling on Kubernetes. So while a web stack virtual machine might bundle an Nginx web server with a Gunicorn application server on a single machine to serve a Django application, in Kubernetes these might be split into separate containers.
Designing containers that implement one discrete piece of functionality for your services offers a number of advantages. Each container can be developed independently if standard interfaces between services are established. For instance, the Nginx container could potentially be used to proxy to a number of different backends or could be used as a load balancer if given a different configuration.
Once deployed, each container image can be scaled independently to address varying resource and load constraints. By splitting your applications into multiple container images, you gain flexibility in development, organization, and deployment.
In Kubernetes, pods are the smallest unit that can be directly managed by the control plane. Pods consist of one or more containers along with additional configuration data to tell the platform how those components should be run. The containers within a pod are always scheduled on the same worker node in the cluster and the system automatically restarts failed containers. The pod abstraction is very useful, but it introduces another layer of decisions about how to bundle together the components of your applications.
Like container images, pods also become less flexible when too much functionality is bundled into a single entity. Pods themselves can be scaled using other abstractions, but the containers within cannot be managed or scaled independently. So, to continue using our previous example, the separate Nginx and Gunicorn containers should probably not be bundled together into a single pod. This way, they can be controlled and deployed separately.
However, there are scenarios where it does make sense to combine functionally different containers as a unit. In general, these can be categorized as situations where an additional container supports or enhances the core functionality of the main container or helps it adapt to its deployment environment. Common patterns include sidecar containers that proxy connections, adapt log or data formats, or sync configuration on behalf of the primary container. Each of these patterns supports the strategy of building standard, generic primary container images that can then be deployed in a variety of contexts and configurations. The secondary containers help bridge the gap between the primary container and the specific deployment environment being used. Some sidecar containers can also be reused to adapt multiple primary containers to the same environmental conditions. These patterns benefit from the shared filesystem and networking namespace provided by the pod abstraction while still allowing independent development and flexible deployment of standardized containers.
There is some tension between the desire to build standardized, reusable components and the requirements involved in adapting applications to their runtime environment. Runtime configuration is one of the best methods to bridge the gap between these concerns. This way, components are built to be general-purpose, and their required behavior is outlined at runtime by supplying additional configuration details. This standard approach works for containers as well as it does for applications.
Building with runtime configuration in mind requires you to think ahead during both the application development and containerization steps. Applications should be designed to read values from command line parameters, configuration files, or environment variables when they are launched or restarted. This configuration parsing and injection logic must be implemented in code prior to containerization.
When writing a Dockerfile, the container must also be designed with runtime configuration in mind. Containers have a number of mechanisms for providing data at runtime. Users can mount files or directories from the host as volumes within the container to enable file-based configuration. Likewise, environment variables can be passed into the internal container runtime when the container is started. The CMD and ENTRYPOINT Dockerfile instructions can also be defined in a way that allows for runtime configuration information to be passed in as command line parameters.
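As a brief sketch (the application name, paths, and flags here are hypothetical), ENTRYPOINT can fix the process to run while CMD supplies default arguments that can be overridden at runtime:
FROM ubuntu:20.04
COPY myapp /usr/local/bin/myapp
# the process to run is fixed; the arguments below are only defaults
ENTRYPOINT ["/usr/local/bin/myapp"]
CMD ["--listen", "0.0.0.0:8080"]
Anything passed after the image name in docker run (or in a pod’s args field) replaces the CMD portion, while environment variables and mounted volumes provide additional channels for runtime configuration.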
Since Kubernetes manipulates higher level objects like pods instead of managing containers directly, there are mechanisms available to define configuration and inject it into the container environment at runtime. Kubernetes ConfigMaps and Secrets allow you to define configuration data separately and then project these values into the container environment at runtime. ConfigMaps are general purpose objects intended to store configuration data that might vary based on environment, testing stage, etc. Secrets offer a similar interface but are specifically designed for sensitive data, like account passwords or API credentials.
By understanding and correctly using the runtime configuration options available throughout each layer of abstraction, you can build flexible components that take their cues from environment-provided values. This makes it possible to reuse the same container images in very different scenarios, reducing development overhead by improving application flexibility.
When transitioning to container-based environments, users often start by shifting existing workloads, with few or no changes, to the new system. They package applications in containers by wrapping the tools they are already using in the new abstraction. While it is helpful to use your usual patterns to get migrated applications up and running, dropping in previous implementations within containers can sometimes lead to ineffective design.
Problems frequently arise when developers implement significant service management functionality within containers. For example, running systemd services within the container or daemonizing web servers may be considered best practices in a normal computing environment, but they often conflict with assumptions inherent in the container model.
Hosts manage container life cycle events by sending signals to the process operating as PID (process ID) 1 inside the container. PID 1 is the first process started, which would be the init system in traditional computing environments. However, because the host can only manage PID 1, using a conventional init system to manage processes within the container sometimes means there is no way to control the primary application. The host can start, stop, or kill the internal init system, but can’t manage the primary application directly. These signals can sometimes propagate the intended behavior to the running application, but this still adds complexity and isn’t always necessary.
Most of the time, it is better to simplify the running environment within the container so that PID 1 is running the primary application in the foreground. In cases where multiple processes must be run, PID 1 is responsible for managing the life cycle of subsequent processes. Certain applications, like Apache, handle this natively by spawning and managing workers that handle connections. For other applications, a wrapper script or a very lean init system like dumb-init or the included tini init system can be used. Regardless of the implementation you choose, the process running as PID 1 within the container should respond appropriately to TERM signals sent by Kubernetes to behave as expected.
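For example, a sketch of an image that uses tini as PID 1 to run Nginx in the foreground might look like the following (this assumes Alpine’s tini package installs the binary at /sbin/tini):
FROM alpine:3.18
RUN apk add --no-cache nginx tini
# tini runs as PID 1, forwards signals, and reaps child processes
ENTRYPOINT ["/sbin/tini", "--"]
# nginx runs in the foreground instead of daemonizing
CMD ["nginx", "-g", "daemon off;"]
With this layout, a TERM signal from Kubernetes reaches the Nginx process promptly instead of stopping at an unaware init system.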
Kubernetes deployments and services offer life cycle management for long-running processes and reliable, persistent access to applications, even when underlying containers need to be restarted or the implementations themselves change. By extracting the responsibility of monitoring and maintaining service health out of the container, you can leverage the platform’s tools for managing healthy workloads.
In order for Kubernetes to manage containers properly, it has to understand whether the applications running within containers are healthy and capable of performing work. To enable this, containers can implement liveness probes: that is, network endpoints or commands that can be used to report application health. Kubernetes will periodically check defined liveness probes to determine if the container is operating as expected. If the container does not respond appropriately, Kubernetes restarts the container in an attempt to reestablish functionality.
Kubernetes also provides readiness probes, a similar construct. Rather than indicating whether the application within a container is healthy, readiness probes determine whether the application is ready to receive traffic. This can be useful when a containerized application has an initialization routine that must complete before it is ready to receive connections. Kubernetes uses readiness probes to determine whether to add a pod to or remove a pod from a service.
Defining endpoints for these two probe types can help Kubernetes manage your containers efficiently and can prevent container life cycle problems from affecting service availability. The mechanisms to respond to these types of health requests must be built into the application itself and must be exposed in the Docker image configuration.
In this guide, we’ve covered some important considerations to keep in mind when running containerized applications in Kubernetes. To reiterate, the suggestions we went over included building focused images from minimal, shared parent images, consolidating related steps to keep image layers small, injecting configuration at runtime, running the primary application in the foreground as PID 1, and implementing liveness and readiness probes.
Throughout the development and implementation process, you will need to make decisions that can affect your service’s robustness and effectiveness. Understanding the ways that containerized applications differ from conventional applications, and learning how they operate in a managed cluster environment, can help you avoid some common pitfalls and allow you to take advantage of all of the capabilities Kubernetes provides.
Developing and releasing software can be a complicated process, and the challenges become more pronounced as applications, teams, and deployment infrastructure grow in complexity. To develop, test, and release software in a quick and consistent way, developers and organizations have created three related but distinct strategies to manage and automate these processes.
Continuous integration focuses on integrating work from individual developers into a main repository multiple times a day to catch integration bugs early and accelerate collaborative development. Continuous delivery is concerned with reducing friction in the deployment or release process, automating the steps required to deploy a build so that code can be released safely at any time. Continuous deployment takes this one step further by automatically deploying each time a code change is made.
In this guide, we will discuss each of these strategies, how they relate to one another, and how incorporating them into your application life cycle can transform your software development and release practices. To get a better idea of the differences between various open-source CI/CD projects, check out our CI/CD tool comparison.
Continuous integration is a practice that encourages developers to integrate their code into a main branch of a shared repository early and often. Instead of building out features in isolation and integrating them at the end of a development cycle, code is integrated with the shared repository by each developer multiple times throughout the day.
The idea is to minimize the cost of integration by making it an early consideration. Developers can discover conflicts at the boundaries between new and existing code early, while conflicts are still relatively easy to reconcile. Once the conflict is resolved, work can continue with confidence that the new code honors the requirements of the existing codebase.
Integrating code frequently does not, by itself, offer any guarantees about the quality of the new code or functionality. In many organizations, integration is costly because manual processes are used to ensure that the code meets standards, does not introduce bugs, and does not break existing functionality. Frequent integration can create friction when these quality assurance measures remain manual and cannot keep pace with the rate of change.
To address this friction within the integration process, in practice, continuous integration relies on robust test suites and an automated system to run those tests. When a developer merges code into the main repository, automated processes kick off a build of the new code. Afterwards, test suites are run against the new build to check whether any integration problems were introduced. If either the build or the test phase fails, the team is alerted so that they can work to fix the build.
The end goal of continuous integration is to make integration a simple, repeatable process that is part of the everyday development workflow in order to reduce integration costs and respond to defects early. Working to make sure the system is robust, automated, and fast while cultivating a team culture that encourages frequent iteration and responsiveness to build issues is fundamental to CI success.
Continuous delivery is an extension of continuous integration. It focuses on automating the software delivery process so that teams can easily and confidently deploy their code to production at any time. By ensuring that the codebase is always in a deployable state, releasing software becomes an unremarkable event, without any complicated rituals. Teams can be confident that they can release whenever they need to without complex coordination or late-stage testing. As with continuous integration, continuous delivery is a practice that requires a mixture of technical and organizational improvements to be effective.
On the technology side, continuous delivery leans heavily on deployment pipelines to automate the testing and deployment processes. A deployment pipeline is an automated system that runs increasingly rigorous test suites against a build as a series of sequential stages. This picks up where continuous integration leaves off, so a reliable continuous integration setup is a prerequisite to implementing continuous delivery.
At each stage, the build either fails the tests, which alerts the team, or passes the tests, which results in automatic promotion to the next stage. As the build moves through the pipeline, later stages deploy the build to environments that mirror the production environment as closely as possible. This way the build, the deployment process, and the environment can be tested in tandem. The pipeline ends with a build that can be deployed to production at any time in a single step.
The organizational aspects of continuous delivery encourage prioritization of “deployability” as a principal concern. This has an impact on the way that features are built and hooked into the rest of the codebase. Thought must be put into the design of the code so that features can be safely deployed to production at any time, even when incomplete. A number of techniques have emerged to assist in this area.
Continuous delivery is compelling because it automates the steps between checking code into the repository and deciding on whether to release well-tested, functional builds to your production infrastructure. The steps that help assert the quality and correctness of the code are automated, but the final decision about what to release is left in the hands of the organization for maximum flexibility.
Continuous deployment is an extension of continuous delivery that automatically deploys each build that passes the full test cycle. Instead of waiting for a human gatekeeper to decide what and when to deploy to production, a continuous deployment system deploys everything that has successfully traversed the deployment pipeline. Keep in mind that when new code is automatically deployed, new features can still be activated conditionally at a later time or for a subset of users. Deploying automatically pushes features and fixes to customers quickly, encourages smaller changes with limited scope, and helps avoid confusion over what is currently deployed to production.
This fully automated deploy cycle can be a source of anxiety for organizations worried about relinquishing control of what gets released to their automation system. The trade-off offered by automated deployments is sometimes considered too dangerous for the payoff they provide.
Other groups leverage this approach as a means of ensuring that best practices are always followed. Without a final manual verification before deploying a piece of code, developers must take responsibility for ensuring that their code is well-designed and that the test suites are up-to-date. This consolidates any decision-making around what and when to commit to the main repository and what and when to release to production into a single decision point for the development team.
Continuous deployment also allows organizations to benefit from consistent early feedback. Features can immediately be made available to users and defects or unhelpful implementations can be caught early before the team devotes extensive effort in an unproductive direction. Getting fast feedback that a feature isn’t helpful lets the team shift focus rather than sinking more energy into an area with minimal impact.
While continuous integration, delivery, and deployment vary in the scope of their involvement, there are some concepts and practices that are fundamental to the success of each.
One of the most important practices when adopting continuous integration is to encourage small changes. Developers should practice breaking up larger work into small pieces and committing those early. Special techniques like branching by abstraction and feature flags (see below) help to protect the functionality of the main branch from in-progress code changes.
Small changes minimize the possibility and impact of integration problems. By committing to the shared branch at the earliest possible stage and then continually throughout development, the cost of integration is diminished and unrelated work is synchronized regularly.
With trunk-based development, work is done in the main branch (“trunk”) of the repository or merged back into the shared repository at frequent intervals. Short-lived feature branches are permissible as long as they represent small changes and are merged back as soon as possible.
The idea behind trunk-based development is to avoid large commits that violate the concept of small, iterative changes discussed above. Code is available to peers early so that conflicts can be resolved when their scope is small.
Releases are performed from the main branch or from a release branch created from the trunk specifically for that purpose. No development occurs on the release branches in order to maintain focus on the main branch as the single source of truth.
Each of the processes relies on automated building and testing to validate correctness. Because the build and test steps must be performed frequently, it is essential that these processes be streamlined to minimize the time spent on these steps.
Increases in build time should be treated as a major problem because the impact is compounded by the fact that each commit kicks off a build. Because continuous processes force developers to engage with these activities daily, reducing friction in these areas is very worthwhile.
When possible, running different sections of the test suite in parallel can help move the build through the pipeline faster. Care should also be taken to make sure the proportion of each type of test makes sense. Unit tests are typically very fast and have minimal maintenance overhead. In contrast, acceptance testing is often complex and prone to breakage. To account for this, it is often a good idea to rely heavily on unit tests, conduct a fair number of integration tests, and minimize the number of more complex tests.
Because a continuous delivery or deployment implementation is supposed to test release worthiness, it is essential to maintain consistency during each step of the process: the build itself, the deployment environments, and the deployment process itself.
Separating the deployment of code from its release to users is an extremely powerful part of continuous delivery and deployment. Code can be deployed to production without initially activating it or making it accessible to users. Then, the organization decides when to release new functionality or features independent from deployment.
This gives organizations a great deal of flexibility by separating business decisions from technical processes. If the code is already on the servers, then deployment is no longer a delicate part of the release process, which minimizes the number of participants and the amount of work involved at the time of release.
There are a number of techniques that help teams deploy the code responsible for a feature without releasing it. Feature flags set up conditional logic to check whether to run code based on the value of an environmental variable. Branching by abstraction allows developers to rewrite processes incrementally by creating an abstraction layer between process input and output. Careful planning to incorporate these techniques gives you the ability to decouple these two processes.
Continuous integration, delivery, and deployment all rely heavily on automated tests to determine the efficacy and correctness of each code change. Different types of tests are needed throughout these processes to assert confidence in a given solution.
While the categories below in no way represent an exhaustive list, and although there is disagreement on the exact definition of each type, these broad categories of tests represent a variety of ways to evaluate code in different contexts.
Smoke tests are a special kind of initial check designed to ensure core functionality as well as some fundamental implementation and environmental assumptions. Smoke tests are generally run at the very start of each testing cycle as a sanity check before running a more complete test suite.
The idea behind this type of test is to help to catch big red flags in an implementation and to bring attention to problems that might indicate that further testing is either not possible or not worthwhile. Smoke tests are not very extensive, but should be extremely quick. If a change fails a smoke test, it is an early signal that core assertions were broken and that you should not devote any more time to testing until the problem is resolved.
Context-specific smoke tests can be employed at the start of any new phase of testing to assert that the basic assumptions and requirements are met. For instance, smoke tests can be used both prior to integration testing and before deploying to staging servers, but the conditions to be tested will vary in each case.
Unit tests are responsible for testing individual elements of code in an isolated and highly targeted way. The functionality of individual functions and classes are tested on their own. Any external dependencies are replaced with stub or mock implementations to focus the test completely on the code in question.
Unit tests are essential for verifying that individual code components are internally consistent and correct before they are placed in more complex contexts. The limited extent of the tests and the removal of dependencies make it easier to hunt down the cause of any defects. It is also the best time to test a variety of inputs and code branches that might be difficult to reproduce later on. Often, after any smoke tests, unit tests are the first tests that are run when any changes are made.
Unit tests are typically run by individual developers on their own workstation prior to submitting changes. However, continuous integration servers almost always run these tests again as a safeguard before beginning integration tests.
After unit testing, integration testing is performed by grouping together components and testing them as an assembly. While unit tests validate the functionality of code in isolation, integration tests ensure that components cooperate when interfacing with one another. This type of testing has the opportunity to catch an entirely different class of bugs that are exposed through interaction between components.
Typically, integration tests are performed automatically when code is checked into a shared repository. A continuous integration server checks out the code, performs any necessary build steps (usually performing a quick smoke test to make sure the build was successful) and then runs unit and integration tests. Modules are hooked together in different combinations and tested.
Integration tests are important for shared work because they protect the health of the project. Changes must prove that they do not break existing functionality and that they interact with other code as expected. A secondary aim of integration testing is to verify that the changes can be deployed into a clean environment. This is frequently the first testing cycle that is not performed on the developers’ own machines, so unknown software and environmental dependencies can also be discovered during this process. This is usually also the first time that new code is tested against real external libraries, services, and data.
Once integration tests are performed, another level of testing called system testing can begin. In many ways, system testing acts as an extension to integration testing. The focus of system tests is to make sure that groups of components function correctly as a cohesive whole.
Instead of focusing on the interfaces between components, system tests typically evaluate the outward functionality of a full piece of software. This set of tests ignores the constituent parts in order to gauge the composed software as a unified entity. Because of this distinction, system tests usually focus on user- or externally-accessible interfaces.
Acceptance tests are one of the last types of tests that are performed on software prior to delivery. Acceptance testing is used to determine whether a piece of software satisfies all of the requirements from the business or user’s perspective. These tests are sometimes built against the original specification and often test interfaces for some expected functionality and for usability.
Acceptance testing is often a more involved phase that might extend past the release of the software. Automated acceptance testing can be used to make sure the technological requirements of the design were met, but manual verification usually also plays a role.
Frequently, acceptance testing begins by deploying the build to a staging environment that mirrors the production system. From here, the automated test suites can run and internal users can access the system to check whether it functions the way they need it to. After releasing the software or offering beta access to users, further acceptance testing is performed by evaluating how the software functions in real-world use, and by collecting additional feedback.
While we’ve discussed some of the broader ideas above, there are many related concepts, tools, and practices that you may come across as you continue learning about continuous integration, delivery, and deployment.
In this guide, we introduced continuous integration, continuous delivery, and continuous deployment and discussed how they can be used to build and release well-tested software safely and quickly. These processes leverage extensive automation and encourage constant code sharing to fix defects early. While the techniques, processes, and tools needed to implement these solutions represent a significant investment, the benefits of a well-designed and properly used system can be enormous.
To find out which CI/CD solution might be right for your project, take a look at our CI/CD tool comparison guide for more information.
Apache and Nginx are the two most common open source web servers in the world. Together, they are responsible for serving over 50% of traffic on the internet. Both solutions are capable of handling diverse workloads and working with other software to provide a complete web stack.
While Apache and Nginx share many qualities, they should not be thought of as entirely interchangeable. Each excels in its own way, and this article will cover the strengths and weaknesses of each.
Before we dive into the differences between Apache and Nginx, let’s take a quick look at the background of these two projects and their general characteristics.
The Apache HTTP Server was created by Robert McCool in 1995 and has been developed under the direction of the Apache Software Foundation since 1999. Since the HTTP web server is the foundation’s original project and is by far their most popular piece of software, it is often referred to simply as “Apache”.
The Apache web server was the most popular server on the internet from at least 1996 through 2016. Because of this popularity, Apache benefits from great documentation and integrated support from other software projects.
Apache is often chosen by administrators for its flexibility, power, and near-universal support. It is extensible through a dynamically loadable module system and can directly serve many scripting languages, such as PHP, without requiring additional software.
In 2002, Igor Sysoev began work on Nginx as an answer to the C10K problem, which was an outstanding challenge for web servers to be able to handle ten thousand concurrent connections. Nginx was publicly released in 2004, and met this goal by relying on an asynchronous, event-driven architecture.
Nginx has since surpassed Apache in popularity due to its lightweight footprint and its ability to scale easily on minimal hardware. Nginx excels at serving static content quickly, has its own robust module system, and can proxy dynamic requests off to other software as needed.
Nginx is often selected by administrators for its resource efficiency and responsiveness under load, as well as its straightforward configuration syntax.
One difference between Apache and Nginx is the specific way that they handle connections and network traffic. This is perhaps the most significant difference in the way that they respond under load.
Apache provides a variety of multi-processing modules (Apache calls these MPMs) that dictate how client requests are handled. This allows administrators to configure its connection handling architecture. These are:
mpm_prefork: This processing module spawns processes with a single thread each to handle requests. Each child can handle a single connection at a time. As long as the number of requests is fewer than the number of processes, this MPM is very fast. However, performance degrades quickly once requests surpass the number of processes, so this is not a good choice in many scenarios. Each process has a significant impact on RAM consumption, so this MPM is difficult to scale effectively. It may still be a good choice if used in conjunction with other components that are not built with threads in mind. For instance, PHP is not always thread-safe, so this MPM has been recommended as a safe way of working with mod_php, the Apache module for processing PHP files.
mpm_worker: This module spawns processes that can each manage multiple threads. Each of these threads can handle a single connection. Threads are much more efficient than processes, which means that this MPM scales better than the prefork MPM. Since there are more threads than processes, this also means that new connections can immediately take a free thread instead of having to wait for a free process.
mpm_event: This module is similar to the worker module in most situations, but is optimized to handle keep-alive connections. When using the worker MPM, a connection will hold a thread regardless of whether a request is actively being made for as long as the connection is kept alive. The event MPM handles keep-alive connections by setting aside dedicated threads for them and passing active requests off to other threads. This keeps the module from getting bogged down by keep-alive requests, allowing for faster execution.
Apache provides a flexible architecture for choosing different connection and request handling algorithms. The choices provided are mainly a function of the server’s evolution and the increasing need for concurrency as the internet landscape has changed.
Nginx came onto the scene after Apache, with more awareness of the concurrency problems that sites face at scale. As a result, Nginx was designed from the ground up to use an asynchronous, non-blocking, event-driven connection handling algorithm.
Nginx spawns worker processes, each of which can handle thousands of connections. The worker processes accomplish this by implementing a fast looping mechanism that continuously checks for and processes events. Decoupling actual work from connections allows each worker to concern itself with a connection only when a new event has been triggered.
Each of the connections handled by the worker are placed within the event loop. Within the loop, events are processed asynchronously, allowing work to be handled in a non-blocking manner. When a connection closes, it is removed from the loop.
This style of connection processing allows Nginx to scale with limited resources. Since the server is single-threaded and processes are not spawned to handle each new connection, the memory and CPU usage tends to stay relatively consistent, even at times of heavy load.
In terms of real world use-cases, one of the most common comparisons between Apache and Nginx is the way in which each server handles requests for static and dynamic content.
Apache servers can handle static content using its conventional file-based methods. The performance of these operations is mainly a function of the MPM methods described above.
Apache can also process dynamic content by embedding a processor of the language in question into each of its worker instances. This allows it to execute dynamic content within the web server itself without having to rely on external components. These dynamic processors can be enabled through the use of dynamically loadable modules.
Apache’s ability to handle dynamic content internally was a direct contributor to the popularity of LAMP (Linux-Apache-MySQL-PHP) architectures, as PHP code can be executed natively by the web server itself.
Nginx does not have any ability to process dynamic content natively. To handle PHP and other requests for dynamic content, Nginx has to hand off a request to an external library for execution and wait for output to be returned. The results can then be relayed to the client.
These requests must be exchanged by Nginx and the external library using one of the protocols that Nginx knows how to speak (http, FastCGI, SCGI, uWSGI, memcache). In practice, PHP-FPM, a FastCGI implementation, is usually a drop-in solution, but Nginx is not closely coupled with any particular language.
However, this method has some advantages as well. Since the dynamic interpreter is not embedded in the worker process, its overhead will only be present for dynamic content. Static content can be served in a straight-forward manner and the interpreter will only be contacted when needed.
Apache and Nginx differ significantly in their approach to allowing overrides on a per-directory basis.
Apache includes an option to allow additional configuration on a per-directory basis by inspecting and interpreting directives in hidden files within the content directories themselves. These files are known as .htaccess files.
Since these files reside within the content directories themselves, when handling a request, Apache checks each component of the path to the requested file for an .htaccess file and applies the directives found within. This effectively allows decentralized configuration of the web server, which is often used for implementing URL rewrites, access restrictions, authorization and authentication, and even caching policies.
While the above examples can all be configured in the main Apache configuration file, .htaccess files have some important advantages. First, since these are interpreted each time they are found along a request path, they are implemented immediately without reloading the server. Second, they make it possible for non-privileged users to control certain aspects of their own web content without being given control over the entire configuration file.
This provides an easy way for certain web software, like content management systems, to configure their environment without providing access to the central configuration file. This is also used by shared hosting providers to retain control of the main configuration while giving clients control over their specific directories.
Nginx does not interpret .htaccess files, nor does it provide any mechanism for evaluating per-directory configuration outside of the main configuration file. Apache was originally developed at a time when it was advantageous to run many heterogeneous web deployments side-by-side on a single server, and delegating permissions made sense. Nginx was developed at a time when individual deployments were more likely to be containerized and to ship with their own network configurations, minimizing this need. This may be less flexible in some circumstances than the Apache model, but it does have its own advantages.
The most notable improvement over the .htaccess system of directory-level configuration is increased performance. For a typical Apache setup that may allow .htaccess files in any directory, the server will check for these files in each of the parent directories leading up to the requested file, for each request. If one or more .htaccess files are found during this search, they must be read and interpreted. By not allowing directory overrides, Nginx can serve requests faster by doing a single directory lookup and file read for each request (assuming that the file is found in the conventional directory structure).
Another advantage relates to security. Distributing directory-level configuration access also distributes the responsibility of security to individual users, who may not be trusted to handle this task well. Keep in mind that it is possible to turn off .htaccess interpretation in Apache if these concerns resonate with you.
How the web server interprets requests and maps them to actual resources on the system is another area where these two servers differ.
Apache provides the ability to interpret a request as a physical resource on the filesystem or as a URI location that may need a more abstract evaluation. In general, for the former Apache uses <Directory> or <Files> blocks, while it utilizes <Location> blocks for more abstract resources.
Because Apache was designed from the ground up as a web server, the default is usually to interpret requests as filesystem resources. It begins by taking the document root and appending the portion of the request following the host and port number to try to find an actual file. Essentially, the filesystem hierarchy is represented on the web as the available document tree.
Apache provides a number of alternatives for when the request does not match the underlying filesystem. For instance, an Alias directive can be used to map to an alternative location. Using <Location> blocks is a method of working with the URI itself instead of the filesystem. There are also regular expression variants which can be used to apply configuration more flexibly throughout the filesystem.
While Apache has the ability to operate on both the underlying filesystem and other web URIs, it leans heavily towards filesystem methods. This can be seen in some of the design decisions, including the use of .htaccess files for per-directory configuration. The Apache docs themselves warn against using URI-based blocks to restrict access when the request mirrors the underlying filesystem.
Nginx was created to be both a web server and a proxy server. Due to the architecture required for these two roles, it works primarily with URIs, translating to the filesystem when necessary.
This is evident in the way that Nginx configuration files are constructed and interpreted. Nginx does not provide a mechanism for specifying configuration for a filesystem directory and instead parses the URI itself.
For instance, the primary configuration blocks for Nginx are server and location blocks. The server block interprets the host being requested, while the location blocks are responsible for matching the portions of the URI that come after the host and port. At this point, the request is being interpreted as a URI, not as a location on the filesystem.
For static files, all requests eventually have to be mapped to a location on the filesystem. First, Nginx selects the server and location blocks that will handle the request and then combines the document root with the URI, adapting anything necessary according to the configuration specified.
This may seem similar, but parsing requests primarily as URIs instead of filesystem locations allows Nginx to function more easily in web, mail, and proxy server roles. Nginx is configured by laying out how to respond to different request patterns. Nginx does not check the filesystem until it is ready to serve the request, which explains why it does not implement a form of .htaccess files.
After reviewing the benefits and limitations of both Apache and Nginx, you may have a better idea of which server is more suited to your needs. In some cases, it is possible to leverage each server’s strengths by using them together.
The conventional configuration for this partnership is to place Nginx in front of Apache as a reverse proxy. This will allow Nginx to handle all client requests. This takes advantage of Nginx’s fast processing speed and ability to handle large numbers of connections concurrently.
For static content, which Nginx excels at, files will be served quickly and directly to the client. For dynamic content, for instance PHP files, Nginx will proxy the request to Apache, which will process the request and return the rendered page. Nginx can then pass the content back to the client.
This setup works well for many people because it allows Nginx to function as a sorting machine. It will handle all requests it can and pass on the ones that it has no native ability to serve. By cutting down on the requests the Apache server is asked to handle, we can alleviate some of the blocking that occurs when an Apache process or thread is occupied.
This configuration also facilitates horizontal scaling by adding additional backend servers as necessary. Nginx can be configured to pass requests to multiple servers, increasing this configuration’s performance.
Both Apache and Nginx are powerful, flexible, and capable. Deciding which server is best for you is largely a function of evaluating your specific requirements and testing with the patterns that you expect to see.
There are differences between these projects that have a very real impact on the raw performance, capabilities, and the implementation time necessary to use either solution in production. Use the solution that best aligns with your objectives.
Any application or website that sees significant growth will eventually need to scale in order to accommodate increases in traffic. For data-driven applications and websites, it’s critical that scaling is done in a way that ensures the security and integrity of their data. It can be difficult to predict how popular a website or application will become or how long it will maintain that popularity, which is why some organizations choose a database architecture that allows them to scale their databases dynamically.
In this conceptual article, we will discuss one such database architecture: sharded databases. Sharding has been receiving lots of attention in recent years, but many don’t have a clear understanding of what it is or the scenarios in which it might make sense to shard a database. We will go over what sharding is, some of its main benefits and drawbacks, and also a few common sharding methods.
Sharding is a database architecture pattern related to horizontal partitioning — the practice of separating one table’s rows into multiple different tables, known as partitions. Each partition has the same schema and columns, but entirely different rows. Likewise, the data held in each is unique and independent of the data held in other partitions.
It can be helpful to think of horizontal partitioning in terms of how it relates to vertical partitioning. In a vertically-partitioned table, entire columns are separated out and put into new, distinct tables. The data held within one vertical partition is independent from the data in all the others, and each holds both distinct rows and columns. The following diagram illustrates how a table could be partitioned both horizontally and vertically:
Sharding involves breaking up one’s data into two or more smaller chunks, called logical shards. The logical shards are then distributed across separate database nodes, referred to as physical shards, which can hold multiple logical shards. Despite this, the data held within all the shards collectively represent an entire logical dataset.
Database shards exemplify a shared-nothing architecture. This means that the shards are autonomous; they don’t share any of the same data or computing resources. In some cases, though, it may make sense to replicate certain tables into each shard to serve as reference tables. For example, let’s say there’s a database for an application that depends on fixed conversion rates for weight measurements. By replicating a table containing the necessary conversion rate data into each shard, it would help to ensure that all of the data required for queries is held in every shard.
Oftentimes, sharding is implemented at the application level, meaning that the application includes code that defines which shard to transmit reads and writes to. However, some database management systems have sharding capabilities built in, allowing you to implement sharding directly at the database level.
Given this general overview of sharding, let’s go over some of the positives and negatives associated with this database architecture.
The main appeal of sharding a database is that it can help to facilitate horizontal scaling, also known as scaling out. Horizontal scaling is the practice of adding more machines to an existing stack in order to spread out the load and allow for more traffic and faster processing. This is often contrasted with vertical scaling, otherwise known as scaling up, which involves upgrading the hardware of an existing server, usually by adding more RAM or CPU.
It’s relatively simple to have a relational database running on a single machine and scale it up as necessary by upgrading its computing resources. Ultimately, though, any non-distributed database will be limited in terms of storage and compute power, so having the freedom to scale horizontally makes your setup far more flexible.
Another reason why some might choose a sharded database architecture is to speed up query response times. When you submit a query on a database that hasn’t been sharded, it may have to search every row in the table you’re querying before it can find the result set you’re looking for. For an application with a large, monolithic database, queries can become prohibitively slow. By sharding one table into multiple, though, queries have to go over fewer rows and their result sets are returned much more quickly.
Sharding can also help to make an application more reliable by mitigating the impact of outages. If your application or website relies on an unsharded database, an outage has the potential to make the entire application unavailable. With a sharded database, though, an outage is likely to affect only a single shard. Even though this might make some parts of the application or website unavailable to some users, the overall impact would still be less than if the entire database crashed.
While sharding a database can make scaling easier and improve performance, it can also impose certain limitations. Here, we’ll discuss some of these and why they might be reasons to avoid sharding altogether.
The first difficulty that people encounter with sharding is the sheer complexity of properly implementing a sharded database architecture. If done incorrectly, there’s a significant risk that the sharding process can lead to lost data or corrupted tables. Even when done correctly, though, sharding is likely to have a major impact on your team’s workflows. Rather than accessing and managing one’s data from a single entry point, users must manage data across multiple shard locations, which could potentially be disruptive to some teams.
One problem that users sometimes encounter after having sharded a database is that the shards eventually become unbalanced. By way of example, let’s say you have a database with two separate shards, one for customers whose last names begin with letters A through M and another for those whose names begin with the letters N through Z. However, your application serves an inordinate number of people whose last names start with the letter G. Accordingly, the A-M shard gradually accrues more data than the N-Z one, causing the application to slow down and stall out for a significant portion of your users. The A-M shard has become what is known as a database hotspot. In this case, any benefits of sharding the database are canceled out by the slowdowns and crashes. The database would likely need to be repaired and resharded to allow for a more even data distribution.
Another major drawback is that once a database has been sharded, it can be very difficult to return it to its unsharded architecture. Any backups of the database made before it was sharded won’t include data written since the partitioning. Consequently, rebuilding the original unsharded architecture would require merging the new partitioned data with the old backups or, alternatively, transforming the partitioned DB back into a single DB, both of which would be costly and time consuming endeavors.
A final disadvantage to consider is that sharding isn’t natively supported by every database engine. For instance, PostgreSQL does not include automatic sharding as a feature, although it is possible to manually shard a PostgreSQL database. There are a number of Postgres forks that do include automatic sharding, but these often trail behind the latest PostgreSQL release and lack certain other features. Some specialized database technologies — like MySQL Cluster or certain database-as-a-service products like MongoDB Atlas — do include auto-sharding as a feature, but vanilla versions of these database management systems do not. Because of this, sharding often requires a “roll your own” approach. This means that documentation for sharding or tips for troubleshooting problems are often difficult to find.
These are, of course, only some general issues to consider before sharding. There may be many more potential drawbacks to sharding a database depending on its use case.
Now that we’ve covered a few of sharding’s drawbacks and benefits, we will go over a few different architectures for sharded databases.
Once you’ve decided to shard your database, the next thing you need to figure out is how you’ll go about doing so. When running queries or distributing incoming data to sharded tables or databases, it’s crucial that it goes to the correct shard. Otherwise, it could result in lost data or painfully slow queries. In this section, we’ll go over a few common sharding architectures, each of which uses a slightly different process to distribute data across shards.
Key based sharding, also known as hash based sharding, involves using a value taken from newly written data — such as a customer’s ID number, a client application’s IP address, a ZIP code, etc. — and plugging it into a hash function to determine which shard the data should go to. A hash function is a function that takes as input a piece of data (for example, a customer email) and outputs a discrete value, known as a hash value. In the case of sharding, the hash value is a shard ID used to determine which shard the incoming data will be stored on. Altogether, the process looks like this:
To ensure that entries are placed in the correct shards and in a consistent manner, the values entered into the hash function should all come from the same column. This column is known as a shard key. In simple terms, shard keys are similar to primary keys in that both are columns which are used to establish a unique identifier for individual rows. Broadly speaking, a shard key should be static, meaning it shouldn’t contain values that might change over time. Otherwise, it would increase the amount of work that goes into update operations, and could slow down performance.
While key based sharding is a fairly common sharding architecture, it can make things tricky when trying to dynamically add or remove additional servers to a database. As you add servers, each one will need a corresponding hash value and many of your existing entries, if not all of them, will need to be remapped to their new, correct hash value and then migrated to the appropriate server. As you begin rebalancing the data, neither the new nor the old hashing functions will be valid. Consequently, your server won’t be able to write any new data during the migration and your application could be subject to downtime.
The main appeal of this strategy is that it can be used to evenly distribute data so as to prevent hotspots. Also, because it distributes data algorithmically, there’s no need to maintain a map of where all the data is located, as is necessary with other strategies like range or directory based sharding.
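To make the idea concrete, here is a minimal Python sketch of key based sharding; the shard names, the choice of MD5, and the customer-ID key are illustrative assumptions rather than a prescription:

```python
import hashlib

# Hypothetical list of physical shards (in practice, connection strings or node names).
SHARDS = ['shard-0', 'shard-1', 'shard-2', 'shard-3']

def shard_for(shard_key: str) -> str:
    """Hash the shard key and map the digest onto one of the shards."""
    digest = hashlib.md5(shard_key.encode('utf-8')).hexdigest()
    shard_id = int(digest, 16) % len(SHARDS)  # hash value used as a shard ID
    return SHARDS[shard_id]

# The same key always maps to the same shard.
print(shard_for('customer-42'))
```

Because the shard is chosen by taking the hash modulo the number of shards, changing the number of shards remaps most keys, which is exactly the rebalancing problem described above.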
Range based sharding involves sharding data based on ranges of a given value. To illustrate, let’s say you have a database that stores information about all the products within a retailer’s catalog. You could create a few different shards and divvy up each products’ information based on which price range they fall into, like this:
The main benefit of range based sharding is that it’s relatively simple to implement. Every shard holds a different set of data, but they all share an identical schema with one another, as well as with the original database. The application code reads which range the data falls into and writes it to the corresponding shard.
On the other hand, range based sharding doesn’t protect data from being unevenly distributed, leading to the aforementioned database hotspots. Looking at the example diagram, even if each shard holds an equal amount of data the odds are that specific products will receive more attention than others. Their respective shards will, in turn, receive a disproportionate number of reads.
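A minimal sketch of the same routing decision for range based sharding follows; the price boundaries and shard names are invented for illustration:

```python
# Hypothetical price ranges mapped to shards; boundaries are illustrative only.
PRICE_RANGES = [
    (0, 50, 'shard-a'),              # products under $50
    (50, 200, 'shard-b'),            # products from $50 up to $200
    (200, float('inf'), 'shard-c'),  # products priced $200 and up
]

def shard_for_price(price: float) -> str:
    """Return the shard whose price range contains the given price."""
    for low, high, shard in PRICE_RANGES:
        if low <= price < high:
            return shard
    raise ValueError('No shard configured for price {}'.format(price))

print(shard_for_price(34.99))   # shard-a
print(shard_for_price(250.00))  # shard-c
```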
To implement directory based sharding, one must create and maintain a lookup table that uses a shard key to keep track of which shard holds which data. A lookup table is a table that holds a static set of information about where specific data can be found. The following diagram shows a simplistic example of directory based sharding:
Here, the Delivery Zone column is defined as a shard key. Data from the shard key is written to the lookup table along with whatever shard each respective row should be written to. This is similar to range based sharding, but instead of determining which range the shard key’s data falls into, each key is tied to its own specific shard. Directory based sharding is a good choice over range based sharding in cases where the shard key has a low cardinality — meaning, it has a low number of possible values — and it doesn’t make sense for a shard to store a range of keys. Note that it’s also distinct from key based sharding in that it doesn’t process the shard key through a hash function; it just checks the key against a lookup table to see where the data needs to be written.
The main appeal of directory based sharding is its flexibility. Range based sharding architectures limit you to specifying ranges of values, while key based ones limit you to using a fixed hash function which, as mentioned previously, can be exceedingly difficult to change later on. Directory based sharding, on the other hand, allows you to use whatever system or algorithm you want to assign data entries to shards, and it’s relatively easy to dynamically add shards using this approach.
While directory based sharding is the most flexible of the sharding methods discussed here, the need to connect to the lookup table before every query or write can have a detrimental impact on an application’s performance. Furthermore, the lookup table can become a single point of failure: if it becomes corrupted or otherwise fails, it can impact one’s ability to write new data or access their existing data.
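As a minimal sketch of directory based sharding, the lookup table is represented below as an in-memory dictionary; the zone names and shard assignments are invented for illustration:

```python
# Hypothetical lookup table keyed on the shard key (the Delivery Zone column above).
LOOKUP_TABLE = {
    'North': 'shard-1',
    'South': 'shard-2',
    'East': 'shard-1',
    'West': 'shard-3',
}

def shard_for_zone(delivery_zone: str) -> str:
    """Check the shard key against the lookup table; no hashing or ranges involved."""
    try:
        return LOOKUP_TABLE[delivery_zone]
    except KeyError:
        raise ValueError('Unknown delivery zone: {}'.format(delivery_zone))

print(shard_for_zone('South'))  # shard-2
```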
Whether or not one should implement a sharded database architecture is almost always a matter of debate. Some see sharding as an inevitable outcome for databases that reach a certain size, while others see it as a headache that should be avoided unless it’s absolutely necessary, due to the operational complexity that sharding adds.
Because of this added complexity, sharding is usually only performed when dealing with very large amounts of data. Here are some common scenarios where it may be beneficial to shard a database:
- The amount of application data grows to exceed the storage capacity of a single database node.
- The volume of writes or reads to the database surpasses what a single node or its read replicas can handle, resulting in slowed response times or timeouts.
- The network bandwidth required by the application outpaces the bandwidth available to a single database node and any read replicas, resulting in slowed response times or timeouts.
Before sharding, you should exhaust all other options for optimizing your database. Some optimizations you might want to consider include:
- Setting up a remote database, so that the database no longer competes with the application for the same server resources.
- Implementing caching, so that frequently requested data is served from memory rather than from the database.
- Creating one or more read replicas, which allows read operations to be spread across multiple copies of the database.
- Upgrading to a larger server with more CPU, memory, or storage.
Bear in mind that if your application or website grows past a certain point, none of these strategies will be enough to improve performance on their own. In such cases, sharding may indeed be the best option for you.
Sharding can be a great solution for those looking to scale their database horizontally. However, it also adds a great deal of complexity and creates more potential failure points for your application. Sharding may be necessary for some, but the time and resources needed to create and maintain a sharded architecture could outweigh the benefits for others.
By reading this conceptual article, you should have a clearer understanding of the pros and cons of sharding. Moving forward, you can use this insight to make a more informed decision about whether or not a sharded database architecture is right for your application.
The relational data model, which organizes data in tables of rows and columns, predominates in database management tools. Today there are other data models, including NoSQL and NewSQL, but relational database management systems (RDBMSs) remain dominant for storing and managing data worldwide.
This article compares and contrasts three of the most widely implemented open-source RDBMSs: SQLite, MySQL, and PostgreSQL. Specifically, it will explore the data types that each RDBMS uses, their advantages and disadvantages, and situations where they are best optimized.
Databases are logically modelled clusters of information, or data. A database management system (DBMS), on the other hand, is a computer program that interacts with a database. A DBMS allows you to control access to a database, write data, run queries, and perform any other tasks related to database management.
Although database management systems are often referred to as “databases,” the two terms are not interchangeable. A database can be any collection of data, not just one stored on a computer. In contrast, a DBMS specifically refers to the software that allows you to interact with a database.
All database management systems have an underlying model that structures how data is stored and accessed. A relational database management system is a DBMS that employs the relational data model. In this relational model, data is organized into tables. Tables, in the context of RDBMSs, are more formally referred to as relations. A relation is a set of tuples, which are the rows in a table, and each tuple shares a set of attributes, which are the columns in a table:
Most relational databases use structured query language (SQL) to manage and query data. However, many RDBMSs use their own particular dialect of SQL, which may have certain limitations or extensions. These extensions typically include extra features that allow users to perform more complex operations than they otherwise could with standard SQL.
Note: The term “standard SQL” comes up several times throughout this guide. SQL standards are jointly maintained by the American National Standards Institute (ANSI), the International Organization for Standardization (ISO), and the International Electrotechnical Commission (IEC). Whenever this article mentions “standard SQL” or “the SQL standard,” it’s referring to the current version of the SQL standard published by these bodies.
It should be noted that the full SQL standard is large and complex: full core SQL:2011 compliance requires 179 features. Because of this, most RDBMSs don’t support the entire standard, although some do come closer to full compliance than others.
Each column is assigned a data type which dictates what kind of entries are allowed in that column. Different RDBMSs implement different data types, which aren’t always directly interchangeable. Some common data types include dates, strings, integers, and Booleans.
Storing integers in a database is more nuanced than putting numbers in a table. Numeric data types can either be signed, meaning they can represent both positive and negative numbers, or unsigned, which means they can only represent positive numbers. For example, MySQL’s tinyint data type can hold 8 bits of data, which equates to 256 possible values. The signed range of this data type is from -128 to 127, while the unsigned range is from 0 to 255.
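As a quick sanity check of those figures, the signed and unsigned ranges of an n-bit integer type can be computed directly. This is a small illustrative Python snippet, not tied to any particular database:

```python
def integer_ranges(bits: int):
    """Return the (signed, unsigned) value ranges for an integer of the given bit width."""
    signed = (-(2 ** (bits - 1)), 2 ** (bits - 1) - 1)
    unsigned = (0, 2 ** bits - 1)
    return signed, unsigned

# MySQL's tinyint is stored in 8 bits: signed -128..127, unsigned 0..255.
print(integer_ranges(8))
```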
Being able to control what data is allowed into a database is important. Sometimes, a database administrator will impose a constraint on a table to limit what values can be entered into it. A constraint typically applies to one particular column, but some constraints can also apply to an entire table. Here are some constraints that are commonly used in SQL:
- UNIQUE: Applying this constraint to a column ensures that no two entries in that column are identical.
- NOT NULL: This constraint ensures that a column doesn’t have any NULL entries.
- PRIMARY KEY: A combination of UNIQUE and NOT NULL, the PRIMARY KEY constraint ensures that no entry in the column is NULL and that every entry is distinct.
- FOREIGN KEY: A FOREIGN KEY is a column in one table that refers to the PRIMARY KEY of another table. This constraint is used to link two tables together. Entries to the FOREIGN KEY column must already exist in the parent PRIMARY KEY column for the write process to succeed.
- CHECK: This constraint limits the range of values that can be entered into a column. For example, if your application is intended only for residents of Alaska, you could add a CHECK constraint on a ZIP code column to only allow entries between 99501 and 99950.
If you’d like to learn more about database management systems, check out our article on A Comparison of NoSQL Database Management Systems and Models.
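To make these constraints concrete, here is a small sketch using Python’s built-in sqlite3 module; the table and column names are invented for illustration, and the exact syntax varies slightly between RDBMSs:

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('PRAGMA foreign_keys = ON')  # SQLite only enforces FOREIGN KEYs when this is enabled

conn.execute('''
    CREATE TABLE states (
        abbrev TEXT PRIMARY KEY                             -- UNIQUE and NOT NULL combined
    )
''')

conn.execute('''
    CREATE TABLE customers (
        id INTEGER PRIMARY KEY,
        email TEXT UNIQUE,                                  -- no two customers may share an email
        name TEXT NOT NULL,                                 -- NULL entries are rejected
        state TEXT REFERENCES states(abbrev),               -- FOREIGN KEY into the parent table
        zip INTEGER CHECK (zip BETWEEN 99501 AND 99950)     -- Alaska ZIP codes only
    )
''')
conn.commit()
```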
Now that we’ve covered relational database management systems generally, let’s move onto the first of the three open-source relational databases this article will cover: SQLite.
SQLite is a self-contained, file-based, and fully open-source RDBMS known for its portability, reliability, and strong performance even in low-memory environments. Its transactions are ACID-compliant, even in cases where the system crashes or undergoes a power outage.
The SQLite project’s website describes it as a “serverless” database. Most relational database engines are implemented as a server process in which programs communicate with the host server through an interprocess communication that relays requests. In contrast, SQLite allows any process that accesses the database to read and write to the database disk file directly. This simplifies SQLite’s setup process, since it eliminates any need to configure a server process. Likewise, there’s no configuration necessary for programs that will use the SQLite database: all they need is access to the disk.
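For example, using SQLite from Python requires nothing more than the standard library’s sqlite3 module and a path to a database file; the file and table names below are arbitrary:

```python
import sqlite3

# Opening the database is just opening (or creating) a file; there is no server to configure.
conn = sqlite3.connect('app.db')
conn.execute('CREATE TABLE IF NOT EXISTS notes (id INTEGER PRIMARY KEY, body TEXT NOT NULL)')
conn.execute('INSERT INTO notes (body) VALUES (?)', ('hello from SQLite',))
conn.commit()

for row in conn.execute('SELECT id, body FROM notes'):
    print(row)
conn.close()
```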
SQLite is free and open-source software, and no special license is required to use it. However, the project does offer several extensions — each for a one-time fee — that help with compression and encryption. Additionally, the project offers various commercial support packages, each for an annual fee.
SQLite allows a variety of data types, organized into the following storage classes:
| Data Type | Explanation |
| --- | --- |
| null | Includes any NULL values. |
| integer | Signed integers, stored in 1, 2, 3, 4, 6, or 8 bytes depending on the magnitude of the value. |
| real | Real numbers, or floating point values, stored as 8-byte floating point numbers. |
| text | Text strings stored using the database encoding, which can either be UTF-8, UTF-16BE, or UTF-16LE. |
| blob | Any blob of data, with every blob stored exactly as it was input. |
In the context of SQLite, the terms “storage class” and “data type” are considered interchangeable. If you’d like to learn more about SQLite’s data types and SQLite type affinity, check out SQLite’s official documentation on the subject.
According to the DB-Engines Ranking, MySQL has been the most popular open-source RDBMS since the site began tracking database popularity in 2012. It is a feature-rich product that powers many of the world’s largest websites and applications, including Twitter, Facebook, Netflix, and Spotify. Getting started with MySQL is relatively straightforward, thanks in large part to its exhaustive documentation and large community of developers, as well as the abundance of MySQL-related resources online.
MySQL was designed for speed and reliability, at the expense of full adherence to standard SQL. The MySQL developers continually work towards closer adherence to standard SQL, but it still lags behind other SQL implementations. It does, however, come with various SQL modes and extensions that bring it closer to compliance.
Unlike applications using SQLite, applications using a MySQL database access it through a separate daemon process. Because the server process stands between the database and other applications, it allows for greater control over who has access to the database.
MySQL has inspired a wealth of third-party applications, tools, and integrated libraries that extend its functionality and help make it easier to work with. Some of the more widely-used of these third-party tools are phpMyAdmin, DBeaver, and HeidiSQL.
MySQL’s data types can be organized into three broad categories: numeric types, date and time types, and string types.
Numeric types:
| Data Type | Explanation |
| --- | --- |
| tinyint | A very small integer. The signed range for this numeric data type is -128 to 127, while the unsigned range is 0 to 255. |
| smallint | A small integer. The signed range for this numeric type is -32768 to 32767, while the unsigned range is 0 to 65535. |
| mediumint | A medium-sized integer. The signed range for this numeric data type is -8388608 to 8388607, while the unsigned range is 0 to 16777215. |
| int or integer | A normal-sized integer. The signed range for this numeric data type is -2147483648 to 2147483647, while the unsigned range is 0 to 4294967295. |
| bigint | A large integer. The signed range for this numeric data type is -9223372036854775808 to 9223372036854775807, while the unsigned range is 0 to 18446744073709551615. |
| float | A small (single-precision) floating-point number. |
| double, double precision, or real | A normal sized (double-precision) floating-point number. |
| dec, decimal, fixed, or numeric | A packed fixed-point number. The display length of entries for this data type is defined when the column is created, and every entry adheres to that length. |
| bool or boolean | A Boolean is a data type that only has two possible values, usually either true or false. |
| bit | A bit value type for which you can specify the number of bits per value, from 1 to 64. |
Date and time types:
| Data Type | Explanation |
| --- | --- |
| date | A date, represented as YYYY-MM-DD. |
| datetime | A timestamp showing the date and time, displayed as YYYY-MM-DD HH:MM:SS. |
| timestamp | A timestamp indicating the amount of time since the Unix epoch (00:00:00 on January 1, 1970). |
| time | A time of day, displayed as HH:MM:SS. |
| year | A year expressed in either a 2 or 4 digit format, with 4 digits being the default. |
String types:
| Data Type | Explanation |
| --- | --- |
| char | A fixed-length string; entries of this type are padded on the right with spaces to meet the specified length when stored. |
| varchar | A string of variable length. |
| binary | Similar to the char type, but a binary byte string of a specified length rather than a nonbinary character string. |
| varbinary | Similar to the varchar type, but a binary byte string of a variable length rather than a nonbinary character string. |
| blob | A binary string with a maximum length of 65535 (2^16 - 1) bytes of data. |
| tinyblob | A blob column with a maximum length of 255 (2^8 - 1) bytes of data. |
| mediumblob | A blob column with a maximum length of 16777215 (2^24 - 1) bytes of data. |
| longblob | A blob column with a maximum length of 4294967295 (2^32 - 1) bytes of data. |
| text | A string with a maximum length of 65535 (2^16 - 1) characters. |
| tinytext | A text column with a maximum length of 255 (2^8 - 1) characters. |
| mediumtext | A text column with a maximum length of 16777215 (2^24 - 1) characters. |
| longtext | A text column with a maximum length of 4294967295 (2^32 - 1) characters. |
| enum | An enumeration, which is a string object that takes a single value from a list of values that are declared when the table is created. |
| set | Similar to an enumeration, a string object that can have zero or more values, each of which must be chosen from a list of allowed values that are specified when the table is created. |
One notable gap in MySQL’s adherence to standard SQL is that it does not support FULL JOIN clauses.
PostgreSQL, also known as Postgres, bills itself as “the most advanced open-source relational database in the world.” It was created with the goal of being highly extensible and standards compliant. PostgreSQL is an object-relational database, meaning that although it’s primarily a relational database it also includes features — like table inheritance and function overloading — that are more often associated with object databases.
Postgres is capable of efficiently handling multiple tasks at the same time, a characteristic known as concurrency. It achieves this without read locks thanks to its implementation of Multiversion Concurrency Control (MVCC), which ensures the atomicity, consistency, isolation, and durability of its transactions, also known as ACID compliance.
PostgreSQL isn’t as widely used as MySQL, but there are still a number of third-party tools and libraries designed to simplify working with PostgreSQL, including pgAdmin and Postbird.
PostgreSQL supports numeric, string, and date and time data types like MySQL. In addition, it supports data types for geometric shapes, network addresses, bit strings, text searches, and JSON entries, as well as several idiosyncratic data types.
Numeric types:
| Data Type | Explanation |
| --- | --- |
| bigint | A signed 8 byte integer. |
| bigserial | An auto-incrementing 8 byte integer. |
| double precision | An 8 byte double precision floating-point number. |
| integer | A signed 4 byte integer. |
| numeric or decimal | A number of selectable precision, recommended for use in cases where exactness is crucial, such as monetary amounts. |
| real | A 4 byte single precision floating-point number. |
| smallint | A signed 2 byte integer. |
| smallserial | An auto-incrementing 2 byte integer. |
| serial | An auto-incrementing 4 byte integer. |
Character types:
| Data Type | Explanation |
| --- | --- |
| character | A character string with a specified fixed length. |
| character varying or varchar | A character string with a variable but limited length. |
| text | A character string of a variable, unlimited length. |
Date and time types:
| Data Type | Explanation |
| --- | --- |
| date | A calendar date consisting of the day, month, and year. |
| interval | A time span. |
| time or time without time zone | A time of day, not including the time zone. |
| time with time zone | A time of day, including the time zone. |
| timestamp or timestamp without time zone | A date and time, not including the time zone. |
| timestamp with time zone | A date and time, including the time zone. |
Geometric types:
| Data Type | Explanation |
| --- | --- |
| box | A rectangular box on a plane. |
| circle | A circle on a plane. |
| line | An infinite line on a plane. |
| lseg | A line segment on a plane. |
| path | A geometric path on a plane. |
| point | A geometric point on a plane. |
| polygon | A closed geometric path on a plane. |
Network address types:
| Data Type | Explanation |
| --- | --- |
| cidr | An IPv4 or IPv6 network address. |
| inet | An IPv4 or IPv6 host address. |
| macaddr | A Media Access Control (MAC) address. |
Bit string types:
| Data Type | Explanation |
| --- | --- |
| bit | A fixed-length bit string. |
| bit varying | A variable-length bit string. |
Text search types:
| Data Type | Explanation |
| --- | --- |
| tsquery | A text search query. |
| tsvector | A text search document. |
JSON types:
| Data Type | Explanation |
| --- | --- |
| json | Textual JSON data. |
| jsonb | Decomposed binary JSON data. |
Other data types:
| Data Type | Explanation |
| --- | --- |
| boolean | A logical Boolean, representing either true or false. |
| bytea | Short for “byte array”, this type is used for binary data. |
| money | An amount of currency. |
| pg_lsn | A PostgreSQL Log Sequence Number. |
| txid_snapshot | A user-level transaction ID snapshot. |
| uuid | A universally unique identifier. |
| xml | XML data. |
Today, SQLite, MySQL, and PostgreSQL are the three most popular open-source relational database management systems in the world. Each has its own unique features and limitations, and excels in particular scenarios. There are quite a few variables at play when deciding on an RDBMS, and the choice is rarely as simple as picking the fastest one or the one with the most features. The next time you’re in need of a relational database solution, be sure to research these and other tools in depth to find the one that best suits your needs.
If you’d like to learn more about SQL and how to use it to manage a relational database, we encourage you to refer to our How To Manage an SQL Database cheat sheet. On the other hand, if you’d like to learn about non-relational (or NoSQL) databases, check out our Comparison Of NoSQL Database Management Systems.
Continuous integration, delivery, and deployment, known collectively as CI/CD, is an integral part of modern development intended to reduce errors during integration and deployment while increasing project velocity. CI/CD is a philosophy and set of practices often augmented by robust tooling that emphasize automated testing at each stage of the software pipeline. By incorporating these ideas into your practice, you can reduce the time required to integrate changes for a release and thoroughly test each change before moving it into production.
CI/CD has many potential benefits, but successful implementation often requires a good deal of consideration. Deciding exactly how to use the tools and what changes you might need in your processes can be challenging without extensive trial and error. However, while all implementations will be different, adhering to best practices can help you avoid common problems and improve faster.
In this guide, we’ll introduce some basic guidance on how to implement and maintain a CI/CD system to best serve your organization’s needs. We’ll cover a number of practices that will help you improve the effectiveness of your CI/CD service. Feel free to read through as written or skip ahead to areas that interest you.
CI/CD pipelines help shepherd changes through automated testing cycles, out to staging environments, and finally to production. The more comprehensive your testing pipelines are, the more confident you can be that changes won’t introduce unforeseen side effects into your production deployment. However, since each change must go through this process, keeping your pipelines fast and dependable is incredibly important.
The tension between these two requirements can be difficult to balance. There are some straightforward steps you can take to improve speed, like scaling out your CI/CD infrastructure and optimizing tests. However, as time goes on, you may be forced to make critical decisions about the relative value of different tests and the stage or order where they are run. Sometimes, paring down your test suite by removing tests with low value or with indeterminate conclusions is the smartest way to maintain the speed required by heavily used pipelines.
When making these significant decisions, make sure you understand and document the trade-offs you are making. Consult with team members and stakeholders to align the team’s assumptions about what the test suite is responsible for and what the primary areas of focus should be.
From an operational security standpoint, your CI/CD system represents some of the most critical infrastructure to protect. Since the CI/CD system has complete access to your codebase and credentials to deploy in various environments, it is essential to secure it. Due to its high value as a target, it is important to isolate and lock down your CI/CD as much as possible.
CI/CD systems should be deployed to internal, protected networks, unexposed to outside parties. Setting up VPNs or other network access control technology is recommended to ensure that only authenticated operators are able to access your system. Depending on the complexity of your network topology, your CI/CD system may need to access several different networks to deploy code to different environments. If not properly secured or isolated, attackers that gain access to one environment may be able to island hop, a technique used to expand access by taking advantage of more lenient internal networking rules, to gain access to other environments through weaknesses in your CI/CD servers.
The required isolation and security strategies will depend heavily on your network topology, infrastructure, and your management and development requirements. The important point to keep in mind is that your CI/CD systems are highly valuable targets and in many cases, they have a broad degree of access to your other vital systems. Shielding all external access to the servers and tightly controlling the types of internal access allowed will help reduce the risk of your CI/CD system being compromised.
Part of what makes it possible for CI/CD to improve your development practices and code quality is that tooling often helps enforce best practices for testing and deployment. Promoting code through your CI/CD pipelines requires each change to demonstrate that it adheres to your organization’s codified standards and procedures. Failures in a CI/CD pipeline are immediately visible and halt the advancement of the affected release to later stages of the cycle. This is a gatekeeping mechanism that safeguards the more important environments from untrusted code.
To realize these advantages, however, you need to ensure that every change to your production environment goes through your pipeline. The CI/CD pipeline should be the only mechanism by which code enters the production environment. This can happen automatically at the end of successfully testing with continuous deployment practices, or through a manual promotion of tested changes approved and made available by your CI/CD system.
Frequently, teams start using their pipelines for deployment, but begin making exceptions when problems occur and there is pressure to resolve them quickly. While downtime and other issues should be mitigated as soon as possible, it is important to understand that the CI/CD system is a good tool to ensure that your changes are not introducing other bugs or further breaking the system. Putting your fix through the pipeline (or just using the CI/CD system to rollback) will also prevent the next deployment from erasing an ad hoc hotfix that was applied directly to production. The pipeline protects the validity of your deployments regardless of whether this was a regular, planned release, or a fast fix to resolve an ongoing issue. This use of the CI/CD system is yet another reason to work to keep your pipeline fast.
CI/CD pipelines promote changes through a series of test suites and deployment environments. Changes that pass the requirements of one stage are either automatically deployed or queued for manual deployment into more restrictive environments. Early stages are meant to prove that it’s worthwhile to continue testing and pushing the changes closer to production.
For later stages especially, reproducing the production environment as closely as possible in the testing environments helps ensure that the tests accurately reflect how the change would behave in production. Significant differences between staging and production can allow problematic changes to be released that were never observed to be faulty in testing. The more differences between your live environment and the testing environment, the less your tests will measure how the code will perform when released.
Some differences between staging and production are expected, but keeping them manageable and making sure they are well-understood is essential. Some organizations use blue-green deployments to swap production traffic between two nearly identical environments that alternate between being designated production and staging. Less extreme strategies involve deploying the same configuration and infrastructure from production to your staging environment, but at a reduced scale. Items like network endpoints might differ between your environments, but parameterization of this type of variable data can help make sure that the code is consistent and that the environmental differences are well-defined.
A primary goal of a CI/CD pipeline is to build confidence in your changes and minimize the chance of unexpected impact. We discussed the importance of maintaining parity between environments, but one component of this is important enough to warrant extra attention. If your software requires a building, packaging, or bundling step, that step should be executed only once and the resulting output should be reused throughout the entire pipeline.
This guideline helps prevent problems that arise when software is compiled or packaged multiple times, allowing slight inconsistencies to be injected into the resulting artifacts. Building the software separately at each new stage can mean the tests in earlier environments weren’t targeting the same software that will be deployed later, invalidating the results.
To avoid this problem, CI systems should include a build process as the first step in the pipeline that creates and packages the software in a clean environment. The resulting artifact should be versioned and uploaded to an artifact storage system to be pulled down by subsequent stages of the pipeline, ensuring that the build does not change as it progresses through the system.
While keeping your entire pipeline fast is a great general goal, parts of your test suite will inevitably be faster than others. Because the CI/CD system serves as a conduit for all changes entering your system, discovering failures as early as possible is important to minimize the resources devoted to problematic builds. To achieve this, prioritize and run your fastest tests first. Save complex, long-running tests until after you’ve validated the build with smaller, quick-running tests.
This strategy has a number of benefits that can help keep your CI/CD process healthy. It encourages you to understand the performance impact of individual tests, allows you to complete most of your tests early, and increases the likelihood of fast failures, which means that problematic changes can be reverted or fixed before blocking other members’ work.
Test prioritization usually means running your project’s unit tests first since those tend to be quick, isolated, and component focused. Afterwards, integration tests typically represent the next level of complexity and speed, followed by system-wide tests, and finally acceptance tests, which often require some level of human interaction.
One of the main principles of CI/CD is to integrate changes into the primary shared repository early and often. This helps avoid costly integration problems down the line when multiple developers attempt to merge large, divergent, and conflicting changes into the main branch of the repository in preparation for release. Typically, CI/CD systems are set to monitor and test the changes committed to only one or a few branches.
To take advantage of the benefits that CI provides, it is best to limit the number and scope of branches in your repository. Most implementations suggest that developers commit directly to the main branch or merge changes from their local branches in at least once a day.
Essentially, branches that are not being tracked by your CI/CD system contain untested code that should be regarded as a liability to your project’s success and momentum. Minimizing branching to encourage early integration of different developers’ code helps leverage the strengths of the system, and prevents developers from negating the advantages it provides.
Related to the earlier point about discovering failures early, developers should be encouraged to run some tests locally prior to committing to the shared repository. This makes it possible to detect certain problematic changes before they block other team members. While the local development environment is unlikely to be able to run the entire test suite in a production-like environment, this extra step gives individuals more confidence that the changes they are making pass basic tests and are worth trying to integrate with the larger codebase.
To ensure that developers can test effectively on their own, your test suite should be runnable with a single command that can be run from any environment. The same command used by developers on their local machines should be used by the CI/CD system to kick off tests on code merged to the repository. Often, this is coordinated by providing a shell script or makefile to automate running the testing tools in a repeatable, predictable manner.
To help ensure that your tests run the same at various stages, it’s often a good idea to use clean, ephemeral testing environments when possible. Usually, this means running tests in containers to abstract differences between the host systems and to provide a standard API for hooking together components at various scales. Since containers run with minimal state, residual side effects from testing are not inherited by subsequent runs of the test suite, which could taint the results.
Another benefit of containerized testing environments is the portability of your testing infrastructure. With containers, developers have an easier time replicating the configuration that will be used later on in the pipeline without having to either manually set up and maintain infrastructure or sacrifice environmental fidelity. Since containers can be spun up easily when needed and then destroyed, users can make fewer compromises with regard to the accuracy of their testing environment when running local tests. In general, using containers locks in some aspects of the runtime environment to help minimize differences between pipeline stages.
While each CI/CD implementation will be different, following some of these basic principles will help you avoid some common pitfalls and strengthen your testing and development practices. As with most aspects of continuous integration, a mixture of process, tooling, and habit will help make development changes more successful and impactful.
To learn more about general CI/CD practices and how to set up various CI/CD services, check out other articles with the CI/CD tag.
Modern stateless applications are built and designed to run in software containers like Docker, and be managed by container clusters like Kubernetes. They are developed using Cloud Native and Twelve Factor principles and patterns, to minimize manual intervention and maximize portability and redundancy. Migrating virtual machine- or bare metal-based applications into containers (known as “containerizing”) and deploying them inside of clusters often involves significant shifts in how these apps are built, packaged, and delivered.
Building on Architecting Applications for Kubernetes, in this conceptual guide, we’ll discuss high-level steps for modernizing your applications, with the end goal of running and managing them in a Kubernetes cluster. Although you can run stateful applications like databases on Kubernetes, this guide focuses on migrating and modernizing stateless applications, with persistent data offloaded to an external data store. Kubernetes provides advanced functionality for efficiently managing and scaling stateless applications, and we’ll explore the application and infrastructure changes necessary for running scalable, observable, and portable apps on Kubernetes.
Before containerizing your application or writing Kubernetes Pod and Deployment configuration files, you should implement application-level changes to maximize your app’s portability and observability in Kubernetes. Kubernetes is a highly automated environment that can automatically deploy and restart failing application containers, so it’s important to build in the appropriate application logic to communicate with the container orchestrator and allow it to automatically scale your app as necessary.
One of the first application-level changes to implement is extracting application configuration from application code. Configuration consists of any information that varies across deployments and environments, like service endpoints, database addresses, credentials, and various parameters and options. For example, if you have two environments, say staging and production, and each contains a separate database, your application should not have the database endpoint and credentials explicitly declared in the code. Instead, these values should be stored in a separate location, either as variables in the running environment, a local file, or an external key-value store, from which they are read into the app.
Hardcoding these parameters into your code poses a security risk as this config data often consists of sensitive information, which you then check in to your version control system. It also increases complexity as you now have to maintain multiple versions of your application, each consisting of the same core application logic, but varying slightly in configuration. As applications and their configuration data grow, hardcoding config into app code quickly becomes unwieldy.
By extracting configuration values from your application code, and instead ingesting them from the running environment or local files, your app becomes a generic, portable package that can be deployed into any environment, provided you supply it with accompanying configuration data. Container software like Docker and cluster software like Kubernetes have been designed around this paradigm, building in features for managing configuration data and injecting it into application containers. These features will be covered in more detail in the Containerizing and Kubernetes sections.
Here’s a quick example demonstrating how to externalize two config values, DB_HOST and DB_USER, from a Python Flask app’s code. We’ll make them available in the app’s running environment as env vars, from which the app will read them:
from flask import Flask

# Configuration values hardcoded directly in the application code
DB_HOST = 'mydb.mycloud.com'
DB_USER = 'sammy'

app = Flask(__name__)

@app.route('/')
def print_config():
    output = 'DB_HOST: {} -- DB_USER: {}'.format(DB_HOST, DB_USER)
    return output
Running this app (consult the Flask Quickstart to learn how) and visiting its web endpoint will display a page containing these two config values.
Now, here’s the same example with the config values externalized to the app’s running environment:
import os
from flask import Flask

# Configuration values read from the running environment instead of the code
DB_HOST = os.environ.get('APP_DB_HOST')
DB_USER = os.environ.get('APP_DB_USER')

app = Flask(__name__)

@app.route('/')
def print_config():
    output = 'DB_HOST: {} -- DB_USER: {}'.format(DB_HOST, DB_USER)
    return output
Before running the app, we set the necessary config variables in the local environment:
- export APP_DB_HOST=mydb.mycloud.com
- export APP_DB_USER=sammy
- flask run
The displayed web page should contain the same text as in the first example, but the app’s config can now be modified independently of the application code. You can use a similar approach to read in config parameters from a local file.
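As a rough sketch of that file-based approach, the example below reads the same two values from a JSON file; the config.json file name and its format are assumptions for illustration, not part of the original example:

```python
import json
from flask import Flask

# Load configuration from a local file, e.g. {"DB_HOST": "mydb.mycloud.com", "DB_USER": "sammy"}
with open('config.json') as f:
    config = json.load(f)

DB_HOST = config.get('DB_HOST')
DB_USER = config.get('DB_USER')

app = Flask(__name__)

@app.route('/')
def print_config():
    output = 'DB_HOST: {} -- DB_USER: {}'.format(DB_HOST, DB_USER)
    return output
```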
In the next section we’ll discuss moving application state outside of containers.
Cloud Native applications run in containers, and are dynamically orchestrated by cluster software like Kubernetes or Docker Swarm. A given app or service can be load balanced across multiple replicas, and any individual app container should be able to fail, with minimal or no disruption of service for clients. To enable this horizontal, redundant scaling, applications must be designed in a stateless fashion. This means that they respond to client requests without storing persistent client and application data locally, and at any point in time if the running app container is destroyed or restarted, critical data is not lost.
For example, if you are running an address book application and your app adds, removes and modifies contacts from an address book, the address book data store should be an external database or other data store, and the only data kept in container memory should be short-term in nature, and disposable without critical loss of information. Data that persists across user visits like sessions should also be moved to external data stores like Redis. Wherever possible, you should offload any state from your app to services like managed databases or caches.
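As a rough sketch of offloading that kind of short-lived state, the snippet below stores session data in Redis using the third-party redis package; the hostname, key naming scheme, and one-hour expiry are illustrative assumptions:

```python
import redis

# Connect to an external Redis instance rather than keeping state in container memory.
r = redis.Redis(host='redis.internal.example.com', port=6379, db=0)

def save_session(session_id: str, user_id: str) -> None:
    # Expire the session after one hour; losing this container loses nothing critical.
    r.setex('session:{}'.format(session_id), 3600, user_id)

def load_session(session_id: str):
    value = r.get('session:{}'.format(session_id))
    return value.decode('utf-8') if value else None
```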
For stateful applications that require a persistent data store (like a replicated MySQL database), Kubernetes builds in features for attaching persistent block storage volumes to containers and Pods. To ensure that a Pod can maintain state and access the same persistent volume after a restart, the StatefulSet workload must be used. StatefulSets are ideal for deploying databases and other long-running data stores to Kubernetes.
Stateless containers enable maximum portability and full use of available cloud resources, allowing the Kubernetes scheduler to quickly scale your app up and down and launch Pods wherever resources are available. If you don’t require the stability and ordering guarantees provided by the StatefulSet workload, you should use the Deployment workload to manage and scale your applications.
To learn more about the design and architecture of stateless, Cloud Native microservices, consult our Kubernetes White Paper.
In the Kubernetes model, the cluster control plane can be relied on to repair a broken application or service. It does this by checking the health of application Pods, and restarting or rescheduling unhealthy or unresponsive containers. By default, if your application container is running, Kubernetes sees your Pod as “healthy.” In many cases this is a reliable indicator for the health of a running application. However, if your application is deadlocked and not performing any meaningful work, the app process and container will continue to run indefinitely, and by default Kubernetes will keep the stalled container alive.
To properly communicate application health to the Kubernetes control plane, you should implement custom application health checks that indicate when an application is both running and ready to receive traffic. The first type of health check is called a readiness probe, and lets Kubernetes know when your application is ready to receive traffic. The second type of check is called a liveness probe, and lets Kubernetes know when your application is healthy and running. The Kubelet Node agent can perform these probes on running Pods using 3 different methods:
- Executing a command inside the container and checking that it exits successfully
- Making an HTTP GET request against an endpoint (such as /health), which succeeds if the response status is between 200 and 399
- Opening a TCP socket to the container on a specified port, which succeeds if the connection can be established
You should choose the appropriate method depending on the running application(s), programming language, and framework. The readiness and liveness probes can both use the same probe method and perform the same check, but the inclusion of a readiness probe will ensure that the Pod doesn’t receive traffic until the probe begins succeeding.
When planning and thinking about containerizing your application and running it on Kubernetes, you should allocate planning time for defining what “healthy” and “ready” mean for your particular application, and development time for implementing and testing the endpoints and/or check commands.
Here’s a minimal health endpoint for the Flask example referenced above:
. . .
@app.route('/')
def print_config():
    output = 'DB_HOST: {} -- DB_USER: {}'.format(DB_HOST, DB_USER)
    return output

# Health check endpoint used by the Kubernetes probes
@app.route('/health')
def return_ok():
    return 'Ok!', 200
A Kubernetes liveness probe that checks this path would then look something like this:
. . .
livenessProbe:
  httpGet:
    path: /health
    port: 80
  initialDelaySeconds: 5
  periodSeconds: 2
The initialDelaySeconds field specifies that Kubernetes (specifically the Node Kubelet) should probe the /health endpoint after waiting 5 seconds, and periodSeconds tells the Kubelet to probe /health every 2 seconds.
To learn more about liveness and readiness probes, consult the Kubernetes documentation.
When running your containerized application in an environment like Kubernetes, it’s important to publish telemetry and logging data to monitor and debug your application’s performance. Building in features to publish performance metrics like response duration and error rates will help you monitor your application and alert you when your application is unhealthy.
One tool you can use to monitor your services is Prometheus, an open-source systems monitoring and alerting toolkit, hosted by the Cloud Native Computing Foundation (CNCF). Prometheus provides several client libraries for instrumenting your code with various metric types to count events and their durations. For example, if you’re using the Flask Python framework, you can use the Prometheus Python client to add decorators to your request processing functions to track the time spent processing requests. These metrics can then be scraped by Prometheus at an HTTP endpoint like /metrics.
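A minimal sketch of that kind of instrumentation with the Prometheus Python client might look like the following; the metric names and the way the /metrics endpoint is wired up are illustrative choices, not the only option:

```python
from flask import Flask
from prometheus_client import Counter, Histogram, generate_latest, CONTENT_TYPE_LATEST

app = Flask(__name__)

REQUEST_COUNT = Counter('app_requests_total', 'Total HTTP requests served')
REQUEST_LATENCY = Histogram('app_request_latency_seconds', 'Time spent processing requests')

@app.route('/')
@REQUEST_LATENCY.time()        # records the duration of each request
def index():
    REQUEST_COUNT.inc()        # counts each request served
    return 'Hello!'

@app.route('/metrics')
def metrics():
    # Prometheus scrapes this endpoint on its regular pull schedule.
    return generate_latest(), 200, {'Content-Type': CONTENT_TYPE_LATEST}
```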
A helpful method to use when designing your app’s instrumentation is the RED method. It consists of the following three key request metrics:
- Rate: the number of requests your application is serving per second
- Errors: the number of those requests that fail
- Duration: the amount of time it takes your application to serve a request
This minimal set of metrics should give you enough data to generate alerts when your application’s performance degrades. Implementing this instrumentation along with the health checks discussed above will allow you to quickly detect and recover from a failing application.
To learn more about signals to measure when monitoring your applications, consult Monitoring Distributed Systems from the Google Site Reliability Engineering book.
In addition to thinking about and designing features for publishing telemetry data, you should also decide how your application will log in a distributed cluster-based environment. You should ideally remove hardcoded configuration references to local log files and log directories, and instead log directly to stdout and stderr. You should treat logs as a continuous event stream, or sequence of time-ordered events. This output stream will then get captured by the container enveloping your application, from which it can be forwarded to a logging layer like the EFK (Elasticsearch, Fluentd, and Kibana) stack. Kubernetes provides a lot of flexibility in designing your logging architecture, which we’ll explore in more detail below.
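In Python, for example, pointing the standard logging module at stdout is enough to produce the event stream described above; this is a minimal sketch with an illustrative logger name and format:

```python
import logging
import sys

# Emit logs to stdout as a stream of events; the container runtime captures this output.
logging.basicConfig(
    stream=sys.stdout,
    level=logging.INFO,
    format='%(asctime)s %(levelname)s %(name)s %(message)s',
)

logger = logging.getLogger('myapp')
logger.info('application started')
```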
Once your application is containerized and up and running in a cluster environment like Kubernetes, you may no longer have shell access to the container running your app. If you’ve implemented adequate health checking, logging, and monitoring, you can quickly be alerted of and debug production issues, but taking action beyond restarting and redeploying containers may be difficult. For quick operational and maintenance fixes like flushing queues or clearing a cache, you should implement the appropriate API endpoints so that you can perform these operations without having to restart containers or docker exec into running containers. Containers should be treated as immutable objects, and manual administration should be avoided in a production environment. If you must perform one-off administrative tasks, like clearing caches, you should expose this functionality via the API.
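For instance, a hypothetical cache-flush endpoint in a Flask app might look like the sketch below; the route, the in-memory cache stand-in, and the lack of authentication are all simplifications you would replace in a real service:

```python
from flask import Flask

app = Flask(__name__)
cache = {}  # stands in for whatever cache the application actually uses

@app.route('/admin/flush-cache', methods=['POST'])
def flush_cache():
    # One-off maintenance task exposed through the API instead of docker exec.
    cache.clear()
    return 'Cache flushed\n', 200
```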
In these sections we’ve discussed application-level changes you may wish to implement before containerizing your application and moving it to Kubernetes. For a more in-depth walkthrough on building Cloud Native apps, consult Architecting Applications for Kubernetes.
We’ll now discuss some considerations to keep in mind when building containers for your apps.
Now that you’ve implemented app logic to maximize its portability and observability in a cloud-based environment, it’s time to package your app inside of a container. For the purposes of this guide, we’ll use Docker containers, but you should use whichever container implementation best suits your production needs.
Before creating a Dockerfile for your application, one of the first steps is taking stock of the software and operating system dependencies your application needs to run correctly. Dockerfiles allow you to explicitly version every piece of software installed into the image, and you should take advantage of this feature by explicitly declaring the parent image, software library, and programming language versions.
Avoid latest tags and unversioned packages as much as possible, as these can shift, potentially breaking your application. You may wish to create a private registry or private mirror of a public registry to exert more control over image versioning and to prevent upstream changes from unintentionally breaking your image builds.
To learn more about setting up a private image registry, consult Deploy a Registry Server from the Docker official documentation and the Registries section below.
When deploying and pulling container images, large images can significantly slow things down and add to your bandwidth costs. Packaging a minimal set of tools and application files into an image provides several benefits: images are faster to build, push, and pull; containers start more quickly; and a smaller footprint leaves less surface area for security vulnerabilities.
Some steps you can consider when building your images:
alpine
or build from scratch
instead of a fully featured OS like ubuntu
For a full guide on optimizing Docker containers, including many illustrative examples, consult Building Optimized Containers for Kubernetes.
Docker provides several helpful features for injecting configuration data into your app’s running environment.
One option for doing this is specifying environment variables and their values in the Dockerfile using the ENV
statement, so that configuration data is built-in to images:
...
ENV MYSQL_USER=my_db_user
...
Your app can then parse these values from its running environment and configure its settings appropriately.
You can also pass in environment variables as parameters when starting a container using docker run
and the -e
flag:
- docker run -e MYSQL_USER='my_db_user' IMAGE[:TAG]
Finally, you can use an env file, containing a list of environment variables and their values. To do this, create the file and use the --env-file
parameter to pass it in to the command:
- docker run --env-file var_list IMAGE[:TAG]
If you’re modernizing your application to run it using a cluster manager like Kubernetes, you should further externalize your config from the image, and manage configuration using Kubernetes’ built-in ConfigMap and Secrets objects. This allows you to separate configuration from image manifests, so that you can manage and version it separately from your application. To learn how to externalize configuration using ConfigMaps and Secrets, consult the ConfigMaps and Secrets section below.
Once you’ve built your application images, to make them available to Kubernetes, you should upload them to a container image registry. Public registries like Docker Hub host the latest Docker images for popular open source projects like Node.js and nginx. Private registries allow you to publish your internal application images, making them available to developers and infrastructure, but not the wider world.
You can deploy a private registry using your existing infrastructure (e.g. on top of cloud object storage), or optionally use one of several Docker registry products like Quay.io or paid Docker Hub plans. These registries can integrate with hosted version control services like GitHub so that when a Dockerfile is updated and pushed, the registry service will automatically pull the new Dockerfile, build the container image, and make the updated image available to your services.
To exert more control over the building and testing of your container images and their tagging and publishing, you can implement a continuous integration (CI) pipeline.
Building, testing, publishing and deploying your images into production manually can be error-prone and does not scale well. To manage builds and continuously publish containers containing your latest code changes to your image registry, you should use a build pipeline.
Most build pipelines perform the following core functions:
There are many paid continuous integration products that have built-in integrations with popular version control services like GitHub and image registries like Docker Hub. An alternative to these products is Jenkins, a free and open-source build automation server that can be configured to perform all of the functions described above. To learn how to set up a Jenkins continuous integration pipeline, consult How To Set Up Continuous Integration Pipelines in Jenkins on Ubuntu 20.04.
When working with containers, it’s important to think about the logging infrastructure you will use to manage and store logs for all your running and stopped containers. There are multiple container-level patterns you can use for logging, and also multiple Kubernetes-level patterns.
In Kubernetes, by default containers use the json-file
Docker logging driver, which captures the stdout and stderr streams and writes them to JSON files on the Node where the container is running. Sometimes logging directly to stderr and stdout may not be enough for your application container, and you may want to pair the app container with a logging sidecar container in a Kubernetes Pod. This sidecar container can then pick up logs from the filesystem, a local socket, or the systemd journal, granting you a little more flexibility than simply using the stderr and stdout streams. This container can also do some processing and then stream enriched logs to stdout/stderr, or directly to a logging backend. To learn more about Kubernetes logging patterns, consult the Kubernetes logging and monitoring section of this tutorial.
How your application logs at the container level will depend on its complexity. For single-purpose microservices, logging directly to stdout/stderr and letting Kubernetes pick up these streams is the recommended approach, as you can then leverage the kubectl logs
command to access log streams from your Kubernetes-deployed containers.
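For example, assuming a Deployment named flask-app (a hypothetical name used only for illustration), you could follow the output of its Pods with:

- kubectl logs -f deployment/flask-app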
Similar to logging, you should begin thinking about monitoring in a container and cluster-based environment. Docker provides the helpful docker stats
command for grabbing standard metrics like CPU and memory usage for running containers on the host, and exposes even more metrics through the Remote REST API. Additionally, the open-source tool cAdvisor (installed on Kubernetes Nodes by default) provides more advanced functionality like historical metric collection, metric data export, and a helpful web UI for sorting through the data.
However, in a multi-node, multi-container production environment, more complex metrics stacks like Prometheus and Grafana may help organize and monitor your containers’ performance data.
In these sections, we briefly discussed some best practices for building containers, setting up a CI/CD pipeline and image registry, as well as some considerations for increasing observability into your containers.
In the next section, we’ll explore Kubernetes features that allow you to run and scale your containerized app in a cluster.
At this point, you’ve containerized your app and implemented logic to maximize its portability and observability in Cloud Native environments. We’ll now explore Kubernetes features that provide interfaces for managing and scaling your apps in a Kubernetes cluster.
Once you’ve containerized your application and published it to a registry, you can now deploy it into a Kubernetes cluster using the Pod workload. The smallest deployable unit in a Kubernetes cluster is not a container but a Pod. Pods typically consist of an application container (like a containerized Flask web app), or an app container and any “sidecar” containers that perform some helper function like monitoring or logging. Containers in a Pod share storage resources, a network namespace, and port space. They can communicate with each other using localhost
and can share data using mounted volumes. Additionally, the Pod workload allows you to define Init Containers that run setup scripts or utilities before the main app container begins running.
Pods are typically rolled out using Deployments, which are Controllers defined by YAML files that declare a particular desired state. For example, an application state could be running three replicas of the Flask web app container and exposing port 8080
. Once created, the control plane gradually brings the actual state of the cluster to match the desired state declared in the Deployment by scheduling containers onto Nodes as required. To scale the number of application replicas running in the cluster, say from 3 up to 5, you update the replicas
field of the Deployment configuration file, and then kubectl apply
the new configuration file. Using these configuration files, scaling and deployment operations can all be tracked and versioned using your existing source control services and integrations.
Here’s a sample Kubernetes Deployment configuration file for a Flask app:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: flask-app
  labels:
    app: flask-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: flask-app
  template:
    metadata:
      labels:
        app: flask-app
    spec:
      containers:
      - name: flask
        image: sammy/flask_app:1.0
        ports:
        - containerPort: 8080
This Deployment launches 3 Pods that run a container called flask
using the sammy/flask_app
image (version 1.0
) with port 8080
open. The Deployment is called flask-app
.
To learn more about Kubernetes Pods and Deployments, consult the Pods and Deployments sections of the official Kubernetes documentation.
Kubernetes manages Pod storage using Volumes, Persistent Volumes (PVs) and Persistent Volume Claims (PVCs). Volumes are the Kubernetes abstraction used to manage Pod storage, and support most cloud provider block storage offerings, as well as local storage on the Nodes hosting the running Pods. To see a full list of supported Volume types, consult the Kubernetes documentation.
For example, if your Pod contains two NGINX containers that need to share data between them (say the first, called nginx
serves web pages, and the second, called nginx-sync
fetches the pages from an external location and updates the pages served by the nginx
container), your Pod spec would look something like this (here we use the emptyDir
Volume type):
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx
    volumeMounts:
    - name: nginx-web
      mountPath: /usr/share/nginx/html
  - name: nginx-sync
    image: nginx-sync
    volumeMounts:
    - name: nginx-web
      mountPath: /web-data
  volumes:
  - name: nginx-web
    emptyDir: {}
We use a volumeMount
for each container, indicating that we’d like to mount the nginx-web
volume containing the web page files at /usr/share/nginx/html
in the nginx
container and at /web-data
in the nginx-sync
container. We also define a volume
called nginx-web
of type emptyDir
.
In a similar fashion, you can configure Pod storage using cloud block storage products by modifying the volume
type from emptyDir
to the relevant cloud storage volume type.
The lifecycle of a Volume is tied to the lifecycle of the Pod, but not to that of a container. If a container within a Pod dies, the Volume persists and the newly launched container will be able to mount the same Volume and access its data. When a Pod gets restarted or dies, so do its Volumes, although if the Volumes consist of cloud block storage, they will simply be unmounted with data still accessible by future Pods.
To preserve data across Pod restarts and updates, the PersistentVolume (PV) and PersistentVolumeClaim (PVC) objects must be used.
PersistentVolumes are abstractions representing pieces of persistent storage like cloud block storage volumes or NFS storage. They are created separately from PersistentVolumeClaims, which are demands for pieces of storage by developers. In their Pod configurations, developers request persistent storage using PVCs, which Kubernetes matches with available PV Volumes (if using cloud block storage, Kubernetes can dynamically create PersistentVolumes when PersistentVolumeClaims are created).
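As a brief illustration, the following PersistentVolumeClaim requests 5Gi of storage (the claim name and size are arbitrary examples), which a Pod can then mount by referencing the claim in its volumes section:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: flask-data
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi

...
  volumes:
  - name: app-data
    persistentVolumeClaim:
      claimName: flask-data
...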
If your application requires one persistent volume per replica, which is the case with many databases, you should not use Deployments but use the StatefulSet controller, which is designed for apps that require stable network identifiers, stable persistent storage, and ordering guarantees. Deployments should be used for stateless applications, and if you define a PersistentVolumeClaim for use in a Deployment configuration, that PVC will be shared by all the Deployment’s replicas.
To learn more about the StatefulSet controller, consult the Kubernetes documentation. To learn more about PersistentVolumes and PersistentVolume claims, consult the Kubernetes storage documentation.
Similar to Docker, Kubernetes provides the env
and envFrom
fields for setting environment variables in Pod configuration files. Here’s a sample snippet from a Pod configuration file that sets the HOSTNAME
environment variable in the running Pod to my_hostname
:
...
spec:
  containers:
  - name: nginx
    image: nginx:1.21.6
    ports:
    - containerPort: 80
    env:
    - name: HOSTNAME
      value: my_hostname
...
This allows you to move configuration out of Dockerfiles and into Pod and Deployment configuration files. A key advantage of further externalizing configuration from your Dockerfiles is that you can now modify these Kubernetes workload configurations (say, by changing the HOSTNAME
value to my_hostname_2
) separately from your application container definitions. Once you modify the Pod configuration file, you can then redeploy the Pod using its new environment, while the underlying container image (defined via its Dockerfile) does not need to be rebuilt, tested, and pushed to a repository. You can also version these Pod and Deployment configurations separately from your Dockerfiles, allowing you to quickly detect breaking changes and further separate config issues from application bugs.
Kubernetes provides another construct for further externalizing and managing configuration data: ConfigMaps and Secrets.
ConfigMaps allow you to save configuration data as objects that you then reference in your Pod and Deployment configuration files, so that you can avoid hardcoding configuration data and reuse it across Pods and Deployments.
Here’s an example, using the Pod config from above. We’ll first save the HOSTNAME
environment variable as a ConfigMap, and then reference it in the Pod config:
- kubectl create configmap hostname --from-literal=HOSTNAME=my_host_name
To reference it from the Pod configuration file, we use the valueFrom
and configMapKeyRef
constructs:
...
spec:
  containers:
  - name: nginx
    image: nginx:1.21.6
    ports:
    - containerPort: 80
    env:
    - name: HOSTNAME
      valueFrom:
        configMapKeyRef:
          name: hostname
          key: HOSTNAME
...
So the HOSTNAME
environment variable’s value has been completely externalized from configuration files. We can then update these variables across all Deployments and Pods referencing them, and restart the Pods for the changes to take effect.
If your applications use configuration files, ConfigMaps additionally allow you to store these files as ConfigMap objects (using the --from-file
flag), which you can then mount into containers as configuration files.
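For example, assuming a local configuration file called nginx.conf (a hypothetical file used only for illustration), you could store it as a ConfigMap and mount it into a container like this:

- kubectl create configmap nginx-config --from-file=nginx.conf

...
    volumeMounts:
    - name: config-volume
      mountPath: /etc/nginx/conf.d
  volumes:
  - name: config-volume
    configMap:
      name: nginx-config
...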
Secrets provide the same essential functionality as ConfigMaps, but should be used for sensitive data like database credentials as the values are base64-encoded.
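For example, you could store a database password as a Secret and reference it from a Pod using secretKeyRef (the names and value here are placeholders):

- kubectl create secret generic db-credentials --from-literal=DB_PASSWORD=my_password

...
    env:
    - name: DB_PASSWORD
      valueFrom:
        secretKeyRef:
          name: db-credentials
          key: DB_PASSWORD
...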
To learn more about ConfigMaps and Secrets consult the Kubernetes documentation.
Once you have your application up and running in Kubernetes, every Pod will be assigned an (internal) IP address, shared by its containers. If one of these Pods is removed or dies, newly started Pods will be assigned different IP addresses.
For long-running services that expose functionality to internal and/or external clients, you may wish to grant a set of Pods performing the same function (or Deployment) a stable IP address that load balances requests across its containers. You can do this using a Kubernetes Service.
Kubernetes Services have 4 types, specified by the type
field in the Service configuration file:
- ClusterIP: This is the default type, which grants the Service a stable internal IP accessible from anywhere inside of the cluster.
- NodePort: This will expose your Service on each Node at a static port, between 30000-32767 by default. When a request hits a Node at its Node IP address and the NodePort for your service, the request will be load balanced and routed to the application containers for your service.
- LoadBalancer: This will create a load balancer using your cloud provider’s load balancing product, and configure a NodePort and ClusterIP for your Service to which external requests will be routed.
- ExternalName: This Service type allows you to map a Kubernetes Service to a DNS record. It can be used for accessing external services from your Pods using Kubernetes DNS.

Note that creating a Service of type LoadBalancer for each Deployment running in your cluster will create a new cloud load balancer for each Service, which can become costly. To manage routing external requests to multiple services using a single load balancer, you can use an Ingress Controller. Ingress Controllers are beyond the scope of this article, but to learn more about them you can consult the Kubernetes documentation. A popular Ingress Controller is the NGINX Ingress Controller.
Here’s a Service configuration file for the Flask example used in the Pods and Deployments section of this guide:
apiVersion: v1
kind: Service
metadata:
  name: flask-svc
spec:
  ports:
  - port: 80
    targetPort: 8080
  selector:
    app: flask-app
  type: LoadBalancer
Here we choose to expose the flask-app
Deployment using this flask-svc
Service. We create a cloud load balancer to route traffic from load balancer port 80
to exposed container port 8080
.
To learn more about Kubernetes Services, consult the Services section of the Kubernetes docs.
Parsing through individual container and Pod logs using kubectl logs
and docker logs
can get tedious as the number of running applications grows. To help you debug application or cluster issues, you should implement centralized logging. At a high level, this consists of agents running on all the worker nodes that process Pod log files and streams, enrich them with metadata, and forward the logs off to a backend like Elasticsearch. From there, log data can be visualized, filtered, and organized using a visualization tool like Kibana.
In the container-level logging section, we discussed the recommended Kubernetes approach of having applications in containers log to the stdout/stderr streams. We also briefly discussed logging sidecar containers that can grant you more flexibility when logging from your application. You could also run logging agents directly in your Pods that capture local log data and forward them directly to your logging backend. Each approach has its pros and cons, and resource utilization tradeoffs (for example, running a logging agent container inside of each Pod can become resource-intensive and quickly overwhelm your logging backend). To learn more about different logging architectures and their tradeoffs, consult the Kubernetes documentation.
In a standard setup, each Node runs a logging agent like Filebeat or Fluentd that picks up container logs created by Kubernetes. Recall that Kubernetes creates JSON log files for containers on the Node (in most installations these can be found at /var/lib/docker/containers/
). These should be rotated using a tool like logrotate. The Node logging agent should be run as a DaemonSet Controller, a type of Kubernetes Workload that ensures that every Node runs a copy of the DaemonSet Pod. In this case the Pod would contain the logging agent and its configuration, which processes logs from files and directories mounted into the logging DaemonSet Pod.
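A minimal sketch of such a DaemonSet, assuming a Fluentd-based agent (the image tag, namespace, and mount paths are illustrative assumptions and would need to match your environment and logging backend), might look like this:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: fluentd
  template:
    metadata:
      labels:
        name: fluentd
    spec:
      containers:
      - name: fluentd
        # Illustrative image tag; pin to the release that matches your backend
        image: fluent/fluentd-kubernetes-daemonset:v1.14-debian-elasticsearch7-1
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: dockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: dockercontainers
        hostPath:
          path: /var/lib/docker/containers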
Similar to the bottleneck in using kubectl logs
to debug container issues, eventually you may need to consider a more robust option than simply using kubectl top
and the Kubernetes Dashboard to monitor Pod resource usage on your cluster. Cluster and application-level monitoring can be set up using the Prometheus monitoring system and time-series database, and Grafana metrics dashboard. Prometheus works using a “pull” model, which scrapes HTTP endpoints (like /metrics/cadvisor
on the Nodes, or the /metrics
application REST API endpoints) periodically for metric data, which it then processes and stores. This data can then be analyzed and visualized using Grafana dashboard. Prometheus and Grafana can be launched into a Kubernetes cluster like any other Deployment and Service.
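As a minimal sketch, a Prometheus scrape configuration that discovers Pods through the Kubernetes API and polls their /metrics endpoints might look like this (the job name and scrape interval are arbitrary examples):

# Illustrative Prometheus scrape configuration for Pod metrics
scrape_configs:
- job_name: 'kubernetes-pods'
  scrape_interval: 15s
  metrics_path: /metrics
  kubernetes_sd_configs:
  - role: pod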
For added resiliency, you may wish to run your logging and monitoring infrastructure on a separate Kubernetes cluster, or using external logging and metrics services.
Migrating and modernizing an application so that it can efficiently run in a Kubernetes cluster often involves non-trivial amounts of planning and architecting of software and infrastructure changes. Once implemented, these changes allow service owners to continuously deploy new versions of their apps and easily scale them as necessary, with minimal amounts of manual intervention. Steps like externalizing configuration from your app, setting up proper logging and metrics publishing, and configuring health checks allow you to fully take advantage of the Cloud Native paradigm that Kubernetes has been designed around. By building portable containers and managing them using Kubernetes objects like Deployments and Services, you can fully use your available compute infrastructure and development resources.
Increasingly, Linux distributions are adopting the systemd
init system. This powerful suite of software can manage many aspects of your server, from services to mounted devices and system states.
In systemd
, a unit
refers to any resource that the system knows how to operate on and manage. This is the primary object that the systemd
tools know how to deal with. These resources are defined using configuration files called unit files.
In this guide, we will introduce you to the different units that systemd
can handle. We will also be covering some of the many directives that can be used in unit files in order to shape the way these resources are handled on your system.
Units are the objects that systemd
knows how to manage. These are basically a standardized representation of system resources that can be managed by the suite of daemons and manipulated by the provided utilities.
Units can be said to be similar to services or jobs in other init systems. However, a unit has a much broader definition, as these can be used to abstract services, network resources, devices, filesystem mounts, and isolated resource pools.
Ideas that in other init systems may be handled with one unified service definition can be broken out into component units according to their focus. This organizes by function and allows you to easily enable, disable, or extend functionality without modifying the core behavior of a unit.
Some features that units are able to implement easily include:

- Bus-based activation using D-Bus: a unit can be started when an associated bus is published.
- Path-based activation using inotify.
- Device-based activation based on udev events.
- Implicit dependency mapping, handled mostly by systemd itself. You can still add dependency and ordering information, but most of the heavy lifting is taken care of for you.
- Security hardening through directives such as private /tmp and restricted network access.

There are many other advantages that systemd units have over other init systems’ work items, but this should give you an idea of the power that can be leveraged using native configuration directives.
The files that define how systemd
will handle a unit can be found in many different locations, each of which have different priorities and implications.
The system’s copy of unit files are generally kept in the /lib/systemd/system
directory. When software installs unit files on the system, this is the location where they are placed by default.
Unit files stored here are able to be started and stopped on-demand during a session. This will be the generic, vanilla unit file, often written by the upstream project’s maintainers that should work on any system that deploys systemd
in its standard implementation. You should not edit files in this directory. Instead you should override the file, if necessary, using another unit file location which will supersede the file in this location.
If you wish to modify the way that a unit functions, the best location to do so is within the /etc/systemd/system
directory. Unit files found in this directory location take precedence over any of the other locations on the filesystem. If you need to modify the system’s copy of a unit file, putting a replacement in this directory is the safest and most flexible way to do this.
If you wish to override only specific directives from the system’s unit file, you can actually provide unit file snippets within a subdirectory. These will append or modify the directives of the system’s copy, allowing you to specify only the options you want to change.
The correct way to do this is to create a directory named after the unit file with .d
appended on the end. So for a unit called example.service
, a subdirectory called example.service.d
could be created. Within this directory a file ending with .conf
can be used to override or extend the attributes of the system’s unit file.
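As a hypothetical illustration, for a unit called example.service you could create /etc/systemd/system/example.service.d/override.conf with contents like the following (the directive and value are placeholders):

# Only the directives listed here are changed; everything else
# continues to come from the system's copy of example.service
[Service]
Environment=APP_ENV=production

After adding or editing a snippet, running systemctl daemon-reload makes systemd pick up the change.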
There is also a location for run-time unit definitions at /run/systemd/system
. Unit files found in this directory have a priority landing between those in /etc/systemd/system
and /lib/systemd/system
. Files in this location are given less weight than the former location, but more weight than the latter.
The systemd
process itself uses this location for dynamically created unit files created at runtime. This directory can be used to change the system’s unit behavior for the duration of the session. All changes made in this directory will be lost when the server is rebooted.
Systemd
categorizes units according to the type of resource they describe. The easiest way to determine the type of a unit is with its type suffix, which is appended to the end of the resource name. The following list describes the types of units available to systemd
:
- .service: A service unit describes how to manage a service or application on the server. This will include how to start or stop the service, under which circumstances it should be automatically started, and the dependency and ordering information for related software.
- .socket: A socket unit file describes a network or IPC socket, or a FIFO buffer that systemd uses for socket-based activation. These always have an associated .service file that will be started when activity is seen on the socket that this unit defines.
- .device: A unit that describes a device that has been designated as needing systemd management by udev or the sysfs filesystem. Not all devices will have .device files. Some scenarios where .device units may be necessary are for ordering, mounting, and accessing the devices.
- .mount: This unit defines a mountpoint on the system to be managed by systemd. These are named after the mount path, with slashes changed to dashes. Entries within /etc/fstab can have units created automatically.
- .automount: An .automount unit configures a mountpoint that will be automatically mounted. These must be named after the mount point they refer to and must have a matching .mount unit to define the specifics of the mount.
- .swap: This unit describes swap space on the system. The name of these units must reflect the device or file path of the space.
- .target: A target unit is used to provide synchronization points for other units when booting up or changing states. They also can be used to bring the system to a new state. Other units specify their relation to targets to become tied to the target’s operations.
- .path: This unit defines a path that can be used for path-based activation. By default, a .service unit of the same base name will be started when the path reaches the specified state. This uses inotify to monitor the path for changes.
- .timer: A .timer unit defines a timer that will be managed by systemd, similar to a cron job for delayed or scheduled activation. A matching unit will be started when the timer is reached.
- .snapshot: A .snapshot unit is created automatically by the systemctl snapshot command. It allows you to reconstruct the current state of the system after making changes. Snapshots do not survive across sessions and are used to roll back temporary states.
- .slice: A .slice unit is associated with Linux Control Group nodes, allowing resources to be restricted or assigned to any processes associated with the slice. The name reflects its hierarchical position within the cgroup tree. Units are placed in certain slices by default depending on their type.
- .scope: Scope units are created automatically by systemd from information received from its bus interfaces. These are used to manage sets of system processes that are created externally.

As you can see, there are many different units that systemd knows how to manage. Many of the unit types work together to add functionality. For instance, some units are used to trigger other units and provide activation functionality.
We will mainly be focusing on .service
units due to their utility and the consistency with which administrators need to manage these units.
The internal structure of unit files is organized into sections. Sections are denoted by a pair of square brackets “[
” and “]
” with the section name enclosed within. Each section extends until the beginning of the subsequent section or until the end of the file.
Section names are well-defined and case-sensitive. So, the section [Unit]
will not be interpreted correctly if it is spelled like [UNIT]
. If you need to add non-standard sections to be parsed by applications other than systemd
, you can add a X-
prefix to the section name.
Within these sections, unit behavior and metadata is defined through the use of simple directives using a key-value format with assignment indicated by an equal sign, like this:
[Section]
Directive1=value
Directive2=value
. . .
In the event of an override file (such as those contained in a unit.type.d
directory), directives can be reset by assigning them to an empty string. For example, the system’s copy of a unit file may contain a directive set to a value like this:
Directive1=default_value
The default_value
can be eliminated in an override file by referencing Directive1
without a value, like this:
Directive1=
In general, systemd
allows for easy and flexible configuration. For example, multiple boolean expressions are accepted (1
, yes
, on
, and true
for affirmative and 0
, no
, off
, and false
for the opposite answer). Times can be intelligently parsed, with seconds assumed for unit-less values and combining multiple formats accomplished internally.
The first section found in most unit files is the [Unit]
section. This is generally used for defining metadata for the unit and configuring the relationship of the unit to other units.
Although section order does not matter to systemd
when parsing the file, this section is often placed at the top because it provides an overview of the unit. Some common directives that you will find in the [Unit]
section are:
- Description=: This directive can be used to describe the name and basic functionality of the unit. It is returned by various systemd tools, so it is good to set this to something short, specific, and informative.
- Documentation=: This directive provides a location for a list of URIs for documentation. These can be either internally available man pages or web accessible URLs. The systemctl status command will expose this information, allowing for easy discoverability.
- Requires=: This directive lists any units upon which this unit essentially depends. If the current unit is activated, the units listed here must successfully activate as well, else this unit will fail. These units are started in parallel with the current unit by default.
- Wants=: This directive is similar to Requires=, but less strict. Systemd will attempt to start any units listed here when this unit is activated. If these units are not found or fail to start, the current unit will continue to function. This is the recommended way to configure most dependency relationships. Again, this implies a parallel activation unless modified by other directives.
- BindsTo=: This directive is similar to Requires=, but also causes the current unit to stop when the associated unit terminates.
- Before=: The units listed in this directive will not be started until the current unit is marked as started, if they are activated at the same time. This does not imply a dependency relationship and must be used in conjunction with one of the above directives if this is desired.
- After=: The units listed in this directive will be started before starting the current unit. This does not imply a dependency relationship and one must be established through the above directives if this is required.
- Conflicts=: This can be used to list units that cannot be run at the same time as the current unit. Starting a unit with this relationship will cause the other units to be stopped.
- Condition...=: There are a number of directives that start with Condition which allow the administrator to test certain conditions prior to starting the unit. This can be used to provide a generic unit file that will only be run when on appropriate systems. If the condition is not met, the unit is gracefully skipped.
- Assert...=: Similar to the directives that start with Condition, these directives check for different aspects of the running environment to decide whether the unit should activate. However, unlike the Condition directives, a negative result causes a failure with this directive.

Using these directives and a handful of others, general information about the unit and its relationship to other units and the operating system can be established.
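Putting a few of these together, a [Unit] section for a hypothetical application server might look like this (the unit names and URL are placeholders, not values from this guide):

# Illustrative [Unit] section for a made-up service
[Unit]
Description=Example application server
Documentation=https://example.com/docs
After=network.target
Wants=redis.service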
On the opposite side of the unit file, the last section is often the [Install]
section. This section is optional and is used to define the behavior of a unit if it is enabled or disabled. Enabling a unit marks it to be automatically started at boot. In essence, this is accomplished by latching the unit in question onto another unit that is somewhere in the line of units to be started at boot.
Because of this, only units that can be enabled will have this section. The directives within dictate what should happen when the unit is enabled:
- WantedBy=: The WantedBy= directive is the most common way to specify how a unit should be enabled. This directive allows you to specify a dependency relationship in a similar way to the Wants= directive does in the [Unit] section. The difference is that this directive is included in the ancillary unit, allowing the primary unit listed to remain relatively clean. When a unit with this directive is enabled, a directory will be created within /etc/systemd/system named after the specified unit with .wants appended to the end. Within this, a symbolic link to the current unit will be created, creating the dependency. For instance, if the current unit has WantedBy=multi-user.target, a directory called multi-user.target.wants will be created within /etc/systemd/system (if not already available) and a symbolic link to the current unit will be placed within. Disabling this unit removes the link and removes the dependency relationship.
- RequiredBy=: This directive is very similar to the WantedBy= directive, but instead specifies a required dependency that will cause the activation to fail if not met. When enabled, a unit with this directive will create a directory ending with .requires.
- Alias=: This directive allows the unit to be enabled under another name as well. Among other uses, this allows multiple providers of a function to be available, so that related units can look for any provider of the common aliased name.
- Also=: This directive allows units to be enabled or disabled as a set. Supporting units that should always be available when this unit is active can be listed here. They will be managed as a group for installation tasks.
- DefaultInstance=: For template units (covered later), which can produce unit instances with unpredictable names, this can be used as a fallback value for the name if an appropriate name is not provided.

Sandwiched between the previous two sections, you will likely find unit type-specific sections. Most unit types offer directives that only apply to their specific type. These are available within sections named after their type. We will cover those briefly here.
The device
, target
, snapshot
, and scope
unit types have no unit-specific directives, and thus have no associated sections for their type.
The [Service]
section is used to provide configuration that is only applicable for services.
One of the basic things that should be specified within the [Service]
section is the Type=
of the service. This categorizes services by their process and daemonizing behavior. This is important because it tells systemd
how to correctly manage the service and find out its state.
The Type=
directive can be one of the following:
- simple: The main process of the service is specified in the ExecStart= line. This is the default if the Type= and BusName= directives are not set, but the ExecStart= is set. Any communication should be handled outside of the unit through a second unit of the appropriate type (like through a .socket unit if this unit must communicate using sockets).
- forking: This type is used when the service forks a child process and the parent exits almost immediately. This tells systemd that the process is still running even though the parent exited.
- oneshot: This type indicates that the process will be short-lived and that systemd should wait for the process to exit before continuing on with other units. This is the default when Type= and ExecStart= are not set. It is used for one-off tasks.
- dbus: This indicates that the unit will take a name on the D-Bus bus. When this happens, systemd will continue to process the next unit.
- notify: This indicates that the service will issue a notification when it has finished starting up. The systemd process will wait for this to happen before proceeding to other units.

Some additional directives may be needed when using certain service types. For instance:
- RemainAfterExit=: This directive is commonly used with the oneshot type. It indicates that the service should be considered active even after the process exits.
- PIDFile=: If the service type is marked as “forking”, this directive is used to set the path of the file that should contain the process ID number of the main child that should be monitored.
- BusName=: This directive should be set to the D-Bus bus name that the service will attempt to acquire when using the “dbus” service type.
- NotifyAccess=: This specifies access to the socket that should be used to listen for notifications when the “notify” service type is selected. This can be “none”, “main”, or “all”. The default, “none”, ignores all status messages. The “main” option will listen to messages from the main process, and the “all” option will cause all members of the service’s control group to be processed.

So far, we have discussed some prerequisite information, but we haven’t actually defined how to manage our services. The directives to do this are:
- ExecStart=: This specifies the full path and the arguments of the command to be executed to start the process. This may only be specified once (except for “oneshot” services). If the path to the command is preceded by a dash “-” character, non-zero exit statuses will be accepted without marking the unit activation as failed.
- ExecStartPre=: This can be used to provide additional commands that should be executed before the main process is started. This can be used multiple times. Again, commands must specify a full path and they can be preceded by “-” to indicate that the failure of the command will be tolerated.
- ExecStartPost=: This has the same exact qualities as ExecStartPre= except that it specifies commands that will be run after the main process is started.
- ExecReload=: This optional directive indicates the command necessary to reload the configuration of the service if available.
- ExecStop=: This indicates the command needed to stop the service. If this is not given, the process will be killed immediately when the service is stopped.
- ExecStopPost=: This can be used to specify commands to execute following the stop command.
- RestartSec=: If automatically restarting the service is enabled, this specifies the amount of time to wait before attempting to restart the service.
- Restart=: This indicates the circumstances under which systemd will attempt to automatically restart the service. This can be set to values like “always”, “on-success”, “on-failure”, “on-abnormal”, “on-abort”, or “on-watchdog”. These will trigger a restart according to the way that the service was stopped.
- TimeoutSec=: This configures the amount of time that systemd will wait when starting or stopping the service before marking it as failed or forcefully killing it. You can set separate timeouts with TimeoutStartSec= and TimeoutStopSec= as well.
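Combining the sections covered so far, a simple unit file for a hypothetical long-running service might look like the following (the paths and names are placeholders, not a working service):

# Illustrative service unit for a made-up application
[Unit]
Description=Example web application
After=network.target

[Service]
Type=simple
ExecStart=/usr/bin/example-app --config /etc/example/app.conf
ExecReload=/bin/kill -HUP $MAINPID
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target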
Socket units are very common in systemd configurations because many services implement socket-based activation to provide better parallelization and flexibility. Each socket unit must have a matching service unit that will be activated when the socket receives activity.
By breaking socket control outside of the service itself, sockets can be initialized early and the associated services can often be started in parallel. By default, the socket name will attempt to start the service of the same name upon receiving a connection. When the service is initialized, the socket will be passed to it, allowing it to begin processing any buffered requests.
To specify the actual socket, these directives are common:
- ListenStream=: This defines an address for a stream socket which supports sequential, reliable communication. Services that use TCP should use this socket type.
- ListenDatagram=: This defines an address for a datagram socket which supports fast, unreliable communication packets. Services that use UDP should set this socket type.
- ListenSequentialPacket=: This defines an address for sequential, reliable communication with max length datagrams that preserves message boundaries. This is found most often for Unix sockets.
- ListenFIFO: Along with the other listening types, you can also specify a FIFO buffer instead of a socket.

There are more types of listening directives, but the ones above are the most common.
Other characteristics of the sockets can be controlled through additional directives:
- Accept=: This determines whether an additional instance of the service will be started for each connection. If set to false (the default), one instance will handle all connections.
- SocketUser=: With a Unix socket, specifies the owner of the socket. This will be the root user if left unset.
- SocketGroup=: With a Unix socket, specifies the group owner of the socket. This will be the root group if neither this nor the above are set. If only the SocketUser= is set, systemd will try to find a matching group.
- SocketMode=: For Unix sockets or FIFO buffers, this sets the permissions on the created entity.
- Service=: If the service name does not match the .socket name, the service can be specified with this directive.
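A small sketch of a socket unit for a hypothetical example.service listening on a local TCP port (the port and names are placeholders) could look like this:

# Illustrative example.socket paired with a matching example.service
[Unit]
Description=Example socket

[Socket]
ListenStream=127.0.0.1:9000

[Install]
WantedBy=sockets.target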
Mount units allow for mount point management from within systemd. Mount points are named after the directory that they control, with a translation algorithm applied.
For example, the leading slash is removed, all other slashes are translated into dashes “-”, and all dashes and unprintable characters are replaced with C-style escape codes. The result of this translation is used as the mount unit name. Mount units will have an implicit dependency on other mounts above it in the hierarchy.
Mount units are often translated directly from /etc/fstab
files during the boot process. For the unit definitions automatically created and those that you wish to define in a unit file, the following directives are useful:
- What=: The absolute path to the resource that needs to be mounted.
- Where=: The absolute path of the mount point where the resource should be mounted. This should be the same as the unit file name, except using conventional filesystem notation.
- Type=: The filesystem type of the mount.
- Options=: Any mount options that need to be applied. This is a comma-separated list.
- SloppyOptions=: A boolean that determines whether the mount will fail if there is an unrecognized mount option.
- DirectoryMode=: If parent directories need to be created for the mount point, this determines the permission mode of these directories.
- TimeoutSec=: Configures the amount of time the system will wait until the mount operation is marked as failed.

This unit allows an associated .mount
unit to be automatically mounted at boot. As with the .mount
unit, these units must be named after the translated mount point’s path.
The [Automount]
section is pretty simple, with only the following two options allowed:
- Where=: The absolute path of the automount point on the filesystem. This will match the filename except that it uses conventional path notation instead of the translation.
- DirectoryMode=: If the automount point or any parent directories need to be created, this will determine the permissions settings of those path components.

Swap units are used to configure swap space on the system. The units must be named after the swap file or the swap device, using the same filesystem translation that was discussed above.
Like the mount options, the swap units can be automatically created from /etc/fstab
entries, or can be configured through a dedicated unit file.
The [Swap]
section of a unit file can contain the following directives for configuration:
- What=: The absolute path to the location of the swap space, whether this is a file or a device.
- Priority=: This takes an integer that indicates the priority of the swap being configured.
- Options=: Any options that are typically set in the /etc/fstab file can be set with this directive instead. A comma-separated list is used.
- TimeoutSec=: The amount of time that systemd waits for the swap to be activated before marking the operation as a failure.

A path unit defines a filesystem path that systemd can monitor for changes. Another unit must exist that will be activated when certain activity is detected at the path location. Path activity is determined through inotify events.
The [Path]
section of a unit file can contain the following directives:
- PathExists=: This directive is used to check whether the path in question exists. If it does, the associated unit is activated.
- PathExistsGlob=: This is the same as the above, but supports file glob expressions for determining path existence.
- PathChanged=: This watches the path location for changes. The associated unit is activated if a change is detected when the watched file is closed.
- PathModified=: This watches for changes like the above directive, but it activates on file writes as well as when the file is closed.
- DirectoryNotEmpty=: This directive allows systemd to activate the associated unit when the directory is no longer empty.
- Unit=: This specifies the unit to activate when the path conditions specified above are met. If this is omitted, systemd will look for a .service file that shares the same base unit name as this unit.
- MakeDirectory=: This determines if systemd will create the directory structure of the path in question prior to watching.
- DirectoryMode=: If the above is enabled, this will set the permission mode of any path components that must be created.

Timer units are used to schedule tasks to operate at a specific time or after a certain delay. This unit type replaces or supplements some of the functionality of the cron
and at
daemons. An associated unit must be provided which will be activated when the timer is reached.
The [Timer]
section of a unit file can contain some of the following directives:
- OnActiveSec=: This directive allows the associated unit to be activated relative to the .timer unit’s activation.
- OnBootSec=: This directive is used to specify the amount of time after the system is booted when the associated unit should be activated.
- OnStartupSec=: This directive is similar to the above timer, but in relation to when the systemd process itself was started.
- OnUnitActiveSec=: This sets a timer according to when the associated unit was last activated.
- OnUnitInactiveSec=: This sets the timer in relation to when the associated unit was last marked as inactive.
- OnCalendar=: This allows you to activate the associated unit by specifying an absolute time rather than a time relative to an event.
- AccuracySec=: This directive is used to set the level of accuracy with which the timer should be adhered to. By default, the associated unit will be activated within one minute of the timer being reached. The value of this directive will determine the upper bounds on the window in which systemd schedules the activation to occur.
- Unit=: This directive is used to specify the unit that should be activated when the timer elapses. If unset, systemd will look for a .service unit with a name that matches this unit.
- Persistent=: If this is set, systemd will trigger the associated unit when the timer becomes active if it would have been triggered during the period in which the timer was inactive.
- WakeSystem=: Setting this directive allows you to wake a system from suspend if the timer is reached when in that state.
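For instance, a timer that runs a hypothetical example-backup.service once a day, and catches up on missed runs after downtime, might look like this:

# Illustrative example-backup.timer paired with example-backup.service
[Unit]
Description=Run example backup daily

[Timer]
OnCalendar=daily
Persistent=true
Unit=example-backup.service

[Install]
WantedBy=timers.target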
The [Slice] section of a unit file actually does not have any .slice unit-specific configuration. Instead, it can contain some resource management directives that are actually available to a number of the units listed above.
Some common directives in the [Slice]
section, which may also be used in other units can be found in the systemd.resource-control
man page. These are valid in the following unit-specific sections:
[Slice]
[Scope]
[Service]
[Socket]
[Mount]
[Swap]
We mentioned earlier in this guide the idea of template unit files being used to create multiple instances of units. In this section, we can go over this concept in more detail.
Template unit files are, in most ways, no different than regular unit files. However, these provide flexibility in configuring units by allowing certain parts of the file to utilize dynamic information that will be available at runtime.
Template unit files can be identified because they contain an @
symbol after the base unit name and before the unit type suffix. A template unit file name may look like this:
example@.service
When an instance is created from a template, an instance identifier is placed between the @
symbol and the period signifying the start of the unit type. For example, the above template unit file could be used to create an instance unit that looks like this:
example@instance1.service
An instance file is usually created as a symbolic link to the template file, with the link name including the instance identifier. In this way, multiple links with unique identifiers can point back to a single template file. When managing an instance unit, systemd
will look for a file with the exact instance name you specify on the command line to use. If it cannot find one, it will look for an associated template file.
The power of template unit files is mainly seen through its ability to dynamically substitute appropriate information within the unit definition according to the operating environment. This is done by setting the directives in the template file as normal, but replacing certain values or parts of values with variable specifiers.
The following are some of the more common specifiers that will be replaced with the relevant information when an instance unit is interpreted:
- %n: Anywhere this appears in a template file, the full resulting unit name will be inserted.
- %N: This is the same as the above, but any escaping, such as those present in file path patterns, will be reversed.
- %p: This references the unit name prefix. This is the portion of the unit name that comes before the @ symbol.
- %P: This is the same as above, but with any escaping reversed.
- %i: This references the instance name, which is the identifier following the @ in the instance unit. This is one of the most commonly used specifiers because it is guaranteed to be dynamic. The use of this identifier encourages the use of configuration-significant identifiers. For example, the port that the service will run on can be used as the instance identifier, and the template can use this specifier to set up the port specification.
- %I: This specifier is the same as the above, but with any escaping reversed.
- %f: This will be replaced with the unescaped instance name or the prefix name, prepended with a /.
- %c: This will indicate the control group of the unit, with the standard parent hierarchy of /sys/fs/cgroup/systemd/ removed.
- %u: The name of the user configured to run the unit.
- %U: The same as above, but as a numeric UID instead of a name.
- %H: The host name of the system that is running the unit.
- %%: This is used to insert a literal percentage sign.

By using the above identifiers in a template file, systemd will fill in the correct values when interpreting the template to create an instance unit.
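For example, a hypothetical template unit example@.service could use %i to pass the instance identifier to the service as a port number:

# Illustrative example@.service template; an instance such as
# example@5000.service would substitute 5000 for %i
[Unit]
Description=Example service on port %i

[Service]
ExecStart=/usr/bin/example-app --port %i

[Install]
WantedBy=multi-user.target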
When working with systemd
, understanding units and unit files can make administration easier. Unlike many other init systems, you do not have to know a scripting language to interpret the init files used to boot services or the system. The unit files use a fairly straightforward declarative syntax that allows you to see at a glance the purpose and effects of a unit upon activation.
Breaking functionality such as activation logic into separate units not only allows the internal systemd
processes to optimize parallel initialization, it also keeps the configuration rather simple and allows you to modify and restart some units without tearing down and rebuilding their associated connections. Leveraging these abilities can give you more flexibility and power during administration.
Designing and running applications with scalability, portability, and robustness in mind can be challenging, especially as system complexity grows. The architecture of an application or system dictates how it must be run, what it expects from its environment, and how closely coupled it is to related components. Following certain patterns during the design phase and adhering to certain operational practices can help counter some of the most common problems that applications face when running in highly distributed environments.
Technologies like Docker and Kubernetes help teams package software and then distribute, deploy, and scale on platforms of distributed computers. Learning how to best harness the power of these tools can help you manage applications with greater flexibility, control, and responsiveness.
In this guide, we will discuss some of the principles and patterns you may want to adopt to help you scale and manage your workloads on Kubernetes. While Kubernetes can run many types of workloads, your choices can affect the ease of operation and the possibilities available.
If you’re looking for a managed Kubernetes hosting service, check out our simple, managed Kubernetes service built for growth.
When producing software, many requirements affect the patterns and architecture you choose to employ. With Kubernetes, one of the most important factors is the ability to scale horizontally, increasing the number of identical copies of your application running in parallel to distribute load and increase availability. This is an alternative to vertical scaling, which usually refers to increasing the capacity of a single application stack.
In particular, microservices are a software design pattern that work well for scalable deployments on clusters. Developers create small, composable applications that communicate over the network through well-defined APIs instead of larger compound programs that communicate through internal mechanisms. Refactoring monolithic applications into discrete single-purpose components makes it possible to scale each function independently. Much of the complexity and overhead that would normally exist at the application level is transferred to the operational realm where it can be managed by platforms like Kubernetes.
Beyond specific software patterns, cloud native applications are designed with a few additional considerations in mind. Cloud native applications are programs that generally follow a microservices architecture pattern with built-in resiliency, observability, and administrative features to make the fullest use of cloud platforms.
For example, cloud native applications are constructed with health reporting metrics to enable the platform to manage life cycle events if an instance becomes unhealthy. They produce (and make available for export) robust telemetry data to alert operators to problems and allow them to make informed decisions. Applications are designed to handle regular restarts and failures, changes in backend availability, and high load without corrupting data or becoming unresponsive.
One popular methodology that can help you focus on the characteristics that matter most when creating cloud-ready web apps is the Twelve-Factor App philosophy. Originally written to help developers and operations teams understand the core qualities shared by web services designed to run in the cloud, the principles apply very well to software that will live in a clustered environment like Kubernetes. While monolithic applications can benefit from following these recommendations, microservices architectures designed around these principles work particularly well.
A quick summary of the Twelve Factors are:
By adhering to the guidelines provided by the Twelve Factors, you can create and run applications well-suited to Kubernetes. The Twelve Factors encourage developers to focus on their application’s primary purpose, consider the operating conditions and interfaces between components, and use inputs, outputs, and standard process management features to run predictably in Kubernetes.
Kubernetes uses containers to run isolated, packaged applications across its cluster nodes. To run on Kubernetes, your applications must be encapsulated in one or more container images and executed using a container runtime like Docker. While containerizing your components is a requirement for Kubernetes, it also helps reinforce many of the principles from the twelve factor app methodology discussed above, allowing for better scaling and management.
For instance, containers provide isolation between the application environment and the external host system. They support a networked approach to inter-application communication, typically take configuration through environment variables, and expose logs written to stdout and stderr. Containers themselves encourage process-based concurrency and help maintain dev/prod parity by being independently scalable and bundling the process's runtime environment. These characteristics make it possible to package your applications so that they run smoothly on Kubernetes.
The flexibility of container technology allows many different ways of encapsulating an application. However, some methods work better in a Kubernetes environment than others.
Most best practices for containerizing your applications have to do with image building, where you define how your software will be set up and run from within a container. In general, keeping image sizes small and straightforward provides a number of benefits. Size-optimized images reduce the time and resources required to pull and start a new container on a cluster node, and reusing existing layers between image updates, which Docker and other container runtimes are designed to do automatically, reduces the amount of data that needs to be transferred in the first place.
A good first step when creating container images is to do your best to separate your build steps from the final image that will be run in production. Compiling or bundling software generally requires extra tooling, takes additional time, and produces artifacts (e.g., cross-platform dependencies) that might be inconsistent from container to container or unnecessary to the final runtime environment. One way to cleanly separate the build process from the runtime environment is to use Docker multi-stage builds. Multi-stage build configurations allow you to specify one base image to use during your build process and define another to use at runtime. This makes it possible to build software using an image with all of the build tools installed and copy the resulting artifacts to a slim, streamlined image that will be used each time afterwards.
With this type of functionality available, it is usually a good idea to build production images on top of a minimal parent image. If you wish to completely avoid the bloat found in “distro”-style parent layers like ubuntu:20.04
(which includes a complete Ubuntu 20.04 server environment), you could build your images with scratch
— Docker’s most minimal base image — as the parent. However, the scratch
base layer doesn’t provide access to many core tools and will often break some baseline assumptions about a Linux environment. As an alternative, the Alpine Linux alpine
image has become popular by being a solid, minimal base environment that provides a tiny, but fully featured Linux distribution.
For interpreted languages like Python or Ruby, the paradigm shifts slightly since there is no compilation stage and the interpreter must be available to run the code in production. However, since slim images are still ideal, many language-specific, optimized images built on top of Alpine Linux are available on Docker Hub. The benefits of using a smaller image for interpreted languages are similar to those for compiled languages: Kubernetes will be able to quickly pull all of the necessary container images onto new nodes to begin doing meaningful work.
While your applications must be containerized to run on a Kubernetes cluster, pods are the smallest unit of abstraction that Kubernetes can manage directly. A pod is a Kubernetes object composed of one or more closely coupled containers. Containers in a pod share a life cycle and are managed together as a single unit. For example, the containers are always scheduled (deployed) on the same node (server), are started or stopped in unison, and share resources like filesystems and IP addressing.
It’s important to understand how Kubernetes handles these components and what each layer of abstraction provides for your systems. A few considerations can help you identify some natural points of encapsulation for your application with each of these abstractions.
One way to determine an effective scope for your containers is to look for natural development boundaries. If your systems operate using a microservices architecture, well-designed containers are frequently built to represent discrete units of functionality that can often be used in a variety of contexts. This level of abstraction allows your team to release changes to container images and then deploy this new functionality to any environment where those images are used. Applications can be built by composing individual containers that each fulfill a given function but may not accomplish an entire process alone.
In contrast to the above, pods are usually constructed by thinking about which parts of your system might benefit most from independent management. Since Kubernetes uses pods as its smallest user-facing abstraction, these are the most primitive units that the Kubernetes tools and API can directly interact with and control. You can start, stop, and restart pods, or use higher level objects built upon pods to introduce replication and lifecycle management features. Kubernetes doesn’t allow you to manage the containers within a pod independently, so you shouldn’t group containers together that might benefit from separate administration.
Because many of Kubernetes’ features and abstractions deal with pods directly, it makes sense to bundle items that should scale together in a single pod and to separate those that should scale independently. For example, separating your web servers from your application servers in different pods allows you to scale each layer independently as needed. However, bundling a web server and a database adaptor into the same pod can make sense if the adaptor provides essential functionality that the web server needs to work properly.
With this in mind, what types of containers should be bundled in a single pod? Generally, a primary container is responsible for fulfilling the core functions of the pod, but additional containers may be defined that modify or extend the primary container or help it connect to a unique deployment environment.
For instance, in a web server pod, an Nginx container might listen for requests and serve content while an associated container updates static files when a repository changes. It may be tempting to package both of these components within a single container, but there are significant benefits to implementing them as separate containers. Both the web server container and the repository puller can be used independently in different contexts. They can be maintained by different teams and can each be developed to generalize their behavior to work with different companion containers.
Brendan Burns and David Oppenheimer identified three primary patterns for bundling supporting containers in their paper on design patterns for container-based distributed systems. These represent some of the most common use cases for packaging containers together in a pod:
While application configuration can be baked into container images, it’s best to make your components configurable at runtime to support deployment in multiple contexts and allow more flexible administration. To manage runtime configuration parameters, Kubernetes offers two different types of objects, called ConfigMaps and Secrets.
ConfigMaps are a mechanism used to store data that can be exposed to pods and other objects at runtime. Data stored within ConfigMaps can be presented as environment variables or mounted as files in the pod. By designing your applications to read from these locations, you can inject the configuration at runtime using ConfigMaps and modify the behavior of your components without having to rebuild the container image.
Secrets are a similar Kubernetes object type used to securely store sensitive data and selectively allow pods and other components access to it as needed. Secrets are a convenient way of passing sensitive material to applications without storing them as plain text in easily accessible locations in your normal configuration. Functionally, they work in much the same way as ConfigMaps, so applications can consume data from ConfigMaps and Secrets using the same mechanisms.
ConfigMaps and Secrets help you avoid putting configuration parameters directly in Kubernetes object definitions. You reference the configuration key rather than hard-coding the value, which allows you to update configuration on the fly by modifying the ConfigMap or Secret. This gives you the opportunity to alter the active runtime behavior of pods and other Kubernetes objects without modifying the Kubernetes definitions of the resources.
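From the application's perspective, consuming a ConfigMap or Secret usually amounts to reading an environment variable or a mounted file. The following Python sketch shows both approaches; the GREETING variable and the /etc/config/api-token path are hypothetical names chosen for the example.
import os
from pathlib import Path

# A value injected as an environment variable from a ConfigMap key (hypothetical name).
greeting = os.environ.get("GREETING", "Hello")

# A value mounted into the pod as a file from a Secret or ConfigMap (hypothetical path).
token_path = Path("/etc/config/api-token")
api_token = token_path.read_text().strip() if token_path.exists() else None

print(greeting, "- token loaded:", api_token is not None)
Because the application only reads from these locations, operators can change the ConfigMap or Secret without rebuilding the container image.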
Kubernetes includes lots of out-of-the-box functionality for managing component life cycles and ensuring that your applications are always healthy and available. However, to take advantage of these features, Kubernetes has to understand how it should monitor and interpret your application’s health. To do so, Kubernetes allows you to define liveness and readiness probes.
Liveness probes allow Kubernetes to determine whether an application within a container is alive and actively running. Kubernetes can periodically run commands within the container to check basic application behavior or can send HTTP or TCP network requests to a designated location to determine if the process is available and able to respond as expected. If a liveness probe fails, Kubernetes restarts the container to attempt to reestablish functionality within the pod.
Readiness probes are a similar tool used to determine whether a pod is ready to serve traffic. Applications within a container may need to perform initialization procedures before they are ready to accept client requests or they may need to reload upon a configuration change. When a readiness probe fails, instead of restarting the container, Kubernetes stops sending requests to the pod temporarily. This allows the pod to complete its initialization or maintenance routines without impacting the health of the group as a whole.
By combining liveness and readiness probes, you can instruct Kubernetes to automatically restart pods or remove them from backend groups. Configuring your infrastructure to take advantage of these capabilities allows Kubernetes to manage the availability and health of your applications without additional operations work.
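On the application side, probes are often nothing more than lightweight HTTP endpoints. The sketch below uses only Python's standard library and assumes the hypothetical paths /healthz for liveness and /ready for readiness; the paths and port would need to match whatever you configure in the pod specification.
from http.server import BaseHTTPRequestHandler, HTTPServer

ready = True  # a real application would flip this during start-up or maintenance

class ProbeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/healthz":
            self.send_response(200)  # liveness: the process is up and responding
        elif self.path == "/ready":
            self.send_response(200 if ready else 503)  # readiness: safe to receive traffic?
        else:
            self.send_response(404)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), ProbeHandler).serve_forever()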
Earlier, when discussing some pod design fundamentals, we mentioned that other Kubernetes objects build on these primitives to provide more advanced functionality. A deployment, one such compound object, is probably the most commonly defined and manipulated Kubernetes object.
Deployments are compound objects that build on other Kubernetes primitives to add additional capabilities. They add life cycle management capabilities to intermediary objects called ReplicaSets, like the ability to perform rolling updates, rollback to earlier versions, and transition between states. These ReplicaSets allow you to define pod templates to spin up and manage multiple copies of a single pod design. This helps you easily scale out your infrastructure, manage availability requirements, and automatically restart pods in the event of failure.
These additional features provide an administrative framework and self-healing capabilities to the base pod layer. While pods are the units that ultimately run the workloads you define, they are not the units that you should usually be provisioning and managing. Instead, think of pods as a building block that can run applications robustly when provisioned through higher-level objects like deployments.
Deployments allow you to provision and manage sets of interchangeable pods to scale out your applications and meet user demands. However, routing traffic to the provisioned pods is a separate concern. As pods are swapped out as part of rolling updates, restarted, or moved due to host failures, the network addresses previously associated with the running group will change. Kubernetes services allow you to manage this complexity by maintaining routing information for dynamic pools of pods and controlling access to various layers of your infrastructure.
In Kubernetes, services are specific mechanisms that control how traffic gets routed to sets of pods. Whether forwarding traffic from external clients or managing connections between several internal components, services allow you to control how traffic should flow. Kubernetes will then update and maintain all of the information needed to forward connections to the relevant pods, even as the environment shifts and the network addressing changes.
To effectively use services, you must first determine the intended consumers for each group of pods. If your service will only be used by other applications deployed within your Kubernetes cluster, the ClusterIP service type allows you to connect to a set of pods using a stable IP address that is only routable from within the cluster. Any object deployed on the cluster can communicate with the group of replicated pods by sending traffic directly to the service's IP address. This is the most straightforward service type, which works well for internal application layers.
An optional DNS addon enables Kubernetes to provide DNS names for services. This allows pods and other objects to communicate with services by name instead of by IP address. This mechanism does not change service usage significantly, but name-based identifiers can make it simpler to hook up components or define interactions without necessarily knowing the service IP address.
If the interface should be publicly accessible, your best option is usually the load balancer service type. This uses your specific cloud provider’s API to provision a load balancer, which serves traffic to the service pods through a publicly exposed IP address. This allows you to route external requests to the pods in your service, offering a controlled network channel to your internal cluster network.
Since the load balancer service type creates a load balancer for every service, it can potentially become expensive to expose Kubernetes services publicly using this method. To help alleviate this, Kubernetes ingress objects can be used to describe how to route different types of requests to different services based on a predetermined set of rules. For instance, requests for “example.com” might go to service A, while requests for “sammytheshark.com” might be routed to service B. Ingress objects provide a way of describing how to logically route a mixed stream of requests to their target services based on predefined patterns.
Ingress rules must be interpreted by an ingress controller (typically some sort of load balancer, like Nginx) deployed within the cluster as a pod, which implements the ingress rules and forwards traffic to Kubernetes services accordingly. Ingress implementations can be used to minimize the number of external load balancers that cluster owners are required to run.
Kubernetes offers quite a lot of flexibility in defining and controlling the resources deployed to your cluster. Using tools like kubectl
, you can imperatively define ad-hoc objects to immediately deploy to your cluster. While this can be useful for quickly deploying resources when learning Kubernetes, there are drawbacks to this approach that make it undesirable for long-term production administration.
One of the major problems with imperative management is that it does not leave any record of the changes you’ve deployed to your cluster. This makes it difficult or impossible to recover in the event of failures or to track operational changes as they’re applied to your systems.
Fortunately, Kubernetes provides an alternative declarative syntax that allows you to fully define resources within text files and then use kubectl
to apply the configuration or change. Storing these configuration files in a version control repository is a good way to monitor changes and integrate with the review processes used for other parts of your organization. File-based management also makes it possible to adapt existing patterns to new resources by copying and editing existing definitions. Storing your Kubernetes object definitions in versioned directories allows you to maintain a snapshot of your desired cluster state at each point in time. This can be invaluable during recovery operations, migrations, or when tracking down the root cause of unintended changes introduced to your system.
Managing the infrastructure that will run your applications and learning how to best leverage the features offered by modern orchestration environments can be daunting. However, many of the benefits offered by systems like Kubernetes and technologies like containers become more clear when your development and operations practices align with the concepts the tooling is built around. Architecting your systems using the patterns Kubernetes excels at and understanding how certain features can alleviate the challenges of complex deployments can improve your experience running on the platform.
Next, you may want to read about Modernizing existing applications for Kubernetes.
Understanding networking is a fundamental part of configuring complex environments on the internet. This has implications when trying to communicate between servers efficiently, developing secure network policies, and keeping your nodes organized.
In a previous guide, we went over some basic networking terminology. You should look through that guide to make sure you are familiar with the concepts presented there.
In this article, we will discuss some more specific concepts that are involved with designing or interacting with networked computers. Specifically, we will be covering network classes, subnets, and CIDR notation for grouping IP addresses.
Every location or device on a network must be addressable. This means that it can be reached by referencing its designation under a predefined system of addresses. In the normal TCP/IP model of network layering, this is handled on a few different layers, but usually when we refer to an address on a network we are talking about an IP address.
IP addresses allow network resources to be reached through a network interface. If one computer wants to communicate with another computer, it can address the information to the remote computer’s IP address. Assuming that the two computers are on the same network, or that the different computers and devices in between can translate requests across networks, the computers should be able to reach each other and send information.
Each IP address must be unique on its own network. Networks can be isolated from one another, and they can be bridged and translated to provide access between distinct networks. A system called Network Address Translation (NAT) allows the addresses to be rewritten when packets traverse network borders to allow them to continue on to their correct destination. This allows the same IP address to be used on multiple, isolated networks while still allowing these to communicate with each other if configured correctly.
There are two revisions of the IP protocol that are widely implemented on systems today: IPv4 and IPv6. IPv6 is slowly replacing IPv4 due to improvements in the protocol and the limitations of IPv4 address space. Simply put, the world now has too many internet-connected devices for the amount of addresses available through IPv4.
IPv4 addresses are 32-bit addresses. Each byte, or 8-bit segment of the address, is separated by a period and typically expressed as a number between 0 and 255. Even though these numbers are typically written in decimal to aid in human comprehension, each segment is usually referred to as an octet to express the fact that it is a representation of 8 bits.
A typical IPv4 address looks something like this:
192.168.0.5
The lowest value in each octet is a 0, and the highest value is 255.
We can also express this in binary to get a better idea of how the four octets will look. We will separate each 4 bits by a space for readability and replace the dots with dashes:
1100 0000 - 1010 1000 - 0000 0000 - 0000 0101
Recognizing that these two formats represent the same number will be important for understanding concepts later on.
Although there are some other differences in the protocol and background functionality of IPv4 and IPv6, the most noticeable difference is the address space. IPv6 expresses addresses as a 128-bit number. To put that into perspective, this means that IPv6 has space for more than 7.9 × 10^28 times as many addresses as IPv4.
To express this extended address range, IPv6 is generally written out as eight segments of four hexadecimal digits. Hexadecimal numbers represent the numbers 0–15 by using the digits 0–9, as well as the numbers a–f to express the higher values. A typical IPv6 address might look something like this:
1203:8fe0:fe80:b897:8990:8a7c:99bf:323d
You may also see these addresses written in a compact format. The rules of IPv6 allow you to remove any leading zeros from each group, and to replace a single range of zeroed groups with a double colon (::).
For instance, if you have one group in an IPv6 address that looks like this:
...:00bc:...
You could instead just type:
...:bc:...
To demonstrate the second case, if you have a range in an IPv6 address with multiple groups as zeroes, like this:
...:18bc:0000:0000:0000:00ff:...
You could compact this like so (also removing the leading zeros of the group like we did above):
...:18bc::ff...
You can do this only once per address; otherwise, the full address cannot be reconstructed unambiguously.
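If you would like to experiment with these compaction rules, Python's standard ipaddress module applies them for you. The address below is a made-up example containing a run of zeroed groups:
import ipaddress

addr = ipaddress.ip_address("18bc:0000:0000:0000:00ff:0000:0000:0001")  # hypothetical address
print(addr.exploded)    # 18bc:0000:0000:0000:00ff:0000:0000:0001 (full form)
print(addr.compressed)  # 18bc::ff:0:0:1 (leading zeros removed, longest zero run collapsed)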
While IPv6 is becoming more common every day, in this guide, we will be exploring the remaining concepts using IPv4 addresses because it is easier to discuss with a smaller address space.
IP addresses are typically made of two separate components. The first part of the address is used to identify the network that the address is a part of. The part that comes afterwards is used to specify a specific host within that network.
Where the network specification ends and the host specification begins depends on how the network is configured. We will discuss this more thoroughly momentarily.
IPv4 addresses were traditionally divided into five different “classes”, named A through E, meant to differentiate segments of the available addressable IPv4 space. These are defined by the first four bits of each address. You can identify what class an IP address belongs to by looking at these bits.
Here is a breakdown of the classes based on their leading bits:

Class A (leading bits 0---): If the first bit of an IPv4 address is "0", the address is part of class A. This covers every address from 0.0.0.0 to 127.255.255.255.

Class B (leading bits 10--): Class B includes any address from 128.0.0.0 to 191.255.255.255. These addresses have a "1" for their first bit, but a "0" for their second bit.

Class C (leading bits 110-): Class C covers the addresses ranging from 192.0.0.0 to 223.255.255.255. These addresses have a "1" for their first two bits, but a "0" for their third bit.

Class D (leading bits 1110): This class includes addresses that have "111" as their first three bits, but a "0" for the next bit. It covers addresses from 224.0.0.0 to 239.255.255.255.

Class E (leading bits 1111): This class covers addresses between 240.0.0.0 and 255.255.255.255. Any address that begins with four "1" bits is included in this class.

Class D addresses are reserved for multicasting protocols, which allow a packet to be sent to a group of hosts in one movement. Class E addresses are reserved for future and experimental use, and are largely unused.
Traditionally, each of the regular classes (A–C) divided the networking and host portions of the address differently to accommodate different sized networks. Class A addresses used the remainder of the first octet to represent the network and the rest of the address to define hosts. This was good for defining a few networks with a lot of hosts each.
The class B addresses used the first two octets (the remainder of the first, and the entire second) to define the network and the rest to define the hosts on each network. The class C addresses used the first three octets to define the network and the last octet to define hosts within that network.
The division of large portions of IP space into classes is now almost a legacy concept. Originally, this was implemented as a stop-gap for the problem of rapidly depleting IPv4 addresses (the same host portion can be reused on multiple separate networks). It was largely replaced by later schemes that we will discuss below.
There are also some portions of the IPv4 space that are reserved for specific uses.
One of the most useful reserved ranges is the loopback range specified by addresses from 127.0.0.0
to 127.255.255.255
. This range is used by each host to test networking to itself. Typically, this is expressed by the first address in this range: 127.0.0.1
.
Each of the normal classes also have a range within them that is used to designate private network addresses. For instance, for class A addresses, the addresses from 10.0.0.0
to 10.255.255.255
are reserved for private network assignment. For class B, this range is 172.16.0.0
to 172.31.255.255
. For class C, the range of 192.168.0.0
to 192.168.255.255
is reserved for private usage.
Any computer that is not hooked up to the internet directly (any computer that goes through a router or other NAT system) can use these addresses at will.
There are additional address ranges reserved for specific use-cases. You can find a summary of reserved addresses here.
The process of dividing a network into smaller network sections is called subnetting. This can be useful for many different purposes and helps isolate groups of hosts from each other to deal with them more easily.
As we discussed above, each address space is divided into a network portion and a host portion. The amount of the address that each of these take up is dependent on the class that the address belongs to. For instance, for class C addresses, the first 3 octets are used to describe the network. For the address 192.168.0.15
, the 192.168.0
portion describes the network and the 15
describes the host.
By default, each network has only one subnet, which contains all of the host addresses defined within it. A netmask is basically a specification of the number of address bits that are used for the network portion. A subnet mask is another netmask used within a network to subdivide it further.
Each bit of the address that is considered significant for describing the network should be represented as a “1” in the netmask.
For instance, the address we discussed above, 192.168.0.15
can be expressed like this, in binary:
1100 0000 - 1010 1000 - 0000 0000 - 0000 1111
As we described above, the network portion for class C addresses is the first 3 octets, or the first 24 bits. Since these are the significant bits that we want to preserve, the netmask would be:
1111 1111 - 1111 1111 - 1111 1111 - 0000 0000
This can be written in the normal IPv4 format as 255.255.255.0
. Any bit that is a “0” in the binary representation of the netmask is considered part of the host portion of the address and can be variable. The bits that are “1” are static, however, for the network or subnetwork that is being discussed.
We determine the network portion of the address by applying a bitwise AND operation between the address and the netmask. A bitwise AND operation keeps the networking portion of the address and discards the host portion. Applying this to the example above gives us the following network:
1100 0000 - 1010 1000 - 0000 0000 - 0000 0000
This can be expressed as 192.168.0.0
. The host portion is whatever remains once the network bits are removed: in our case, the host is 0000 1111, or 15.
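To make the arithmetic concrete, here is a short Python sketch that performs the same bitwise AND using the standard library's ipaddress module and the addresses from the example above:
import ipaddress

address = int(ipaddress.ip_address("192.168.0.15"))   # 1100 0000 - 1010 1000 - 0000 0000 - 0000 1111
netmask = int(ipaddress.ip_address("255.255.255.0"))  # 1111 1111 - 1111 1111 - 1111 1111 - 0000 0000

network = address & netmask               # the AND keeps only the network bits
host = address & (~netmask & 0xFFFFFFFF)  # the inverted mask keeps only the host bits

print(ipaddress.ip_address(network))  # 192.168.0.0
print(host)                           # 15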
The idea of subnetting is to take a portion of the host space of an address, and use it as an additional networking specification to divide the address space again.
For instance, a netmask of 255.255.255.0
as we saw above leaves us with 254 hosts in the network (you cannot end in 0 or 255 because these are reserved). If we wanted to divide this into two subnetworks, we could use one bit of the conventional host portion of the address as the subnet mask.
So, continuing with our example, the networking portion is:
1100 0000 - 1010 1000 - 0000 0000
The host portion is:
0000 1111
We can use the first bit of our host to designate a subnetwork. We can do this by adjusting the subnet mask from this:
1111 1111 - 1111 1111 - 1111 1111 - 0000 0000
To this:
1111 1111 - 1111 1111 - 1111 1111 - 1000 0000
In traditional IPv4 notation, this netmask would be expressed as 255.255.255.128. What we have done here is to designate the first bit of the last octet as significant in addressing the network. This effectively produces two subnetworks. The first subnetwork spans 192.168.0.0 to 192.168.0.127, with usable host addresses from 192.168.0.1 to 192.168.0.126. The second subnetwork spans 192.168.0.128 to 192.168.0.255, with usable host addresses from 192.168.0.129 to 192.168.0.254. In each subnetwork, the first address (all host bits "0") identifies the subnetwork itself and the last address (all host bits "1") is the broadcast address, so neither can be assigned to a host.
If we use more bits out of the host space for networking, we can get more and more subnetworks.
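As a quick check of the example above, Python's ipaddress module can perform the same split (the "/24" suffix is the CIDR notation introduced in the next section):
import ipaddress

network = ipaddress.ip_network("192.168.0.0/24")
for subnet in network.subnets(prefixlen_diff=1):  # borrow one host bit for subnetting
    print(subnet, subnet.netmask)
# 192.168.0.0/25 255.255.255.128
# 192.168.0.128/25 255.255.255.128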
A system called Classless Inter-Domain Routing, or CIDR, was developed as an alternative to traditional subnetting. The idea is that you can add a specification in the IP address itself as to the number of significant bits that make up the routing or networking portion.
For example, we could express the idea that the IP address 192.168.0.15
is associated with the netmask 255.255.255.0
by using the CIDR notation of 192.168.0.15/24
. This means that the first 24 bits of the IP address given are considered significant for the network routing.
This allows us some interesting possibilities. We can use these to reference “supernets”. In this case, we mean a more inclusive address range that is not possible with a traditional subnet mask. For instance, in a class C network, like above, we could not combine the addresses from the networks 192.168.0.0
and 192.168.1.0
because the netmask for class C addresses is 255.255.255.0
.
However, using CIDR notation, we can combine these blocks by referencing this chunk as 192.168.0.0/23
. This specifies that there are 23 bits used for the network portion that we are referring to.
So the first network (192.168.0.0
) could be represented like this in binary:
1100 0000 - 1010 1000 - 0000 0000 - 0000 0000
While the second network (192.168.1.0
) would be like this:
1100 0000 - 1010 1000 - 0000 0001 - 0000 0000
The CIDR address we specified indicates that the first 23 bits are used for the network block we are referencing. This is equivalent to a netmask of 255.255.254.0
, or:
1111 1111 - 1111 1111 - 1111 1110 - 0000 0000
As you can see, with this block the 24th bit can be either 0 or 1 and it will still match, because the network block only cares about the first 23 bits.
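A short Python sketch, again using the standard ipaddress module, confirms that addresses from both of the original class C networks fall inside this /23 block:
import ipaddress

supernet = ipaddress.ip_network("192.168.0.0/23")
print(supernet.netmask)                                   # 255.255.254.0
print(ipaddress.ip_address("192.168.0.15") in supernet)   # True
print(ipaddress.ip_address("192.168.1.200") in supernet)  # True
print(ipaddress.ip_address("192.168.2.1") in supernet)    # False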
CIDR allows us more control over addressing contiguous blocks of IP addresses. This is much more useful than the subnetting we talked about originally.
Hopefully by now, you should have a working understanding of some of the networking implications of the IP protocol. While dealing with this type of networking is not always intuitive, and may be difficult to work with at times, it is important to understand what is going on in order to configure your software and components correctly.
There are various calculators and tools online that will help you understand some of these concepts and get the correct addresses and ranges that you need by typing in certain information. CIDR.xyz provides a translation from decimal-based IP addresses to octets, and lets you visualize different CIDR netmasks.
If you have a lot of experience working with relational databases, it can be difficult to move past the principles of the relational model, such as thinking in terms of tables and relationships. Document-oriented databases like MongoDB make it possible to break free from the rigidity and limitations of the relational model. However, the flexibility and freedom that come with being able to store self-descriptive documents in the database can lead to other pitfalls and difficulties.
This conceptual article outlines five common guidelines related to schema design in a document-oriented database and highlights various considerations one should make when modeling relationships between data. It will also walk through several strategies one can employ to model such relationships, including embedding documents within arrays and using child and parent references, as well as when these strategies would be most appropriate to use.
In a typical relational database, data is kept in tables, and each table is constructed with a fixed list of columns representing various attributes that make up an entity, object, or event. For example, in a table representing students at a university, you might find columns holding each student's first name, last name, date of birth, and a unique identification number.
Typically, each table represents a single subject. If you wanted to store information about a student’s current studies, scholarships, or prior education, it could make sense to keep that data in a separate table from the one holding their personal information. You could then connect these tables to signify that there is a relationship between the data in each one, indicating that the information they contain has a meaningful connection.
For instance, a table describing each student’s scholarship status could refer to students by their student ID number, but it would not store the student’s name or address directly, avoiding data duplication. In such a case, to retrieve information about any student with all information on the student’s social media accounts, prior education, and scholarships, a query would need to access more than one table at a time and then compile the results from different tables into one.
This method of describing relationships through references is known as a normalized data model. Storing data this way — using multiple separate, concise objects related to each other — is also possible in document-oriented databases. However, the flexibility of the document model and the freedom it gives to store embedded documents and arrays within a single document means that you can model data differently than you might in a relational database.
The underlying concept for modeling data in a document-oriented database is to "store together what will be accessed together." Digging further into the student example, say that most students at this school have more than one email address. Because of this, the university wants the ability to store multiple email addresses with each student's contact information.
In a case like this, an example document could have a structure like the following:
{
"_id": ObjectId("612d1e835ebee16872a109a4"),
"first_name": "Sammy",
"last_name": "Shark",
"emails": [
{
"email": "sammy@digitalocean.com",
"type": "work"
},
{
"email": "sammy@example.com",
"type": "home"
}
]
}
Notice that this example document contains an embedded list of email addresses.
Representing more than a single subject inside a single document characterizes a denormalized data model. It allows applications to retrieve and manipulate all the relevant data for a given object (here, a student) in one go without a need to access multiple separate objects and collections. Doing so also guarantees the atomicity of operations on such a document without having to use multi-document transactions to guarantee integrity.
Storing together what needs to be accessed together using embedded documents is often the optimal way to represent data in a document-oriented database. In the following guidelines, you’ll learn how different relationships between objects, such as one-to-one or one-to-many relationships, can be best modeled in a document-oriented database.
A one-to-one relationship represents an association between two distinct objects where one object is connected with exactly one of another kind.
Continuing with the student example from the previous section, each student has only one valid student ID card at any given point in time. One card never belongs to multiple students, and no student can have multiple identification cards. If you were to store all this data in a relational database, it would likely make sense to model the relationship between students and their ID cards by storing the student records and the ID card records in separate tables that are tied together through references.
One common method for representing such relationships in a document database is by using embedded documents. As an example, the following document describes a student named Sammy and their student ID card:
{
"_id": ObjectId("612d1e835ebee16872a109a4"),
"first_name": "Sammy",
"last_name": "Shark",
"id_card": {
"number": "123-1234-123",
"issued_on": ISODate("2020-01-23"),
"expires_on": ISODate("2020-01-23")
}
}
Notice that instead of a single value, this example document’s id_card
field holds an embedded document representing the student’s identification card, described by an ID number, the card’s date of issue, and the card’s expiration date. The identity card essentially becomes a part of the document describing the student Sammy, even though it’s a separate object in real life. Usually, structuring the document schema like this so that you can retrieve all related information through a single query is a sound choice.
Things become less straightforward if you encounter relationships connecting one object of a kind with many objects of another type, such as a student’s email addresses, the courses they attend, or the messages they post on the student council’s message board. In the next few guidelines, you’ll use these data examples to learn different approaches for working with one-to-many and many-to-many relationships.
When an object of one type is related to multiple objects of another type, it can be described as a one-to-many relationship. A student can have multiple email addresses, a car can have numerous parts, or a shopping order can consist of multiple items. Each of these examples represents a one-to-many relationship.
While the most common way to represent a one-to-one relationship in a document database is through an embedded document, there are several ways to model one-to-many relationships in a document schema. When considering your options for how to best model these, though, there are three properties of the given relationship you should consider: the cardinality of the relationship, whether the related objects need to be accessed independently, and whether the relationship is truly one-to-many or actually many-to-many.
Imagine you’re deciding how to store student email addresses. Each student can have multiple email addresses, such as one for work, one for personal use, and one provided by the university. A document representing a single email address might take a form like this:
{
"email": "sammy@digitalocean.com",
"type": "work"
}
In terms of cardinality, there will be only a few email addresses for each student, since it’s unlikely that a student will have dozens — let alone hundreds — of email addresses. Thus, this relationship can be characterized as a one-to-few relationship, which is a compelling reason to embed email addresses directly into the student document and store them together. You don’t run any risk that the list of email addresses will grow indefinitely, which would make the document big and inefficient to use.
Note: Be aware that there are certain pitfalls associated with storing data in arrays. For instance, a single MongoDB document cannot exceed 16MB in size. While it is possible and common to embed multiple documents using array fields, if the list of objects grows uncontrollably the document could quickly reach this size limit. Additionally, storing a large amount of data inside embedded arrays can have a big impact on query performance.
Embedding multiple documents in an array field will likely be suitable in many situations, but know that it also may not always be the best solution.
Regarding independent access, email addresses will likely not be accessed separately from the student. As such, there is no clear incentive to store them as separate documents in a separate collection. This is another compelling reason to embed them inside the student’s document.
The last thing to consider is whether this relationship is really a one-to-many relationship instead of a many-to-many relationship. Because an email address belongs to a single person, it’s reasonable to describe this relationship as a one-to-many relationship (or, perhaps more accurately, a one-to-few relationship) instead of a many-to-many relationship.
These three assumptions suggest that embedding students’ various email addresses within the same documents that describe students themselves would be a good choice for storing this kind of data. A sample student’s document with email addresses embedded might take this shape:
{
"_id": ObjectId("612d1e835ebee16872a109a4"),
"first_name": "Sammy",
"last_name": "Shark",
"emails": [
{
"email": "sammy@digitalocean.com",
"type": "work"
},
{
"email": "sammy@example.com",
"type": "home"
}
]
}
Using this structure, every time you retrieve a student’s document you will also retrieve the embedded email addresses in the same read operation.
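For instance, using PyMongo (the connection string, database name, and collection name here are assumptions made for the example), adding another address to the embedded array is a single update on the student's document:
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
students = client["university"]["students"]

# Append one more embedded email address to Sammy's document.
students.update_one(
    {"first_name": "Sammy", "last_name": "Shark"},
    {"$push": {"emails": {"email": "sammy@example.edu", "type": "school"}}},
)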
If you model a relationship of the one-to-few variety, where the related documents do not need to be accessed independently, embedding documents directly like this is usually desirable, as this can reduce the complexity of the schema.
As mentioned previously, though, embedding documents like this isn’t always the optimal solution. The next section provides more details on why this might be the case in some scenarios, and outlines how to use child references as an alternative way to represent relationships in a document database.
The nature of the relationship between students and their email addresses informed how that relationship could best be modeled in a document database. There are some differences between this and the relationship between students and the courses they attend, so the way you model the relationships between students and their courses will be different as well.
A document describing a single course that a student attends could follow a structure like this:
{
"name": "Physics 101",
"department": "Department of Physics",
"points": 7
}
Say that you decided from the outset to use embedded documents to store information about each student's courses, as in this example:
{
"_id": ObjectId("612d1e835ebee16872a109a4"),
"first_name": "Sammy",
"last_name": "Shark",
"emails": [
{
"email": "sammy@digitalocean.com",
"type": "work"
},
{
"email": "sammy@example.com",
"type": "home"
}
],
"courses": [
{
"name": "Physics 101",
"department": "Department of Physics",
"points": 7
},
{
"name": "Introduction to Cloud Computing",
"department": "Department of Computer Science",
"points": 4
}
]
}
This would be a perfectly valid MongoDB document and could well serve the purpose, but consider the three relationship properties you learned about in the previous guideline.
The first one is cardinality. A student will likely only maintain a few email addresses, but they can attend multiple courses during their study. After several years of attendance, there could be dozens of courses the student took part in. Plus, they’d attend these courses along with many other students who are likewise attending their own set of courses over their years of attendance.
If you decided to embed each course like the previous example, the student’s document would quickly get unwieldy. With a higher cardinality, the embedded document approach becomes less compelling.
The second consideration is independent access. Unlike email addresses, it’s sound to assume there would be cases in which information about university courses would need to be retrieved on their own. For instance, say someone needs information about available courses to prepare a marketing brochure. Additionally, courses will likely need to be updated over time: the professor teaching the course might change, its schedule may fluctuate, or its prerequisites might need to be updated.
If you were to store the courses as documents embedded within student documents, retrieving the list of all the courses offered by the university would become troublesome. Also, each time a course needs an update, you would need to go through all student records and update the course information everywhere. Both are good reasons to store courses separately and not embed them fully.
The third thing to consider is whether the relationship between student and a university course is actually one-to-many or instead many-to-many. In this case, it’s the latter, as more than one student can attend each course. This relationship’s cardinality and independent access aspects suggest against embedding each course document, primarily for practical reasons like ease of access and update. Considering the many-to-many nature of the relationship between courses and students, it might make sense to store course documents in a separate collection with unique identifiers of their own.
The documents representing classes in this separate collection might have a structure like these examples:
{
"_id": ObjectId("61741c9cbc9ec583c836170a"),
"name": "Physics 101",
"department": "Department of Physics",
"points": 7
},
{
"_id": ObjectId("61741c9cbc9ec583c836170b"),
"name": "Introduction to Cloud Computing",
"department": "Department of Computer Science",
"points": 4
}
If you decide to store course information like this, you’ll need to find a way to connect students with these courses so that you will know which students attend which courses. In cases like this where the number of related objects isn’t excessively large, especially with many-to-many relationships, one common way of doing this is to use child references.
With child references, a student’s document will reference the object identifiers of the courses that the student attends in an embedded array, as in this example:
{
"_id": ObjectId("612d1e835ebee16872a109a4"),
"first_name": "Sammy",
"last_name": "Shark",
"emails": [
{
"email": "sammy@digitalocean.com",
"type": "work"
},
{
"email": "sammy@example.com",
"type": "home"
}
],
"courses": [
ObjectId("61741c9cbc9ec583c836170a"),
ObjectId("61741c9cbc9ec583c836170b")
]
}
Notice that this example document still has a courses
field, which is also an array, but instead of embedding full course documents like in the earlier example, only the identifiers referencing the course documents in the separate collection are embedded. Now, when retrieving a student document, the courses will not be immediately available and will need to be queried separately. On the other hand, it's immediately known which courses to retrieve. Also, in case any course's details need to be updated, only the course document itself needs to be altered. All references between students and their courses will remain valid.
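To illustrate how the two reads fit together, here is a PyMongo sketch (the database and collection names are assumptions) that loads a student and then fetches the referenced course documents in one additional query:
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["university"]

student = db.students.find_one({"first_name": "Sammy"})
# The child references are ObjectIds, so $in retrieves every referenced course at once.
courses = db.courses.find({"_id": {"$in": student["courses"]}})
for course in courses:
    print(course["name"], course["points"])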
Note: There is no firm rule for when the cardinality of a relation is too great to embed child references in this manner. You might choose a different approach at either a lower or higher cardinality if it’s what best suits the application in question. After all, you will always want to structure your data to suit the manner in which your application queries and updates it.
If you model a one-to-many relationship where the amount of related documents is within reasonable bounds and related documents need to be accessed independently, favor storing the related documents separately and embedding child references to connect to them.
Now that you’ve learned how to use child references to signify relationships between different types of data, this guide will outline an inverse concept: parent references.
Using child references works well when there are too many related objects to embed them directly inside the parent document, but the amount is still within known bounds. However, there are cases when the number of associated documents might be unbounded and will continue to grow with time.
As an example, imagine that the university’s student council has a message board where any student can post whatever messages they want, including questions about courses, travel stories, job postings, study materials, or just a free chat. A sample message in this example consists of a subject and a message body:
{
"_id": ObjectId("61741c9cbc9ec583c836174c"),
"subject": "Books on kinematics and dynamics",
"message": "Hello! Could you recommend good introductory books covering the topics of kinematics and dynamics? Thanks!",
"posted_on": ISODate("2021-07-23T16:03:21Z")
}
You could use either of the two approaches discussed previously — embedding and child references — to model this relationship. If you were to decide on embedding, the student’s document might take a shape like this:
{
"_id": ObjectId("612d1e835ebee16872a109a4"),
"first_name": "Sammy",
"last_name": "Shark",
"emails": [
{
"email": "sammy@digitalocean.com",
"type": "work"
},
{
"email": "sammy@example.com",
"type": "home"
}
],
"courses": [
ObjectId("61741c9cbc9ec583c836170a"),
ObjectId("61741c9cbc9ec583c836170b")
],
"message_board_messages": [
{
"subject": "Books on kinematics and dynamics",
"message": "Hello! Could you recommend good introductory books covering the topics of kinematics and dynamics? Thanks!",
"posted_on": ISODate("2021-07-23T16:03:21Z")
},
. . .
]
}
However, if a student is prolific with writing messages, their document will quickly become incredibly long and could easily exceed the 16MB size limit, so the cardinality of this relation suggests against embedding. Additionally, the messages might need to be accessed separately from the student, as could be the case if the message board page is designed to show the latest messages posted by students. This also suggests that embedding is not the best choice for this scenario.
Note: You should also consider whether the message board messages are frequently accessed when retrieving the student’s document. If not, having them all embedded inside that document would incur a performance penalty when retrieving and manipulating this document, even when the list of messages would not be used often. Infrequent access of related data is often another clue that you shouldn’t embed documents.
Now consider using child references instead of embedding full documents as in the previous example. The individual messages would be stored in a separate collection, and the student’s document could then have the following structure:
{
"_id": ObjectId("612d1e835ebee16872a109a4"),
"first_name": "Sammy",
"last_name": "Shark",
"emails": [
{
"email": "sammy@digitalocean.com",
"type": "work"
},
{
"email": "sammy@example.com",
"type": "home"
}
],
"courses": [
ObjectId("61741c9cbc9ec583c836170a"),
ObjectId("61741c9cbc9ec583c836170b")
],
"message_board_messages": [
ObjectId("61741c9cbc9ec583c836174c"),
. . .
]
}
In this example, the message_board_messages
field now stores child references to all messages written by Sammy. However, changing the approach solves only one of the issues mentioned before, in that it is now possible to access the messages independently. And although the student's document size would grow more slowly using the child references approach, the array of object identifiers could still become unwieldy given the unbounded cardinality of this relation. A student could easily write thousands of messages during their four years of study, after all.
In such scenarios, a common way to connect one object to another is through parent references. Unlike the child references described previously, it’s now not the student document referring to individual messages, but rather a reference in the message’s document pointing towards the student that wrote it.
To use parent references, you would need to modify the message document schema to contain a reference to the student who authored the message:
{
"_id": ObjectId("61741c9cbc9ec583c836174c"),
"subject": "Books on kinematics and dynamics",
"message": "Hello! Could you recommend a good introductory books covering the topics of kinematics and dynamics? Thanks!",
"posted_on": ISODate("2021-07-23T16:03:21Z"),
"posted_by": ObjectId("612d1e835ebee16872a109a4")
}
Notice the new posted_by
field contains the object identifier of the student’s document. Now, the student’s document won’t contain any information about the messages they’ve posted:
{
"_id": ObjectId("612d1e835ebee16872a109a4"),
"first_name": "Sammy",
"last_name": "Shark",
"emails": [
{
"email": "sammy@digitalocean.com",
"type": "work"
},
{
"email": "sammy@example.com",
"type": "home"
}
],
"courses": [
ObjectId("61741c9cbc9ec583c836170a"),
ObjectId("61741c9cbc9ec583c836170b")
]
}
To retrieve the list of messages written by a student, you would use a query on the messages collection and filter against the posted_by
field. Having them in a separate collection makes it safe to let the list of messages grow without affecting any of the student’s documents.
Note: When using parent references, creating an index on the field referencing the parent document can significantly increase the query performance each time you filter against the parent document identifier.
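As a sketch of that pattern with PyMongo (again, the collection names are assumptions), you would create the index once and then filter on the referencing field whenever you list a student's messages:
from pymongo import ASCENDING, MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["university"]

# Index the parent reference so lookups by author stay fast as the collection grows.
db.messages.create_index([("posted_by", ASCENDING)])

student = db.students.find_one({"first_name": "Sammy"})
recent = db.messages.find({"posted_by": student["_id"]}).sort("posted_on", -1).limit(20)
for message in recent:
    print(message["subject"])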
If you model a one-to-many relationship where the amount of related documents is unbounded, regardless of whether the documents need to be accessed independently, it’s generally advised that you store related documents separately and use parent references to connect them to the parent document.
Thanks to the flexibility of document-oriented databases, determining the best way to model relationships in a document database is less of a strict science than it is in a relational database. By reading this article, you've acquainted yourself with embedding documents and using child and parent references to store related data. You've learned about considering the relationship's cardinality and avoiding unbounded arrays, as well as taking into account whether the document will be accessed separately or frequently.
These are just a few guidelines that can help you model typical relationships in MongoDB, but modeling a database schema is not a one-size-fits-all exercise. Always take into account your application and how it uses and updates the data when designing the schema.
To learn more about schema design and common patterns for storing different kinds of data in MongoDB, we encourage you to check the official MongoDB documentation on that topic.
In PHP, as in all programming languages, data types are used to classify one particular type of data. This is important because the specific data type you use will determine what values you can assign to it and what you can do to it (including what operations you can perform on it).
In this tutorial, we will go over the important data types native to PHP. This is not an exhaustive investigation of data types, but will help you become familiar with what options you have available to you in PHP.
One way to think about data types is to consider the different types of data that we use in the real world. Two different types are numbers and words. These two data types work in different ways. We would add 3 + 4
to get 7
, while we would combine the words star
and fish
to get starfish
.
If we start evaluating different data types with one another, such as numbers and words, things start to make less sense. The following equation, for example, has no obvious answer:
'sky' + 8
For computers, each data type can be thought of as being quite different, like words and numbers, so we have to be careful about how we use them to assign values and how we manipulate them through operations.
PHP is a loosely typed language. This means, by default, if a value doesn't match the expected data type, PHP will attempt to change the value of the wrong data type to match the expected type when possible. This is called type juggling. For example, a function that expects a string
but instead receives an integer
with a value of 2
will change the incoming value into the expected string
type with a value of "2"
.
It is possible, and encouraged, to enable strict mode on a per-file basis. This provides enforcement of data types in the code you control, while allowing the use of additional code packages that may not adhere to strict data types. Strict type is declared at the top of a file:
<?php
declare(strict_types=1);
...
In strict mode, only a value corresponding exactly to the type declaration will be accepted; otherwise a TypeError
will be thrown. The only exception to this rule is that an int
value will pass a float
type declaration.
Any number you enter in PHP will be interpreted as a number. You are not required to declare what kind of data type you are entering. PHP will consider any number written without decimals as an integer (such as 138) and any number written with decimals as a float (such as 138.0).
Like in math, integers in computer programming are whole numbers that can be positive, negative, or 0 (…, -1, 0, 1, …). An integer can also be known as an int. As with other programming languages, you should not use commas in numbers of four digits or more, so to represent the number 1,000 in your program, write it as 1000.
We can print out an integer like this:
echo -25;
Which would output:
Output-25
We can also declare a variable, which in this case is a symbol of the number we are using or manipulating, like so:
$my_int = -25;
echo $my_int;
Which would output:
Output-25
We can do math with integers in PHP, too:
$int_ans = 116 - 68;
echo $int_ans;
Which would output:
Output48
Integers can be used in many ways within PHP programs, and as you continue to learn more about the language you will have a lot of opportunities to work with integers and understand more about this data type.
A floating-point number or float is a real number, meaning that it can be either a rational or an irrational number. Because of this, floating-point numbers can contain a fractional part, such as 9.0 or -116.42. For the purposes of thinking of a float in a PHP program, it is a number that contains a decimal point.
Like we did with the integer, we can print out a floating-point number like this:
echo 17.3;
Which would output:
Output17.3
We can also declare a variable that stands in for a float, like so:
$my_flt = 17.3;
echo $my_flt;
Which would output:
Output17.3
And, just like with integers, we can do math with floats in PHP, too:
$flt_ans = 564.0 + 365.24;
echo $flt_ans;
Which would output:
Output929.24
With integers and floating-point numbers, it is important to keep in mind that 3 does not equal 3.0, because 3 refers to an integer while 3.0 refers to a float. This may or may not change the way your program functions.
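One place the difference becomes visible is in comparisons; here is a brief sketch:
<?php
// A quick sketch of where the integer/float distinction shows up.
var_dump(3 == 3.0);  // bool(true)  — the values compare as equal
var_dump(3 === 3.0); // bool(false) — the identical operator also checks the type
var_dump(10 / 4);    // float(2.5)  — dividing integers can produce a float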
Numbers are useful when working with calculations, counting items or money, and the passage of time.
A string is a sequence of one or more characters that may consist of letters, numbers, or symbols. This sequence is enclosed within either single quotes '' or double quotes "":
echo 'This is a 47 character string in single quotes.';
echo "This is a 47 character string in double quotes.";
Both lines output their value the same way:
OutputThis is a 47 character string in single quotes.
This is a 47 character string in double quotes.
You can choose to use either single quotes or double quotes, but whichever you decide on you should be consistent within a program.
The program “Hello, World!” demonstrates how a string can be used in computer programming, as the characters that make up the phrase Hello, World! are a string:
echo "Hello, World!";
As with other data types, we can store strings in variables and output the results:
$hw = "Hello, World!";
echo $hw;
Either way, the output is the same:
OutputHello, World!
Like numbers, there are many operations that we can perform on strings within our programs in order to manipulate them to achieve the results we are seeking. Strings are important for communicating information to the user, and for the user to communicate information back to the program.
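For example, here is a brief sketch (with an illustrative variable) of concatenation and of the practical difference between single and double quotes:
<?php
// A small sketch of common string operations; the values are illustrative.
$name = 'Sammy';
echo 'Hello, ' . $name . '!'; // concatenation with the dot operator: Hello, Sammy!
echo "Hello, $name!";         // double quotes interpolate the variable: Hello, Sammy!
echo 'Hello, $name!';         // single quotes do not interpolate: Hello, $name!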
The Boolean, or bool, data type can be one of two values, either true or false. Booleans are used to represent the truth values that are associated with the logic branch of mathematics.
You do not use quotes when declaring a Boolean value; anything in quotes is assumed to be a string. PHP doesn't care about case when declaring a Boolean; True, TRUE, true, and tRuE all evaluate the same. If you follow the style guide put out by the PHP-FIG, the values should be all lowercase true or false.
Many operations in math give us answers that evaluate to either true or false:
- greater than: 500 > 100 is true; 1 > 5 is false
- less than: 200 < 400 is true; 4 < 2 is false
- equal: 5 = 5 is true; 500 = 400 is false
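Written as PHP, comparisons like these evaluate to Boolean values (note that PHP uses == for equality rather than the mathematical =):
<?php
// The same kinds of comparisons in PHP; each expression evaluates to a bool.
var_dump(500 > 100); // bool(true)
var_dump(1 > 5);     // bool(false)
var_dump(200 < 400); // bool(true)
var_dump(5 == 5);    // bool(true)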
Like with any other data type, we can store a Boolean value in a variable. Unlike numbers or strings, echo cannot be used to output the value because a Boolean true value is converted to the string "1", while a Boolean false is converted to "" (an empty string). This allows "type juggling" to convert a variable back and forth between Boolean and string values. To output the value of a Boolean we have several options. To output the type along with the value of a variable, we use var_dump. To output the string representation of a variable's value, we use var_export:
$my_bool = 4 > 3;
echo $my_bool;
var_dump($my_bool);
var_export($my_bool);
Since 4 is greater than 3, we will receive the following output:
Output1
bool(true)
true
The echo line converts the true Boolean to the string of 1. The var_dump outputs the variable type of bool along with the value of true. The var_export outputs the string representation of the value, which is true.
As you write more programs in PHP, you will become more familiar with how Booleans work and how different functions and operations evaluating to either true or false can change the course of the program.
The NULL type is an absence of value. It reserves space for a variable. This allows PHP to know about a variable, but still consider it unset. The only possible value of a NULL type is the case-insensitive value of null. When PHP attempts to access a variable that has not been declared, it will throw a warning:
echo $name;
It warns that the variable is not set, but the code continues to process:
OutputPHP Warning: Undefined variable $name
One common way to prevent this warning is to check whether the variable has been set using the isset function:
if (isset($name)) {
echo $name;
}
This skips the echo entirely and no warning is thrown. A second way to prevent this type of error is to set a placeholder value for a variable such as an empty string:
$name = '';
echo "Hello ".$name;
This will now display Hello without a name because the value of $name is an empty string:
OutputHello
Both of these solutions are valid and useful. However, when setting the value of $name to an empty string, that value is actually set:
$name = '';
if (isset($name)) {
echo "Hello ".$name;
}
This will also display Hello without a name because the value of $name is set to an empty string:
OutputHello
As with most challenges, there are multiple solutions. One solution is to set the variable to a null value. This holds space for the variable and prevents PHP from throwing errors, but still considers the variable "not set":
$name = null;
echo $name;
if (isset($name)) {
echo "Hello ".$name;
}
The variable has been "declared", so there will be no warning when echo attempts to access it. It will also display nothing because there is no value. The condition will also evaluate to false because the $name variable is not considered set.
We can use var_dump to see how PHP evaluates a NULL variable:
$name = null;
var_dump($name);
This shows us that the type is NULL:
OutputNULL
While less common than other variable types, NULL is often used as the return type of a function that performs an action but does not have a return value.
An array in PHP is actually an ordered map. A map is a data type that associates or "maps" values to keys. This data type has many different uses; it can be treated as an array, list, hash table, dictionary, collection, and more. Additionally, because array values in PHP can also be other arrays, multidimensional arrays are possible.
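For instance, a value in an array can itself be another array (the data here is made up purely for illustration):
<?php
// A small sketch of a multidimensional array; the values are illustrative only.
$habitats = [
    'ocean' => ['shark', 'squid'],
    'reef'  => ['clownfish', 'anemone'],
];

echo $habitats['ocean'][1]; // squid — the second item of the 'ocean' array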
In its simplest form, an array will have a numeric index or key. If you do not specify a key, PHP will automatically generate the next numeric key for you. By default, array keys are 0-indexed, which means that the first key is 0, not 1. Each element, or value, that is inside of an array can also be referred to as an item.
An array can be defined in one of two ways. The first is using the array() language construct, which uses a comma-separated list of items. An array of integers would be defined like this:
array(-3, -2, -1, 0, 1, 2, 3)
The second and more common way to define an array is through the short array syntax using square brackets []. An array of floats would be defined like this:
[3.14, 9.23, 111.11, 312.12, 1.05]
We can also define an array of strings, and assign an array to a variable, like so:
$sea_creatures = ['shark', 'cuttlefish', 'squid', 'mantis shrimp'];
Once again, we cannot use echo to output an entire array, but we can use var_export or var_dump:
var_export($sea_creatures);
var_dump($sea_creatures);
The output shows that the array uses numeric keys:
Outputarray (
0 => 'shark',
1 => 'cuttlefish',
2 => 'squid',
3 => 'mantis shrimp',
)
array(4) {
[0]=>
string(5) "shark"
[1]=>
string(10) "cuttlefish"
[2]=>
string(5) "squid"
[3]=>
string(13) "mantis shrimp"
}
Because the array is 0-indexed, the var_dump shows an indexed array with numeric keys between 0 and 3. Each numeric key corresponds with a string value. The first element has a key of 0 and a value of shark. The var_dump function gives us more details about an array: there are 4 items in the array, and the value of the first item is a string with a length of 5.
The numeric key of an indexed array may be specified when setting the value. However, the key is more commonly specified when using a named key.
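As a brief sketch of the first point (with illustrative values), numeric keys can be set by hand, and PHP continues auto-numbering from the highest existing integer key:
<?php
// Setting numeric keys explicitly; the values are illustrative only.
$sea_creatures = [0 => 'shark', 5 => 'squid'];
$sea_creatures[] = 'cuttlefish'; // auto-numbering continues after the highest key, so this is key 6

var_export($sea_creatures);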
Associative arrays are arrays with named keys. They are typically used to hold data that are related, such as the information contained in an ID. An associative array looks like this:
['name' => 'Sammy', 'animal' => 'shark', 'color' => 'blue', 'location' => 'ocean']
Notice the double arrow operator => used to separate the strings. The words to the left of the => are the keys. The key can either be an integer or a string. The keys in the previous array are: 'name', 'animal', 'color', 'location'.
The words to the right of the => are the values. Values can be comprised of any data type, including another array. The values in the previous array are: 'Sammy', 'shark', 'blue', 'ocean'.
Like the indexed array, let’s store the associative array inside a variable, and output the details:
$sammy = ['name' => 'Sammy', 'animal' => 'shark', 'color' => 'blue', 'location' => 'ocean'];
var_dump($sammy);
The results will describe this array as having 4 elements. The string for each key is given, but only the value specifies the type string with a character count:
Outputarray(4) {
["name"]=>
string(5) "Sammy"
["animal"]=>
string(5) "shark"
["color"]=>
string(4) "blue"
["location"]=>
string(5) "ocean"
}
Associative arrays allow us to more precisely access a single element. If we want to isolate Sammy’s color, we can do so by adding square brackets containing the name of the key after the array variable:
echo $sammy['color'];
The resulting output:
Outputblue
As arrays offer key-value mapping for storing data, they can be important elements in your PHP program.
While a constant is not actually a separate data type, it does work differently from other data types. As the name implies, constants are named values that are declared once, after which they do not change throughout your application. The name of a constant should always be uppercase and does not start with a dollar sign. A constant can be declared using either the define function or the const keyword:
define('MIN_VALUE', 1);
const MAX_VALUE = 10;
The define function takes two parameters: the first is a string containing the name of the constant, and the second is the value to assign. This could be any of the data type values explained earlier. The const keyword allows the constant to be assigned a value in the same manner as other data types, using the single equal sign. A constant can be used within your application in the same way as other variables, except it will not be interpolated within a double-quoted string:
echo "The value must be between MIN_VALUE and MAX_VALUE";
echo "The value must be between ".MIN_VALUE." and ".MAX_VALUE;
Because the constants are not interpreted, the output of these lines is different:
OutputThe value must be between MIN_VALUE and MAX_VALUE
The value must be between 1 and 10
At this point, you should have a better understanding of some of the major data types that are available for you to use in PHP. Each of these data types will become important as you develop programming projects in the PHP language.
]]>A software license is a legal agreement that defines how a given piece of software can be used. For software developers who may want to exercise certain rights, permissions, and control over how the work is used, modified, and shared by others, choosing a software license is an important decision. Some developers may want to place strong restrictions over how their software can be used. Others, however, may choose to license their software with few or no restrictions. This may be because they want their software to be as widely used as possible, or perhaps they oppose restrictive software licenses on philosophical grounds.
Regardless of their reasoning, developers can accomplish this by implementing an open-source software license. Broadly speaking, open-source software licenses make the source code available for use, modification, and distribution based on agreed-upon terms and conditions. There are many different open-source software licenses, and they vary based on the restrictions a creator may want future users to abide by.
When it comes to long-term planning for your project, it’s useful to understand the open-source software licenses available so that you can make an informed decision about which one best suits your project’s needs. In this article, we will share information about rights you have when your work is created (such as copyright), and how licensing helps establish the legal agreement you want your users to abide by when using your software. We will also discuss the differences between proprietary, free, and open-source software, permissive and copyleft licenses, and information about the open-source software license options suggested when creating a GitHub project.
Note: This article is not intended to provide any form of legal advice; it is solely an informational resource on the topic of open-source software licensing.
If you’d like to learn more about patents, trademarks, and intellectual property, you can visit the U.S. Patent and Trademark Office.
In the U.S. and many countries, there are certain legal protections you are automatically granted for any creative work you produce, one of those being copyright. The U.S. Copyright Office defines copyright as “a type of intellectual property that protects original works of authorship,” specifically when the “author fixes the work in a tangible form of expression.” This means with copyright you are not the owner of the idea, but rather the material expression of the idea. If a copyright owner desires stricter legal protection over their work, this can be achieved through patents, trademarks, and intellectual property laws. Copyrighting your work does not require a formal process to ensure these rights are given.
Copyright grants the owner various rights, such as reproducing and distributing copies of the work. If an owner wants control over how their work can be used by others, then they must implement a license that outlines the rules by which those users must abide. If the copyright owner states the work is “All Rights Reserved”, this means that their work cannot be used or modified by anyone at all, except themselves.
Another complexity to acknowledge is the creative work you produce for your employer. If you’re engaging in what is known as work for hire, this means that any work you create for the company or organization you work for belongs to that entity, since they’re paying you for the work. As a result, sharing this work without permission has legal consequences since you do not have ownership rights to copyright or licensing.
Proprietary software is any software with a license that restricts how it can be used, modified, or shared. Video games are a common example of proprietary software. If you purchase a video game (whether as a cartridge, disc, or digital download), you aren’t allowed to make a copy of that game to share with friends or sell for profit. It’s also likely you aren’t permitted to modify the game’s code to run it on a different platform than the one you originally bought it for.
Software users are typically held to certain restrictions with an end-user license agreement (EULA). If you’ve ever purchased software, you may have assumed you own that piece of software. However, if you’ve purchased proprietary software, it will likely come with a EULA that specifies you do not own the software. Instead, you’re the owner of a software license that permits you to use that software. EULAs may also define how you can use the license itself, and they typically limit you from sharing it with others without the permission of the software owner (the software’s developer or publisher).
Another legal instrument similar to a EULA is a Terms of Service agreement (ToS). Sometimes known as Terms of Use or Terms and Conditions, a ToS outlines the rules a user must follow in order to be allowed to use a program or service. It’s more common to see an EULA included with software that requires a one-time purchase, while ToS agreements are more common for subscription services and websites. Oftentimes, the first time you start a given piece of proprietary software, a dialog box will appear which explains the EULA or ToS and contains an I Agree button (or something similar) which you must click before you can use the program.
Software with such restrictions hasn’t always been the norm. Before the 1970s, software was typically distributed along with its source code, meaning users were free to modify and share the software as they desired. With time, though, software publishers began imposing restrictions on these activities, typically with the goal of increasing profits by reducing the number of people who used their software but didn’t pay for it.
This development had repercussions in the form of two closely related movements: the free software and the open-source software movements. Although the two are distinct, the free software and open-source software movements both argue that software users should be allowed to access a program’s source code, modify it as they see fit, and share it as often and with whomever they like.
Note: Since free software is generally considered to be open source, but open-source software is not always considered to be free, this guide will default to the more inclusive terms “open-source software” and “open-source software licenses” moving forward. However, please be aware that the two terms are not always interchangeable.
If you’d like a more thorough explanation of the history and differences between free software and open-source software, we encourage you to read our article on The Difference Between Free and Open-Source Software.
Open-source software advocates still encourage developers to distribute their software with a license. However, instead of a proprietary software license outlining what users may not do, they recommend using an open-source software license that outlines the freedoms available to users of the given piece of software. These licenses are often distributed as a single file within the program, typically named LICENSE.txt or following a similar naming convention.
Over the years, there has been some disagreement about what specific freedoms should be guaranteed by an open-source software license. This has led to the emergence of many different open-source licenses, but most of these can fall into one of two categories: permissive and copyleft licenses.
A permissive license, sometimes referred to as a non-copyleft license, grants users permission to use, modify, and share the source code, but users also have the option to change some of those terms and conditions for redistribution, including derivative work. In the context of software, a derivative work is a piece of software that is based on an existing program. If the original was released under a permissive license, a creator can choose to share their derivative work with different terms than what the original work’s license might have required.
A copyleft license also grants users permission to use, modify, and share the source code, but offers protection against relicensing through specific restrictions and terms and conditions. This means that software users creating derivative work are required to release it under the same copyleft license terms and conditions as the original work. This reciprocity is a defining aspect of copyleft licenses, and is intended to protect creators' intentions by ensuring that users will have the same rights and permissions when using works derived from the original software.
In addition, there are public-domain-equivalent licenses that grant users permission to use copyrighted works without attribution or required licensing compatibility. For a creator, this means that any rights over their work are completely forfeited. Although there is some overlap in the philosophies behind public-domain and free and open-source software licenses, there has been disagreement over the years about whether a public-domain-equivalent license truly qualifies as open source. In 2012, the CC0 license was submitted but ultimately denied approval by the Open Source Initiative (OSI), a nonprofit organization that defines standards for open-source software and maintains a list of approved open source licenses. However, the OSI did approve a public-domain-equivalent license called the Unlicense in 2020.
As a developer starting a project from scratch, it’s important to have some familiarity with the open-source software licenses available to assess how you’d like others to use your work. Recognizing these licenses is also important to users so they can understand the permissions or restrictions set by the agreement they’ve made when using the creator’s work.
Again, any original work will have copyright upon completion, but without a license, it’s unclear what is and isn’t allowed for those who want to use it. Consider the following reasons why you might include an open-source software license:
Improvement: The open-source community prides itself on cultivating a culture that encourages collaboration and innovation. Using an open-source software license invites users to engage in community development. This creates a shared sense of responsibility to consistently improve the source code or expand the program further to everyone’s benefit.
Ownership: If you want to exercise more power over your work, choosing a license that can place those restrictions will help you do so. For instance, if you want any derivative works to grant the same permissions as the one you originally chose, you may want to opt for a copyleft license. Fortunately, an open-source software license provides transparency to future users about how much control you retain over the work; whether that is a lot or a little is up to you.
Competition: There’s a plethora of software out there and if you want to break into that market, using an open-source license can help put you on the map. Some popular examples of open-source software that were developed to compete with established proprietary alternatives include the Linux operating system, Android by Google, and the Firefox browser.
Keep in mind that it is possible to monetize an open-source software project, but the typical business practice for monetizing software is to use a proprietary license to protect the software from being shared or stolen.
These reasons for using an open-source software license may not all be applicable to you, and we encourage you to do your own research on the subject before choosing a license for your next project. Additionally, you may want to seek the assistance of a legal professional to confirm a full understanding of what a license would signify for your work in the present and future.
As mentioned earlier, this article focuses on the open-source software licenses listed when creating a new repository for your project on GitHub. You'll notice at the end of the page there is an option for choosing a license. Once you click the box, a drop-down list of licenses will appear for you to select from.
In the next sections, we will provide brief descriptions of the types of open-source software licenses you can choose from for your next project, starting with the permissive licenses recommended by GitHub.
Permissive licenses grant software users permission to use, modify, and share the source code. Additionally, creators of software derived from permissively licensed software can change the licensing conditions for redistribution.
Please note, the following list is not representative of all the permissive open-source software licenses available. Rather, this list is taken from the license options offered by GitHub when starting a new project. Also, these brief descriptions are not comprehensive. We recommend carefully reading through the documentation for any license you’re interested in using or speaking with a legal professional for more information.
The Apache License is written by the Apache Software Foundation (ASF). With this license, users do not have to share their modified version of the source code under the same license and can choose to use a different one; this is known as sublicensing.
The MIT License is from the Massachusetts Institute of Technology (MIT) and is one of the shortest to read with few restrictions. Similar to the Apache license, it also gives users the option to sublicense the software.
GitHub lets you choose between two BSD licenses: the BSD 2-Clause "Simplified" License, sometimes referred to as the "FreeBSD" license, and the BSD 3-Clause "New" or "Revised" License. The main difference between these two licenses is the third clause, which restricts software users from using the name of the author, authors, or contributors to endorse products or services.
The Boost Software License is from the Boost Libraries of C++ and was approved by the OSI in 2008. This license is similar to the MIT and BSD licenses, except it does not require attribution when redistributing in binary form.
Copyleft licenses grant software users permission to use, modify, and share the source code, but also protect against relicensing through specific restrictions and terms and conditions. This represents the reciprocal characteristic of this license that requires users’ work to adhere to the original rights outlined in the license.
Again, the following list is not representative of all the copyleft open-source software licenses available. Rather, this list is taken from the license options offered by GitHub when starting a new project. Also, these brief descriptions are not comprehensive. We recommend carefully reading through the documentation for any license you’re interested in using or speaking with a legal professional for more information.
There have been a number of versions of the GNU General Public License (GPL) released by the Free Software Foundation, four of which users can choose from on GitHub. The GPL v3.0 requires users to state any modifications to the original code and to make that original source code available when distributing any binaries built on the licensed software. This license also made it easier to work with other licenses such as Apache, which the previous version (v2.0) was not compatible with.
Before the current GPL v3.0 version, a second version was created, the GNU Public License v2.0. This license shares similar terms and conditions as v3.0, but is considered a strong copyleft license. A strong copyleft license requires that any modifications to the source code get released using the same license. The primary difference with v2.0 is that software users are allowed to distribute work if they adhere to the requirements of the license, regardless of prior legal obligations. The goal of this clause is to prevent an individual or party from submitting a patent infringement claim that would limit a user’s freedom under this license.
There is also the GNU Lesser General Public License, referred to as LGPL; its v2.1 is the counterpart of the GPL v2.0. This license is meant to serve as a middle ground between strong and weak copyleft licenses. The main difference with this license is that software users can combine a software component of the LGPL with their own and are not required to share the source code of their own components. Users can also distribute a hybrid library, which is a combination of functions in the LGPL library and functions from a non-LGPL library, but there must be a copy of that non-LGPL library and information on where it's located.
Another GNU license is the GNU Affero General Public License v3.0, referred to as AGPL. The main difference with this license is that it is specific to software programs used on a server. This license requires users who run a modified program on a server to make the modified source code of the version currently running on that server available for download.
The Eclipse Public License is from the Eclipse Foundation and is considered a weak copyleft license. A weak copyleft license requires software users to share any changes they make to the code. This license implements a weaker copyleft as a way to reduce the stricter requirements users encountered with GNU's General Public Licenses.
The Mozilla Public License, or MPL, is from the Mozilla Foundation and is also considered a weak copyleft license. The difference with this license (in comparison with the Eclipse Public License) is that it is file-based copyleft, which means code can be combined with open-source or proprietary code.
Public-domain-equivalent licenses grant users permission to use copyrighted works without attribution or required licensing compatibility. As you may recall, these licenses are not always OSI-approved.
The Creative Commons Zero Universal License was written by Creative Commons and is considered a public copyright license, meaning the copyrighted work can be freely distributed. Please be aware that this license is not OSI-approved. The main point about this license is that the creator waives all copyright, placing the work in the public domain; users can then use, distribute, and modify the source code, do not have to provide any attribution to the work, and can use it commercially.
The Unlicense was released in 2012 and is considered a public-domain-equivalent license that is OSI-approved. With this license, software users can use, modify, and distribute the source code and compiled binary for both commercial and non-commercial purposes. This license also advises users who want their contributions to the code or software to remain in the public domain to include a statement about their commitment to sharing the code base with the public.
There are many factors to consider when choosing an open-source software license. Yet, there are certainly popular choices among the developer community. Common permissive licenses include the MIT License, Apache License, and BSD License. Some common copyleft licenses include the GNU General Public License and the Mozilla Public License.
Remember, this article only provided information about a few common open-source software licenses, specifically the ones suggested by GitHub. We encourage you to explore all of your available licensing options or consult the help of a legal professional to make an informed decision about what best fits the needs of your project.
]]>When accessing a web server or application, every HTTP request that is received by a server is responded to with an HTTP status code. HTTP status codes are three-digit codes, and are grouped into five different classes that can be identified by the first digit: 1xx (informational), 2xx (success), 3xx (redirection), 4xx (client error), and 5xx (server error).
This guide focuses on identifying and troubleshooting the most commonly encountered HTTP error codes, i.e. 4xx and 5xx status codes, from a system administrator’s perspective. There are many situations that could cause a web server to respond to a request with a particular error code – we will cover common potential causes and solutions.
Client errors, or HTTP status codes from 400 to 499, are the result of HTTP requests sent by a user client (i.e. a web browser or other HTTP client). Even though these types of errors are client-related, it is often useful to know which error code a user is encountering to determine if the potential issue can be fixed by server configuration.
Server errors, or HTTP status codes from 500 to 599, are returned by a web server when it is aware that an error has occurred or is otherwise not able to process the request.
Web servers typically keep log files, such as access.log and error.log, that can be scanned for relevant information when troubleshooting.
Now that you have a high-level understanding of HTTP status codes, we will look at the commonly encountered errors.
The 400 status code, or Bad Request error, means the HTTP request that was sent to the server has invalid syntax.
Here are a few examples of when a 400 Bad Request error might occur: for instance, a request that was malformed by human error when manually constructing HTTP requests (such as using curl incorrectly).
The 401 status code, or an Unauthorized error, means that the user trying to access the resource has not been authenticated or has not been authenticated correctly. This means that the user must provide credentials to be able to view the protected resource.
An example scenario where a 401 Unauthorized error would be returned is if a user tries to access a resource that is protected by HTTP authentication, as in this Nginx tutorial. In this case, the user will receive a 401 response code until they provide a valid username and password (one that exists in the .htpasswd file) to the web server.
The 403 status code, or a Forbidden error, means that the user made a valid request but the server is refusing to serve the request, due to a lack of permission to access the requested resource. If you are encountering a 403 error unexpectedly, there are a few typical causes that are explained here.
403 errors commonly occur when the user that is running the web server process does not have sufficient permissions to read the file that is being accessed.
To give an example of troubleshooting a 403 error, assume the following situation:
- The user is trying to access the web server's index file, at http://example.com/index.html
- The Nginx worker process is running as the www-data user
- The file that is being accessed is /usr/share/nginx/html/index.html
If the user is getting a 403 Forbidden error, ensure that the www-data user has sufficient permissions to read the file. Typically, this means that the other permissions of the file should be set to read. There are several ways to ensure this, but the following command will work in this case:
sudo chmod o=r /usr/share/nginx/html/index.html
Another potential cause of 403 errors, often intentional, is the use of an .htaccess file. The .htaccess file can be used to deny access to certain resources for specific IP addresses or ranges, for example.
If the user is unexpectedly getting a 403 Forbidden error, ensure that it is not being caused by your .htaccess settings.
If the user is trying to access a directory that does not have a default index file, and directory listings are not enabled, the web server will return a 403 Forbidden error. For example, if the user is trying to access http://example.com/emptydir/, and there is no index file in the emptydir directory on the server, a 403 status will be returned.
If you want directory listings to be enabled, you may do so in your web server configuration.
The 404 status code, or a Not Found error, means that the user is able to communicate with the server but it is unable to locate the requested file or resource.
404 errors can occur in a large variety of situations. If the user is unexpectedly receiving a 404 Not Found error, here are some questions to ask while troubleshooting:
- Does the link that directed the user to the requested resource have a typo in it?
- Does the file actually exist at the location on the server where it is expected to be? Was it moved or deleted?
- Does the server configuration point to the correct document root for the site?
- If the requested resource is a symbolic link, does the link still point to an existing file?
The 500 status code, or Internal Server Error, means that the server cannot process the request for an unknown reason. Sometimes this code will appear when more specific 5xx errors are more appropriate.
The most common cause for this error is server misconfiguration (e.g. a malformed .htaccess file) or missing packages (e.g. trying to execute a PHP file without PHP installed properly).
The 502 status code, or Bad Gateway error, means that the server is a gateway or proxy server, and it is not receiving a valid response from the backend servers that should actually fulfill the request.
If the server in question is a reverse proxy server, such as a load balancer, here are a few things to check:
- Are the backend servers (where the requests are being forwarded to) healthy and running?
- Is the reverse proxy configured properly, with the correct backend addresses and ports?
- Is the network connection between the reverse proxy and the backend servers healthy?
The 503 status code, or Service Unavailable error, means that the server is overloaded or under maintenance. This error implies that the service should become available at some point.
If the server is not under maintenance, this can indicate that the server does not have enough CPU or memory resources to handle all of the incoming requests, or that the web server needs to be configured to allow more users, threads, or processes.
The 504 status code, or Gateway Timeout error, means that the server is a gateway or proxy server, and it is not receiving a response from the backend servers within the allowed time period.
This typically occurs in the following situations: the network connection between the servers is poor, the backend server that is fulfilling the request is too slow, or the gateway or proxy server's timeout duration is too short.
Now that you are familiar with the most common HTTP error codes, and common solutions to those codes, you should have a good basis for troubleshooting issues with your web servers or applications.
If you encounter any error codes that were not mentioned in this guide, or if you know of other likely solutions to the ones that were described, feel free to discuss them in the comments.
]]>As a coder, you are probably used to telling computers what to do. Type up some code, run it, and the computer gets to work executing whatever command you gave it.
Even though we have this powerful reign over computers, there’s still a lot of magic constantly occurring in our code that we tend to overlook. This is especially true if you’re working with high-level languages with pre-built functions, as most of us are. And, of course, while there is no real reason to reinvent the wheel or try to implement these helpful functions on your own, it is still fun to take a peek under the hood and see what’s going on!
In this article, you will take a closer look at one of these concepts that you have probably all used at one point or another: the order of operations.
Say you want to evaluate this sample expression:
5 + 10 * 3
According to the mathematical order of operations, you would multiply 10 by 3 first and then add 5 to the product, but how exactly would you tell a computer to do this?
There are different ways you can parse this equation, but some require a little more background than others.
This tutorial will convert the equation into the correct format. Once it’s in a more machine-readable form, then you can feed it through your parsing algorithm which will calculate it. This tutorial will focus on four operators: addition, subtraction, multiplication, and division.
Even though you may not realize it yet, you are probably already familiar with infix notation. The sample expression is written in infix notation:
5 + 10 * 3
It means the operators fall in between the operands that they’re acting upon.
As mentioned earlier, you need to convert the equation into a format that the computer can understand. This format is called postfix notation.
Expressions written in postfix notation will have all operators following their operands.
This is important because when the machine is reading expressions in this format, it will never encounter an operator before the operands it’s acting on, which means it won’t have to go back and forth.
So the sample expression:
5 + 10 * 3
Becomes:
5 10 3 * +
This may look unusual, but there’s a methodology to arrive at this.
Add in parentheses in order of precedence:
(5 + (10 * 3))
Move every operator to the right, directly before its closing parenthesis:
(5 (10 3 *) +)
Now drop the parentheses altogether, which leaves you with the expression in postfix notation:
5 10 3 * +
Here is another example to show that the operators won’t necessarily always be at the end:
8 * 4 + 2
((8 * 4) + 2)
((8 4 *) 2 +)
8 4 * 2 +
Again, this is not ideal for the computer to do. It still wouldn’t know where to put the parentheses. Luckily, there is an algorithm to produce the same results.
The Shunting Yard Algorithm was developed by Dijkstra as a means to convert infix notation to postfix notation.
Before you go any further, let’s quickly review the two data structures you’re going to be using here: a stack and a queue. You can use an array to hold both of these sets of data. The main difference comes from the order you’re adding and removing the data.
Queue: When you add data to a queue, you’re pushing it onto the back. Just imagine you’re getting in line for an event and every person in line is an element in the queue. When you walk up to the line, you’re automatically inserted into the back of the line. As the event starts letting people in (removing elements from the queue), they pull from the front of the line since those people have been there longer. You can remember this with the acronym FIFO: first in, first out.
Stack: Every time you add a new element to the stack, it will be put on top (or at the front) instead of in the back. When you want to remove an item from the stack, you’ll pop off the top item. Because new elements always go on top, those new ones will always be popped off first when you need to remove something. This can be remembered with the acronym LIFO: last in, first out.
Note: The rest of this tutorial will use push and pop terminology for stacks. A push action refers to adding a new item to the top of the stack. A pop action refers to removing the most recently added item from the top of the stack.
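If it helps to see the two behaviors just described in code, here is a minimal sketch in PHP (the language used for the other runnable examples in this collection):
<?php
// A tiny sketch of queue (FIFO) and stack (LIFO) behavior using plain PHP arrays.
$queue = [];
array_push($queue, 'a', 'b', 'c');
echo array_shift($queue); // a — FIFO: the oldest element leaves first

$stack = [];
array_push($stack, 'a', 'b', 'c');
echo array_pop($stack);   // c — LIFO: the newest element leaves first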
For this algorithm, assume you have one temporary stack to hold the operators (operator stack) and one queue that will hold the final result.
The Shunting Yard Algorithm follows four basic steps: read the expression one token at a time; if the token is an operand, send it straight to the output; if the token is an operator, first pop any operators of higher or equal precedence from the operator stack to the output, then push the new operator onto the stack; and once the expression is exhausted, pop everything left on the operator stack to the output.
It’s hard to make sense of those steps without seeing it in action, so let’s walk through the previous example and try to format it with the algorithm!
Convert this equation from infix notation to postfix notation:
5 + 10 * 3
Let’s set up your two arrays: one for the results output and one for the temporary operator stack:
expression = 5 + 10 * 3
output = []
operator stack = []
First, you start reading the expression from left to right. So first up you have 5. Since this is an operand, you can output it immediately:
expression = + 10 * 3
output = [5]
operator stack = []
Next, you see the +. The operator stack is empty, so you can push it there:
expression = 10 * 3
output = [5]
operator stack = [+]
Next up is 10, so you'll output immediately:
expression = * 3
output = [5, 10]
operator stack = [+]
Now you hit another operator, *. Since the operator stack isn't empty, you have to compare it to the current top of the operator stack to see which has higher precedence. The current top of the stack is +. So comparing the two, you know multiplication has higher precedence than addition.
This means you can push it onto the top of the stack, which gives you:
expression = 3
output = [5, 10]
operator stack = [*, +]
Now you hit your final value, 3. Since this isn't an operator, you can output it immediately:
expression is now empty
output = [5, 10, 3]
operator stack = [*, +]
Since the expression is now empty, all that remains is to pop all tokens from the operator stack and output them immediately. When you pop from the stack, you're grabbing from the top, so first you'll take the * to push to the end of the queue, and then you'll take the +.
output = [5, 10, 3, *, +]
And that’s it! As you can see, it matches the previous method where you add parentheses, but this way is much easier for a computer to do.
You may have noticed there was one point where, instead of using the algorithm to decide, you relied on your own knowledge to make a choice about what to do next: determining which operator had higher precedence.
It’s not important right now while you are understanding the concepts behind the algorithm, but when you’re writing the actual code to solve this, you’re going to have to build in some precedence rules.
You have to create an object that will essentially rank each operator. You’ll give the multiplication and division operators a rank of 2 and the addition and subtraction operators a rank of 1.
When you code it up, you’ll compare two operators by comparing their numerical rank. The actual numbers 1 and 2 here are arbitrary, so don’t get too caught up in that. Grasp the concept that multiplication ranks higher than addition, so it has a higher number.
const precedence = {
"*": 2,
"/": 2,
"+": 1,
"-": 1
};
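To tie the walkthrough and the precedence table together, here is a minimal sketch of the conversion written in PHP (the language used for other runnable examples in this collection; the toPostfix() name is our own). It handles only the four binary operators and ignores parentheses:
<?php
// A minimal sketch of the Shunting Yard Algorithm for the four binary
// operators discussed here. Input is a space-separated infix expression.
function toPostfix(string $infix): array
{
    $precedence = ['*' => 2, '/' => 2, '+' => 1, '-' => 1];
    $output = [];
    $stack  = [];

    foreach (explode(' ', $infix) as $token) {
        if (isset($precedence[$token])) {
            // Pop operators of higher or equal precedence to the output first.
            while ($stack && $precedence[end($stack)] >= $precedence[$token]) {
                $output[] = array_pop($stack);
            }
            $stack[] = $token;
        } else {
            $output[] = $token; // Operands go straight to the output queue.
        }
    }

    // Empty the remaining operators onto the output.
    while ($stack) {
        $output[] = array_pop($stack);
    }

    return $output;
}

print_r(toPostfix('5 + 10 * 3')); // 5, 10, 3, *, +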
You finally have the expression in postfix notation. Now, you can use this format to evaluate it.
Here's how you'll do it: read the postfix expression one token at a time; push every operand onto a stack; whenever you reach an operator, pop two operands off the stack, apply the operator to them, and push the result back onto the stack; once the expression is exhausted, the single value left on the stack is the answer.
In the example, you're only dealing with binary operators, so you can always pop off two operands when you see an operator. If you wanted to expand the example to handle all operators, you'd have to handle unary operators such as !.
Let’s walk through some pseudo-code where we use the algorithm to evaluate the sample postfix notation expression:
5 10 3 * +
First, you start by pushing every operand onto the stack until you hit an operator:
expression = [5, 10, 3, *, +]
- push 5
- push 10
- push 3
stack = [3, 10, 5]
So now you get to your first operator, *, which means it's time to start popping. You pop until you have two values:
- pop 3
- pop 10
Alright, now you have your two operands, 3 and 10, so you will combine this with your operator, *, leaving you with 10 * 3:
expression = [+]
stack = [5]
tempOperand1 = 3
tempOperand2 = 10
tempOperator = *
eval(tempOperand1 + tempOperator + tempOperand2) // 3 * 10
You evaluate that, get 30, and then push this back onto the stack. You now have the following:
expression = [+]
stack = [30, 5]
So you start parsing the expression again and you immediately hit an operator. Again, you have to pop from the stack until you have two operands:
expression = []
stack = []
tempOperand1 = 30
tempOperand2 = 5
tempOperator = +
eval(tempOperand1 + tempOperator + tempOperand2) // 30 + 5
You pop the 30 and the 5 and you are ready to evaluate again. 5 + 30 gives you 35 and you can now push this back onto the stack.
Going back to your original expression to parse for the next token, you find that it’s empty!
expression = []
stack = [35]
This either means that you are done or that the original expression was malformed.
Let's check by looking at your stack. It only has one value in it, so this means you are done and 35 is the final output of the original expression, 5 + 10 * 3.
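The evaluation steps above can be expressed compactly in code as well. Here is a minimal PHP sketch under the same assumptions (binary operators only, well-formed input; the evaluatePostfix() name is our own):
<?php
// A minimal sketch of postfix evaluation: push operands, and when an operator
// appears, pop two operands, apply the operator, and push the result back.
function evaluatePostfix(array $tokens): float
{
    $stack = [];

    foreach ($tokens as $token) {
        if (in_array($token, ['+', '-', '*', '/'], true)) {
            $right = array_pop($stack); // the most recently pushed operand
            $left  = array_pop($stack);
            if ($token === '+') {
                $stack[] = $left + $right;
            } elseif ($token === '-') {
                $stack[] = $left - $right;
            } elseif ($token === '*') {
                $stack[] = $left * $right;
            } else {
                $stack[] = $left / $right;
            }
        } else {
            $stack[] = (float) $token; // operands go straight onto the stack
        }
    }

    return array_pop($stack); // the single remaining value is the result
}

echo evaluatePostfix(['5', '10', '3', '*', '+']); // 35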
The algorithm for evaluating an expression in prefix notation is essentially the same, except this time you read from right to left. With a small modification to the code, you can also evaluate prefix notation.
If you go back to your original method of adding parentheses and moving operators, you can convert to prefix notation in the same way you did postfix. Instead of moving the operators to the end of their operands, you’ll move them to the beginning. Once you’ve done that, you can drop the parentheses altogether and then you have your prefix notation expression!
5 + 10 * 3
(5 + (10 * 3))
(+ 5 (* 10 3))
+ 5 * 10 3
If you want to put your knowledge to the test, try to figure out how you’d do this algorithmically with a small modification to the Shunting Yard Algorithm.
In this tutorial, you’ve created an algorithm for converting expressions to postfix notation and tested it by evaluating an expression.
]]>Nginx is one of the most popular web servers in the world. It can successfully handle high loads with many concurrent client connections, and can function as a web server, a mail server, or a reverse proxy server.
In this guide, we will discuss some of the behind-the-scenes details that determine how Nginx processes client requests. Understanding these ideas can help take the guesswork out of designing server and location blocks and can make the request handling seem less unpredictable.
Nginx logically divides the configurations meant to serve different content into blocks, which live in a hierarchical structure. Each time a client request is made, Nginx begins a process of determining which configuration blocks should be used to handle the request. This decision process is what we will be discussing in this guide.
The main blocks that we will be discussing are the server block and the location block.
A server block is a subset of Nginx’s configuration that defines a virtual server used to handle requests of a defined type. Administrators often configure multiple server blocks and decide which block should handle which connection based on the requested domain name, port, and IP address.
A location block lives within a server block and is used to define how Nginx should handle requests for different resources and URIs for the parent server. The URI space can be subdivided in whatever way the administrator likes using these blocks. It is an extremely flexible model.
Since Nginx allows the administrator to define multiple server blocks that function as separate virtual web server instances, it needs a procedure for determining which of these server blocks will be used to satisfy a request.
It does this through a defined system of checks that are used to find the best possible match. The main server block directives that Nginx is concerned with during this process are the listen directive and the server_name directive.
Parsing the listen Directive to Find Possible Matches
First, Nginx looks at the IP address and the port of the request. It matches this against the listen directive of each server to build a list of the server blocks that can possibly resolve the request.
The listen directive typically defines which IP address and port the server block will respond to. By default, any server block that does not include a listen directive is given the listen parameters of 0.0.0.0:80 (or 0.0.0.0:8080 if Nginx is being run by a normal, non-root user). This allows these blocks to respond to requests on any interface on port 80, but this default value does not hold much weight within the server selection process.
The listen directive can be set to:
- An IP address/port combination
- A lone IP address, which will then listen on the default port 80
- A lone port, which will listen to every interface on that port
- The path to a Unix socket
The last option will generally only have implications when passing requests between different servers.
When trying to determine which server block to send a request to, Nginx will first try to decide based on the specificity of the listen directive, using the following rules:
- Nginx translates all "incomplete" listen directives by substituting missing values with their default values so that each block can be evaluated by its IP address and port. Some examples of these translations are: a block with no listen directive uses the value 0.0.0.0:80; 111.111.111.111 with no port becomes 111.111.111.111:80; and port 8888 with no IP address becomes 0.0.0.0:8888.
- Nginx then narrows the candidates to the server blocks that match the request's IP address and port most specifically. This means that any block that is functionally using 0.0.0.0 as its IP address (to match any interface) will not be selected if there are matching blocks that list a specific IP address. In any case, the port must be matched exactly.
- If there is only one most specific match, that server block will be used to serve the request. If there are multiple server blocks with the same level of specificity, Nginx then begins to evaluate the server_name directive of each server block.
directive when it needs to distinguish between server blocks that match to the same level of specificity in the listen
directive. For instance, if example.com
is hosted on port 80
of 192.168.1.10
, a request for example.com
will always be served by the first block in this example, despite the server_name
directive in the second block.
server {
listen 192.168.1.10;
. . .
}
server {
listen 80;
server_name example.com;
. . .
}
In the event that more than one server block matches with equal specificity, the next step is to check the server_name directive.
Parsing the server_name Directive to Choose a Match
Next, to further evaluate requests that have equally specific listen directives, Nginx checks the request's Host header. This value holds the domain or IP address that the client was actually trying to reach.
Nginx attempts to find the best match for the value it finds by looking at the server_name directive within each of the server blocks that are still selection candidates. Nginx evaluates these by using the following formula:
- Nginx will first try to find a server block with a server_name that matches the value in the Host header of the request exactly. If this is found, the associated block will be used to serve the request. If multiple exact matches are found, the first one is used.
- If no exact match is found, Nginx will then try to find a server block with a server_name that matches using a leading wildcard (indicated by a * at the beginning of the name in the config). If one is found, that block will be used to serve the request. If multiple matches are found, the longest match will be used to serve the request.
- If no match is found using a leading wildcard, Nginx then looks for a server block with a server_name that matches using a trailing wildcard (indicated by a server name ending with a * in the config). If one is found, that block is used to serve the request. If multiple matches are found, the longest match will be used to serve the request.
- If no match is found using a trailing wildcard, Nginx then evaluates server blocks that define the server_name using regular expressions (indicated by a ~ before the name). The first server_name with a regular expression that matches the "Host" header will be used to serve the request.
- If no regular expression match is found, Nginx selects the default server block for that IP address and port.
Each IP address/port combo has a default server block that will be used when a course of action cannot be determined with the above methods. For an IP address/port combo, this will either be the first block in the configuration or the block that contains the default_server option as part of the listen directive (which would override the first-found algorithm). There can be only one default_server declaration per each IP address/port combination.
If there is a server_name defined that exactly matches the Host header value, that server block is selected to process the request. In this example, if the Host header of the request was set to host1.example.com, the second server would be selected:
server {
listen 80;
server_name *.example.com;
. . .
}
server {
listen 80;
server_name host1.example.com;
. . .
}
If no exact match is found, Nginx then checks to see if there is a server_name with a starting wildcard that fits. The longest match beginning with a wildcard will be selected to fulfill the request. In this example, if the request had a Host header of www.example.org, the second server block would be selected:
server {
listen 80;
server_name www.example.*;
. . .
}
server {
listen 80;
server_name *.example.org;
. . .
}
server {
listen 80;
server_name *.org;
. . .
}
If no match is found with a starting wildcard, Nginx will then see if a match exists using a wildcard at the end of the expression. At this point, the longest match ending with a wildcard will be selected to serve the request. For instance, if the request has a Host header set to www.example.com, the third server block will be selected:
server {
listen 80;
server_name host1.example.com;
. . .
}
server {
listen 80;
server_name example.com;
. . .
}
server {
listen 80;
server_name www.example.*;
. . .
}
If no wildcard matches can be found, Nginx will then move on to attempting to match server_name directives that use regular expressions. The first matching regular expression will be selected to respond to the request. For example, if the Host header of the request is set to www.example.com, then the second server block will be selected to satisfy the request:
server {
listen 80;
server_name example.com;
. . .
}
server {
listen 80;
server_name ~^(www|host1).*\.example\.com$;
. . .
}
server {
listen 80;
server_name ~^(subdomain|set|www|host1).*\.example\.com$;
. . .
}
If none of the above steps are able to satisfy the request, then the request will be passed to the default server for the matching IP address and port.
Similar to the process that Nginx uses to select the server block that will process a request, Nginx also has an established algorithm for deciding which location block within the server to use for handling requests.
Before we cover how Nginx decides which location block to use to handle requests, let’s go over some of the syntax you might see in location block definitions. Location blocks live within server blocks (or other location blocks) and are used to decide how to process the request URI (the part of the request that comes after the domain name or IP address/port).
Location blocks generally take the following form:
location optional_modifier location_match {
. . .
}
The location_match in the above defines what Nginx should check the request URI against. The existence or nonexistence of the modifier in the above example affects the way that Nginx attempts to match the location block. The modifiers below will cause the associated location block to be interpreted as follows:
- (none): If no modifiers are present, the location is interpreted as a prefix match, meaning the location given will be matched against the beginning of the request URI.
- =: If an equal sign is used, this block will be considered a match if the request URI exactly matches the location given.
- ~: If a tilde modifier is present, this location will be interpreted as a case-sensitive regular expression match.
- ~*: If a tilde and asterisk modifier is used, the location block will be interpreted as a case-insensitive regular expression match.
- ^~: If a carat and tilde modifier is present, and if this block is selected as the best non-regular expression match, regular expression matching will not take place.
As an example of prefix matching, the following location block may be selected to respond for request URIs that look like /site, /site/page1/index.html, or /site/index.html:
location /site {
. . .
}
For a demonstration of exact request URI matching, this block will always be used to respond to a request URI that looks like /page1
. It will not be used to respond to a /page1/index.html
request URI. Keep in mind that if this block is selected and the request is fulfilled using an index page, an internal redirect will take place to another location that will be the actual handler of the request:
location = /page1 {
. . .
}
As an example of a location that should be interpreted as a case-sensitive regular expression, this block could be used to handle requests for /tortoise.jpg
, but not for /FLOWER.PNG
:
location ~ \.(jpe?g|png|gif|ico)$ {
. . .
}
A block that would allow for case-insensitive matching similar to the above is shown below. Here, both /tortoise.jpg
and /FLOWER.PNG
could be handled by this block:
location ~* \.(jpe?g|png|gif|ico)$ {
. . .
}
Finally, this block would prevent regular expression matching from occurring if it is determined to be the best non-regular expression match. It could handle requests for /costumes/ninja.html
:
location ^~ /costumes {
. . .
}
As you see, the modifiers indicate how the location block should be interpreted. However, this does not tell us the algorithm that Nginx uses to decide which location block to send the request to. We will go over that next.
Nginx chooses the location that will be used to serve a request in a similar fashion to how it selects a server block. It runs through a process that determines the best location block for any given request. Understanding this process is a crucial requirement in being able to configure Nginx reliably and accurately.
Keeping in mind the types of location declarations we described above, Nginx evaluates the possible location contexts by comparing the request URI to each of the locations. It does this using the following algorithm:
Nginx begins by checking all prefix-based location matches (all location types that do not involve a regular expression). It checks each location against the complete request URI.
First, Nginx looks for an exact match. If a location block using the = modifier is found to match the request URI exactly, this location block is immediately selected to serve the request.
If no exact (with the = modifier) location block matches are found, Nginx then moves on to evaluating non-exact prefixes. It discovers the longest matching prefix location for the given request URI, which it then evaluates as follows:
If the longest matching prefix location has the ^~ modifier, Nginx will immediately end its search and select this location to serve the request.
If the longest matching prefix location does not use the ^~ modifier, the match is stored by Nginx for the moment so that the focus of the search can be shifted.
After the longest matching prefix location is determined and stored, Nginx moves on to evaluating the regular expression locations (both case-sensitive and case-insensitive). The first regular expression location that matches the request URI is selected to serve the request.
If no regular expression locations match the request URI, the previously stored prefix location is selected instead.
It is important to understand that, by default, Nginx will serve regular expression matches in preference to prefix matches. However, it evaluates prefix locations first, allowing the administrator to override this tendency by specifying locations using the = and ^~ modifiers.
It is also important to note that, while prefix locations generally select based on the longest, most specific match, regular expression evaluation is stopped when the first matching location is found. This means that positioning within the configuration has vast implications for regular expression locations.
Finally, it is important to understand that regular expression matches within the longest prefix match will “jump the line” when Nginx evaluates regex locations. These will be evaluated, in order, before any of the other regular expression matches are considered. Maxim Dounin, an incredibly helpful Nginx developer, explains this portion of the selection algorithm in this post.
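To make this precedence easier to follow, here is a small Python sketch that mimics the documented order of evaluation. It is only an illustration of the algorithm described above, not Nginx’s actual implementation, and it leaves out the regex reordering just mentioned; the location patterns and URIs are made up for the example:
import re

def select_location(uri, locations):
    # locations: (modifier, pattern) pairs in configuration order, where the
    # modifier is '=', '^~', '~', '~*', or '' for a plain prefix location.
    for mod, pattern in locations:
        if mod == '=' and uri == pattern:
            return (mod, pattern)                 # exact matches win immediately

    prefixes = [(m, p) for m, p in locations if m in ('', '^~') and uri.startswith(p)]
    best_prefix = max(prefixes, key=lambda loc: len(loc[1]), default=None)
    if best_prefix and best_prefix[0] == '^~':
        return best_prefix                        # ^~ stops regex evaluation

    for mod, pattern in locations:
        if mod == '~' and re.search(pattern, uri):
            return (mod, pattern)                 # first matching regex wins
        if mod == '~*' and re.search(pattern, uri, re.IGNORECASE):
            return (mod, pattern)

    return best_prefix                            # otherwise fall back to the stored prefix

print(select_location('/costumes/ninja.html', [('^~', '/costumes'), ('~', r'\.html$')]))
print(select_location('/tortoise.jpg', [('', '/'), ('~', r'\.(jpe?g|png|gif|ico)$')]))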
Generally speaking, when a location block is selected to serve a request, the request is handled entirely within that context from that point onward. Only the selected location and the inherited directives determine how the request is processed, without interference from sibling location blocks.
Although this is a general rule that will allow you to design your location blocks in a predictable way, it is important to realize that there are times when a new location search is triggered by certain directives within the selected location. The exceptions to the “only one location block” rule may have implications on how the request is actually served and may not align with the expectations you had when designing your location blocks.
Some directives that can lead to this type of internal redirect are:
index
try_files
rewrite
error_page
Let’s go over these briefly.
The index
directive always leads to an internal redirect if it is used to handle the request. Exact location matches are often used to speed up the selection process by immediately ending the execution of the algorithm. However, if you make an exact location match that is a directory, there is a good chance that the request will be redirected to a different location for actual processing.
In this example, the first location is matched by a request URI of /exact
, but in order to handle the request, the index
directive inherited by the block initiates an internal redirect to the second block:
index index.html;
location = /exact {
. . .
}
location / {
. . .
}
In the case above, if you really need the execution to stay in the first block, you will have to come up with a different method of satisfying the request to the directory. For instance, you could set an invalid index
for that block and turn on autoindex
:
location = /exact {
index nothing_will_match;
autoindex on;
}
location / {
. . .
}
This is one way of preventing an index
from switching contexts, but it’s probably not useful for most configurations. Mostly, an exact match on a directory is helpful for things like rewriting the request (which also results in a new location search).
Another instance where the processing location may be reevaluated is with the try_files
directive. This directive tells Nginx to check for the existence of a named set of files or directories. The last parameter can be a URI that Nginx will make an internal redirect to.
Consider the following configuration:
root /var/www/main;
location / {
try_files $uri $uri.html $uri/ /fallback/index.html;
}
location /fallback {
root /var/www/another;
}
In the above example, if a request is made for /blahblah
, the first location will initially get the request. It will try to find a file called blahblah
in /var/www/main
directory. If it cannot find one, it will follow up by searching for a file called blahblah.html
. It will then try to see if there is a directory called blahblah/
within the /var/www/main
directory. Failing all of these attempts, it will redirect to /fallback/index.html
. This will trigger another location search that will be caught by the second location block. This will serve the file /var/www/another/fallback/index.html
.
Another directive that can lead to a location block pass off is the rewrite
directive. When using the last
parameter with the rewrite
directive, or when using no parameter at all, Nginx will search for a new matching location based on the results of the rewrite.
For example, if we modify the last example to include a rewrite, we can see that the request is sometimes passed directly to the second location without relying on the try_files
directive:
root /var/www/main;
location / {
rewrite ^/rewriteme/(.*)$ /$1 last;
try_files $uri $uri.html $uri/ /fallback/index.html;
}
location /fallback {
root /var/www/another;
}
In the above example, a request for /rewriteme/hello
will be handled initially by the first location block. It will be rewritten to /hello
and a location will be searched. In this case, it will match the first location again and be processed by the try_files
as usual, maybe kicking back to /fallback/index.html
if nothing is found (using the try_files
internal redirect we discussed above).
However, if a request is made for /rewriteme/fallback/hello
, the first block will match again. The rewrite will be applied again, this time resulting in /fallback/hello
. The request will then be served out of the second location block.
A related situation happens with the return
directive when sending the 301
or 302
status codes. The difference in this case is that it results in an entirely new request in the form of an externally visible redirect. This same situation can occur with the rewrite
directive when using the redirect
or permanent
flags. However, these location searches shouldn’t be unexpected, since externally visible redirects always result in a new request.
The error_page
directive can lead to an internal redirect similar to that created by try_files
. This directive is used to define what should happen when certain status codes are encountered. This will likely never be executed if try_files
is set, since that directive handles the entire life cycle of a request.
Consider this example:
root /var/www/main;
location / {
error_page 404 /another/whoops.html;
}
location /another {
root /var/www;
}
Every request (other than those starting with /another
) will be handled by the first block, which will serve files out of /var/www/main
. However, if a file is not found (a 404 status), an internal redirect to /another/whoops.html
will occur, leading to a new location search that will eventually land on the second block. This file will be served out of /var/www/another/whoops.html
.
As you can see, understanding the circumstances in which Nginx triggers a new location search can help to predict the behavior you will see when making requests.
Understanding the ways that Nginx processes client requests can make your job as an administrator much easier. You will be able to know which server block Nginx will select based on each client request. You will also be able to tell how the location block will be selected based on the request URI. Overall, knowing the way that Nginx selects different blocks will give you the ability to trace the contexts that Nginx will apply in order to serve each request.
]]>In Python, like in all programming languages, data types are used to classify one particular type of data. This is important because the specific data type you use will determine what values you can assign to it and what you can do to it (including what operations you can perform on it).
In this tutorial, we will go over the important data types native to Python. This is not an exhaustive investigation of data types, but will help you become familiar with what options you have available to you in Python.
You should have Python 3 installed and a programming environment set up on your computer or server. If you don’t have a programming environment set up, you can refer to the installation and setup guides for a local programming environment or for a programming environment on your server appropriate for your operating system (Ubuntu, CentOS, Debian, etc.)
One way to think about data types is to consider the different types of data that we use in the real world. An example of data in the real world are numbers: we may use whole numbers (0, 1, 2, …), integers (…, -1, 0, 1, …), and irrational numbers (π), for example.
Usually, in math, we can combine numbers from different types, and get some kind of an answer. We may want to add 5 to π, for example:
5 + π
We can either keep the equation as the answer to account for the irrational number, or round π to a number with a brief number of decimal places, and then add the numbers together:
5 + π = 5 + 3.14 = 8.14
But, if we start to try to evaluate numbers with another data type, such as words, things start to make less sense. How would we solve for the following equation?
sky + 8
For computers, each data type can be thought of as being quite different, like words and numbers, so we will have to be careful about how we use them to assign values and how we manipulate them through operations.
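Python enforces this distinction. For example, trying to add a string and an integer directly raises an error instead of producing a value:
try:
    'sky' + 8
except TypeError as error:
    print(error)   # prints: can only concatenate str (not "int") to str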
Any number you enter in Python will be interpreted as a number; you are not required to declare what kind of data type you are entering. Python will consider any number written without decimals as an integer (as in 138
) and any number written with decimals as a float (as in 138.0
).
Like in math, integers in computer programming are whole numbers that can be positive, negative, or 0 (…, -1
, 0
, 1
, …). An integer can also be known as an int
. As with other programming languages, you should not use commas in numbers of four digits or more, so when you write 1,000 in your program, write it as 1000
.
Info: To follow along with the example code in this tutorial, open a Python interactive shell on your local system by running the python3
command. Then you can copy, paste, or edit the examples by adding them after the >>>
prompt.
We can print out an integer like this:
print(-25)
Output-25
Or, we can declare a variable, which in this case is essentially a symbol of the number we are using or manipulating, like so:
my_int = -25
print(my_int)
Output-25
We can do math with integers in Python, too:
int_ans = 116 - 68
print(int_ans)
Output48
Integers can be used in many ways within Python programs, and as you continue to learn more about the language you will have a lot of opportunities to work with integers and understand more about this data type.
A floating-point number or a float is a real number, meaning that it can be either a rational or an irrational number. Because of this, floating-point numbers can contain a fractional part, such as 9.0
or -116.42
. In general, for the purposes of thinking of a float
in a Python program, it is a number that contains a decimal point.
Like we did with the integer, we can print out a floating-point number like this:
print(17.3)
Output17.3
We can also declare a variable that stands in for a float, like so:
my_flt = 17.3
print(my_flt)
Output17.3
And, just like with integers, we can do math with floats in Python, too:
flt_ans = 564.0 + 365.24
print(flt_ans)
Output929.24
With integers and floating-point numbers, it is important to keep in mind that 3 ≠ 3.0, as 3
refers to an integer while 3.0
refers to a float.
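You can confirm this distinction by checking each value’s type with Python’s built-in type() function:
print(type(3))     # prints: <class 'int'>
print(type(3.0))   # prints: <class 'float'>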
The Boolean data type can be one of two values, either True or False. Booleans are used to represent the truth values that are associated with the logic branch of mathematics, which informs algorithms in computer science.
Whenever you see the data type Boolean, it will start with a capitalized B because it is named for the mathematician George Boole. The values True
and False
will also always be with a capital T and F respectively, as they are special values in Python.
Many operations in math give us answers that evaluate to either True or False:
greater than: 500 > 100 is True, while 1 > 5 is False
less than: 200 < 400 is True, while 4 < 2 is False
equal: 5 = 5 is True, while 500 = 400 is False
Like with numbers, we can store a Boolean value in a variable:
my_bool = 5 > 8
We can then print the Boolean value with a call to the print()
function:
print(my_bool)
Since 5 is not greater than 8, we will receive the following output:
OutputFalse
As you write more programs in Python, you will become more familiar with how Booleans work and how different functions and operations evaluating to either True or False can change the course of the program.
A string is a sequence of one or more characters (letters, numbers, symbols) that can be either a constant or a variable. Strings exist within either single quotes '
or double quotes "
in Python, so to create a string, enclose a sequence of characters in quotes:
'This is a string in single quotes.'
"This is a string in double quotes."
You can choose to use either single quotes or double quotes, but whichever you decide on you should be consistent within a program.
The basic program “Hello, World!” demonstrates how a string can be used in computer programming, as the characters that make up the phrase Hello, World!
are a string.
print("Hello, World!")
As with other data types, we can store strings in variables:
hw = "Hello, World!"
And print out the string by calling the variable:
print(hw)
OutputHello, World!
Like numbers, there are many operations that we can perform on strings within our programs in order to manipulate them to achieve the results we are seeking. Strings are important for communicating information to the user, and for the user to communicate information back to the program.
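For example, one of the most common string operations is concatenation, which joins strings together with the + operator:
greeting = "Hello, " + "Sammy" + "!"
print(greeting)   # prints: Hello, Sammy!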
A list is a mutable, or changeable, ordered sequence of elements. Each element or value that is inside of a list is called an item. Just as strings are defined as characters between quotes, lists are defined by having values between square brackets [ ]
.
A list of integers looks like this:
[-3, -2, -1, 0, 1, 2, 3]
A list of floats looks like this:
[3.14, 9.23, 111.11, 312.12, 1.05]
A list of strings:
['shark', 'cuttlefish', 'squid', 'mantis shrimp']
If we define our string list as sea_creatures
:
sea_creatures = ['shark', 'cuttlefish', 'squid', 'mantis shrimp']
We can print them out by calling the variable:
print(sea_creatures)
And the output looks exactly like the list that we created:
Output['shark', 'cuttlefish', 'squid', 'mantis shrimp']
Lists are a very flexible data type because they are mutable, meaning they can have values added, removed, and changed, as the example below shows. There is a data type that is similar to lists but that can’t be changed, and that is called a tuple.
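For instance, you can append a new item to the sea_creatures list, remove an item by value, and change an existing item in place:
sea_creatures = ['shark', 'cuttlefish', 'squid', 'mantis shrimp']
sea_creatures.append('octopus')        # add an item to the end
sea_creatures.remove('squid')          # remove an item by value
sea_creatures[0] = 'hammerhead shark'  # change an existing item
print(sea_creatures)
# prints: ['hammerhead shark', 'cuttlefish', 'mantis shrimp', 'octopus']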
A tuple is used for grouping data. It is an immutable, or unchangeable, ordered sequence of elements.
Tuples are very similar to lists, but they use parentheses ( )
instead of square brackets and because they are immutable their values cannot be modified.
A tuple looks like this:
('blue coral', 'staghorn coral', 'pillar coral')
We can store a tuple in a variable and print it out:
coral = ('blue coral', 'staghorn coral', 'pillar coral')
print(coral)
Output('blue coral', 'staghorn coral', 'pillar coral')
Like in the other data types, Python prints out the tuple just as we had typed it, with parentheses containing a sequence of values.
The dictionary is Python’s built-in mapping type. This means that dictionaries map keys to values and these key-value pairs are a useful way to store data in Python. A dictionary is constructed with curly braces on either side { }
.
Typically used to hold data that are related, such as the information contained in an ID, a dictionary looks like this:
{'name': 'Sammy', 'animal': 'shark', 'color': 'blue', 'location': 'ocean'}
You will notice that in addition to the curly braces, there are also colons throughout the dictionary. The words to the left of the colons are the keys. Keys can be made up of any immutable data type. The keys in the dictionary above are: 'name', 'animal', 'color', 'location'
.
The words to the right of the colons are the values. Values can be comprised of any data type. The values in the dictionary above are: 'Sammy', 'shark', 'blue', 'ocean'
.
Like the other data types, let’s store the dictionary inside a variable, and print it out:
sammy = {'name': 'Sammy', 'animal': 'shark', 'color': 'blue', 'location': 'ocean'}
print(sammy)
Output{'color': 'blue', 'animal': 'shark', 'name': 'Sammy', 'location': 'ocean'}
If we want to isolate Sammy’s color, we can do so by calling sammy['color']
. Let’s print that out:
print(sammy['color'])
Outputblue
As dictionaries offer key-value pairs for storing data, they can be important elements in your Python program.
At this point, you should have a better understanding of some of the major data types that are available for you to use in Python. Each of these data types will become important as you develop programming projects in the Python language.
You can learn about each of the data types above in more detail by reading the following specific tutorials:
Once you have a solid grasp of data types available to you in Python, you can learn how to convert data types.
]]>OAuth 2 is an authorization framework that enables applications — such as Facebook, GitHub, and DigitalOcean — to obtain limited access to user accounts on an HTTP service. It works by delegating user authentication to the service that hosts a user account and authorizing third-party applications to access that user account. OAuth 2 provides authorization flows for web and desktop applications, as well as mobile devices.
This informational guide is geared towards application developers, and provides an overview of OAuth 2 roles, authorization grant types, use cases, and flows.
OAuth defines four roles:
Resource Owner: the user who authorizes an application to access their account. The application’s access is limited to the scope of the authorization granted (such as read or write access).
Client: the application that wants to access the user’s account. Before it may do so, it must be authorized by the user, and that authorization must be validated by the API.
Resource Server: the server that hosts the protected user accounts.
Authorization Server: the server that verifies the identity of the user and then issues access tokens to the application.
From an application developer’s point of view, a service’s API fulfills both the resource and authorization server roles. We will refer to both of these roles combined, as the Service or API role.
Now that you have an idea of what the OAuth roles are, let’s look at a diagram of how they generally interact with each other:
Here is a more detailed explanation of the steps in the diagram:
1. The application requests authorization to access service resources from the user.
2. If the user authorizes the request, the application receives an authorization grant.
3. The application requests an access token from the authorization server (API) by presenting authentication of its own identity along with the authorization grant.
4. If the application identity is authenticated and the authorization grant is valid, the authorization server (API) issues an access token to the application. Authorization is complete.
5. The application requests the resource from the resource server (API) and presents the access token for authentication.
6. If the access token is valid, the resource server (API) serves the resource to the application.
The actual flow of this process will differ depending on the authorization grant type in use, but this is the general idea. We will explore different grant types in a later section.
Before using OAuth with your application, you must register your application with the service. This is done through a registration form in the developer or API portion of the service’s website, where you will provide the following information (and probably details about your application):
Application Name
Application Website
Redirect URI or Callback URL
The redirect URI is where the service will redirect the user after they authorize (or deny) your application, and therefore the part of your application that will handle authorization codes or access tokens.
Once your application is registered, the service will issue client credentials in the form of a client identifier and a client secret. The Client ID is a publicly exposed string that is used by the service API to identify the application, and is also used to build authorization URLs that are presented to users. The Client Secret is used to authenticate the identity of the application to the service API when the application requests to access a user’s account, and must be kept private between the application and the API.
In the Abstract Protocol Flow outlined previously, the first four steps cover obtaining an authorization grant and access token. The authorization grant type depends on the method used by the application to request authorization, and the grant types supported by the API. OAuth 2 defines three primary grant types, each of which is useful in different cases:
Authorization Code: used with server-side applications
Client Credentials: used by applications that need to access their own service account
Device Code: used by devices that lack a browser or have limited input capabilities
Warning: The OAuth framework specifies two additional grant types: the Implicit Flow type and the Password Grant type. However, these grant types are both considered insecure, and are no longer recommended for use.
Now we will describe grant types in more detail, their use cases and flows, in the following sections.
The authorization code grant type is the most commonly used because it is optimized for server-side applications, where source code is not publicly exposed, and Client Secret confidentiality can be maintained. This is a redirection-based flow, which means that the application must be capable of interacting with the user-agent (i.e. the user’s web browser) and receiving API authorization codes that are routed through the user-agent.
Now we will describe the authorization code flow:
First, the user is given an authorization code link that looks like the following:
https://cloud.digitalocean.com/v1/oauth/authorize?response_type=code&client_id=CLIENT_ID&redirect_uri=CALLBACK_URL&scope=read
Here is an explanation of this example link’s components:
https://cloud.digitalocean.com/v1/oauth/authorize: the API authorization endpoint
response_type=code: specifies that your application is requesting an authorization code grant
client_id=CLIENT_ID: the application’s Client ID (how the API identifies the application)
redirect_uri=CALLBACK_URL: where the service redirects the user-agent after an authorization code is granted
scope=read: specifies the level of access that the application is requesting
When the user clicks the link, they must first log in to the service to authenticate their identity (unless they are already logged in). Then they will be prompted by the service to authorize or deny the application access to their account. Here is an example authorize application prompt:
This particular screenshot is of DigitalOcean’s authorization screen, and it indicates that Thedropletbook App is requesting authorization for read access to the account of manicas@digitalocean.com
.
If the user clicks Authorize Application the service redirects the user-agent to the application redirect URI, which was specified during the client registration, along with an authorization code. The redirect would look something like this (assuming the application is dropletbook.com
):
https://dropletbook.com/callback?code=AUTHORIZATION_CODE
The application requests an access token from the API by passing the authorization code along with authentication details, including the client secret, to the API token endpoint. Here is an example POST
request to DigitalOcean’s token endpoint:
https://cloud.digitalocean.com/v1/oauth/token?client_id=CLIENT_ID&client_secret=CLIENT_SECRET&grant_type=authorization_code&code=AUTHORIZATION_CODE&redirect_uri=CALLBACK_URL
If the authorization is valid, the API will send a response containing the access token (and optionally, a refresh token) to the application. The entire response will look something like this:
{"access_token":"ACCESS_TOKEN","token_type":"bearer","expires_in":2592000,"refresh_token":"REFRESH_TOKEN","scope":"read","uid":100101,"info":{"name":"Mark E. Mark","email":"mark@thefunkybunch.com"}}
Now the application is authorized. It may use the token to access the user’s account via the service API, limited to the scope of access, until the token expires or is revoked. If a refresh token was issued, it may be used to request new access tokens if the original token has expired.
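To make the token exchange step concrete, here is a minimal Python sketch that sends the same parameters shown above to DigitalOcean’s token endpoint using the requests library; the uppercase values are placeholders for your own client credentials, callback URL, and the authorization code your application received:
import requests

response = requests.post(
    "https://cloud.digitalocean.com/v1/oauth/token",
    data={
        "grant_type": "authorization_code",
        "code": "AUTHORIZATION_CODE",
        "client_id": "CLIENT_ID",
        "client_secret": "CLIENT_SECRET",
        "redirect_uri": "CALLBACK_URL",
    },
)
tokens = response.json()           # contains access_token and, if issued, refresh_token
print(tokens["access_token"])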
If a public client is using the Authorization Code grant type, there’s a chance that the authorization code could be intercepted. The Proof Key for Code Exchange (or PKCE, pronounced like “pixie”) is an extension to the Authorization Code flow that helps to mitigate this kind of attack.
The PKCE extension involves the client creating and recording a secret key — known as a code verifier — for every authorization request. The client then transforms the code verifier into a code challenge and then sends this code challenge and the transformation method to the authorization endpoint in the same authorization request.
The authorization endpoint records the code challenge and the transformation method, and responds with the authorization code as outlined previously. The client then sends in the access token request which includes the code verifier it originally generated.
After receiving the code verifier, the authorization server transforms it into the code challenge using the transformation method first shared by the client. If the code challenge derived from the code verifier sent by the client doesn’t match the one originally recorded by the authorization server, then the authorization server will deny the client access.
It’s recommended that every client use the PKCE extension for improved security.
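As a sketch of the client side of PKCE, the snippet below generates a random code verifier and derives a code challenge from it using the S256 transformation (a base64url-encoded SHA-256 hash), one of the transformation methods defined by the extension:
import base64
import hashlib
import secrets

code_verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode()
challenge_bytes = hashlib.sha256(code_verifier.encode()).digest()
code_challenge = base64.urlsafe_b64encode(challenge_bytes).rstrip(b"=").decode()

# The challenge (plus code_challenge_method=S256) goes in the authorization
# request; the verifier is sent later with the access token request.
print(code_verifier, code_challenge)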
The client credentials grant type provides an application a way to access its own service account. Examples of when this might be useful include if an application wants to update its registered description or redirect URI, or access other data stored in its service account via the API.
The application requests an access token by sending its credentials, its client ID and client secret, to the authorization server. An example POST
request might look like the following:
https://oauth.example.com/token?grant_type=client_credentials&client_id=CLIENT_ID&client_secret=CLIENT_SECRET
If the application credentials check out, the authorization server returns an access token to the application. Now the application is authorized to use its own account.
Note: DigitalOcean does not currently support the client credentials grant type, so the link points to an imaginary authorization server at oauth.example.com
.
The device code grant type provides a means for devices that lack a browser or have limited inputs to obtain an access token and access a user’s account. The purpose of this grant type is to make it easier for users to authorize applications on such devices to access their accounts. Examples of when this might be useful include if a user wants to sign into a video streaming application on a device that doesn’t have a typical keyboard input, such as a smart television or a video game console.
The user starts an application on their browserless or input-limited device, such as a television or a set-top box. The application submits a POST
request to a device authorization endpoint.
An example device code POST
request might look like the following:
POST https://oauth.example.com/device
client_id=CLIENT_ID
The device authorization endpoint is different from the authentication server, as the device authorization endpoint doesn’t actually authenticate the device. Instead, it returns a unique device code, which is used to identify the device; a user code, which the user can enter on a machine on which it’s easier to authenticate, such as a laptop or mobile device; and the URL the user should visit to enter the user code and authenticate their device.
Here’s what an example response from the device authorization endpoint might look like:
{
"device_code": "IO2RUI3SAH0IQuESHAEBAeYOO8UPAI",
"user_code": "RSIK-KRAM",
"verification_uri": "https://example.okta.com/device",
"interval": 10,
"expires_in": 1600
}
Note that the device code could also be a QR code which the reader can scan on a mobile device.
The user then enters the user code at the specified URL and signs into their account. They are then presented with a consent screen where they can authorize the device to access their account.
While the user visits the verification URL and enters their code, the device will poll the access endpoint until it returns an error or an authentication token. The access endpoint will return errors if the device is polling too frequently (the slow_down
error), if the user hasn’t yet approved or denied the request (the authorization_pending
error), if the user has denied the request (the access_denied
error), or if the token has expired (the expired_token
error).
If the user approves the request, though, the access endpoint will return an authentication token.
Note: Again, DigitalOcean does not currently support the device code grant type, so the link in this example points to an imaginary authorization server at oauth.example.com
.
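A device-side sketch of this flow might look like the following Python snippet, again pointing at the imaginary oauth.example.com server; the grant type URN is the one defined for the device code grant, and the loop honors the interval and error values described above:
import time
import requests

device = requests.post(
    "https://oauth.example.com/device",
    data={"client_id": "CLIENT_ID"},
).json()
print("Visit", device["verification_uri"], "and enter the code", device["user_code"])

while True:
    time.sleep(device["interval"])                    # poll no faster than the server asks
    token = requests.post(
        "https://oauth.example.com/token",
        data={
            "grant_type": "urn:ietf:params:oauth:grant-type:device_code",
            "device_code": device["device_code"],
            "client_id": "CLIENT_ID",
        },
    ).json()
    if "access_token" in token:
        print(token["access_token"])                  # the user approved the request
        break
    if token.get("error") not in ("authorization_pending", "slow_down"):
        raise RuntimeError(token["error"])            # denied, expired, or another failure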
Once the application has an access token, it may use the token to access the user’s account via the API, limited to the scope of access, until the token expires or is revoked.
Here is an example of an API request, using curl
. Note that it includes the access token:
curl -X POST -H "Authorization: Bearer ACCESS_TOKEN" "https://api.digitalocean.com/v2/$OBJECT"
Assuming the access token is valid, the API will process the request according to its API specifications. If the access token is expired or otherwise invalid, the API will return an invalid_request
error.
After an access token expires, using it to make a request from the API will result in an Invalid Token Error
. At this point, if a refresh token was included when the original access token was issued, it can be used to request a fresh access token from the authorization server.
Here is an example POST
request, using a refresh token to obtain a new access token:
https://cloud.digitalocean.com/v1/oauth/token?grant_type=refresh_token&client_id=CLIENT_ID&client_secret=CLIENT_SECRET&refresh_token=REFRESH_TOKEN
By following this guide, you will have gained an understanding of how OAuth 2 works, and when a particular authorization flow should be used.
If you want to learn more about OAuth 2, check out these valuable resources:
]]>Although they were first invented decades ago, computer-based databases have become ubiquitous on today’s internet. More and more commonly, websites and applications involve collecting, storing, and retrieving data from a database. For many years the database landscape was dominated by relational databases, which organize data in tables made up of rows. To break free from the rigid structure imposed by the relational model, though, a number of different database types have emerged in recent years.
These new database models are jointly referred to as NoSQL databases, as they usually do not use Structured Query Language — also known as SQL — which relational databases typically employ to manage and query data. NoSQL databases offer a high level of scalability as well as flexibility in terms of data structure. These features make NoSQL databases useful for handling large volumes of data and fast-paced, agile development.
This conceptual article outlines the key concepts related to document databases as well as the benefits of using them. Examples used in this article reference MongoDB, a widely-used document-oriented database, but most of the concepts highlighted here are applicable for most other document databases as well.
Breaking free from thinking about databases as consisting of rows and columns, as is the case in a table within a relational database, document databases store data as documents. You might think of a document as a self-contained data entry containing everything needed to understand its meaning, similar to documents used in the real world.
The following is an example of a document that might appear in a document database like MongoDB. This sample document represents a company contact card, describing an employee called Sammy
:
{
"_id": "sammyshark",
"firstName": "Sammy",
"lastName": "Shark",
"email": "sammy.shark@digitalocean.com",
"department": "Finance"
}
Notice that the document is written as a JSON object. JSON is a human-readable data format that has become quite popular in recent years. While many different formats can be used to represent data within a document database, such as XML or YAML, JSON is one of the most common choices. For example, MongoDB adopted JSON as the primary data format to define and manage data.
All data in JSON documents are represented as field-and-value pairs that take the form of field: value
. In the previous example, the first line shows an _id
field with the value sammyshark
. The example also includes fields for the employee’s first and last names, their email address, as well as what department they work in.
Field names allow you to understand what kind of data is held within a document with just a glance. Documents in document databases are self-describing, which means they contain both the data values as well as the information on what kind of data is being stored. When retrieving a document from the database, you always get the whole picture.
The following is another sample document representing a colleague of Sammy’s named Tom
, who works in multiple departments and also uses a middle name:
{
"_id": "tomjohnson",
"firstName": "Tom",
"middleName": "William",
"lastName": "Johnson",
"email": "tom.johnson@digitalocean.com",
"department": ["Finance", "Accounting"]
}
This second document has a few differences from the first example. For instance, it adds a new field called middleName
. Also, this document’s department
field stores not a single value, but an array of two values: "Finance"
and "Accounting"
.
Because these documents hold different fields of data, they can be said to have different schemas. A database’s schema is its formal structure, which outlines what kind of data it can hold. In the case of documents, their schemas are reflected in their field names and what kinds of values those fields represent.
In a relational database, you’d be unable to store both of these example contact cards in the same table, as they differ in structure. You would have to adapt the database schema both to allow storing multiple departments as well as middle names, and you would have to provide a middle name for Sammy or else fill the column for that row with a NULL
value. This is not the case with document databases, which offer you the freedom to save multiple documents with different schemas together with no changes to the database itself.
In document databases, documents are not only self-describing but also their schema is dynamic, which means that you don’t have to define it before you start saving data. Fields can differ between different documents in the same database, and you can modify the document’s structure at will, adding or removing fields as you go. Documents can be also nested — meaning that a field within one document can have a value consisting of another document — making it possible to store complex data within a single document entry.
Let’s imagine the contact card must store information about social media accounts the employee uses and add them as nested objects to the document:
{
"_id": "tomjohnson",
"firstName": "Tom",
"middleName": "William",
"lastName": "Johnson",
"email": "tom.johnson@digitalocean.com",
"department": ["Finance", "Accounting"],
"socialMediaAccounts": [
{
"type": "facebook",
"username": "tom_william_johnson_23"
},
{
"type": "twitter",
"username": "@tomwilliamjohnson23"
}
]
}
A new field called socialMediaAccounts
appears in the document, but instead of a single value, it refers to an array of nested objects describing individual social media accounts. Each of these accounts could be a document on its own, but here they’re stored directly within the contact card. Once again, there is no need to change the database structure to accommodate this requirement. You can immediately save the new document to the database.
Note: In MongoDB, it’s customary to name fields and collections using a camelCase
notation, with no spaces between words, the first word written entirely in lowercase, and any additional words having their first letters capitalized. That said, you can also use different notations such as snake_case
, in which words are all written in lowercase and separated with underscores. Whichever notation you choose, it’s considered best practice to use it consistently across the whole database.
All these attributes make it intuitive to work with document databases from the developer’s perspective. The database facilitates storing actual objects describing data within the application, encouraging experimentation and allowing great flexibility when reshaping data as the software grows and evolves.
While document-oriented databases may not be the right choice for every use case, there are many benefits of choosing one over a relational database. A few of the most important benefits are:
Flexibility and adaptability: with a high level of control over the data structure, document databases enable experimentation and adaptation to new emerging requirements. New fields can be added right away and existing ones can be changed any time. It’s up to the developer to decide whether old documents must be amended or the change can be implemented only going forward.
Ability to manage structured and unstructured data: as mentioned previously, relational databases are well suited for storing data that conforms to a rigid structure. Document databases can be used to handle structured data as well, but they’re also quite useful for storing unstructured data where necessary. You can imagine structured data as the kind of information you would easily represent in a spreadsheet with rows and columns, whereas unstructured data is everything not as straightforward to frame. Examples of unstructured data are rich social media posts with human-generated text and multimedia, server logs that don’t follow a unified format, or data coming from a multitude of different sensors in smart homes.
Scalability by design: relational databases are often write constrained, and increasing their performance requires you to scale vertically (meaning you must migrate their data to more powerful and performant database servers). Conversely, document databases are designed as distributed systems that instead allow you to scale horizontally (meaning that you split a single database up across multiple servers). Because documents are independent units containing both data and schema, it’s relatively trivial to distribute them across server nodes. This makes it possible to store large amounts of data with less operational complexity.
In real-world applications, both document databases and other NoSQL and relational databases are often used together, each responsible for what it’s best suited for. This paradigm of mixing various types of databases is known as polyglot persistence.
While document databases allow great flexibility in how the documents are structured, having some means of organizing data into categories sharing similar characteristics is crucial for ensuring that a database is healthy and manageable.
Imagine a database as an individual cabinet in a company archive with many drawers. For example, one drawer might keep records of employment contracts, with another keeping agreements with business partners. While it is technically possible to put both kinds of documents into a single drawer, it would be difficult to browse the documents later on.
In a document database, such drawers are often called collections, logically similar to tables in relational databases. The role of a collection is to group together documents that share a similar logical function, even if individual documents may slightly differ in their schema. For instance, say you have one employment contract for a fixed term and another that describes a contractor’s additional benefits. Both documents are employment contracts and, as such, it could make sense to group them into a single collection.
Note: While it’s a popular approach, not all document databases use the concept of collections to organize documents together. Some database systems use tags or tree-like hierarchies, others store documents directly within a database with no further subdivisions. MongoDB is one of the popular document-oriented databases that use collections for document organization.
Having similar characteristics between documents within a collection also allows you to build indexes in order to allow for more performant retrieval of documents based on queries related to certain fields. Indexes are special data structures that store a portion of a collection’s data in a way that’s faster to traverse and filter.
As an example, you might have a collection of documents in a database that all share a similar field. Because each document shares the same field, it’s likely you would often use that field when running queries. Without indexes, any query asking the database to retrieve a particular document requires a collection scan — browsing all documents within a collection one by one to find the requested match. By creating an index, however, the database only needs to browse through indexed fields, thereby improving query performance.
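For instance, using the PyMongo driver you might create an index on the department field of a contacts collection like the one in the earlier examples, so that queries filtering on that field no longer require a collection scan; the connection string and names here are only illustrative:
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
contacts = client["company"]["contacts"]

contacts.insert_one({"_id": "sammyshark", "firstName": "Sammy", "department": "Finance"})
contacts.create_index("department")                        # build the index once

for document in contacts.find({"department": "Finance"}):  # this query can use the index
    print(document["firstName"])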
While we mentioned that document-oriented databases can store documents in different formats, such as XML, YAML or JSON, these are often further extended with additional traits that are specific to a given database system, such as additional data types or structure validation features.
For example, MongoDB internally uses a binary format called BSON (short for Binary JSON) instead of a pure JSON. This not only allows for better performance, but it also extends the format with data types that JSON does not support natively. Thanks to this, we can reliably store different kinds of data in MongoDB documents without being restricted to standard JSON types and use filtering, sorting, and aggregation features specific to individual data types.
The following sample document uses several different data types supported by MongoDB:
{
"_id": ObjectId("5a934e000102030405000000"),
"code": NumberLong(2090845886852),
"image": BinData(0, "TGVhcm5pbmcgTW9uZ29EQg=="),
"lastPurchased": ISODate("2021-01-19T06:01:17.171Z"),
"name": "Document database sticker",
"price": NumberDecimal("13.23"),
"quantity": 317,
"tags": [
"stickers",
"accessories"
]
}
Notice that some of these data types are not typical of JSON, such as decimal numbers with exact precision or dates, which are represented as objects such as NumberDecimal
or ISODate
. This ensures that these fields will always be interpreted properly and not mistakenly cast to another similar data type, like a decimal number being cast to a regular double.
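If you work with such documents from Python, the PyMongo driver exposes matching types; in the sketch below, Decimal128 preserves exact decimal precision and a datetime value is stored as a BSON date rather than a plain string (the field values mirror the sticker example above):
from datetime import datetime, timezone
from bson.decimal128 import Decimal128

sticker = {
    "name": "Document database sticker",
    "price": Decimal128("13.23"),          # exact decimal, not a floating-point double
    "lastPurchased": datetime.now(timezone.utc),
    "quantity": 317,
    "tags": ["stickers", "accessories"],
}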
This variety of supported data types, combined with schema validation features, makes it possible to implement a set of rules and validity requirements to give your document database structure. This allows you to model not only unstructured data, but also to create collections of documents that follow more rigid and precise requirements.
Thanks to their flexibility, scalability, and ease of use, document databases are becoming an increasingly popular choice of database for application developers. They are well suited to different applications and work well on their own or as a part of bigger, multi-database ecosystems. The wide array of document-oriented databases has distinct advantages and use cases, making it possible to choose the best database for any given task.
You can learn more about document-oriented databases and other NoSQL databases from DigitalOcean’s community articles on that topic.
To learn more about MongoDB in particular, we encourage you to follow this tutorial series covering many topics on using and administering MongoDB and to check the official MongoDB documentation, a vast source of knowledge about MongoDB as well as document databases in general.
]]>On its own, though, a VPN may not be enough to prevent unauthorized users from accessing your MongoDB installation. For instance, there may be a large number of people who need access to your VPN but only a few of them need access to your Mongo database. You could have more granular control over who has access to your data by setting up a firewall on your database server.
A firewall provides network security by filtering incoming and outgoing traffic based on a set of user-defined rules. Firewall tools generally allow you to define rules with a high level of precision, giving you the flexibility to grant connections from specific IP addresses access to specific ports on your server. For example, you could write rules that would only allow an application server access to the port on your database server used by a MongoDB installation.
Another way to limit your database’s network exposure is to configure IP binding. By default, MongoDB is bound only to localhost upon installation. This means that, without further configuration, a fresh Mongo installation will only be able to accept connections that originate from localhost, or the same server on which the MongoDB instance is installed.
This default setting is secure, since it means the database is only accessible to those who already have access to the server on which it’s installed. However, this setting will cause problems if you need to access the database remotely from another machine. In such cases, you can additionally bind your instance to an IP address or hostname where the remote computer can reach the database server.
Data has become a driving force of technology in recent years, as modern applications and websites need to manage an ever-increasing amount of data. Traditionally, database management systems organize data based on the relational model. As organizations’ data needs have changed, however, a number of new types of databases have been developed.
These new types of databases often don’t rely on the traditional table structure provided by relational databases, and can thus allow for far more flexibility than the rigid structure those tables impose. Additionally, they typically don’t use Structured Query Language (SQL), which is employed by most relational database systems to allow users to define and interact with data. This has led to many of these new non-relational databases being referred to generally as NoSQL databases.
First released in 2009, MongoDB — also known as Mongo — is a document-oriented NoSQL database used in many modern web applications. This conceptual article provides a high-level overview of the features that set MongoDB apart from other database management systems and make it a valuable tool across many different use cases.
As mentioned in the introduction, MongoDB is considered to be a NoSQL database since it doesn’t depend on the relational model. Every database management system is designed around a certain type of data model that defines how the data within the database will be organized. The relational model involves storing data in tables — more formally known as relations — made up of rows and columns.
MongoDB, on the other hand, stores its data records in structures known as documents. Mongo allows you to group multiple documents into a structure known as a collection, which can be further grouped into separate databases.
A document is written in BSON, a binary representation of JSON. Like objects in JSON, MongoDB documents begin and end with curly brackets ({
and }
), and contain a number of field-and-value pairs which typically take the form of field: value
. A field’s value can be any one of the data types used in BSON, or even other structures like documents and arrays.
MongoDB comes installed with a number of features that can help to prevent data loss as well as access by unauthorized users. Some of these features can be found on other database management systems. For instance, Mongo, like many modern DBMSs, allows you to encrypt data as it traverses a network — sometimes called data in transit. It does this by requiring that connections to the database be made with Transport Layer Security (TLS), a cryptographic protocol that serves as a successor to Secure Sockets Layer (SSL).
Also like other DBMSs, Mongo manages authorization — the practice of setting rules for a given user or group of users to define what actions they can perform and what resources they can access — through a computer security concept known as role-based access control, or RBAC. Whenever you create a MongoDB user, you have the option to provide them with one or more roles.
A role defines what privileges a user has, including what actions they can perform on a given database, collection, set of collections, or cluster. For example, you can assign a user the readWrite
role on any database, meaning that the user can read and modify the data held in any database on your system, as long as you’ve granted them the readWrite
role over it. Something that distinguishes MongoDB’s RBAC from that of other databases is that, in addition to its built-in roles, Mongo also allows you to define custom roles, giving you even more control over what resources users can access on your system.
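As a rough sketch of assigning a built-in role through the PyMongo driver, the command below creates a user with the readWrite role on a hypothetical reports database; the names and password are placeholders, and this assumes your deployment has access control enabled:
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
client["reports"].command({
    "createUser": "sammy",
    "pwd": "replace_with_a_strong_password",
    "roles": [{"role": "readWrite", "db": "reports"}],
})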
Since the release of version 4.2, MongoDB supports client-side field level encryption. This involves encrypting certain fields within a document before the data gets written to the database. Any client or application that tries to read it later on must first present the correct encryption keys to be able to decrypt the data in these fields.
To illustrate, say your database holds a document with the following fields and values:
{
"name" : "Sammy",
"phone" : "555-555-1234",
"creditcard" : "1234567890123456"
}
It could be dangerous to store sensitive information like this — namely, a person’s phone and credit card numbers — in a real-world application. Even if you’ve put limits on who can access the database, anyone who has privileges to access the database could see and take advantage of your users’ sensitive information. When properly configured, though, these fields would look something like this if they were written with client-side field level encryption:
{
"name" : "Sammy",
"phone" : BinData6,"quas+eG4chuolau6ahq=i8ahqui0otaek7phe+Miexoo"),
"creditcard" : BinData6,"rau0Teez=iju4As9Eeyiu+h4coht=ukae8ahFah4aRo="),
}
For a more thorough overview of MongoDB’s security features, along with some general strategies for keeping a Mongo database secure, we encourage you to check out our series on MongoDB Security: Best Practices to Keep Your Data Safe.
Another characteristic of MongoDB that has helped drive its adoption is the flexibility it provides when compared with more traditional database management systems. This flexibility is rooted in MongoDB’s document-based design, since collections in Mongo do not enforce a specific structure that every document within them must follow. This contrasts with the rigid structure imposed by tables in a relational database.
Whenever you create a table in a relational database, you must explicitly define the set of columns the table will hold along with their data types. Following that, every row of data you add must conform to that specific structure. On the other hand, MongoDB documents in the same collection can have different fields, and even if they share a given field it can hold different data types in different documents.
This rigidity imposed by the relational model isn’t necessarily a bad thing. In fact, it makes relational databases quite useful for storing data that neatly conforms to a predefined structure. But it can become limiting in cases where you need to store unstructured data — data that doesn’t easily fit into predefined data models or isn’t easily searchable by conventional tools.
Examples of unstructured data include media content, like videos or photos, communications data, or text files. Sometimes, unstructured data is generalized as qualitative data. In other words, data that may be human readable but is difficult for computers to adequately parse. MongoDB’s versatile document-oriented design, however, makes it a great choice for storing and analyzing unstructured data as well as structured and semi-structured data.
Another example of Mongo’s flexibility is how it offers multiple avenues for interacting with one’s data. For example, you can run the mongo
shell, a JavaScript-based interface that comes installed with the MongoDB server, which allows you to interact with your data from the command line.
Mongo also supports a number of official drivers that can help you connect a database to your application. Mongo provides these libraries for a variety of popular programming languages, including PHP, Java, JavaScript, and Python. These drivers also provide support for the data types found in their respective host languages, expanding on the BSON data types available by default.
Any computer-based database system depends on its underlying hardware to function and serve the needs of an application or client. If the machine on which it’s running fails for any reason, the data held within the database won’t be accessible until the machine is back up and running. If a database management system is able to remain in operation for a higher than normal period of time, it’s said to be highly available.
One way many databases remain highly available is through a practice known as replication. Replication involves synchronizing data across multiple different databases running on separate machines. This results in multiple copies of the same data and provides redundancy in case one of the database servers fails. This ensures that the synchronized data always remains available to the applications or clients that depend on it.
In MongoDB, a group of servers that maintain the same data set through replication are referred to as a replica set. Each running instance of MongoDB that’s part of a given replica set is referred to as one of its members. Every replica set must have one primary member and at least one secondary member.
One advantage that MongoDB’s replica sets have over other replication implementations in other database systems is Mongo’s automatic failover mechanism. In the event that the primary member becomes unavailable, an automated election process happens among the secondary nodes to choose a new primary.
As a core component of modern applications, it’s important for a database to be able to respond to changes in the amount of work it must perform. After all, an application can see sudden surges in its number of users, or perhaps experience periods of particularly heavy workloads.
Scalability refers to a computer system’s ability to handle an ever-growing amount of work, and the practice of increasing this capacity is called scaling. There are two ways one can scale a computer system:
Vertical scaling (also known as scaling up): adding more computing resources, such as CPU, memory, or storage, to an existing server.
Horizontal scaling (also known as scaling out): splitting the workload across multiple servers, each of which handles part of the work.
To vertically scale a MongoDB database, one could back up its data and migrate it to another machine with more computing resources. This is generally the same procedure for vertically scaling any database management system, including relational databases. However, scaling up like this can have drawbacks. The cost of using larger and larger machines can become prohibitive over time and, no matter how powerful a single machine is, there is always an upper limit to how much data it can store.
Sharding is a strategy some administrators employ for scaling out a database. If you’d like a thorough explanation of sharding, we encourage you to read our conceptual article on Understanding Database Sharding. For the purposes of this article, though, understand that sharding is the process of breaking up a data set based on a given set of rules, and distributing the resulting pieces of data across multiple separate database nodes. A single node that holds part of a sharded cluster’s data set is known as a shard.
Database management systems don’t always include sharding capabilities as a built-in feature, so oftentimes sharding is implemented at the application level. MongoDB, however, does include a built-in sharding feature which allows you to shard data at the collection level. As of version 3.6, every MongoDB shard must be deployed as a replica set to ensure that the shard’s data remains highly available.
To shard data in Mongo, you must select one or more fields in a given collection’s documents to function as the shard key. MongoDB then takes the range of shard key values and divides them into non-overlapping ranges, known as chunks, and each chunk is assigned to a given shard.
Following that, Mongo reads each document’s shard key value, determines what chunk the document belongs to, and then distributes the document to the appropriate shard. MongoDB actively monitors the number of chunks in each shard, and will attempt to migrate chunks from one shard to another to ensure that each has an equal amount.
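As a rough sketch of how this is switched on from the mongo shell (connected to the cluster's mongos router), with a hypothetical database, collection, and shard key chosen only for illustration:
sh.enableSharding("exampledb")
sh.shardCollection("exampledb.users", { lastName: 1 })
sh.status()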
The main drawback of sharding is that it adds a degree of operational complexity to a database system. However, once you have a working MongoDB shard cluster, the process of adding more shards to scale the system horizontally is fairly straightforward, and a properly configured replica set can be added as a shard with a single command. This makes MongoDB an appealing choice for applications that need to scale out quickly.
Relational database management systems still see wider use than databases that employ a NoSQL model. With that said, though, MongoDB continues to gain ground thanks to the features described throughout this guide. In particular, it’s become a common choice of database for a number of use cases.
For example, its scaling capabilities and high availability make it a popular database for e-commerce and gaming applications where the number of users being served can increase quickly and dramatically. Likewise, its flexible schema and ability to handle large amounts of unstructured data make it a great choice for content management applications which need to manage an ever-evolving catalog of assets, ranging from text to video, images, and audio files. It has also seen strong adoption among mobile application developers, thanks again to its powerful scaling as well as its data analysis capabilities.
When deciding whether you should use MongoDB in your next application, you should first ask yourself what the application’s specific data needs are. If your application will store data that rigidly adheres to a predefined structure, you may not get much additional value from Mongo’s schemaless design and you might be better off using a relational database.
Then, weigh how much data you expect your application will need to store and use. MongoDB’s document-oriented design makes it a great choice for applications that need to store large amounts of unstructured data. Similarly, MongoDB’s scalability and high availability make it a perfect fit for applications that serve a large and ever-growing number of clients. However, these features could be excessive in cases that aren’t as data intensive.
By reading this article, you’ll have gained a better understanding of the features that set MongoDB apart from other database management systems. Although MongoDB is a powerful, flexible, and secure database management system that can be the right choice of database in certain use cases, it may not always be the best choice. While its document-based and schemaless design may not supplant the relational database model any time soon, Mongo’s rapid growth highlights its value as a tool worth understanding.
For more information about MongoDB, we encourage you to check out DigitalOcean’s entire library of MongoDB content. Additionally, the official MongoDB documentation serves as a valuable resource of information on working with Mongo.
]]>With an increased demand for reliable and performant infrastructures designed to serve critical systems, the terms scalability and high availability couldn’t be more popular. While handling increased system load is a common concern, decreasing downtime and eliminating single points of failure are just as important. High availability is a quality of infrastructure design at scale that addresses these latter considerations.
In this guide, we will discuss what exactly high availability means and how it can improve your infrastructure’s reliability.
In computing, the term availability is used to describe the period of time when a service is available, as well as the time required by a system to respond to a request made by a user. High availability is a quality of a system or component that assures a high level of operational performance for a given period of time.
Availability is often expressed as a percentage indicating how much uptime is expected from a particular system or component in a given period of time, where a value of 100% would indicate that the system never fails. For instance, a system that guarantees 99% availability over a period of one year can have up to 3.65 days of downtime (1%).
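To put other common targets in perspective, 99.9% availability ("three nines") allows for roughly (1 − 0.999) × 365 × 24 ≈ 8.76 hours of downtime per year, 99.99% allows for about 52.6 minutes, and 99.999% allows for only about 5.26 minutes.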
These values are calculated based on several factors, including both scheduled and unscheduled maintenance periods, as well as the time to recover from a possible system failure.
High availability functions as a failure response mechanism for infrastructure. The way that it works is quite simple conceptually but typically requires some specialized software and configuration.
When setting up robust production systems, minimizing downtime and service interruptions is often a high priority. Regardless of how reliable your systems and software are, problems can occur that can bring down your applications or your servers. Implementing high availability for your infrastructure is a useful strategy to reduce the impact of these types of events. Highly available systems can recover from server or component failure automatically.
One of the goals of high availability is to eliminate single points of failure in your infrastructure. A single point of failure is a component of your technology stack that would cause a service interruption if it became unavailable. As such, any component that is a requisite for the proper functionality of your application that does not have redundancy is considered to be a single point of failure. To eliminate single points of failure, each layer of your stack must be prepared for redundancy. For instance, imagine you have an infrastructure consisting of two identical, redundant web servers behind a load balancer. The traffic coming from clients will be equally distributed between the web servers, but if one of the servers goes down, the load balancer will redirect all traffic to the remaining online server.
The web server layer in this scenario is not a single point of failure because redundant web servers share the incoming workload, and the load balancer is able to detect when one of them fails and stop sending traffic to it.
But what happens if the load balancer goes offline?
With the described scenario, which is not uncommon in real life, the load balancing layer itself remains a single point of failure. Eliminating this remaining single point of failure, however, can be challenging; even though you can easily configure an additional load balancer to achieve redundancy, there isn’t an obvious point above the load balancers to implement failure detection and recovery.
Redundancy alone cannot guarantee high availability. A mechanism must be in place for detecting failures and taking action when one of the components of your stack becomes unavailable.
Failure detection and recovery for redundant systems can be implemented using a top-to-bottom approach: the layer on top becomes responsible for monitoring the layer immediately beneath it for failures. In our previous example scenario, the load balancer is the top layer. If one of the web servers (bottom layer) becomes unavailable, the load balancer will stop redirecting requests for that specific server.
This approach tends to be simpler, but it has limitations: there will be a point in your infrastructure where a top layer is either nonexistent or out of reach, which is the case with the load balancer layer. Creating a failure detection service for the load balancer in an external server would simply create a new single point of failure.
With such a scenario, a distributed approach is necessary. Multiple redundant nodes must be connected together as a cluster where each node should be equally capable of failure detection and recovery.
For the load balancer case, however, there's an additional complication, due to the way nameservers work. Recovering from a load balancer failure typically means a failover to a redundant load balancer, which implies that a DNS change must be made in order to point a domain name to the redundant load balancer's IP address. A change like this can take a considerable amount of time to propagate on the Internet, which would cause serious downtime for the system.
A possible solution is to use DNS round-robin load balancing. However, this approach is not reliable, as it leaves failover up to the client-side application.
A more robust and reliable solution is to use systems that allow for flexible IP address remapping, such as Reserved IPs. On demand IP address remapping eliminates the propagation and caching issues inherent in DNS changes by providing a static IP address that can be easily remapped when needed. The domain name can remain associated with the same IP address, while the IP address itself is moved between servers.
The following is a visual representation of a highly available infrastructure that uses Reserved IPs:
There are several components that must be carefully taken into consideration when implementing high availability in practice. Much more than a software implementation, high availability depends on factors such as the environment your servers run in, the reliability of your hardware and software, the integrity of your data, and the design of your network.
Each layer of a highly available system will have different needs in terms of software and configuration. However, at the application level, load balancers represent an essential piece of software for creating any high availability setup.
HAProxy (High Availability Proxy) is a common choice for load balancing, as it can handle load balancing at multiple layers, and for different kinds of servers, including database servers.
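To make this more concrete, here is a minimal HAProxy sketch (not taken from this article) that spreads HTTP traffic across two hypothetical backend web servers and actively health-checks them; the server names, addresses, and health-check path are placeholders, and a complete configuration would also include global and defaults sections:
frontend www
    bind *:80
    mode http
    default_backend web_servers
backend web_servers
    mode http
    balance roundrobin
    option httpchk GET /health
    server web1 10.0.0.11:80 check
    server web2 10.0.0.12:80 check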
Moving up in the system stack, it is important to implement a reliable redundant solution for your application entry point, normally the load balancer. To remove this single point of failure, as mentioned before, we need to implement a cluster of load balancers behind a Reserved IP. Corosync and Pacemaker are popular choices for creating such a setup, on both Ubuntu and CentOS servers.
High availability is an important subset of reliability engineering, focused on ensuring that a system or component has a high level of operational performance over a given period of time. At first glance, its implementation might seem quite complex; however, it can bring tremendous benefits to systems that require increased reliability.
]]>Nginx is one of the most popular web servers in the world. It can successfully handle high loads with many concurrent client connections, and it can easily function as a web server, a mail server, or a reverse proxy server.
In this guide, we will discuss some of the behind-the-scenes details that determine how Nginx processes client requests. Understanding these ideas can take the guesswork out of designing server and location blocks and can make request handling feel less unpredictable.
Nginx logically divides the configurations meant to serve different content into blocks, which live in a hierarchical structure. Each time a client request is made, Nginx begins a process of determining which configuration blocks should be used to handle the request. This decision process is what we will discuss in this guide.
The main blocks that we will be discussing are the server block and the location block.
A server block is a subset of the Nginx configuration that defines a virtual server used to handle requests of a defined type. Administrators often configure multiple server blocks and decide which block should handle which connection based on the requested domain name, port, and IP address.
A location block lives within a server block and is used to define how Nginx should handle requests for different resources and URIs for the parent server. The URI space can be subdivided in whatever way the administrator likes using these blocks. It is an extremely flexible model.
Because Nginx allows the administrator to define multiple server blocks that function as separate virtual web server instances, it needs a procedure for determining which of these server blocks will be used to satisfy a request.
It does this through a defined system of checks that are used to find the best possible match. The main server block directives that Nginx is concerned with during this process are the listen directive and the server_name directive.
First, Nginx checks the IP address and port of the request. It matches this against the listen directive of each server to build a list of the server blocks that could possibly resolve the request.
The listen directive typically defines which IP address and port the server block will respond to. By default, any server block that does not include a listen directive is given the listen parameters of 0.0.0.0:80 (or 0.0.0.0:8080 if Nginx is being run by a normal, non-root user). This allows these blocks to respond to requests on any interface on port 80, but this default value does not carry much weight within the server selection process.
The listen directive can be set to:
- an IP address/port combination
- a lone IP address, which will then listen on the default port 80
- a lone port, which will listen to every interface on that port
- the path to a Unix socket
The last option will generally only have implications when passing requests between different servers.
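For instance, each of the following is a valid form of the listen directive; the addresses, port numbers, and socket path shown here are placeholders used only for illustration:
listen 111.111.111.111:8080;
listen 111.111.111.111;
listen 8888;
listen unix:/var/run/nginx.sock;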
When attempting to determine which server block to send a request to, Nginx will first try to decide based on the specificity of the listen directive, using the following rules:
- Nginx translates all "incomplete" listen directives by substituting missing values with their default values, so that each block can be evaluated by its IP address and port. Some examples of these translations are: a block with no listen directive uses the value 0.0.0.0:80; a block set to an IP address of 111.111.111.111 with no port becomes 111.111.111.111:80; a block set to port 8888 with no IP address becomes 0.0.0.0:8888.
- Nginx then attempts to collect a list of the server blocks that match the request most specifically based on the IP address and port. This means that any block that is functionally using 0.0.0.0 as its IP address (to match any interface) will not be selected if there are matching blocks that list a specific IP address. In any case, the port must be matched exactly.
- If there is only one most specific match, that server block will be used to serve the request. If there are multiple server blocks with the same level of specificity, Nginx then begins to evaluate the server_name directive of each server block.
It is important to understand that Nginx will only evaluate the server_name directive when it needs to distinguish between server blocks that match to the same level of specificity in the listen directive. For instance, if example.com is hosted on port 80 of 192.168.1.10, a request for example.com will always be served by the first block in this example, in spite of the server_name directive in the second block:
server {
listen 192.168.1.10;
. . .
}
server {
listen 80;
server_name example.com;
. . .
}
In the event that more than one server block matches with the same specificity, the next step is to check the server_name directive.
To further evaluate requests that have equally specific listen directives, Nginx checks the request's "Host" header. This value holds the domain or IP address that the client was actually trying to reach.
Nginx attempts to find the best match for the value it finds by looking at the server_name directive within each of the server blocks that are still selection candidates. Nginx evaluates these using the following formula:
- Nginx will first try to find a server block with a server_name that matches the value in the "Host" header of the request exactly. If this is found, the associated block will be used to serve the request. If multiple exact matches are found, the first one is used.
- If no exact match is found, Nginx will then try to find a server block with a server_name that matches using a leading wildcard (indicated by a * at the beginning of the name in the configuration). If one is found, that block will be used to serve the request. If multiple matches are found, the longest match will be used to serve the request.
- If no match is found using a leading wildcard, Nginx then looks for a server block with a server_name that matches using a trailing wildcard (indicated by a server name ending with a * in the configuration). If one is found, that block is used to serve the request. If multiple matches are found, the longest match will be used to serve the request.
- If no match is found using a trailing wildcard, Nginx then evaluates server blocks that define the server_name using regular expressions (indicated by a ~ before the name). The first server_name with a regular expression that matches the "Host" header will be used to serve the request.
Each IP address and port combination has a default server block that will be used when a course of action cannot be determined with the above methods. For a given IP address and port combination, this will either be the first block in the configuration or the block that contains the default_server option as part of the listen directive (which would override the first-found algorithm). There can be only one default_server declaration per IP address/port combination.
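For example, a server block can be explicitly marked as the default for its address and port by adding the default_server parameter to its listen directive; this is only a minimal sketch, not a configuration taken from the examples in this article:
server {
    listen 80 default_server;
    server_name example.com;
    . . .
}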
If there is a server_name defined that matches the "Host" header value exactly, that server block is selected to process the request.
In this example, if the "Host" header of the request was set to "host1.example.com", the second server would be selected:
server {
listen 80;
server_name *.example.com;
. . .
}
server {
listen 80;
server_name host1.example.com;
. . .
}
If no exact match is found, Nginx then checks whether there is a server_name with a matching leading wildcard. The longest match beginning with a wildcard will be selected to fulfill the request.
In this example, if the request had a "Host" header of "www.example.org", the second server block would be selected:
server {
listen 80;
server_name www.example.*;
. . .
}
server {
listen 80;
server_name *.example.org;
. . .
}
server {
listen 80;
server_name *.org;
. . .
}
If no match is found using a leading wildcard, Nginx then checks whether a match exists using a wildcard at the end of the expression. At this point, the longest match ending with a wildcard will be selected to serve the request.
For example, if the request has a "Host" header set to "www.example.com", the third server block will be selected:
server {
listen 80;
server_name host1.example.com;
. . .
}
server {
listen 80;
server_name example.com;
. . .
}
server {
listen 80;
server_name www.example.*;
. . .
}
If no wildcard matches can be found, Nginx then attempts to match against server_name directives that use regular expressions. The first matching regular expression will be selected to respond to the request.
For example, if the "Host" header of the request is set to "www.example.com", the second server block will be selected to satisfy the request:
server {
listen 80;
server_name example.com;
. . .
}
server {
listen 80;
server_name ~^(www|host1).*\.example\.com$;
. . .
}
server {
listen 80;
server_name ~^(subdomain|set|www|host1).*\.example\.com$;
. . .
}
If none of the above steps are able to satisfy the request, the request will be passed to the default server for the matching IP address and port.
Similar to the process that Nginx uses to select the server block that will handle a request, Nginx also has an established algorithm for deciding which location block within the server to use for handling requests.
Before we cover how Nginx decides which location block to use, let's go over some of the syntax you might see in location block definitions. Location blocks live within server blocks (or other location blocks) and are used to decide how to process the request URI (the part of the request that comes after the domain name or IP address/port).
Location blocks generally take the following form:
location optional_modifier location_match {
. . .
}
The location_match above defines what Nginx should check the request URI against. The existence or absence of the modifier in the example above affects the way that Nginx attempts to match the location block. The modifiers cause the associated location block to be interpreted as follows:
- (none): If no modifier is present, the location is interpreted as a prefix match, meaning the location given will be matched against the beginning of the request URI.
- =: If an equal sign is used, this block will be considered a match if the request URI exactly matches the location given.
- ~: If a tilde modifier is present, this location will be interpreted as a case-sensitive regular expression match.
- ~*: If a tilde and asterisk modifier is used, the location block will be interpreted as a case-insensitive regular expression match.
- ^~: If a carat and tilde modifier is present, and if this block is selected as the best non-regular-expression match, regular expression matching will not take place.
As an example of prefix matching, the following location block may be selected to respond to request URIs that look like /site, /site/page1/index.html, or /site/index.html:
location /site {
. . .
}
To demonstrate exact request URI matching, this block will always be used to respond to a request URI that looks like /page1. It will not be used to respond to a /page1/index.html request URI. Keep in mind that if this block is selected and the request is fulfilled using an index page, an internal redirect will take place to another location that will be the actual handler of the request:
location = /page1 {
. . .
}
As an example of a location that should be interpreted as a case-sensitive regular expression match, this block could be used to handle requests for /tortoise.jpg, but not for /FLOWER.PNG:
location ~ \.(jpe?g|png|gif|ico)$ {
. . .
}
A block that would allow for case-insensitive matching similar to the above is shown below. Here, both /tortoise.jpg and /FLOWER.PNG could be handled by this block:
location ~* \.(jpe?g|png|gif|ico)$ {
. . .
}
Finally, this block would prevent regular expression matching from occurring if it is determined to be the best non-regular-expression match. It could handle requests for /costumes/ninja.html:
location ^~ /costumes {
. . .
}
As you can see, the modifiers indicate how the location block should be interpreted. However, this does not tell us which algorithm Nginx uses to decide which location block to send the request to. We will go over that next.
Nginx chooses the location that will be used to serve a request in a fashion similar to how it selects a server block. It runs through a process that determines the best location block for any given request. Understanding this process is a crucial requirement for being able to configure Nginx reliably and correctly.
Keeping in mind the types of location declarations described above, Nginx evaluates the possible location contexts by comparing the request URI against each of the locations. It does this using the following algorithm:
- If a location block with the = modifier is found that exactly matches the request URI, this location block is immediately selected to serve the request.
- If no exact (with the = modifier) location block matches are found, Nginx moves on to evaluating non-exact prefixes. It finds the longest matching prefix location for the given request URI, which it then evaluates as follows:
  - If the longest matching prefix location has the ^~ modifier, Nginx immediately ends its search and selects this location to serve the request.
  - If the longest matching prefix location does not use the ^~ modifier, the match is stored by Nginx for the moment so that the focus of the search can shift to the regular expression locations. These are evaluated in the order they appear in the configuration, and the first regular expression location that matches the request URI is selected to serve the request. If no regular expression location matches, the previously stored prefix location is used instead.
It is important to understand that, by default, Nginx serves regular expression matches in preference to prefix matches. However, it evaluates prefix locations first, which allows the administrator to override this tendency by specifying locations with the = and ^~ modifiers.
It is also important to note that, while prefix locations are generally selected based on the longest, most specific match, regular expression evaluation stops as soon as the first matching location is found. This means that positioning within the configuration has vast implications for regular expression locations.
Finally, it is important to understand that regular expression matches within the longest prefix match will "jump the line" when Nginx evaluates regex locations. These are evaluated, in order, before any other regular expression matches are considered. Maxim Dounin, an incredibly helpful Nginx developer, explains this portion of the selection algorithm in this post.
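As a small sketch of why ordering matters for regular expression locations (the patterns here are illustrative and not taken from this article), consider the two overlapping blocks below. A request for /images/photo.jpg matches both patterns, but it will be handled by the first block simply because that regular expression is evaluated first, even though the second pattern is more specific:
location ~* \.(jpe?g|png|gif)$ {
    . . .
}
location ~* ^/images/.*\.(jpe?g|png|gif)$ {
    . . .
}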
When a location block is selected to serve a request, the request will generally be handled entirely within that context from that point onward. Only the selected location and the inherited directives determine how the request is processed, without interference from sibling location blocks.
Although this is a general rule that allows you to design your location blocks in a predictable way, it is important to realize that there are times when a new location search is triggered by certain directives within the selected location. The exceptions to the "only one location block" rule may have implications on how the request is actually served and may not align with the expectations you had when designing your location blocks.
Some directives that can lead to this type of internal redirect are the index, try_files, rewrite, return, and error_page directives. Let's go over these briefly.
The index directive always leads to an internal redirect if it is used to handle the request. Exact location matches are often used to speed up the selection process by immediately ending the execution of the algorithm. However, if you make an exact location match that is a directory, there is a good chance that the request will be redirected to a different location for the actual processing.
In this example, the first location is matched by a request URI of /exact, but in order to handle the request, the index directive inherited by the block initiates an internal redirect to the second block:
index index.html;
location = /exact {
. . .
}
location / {
. . .
}
If, in the case above, you really need the execution to stay in the first block, you will have to come up with a different method of satisfying the request for the directory. For instance, you could set an invalid index for that block and turn on autoindex:
location = /exact {
index nothing_will_match;
autoindex on;
}
location / {
. . .
}
This is one way of preventing an index from switching contexts, but it is probably not useful for most configurations. Mostly, an exact match on directories can be helpful for things like rewriting the request (which also results in a new location search).
Another instance where the processing location may be re-evaluated is with the try_files directive. This directive tells Nginx to check for the existence of a named set of files or directories. The last parameter can be a URI to which Nginx will make an internal redirect.
Consider the following configuration:
root /var/www/main;
location / {
try_files $uri $uri.html $uri/ /fallback/index.html;
}
location /fallback {
root /var/www/another;
}
In the above example, if a request is made for /blahblah, the first location will initially get the request. It will try to find a file called blahblah in the /var/www/main directory. If it cannot find one, it will follow up by searching for a file called blahblah.html. It will then try to see if there is a directory called blahblah/ within the /var/www/main directory. Failing all of these attempts, it will redirect to /fallback/index.html. This triggers another location search that will be caught by the second location block, which will serve the file /var/www/another/fallback/index.html.
Another directive that can lead to a location block being passed over is the rewrite directive. When using the last parameter with the rewrite directive, or when using no parameter at all, Nginx will search for a new matching location based on the results of the rewrite.
For example, if we modify the last example to include a rewrite, we can see that the request is sometimes passed directly to the second location without relying on the try_files directive:
root /var/www/main;
location / {
rewrite ^/rewriteme/(.*)$ /$1 last;
try_files $uri $uri.html $uri/ /fallback/index.html;
}
location /fallback {
root /var/www/another;
}
In the above example, a request for /rewriteme/hello will initially be handled by the first location block. It will be rewritten to /hello, and a location will be searched for. In this case, it will match the first location again and be processed by try_files as usual, possibly kicking back to /fallback/index.html if nothing is found (using the try_files internal redirect discussed above).
However, if a request is made for /rewriteme/fallback/hello, the first block will match again. The rewrite is applied once more, this time resulting in /fallback/hello. The request will then be served out of the second location block.
A related situation happens with the return directive when it sends the 301 or 302 status codes. The difference in this case is that it results in an entirely new request in the form of an externally visible redirect. The same situation can occur with the rewrite directive when the redirect or permanent flags are used. However, these location searches shouldn't be unexpected, since externally visible redirects always result in a new request.
The error_page directive can lead to an internal redirect similar to the one created by try_files. This directive is used to define what should happen when certain status codes are encountered. It will likely never be executed if try_files is set, since that directive handles the entire life cycle of a request.
Consider this example:
root /var/www/main;
location / {
error_page 404 /another/whoops.html;
}
location /another {
root /var/www;
}
Every request (other than those starting with /another) will be handled by the first block, which will serve files out of /var/www/main. However, if a file is not found (a 404 status), an internal redirect to /another/whoops.html occurs, leading to a new location search that eventually lands on the second block. The file that gets served is /var/www/another/whoops.html.
As you can see, understanding the circumstances under which Nginx triggers a new location search can help you predict the behavior you will see when making requests.
Knowing the way that Nginx processes client requests can make your job as an administrator much easier. You will be able to know which server block Nginx will select based on each client request, and how the location block will be selected based on the request URI. Overall, knowing how Nginx selects different blocks gives you the ability to trace the contexts that Nginx will apply in order to serve each request.
]]>O Nginx é um dos servidores Web mais populares do mundo. Ele é capaz de lidar com grandes cargas com muitas conexões de cliente simultâneas, e pode funcionar facilmente como um servidor Web, um servidor de e-mail ou um servidor de proxy reverso.
Neste guia, vamos discutir alguns dos detalhes dos bastidores que determinam como o Nginx processa as solicitações de clientes. A compreensão dessas ideias pode ajudar a eliminar a necessidade de suposições ao projetar servidores e blocos de localização e pode tornar o processamento de solicitações menos imprevisível.
O Nginx divide logicamente as configurações destinadas a atender diferentes conteúdo em blocos, que operam em uma estrutura hierárquica. Cada vez que uma solicitação de cliente é feita, o Nginx começa um processo para determinar quais blocos de configuração devem ser usados para lidar com a solicitação. Este processo de decisão é o que discutiremos neste guia.
Os principais blocos que discutiremos são o bloco de servidor e o bloco de localização.
Um bloco de servidor é um subconjunto da configuração do Nginx que define um servidor virtual usado para manusear solicitações de um determinado tipo. Os administradores geralmente configuram vários blocos de servidor e decidem qual bloco deve lidar com qual conexão com base no nome de domínio, porta e endereço IP solicitados.
Um bloco de localização fica dentro de um bloco de servidor e é usado para definir como o Nginx deve manusear solicitações para diferentes recursos e URIs para o servidor pai. O espaço URI pode ser subdividido da maneira que o administrador preferir usando esses blocos. Ele é um modelo extremamente flexível.
Como o Nginx permite que o administrador defina vários blocos de servidor que funcionam como instâncias de servidor Web separadas, ele precisa de um procedimento para determinar qual desses blocos de servidor será usado para satisfazer uma solicitação.
Ele faz isso através de um sistema definido de verificações que são usadas para encontrar a melhor combinação possível. As principais diretivas do bloco de servidor com as quais o Nginx se preocupa durante esse processo são a diretiva listen
(escuta) e a diretiva server_name
(nome do servidor).
Primeiramente, o Nginx analisa o endereço IP e a porta da solicitação. Ele compara esses valores com a diretiva listen
de cada servidor para construir uma lista dos blocos de servidor que podem resolver a solicitação.
A diretiva listen
tipicamente define quais endereços IP e portas que serão respondidos pelo bloco de servidor. Por padrão, qualquer bloco de servidor que não inclua uma diretiva listen
recebe os parâmetros de escuta de 0.0.0.0:80
(ou 0.0.0.0:8080
se o Nginx estiver sendo executado por um usuário normal, não root). Isso permite que esses blocos respondam a solicitações em qualquer interface na porta 80, mas esse valor padrão não possui muito peso dentro do processo de seleção de servidor.
A diretiva listen
pode ser definida como:
Geralmente, a última opção só terá implicações ao passar solicitações entre servidores diferentes.
Ao tentar determinar para qual bloco de servidor enviar uma solicitação, o Nginx primeiro tentará decidir com base na especificidade da diretiva listen
usando as seguintes regras:
listen
“incompletas” substituindo valores que estão faltando pelos seus valores padrão para que cada bloco possa ser avaliado por seu endereço IP e porta. Alguns exemplos dessas traduções são:
listen
usa o valor 0.0.0.0:80
.111.111.111.111
sem porta se torna 111.111.111.111:80
8888
sem endereço IP torna-se 0.0.0.0:8888
0.0.0 0
como endereço IP (para corresponder a qualquer interface), não será selecionado se houver blocos correspondentes que listam um endereço IP específico. Em todo caso, a porta deve ser exatamente a mesma.server_name
de cada bloco de servidor.É importante entender que o Nginx avaliará apenas a diretiva server_name
quando precisar distinguir entre blocos de servidor que correspondem ao mesmo nível de especificidade na diretiva listen
. Por exemplo, se o example.com
for hospedado na porta 80
de 192.168.1.10
, uma solicitação para example.com
será sempre atendida pelo primeiro bloco neste exemplo, apesar da diretiva server_name
no segundo bloco.
server {
listen 192.168.1.10;
. . .
}
server {
listen 80;
server_name example.com;
. . .
}
In cases where more than one server block matches with the same level of specificity, the next step is to check the server_name directive.
Next, to further evaluate requests that have equally specific listen directives, Nginx checks the request's "Host" header. This value holds the domain or IP address that the client was actually trying to reach.
Nginx attempts to find the best match for this value by looking at the server_name directive within each of the server blocks that are still selection candidates. Nginx evaluates them using the following formula:
- Nginx first tries to find a server block with a server_name that matches the value in the request's "Host" header exactly. If one is found, the associated block is used to serve the request. If more than one exact match is found, the first one is used.
- If no exact match is found, Nginx then looks for a server_name that matches using a leading wildcard (indicated by a * at the beginning of the name in the configuration). If one is found, that block is used to serve the request. If several matches are found, the longest match is used to serve the request.
- If no match is found using a leading wildcard, Nginx then looks for a server_name that matches using a trailing wildcard (indicated by a server name ending with a * in the configuration). If one is found, that block is used to serve the request. If several matches are found, the longest match is used to serve the request.
- If no wildcard matches are found, Nginx then evaluates server blocks that define the server_name using regular expressions (indicated by a ~ before the name). The first server_name with a regular expression that matches the "Host" header is used to serve the request.

Each IP address/port combination has a default server block that will be used when a course of action cannot be determined with the methods above. For a given IP address/port combination, this will either be the first block in the configuration or the block that contains the default_server option as part of the listen directive (which would override the first-found algorithm). There can be only one default_server declaration per IP address/port combination.
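As a brief sketch (not part of the article's own examples), marking a block as the default server for an address/port combination looks like the following; the catch-all server_name _ is a common convention rather than a requirement:
server {
    # Chosen when no other block matches the "Host" header
    # for requests arriving on this address and port.
    listen 80 default_server;
    server_name _;
    . . .
}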
If there is a server_name defined that exactly matches the "Host" header value, that server block is selected to process the request.
In this example, if the "Host" header of the request was set to "host1.example.com", the second server would be selected:
server {
listen 80;
server_name *.example.com;
. . .
}
server {
listen 80;
server_name host1.example.com;
. . .
}
If no exact match is found, Nginx then checks whether there is a server_name with a leading wildcard that fits. The longest match beginning with a wildcard will be selected to fulfill the request.
In this example, if the request had a "Host" header of "www.example.org", the second server block would be selected:
server {
listen 80;
server_name www.example.*;
. . .
}
server {
listen 80;
server_name *.example.org;
. . .
}
server {
listen 80;
server_name *.org;
. . .
}
If no match is found with a leading wildcard, Nginx then checks whether a match exists using a wildcard at the end of the expression. At this point, the longest match ending with a wildcard is selected to serve the request.
For example, if the request has a "Host" header set to "www.example.com", the third server block will be selected:
server {
listen 80;
server_name host1.example.com;
. . .
}
server {
listen 80;
server_name example.com;
. . .
}
server {
listen 80;
server_name www.example.*;
. . .
}
If no wildcard matches can be found, Nginx then moves on to matching against server_name directives that use regular expressions. The first matching regular expression will be selected to respond to the request.
For example, if the "Host" header of the request is set to "www.example.com", the second server block will be selected to satisfy the request:
server {
listen 80;
server_name example.com;
. . .
}
server {
listen 80;
server_name ~^(www|host1).*\.example\.com$;
. . .
}
server {
listen 80;
server_name ~^(subdomain|set|www|host1).*\.example\.com$;
. . .
}
If none of the above steps can satisfy the request, the request is passed to the default server for the matching IP address and port.
Similar to the process Nginx uses to select the server block that will process a request, Nginx also has an established algorithm for deciding which location block within that server to use to handle requests.
Before we cover how Nginx decides which location block to use, let's go over some of the syntax you might see in location block definitions. Location blocks live within server blocks (or other location blocks) and are used to decide how to process the request URI (the part of the request that comes after the domain name or IP address/port).
Location blocks generally take the following form:
location optional_modifier location_match {
. . .
}
The location_match in the example above defines what Nginx should check the request URI against. The existence or absence of the modifier in the example above affects the way Nginx attempts to match the location block. The modifiers below will cause the associated location block to be interpreted as follows:

- =: If an equal sign is used, this block will be considered a match if the request URI exactly matches the location given.
- ~: If a tilde modifier is present, this location will be interpreted as a case-sensitive regular expression match.
- ~*: If a tilde and asterisk modifier is used, the location block will be interpreted as a case-insensitive regular expression match.
- ^~: If a carat and tilde modifier is present, and if this block is selected as the best non-regular-expression match, regular expression matching will not take place.

As an example of prefix matching, the following location block may be selected to respond to request URIs that look like /site, /site/page1/index.html, or /site/index.html:
location /site {
. . .
}
As a demonstration of exact request URI matching, this block will always be used to respond to a request URI that looks like /page1. It will not be used to respond to a /page1/index.html request URI. Keep in mind that if this block is selected and the request is fulfilled using an index page, an internal redirect will take place to another location, which will be the actual handler of the request:
location = /page1 {
. . .
}
As an example of a location that should be interpreted as a case-sensitive regular expression, this block could be used to handle requests for /tortoise.jpg, but not for /FLOWER.PNG:
location ~ \.(jpe?g|png|gif|ico)$ {
. . .
}
A block that would allow case-insensitive matching, similar to the example above, is shown below. Here, both /tortoise.jpg and /FLOWER.PNG could be handled by this block:
location ~* \.(jpe?g|png|gif|ico)$ {
. . .
}
Finally, this block would prevent regular expression matching from taking place if it were selected as the best non-regular-expression match. It could handle requests for /costumes/ninja.html:
location ^~ /costumes {
. . .
}
As you can see, the modifiers indicate how the location block should be interpreted. However, this does not tell us which algorithm Nginx uses to decide which location block to send the request to. We will go over that next.
Nginx chooses the location that will be used to serve a request in a similar fashion to how it selects a server block. It runs through a process that determines the best location block for any given request. Understanding this process is a crucial requirement for being able to configure Nginx reliably and accurately.
Keeping in mind the types of location declarations we described above, Nginx evaluates the possible location contexts by comparing the request URI against each of the locations. It does this using the following algorithm:

- Nginx begins by checking the prefix-based location matches (those that do not involve a regular expression). If a location with the = modifier exactly matches the request URI, this location block is immediately selected to serve the request.
- If no exact (= modifier) match is found, Nginx then moves on to evaluating non-exact prefixes. It discovers the longest matching prefix location for the given request URI, which it then evaluates as follows:
  - If the longest matching prefix location has the ^~ modifier, Nginx will immediately end its search and select this location to serve the request.
  - If the longest matching prefix location does not use the ^~ modifier, the match is stored by Nginx for the moment so that the focus of the search can shift.

It is important to understand that, by default, Nginx will serve regular expression matches in preference to prefix matches. However, it evaluates prefix locations first, allowing the administrator to override this tendency by specifying locations using the = and ^~ modifiers.
It is also important to note that, while prefix locations are generally selected based on the longest, most specific match, regular expression evaluation stops as soon as the first matching location is found. This means that positioning within the configuration has vast implications for regular expression locations.
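To make the ordering concrete, here is a small sketch (the URIs and paths are placeholders, not drawn from the article's examples) showing how several location declarations in one server block interact:
location = /images/logo.png {
    # Exact match: always wins for the URI /images/logo.png.
    . . .
}

location ^~ /downloads/ {
    # Longest matching prefix with ^~: selected for /downloads/file.zip
    # without ever consulting the regular expression locations below.
    . . .
}

location ~* \.(png|jpe?g)$ {
    # Case-insensitive regex: selected for /photos/cat.PNG even though the
    # plain prefix below also matches, because a regex beats a stored prefix.
    . . .
}

location / {
    # Shortest prefix: only used when nothing above matches.
    . . .
}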
Finally, it is important to understand that regular expression matches within the longest prefix match will "jump the line" when Nginx evaluates regex locations. These will be evaluated, in order, before any of the other regular expression matches are considered. Maxim Dounin, an incredibly helpful Nginx developer, explains this part of the selection algorithm in this post.
Generally speaking, once a location block is selected to serve a request, the request is handled entirely within that context from then on. Only the selected location and the inherited directives determine how the request is processed, without interference from sibling location blocks.
While this is a general rule that will allow you to design your location blocks in a predictable way, it is important to realize that there are times when a new location search is triggered by certain directives within the selected location. These exceptions to the "only one location block" rule can have implications for how the request is actually served and may not align with the expectations you had when designing your location blocks.
Some directives that can lead to this type of internal redirect are the index, try_files, rewrite, and error_page directives.
Let's go over these briefly.
The index directive always leads to an internal redirect if it is used to handle the request. Exact location matches are often used to speed up the selection process by immediately ending the execution of the algorithm. However, if you make an exact location match that is a directory, there is a good chance the request will be redirected to a different location for the actual processing.
In this example, the first location is matched by a request URI of /exact, but in order to handle the request, the index directive inherited by the block initiates an internal redirect to the second block:
index index.html;
location = /exact {
. . .
}
location / {
. . .
}
In the case above, if you really needed the execution to stay in the first block, you would have to choose a different method of satisfying the request to the directory. For instance, you could set an invalid index for that block and turn on autoindex:
location = /exact {
index nothing_will_match;
autoindex on;
}
location / {
. . .
}
This is one way of preventing an index from switching contexts, but it is probably not useful for most configurations. Mostly, an exact match on directories can be helpful for things like rewriting the request (which also results in a new location search).
Another instance where the processing location can be reevaluated is with the try_files directive. This directive tells Nginx to check for the existence of a named set of files or directories. The last parameter can be a URI to which Nginx will make an internal redirect.
root /var/www/main;
location / {
try_files $uri $uri.html $uri/ /fallback/index.html;
}
location /fallback {
root /var/www/another;
}
In the example above, if a request is made for /blahblah, the first location will initially receive the request. It will try to find a file called blahblah in the /var/www/main directory. If it cannot find one, it will follow up by searching for a file called blahblah.html. It will then check whether there is a directory called blahblah/ within the /var/www/main directory. If all of these attempts fail, it will redirect to /fallback/index.html. This triggers another location search, which is caught by the second location block. That block serves the file /var/www/another/fallback/index.html.
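Condensed into a checklist (for the same /blahblah request used above; the order reflects the try_files parameters), the lookup proceeds roughly as follows:
# 1. /var/www/main/blahblah          ($uri as a file)
# 2. /var/www/main/blahblah.html     ($uri.html)
# 3. /var/www/main/blahblah/         ($uri/ as a directory)
# 4. internal redirect to /fallback/index.html, caught by the second
#    location block and served from /var/www/another/fallback/index.html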
Another directive that can lead to a location block switch is the rewrite directive. When using the last parameter with the rewrite directive, or when using no parameter at all, Nginx will search for a new matching location based on the result of the rewrite.
For example, if we modify the last example to include a rewrite, we can see that the request is sometimes passed directly to the second location without relying on the try_files directive:
root /var/www/main;
location / {
rewrite ^/rewriteme/(.*)$ /$1 last;
try_files $uri $uri.html $uri/ /fallback/index.html;
}
location /fallback {
root /var/www/another;
}
In the example above, a request for /rewriteme/hello will initially be handled by the first location block. It will be rewritten to /hello, and a new location will be searched for. In this case, it will match the first location again and be processed by try_files as usual, possibly kicking back to /fallback/index.html if nothing is found (using the try_files internal redirect we discussed above).
However, if a request is made for /rewriteme/fallback/hello, the first block will match again. The rewrite is applied once more, this time resulting in /fallback/hello. The request will then be served out of the second location block.
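As an aside, the behavior above depends on the last flag. If the rewrite used the break flag instead, Nginx would not start a new location search after the rewrite; the sketch below only illustrates that difference and is not part of the original example:
root /var/www/main;

location / {
    # With "break", the rewritten URI continues to be processed inside this
    # same location block; no new location search is triggered by the rewrite.
    rewrite ^/rewriteme/(.*)$ /$1 break;
    try_files $uri $uri.html $uri/ /fallback/index.html;
}

location /fallback {
    root /var/www/another;
}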
A related situation happens with the return directive when sending the 301 or 302 status codes. The difference in this case is that it results in an entirely new request in the form of an externally visible redirect. The same situation can occur with the rewrite directive when using the redirect or permanent flags. However, these location searches shouldn't be unexpected, since externally visible redirects always result in a new request.
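For reference, an externally visible redirect of this kind might look like the following sketch (the hostname is a placeholder); the client receives the 301 and then issues a brand-new request, which goes through server and location selection again:
server {
    listen 80;
    server_name example.com;

    # The client is told to retry at the new URL; nothing else is served here.
    return 301 https://example.com$request_uri;
}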
The error_page directive can lead to an internal redirect similar to the one created by try_files. This directive is used to define what should happen when certain status codes are encountered. It will likely never be executed if try_files is set, since that directive handles the entire life cycle of a request.
Consider this example:
root /var/www/main;
location / {
error_page 404 /another/whoops.html;
}
location /another {
root /var/www;
}
Every request (other than those beginning with /another) will be handled by the first block, which serves files out of /var/www/main. However, if a file is not found (a 404 status), an internal redirect to /another/whoops.html occurs, leading to a new location search that eventually lands on the second block. This file is served out of /var/www/another/whoops.html.
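As a small variation not covered in the example above, error_page can also rewrite the status code of the redirected response by placing an = before the new code, which can be useful when the fallback page is meant to be a normal response rather than an error:
location / {
    # Serve the fallback page with a 200 status instead of propagating the 404.
    error_page 404 =200 /another/whoops.html;
}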
As you can see, understanding the circumstances in which Nginx triggers a new location search can help you predict the behavior you will see when making requests.
Understanding the ways in which Nginx processes client requests can make your job as an administrator much easier. You will be able to know which server block Nginx will select based on each client request. You will also be able to tell how the location block will be selected based on the request URI. Overall, knowing how Nginx selects different blocks will give you the ability to trace the contexts that Nginx will apply in order to serve each request.
]]>Nginx fait partie des serveurs Web les plus populaires au monde. C’est avec succès qu’il peut gérer de grandes charges avec plusieurs connexions clients concurrentes et être facilement utilisé comme un serveur web, un serveur de courrier ou un serveur proxy inverse.
Dans ce guide, nous allons traiter de quelques-uns des détails en coulisses qui déterminent la manière dont Nginx traite les requêtes clients. En ayant une compréhension de ces idées, vous n’aurez plus à faire autant de devinettes quant à la conception du serveur et des blocs de localisation et vous serez alors en mesure de rendre la manipulation des requêtes moins imprévisible.
Nginx divise de manière logique les configurations destinées à servir différents contenus dans des blocs qui se trouvent dans une structure hiérarchique. Chaque fois qu’un client fait une requête, Nginx entame un processus pour sélectionner les blocs de configuration qui seront utilisés pour gérer la requête. C’est ce processus de décision que nous allons aborder dans ce guide.
Les principaux blocs que nous allons aborder sont les suivants : le bloc server et le bloc location.
Un bloc de serveur est un sous-ensemble de la configuration de Nginx qui définit le serveur virtuel avec lequel les requêtes d’un type donné seront gérées. Les administrateurs configurent souvent plusieurs blocs de serveur et décident ensuite du bloc qui sera chargé de la connexion en fonction du nom du domaine, du port et de l’adresse IP demandés.
Un bloc de localisation se trouve dans un bloc de serveur. Il permet de définir la façon dont Nginx doit gérer les requêtes pour différentes ressources et URI du serveur parent. L’administrateur peut subdiviser l’espace URI de toutes les manières qu’il le souhaite en utilisant ces blocs. Il s’agit d’un modèle extrêmement flexible.
Étant donné que Nginx permet à l’administrateur de définir plusieurs blocs de serveur qui fonctionnent comme des instances de serveur virtuel distinctes, il lui faut une procédure qui lui permette de déterminer lesquels de ces blocs utiliser pour satisfaire une requête.
Pour cela, il passe par un système de vérifications utilisées pour trouver la meilleure correspondance possible. Les directives de bloc de serveur principal que Nginx prend en considération pendant ce processus sont les directives listen
et server_name
.
Tout d’abord, Nginx examine l’adresse IP et le port de la requête. Il les fait ensuite correspondre avec la directive listen
de chaque serveur pour créer une liste des blocs du serveur qui peuvent éventuellement résoudre la requête.
La directive listen
définit généralement l’adresse IP et le port auxquels le bloc de serveur répondra. Par défaut, tout bloc de serveur qui n’inclut pas une directive listen
se voit attribuer les paramètres d’écoute de 0.0.0.0:80
(ou 0.0.0.0:8080
si Nginx est exécuté par un non-root user normal). Ces blocs peuvent ainsi répondre aux requêtes sur toute interface sur le port 80. Mais cette valeur par défaut n’a pas beaucoup de poids dans le processus de sélection de serveur.
La directive listen
peut être configurée sur :
La dernière option aura généralement des implications lorsque les requêtes passeront entre les différents serveurs.
Lorsque Nginx tentera de déterminer le bloc de serveur qui enverra une requête, il tentera d’abord de décider en fonction de la spécificité de la directive listen
en utilisant les règles suivantes :
listen
« incomplètes » en substituant les valeurs manquantes par leurs valeurs par défaut afin que chaque bloc puisse être évalué par son adresse IP et son port. Voici quelques exemples de ces traductions :
listen
utilise la valeur 0.0.0.0:80
.111.111.111.111
sans aucun port devient 111.111.111.111:80
8888
sans adresse IP devient 0.0.0.0:8888
0.0.0.0
(pour correspondre à toute interface) ne sera pas sélectionné s’il existe des blocs correspondants qui répertorient une adresse IP spécifique. Dans tous les cas, la correspondance avec le port doit être exacte.server_name
de chaque bloc de serveur.Il est important de comprendre que Nginx évaluera la directive server_name
uniquement au moment où il devra faire la distinction entre les blocs de serveur qui correspondent au même niveau de spécificité dans la directive listen
. Par exemple, si example.com
est hébergé sur le port 80
de 192.168.1.10
, le premier bloc servira toujours la requête example.com
de cet exemple, malgré la présence de la directive server_name
dans le second bloc.
server {
listen 192.168.1.10;
. . .
}
server {
listen 80;
server_name example.com;
. . .
}
Dans le cas où plusieurs blocs de serveur correspondent à une spécificité égale, l’étape suivante consistera à vérifier la directive server_name
.
Ensuite, pour évaluer les requêtes qui disposent de directives listen
également spécifiques, Nginx vérifie l’en-tête « Host » de la requête. Cette valeur contient le domaine ou l’adresse IP que le client a réellement tenté d’atteindre.
Nginx tente de trouver la meilleure correspondance pour la valeur qu’il trouve en examinant la directive server_name
dans chacun des blocs de serveur toujours sélectionnés. Nginx les évalue en utilisant la formule suivante :
server_name
qui correspond à la valeur dans l’en-tête « Host » de la requête exactly. S’il la trouve, la requête sera servie par le bloc associé. S’il trouve plusieurs correspondances exactes, la première est utilisée.server_name
correspondant à l’aide d’un métacaractère principal (indiqué par un *
au début du nom dans la config). S’il en trouve un, la requête sera servie par ce bloc. S’il trouve plusieurs correspondances, la requête sera servie par la correspondance longest.server_name
correspondant à l’aide d’un métacaractère secondaire (indiqué par un nom de serveur qui se termine par un *
dans la config). S’il en trouve un, la requête sera servie par ce bloc. S’il trouve plusieurs correspondances, la requête sera servie par la correspondance longest.server_name
en utilisant des expressions régulières (indiquées par un ~
avant le nom). Le first server_name
avec une expression régulière qui correspond à l’en-tête « Host » sera utilisé pour servir la requête.Chaque combo adresse IP/port est doté d’un bloc de serveur par défaut qui sera utilisé lorsqu’aucun cours d’action ne pourra pas être déterminé avec les méthodes ci-dessus. Pour un combo adresse IP/port, il s’agira soit du premier bloc de la configuration ou du bloc qui contient l’option default_server
dans la directive listen
(qui remplacera le premier algorithme trouvé). Il ne peut y avoir qu’une seule déclaration default_server
pour chaque combinaison adresse IP/port.
S’il existe un server_name
défini qui correspond exactement à la valeur de l’en-tête « Host », ce bloc de serveur sera sélectionné pour traiter la requête.
Dans cet exemple, si l’en-tête « Host » de la requête est configuré sur « host1.example.com », c’est le second serveur qui sera sélectionné :
server {
listen 80;
server_name *.example.com;
. . .
}
server {
listen 80;
server_name host1.example.com;
. . .
}
S’il ne trouve aucune correspondance exacte, Nginx vérifie alors s’il existe un server_name
qui commence avec un métacaractère qui convient. Ce sera la correspondance la plus longue et qui commence par un métacaractère qui sera sélectionnée pour répondre à la requête.
Dans cet exemple, si l’en-tête de la requête indique « Host » pour « www.example.org », c’est le second bloc de serveur qui sera sélectionné :
server {
listen 80;
server_name www.example.*;
. . .
}
server {
listen 80;
server_name *.example.org;
. . .
}
server {
listen 80;
server_name *.org;
. . .
}
S’il ne trouve aucune correspondance qui commence par un métacaractère, Nginx verra alors s’il existe une correspondance en utilisant un métacaractère à la fin de l’expression. À ce stade, la requête sera servie par la correspondance la plus longue et qui contient un métacaractère.
Par exemple, si l’en-tête de la requête est configuré sur « www.example.com », ce sera le troisième bloc de serveur qui sera sélectionné :
server {
listen 80;
server_name host1.example.com;
. . .
}
server {
listen 80;
server_name example.com;
. . .
}
server {
listen 80;
server_name www.example.*;
. . .
}
S’il ne trouve aucune correspondance avec un métacaractère, Nginx tentera alors de faire correspondre les directives server_name
qui utilisent des expressions régulières. C’est la first expression régulière correspondante qui sera sélectionnée pour répondre à la requête.
Par exemple, si l’en-tête « Host » de la requête est configuré sur « www.example.com », c’est alors le second serveur qui sera sélectionné pour satisfaire la requête :
server {
listen 80;
server_name example.com;
. . .
}
server {
listen 80;
server_name ~^(www|host1).*\.example\.com$;
. . .
}
server {
listen 80;
server_name ~^(subdomain|set|www|host1).*\.example\.com$;
. . .
}
Si aucune des étapes ci-dessus ne permet de satisfaire la requête, la requête sera alors transmise au serveur par default de l’adresse IP et le port correspondants.
Tout comme le processus que Nginx utilise pour sélectionner le bloc de serveur qui traitera une requête, Nginx dispose également d’un algorithme établi pour décider du bloc de localisation du serveur qui servira au traitement des requêtes.
Avant de voir de quelle manière Nginx procède pour décider du bloc de localisation qui sera utilisé pour traiter les requêtes, étudions un peu la syntaxe que vous êtes susceptible de rencontrer dans les définitions de bloc de localisation. Les blocs de localisation se trouvent dans les blocs serveur (ou autresblocs de localisation) et servent à décider de quelle manière traité l’URl de la requête (la partie de la demande qui se trouve après le nom du domaine ou l’adresse IP/port).
Les blocs de localisation ont généralement cette forme :
location optional_modifier location_match {
. . .
}
Le location_match
ci-dessus définit avec quel élément Nginx devrait comparer l’URl de la requête. Dans l’exemple ci-dessus, l’existence ou l’absence du modificateur affecte la façon dont Nginx tente de faire correspondre le bloc de localisation. Les modificateurs donnés ci-dessous entraîneront l’interprétation suivante du bloc de localisation associé :
=
: si le signe égal est utilisé, ce bloc sera considéré comme une correspondance si l’URI de la requête correspond exactement à la localisation indiquée.~
: la présence d’un modificateur tilde indique que cet emplacement sera interprété comme une correspondance d’expression régulière sensible à la casse.~*
: si un modificateur tilde et astérisque est utilisé, le bloc de localisation sera interprété comme une correspondance d’expression régulière insensible à la casse.^~
: si un modificateur de carat et tilde est présent et que ce bloc est sélectionné comme la meilleure correspondance d’expression non régulière, la mise en correspondance des expressions régulières n’aura pas lieu.Pour vous donner un exemple de correspondance de préfixe, vous pouvez sélectionner le bloc de localisation suivant pour répondre aux URl de requête qui ressemblent à /site
, /site/page1/index.html
ou /site/index.html
:
location /site {
. . .
}
Pour démontrer une correspondance exacte de l’URI, ce bloc sera toujours utilisé pour répondre à une URI de la requête qui ressemble à /page1
. Il ne sera pas utilisé pour répondre à une URI de requête /page1/index.html
. N’oubliez pas que, si ce bloc est sélectionné et que la requête est renseignée à l’aide d’une page d’index, le système procédera à une redirection interne vers un autre emplacement qui correspondra au gestionnaire réel de la requête :
location = /page1 {
. . .
}
Pour vous donner un exemple du type de localisation qui doit être interprétée comme une expression régulière sensible à la casse, vous pouvez utiliser ce bloc pour gérer les requêtes de /tortoise.jpg
, mais pas pour /FLOWER.PNG
:
location ~ \.(jpe?g|png|gif|ico)$ {
. . .
}
Voici un bloc qui permettrait de faire une mise en correspondance insensible à la casse similaire à ce qui précède : Ici, ce bloc peut gérer /tortoise.jpg
et /FLOWER.PNG
à la fois :
location ~* \.(jpe?g|png|gif|ico)$ {
. . .
}
Enfin, ce bloc empêchera l’affichage de la correspondance avec l’expression régulière, s’il est configuré de manière à être mis en correspondance avec la meilleure expression non régulière. Il peut gérer les requêtes de /costumes/ninja.html
:
location ^~ /costumes {
. . .
}
Comme vous le voyez, les modificateurs indiquent de quelle manière le bloc de localisation doit être interprété. Cependant, cela ne nous indique pas l’algorithme que Nginx utilise pour décider du bloc de localisation qui enverra la requête. C’est ce que nous allons voir maintenant.
Nginx choisit la localisation qui sera utilisée pour servir une requête de manière similaire à la façon dont il sélectionne un bloc de serveur. Il passe par un processus qui détermine le meilleur bloc de localisation pour toute requête donnée. Il est vraiment crucial d’avoir une bonne compréhension de ce processus afin de pouvoir configurer Nginx de manière fiable et précise.
Tout en gardant à l’esprit les types de déclarations de localisation que nous avons décrites ci-dessus, Nginx évalue les contextes de localisation possibles en comparant l’URI de la requête avec chacun des emplacements. Pour cela, il utilise l’algorithme suivant :
=
correspond exactement à l’URI de la requête, ce bloc de localisation est immédiatement sélectionné pour servir la requête.=
) qui corresponde exactement, Nginx passe alors à l’évaluation des préfixes inexacts. Il trouve la localisation préfixée la plus longue qui correspond à l’URI de la requête donnée, qu’il évalue ensuite de la manière suivante :
^~
, Nginx arrête immédiatement sa recherche et sélectionnera cette localisation pour servir la requête.^~
, Nginx enregistre la correspondance pour le moment afin que le focus de la recherche puisse être déplacé.Il est important de comprendre que, par défaut, Nginx préférera servir des correspondances d’expression régulière que des correspondances de préfixe. Cependant, il évalue d’abord les localisations préfixées pour que l’administration puisse écraser cette tendance en spécifiant les localisations avec les modificateurs =
et ^~
.
Il est également important de noter que bien que les localisations préfixées sont généralement sélectionnées en fonction de la correspondance la plus spécifique et la plus longue, l’évaluation des expressions régulières s’arrête une fois que la première localisation correspondante est trouvée. Cela signifie que, dans la configuration, le positionnement a un grand impact sur les localisations d’expressions régulières.
Pour finir, il est important de comprendre que les correspondances d’expressions régulières dans la correspondance de préfixe la plus longue « sauteront la ligne » lorsque Nginx procédera à l’évaluation des localisations de regex. Il procédera à l’évaluation de ces éléments, dans l’ordre, avant même qu’une des autres correspondances d’expression régulière ne soit prise en considération. Dans cet article, Maxim Dounin, un développeur Nginx incroyablement enrichissant, nous explique cette partie de l’algorithme de sélection.
En règle générale, lorsqu’un bloc de localisation est sélectionné pour servir une requête, l’intégralité de la requête est traitée dans ce même contexte à partir de ce moment-là. Seules la localisation sélectionnée et les directives héritées déterminent de quelle manière la requête est traitée, sans que les blocs de localisation apparentés ne viennent interférer.
Bien qu’il s’agisse d’une règle générale qui vous permettra de concevoir vos blocs de localisation de manière prévisible, il est important que vous sachiez que, parfois, certaines directives dans la localisation sélectionnée déclenchent une nouvelle recherche de localisation. Les exceptions à la règle « seulement un bloc de localisation » peuvent avoir un impact sur la façon dont la requête est effectivement servie et ne seront pas en accord avec les attentes que vous aviez lors de la conception de vos blocs de localisation.
Voici quelques-unes des directives qui peuvent conduire à ce type de redirection interne :
Examinons-les brièvement.
La directive index
entraîne toujours une redirection interne si elle est utilisée pour gérer la requête. Les correspondances de localisation exactes permettent souvent d’accélérer le processus de sélection en mettant immédiatement fin à l’exécution de l’algorithme. Cependant, si vous effectuez une correspondance de localisation exacte qui est un répertoire, il est possible que la requête soit redirigée vers un autre emplacement pour le traitement en tant que tel.
Dans cet exemple, la mise en correspondance de la première localisation se fait par un URI de requête /exact
, mais pour gérer la requête, la directive index
héritée du bloc initialise une redirection interne vers le second bloc :
index index.html;
location = /exact {
. . .
}
location / {
. . .
}
Dans le cas précédent, si vous souhaitez vraiment que l’exécution reste dans le premier bloc, il vous faudra trouver un moyen différent de satisfaire la requête au répertoire. Par exemple, vous pouvez configurer un index
non valide pour ce bloc et activer autoindex
:
location = /exact {
index nothing_will_match;
autoindex on;
}
location / {
. . .
}
Il s’agit d’un moyen d’empêcher un index
de changer de contexte, mais il n’est probablement pas si pratique pour la plupart des configurations. Une correspondance exacte sur les répertoires vous permet la plupart du temps de réécrire la requête par exemple (ce qui peut également générer une nouvelle recherche de localisation).
Il existe également une autre instance dans laquelle la localisation de traitement peut être réévaluée, avec la directive try_files
Cette directive indique à Nginx de vérifier s’il existe un ensemble de fichiers ou de répertoires nommés. Le dernier paramètre peut être un URI vers lequel Nginx procédera à une redirection interne.
Prenons le cas de la configuration suivante :
root /var/www/main;
location / {
try_files $uri $uri.html $uri/ /fallback/index.html;
}
location /fallback {
root /var/www/another;
}
Dans l’exemple ci-dessus, si une requête est exécutée pour /blahblah
, la première localisation obtiendra initialement la requête. Il tentera de trouver un fichier nommé blahblah
dans le répertoire /var/www/main
. S’il ne peut en trouver, il redirigera sa recherche sur un fichier nommé blahblah.html
. Il tentera ensuite de voir s’il existe un répertoire appelé blahblah/
dans le répertoire /var/www/main
. Si toutes ces tentatives son un échec, il se redirigera vers /fallback/index.html
. Cela déclenchera une autre recherche de localisation que le deuxième bloc de localisation détectera. Cela servira le fichier /var/www/another/fallback/index.html
.
La directive rewrite
est également une des autres directives qui permettent de passer un bloc de localisation. Lorsque Nginx utilise le paramètre last
avec la directive rewrite
, ou n’utilise aucun paramètre du tout, il recherchera une nouvelle localisation correspondante en fonction des résultats de la réécriture.
Par exemple, si nous modifions le dernier exemple pour y inclure une réécriture, nous pouvons voir que la requête est parfois transmise directement à la seconde localisation sans s’appuyer sur la directive try_files
:
root /var/www/main;
location / {
rewrite ^/rewriteme/(.*)$ /$1 last;
try_files $uri $uri.html $uri/ /fallback/index.html;
}
location /fallback {
root /var/www/another;
}
Dans l’exemple ci-dessus, une requête /rewriteme/hello
sera initialement gérée par le premier bloc de localisation. Elle sera réécrite /hello
, puis la recherche de la localisation sera déclenchée. Dans ce cas, elle correspondra à nouveau à la première localisation et sera traité par le try_files
comme d’habitude, peut même redirigé sur /fallback/index.html
si la recherche est infructueuse (en utilisant la redirection interne try_files
discutée plus haut).
Cependant, si une requête /rewriteme/fallback/hello
est effectuée, le premier bloc sera à nouveau une correspondance. La réécriture est à nouveau appliquée, ce qui génère cette fois-ci /fallback/hello
. La requête sera ensuite servie par le second bloc de localisation.
La situation est connexe avec la directive return
lorsque vous envoyez les codes de statut 301
ou 302
. La différence ici est que la requête obtenue et une requête entièrement nouvelle qui prend la forme d’une redirection visible à l’extérieur. Cette même situation peut se produire avec la directive rewrite
si vous utilisez les balises redirect
ou permanent
. Cependant, ces recherches de localisation ne devraient pas être fortuites, car les redirections visibles en externe entraînent toujours une nouvelle requête.
La directive error_page
peut générer une redirection interne similaire à celle créée par try_files
. Cette directive permet de définir ce qui devrait se passer si le système rencontre certains codes de statut. Cela ne sera probablement jamais exécuté si try_files
est configuré. En effet, cette directive gère l’intégralité du cycle de vie d’une requête.
Prenons l’exemple suivant :
root /var/www/main;
location / {
error_page 404 /another/whoops.html;
}
location /another {
root /var/www;
}
Chaque requête (autres que celles qui commencent par /another
) sera traitée par le premier bloc, qui servira les fichiers /var/www/main
. Cependant, si un fichier est introuvable (un statut 404), vous verrez se produire une redirection interne vers /another/whoops.html
, générant une nouvelle recherche de localisation qui attérira éventuellement sur le second bloc. Ce fichier sera servi à partir de /var/www/another/whoops.html
.
Comme vous pouvez le voir, en comprenant les circonstances dans lesquelles Nginx déclenche une recherche de nouvelle localisation, vous serez plus à même de prévoir le comportement que vous verrez lorsque vous ferez des requêtes.
Le fait de comprendre de quelle manière Nginx traite les requêtes clients peut vraiment vous faciliter la tâche en tant qu’administrateur. Vous pourrez savoir quel bloc de serveur Nginx sélectionnera en fonction de chaque requête clients. Vous pourrez également deviner de quelle manière le bloc de localisation sera sélectionné en fonction de l’URI de la requête. Dans l’ensemble, en sachant de quelle manière Nginx sélectionne les différents blocs, vous serez en capacité de reconnaître les contextes que Nginx appliquera pour servir chaque requête.
]]>Nginx es uno de los servidores web más populares del mundo. Puede manejar correctamente altas cargas con muchas conexiones de clientes concurrentes y puede funcionar fácilmente como servidor web, servidor de correo o servidor de proxy inverso.
En esta guía, explicaremos algunos de los detalles en segundo plano que determinan cómo Nginx procesa las solicitudes de los clientes. Entender estas ideas puede ayudar a despejar las incógnitas sobre el diseño de bloques de servidores y ubicación, y puede hacer que el manejo de las solicitudes parezca menos impredecible.
Nginx divide de forma lógica las configuraciones destinadas a entregar distintos contenidos en bloques, que conviven en una estructura jerárquica. Cada vez que se realiza una solicitud de cliente, Nginx inicia un proceso para determinar qué bloques de configuración deben usarse para gestionar la solicitud. Este proceso de decisión es lo que explicaremos en esta guía.
Los bloques principales que explicaremos son el bloque de servidor y el bloque de ubicación.
Un bloque de servidor es un subconjunto de la configuración de Nginx que define un servidor virtual utilizado para gestionar las solicitudes de un tipo definido. Los administradores suelen configurar varios bloques de servidores y decidir qué bloque debe gestionar cada conexión según el nombre de dominio, el puerto y la dirección IP solicitados.
Un bloque de ubicación reside dentro de un bloque de servidor y se utiliza para definir la manera en que Nginx debe gestionar las solicitudes para diferentes recursos y URI para el servidor principal. El espacio URI puede subdividirse de la manera que el administrador quiera utilizando estos bloques. Es un modelo extremadamente flexible.
Dado que Nginx permite que el administrador defina varios bloques de servidores que funcionan como instancias de servidores web virtuales independientes, necesita un procedimiento para determinar cuál de estos bloques de servidores se utilizará para satisfacer una solicitud.
Lo hace mediante un sistema definido de comprobaciones que se utilizan para encontrar la mejor coincidencia posible. Las principales directivas de bloques de servidores de las que se ocupa Nginx durante este proceso son la directiva listen
y la directiva server_name
.
Primero, Nginx examina la dirección IP y el puerto de la solicitud. Luego, lo compara con la directiva listen
de cada servidor para crear una lista de los bloques de servidores que pueden resolver la solicitud.
La directiva listen
generalmente define la dirección IP y el puerto a los que responderá el bloque de servidor. De manera predeterminada, cualquier bloque de servidor que no incluya una directiva listen
recibe los parámetros de escucha 0.0.0.0:80
(o 0.0.0.0:8080
si Nginx está siendo ejecutado por un usuario no root regular). Eso permite que estos bloques respondan a las solicitudes en cualquier interfaz en el puerto 80, pero este valor predeterminado no afecta demasiado al proceso de selección de servidores.
La directiva listen
puede establecerse para las siguientes características:
La última opción, por lo general, solo tendrá implicaciones al pasar solicitudes entre distintos servidores.
Cuando intente determinar a qué bloque de servidor enviar una solicitud, Nginx tratará primero de decidir según la especificidad de la directiva listen
usando las siguientes reglas:
listen
“incompletas” sustituyendo los valores que faltan por sus valores predeterminados para que cada bloque pueda evaluarse por su dirección IP y su puerto. Algunos ejemplos de estas traslaciones son:
listen
utiliza el valor 0.0.0.0:80
.111.111.111.111
sin puerto se convierte en 111.111.111.111:80
.8888
sin dirección IP se convierte en 0.0.0.0:8888
.0.0.0.0
como su dirección IP (para coincidir con cualquier interfaz), no se seleccionará si hay bloques coincidentes que enumeran una dirección IP específica. En todo caso, el puerto debe coincidir con exactitud.server_name
de cada bloque de servidores.Es importante entender que Nginx solo evaluará la directiva server_name
cuando necesite distinguir entre bloques de servidores que coinciden con el mismo nivel de especificidad en la directiva listen
. Por ejemplo, si example.com
está alojado en el puerto 80
de 192.168.1.10
, una solicitud para example.com
siempre será atendida por el primer bloque de este ejemplo, a pesar de la directiva server_name
en el segundo bloque.
server {
listen 192.168.1.10;
. . .
}
server {
listen 80;
server_name example.com;
. . .
}
En caso de que más de un bloque de servidor coincida con la misma especificidad, el siguiente paso es verificar la directiva server_name
.
Luego, para evaluar más a fondo las solicitudes que tienen directivas listen
igualmente específicas, Nginx verifica el encabezado “Host” de la solicitud. Ese valor contiene el dominio o la dirección IP que el cliente estaba intentando alcanzar.
Nginx intenta encontrar la mejor coincidencia para el valor que encuentra al ver la directiva server_name
dentro de cada uno de los bloques de servidores que aún son candidatos a selección. Nginx lo evalúa usando la siguiente formula:
server_name
que coincida con el valor en el encabezado “Host” de la solicitud de manera exacta. Si lo encuentra, ese bloque asociado se utilizará para atender la solicitud. Si se encuentran varias coincidencias exactas, se utiliza la primera coincidencia.server_name
que coincida usando un comodín inicial (indicado por un *
al principio del nombre en la configuración). Si lo encuentra, ese bloque se utilizará para atender la solicitud. Si se encuentran varias coincidencias, la coincidencia más larga se utilizará para atender la solicitud.server_name
que coincida usando un comodín final (indicado por un nombre de servidor que termina con un *
en la configuración). Si lo encuentra, ese bloque se utilizará para atender la solicitud. Si se encuentran varias coincidencias, la coincidencia más larga se utilizará para atender la solicitud.server_name
usando expresiones regulares (indicadas por un ~
antes del nombre). El primer server_name
con una expresión regular que coincida con el encabezado “Host” se utilizará para atender la solicitud.Cada combinación de dirección IP/puerto tiene un bloque de servidor predeterminado que se utilizará cuando no se pueda determinar un curso de acción con los métodos anteriores. En el caso de una combinación de dirección IP y puerto, será el primer bloque de la configuración o el bloque que contiene la opción default_server
como parte de la directiva listen
(que anularía el primer algoritmo encontrado). Solo puede haber una declaración default_server
por cada combinación de dirección IP y puerto.
Si se define un server_name
que coincida exactamente con el valor de encabezado “Host”, se selecciona ese bloque de servidor para procesar la solicitud.
En este ejemplo, si el encabezado “Host” de la solicitud se estableció en “host1.example.com”, se seleccionaría el segundo servidor:
server {
listen 80;
server_name *.example.com;
. . .
}
server {
listen 80;
server_name host1.example.com;
. . .
}
Si no se encuentra ninguna coincidencia exacta, Nginx comprueba entonces si hay un server_name
con un comodín inicial que coincida. Se seleccionará la coincidencia más larga que comience con un comodín para satisfacer la solicitud.
En este ejemplo, si la solicitud tuviera un encabezado “Host” de “www.example.org”, se seleccionaría el segundo bloque de servidor:
server {
listen 80;
server_name www.example.*;
. . .
}
server {
listen 80;
server_name *.example.org;
. . .
}
server {
listen 80;
server_name *.org;
. . .
}
Si no se encuentra ninguna coincidencia con un comodín inicial, Nginx verá si existe una coincidencia usando un comodín al final de la expresión. En este punto, se seleccionará la coincidencia más larga que termine con un comodín para atender la solicitud.
Por ejemplo, si la solicitud tiene un encabezado “Host” establecido en “www.example.com”, se seleccionará el tercer bloque de servidor:
server {
listen 80;
server_name host1.example.com;
. . .
}
server {
listen 80;
server_name example.com;
. . .
}
server {
listen 80;
server_name www.example.*;
. . .
}
Si no se encuentran coincidencias con los comodines, Nginx pasará a intentar coincidir con las directivas server_name
que utilicen expresiones regulares. Se seleccionará la primera expresión regular que coincida para responder a la solicitud.
Por ejemplo, si el encabezado “Host” de la solicitud está configurado en “www.example.com”, se seleccionará el segundo bloque de servidor para satisfacer la solicitud:
server {
listen 80;
server_name example.com;
. . .
}
server {
listen 80;
server_name ~^(www|host1).*\.example\.com$;
. . .
}
server {
listen 80;
server_name ~^(subdomain|set|www|host1).*\.example\.com$;
. . .
}
Si ninguno de los pasos anteriores puede satisfacer la solicitud, la solicitud se pasará al servidor predeterminado para la dirección IP y el puerto que coincidan.
De forma similar al proceso que utiliza Nginx para seleccionar el bloque del servidor que procesará una solicitud, Nginx también tiene un algoritmo establecido para decidir qué bloque de ubicación dentro del servidor se utilizará para gestionar las solicitudes.
Antes de ver cómo Nginx decide qué bloque de ubicación utilizar para gestionar las solicitudes, repasemos parte de la sintaxis que se podría encontrar en las definiciones de los bloques de ubicación. Los bloques de ubicación se encuentran dentro de los bloques de servidor (u otros bloques de ubicación) y se utilizan para decidir cómo procesar la URI de la solicitud (la parte de la solicitud que viene después del nombre de dominio o la dirección IP/puerto).
Los bloques de ubicación generalmente tienen la siguiente forma:
location optional_modifier location_match {
. . .
}
location_match
que aparece arriba define contra qué debe comprobar Nginx la URI de la solicitud. La existencia o la inexistencia del modificador en el ejemplo anterior afecta a la forma en que Nginx intenta hacer coincidir el bloque de ubicación. Los modificadores que se muestran abajo harán que el bloque de ubicación asociado se interprete de la siguiente manera:
=
: si se utiliza un signo igual, este bloque se considerará coincidente si el URI de la solicitud coincide exactamente con la ubicación indicada.~
: si hay una tilde de la eñe, esta ubicación se interpretará como una coincidencia de expresión regular que distingue entre mayúsculas y minúsculas.~*
: si se utiliza un modificador de tilde de la eñe y asterisco, el bloque de ubicación se interpretará como una coincidencia de expresión regular que no distingue entre mayúsculas y minúsculas.^~
: si hay un modificador de acento circunflejo y tilde de la eñe, y si este bloque se selecciona como la mejor coincidencia de expresión no regular, no se realizará la coincidencia de expresión regular.Como ejemplo de concordancia de prefijos, se puede seleccionar el siguiente bloque de ubicación para responder a los URI de solicitud que tienen el aspecto /site
, /site/page1/index.html
o /site/index.html
:
location /site {
. . .
}
Para una demostración de la coincidencia exacta del URI de la solicitud, este bloque siempre se utilizará para responder a un URI de solicitud que tenga el aspecto /page1
. No se utilizará para responder a un URI de solicitud /page1/index.html
. Tenga en cuenta que, si se selecciona este bloque y la solicitud se cumple usando una página de índice, se realizará un redireccionamiento interno a otra ubicación que será la que realmente gestione la solicitud:
location = /page1 {
. . .
}
Como ejemplo de una ubicación que debe interpretarse como una expresión regular que distingue entre mayúsculas y minúsculas, este bloque puede usarse para gestionar las solicitudes de /tortoise.jpg
, pero no para /FLOWER.PNG
:
location ~ \.(jpe?g|png|gif|ico)$ {
. . .
}
A continuación, se muestra un bloque que permitiría una coincidencia sin distinción ente mayúsculas y minúsculas similar al que se mostró anteriormente. Este bloque podría gestionar /tortoise.jpg
y /FLOWER.PNG
:
location ~* \.(jpe?g|png|gif|ico)$ {
. . .
}
Por último, este bloque evitaría que se produjera una coincidencia de expresión regular si se determina que es la mejor coincidencia de expresión no regular. Podría gestionar las solicitudes de /costumes/ninja.html
:
location ^~ /costumes {
. . .
}
Como ve, los modificadores indican cómo debe interpretarse el bloque de ubicación. Sin embargo, esto no nos indica el algoritmo que utiliza Nginx para decidir a qué bloque de ubicación enviar la solicitud. Lo repasaremos a continuación.
Nginx elige la ubicación que se utilizará para atender una solicitud de forma similar a cómo selecciona un bloque de servidor. Se ejecuta a través de un proceso que determina el mejor bloque de ubicación para cualquier solicitud. Entender este proceso es un requisito crucial para poder configurar Nginx de forma fiable y precisa.
Teniendo en cuenta los tipos de declaraciones de ubicación que hemos descrito anteriormente, Nginx evalúa los posibles contextos de ubicación comparando el URI de solicitud con cada una de las ubicaciones. Lo hace mediante el siguiente algoritmo:
=
y que coincida con el URI de la solicitud de manera exacta, este bloque de ubicación se selecciona inmediatamente para atender la solicitud.=
), Nginx pasa a evaluar los prefijos no exactos. Descubre la ubicación del prefijo más largo que coincide con el URI de la solicitud dada que luego evalúa de la siguiente manera:
^~
, Nginx finalizará inmediatamente su búsqueda y seleccionará esta ubicación para atender la solicitud.^~
, Nginx almacena la coincidencia de momento para poder cambiar el enfoque de la búsqueda.Es importante entender que, de manera predeterminada, Nginx atenderá las coincidencias de expresiones regulares con mayor preferencia en comparación con las coincidencias de prefijos. Sin embargo, primero evalúa las ubicaciones de prefijos, lo que permite al administrador anular esta tendencia especificando las ubicaciones usando los modificadores =
y ^~
.
También es importante tener en cuenta que, aunque las ubicaciones de prefijos generalmente se seleccionan según la coincidencia más larga y específica, la evaluación de la expresión regular se detiene cuando se encuentra la primera ubicación que coincida. Eso significa que el posicionamiento dentro de la configuración tiene amplias implicaciones para las ubicaciones de expresiones regulares.
Por último, es importante entender que las coincidencias de expresiones regulares dentro de la coincidencia del prefijo más largo “saltarán la línea” cuando Nginx evalúe las ubicaciones de regex. Estas se evaluarán en orden, antes de que se consideren las demás coincidencias de expresiones regulares. Maxim Dounin, un desarrollador de Nginx increíblemente atento, explica en esta publicación esta parte del algoritmo de selección.
En general, cuando se selecciona un bloque de ubicación para atender una solicitud, la solicitud se gestiona completamente dentro de ese contexto a partir de ese momento. Solo la ubicación seleccionada y las directivas heredadas determinan cómo se procesa la solicitud, sin interferencia de los bloques de ubicación hermanos.
Aunque esta es una regla general que le permitirá diseñar sus bloques de ubicación de una manera predecible, es importante darse cuenta de que, a veces, una nueva búsqueda de ubicación se activa por ciertas directivas dentro de la ubicación seleccionada. Las excepciones a la regla “solo un bloque de ubicación” pueden tener implicaciones sobre cómo se atiende realmente la solicitud y pueden no coincidir con las expectativas que tenía al diseñar sus bloques de ubicación.
Algunas directivas que pueden dar como resultado este tipo de redireccionamiento interno son:
Las repasaremos brevemente.
La directiva index
siempre conduce a un redireccionamiento interno si se utiliza para gestionar la solicitud. Las coincidencias de ubicación exactas se utilizan a menudo para acelerar el proceso de selección, terminando inmediatamente la ejecución del algoritmo. Sin embargo, si realiza una coincidencia de ubicación exacta que sea un directorio, hay una buena probabilidad de que la solicitud sea redirigida a una ubicación diferente para el procesamiento real.
En este ejemplo, la primera ubicación coincide con un URI de solicitud de /exact
, pero, para gestionar la solicitud, la directiva index
heredada por el bloque inicia un redireccionamiento interno al segundo bloque:
index index.html;
location = /exact {
. . .
}
location / {
. . .
}
En el caso anterior, si realmente necesita que la ejecución permanezca en el primer bloque, tendrá que encontrar un método diferente para satisfacer la solicitud al directorio. Por ejemplo, podría establecer un index
no válido para ese bloque y activar autoindex
:
location = /exact {
index nothing_will_match;
autoindex on;
}
location / {
. . .
}
Esta es una forma de evitar que un index
cambie de contexto, pero probablemente no sea útil para la mayoría de las configuraciones. Generalmente, una coincidencia exacta en directorios puede ser útil para acciones como reescribir la solicitud (lo que también genera una nueva búsqueda de ubicación).
Another instance in which the processing location may be re-evaluated is with the try_files directive. This directive tells Nginx to check for the existence of a named set of files or directories. The last parameter can be a URI to which Nginx will make an internal redirect.
Consider the following configuration:
root /var/www/main;
location / {
try_files $uri $uri.html $uri/ /fallback/index.html;
}
location /fallback {
root /var/www/another;
}
In the example above, if a request is made for /blahblah, the first location will initially receive the request. It will try to find a file called blahblah in the /var/www/main directory. If it cannot find one, it will follow up by searching for a file called blahblah.html. It will then try to find a directory called blahblah/ within the /var/www/main directory. If all of these attempts fail, it will redirect to /fallback/index.html. That will trigger another location search, which will be caught by the second location block, and the file /var/www/another/fallback/index.html will be served.
Another directive that can pass processing to a different location block is the rewrite directive. When you use the last parameter with the rewrite directive, or when you use no parameter at all, Nginx will search for a new matching location based on the results of the rewrite.
For example, if we modify the last example to include a rewrite, we can see that the request is sometimes passed directly to the second location without relying on the try_files directive:
root /var/www/main;
location / {
rewrite ^/rewriteme/(.*)$ /$1 last;
try_files $uri $uri.html $uri/ /fallback/index.html;
}
location /fallback {
root /var/www/another;
}
In the example above, a request for /rewriteme/hello will initially be handled by the first location block. It will be rewritten to /hello, and a location will be searched for. In this case, it will match the first location again and be processed by try_files as usual, possibly falling back to /fallback/index.html if nothing is found (using the try_files internal redirect mentioned earlier).
However, if a request is made for /rewriteme/fallback/hello, the first block will match again. The rewrite will be applied once more, this time resulting in /fallback/hello. The request will then be served from the second location block.
Something similar happens with the return directive when sending the 301 or 302 status codes. The difference in this case is that it results in an entirely new request in the form of an externally visible redirect. The same situation can occur with the rewrite directive when using the redirect or permanent flags. However, these location searches should not be unexpected, since externally visible redirects always result in a new request.
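As a brief sketch (the /old-path and /new-path URIs here are only illustrative), a return directive like the following sends the redirect back to the client, which then issues a brand new request that goes through location selection from the beginning:
location /old-path {
    # The client receives a 301 response and requests /new-path itself
    return 301 /new-path;
}
location /new-path {
    # The follow-up request is matched against all locations and lands here
    . . .
}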
The error_page directive can result in an internal redirect similar to the one created by try_files. This directive is used to define what should happen when certain status codes are encountered. It will likely never be executed if try_files is set, since that directive handles the entire life cycle of a request.
Consider this example:
root /var/www/main;
location / {
error_page 404 /another/whoops.html;
}
location /another {
root /var/www;
}
Every request (other than those starting with /another) will be handled by the first block, which will serve files from /var/www/main. However, if a file is not found (a 404 status), an internal redirect to /another/whoops.html will occur, leading to a new location search that will eventually land on the second block. This file will be served from /var/www/another/whoops.html.
As you can see, understanding the circumstances under which Nginx triggers a new location search can help you predict the behavior you will see when requests are made.
Understanding the ways in which Nginx processes client requests can make your work as an administrator much easier. You will be able to know which server block Nginx will select for each client request. You will also be able to tell how the location block will be selected based on the request URI. Overall, knowing how Nginx selects the different blocks will allow you to trace the contexts that Nginx will apply in order to serve each request.
DNS, or the Domain Name System, is often a very difficult part of learning how to configure websites and servers. Understanding how DNS works will help you diagnose problems with configuring access to your websites and will allow you to broaden your understanding of what’s going on behind the scenes.
In this guide, we will discuss some fundamental DNS concepts that will help you hit the ground running with your DNS configuration. After tackling this guide, you should be ready to set up your domain name with DigitalOcean or set up your very own DNS server.
Before we jump into setting up your own servers to resolve your domain or setting up our domains in the control panel, let’s go over some basic concepts about how all of this actually works.
We should start by defining our terms. While some of these topics are familiar from other contexts, there are many terms used when talking about domain names and DNS that aren’t used too often in other areas of computing.
Let’s start easy:
The domain name system, more commonly known as “DNS” is the networking system in place that allows us to resolve human-friendly names to unique IP addresses.
A domain name is the human-friendly name that we are used to associating with an internet resource. For instance, “google.com
” is a domain name. Some people will say that the “google” portion is the domain, but we can generally refer to the combined form as the domain name.
The URL “google.com
” is associated with the servers owned by Google Inc. The domain name system allows us to reach the Google servers when we type “google.com
” into our browsers.
An IP address is what we call a network addressable location. Each IP address must be unique within its network. When we are talking about websites, this network is the entire internet.
IPv4, the most common form of addresses, are written as four sets of numbers, each set having up to three digits, with each set separated by a dot. For example, “111.222.111.222
” could be a valid IPv4 IP address. With DNS, we map a name to that address so that you do not have to remember a complicated set of numbers for each place you wish to visit on a network.
A top-level domain, or TLD, is the most general part of the domain. The top-level domain is the furthest portion to the right (as separated by a dot). Common top-level domains are “com”, “net”, “org”, “gov”, “edu”, and “io”.
Top-level domains are at the top of the hierarchy in terms of domain names. Certain parties are given management control over top-level domains by ICANN (Internet Corporation for Assigned Names and Numbers). These parties can then distribute domain names under the TLD, usually through a domain registrar.
Within a domain, the domain owner can define individual hosts, which refer to separate computers or services accessible through a domain. For instance, most domain owners make their web servers accessible through the bare domain (example.com
) and also through the “host” definition “www” (www.example.com
).
You can have other host definitions under the general domain. You could have API access through an “api” host (api.example.com
) or you could have ftp access by defining a host called “ftp” or “files” (ftp.example.com
or files.example.com
). The host names can be arbitrary as long as they are unique for the domain.
A subject related to hosts is subdomains.
DNS works in a hierarchy. TLDs can have many domains under them. For instance, the “com” TLD has both “google.com
” and “ubuntu.com
” underneath it. A “subdomain” refers to any domain that is part of a larger domain. In this case, “ubuntu.com
” can be said to be a subdomain of “com”. This is typically just called the domain, or the “ubuntu” portion is called an SLD, which means second-level domain.
Likewise, each domain can control “subdomains” that are located under it. This is usually what we mean by subdomains. For instance you could have a subdomain for the history department of your school at “www.history.school.edu
”. The “history” portion is a subdomain.
The difference between a host name and a subdomain is that a host defines a computer or resource, while a subdomain extends the parent domain. It is a method of subdividing the domain itself.
Whether talking about subdomains or hosts, you can begin to see that the left-most portions of a domain are the most specific. This is how DNS works: from most to least specific as you read from left-to-right.
A fully qualified domain name, often called FQDN, is what we call an absolute domain name. Domains in the DNS system can be given relative to one another, and as such, can be somewhat ambiguous. A FQDN is an absolute name that specifies its location in relation to the absolute root of the domain name system.
This means that it specifies each parent domain including the TLD. A proper FQDN ends with a dot, indicating the root of the DNS hierarchy. An example of a FQDN is “mail.google.com.
”. Sometimes software that calls for FQDN does not require the ending dot, but the trailing dot is required to conform to ICANN standards.
A name server is a computer designated to translate domain names into IP addresses. These servers do most of the work in the DNS system. Since the total number of domain translations is too much for any one server, each server may redirect requests to other name servers or delegate responsibility for the subset of subdomains they are responsible for.
Name servers can be “authoritative”, meaning that they give answers to queries about domains under their control. Otherwise, they may point to other servers, or serve cached copies of other name servers’ data.
A zone file is a simple text file that contains the mappings between domain names and IP addresses. This is how the DNS system finally finds out which IP address should be contacted when a user requests a certain domain name.
Zone files reside in name servers and generally define the resources available under a specific domain, or the place that one can go to get that information.
Within a zone file, records are kept. In its simplest form, a record is basically a single mapping between a resource and a name. These can map a domain name to an IP address, define the name servers for the domain, define the mail servers for the domain, etc.
Now that you are familiar with some of the terminology involved with DNS, how does the system actually work?
The system is very simple at a high-level overview, but is very complex as you look at the details. Overall though, it is a very reliable infrastructure that has been essential to the adoption of the internet as we know it today.
As we said above, DNS is, at its core, a hierarchical system. At the top of this system is what are known as “root servers”. These servers are controlled by various organizations and are delegated authority by ICANN (Internet Corporation for Assigned Names and Numbers).
There are currently 13 root servers in operation. However, as there are an incredible number of names to resolve every minute, each of these servers is actually mirrored. The interesting thing about this setup is that each of the mirrors for a single root server shares the same IP address. When requests are made for a certain root server, the request will be routed to the nearest mirror of that root server.
What do these root servers do? Root servers handle requests for information about Top-level domains. So if a request comes in for something a lower-level name server cannot resolve, a query is made to the root server for the domain.
The root servers won’t actually know where the domain is hosted. They will, however, be able to direct the requester to the name servers that handle the specifically requested top-level domain.
So if a request for “www.wikipedia.org
” is made to the root server, the root server will not find the result in its records. It will check its zone files for a listing that matches “www.wikipedia.org
”. It will not find one.
It will instead find a record for the “org” TLD and give the requesting entity the address of the name server responsible for “org” addresses.
The requester then sends a new request to the IP address (given to it by the root server) that is responsible for the top-level domain of the request.
So, to continue our example, it would send a request to the name server responsible for knowing about “org” domains to see if it knows where “www.wikipedia.org
” is located.
Once again, the requester will look for “www.wikipedia.org
” in its zone files. It will not find this record in its files.
However, it will find a record listing the IP address of the name server responsible for “wikipedia.org
”. This is getting much closer to the answer we want.
At this point, the requester has the IP address of the name server that is responsible for knowing the actual IP address of the resource. It sends a new request to the name server asking, once again, if it can resolve “www.wikipedia.org
”.
The name server checks its zone files and it finds that it has a zone file associated with “wikipedia.org
”. Inside of this file, there is a record for the “www” host. This record tells the IP address where this host is located. The name server returns the final answer to the requester.
In the above scenario, we referred to a “requester”. What is the requester in this situation?
In almost all cases, the requester will be what we call a “resolving name server.” A resolving name server is one configured to ask other servers questions. It is basically an intermediary for a user which caches previous query results to improve speed and knows the addresses of the root servers to be able to “resolve” requests made for things it doesn’t already know about.
Basically, a user will usually have a few resolving name servers configured on their computer system. The resolving name servers are usually provided by an ISP or other organizations. For instance, Google provides resolving DNS servers that you can query. These can be configured on your computer either automatically or manually.
When you type a URL in the address bar of your browser, your computer first looks to see if it can find out locally where the resource is located. It checks the “hosts” file on the computer and a few other locations. It then sends the request to the resolving name server and waits to receive the IP address of the resource back.
The resolving name server then checks its cache for the answer. If it doesn’t find it, it goes through the steps outlined above.
Resolving name servers basically compress the requesting process for the end user. The clients simply have to know to ask the resolving name servers where a resource is located and be confident that they will investigate and return the final answer.
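If you want to watch this delegation chain yourself, the dig utility (assuming it is installed on your system) can perform the same iterative resolution that a resolving name server does, starting from the root servers. This is only a quick sketch, and the exact output will vary:
dig +trace www.wikipedia.org
The output walks through each step described above: the root servers, then the name servers for the “org” TLD, then the name servers for “wikipedia.org”, and finally the record for the “www” host.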
We mentioned in the above process the idea of “zone files” and “records”.
Zone files are the way that name servers store information about the domains they know about. Every domain that a name server knows about is stored in a zone file. Most requests coming to the average name server are not something that the server will have zone files for.
If it is configured to handle recursive queries, like a resolving name server, it will find out the answer and return it. Otherwise, it will tell the requesting party where to look next.
The more zone files that a name server has, the more requests it will be able to answer authoritatively.
A zone file describes a DNS “zone”, which is basically a subset of the entire DNS naming system. It generally is used to configure just a single domain. It can contain a number of records which define where resources are for the domain in question.
The zone’s $ORIGIN
is a parameter equal to the zone’s highest level of authority by default.
So if a zone file is used to configure the “example.com.
” domain, the $ORIGIN
would be set to example.com.
.
This is either configured at the top of the zone file or it can be defined in the DNS server’s configuration file that references the zone file. Either way, this parameter describes what the zone is going to be authoritative for.
Similarly, the $TTL
configures the “time to live” of the information it provides. It is basically a timer. A caching name server can use previously queried results to answer questions until the TTL value runs out.
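As a brief sketch (the domain and values here are only placeholders), the top of a zone file that sets both of these parameters might look like this:
$ORIGIN example.com.    ; unqualified names below are relative to example.com.
$TTL 1h                 ; resolvers may cache answers from this zone for one hour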
Within the zone file, we can have many different record types. We will go over some of the more common (or mandatory types) here.
The Start of Authority, or SOA, record is a mandatory record in all zone files. It must be the first real record in a file (although $ORIGIN
or $TTL
specifications may appear above). It is also one of the most complex to understand.
The start of authority record looks something like this:
domain.com. IN SOA ns1.domain.com. admin.domain.com. (
12083 ; serial number
3h ; refresh interval
30m ; retry interval
3w ; expiry period
1h ; negative TTL
)
Let’s explain what each part is for:
domain.com.
: This is the root of the zone. This specifies that the zone file is for the domain.com.
domain. Often, you’ll see this replaced with @
, which is just a placeholder that substitutes the contents of the $ORIGIN
variable we learned about above.
IN SOA: The “IN” portion means internet (and will be present in many records). The SOA is the indicator that this is a Start of Authority record.
ns1.domain.com.
: This defines the primary name server for this domain. Name servers can either be primary or secondary, and if dynamic DNS is configured one server needs to be a “primary”, which goes here. If you haven’t configured dynamic DNS, then this is just one of your primary name servers.
admin.domain.com.
: This is the email address of the administrator for this zone. The “@” is replaced with a dot in the email address. If the name portion of the email address normally has a dot in it, this is replaced with a “\” in this part (your.name@domain.com
becomes your\name.domain.com
).
12083: This is the serial number for the zone file. Every time you edit a zone file, you must increment this number for the zone file to propagate correctly. Secondary servers will check if the primary server’s serial number for a zone is larger than the one they have on their system. If it is, it requests the new zone file, if not, it continues serving the original file.
3h: This is the refresh interval for the zone. This is the amount of time that the secondary will wait before polling the primary for zone file changes.
30m: This is the retry interval for this zone. If the secondary cannot connect to the primary when the refresh period is up, it will wait this amount of time and retry to poll the primary.
3w: This is the expiry period. If a secondary name server has not been able to contact the primary for this amount of time, it no longer returns responses as an authoritative source for this zone.
1h: This is the amount of time that the name server will cache a name error if it cannot find the requested name in this file.
Both of these records map a host to an IP address. The “A” record is used to map a host to an IPv4 IP address, while “AAAA” records are used to map a host to an IPv6 address.
The general format of these records is this:
host IN A IPv4_address
host IN AAAA IPv6_address
So since our SOA record called out a primary server at “ns1.domain.com
”, we would have to map this name to an IP address, since “ns1.domain.com
” is within the “domain.com
” zone that this file is defining.
The record could look something like this:
ns1 IN A 111.222.111.222
Notice that we don’t have to give the full name. We can just give the host, without the FQDN and the DNS server will fill in the rest with the $ORIGIN
value. However, we could just as easily use the entire FQDN if we feel like being semantic:
ns1.domain.com. IN A 111.222.111.222
In most cases, this is where you’ll define your web server as “www”:
www IN A 222.222.222.222
We should also tell where the base domain resolves to. We can do this like this:
domain.com. IN A 222.222.222.222
We could have used the “@
” to refer to the base domain instead:
@ IN A 222.222.222.222
We also have the option of resolving anything under this domain that is not explicitly defined to this server as well. We can do this with the “*
” wild card:
* IN A 222.222.222.222
All of these work just as well with AAAA records for IPv6 addresses.
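For instance, a hedged example of an AAAA record for the “www” host (using an address from the IPv6 documentation range rather than a real one) would be:
www IN AAAA 2001:db8::1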
CNAME records define an alias for the canonical name of your server (one defined by an A or AAAA record).
For instance, we could have an A record defining the “server1” host and then use “www” as an alias for this host:
server1 IN A 111.111.111.111
www IN CNAME server1
Be aware that these aliases come with some performance losses because they require an additional query to the server. Most of the time, the same result could be achieved by using additional A or AAAA records.
One case when a CNAME is recommended is to provide an alias for a resource outside of the current zone.
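For instance, a small sketch of this (the external hostname below is hypothetical) would alias a host in this zone to a resource hosted elsewhere:
files IN CNAME assets.cdn-provider.example.net.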
MX records are used to define the mail exchanges that are used for the domain. This helps email messages arrive at your mail server correctly.
Unlike many other record types, mail records generally don’t map a host to something, because they apply to the entire zone. As such, they usually look like this:
IN MX 10 mail.domain.com.
Note that there is no host name at the beginning.
Also note that there is an extra number in there. This is the preference number that helps computers decide which server to send mail to if there are multiple mail servers defined. Lower numbers have a higher priority.
The MX record should generally point to a host defined by an A or AAAA record, and not one defined by a CNAME.
So, let’s say that we have two mail servers. There would have to be records that look something like this:
IN MX 10 mail1.domain.com.
IN MX 50 mail2.domain.com.
mail1 IN A 111.111.111.111
mail2 IN A 222.222.222.222
In this example, the “mail1” host is the preferred email exchange server.
We could also write that like this:
IN MX 10 mail1
IN MX 50 mail2
mail1 IN A 111.111.111.111
mail2 IN A 222.222.222.222
This record type defines the name servers that are used for this zone.
You may be wondering, “if the zone file resides on the name server, why does it need to reference itself?”. Part of what makes DNS so successful is its multiple levels of caching. One reason for defining name servers within the zone file is that the zone file may actually be served from a cached copy on another name server. There are other reasons for needing the name servers defined on the name server itself, but we won’t go into that here.
Like the MX records, these are zone-wide parameters, so they do not take hosts either. In general, they look like this:
IN NS ns1.domain.com.
IN NS ns2.domain.com.
You should have at least two name servers defined in each zone file in order to operate correctly if there is a problem with one server. Most DNS server software considers a zone file to be invalid if there is only a single name server.
As always, include the mapping for the hosts with A or AAAA records:
IN NS ns1.domain.com.
IN NS ns2.domain.com.
ns1 IN A 111.222.111.111
ns2 IN A 123.211.111.233
There are quite a few other record types you can use, but these are probably the most common types that you will come across.
PTR records are used to define a name associated with an IP address. PTR records are the inverse of an A or AAAA record. They are unique in that they begin at the .arpa root and are delegated to the owners of the IP addresses. The Regional Internet Registries (RIRs) manage the IP address delegation to organizations and service providers. The Regional Internet Registries include APNIC, ARIN, RIPE NCC, LACNIC, and AFRINIC.
Here is an example of what a PTR record for 111.222.333.444 would look like:
444.333.222.111.in-addr.arpa. 33692 IN PTR host.example.com.
This example of a PTR record for an IPv6 address shows the nibble format of the reverse of Google’s IPv6 DNS Server 2001:4860:4860::8888
.
8.8.8.8.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.6.8.4.0.6.8.4.1.0.0.2.ip6.arpa. 86400 IN PTR google-public-dns-a.google.com.
The command line tool dig
with the -x
flag can be used to look up the reverse DNS name of an IP address.
Here is an example of a dig command. The +short
is appended to reduce the output to the reverse DNS name.
dig -x 8.8.4.4 +short
The output for the dig command above will be the domain name in the PTR record for the IP address:
google-public-dns-b.google.com.
Servers on the Internet use PTR records to place domain names within log entries, make informed spam handling decisions, and display easy-to-read details about other devices.
Most commonly-used email servers will look up the PTR record of an IP address it receives email from. If the source IP address does not have a PTR record associated with it, the emails being sent may be treated as spam and rejected. It is not important that the FQDN in the PTR matches the domain name of the email being sent. What is important is that there is a valid PTR record with a corresponding and matching forward A record.
Normally network routers on the Internet are given PTR records that correspond with their physical location. For example you may see references to ‘NYC’ or ‘CHI’ for a router in New York City or Chicago. This is helpful when running a traceroute or MTR and reviewing the path Internet traffic is taking.
Most providers offering dedicated servers or VPS services will give customers the ability to set a PTR record for their IP address. DigitalOcean will automatically assign the PTR record of any Droplet when the Droplet is named with a domain name. The Droplet name is assigned during creation and can be edited later using the settings page of the Droplet control panel.
Note: It is important that the FQDN in the PTR record has a corresponding and matching forward A record. Example: 111.222.333.444
has a PTR of server.example.com
and server.example.com
is an A record that points to 111.222.333.444
.
CAA records are used to specify which Certificate Authorities (CAs) are allowed to issue SSL/TLS certificates for your domain. As of September 8, 2017 all CAs are required to check for these records before issuing a certificate. If no record is present, any CA may issue a certificate. Otherwise, only the specified CAs may issue certificates. CAA records can be applied to single hosts, or entire domains.
An example CAA record follows:
example.com. IN CAA 0 issue "letsencrypt.org"
The host, IN
, and record type (CAA
) are common DNS fields. The CAA-specific information above is the 0 issue "letsencrypt.org"
portion. It is made up of three parts: flags (0
), tags (issue
), and values ("letsencrypt.org"
).
Flags: If 0, the record will be ignored. If 1, the CA must refuse to issue the certificate.
Tags: issue authorizes a CA to create certificates for a specific hostname, issuewild authorizes wildcard certificates, and iodef defines a URL where CAs can report policy violations.
Values: For issue and issuewild this will typically be the domain of the CA you’re granting the permission to. For iodef this may be the URL of a contact form, or a mailto: link for email feedback.
You may use dig to fetch CAA records using the following options:
dig example.com type257
For more detailed information about CAA records, you can read RFC 6844, or our tutorial How To Create and Manage CAA Records Using DigitalOcean DNS.
You should now have a pretty good grasp on how DNS works. While the general idea is relatively easy to grasp once you’re familiar with the strategy, this is still something that can be difficult for inexperienced administrators to put into practice.
For an overview check out How To Set Up Domains within the DigitalOcean Control Panel.
The impact of cloud computing on industry and end users would be difficult to overstate: many aspects of everyday life have been transformed by the omnipresence of software that runs on cloud networks. By leveraging cloud computing, startups and businesses are able to optimize costs and increase their offerings without purchasing and managing the hardware and software themselves. Independent developers are empowered to launch globally-available apps and online services. Researchers can share and analyze data at scales once reserved only for highly-funded projects. And internet users can quickly access software and storage to create, share, and store digital media in quantities that extend far beyond the computing capacity of their personal devices.
Despite the growing presence of cloud computing, its details remain obscure to many. What exactly is the cloud, how does one use it, and what are its benefits for businesses, developers, researchers, government, healthcare practitioners, and students? In this conceptual article, we’ll provide a general overview of cloud computing, its history, delivery models, offerings, and risks.
In this article, you will gain an understanding of how the cloud can help support business, research, education, and community infrastructure and how to get started using the cloud for your own projects.
Cloud computing is the delivery of computing resources as a service, meaning that the resources are owned and managed by the cloud provider rather than the end user. Those resources may include anything from browser-based software applications (such as TikTok or Netflix) to third-party data storage for photos and other digital media (such as iCloud or Dropbox) to third-party servers used to support the computing infrastructure of a business, research, or personal project.
Before the broad proliferation of cloud computing, businesses and general computer users typically had to buy and maintain the software and hardware that they wished to use. With the growing availability of cloud-based applications, storage, services, and machines, businesses and consumers now have access to a wealth of on-demand computing resources as internet-accessed services. Shifting from on-premise software and hardware to networked remote and distributed resources means cloud users no longer have to invest the labor, capital, or expertise required for buying and maintaining these computing resources themselves. This unprecedented access to computing resources has given rise to a new wave of cloud-based businesses, changed IT practices across industries, and transformed many everyday computer-assisted practices. With the cloud, individuals can now work with colleagues over video meetings and other collaborative platforms, access entertainment and educational content on demand, communicate with household appliances, hail a cab with a mobile device, and rent a vacation room in someone’s house.
The National Institute of Standards and Technology (NIST), a non-regulatory agency of the United States Department of Commerce with a mission to advance innovation, defines cloud computing as:
a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.
NIST lists the following as the five essential characteristics of cloud computing:
These characteristics offer a wide variety of transformative opportunities for businesses and individuals alike, which we’ll discuss later in the section Benefits of Cloud Computing. To gain some additional context, let’s briefly review the emergence of cloud computing.
Many aspects of cloud computing can be traced as far back as the 1950s, when universities and companies rented out computation time on mainframe computers. At the time, renting was one of the only ways to access computing resources as computing technology was too large and expensive to be owned or managed by individuals. By the 1960s, computer scientists like John McCarthy of Stanford University and J.C.R Licklider of The U.S. Department of Defense Advanced Research Projects Agency (ARPA) began proposing ideas that anticipated some of the major features of cloud computing today, such as the conceptualization of computing as a public utility and the possibility of a network of computers that would allow people to access data and programs from anywhere in the world.
Cloud computing, however, didn’t become a mainstream reality and a popular term until the first decade of the 21st century. This decade saw the launch of cloud services like Amazon’s Elastic Compute Cloud (EC2) and Simple Storage Service (S3) in 2006, Heroku in 2007, Google Cloud Platform in 2008, Alibaba Cloud in 2009, Windows Azure (now Microsoft Azure) in 2010, IBM’s SmartCloud in 2011, and DigitalOcean in 2011. These services allowed existing businesses to optimize costs by migrating their in-house IT infrastructure to cloud-based resources and provided independent developers and small developer teams with resources for creating and deploying apps. Cloud-based applications, known as Software as a Service (SaaS) — which we’ll discuss in greater detail in the Cloud Delivery Models section — also became popular during this time period. Unlike on-premise software, or software that users need to physically install and maintain on their machines, SaaS increased the availability of applications by allowing users to access them from a variety of devices on demand.
Some of these cloud-based applications — such as Google’s productivity apps (Gmail, Drive, and Docs) and Microsoft 365 (a cloud-based version of the Microsoft Office Suite) — were offered by the same companies that launched cloud infrastructure services, while other pre-existing software products, such as Adobe Creative Cloud, were launched as cloud-based applications using the services of cloud providers. New SaaS products and businesses also emerged based on the novel opportunities of these cloud providers, such as Netflix’s streaming services in 2007, the music platform Spotify in 2008, the file-hosting service Dropbox in 2009, the video conferencing service Zoom in 2012, and the communication tool Slack in 2013. Today, cloud-based IT infrastructure and cloud-based applications have become a popular choice for both businesses and individual users and their market share is expected to grow.
Cloud resources are provided in a variety of different delivery models that offer customers different levels of support and flexibility.
IaaS is the on-demand delivery of computing infrastructure, including operating systems, networking, storage, and other infrastructural components. Acting much like a virtual equivalent to physical servers, IaaS relieves cloud users of the need to buy and maintain physical servers while also providing the flexibility to scale and pay for resources as needed. IaaS is a popular option for businesses that wish to leverage the advantages of the cloud and have system administrators who can oversee the installation, configuration, and management of operating systems, development tools, and other underlying infrastructure that they wish to use. However, IaaS is also used by developers, researchers, and others who wish to customize the underlying infrastructure of their computing environment. Given its flexibility, IaaS can support everything from a company’s computing infrastructure to web hosting to big data analysis.
PaaS provides a computing platform where the underlying infrastructure (such as the operating system and other software) is installed, configured, and maintained by the provider, allowing users to focus their efforts on developing and deploying apps in a tested and standardized environment. PaaS is commonly used by software developers and developer teams as it cuts down on the complexity of setting up and maintaining computer infrastructure, while also supporting collaboration among distributed teams. PaaS can be a good choice for developers who don’t have the need to customize their underlying infrastructure, or those who want to focus their attention on development rather than DevOps and system administration.
SaaS providers are cloud-based applications that users access on demand from the internet without needing to install or maintain the software. Examples include GitHub, Google Docs, Slack, and Adobe Creative Cloud. SaaS applications are popular among businesses and general users given that they’re often easy to adopt, accessible from any device, and have free, premium, and enterprise versions of their applications. Like PaaS, SaaS abstracts away the underlying infrastructure of the software application so that users are only exposed to the interface they interact with.
Cloud services are available as public or private resources, each of which serves different needs.
The public cloud refers to cloud services (such as virtual machines, storage, or applications) offered publicly by a commercial provider to businesses and individuals. Public cloud resources are hosted on the commercial provider’s hardware, which users access through the internet. They are not always suitable for organizations in highly-regulated industries, such as healthcare or finance, as public cloud environments may not comply with industry regulations regarding customer data.
The private cloud refers to cloud services that are owned and managed by the organization that uses them and available only to the organization’s employees and customers. Private clouds allow organizations to exert greater control over their computing environment and their stored data, which can be necessary for organizations in highly-regulated industries. Private clouds are sometimes seen as more secure than public clouds as they are accessed through private networks and enable the organization to directly oversee their cloud security. Public cloud providers sometimes provide their services as applications that can be installed on private clouds, allowing organizations to keep their infrastructure and data on premise while taking advantage of the public cloud’s latest innovations.
Many organizations use a hybrid cloud environment which combines public and private cloud resources to support the organization’s computing needs while maintaining compliance with industry regulation. Multicloud environments are also common, which entail the use of more than one public cloud provider (for example, combining Amazon Web Services and DigitalOcean).
Cloud computing offers a variety of benefits to individuals, businesses, developers, and other organizations. These benefits vary according to the cloud user’s goals and activities.
Prior to the proliferation of cloud computing, most businesses and organizations needed to purchase and maintain the software and hardware that supported their computing activities. As cloud computing resources became available, many businesses began using them to store data, provide enterprise software, and deploy online products and services. Some of these cloud-based adoptions and innovations are industry-specific. In healthcare, many providers use cloud services that are specifically designed to store and share patient data or communicate with patients. In academia, educators and researchers use cloud-based teaching and research apps. But there are also a large number of general cloud-based tools that have been adopted across industries, such as apps for productivity, messaging, expense management, video conferencing, project management, newsletters, surveys, customer relations management, identity management, and scheduling. The rapid growth of cloud-based business apps and infrastructure shows that the cloud isn’t just changing business IT strategy: it’s a booming business in its own right.
Cloud-based technologies offer businesses several key advantages. First, they can help optimize IT costs. As businesses shift towards renting computing resources, they no longer have to invest as much in purchasing and maintaining on-premise IT infrastructure. Cloud computing is also enormously flexible, allowing businesses to rapidly scale (and only pay for) the computing resources they actually use. Cost, however, is not the only consideration that drives cloud adoption in business. Cloud-based technologies can help make internal IT processes more efficient as they can be accessed on demand by employees without needing to go through IT approval processes. Cloud-based apps can improve collaboration across a business as they allow for real-time communication and data sharing.
Computing resources that were once only affordable to large companies and organizations are now available on demand through an internet connection and at a fraction of their previous cost. In effect, independent developers can rapidly deploy and experiment with cloud-based apps. Cloud-based apps for sharing code (such as GitHub) have also made it easier for developers to build upon and collaborate on open source software projects. Additionally, cloud-based educational platforms and interactive coding tutorials have expanded access to developer education, enabling individuals without formal technical training to learn to code in their own time.
Altogether, these cloud-based computing and educational resources have helped lower the barriers to learning developer skills and deploying cloud-based apps. Formal training, company support, and massive amounts of startup capital are no longer necessary for individuals to experiment with creating and deploying apps, allowing for more individuals to participate in cloud development, compete with established industry players, and create and share apps as side projects.
As machine learning methods become increasingly important in scientific research, cloud computing has become essential to many scientific fields, including astronomy, physics, genomics, and artificial intelligence. The massive amount of data collected and analyzed in machine learning and other data-intensive research projects often require computing resources that scale beyond the capacity of hardware owned by an individual researcher or provisioned by the university. Cloud computing allows researchers to access (and only pay for) computing resources as their workloads require and allows for real-time collaboration with research partners across the globe. Without commercial cloud providers, a majority of academic machine learning research would be limited to individuals with access to university-provisioned, high-powered computing resources.
Cloud computing has also provided students with tools for supplementing their education and opportunities to put their technical skills into practice as they learn. Cloud-based apps for sharing, teaching, and collaborating on code and data (such as GitHub and Jupyter Notebooks) enable students to learn technical skills in a hands-on manner by studying, deploying, and contributing to open source software and research projects relevant to their field or professional aspirations. And just like independent developers, students are able to use cloud computing resources to share their code and apps with the public and reap the satisfaction of understanding the real-world application of their skills.
Students, researchers, and educators can also take advantage of cloud computing resources to support personalized academic infrastructure and practice greater control over their computing environments. Some academics prefer this approach as it lets them pick which applications they use, customize the functionality and design of these tools, and limit or prohibit the collection of data. There are also a growing number of cloud-based applications developed specifically for academic purposes that supplement or provide alternatives to traditional academic IT offerings. Voyant Tools offers students and researchers a code-free method for providing textual analysis on documents of their choosing and The HathiTrust provides access to its digital collection of millions of volumes. Reclaim Hosting, Commons in a Box, the Modern Language Humanities Commons, and Manifold offer educational, publishing, and networking tools designed specifically for academic communities.
Some individuals and communities choose to install and manage their own cloud-based software to serve community needs and values, customize functionality, protect user data, and have more control over their computing environment. Open source software, such as social media tools like Mastodon, video conferencing software like Jitsi, collaborative text editors like Etherpad, and web chat tools like Rocket Chat, provides alternatives to SaaS platforms that often limit users’ control, privacy, and oversight over their computing environment. While often requiring more administrative work than SaaS applications or social media platforms, some communities prefer these options given ethical concerns about the use of personal data and company practices with popular platforms and SaaS applications.
Though the cloud offers many benefits, it also comes with its own set of risks, costs, and ethical questions that should be considered. Some of these issues are relevant to all cloud users, while others are more applicable to businesses and organizations that use the cloud to store customers’ data:
Cloud technologies offer a variety of opportunities to businesses, independent developers, researchers, educators, and students. By understanding the different services, models, benefits and risks offered by the cloud, users can make informed decisions about how to best take advantage of its offerings.
GraphQL is a query language for APIs and a runtime for fulfilling those queries with your existing data. GraphQL provides a complete and understandable description of the data in your API as well as gives clients the power to ask for exactly what they need and nothing more.
It simplifies evolving APIs over time and enables powerful developer tools. In this guide, we’ll look at the benefits and also the drawbacks of GraphQL so that you can decide for yourself if it’s a good fit for your project or not.
If you are seeking a GraphQL installation guide, we have tutorials that cover using GraphQL with Ruby on Rails and GraphQL with Node.js.
The importance and usefulness of GraphQL’s exact data fetching cannot be overemphasized. With GraphQL, you can send a query to your API and get exactly what you need, nothing more and nothing less. If you compare this feature with the conventional behavior of REST, you’ll see that this is a major improvement over the way things were previously done.
GraphQL minimizes the amount of data that is transferred across the wire by being selective about the data depending on the client application’s needs. Thus, a mobile client can fetch less information because it may not be needed on a small screen compared to the larger screen for the web application.
So instead of multiple endpoints that return fixed data structures, a GraphQL server only exposes a single endpoint and responds with precisely the data a client requested.
Consider a situation where you want to call an API endpoint that has two resources, artists and their tracks.
To be able to request for a particular artist or their music tracks, you will have an API structure like this:
METHOD /api/:resource:/:id:
With the traditional REST pattern, if we want to look up a list of every artist using the provided API, we would have to make a GET request to the root resource endpoint like this:
GET /api/artists
What if we want to query for an individual artist from the list of artists? Then we will have to append the resource ID to the endpoint like this:
GET /api/artists/1
In essence, we have to call two different endpoints to get the required data. With GraphQL, every request can be performed on one endpoint, with the actions being taken and data being returned all defined within the query itself. Let’s say we want to get an artist’s track and duration; with GraphQL, we’ll have a query like this:
GET /api?query={ artists(id:"1") { track, duration } }
This query instructs the API to look up an artist with the ID of 1 and then return its track and duration, which is exactly what we wanted, no more, no less. This same endpoint can also be used to perform actions within the API as well.
Another useful feature of GraphQL is that it makes it simple to fetch all required data with one single request. The structure of GraphQL servers makes it possible to declaratively fetch data as it only exposes a single endpoint.
Consider a situation where a user wants to request the details of a particular artist, say name, id, tracks, etc. With the traditional REST pattern, this would require at least two requests to two endpoints, /artists
and /tracks
. However, with GraphQL, we can define all the data we need in the query as shown below:
// the query request
artists(id: "1") {
id
name
avatarUrl
tracks(limit: 2) {
name
urlSlug
}
}
Here, we have defined a single GraphQL query to request for multiple resources (artists and tracks). This query will return all and only the requested resources like so:
// the query result
{
"data": {
"artists": {
"id": "1",
"name": "Michael Jackson",
"avatarUrl": "https://artistsdb.com/artist/1",
"tracks": [
{
"name": "Heal the world",
"urlSlug": "heal-the-world"
},
{
"name": "Thriller",
"urlSlug": "thriller"
}
]
}
}
}
As can be seen from the response data above, we have fetched the resources for both /artists
and /tracks
with a single API call. This is a powerful feature that GraphQL offers. As you can already imagine, the applications of this feature for highly declarative API structures are limitless.
Modern applications are now built in comprehensive ways where a single backend application supplies the data that is needed to run multiple clients. Web applications, mobile apps, smart screens, watches, etc. can now depend on just a single backend application for the data they need to function efficiently.
GraphQL embraces these new trends as it can be used to connect the backend application and fulfill each client’s requirements (nested relationships of data, fetching only the required data, network usage requirements, etc.) without dedicating a separate API to each client.
Most times, to do this, the backend would be broken down into multiple microservices with distinct functionalities. This way, it becomes easy to dedicate specific functionalities to the microservices through what we call schema stitching. Schema stitching makes it possible to create a single general schema from different schemas. As a result, each microservice can define its own GraphQL schema.
Afterward, you could use schema stitching to weave all individual schemas into one general schema, which can then be accessed by each of the client applications. In the end, each microservice can have its own GraphQL endpoint, whereas one GraphQL API gateway consolidates all schemas into one global schema to make it available to the client applications.
To demonstrate schema stitching, let’s consider the same situation employed by Sashko Stubailo while explaining stitching, where we have two related APIs: the new public Universe GraphQL API for Ticketmaster’s Universe event management system, and the Dark Sky weather API on Launchpad, created by Matt Dionis. Let’s look at two queries we can run against these APIs separately.
First, with the Universe API, we can get the details about a specific event ID:
With the Dark Sky weather API, we can get the weather details for the same location like so:
Now with GraphQL schema stitching, we could do an operation to merge the two schemas in such a way that we could easily send those two queries side by side:
You can take an in-depth look at GraphQL schema stitching by Sashko Stubailo to get a deeper understanding of the concepts involved.
In this way, GraphQL makes it possible to merge different schemas into one general schema that all of the clients can get resources from, embracing the new, modern style of development with ease.
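As a rough, hedged sketch of what this can look like in code (the package names come from the graphql-tools project, and the two subschemas below are tiny stand-ins rather than the real Universe or Dark Sky APIs), a gateway schema can be stitched together from independent subschemas like this:
const { makeExecutableSchema } = require('@graphql-tools/schema');
const { stitchSchemas } = require('@graphql-tools/stitch');

// A stand-in subschema for an event service.
const eventSchema = makeExecutableSchema({
  typeDefs: `type Query { eventName(id: ID!): String }`,
  resolvers: { Query: { eventName: (_, { id }) => `Event ${id}` } },
});

// A stand-in subschema for a weather service.
const weatherSchema = makeExecutableSchema({
  typeDefs: `type Query { temperature(lat: Float!, lon: Float!): Float }`,
  resolvers: { Query: { temperature: () => 20.5 } },
});

// The stitched gateway schema exposes both eventName and temperature
// behind a single GraphQL endpoint.
const gatewaySchema = stitchSchemas({
  subschemas: [{ schema: eventSchema }, { schema: weatherSchema }],
});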
This is one GraphQL feature that personally gives me joy. As developers, we are used to calling different versions of an API and oftentimes getting really strange responses. Traditionally, we version APIs when we’ve made changes to the resources or to the structure of the resources we currently have; hence the need to deprecate the old version and evolve a new one.
For example, we can have an API like api.domain.com/resources/v1
and at some point in the later months or years, a few changes would have happened and resources or the structure of the resources will have changed, hence, the next best thing to do will be to evolve this API to api.domain.com/resources/v2
to capture all the recent changes.
At this point, some resources in v1
will have been deprecated (or left active for a while until users have migrated to the new version), and clients requesting those resources may get unexpected responses like deprecation notices.
In GraphQL, it is possible to deprecate APIs on a field level. When a particular field is to be deprecated, a client receives a deprecation warning when querying the field. After a while, the deprecated field may be removed from the schema when not many clients are using it anymore.
As a result, instead of completely versioning the API, it is possible to gradually evolve the API over time without having to restructure the entire API schema.
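For instance, a field can be marked with GraphQL’s built-in @deprecated directive in the schema definition language; the type and field names below are only illustrative:
type Artist {
  id: ID!
  name: String
  # Clients introspecting the schema will see this deprecation reason
  avatarUrl: String @deprecated(reason: "Use profileImage instead.")
  profileImage: String
}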
Caching is the storage of data so that future requests for that data can be served faster; the data stored in a cache might be the result of an earlier computation or the duplicate of data stored elsewhere. The goal of caching an API response is primarily to obtain the response from future requests faster. Unlike GraphQL, caching is built into the HTTP specification which RESTful APIs are able to leverage.
With REST you access resources with URLs, and thus you would be able to cache on a resource level because you have the resource URL as an identifier. In GraphQL, this becomes complex as each query can be different even though it operates on the same entity.
In one query you might be interested in just the name of an artist; however, in the next query you might want to get the artists’ tracks and release dates. This is where caching becomes most complex, as it requires field-level caching, which isn’t an easy thing to achieve with GraphQL since it uses a single endpoint.
That said, the GraphQL community recognizes this difficulty and has since been making efforts to make caching easier for GraphQL users. Libraries like Prisma and Dataloader (built on GraphQL) have been developed to help with similar scenarios. However, it still doesn’t completely cover things like browser and mobile caching.
GraphQL gives clients the power to execute queries to get exactly what they need. This is an amazing feature; however, it can also be problematic, because it means that users can ask for as many fields in as many resources as they want.
For instance, a user defines a query that asks for a list of all the users that commented on all the tracks of a particular artist. This will require a query like this:
artist(id: '1') {
id
name
tracks {
id
title
comments {
text
date
user {
id
name
}
}
}
}
This query could potentially return tens of thousands of records in response.
Therefore, as much as it is a good thing to allow users to request for whatever they need, at certain levels of complexity, requests like this can slow down performance and immensely affect the efficiency of GraphQL applications.
For complex queries, a REST API might be easier to design because you can have several endpoints for specific needs, and for each endpoint, you can define specific queries to retrieve the data in an efficient way. This might also be a bit controversial, given that several network calls can take a lot of time as well, but if you are not careful, a few big queries can bring your server to its knees.
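One mitigation commonly used in the GraphQL ecosystem, though not covered above, is to reject overly nested queries before executing them. The sketch below assumes the third-party graphql-depth-limit and apollo-server packages, and the depth limit of 5 is an arbitrary choice rather than a recommendation:
const { ApolloServer, gql } = require('apollo-server');
const depthLimit = require('graphql-depth-limit');

// A minimal placeholder schema; a real schema would define artists, tracks, etc.
const typeDefs = gql`
  type Query {
    hello: String
  }
`;
const resolvers = { Query: { hello: () => 'world' } };

const server = new ApolloServer({
  typeDefs,
  resolvers,
  // Queries nested more than 5 levels deep are rejected during validation,
  // before any resolvers run.
  validationRules: [depthLimit(5)],
});

server.listen();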
As we exemplified before while building with GraphQL on the backend, more often than not your database and GraphQL API will have similar but different schemas, which translate to different document structures. As a result, a track from the database will have a userId property, while the same track fetched through your API will instead have a user object on the client. This makes for a client/server-side data mismatch.
Consider getting the name of the artist of a particular track on the client-side, it’ll look like this:
const getArtistNameInClient = track => {
  return track.user.name
}
However, doing the exact same thing on the server-side will result in an entirely different code like this:
const getArtistNameInServer = track => {
  // on the server, the track document only stores a userId reference
  const trackArtist = Users.findOne(track.userId)
  return trackArtist.name
}
By extension, this means that you're missing out on GraphQL's great approach to data querying on the server. Thankfully, this is not without a fix: it turns out that you can run server-to-server GraphQL queries just fine. How? By passing your executable schema to the graphql function, along with your GraphQL query:
const { graphql } = require('graphql');
const result = await graphql(executableSchema, query, {}, context, variables);
According to Sacha Greif, it is important not to see GraphQL as just a pure client-server protocol. GraphQL can be used to query data in any situation, including client-to-client with Apollo Link State or even during a static build process with Gatsby.
When building with GraphQL on the backend, it is hard to avoid duplication and code repetition, especially when it comes to schemas. You need one schema for your database and another for your GraphQL endpoint, and the two involve similar-but-not-quite-identical code.
It is hard enough that you have to write very similar code for your schemas regularly, but it's even more frustrating that you also have to continually keep them in sync.
Other developers have noticed this difficulty as well, and efforts have been made in the GraphQL community to fix it, most notably through libraries that generate one schema from the other so the two stay in sync.
GraphQL is an exciting new technology, but it is important to understand the tradeoffs before making expensive and important architectural decisions. Some APIs, such as those with very few entities and relationships across entities like analytics APIs, may not be very suited for GraphQL. However, applications with many different domain objects like e-commerce applications where you have items, users, orders, payments, and so on may be able to leverage GraphQL much more.
GraphQL is a powerful tool, and there are many reasons to choose it for your projects, but don't forget that the most important, and often the best, choice is whichever tool is right for the project at hand. The good and bad points presented here may not always apply, but they are worth taking into consideration when evaluating GraphQL, to see whether it can help your project or whether its drawbacks have already been addressed.
If you are seeking a GraphQL installation guide, we have tutorials that cover using GraphQL with Ruby on Rails and GraphQL with Node.js.
When designing a database, there may be times when you want to put limits on what data is allowed in certain columns. For example, if you're creating a table that will hold information on skyscrapers, you may want the column holding each building's height to prohibit negative values.
Relational database management systems (RDBMSs) allow you to control what data gets added to a table with constraints. A constraint is a special rule that applies to one or more columns (or to an entire table) and restricts what changes can be made to a table's data, whether through an INSERT, UPDATE, or DELETE statement.
This article will review in detail what constraints are and how they’re used in RDBMSs. It will also walk through each of the five constraints defined in the SQL standard and explain their respective functions.
In SQL, a constraint is any rule applied to a column or table that limits what data can be entered into it. Any time you attempt to perform an operation that changes the data held in a table, such as an INSERT, UPDATE, or DELETE statement, the RDBMS will test whether that data violates any existing constraints and, if so, return an error.
Database administrators often rely on constraints to ensure that a database follows a set of defined business rules. In the context of a database, a business rule is any policy or procedure that a business or other organization follows and that its data must adhere to as well. For instance, say you're building a database that will catalog a client's store inventory. If the client specifies that each product record should have a unique identification number, you could create a column with a UNIQUE constraint that will ensure no two entries in that column are the same.
Constraints are also helpful with maintaining data integrity. Data integrity is a broad term that's often used to describe the overall accuracy, consistency, and rationality of data held in a database, based on its particular use case. Tables in a database are often closely related, with columns in one table being dependent on the values in another. Because data entry is often prone to human error, constraints are useful in cases like this, as they can help ensure that no incorrectly entered data could impact such relationships and thus harm the database's data integrity.
Imagine you're designing a database with two tables: one for listing current students at a school and another for listing members of that school's basketball team. You could apply a FOREIGN KEY constraint to a column in the basketball team table which refers to a column in the students table. This will establish a relationship between the two tables by requiring any entry to the team table to refer to an existing entry in the students table.
Users define constraints when they first create a table, or they can add them later on with an ALTER TABLE statement, as long as the new constraint doesn't conflict with any data already in the table. When you create a constraint, the database system will generate a name for it automatically, but in most SQL implementations you can add a custom name for any constraint. These names are used to refer to constraints in ALTER TABLE statements when changing or removing them.
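As a minimal sketch of that syntax (the products table, its price column, and the constraint name positive_price are all hypothetical), adding a named constraint to an existing table might look like this:
ALTER TABLE products
ADD CONSTRAINT positive_price CHECK (price > 0);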
The SQL standard formally defines just five constraints:
PRIMARY KEY
FOREIGN KEY
UNIQUE
CHECK
NOT NULL
Note: Many RDBMSs include the DEFAULT keyword, which is used to define a default value for a column other than NULL if no value is specified when inserting a row. The documentation of some of these database management systems refers to DEFAULT as a constraint, as their implementations of SQL use a DEFAULT syntax similar to that of constraints like UNIQUE or CHECK. However, DEFAULT technically is not a constraint, since it doesn't restrict what data can be entered into a column.
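For illustration, here is a minimal sketch of that DEFAULT syntax, assuming a hypothetical orders table; notice that it only supplies a fallback value rather than restricting input:
CREATE TABLE orders (
    orderID int,
    status varchar(20) DEFAULT 'pending'
);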
Now that you have a general understanding of how constraints are used, let’s take a closer look at each of these five constraints.
PRIMARY KEY
The PRIMARY KEY constraint requires every entry in the given column to be both unique and not NULL, and allows you to use that column to identify each individual row in the table.
In the relational model, a key is a column or set of columns in a table in which every value is guaranteed to be unique and to not contain any NULL values. A primary key is a special key whose values are used to identify individual rows in a table, and the column or columns that comprise the primary key can be used to identify the table throughout the rest of the database.
This is an important aspect of relational databases: with a primary key, users don’t need to know where their data is physically stored on a machine and their DBMS can keep track of each record and return them on an ad hoc basis. In turn, this means that records have no defined logical order, and users have the ability to return their data in whatever order or through whatever filters they wish.
You can create a primary key in SQL with the PRIMARY KEY constraint, which is essentially a combination of the UNIQUE and NOT NULL constraints. After defining a primary key, the DBMS will automatically create an index associated with it. An index is a database structure that helps to retrieve data from a table more quickly. Similar to an index in a textbook, queries only have to review entries from the indexed column to find the associated values. This is what allows the primary key to act as an identifier for each row in the table.
A table can only have one primary key but, like regular keys, a primary key can comprise more than one column. With that said, a defining feature of primary keys is that they use only the minimal set of attributes needed to uniquely identify each row in a table. To illustrate this idea, imagine a table that stores information about students at a school using the following three columns:
studentID: used to hold each student's unique identification number
firstName: used to hold each student's first name
lastName: used to hold each student's last name
It's possible that some students at the school could share a first name, making the firstName column a poor choice of a primary key. The same is true for the lastName column. A primary key consisting of both the firstName and lastName columns could work, but there's still a possibility that two students could share a first and last name.
A primary key consisting of the studentID and either the firstName or lastName columns could work, but since each student's identification number is already known to be unique, including either of the name columns in the primary key would be superfluous. So in this case, the minimal set of attributes that can identify each row, and would thus be a good choice for the table's primary key, is just the studentID column on its own.
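To sketch what this might look like in practice (the column types shown here are assumptions), the students table could be created with studentID declared as its primary key:
CREATE TABLE students (
    studentID int PRIMARY KEY,
    firstName varchar(30),
    lastName varchar(30)
);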
If a key is made up of observable application data (that is, data that represents real world entities, events, or attributes) it’s referred to as a natural key. If the key is generated internally and doesn’t represent anything outside the database, it’s known as a surrogate or synthetic key. Some database systems recommend against using natural keys, as even seemingly constant data points can change in unpredictable ways.
FOREIGN KEY
The FOREIGN KEY constraint requires that every entry in the given column must already exist in a specific column from another table.
If you have two tables that you'd like to associate with one another, one way you can do so is by defining a foreign key with the FOREIGN KEY constraint. A foreign key is a column in one table (the "child" table) whose values come from a key in another table (the "parent"). This is a way to express a relationship between two tables: the FOREIGN KEY constraint requires that values in the column on which it applies must already exist in the column that it references.
The following diagram highlights such a relationship between two tables: one used to record information about employees at a company and another used to track the company's sales. In this example, the primary key of the EMPLOYEES table is referenced by the foreign key of the SALES table:
If you try to add a record to the child table and the value entered into the foreign key column doesn’t exist in the parent table’s primary key, the insertion statement will be invalid. This helps to maintain relationship-level integrity, as the rows in both tables will always be related correctly.
Oftentimes, a table's foreign key is the parent table's primary key, but this isn't always the case. In most RDBMSs, any column in the parent table that has a UNIQUE or PRIMARY KEY constraint applied to it can be referenced by the child table's foreign key.
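A minimal sketch of how the two tables in the diagram might be defined follows; the column names are assumptions, since the diagram only names the tables:
CREATE TABLE EMPLOYEES (
    employeeID int PRIMARY KEY,
    name varchar(30)
);

CREATE TABLE SALES (
    saleID int PRIMARY KEY,
    employeeID int,
    -- each sale must reference an existing employee
    FOREIGN KEY (employeeID) REFERENCES EMPLOYEES (employeeID)
);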
UNIQUE
The UNIQUE constraint prohibits any duplicate values from being added to the given column.
As its name implies, a UNIQUE constraint requires every entry in the given column to be a unique value. Any attempt to add a value that already appears in the column will result in an error.
UNIQUE constraints are useful for enforcing one-to-one relationships between tables. As mentioned previously, you can establish a relationship between two tables with a foreign key, but there are multiple kinds of relationships that can exist between tables, such as one-to-one, one-to-many, and many-to-many relationships.
By adding a UNIQUE constraint to a column on which a FOREIGN KEY constraint has been applied, you can ensure that each entry in the parent table appears only once in the child, thereby establishing a one-to-one relationship between the two tables.
Note that you can define UNIQUE constraints at the table level as well as the column level. When defined at the table level, a UNIQUE constraint can apply to more than one column. In cases like this, each column included in the constraint can have duplicate values, but every row must have a unique combination of values in the constrained columns.
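As a sketch of a table-level UNIQUE constraint spanning two columns (the enrollments table and its columns are hypothetical), any given student or course may appear many times, but each student/course pair may appear only once:
CREATE TABLE enrollments (
    studentID int,
    courseID int,
    UNIQUE (studentID, courseID)
);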
CHECK
A CHECK constraint defines a requirement for a column, known as a predicate, that every value entered into it must meet.
CHECK constraint predicates are written in the form of an expression that can evaluate to either TRUE, FALSE, or potentially UNKNOWN. If you attempt to enter a value into a column with a CHECK constraint and the value causes the predicate to evaluate to TRUE or UNKNOWN (which happens for NULL values), the operation will succeed. However, if the expression resolves to FALSE, it will fail.
CHECK predicates often rely on a mathematical comparison operator (like <, >, <=, or >=) to limit the range of data allowed into the given column. For instance, one common use for CHECK constraints is to prevent certain columns from holding negative values in cases where a negative value wouldn't make sense, as in the following example.
This CREATE TABLE statement creates a table named productInfo with columns for each product's name, identification number, and price. Because it wouldn't make sense for a product to have a negative price, this statement imposes a CHECK constraint on the price column to ensure that it only contains positive values:
CREATE TABLE productInfo (
    productID int,
    name varchar(30),
    price decimal(4,2)
    CHECK (price > 0)
);
Not every CHECK predicate must use a mathematical comparison operator. Typically, you can use any SQL operator that can evaluate to TRUE, FALSE, or UNKNOWN in a CHECK predicate, including LIKE, BETWEEN, IS NOT NULL, and others. Some SQL implementations, but not all, even allow you to include a subquery in a CHECK predicate. Be aware, though, that most implementations do not allow you to reference another table in a predicate.
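For instance, here is a sketch of a CHECK predicate that uses BETWEEN rather than a single comparison operator, reusing the skyscraper example from the introduction (the table layout and the upper bound are assumptions):
CREATE TABLE skyscrapers (
    name varchar(60),
    height_meters decimal(6,1)
    CHECK (height_meters BETWEEN 0 AND 1000)
);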
NOT NULL
The NOT NULL constraint prohibits any NULL values from being added to the given column.
In most implementations of SQL, if you add a row of data but don't specify a value for a certain column, the database system will by default represent the missing data as NULL. In SQL, NULL is a special keyword used to represent an unknown, missing, or otherwise unspecified value. However, NULL is not a value itself, but instead the state of an unknown value.
To illustrate this difference, imagine a table used to track clients at a talent agency that has columns for each client's first and last names. If a client goes by a mononym, like "Cher", "Usher", or "Beyoncé", the database administrator might only enter the mononym in the first name column, causing the DBMS to enter NULL in the last name column. The database doesn't consider the client's last name to literally be "Null." It just means that the value for that row's last name column is unknown, or that the field doesn't apply for that particular record.
As its name implies, the NOT NULL constraint prevents any values in the given column from being NULL. This means that for any column with a NOT NULL constraint, you must specify a value for it when inserting a new row. Otherwise, the INSERT operation will fail.
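Returning to the talent agency example, a sketch of how NOT NULL might be applied (the table layout is an assumption): the first name column requires a value, while the last name column may be left NULL for clients who go by a mononym:
CREATE TABLE clients (
    clientID int PRIMARY KEY,
    firstName varchar(30) NOT NULL,
    lastName varchar(30)
);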
Constraints are essential tools for anyone looking to design a database with a high level of data integrity and security. By limiting what data gets entered into a column, you can ensure that relationships between tables will be maintained correctly and that the database adheres to the business rules that define its purpose.
For more detailed information on how to create and manage SQL constraints, you can review our guide on How To Use Constraints in SQL. If you’d like to learn more about SQL in general, we encourage you to check out our series on How To Use SQL.
Database management systems (DBMSs) are computer programs that allow users to interact with a database. A DBMS allows users to control access to a database, write data, run queries, and perform any other tasks related to database management.
In order to perform any of these tasks, though, the DBMS must have some kind of underlying model that defines how the data is organized. The relational model is an approach to organizing data that has found wide use in database software since it was developed in the late 1960s, so much so that, as of this writing, four of the five most popular DBMSs are relational.
This conceptual article outlines the history of the relational model, how relational databases organize data, and how they are used today.
Databases are logically modeled clusters of information, or data. Any collection of data is a database, regardless of how or where it is stored. Even a file cabinet containing payroll information is a database, as is a stack of hospital patient forms or a company's collection of customer information spread across multiple locations. Before storing and managing data with computers was common practice, physical databases like these were the only ones available to government agencies and businesses that needed to store information.
Around the middle of the 20th century, developments in computer science led to machines with more processing power, as well as greater local and external storage capacity. These advances led computer scientists to start recognizing the potential such machines had for storing and managing ever larger amounts of data.
However, there were no theories for how computers could organize data in meaningful, logical ways. It's one thing to store unsorted data on a machine, but it's much more complicated to design systems that allow that data to be added, retrieved, sorted, and otherwise managed in consistent, practical ways. The need for a logical framework for storing and organizing data led to a number of proposals for how to harness computers for data management.
One early database model was the hierarchical model, in which data is organized in a tree-like structure, similar to modern-day filesystems. The following example shows how the layout of part of a hierarchical database used to categorize animals might look:
The hierarchical model was widely implemented in early database management systems, but it also proved to be somewhat inflexible. In this model, even though individual records can have multiple "child" records, each record can only have one "parent" record in the hierarchy. Because of this, these earlier hierarchical databases were limited to representing only "one-to-one" and "one-to-many" relationships. This lack of "many-to-many" relationships could cause problems when working with data points that you'd like to associate with more than one parent record.
In the late 1960s, Edgar F. Codd, a computer scientist working at IBM, developed the relational model of database management. Codd's relational model allowed individual records to be associated with more than one table, thereby enabling "many-to-many" relationships between data points in addition to "one-to-many" relationships. This provided more flexibility than other existing models when it came to designing database structures, and meant that relational database management systems (RDBMSs) could satisfy a much wider range of business needs.
Codd proposed a language for managing relational data, known as Alpha, which influenced the development of later database languages. Two of Codd's colleagues at IBM, Donald Chamberlin and Raymond Boyce, created one such language inspired by Alpha. They called their language SEQUEL, short for Structured English Query Language, but because of an existing trademark they shortened the name of their language to SQL (referred to more formally as Structured Query Language).
Due to hardware constraints, early relational databases were still prohibitively slow, and it took some time before the technology became widespread. But by the mid-1980s, Codd's relational model had been implemented in a number of commercial database management products from both IBM and its competitors. These vendors also followed IBM's lead by developing and implementing their own dialects of SQL. By 1987, both the American National Standards Institute and the International Organization for Standardization had ratified and published standards for SQL, solidifying its status as the accepted language for managing RDBMSs.
The widespread use of the relational model across multiple industries led to it becoming recognized as the standard model for data management. Even with the rise of various NoSQL databases in recent years, relational databases remain the dominant tools for storing and organizing data.
Now that you have a general understanding of the relational model's history, let's take a closer look at how the model organizes data.
The most fundamental elements in the relational model are relations, which users and modern RDBMSs recognize as tables. A relation is a set of tuples, or rows in a table, with each tuple sharing a set of attributes, or columns:
A column is the smallest organizational structure of a relational database, and represents the various facets that define the records in the table, hence their more formal name, attributes. You can think of each tuple as a unique instance of whatever type of people, objects, events, or associations the table holds. These instances might be things like a company's employees, sales from an online business, or lab test results. For example, in a table holding employee records of teachers at a school, the tuples might have attributes like name, subjects, start_date, and so on.
When creating columns, you specify a data type that dictates what kind of entries are allowed in that column. RDBMSs often implement their own unique data types, which may not be directly interchangeable with similar data types in other systems. Some common data types include dates, strings, integers, and Booleans.
In the relational model, each table contains at least one column that can be used to uniquely identify each row, known as a primary key. This is important because it means that users don't need to know where their data is physically stored on a machine; instead, their DBMS can keep track of each record and return them on an ad hoc basis. In turn, this means that records have no defined logical order, and users have the ability to return their data in whatever order or through whatever filters they wish.
If you have two tables that you'd like to associate with one another, one way you can do so is with a foreign key. A foreign key is essentially a copy of one table's (the "parent" table's) primary key inserted into a column of another table (the "child"). The following example highlights the relationship between two tables, one used to record information about a company's employees and the other used to track the company's sales. In this example, the primary key of the EMPLOYEES table is used as the foreign key of the SALES table:
If you try to add a record to the child table and the value entered into the foreign key column doesn't exist in the parent table's primary key, the insertion statement will be invalid. This helps to maintain relationship-level integrity, as the rows in both tables will always be related correctly.
The relational model's structural elements help to keep data stored in an organized way, but storing data is only useful if you can retrieve it. To retrieve information from an RDBMS, you can issue a query, or a structured request for a set of information. As mentioned previously, most relational databases use SQL to manage and query data. SQL allows you to filter and manipulate query results with a variety of clauses, predicates, and expressions, giving you fine control over what data will appear in the result set.
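As a brief sketch of such a query (assuming the hypothetical teachers table mentioned earlier, with name, subjects, and start_date columns), SQL lets you pick columns, filter rows, and order the result set:
SELECT name, start_date
FROM teachers
WHERE subjects = 'Math'
ORDER BY start_date;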
With the underlying organizational structure of relational databases in mind, let's consider some of their advantages and disadvantages.
Today, both SQL and the databases that implement it deviate from Codd's relational model in several ways. For instance, Codd's model dictates that each row in a table should be unique, while, for reasons of practicality, most modern relational databases do allow duplicate rows. Some do not consider SQL databases to be true relational databases if they fail to adhere to each of Codd's specifications for the relational model. In practical terms, though, any DBMS that uses SQL and at least somewhat adheres to the relational model is likely to be referred to as a relational database management system.
Although relational databases quickly grew in popularity, a few of the relational model's shortcomings started to become apparent as data became more valuable and businesses began storing more of it. For one thing, it can be difficult to scale a relational database horizontally. Horizontal scaling, or scaling out, is the practice of adding more machines to an existing stack in order to spread out the load and allow for more traffic and faster processing. This is often contrasted with vertical scaling, which involves upgrading the hardware of an existing server, usually by adding more RAM or CPU.
The reason it's difficult to scale a relational database horizontally has to do with the fact that the relational model is designed to ensure consistency, meaning clients querying the same database will always retrieve the same data. If you were to scale a relational database horizontally across multiple machines, it becomes difficult to ensure consistency, since clients may write data to one node but not the others. There would likely be a delay between the initial write and the time when the other nodes are updated to reflect the changes, resulting in inconsistencies between them.
Another limitation presented by RDBMSs is that the relational model was designed to manage structured data, or data that aligns with a predefined data type or is at least organized in some predetermined way, making it easily sortable and searchable. With the spread of personal computing and the rise of the internet in the early 1990s, however, unstructured data, such as email messages, photos, videos, and so on, became more common.
None of this is to say that relational databases aren't useful. Quite the contrary: after more than 40 years, the relational model is still the dominant framework for data management. Their prevalence and longevity mean that relational databases are a mature technology, which is itself one of their major advantages. There are many applications designed to work with the relational model, as well as many career database administrators who are experts when it comes to relational databases. There is also a wide range of resources available, in print and online, for those looking to get started with relational databases.
Another advantage of relational databases is that almost every RDBMS supports transactions. A transaction consists of one or more individual SQL statements performed in sequence as a single unit of work. Transactions take an all-or-nothing approach, meaning that every SQL statement in the transaction must be valid; otherwise, the entire transaction fails. This is very helpful for ensuring data integrity when making changes to multiple rows or tables.
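A minimal sketch of a transaction follows (the accounts table is hypothetical, and the exact statement for starting a transaction varies slightly between RDBMSs): both updates must succeed, or neither is applied:
START TRANSACTION;
UPDATE accounts SET balance = balance - 100 WHERE accountID = 1;
UPDATE accounts SET balance = balance + 100 WHERE accountID = 2;
COMMIT;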
Lastly, relational databases are extremely flexible. They have been used to build a wide variety of different applications, and continue to work efficiently even with very large amounts of data. SQL is also extremely powerful, allowing you to add and change data on the fly, as well as alter the structure of database schemas and tables without impacting existing data.
Thanks to their flexibility and their design for data integrity, relational databases are still the primary way data is managed and stored more than fifty years after they were first conceived of. Even with the rise of various NoSQL databases in recent years, understanding the relational model and how to work with RDBMSs is key for anyone who wants to build applications that harness the power of data.
To learn more about a few popular open-source RDBMSs, we encourage you to check out our comparison of various open-source relational databases. If you'd like to learn more about databases in general, we encourage you to check out our complete library of database-related content.
Database management systems (DBMS) are computer programs that allow users to interact with a database. A DBMS allows users to control access to a database, write data, run queries, and perform any other tasks related to database management.
In order to perform any of these tasks, though, the DBMS must have some kind of underlying model that defines how the data are organized. The relational model is one approach for organizing data that has found wide use in database software since it was first devised in the late 1960s, so much so that, as of this writing, four of the top five most popular DBMSs are relational.
This conceptual article outlines the history of the relational model, how relational databases organize data, and how they’re used today.
Databases are logically modelled clusters of information, or data. Any collection of data is a database, regardless of how or where it is stored. Even a file cabinet containing payroll information is a database, as is a stack of hospital patient forms, or a company’s collection of customer information spread across multiple locations. Before storing and managing data with computers was common practice, physical databases like these were the only ones available to government and business organizations that needed to store information.
Around the middle of the 20th century, developments in computer science led to machines with more processing power, as well as greater local and external storage capacity. These advancements led computer scientists to start recognizing the potential these machines had for storing and managing ever larger amounts of data.
However, there weren’t any theories for how computers could organize data in meaningful, logical ways. It’s one thing to store unsorted data on a machine, but it’s much more complicated to design systems that allow you to add, retrieve, sort, and otherwise manage that data in consistent, practical ways. The need for a logical framework for storing and organizing data led to a number of proposals for how to harness computers for data management.
One early database model was the hierarchical model, in which data are organized in a tree-like structure, similar to modern-day filesystems. The following example shows how the layout of part of a hierarchical database used to categorize animals might look:
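As a rough sketch (the category and animal names are purely illustrative), the tree-like layout can be pictured as nested mappings in which every record hangs off a single parent:

    # A fragment of a hierarchical "animals" database sketched as nested mappings.
    # Each record has exactly one parent, which is what made the model rigid.
    animal_hierarchy = {
        "Animals": {
            "Vertebrates": {
                "Mammals": ["Dog", "Cat"],
                "Birds": ["Pigeon", "Owl"],
            },
            "Invertebrates": {
                "Insects": ["Ant", "Bee"],
            },
        },
    }

    print(animal_hierarchy["Animals"]["Vertebrates"]["Mammals"])  # ['Dog', 'Cat']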
The hierarchical model was widely implemented in early database management systems, but it also proved to be somewhat inflexible. In this model, even though individual records can have multiple “children,” each record can only have one “parent” in the hierarchy. Because of this, these earlier hierarchical databases were limited to representing only “one-to-one” and “one-to-many” relationships. This lack of “many-to-many” relationships could lead to problems when you’re working with data points that you’d like to associate with more than one parent.
In the late 1960s, Edgar F. Codd, a computer scientist working at IBM, devised the relational model of database management. Codd’s relational model allowed individual records to be associated with more than one table, thereby enabling “many-to-many” relationships between data points in addition to “one-to-many” relationships. This provided more flexibility than other existing models when it came to designing database structures, and meant that relational database management systems (RDBMSs) could meet a much wider range of business needs.
Codd proposed a language for managing relational data, known as Alpha, which influenced the development of later database languages. Two of Codd’s colleagues at IBM, Donald Chamberlin and Raymond Boyce, created one such language inspired by Alpha. They called their language SEQUEL, short for Structured English Query Language, but because of an existing trademark they shortened the name of their language to SQL (referred to more formally as Structured Query Language).
Due to hardware constraints, early relational databases were still prohibitively slow, and it took some time before the technology became widespread. But by the mid-1980s, Codd’s relational model had been implemented in a number of commercial database management products from both IBM and its competitors. These vendors also followed IBM’s lead by developing and implementing their own dialects of SQL. By 1987, both the American National Standards Institute and the International Organization for Standardization had ratified and published standards for SQL, solidifying its status as the accepted language for managing RDBMSs.
The relational model’s wide use across multiple industries led to it becoming recognized as the standard model for data management. Even with the rise of various NoSQL databases in more recent years, relational databases remain the dominant tools for storing and organizing data.
Now that you have a general understanding of the relational model’s history, let’s take a closer look at how the model organizes data.
The most fundamental elements in the relational model are relations, which users and modern RDBMSs recognize as tables. A relation is a set of tuples, or rows in a table, with each tuple sharing a set of attributes, or columns:
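As a minimal sketch in Python (the attribute names and rows are made up for illustration), a relation pairs a fixed set of attributes with any number of tuples that share them:

    # A relation: every tuple (row) shares the same set of attributes (columns).
    attributes = ("emp_id", "name", "start_date")

    tuples = [
        (1, "Maria Gonzalez", "2019-04-01"),
        (2, "Kofi Mensah", "2020-09-15"),
        (3, "Ana Souza", "2021-01-20"),
    ]

    # Pair each value with its attribute to read a row the way an RDBMS presents it.
    for row in tuples:
        print(dict(zip(attributes, row)))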
A column is the smallest organizational structure of a relational database, and represents the various facets that define the records in the table. Hence their more formal name, attributes. You can think of each tuple as a unique instance of whatever type of people, objects, events, or associations the table holds. These instances might be things like employees at a company, sales from an online business, or lab test results. For example, in a table that holds employee records of teachers at a school, the tuples might have attributes like name, subjects, start_date, and so on.
When creating columns, you specify a data type that dictates what kind of entries are allowed in that column. RDBMSs often implement their own unique data types, which may not be directly interchangeable with similar data types in other systems. Some common data types include dates, strings, integers, and Booleans.
In the relational model, each table contains at least one column that can be used to uniquely identify each row, called a primary key. This is important, because it means that users don’t need to know where their data is physically stored on a machine; instead, their DBMS can keep track of each record and return them on an ad hoc basis. In turn, this means that records have no defined logical order, and users have the ability to return their data in whatever order or through whatever filters they wish.
If you have two tables that you’d like to associate with one another, one way you can do so is with a foreign key. A foreign key is essentially a copy of one table’s (the “parent” table) primary key inserted into a column in another table (the “child”). The following example highlights the relationship between two tables, one used to record information about employees at a company and another used to track the company’s sales. In this example, the primary key of the EMPLOYEES table is used as the foreign key of the SALES table:
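As a rough sketch of that relationship using SQLite, which ships with Python's standard library (the exact columns and values are illustrative):

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when enabled

    conn.execute("""
        CREATE TABLE EMPLOYEES (
            emp_id INTEGER PRIMARY KEY,
            name   TEXT NOT NULL
        )""")
    conn.execute("""
        CREATE TABLE SALES (
            sale_id INTEGER PRIMARY KEY,
            emp_id  INTEGER NOT NULL REFERENCES EMPLOYEES(emp_id),
            amount  REAL
        )""")

    conn.execute("INSERT INTO EMPLOYEES (emp_id, name) VALUES (1, 'Maria Gonzalez')")
    conn.execute("INSERT INTO SALES (emp_id, amount) VALUES (1, 249.90)")  # valid: parent row exists

    try:
        # emp_id 99 does not exist in EMPLOYEES, so the foreign key check rejects the row
        conn.execute("INSERT INTO SALES (emp_id, amount) VALUES (99, 10.00)")
    except sqlite3.IntegrityError as err:
        print("Rejected:", err)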
If you try to add a record to the child table and the value entered into the foreign key column doesn’t exist in the parent table’s primary key, the insertion statement will be invalid. This helps to maintain relationship-level integrity, as the rows in both tables will always be related correctly.
The relational model’s structural elements help to keep data stored in an organized way, but storing data is only useful if you can retrieve it. To retrieve information from an RDBMS, you can issue a query, or a structured request for a set of information. As mentioned previously, most relational databases use SQL to manage and query data. SQL allows you to filter and manipulate query results with a variety of clauses, predicates, and expressions, giving you fine control over what data will appear in the result set.
With the underlying organizational structure of relational databases in mind, let’s consider some of their advantages and disadvantages.
Today, both SQL and the databases that implement it deviate from Codd’s relational model in several ways. For instance, Codd’s model dictates that each row in a table should be unique while, for reasons of practicality, most modern relational databases do allow for duplicate rows. There are some that don’t consider SQL databases to be true relational databases if they fail to adhere to each of Codd’s specifications for the relational model. In practical terms, though, any DBMS that uses SQL and at least somewhat adheres to the relational model is likely to be referred to as a relational database management system.
Although relational databases quickly grew in popularity, a few of the relational model’s shortcomings started to become apparent as data became more valuable and businesses began storing more of it. For one thing, it can be difficult to scale a relational database horizontally. Horizontal scaling, or scaling out, is the practice of adding more machines to an existing stack in order to spread out the load and allow for more traffic and faster processing. This is often contrasted with vertical scaling which involves upgrading the hardware of an existing server, usually by adding more RAM or CPU.
The reason it’s difficult to scale a relational database horizontally has to do with the fact that the relational model is designed to ensure consistency, meaning clients querying the same database will always retrieve the same data. If you were to scale a relational database horizontally across multiple machines, it becomes difficult to ensure consistency since clients may write data to one node but not the others. There would likely be a delay between the initial write and the time when the other nodes are updated to reflect the changes, resulting in inconsistencies between them.
Another limitation presented by RDBMSs is that the relational model was designed to manage structured data, or data that aligns with a predefined data type or is at least organized in some predetermined way, making it easily sortable and searchable. With the spread of personal computing and the rise of the internet in the early 1990s, however, unstructured data — such as email messages, photos, videos, etc. — became more common.
None of this is to say that relational databases aren’t useful. Quite the contrary, the relational model is still the dominant framework for data management after over 40 years. Their prevalence and longevity mean that relational databases are a mature technology, which is itself one of their major advantages. There are many applications designed to work with the relational model, as well as many career database administrators who are experts when it comes to relational databases. There’s also a wide array of resources available in print and online for those looking to get started with relational databases.
Another advantage of relational databases is that almost every RDBMS supports transactions. A transaction consists of one or more individual SQL statements performed in sequence as a single unit of work. Transactions present an all-or-nothing approach, meaning that every SQL statement in the transaction must be valid; otherwise, the entire transaction will fail. This is very helpful for ensuring data integrity when making changes to multiple rows or tables.
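A brief sketch of that all-or-nothing behavior, again using SQLite for illustration: because the second statement violates a constraint, the whole unit of work is rolled back and neither row is stored.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE SALES (sale_id INTEGER PRIMARY KEY, amount REAL NOT NULL)")

    try:
        with conn:  # opens a transaction; commits on success, rolls back on error
            conn.execute("INSERT INTO SALES (amount) VALUES (100.0)")
            conn.execute("INSERT INTO SALES (amount) VALUES (NULL)")  # violates NOT NULL
    except sqlite3.IntegrityError:
        pass

    # Neither row was kept: the failed statement invalidated the entire transaction.
    print(conn.execute("SELECT COUNT(*) FROM SALES").fetchone()[0])  # 0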
Lastly, relational databases are extremely flexible. They’ve been used to build a wide variety of different applications, and continue working efficiently even with very large amounts of data. SQL is also extremely powerful, allowing you to add and change data on the fly, as well as alter the structure of database schemas and tables without impacting existing data.
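For example (a minimal SQLite sketch), adding a new column to a table that already holds rows leaves the existing data untouched:

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE EMPLOYEES (emp_id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO EMPLOYEES (name) VALUES ('Maria Gonzalez')")

    # Existing rows are preserved; the new column is simply NULL for them.
    conn.execute("ALTER TABLE EMPLOYEES ADD COLUMN start_date TEXT")
    print(conn.execute("SELECT emp_id, name, start_date FROM EMPLOYEES").fetchall())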
Thanks to their flexibility and design for data integrity, relational databases are still the primary way data are managed and stored more than fifty years after they were first conceived of. Even with the rise of various NoSQL databases in recent years, understanding the relational model and how to work with RDBMSs are key for anyone who wants to build applications that harness the power of data.
To learn more about a few popular open-source RDBMSs, we encourage you to check out our comparison of various open-source relational SQL databases. If you’re interested in learning more about databases generally, we encourage you to check out our complete library of database-related content.
Configuration management is the process of handling changes to a system in a way that maintains its integrity over time, typically involving tools and processes that facilitate automation and observability. Even though this concept didn't originate in the IT industry, the term is broadly used to refer to server configuration management.
In the context of servers, configuration management is also commonly referred to as IT Automation or Server Orchestration. Both terms highlight the practical aspects of configuration management and the ability to control multiple systems from a central server.
This guide explains the benefits of using a configuration management tool to automate your server infrastructure setup, and how one such tool, Ansible, can help you with that.
There are a number of configuration management tools available on the market, with varying levels of complexity and different architectural styles. Although each of these tools has its own characteristics and works in slightly different ways, they all provide the same function: making sure that a system's state matches the state described by a set of provisioning scripts.
Many of the benefits of configuration management for servers come from the ability to define your infrastructure as code.
On top of that, configuration management tools offer a way to control anywhere from one to hundreds of servers from a centralized location, which can dramatically improve the efficiency and integrity of your server infrastructure.
Ansible is a modern configuration management tool that facilitates the task of setting up and maintaining remote servers, with a minimalist design intended to get users up and running quickly.
Users write Ansible provisioning scripts in YAML, a user-friendly data serialization standard that is not tied to any particular programming language. This enables users to create sophisticated provisioning scripts more intuitively compared to similar tools in the same category.
Ansible doesn't require any special software to be installed on the nodes that it will manage. A control machine is set up with the Ansible software, which then communicates with the nodes over standard SSH channels.
As a configuration management tool and automation framework, Ansible encapsulates all of the common features found in other tools of the same category, while maintaining a strong focus on simplicity and performance:
Ansible keeps track of the state of resources in managed systems in order to avoid repeating tasks that were executed before. If a package has already been installed, it won't try to install it again. The goal is that after each provisioning run the system reaches (or keeps) the desired state, even if you run it multiple times. This is what characterizes Ansible and other configuration management tools as having idempotent behavior. When running a playbook, you'll see the status of each task being executed and whether or not it performed a change on the system.
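The idea can be sketched in plain Python; this toy example is not how Ansible is implemented, but it shows why checking state first makes repeated runs safe:

    installed_packages = {"openssh-server"}  # pretend this is the current system state

    def ensure_installed(package):
        """Idempotent task: running it once or ten times leaves the same state."""
        if package in installed_packages:
            return "ok"           # already in the desired state, nothing to change
        installed_packages.add(package)
        return "changed"          # the task had to modify the system

    print(ensure_installed("nginx"))  # changed
    print(ensure_installed("nginx"))  # ok: repeated runs do not repeat the work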
When writing Ansible automation scripts, you can use variables, conditionals, and loops to make your automation more versatile and efficient.
Ansible collects a series of detailed facts about the managed nodes, such as their network interfaces and operating system, and exposes them as global variables called system facts. Facts can be used within playbooks to make your automation more versatile and adaptive, behaving differently depending on the system being provisioned.
Ansible uses the Jinja2 Python templating system to allow for dynamic expressions and access to variables. Templates can be used to facilitate setting up configuration files and services. For instance, you can use a template to set up a new virtual host within Apache, while reusing the same template for multiple server installations.
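A small sketch that uses the Jinja2 library directly (it needs the jinja2 package installed; the host names and paths are invented) shows how one template can be rendered for several servers:

    from jinja2 import Template  # pip install jinja2

    vhost_template = Template(
        "<VirtualHost *:80>\n"
        "    ServerName {{ server_name }}\n"
        "    DocumentRoot {{ document_root }}\n"
        "</VirtualHost>\n"
    )

    # The same template renders a different configuration file for each server.
    for host in ("example.com", "docs.example.com"):
        print(vhost_template.render(server_name=host, document_root="/var/www/" + host))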
Ansible ships with hundreds of built-in modules that simplify writing automation for common system administration tasks, such as installing packages with apt and synchronizing files with rsync, as well as for dealing with popular software such as database systems (MySQL, PostgreSQL, MongoDB, and others) and dependency management tools (such as PHP's composer, Ruby's gem, Node's npm, and others). Beyond that, there are various ways in which you can extend Ansible: plugins and modules are good options when you need custom functionality that isn't present by default.
You can also find third-party modules and plugins on the Ansible Galaxy portal.
Let's now have a look at Ansible terminology and concepts so that you're familiar with these terms as they appear throughout this series.
A control node is a system where Ansible is installed and set up to connect to your servers. You can have multiple control nodes, and any system capable of running Ansible can be set up as a control node, including personal computers or laptops running Linux or another Unix-based operating system. For the time being, Ansible can't be installed on Windows hosts, but you can work around this limitation by setting up a virtual machine that runs Linux and running Ansible from there.
The systems you control using Ansible are called managed nodes. Ansible requires that managed nodes be reachable via SSH and have Python 2 (version 2.6 or higher) or Python 3 (version 3.5 or higher) installed.
Ansible supports a variety of operating systems as managed nodes, including Windows servers.
An inventory file contains a list of the hosts you'll manage using Ansible. Although Ansible typically creates a default inventory file when installed, you can use per-project inventories to keep your infrastructure better separated and avoid running commands or playbooks on the wrong server by mistake. Static inventories are usually created as .ini files, but you can also use dynamically generated inventories, written in any programming language able to return JSON.
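As a rough sketch of that idea, a dynamic inventory can be any executable that prints JSON when Ansible calls it with --list; the group, host addresses, and variables below are invented:

    #!/usr/bin/env python3
    # Toy dynamic inventory: Ansible runs it with --list and reads JSON from stdout.
    import json
    import sys

    inventory = {
        "webservers": {
            "hosts": ["203.0.113.10", "203.0.113.11"],
            "vars": {"ansible_user": "deploy"},
        },
        "_meta": {"hostvars": {}},
    }

    if "--list" in sys.argv:
        print(json.dumps(inventory))
    else:
        print(json.dumps({}))  # per-host details would be returned for --host <name>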
In Ansible, a task is an individual unit of work to execute on a managed node. Each action to perform is defined as a task. Tasks can be executed as a one-off action via ad-hoc commands, or included in a playbook as part of an automation script.
A playbook contains an ordered list of tasks, along with a few other directives that indicate which hosts are the target of the automation and whether or not a privilege escalation system should be used to run those tasks, plus optional sections for defining variables or including files. Ansible executes tasks sequentially, and a full playbook execution is called a play. Playbooks are written in the YAML format.
Handlers are used to perform actions on a service, such as restarting or stopping a service that is actively running on the managed node's system. Handlers are typically triggered by tasks, and their execution happens at the end of a play, once all tasks are finished. That way, if more than one task triggers a restart of a service, for example, the service is restarted only once, after all the tasks have run. Although the default handler behavior is more efficient, it is also possible to force immediate handler execution if a task requires it.
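To make that shape concrete, the following sketch builds a single play as a Python structure and prints it as YAML (it needs the PyYAML package; the hosts pattern, package, and handler names are illustrative):

    import yaml  # pip install pyyaml

    playbook = [
        {
            "hosts": "webservers",   # which managed nodes the play targets
            "become": True,          # use privilege escalation for the tasks
            "tasks": [
                {
                    "name": "Install Nginx",
                    "apt": {"name": "nginx", "state": "present"},
                    "notify": ["Restart Nginx"],  # triggers the handler below
                },
            ],
            "handlers": [
                {
                    "name": "Restart Nginx",      # runs once, at the end of the play
                    "service": {"name": "nginx", "state": "restarted"},
                },
            ],
        },
    ]

    print(yaml.safe_dump(playbook, sort_keys=False))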
A role is a set of playbooks and related files organized into a predefined structure that is known to Ansible. Roles make it easier to reuse and repurpose playbooks as shareable packages of granular automation for specific goals, such as installing a web server, setting up a PHP environment, or configuring a MySQL server.
Ansible is a minimalist IT automation tool with a gentle learning curve, thanks in part to its use of YAML for provisioning scripts. It has a large number of built-in modules that can be used to abstract tasks such as installing packages and working with templates. Its simplified infrastructure requirements and accessible syntax can be a good fit for those who are just getting started with configuration management.
In the next guides of this series, we'll look at how to install Ansible and get started with it on an Ubuntu 20.04 server.
Na próxima parte desta série, veremos como instalar e começar com o Ansible em um servidor Ubuntu 20.04.
Configuration management is the process of handling changes to a system in a way that assures integrity over time, typically involving tools and processes that facilitate automation and observability. Even though this concept didn't originate in the IT industry, the term is broadly used to refer to server configuration management.
In the context of servers, configuration management is also commonly referred to as IT Automation or Server Orchestration. Both terms highlight the practical aspects of configuration management and the ability to control multiple systems from a central server.
This guide will walk you through the benefits of using a configuration management tool to automate your server infrastructure setup, and how one such tool, Ansible, can help you with that.
There are a number of configuration management tools available on the market, with varying levels of complexity and diverse architectural styles. Although each of these tools has its own characteristics and works in slightly different ways, they all provide the same function: make sure a system's state matches the state described by a set of provisioning scripts.
Many of the benefits of configuration management for servers come from the ability to define your infrastructure as code. This enables you to quickly provision new servers, recover from critical events faster, keep your server setup under version control, and reliably replicate environments for development and testing.
Additionally, configuration management tools offer you a way to control one to hundreds of servers from a centralized location, which can dramatically improve efficiency and integrity of your server infrastructure.
Ansible is a modern configuration management tool that facilitates the task of setting up and maintaining remote servers, with a minimalist design intended to get users up and running quickly.
Users write Ansible provisioning scripts in YAML, a user-friendly data serialization standard that is not tied to any particular programming language. This enables users to create sophisticated provisioning scripts more intuitively compared to similar tools in the same category.
Ansible doesn’t require any special software to be installed on the nodes that will be managed with this tool. A control machine is set up with the Ansible software, which then communicates with the nodes via standard SSH.
As a configuration management tool and automation framework, Ansible encapsulates all of the common features present in other tools of the same category, while still maintaining a strong focus on simplicity and performance:
Ansible keeps track of the state of resources in managed systems in order to avoid repeating tasks that were executed before. If a package was already installed, it won’t try to install it again. The objective is that after each provisioning execution the system reaches (or keeps) the desired state, even if you run it multiple times. This is what characterizes Ansible and other configuration management tools as having an idempotent behavior. When running a playbook, you’ll see the status of each task being executed and whether or not the task performed a change in the system.
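As a minimal sketch of what that looks like in practice (the host group and package name are illustrative), a playbook with a single apt task reports "changed" on the first run, when the package is actually installed, and "ok" on later runs:

```
---
- hosts: webservers        # illustrative inventory group
  become: true             # escalate privileges for package installation
  tasks:
    - name: Ensure Nginx is installed
      apt:
        name: nginx
        state: present     # already satisfied on re-runs, so nothing changes
```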
When writing Ansible automation scripts, you can use variables, conditionals, and loops in order to make your automation more versatile and efficient.
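For instance, a variable holding a list of package names can be combined with a loop so that one task covers them all; the package list below is just an example:

```
- hosts: all
  become: true
  vars:
    packages:
      - git
      - curl
      - unzip
  tasks:
    - name: Install common utility packages
      apt:
        name: "{{ item }}"
        state: present
      loop: "{{ packages }}"
```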
Ansible collects a series of detailed information about the managed nodes, such as network interfaces and operating system, and provides it as global variables called system facts. Facts can be used within playbooks to make your automation more versatile and adaptive, behaving differently depending on the system being provisioned.
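Facts appear as variables such as ansible_facts['os_family'], so a task can be skipped or adjusted depending on the target system. A brief sketch of a conditional task:

```
- name: Install Apache only on Debian-based systems
  apt:
    name: apache2
    state: present
  when: ansible_facts['os_family'] == "Debian"
```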
Ansible uses the Jinja2 Python templating system to allow for dynamic expressions and access to variables. Templates can be used to facilitate setting up configuration files and services. For instance, you can use a template to set up a new virtual host within Apache, while reusing the same template for multiple server installations.
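A rough sketch of that Apache example follows; the variable names, template filename, and paths are assumptions made for illustration. The template itself is a Jinja2 file with placeholders:

```
<VirtualHost *:80>
    ServerName {{ server_name }}
    DocumentRoot {{ document_root }}
</VirtualHost>
```

A task then renders it onto the managed node with the template module:

```
- name: Render the Apache virtual host configuration
  template:
    src: vhost.conf.j2
    dest: "/etc/apache2/sites-available/{{ server_name }}.conf"
```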
Ansible comes with hundreds of built-in modules to facilitate writing automation for common systems administration tasks, such as installing packages with apt and synchronizing files with rsync, and also for dealing with popular software such as database systems (like MySQL, PostgreSQL, MongoDB, and others) and dependency management tools (such as PHP's composer, Ruby's gem, Node's npm, and others). Apart from that, there are various ways in which you can extend Ansible: plugins and modules are good options when you need a custom functionality that is not present by default.
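To give a feel for that breadth, the sketch below uses two modules from the community.mysql and community.general collections, which may need to be installed separately; the database and package names are placeholders:

```
- name: Create an application database
  community.mysql.mysql_db:
    name: app_db
    state: present

- name: Install a Node.js package globally
  community.general.npm:
    name: pm2
    global: true
```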
You can also find third-party modules and plugins in the Ansible Galaxy portal.
We’ll now have a look at Ansible terminology and concepts to help familiarize you with these terms as they come up throughout this series.
A control node is a system where Ansible is installed and set up to connect to your server. You can have multiple control nodes, and any system capable of running Ansible can be set up as a control node, including personal computers or laptops running a Linux or Unix based operating system. For the time being, Ansible can’t be installed on Windows hosts, but you can circumvent this limitation by setting up a virtual machine that runs Linux and running Ansible from there.
The systems you control using Ansible are called managed nodes. Ansible requires that managed nodes are reachable via SSH, and have Python 2 (version 2.6 or higher) or Python 3 (version 3.5 or higher) installed.
Ansible supports a variety of operating systems including Windows servers as managed nodes.
An inventory file contains a list of the hosts you'll manage using Ansible. Although Ansible typically creates a default inventory file when installed, you can use per-project inventories to have a better separation of your infrastructure and avoid running commands or playbooks on the wrong server by mistake. Static inventories are usually created as .ini files, but you can also use dynamically generated inventories written in any programming language able to return JSON.
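A minimal static inventory in that .ini style might group hosts like this; all hostnames and group names here are placeholders:

```
[webservers]
web1.example.com
web2.example.com

[databases]
db1.example.com

[all:vars]
ansible_user=sammy
```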
In Ansible, a task is an individual unit of work to execute on a managed node. Each action to perform is defined as a task. Tasks can be executed as a one-off action via ad-hoc commands, or included in a playbook as part of an automation script.
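As an example of the ad-hoc form, the following one-off command (the inventory path is an assumption) runs the ping module against every host in the inventory to confirm they are reachable:

```
ansible all -i inventory.ini -m ping
```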
A playbook contains an ordered list of tasks, and a few other directives to indicate which hosts are the target of that automation, whether or not to use a privilege escalation system to run those tasks, and optional sections to define variables or include files. Ansible executes tasks sequentially, and a full playbook execution is called a play. Playbooks are written in YAML format.
Handlers are used to perform actions on a service, such as restarting or stopping a service that is actively running on the managed node’s system. Handlers are typically triggered by tasks, and their execution happens at the end of a play, after all tasks are finished. This way, if more than one task triggers a restart to a service, for instance, the service will only be restarted once and after all tasks are executed. Although the default handler behavior is more efficient and overall a better practice, it is also possible to force immediate handler execution if that is required by a task.
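Tying playbooks and handlers together, here is a sketch in which a task notifies a handler; even if several tasks notified it, the handler would still run only once, at the end of the play. The host group, template name, and paths are illustrative:

```
---
- hosts: webservers
  become: true
  vars:
    server_name: example.com
  tasks:
    - name: Deploy the Apache virtual host file
      template:
        src: vhost.conf.j2
        dest: /etc/apache2/sites-available/000-default.conf
      notify: Restart Apache

  handlers:
    - name: Restart Apache
      service:
        name: apache2
        state: restarted
```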
A role is a set of playbooks and related files organized into a predefined structure that is known by Ansible. Roles facilitate reusing and repurposing playbooks into shareable packages of granular automation for specific goals, such as installing a web server, installing a PHP environment, or setting up a MySQL server.
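A role is simply that kind of content arranged in a directory layout Ansible knows how to look up. A conventional (not exhaustive) layout, and a playbook applying the role, might look like this; the role name is a placeholder:

```
roles/
  webserver/
    tasks/main.yml           # tasks the role executes
    handlers/main.yml        # handlers the role defines
    templates/vhost.conf.j2  # templates the role renders
    defaults/main.yml        # default variables with the lowest precedence
```

```
- hosts: webservers
  become: true
  roles:
    - webserver
```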
Ansible is a minimalist IT automation tool that has a gentle learning curve, thanks in part to its use of YAML for its provisioning scripts. It has a great number of built-in modules that can be used to abstract tasks such as installing packages and working with templates. Its simplified infrastructure requirements and accessible syntax can be a good fit for those who are getting started with configuration management.
In the next part of this series, we’ll see how to install and get started with Ansible on an Ubuntu 20.04 server.
In a broader sense, configuration management (CM) refers to the process of systematically handling changes to a system in a way that maintains its integrity over time. Even though this process didn't originate in the IT industry, the term is broadly used to refer to server configuration management.
Automation plays an essential role in server configuration management. It is the mechanism used to bring a server to a desired state, previously defined by provisioning scripts using a tool's specific language and features. Automation is, in fact, the heart of server configuration management, which is why it is also common to refer to configuration management tools as Automation Tools or IT Automation Tools.
Another common term used to describe the automation capabilities implemented by configuration management tools is Server Orchestration or IT Orchestration, since these tools are typically able to manage anywhere from one to hundreds of servers from a central controller machine.
There are several configuration management tools available on the market. Puppet, Ansible, Chef, and Salt are popular choices. Although each tool has its own characteristics and works in slightly different ways, they are all driven by the same purpose: to make sure the system's state matches the state described by your provisioning scripts.
Even though using configuration management typically requires more initial planning and effort than manual system administration, all but the simplest server infrastructures will be improved by the benefits it provides. To name a few:
Whenever a new server needs to be deployed, a configuration management tool can automate most, if not all, of the provisioning process for you. Automation makes provisioning much faster and more efficient, because it allows tedious tasks to be performed more quickly and accurately than any human could. Even with proper and thorough documentation, manually deploying a web server, for example, can take hours, compared to a few minutes with configuration management and automation.
With fast provisioning comes another benefit: quick recovery from critical events. When a server goes offline due to unknown circumstances, it can take several hours to properly audit the system and figure out what really happened. In scenarios like this, deploying a replacement server is usually the safest way to get your services back online while a detailed inspection is carried out on the affected server. With configuration management and automation, this can be done quickly and reliably.
At first glance, manual system administration may seem like an easy way to deploy and quickly patch servers, but it often comes at a price. Over time, it can become extremely difficult to know exactly what is installed on a server and which changes were made when the process isn't automated. Manual hotfixes, configuration tweaks, and software upgrades can turn servers into unique snowflakes that are hard to manage and even harder to replicate. When a CM tool is used, the procedure required to bring up a new server or update an existing one is documented in the provisioning scripts.
Once your server setup is translated into a set of provisioning scripts, you can apply to your server environment many of the tools and workflows you normally use for software source code.
Version control tools such as Git can be used to track changes made to the provisioning and to keep separate branches for older versions of the scripts. You can also use version control to implement a code review policy for the provisioning scripts, where all changes must be submitted as a pull request and approved by the project lead before being accepted. This practice adds extra consistency to your infrastructure setup, as shown in the sketch below.
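As a small illustration of that workflow (the branch and file names are hypothetical), provisioning scripts can live in a Git repository and every change can go through a reviewed branch:

```
git checkout -b update-apache-template      # start a branch for the change
git add templates/vhost.conf.j2             # stage the modified provisioning file
git commit -m "Tune the Apache virtual host template"
git push origin update-apache-template      # then open a pull request for review
```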
Configuration management makes it trivial to replicate environments with exactly the same software and configurations. This lets you effectively build a multi-stage ecosystem with production, development, and testing servers. You can even use local virtual machines for development, built with the same provisioning scripts. This practice minimizes problems caused by environment discrepancies, which frequently occur when applications are deployed to production or shared between coworkers with different machine setups (different operating systems, software versions, and/or configurations).
Although each CM tool has its own terminology, philosophy, and ecosystem, they generally share many characteristics and rely on similar concepts.
Most configuration management tools use a controller/master and node/agent model. Essentially, the controller directs the configuration of the nodes, based on a series of instructions or tasks defined in your provisioning scripts.
Below you'll find the most common features present in most configuration management tools for servers:
Each configuration management tool provides a specific syntax and a set of features you can use to write provisioning scripts. Most tools have features that make their language similar to conventional programming languages, but in a simplified way. Variables, loops, and conditionals are common features provided to facilitate the creation of more versatile provisioning scripts.
Configuration management tools keep track of the state of resources in order to avoid repeating tasks that were executed before. If a package is already installed, the tool won't try to install it again. The objective is that after each provisioning run the system reaches (or keeps) the desired state, even if you run it multiple times. This is what characterizes these tools as having an idempotent behavior. That behavior is not necessarily enforced in all cases, however.
Configuration management tools typically provide detailed information about the system being provisioned. This data is available through global variables known as facts. They include things such as network interfaces, IP addresses, operating system, and distribution. Each tool provides a different set of facts. They can be used to make provisioning scripts and templates more adaptable to multiple systems.
Most CM tools provide a built-in templating system that can be used to facilitate the creation of configuration files and services. Templates usually support variables, loops, and conditionals that can be used to maximize versatility. For example, you can use a template to easily set up a new virtual host within Apache, while reusing the same template for multiple server installations. Instead of containing only static hard-coded values, a template should contain placeholders for values that can change from host to host, such as NameServer and DocumentRoot.
Although provisioning scripts can be highly specialized for the needs and demands of a specific server, there are many cases where you have similar server setups, or parts of a setup, that could be shared among multiple servers. Most provisioning tools provide ways in which you can easily reuse and share small chunks of your provisioning setup as modules or plugins.
Third-party modules and plugins are often easy to find online, especially for common server setups such as installing a PHP web server. CM tools tend to have a strong community built around them, and users are encouraged to share their custom extensions. Using extensions provided by other users can save you a lot of time, as well as serve as an excellent way to learn how other users solved common problems with the tool of your choice.
There are many CM tools available on the market, each with a different set of features and different levels of complexity. Popular choices include Chef, Ansible, and Puppet. The first challenge is to choose a tool that is a good fit for your needs.
There are a few things you should take into consideration before making a choice:
Most configuration management tools require a minimal hierarchy consisting of a controller machine and a node that will be managed by it. Puppet, for example, requires an agent application to be installed on each node and a master application to be installed on the controller machine. Ansible, on the other hand, has a decentralized structure that doesn't require installing additional software on the nodes, relying instead on SSH to execute the provisioning tasks. For smaller projects, a simplified infrastructure may seem like a better fit; however, it is important to take into consideration aspects such as scalability and security, which may not be enforced by the tool.
Some tools can have more components and moving parts, which may increase the complexity of your infrastructure, affect the learning curve, and possibly increase the overall cost of implementation.
As mentioned earlier in this article, CM tools provide a custom syntax, sometimes using a domain-specific language (DSL), and a set of features that make up their automation framework. As with conventional programming languages, some tools have a steeper learning curve to master. The infrastructure requirements can also influence the complexity of the tool and how quickly you'll be able to see a return on this investment.
Most CM tools offer free or open source versions, with paid subscriptions for advanced features and services. Some tools have more limitations than others, so depending on your specific needs and how your infrastructure grows, you may end up having to pay for these services. You should also consider training as a potential extra cost, not only in monetary terms but also regarding the time it will take to get your team up to speed with the tool you end up choosing.
As mentioned before, most tools offer paid services that can include support, extensions, and advanced tooling. It is important to analyze your specific needs, the size of your infrastructure, and whether or not there is a need to use these services. Management dashboards, for instance, are a common service offered by these tools, and they can greatly facilitate the process of managing and monitoring all of your servers from a central point. Even if you don't need such services yet, consider your options in case they become necessary in the future.
A strong and welcoming community can be extremely resourceful for support and documentation, since users are usually happy to share their knowledge and their extensions (modules, plugins, and provisioning scripts) with other users. This can help speed up your learning curve and avoid extra costs with paid support or training.
The table below should give you a quick overview of the main differences between the three most popular configuration management tools available on the market today: Ansible, Puppet, and Chef.
|  | Ansible | Puppet | Chef |
| --- | --- | --- | --- |
| Scripting language | YAML | Custom DSL based on Ruby | Ruby |
| Infrastructure | The controller machine applies configuration to the nodes via SSH | The Puppet Master synchronizes configuration to the Puppet Nodes | Chef Workstations push configuration to the Chef Server, from which the Chef Nodes are updated |
| Requires specialized software on the nodes | No | Yes | Yes |
| Provides a centralized point of control | No. Any computer can be a controller | Yes, via the Puppet Master | Yes, via the Chef Server |
| Script terminology | Playbooks / Roles | Manifests / Modules | Recipes / Cookbooks |
| Task execution order | Sequential | Non-sequential | Sequential |
So far, we've seen how configuration management works for servers and what to consider when choosing a tool to build your configuration management infrastructure. In the subsequent guides in this series, we'll get hands-on experience with three popular configuration management tools: Ansible, Puppet, and Chef.
So that you can compare these tools for yourself, we'll use a simple server setup that should be fully automated by each tool. That setup consists of an Ubuntu 18.04 server running Apache to host a simple web page.
Configuration management can dramatically improve the integrity of servers over time by providing a framework for automating processes and keeping track of changes made to the system environment. In the next guide in this series, we'll see how to implement a configuration management strategy in practice using Ansible as a tool.
So say you have a LAMP-type solution running on a 2vcpu/4GB droplet and you think it's time to get some more capacity. Doubling the size of the droplet is easy, but is that the right way to go? Spinning up a separate database server is another option, with some advantages and some disadvantages.
Advantages:
Disadvantages:
I’m split right down the middle on the advantages vs disadvantages and thought I might find some interesting and informative opinions in the DO community.
]]>Современные веб-сайты и приложения часто должны предоставлять большое количество статичного контента конечным пользователям. Это могут быть изображения, таблицы стилей, файлы JavaScript и видео. По мере роста количества и размера таких статичных активов, значительно увеличивается объем трафика, а также время загрузки страницы, что ухудшает опыт просмотра сайта или приложения для ваших пользователей и снижает количество свободных ресурсов сервера.
Чтобы заметно сократить время загрузки страницы, улучшить производительность и снизить объем трафика и расходы на инфраструктуру, вы можете реализовать CDN, или сеть доставки контента (content delivery network), чтобы кешировать подобные активы на распределенных в разных географических точках серверах.
В этом руководстве вы найдете достаточно общее описание сетей доставки контента и принципов их работы, а также преимущества, которые они могут предоставить вашим веб-приложениям.
A content delivery network is a geographically distributed group of servers optimized for delivering static content to end users. While this static content can be almost any sort of data, CDNs are most commonly used to deliver web pages and their related files, streaming video and audio, and large software packages.
A CDN consists of multiple points of presence (PoPs) in various locations, each made up of several edge servers that cache assets from your origin, or host server. When a user visits your website and requests static assets such as images or JavaScript files, the CDN routes those requests to the nearest edge server, which serves the content. If that edge server does not have the assets cached, or the cached assets have expired, the CDN fetches and caches the latest version from either another nearby CDN edge server or your origin servers. If the CDN edge does have a cache entry for your assets (which is the case most of the time if your website receives a moderate amount of traffic), it returns the cached copy to the end user.
This allows geographically dispersed users to minimize the number of hops needed to receive static content, fetching it directly from the cache of a nearby edge server. The result is significantly lower latency and packet loss, faster page load times, and a drastically reduced load on your origin infrastructure.
CDN providers often offer additional features such as DDoS mitigation and rate limiting, user analytics tools, and optimizations for streaming or mobile use cases, at an extra cost.
When a user visits your website, they first receive a response from a DNS server containing the IP address of your host web server. Their browser then requests the web page content, which often consists of a variety of static files such as HTML pages, CSS stylesheets, JavaScript code, and images.
Once you deploy a CDN and offload these static assets onto its servers, either by manually pushing them or by having the CDN pull them automatically (both mechanisms are covered in the next section), you instruct your web server to rewrite links to static content so that these links now point to the files hosted by the CDN. If you're using a CMS such as WordPress, this link rewriting can be implemented with a third-party plugin like CDN Enabler.
Many CDNs provide support for custom domains, allowing you to create a CNAME record under your domain that points to a CDN endpoint. When the CDN receives a user request at this endpoint (located at an edge server much closer to the user than your backend servers), it routes the request to the point of presence (PoP) closest to the user. This PoP often consists of one or more CDN edge servers collocated at an Internet exchange point (IxP), essentially a data center that Internet service providers use to interconnect their networks. The CDN's internal load balancer then routes the request to an edge server located at this PoP, which serves the content to the user.
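As a hypothetical illustration, if you created a CNAME record mapping cdn.example.com to your provider's endpoint, you could confirm the mapping with a standard DNS lookup; both names below are placeholders:
dig +short CNAME cdn.example.com
# expected to print the provider endpoint, for example:
# example.cdn-provider.net.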
Caching mechanisms vary from one CDN provider to another, but they generally work as follows:
When the CDN receives its first request for a static asset, it does not yet have the asset cached, so it fetches a copy from a nearby edge server or from your origin; this cache miss can usually be identified by an HTTP response header containing X-Cache: MISS. This initial request will be slower than future requests because, once it completes, the asset is stored in the edge server's cache.
Subsequent requests for the asset are served from that cache until it expires, and those responses include the header X-Cache: HIT.
For more details on how a specific CDN works and is implemented, consult your CDN provider's documentation.
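If your provider exposes this header, a quick way to check whether a given asset is being served from the edge cache is to inspect the response headers directly; the URL below is a placeholder:
curl -I https://cdn.example.com/images/logo.png
# a cached response will typically include a header such as:
# X-Cache: HIT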
In the next section, we'll introduce the two popular types of CDNs: push and pull CDNs.
Most CDN providers offer two ways of caching your data: pull zones and push zones.
Pull zones involve entering your origin server's address and letting the CDN automatically fetch and cache all the static resources available on your site. Pull zones are commonly used to deliver frequently updated, small to medium-sized web assets such as HTML, CSS, and JavaScript files. After providing the CDN with your origin server's address, the next step is usually rewriting links to static assets so that they point to the URL provided by the CDN. From that point on, the CDN handles your users' incoming asset requests and serves content from its geographically distributed caches and your origin as appropriate.
To use a push zone, you upload your data to a designated bucket or storage location, which the CDN then pushes out to caches on its distributed fleet of edge servers. Push zones are typically used for larger, infrequently changing files such as archives, software packages, PDFs, video, and audio.
Almost any site can benefit from deploying a CDN, but the main reasons for implementing one are usually to offload bandwidth from your origin servers onto the CDN's servers and to reduce latency for geographically distributed users.
We'll go over these and several other major benefits of using a CDN below.
If you're close to hitting the bandwidth capacity of your servers, offloading static assets such as images, videos, and CSS and JavaScript files will drastically reduce the load on them. Content delivery networks are designed and optimized for serving static content, and client requests for this content will be routed to and served by CDN edge servers. This has the added benefit of reducing load on your origin servers, as they then serve this data at a much lower frequency.
If your user base is geographically dispersed and a non-trivial portion of your traffic comes from a distant geographic area, a CDN can decrease latency by caching static assets on edge servers closer to your users. By reducing the distance between your users and static content, you can deliver content to them faster and improve their experience through higher page load speeds.
These benefits are compounded for websites that primarily serve bandwidth-intensive video content, where high latency and slow load times have a direct impact on user experience and engagement.
A CDN lets you handle traffic spikes and bursts by load balancing requests across a large, distributed network of edge servers. By offloading and caching static content on a delivery network, you can accommodate a larger number of simultaneous users with your existing infrastructure.
For websites using a single origin server, these traffic spikes can often overwhelm the system, causing unplanned outages and downtime. Shifting traffic onto a highly available, redundant infrastructure designed to handle variable levels of web traffic can increase the availability of your assets and content.
Since serving static content usually makes up the majority of your bandwidth usage, offloading these assets onto a content delivery network can significantly reduce your monthly infrastructure costs. In addition to lowering bandwidth costs, a CDN can reduce server expenses by decreasing the load on your origin servers, allowing your existing infrastructure to scale further. Finally, some CDN providers offer fixed-price monthly billing, letting you turn variable bandwidth costs into a stable, predictable recurring spend.
Another common use case for CDNs is DDoS mitigation. Many CDN providers offer features to monitor and filter requests to edge servers. These services analyze web traffic for suspicious patterns, blocking malicious attack traffic while continuing to let legitimate users through. CDN providers typically offer a range of DDoS mitigation services, from common attack protection at the infrastructure level (OSI layers 3 and 4) to more advanced mitigation services and rate limiting.
In addition, most CDNs let you configure full SSL, so you can encrypt traffic between the CDN and the end user, as well as traffic between the CDN and your origin servers, using either CDN-provided or custom SSL certificates.
If your bottleneck is CPU load on your origin server rather than bandwidth, a CDN may not be the most appropriate solution. In that case, local caching using popular caches such as NGINX or Varnish can significantly reduce load by serving assets from system memory.
Beyond deploying a CDN, additional optimizations such as minifying and compressing JavaScript and CSS files, or enabling HTTP request compression, can also have a significant impact on page load times and bandwidth usage.
Google's PageSpeed Insights is a useful tool for measuring and improving page load speed. Another helpful tool that provides a waterfall breakdown of request and response timings, as well as suggested optimizations, is Pingdom.
A content delivery network can be a quick and effective solution for improving the scalability and availability of your websites. By caching static assets on a geographically distributed network of optimized servers, you can greatly reduce page load times and latency for end users. CDNs also let you significantly reduce bandwidth usage by absorbing user requests and answering them from edge caches, cutting the amount of data transferred and your infrastructure costs.
With plugins and third-party support for major frameworks such as WordPress, Drupal, Django, and Ruby on Rails, as well as additional features like DDoS mitigation, full SSL, user monitoring, and asset compression, CDNs can be a useful tool for securing and optimizing high-traffic websites.
The Domain Name System (DNS) is a system for associating various types of information, such as IP addresses, with easy-to-remember names. By default, most Kubernetes clusters automatically configure an internal DNS service to provide a lightweight mechanism for service discovery. Built-in service discovery makes it easier for applications to find and communicate with each other on Kubernetes clusters, even as pods and services are created, deleted, and moved between nodes.
The implementation details of the Kubernetes DNS service have changed in recent versions of Kubernetes. In this article, we will look at both the kube-dns and CoreDNS versions of the Kubernetes DNS service. We will review how they operate and the DNS records that Kubernetes generates.
To get a better understanding of how DNS works before you begin, read "An Introduction to DNS Terminology, Components, and Concepts." For any Kubernetes topics you may be unfamiliar with, you can read "An Introduction to Kubernetes."
Prior to Kubernetes version 1.11, the Kubernetes DNS service was based on kube-dns. Version 1.11 introduced CoreDNS to address some stability and security concerns with kube-dns.
Regardless of which software handles the actual DNS records, both implementations work in a similar way:
A service named kube-dns and one or more pods are created.
The kube-dns service listens for service and endpoint events from the Kubernetes API and updates its DNS records as needed. These events are triggered when you create, update, or delete Kubernetes services and their associated pods.
kubelet sets each new pod's /etc/resolv.conf nameserver option to the cluster IP of the kube-dns service, with appropriate search options to allow shorter hostnames to be used:
nameserver 10.32.0.10
search namespace.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
Applications running in containers can then resolve hostnames such as example-service.namespace into the correct cluster IP addresses.
The full DNS A record of a Kubernetes service will look like the following example:
service.namespace.svc.cluster.local
A pod would have a record in this format, reflecting the actual IP address of the pod:
10.32.0.125.namespace.pod.cluster.local
Additionally, SRV records are created for a Kubernetes service's named ports:
_port-name._protocol.service.namespace.svc.cluster.local
The result is a built-in, DNS-based service discovery mechanism, where your application or microservice can target a simple, consistent hostname to access other services or pods on the cluster.
Because of the search domain suffixes listed in the resolv.conf file, you often won't need to use the full hostname to contact another service. If you're addressing a service in the same namespace, you can use just the service name to contact it:
other-service
If the service is in a different namespace, add it to the query:
other-service.other-namespace
If you're targeting a pod, you'll need to use at least the following:
pod-ip.other-namespace.pod
As we saw in the default resolv.conf file, only the .svc suffixes are completed automatically, so make sure you specify everything up to .pod.
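To see these lookups in practice, you can run them from inside any running pod, assuming its container image ships a DNS utility such as nslookup; the pod and service names below are placeholders:
kubectl exec -it example-pod -- nslookup other-service
kubectl exec -it example-pod -- nslookup other-service.other-namespace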
Now that we know the practical uses of the Kubernetes DNS service, let's take a closer look at the two different implementations.
As mentioned in the previous section, Kubernetes version 1.11 introduced new software to handle the kube-dns service. The change was made to improve the performance and security of the service. Let's look at the original kube-dns implementation first.
The kube-dns service prior to Kubernetes 1.11 is made up of three containers running in a kube-dns pod in the kube-system namespace. The three containers are:
Security vulnerabilities in Dnsmasq, and scaling performance issues with SkyDNS, led to the creation of a replacement system: CoreDNS.
As of Kubernetes 1.11, CoreDNS is the primary Kubernetes DNS service. This means it is ready for production use and will act as the default cluster DNS service for many installation tools and managed Kubernetes providers.
CoreDNS is a single process, written in Go, that covers all of the functionality of the previous system. A single container resolves and caches DNS queries, responds to health checks, and provides metrics.
In addition to addressing the performance and security issues, CoreDNS fixes some minor bugs and adds new features:
The autopath plugin can improve DNS response times when resolving external hostnames by being smarter about iterating through each of the search domain suffixes listed in resolv.conf.
With kube-dns, 10.32.0.125.namespace.pod.cluster.local would always resolve to 10.32.0.125, even if that pod does not actually exist. CoreDNS has a verified-pods mode that resolves successfully only if a pod exists with the right IP address and in the right namespace.
For more information about CoreDNS and how it differs from kube-dns, you can read the Kubernetes CoreDNS GA announcement.
Kubernetes operators often need to customize how their pods and containers resolve certain custom domains, or to adjust the upstream nameservers or search domain suffixes configured in resolv.conf. You can do this with the dnsConfig option of your pod's spec:
apiVersion: v1
kind: Pod
metadata:
  namespace: example
  name: custom-dns
spec:
  containers:
    - name: example
      image: nginx
  dnsPolicy: "None"
  dnsConfig:
    nameservers:
      - 203.0.113.44
    searches:
      - custom.dns.local
When this configuration option is updated, the pod's resolv.conf file is rewritten to apply the changes. The configuration maps directly to the standard resolv.conf options, so the configuration above would create a file with nameserver 203.0.113.44 and search custom.dns.local lines.
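As a quick check, you could apply a manifest like the one above and then read the generated file from inside the pod; the manifest file name here is arbitrary:
kubectl apply -f custom-dns.yaml
kubectl exec -n example custom-dns -- cat /etc/resolv.conf
# should show nameserver 203.0.113.44 and search custom.dns.local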
In this article, we covered the basics of what the Kubernetes DNS service provides to developers, showed some example DNS records for services and pods, discussed how the system is implemented in different Kubernetes versions, and highlighted some additional configuration options available to customize how your pods resolve DNS queries.
For more information about the Kubernetes DNS service, refer to the official Kubernetes documentation on DNS for Services and Pods.
Deploying applications to Kubernetes, the powerful and popular container orchestration system, can be complex. Setting up a single application can involve creating multiple interdependent Kubernetes resources, such as pods, services, deployments, and replica sets, each requiring a detailed YAML manifest file.
Helm is a package manager for Kubernetes that makes it easier for developers and operators to package, configure, and deploy applications and services onto Kubernetes clusters.
Helm is now an official Kubernetes project and is part of the Cloud Native Computing Foundation, a non-profit that supports open source projects in and around the Kubernetes ecosystem.
In this article, we will give an overview of Helm and the various abstractions it uses to simplify deploying applications to Kubernetes. If you are new to Kubernetes, read "An Introduction to Kubernetes" first to familiarize yourself with the basic concepts.
Most every programming language and operating system has its own package manager to help with the installation and maintenance of software. Helm provides the same basic feature set as many of the package managers you may already be familiar with, such as Debian's apt or Python's pip.
Helm can:
Helm provides this functionality through the following components:
A command-line client, helm, which provides the user interface to all Helm functionality.
A companion server component, tiller, which runs on your Kubernetes cluster, listens for commands from helm, and handles the configuration and deployment of software releases on the cluster.
Next, we'll take a closer look at the chart format.
Helm packages are called charts, and they consist of a few YAML configuration files and some templates that are rendered into Kubernetes manifest files. Here is the basic directory structure of a chart:
package-name/
  charts/
  templates/
  Chart.yaml
  LICENSE
  README.md
  requirements.yaml
  values.yaml
These directories and files have the following functions: chart dependencies are best linked dynamically through requirements.yaml, and the template files are combined with configuration values (from values.yaml and the command line) and rendered into Kubernetes manifests; the templates use the Go programming language's template format.
The helm command can install a chart from a local directory, or from a .tar.gz packaged version of this directory structure. These packaged charts can also be automatically downloaded and installed from chart repositories, or repos.
We'll look at chart repositories next.
A Helm chart repo is a simple HTTP site that serves an index.yaml file and packaged .tar.gz charts. The helm command has subcommands to help package charts and create the required index.yaml file. These files can be served by any web server, object storage service, or static site host such as GitHub Pages.
Helm comes preconfigured with a default chart repository, referred to as stable. This repo points to a Google Storage bucket at https://kubernetes-charts.storage.googleapis.com. The source for the stable repo can be found in the helm/charts Git repository on GitHub.
Alternate repos can be added with the helm repo add command. Some popular alternate repositories are:
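Whichever repository you choose, adding it and installing a chart from it follows the same pattern; in this hedged sketch the repository name, URL, and chart name are all placeholders:
helm repo add example-repo https://charts.example.com
helm repo update
helm search example-repo/
helm install example-repo/example-chart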
Whether you're installing a chart you've developed locally or one from a repo, you'll need to configure it for your particular setup. We'll look at configs next.
A chart usually comes with default configuration values in its values.yaml file. Some applications may be fully deployable with default values, but you'll typically need to override some of the configuration to meet your needs.
The values that are exposed for configuration are determined by the author of the chart. Some are used to configure Kubernetes primitives, and some may be passed through to the underlying container to configure the application itself.
Here is a snippet of some example values:
service:
  type: ClusterIP
  port: 3306
These are options to configure a Kubernetes Service resource. You can use helm inspect values chart-name to dump all of the available configuration values for a chart.
These values can be overridden by writing your own YAML file and using it when running helm install, or by setting options individually on the command line with the --set flag. You only need to specify the values that you want to change from the defaults.
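For the example values shown above, either approach might look roughly like the following; the chart name and file name are placeholders:
# dump the chart's defaults into a file you can edit, then install with it
helm inspect values example-repo/example-chart > custom-values.yaml
helm install -f custom-values.yaml example-repo/example-chart
# or override a single value directly on the command line
helm install --set service.port=3307 example-repo/example-chart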
A Helm chart deployed with a particular configuration is called a release. We'll talk about releases next.
During the installation of a chart, Helm combines the chart's templates with the configuration specified by the user and the defaults in values.yaml. These are rendered into Kubernetes manifests that are then deployed via the Kubernetes API. This creates a release: a specific configuration and deployment of a particular chart.
This concept of releases is important, because you may want to deploy the same application more than once on a cluster. For example, you may need multiple MySQL servers with different configurations.
You will also probably want to upgrade different instances of a chart individually. Perhaps one application is ready for an updated MySQL server while another is not. With Helm, you upgrade each release individually.
You might upgrade a release because its chart has been updated, or because you want to update the release's configuration. Either way, each upgrade creates a new revision of the release, and Helm lets you easily roll back to previous revisions if an issue comes up.
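In practice, an upgrade-and-rollback cycle could look roughly like the following; example-release stands in for whatever release name was assigned at install time:
helm upgrade example-release example-repo/example-chart --set service.port=3308
helm history example-release
# roll back to revision 1 if the upgrade misbehaves
helm rollback example-release 1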
If you can't find an existing chart for the software you're deploying, you may want to create your own. Helm can output the scaffold of a chart directory with helm create chart-name. This creates a folder with the files and directories we discussed in the charts section above.
From there, you fill out your chart's metadata in Chart.yaml and put your Kubernetes manifest files into the templates directory. You then extract the relevant configuration variables out of your manifests and into values.yaml, and include them back into your manifest templates using the templating system.
The helm command has many subcommands available to help you test, package, and serve your charts. For more information, see the official Helm documentation on developing charts.
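A minimal local workflow with a hypothetical chart name might look like:
helm create example-chart
helm lint example-chart
helm package example-chart
# produces example-chart-0.1.0.tgz, assuming the default version in Chart.yaml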
In this article, we reviewed Helm, the package manager for Kubernetes. We looked at the Helm architecture and its individual helm and tiller components, covered the Helm chart format, and gave an overview of chart repositories. We also looked at how to configure a Helm chart, how configurations and charts are combined and deployed as releases on Kubernetes clusters, and, finally, the basics of creating a chart when a suitable one isn't already available.
For more information about Helm, take a look at the official Helm documentation. Official Helm charts can be found in the official helm/charts Git repository on GitHub.
Continuous integration, delivery, and deployment (CI/CD) is an essential part of modern development, designed to reduce errors during integration and deployment and to increase project velocity. CI/CD is both a philosophy and a set of practices, often reinforced by robust tooling, that emphasizes automated testing at each stage of the software pipeline. By incorporating these practices, you can reduce the time required to integrate changes for a release and thoroughly test each change before moving it into production.
CI/CD has many potential benefits, but successful implementation often requires careful thought. Deciding exactly how to use the tools, and what changes your environment needs, can be difficult without trial and error. Still, while every implementation is different, following best practices helps you avoid common problems and improve your environment faster.
This guide offers basic guidance on implementing and maintaining a CI/CD system that best serves your organization's needs. We'll cover a number of practices that help improve the effectiveness of your CI/CD service. You can read it straight through or skip to the sections that interest you.
CI/CD pipelines help shepherd changes through automated testing cycles and out to staging and production environments. The more comprehensive your testing pipelines are, the more confident you can be that changes won't introduce unforeseen side effects when deployed to production. However, since every change must go through this process, keeping your pipelines fast and dependable is critical for maintaining development velocity.
The tension between these two requirements can be difficult to balance. There are some straightforward steps you can take to improve speed, such as scaling out your CI/CD infrastructure and optimizing tests. Over time, though, you may be forced to make critical decisions about the relative value of different tests and the stage or order in which they run. Sometimes, paring down your test suite by removing tests with low value or indeterminate results is the best way to maintain the speed the pipeline requires.
When making these significant decisions, make sure you understand and document the trade-offs. Consult with team members and stakeholders to align the team's assumptions about the goals of the test suite and to identify the areas that matter most.
From an operational security standpoint, your CI/CD system is some of the most critical infrastructure to protect. Since it has complete access to your codebase and credentials to deploy to various environments, securing it is essential for protecting internal data and guaranteeing the integrity of your site or product. Because of its high value as an attack target, it is important to isolate and lock down your CI/CD system as much as possible.
CI/CD systems should be deployed to internal, protected networks, unexposed to outside parties. Setting up VPNs or other network access control technology is recommended so that only authenticated operators can reach the system. Depending on the complexity of your network topology, your CI/CD system may need access to several different networks to deploy code to different environments. If those networks are not properly secured or isolated, attackers with access to one environment may be able to exploit weaknesses in your CI/CD servers to reach internal networks and other servers.
The isolation and protection strategies you need depend on your network topology, infrastructure, and management and development requirements. The important point is that your CI/CD systems are highly valuable targets, since they frequently have broad access to your other vital systems. Shielding external access to the servers and tightly controlling internal access will help reduce the risk of your CI/CD system being compromised.
Beyond the tooling it provides for enforcing testing and deployment best practices, a CI/CD system also helps streamline development and improve code quality. Distributing code through your CI/CD pipelines requires every change to follow your organization's codified standards and procedures. Any failure in a CI/CD pipeline is immediately visible and halts the advancement of that release to later stages of the cycle. This gatekeeping mechanism protects important environments from untrusted code.
To realize these advantages, you need the discipline to ensure that every change to your production environment goes through your pipeline. The CI/CD pipeline should be the only mechanism by which code enters production. This can happen automatically at the end of successful testing with continuous deployment practices, or through a manual promotion of tested changes approved and made available through your CI/CD system.
Frequently, teams start using their pipelines for deployment but begin making exceptions when problems occur and there is pressure to resolve them quickly. While downtime and other issues should be mitigated as soon as possible, it is important to understand that the CI/CD system is what ensures your changes do not introduce bugs, new failures, or other side effects. Putting your fix through the pipeline (or simply using the CI/CD system to roll back) also prevents the next deployment from erasing a hotfix that was applied directly to production. The pipeline protects the validity of both planned deployments and emergency fixes, which is yet another reason to keep it as fast as possible.
CI/CD pipelines promote changes through a series of test suites and deployment environments. Changes that pass the requirements of one stage are either automatically deployed or queued for manual deployment into more restrictive environments. Early stages are meant to prove that it is worthwhile to continue testing and pushing the changes closer to production.
For later stages especially, reproducing the production environment as closely as possible in the test environments helps ensure that tests accurately reflect how the change would behave in production. Significant differences between staging and production can allow problematic changes to be released that were never observed as faulty in testing. The more your live environment differs from the testing environment, the less your tests measure how the code will actually perform when released.
Some differences between staging and production are expected, but they need to be kept manageable and well understood. Some organizations use blue-green deployments to swap production traffic between two nearly identical environments that alternate between the roles of production and staging. Less extreme strategies use the same configuration and infrastructure in staging as in production, but at a reduced scale. Items such as network endpoints may differ between environments, but parameterizing this type of variable data helps keep the code consistent and the differences between environments clearly defined.
The main goal of a CI/CD pipeline is to build confidence in your changes and minimize the chance of unexpected impact. We discussed the importance of maintaining parity between environments, but one component of this deserves special mention. If your software requires a build, packaging, or bundling step, that step should be executed only once, and the resulting output should be reused throughout the entire pipeline.
This rule helps prevent the problems that arise when software is compiled or packaged multiple times, allowing slight inconsistencies or errors to creep in. Building the software separately at each new stage means the earlier tests were not run against the exact same software that will be deployed later, which invalidates the results.
To avoid this problem, CI systems should include a build process as the first step in the pipeline that creates and packages the software in a clean environment. The resulting artifact should be versioned and uploaded to an artifact store, from which subsequent stages of the pipeline pull it. This guarantees that the same build is used throughout the entire system.
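A rough sketch of this "build once, promote the artifact" idea, assuming a container-based workflow where the registry address and image name are placeholders, might look like:
# first pipeline stage: build and publish a uniquely versioned artifact
VERSION=$(git rev-parse --short HEAD)
docker build -t registry.example.com/myapp:"$VERSION" .
docker push registry.example.com/myapp:"$VERSION"
# later stages pull and deploy registry.example.com/myapp:"$VERSION" instead of rebuilding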
While the speed of your entire pipeline matters, some parts of your test suite will inevitably be faster than others. Because the CI/CD system handles every change entering the system, discovering failures as early as possible is important for minimizing the resources devoted to problematic builds. To achieve this, prioritize your fastest tests and run them first. Save complex, long-running tests until after the build has been validated by the smaller, quick-running ones.
This strategy has a number of benefits that help optimize your CI/CD process. It helps you understand the performance impact of individual tests, lets most of your tests complete early, and surfaces failures sooner, so problematic changes can be reverted or fixed before they block other work.
Test prioritization usually means running unit tests first, since those tend to be quick, isolated, and focused on specific components. Integration tests usually come next in terms of speed and complexity, followed by system-wide tests, and finally acceptance tests, which often require some level of human interaction.
One of the core principles of CI/CD is to integrate changes into the primary shared repository early and often. This helps avoid costly integration problems later on, when multiple developers attempt to merge large, divergent, and possibly conflicting sets of changes into the main repository just before a release. Typically, CI/CD systems are set up to monitor and test the changes committed to one or a few shared branches.
To take full advantage of CI, it is best to limit the number and scope of branches in your repository. Most implementations suggest that developers commit directly to the main branch or merge changes from their local branches in at least once a day.
Branches that are not tracked by your CI/CD system contain untested code and should be regarded as a risk to your project's success and momentum. Minimizing branching to encourage early integration of different developers' code helps leverage the strengths of the system and keeps developers from negating the advantages it provides.
Building on the previous point, developers should also be encouraged to run as many tests as possible locally before committing changes to the shared repository. This makes it possible to detect certain problematic changes before they interfere with other team members' work. While the local development environment will probably not run the full test suite in a production-like setting, this extra step gives developers more confidence that their changes pass basic testing and are worth integrating into the larger codebase.
To ensure that developers can test effectively on their own, your test suite should be runnable with a single command from any environment. The same command used by developers on their local machines should be used by the CI/CD system to run tests on code committed to the repository. This is often coordinated with a shell script or makefile that automates the testing tools in a repeatable, predictable way.
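A minimal sketch of such an entry point, assuming hypothetical helper scripts for each test layer, could be a small shell script that both developers and the CI server invoke in exactly the same way:
#!/usr/bin/env sh
# run_tests.sh: the single command used locally and in CI
set -e
./scripts/lint.sh               # placeholder: static analysis
./scripts/unit_tests.sh         # placeholder: fast, isolated unit tests
./scripts/integration_tests.sh  # placeholder: slower integration tests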
To ensure tests run the same way at every stage, it is usually a good idea to use clean, ephemeral testing environments whenever possible. In practice this means running tests in containers to abstract away differences between host systems and to provide a standard API for hooking components together at various scales. Since containers run with minimal state, residual side effects from testing are not inherited by subsequent runs of the test suite, where they could taint the results.
Another benefit of containerized test environments is the portability and reproducibility of your testing infrastructure. With containers, developers can easily replicate the configuration used later in the pipeline without having to manually set up and maintain infrastructure or sacrifice environmental fidelity. Because containers can be spun up and torn down easily, users are less likely to make compromises on the accuracy of their testing environment when running local tests. In general, using containers locks in parts of the runtime environment and helps minimize differences between pipeline stages.
While CI/CD implementations vary, following some of these basic principles will help you avoid common pitfalls and strengthen your testing and development practices. Continuous integration, combined with the right processes, tooling, and working habits, helps make development changes more successful and effective.
To learn more about general CI/CD practices and how to set up various CI/CD services, take a look at other articles tagged CI/CD.
Deploying applications to Kubernetes, the popular and robust container orchestration system, can be complex. Setting up a single application can involve creating multiple interdependent Kubernetes resources (such as pods, services, deployments, and ReplicaSets), each of which requires writing a detailed YAML manifest file.
Helm is a package manager for Kubernetes that allows developers and operators to more easily configure and deploy applications and services onto Kubernetes clusters.
Helm is now an official Kubernetes project and is part of the Cloud Native Computing Foundation, a non-profit that supports open source projects in and around the Kubernetes ecosystem.
In this article, we will give an overview of Helm and the various abstractions it uses to simplify deploying applications to Kubernetes. If you are new to Kubernetes, it may help to read "An Introduction to Kubernetes" first to familiarize yourself with the basic concepts.
Most every operating system and programming language has its own package manager for installing and maintaining software. Helm provides the same basic feature set as many of the package managers you may already be familiar with, such as Debian's apt or Python's pip.
Helm can:
Helm provides this functionality through the following components:
- helm, which provides the user interface to all Helm functionality.
- tiller, which runs on your Kubernetes cluster, listens for commands from helm, and handles the configuration and deployment of software releases on the cluster.
Next, we'll investigate the chart format in more detail.
Helm packages are called charts, and they consist of a few YAML configuration files and some templates that are rendered into Kubernetes manifest files. Here is the basic directory structure of a chart:
package-name/
  charts/
  templates/
  Chart.yaml
  LICENSE
  README.md
  requirements.yaml
  values.yaml
These directories and files have the following functions: chart dependencies can be linked dynamically through requirements.yaml, and the templates are combined with configuration values (from values.yaml and the command line) and rendered into Kubernetes manifests. The templates use the Go programming language template format.
The helm command can install a chart from a local directory, or from a .tar.gz packaged version of this directory structure. These packaged charts can also be automatically downloaded and installed from chart repositories, or repos.
Next, we'll look at chart repositories.
A Helm chart repo is a simple HTTP site that serves an index.yaml file and .tar.gz packaged charts. The helm command has subcommands available to help package charts and create the required index.yaml file. These files can be served by any web server, object storage service, or a static site host such as GitHub Pages.
Helm comes preconfigured with a default chart repository, referred to as stable. This repo points to a Google Storage bucket at https://kubernetes-charts.storage.googleapis.com. The source for the stable repo can be found in the helm/charts Git repository on GitHub.
Alternate repos can be added with the helm repo add command. Some popular alternate repositories are:
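Whichever repository you choose, adding it and refreshing the local index takes just a couple of commands. A minimal sketch, with a purely hypothetical repository name and URL:

helm repo add example-repo https://charts.example.com
helm repo update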
Whether you're installing a chart you developed locally or one from a repo, you'll need to configure it for your particular setup. We'll look into configs next.
A chart usually comes with default configuration values in its values.yaml file. Some applications may be fully deployable with default values, but you'll typically need to override some of the configuration to meet your needs.
The values that are exposed for configuration are determined by the author of the chart. Some are used to configure Kubernetes primitives, and some may be passed through to the underlying container to configure the application itself.
Here is a snippet of some example values:
service:
  type: ClusterIP
  port: 3306
These are options to configure a Kubernetes Service resource. You can use helm inspect values chart-name to dump all of the available configuration values for a chart.
These values can be overridden by writing your own YAML file and using it when running helm install, or by setting options individually on the command line with the --set flag. You only need to specify those values that you want to change from the defaults.
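As a sketch of both approaches (the chart, release, and file names here are hypothetical, and the commands assume the Helm 2 command-line interface described in this article):

# Override defaults from a local YAML file:
helm install stable/mysql --name my-database -f custom-values.yaml

# Or override individual values directly on the command line:
helm install stable/mysql --name my-database --set service.port=3306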
A Helm chart deployed with a particular configuration is called a release. We will talk about releases next.
During a chart installation, Helm combines the chart's templates with the configuration specified by the user and the defaults in values.yaml. These are rendered into Kubernetes manifests that are then deployed via the Kubernetes API. This creates a release, a specific configuration and deployment of a particular chart.
This concept of releases is important, because you may want to deploy the same application more than once on a cluster. For instance, you may need multiple MySQL servers with different configurations.
You will also probably want to upgrade different instances of a chart individually. Perhaps one application is ready for an updated MySQL server but another is not. With Helm, you can upgrade each release individually.
You might upgrade a release because its chart has been updated, or because you want to update the release's configuration. Either way, each upgrade will create a new revision of a release, and Helm will allow you to easily roll back to previous revisions in case there's an issue.
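A minimal sketch of that upgrade-and-rollback workflow, using a hypothetical release named my-database:

# Upgrade the release with a newer chart version or changed configuration:
helm upgrade my-database stable/mysql -f custom-values.yaml

# Inspect the release's revision history, then roll back to revision 1 if needed:
helm history my-database
helm rollback my-database 1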
If you can't find an existing chart for the software you are deploying, you may want to create your own. Helm can generate the scaffold of a chart directory with helm create chart-name. This will create a folder with the files and directories we discussed in the Charts section above.
From there, you'll want to fill out your chart's metadata in Chart.yaml and put your Kubernetes manifest files into the templates directory. You'll then need to extract the relevant configuration variables out of your manifests and into values.yaml, and include them back into your manifest templates using the templating system.
The helm command has many subcommands available to help you test, package, and manage your charts. For more information, please read the official Helm documentation on developing charts.
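A few of those subcommands cover the typical authoring loop. A short sketch, using a hypothetical chart name:

helm create my-chart     # scaffold the chart directory structure
helm lint my-chart       # check the chart for possible issues
helm package my-chart    # produce a .tgz archive that a chart repo can serve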
In this article we reviewed Helm, the package manager for Kubernetes. We overviewed the Helm architecture and its individual helm and tiller components, detailed the Helm chart format, and looked at chart repositories. We also looked into how to configure a Helm chart and how configurations and charts are combined and deployed as releases on Kubernetes clusters. Finally, we touched on the basics of creating a chart when a suitable chart isn't already available.
For more information about Helm, take a look at the official Helm documentation. To find official charts for Helm, check out the official helm/charts Git repository on GitHub.
Continuous integration, delivery, and deployment, collectively referred to as CI/CD, is an integral part of modern development intended to reduce errors during integration and deployment while increasing project velocity. CI/CD is a philosophy and set of practices often augmented by robust tooling that emphasizes automated testing at each stage of the software pipeline. By incorporating these ideas into your practice, you can reduce the time required to integrate changes for a release and thoroughly test each change before moving it into production.
CI/CD has many potential benefits, but successful implementation often requires a good deal of consideration. Deciding exactly how to use the tools and what changes you may need in your environments or processes can be challenging without extensive trial and error. However, while all implementations will be different, adhering to best practices can help you avoid common problems and attain improvements faster.
In this guide, we'll introduce some basic guidance on how to implement and maintain a CI/CD system to best serve your organization's needs. We'll cover a number of practices that will help improve the effectiveness of your CI/CD service. Feel free to read through as written or skip ahead to areas that interest you.
CI/CD pipelines help shepherd changes through automated testing cycles, out to staging environments, and finally to production. The more comprehensive your testing pipelines are, the more confident you can be that changes won't introduce unforeseen side effects into your production deployment. However, since each change must go through this process, keeping your pipelines fast and dependable is incredibly important so that they do not inhibit development velocity.
Balancing the tension between these two requirements can be difficult. There are some straightforward steps you can take to improve speed, like scaling out your CI/CD infrastructure and optimizing tests. As time goes on, however, you may be forced to make critical decisions about the relative value of different tests and the stage or order in which they are run. Sometimes, paring down your test suite by removing tests with low value or with indeterminate conclusions is the smartest way to maintain the speed required by heavily used pipelines.
When making these significant decisions, make sure you understand and document the trade-offs you are making. Consult with team members and stakeholders to align the team's assumptions about what the test suite is expected to catch and where the areas of greatest focus should be.
From an operational security standpoint, your CI/CD system represents some of the most critical infrastructure to protect. Since the CI/CD system has complete access to your codebase and credentials to deploy in various environments, it is essential to secure it to protect internal data and guarantee the integrity of your site or product. Due to its high value as a target, it is important to isolate and lock down your CI/CD system as much as possible.
CI/CD systems should be deployed to internal, protected networks, unexposed to outside parties. Setting up VPNs or other network access control technology is recommended to ensure that only authenticated operators are able to access your system. Depending on the complexity of your network topology, your CI/CD system may need to access several different networks to deploy code to different environments. If not properly secured or isolated, attackers that gain access to one environment may be able to island hop, a technique used to expand access by taking advantage of more permissive internal networking rules, in order to reach other environments through weak points in your CI/CD servers.
The isolation and security strategies you require will depend heavily on your network topology, infrastructure, and your management and development requirements. The important point to keep in mind is that your CI/CD systems are highly valuable targets and, in many cases, they have a broad degree of access to your other vital systems. Shielding all external access to the servers and tightly controlling the types of internal access allowed will help reduce the risk of your CI/CD system being compromised.
Part of what makes it possible for CI/CD to improve your development practices and code quality is that the tooling often helps enforce best practices for testing and deployment. Promoting code through your CI/CD pipelines requires each change to demonstrate that it adheres to your organization's codified standards and procedures. Failures in a CI/CD pipeline are immediately visible and halt the advancement of the affected release to later stages of the cycle. This is a gatekeeping mechanism that safeguards the more important environments from untrusted code.
To realize these advantages, however, you need to be disciplined about ensuring that every change to your production environment goes through your pipeline. The CI/CD pipeline should be the only mechanism by which code enters the production environment. This can happen automatically at the end of successful testing with continuous deployment practices, or through a manual promotion of tested changes approved and made available by your CI/CD system.
Frequently, teams start using their pipelines for deployment, but begin making exceptions when problems occur and there is pressure to resolve them quickly. While downtime and other issues should be mitigated as soon as possible, it is important to understand that the CI/CD system is a good tool to ensure that your changes are not introducing other bugs or further breaking the system. Putting your fix through the pipeline (or just using the CI/CD system to roll back) will also prevent the next deployment from erasing an ad hoc hotfix that was applied directly to production. The pipeline protects the validity of your deployments regardless of whether it is a regular, planned release or a quick fix to resolve an ongoing issue. This use of the CI/CD system is yet another reason to work to keep your pipeline fast.
CI/CD pipelines promote changes through a series of test suites and deployment environments. Changes that pass the requirements of one stage are either automatically deployed or queued for manual deployment into more restrictive environments. Early stages are meant to prove that it's worthwhile to continue testing and pushing the changes closer to production.
For later stages especially, reproducing the production environment as closely as possible in the testing environments helps ensure that the tests accurately reflect how the change would behave in production. Significant differences between staging and production can allow problematic changes to be released that were never observed to be faulty in testing. The more differences between your live environment and the testing environment, the less your tests will measure how the code will perform when released.
Some differences between staging and production are expected, but keeping them manageable and making sure they are well understood is essential. Some organizations use blue-green deployments to swap production traffic between two nearly identical environments that alternate between being designated production and staging. Less extreme strategies involve deploying the same configuration and infrastructure from production to your staging environment, but at a reduced scale. Items like network endpoints might differ between your environments, but parameterizing this type of variable data can help make sure that the code is consistent and that the environmental differences are well defined.
A primary goal of a CI/CD pipeline is to build confidence in your changes and minimize the chance of unexpected impact. We discussed the importance of maintaining parity between environments, but one component of this is important enough to warrant extra attention. If your software requires a building, packaging, or bundling step, that step should be executed only once and the resulting output should be reused throughout the entire pipeline.
This guideline helps prevent problems that arise when software is compiled or packaged multiple times, allowing slight inconsistencies to be injected into the resulting artifacts. Building the software separately at each new stage can mean the tests in earlier environments weren't targeting the same software that will be deployed later, invalidating the results.
To avoid this problem, CI systems should include a build process as the first step in the pipeline that creates and packages the software in a clean environment. The resulting artifact should be versioned and uploaded to an artifact storage system to be pulled down by subsequent stages of the pipeline, ensuring that the build does not change as it progresses through the system.
While keeping your entire pipeline fast is a great general goal, parts of your test suite will inevitably be faster than others. Because the CI/CD system serves as a conduit for all changes entering your system, discovering failures as early as possible is important to minimize the resources devoted to problematic builds. To achieve this, prioritize and run your fastest tests first. Save complex, long-running tests until after you've validated the build with smaller, quick-running tests.
This strategy has a number of benefits that can help keep your CI/CD process healthy. It encourages you to understand the performance impact of individual tests, allows you to complete most of your tests early, and increases the likelihood of fast failures, which means that problematic changes can be reverted or fixed before blocking other members' work.
Test prioritization usually means running your project's unit tests first, since those tend to be quick, isolated, and component-focused. Afterwards, integration tests typically represent the next level of complexity and speed, followed by system-wide tests and, finally, acceptance tests, which often require some level of human interaction.
A core principle of CI/CD is to integrate changes into the primary shared repository early and often. This helps avoid costly integration problems down the line when multiple developers attempt to merge large, divergent, and conflicting changes into the main branch of the repository in preparation for release. Typically, CI/CD systems are set up to monitor and test the changes committed to only one or a small number of branches.
To take advantage of the benefits that CI provides, it is best to limit the number and scope of branches in your repository. Most implementations suggest that developers commit directly to the main branch or merge changes from their local branches in at least once a day.
Essentially, branches that are not being tracked by your CI/CD system contain untested code that should be regarded as a liability to your project's success and momentum. Minimizing branching to encourage early integration of different developers' code helps leverage the strengths of the system, and prevents developers from negating the advantages it provides.
Related to the earlier point about discovering failures early, developers should be encouraged to run as many tests locally as possible prior to committing to the shared repository. This makes it possible to detect certain problematic changes before they block other team members. While the local development environment will probably not be able to run the entire test suite in a production-like environment, this extra step gives individuals more confidence that the changes they are making pass basic tests and are worth trying to integrate with the larger codebase.
To ensure that developers can test effectively on their own, your test suite should be runnable with a single command that can be run from any environment. The same command used by developers on their local machines should be used by the CI/CD system to kick off tests on code merged to the repository. Often, this is coordinated by providing a shell script or makefile to automate running the testing tools in a repeatable, predictable way.
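As a minimal sketch of that idea (the target name, Dockerfile name, and image tag below are hypothetical, and make requires tab indentation for recipe lines), a makefile can give developers and the CI/CD system one shared entry point:

# Makefile: run the same test suite locally and in CI.
test:
	docker build -t myapp-test -f Dockerfile.test .
	docker run --rm myapp-test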
To make sure that your tests run the same way at various stages, it's often a good idea to use clean, ephemeral testing environments when possible. Usually, this means running tests in containers to abstract away differences between the host systems and to provide a standard API for hooking together components at various scales. Since containers run with minimal state, residual side effects from testing are not inherited by subsequent runs of the test suite, which could otherwise taint the results.
Another benefit of containerized testing environments is the portability of your testing infrastructure. With containers, developers have an easier time replicating the configuration that will be used later on in the pipeline without having to manually set up and maintain infrastructure or sacrifice environmental fidelity. Since containers can be spun up easily when needed and then destroyed, users can make fewer compromises on the accuracy of their testing environment when running local tests. In general, using containers locks in some aspects of the runtime environment to help minimize differences between pipeline stages.
While each CI/CD implementation will be different, following some of these basic principles will help you avoid some common pitfalls and strengthen your testing and development practices. As with most aspects of continuous integration, a combination of process, tooling, and habit will help make development changes more successful and impactful.
To learn more about general CI/CD practices and how to set up various CI/CD services, check out other articles with the CI/CD tag.
Web development often involves the use of different development stacks, including the LAMP stack, the MEAN stack, the MERN stack, etc. JAMstack is another stack that offers some unique benefits to developers. This article will discuss those benefits and some general definitions and terms in order to provide an introduction to the JAMstack.
Static websites have been growing recently in use and functionality. No longer a collection of HTML and CSS files, static websites do things like process payments, handle realtime activities, and more. To call these sites “static” undermines and under-describes their functionality. Hence, the term JAMstack.
JAMstack stands for JavaScript, APIs, and Markup. According to the official website, JAMstack means:
Modern web development architecture based on client-side JavaScript, reusable APIs, and prebuilt Markup.
The term was coined by Mathias Biilmann, co-founder of Netlify.
With the JAMstack, we no longer talk about specific technologies such as operating systems, web servers, backend programming languages, or databases. It’s a new way of building websites and apps that delivers better performance, higher security, lower cost of scaling, and a more streamlined developer experience.
When should you consider using the JAMstack? Some reasons you might consider the JAMstack include:
There are also a growing number of services that integrate dynamic functionality into JAMstack websites, including:
In order to build a project using the JAMstack, it must meet the following criteria:
With those in mind, the following projects are not JAMstack projects:
When building a project with the JAMstack, keep the following best practices in mind:
In this article, you learned about what the JAMstack is and why you might consider it for your next project. You also learned about project requirements for JAMstack sites. For examples of websites and web apps built with the JAMstack, you can also look at these official examples.
You can learn more about the JAMstack by going through the official website and the resources section.
Python is an extremely readable and versatile programming language. Named after the British comedy group Monty Python, keeping the language fun to use has always been an important core goal for the Python development team. Easy to set up, with a relatively straightforward style of code and immediate feedback on errors, Python is a great choice for beginners.
Because Python is a multiparadigm language (it supports multiple programming styles, including scripting and object-oriented programming), it is good for general-purpose use across a wide range of applications. In industry, Python is increasingly used by organizations such as United Space Alliance (NASA's main shuttle support contractor) and Industrial Light & Magic (the visual effects and animation studio of Lucasfilm). A foundation in Python provides tremendous potential for anyone who goes on to learn other programming languages.
Python was originally written by Guido van Rossum of the Netherlands, who remains active in the community today. Development of the language began in the late 1980s and it was first released in 1991. As a successor to the ABC programming language, Python's first iteration already included exception handling, functions, and classes with inheritance. In 1994 an important Usenet discussion forum for the language, called comp.lang.python, was formed, and the Python user base grew rapidly from there, paving the way for Python to become one of the most popular open source languages for development.
Before looking at the potential uses of Python 2 and Python 3 and the key syntactic differences between them, let's go over some background on the language's recent major releases.
Published in late 2000, Python 2 introduced the PEP (Python Enhancement Proposal) process, signalling a more transparent and inclusive language development process than earlier versions of Python had. A PEP is a technical specification that either provides information to Python community members or describes a new feature of the language.
Additionally, Python 2 included many more programmatic features, including a cycle-detecting garbage collector to automate memory management, increased Unicode support to standardize characters, and list comprehensions to create lists based on existing lists. As Python 2 continued to develop, more features were added, including unifying Python's types and classes into one hierarchy in Python version 2.2.
Python 3 is regarded as the future of Python and is the version of the language that is currently in development. A major overhaul released in late 2008, Python 3 was intended to address and amend intrinsic design flaws of earlier versions of the language. The focus of Python 3 development was to clean up the codebase and remove redundancy, making the language clearer by ensuring there is only one way to perform a given task.
Major modifications in Python 3.0 included changing the print statement into a built-in function, improving integer division, and adding more Unicode support.
At first, Python 3 was slow to be adopted because it is not backwards compatible with Python 2, requiring people to decide which version of the language to use. Additionally, many third-party libraries initially supported only Python 2. As the Python 3 development team reiterated that support for Python 2 would come to an end, more libraries have since been ported to Python 3. The growing adoption of Python 3 can be seen in the number of third-party packages that provide Python 3 support: at the time the English version of this tutorial was finalized, 339 of the 360 most popular Python packages already supported Python 3.
Following the 2008 release of Python 3.0, Python 2.7 was published on July 3, 2010 and was planned as the last of the 2.x releases. The intention behind Python 2.7 was to make it easier for Python 2.x users to port features over to Python 3, so 2.7 provides some measure of compatibility between the two. This compatibility support includes enhanced modules added in version 2.7 such as unittest to support test automation, argparse for parsing command-line options, and more convenient classes in collections.
Because of Python 2.7's unique position as a version in between the earlier iterations of Python 2 and Python 3.0, it is compatible with many robust libraries and has remained a very popular choice among programmers. When we talk about Python 2 today, we are typically referring to Python 2.7, since that is the most frequently used version.
However, Python 2.7 is considered a legacy language whose support will end; its continued development, which today consists mostly of bug fixes, will cease completely in 2020.
Although Python 2.7 and Python 3 share many similar capabilities, they should not be thought of as entirely interchangeable. While you can write good code and useful programs in either version, it is worth understanding that there are some considerable differences in code syntax and handling.
Below are some examples, but keep in mind that you will likely continue to discover more syntax differences as you learn Python.
In Python 2, print is treated as a statement instead of a function, which is a typical area of confusion because many other actions in Python require arguments inside of parentheses to execute. If you want to print Sammy the Shark is my favorite sea creature to the console in Python 2, you can do so with the following print statement:
print "Sammy the Shark is my favorite sea creature"
In Python 3, print() is now explicitly treated as a function, so to print out the same string above you can do so simply and easily using the syntax of a function:
print("Sammy the Shark is my favorite sea creature")
This change makes Python's syntax more consistent and also makes it easier to switch between different print functions. Conveniently, the print() syntax is also backwards compatible with Python 2.7, so your Python 3 print() functions can run in either version.
In Python 2, any number that you type without decimals is treated as the programming type called an integer. While at first glance this seems like an easy way to handle programming types, when you divide integers together you sometimes expect to get an answer with decimal places, called a float, as in:
5 / 2 = 2.5
However, in Python 2 integers are strongly typed and will not be converted to a float with decimal places, even in cases where that would be the intuitive result.
When the two numbers on either side of the division symbol / are integers, Python 2 performs floor division, so that for the quotient x the number returned is the largest integer less than or equal to x. This means that when you write 5 / 2 to divide the two numbers, Python 2.7 returns the largest integer less than or equal to 2.5, in this case 2:
a = 5 / 2
print a
Output
2
To override this behavior, you can instead add decimal places, as in 5.0 / 2.0, to get the expected answer of 2.5.
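For example, still running under Python 2, performing the division with floats returns the decimal answer:

a = 5.0 / 2.0
print a
Output
2.5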
In Python 3, integer division became more intuitive, as in:
a = 5 / 2
print(a)
Output
2.5
You can still use 5.0 / 2.0 to get 2.5, but if you want to perform floor division you should use the Python 3 syntax of //, like this:
b = 5 // 2
print(b)
Output
2
This modification in Python 3 made integer division more intuitive, but this syntax is not backwards compatible with Python 2.7.
When programming languages handle the string type (that is, a sequence of characters), they can do so in a few different ways so that computers can convert numbers into letters and other symbols.
Python 2 uses the ASCII alphabet by default, so when you type "Hello, Sammy!" Python 2 will handle the string as ASCII. Limited to a couple of hundred characters at best in its various extended forms, ASCII is not a very flexible method for encoding characters, especially non-English ones.
To use the more versatile and robust Unicode character encoding, which supports over 128,000 characters across contemporary and historic scripts and symbol sets, you would have to type u"Hello, Sammy!", with the u prefix standing for Unicode.
Python 3 uses Unicode by default, which saves programmers extra development time, and you can easily type and display many more characters directly in your programs. Because Unicode supports greater linguistic character diversity as well as the display of emojis, using it as the default character encoding ensures that mobile devices around the world are readily supported in your development projects.
If you would like your Python 3 code to be backwards compatible with Python 2, though, you can keep the u before your strings.
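For example, the following lines behave the same way in Python 2.7 and Python 3, which makes the prefix an easy habit for code that needs to run on both versions:

greeting = u"Hello, Sammy!"
print(greeting)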
The biggest difference between Python 3 and Python 2 is not a syntactic one: it is the fact that Python 2.7 will lose continued support in 2020, while Python 3 will keep being developed with more features and ongoing bug fixes.
Recent developments in Python 3 include formatted string literals, simpler customization of class creation, and a cleaner syntax for handling matrix multiplication.
The continued development of Python 3 means that programmers can be confident that issues in the language will be fixed in a timely manner, that features will be added over time, and that programs will become more efficient.
Whether you are a new programmer getting started with Python or an experienced programmer new to the language, you will want to consider what you are hoping to achieve by learning Python.
If you just want to learn without a set project in mind, the main thing to keep in mind is that Python 3 will continue to be supported and developed, while Python 2.7 will not.
If, however, you are planning to join an existing project, you should look at which version of Python the team uses, how the different versions interact with the legacy codebase, whether the packages the project uses are supported in a different version, and the implementation details of the project.
If you are beginning a new project, you should research which packages you would like to use and which versions of Python they are compatible with. As noted above, although earlier versions of Python 3 had less compatibility with libraries built for Python 2, many of those libraries have since been ported to Python 3 or have committed to doing so within the next four years.
Python is a versatile and well-documented programming language to learn, and whether you choose to work with Python 2 or Python 3, you will be able to work on exciting software projects.
Though there are several key differences, it is not too difficult to move from Python 3 to Python 2 with a few tweaks, and you will often find that Python 2.7 can readily run Python 3 code, especially when you are starting out. You can learn more about this process by reading the tutorial on how to port Python 2 code to Python 3.
It is worth remembering that as more developer and community attention focuses on Python 3, the language will become more refined and in line with the evolving needs of programmers, while support for Python 2.7 continues to wind down.
]]>I’m interested in starting up a hosting business through the DigitalOcean system. My plan is to start off by offering VPS hosting (automated with WHMCS). I’ve had a look and can see that there are different ways I could offer VPS hosting, and as I’m still relatively new to this field, would it be possible for someone to suggest a few ways that I could do this - with pros and cons?
Thanks!
When most people think of a database, they often envision the traditional relational database model that involves tables made up of rows and columns. While relational database management systems still handle the lion’s share of data on the internet, alternative data models have become more common in recent years as developers have sought workarounds to the relational model’s limitations. These non-relational database models, each with their own unique advantages, disadvantages, and use cases, have come to be categorized as NoSQL databases.
This article will introduce you to a few of the more commonly used NoSQL database models. It will weigh some of their strengths and disadvantages, as well as provide a few examples of database management systems and potential use cases for each.
Databases are logically modeled clusters of information, or data. A database management system (DBMS), meanwhile, is a computer program that interacts with a database. A DBMS allows you to control access to a database, write data, run queries, and perform any other tasks related to database management. Although database management systems are often referred to as “databases,” the two terms are not exactly interchangeable. A database can be any collection of data, not just one stored on a computer, while a DBMS is the specific software that allows you to interact with a database.
All database management systems have an underlying model that structures how data is stored and accessed. A relational database management system (RDBMS) is a DBMS that employs the relational data model. In this model, data is organized into tables, which in the context of RDBMSs are more formally referred to as relations. Relational database management systems typically employ Structured Query Language (SQL) for managing and accessing data held within the database.
Historically, the relational model has been the most widely used approach for managing data, and to this day many of the most popular database management systems implement the relational model. However, the relational model presents several limitations that can be problematic in certain use cases.
For instance, it can be difficult to scale a relational database horizontally. Horizontal scaling, or scaling out, is the practice of adding more machines to an existing stack in order to spread out the load and allow for more traffic and faster processing. This is often contrasted with vertical scaling which involves upgrading the hardware of an existing server, usually by adding more RAM or CPU.
The reason it’s difficult to scale a relational database horizontally has to do with the fact that the relational model is designed to ensure consistency, meaning clients querying the same database will always see the latest data. If you were to scale a relational database horizontally across multiple machines, it becomes difficult to ensure consistency since clients may write data to one node and not the others and there would likely be a delay between the initial write and the time when the other nodes are updated to reflect the changes.
Another limitation presented by RDBMSs is that the relational model was designed to manage structured data, or data that aligns with a predefined data type or is at least organized in some predetermined way, making it easily sortable and searchable. With the spread of personal computing and the rise of the internet in the early 1990s, however, unstructured data — such as email messages, photos, videos, etc. — became more common.
As these limitations grew more constricting, developers began looking for alternatives to the traditional relational data model, leading to the growth in popularity of NoSQL databases.
The label NoSQL itself has a rather fuzzy definition. “NoSQL” was coined in 1998 by Carlo Strozzi as the name for his then-new NoSQL Database, chosen simply because it doesn’t use SQL for managing data.
The term took on a new meaning after 2009 when Johan Oskarsson organized a meetup for developers to discuss the spread of “open source, distributed, and non relational databases” like Cassandra and Voldemort. Oskarsson named the meetup “NOSQL” and since then the term has been used as a catch-all for any database that doesn’t employ the relational model. Interestingly, Strozzi’s NoSQL database does in fact employ the relational model, meaning that the original NoSQL database doesn’t fit the contemporary definition of NoSQL.
Because “NoSQL” generally refers to any DBMS that doesn’t employ the relational model, there are several operational data models associated with the NoSQL concept. The following table includes several such data models, but please note that this is not a comprehensive list:
Operational Database Model | Example DBMSs |
---|---|
Key-value store | Redis, MemcacheDB |
Columnar database | Cassandra, Apache HBase |
Document store | MongoDB, Couchbase |
Graph database | OrientDB, Neo4j |
Despite these different underlying data models, most NoSQL databases share several characteristics. For one, NoSQL databases are typically designed to maximize availability at the expense of consistency. In this sense, consistency refers to the idea that any read operation will return the most recent data written to the database. In a distributed database designed for strong consistency, any data written to one node will be immediately available on all other nodes; otherwise, an error will occur.
Conversely, NoSQL databases oftentimes aim for eventual consistency. This means that newly written data is made available on other nodes in the database eventually (usually in a matter of a few milliseconds), though not necessarily immediately. This has the benefit of improving the availability of one’s data: even though you may not see the very latest data written, you can still view an earlier version of it instead of receiving an error.
Relational databases are designed to deal with normalized data that fits neatly into a predefined schema. In the context of a DBMS, normalized data is data that’s been organized in a way to eliminate redundancies — meaning that the database takes up as little storage space as possible — while a schema is an outline of how the data in the database is structured.
While NoSQL databases are equipped to handle normalized data and they are able to sort data within a predefined schema, their respective data models usually allow for far greater flexibility than the rigid structure imposed by relational databases. Because of this, NoSQL databases have a reputation for being a better choice for storing semi-structured and unstructured data. With that in mind, though, because NoSQL databases don’t come with a predefined schema, it’s often up to the database administrator to define how the data should be organized and accessed in whatever way makes the most sense for their application.
Now that you have some context around what NoSQL databases are and what makes them different from relational databases, let’s take a closer look at some of the more widely-implemented NoSQL database models.
Key-value databases, also known as key-value stores, work by storing and managing associative arrays. An associative array, also known as a dictionary or hash table, consists of a collection of key-value pairs in which a key serves as a unique identifier to retrieve an associated value. Values can be anything from simple objects, like integers or strings, to more complex objects, like JSON structures.
In contrast to relational databases, which define a data structure made up of tables of rows and columns with predefined data types, key-value databases store data as a single collection without any structure or relation. After connecting to the database server, an application can define a key (for example, the_meaning_of_life
) and provide a matching value (for example, 42
) which can later be retrieved the same way by supplying the key. A key-value database treats any data held within it as an opaque blob; it’s up to the application to understand how it’s structured.
Key-value databases are often described as highly performant, efficient, and scalable. Common use cases for key-value databases are caching, message queuing, and session management.
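To make the model concrete, here is roughly what storing and retrieving the pair from the earlier example looks like in a redis-cli session (Redis is just one of the stores listed below; any key-value database exposes an equivalent set/get pair of operations):

127.0.0.1:6379> SET the_meaning_of_life 42
OK
127.0.0.1:6379> GET the_meaning_of_life
"42"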
Some popular open-source key-value data stores are:
Database | Description |
---|---|
Redis | An in-memory data store used as a database, cache, or message broker, Redis supports a variety of data structures, ranging from strings to bitmaps, streams, and spatial indexes. |
Memcached | A general-purpose memory object caching system frequently used to speed up data-driven websites and applications by caching data and objects in memory. |
Riak | A distributed key-value database with advanced local and multi-cluster replication. |
Columnar databases, sometimes called column-oriented databases, are database systems that store data in columns. This may seem similar to traditional relational databases, but rather than grouping columns together into tables, each column is stored in a separate file or region in the system’s storage.
The data stored in a columnar database appears in record order, meaning that the first entry in one column is related to the first entry in other columns. This design allows queries to only read the columns they need, rather than having to read every row in a table and discard unneeded data after it’s been stored in memory.
Because the data in each column is of the same type, it allows for various storage and read optimization strategies. In particular, many columnar database administrators implement a compression strategy such as run-length encoding to minimize the amount of space taken up by a single column. This can have the benefit of speeding up reads since queries need to go over fewer rows. One drawback with columnar databases, though, is that load performance tends to be slow since each column must be written separately and data is often kept compressed. Incremental loads in particular, as well as reads of individual records, can be costly in terms of performance.
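To illustrate why run-length encoding suits this layout, here is a minimal sketch in Python of encoding a single column: runs of identical adjacent values collapse into value-and-count pairs, which is why sorted, low-cardinality columns compress so well in columnar stores:

def run_length_encode(column):
    # Collapse runs of identical adjacent values into [value, count] pairs.
    encoded = []
    for value in column:
        if encoded and encoded[-1][0] == value:
            encoded[-1][1] += 1
        else:
            encoded.append([value, 1])
    return encoded

print(run_length_encode(["US", "US", "US", "CA", "CA", "MX"]))
# [['US', 3], ['CA', 2], ['MX', 1]]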
Column-oriented databases have been around since the 1960s. Since the mid-2000s, though, columnar databases have become more widely used for data analytics since the columnar data model lends itself well to fast query processing. They’re also seen as advantageous in cases where an application needs to frequently perform aggregate functions, such as finding the average or sum total of data in a column. Some columnar database management systems are even capable of using SQL queries.
Some popular open-source columnar databases are:
Database | Description |
---|---|
Apache Cassandra | A column store designed to maximize scalability, availability, and performance. |
Apache HBase | A distributed database that supports structured storage for large amounts of data and is designed to work with the Hadoop software library. |
ClickHouse | A fault tolerant DBMS that supports real time generation of analytical data and SQL queries. |
Document-oriented databases, or document stores, are NoSQL databases that store data in the form of documents. Document stores are a type of key-value store: each document has a unique identifier — its key — and the document itself serves as the value.
The difference between these two models is that, in a key-value database, the data is treated as opaque and the database doesn’t know or care about the data held within it; it’s up to the application to understand what data is stored. In a document store, however, each document contains some kind of metadata that provides a degree of structure to the data. Document stores often come with an API or query language that allows users to retrieve documents based on the metadata they contain. They also allow for complex data structures, as you can nest documents within other documents.
Unlike relational databases, in which the information of a given object may be spread across multiple tables or databases, a document-oriented database can store all the data of a given object in a single document. Document stores typically store data as JSON, BSON, XML, or YAML documents, and some can store binary formats like PDF documents. Some use a variant of SQL, full-text search, or their own native query language for data retrieval, and others feature more than one query method.
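For instance, a single JSON document in a document store can hold everything about one object, including nested structures, rather than spreading it across multiple tables. The field names below are purely illustrative:

{
  "_id": "sammy-the-shark",
  "name": "Sammy",
  "species": "shark",
  "interests": ["open source", "tutorials"],
  "address": {
    "city": "New York",
    "state": "NY"
  }
}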
Document-oriented databases have seen an enormous growth in popularity in recent years. Thanks to their flexible schema, they’ve found regular use in e-commerce, blogging, and analytics platforms, as well as content management systems. Document stores are considered highly scalable, with sharding being a common horizontal scaling strategy. They are also excellent for keeping large amounts of unrelated, complex information that varies in structure.
Some popular open-source document based data stores are:
Database | Description |
---|---|
MongoDB | A general purpose, distributed document store, MongoDB is the world’s most widely used document-oriented database at the time of this writing. |
Couchbase | Originally known as Membase, a JSON-based, Memcached-compatible document-based data store. A multi-model database, Couchbase can also function as a key-value store. |
Apache CouchDB | A project of the Apache Software Foundation, CouchDB stores data as JSON documents and uses JavaScript as its query language. |
Graph databases can be thought of as a subcategory of the document store model, in that they store data in documents and don’t insist that data adhere to a predefined schema. The difference, though, is that graph databases add an extra layer to the document model by highlighting the relationships between individual documents.
To better grasp the concept of graph databases, it’s important to understand the following terms:
Certain operations are much simpler to perform using graph databases because of how they link and group related pieces of information. These databases are commonly used in cases where it’s important to be able to gain insights from the relationships between data points or in applications where the information available to end users is determined by their connections to others, as in a social network. They’ve found regular use in fraud detection, recommendation engines, and identity and access management applications.
Some popular open-source graph databases are:
Database | Description |
---|---|
Neo4j | An ACID-compliant DBMS with native graph storage and processing. As of this writing, Neo4j is the most popular graph database in the world. |
ArangoDB | Not exclusively a graph database, ArangoDB is a multi-model database that unites the graph, document, and key-value data models in one DBMS. It features AQL (a native SQL-like query language), full-text search, and a ranking engine. |
OrientDB | Another multi-model database, OrientDB supports the graph, document, key-value, and object models. It supports SQL queries and ACID transactions. |
In this tutorial, we’ve gone over only a few of the NoSQL data models in use today. Some NoSQL models, such as object stores, have seen varying levels of use over the years but remain as viable alternatives to the relational model in some use cases. Others, like object-relational databases and time-series databases, blend elements of relational and NoSQL data models to form a kind of middle ground between the two ends of the spectrum.
The NoSQL category of databases is extremely broad, and continues to evolve to this day. If you’re interested in learning more about NoSQL database management systems and concepts, we encourage you to check out our library of NoSQL-related content.
The Debian operating system’s most recent stable release, version 10 (Buster), was published on July 6, 2019, and will be supported until 2022. Long term support may be provided through 2024 as part of the Debian LTS Project.
This guide is a brief overview of the new features and significant changes to Debian since the previous release. It focuses mainly on changes that will affect users running Debian in a typical server environment. It synthesizes information from the official Debian 10 release notes, the Debian 10 release blog post, kernelnewbies.org, and other sources.
Generally, Debian stable releases contain very few surprises or major changes. This remains the case with Debian 10. Beyond a few networking and security changes — which we will cover in subsequent sections — most updates are small modifications to the base system and new versions of available software packages.
The following list summarizes a selection of Debian 10 software updates. The versions that shipped in Debian 9 are included in parentheses:
The following sections explain some of the more extensive changes to Debian 10.
The Linux kernel has been updated to version 4.19. This is a long-term support kernel that was released on October 22, 2018 and will be supported until December of 2020. For more information on the different types of Linux kernel releases, take a look at the official Linux kernel release and support schedule.
Some new features and updates that were released between kernels 4.9 and 4.19 include:
For more information on Linux kernel updates, kernelnewbies.org maintains a detailed and beginner-friendly changelog summary for each release.
AppArmor is an access control system that focuses on limiting the resources an application can use. It is supplemental to more traditional user-based access control mechanisms.
AppArmor works by loading application profiles into the kernel, and then using those profiles to enforce limits on capabilities such as file reads and writes, networking access, mounts, and raw socket access.
Debian 10 ships with AppArmor enabled and some default profiles for common applications such as Apache, Bash, Python, and PHP. More profiles can be installed via the apparmor-profiles-extra
package.
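For example, installing the extra profiles and checking which profiles are currently loaded and in enforce mode can be done with the following commands (aa-status is provided by the apparmor package on Debian 10):

sudo apt install apparmor-profiles-extra
sudo aa-status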
See the AppArmor documentation for more information, including guidelines on how to write your own AppArmor application profiles.
In Debian Buster the iptables
subsystem is replaced by nftables
, a newer packet filtering system with improved syntax, streamlined ipv4/ipv6 support, and built-in support for data sets such as dictionaries and maps. You can read a more detailed list of differences on the nftables wiki.
Compatibility with existing iptables
scripts is provided by the iptables-nft
command. The nftables wiki also has advice on transitioning from iptables to nftables.
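For instance, existing rules can still be listed through the compatibility front end, while the native tooling inspects the same ruleset directly:

# List rules via the iptables-compatible command backed by nftables:
sudo iptables-nft -L

# Inspect the native nftables ruleset:
sudo nft list ruleset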
Apt supports https
repositories by default in Debian 10. Users no longer need to install additional packages before using https
-based package repos.
Additionally, unattended-upgrades
— the system Debian uses to perform automatic updates from the security
repository — now also supports automating point-release upgrades from any repo. These upgrades are usually small bug fixes and security updates.
While this guide is not exhaustive, you should now have a general idea of the major changes and new features in Debian 10 Buster. Please refer to the official Debian 10 release notes for more information.
Container images are the primary packaging format for defining applications within Kubernetes. Used as the basis for pods and other objects, images play an important role in leveraging Kubernetes' features to efficiently run applications on the platform. Well-designed images are secure, highly performant, and focused. They are able to react to configuration data or instructions provided by Kubernetes and also implement endpoints the orchestration system uses to understand internal application state.
In this article, we will introduce some strategies for creating high-quality images and discuss a few general goals to help guide your decisions when containerizing applications. We will focus on building images intended to be run on Kubernetes, but many of the suggestions apply equally to running containers on other orchestration platforms or in other contexts.
Before we go over specific actions to take when building container images, we will talk about what makes a good container image. What should your goals be when designing new images? Which characteristics and which behaviors are most important?
Some qualities to aim for are:
A single, well-defined purpose
Container images should have a single discrete focus. Avoid thinking of container images as virtual machines, where it can make sense to package related functionality together. Instead, treat your container images like Unix utilities, maintaining a strict focus on doing one small thing well. Applications can be coordinated outside of the container scope to compose complex functionality.
Generic design with the ability to inject configuration at runtime
Container images should be designed with reuse in mind when possible. For instance, the ability to adjust the configuration at runtime is often required to fulfill basic requirements like testing your images before deploying to production. Small, generic images can be combined in different configurations to modify behavior without creating new images.
Small image size
Smaller images have a number of benefits in clustered environments like Kubernetes. They download quickly to new nodes and often have a smaller set of installed packages, which can improve security. Pared-down container images make it simpler to debug problems by minimizing the amount of software involved.
Externally managed state
Containers in clustered environments experience a very volatile life cycle, including planned and unplanned shutdowns due to resource scarcity, scaling, or node failures. To maintain consistency, aid recovery and availability of your services, and avoid losing data, it is critical to store application state in a stable location outside of the container.
Easy to understand
It is important to try to keep container images as simple and easy to understand as possible. When troubleshooting, being able to easily reason about the problem by viewing the container image configuration or testing the container's behavior can help you reach a resolution faster. Thinking of container images as a packaging format for your application instead of a machine configuration can help you strike the right balance.
Follow containerized software best practices
Images should aim to work within the container model instead of acting against it. Avoid implementing conventional system administration practices, like including full init systems or daemonizing applications. Log to standard out so Kubernetes can expose the data to administrators instead of using an internal logging daemon. Each of these differs from best practices for full operating systems.
Fully leverage Kubernetes features
Beyond conforming to the container model, it's important to understand and reconcile with the environment and tooling that Kubernetes provides. For example, providing endpoints for readiness and liveness checks or adjusting operation based on changes in the configuration or environment can help your applications use Kubernetes' dynamic deployment environment to their advantage.
Now that we have established some of the qualities that define highly functional container images, we can dive deeper into strategies that help you achieve these goals.
We can start off by examining the resources that container images are built from: base images. Each container image is built either from a parent image (an image used as a starting point) or from the abstract scratch layer, an empty image layer with no filesystem. A base image is a container image that serves as a foundation for future images by defining the basic operating system and providing core functionality. Images are comprised of one or more image layers built on top of one another to form a final image.
No standard utilities or filesystem are available when working directly from scratch, which means that you only have access to extremely limited functionality. While images created directly from scratch can be very streamlined and minimal, their main purpose is in defining base images. Typically, you want to build your container images on top of a parent image that sets up a basic environment in which your applications run, so that you do not have to construct a complete system for every image.
While there are base images for a variety of Linux distributions, it's best to be deliberate about which systems you choose. Each new machine will have to download the parent image and any additional layers you've added. For large images, this can consume a significant amount of bandwidth and noticeably lengthen the startup time of your containers on their first run. There is no way to pare down a parent image that's used downstream in the container build process, so starting with a minimal parent image is a good idea.
Feature-rich environments like Ubuntu allow your applications to run in an environment you're familiar with, but there are some tradeoffs to consider. Ubuntu images (and similar conventional distribution images) tend to be relatively large (over 100MB), meaning that any container images built from them will inherit that weight.
Alpine Linux is a popular alternative for base images because it successfully packs a lot of functionality into a very small base image (around 5MB). It includes a package manager with sizable repositories and has most of the standard utilities you would expect from a minimal Linux environment.
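For instance, a small, hypothetical utility image built on Alpine stays only a few megabytes larger than the base image itself:

FROM alpine:3.8
RUN apk add --no-cache curl
CMD ["curl", "--version"]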
When designing your applications, it's a good idea to try to reuse the same parent for each image. When your images share a parent image, machines running your containers only have to download the parent layer once. Afterwards, they will only need to download the layers that differ between your images. This means that if you have common features or functionality you would like to embed in each image, creating a common parent image to inherit from might be a good idea. Images that share a lineage help minimize the amount of extra data you need to download on fresh servers.
Once you have selected a parent image, you can define your container image by adding additional software, copying files, exposing ports, and choosing processes to run. Certain instructions in the image configuration file (a Dockerfile if you are using Docker) will add additional layers to your image.
For many of the same reasons mentioned in the previous section, it's important to be mindful of how you add layers to your images because of the resulting size, inheritance, and runtime complexity. To avoid building large, unwieldy images, it's important to develop a good understanding of how container layers interact, how the build engine caches layers, and how subtle differences in similar instructions can have a big impact on the images you create.
O Docker cria uma nova camada de imagem toda vez que executa as instruções RUN
, COPY
ou ADD
. Se você construir a imagem novamente, o mecanismo de construção verificará cada instrução para ver se ela possui uma camada de imagem armazenada em cache para a operação. Se ele encontrar uma correspondência no cache, ele usará a camada de imagem existente em vez de executar a instrução novamente e reconstruir a camada.
Esse processo pode reduzir significativamente os tempos de criação, mas é importante entender o mecanismo usado para evitar possíveis problemas. Para instruções de cópia de arquivos como COPY
e ADD
, o Docker compara os checksums dos arquivos para ver se a operação precisa ser executada novamente. Para instruções RUN
, o Docker verifica se possui uma camada de imagem existente armazenada em cache para aquela sequência de comandos específica.
While it may not be immediately obvious, this behavior can cause unexpected results if you are not careful. A common example is updating the local package index and installing packages in two separate steps. We will use Ubuntu for this example, but the basic premise applies equally well to base images for other distributions:
FROM ubuntu:18.04
RUN apt -y update
RUN apt -y install nginx
. . .
Here, the local package index is updated in one RUN instruction (apt -y update) and Nginx is installed in another operation. This works without a problem the first time it is used. However, if the Dockerfile is later updated to install an additional package, there may be problems:
FROM ubuntu:18.04
RUN apt -y update
RUN apt -y install nginx php-fpm
. . .
We have added a second package to the installation command run by the second instruction. If a significant amount of time has passed since the previous image build, the new build may fail. That is because the package index update instruction (RUN apt -y update) has not changed, so Docker reuses the image layer associated with that instruction. Since we are working from an old package index, the version of the php-fpm package in our local records may no longer be in the repositories, resulting in an error when the second instruction runs.
To avoid this scenario, make sure to consolidate any steps that are interdependent into a single RUN instruction so that Docker re-executes all of the necessary commands when a change occurs:
FROM ubuntu:18.04
RUN apt -y update && apt -y install nginx php-fpm
. . .
The instruction now updates the local package cache whenever the package list changes.
The previous example demonstrates how Docker's caching behavior can subvert expectations, but there are a few other things to keep in mind about the way RUN instructions interact with Docker's layering system. As mentioned earlier, at the end of each RUN instruction Docker commits the changes as an additional image layer. To exert control over the scope of the image layers that are produced, you can clean up unnecessary files in the final environment that will be committed by paying attention to the artifacts introduced by the commands you run.
In general, chaining commands together into a single RUN instruction offers a great deal of control over the layer that will be written. For each command, you can set up the state of the layer (apt -y update), perform the core command (apt -y install nginx php-fpm), and remove any unnecessary artifacts to clean up the environment before it is committed. For example, many Dockerfiles chain rm -rf /var/lib/apt/lists/* onto the end of apt commands, removing the downloaded package indexes, in order to reduce the final layer size:
FROM ubuntu:18.04
RUN apt -y update && apt -y install nginx php-fpm && rm -rf /var/lib/apt/lists/*
. . .
To further reduce the size of the image layers you are creating, it can be useful to limit other unintended side effects of the commands you run. For example, in addition to the explicitly declared packages, apt also installs "recommended" packages by default. You can include --no-install-recommends in your apt commands to remove this behavior. You may have to experiment to find out whether you rely on any of the functionality provided by recommended packages.
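As a rough sketch, one way to combine the flag with the consolidated instruction from the earlier example looks like the following (the package list is the same illustrative pair used above):
FROM ubuntu:18.04
# --no-install-recommends keeps apt from pulling in optional packages, shrinking the layer
RUN apt -y update && apt -y install --no-install-recommends nginx php-fpm && rm -rf /var/lib/apt/lists/*
. . .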
We have used package management commands in this section as an example, but the same principles apply to other scenarios. The general idea is to construct the prerequisite conditions, execute the minimum viable command, and then clean up any unnecessary artifacts in a single RUN command to reduce the overhead of the layer you will be producing.
Multi-stage builds were introduced in Docker 17.05, allowing developers to more tightly control the final runtime images they produce. Multi-stage builds let you divide your Dockerfile into multiple sections representing distinct stages, each with its own FROM statement specifying a separate parent image.
Earlier stages define images that can be used to build your application and prepare assets. These often contain build tools and development files that are needed to produce the application but are not required to run it. Each subsequent stage defined in the file has access to the artifacts produced by previous stages.
The last FROM statement defines the image that will be used to run the application. Typically, this is a pared-down image that installs only the necessary runtime requirements and then copies in the application artifacts produced by the previous stages.
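As a minimal sketch of this pattern, the Dockerfile below compiles a hypothetical Go program in a full build image and copies only the resulting binary into a small runtime image; the application name and paths are illustrative assumptions rather than part of the original example:
# Build stage: contains the compiler and development tooling
FROM golang:1.19 AS build
WORKDIR /src
COPY . .
# Produce a statically linked binary so it can run on a minimal base image
RUN CGO_ENABLED=0 go build -o /app ./...
# Runtime stage: only the compiled artifact is carried over
FROM alpine:3.17
COPY --from=build /app /usr/local/bin/app
CMD ["app"]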
This system lets you worry less about optimizing RUN instructions in the build stages, since those container layers will not be present in the final runtime image. You should still pay attention to how instructions interact with layer caching in the build stages, but your efforts can be directed toward minimizing build time rather than final image size. Paying attention to the instructions in the final stage is still important for reducing image size, but by separating the different stages of the container build, it is easier to obtain streamlined images without as much complexity in the Dockerfile.
While the choices you make regarding container build instructions are important, broader decisions about how to containerize your services often have a more direct impact on your success. In this section, we will talk a bit more about how to best transition your applications from a more conventional environment to a container platform.
Generally, it is good practice to package each piece of independent functionality into a separate container image.
This differs from common strategies in virtual machine environments, where applications are frequently grouped within the same image to reduce size and minimize the resources required to run the VM. Since containers are lightweight abstractions that do not virtualize the entire operating system stack, this approach is less compelling on Kubernetes. So while a web stack virtual machine might bundle an Nginx web server with a Gunicorn application server on a single machine to serve a Django application, in Kubernetes these might be split into separate containers.
Designing containers that implement a discrete piece of functionality for your services offers several advantages. Each container can be developed independently as long as standard interfaces between services are established. For instance, the Nginx container could be used to proxy to a number of different backends, or it could act as a load balancer if given a different configuration.
Once deployed, each container image can be scaled independently to address varying resource and load constraints. By splitting your applications into multiple container images, you gain flexibility in development, organization, and deployment.
In Kubernetes, pods are the smallest unit that can be managed directly by the control plane. Pods consist of one or more containers along with additional configuration data that tells the platform how those components should be run. The containers within a pod are always scheduled on the same worker node in the cluster, and the system automatically restarts failed containers. The pod abstraction is very useful, but it introduces another layer of decisions about how to group the components of your applications.
Like container images, pods also become less flexible when too much functionality is bundled into a single entity. Pods themselves can be scaled using other abstractions, but the containers inside them cannot be managed or scaled independently. So, to continue with our previous example, the separate Nginx and Gunicorn containers should probably not be packaged together into a single pod, so that they can be controlled and deployed separately.
However, there are scenarios where it does make sense to combine functionally different containers as a unit. In general, these are situations where an additional container supports or enhances the core functionality of the main container, or helps it adapt to its deployment environment. Some common patterns are sidecar, ambassador, and adapter containers.
As you may have noticed, each of these patterns supports the strategy of building standard, generic primary container images that can then be deployed in a variety of contexts and configurations. The secondary containers help bridge the gap between the primary container and the specific deployment environment being used. Some sidecar containers can also be reused to adapt multiple primary containers to the same environmental conditions. These patterns benefit from the shared filesystem and network namespace provided by the pod abstraction while still allowing independent development and flexible deployment of standardized containers.
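As a minimal sketch of this idea, the pod definition below pairs a hypothetical application container with a log-shipping sidecar that reads from a shared volume; all names and image references here are illustrative assumptions:
apiVersion: v1
kind: Pod
metadata:
  name: app-with-sidecar
spec:
  containers:
    - name: app                 # primary container writing logs to the shared volume
      image: example/app:1.0
      volumeMounts:
        - name: logs
          mountPath: /var/log/app
    - name: log-shipper         # sidecar that forwards those logs elsewhere
      image: example/log-shipper:1.0
      volumeMounts:
        - name: logs
          mountPath: /var/log/app
          readOnly: true
  volumes:
    - name: logs
      emptyDir: {}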
There is some tension between the desire to build reusable, standardized components and the requirements involved in adapting applications to their runtime environment. Runtime configuration is one of the best methods for bridging the gap between these concerns. Components are built to be both general and flexible, and the required behavior is described at runtime by providing the software with additional configuration information. This standard approach works for containers just as it does for applications.
Building with runtime configuration in mind requires you to think ahead during both the application development and containerization steps. Applications should be designed to read values from command line parameters, configuration files, or environment variables when they are launched or restarted. This configuration parsing and injection logic must be implemented in code before containerization.
When writing a Dockerfile, the container must also be designed with runtime configuration in mind. Containers have several mechanisms for providing data at runtime. Users can mount files or directories from the host as volumes within the container to enable file-based configuration. Likewise, environment variables can be passed into the internal container runtime when the container is started. The CMD and ENTRYPOINT Dockerfile instructions can also be defined in a way that allows runtime configuration information to be passed in as command parameters.
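One common way to express this in a Dockerfile is to set ENTRYPOINT to the fixed executable and CMD to default arguments that can be overridden at launch; the binary name and flag below are illustrative assumptions:
FROM alpine:3.17
COPY my-app /usr/local/bin/my-app
# ENTRYPOINT fixes the executable, while CMD supplies default arguments
# that can be replaced at runtime, e.g. docker run image --listen 0.0.0.0:9000
ENTRYPOINT ["my-app"]
CMD ["--listen", "0.0.0.0:8080"]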
Since Kubernetes manipulates higher-level objects such as pods rather than managing containers directly, there are mechanisms available for defining configuration and injecting it into the container environment at runtime. Kubernetes ConfigMaps and Secrets let you define configuration data separately and then project the values into the container environment as environment variables or files at runtime. ConfigMaps are general-purpose objects intended to store configuration data that might vary by environment, testing stage, and so on. Secrets offer a similar interface but are designed specifically for sensitive data such as account passwords or API credentials.
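As a brief sketch of how that projection works, the ConfigMap below defines a single key and a pod container references it as an environment variable; the object and key names are illustrative assumptions:
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  LOG_LEVEL: "info"
---
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: example/app:1.0
      env:
        # Project the ConfigMap key into the container as an environment variable
        - name: LOG_LEVEL
          valueFrom:
            configMapKeyRef:
              name: app-config
              key: LOG_LEVEL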
By understanding and correctly using the runtime configuration options available across these layers of abstraction, you can build flexible components that take their cues from the values provided by the environment. This makes it possible to reuse the same container images in very different scenarios, reducing development overhead and improving application flexibility.
When transitioning to container-based environments, users often start by moving existing workloads, with few or no changes, to the new system. They package applications in containers by wrapping the tools they are already using in the new abstraction. While it is helpful to use your usual patterns to get migrated applications up and running, dropping previous implementations into containers can sometimes lead to ineffective design.
Problems frequently arise when developers implement significant service management functionality inside containers. For example, running systemd services within the container or daemonizing web servers may be considered best practice in a normal computing environment, but they often conflict with assumptions inherent in the container model.
Hosts manage container lifecycle events by sending signals to the process operating as PID (process ID) 1 inside the container. PID 1 is the first process started, which would be the init system in traditional computing environments. However, because the host can only manage PID 1, using a conventional init system to manage processes within the container sometimes means there is no way to control the primary application. The host can start, stop, or kill the internal init system, but it cannot manage the primary application directly. The signals sometimes propagate the intended behavior to the running application, but this adds complexity and is not always necessary.
Most of the time, it is better to simplify the running environment inside the container so that PID 1 is running the primary application in the foreground. In cases where multiple processes must run, PID 1 is responsible for managing the lifecycle of the subsequent processes. Certain applications, like Apache, handle this natively by spawning and managing workers that handle connections. For other applications, a wrapper script or a very simple init system like dumb-init or the included tini init system can be used in some cases. Regardless of the implementation you choose, the process running as PID 1 within the container should respond appropriately to the TERM signals sent by Kubernetes in order to behave as expected.
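One way to sketch this in a Dockerfile is to install a lightweight init such as tini and run the application in the foreground beneath it, so signals reach the process tree correctly; the Alpine base and package names here are illustrative assumptions:
FROM alpine:3.17
# tini is packaged for Alpine and acts as a minimal PID 1 that forwards signals
RUN apk add --no-cache tini nginx
ENTRYPOINT ["/sbin/tini", "--"]
# Run Nginx in the foreground so the container's main process does not daemonize
CMD ["nginx", "-g", "daemon off;"]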
Kubernetes deployments and services offer lifecycle management for long-running processes and reliable, persistent access to applications, even when the underlying containers need to be restarted or the deployments themselves change. By extracting the responsibility of monitoring and maintaining service health out of the container, you can take advantage of the platform's tools for managing healthy workloads.
For Kubernetes to manage containers properly, it has to understand whether the applications running within them are healthy and capable of performing work. To enable this, containers can implement health checks: network endpoints or commands that can be used to report application health. Kubernetes periodically checks the defined liveness probes to determine whether the container is operating as expected. If the container does not respond appropriately, Kubernetes restarts it in an attempt to re-establish functionality.
Kubernetes also provides readiness probes, a similar construct. Rather than indicating whether the application in a container is healthy, readiness probes determine whether the application is ready to receive traffic. This can be useful when a containerized application has an initialization routine that must complete before it is ready to accept connections. Kubernetes uses readiness probes to decide whether to add a pod to, or remove it from, a service.
Defining endpoints for both of these probe types can help Kubernetes manage your containers efficiently and can prevent container lifecycle problems from affecting service availability. The mechanisms for responding to these kinds of health requests must be built into the application itself and must be exposed in the Docker image configuration.
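As a rough sketch, the pod definition below declares HTTP-based liveness and readiness probes against hypothetical /healthz and /ready endpoints; the paths, port, and timings are illustrative assumptions:
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: example/app:1.0
      ports:
        - containerPort: 8080
      livenessProbe:            # restart the container if this check keeps failing
        httpGet:
          path: /healthz
          port: 8080
        initialDelaySeconds: 10
        periodSeconds: 15
      readinessProbe:           # hold traffic until this check succeeds
        httpGet:
          path: /ready
          port: 8080
        periodSeconds: 5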
In this guide, we have covered some important considerations to keep in mind when running containerized applications in Kubernetes. To reiterate, some of the suggestions we examined were: using minimal, shareable parent images; managing image layers and the build cache deliberately; splitting discrete functionality into separate containers and grouping them thoughtfully into pods; designing for runtime configuration; running the main application in the foreground as PID 1; and exposing liveness and readiness probes.
Throughout the development and deployment process, you will need to make decisions that can affect your service's robustness and effectiveness. Understanding the ways containerized applications differ from conventional applications, and learning how they operate in a managed cluster environment, can help you avoid some common pitfalls and take full advantage of the capabilities Kubernetes provides.
Ok so… I have a root droplet in which I have my Spring boot application which users from all around the world can use. Now what I need is to get files that are being processed by the application and write them onto another Droplet accordingly (which may or may not be on the account where the root droplet is). So in a sense, I need for my root droplet to have access to child droplets with writing privileges in certain folders, of course in a secure manner.
Is something like this possible? And if it is what is the best approach?
Kind Regards, Luka Balac
Monitoring systems and infrastructure is a core responsibility of operations teams of all sizes. The industry has collectively developed many strategies and tools to help monitor servers, collect important data, and respond to incidents and changing conditions in varying environments. However, as software methodologies and infrastructure designs evolve, monitoring must adapt to meet new challenges and provide insight in relatively unfamiliar territory.
So far in this series, we have discussed what metrics, monitoring, and alerting are, and the qualities of good monitoring systems. We talked about gathering metrics from your infrastructure and applications and the important signals to monitor throughout your infrastructure. In our last guide, we covered how to put metrics and alerts into practice by understanding the individual components and the qualities of good alert design.
In this guide, we will look at how monitoring and metrics collection change for highly distributed architectures and microservices. The growing popularity of cloud computing, big data clusters, and instance orchestration layers has forced operations professionals to rethink how to design monitoring at scale and to tackle new problems with better instrumentation. We will talk about what sets the new deployment models apart and what strategies can be used to meet these new demands.
To model and mirror the systems they watch, monitoring infrastructure has always been somewhat distributed. However, many modern development practices, including designs around microservices, containers, and interchangeable, ephemeral compute instances, have dramatically changed the monitoring landscape. In many cases, the core features of these advancements are the very factors that make monitoring more difficult. Let's look at some of the ways they differ from traditional environments and how that affects monitoring.
Some of the most fundamental changes in the way many systems behave are due to an explosion in new layers of abstraction that software can be designed around. Container technology has changed the relationship between deployed software and the underlying operating system. Applications deployed in containers have different relationships to the outside world, to other programs, and to the host operating system than applications deployed through conventional means. Kernel and network abstractions can lead to different understandings of the operating environment depending on which layer you check.
This level of abstraction is incredibly helpful in many ways, creating consistent deployment strategies, making it easier to migrate work between hosts, and allowing developers to closely control their applications' runtime environments. However, these new capabilities come at the expense of increased complexity and a more distant relationship with the resources supporting each process.
One similarity between newer paradigms is a growing reliance on internal network communication to coordinate and accomplish tasks. What was formerly the domain of a single application may now be spread among many components that need to coordinate and share information. This has repercussions for communication infrastructure and for monitoring.
First, because these models are built on communication between small, discrete services, network health becomes more important than ever. In more traditional, monolithic architectures, coordinating tasks, sharing information, and organizing results were largely accomplished within applications using regular programming logic, or through a comparatively small amount of external communication. By contrast, the logical flow of highly distributed applications uses the network to synchronize, check the health of peers, and pass information. Network health and performance directly impact more functionality than before, which means more intensive monitoring is needed to guarantee correct operation.
While the network has become more critical than ever, monitoring it is increasingly challenging because of the extended number of participants and individual lines of communication. Instead of tracking interactions between a few applications, correct communication between dozens, hundreds, or thousands of different points becomes necessary to guarantee the same functionality. Beyond the added complexity, the increased volume of traffic also places more strain on available network resources, further compounding the need for reliable monitoring.
Above, we mentioned in passing the tendency of modern architectures to divide work and functionality among many smaller, discrete components. These designs can have a direct impact on the monitoring landscape because they make clarity and comprehensibility especially valuable but increasingly elusive.
More robust tooling and instrumentation are required to keep everything in good working order. However, because the responsibility for completing any task is fragmented and split between different workers (possibly on many different physical hosts), understanding where responsibility lies for performance issues or errors can be difficult. Requests and units of work that touch dozens of components, many of which are selected from pools of possible candidates, can make it impractical to visualize the request path or perform root cause analysis using traditional mechanisms.
An additional struggle in adapting conventional monitoring is monitoring short-lived or ephemeral units sensibly. Whether the units of interest are cloud compute instances, container instances, or other abstractions, these components often violate some of the assumptions made by conventional monitoring software.
For instance, to distinguish between a problematic node and an instance intentionally destroyed to scale down, the monitoring system must have a more intimate understanding of your provisioning and management layer than was previously necessary. In many modern systems, these events occur far more frequently, so manually adjusting the monitoring domain each time is not practical. The deployment environment shifts more rapidly with these designs, so the monitoring layer must adopt new strategies to remain valuable.
One question many systems must face is what to do with the data from destroyed instances. While work units may be provisioned and deprovisioned rapidly to accommodate variable demand, a decision has to be made about what to do with the data related to the old instances. Data does not necessarily lose its value immediately just because the underlying worker is no longer available. When hundreds or thousands of nodes may come and go each day, it can be difficult to know how best to construct a narrative about the overall operational health of your system from the fragmented data of short-lived instances.
Now that we have identified some of the unique challenges of distributed architectures and microservices, we can talk about how monitoring systems can work within these realities. Some of the solutions involve re-evaluating and isolating what is most valuable about different types of metrics, while others involve new tooling or new ways of understanding the environment they inhabit.
The increase in total traffic volume caused by the larger number of services is one of the most straightforward problems to think about. Beyond the increase in transfer volumes caused by new architectures, monitoring activity itself can begin to bog down the network and steal resources from hosts. To deal with the increased volume, you can either scale out your monitoring infrastructure or reduce the resolution of the data you work with. Both approaches are worth examining, but we will focus on the second since it represents a more extensible and broadly useful solution.
Changing your data sampling rates can minimize the amount of data your system needs to collect from hosts. Sampling is a normal part of metrics collection that determines how frequently you request new values for a metric. Increasing the sampling interval will reduce the amount of data you have to handle, but it will also reduce the resolution, the level of detail, of your data. While you must be careful to understand your minimum useful resolution, tuning data collection rates can have a profound impact on how many monitoring clients your system can adequately serve.
To reduce the loss of information that comes with lower resolutions, one option is to continue collecting data on hosts at the same frequency but compile it into more digestible numbers for transfer over the network. Individual machines can aggregate and average metric values and send summaries to the monitoring system. This can help reduce network traffic while maintaining accuracy, since a large number of data points are still taken into account. Note that this helps reduce data collection's influence on the network but does not, by itself, help with the strain involved in gathering those numbers on the host.
As mentioned above, one of the major differentiators between traditional systems and modern architectures is the breakdown of which components participate in handling requests. In distributed systems and microservices, a unit of work is much more likely to be given to a pool of workers through some type of scheduling or arbitration layer. This has implications for many of the automated processes you might build around monitoring.
In environments that use pools of interchangeable workers, health checking and alerting policies can have complex relationships with the infrastructure they monitor. Health checks on individual workers can be useful for automatically decommissioning and recycling defective units. However, if you have that automation in place, at scale it matters less whether a single web server fails within a large pool. The system will self-correct to make sure only healthy units are in the active pool receiving requests.
Although host health checks can catch defective units, health checking the pool itself is more appropriate for alerting. The pool's ability to satisfy the current workload has greater bearing on user experience than the capabilities of any individual worker. Alerts based on the number of healthy members, the latency of the pool aggregate, or the pool error rate can notify operators of problems that are harder to mitigate automatically and more likely to impact users.
In general, the monitoring layer in distributed systems needs a more complete understanding of the deployment environment and the provisioning mechanisms. Automated lifecycle management becomes extremely valuable because of the number of individual units involved in these architectures. Whether the units are plain containers, containers within an orchestration framework, or compute nodes in a cloud environment, there is a management layer that exposes health information and accepts commands to scale and respond to events.
The number of pieces in play increases the statistical likelihood of failure. With all other factors being equal, this would require more human intervention to respond to and mitigate these issues. Since the monitoring system is responsible for identifying failures and service degradation, hooking it into the platform's control interfaces can alleviate a large class of these issues. An immediate, automated response triggered by the monitoring software can help maintain your system's operational health.
This close relationship between the monitoring system and the deployment platform is not necessarily required or common in other architectures. But automated distributed systems aim to be self-regulating, with the ability to scale and adjust based on preconfigured rules and observed status. The monitoring system in this case takes on a central role in controlling the environment and deciding when to act.
Another reason the monitoring system must have knowledge of the provisioning layer is to deal with the side effects of ephemeral instances. In environments with frequent turnover in work instances, the monitoring system depends on information from a side channel to understand whether actions were intentional or not. For example, systems that can read API events from a provisioner can react differently when a server is destroyed intentionally by an operator than when a server suddenly becomes unresponsive with no associated event. Being able to differentiate between these events helps your monitoring remain useful, accurate, and trustworthy even though the underlying infrastructure may change frequently.
One of the most challenging aspects of highly distributed workloads is understanding the interplay between different components and isolating responsibility when attempting root cause analysis. Since a single request might touch dozens of small programs to generate a response, it can be difficult to interpret where bottlenecks or performance changes originate. To provide better information about how each component contributes to latency and processing overhead, a technique called distributed tracing emerged.
Distributed tracing is an approach to instrumenting systems that works by adding code to each component to illuminate request processing as it traverses your services. Each request receives a unique identifier at the edge of your infrastructure that is passed along as the task moves through your services. Each service then uses this ID to report errors and the timestamps for when it first saw the request and when it handed it off to the next stage. By aggregating the components' reports using the request ID, a detailed path with accurate timing data can be traced through your infrastructure.
This method can be used to understand how much time is spent on each part of a process and to clearly identify any serious increases in latency. This extra instrumentation is a way of adapting metrics collection to a large number of processing components. When mapped visually with time on the x axis, the resulting display shows the relationships between stages, how long each process ran, and the dependencies between events that must run in parallel. This can be incredibly useful for understanding how to improve your systems and where time is being spent.
We have discussed how distributed architectures can make root cause analysis and operational clarity difficult to achieve. In many cases, changing the way humans respond to and investigate issues is part of the answer to these ambiguities. Setting up tools to expose information in a way that lets you analyze the situation methodically can help you sort through the many layers of data available. In this section, we will discuss ways to set yourself up for success when troubleshooting issues in large, distributed environments.
The first step in making sure you can respond to problems in your systems is knowing when they are occurring. In our guide on gathering metrics from your infrastructure and applications, we introduced the four golden signals, the monitoring indicators identified by the Google SRE team as the most vital to track: latency, traffic, errors, and saturation.
These are still the best places to start when instrumenting your systems, but the number of layers that must be watched usually increases for highly distributed systems. The underlying infrastructure, the orchestration plane, and the working layer all need robust monitoring with thoughtful alerts defined to identify important changes.
Once your systems have identified an anomaly and notified your team, your team needs to begin gathering data. Before continuing past this step, they should understand which components were affected, when the incident began, and which specific alert condition was triggered.
The most useful way to begin understanding the scope of an incident is to start at a high level. Begin investigating by checking dashboards and visualizations that collect and generalize information from your systems. This can help you quickly identify correlated factors and understand the immediate user-facing impact. During this process, you should be able to overlay information from different components and hosts.
The goal of this stage is to start building a mental or physical inventory of items to check in more detail and to begin prioritizing your investigation. If you can identify a chain of related issues that runs through different layers, the lowest layer should take precedence: fixes to foundational layers often resolve symptoms at higher levels. The list of affected systems can also serve as an informal checklist of places to validate fixes against later, once mitigation is implemented.
Once you feel you have a reasonable view of the incident, drill down into the components and systems on your list in order of priority. Detailed metrics about individual units will help you trace the route of the failure to the lowest responsible resource. While examining more fine-grained dashboards and log entries, refer to the list of affected components to try to understand how side effects are being propagated through the system. With microservices, the number of interdependent components means problems spill over into other services more frequently.
This stage is focused on isolating the service, component, or system responsible for the initial incident and identifying the specific problem that is occurring. This might be newly deployed code, faulty physical infrastructure, a mistake or bug in the orchestration layer, or a change in workload that the system could not handle gracefully. Diagnosing what is happening and why lets you discover how to mitigate the issue and regain operational health. Understanding the extent to which resolving this issue might fix problems reported on other systems can help you continue prioritizing mitigation tasks.
Once the details have been identified, you can work to resolve or mitigate the problem. In many cases, there may be an obvious, quick way to restore service by providing more resources, rolling back, or redirecting traffic to an alternative implementation. In these scenarios, resolution breaks down into three phases: mitigating the immediate problem to restore service, doing the follow-up work needed to bring the system back to full health, and analyzing the root cause to prevent recurrence.
In many distributed systems, redundancy and highly available components will ensure that service is restored quickly, though more work may be needed in the background to restore redundancy or bring the system out of a degraded state. You should use the list of impacted components compiled earlier as a measuring stick to determine whether the initial mitigation resolves cascading service issues. As monitoring systems grow more sophisticated, they may also automate some of these fuller recovery processes by sending commands to the provisioning layer to launch new instances of failed units or to cycle out misbehaving ones.
Given the automation possible in the first two phases, the most important work for the operations team is often understanding the root causes of an event. The knowledge gained from this process can be used to develop new triggers and policies to help predict future occurrences and further automate the system's reactions. Monitoring software often gains new capabilities in response to each incident to guard against newly discovered failure scenarios. For distributed systems, distributed traces, log entries, time series visualizations, and events such as recent deployments can help you reconstruct the sequence of events and identify where software and human processes could be improved.
Because of the particular complexity inherent in large distributed systems, it is important to treat the resolution of any significant event as an opportunity to learn from and fine-tune your systems. The number of separate components and communication paths involved forces heavy reliance on automation and tooling to help manage complexity. Encoding new lessons into the response mechanisms and rule sets of these components (as well as into the operational policies your team follows) is the best way for your monitoring system to keep your team's management footprint in check.
In this guide, we talked about some of the specific challenges that distributed architectures and microservice designs can introduce for monitoring and visibility software. Modern ways of building systems break some of the assumptions of traditional methods, requiring different approaches to handle the new configuration environments. We explored the adjustments you will need to consider as you move from monolithic systems toward those that increasingly depend on ephemeral, cloud- or container-based workers and high-volume network coordination. Afterwards, we discussed some ways your system architecture might affect the way you respond to incidents and their resolution.
As a broader subject, configuration management (CM) refers to the process of systematically handling changes to a system in a way that maintains its integrity over time. Even though this process did not originate in the IT industry, the term is broadly used to refer to server configuration management.
Automation plays an essential role in server configuration management. It’s the mechanism used to make the server reach a desirable state, previously defined by provisioning scripts using a tool’s specific language and features. Automation is, in fact, the heart of configuration management for servers, and that’s why it’s common to also refer to configuration management tools as Automation Tools or IT Automation Tools.
Another common term used to describe the automation features implemented by configuration management tools is Server Orchestration or IT Orchestration, since these tools are typically capable of managing one to hundreds of servers from a central controller machine.
There are a number of configuration management tools available in the market. Puppet, Ansible, Chef and Salt are popular choices. Although each tool will have its own characteristics and work in slightly different ways, they are all driven by the same purpose: to make sure the system’s state matches the state described by your provisioning scripts.
Although the use of configuration management typically requires more initial planning and effort than manual system administration, all but the simplest of server infrastructures will be improved by the benefits that it provides. To name a few:
Whenever a new server needs to be deployed, a configuration management tool can automate most, if not all, of the provisioning process for you. Automation makes provisioning much quicker and more efficient because it allows tedious tasks to be performed faster and more accurately than any human could. Even with proper and thorough documentation, manually deploying a web server, for instance, could take hours compared to a few minutes with configuration management/automation.
With quick provisioning comes another benefit: quick recovery from critical events. When a server goes offline due to unknown circumstances, it might take several hours to properly audit the system and find out what really happened. In scenarios like this, deploying a replacement server is usually the safest way to get your services back online while a detailed inspection is done on the affected server. With configuration management and automation, this can be done in a quick and reliable way.
At first glance, manual system administration may seem to be an easy way to deploy and quickly fix servers, but it often comes with a price. With time, it may become extremely difficult to know exactly what is installed on a server and which changes were made, when the process is not automated. Manual hotfixes, configuration tweaks, and software updates can turn servers into unique snowflakes, hard to manage and even harder to replicate. By using a configuration management tool, the procedure necessary for bringing up a new server or updating an existing one will be all documented in the provisioning scripts.
Once you have your server setup translated into a set of provisioning scripts, you will have the ability to apply to your server environment many of the tools and workflows you normally use for software source code.
Version control tools, such as Git, can be used to keep track of changes made to the provisioning and to maintain separate branches for legacy versions of the scripts. You can also use version control to implement a code review policy for the provisioning scripts, where any changes should be submitted as a pull request and approved by a project lead before being accepted. This practice will add extra consistency to your infrastructure setup.
Configuration management makes it trivial to replicate environments with the exact same software and configurations. This enables you to effectively build a multistage ecosystem, with production, development, and testing servers. You can even use local virtual machines for development, built with the same provisioning scripts. This practice will minimize problems caused by environment discrepancies that frequently occur when applications are deployed to production or shared between co-workers with different machine setups (different operating system, software versions and/or configurations).
Even though each CM tool has its own terms, philosophy and ecosystem, they typically share many characteristics and have similar concepts.
Most configuration management tools use a controller/master and node/agent model. Essentially, the controller directs the configuration of the nodes, based on a series of instructions or tasks defined in your provisioning scripts.
Below you can find the most common features present in most configuration management tools for servers:
Each CM tool provides a specific syntax and a set of features that you can use to write provisioning scripts. Most tools will have features that make their language similar to conventional programming languages, but in a simplified way. Variables, loops, and conditionals are common features provided to facilitate the creation of more versatile provisioning scripts.
Configuration management tools keep track of the state of resources in order to avoid repeating tasks that were executed before. If a package was already installed, the tool won’t try to install it again. The objective is that after each provisioning run the system reaches (or keeps) the desired state, even if you run it multiple times. This is what characterizes these tools as having an idempotent behavior. This behavior is not necessarily enforced in all cases, though.
Configuration management tools usually provide detailed information about the system being provisioned. This data is available through global variables, known as facts. They include things like network interfaces, IP addresses, operating system, and distribution. Each tool will provide a different set of facts. They can be used to make provisioning scripts and templates more adaptive for multiple systems.
Most CM tools will provide a built-in templating system that can be used to facilitate setting up configuration files and services. Templates usually support variables, loops, and conditionals that can be used to maximise versatility. For instance, you can use a template to easily set up a new virtual host within Apache, while reusing the same template for multiple server installations. Instead of having only hard-coded, static values, a template should contain placeholders for values that can change from host to host, such as ServerName and DocumentRoot.
Even though provisioning scripts can be very specialized for the needs and demands of a particular server, there are many cases when you have similar server setups or parts of a setup that could be shared between multiple servers. Most provisioning tools will provide ways in which you can easily reuse and share smaller chunks of your provisioning setup as modules or plugins.
Third-party modules and plugins are often easy to find on the Internet, especially for common server setups like installing a PHP web server. CM tools tend to have a strong community built around them and users are encouraged to share their custom extensions. Using extensions provided by other users can save you a lot of time, while also serving as an excellent way of learning how other users solved common problems using your tool of choice.
There are many CM tools available in the market, each one with a different set of features and different complexity levels. Popular choices include Chef, Ansible, and Puppet. The first challenge is to choose a tool that is a good fit for your needs.
There are a few things you should take into consideration before making a choice:
Most configuration management tools require a minimum hierarchy consisting of a controller machine and a node that will be managed by it. Puppet, for example, requires an agent application to be installed on each node, and a master application to be installed on the controller machine. Ansible, on the other hand, has a decentralized structure that doesn't require installation of additional software on the nodes, but relies on SSH to execute the provisioning tasks. For smaller projects, a simplified infrastructure might seem like a better fit; however, it is important to take into consideration aspects like scalability and security, which may not be enforced by the tool.
Some tools can have more components and moving parts, which might increase the complexity of your infrastructure, impacting on the learning curve and possibly increasing the overall cost of implementation.
As mentioned earlier in this article, CM tools provide a custom syntax, sometimes using a Domain Specific Language (DSL), and a set of features that comprise their framework for automation. As with conventional programming languages, some tools have a steeper learning curve than others before they can be mastered. The infrastructure requirements might also influence the complexity of the tool and how quickly you will be able to see a return on investment.
Most CM tools offer free or open source versions, with paid subscriptions for advanced features and services. Some tools will have more limitations than others, so depending on your specific needs and how your infrastructure grows, you might end up having to pay for these services. You should also consider training as a potential extra cost, not only in monetary terms, but also regarding the time that will be necessary to get your team up to speed with the tool you end up choosing.
As mentioned before, most tools offer paid services that can include support, extensions, and advanced tooling. It’s important to analyse your specific needs, the size of your infrastructure and whether or not there is a need for using these services. Management panels, for instance, are a common service offered by these tools, and they can greatly facilitate the process of managing and monitoring all your servers from a central point. Even if you don’t need such services just yet, consider the options for a possible future necessity.
A strong and welcoming community can be extremely resourceful for support and for documentation, since users are typically happy to share their knowledge and their extensions (modules, plugins, and provisioning scripts) with other users. This can be helpful to speed up your learning curve and avoid extra costs with paid support or training.
The table below should give you a quick overview of the main differences between three of the most popular configuration management tools available in the market today: Ansible, Puppet, and Chef.
| | Ansible | Puppet | Chef |
|---|---|---|---|
| Script Language | YAML | Custom DSL based on Ruby | Ruby |
| Infrastructure | Controller machine applies configuration on nodes via SSH | Puppet Master synchronizes configuration on Puppet Nodes | Chef Workstations push configuration to Chef Server, from which the Chef Nodes will be updated |
| Requires specialized software for nodes | No | Yes | Yes |
| Provides centralized point of control | No. Any computer can be a controller | Yes, via Puppet Master | Yes, via Chef Server |
| Script Terminology | Playbook / Roles | Manifests / Modules | Recipes / Cookbooks |
| Task Execution Order | Sequential | Non-Sequential | Sequential |
So far, we’ve seen how configuration management works for servers, and what to consider when choosing a tool for building your configuration management infrastructure. In subsequent guides in this series, we will have a hands-on experience with three popular configuration management tools: Ansible, Puppet and Chef.
In order to give you a chance to compare these tools by yourself, we are going to use a simple example of server setup that should be fully automated by each tool. This setup consists of an Ubuntu 18.04 server running Apache to host a simple web page.
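As a rough sketch of what that automation might look like with Ansible (the playbooks in the follow-up guides may differ), a minimal playbook for this setup could resemble the following; the inventory group name and the local index.html file are illustrative assumptions:
---
- hosts: webservers
  become: true
  tasks:
    - name: Install Apache
      apt:
        name: apache2
        state: present
        update_cache: true
    - name: Copy the web page into place
      copy:
        src: index.html
        dest: /var/www/html/index.html
    - name: Ensure Apache is running and enabled at boot
      service:
        name: apache2
        state: started
        enabled: true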
Configuration management can drastically improve the integrity of servers over time by providing a framework for automating processes and keeping track of changes made to the system environment. In the next guide in this series, we will see how to implement a configuration management strategy in practice using Ansible as the tool.
A service mesh is an infrastructure layer that allows you to manage communication between your application's microservices. As more developers work with microservices, service meshes have evolved to make that work easier and more effective by consolidating common management and administrative tasks in a distributed setup.
Taking a microservice approach to application architecture involves breaking your application into a collection of loosely coupled services. This approach offers certain benefits: teams can iterate on designs and scale quickly, using a wider range of tools and languages. On the other hand, microservices pose new challenges for operational complexity, data consistency, and security.
Service meshes are designed to address some of these challenges by offering a granular level of control over how services communicate with one another. Specifically, they offer developers a way to manage service discovery, routing and traffic configuration, encryption and authentication/authorization, and metrics and monitoring.
While it is possible to do these tasks natively with container orchestrators like Kubernetes, this approach involves a greater amount of up-front decision-making and administration when compared to what service mesh solutions like Istio and Linkerd offer out of the box. In this sense, service meshes can streamline and simplify the process of working with common components in a microservice architecture. In some cases, they can even extend the functionality of those components.
Service meshes are designed to address some of the challenges inherent in distributed application architectures.
These architectures grew out of the three-tier application model, which broke applications into a web tier, an application tier, and a database tier. At scale, this model has proved challenging for organizations experiencing rapid growth. Monolithic application code bases can grow into unwieldy "big balls of mud", posing challenges for development and deployment.
In response to this problem, organizations like Google, Netflix, and Twitter developed internal "fat client" libraries to standardize runtime operations across services. These libraries provided load balancing, circuit breaking, routing, and telemetry, precursors to service mesh capabilities. However, they also imposed limitations on the languages developers could use and required changes across services when the libraries themselves were updated or changed.
A microservice design avoids some of these issues. Instead of having a large, centralized application code base, you have a collection of discretely managed services that each represent a feature of your application. The benefits of a microservice approach include:
At the same time, microservices have also created challenges:
Service meshes are designed to address these issues by offering coordinated and granular control over how services communicate. In the sections that follow, we will look at how service meshes facilitate service-to-service communication through service discovery, routing and internal load balancing, traffic configuration, encryption, authentication and authorization, and metrics and monitoring. We will use Istio's Bookinfo sample application, four microservices that together display information about particular books, as a concrete example to illustrate how service meshes work.
In a distributed framework, it’s necessary to know how to connect to services and whether or not they are available. Service instance locations are assigned dynamically on the network and information about them is constantly changing as containers are created and destroyed through autoscaling, upgrades, and failures.
Historically, there have been a few tools for doing service discovery in a microservice framework. Key-value stores like etcd were paired with other tools like Registrator to offer service discovery solutions. Tools like Consul iterated on this by combining a key-value store with a DNS interface that allows users to work directly with their DNS server or node.
Taking a similar approach, Kubernetes offers DNS-based service discovery by default. With it, you can look up services and service ports, and do reverse IP lookups using common DNS naming conventions. In general, an A record for a Kubernetes service matches this pattern: service.namespace.svc.cluster.local. Let’s look at how this works in the context of the Bookinfo application. If, for example, you wanted information on the details service from the Bookinfo app, you could look at the relevant entry in the Kubernetes dashboard:
This will give you relevant information about the Service name, namespace, and ClusterIP, which you can use to connect with your Service even as individual containers are destroyed and recreated.
A service mesh like Istio also offers service discovery capabilities. To do service discovery, Istio relies on communication between the Kubernetes API, Istio’s own control plane, managed by the traffic management component Pilot, and its data plane, managed by Envoy sidecar proxies. Pilot interprets data from the Kubernetes API server to register changes in Pod locations. It then translates that data into a canonical Istio representation and forwards it onto the sidecar proxies.
This means that service discovery in Istio is platform agnostic, which we can see by using Istio’s Grafana add-on to look at the details service again in Istio’s service dashboard:
Our application is running on a Kubernetes cluster, so once again we can see the relevant DNS information about the details Service, along with other performance data.
In a distributed architecture, it’s important to have up-to-date, accurate, and easy-to-locate information about services. Both Kubernetes and service meshes like Istio offer ways to obtain this information using DNS conventions.
Managing traffic in a distributed framework means controlling how traffic gets to your cluster and how it’s directed to your services. The more control and specificity you have in configuring external and internal traffic, the more you will be able to do with your setup. For example, in cases where you are working with canary deployments, migrating applications to new versions, or stress testing particular services through fault injection, having the ability to decide how much traffic your services are getting and where it is coming from will be key to the success of your objectives.
Kubernetes offers different tools, objects, and services that allow developers to control external traffic to a cluster: kubectl proxy, NodePort, Load Balancers, and Ingress Controllers and Resources. Both kubectl proxy and NodePort allow you to quickly expose your services to external traffic: kubectl proxy creates a proxy server that allows access to static content with an HTTP path, while NodePort exposes a randomly assigned port on each node. Though this offers quick access, drawbacks include having to run kubectl as an authenticated user, in the case of kubectl proxy, and a lack of flexibility in ports and node IPs, in the case of NodePort. And though a Load Balancer optimizes for flexibility by attaching to a particular Service, each Service requires its own Load Balancer, which can be costly.
An Ingress Resource and Ingress Controller together offer a greater degree of flexibility and configurability over these other options. Using an Ingress Controller with an Ingress Resource allows you to route external traffic to Services and configure internal routing and load balancing. To use an Ingress Resource, you need to configure your Services, the Ingress Controller and LoadBalancer, and the Ingress Resource itself, which will specify the desired routes to your Services. Currently, Kubernetes supports its own Nginx Controller, but there are other options you can choose from as well, managed by Nginx, Kong, and others.
Istio iterates on the Kubernetes Controller/Resource pattern with Istio Gateways and VirtualServices. Like an Ingress Controller, a Gateway defines how incoming traffic should be handled, specifying exposed ports and protocols to use. It works in conjunction with a VirtualService, which defines routes to Services within the mesh. Both of these resources communicate information to Pilot, which then forwards that information to the Envoy proxies. Though they are similar to Ingress Controllers and Resources, Gateways and VirtualServices offer a different level of control over traffic: instead of combining Open Systems Interconnection (OSI) layers and protocols, Gateways and VirtualServices allow you to differentiate between OSI layers in your settings. For example, by using VirtualServices, teams working with application layer specifications could have a separation of concerns from security operations teams working with different layer specifications. VirtualServices make it possible to separate work on discrete application features or within different trust domains, and can be used for things like canary testing, gradual rollouts, A/B testing, etc.
To visualize the relationship between services, you can use Istio’s Servicegraph add-on, which produces a dynamic representation of the relationship between services using real-time traffic data. Without any custom routing applied, the Bookinfo application might look like this:
Similarly, you can use a visualization tool like Weave Scope to see the relationship between your services at a given point in time. The Bookinfo application without advanced routing might look like this:
When configuring application traffic in a distributed framework, there are a number of different solutions, from Kubernetes-native options to service meshes like Istio, that offer various ways to determine how external traffic will reach your application resources and how those resources will communicate with one another.
A distributed framework creates opportunities for security vulnerabilities. Instead of communicating through local internal calls, as they would in a monolithic setup, services in a microservice architecture transmit information, including privileged information, over the network. Overall, this creates a larger surface area for attacks.
Securing Kubernetes clusters involves a range of procedures; we will focus on authentication, authorization, and encryption. Kubernetes offers native approaches to each of these, including encryption of Secrets data in etcd and securing the traffic that flows through the kube-apiserver. You can also use an overlay network such as Weave Net to encrypt traffic between Pods. Configuring individual security policies and protocols in Kubernetes requires administrative investment. A service mesh like Istio can consolidate some of these activities.
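As a brief illustration of the Kubernetes-native approach to authorization, access is commonly scoped with RBAC objects such as Roles and RoleBindings; the names below are hypothetical and shown only as a sketch.

```yaml
# Hypothetical RBAC example: grant one user read-only access to Pods
# in the "default" namespace. All names are illustrative assumptions.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: default
  name: pod-reader
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "watch", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: default
subjects:
  - kind: User
    name: jane
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```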
Istio was designed to automate some of the work of securing services. Its control plane includes several components that handle security.
For example, when you create a service, Citadel receives that information from the kube-apiserver and creates SPIFFE certificates and keys for that service. It then transfers this information to Pods and Envoy sidecars to facilitate communication between services.
You can also implement some security features by enabling mutual TLS during the Istio installation. These include strong service identities for cross-cluster and intra-cluster communication, secure service-to-service and user-to-service communication, and a key management system that can automate the creation, distribution, and rotation of keys and certificates.
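As a sketch of what enabling mutual TLS can look like, Istio releases from this era accept a mesh-wide MeshPolicy resource such as the following; newer Istio versions replace this API with PeerAuthentication, so check your version before applying anything similar.

```yaml
# Sketch: enable mutual TLS mesh-wide. Applies to Istio versions that ship the
# authentication.istio.io/v1alpha1 API; newer releases use PeerAuthentication.
apiVersion: authentication.istio.io/v1alpha1
kind: MeshPolicy
metadata:
  name: default
spec:
  peers:
    - mtls: {}
```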
By iterating on how Kubernetes handles authentication, authorization, and encryption, service meshes like Istio are able to consolidate and extend some of the recommended best practices for running a secure Kubernetes cluster.
Distributed environments have changed the requirements for metrics and monitoring. Monitoring tools need to be adaptive, accounting for frequent changes to services and network addresses, and comprehensive, allowing for the amount and type of information passing between services.
Kubernetes includes some internal monitoring tools by default. These resources belong to its resource metrics pipeline, which ensures that the cluster runs as expected. The cAdvisor component collects network usage, memory, and CPU statistics from individual containers and nodes and passes that information to the kubelet; the kubelet in turn exposes that information via a REST API. The metrics server gets this information from the API and passes it to the kube-aggregator for formatting.
You can extend these internal tools and monitoring capabilities with a full metrics solution. Using a service like Prometheus as a metrics aggregator, you can build directly on top of the Kubernetes resource metrics pipeline. Prometheus integrates directly with cAdvisor through its own agents, located on the nodes. Its main aggregation service collects and stores data from the nodes and exposes it through dashboards and APIs. Additional storage and visualization options are also available if you choose to integrate the main aggregation service with backend storage, logging, and visualization tools such as InfluxDB, Grafana, ElasticSearch, Logstash, Kibana, and others.
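A minimal sketch of a Prometheus scrape job that reads cAdvisor metrics through the kubelet on each node is shown below; it assumes an in-cluster Prometheus with a service account permitted to read node metrics, and the relabeling used in production setups often differs.

```yaml
# Sketch of a Prometheus scrape job for kubelet/cAdvisor metrics.
# Assumes Prometheus runs in-cluster with suitable RBAC; paths follow the
# common in-cluster service account conventions.
scrape_configs:
  - job_name: kubernetes-cadvisor
    scheme: https
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      insecure_skip_verify: true
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    kubernetes_sd_configs:
      - role: node
    relabel_configs:
      # Point the scrape at the kubelet's embedded cAdvisor endpoint.
      - target_label: __metrics_path__
        replacement: /metrics/cadvisor
```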
In a service mesh like Istio, the structure of the full metrics pipeline is part of the mesh’s design. Envoy sidecars operating at the Pod level report metrics to Mixer, which manages policies and telemetry. Additionally, the Prometheus and Grafana services are enabled by default (though if you are installing Istio with Helm you will need to specify grafana.enabled=true during installation). As with the full metrics pipeline, you can also configure other services and deployments for logging and visualization options.
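For example, the equivalent Helm values override might look like the following sketch; the grafana.enabled flag comes from the text above, while the prometheus block is an assumption included for completeness, since many chart versions enable it by default.

```yaml
# Helm values override for the Istio chart, equivalent to passing
# --set grafana.enabled=true on the command line. The prometheus block is an
# assumption; many chart versions already enable it by default.
grafana:
  enabled: true
prometheus:
  enabled: true
```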
With these metrics and visualization tools, you can access current information about services and workloads in one central place. For example, a global view of the BookInfo application might look like this in Istio’s Grafana dashboard:
By replicating the structure of a Kubernetes full metrics pipeline and simplifying access to some of its common components, service meshes like Istio streamline the process of collecting and visualizing data when working with a cluster.
Microservice architectures are designed to make application development and deployment faster and more reliable. However, an increase in communication between services has changed best practices for certain administrative tasks. This article discusses some of those tasks, how they are handled in a Kubernetes-native context, and how they can be managed using a service mesh, in this case Istio.
For more information on some of the Kubernetes topics covered here, please see the following resources:
Additionally, the Kubernetes and Istio documentation hubs are great places to find detailed information about the topics discussed here.
Secure, reliable data storage is a must for nearly every modern application. However, the infrastructure needed for a self-managed, on-premises database can be prohibitively expensive for many teams. Similarly, employees who have the skills and experience needed to maintain a production database effectively can be difficult to come by.
The spread of cloud computing services has lowered the barriers to entry associated with provisioning a database, but many developers still lack the time or expertise needed to manage and tune a database to suit their needs. For this reason, many businesses are turning to managed database services to help them build and scale their databases in line with their growth.
In this conceptual article, we will go over what managed databases are and how they can be beneficial to many organizations. We will also cover some practical considerations one should make before building their next application on top of a managed database solution.
A managed database is a cloud computing service in which the end user pays a cloud service provider for access to a database. Unlike a typical database, users don’t have to set up or maintain a managed database on their own; rather, it’s the provider’s responsibility to oversee the database’s infrastructure. This allows the user to focus on building their application instead of spending time configuring their database and keeping it up to date.
The process of provisioning a managed database varies by provider, but in general it’s similar to that of any other cloud-based service. After registering an account and logging in to the dashboard, the user reviews the available database options — such as the database engine and cluster size — and then chooses the setup that’s right for them. After you provision the managed database, you can connect to it through a GUI or client and can then begin loading data and integrating the database with your application.
Managed data solutions simplify the process of provisioning and maintaining a database. Instead of running commands from a terminal to install and set one up, you can deploy a production-ready database with just a few clicks in your browser. By simplifying and automating database management, cloud providers make it easier for anyone, even novice database users, to build data-driven applications and websites. This was the result of a decades-long trend towards simplifying, automating, and abstracting various database management tasks, which was itself a response to pain points long felt by database administrators.
Prior to the rise of the cloud computing model, any organization in need of a data center had to supply all the time, space, and resources that went into setting one up. Once their database was up and running, they also had to maintain the hardware, keep its software updated, hire a team to manage the database, and train their employees on how to use it.
As cloud computing services grew in popularity in the 2000s, it became easier and more affordable to provision server infrastructure, since the hardware and the space required for it no longer had to be owned or managed by those using it. Likewise, setting up a database entirely within the cloud became far less difficult; a business or developer would just have to requisition a server, install and configure their chosen database management system, and begin storing data.
While cloud computing did make the process of setting up a traditional database easier, it didn’t address all of its problems. For instance, in the cloud it can still be difficult to pinpoint the ideal size of a database’s infrastructure footprint before it begins collecting data. This is important because cloud consumers are charged based on the resources they consume, and they risk paying for more than what they require if the server they provision is larger than necessary. Additionally, as with traditional on-premises databases, managing one’s database in the cloud can be a costly endeavor. Depending on your needs, you may still need to hire an experienced database administrator or spend a significant amount of time and money training your existing staff to manage your database effectively.
Many of these issues are compounded for smaller organizations and independent developers. While a large business can usually afford to hire employees with a deep knowledge of databases, smaller teams usually have fewer resources available, leaving them with only their existing institutional knowledge. This makes tasks like replication, migrations, and backups all the more difficult and time consuming, as they can require a great deal of on-the-job learning as well as trial and error.
Managed databases help to resolve these pain points with a host of benefits to businesses and developers. Let’s walk through some of these benefits and how they can impact development teams.
Managed database services can help to reduce many of the headaches associated with provisioning and managing a database. For one thing, developers build applications on top of managed database services to drastically speed up the process of provisioning a database server. With a self-managed solution, you must obtain a server (either on-premises or in the cloud), connect to it from a client or terminal, configure and secure it, and then install and set up the database management software before you can begin storing data. With a managed database, you only have to decide on the initial size of the database server, configure any additional provider-specific options, and you’ll have a new database ready to integrate with your app or website. This can usually be done in just a few minutes through the provider’s user interface.
Another appeal of managed databases is automation. Self-managed databases can consume a large amount of an organization’s resources because its employees have to perform every administrative task — from scaling to performing updates, running migrations, and creating backups — manually. With a managed database, however, these and other tasks are done either automatically or on-demand, which markedly reduces the risk of human error.
This relates to the fact that managed database services help to streamline the process of database scaling. Scaling a self-managed database can be very time- and resource-intensive. Whether you choose sharding, replication, load balancing, or something else as your scaling strategy, if you manage the infrastructure yourself then you’re responsible for ensuring that no data is lost in the process and that the application will continue to work properly. If you integrate your application with a managed database service, however, you can scale the database cluster on demand. Rather than having to work out the optimal server size or CPU usage beforehand, you can quickly provision more resources on-the-fly. This helps you avoid using unnecessary resources, meaning you also won’t pay for what you don’t need.
Managed solutions tend to have built-in high availability. In the context of cloud computing, a service is said to be highly available if it is stable and likely to run without failure for long periods of time. Most reputable cloud providers’ products come with a service level agreement (SLA), a commitment between the provider and its customers that guarantees the availability and reliability of their services. A typical SLA will specify how much downtime the customer should expect, and many also define the compensation for customers if these service levels are not met. This provides assurance for the customer that their database won’t crash and, if it does, they can at least expect some kind of reparation from the provider.
In general, managed databases simplify the tasks associated with provisioning and maintaining a database. Depending on the provider, you or your team will still likely need some level of experience working with databases in order to provision a database and interact with it as you build and scale your application. Ultimately, though, the database-specific experience needed to administer a managed database will be much less than with a self-managed solution.
Of course, managed databases aren’t able to solve every problem, and may prove to be a less-than-ideal choice for some. Next, we’ll go over a few of the potential drawbacks one should consider before provisioning a managed database.
A managed database service can ease the stress of deploying and maintaining a database, but there are still a few things to keep in mind before committing to one. Recall that a principal draw of managed databases is that they abstract away most of the more tedious aspects of database administration. To this end, a managed database provider aims to deliver a rudimentary database that will satisfy the most common use cases. Accordingly, their database offerings won’t feature tons of customization options or the unique features included in more specialized database software. Because of this, you won’t have as much freedom to tailor your database and you’ll be limited to what the cloud provider has to offer.
A managed database is almost always more expensive than a self-managed one. This makes sense, since you’re paying for the cloud provider to support you in managing the database, but it can be a cause for concern for teams with limited resources. Moreover, pricing for managed databases is usually based on how much storage and RAM the database uses, how many reads it handles, and how many backups of the database the user creates. Likewise, any application using a managed database service that handles large amounts of data or traffic will be more expensive than if it were to use a self-managed cloud database.
One should also reflect on the impact switching to a managed database will have on their internal workflows and whether or not they’ll be able to adjust to those changes. Every provider differs, and depending on their SLA they may shoulder responsibility for only some administration tasks, which would be problematic for developers looking for a full-service solution. On the other hand, some providers could have a prohibitively restrictive SLA or make the customer entirely dependent on the provider in question, a situation known as vendor lock-in.
Lastly, and perhaps most importantly, one should carefully consider whether or not any managed database service they’re considering using will meet their security needs. All databases, including on-premises databases, are prone to certain security threats, like SQL injection attacks or data leaks. However, the security dynamic is far different for databases hosted in the cloud. Managed database users can’t control the physical location of their data or who has access to it, nor can they ensure compliance with specific security standards. This can be especially problematic if your client has heightened security needs.
To illustrate, imagine that you’re hired by a bank to build an application where its clients can access financial records and make payments. The bank may stipulate that the app must have data at rest encryption and appropriately scoped user permissions, and that it must be compliant with certain regulatory standards like PCI DSS. Not all managed database providers adhere to the same regulatory standards or maintain the same security practices, and they’re unlikely to adopt new standards or practices for just one of their customers. For this reason, it’s critical that you ensure any managed database provider you rely on for such an application is able to meet your security needs as well as the needs of your clients.
Managed databases have many features that appeal to a wide variety of businesses and developers, but a managed database may not solve every problem or suit everyone’s needs. Some may find that a managed database’s limited feature set and configuration options, increased cost, and reduced flexibility outweigh any of its potential advantages. However, compelling benefits like ease of use, scalability, automated backups and upgrades, and high availability have led to increased adoption of managed database solutions in a variety of industries.
If you’re interested in learning more about DigitalOcean Managed Databases, we encourage you to check out our product documentation.
Despite being a commercial failure after its first publication, Herman Melville’s allegorical adventure novel Moby-Dick; or, The Whale is today one of the most popular and influential novels in the American canon. Artists as diverse as William Faulkner, Ralph Ellison, and Bob Dylan have acknowledged the novel’s impact on their work, and one can spot references to it in films, television, music, and, of course, open-source projects.
In this article, we will analyze several nautically-themed open-source projects and how they pay tribute to Moby-Dick.
Warning: While it isn’t necessary that you read Moby-Dick prior to reading this article, this article does contain a few spoilers. If you haven’t read the novel but would like to, you may want to hold off from reading this article until you’ve finished it.
To follow along with this tutorial, you’ll need:
Docker is an open-source program that performs operating system-level virtualization, also known as containerization. The influence of Moby-Dick is obvious with this project: Docker’s logo and mascot is a whale affectionately known as Moby Dock. However, there are some substantial differences between Moby Dick and Moby Dock.
First, Moby Dock’s species isn’t immediately obvious. It’s clear from the beginning of the novel that Moby Dick is a sperm whale, and while it’s possible that Moby Dock is a sperm whale as well, there are several clues that suggest otherwise:
Another important difference between these Mobys is that Moby Dock is helpfully carrying a few stacks of containers; Moby Dick would never be so accommodating. In fact, one can easily imagine Moby Dick going out of his way to knock over such a neatly organized pile of shipping containers. Perhaps Moby Dock is meant to be seen as a warmer, friendlier cousin of Moby Dick. After all, it’s probably bad marketing to associate one’s product with a ferocious leviathan bent on destroying everything in its path.
OpenFaaS is an open-source project that aims to make serverless functions simple through the use of Docker containers, allowing users to run complex infrastructures with far greater flexibility and without the fear of vendor lock-in.
The OpenFaaS logo focuses entirely on a whale’s tail, which is significant because Melville dedicates an entire chapter to describing the tails of sperm whales. In it, Ishmael reveals his deep appreciation of whales’ tails:
Such is the subtle elasticity of [the tail], that whether wielded in sport, or in earnest, or in anger, whatever be the mood it be in, its flexions are invariably marked by exceeding grace. Therein no fairy’s arm can transcend it.
The OpenFaaS whale is shown to be peaking its flukes, presumably as it is about to dive. In the same chapter, Ishmael opines that “excepting the sublime breach…this peaking of the whale’s flukes is perhaps the grandest sight to be seen in all animated nature.” Perhaps the OpenFaaS team chose a whale’s tail as their logo to convey the grace and power that OpenFaaS brings to managing functions. It could even be that the whale is “diving in” to the realm of functions as a service.
Because OpenFaaS is closely related to Docker, it’s obvious why the project’s logo also features a whale. However, are these supposed to be the same whale? Let us not forget that Moby Dick was believed to be “ubiquitous”, with sailors swearing up and down that they had encountered him “in opposite latitudes at one and the same instant of time.” This may be a clue that Moby Dock and the OpenFaaS whale are indeed one and the same.
Perhaps in choosing this logo the OpenFaaS team was trying to signal their hope that the framework would become ubiquitous in future software projects. Interestingly, while an omnipresent whale may strike fear in the hearts of whalers, software is generally seen as safer and more secure if it’s widely used. The OpenFaaS team should be thankful that coders are generally less superstitious than whalers.
Kubernetes is an open-source container orchestration system that helps to automate the deployment, scaling, and management of applications. The name “Kubernetes” comes from the Greek word “κυβερνήτης,” which translates to English as “captain” or “helmsman.” Appropriately, its logo consists of a ship’s wheel, or helm, conveying the control and steadiness required to manage complex container orchestration with ease.
Curiously, the Pequod doesn’t have a wheel; instead, it has a tiller made out of a whale’s jawbone. This is seen by some readers as underscoring the shared histories of Captain Ahab and the ship, as Ahab lost his leg to the great white whale and replaced it with a whalebone prosthesis.
Though a helm or tiller can convey steadiness and control, as the Kubernetes logo designers intended, Moby-Dick shows us the deeper questions that the project maintainers might have brushed aside. Who is at the helm when it comes to Kubernetes? Even more, who is at the helm in our everyday lives? Do we drive software, or does software drive us? Of all these things the helm is the symbol.
MySQL is the world’s most widely deployed open-source database management system (DBMS). MySQL’s logo features the outline of a dolphin, affectionately known as Sakila.
While dolphins aren’t prominently featured in the plot of Moby-Dick, Melville discusses them at length in one of the book’s famous pseudoscientific asides. In Chapter 32, “Cetology,” Ishmael refers to dolphins as “Huzza Porpoises,” so called because sailors see them as an omen of good luck:
Their appearance is generally hailed with delight by the mariner…. If you yourself can withstand three cheers at beholding these vivacious fish, then heaven help ye; the spirit of godly gamesomeness is not in ye.
Mayhaps the MySQL developers chose a dolphin to represent their DBMS to impart this same sense of hopeful joy to those who use it. By associating the database with a dolphin, they hope users will see it as being similarly fast, agile, and fun-loving. After all, who doesn’t have fun running correlated subqueries?
MariaDB is a community-supported fork of MySQL, as indicated by its similarly nautical logo. Both the MariaDB and MySQL logos include the respective RDBMS’s name and feature an aquatic animal: in MariaDB’s case, this animal is a pinniped.
Interestingly, there’s some confusion about what kind of animal is depicted in the MariaDB logo. According to the project’s trademarks page, the animal in the logo is a sea lion. However, some members of the MariaDB community see it as a seal. MariaDB’s official sources are fairly consistent in referring to their mascot as a sea lion, though not always. Certainly, the mascot’s shape does seem to more closely resemble that of a sea lion, but it’s also missing the telltale ears which would distinguish it as such.
The idea that human perception is inherently biased and unreliable runs as a theme throughout the novel. Perhaps by keeping the pinniped’s species vague, the MariaDB team is making a Melvillian comment on how truth isn’t always obvious and, in some cases, can never be known for certain. Is it a seal or a sea lion? Is Moby Dick real or imagined? Is Vim or Emacs the superior text editor? Riddles like these abound throughout the world we live in, which, like a magician’s glass, to each and every man in turn but mirrors back his own mysterious self. Great pains, small gains for those who ask the world to solve them.
Of course, it’s also possible that the logo is simply meant to represent a sea lion. Perhaps when the MariaDB team asked the designer to draw ears, they responded “I would prefer not to.”
Clearly, Melville’s influence extends far beyond the realm of literature, and well into the world of open-source technology. As this article has highlighted, these five projects (and likely many more) pay homage to his great whaling tale through subtle references in their names and logos, as well as how they challenge our perceptions of truth and human nature.
We hope that by reading this article, you’ll go on to create your own Melville-inspired, nautically-themed, open-source project. Here are a few ideas to help you get started:
Note: Some readers may be wondering why this article hasn’t yet mentioned DigitalOcean’s own Sammy the Shark. The simple reason is that Sammy has little in common with the sharks depicted in Moby-Dick. Throughout the novel, sharks are depicted as ravenous beasts dominated by instinct. Melville’s sharks eat anything and everything in their path, and are violent, dangerous creatures who pose a serious risk to the crew of the Pequod (though not as great a risk as whales, apparently).
Clearly, Melville never encountered a shark like Sammy. After all, Sammy is a vegetarian, and a very friendly one at that!
Is there a way to send Pod logs to Elasticsearch in DigitalOcean’s Kubernetes?
According to the Kubernetes docs, this is feasible: https://kubernetes.io/docs/tasks/debug-application-cluster/logging-elasticsearch-kibana/
But it has to be done during setup.
KUBE_LOGGING_DESTINATION=elasticsearch
Are there alternatives to this?
Kind regards, Gradlon
I want to create several different landing pages for each of our different domains. But I don’t want to create a Droplet for each domain and then manage each one separately. I want to be able to quickly create landing pages, on the fly, for one domain and be able to do the same for another domain, without having to run a handful of different web servers.
Here’s an illustration of what I would like
Would WordPress Multisite be the way to go? What is the best resource out there to walk through the process and all the considerations you must take into account when doing WP Multisite?
I currently create landing pages in our Agile CRM, but the problem with their landing pages is that they aren’t secure (no HTTPS), so it could flag users not to trust us when they land on our page.
This question went unanswered but was similar.
What’s the BEST PRACTICE here?
Basically I want to do what LeadPages and other “Landing Page Builders” offer, but not have to pay for their services, and run/manage my landing pages on my own.
Secure, reliable data storage is a must for nearly every modern application. However, the infrastructure needed for a self-managed, on-premises database can be prohibitively expensive for many teams. Similarly, employees who have the skills and experience needed to maintain a production database effectively can be difficult to come by.
The spread of cloud computing services has lowered the barriers to entry associated with provisioning a database, but many developers still lack the time or expertise needed to manage and tune a database to suit their needs. For this reason, many businesses are turning to managed database services to help them build and scale their databases in line with their growth.
In this conceptual article, we will go over what managed databases are and how they can be beneficial to many organizations. We will also cover some practical considerations one should make before building their next application on top of a managed database solution.
A managed database is a cloud computing service in which the end user pays a cloud service provider for access to a database. Unlike a typical database, users don’t have to set up or maintain a managed database on their own; rather, it’s the provider’s responsibility to oversee the database’s infrastructure. This allows the user to focus on building their application instead of spending time configuring their database and keeping it up to date.
The process of provisioning a managed database varies by provider, but in general it’s similar to that of any other cloud-based service. After registering an account and logging in to the dashboard, the user reviews the available database options — such as the database engine and cluster size — and then chooses the setup that’s right for them. After you provision the managed database, you can connect to it through a GUI or client and can then begin loading data and integrating the database with your application.
Managed data solutions simplify the process of provisioning and maintaining a database. Instead of running commands from a terminal to install and set one up, you can deploy a production-ready database with just a few clicks in your browser. By simplifying and automating database management, cloud providers make it easier for anyone, even novice database users, to build data-driven applications and websites. This was the result of a decades-long trend towards simplifying, automating, and abstracting various database management tasks, which was itself a response to pain points long felt by database administrators.
Prior to the rise of the cloud computing model, any organization in need of a data center had to supply all the time, space, and resources that went into setting one up. Once their database was up and running, they also had to maintain the hardware, keep its software updated, hire a team to manage the database, and train their employees on how to use it.
As cloud computing services grew in popularity in the 2000s, it became easier and more affordable to provision server infrastructure, since the hardware and the space required for it no longer had to be owned or managed by those using it. Likewise, setting up a database entirely within the cloud became far less difficult; a business or developer would just have to requisition a server, install and configure their chosen database management system, and begin storing data.
While cloud computing did make the process of setting up a traditional database easier, it didn’t address all of its problems. For instance, in the cloud it can still be difficult to pinpoint the ideal size of a database’s infrastructure footprint before it begins collecting data. This is important because cloud consumers are charged based on the resources they consume, and they risk paying for more than what they require if the server they provision is larger than necessary. Additionally, as with traditional on-premises databases, managing one’s database in the cloud can be a costly endeavor. Depending on your needs, you may still need to hire an experienced database administrator or spend a significant amount of time and money training your existing staff to manage your database effectively.
Many of these issues are compounded for smaller organizations and independent developers. While a large business can usually afford to hire employees with a deep knowledge of databases, smaller teams usually have fewer resources available, leaving them with only their existing institutional knowledge. This makes tasks like replication, migrations, and backups all the more difficult and time consuming, as they can require a great deal of on-the-job learning as well as trial and error.
Managed databases help to resolve these pain points with a host of benefits to businesses and developers. Let’s walk through some of these benefits and how they can impact development teams.
Managed database services can help to reduce many of the headaches associated with provisioning and managing a database. For one thing, developers build applications on top of managed database services to drastically speed up the process of provisioning a database server. With a self-managed solution, you must obtain a server (either on-premises or in the cloud), connect to it from a client or terminal, configure and secure it, and then install and set up the database management software before you can begin storing data. With a managed database, you only have to decide on the initial size of the database server, configure any additional provider-specific options, and you’ll have a new database ready to integrate with your app or website. This can usually be done in just a few minutes through the provider’s user interface.
Another appeal of managed databases is automation. Self-managed databases can consume a large amount of an organization’s resources because its employees have to perform every administrative task — from scaling to performing updates, running migrations, and creating backups — manually. With a managed database, however, these and other tasks are done either automatically or on-demand, which markedly reduces the risk of human error.
This relates to the fact that managed database services help to streamline the process of database scaling. Scaling a self-managed database can be very time- and resource-intensive. Whether you choose sharding, replication, load balancing, or something else as your scaling strategy, if you manage the infrastructure yourself then you’re responsible for ensuring that no data is lost in the process and that the application will continue to work properly. If you integrate your application with a managed database service, however, you can scale the database cluster on demand. Rather than having to work out the optimal server size or CPU usage beforehand, you can quickly provision more resources on-the-fly. This helps you avoid using unnecessary resources, meaning you also won’t pay for what you don’t need.
Managed solutions tend to have built-in high-availability. In the context of cloud computing, a service is said to be highly available if it is stable and likely to run without failure for long periods of time. Most reputable cloud providers’ products come with a service level agreement (SLA), a commitment between the provider and its customers that guarantees the availability and reliability of their services. A typical SLA will specify how much downtime the customer should expect, and many also define the compensation for customers if these service levels are not met. This provides assurance for the customer that their database won’t crash and, if it does, they can at least expect some kind of reparation from the provider.
In general, managed databases simplify the tasks associated with provisioning and maintaining a database. Depending on the provider, you or your team will still likely need some level of experience working with databases in order to provision a database and interact with it as you build and scale your application. Ultimately, though, the database-specific experience needed to administer a managed database will be much less than with a self-managed solution.
Of course, managed databases aren’t able to solve every problem, and may prove to be a less-than-ideal choice for some. Next, we’ll go over a few of the potential drawbacks one should consider before provisioning a managed database.
A managed database service can ease the stress of deploying and maintaining a database, but there are still a few things to keep in mind before committing to one. Recall that a principal draw of managed databases is that they abstract away most of the more tedious aspects of database administration. To this end, a managed database provider aims to deliver a rudimentary database that will satisfy the most common use cases. Accordingly, their database offerings won’t feature tons of customization options or the unique features included in more specialized database software. Because of this, you won’t have as much freedom to tailor your database and you’ll be limited to what the cloud provider has to offer.
A managed database is almost always more expensive than a self-managed one. This makes sense, since you’re paying for the cloud provider to support you in managing the database, but it can be a cause for concern for teams with limited resources. Moreover, pricing for managed databases is usually based on how much storage and RAM the database uses, how many reads it handles, and how many backups of the database the user creates. Likewise, any application using a managed database service that handles large amounts of data or traffic will be more expensive than if it were to use a self-managed cloud database.
One should also reflect on the impact switching to a managed database will have on their internal workflows and whether or not they’ll be able to adjust to those changes. Every provider differs, and depending on their SLA they may shoulder responsibility for only some administration tasks, which would be problematic for developers looking for a full-service solution. On the other hand, some providers could have a prohibitively restrictive SLA or make the customer entirely dependent on the provider in question, a situation known as vendor lock-in.
Lastly, and perhaps most importantly, one should carefully consider whether or not any managed database service they’re considering using will meet their security needs. All databases, including on-premises databases, are prone to certain security threats, like SQL injection attacks or data leaks. However, the security dynamic is far different for databases hosted in the cloud. Managed database users can’t control the physical location of their data or who has access to it, nor can they ensure compliance with specific security standards. This can be especially problematic if your client has heightened security needs.
To illustrate, imagine that you’re hired by a bank to build an application where its clients can access financial records and make payments. The bank may stipulate that the app must have data at rest encryption and appropriately scoped user permissions, and that it must be compliant with certain regulatory standards like PCI DSS. Not all managed database providers adhere to the same regulatory standards or maintain the same security practices, and they’re unlikely to adopt new standards or practices for just one of their customers. For this reason, it’s critical that you ensure any managed database provider you rely on for such an application is able to meet your security needs as well as the needs of your clients.
Managed databases have many features that appeal to a wide variety of businesses and developers, but a managed database may not solve every problem or suit everyone’s needs. Some may find that a managed database’s limited feature set and configuration options, increased cost, and reduced flexibility outweigh any of its potential advantages. However, compelling benefits like ease of use, scalability, automated backups and upgrades, and high availability have led to increased adoption of managed database solutions in a variety of industries.
If you’re interested in learning more about DigitalOcean Managed Databases, we encourage you to check out our Managed Databases product documentation.
Mounting an attached volume at the /var/lib/docker dir results in lost+found.
First, Terraform is used to provision a DigitalOcean Droplet and attach a volume to it. Then Ansible is used to configure the host.
Please note that the playbook below uses static/explicit values because I needed to debug:
---
- name: mount point of attached volume
  stat:
    path: /mnt/name_of_attached_volume

- name: get digital_ocean_volume_path_by_name
  stat:
    path: /dev/disk/by-id/scsi-0DO_Volume_name_of_attached_volume

- name: unmount images volume
  command: umount /mnt/name_of_attached_volume

- name: Label the volume
  command: parted -s /dev/disk/by-id/scsi-0DO_Volume_name_of_attached_volume mklabel gpt

- name: Create an ext4 partition
  command: parted -s -a opt /dev/disk/by-id/scsi-0DO_Volume_name_of_attached_volume mkpart primary ext4 0% 100%

- name: Build the ext4 metadata
  command: mkfs.ext4 /dev/disk/by-id/scsi-0DO_Volume_name_of_attached_volume-part1

####################################################################
# since the mount point -- `/var/lib/docker` -- already exists     #
# by virtue of docker being installed on the host, no need to      #
# create a mount point but I do need to stop docker running        #
####################################################################
- name: stop docker service
  service:
    name: docker
    state: stopped

- name: mount volume read-write
  mount:
    path: /var/lib/docker
    src: /dev/disk/by-id/scsi-0DO_Volume_name_of_attached_volume-part1
    fstype: ext4
    opts: defaults,discard
    dump: 0
    passno: 2
    state: mounted

- name: remove mount point for images volume
  command: rmdir /mnt/name_of_attached_volume

- name: Start docker service
  service:
    name: docker
    state: started
    enabled: "{{ docker_service_enabled }}"
After running the playbook above, the result of running ls -la /var/lib/docker is:
drwxr-xr-x 3 root root 4096
drwx--x--x 14 root root 4096
drwx------ 2 root root 16384 Jan 25 16:47 lost+found
**Why is this so?** I do not believe that the directory should contain only lost+found.
I am obviously missing/misunderstanding a step. Greatly appreciate tips please. Thank you!
A service mesh is an infrastructure layer that allows you to manage communication between your application’s microservices. As more developers work with microservices, service meshes have evolved to make that work easier and more effective by consolidating common management and administrative tasks in a distributed setup.
Taking a microservice approach to application architecture involves breaking your application into a collection of loosely-coupled services. This approach offers certain benefits: teams can iterate designs and scale quickly, using a wider range of tools and languages. On the other hand, microservices pose new challenges for operational complexity, data consistency, and security.
Service meshes are designed to address some of these challenges by offering a granular level of control over how services communicate with one another. Specifically, they offer developers a way to manage:
Though it is possible to do these tasks natively with container orchestrators like Kubernetes, this approach involves a greater amount of up-front decision-making and administration when compared to what service mesh solutions like Istio and Linkerd offer out of the box. In this sense, service meshes can streamline and simplify the process of working with common components in a microservice architecture. In some cases they can even extend the functionality of these components.
Service meshes are designed to address some of the challenges inherent to distributed application architectures.
These architectures grew out of the three-tier application model, which broke applications into a web tier, application tier, and database tier. At scale, this model has proved challenging to organizations experiencing rapid growth. Monolithic application code bases can grow to be unwieldy “big balls of mud”, posing challenges for development and deployment.
In response to this problem, organizations like Google, Netflix, and Twitter developed internal “fat client” libraries to standardize runtime operations across services. These libraries provided load balancing, circuit breaking, routing, and telemetry — precursors to service mesh capabilities. However, they also imposed limitations on the languages developers could use and required changes across services when they themselves were updated or changed.
A microservice design avoids some of these issues. Instead of having a large, centralized application codebase, you have a collection of discretely managed services that represent a feature of your application. Benefits of a microservice approach include:
At the same time, microservices have also created challenges:
Service meshes are designed to address these issues by offering coordinated and granular control over how services communicate. In the sections that follow, we’ll look at how service meshes facilitate service-to-service communication through service discovery, routing and internal load balancing, traffic configuration, encryption, authentication and authorization, and metrics and monitoring. We will use Istio’s Bookinfo sample application — four microservices that together display information about particular books — as a concrete example to illustrate how service meshes work.
In a distributed framework, it’s necessary to know how to connect to services and whether or not they are available. Service instance locations are assigned dynamically on the network and information about them is constantly changing as containers are created and destroyed through autoscaling, upgrades, and failures.
Historically, there have been a few tools for doing service discovery in a microservice framework. Key-value stores like etcd were paired with other tools like Registrator to offer service discovery solutions. Tools like Consul iterated on this by combining a key-value store with a DNS interface that allows users to work directly with their DNS server or node.
Taking a similar approach, Kubernetes offers DNS-based service discovery by default. With it, you can look up services and service ports, and do reverse IP lookups using common DNS naming conventions. In general, an A record for a Kubernetes service matches this pattern: service.namespace.svc.cluster.local. Let’s look at how this works in the context of the Bookinfo application. If, for example, you wanted information on the details service from the Bookinfo app, you could look at the relevant entry in the Kubernetes dashboard:
This will give you relevant information about the Service name, namespace, and ClusterIP, which you can use to connect with your Service even as individual containers are destroyed and recreated.
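To make the naming convention concrete, here is a hypothetical Service manifest along the lines of the Bookinfo details Service, together with the in-cluster DNS name it would receive; the namespace and port are assumptions for illustration.

```yaml
# Hypothetical Service manifest following the pattern above. A Service named
# "details" in the "default" namespace is resolvable inside the cluster as
# details.default.svc.cluster.local.
apiVersion: v1
kind: Service
metadata:
  name: details
  namespace: default
  labels:
    app: details
spec:
  ports:
    - port: 9080
      name: http
  selector:
    app: details
```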
A service mesh like Istio also offers service discovery capabilities. To do service discovery, Istio relies on communication between the Kubernetes API, Istio’s own control plane, managed by the traffic management component Pilot, and its data plane, managed by Envoy sidecar proxies. Pilot interprets data from the Kubernetes API server to register changes in Pod locations. It then translates that data into a canonical Istio representation and forwards it on to the sidecar proxies.
This means that service discovery in Istio is platform agnostic, which we can see by using Istio’s Grafana add-on to look at the details service again in Istio’s service dashboard:
Our application is running on a Kubernetes cluster, so once again we can see the relevant DNS information about the details Service, along with other performance data.
In a distributed architecture, it’s important to have up-to-date, accurate, and easy-to-locate information about services. Both Kubernetes and service meshes like Istio offer ways to obtain this information using DNS conventions.
Managing traffic in a distributed framework means controlling how traffic gets to your cluster and how it’s directed to your services. The more control and specificity you have in configuring external and internal traffic, the more you will be able to do with your setup. For example, in cases where you are working with canary deployments, migrating applications to new versions, or stress testing particular services through fault injection, having the ability to decide how much traffic your services are getting and where it is coming from will be key to the success of your objectives.
Kubernetes offers different tools, objects, and services that allow developers to control external traffic to a cluster: kubectl proxy, NodePort, Load Balancers, and Ingress Controllers and Resources. Both kubectl proxy and NodePort allow you to quickly expose your services to external traffic: kubectl proxy creates a proxy server that allows access to static content with an HTTP path, while NodePort exposes a randomly assigned port on each node. Though this offers quick access, drawbacks include having to run kubectl as an authenticated user, in the case of kubectl proxy, and a lack of flexibility in ports and node IPs, in the case of NodePort. And though a Load Balancer optimizes for flexibility by attaching to a particular Service, each Service requires its own Load Balancer, which can be costly.
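As a quick illustration, a hypothetical NodePort Service for the Bookinfo productpage microservice might look like the sketch below; the name and selector are assumptions based on the sample app, and Kubernetes picks a port in the 30000-32767 range on every node unless you request one explicitly:

apiVersion: v1
kind: Service
metadata:
  name: productpage-nodeport
spec:
  type: NodePort
  selector:
    app: productpage    # assumes the Bookinfo productpage Pods carry this label
  ports:
  - port: 9080          # port inside the cluster
    targetPort: 9080    # container port on the Pods
    # nodePort: 30080   # optional; omit to let Kubernetes assign one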
An Ingress Resource and Ingress Controller together offer a greater degree of flexibility and configurability over these other options. Using an Ingress Controller with an Ingress Resource allows you to route external traffic to Services and configure internal routing and load balancing. To use an Ingress Resource, you need to configure your Services, the Ingress Controller and LoadBalancer, and the Ingress Resource itself, which will specify the desired routes to your Services. Currently, Kubernetes supports its own Nginx Controller, but there are other options you can choose from as well, managed by Nginx, Kong, and others.
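A minimal Ingress Resource for a cluster of this era might look like the following sketch. The hostname and the productpage backend are placeholders assumed from the Bookinfo sample, and newer clusters use the networking.k8s.io/v1 API with a slightly different schema:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: bookinfo-ingress
spec:
  rules:
  - host: bookinfo.example.com        # placeholder hostname
    http:
      paths:
      - path: /productpage
        backend:
          serviceName: productpage    # Service that should receive this traffic
          servicePort: 9080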
Istio iterates on the Kubernetes Controller/Resource pattern with Istio Gateways and VirtualServices. Like an Ingress Controller, a Gateway defines how incoming traffic should be handled, specifying exposed ports and protocols to use. It works in conjunction with a VirtualService, which defines routes to Services within the mesh. Both of these resources communicate information to Pilot, which then forwards that information to the Envoy proxies. Though they are similar to Ingress Controllers and Resources, Gateways and VirtualServices offer a different level of control over traffic: instead of combining Open Systems Interconnection (OSI) layers and protocols, Gateways and VirtualServices allow you to differentiate between OSI layers in your settings. For example, by using VirtualServices, teams working with application layer specifications could have a separation of concerns from security operations teams working with different layer specifications. VirtualServices make it possible to separate work on discrete application features or within different trust domains, and can be used for things like canary testing, gradual rollouts, A/B testing, etc.
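To make this concrete, here is a hedged sketch of an Istio Gateway and a VirtualService that splits traffic between two versions of the Bookinfo reviews Service, as you might do for a canary rollout. The weights, subsets, and wildcard host are illustrative; the v1 and v2 subsets assume a matching DestinationRule exists, and field names may vary between Istio releases:

apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: bookinfo-gateway
spec:
  selector:
    istio: ingressgateway       # use Istio's default ingress gateway deployment
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - "*"                       # accept traffic for any host on port 80
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - reviews                     # the in-mesh Service this routing applies to
  http:
  - route:
    - destination:
        host: reviews
        subset: v1
      weight: 90                # send most traffic to the stable version
    - destination:
        host: reviews
        subset: v2
      weight: 10                # canary a small share to the new version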
To visualize the relationship between Services, you can use Istio’s Servicegraph add-on, which produces a dynamic representation of the relationship between Services using real-time traffic data. The Bookinfo application might look like this without any custom routing applied:
Similarly, you can use a visualization tool like Weave Scope to see the relationship between your Services at a given time. The Bookinfo application without advanced routing might look like this:
When configuring application traffic in a distributed framework, there are a number of different solutions — from Kubernetes-native options to service meshes like Istio — that offer various options for determining how external traffic will reach your application resources and how these resources will communicate with one another.
A distributed framework presents opportunities for security vulnerabilities. Instead of communicating through local internal calls, as they would in a monolithic setup, services in a microservice architecture communicate information, including privileged information, over the network. Overall, this creates a greater surface area for attacks.
Securing Kubernetes clusters involves a range of procedures; we will focus on authentication, authorization, and encryption. Kubernetes offers native approaches to each of these: requests to the cluster are authenticated and authorized by the kube-apiserver, sensitive data such as tokens and keys can be stored as Secrets in etcd, and traffic between Pods can be encrypted with an overlay network like Weave Net. Configuring individual security policies and protocols in Kubernetes requires administrative investment. A service mesh like Istio can consolidate some of these activities.
Istio is designed to automate some of the work of securing services. Its control plane includes several components that handle security, among them Citadel, which manages key and certificate issuance. For example, when you create a Service, Citadel receives that information from the kube-apiserver and creates SPIFFE certificates and keys for this Service. It then transfers this information to Pods and Envoy sidecars to facilitate communication between Services.
You can also implement some security features by enabling mutual TLS during the Istio installation. These include strong service identities for cross- and inter-cluster communication, secure service-to-service and user-to-service communication, and a key management system that can automate key and certificate creation, distribution, and rotation.
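As an illustration of what this looks like in practice, the following is a minimal sketch of the mesh-wide policy used to require mutual TLS in Istio releases of this era. It assumes client-side DestinationRules are configured with ISTIO_MUTUAL as well, and newer Istio versions replace this resource with PeerAuthentication:

apiVersion: authentication.istio.io/v1alpha1
kind: MeshPolicy
metadata:
  name: default
spec:
  peers:
  - mtls: {}    # require mutual TLS for all service-to-service traffic in the mesh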
By iterating on how Kubernetes handles authentication, authorization, and encryption, service meshes like Istio are able to consolidate and extend some of the recommended best practices for running a secure Kubernetes cluster.
Distributed environments have changed the requirements for metrics and monitoring. Monitoring tools need to be adaptive, accounting for frequent changes to services and network addresses, and comprehensive, allowing for the amount and type of information passing between services.
Kubernetes includes some internal monitoring tools by default. These resources belong to its resource metrics pipeline, which ensures that the cluster runs as expected. The cAdvisor component collects network usage, memory, and CPU statistics from individual containers and nodes and passes that information to kubelet; kubelet in turn exposes that information via a REST API. The Metrics Server gets this information from the API and then passes it to the kube-aggregator for formatting.
You can extend these internal tools and monitoring capabilities with a full metrics solution. Using a service like Prometheus as a metrics aggregator allows you to build directly on top of the Kubernetes resource metrics pipeline. Prometheus integrates directly with cAdvisor through its own agents, located on the nodes. Its main aggregation service collects and stores data from the nodes and exposes it through dashboards and APIs. Additional storage and visualization options are also available if you choose to integrate your main aggregation service with backend storage, logging, and visualization tools like InfluxDB, Grafana, ElasticSearch, Logstash, Kibana, and others.
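As a rough sketch of how that integration is wired up, a Prometheus configuration might include a scrape job for the cAdvisor metrics that each kubelet exposes. The job name, metrics path, and relabeling below are illustrative assumptions; the exact endpoints vary by Kubernetes version and by how Prometheus is deployed:

scrape_configs:
- job_name: kubernetes-cadvisor
  scheme: https
  metrics_path: /metrics/cadvisor          # cAdvisor metrics served by the kubelet
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  kubernetes_sd_configs:
  - role: node                             # discover one scrape target per node
  relabel_configs:
  - action: labelmap
    regex: __meta_kubernetes_node_label_(.+)   # carry node labels over as metric labels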
In a service mesh like Istio, the structure of the full metrics pipeline is part of the mesh’s design. Envoy sidecars operating at the Pod level communicate metrics to Mixer, which manages policies and telemetry. Additionally, Prometheus and Grafana services are enabled by default (though if you are installing Istio with Helm you will need to specify grafana.enabled=true during installation). As is the case with the full metrics pipeline, you can also configure other services and deployments for logging and viewing options.
With these metric and visualization tools in place, you can access current information about services and workloads in a central place. For example, a global view of the Bookinfo application might look like this in the Istio Grafana dashboard:
By replicating the structure of a Kubernetes full metrics pipeline and simplifying access to some of its common components, service meshes like Istio streamline the process of data collection and visualization when working with a cluster.
Microservice architectures are designed to make application development and deployment fast and reliable. Yet an increase in inter-service communication has changed best practices for certain administrative tasks. This article discusses some of those tasks, how they are handled in a Kubernetes-native context, and how they can be managed using a service mesh — in this case, Istio.
For more information on some of the Kubernetes topics covered here, please see the following resources:
Additionally, the Kubernetes and Istio documentation hubs are great places to find detailed information about the topics discussed here.
]]>I’m setting up a VirtualBox guest machine running Windows 10 on a droplet running CentOS 7.
Users need to be able to use Windows Remote Desktop to connect directly to the VirtualBox guest OS running Windows 10 from machines that are not on any internal network (i.e. machines with only a direct internet connection).
I’ve tried various configurations, including port forwarding in both the CentOS environment and the VirtualBox setup.
I can connect to the droplet using both SSH and XRDP, and from there I can connect to the guest Windows 10 OS running in VirtualBox. However, I need to know if it’s possible to bypass the connection to the droplet and connect directly to the VirtualBox guest from any machine connected to the internet.
]]>The Domain Name System (DNS) is a system for associating various types of information – such as IP addresses – with easy-to-remember names. By default, most Kubernetes clusters automatically configure an internal DNS service to provide a lightweight mechanism for service discovery. Built-in service discovery makes it easy for applications to find and communicate with each other on Kubernetes clusters, even as pods and services are created, deleted, and shifted between nodes.
The implementation details of the Kubernetes DNS service have changed in recent versions of Kubernetes. In this article we will take a look at both the kube-dns and CoreDNS versions of the Kubernetes DNS service. We will review how they operate and the DNS records that Kubernetes generates.
To gain a more thorough understanding of DNS before you begin, please read An Introduction to DNS Terminology, Components, and Concepts. For any Kubernetes topics you may be unfamiliar with, read An Introduction to Kubernetes.
Before Kubernetes version 1.11, the Kubernetes DNS service was based on kube-dns. Version 1.11 introduced CoreDNS to address some security and stability concerns with kube-dns.
Regardless of the software handling the actual DNS records, both implementations work in a similar manner:
A service named kube-dns and one or more pods are created.
The kube-dns service listens for service and endpoint events from the Kubernetes API and updates its DNS records as needed. These events are triggered when you create, update, or delete Kubernetes services and their associated pods.
The kubelet sets the nameserver option in each new pod’s /etc/resolv.conf to the cluster IP of the kube-dns service, with appropriate search options to allow shorter hostnames to be used:
nameserver 10.32.0.10
search namespace.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
Applications running in containers can then resolve hostnames such as example-service.namespace into the correct cluster IP addresses.
The full DNS A record of a Kubernetes service will look like the following example:
service.namespace.svc.cluster.local
A pod would have a record in this format, reflecting the actual IP address of the pod:
10.32.0.125.namespace.pod.cluster.local
Additionally, SRV records are created for a Kubernetes service’s named ports:
_port-name._protocol.service.namespace.svc.cluster.local
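To see where such an SRV record comes from, here is a minimal, hypothetical Service manifest with a named port. The names and ports are placeholders; a Service like this in the example namespace would produce a record of the form _http._tcp.web.example.svc.cluster.local:

apiVersion: v1
kind: Service
metadata:
  name: web
  namespace: example
spec:
  selector:
    app: web
  ports:
  - name: http          # the named port that generates the SRV record
    protocol: TCP
    port: 80            # port exposed on the service's cluster IP
    targetPort: 8080    # port the backing containers listen on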
The result of all this is a built-in, DNS-based service discovery mechanism, where your application or microservice can reference a simple and consistent hostname to access other services or pods on the cluster.
Because of the search domain suffixes listed in the resolv.conf file, you often won’t need to use the full hostname to contact another service. If you’re referencing a service in the same namespace, you can use just the service name to contact it:
other-service
If the service is in a different namespace, add it to the query:
other-service.other-namespace
If you’re referencing a pod, you’ll need to use at least the following:
pod-ip.other-namespace.pod
As we saw in the default resolv.conf file, only .svc suffixes are automatically completed, so make sure you specify everything up to .pod.
Now that we know the practical uses of the Kubernetes DNS service, let’s run through some details on the two different implementations.
As noted in the previous section, Kubernetes version 1.11 introduced new software to handle the kube-dns service. The motivation for the change was to increase the performance and security of the service. Let’s take a look at the original kube-dns implementation first.
The kube-dns service prior to Kubernetes 1.11 is made up of three containers running in a kube-dns pod in the kube-system namespace. The three containers are:
kube-dns: a container that runs SkyDNS, which performs DNS query resolution
dnsmasq: a popular, lightweight DNS resolver and cache that caches the responses from SkyDNS
sidecar: a sidecar container that handles metrics reporting and responds to health checks for the service
Security vulnerabilities in Dnsmasq and scaling performance issues with SkyDNS led to the creation of a replacement system, CoreDNS.
As of Kubernetes 1.11, a new Kubernetes DNS service, CoreDNS, has been promoted to General Availability. This means that it’s ready for production use and will be the default cluster DNS service for many installation tools and managed Kubernetes providers.
CoreDNS is a single process, written in Go, that covers all of the functionality of the previous system. A single container resolves and caches DNS queries, responds to health checks, and provides metrics.
In addition to addressing performance- and security-related issues, CoreDNS fixes some other minor bugs and adds some new features:
Some issues with incompatibilities between the use of stubDomains and external services have been fixed
CoreDNS can improve DNS-based round-robin load balancing by randomizing the order in which it returns certain records
A feature called autopath can improve DNS response times when resolving external hostnames, by being smarter about iterating through each of the search domain suffixes listed in resolv.conf
With kube-dns, 10.32.0.125.namespace.pod.cluster.local would always resolve to 10.32.0.125, even if the pod doesn’t actually exist. CoreDNS has a “pods verified” mode that will only resolve successfully if a pod exists with the right IP and in the right namespace.
For more information on CoreDNS and how it differs from kube-dns, you can read the Kubernetes CoreDNS GA announcement.
Kubernetes operators often want to customize how their pods and containers resolve certain custom domains, or need to adjust the upstream nameservers or search domain suffixes configured in resolv.conf. You can do this with the dnsConfig option of your pod’s spec:
apiVersion: v1
kind: Pod
metadata:
namespace: example
name: custom-dns
spec:
containers:
- name: example
image: nginx
dnsPolicy: "None"
dnsConfig:
nameservers:
- 203.0.113.44
searches:
- custom.dns.local
Updating this config will rewrite a pod’s resolv.conf to enable the changes. The configuration maps directly to the standard resolv.conf options, so the above config would create a file with nameserver 203.0.113.44 and search custom.dns.local lines.
In this article we covered the basics of what the Kubernetes DNS service provides to developers, showed some example DNS records for services and pods, discussed how the system is implemented in different Kubernetes versions, and highlighted some additional configuration options available to customize how your pods resolve DNS queries.
For more information on the Kubernetes DNS service, please refer to the official Kubernetes DNS for Services and Pods documentation.
By Brian Boucheron
]]>The Domain Name System (DNS) is a system for associating various types of information – such as IP addresses – with easy-to-remember names. By default most Kubernetes clusters automatically configure an internal DNS service to provide a lightweight mechanism for service discovery. Built-in service discovery makes it easier for applications to find and communicate with each other on Kubernetes clusters, even when pods and services are being created, deleted, and shifted between nodes.
The implementation details of the Kubernetes DNS service have changed in recent versions of Kubernetes. In this article we will take a look at both the kube-dns and CoreDNS versions of the Kubernetes DNS service. We will review how they operate and the DNS records that Kubernetes generates.
To gain a more thorough understanding of DNS before you begin, please read An Introduction to DNS Terminology, Components, and Concepts. For any Kubernetes topics you may be unfamiliar with, you could read What is Kubernetes?.
Before Kubernetes version 1.11, the Kubernetes DNS service was based on kube-dns. Version 1.11 introduced CoreDNS to address some security and stability concerns with kube-dns.
Regardless of the software handling the actual DNS records, both implementations work in a similar manner:
A service named kube-dns and one or more pods are created.
The kube-dns service listens for service and endpoint events from the Kubernetes API and updates its DNS records as needed. These events are triggered when you create, update or delete Kubernetes services and their associated pods.
kubelet sets each new pod’s /etc/resolv.conf nameserver option to the cluster IP of the kube-dns service, with appropriate search options to allow for shorter hostnames to be used:
nameserver 10.32.0.10
search namespace.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
Applications running in containers can then resolve hostnames such as example-service.namespace into the correct cluster IP addresses.
The full DNS A record of a Kubernetes service will look like the following example:
service.namespace.svc.cluster.local
A pod would have a record in this format, reflecting the actual IP address of the pod:
10.32.0.125.namespace.pod.cluster.local
Additionally, SRV records are created for a Kubernetes service’s named ports:
_port-name._protocol.service.namespace.svc.cluster.local
The result of all this is a built-in, DNS-based service discovery mechanism, where your application or microservice can target a simple and consistent hostname to access other services or pods on the cluster.
Because of the search domain suffixes listed in the resolv.conf file, you often won’t need to use the full hostname to contact another service. If you’re addressing a service in the same namespace, you can use just the service name to contact it:
other-service
If the service is in a different namespace, add it to the query:
other-service.other-namespace
If you’re targeting a pod, you’ll need to use at least the following:
pod-ip.other-namespace.pod
As we saw in the default resolv.conf file, only .svc suffixes are automatically completed, so make sure you specify everything up to .pod.
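If you want to experiment with these lookups yourself, one common approach is to run a small throwaway pod with DNS tools installed and exec into it (for example with kubectl exec -ti dns-debug -- nslookup other-service). The manifest below is a hedged sketch; the image shown is one of several commonly used debugging images, and any small image that provides nslookup or dig will work:

apiVersion: v1
kind: Pod
metadata:
  name: dns-debug
spec:
  containers:
  - name: dnsutils
    image: tutum/dnsutils           # any image with nslookup/dig installed
    command: ["sleep", "3600"]      # keep the pod running so you can exec into it
  restartPolicy: Never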
Now that we know the practical uses of the Kubernetes DNS service, let’s run through some details on the two different implementations.
As noted in the previous section, Kubernetes version 1.11 introduced new software to handle the kube-dns service. The motivation for the change was to increase the performance and security of the service. Let’s take a look at the original kube-dns implementation first.
The kube-dns service prior to Kubernetes 1.11 is made up of three containers running in a kube-dns pod in the kube-system namespace. The three containers are:
kube-dns: a container that runs SkyDNS, which performs DNS query resolution
dnsmasq: a popular, lightweight DNS resolver and cache that caches the responses from SkyDNS
sidecar: a sidecar container that handles metrics reporting and responds to health checks for the service
Security vulnerabilities in Dnsmasq and scaling performance issues with SkyDNS led to the creation of a replacement system, CoreDNS.
As of Kubernetes 1.11, a new Kubernetes DNS service, CoreDNS, has been promoted to General Availability. This means that it’s ready for production use and will be the default cluster DNS service for many installation tools and managed Kubernetes providers.
CoreDNS is a single process, written in Go, that covers all of the functionality of the previous system. A single container resolves and caches DNS queries, responds to health checks, and provides metrics.
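CoreDNS’s behavior is driven by a Corefile, which on Kubernetes typically lives in a ConfigMap in the kube-system namespace. The sketch below shows roughly what a default configuration looks like; the exact set of plugins and options varies by CoreDNS version and installer, so treat it as illustrative rather than canonical:

apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure          # change to "pods verified" to only answer for real pod IPs
            fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153           # expose metrics for scraping
        forward . /etc/resolv.conf # send non-cluster queries to upstream resolvers
        cache 30
        loop
        reload
        loadbalance
    }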
In addition to addressing performance- and security-related issues, CoreDNS fixes some other minor bugs and adds some new features:
Some issues with incompatibilities between the use of stubDomains and external services have been fixed
CoreDNS can improve DNS-based round-robin load balancing by randomizing the order in which it returns certain records
A feature called autopath can improve DNS response times when resolving external hostnames, by being smarter about iterating through each of the search domain suffixes listed in resolv.conf
With kube-dns, 10.32.0.125.namespace.pod.cluster.local would always resolve to 10.32.0.125, even if the pod doesn’t actually exist. CoreDNS has a “pods verified” mode that will only resolve successfully if a pod exists with the right IP and in the right namespace.
For more information on CoreDNS and how it differs from kube-dns, you can read the Kubernetes CoreDNS GA announcement.
Kubernetes operators often want to customize how their pods and containers resolve certain custom domains, or need to adjust the upstream nameservers or search domain suffixes configured in resolv.conf. You can do this with the dnsConfig option of your pod’s spec:
apiVersion: v1
kind: Pod
metadata:
namespace: example
name: custom-dns
spec:
containers:
- name: example
image: nginx
dnsPolicy: "None"
dnsConfig:
nameservers:
- 203.0.113.44
searches:
- custom.dns.local
Updating this config will rewrite a pod’s resolv.conf to enable the changes. The configuration maps directly to the standard resolv.conf options, so the above config would create a file with nameserver 203.0.113.44 and search custom.dns.local lines.
In this article we covered the basics of what the Kubernetes DNS service provides to developers, showed some example DNS records for services and pods, discussed how the system is implemented on different Kubernetes versions, and highlighted some additional configuration options available to customize how your pods resolve DNS queries.
For more information on the Kubernetes DNS service, please refer to the official Kubernetes DNS for Services and Pods documentation.
]]>I have an app that calls an API. The API gets called multiple times a second by the app’s users. One aspect of the API calls is the recording of usage statistics.
Currently I have all of this managed on a dedicated server at a local hosting company. However, I have noticed that the CPU load tends to get pretty high at peak usage times.
I want to upgrade to a more professional solution. From what I have researched, it would be wise to create multiple droplets.
In the future I can clone the web server droplet to handle its load demands. I assume I can then use DigitalOcean’s load balancer to mediate between the droplets.
But what about the MySQL database droplet? I can’t clone the database to balance its load - I would have duplicate data. How do I best go about dealing with its load? The same question applies to its size. At the current rate the database grows approximately 5 GB a month. The more users come, the faster it will grow. Is my only solution to expand the droplet’s disk space, or can I split the database across multiple droplets? The usage statistics database itself consists mainly of 3 very large tables. I am sure I could put each table on a separate server, but that would make JOIN statements difficult.
I realize that some of these questions are very future-oriented, but if I am already setting everything up from scratch, I want to set it up so that it is less work when things need to expand. I am trying to avoid time-costly pitfalls I don’t know about yet.
Thanks
]]>I plan to host my clients’ applications (mostly WordPress) and have each client on their own VPS. Right now, I have about a dozen potential clients, but I hope to grow that to a hundred or so.
Is it possible to host all of them on DigitalOcean under one account?
]]>