Nginx Server and Location Block Selection Algorithms

Introduction

Nginx is among the world’s most popular web servers. It can handle a large number of simultaneous client connections, and it can function as a web server, a mail proxy, or a reverse proxy server.

This guide outlines the behind-the-scenes rules that determine how Nginx processes client requests. We will demystify server and location blocks and explain how each is selected, so that request handling becomes far less unpredictable.

First things first, here is a comprehensive tutorial on how to install Nginx on your Ubuntu server. Now, let’s begin!

Block Configuration with Nginx

Nginx divides the configuration meant to serve different content into blocks, which live in a hierarchical structure. Each time a client makes a request, Nginx determines which of these configuration blocks should be used to handle it. This decision process is the focus of this guide.

The primary blocks we discuss are server blocks and location blocks. A server block is a subset of the Nginx configuration that defines a virtual server responsible for handling requests of a defined type. Server blocks are most commonly distinguished by the IP address, domain name, or port of the incoming request. Administrators often configure multiple server blocks and decide which block should handle which connection.

Location blocks reside within server blocks. They define how Nginx should handle requests for different resources and URIs on the parent server. This model is highly flexible: the URI space can be subdivided with these blocks in whichever way the administrator sees fit.

How Nginx Decides Which Server Block Will Handle a Request

Nginx permits the definition of multiple server blocks, all of which function as separate virtual web servers. Therefore, it needs a procedure for determining which server block handles a particular incoming request. It does this through a defined system of checks that finds the best possible match for the request.

The two server block directives that matter most in this process are listen and server_name.

Find Possible Matches with the ‘Listen’ Directive

The first thing Nginx evaluates is the IP address and port of the request. It matches these against the listen directive of each server block to build a list of the server blocks that could possibly resolve the request.

Typically, the listen directive defines the IP address and port that a server block responds to. A server block without a listen directive gets the default listen parameter 0.0.0.0:80, or 0.0.0.0:8000 if Nginx is run by a normal, non-root user. That means such a block responds to requests arriving on any interface on port 80. However, this default value carries little weight in the server selection process.

You can configure the listen directive to:

A lone IP address, which then listens for requests on the default port (80).
A lone port, which listens on every interface on that port.
An IP address and port combination.
The path to a Unix socket (this option generally only has implications when requests are passed between different servers).
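
The four forms listed above can be sketched as follows. This is only an illustration: the address 203.0.113.10, the port 8080, and the socket path are hypothetical values, and each server block is assumed to sit inside the http context.

```nginx
server {
    # Lone IP address: the default port 80 is used.
    listen 203.0.113.10;
}

server {
    # Lone port: listens on every interface on that port.
    listen 8080;
}

server {
    # IP address and port combination.
    listen 203.0.113.10:8080;
}

server {
    # Path to a Unix socket (hypothetical path).
    listen unix:/var/run/nginx.sock;
}
```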

Nginx will implement a set of rules when deciding which server block will have a request sent to it. The rules depend on the particular configuration of the listen directive. They are as follows:

Nginx first completes every incomplete listen directive by substituting default values for the missing pieces, so that each block can be evaluated by its IP address and port.
- A block containing no listen directive uses the default value of 0.0.0.0:80.
- A block set to the IP address 111.111.111.111 with no port becomes 111.111.111.111:80.
- A block set to port 8888 with no IP address becomes 0.0.0.0:8888.
Nginx then collects the server blocks that match the request's IP address and port most specifically. A block that is effectively using 0.0.0.0 (to match any interface) is not selected when another matching block lists the specific IP address. In every case, the port must match exactly.
If only one block is the most specific match, that server block serves the request. If several blocks match at the same level of specificity, Nginx turns to the server_name directive to choose among them.

Nginx only evaluates the server_name directive when it needs to distinguish between server blocks that match at the same level of specificity in the listen directive. For instance, if example.com is hosted on port 80 of 192.168.1.10, a request for example.com will always be served by the first block in this example, regardless of the server_name directive in the second block:
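
A minimal sketch of such a configuration, assuming two server blocks inside the http context whose remaining contents are left as placeholder comments:

```nginx
server {
    # Lists the specific IP address, so it is the more specific
    # match for requests arriving on 192.168.1.10:80.
    listen 192.168.1.10;
    # ... remaining directives ...
}

server {
    # Matches any interface on port 80. It is passed over for
    # requests to 192.168.1.10:80 despite its server_name.
    listen 80;
    server_name example.com;
    # ... remaining directives ...
}
```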

If more than one server block matches with equal specificity, the next step is to check the server_name directive.

Find Possible Matches with the ‘Server_Name’ Directive

If the listen directives are equally specific, Nginx checks the request’s ‘Host’ header. This value holds the domain name or IP address that the client was actually trying to reach. Nginx compares it against the server_name directive inside each server block that is still a candidate. It performs these evaluations in the following order:

Nginx first tries to find a block with a server_name that matches the ‘Host’ header value exactly. If it finds one, that block serves the request. If it finds multiple exact matches, it uses the first one.
If there is no exact match, Nginx looks for a server_name that matches using a leading wildcard (a * at the beginning of the name in the configuration). If it finds one, that block serves the request. If it finds more than one, the longest match serves the request.
If no leading wildcard matches, Nginx looks for a server_name that matches using a trailing wildcard (a name ending with a * in the configuration). If it finds one, that block serves the request. If it finds more than one, Nginx again uses the longest match.
If neither wildcard attempt produces a match, Nginx evaluates the server blocks that define server_name with a regular expression (designated by a ~ before the name). The first server_name whose regular expression matches the ‘Host’ header serves the request.
If at this point there are still no matches, Nginx uses the default server block for that IP address and port combination.

Every IP address/port combination has a default server block, which is used when the rules above fail to select a block. This is either the first block in the configuration for that combination or the block that contains the default_server option in its listen directive (which overrides the first-found rule). Each IP address/port combination can have at most one default_server setting.
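
As a rough sketch of the default server rule, assuming two hypothetical server blocks inside the http context that share the same listen address:

```nginx
server {
    listen 80;
    server_name example.com;
    # First in the configuration: this would be the default
    # if no block were marked default_server.
}

server {
    # default_server overrides the first-found rule, so a request
    # on port 80 whose Host matches neither name is served here.
    listen 80 default_server;
    server_name example.org;
}
```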

Examples of Server Block Selection

If a defined server_name matches the ‘Host’ header value exactly, that server block is selected to process the request. In the following example, the ‘Host’ header of the request is “host1.example.com”, so the second server is selected:
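
A sketch consistent with that description, assuming both blocks sit inside the http context:

```nginx
server {
    listen 80;
    # Also matches host1.example.com, but only as a wildcard.
    server_name *.example.com;
}

server {
    listen 80;
    # Exact match for the Host header, so this block is selected.
    server_name host1.example.com;
}
```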

Without an exact match, Nginx checks for a server_name with a matching leading wildcard. The longest match beginning with a wildcard is selected. In the following example, the ‘Host’ header is “www.example.org”, so the second block is chosen:
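
A sketch consistent with that description, assuming the blocks sit inside the http context:

```nginx
server {
    listen 80;
    # Trailing wildcard: only considered after leading wildcards.
    server_name www.example.*;
}

server {
    listen 80;
    # Longest matching leading wildcard, so this block is selected.
    server_name *.example.org;
}

server {
    listen 80;
    # Also a matching leading wildcard, but shorter.
    server_name *.org;
}
```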

Without a match on a leading wildcard, Nginx moves on to a trailing wildcard check. The longest match ending with a wildcard is selected to process the request. In this instance, the ‘Host’ header is “www.example.com”, so the third server block is chosen:
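
A sketch consistent with that description, assuming the blocks sit inside the http context:

```nginx
server {
    listen 80;
    # Exact name, but for a different host.
    server_name host1.example.com;
}

server {
    listen 80;
    # Exact name without the www prefix: no match.
    server_name example.com;
}

server {
    listen 80;
    # Trailing wildcard matching www.example.com: selected.
    server_name www.example.*;
}
```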

If there are still no matches, Nginx attempts to match server_name directives that use regular expressions. The first matching regular expression is selected to process the request. If the ‘Host’ header is “www.example.com”, the second server block is chosen to serve the request:

If none of the steps above finds a match, the request is passed to the default server block for the matching IP address and port.
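
A sketch consistent with that description, assuming the blocks sit inside the http context. Both regular expressions match the header, and the one that appears first wins:

```nginx
server {
    listen 80;
    # Exact name without the www prefix: no match.
    server_name example.com;
}

server {
    listen 80;
    # First regular expression that matches: selected.
    server_name ~^(www|host1).*\.example\.com$;
}

server {
    listen 80;
    # Would also match, but is never reached.
    server_name ~^(subdomain|set|www|host1).*\.example\.com$;
}
```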

Location Block Matching

Just as Nginx follows a process to select the server block, it follows an established algorithm to decide which location block within that server will handle the request.

Syntax for Location Blocks

Before explaining how Nginx decides how to designate the location block that will handle requests, we will review the syntax in location block definitions. As stated earlier, location blocks reside in server blocks (and other location blocks). Their purpose is to make decisions about how to process the request URI. The URI is the portion of the request that comes after the IP address and port or the domain name on the request.

Location blocks typically look like this:
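
The general form can be sketched as below. Here optional_modifier and location_match are placeholders rather than literal values, so this outline is not a loadable configuration on its own:

```nginx
location optional_modifier location_match {
    # directives that apply to requests handled by this location
}
```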

Nginx checks the URI of the request against the location_match. The presence or absence of a modifier before it dictates how Nginx attempts the match. Depending on the modifier, the location block is interpreted according to the following rules:

No modifiers: Without any modifier, the location is interpreted as a prefix match. This means that the given location is matched against the beginning of the request URI.
=: The equal sign means that this block is considered a match only if the request URI matches the given location exactly.
~: The tilde modifier means that the location is interpreted as a case-sensitive regular expression match.
~*: The combination of a tilde and an asterisk means that the location is interpreted as a case-insensitive regular expression match.
^~: The combination of a caret and a tilde means that, if this block is chosen as the best non-regular expression match, regular expression matching will not take place.

Location Block Syntax Examples

As an example of prefix matching, the following location block may be selected to respond to a request URI such as /site, /site/page1/index.html, or /site/index.html:
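
A sketch of such a block, assumed to sit inside a server block:

```nginx
location /site {
    # Prefix match: any request URI beginning with /site.
}
```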

To demonstrate exact request URI matching, the following block is always used to respond to a request URI of /page1, and never to a request URI of /page1/index.html. Keep in mind that if this block is selected and the request is fulfilled using an index page, an internal redirect to another location takes place, and that location becomes the actual handler of the request:
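
A sketch of such a block, assumed to sit inside a server block:

```nginx
location = /page1 {
    # Exact match: only the request URI /page1.
}
```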

As an example of a location that is interpreted as a case-sensitive regular expression, the following block could handle requests for /tortoise.jpg, but not requests for /FLOWER.PNG:
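
A sketch of such a block, assumed to sit inside a server block. The list of file extensions is an illustrative choice:

```nginx
location ~ \.(jpe?g|png|gif|ico)$ {
    # Case-sensitive regular expression: matches /tortoise.jpg,
    # but not /FLOWER.PNG.
}
```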

Next, consider a similar block that permits case-insensitive matching. In this case, the block could handle both /tortoise.jpg and /FLOWER.PNG:
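
A sketch of such a block, assumed to sit inside a server block and using the same illustrative extension list:

```nginx
location ~* \.(jpe?g|png|gif|ico)$ {
    # Case-insensitive regular expression: matches both
    # /tortoise.jpg and /FLOWER.PNG.
}
```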

The final variant is a block that prevents regular expression matching from taking place when it is determined to be the best non-regular expression match. This one can handle requests for /costumes/ninja.html:
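
A sketch of such a block, assumed to sit inside a server block:

```nginx
location ^~ /costumes {
    # Prefix match that, when it is the longest prefix match,
    # stops Nginx from trying any regular expression locations.
}
```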

To put a fine point on it, the modifiers dictate the way in which location blocks are determined. This does not, however, tell us what Nginx utilizes as the decision-making algorithm to identify the location block that a request is to be sent to. Let us address that next.

Choose the Location that Will Handle Requests by Nginx

The method by which Nginx chooses the location that processes a request is similar to how server blocks are selected. In other words, it determines the optimal location for every request by running through a process. In order to configure Nginx accurately and accordingly, it is imperative that you understand this process.

Bearing in mind the location declarations described earlier, Nginx evaluates the possible location contexts by comparing the request URI to each of the locations. It does this using the following algorithm:

Nginx begins by checking all prefix-based locations, meaning every location type that does not involve a regular expression. It checks each of these locations against the complete request URI.
First, Nginx looks for an exact match. If a location block using the = modifier matches the request URI exactly, that location block is selected immediately to handle the request.
If no exact (= modifier) location matches, Nginx moves on to evaluate non-exact prefixes. It finds the longest prefix location that matches the request URI and then evaluates it as follows:
- If the location with the longest prefix match uses the ^~ modifier, Nginx ends its search immediately and selects this location.
- If the location with the longest prefix match does not use the ^~ modifier, Nginx stores the match for the moment so that the search can move on.
Once the longest matching prefix location is found and stored, Nginx moves on to the regular expression locations, both case-sensitive and case-insensitive. If any regular expression locations are nested within the longest matching prefix location, Nginx moves them to the top of its list of regular expression locations to check. It then tries the regular expression locations in order. The first regular expression that matches the request URI is immediately selected to serve the request.
If no regular expression location matches the request URI, the previously stored prefix location is selected to serve the request.

By default, Nginx prefers regular expression matches over prefix matches. It does, however, evaluate prefix locations first, so that the administrator can override this tendency with the = and ^~ modifiers.

Another important takeaway is that while prefix locations are typically based on the most specific, longest found match, a regular expression check is stopped as soon as the first match is identified. This means that positioning within the configuration has real implications for regular expression locations.

A final point to touch upon is that the regular expression matches within the match with the longest prefix will essentially jump the line during Nginx’s location evaluations. These will be positioned at the top of the list and evaluated ahead of other regular expressions.
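
The algorithm can be traced through a small configuration. This is only a sketch: the paths and the extension list are hypothetical, and the comments note which request URI each location would end up serving under the rules described above.

```nginx
server {
    listen 80;

    location = /images/logo.png {
        # Exact match: /images/logo.png is handled here at once.
    }

    location ^~ /images/raw/ {
        # Longest prefix for /images/raw/photo.png. Because of ^~,
        # the search stops and no regular expression is tried.
    }

    location /images/ {
        # Longest prefix for /images/readme.txt. The match is stored,
        # no regular expression matches, so this block serves it.
    }

    location ~* \.(png|jpe?g|gif)$ {
        # /images/cat.png lands here: /images/ is the longest prefix,
        # but the matching regular expression takes priority.
    }
}
```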

When Does Location Block Evaluation Jump to Other Locations?

Typically, once a location block is selected to serve a request, the request is handled entirely within that context. This means that only the selected location and the inherited directives determine how the request is processed, without any interference from sibling location blocks.

While this general rule allows location blocks to be designed predictably, certain directives within the selected location can trigger a new location search. In other words, the ‘just one location block’ rule has a few exceptions. Those exceptions may affect how a request is actually served and may not match the expectations the location blocks were designed around.

Directives that can lead to this kind of internal redirect include:

index
rewrite
error_page
try_files

The index directive always leads to an internal redirect when it is used to handle the request. Exact location matches are often used to speed up the selection process by ending the algorithm immediately. However, if an exact location match is a directory, there is a good chance that the request will be redirected to a different location for actual processing.

For example, the following first location matches with a request URI of /exact. However, to process the request, the index directive that the location block inherits redirects the request to a secondary block:
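
A sketch of that situation, assuming a single server block with placeholder comments in place of the remaining directives:

```nginx
server {
    listen 80;
    index index.html;

    location = /exact {
        # Matched first by a request URI of /exact. The inherited
        # index directive can then redirect the request internally.
    }

    location / {
        # Receives the request after the internal redirect.
    }
}
```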

In that scenario, if execution needs to stay within the first block, you need another way to satisfy the request to the directory. One way to do this is to set an invalid index for the block in question and turn on autoindex instead:

This is one way of preventing an index from switching contexts, but it is probably not useful for most configurations. An exact match on a directory is mostly helpful for things like rewriting the request, which also triggers a new location search.
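
A sketch of that workaround, assumed to sit inside a server block. The index file name nothing_will_match is deliberately a name that should not exist:

```nginx
location = /exact {
    # An index file that cannot be found, so no internal redirect
    # to an index page takes place.
    index nothing_will_match;
    # Generate a directory listing from within this block instead.
    autoindex on;
}

location / {
    # No longer involved in serving the /exact request.
}
```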

Another directive that can cause the processing location to be reevaluated is try_files. It tells Nginx to check whether a named set of files or directories exists. Its last parameter can be a URI to which Nginx makes an internal redirect.

Let’s think about the following configuration:
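
A sketch of such a configuration, using the paths that the walkthrough below refers to and assuming a single server block:

```nginx
server {
    listen 80;
    root /var/www/main;

    location / {
        # Try the file, then the file with .html appended, then a
        # directory; otherwise redirect internally to the last URI.
        try_files $uri $uri.html $uri/ /fallback/index.html;
    }

    location /fallback {
        # Picked up by the new location search after the redirect.
        root /var/www/another;
    }
}
```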

If there is a request for /blahblah, the first location receives it. Nginx looks for a file called blahblah in the /var/www/main directory. If it does not find one, it looks for a file called blahblah.html, and then for a directory called blahblah within /var/www/main. If all of those checks fail, it redirects to /fallback/index.html. This triggers another location search, which the second location block picks up, and Nginx serves the file /var/www/another/fallback/index.html.

Another directive that can pass the request to a different location block is rewrite. When it is used with the last flag, Nginx searches for a new matching location based on the result of the rewrite. If the previous example is modified to include a rewrite, it becomes evident that the request is sometimes passed directly to the second location without relying on the try_files directive:

In this example, a request for /rewrite/hello is handled initially by the first location block. It is rewritten to /hello, and a new location search is triggered. That search matches the first location again, and the request is processed by the try_files directive as usual, possibly falling back to /fallback/index.html if nothing is found.

If a request is made for /rewrite/fallback/hello, however, the first block matches again and the rewrite is applied again, this time yielding /fallback/hello. The request is then served out of the second location block.
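
A sketch of the modified configuration, assuming the same server block as before with one rewrite rule added. The pattern strips a leading /rewrite from the URI:

```nginx
server {
    listen 80;
    root /var/www/main;

    location / {
        # The last flag starts a new location search on the result.
        rewrite ^/rewrite/(.*)$ /$1 last;
        try_files $uri $uri.html $uri/ /fallback/index.html;
    }

    location /fallback {
        root /var/www/another;
    }
}
```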

Similar situations occur when you use the return directive to send 301 or 302 status codes. The only difference is that a new request results, and manifests in a very obvious redirect. Similarly, this can occur with the rewrite directive when you apply permanent or redirect flags.
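
As a brief sketch of these externally visible redirects, assuming hypothetical paths and blocks that sit inside a server block:

```nginx
location /old {
    # Sends a 301 response, so the client makes a new request.
    return 301 /new;
}

location /legacy/ {
    # The permanent flag also produces a 301 response instead of
    # an internal location search.
    rewrite ^/legacy/(.*)$ /current/$1 permanent;
}
```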

Another directive that can lead to internal redirects, similar to those created by try_files, is error_page. It defines what should happen when particular status codes are encountered during processing. When a try_files directive is set, the error_page directive will likely never be executed, because try_files handles the full life cycle of the request.

Let’s consider the following example:
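
A sketch of such a configuration, using the paths that the walkthrough below refers to and assuming a single server block:

```nginx
server {
    listen 80;
    root /var/www/main;

    location / {
        # On a 404, redirect internally to this URI.
        error_page 404 /another/whoops.html;
    }

    location /another {
        # Serves /another/whoops.html from /var/www/another/whoops.html.
        root /var/www;
    }
}
```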

In this case, every request except those starting with /another is handled by the first block, which serves files out of /var/www/main. If a file is not found (a 404 status), an internal redirect to /another/whoops.html takes place. This leads to a new location search, which lands on the second block, and the file is served out of /var/www/another/whoops.html.

As evident, comprehension of situations where Nginx will trigger a new location search can help better predict system behavior when requests are being processed.

Conclusion

Administrators’ jobs become immensely simpler when they understand the methods by which Nginx addresses client requests. This allows administrators to ascertain which server block the request will go to. They can also determine the location block that will be selected based on the request URI. By and large, it also affords administrators the ability to trace the Nginx applied contexts when addressing each request.

Finally, you can take a look at the other tutorials on our blog focusing on Nginx. They will help you get more out of one of the most popular web servers in the world.

Happy Computing!

About Manpreet Singh

Java developer

Nginx Server and Location Block Selection Algorithms: Overview - May 14, 2021