
Nginx Modules

In this article, we will explore the world of Nginx modules and their importance in extending the functionality of your Nginx web server. Nginx modules are crucial components that enable you to add new features, enhance performance, and customize the behavior of your server. Understanding the role and types of Nginx modules will empower you to optimize your server configuration and deliver a highly efficient web application.

What are Nginx Modules?

Nginx modules are modular components that allow you to expand the capabilities of your Nginx server. They are designed to address specific functionalities and can be easily added or removed from your server configuration. Modules act as building blocks, enabling you to tailor Nginx to meet your specific requirements. Whether you need to enhance security, enable caching, or implement load balancing, Nginx modules provide the necessary tools to achieve these objectives.

Types of Nginx Modules

There are two main types of Nginx modules: core modules and third-party modules.

Popular Third-Party Modules for Nginx

There is a vibrant ecosystem of third-party Nginx modules developed by the community. These modules extend Nginx’s capabilities and offer advanced features and functionalities. Third-party Nginx modules cater to various use cases such as content caching, authentication, rate limiting, security, and more. Some popular third-party modules include the Nginx Amplify module, Lua module, GeoIP module, and Let’s Encrypt module.

There are numerous third-party modules available for Nginx that extend its functionality and provide additional features.

Here are some popular third-party Nginx modules:

Lua Module

This module integrates the Lua scripting language into Nginx, allowing you to write powerful and flexible configurations and extensions using Lua scripts.

To configure the Nginx Lua module, follow the step-by-step guide provided in the article Install and Configure Nginx Lua Module. This comprehensive guide will walk you through the entire process, ensuring a smooth and successful configuration.
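As a quick illustration, a location that generates its response from Lua might look like the following sketch (this assumes the lua-nginx-module is compiled in; the URL path is a placeholder):

```nginx
location /lua-demo {
    default_type text/plain;
    content_by_lua_block {
        ngx.say("Hello from Lua inside Nginx")
    }
}
```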

Brotli Module

ngx_brotli is a module that enables Brotli compression support in Nginx. Brotli is a compression algorithm developed by Google that offers superior compression ratios compared to gzip.

To configure the Nginx Brotli module, refer to the detailed step-by-step guide outlined in the article Install and Configure Nginx Brotli Module.
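Once the module is built in, enabling Brotli compression is typically a matter of a few directives (the values shown are illustrative, not recommendations):

```nginx
brotli on;
brotli_comp_level 6;    # 0-11; higher means smaller output but more CPU
brotli_types text/plain text/css application/javascript application/json;
```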

Redis Module

The Redis module integrates Nginx with Redis, a popular in-memory data store. It allows you to cache content, perform dynamic lookups, and leverage Redis’ key-value store within Nginx configurations.

To set up Redis with your Nginx server, carefully follow the instructions provided in this comprehensive guide.

Nginx Pagespeed Module

The Nginx Pagespeed module, developed by Google, optimizes web page delivery and performance. It automatically applies various optimizations like minification, compression, and caching to improve page load times.

To configure the Nginx Pagespeed Module, consult the comprehensive step-by-step guide provided in the article Nginx Pagespeed Module. This guide will walk you through the process and ensure a smooth configuration experience.
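A minimal sketch of enabling the module might look like this (the cache path is a placeholder and must be writable by the nginx worker; the filter names come from the module’s documentation):

```nginx
pagespeed on;
pagespeed FileCachePath /var/ngx_pagespeed_cache;
pagespeed EnableFilters collapse_whitespace,combine_css;
```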

ModSecurity Module

The ModSecurity module integrates the ModSecurity Web Application Firewall (WAF) into Nginx. It provides advanced security features, including request filtering, intrusion detection, and protection against various web attacks.

The tutorial How to Install Nginx ModSecurity Module provides a detailed step-by-step guide that will make the configuration process a breeze. Just follow the instructions outlined in the article, and you’ll have your Nginx ModSecurity module up and running smoothly in no time.
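With the ModSecurity-nginx connector compiled in, enabling the WAF for a server block is roughly this (the rules file path is a placeholder for your own ModSecurity configuration):

```nginx
modsecurity on;
modsecurity_rules_file /etc/nginx/modsec/main.conf;
```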

Nginx RTMP Module

The Nginx RTMP module adds support for real-time streaming and broadcasting using the RTMP (Real-Time Messaging Protocol) protocol. It allows you to build scalable video streaming platforms or deliver live video content. Feel free to check out our comprehensive guide on installing and configuring the Nginx RTMP module. It covers everything you need to know to get started with setting up Nginx for RTMP streaming.
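For orientation, a bare-bones RTMP configuration might look like this sketch (the application name is arbitrary):

```nginx
rtmp {
    server {
        listen 1935;          # default RTMP port
        chunk_size 4096;

        application live {
            live on;          # accept live streams
            record off;
        }
    }
}
```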

Let’s Encrypt Nginx Module

This module simplifies the integration of Let’s Encrypt SSL certificates into Nginx configurations. It automatically handles certificate issuance, renewal, and installation, making it easier to secure your websites with free SSL/TLS certificates.

To configure Let’s Encrypt with Nginx, simply follow the steps outlined in our tutorial on setting up Nginx with Let’s Encrypt. By following these instructions, you’ll be able to seamlessly integrate Let’s Encrypt SSL/TLS certificates into your Nginx server and ensure secure communication for your website or application. Let’s dive in and get started!
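After a client such as certbot has issued a certificate, the resulting server block typically references the Let’s Encrypt paths like this (example.com is a placeholder domain):

```nginx
server {
    listen 443 ssl;
    server_name example.com;

    ssl_certificate     /etc/letsencrypt/live/example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/example.com/privkey.pem;
}
```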

Nginx Upload Progress Module

The Nginx Upload Progress module tracks and reports the progress of file uploads. It allows you to provide real-time upload progress feedback to users during large file uploads. To configure the Nginx Upload Progress Module on your website, follow the steps mentioned in our tutorial.

GeoIP2 Nginx Module

This module integrates MaxMind’s GeoIP2 databases into Nginx, enabling you to determine the geographic location of clients based on their IP addresses.
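A hedged sketch of wiring up a country lookup with this module (the database path and the variable name are placeholders; the directive syntax follows the geoip2 module’s documentation):

```nginx
geoip2 /etc/maxmind/GeoLite2-Country.mmdb {
    $geoip2_country_code country iso_code;
}
```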

HttpEchoModule

The HttpEchoModule allows you to easily send custom HTTP responses from Nginx. It is useful for testing and debugging purposes, or for creating specialized HTTP endpoints.
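For example, a debugging endpoint built with this module might look like this (assuming the echo-nginx-module is compiled in):

```nginx
location /hello {
    echo "Hello from the echo module";
}
```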

These are just a few examples of popular third-party modules available for Nginx. There are many more modules developed by the Nginx community that provide additional functionality and customization options for your Nginx server.

Nginx Core Modules

Nginx Core modules are integral and provide fundamental features for handling HTTP requests, managing server events, enabling SSL/TLS encryption, and more. Let’s explore some essential core modules and their functionalities:

HTTP Module

The HTTP module is responsible for handling HTTP requests and responses. It enables you to configure server-wide settings, set up virtual hosts, define location-based rules, and manage proxying and load balancing.

Events Module

The Events module allows you to configure how Nginx handles network connections and events. You can control parameters such as the maximum number of connections per worker, timeouts, and the event processing method to use (for example, epoll on Linux or kqueue on BSD).
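A typical events block looks like this (the values are illustrative; the event method is usually auto-detected and rarely needs to be set explicitly):

```nginx
events {
    worker_connections 1024;   # max simultaneous connections per worker
    use epoll;                 # event processing method on Linux
}
```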

SSL Module

The SSL module provides support for SSL/TLS encryption, allowing you to secure communication between clients and your server. It enables the configuration of SSL certificates, cipher suites, and other security-related settings.
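A minimal TLS-enabled server block might look like this sketch (the certificate paths are placeholders):

```nginx
server {
    listen 443 ssl;

    ssl_certificate     /etc/nginx/certs/site.crt;
    ssl_certificate_key /etc/nginx/certs/site.key;
    ssl_protocols       TLSv1.2 TLSv1.3;
}
```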

Here is a list of some of the core modules available in Nginx:

ngx_http_core_module

The ngx_http_core_module provides the basic functionality of the HTTP server. It handles request processing, URI mapping, and access control. This module is essential for any HTTP server configuration in Nginx.

ngx_http_ssl_module

The ngx_http_ssl_module enables HTTPS support in Nginx by providing SSL/TLS encryption for secure communication between clients and the server. It allows you to configure SSL certificates and specify SSL-related settings.

ngx_http_access_module

The ngx_http_access_module allows you to control access to the server based on client IP addresses, domain names, or other request attributes. It provides directives like allow and deny to specify access rules and restrictions.
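For example, restricting a path to a local subnet (the addresses are placeholders):

```nginx
location /admin {
    allow 192.168.1.0/24;   # permit the internal network
    deny  all;              # reject everyone else
}
```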

ngx_http_proxy_module

The ngx_http_proxy_module implements a proxy server functionality in Nginx. It enables Nginx to act as a reverse proxy, forwarding client requests to other servers and processing responses. This module is commonly used for load balancing and caching.
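A sketch of a reverse-proxy location (the backend address is a placeholder):

```nginx
location / {
    proxy_pass http://127.0.0.1:8080;
    proxy_set_header Host      $host;         # preserve the original Host
    proxy_set_header X-Real-IP $remote_addr;  # pass the client address along
}
```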

ngx_http_fastcgi_module

The ngx_http_fastcgi_module allows Nginx to interact with FastCGI servers, enabling the execution of dynamic scripts and applications. It provides directives to define FastCGI server addresses and control the communication between Nginx and the FastCGI server.

ngx_http_rewrite_module

The ngx_http_rewrite_module provides URL rewriting capabilities in Nginx. It allows you to modify request URIs or redirect requests based on defined rules. This module is often used for URL manipulation and redirection.
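For example, a simple permanent redirect rule (the paths are illustrative):

```nginx
# Redirect /old/... URLs to /new/... with a 301
rewrite ^/old/(.*)$ /new/$1 permanent;
```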

ngx_http_gzip_module

The ngx_http_gzip_module enables gzip compression for HTTP responses in Nginx. It reduces the size of transmitted data, improving performance and reducing bandwidth usage. This module can be configured to compress certain types of files or based on client request headers.
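A typical gzip configuration looks like this (the values are illustrative):

```nginx
gzip on;
gzip_min_length 1024;    # skip very small responses
gzip_types text/plain text/css application/json application/javascript;
```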

ngx_http_realip_module

The ngx_http_realip_module changes the client address that Nginx sees to the one passed in a designated request header (such as X-Real-IP or X-Forwarded-For). This module is useful when Nginx sits behind a load balancer or another proxy, as it ensures that logging, access control, and rate limiting use the real client IP address rather than the proxy’s address.
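For example (the trusted proxy range is a placeholder for your own infrastructure):

```nginx
set_real_ip_from  10.0.0.0/8;        # trust proxies in this range
real_ip_header    X-Forwarded-For;   # take the client IP from this header
real_ip_recursive on;                # skip trusted hops in the header chain
```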

ngx_http_limit_req_module

The ngx_http_limit_req_module allows you to limit the request rate from clients. It helps prevent abuse or excessive resource consumption by enforcing limits on the number of requests per second or per minute from individual IP addresses or other request attributes.
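For example (zone name, size, and rate are illustrative):

```nginx
# In the http block: a 10 MB zone keyed by client address, 10 requests/second
limit_req_zone $binary_remote_addr zone=one:10m rate=10r/s;

# In a location block: allow short bursts of up to 20 queued requests
location /api/ {
    limit_req zone=one burst=20 nodelay;
}
```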

ngx_http_autoindex_module

The ngx_http_autoindex_module generates directory listings for directories that don’t have an index file. It allows users to browse the contents of a directory when an index file is not present, making it useful for serving static files.

ngx_http_auth_basic_module

The ngx_http_auth_basic_module provides HTTP basic authentication support in Nginx. It allows you to protect resources by requiring username and password authentication from clients. This module is commonly used to secure specific areas of a website.
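For example (the password file path is a placeholder; the file can be created with a tool such as htpasswd):

```nginx
location /private/ {
    auth_basic           "Restricted area";
    auth_basic_user_file /etc/nginx/.htpasswd;
}
```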

ngx_http_stub_status_module

The ngx_http_stub_status_module exposes basic server status information on a simple plain-text page. It provides metrics such as the number of active connections, requests being processed, and other useful information about the server’s performance.
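A common way to expose the status page to localhost only:

```nginx
location /nginx_status {
    stub_status;
    allow 127.0.0.1;   # metrics visible from the machine itself
    deny  all;
}
```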

ngx_http_v2_module

The ngx_http_v2_module enables support for the HTTP/2 protocol in Nginx. It allows clients and servers to communicate using the more efficient HTTP/2 protocol, which offers features such as multiplexing, server push, and header compression.
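On nginx versions that predate the standalone http2 directive (introduced in 1.25.1), enabling HTTP/2 is a flag on the listen directive:

```nginx
listen 443 ssl http2;
```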

ngx_http_dav_module

The ngx_http_dav_module provides support for WebDAV (Web Distributed Authoring and Versioning) functionality in Nginx. It allows clients to perform file operations such as upload, download, and delete through HTTP.

ngx_http_flv_module

The ngx_http_flv_module enables streaming of FLV (Flash Video) files in Nginx. It allows clients to view FLV video files in real-time as they are being downloaded from the server, making it useful for video streaming applications.

ngx_http_mp4_module

The ngx_http_mp4_module provides support for streaming MP4 (MPEG-4 Part 14) files in Nginx. It allows clients to progressively play MP4 files while they are still being downloaded, making it useful for video streaming applications.

ngx_http_random_index_module

The ngx_http_random_index_module allows Nginx to select a random file from a directory to serve as the index file. This module is handy when you want to display a different file as the default page each time the directory is accessed.

ngx_http_secure_link_module

The ngx_http_secure_link_module provides a mechanism for creating secure links to protect your resources. It generates time-limited and tamper-proof URLs that grant temporary access to specific resources.

ngx_http_slice_module

The ngx_http_slice_module allows Nginx to serve large files in smaller slices or chunks. It helps optimize file transmission and enables clients to download files in parts, supporting resumable downloads.

ngx_http_ssi_module

The ngx_http_ssi_module enables Server Side Includes (SSI) functionality in Nginx. It allows you to include dynamic content in static HTML pages, such as displaying the current date, including the output of a script, or conditionally showing content based on request parameters.

ngx_http_userid_module

The ngx_http_userid_module assigns a unique identifier to clients visiting your server. It sets a cookie with a unique ID for each client, allowing you to track and identify individual users.

ngx_http_headers_module

The ngx_http_headers_module allows you to modify and manipulate HTTP headers in Nginx. It provides directives to add, modify, or remove headers from the client’s request or the server’s response.

ngx_http_referer_module

The ngx_http_referer_module allows you to block or control access based on the referring URL. It provides directives to restrict access to resources based on the HTTP Referer header.

ngx_http_memcached_module

The ngx_http_memcached_module integrates Nginx with a Memcached server. It allows you to cache and retrieve data from Memcached, improving performance by serving cached content directly from memory.

ngx_http_empty_gif_module

The ngx_http_empty_gif_module returns a 1×1 transparent GIF image. It is often used as a placeholder or tracking pixel.

ngx_http_geo_module

The ngx_http_geo_module provides geolocation-based features in Nginx. It allows you to define geographic IP ranges and perform actions based on the client’s location.

ngx_http_map_module

The ngx_http_map_module enables you to define key-value mappings and use them in various parts of the configuration. It is useful for conditional configuration based on variables.
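For example, deriving a variable from the User-Agent header (variable name and pattern are illustrative):

```nginx
# In the http block: $is_mobile becomes 1 for user agents matching "mobile"
map $http_user_agent $is_mobile {
    default   0;
    ~*mobile  1;
}
```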

ngx_http_split_clients_module

The ngx_http_split_clients_module allows you to split client traffic based on various factors, such as random distribution or by specific variables. It is commonly used for A/B testing or traffic splitting purposes.

ngx_http_upstream_module

The ngx_http_upstream_module provides functionality for load balancing and proxying requests to backend servers. It allows you to define a group of upstream servers and distribute client requests among them.
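A sketch of a load-balanced upstream group (the backend addresses are placeholders):

```nginx
upstream backend {
    least_conn;              # send requests to the least-busy server
    server 10.0.0.1:8080;
    server 10.0.0.2:8080;
}

server {
    location / {
        proxy_pass http://backend;
    }
}
```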

ngx_http_fastcgi_cache_module

The ngx_http_fastcgi_cache_module enables caching of FastCGI responses. It allows Nginx to store and serve cached content from FastCGI applications, improving performance and reducing the load on backend servers.

ngx_http_addition_module

The ngx_http_addition_module allows you to add additional content to HTTP responses. It provides directives ( add_before_body and add_after_body ) to prepend or append the result of a subrequest to the response body.

ngx_http_xslt_module

The ngx_http_xslt_module enables XSLT (Extensible Stylesheet Language Transformations) support in Nginx. It allows you to apply XSLT transformations to XML data before serving it to clients.

ngx_http_image_filter_module

The ngx_http_image_filter_module provides image processing capabilities in Nginx. It allows you to resize, crop, rotate, and perform other transformations on images dynamically.

ngx_http_sub_module

The ngx_http_sub_module enables response content substitution in Nginx. It allows you to replace specific strings or patterns in the response body with other content, providing the ability to modify the response on the fly.

ngx_http_dav_ext_module

The ngx_http_dav_ext_module extends the WebDAV (Web Distributed Authoring and Versioning) functionality in Nginx. It adds support for additional WebDAV methods such as PROPFIND, OPTIONS, LOCK, and UNLOCK.

ngx_http_flv_live_module

The ngx_http_flv_live_module provides support for live FLV (Flash Video) streaming in Nginx. It allows clients to watch live video streams in FLV format.

ngx_http_gunzip_module

The ngx_http_gunzip_module enables on-the-fly decompression of gzipped HTTP responses. It automatically decompresses gzipped content before serving it to clients that do not support gzip compression.

ngx_http_mirror_module

The ngx_http_mirror_module allows you to mirror incoming requests to one or more remote servers. It is useful for performing A/B testing, load testing, or capturing requests for analysis.

ngx_http_auth_request_module

The ngx_http_auth_request_module enables subrequest-based authentication in Nginx. It allows you to perform an internal subrequest to authenticate a request before allowing access to protected resources.

ngx_http_perl_module

The ngx_http_perl_module integrates the Perl programming language into Nginx. It allows you to write custom modules or scripts in Perl to extend the functionality of Nginx.

ngx_http_geoip_module

The ngx_http_geoip_module provides geolocation-based features using the MaxMind GeoIP database in Nginx. It allows you to determine the geographic location of a client based on their IP address.

ngx_http_degradation_module

The ngx_http_degradation_module provides a mechanism to degrade or limit the functionality of Nginx based on various conditions. It allows you to control the behavior of Nginx under high load or other situations.

ngx_http_headers_more_module

The ngx_http_headers_more_module extends the functionality of the ngx_http_headers_module. It allows you to set, modify, or clear headers in the client’s request or the server’s response.

ngx_http_js_module

The ngx_http_js_module integrates njs, a subset of JavaScript, into Nginx. It enables you to write custom handlers and variables in JavaScript to extend the functionality of Nginx.

These are just a few examples of the core modules available in Nginx. Nginx is highly modular, and there are many additional modules that can be added to extend its functionality further. By leveraging both core and third-party modules, you can unlock a vast array of possibilities and fine-tune your Nginx server to meet your specific needs.

Please Note: we have provided an overview of the functionality of each Nginx module. In future articles, we will take a closer look at each Nginx module individually, exploring their features and capabilities in depth. Stay tuned for dedicated articles on each module.

Nginx Modules Best Practices

When using Nginx modules, it’s important to follow best practices to ensure smooth integration, maintainability, and performance.

Here are some best practices for using Nginx modules:

  • Select Nginx modules that align with your specific requirements. Consider factors such as stability, community support, compatibility with your Nginx version, and the module’s track record.
  • Keep your Nginx modules up to date with the latest stable versions. This ensures you benefit from bug fixes, security patches, and new features.
  • Thoroughly review the documentation provided by the module developers. Understand the module’s configuration directives, dependencies, and any limitations or considerations specific to the module.
  • Before deploying a module in a production environment, test it thoroughly in a controlled staging or development environment. Verify its functionality, performance impact, and compatibility with your existing setup.
  • Always create backups of your Nginx configuration files before making changes or adding modules. This allows you to revert to a working configuration if any issues arise.
  • Organize your Nginx configuration by keeping module-specific settings in separate files. This improves readability, makes it easier to manage and update individual Nginx modules, and allows for better modularization.
  • Be mindful of module dependencies and avoid unnecessary dependencies between modules. Minimize the number of modules loaded to reduce resource usage and potential conflicts.
  • Monitor the performance of your Nginx server after adding or updating modules. Keep an eye on resource utilization, response times, and error logs. Performance testing and benchmarking can help identify any bottlenecks or performance issues introduced by the Nginx modules.
  • Regularly review the security posture of your Nginx installation, including the modules used. Stay informed about any security advisories or vulnerabilities associated with the modules and update them promptly.
  • Engage with the Nginx community, including module developers and user forums. This provides an opportunity to seek help, share experiences, and contribute back to the community by reporting issues or providing feedback.


By Mehul Mohan

Understanding Nginx Modules

  • Setting up the development environment
  • Creating a custom module
  • Compiling and installing the custom module
  • Testing the custom module

Nginx is a powerful, high-performance web server, reverse proxy server, and load balancer. It's well-known for its ability to handle a large number of connections and serve static files efficiently. One of the reasons behind its popularity is its modular architecture, which allows developers to extend its functionality by creating custom modules. In this blog post, we'll take an in-depth look at Nginx modules and guide you through the process of developing your custom module. We'll explore the different types of modules, their structure, and how to compile and install them. Let's dive in!

Nginx is designed with a modular architecture, which means its functionality is divided into smaller, independent components called modules. These modules can be enabled or disabled during the compilation process, allowing users to tailor Nginx to their specific needs. There are several types of modules in Nginx, including:

  • Core modules
  • Event modules
  • Protocol modules (like HTTP, Mail, and Stream)
  • HTTP modules
  • Mail modules
  • Stream modules

For this guide, we'll focus on developing custom HTTP modules since they're the most commonly used modules.

Before we start developing our custom module, let's set up a development environment. We'll need the following tools and dependencies:

  • A Linux-based operating system (we'll use Ubuntu in this guide)
  • Nginx source code
  • GCC (GNU Compiler Collection) and development tools
  • Text editor or IDE of your choice

First, install the necessary tools and dependencies:
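On Ubuntu, this step might look like the following (package names may differ on other distributions):

```shell
sudo apt update
sudo apt install -y build-essential libpcre3-dev zlib1g-dev libssl-dev
```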

Next, download the Nginx source code and extract it:
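For example (the version number is only an example; pick a current release from nginx.org):

```shell
wget https://nginx.org/download/nginx-1.24.0.tar.gz
tar -xzf nginx-1.24.0.tar.gz
cd nginx-1.24.0
```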

Now that our development environment is ready, let’s create a simple "Hello, World!" module. We’ll name it ngx_http_hello_module.

Step 1: Create the Module Directory and Source Files

Create a new directory for your module and navigate to it:
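For example (the path is arbitrary; it can live outside the nginx source tree):

```shell
mkdir ~/ngx_http_hello_module
cd ~/ngx_http_hello_module
```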

Next, create a new C source file named ngx_http_hello_module.c and open it in your favorite text editor or IDE.

Step 2: Define the Module's Structure

In the ngx_http_hello_module.c file, we'll start by including the necessary Nginx headers and defining the module's structure:
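A minimal sketch of such a module is shown below. Note that this is a reconstruction for illustration: it compiles only as part of the nginx build (via --add-module), and error handling is abbreviated.

```c
#include <ngx_config.h>
#include <ngx_core.h>
#include <ngx_http.h>

static char *ngx_http_hello(ngx_conf_t *cf, ngx_command_t *cmd, void *conf);
static ngx_int_t ngx_http_hello_handler(ngx_http_request_t *r);

/* The "hello" directive: valid in location blocks, takes no arguments. */
static ngx_command_t ngx_http_hello_commands[] = {
    { ngx_string("hello"),
      NGX_HTTP_LOC_CONF|NGX_CONF_NOARGS,
      ngx_http_hello,
      NGX_HTTP_LOC_CONF_OFFSET,
      0,
      NULL },
    ngx_null_command
};

/* No pre/postconfiguration or conf create/merge hooks are needed here. */
static ngx_http_module_t ngx_http_hello_module_ctx = {
    NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL
};

ngx_module_t ngx_http_hello_module = {
    NGX_MODULE_V1,
    &ngx_http_hello_module_ctx,   /* module context */
    ngx_http_hello_commands,      /* module directives */
    NGX_HTTP_MODULE,              /* module type */
    NULL, NULL, NULL, NULL,       /* init/exit callbacks (unused) */
    NULL, NULL, NULL,
    NGX_MODULE_V1_PADDING
};

static ngx_int_t
ngx_http_hello_handler(ngx_http_request_t *r)
{
    static u_char msg[] = "Hello, World!";
    ngx_buf_t    *b;
    ngx_chain_t   out;

    /* Headers first. */
    r->headers_out.status = NGX_HTTP_OK;
    r->headers_out.content_length_n = sizeof(msg) - 1;
    ngx_str_set(&r->headers_out.content_type, "text/plain");
    ngx_http_send_header(r);

    /* One buffer carries the whole body. */
    b = ngx_calloc_buf(r->pool);
    if (b == NULL) {
        return NGX_HTTP_INTERNAL_SERVER_ERROR;
    }
    b->pos = msg;
    b->last = msg + sizeof(msg) - 1;
    b->memory = 1;     /* read-only content */
    b->last_buf = 1;   /* final buffer of the response */

    out.buf = b;
    out.next = NULL;

    return ngx_http_output_filter(r, &out);
}

static char *
ngx_http_hello(ngx_conf_t *cf, ngx_command_t *cmd, void *conf)
{
    ngx_http_core_loc_conf_t *clcf;

    /* Install our handler for the enclosing location. */
    clcf = ngx_http_conf_get_module_loc_conf(cf, ngx_http_core_module);
    clcf->handler = ngx_http_hello_handler;

    return NGX_CONF_OK;
}
```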

In the code above, we defined a simple "Hello, World!" HTTP module. We started by including the required Nginx headers and defining the module's structure, commands, and context. Then, we implemented the ngx_http_hello and ngx_http_hello_handler functions to handle the hello directive and generate the response.

Now that our custom module is ready, we need to compile and install it. First, navigate back to the Nginx source directory:

Next, configure the Nginx build with the --add-module option to include our custom module:

Compile and install Nginx:

Finally, start Nginx:
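Taken together, the steps above might look like this (the version number and module path are examples):

```shell
cd ~/nginx-1.24.0
./configure --add-module=$HOME/ngx_http_hello_module
make
sudo make install
sudo /usr/local/nginx/sbin/nginx
```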

To test our custom module, we’ll need to modify the Nginx configuration file located at /usr/local/nginx/conf/nginx.conf. Open the file in your favorite text editor and add the hello directive (provided by our module) inside the location / block.

Your configuration file should look like this:
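A trimmed-down version of that configuration might look like this sketch:

```nginx
worker_processes 1;

events {
    worker_connections 1024;
}

http {
    server {
        listen 80;

        location / {
            hello;   # directive provided by our custom module
        }
    }
}
```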

Restart Nginx to apply the changes:
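Assuming the default install prefix:

```shell
sudo /usr/local/nginx/sbin/nginx -s reload
```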

Now, visit http://localhost/ in your web browser or use a command-line tool like curl:
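```shell
curl http://localhost/
```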

You should see the "Hello, World!" message, indicating that our custom module is working as expected.

FAQ

1. Can I use third-party modules with the official Nginx package?

Yes, you can use third-party modules with the official Nginx package. However, you'll need to recompile Nginx with the --add-module or --add-dynamic-module options to include the third-party module.

2. What is the difference between a static module and a dynamic module?

A static module is compiled directly into the Nginx binary, while a dynamic module is compiled as a separate shared object file (.so) that can be loaded at runtime. Static modules are always available, whereas dynamic modules can be loaded or unloaded as needed.

3. Can I load my custom module without recompiling Nginx?

Yes, you can compile your custom module as a dynamic module and load it at runtime without recompiling Nginx. To do this, use the --add-dynamic-module option during the configuration step and add the load_module directive to your Nginx configuration file.
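For example, the directive goes at the top level of nginx.conf (the module filename is illustrative):

```nginx
load_module modules/ngx_http_hello_module.so;
```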

4. How do I debug my custom module?

You can use a debugger like GDB (GNU Debugger) to debug your custom module. You'll need to compile Nginx with the -g flag for debugging symbols and start Nginx with the gdb command. Additionally, you can add ngx_log_debug() statements in your module's code to print debug messages in the Nginx error log.

5. Are there any resources for learning more about Nginx module development?

Yes, there are several resources available for learning more about Nginx module development. Some recommended resources include:

  • The official Nginx Development Guide
  • The book "Nginx HTTP Server" by Clément Nedelcu
  • Various open-source Nginx modules on GitHub


Why Make Your Own NGINX Modules? Theory and Practice

Vasiliy Soshnikov, Head of Development Group at Mail.Ru Group

Sometimes you have business goals which can be reached by developing your own modules for NGINX. NGINX modules can be business‑oriented and contain some business logic as well. However, how do you decide for certain that a module should be developed? How might NGINX help you with development?

In his session at NGINX Conf 2018, Vasiliy provides the detailed knowledge you need to build your own NGINX modules, including details about NGINX's core, its modular architecture, and guiding principles for NGINX code development. Using real‑world case studies and business scenarios, he answers the question, "Why and when do you need to develop your own modules?"

The session is quite technical. To get the most out of it, attendees need at least intermediate‑level experience with NGINX code.


Writing an Nginx Module

Introduction

Welcome to my Nginx module guide!

To follow this guide, you need to know a decent amount of C. You should know about structs, pointers, and functions. You also need to know how the nginx.conf file works.

If you find a mistake in the guide, please report it in an issue!

The Handler Guide

Let’s get started with a quick hello world module called ngx_http_hello_world_module.

This module will be a handler, meaning that it will take a request and generate output.

In the nginx source, create a folder called ngx_http_hello_world_module , and make two files in it: config and ngx_http_hello_world_module.c.

The Config File

The config file is just a simple shell script that is used at compile time to show Nginx where your module source is. As you can see, the config file tests whether your nginx version supports dynamic modules (the test -n line). If it does, the module is registered through the auto/module helper; otherwise, it is appended to the legacy HTTP_MODULES build variables.
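A typical config script for this module looks roughly like the following (reconstructed from the description above; the identifiers match the module name used in this guide):

```shell
ngx_addon_name=ngx_http_hello_world_module

if test -n "$ngx_module_link"; then
    # Dynamic-module-capable nginx: register via the auto/module helper
    ngx_module_type=HTTP
    ngx_module_name=ngx_http_hello_world_module
    ngx_module_srcs="$ngx_addon_dir/ngx_http_hello_world_module.c"
    . auto/module
else
    # Older nginx: append to the legacy build variables
    HTTP_MODULES="$HTTP_MODULES ngx_http_hello_world_module"
    NGX_ADDON_SRCS="$NGX_ADDON_SRCS $ngx_addon_dir/ngx_http_hello_world_module.c"
fi
```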

ngx_http_hello_world_module.c

This C file is huge! Let’s go through it line by line:
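For reference while reading the walkthrough, a sketch of the complete file is shown below. It is a reconstruction based on the descriptions that follow, compiles only inside the nginx build, and abbreviates error handling.

```c
#include <ngx_config.h>
#include <ngx_core.h>
#include <ngx_http.h>

/* Prototype for the directive callback; defined at the end of the file. */
static char *ngx_http_hello_world(ngx_conf_t *cf, ngx_command_t *cmd, void *conf);
static ngx_int_t ngx_http_hello_world_handler(ngx_http_request_t *r);

/* One directive, print_hello_world, valid in locations, no arguments. */
static ngx_command_t ngx_http_hello_world_commands[] = {
    { ngx_string("print_hello_world"),
      NGX_HTTP_LOC_CONF|NGX_CONF_NOARGS,
      ngx_http_hello_world,
      NGX_HTTP_LOC_CONF_OFFSET,
      0,
      NULL },
    ngx_null_command
};

/* Pre/postconfiguration and conf create/merge hooks; all unused here. */
static ngx_http_module_t ngx_http_hello_world_module_ctx = {
    NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL
};

ngx_module_t ngx_http_hello_world_module = {
    NGX_MODULE_V1,
    &ngx_http_hello_world_module_ctx,  /* module context */
    ngx_http_hello_world_commands,     /* module directives */
    NGX_HTTP_MODULE,                   /* module type */
    NULL, NULL, NULL, NULL,            /* init/exit callbacks (unused) */
    NULL, NULL, NULL,
    NGX_MODULE_V1_PADDING
};

static ngx_int_t
ngx_http_hello_world_handler(ngx_http_request_t *r)
{
    u_char      *ngx_hello_world = (u_char *) "Hello World!";
    size_t       sz = ngx_strlen(ngx_hello_world);
    ngx_buf_t   *b;
    ngx_chain_t  out;

    /* Send the headers. */
    r->headers_out.status = NGX_HTTP_OK;
    r->headers_out.content_length_n = sz;
    ngx_http_send_header(r);

    /* Build the single buffer that carries the body. */
    b = ngx_calloc_buf(r->pool);
    if (b == NULL) {
        return NGX_HTTP_INTERNAL_SERVER_ERROR;
    }

    b->pos = ngx_hello_world;        /* first byte of the content */
    b->last = ngx_hello_world + sz;  /* one byte past the end */
    b->memory = 1;                   /* content is read-only */
    b->last_buf = 1;                 /* last buffer in the response */

    out.buf = b;
    out.next = NULL;                 /* no further chain links */

    return ngx_http_output_filter(r, &out);
}

static char *
ngx_http_hello_world(ngx_conf_t *cf, ngx_command_t *cmd, void *conf)
{
    ngx_http_core_loc_conf_t *clcf;

    clcf = ngx_http_conf_get_module_loc_conf(cf, ngx_http_core_module);
    clcf->handler = ngx_http_hello_world_handler;

    return NGX_CONF_OK;
}
```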

The first line is a prototype for the function ngx_http_hello_world . We’ll define the function at the end of the file.

ngx_http_hello_world_commands is a static array of directives. In our module, we have only one directive: print_hello_world . It will have no arguments, so we put in NGX_CONF_NOARGS .

ngx_http_hello_world_module_ctx is a structure of function references. The functions will be executed for various purposes such as preconfiguration, postconfiguration, etc. We don’t need these hooks in our module, but we still have to define the structure and fill it with NULLs.

ngx_http_hello_world_module is the module definition. It tells Nginx where the array of directives and the context structure are ( ngx_http_hello_world_commands and ngx_http_hello_world_module_ctx ). We can also add init and exit callback functions. In our module, we don’t need them, so we put NULLs instead.

Now for the interesting part. ngx_http_hello_world_handler is the heart of our module. We want to print Hello World! on the screen, so we have an unsigned char * with our message in it. Right after that, there is another variable with the size of the message.

Next, we have to send the headers. Notice that ngx_http_hello_world_handler takes one argument of type ngx_http_request_t * . This is a custom struct made by Nginx. It has a member called headers_out , which we use to set the headers. After we are done setting the headers, we can send them with ngx_http_send_header(r) .

Now we have to send the body. ngx_buf_t is a buffer, and ngx_chain_t is a chain link. The chain links send responses buffer by buffer and point to the next link. In our module, there is no next link, so we set out.next to NULL . ngx_calloc_buf is one of Nginx’s pool-based allocation helpers (as is ngx_alloc_chain_link for dynamically allocated chain links), so the memory is freed automatically together with the request. b->pos and b->last help us send our content: b->pos is the first position of the content in memory and b->last points one byte past its end. b->memory is set to 1 because our content is read-only. b->last_buf tells Nginx that our buffer is the last buffer in the request.

Now that we’re done setting the body, we can send it with return ngx_http_output_filter(r, &out) .

Now we define the function we prototyped in the beginning. We show Nginx what our handler is called with clcf->handler = ngx_http_hello_world_handler .

And we’re done with our C file! Time to build the module.

Building the Module

How to build the module:

In the Nginx source, run ./configure (with --add-module pointing at your module directory), then make and make install . If you only want to build the modules and not the Nginx server itself, you can run make modules .
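For example (the module path is a placeholder):

```shell
./configure --add-module=$HOME/ngx_http_hello_world_module
make
sudo make install

# Or, to build only the modules without the server itself:
make modules
```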

Using the Module

To use the module, edit your nginx.conf file found in the conf directory in the install location.
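A minimal server block wiring up the module’s directive might look like this (the port and path match the URLs used below):

```nginx
server {
    listen 8000;

    location /test {
        print_hello_world;   # directive from our module
    }
}
```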

When you’re done, you can run nginx ( <nginx_install_location>/sbin/nginx ) and take a look at your work at localhost:8000/test . You should get a blank page saying Hello World! . If so, congratulations! You made your first Nginx module! This module is the base for making any handler.

Printing All the URL Arguments

Modified ngx_http_hello_world_handler:

Now, we’ll modify our module slightly to print all the URL arguments (everything after the ? ). So if our request is localhost:8000/test?foo=bar&hello=world we should get foo=bar&hello=world printed in the body.

We need to modify the handler, ngx_http_hello_world_handler . Notice that the string Hello World! was changed to r->args.data , and strlen(ngx_hello_world) was changed to r->args.len . r->args stores all the arguments and is of type ngx_str_t . An ngx_str_t has a data member and a len member, which store the string and its length.
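
Only a few lines of the handler change; as a sketch (same assumptions as the original handler, showing just the lines that differ):

```c
/* Sketch: respond with the raw query string instead of "Hello World!". */
r->headers_out.content_length_n = r->args.len;   /* was ngx_strlen(ngx_hello_world) */

b->pos = r->args.data;                  /* was ngx_hello_world */
b->last = r->args.data + r->args.len;   /* end of the query string */
```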

When you’re done, you should stop nginx ( <nginx_install_location>/sbin/nginx -s stop ) and build again. After that’s done, start Nginx again and go to localhost:8000/test?foo=hello&bar=world . You should see foo=hello&bar=world printed in the body.

Many Buffers

Get your hello world template again and add another buffer ( ngx_buf_t *b2 ) and another chain link ( ngx_chain_t out2; ). Then allocate some memory with ngx_calloc_buf and ngx_alloc_chain_link . Everything is the same as before, except that we set b->last_buf to 0 and out.next to &out2 . This is because our original buffer is no longer the last one, and the next link after out is out2 ; the second buffer's last_buf is set to 1 instead.
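
The chaining described above, as a sketch (again assuming the nginx build environment and the variables from the template):

```c
/* Sketch: two buffers chained together. */
b  = ngx_calloc_buf(r->pool);
b2 = ngx_calloc_buf(r->pool);

out.buf  = b;
out.next = &out2;     /* the first link now points at the second */

out2.buf  = b2;
out2.next = NULL;     /* the second link ends the chain */

b->last_buf  = 0;     /* no longer the last buffer */
b2->last_buf = 1;     /* the second buffer ends the response */
```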

Now, when we send the body, we still pass only the original chain link ( &out ) to ngx_http_output_filter , because out links to the next buffer.

Rebuild, restart, and go to localhost:8000/test . You should see Hello World!Hello World! .

The Filter Guide

After a handler is loaded and run, all the filter modules are executed. Filters take the header and/or body, manipulate them, and then send them back.

Our module will add a music track to all web pages where the module is loaded.

Nothing is very different here from the handler's config file, except that HTTP is replaced with HTTP_FILTER .

Let’s see how this is different from our handler. First we see our u_char : An HTML <audio> element with a song (from Wikipedia).

Next, instead of prototyping ngx_http_background_music , we define it right away. Normally, this function would be more interesting, but since this is just a demo module, we don’t need to do anything other than return NGX_OK .

The filter is made of two parts: the header filter and the body filter. In our header filter, all we have to do is add the length of our background_music string to the content length. After that, we pass the baton to the next header filter by calling ngx_http_next_header_filter .

The body filter takes two arguments: an ngx_http_request_t and a chain link, ngx_chain_t . The chain link comes from the handler and any previous filters. What we want to do is prefix our audio element to the chain link in . It won’t be perfectly valid HTML, but it’s good enough for now.

Our buffer should have last_buf set to 0 because it isn’t the last buffer: the last buffer is in the in chain link. So we just set link->next to in and call the next body filter.
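
Both filter functions, sketched against the Nginx C API (this compiles only inside the Nginx build, and assumes the module-level static u_char background_music[] holding the <audio> element plus the usual saved ngx_http_next_header_filter / ngx_http_next_body_filter pointers):

```c
static ngx_int_t
ngx_http_background_music_header_filter(ngx_http_request_t *r)
{
    /* Account for the bytes we are about to prepend. */
    r->headers_out.content_length_n += ngx_strlen(background_music);
    return ngx_http_next_header_filter(r);
}

static ngx_int_t
ngx_http_background_music_body_filter(ngx_http_request_t *r, ngx_chain_t *in)
{
    ngx_buf_t   *b;
    ngx_chain_t *link;

    b = ngx_calloc_buf(r->pool);
    link = ngx_alloc_chain_link(r->pool);

    b->pos = background_music;
    b->last = background_music + ngx_strlen(background_music);
    b->memory = 1;
    b->last_buf = 0;     /* the real last buffer lives in the `in` chain */

    link->buf = b;
    link->next = in;     /* prefix our buffer to the incoming chain */

    return ngx_http_next_body_filter(r, link);
}
```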

The ngx_http_background_music_init function tells Nginx what our filter functions are called and installs them at the top of the filter chain.

And we’re done. Now build and reload nginx, and you should see… a 404 page? Yes, there will be a 404 page, but with an audio track above it.

You got a 404 page because no other handler was configured for the location. If you serve an HTML file from it, the filter will add the audio track to that page instead.

ngx_http_request_t

Example usage of the server variable:

This table lists some useful members of ngx_http_request_t

ngx_str_t usage:

The ngx_str_t datatype has two members: data and len . They give you access to the contents of the string and its length.
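
To make the data / len pair concrete, here is a tiny standalone stand-in that mimics the shape of ngx_str_t . This is a plain illustrative struct, not Nginx's actual header; the names my_str_t , MY_STRING , and my_str_len are mine:

```c
#include <stddef.h>

/* Illustrative stand-in for nginx's ngx_str_t: a length plus a byte pointer. */
typedef struct {
    size_t         len;
    unsigned char *data;
} my_str_t;

/* Initializer mirroring nginx's ngx_string() macro: length excludes the NUL. */
#define MY_STRING(s) { sizeof(s) - 1, (unsigned char *) s }

/* Accessor mirroring the str.len usage described above. */
static size_t my_str_len(my_str_t s) { return s.len; }
```

Note that data is not guaranteed to be NUL-terminated in Nginx, which is exactly why len must always travel with it.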

Other Stuff

Troubleshooting

1. Only part of my text is showing up.

You have to correctly set the size in both the headers and the buffer. If your string is a u_char* , use ngx_strlen ; if it’s an ngx_str_t , use {variable_name}.len .

2. Some weird string is showing up after my text.

The length you set is probably larger than your actual content, so Nginx is sending whatever sits in memory past the end of your string. Double-check the length in both the headers and the buffer.

3. Why is my filter not working (but compiling)?

Make sure you’ve configured it correctly. The config file should have HTTP_FILTER instead of HTTP . Also, check if you’ve sent the body correctly, and make sure your buffer is in the chain link.

Useful Links

  • Emiller’s Guide
  • Nginx Main Module API
  • Nginx Memory Management API
  • Example Module: The EightC Module

The NGINX Handbook – Learn NGINX for Beginners

Farhan Hasin Chowdhury

A young Russian developer named Igor Sysoev was frustrated by older web servers' inability to handle more than ten thousand concurrent connections. This is a problem referred to as the C10k problem . As an answer to this, he started working on a new web server back in 2002.

NGINX was first released to the public in 2004 under the terms of the 2-clause BSD license. According to the March 2021 Web Server Survey , NGINX holds 35.3% of the market with a total of 419.6 million sites.

Thanks to tools like NGINXConfig by DigitalOcean and an abundance of pre-written configuration files on the internet, people tend to do a lot of copy-pasting instead of trying to understand when it comes to configuring NGINX.

I'm not saying that copying code is bad, but copying code without understanding is a big "no no".

Also, NGINX is the kind of software that should be configured exactly according to the requirements of the application to be served and the available resources on the host.

That's why instead of copying blindly, you should understand and then fine tune what you're copying – and that's where this handbook comes in.

After going through the entire book, you should be able to:

  • Understand configuration files generated by popular tools as well as those found in various documentation.
  • Configure NGINX as a web server, a reverse proxy server, and a load balancer from scratch.
  • Optimize NGINX to get maximum performance out of your server.

Prerequisites

  • Familiarity with the Linux terminal and common Unix programs such as ls , cat , ps , grep , find , nproc , ulimit and nano .
  • A computer powerful enough to run a virtual machine or a $5 virtual private server.
  • Understanding of web applications and a programming language such as JavaScript or PHP.

Table of Contents

  • Introduction to NGINX
  • How to Provision a Local Virtual Machine
  • How to Provision a Virtual Private Server
  • How to Install NGINX on a Provisioned Server or Virtual Machine
  • Introduction to NGINX's Configuration Files
  • How to Write Your First Configuration File
  • How to Validate and Reload Configuration Files
  • How to Understand Directives and Contexts in NGINX
  • How to Serve Static Content Using NGINX
  • Static File Type Handling in NGINX
  • How to Include Partial Config Files
  • Location Matches
  • Variables in NGINX
  • Redirects and Rewrites
  • How to Try for Multiple Files
  • Logging in NGINX
  • Node.js with NGINX
  • PHP with NGINX
  • How to Use NGINX as a Load Balancer
  • How to Configure Worker Processes and Worker Connections
  • How to Cache Static Content
  • How to Compress Responses
  • How to Understand the Main Configuration File
  • How to Configure SSL
  • How to Enable HTTP/2
  • How to Enable Server Push

Project Code

You can find the code for the example projects in the following repository:

NGINX is a high performance web server developed to facilitate the increasing needs of the modern web. It focuses on high performance, high concurrency, and low resource usage. Although it's mostly known as a web server, NGINX at its core is a reverse proxy server.

NGINX is not the only web server on the market, though. One of its biggest competitors is Apache HTTP Server (httpd) , first released back in 1995. In spite of the fact that Apache HTTP Server is more flexible, server admins often prefer NGINX for two main reasons:

  • It can handle a higher number of concurrent requests.
  • It has faster static content delivery with low resource usage.

I won't go further into the whole Apache vs NGINX debate. But if you wish to learn more about the differences between them in detail, this excellent article from Justin Ellingwood may help.

In fact, to explain NGINX's request handling technique, I would like to quote from Justin's article here:

Nginx came onto the scene after Apache, with more awareness of the concurrency problems that would face sites at scale. Leveraging this knowledge, Nginx was designed from the ground up to use an asynchronous, non-blocking, event-driven connection handling algorithm. Nginx spawns worker processes, each of which can handle thousands of connections. The worker processes accomplish this by implementing a fast looping mechanism that continuously checks for and processes events. Decoupling actual work from connections allows each worker to concern itself with a connection only when a new event has been triggered.

If that seems a bit complicated to understand, don't worry. Having a basic understanding of the inner workings will suffice for now.

NGINX is faster in static content delivery while staying relatively lighter on resources because it doesn't embed a dynamic programming language processor. When a request for static content comes, NGINX simply responds with the file without running any additional processes.

That doesn't mean that NGINX can't handle requests that require a dynamic programming language processor. In such cases, NGINX simply delegates the tasks to separate processes such as PHP-FPM , Node.js or Python . Then, once that process finishes its work, NGINX reverse proxies the response back to the client.

NGINX is also a lot easier to configure thanks to a configuration file syntax inspired from various scripting languages that results in compact, easily maintainable configuration files.

How to Install NGINX

Installing NGINX on a Linux -based system is pretty straightforward. You can either use a virtual private server running Ubuntu as your playground, or you can provision a virtual machine on your local system using Vagrant.

For the most part, provisioning a local virtual machine will suffice and that's the way I'll be using in this article.

For those who don't know, Vagrant is an open-source tool by HashiCorp that allows you to provision virtual machines using simple configuration files.

For this approach to work, you'll need VirtualBox and Vagrant , so go ahead and install them first. If you need a little warm up on the topic, this tutorial may help.

Create a working directory somewhere in your system with a sensible name. Mine is the ~/vagrant/nginx-handbook directory.

Inside the working directory, create a file named Vagrantfile and put the following content in there:
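
A Vagrantfile along these lines would match the setup described in this article. The box name and resource sizes here are my assumptions; the IP matches the 192.168.20.20 address used later in the text:

```ruby
Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/focal64"            # Ubuntu 20.04
  config.vm.hostname = "nginx-handbook-box"
  config.vm.network "private_network", ip: "192.168.20.20"

  config.vm.provider "virtualbox" do |vb|
    vb.name = "nginx-handbook-box"
    vb.memory = 1024
    vb.cpus = 1
  end
end
```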

This Vagrantfile is the configuration file I talked about earlier. It contains information like name of the virtual machine, number of CPUs, size of RAM, the IP address, and more.

To start a virtual machine using this configuration, open your terminal inside the working directory and execute the following command:
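
The command, along with a quick status check afterwards:

```shell
vagrant up

# Check that the machine is running
vagrant status
```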

The output of the vagrant up command may differ on your system, but as long as vagrant status says the machine is running, you're good to go.

Given that the virtual machine is now running, you should be able to SSH into it. To do so, execute the following command:
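
With a single machine defined in the Vagrantfile, this is simply:

```shell
vagrant ssh
```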

If everything's done correctly you should be logged into your virtual machine, which will be evident by the vagrant@nginx-handbook-box line on your terminal.

This virtual machine will be accessible on http://192.168.20.20 on your local machine. You can even assign a custom domain like http://nginx-handbook.test to the virtual machine by adding an entry to your hosts file:

Now append the following line at the end of the file:
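
The entry to append (the hosts file is /etc/hosts on Linux and macOS):

```
192.168.20.20   nginx-handbook.test
```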

Now you should be able to access the virtual machine on http://nginx-handbook.test URI in your browser.

You can stop or destroy the virtual machine by executing the following commands inside the working directory:
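
The two commands in question:

```shell
# Stop the machine, preserving its state
vagrant halt

# Or delete the machine entirely
vagrant destroy
```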

If you want to learn about more Vagrant commands, this cheat sheet may come in handy.

Now that you have a functioning Ubuntu virtual machine on your system, all that is left to do is install NGINX .

For this demonstration, I'll use Vultr as my provider but you may use DigitalOcean or whatever provider you like.

Assuming you already have an account with your provider, log into the account and deploy a new server:

On DigitalOcean, it's usually called a droplet. On the next screen, choose a location close to you. I live in Bangladesh which is why I've chosen Singapore:

On the next step, you'll have to choose the operating system and server size. Choose Ubuntu 20.04 and the smallest possible server size:

Although production servers tend to be much bigger and more powerful than this, a tiny server will be more than enough for this article.

Finally, for the last step, put something fitting like nginx-handbook-demo-server as the server host and label. You can even leave them empty if you want.

Once you're happy with your choices, go ahead and press the Deploy Now button.

The deployment process may take some time to finish, but once it's done, you'll see the newly created server on your dashboard:

Also pay attention to the Status – it should say Running and not Preparing or Stopped . To connect to the server, you'll need a username and password.

Go into the overview page for your server and there you should see the server's IP address, username, and password:

The generic command for logging into a server using SSH is as follows:
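
In its generic form (fill in the values from your own server's overview page):

```shell
ssh <username>@<server-ip-address>
```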

So in the case of my server, it'll be:

You'll be asked if you want to continue connecting to this server or not. Answer with yes and then you'll be asked for the password. Copy the password from the server overview page and paste that into your terminal.

If you do everything correctly you should be logged into your server – you'll see the root@localhost line on your terminal. Here localhost is the server host name, and may differ in your case.

You can access this server directly by its IP address. Or if you own any custom domain, you can use that also.

Throughout the article you'll see me adding test domains to my operating system's hosts file. In the case of a real server, you'll have to configure those domains through your DNS provider.

Remember that you'll be charged as long as this server exists. Although the charge should be very small, I'm warning you anyway. You can destroy the server anytime you want by hitting the trash icon on the server overview page:

If you own a custom domain name, you may assign a sub-domain to this server. Now that you're inside the server, all that is left to do is install NGINX .

Assuming you're logged into your server or virtual machine, the first thing you should do is perform an update. Execute the following command to do so:

After the update, install NGINX by executing the following command:

Once the installation is done, NGINX should be automatically registered as a systemd service and should be running. To check, execute the following command:

If the status says running , then you're good to go. Otherwise you may start the service by executing this command:
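
The four commands referenced above, in order (standard apt and systemd invocations on Ubuntu):

```shell
# Update the package lists and upgrade installed packages
sudo apt update && sudo apt upgrade -y

# Install NGINX
sudo apt install nginx -y

# Check that the service is registered and running
sudo systemctl status nginx

# Start it manually if it isn't
sudo systemctl start nginx
```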

Finally for a visual verification that everything is working properly, visit your server/virtual machine with your favorite browser and you should see NGINX's default welcome page:

NGINX is usually installed on the /etc/nginx directory and the majority of our work in the upcoming sections will be done in here.

Congratulations! Now you have NGINX up and running on your server/virtual machine. It's time to jump head first into NGINX.

As a web server, NGINX's job is to serve static or dynamic content to the clients. But how that content is going to be served is usually controlled by configuration files.

NGINX's configuration files end with the .conf extension and usually live inside the /etc/nginx/ directory. Let's begin by cd ing into this directory and getting a list of all the files:

Among these files, there should be one named nginx.conf . This is the main configuration file for NGINX. You can have a look at the content of this file using the cat program:

Whoa! That's a lot of stuff. Trying to understand this file in its current state will be a nightmare. So let's rename the file and create a new empty one:
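
A sketch of those two steps (the .backup suffix is my choice; any name will do):

```shell
# Keep the original file around as a reference
sudo mv /etc/nginx/nginx.conf /etc/nginx/nginx.conf.backup

# Create a new, empty configuration file
sudo touch /etc/nginx/nginx.conf
```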

I highly discourage you from editing the original nginx.conf file unless you absolutely know what you're doing. For learning purposes, you may rename it, but later on, I'll show you how you should go about configuring a server in a real-life scenario.

How to Configure a Basic Web Server

In this section of the book, you'll finally get your hands dirty by configuring a basic static web server from the ground up. The goal of this section is to introduce you to the syntax and fundamental concepts of NGINX configuration files.

Start by opening the newly created nginx.conf file using the nano text editor:

Throughout the book, I'll be using nano as my text editor. You may use something more modern if you want to, but in a real life scenario, you're most likely to work using nano or vim on servers instead of anything else. So use this book as an opportunity to sharpen your nano skills. Also the official cheat sheet is there for you to consult whenever you need.

After opening the file, update its content to look like this:
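
A configuration along these lines fits the description that follows (the server_name is the test domain set up earlier; treat this as a sketch):

```nginx
events {

}

http {
    server {
        listen 80;
        server_name nginx-handbook.test;

        return 200 "Bonjour, mon ami!\n";
    }
}
```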

If you have experience building REST APIs then you may guess from the return 200 "Bonjour, mon ami!\n"; line that the server has been configured to respond with a status code of 200 and the message "Bonjour, mon ami!".

Don't worry if you don't understand anything more than that at the moment. I'll explain this file line by line, but first let's see this configuration in action.

After writing a new configuration file or updating an old one, the first thing to do is check the file for any syntax mistakes. The nginx binary includes an option -t to do just that.
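
On most installs the binary is on the PATH, so the check is simply:

```shell
sudo nginx -t

# nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
# nginx: configuration file /etc/nginx/nginx.conf test is successful
```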

If you have any syntax errors, this command will let you know about them, including the line number.

Although the configuration file is fine, NGINX will not use it. The way NGINX works is it reads the configuration file once and keeps working based on that.

If you update the configuration file, then you'll have to instruct NGINX explicitly to reload the configuration file. There are two ways to do that.

  • You can restart the NGINX service by executing the sudo systemctl restart nginx command.
  • You can dispatch a reload signal to NGINX by executing the sudo nginx -s reload command.

The -s option is used for dispatching various signals to NGINX. The available signals are stop , quit , reload and reopen . Among the two ways I just mentioned, I prefer the second one simply because it's less typing.

Once you've reloaded the configuration file by executing the nginx -s reload command, you can see it in action by sending a simple get request to the server:
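
Using curl with the -i flag to include the response headers (assuming the nginx-handbook.test hosts entry from earlier):

```shell
curl -i http://nginx-handbook.test
```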

The server is responding with a status code of 200 and the expected message. Congratulations on getting this far! Now it's time for some explanation.

The few lines of code you've written here, although seemingly simple, introduce two of the most important terminologies of NGINX configuration files. They are directives and contexts .

Technically, everything inside an NGINX configuration file is a directive . Directives are of two types:

  • Simple Directives
  • Block Directives

A simple directive consists of the directive name and space-delimited parameters, like listen , return and others. Simple directives are terminated by semicolons.

Block directives are similar to simple directives, except that instead of ending with semicolons, they end with a pair of curly braces { } enclosing additional instructions.

A block directive capable of containing other directives inside it is called a context, such as events , http and so on. There are four core contexts in NGINX:

  • events { } – The events context is used for setting global configuration regarding how NGINX is going to handle requests on a general level. There can be only one events context in a valid configuration file.
  • http { } – Evident by the name, http context is used for defining configuration regarding how the server is going to handle HTTP and HTTPS requests, specifically. There can be only one http context in a valid configuration file.
  • server { } – The server context is nested inside the http context and used for configuring specific virtual servers within a single host. There can be multiple server contexts in a valid configuration file nested inside the http context. Each server context is considered a virtual host.
  • main – The main context is the configuration file itself. Anything written outside of the three previously mentioned contexts is in the main context.

You can treat contexts in NGINX like scopes in other programming languages. There is also a sense of inheritance among them. You can find an alphabetical index of directives on the official NGINX docs.

I've already mentioned that there can be multiple server contexts within a configuration file. But when a request reaches the server, how does NGINX know which one of those contexts should handle the request?

The listen directive is one of the ways to identify the correct server context within a configuration. Consider the following scenario:
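
A configuration demonstrating this scenario might be (a sketch reconstructed from the responses described below):

```nginx
events {

}

http {
    server {
        listen 80;
        server_name nginx-handbook.test;

        return 200 "hello from port 80!\n";
    }

    server {
        listen 8080;
        server_name nginx-handbook.test;

        return 200 "hello from port 8080!\n";
    }
}
```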

Now if you send a request to http://nginx-handbook.test:80 then you'll receive "hello from port 80!" as a response. And if you send a request to http://nginx-handbook.test:8080, you'll receive "hello from port 8080!" as a response:

These two server blocks are like two people holding telephone receivers, waiting to respond when a request reaches one of their numbers. Their numbers are indicated by the listen directives.

Apart from the listen directive, there is also the server_name directive. Consider the following scenario of an imaginary library management application:
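
A sketch of that scenario, with both virtual hosts listening on the same port but carrying different server names:

```nginx
events {

}

http {
    server {
        listen 80;
        server_name library.test;

        return 200 "your local library!\n";
    }

    server {
        listen 80;
        server_name librarian.library.test;

        return 200 "welcome dear librarian!\n";
    }
}
```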

This is a basic example of the idea of virtual hosts. You're running two separate applications under different server names in the same server.

If you send a request to http://library.test then you'll get "your local library!" as a response. If you send a request to http://librarian.library.test, you'll get "welcome dear librarian!" as a response.

To make this demo work on your system, you'll have to update your hosts file to include these two domain names as well:
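
Assuming the virtual machine IP from earlier (192.168.20.20), the entries would be:

```
192.168.20.20   library.test
192.168.20.20   librarian.library.test
```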

Finally, the return directive is responsible for returning a valid response to the user. This directive takes two parameters: the status code and the string message to be returned.

Now that you have a good understanding of how to write a basic configuration file for NGINX, let's upgrade the configuration to serve static files instead of plain text responses.

In order to serve static content, you first have to store it somewhere on your server. If you list the files and directories at the root of your server using ls , you'll find a directory called /srv in there:

This /srv directory is meant to contain site-specific data which is served by this system. Now cd into this directory and clone the code repository that comes with this book:

Inside the nginx-handbook-projects directory there should be a directory called static-demo containing four files in total:

Now that you have the static content to be served, update your configuration as follows:
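
The updated configuration, as a sketch consistent with the description below:

```nginx
events {

}

http {
    server {
        listen 80;
        server_name nginx-handbook.test;

        root /srv/nginx-handbook-projects/static-demo;
    }
}
```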

The code is almost the same, except the return directive has now been replaced by a root directive. This directive is used for declaring the root directory for a site.

By writing root /srv/nginx-handbook-projects/static-demo you're telling NGINX to look for files to serve inside the /srv/nginx-handbook-projects/static-demo directory if any request comes to this server. Since NGINX is a web server, it is smart enough to serve the index.html file by default.

Let's see if this works or not. Test and reload the updated configuration file and visit the server. You should be greeted with a somewhat broken HTML site:

Although NGINX has served the index.html file correctly, judging by the look of the three navigation links, it seems like the CSS code is not working.

You may think that there is something wrong in the CSS file. But in reality, the problem is in the configuration file.

To debug the issue you're facing right now, send a request for the CSS file to the server:

Pay attention to the Content-Type and see how it says text/plain and not text/css . This means that NGINX is serving this file as plain text instead of as a stylesheet.

Although NGINX is smart enough to find the index.html file by default, it's pretty dumb when it comes to interpreting file types. To solve this problem update your configuration once again:
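
A sketch of the configuration with a types context covering the two file types this demo needs:

```nginx
events {

}

http {
    types {
        text/html html;
        text/css css;
    }

    server {
        listen 80;
        server_name nginx-handbook.test;

        root /srv/nginx-handbook-projects/static-demo;
    }
}
```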

The only change we've made to the code is a new types context nested inside the http block. As you may have already guessed from the name, this context is used for configuring file types.

By writing text/html html in this context you're telling NGINX to parse any file as text/html that ends with the html extension.

You may think that configuring the CSS file type should suffice as the HTML is being parsed just fine – but no.

If you introduce a types context in the configuration, NGINX becomes even dumber and only parses the files configured by you. So if you only define the text/css css in this context then NGINX will start parsing the HTML file as plain text.

Validate and reload the newly updated config file and visit the server once again. Send a request for the CSS file once again, and this time the file should be parsed as a text/css file:

Visit the server for a visual verification, and the site should look better this time:

If you've updated and reloaded the configuration file correctly and you're still seeing the old site, perform a hard refresh.

Mapping file types within the types context may work for small projects, but for bigger projects it can be cumbersome and error-prone.

NGINX provides a solution for this problem. If you list the files inside the /etc/nginx directory once again, you'll see a file named mime.types .

Let's have a look at the content of this file:

The file contains a long list of file types and their extensions. To use this file inside your configuration file, update your configuration to look as follows:
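
With the include directive, the sketch from before becomes:

```nginx
events {

}

http {
    include /etc/nginx/mime.types;

    server {
        listen 80;
        server_name nginx-handbook.test;

        root /srv/nginx-handbook-projects/static-demo;
    }
}
```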

The old types context has now been replaced with a new include directive. Like the name suggests, this directive allows you to include content from other configuration files.

Validate and reload the configuration file and send a request for the mini.min.css file once again:

In the section below on how to understand the main configuration file, I'll demonstrate how include can be used to modularize your virtual server configurations.

Dynamic Routing in NGINX

The configuration you wrote in the previous section was a very simple static content server configuration. All it did was match a file from the site root corresponding to the URI the client visits and respond back.

So if the client requests files existing on the root such as index.html , about.html or mini.min.css NGINX will return the file. But if you visit a route such as http://nginx-handbook.test/nothing, it'll respond with the default 404 page:

In this section of the book, you'll learn about the location context, variables, redirects, rewrites and the try_files directive. There will be no new projects in this section but the concepts you learn here will be necessary in the upcoming sections.

Also the configuration will change very frequently in this section, so do not forget to validate and reload the configuration file after every update.

The first concept we'll discuss in this section is the location context. Update the configuration as follows:
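
A sketch of that update; the exact response text is my stand-in for the character list described below:

```nginx
events {

}

http {
    server {
        listen 80;
        server_name nginx-handbook.test;

        location /agatha {
            return 200 "Miss Marple.\nHercule Poirot.\n";
        }
    }
}
```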

We've replaced the root directive with a new location context. This context is usually nested inside server blocks. There can be multiple location contexts within a server context.

If you send a request to http://nginx-handbook.test/agatha, you'll get a 200 response code and list of characters created by Agatha Christie .

Now if you send a request to http://nginx-handbook.test/agatha-christie, you'll get the same response:

This happens because, by writing location /agatha , you're telling NGINX to match any URI starting with /agatha . This kind of match is called a prefix match .

To perform an exact match , you'll have to update the code as follows:

Adding an = sign before the location URI will instruct NGINX to respond only if the URL matches exactly. Now if you send a request to anything but /agatha , you'll get a 404 response.

Another kind of match in NGINX is the regex match . Using this match you can check location URLs against complex regular expressions.

By replacing the previously used = sign with a ~ sign, you're telling NGINX to perform a regular expression match. Setting the location to ~ /agatha[0-9] means NGINX will only respond if there is a number after the word "agatha":

A regex match is by default case sensitive, which means that if you capitalize any of the letters, the location won't work:

To make the match case insensitive, you'll have to add a * after the ~ sign.

That will tell NGINX to let go of case sensitivity and match the location anyway.

NGINX assigns priority values to these matches, and a regex match has more priority than a prefix match.

Now if you send a request to http://nginx-handbook.test/Agatha8, you'll get the following response:

But this priority can be changed a little. The final type of match in NGINX is a preferential prefix match . To turn a prefix match into a preferential one, you need to include the ^~ modifier before the location URI:

This time, the prefix match wins. So the list of all the match types in descending order of priority is as follows:

1. Exact match ( = )
2. Preferential prefix match ( ^~ )
3. Regex match ( ~ or ~* )
4. Prefix match

Variables in NGINX are similar to variables in other programming languages. The set directive can be used to declare new variables anywhere within the configuration file:
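
For example (the variable names here are hypothetical, chosen only to show the syntax):

```nginx
server {
    listen 80;
    server_name nginx-handbook.test;

    set $name "Farhan";   # a string
    set $age 25;          # an integer
    set $is_admin true;   # a boolean
}
```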

Variables can be of three types: strings, integers, and booleans.

Apart from the variables you declare, there are embedded variables within NGINX modules. An alphabetical index of variables is available in the official documentation.

To see some of the variables in action, update the configuration as follows:
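
A sketch of that update, echoing three embedded variables back in the response:

```nginx
events {

}

http {
    server {
        listen 80;
        server_name nginx-handbook.test;

        return 200 "Host - $host\nURI - $uri\nArgs - $args\n";
    }
}
```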

Now upon sending a request to the server, you should get a response as follows:

As you can see, the $host and $uri variables hold the root address and the requested URI relative to the root, respectively. The $args variable, as you can see, contains all the query strings.

Instead of printing the literal string form of the query strings, you can access the individual values using the $arg_<name> variables, such as $arg_name for a name query string.

Now the response from the server should look like as follows:

The variables I demonstrated here are embedded in the ngx_http_core_module . For a variable to be accessible in the configuration, NGINX has to be built with the module embedding the variable. Building NGINX from source and usage of dynamic modules is slightly out of scope for this article. But I'll surely write about that in my blog.

A redirect in NGINX is the same as a redirect in any other platform. To demonstrate how redirects work, update your configuration to look like this:
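
A sketch matching the 307 redirect described below, built on top of the static-demo configuration:

```nginx
events {

}

http {
    include /etc/nginx/mime.types;

    server {
        listen 80;
        server_name nginx-handbook.test;

        root /srv/nginx-handbook-projects/static-demo;

        location = /about_page {
            return 307 /about.html;
        }
    }
}
```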

Now if you send a request to http://nginx-handbook.test/about_page, you'll be redirected to http://nginx-handbook.test/about.html:

As you can see, the server responded with a status code of 307 and the location indicates http://nginx-handbook.test/about.html. If you visit http://nginx-handbook.test/about_page from a browser, you'll see that the URL will automatically change to http://nginx-handbook.test/about.html.

A rewrite directive, however, works a little differently. It changes the URI internally, without letting the user know. To see it in action, update your configuration as follows:
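
The same scenario with a rewrite instead of a return, as a sketch:

```nginx
events {

}

http {
    include /etc/nginx/mime.types;

    server {
        listen 80;
        server_name nginx-handbook.test;

        root /srv/nginx-handbook-projects/static-demo;

        rewrite /about_page /about.html;
    }
}
```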

Now if you send a request to the http://nginx-handbook.test/about_page URI, you'll get a 200 response code and the HTML code of the about.html file in response:

And if you visit the URI using a browser, you'll see the about.html page while the URL remains unchanged:

Apart from the way the URI change is handled, there is another difference between a redirect and rewrite. When a rewrite happens, the server context gets re-evaluated by NGINX. So, a rewrite is a more expensive operation than a redirect.

The final concept I'll be showing in this section is the try_files directive. Instead of responding with a single file, the try_files directive lets you check for the existence of multiple files.
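
A sketch of the configuration this section describes, using the directive and fallback message quoted below:

```nginx
events {

}

http {
    include /etc/nginx/mime.types;

    server {
        listen 80;
        server_name nginx-handbook.test;

        root /srv/nginx-handbook-projects/static-demo;

        try_files /the-nginx-handbook.jpg /not_found;

        location /not_found {
            return 404 "sadly, you've hit a brick wall buddy!\n";
        }
    }
}
```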

As you can see, a new try_files directive has been added. By writing try_files /the-nginx-handbook.jpg /not_found; you're instructing NGINX to look for a file named the-nginx-handbook.jpg on the root whenever a request is received. If it doesn't exist, go to the /not_found location.

So now if you visit the server, you'll see the image:


But if you update the configuration to try for a non-existent file such as blackhole.jpg, you'll get a 404 response with the message "sadly, you've hit a brick wall buddy!".

Now the problem with writing a try_files directive this way is that no matter what URL you visit, as long as a request is received by the server and the the-nginx-handbook.jpg file is found on the disk, NGINX will send that back.


And that's why try_files is often used with the $uri NGINX variable.

By writing try_files $uri /not_found; you're instructing NGINX to try for the URI requested by the client first. If it doesn't find that one, then try the next one.

So now if you visit http://nginx-handbook.test/index.html you should get the old index.html page. The same goes for the about.html page:


But if you request a file that doesn't exist, you'll get the response from the /not_found location:

One thing that you may have already noticed is that if you visit the server root http://nginx-handbook.test, you get the 404 response.

This is because when you're hitting the server root, the $uri variable doesn't correspond to any existing file so NGINX serves you the fallback location. If you want to fix this issue, update your configuration as follows:
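A sketch of the updated configuration (same assumed server name, root, and fallback as before) might be:

```nginx
server {
    listen 80;
    server_name nginx-handbook.test;

    root /srv/nginx-handbook-projects/static-demo;

    # try the URI, then the URI as a directory, then fall back
    try_files $uri $uri/ /not_found;

    location /not_found {
        return 404 "sadly, you've hit a brick wall buddy!";
    }
}
```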

By writing try_files $uri $uri/ /not_found; you're instructing NGINX to try the requested URI first. If that doesn't work, then try the requested URI as a directory; whenever NGINX ends up in a directory, it automatically starts looking for an index.html file.

Now if you visit the server, you should get the index.html file just right:

The try_files directive can be used in a number of variations. You'll encounter a few of them in the upcoming sections, but I'd suggest that you also do some research on the different uses of this directive by yourself.

By default, NGINX's log files are located inside /var/log/nginx . If you list the content of this directory, you may see something as follows:

Let's begin by emptying the two files.

If you do not dispatch a reopen signal to NGINX, it'll keep writing logs to the previously open streams and the new files will remain empty.
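Assuming the default log paths mentioned above, the whole dance looks something like this:

```shell
# empty the two default log files
sudo truncate -s 0 /var/log/nginx/access.log
sudo truncate -s 0 /var/log/nginx/error.log

# signal NGINX to reopen its log streams
sudo nginx -s reopen
```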

Now to make an entry in the access log, send a request to the server.

As you can see, a new entry has been added to the access.log file. Any request to the server will be logged to this file by default. But we can change this behavior using the access_log directive.
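A sketch of such a configuration (location names and messages assumed from the surrounding text) might look like:

```nginx
server {
    listen 80;
    server_name nginx-handbook.test;

    location /admin {
        # write access logs for this URI to a separate file
        access_log /var/log/nginx/admin.log;
        return 200 "this will be logged in a separate file.";
    }

    location /no_logging {
        # turn off access logging for this location completely
        access_log off;
        return 200 "this will not be logged at all.";
    }
}
```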

The first access_log directive inside the /admin location block instructs NGINX to write any access log of this URI to the /var/log/nginx/admin.log file. The second one inside the /no_logging location turns off access logs for this location completely.

Validate and reload the configuration. Now if you send requests to these locations and inspect the log files, you should see something like this:

The error.log file, on the other hand, holds the failure logs. To make an entry to the error.log, you'll have to make NGINX crash. To do so, update your configuration as follows:
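One way to provoke such a failure is a deliberately invalid directive, for example:

```nginx
server {
    listen 80;
    server_name nginx-handbook.test;

    # return accepts at most two parameters –
    # the third one below makes the configuration invalid
    return 200 "..." "...";
}
```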

As you know, the return directive takes only two parameters – but we've given three here. Now try reloading the configuration and you'll be presented with an error message:

Check the content of the error log and the message should be present there as well:

Error messages have levels. A notice entry in the error log is harmless, but an emerg or emergency entry has to be addressed right away.

There are eight levels of error messages:

  • debug – Useful debugging information to help determine where the problem lies.
  • info – Informational messages that aren't necessary to read but may be good to know.
  • notice – Something normal happened that is worth noting.
  • warn – Something unexpected happened, however is not a cause for concern.
  • error – Something was unsuccessful.
  • crit – There are problems that need to be critically addressed.
  • alert – Prompt action is required.
  • emerg – The system is in an unusable state and requires immediate attention.

By default, NGINX logs messages of all levels. You can override this behavior using the error_log directive. If you want to set the minimum level of a message to warn, then update your configuration file as follows:
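A sketch of that configuration (default log path assumed, invalid return kept on purpose so a fresh emerg entry is generated on reload):

```nginx
server {
    listen 80;
    server_name nginx-handbook.test;

    # log only messages with a severity of warn or above
    error_log /var/log/nginx/error.log warn;

    # still invalid on purpose – reloading will log an emerg entry
    return 200 "..." "...";
}
```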

Validate and reload the configuration, and from now on only messages with a level of warn or above will be logged.

Unlike the previous output, there are no notice entries here. emerg is a higher level error than warn and that's why it has been logged.

For most projects, leaving the error log configuration as it is should be fine. My only suggestion is to set the minimum error level to warn. This way you won't have to look at unnecessary entries in the error log.

But if you want to learn more about customizing logging in NGINX, this link to the official docs may help.

How to Use NGINX as a Reverse Proxy

When configured as a reverse proxy, NGINX sits between the client and a back end server. The client sends requests to NGINX, then NGINX passes the request to the back end.

Once the back end server finishes processing the request, it sends the response back to NGINX. In turn, NGINX returns the response to the client.

During the whole process, the client doesn't have any idea about who's actually processing the request. It sounds complicated in writing, but once you do it for yourself you'll see how easy NGINX makes it.

Let's see a very basic and impractical example of a reverse proxy:
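A sketch of such a configuration (the nginx.test host name is assumed here and has to exist in your hosts file, as noted below):

```nginx
server {
    listen 80;
    server_name nginx.test;

    location / {
        # pass every request on to the nginx.org server
        proxy_pass https://nginx.org;
    }
}
```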

Apart from validating and reloading the configuration, you'll also have to add this address to your hosts file to make this demo work on your system:

Now if you visit http://nginx.test, you'll be greeted by the original https://nginx.org site while the URI remains unchanged.


You should even be able to navigate around the site to an extent. If you visit http://nginx.test/en/docs/ you should get the http://nginx.org/en/docs/ page in response.

So as you can see, at a basic level, the proxy_pass directive simply passes a client's request to a third party server and reverse proxies the response to the client.

Now that you know how to configure a basic reverse proxy server, you can serve a Node.js application reverse proxied by NGINX. I've added a demo application inside the repository that comes with this article.

I'm assuming that you have experience with Node.js and know how to start a Node.js application using PM2.

If you've already cloned the repository inside /srv/nginx-handbook-projects then the node-js-demo project should be available in the /srv/nginx-handbook-projects/node-js-demo directory.

For this demo to work, you'll need to install Node.js on your server. You can do that following the instructions found here .

The demo application is a simple HTTP server that responds with a 200 status code and a JSON payload. You can start the application by simply executing node app.js but a better way is to use PM2 .

For those of you who don't know, PM2 is a daemon process manager widely used in production for Node.js applications. If you want to learn more, this link may help.

Install PM2 globally by executing sudo npm install -g pm2. After the installation is complete, execute the following command while inside the /srv/nginx-handbook-projects/node-js-demo directory:

Alternatively you can also do pm2 start /srv/nginx-handbook-projects/node-js-demo/app.js from anywhere on the server. You can stop the application by executing the pm2 stop app command.

The application should be running now, but it shouldn't be accessible from outside of the server. To verify that the application is running, send a GET request to http://localhost:3000 from inside your server:

If you get a 200 response, then the server is running fine. Now to configure NGINX as a reverse proxy, open your configuration file and update its content as follows:
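A sketch of that configuration (server name assumed from the earlier sections, port taken from the demo application):

```nginx
server {
    listen 80;
    server_name nginx-handbook.test;

    location / {
        # hand the request over to the Node.js app listening on port 3000
        proxy_pass http://localhost:3000;
    }
}
```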

Nothing new to explain here. You're just passing the received request to the Node.js application running at port 3000. Now if you send a request to the server from outside you should get a response as follows:

Although this works for a basic server like this, you may have to add a few more directives to make it work in a real world scenario depending on your application's requirements.

For example, if your application handles web socket connections, then you should update the configuration as follows:
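A sketch of a web-socket-capable variant of the same configuration might be:

```nginx
server {
    listen 80;
    server_name nginx-handbook.test;

    location / {
        proxy_pass http://localhost:3000;

        # web sockets need HTTP/1.1 plus the Upgrade and Connection headers
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
```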

The proxy_http_version directive sets the HTTP version for the proxied connection. By default it's 1.0, but web sockets require it to be at least 1.1. The proxy_set_header directive, on the other hand, is used for setting headers on the request being passed to the back-end server. The generic syntax for this directive is as follows:
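In pseudo form, the directive looks like this:

```nginx
proxy_set_header <header name> <header value>;
```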

So, by writing proxy_set_header Upgrade $http_upgrade; you're instructing NGINX to pass the value of the $http_upgrade variable as a header named Upgrade – same for the Connection header.

If you would like to learn more about web socket proxying, this link to the official NGINX docs may help.

Depending on the headers required by your application, you may have to set more of them. But the above mentioned configuration is very commonly used to serve Node.js applications.

PHP and NGINX go together like bread and butter. After all, the E and the P in the LEMP stack stand for NGINX and PHP.

I'm assuming you have experience with PHP and know how to run a PHP application.

I've already included a demo PHP application in the repository that comes with this article. If you've already cloned it in the /srv/nginx-handbook-projects directory, then the application should be inside /srv/nginx-handbook-projects/php-demo .

For this demo to work, you'll have to install a package called PHP-FPM. To install the package, execute the following command:

To test out the application, start a PHP server by executing the following command while inside the /srv/nginx-handbook-projects/php-demo directory:

Alternatively you can also do php -S localhost:8000 /srv/nginx-handbook-projects/php-demo/index.php from anywhere on the server.

The application should be running at port 8000, but it cannot be accessed from outside of the server. To verify, send a GET request to http://localhost:8000 from inside your server:

If you get a 200 response, then the server is running fine. Just like with the Node.js configuration, you could now simply proxy_pass the requests to localhost:8000 – but with PHP, there is a better way.

The FPM part in PHP-FPM stands for FastCGI Process Manager. FastCGI is a protocol, just like HTTP, for exchanging binary data. This protocol is slightly faster than proxying over HTTP and provides better security.

To use FastCGI instead of HTTP, update your configuration as follows:
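A sketch of that configuration follows. The PHP version in the socket path is an assumption – adjust it to match your installation:

```nginx
server {
    listen 80;
    server_name nginx-handbook.test;

    root /srv/nginx-handbook-projects/php-demo;

    # look for index.php instead of the default index.html
    index index.php;

    location / {
        try_files $uri $uri/ =404;
    }

    location ~ \.php$ {
        # pass PHP requests to PHP-FPM over a Unix socket
        # (php8.1 here is an assumed version number)
        fastcgi_pass unix:/run/php/php8.1-fpm.sock;
        fastcgi_param REQUEST_METHOD $request_method;
        fastcgi_param SCRIPT_FILENAME $realpath_root$fastcgi_script_name;
    }
}
```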

Let's begin with the new index directive. As you know, NGINX by default looks for an index.html file to serve. But in the demo project, it's called index.php. So by writing index index.php, you're instructing NGINX to use the index.php file as the index instead.

This directive can accept multiple parameters. If you write something like index index.php index.html , NGINX will first look for index.php. If it doesn't find that file, it will look for an index.html file.

The try_files directive inside the first location context is the same as you've seen in a previous section. The =404 at the end indicates the error to throw if none of the files are found.

The second location block is where the main magic happens. As you can see, we've replaced the proxy_pass directive with a new fastcgi_pass directive. As the name suggests, it's used for passing a request to a FastCGI service.

The PHP-FPM service can also listen on port 9000 of the host. So instead of using a Unix socket like I've done here, you could pass the request to localhost:9000 directly. But using a Unix socket is more secure.

If you have multiple PHP-FPM versions installed, you can simply list all the socket file locations by executing the following command:

The /run/php/php-fpm.sock file refers to the latest version of PHP-FPM installed on your system. I prefer using the one with the version number. This way even if PHP-FPM gets updated, I'll be certain about the version I'm using.

Unlike passing requests through HTTP, passing requests through FPM requires us to pass some extra information.

The general way of passing extra information to the FPM service is using the fastcgi_param directive. At the very least, you'll have to pass the request method and the script name to the back-end service for the proxying to work.

The fastcgi_param REQUEST_METHOD $request_method; line passes the request method to the back-end, and the fastcgi_param SCRIPT_FILENAME $realpath_root$fastcgi_script_name; line passes the exact location of the PHP script to run.

At this point, your configuration should work. To test it out, visit your server and you should be greeted by something like this:


Well, that's weird. A 500 status code means something has gone wrong on the server. This is where the error logs come in handy. Let's have a look at the last entry in the error.log file:

Seems like the NGINX process is being denied permission to access the PHP-FPM process.

One of the main reasons for getting a permission denied error is user mismatch. Have a look at the user owning the NGINX worker process.

As you can see, the process is currently owned by nobody . Now inspect the PHP-FPM process.

This process, on the other hand, is owned by the www-data user. This is why NGINX is being denied access to this process.

To solve this issue, update your configuration as follows:
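The fix amounts to adding one line in the main context; a sketch (the rest of the configuration is elided):

```nginx
# main context – sets the owner of the NGINX worker processes
user www-data;

events {
    # ...
}

http {
    server {
        # the PHP-FPM configuration from above goes here, unchanged
    }
}
```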

The user directive is responsible for setting the owner of the NGINX worker processes. Now inspect the NGINX process once again:

Undoubtedly the process is now owned by the www-data user. Send a request to your server to check if it's working or not:

If you get a 200 status code with a JSON payload, you're good to go.

This simple configuration is fine for the demo application, but in real-life projects you'll have to pass some additional parameters.

For this reason, NGINX includes a partial configuration called fastcgi_params . This file contains a list of the most common FastCGI parameters.

As you can see, this file also contains the REQUEST_METHOD parameter. Instead of passing that manually, you can just include this file in your configuration:
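A sketch of the PHP location block using the partial (socket path version assumed, as before):

```nginx
location ~ \.php$ {
    # supplies REQUEST_METHOD and the other common FastCGI parameters
    include /etc/nginx/fastcgi_params;

    fastcgi_pass unix:/run/php/php8.1-fpm.sock;
    fastcgi_param SCRIPT_FILENAME $realpath_root$fastcgi_script_name;
}
```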

Your server should behave just the same. Apart from the fastcgi_params file, you may also come across the fastcgi.conf file which contains a slightly different set of parameters. I would suggest that you avoid that due to some inconsistencies with its behavior.

Thanks to the reverse proxy design of NGINX, you can easily configure it as a load balancer.

I've already added a demo to the repository that comes with this article. If you've already cloned the repository inside the /srv/nginx-handbook-projects/ directory then the demo should be in the /srv/nginx-handbook-projects/load-balancer-demo/ directory.

In a real life scenario, load balancing may be required on large scale projects distributed across multiple servers. But for this simple demo, I've created three very simple Node.js servers responding with a server number and 200 status code.

For this demo to work, you'll need Node.js installed on the server. You can find instructions in this link to help you get it installed.

Apart from this, you'll also need PM2 for daemonizing the Node.js servers provided in this demo.

If you haven't already, install PM2 by executing sudo npm install -g pm2 . After the installation finishes, execute the following commands to start the three Node.js servers:

Three Node.js servers should be running on localhost:3001, localhost:3002, localhost:3003 respectively.

Now update your configuration as follows:
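A sketch of such a configuration (the upstream name backend_servers is an assumption; the ports come from the three demo servers):

```nginx
upstream backend_servers {
    server localhost:3001;
    server localhost:3002;
    server localhost:3003;
}

server {
    listen 80;
    server_name nginx-handbook.test;

    location / {
        # requests are distributed across the upstream,
        # round robin by default
        proxy_pass http://backend_servers;
    }
}
```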

The configuration inside the server context is the same as you've already seen. The upstream context, though, is new. An upstream in NGINX is a collection of servers that can be treated as a single backend.

So the three servers you started using PM2 can be put inside a single upstream and you can let NGINX balance the load between them.

To test out the configuration, you'll have to send a number of requests to the server. You can automate the process using a while loop in bash:
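A loop along these lines works (host name assumed from the earlier sections):

```shell
# send a request every half second; stop with Ctrl + C
while sleep 0.5; do curl http://nginx-handbook.test; done
```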

You can cancel the loop by hitting Ctrl + C on your keyboard. As you can see from the responses from the server, NGINX is load balancing the servers automatically.

Of course, depending on the project scale, load balancing can be a lot more complicated than this. But the goal of this article is to get you started, and I believe you now have a basic understanding of load balancing with NGINX. You can stop the three running servers by executing the pm2 stop server-1 server-2 server-3 command (and it's a good idea to do so here).

How to Optimize NGINX for Maximum Performance

In this section of the article, you'll learn about a number of ways to get the maximum performance from your server.

Some of these methods will be application-specific, which means they'll probably need tweaking considering your application requirements. But some of them will be general optimization techniques.

Just like the previous sections, changes in configuration will be frequent in this one, so don't forget to validate and reload your configuration file every time.

As I've already mentioned in a previous section, NGINX can spawn multiple worker processes capable of handling thousands of requests each.

As you can see, right now there is only one NGINX worker process on the system. This number, however, can be changed by making a small change to the configuration file.
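Setting it explicitly might look like this (the rest of the configuration is elided):

```nginx
worker_processes 2;

events {
    # ...
}

http {
    # ...
}
```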

The worker_processes directive written in the main context is responsible for setting the number of worker processes to spawn. Now check the NGINX service once again and you should see two worker processes:

Setting the number of worker processes is easy, but determining the optimal number of worker processes requires a bit more work.

The worker processes are asynchronous in nature. This means that they will process incoming requests as fast as the hardware can.

Now consider that your server runs on a single core processor. If you set the number of worker processes to 1, that single process will utilize 100% of the CPU capacity. But if you set it to 2, the two processes will be able to utilize 50% of the CPU each. So increasing the number of worker processes doesn't mean better performance.

A rule of thumb for determining the optimal number of worker processes is number of worker processes = number of CPU cores.

If you're running on a server with a dual core CPU, the number of worker processes should be set to 2. In a quad core it should be set to 4...and you get the idea.

Determining the number of CPUs on your server is very easy on Linux.

I'm on a single-CPU virtual machine, so nproc detects one CPU. Now that you know the number of CPUs, all that's left to do is set the number in the configuration.

That's all well and good, but every time you upscale the server and the CPU number changes, you'll have to update the server configuration manually.

NGINX provides a better way to deal with this issue. You can simply set the number of worker processes to auto and NGINX will set the number of processes based on the number of CPUs automatically.
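That amounts to a one-word change in the main context:

```nginx
# NGINX picks the worker count based on the number of CPUs
worker_processes auto;

events {
    # ...
}

http {
    # ...
}
```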

Inspect the NGINX process once again:

The number of worker processes is back to one again, because that's what is optimal for this server.

Apart from the worker processes, there is also the worker connections setting, which indicates the highest number of connections a single worker process can handle.

Just like the number of worker processes, this number is also related to the number of CPU cores and the number of files your operating system is allowed to open per core.

Finding out this number is very easy on Linux:

Now that you have the number, all that is left is to set it in the configuration:
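A sketch of the events context with the directive in place – the value 1024 below is an assumption; use whatever ulimit -n reported on your system:

```nginx
worker_processes auto;

events {
    # the number reported by ulimit -n (1024 is an assumed value)
    worker_connections 1024;
}

http {
    # ...
}
```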

The worker_connections directive is responsible for setting the number of worker connections in a configuration. This is also the first time you're working with the events context.

In a previous section, I mentioned that this context is used for setting values used by NGINX on a general level. The worker connections configuration is one such example.

The second technique for optimizing your server is caching static content. Regardless of the application you're serving, there is always a certain amount of static content being served, such as stylesheets, images, and so on.

Considering that this content is not likely to change very frequently, it's a good idea to cache them for a certain amount of time. NGINX makes this task easy as well.
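A sketch of a static-cache location block, matching the headers discussed below:

```nginx
location ~* \.(css|js|jpg)$ {
    add_header Cache-Control public;
    add_header Pragma public;
    add_header Vary Accept-Encoding;

    # cache matching static assets for one month
    expires 1M;
}
```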

By writing location ~* \.(css|js|jpg)$ you're instructing NGINX to match requests asking for a file ending in .css, .js, or .jpg.

In my applications, I usually store images in the WebP format even if the user submits a different format. This way, configuring the static cache becomes even easier for me.

You can use the add_header directive to include a header in the response to the client. Previously you've seen the proxy_set_header directive used for setting headers on an ongoing request to the backend server. The add_header directive on the other hand only adds a given header to the response.

By setting the Cache-Control header to public, you're telling the client that this content can be cached in any way. The Pragma header is just an older version of the Cache-Control header and does more or less the same thing.

The next header, Vary , is responsible for letting the client know that this cached content may vary.

The value of Accept-Encoding means that the content may vary depending on the content encoding accepted by the client. This will be clarified further in the next section.

Finally, the expires directive allows you to set the Expires header conveniently. It takes the duration of time the cache will be valid. By setting it to 1M you're telling NGINX to cache the content for one month. You can also set this to 10m for 10 minutes, 24h for 24 hours, and so on.

Now to test out the configuration, send a request for the the-nginx-handbook.jpg file to the server:

As you can see, the headers have been added to the response and any modern browser should be able to interpret them.

The final optimization technique that I'm going to show today is a pretty straightforward one: compressing responses to reduce their size.

If you're not already familiar with it, GZIP is a popular file format used by applications for file compression and decompression. NGINX can utilize this format to compress responses using the gzip directives.
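A sketch of the gzip directives in the http context, matching the values discussed below:

```nginx
http {
    # compress responses before sending them to clients
    gzip on;
    gzip_comp_level 3;
    gzip_types text/css text/javascript;

    # ...
}
```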

By writing gzip on in the http context, you're instructing NGINX to compress responses. The gzip_comp_level directive sets the level of compression. You can set it to a very high number, but that doesn't guarantee better compression. Setting a number between 1 and 4 gives you an efficient result. For example, I like setting it to 3.

By default, NGINX compresses HTML responses. To compress other file formats, you'll have to pass them as parameters to the gzip_types directive. By writing gzip_types text/css text/javascript; you're telling NGINX to compress any file with the mime types of text/css and text/javascript.

Configuring compression in NGINX is not enough. The client has to ask for the compressed response instead of the uncompressed responses. I hope you remember the add_header Vary Accept-Encoding; line in the previous section on caching. This header lets the client know that the response may vary based on what the client accepts.

As an example, if you want to request the uncompressed version of the mini.min.css file from the server, you may do something like this:

As you can see, there's nothing about compression. Now if you want to ask for the compressed version of the file, you'll have to send an additional header.

As you can see in the response headers, the Content-Encoding is now set to gzip meaning this is the compressed version of the file.

Now if you want to compare the difference in file size, you can do something like this:

The uncompressed version of the file is 46K and the compressed version is 9.1K – about five times smaller. On real-life sites where stylesheets can be much larger, compression can make your responses smaller and faster.

I hope you remember the original nginx.conf file you renamed in an earlier section. According to the Debian wiki , this file is meant to be changed by the NGINX maintainers and not by server administrators, unless they know exactly what they're doing.

But throughout the entire article, I've taught you to configure your servers in this very file. In this section, however, I'll show you how to configure your servers without changing the nginx.conf file.

To begin with, first delete or rename your modified nginx.conf file and bring back the original one:

Now NGINX should go back to its original state. Let's have a look at the content of this file once again by executing the sudo cat /etc/nginx/nginx.conf command:

You should now be able to understand this file without much trouble. In the main context, the user www-data; and worker_processes auto; lines should be easily recognizable to you.

The pid /run/nginx.pid; line sets the file that stores the process ID of the NGINX process, and include /etc/nginx/modules-enabled/*.conf; includes any configuration files found in the /etc/nginx/modules-enabled/ directory.

This directory is meant for NGINX dynamic modules. I haven't covered dynamic modules in this article so I'll skip that.

Now inside the http context, under basic settings, you can see some common optimization techniques applied. Here's what these techniques do:

  • sendfile on; enables the sendfile() system call, skipping unnecessary buffering when serving static files.
  • tcp_nopush on; allows sending response header in one packet.
  • tcp_nodelay on; disables Nagle's Algorithm resulting in faster static file delivery.

The keepalive_timeout directive indicates how long to keep a connection open, and the types_hash_max_size directive sets the size of the types hash map. This section also includes the mime.types file by default.

I'll skip the SSL settings simply because we haven't covered them in this article. We've already discussed the logging and gzip settings. You may see some of the directives regarding gzip as commented. As long as you understand what you're doing, you may customize these settings.

You use the mail context to configure NGINX as a mail server. We've only talked about NGINX as a web server so far, so I'll skip this as well.

Now under the virtual hosts settings, you should see two lines as follows:

These two lines instruct NGINX to include any configuration files found inside the /etc/nginx/conf.d/ and /etc/nginx/sites-enabled/ directories.

After seeing these two lines, people often take these two directories as the ideal place to put their configuration files, but that's not right.

There is another directory /etc/nginx/sites-available/ that's meant to store configuration files for your virtual hosts. The /etc/nginx/sites-enabled/ directory is meant for storing the symbolic links to the files from the /etc/nginx/sites-available/ directory.

In fact there is an example configuration:

As you can see, the directory contains a symbolic link to the /etc/nginx/sites-available/default file.

The idea is to write multiple virtual hosts inside the /etc/nginx/sites-available/ directory and make some of them active by symbolic linking them to the /etc/nginx/sites-enabled/ directory.

To demonstrate this concept, let's configure a simple static server. First, delete the default virtual host symbolic link, deactivating this configuration in the process:

Create a new file by executing sudo touch /etc/nginx/sites-available/nginx-handbook and put the following content in there:
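A sketch of the file's content, reusing the server name and root from the earlier sections:

```nginx
server {
    listen 80;
    server_name nginx-handbook.test;

    root /srv/nginx-handbook-projects/static-demo;
}
```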

Files inside the /etc/nginx/sites-available/ directory are meant to be included within the main http context so they should contain server blocks only.

Now create a symbolic link to this file inside the /etc/nginx/sites-enabled/ directory by executing the following command:
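Assuming the file name from the previous step, the command looks like this:

```shell
sudo ln -s /etc/nginx/sites-available/nginx-handbook /etc/nginx/sites-enabled/nginx-handbook
```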

Before validating and reloading the configuration file, you'll have to reopen the log files. Otherwise you may get a permission denied error. This happens because the process ID is different this time as a result of swapping the old nginx.conf file.

Finally, validate and reload the configuration file:

Visit the server and you should be greeted with the good old The NGINX Handbook page:


If you've configured the server correctly and you're still getting the old NGINX welcome page, perform a hard refresh. The browser often holds on to old assets and requires a little cleanup.

How To Configure SSL and HTTP/2

HTTP/2 is the newest version of the wildly popular Hypertext Transfer Protocol. Based on Google's experimental SPDY protocol, HTTP/2 provides better performance by introducing features like full request and response multiplexing, better compression of header fields, server push, and request prioritization.

Some of the notable features of HTTP/2 are as follows:

  • Binary Protocol - While HTTP/1.x was a text-based protocol, HTTP/2 is a binary protocol, resulting in fewer errors during data transfer.
  • Multiplexed Streams - All HTTP/2 connections are multiplexed streams meaning multiple files can be transferred in a single stream of binary data.
  • Compressed Header - HTTP/2 compresses header data in responses resulting in faster transfer of data.
  • Server Push - This capability allows the server to send linked resources to the client automatically, greatly reducing the number of requests to the server.
  • Stream Prioritization - HTTP/2 can prioritize data streams based on their type resulting in better bandwidth allocation where necessary.
If you want to learn more about the improvements in HTTP/2, this article by Kinsta may help.

While a significant upgrade over its predecessor, HTTP/2 is not as widely adopted as it should be. In this section, I'll introduce you to some of the new features mentioned previously, and I'll also show you how to enable HTTP/2 on your NGINX-powered web server.

For this section, I'll be using the static-demo project. I'm assuming you've already cloned the repository inside /srv/nginx-handbook-projects directory. If you haven't, this is the time to do so. Also, this section has to be done on a virtual private server instead of a virtual machine.

For simplicity, I'll use the /etc/nginx/sites-available/default file as my configuration. Open the file using nano or vi if you fancy that.

Update the file's content as follows:
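A sketch of that configuration (server name and root taken from the surrounding text – substitute your own domain or IP address):

```nginx
server {
    listen 80;
    server_name nginx-handbook.farhan.dev;

    root /srv/nginx-handbook-projects/static-demo;
}
```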

As you can see, the /srv/nginx-handbook-projects/static-demo directory has been set as the root of this site and nginx-handbook.farhan.dev has been set as the server name. If you don't have a custom domain set up, you can use your server's IP address as the server name here.

Test the configuration by executing the nginx -t command and reload it by executing the nginx -s reload command.

Finally visit your server and you should be greeted with a simple static HTML page.


One of the prerequisites for having HTTP/2 working on your server is a valid SSL certificate. Let's take care of that first.

For those of you who may not know, an SSL certificate is what allows a server to make the move from HTTP to HTTPS. These certificates are issued by a certificate authority (CA). Most authorities charge a fee for issuing certificates, but nonprofit authorities such as Let's Encrypt issue certificates for free.

If you want to understand the theory of SSL in a bit more detail, this article on the Cloudflare Learning Center may help.

Thanks to open-source tools like Certbot, installing a free certificate is dead easy. Head over to certbot.eff.org and select the software and system that power your server.


I'm running NGINX on Ubuntu 20.04, and if you've been following along with this article, you should have the same combination.

After selecting your combination of software and system, you'll be forwarded to a new page containing step-by-step instructions for installing Certbot and a new SSL certificate.


The installation steps for Certbot may differ from system to system, but the rest of the instructions should remain the same. On Ubuntu, the recommended way is to use snap.

Certbot is now installed and ready to be used. Before you install a new certificate, make sure the NGINX configuration file contains all the necessary server names. For example, if you want to install a new certificate for yourdomain.tld and www.yourdomain.tld, you'll have to include both of them in your configuration.

Once you're happy with your configuration, you can install a newly provisioned certificate for your server. To do so, execute the certbot program with the --nginx option: sudo certbot --nginx.

You'll be asked for an emergency contact email address, to agree to the license agreement, and whether you'd like to receive emails from them.

The certbot program will automatically read the server names from your configuration file and show you a list of them. If you have multiple virtual hosts on your server, certbot will recognize them as well.

Finally, if the installation is successful, you'll be congratulated by the program. To verify that everything's working, visit your server over HTTPS this time.


As you can see, HTTPS has been enabled successfully, and you can confirm that the certificate is verified by the Let's Encrypt authority. Later on, if you add new virtual hosts to this server with new domains or subdomains, you'll have to reinstall the certificates.

It's also possible to install a wildcard certificate such as *.yourdomain.tld with some supported DNS providers. Detailed instructions can be found on the previously shown installation instructions page.


A newly installed certificate will be valid for 90 days. After that, a renewal will be required. Certbot handles the renewal automatically. You can execute the certbot renew command with the --dry-run option to test the auto-renewal feature.

The command will simulate a certificate renewal to test whether it's correctly set up. If it succeeds, you'll be congratulated by the program. This step ends the procedure of installing an SSL certificate on your server.

To understand what certbot did behind the scenes, open up the /etc/nginx/sites-available/default file once again and see how its content has been altered.

As you can see, certbot has added quite a few lines here. I'll explain the notable ones.
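The certbot-modified file will contain lines roughly like the following (a sketch assuming the domain used earlier; the certificate paths shown are the standard Let's Encrypt locations, and the exact contents on your server may differ):

```nginx
server {
    root /srv/nginx-handbook-projects/static-demo;
    server_name nginx-handbook.farhan.dev;

    listen [::]:443 ssl ipv6only=on; # managed by Certbot
    listen 443 ssl; # managed by Certbot
    ssl_certificate /etc/letsencrypt/live/nginx-handbook.farhan.dev/fullchain.pem; # managed by Certbot
    ssl_certificate_key /etc/letsencrypt/live/nginx-handbook.farhan.dev/privkey.pem; # managed by Certbot
    include /etc/letsencrypt/options-ssl-nginx.conf; # managed by Certbot
    ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem; # managed by Certbot
}

server {
    if ($host = nginx-handbook.farhan.dev) {
        return 301 https://$host$request_uri;
    } # managed by Certbot

    listen 80;
    listen [::]:80;
    server_name nginx-handbook.farhan.dev;
    return 404; # managed by Certbot
}
```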

Like port 80, port 443 is widely used for listening to HTTPS requests. By writing listen 443 ssl;, Certbot is instructing NGINX to listen for any HTTPS request on port 443. The listen [::]:443 ssl ipv6only=on; line is for handling IPv6 connections.

The ssl_certificate and ssl_certificate_key directives indicate the locations of the certificate and the private key file on your server. The included /etc/letsencrypt/options-ssl-nginx.conf file contains some common directives necessary for SSL.

Finally, the ssl_dhparam directive points to the file that defines how OpenSSL will perform the Diffie–Hellman key exchange. If you want to learn more about the purpose of the /etc/letsencrypt/ssl-dhparams.pem file, this Stack Exchange thread may help you.

This newly added server block is responsible for redirecting any HTTP request to HTTPS, disabling plain HTTP access completely.

How To Enable HTTP/2

Once you've successfully installed a valid SSL certificate on your server, you're ready to enable HTTP/2. SSL is a prerequisite for HTTP/2, so right off the bat you can see that security is not optional in HTTP/2.

HTTP/2 support in NGINX is provided by the ngx_http_v2_module module. Pre-built binaries of NGINX on most systems come with this module baked in. If you've built NGINX from source, however, you'll have to include this module manually.

Before upgrading to HTTP/2, send a request to your server and check the current protocol version.

As you can see, by default the server uses the HTTP/1.1 protocol. In the next step, we'll update the configuration file as necessary to enable HTTP/2.

To enable HTTP/2 on your server, open the /etc/nginx/sites-available/default file once again. Find wherever it says listen [::]:443 ssl ipv6only=on; or listen 443 ssl; and update them to listen [::]:443 ssl http2 ipv6only=on; and listen 443 ssl http2; respectively.
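After the change, the two listen directives read as follows:

```nginx
listen [::]:443 ssl http2 ipv6only=on;
listen 443 ssl http2;
```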

Test the configuration file by executing nginx -t and reload the configuration by executing nginx -s reload. Now send a request to your server again.

As you can see, HTTP/2 has been enabled for any client supporting the new protocol.

Server push is one of the many features that HTTP/2 brings to the table: the server can push files to the client without the client having to request them. On an HTTP/1.x server, a typical request for static content may look as follows:


But on a server-push-enabled HTTP/2 server, it may look as follows:


For a single request for the index.html file, the server responds with the style.css file as well, minimizing the number of requests in the process.

In this section, I'll use an open-source HTTP client named Nghttp2 for testing the server.

Let's test by sending a request to the server without server push.
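Assuming nghttp is installed and the server is reachable at the domain used earlier, the two requests described below might look like this:

```shell
nghttp --null-out --stat https://nginx-handbook.farhan.dev
nghttp --null-out --get-assets --stat https://nginx-handbook.farhan.dev
```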

On the first request, --null-out means discard downloaded data and --stat means print statistics on the terminal. On the second request, --get-assets means also download assets such as stylesheets, images, and scripts linked to this file. As a result, you can tell by the requestStart times that the CSS file and the image were downloaded shortly after the HTML file.

Now, let's enable server push for stylesheets and images. Open the /etc/nginx/sites-available/default file and update its content as follows:
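The added location blocks might look like the following sketch; the pushed asset file names here are placeholders, so substitute the stylesheet and image your pages actually link to:

```nginx
location = /index.html {
    http2_push /style.css;
    http2_push /cover.jpg;
}

location = /about.html {
    http2_push /style.css;
    http2_push /cover.jpg;
}
```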

Two location blocks have been added to exactly match the /index.html and /about.html locations. The http2_push directive is used for sending back additional responses. Now, whenever NGINX receives a request for one of these two locations, it'll automatically send back the CSS and image files.

Now send another request to the server using nghttp, this time without the --get-assets option.

As you can see, although the assets were not requested, the server sent them to the client. Looking at the time measurements, the process time has gone down and the three responses finished almost simultaneously.

This was a very simple example of server push, but depending on the needs of your project, this configuration can become much more complex. This article by Owen Garrett on the official NGINX blog can help you with more complex server push configurations.

I would like to thank you from the bottom of my heart for the time you've spent on reading this article. I hope you've enjoyed your time and have learned all the essentials of NGINX.

Apart from this one, I've written full-length handbooks on other complicated topics available for free on freeCodeCamp .

These handbooks are part of my mission to simplify hard to understand technologies for everyone. Each of these handbooks takes a lot of time and effort to write.

If you've enjoyed my writing and want to keep me motivated, consider leaving stars on GitHub and endorsing me for relevant skills on LinkedIn. I also accept sponsorship, so you may consider buying me a coffee if you want to.

I'm always open to suggestions and discussions on Twitter or LinkedIn. Hit me up with direct messages.

In the end, consider sharing the resources with others, because

Sharing knowledge is the most fundamental act of friendship. Because it is a way you can give something without losing something. — Richard Stallman

Till the next one, stay safe and keep learning.



Emiller’s Advanced Topics In Nginx Module Development

By Evan Miller (with Grzegorz Nosek )

DRAFT: August 13, 2009

Whereas Emiller’s Guide To Nginx Module Development describes the bread-and-butter issues of writing a simple handler, filter, or load-balancer for Nginx, this document covers three advanced topics for the ambitious Nginx developer: shared memory, subrequests, and parsing. Because these are subjects on the boundaries of the Nginx universe, the code here may be sparse. The examples may be out of date. But hopefully, you will make it out not only alive, but with a few extra tools in your belt.

Table of Contents

  • Shared Memory
  • A (fore)word of caution
  • Creating and using a shared memory segment
  • Using the slab allocator
  • Spinlocks, atomic memory access
  • Using rbtrees
  • Subrequests
  • Internal redirect
  • A single subrequest
  • Sequential subrequests
  • Parallel subrequests
  • Parsing With Ragel *NEW*
  • Installing ragel
  • Calling ragel from nginx
  • Writing a grammar
  • Writing some actions
  • Putting it all together

1. Shared Memory

Guest chapter written by Grzegorz Nosek

Nginx, while being unthreaded, allows worker processes to share memory between them. However, this is quite different from the standard pool allocator as the shared segment has fixed size and cannot be resized without restarting nginx or destroying its contents in another way.

1.1. A (fore)word of caution

First of all, caveat hacker. This guide has been written several months after hands-on experience with shared memory in nginx and while I try my best to be accurate (and have spent some time refreshing my memory), in no way is it guaranteed. You’ve been warned.

Also, 100% of this knowledge comes from reading the source and reverse-engineering the core concepts, so there are probably better ways to do most of the stuff described.

Oh, and this guide is based on 0.6.31, though 0.5.x is 100% compatible AFAIK and 0.7.x also brings no compatibility-breaking changes that I know of.

For real-world usage of shared memory in nginx, see my upstream_fair module .

This probably does not work on Windows at all. Core dumps in the rear-view mirror are closer than they appear.

1.2. Creating and using a shared memory segment

To create and use a shared memory segment, you need to:

  • provide a constructor function to initialise the segment
  • call ngx_shared_memory_add

Your constructor will be called multiple times and it’s up to you to find out whether you’re called the first time (and should set something up), or not (and should probably leave everything alone). The prototype for the shared memory constructor looks like:

```c
static ngx_int_t init(ngx_shm_zone_t *shm_zone, void *data);
```

The data variable will contain the contents of oshm_zone->data , where oshm_zone is the "old" shm zone descriptor (more about it later). This variable is the only value that can survive a reload, so you must use it if you don’t want to lose the contents of your shared memory.

Your constructor function will probably look roughly similar to the one in upstream_fair, i.e.:

```c
static ngx_int_t
init(ngx_shm_zone_t *shm_zone, void *data)
{
    if (data) { /* we're being reloaded, propagate the data "cookie" */
        shm_zone->data = data;
        return NGX_OK;
    }

    /* set up whatever structures you wish to keep in the shm */

    /*
     * initialise shm_zone->data so that we know we have been called;
     * if nothing interesting comes to your mind, try shm_zone->shm.addr
     * or, if you're desperate, (void *) 1; just set the value to something
     * non-NULL for future invocations
     */
    shm_zone->data = something_interesting;

    return NGX_OK;
}
```

You must be careful when to access the shm segment.

The interface for adding a shared memory segment looks like:

```c
ngx_shm_zone_t *
ngx_shared_memory_add(ngx_conf_t *cf, ngx_str_t *name, size_t size, void *tag);
```

cf is the reference to the config file (you’ll probably create the segment in response to a config option), name is the name of the segment (as a ngx_str_t , i.e. a counted string), size is the size in bytes (which will usually get rounded up to the nearest multiple of the page size, e.g. 4KB on many popular architectures) and tag is a, well, tag for detecting naming conflicts. If you call ngx_shared_memory_add multiple times with the same name, tag and size, you’ll get only a single segment. If you specify different names, you’ll get several distinct segments and if you specify the same name but different size or tag, you’ll get an error. A good choice for the tag value could be e.g. the pointer to your module descriptor.
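As a sketch (the directive handler, zone name, segment size, and module descriptor name are all hypothetical), a configuration handler registering a segment might look like this:

```c
static char *
my_shm_directive(ngx_conf_t *cf, ngx_command_t *cmd, void *conf)
{
    ngx_str_t        name = ngx_string("my_module_shm");
    ngx_shm_zone_t  *shm_zone;

    /* request a segment of 8 pages, tagged with our module descriptor
       so that naming conflicts with other modules are detected */
    shm_zone = ngx_shared_memory_add(cf, &name, 8 * ngx_pagesize,
                                     &ngx_http_my_module);
    if (shm_zone == NULL) {
        return NGX_CONF_ERROR;
    }

    /* without a constructor, nginx considers the segment unused
       and won't create it at all */
    shm_zone->init = init;

    return NGX_CONF_OK;
}
```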

When nginx starts, or is reloaded, it will:

  • parse the whole config file, noting requested shm segments
  • create the segments and call their constructors. Note that every time your ctor is called, it is with another value of shm_zone . The reason is that the descriptor lives as long as the cycle (generation in Apache terms) while the segment lives as long as the master and all the workers. To let some data survive a reload, you have access to the old descriptor’s ->data field (mentioned above).
  • (re)start workers, which begin handling requests
  • upon receipt of SIGHUP, go back to step 1

Also, you really must set the constructor, otherwise nginx will consider your segment unused and won’t create it at all.

Now that you know it, it’s pretty clear that you cannot rely on having access to the shared memory while parsing the config. You can access the whole segment as shm_zone->shm.addr (which will be NULL before the segment gets really created). Any access after the first parsing run (e.g. inside request handlers or on subsequent reloads) should be fine.

1.3. Using the slab allocator

Now that you have your new and shiny shm segment, how do you use it? The simplest way is to use another memory tool that nginx has at your disposal, namely the slab allocator. Nginx is nice enough to initialise the slab for you in every new shm segment, so you can either use it, or ignore the slab structures and overwrite them with your own data.

The basic slab interface consists of:

  • void *ngx_slab_alloc(ngx_slab_pool_t *pool, size_t size);
  • void ngx_slab_free(ngx_slab_pool_t *pool, void *p);

1.4. Spinlocks, atomic memory access

Remember that shared memory is inherently dangerous because you can have multiple processes accessing it at the same time. The slab allocator has a per-segment lock ( shpool->mutex ) which is used to protect the segment against concurrent modifications.

You can also acquire and release the lock yourself, which is useful if you want to implement some more complicated operations on the segment, like searching or walking a tree. The two snippets below are essentially equivalent:

```c
/*
 * void *new_block;
 * ngx_slab_pool_t *shpool = (ngx_slab_pool_t *) shm_zone->shm.addr;
 */

new_block = ngx_slab_alloc(shpool, ngx_pagesize);
```

```c
ngx_shmtx_lock(&shpool->mutex);
new_block = ngx_slab_alloc_locked(shpool, ngx_pagesize);
ngx_shmtx_unlock(&shpool->mutex);
```

In fact, ngx_slab_alloc looks almost exactly like the second snippet.

If you perform any operations which depend on no new allocations (or, more to the point, frees), protect them with the slab mutex. However, remember that nginx mutexes are implemented as spinlocks (non-sleeping), so while they are very fast in the uncontended case, they can easily eat 100% CPU when waiting. So don’t do any long-running operations while holding the mutex (especially I/O; in fact, you should avoid any system calls at all).

You can also use your own mutexes for more fine-grained locking, via the ngx_mutex_init() , ngx_mutex_lock() and ngx_mutex_unlock() functions.

As an alternative for locks, you can use atomic variables which are guaranteed to be read or written in an uninterruptible way (no worker process may see the value halfway as it’s being written by another one).

Atomic variables are defined with the type ngx_atomic_t or ngx_atomic_uint_t (depending on signedness). They should have at least 32 bits. To simply read or unconditionally set an atomic variable, you don’t need any special constructs:

```c
ngx_atomic_t i = an_atomic_var;
an_atomic_var = i + 5;
```

Note that anything can happen between the two lines: context switches, execution of code on other CPUs, etc.

ngx_atomic_cmp_set(lock, old, new) atomically retrieves the old value of *lock and stores new under the same address. It returns 1 if *lock was equal to old before overwriting.

ngx_atomic_fetch_add(value, add) atomically adds add to *value and returns the old *value .

1.5. Using rbtrees

OK, you have your data neatly allocated, protected with a suitable lock but you’d also like to organise it somehow. Again, nginx has a very nice structure just for this purpose - a red-black tree.

  • requires an insertion callback, which inserts the element in the tree (probably according to some predefined order) and then calls ngx_rbt_red(the_newly_added_node) to rebalance the tree
  • requires all leaves to be set to a predefined sentinel object (not NULL)

This chapter is about shared memory, not rbtrees, so shoo! Go read the source for upstream_fair to see creating and walking an rbtree in action.

2. Subrequests

Subrequests are one of the most powerful aspects of Nginx. With subrequests, you can return the results of a different URL than what the client originally requested. Some web frameworks call this an "internal redirect." But Nginx goes further: not only can modules perform multiple subrequests and combine the outputs into a single response, subrequests can perform their own sub-subrequests, and sub-subrequests can initiate sub-sub-subrequests, and… you get the idea. Subrequests can map to files on the hard disk, other handlers, or upstream servers; it doesn’t matter from the perspective of Nginx. As far as I know, only filters can issue subrequests.

2.1. Internal redirects

Internal redirects are issued with ngx_http_internal_redirect(r, uri, args), where r is the request struct, and uri and args are the new URI. Note that URIs must be locations already defined in nginx.conf; you cannot, for instance, redirect to an arbitrary domain. Handlers should return the return value of ngx_http_internal_redirect , i.e. redirecting handlers will typically end by returning the result of that call.
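A redirecting handler might therefore end like this (a sketch; the handler name and target location are assumptions, and the target must be a location defined in nginx.conf):

```c
static ngx_int_t
my_redirect_handler(ngx_http_request_t *r)
{
    ngx_str_t uri  = ngx_string("/some-defined-location");
    ngx_str_t args = ngx_null_string;

    /* hand the request off to another location defined in nginx.conf */
    return ngx_http_internal_redirect(r, &uri, &args);
}
```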

Internal redirects are used in the "index" module (which maps URLs that end in / to index.html) as well as Nginx’s X-Accel-Redirect feature.

2.2. A single subrequest

Subrequests are most useful for inserting additional content based on data from the original response . For example, the SSI (server-side include) module uses a filter to scan the contents of the returned document, and then replaces "include" directives with the contents of the specified URLs.

We’ll start with a simpler example. We’ll make a filter that treats the entire contents of a document as a URL to be retrieved, and then appends the new document to the URL itself. Remember that the URL must be a location in nginx.conf.

The prototype of ngx_http_subrequest is:

  • *r is the original request
  • *uri and *args refer to the sub-request
  • **psr is a reference to a NULL pointer that will point to the new (sub-)request structure
  • *ps is a callback for when the subrequest is finished. I’ve never used this, but see http/ngx_http_request.h for details.
  • flags can be a bitwise-OR’ed combination of:
  • NGX_HTTP_ZERO_IN_URI : the URI contains a character with ASCII code 0 (also known as '\0'), or contains "%00"
  • NGX_HTTP_SUBREQUEST_IN_MEMORY : store the result of the subrequest in a contiguous chunk of memory (usually not necessary)
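For reference, the prototype the bullets above describe reads as follows (as declared in the nginx source of that era):

```c
ngx_int_t
ngx_http_subrequest(ngx_http_request_t *r, ngx_str_t *uri, ngx_str_t *args,
    ngx_http_request_t **psr, ngx_http_post_subrequest_t *ps, ngx_uint_t flags);
```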

The results of the subrequest will be inserted where you expect. If you want to modify the results of the subrequest, you can use another filter (or the same one!). You can tell whether a filter is operating on the primary request or a subrequest by comparing r with r->main.
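The comparison is a one-liner:

```c
if (r == r->main) {
    /* we are filtering the primary request */
} else {
    /* we are filtering a subrequest */
}
```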

The simplest example of a module that issues a single subrequest is the "addition" module .

2.3. Sequential subrequests

You might think issuing multiple subrequests is as simple as calling ngx_http_subrequest once for each URI, one after the other.

You’d be wrong! Remember that Nginx is single-threaded. Subrequests might need to access the network, and if so, Nginx needs to return to its other work while it waits for a response. So we need to check the return value of ngx_http_subrequest , which can be one of:

  • NGX_OK : the subrequest finished without touching the network
  • NGX_DONE : the client reset the network connection
  • NGX_ERROR : there was a server error of some sort
  • NGX_AGAIN : the subrequest requires network activity

If your subrequest returns NGX_AGAIN , your filter should also immediately return NGX_AGAIN . When that subrequest finishes, and the results have been sent to the client, Nginx is nice enough to call your filter again, from which you can issue the next subrequest (or do some work in between subrequests). It helps, of course, to keep track of your planned subrequests in a context struct. You should take care to return errors immediately, too.

Let’s make a simple example. Suppose our context struct contains an array of URIs, and the index of the next subrequest:

Then a filter that simply concatenates the contents of these URIs together might look something like:
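A sketch of the core loop, under the context-struct assumptions above (this shows only the subrequest-issuing path; a real filter also passes subrequest buffers on to the next filter when r != r->main):

```c
while (ctx->i < ctx->uris.nelts) {
    ngx_http_request_t  *sr  = NULL;
    ngx_str_t           *uri = &((ngx_str_t *) ctx->uris.elts)[ctx->i];
    ngx_int_t            rc;

    rc = ngx_http_subrequest(r, uri, NULL, &sr, NULL, 0);

    if (rc != NGX_OK) {
        /* NGX_AGAIN, NGX_DONE or NGX_ERROR: return it immediately */
        return rc;
    }

    ctx->i++;   /* NGX_OK: the subrequest finished, issue the next one */
}

return NGX_OK;
```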

Let’s think this code through. There might be more going on than you expect.

First, the filter is called on the original response. Based on this response we populate ctx and ctx->uris . Then we enter the while loop and call ngx_http_subrequest for the first time.

If ngx_http_subrequest returns NGX_OK then we move onto the next subrequest immediately. If it returns with NGX_AGAIN, we break out of the while loop and return NGX_AGAIN.

Suppose we’ve returned an NGX_AGAIN. The subrequest is pending some network activity, and Nginx has moved on to other things. But when that subrequest is finished, Nginx will call our filter at least two more times:

  • once with r set to the subrequest, and in set to buffers from the subrequest’s response
  • once with r set to the original request, and in set to NULL

To distinguish these two cases, we must test whether r == r->main . In this example we call the next filter if we’re filtering the subrequest. But if we’re in the main request, we’ll just pick up the while loop where we last left off. in will be set to NULL because there aren’t actually any new buffers to process.

When the last subrequest finishes and all is well, we return NGX_OK.

This example is of course greatly simplified. You’ll have to figure out how to populate ctx->uris on your own. But the example shows how simple it is to re-enter the subrequesting loop, and break out as soon as we get an error or NGX_AGAIN .

2.4. Parallel subrequests

It’s also possible to issue several subrequests at once without waiting for previous subrequests to finish. This technique is, in fact, too advanced even for Emiller’s Advanced Topics in Nginx Module Development . See the SSI module for an example.

3. Parsing with Ragel

If your module is dealing with any kind of input, be it an incoming HTTP header or a full-blown template language, you will need to write a parser. Parsing is one of those things that seems easy—how hard can it be to convert a string into a struct?—but there is definitely a right way to parse and a wrong way to parse. Unfortunately, Nginx sets a bad example by choosing (what I feel is) the wrong way.

What’s wrong with Nginx’s parsing code?

Nginx does all of its parsing, whether of SSI includes, HTTP headers, or Nginx configuration files, using state machines. A state machine , you might recall from your college Theory of Computation class, reads a tape of characters, moves from state to state based on what it reads, and might perform some action based on what character it reads and what state it is in. So for example, if I wanted to parse positive decimal point numbers with a state machine, I might have a "reading stuff left of the period" state, a "just read a period" state, and a "reading stuff right of the period" state, and move among them as I read in each digit.
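To make the contrast concrete, here is what even this tiny decimal-number machine looks like when hand-written in plain C (a self-contained sketch; the state names are mine):

```c
#include <assert.h>

/* States for validating a positive decimal such as "3.14". */
enum state { LEFT_OF_DOT, AFTER_DOT, RIGHT_OF_DOT, BAD };

/* Returns 1 if s has the form digits '.' digits, otherwise 0. */
static int is_decimal(const char *s)
{
    enum state st = LEFT_OF_DOT;
    int digits_left = 0, digits_right = 0;

    for (; *s != '\0'; s++) {
        switch (st) {
        case LEFT_OF_DOT:
            if (*s >= '0' && *s <= '9') {
                digits_left = 1;            /* stay left of the period */
            } else if (*s == '.' && digits_left) {
                st = AFTER_DOT;             /* just read the period */
            } else {
                st = BAD;
            }
            break;
        case AFTER_DOT:
        case RIGHT_OF_DOT:
            if (*s >= '0' && *s <= '9') {
                digits_right = 1;
                st = RIGHT_OF_DOT;          /* reading right of the period */
            } else {
                st = BAD;
            }
            break;
        case BAD:
            break;
        }
        if (st == BAD) {
            return 0;
        }
    }

    return st == RIGHT_OF_DOT && digits_right;
}
```

Every new token or corner case means another state and another batch of transitions, which is exactly the verbosity the next paragraph complains about.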

Unfortunately, state machine parsers are usually verbose, complex, hard to understand, and hard to modify. From a software development point of view, a better approach is to use a parser generator. A parser generator translates high-level, highly readable parsing rules into a low-level state machine. The compiled code from a parser generator is virtually the same as that of a handwritten state machine, but your code is much easier to work with.

There are a number of parser generators available, each with its own special syntax, but I am going to focus on one parser generator in particular: Ragel. Ragel is a good choice because it was designed to work with buffered inputs. Given Nginx’s buffer-chain architecture, there is a very good chance that you will be parsing a buffered input, whether you really want to or not.

3.1. Installing Ragel

Use your system’s package manager or else download Ragel from here .

3.2. Calling Ragel from Nginx

It’s a good idea to put your parser functions in a separate file from the rest of the module. You will then need to:

  • Create a header ( .h ) file for the parser
  • Include the header from your module
  • Create a Ragel ( .rl ) file
  • Generate a C ( .c ) file from the Ragel file
  • Include the C file in your module config

The header file should just have prototypes for your parser functions, which you can include in your module via the usual #include "my_module_parser.h" directive. The real work is writing the Ragel file. We will work through a simple example. The official Ragel User Guide ( PDF available here ) is fully 56 pages long and gives the programmer tremendous power, but we will just go through the parts of Ragel you really need for a simple parser.

Ragel files are C files interspersed with special Ragel commands and functions. Ragel commands are in blocks of code surrounded by %%{ and }%% . The first two Ragel commands you will want in your parser are:

These two commands should appear after any pre-processor directives but before your parser function. The machine command gives a name to the state machine Ragel is about to build for you. The write command will create the state definitions that the state machine will use. Don’t worry about these commands too much.
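In Ragel syntax, the two commands might look like this (the machine name my_parser is an assumption):

```ragel
%%{
    machine my_parser;
}%%

%%{
    write data;
}%%
```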

Next you can start writing your parser function as regular C. It can take any arguments you want and should return an ngx_int_t with NGX_OK upon success and NGX_ERROR upon failure. You should pass in, if not a pointer to the input you want to parse, then at least some kind of context struct that contains the input data.

Ragel will create a number of variables implicitly for you. Others you will need to define yourself in order to use Ragel. At the top of your function, you need to declare:

  • u_char *p - pointer to the beginning of the input
  • u_char *pe - pointer to the end of the input
  • int cs - an integer which stores the state machine’s state

Ragel will start its parsing wherever p points, and finish up as soon as it reaches pe . Therefore p and pe should both be pointers into a contiguous chunk of memory. Note that when Ragel is finished running on a particular input, you can save the value of cs (the machine state) and resume parsing on additional input buffers exactly where you left off. In this way Ragel works across multiple input buffers and fits beautifully into Nginx’s event-driven architecture.

3.3. Writing a grammar

Next we want to write the Ragel grammar for our parser. A grammar is just a set of rules that specifies which kinds of input are allowed; a Ragel grammar is special because it allows us to perform actions as we scan each character. To take advantage of Ragel, you must learn the Ragel grammar syntax; it is not difficult, but it is not trivial, either.

Ragel grammars are defined by sets of rules. A rule has an arbitrary name on the left side of an equals sign and a specification on the right side, followed by a semicolon. The rule specification is a mixture of regular expressions and actions. We will get to actions in a minute.

The most important rule is called "main." All grammars must have a rule for main. The rule for main is special in that 1) the name is not arbitrary and 2) it uses := instead of = to separate the name from the specification.

Let’s start with a simple example: a parser for processing Range requests. This code is adapted from my mod_zip module, which also includes a more complicated parser for processing lists of files, if you are interested.

The "main" rule for our byte range parser is quite simple:

That rule just says "the input should consist of the string bytes= followed by input which follows the rule called byte_range_set ." So we need to define the rule byte_range_set :
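Based on the description that follows, byte_range_set might be written as:

```ragel
byte_range_set = byte_range_specs ( "," byte_range_specs )*;
```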

That rule just says " byte_range_set consists of a byte_range_specs followed by zero or more commas each followed by a byte_range_specs ." In other words, a byte_range_set is a comma-separated list of byte_range_specs ’s. You might recognize the * as a Kleene star or from regular expressions.

Next we need to define the byte_range_specs rule:
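A sketch of that rule, matching the description below (the > operator attaches the new_range action to the start of the rule):

```ragel
byte_range_specs = byte_range_spec > new_range;
```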

The > character is special. It says that new_range is not the name of another rule, but the name of an action , and the action should be taken at the beginning of this rule, i.e. the beginning of byte_range_specs . The most important special characters are:

  • > - action should be taken at the beginning of this rule
  • $ - action should be taken as each character is processed
  • % - action should be taken at the end of this rule

There are others as well, which you can read about in the Ragel User Guide. These are enough to get you started without being too confusing.

Before we get into actions, let’s finish defining our rules. In the rule for byte_range_specs (plural), we referred to a rule called byte_range_spec (singular). It is defined as:
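A plausible reconstruction of that rule (digit is Ragel's built-in character class for 0–9):

```ragel
byte_range_spec = digit+ $start_incr "-" digit+ $end_incr;
```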

This rule states "read one or more digits, executing the action start_incr for each, then read a dash, then read one or more digits, executing the action end_incr for each." Notice that no actions are taken at the beginning or end of byte_range_spec .

When you are actually writing a grammar, you should write the rules in reverse order of what I have here. Rules should refer only to other rules that have been previously defined. So "main" should always be the last rule in your grammar, not the first.

Our byte-range grammar is now finished; it’s time to specify the actions.

3.4. Writing some actions

Actions are chunks of C code which have access to a few special variables. The most important special variables are:

  • fc - the current character being read
  • fpc - a pointer to the current character being read

fc is most useful for $ actions, i.e. actions performed on each character of a string or regular expression. fpc is more useful for > and % actions, that is, actions taken at the start or end of a rule.

To return to our byte-range example, here is the new_range action. It does not use any special variables.
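A sketch of the action, assuming the context struct keeps its ranges in an ngx_array_t field called ranges and that range points at the current element:

```ragel
action new_range {
    range = ngx_array_push(&ctx->ranges);
    if (range == NULL) {
        return NGX_ERROR;
    }
    ngx_memzero(range, sizeof(*range));  /* start/end accumulate from zero */
}
```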

new_range is surprisingly dull. It just allocates a new "range" struct on the "ranges" array stored in our context struct. Notice that as long as we include the right header files, Ragel actions have full access to the Nginx API.

Next we define the two remaining actions, start_incr and end_incr . These actions parse positive integers into the appropriate variables. As we read each digit of a number, we want to multiply the stored number by 10 and add the digit. Here we take advantage of the special variable fc described above:
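A sketch of the two actions (the field names start and end on the range struct are assumptions):

```ragel
action start_incr { range->start = range->start * 10 + (fc - '0'); }

action end_incr   { range->end = range->end * 10 + (fc - '0'); }
```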

Note the old parsing trick of subtracting '0' to convert a character to an integer.
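The trick can be seen in isolation in plain C (a self-contained sketch, not the actual mod_zip code):

```c
#include <assert.h>

/* Accumulate a decimal number one character at a time, the way the
 * $-actions do: multiply the running total by 10 and add the digit. */
static long parse_number(const char *s)
{
    long n = 0;

    while (*s >= '0' && *s <= '9') {
        n = n * 10 + (*s - '0');   /* subtracting '0' converts char to int */
        s++;
    }

    return n;
}
```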

That’s it for actions. We are almost finished with our parser.

3.5. Putting it all together

Actions and the grammar should go inside a Ragel block inside your parser function, but after the declarations of p , pe , and cs . I.e., something like:
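A sketch of the overall shape of the parser function (the function name, context type, and argument list are hypothetical; my_parser_first_final is generated by Ragel from the machine name):

```ragel
ngx_int_t
my_parse_range(my_ctx_t *ctx, u_char *input, u_char *last)
{
    u_char *p  = input;   /* start of input */
    u_char *pe = last;    /* end of input   */
    int     cs;           /* machine state  */

    %%{
        # actions and grammar rules go here ...

        write init;
        write exec;
    }%%

    if (cs < my_parser_first_final) {
        return NGX_ERROR;
    }

    return NGX_OK;
}
```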

We’ve added a few extra pieces here. The first are write init and write exec . These are commands to Ragel to insert the generated parser (written in C) right there.

The other extra bit is the comparison of cs to my_parser_first_final . Recall that cs stores the parser’s state. This check ensures that the parser is in a valid state after it has finished processing input. If we are parsing across multiple input buffers, then instead of this check we will store cs somewhere and retrieve it when we want to continue parsing.

Finally, we are ready to generate the actual parser. The code we’ve written so far should be in a Ragel ( .rl ) file; when we’re ready to compile, we just run the command:
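The command itself was lost in extraction; based on Ragel's standard usage (Ragel writes a .c file for C host code by default), it would be along these lines:

```shell
ragel my_parser.rl
```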

This command will produce a file called "my_parser.c". To ensure that it is compiled by Nginx, you then need to add a line to your module’s "config" file, like this:
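The original line is not shown here; a typical addition to the module's "config" file, assuming the generated my_parser.c sits in the module directory, looks like:

```shell
NGX_ADDON_SRCS="$NGX_ADDON_SRCS $ngx_addon_dir/my_parser.c"
```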

Once you get the hang of parsing with Ragel, you will wonder how you ever did without it. You will actually want to write parsers in your Nginx modules. Ragel opens up a whole new set of possible modules to the imaginative developer.

4. TODO: Advanced Topics Not Yet Covered Here

Topics not yet covered in this guide:

  • Built-in data structures (red-black trees, arrays, hash tables…)
  • Access control modules




Back to Evan Miller's home page

Writing an Nginx Response Body Filter Module

"By three methods we may learn wisdom: First, by reflection, which is noblest; Second, by imitation, which is easiest; and third, by experience, which is bitterest." – Confucius (孔子)

15 Dec 2017

Introduction

Nginx is a popular open-source web and proxy server, known for its performance and used by many websites. It supports third-party modules that can provide additional functionality and customization. This article shows how to write and develop a simple filter module that inserts a text string after the <head> element in an HTTP response body.

This can be useful in some cases, for instance, to insert a monitoring script without modifying the existing web pages or web application. Nginx can be used as a reverse proxy to speed up access to the website while inserting the monitoring script into the web content.

Table of Contents

  • The html tag parser
  • Nginx buffer chains and text insertion
  • A big picture view of the filter setup
  • Logical flow of the filter module
  • Performance considerations
  • HTTP chunked transfer encoding
  • Components of nginx module
  • Nginx module filter chain
  • Module config shell file
  • Nginx per request/response context
  • Saving and retrieving per request/response context
  • Structure for storing module configuration
  • Module directives
  • Nginx module context
  • The module initialization function
  • The module configuration creation and merge functions
  • Nginx module definition
  • The response headers filter function
  • The response body filter function
  • Explaining ctx->in, ctx->out, ctx->last_out
  • The html tag parser function
  • The text insertion function
  • Compiling the nginx body filter module
  • Testing the nginx filter module
  • A note about previous versions
  • Conclusion and afterthought
  • Useful references

Design and Approach

This section describes the design and approach taken to build the filter module. It shows how a simple parser can be built to parse for html tags. It explains how Nginx stores HTTP response using chain links of buffers and the way to insert text into this output chain. It also touches on how the filter module can be deployed, some of its features and the performance considerations.

Like many other Nginx modules, this filter module will be written using the C language.

In order to locate the <head> element, the filter needs to be able to parse an input stream for html tags or elements. To do this, let's take a look at the structure of an html element.

An html tag starts with an angle bracket < and ends with the corresponding closing > bracket. It has a tagname, an optional "/", and optional attributes. In the diagram above, SP represents whitespace. There must be at least a single space between the tagname and an attribute. Additional whitespace may be present between these tokens.

The following shows some examples of html tags.

A simplified BNF (Backus–Naur form) for HTML tagname and its attributes may look like this.

The BNF does look complex and scary. Parsing html into a syntax tree the way a web browser does is hard. Fortunately, our case is not as difficult as it seems; we can forget about the BNF listing above.

The parser just needs to focus on four key tokens: the opening angle bracket, the closing angle bracket, the single quote, and the double quote.

A stack can be used to collect the html tag encountered in an input stream. When the parser encounters a start bracket, '<', it initializes an empty stack and pushes the start bracket onto the stack. Other characters that come after the start bracket are pushed onto the stack as well.

If a single or double quote is seen, a toggling flag is set to indicate the start of string content. A corresponding closing quotation mark is required to end the string content. When the parser finally sees an end bracket, '>', it pushes it into the stack and the complete html tag is now present on the stack.

Toggling flags are used to determine whether a bracket, '<' or '>', represents a token or is part of a string. Any '<' or '>' encountered after the start bracket and a quotation mark is part of a string and will be treated as a normal character to be pushed onto the stack. When the corresponding closing quotation mark is seen, the relevant toggling flag is reset. Any '<' or '>' encountered afterwards will be interpreted as the start or end token of an html element.

This toggling mechanism applies to the single and double quotation marks too. A single quote that appears after a start bracket and a double quote is part of a string. A double quote that appears after a start bracket and a single quote is part of a string.

Any characters encountered before a start bracket, '<', are ignored. These are the content of the html document. A fresh stack is initialized each time the starting bracket is encountered.

These simple rules are sufficient to extract an html element from an input stream. It is really not as complicated or scary as we first thought. We will look at the parser code later in the implementation section of this article.

Nginx stores the content of the HTTP response body into a linked list of buffers using chain links (ngx_chain_t). Each buffer structure (ngx_buf_t) in the linked list holds a part of the HTTP response body. The final buffer has a special flag, last_buf, configured. This marks it as the last buffer in the output.

More than one linked list of buffer chains may be required to store the entire content of an HTTP response body. Nginx will pass each linked list of chains to the filter module as and when data is available.

The job of our html parser is to process each of these buffers, looking for the <head> tag. Each buffer (ngx_buf_t) has a pointer to a block of memory space holding the actual response content. The parser treats this memory block as an input stream starting with the first buffer.

When the <head> tag is found, its end position must be in the memory block held by the current buffer. To insert our own text string, this buffer will be split and relinked with our text in the middle. The following illustrates how an original buffer is split into 3 new buffers with the inserted text.

If the original buffer doesn't contain any data after the <head> tag, our text can be linked directly to this buffer.

The new set of buffers with the inserted text are linked up in the correct order with other buffers in the nginx output chain. This modified chain link is then passed to other filters in nginx for processing. The content will eventually be sent to the user.

So far, there are three diagrams showing the structure of Nginx buffer chains, but they are high-level abstract views, meant to describe the concept of text insertion.

The actual data structure is more like the following.

The diagram shows a single linked list of ngx_chain_t (chain links) containing ngx_buf_t (buffers) that point to blocks of memory holding the content of the HTTP response body. The final buffer in the link has the last_buf flag set to true. This indicates the end of output for the HTTP response.

Take note that the HTTP response can be stored in multiple sequential chain links. The filter module has to check the last_buf flag to determine the end of the HTTP response.

It is useful to keep the above diagram in mind; the filter module will be working on these chains of structures. It is easier to understand the source code when one can visualize them.

Refer to the official Nginx Development Guide for detailed description of ngx_chain_t and ngx_buf_t structures.

The earlier description about the html parser and text insertion is the core of the filter module that will be implemented. Here, we will show a big picture view of how this filter module can be deployed and used.

In the diagram above, Nginx and the web server are located on the same machine. The web server listens only on localhost (127.0.0.1) and accepts traffic from Nginx. Nginx is set up as a reverse proxy with the filter module installed. Incoming client requests are forwarded to the web server. The outgoing response from the web server is intercepted by Nginx and modified with the inserted text (a monitoring script).

Nginx is configured with TLS (Transport Layer Security, a.k.a. HTTPS) and serves as the TLS termination proxy for the web server. Caching will be enabled on Nginx to speed up performance.

There are a few other things the filter module has to handle. For example, if the original content from the web server is compressed (gzip or deflate), the filter will let the compressed content pass through unmodified. The web server should therefore disable compression and let Nginx itself handle content compression.

The order of module loading in Nginx is important. The filter module needs to run before Nginx's gzip module; otherwise, it cannot process the content that is compressed by gzip. By default, the filter module will run before gzip. The filter module will only handle html content type. Other content types like images, javascript, stylesheets or binary will be passed through unmodified.

The filter module will check the HTTP status code as well. If the status is not HTTP 200, the content will pass through unmodified. This means error pages will not have the text inserted.

Our filter also needs to handle malformed html, such as pages without a <head> tag or with multiple <head> tags. The text string will only be inserted once, after the first <head> tag that is encountered.

The <head> tag has to be within the first 256 characters of the HTTP response body; the filter module will only process the first 256 characters of a response. Most well-formed html content has the <head> tag right at the beginning of the document. The 256-character limit can be changed in the source code.

Another limit is that a single html tag, including its attributes, cannot exceed 512 characters; the maximum stack size for the parser is set to 512. This limit should not be hit, as the 256-character limit will have been triggered much earlier.

The big picture view earlier has shown how the module can be deployed and some of its limits and features. We can work out the behaviour of the filter module using a logic flow diagram. This will provide more clarity when writing the module code.

The simple block diagram below shows the logical flow of the filter module.

The current buffer from the chain link is processed, and there are two possible outcomes: the <head> tag is found within the first 256 characters, or it is not found.

If the <head> tag is found, our text will be inserted as described earlier. The modified buffers will be linked to the other buffers in the chain link and eventually its new content will be sent to the user.

For the case where the <head> tag is not found, the filter module will log an alert in the nginx error log. The current buffer is already a part of the chain link of buffers and no modification is made. The chain link will be processed by Nginx and the unmodified content will eventually be sent to the user.

The filter module needs to be fast. An nginx setup may include many other modules; our module needs to do its work quickly and pass the output to other modules and nginx for processing.

The html parsing and text insertion are done in a single pass through the chain of buffers. The parser will only process the first 256 characters of the response body; anything that comes after will not be parsed. This avoids parsing the whole response body, improving performance.

A particular problem of modifying an HTTP response body is determining the new content length. In our case, we cannot tell whether a <head> tag is present until we have processed the content, so we cannot determine the value of the Content-Length header in advance.

The standard solution is to use HTTP chunked transfer encoding, which indicates an unknown response body size. Some tricks can be used to avoid chunked transfer encoding, though.

For example, we can add the length of the text string to the Content Length header. If the <head> tag is eventually not found, we can append blank paddings to the output so that it matches the content length. If the <head> tag is found, our inserted text will ensure that the Content Length header is correct.

For simplicity, our filter will use chunked transfer encoding. In earlier versions of our filter module, the padding was actually implemented to avoid chunked transfer encoding; there were also other features, like blanking a page if <head> is not found.

All these additional features and tricks add complexity and can lead to potential bugs. Poor understanding of the module's behaviour can lead to misconfiguration issues. In the end, I reverted to a simple design for this filter module. The aim is simplicity and performance.

In the future, though, I may come up with another filter module that has mandatory blocking, as this can be useful for security. Readers who are interested in the earlier versions can refer to the Github link for the module at the end of this article. The README.md describes how to check out the version before my reversion to this simple design.

Structure of an Nginx HTTP Filter Module

This section will briefly run through some of the components of an Nginx module. This will help in understanding how the filter module works when going through its source code later.

The official Nginx Development Guide is the main reference to learn about developing nginx modules. It provides detailed information on the header files to include, the return codes that are supported, the functions available, the various Nginx data types such as ngx_str_t (String), arrays, lists etc... There are also many example codes that one can refer to.

The official guide is rather long, and multiple readings are probably required to understand the content. An easier introduction is available in Emiller's Guide To Nginx Module Development . This guide is a useful tutorial for beginners learning to write Nginx modules.

There are 3 important Nginx data structures that modules rely on.

  • Module Definition
  • Module Context
  • Module Directive Structure

The following table describes each item in more details. The source definition column provides the link to the actual nginx source code where the structure is defined.

Besides the three data structures described above, we need to know a bit about how Nginx handles http filter modules. Nginx treats http filter modules as a chain too: the first filter calls the second, the second calls the third, and so on until the last. There are two separate chains, one for handling HTTP response headers and another for the HTTP response body.

A filter module can register a handler for HTTP response headers, as well as a handler for HTTP response body.

Registration can be done in an initialization function defined as a post configuration function in the module context. The module context (ngx_http_module_t) is described in the table earlier.

The filter handlers take the arguments and return the values required by Nginx. For example, an HTTP response headers handler function takes a pointer to ngx_http_request_t as its argument and returns ngx_int_t. This handler function calls the next response headers handler in the chain when it is done.

The following is a function prototype of a filter handler for HTTP headers. The code is from our filter module.
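The listing did not survive extraction; based on the description, such a prototype typically looks like the following (static linkage and the exact name are assumptions matching the handler described later):

```c
static ngx_int_t ngx_http_html_head_header_filter(ngx_http_request_t *r);
```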

The nginx request structure, ngx_http_request_t, contains a lot of useful information, such as the HTTP status of the response, its content type, content length, and so on. Refer to the Nginx Development Guide for the various fields stored in an ngx_http_request_t structure.

The HTTP response body filter handler takes two arguments, a pointer to ngx_http_request_t and a pointer to ngx_chain_t. It returns an ngx_int_t. The second argument, ngx_chain_t* is a linked list for the output buffers. Each buffer stores part of the HTTP response body.

The following is the function prototype of a filter handler for the HTTP response body, taken from our filter module.
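The listing is missing here; given the argument and return types described above, it would look like this (static linkage and the exact name are assumptions):

```c
static ngx_int_t ngx_http_html_head_body_filter(ngx_http_request_t *r,
    ngx_chain_t *in);
```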

Our filter module will be parsing the content blocks in the ngx_chain_t* linked list; inserting our text after the <head> tag. Once it is done, it will call the next response body handler in the chain.

Note that the response body filter handler function can be called many times in a single request. This is due to the asynchronous, non-blocking I/O that makes nginx high performance; the filter handler is called whenever data is available for processing.

There are two global variables that are used by Nginx for registering the handler functions. The initialization function of our filter module sets these two variables when registering the handlers.

  • ngx_http_top_header_filter is a global pointer for storing the first HTTP response headers filter handler.
  • ngx_http_top_body_filter is a global pointer that stores the first HTTP response body filter handler.

We will see how these two variables are used when going through the source code.

To tell Nginx about the filter module, a config file is required. This is just a regular shell file; it tells Nginx the module name, the module type, and the location of the module source code. For more details on the config file and Nginx modules, refer to the Nginx Development Guide . The Nginx Wiki provides information on the config file as well.

Let's proceed to the implementation of the filter module and hopefully these concepts will become clearer when going through actual source code.

Implementing Nginx Response Body Filter

This section runs through some of the functions and data structures in the source code for the Html Head filter module. The full source is available at the Github link at the bottom of the article.

The following is the listing for the config file of Html Head filter module. Note, the filename of the config file is "config". It specifies the type of the module, a name for the module and a single c source file that contains the module code.
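The listing itself was lost in extraction; for an http filter module named as above, an older-style "config" shell file conventionally looks something like this (a sketch of the usual pattern, not necessarily the repository's exact file):

```shell
ngx_addon_name=ngx_http_html_head_filter_module
HTTP_FILTER_MODULES="$HTTP_FILTER_MODULES ngx_http_html_head_filter_module"
NGX_ADDON_SRCS="$NGX_ADDON_SRCS $ngx_addon_dir/ngx_http_html_head_filter_module.c"
```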

ngx_http_html_head_filter_module.c is the filter source file. The 3 Nginx header files required for HTTP module development are included at the top of the source file. Three macros are defined at the top as well.

The following code listing shows these macros and include files.
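The listing is not reproduced here; based on the macro descriptions below and the standard http-module headers, it would be close to the following. The numeric value of HF_LAST_SEARCH is an assumption for illustration:

```c
#include <ngx_config.h>
#include <ngx_core.h>
#include <ngx_http.h>

#define HF_MAX_STACK_SZ    512   /* maximum parser stack size         */
#define HF_MAX_CHARACTERS  256   /* search window for the <head> tag  */
#define HF_LAST_SEARCH     -3    /* assumed value: returned when the  */
                                 /* window ends without finding <head>*/
```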

A brief explanation of each of the macros is given below.

  • HF_MAX_STACK_SZ defines the size of the parsing stack, currently set to 512.
  • HF_MAX_CHARACTERS defines the maximum number of characters in a response body within which the parser will look for the <head> tag. Currently set to 256 characters.
  • HF_LAST_SEARCH defines the return code of our parsing function if the <head> tag is not found within 256 characters.

Nginx allows a module to keep state information per HTTP request/response through a data structure defined by the module. We define a structure ngx_http_html_head_filter_ctx_t that stores the state of processing a response. It includes a stack, headfilter_stack_t, used by the parser.

There are also a number of other members like count, which tracks the number of characters processed by the parser so far. The filter module expects to find the <head> tag in the first 256 characters of the response body.

The following shows the code for the per request/response context structure and the parser stack.
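The listing is not reproduced here; pieced together from the field descriptions that follow, the structures would be roughly as below. The field types and the stack layout are assumptions for illustration:

```c
typedef struct {
    u_char  buf[HF_MAX_STACK_SZ];   /* characters of the current tag */
    size_t  top;                    /* current stack top             */
} headfilter_stack_t;

typedef struct {
    headfilter_stack_t  stack;      /* parser stack                      */
    size_t              count;      /* characters processed so far       */
    size_t              index;      /* position of '>' of a found <head> */
    unsigned            found:1;       /* <head> tag located             */
    unsigned            last_search:1; /* 256-character limit hit        */
    unsigned            last:1;        /* last buffer processed          */
    unsigned            starttag:1;    /* inside a tag                   */
    unsigned            tagquote:1;    /* inside a double-quoted string  */
    unsigned            tagsquote:1;   /* inside a single-quoted string  */
    ngx_chain_t        *free;       /* buffers available for reuse       */
    ngx_chain_t        *busy;       /* buffers not yet fully sent        */
    ngx_chain_t        *in;         /* copy of the incoming chain        */
    ngx_chain_t        *out;        /* head of the outgoing chain        */
    ngx_chain_t       **last_out;   /* where to append to the out chain  */
} ngx_http_html_head_filter_ctx_t;
```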

The index variable stores the current position in the memory block of a buffer that the parser is processing. If a <head> tag is found, index will point to the position of the closing bracket ">" in the memory block of the current buffer. This information will be used for splitting up the buffer and inserting our text.

Structure members like found, last_search and last are flags to indicate certain conditions. The variable found is set to true when the <head> tag is found. last_search is set when the characters limit of 256 is hit. last is set when the last buffer of the output is processed.

starttag, tagquote and tagsquote are used by the parser when parsing the content block.

The ngx_chain_t pointers, free, busy, out and in, are used together with the pointer to pointer, last_out, for handling the incoming and outgoing buffers chains. free and busy are required for buffer reuse. Refer to the Nginx Development Guide for more details on buffer reuse.

Nginx offers two functions, ngx_http_set_ctx(r, ctx, module) and ngx_http_get_module_ctx(r, module) for saving and retrieving the module's per request/response context.

In our filter module implementation, ngx_http_set_ctx() function is called by the response headers filter handler when creating and initializing the per request/response context structure. The response body handler calls ngx_http_get_module_ctx() to retrieve the per request/response context structure.

If this structure is NULL, the response body handler will skip processing and call the next response body filter in the filter chain. The response headers filter handler will not create this context if certain checks fail, for example, if the content type is not "text/html". You will see this later in the source code.

The following is the data structure for storing the arguments of the configuration directives. When the nginx configuration file is processed, the arguments for our filter module directive will be stored into this structure.

ngx_http_html_head_filter_loc_conf_t has a string field, insert_text, that holds the text to be inserted after the <head> tag. This is the only configuration directive for our simple filter module.

The two static variables ngx_http_next_header_filter and ngx_http_next_body_filter, are pointers for storing the next header filter and body filter in the Nginx chain of filters. These are set during initialization of our filter module and are called when our module has done its work.

The following listing shows the directive that our filter module will take. The directives are declared as a static array of ngx_command_t structures.
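The listing was lost in extraction; reconstructed from the field-by-field description below, the array would look roughly like this (a sketch of the standard ngx_command_t pattern, not necessarily the exact source):

```c
static ngx_command_t ngx_http_html_head_filter_commands[] = {

    { ngx_string("html_head_filter"),          /* directive name         */
      NGX_HTTP_LOC_CONF|NGX_CONF_1MORE,        /* location ctx, 1+ args  */
      ngx_conf_set_str_slot,                   /* built-in string setter */
      NGX_HTTP_LOC_CONF_OFFSET,                /* location configuration */
      offsetof(ngx_http_html_head_filter_loc_conf_t, insert_text),
      NULL },                                  /* no post handler        */

      ngx_null_command
};
```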

ngx_http_html_head_filter_commands[ ] is an array of ngx_command_t; it holds a single directive for our filter module and is terminated by ngx_null_command.

The directive that is defined is "html_head_filter". The following describes its individual fields.

  • Its first field is simply the directive name, an ngx_str_t, "html_head_filter".
  • The second field is a bitmask that defines where this directive can occur in the nginx configuration file (NGX_HTTP_LOC_CONF) and the number of arguments (NGX_CONF_1MORE) that it takes. In our case, we specify that this directive can occur in the location context of the nginx configuration file and takes one or more arguments. The argument is a string, the text to be inserted after the <head> tag.
  • The third field is the handler function that is called to read in our directive and set its argument. In this case, we use some of the set functions provided by Nginx. ngx_conf_set_str_slot( ) will read a string argument and save it in our module configuration structure.
  • The fourth field, NGX_HTTP_LOC_CONF_OFFSET, tells the handler function that our module configuration structure is a location configuration.
  • The fifth field specifies the offset for saving the argument. In this case, the argument is saved in the insert_text variable of our ngx_http_html_head_filter_loc_conf_t module configuration structure.
  • The sixth field allows the specification of a post handler that can be used for further initialization of the directive argument. In our case, we are not using this and set it to NULL.

Note that the "html_head_filter" directive is required to enable the filter module. If this directive is not set in the nginx configuration, our filter module will skip processing.

The module context, ngx_http_html_head_filter_ctx, sets three function handlers. The following shows the code listing.
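The listing is missing here; based on the three handlers named below and the standard ngx_http_module_t layout, it would be close to:

```c
static ngx_http_module_t ngx_http_html_head_filter_ctx = {
    NULL,                                /* preconfiguration              */
    ngx_http_html_head_init,             /* postconfiguration             */
    NULL,                                /* create main configuration     */
    NULL,                                /* init main configuration       */
    NULL,                                /* create server configuration   */
    NULL,                                /* merge server configuration    */
    ngx_http_html_head_create_conf,      /* create location configuration */
    ngx_http_html_head_merge_loc_conf    /* merge location configuration  */
};
```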

ngx_http_html_head_init( ) is used for initializing the module after configuration is done and ngx_http_html_head_create_conf( ) is for creating the module configuration structure. ngx_http_html_head_merge_loc_conf( ) function is used for merging configuration directives from parent location contexts in the nginx configuration file.

More details of these 3 functions are provided below.

The ngx_http_html_head_init( ) function initializes the module and registers our handlers in the filter chain. This function is set in the post configuration field of the module context earlier. Nginx will call it after the configuration has been read.

The module's header filter and body filter handler functions are assigned to the global ngx_http_top_header_filter and ngx_http_top_body_filter pointers respectively. Nginx will call these and hence invoke our filter handlers.

The original function handlers in these 2 global pointers are saved in ngx_http_next_header_filter and ngx_http_next_body_filter respectively. When our module completes its work, it will in turn call these saved function handlers. This establishes the Nginx filter chain, enabling one filter to call the next until the last in the filter chain.

The following shows the source code for the ngx_http_html_head_init( ) function.
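The source listing did not survive extraction; reconstructed from the description above, this is the standard registration idiom for nginx filter modules:

```c
static ngx_int_t
ngx_http_html_head_init(ngx_conf_t *cf)
{
    /* save the current heads of the two filter chains ...   */
    ngx_http_next_header_filter = ngx_http_top_header_filter;
    ngx_http_next_body_filter = ngx_http_top_body_filter;

    /* ... and install our handlers at the head of each one  */
    ngx_http_top_header_filter = ngx_http_html_head_header_filter;
    ngx_http_top_body_filter = ngx_http_html_head_body_filter;

    return NGX_OK;
}
```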

The following shows the code snippets for the ngx_http_html_head_create_conf( ) and ngx_http_html_head_merge_loc_conf( ) functions.

The ngx_http_html_head_create_conf( ) function creates our module configuration structure for saving our directives. The ngx_http_html_head_merge_loc_conf( ) function merges directives that appear in parent locations with those appearing in child locations.

The array of module directives, the module context and module type are specified in the ngx_module_t structure. This is the module definition discussed in the earlier section. The following shows the code.
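The listing was lost in extraction; following the standard ngx_module_t layout, the definition would be close to:

```c
ngx_module_t ngx_http_html_head_filter_module = {
    NGX_MODULE_V1,
    &ngx_http_html_head_filter_ctx,        /* module context    */
    ngx_http_html_head_filter_commands,    /* module directives */
    NGX_HTTP_MODULE,                       /* module type       */
    NULL,                                  /* init master       */
    NULL,                                  /* init module       */
    NULL,                                  /* init process      */
    NULL,                                  /* init thread       */
    NULL,                                  /* exit thread       */
    NULL,                                  /* exit process      */
    NULL,                                  /* exit master       */
    NGX_MODULE_V1_PADDING
};
```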

The following shows the code listing for the ngx_http_html_head_header_filter() function. This is the handler that was registered earlier by the module initialization function. It processes the incoming HTTP response headers, does some checks, and initializes the module's per request/response context for managing state.

If some of the checks fail, the context will not be created and the current response headers will be passed unmodified to the next headers filter handler. Examples of failing checks include the "html_head_filter" directive not being set, or the HTTP response being compressed.

The following is the code listing for the ngx_http_html_head_body_filter() function. Like the header filter handler, this function is registered by the module initialization function.

Notice that the code follows the logical flow diagram. Text insertion, though, is done while processing each buffer, once the <head> tag is found; so in a single pass, the buffers will have been changed.

The while loop on line 67 iterates through the incoming chain of buffers and calls the ngx_parse_buf_html( ) function to parse each buffer for the <head> tag. The <head> tag can be split over two or more consecutive buffers; the parser, through the use of the stack, can handle and track this easily.

If the <head> tag is found, the found flag in the module per request/response context is set and ngx_html_insert_output( ) function is called. ngx_html_insert_output( ) will insert our text after the <head> tag. The process for doing this is described in the earlier Design and Approach section. The text insertion is done in a single pass of the incoming buffers chain.

If the <head> tag is not found within the first 256 characters, the last_search flag is set in the per request/response context. This stops ngx_parse_buf_html( ) from being called on subsequent buffers, improving performance.

The found flag also prevents ngx_parse_buf_html( ) from being called on subsequent buffers once the <head> tag is found. It also ensures that the text will only be inserted once, after the occurrence of the first <head> tag, even if there are multiple <head> tags in a response body. The while loop builds the output chain that will be passed to the next nginx filter.

The ngx_http_next_body_filter() function is called once our filter has done its work.

Let's run through how the filter module actually handles the incoming buffers chain of the response body.

ctx->in and ctx->out are both pointers to ngx_chain_t. ctx->last_out is a pointer to a pointer to ngx_chain_t. When our response body handler, ngx_http_html_head_body_filter( ), is called, it is passed an incoming linked list of ngx_chain_t containing the buffers storing the response content. This linked list is copied to ctx->in. From that point on, our filter module works on its own linked list, ctx->in.

The copying is done because our filter module may be replacing the buffers in the linked list of ngx_chain_t. It helps ensure the structures used by prior modules are not accidentally modified by our filter module. The input chain of buffers in ctx->in is then processed and placed in ctx->out. ctx->out points to the head of the linked list of ngx_chain_t containing the buffers to be sent out.

To facilitate the placement of processed buffers into ctx->out, the pointer to pointer, ctx->last_out, is used. ctx->last_out is initialized to the address of ctx->out, the head of the output list, in the ngx_http_html_head_header_filter( ) function. As buffer chains are added to ctx->out, ctx->last_out is updated to the address of the next chain.

ctx->last_out always points to the address of the next output chain. When the output chain is sent out to the next filter, ctx->last_out is reinitialized back to the address of ctx->out. When new buffer chains are available for our filter to process, ctx->last_out will be ready to add them to ctx->out.

The following lists the code for the ngx_parse_buf_html() function.

The function goes through the character stream in a buffer and looks for the four tokens <, ", ', and >. The < token indicates a starting html tag: the stack is initialized and the token is pushed onto it. Subsequent characters that are not a token are pushed onto the stack. If a double or single quote is encountered, the toggling flag for the respective quote is set. Any > that comes after either quotation mark will not be interpreted as an html ending tag, and any < that comes after a quotation mark will not be interpreted as a start tag.

The relevant quotation flags are reset when a second double quote or single quote is encountered. A subsequent > will then be treated as an end tag. The parser will then call the function ngx_process_tag() to check if the html tag in the stack is a <head>. Leading and trailing spaces in the tag are ignored and the check is case insensitive. However, the <head> tag cannot contain attributes.

Some examples will make this clearer. <   HeAD> is considered valid, while <Head id=1> is invalid. The parser function returns NGX_OK if a valid <head> tag is found, NGX_AGAIN to indicate that processing can continue with subsequent buffers, and NGX_ERROR if an error occurs. When the maximum character limit of 256 is reached, the parser returns HF_LAST_SEARCH.

We will list one more function, ngx_html_insert_output(), which inserts our text into the buffer chains. The following is the code snippet for ngx_html_insert_output().

The insert text function splits the input buffer where the <head> tag is found into either two or three buffers, with the text inserted. The process is illustrated earlier in the Design and Approach section. If the current input buffer only has content up to the <head> tag, then our text can be inserted directly as a new buffer after the input buffer. In this case, the result is two buffers.

Alternatively, if the current input buffer has content after the <head> tag, the input buffer is split into three buffers: the first holds the content up to and including the <head> tag, the second is our inserted text, and the third holds the content after the <head> tag.

The new set of buffers is then incorporated into the output chain by the while loop in the function handler, ngx_http_html_head_body_filter(). If the original buffer is marked with the recycled flag, it is consumed. This is done by setting the start position of the buffer content equal to its last content position. The recycled flag indicates that the buffer has to be consumed as soon as possible, so that it can be reused.

There are a couple of other functions and code snippets not covered in this implementation section, such as the functions for handling the parser stack and the ngx_process_tag() function. Refer to the github link below for the full source code.

Let's proceed to compile and test the html head filter module. Create a working directory, Build-Module, to hold the required source files. The filter module source code can be obtained from the github repository. On an Ubuntu Linux system with git installed, the following commands can be used.
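A minimal sketch of these commands; the repository URL is the Github link given at the end of the article:

```shell
# Create the working directory and fetch the module source
mkdir Build-Module
cd Build-Module
git clone https://github.com/ngchianglin/NginxHtmlHeadFilter.git
```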

To verify the signature of the git download, refer to these instructions. Let's also do a quick static analysis of the module's source code to make sure there are no major vulnerabilities, such as buffer overflows. On Ubuntu, we can install cppcheck.
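One possible way to run this check; the cppcheck flags here are an illustrative choice, not necessarily the article's exact invocation:

```shell
# Install the static analyzer
sudo apt-get install -y cppcheck
# Scan the module source for common C defects
cppcheck --enable=warning,performance,portability NginxHtmlHeadFilter/
```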

Good, our module code doesn't have any glaring issues that the cppcheck analyzer can find. We can proceed to download the other packages that are required. Change our directory back to Build-Module.

The filter module works with the latest stable Nginx 1.18.0. Download the latest stable nginx source code from the official Nginx download page. We are going to download OpenSSL 1.1.1h, zlib 1.2.11 and PCRE 8.44 as well.

Verify the integrity of the downloads with either the SHA-256 checksum or the gpg signature provided on each package's website. The following lists the sha256 checksums of the packages.
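The published checksum values themselves are omitted here; compare the computed digests against those listed on each project's official site. A sketch:

```shell
# Compute the digests of the downloaded tarballs
sha256sum nginx-1.18.0.tar.gz openssl-1.1.1h.tar.gz \
          zlib-1.2.11.tar.gz pcre-8.44.tar.gz

# Or, with the official values saved into checksums.txt:
# sha256sum -c checksums.txt
```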

Extract these tarballs in the Build-Module directory and issue the following commands to configure Nginx. The options include hardening flags to ensure a hardened binary.
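A sketch of a configure invocation along these lines; the exact option set and hardening flags used by the article may differ:

```shell
cd nginx-1.18.0
./configure \
    --prefix=/usr/local/nginx \
    --user=nginx --group=nginx \
    --with-http_ssl_module \
    --with-openssl=../openssl-1.1.1h \
    --with-zlib=../zlib-1.2.11 \
    --with-pcre=../pcre-8.44 \
    --add-module=../NginxHtmlHeadFilter \
    --with-cc-opt='-O2 -fstack-protector-strong -D_FORTIFY_SOURCE=2' \
    --with-ld-opt='-Wl,-z,relro -Wl,-z,now'
```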

The configure command above will create a Makefile in the objs directory. Proceed to build the binary and install it into /usr/local/nginx.
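The build and install steps are the usual ones:

```shell
# Build the hardened binary and install it under /usr/local/nginx
make
sudo make install
```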

We can tar and gzip the compiled nginx package and move it to our server machine for testing. As a security measure and best practice, the server doesn't have gcc or other compiler tools installed. We compile the code on a separate workstation that has the same architecture and OS as the server, then copy the compiled package to the server using sftp or scp.
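For example, along these lines; the archive name and the server hostname are illustrative:

```shell
# Package the installed tree
cd /usr/local
sudo tar -czf /tmp/nginx-headfilter.tar.gz nginx

# Copy it to the server over scp
scp /tmp/nginx-headfilter.tar.gz user@server.example:/tmp/
```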

On the server, extract the nginx binary package to /usr/local/nginx. Ensure that the ownership and permissions on this location are secure. The Apache web server serves the main website on this machine. It listens locally (127.0.0.1) on port 80 and does not accept any external network traffic.

Nginx will be configured as a reverse proxy in front of the Apache web server: Nginx accepts external network traffic and forwards it to Apache. Refer to the earlier section, Design and Approach, for a big-picture view of the deployment architecture.

Nginx is run using the nginx user and group. The following commands create the user and group, as well as the directories used by Nginx.
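A sketch of such commands; the temp directory names are illustrative:

```shell
# System account and group for running nginx (no login shell)
sudo groupadd -r nginx
sudo useradd -r -g nginx -s /usr/sbin/nologin -d /usr/local/nginx nginx

# Directories used by nginx at runtime
sudo mkdir -p /usr/local/nginx/logs \
              /usr/local/nginx/client_body_temp \
              /usr/local/nginx/proxy_temp
```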

Let's do some additional hardening of the /usr/local/nginx location.
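For instance, restricting ownership and permissions roughly as follows; the exact scheme used by the article may differ:

```shell
# Root owns the tree; the nginx group may read; others get nothing
sudo chown -R root:nginx /usr/local/nginx
sudo chmod -R o-rwx /usr/local/nginx

# Runtime log/temp directories must be writable by the nginx user
sudo chown nginx:nginx /usr/local/nginx/logs \
                       /usr/local/nginx/client_body_temp \
                       /usr/local/nginx/proxy_temp
```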

Open the nginx configuration file located at /usr/local/nginx/conf/nginx.conf and fill in the following settings. Note that these configuration settings are for nighthour.sg. Replace the IP address, server name, ssl certificates, etc. with settings that are relevant for your test environment. Testing should be done on a non-production system.

The configuration above sets up Nginx to listen on the public IP address on ports 80 and 443. The server block at port 80 redirects HTTP requests to HTTPS on port 443. In the server block for port 443 (HTTPS), proxy_pass to http://127.0.0.1 is configured; this is where the Apache web server is listening for traffic.

We also turn on the Html Head filter module by setting the html_head_filter directive, with its argument string, in the location block.

This argument string is the text to be inserted after the <head> tag in the HTTP response body from the Apache web server. Here it is a script tag referencing a monitoring JavaScript, mymonitor.js. This script tag will be inserted into the HTTP response body.
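A hypothetical sketch of the relevant server blocks; the certificate paths and the exact directive argument are assumptions, while the html_head_filter directive name, the redirect, and the proxy_pass target come from the description above:

```nginx
# Port 80: redirect all HTTP requests to HTTPS
server {
    listen 80;
    server_name nighthour.sg;
    return 301 https://$host$request_uri;
}

# Port 443: TLS termination, filter insertion, reverse proxy to Apache
server {
    listen 443 ssl;
    server_name nighthour.sg;

    ssl_certificate     /usr/local/nginx/conf/certs/fullchain.pem;
    ssl_certificate_key /usr/local/nginx/conf/certs/privkey.pem;

    location / {
        # Insert the monitoring script after the html <head> tag
        html_head_filter '<script src="/mymonitor.js"></script>';
        proxy_pass http://127.0.0.1;
    }
}
```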

Start up Nginx with the following command.
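With the default binary location from the build above:

```shell
sudo /usr/local/nginx/sbin/nginx

# Verify the master and worker processes are up
ps -ef | grep '[n]ginx'
```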

Access a page on the website using your favourite web browser and view the page source. The monitoring script should be inserted.

Nginx Html head filter module script insertion

Other tests can include html pages with multiple <head> tags (the monitoring script should be inserted only once), head tags with leading/trailing spaces and a mix of upper/lower case, a PHP script dynamically generating html content, and a 404 not found error page (the monitoring script should not be inserted). The Html Head filter module should handle all these cases properly.

When all the tests are done and the results meet expectations, the filter module can be deployed to production. The filter module is actually deployed on nighthour.sg, inserting the monitoring script into the web pages here.

Previous versions of this filter had more features: for example, sending a blank page (blocking) when the <head> tag is not found, a logging mode that allows content to pass through unmodified, the avoidance of HTTP chunked transfer encoding, and a size limit of 10MiB for static content.

Some of these features, such as sending a blank page when <head> is not found, can be useful. However, together they made the module complex and its behaviour harder to reason about. There were also issues with the trick for avoiding chunked transfer encoding. A simple module had become far more complicated than necessary.

A good program has to be as simple as possible while still getting its job done. In this case, the job is really about inserting a text string after the html <head> tag.

If features like blocking, additional content size limits, or avoidance of chunked transfer encoding are needed, it is far better to implement them as separate customized versions, each built for a specific purpose. This avoids combining many feature variations into one module, making its behaviour easier to grasp and reason about. It also improves performance and reduces bugs.

The blocking feature, though, is useful from a security perspective. For example, there can be cases where a monitoring script has to be present in all html pages. In such cases, html pages that don't have a <head> tag can be blocked, since the monitoring script can't be inserted.

A customized version of this module that will send a blank page if the <head> tag is not found within the first 256 characters is available at the following github link.

Take note that the customized version should not be installed together with the non-blocking version on the same nginx instance.

This article runs through the design and implementation of a simple nginx filter module that inserts a text string into the http response body, after the html <head> tag. The code implementation doesn't exactly follow the nginx coding convention; it follows the author's own style.

Nginx has its own recommended coding convention, documented in the Nginx development guide. For those attempting to write nginx modules, it is good to follow it. I may reformat this code in the future to follow the nginx convention.

Nginx is a high-performance web server and reverse proxy that is highly extensible. It can serve as a Web Application Firewall (WAF) through modules such as ModSecurity and NAXSI, or even act as an application server through projects such as Openresty. Learning to write an Nginx module allows an IT professional to learn more about the internals of this flexible web infrastructure that is gaining wide usage.

The knowledge gained can benefit developers, infrastructure engineers, security engineers/professionals and even system administrators who code.

  • Emiller's Guide To Nginx Development , by Evan Miller. A useful beginner tutorial on developing nginx modules.
  • Nginx Development Guide , The official guide for Nginx developers.
  • Nginx Wiki , contains more resources for extending Nginx.
  • Nginx Source Code Browser , browse through the Nginx source code, search for identifiers, functions, nginx data structures etc... This is an important resource for nginx module development.
  • NGINX Tutorial: Developing Modules , by Aaron Bedra.
  • agentzh's Nginx Tutorials , by agentzh.
  • Catch Body Filter Example , a body filter example at Nginx Wiki.
  • Alibaba Nginx footer filter , offers an example of adding a footer to HTTP responses.
  • A Hello World Nginx Module , a simple Hello world example that is easy to understand.
  • Openresty Project , A Web Platform that is based on Nginx.
  • A Nginx substitution module , by Weibin Yao. Useful for learning how to buffer HTTP response content and processing it using PCRE regular expression.
  • A Nginx Content Filter , an Nginx content filter module forked from Weibin Yao's substitution filter. Instead of substituting content, it blocks an entire web page when specific content is detected using PCRE regular expressions. This can be useful for filtering outbound content from a website and blocking sensitive information.
  • Developing an Nginx URL Whitelisting Module , an article on how to write and develop an Nginx module that can restrict access to a website or web application through the use of a URL whitelist. The module blocks all access by default with an HTTP 404 error; only URLs that are specifically whitelisted can be accessed.
  • Intro to Nginx Module Development , an introduction to Nginx module development in Chinese. The article has a clear introduction and diagrams on the basics of nginx modules.
  • Writing Nginx Modules , a useful set of slides on writing nginx modules.
  • TEngine book on Nginx Development , a comprehensive book and guide on Nginx development. The book is in Chinese.
  • Nginx Module Guide , a guide on developing Nginx modules in English. It covers both nginx module handlers and module filters.
  • The Architecture of Open Source Applications - nginx , by Andrew Alexeev. An article on the design and architecture of nginx including how some of its internals work.
  • NAXSI , Nginx Anti XSS & SQL Injection, a fast and simple web application firewall module for Nginx.

The full source code for the Nginx Html Head Filter is available at the following Github link. https://github.com/ngchianglin/NginxHtmlHeadFilter

A customized version that will send an empty page if the <head> tag is not found is available at https://github.com/ngchianglin/NginxHtmlHeadBlankFilter

If you have any feedback, comments, corrections or suggestions to improve this article, you can reach me via the contact/feedback link at the bottom of the page.

Article last updated on Nov 2020.

Tutorial: Configure OpenTelemetry for Your Applications Using NGINX

By Akash Ananthanarayanan

If you’re looking for a tool to trace web applications and infrastructure more effectively, OpenTelemetry might be just what you need. By instrumenting your NGINX server with the existing OpenTelemetry NGINX community module you can collect metrics, traces, and logs and gain better visibility into the health of your server. This, in turn, enables you to troubleshoot issues and optimize your web applications for better performance. However, this existing community module can also slow down your server’s response times due to the performance overhead it requires for tracing. This process can also consume additional resources, increasing CPU and memory usage. Furthermore, setting up and configuring the module can be a hassle.

NGINX has recently developed a native OpenTelemetry module, ngx_otel_module , which revolutionizes the tracing of request processing performance. The module utilizes telemetry calls to monitor application requests and responses, enabling enhanced tracking capabilities. The module can be conveniently set up and configured within the NGINX configuration files, making it highly user-friendly. This new module caters to the needs of both NGINX OSS and NGINX Plus users. It supports W3C context propagation and OTLP/gRPC export protocol, rendering it a comprehensive solution for optimizing performance.

The NGINX-native OpenTelemetry module is a dynamic module that doesn’t require any additional packaging with NGINX Plus. It offers a range of features, including the API and key-value store modules. These features work together to provide a complete solution for monitoring and optimizing the performance of your NGINX Plus instance. By using ngx_otel_module , you can gain valuable insights into your web application’s performance and take steps to improve it. We highly recommend exploring ngx_otel_module to discover how it can help you achieve better results.

Note: You can head over to our GitHub page for detailed instructions on how to install ngx_otel_module and get started.

Tutorial Overview

In this blog, you can follow a step-by-step guide on configuring OpenTelemetry in NGINX Plus and using the Jaeger tool to collect and visualize traces. OpenTelemetry is a powerful tool that offers a comprehensive view of a request’s path, including valuable information such as latency, request details, and response data. This can be incredibly useful in optimizing performance and identifying potential issues. To simplify things, we have set up the OpenTelemetry module, application, and Jaeger all in one instance, which you can see in the diagram below.

Follow the steps in these sections to complete the tutorial:

  • Deploy NGINX Plus and Install the OpenTelemetry Module
  • Deploy Jaeger and the Echo Application
  • Configure OpenTelemetry in NGINX for Tracing
  • Test the Configuration

Prerequisites

  • A Linux/Unix environment, or any compatible environment
  • An NGINX Plus subscription
  • Basic familiarity with the Linux command line and JavaScript
  • Node.js 19.x or later

Selecting an appropriate environment is crucial for successfully deploying an NGINX instance. This tutorial will walk you through deploying NGINX Plus and installing the NGINX dynamic modules.

  • Install NGINX Plus on a supported operating system .
  • Install ngx_otel_module . Add the dynamic module to the NGINX configuration directory to activate OpenTelemetry:

load_module modules/ngx_otel_module.so;

  • Reload NGINX to enable the module:

nginx -t && nginx -s reload

There are various options available to view traces. This tutorial uses Jaeger to collect and analyze OpenTelemetry data. Jaeger provides an efficient and user-friendly interface to collect and visualize tracing data. After data collection, you will deploy mendhak/http-https-echo , a simple Docker application. This application echoes the request attributes back in JSON format.

  • To install the Jaeger all-in-one tracing and the http-echo application, run this command:

docker-compose up -d

  • Run the docker ps -a command to verify that the containers are running.

You can now access Jaeger by simply typing in the http://localhost:16686 endpoint in your browser. Note that you might not be able to see any system trace data right away as it is currently being sent to the console. But don’t worry! We can quickly resolve this by exporting the traces in the OpenTelemetry Protocol (OTLP) format. You’ll learn to do this in the next section when we configure NGINX to send the traces to Jaeger.

This section will show you step-by-step how to set up the OpenTelemetry directive in NGINX Plus using a key-value store. This powerful configuration enables precise monitoring and analysis of traffic, allowing you to optimize your application’s performance. By the end of this section, you will have a solid understanding of utilizing the NGINX OpenTelemetry module to track your application’s performance.

Setting up and configuring telemetry collection is a breeze with NGINX configuration files. With ngx_otel_module , users can access a robust, protocol-aware tracing tool that can help to quickly identify and resolve issues in applications. This module is a valuable addition to your application development and management toolset and will help you enhance the performance of your applications. To learn more about other OpenTelemetry sample configurations, please refer to the ngx_otel_module documentation .

OpenTelemetry Directives and Variables

NGINX has new directives that can help you achieve an even more optimized OpenTelemetry deployment, tailored to your specific needs. These directives were designed to enhance your application’s performance and make it more efficient than ever.

Module Directives:

  • otel_exporter – Sets the parameters for OpenTelemetry data, including the endpoint , interval , batch size , and batch count . These parameters are crucial for the successful export of data and must be defined accurately.
  • otel_service_name – Sets the service name attribute for your OpenTelemetry resource to improve organization and tracking.
  • otel_trace – To enable or disable OpenTelemetry tracing, you can now do so by specifying a variable. This offers flexibility in managing your tracing settings.
  • otel_span_name – The name of the OpenTelemetry span is set as the location name for a request by default. It’s worth noting that the name is customizable and can include variables as required.

Configuration Examples

Here are examples of ways you can configure OpenTelemetry in NGINX using the NGINX Plus key-value store. The NGINX Plus key-value store module offers a valuable use case that enables dynamic configuration of OpenTelemetry span and other OpenTelemetry attributes, thereby streamlining the process of tracing and debugging.

This is an example of dynamically enabling OpenTelemetry tracing by using a key-value store:

Next, here’s an example of dynamically disabling OpenTelemetry tracing by using a key-value store:

Here is an example NGINX OpenTelemetry span attribute configuration:
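The examples described above might look roughly like the following sketch, assuming a key-value zone named otel_kv with keys otel_trace and otel_span_attr; all names, ports, and the endpoint here are illustrative, not the blog's exact configuration:

```nginx
http {
    # Shared zone backing the runtime switches
    keyval_zone zone=otel_kv:1m;

    # Map the stored keys to variables the otel directives can consume
    keyval "otel_trace" $otel_switch zone=otel_kv;
    keyval "otel_span_attr" $span_attr_value zone=otel_kv;

    otel_exporter {
        endpoint localhost:4317;   # Jaeger's OTLP/gRPC port
    }
    otel_service_name nginx-demo;

    server {
        listen 4000;

        # Setting the otel_trace key to "on" or "off" through the
        # key-value API enables or disables tracing without a reload
        otel_trace $otel_switch;

        location /city {
            otel_span_name otel;
            otel_span_attr demo $span_attr_value;
            proxy_pass http://127.0.0.1:8080;
        }
    }
}
```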

To save the configuration and restart NGINX, input this code:

nginx -s reload

Lastly, here is how to add span attribute in NGINX Plus API:
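A sketch of such an API call, assuming a key-value zone named otel_kv and the NGINX Plus API enabled at /api; the API version number and key names are assumptions:

```shell
# Create the span-attribute key (POST adds a new key-value pair)
curl -X POST -d '{"otel_span_attr":"OTel"}' \
     http://localhost/api/9/http/keyvals/otel_kv

# Later, modify it in place (PATCH updates an existing key)
curl -X PATCH -d '{"otel_span_attr":"demo"}' \
     http://localhost/api/9/http/keyvals/otel_kv
```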

Now, you can test your configuration by following the steps below.

$ curl -i localhost:4000/city

The output will look like this:

  • Now you want to ensure that the OTLP exporter is functioning correctly and that you can gain access to the trace. Start by opening a browser and accessing the Jaeger UI at http://localhost:16686 . Once the page loads, click on the Search button, located in the title bar. From there, select the service that starts with NGINX from the drop-down menu in the Service field. Then select the operation named Otel from the drop-down menu called Operation . To make it easier to identify any issues, click on the Find Traces button to visualize the trace.
  • To access a more detailed and comprehensive analysis of a specific trace, click on one of the individual traces available. This will provide you with valuable insights into the trace you have selected. In the trace below, you can review both the OpenTelemetry directive span attribute and the non-directive of the trace, allowing you to better understand the data at hand.

Under Tags you can see the following attributes:

  • demo – OTel – OpenTelemetry span attribute name
  • http.status_code field – 200 – Indicates the request was successful
  • otel.library.name – nginx – OpenTelemetry service name

NGINX now has built-in support for OpenTelemetry, a significant development for tracing requests and responses in complex application environments. This feature streamlines the process and ensures seamless integration, making it much easier for developers to monitor and optimize their applications.

Although the OpenTracing module that was introduced in NGINX Plus R18 is now deprecated and will be removed starting from NGINX Plus R34, it will still be available in all NGINX Plus releases until then. However, it’s recommended to use the OpenTelemetry module, which was introduced in NGINX Plus R29 .

If you’re new to NGINX Plus, you can start your 30-day free trial today or contact us to discuss your use cases .


"This blog post may reference products that are no longer available and/or no longer supported. For the most current information about available F5 NGINX products and solutions, explore our NGINX product family . NGINX is now part of F5. All previous NGINX.com links will redirect to similar NGINX content on F5.com."

Module ngx_http_stub_status_module

The ngx_http_stub_status_module module provides access to basic status information.

This module is not built by default, it should be enabled with the --with-http_stub_status_module configuration parameter.

Example Configuration

location = /basic_status {
    stub_status;
}

This configuration creates a simple web page with basic status data which may look as follows:

Active connections: 291
server accepts handled requests
 16630948 16630948 31070465
Reading: 6 Writing: 179 Waiting: 106

The basic status information will be accessible from the surrounding location.

In versions prior to 1.7.5, the directive syntax required an arbitrary argument, for example, “ stub_status on ”.

The following status information is provided:

  • Active connections – the current number of active client connections, including Waiting connections.
  • accepts – the total number of accepted client connections.
  • handled – the total number of handled connections; normally equal to accepts unless resource limits were reached.
  • requests – the total number of client requests.
  • Reading – the current number of connections where nginx is reading the request header.
  • Writing – the current number of connections where nginx is writing the response back to the client.
  • Waiting – the current number of idle client connections waiting for a request.

Embedded Variables

The ngx_http_stub_status_module module supports the following embedded variables (1.3.14):

  • $connections_active – same as the Active connections value.
  • $connections_reading – same as the Reading value.
  • $connections_writing – same as the Writing value.
  • $connections_waiting – same as the Waiting value.

