I have loved developing code since I was 13, and I am now 38. Development is my life; I have coded in pretty much every one of the most common languages. For me, development is a serious business.
Coming from a computer games development world and having a strong background in assembler, C and C++, I have always disliked web scripting languages. I love developing 3D engines in C or C++, and I have done it. I love writing C daemons on Linux, automation routines for my Linux machines, my own containerization solution and many other deep coding solutions. I simply love it. I hate writing PHP, JavaScript, Django and other web scripts. I am not tailored for it. I simply hate them. Simple as that.
Anyway, my main duty in the role I currently cover is to deal with 777 developers, people coming from a pure web scripting background. It is very difficult to deal with them. They believe they are master developers just because they know how to copy and paste pieces of code into already available CMSs.
Anyway, this is today's world. Some of them are very good (and those friends are not 777 developers); others are terrible.
Who is a 777 developer? A 777 developer is the typical guy who comes to you with his own application and asks you to provide hosting for it. He just tells you which stack he needs, and you are supposed to do the hard work.
The most typical problem with these guys is making them understand that security is an important topic in a production environment. When they give you a package and it does not work, they just tell you that "it works wonderfully on my workstation". That is the moment you understand that the pal is a 777 developer.
A 777 developer is a person who sets the entire nginx or Apache directory tree with a chmod -R 777 and feels happy: it works! Who cares that this is the most stupid way to do it. It works! It doesn't work in the production environment? The fault is yours, since you don't know how to do stuff right. On my personal EC2 instance it runs very well, in my Docker container it works like a charm, on my workstation it is perfect. You see? It is your fault. The secret is the great 777 command!
I do not expect to be working with the best developers on the market. They pretend to be the best, but you can tell they are just pretending with a simple look at them. In the end, if you are just configuring a CMS there is a reason for it, but I do not expect to be blamed if an application does not work.
A 777 developer always hides behind the devops trend. In fact, it is your fault if the application doesn't run: you do not understand the devops principles.
...it is so difficult sometimes to keep a professional approach with a 777 developer. Anyway, I think we are paid to do it, and we need to do it.
A blog to discuss Nginx Web server internals and general administration.
Wednesday, June 10, 2015
Present sorry page for everyone except you
During maintenance operations on an application's back-end systems, we usually present the so-called sorry page to end users. This is a nice way to inform them that something is going on, in particular on small systems that do not have a multi-node setup.
A typical page is the one saying "I am sorry, the system is under maintenance, it will be back soon".
I usually configure the sorry page at the reverse proxy level (nginx), since it is the backend system that is going up and down most of the time.
The way I do it is the following:
- Create the HTML of the sorry page and put it on the local file system, e.g. /var/whatYouPrefer/www/sorry_page/index.html
- Comment out the proxy_pass directive
- Define a new document root for the website pointing to the sorry page's location
root /var/whatYouPrefer/www/sorry_page/;
At this point, everyone will get the sorry page in place of a proxy service error.
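Putting it together, a minimal server block sketch could look like the following (the server name and paths are just placeholders, adapt them to your setup):

server {
    listen 80;
    server_name www.example.com;                 # placeholder name

    # proxy_pass to the backend, commented out during the maintenance window
    # location / {
    #     proxy_pass http://remoteServer;
    # }

    root  /var/whatYouPrefer/www/sorry_page/;
    index index.html;

    # serve the sorry page for every requested URI, not only for /
    location / {
        try_files $uri /index.html;
    }
}

The try_files line is optional; without it, deep links would get a 404 instead of the sorry page.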
This is fine, basic and easy. Everyone knows it. But what about your own access? What if you cannot reach the HTTP service on the application server directly, because of a local firewall configuration (in particular, if you perform SSL termination you shouldn't allow connections from anything other than your reverse proxy) or because of a name-based virtual hosting limitation?
The best way is to create a conditional rule that provides the sorry page to everyone except your IP address.
The way I usually achieve this is:
if ( $remote_addr ~* yourIPAddress) {
proxy_pass http://remoteServer;
}
root /var/whatYouPrefer/www/sorry_page/;
The result of this is obvious: if your IP matches the one in the if condition, Nginx will proxy the backend service to you. If it doesn't, Nginx will serve the sorry page.
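For completeness, here is a minimal sketch of how I would wrap it in a full server block. Note that proxy_pass is only allowed inside an if when the if sits inside a location, so the snippet above is assumed to live in location / (the server name and IP are placeholders):

server {
    listen 80;
    server_name www.example.com;                 # placeholder name

    root  /var/whatYouPrefer/www/sorry_page/;
    index index.html;

    location / {
        # only requests coming from your IP reach the backend
        if ( $remote_addr ~* yourIPAddress ) {
            proxy_pass http://remoteServer;
        }
        # everyone else falls through to the static sorry page under root
    }
}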
Nginx processes' users
Many people get confused about user ownership of the nginx processes.
Most people believe that Nginx runs as root (oh my god), and some others believe that Nginx runs entirely as the nobody user.
Now, let's make a distinction between the master and the worker processes.
Master process
The master process runs as the user that launched the service, typically root. Why root? Root is generally used in order to be able to bind sockets to port numbers below 1024 (privileged ports); unprivileged users cannot bind ports below 1024.
In general, the master process has the following tasks (http://www.aosabook.org/en/nginx.html):
- reading and validating configuration
- creating, binding and closing sockets
- starting, terminating and maintaining the configured number of worker processes
- reconfiguring without service interruption
- controlling non-stop binary upgrades (starting new binary and rolling back if necessary)
- re-opening log files
- compiling embedded Perl scripts
The master process starts as root simply because you start it from a root account (otherwise you would have port binding problems). You could potentially run it as a different user and configure your server to use a port number higher than 1024. It is as simple as that.
Common usage anyway (in particular for reverse proxies) is to bind the standard HTTP and HTTPS ports, 80 and/or 443. It wouldn't make much sense to have a reverse proxy listening on custom ports: technically it can be done, logically it doesn't make much sense.
So, the golden rule here is that the master process belongs to the user which started/spawned Nginx. Simple as that.
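As a quick illustration, this is a minimal sketch of an nginx.conf that could be started by an unprivileged account, assuming a hypothetical "webadm" user and a port above 1024 (the pid and error_log paths must also be writable by that user):

# started as the unprivileged "webadm" user, no root required
pid        /home/webadm/nginx.pid;
error_log  /home/webadm/nginx_error.log;

worker_processes  1;

events {
    worker_connections  1024;
}

http {
    server {
        listen 8080;                             # unprivileged port, no root needed to bind it
        root   /var/whatYouPrefer/www/;
    }
}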
Child Process(es), workers
Child processes, known as workers, are the actors which actually run your web services. While the master process is pretty much an orchestrator, the workers are the real heroes here.
One golden rule of web servers is to grant the web server's processes the minimum access level required (yes, I am talking to you, my developer friend who always uses the chmod 777 approach). Having said this, it wouldn't make much sense to run the workers as root.
What is the risk? The risk is that if someone breaches your web application or your web server, they don't need to elevate their privileges since they are already root. Attackers cannot ask for more! They will love you (and no, they won't have mercy on you!).
Now, the question is, which user is being used by workers?
The answer is simple, yet a bit more complicated than it seems.
1. If nothing is specified at compile time (configure phase) and nothing is specified in the configuration file, then the user used is "nobody".
2. If a user (and optionally a group) is specified at compile time with the --user and --group options and nothing is specified in the config file, then the user specified during the build is used.
3. If a user (and optionally a group) is specified in the config file, it will be used as the owner of the spawned workers.
In general, the priority is the following:
- Config file
- Default (the user name specified at compile time, or nobody if none was specified)
Note: a common pitfall for newcomers is to specify a user that does not exist (or to rely on a "nobody" user that doesn't exist). In that case, remember to add the user before you actually start nginx.
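To make the priority explicit, here is a small sketch of the two places where the worker user can be set (the "nginx" user and group here are just an example):

# at build time (configure phase), becomes the default if the config file says nothing:
#   ./configure --user=nginx --group=nginx ...

# in nginx.conf (main context), overrides the compile-time default:
user  nginx  nginx;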
This is how nginx looks on a typical real environment with 3 workers activated:
root 21279 0.0 0.0 109152 4136 ? Ss 14:37 0:00 nginx: master process /usr/sbin/nginx
nginx 21280 0.0 0.0 109588 5876 ? S 14:37 0:00 \_ nginx: worker process
nginx 21281 0.0 0.0 109588 5876 ? S 14:37 0:00 \_ nginx: worker process
nginx 21282 0.0 0.0 109588 5876 ? S 14:37 0:00 \_ nginx: worker process
Needless to say, to check this, use the ps auxf command.
This is very similar (in terms of users) to the Apache HTTPD server using prefork:
root 21401 6.5 0.3 417676 24852 ? Ss 14:39 0:00 /usr/sbin/httpd -DFOREGROUND
apache 21410 0.5 0.1 690288 13748 ? Sl 14:39 0:00 \_ /usr/sbin/httpd -DFOREGROUND
apache 21411 0.5 0.1 690288 13748 ? Sl 14:39 0:00 \_ /usr/sbin/httpd -DFOREGROUND
apache 21412 0.5 0.1 690288 13748 ? Sl 14:39 0:00 \_ /usr/sbin/httpd -DFOREGROUND
apache 21413 0.5 0.1 690288 13748 ? Sl 14:39 0:00 \_ /usr/sbin/httpd -DFOREGROUND
apache 21414 0.5 0.1 419840 14584 ? S 14:39 0:00 \_ /usr/sbin/httpd -DFOREGROUND
apache 21415 0.5 0.1 419840 14584 ? S 14:39 0:00 \_ /usr/sbin/httpd -DFOREGROUND
apache 21416 0.5 0.1 419840 14584 ? S 14:39 0:00 \_ /usr/sbin/httpd -DFOREGROUND
apache 21417 0.5 0.1 419840 14584 ? S 14:39 0:00 \_ /usr/sbin/httpd -DFOREGROUND
apache 21418 0.5 0.1 419840 14584 ? S 14:39 0:00 \_ /usr/sbin/httpd -DFOREGROUND
Tuesday, June 9, 2015
About Nginx number of workers
As with any web server, the tuning of the worker processes is a sort of voodoo art. Ask a hundred system administrators and you will probably get a hundred different opinions.
Before we go deep into the topic, let's understand what an nginx worker is. An nginx worker is a process in memory which "takes care" of the clients' requests. There is a minimum of one worker process in an Nginx environment. It is part of the role of a good system administrator to decide how many workers you actually need.
NGINX offers the ability to configure how many worker processes must be spawned through the core configuration parameter "worker_processes" (http://nginx.org/en/docs/ngx_core_module.html#worker_processes).
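For reference, the directive lives in the main context of nginx.conf; a couple of hedged examples:

worker_processes  4;        # explicit number of workers

# on reasonably recent nginx versions you can also let nginx match the detected CPU cores:
# worker_processes  auto;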
Now, the temptation for many newcomers is to set a very high number of worker processes, or to ignore the parameter altogether.
The result is either over-complicated process management (or not enough CPU power), or not getting the best out of the available hardware.
This is pretty much like the endless story of how many Apache workers you need to configure in a typical mpm_prefork environment. The reality is, there is no magic recipe for it.
The number of workers you should configure on your NGINX server depends on different factors:
- The role of the server
- The number of CPUs available
- The storage system landscape
- The disk I/O requirements (pay attention to caching which is disk I/O intensive)
- SSL encryption support
- Data compression support (gzip)
Why should all these factors be considered?
Let me try to explain this point by point.
The role of the server
The role of the server and the architecture of your web server solution are very important when deciding the number of workers you should configure. For instance, a stack where NGINX runs on the same machine that serves your Django application (through WSGI, for instance) is very different from a stack where NGINX runs on a separate machine and proxies the Django application. In the first case there will be competition for the cores; in the second case, the cores are free for a typical reverse proxy scenario and can pretty much all be allocated to NGINX.
It would be very inefficient to have a full LEMP (Linux, nginx, MySQL, Perl/Python/PHP) stack on a server with 4 cores and allocate 4 nginx workers. What about MySQL's needs? What about your PHP/Python needs?
The number of CPUs available
This factor is very important, since it does not make much sense to overload your system with more "loaded" processes than the number of CPUs. Nginx is very efficient at using CPUs; if a worker doesn't take up a full CPU, most probably you don't need it. Remember that Nginx workers are single-threaded processes.
The storage system landscape
Parallelism is not only related to CPUs. If your web server serves a lot of cached items, it will surely generate high I/O on your storage system. What about having the full content of your web server, OS included, on the same physical volume? Would you benefit from having many processes all demanding pretty much 100% of your single physical volume's I/O bandwidth? Think about it. Try to split the I/O load across multiple disk systems before you actually think about increasing the number of workers. If I/O is not a problem (e.g. for some proxy services), then ignore this point.
The disk I/O requirements
This is very much linked to the previous points. Disk I/O is one of the few situations where Nginx can find itself blocked, waiting. There are possible ways to work around this, but none of them is a magic recipe (see AIO and sendfile).
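As a hedged sketch, these are the kinds of directives involved; availability depends on your nginx version and build options (aio threads needs a build with thread support):

http {
    sendfile   on;         # let the kernel push file data, avoiding user-space copies
    # aio      threads;    # offload blocking file reads to a thread pool
    # directio 4m;         # bypass the page cache for files bigger than 4 MB
}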
SSL encryption support
Is your web server making heavy use of SSL? If so, remember that SSL requires additional CPU power compared to plain HTTP. Why? Check the SSL encryption algorithms supported nowadays: you will see a lot of math in there, and you will realize why we talk about CPU consumption. To make a long story short, if you use SSL, consider that the CPU usage will be higher than without it. How much higher? This depends very much on your application usage pattern, the number of GET/POST operations performed, the average size of your responses, etc.
Data compression support
Compressing responses is a very good approach to limiting the bandwidth used by your environment. It is nice and good, but it costs CPU cycles.
Every response needs to be compressed and, depending on the compression level you set, the algorithm behind it will cost you in terms of CPU consumption.
The computational cost of the gzip support is ruled by the configuration parameter "gzip_comp_level" of the ngx_http_gzip_module module (http://nginx.org/en/docs/http/ngx_http_gzip_module.html#gzip_comp_level).
It accepts a value from 1 to 9. The higher you go, the better the results (probably), and the higher the CPU load on your average transactions.
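A small example of what the trade-off looks like in the configuration (the MIME types listed here are just an example):

gzip             on;
gzip_comp_level  4;        # 1 = cheapest on CPU, 9 = best compression and highest CPU cost
gzip_types       text/css application/javascript application/json;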
Having mentioned the above points, it is quite evident that there is no magic recipe.
Let me say that a very common and easy approach is to allocate one worker per CPU. This is the basic setup, and I am pretty sure it works flawlessly in most environments.
But, as already said, this is not always the best approach. It works, but it is not the best.
For instance, on a typical reverse proxy, people tend to allocate one and a half workers per core, or even two workers per core. Let me say that if you don't do much SSL, 2 workers per core works. Why? Because most of the time NGINX will simply be waiting for the back-end systems to generate the response, and because the machine running Nginx is not busy running your application's logic, the CPU load is at a minimum.
If you do extensive caching, considering the way NGINX does it (very efficient but I/O intensive), I would go for a more conservative approach: 1 worker per core. I would then monitor the situation and, if the machine behaves well, try a 1.5 workers-per-core ratio.
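As an illustration only, on a hypothetical 4-core reverse proxy doing little SSL and no heavy caching, the reasoning above could translate into something like:

worker_processes  8;               # roughly 2 workers per core for a mostly-waiting proxy

events {
    worker_connections  1024;      # per worker; total concurrency is roughly workers * connections
}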
If you would like more information about this topic, or some advice on the number of workers you need, leave a comment on this page and I will do my best to support you.