The Provectus Perspective
The Rest of LAMP...
Daniel A. Post
The advent of the LAMP stack (Linux, Apache, MySQL, Perl/PHP) was an Internet revolution which will certainly never be forgotten for being the hammer that built the vast majority of the World Wide Web. LAMP made it relatively quick and easy for hobbyists and amateur developers, as well as ISPs and hosting companies, to deploy a working web stack. The result of this relative ease, however, is that many LAMP-based websites and corporate intranets have been poorly provisioned and deployed, often by web designers and developers who are able to get a basic LAMP stack running and hosting pages but lack knowledge of the underlying operating system, security and performance considerations, and other factors which systems architects, engineers and administrators spend their lives working with. In this article I aim to discuss some of these considerations in what I collectively refer to as The Rest of LAMP.
Whenever I am asked to get involved with a LAMP server for the purposes of performance tuning, stability troubleshooting or security issues and hardening, the first step that I take is to establish some baselines. This is necessary not only to provide a before-and-after comparison but also as a means of understanding system activity and behaviour. My standard toolkit for establishing baselines includes Tripwire (file change monitoring and alerting), awstats, SNMPd, MRTG or Cacti, Smokeping, Nagios, rsyslogd, logrotate, and log2mail or logwatch. Taking the time to implement these tools before making any further changes to the system will provide a mechanism for observing virtually every aspect of what the system is doing, and valuable metrics about performance and activity. As you make improvements to the system, these tools will provide you with hard data and visualizations which will reflect the effect of your changes.
Most Linux distributions are not well suited to hosting public websites in their default configurations. Typically they run many services which are not necessary for the task, and each of these causes two problems: it uses system resources which the database and webserver need to fulfill the primary function of the server, and more importantly it introduces a potential vector of attack. Disabling and fully removing unnecessary software from a LAMP server is a great first step; this typically requires some general knowledge of Linux and of the specific distribution being used to host the stack. I prefer to use Debian GNU/Linux and perform a bare 'netinstall' installation with no GUI or other roles selected. Some examples of services commonly found on LAMP servers which should be disabled and removed are CUPS print servers, NFS servers, and IMAP/POP3 mail services (you should keep an SMTP server such as Postfix for sending alert emails, any mail that your website may need to transmit, etc.).
Security is often a major concern on LAMP systems because of the use of Perl or PHP scripts as CGI (Common Gateway Interface) programs which execute from web pages to handle functions such as search lookups, user submissions, etc. The history of CGI-based exploits is long and dark, and continues to this day. The primary means of defence on LAMP systems is twofold: secure programming methods and diligent patching discipline. A common mistake in LAMP web development is failing to sanitize form input from websites before passing it as a query to the database, which opens a vector for SQL injection attacks against the database. Many other secure web programming standards must be observed as part of the overall security strategy for the site. I quite frequently encounter systems which are running very out-of-date software. It is especially important to keep the webserver itself and PHP, Perl, Python, etc. fully up to date, as these are the weakest links in the chain. It is an ongoing and unfortunate belief among many people that "having a firewall" will protect your system; however, nothing could be further from the truth. If your site is not sanitizing database input or handling sessions securely, or if it is running CGI which is vulnerable to exploitation, then a simple firewall will not prevent anyone from leveraging these weaknesses against the system. Application firewalls and intrusion detection/prevention systems offer some protection in this area; however, it is still critical to code securely and to keep the entire stack up to date. Finally, it is a good idea to audit the site using tools such as w3af, nmap, and OpenVAS to probe and attack it and identify weaknesses which can be improved upon.
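One common way of keeping form input from being interpreted as SQL is to pass it to the database as a bound parameter instead of building the query string by hand. The following is a minimal sketch of that idea, not a complete input-handling strategy. It uses Python's built-in sqlite3 module purely as a stand-in for MySQL, and the pages table and find_pages function are hypothetical names chosen for illustration.

```python
import sqlite3


def find_pages(conn, search_term):
    """Look up pages by title using a bound parameter, never string formatting."""
    # The ? placeholder makes the driver treat search_term purely as data.
    cursor = conn.execute(
        "SELECT title FROM pages WHERE title = ?", (search_term,)
    )
    return [row[0] for row in cursor.fetchall()]


# Hypothetical in-memory table standing in for a real site database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (title TEXT)")
conn.executemany("INSERT INTO pages VALUES (?)", [("home",), ("contact",)])

normal_result = find_pages(conn, "home")
# A classic injection string is compared literally and matches no rows.
attack_result = find_pages(conn, "x' OR '1'='1")
```

Had the query been assembled with string concatenation, the second lookup would have returned every row in the table. The same placeholder approach is available in the database libraries commonly used from PHP and Perl, although the exact syntax differs.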
The next forefront of security is not the aforementioned firewall (we'll get to that soon) but rather system administration practices. The root account on a Linux server is so commonly abused that it deserves a support foundation to battle for its cause. This account should be restricted such that it can only be used via sudo from a named user account, or logged in directly from the server console. SSH as root, or any other means of accessing the system remotely as root, should be disabled. Named user accounts provide a means of change tracking; no administrator should be logging into the system as root for normal duties. The use of strong/complex passwords, never re-using passwords, and never storing them in configuration files, etc. are all practices which are understood to be standard but in reality often are not. These things are worth doing properly; they are even more important than having a firewall.
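Whether direct root logins over SSH have actually been disabled is easy to verify from the OpenSSH server configuration. The sketch below checks the PermitRootLogin directive in the text of an sshd_config file. It is deliberately simplified: it assumes whitespace-separated directives and ignores Match blocks and included files, and the function name is hypothetical.

```python
def root_login_disabled(sshd_config_text):
    """Return True if the first PermitRootLogin directive found is 'no'."""
    for raw_line in sshd_config_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        # Keywords are case-insensitive; sshd uses the first value it obtains.
        if parts[0].lower() == "permitrootlogin":
            return len(parts) == 2 and parts[1].strip().lower() == "no"
    return False  # no explicit directive: treat the server as not hardened


hardened = root_login_disabled("# local policy\nPermitRootLogin no\n")
default_like = root_login_disabled("#PermitRootLogin no\nPort 22\n")
```

A check like this could be run against the contents of /etc/ssh/sshd_config as part of a routine audit, alongside a review of which named accounts hold sudo rights.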
Security can be compromised in a number of ways, but one of the more common means of attack against LAMP servers is the exploitation of management software which has been installed to assist developers and administrators, such as phpMyAdmin and Webmin. In an ideal scenario, these should never be installed on production LAMP servers, as they provide an excellent and often-utilized vector of attack which can result in direct root control of the entire system. If these must be used, then several steps need to be taken to ensure that they do not compromise security. Using webserver authentication and security to protect these applications, the firewall to restrict access to them, and application firewalls and IDS/IPS systems to detect abuse and take action should all be part of the considerations for adding these types of tools to a server. The implementation and usage of a private encrypted VPN such as OpenVPN for all remote systems management is an excellent mechanism for shutting down all external access to these portals and other protocols.
Firewalls. Unfortunately the popular thinking is that once a firewall is in place, everything is safe! That is not the case, but a packet-level firewall is still definitely not optional on a LAMP server. The only traffic that should be coming into these systems is HTTP/HTTPS and the aforementioned OpenVPN. iptables will satisfy this requirement nicely, and can also be configured to thwart invalid traffic, bad packet sequences, TCP scans, and other types of malicious activity with only a few simple rules. These basic rules at the beginning of an iptables script can result in substantial bandwidth savings and a reduction of load on the server. At the very least, use iptables to prevent access to all ports and protocols other than those which are required to fulfill the function of the server, and to restrict access to management protocols such as the VPN or SSH so that they are only available to the addresses of administrators.
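As an illustration of how few rules a default-deny policy needs, the sketch below assembles a list of iptables commands as plain strings. It only builds the text and does not apply anything to a running system. The function name is hypothetical, the administrator address comes from a range reserved for documentation, SSH on port 22 is an assumption, and a VPN rule is left out because its port and protocol depend on how the VPN is configured.

```python
def build_input_rules(admin_addresses, web_ports=(80, 443), ssh_port=22):
    """Return iptables commands for a default-deny INPUT policy."""
    rules = [
        # Always allow loopback traffic for local services.
        "iptables -A INPUT -i lo -j ACCEPT",
        # Drop packets that do not belong to any valid connection state.
        "iptables -A INPUT -m conntrack --ctstate INVALID -j DROP",
        # Allow replies to connections that are already established.
        "iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT",
    ]
    for port in web_ports:
        rules.append(f"iptables -A INPUT -p tcp --dport {port} -j ACCEPT")
    for address in admin_addresses:
        # Management access is limited to known administrator addresses.
        rules.append(
            f"iptables -A INPUT -p tcp -s {address} --dport {ssh_port} -j ACCEPT"
        )
    # Set the default policy last so existing sessions are not cut off mid-apply.
    rules.append("iptables -P INPUT DROP")
    return rules


rules = build_input_rules(["203.0.113.10"])
```

Printing the returned list gives a script that can be reviewed before anyone runs it, which is a sensible habit when a mistake in rule order can lock administrators out of the server.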
The iptables firewall can be taken a step further by implementing an intrusion detection and prevention mechanism. One of my favourite basic IPS solutions is fail2ban, which can be quickly and easily configured to watch system and application logs for unfriendly activity, then take action to block the offender using iptables. The types of unfriendly activity that it can detect and respond to include failed login attempts through SSH, HTTP/HTTPS, FTP, IMAP and more. Fail2ban can detect attempts to access restricted areas of a site, and attempts to leverage known exploits such as SQL injections and buffer overflows. Fail2ban can also block clients for a defined period of time (or forever) when they exceed a defined threshold of HTTP GETs, POSTs or other actions within a defined period of time. Logwatch is another log-based tool, which summarizes and reports on log activity rather than blocking it; using tools of this kind is a critical part of any LAMP stack security solution. Snort is another type of IDS, one which analyzes network traffic at the point of ingress on the server. Snort monitors raw protocol data in real-time network traffic and takes action when it detects malicious activity. This is a more sophisticated IDS/IPS mechanism, and is much faster and more scalable than log-watching tools.
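The threshold behaviour described above can be sketched as a sliding window of recent failures per client. This is not fail2ban's actual implementation, only an illustration of the idea. The class and method names are hypothetical, and the two constructor settings loosely mirror fail2ban's maxretry and findtime options.

```python
from collections import defaultdict, deque


class FailureTracker:
    """Ban a client after max_failures failures within window_seconds."""

    def __init__(self, max_failures=5, window_seconds=600):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self.failures = defaultdict(deque)  # client address -> failure times
        self.banned = set()

    def record_failure(self, client_ip, timestamp):
        """Record one failed attempt; return True if the client is now banned."""
        recent = self.failures[client_ip]
        recent.append(timestamp)
        # Drop failures that have aged out of the sliding window.
        while recent and timestamp - recent[0] > self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_failures:
            self.banned.add(client_ip)
        return client_ip in self.banned


# Three failures inside one minute trip the threshold.
burst = FailureTracker(max_failures=3, window_seconds=60)
burst_results = [burst.record_failure("198.51.100.7", t) for t in (0, 10, 20)]

# The same number of failures spread over several minutes does not.
slow = FailureTracker(max_failures=3, window_seconds=60)
slow_results = [slow.record_failure("198.51.100.7", t) for t in (0, 100, 200)]
```

In a real deployment the timestamps would come from parsed log lines, and the ban would be enforced by adding a firewall rule instead of adding an address to a set.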
Performance is quite often a big problem for LAMP stacks, and tackling the problem begins with determining where the bottlenecks are. Aside from the aforementioned trending and baselining tools, administrators are armed with many other tricks which can help to quickly narrow down the problem. Most often the issue is with the database, the webserver, or both. Regardless of the database you are using (MySQL, MariaDB, PostgreSQL, etc.), the biggest performance gains can usually be found by looking at slow-running queries and at queries which perform full table scans because they are not using indexes. Often simply creating some indexes to speed up searches can have a massive impact on the performance of a LAMP stack; this is usually one of the first places I look when troubleshooting performance. MySQL has the ability to log these types of queries with a quick configuration change, and this, combined with the usage of their MySQL Query Analyzer and MySQL Performance Tuner tools, never fails to delight and impress. Keeping databases cleaned up by purging unnecessary data is also very important and should be automated where possible.
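The difference an index makes shows up directly in the database's query plan. The sketch below uses Python's built-in sqlite3 module as a stand-in, since it needs no server; MySQL offers a comparable EXPLAIN statement, but its output format is different. The users table, the index name and the query_plan helper are all hypothetical.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the plan description SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The fourth column of each row holds the human-readable plan step.
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)

lookup = "SELECT name FROM users WHERE email = ?"

# Without an index the plan is a scan of the whole table.
plan_before = query_plan(conn, lookup, ("user500@example.com",))

# With an index on the searched column the plan becomes an index search.
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = query_plan(conn, lookup, ("user500@example.com",))
```

Comparing plan_before and plan_after for the slowest queries in the log is a quick way to confirm that a new index is actually being used.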
The next area of performance that requires focus is the webserver. I virtually never encounter an optimized Apache configuration, and have often encountered situations where people have tried implementing NGINX or Squid in an attempt to improve performance but have been unable to scale the combination to fit their needs. With Apache, one of the more common problems is that by default in most Linux distributions it is configured to load every known module. Trimming out the modules that you probably don't need, such as mod_proxy, will reduce the memory footprint of each Apache process. Tuning the number of active/spare server processes, max connections, and keepalive times will also improve Apache performance; however, these must be tuned based on the site's activity. This is where the aforementioned monitoring and trending comes into play once more. Utilizing the baselines and activity captured with tools such as MRTG/Cacti (which can be used to trend Apache metrics), you can now begin making adjustments and observing the results. When hosting a PHP-based site, APC, mod_fcgid, and other mechanisms should be investigated as possibilities for improving the interpretation, execution and performance of PHP scripts. Load-testing tools such as Apache Bench can be used for testing changes if you are working in a sandbox or do not wish to wait for real traffic peaks. Once the webserver is tuned, it is time to consider caching and proxying, and this is where Varnish comes to the rescue!
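A rough starting point for the process limit (MaxClients, or MaxRequestWorkers in Apache 2.4) is the memory left over for Apache divided by the average size of one Apache process. The sketch below does only that arithmetic. The function name is hypothetical and the figures are invented for illustration, so real values should come from your own monitoring.

```python
def estimate_max_workers(total_ram_mb, reserved_mb, per_process_mb):
    """Estimate how many Apache processes fit in RAM without swapping."""
    if per_process_mb <= 0:
        raise ValueError("per_process_mb must be positive")
    # Memory left after the database, caches and the OS take their share.
    available_mb = total_ram_mb - reserved_mb
    return max(available_mb // per_process_mb, 0)


# Hypothetical server: 4096 MB of RAM with 1536 MB reserved for everything else.
workers_before = estimate_max_workers(4096, 1536, 40)  # 40 MB per process
workers_after = estimate_max_workers(4096, 1536, 25)   # after trimming modules
```

With these made-up numbers, shrinking each process from 40 MB to 25 MB raises the estimate from 64 to 102 processes on the same hardware, which is why trimming modules is worth doing before tuning anything else.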
Another little-known trick which can help tremendously is to shift high-IO transactions from disk to RAM using tmpfs. This helps even when your underlying storage is enterprise-grade SAS, or even SSD. The varnish-cache rolling log, MySQL temp-table storage, and PHP session/cache directories are all excellent candidates for tmpfs implementation to reduce system IO and latency. Monitoring tmpfs usage over time is critical: too large a tmpfs will waste valuable RAM, and too small a tmpfs could result in services freezing or crashing if it fills. I recommend real-time monitoring and alerting for ramdrives, all filesystems, and all other critical system elements as well; Nagios is an excellent tool for this job.
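A usage check for a tmpfs mount can be very small. The sketch below classifies how full a filesystem is using Nagios-style status names. The function names and the 70 and 90 percent thresholds are assumptions for illustration, and this is not an actual Nagios plugin.

```python
import shutil


def classify_usage(used_bytes, total_bytes, warn_pct=70.0, crit_pct=90.0):
    """Map filesystem usage to a Nagios-style status string."""
    if total_bytes <= 0:
        return "UNKNOWN"
    percent = 100.0 * used_bytes / total_bytes
    if percent >= crit_pct:
        return "CRITICAL"
    if percent >= warn_pct:
        return "WARNING"
    return "OK"


def check_mount(path):
    """Check a mounted filesystem, for example a tmpfs mount point."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    return classify_usage(usage.used, usage.total)


status_here = check_mount(".")
```

Running check_mount against each tmpfs mount point on a schedule, and recording the percentage over time, also shows whether a ramdrive has been sized too generously.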
Once all of these avenues have been exhausted, if performance problems continue to plague a system, then horizontal scaling becomes the best approach, starting with a dedicated server for the database and one for the webserver and Varnish. This approach can more than double the capacity of the solution, and can be taken to the next level by replicating the database between two or more servers with a memcached server (or cluster) in front of them. Reconfiguring the site to perform database reads against the slave instance and inserts/updates against the master is another method of load-balancing replicated database servers. These methods, coupled with introducing a dedicated varnish-cache server (or cluster) as a front end for multiple back-end webservers, will permit near-infinite scaling of any stack. As complexity grows, however, so do session-management and other challenges for the site developers.
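The read/write split described above comes down to a routing decision made for every statement. The sketch below shows one simple way to make that decision. The class name and server names are hypothetical, and a production router would also need to account for transactions, locking reads and replication lag.

```python
import itertools


class QueryRouter:
    """Send writes to the master and spread reads across the slaves."""

    READ_PREFIXES = ("select", "show")

    def __init__(self, master, slaves):
        self.master = master
        # Fall back to the master if no slaves are configured.
        self._slaves = itertools.cycle(slaves or [master])

    def route(self, sql):
        """Return the name of the server that should run this statement."""
        statement = sql.lstrip().lower()
        if statement.startswith(self.READ_PREFIXES):
            return next(self._slaves)  # round-robin across read replicas
        return self.master


router = QueryRouter("db-master", ["db-slave-1", "db-slave-2"])
targets = [
    router.route("SELECT title FROM pages"),
    router.route("select count(*) from users"),
    router.route("UPDATE users SET name = 'x' WHERE id = 1"),
    router.route("  SELECT 1"),
]
```

Because replication is not instantaneous, a page that writes a row and immediately reads it back may need to send that read to the master as well, which is one of the developer challenges that grows with this kind of scaling.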
LAMP has changed the world by serving to the Internet many trillions of views of the many millions of websites which we all enjoy. When properly deployed and administered, a single LAMP server can host tens of millions of pages per day and manage terabytes of content. If you are starting small but wish to build a LAMP stack as a secure and scalable platform for your site, then it is my hope that this article has given you the tools and methods for getting the most out of what free and open source software can deliver.