Engineering Drupal

    

Drupal is a powerful framework for building enterprise solutions that range from simple web sites to complex web-enabled applications. While Drupal 8 off-the-shelf could be used to build any of this broad spectrum of solutions, there are best practices for engineering enterprise-class Drupal. This chapter covers the key principles for determining the best approach, and the details involved in successfully engineering a solution that is scalable and adaptable.

Engineering the Foundation

Constructing anything requires an understanding of the requirements of what you are about to build, whether it is a bridge, an automobile, a house, a pizza, or a Drupal site. Building anything without a thorough understanding of the requirements will likely result in having to rebuild some or all of the foundation. If you begin building an automobile and later find out that the true requirements include the ability to tow a travel trailer and haul eight adults, then you may have to radically shift the architecture of the two-seater convertible that you just about completed, a task that would likely require starting over. When building enterprise-class Drupal, the best approach involves a thorough understanding of the needs of the various constituents that the solution must support.

Having spent the past 12 years and 33,000+ hours working as the architect of enterprise-class Drupal solutions, I have found it key to understand high-level goals and objectives such as:


• What type of sites will the organization create? Are they primarily delivering marketing information? Is online commerce a key consideration? Is there an online community component (user-generated content)? Will the sites be multilingual? Are the sites functionally similar or are there wide variations in the types of sites that will be built? Understanding this aspect will help you determine whether all sites can be constructed from a common distribution or whether the variance in functionality will require multiple base platforms on which to build and launch Drupal sites.


• How many different web sites (domains) will the organization construct over the next one to three years? Understanding the number of different sites will help determine whether to build each site independent of the others, whether to use a solution such as Drupal’s multisite architecture, or whether a custom enterprise distribution from which each site inherits a majority of its structure and functionality is in order.


• Is there an existing Drupal distribution that closely matches the functional and technical requirements for the organization’s sites? For example, does Drupal Commerce, Open Scholar, Open Publish, Open Government, Open Atrium, or another distribution closely match your organization’s requirements? Does the organization already have Drupal sites in place? If so, are one or more of those sites candidates for building an enterprise-class Drupal distribution for the organization?


• Will Drupal integrate with other non-Drupal systems in the enterprise? If so, what role does Drupal play? Is it a provider of information to external applications and web sites? Is it a consumer of content from other applications and sites? Is it both? If Drupal is primarily a provider of information to other enterprise applications, having a robust user interface may be a lower priority than having a well-architected services layer for providing REST APIs.


• What user interface best serves the consumers of information contained in the organization’s sites? Does the Drupal interface suffice? Does AngularJS or another decoupled user interface provide a better interface? Headless or decoupled Drupal is becoming a more popular option for organizations that want to be more creative in the presentation of content to their users than what is typically accomplished through the traditional Drupal frontend. A prime example is weather.com, which uses a decoupled approach with AngularJS as the presentation layer and Drupal as the decoupled provider of the content.


Answering these questions will not provide the detailed level of specifications required to fully define the approach required to build an Enterprise Drupal architecture; however, it will provide the overall guidance as to how the individual pillars of the architecture need to be engineered to address the organization’s needs.


Defining the Components of Enterprise Drupal

With a general understanding of the fundamental requirements for your enterprise-class Drupal 8 site, the next step is to examine each component that will form the architecture and define how your organization’s requirements will affect each of them.


Network and Web Server

The network and web server architecture required to support Enterprise Drupal 8 plays a significant role in the performance of your site, and there are several aspects that you should consider while engineering your solution. Most Drupal sites use Apache HTTP servers as their web server and Apache does well in that role. However, as your sites’ traffic volumes grow, the load placed on the web servers often taxes Apache’s ability to serve pages quickly enough to ensure acceptable page load times.


Apache often faces what is called the C10K problem: it has a difficult time supporting more than 10,000 concurrent connections, and in most cases it falls far short of delivering adequate performance well before that limit is reached. Apache’s traditional approach is to fork a new process for each inbound connection and allocate memory and other resources to that process, which leads to swapping to disk as concurrent connections increase. As the number of connections climbs, performance quickly spirals downward, leading to unhappy site visitors and headaches for your site’s operations team. Nginx takes a different approach: rather than dedicating a process to each connection, it queues requests and processes them with a small number of event-driven workers. The result is lower overhead and faster responses to requests.
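To make the difference in memory behavior concrete, here is a rough back-of-the-envelope sketch in Python. The per-process and per-connection figures are illustrative assumptions, not measurements of Apache or Nginx, and the function names are made up for this example.

```python
def process_per_connection_memory_mb(connections: int, mb_per_process: float) -> float:
    """Memory needed when every connection is given its own process."""
    return connections * mb_per_process


def event_driven_memory_mb(connections: int, workers: int,
                           mb_per_worker: float, kb_per_connection: float) -> float:
    """Memory needed when a fixed pool of workers keeps only a small
    amount of state for each open connection."""
    return workers * mb_per_worker + connections * kb_per_connection / 1024


if __name__ == "__main__":
    # All figures below are assumptions chosen only to show the trend.
    for n in (100, 1_000, 10_000):
        forked = process_per_connection_memory_mb(n, mb_per_process=20.0)
        evented = event_driven_memory_mb(n, workers=4, mb_per_worker=20.0,
                                         kb_per_connection=16.0)
        print(f"{n:>6} connections: {forked:>9.0f} MB forked, {evented:>6.0f} MB event-driven")
```

The first model grows linearly with the number of connections, while the second is dominated by the fixed worker pool, which is the intuition behind the lower overhead described above.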


Drupal itself also consumes memory and CPU for each request that it receives, similar to Apache, but performance is often negatively impacted at significantly fewer than 10,000 connections. To resolve Drupal’s own resource bottlenecks, the best practice is to employ reverse proxy servers. Reverse proxy servers receive requests from browsers, examine each request, and determine what to do with it. They either fulfill the request themselves or send it on to the web server and Drupal. Reverse proxy servers also provide the ability to cache static files (images, CSS files, and JavaScript files) separately from dynamic pages. A reverse proxy server may also cache PHP-generated web pages, such as those generated by Drupal. When a page is served from cache, Drupal never sees the request, because the reverse proxy server fulfills it. Using multiple reverse proxy servers also provides the ability to balance the load across several servers, further reducing the amount of time required to respond to requests.
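The decision a reverse proxy makes for each request can be sketched in a few lines of Python. This is a simplified illustration, not how any particular proxy is implemented: the class name, the session-cookie rule, the list of static extensions, and the fixed time-to-live are all assumptions made for the example.

```python
import time
from typing import Callable, Dict, Tuple

# Assumed set of file types treated as static assets.
STATIC_EXTENSIONS = (".css", ".js", ".png", ".jpg", ".gif", ".svg")


class ReverseProxyCache:
    """Toy reverse proxy: answer from cache when possible, else ask the backend."""

    def __init__(self, backend: Callable[[str], str], ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.backend = backend          # stands in for the web server plus Drupal
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._store: Dict[str, Tuple[float, str]] = {}
        self.backend_hits = 0           # how many requests actually reached Drupal

    def handle(self, path: str, has_session_cookie: bool = False) -> str:
        # Assumption: static files are always cacheable; dynamic pages are
        # cacheable only for visitors without a session (anonymous traffic).
        cacheable = path.endswith(STATIC_EXTENSIONS) or not has_session_cookie
        if cacheable:
            entry = self._store.get(path)
            if entry is not None and self.clock() - entry[0] < self.ttl_seconds:
                return entry[1]         # served from cache; the backend never sees it
        self.backend_hits += 1
        body = self.backend(path)
        if cacheable:
            self._store[path] = (self.clock(), body)
        return body
```

In use, two anonymous requests for the same page result in a single call to the backend, which is the effect the paragraph above describes: the second request is fulfilled entirely by the proxy.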


Many of the biggest Drupal hosting providers, such as Pantheon, use Nginx and reverse proxy servers to ensure that the sites they host perform as desired. Your organization may choose to implement this same architecture in house, or you may rely on hosting providers to provide the infrastructure required to support your anticipated traffic volume.


Database Servers

The web and reverse proxy servers are the first line of defense in solving Enterprise Drupal 8 performance and scalability issues, while the database is a close second as the next area to focus on when engineering Drupal.


As an enterprise-class platform, Drupal 8 requires the same level of capabilities and power as any other enterprise application, such as your enterprise resource planning (ERP), customer relationship management (CRM), or human resources (HR) systems.


Selecting the Database Platform

The de facto standard for most Drupal implementations has been MySQL. It was the first database supported by Drupal and continues to be the most popular option for most organizations. Drupal is optimized for MySQL, and while Drupal also supports PostgreSQL and SQLite, not all contributed modules support non-MySQL databases. There are also options for using Oracle and Microsoft’s SQL Server databases, although using either of those databases is not considered the mainstream approach for Drupal.


While MySQL meets the performance requirements of most Drupal implementations, there are two MySQL “clones” that provide even higher performance and scalability options, as they have replaced key components of the database engine with a focus on performance. MariaDB is a fork of MySQL created and maintained by a team of MySQL engineers who left the organization when Oracle acquired MySQL through its purchase of Sun Microsystems. Percona is similar to MariaDB, but instead of a fork of MySQL, it is a branch of the main MySQL master branch. The primary difference between the two is that MariaDB diverged from MySQL at a point in time and continues down its own path, whereas Percona shadows MySQL and will continue to be tightly in alignment with the MySQL master branch. Many of the large-scale hosting providers use MariaDB as the database engine for their service offerings.


While MariaDB and Percona typically outperform MySQL, any of the three options is a viable candidate to support an Enterprise Drupal implementation.


Clustering MySQL to Improve Performance

Traditionally, Drupal sites ran on a single instance of MySQL, and for many sites that architecture worked well until they hit a threshold of page views where the database became a bottleneck. After exhausting the options to tune MySQL to support the transaction volumes, the only alternative is to deploy more than one instance of a MySQL server and employ clustering to distribute the workload across servers. This approach provides virtually unlimited database server resources and resolves the issue of the database as the bottleneck. While you may address some of the performance issues through reverse proxy servers and advanced caching mechanisms, it is wise to consider engineering your Enterprise Drupal architecture as a MySQL cluster to avoid having to retrofit your architecture at a later point.

For more information about MySQL clustering, visit mysql.com/products/cluster. As a point of reference, a standalone MySQL server may be tuned to deliver 250,000 to 500,000 queries per second, whereas a MySQL cluster, configured properly with the right number of servers and resources, can deliver 200 million queries per second.
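For early capacity planning, the figures above can be turned into a rough estimate of how many database servers a target load might need. The sketch below is only arithmetic under a stated assumption: that throughput scales roughly linearly with server count, discounted by a hypothetical `scaling_efficiency` factor. Real clusters rarely scale this cleanly, so treat the result as a starting point for load testing rather than a sizing guarantee.

```python
import math


def servers_needed(target_qps: int, per_server_qps: int,
                   scaling_efficiency: float = 1.0) -> int:
    """Estimate the number of database servers for a target query rate.

    scaling_efficiency is an assumed fraction (0 < value <= 1) of a
    standalone server's throughput that each server still delivers
    once it is part of a cluster.
    """
    if per_server_qps <= 0:
        raise ValueError("per_server_qps must be positive")
    if not 0 < scaling_efficiency <= 1:
        raise ValueError("scaling_efficiency must be in (0, 1]")
    if target_qps <= 0:
        return 0
    return math.ceil(target_qps / (per_server_qps * scaling_efficiency))


if __name__ == "__main__":
    # 250,000 queries per second is the lower standalone figure quoted above;
    # the target load and the 0.5 efficiency are made-up inputs.
    print(servers_needed(2_000_000, 250_000, scaling_efficiency=0.5))
```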

Drupal 8 Core

There are several aspects of Drupal 8 core that you should carefully examine and consider while engineering your Enterprise Drupal 8 platform, and many of those options are discussed throughout this book. However, when launching your Enterprise Drupal 8 initiative, there is one aspect that will dictate how you engineer and build your Drupal sites. That aspect is how you want to build sites across your organization. There are three general alternatives:


• Single site


• Multisite


• Distribution


Single Site

A single-site architectural approach focuses on building each site or application by starting with Drupal core and adding the contributed and custom modules required to address the functional and technical requirements for that specific site or application. This approach works well and has been the de facto standard for many organizations. It works best for organizations in which every site and application is significantly different and there is little opportunity to leverage a common framework across them; in that case, a common framework would likely be limited to Drupal core and a small number of contributed modules, so the benefits of developing a common platform are smaller than with the other architectural approaches. While building single sites may seem like the easiest alternative, you will likely come to realize that maintaining dozens or even hundreds of independent sites is overwhelmingly complex and costly. Fortunately, there are better ways, as described in the next two sections.


Multisite

Drupal multisite is an approach that has been around for nearly 10 years and is employed as the primary structure for hosting sites on Acquia. A multisite architecture consists of a single codebase with each site or application having its own database and configuration.
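The core idea of multisite, one codebase that selects a per-site configuration directory based on the incoming hostname, can be illustrated with a short Python sketch. It is loosely modeled on the way Drupal searches its sites directory from the most specific hostname to the least specific, but it is a simplification written for this article: it deliberately ignores ports, URL path prefixes, and alias files, and the function names are hypothetical.

```python
from typing import Iterable, List


def candidate_site_dirs(hostname: str) -> List[str]:
    """List directory names to try, from most to least specific.

    "www.shop.example.com" yields "www.shop.example.com",
    "shop.example.com", "example.com", and finally "com".
    """
    labels = hostname.lower().split(".")
    return [".".join(labels[i:]) for i in range(len(labels))]


def resolve_site_dir(hostname: str, existing_dirs: Iterable[str]) -> str:
    """Pick the configuration directory for a request's hostname.

    existing_dirs stands in for the directories that actually exist
    under sites/ in the shared codebase.
    """
    existing = set(existing_dirs)
    for candidate in candidate_site_dirs(hostname):
        if candidate in existing:
            return f"sites/{candidate}"
    # No site-specific directory matched, so fall back to the default site.
    return "sites/default"


if __name__ == "__main__":
    dirs = ["shop.example.com", "example.com"]
    print(resolve_site_dir("www.shop.example.com", dirs))
    print(resolve_site_dir("intranet.example.org", dirs))
```

Each resolved directory holds that site's own settings, including its database connection, while the Drupal core and module code is shared by every site.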


The benefit of this approach is that you only have to maintain a single instance of Drupal and contributed modules. An update to Drupal core, a contributed module, or a custom module applies to all sites hosted in a multisite-based architecture. The single codebase is often the primary benefit of a multisite architecture; however, there are potential pitfalls, such as:


• A single erroneous update to a module can take all of your sites down, as all sites share the same codebase.


• Scalability may be an issue, as all sites are running on a single instance. A distribution-based approach, on the other hand, provides the ability to spin up independent containers as increased demands warrant additional resources.


• Administrative access to a multisite architecture is difficult to restrict to single sites for tasks like updating a custom module.


Multisite is widely used in large organizations and is a viable approach, but there are tradeoffs that may be addressed by using a distribution-based model.


Distribution

Using a common distribution is the third approach and is based on the concept of assembling a “packaged” solution that addresses a majority of the functional and technical requirements for all sites and applications in an organization. This approach is nearly identical to using one of the community contributed distributions as the foundation for your site—for example, using Drupal Commerce Kickstart, Open Atrium, or Open Public as the upstream distribution on which you build all of your sites.


A distribution-based approach starts with engineering a common Drupal footprint that addresses a majority of the functionality across the types of sites in your organization. You then create that site with the core building blocks to address that functionality, such as:


• Drupal 8 core


• Contributed modules


• Custom modules


• Entity types that address the common content requirements


• Taxonomy that addresses a consistent enterprise categorization of content using a common terminology


• Views that render content in ways that are consistent across the organization


• Page templates that address the common layouts used in the organization


• Common enterprise-wide navigational elements (menus)


• Common blocks


• An enterprise-wide search framework


• Integration with legacy enterprise applications and content


It is possible to assemble a common distribution that addresses a majority of the needs of the organization, fulfilling 80% of the common requirements. Creating a new site using a distribution is relatively straightforward: you clone the distribution from a centralized source code control system such as GitHub, install the distribution, and expand on the functionality provided by the distribution where necessary to address a site’s specific requirements.


By setting the upstream master of the cloned site to the distribution’s repository on, for example, GitHub, you have the ability to pull updates and enhancements from the distribution into localized versions of the distribution, making the process of rolling out updates, patches, security updates, and expansion of functionality a relatively simple process. There are hosting providers, such as Pantheon, that provide this capability as part of their enterprise hosting packages, or you can build it yourself.


 


Profiles

If you select Drupal multisite or distribution as the approach for building your Drupal 8 platform, you may consider creating one or more installation profiles. Installation profiles combine core Drupal, contributed modules, themes, and pre-defined configuration into one download. Installation profiles provide specific site features and functions for a specific purpose or type of site. They make it possible to quickly set up a complex, user-specific site in fewer steps than installing and configuring elements individually.


As an enterprise, it is likely that there won’t be a “one-size-fits-all” profile to address every type of site in your organization. For example, you may have a site that is primarily a marketing web site, while another site delivers technical product information to customers who purchase your products, and yet another site is primarily a commerce web site where you sell products and services. While it is possible to build three different distributions to address those three very divergent sites, it is more effective, efficient, and less complex to build a single distribution using installation profiles.
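The relationship between one distribution and several installation profiles can be modeled as a shared baseline plus a small per-profile delta. The Python sketch below is a conceptual illustration only; real Drupal 8 profiles are defined in their own files rather than in Python, and the module lists, theme name, and profile names here are hypothetical examples for the three site types mentioned above.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical footprint that every site in the organization shares.
DISTRIBUTION_MODULES: Tuple[str, ...] = ("node", "views", "taxonomy", "search")


@dataclass(frozen=True)
class InstallProfile:
    """A named variation layered on top of the common distribution."""
    name: str
    extra_modules: Tuple[str, ...] = ()
    default_theme: str = "enterprise_base"  # assumed shared base theme

    def modules_to_enable(self) -> List[str]:
        # Shared modules come first; duplicates are dropped while keeping order.
        return list(dict.fromkeys(DISTRIBUTION_MODULES + self.extra_modules))


# Illustrative profiles for a marketing site, a product information site,
# and a commerce site.
PROFILES: Dict[str, InstallProfile] = {
    "marketing": InstallProfile("marketing", ("landing_pages",)),
    "product_docs": InstallProfile("product_docs", ("book",)),
    "commerce": InstallProfile("commerce", ("commerce", "views")),
}

if __name__ == "__main__":
    for profile in PROFILES.values():
        print(profile.name, profile.modules_to_enable())
```

Because every profile starts from the same baseline, a change to the shared list reaches all three site types, while each profile remains free to add what is specific to its purpose.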


Don’t underestimate the power of installation profiles, as they may save your development team countless hours of spinning up new Drupal 8 sites for the various constituents in your organization. See Appendix C for details on how to create a Drupal 8 installation profile.


Drupal 8 Contributed Modules

When engineering your Drupal 8 solution, it is likely that you will need to step outside the capabilities of Drupal 8 core to address the functional requirements of your organization. While Drupal 8 core is feature rich, it can’t address every possible requirement from every conceivable use of Drupal 8 in organizations around the world. Combining Drupal 8 core with contributed and custom modules will provide the foundation for addressing your organization’s specific needs.


Many organizations fall short when engineering their Drupal footprint by overlooking contributed modules that may solve their functional and technical requirements and, instead, developing custom modules that must then be maintained by their organization. The task of finding the right contributed module or combination of modules is often a tedious one, but the long-term payoff of using contributed modules instead of developing custom modules is significant, especially when considering the cost of upgrading your custom modules to the next major version of Drupal.


There are no easy shortcuts to finding the proper contributed modules to address your functional requirements, other than searching through drupal.org and finding other similar use cases and how people solved those issues. When evaluating contributed modules, there are a few things to keep in mind:


• How many sites report that they are using the module? If the number is small, for example fewer than 50, closely examine the functionality to determine why more people aren’t using the module.


• Check the issue queue and read through the bugs that people are reporting. If they seem significant and there are a lot of them, you may want to consider a different path. The sheer number of issues may not always be a good indicator, though, because many heavily used modules have issues that number in the hundreds. The key is to look for critical issues and determine how actively people are working on them.


• Check the date of the last update to the released (non-dev) version of the module. If the module hasn’t had a release in several months and there are several outstanding bugs that have been reported and worked on, understand that you may have some additional work to do to apply the patches that developers have submitted to address critical functional and technical bugs.


• Look for known conflicts with other contributed modules in the issue queue. If you use modules that are reported as conflicting, you may want to look for an alternative solution, as implementing that module may break other functionality on your site.


• Look for hooks that provide you with the ability to augment the module. A hook is a function that allows you to directly interact with the module to modify some aspect of the module’s functionality, such as adding or modifying content that is being processed by that module. A module that provides 80% of the required functionality but has hooks is better than a module that provides 90% of the functionality without the ability to modify the functionality through a hook.


• When presented with multiple options to solve a functional requirement, examine each module carefully. Not every module solves the problem the same way, and there are likely differences that will sway you one way or another. Another key indicator is the number of contributors to the module. More contributors mean more arms and legs to work through the issue queue and update the module. There are also well-known developers in the community who are known for the quality of their modules. While everyone can contribute a module, sometimes it pays to stick with the veterans who have consistently delivered high-quality modules to the community.
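The checklist above can be captured as a small, repeatable review so that every candidate module is judged against the same questions. The following Python sketch assumes you collect the facts by hand from the module’s project page and issue queue; the data structure, the six-month release threshold, and the minimum contributor count are illustrative assumptions, while the 50-site threshold comes from the first bullet.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ModuleSnapshot:
    """Facts gathered manually about one candidate contributed module."""
    name: str
    reported_installs: int
    open_critical_issues: int
    months_since_release: int
    contributors: int
    provides_hooks: bool
    conflicts_with: List[str] = field(default_factory=list)


def review_flags(module: ModuleSnapshot, installed: List[str],
                 min_installs: int = 50, max_release_age_months: int = 6,
                 min_contributors: int = 2) -> List[str]:
    """Return the checklist items that deserve a closer look."""
    flags: List[str] = []
    if module.reported_installs < min_installs:
        flags.append("low usage")
    if module.open_critical_issues > 0:
        flags.append("open critical issues")
    if module.months_since_release > max_release_age_months:
        flags.append("stale release")
    clashes = sorted(set(module.conflicts_with) & set(installed))
    if clashes:
        flags.append("conflicts with " + ", ".join(clashes))
    if not module.provides_hooks:
        flags.append("no hooks")
    if module.contributors < min_contributors:
        flags.append("few contributors")
    return flags
```

An empty list does not mean a module is automatically the right choice; it only means none of the warning signs listed above were triggered, so the remaining decision is about functional fit.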


If at the end of your evaluation you’ve come up empty-handed, custom modules are the acceptable path. Follow Drupal development best practices, which can be found at drupal.org/coding-standards, and consider contributing your custom module to the community. It’s highly likely that someone else in the world is facing the same functional requirement and could benefit from your solution. In turn, you have the opportunity to collaborate with others in the community to augment your custom module and make it even better.
