MySQL Replication for Drupal
The problem with the Drupal/MySQL stack is that it doesn't scale across multiple database servers out of the box. When performance is a secondary concern to redundancy, the problem becomes a bit more evident. If your database servers will be in different data centers, it's crucial to be able to operate when the primary data center is down. The scalability in terms of performance is an added bonus. MySQL offers a number of options for replication, including master-slave and master-master. Drupal also offers modules to help control database traffic, but none are optimal. I had to experiment and create my own solution to effectively address the problem.
First a little background:
Options:
- Split-Read-Write: Drupal modules exist to split reads off to slave servers and send only writes to master server
- This doesn't work if the servers are geographically distributed and the "write" server (master) could be down
- Master-Master: The MySQL implementation of master-master is basically a back-to-back master-slave type of relationship. This means that in a two server scenario, each server has a master-slave relationship with the other. As this scales out it is configured in a ring topology.
- This can causes replication loops, especially for frequently updated temporary data
- Master-Slave: The master-slave relationship utilizes a single database for all "changes" and these are replicated to all of the other database servers.
- Read-only isn't read-only: The master slave relationship would work fine if Drupal could run in a read-only mode, but it won't work as such. Even if only accessed by users who don't edit content, the site still requires a writeable database for things like cache and session information.
- If configured as a strict master-slave, the slave server almost immediately start to fail because temporary information like sessions are overwritten, rendering the sites useless.
Solution: Master-Slave (modified)
- First we need to acknowledge the things we need to replicate, and the thing we don't:
- Because Drupal is a CMS, it stores almost everything in the database; we want to replicate all content and structural changes from the master to the slaves (note that the physical file structure is already replicated between the servers with Windows FRS)
- We do NOT need to replicate things like session data, caches, etc; this information is all temporary, and not needed for the sites to function independently
- The solution is to configure master-slave replication, but to exclude certain "temporary" tables that can operate independently on each of the slave servers
- URLs: Because your site will be known as "www.[site].com on each server, you need a URL to manage each server. I recommend www1, www2, www3, etc. This will allow for health-checks and maintenance on individual servers.
- See this article for details on enforcing Drupal read-only servers
- You will configure replication as normal via MySQL, phpMyAdmin is an easy way to do so, but first you need to edit your MySQL.ini file and restart MySQL:
Edit MySQL.ini:
[mysqld]
log-bin=mysql-bin
binlog-do-db=[replicated database name]
binlog-ignore-db=[db name]
binlog-ignore-db=mysql
server-id=1
sync_binlog=1
replicate-wild-ignore-table=[replicated database name]%.cache%
replicate-wild-ignore-table=[replicated database name]%.watchdog%
auto_increment_increment = 10
auto_increment_offset = 1
slave-skip-errors=all
Notes on usage:
- Use the "binlog-ignore-db" for any database you don't want to replicate
- Make sure the "server-id" is unique for each of your servers
- The "replicate-wild-ignore-table" commands effectively stop replication for the unwanted "temporary" data
- The "auto-increment" commands (make sure these are different on each server) allow for the situation where you might want to try master-master replication, as these should prevent overlapping/conflicting records in the "temporary" data tables (unverified)