The data team is introducing configuration changes for MySQL that will allow us to bring new features to our platform. These changes will be mostly transparent, but some could potentially impact applications in a noticeable manner (both positive and negative). In preparation, we are detailing these changes and, where appropriate, how you can override them using Custom Chef recipes.
As with any upgrade, these changes will not be applied to your environment until you select the upgrade option. Testing in Staging is always recommended prior to performing an Environment upgrade. If you have concerns with any of these changes please open a ticket with us so we can explore those with you. Unless otherwise mentioned these changes are planned for release on Tuesday, September 11th.
Replica Binary Logging
The first, and most important, change will enable binary logging on replica databases along with the log-slave-updates option. The most important benefit of this change is that it eliminates the need to restart a replica database before promoting it to the master role. This not only simplifies the process, but means that the new master database will have a lukewarm data cache when it takes over. Another benefit of this change is that it simplifies more complex replication topologies such as chained replication.
With any change there is often a tradeoff and that applies here as well. Enabling this logging on the replica will result in additional disk usage (average < 3.3GB) on the /db volume and disk I/O activity. The vast majority will notice no change but replica power users may be able to notice this change. We don’t anticipate any customers will be seriously impacted by this change.
On Tuesday, after the stack update is published, we will be publishing an update to ey_cloud_recipes to add the mysql_administration recipe. This can be applied in advance of upgrading your environment without negatively impacting operations on your databases. Please be aware that by using this recipe this feature you may limit your application’s compatibility with some features that will be added to the platform in the future.
Binary Log Management
Coupled with the changes to replica binary logging we will be releasing several updates to our tool that manages binary logs. This tool will be receiving updates to enable it to purge binary logs on replica databases. Along with this change we will also be adding a bunch of other features:
- Quicker Initial Purge: Initial log removal (when master-bin.000001 exists) will now happen at 50% disk usage by default instead of 90%. This will reduce concerns about disk space filling up more rapidly than expected.
- Maintain 24 Hours of Binlogs: When disk space is not constrained the script will attempt to maintain 24 hours of binary logs by default. This will make it easier to use historical snapshots when creating new replica databases.
- Greater Flexibility of Configuration: /etc/engineyard/binlogpurge.yml is being greatly expanded to allow for the configuration of most defaults used by the tool. The most valuable of these options is expected to be the number of hours to maintain binary logs when space is not constrained and the definition of that constraint.
Customers with high rates of binary log generation, or with very small disk volumes may notice an increase in disk space utilization. The defaults being used for this tool should help to minimize this usage and prevent alerts for most databases.
Mixed Mode Replication for 5.1 and 5.5
In MySQL 5.1 row-based replication was introduced to address consistency issues within mysql replication. Previously, replication was only statement-based and statements could be written in such a way that they were non-deterministic and could create data discrepancies between the master and its replica(s). Row-based replication addresses this by writing out the row details for everything that changes to enforce consistency. This feature comes with the price of more overhead in the replication stream since now much more data must be written through the binary log.
We are changing the default replication mode for MySQL 5.1 and 5.5 from ‘STATEMENT’ to ‘MIXED’. Operating with this setting the database will use statement based replication by default but revert to row-based when a query is encountered that has a non-deterministic pattern. This compromise helps to provide most of the performance benefit of ‘STATEMENT’ mode with the data integrity improvements offered by ‘ROW’ mode. This also eliminates warnings in the mysql error log similar to the following:
[Warning] Unsafe statement written to the binary log using statement format since BINLOG_FORMAT = STATEMENT. The statement is unsafe because it uses a LIMIT clause. This is unsafe because the set of rows included cannot be predicted.
Customers with a high rate of binary log creation and a high rate of non-deterministic queries may notice additional disk I/O on their master and replica databases. Due to the additional processing required some customers may also notice some additional replica lag.
Moving forward this configuration will be considered as “best practice” on the cloud platform. As a result we are not currently planning to publish a formal recipe for overriding this setting. If you have a special need please file a ticket with our support team.
In the coming weeks we will be making additional announcements about some MySQL changes that are still being worked on. These changes require a little more development time since their implementation will be a bit more complicated and we want to ensure a smooth upgrade process. Please stay tuned because the following changes are expected during the month of October:
- The default storage engine will be updated for all instances from MyISAM to InnoDB. There will be a published method to configure your database to ignore this change.
- Timezone tables will automatically be imported into master database instances if they are not already present. Replicas will receive the timezone tables via their replication link.
If you have any concerns about the pending changes please let us know as soon as possible so we can work with you on those prior to the release. We always welcome your feedback on the continued advancement of the data platform. If you have any questions or suggestions for changes we could make please let us know.