Saturday, 21 December 2013

Why MySQL introduced a new command 'CHANGE REPLICATION FILTER'


The MySQL@Oracle announced 5.7.3 development milestone release, now available for download at dev.mysql.com. You can find the full list of changes and bug fixes in the 5.7.3 Release Notes. One of the changes which went in this milestone release was MAKE –REPLICATION-* FILTER SETTINGS DYNAMIC. With this work the slave options –replicate-* can be changed dynamically through the new command CHANGE REPLICATION FILTER. These new options allow slave replication filters to be changed while the server is running, without requiring a server stop and restart.
This blog is intended to explain the design decision of choosing a “new command” over “extending the existing SET command”.

Alternatives:

  1. MySQL could introduce few global variables (one for each --replicate.* options) which could be settable through a SET command. Currently ‘SET’command syntax does not allow users to give comma separated values for any existing global/local variables.
  2. To support CSV list without changing the ‘SET’ grammar, Users can give CSV in a string (“ “). Parser will pass this string to executor and executor will have to parse this string value and   extract  the individual values. Eg: SET GLOBAL @@replicate_do_db=”db1,db2” i.e., instead of parser, executor should do the separation. If executor is separating the values by implementing        
    a special parsing logic to support commands like these
    • SET GLOBAL @@replicate_do_db=”db1,db2”
    • SET GLOBAL @@replicate_wild_do_table=”db.t\_\%”
    • SET GLOBAL @@replicate_rewrite_db=”db1->db2”
    then the special parsing logic should handle the following special cases as well
    • the value contains special characters like double quotes(“), comma(,), underscore (_), percentage(%), backslash(\) and this list goes on with many more special characters.
    • should respect different CHARACTER SET combinations.
    We can write code to handle all these cases in the special parser logic. But the parser(lex, yacc) is specially designed to handle these kind of cases and many more special cases. So it is always better to let parser separate out the values instead of implementing a special parser which might be error prone and it might become very difficult to maintain it in future

  3. As an another alternative, lets try to see what happens if we extend the parser to accept CSV for these new variables. Eg: SET @@global.replicate_do_db=(db1,db2). Parenthesis”(, )” are to resolve conflicts in the existing SET grammar. In this approach, Parser will give an object to executor which contains list of values. But consider this following special case
    • mysql> SET @@global.replicate_do_db=(db1,db2);
    • mysql> SET @saved_replicate_do_db_list= @@global.replicate_do_db;
    • mysql> SET @@global.replicate_do_db=@saved_replicate_do_db_list;
    To support this special case, we should declare all these new variables as special type of variables so that parser not only evaluates @saved_replicate_do_db_list variable, but also should parse the value (db1,db2) and provide an object to executor which contains list of values which requires good amount of changes in the current MySQL parser design.
  4. Another alternative would be to extend the syntax to take just one value at a time just the same way how –replicate-do-db takes. Problem with this approach is if a user has to set 10 database names in replicate_do_db filter rule, user has to execute 10 SET commands. And also with this design, to remove an item from this list, One has to clear all the items and then execute this command 9 times which is not a good design when we are implementing a new feature. 
MySQL could implement any of the above three alternatives either by resolving the mentioned problems using some *good* amount of changes at all layers or by imposing some limitations on the user (like 'special characters are not allowed' or 'users cannot use local variables to set the value' or 'users should execute the command ‘N’ number of times to set ‘N’ values'). In order to solve all the above problems in a *simpler way*, MySQL introduced a new command ‘CHANGE REPLICATION FILTER’ with no limitations on users.
Advantages of the new command:
    • Parser is used to separate out the values (hence avoiding all the above mentioned problems)
    • Easy to extend the grammar for future filter extensions.

Summary: This blog is intended to explain the design decision of choosing a “new command” over “extending the existing SET command”.  As explained above, MySQL looked into all the possible alternatives and chosen the approach which we believe is best suited to users. We hope this new command will make your life simpler by avoiding few restarts of the server. Please let us know your feedback on the same. If you find any issues feel free to create bugs @ bugs.mysql.com.

6 comments:

  1. Why bugs.oracle.com and not the public bugs.mysql.com? I prefer the latter unless there is too much customer information involved.

    ReplyDelete
    Replies
    1. Hello Daniël,
      It is a typo error. My intentions are to say "bugs.mysql.com" only. I have corrected it now. Thanks for finding it out.

      Delete
  2. First: the new functionality is great. I had been looking for this to be possible for some time, having bumped into issues where I wanted to change the number of databases that were being replicated or other more specific filters. Doing this without shutting down MySQL is great.

    However, I think that MySQL configuration is a bit of a mess. Look at how you configure P_S compared to how you configure other parts of MySQL. There are other examples too. While the P_S configuration is picked up from /etc/my.cnf (or command line arguments) it has the nice characteristics of not requiring special syntax. You change P_S.setup_instruments or setup_consumers by using normal SQL. Looking at the table is easy to do, and it is trivial to record histories or combine centrally the results from several servers by adding a couple of extra columns like a hostname or a timestamp.

    I would have liked to have seen the same for the new dynamic replication configuration. Pick up the configuration as now from /etc/my.cnf or server command line parameters, but then display the resulting configuration in a table where you have the ability to do the things described above. No new special syntax needed and the code could prevent changes if SQL, I/O or other replication threads are running.

    ReplyDelete
    Replies
    1. Thank you Simon for praising the feature and for your good feedback!

      Delete
    2. Simon,

      There's a big difference between replication and P_S. P_S is a storage engine. So it can emulate tables all it wants :)
      But I hear what you're saying and totally agree. mysql config files should have a better alternative.

      Delete