Saturday, 21 December 2013

Why MySQL introduced a new command 'CHANGE REPLICATION FILTER'


The MySQL@Oracle announced 5.7.3 development milestone release, now available for download at dev.mysql.com. You can find the full list of changes and bug fixes in the 5.7.3 Release Notes. One of the changes which went in this milestone release was MAKE –REPLICATION-* FILTER SETTINGS DYNAMIC. With this work the slave options –replicate-* can be changed dynamically through the new command CHANGE REPLICATION FILTER. These new options allow slave replication filters to be changed while the server is running, without requiring a server stop and restart.
This blog is intended to explain the design decision of choosing a “new command” over “extending the existing SET command”.

Alternatives:

  1. MySQL could introduce few global variables (one for each --replicate.* options) which could be settable through a SET command. Currently ‘SET’command syntax does not allow users to give comma separated values for any existing global/local variables.
  2. To support CSV list without changing the ‘SET’ grammar, Users can give CSV in a string (“ “). Parser will pass this string to executor and executor will have to parse this string value and   extract  the individual values. Eg: SET GLOBAL @@replicate_do_db=”db1,db2” i.e., instead of parser, executor should do the separation. If executor is separating the values by implementing        
    a special parsing logic to support commands like these
    • SET GLOBAL @@replicate_do_db=”db1,db2”
    • SET GLOBAL @@replicate_wild_do_table=”db.t\_\%”
    • SET GLOBAL @@replicate_rewrite_db=”db1->db2”
    then the special parsing logic should handle the following special cases as well
    • the value contains special characters like double quotes(“), comma(,), underscore (_), percentage(%), backslash(\) and this list goes on with many more special characters.
    • should respect different CHARACTER SET combinations.
    We can write code to handle all these cases in the special parser logic. But the parser(lex, yacc) is specially designed to handle these kind of cases and many more special cases. So it is always better to let parser separate out the values instead of implementing a special parser which might be error prone and it might become very difficult to maintain it in future

  3. As an another alternative, lets try to see what happens if we extend the parser to accept CSV for these new variables. Eg: SET @@global.replicate_do_db=(db1,db2). Parenthesis”(, )” are to resolve conflicts in the existing SET grammar. In this approach, Parser will give an object to executor which contains list of values. But consider this following special case
    • mysql> SET @@global.replicate_do_db=(db1,db2);
    • mysql> SET @saved_replicate_do_db_list= @@global.replicate_do_db;
    • mysql> SET @@global.replicate_do_db=@saved_replicate_do_db_list;
    To support this special case, we should declare all these new variables as special type of variables so that parser not only evaluates @saved_replicate_do_db_list variable, but also should parse the value (db1,db2) and provide an object to executor which contains list of values which requires good amount of changes in the current MySQL parser design.
  4. Another alternative would be to extend the syntax to take just one value at a time just the same way how –replicate-do-db takes. Problem with this approach is if a user has to set 10 database names in replicate_do_db filter rule, user has to execute 10 SET commands. And also with this design, to remove an item from this list, One has to clear all the items and then execute this command 9 times which is not a good design when we are implementing a new feature. 
MySQL could implement any of the above three alternatives either by resolving the mentioned problems using some *good* amount of changes at all layers or by imposing some limitations on the user (like 'special characters are not allowed' or 'users cannot use local variables to set the value' or 'users should execute the command ‘N’ number of times to set ‘N’ values'). In order to solve all the above problems in a *simpler way*, MySQL introduced a new command ‘CHANGE REPLICATION FILTER’ with no limitations on users.
Advantages of the new command:
    • Parser is used to separate out the values (hence avoiding all the above mentioned problems)
    • Easy to extend the grammar for future filter extensions.

Summary: This blog is intended to explain the design decision of choosing a “new command” over “extending the existing SET command”.  As explained above, MySQL looked into all the possible alternatives and chosen the approach which we believe is best suited to users. We hope this new command will make your life simpler by avoiding few restarts of the server. Please let us know your feedback on the same. If you find any issues feel free to create bugs @ bugs.mysql.com.

Thursday, 5 December 2013

Making MySQL Slave Replication Filters Dynamic


In MySQL Replication, users can filter statements either at master (using --binlog-* startup options) or at the slave (using --replicate-* startup options). Prior to MySQL-5.7.3, users could set these filtering rules either through command line parameters or by using my.cnf file. In either case MySQL server must be restarted in order to change the filtering rules. It is not easy to restart MySQL server in real time scenarios (because of downtime issues and also loss of buffer cache resulting in performance problems). It is always helpful having a way to dynamically configure these filtering rules. Particularly in environments where slaves are configured dynamically to replicate certain databases or tables based on load and usage.

In MySQL-5.7.3, a new command "CHANGE REPLICATION FILTER" has been introduced through which users can change the *slave* side replication filters dynamically without the need for restarting the server.

Replication filtering:
At the master, users can control the statements which need to get into the binary log file by specifying startup options (like –binlog-*). Users cannot use these options to control the databases and tables that need to be executed on the slave. Instead, use filtering on the slave to control the events that are executed on the slave. At the slave, decisions about whether to execute or ignore statements received from the master are made according to the --replicate-* options that the slave was started with. In the simplest case, when there are no --replicate-* options, the slave executes all statements that it receives from the master.

At the slave server, SQL thread is the only thread accessing replication filters at any give point of time as shown in the picture above. If the SQL thread is not active, filtering rules can be changed without any harm. Hence a new command “CHANGE REPLICATION FILTER” has been introduced which allows dynamic change of filters.

'CHANGE REPLICATION FILTER' Command:
Starting from MySQL-5.7.3, user can set one or more replication filtering rules on the slave dynamically while the server is up and running. It requires only SQL thread to be down.

The following slave replication filters can be changed dynamically using this command.

  • REPLICATE_DO_DB
  • REPLICATE_IGNORE_DB
  • REPLICATE_DO_TABLE
  • REPLICATE_IGNORE_TABLE
  • REPLICATE_WILD_DO_TABLE
  • REPLICATE_WILD_IGNORE_TABLE
  • REPLICATE_REWRITE_DB



    Syntax:
    CHANGE REPLICATION FILTER = filter[, filter][, ...]

    filter:

    REPLICATE_DO_DB = ([db_list])

    | REPLICATE_IGNORE_DB = ([db_list])

    | REPLICATE_DO_TABLE = ([tbl_list])

    | REPLICATE_IGNORE_TABLE = ([tbl_list])

    | REPLICATE_WILD_DO_TABLE = ([tbl_list])

    | REPLICATE_WILD_IGNORE_TABLE = ([wild_tbl_list])

    | REPLICATE_REWRITE_DB = ([db_pair_list])


    db_list:

    db_name[, db_name][, ...]


    tbl_list:

    tbl_name[, tbl_name][, ...]


    wild_tbl_list:

    'pattern'[, 'pattern'][, ...]


    db_pair_list:

    (db_pair)[, (db_pair)][, ...]


    db_pair:

    from_db, to_db 


    Example:

    CHANGE REPLICATION FILTER REPLICATE_DO_DB=(db1,db2);


    CHANGE REPLICATION FILTER REPLICATE_IGNORE_DB=(db1,db2);


    CHANGE REPLICATION FILTER REPLICATE_DO_TABLE=(db1.t1);


    CHANGE REPLICATION FILTER REPLICATE_IGNORE_TABLE=(db2.t2);


    CHANGE REPLICATION FILTER REPLICATE_WILD_DO_TABLE=('db.t%');


    CHANGE REPLICATION FILTER REPLICATE_WILD_IGNORE_TABLE=('db%.a%');


    CHANGE REPLICATION FILTER REPLICATE_REWRITE_DB=((from_db, to_db));

FAQ:
1) What is the equivalent way of doing --replicate-do-db=db1 start option using the command?
A) You can achieve the same using
Eg: CHANGE REPLICATION FILTER REPLICATE_DO_DB=(db1)

2) To specify db1 and db2 in replicate-ignore-db, I was using –replicate-ignore-db=db1
–replicate-ignore-db=db2. How do I achieve the same using the command?
A) You should be able to set more than one value to any rule using one command.
Eg: To put db1, db2 database names in replicate-ignore-db, You can do
CHANGE REPLICATION FILTER REPLICATE_IGNORE_DB=(db1, db2)

3) Can I change master side filter rules (--binlog-*) using this command?
A) No, this new command is only for replication filter rules on slave (--replicate-* options ).

4) What are the replication filter rules which can be changed dynamically using this command?
A) You can change all slave side replication filter rules dynamically using this command. Following is the complete list of filter rules which can be set dynamically in MySQL-5.7.3.
  • REPLICATE_DO_DB
  • REPLICATE_IGNORE_DB
  • REPLICATE_DO_TABLE
  • REPLICATE_IGNORE_TABLE
  • REPLICATE_WILD_DO_TABLE
  • REPLICATE_WILD_IGNORE_TABLE
  • REPLICATE_REWRITE_DB

5) When I am using this command, I am getting error 1896 (This operation cannot be performed with a running slave sql thread; run STOP SLAVE SQL_THREAD first.). How do I rectify it?
A) When SQL thread is active, one cannot change the filtering rules. Unlike the case with the server startup options, this statement does not require restarting the server to take effect, only that the slave SQL thread be stopped. As the error message indicates execute 'STOP SLAVE SQL_THREAD', execute CHANGE REPLICATION FILTER command and start the slave SQL thread using 'START SLAVE SQL_THREAD' to make the new rules effective.
mysql> CHANGE REPLICATION FILTER REPLICATE_DO_DB=(db1);
ERROR 1896 (HY000): This operation cannot be performed with a running slave sql thread; run STOP SLAVE SQL_THREAD first

mysql> STOP SLAVE SQL_THREAD;
Query OK, 0 rows affected (0.05 sec)

mysql> CHANGE REPLICATION FILTER REPLICATE_DO_DB=(db1);
Query OK, 0 rows affected (0.00 sec)

6) Can I change more than one replication filter rules at a time using one command?
A) Yes, You can set multiple filter rules in one command
Eg: CHANGE REPLICATION FILTER REPLICATE_DO_DB=(db1),
REPLICATE_IGNORE_TABLE=(db1.t1);

7) Are brackets ( ) compulsory to set even one value?
A) Yes, You must use '(' ')' to mention the list of values (even for one value).
Eg: CHANGE REPLICATION FILTER REPLICATE_DO_TABLE=(db1.t1)

8) How to reset the values for a particular filter rule?
A) To reset the filter value, use void brackets "()" syntax, i.e, empty list will clear the existing
values
Eg: CHANGE REPLICATION FILTER REPLICATE_WILD_IGNORE_TABLE=();
clears all the existing values for REPLICATE_WILD_IGNORE_TABLE rule.

9) I have used only REPLICATE_IGNORE_TABLE in the command. What happens to other filter rules?
A) Unspecified filter rules will be unchanged.
Eg: CHANGE REPLICATION FILTER REPLICATE_REWRITE_DB = ((db1, db2), (db3,db4)); will clear all the existing list for REPLICATION_REWRITE_DB rule and it will add two db_pairs to the list. And the rest of the filter rules will not be changed.

10) Are there any behavioral changes when compared to server startup options --replicate-*?
A) No, there are no behavioral changes when compared to server startup options.

11) Are these changes made through the new command persistent upon restarting the server?
A) No, the changes are not persistent. You have to change the values again after restarting the server. If you want these changes to be persistent, you can put them in my.cnf file.

12) What will happen if I use same rule multiple times in the same command, will they be appended together?
A) No, if any rule is specified multiple times, the latter list will be considered (the earlier list will be ignored)
Eg: CHANGE REPLICATION FILTER REPLICATION_DO_DB=(db1, db2), REPLICATE_DO_DB=(db3,db4); the first db list db1, db2 list will be ignored and the REPLICATE_DO_DB rule will be updated with the second list db3, db4.

13) I have mentioned filter rules in my.cnf and also changed the values using CHANGE REPLICATION FILTER command. Which one takes high priority?
A) The values changed through the new command will be considered and values read through my.cnf settings will be ignored completely.

14) Will the existing way of setting the replication filters work on MySQL-5.7.3?
A) Yes, there is no change in that behavior. The filter rules can be set using
  • command line parameters at the time of server startup (--replicate-* )
  • set it in my.cnf file before server startups
  • set it dynamically using CHANGE REPLICATION FILTER COMMAND while the server is up and running
15) I have a db name with special characters, non utf-8 characters, eg: db`1, dä
How do I use it in the new command?
A) You have to specify the object name exactly the same way you have created the database using create database command. Note that db_names and table_names in db_list and table_list follow the metadata syntax exactly the same way they were created.
Eg: CHANGE REPLICATION FILTER REPLICATION_DO_DB=(`db``1`, `dä`);
and 'pattern' in wild_tbl_list follows exactly the same rule as it is used in 'LIKE' clause in 'SELECT' command.
Eg: CHANGE REPLICATE FILTER REPLICATION_WILD_IGNORE_TABLE=('db`1.t1');

16) When I am using this new command, I am getting 1794 error (Slave is not configured or failed to initialize properly), What is wrong?
A) This command cannot be executed on a stand-alone server where the replication is disabled.
As the error message says, you must at least set --server-id to enable either a master or a slave.
Additional error messages can be found in the MySQL error log.

17) I have used the command to change the filter rules. Now how to check the changed filter rules?
A) You can use "Show Slave Status" command.

18) What is the new syntax to mention --replicate-rewrite-db= db1->db2.
A) You can use CHANGE REPLICATION FILTER REPLICATION_REWRITE_DB=((db1,db2));
to rewrite all db1's statements into db2's statements. Just like other rules, you can mention more than one db_pair.
Eg: CHANGE REPLICATION FILTER REPLICATION_REWRITE_DB=((db1,db2), (db3,db4)) which tells the server to rewrite all db1 statements into db2 statements and all db3 statements into
db4 statements.

19) Can I add or delete few values from the list?
A) No, this new command is used to replace the old values with the new values and there is
no provision to add /delete values from the list.

20) Are there any test scripts to have a look at the usage in detail?
A) Yes, please look at $MYSQL_HOME/mysql-test/suite/rpl/t/rpl_filter_* for more details on
the usage.

Summary:
This blog post provides a very brief insight about Filtered replication and the new command “CHANGE REPLICATION FILTER” introduced as part of MySQL-5.7.3. It has a short FAQ section about the usage of the command. I hope we have made your life simpler by avoiding few restarts of the server. Please let us know your feedback on the same. If you find any issues feel free to create bugs @ bugs.mysql.com. I cannot end this post without mentioning the contribution from Davi Arnaut (for Bug#67362) which inspired us to implement this new feature. Thanks Davi!

References: