Thursday, June 9, 2016

High availability using galera cluster on centos 6.5

 


MariaDB Galera Cluster 10.0.12 Stable has been released and available for production use. MariaDB is a relational database management system (RDBMS). Generally we use single node of database server for small application but think about application which have thousands of users keep online at a time, In that situation we need a structure which will capable to handle this load and provides high availability. So we need to add multiple database servers interconnected with each other and keep synchronized, so in case any server goes down other servers can take place of them and provide services to users.

This article will help you to set up MariaDB Galera Cluster 10.0.12 with 3 nodes running with CentOS 6.5. Cluster server details are as following.

Cluster DB1: 10.0.0.10
Cluster DB2: 10.0.0.20
Cluster DB3: 10.0.0.30

Note: Step 1/2/3 has to be done on all cluster nodes and remaining steps are node specific.

Step 1: Add MariaDB Repositories

Create a mariadb repository /etc/yum.repos.d/mariadb.repo using following content in your system. Below repository will work on CentOS 6.x systems, For other system use repository generation tool and add to your system.

For CentOS 6 – 64bit

[mariadb]
name = MariaDB
baseurl = http://yum.mariadb.org/10.0/centos6-amd64
gpgkey=https://yum.mariadb.org/RPM-GPG-KEY-MariaDB
gpgcheck=1

For CentOS 6 – 32bit

[mariadb]
name = MariaDB
baseurl = http://yum.mariadb.org/10.0/centos6-x86
gpgkey=https://yum.mariadb.org/RPM-GPG-KEY-MariaDB
gpgcheck=1

Step 2: Install MariaDB and Galera

Before installing MariaDB Galera cluster packages, remove any existing MySQL or MariaDB packages installed on system. After that use following command to install on all nodes.

# yum install MariaDB-Galera-server MariaDB-client galera

Step 3: Initial MariaDB Configuration

After  installing packages in above steps do the some basic MariaDB configurations.

# service mysql start

# mysql_secure_installation

After that create a user in MariaDB on all nodes, which can access database from your network in cluster.

# mysql -u root -p

MariaDB [(none)]> GRANT ALL PRIVILEGES ON *.* TO 'cluster'@'%' IDENTIFIED BY 'password' WITH GRANT OPTION;
MariaDB [(none)]> FLUSH PRIVILEGES;
MariaDB [(none)]> exit

and stop MariaDB service before starting cluster configuration

# service mysql stop

Step 4: Setup MariaDB Galera Cluster on DB1

Lets start setup MariaDB Galera cluster from DB1 server. Edit MariaDB server configuration file and add following values under [mariadb] section.

[root@db1 ~]# vim /etc/my.cnf.d/server.cnf
query_cache_size=0
binlog_format=ROW
default_storage_engine=innodb
innodb_autoinc_lock_mode=2
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_cluster_address="gcomm://10.0.0.10,10.0.0.20,10.0.0.30"
wsrep_cluster_name='cluster1'
wsrep_node_address='10.0.0.10'
wsrep_node_name='db1'
wsrep_sst_method=rsync
wsrep_sst_auth=cluster:password

Start cluster using following command.

[root@db1 ~]# /etc/init.d/mysql bootstrap

Bootstrapping the clusterStarting MySQL.... SUCCESS!

If you get any problem during startup check MariaDB error log file /var/lib/mysql/.err

Step 5: Add DB2 in MariaDB Cluster

After successfully starting cluster on DB1. Start configuration on DB2. Edit MariaDB server configuration file and add following values under [mariadb] section. All the settings are similar to DB1 except wsrep_node_address, wsrep_cluster_address and wsrep_node_name.

[root@db2 ~]# vim /etc/my.cnf.d/server.cnf

query_cache_size=0
binlog_format=ROW
default_storage_engine=innodb
innodb_autoinc_lock_mode=2
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_cluster_address="gcomm://10.0.0.10,10.0.0.20,10.0.0.30"
wsrep_cluster_name='cluster1'
wsrep_node_address='10.0.0.20'
wsrep_node_name='db2'
wsrep_sst_method=rsync
wsrep_sst_auth=cluster:password

Start cluster using following command.

[root@db2 ~]# /etc/init.d/mysql start
Starting MySQL..... SUCCESS!

Step 6: Add DB3 in MariaDB Cluster

[root@db3 ~]# vim /etc/my.cnf.d/server.cnf
query_cache_size=0
binlog_format=ROW
default_storage_engine=innodb
innodb_autoinc_lock_mode=2
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_cluster_address="gcomm://10.0.0.10,10.0.0.20,10.0.0.30"
wsrep_cluster_name='cluster1'
wsrep_node_address='10.0.0.30'
wsrep_node_name='db2'
wsrep_sst_method=rsync
wsrep_sst_auth=cluster:password
Start cluster using following command.

[root@db3 ~]# /etc/init.d/mysql start
Starting MySQL..... SUCCESS!

Step 7: Test MariaDB Galera Cluster Setup

Now the cluster setup has been completed and running properly. Now you can test the cluster setup by creating database and tables at any server in cluster, it will replicate immediately to all servers in cluster.

We can also add a load balancer either haproxy or maxscale for scaling purpose.https://mysqlwonders.blogspot.com/2016/06/high-availability-using-galera-cluster.htmlhttps://mysqlwonders.blogspot.com/2016/06/high-availability-using-galera-cluster.html

Maxscale load blanacer with centos 6.5



Installing MaxScale

When logged in as root, create an installation directory for MaxScale, download it and install it:

# cd /usr/local
# mkdir skysql
# cd skysql
# curl https://downloads.skysql.com/files/SkySQL/MaxScale/maxscale.preview.0.4.tar.gz > maxscale.preview.0.4.tar.gz
# tar xvfz maxscale.preview.0.4.tar.gz
# cd maxscale

Now we have MaxScale downloaded, but for maxscale to work, the MAXSCALE_HOME environment variable has to be set up, and to achieve this we, let's create a shell-script that starts MaxScale for us.

 # vi maxstart.sh
Then set up this script like this:
#!/bin/bash
#
export MAXSCALE_HOME=/usr/local/skysql/maxscale/MaxScale
$MAXSCALE_HOME/../bin/maxscale
Once we have that script, let's make it executable:
 # chmod +x maxstart.sh

Configuring MaxScale

The next step then is to configure MaxScale. The default configuration file for MaxScale is called MaxScale.cnf and is located in the etc directory under where MAXSCALE_HOME is located.

 # vi /usr/local/skysql/maxscale/MaxScale/etc/MaxScale.cnf

In this file, each module has it's own section, as well as each server and there is also a section for MaxScale itself. Let's begin with MaxScale which has just one variables that controls the # of threads 

MaxScale uses:
[maxscale]
threads=1

Following this, we set up the servers we are to manage here, of which there are three. For every configuration section, except the one for MaxScale core, we have to tell what type of entity we are defining here, in this case it is server:

[srv1]
type=server
address=10.0.0.1
port=3306
protocol=MySQLBackend

[srv2]
type=server
address=10.0.0.2
port=3306
protocol=MySQLBackend

[srv3]
type=server
address=10.0.0.3
port=3306
protocol=MySQLBackend

 In addition we define which protocol module to use, and in this case there is not much else than MySQLBackend to choose from. As you can see, we do not define the master or slave properties of the servers, instead we let MaxScale figure that out for us by using a Monitor module, so now is the time to define that.

For the monitor to work, it will connect to the respective MySQL servers so we need to define a username and password for this. In the previous post I created a user for this called maxuser using the password maxpwd. Also, to simplify matters I created this user with full privileges, 

[MySQL Monitor]

type=monitor
module=mysqlmon
servers=srv1,srv2,srv3
user=maxuser
passwd=maxpwd

The router we are about to use is the read-write split router, which manages routing writes to the master and reads to the slaves. Let's start with the router that is the central point here:

[RWSplitRouter]
type=service
router=readwritesplit
servers=srv1,srv2,srv3
user=maxuser
passwd=maxpwd

The reason we need to have a username and password even for the router is that this needs to authenticate users connecting to MaxScale itself, and to do that it needs to connect to the server it manages, connect to the mysql database and get the user authorization data from there, which is why we need an account that can access the mysql database.

Now we are close to getting started with MaxScale, what is left is to set up a listener for the router we defined above:

[RWSplitListener]
type=listener
service=RWSplitRouter
protocol=MySQLClient
port=3306

That concludes the configuration of MaxScale!
Testing MaxScale

 # ./maxstart.sh

Now we should be able to connect to MaxScale and test that it works. We do this by setting up a MariaDB Client server. 

This is set up just like our previous servers, only that we only install the MariaDB client. So set up a server like before, set IPADDR to 10.0.0.4 and run, as root:

 # yum install MariaDB-client

I here assume that you have created the t1 table that we used to test replication in the previous post, if not, create it and populate it now:

# mysql -h 10.0.0.5 -u maxuser -pmaxpwd test -e "create table t1(c1 int)"

# mysql -h 10.0.0.5 -u maxuser -pmaxpwd test -e "insert into t1 values(1)"

Now we have a table to test with, so let's see what happens, and let's check that our SELECTs are being routed to our two slaves. We can do that by using the @@server_id variable, which is different
one these two of course:

# mysql -h 10.0.0.5 -u maxuser -pmaxpwd test -e "select c1, @@server_id from t1"
+------+-------------+
| c1  | @@server_id |
+------+-------------+
|  1 |  10 |
+------+-------------+
Cool, this ended up with one of the slaves, so if I try it again, it shoudl end up with the other slave, right:

# mysql -h 10.0.0.5 -u maxuser -pmaxpwd test -e "select c1, @@server_id from t1"
+------+-------------+
| c1  | @@server_id |
+------+-------------+
|  1 |  11 |
+------+-------------+

And if you didn't get the MaxScale to work, like seeing this in the error log:

 /usr/local/skysql/maxscale/MaxScale/log/skygw_err1.log:

Couldn't find suitable Slave from 3 candidates.

Or if you can't connect to MaxScale, then the most likely issue is that you didn't stop iptables:

 # /etc/init.d/iptables stop

All we done here for setting up some basic load balancing !

Tuesday, May 24, 2016

Innodb Performance Variables


1.innodb_buffer_pool_sizeInnoDB depend heavily on the buffer pool and should be set correctly. Typically a good value is 70%-80% of available memory. More precisely, if you have RAM bigger than your dataset setting it bit larger should be appropriate with that keep in account of your database growth and re-adjust innodb buffer pool size accordingly. 

2.innodb_buffer_pool_instances: Multiple innodb buffer pools introduced in InnoDB 1.1 and MySQL 5.5. In MySQL 5.5 the default value for it was 1 which is changed to 8 as new default value in MySQL 5.6. Minimum innodb_buffer_pool_instances should be lie between 1 (minimum) & 64 (maximum). Enabling innodb_buffer_pool_instances is useful in highly concurrent workload as it may reduce contention of the global mutexes.

3.Dump/Restore Buffer Pool: This feature speed up restarts by saving and restoring the contents of the buffer pool. This feature is first introduced in Percona Server 5.5. Oracle MySQL also introduced it in version 5.6, To automatically dump the database at startup and shutdown set innodb_buffer_pool_dump_at_shutdown&innodb_buffer_pool_load_at_startup parameters to ON

4.innodb_log_file_size: Large enough InnoDB transaction logs are crucial for good, stable write performance. But also larger log files means that recovery process will slower in case of crash. However this is not such big issue since great improvements in 5.5. Default value has been changed in MySQL 5.6 to 50 MB from 5 MB (old default), but it’s still too small size for many workloads. Also, in MySQL 5.6, if innodb_log_file_size is changed between restarts then MySQL will automatically resize the logs to match the new desired size during the startup process. Combined log file size is increased to almost 512 GB in MySQL 5.6 from 4 GB. 

5.innodb_log_buffer_size: Innodb writes changed data record into lt’s log buffer, which kept in memory and it saves disk I/O for large transactions as it not need to write the log of changes to disk before transaction commit. 4 MB – 8 MB is good start unless you write a lot of huge blobs.

6.innodb_flush_log_at_trx_commit: When innodb_flush_log_at_trx_commit is set to 1 the log buffer is flushed on every transaction commit to the log file on disk and provides maximum data integrity but it also has performance impact. Setting it to 2 means log buffer is flushed to OS file cache on every transaction commit. The implication of 2 is optimal and improve performance if you are not concerning ACID and can lose transactions for last second or two in case of OS crashes. 

7.innodb_thread_concurrency: It is recommended to allow the engine to control the concurrency by keeping it to default value (which is zero). If you see concurrency issues, you can tune this variable. A recommended value is 2 times the number of CPUs plus the number of disks. It’s dynamic variable means it can set without restarting MySQL server.

8.innodb_flush_method: DIRECT_IO relieves I/O pressure. Direct I/O is not cached, If it set to O_DIRECT avoids double buffering with buffer pool and filesystem cache. Given that you have hardware RAID controller and battery-backed write cache.

9.innodb_file_per_table: innodb_file_per_table is ON by default from MySQL 5.6. This is usually recommended as it avoids having a huge shared tablespace and as it allows you to reclaim space when you drop or truncate a table. 

Covered Index optimization using Explain format=json


   Let's start with an example

select City.name as city, Country.name as country, 
group_concat(Language)
from City join CountryLanguage using(CountryCode)
join Country
where City.CountryCode=Country.Code and Continent = 'North America' 
and District='St George'
group by City.name, Country.Name;


Using traditional Explain Keyword show us

mysql> explain select City.name as city, Country.name as country, 
group_concat(Language) from City join CountryLanguage 
using(CountryCode) join Country 
where City.CountryCode=Country.Code 
and Continent = 'North America' 
and District='St George' group by City.name, Country.NameG
*************************** 1. row *************************** id: 1 select_type: SIMPLE table: Country partitions: NULL type: ALL possible_keys: PRIMARY key: NULL key_len: NULL ref: NULL rows: 239 filtered: 14.29 Extra: Using where; Using temporary; Using filesort *************************** 2. row *************************** id: 1 select_type: SIMPLE table: City partitions: NULL type: ref possible_keys: CountryCode key: CountryCode key_len: 3 ref: world.Country.Code rows: 18 filtered: 10.00 Extra: Using where *************************** 3. row *************************** id: 1 select_type: SIMPLE table: CountryLanguage partitions: NULL type: ref possible_keys: PRIMARY,CountryCode key: CountryCode key_len: 3 ref: world.Country.Code rows: 4 filtered: 100.00 Extra: Using index 3 rows in set, 1 warning (0.00 sec)

EXPLAIN FORMAT=JSON will tell us to which columns we should add covered index


mysql> explain format=json select City.name as city, Country.name as country, group_concat(Language) from City join CountryLanguage using(CountryCode) join Country where City.CountryCode=Country.Code and Continent = 'North America' and District='St George' group by City.name, Country.NameG

*************************** 1. row ***************************

EXPLAIN: {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "927.92"
    },
        {
          "table": {
            "table_name": "City",
            "access_type": "ref",
            "possible_keys": [
              "CountryCode"
            ],
            "key": "CountryCode",
            "used_key_parts": [
              "CountryCode"
            ],
            "key_length": "3",
            "ref": [
              "world.Country.Code"
            ],
            "rows_examined_per_scan": 18,
            "rows_produced_per_join": 63,
            "filtered": "10.00",
            "cost_info": {
              "read_cost": "630.74",
              "eval_cost": "12.61",
              "prefix_cost": "810.68",
              "data_read_per_join": "4K"
            },
            "used_columns": [
              "ID",
              "Name",
              "CountryCode",
              "District"
            ],
            "attached_condition": "(`world`.`City`.`District` = 'St George')"
          }
        },

Watch the used columns values.Will use this on covered Index.

mysql> alter table City add index cov(CountryCode, District, Name);
Query OK, 0 rows affected (2.74 sec)
Records: 0  Duplicates: 0  Warnings: 0


mysql> explain format=json select City.name as city, 
Country.name as country, 
group_concat(Language) 
from City join CountryLanguage 
using(CountryCode) join 
Country where City.CountryCode=Country.Code 
and Continent = 'North America' 
and District='St George' group by City.name, Country.NameG
*************************** 1. row ***************************
EXPLAIN: {   "query_block": {     "select_id": 1,     "cost_info": {       "query_cost": "296.28"     },             {           "table": {             "table_name": "City",             "access_type": "ref",             "possible_keys": [               "CountryCode",               "cov"             ],             "key": "cov",             "used_key_parts": [               "CountryCode",               "District"             ],             "key_length": "23",             "ref": [               "world.Country.Code",               "const"             ],             "rows_examined_per_scan": 2,             "rows_produced_per_join": 100,             "filtered": "100.00",             "using_index": true,             "cost_info": {               "read_cost": "34.65",               "eval_cost": "20.19",               "prefix_cost": "108.64",               "data_read_per_join": "7K"             },             "used_columns": [               "ID",               "Name",               "CountryCode",               "District"             ]           }         },

     Exammine the metrics:

  • query_cost – 296.28 for the indexed table against 927.92 (smaller is better)
  • rows_examined_per_scan – 2 versus 18 (smaller is better)
  • filtered – 100 versus 10 (bigger is better)
  • cost_info – read_cost  and prefix_cost  for the indexed table are smaller than when not indexed, which is better. However,  eval_cost  and  data_read_per_join  are bigger. But since we read nine times less rows overall, the cost is still better.

Mysql Buffer Variables



connect_timeout
The number of seconds the mysqld server is waiting for a connect packet before responding with Bad handshake.

join_buffer
The size of the buffer that is used for full joins (joins that do not use indexes). The buffer is allocated one time for each full join between two tables. Increase this value to get a faster full join when adding indexes is not possible. (Normally the best way to get fast joins is to add indexes.)

key_buffer
Index blocks are buffered and are shared by all threads. key_buffer is the size of the buffer used for index blocks. You might want to increase this value when doing many DELETE or INSERT operations on a table with lots of indexes. 

max_allowed_packet
The maximum size of one packet. The message buffer is initialized to net_buffer_length bytes, but can grow up to max_allowed_packet bytes when needed. This value by default is small to catch big (possibly wrong) packets. You must increase this value if you are using big BLOB columns. It should be as big as the biggest BLOB you want to use.

max_connections
The number of simultaneous clients allowed. Increasing this value increases the number of file descriptors that mysqld requires. 

max_connect_errors
If there is more than this number of interrupted connections from a host this host will be blocked for further connections. You can unblock a host with the command FLUSH HOSTS.

max_join_size
Joins that are probably going to read more than max_join_size records return an error. Set this value if your users tend to perform joins without a WHERE clause that take a long time and return millions of rows.

max_sort_length
The number of bytes to use when sorting BLOB or TEXT values (only the first max_sort_length bytes of each value are used; the rest are ignored).

net_buffer_length
The communication buffer is reset to this size between queries. This should not normally be changed, but if you have very little memory, you can set it to the expected size of a query. (That is, the expected length of SQL statements sent by clients. If statements exceed this length, the buffer is automatically enlarged, up to max_allowed_packet bytes.)

record_buffer
Each thread that does a sequential scan allocates a buffer of this size for each table it scans. If you do many sequential scans, you may want to increase this value.

sort_buffer
Each thread that needs to do a sort allocates a buffer of this size. Increase this value for faster ORDER BY or GROUP BY operations. 

table_cache
The number of open tables for all threads. Increasing this value increases the number of file descriptors that mysqld requires. MySQL needs two file descriptors for each unique open table. See below for comments on file descriptor limits. 

tmp_table_size
If a temporary table exceeds this size, MySQL generates an error of the form The table tbl_name is full. Increase the value of tmp_table_size if you do many advanced GROUP BY queries.

thread_stack
The stack size for each thread. Many of the limits detected by the crash-me test are dependent on this value. The default is normally large enough.

wait_timeout

The number of seconds the server waits for activity on a connection before closing it.