Selective Row Replication – using Blackhole storage engine

What if you wanted only certain rows (i.e. rows of a particular type) to be replicated to a particular slave DB, and you were not interested in the rest?

Let's say we have a table with the following schema:

CREATE TABLE `invoices` (
  `id` int(11) NOT NULL PRIMARY KEY,
  `type` varchar(10) DEFAULT NULL,
  `created_date` datetime DEFAULT NULL,
  `status` varchar(10) DEFAULT NULL
) ENGINE=InnoDB;

We have different types of invoices: accounting invoices, warehouse invoices, and procurement invoices.

Let's say we wanted a SLAVE DB where only 'accounting' invoices are needed, and we are not interested in the other types.

What can we do to achieve this selective replication?

One solution: use tables with the BLACKHOLE storage engine.

Step 1:

Create another table with the same schema as invoices, but with ENGINE set to BLACKHOLE.

CREATE TABLE `invoices_blackhole` (
  `id` int(11) NOT NULL PRIMARY KEY,
  `type` varchar(10) DEFAULT NULL,
  `created_date` datetime DEFAULT NULL,
  `status` varchar(10) DEFAULT NULL
) ENGINE=BLACKHOLE;
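
Aside: the BLACKHOLE engine accepts writes but stores nothing, while the statements still make it into the binary log. A quick sanity check (a sketch; the values are illustrative):

INSERT INTO invoices_blackhole VALUES (1, 'accounting', NOW(), 'created');
SELECT COUNT(*) FROM invoices_blackhole;  -- returns 0: the row was discarded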

Step 2 (pseudocode below):

Change the application code as follows [assuming all writes go to a master DB and the slaves are read-only]:

-----------------------------
Old code:
-----------------------------

string type = 'accounting';

start transaction;
  insert into invoices values (999, type, now(), 'created');
commit;

or

string type = select type from invoices where id = 999;

start transaction;
  update invoices set status = 'processing' where id = 999;
commit;

-----------------------------
New code:
-----------------------------

string type = 'accounting';

start transaction;
  insert into invoices values (999, type, now(), 'created');
  if (type == 'accounting') {
    insert into invoices_blackhole values (999, type, now(), 'created');
  }
commit;

or

string type = select type from invoices where id = 999;

start transaction;
  update invoices set status = 'processing' where id = 999;
  if (type == 'accounting') {
    update invoices_blackhole set status = 'processing' where id = 999;
  }
commit;

// Do the same for deletes too.
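
For completeness, the delete path would look like this (a sketch in the same pseudocode style). One assumption worth stating: for the UPDATE and DELETE statements on the blackhole table to reach the binlog, statement-based binary logging is needed; with row-based logging there are no stored rows to generate events from.

string type = select type from invoices where id = 999;

start transaction;
  delete from invoices where id = 999;
  if (type == 'accounting') {
    // same guard as above: mirror only 'accounting' rows (sketch)
    delete from invoices_blackhole where id = 999;
  }
commit;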

Step 3:

On the slave (the CREATE TABLE DDL gets propagated to the slave automatically):

Now change the engines of invoices and invoices_blackhole to BLACKHOLE and InnoDB respectively. [That's correct: reverse them 🙂 ]
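
The swap is just two ALTERs (a minimal sketch; run them directly on the slave after the CREATE TABLE statements have replicated):

ALTER TABLE invoices ENGINE = BLACKHOLE;          -- discard all replicated writes
ALTER TABLE invoices_blackhole ENGINE = InnoDB;   -- keep the 'accounting' writes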

Step 4:

Now start writing to the master, and your slave will keep only the rows with type 'accounting'. Tadaaaaa 🙂
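
To verify (a sketch; note that on the slave the surviving rows live in invoices_blackhole, since that is the table backed by InnoDB there):

-- run on the slave:
SELECT type, COUNT(*) FROM invoices_blackhole GROUP BY type;  -- only 'accounting' shows up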

 

[Diagram: MySQL selective row replication using the BLACKHOLE engine]

What about transactions across BLACKHOLE and InnoDB?

Ans: In our case, if a rollback is issued, the transaction is not written to the binlog, so no worries.

Thank you for reading this post!

Any suggestions / thoughts are welcome.

How I fell in love with MySQL? Updates vs Inserts

Well, in 2011, for a week we would see disruptions in our ERP servers, especially around 7 pm.

  • We had a 'job' table and multiple app servers (a cluster) inserting jobs and also picking them up.

  • At around 7 pm, we would notice that the number of pending jobs increased dramatically, i.e. the app servers would not execute them.

  • The 'confirm shipment' job, a critical job for us, would not be executed, leaving the fate of many shipments hanging in limbo.

I did a lot of debugging around this, looking for clues. The major one was found in:

SHOW ENGINE INNODB STATUS;

TRX HAS BEEN WAITING 120 SEC FOR THIS LOCK TO BE GRANTED:

RECORD LOCKS space id 0 page no 560985 n bits 72 index `status_idx` of table `test`.`job_sandbox` trx id 0 9133886 lock_mode X locks rec but not gap waiting

It was a fight between inserts and updates on an index.

The sequence of events is available in this bug I filed, which was marked invalid. Nonetheless, it was an important lesson:

"Updates can get stuck because of inserts"

The solution for the above issue mentioned in the bug is as follows:

Old Txn:

update t7_ibu set status='service_queued' where status='service_pending';

New Txn:

// Convert the update into a select plus per-row updates.
// First get all the primary key ids:
select id from t7_ibu where status='service_pending';
// Then update each row using its primary key value:
foreach (primary_key in result_set) {
  update t7_ibu set status='service_queued' where id = primary_key;
}

Since each update is confined to a particular row, the index lock can be granted to both the insert and the update. Also note: multiple servers exist, and they all do the same thing, i.e. insert jobs and update jobs.

And this is how I fell in love with MySQL; this issue did it 🙂

Mood of MySQL

As I have mentioned on the about page of this blog, MySQL is not just some piece of code; it too has a SOUL.

Well, it has a mood too. It can even SING.

This was my Hack Day idea at Flipkart, for which we won the coolest hack award.

http://www.slideshare.net/jaihind213/mood-of-mysql

Apart from the singing, the basic intention was to provide a solution for enabling the general query log without causing system failure, since the general query log, when enabled, can eat up disk space fast.
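
For context, switching the general query log on and off at runtime takes just a few statements (a minimal sketch; the file path is illustrative):

SET GLOBAL log_output = 'FILE';                              -- or 'TABLE'
SET GLOBAL general_log_file = '/var/log/mysql/general.log';  -- illustrative path
SET GLOBAL general_log = 'ON';   -- logs every statement; the file grows fast
SET GLOBAL general_log = 'OFF';  -- turn it off before it eats the disk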

The hack day event even came up in the online media – you can access the story here

Will be putting up the code shortly.

Thank you for reading this post.

Case 1 – Making Relay Log Dump possible

Hi All,

This is my first post on this Blog 🙂

I would like to thank Ramakrishna Nalam , Shashwat Agarwal , Suthirta for inspiring me to take up this new Blog.

I am chief Surgeon at the Flipkart M*A*S*H, which is an internal blog; well, MASH 213 is now here.

At work, we had a use case for MultiSource Replication. We were using Tungsten to achieve this.

[Diagram: MultiSource replication topology]

We were stuck on the failover case: what would happen if a slave we were replicating from died? How would we continue using another slave?

Using MySQL 5.5, the hard way would be to parse the binlog: find the location of the last applied transaction in the other slave's binlog, get its position, and continue from there. The binlog format (statement / row / mixed) had to be the same across all slaves, else this could be tough.
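
That manual hunt looks roughly like this (a sketch; the log file name and position are illustrative):

SHOW BINLOG EVENTS IN 'mysql-bin.000017' FROM 4 LIMIT 50;
-- scan the events for the last transaction the failed feed had applied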

Since we were using Tungsten for MultiSource Replication, and Tungsten feeds off a binlog, I thought:

hey, the slave's relay log = the master's binlog (the slave copies over the master's binlog and keeps it as its relay log),

so why not replicate from the relay log of a slave instead of its binlog? All slaves would have similar relay logs.

On replication failure from a slave, Tungsten would tell us the last successful position it had applied in the log file (in this case, the relay log of the failed slave). Parsing that relay log with the mysqlbinlog utility, we could go to this position and get the transaction and its transaction ID (XID).

Since Tungsten would now replicate from the relay log of the new slave, and the new slave's relay log is the same as the failed slave's, the new slave would have that XID in its relay log; we could narrow down to the corresponding position and continue replication from there on this slave.
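
The position lookup on the new slave can be sketched like this (the log name and position are illustrative; SHOW RELAYLOG EVENTS is available from MySQL 5.5):

SHOW RELAYLOG EVENTS IN 'relay-bin.000042' FROM 4 LIMIT 50;
-- find the Xid event matching the failed slave's last applied transaction,
-- then resume replication just after that position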

We could even start replicating from the master's binlog (i.e. ask Tungsten to feed off the binlog instead of the relay log), and when the slave came back up, we could make Tungsten switch back to the slave's relay log.

I have modified Percona Server to add Relay Log Dump capability. [Client included for testing this out]

It's available at https://github.com/jaihind213/mash213

Do visit the code. This is my first patch to MySQL 🙂 though not included in the Oracle/Percona source tree.

All in all – my first surgery at this Unit.

Thank you for reading this post. You can reach me with any doubts at -> jaihind213@ gmail dot com