MySQL Aurora CPU spikes

Recently we encountered an issue with Aurora, where our CPU started spiking every X minutes (X < 30).


We had around 20-30 connections, the majority of which were running INSERTs, with a few SELECTs.

Every time a spike occurred, our INSERTs would start waiting and waiting…
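For anyone debugging something similar, the standard first checks apply from SQL (a generic sketch, nothing Aurora-specific here):

-- What is each connection doing, and in what state is each INSERT?
SHOW FULL PROCESSLIST;

-- Dig into InnoDB internals: semaphore waits, latch contention, lock waits
SHOW ENGINE INNODB STATUS\G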

Here is the innotop output from the time of a spike (screenshot).

We checked our code; nothing major had changed.

The MySQL error log had this mysterious message, which I could not trace in the MySQL/Percona/MariaDB source:

“Innodb: Retracted crabbing in btr_cur_search_to_nth_level 33001 times since restart”

If you get to know the meaning of this, please do let me know; I would love to understand it.


Finally, we contacted the Aurora support team; after an investigation on their side, it turned out to be an issue in Aurora itself.

I am not too sure whether the error log message above had any impact on this case.

PS: If you are wondering why this post ended abruptly, even I am a little surprised by the ending.


Should I worry about the Query Cache in Aurora?

There are a lot of blog posts on the internet warning you against using the Query Cache in MySQL.

I was surprised to see that the query cache was enabled in Aurora.

(Screenshot: the query cache size on a 'db.r3.large' instance.)

On a 'db.r3.2xlarge' instance, query_cache_size was set to 2460900352, i.e. about 2.4 GB.

I am not sure if Amazon has done something to improve the query cache implementation.

So, do run tests with Aurora and see if the cache suits you.
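If you want to check for yourself, the relevant settings and counters are visible from plain SQL (standard MySQL commands; they work on Aurora too):

-- How the query cache is configured on this instance
SHOW VARIABLES LIKE 'query_cache%';

-- How it is behaving under your workload: hits vs. inserts vs. prunes
SHOW GLOBAL STATUS LIKE 'Qcache%';

A low ratio of Qcache_hits to Com_select, or a lot of Qcache_lowmem_prunes, would suggest the cache is not paying for itself.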


A wild supposition: can MySQL be Kafka?

This is an idea which I presented at Percona Live 2015.

Is MySQL an avatar of Apache Kafka?

Can it be Kafka?

Yes, it can.

This talk takes a shot at modeling MySQL as Kafka.
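The gist of the mapping, roughly: the binlog is an append-only, ordered event log, much like a Kafka topic, and each replica behaves like a consumer tracking its own offset into it. Both halves are visible from plain SQL:

-- The "topic": an ordered series of binlog files (log segments)
SHOW BINARY LOGS;

-- The head of the log: current end-of-log position (file + offset)
SHOW MASTER STATUS;

-- The "consumer": a replica remembers how far it has read and applied,
-- e.g. Relay_Master_Log_File / Exec_Master_Log_Pos act like a committed offset
SHOW SLAVE STATUS\G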

PS: it’s unconventional, hence a WILD supposition 🙂

Slides @ http://www.slideshare.net/jaihind213/can-mysql-bekafka

I made a small video for this presentation too.

MySQL Cluster – Java Connector / Bindings

While working with MySQL Cluster, I was looking for a monitoring framework for the cluster.

I came across a library @ https://launchpad.net/ndb-bindings, which has Java and other connectors to NDB; the library is a wrapper around the existing C++ NDB API.

This library allowed me to connect to the management node, get the state of the cluster, and get real-time notifications about heartbeat misses and node disconnections.

The library errored out under some conditions; with a small fix, it works with MySQL Cluster 7.3.

https://github.com/jaihind213/mysql-cluster-ndb-bindings

I have listed the steps for compiling and running a sample program on GitHub.

My talk @ Percona Live 2014

I will be giving a lightning talk at Percona Live 2014 in Santa Clara 🙂

In the world of replication, the binlog is everyone's favorite.

But what about the relay log? Don't you feel it can do more?

Yes, it can. How about it serving the purpose of the GTID feature, without GTID?

https://www.percona.com/live/mysql-conference-2014/sessions/relay-log-unsung-hero

#MySQL (alternative to GTID) #gtid

Having a log table?

I came across a scenario where we were concerned about the size of a table: a 134 GB table, pretty huge and growing fast.
The nature of the table: it holds logs, sort of an audit table.

Some fundamental questions around this table’s use:

1) Would it have a lot of reads? Answer: No
2) Is it always insert-only? Answer: Yes
3) Current strategy to restrict size? Answer: Delete data from the table.
4) Does one need a backup of the table, i.e. is it very important? Answer: No

Well, based on the above, it would seem that logging is the theme.

Schema of the table:

CREATE TABLE `SystemLogs` (
  `column1` varchar(255) DEFAULT NULL,
  `column2` varchar(255) DEFAULT NULL,
  `column3` varchar(255) DEFAULT NULL,
  `column4` varchar(255) DEFAULT NULL,
  `column5` text,
  `column6` text,
  `timestamp` datetime DEFAULT NULL,
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `createdAt` datetime NOT NULL,
  `updatedAt` datetime NOT NULL,
  `column7` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_1` (`column1`,`column2`,`column3`)
) ENGINE=InnoDB

Some strategies to mitigate the concern:

Strategy-1
———-

Based on the above schema, most columns are TEXT/VARCHAR, which makes a good case for ROW COMPRESSION. This could be employed to reduce the size of the table on disk.

The results of compression: yet to run some tests.
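A minimal sketch of the change to test, assuming a MySQL 5.5/5.6-era server (KEY_BLOCK_SIZE is a tuning knob; 8 is just a common starting point):

-- Prerequisites for the compressed row format on these versions
SET GLOBAL innodb_file_per_table = ON;
SET GLOBAL innodb_file_format = 'Barracuda';

-- Rebuild the table with compressed 8 KB pages
ALTER TABLE SystemLogs ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8;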
Strategy-2
———–

Since the table is used to hold logs, why not employ a log file to hold them?

In the MySQL world, the binlog can do just that: it is an audit log of all changes happening on the server.

Well, one would ask: binlog logging was already enabled (there is a master/slave arrangement), so is that not what we wanted to achieve?

Yes, but only partially. The size of the table on disk is the concern, and it does not decrease just because the binlog is enabled.

Here comes the change that resolves the concern:

ALTER TABLE SystemLogs ENGINE=BLACKHOLE;

So when you insert into the table, the entry is still made to the binlog, while the size of the table on disk stays at 0 bytes.
If one wishes, one can reconstruct the table on another box (with its engine set to InnoDB) by applying the binlog,
or

you can simply have a slave where the table uses the InnoDB engine, and the table gets reconstructed there.
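A minimal sketch of that arrangement (standard MySQL; one caveat: with row-based binlog format, UPDATEs and DELETEs against a BLACKHOLE table match no rows and replicate nothing, which is harmless for an insert-only log table):

-- On the master: inserts are discarded locally but still written to the binlog
ALTER TABLE SystemLogs ENGINE=BLACKHOLE;

-- On the slave: keep the table as InnoDB, so the replicated inserts
-- materialize there and the log history accumulates
ALTER TABLE SystemLogs ENGINE=InnoDB;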


Selective Row Replication – using Blackhole storage engine

What if you wanted only certain rows (i.e. rows of a particular type) to be replicated to a particular slave DB, and you were not interested in the rest?

Let's say we have a table with this schema:

CREATE TABLE `invoices` (
  `id` int(11) PRIMARY KEY NOT NULL,
  `type` varchar(10) DEFAULT NULL,
  `created_date` datetime DEFAULT NULL,
  `status` varchar(10) DEFAULT NULL
) ENGINE=InnoDB

We have different types of invoices: accounting invoices, warehouse invoices, and procurement invoices.

Let's say we are interested in having a SLAVE DB where only 'accounting' invoices are needed, and we are not interested in the other types.

What can we do to achieve this selective replication?

One solution: use tables with the BLACKHOLE engine.

Step 1:

Create another table with the same schema as invoices, but with its engine set to BLACKHOLE.

CREATE TABLE `invoices_blackhole` (
  `id` int(11) PRIMARY KEY NOT NULL,
  `type` varchar(10) DEFAULT NULL,
  `created_date` datetime DEFAULT NULL,
  `status` varchar(10) DEFAULT NULL
) ENGINE=BLACKHOLE

Step 2 (pseudocode below):

Change the application code as follows [assuming all writes go to a master DB and slaves are read-only]:

-----------------------------

Old code:

-----------------------------

string type = 'accounting';

START TRANSACTION;
  INSERT INTO invoices VALUES (999, type, NOW(), 'created');
COMMIT;

or

string type = SELECT type FROM invoices WHERE id = 999;

START TRANSACTION;
  UPDATE invoices SET status = 'processing' WHERE id = 999;
COMMIT;

-----------------------------

New code:

-----------------------------

string type = 'accounting';

START TRANSACTION;
  INSERT INTO invoices VALUES (999, type, NOW(), 'created');
  if (type == 'accounting') {
    INSERT INTO invoices_blackhole VALUES (999, type, NOW(), 'created');
  }
COMMIT;

or

string type = SELECT type FROM invoices WHERE id = 999;

START TRANSACTION;
  UPDATE invoices SET status = 'processing' WHERE id = 999;
  if (type == 'accounting') {
    UPDATE invoices_blackhole SET status = 'processing' WHERE id = 999;
  }
COMMIT;

// do the same for DELETEs too.

Step 3:

On the slave (the CREATE TABLE DDL gets propagated to the slave, so both tables exist there):

Now change the engines of invoices and invoices_blackhole to BLACKHOLE and InnoDB respectively. [That's correct: reverse them 🙂]
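Concretely, that step on the slave is just two ALTER statements (a sketch, using the table names above):

-- On the slave: swap the engines relative to the master
ALTER TABLE invoices ENGINE=BLACKHOLE;          -- the full invoice stream is discarded here
ALTER TABLE invoices_blackhole ENGINE=InnoDB;   -- the 'accounting' rows materialize here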

Step 4:

Now start writing to the master, and your slave will have only rows with type 'accounting'. Tadaaaaa 🙂

 

(Diagram: MySQL selective row replication using the BLACKHOLE engine.)

What about transactions that span BLACKHOLE and InnoDB tables?

Answer: in our case, if a rollback is issued, the transaction is not written to the binlog, so no worries.

Thank you for reading this post!!

Any suggestions / thoughts are welcome.

Case 1 – Making Relay Log Dump possible

Hi All,

This is my first post on this blog 🙂

I would like to thank Ramakrishna Nalam, Shashwat Agarwal, and Suthirta for inspiring me to take up this new blog.

I am chief surgeon at the Flipkart M*A*S*H, which is an internal blog; well, MASH 213 is now here.

At work, we had a use case for multi-source replication. We were using Tungsten to achieve this.

(Topology diagram.)

We were stuck on the failover case: what would happen if the slave we were replicating from died? How would we continue from another slave?

Using MySQL 5.5, the hard way would be to parse the binlog of another slave, find the location of the last applied transaction, get its position, and continue from there. The binlog format (statement/row/mixed) would have to be the same across all slaves, else this gets tough.

Since we were using Tungsten for multi-source replication, and Tungsten feeds off a binlog, I thought:

hey, the slave's relay log is essentially the master's binlog (the slave copies over the master's binlog and keeps it as its relay log),

so why not replicate from the relay log of a slave instead of its binlog? All slaves would then have similar relay logs.

On replication failure from a slave, Tungsten tells us the last successful position it applied in the log file (in this case, the relay log of the failed slave). Parsing that relay log with the mysqlbinlog utility, we can go to this position and get the transaction and its transaction ID (XID).

Since Tungsten now replicates from the relay log of the new slave, and the relay logs of the new slave and the failed slave are the same, the new slave has that XID in its relay log; we can narrow down to the corresponding position and continue replication from there on this slave.
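Incidentally, both logs can also be inspected from plain SQL; the file names below are examples, and FROM takes a byte offset (events start at position 4):

-- Inspect events in a slave's relay log
SHOW RELAYLOG EVENTS IN 'slave-relay-bin.000002' FROM 4 LIMIT 20;

-- The master-binlog view of the same stream, for comparison
SHOW BINLOG EVENTS IN 'mysql-bin.000003' FROM 4 LIMIT 20;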

We could even start replicating from the master's binlog (i.e. ask Tungsten to feed off the binlog instead of the relay log), and when the slave comes back up, make Tungsten switch back to the slave's relay log.

I have modified Percona Server to add relay log dump capability. [A client is included for testing this out.]

Its available @ https://github.com/jaihind213/mash213

Do visit the code. This is my first patch to MySQL 🙂 though it is not included in the Oracle/Percona source tree.

All in all, my first surgery at this unit.

Thank you for reading this post. You can reach me with any doubts at jaihind213 @ gmail dot com.