Having a log table?

I came across a scenario where we were concerned about the size of a table: at 134G it is pretty huge and growing fast.
The nature of the table: it contains logs, a sort of audit table.

Some fundamental questions around this table’s use:

1) Would it have a lot of reads? Answer: No.
2) Is it insert-only? Answer: Yes.
3) What is the current strategy to restrict its size? Answer: Deleting data from the table.
4) Does one need a backup of the table, i.e. is it very important? Answer: No.

Well, based on the above, it would seem that logging is the theme.

Schema of the table:

CREATE TABLE `SystemLogs` (
  `column1` varchar(255) DEFAULT NULL,
  `column2` varchar(255) DEFAULT NULL,
  `column3` varchar(255) DEFAULT NULL,
  `column4` varchar(255) DEFAULT NULL,
  `column5` text,
  `column6` text,
  `timestamp` datetime DEFAULT NULL,
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `createdAt` datetime NOT NULL,
  `updatedAt` datetime NOT NULL,
  `column7` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_1` (`column1`,`column2`,`column3`)
) ENGINE=InnoDB

Some strategies to mitigate the concern:

Strategy-1
———-

Based on the above schema, most columns are TEXT / VARCHAR, which is a good case for

row compression. InnoDB table compression could be employed, which would reduce the size of the table on disk.

The results of compression: yet to run some tests.
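As a sketch of what that would look like (the KEY_BLOCK_SIZE of 8 is an assumed starting point, not a tested value; older MySQL versions also require innodb_file_per_table=ON and innodb_file_format=Barracuda):

```sql
-- Enable InnoDB compression on the log table.
ALTER TABLE SystemLogs
  ROW_FORMAT=COMPRESSED
  KEY_BLOCK_SIZE=8;  -- 8KB compressed pages; 4 compresses more but costs more CPU
```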
 
Strategy-2
———–

Since the table is used to hold logs, why not employ a log file to hold them?
 
In the MySQL world, the binlog can do just that: it is a log of all changes happening on the server.

Well, one would ask: if binlog logging is enabled, which was the case here (there is a master/slave arrangement),

then is that what we wanted to achieve?

Yes, but only partially: the size of the table on disk is the concern, and it does not decrease just because the binlog is enabled.

Here comes the change that resolves the concern:

alter table SystemLogs engine=BLACKHOLE;

Now, when you insert into the table, the entry is written to the binlog while the size of the table on disk stays at 0 bytes.
 
If one wishes, one can reconstruct the table on another box (with its engine set to InnoDB) by applying the binlog,
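As a rough sketch of that replay (the host, user, database, and binlog file names below are placeholders, not from the original setup), the binlog can be piped through the mysqlbinlog client:

```shell
# Replay the master's binlog on another box where SystemLogs
# was created with ENGINE=InnoDB. All names are placeholders.
mysqlbinlog --database=mydb /var/lib/mysql/binlog.000042 \
  | mysql -h otherhost -u appuser -p mydb
```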
 
or

you can simply have a slave where the table uses the InnoDB engine, so the table gets reconstructed there.
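One wrinkle with the slave approach, sketched below: the `ALTER ... ENGINE=BLACKHOLE` would itself replicate and flip the slave's copy too, so one way (an assumption on my part, not stated in the original setup) is to run it on the master with binary logging disabled for the session:

```sql
-- On the master: convert the table without logging the ALTER,
-- so the slave's copy of SystemLogs stays InnoDB.
SET sql_log_bin = 0;
ALTER TABLE SystemLogs ENGINE=BLACKHOLE;
SET sql_log_bin = 1;
```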

 
 

My First Google Doodle – very special to me :)


It gives me immense joy to share with you my FIRST Google doodle, which I drew for Bitsian Day, Aug 2, 2013
[a global celebration for Bits-Pilani students/alumni].

I am SO HAPPY to share with you that this doodle has received 400+ likes on the Facebook BITS page (BITSians for Each Other).

Google inspired me to take up doodling, and Bits-Pilani, my alma mater, is the reason I am where I am today.

This doodle truly is special to me :)

The L is taken over by the old clock tower at the Pilani campus, and the O at the center has the BITS emblem, with its motto below, which means ‘knowledge is supreme power’.

http://www.bits-pilani.ac.in
Note: This is not an official doodle of Google.


Selective Row Replication – using Blackhole storage engine

What if you wanted only certain rows (i.e. rows of a particular type) to be replicated to a particular slave DB,

and you were not interested in the rest?

Let's say we have a table with the schema:

CREATE TABLE `invoices` (
  `id` int(11) PRIMARY KEY NOT NULL,
  `type` varchar(10) DEFAULT NULL,
  `created_date` datetime DEFAULT NULL,
  `status` varchar(10) DEFAULT NULL
) ENGINE=InnoDB

We have different types of invoices, such as accounting invoices, warehouse invoices, and procurement invoices.

Let's say that we want

a SLAVE DB where only ‘accounting‘ invoices are needed and we are not interested in the other types.

What can we do to achieve this selective replication?

One solution: use tables with the BLACKHOLE engine.

Step 1:

Create another table with the same schema as invoices, but with the engine as BLACKHOLE.

CREATE TABLE `invoices_blackhole` (
  `id` int(11) PRIMARY KEY NOT NULL,
  `type` varchar(10) DEFAULT NULL,
  `created_date` datetime DEFAULT NULL,
  `status` varchar(10) DEFAULT NULL
) ENGINE=BLACKHOLE

Step 2 (pseudo-code below):

Change the application code as follows [assuming all writes go to a master DB and slaves are read-only]:

—————————–

Old code:

—————————–

string type = 'accounting';

Start transaction;   insert into invoices values (999, type, now(), 'created');   commit;

or

string type = select type from invoices where id = 999;

Start transaction;   update invoices set status = 'processing' where id = 999;   commit;

—————————–

New Code:

—————————–

string type = 'accounting';

Start transaction;   insert into invoices values (999, type, now(), 'created');

if (type == 'accounting') { insert into invoices_blackhole values (999, type, now(), 'created'); }

commit;

or

string type = select type from invoices where id = 999;

Start transaction;   update invoices set status = 'processing' where id = 999;

if (type == 'accounting') { update invoices_blackhole set status = 'processing' where id = 999; }

commit;

// do the same for deletes too

Step 3:

On the slave, since the CREATE DDL gets propagated there:

Now change the engines of invoices & invoices_blackhole to BLACKHOLE and InnoDB respectively. [That's correct: reverse them :) ]
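A sketch of that swap, run directly on the slave (disabling `sql_log_bin` for the session is my addition, so the ALTERs are not logged onward if the slave also keeps a binlog):

```sql
-- On the slave: reverse the engines, so only the rows the
-- application mirrors into invoices_blackhole are actually stored.
SET sql_log_bin = 0;
ALTER TABLE invoices ENGINE=BLACKHOLE;
ALTER TABLE invoices_blackhole ENGINE=InnoDB;
SET sql_log_bin = 1;
```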

Step 4:

Now start writing to the master, and your slave will have only rows with type ‘accounting’. Tadaaaaa :)

 

[Diagram: MySQL selective row replication using the Blackhole engine]

 

What about transactions spanning Blackhole and InnoDB tables?

Answer: In our case, if a rollback is issued, the transaction is not written to the binlog, so no worries.

Thank you for reading this post!!

Any suggestions / thoughts are welcome

How I fell in love with MySQL: Updates vs Inserts

Well, in 2011, for a week we would see disruptions in our ERP servers, especially around 7pm.

  • We had a ‘job’ table & multiple app servers (a cluster) inserting jobs and also picking them up.

  • At around 7pm, we would notice that the number of pending jobs would increase dramatically, i.e. the app servers would not execute them.

  • The ‘confirm shipment’ job, a critical one for us, would not be executed, leaving the fate of many shipments hanging in limbo.

I did a lot of debugging around this, looking for clues; the major one was found in

show engine innodb status

TRX HAS BEEN WAITING 120 SEC FOR THIS LOCK TO BE GRANTED:

RECORD LOCKS space id 0 page no 560985 n bits 72 index `status_idx` of table `test`.`job_sandbox` trx id 0 9133886 lock_mode X locks rec but not gap waiting

It was a fight between inserts and updates on an index.

The sequence of events is available in this bug I filed, which was marked invalid; nonetheless, it was an important lesson:

“Updates can get stuck because of inserts”

The solution for the above issue mentioned in the bug is as follows:

Old Txn:

update t7_ibu set status='service_queued' where status='service_pending' ;

New Txn:

// Convert the update into a select:
// first get all the primary key ids,
select id from t7_ibu where status='service_pending';
// then update each row using its primary key value.
foreach (primary_key in result_set) {
  update t7_ibu set status='service_queued' where id = primary_key;
}

Since each update is confined to a particular row, the index lock can be granted to both the insert and the update. Also note: multiple servers exist, and they all do the same thing, i.e. insert jobs and update jobs.
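In plain SQL, the reworked transaction looks roughly like this (the id values in the IN list are hypothetical, and batching several ids per UPDATE is my assumption rather than the exact code from the bug):

```sql
-- Read the matching primary keys first, instead of letting one big
-- UPDATE hold record locks on the secondary status index.
SELECT id FROM t7_ibu WHERE status = 'service_pending';

-- Then update by primary key, taking only per-row record locks;
-- the ids below are hypothetical examples.
UPDATE t7_ibu SET status = 'service_queued' WHERE id IN (101, 102, 103);
```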

And this is how I fell in love with MySQL; this issue did it :)