Sat 1 Dec 2007
Innodb Locks, ActiveRecord and acts_as_ferret Problem
Posted by Scoundrel under Databases , DevelopmentLast few days one of our customers (one of the largest Ruby on Rails sites on the Net) was struggling to solve some really strange problem - once upon a time they were getting an error from ActiveRecord on their site:
1 | (ActiveRecord::StatementInvalid) "Mysql::Error: Lock wait timeout exceeded; try restarting transaction: UPDATE some_table..... |
They have innodb_lock_wait_timeout set to 20 seconds. After a few hours of looking for strange transactions we were decided to create s script to dump SHOW INNODB STATUS and SHOW FULL PROCESSLIST commands output to a file every 10 seconds to catch one of those moments when this error occurred.
Today we’ve got next error and started digging in our logs…
After a couple of hours of looking in logs, parsing binary logs from MySQL and thinking we’ve found a following image of the situation:
1 2 3 4 5 6 7 8 9 | # at 804649039 #071130 14:56:25 server id 4 end_log_pos 1011 Query thread_id=71318 exec_time=3 error_code=0 SET TIMESTAMP=1196456185/*!*/; UPDATE some_table .... # at 804650050 #071130 14:57:02 server id 4 end_log_pos 804650077 Xid = 220437663 COMMIT/*!*/; |
This transaction was locking a record and caused all those errors. We’ve took a look at project sources and didn’t found any transactions used there at all. So, it was obvious this problem transaction (with one query in it) was generated by ActiveRecord.
We’ve discussed this problem with our customer and started searching for a reason, why would AR wait for 37 seconds between a query and COMMIT command.
At the end we’ve found a reason. Customer uses acts_as_ferret for search on their site. Site has a huge documents database so ferret index updates became pretty slow. In acts_as_ferret index updates are called from a following AR hooks:
1 2 3 4 5 | ... after_create :ferret_create after_update :ferret_update after_destroy :ferret_destroy ... |
Next step was obvious - we’ve created a test project with a simple test model:
1 2 3 4 5 6 7 8 | class LockTest < ActiveRecord::Base after_update :my_lock def my_lock logger.debug("Starting sleep for 10 sec") sleep(10) logger.debug("Wake up!") end end |
Next - one test update of the data and the results in development.log:
1 2 3 4 5 | SQL (0.000093) BEGIN LockTest Update (0.000374) UPDATE lock_tests SET `created_at` = '2007-11-30 20:14:28', `updated_at` = '2007-11-30 20:14:43', `a` = 2 WHERE `id` = 1 Starting sleep for 10 sec Wake up! SQL (0.010107) COMMIT |
So, maybe we were reading AR docs in a wrong way, or there is no such info, but :after_update callback is called BEFORE the actual data COMMITed to the DB! In our customer’s case, some user saved the document, ferret stated updating its indexes, record was locked for 30+ seconds and every updates to the same record (next tries to click to a Save button, for example) were failing with timeout errors.
So far so good, we’re going to move to Sphinx and hope there will be no such crazy errors in future.
Tags: ferret, MySQL, problem, rails, scalability
- FastSessions Rails Plugin Released
- Data Recovery Toolkit for InnoDB Released
- Small Tip: How to Enable ActiveRecord Logging in Merb
- InnoDB Recovery toolset Version 0.3 Released
- Found an Ideal I/O Scheduler for my MySQL boxes
2007-12-01 at 5.58 pm
Question: Why is your client with “largest site” using acts_as_ferret in the first place?
2007-12-01 at 9.25 pm
Frank,
AFAIU, they simply did not hear about nginx back then and started using what everyone else in Rails community was using - ferret. Later many people I know migrated away from ferret, but when our client was creating his service, ferret was one of the most popular SE solutions which have Ruby client libraries and/or Rails plugins.
2007-12-02 at 3.22 pm
I understand. Have you evaluated Lucene as an option?
Frank
2007-12-02 at 8.46 pm
At this point we’ve just decided to stick to the technology we’ve used to. I mean both customer and me use sphinx in many places and sphinx integration process looks much easier for us at this moment.
2007-12-03 at 8.54 am
check out solr and acts_as_solr…
2007-12-12 at 2.12 am
Nice. Has it been filed to Rails Trac? I’m actually digging AR sources and trying to help core team with fixes. So should I create ticket for this first or can you please point me to it?
There’s about 10 000 tickets in their Trac
2007-12-12 at 2.12 pm
2Michael: I’ve never filled it as a bug, cause’ I wasn’t sure that it is a bug. Feel free to file it in rails trac if you think it is.
2008-04-23 at 11.31 pm
[...] Innodb Locks, ActiveRecord and acts_as_ferret Problem :: Homo-Adminus Blog by Alexey Kovyrin innodb_lock_wait_timeout (tags: activerecord mysql rails problem lock) [...]