Tuesday, May 15, 2007

Scaling Rails - Notes from Silicon Valley Ruby on Rails Meetup

Some websites launch by getting featured on TechCrunch, Digg, Reddit,
etc. In such cases there is no time to grow organically.


1. The Adventures of Scaling Rails -
http://poocs.net/2006/3/13/the-adventures-of-scaling-stage-1
2. Stefan Kaes, "Performance Rails" - http://railsexpress.de/blog/f
3. RobotCoop blog and gems -
4. O'Reilly's book "High Performance MySQL"

This presentation focuses on what's different from previous
writings. For a comprehensive overview, refer to the resources above.

Scribd.com was launched in March 2007. It is the "YouTube" for
documents, and it handles around 1 million requests per day.

Current Scribd Architecture

1 Web Server
3 Database Servers
3 Document Conversion Servers
Test and Backup machines
Amazon S3

Server Hardware

Dual dual-core Woodcrests at 3 GHz
16 GB of memory
Four 15K SCSI hard drives in a RAID 10
Disk speed is important. Don't skimp; you're not Google, and it's
easier to scale up than out.
Hosted by Softlayer.


Memcached, RobotCoop's memcache-client
Stefan Kaes' SQLSessionStore - This is the best way to store persistent sessions.
Monit, Capistrano

They ran tests and found out that fragment caching improved
performance for their web app.

How to Use Fragment Caching

Consider only the most frequently accessed pages.
Look for pieces of the page that don't change on every page view and
are expensive to compute

Just wrap them in a cache block:
<% cache('keyname') do %>
<% end %>
Run timing tests before and afterwards; back out the change
unless the performance gain is significant.
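
The before-and-after timing step can be sketched in plain Ruby, outside Rails. This is only an illustration: `cache_fragment`, `expensive_fragment`, and the in-memory `FRAGMENTS` hash are hypothetical stand-ins for the `cache` view helper, a slow partial, and the fragment store, and the `sleep` simulates slow queries.

```ruby
require 'benchmark'

# Hypothetical stand-in for the fragment store (Rails would use files or memcached).
FRAGMENTS = {}

# Mimics the shape of the `cache` view helper: return the stored fragment
# if present, otherwise run the block and store its result.
def cache_fragment(key)
  FRAGMENTS.fetch(key) { FRAGMENTS[key] = yield }
end

# Hypothetical expensive piece of a page; the sleep stands in for slow queries.
def expensive_fragment
  sleep 0.05
  "<ul><li>Top documents</li></ul>"
end

# Time five page views without and with the fragment cache.
uncached = Benchmark.realtime { 5.times { expensive_fragment } }
cached   = Benchmark.realtime { 5.times { cache_fragment('top_docs') { expensive_fragment } } }

puts format("uncached: %.3fs, cached: %.3fs", uncached, cached)
```

In a real app the same comparison would be made against actual requests (for example from the Rails log timings), and the cache block removed again if the difference is small.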

Expiring fragments - 1. Time based

Use memcached for storing fragments
It gives better performance
It is easier to scale to multiple servers
Most importantly: It allows time-based expiration

Use plugin http://agilewebdevelopment.com/plugins/memcache_fragments_with_time_e...
Dead easy:
<% cache 'keyname', :expire => 10.minutes do %>
<% end %>
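
To make the idea of time-based expiration concrete, here is a minimal in-memory sketch in plain Ruby. It is not the plugin's implementation or API: `TtlFragmentStore`, its `cache` method, and the injectable clock are all hypothetical names chosen for illustration, and memcached handles the expiry itself in the real setup.

```ruby
# Minimal in-memory sketch of time-based fragment expiry.
# TtlFragmentStore and its methods are hypothetical, not the plugin's API.
class TtlFragmentStore
  def initialize(clock = -> { Time.now })
    @clock = clock
    @entries = {} # key => [value, expires_at]
  end

  # Return the stored fragment if it has not expired; otherwise run the
  # block, store the result for `expire` seconds, and return it.
  def cache(key, expire:)
    value, expires_at = @entries[key]
    return value if expires_at && @clock.call < expires_at

    fresh = yield
    @entries[key] = [fresh, @clock.call + expire]
    fresh
  end
end

# A fake clock makes the example deterministic.
now = Time.at(0)
store = TtlFragmentStore.new(-> { now })
renders = 0

first  = store.cache('keyname', expire: 600) { renders += 1; "rendered #{renders}" }
now += 300 # five minutes later: still fresh, block is not run
second = store.cache('keyname', expire: 600) { renders += 1; "rendered #{renders}" }
now += 301 # past the ten-minute mark: fragment is regenerated
third  = store.cache('keyname', expire: 600) { renders += 1; "rendered #{renders}" }

puts [first, second, third].inspect
# prints ["rendered 1", "rendered 1", "rendered 2"]
```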

Expiring fragments - 2. Manually

No need to serve stale data
Just use: Cache.delete("fragment:/partials/whatever")
Clear fragments whenever data changes
Again, easier with memcached
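
The "clear fragments whenever data changes" rule can be sketched as follows. This is a plain-Ruby illustration under stated assumptions: `FragmentStore` is a hypothetical in-memory stand-in for the memcached-backed store, and `Document` is a hypothetical model. In a Rails app the delete call would more likely sit in a model callback or a sweeper.

```ruby
# Hypothetical in-memory stand-in for the memcached-backed fragment store.
class FragmentStore
  def initialize
    @data = {}
  end

  # Return the stored fragment, or run the block and store its result.
  def fetch(key)
    @data.fetch(key) { @data[key] = yield }
  end

  def delete(key)
    @data.delete(key)
  end
end

# Hypothetical model that clears its fragment whenever its data changes.
class Document
  attr_reader :title

  def initialize(title, store)
    @title = title
    @store = store
  end

  def title=(new_title)
    @title = new_title
    @store.delete("fragment:/partials/whatever") # expire manually on change
  end
end

store = FragmentStore.new
doc = Document.new("Draft", store)
render = -> { store.fetch("fragment:/partials/whatever") { "<h1>#{doc.title}</h1>" } }

before = render.call   # rendered and cached
doc.title = "Final"    # data change deletes the fragment
after = render.call    # rendered again from fresh data

puts before, after
```

Because the fragment is deleted at the moment the data changes, the next page view regenerates it and no stale copy is served.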

They also discussed how to use two database servers with a Rails
app. For more information, see the slides at

Q & A

They use an open source SWF plugin for uploading documents (it allows
multiple documents to be uploaded simultaneously).

They pay only $100 per month to Amazon S3 for 5 terabytes of bandwidth
usage. The downside of using Amazon S3 is that they cannot generate
analytics for that part of the app.

Uploaded files go into a queue and are processed in the background,
so documents don't appear immediately on the site if the load is

Tip from the first presentation, on CacheBoard: it uses the BackgrounDRb
plugin by Ezra for exporting documents in XML format.

