We’re looking for an exceptional Sys Admin

October 10th, 2006

The time has come for us to hire a dedicated system administrator. We’re looking for a great person who is passionate about the sys admin role and is willing to improvise and make things happen. Here’s the official job description:

The Robot Co-op is looking for a passionate sys admin w/ experience managing open-source clusters & Rails. Our software includes: FreeBSD, MySQL, Apache, Rails, memcached and MogileFS. We run our 5 sites on a 6-machine cluster. You must be comfortable setting up new servers, optimizing existing servers, and performing server surveillance, monitoring and maintenance. We need you to keep us clear of system issues, performance crunches, & daily hiccups.

To apply, email [email protected]


The software and hardware that runs our sites

October 10th, 2006

Robot Co-op Hardware

quantity | cpu              | memory | disks            | purpose
4ea      | Dual 3GHz Xeon   | 6GB    | 70GB RAID 1      | Apache, FastCGI, MogileFS storage node, memcached, image serving
1ea      | Dual 3GHz Xeon   | 2GB    | 70GB RAID 1      | Staging, mail, backend jobs
1ea      | Dual Opteron 246 | 12GB   | 5×73GB in RAID 5 | MySQL

Eric has more to say on this topic

Robot Co-op Software

All our online machines run FreeBSD 6.0-RELEASE. We use Amanda for backups, which get rsynced off-site. Subversion is used for revision control of all our configuration data.

We have our own CVSup mirror and a package build machine, which builds packages for all our boxes to keep things in sync and to decrease the load on FreeBSD’s CVSup mirrors. We also NFS-mount the src and ports directories to reduce wasted disk space. portaudit is used for vulnerability monitoring and portupgrade for performing package upgrades.

Critical processes are watched by dwatch. Clocks are synchronized with ntpd.

Webservers (x4)

The webservers all run PAE kernels with 6GB of RAM. 2GB is allocated to memcached and the rest gets chewed up by Rails processes. Each of our sites gets its own set of processes, and we have 25 Rails processes in total running per host.
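To put the numbers above in perspective, here is a back-of-the-envelope sketch of the memory left for each Rails process. It assumes the OS, Apache and everything else use no memory at all, so the result is only an upper bound; the method name `rails_budget_mb` is made up for this illustration.

```ruby
# Rough per-process memory budget for one webserver, using the figures above.
# Assumption: memory used by the OS, Apache, WEBrick, etc. is ignored,
# so the real budget per process is lower than this.
TOTAL_RAM_MB    = 6 * 1024 # 6GB of RAM per webserver
MEMCACHED_MB    = 2 * 1024 # 2GB allocated to memcached
RAILS_PROCESSES = 25       # total Rails processes per host

# Hypothetical helper: RAM left after memcached, split evenly across processes.
def rails_budget_mb(total_mb, memcached_mb, processes)
  (total_mb - memcached_mb).to_f / processes
end

per_process = rails_budget_mb(TOTAL_RAM_MB, MEMCACHED_MB, RAILS_PROCESSES)
puts format("%.1f MB per Rails process (upper bound)", per_process)
```

With these figures the split works out to a little under 164MB per process before anything else on the box is accounted for.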

For pages our webservers are all Apache 1.3 with mod_fastcgi. Apache logs are rotated by cronolog and processed by AWStats (but I might switch to Visitors).

We’re using Rails (of course) to generate our pages. We use a library very similar to CachedModel to fetch ActiveRecord objects out of memcached. We don’t use any page or fragment caching.
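The general idea of fetching records through a cache can be sketched as a read-through lookup: check the cache first, fall back to the database on a miss, then store the result for next time. This is not CachedModel's actual API or the library described above. `FakeCache`, `FakeDatabase`, `CachedFinder` and the `"User"` key prefix are all hypothetical stand-ins so the example runs without memcached or ActiveRecord.

```ruby
# Hypothetical in-memory stand-in for a memcached client.
class FakeCache
  def initialize
    @store = {}
  end

  def get(key)
    @store[key]
  end

  def set(key, value)
    @store[key] = value
  end

  def delete(key)
    @store.delete(key)
  end
end

# Hypothetical stand-in for ActiveRecord + MySQL that counts its queries.
class FakeDatabase
  attr_reader :queries

  def initialize(rows)
    @rows = rows
    @queries = 0
  end

  def find(id)
    @queries += 1
    @rows.fetch(id)
  end
end

# Read-through lookup: try the cache, fall back to the database, then cache.
class CachedFinder
  def initialize(cache, database, model_name)
    @cache = cache
    @database = database
    @model_name = model_name
  end

  def find(id)
    key = cache_key(id)
    record = @cache.get(key)
    return record unless record.nil?

    record = @database.find(id)
    @cache.set(key, record)
    record
  end

  # Drop the cached copy, e.g. after the record is updated.
  def expire(id)
    @cache.delete(cache_key(id))
  end

  private

  def cache_key(id)
    "#{@model_name}:#{id}"
  end
end

db = FakeDatabase.new({ 1 => { name: "example" } })
finder = CachedFinder.new(FakeCache.new, db, "User")
finder.find(1) # miss: hits the database and fills the cache
finder.find(1) # hit: served from the cache
puts "database queries: #{db.queries}"
```

The important design point is expiry: whenever a record changes, its cached copy has to be deleted or overwritten, otherwise readers keep seeing stale data.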

For images our webservers are WEBrick using a Ruby MogileFS library I wrote (but have not yet released). Image resizing is performed by RMagick, occasionally on-the-fly.

We use the lang/ruby18-nopthreads port of Ruby because we experienced an incredible load increase with the default pthread version.

Each machine runs memcached with 2GB of cache (so be sure to set kern.maxdsiz and kern.maxssiz in /boot/loader.conf appropriately). We store sessions, ActiveRecord objects and other expensive-to-compute data in memcached.
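As a rough illustration of the two loader tunables named above, a /boot/loader.conf fragment might look like the following. The byte values are assumptions picked only to leave some headroom above a 2GB cache; they are not the settings used on these machines, and the right limits depend on the kernel and architecture.

```
# /boot/loader.conf (illustrative values only)
kern.maxdsiz="2684354560"   # maximum data segment size in bytes (2.5GB, assumed)
kern.maxssiz="134217728"    # maximum stack size in bytes (128MB, assumed)
```

Loader tunables are read at boot, so a change here only takes effect after a reboot.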

Each machine is a MogileFS tracker node and file store. All our images get stored in MogileFS. We use NFS mode due to problems with components of Perlbal on FreeBSD. (With the way we’ve implemented image serving and FreeBSD’s NFS implementation it works out fine.)

Our webservers all run sendmail set up as a smart host pointing to a machine running postfix. This keeps outbound mail sending fast and reduces the amount of maintenance.
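For readers unfamiliar with smart hosts, the relevant sendmail setting is the `SMART_HOST` macro in the machine's .mc file. The fragment below is a minimal sketch, not the configuration used here; `mail.example.com` is a placeholder for the postfix machine.

```
dnl Relay all outbound mail through one host instead of delivering directly.
dnl mail.example.com is a placeholder hostname (assumption).
define(`SMART_HOST', `mail.example.com')dnl
```

The .mc file has to be rebuilt into sendmail.cf and sendmail restarted before the change applies.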

Database

We have one database server. It runs MySQL 4.1.x and isn’t all that special. Bob tuned it based on Wikipedia’s MySQL configuration.

Miscellaneous

We have one machine that does miscellaneous jobs. It runs postfix for inbound and outbound mail, runs crons that update memcached, analyzes log files and handles a small handful of other unimportant things.

Read more about our Software on Eric’s blog.