Tue, 12/23/2008 - 21:02 by Chad Mynhier
I jumped right into this blog without much in the way of an introduction, so I'll correct that oversight now.
The traditional introduction seems to be, “Why am I at Forsythe?” For me, that's a pretty short answer: working at Forsythe is letting me do some of the most exciting work I've done in my entire career. Instead, the more interesting question (or at least one that leads to a longer blog entry) might be, “How did I get here?”
A few years ago, I attended a set of presentations at the Sun offices in New York. One of the presentations was an introduction to DTrace and was given by Jarod Jenson, now a colleague of mine. Jarod was obviously very excited about DTrace and what it could do. I caught his enthusiasm, went back to work and downloaded Solaris Express to start playing with DTrace. Unfortunately, all I could do was play with it, as Solaris 10 hadn't been released yet, and I couldn't go putting an unsupported and experimental release of the OS on our production servers. The best I could do was set up problems on a test system and use DTrace to investigate them. But investigating a known problem is hardly challenging.
I had to wait quite a while to use DTrace on a real problem. The opportunity presented itself during an evaluation of the T2000 server at Juno Online Services. We were a web shop, so the architecture of the Niagara chip seemed a perfect match for our workload. Beyond simply seeming like a good match, however, it had the potential to solve our data center problems. This was the usual stuff – we were running out of space, power and cooling. We were contemplating building out a new data center, but I was convinced that moving from Intel-based Java servers and SPARC-based SMTP servers (v210s) to T2000s would let us handle our workload in a smaller footprint, saving us the need for new data center space. I worked up the spreadsheet to demonstrate the savings to be had, which were of course dependent on the performance ratio between the T2000 and the Intel servers.
We used an in-house-written SMTP server as our first performance test. Unfortunately, we maxed out throughput well below our break-even point, although the server itself didn't appear to be fully utilized. This seemed to be the perfect opportunity to use DTrace. Being new to it, I didn't use it as effectively as I could have, but it did give us a smoking gun – we were seeing lock contention in libc's implementation of malloc(3C). A quick LD_PRELOAD of libumem gave us the performance we were expecting, and then some. A second test, running the SMTP server in a different mode (incoming customer mail versus outgoing customer mail), also showed problems, although this time it wasn't lock contention in malloc(3C). DTrace again gave us the smoking gun – this time it was lock contention in readdir_r(3C). Some code reading and research indicated that using the thread-safe version of readdir(3C) was pointless in this case, as we were never sharing the underlying data structures between threads. This fix required recompiling the code, but again our scalability problems went away.
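To illustrate the readdir(3C) change, here is a minimal sketch in C rather than the actual server code. The helper name `count_entries` is hypothetical. It assumes an implementation where readdir(3C) keeps its state in the directory stream, so calls are safe as long as no two threads use the same `DIR *`. Each call opens, reads and closes its own stream, which is the "never shared between threads" situation described above.

```c
#include <dirent.h>
#include <errno.h>
#include <string.h>

/* Hypothetical helper: count the entries in a directory, skipping "." and "..".
 * The DIR stream is opened, used and closed inside this call, so no other
 * thread ever sees it. Returns -1 on error. */
long count_entries(const char *path)
{
    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    long count = 0;
    struct dirent *entry;

    for (;;) {
        /* readdir returns NULL at end-of-directory and on error;
         * clearing errno first lets us tell the two apart. */
        errno = 0;
        entry = readdir(dir);
        if (entry == NULL)
            break;
        if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0)
            continue;
        count++;
    }

    int saved_errno = errno;   /* closedir may overwrite errno */
    closedir(dir);
    return saved_errno != 0 ? -1 : count;
}
```

A worker thread would simply call `count_entries("/some/spool/dir")` (the path is a placeholder) on its own; because nothing is shared, there is no caller-side lock to contend on.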
These were my first experiences as a DTrace user. A couple of years later I went deeper and got involved in DTrace development. One day I saw a message from Adam Leventhal fly by on a mailing list, informing someone that a problem they were seeing was a known bug and pointing out that fixing it would be a good intro to OpenSolaris development. I took the bait and fixed the bug, and Adam pointed out that there were a number of open DTrace bugs if I were interested in working on them. The bug pointing out the lack of a standard deviation function jumped out at me, and I started working on it, with Jon Haslam shepherding me through the process. A few months later, the code was putback into OpenSolaris, and Bryan Cantrill made the announcement at dtrace.conf(08) that I was being given core contributor status in the OpenSolaris DTrace community – the only other person with such status being Jarod Jenson. A couple of months after that, I mentioned to Adam that I was looking to change jobs, and he told me that Jarod was looking to hire. And that's pretty much how I got here.