Upgrading udev and kernels on a stack of Gentoo servers
Submitted by ckdake on Thu, 2013-02-07 21:07
TL;DR: Install linux-firmware and uninstall pecl-apc. Scroll down for a graph that shows why.
The Plan
The backlog of new packages to install was big enough that it was time for me to bite the bullet and upgrade the kernel and udev on all of my Gentoo servers for ithought.org. They were a mix of 2.6.* and 3.* kernels and old, unsupported udev versions.
I documented all the hardware, wrote out kernel compilation plans for every server (including all the Gentoo-, hardware-, and software-specific configuration options), and built new kernels: 3.6.11 for the amd64 boxes and 3.5.7 for the x86 ones. Userland things upgraded successfully, and things were good to go for rebooting.
Just to play it safe (and to replace a server belonging to one of my colocation customers) I scheduled some routine maintenance to reboot all of the servers for 9pm on Wednesday, February 6th.
The Not So Great Reboot
Rebooting the servers had mixed results. Several needed their root volume in grub.conf updated due to kernel changes in the way that volume names are presented, and I had left out a few things that required booting from a boot CD to fix (software RAID support on an old amd64 SCSI box, and the Fusion MPT SAS driver on the box connected to the Dell MD1000 storage system). With those fixed, I thought I was in the clear, with just a few minutes of downtime for each server.
However, the x86/3.5.7 boxes had a problem. None of them could connect to the Internet because they couldn't see their eth* devices. This thread is a similar debugging experience to mine: http://forums.gentoo.org/viewtopic-t-948718-start-25.html. It seemed to be a problem with udev, which was a big problem: between the new kernels, the age of all the other software, and what was available in portage, I didn't have many options.
The newest server was new enough that I could use package.mask to roll back to an older udev that worked with the kernel on that system (2.6.36), but the other servers had kernels so old that they didn't work with the oldest versions of udev available in portage, which meant no downgrade was available.
This troubleshooting lasted from around 9:30pm until past 1:30am, and included reconfiguring kernels and customizing udev's rule files. With some help from a few people in irc.freenode.net#lugatgt who know far more about udev than I do, we confirmed that everything in my kernel and udev setups looked correct. Every attempt to start a net device yielded a 'SIOCADDRT no such process' error that was not helpful to any of us for debugging, but a conveniently timed glance at the output of `dmesg` revealed the problem: a missing file that is part of the 'linux-firmware' package. I installed linux-firmware via a USB drive on the broken servers, rebooted, and everything was finally working again. Success!
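That kind of `dmesg` check can be automated. The following is a minimal sketch in Python, not the commands used that night: it assumes `dmesg` is on the PATH and readable by the current user, and because the wording of firmware errors differs between drivers and kernel versions it only applies a loose keyword heuristic. The function names are made up for illustration.

```python
import re
import subprocess

# Words that usually accompany a failed firmware load. The exact message
# differs between drivers and kernel versions, so this is a loose heuristic.
FAILURE_WORDS = re.compile(r"fail|error|not found|missing", re.IGNORECASE)


def firmware_problems(dmesg_text):
    """Return kernel log lines that look like firmware load failures."""
    hits = []
    for line in dmesg_text.splitlines():
        if "firmware" in line.lower() and FAILURE_WORDS.search(line):
            hits.append(line.strip())
    return hits


def read_dmesg():
    """Capture the kernel ring buffer; assumes `dmesg` is on the PATH."""
    result = subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

On an affected box, printing each line of `firmware_problems(read_dmesg())` should surface the name of the file the driver could not load, which is the clue that points at linux-firmware.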
Nagios alarms all cleared, sites were back up, and I headed home to sleep.
The Next Battle
I woke up early to check on things, and one of the servers was down. The graphs indicated a combination of massive iowait processes and RAM utilization, but no swap usage. Interesting.
A few hours later, after a stop by #lugatgt and asking some knowledgeable coworkers, I'd gotten acquainted with `iotop`, `iostat`, and others. I'd moved the MySQL data directory between RAID devices, which only made things worse (and was undone), and narrowed things down to 6-12Mbps of write IO to the / volume. This was very strange because everything in /var (mysql, apache, vhost roots, logs, etc.) is on a separate RAID volume on a separate RAID controller.
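Tools like `iostat` derive per-device write rates from counters the kernel exposes in `/proc/diskstats`. Here is a rough sketch of the same calculation in Python, assuming the standard Linux layout in which the tenth field is sectors written and sectors are counted in 512-byte units. The function names and the default sampling interval are illustrative choices.

```python
import time

SECTOR_BYTES = 512  # /proc/diskstats counts in 512-byte sectors


def sectors_written(diskstats_text):
    """Map device name -> cumulative sectors written, from /proc/diskstats text."""
    totals = {}
    for line in diskstats_text.splitlines():
        fields = line.split()
        if len(fields) < 10:
            continue  # skip blank or truncated lines
        # fields: major, minor, name, four read counters,
        # writes completed, writes merged, sectors written, ...
        totals[fields[2]] = int(fields[9])
    return totals


def write_rates(before, after, seconds):
    """Bytes written per second for each device present in both samples."""
    return {
        dev: (after[dev] - before[dev]) * SECTOR_BYTES / seconds
        for dev in after
        if dev in before
    }


def sample_write_rates(interval=5.0, path="/proc/diskstats"):
    """Take two samples `interval` seconds apart (Linux only)."""
    with open(path) as f:
        before = sectors_written(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = sectors_written(f.read())
    return write_rates(before, after, interval)
```

A device that holds only / showing a sustained write rate while all the busy data lives on another controller is the kind of mismatch described above.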
The only files marked as open for writing by the php-cgi processes that were spending 99.99% of their time waiting on IO were in /var/log, but no log files were growing, and disk utilization wasn't climbing. This indicated writing to unattached inodes, and it turned out that there was some conflict with the way APC (the opcode cache for PHP) was configured to use shm (shared memory in kernel land). I disabled APC, bounced apache, and the high write load on / disappeared and conditions improved.
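Writes to a file that has been unlinked but is still held open do not show up as a growing file in any directory. On Linux, one way to spot such files is that the symlinks under `/proc/<pid>/fd` point at a target ending in ` (deleted)`. A minimal sketch, assuming a Linux `/proc` filesystem and enough privileges to read the target process's descriptors; the function names are made up for illustration, and memory-mapped files that are no longer open as descriptors would need a similar pass over `/proc/<pid>/maps`.

```python
import os

DELETED_SUFFIX = " (deleted)"


def is_deleted_target(target):
    """True if a /proc/<pid>/fd link target refers to an unlinked file.

    A file whose real name happens to end in ' (deleted)' would be a
    false positive; this sketch ignores that corner case.
    """
    return target.startswith("/") and target.endswith(DELETED_SUFFIX)


def deleted_open_files(pid):
    """List (fd, path) pairs for files a process holds open after unlinking."""
    fd_dir = "/proc/%d/fd" % pid
    found = []
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # the descriptor was closed while we were looking
        if is_deleted_target(target):
            found.append((int(name), target[: -len(DELETED_SUFFIX)]))
    return found
```

Running `deleted_open_files` against the PID of a process stuck in iowait gives a quick list of candidates for "writes that go nowhere visible".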
The impact of disabling APC is that PHP has to do a little more work on each HTTP request, which slightly increases HTTP response times. It has doubled 'user' CPU load from ~15% to ~30% of one CPU, but has stopped the iowait from tanking the system after this update. That said, pre-update iowait hovered around 50% and it is now down to ~5%, which has some nice performance impacts for the rest of the system and may actually serve to decrease average response time.
It's frustrating when tools that are designed to enhance performance cause things to implode, but it's nice to clean up the stack a little and APC doesn't have the reputation of being the most stable thing out there anyways. Time will tell what impact this has on response code breakdown. Here's a graph of CPU usage on this box during the events in the last 24 hours. From the descriptions above you can probably see what looks pretty clear in hindsight.
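Figures like the iowait percentages above are usually derived from the aggregate `cpu` line in `/proc/stat`. A sketch of that arithmetic in Python, assuming the usual Linux field order (user, nice, system, idle, iowait, irq, softirq, steal) and two samples taken some seconds apart; the helper names are illustrative.

```python
def cpu_counters(proc_stat_text):
    """Return the first eight jiffy counters from the aggregate 'cpu' line.

    Guest time is already included in user time, so the trailing guest
    fields are left out to avoid counting it twice.
    """
    for line in proc_stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:9]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before, after):
    """Share of elapsed CPU time spent in iowait between two samples."""
    deltas = [b - a for a, b in zip(before, after)]
    total = sum(deltas)
    if total == 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is iowait
```

Reading `/proc/stat` twice, a few seconds apart, and passing both results through `cpu_counters` and then `iowait_percent` gives the same kind of number a monitoring graph plots.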

The End
One more server needs a reboot with the newest kernel and udev userland, but given what I know now I'm confident that it will go well (almost confident enough to do remotely).
If you're a hosting customer of mine or a user of one of the sites that I host, I apologize for the extended outage last night and the slowness this morning as I worked out the IO issues, but know that these upgrades include security updates and having consistently configured systems makes testing out upgrades like this possible in the future. I'm on a monthly rotation now for software updates, and everything can now be tested on a backup server before applying it to any of the primary servers. Thanks for your business!
2013: Do (only) Awesome Things
Submitted by ckdake on Wed, 2013-01-02 11:19
All kinds of things happened last year. This year I hope fewer things will happen, but that they will be more awesome. This means achieving 'flow' at the small and big scale.
First up is removing the little things that slow me down, simplifying things, and giving myself the room I need to accomplish great things.
- No more crazy projects that slowly creep out of the closet: While hack-night style projects are great, if pursued they turn into little time-sucking monsters. Reflecticle is shutting down, Portfolit won't be getting anything new, and I'll hopefully only be directing concerted effort towards big things that make an impact. This doesn't mean no more hacking on fun little things, but it does mean they won't accidentally grow up. Maybe Extract will make the cut, maybe it won't.
- No more Faster Mustache: While FM has been a great ride, it's time for me to bow out. I didn't race bikes at all last year which was _fantastic_ and I had a lot more fun mountain biking in the process. No more managing finances, no more coordinating race schedules, and no more race team e-mail list drama. That said, I'll still buy a kit or two and fly the FM colors on a bike this year.
- No more Gallery: Gallery has wound down, and the effort I put in isn't really helping me or anyone else. We may just wind down 'active' effort on the whole thing and see if anyone steps up to take care of the few things that need a hand.
- Things: If I have things I'm not using, they don't need to be in my face every day. Other people can get a lot more use out of some of the things I have than I can. A corner of my house is reserved for things to pass along, and hopefully friends and the Salvation Army are the better off for it.
- No more celebrating with presents and no more wantlist: Christmas 2012 with no presents was great. Family get-togethers, experiences, and a good time. My internet-facing wantlist is gone, replaced with a list of suggested places to donate if you'd like to recognize my birthday or a holiday. I asked people to donate to the Atlanta Community Food Bank for my birthday in 2012 and ~20 people did. Awesome!
- No more active ithought. It's mostly running itself these days and all the invoicing and bill paying is automated, so assuming there aren't hardware failures, this will hopefully stay out of the way.
- No more repetitive little things that I don't get any value from. Logging my commute with the Clean Air Campaign was fun, but a minute or so a day goes a long way towards things that are useful.
So that is the 'less', here is the 'more':
- After merging with Highgroove, I'm now the Chief Operations Officer at Big Nerd Ranch. I am as prepared as I could be for this but mergers are crazy things and there is still a lot of figuring out to be done. Hopefully by the end of this year the merger will be 'complete', we'll have healthy processes and trust each other, and The Big Nerd Way will be a thing. There will be a few more blog posts on this in the future here.
- Riding Bikes and Running with Friends will continue. A Monday run with coworkers, Tuesday night mountain biking, Thursday night hills on road bikes once it is a little warmer out, and epic weekend adventures will continue. Hopefully a website update in the future here will make it easier to find those things, but for now just send me some kind of message if you're interested in any of them.
In short, I read a lot and think a lot, and hopefully by removing all the little things that take up so much of the 'inbetween' time I'll create some time for me to figure out what exactly it is that I'm supposed to be mastering, and do just that. Onward!
So I got a Surface
Submitted by ckdake on Mon, 2012-11-12 16:24
I bought a Microsoft Surface tablet. While I have one of the first-gen iPads that work gave me a while back, this is the first tablet I've spent money on. I don't currently own any other Microsoft products, and usually am in close proximity to an iPhone or MacBook. This post is for everyone that keeps asking me 'why?' or 'how is it?'
The Apple ecosystem is nice when things work, but it's a little less nice when you come to rely on something and get burned by it. Resetting my Apple password is one of the most frustrating things I do on a computer and it seems to happen with some regularity. iTunes Match is great, but it doesn't work on my phone any more, and market dominance brings complacency and a lack of innovation.
I play devil's advocate a lot, I root for the underdog, and being platform agnostic is a good thing, so enter the Surface running Windows RT. I've used it to be productive (Evernote, Mail), consume (Video, News), be social (Skype, Facebook, Twitter), and generally use the internet.
In short: it looks good, has a nice feel to it, and functions pretty well for most of the things I need a computer or tablet to do. That said, if I go on a trip it looks like I'll still need my MacBook for using Adobe Lightroom to process RAW files from my camera, and for downloading and uploading GPS information from my Garmin watch. Hopefully both of these will be fixed in future updates and apps.
All of my thoughts (maybe) worth sharing so far:
- It has Internet Explorer and Mail. They both work.
- The outside seems pretty sturdy and scratch resistant, but I didn't get a case and the keyboard cover thing only covers one side. We'll see how well that works out in the long run.
- The power adapter, while magnetic and easy to remove, requires a little focus to get plugged in correctly. It is far less nice than Apple's MagSafe connectors.
- The touch keyboard is different but is actually kind of nice. Using my Apple bluetooth keyboard seems wonky afterwards. The vibrating feedback from touching the Windows home button is neat, as is the whole Windows 8 touch interface. Not initially intuitive, but once you learn the basics it is very effective.
- People whine about no start menu and try to get apps to get it back. I don't understand. Why are you so attached to a button?
- The kickstand is awesome when on a sturdy surface, but not as awesome when balanced on legs. It would be if the pivot was higher up so it 'bounced' less.
- It's a computer, and a tablet. That actually means it's going to be a little boring. It is; I'd rather be outside anyways.
- The 'People' app is pretty great. Read Facebook and Twitter in one place. That entire mess still needs to figure out how to keep track of the whole "What I've seen so far" thing.
- Skydrive is a lot like Dropbox, but it's more like iCloud with support for other platforms. Magically slide around photos/images/etc in a cross-platform way, plus get really great app support on the native platform.
- Microsoft Office included. Pretty nice if you're into that sort of thing.
- The screen aspect ratio is a little strange. It's a little awkward for reading, but nice for video.
- It seems like the Surface should be thinner and lighter.
- Tethering with iPhone for internet doesn't seem to work.
- The built-in weather, news, finance, etc. apps are all really polished looking and simple and convey information, but wtf, there are advertisements in them for cars?
- It seems like I need a Windows 8 pc at home for the full experience. If I had one, more of my media would be available remotely.
- Skype works pretty well.
- Messages is neat, but so far is just Facebook and I couldn't see replies from one of the people I messaged. Apple iMessage is way cooler but of course requires a complete buy-in to their ecosystem.
- There are some definite gaps in the app store, specifically I need something for Campfire and Google Reader. Things.app may have to go away too if I can find a reasonable replacement.
- Currently there isn't a way to do text snippets, clipboard management for multi-copy/paste, etc., and those are huge timesavers for me at this point on Mac OS X.
- Sometimes the screen is black, with the backlight obviously on, and no indication of what is going on. That's a little annoying and iDevices don't seem to ever do that.
- Sometimes it needs a reboot to apply software updates. Are we really still doing that?
You're Doing It Wrong
Submitted by ckdake on Tue, 2012-10-23 08:37
Last Friday, I gave the first version of a fun talk that I'm calling "You're doing it wrong", because you are in fact doing it wrong. Watch below, and let me know if there are other aspects of your life that I should be talking about too.
2012.10.19 Tech Talk: You're Doing it Wrong from Highgroove Studios on Vimeo.
Sign up to see more Highgroove Tech Talks as we do them on learn.highgroove.com.
The Uncanny Valley, The Pareto Principle, and end-user facing Software Development
Submitted by ckdake on Mon, 2012-10-01 15:41
That software project you have in mind? You are not going to make everyone happy, so don't attempt to please everyone.
"The uncanny valley is a hypothesis in the field of robotics and 3D computer animation, which holds that when human replicas look and act almost, but not perfectly, like actual human beings, it causes a response of revulsion among human observers. The "valley" refers to the dip in a graph of the comfort level of humans as a function of a robot's human likeness." - wikipedia
If you saw 'The Polar Express' and something seemed off, you know exactly what this is. Something close to our expectations of 'right' but that isn't quite there is going to be uncomfortable. This applies not just to human figures, but to anything that we have expectations about. From ice cream flavors that are so close to some flavor that they are gross, to snakes with creepy faces [ref] to software that doesn't quite work right.
"The Pareto principle (also known as the 80–20 rule, the law of the vital few, and the principle of factor sparsity) states that, for many events, roughly 80% of the effects come from 20% of the causes." - wikipedia
This is commonly called the 80/20 rule in software, and can be extrapolated to things like:
- 80% of the needs of potential users can be met with 20% of the features
- 20% of the effort can accomplish 80% of the functionality
The fact that 80 and 20 add up to 100 is a mere coincidence here, and something like "80% of the features can be accomplished by 5% of the code" would still fit this principle.
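One way to make the first extrapolation concrete is to rank features by how many user needs each one covers and count how many features it takes to reach 80%. A toy sketch in Python; the feature names and counts are invented purely for illustration and are not data from any real project, and it assumes each need is met by exactly one feature.

```python
def features_for_coverage(needs_per_feature, target_percent=80):
    """Top-ranked features whose combined needs reach target_percent of the total.

    Assumes each need is met by exactly one feature, which real projects
    rarely satisfy; this only illustrates the arithmetic.
    """
    total = sum(needs_per_feature.values())
    ranked = sorted(
        needs_per_feature.items(), key=lambda item: item[1], reverse=True
    )
    chosen, covered = [], 0
    for name, needs in ranked:
        if covered * 100 >= target_percent * total:
            break
        chosen.append(name)
        covered += needs
    return chosen


# Invented numbers: ten features covering 100 user needs in total.
example = {
    "login": 45, "search": 35, "export": 5, "themes": 4, "tags": 3,
    "badges": 2, "avatars": 2, "infinite_scroll": 2, "emoji": 1, "widgets": 1,
}
print(features_for_coverage(example))  # ['login', 'search']
```

With these made-up numbers, two of the ten features (20%) cover 80 of the 100 needs, and the remaining eight features fight over the last 20.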
I propose that these two concepts can be combined together to predict a specific way that software projects can be guaranteed to fail or cause frustration to users. To start, a personal example:
Apple iCloud contact syncing is awesome. Like many Apple products, it just works: I edit the phone number for a contact on my computer, and seconds later the contact on my phone is updated and the next time I call them it will use the right number. I can create contact groups and they sync around as well. However, there is a concept of 'starred contacts' on my phone which doesn't seem to appear anywhere else. I forget how great the synchronization is because it's so frustrating that I can't work with the list of 'starred' contacts on my computer, and I waste time trying to find workarounds like making a 'starred' group which is less than perfect. As contact lists tie in with other things like Messages, this gets even more frustrating as groups of contacts are inconsistent across more devices and applications. Because of how much "Just Works", is it really too much to ask for my starred contacts to sync between Messages and my Phone?
Because Apple got things so close to perfect, I am frustrated instead of amazed. If they gave me a little less, or got things perfectly right I'd be a lot happier but I'm in an 'Uncanny valley'-ish hole of frustration and time wasting instead, which is the opposite of what I want from technology designed to make me more productive.
At Highgroove, we develop custom web applications for a pretty diverse base of customers. Our clients have a good idea of what they want to happen and we help them break it down into the tiny pieces. A 'Minimum Viable Product' line gets drawn in the sand and features get prioritized before or after this. Chances are pretty good that infinite scroll and auto-resizing profile images will be awesome, but they probably aren't needed to launch. We know from tons of experience that if the MVP marker gets pushed back in any significant way, there is almost no chance that our customer will get what they 'want' regardless of how much more development effort is on the schedule.
In software, there are always new ideas so to finish 'everything' would take an infinite amount of time. 'The Mythical Man Month' points out that software projects are almost always considered unstarted or 90% done, but never finished, and any project that attempts perfection will not ever be completed on time or under budget (if even at all).
The lesson here is that attempting to spec out and complete functionality to meet the needs of every potential user of your application is an insurmountable task. The 80/20 rule applied here (80% of the needs with 20% of the features) tells us that we'll save tons of time and effort by skipping out on a few things, and the Uncanny Valley tells us that unless we get things absolutely perfect (which never happens), we'll be stuck in a hole of user frustration due to things being close but not quite right.
So come up with crazy ideas and tons of features for your app, but put in the effort to prioritize what is really important and get that done first so that you can ship something that causes delight instead of disappointment. 95% of the way there is closer to 0% than you think, and if you stick to something more realistic, you'll ship something that your users love before you even realize it.
2012 Mountain Bike Trip: Sun Valley
Submitted by ckdake on Wed, 2012-06-20 15:03
Another year, another awesome bike trip. What started in 2009 as just me and a rental bike in Santa Fe, NM turned into a party of 7 riding epic trails for a week in Sun Valley, Idaho. Austin and Paul (who made appearances in bike trip 2010 in Portland) drove in from Portland, Jason (my riding buddy for last year's week of riding in Colorado) flew in from Pittsburgh, and Bob/Jim/Charles flew in from Atlanta for a few days.

Everyone had a slightly different trip, and on my rental Turner 5.Spot from Pete Lane's Mountain Sports I ended up putting in about 20 hours of riding, covering 150 miles, climbing almost 19,000ft, and topping out at 9125ft above sea level. It was a little harder to breathe, but the main trouble was the super long steep climbs that required quite a bit of hike-a-bike, and the super long flowy fantastic downhills that required sharp concentration to avoid wiping out. My only fall ended up being a minor one on my last day, and on the other end of the scale Jason got 6 stitches in his leg. (After a temporary fix and the rest of the day of riding and dinner and beers of course.)
We saw a lot of epic views like:

and tons of super well-maintained, well-marked, and really fun trails like:

The play-by-play:
- Day One: River Run, Traverse, Bald Mountain, Broadway, Warm Springs, Wood River - With fresh legs and a bit too quick of a start, we rode up to the top of Bald Mountain and came flying down Warm Springs. Ski Resorts are weird in the summer, and this was our first encounter with more hike-a-bike than expected, but it's a route I would do again. The Warm Springs portion will be on the US Nationals XC course in a few weeks.
- Day Two: Greenhorn and Mars Ridge - Our biggest mileage day, we cruised south of town on a nice paved bike path, up the super nice Greenhorn trail, and then did hike-a-bike battle up to Mars Ridge. The hills were steep, the views were great, and the downhill on Red Warrior Creek was a mix of technical and crazy fast that put smiles on everyone.
- Day Three: Heidelberg, Suny, Shady, Adam's Gulch, Pork Chop, etc - Our first day with the full crew, and our first day on 'classic' Sun Valley singletrack. Just about everyone had a mechanical in the first few miles, and many rear derailleur limit screws were adjusted. Apparently we went 'too far' up Adam's and 'nobody does that'. The hike-a-bike about killed a few people, but again, lots of fun flowy singletrack and the fact that we all ran out of water didn't stop us. We stopped once to try and use Bob's filter, but the mosquitos were far too bad.
- Day Four: Chocolate, Fox, Oregon - This was our only day doing something other than straight to the top and straight back down, and the continual mix of uphills and downhills was pretty strange, and pretty nice. For the most part it was no walking required. While without the views from the tops of mountains, there is something to be said about seeing in front of you miles of singletrack stretching down a hill, through a meadow, and around a distant peak. Once the 'short day' people peeled off, those of us on the 'deathride' crew headed up the steep side of Oregon Gulch. After hiking up the hill, Bob shared his Cherry Coke with us and we tore down some of the best downhill on the trip.
- Day Five: National Championships XC Course - In the morning, we all went to a shooting range to shoot some skeet, and all of us hit a few of the clay pigeons while most of them made it to the ground safely. We dropped off people at the airport, and headed back to town. Down to just Jason and me, and with me feeling a little sick, we tackled the 2011 XC course. No world records were set, the uphill was pretty killer, and the downhill was a blast. I can imagine how crazy it would be to race on the trails and they're changing the start of the course this year to give people more places where they can pass.
- Day Six: Greenhorn and Imperial - I'd been feeling sick but thought things might get better, so we headed back down the bike path to go up Greenhorn and decide what we would do next there. Deer Creek is apparently fantastic, but by the top of Greenhorn my spirits had left and I was ready to get home. We rode a little more uphill on Imperial, and got to enjoy one of the most scenic descents of the day to get back to the road.
6 days of riding was plenty, and there's a chance that a hair less than 6 days is the sweet spot for a bike trip as I was pretty beat up by the end. The trails around Sun Valley are all pretty great, but it's a network of trails and we did a bit of work to get to places we wanted to be, much of which involved walking bikes up steep hills for hours. Talking with people in bike shops and other mountain bikers, we hit all the 'must ride' trails in Sun Valley (except maybe Deer Creek), but there are a lot more trails, many of which are apparently 'Ride it once and cross it off the list' kinds of trails. Last year's trails in Colorado had more variety, less time on bike paths and unpaved roads, and more climbable climbing (and more climbing), but at the expense of spending as much time driving a rental car around as actually riding bikes. I'm happy to cross Sun Valley off my list; it was a week well spent.
Aside from bikes: beers were drunk, steaks were eaten, skeet was shot, and Sharktopus was watched, and it's going to be interesting to see how next year's trip will manage to outdo this one.
Check out the full set of photos if you need any more reason to visit Sun Valley and get in some riding.

