Work
June California Trip
Submitted by ckdake on Sat, 2009-06-06 21:06
This past Monday, Seth and I headed to California for the week to get some work done. We didn't have hotel reservations that we knew of, and had a mess of things to clean up in the datacenter, so we drove the rental car straight from the airport to the office at around 11am on Monday and got started.
Monday
We spent Monday in the office, getting some face time with the new office IT guy ("chicks" is his username, which is the source of much hilarity) and meeting with some people that we have ongoing projects with. Lunch was at Dittmer's Gourmet Meats and dinner was 4x4s at In-N-Out. We ended up crashing at Jesse's house after sitting in his hot tub drinking Mickey's and watching Apocalypse Briggs (part 1 here, additional parts in "related videos"). It's nice sharing rooms with Seth because he likes sleeping on the floor, which means no complicated figuring out of beds/couches/etc. A pillow and a blanket, and he's set!
Tuesday
Pretty early on Tuesday, we headed directly to the datacenter, stopping at Le Boulanger on the way for tasty breakfast sandwiches. After getting our hands added to the biometrics system, we began sorting spare parts, getting rid of trash and server packaging, and removing wires that weren't plugged into anything. Two people from Virident Systems showed up with a box for us to install that we're doing some experimenting with, and things are looking pretty good so far with that. They took us out to eat at a Malaysian place that was pretty good, and our afternoon in the datacenter was more cleaning up. We drove to Thee Parkside in the city for beer and $2 tacos with some of the Gallery crew, and headed over to Digg with Robert for a few more beers. Afterwards, Seth and I drove Bharat back home and slept at his brand new house in Menlo Park. Digg HQ:
Wednesday
We started off Wednesday morning dropping Bharat off at work at Google, and getting a quick tour of Google HQ for Seth. After that was another datacenter day, interrupted by a trip to the office for some Japanese food for lunch. The grand total of trash we cleaned up filled a 48 gallon plastic bin, and we began fixing labels on machines, noting rack locations in our ZenOSS installation, and properly labeling all the outlets on our PDUs and what they are connected to. Aside from everything looking a _lot_ better, highlights of the day included finding a machine we didn't know about with 32GB of RAM (now an OpenVZ box doing a lot of things). For the evening we headed up to Lila's place in the hills of Los Gatos where the SugarCRM IT crew enjoyed beers and pork ribs, and Seth and I slept in a spare room there after staying up long past the always amazing sunsets:

Thursday
After the crazy drive back down from Lila's, we headed to the datacenter for the morning. It took us about 4 hours to finish things up including rewiring all the cat5 in one rack and mostly wiring up a new rack of machines (still waiting on the switch and PDUs before that will be done). Back at the office, we had a very late lunch of more 4x4s at In-N-Out because they apparently couldn't make us 5x5s. I spent the rest of the afternoon catching up on some of the ticket backlog assigned to me since we'd been busy all week doing other things, and around 6:30 we drove up to Igor's place and got to see "mini beast", Igor's newborn. Several other SugarCRM people met up with us to head to Whiskey Thieves for some whiskey sampling. At some point, Julian and I put a few dollars into the Area51 machine there and ended up with 5th and 6th place on the high score list, and he told us "The Japanese Fan Story" which you should get him to share if you haven't heard it yet. Afterwards, we stopped by The Owl Tree and ended up at Cocobang for some super spicy Korean BBQ chicken to finish off the week. A week's work:

Friday
Friday morning was back to the airport to fly home. It was another crazy exhausting week in California and while we got a lot done, I'm definitely glad to be home. Delta helped us out because both our flight out and our flight back took ~45 minutes less than expected. All meals not described above were either not eaten, or consisted of cherry coke and taquitos from 7-Eleven. Now that I'm home, it's time to hunt down some people to pay their hosting bills (Eldon- While biking today I saw you on your bike so I know you are alive!) and mow the grass. Pictures from the week are at http://ckdake.com/gallery/2009/june-california/.
Some Changes
Submitted by ckdake on Fri, 2008-01-04 09:34
It's been pretty busy the last couple of weeks. A lot's gone on, so here's the short version and I'll likely go a bit more in depth over the next few weeks as I get around to starting to post more regularly.
- I graduated from Georgia Tech with a Masters in Computer Science specializing in Networking.
- I bought a house in Reynoldstown/Cabbagetown in Atlanta, GA and moved in. Still need to get some furniture and fix some things.
- I started work as an Operations Engineer for SugarCRM where I'll be working on scalable infrastructure sorts of things.
CPR, SNMP, and the Sun T2000
Submitted by ckdake on Tue, 2006-06-06 08:17
A while back I got a Sun Microsystems T2000 server for a few months as part of their try and buy program. It went into the Research Network Operations Center rack in OIT at Georgia Tech and we went to town on it seeing what the box was capable of doing. This was a 4 core box with 4GB of RAM in it, and the beauty of this server is that in this configuration it can handle 32 simultaneous threads in hardware on the CPU at any one time. The clock speed is only 1GHz but it's somewhat like having a 32 CPU machine. Better descriptions of this are available if you Google around a little bit, but here are our experiences with it.
Most of my tests focused on MySQL performance. The project I'm working on, CPR: Campus-Wide Network Performance Monitoring and Recovery, is a full mesh network of monitoring machines that all report back _lots_ of data. I wrote a little parody of a Microsoft SQL Server ad to sum up what we do with databases here:
"How does Georgia Tech predict failures for its 57,370 network ports on 1765 switches in 188 buildings? They import data from 54 systems into one data warehouse requiring over 100 million rows, all running on MySQL 4 with no downtime that's ever been noticed. Current deployment rates indicate that over the next 12 months, the system will acquire around 1 billion rows of additional data."
This is all currently running on a slightly dated Sun box and we ran into some perhaps unique performance issues. Originally all of the machines reported back on the hour at the same time. Over 50 machines initiating SCP sessions at exactly the same time was bringing the box to its knees, and the data import script would take a while to run. Load averages on the box stayed over 8 most of the time and if something went wrong it took days to get caught back up. We have been able to fix a lot of this by staggering the reporting times and rewriting the conversion, import, and archiving scripts, but more hardware would probably help so I did some testing on the T2000.
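As a rough illustration of what staggering the reporting times can look like, here is a minimal Python sketch. It is not the actual CPR scheduling code; the function name, the hostnames, and the choice of even spacing across a one-hour window are all assumptions made for the example.

```python
def staggered_offsets(hostnames, window_seconds=3600):
    """Assign each host an evenly spaced start offset (in seconds) within the window.

    Hypothetical helper: hosts are sorted so the assignment is stable
    from run to run, then spread evenly across the reporting window.
    """
    ordered = sorted(set(hostnames))
    count = len(ordered)
    return {
        host: (index * window_seconds) // count
        for index, host in enumerate(ordered)
    }


if __name__ == "__main__":
    # Hypothetical names standing in for the monitoring machines.
    hosts = [f"cpr-{n:02d}" for n in range(54)]
    for host, offset in staggered_offsets(hosts).items():
        print(f"{host}: report at minute {offset // 60}, second {offset % 60}")
```

With 54 machines and a one-hour window, this puts a little over a minute between consecutive reports instead of having every machine open a session at the top of the hour.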
The primary latency in what we do seems to be the disk IO for MySQL, and the T2000 was of little to no help with this. Additionally, the T2000 was unable to keep the entire database in RAM due to the way that memory is partitioned on a per-CPU basis. This meant slower searches (some of our searches take over 10 minutes to run on the existing hardware) and no noticeable increase in insert performance. I didn't have time to test the simultaneous SSH connections from the CPR network, but performance there probably would have improved. Later on we would like every machine to report back every 5 minutes after running its test and push the data into the database right then, instead of scheduling the import separately. That would give us a more real-time picture of the network and would most likely be greatly helped by the T2000's architecture.
After my testing was finished, I handed off the box to Jerry Swann, another OIT employee who runs the SNMP monitoring for the campus network, and here is what he had to say about it:
Then we got the 4 core T2000, as a test of the backwards compatibilities of the box, we copied the present polling directory structure from the old system to the new, including the program executables and binary data files. With minor changes, a symlink here and there to fix the program paths (/usr/local/bin vs /usr/bin), the code ran. Not only that, but it ran fast. Where before we had to control the number of devices polled in order to keep the total polling time under our 5 minute polling interval, the T2000 was able to poll all 100 of the previously configured devices in 50 seconds (down from 4min 30sec).
Well, that isn't much load, especially since I was trying to see what this new box can do. So I decided to use a list of all the network devices that our network admins maintain in a database called the Book of Knowledge to see if that would load up the T2000. There were now a total of 866 devices, all of the network pollable core routers, core switches, edge switches/stacks, and firewall instances.
After configuring the polling software to allow 300 consecutive processes, I started the poller polling. It was awesome, the load went up to 50 within a couple of minutes. It then topped out about 75, and surprisingly enough the box was still usable for other things.
What was really great was that where the ultra2 was monitoring a maximum of 1050 interfaces on those before mentioned 100 devices, the T2000 was now monitoring about 28000 interfaces every 5 minutes.
When you combine the fact that I had to only do minor changes to the code plus it was capable of using that backward compatibility and run the software so much faster, you really can't beat the T2000. Buy one today.
All in all it was a pretty well powered box and a lot of fun to play with. RNOC ended up ordering 4 of the 8 core T2000s for various things, one of which will definitely be the SNMP monitoring, and if you have similar tasks to do the T2000 is highly recommended, especially if you are already running on a Solaris system.
Dealing with MySQL Replication
Submitted by ckdake on Tue, 2005-08-09 15:08
This will be a little different than previous posts but I figure I should put other things in here. I've been biking a whole lot recently, including on the streets of Buckhead at night in the rain, breaking some things on my mountain bike, and more of the same, but today I'm talking about work.
So at work we have an e-mail filtering appliance. SpamAssassin does its thing and we store all of the Bayesian filter data in a MySQL database. A master server is responsible for all of the learning, but there are also backup servers that filter email first to reduce load on the main server. To be the awesomest it can be, the backup servers all have MySQL installs that replicate off of the master MySQL database on the master server. This is great when it works, but yesterday it was pointed out to me that another table in the database that does recipient verification wasn't updating properly. Turns out there were major major problems with the whole thing and I spent a good bit of today trying to figure things out. Here's what I learned:
- SpamAssassin uses composite keys in the token table for some reason. So in the bayes database there is an id that is the same for all tokens, and then the token itself. The key for the table is the composite of these two attributes and that's just weird since the id is pretty much always the same. (It's the same in all ~135k rows in our database.)
- Various terminals and character encodings are a total pain to deal with when handling binary data in MySQL. The bayes token is stored as a little piece of binary, and when trying to update/delete/search by hand for specific tokens, using wildcards (%) at the beginning and the end, as well as for any sets of 8 bits that aren't printable in ASCII, is really the only way to go. Otherwise you'll get 0 rows returned and that's just not right.
- When importing lots of data from a .sql file into MySQL and it gives duplicate key errors, you have two options:
  - Search and replace ),( in any insert rows with )\nINSERT INTO bayes_token VALUES( . This lets you see which lines are duplicates and just modify those lines.
  - The way I chose was to add ON DUPLICATE KEY UPDATE id="5" to the end of each INSERT statement. In my particular case, all ids were set to 4 and when there was a duplicate key, I set the ids to 5. Then I could go in and SELECT * FROM bayes_token WHERE id="5" and fix any problems by hand. Oddly enough I could do UPDATE bayes_token SET id="4" WHERE id="5" and they were updated with no problem, even though this gave them a duplicate key. Even more oddly, it doesn't seem that it actually was a duplicate. Before I updated, I ran the following SQL (I checked and the num_spam number was unique to the one row I was looking at that supposedly had a duplicate key), which should have given me both duplicate rows without worrying about copying the binary data in a non-friendly terminal, but SELECT * FROM bayes_token WHERE token = (SELECT token FROM bayes_token WHERE num_spam = 34762) only returned the one row as a result. Weird.
- It is easiest to start over with MySQL replication. Do the database dump on the master, take note of the binlog position, import the .sql file into the slave, and start up replication on a new server id. Trying to "fix" errors is a bit of a pain and I ran into the following:
  - "duplicate key exists" - I could go in and stop the slave, delete rows similar to the token, start the slave again, and usually it would move on, but due to not being able to correctly copy and paste the binary data to search with, sometimes 0 rows would be deleted and sometimes hundreds would be deleted. This probably had some effect on the quality of the filtering so I gave up on this approach.
  - "Unable to parse replication log" - suck. I hate this error. Couldn't figure out a good way to fix this one and working with the crazy huge log file was just no fun.
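Two of the chores above lend themselves to a small script, sketched here in Python. This is not what I actually ran: the function names are made up for the example, and the splitter adds a semicolon after each statement, which the search-and-replace described above does not. The first helper writes a binary token as a MySQL hexadecimal literal (X'...') so it can be pasted into a query without the terminal mangling it. The second does the naive ),( replacement so each row gets its own INSERT line.

```python
def hex_literal(token: bytes) -> str:
    """Render raw bytes as a MySQL hexadecimal literal, e.g. X'01FF'."""
    return "X'" + token.hex().upper() + "'"


def select_by_token(token: bytes) -> str:
    """Build an exact-match query on the binary token column."""
    return f"SELECT * FROM bayes_token WHERE token = {hex_literal(token)};"


def split_multirow_insert(statement: str, table: str = "bayes_token") -> str:
    """Put each row of a multi-row INSERT on its own line.

    Naive textual replacement: it will mis-split if a value itself
    contains the characters "),(" (possible with raw binary tokens).
    """
    return statement.replace("),(", f");\nINSERT INTO {table} VALUES (")


if __name__ == "__main__":
    print(select_by_token(b"\x01\xffab"))
    print(split_multirow_insert(
        "INSERT INTO bayes_token VALUES (4,'a',1),(4,'b',2);"
    ))
```

With one statement per line, a duplicate key error from the import points at a single row rather than at the whole batch.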
So that was my fun at work today. It's time to get Gallery 1.5.1-RC2 out the door before meetings and apartment hunting this evening.
Laura + Work + Bike Ridin
Submitted by ckdake on Sat, 2005-07-23 17:24
And so almost ends another crazy week of Laura, work, and bike riding.
Sunday night after many delays Laura finally got to the Atlanta airport around 12:30am and we came back to my place and crashed. I had to get up early for work each day she was here and she got to sleep in so I don't know about that... I do know that I'm tired.. but gotta work. Several things were done over the week including a good bit of eating and movie watching. The biggest "New Movie" of the week was Wedding Crashers. That movie renewed my faith in R rated movies. It was crude at parts, crazy funny, and I'd even recommend everyone see it. Other movies included Charlie and the Chocolate Factory (for my second time, just as good) and maybe half of Army of Darkness (I own it on DVD, one of my favorites) before we passed out one night.
As far as food was concerned, we had elk steaks at my parents' Monday, grilled up some burgers and dogs with Michael and Jenny and others Tuesday, and went to Chow Baby on Wednesday, which was followed by some hot fresh Krispy Kreme. Everyone needs to go to Chow Baby sometime because it is fantastic. All you can eat stir fry for not too expensive and you pick every little thing you want in it and how much of each. So cool. Anyways, as usual it was awesome hanging out with Laura for an extended period of time and we hung out with more than a couple people: My parents, Anya, Melissa, Becca, Aaron, Michael, Jenny, Emily, etc. Good times until she had to leave Thursday morning. I headed back from the airport and took care of some work things at home and then went to Blanket's Creek with Mark for a little mountain biking followed by dinner at The Vortex and riding back in Mark's relatively new 2005 Subaru WRX STI. Fast car. mm. However, that biking really beat me up because it was freakin hot. Hot just isn't my kind of weather.
Anyways, after more work and taking care of some stuff around here Friday it was time for another FM ride. I drove to the Lair at about 6:30 and we biked over to Christopher's work downtown. There a bunch more people met up and we rode to drop off some movies at "Movies Worth Seeing" and then headed to Angel in Decatur. It was lots of riding and we were hauling pretty fast. Some pictures of the before and after sitting around are on our website. After some drinks, Ben, Robert, and I headed back to the Lair to shower and meet some people for a movie. We got to the theatre at 9:59 but the movie was at 10 and the line for tickets was over 30 minutes long. We bought tickets for the 12:30 showing of The Devil's Rejects and headed to Waffle House to kill some time. The movie turned out to be fantastic, better than House of 1000 Corpses, and after squeezing back into Robert's Porsche to get back to the Lair, I drove home and got to sleep around 3:30.
Today I woke up super early and headed to LYD to reformat one of the servers that I manage. It took less time than expected and I've been SSH'd in from home getting things set up properly for most of the afternoon. Tonight may be more biking or I might just get some food and go to sleep and bike tomorrow instead. It's been a long week..

