Two MySQL Posts Which Make Me Secure In That I’ll Have Work For a While

This is going to make me sound like an old man (OK, right, I am old. So, suck it.), but these two posts alternately scare me and let me know that I’ll have work going into the future as a DB developer.

In the first post, Chris Moos describes a problem he’s having with queries performing badly, doing full table scans over a 12M row table. His answer: partition the table. So, Chris, first off, let’s talk about this 12M row table. Does it really have to be so long? Can something be done (other than partitioning) to make it smaller. Second, um, have you heard of indexes? They work pretty well. (Not that our tables are all that big, but) We have tables well over twice that long (and they sound like they are much wider as well), and get good response on queries using the indexes. Why, why, would you go with something as complicated as partitioning, when there’s other approaches that should be tried first.

The second post describes how, with the coming of more NoSQL solutions, MySQL doesn’t need to be used to store data objects as blobs anymore. A solution that earlier “made perfect sense.” There’s many arguments to the design of data objects, and I’ve been a part of many of them. I have big problems with the storing of objects as blobs which I won’t go into here. If that’s something you want to do, fine. But to do that using MySQL (or any relational DB is just plain stupid). And, it never made any sense, let alone perfect. When picking tools, familiarity with the tool is only one aspect. A much larger aspect is whether it’s the right tool for the job. MySQL has a lot of strengths, but the way it stores blobs is not one of them. And, the NoSQL solutions aren’t really new. Object-Relational, Object, and Document-based DB’s have been around for a long time. Not to mention, just using ye ole file system.

My design steps?

  1. Choose the right tools and be able to defend why you’re choosing them.
  2. Keep things simple. Only get more complicated as the product/users dictate.
  3. Think ahead, but not too much ahead. Are you designing for today, or the future? Don’t want to go crazy as who knows what the future holds, but try to at least think 3-6 months ahead.
  4. Be ready to redesign something. Don’t hold onto a design for reasons of pride. It doesn’t work, then it doesn’t work. It’s only data, it’s movable.
02/17/10 11am
no comments | databases, mysql read on

How to load large files safely into InnoDB tables

This is awesome! We’ve been having multiple issues with data loads recently. Especially one file we get monthly, which is not very wide (few columns, with small data types), but has over 27 millions rows. Even though our server is now huge, and I split the file into 7M row files, one of the “chunks” still almost brought down our database this month. About an hour into the load (the first two chunks, which loaded with no issues, took about 2.5 hours), all inserts started timing out. Looking at the monitoring graphs on RightScale, the cpu’s were pegged all waiting for disk response. Selects were still being returned, but all inserts had stopped. Weirdly, while new inserts were timing out, two inserts and one update were just hanging. Unfortunately, killing the load didn’t have any immediate affect as it needed to rollback the hour of changes it had made (don’t ask me why MySQL does a “load data infile” within a transaction; seems much smarter if it weren’t, but anyway …). Luckily, killing the three “hanging” processes did allow other inserts to start up again. And, after about 30 minutes, the data load process finally went away.

So, I went a-googling and found this article. We tried it on one file made up of the two remaining “chunks”, so around 13M rows (1.3GB). We set the fifo to 500K, and a delay of 30 seconds between each. Not only was the load on the db amazingly lower, but (most likely since memory wasn’t filling forcing the system to start swapping) the load was amazingly fast, loading everything in just over an hour. Yes, that’s right, from 2.5 hours for 7M rows to (I’ll round up and say) 1.5 hours for 13M rows.

Needless to say, I’m geeked-out about this.

Now, we just need to figure how to turn this into a repeatable monthly process.

Portland Transport: What Kind of Traveler are You?

From a survey originally done in Germany, five type of travelers are identified.

  1. Public transport rejecters. These believe public transport provides little sense of control or excitement. They are not open to change and see access to mobility as very important.
  2. Car individualists. Similar to public transport rejecters, but are open to change and consider privacy more important.
  3. Weather-resistant cyclists. Positive towards bicycles and will cycle even in bad weather.
  4. Eco-sensitised public transport users. Positive towards public transport and are highly influenced by their environmental conscience.
  5. Self-determined mobile people. Perform the highest percentage of trips by foot; they do not consider mobility important and are not open to change.

Among these, I guess I’d be closest to #5. Would much prefer to do my trips by foot. But, in no way do I “not consider mobility important”: in fact, it’s mainly for reasons of mobility that I prefer walking. Though, by mobility, I’m thinking versatility, and they may be thinking distance. Nor do I think I’m “not open to change”: though I used the bus a fair amount in Phila, I use it much more often here in Portland since distance if much more of an issue.

02/16/10 6pm
Comments Off | Uncategorized read on

HipHop for PHP: Move Fast

I’m posting this a little late, but this looks quite interesting. It may end up going nowhere, or, at least, maybe in a direction that doesn’t help us much, but it’ll certainly be interesting to watch.

Follow up with thoughts on who will and won’t be able to take the most advantage of this.

02/16/10 6pm
Comments Off | geek read on

Portland Timbers 2010 Schedule

I was again able to make an ics version of the Timbers schedule using this script.

Unfortunately, many of the times are still TBA, but I’ll update this later (if needed) as those times get filled in.

timbers-2010.ics

02/15/10 10am
no comments | timbers read on

Video : The New Yorker

Great short video on the making of the most recent New Yorker Money Issue. Loved the way they involved three different artists to produce three “covers” to create a little story. I hope they do something like that again soon. This little bit shows why I love this magazine, both that they do so much work with illustrators (so tired of photos) and that they feel free to spend the money to do something which was probably missed by most readers, let alone others. (via drawn.ca)

02/05/10 10pm
Comments Off | art read on

Walk or Bus? – Visualmotive

My last three years in Philadelphia, I (mostly) walked to work, since home and work were both downtown and right around two miles apart. (For two years of that, I worked across the street from Independence Hall, which was awesome until Sept. 11.) On days that I wanted to move fast while going to work, or it was too hot/cold, I’d walk up to Chestnut St. Usually, I’d go from bus stop to bus stop, and if a bus and I pulled into a stop at the same time, I’d hop on (rarely, I’d just wait for a bus at the nearest stop). Then, I’d repeat the process going home on Walnut St (or sometimes a street further South, but Walnut is much more fun to walk down). Quite often, I’d reach my destination never getting on a bus. Wasn’t always great if the weather sucked, but was good for my waistline (which has gotten much bigger now that we’ve been in Portland for almost as long as we’ve been in Phila).

Anyway, that was a long way to say it might have been interesting to have this chart back then. Won’t do me much good right now. But I probably won’t be working from home forever.

10/01/09 2pm
Comments Off | Design, health, transit read on

My Hidden Room » MySQL, Alter Table, and How to Observe Progress

Our tables have gotten large enough that it takes hours to do an alter table. So, this could be a very good thing. It doesn’t relieve any of the issues dealing with the table being locked during this time, but at least we’ll know how much longer it will be locked.

Makes me wonder if (since our servers are in the cloud), we could apply any new indexes on a slave, let the slave catch up, then promote the slave to master. That quite possibly could work with little down time. Hmmmm.

10/01/09 2pm
Comments Off | databases, geek, mysql read on

Fred’s Steak v1

Tried this recipe last weekend for Fred’s Steak (don’t ask me why the link is from wikibooks – that just seems wrong).

For those that don’t know Fred, you’re missing out, here’s two posts which will give you an idea of it’s fan base (and where to buy it).

I was pretty faithful to the ingredients (even the coffee), though I didn’t need so much so I basically halved everything (the recipe says it’s enough for four steaks). One, big difference is that I only marinated for 36 hours and the linked recipe calls for 48-72 hours. The steak was not black enough (seriously, the real Fred is incredibly – almost disgustingly – black). And Theresa said it tasted like Fred-lite. That could be my shorter marinate time, or it might be the ingredients (can’t say that I agree with all those used … I mean, coffee?). It wasn’t bad though, and might be worth a try with a little tweaking.

There’s also this, which is close to above, but adds chili, onion and garlic powders (boo!) and marmite (!).

If anybody has tried and been successful (even if it’s not fully Fred’s if it’s good, it’s worth knowing) please let me know.

08/29/09 5pm
Comments Off | cooking read on

Gmail Nudges Past AOL Email In The U.S. To Take No. 3 Spot.

Wait. What? People are still using AOL?

08/14/09 11pm
Comments Off | geek read on

About

Hello, I’m Rob Lusardi.

I’m married to a wonderful woman named Theresa, and have a son that is getting too sophisticated for his (and my) own good. We live happily with our dog in Portland, OR.

I’m a Software Developer specializing in Data Architecture and Web Application development. I use this site mainly to remind myself of different things I see around the web that may come in handy later.

Categories