Friday, September 7, 2012

Book Review for "Exploring Everyday Things with R and Ruby"
by Sau Sheong Chang, O'Reilly Publishing

Data exploration and visualization made relatively easy, with Ruby and R.

I've been a fan of R for quite some time now. I'm not much of a statistician, but I am good at finding problems when given a visualization of a pile of data. That's what I like R for. I've used it to examine log files (especially timing metrics) and found leaking connections, pointers to under-allocated resources, and other problems. R can tell you things about data you'd never find just by looking it over.

The thing about R is that you usually have to set the data up in a format that makes it usable. R likes data grouped into neat bunches, which can then be mathematically examined and displayed on nifty plots and graphs. Preparing that data is something I had usually done in Python, but this book chose Ruby. I was happy with that, as Ruby is a language I could stand to learn more about.
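To make that preparation step concrete, here is a minimal sketch of the kind of thing I mean, written in Ruby. It is my own illustration, not code from the book: the log format, the sample lines, and the name log_to_csv are all made up for the example.

```ruby
require "csv"

# Hypothetical log format (an assumption for this sketch):
#   "<timestamp> <endpoint> <elapsed_ms>"
SAMPLE_LOG = <<~LOG
  2012-09-07T10:00:01 /login 120
  2012-09-07T10:00:02 /search 340
  malformed line
  2012-09-07T10:00:03 /login 95
LOG

# Turn raw log text into CSV text, one row per well-formed line.
def log_to_csv(log_text)
  CSV.generate do |csv|
    csv << %w[timestamp endpoint elapsed_ms]
    log_text.each_line do |line|
      fields = line.split
      # Skip anything that is not exactly three fields ending in an integer.
      next unless fields.size == 3 && fields[2].match?(/\A\d+\z/)
      csv << [fields[0], fields[1], fields[2].to_i]
    end
  end
end

puts log_to_csv(SAMPLE_LOG)
```

Saved to a file, output like this can be loaded on the R side with read.csv, which gives you the neat columns R wants for plotting.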

The first chapter gives you a good introduction to Ruby. Readers should probably know what an object-oriented language looks like, but not much else is required. The author presents the basics clearly and minimally, probably enough to get most programmers off the ground. I liked what I read.

The second chapter introduces R. Again, the author is very clear and gives plenty of easy-to-try examples. In my opinion, this book provides one of the best introductions to R that I've seen. Very well done.

Once you've been introduced to the core toolkit, the author is off to the races, explaining how you might build data-driven solutions to a variety of contrived problems. You get examples of how to calculate bathroom capacity for an office building, how to model a market economy, and how to model a flock of birds as they travel. The author is creative about the problems. For each one, the author sets up an object-oriented domain to represent the problem space. The next step is to produce data from the domain, and the final step is to run that data through R to see how it can be interpreted. All these steps are explained, but you might find yourself referring to external sources once in a while to figure out how to read a particular line of Ruby or R.
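To give a feel for the "model the domain, then produce data" half of that workflow, here is a small Ruby sketch in the spirit of the bathroom example. It is my own toy version, not the author's code: the Restroom class, the three-minute visit time, the arrival rate, and the seed are all assumptions I picked for illustration.

```ruby
# Hypothetical domain object: a restroom with a fixed number of stalls.
class Restroom
  attr_reader :queue_lengths

  def initialize(stalls:, visit_minutes:)
    @stalls = stalls
    @visit_minutes = visit_minutes
    @occupied = []        # minutes remaining for each stall in use
    @waiting = 0          # people currently queued
    @queue_lengths = []   # queue length recorded after every minute
  end

  # Advance the simulation by one minute with `arrivals` new people.
  def tick(arrivals)
    # Count down every visit in progress and free the stalls that finished.
    @occupied = @occupied.map { |minutes| minutes - 1 }.reject(&:zero?)
    @waiting += arrivals
    # Seat waiting people while stalls are free.
    while @waiting > 0 && @occupied.size < @stalls
      @occupied << @visit_minutes
      @waiting -= 1
    end
    @queue_lengths << @waiting
  end
end

# Run the model with a seeded generator so repeated runs give the same data.
def simulate(stalls:, minutes:, seed: 42)
  rng = Random.new(seed)
  room = Restroom.new(stalls: stalls, visit_minutes: 3)
  minutes.times { room.tick(rng.rand(0..2)) } # 0 to 2 arrivals per minute
  room.queue_lengths
end

# Emit the data as CSV text, ready to be read and plotted in R.
rows = simulate(stalls: 2, minutes: 10).each_with_index.map { |q, i| "#{i},#{q}" }
puts ["minute,queue_length", *rows].join("\n")
```

The point of the pattern is the separation: the Ruby side only knows about the domain and spits out rows, and all the interpretation happens later when R plots queue length against time.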

I found these problem exercises a little tedious at times, but valid in their construction. It's sort of like reading someone else's infrastructure code: you can gain respect for how their thought process works, but working through it to absorb the ideas isn't always fun. Probably good mental exercise for a programmer, though.

So, what's the verdict? I liked this book and am certain I'll use it as a reference once in a while (especially for the R portions). Ruby I may or may not end up using again, because Python is fun to program in. Is this book worth the money? I'd say it is, especially if you don't yet grok R or Ruby. If you're at the mid-to-expert level with R, then you may not have much to learn technically (assuming you already have a way to tee up your data), but you could find the problem-solving cases interesting. Or not; it depends on how much you like reading that kind of stuff. If you're flat-out new to both R and Ruby, you probably should buy this book just for the first couple of chapters. It'll open up your world.

The book can be found here.

Happy Data Visualizing!