Saturday, November 3, 2012

Book Review for "Oracle Certified Associate, Java SE 7 Programmer Study Guide"



The text was well written and easy to understand.  Each chapter starts by explaining what will be discussed in the pages ahead. The middle of the chapter explains the subject matter.  The end of each chapter outlines which objectives of the test were covered, and a short quiz to test the reader's understanding follows.  It is structured this way throughout the book.

At 332 pages, it is a fairly big book.  Illustrations are provided for topics that would benefit from them.  As one example, a discussion about pointers and the objects they reference might include an illustration.  Code samples are clean, minimal, and well suited to the topics they target.

Nine chapters are present.  They cover: Getting Started with Java; Java Data Types and Their Usage; Decision Constructs; Using Arrays and Collections; Looping Constructs; Classes, Constructors, and Methods; Inheritance and Polymorphism; Handling Exceptions in an Application; and The Java Application.

As you might expect, the contents of the above chapters are technical in nature, explaining Java in the subjects described.  The text is understandable and well presented.  I do wish the ordering were a little different in places.  For example, a new programmer might benefit more from learning about packages, the directory structure, etc. earlier on in the book.  (This is covered in the final chapter "The Java Application".)  On the whole, there is little to criticize.  The book is a matter-of-fact technical tome that describes Java and how to write basic programs in it. 

If I were to pick a favorite part of the book, it would probably be the explanation of Exceptions.  I've read some really horrible descriptions of the checked-vs-unchecked discussion in other places, but this book does a really good job on this one.  There are other stand-out parts of the book as well, but IMHO this one was one of the best I've seen on this particular topic.
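
For readers who haven't met the distinction yet, here's a minimal sketch of it.  This is my own illustration, not an example taken from the book; the class name ExceptionKindsSketch and the methods readConfig and parsePort are hypothetical.

```java
import java.io.IOException;

final class ExceptionKindsSketch {

    // Checked: IOException extends Exception (not RuntimeException), so the
    // compiler forces callers to catch it or declare it with "throws".
    static String readConfig(String path) throws IOException {
        if (path == null || path.isEmpty()) {
            throw new IOException("no config path given");
        }
        return "contents of " + path; // stand-in for real file I/O
    }

    // Unchecked: IllegalArgumentException extends RuntimeException, so the
    // compiler requires neither a "throws" clause nor a try/catch.
    static int parsePort(String text) {
        int port = Integer.parseInt(text); // may throw NumberFormatException (also unchecked)
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("port out of range: " + port);
        }
        return port;
    }

    public static void main(String[] args) {
        try {
            System.out.println(readConfig("app.properties"));
        } catch (IOException e) {
            // The compiler insisted that we think about this case.
            System.out.println("recoverable problem: " + e.getMessage());
        }
        // No handler required here; a bad value would surface at runtime.
        System.out.println(parsePort("8080"));
    }
}
```

Removing the try/catch around readConfig turns this into a compile error, while parsePort compiles either way.  That difference is the whole checked-vs-unchecked discussion in miniature.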

Readers new to Java may not realize the depth of the Java platform.  This book (and the exam it is targeted towards) will give the reader an understanding of Java 'SE'.  SE is the basic, desktop flavor of Java that underpins the rest of the Java development platform.  Other forms of Java-- especially JEE with its servlets, JMS, EJBs, JSF, etc.-- are not covered in this book, even lightly.  The same new-to-Java readers should be aware that Java SE is a necessary prerequisite to understanding JEE.

So, who would benefit from reading this book?  Primarily new programmers who are seeking the Oracle certification.  Experienced programmers who are seeking certification would also benefit, probably mostly from the explanation of the objectives of the test so they can know where to spend their time. 

What's the final verdict?  I would say this book is a good Java resource for anyone preparing for the SE 7 certification, and also a decent Java reference for anyone coding in Java.

The book can be found here.

Happy coding!


Tuesday, October 23, 2012

Back to Basics! Review underway for OCA Java SE 7 book

Hello friendly blog readers!

If you're a frequent reader of this blog, you probably remember some fairly technical Java topics.  Like most Enterprise-ish Java folks, I sometimes find myself deep in the weeds with some difficult topic.

But do you remember when you were new to Java?

There are still people out there today who haven't seen EJB2, XDoclet, Struts, Ant vs. Maven and so on!  They are just learning Java today.  How in the world will they ever accumulate the vast array of tangential knowledge they're going to need to survive?

Why, they'll read books of course.

If you're one of those new people, I'm embarking on a project that can help you.  I am reviewing a new book from Packt Publishing that covers the basics of the Java language.  You don't need to be a rocket scientist to read it, and it will give you a nice frame of reference to build your Java knowledge upon.

The book is called "Oracle Certified Associate, Java SE 7 Programmer Study Guide" and it can be found here.

I'll post a review soon.

Meanwhile, happy learnings!

Thursday, September 27, 2012

Get a free JBoss e-Book! (Or other, if you'd rather)

Hello fellow programming enthusiasts,

It's with great pleasure I report to you news that you are entitled to receive a free e-Book of your choice from Packt Publishing.  That's right, a title of your choice!

This is in celebration of Packt's 1,000th title being printed.  The part I like about Packt is that they offer great titles on fast-moving open source projects.  If you want to learn about some hot project, become a contributor, or even just be a plain consumer of open source software, Packt is a great resource.

Here's how it works:  During the event, Packt is inviting anyone already registered at www.packtpub.com, or who registers before 30th September 2012, to download any one of their eBooks for free. Packt is also opening its online library for a week for free to members, offering customers an easy way to research their choice of free eBook.  Just sign in and pick your title!

I favored JBoss in the title to this blog, but there are plenty of other great books you can pick up as well.  (Even a few that cover proprietary software.  But I hope you don't pick one of those!)

You can get started by visiting here.

Happy Reading!

Friday, September 7, 2012

Book Review for "Exploring Everyday Things with R and Ruby"

by Sau Sheong Chang, O'Reilly Publishing

Data exploration and visualization made relatively easy, with Ruby and R.

I've been a fan of R for quite some time now. I'm not much of a statistician, but I am good at finding problems when given visualization of a pile of data. That's what I like R for. I've used it to examine log files (especially timing metrics) and found leaking connections, pointers to under allocated resources, and other problems. R can tell you things about data you'd never find just by looking it over.

The thing about R is that you usually have to set the data up in a format that makes it usable. R likes data grouped into neat bunches, which can then be mathematically examined and displayed on nifty plots and graphs. Preparing that data is something I had usually done in Python, but this book chose Ruby. I was happy with that, as Ruby is a language I could stand to learn more about.

The first chapter gives you a good introduction to Ruby. Readers should probably know what an object oriented language looks like, but not much else is required. The author presents the basics in a clear and minimal manner, probably enough to get most programmers off the ground. I liked what I read.

The second chapter introduces R. Again, the author is very clear and gives plenty of easy to try examples. In my opinion, this book provides one of the best introductions to R that I've seen. Very well done.

Once you're given introductions to the core toolkit, the author is off to the races explaining how you might envision data solutions to a variety of contrived problems. You get examples of how to calculate bathroom capacity for an office building, how to model a market economy, how to model a flock of birds as they travel. The author is creative about the problems. Honestly, some of these were a bit tedious to follow. For each problem presented, the author sets up an object-oriented domain to represent the problem space. The next step is to produce data from the domain, and finally to run the data through R to give insights into how that data can be interpreted. All these steps are explained, but you might find yourself referring to external sources once in a while to figure out how to read a particular line of Ruby or R that is being used.

 I found these problem exercises a little tedious at times, but valid in their construction. It's sort of like reading someone else's infrastructure code-- you can gain a respect for how their thought process works, but it's not always fun going through the knowledge assimilation process. Probably good mental exercise for a programmer, though.

So, what's the verdict? I liked this book and am certain I'll use it as a reference once in a while. (Especially for the R portions.) Ruby I may or may not end up using again, because Python is fun to program in. Is this book worth the money? I'd say it is, especially if you don't yet grok R or Ruby. If you're at the mid-to-expert level with R, then you may not have much to learn technically (assuming you already have a way to tee up your data), but you could find the problem-solving cases interesting. Or not, it depends on how you like reading that kind of stuff. If you're flat out new to both R and Ruby, you probably should buy this book just for the first couple of chapters. It'll open up your world.

 The book can be found here.

Happy Data Visualizing!

Wednesday, August 1, 2012

Early Access Book Review: Go In Action



I've had an interest in Google's Go since it was announced with great fanfare a few years back.  My bread-and-butter language is Java, but I'm painfully aware of the shortcomings of the language.  Once in a while I look at other languages for fun, hoping to find something that brings a little fun back to programming.  With this in mind, I recently picked up a copy of Manning's "Go In Action", in Manning Early Access Program (MEAP) form.

A MEAP book is released in stages, starting when just a few chapters are available.  Then as time goes on, more chapters are added and the reader gets notification that a new e-book can be downloaded.  (You've paid for the book just once, you get the updates free as the book progresses.)  If you catch a book in the early stages, it's kind of interesting to see what the author provides first.

In this case, "Go In Action" is bringing some language fundamentals in the first released chapters.  As you might expect, the first chapter introduces Go and explains the basics of what the language is like.  (i.e. static vs. dynamic typing, the main benefits of the language, some places where it would be appropriate and a few places where it might not.)

Chapter 2 introduces the basics of the language.  There are many small but clear examples, which I found especially informative.  The author explains things in a precise but compact manner.

The third and final chapter currently available deals with Go's use of interfaces and how they can be used to write flexible code.  This chapter demonstrated to me that the book will have substance beyond the basics, so I have high hopes the remaining chapters will provide more expert insights.

All things considered, I found this 3-chapter sneak peek to be a fun read, and one that whetted my appetite for the next chapters.  I'm looking forward to the next MEAP chapter drop!

The book can be found here.

Happy 'Go'ing!

Saturday, July 7, 2012

Book Review for "The Well Grounded Java Developer"





Once in a while there comes a programming book that's so ambitious in scope that you wonder how the author can ever hope to do all the sub-topics justice.  This is one such book.  Among the topics the authors hope to cover:

- New features of Java 7
- New I/O
- JVM internals
- Dependency Injection
- Performance Tuning
- Polyglot programming, including introductions to Groovy, Scala and Clojure (beyond 'Hello World')
- Test Driven Development (JUnit, stubs and mocks)
- Continuous Integration with Jenkins, Maven and FindBugs
- Staying current in the Java landscape

All this in around 500 pages. 

If you're like me, you probably have a decent understanding of some of those topics, probably haven't had time to get to the others, and maybe even see a few spots you think you may not need.  (That was my feel after looking over the table of contents, but it changed as I read the book.)

Here's the nice part of this book:  for those spots I didn't think I'd need (e.g. web development in Clojure) the authors do a nice job of explaining why I might someday want to consider it.  The reader is often urged to make well-educated decisions about how to go about developing on the JVM.  Need to write a web application?  You are given a nice set of criteria for deciding which framework you might use, and some ideas around the topic.  Wondering how other folks go about performance tuning?  That chapter explains some 'whys' not just 'hows' as well.  I'd consider this a book for curious Java programmers, written in a way that urges methodical adoption of helpful technologies and techniques.

All things considered, I liked this book and would recommend it for anyone programming on the JVM.  It contains tips and techniques seldom seen in the junior-to-mid-level programming ranks, and covers a lot of ground in describing the contemporary Java landscape.  It won't make you an expert in any of the varied sub-topics, but it will comfortably make you aware of them and give you enough of a headstart to get you off the ground.

For anyone coding Java for a living, I'd suggest putting this book on your short list of books to have a look at.

"The Well Grounded Java Developer", currently available from Manning in early-access MEAP form, will be published later this month.

Happy Reading, Coding, Testing and Deploying!

Saturday, April 21, 2012

Book Review: Apache Maven 3 Cookbook




Are you a long-time Ant fan being drawn into the world of Maven?  I am.  I like Ant and its straight-line, totally transparent nature.  But I'm also pragmatic, and I think I can see the tide has turned.  Maven is now being used (almost required!) by a good many projects that I use daily.  It's high time I got on board.  So I picked up a copy of "Apache Maven 3 Cookbook" and started reading.

Like all Packt Cookbooks, this book follows a predictable format.  These books are meant to guide the reader directly through commonly encountered tasks.  You find an article title (e.g. "Integrating Scala development with Maven"), and under this heading you'll find a little text then the sections "Getting Ready", "How to Do It", and "How It Works".  There isn't a lot of text spent explaining theory or history; it's mostly just how to accomplish a particular task.

The book starts out with the basics of Maven, which was useful for me.  Some of what I found in the first chapter I knew from previous dealings with Maven, some I thought I knew but wasn't sure, and some I hadn't seen before.  There are totally simple examples of how to set up Maven on various platforms.

The next few chapters cover the core of Maven's use cases.  Software engineering practices (complete with automated unit tests, code coverage reports, etc.) are explained here, as are the uses of Maven's dependency management system.  For those who are totally new to Maven, dependency management-- the automatic downloading and inclusion of libraries your project needs-- is probably the single best feature of Maven.

Hudson integration and various reports that can be generated are next.  The reports include JavaDocs, code coverage, and code quality, among others.

Some common Java development scenarios are covered next.  These include web applications, JEE apps, Spring, Hibernate and Seam.  Mostly what you are shown is use of an optimal archetype for each of these, then the expected directory structure after the project is generated.  There's also a little useful text about how to go about developing further in the chosen application type after that.

Chapter 6 is devoted to Google development with Maven.  Topics include Android development, GWT, and Google App Engine.

Chapter 7 explains Maven usage with Scala, Groovy, and Flex.

Chapter 8 explains using Maven with an IDE.  Eclipse, Netbeans, and IntelliJ are explained.

Finally, you are told how to extend Maven by making and documenting your own Maven plugins.

So, what's the final verdict?  This book was useful for me, as it explained many things about Maven I didn't previously know.  The book is formatted in such a way that it's task oriented, so it's a more comfortable read if you're coding as you're reading.  If you develop in some of the many use cases described above, you'll find some value in this book.

The book can be found here.

Happy assisted Software Engineering!

Thursday, March 29, 2012

Book Review for "Microsoft Windows Intune 2.0: Quickstart Administration"




Are you a Windows administrator? If so, you may be curious about Intune, Microsoft's browser-enabled tool for managing groups of PCs. This book explains how you can use Intune to set up and manage groups of PCs, including software installation, migration, and maintenance.

The first chapter explains basic cloud concepts. Most of today's current buzzwords are explained from a high-level view.

Chapter 2 covers PC management, including management via policies. This is a little closer to the core material for the book, but still not quite down to the hands-on level.

Chapter 3 starts with Intune capabilities. This chapter is really an overview of what will be covered in more detail later in the book. Roughly, these topics include the following: Installation of Intune, the Management console, Security management, Auditing, Reporting, and Alerts and Support.

Chapter 4 is a deep dive on the Installation process.

Chapter 5 is all about configuration. It includes sections on addition of administrators, configuring groups, alerts, and license management.

Chapter 6 covers policies and updates. Policies have a sophisticated hierarchy, so you can group PCs by different criteria. Firewalls and anti-malware are explained here.

Chapter 7 explains how you can prepare software to be pushed to your client PCs, and how you can also remove software from the client PCs via Intune.

Chapter 8 is about Reporting and Alerts. Intune ships with standard reports, or you are able to write your own if you wish. Reports can help you do things like figure out which of your managed PCs have older hardware, or are running particular versions of software. Alerts are notifications that something has gone wrong, like malware being detected. Obviously, these are important.

Chapter 9 deals with responding to Alerts and ways you can provide support for the remote client PCs.

Chapter 10 explains the Diagnostics and Recovery Toolset (DaRT), which is the most capable support mechanism. This is the one you will end up using if a user has a blue screen of death or some other serious flaw that can't be resolved through simple software pushes. You can unlock passwords and un-install hot fixes that you think may have done more harm than good. One notable item: If you go into deep recovery mode, you're going to need an actual user at the keyboard of the affected PC. Microsoft hopes to make this completely remotable in future iterations.

Chapter 11 gives coverage of Windows 7, which is the current target O/S for Intune. It includes a nice section that tells you how you can back up user preferences and data from the current O/S (e.g. XP) and how to re-apply them when the machine has been upgraded to Windows 7.

Chapter 12 explains how Intune relates to other Microsoft products you may be working with. Most of these products will be completely compatible, some will be partially so, and some not at all. It also explains which other Microsoft products you may be using for PC management, and how they may relate to Intune.

Overall, the book is a very easy read with a generous allotment of screenshots.

Who should read this book? Anyone who manages groups of Microsoft PCs which will be using Windows 7. Given that scenario, Intune looks like it would make administration from any browser-enabled PC possible. If that's your toolset, you probably ought to look into Intune and this book.

The book can be found here.

Happy Cloud-based Administering!

Saturday, March 24, 2012

Avoid this Fundamental Mistake, Simply Illustrated



An easy-to-make mistake in planning your Enterprise application can cost you big time. (Some folks even repeat this error after suffering through it once, because they don't understand why they failed the first time.) Here are two pictures that show you the right way to do it, and the wrong way as well.

One of the best things about my job as a SOA support engineer is that I get to see lots of application architectures, shared by smart people from all over the world. I probably learned more in the first year of this job than I did in my best 5 of any other, because I got to benefit not just from my own mistakes, but also from the mistakes of others. Nobody is right all the time, and anybody who steps out to design something complicated is putting themselves at risk of learning some good lessons. Here's one that's shown up time and again.

It's natural to read about layered architectures and think this means you can have different machines that serve different purposes. It seems natural to think of a bank of servers that act as 'business logic' machines, perhaps sending data over the wire to a different bunch of machines at the 'DAO layer'. This is wrong, don't do it! A variation of this theme is to envision a couple of machines that 'concentrate' on a task like being a messaging broker. Same idea-- let one server concentrate on one task-- this is not good!

A better way of doing it is to use your whole middleware stack on EACH server. (That's right-- let that server act as GUI, mid-tier, Data layer, JMS broker, etc.) Manage load by using a load balancer up front. This is much easier-- once you have the configuration put together, replicate it as many times as needed to scale up (assuming your middleware offers this capability. In JEE, it's pervasive.) If you have stateful clients, you'll want to use replication so you can failover. If your cluster goes beyond a couple of machines, you'll want to use 'Buddy Replication' schemes, where only 2 or 3 machines replicate state for each other, rather than every machine trying to carry state for the rest of the cluster.
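
To make the 'Buddy Replication' idea concrete, here's a minimal sketch of one way buddies could be assigned.  It's only an illustration under my own assumptions: the nodes are treated as a ring and each node's buddies are simply the next few nodes around it.  The class name BuddyReplicationSketch and the method buddiesOf are hypothetical; real middleware has its own buddy-selection policies and configuration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class BuddyReplicationSketch {

    // Returns the buddies for the node at position nodeIndex: the next
    // buddyCount nodes around the ring, wrapping at the end of the list.
    static List<String> buddiesOf(List<String> nodes, int nodeIndex, int buddyCount) {
        List<String> buddies = new ArrayList<>();
        // A node is never its own buddy, so cap at (cluster size - 1).
        int limit = Math.min(buddyCount, nodes.size() - 1);
        for (int step = 1; step <= limit; step++) {
            buddies.add(nodes.get((nodeIndex + step) % nodes.size()));
        }
        return buddies;
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("node1", "node2", "node3", "node4", "node5");
        for (int i = 0; i < nodes.size(); i++) {
            System.out.println(nodes.get(i) + " replicates session state to "
                    + buddiesOf(nodes, i, 2));
        }
    }
}
```

The point to notice is that each server copies its state to a fixed, small number of peers, so the replication traffic per node stays flat as you add more identical servers behind the load balancer.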

Check the diagrams. This isn't a new idea-- I know I've heard this from Martin Fowler, read it in 'JBoss in Action', and seen it in print elsewhere too. But still, the idea is perhaps not the 'natural' decision to make, so people who haven't had it called out to them may make a big error in a very expensive place. Please think hard before adopting a "Distributed and Layered" architecture!

Happy Architecting!


Saturday, March 10, 2012

Book Review: Hadoop In Practice MEAP update

Manning adds more content to the latest Hadoop book. Real-world users will benefit.

Manning offers books that are in "MEAP", which is a way for readers to peek at books as they are developed. As the book is written, Manning will periodically take what they have "so far" and make it available electronically. They've recently offered a few more chapters of "Hadoop in Practice"; here's what they contain.

There are some minor changes to earlier content, but the biggest change is new content. Here are the new chapters:
Chapter 2, "Moving Data in and out of Hadoop"
Chapter 4, "Applying MapReduce Patterns to Big Data"
Chapter 7, "Utilizing Data Structures and Algorithms"

Chapter 2 deals with moving data into and out of Hadoop. It provides techniques for working with flat files, databases, and HBase. You're introduced to some tools that can help you with these tasks and ancillary needs like translating and aggregating the data. You're given ideas on how to push data from external sources to HDFS, and how to pull data from external sources directly into the MapReduce framework. You're also given an introduction to a scheduler that can help you repeat these tasks on a periodic basis, sure to be a production concern. All the examples contain instructions on how to obtain the helper components, how to build them if necessary, and how to configure and run them.

Chapter 4 provides suggestions to help optimize Big Data operations in MapReduce like joining, sorting, and partitioning.
"Joins" are familiar to most programmers, necessary to combine data from 2 different sources based on some specified criteria. You are given ideas on how to best handle Inner and Outer joins. You are given ideas on how you might do your joining on the Map side or the Reduce side, and when each idea is appropriate. You are also given some expert insights into how Secondary Sorting works, and how the MapReduce framework interacts with your Map and Reduce functions at this point in the life cycle.
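
If a Reduce-side join is new to you, the idea can be sketched in plain Java without any Hadoop classes.  This is my own simplified stand-in, not code from the book: the class name ReduceSideJoinSketch, the "U"/"O" source tags, and the user/order data are all hypothetical, and a TreeMap plays the part of the shuffle that groups records by key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class ReduceSideJoinSketch {

    // Inner join of users (id, name) with orders (userId, item) on the id.
    static List<String> innerJoin(List<String[]> users, List<String[]> orders) {
        // "Map + shuffle" stand-in: tag each record with its source and
        // group the tagged records by join key.
        Map<String, List<String[]>> grouped = new TreeMap<>();
        for (String[] u : users) {
            grouped.computeIfAbsent(u[0], k -> new ArrayList<>()).add(new String[] {"U", u[1]});
        }
        for (String[] o : orders) {
            grouped.computeIfAbsent(o[0], k -> new ArrayList<>()).add(new String[] {"O", o[1]});
        }

        // "Reduce" stand-in: for each key, split the records by tag and
        // pair every user name with every order item.
        List<String> joined = new ArrayList<>();
        for (List<String[]> records : grouped.values()) {
            List<String> names = new ArrayList<>();
            List<String> items = new ArrayList<>();
            for (String[] r : records) {
                (r[0].equals("U") ? names : items).add(r[1]);
            }
            for (String name : names) {
                for (String item : items) {
                    joined.add(name + " bought " + item);
                }
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        List<String[]> users = new ArrayList<>();
        users.add(new String[] {"1", "ann"});
        users.add(new String[] {"2", "bob"});
        List<String[]> orders = new ArrayList<>();
        orders.add(new String[] {"1", "book"});
        orders.add(new String[] {"3", "lamp"});
        System.out.println(innerJoin(users, orders));
    }
}
```

Keys that appear on only one side produce no output, which is what makes this an Inner join.  An Outer join would emit those records too, with a placeholder for the missing side.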

Chapter 7 examines algorithms and presents some valuable patterns and algorithms you can apply to your big data problems. Graph theory is used to conceptualize problems like 'Shortest Distance', where you try to calculate the fastest way to traverse a graph of nodes. Other problems deal with things like determining which node is best associated with another, the famous PageRank algorithm, and use of Bloom filters.
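
A Bloom filter is easier to appreciate with a few lines of code.  The sketch below is my own minimal illustration in plain Java, not code from the book; the class name BloomFilterSketch, the bit-array size, and the double-hashing scheme built on String.hashCode() are all assumptions chosen for brevity.

```java
import java.util.BitSet;

final class BloomFilterSketch {
    private final BitSet bits;
    private final int size;      // number of bits in the filter (must be > 0)
    private final int hashCount; // number of bit positions set per item

    BloomFilterSketch(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    // Derives the i-th bit position from two base hashes (double hashing).
    private int position(String item, int i) {
        int h1 = item.hashCode();
        int h2 = (h1 >>> 16) | 1; // forced odd so successive probes differ
        return Math.floorMod(h1 + i * h2, size);
    }

    void add(String item) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(position(item, i));
        }
    }

    // false means "definitely never added"; true means "probably added".
    boolean mightContain(String item) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(position(item, i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        BloomFilterSketch seen = new BloomFilterSketch(1024, 3);
        seen.add("node-17");
        System.out.println(seen.mightContain("node-17")); // always true once added
        System.out.println(seen.mightContain("node-99")); // usually false, but false positives are possible
    }
}
```

The filter never forgets an item it has been given, but it can occasionally claim to have seen one it hasn't.  In exchange it needs only a fixed block of bits, which is what makes it attractive for pre-filtering very large data sets.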

In my opinion, these latest chapters add positive value to the book, and I think it's shaping up nicely. The concepts presented reveal expert insight into real-world problems a Hadoop user would encounter. If you are a Hadoop user, you owe it to yourself to check this one out.

Happy Hadooping!



Saturday, March 3, 2012

Book Review for "Do More with SOA Integration"



The first notable thing about this book is that it's big, nearly 700 pages. It is really a conglomeration of sections from several other books, so the scope of what is covered is wide. Like SOA itself, the book is not a cohesive whole that offers a shrink-wrapped solution to your problems. What this book does is offer concepts, considerations, and implementation advice for a variety of SOA needs.

The first few chapters are high level overviews. They provide a historical introduction to SOA, going back to the roots in EAI and before. Much of this will not be important for a contemporary architect or implementor, but it will help the reader to understand the past technologies used for Integration and how we got to where we are today.

The third chapter has some relevant material, and introduces some very important concepts. Among these are Transactions, OSGi, JCA, SCA and Process Modeling. These are all topics that are very much relevant to contemporary SOA, so the authors did well to include them. They also included some material that didn't quite pan out as well as it was envisioned (e.g. JBI), so it's a slightly mixed bag. Like with any book on a fast moving topic, you'll want to consult the contemporary workspace as you read the book, to understand which parts are going to be important to you.

Chapter 4 is about XML processing, and it has some really good content. (You can't touch SOA without getting XML all over you.) There are some expert recommendations on schema design, namespaces, message construction, etc. SOA architects should find some good ideas here.

The next chapter introduces BPEL, the standard way of hooking web service invocations together. As with the rest of the book, the material is presented using tools from the Oracle tool chest. Throughout the book, the reader should keep this in mind because users in Oracle shops will find some ideas directly applicable, where users of other technology stacks may have different mechanisms for accomplishing similar tasks. Fortunately, BPEL is fairly standard, so the terms we find in this chapter will be applicable to many users.

There are sections that are very much particular to certain products or vendor components. (For instance, there's a sizable piece on PeopleSoft CRM and Oracle Applications, integrated via BPEL.) This section can serve as a generic case study for users of other products, but some of the material is necessarily specific to the chosen products.

JBI and its inner workings are given good coverage in a couple of inside chapters. Once thought to be an emerging standard for ESBs, it now appears Integration products are growing in different directions, so these chapters may not be of interest to you depending on your chosen toolkit. Still, the material is well written and will be of use to some users.

Chapter 9 is about Web Services, a workhorse component of SOA. It includes a small catalog of "Runtime Patterns". I'm not quite sure what to think about the patterns yet-- we probably don't have enough well-agreed best practices documented in SOA, so this might be a step in the right direction. Some of it seemed a little too general to be useful for me, but perhaps in time I'll see better what was intended here.

Chapter 10 hits heavily on the Enterprise Service Bus (ESB). ESBs are solid winners in the marketplace, and have proven themselves to be an important component of SOA. This chapter is a big one, and the coverage is good. I found particular value in the sections on ESB operations, XSLT processing, distributed transactions, and WS-Standards.

Chapter 11 is about loosely coupling services and working to make services stateless. Some interesting ideas are presented, and implementation details about how to achieve the desired statelessness if you are using Oracle's tools.

Chapter 12 covers BPEL, as used with Oracle's suite.

The final chapters focus on 'legacy' integration, especially mainframe integration. The focus is especially sharp on IBM mainframes and the surrounding environment. If you are tasked with working in this space, there is material of value to you here.

So, what's the final verdict? I would recommend this book to:

- Anyone integrating with an IBM mainframe, using Oracle's tools
- Users of Oracle's SOA Suite
- General SOA readers who have a desire to round out basic knowledge. This book is not a good way to get off the ground, though.

The book can be found here.

Happy Integrating!

Wednesday, February 22, 2012

Book Review for MEAP "Hadoop in Practice"


Big Data is a hot topic, and a fast moving one as well. In that workspace, Hadoop is a big player. This early access edition shows the book's sweet spot: the areas other books have missed.

Hadoop, mature enough to have been recognized in mainstream media, is still fast moving. Like any framework, it constrains its users to fairly rigid usage patterns-- but the users are finding ways around these. This book introduces you to some of these, opening up uses of Hadoop that would otherwise be out of bounds for you.

The book is a MEAP edition, which means it contains less than the full content. In this case, it means the book contains just a few chapters, good for 176 pages. The table of contents promises about a dozen more chapters and a few more appendices, so there is potential this could be a big book when it's done. But time will tell how those final chapters take shape; the ones that are present are rich enough that a little consolidation wouldn't be surprising.

The first chapter introduces the basics of Hadoop, and includes some excellent diagrams. Pictures can often bring clarity that words don't, and I really like to see plain and simple pictures to help me grasp the big picture. This book does well in this regard. Besides the overview, we get a quick glimpse of related and complementary technologies, restrictions of using Hadoop and alternatives to using Hadoop. Versions of Hadoop are covered in two dimensions: the various distributions, and what's contained in forward-looking iterations. The chapter wraps up with a brief section on installing and configuring Hadoop for a first run.

The next chapter we're given is chapter 3, which covers data serialization tools and techniques. More good pictures are found in this chapter, as are explanations of how you can use Hadoop to process XML, JSON, Google's Protocol Buffers, and Facebook's Thrift. Each of these gets its own section, explaining how you might use it. There are plenty of references to Elephant Bird, an open source project maintained by Twitter. You also learn how to handle custom file formats if you need them.

The final chapter of this early access book is on HDFS tuning techniques. The author tells you why Hadoop is not well suited to processing loads of small files, and how you can get around this limitation. The chapter also covers choosing the best compression and codec for your particular needs. When working with large amounts of data, choosing the right tools for compression can make huge differences in performance, so the contents of this chapter should be of high interest to those who are heading for production environments.

So, what's the verdict? I found the book's contents to be of high value and reflective of real-world knowledge that Hadoop users will require. I don't think the book is suitable as a sole resource for new users of Hadoop. (If that's your case, I'd suggest buying two books-- one to learn the basics, then this one for when you've gone past the newbie phase.) The book is fairly raw-- the chapters seem a little thrown together in places, and the content is short of what the table of contents promises. But you'll get updates with your MEAP purchase, and you can have some valuable content now. All things considered, I'd recommend this book for Hadoop users who are beyond the initial learning phases.

The book can be found here.

Happy Hadooping!

Saturday, February 11, 2012

Powerful Java Hacking made easy - use good judgement!


A very powerful hack - use with caution!

Here's a very quick tutorial on how you can use Byteman to change a running application on the fly. You can insert arbitrary code into a running application server. Caveat CodeMonkeytor!

Let's say we have a JEE 6 application that's misbehaving. You can easily attach Byteman to the app server, then inject the code you want-- without bringing down the server!

I like to use 2 scripts with Byteman, one to attach to the server (since this should be done only once), and one to check and install the Byteman rules I want to run. This way, you can incrementally add to your Byteman rules.

There's an example of a misbehaving Servlet at the bottom of this post. It's bad code. Let's pick on a single fault. The user is asked to provide two numbers, and the servlet is supposed to divide the first by the second and provide the result. But if the user enters a zero for the second argument, a divide-by-zero exception results! Please install the servlet and try it for yourself.
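
If you'd like to see the fault in isolation before deploying anything, here's a minimal sketch. The class name DivideByZeroDemo and the describe helper are made up for this illustration; only makeDivision mirrors the method in the servlet further down. Integer division by zero in Java throws an ArithmeticException, and since the servlet only catches IOException, nothing in the servlet stops it.

```java
// Minimal sketch of the fault, outside any app server.
// DivideByZeroDemo and describe() are hypothetical names for this example.
class DivideByZeroDemo {

    // Same shape as the servlet's makeDivision: plain integer division.
    static int makeDivision(int first, int second) {
        return first / second;
    }

    // Reports either the result or the exception type, so the failure is visible.
    static String describe(int first, int second) {
        try {
            return first + " divided by " + second + " equals "
                    + makeDivision(first, second);
        } catch (ArithmeticException e) {
            return "ArithmeticException: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(10, 2)); // normal case
        System.out.println(describe(10, 0)); // the fault the rule works around
    }
}
```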

Making JEE apps is trivial now that JEE 6 is here. I'd recommend using JBoss AS 7, and building with Maven. (I used to hate Maven, but since I've started using it, I hate it a little less. Still, it seems better than Ant in easy cases.) So please find the Maven pom.xml at the bottom of this post, with the Servlet code.

But this is a post about Byteman, so I'm going to put that code first. Here are the scripts:

# InstallByteman.sh - run this only once, after JBoss is already running. It doesn't matter if JBoss has been running 7 months or 7 seconds.

# There should be 5 lines to this script-- in case your browser line-wraps.....
#!/bin/bash
export JBOSS_HOME=/home/rick/Tools/JBoss/AS7/jboss-as-7.1.0.Beta1b
export BYTEMAN_HOME=/home/rick/Tools/Byteman/byteman-2.0
export BYTEMAN_BIN=$BYTEMAN_HOME/bin
$BYTEMAN_BIN/bminstall.sh -b -Dorg.jboss.byteman.transform.all $JBOSS_HOME/jboss-modules.jar

# InstallRules.sh - This script validates and installs your Rules. You can run this as many times as you need, to incrementally build your rules up.

# There should be 10 lines to this script-- in case your browser line-wraps.....

#!/bin/bash
export JBOSS_HOME=/home/rick/Tools/JBoss/AS7/jboss-as-7.1.0.Beta1b
export BYTEMAN_HOME=/home/rick/Tools/Byteman/byteman-2.0
export BYTEMAN_BIN=$BYTEMAN_HOME/bin
# Export this so Byteman can validate your classes. This is wherever you have compiled your classes to.
export APP_TGT=/home/rick/Blog_Temp/Byteman2/simpleappservlet30/target/classes
# check it
$BYTEMAN_BIN/bmcheck.sh -cp $APP_TGT HackFix.btm
# add the rule
$BYTEMAN_BIN/bmsubmit.sh HackFix.btm

So we can see in the above Install script that we have a Byteman rule named 'HackFix.btm'. Here it is.

(BTW, what this rule does is change the second argument passed to the method 'makeDivision' into a 1 if it is a zero, to prevent the divide-by-zero.)

RULE Hack Fix for Divide Servlet
CLASS com.flyingdog.SimpleServlet
METHOD makeDivision
AT ENTRY
IF 0 == $2
DO $2 = 1;
ENDRULE
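
To make the rule's effect concrete, here's a rough Java equivalent. This is only an illustration: Byteman changes the bytecode of the running class, it doesn't touch your source, and the class name HackFixEquivalent is made up for this sketch.

```java
// Rough Java equivalent of what the HackFix rule does to makeDivision.
// Illustration only: HackFixEquivalent is a hypothetical class name, and
// Byteman injects this behavior at runtime rather than editing source.
class HackFixEquivalent {

    static int makeDivision(int first, int second) {
        // AT ENTRY, IF 0 == $2, DO $2 = 1
        if (0 == second) {
            second = 1;
        }
        return first / second;
    }

    public static void main(String[] args) {
        System.out.println(makeDivision(10, 2)); // normal case is unchanged
        System.out.println(makeDivision(10, 0)); // divisor is quietly replaced by 1
    }
}
```

Note that this only stops the blowup. The servlet builds its message from the original request strings, so with the rule installed a request for 10 and 0 reports "10 divided by 0 equals 10". That's a hack, not a fix.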

--------------------------------------------------------------------------------------------

So that's all the Byteman stuff above. You can use those samples to attach to a running application server (ANY application server, not just JBoss) and inject whatever arbitrary code you want. Cool (and dangerous), huh?




Ok, if you're like me, you'd love a quick way to try that out. So here's the pom.xml for the SimpleServlet...

<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.flyingdog</groupId>
  <artifactId>notsogoodapp</artifactId>
  <packaging>war</packaging>
  <version>1.0</version>
  <name>notsogoodapp</name>
  <url>http://maven.apache.org</url>
  <build>
    <finalName>notsogoodapp</finalName>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-war-plugin</artifactId>
        <version>2.1-beta-1</version>
        <configuration>
          <failOnMissingWebXml>false</failOnMissingWebXml>
        </configuration>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <configuration>
          <source>1.6</source>
          <target>1.6</target>
        </configuration>
      </plugin>
    </plugins>
  </build>
  <repositories>
    <repository>
      <id>java.net</id>
      <url>http://download.java.net/maven/2</url>
    </repository>
  </repositories>
  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>javax</groupId>
      <artifactId>javaee-api</artifactId>
      <version>6.0</version>
      <scope>provided</scope>
    </dependency>
  </dependencies>
</project>

Now the bad Servlet. Install it and have it divide a few numbers. In particular, give it a zero to divide by, and notice the ugly blowup! Then use the above Byteman scripts to 'fix' the problem.


package com.flyingdog;

import java.io.IOException;
import java.io.PrintWriter;
import java.lang.StringBuffer;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

@WebServlet(urlPatterns = {"/simpleservlet", "*.foo"})
public class SimpleServlet extends HttpServlet {

    @Override
    protected void doPost(HttpServletRequest request,
            HttpServletResponse response) {
        doGet(request, response);
    }

    @Override
    protected void doGet(HttpServletRequest request,
            HttpServletResponse response) {
        try {
            response.setContentType("text/html");
            PrintWriter printWriter = response.getWriter();
            printWriter.println("<h2>");
            printWriter.println("</h2>");
            // if there is a value there, try to provide an answer
            if (null != request.getParameter("firstValue")){
                printWriter.println(makeAnswer(request));
            }
            // print the form that requests numbers
            printWriter.println(makeForm());
        } catch (IOException ioException) {
            ioException.printStackTrace();
        }
    }

    private String makeAnswer(HttpServletRequest request){
        String first = request.getParameter("firstValue");
        String second = request.getParameter("secondValue");
        int iFirst = Integer.parseInt(first);
        int iSecond = Integer.parseInt(second);
        int answer = makeDivision(iFirst, iSecond);
        String response = first + " divided by " + second + " equals " + answer;
        return response;
    }

    private int makeDivision(int first, int second){
        return first / second;
    }

    private String makeForm(){
        StringBuffer sb = new StringBuffer();
        sb.append("<form method=\"post\" action=\"simpleservlet\">\n");
        sb.append("<table cellpadding=\"0\" cellspacing=\"0\" border=\"0\">\n");
        sb.append(" <tr>\n");
        sb.append(" <td>Please enter two numbers to divide:</td>\n");
        sb.append(" <td><input type=\"text\" name=\"firstValue\" /></td>\n");
        sb.append(" <td><input type=\"text\" name=\"secondValue\" /></td>\n");
        sb.append(" </tr>\n");
        sb.append(" <tr>\n");
        sb.append(" <td></td>\n");
        sb.append(" <td><input type=\"submit\" value=\"Submit\"></td>\n");
        sb.append(" </tr>\n");
        sb.append("</table>\n");
        sb.append("</form>\n");
        return sb.toString();
    }
}


So there you have it. To recap:
  1. Use the pom.xml and the Servlet code to make the .war
  2. Deploy the .war to a JEE 6 app server, like JBoss AS 7.
  3. Access the servlet at http://localhost:8080/notsogoodapp/simpleservlet
  4. Observe how it divides two numbers. Let it try to divide by zero, see blowup.
  5. Run the Byteman scripts, see how blowup stops.
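
If you'd rather skip the form for steps 3 and 4, the servlet's doPost just delegates to doGet, so the two values can also go in the query string. Here's a small sketch that builds such a URL. The class ServletUrlBuilder and its method are hypothetical helpers for this post; the host, port, and context path are the ones from step 3, so adjust them if your server differs.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: builds a GET URL that exercises the servlet directly.
// Parameter names firstValue and secondValue come from the servlet's form.
class ServletUrlBuilder {

    static String divisionUrl(String baseUrl, String first, String second) {
        try {
            return baseUrl
                    + "?firstValue=" + URLEncoder.encode(first, "UTF-8")
                    + "&secondValue=" + URLEncoder.encode(second, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a required charset on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String base = "http://localhost:8080/notsogoodapp/simpleservlet";
        System.out.println(divisionUrl(base, "10", "2")); // a division that works
        System.out.println(divisionUrl(base, "10", "0")); // the divide-by-zero case
    }
}
```

Paste the printed URLs into a browser before and after running the Byteman scripts to compare the behavior.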

Happy Hacking!


Tuesday, January 10, 2012

Win the JBoss AS 7 Book!



Now revised! No email necessary, we'll contact you after winners are selected!


Win Free Copies of JBoss AS 7 Configuration, Deployment and Administration book

Hello, Blog Readers! To celebrate the release of their new book, JBoss AS 7 Configuration, Deployment and Administration, Packt Publishing is organizing a giveaway especially for you: three lucky winners will each receive a copy of the book. Keep reading to find out how you can be one of them.

Overview of JBoss AS 7 Configuration, Deployment and Administration book


* Covers all JBoss AS 7 administration topics in a concise, practical, and understandable manner, along with detailed explanations and lots of screenshots
* Uncover the advanced features of JBoss AS, including High Availability and clustering, integration with other frameworks, and creating complex AS domain configurations

Read more about this book and download free Sample Chapter: Sample Chapter

How to Enter?
All you need to do is head on over to the book page, look through the product description, and drop a line in the comments below to let us know what interests you the most about this book. It’s that simple.

Product description for JBoss book: http://link.packtpub.com/caa3AV#in_detail

Winners from the U.S. and Europe can either choose a physical copy of the book or the eBook. Users from other locales are limited to the eBook only.

Deadline

The contest will close on January 31, 2012

If you'd like to include your email in the post, we'll use that to inform you if you win a copy. If you'd rather not leave an email, we'll try to contact you through your post identity or another blog post seeking you out.

Good luck!