When creating tooling for distributed computing, one should not hide the “hard” parts of replication, state, and client interactions.
Hiding can mean hindering. We should make componentry available when needed, and keep it out of the way otherwise. But this is where things get tricky: we must provide an interface that hides the implementation yet gives the developer the control she needs. That, in turn, is a “hard” problem of API design.
Performance increases gained by using more cores may plateau in the next few years if power constraints remain constant.
Without hardware innovation, efficient algorithms and techniques will become increasingly necessary simply because work cannot be offloaded to yet another core. Developers must keep up with algorithm research as well as remain knowledgeable of the basics.
Let’s take a look at a few ways to handle collaborators in Scala. To make the discussion concrete, consider two classes, CollaboratorOne and CollaboratorTwo. CollaboratorTwo has a dependency on CollaboratorOne for obtaining a number that is then used in the creation of a string.
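Since the original listing isn’t shown here, a minimal sketch of the first collaborator (the member name `number` and its value are assumptions):

```scala
// A hypothetical CollaboratorOne: its only job is to supply a number.
class CollaboratorOne {
  def number: Int = 42
}
```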
How do we handle this?
Technique One: Brute force
This is probably the simplest way of handling things. Its virtue is that it is easy to understand: it is a simple implementation with no surprises.
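A minimal sketch of the brute-force version, assuming a `number` member on CollaboratorOne and a `makeString` method on CollaboratorTwo (both names invented):

```scala
class CollaboratorOne {
  def number: Int = 42
}

// Brute force: CollaboratorTwo constructs its own dependency.
class CollaboratorTwo {
  private val one = new CollaboratorOne
  def makeString: String = s"the number is ${one.number}"
}
```

Because `one` is hard-wired, a test of CollaboratorTwo always exercises the real CollaboratorOne.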
On the negative side, testing CollaboratorTwo might be a bit trickier. Nothing terrible, but perhaps a little more work than with one of our other techniques.
Technique Two: POCI (Plain Ol’ Constructor Injection)
Another option is to build CollaboratorTwo with CollaboratorOne:
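A sketch, again assuming a `number` member and a `makeString` method (names invented):

```scala
class CollaboratorOne {
  def number: Int = 42
}

// Constructor injection: the dependency is passed in, so callers
// (and tests) can substitute any flavor of CollaboratorOne.
class CollaboratorTwo(one: CollaboratorOne) {
  def makeString: String = s"the number is ${one.number}"
}
```

A test can stub the dependency for a consistent result, e.g. `new CollaboratorTwo(new CollaboratorOne { override def number: Int = 7 })`.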
This technique allows us to change CollaboratorTwo’s behavior by injecting some flavor of CollaboratorOne. If CollaboratorOne’s implementation is subject to change, this is a good way to go. Testing is easy in this case: we can check that CollaboratorTwo delegates to CollaboratorOne or we can stub CollaboratorOne so we get a consistent result.
On the downside, we are adding things to our API. As such, we have to keep our API’s clients in mind when making changes. Depending on the situation, additional code might be necessary, such as a public default constructor and a visibility modifier on the constructor that takes arguments.
Technique Three: ACAB (Abstract Class And Block)
In this case, we replace CollaboratorOne with a block, and the “collaboration” is moved from the declaration to the instantiation. This provides a ton of flexibility. There is no need to declare a subclass of CollaboratorOne in order to use CollaboratorTwo; just fill in the blanks when it comes time to create a CollaboratorTwo. For that reason, testing can be pretty easy, too.
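A sketch of the technique, with an abstract member standing in for CollaboratorOne (member names invented):

```scala
// Abstract class and block: the collaborator is reduced to an
// abstract member, filled in with a block at instantiation time.
abstract class CollaboratorTwo {
  def number: Int // what CollaboratorOne used to provide
  def makeString: String = s"the number is $number"
}

// "Fill in the blanks" when creating an instance.
val two = new CollaboratorTwo {
  def number: Int = 42
}
```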
One negative aspect of this approach is that an abstract CollaboratorTwo implies consumption (somewhere) of some type of CollaboratorTwo independent of the implementation. Being abstract, it is a point of extensibility when maybe it shouldn’t be.
Technique Four: ACAT (Abstract Class And Trait)
This approach is very similar to the ACAB approach, but it uses a trait instead of a block. Thus, the same positive and negative attributes with respect to CollaboratorTwo apply. Using a trait instead of blocks reduces code duplication and codifies the varieties of CollaboratorOne used.
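A sketch, with the trait (here called CollaboratorTwoTrait) codifying one source for the number; other names are invented:

```scala
class CollaboratorOne {
  def number: Int = 42
}

abstract class CollaboratorTwo {
  def number: Int
  def makeString: String = s"the number is $number"
}

// The trait codifies one variety of "where the number comes from".
trait CollaboratorTwoTrait {
  def number: Int = new CollaboratorOne().number
}

// Mix the trait in at instantiation time.
val two = new CollaboratorTwo with CollaboratorTwoTrait
```

Note that `two` is both a CollaboratorTwo and a CollaboratorTwoTrait, which is the exposure issue mentioned below.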
An issue unique to this technique is that the instantiated CollaboratorTwo is also a CollaboratorTwoTrait, and you may not want that exposed to users of your API.
We have a number of options when creating collaborating classes. In addition to what’s outlined above, we can create our collaborators by mixing and matching flavors (e.g., Abstract Trait And Block). What’s important, though, is deciding what you want as an extension point and what you want to expose as part of your API.
Being an unfrozen caveman coder, I tend to go with Plain Ol’ Constructor Injection. It makes extension and composition (in the OO sense) simple and doesn’t expose much other than the collaborators themselves.
Last year, I was going to attempt to average one post a week (FAIL!). I also listed out a few other things that I wanted to attempt. Some goals were met; some were not. A quick review:
1 post a week (FAIL)
And maybe this Comet, too (FAIL)
Amazon EC2 (FAIL)
Google App Engine (FAIL)
7 Languages In 7 Weeks (FAIL)
1 major conference and at least 8 local users groups (YES! and FAIL, respectively)
Lots of fails there. But I don’t feel too bad, because I took a new job in June that has allowed me to work on some new technologies similar to those listed above. I’ll probably talk about some of this stuff in later posts.
In the meantime, there are a number of things I want to do in the new year. Once again, I am putting this on the record to try to make sure that I actually do it. Here’s my list:
1 post every two weeks
Learn me a Haskell
Google App Engine
Update the Selenium related stuff on this site
2 books on algorithms and data structures
1 major conference and at least 4 local users groups
And, yes, this counts as a blog post. Just like last year.
I used to believe that it was my job to get the customer the most functionality as quickly as possible. Rough edges didn’t matter.
If the customer could live with the strange workflow, a quirky UI, or some other abnormality, so could I. It was more important to get the usable code out there. We’ll smooth the edges out later. Which is fine if it actually happens. But, it never really seems to work out that way.
So, nowadays I am not so sure. Even if the customer wants the functionality, I wonder if we do them a disservice by giving it to them. Perhaps it is a matter of who says it is “done-done”, but I am beginning to wonder if it is better to deliver the tiniest bit of functionality (iteratively of course) until it is freaking awesome. No rough edges, no quirks. Just awesomeness. Then, we move on.
This entails finding out what the heart of the functionality is and only doing that. Also, we need a customer that acknowledges the rough edges whether they face the user or are in the code. Hmmmm.
Well, like I said, I am not so sure. But I am beginning to wonder.
A couple of months ago I read What Have You Changed Your Mind About. It was a fast and enjoyable read. One essay in particular, “Political Science” by Leon Lederman, stuck with me.
In the essay, which is about the need for a science-literate political system, Lederman states:
…I am driven to the ultimately wise advice of my Columbia mentor, I.I. Rabi, who … urged his students to run for public office and get elected. …We need to elect people who can think critically… We need a national movement to seek out scientists and engineers who have demonstrated the required management and communication skills…
While I agree with Lederman about the need for more science-literate representatives in our government, this post is about software development. What if we take the above and do a few substitutions?
s/run for public office and get elected/apply for management positions and get them/
s/scientists and engineers/developers and architects/
The point is this: we’ve all had our share of PHB silliness, but we can’t expect things to get better if we aren’t willing to be involved in managing our affairs. We shouldn’t be afraid to manage.
As I’ve moved back into contracting/consulting, I’ve been drawing up a list of questions to ask potential employers. I won’t discuss the answers I prefer to hear. But I think if you know what answers you’d like to hear, you are better prepared to evaluate whether the employer is a fit for you.
Does all of the code compile?
Number of automated tests?
What is your code coverage?
What is your branch coverage?
What is your average cyclomatic complexity?
If working with legacy code, how long has it been since it was modified?
If working with legacy code, who wrote it, an employee or contractor?
Are you familiar with the Joel Test (or something like it)? What’s your score?
If you have the basic infrastructure:
What type of source control do you use?
What tool(s) do you use for continuous/daily builds?
What tool(s) do you use for bug tracking?
What tool(s) do you use for deployment?
How long does it take to get a user environment set up?
What type of machines (RAM, CPU, monitor size(s))?
What type of phones? Is it obvious how to hang up, redial, etc.?
How easy is it to get a conference area?
Do conference areas have projectors?
Do conference areas have whiteboards?
Do cubes have whiteboards?
How many managers have a programming background?
How many managers does each person report to?
How are people organized? Matrix, hierarchical, team?
Who makes the estimates?
What is the basic unit of measurement for estimates? Hours, days, points, etc?
What type of browser? Are daily tools dependent upon a particular browser?
And, of course,
Is coffee provided?
Let’s assume we have a class like this:
We have a set of Map.Entry instances. Each entry pairs a String key with a List of Strings as the value. A possible use for such a thing is the basis of a set of dependent drop downs: the first drop down holds the keys, and when its selection changes, the second drop down is repopulated with the corresponding values.
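The original Java class isn’t shown here; as a rough Scala rendering of the idea (names and data invented):

```scala
object StateData {
  // Keys populate the first drop down; each value list populates the second.
  val citiesByState: Map[String, List[String]] = Map(
    "IL" -> List("Chicago", "Springfield"),
    "TX" -> List("Austin", "Houston")
  )

  // The set of key/value entries, analogous to the Java Set of Map.Entry.
  def entries: Set[(String, List[String])] = citiesByState.toSet
}
```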
Now, how do we get this into client side land?
Technique One: Duplication
This isn’t a particularly robust way of doing things. The biggest issue is duplication: making changes requires knowledge of both the data structure and the JSP.
Technique Two: Constants
Another way of getting stuff to the client side is to extract the values in the Java data structure to constants:
And then we’ll need it in some sort of wrapper:
Then, we’ll use that wrapper in the JSP:
Technique Three: Constants Part 2
We can achieve the use of constants in a little cleaner fashion by using the unstandard tag library. It allows us to use constants in a more natural way. No wrapper needed.
Technique Four: JSON
What if we serialize the data structure into JSON? We would need a serializer. In this case I used Xstream:
Then, we could serialize the data structure into a string that can be eval’d:
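The XStream setup isn’t shown here; as a stand-in, a hand-rolled sketch of the serialization step (function name invented):

```scala
// Serialize a Map[String, List[String]] into a JSON object literal,
// e.g. {"IL":["Chicago","Springfield"]}, that client-side code can eval.
def toJson(data: Map[String, List[String]]): String =
  data
    .map { case (key, values) =>
      val array = values.map(v => "\"" + v + "\"").mkString("[", ",", "]")
      "\"" + key + "\":" + array
    }
    .mkString("{", ",", "}")
```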
The tricky parts here are making sure we know what the JSON version of the object is going to look like and at what level we can start accessing the properties.
Tying It All Together
To test the code, I used Jetty, Selenium, and JUnit. The test class:
Are you familiar with the Boy Scout Rule? Hopefully you are. In its simplest form, in any endeavor, we should leave things better than when we got there. In software, how do we do this? I think at a high level, we have some general problems and solutions:
| Problem | Solution |
| --- | --- |
| No Tests | Create Tests |
| Incomplete Tests | Record Test Coverage And Add Tests Until All Paths Through The Code Are Covered |
| Hard To Understand Tests | Refactor Tests Into Chunks That Act As Developer Documentation |
| Hard To Understand Production Code | Perform Refactorings In Fowler’s Refactoring |
Problems and Solutions Expanded
Let’s expand a little on the table above. First, the problems are listed in decreasing order of severity to the long-term success of the project. So, not having a set of tests is the biggest (technical) detriment to the project. Without automated and continuously running tests, we essentially have a black box. We can’t easily make changes and be certain that we are not introducing bugs.
Given the choice between some tests and no tests, I’ll take some tests. If we have some tests, we have a lot of the hard stuff out of the way. But how do we make it better? We add tests until we have all code paths covered. There are a number of tools out there for measuring code coverage. Hook the coverage tool up with your tests. Remember, it is not the percentage of lines covered in a class that is important. It is the branch coverage. We want to know that we have tested all the paths through the code. If you have good branch coverage, you’ll have good line coverage. Good line coverage does not necessarily mean good branch coverage.
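A tiny, invented example of why line coverage can mislead: a single test input can touch every line while exercising only one branch.

```scala
// Calling discount(true) executes every line of the method, yet the
// implicit "else" branch (premium = false) is never exercised.
def discount(premium: Boolean): Int = {
  var rate = 0
  if (premium) rate = 20
  rate
}
```

A suite containing only `discount(true)` reports 100% line coverage for this method while half its branches go untested.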
If we have complete tests, the next issue is most likely that the tests are hard to understand. To fix this, we keep in mind that tests not only prove that the code is working as expected, but also act as documentation. Effective documentation exposes only what is necessary for a certain outcome. We want enough context to understand what is going on, but not so much information that the essentials are drowned out. Achieving this is a difficult balancing act. Refactoring test code (extracting methods and classes, using proper setup and tear down) and patterns like the test data builder help us achieve this goal.
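As a minimal sketch of the test data builder pattern mentioned above (class and field names invented):

```scala
case class Order(customer: String, quantity: Int, rush: Boolean)

// Test data builder: defaults keep a test focused on the one field
// that actually matters for its assertion.
case class OrderBuilder(
    customer: String = "any customer",
    quantity: Int = 1,
    rush: Boolean = false) {
  def rushOrder: OrderBuilder = copy(rush = true)
  def withQuantity(q: Int): OrderBuilder = copy(quantity = q)
  def build: Order = Order(customer, quantity, rush)
}
```

A test then reads like documentation: `OrderBuilder().rushOrder.build` says “a rush order; nothing else is relevant here.”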
Now I must come clean: the assumption all along has been that by starting at the tests we have made changes to the production code to make it testable. This means that if we have all the other problems licked, we should be at a stage where the public API for the production code is settled. We have all the code tested, and we know what the code is supposed to do. So, if the production code is difficult to understand, it is a matter of refactoring it.