May 12, 2010

Boy Scout Rule in practice

I've started doing Scrum in late 2008 (thanks to Piotr Burdyło). And though I've immediately seen benefits of practices like daily standups, sprint planning, reviews and retrospectives, I had a big problem with finding out how to deal with refactoring within this method.

Before Scrum, whenever I found a place in code that I thought should be done better, I'd put a TODO comment. I would return to these comments whenever I saved some time on what I was doing, or when I decided that it has already been too ugly for my standards. And since my standards are pretty tight, I've been  (and making others) refactoring quite often and it worked well for me.

When doing code review, I'd write TODOs, discuss them with the authors of the code, and make them do clean up as the top priority.

But what were my options in Scrum?

1. Don't tell anybody, and just use your left over time to clean up the TODOs

This is against transparency, which Scrum is advocating, and against my moral backbone (no lying). And since you are evaluating tasks/features better and better, sooner or later you'll have no left over time for TODOs anyway. Ergo: this won't do.

2. Create additional tasks/features for refactoring

These are not visible from the product owner point of view, so some Scrum Masters are strongly pushing against them.

Ok, so I had to choose the second option, and make my Scrum Master live with it. But then, another problem emerged: unless you put a cleaning task for every class that has TODOs inside, sooner or later you'll end up with tasks like “Do all the TODOs in my Big-Ugly-Module”.

This will happen because doing TODOs may have very low priority, because people in a hurry don't bother doing them, and because some people don't write TODOs at all, rather thinking: “yeah, this is ugly, lets fix it, when we fix the rest of the module”.

So it seams that your code is predestined to go down the great spiral of (bio)degradation, until your team can't stand it anymore and somebody asks for a Scrum task to clean up the mess.

Is there a solution to this? Is there a third way?

Sure there is.

What if we can switch the direction the code quality goes in? What if our code can get better with time, instead of getting worse? That would be cool and it would solve our problem with TODOs in Scrum.



Boy Scout Rule

Leave every code you read in better shape than you found it.

Ha! That sounds like something everyone would sign under, and completely forget about a few minutes later, right? I mean, phrases like this sure look nice, but the real battlefield, the front-line, is a mean and messy place, right? You sit in your trench, trying to survive and keep the project alive, while bullets fly above your head. How can you try to make things better, when you couldn't keep them clean in the first place? No way to do it, man, there is too little time, right?

Wrong.

Let me show you how to use this rule in practice, without any significant time burden.


Real case scenario

You are in the middle of writing an application, you take a task from the Scrum white-board. It's a new feature. It says: “When importing orders, every shipping address should be verified. If it's not in our shippable zone, it should be canceled (status change + send an email to the frontend application)”.

Ok, so you want to write a new acceptance test, for this feature. But first you need to know what to test, which part of the application is responsible for importing orders. You search the project structure, and you spot the OrderService interface. You nose tells you, that it should be somewhere there. You take a look.

public interface OrderService {
    /**
     * Imports orders.
     */

    public void importOrders();
    /**
     * Export orders.
     */
    public void exportOrders(DateTime date);

    /**
     * Create raport of placed orders
     */
    public OrderRaport createRaport(DateTime date);
}


Yeah, it's there. The importOrders() method. You search for implementations, and as you expect, there is an OrderServiceImpl.


WTF is with this blur, you ask. It's intentional. It's how much you see during your first second of looking at the code. At the first second you, as a good writer, have already noticed two things:
  1. The method has 30 lines. It's probably too long. Good method should be as short as possible. This, according to Robert C. Martin, means between 1 and 7 lines.

  2. The are several different BLOCKS of code.
What does the second thing mean? It means, that the author has noticed the first thing as well, so he decomposed the method into different blocks, by putting an empty line between them.

This is so natural, so popular, nobody thinks about it anymore. We put empty lines between different blocks of code, to help us read the code. When we read the code, we can skip such blocks, because we already know, what those blocks do.

We put together similar stuff, so we can put a name into our mind-map, that given block does something. Next, we can fast read and skip it consequently.

There are two big problems with such method:
  1. it requires you to read all the details (every line) of the code at least once, before you put names of those blocks in you mind

  2. it is not the way Object Oriented Programming is supposed to separate stuff that make a single concept

So how about fixing it, while we read it? We have to read and understand it anyway. How about not keeping the names of blocks in our memory, but putting them into the code instead.


Comment every block of code

You are doing it anyway, but instead of saving the names/explanations in your head, save them in the code for a moment.

So our code will look like this:

@Override
public void importOrders() {
//create date before importing from web service
    DateTime date = new DateTime();

//import data transfer objects from web service
    List<SalesOrderExportDetail> orders = salesOrderWS.getSalesOrdersFromWebService();

//create summary for importing orders
    SalesOrderImport salesImport = new SalesOrderImport();

//for each order try
    for(SalesOrderExportDetail salesOrder : orders) {
        try {

//convert it to domain object
            Order order = new Order();
            order.setValue(new BigDecimal(salesOrder.getValue(), new MathContext(2, RoundingMode.HALF_EVEN)));
            order.setAddress(convertToAddress(salesOrder.getShippingAddress()));
            order.setBuyerAddress(convertToAddress(salesOrder.getBuyerAddress()));

//calculate vat
            double vat = calculator.vatRate(order.getAddress());
            order.setVatRate(vat);
            BigDecimal value = calculator.value(vat);
            order.setVatValue(value);

//calculate value in PLN
            Currency curr = order.getCurrency();
            if(curr != Currencies.PLN) {
                order.setPln(converter.convert(curr, order.getValue()));
            }

//save in repository
            orderRepository.save(order);

//prepare information whether it was succesfull                
            salesImport.imported(order);
        } catch(Exception e) {
            salesImport.notImported(e);
        }
    }

//save the summary for importing orders
    salesImport.setDate(date);
    orderRepository.save(salesImport);
}

Put the comments without any indentation, so that you won't forget to get rid of them, and they won't mix with other comments in the code. Make those comments verbose, make them say everything that is important on this level of abstraction. You are a programmer, you probably write faster than you think, so there was no additional time spend on this action.
Now for the second step.


Eliminate blocks of code using explicit names

The goal is to compress all the information from inside the block and the comment into one-liner. That saves the time the reader needs to get the meaning of the code next time. It usually means, replacing the block with a method, with a name created from the comment. So for example this:
//calculate vat
    double vat = calculator.vatRate(order.getAddress());
    order.setVatRate(vat);
    BigDecimal value = calculator.value(vat);
    order.setVatValue(vat);

will be replaced by this:
order.calculateVat();
Who said that the new method has to be in the same class? In this situation, it makes much more sense to put the use of the calculator into the order, after all it has all the data (we are using aspectj load time weaving to get the calculator into the order, but there are plenty of other methods if you don't want to use aspectj).

And this:
order.calculateVat();
//convert it to domain object
    Order order = new Order();
    order.setValue(new BigDecimal(salesOrder.getValue(), new MathContext(2, RoundingMode.HALF_EVEN)));
    order.setAddress(convertToAddress(salesOrder.getShippingAddress()));
    order.setBuyerAddress(convertToAddress(salesOrder.getBuyerAddress())); 

Will be replaced by this:
Order order = orderConverter.toOrder(salesOrderDto);
Here we have decided to put the body inside a new class: OrderConverter, because the convertion is a serious responsibility and the OrderService look like it's got other things to do already (Single Responsibility Principle).

Even one-liner can be a bit fixed, for example in this part:
//create date before importing from web service
    DateTime date = new DateTime();
The comment is put, because the name of the variable does not communicate it's intention. We can get rid of the comment if we give the necessary information to the reader in other way. For example like this:
DateTime dateImportStartedAt = new DateTime();
We should eliminate every block and split every part of the code that serves a different need. Even for-each loops and try-catch expressions. For-each responsibility is to iterate over a collection, and call someone else to do the job on an element. Try-catch is responsible for handling an error, so somebody else should do the unsafe job for it.

This is the Single Responsibility Principle on a method level. Make every method have only one reason to change, and your life (and the readability of your code) will be better.

Since these are very basic and easy refactoring steps, that your IDE does automatically with little effort from you, the time burden is insignificant. You had to understand the class anyway, and on the way, you've made it better, easier to understand. Now the part of your class responsible for imports looks like this:
@Override
public void importOrders() {
    DateTime dateImportStartedAt = new DateTime();
    List<SalesOrderDto> salesOrderDtos = salesOrderWebServiceClient.getSinceLastVisitOrderedByNumber();
    ImportSummary importSummary = tryToImportEach(salesOrderDtos, dateImportStartedAt);
    orderRepository.save(importSummary);
}

private ImportSummary tryToImportEach(List<SalesOrderDto> salesOrderDtos, DateTime dateImportStartedAt) {
    ImportSummary importSummary = new ImportSummary(dateImportStartedAt);
    for(SalesOrderDto salesOrderDto : salesOrderDtos) {
        importSummary.add(tryToImport(salesOrderDto));
    }
    return importSummary;
}

private OrderImportOutcome tryToImport(SalesOrderDto salesOrderDto) {
    OrderImportOutcome orderImportOutcome = null;
    try {
        orderImportOutcome = importOne(salesOrderDto);
    } catch(Exception e) {
        orderImportOutcome = new OrderImportFailure(e);
    }
    return orderImportOutcome;
}

private OrderImportSuccess importOne(SalesOrderDto salesOrderDto) {
    Order order = orderConverter.toOrder(salesOrderDto);
    order.calculateVat();
    order.updatePlnValueIfNeeded();
    orderRepository.save(order);
    return new OrderImportSuccess(order);
}

Every method has a meaningful name. No more than 7 lines of code per method. You don't need to divide code into blocks, methods are an object oriented solution to divide the code. By the way, you've delegated some responsibilities to other classes. You've refined the code, made it more precise, more clear, more readable.

But somewhere in your head there is a voice saying: “Hey, I had just one, 30-lines method, now I have four, taking the same space. If I do it to other methods, I'm gonna have like 15 methods per class. I'll get lost, it's no way easier for me”.

But that is, because you have one more big mistake to fix.


The name is the responsibility

Have a look at the name of the interface: OrderService. Can you tell me what this class is responsible for? Every class should have just one responsibility, what is it for OrderService?

Everything about orders?

What kind of responsibility is “everything”? There is no such thing as “everything”? Thinking this way you can create single class named God-Emperor and end your OOP right there. You'll never need other classes anymore.

No, my friend. The name SomethingService is a great mistake, just like a SomethingController or SomethingManager. I know quite a few SomethingManagers in real life, that you wouldn't be able to find any responsibilities for. They are called that way, because this is what they do: NOTHING. They just are.

But we do not speak corporation-bullshit (even though some of us work at corporations), we are Agile developers, we are writers, we name things as they are.

So our OrderService interface should be split into, let's say: OrdersImporter, OrdersExporter and OrdersReportFactory. That would make responsibilities and our intentions clear. If you do that, you end up with a WebServiceOrdersImporter implementation class, that has only 30 lines, 4 methods, only one of them public.

Now that's simple, isn't it?


Conclusion

It doesn't have to be expansive or difficult to make code better while you are reading it.

You shouldn't wait for the code to get really ugly. You shouldn't write TODOs only for obvious problems. Every time you have to read and understand some code, ask yourself a question: Can I make it easier? Can I make it more readable? Can I make it better?

Take at least a small step, every time you answer is yes. Good is better than good-enough. This helps turn the tide, change the (bio)degradation into positive evolution. In the end, your code gets better with time.

No one writes good code right from the start. Good code evolves from normal code, by using the Boy Scout Rule.

And if you cannot find any way to make the code better, at least eliminate blocks of code with methods. Stick to "7 lines per method", and "Single Responsibility of a method" rules.

9 comments:

  1. Like it very much. It's a nice addition to my reading of the book "Growing Object-Oriented Software, Guided by Tests". The blog entry reminds me some parts of the book very clearly. Thanks for sharing your take on it.

    Just a thought I've come up with during the lecture - what about replacing dateImportStartedAt into dateOfImportStartedAt (with Of)? Would it make the intent clearer?

    I can't live with orderConverter.toOrder(salesOrderDto) too. I thought about changing it to orderConverter.to(salesOrderDto), but it doesn't look well either. Somehow I don't like repeating Order in the variable name and its converter method. I wish I could change the name, but don't know how.

    ReplyDelete
  2. Maybe off-topic, but still..

    It took me some time to understand that there is no value in such javadocs like the ones you write. No offense, I used to write them like this myself for a long time because of the 11th commandment - 'a method must have a javadoc':
    /**
    * Imports orders.
    */
    public void importOrders();

    In fact, there is no value there, rather a violation of DRY principle.

    Yeah, it is nice to have javadocs but ONLY IF they say something more then the method name.

    Just my 3 cents.

    --
    Tomek
    http://kaczanowscy.pl/tomek

    ReplyDelete
  3. @Tomek

    You are absolutely right.

    In fact, as Robert C. Martin shows, whenever we have ANY comments in our code, it means we have failed in making the code self-explanatory. We should re-think our design and see if there is a way to express the meaning of the comment in the code alone.

    The only situation when we should have comments (javadocs), is when we create public APIs. Public library should be well documented because not everyone wants (or is able) to download and look into the source code.

    The example of code to be refactored (original interface and 30 lines method), is not the code I'd suggest anyone to write today. It's the code you, as a developer, may see in some project.

    I'm very happy that you have spotted this flaw in the interface. It's not the point of this post, but it's very important. Unfortunately I see this kind of "documentation" all around. You probably have also seen javadocs for setters and getters, telling you that these are setters and getter. This is how bad javadocs can get.

    @ Jacek

    “dateOfImportStartedAt” may be a better name, but frankly speaking, in my opinion it's a matter of taste. You could also call it “importBeginning” and the type (Date) would tell you what kind of beginning it is. After all, the type of a variable is as much important as the name.

    I'm still not sure what to choose in this kind of a situation. Some people are strongly against veryLongNamesThatHaveALotOfUnimportantStuffInside. Usually I'm happy as long as the information is there, and is readable.

    As for the repeating of “Order” in the name of the method and in the type/name of the variable, it's very interesting. My view is, that OrderConverter, like most converters, can have a lot of methods, and with converters, the name of the method usually says what is the output. For example the java.math.BigDecimal has an intValue() method. Same thing for toString().

    On the other hand, I believe that the type of the parameter is also very important, so for different types of methods I usually get rid of the type in method name.

    For example, instead of writing:

    Order.addLineItem( LineItem lineItem );

    I'd write:

    Order.add( LineItem lineItem) ;

    What do you think about it, Jacek?

    BTW: where are you Jacek, right now? I'm currently at Geecon in Poznań, and I thought I'd meet you there. Jakub Kurlenda said that he has seen your pass card at the entrance. Will you be here on the second day?

    ReplyDelete
  4. There's a bug in the code.
    order.setVatValue(vat);
    should be
    order.setVatValue(value);

    Personally I think that if you do that kind of refactoring when you are asked to do some other small task you're a "busybody" nuisance to your team. Pronouncements by Robert C. Martin on the correct size of methods being 1-7 lines are a joke, and should be ignored. Methods are too large when they become non scannable and/or start to become a source of errors. The original method was imo fine and should have been left alone.

    One far better improvement though which you have not mentioned would be to extend the Order object to create an ImportedOrder object, which takes a SalesOrderExportDetail as its constructor. Then the body of the loop just becomes:
    orderRepository.save(new ImportedOrder(salesOrder));
    with the
    getValue
    getAddress
    getBuyerAddress
    getVatRate
    getVatValue
    getPln
    methods all overridden to return the correct value.

    ReplyDelete
  5. This comment has been removed by the author.

    ReplyDelete
  6. @db: thanks for spotting the bug

    As for the nuisance/joke of 1-7 lines per method: that's what I thought about it when I've heard it the first time, about five years ago (not from Robert C. Martin but from my team leader).

    After reading "Clean Code" I've decided to finally give it a try, and half a year of doing this kind of "extract-till-you-drop" (term taken from http://www.infoq.com/presentations/Robert-C.-Martin-Bad-Code), I can say that it works extremely well in real life projects. The people I've been pairing with were also quite happy (though surprised at first). Try it and maybe it won't be a nuisance anymore.

    The idea of ImportedOrder is interesting, but overriding getters means you have to do the calculation every time they are called or... you could store results, right?

    Correct me if I'm wrong, but if you want to store results, the best place is probably in the original Order object (so that the ImportedOrder class has got only constructor inside, that fires calculations).

    But then... it doesn't have to inherit from Order at all, does it? It could be a class with a single method returning the Order. We all know that inheritance is bad after all.

    But then... it shouldn't be named ImportedOrder. It does the conversion and firing up of calculations, so we could call it something like OrderFactory or... OrderConverter.

    But that looks pretty much like first three lines of importOne(SalesOrderDto salesOrderDto) method, except the calculations are fired inside the converter. So maybe it's about giving the converter the responsibility of firing up the calculations? We have to choose between OrdersImporter and OrderConverter. I'd rather stay with OrdersImporter, because conversion imho does not imply vat calculations, while importing an order may.

    But then... I haven't changed anything.

    Ok, I've tried to implement your idea and came back to the starting point, but it is VERY possible I'm too much fixed on my solution (the author usually is). Could you show your example?

    ReplyDelete
  7. The "inheritance is bad" mantra has arisen because of early experiences with object orientated systems where complicated inheritance graphs were built up just as a means of additional layers of behaviour. This results in very fragile code, specially when methods invoke other methods in the same class which may be overridden further down the chain.

    REPEATEDLY extending objects just to alter their behavior is dangerous. Doing it once to fairly anaemic domain objects or just DTO's is not so dangerous, specially if you are just adding a constructor which configures the object differently, or adding independent utility methods. You have to be pragmatic about the cleanest way to solve a problem.

    Yes, as you say doing the configuration in the constructor is also fine, and generally I would do that. You could also just use a factory for creation of different types of orders. The basic idea is the same though - the actual code for constructing an Order from a SalesOrderExportDetail should not be in the middle of a loop over a list of SalesOrderExportDetails which has just been pulled from a webservice (imo). In this example, I suggested overloading the methods because as per your initial task i'd have added the "public boolean inShippingZone() {}" to the ImportedOrder class to be used prior to being saved. In that case none of the vat calculations etc would need to be done until the Order was saved, hence the overridden methods.

    (I'm aware that could potentially cause strange behaviour and would thus keep the class private or package private and document it. If Order generation from a SalesOrderExportDetail was also likely to done elsewhere, that would probably mean there should be either a factory for Orders with a corresponding method or a new constructor taking a SalesOrderExportDetail)

    ReplyDelete
  8. This comment has been removed by the author.

    ReplyDelete
  9. Very nice article. God... THE BIG REALITY!!! :D

    ReplyDelete