September 20, 2011

JBoss Envers and Spring transaction managers

I stumbled upon a bug in my JBoss Envers configuration today, despite having integration tests all over the application. I have to admit, for a moment it cast a dark shadow of doubt over the value of all the tests. I've been practicing TDD since 2005, and frankly speaking, I should have been smarter than that.

My fault was simple. I started using Envers the right way, with exploratory tests and a prototype. Then I deleted the prototype and created some integration tests using in-memory H2 that looked more or less like this example:

@Test
public void savingAndUpdatingPersonShouldCreateTwoHistoricalVersions() {
    //given
    Person person = createAndSavePerson();
    String oldFirstName = person.getFirstName();
    String newFirstName = oldFirstName + "NEW";

    //when
    updatePersonWithNewName(person, newFirstName);

    //then
    verifyTwoHistoricalVersionsWereSaved(oldFirstName, newFirstName);
}

private Person createAndSavePerson() {
    Transaction transaction = session.beginTransaction();
    Person person = PersonFactory.createPerson();
    session.save(person);
    transaction.commit();
    return person;
}    

private void updatePersonWithNewName(Person person, String newName) {
    Transaction transaction = session.beginTransaction();
    person.setFirstName(newName);
    session.update(person);
    transaction.commit();
}

private void verifyTwoHistoricalVersionsWereSaved(String oldFirstName, String newFirstName) {
    List<Object[]> personRevisions = getPersonRevisions();
    assertEquals(2, personRevisions.size());
    assertEquals(oldFirstName, ((Person)personRevisions.get(0)[0]).getFirstName());
    assertEquals(newFirstName, ((Person)personRevisions.get(1)[0]).getFirstName());
}

private List<Object[]> getPersonRevisions() {
    Transaction transaction = session.beginTransaction();
    AuditReader auditReader = AuditReaderFactory.get(session);
    List<Object[]> personRevisions = auditReader.createQuery()
            .forRevisionsOfEntity(Person.class, false, true)
            .getResultList();
    transaction.commit();
    return personRevisions;
}

Because Envers inserts audit data when the transaction is committed (in a new temporary session), I thought I had to create and commit the transaction manually. And that is true up to a point.

My fault was that I didn't have an end-to-end integration/acceptance test that would call the entry point of the application (in this case a service called by GWT via RPC). With one, I'd have noticed that Spring's @Transactional annotation and calling transaction.commit() are two very different things.

Spring's @Transactional annotation uses the transaction manager configured for the application. Envers, on the other hand, is wired in by subscribing a listener to Hibernate's SessionFactory like this:

<bean id="sessionFactory" class="org.springframework.orm.hibernate3.annotation.AnnotationSessionFactoryBean" >        
...
 <property name="eventListeners">
     <map key-type="java.lang.String" value-type="org.hibernate.event.EventListeners">
         <entry key="post-insert" value-ref="auditEventListener"/>
         <entry key="post-update" value-ref="auditEventListener"/>
         <entry key="post-delete" value-ref="auditEventListener"/>
         <entry key="pre-collection-update" value-ref="auditEventListener"/>
         <entry key="pre-collection-remove" value-ref="auditEventListener"/>
         <entry key="post-collection-recreate" value-ref="auditEventListener"/>
     </map>
 </property>
</bean>

<bean id="auditEventListener" class="org.hibernate.envers.event.AuditEventListener" />

Envers creates and collects something called AuditWorkUnits whenever you update/delete/insert audited entities, but audit tables are not populated until something calls AuditProcess.beforeCompletion, which makes sense. If you are using org.hibernate.transaction.JDBCTransaction manually, this is called on commit() when notifying all subscribed javax.transaction.Synchronization objects (and Envers' AuditProcess is one of them).
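
To make the mechanism concrete, here is a minimal, self-contained sketch in plain Java. It is not Envers or Hibernate code: the Synchronization interface, AuditProcess and NotifyingTransaction below are simplified stand-ins named after the real types, and the work unit strings are made up. It only shows the shape of the problem: work units are queued as entities change, and they reach the audit table only if whoever commits the transaction calls beforeCompletion().

```java
import java.util.ArrayList;
import java.util.List;

public class AuditSynchronizationSketch {

    /** Stand-in for javax.transaction.Synchronization, reduced to the one callback used here. */
    interface Synchronization {
        void beforeCompletion();
    }

    /** Hypothetical, simplified stand-in for Envers' AuditProcess. */
    static class AuditProcess implements Synchronization {
        private final List<String> pendingWorkUnits = new ArrayList<>();
        private final List<String> auditTable;

        AuditProcess(List<String> auditTable) {
            this.auditTable = auditTable;
        }

        /** Called on insert/update/delete of an audited entity: only queues the work. */
        void addWorkUnit(String workUnit) {
            pendingWorkUnits.add(workUnit);
        }

        /** Only here do the queued work units reach the audit table. */
        @Override
        public void beforeCompletion() {
            auditTable.addAll(pendingWorkUnits);
            pendingWorkUnits.clear();
        }
    }

    /** Toy transaction that notifies its synchronizations on commit. */
    static class NotifyingTransaction {
        private final List<Synchronization> synchronizations = new ArrayList<>();

        void register(Synchronization synchronization) {
            synchronizations.add(synchronization);
        }

        void commit() {
            for (Synchronization synchronization : synchronizations) {
                synchronization.beforeCompletion();
            }
            // a real implementation would commit the underlying connection here
        }
    }

    /** Audit rows written when the commit notifies the registered synchronizations. */
    public static List<String> auditRowsWithNotifyingCommit() {
        List<String> auditTable = new ArrayList<>();
        AuditProcess auditProcess = new AuditProcess(auditTable);
        NotifyingTransaction transaction = new NotifyingTransaction();
        transaction.register(auditProcess);
        auditProcess.addWorkUnit("ADD Person#1");
        auditProcess.addWorkUnit("MOD Person#1");
        transaction.commit();
        return auditTable;
    }

    /** Audit rows written when the commit happens elsewhere and nobody calls the hook. */
    public static List<String> auditRowsWithoutNotification() {
        List<String> auditTable = new ArrayList<>();
        AuditProcess auditProcess = new AuditProcess(auditTable);
        auditProcess.addWorkUnit("ADD Person#1");
        auditProcess.addWorkUnit("MOD Person#1");
        // the connection is committed directly; beforeCompletion() is never invoked
        return auditTable;
    }

    public static void main(String[] args) {
        System.out.println("notifying commit: " + auditRowsWithNotifyingCommit());
        System.out.println("no notification:  " + auditRowsWithoutNotification());
    }
}
```

In the first case both work units end up in the audit table; in the second the list stays empty, even though every work unit was collected correctly.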

The problem was that I used the wrong transaction manager:

<bean id="transactionManager" class="org.springframework.jdbc.datasource.DataSourceTransactionManager" >
    <property name="dataSource" ref="dataSource"/>
</bean>

This transaction manager doesn't know anything about Hibernate and doesn't use org.hibernate.transaction.JDBCTransaction. While Synchronization is an interface from the javax.transaction package, DataSourceTransactionManager doesn't use it (maybe for simplicity, I didn't dig deep enough into org.springframework.jdbc.datasource), and thus Envers works fine, except that it never pushes the data to the database.

Which is the whole point of using Envers.

Use the right tools for the task, they say. The whole problem is solved by using a transaction manager that is well aware of Hibernate underneath:

<bean id="transactionManager" class="org.springframework.orm.hibernate3.HibernateTransactionManager" >
    <property name="sessionFactory" ref="sessionFactory"/>
</bean>

Lesson learned: always make sure your acceptance tests are testing the right thing. If there is a doubt about the value of your tests, you just don't have enough of them.

September 11, 2011

NoSQL devmeeting in Warsaw

I spent this Saturday at the NoSQL devmeeting in Warsaw, organized by Adam Lider and Piotr Zwoliński and led by David de Rosier. At first I was reluctant to go, as my level of JS mastery is clearly negative, and I have only theoretical knowledge about NoSQL databases, but as Maciej Próchniak noticed, these are exactly the reasons why I should.

The meeting started at 9am and lasted till 9pm (though I had to leave at 7pm), with a one-hour break for lunch. David began with a gentle but fantastic introduction to CouchDB, MongoDB, Cassandra and Redis, after which we were split into groups of 3-4, each taking one of the aforementioned databases. Our task was simple: with four big (partitioned) MySQL dumps on the local SVN, we were to migrate the data to our NoSQL DB and then prepare a simple Twitter application in any language we wanted, preferably JavaScript using Node.js.

Our group took the hard way of playing with Node.js, easing it up with the choice of MongoDB (as it seems to have the best community and thus support). Node.js was more of an obstacle than a help, mainly because of its asynchronous nature, but it's quite possible that we just don't know how to write good code in it. We definitely didn't try hard, being fine with “hey, it works!”, which is just right for the kind of hacking/prototyping we were into.

We had no problems with MongoDB. With Barack Obama's 9 million followers in mind, we settled on the best model early, choosing eventual consistency and data duplication in the name of simplicity and query performance.

Because I had to go two hours early, I missed the part where we were to test the performance and try our luck with replication, but nonetheless this Saturday was clearly awesome. I loved the Hackergarten at 33rd Degree, but devmeetings are even more fabulous. The formula is superb. If I were to add anything, it would be an open review of each application at the end. I'm really curious how other teams did it.

Check out the devmeetings web page, and if you have a chance to visit one, definitely go for it. It's great, it's free, and it's one of the best ways to learn something useful fast.




September 5, 2011

How to write a good Request for Proposal


For the last month and a half, I've been busy writing offers in reply to Requests for Proposal (RFPs) from many different companies: startups, big corporations and everything in between. Since the level of those RFPs varies, and since the quality of an RFP always influences the price you get, I'd like to share a few tips on how to write a good RFP without any knowledge of analysis, and how to get the lowest possible price.

What is a Request for Proposal


Simply put, it's a description of a non-existing system, for which software developers usually answer with a formal offer.

(RFP) The publication by a prospective software purchaser of details of the required system in order to attract offers by software developers to supply it. Software development under contract starts with the selection of the software developer by the customer. A request for proposal (also called in Britain an "invitation to tender") is the beginning of the selection process.

[Bennatan, E.M., "Software Project Management", 2nd edition, McGraw-Hill International, 1992].


Most RFPs require a fixed price in the offer, because the buyer wants to know how much it will cost to make the system, and to be able to compare both the potential value and the prices.

To whom it may concern...

Before I get into details, bear in mind that this advice was written for projects up to a few man-years in size/scope. For anything bigger I'd strongly suggest starting with a short analysis as a separate project, probably done by some other company. Beware of pure consulting companies though: they cost a lot, and without battlefield experience they often fail to create anything worth the money, sometimes even jeopardizing projects with their choice of poor technologies and architectures. You'll be much safer asking a good software company, which is also able to provide the solution, to do the analysis.

It's a pity, but I've seen consultants working against their customers or against developers for a very trivial reason, coming directly from human psychology. Maslow's hierarchy of needs puts “respect by others” before morality, and a typical consultant without up-to-date battlefield experience will never be respected by developers. So instead of helping, he'll do anything to make himself feel important, like fighting for features no one needs, or insisting on using SOA where it's completely useless. I've even seen whole projects created for customers who couldn't understand them, wouldn't need them, and would abandon them right after spending a few man-years in money and effort to create them, just because some out-of-place consultant used words like cloud and synergy to charm the CEO.

Beware of anyone who speaks about synergy. Chances are, they have already lost their connection with reality. Save yourself a lot of trouble and hire a real software provider to do the analysis instead.

This post may also be useless if you do not need a fixed-price offer and you have a trusted software company that will play fair with you. But that's not common, unfortunately.

Why is the quality of RFP significant for the price?

In an ideal world, you just have to say that you need a developer's assistance, and some analyst will come to you to write down what you really want. Even things you are not aware of. In the ideal world, analysts work for free.

But in the real world, resources are limited, and if you need something worth 300 man-days, the company usually doesn't want to risk more than 10 man-days (two people for a week) before you decide whether you want to pay or not. 10 man-days may seem like a lot, but when you need to build a prototype, do technology reconnaissance, understand the domain and prepare valid estimations and formal documents, it's really not.

This is the reason why the developer often cannot have more than one or two meetings with the client before the offer is finished. This is the reason why the quality of your RFP is crucial.

The less informative your request is, the higher the price will be. To understand that, you need to understand where the price in an offer comes from. It usually consists of at least three components:
  • the effort counted in man-days (function points and so on are finally translated to man-days at the end anyway)
  • the margin, that the provider has to have, to pay for all the costs not directly connected with the project (this is where you pay for the offer itself)
  • the uncertainty or fear

The margin depends on many things, being lower mostly for companies that hire more developers than marketing/management people (which explains why big companies are often so expensive, and why agile software houses like to stay small).

The effort depends on the technology chosen, the internal experience of the company in the given technology, the method (SCRUM being cheaper than full RUP with all the bells and whistles, but not that much compared to Open Unified Process, for example) and whatever libraries the company already has on hand.

The uncertainty is where a large part of the price is placed.

See, a company sending a formal offer for a fixed-price project is usually bound by law, and has no way to change it later. The theory says our estimations at the beginning of the analysis may be off by up to 400%. I've personally seen a project where the initial estimation of nine man-weeks turned out to be four man-years. Or another one, that turned from 42 man-months into 252. These are big differences that can bring a company down, and to cope with that, software providers have to have an uncertainty margin in their estimations.

And you, as a client, do not want that.

You don't want that, because either the price will be higher than it could be (if their effort estimations were right, but the risk factor too high), or the provider may go out of business (if the estimations were very wrong), in which case you are screwed anyway.
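
As a rough illustration of how the uncertainty component moves the final number, here is a small sketch. The formula (effort times day rate, increased by the margin and then by the uncertainty buffer) and all the numbers except the 300 man-days are assumptions made up for the example, not a rule any provider is known to follow.

```java
public class OfferPriceSketch {

    /**
     * Hypothetical pricing formula: effort times day rate, increased by the
     * margin and then by the uncertainty buffer (both given in percent).
     */
    public static long fixedPrice(long effortManDays, long dayRate,
                                  long marginPercent, long uncertaintyPercent) {
        long effortCost = effortManDays * dayRate;
        return effortCost * (100 + marginPercent) * (100 + uncertaintyPercent) / 10_000;
    }

    public static void main(String[] args) {
        long effort = 300;  // man-days, as in the example earlier in the text
        long dayRate = 100; // arbitrary currency units, made up
        long margin = 20;   // percent, made up

        // Same effort and margin; only the uncertainty buffer differs.
        System.out.println("detailed RFP (10% buffer): " + fixedPrice(effort, dayRate, margin, 10));
        System.out.println("vague RFP (50% buffer):    " + fixedPrice(effort, dayRate, margin, 50));
    }
}
```

With these made-up inputs the same 300 man-days cost 39600 with a small buffer and 54000 with a large one, and the only thing that changed is how much the provider had to guess.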

If you think you want to trick them just a bit to get the software for just a little under the cost, think again, because software companies in case of an emergency often try to reduce one thing which wasn't set in stone in the offer and is most difficult to verify: quality.

You see, the famous tetrahedron of cost-scope-time-quality is wrong. It doesn't work that way. Not for software at least.

I'll skip the trivial cost-time phenomenon for a moment; we all know that adding more women won't deliver the baby faster. Each project has a sweet spot in the number of developers, above which the cost rises exponentially. Let me show you something different. Let me explain how quality works.

Most developers usually work on one quality level – their maximum, which of course may be quite low anyway, but is still their top. A good developer will use all the techniques he is proficient with, A/B/TDD, self-documenting code, prototyping etc. because Uncle Bob is right saying, that the only way to go really fast is to go well. And we all hate debugging anyway.

So the developer works at one quality level. Unless someone comes in and tells him to hurry up no matter what. And that someone may be a PM in panic, who can now clearly see that the initial estimation was wrong, and that the only way to not lose too much money is to cut wherever possible.

This is a risky, short-sighted way, but it may work if it slips past your acceptance tests. And so you, the client, are going to pay for it later, in change requests. And it works better the bigger the customer is, and the further away the guy accepting features is from the stakeholder and the user.

Which explains why so many IT projects for public administration are totally fucked up.

Some desperate companies even go so far as to deliberately give you a price under the cost, knowing that they will charge an insane amount of money for change requests later. But there is a solution to that: require all the providers to sell you non-obfuscated source code, with automatic tests, and make it clear to them that you will choose the company supporting the solution later. That should make them think twice.

So you don't want to screw the provider, because in the end you will screw yourself. You want the best deal for your bucks and a low support cost, and you want to have a good relationship with the developer after the project is done, because trust is the biggest value in every business.

Let's bring the price as low as possible, without losing anything, by reducing fear and uncertainty with a few simple tips.

Use cases (interaction) instead of lists and screens

You don't need to be a software analyst to be in need of a software solution, and so you may want to describe the idea for the software that is in your mind using screens, lists of nouns and so on. Analysts have discovered that the best way to write down requirements is to use User Stories or informal Use Cases (for the rest of this post, I'll forget about any formal differences between those two).

What are those things? You don't need to read “Writing Effective Use Cases” by Alistair Cockburn (though it is a great book), you just need to remember one thing: describe interactions between the system and everything around it (users and other systems).

Don't think about HOW the user interface looks, think about WHAT the user can do with it.

I'll give you an example. If you write something like this:
  • user management panel
there is a big uncertainty factor, because the implementation of this otherwise small task may vary by 300%, depending on your actual needs. But if you write something like this:
  • administrator can delete users, change their roles, change their passwords, search them by email and login, and see their details
my uncertainty drops close to zero, because I know exactly what I need to implement and what I don't.
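
One way to see why the second version is so much easier to estimate: each interaction maps almost directly onto an operation. The sketch below is hypothetical (the interface name, the User fields, the substring search and the in-memory implementation are assumptions for illustration, not part of any real RFP), but it shows that the list of interactions already reads like an API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class UserAdministrationSketch {

    public enum Role { USER, ADMINISTRATOR }

    /** Hypothetical user; a real system would store a password hash, not the password. */
    public static final class User {
        private final String login;
        private final String email;
        private Role role;
        private String password;

        public User(String login, String email, Role role, String password) {
            this.login = login;
            this.email = email;
            this.role = role;
            this.password = password;
        }

        public String getLogin() { return login; }
        public String getEmail() { return email; }
        public Role getRole() { return role; }
    }

    /** One operation per interaction named in the requirement. */
    public interface UserAdministration {
        void deleteUser(String login);
        void changeRole(String login, Role newRole);
        void changePassword(String login, String newPassword);
        List<User> searchByEmail(String emailFragment);
        List<User> searchByLogin(String loginFragment);
        Optional<User> details(String login);
    }

    /** Throwaway in-memory implementation, just enough to exercise the interface. */
    public static final class InMemoryUserAdministration implements UserAdministration {
        private final Map<String, User> usersByLogin = new LinkedHashMap<>();

        public void add(User user) { usersByLogin.put(user.login, user); }

        @Override public void deleteUser(String login) { usersByLogin.remove(login); }

        @Override public void changeRole(String login, Role newRole) {
            details(login).ifPresent(user -> user.role = newRole);
        }

        @Override public void changePassword(String login, String newPassword) {
            details(login).ifPresent(user -> user.password = newPassword);
        }

        @Override public List<User> searchByEmail(String emailFragment) {
            List<User> found = new ArrayList<>();
            for (User user : usersByLogin.values()) {
                if (user.email.contains(emailFragment)) found.add(user);
            }
            return found;
        }

        @Override public List<User> searchByLogin(String loginFragment) {
            List<User> found = new ArrayList<>();
            for (User user : usersByLogin.values()) {
                if (user.login.contains(loginFragment)) found.add(user);
            }
            return found;
        }

        @Override public Optional<User> details(String login) {
            return Optional.ofNullable(usersByLogin.get(login));
        }
    }
}
```

Six interactions, six methods. Anything that is not on the list (bulk import, password policies, audit log) is visibly out of scope, which is exactly what keeps the estimate tight.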

So write about the interactions the system in your mind has with its users and with other systems. And don't even try to be formal; it's not needed.

No integration without interaction (and protocols)


What do you think when you see something like this?
  • SSO integration
That seems simple. SSO stands for Single Sign On, so we probably need to authenticate the user, right? Too bad that the real requirement looks like this:
  • the user logs in to the system through our SSO, and if the user wants to register (sign-up), he should also be added through our SSO API with GUI in the portal
Ha! So it's a bit more than logging in. But is it enough to estimate the effort and keep the fear margin close to zero?

No.

Because SSO is just a concept. There is no information about the actual protocol. And there are plenty on the market. Maybe it's Kerberos or NTLM? Or maybe it's OpenID? OAuth2 anybody? Or maybe it's LDAP, and we already have tons of libraries for that, no implementation needed. Or maybe, god forbid, it's some CORBA service that remembers the times when dragons and demons ruled the earth.

This is a simple rule for requesting integration features: write about the INTERACTION (what the system wants to do from the business point of view) and about the PROTOCOL underneath. That's the minimum needed.

And if you have some experience with the given system or protocol, write that down too, whether it's bad or good. That will help a lot, especially for proprietary technologies. If your lawyers allow it, send us the docs or a code sample. Nothing is as risky as integration with a third party that we know nothing about.

Write what you don't know

With all the focus on putting all the knowledge you have about the system in the request, it's just as crucial to put there what you do not know.

When people think about a fixed-price project, they usually think about a single price, but that's not the way it has to be. You may need the fixed-price to be able to compare different offers and make you feel safe, but it doesn't mean, you cannot have options.

There is no way to make all the decisions before the project is started. Agile suggests postponing decisions to the last possible moment, because then you know more and are able to make them more consciously. If you have a few ways to solve a given business problem, and do not see right away which one is better, you can ask us to estimate both solutions. If their costs differ, it may help you to choose. If not, let's leave it for later and just assume one of them is used.

Options are good. We can even prepare a proposal with two or three stages, where with the first one you'll get your working software with all crucial features, with the second all non-crucial features will be implemented, and we will leave the bells and whistles for the third stage. In fact, this is what we usually do.

We have short cycles, and we can get you new features on production every week, since our sprints are a week long. Don't be afraid to write down what you don't know, or what you cannot decide on yet. We will prepare a few options and we will keep that in mind when creating the proposal.

No more unidentified TLAs

Even with my limited skills, I really like English, the lingua franca of IT. It's simple, elegant, and everything sounds better in it. But it has one big issue, polluting it from the mid-twenties: acronyms. And our industry is so full of them, it makes you want to Fornicate Under Consent of King.

It's easy to call the customer and get them all right eventually, but do you really want so much time wasted (multiply the TLAs by the number of providers)? Remember that the guys preparing an offer for you have only so much time, and they could spend it on lowering the uncertainty/cost of some other part of your project instead of guessing TLAs. Please save us and yourself the trouble, and the first time you use a given TLA (three-letter acronym), add the full name in parentheses.

You know, it's not that big of a problem that we need to spend more time finding out what you had on your mind; it's just that if there is a mistake, it's going to be painful. And there'll be mistakes eventually, because TLAs overlap, even in a given context, and with so many of them, it's statistically unlikely we will always get everything right.

Things get much worse when the request for proposal mixes two or three languages (which is inevitable, unless the whole document is in English), because on top of English acronyms, we get German acronyms, Polish acronyms, and short names of products from vendors that no longer exist. It's an invitation for error. And you don't want us to be wrong with the estimation.

Ask for CVs: companies are names, people are skills.

Some of our customers ask for a Curriculum Vitae for every single person on the project. It's a great idea, because in the end, it's not a company that creates software, it's the developer. Developers change their jobs quite often, and you may end up hiring a company with good references in the domain, only to find out that the team is completely fresh to the subject.

It may just as well be the other way around: getting a proposal from a company with absolutely no references, only to find out that it was started by experienced veterans who left another software provider to create their own business.

Skills matter, and asking questions costs you nothing, so go ahead and ask for CVs, unless you want to find out that your critical project is being created by two containers of first-year students who got hired for a summer internship.

A software house, having a lot of projects going, may not be able to guarantee that the developers in the offer will be there when you finally make up your mind to sign the contract, but should still provide you with potential CVs. And if it won't... well, have a lot of diapers ready because a few containers of freshmen are coming your way.

Meaningful performance expectations

Performance requirements are included in nearly every request for proposal, and that's cool, because more often than not, the possible architecture depends on the traffic (number of messages, calls or users). I don't need to be lightning fast for your 50 Intranet users and a handful of messages per minute; I can focus on productivity and keep the price low. I may need to use CQRS with a fast NoSQL cache/database, eventual consistency, queues, partitioning and Hadoop with MapReduce, if you want to have bazillions of users per second, with the possibility of tens of bazillions the next year, on your existing hardware configuration. And that's a completely different effort.

Or I may be required to present a hardware configuration that will do the trick with response time < 1 sec. Either way I need to know what your expectations are, but I need a bit more than
  • expected number of users: 50 000
Where are those users from? How is the traffic distributed throughout the day, week and month? What is a typical use case scenario? Finally: what are the expected peaks and how important is the response time for your business?

You may not be able to answer, that's fine, but if you cannot, how do you expect us to guarantee the performance?

Sometimes I read something like: up to 50 requests per second in peaks on a SPARC M8000 with 16 processors. These are valid requirements. Knowing that those 16 processors have 64 cores together, I know that if I keep the response time at less than a second for a request on one core, I should be fine. And that doesn't seem like a problem, even though I can't prove it, because I only have access to Intel's Xeons and the software doesn't exist yet. I can still prepare a little benchmark or a proof-of-concept. Here I keep my risk factor low.
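
The back-of-the-envelope reasoning above can be written down as a tiny calculation. It assumes, as a simplification, that each request occupies exactly one core for its whole duration and that the cores scale independently; the class and method names are made up for the example.

```java
public class PeakCapacitySketch {

    /**
     * Requests per second the machine can sustain, assuming each request
     * keeps one core busy for the given number of seconds.
     */
    public static double sustainableRequestsPerSecond(int cores, double secondsPerRequestOnOneCore) {
        return cores / secondsPerRequestOnOneCore;
    }

    /** True if the sustainable throughput is at least the expected peak. */
    public static boolean coversPeak(int cores, double secondsPerRequestOnOneCore,
                                     double peakRequestsPerSecond) {
        return sustainableRequestsPerSecond(cores, secondsPerRequestOnOneCore) >= peakRequestsPerSecond;
    }

    public static void main(String[] args) {
        int cores = 64;     // 16 processors with 64 cores in total
        double peak = 50.0; // requests per second in peaks

        // One second per request on one core: 64 requests per second available.
        System.out.println(sustainableRequestsPerSecond(cores, 1.0));
        System.out.println(coversPeak(cores, 1.0, peak));

        // Two seconds per request would leave only 32 requests per second.
        System.out.println(coversPeak(cores, 2.0, peak));
    }
}
```

At one second per request the 64 cores give 64 requests per second, comfortably above the peak of 50; at two seconds per request the same hardware would fall short. That is why the response time on a single core is the number worth benchmarking.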

But 50 000 users on three AMD Opterons? Hell if I know. I've been building sites for 2 million users on that configuration, so it looks fine, but the question is what the peaks look like. So I'm going to add a bit to cover that risk, and a few days at least for optimization.

If you want to have the price low in the offer you get, either write the details about peaks or don't ask for a guarantee. Ask for performance and give your expectations, and we'll keep that in mind and do everything to get you that performance, but guarantees are possible only when we can prove something, and more often than not, we can't. Usually it's just a bet, so you'll be paying extra for this risk.

Bird's eye view is nice, details are crucial

Sometimes the system in scope is so large, that the request is for just a part of it. It's a good idea to get your potential providers to understand the big picture, but it's even more important to make it clear, what your expectation as to their job is.

We had a request for a system that looked like a hardware plus embedded software solution. We were going to reject it, not even trying, because we don't do hardware, and our skills with the given embedded technology are negligible. But a question remained: why would they even send us this request, knowing very well what we do and what we don't? Trying to figure it out, we found out that our job would be only to prepare an integration point, and all the embedded software would be written by another company, waiting for us to give them an API.

When you are describing the ecosystem the software is going to live in, make sure you are clear as to what a given software house is supposed to create. Otherwise, you may not get an answer at all from some of them, giving you less choice.

How much do you have (for startups only)

With startups, the first question I always ask is: how much money do you have? 

Funny, isn't it? The whole point of a request for proposal is to find out how much the system would cost from different providers, and yet I ask how much they want to spend at the very beginning.

And while I would always like to know it, when preparing an offer, I'm only asking startups.

Why?

Because existing companies are more flexible with their money. The software is there to help them achieve their business goal, and if they need to invest more to get more, they may go for it. Startups usually don't have that choice. Startups have an idea that will hopefully bring tons of money, and they have some initial funding that has to be enough.

So I need to know what part of the funding is for IT, so that I can find a solution that may be acceptable. If the money seems nowhere near enough for all the requirements, maybe we can split them into two parts: one to get the business going, the other for when you finally start making some money. It's easier to find additional funding when you can at least cover your costs. Or perhaps we need to change a non-crucial part of your business idea for the first year, to cut the cost dramatically and get you going. Maybe we can get you an off-the-shelf open source system for the parts that do not build your market advantage. There are plenty of possibilities.

I've also seen a few lucky startups that couldn't afford to invest too little in the IT side of their business, because the investor wasn't interested in spending a mere few hundred thousand to get a million back. The investor wanted to spend a few million to get back a few hundred million.

That's cool with me, but let me know from the start, because then, I'll go for solutions better fitting your capabilities.

Call me / email me if you can't write it formally

Frankly speaking, if I were the owner of a software house responding to requests for proposal, I wouldn't have the balls to sign half the contracts we do. At least with corporations. Most of the requests have ugly sections, from even uglier lawyers, that generally require you to sell your soul and your wife, in case the other side requires it.

You know how long one can negotiate a Non-Disclosure Agreement, before finally getting the request for proposal? A month. That's how long it takes, before the lawyer finally understands, that he cannot prohibit us from working for ALL THE OTHER companies in the world, just because he wishes so.

Anyway, apart from legal formalities, corporations often have their request templates polluted with all sorts of smelly stuff, that implies a starting cost.

For example, an Intranet application template is being used for a public Internet application, and as a result we are asked to prepare formal end-user documentation in doc format. Seriously? Come on, find me a single user that would download and read a doc before using the site. Can we get you context help instead (which we will implement anyway)?

And the answer is: no ;)

No ;) means: we, the corporation, need a formal doc document, because this is what some consulting company (gaaah!) decided years ago, charging us a million for preparing these templates, and it's not in my power to change that. But ;) also means nobody's going to read it, and you can produce those docs from the context help, if that brings the price down. And yes, in my private opinion, context help will be of much better use than a doc.

Same story with RUP. Do you guys really need that? Is Open Unified Process (which adopts the core of RUP, while being more agile) enough? Sometimes you do not care, it's just the template that says RUP.

The easiest way to keep the price low is to call us. Tell us what you really don't care about, and what is there just because of the consulting company.

I'd be scared to sign a document which obliges me to formal things I won't do (I've done that once, and now I have to, theoretically: “consult all important life decisions with the bank”), but my bosses have more experience in that matter (or just cojones) and understanding with our clients, for whom we've been working from their very beginning.

Everything changes (be agile or be dead)

It's quite embarrassing, that even in 2011, some companies literally stick to the offer. It may be because of fear and the potential legal consequences, but this is not the way it should be and it's not the way we work.

If you do not play dirty, good software houses will always welcome change. Let us decide whether we can handle it within our budget, and if we can, we won't bother you with an official change request (CR) or an invoice/agreement for every change. If we can't, let us find a cheaper way to meet your business needs. We are here to cooperate, and we want you to be happy at the end of the project, because we want you to trust us. We are playing on the same side. At the end of the day, we want you to flourish, and we want you to send us another request for proposal when you are in need of another software solution.

We will make you secure, by selling you well written sources and using open-sourced libraries, by creating tons of tests, so that another company can easily take over the support if you wish so. 

This is all old news. The Open Source and Agile movements have been around so long that it should be crystal clear it's better to play together this way.

If you have a negative experience with a company over-charging you for changes, just don't send them another request for proposal, but don't think everybody behaves like that.

Or maybe, you just bought under-priced software, and the company is getting back the money this way? Hmmm....