Monday, May 19, 2014

Refactoring legacy: nothing to fear

tldr; The classic approach to testing is not suited to refactoring legacy code. It is better to write disposable automated tests. Since they need zero maintenance, and with the right tooling, this is very, VERY fast. Example: 200 lines in 5 minutes.

This weekend is Agile France and I am really very happy to be presenting "Refactorer legacy, même pas peur!" ("Refactoring legacy, nothing to fear!"). There will be live code, of course! Well, live tests at any rate. My presentation, which is also Rémy Sanlaville's since he cannot make it, aims to challenge how we generally think about testing legacy code. Yes, really! :)

The problem

Classically we run into one or more of these problems when we try to work on legacy code:

  1. Writing tests before touching the code is judged to take too long
  2. We put (very) high-level tests in place, often quite slow to run
  3. Unit tests cling to the code and can slow down some refactorings

This is probably the point where tempers heat up. That is exactly the moment to calm down; I will explain everything ;)

Too long

Yes, well, sometimes it can take days or even weeks to put tests in place, especially high-level ones. That may or may not be justifiable. In any case, the less time writing tests takes, the more we will do it.

High-level tests, slow to run

These tests run several orders of magnitude slower than unit tests; sometimes they require deploying the code to an application server, and a single run takes tens of minutes. As a result we go more slowly, we think longer, and generally we will not go as far in the refactoring, leaving more debt behind.

Unit tests and adherence to the code

Classically we test the interface of each class. It is rarely well designed, and when we want to change it we must change not only the production code but all the tests as well. Worse, we often want to test only part of the class, thereby creating adherence to protected methods. So although unit tests enable refactoring inside the class, they hinder refactoring at its boundaries.

An alternative

I am not talking about THE solution. What Rémy and I are trying to show you is a complementary approach that stunned us with its effectiveness, a "new" way of thinking about tests. Whether the tests are high level or low level, it applies at every level.

Once the code is nice and clean, it is quick and easy to write unit and high-level tests in the TDD or BDD style, which are then maintainable.

There are two key points to remember from our presentation:

  1. Writing an assertion for legacy code is a form of waste. The result of the first run is THE reference. Record-Replay.
  2. These are temporary tests, to be thrown away at the end of the refactoring. So we spend zero energy on maintainability.

The combination of these two points yields an enormous time saving and lets us write tests tailored to the current need: tests that are fast and resistant to refactoring.
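The Record-Replay idea can be sketched in a few lines of Java. Everything here is invented for illustration (the function `legacyComputation`, the file name): the first run records the output as the reference, with no hand-written assertion; later runs just replay and compare.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class RecordReplaySketch {
    // Hypothetical legacy function under test.
    static String legacyComputation(int input) {
        return "result:" + (input * 2);
    }

    public static void main(String[] args) throws IOException {
        StringBuilder actual = new StringBuilder();
        for (int i = 0; i < 5; i++) {
            actual.append(legacyComputation(i)).append('\n');
        }
        Path approved = Path.of("legacy.approved.txt");
        if (!Files.exists(approved)) {
            // Record: the first run's output becomes THE reference.
            Files.writeString(approved, actual.toString());
            System.out.println("recorded reference");
        }
        // Replay: after any refactoring, the output must match byte for byte.
        if (!actual.toString().equals(Files.readString(approved))) {
            throw new AssertionError("behaviour changed during refactoring");
        }
        System.out.println("replay matches");
        Files.delete(approved); // these tests are disposable anyway
    }
}
```

In practice a tool like ApprovalTests handles the record/compare/report cycle for you; this just shows why no assertion ever needs to be written by hand.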

If you are curious how we do this and what tooling comes out of it, come see an example and have a good time at Agile France!

Notes: example tooling:

  • ApprovalTests
  • Code coverage
  • Moco (to simulate web servers) (Java)
  • XStream (to serialize objects to strings) (Java)

Update: slides

Update: include video preview

Friday, May 9, 2014

Golden Master and test data

Golden Master legacy testing - where does the test data come from?

I get this question from time to time: "When using the Golden Master testing technique, how do you generate data?" In particular, the concern is to have relevant data.

Now in a kata like the Legacy Code Retreat (trivia) and other small pieces of code, you can usually generate random data - this works particularly well with the trivia code as it depends on a randomizer already. Because this randomizer is the only input data the code depends on, you get 100% coverage if you vary it.
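In Java that boils down to fixing the randomizer's seed, so that the seed itself is the input you vary. A dependency-free sketch, where `playTurn` is a made-up stand-in for the trivia game loop:

```java
import java.util.Random;

public class SeededRandomSketch {
    // Invented stand-in for the trivia game: its only input is the randomizer.
    static String playTurn(Random rand) {
        return "roll=" + (rand.nextInt(6) + 1);
    }

    public static void main(String[] args) {
        // Varying the seed varies the data; fixing it makes each run reproducible.
        StringBuilder golden = new StringBuilder();
        for (int seed = 0; seed < 3; seed++) {
            golden.append("seed ").append(seed).append(": ")
                  .append(playTurn(new Random(seed))).append('\n');
        }
        // A second run with the same seeds must produce the identical output.
        StringBuilder again = new StringBuilder();
        for (int seed = 0; seed < 3; seed++) {
            again.append("seed ").append(seed).append(": ")
                 .append(playTurn(new Random(seed))).append('\n');
        }
        if (!golden.toString().equals(again.toString())) {
            throw new AssertionError("non-deterministic output");
        }
        System.out.print(golden);
    }
}
```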

However, randomly generating data is not always feasible, or at least not easy. So what do you do then? Here's what I've been using; the list is non-exhaustive of course. Please help me think of other means!
  1. Capture production data
  2. Ask a product expert
  3. Use code coverage
  4. Generating data (not so randomly)

Capture production data

This is my favorite. It is usually possible to ask for it; after all, it is a legacy system in production and many people depend on it, so if you say it might blow up on the next delivery ... well, people tend to find ways to get that data. There are a million ways of capturing it: listening to network traffic, looking in the database, looking in logs. Even inside the application you can serialize an object (in Java with XStream, for instance) - that is, if you can push this kind of patch to production, or duplicate network traffic to a second instance that is not in production.
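With XStream the capture really is a one-liner, `new XStream().toXML(anyObject)`. As a dependency-free sketch of the same idea, here is a hand-rolled stand-in that snapshots an object's state to a string, the way a capture patch would before appending it to a log (the `Order` type and the output format are invented):

```java
public class CaptureSketch {
    // Invented domain object standing in for whatever the legacy code manipulates.
    record Order(String id, int quantity) {}

    // Stand-in for new XStream().toXML(o): any faithful object-to-string dump works.
    static String capture(Order o) {
        return "<order><id>" + o.id() + "</id><quantity>" + o.quantity() + "</quantity></order>";
    }

    public static void main(String[] args) {
        Order order = new Order("A42", 3);
        // In a real capture patch this string would be appended to a dedicated log file.
        String snapshot = capture(order);
        if (!snapshot.contains("A42")) throw new AssertionError("capture lost data");
        System.out.println(snapshot);
    }
}
```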

The extra benefit with this method is that you can discover dead code and delete it :-D

How do you decide how much data is enough? Usually I collect A LOT, too much to work with. Then I run it all, verify I get close to total coverage, then I remove data until the code coverage gets affected.

Ask a product expert

This won't provide all the significant data, but it is a start in understanding the code and the data. If you run the application with this data and analyse what is not covered, chances are you can complete the picture with "invented" data. Sharing these findings with the product expert can be quite interesting.

Use code coverage

Write a first test, check the code coverage. Examine the code to find out how to cover the next branch, write the test, validate your assumption by running the code coverage again. Repeat until done.

Generating data (not so randomly)

If you have 10 input parameters that each have some potential variation, you can enumerate the variations of each, then generate all combinations. ApprovalTests has some built-in support for this through LegacyApprovals.lockdown(). There are also the QuickCheck-style libraries (one example in Java, another in Scala) that do intelligent generation (and more). Originally in Haskell, QuickCheck has been ported to many languages. Those tools are very interesting, but I haven't had the occasion to work much with them.
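A stdlib-only sketch of the combinations idea (the legacy function and the parameter values are invented; with ApprovalTests, the accumulated string would be handed to the approval mechanism instead of being checked by hand):

```java
public class CombinationSketch {
    // Invented legacy function with three input parameters.
    static String legacyPrice(int size, boolean rush, String region) {
        int price = size * (rush ? 2 : 1) + ("EU".equals(region) ? 5 : 0);
        return size + "," + rush + "," + region + " => " + price;
    }

    public static void main(String[] args) {
        int[] sizes = {0, 1, 100};        // a few variations per parameter...
        boolean[] rushes = {false, true};
        String[] regions = {"EU", "US"};
        StringBuilder golden = new StringBuilder();
        for (int s : sizes)               // ...then the full cross product of them
            for (boolean r : rushes)
                for (String g : regions)
                    golden.append(legacyPrice(s, r, g)).append('\n');
        long lines = golden.toString().lines().count();
        if (lines != 12) throw new AssertionError("expected 3*2*2 = 12 combinations");
        System.out.println(lines + " combinations captured");
    }
}
```

Note how fast the cross product grows: 10 parameters with 3 variations each is already 3^10 = 59049 runs, which is why code coverage matters for trimming the input set afterwards.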

Tuesday, April 29, 2014

French audience - HowTo be understood

Over and over again I hear my French colleagues say "I didn't hear what he said". The problem is that people in France aren't used to listening to English speakers, so it's much more difficult for them to capture the message the speaker is trying to relay.

Discussing the problem with some peers, we've come up with a few simple ideas that can make things better. It's not that I have tried them all out, nor am I an expert on the matter, but I have the urge to share these simple ideas with you.

  • Speak slowly
  • Use basic words
  • Put more words on your slides
  • Translate slides
  • Present with a local speaker
  • Use non verbal communication
    • point to objects, images
    • use drawings
    • use expressions, show feelings

Also, for conference organisers: speakers from non-English-speaking countries are usually easier to understand.

Speak slowly

Sure, you'll cover less material - but so what, quality always beats quantity. Of course I'd forget that after about 30 seconds! So what I'd do is ask someone in the audience to raise a sign with "slow down" written on it from time to time. If you are co-presenting with a local speaker, this is one of the things they can help with.

Use basic words

I've seen some prominent speakers use a very precise vocabulary; it was impressive. But it was too elaborate for 80% of the audience. What does it matter if 20% really get it if the rest miss most of it?

Put more words on your slides 

This runs contrary to good presentation practice - you're supposed to have only a handful of words per slide. But that advice assumes the audience has an easy time hearing what you say. In fact, seeing some key words on the slides helps the audience hear them.

Translate slides

You can usually get someone (a conference organizer, a course attendee, a college student) to spend a little time translating your slides. If not, ask on Twitter.

Present with a local speaker

Take the opportunity to present your subject with a local speaker. They're experts on the French audience and can help you reach it in every way. There are plenty of people who would jump at this opportunity.

The rest of the ideas are pretty self explanatory.

Please help improve this list!

I aim to make recommendations to English-speaking speakers at AgileGrenoble this year; please help me improve those recommendations.

PS: Thanks to Manuel Vacelet, Sandro Mancuso and Remy Sanlaville for helping me elaborate this small list.

Note: Thanks to Pascal Van Cauwenberghe for adding the idea of translating slides and for suggesting what he often does - presenting with a local speaker!

Thursday, January 2, 2014

Golden Master and legacy - a few insights

tldr; Golden Master is a complementary testing technique, particularly useful with awful legacy code. It allows quick production of reliable non-regression tests that don't interfere with refactoring. Compared to traditional tests, the game changer is that they are ephemeral - thus they don't have to be maintainable.

I was introduced to the Golden Master testing technique two years ago, when we in Grenoble, France had the chance to have J.B. Rainsberger come here for the world's first Legacy Code Retreat. It was a very efficient technique to get total line coverage on that ugly piece of code of some 200-300 lines. The variation he used was to put log statements all around the code, run it with many variations of input arguments, save the log files, and write a test that reran the application with the exact same arguments - if you only refactor (no behaviour changes, i.e. no transformations), then the tests must stay green.

Summary: Golden Master is a technique where you bombard your system with many, many, many variations of input arguments and capture all the outputs to some file(s), which are committed. Every time you run the tests, they must produce the same file(s). Chris Melinn describes the technique in detail. Sandro Mancuso has an example.

I never used it afterwards - except for dojos on legacy code. The reason is that it is so easy on code that takes simple arguments and returns all its results, or at least has some easily verifiable side effects - like writing to a file. But production legacy code has soooooo many really horrible dependencies, intricate side effects, and very often results that depend heavily on the state of the machine, application, database, etc. Another reason I didn't use the Golden Master technique is that those tests are unmaintainable - very fragile to any modification of the behaviour.

But then a few months ago I had a few insights that suddenly made it very interesting. Here's an overview of them.
  • To deal with behaviour depending on state - I can turn the state into direct input arguments by writing a wrapper function around the SUT (System Under Test) that configures the different states
  • To deal with results expressed as side effects - I can turn them into return values by reading the state after exercising the SUT. So the system now behaves like a pure function from a testing perspective.
  • Those tests are ephemeral - I throw them away once I'm done with the refactoring. So they don't need to be explicit, clear, robust and all the other things that take time. The way I mock the system can also be quick and dirty.
  • Ordinary unit tests (one test class per production class) on awful code are often too fine-grained and give only moderate safety (failing to capture some intricate side effects), and while they allow refactoring below the interface they are testing, they hinder refactoring of that often ill-designed interface - because they are a second client to it.
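The first two insights above can be sketched as a wrapper that pushes hidden state in as arguments and pulls side effects out as a return value (the `balance` field and `legacyWithdraw` are invented stand-ins for real machine or application state):

```java
public class PureWrapperSketch {
    // Invented stand-in for hidden state the legacy code depends on and mutates.
    static int balance;

    // Invented SUT: reads and writes the hidden state, returns nothing.
    static void legacyWithdraw(int amount) {
        balance -= amount;
    }

    // Wrapper: configures the state (input), exercises the SUT, reads the state
    // back (output). From a testing perspective the system is now a pure function.
    static int run(int startingBalance, int amount) {
        balance = startingBalance;
        legacyWithdraw(amount);
        return balance;
    }

    public static void main(String[] args) {
        if (run(100, 30) != 70) throw new AssertionError("unexpected result");
        System.out.println(run(100, 30));
    }
}
```

Because the Golden Master tests only ever call `run`, they never adhere to the SUT's internal interfaces, which is what keeps them out of the refactoring's way.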
So if Golden Master tests only allow but don't enforce design improvement, aren't they dangerous? Don't they leave the SUT just as bad or even worse? Well, they could of course; one way of avoiding that is to apply the classical TDD method to legacy code, i.e.
  1. Write non-regression tests
  2. Refactor, to make the system Open-Closed (with respect to the feature you want to add)
    1. Write ordinary unit tests for the refactored code
  3. Test-drive the new functionality in isolation from current production code
  4. Plug the new functionality (this is usually a one-line code or config change)
The Golden Master tests written in the first stage will allow the aggressive refactoring that is necessary for step 4 to be a one-liner. They will not protect the 4th step. So if I want to reap maximum benefit from them, I'll have to structure my work with an absolute separation of refactoring from transformation (modifying behaviour), because in the transformation phase the whole suite of Golden Master tests will fail, possibly without providing any information on why. Just like well-executed TDD.

To me this is a wonderful technique for working with legacy code! A few friends and I have used it in a professional context with excellent results; Rémy Sanlaville wrote about string serialization and Matthieu Cans about coverage. I'm exploring this subject in detail and will post about it as I learn more.

Tooling: ApprovalTests and a powerful mocking library like PowerMock. Code coverage is also essential.

Btw, while we do throw away the tests, the majority of the work is still useful - the wrapper function(s) can be kept as an example and whatever we did to make the system testable has decoupled the system.