Wednesday, December 5, 2018

Typical arguments for avoiding automated tests (and their counter-arguments :) )



As a test aficionado, I often have to deal with people who are resistant to tests, full of stereotypes and preconceptions. Here is my top 8 (in no particular order) of the refusals I have encountered.


Writing tests will cost me too much time

  1. Of course it will take some time to write tests, but you have to consider it an investment. You'll gain time later by not having the same bugs coming back again and again.
  2. When writing tests, you learn how the different parts work together; as a consequence, you'll know your system better and you'll be more efficient when debugging problems.

Code isn't testable


  1. If your code isn't testable, it is probably neither reusable nor easily configurable, and you'll have many more problems with it. You'll have to split your code into smaller parts, use interfaces where possible, and learn how to do some dependency injection (see the small sketch after this list).
  2. BTW, I have never seen any code that wasn't testable at all (well, maybe assembly code for very specific processors is difficult to test without the hardware... but even in those cases, tests can be done through a test bench).
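To illustrate the dependency injection point above, here is a minimal sketch (all the names are made up for illustration): the first class is hard to test because it builds its own dependency; the second accepts it as a parameter, so a test can pass in a tiny stub instead of a real database.

class HardToTestReport:
    def total_sales(self):
        db = ProductionDatabase()          # hidden, hard-wired dependency (hypothetical, never called here)
        return sum(row.amount for row in db.query("SELECT * FROM sales"))

class TestableReport:
    def __init__(self, db):                # dependency injection: the caller chooses the database
        self.db = db

    def total_sales(self):
        return sum(row.amount for row in self.db.query("SELECT * FROM sales"))

# In a test, a tiny stub is enough:
class StubRow:
    def __init__(self, amount):
        self.amount = amount

class StubDb:
    def query(self, _sql):
        return [StubRow(10), StubRow(32)]

assert TestableReport(StubDb()).total_sales() == 42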

There are still bugs with tests



  1. Yes, it is true: there is a direct correlation between the amount of code and the number of bugs. Automated tests are probably more efficient at protecting against regression bugs than at preventing brand-new ones, but regressions are the nightmare of every developer and IT manager.
  2. If you write tests during the development phase, you'll surely find bugs just as if you were doing manual tests, but those bugs will be fixed permanently (once the test is written, the bug is definitively buried).

Writing testable code is too complicated


  1. Yes, this is true, but writing bug-free code is even more complicated :) 
  2. Don't be afraid, it is difficult for everybody, just as it was difficult the first times you wrote code; but the more tests and testable code you write, the more comfortable you'll become.

Test code will enlarge the quantity of code to maintain, which will lead to a higher maintenance cost

  1. Well, according to my mailbox, everybody needs to enlarge his....Oh,... no, not this time, sorry... 
  2. That's right, more tests mean more lines of code... but there is a compensation: with good test coverage, refactoring your code is easy and harmless, so you'll be able to reduce the amount of code to maintain => win-win?
  3. Your code base will probably be a little bit bigger, but more modular and more isolated (a consequence of writing tests), so you'll have less difficulty replacing/changing/splitting it into smaller parts and, in the end, reducing maintenance cost.

We don't need tests, we just write some small patches


  1. And the code you're patching doesn't have any tests? If you really only create small patches to existing code, this is a good time to start learning how to write functional tests (or system tests): you'll have to test a complete functionality from a higher level.
  2. This is a good starting point to follow the boy scout rule: "Leave your code better than you found it." If the original code doesn't have any tests, start writing some :).


I don't need tests, I do PRs and code reviews

  1. Code reviews (and pull requests) are better than nothing of course, but their efficiency at detecting bugs depends too much on the experience and involvement of the reviewers, and you'll have to follow some rules to make PRs efficient (small PRs, isolated modifications... many things you do naturally when writing tests 😇).
  2. Anyway, it won't protect you against regressions, but tests will :-).

We don't know how to write tests !

  1. This one is a good argument, but nothing is impossible ;) Start reading good online resources about writing tests; the Way of Testivus from Alberto Savoia is a good place to start to understand the spirit of testing.
  2. So you don't know how to write tests, but maybe some of your colleagues have already written automated tests? Ask them how to do it properly (people who write automated tests are often very happy to share their knowledge with other developers; don't be afraid to ask them for help, you'll be rewarded).


Reactions, comments, remarks, feel free to comment/start discussions :)


Friday, November 10, 2017

Write a DSL in less than 20 lines of Python code.



Imagine a customer (Nick Fury) asks you to develop a system to handle the directories of different organizations and let them navigate through them. But they want to be able to describe the organizations themselves (probably for security reasons).
What are the options for letting end users do that?

Let's ask Google what good configuration formats are.

The first result leads you to INI files

This is known as a good and simple format, easily readable and understandable by humans. OK, let's try this one!
Let's write the INI file for a member of the organization:
File bruce_wayne.ini
[user]
lastname=Wayne
firstname=Bruce
organization=DCC

Okay, looks readable. Let's write the second one.
File dick_grayson.ini
[user]
lastname=Grayson
firstname=Dick
organization=DCC
managed_by=bruce_wayne

Hmmm, the managed_by key, which declares a relation with an external file, sounds like a warning sign: how do we ensure that this reference will exist in the system? Moreover, it will soon become a nightmare when facing non-Latin names (and if a person has 2 managers, INI will die!).

Okay, maybe INI files won't be the best solution, what else could we do?

The next result in Google is XML.

The data seems hierarchical, with relations; we could use one XML file to store everything, let's try:
<organizations>
 <organization>
  <id>1</id>
  <name>DCC</name>
 </organization>
 <organization>
  <id>2</id>
  <name>MCU</name>
 </organization>
</organizations>
<user>
 <id>1</id>
 <lastname>Wayne</lastname>
 <firstname>Bruce</firstname>
 <organization_rel>1</organization_rel>
</user>
<user>
 <id>2</id>
 <lastname>Grayson</lastname>
 <firstname>Dick</firstname>
 <organization_rel>1</organization_rel>
 <managed_by>1</managed_by>
</user>
... arf, already tired of writing those <tags> and of having to manage internal stuff like ids.

So, letting users write XML is not a solution either.
#ProTip: XML is good for machine-to-machine communication, written by a machine and read by a machine, definitely not a human-friendly language.

What else could we do?

Finally, a solution appeared: use a Domain Specific Language (DSL).

We could imagine a description language that is easier for end users to write and read.
After some discussion with the customer, you agree on this proposal:
robin=User(firstname="Dick", lastname="Grayson")
batman=User(firstname="Bruce", lastname="Wayne", subordinates=[robin])
Organization(name="DCC", employees=[robin, batman])

ironman=User(firstname="Tony", lastname="Stark")
warmachine=User(firstname="James", lastname="Rhodes")
pepper=User(firstname="Pepper", lastname="Potts", subordinates=[ironman, warmachine])
Organization(name="MCU", employees=[ironman, warmachine, pepper])

Pretty straightforward, isn't it?

Now, let's do this with some python code:

First, to handle our organizations we'll need some classes:
class Organization:
    def __init__(self, name=None, employees=None):
        self.name = name
        self.employees = employees or []

class User:
    def __init__(self, firstname=None, lastname=None, subordinates=None):
        self.firstname = firstname
        self.lastname = lastname
        self.subordinates = subordinates or []

Of course, we're missing the real behaviour of those items, but for our example it will be enough.
Then, how do we load/read/interpret the file and turn it into Python objects? Here is what I propose:
class OrganizationConfigurator(object):
    def __init__(self):
        self.__symbols = {
             "User" : self._create_user,
             "Organization" : self._create_organization}
        self.organizations = []
        self.users = []

    def __read_file(self, filename):
        with open(filename, "r") as the_file:
            return the_file.read()

    def read_configuration_from_file(self, filename):
        exec(
            compile(self.__read_file(filename), filename, "exec"), 
            self.__symbols
        )

    def _create_user(self, **kwargs):
        new_user = User(**kwargs)
        self.users.append(new_user)
        return new_user

    def _create_organization(self, **kwargs):
        new_organization = Organization(**kwargs)
        self.organizations.append(new_organization)
        return new_organization


And that's all! You can count the lines needed to create the DSL: 10 lines for the core (open, read and interpret the DSL) and 8 lines for the user/organization creation. Can you do it in fewer?

So, let's explain a little bit what is done here.
First, the __init__ part :
class OrganizationConfigurator(object):
    def __init__(self):
        self.__symbols = {
             "User" : self._create_user,
             "Organization" : self._create_organization}
        self.organizations = []
        self.users = []
We populate a dictionary of symbols that will be used to evaluate the configuration. Here, we declare the keywords "User" and "Organization" that will become available in the DSL.

Then, the main method is read_configuration_from_file


def read_configuration_from_file(self, filename):
    exec(
        compile(self.__read_file(filename), filename, "exec"),
        self.__symbols
    )

This function uses two Python built-ins, compile and exec.
compile will transform the content of the file into Python "bytecode" (well, strictly speaking it returns a code object; if you want to dig into that wonderful world, go there). Then, exec will (beware of the surprise...) execute the code!
By giving the symbols dictionary as the globals parameter to exec, we populate the DSL world with our keywords; this way, User and Organization are known, callable objects.

Now, to launch this parser, use:
configurator = OrganizationConfigurator()
configurator.read_configuration_from_file("name_of_the_file_containing_the_configuration")

And you'll find the list of organizations in the organizations member, and the list of users in users.
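For example, once the file has been read, you could walk through the collected objects like this (a small usage sketch, using the attribute names defined in the classes above):

for org in configurator.organizations:
    print(org.name, [employee.lastname for employee in org.employees])
for user in configurator.users:
    print(user.firstname, user.lastname,
          [subordinate.lastname for subordinate in user.subordinates])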

SECURITY WARNING:
Of course, if you don't trust your customer (or the config writers), you should restrict the world by manually setting, under the __builtins__ key, a dictionary from which the dangerous built-ins (exec, compile...) are removed, and by removing/overriding other dangerous mechanisms (__import__ for example).
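For instance, a minimal hardening sketch could look like this (the filename and the whitelist are made up, and keep in mind that truly sandboxing Python is notoriously hard, so this only reduces the attack surface); it reuses the configurator instance created above:

safe_symbols = {
    "User": configurator._create_user,
    "Organization": configurator._create_organization,
    # Without this key, exec() silently adds the full __builtins__ module
    # (open, __import__, exec, compile...) to the globals dictionary.
    "__builtins__": {"len": len, "min": min, "max": max},
}
with open("organizations.dsl") as dsl_file:
    exec(compile(dsl_file.read(), "organizations.dsl", "exec"), safe_symbols)

With this, an "import os" written in the configuration file fails with an ImportError, because __import__ is no longer available.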

Friday, October 27, 2017

What is important in Docker is the Dockerfile



The most important part of a Docker container is the Dockerfile.

In fact, doing good dockerfiles is enough for efficient Devops !



TL;DR: Use Docker in development, but once your dockerfile is fine-tuned, transform it into a "native" package (DEB, RPM...) and replace the content of the dockerfile with the installation of the package => more portable across container platforms and OSes.

What is Docker ?

Docker is a way to create a world for your application (or a part of it) and isolate it from the outer world (for the older geeks, do you remember Populous? Well, that's what you're doing with Docker: you are the god and you rule everything, from ground design to interactions).

How do we do that ?

We do that by telling Docker how to construct the world, creating the lands, populating them with living creatures; all of this is described in the dockerfile: to continue with the god metaphor, this is the Book of Genesis. You describe everything that makes your application's ecosystem functional, from the operating system you choose to the startup of your application, through all the configuration parts.

 Why is it good ?

This is good because, by writing this description of the world's creation, we create a reproducible and stable world. And a stable world is the first step towards a stable application.
If the environment surrounding your application changes with every new installation (and if you do it manually, it will), it will lead you to instability.
Imagine we transpose this to the real world: it would be like starting work in a different company every day, not the best way to be productive, is it?
This is why it is important to stabilize the environment.
And this is the first step of the industrialisation process of your application: with it you reduce instability, greatly accelerate deployments, and in the end you raise your own productivity and your teammates', and your application will look shinier.

Why would you want to use a dockerfile without Docker? Isn't that nonsense?

Yes but no :-).
Of course, you can't reuse a dockerfile as-is outside the Docker world, but you can reuse its content (or more generally, the concepts).

When you think about how you will construct your world, you think about the dependencies, their relationships, the way to configure everything effectively; in short, you think about the industrialisation of your application, how to install/configure/manage it.

Once you have managed to run your application with Docker, you have done the hardest part (in general, you have modified both the dockerfile AND your application), and it is now easy to extrapolate/generalize the process to make your application deployable (almost) everywhere. If you compare a dockerfile with the postinst file of a Debian package (or any other package manager's files), you'll see they have a lot in common; it's mostly a matter of translating one description into another, no magic needed.


Using Docker during development is now a commonly accepted idea, and it is a good habit; but once your dockerfile stops changing, you could think about transforming it into a native package (Debian, RPM... once you've made it for one, it's only a few adaptations for another, they're all similar).
By doing this, you will gain these benefits:

  1. you won't be stuck with a single container technology: you will be independent of it (there is not only Docker in the container world: LXC, Solaris Zones, rkt... maybe it's not changing as fast as the JS ecosystem, but it's moving). With native packaging, you won't have problems adapting your application to one technology or another.
  2. some customers prefer native installations or classic virtual machines. This won't be a problem once you have native packaging (to understand why some customers want this, take some time to look at the container management jungle: Swarm, Kubernetes, Mesos... it's moving even faster than container technology, and nobody seems to be taking the lead for the moment).

And finally, you will be able to fly yourself !

Ecosystem independence

Monday, October 9, 2017

How to #Devops-ify a python (flask based) web application ?



I recently had to develop a new web application at work. I saw it as a good opportunity to apply as many #Devops habits as possible, and I'm going to share the recipe I followed, with all the tools I used.

Recipe ingredients

To make a successful devops stack, you’ll need :
  • a modern language that will allow you to package your modules and manage dependencies => python
  • a tool to save and follow the modifications you'll make to your code => git (through a Gitlab installation)
  • a tool to let you automate your tasks across different machines => Jenkins
  • an FTP server (to store external/legacy dependencies)
  • an “internal” release server to store releases of private modules => a pypi server


The Dev part of #Devops

Goal : get a versioned package every time a successful commit was made. 

Prerequisites :
  • create a Gitlab project for every module of your application
  • create a Jenkins project for each module
    • to ensure complete isolation of builds and avoid having to manually install many dependencies on the Jenkins slaves, I installed the pyenv plugin; that way, I was able to create a fresh Python environment for each build.
  • connect them through the Gitlab Plugin
Then, for each commit into a module, a Jenkins job will run with the following steps:
  • install the python requirements of the module through pip using my own pypi server (allowing me to mix public and “closed source - internal use only” packages, and work offline).
    • to be sure that my dependency files were always up-to-date, I used (and still use) pigar to update/detect dependencies in my different modules
  • run the test suite of the module with nosetests and publish coverage
  • if all tests are OK, package the module and upload it to the local pypi server (using twine)
    • this led me to a problem: each time a new build was made, a new release was uploaded, but with the same version number. Changing the version manually with each commit was of course not conceivable. The solution came from setuptools_scm, which automatically creates an incrementing version number for each commit, so every build can be correctly identified and published.
With this done, every module composing my application was correctly and automatically versioned and published.
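For reference, the setuptools_scm part boils down to something like this in the setup.py of each module (a minimal sketch, the package name is hypothetical):

from setuptools import setup, find_packages

setup(
    name="my_internal_module",          # hypothetical module name
    packages=find_packages(),
    # let setuptools_scm compute the version from the git tags and commits,
    # producing something like "1.0.1.dev3+gabc1234" a few commits after a tag
    use_scm_version=True,
    setup_requires=["setuptools_scm"],
)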

The 'vO' part of #Devops

Subtitle: the 'vO' part is the link between the Dev part and the Ops part, often underestimated or neglected.

Goal : transform the package into a complete and installable application

Prerequisites :
  • have a machine similar to the production one
  • don’t be afraid of writing native scripts :) (I was used to writing bash scripts; I (re)discovered batch scripts for Windows and… duh, that was rough, but satisfying when everything finally worked!)

To make it easy to bundle a new package when needed (and to store scripts, config files…), a new project needs to be created, with an associated Jenkins job running the following steps:
  • create a Python virtualenv (the pyenv plugin unfortunately doesn't work on Windows, so I had to do it manually) and activate it with the activate.bat script.
  • get the modules needed for application (through pip and our local pypi server)
  • generate the list of installed packages with the command "pip list --format=columns" and keep it for later
  • get all the needed external dependencies. In my case, I needed some legacy DLLs that I bundled into versioned zip files and stored on a local FTP server.
  • Here is an interesting trick for Python. Since Python 3.5, we can embed the interpreter (and the standard library) and ship it with applications without needing a full Python installation on the destination computer (only available for Windows users 😁 ... I'm guessing why... or not). So, let's download and extract the Python embeddable zip file.
  • Then, copy the complete site-packages directory of the previously created virtualenv into the directory where you extracted the Python runtime; this way the embedded Python will be able to find and import your modules.
  • finally, compress the tree structure into a single zip file (don't forget to name the zip according to the version of your application, or use the date (down to the minute) to identify it); a sketch of these last steps is given after this list.
At the end of this stage, we have a fully autonomous archive that lets us run the application on any Windows computer, with this simple installation process: extract and run.
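The last two steps (copying the virtualenv's site-packages next to the embedded interpreter, then building a version-stamped archive) could be scripted roughly like this (a sketch with hypothetical paths and names; the real job used batch scripts):

import datetime
import os
import shutil

EMBEDDED_DIR = "build/python-embed"              # where the embeddable zip was extracted
VENV_SITE_PACKAGES = "venv/Lib/site-packages"    # virtualenv created in the first step
APP_VERSION = datetime.datetime.now().strftime("%Y%m%d-%H%M")  # or the real application version

# copy the virtualenv's packages next to the embedded interpreter so it can import them
# (depending on the Python version, the directory may also need to be listed in the
# python*._pth file shipped with the embeddable distribution)
shutil.copytree(VENV_SITE_PACKAGES, os.path.join(EMBEDDED_DIR, "site-packages"))

# one version-stamped archive: the whole installation process is "extract and run"
os.makedirs("dist", exist_ok=True)
shutil.make_archive("dist/myapp-" + APP_VERSION, "zip", root_dir=EMBEDDED_DIR)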

The Ops part of #Devops

Goal : automate the deployment of the application 

Preamble:

This part is not the most complicated, yet it still needs some smart little tricks to ensure a correct and reliable installation and update process (update is probably the most difficult part of an automatic installation; if you don't think so, you have probably never had to update a real running application 😜). This update procedure should be as simple as possible, because if you think/hope that your final users (whoever they are, technical or not) will be happy to follow a complicated manual or launch a plethora of commands, you're wrong! (Reminder: for a normal human, a "complicated manual" starts after step 2 of your installation manual.)

You have to remember that users are even lazier than you are (yes, it is possible 😉).

Do not expect them to make an effort that you didn't make!

So, for this part, we’ll need :
  • to have access to the “production” machine (production, or pre-production in my case)
  • that’s all, pretty cool isn’t it ?

Because my production machine was on my LAN, I simply declared it as a slave of my Jenkins master and just had to create a Jenkins job tied to this slave. I automated the update process (only the update; the first installation is just a zip extraction... not too risky) with these steps:
  • get the latest released version from the previous job
  • copy it to the installation directory (a standardized directory, to ease automatic deployment, but of course never hard-coded anywhere in the code, it could be anywhere else :) )
  • launch the update script (this basic script, bundled with every installation, does a few simple things):
    • extract the new archive into a temporary directory
    • stop the server
    • launch the post-update script that comes with the new release (that way, updating to a newer release is always the responsibility of the new release)
      • in my case, this script only copies all the new files to the installation directory, but it could also launch scripts to convert previous data if needed, remove old data…
    • then, it is time to restart the server
  • here we could feel happy with what we've done and go home…. yet we're missing the major step! Test that the newly deployed application is actually running 😰. The simplest test to write here is one that accesses the main page of your application and tries to get the version (in my case, I put the version(s) in the main page footer) to compare it with the versions embedded in the archive (remember the file we generated earlier with pip list, here is why :) ); see the sketch below.
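A sketch of that final check, using only the standard library (the URL, the footer markup and the file/module names are hypothetical):

import urllib.request

def deployed_version(url="http://myapp.example/"):
    page = urllib.request.urlopen(url, timeout=10).read().decode("utf-8")
    # assumes the footer contains something like: <span id="version">1.4.2</span>
    marker = '<span id="version">'
    start = page.index(marker) + len(marker)
    return page[start:page.index("</span>", start)]

def released_version(packages_file="installed_packages.txt", module="my_app_module"):
    # packages_file is the list generated earlier with "pip list --format=columns"
    with open(packages_file) as listing:
        for line in listing:
            parts = line.split()
            if len(parts) == 2 and parts[0] == module:
                return parts[1]
    raise RuntimeError(module + " not found in " + packages_file)

assert deployed_version() == released_version(), "deployed version differs from the released one!"

If the assertion fails, the Jenkins job fails, and you know immediately that the deployment went wrong.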

What did I learn/gain ?

This process helped me (much more than once) to detect problems related to packaging (oops, I forgot to update dependency X... for example) and compatibility (arf, if you change some internal file format, do not forget to support (and convert) older formats), and it taught me many tricks for writing batch files (use the ping command to simulate a sleep, because "no, there is no sleep command in batch" 🙅).

Automating all those steps is not totally free; of course it took some time, but it was more of a constant little effort at the beginning of the project (let's say something like 5-10% of my time in the first weeks, with a peak of 1 or 2 days to build the first version of the full pipeline). But once it was running, it saved me a lot of time by discovering bugs right after they were created (and the sooner bugs are discovered, the faster they are corrected; refs: defect prevention reducing costs and enhancing quality, software quality at top speed, and my personal experience ;)).

It also lets me release a new version by just launching a Jenkins job and waiting a few minutes (actually less than 10 minutes), and with that:

I'm able to deliver new valuable features to my customers in a fast and safe way, and it makes customers happy and confident, which is our aim, isn't it ?
