Chemical Code a blog about simulation and programming

Pyflowsheet - Part 2: Clean Code and Refactoring

So now that a week has passed between the initial release of the pyflowsheet package has passed, maybe it is time to reflect on the code. In my opinion, it is the duty of the software developer, to deliver the best possible code that he or she is able to create. Does this mean that the pyflowsheet code is already good? Of course not! Nobody get’s it right the first time. That is why refactoring is so important in software engineering to ensure that code quality is high, and we do not create an unmaintainable mess. Or, as some other person would say: create legacy code at the speed of typing.

Since many years I have been a fan of the clean-code movement: starting my journey with the German branches (as propagated by Stefan Lieser and Ralf Westphal) to discovering the American school around Robert C. “Uncle Bob” Martin I was always intrigued with the ideas taught by “Clean Code”. Making code re-usable, even after years. Making code evolvable (because let’s face it: requirements WILL change in the future). Making code transferable to other authors.

During the rapid prototyping development phase of pyflowsheet (the alpha version was written in 4 days before Christmas) a few “code-smells” crept into repository. Things that I knew were bad, but couldn’t figure out a way to do correctly (at the time of coding).

About Clean Code

But before we go into the details of these smells, maybe we should recapitulate what Clean code should embody. In the namesake book, Uncle Bob recites a few anecdotes from his colleagues:

Taking these definitions into account, it becomes quite apparent that the pyflowsheet codebase (at the current state) does in no way embody clean code. But fret not, as mentioned earlier “nobody gets it right the first time”, so we still have a chance to fix it (before it is too late).

The code is still quite young and small, so changes can be implemented easily. There are no users “in the wild” which one could anger with breaking API changes, and even if they are, the alpha state of the software makes it clear that changes are to be expected. Once the software reaches 1.0 status, changes will be MUCH more difficult to implement, so I better get started fast!

The problems of the current design

The most obvious code smells are the following:

if self.internals and self.internals.lower() == "tubes":
    for i in range(1, 5):
        x = self.position[0] + self.size[0] / 5 * i
        start = (x, self.position[1] + capLength)
        end = (x, self.position[1] + self.size[1] - capLength)
        ctx.line(start, end, self.lineColor, self.lineSize)

if self.internals and self.internals.lower() == "bed":
    start = (self.position[0], self.position[1] + capLength)
    end = (
        self.position[0] + self.size[0],
        self.position[1] + self.size[1] - capLength,
    )
    ctx.line(start, end, self.lineColor, self.lineSize)

    start = (self.position[0] + self.size[0], self.position[1] + capLength)
    end = (self.position[0], self.position[1] + self.size[1] - capLength)
    ctx.line(start, end, self.lineColor, self.lineSize)

Violations of the principles of Clean Code

The current design around unit internals violates several principles of SOLID code:

How to improve the code?

Now that the issues of the current design are understood, what will be the solution?

The obvious solution to the problem in which the state or identity influences the behavior is of course Polymorphism. In the improved (I dare not say final) design, internals shall be encapsulated in small classes, that describe how the symbols should be drawn on the diagram. In addition I will apply the Composite pattern to model the whole-part relationship between a Unit Operation and the internals.

 def draw(self, ctx):

        #private method that handles the drawing of the "pill" shape
        self._drawVesselShell(ctx)

        # Composite Pattern 
        for internal in self.internals:
            internal.draw(ctx)

        super().draw(ctx)
        return

Instead of having a series of if-conditions that test what should be drawn on the image, the draw-method of the unit operation will call the draw-methods of all the internals and delegate the act of drawing to those composite elements. By drawing the internals from within the draw-method of the unit operation, the same rotation transform will also be applied to the internals.

For example, a vessel instance would in future be created with the following function call:

Vessel("U10","Vessel (with tubes)", position=(200,300), size=(80,40),internals=[TubesInternals] )

Notice the list literal at the end: this will allow the user to stack different internals onto each other to create combinations unforeseen by the library developer. In addition, users can create new derived classes from the BaseInternal class and implement their own drawing logic for icons not supported by the pyflowsheet package.

The code to create a drawing of a fermenter could look like the following: create a vertical vessel with external coils, an impeller stirrer and a sparger for gas diffusion.

fermenter = Vessel("U10","Bio-Fermenter", position=(200,300), size=(40,100), internals=[ExternalCoils, Impeller, Sparger] )

I now have to figure out how the geometric information from the parent unit operation shall be propagated to the internals, and how to handle vertical and horizontal rotation for the vessel icon. I will implement the new system for internals in the coming weeks and it will probably lead to a new release of the library on the PyPI package manager.

Summary

The quest for clean code is never finished. It is more a craving for an ideal state, but one must accept that it cannot be realistically obtained. What we can do instead, is treat our software as we treat our physical technical systems: Attend to them regularly, switch out broken parts, buff up the paint and leave the system in a better state then we found it in. Writing code is not a fire-and-forget activity, but requires regular maintenance and care.