In my last article I mentioned an unsolved challenge that needs to be discussed before I can start to actually implement the evolutionary algorithm itself:
It needs to be shown that an algorithm can be found that, using a numerical solver, evaluates the arbitrary process designs that will be generated by the evolutionary algorithm by mutation and cross-over.
Anyone with a background in process simulation can attest to the difficulty of getting process flowsheets to converge, especially if they exhibit distillation columns and recycles. This challenge is even harder with equation-oriented simulators, which have to rely on a general-purpose algorithms to solve the mass- and energy balance and cannot use tailored algorithms for each unit, unlike sequential-modular simulators.
A good comparison of these two approaches can be found on the homepage of PSE, the well known vendor of the gPROMS modelling & simulation software. PSE has developed a lot of initialization routines for the equation-oriented simulator gPROMS, but they are generally considered secrets and are not discussed publicly.
If one goes back in time into the 1970ies and 1980ies, a lot of literature shows up that deals with the problem of solving simulation flowsheets. Limited numerical capabilities of the time required “intelligent” algorithms that exploited the structure of the problem to speed up the solution time.
Although many of these ideas were originally intended to be used within a sequential-modular simulators, one can easily adapt them to an equation-oriented simulator if it is possible to quickly generate sub-equation-sets consisting only of a single unit.
The simulator available to me, my toy simulator MiniSim, only supports an equation-oriented (EO) modeling approach and solves the resulting nonlinear algebraic equation systems using a (heuristically modified) Newton-Raphson solver (you can find the code here).
It has all the problems associated with an EO solver such as difficult startup from arbitrary initial conditions. Luckily I have full control over the source code of MiniSim as it is an Open-Source project and can add new features easily. One extension was to add problem-specific Initialize() methods to each unit operation class. Inside these methods I wrote specific algorithms to pre-calculate the state variables using simple Flash routines and hard-coded mass-balances. As these algorithms have to be hand-written, I can only support a limited set of specifications. But anyway, I will solve the “real” flowsheet using an equation-oriented solver later, so a good initial guess is fine.
Furthermore, the user of MiniSim is able to create light-weight instances of Equation Systems very easily. Based on that feature, I added a Solve() method to each unit operation class, that will create a minimal equation system consisting just of the equations and variables of the unit operation as well as those of the outgoing streams. This small equation system is then solved using a Dulmage-Mendelsohn Decomposition-based solver that automatically determines a matching between variables and equations and solves them one at a time.
If the Initialize() and Solve() methods are called in succession, the model has quite reasonable initial values, that in many cases result in acceptable convergence rates for the overall problem.
Now that we are able to solve units one-by-one, or “sequential-modular”, we only need one building block for a robust calculation: the right sequence.
A LOT has already been written about the subject of flowsheet tearing. If you want to know more about this topic, refer to the works of Varma, Lau and Ulrichson, Motard and Westerberg or Gundersen.
When talking about process flowsheets, one cannot dismiss graph theory. Any process flowsheet can be represented by a directed graph consisting of flowsheet units (nodes) and streams (vertices). The challenge is to find a topological sorting of the graph. While this is easy to determine for a directed acyclic graph, process flowsheet very often have recycle streams that form closed cycles. These cycles have to be torn, and you need to decide which stream to tear. Tear streams (or sets of tear streams) are not unique, which might explain the multitude of tearing algorithms, as each algorithm has different criteria for selecting the optimal tear set.
I have created a Jupyter Notebook with an example to test out my ideas for a simple tearing algorithm. It uses the flowsheet of a pressure-swing distillation for separating methanol and acetone. This process features two distillation columns and a recycle flow from the second column to the first and was adapted from a ChemSep simulation, which in turn is based on a paper by Luyben.
Remark: Due to the way rendered flowsheets are converted to Jupyter images, they are saved as raw bitmaps with Base64 encoding. This results in very large Jupyter Notebooks. You might have to clone the entire repository to view the Notebooks, as the Notebook Viewer of Github sometimes refuses to render to large files. I need a way to compress the image before they are stored as Base64 strings in the Notebook file but so far haven’t been successful.
I decided to go with my own tearing algorithm because the existing ones are rather convoluted and are geared to find sequences that can be solved with the minimal amount of evaluations in a sequential-modular simulator. As I only need good initial guesses, I can accept an algorithm that would take long to converge, because I abort it long before it would converge naturally.
I will not go into the details of the implementation, which you can see in the Jupyter Notebook. In short, the algorithm performs the following steps:
In the image below you can see the results of the strongly connected component algorithm for this case study when the recycle stream is manually torn: Each source and sink can be solved individually (indicated by the border color), whereas each distillation column with periphery (condenser, reflux splitter and reboiler) would have to be solved as one SCC.
If the recycle stream is connected, both columns are part of the same strongly connected component (indicated by the green borders). But even in this case, which features 5 cycles, the simple heuristic works quite well and the flowsheet can be solved quickly. It might be required to perform multiple (3-5) initialization passes to be sure that every unit was calculated with reasonable inputs at least once.
My end goal is to create flowsheet topologies from simpler data structures that encode the flowsheet with only a few numbers. I talked about the Genome in an earlier article, and this time I added the ability to encode Distillation columns.
In the last article I still debated myself if I should encode distillation columns as adiabatic sections or simple columns (with reboiler and condenser).
I decided to go with alternative B. I just could not find a way to find reasonable values for the vapor and liquid reflux inlets if I could not assume the existence of a reboiler or condenser. What composition should the vapor have? What should the reflux look like? Pure solvent? Should it resemble to feed? I just don’t know yet. While I will lose some solutions from the design space, I can at least calculate some designs already now, and maybe I can find a way to add absorbers or stripping columns to the genome later in my article series.
You can find the extended mapping algorithm in another Jupyter Notebook. Here I hand-crafted a simple genome describing two distillation columns in sequence. In the image below you can see that the sequencing algorithms is applied to the resulting phenotype and the algorithm can actually solve this flowsheet very easily.
Now that I was confident that I could represent distillation columns in the genome and solve the resulting flowsheet models I build yet another small Jupyter Notebook that generates random flowsheets consisting of flashes and distillation columns up to a pre-defined depth.
One such randomly generated flowsheet can be seen in the picture below. Once again the Tarjan’s algorithm performs its magic and a reasonable sequence can be identified and the flowsheet can be solved quite easily even with a toy simulator like MiniSim.
In this article I have solved and discussed some of the open questions of this article series:
The next articles will cover the following topics:
So until next time, stay tuned and contact me if you have any questions, remarks or ideas. I would be happy to discuss the topic adressed in this post with anyone interested.
Written on January 28th, 2020 by Jochen Steimel