### Introduction

We have now completed the core of the application, but in order to use it, we still have to stich the different parts together. We will also take this opportunity to smooth out some rough edges in the current implementation. Although we are not going to go deep into advanced functional programming, we would like to sketch basic ideas that will allow for a more elegant solution. You shouldn't need prior knowledge of the underlying abstract concepts to follow the rest of the tutorial.

#### Computational Model

While it is quite clear what to do, there are many different ways to implement the processing steps. In our case we will try to follow the monadic model of computation that we can represent as follows:

``````read input data ----> error
|
|
v
|
|
v
parse formulas -----> error
|
|
v
evaluate formulas --> error
|
|
v
save output --------> error
|
|
v
success
``````

What this means is that we define the sequence of computations in such a way that it will break automatically if an error occurs. In this way we don't need to write error handling code for each step, we will "collect" the errors and handle them in once place.

###### Side Note

If you are not familiar with `for` expressions in Scala, this is just some syntactic sugar that converts a series of nested `flatMap` and `map` calls to a sequence in a `for` block. The following example should clarify this idea:

``````val A: Option[Int] = Some(2)
val B: Option[Int] = Some(3)
val result = A.flatMap(a => B.map(b => a * b))
result == Some(6)
``````

It looks much simpler if we rewrite it like this:

``````val result = for {
a <- A
b <- B
} yield a * b
``````

#### Implementation Idea

From the above example it is clear that we need a container type that can propagate possible errors occuring at any step in the computation sequence. Which standard Scala type is best suitable for the job? `Option` is not suitable, because it can't hold additional information like an error message. `Try` also doesn't quite fit, because we have built our parsing logic without throwing exceptions. We are left with `Either`, so we can decide to specialize it like `Either[String, T]`, in order to be able to output error messages to the user.

An initial sketch of the program then is going to look somewhat like this:

``````val result = for {
trees <- parseFormulas(formulas)
output <- process(input, trees)
msg <- Data.writeOutput(arguments.output, output)
} yield msg
``````

This compositional style has the added advantage that we can build up the solution step-by-step, without needing the full sequence of computations as shown above. Before we define the missing functions above, we have to make sure that everything we have developed so far is compatible with `Either[String, T]`, as we show next.

### House Cleaning

We can start updating the main function by considering the csv loading functions only:

``````def main(args: Array[String]): Unit = {
val arguments = parseCliArguments(args)
val result = for {
} yield formulas
}
``````

It is still possible to compose them, because both return `Try`, as we shall see next.

#### I/O Functions

When we started, we created functions that load csv files and convert the data to the corresponding row format of the input data / formula definition rows:

``````def loadInput(path: String): Try[List[InputDataRow]] = loadCsv[InputDataRow](path)
``````

Let us first observe that if the functions fail due to a wrong file path, permission problems, corrupted files, etc., then the only thing we need to do is output the corresponding error message. We don't need to handle the full error object. This means that we can freely convert `Try` to `Either[String, T]` as follows:

``````def loadInput(path: String): Either[String, List[InputDataRow]] = loadCsv[InputDataRow](path)
``````

Looks like we need to update the `loadCsv` function as well. Luckily for us, `Try` has a convenience method `toEither` that we are going to use. Unfortunately, we note that it returns `Either[Throwable, T]`. This means that we have to convert it to `Either[String, T]` explicitly:

``````.toEither match {
case Left(error) => Left(error.getMessage)
case Right(msg) => Right(msg)
}
``````

There could be a more elegant way to solve this, but we are willing to make a compromise for comprehisibility's sake. The full listing of `loadCsv` looks like this:

``````def loadCsv[T](path: String)(implicit ct: ClassTag[T]): Either[String, List[T]] = {
val mapper = new CsvMapper with ScalaObjectMapper
mapper.registerModule(DefaultScalaModule)
mapper
.asInstanceOf[java.util.Iterator[T]]
.asScala
.toList
}
}.toEither match {
case Left(error) => Left(error.getMessage)
case Right(msg) => Right(msg)
}
``````

In order to proceed, we will define a stub for the `FormulaParserApp.process` function that for now will serve as a 'glue' between the input and output only:

``````def process(input: List[InputDataRow], trees: List[FormulaRow]): Either[String, List[String]] = Right(List())
``````

With this change, we can update the main function as follows:

``````def main(args: Array[String]): Unit = {
val arguments = parseCliArguments(args)
val result = for {
} yield process(input, formulas)

result match {
case Left(errorMessage) => println(s"Error: \$errorMessage")
case Right(msg) => println(s"Success: \$msg")
}
}
``````

We are still missing a write function for the output data, so let's add one:

``````def writeOutput(path: String, data: List[String]): Either[String, String] = Try {
val bw = new BufferedWriter(new FileWriter(new File(path)))
bw.write(data.mkString("\n"))
bw.close()
s"Wrote \${data.size} lines"
}.toEither match {
case Left(error) => Left(error.getMessage )
case Right(msg) => Right(msg)
}
``````

Since this represents the final stage of the computation, we may as well update the main function:

``````val result = for {
output <- process(input, formulas)
msg <- Data.writeOutput(arguments.output, output)
} yield msg
``````

Now that we have defined the beginning and the end of this simple computational pipeline, let us focus on tidying up some earlier code.

#### Parsing All Formulas

The critical step in this tutorial is obviously the parsing process. We will start by defining a stub for the parsing function based on a compatible signature:

``````def parse(formulas: List[FormulaRow]): Either[String, List[(Long, FormulaAST)]] = Right(List())
``````

In order to feed the resulting formula ASTs to the processing function, it should accept this type as well:

``````def process(input: List[InputDataRow], trees: List[(Long, FormulaAST)]): Either[String, List[String]] = Right(List())
``````

With this, we are now ready to declare the full computational pipeline:

``````val result = for {
trees <- parse(formulas)
output <- process(input, trees)
msg <- Data.writeOutput(arguments.output, output)
} yield msg
``````
##### Parse Step

What we need to do next is obviously iterate over the elements of `List[FormulaRow]` and parse each formula:

``````val trees = formulas.map(row => FormulaParser.parse(row.formula))
``````

In addition, we also need the formula ids for the evaluation step, so we can extend the above line to:

``````val trees = formulas.map(row => FormulaParser.parse(row.formula).map(ast => (row.id, ast)))
``````

Let's stop for a moment and look back at the `FormulaParser.parse` function, did we forget something?

``````def parse(formulaExpression: String): Either[String, FormulaAST] =
// Unfortunately ParseResult does not provide flatMap
FormulaLexer.tokenize(formulaExpression) match {
case FormulaLexer.Success(tokens, _) =>
case Success(formulaAST, _) => Right(formulaAST)
case Failure(msg, _) => Left(msg)
case Error(msg, _) => Left(msg)
}
case FormulaLexer.Failure(msg, _) => Left(msg)
case FormulaLexer.Error(msg, _) => Left(msg)
}
``````

This doesn't look like the most elegant code ever written. The reason is that the Scala parser combinator library returns a `ParseResult[T]` object that doesn't have a `flatMap` function. We did not consider composition at the time, but now we could improve on this.

First, take a look at `tokenize` function of `FormulaLexer`:

``````def tokenize(formula: String): ParseResult[List[FormulaToken]] = parse(tokens, formula)
``````

We have to convert it to `def tokenize(formula: String): Either[String, List[FormulaToken]]`, so let's match the different types that `ParseResult` can return:

``````  def tokenize(formula: String): Either[String, List[FormulaToken]] = parse(tokens, formula) match {
case Success(tokens, _ ) => Right(tokens)
case Failure(msg, _) => Left(msg)
case Error(msg, _) => Left(msg)
}
``````

With this we can also update the `FormulaParser.parse` function that now looks like this:

``````def parse(formulaExpression: String): Either[String, FormulaAST] =
FormulaLexer
.tokenize(formulaExpression)
case Success(ast, _) => Right(ast)
case Failure(msg, _) => Left(msg)
case Error(msg, _) => Left(msg)
})
``````

Now that we can produce a list of parsed formulas in the form `List[Either[(Long, FormulaAST)]]`, we need to make a design decision. How do we handle formulas that fail to parse? If a formula is invalid (syntax error), it will produce `Left[String]`, but we can still retain a mix of errors and correctly parsed formulas even when multiple formulas in the configuration are invalid. Do we ignore the failed formulas and how do we notify the user?

Since the definition of the formulas is not the job of our application, we can now decide not to process the data if we encounter faulty formulas. This means that the application will have to terminate with an error message about the syntax error when the first such case is encountered. If all formulas are correct, then we proceed to compute the results.

Up until now we have automatically handled the error case by supplying `Either[String, T]` as the return type of the individual computational steps. But the parsing of the formulas occurs inside the `FormulaParserApp.parse` function, where we have defined the perfectly valid list of results `val trees: List[Either[String]]'. We need to do some extra work to look for errors and pass the first one as the result of the`FormulaParserApp.parse` function.

Scala collections provide a very convenient function to get the first element of the type we are looking for in a `List`:

``````val errorOption: Option[String] = trees collectFirst { case Left(error) => error }
``````

We can also collect all correctly parsed formulas as follows:

``````val correctASTs = trees.collect { case Right(r) => r}
``````

The only thing that remains to do is to convert `errorOption` into an `Either[String, List[]]` that the `FormulaParserApp.parse` function can return:

``````val result: Either[String, List[(Long, FormulaAST)]] = errorOption.toLeft(correctASTs)
``````

If `errorOption` is filled, `toLeft` will return `Left(errorMessage)`, on the other hand it will return `toRight(List(...))` if `errorOption` is `None`.

We can now finalize the `parse` function by sorting the list of formula syntax trees as follows:

``````  def parse(formulas: List[FormulaRow]): Either[String, List[(Long, FormulaAST)]] = {
val trees = formulas.map(row => FormulaParser.parse(row.formula).map(ast => (row.id, ast)))
trees
.collectFirst { case Left(error) => error }
.toLeft(trees.collect {case Right(r) => r}.sortBy(_._1))
}
``````

Now if there is a bad formula definition anywhere in the formula configuration file, the program is going to output an error message and halt.

As a final touch, we might want to clarify the error messages a bit in the `FormulaLexer.tokenizer`

``````case Failure(msg, input) => Left(s"Lexer Error: '\$msg'. Input: \$input")
case Error(msg, input) => Left(s"Lexer Error: '\$msg'. Input \$input")
``````

and `FormulaParser.parse` functions:

``````case Failure(msg, input) => Left(s"Parser Error: '\$msg'. Input: \$input")
case Error(msg, input) => Left(s"Parser Error: '\$msg'. Input: \$input")
``````

#### Formula Evaluation

We have come a long away, but we still need to complete the final computational step: evaluating the parsed formulas for every row of input data. We have already stubbed the `FormulaParserApp.process` function, now let's think about the design.

First, we have to convert the `InputDataRow` object to a compatible `Map[Int, Double]` that the `FormulaEvaluator.evaluate` function accepts. The most obvious way to do this is to define a member function `InputDataRow.toMap` of the `InputDataRow` class. If we take a look at the data, we will notice that not all fields have numerical values:

vin readout_date brand model mileage coolant_temperature oil_pressure ignition_cycles tyre_pressure
1GKS1AECXFR441593 24.04.2018 Volkswagen Eurovan 173888 29.91 45.04 813 3.78
SCBBR93W678635158 17.07.2010 Suzuki X-90 29902 92.99 12.18 665 2.35
WAUSG78E26A094153 04.01.2001 Hyundai Equus 150406 107.78 22.49 2550 1.4

Since our numbering scheme for the fields starts from the beginning:

field number
vin 1
brand 3
model 4
mileage 5
coolant_temperature 6
oil_pressure 7
ignition_cycles 8
tyre_pressure 9

we can see that we should convert only the numerical fields:

``````  def inputMap: Map[Int, Double] = Map(
5 -> mileage.toDouble,
6 -> coolant_temperature,
7 -> oil_pressure,
8 -> ignition_cycles.toDouble,
9 -> tyre_pressure
)
``````

With this detail out of the way, we can now think about the format of the output row. As we have explained in the beginning, we are going to use the formula ids to define the columns of our output table. Each row in the output table will contain formula results calculated from the corresponding row in the input table:

1 2 3 ... n
row 1 formula 1 row 1 formula 2 row 1 formula 3 ... row 1 formula n
... ... ... ... ...
row n formula 1 row n formula 2 row n formula 3 ... row n formula n

Let us start by creating the csv header. This is just a matter of joining the formula ids that we already have in the formula list:

``````val header = formulas.map(_._1.toString).mkString(",")
``````

Next, we can evaluate each formula for the given row input:

``````val output = input.map(row =>
formulas.map(formulaPair => FormulaEvaluator.evaluate(formulaPair._2, row.inputMap).mkString).mkString(",")
)
``````

Note that we convert the `Option[Double]` result of `FormulaEvaluator.evaluate` to a `String` in such a way, that we are going to produce an empty string if the option is empty.

Finally, we can stitch together the `process` function as follows:

``````def process(input: List[InputDataRow], formulas: List[(Long, FormulaAST)]): Either[String, List[String]] = {