There is no such thing as a free Free monad

August 25, 2017·orientman·7 min read·Posts In English (Wpisy po angielsku)

dotnet async biometry decoupling durabletask fsharp monads tpl

Disclaimer This article was also published on my company blog.

Random musings on managing state, side effects and decoupling.

Intro

This post supposed to be about DurableTask ~~framework~~library and how we use it at FinAi to orchestrate biometric authentication process. In the meantime, I got inspired by Mark Seemann's writings about "pure interactions". Go read it - it's wonderful! I call DurableTask library because it's very easy to decouple your application logic from it (at least in F#). Yeah!, decoupling is great, isn't it? Believe me, you can go astray with decoupling. See for yourself.

Biometric authentication

In essence (and vast simplification) biometric authentication process looks like this:

acquire user's identity document scan
verify it and if valid continue
acquire user's face photo
verify it with document
publish result

Which can be written in pseudo-code:

let verifyUser getDocImage verifyDoc getFaceImage verifyFace publishResult =
    let docId = getDocImage ()
    let (result, docVerificationId) = verifyDoc docId
    if result
    then
        let faceId = getFaceImage ()
        let result = verifyFace docVerificationId faceId
        publishResult result
    else
        publishResult false

We have lots of dependencies (this way it's almost like using constructor injection in C#) but we can mock & test it for sure. If we want to go asynchronous we need to wrap it with async {} block and add bangs (!) everywhere. But what if process is long running? For example: doc/face verification in some corner cases could be done by "mechanical Turk". What about saving state? What happens if process crashes? Can we run this code on multiple machines? And so on...

The question is, can we decouple it little more?

Going meta

Biometry language definition (again thanks Mark Seemann for inspiration):

type DocId = Guid
type FaceId = Guid
type DocVerificationId = Guid
 
type BiometryInstruction<'a> =
| GetDocImage of (DocId -> 'a)
| VerifyDoc of (DocId * (bool * DocVerificationId -> 'a))
| GetFaceImage of (FaceId -> 'a)
| VerifyFace of (DocVerificationId * FaceId) * (bool -> 'a)
| PublishResult of bool // always last instruction -> no continuation
 
type BiometryProgram<'a> =
| Free of BiometryInstruction<BiometryProgram<'a>>
| Pure of 'a

BiometryProgram equivalent to pseudo-code above:

let verifyUserProgram () =
    Free (GetDocImage (
        fun docId -> Free (VerifyDoc (
            docId,
            fun (result, docVerificationId) ->
                Free (GetFaceImage (
                    fun faceId ->
                        if result
                        then
                            Free (VerifyFace (
                                (docVerificationId, faceId),
                                fun result -> Free (PublishResult result)))
                        else
                            Free (PublishResult result)))))))

Ugly, isn't it? (or beautiful if you're Lisp fanatic)

Let's add some syntactic sugar (most of this code is boilerplate which languages with more powerful type systems - like Haskell - can generate):

// BEGIN: Monadic stuff that Haskell does automatically
module BiometryMonad =
    let private map f = function
    | GetDocImage next -> GetDocImage (next >> f)
    | VerifyDoc (x, next) -> VerifyDoc (x, next >> f)
    | GetFaceImage next -> GetFaceImage (next >> f)
    | VerifyFace (x, next) -> VerifyFace (x, next >> f)
    | PublishResult x -> PublishResult x
 
    let rec bind f = function
    | Free instruction -> instruction |> map (bind f) |> Free
    | Pure x -> f x
 
type BiometryBuilder() =
    member __.Bind (x, f) = BiometryMonad.bind f x
    member __.Return x = Pure x
    member __.ReturnFrom x = x
 
let biometry = BiometryBuilder ()
// END
 
// shortcuts for instructions
let getDocImage = Free (GetDocImage Pure)
let verifyDoc docId = Free (VerifyDoc (docId, Pure))
let getFaceImage = Free (GetDocImage Pure)
let verifyFace docVerificationId faceId =
    Free (VerifyFace ((docVerificationId, faceId), Pure))
let publishResult r = Free (PublishResult r)

Now we can write BiometryProgram that looks almost exactly like our pseudo-code at the beginning:

let verifyUserProgram () =
    biometry {
        let! docId = getDocImage
        let! (result, docVerificationId) = verifyDoc docId
        if result
        then
            let! faceId = getFaceImage
            let! result = verifyFace docVerificationId faceId
            return! publishResult result
        else
            return! publishResult false
    }

Going down

Very readable but how to run it? We must write an interpreter. Let's start with basic synchronized version. Instead of taking photos & doing real verification we will generate GUIDs and return true :)

module SyncInterpreter =
    let rec interpret = function
    | Pure x -> x
    | Free (GetDocImage next) -> Guid.NewGuid() |> next |> interpret
    | Free (VerifyDoc (docId, next)) ->
        printfn "VerifyDoc %A" docId
        (true, Guid.NewGuid()) |> next |> interpret
    | Free (GetFaceImage next) -> Guid.NewGuid() |> next |> interpret
    | Free (VerifyFace (request, next)) ->
        printfn "VerifyFace %A" request
        true |> next |> interpret
    | Free (PublishResult result) -> printfn "Result is %A" result

Result of running it in REPL:

> verifyUserProgram () |> SyncInterpreter.interpret;;
VerifyDoc 4b8b1115-4f8f-4e31-8c34-22a518064066
VerifyFace (d6db0275-56d2-4bb0-bfdb-37e649efb0f6, e0f007c7-e822-48f2-b72d-df88e2b72823)
Result is true

Now we can go asynchronous without (!) modifying original program:

module AsyncInterpreter =
    let rec interpret = function
    | Pure x -> x
    | Free (GetDocImage next) ->
        async { return! Guid.NewGuid() |> next |> interpret }
    | Free (VerifyDoc (docId, next)) ->
        async {
            printfn "VerifyDoc %A" docId
            return! (true, Guid.NewGuid()) |> next |> interpret
        }
    | Free (GetFaceImage next) ->
        async { return! Guid.NewGuid() |> next |> interpret }
    | Free (VerifyFace (request, next)) ->
        async {
            printfn "VerifyFace %A" request
            return! true |> next |> interpret
        }
    | Free (PublishResult result) ->
        async { do printfn "Result is %A" result }

Result of running it in REPL:

> verifyUserProgram () |> AsyncInterpreter.interpret |> Async.RunSynchronously;;
VerifyDoc 3c45965f-306b-4807-aa75-8f838e5eff67
VerifyFace (4361da1d-3428-48f5-9b8c-b7d016519562, 271be688-e61f-4ac5-85b2-112c0aebc72b)
Result is true

Reaping rewards

At last, let's use DurableTask to address our reliability questions. This library "allow users to write long running persistent workflows" and in our situation it makes process:

Resilient: State will be persisted at "checkpoints" (e.g. image capture, results of doc/face verification) and restored if needed (e.g. after crash or for audit/monitoring purposes)
Scalable: Backend leverages Azure Service Bus (or - in new version - Service Fabric) and what started on one node can continue on another

DurableTask is OOP friendly and it hurts my FP eyes but thanks to decoupling we can reuse verifyUserProgram:

// abstract external dependencies serving as "checkpoints" for Orchestration
type IBiometryActivities =
    abstract VerifyDoc: DocId -> Task<bool * DocVerificationId>
    abstract VerifyFace: DocVerificationId * FaceId -> Task<bool>
    abstract PublishResult: bool -> Task<unit>
 
type BiometryOrchestration() =
    inherit TaskOrchestration<unit, DocId, FaceId, string>()
 
    let mutable tcs = new TaskCompletionSource<FaceId>()
 
    // bind operator (fancy Task.ContinueWith)
    let (>>=) (x: Task<_>) f = task { let! x' = x in return! x' |> f }
 
    let run (activityClient: IBiometryActivities) docId =
        let rec interpret = function
        | Pure x -> x
        | Free (GetDocImage next) -> docId |> next |> interpret
        | Free (VerifyDoc (docId, next)) ->
            docId |> activityClient.VerifyDoc >>= (next >> interpret)
        | Free (GetFaceImage next) -> tcs.Task >>= (next >> interpret)
        | Free (VerifyFace (request, next)) ->
            request |> activityClient.VerifyFace >>= (next >> interpret)
        | Free (PublishResult result) -> result |> activityClient.PublishResult
        verifyUserProgram () |> interpret
 
    override __.RunTask(context, docId) =
        let activityClient = context.CreateClient<_>()
        tcs <- new TaskCompletionSource<FaceId>()
        run activityClient docId
 
    override __.OnEvent(_, _, faceId) = faceId |> tcs.TrySetResult |> ignore

Most interesting points:

Orchestration process starts when identity document arrives (DocId is an input)
Face image (FaceId) can arrive at any time (as an event sent to Orchestration)
Because RunTask method can be run many times (replaying all previous events and activities results) we must reset all "state" (TaskCompletionSource) each time
Code inside RunTask should be side effects free (BiometryProgram is great fit here) or wrapped with Task activity (that's the job for interpreter)
DurableTask depends on TPL semantics and that's why TaskBuilder.fs is used instead of default F#'s async (see Tomas Petricek post on C#/F# async differences)

Final thoughts

Surprise, surprise, I didn't go this route in my production code. Purity and decoupling are great but I value simplicity more. IMHO using such heavy abstraction (counted more in mental operations than lines of code) is not justified for one-off use. What's your opinion? Anyway I found it interesting enough to write this post. DurableTask definitely deserves post on its own which will come some day. Bye!