TechWorkRamblings

by Mike Kalvas

202107272306 Data-last pipe design

Contrasts with 202107272307 Data-first pipe design.

Background

To start, the basic idea here is that the “data” or “object” is passed as the last parameter to a function. Consider the standard library List.map function from OCaml.1

let numbers = [1, 2, 3];
(* numbers is the "data" and the last parameter *)
let listTwo = List.map(a => a + 1, numbers);

Data-last pipe design is common in functional languages because of currying #thread and partial application #thread.

Let’s take the example from above and break it up taking advantage of currying and partial application.1

let addOneToList = List.map(a => a + 1);
let listA = [1, 2, 3];
let listB = addOneToList(listA);

It’s obvious in this example why data-last makes sense. Partially applying the function gives us a new function that we can reuse on different data objects. This is a powerful (de)composition #thread mechanism and the style of programming building up composed functions without specifying their parameters is called point-free programming #thread. Point-free programming is only possibly because of currying and partial application.

Functional languages supported currying by default and therefore naturally gravitated towards data-last conventions.

The Pipe (last) Operator |>

The last major reason that data-last was widely adopted was because the pipe operator.1 It was introduced in the theorem proving language written in StandardML called Isabelle and later adopted by others like OCaml, Haskell, F#, or Elm. The value it brought to functional language centers around the verbosity and ergonomics of chaining function calls and their outputs.

(* Without |> we have to nest calls or assign temp variables *)
let getFolderSize = folderName => {
  let filesInFolder = filesUnderFolder(folderName);
  let fileInfos = List.map(fileInfo, filesInFolder);
  let fileSizes = List.map(fileSize, fileInfos);
  let totalSize = List.fold((+), 0, fileSizes);
  let fileSizeInMB = bytesToMB(totalSize);
  fileSizeInMB;
};

(* With |> we can chain the output of each function to the next *)
let getFolderSize = folderName =>
  folderName
  |> filesUnderFolder
  |> List.map(fileInfo)
  |> List.map(fileSize)
  |> List.fold((+), 0)
  |> bytesToMB;

You can see how this is much cleaner. But this raises another question — why doesn’t this solve the type inference problem? The data is now top-left of the function. The answer is simply that the |> pipe operator is an infix function #thread. This means the pipe is re-written to a function call and the body of the callback is evaluated before the map as a whole.

Advantages


  1. Chávarri, J. (2019, May 10). Data-first and data-last: A comparison. Javier Chávarri. https://www.javierchavarri.com/data-first-and-data-last-a-comparison/ 2 3