What are transducers and why are they cool?

A few weeks ago, while browsing for reducer functions, I stumbled upon the term "transducer". Because I really like simple and declarative data transformations in code (e.g. with map and reduce), I looked into the topic. At first, transducers can be kind of hard to grasp, but once you get them, you'll have an elegant technique for efficient and composable transformations in your "developer toolbelt".

What does it do?

The name "transducer", a blend of "transform" and "reduce", describes its functionality pretty well. With a transducer, you transform/map data from one value to another and reduce it, i.e. produce a new value by combining the values. Note that "reduce" does not necessarily mean ending up with fewer values than before when applying it to a data structure.

Let's take a look at the following code which deals with an array.

const append = (arr, x) => {
    arr.push(x);
    return arr;
}

transduce(
  compose(
    map(x => x * 2),
    filter(x => x > 5),
    map(x => x * 5)
  ),
  append,
  [],
  [1,2,3,4,5]
);
// will result in [30, 40, 50]

We declare a helper function append, which simply adds a value to an array. Then we use a transduce function, which expects 4 arguments: The first argument is a function which gets applied to each of the array's values. In this case we compose a function, which 1. multiplies each value by 2, 2. filters with > 5 and 3. multiplies the remaining values by 5. Pretty simple.

The 3 other arguments deal with the data structure, which here is an array. append helps us add values to it. With [], we tell it what the initial array should look like. The last argument is the actual input for the transducer, an array from 1 to 5.

Why should I care?

You might now think: why should transducers be cool when you can do the previous example with a basic JS one-liner?

[1,2,3,4,5].map(x => x * 2).filter(x => x > 5).map(x => x * 5);
Efficiency

Firstly, the transformation we composed with a transducer will not create intermediate results. While the one-liner is definitely short, it will create the arrays [2,4,6,8,10], [6,8,10] and [30,40,50] along the way. For such a trivial example this makes no difference, but when the array is huge and/or the transformation functions are expensive, performance will suffer.
A transducer instead passes each value through the whole composed transformation in a single pass and does not generate such intermediate arrays. Secondly, the code is clear and declarative, which is often not the case when using loops.

Composability by being data structure-agnostic

The comparison of the examples is not quite fair, either. The transducer makes no assumptions about the underlying data structure, which is why we passed in the initial value and the append function, while JS' map and filter functions only work on arrays.
It's easy to adjust the transducer for Sets by using the initial value new Set() and an append function based on set.add(). The functions filter and map are not affected by such a change and can stay the same.
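To illustrate the switch to Sets, here is a sketch that reuses the transformation from the first example (minimal stand-in definitions for transduce, compose, map and filter are included so the snippet runs on its own; a real library's versions will differ):

```javascript
// Minimal stand-ins for the functions used in the examples above
const compose = (...fns) => x => fns.reduceRight((acc, f) => f(acc), x);
const map = f => combine => (acc, x) => combine(acc, f(x));
const filter = f => combine => (acc, x) => (f(x) ? combine(acc, x) : acc);
const transduce = (xform, combine, initial, coll) =>
  coll.reduce(xform(combine), initial);

// The combine function for Sets: Set.prototype.add returns the set itself
const addToSet = (set, x) => set.add(x);

// Same transformation as before, but reducing into a Set
const result = transduce(
  compose(
    map(x => x * 2),
    filter(x => x > 5),
    map(x => x * 5)
  ),
  addToSet,
  new Set(),
  [1, 2, 3, 4, 5]
);
// result is Set { 30, 40, 50 }
```

Only the initial value and the combine function changed; the composed transformation itself stayed untouched.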
When you use a transducer library like transducers.js, you're able to work with various data structures like arrays, objects and iterables. You'll also get Rx.js-like transformations like take, dedupe or dropWhile on top 🔥

How do they work?

So now let's take a look at how transducers can work internally. First, we play around with our well-known reduce and find out what it's able to do.
Then we transform it step by step into a transducer :)

The power of reduce


The most basic transformation of all is reduce. For example, you can do a classic filter or map with a reduce as well:

const arr = [1, 2, 3, 4]
function map(f, arr) {
  return arr.reduce((result, x) => {
    result.push(f(x));
    return result;
  }, []);
}

function filter(f, arr) {
  return arr.reduce((result, x) => {
    if(f(x)) { result.push(x); }
    return result;
  }, []);
}

map(x => x + 1, arr); // [2, 3, 4, 5]
filter(x => x <= 2, arr); // [1, 2]

For mapping, we simply reduce the values from an array into a new one by pushing them into an initially empty array [] after transforming each value with f. For filtering, we only add a value if the predicate f evaluates to true.

Getting generic

Now we adjust our map function a little by extracting the reduce, so it becomes more generic.

function map(f) {
  return (acc, x) => {
    acc.push(f(x));
    return acc;
  }
}
arr.reduce(map(x => x + 1), []); // [2, 3, 4, 5]

Okay cool, so we just have to provide a function f to map and we get back a function with the signature (accumulator, value) -> accumulator, which we can feed right into reduce! But wait, something keeps it tightly coupled to the array data structure: we use push() to combine the value with the accumulator, which will only work on arrays.
To work with any data structure, which might not have a method called push, let's provide this combine function as an argument as well:

function map(f) {
  return function(combine) {
    return (acc, x) => {
      return combine(acc, f(x));
    }
  }
}

const append = (arr, x) => {
    arr.push(x);
    return arr;
}

arr.reduce(map(x => x + 1)(append), []); // [2, 3, 4, 5]

As you can see, we now pass both the transformation function f and the combine function to map. Since push() is quite ugly, we use our helper function from before, append, as the combine argument.
If we write map entirely with arrow functions, it becomes clearer that we have to provide two functions, f and combine, to get our reducing function (accumulator, value) -> accumulator.

// Rewritten with arrow functions
map = f => combine => (acc,x) => combine(acc, f(x))

Putting it all together

Now comes the crazy thing: if we take a closer look at combine and what it does, you might see that it combines a value with an accumulator. You could say it reduces them, couldn't you? It also has the same signature as a reducing function, because it is one. So what does that mean?
We can provide another reducing function for combine, one which does some other things before combining the values. Take a look at map again.
For example, we could pass in map(x => x - 10)(append), because it returns a valid reducing function. So we would pass a map into a map, with an append at the end.
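We can try that nesting directly. Here is a quick sketch using the map and append from above (repeated so the snippet runs standalone):

```javascript
const map = f => combine => (acc, x) => combine(acc, f(x));
const append = (arr, x) => { arr.push(x); return arr; };

// map(x => x - 10)(append) is itself a valid reducing function,
// so we can hand it to another map as its combine step.
// The outer map's f runs first (+1), then the inner one (-10).
const arr = [1, 2, 3, 4];
const result = arr.reduce(map(x => x + 1)(map(x => x - 10)(append)), []);
// result is [-8, -7, -6, -5]
```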

We could rewrite filter as well:

function filter(f) {
  return function(combine) {
    // only combine if predicate f(x) evaluates to true
    return (acc, x) => f(x) ? combine(acc, x) : acc;
  }
}

and compose it with map, to get all values < 3, multiplied by 4 afterwards.

arr.reduce(
    //    👇predicate  👇combine                     in filter
    filter(x => x < 3)(map(x => x*4)(append)),
    //                     👆transform 👆combine     in map
    []
); // [4, 8] 

Because that's not very readable, you could write a compose function yourself or look for one on the web. Function composition allows you to write compose(f1, f2, f3, f4)(value), which gets executed as f1(f2(f3(f4(value)))).
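A minimal compose in this style can be sketched with reduceRight (one of many possible variants):

```javascript
// compose(f1, f2, f3)(x) === f1(f2(f3(x)))
const compose = (...fns) => x => fns.reduceRight((acc, f) => f(acc), x);

// Quick sanity check with plain functions
const addOne = x => x + 1;
const double = x => x * 2;
const result = compose(addOne, double)(5); // addOne(double(5)) === 11
```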

arr.reduce(
    compose(
        filter(x => x < 3),
        map(x => x*4)
    )(append),
    []
); // [4, 8] 

A last rewrite of this code introduces a transduce function, which separates the composed transformation from a combine function like append.

function transduce(transformf, combine, initial, collection) {
    return collection.reduce(transformf(combine), initial);
}

transduce(
    compose(
        filter(x => x < 3),
        map(x => x*4)
    ),
    append,
    [],
    arr
); // [4, 8]

And we're done! It was a long way, but I hope it became clear to you how a transducer works. If I didn't explain it well, take a look at the very helpful pages linked at the bottom of this page, which have even more examples.

Please note that this transducer is still bound to arrays, since it relies on the collection having a reduce method. In real transducer libraries like transducers.js, reduce is implemented for various data structures and hidden from you.
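As a rough sketch of how a library might lift this restriction (this is not the actual implementation of any real library), transduce can be built on a reduce that walks any iterable instead of calling collection.reduce:

```javascript
// A reduce that only assumes the input implements the iterator protocol
function reduceIterable(reducer, initial, iterable) {
  let acc = initial;
  for (const x of iterable) {
    acc = reducer(acc, x);
  }
  return acc;
}

function transduce(xform, combine, initial, iterable) {
  return reduceIterable(xform(combine), initial, iterable);
}

const map = f => combine => (acc, x) => combine(acc, f(x));
const append = (arr, x) => { arr.push(x); return arr; };

// Now a Set works as input just as well as an array
const result = transduce(map(x => x * 10), append, [], new Set([1, 2, 3]));
// result is [10, 20, 30]
```

Real libraries additionally dispatch on the collection type, so that objects and other structures work too.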

Stateless and stateful transducers

So far we have only seen stateless transducers, whose transformations did not keep any state. But nothing's stopping you from keeping and managing state in a transformation function. For example, we could implement take(n), which only passes the first n items down the transformation pipeline:

function take(number) {
  let i = 1;
  return (combine) => {
    return (acc, x) => {
        if(i <= number) {
            i++;
            return combine(acc, x);
        }
        return acc;
    }
  }
}
transduce(
  compose(
    take(2),
    map(function(x) { return x * 10; })
  ),
  append,
  [],
  arr
); // [10, 20]
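Another stateful transducer is dedupe, which remembers the previous value and drops consecutive duplicates. Here is a sketch, with the helpers from above repeated so it runs standalone:

```javascript
const transduce = (xform, combine, initial, coll) =>
  coll.reduce(xform(combine), initial);
const append = (arr, x) => { arr.push(x); return arr; };

// dedupe keeps the last seen value in a closure and only passes a
// value on when it differs from its immediate predecessor
function dedupe() {
  let last;
  return combine => (acc, x) => {
    if (x === last) return acc;
    last = x;
    return combine(acc, x);
  };
}

const result = transduce(dedupe(), append, [], [1, 1, 2, 2, 2, 3, 1]);
// result is [1, 2, 3, 1]
```

Note that dedupe() is called to create the transducer, so each use gets its own fresh state; sharing one stateful transducer between two pipelines would mix up their state.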

Conclusion

I hope I could show why I think transducers are an elegant tool for working with data. Especially if you have huge collections to work on, you'll benefit from a transducer's efficiency and readable code.
Also, when you work with various data structures and apply transformations to their values, you should consider using a transducer library, which decouples your transformations from a specific data structure and lets you compose them freely 💪 .
From a more abstract view, a transducer cleverly combines a reduce/fold over a collection with a reducing function, using function composition to orchestrate the flow and transformation of the data.

Useful resources
