A guide to Custom Map Types in Clojure
March 10, 2021
In Clojure, we use maps everywhere. Most of the time, these are the standard persistent maps that come with Clojure, but maps are defined by a set of interfaces, which allows us to define new map types with custom implementations.
In practical terms, it means you can provide custom implementations for operations like get, find, count, reduce-kv, etc.
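As a minimal sketch of what "custom implementations" means here, the type below answers get and count without being a real map. Point is a hypothetical name made up for this illustration; it only implements the lookup and counting interfaces, so it is far from a complete map.

```clojure
;; Hypothetical type for illustration only: it supports map-style lookup
;; and count, but nothing else a real map offers.
(deftype Point [x y]
  clojure.lang.ILookup
  (valAt [_ k] (case k :x x :y y nil))
  (valAt [_ k not-found] (case k :x x :y y not-found))
  clojure.lang.Counted
  (count [_] 2))

(def p (->Point 1 2))

(get p :x)       ; => 1
(:y p)           ; => 2, keyword lookup also goes through ILookup
(get p :z :none) ; => :none
(count p)        ; => 2
```

Everything else (seq, assoc, keys, equality and so on) is still missing, which is why the rest of this post is about getting the full set right.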
What are custom map types?
One common usage for custom map types is to wrap some underlying type that behaves like a map.
The first example that comes to my mind is the cljs-bean project. This library is nice. Using it you can read straight from JS objects using the same map interface we are familiar with from Clojure.
(bean #js {:a 1, :b 2})
=> {:a 1, :b 2}
This library is inspired by Clojure’s native bean function (which I learned about while looking things up for this post).
(bean (java.util.Date.))
=>
{:day 3,
:date 10,
:time 1615387053466,
:month 2,
:seconds 33,
:year 121,
:class java.util.Date,
:timezoneOffset 180,
:hours 11,
:minutes 37}
note
I did some digging in the sources of bean to understand what it looks for in the classes. Interestingly, it looks for specific method names: it exposes methods that start with get and is. Examples are getName, isVolatile and getAddress.
Clojure itself already supports map-style access on java.util.HashMap, for example:
(get (java.util.HashMap. {"foo" "bar" "baz" "quux"}) "foo")
=> "bar"
Specialised Map-like Structures
What is cool about being able to provide your own implementation is that we can take the map concept and extend it to new ideas.
For example, we can create a map that derefs any value that implements IDeref. The result would look like this:
(def m (my-custom-map {:a 1 :b (atom 2) :c (delay 3)}))
(:a m) ; => 1
(:b m) ; => 2
(:c m) ; => 3
The library Plumbing implements a lazy graph resolution, also using a custom map type.
In Pathom 3, I introduce a new special map type called Smart Maps. It resembles what Plumbing does, but uses the Pathom engine for attribute resolution.
To learn more about Smart Maps, check the Smart Map documentation page.
How to create a Custom Map Type
The answer to this question is, unfortunately, not trivial.
Depending on which environment you run your code in, the options for making a custom map type will vary. Today I’ll cover implementations in Clojure, ClojureScript and Babashka. I’ll use CMT to refer to custom map types from now on.
CMT in Clojure
In my opinion, the best way to get a correct map type implementation in Clojure is using the Potemkin library.
Why do we need a library for that? I’ll quote the Potemkin docs:
A Clojure map implements the following interfaces: clojure.lang.IPersistentCollection, clojure.lang.IPersistentMap, clojure.lang.Counted, clojure.lang.Seqable, clojure.lang.ILookup, clojure.lang.Associative, clojure.lang.IObj, java.lang.Object, java.util.Map, java.util.concurrent.Callable, java.lang.Runnable, and clojure.lang.IFn. Between them, there’s a few dozen functions, many with overlapping functionality, all of which need to be correctly implemented.
Despite this, there are only six functions which really matter: get, assoc, dissoc, keys, meta, and with-meta. def-map-type is a variant of deftype which, if those six functions are implemented, will look and act like a Clojure map.
Let’s implement our auto-deref map described before using Potemkin:
(ns com.wsscode.ideref-map
  (:require [potemkin.collections :refer [def-map-type]])
  (:import (clojure.lang IDeref)))

(declare deref-map)

(defn auto-deref
  "If value implements IDeref, deref it, otherwise return original."
  [x]
  (if (instance? IDeref x)
    @x
    x))

(def-map-type DerefMapType [m]
  (get [_ k default-value] (auto-deref (get m k default-value)))
  (assoc [_ k v] (deref-map (assoc m k v)))
  (dissoc [_ k] (deref-map (dissoc m k)))
  (keys [_] (keys m))
  (meta [_] (meta m))
  (empty [_] (deref-map {}))
  (with-meta [_ new-meta] (deref-map (with-meta m new-meta))))

(defn deref-map [m]
  (->DerefMapType m))

(def m (deref-map {:a 1 :b (atom 2) :c (delay 3)}))

; reading keys, all IDeref values get dereferenced
[(:a m)
 (:b m)
 (:c m)]
; => [1 2 3]

; after assoc, it is still a DerefMapType
(-> (assoc m :foo (delay "bar"))
    :foo)
; => "bar"

; after dissoc, it is still a DerefMapType
(-> (dissoc m :b)
    :c)
; => 3

; empty returns an empty DerefMapType
(-> (assoc (empty m) :foo (delay "bar"))
    :foo)
; => "bar"
This is neat! The surface is quite small, and you get a complete custom map. You can also override other methods if you wish. In Smart Maps I override the behavior of find. To do that, I used the (entryAt [this k]) method. To learn about other signatures like this, I suggest you check the Potemkin collections file.
The hard way
What if you don’t use Potemkin? What does a raw custom map implementation look like? To get a feel for it, check the lazymap implementation (this is what Plumbing uses for its custom maps). It doesn’t look very fun.
note
Note that the lazymap example is not a complete map replacement: it has no transients and no concurrent interfaces.
Another interesting thing you must consider when making any sort of lazy map is how to handle keys. Clojure doesn’t have any protocol for keys; instead, it relies on the ISeq protocol. In the case of maps, Clojure expects the seq result to be a sequence of MapEntry values.
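To make this concrete, here is a small sketch showing that keys and vals are happy with any sequence of map entries, not only with a map. The entries var is just an illustrative name.

```clojure
;; keys and vals only need something seqable whose items are map entries.
(def entries
  [(clojure.lang.MapEntry. :a 1)
   (clojure.lang.MapEntry. :b 2)])

(keys entries) ; => (:a :b)
(vals entries) ; => (1 2)

;; The same idea applied to a filtered map: filter returns a seq of entries.
(keys (filter (comp odd? val) {:a 1 :b 2 :c 3})) ; => (:a :c)
```

So whatever a custom map returns from its seq is what keys will walk over, and each item has to behave like a map entry.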
The important thing to notice is that if you do a naive implementation using the standard MapEntry from Clojure, you are going to realize all the values at that moment. This is a bad thing for lazy structures.
An example to illustrate this point:
; a naive custom lazy map
(deftype NaiveLazyMap [m]
  clojure.lang.ILookup
  (valAt [_ k] (auto-deref (get m k)))

  ; we need to implement this for Clojure to detect that this type supports `keys`
  ; keeping it a dummy for the sake of the demo
  clojure.lang.IPersistentMap
  (assoc [this _ _] this)

  clojure.lang.ISeq
  (seq [this]
    ; this is the tricky part, because we need a sequence of MapEntry whose
    ; val needs to be dereferenced, but we don't want to deref until the user
    ; tries to read it.
    ; doing it the naive way
    (seq
      (map (fn [[k v]] (clojure.lang.MapEntry. k (get this k))) m)))

  ; the keys implementation also requires that this type implements Iterable
  java.lang.Iterable
  (iterator [this]
    ; using iter for demo purposes to make this easy, but I like to point
    ; out the following discussion around this: https://ask.clojure.org/index.php/10303/interop-clojure-pattern-clojure-consider-adding-iter-clojure
    (clojure.lang.RT/iter
      ; same as in seq
      (map (fn [[k v]] (clojure.lang.MapEntry. k (get this k))) m))))

(let [m (->NaiveLazyMap
          {:a (delay (println "A") 1)
           :b (delay (println "B") 2)
           :c 3})]
  (keys m))
; prints A and B, showing they are getting realized, although we never used any value
; A
; B
; => (:a :b :c)

; note that if we try the same using the Potemkin implementation, they don't get realized:
(let [m (deref-map {:a (delay (println "A") 1)
                    :b (delay (println "B") 2)
                    :c 3})]
  (keys m))
; no prints
; => (:a :b :c)
This is why you can see a custom MapEntry definition in the lazymap implementation. It allows the val part of the MapEntry to be lazy.
note
Potemkin also has its own custom MapEntry, and it behaves correctly in terms of keeping the values lazy.
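As a rough sketch of the idea (this is not the actual code from lazymap or Potemkin), a lazy entry can hold its value behind a delay and only force it when the value is read. LazyEntry and realized are hypothetical names used for this illustration.

```clojure
;; Hypothetical lazy entry: the key is available right away, the value is
;; only computed when getValue is called.
(deftype LazyEntry [k v-delay]
  java.util.Map$Entry
  (getKey [_] k)
  (getValue [_] @v-delay)
  (setValue [_ _] (throw (UnsupportedOperationException.))))

;; Counter used to observe when the value gets realized.
(def realized (atom 0))

(def e (->LazyEntry :a (delay (swap! realized inc) 1)))

(key e)   ; => :a
@realized ; => 0, reading the key did not touch the value
(val e)   ; => 1
@realized ; => 1, the value was realized on first read
```

A production-grade entry would also need to implement the rest of what Clojure’s MapEntry offers (equality, hashing, vector-like access), which is part of why the real implementations are longer.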
Using proxy
I recently learned about this one when I found the bean implementation. It looks like this:
(defn proxy-deref-map
  {:added "1.0"}
  [m]
  (proxy [clojure.lang.APersistentMap]
         []
    (iterator []
      (clojure.lang.RT/iter
        ; same as in seq
        (map (fn [[k v]] (clojure.lang.MapEntry. k (auto-deref v))) m)))
    (containsKey [k] (contains? m k))
    (entryAt [k]
      (when (contains? m k)
        (clojure.lang.MapEntry/create k (auto-deref (get m k)))))
    (valAt
      ([k] (auto-deref (get m k)))
      ([k default] (auto-deref (get m k default))))
    ; wrap the result so conj also returns a deref map
    (cons [v] (proxy-deref-map (conj m v)))
    (count [] (count m))
    (assoc [k v] (proxy-deref-map (assoc m k v)))
    (without [k] (proxy-deref-map (dissoc m k)))
    ; the outer seq makes an empty map return nil instead of an empty list
    (seq [] (seq (map (fn [[k v]] (clojure.lang.MapEntry. k (auto-deref v))) m)))))

(let [m (proxy-deref-map
          {:a (delay 1)
           :b (delay 2)
           :c 3})]
  [(:a m)
   (:b m)
   (:c m)])
; => [1 2 3]
This is easier than implementing the interfaces manually, but it still has the same problem as the NaiveLazyMap regarding the MapEntry evaluation.
CMT in ClojureScript
When you think you have everything figured out, along comes ClojureScript, and that changes everything.
Jokes aside, ClojureScript had the advantage of hindsight and used it to provide better protocols for its types. It’s good that they are better, but that has a portability cost and requires you to learn a different way to do it in ClojureScript.
For inspiration here, check the Bean implementation from cljs-bean.
Now that you know the names for everything, just go make yours!
The keys situation here is the same as in Clojure, and you also need some sort of custom LazyMapEntry to keep things really lazy.
You can find the Smart Map implementation of CMT at this link.
CMT in Babashka
The first thing I’d like to point out is that you can’t use deftype in Babashka. The option available is to use reify.
Other than that, the Babashka implementation is the same as in Clojure. Sadly, Potemkin doesn’t work with Babashka, so we have to do it the hard way.
For reference, the code for Smart Maps in Babashka from Pathom 3 is available; check it here.
important
The demos require Babashka v0.2.14 or up. At the time of this post, this version hadn’t been released yet. If you are reading this much later, it’s probably available. If you are here early, you can download the CI binary (mac, linux) and use it directly.
There was a lot of work involved in getting all of this working. A special shout-out to borkdude for being so helpful when I brought him all sorts of issues around it.
One important difference is how to detect the type of your custom map. Because we have to use reify instead of deftype, we don’t have an actual new type to check for using instance?.
To get around this issue, I created a new protocol just for this custom map type, implemented it only there, and then used satisfies? to check for it.
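A minimal sketch of that marker-protocol trick, written for plain Clojure. The names IMarkedMap, marked-map? and marked-map are hypothetical and only exist for this illustration; the reified object implements just enough lookup to show the idea.

```clojure
;; Hypothetical marker protocol: only our custom map implements it.
(defprotocol IMarkedMap
  (marked-map? [this] "Marker method, only implemented by the custom map."))

(defn marked-map
  "Wraps m in a reified object that supports lookup and carries the marker."
  [m]
  (reify
    IMarkedMap
    (marked-map? [_] true)
    clojure.lang.ILookup
    (valAt [_ k] (get m k))
    (valAt [_ k not-found] (get m k not-found))))

(def mm (marked-map {:a 1}))

(satisfies? IMarkedMap mm)     ; => true
(satisfies? IMarkedMap {:a 1}) ; => false, a regular map has no marker
(:a mm)                        ; => 1
```

The check is by protocol rather than by class, so it keeps working even though every reify call site produces an anonymous class.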
EDIT 16/03/2021
A lot has happened to Babashka between the original post and now. Some problems were found. For a moment it seemed we might even have to let it go… But borkdude doesn’t give up so easily!
After a few more days, a new solution came up, using proxy instead of reify, and it overcomes the previous challenges!
This also means everything I wrote above about custom map types in Babashka no longer applies.
The current way is like the proxy version I demoed in the Clojure part. Here is an example of a lazy map (not naive, with proper handling of lazy map entries) that works in Babashka (version 0.3.0 or up):
(defn auto-deref
  "If value implements IDeref, deref it, otherwise return original."
  [x]
  ; fully qualified, since this snippet has no ns form importing IDeref
  (if (instance? clojure.lang.IDeref x)
    @x
    x))

(defn ->LazyMapEntry
  [key_ val_]
  (proxy [clojure.lang.AMapEntry] []
    (key [] key_)
    (val [] (force val_))
    (getKey [] key_)
    (getValue [] (force val_))))

(defn ->LazyMap [m]
  (proxy [clojure.lang.APersistentMap clojure.lang.IMeta clojure.lang.IObj]
         []
    (valAt
      ([k]
       (auto-deref (get m k)))
      ([k default-value]
       (auto-deref (get m k default-value))))
    (iterator []
      (.iterator ^java.lang.Iterable
                 (eduction
                   (map #(->LazyMapEntry % (delay (get this %))))
                   (keys m))))

    (containsKey [k] (contains? m k))
    (entryAt [k] (when (contains? m k)
                   (->LazyMapEntry k (delay (get this k)))))
    (equiv [other] (= m other))
    (empty [] (->LazyMap (empty m)))
    (count [] (count m))
    (assoc [k v] (->LazyMap (assoc m k v)))
    (without [k] (->LazyMap (dissoc m k)))
    (seq [] (some->> (keys m)
                     (map #(->LazyMapEntry % (delay (get this %))))))
    ; a lot of map users expect meta to work
    (meta [] (meta m))
    (withMeta [meta] (->LazyMap (with-meta m meta)))))
You can check that the map entries stay lazy with:
(let [m (->LazyMap
{:a (delay (println "A") 1)
:b (delay (println "B") 2)
:c 3})]
(keys m))
This doesn’t print A or B. Calling seq does, once the values of the entries are read (for example, when the REPL prints the result):
(let [m (->LazyMap
{:a (delay (println "A") 1)
:b (delay (println "B") 2)
:c 3})]
(seq m))
As a piece of extra news, this means Smart Maps are fully compatible with Babashka now 🎉!
Follow closer
If you’d like to know more details about my projects, check my open Roam database, where you can see development details almost daily.
Support my work
I’m currently an independent developer and I spend quite a lot of my personal time doing open-source work. If my work is valuable to you or your company, please consider supporting it through Patreon; this way you can help me have more time available to keep doing this work. Thanks!