Connect readers

This section explains the different readers provided by Connect.

pc/parallel-reader

Parallel reader from Connect is implemented to work with the parallel-parser. This reader is capable of detecting attribute dependencies, execute multiple resolvers in parallel and coordinate the return - including back tracking for the secondary paths. Here is how it works:

The benefits of parallel parser come at a considerable overhead cost. After some experiments with different users I got the conclusion that parallel-parser is not good for the most users. To benefit from the parallel parser, you need to be in a position that:

  1. You have large queries, meaning hundreds of attributes on the same query

  2. You resolvers need to be "parallelized", meaning that should be ok to trigger many of then at once, for example if most resolvers hit different foreign services.

To begin, let’s recall Connect’s basic idea of expanding information from a context. To illustrate this case let’s have the following set of resolvers:

(pc/defresolver movie-details [env input]
  {::pc/input  #{:movie/id}
   ::pc/output [:movie/id :movie/title :movie/release-date]}
  ...)

(pc/defresolver movie-rating [env input]
  {::pc/input  #{:movie/id}
   ::pc/output [:movie/rating]}
  ...)

(pc/defresolver movie-title-prefixed [env input]
  {::pc/input  #{:movie/title}
   ::pc/output [:movie/title-prefixed]}
  ...)

Note that we have two resolvers that depend on a :movie/id and one that depends on :movie/title.

Now given the query: [{[:movie/id 42] [:movie/title-prefixed]}]

First we use the ident query to create the context with a :movie/id, for the attribute :movie/title-prefixed the parallel-reader will be invoked. The first thing the reader has to do is compute a plan to reach the attribute considering the data it has now. This is done by recursively iterating over the ::pc/index-oir until it reaches some available dependency or gives up if there is no possible path.

In most cases (specially for small APIs) there will only be a single path as it is the case in our example. The result of pc/compute-path is this:

#{[[:movie/title `movie-details] [:movie/title-prefixed `movie-title-prefixed]]}

The format returned by pc/compute-path is a set of paths, each path is a vector of tuples. The tuple contains the attribute reason (why that resolver is been called?) and the symbol of the resolver that will be used to fetch that attribute. This makes the path from the available data to the attribute requested, this is the plan.

For details on the path selection algorithm in cases of multiple options, check the paths selection section.

Ok, now let’s see how it behaves when you have multiple attributes to process. Here is the new query, but this time let’s try using the interactive parser, run the query and check in the tracing how it goes (I added a 100ms delay to each resolver call so it’s easier to see):

[{[:movie/id 42] [:movie/id :movie/title :movie/release-date :movie/rating :movie/title-prefixed]}]
Try changing the order of the attributes and see what happens. For example, if you put :movie/title-prefixed at start you will see this attribute being responsible for the title fetching and itself.

This is what’s happening for each attribute:

:movie/id: This data is already in the entity context, this means it will be read from memory and will not even invoke the parallel reader.

:movie/title: This attribute is not in entity, so it will create the plan to call movie-details. From this plan, we can also compute all the attributes that we will incorporate in the call chain (by combining the outs of all the resolvers in the path), we store this information as a waiting list. The waiting list on this case is: [:movie/id :movie/title :movie/releast-date]. The processing of attributes continues in parallel while the resolver is called.

:movie/release-date: This attribute is not on entity, but it is in the waiting list, so the parser will ignore it for now and skip to process the next one.

:movie/rating: This attribute is neither in entity, nor in the waiting list, so we can call the resolver for it immediately, and the plan output ([:movie/rating]) is appended to the waiting list.

:movie/title-prefixed: Like the rating, this is not in entity or waiting, so we compute the plan and execute, the plan is again:

#{[[:movie/title `movie-details] [:movie/title-prefixed `movie-title-prefixed]]}

But movie-details is already running because of :movie/title, when the parallel-reader calls a resolver, it actually caches it immediately as a promise channel in the request cache, so when we hit the same resolver with the same input, it hits the cache, getting a hold of the promise channel. And so, the process continues normally with only one actual call to the resolver but two listeners on the promise channel (and any posterior cache hit would get to this same promise channel). This is how the data fetch is coordinated across the attributes, placeholder nodes are also supported and optimized to avoid repeated calls to resolvers.

Another difference is during the processing of sequences, the parallel parser uses core.async pipeline to process each sequence with a parallelism concurrency of 10.

Path selection

In case there are multiple possible paths, Pathom has to decide which path to take. The current implementation chooses the path with less weight, that calculation is made in this way:

  1. Every resolver starts with weight 1 (this is recorded per instance)

  2. Once a resolver is called, its execution time is recorded and updated in the map using the formula:
    new-value = (old-value + last-time) * 0.5

  3. If a resolver call throws an exception, double its weight

  4. Every time we mention some resolver in a path calculation, its weight is reduced by one.

If you like to make your own sorting of the plan, you can set the key ::pc/sort-plan in your environment and Pathom will call this function to sort the results. It takes the environment and the plan (which is a set like demonstrated in the previous section).

pc/reader2

This reader leverages some techniques which were developed during the creation of the parallel reader, things like path choosing and backtracking.

pc/async-reader2

Like pc/reader2 but knows how to handle async processing inside.

pc/open-ident-reader

Like ident-reader, but not constrained to the indexed idents, this will create a context from any ident.

pc/ident-reader

The ident-reader is used to resolve ident-based queries by establishing an initial context from the ident. When an ident query reaches this reader it will check the index to see if the ident key is present on in the indexed idents.

Since version 2.2.0-beta11 this reader also supports extra context provision using the param :pathom/context, here is how to send extra data to it:

[{([:user/id 123] {:pathom/context {:other/data 123}})
  [:user/id :user/name :other/data]}]

pc/index-reader

This reader exposes the index itself with the name ::pc/indexes.

pc/reader [DEPRECATED]

DEPRECATED: use pc/reader2 instead

The main Connect reader. This will look up the attribute in the index and try to resolve it, recursively if necessary.

pc/async-reader [DEPRECATED]

DEPRECATED: use pc/async-reader2 instead

Like pc/reader but knows how to handle async processing inside.