Cache virtual object membership #586

jcoyne · 2023-03-30T20:58:58Z

This is required for adding relatedItem entries on the public xml. Fixes #585

jcoyne · 2023-03-30T21:13:27Z

spec/factories/purl.rb

@@ -1,7 +1,7 @@
 FactoryBot.define do
  factory :purl do
    sequence :druid do |n|
-      "druid:zz#{n.to_s * 3}yy#{n.to_s * 4}"
+      "druid:zz#{"%03d" % rand(1000)}yy#{"%04d" % rand(10000)}"


The sequence approach stopped working when we had more than 10 uses of this factory.

cbeer · 2023-03-31T16:27:28Z

I think I don't understand this architecturally. Maintaining a parallel database of COCINA-derived data seems unfortunate. It still feels like SDR could send us everything we need to know up-front?

cbeer · 2023-03-31T17:02:56Z

And, not just unfortunate, but doesn't this rely on us maintaining an accurate database? Async or parallel operations seem like we could get some inconsistent behavior.

It feels like it'd be nicer if infra maintained the only source of truth and hand over all the data we need in their original request.

jcoyne · 2023-04-04T13:04:32Z

It feels like it'd be nicer if infra maintained the only source of truth and hand over all the data we need in their original request.

They do hand over all the data. We just need a way to look at this relationship inversely, just the same as we do with collection membership. This is just a cache of the data.

This is required for adding relatedItem entries on the public xml. Fixes #585

cbeer · 2023-04-04T15:57:15Z

Maybe I don't understand what we're planning to use this data for. It seems to me, if it's data we need internally to construct the public xml, SDR should provide that to us to avoid any hint of race conditions, non-idempotent behavior, or need for synchronous processing.

jcoyne · 2023-04-04T16:07:10Z

@cbeer it will be used to construct the virtual object relationships (e.g. fedora:isConstituentOf in public xml) as seen here: https://github.com/sul-dlss/dor-services-app/blob/main/app/services/published_relationships_filter.rb#L16

At present the infrastructure team is looking this up in the Argo solr (which they are trying to decouple from).
Currently there is no way in the cocina to pass this data, because maintaining bidirectional relationships in flat files is a consistency challenge. In this PR the access team takes on the responsibility for having an index of this data, just as we do for collection membership data. Instead, if we're to require SDR to send this to us, we need to create a new data model that purl-fetcher can use as it's API. This option seems like a bigger project than we can tackle in the dwindling time we have in this maintenance cycle.

cbeer · 2023-04-04T16:18:25Z

In this PR the access team takes on the responsibility for having an index of this data, just as we do for collection membership data

The difference with the collection membership data is that we're not trying to consume that synchronously.

Without infra handing us the data, won't we need to process a virtual object and the constituents in a well-defined order (make sure we've processed all the constituents before the virtual object?) or have some logic to update the virtual object whenever a constituent changes? That seems like at least as much of a pain.

I guess I don't know enough about how the Cocina is managed on the infra side; I guess I assumed there was just a big database they could ask?

jcoyne force-pushed the cache-virtual-object-membership branch from 5dc768a to 1f0c931 Compare March 30, 2023 21:12

jcoyne commented Mar 30, 2023

View reviewed changes

jcoyne force-pushed the cache-virtual-object-membership branch from 1f0c931 to bcc7512 Compare March 30, 2023 21:18

jcoyne force-pushed the cache-virtual-object-membership branch from bcc7512 to 892530d Compare April 4, 2023 13:11

Cache virtual object membership

6256f2a

This is required for adding relatedItem entries on the public xml. Fixes #585

jcoyne force-pushed the cache-virtual-object-membership branch from 892530d to 6256f2a Compare April 4, 2023 13:54

jcoyne marked this pull request as draft April 4, 2023 17:08

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Cache virtual object membership #586

Cache virtual object membership #586

jcoyne commented Mar 30, 2023

jcoyne Mar 30, 2023

cbeer commented Mar 31, 2023

cbeer commented Mar 31, 2023

jcoyne commented Apr 4, 2023

cbeer commented Apr 4, 2023

jcoyne commented Apr 4, 2023 •

edited

Loading

cbeer commented Apr 4, 2023

Cache virtual object membership #586

Are you sure you want to change the base?

Cache virtual object membership #586

Conversation

jcoyne commented Mar 30, 2023

jcoyne Mar 30, 2023

Choose a reason for hiding this comment

cbeer commented Mar 31, 2023

cbeer commented Mar 31, 2023

jcoyne commented Apr 4, 2023

cbeer commented Apr 4, 2023

jcoyne commented Apr 4, 2023 • edited Loading

cbeer commented Apr 4, 2023

jcoyne commented Apr 4, 2023 •

edited

Loading