
Remove ip and port columns from dataset table #7210

Open
andrewjstone opened this issue Dec 5, 2024 · 6 comments
Assignees: andrewjstone
Labels: cleanup (Code cleanliness), database (Related to database access)

Comments

@andrewjstone
Contributor

These columns are optional and only used by crucible. This makes working with datasets trickier than it needs to be - doubly so for those who don't know these columns are only used by crucible.

It also looks like the port itself is not actually used; the port in the region table is used instead. Regions map 1:1 with datasets, so we should move the ip column to the region table and remove both columns from the dataset table. It should be reasonably straightforward to do this in a schema migration. Then we'll update the code to point at the region table.

@andrewjstone added the cleanup and database labels on Dec 5, 2024
@andrewjstone self-assigned this on Dec 5, 2024
@andrewjstone
Contributor Author

This is a prerequisite for #6998.

@andrewjstone
Contributor Author

andrewjstone commented Dec 6, 2024

After reading more code and thinking through this, I have changed the implementation plan. Our goal is to remove the ip and port columns from the dataset table. For existing regions it's easy to migrate the ip column from the dataset to a new column in region. However, this doesn't work for regions that haven't been created yet, since the region will need some way to get its address. That's why the dataset carries an address in the first place. So we need to step back and ask where that address originally comes from.

The address for the dataset is filled in by the reconfigurator and comes from the crucible zone associated with the dataset. Datasets map to a specific zone and are currently not relocated across zones. It's unclear whether we'll ever want to do that, and it's somewhat irrelevant to the discussion at hand. So what we really want is for region allocation to be able to find the zone associated with the dataset it has chosen for the region. Then the IP address the region uses will come from the zone, not from the dataset.

Now, how can we find the zone associated with a dataset so that a region can use the right address? That information already exists in the target blueprint. We can loop through all the crucible zones, find the one whose dataset matches, and return its IP. We may want to add an index for this. Alternatively, we could put a backpointer to the zone_id in the bp_omicron_dataset table and do a point lookup, since the dataset id in the dataset table is the same as in the blueprint. Both approaches use only the target blueprint, which moves us toward the eventual goal of eliminating the duplicate dataset table as described in #6998.
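To make the two lookups concrete, here's a minimal Rust sketch of both approaches. Every type, field, and name below is an illustrative stand-in, not the actual omicron definition:

```rust
use std::collections::BTreeMap;
use std::net::Ipv6Addr;

// Illustrative stand-ins only -- not the real omicron types.
type DatasetId = u32;
type ZoneId = u32;

struct CrucibleZone {
    id: ZoneId,
    underlay_ip: Ipv6Addr,
    dataset_id: DatasetId,
}

struct TargetBlueprint {
    crucible_zones: Vec<CrucibleZone>,
    // Hypothetical backpointer, as it might look if bp_omicron_dataset
    // grew a zone_id column: dataset id -> zone id.
    dataset_to_zone: BTreeMap<DatasetId, ZoneId>,
}

impl TargetBlueprint {
    /// Approach 1: loop over all crucible zones and return the IP of
    /// the zone whose dataset matches.
    fn ip_by_scan(&self, dataset: DatasetId) -> Option<Ipv6Addr> {
        self.crucible_zones
            .iter()
            .find(|z| z.dataset_id == dataset)
            .map(|z| z.underlay_ip)
    }

    /// Approach 2: point lookup through the zone_id backpointer, then
    /// fetch the zone itself (a linear scan here; an index in practice).
    fn ip_by_backpointer(&self, dataset: DatasetId) -> Option<Ipv6Addr> {
        let zone_id = *self.dataset_to_zone.get(&dataset)?;
        self.crucible_zones
            .iter()
            .find(|z| z.id == zone_id)
            .map(|z| z.underlay_ip)
    }
}
```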

Ok, that solves how we find the IP address for a region, but what about the port? Well, AFAICT, the port in the dataset table isn't actually used - and it couldn't be, since multiple downstairs share a dataset. The ports already exist in the region table, and we'll continue to use those.
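Putting the two halves together, assembling a region's full address might then look like this (again, purely illustrative shapes, and assuming an IPv6 underlay address for the zone):

```rust
use std::net::{Ipv6Addr, SocketAddrV6};

// Illustrative stand-in for a row in the region table.
struct RegionRow {
    port: u16, // ports already live in the region table
}

/// IP from the crucible zone (found via the target blueprint), port
/// from the region row itself.
fn region_address(zone_ip: Ipv6Addr, region: &RegionRow) -> SocketAddrV6 {
    SocketAddrV6::new(zone_ip, region.port, 0, 0)
}
```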

@smklein
Collaborator

smklein commented Dec 6, 2024

FWIW I like the idea of using the blueprint to access this information -- we'd need to check that the blueprint we're reading is "still the target" in whatever allocation database requests we make, but that feels like a more reasonable source-of-truth than the raw dataset table.

The address/port information in the dataset table is a vestigial artifact; it was needed well before we had blueprints.

@andrewjstone
Contributor Author

andrewjstone commented Dec 6, 2024

> FWIW I like the idea of using the blueprint to access this information -- we'd need to check that the blueprint we're reading is "still the target" in whatever allocation database requests we make, but that feels like a more reasonable source-of-truth than the raw dataset table.

Yes, I think serializability should give us this if we read the current target blueprint table in a transaction.

@smklein
Collaborator

smklein commented Dec 6, 2024

You may be interested in this helper:

/// Creates a transaction iff the current blueprint is "bp_id".
///
/// - The transaction is retryable and named "name"
/// - The "bp_id" value is checked as the first operation within the
/// transaction.
/// - If "bp_id" is still the current target, then "f" is called,
/// within a transactional context.
pub async fn transaction_if_current_blueprint_is<Func, R>(

@andrewjstone
Contributor Author

> You may be interested in this helper:
>
> omicron/nexus/db-queries/src/db/datastore/deployment.rs, lines 112 to 119 in e73a30e

Interesting. Thanks!
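To illustrate the contract described in that helper's doc comment - check the target blueprint as the first operation, and only then run the closure - here is a schematic, self-contained sketch. Everything below is a mock stand-in (no real database, no retries), not the actual omicron implementation:

```rust
type BlueprintId = u64;

#[derive(Debug)]
enum TxnError {
    TargetBlueprintChanged,
}

/// Mock transaction context; in the real datastore this would be a
/// database transaction whose serializability keeps the check valid
/// for the transaction's lifetime.
struct Txn {
    current_target: BlueprintId,
}

/// Run `f` iff `bp_id` is still the current target blueprint.
fn transaction_if_current_blueprint_is<F, R>(
    txn: &Txn,
    bp_id: BlueprintId,
    f: F,
) -> Result<R, TxnError>
where
    F: FnOnce(&Txn) -> R,
{
    // The blueprint check is the first operation in the transaction.
    if txn.current_target != bp_id {
        return Err(TxnError::TargetBlueprintChanged);
    }
    Ok(f(txn))
}

fn main() {
    let txn = Txn { current_target: 7 };
    // Succeeds: our cached view of the target (7) is still current.
    let res = transaction_if_current_blueprint_is(&txn, 7, |_txn| {
        "allocate a region here"
    });
    println!("{res:?}");
}
```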
