
Core db stubs and create flow #405

Draft · wants to merge 39 commits into base: go-1.18
Conversation

@neb42 (Member) commented May 9, 2022

Opening this early so people can have a look. It probably shouldn't be merged until we're happy with the architecture.

  • New layout for microservice code. Here the code related to core db is entirely isolated from the rest of the code base (save for some of the deployment commands/config). The only code I've added outside of this directory is for common use across services. Code is also split out by resource within the microservice for further isolation.
  • I think the main improvement over our current flows is that the previous store/sqlmanager/sql layers have been moved to service/model/sqlbuilder layers. I feel each layer here has a clearer responsibility. Before the sqlmanager was too tied to the business logic and cut across resources.
  • Added validation.Validate to make validation more consistent and simpler.
  • Stubs for unimplemented flows. This is something we should do going forward. It will make parallelising work on the backend much quicker if we can define the flows beforehand and implement each stub one by one.
  • Added migrations outside of gorm. This uses migrate with migrations written in raw sql.
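As an illustration of the convention migrate expects, each version pairs an up and a down file written in raw SQL (the file names and schema below are hypothetical examples, not the actual migrations in this PR):

```sql
-- migrations/1_create_entity.up.sql (hypothetical example)
BEGIN;
CREATE TABLE entity (
    id   uuid PRIMARY KEY,
    name text NOT NULL
);
COMMIT;

-- migrations/1_create_entity.down.sql (hypothetical example)
BEGIN;
DROP TABLE entity;
COMMIT;
```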

One thing I noticed is that creating a new microservice touches a lot of places. This should either be improved or documented.

I initially wanted to do just the stubs here, but I got carried away and started the create flow as an example. It is probably 70-80% finished.
TODO:

  • Finish the flow (e.g. creating the record table)
  • Validation + JSON marshalling/unmarshalling for each struct that needs it
  • Tests

@rational-terraforming-golem (Contributor):

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@rational-terraforming-golem (Contributor):

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: neb42

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

package types

type PaginatedData[T any] struct {
Data []T
Member:

shouldn't this have info on the start and end index of the contained data? or page number? some metadata?

Member Author:

Yeah it's not complete. I was just messing around with generics. I'd probably flesh this out when doing the list endpoint

Contributor:

I would rather add metadata + interface to our current Lists objects rather than implementing another model collection

Contributor:

And returning a next bookmark in the metadata perhaps

Contributor:

Such as

```go
type ListMetadata map[string]string

type RecordList struct {
  Metadata ListMetadata
  Items    ?
}
```

```json
{
  "metadata": {"next": "9e44c0273dc3b838"},
  "items": [
    {...}
  ]
}
```

Member Author:

I think the generic is much cleaner

Contributor:

Yeah but in this case, we can't really use generics for each entity because we deal with runtime types.
We sure can define a generic ObjectList such as

```go
type ObjectList[T Object] struct {
  Metadata ListMetadata
  Items    []T
}
```

and reuse that across

Member Author:

Isn't that just what I had? (with the metadata)

Contributor:

I'm just seeing a []Data field

Contributor:

Oh, lol yeah I see. Yes

@@ -0,0 +1,43 @@
BEGIN;
@internetti (Member), May 9, 2022:

if we change things about the sql schemas later, we'd add another file for the updates in this folder, right?

Member Author:

Yeah the next change would be 2_<name>.up.sql

"constraint_enum",
"constraint_custom",
},
[]any{
Member:

could this be typed more specifically?

Member Author:

Not that I know of

}

m, err := migrate.NewWithDatabaseInstance(
"file:///Users/bmcalindin/workspace/core/pkg/server/core-db/migrations",
Member:

this would need to be configurable

Member Author:

Yeah I need to figure out how to get this to work with relative paths

Contributor:

We can embed the migrations with the (really good) https://pkg.go.dev/embed package
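A sketch of that approach, using fstest.MapFS as a stand-in for the embedded files so the example runs anywhere; in the real service a `//go:embed migrations/*.sql` directive plus golang-migrate's `source/iofs` driver would replace the stand-in, and the file names here are illustrative:

```go
package main

import (
	"io/fs"
	"testing/fstest"
)

// In the service this would be:
//
//	//go:embed migrations/*.sql
//	var migrations embed.FS
//
// embed.FS satisfies fs.FS, so a MapFS works as a stand-in here.
var migrations fs.FS = fstest.MapFS{
	"migrations/1_init.up.sql":   {Data: []byte("BEGIN; CREATE TABLE entity (id uuid PRIMARY KEY); COMMIT;")},
	"migrations/1_init.down.sql": {Data: []byte("BEGIN; DROP TABLE entity; COMMIT;")},
}

// upMigrations lists the up migrations in the embedded tree; golang-migrate's
// iofs source driver would consume the same fs.FS value, removing the need
// for the hardcoded file:// path above.
func upMigrations() ([]string, error) {
	return fs.Glob(migrations, "migrations/*.up.sql")
}
```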

ID string `json:"id"`
Cardinality Cardinality `json:"cardinality"`
SourceEntityID string `json:"sourceEntityId"`
SourceAttributeID string `json:"sourceAttributeId"`
Member:

didn't see this in the frontend, what's this needed for?

Member Author:

I didn't do any relationships in the frontend client as I was only creating a single entity. It's used for defining relationships between entities (see the confluence docs for more details).

Member:

yes, the relationship I was aware of, but why do the relationships need attributes? what's that for? and the relationship type in the frontend doesn't have these properties

Member Author:

Yeah I want to think through relationships a bit more. I've added a line in the confluence doc that the relationship stuff may be wrong. Previously I was thinking of relationships as being defined by the primary keys, but I was thinking here that they could also be defined on an attribute level. I'm actually leaning back towards doing it based on ids, but I want to spend more time thinking through this again.

As I was typing this, one other thought I had: if we were to have a "question bank", and entities pulled attributes from the question bank, we would actually have some implicit relationships. Maybe not useful functionally, but something interesting to see on the analytics side.

Contributor:

I think that's gonna bite us in the ass. I would rather add low-level FOREIGN KEY indexes in table objects rather than trying to find an abstraction for those relationships

Member Author:

I think that's gonna bite us in the ass.

Which bit? Also I wouldn't over index on the relationship stuff. I think I've got it wrong here and I want to go back and review it.

columns []string,
values []interface{},
) schema.DDL {
var placeholders = make([]string, len(columns))
@internetti (Member), May 9, 2022:

"placeholders" as in "defaults"?

Member Author:

This is a copy+paste of another function that was unfortunately made private

Member:

so, are they meant to be defaults or not?

Member Author:

I think what it does is build $1, $2, $3, etc., which correspond to Args: values. So when the statement is executed, values[0] is bound to $1
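For reference, the copied helper presumably does something like this (a stdlib-only reconstruction of the described behaviour, not the actual private function):

```go
package main

import (
	"fmt"
	"strings"
)

// placeholders builds the "$1, $2, ..." list used in a Postgres statement;
// at execution time values[i] is bound to placeholder $(i+1).
func placeholders(n int) string {
	ps := make([]string, n)
	for i := range ps {
		ps[i] = fmt.Sprintf("$%d", i+1)
	}
	return strings.Join(ps, ", ")
}
```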

Member Author:

So not defaults

@internetti (Member) left a review comment:

what do you think about having a swagger documentation for the endpoints?

the new structure is a lot more intuitive to me, I like it

@neb42 (Member Author) commented May 9, 2022

what do you think about having a swagger documentation for the endpoints?

I think we do have OpenAPI JSON being generated. From my understanding Swagger just uses that.

@ludydoo (Contributor) commented May 11, 2022

Boy.. this is a massive PR

Rollback(tx *gorm.DB) *gorm.DB
}

type transactionManager struct {
Contributor:

Not sure about the use of TransactionManager and transactionManager... We already have these methods on the db object (either Gorm or SQLx or sql.DB)

Member Author:

The idea is that if we change our db tooling (like we discussed) then the transaction interface will remain the same

Contributor:

I see; I think it's a bit early to introduce this, since sql.DB is exclusively called from whatever store we have. We kind of don't need the handlers to have access to this interface yet. I think it's likely that it will remain like that: basically the handler will simply Unmarshal, Validate, Call Store.Something, where the Store will do pretty much everything related to DB calls and so on

Member Author:

I think we definitely do need it. When you get to the service layer you'll see how.

Controller
  Unmarshal
  Validate
  Call Service.Create
Service
  TransactionManager.Begin
  Call Store.InsertA
  Call Store.InsertB
  Iterate over x:
     Call Store.InsertC
  TransactionManager.Commit
Controller
  Validate
  Marshal
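The flow above can be sketched as follows. This is a hypothetical, in-memory illustration only: the real TransactionManager wraps *gorm.DB, and all other names here are made up for the example.

```go
package main

// Tx stands in for whatever transaction handle the db tooling exposes
// (e.g. *gorm.DB in the current implementation).
type Tx struct{ committed bool }

type TransactionManager interface {
	Begin() *Tx
	Commit(tx *Tx) error
	Rollback(tx *Tx) error
}

type Store interface {
	InsertA(tx *Tx) error
	InsertB(tx *Tx) error
}

// Service owns the transaction boundary: several store calls succeed or
// fail together, which is why the controller never sees the Tx.
type Service struct {
	tm    TransactionManager
	store Store
}

func (s Service) Create() error {
	tx := s.tm.Begin()
	if err := s.store.InsertA(tx); err != nil {
		_ = s.tm.Rollback(tx)
		return err
	}
	if err := s.store.InsertB(tx); err != nil {
		_ = s.tm.Rollback(tx)
		return err
	}
	return s.tm.Commit(tx)
}

// In-memory fakes so the shape is testable without a database.
type fakeTM struct{}

func (fakeTM) Begin() *Tx            { return &Tx{} }
func (fakeTM) Commit(tx *Tx) error   { tx.committed = true; return nil }
func (fakeTM) Rollback(tx *Tx) error { return nil }

type noopStore struct{}

func (noopStore) InsertA(*Tx) error { return nil }
func (noopStore) InsertB(*Tx) error { return nil }
```

Swapping the db tooling then only means providing a new TransactionManager implementation; the Service stays untouched.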

@ludydoo (Contributor), May 11, 2022:

But I don't see why we need a service layer? This microservice's "business task" is to talk to the database. So there is no need to encapsulate the database layer in an infrastructure layer (if you think DDD design). The database concepts should be first-class citizens of this whole microservice. What exactly would be the role of the service layer in that case? I think the core-db-api is a really fancy store over http (or eventually websockets/grpc). But we basically expose a database through an api. So the role of a service layer in this case eludes me.

I think the service layer IS the store layer

@neb42 (Member Author), May 11, 2022:

whatever we implement in this layer we basically have to fully and accurately replicate on the client side in Typescript

Build the api in typescript so we can reuse it?

Contributor:

I would say yes in Java, .Net, Python, Ruby, or other programming paradigms where the frameworks encourage you to use dependency injection containers like Spring, Windsor et al., automatic field injection, and so on. But the advantage and strength of Go is that it provides enough out of the standard library to create an application that is almost as fast as raw C++, supports extreme parallelization, etc., without needing any third-party library. It is MUCH faster than NodeJS, Python, and others. It is also much easier to reason about because there is no framework-specific, automagical, convention-over-configuration stuff. It's all there in the code. There's also no annotation-based logic, such as NestJS's @Controller and others.

Another advantage is portability, since you can compile a Go program into a portable executable that runs on all platforms: x86, Windows, iOS, Raspberry Pis, WASM, etc. You can end up with a container that contains a single file, the executable itself, which is a huge win for security, performance, and eventually cost. The recurring problem with NodeJS, Python et al. is that you end up with thousands of dependencies, direct or indirect, which makes it extremely difficult to bundle and very slow to build. By removing these dependencies, you also remove the need to keep up with vulns spread over thousands of libraries, and instead manage a very limited subset. NodeJS's founder even publicly said that he would rather use Go (or Deno) for distributed APIs rather than Node.

Unless there is a library that absolutely requires it, I don't see any reason not to use go.

And in the microservice architecture we are leaning towards, the service is the program. I think there would need to be a very specific usecase if we were to use services within microservices. For example, in a monolith app, we would probably have some kind of a CoreDbService or something like that. But with microservices, we already made that cut upstream, and divided code so that CoreDbService is its own microservice. So it seems to me that introducing a N-Layered architecture within a microservice architecture is overkill.

What we are building here is not a service per se. We're more like coming up with a protocol, and programming a node to be part of a network of peer-to-peer clients, so that the data is available at all times. I think that's the main shift in how we think about this project vs other regular web application projects. This is not a web application. Given a node in the network, it would probably see the address of this server like the address of any other:

# peers.ts
{"peers": ["10.22.104.203", "11.103.34.21", ...]}

The core-db resembles a storage node more than a web application. A node that only provides the Durability guarantee of a distributed, p2p storage solution. Which in a way, would perhaps make more sense if it was a Pull-based system rather than a Push-based system. That would probably make the reconciliation logic much easier, as each actor in the network would be responsible for keeping its data up to date, rather than expecting other actors to push data to it. If each actor exposes a read api, it is sufficient for each actor to keep things up to date within themselves. In a way, there is no need for a public write api. Each actor could poll other actors in the network in a pull-based fashion, rather than having to deal with potential concurrent pushes to itself.

We could use something like the gossip protocol to distribute the latest timestamp of the latest change etc. between all actors in the network. Those could choose to pull the data when they see fit. If there is a discrepancy between the latest local pulled state vs the gossiped latest state, then the actor knows it should pull stuff.

Contributor:

Advantage of gossip is that it will support fragmented networks, such as when internet goes down, and is extremely fast

Member Author:

I'm not really advocating for typescript as I agree with most of your points about go. The main benefit it would have is development speed and code sharing with the frontend. I think it's not the best choice, but probably the more pragmatic choice given a team of our size. Again, not suggesting we switch.

I don't agree with your point about not using dependency injection in go. Sure it may not explicitly encourage it, but it's still a good pattern to use. For example when we switch from writing to standard sql tables to writing to an event stream, we can easily swap out a single module.

I don't think a microservice architecture is a reason not to use a layered architecture within the service. If a service requires it then we should add it. Otherwise it can easily become very bloated and difficult to work with.

On the service layer, I think there are other use cases for it. In the new forms api it will need to communicate with the core-db api. I suspect this will warrant a service layer, although I haven't thought through this in detail yet.

On the distributed stuff. Yeah that all sounds good (without thinking too much on it), but I think we need to be pragmatic and consider this as a web service for now. If we start trying to design this as a p2p system without actually making a p2p system we'll end up down a rabbit hole and nothing will get done.

Member Author:

I've pushed up an example to models/foo that removes the service layer. I don't think it's right as the logic in Create is generic enough to be removed from the postgres implementation. But I thought it's worth having an example there to discuss.

server.go

### Controllers
Contributor:

Isn't that exactly what the current handlers do ?

Member Author:

handlers = controllers; sorry, maybe I should have used "handlers", but I was just using the language I'm used to. Happy with either.

USED FOR TESTING PURPOSES ONLY
*/

import { BaseRESTClient } from 'core-api-client';
Contributor:

Not sure where to put this comment, but just based on naming (core-db-client), we should probably be dealing with tables, rows, fields, constraints rather than entities

Contributor:

I think the advantage of having a low-level api such as a DB api would be to allow clients to interact with lower level concepts that will become necessary for reconciliation/offline use, like row versioning

Contributor:

I think the concept of Entity is too high-level for this kind of API, and we already have some kind of similar concept encapsulated in the forms api.

Member Author:

we should probably be dealing with tables, rows, fields, constraints rather than entities

Happy to talk about naming of resources. I'm not set on anything, but I thought an abstraction over the specific database technology used would be good.

we already have some kind of similar concept encapsulated in the forms api

What exactly are you referring to here?

Member Author:

Also I think these conversations are better for the confluence docs. This pr is more about the go api structure. I'm just using core-db as a reason to do it.

Contributor:

What I mean by that concerns the conversation we had about exposing low-level concepts around tables, records, etc., their upcoming usefulness for implementing offline, and that the forms api is basically syntactic sugar that makes it easy to create those complicated, low-level objects. The forms-api is a higher-level concept that depends on the db-api. Whereas the db-api is basically the lowest-level thing in core, and probably the most important one.

What I foresee is that a subset of this core-db-api will basically become our reconciliation endpoint for offline use.

There are 2 apis I see in this microservice

  1. Management api to create tables, constraints, stuff like that
  2. Data api, to push data to those tables

The write part of (1), the Management API, should be exclusively for online use.
The read part of (1), the Management API, closely resembles what we need in terms of forms discovery for offline.
(2), the Data API, closely resembles what we need in terms of APIs for data reconciliation for offline.

And whatever we come up with this Data Api will probably be what we need to implement on the client side as well, when we implement cross-device reconciliation. So this Data Api should be written in a way that it is agnostic to the fact that it is a server or a client or whatever else.

We don't want the mobile devices or website to have a different api whether its talking to a server or to another client in p2p mode. We probably want this api to be exactly the same. So imho the most sensible part of this whole PR is exactly in the design of this data api.
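As a sketch of what "exactly the same api" could mean, one transport-agnostic interface with interchangeable backends; everything here, including the Record shape and names, is hypothetical illustration rather than anything in this PR:

```go
package main

import "fmt"

// Record is a hypothetical low-level row shape; Version supports the kind of
// row versioning mentioned above for reconciliation/offline use.
type Record struct {
	ID      string
	Version int
	Fields  map[string]any
}

// DataAPI is the transport-agnostic interface: a server store, a local
// offline DB, or a peer client could each implement it.
type DataAPI interface {
	PutRecord(table string, r Record) error
	GetRecord(table, id string) (Record, error)
}

// inMemoryStore is one stand-in backend, showing the interface is agnostic
// to whether it is backed by a server, a local DB, or a peer.
type inMemoryStore struct {
	tables map[string]map[string]Record
}

func newInMemoryStore() *inMemoryStore {
	return &inMemoryStore{tables: map[string]map[string]Record{}}
}

var _ DataAPI = (*inMemoryStore)(nil)

func (s *inMemoryStore) PutRecord(table string, r Record) error {
	if s.tables[table] == nil {
		s.tables[table] = map[string]Record{}
	}
	s.tables[table][r.ID] = r
	return nil
}

func (s *inMemoryStore) GetRecord(table, id string) (Record, error) {
	r, ok := s.tables[table][id]
	if !ok {
		return Record{}, fmt.Errorf("record %s/%s not found", table, id)
	}
	return r, nil
}
```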

Member Author:

Yeah I think I pretty much agree with you. It's the interface of the CoreDBClient that should stay the same between offline and online. Seeing as the client basically just maps to api endpoints, the api interface should also be fairly consistent.

Member Author:

tables, rows, fields, constraints rather than entities

When you talk about the naming here, do you mean we keep the same data structures but rename them or are you talking about changing the data structures?

Contributor:

I think it makes sense to have a DB service, but the entities within this service should be from the DB domain, such as tables, fields, constraints, etc., rather than entities

Member Author:

Do you mean that for the api interface rather than GET /entity, you would have:

GET /table
GET /field
GET /constraint

or would the table entity still encompass the fields and constraints?


import "reflect"

func Validate(s any, isNew bool) ErrorList {
Contributor:

I don't really like that objects have a Validate method. In my opinion this violates the SOLID single-responsibility principle. Validation logic should be in its own encapsulated function/component/class. That will become increasingly evident once validation logic needs to make database calls, at which point we would need to inject a database handle into each Data Transfer Object (DTO).
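A sketch of that separation: validation as its own component rather than a method on the DTO, so a dependency like a db handle could later be injected into the validator alone. The Entity shape and the rules here are illustrative, not the actual validation in this PR.

```go
package main

import "fmt"

// Entity is an illustrative DTO; in this sketch it carries no behaviour.
type Entity struct {
	ID   string
	Name string
}

type ErrorList []error

// EntityValidator encapsulates validation separately from the DTO. A database
// handle (e.g. for uniqueness checks) could later be added as a field here
// without touching Entity itself.
type EntityValidator struct{}

func (EntityValidator) Validate(e Entity, isNew bool) ErrorList {
	var errs ErrorList
	if isNew && e.ID != "" {
		errs = append(errs, fmt.Errorf("id must be empty for new entities"))
	}
	if e.Name == "" {
		errs = append(errs, fmt.Errorf("name is required"))
	}
	return errs
}
```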

@neb42 (Member Author) commented May 11, 2022

Boy.. this is a massive PR

Haha sorry, got carried away. Happy to talk anyone through it in more detail if useful.

@sonarqubecloud

SonarCloud Quality Gate failed.

Bugs: 0 (rated A)
Vulnerabilities: 0 (rated A)
Security Hotspots: 0 (rated A)
Code Smells: 3 (rated A)

Coverage: no information
Duplication: 6.9%

4 participants