
Observability platform multi-tenancy #85

Open: wants to merge 27 commits into main
Conversation

marieroque

@marieroque requested a review from a team on February 8, 2024 at 15:08
@QuentinBisson (Contributor)

@giantswarm/sig-architecture this is not a draft anymore so I would love your opinion on the matter.

@puja108 I added the write path if you want to take another look :)


But this approach brings a few questions:
1. What tenancy should common internal apps use (e.g. prometheus-operator, api-server, cilium logs are accessed by multiple internal teams)? Do we need to ensure customers are putting those apps into a namespace accessible by all?
2. What about teams that share namespaces?
Member

From what I know from customers, we rarely have that outside of kube-system

Contributor

It's also the case for the monitoring namespace, and it would probably be the same tenants that can access the kube-system namespace.

This approach makes it really easy for us and customers to configure tenants and it would definitely be easy to onboard teams (e.g. adding a label on the namespace would make the agents collect logs from the containers in the namespace).
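As a rough sketch of that opt-in (the exact label key is not fixed anywhere in this thread, so the one below is only illustrative):

```yaml
# Illustrative only: a namespace opted into log collection, with the tenant
# taken from a label. The label key shown here is an assumption, not a decision.
apiVersion: v1
kind: Namespace
metadata:
  name: team-payments
  labels:
    giantswarm.io/tenant: team-payments
```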

But this approach brings a few questions:
1. What tenancy should common internal apps use (e.g. prometheus-operator, api-server, cilium logs are accessed by multiple internal teams)? Do we need to ensure customers are putting those apps into a namespace accessible by all?
Member

Not sure I understand the question: if they are separated by namespace = tenant, then if someone wants everyone to have access to them, they add everyone to that tenant mapping. In most cases, especially the ones you mentioned above, I doubt everyone would have access to those.

Contributor

From my discussions with customers, prometheus-operator was moved to another namespace because teams did not have access to kube-system logs :D

Contributor

So my main question in this case is more about our managed apps, because customers can install them in whatever namespaces they want, and they can also run alongside the customers' own workloads, as is the case with prometheus-operator. Right now we technically have access to all the logs because we manage the clusters, but in Loki we could theoretically reduce our access to only the kube-system and giantswarm namespaces plus the managed apps' logs.

Do we want to go in that direction or should Giant Swarm have access to all logs by default?

- This is far from the idea of a platform
- There is a high risk that configuration will be messy and hard for us to debug when we need to

This approach might be useful for customers that want to send logs from outside the clusters
Member

Can we enable that as an alternative route for the "outside the cluster" use case alone?

Contributor

I do not see why not :)

- Quite easy to configure for logs

Cons:
- Tenant configuration will need to be configured in multiple places (MC for the read path and WC for the write path)
Member

Might not be a big problem: the read path is access management and the MC is a good place for that, while the write path is workload-related, so deploying it together with the workload makes sense. The two lifecycles are separate.

- With this approach, we could still have a `giantswarm.io/tenant: my-tenant` label on some apps that should have a specific tenant attached to them.

Cons:
- Our existing customers want to start slow (team by team) and opting-out for all namespaces from the start will be a hassle. An idea to avoid this would be to be able to opt-out of the multi-tenancy features with a cluster label (`giantswarm.io/tenancy: false`).
Member

We can find that out with customers. AFAIK a lot of customers have managed namespaces that are controlled by the platform team, where they should be able to add a default opt-out label on creation, as they already add other things there.

Contributor

You could probably roll out slowly (and under shared control) with Policy API. We can discuss

@puja108 (Member) left a comment

Thanks, I think I understand much better now (also thanks to your explanations on other RFCs and previous comments). As I kinda suggested, I'm in favour of approach 4 and I like that there's flexibility on top. We only need to find out a bit more about opting out and how feasible that is.

Main next step would be the read path mapping config we'd want to expose to customers. My main concern there is that we would most probably have a 1:1 mapping between our o11y permissions to namespaces and WC RBAC permissions to namespaces. Having a config interface (like a CR) makes sense, but we might be able to somehow generate it to reflect WC RBAC. We could sync with Big Mac to see what we currently offer in terms of WC permission management. IIRC there's an app we offer, but most customers might be using their own config.


What interface could we offer our customers to enable them to define the mapping between their groups and their tenants:

- Custom resources like TenantCR, GroupCR
Contributor

IMO this is the purpose of the Project CR we envisioned for the Dev Platform story.

Member

It is related, but I can see us pulling the concept apart so we can "translate" a Project CR into specific custom CRs, like an o11y-tenant CR or something. The Project CR should depict project/namespace/team and other info, but here, for example, we only map an OIDC group to an o11y tenant, so this is more like an RBAC binding.
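A very rough sketch of what such a dedicated o11y-tenant binding could look like; the group/version, kind and fields are all made up for illustration and nothing here is decided:

```yaml
# Hypothetical CR: binds an OIDC group to an observability tenant,
# independently of a broader Project CR.
apiVersion: observability.giantswarm.io/v1alpha1  # assumed group/version
kind: TenantBinding                               # assumed kind
metadata:
  name: payments-logs
spec:
  tenant: team-payments            # tenant ID used as X-Scope-OrgID on reads
  groups:
    - customer:payments-admins     # OIDC group(s) allowed to read this tenant
```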

Contributor

Hmm yeah, I guess decoupling those will give us more flexibility. I am just afraid we may create too much complexity. That said, tying everything to a single CR could be dangerous.

Contributor

In the beginning, we can have this maintained by us via Giant Swarm config until we think about the UX side, so as not to block this RFC, but I'm not sure where to discuss that (should we create a WG around multi-tenancy)?

Here is what the flow looks like:
1. The logging-operator creates a `password` for the cluster's logging agents to be able to send logs to Loki and configures them (the actual configuration of what to scrape and so on) via an extra-config.
2. The logging-operator adds the new cluster_id/password pair to the proxy configuration (see the sketch below).
3. The logging agents collect and send logs to the loki-multi-tenant-proxy, which validates the basic authentication sent in the HTTP header and transforms it into tenant information (the X-Scope-OrgID header).
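For illustration, the cluster_id/password pairs maintained by the logging-operator could land in the proxy roughly like this; the field names follow the upstream loki-multi-tenant-proxy authn file as I understand it, so treat them as an assumption rather than the final format:

```yaml
# Sketch of the proxy's basic-auth -> tenant mapping: each cluster gets a
# generated password, and the proxy turns a valid login into X-Scope-OrgID.
users:
  - username: prod-cluster-1        # cluster_id used by the logging agents
    password: <generated-password>  # created by the logging-operator
    orgid: prod-cluster-1           # tenant injected as the X-Scope-OrgID header
```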
Contributor

What about a future case where customers can define multiple tenants for the write path? Is that possible? Does it require multiple instances of promtail and/or grafana-agent?

Member

From what we discussed before, IIRC we can have multiple tenants, but it would mean some data duplication. But @QuentinBisson can explain it better.

Contributor

Yes, exactly. The tenant can be set in promtail as a label via what they call a pipeline stage, so customers can set multiple tenants if they want, but that means the data will be stored and queried twice, so we would advise against it :)

This is why the idea of setting a tenant per namespace makes sense :)
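For reference, this is roughly what promtail's tenant pipeline stage looks like (a sketch, not our actual agent configuration; the tenant name is illustrative):

```yaml
# Promtail pipeline sketch: the tenant stage sets the tenant (sent to Loki as
# the X-Scope-OrgID header) for the log lines flowing through this scrape job.
scrape_configs:
  - job_name: kubernetes-pods
    # kubernetes_sd_configs and relabeling omitted for brevity
    pipeline_stages:
      # Static form: everything scraped by this job goes to one tenant.
      - tenant:
          value: team-payments
      # Dynamic form (alternative): take the tenant from a value extracted by
      # an earlier stage, e.g. the namespace.
      # - tenant:
      #     source: namespace
```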

@stone-z (Contributor) left a comment

The structure presented here overall makes sense to me and I agree with Atlas's current intent.

I would add:

  1. App-level granularity seems inevitable to me. See my comment below.
  2. To achieve app-level granularity, the theoretically best approach IMO would be to use workload identities in some form. Logging as a platform feature will ship long before identity as a platform feature, but if logging is designed with that future in mind, it would be very powerful.
  3. Shield has very similar use cases for security data. We are also building toward an identity-based model, but are currently working with (essentially) app-level granularity.
  4. For the initial implementation, +1 on a single operator. Individual controllers can be spun out on their own if future needs demand it (Shield is doing the same for Policy API controllers).

Question/opinion: this seems like an enterprise feature of the underlying tools. Do you know of or anticipate any artificial limits from that? E.g. is there a limit to the number of tenants that would actually make per-app tenancy impossible?

Comment on lines +202 to +204
##### App == Tenant

This approach is especially flexible for customers, but we do not think it makes sense because the mapping configuration would be impossible to maintain and we would probably face technical limits pretty fast
Contributor

IMO this is the natural end state.

Two valid requests:

  • "The team that owns namespace X can view logs from any application in that namespace"
  • "The monitoring team can view prometheus logs from any namespace in any cluster where they run prometheus"

Roughly app-level granularity is inevitable in my mind. It also works well with other potential features (described in the general comment).

Contributor

It could be, but then I do not think it works for everything.

Usually, customers run applications for one team in one namespace with a clear boundary. Sure, prometheus might be a bit different, but that's really dependent on the customers' organizations.

What I do not like about this is that if you want all the logs of all apps in one namespace, then Loki will actually run at least as many queries as there are apps. Sure, they're parallelized and so on, but that looks like a bit too much to me for most use cases.

Contributor

> Loki will actually run at least as many queries as there are apps

The actual query implementation is beyond my knowledge 😅 Sure, it might be prohibitively expensive, or this might be intentionally nerfed in OSS and reserved for the enterprise version.

BUT to the greatest extent possible, I would double down on the likelihood of needing app-level granularity support. Some other examples:

  • any type of less-trusted collaborators (externals, students, etc.) who should have access to only certain components inside a namespace
  • selective access to "system" resources - for example, there are multiple types of users who will need access to coredns or nginx, but they shouldn't have access to everything in the namespace, especially kube-system stuff.
  • automation - similar to above, but for machine accounts.

If it is possible to change the tenant behavior later, maybe logging-operator could default to tenant-per-ns until the first app-level rule is applied, or create app tenants only for those targeted by allow rules. This probably means the capability would only apply to future logs, which is a tradeoff, but could be a workable middle ground.

Member

Yes, from what I see, we can add levels of granularity on demand, and we could even make that opt-in for apps: it's not tenanted by app by default, but there are additional tenants for apps where we see demand for selective access control (e.g. we might at some point pull nginx or coredns tenants into our default mapping). WDYT?

Contributor

Yeah maybe so. I'm not sure about the Loki internals or implications for trying to change tenancy later on, I would just assume it wouldn't be possible to do retroactively. For example, if a Loki tenant is for a namespace, a new app-level tenant would have no historical logs for that app because they'd still be associated with the old namespace tenant. Possibly queries would also need to be adapted to account for the moved logs, but I'm not familiar enough to say for sure.

Member

Yes, from what I understood a new tenant is a new and separate datastore, which makes duplication of data a topic in that context, but it also raises the issues you mentioned: you basically won't be able to access historical data from before the tenant's creation. Not sure how that looks with queries, but I would assume they should not change, as tenants should not influence the data. But you never know, maybe @QuentinBisson knows more.

Contributor

Yes, you can think of it as a datastore, and a tenant is only an HTTP header, so there is no reason why customers who want to migrate to one tenant per app would lose old logs if they keep querying the old and new tenants together :)
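As an illustration of "keeping the old and new tenants together" on the read path: Loki can fan a query out over several tenants when the X-Scope-OrgID header carries a |-separated list (cross-tenant query federation has to be enabled on the Loki side). A Grafana datasource provisioned that way might look like this, with the tenant names purely illustrative:

```yaml
# Grafana datasource provisioning sketch: one Loki datasource that reads from
# both the old namespace tenant and the new per-app tenant, so historical logs
# written under the old tenant stay reachable.
apiVersion: 1
datasources:
  - name: Loki (team-payments)
    type: loki
    url: http://loki-gateway.loki.svc:3100
    jsonData:
      httpHeaderName1: X-Scope-OrgID
    secureJsonData:
      httpHeaderValue1: team-payments|team-payments-prometheus
```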

- Tenant configuration will need to be configured in multiple places (MC for the read path and WC for the write path)
- Label approach would most likely not work for metrics OOTB (this would require customer to add new labels like metrics port, path and so on).

#### Approach 3: Configure promtail based on the MC tenancy CR
Contributor

Unless the community provides a better option, some Giant Swarm abstraction is the likely end state for this use case. BUT, whether it is WC or MC level depends on architectural decisions about the Project CR which haven't been made yet.

Assuming a GitOps setup, consider the possible relationships between a GS Project CR and Flux. Does the Project CR live inside an application repository? If so, it would be a WC level resource. If the Project CR is managed separately from the application, it could be an MC-level resource. Those have tradeoffs and to my knowledge they haven't been discussed yet.

So this mechanism could be WC-level (or could have complementary abstractions, like a label-based version of the same Project CR fields).

Contributor

So the issue I have is that, in my mind, a Project CR would need to be in the MC, because a project can span clusters, especially if customers have a cluster per stage, as is often the case.

Contributor

Sure, maybe the Project CR ends up being MC-scoped, it just hasn't been decided. It definitely needs wider discussion. My comment here was mostly to consider whether there are aspects of logging that need to be under dev control, and if so, what that interface would look like.

Member

We might in any case need a leveled approach where we have a Project CR on the MC level and some other CR on the WC level (a project needs to be deployed to x WCs). But even that WC level in some cases lives on the MC, when you look at HelmRelease or App CRs for example. Other parts of the WC level would live within the WC, and that's most probably unavoidable. Thus, to me it means we can handle these independently. It depends a lot on the architecture of the implementation, which the con below shows. Handling some delegation from an MC-level resource to a WC-level one is not a big issue, and we do that regularly with Apps anyway.

@QuentinBisson (Contributor) commented Mar 14, 2024

@stone-z
How do you see workload identity having any relation to logging? I would be interested to know why you think they would be related. To me, tenancy is about who has access to the logs, so maybe the identity would then be a label in the log, but it should not define a user tenancy model.

> Question/opinion: this seems like an enterprise feature of the underlying tools. Do you know of or anticipate any artificial limits from that? E.g. is there a limit to the number of tenants that would actually make per-app tenancy impossible?

I think there are no limits on a technical level (max header size can be configured in most ingress controllers), but this would definitely add a lot more load to the read path: from what I understood of the code, Loki runs at least one query per tenant even if we know we are querying only one actual tenant. That also still depends on the time range of logs you want to retrieve, as Loki would most likely retrieve multiple chunks per tenant, and the number of read queries would become too high in that case.

Other than that, after some self-monologue, I'm starting to envision some of the technicalities of this.

What I think would make sense would be to allow customers to configure the tenancy they want at the cluster level (via a cluster label, for instance) with the allowed tenancy values of:

  • cluster - all logs apart from GS logs are mapped to 1 tenant. I would think this makes sense for small clusters where there is no real use for cross-namespace tenancy.
  • namespace - all logs that are part of a namespace are mapped to 1 tenant across all clusters (e.g. the prod namespace on all clusters would be in the same tenant).
  • cluster_namespace - all logs that are in 1 namespace in 1 cluster are in 1 tenant.
  • project - this would then depend on the Project CRs, so it's still unclear.

Logs from the kube-system, giantswarm and, I guess, kyverno namespaces would anyway be sent to the giantswarm tenant on our side, and platform teams would have access to the configured giantswarm tenant.

We can definitely allow exceptions here by creating a label like "application.giantswarm.io/tenant" that could be set on a namespace or a deployment to override the tenant for specific things, or maybe by adding the PodLogs CR that comes with the grafana-agent (see the sketch below).
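A sketch of what that could look like on a cluster and on an individual namespace; the cluster-level tenancy label key below is a placeholder, only `application.giantswarm.io/tenant` is the key named above:

```yaml
# Cluster-level choice of tenancy model (placeholder label key).
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: prod-1
  labels:
    observability.giantswarm.io/tenancy: namespace  # cluster | namespace | cluster_namespace | project
---
# Per-namespace (or per-deployment) override using the label mentioned above.
apiVersion: v1
kind: Namespace
metadata:
  name: monitoring
  labels:
    application.giantswarm.io/tenant: platform-team
```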

Now I'm a bit more concerned about the following things:

  • What do we do with managed apps that are not deployed in their own namespace or in kube-system? For instance, a lot of our customers install prometheus-operator in the monitoring namespace, as the kube-system namespace is inaccessible to teams, so that they can deploy their own prometheus. But then we can also go even deeper into that question. In this specific case, I'm not sure what the tenant should be, or whether we then need to store logs twice.

@giantswarm/team-atlas I would like your thoughts on this.

@stone-z (Contributor) left a comment

> How do you see workload identity having any relation to logging? I would be interested to know why you think they would be related. To me, tenancy is about who has access to the logs, so maybe the identity would then be a label in the log, but it should not define a user tenancy model.

Having workload identities would make it easier to manage the access rights for human (or machine) users to application logs. Take GS or a customer monitoring team as a potential group of humans. That group of humans should have access to prometheus logs across the fleet. But there are multiple types of prometheus apps and instances hanging out in the fleet. How do we tell them apart in a clear enough way that an operator can configure access rules reliably? It isn't a requirement now though, we just have multiple places to configure things.

Regarding the rest of your larger comment:
I think we're starting to get more on the same page. As long as there is no technical reason it isn't possible, we could let customers choose a tenancy model so that they/we can balance loki performance with the actual RBAC needs. I still think at least some will need app<-->tenant (see my other comment), and that ensures the widest coverage later on, but it could be turned down for customers that don't need it.

I also echoed some of your ideas about shared namespaces in my other comment. It might be useful to discuss all of this (identity/tenancy needs, GS Project vs dev abstractions) in the next SIG Architecture as well. I have lots of ideas about these at a platform level, but they haven't been refined outside Shield, and Atlas is now the first team with a real need.


@QuentinBisson (Contributor)

Sure, let's discuss it on Wednesday, but we might need to make this an hour-long meeting :D


The component `loki-multi-tenant-proxy`, which has now been renamed to `grafana-multi-tenant-proxy`, should be configured with the mapping between groups of people and tenants.


Maybe it would be worth explaining in a short sentence why the name has changed.
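For illustration only (the actual grafana-multi-tenant-proxy configuration schema isn't shown in this excerpt), the group-to-tenant mapping mentioned above could look along these lines:

```yaml
# Purely illustrative read-path mapping: which groups of people may read
# which tenants. Keys and layout are assumptions, not the real schema.
groups:
  - name: customer:payments-admins
    tenants:
      - team-payments
  - name: giantswarm:atlas
    tenants:
      - giantswarm
```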


This RFC has had no activity for 100 days. It will be closed in 1 week, unless activity occurs or the label lifecycle/keep is added.

@JosephSalisbury (Contributor)

@marieroque what's the status here?

@marieroque (Author)

I would say it's part of an epic: giantswarm/roadmap#3558
