Configuring topics in OpenShift Streams for Apache Kafka

As a developer of applications and services, you can refer to the properties of your topics in OpenShift Streams for Apache Kafka to better understand how the Kafka implementation is managed for your services. You can edit certain topic properties according to the needs and goals of your services. Kafka topics contain the data (events) that applications produce or consume, so the way the topics are configured affects how data is stored and exchanged between applications.

In addition, as a developer, you can use the Streams for Apache Kafka web console to check Kafka topics in Streams for Apache Kafka for matching schemas in OpenShift Service Registry. When you use a schema with your Kafka topic, the schema ensures that producers publish data that conforms to a certain structure and compatibility policy. The schema also helps consumers parse and interpret the data from a topic as it is meant to be read. Using the web console to quickly identify matches between Kafka topics and schemas means that you or others in your organization do not need to inspect client application code to check for schema usage. If you don’t have any Service Registry instances, or don’t have a schema that matches your topic, the console provides links to Service Registry, where you can create instances and schemas.

Reviewing and editing topic properties in Streams for Apache Kafka

Use the OpenShift Streams for Apache Kafka web console to select a topic in your Kafka instance and review the topic properties. You can adjust the editable topic properties as needed.

As an alternative to using the Streams for Apache Kafka web console, you can use the rhoas command-line interface (CLI) to update certain topic properties, as shown in the following example command:

Example CLI command to update topic retention time
rhoas kafka topic update --name my-kafka-topic --retention-ms 704800000

For a list of topic properties that you can update using the CLI, see the rhoas kafka topic update entry in the CLI command reference (rhoas).
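
The --retention-ms value is expressed in milliseconds, which is easy to mistype. The following Python sketch converts a human-readable duration to milliseconds and assembles the same command as an argument list. The helper names retention_ms and build_update_command are hypothetical, and only the --name and --retention-ms flags from the example above are used. The command is printed, not executed.

```python
from datetime import timedelta


def retention_ms(**duration):
    """Convert a duration such as days=7 or hours=12 to whole milliseconds."""
    return timedelta(**duration) // timedelta(milliseconds=1)


def build_update_command(topic_name, ms):
    """Assemble the argument list for the rhoas command shown above (not executed here)."""
    return ["rhoas", "kafka", "topic", "update",
            "--name", topic_name, "--retention-ms", str(ms)]


# Seven days is the default retention time listed later in this document.
command = build_update_command("my-kafka-topic", retention_ms(days=7))
print(" ".join(command))
```

To run the command from a script, you could pass the list to subprocess.run, assuming that the rhoas CLI is installed and that you are logged in.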

Prerequisites
Procedure
  1. In the Streams for Apache Kafka web console, click Kafka Instances and select the Kafka instance that contains the topic you want to configure.

  2. Select the Topics tab.

  3. Select the options icon (three vertical dots) for the relevant topic and click Edit to review the current topic properties, and adjust any editable topic properties as needed.

  4. Click Save to finish.

Note
You can also edit topic properties by selecting the Properties tab within the topic.

Supported topic properties in Streams for Apache Kafka

The following Kafka topic properties are supported in OpenShift Streams for Apache Kafka. Each listed topic property indicates whether the property is editable or read only, and includes other relevant property attributes for your reference.

Core configuration

The following topic properties determine the identity and core behavior of the topic. Before deploying your topic, enter all core configuration properties.

Name

The topic name is the unique identifier for the topic within the cluster. You need this name to configure your producers and consumers, so choose something memorable and descriptive.

Table 1. Property attributes

Editable: Yes

Type: String

Default value: None

Supported values: Letters (Aa-Zz), numbers (0-9), underscores ( _ ), or hyphens ( - ), maximum of 249 characters
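
As a minimal sketch, the supported values listed above can be checked on the client side before you try to create a topic. The function name is_valid_topic_name is hypothetical, and the pattern encodes only the rules stated in this section (allowed characters and the 249-character maximum).

```python
import re

# Letters, numbers, underscores, or hyphens; between 1 and 249 characters.
TOPIC_NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,249}")


def is_valid_topic_name(name):
    """Return True if the name uses only the supported characters and length."""
    return TOPIC_NAME_PATTERN.fullmatch(name) is not None


print(is_valid_topic_name("my-kafka-topic"))
print(is_valid_topic_name("my kafka topic"))
```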

Partitions

Partitions are distinct lists of messages within a topic. Partitions are the main concurrency mechanism in Kafka and enable parts of a topic to be distributed over multiple brokers in the cluster. A topic can contain one or more partitions, enabling producer and consumer loads to be scaled. After you create a topic, you can increase the number of partitions but you cannot decrease it.

Table 2. Property attributes

Editable: Yes

Type: Integer

Default value: 1

Supported values: 1 or greater

Kafka property name: num.partitions

Note
If you try to create a new topic with a number of partitions that would cause the partition limit of the Kafka instance to be exceeded, you see an error message indicating this. If you try to increase the number of partitions for an existing topic and this increase would cause the partition limit of the Kafka instance to be exceeded, you see an error message stating that topic authorization failed.
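
The partition rules above (a minimum of 1, increase only, and an instance-wide limit) can be sketched as a pre-flight check. The function validate_partition_change and its parameters are hypothetical. The actual partition limit depends on your Kafka instance and is not stated in this document, so it is passed in as an argument.

```python
def validate_partition_change(current, requested, instance_usage, instance_limit):
    """Return a list of problems with changing a topic's partition count.

    instance_usage is the number of partitions in use across the whole
    Kafka instance, including this topic; instance_limit is an assumed
    limit supplied by the caller.
    """
    problems = []
    if requested < 1:
        problems.append("a topic needs at least 1 partition")
    if requested < current:
        problems.append("the partition count cannot be decreased")
    if instance_usage - current + requested > instance_limit:
        problems.append("the change would exceed the partition limit of the instance")
    return problems


print(validate_partition_change(current=3, requested=6,
                                instance_usage=10, instance_limit=100))
```

An empty list means that the requested change satisfies all three rules in this simplified model.
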
Replicas

Replicas are copies of partitions in a topic. Partition replicas are distributed over multiple brokers in the cluster to ensure topic availability if a broker fails. When a follower replica is in sync with a partition leader, the follower replica can become the new partition leader if needed. Topic replication is an essential property for fault tolerance and high availability.

Table 3. Property attributes

Editable: No

Type: Integer

Default value: 3 (for trial Kafka instances, the default value is 1)

Kafka property name: replication.factor

Minimum in-sync replicas

Minimum in-sync replicas is the minimum number of replicas that must acknowledge a write for the write to be considered successful. To enforce the minimum in-sync replicas value, set acks to all in the producer configuration so that producers request acknowledgements from all in-sync replicas. If this minimum is not met, the producer raises an exception (NotEnoughReplicas or NotEnoughReplicasAfterAppend).

Table 4. Property attributes

Editable: No

Type: Integer

Default value: 2 (for trial Kafka instances, the default value is 1)

Kafka property name: min.insync.replicas
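
The interaction between the replica count and the minimum in-sync replicas value can be illustrated with a small model. This is a simplified sketch that assumes a producer using acks set to all. The function names are hypothetical and the code does not call any Kafka client library.

```python
def write_succeeds(in_sync_replicas, min_insync_replicas=2):
    """A write with acks=all is accepted only if enough replicas are in sync."""
    return in_sync_replicas >= min_insync_replicas


def tolerated_failures(replication_factor=3, min_insync_replicas=2):
    """Number of replicas that can be lost while writes are still accepted."""
    return replication_factor - min_insync_replicas


# With the defaults listed above (3 replicas, minimum of 2 in sync),
# one replica can be unavailable before producers start seeing errors.
print(tolerated_failures())
print(write_succeeds(in_sync_replicas=1))
```

With the trial instance values (1 replica, minimum of 1 in sync), the same model shows that no replica failure is tolerated.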

Retention time

Retention time is the amount of time that messages are retained in a topic before they are deleted. This property applies only when the topic cleanup policy is set to Delete or Compact, Delete.

Table 5. Property attributes

Editable: Yes

Type: Long

Default value: 604800000 (milliseconds, 7 days)

Kafka property name: retention.ms

Retention size

Retention size is the maximum total size of all log segments in a partition before old log segments are deleted to free up space. By default, no retention size limit is applied, only a retention time limit. This property applies only when the topic cleanup policy is set to Delete or Compact, Delete.

Table 6. Property attributes

Editable: Yes

Type: Long

Default value: -1 (no retention size limit)

Kafka property name: retention.bytes
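
To show how the retention time and retention size limits work together, the following sketch models a single partition as a list of log segments, oldest first. It is a simplified illustration under stated assumptions, not the broker's implementation: the Segment class and segments_to_delete function are hypothetical, the newest segment is treated as active and is never deleted, and deletion always proceeds from the oldest segment.

```python
from dataclasses import dataclass

DAY_MS = 86_400_000


@dataclass
class Segment:
    size_bytes: int
    newest_timestamp_ms: int  # timestamp of the most recent message in the segment


def segments_to_delete(segments, now_ms, retention_ms=604_800_000, retention_bytes=-1):
    """Return the old segments that a Delete cleanup policy would remove."""
    closed = segments[:-1]  # the newest (active) segment is still being written
    excess = sum(s.size_bytes for s in segments) - retention_bytes
    doomed = []
    for segment in closed:
        expired = now_ms - segment.newest_timestamp_ms > retention_ms
        # A retention size of -1 means that no size limit is applied.
        oversize = retention_bytes >= 0 and excess >= segment.size_bytes
        if not (expired or oversize):
            break
        doomed.append(segment)
        excess -= segment.size_bytes
    return doomed


log = [Segment(100, 1 * DAY_MS), Segment(100, 5 * DAY_MS), Segment(100, 9 * DAY_MS)]
print(segments_to_delete(log, now_ms=10 * DAY_MS))
```

In this model, a size limit removes an old segment only when the partition would still hold at least the configured number of bytes afterward, so the partition can temporarily exceed the limit by up to one segment.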

Messages

The following topic properties control how the Kafka instance handles messages.

Maximum message bytes

Maximum message bytes is the maximum record batch size.

Table 7. Property attributes

Editable: No

Type: Integer

Default value: 1048588 (bytes)

Kafka property name: max.message.bytes

Message timestamp type

Message timestamp type determines whether the timestamp is generated when the message is created (CreateTime) or when the message is appended to the log (LogAppendTime).

Table 8. Property attributes

Editable: No

Type: String

Default value: CreateTime

Kafka property name: message.timestamp.type

Maximum message timestamp difference

Maximum message timestamp difference is the maximum difference allowed between the timestamp specified in the message when it leaves the producer and the timestamp recorded when a broker receives the message.

Table 9. Property attributes

Editable: No

Type: Long

Default value: 9223372036854775807 (milliseconds)

Kafka property name: message.timestamp.difference.max.ms

Message format version

Message format version is the ApiVersion value that the broker uses to append messages to topics. This value must be a valid ApiVersion value, such as 0.10.0, 1.1, 2.8, or 3.0.

Table 10. Property attributes

Editable: No

Type: String

Default value: 3.0-IV1

Kafka property name: message.format.version

Message down-conversion

Message down-conversion determines whether the broker can convert messages from the format set in the message.format.version property to an older format for consumers that require an older message format version. By default, this property is enabled in order to avoid an UNSUPPORTED_VERSION error for consumption requests from older clients. If this property adds excessive load to your broker, you can disable it.

Table 11. Property attributes

Editable: Yes

Type: Boolean

Default value: true

Kafka property name: message.downconversion.enable

Compression type

Compression type determines the final compression for the topic. The only supported value for this property is Producer, which retains the original compression type set by the producer.

Table 12. Property attributes

Editable: No

Type: String

Default value: Producer

Kafka property name: compression.type

Log

The following topic properties define how the Kafka instance handles the message log.

Note
Messages are continually appended to the partition log and are assigned their offset.
Cleanup policy

Cleanup policy determines whether log messages are deleted, compacted, or both. With the Compact, Delete option, log segments are first compacted and then deleted according to the retention time or size limit settings.

Table 13. Property attributes

Editable: Yes

Type: List

Default value: Delete

Supported values: Delete; Compact; or Compact, Delete

Kafka property name: cleanup.policy

Delete retention time

Delete retention time is the amount of time that deletion tombstone markers are retained if the log is compacted. Producers send a tombstone message to act as a marker to tell a consumer that the value is deleted.

Table 14. Property attributes

Editable: Yes

Type: Long

Default value: 86400000 (milliseconds, 1 day)

Kafka property name: delete.retention.ms
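
Compaction and tombstones can be illustrated with a small in-memory model. In this sketch, which is an illustration and not the broker's log cleaner, a log is a list of (key, value) pairs in offset order, a value of None stands for a tombstone, and the hypothetical drop_tombstones flag models the state after the delete retention time has elapsed.

```python
def compact(log, drop_tombstones=False):
    """Keep only the latest record for each key, preserving offset order."""
    latest_offset = {}
    for offset, (key, _value) in enumerate(log):
        latest_offset[key] = offset

    compacted = []
    for offset, (key, value) in enumerate(log):
        if latest_offset[key] != offset:
            continue  # a newer record for this key exists
        if value is None and drop_tombstones:
            continue  # tombstone is past its retention time
        compacted.append((key, value))
    return compacted


log = [("a", 1), ("b", 2), ("a", 3), ("b", None), ("c", 4)]
print(compact(log))
print(compact(log, drop_tombstones=True))
```

While the tombstone for key b is retained, a consumer that reads the compacted log can still learn that b was deleted. After the tombstone is removed, the key no longer appears at all.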

Minimum cleanable dirty ratio

Minimum cleanable dirty ratio is the fraction of the log that is eligible for compaction, relative to the total size of the log. When this ratio is reached, the eligible messages in the log are compacted. By default, the ratio is 0.5 or 50%, meaning that messages are compacted after at least half of the log is eligible. This property applies only when the topic cleanup policy is set to Compact or Compact, Delete.

Table 15. Property attributes

Editable: No

Type: Double

Default value: 0.5 (50%)

Kafka property name: min.cleanable.dirty.ratio
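
As a rough sketch of the threshold described above, the ratio can be modeled as the share of the log that has not yet been compacted. The function names and the split of the log into clean and dirty bytes are assumptions made for illustration. Whether a log that sits exactly on the threshold is compacted is also an assumption of this sketch.

```python
def dirty_ratio(clean_bytes, dirty_bytes):
    """Fraction of the log that is eligible for compaction."""
    total = clean_bytes + dirty_bytes
    return dirty_bytes / total if total else 0.0


def eligible_for_compaction(clean_bytes, dirty_bytes, min_cleanable_dirty_ratio=0.5):
    """Compare the current ratio with the configured minimum (0.5 by default)."""
    return dirty_ratio(clean_bytes, dirty_bytes) >= min_cleanable_dirty_ratio


print(eligible_for_compaction(clean_bytes=600, dirty_bytes=400))
print(eligible_for_compaction(clean_bytes=300, dirty_bytes=700))
```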

Minimum compaction lag time

Minimum compaction lag time is the minimum time a message remains uncompacted in a log. This property applies only when the topic cleanup policy is set to Compact or Compact, Delete.

Table 16. Property attributes

Editable: Yes

Type: Long

Default value: 0 (milliseconds)

Kafka property name: min.compaction.lag.ms

Maximum compaction lag time

Maximum compaction lag time is the maximum time a message remains uncompacted in a log. This property applies only when the topic cleanup policy is set to Compact or Compact, Delete.

Table 17. Property attributes

Editable: Yes

Type: Long

Default value: 9223372036854775807 (milliseconds)

Kafka property name: max.compaction.lag.ms

Replication

The following topic properties control the behavior of your replicas. Each of these properties impacts every replica created in the topic.

Unclean leader election

Unclean leader election allows a follower replica that is not in sync with the partition leader to become the leader of the partition. This property provides a way to retain at least partial data if partition leaders are lost. However, this property can lead to data loss, so it is disabled by default.

Table 18. Property attributes

Editable: No

Type: Boolean

Default value: false

Kafka property name: unclean.leader.election.enable

Cleanup

The following topic properties control the cleanup processing of the log.

Log segment size

Log segment size is the size of the log segment files that constitute the log. Log processing actions, such as deletion and compaction, operate on old log segments. A larger setting results in fewer files but less frequent log processing.

Table 19. Property attributes

Editable: Yes

Type: Integer

Default value: 1073741824 (bytes)

Supported values: 52428800 bytes or greater

Kafka property name: segment.bytes

Segment time

Segment time is the amount of time after which the current log segment is rolled even if the segment file is not full. This property enables the segment to be deleted or compacted as needed, even if the log retention limits have not yet been reached.

Table 20. Property attributes

Editable: Yes

Type: Long

Default value: 604800000 (milliseconds, 7 days)

Supported values: 600000 ms (10 mins) or greater

Kafka property name: segment.ms
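
The log segment size and segment time settings both decide when the current segment is closed and a new one is started. The following sketch combines the two conditions, using the default values listed above. The function should_roll is hypothetical and ignores segment jitter and index limits.

```python
def should_roll(segment_bytes_used, incoming_bytes, segment_age_ms,
                segment_bytes=1_073_741_824, segment_ms=604_800_000):
    """Return True if the current segment should be rolled before appending."""
    too_big = segment_bytes_used + incoming_bytes > segment_bytes
    too_old = segment_age_ms >= segment_ms
    return too_big or too_old


# A nearly empty segment is still rolled after it reaches the segment time.
print(should_roll(segment_bytes_used=1_000, incoming_bytes=1_000,
                  segment_age_ms=604_800_000))
```

Because deletion and compaction operate on old segments, a low-traffic topic relies on the time condition to make its data eligible for cleanup.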

Segment jitter time

Segment jitter time is the maximum delay for log segment rolling. This delay prevents bursts of log segment rolling activity.

Table 21. Property attributes

Editable: No

Type: Long

Default value: 0 (milliseconds)

Kafka property name: segment.jitter.ms

File delete delay

File delete delay is the amount of time that a file is retained in the system before the file is deleted.

Table 22. Property attributes

Editable: No

Type: Long

Default value: 60000 (milliseconds, 1 minute)

Kafka property name: file.delete.delay.ms

Preallocate log segment files

Preallocate log segment files determines whether to preallocate the file on disk when creating a new log segment. This property ensures sufficient disk space for log segments.

Table 23. Property attributes

Editable: No

Type: Boolean

Default value: false

Kafka property name: preallocate

Index

The following topic properties control the indexing of the log.

Index interval size

Index interval size is the number of bytes between entries in the offset index. The default setting adds an index entry about every 4096 bytes of messages. More frequent indexing enables reads to jump closer to the exact position in the log but makes the index larger.

Table 24. Property attributes

Editable: No

Type: Integer

Default value: 4096 (bytes, 4 KB)

Kafka property name: index.interval.bytes
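
The trade-off between index size and read precision can be seen in a small model of a sparse offset index. This is a simplified sketch and not the broker's on-disk index format. The function names are hypothetical, and the rule for when exactly an entry is added is an approximation.

```python
import bisect


def build_sparse_index(message_sizes, index_interval_bytes=4096):
    """Return (offset, file_position) entries, roughly one per interval of bytes."""
    index = []
    position = 0
    bytes_since_entry = index_interval_bytes  # so that the first message is indexed
    for offset, size in enumerate(message_sizes):
        if bytes_since_entry >= index_interval_bytes:
            index.append((offset, position))
            bytes_since_entry = 0
        position += size
        bytes_since_entry += size
    return index


def lookup(index, target_offset):
    """Find the closest index entry at or before the target offset."""
    offsets = [offset for offset, _position in index]
    i = bisect.bisect_right(offsets, target_offset) - 1
    return index[max(i, 0)]


index = build_sparse_index([1024] * 10)
print(index)
print(lookup(index, 6))
```

A read for offset 6 starts at the entry for offset 4 and scans forward. A smaller interval would place an entry closer to offset 6 at the cost of more index entries.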

Segment index size

Segment index size is the size of the index that maps offsets to file positions.

Table 25. Property attributes

Editable: No

Type: Integer

Default value: 10485760 (bytes)

Kafka property name: segment.index.bytes

Flush

The following topic properties control the frequency of the flushing of the log.

Flush interval messages

Flush interval messages is the number of messages between each data flush to the log.

Table 26. Property attributes

Editable: No

Type: Long

Default value: 9223372036854775807 (messages)

Kafka property name: flush.messages

Flush interval time

Flush interval time is the amount of time between each data flush to the log.

Table 27. Property attributes

Editable: No

Type: Long

Default value: 9223372036854775807 (milliseconds)

Kafka property name: flush.ms

Additional resources

Using topics in OpenShift Streams for Apache Kafka with schemas in OpenShift Service Registry

By default, a Kafka topic that you create in OpenShift Streams for Apache Kafka can store any kind of data. The topic does not validate the structures of messages that it stores. However, as a developer of applications and services, you might want to define the structure of the data for messages stored in a given topic, and ensure that producers and consumers use this structure. To achieve this goal, you can use schemas that you upload to registry instances in OpenShift Service Registry with your Kafka topics. Service Registry is a cloud service that enables you to manage schema and API definitions in your applications without having to install, configure, run, and maintain your own registry instances.

When you use a schema with your Kafka topic, the schema ensures that producers publish data that conforms to a certain structure and compatibility policy. The schema also helps consumers parse and interpret the data from a topic as it is meant to be read.

To use a schema, a client application can directly publish a new schema to a Service Registry instance itself, or use one that is already created there. In either case, to associate the schema with a Kafka topic, client application code is typically configured to use a strategy whereby the schema ID must use the name of the topic. Specifically, to match an existing topic, a value or key schema must use a naming format of <topic-name>-value or <topic-name>-key. For example, my-topic-value or my-topic-key.
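
The naming format can be expressed as a short sketch. The function names below are hypothetical, and the list of artifact IDs stands in for whatever you would read from your Service Registry instance. No registry API is called.

```python
def schema_ids_for_topic(topic_name):
    """Return the schema IDs that match a topic under the naming format above."""
    return {"value": f"{topic_name}-value", "key": f"{topic_name}-key"}


def find_matching_schemas(topic_name, artifact_ids):
    """Map each schema type to its matching artifact ID, or None if absent."""
    registered = set(artifact_ids)
    return {
        kind: (artifact_id if artifact_id in registered else None)
        for kind, artifact_id in schema_ids_for_topic(topic_name).items()
    }


# Assumed example data: only a value schema exists for my-topic.
print(find_matching_schemas("my-topic", ["my-topic-value", "orders-key"]))
```

A result of None for a schema type corresponds to the case where no schema with the expected ID is registered for the topic.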

However, to identify schema usage for Kafka topics in Streams for Apache Kafka, it might not always be convenient for you or others in your organization to inspect client application code. Instead, to quickly identify schemas that match your topics, you can use the Streams for Apache Kafka web console.

For a given Kafka topic, you can use the console to check Service Registry instances for value or key schemas registered to those instances that match the name of the topic. If you do not have access to any Service Registry instances, or you do not have value or key schemas registered to your instances that match your topic, the console provides links to Service Registry, where you can create instances and schemas. The console also shows you the naming format you need to use when creating a new value or key schema, so that it matches the topic.

Checking a topic for existing schema matches

The following procedure shows how to use the OpenShift Streams for Apache Kafka web console to select a Kafka topic and then check an existing OpenShift Service Registry instance for value or key schemas that have IDs that match the name of the topic.

Alternatively, to learn how to create a new Service Registry instance with a value or key schema that matches a topic, see Creating a new registry instance and matching schema for a topic.

Prerequisites
Procedure
  1. In the Streams for Apache Kafka web console, click Kafka Instances and select the name of the Kafka instance that contains the topic that you want to check for matching schemas in Service Registry.

  2. On the Topics page, click the name of the topic that you want to check.

  3. Click the Schemas tab.

  4. In the Service Registry instance list, select a Service Registry instance to check for schemas that have IDs that match the name of the topic.

    The Schemas tab shows any schemas registered to the selected Service Registry instance that match the topic.

    Note
    Although the instance list shows all Service Registry instances in your organization, you can see schema information only for instances that you own or have been granted access to.
  5. If the Schemas tab shows the schema types that you want associated with your topic, you do not need to complete the remainder of this procedure.

    However, to see the details for a matching schema, or to manage it, click View details.

  6. If the Schemas tab doesn’t show a matching value or key schema that you want associated with your topic, you can start creating the schema using one of the following options:

    • If Streams for Apache Kafka found either a value or key schema that matches your topic (but not both), the Schemas tab displays No matching schema next to the schema type that it couldn’t find.

      To create this type of schema in your Service Registry instance, click the question mark icon. In the resulting pop-up window, copy the required naming format, and click Go to Service Registry instance.

    • If Streams for Apache Kafka found no schemas that match your topic, the Schemas tab displays No matching schema exists for the selected instance.

      For the type of schema that you want to associate with your topic, copy the required naming format, and click Go to Service Registry instance.

    The Service Registry section of the web console opens with your Service Registry instance selected.

  7. In your Service Registry instance, to create a new schema, click Upload artifact.

  8. In the ID of the artifact field, paste the naming format that you previously copied. You must use this naming format so that the new schema matches your Kafka topic.

    Note
    To match your topic, the schema ID must be in the format of <topic-name>-value, or <topic-name>-key. For example, my-topic-value or my-topic-key.
  9. When you have finished uploading a new schema, in the web console, click Streams for Apache Kafka. Navigate to the Schemas tab for your topic, as you did previously.

  10. Select the same Service Registry instance that you selected previously.

    The Schemas tab now shows the name of the matching schema that you uploaded.

  11. To see details for the schema, or to manage it, click View details.

Creating a new registry instance and matching schema for a topic

The following procedure shows how to use the web console to select a Kafka topic, and then create a new OpenShift Service Registry instance with a value or key schema that matches the topic.

Alternatively, to learn how to check an existing Service Registry instance for schemas that match a topic, see Checking a topic for existing schema matches.

Prerequisites
Procedure
  1. In the Streams for Apache Kafka web console, click Kafka Instances and select the name of the Kafka instance that contains the topic that you want to check for matching schemas in Service Registry.

  2. On the Topics page, click the name of the topic that you want to check.

  3. Click the Schemas tab.

  4. Based on what the Schemas tab shows, start creating a new Service Registry instance using one of the following options:

    • If there are no existing Service Registry instances in your organization, the instance list is empty and the Schemas tab displays No Service Registry instances.

      For the type of schema that you want to associate with your topic, copy the required naming format shown on the Schemas tab. To start creating a new Service Registry instance and schema, click Go to Service Registry.

    • Even if there are existing Service Registry instances in the list, you can still create and select a new instance.

      Before you start, take note of your topic name. To match the topic, the ID of a schema that you add to a new Service Registry instance must be in the format of <topic-name>-value, or <topic-name>-key. When you are ready to start creating a new Service Registry instance and schema, below the list, click Create Service Registry instance.

    The Service Registry section of the web console opens.

  5. On the Service Registry page, click Create Service Registry instance. Follow the instructions in the resulting dialog box to create a new instance.

  6. To create a new schema, select your new Service Registry instance and then click Upload artifact.

  7. In the ID of the artifact field, specify a schema ID in the format of <topic-name>-value, or <topic-name>-key. For example, my-topic-value or my-topic-key. If you previously copied this required naming format, you can paste it in the ID of the artifact field.

  8. Finish creating the schema in the normal way.

  9. When you have finished creating the new Service Registry instance and schema, in the web console, click Streams for Apache Kafka. Navigate to the Schemas tab for your topic, as you did previously.

  10. In the Service Registry instance list, select the new Service Registry instance that you created.

    The Schemas tab shows the name of the schema that you uploaded when you created the new Service Registry instance.

  11. To see details for the schema, or to manage it, click View details.