From 20535c00a9230873fcac44d31c00411df760a204 Mon Sep 17 00:00:00 2001 From: Jeff Xiang Date: Thu, 7 Nov 2024 12:02:56 -0500 Subject: [PATCH] Update javadocs for PscSink and PscSource to highlight difference compared to FlinkPscProducer and Consumer --- .../pinterest/flink/connector/psc/sink/PscSink.java | 8 ++++++++ .../flink/connector/psc/source/PscSource.java | 12 +++++++++++- 2 files changed, 19 insertions(+), 1 deletion(-) diff --git a/psc-flink/src/main/java/com/pinterest/flink/connector/psc/sink/PscSink.java b/psc-flink/src/main/java/com/pinterest/flink/connector/psc/sink/PscSink.java index 2361545..ada0de2 100644 --- a/psc-flink/src/main/java/com/pinterest/flink/connector/psc/sink/PscSink.java +++ b/psc-flink/src/main/java/com/pinterest/flink/connector/psc/sink/PscSink.java @@ -36,6 +36,14 @@ /** * Flink Sink to produce data into a PSC topicUri. The sink supports all delivery guarantees * described by {@link DeliveryGuarantee}. + * + * A PscSink using EXACTLY_ONCE delivery guarantee requires a {@link com.pinterest.flink.connector.psc.PscFlinkConfiguration#CLUSTER_URI_CONFIG} + * to be set in the producer config. This is used to pre-construct the {@link com.pinterest.psc.producer.PscBackendProducer} upon + * creation of a top-level {@link com.pinterest.psc.producer.PscProducer} to perform transactional operations. For this reason, + * a single PscSink instance cannot be used to write to topics spanning multiple different clusters. This limitation does not + * exist in the AT_LEAST_ONCE and NONE delivery guarantees, nor does it exist in {@link com.pinterest.flink.streaming.connectors.psc.FlinkPscProducer} + * which can be used to write to multiple clusters regardless of the delivery guarantee. + * *
  • {@link DeliveryGuarantee#NONE} does not provide any guarantees: messages may be lost in case * of issues on the PubSub broker and messages may be duplicated in case of a Flink failure. *
  • {@link DeliveryGuarantee#AT_LEAST_ONCE} the sink will wait for all outstanding records in the diff --git a/psc-flink/src/main/java/com/pinterest/flink/connector/psc/source/PscSource.java b/psc-flink/src/main/java/com/pinterest/flink/connector/psc/source/PscSource.java index 8b7dc06..7467003 100644 --- a/psc-flink/src/main/java/com/pinterest/flink/connector/psc/source/PscSource.java +++ b/psc-flink/src/main/java/com/pinterest/flink/connector/psc/source/PscSource.java @@ -62,7 +62,17 @@ /** * The Source implementation of PSC. Please use a {@link PscSourceBuilder} to construct a {@link - * PscSource}. The following example shows how to create a PscSource emitting records of + * PscSource} + * + * This is different from {@link com.pinterest.flink.streaming.connectors.psc.FlinkPscConsumer} + * in that it is a source implementation for Flink's new source API. Due to the difference in implementation, + * the two classes are not compatible with each other. Most notably, PscSource currently can only support + * reading from topics from a single backend cluster, while FlinkPscConsumer can read from multiple clusters. + * This limitation is due to the fact that a {@link com.pinterest.flink.connector.psc.PscFlinkConfiguration#CLUSTER_URI_CONFIG} + * is required to be set in the configuration for the PscSource to perform metadata queries against the cluster using + * a {@link com.pinterest.psc.metadata.client.PscMetadataClient}. + * + * The following example shows how to create a PscSource emitting records of * String type. * *
    {@code