-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove unchunking evalScan and apply logging to chunks for performance #64
Conversation
reachedHighest match { | ||
case Some(highest) => | ||
HighestOffsetsWithRecord( | ||
partitionOffsets = t.partitionOffsets - highest, | ||
consumerRecord = emittableRecord, | ||
partitionLast = PartitionLast(highest, r.offset).some | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we've reached the highest offset for this partition, store the Topic Partition and Offset inside the HighestOffsetsWithRecord
so we can signal this upstream
.takeWhile(_.partitionOffsets.nonEmpty, takeFailure = true) | ||
.evalTapChunk(_.partitionLast.traverse { last => | ||
logger.warn(s"Finished loading data from ${last.topicPartition.show} at offset ${last.offset}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i know this hasn't changed but why is this a warning? this should be info
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was like that in the Akka lib, always assumed it was intentional 👀 I've lowered it to info
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://github.com/sky-uk/kafka-topic-loader/blob/0f6c46d858e9d56287eca159eec4a3dc3924ae6c/src/main/scala/uk/sky/kafka/topicloader/TopicLoader.scala#L249 doesn't look like it did, unless i've missed something
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I stand corrected, guess I left it on while debugging and forgot to lower it 😨
@@ -22,9 +22,12 @@ object TopicLoader extends TopicLoader { | |||
private[topicloader] given [K, V]: Show[ConsumerRecord[K, V]] = | |||
Show.show(cr => s"${cr.topic}-${cr.partition}") | |||
|
|||
private case class PartitionLast(topicPartition: TopicPartition, offset: Long) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PartitionLastOffset
might make this clearer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ad4363f
to
66541f3
Compare
Description
Currently we use
evalScan
to perform the logic of producing a ConsumerRecord and the non-empty partition map. this is effectful as we perform logging upon reaching the highest offset.It would be more efficient to use the non-effectful scan, as
evalScan
de-chunks a stream into singleton chunks. There unfortunately is noevalScanChunks
equivalent ofscanChunks
.We still perform the logging, however we can operate on the chunk for performance by using
evalTapChunk
Related Issues
Definition of Done:
The Below tasks should be completed before marking a PR as ready for review.