-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add verbose pipeline parameter to output each processor's execution details #16843
base: main
Are you sure you want to change the base?
Add verbose pipeline parameter to output each processor's execution details #16843
Conversation
…etails Signed-off-by: Junwei Dai <[email protected]>
Signed-off-by: Junwei Dai <[email protected]>
d5a2c4c
to
d931750
Compare
Signed-off-by: Junwei Dai <[email protected]>
@joshpalis Hey Josh, can you take a look when you have time? :) |
Signed-off-by: Junwei Dai <[email protected]>
❌ Gradle check result for 488377f: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for 719fa1c: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Junwei Dai <[email protected]>
719fa1c
to
e4e30f5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work @junweid62 on implementing this feature. A few comments from my initial pass
server/src/main/java/org/opensearch/search/internal/InternalSearchResponse.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/opensearch/search/pipeline/Pipeline.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/opensearch/search/pipeline/Pipeline.java
Outdated
Show resolved
Hide resolved
❌ Gradle check result for e4e30f5: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Junwei Dai <[email protected]>
❕ Gradle check result for 615a4b6: UNSTABLE
Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure. |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #16843 +/- ##
============================================
+ Coverage 72.03% 72.04% +0.01%
- Complexity 65230 65238 +8
============================================
Files 5318 5319 +1
Lines 304051 304224 +173
Branches 43990 44022 +32
============================================
+ Hits 219021 219187 +166
- Misses 67121 67134 +13
+ Partials 17909 17903 -6 ☔ View full report in Codecov by Sentry. |
Signed-off-by: Junwei Dai <[email protected]>
820849b
to
0ae4d06
Compare
❌ Gradle check result for 0ae4d06: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Hi, @dbwiddis Could you please take a moment to review this PR when you have the time? I’d appreciate any feedback you might have :) |
@@ -302,6 +305,9 @@ public SearchSourceBuilder(StreamInput in) throws IOException { | |||
if (in.getVersion().onOrAfter(Version.V_2_18_0)) { | |||
searchPipeline = in.readOptionalString(); | |||
} | |||
if (in.getVersion().onOrAfter(Version.CURRENT)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you need to have exact version here, CURRENT will change with every next release
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I initially used the exact version here, but it caused failures in the BWC integration tests. @dbwiddis suggested switching to CURRENT for now, with the plan to update the version in a separate PR when preparing for the release.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is correct: once this gets backported to 2.x, then a PR should be submitted to 2.x branch changing it to 2_19_0 and then forward-ported. The problem is that current 2.19.0 code (2.x branch) doesn't have this code but would pass the onOrAfter check if we set it now.
Here's some examples of that workflow in progress: #16086 and #16174
That said, it's useful to include a TODO comment here as a reminder so it doesn't get forgotten, leading to bugs like #16590 :)
server/src/main/java/org/opensearch/search/builder/SearchSourceBuilder.java
Show resolved
Hide resolved
server/src/main/java/org/opensearch/search/builder/SearchSourceBuilder.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/opensearch/search/internal/InternalSearchResponse.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/opensearch/search/pipeline/Pipeline.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/opensearch/search/pipeline/Pipeline.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/opensearch/search/pipeline/PipelineProcessingContext.java
Show resolved
Hide resolved
Signed-off-by: Junwei Dai <[email protected]>
❌ Gradle check result for 500ac7c: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM generally. Some comments and suggestions
@@ -302,6 +305,9 @@ public SearchSourceBuilder(StreamInput in) throws IOException { | |||
if (in.getVersion().onOrAfter(Version.V_2_18_0)) { | |||
searchPipeline = in.readOptionalString(); | |||
} | |||
if (in.getVersion().onOrAfter(Version.CURRENT)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is correct: once this gets backported to 2.x, then a PR should be submitted to 2.x branch changing it to 2_19_0 and then forward-ported. The problem is that current 2.19.0 code (2.x branch) doesn't have this code but would pass the onOrAfter check if we set it now.
Here's some examples of that workflow in progress: #16086 and #16174
That said, it's useful to include a TODO comment here as a reminder so it doesn't get forgotten, leading to bugs like #16590 :)
beforeResponseProcessor(processor); | ||
final long start = relativeTimeSupplier.getAsLong(); | ||
processor.processResponseAsync(request, r, requestContext, ActionListener.wrap(rr -> { | ||
long took = TimeUnit.NANOSECONDS.toMillis(relativeTimeSupplier.getAsLong() - start); | ||
afterResponseProcessor(processor, took); | ||
if (request.source().verbosePipeline()) { | ||
detail.addOutput(Arrays.asList(rr.getHits().deepCopy().getHits())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you help me understand why this deepCopy is needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When the Object receives this list, it holds a reference to the original SearchHit objects. Without a deepCopy, any modifications to the output will also reflect in the input, as they share the same reference.
By using deepCopy, we ensure that the input and output hold independent values, allowing us to clearly see the differences between the two at each step of the processor.
@@ -17,6 +20,9 @@ | |||
public class PipelineProcessingContext { | |||
private final Map<String, Object> attributes = new HashMap<>(); | |||
|
|||
// Key for processor execution details | |||
private static final String PROCESSOR_EXECUTION_DETAILS_KEY = "processorExecutionDetails"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something seems off to me about having this private constant in this class. I see it's used in a getter so nobody needs to know what it is, but it's also a key in an attributes map that could have a name collision.
Are other attributes defined (should this constant be somewhere else, and public)? If not, is re-using this map for an undocumented purpose a good idea?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed a similar usage in the ScriptRequestProcessor class, where keys are defined as:
private static final List<String> SCRIPT_CONFIG_KEYS = List.of("id", "source", "inline", "lang", "params", "options");
In our case, I’m considering changing the constant to public and moving it to the ProcessorExecutionDetail class
Do you think this would be a better approach?
* | ||
* @opensearch.internal | ||
*/ | ||
@PublicApi(since = "1.0.0") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this new? Should this version be different?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took a closer look and feel that this annotation isn't necessary at all.
private static Object parseFieldFromXContent(XContentParser parser) throws IOException { | ||
XContentParser.Token token = parser.currentToken(); | ||
if (token == XContentParser.Token.VALUE_NULL) { | ||
return null; | ||
} else if (token == XContentParser.Token.START_ARRAY) { | ||
return parseArrayFromXContent(parser); | ||
} else if (token == XContentParser.Token.START_OBJECT) { | ||
return parser.map(); | ||
} else { | ||
return parser.textOrNull(); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would be surprised if this utility method (and the related array one) doesn't already exist somewhere (like here?). If it doesn't, perhaps this code should be in such a util class?
2.use exist xcontentUtil to read 3.move processor excution key to ProcessorExecutionDetail Signed-off-by: Junwei Dai <[email protected]>
❌ Gradle check result for 7526ef4: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Description
Related RFC : #16705
This PR introduces enhancements to OpenSearch's search pipeline functionality, focusing on improving the traceability and debugging of search request and response transformations. It addresses the increasing complexity of search pipeline processors by implementing verbose mode support, which provides detailed insights into processor execution.
Adds Verbose Mode for Search Pipelines:
verbose_pipeline
parameter to search requests, default to false.Improves Pipeline Debugging:
Supports All Pipeline Configurations:
Test Framework Enhancements:
Example output with request processor:
filter_query
response processor:rename_field
andsort
Related Issues
Resolves #14745
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.