clarification on describe index #38994
Unanswered
vihariazure
asked this question in
Q&A and General discussion
Replies: 1 comment
-
If the index is done, then you are ready to search.
-
Below are the details from my describe index output:
{
'm': '16',
'ef_construct': '100',
'index_type': 'HNSW',
'metric_type': 'L2',
'field_name': 'product_embeddings',
'index_name': 'milvus_index_uuid_trail_1',
'total_rows': 1000000,
'indexed_rows': 970000,
'pending_index_rows': 882500,
'state': 'Finished'
}
Details:
total_rows: 1000000
What it is: The total number of rows in the target field (product_embeddings).
Meaning: The field contains 1,000,000 vector entries in total.
indexed_rows: 970000
What it is: The number of rows that have been successfully indexed so far.
Meaning: Out of 1,000,000 rows, 970,000 have been indexed using the HNSW algorithm.
pending_index_rows: 882500
What it is: The number of rows still waiting to be indexed.
Meaning: A large number of rows are still pending indexing. This might indicate ongoing data ingestion or that the indexing process hasn’t been finalized for some rows.
state: 'Finished'
What it is: The state of the index-building process.
Meaning: 'Finished' typically indicates the index-building operation has been completed. However, the presence of pending_index_rows suggests that additional data is still awaiting indexing, possibly due to delayed ingestion or re-indexing processes.
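The three counters above can be compared directly. If pending_index_rows simply meant "total minus indexed" (the reading used in the Details list), it would be 30,000 here rather than 882,500, so that reading may not match how Milvus computes the field. The sketch below only does the arithmetic on the reported values; `summarize_progress` is a hypothetical helper, not part of pymilvus.

```python
# Values copied from the describe index output above.
index_info = {
    "total_rows": 1_000_000,
    "indexed_rows": 970_000,
    "pending_index_rows": 882_500,
    "state": "Finished",
}


def summarize_progress(info):
    """Derive simple figures from a describe-index style dict (hypothetical helper)."""
    total = info["total_rows"]
    indexed = info["indexed_rows"]
    pending = info["pending_index_rows"]
    return {
        # Rows not yet counted as indexed.
        "unindexed_rows": total - indexed,
        # Share of rows already indexed.
        "indexed_fraction": indexed / total if total else 0.0,
        # True when pending is larger than total - indexed, meaning the
        # "pending = total - indexed" assumption does not hold for this output.
        "pending_exceeds_unindexed": pending > total - indexed,
    }


summary = summarize_progress(index_info)
print(summary)
```

For the output in this post, the helper reports 30,000 unindexed rows and flags that pending_index_rows is larger than that difference.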
Questions:
Pending Index Rows:
There are still pending_index_rows. Should this value generally be 0 once the index-building process is marked as 'Finished'?
Impact on Search Results:
If the state is 'Finished', does this mean I can search and retrieve results?
Since pending_index_rows is not 0, does this mean I will get results, but with reduced accuracy due to incomplete indexing?
Accuracy Considerations:
To achieve better accuracy, should I wait until pending_index_rows = 0 before performing searches?
Wait for Complete Indexing:
I used the following utility function to wait for the index-building process to complete (that is, until pending_index_rows = 0):
```python
from pymilvus import utility

utility.wait_for_index_building_complete(collection_name=self.collection_name, index_name=self.index_name)
```
However, even after inserting data, the describe_index results still show a large number of pending_index_rows.
Parameter Optimization:
Do I need to adjust parameters like segment size or flush interval to ensure the index is ready as soon as data insertion is complete (so that pending_index_rows = 0)?
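One way to make the "wait until pending_index_rows = 0" idea explicit is a small polling loop around whatever call reports index progress. This is a minimal sketch, not a confirmed Milvus recommendation. `wait_until_no_pending` and its parameters are hypothetical names. The progress source is passed in as a callable so the loop can be tried without a running server. With pymilvus, that callable could wrap `utility.index_building_progress(collection_name, index_name)`, assuming the installed version returns a `pending_index_rows` key.

```python
import time


def wait_until_no_pending(get_progress, timeout_s=600.0, interval_s=2.0,
                          sleep=time.sleep, clock=time.monotonic):
    """Poll get_progress() until it reports pending_index_rows == 0.

    get_progress: zero-argument callable returning a dict that contains
                  a 'pending_index_rows' key (assumption about the source).
    Raises TimeoutError if the counter is still non-zero after timeout_s.
    """
    deadline = clock() + timeout_s
    while True:
        progress = get_progress()
        if progress["pending_index_rows"] == 0:
            return dict(progress)
        if clock() >= deadline:
            raise TimeoutError(
                f"pending_index_rows still {progress['pending_index_rows']} "
                f"after {timeout_s} seconds"
            )
        sleep(interval_s)


# Demonstration with a fake progress source instead of a live Milvus server.
_fake_states = iter([
    {"total_rows": 10, "indexed_rows": 4, "pending_index_rows": 6},
    {"total_rows": 10, "indexed_rows": 8, "pending_index_rows": 2},
    {"total_rows": 10, "indexed_rows": 10, "pending_index_rows": 0},
])
final = wait_until_no_pending(lambda: next(_fake_states), timeout_s=5, interval_s=0)
print(final)

# Possible use against Milvus (assumption: the returned dict has 'pending_index_rows'):
# from pymilvus import utility
# wait_until_no_pending(
#     lambda: utility.index_building_progress(collection_name, index_name)
# )
```

The explicit timeout keeps the loop from blocking forever if the counter never reaches 0, for example while data is still being inserted.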