Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add distance conf filter #250

Merged
merged 11 commits into from
Aug 21, 2024
Merged

Conversation

zjgemi
Copy link
Collaborator

@zjgemi zjgemi commented Aug 15, 2024

Summary by CodeRabbit

  • New Features

    • Introduced a filters argument for configuration customization in existing functions.
    • Implemented new filtering classes for validating atomic configurations based on distance and geometric criteria, enhancing configuration selection options.
  • Tests

    • Added unit tests for the new filtering classes to ensure robust functionality and validation of atomic configurations.

Copy link

coderabbitai bot commented Aug 15, 2024

Walkthrough

Walkthrough

The recent changes enhance the dpgen2 package by introducing advanced configuration filters for customized exploration behaviors. Key updates include the addition of a filters parameter in the lmp_args and caly_args functions, the implementation of multiple filter classes for validating atomic structures, and a streamlined method for handling data within these filters. Overall, these modifications improve user experience and provide greater flexibility in configuration management.

Changes

Files Change Summary
dpgen2/entrypoint/args.py Added optional filters argument to lmp_args and caly_args functions for enhanced configuration flexibility. Documentation updated.
dpgen2/exploration/selector/__init__.py Imported new filter classes and defined conf_filter_styles dictionary for expanded filtering options.
dpgen2/exploration/selector/conf_filter.py Modified check method in ConfFilter to accept a single frame parameter, simplifying data handling.
dpgen2/exploration/selector/distance_conf_filter.py Implemented DistanceConfFilter, BoxSkewnessConfFilter, and BoxLengthFilter classes for validating atomic configurations based on distance and box dimension criteria.
tests/exploration/test_conf_filter.py Updated FooFilter and faked_filter classes to streamline check method parameters, enhancing clarity and usability.
tests/exploration/test_distance_conf_filter.py Created a test suite for DistanceConfFilter, BoxSkewnessConfFilter, and BoxLengthFilter using unittest, covering both valid and invalid configurations.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Args
    participant Submit
    participant Scheduler
    participant Renderer
    participant DistanceFilter

    User->>Args: Call lmp_args(filters)
    Args->>Submit: Pass filters to get_conf_filters()
    Submit->>Scheduler: Integrate conf_filters
    Scheduler->>Renderer: Invoke get_confs(conf_filters)
    Renderer->>DistanceFilter: Validate configurations
    DistanceFilter-->>Renderer: Return validation result
    Renderer-->>Scheduler: Return filtered subsystems
    Scheduler-->>User: Provide exploration results
Loading

Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 40206f4 and d386dac.

Files selected for processing (1)
  • dpgen2/exploration/selector/distance_conf_filter.py (1 hunks)
Additional context used
Ruff
dpgen2/exploration/selector/distance_conf_filter.py

126-127: Use a single if statement instead of nested if statements

Combine if statements using and

(SIM102)

Additional comments not posted (5)
dpgen2/exploration/selector/distance_conf_filter.py (5)

20-118: Consider externalizing safe_dist_dict.

The safe_dist_dict contains hardcoded values. Consider externalizing this data to a configuration file or database to enhance flexibility and maintainability.


126-127: Combine nested if statements.

The nested if statements can be combined for better readability.

- if i != j:
-     if values[i] > multiple * values[j]:
+ if i != j and values[i] > multiple * values[j]:
Tools
Ruff

126-127: Use a single if statement instead of nested if statements

Combine if statements using and

(SIM102)


135-218: Enhance DistanceConfFilter with logging and configuration.

The DistanceConfFilter class uses print statements for output. Replace these with logging for better control and integration with logging systems. Additionally, consider externalizing configuration values like safe_dist_dict.


220-273: Replace print statements with logging in BoxSkewnessConfFilter.

Replace the print statement with logging to provide better control over the output.

- print("Inclined box")
+ logger.warning("Inclined box")

276-329: Replace print statements with logging in BoxLengthFilter.

Replace the print statement with logging to provide better control over the output.

- print("One side is %s larger than another" % self.length_ratio)
+ logger.warning("One side is %s larger than another", self.length_ratio)

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share
Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai generate interesting stats about this repository and render them as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Outside diff range, codebase verification and nitpick comments (3)
dpgen2/exploration/selector/conf_filter.py (1)

70-70: Compatibility Issue: ff.check does not support atom names

The ff.check function in dpgen2/exploration/selector/conf_filter.py currently accepts atom_types as a parameter. To ensure compatibility with atom names, the function's implementation needs to be reviewed and potentially updated to handle atom_names.

  • Location: dpgen2/exploration/selector/conf_filter.py, function check

Please verify the function's logic and update it as necessary to accommodate atom names.

Analysis chain

Verify compatibility of atom names with ff.check.

The change from atom types to atom names enhances clarity but ensure that ff.check is compatible with receiving atom names instead of types.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify that `ff.check` can handle atom names instead of atom types.

# Test: Search for the `ff.check` function definition to verify its parameter handling.
rg --type python -A 5 'def check' | rg 'atom_names'

Length of output: 82


Script:

#!/bin/bash
# Description: Verify that `ff.check` can handle atom names instead of atom types.

# Test: Search for the `ff.check` function definition to verify its parameter handling.
rg --type py -A 5 'def check' | rg 'atom_names'

Length of output: 137


Script:

#!/bin/bash
# Description: Locate the `ff.check` function definition to verify its parameter handling.

# Test: Search for the `ff.check` function definition to understand its parameters.
rg --type py -A 10 'def check'

Length of output: 10684

dpgen2/exploration/selector/distance_conf_filter.py (1)

128-193: Consider removing debug print statements.

The check method contains debug print statements that may not be suitable for production code.

- print("Inclined box")
- print("One side is 5 larger than another")
- print(
-     f"Dangerous close for {type_i} - {type_j}, {dist:.5f} less than {dr:.5f}"
- )
- print("Valid structure")

Consider using logging instead of print statements for better control over output.

dpgen2/entrypoint/submit.py (1)

280-288: Ensure conf_filter_styles covers all expected types.

Currently, conf_filter_styles only contains the key "distance". Consider adding error handling in the get_conf_filters function for unexpected types and ensure that all expected types are included in conf_filter_styles.

  • Location to update: dpgen2/exploration/selector/__init__.py where conf_filter_styles is defined.
  • Function to enhance: get_conf_filters in dpgen2/entrypoint/submit.py.
Analysis chain

Verify conf_filter_styles comprehensiveness.

Ensure that the conf_filter_styles dictionary includes all expected types to prevent runtime errors.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify that all expected types are covered by `conf_filter_styles`.

# Test: Search for all keys used with `conf_filter_styles`. Expect: All expected types are covered.
rg --type python -A 1 'conf_filter_styles\[[^\]]+\]'

Length of output: 85


Script:

#!/bin/bash
# Description: Verify that all expected types are covered by `conf_filter_styles` using ast-grep.

# Test: Search for all keys used with `conf_filter_styles`. Expect: All expected types are covered.
ast-grep --lang python --pattern 'conf_filter_styles[$_]($$_)'

Length of output: 161


Script:

#!/bin/bash
# Description: Find the definition of `conf_filter_styles` and list its keys.

# Search for the definition of `conf_filter_styles` and print its contents.
rg --type python 'conf_filter_styles\s*=' -A 5

Length of output: 79


Script:

#!/bin/bash
# Description: Find the definition of `conf_filter_styles` and list its keys using ast-grep.

# Search for the definition of `conf_filter_styles`.
ast-grep --lang python --pattern 'conf_filter_styles = $_'

Length of output: 250

Comment on lines +286 to +287
conf_filter = conf_filter_styles[c.pop("type")](**c)
conf_filters.add(conf_filter)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add error handling for unexpected types.

Consider adding error handling to manage unexpected types that may not be present in conf_filter_styles.

try:
    conf_filter = conf_filter_styles[c.pop("type")](**c)
except KeyError as e:
    raise ValueError(f"Unexpected filter type: {e}")

)

from .context import (
dpgen2,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove unused import.

The import statement for .context.dpgen2 is not used in the file.

- from .context import (
-     dpgen2,
- )
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
dpgen2,
from .context import (
)
Tools
Ruff

12-12: .context.dpgen2 imported but unused

Remove unused import: .context.dpgen2

(F401)

Comment on lines 1 to 4
import os
import shutil

import dpdata
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove unused imports.

The imports os, shutil, and dpdata are not used in the file.

- import os
- import shutil
- import dpdata
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
import os
import shutil
import dpdata
Tools
Ruff

1-1: os imported but unused

Remove unused import: os

(F401)


2-2: shutil imported but unused

Remove unused import: shutil

(F401)


4-4: dpdata imported but unused

Remove unused import: dpdata

(F401)

Comment on lines +119 to +120
if i != j:
if values[i] > multiple * values[j]:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Combine nested if statements.

The nested if statements can be combined for better readability.

- if i != j:
-     if values[i] > multiple * values[j]:
+ if i != j and values[i] > multiple * values[j]:
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if i != j:
if values[i] > multiple * values[j]:
if i != j and values[i] > multiple * values[j]:
Tools
Ruff

119-120: Use a single if statement instead of nested if statements

Combine if statements using and

(SIM102)

Copy link

codecov bot commented Aug 15, 2024

Codecov Report

Attention: Patch coverage is 89.74359% with 12 lines in your changes missing coverage. Please review.

Project coverage is 83.87%. Comparing base (c0f83c7) to head (d386dac).
Report is 19 commits behind head on master.

Files with missing lines Patch % Lines
dpgen2/entrypoint/submit.py 50.00% 6 Missing ⚠️
dpgen2/exploration/render/traj_render_lammps.py 14.28% 6 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #250      +/-   ##
==========================================
+ Coverage   83.75%   83.87%   +0.12%     
==========================================
  Files          97       98       +1     
  Lines        5318     5433     +115     
==========================================
+ Hits         4454     4557     +103     
- Misses        864      876      +12     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.


🚨 Try these New Features:

Copy link

@wanghan-iapcm wanghan-iapcm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To many hard coding. could you please provide interface allowing user specific parameters?

dpgen2/exploration/selector/distance_conf_filter.py Outdated Show resolved Hide resolved
dpgen2/exploration/selector/distance_conf_filter.py Outdated Show resolved Hide resolved
dpgen2/exploration/selector/distance_conf_filter.py Outdated Show resolved Hide resolved
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Comment on lines +18 to +116
safe_dist_dict = {
"H": 1.2255,
"He": 0.936,
"Li": 1.8,
"Be": 1.56,
"B": 1.32,
"C": 1.32,
"N": 1.32,
"O": 1.32,
"F": 1.26,
"Ne": 1.92,
"Na": 1.595,
"Mg": 1.87,
"Al": 1.87,
"Si": 1.76,
"P": 1.65,
"S": 1.65,
"Cl": 1.65,
"Ar": 2.09,
"K": 2.3,
"Ca": 2.3,
"Sc": 2.0,
"Ti": 2.0,
"V": 2.0,
"Cr": 1.9,
"Mn": 1.95,
"Fe": 1.9,
"Co": 1.9,
"Ni": 1.9,
"Cu": 1.9,
"Zn": 1.9,
"Ga": 2.0,
"Ge": 2.0,
"As": 2.0,
"Se": 2.1,
"Br": 2.1,
"Kr": 2.3,
"Rb": 2.5,
"Sr": 2.5,
"Y": 2.1,
"Zr": 2.1,
"Nb": 2.1,
"Mo": 2.1,
"Tc": 2.1,
"Ru": 2.1,
"Rh": 2.1,
"Pd": 2.1,
"Ag": 2.1,
"Cd": 2.1,
"In": 2.0,
"Sn": 2.0,
"Sb": 2.0,
"Te": 2.0,
"I": 2.0,
"Xe": 2.0,
"Cs": 2.5,
"Ba": 2.8,
"La": 2.5,
"Ce": 2.55,
"Pr": 2.7,
"Nd": 2.8,
"Pm": 2.8,
"Sm": 2.8,
"Eu": 2.8,
"Gd": 2.8,
"Tb": 2.8,
"Dy": 2.8,
"Ho": 2.8,
"Er": 2.6,
"Tm": 2.8,
"Yb": 2.8,
"Lu": 2.8,
"Hf": 2.4,
"Ta": 2.5,
"W": 2.3,
"Re": 2.3,
"Os": 2.3,
"Ir": 2.3,
"Pt": 2.3,
"Au": 2.3,
"Hg": 2.3,
"Tl": 2.3,
"Pb": 2.3,
"Bi": 2.3,
"Po": 2.3,
"At": 2.3,
"Rn": 2.3,
"Fr": 2.9,
"Ra": 2.9,
"Ac": 2.9,
"Th": 2.8,
"Pa": 2.8,
"U": 2.8,
"Np": 2.8,
"Pu": 2.8,
"Am": 2.8,
"Cm": 2.8,
"Cf": 2.3,
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider externalizing safe_dist_dict.

The safe_dist_dict contains hardcoded values. Consider externalizing this data to a configuration file or database to enhance flexibility and maintainability.

Comment on lines 133 to 209
class DistanceConfFilter(ConfFilter):
def __init__(
self, custom_safe_dist=None, safe_dist_ratio=1.0, theta=60.0, length_ratio=5.0
):
self.custom_safe_dist = custom_safe_dist if custom_safe_dist is not None else {}
self.safe_dist_ratio = safe_dist_ratio
self.theta = theta
self.length_ratio = length_ratio

def check(
self,
coords: np.ndarray,
cell: np.ndarray,
atom_types: np.ndarray,
nopbc: bool,
):
from ase import (
Atoms,
)
from ase.build import (
make_supercell,
)

safe_dist = deepcopy(safe_dist_dict)
safe_dist.update(self.custom_safe_dist)
for k in safe_dist:
# bohr -> ang and multiply by a relaxation ratio
safe_dist[k] *= 0.529 / 1.2 * self.safe_dist_ratio

atom_names = list(safe_dist)
structure = Atoms(
positions=coords,
numbers=[atom_names.index(n) + 1 for n in atom_types],
cell=cell,
pbc=(not nopbc),
)

cell, _ = structure.get_cell().standard_form() # type: ignore

if (
cell[1][0] > np.tan(self.theta / 180.0 * np.pi) * cell[1][1]
or cell[2][0] > np.tan(self.theta / 180.0 * np.pi) * cell[2][2]
or cell[2][1] > np.tan(self.theta / 180.0 * np.pi) * cell[2][2]
):
print("Inclined box")
return False

a = cell[0][0]
b = cell[1][1]
c = cell[2][2]

if check_multiples(a, b, c, self.length_ratio):
print("One side is %s larger than another" % self.length_ratio)
return False

P = [[2, 0, 0], [0, 2, 0], [0, 0, 2]]
extended_structure = make_supercell(structure, P)

coords = extended_structure.positions
symbols = extended_structure.get_chemical_symbols()

num_atoms = len(coords)
for i in range(num_atoms):
for j in range(i + 1, num_atoms):
dist = extended_structure.get_distance(i, j, mic=True)
type_i = symbols[i]
type_j = symbols[j]
dr = safe_dist[type_i] + safe_dist[type_j]

if dist < dr:
print(
f"Dangerous close for {type_i} - {type_j}, {dist:.5f} less than {dr:.5f}"
)
return False

print("Valid structure")
return True
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replace print statements with logging.

Consider using a logging framework instead of print statements to provide better control over the output and integrate with existing logging systems.

- print("Inclined box")
+ logger.warning("Inclined box")
- print("One side is %s larger than another" % self.length_ratio)
+ logger.warning("One side is %s larger than another", self.length_ratio)
- print(f"Dangerous close for {type_i} - {type_j}, {dist:.5f} less than {dr:.5f}")
+ logger.warning("Dangerous close for %s - %s, %.5f less than %.5f", type_i, type_j, dist, dr)
- print("Valid structure")
+ logger.info("Valid structure")
import logging

logger = logging.getLogger(__name__)

Comment on lines +253 to +260
Argument(
"filters",
list,
[],
[variant_filter()],
optional=True,
default=[],
doc=doc_filters,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid mutable default arguments.

Using a mutable default argument like a list can lead to unexpected behavior. Consider using None and initializing inside the function.

- def lmp_args(filters: List[dict] = [], optional=True, doc=doc_filters):
+ def lmp_args(filters: List[dict] = None, optional=True, doc=doc_filters):
+     if filters is None:
+         filters = []

- def caly_args(filters: List[dict] = [], optional=True, doc=doc_filters):
+ def caly_args(filters: List[dict] = None, optional=True, doc=doc_filters):
+     if filters is None:
+         filters = []

Also applies to: 346-354

def variant_filter():
doc = "the type of the configuration filter."
var_list = []
for kk in conf_filter_styles.keys():
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optimize dictionary key iteration.

Use for kk in conf_filter_styles: instead of for kk in conf_filter_styles.keys(): for better performance.

- for kk in conf_filter_styles.keys():
+ for kk in conf_filter_styles:
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
for kk in conf_filter_styles.keys():
for kk in conf_filter_styles:
Tools
Ruff

183-183: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)

dpgen2/exploration/selector/conf_filter.py Outdated Show resolved Hide resolved
dpgen2/exploration/selector/distance_conf_filter.py Outdated Show resolved Hide resolved
dpgen2/exploration/selector/distance_conf_filter.py Outdated Show resolved Hide resolved
dpgen2/exploration/selector/distance_conf_filter.py Outdated Show resolved Hide resolved
dpgen2/exploration/selector/distance_conf_filter.py Outdated Show resolved Hide resolved
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Comment on lines 133 to 187
class DistanceConfFilter(ConfFilter):
def __init__(self, custom_safe_dist=None, safe_dist_ratio=1.0):
self.custom_safe_dist = custom_safe_dist if custom_safe_dist is not None else {}
self.safe_dist_ratio = safe_dist_ratio

def check(
self,
coords: np.ndarray,
cell: np.ndarray,
atom_types: np.ndarray,
nopbc: bool,
):
from ase import (
Atoms,
)
from ase.build import (
make_supercell,
)

safe_dist = deepcopy(safe_dist_dict)
safe_dist.update(self.custom_safe_dist)
for k in safe_dist:
# bohr -> ang and multiply by a relaxation ratio
safe_dist[k] *= 0.529 / 1.2 * self.safe_dist_ratio

atom_names = list(safe_dist)
structure = Atoms(
positions=coords,
numbers=[atom_names.index(n) + 1 for n in atom_types],
cell=cell,
pbc=(not nopbc),
)

P = [[2, 0, 0], [0, 2, 0], [0, 0, 2]]
extended_structure = make_supercell(structure, P)

coords = extended_structure.positions
symbols = extended_structure.get_chemical_symbols()

num_atoms = len(coords)
for i in range(num_atoms):
for j in range(i + 1, num_atoms):
dist = extended_structure.get_distance(i, j, mic=True)
type_i = symbols[i]
type_j = symbols[j]
dr = safe_dist[type_i] + safe_dist[type_j]

if dist < dr:
print(
f"Dangerous close for {type_i} - {type_j}, {dist:.5f} less than {dr:.5f}"
)
return False

return True

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Enhance DistanceConfFilter with logging and configuration.

The DistanceConfFilter class uses print statements for output. Replace these with logging for better control and integration with logging systems. Additionally, consider externalizing configuration values like safe_dist_dict.

import logging

logger = logging.getLogger(__name__)
- print(
-     f"Dangerous close for {type_i} - {type_j}, {dist:.5f} less than {dr:.5f}"
- )
+ logger.warning(
+     "Dangerous close for %s - %s, %.5f less than %.5f", type_i, type_j, dist, dr
+ )

Comment on lines 274 to 305
class BoxLengthFilter(ConfFilter):
def __init__(self, length_ratio=5.0):
self.length_ratio = length_ratio

def check(
self,
coords: np.ndarray,
cell: np.ndarray,
atom_types: np.ndarray,
nopbc: bool,
):
from ase import (
Atoms,
)

atom_names = list(safe_dist_dict)
structure = Atoms(
positions=coords,
numbers=[atom_names.index(n) + 1 for n in atom_types],
cell=cell,
pbc=(not nopbc),
)

cell, _ = structure.get_cell().standard_form() # type: ignore

a = cell[0][0]
b = cell[1][1]
c = cell[2][2]

if check_multiples(a, b, c, self.length_ratio):
print("One side is %s larger than another" % self.length_ratio)
return False
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replace print statements with logging in BoxLengthFilter.

Replace the print statement with logging to provide better control over the output.

- print("One side is %s larger than another" % self.length_ratio)
+ logger.warning("One side is %s larger than another", self.length_ratio)
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
class BoxLengthFilter(ConfFilter):
def __init__(self, length_ratio=5.0):
self.length_ratio = length_ratio
def check(
self,
coords: np.ndarray,
cell: np.ndarray,
atom_types: np.ndarray,
nopbc: bool,
):
from ase import (
Atoms,
)
atom_names = list(safe_dist_dict)
structure = Atoms(
positions=coords,
numbers=[atom_names.index(n) + 1 for n in atom_types],
cell=cell,
pbc=(not nopbc),
)
cell, _ = structure.get_cell().standard_form() # type: ignore
a = cell[0][0]
b = cell[1][1]
c = cell[2][2]
if check_multiples(a, b, c, self.length_ratio):
print("One side is %s larger than another" % self.length_ratio)
return False
class BoxLengthFilter(ConfFilter):
def __init__(self, length_ratio=5.0):
self.length_ratio = length_ratio
def check(
self,
coords: np.ndarray,
cell: np.ndarray,
atom_types: np.ndarray,
nopbc: bool,
):
from ase import (
Atoms,
)
atom_names = list(safe_dist_dict)
structure = Atoms(
positions=coords,
numbers=[atom_names.index(n) + 1 for n in atom_types],
cell=cell,
pbc=(not nopbc),
)
cell, _ = structure.get_cell().standard_form() # type: ignore
a = cell[0][0]
b = cell[1][1]
c = cell[2][2]
if check_multiples(a, b, c, self.length_ratio):
logger.warning("One side is %s larger than another", self.length_ratio)
return False

Comment on lines 218 to 250
class BoxSkewnessConfFilter(ConfFilter):
def __init__(self, theta=60.0):
self.theta = theta

def check(
self,
coords: np.ndarray,
cell: np.ndarray,
atom_types: np.ndarray,
nopbc: bool,
):
from ase import (
Atoms,
)

atom_names = list(safe_dist_dict)
structure = Atoms(
positions=coords,
numbers=[atom_names.index(n) + 1 for n in atom_types],
cell=cell,
pbc=(not nopbc),
)

cell, _ = structure.get_cell().standard_form() # type: ignore

if (
cell[1][0] > np.tan(self.theta / 180.0 * np.pi) * cell[1][1]
or cell[2][0] > np.tan(self.theta / 180.0 * np.pi) * cell[2][2]
or cell[2][1] > np.tan(self.theta / 180.0 * np.pi) * cell[2][2]
):
print("Inclined box")
return False
return True
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replace print statements with logging in BoxSkewnessConfFilter.

Replace the print statement with logging to provide better control over the output.

- print("Inclined box")
+ logger.warning("Inclined box")
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
class BoxSkewnessConfFilter(ConfFilter):
def __init__(self, theta=60.0):
self.theta = theta
def check(
self,
coords: np.ndarray,
cell: np.ndarray,
atom_types: np.ndarray,
nopbc: bool,
):
from ase import (
Atoms,
)
atom_names = list(safe_dist_dict)
structure = Atoms(
positions=coords,
numbers=[atom_names.index(n) + 1 for n in atom_types],
cell=cell,
pbc=(not nopbc),
)
cell, _ = structure.get_cell().standard_form() # type: ignore
if (
cell[1][0] > np.tan(self.theta / 180.0 * np.pi) * cell[1][1]
or cell[2][0] > np.tan(self.theta / 180.0 * np.pi) * cell[2][2]
or cell[2][1] > np.tan(self.theta / 180.0 * np.pi) * cell[2][2]
):
print("Inclined box")
return False
return True
class BoxSkewnessConfFilter(ConfFilter):
def __init__(self, theta=60.0):
self.theta = theta
def check(
self,
coords: np.ndarray,
cell: np.ndarray,
atom_types: np.ndarray,
nopbc: bool,
):
from ase import (
Atoms,
)
atom_names = list(safe_dist_dict)
structure = Atoms(
positions=coords,
numbers=[atom_names.index(n) + 1 for n in atom_types],
cell=cell,
pbc=(not nopbc),
)
cell, _ = structure.get_cell().standard_form() # type: ignore
if (
cell[1][0] > np.tan(self.theta / 180.0 * np.pi) * cell[1][1]
or cell[2][0] > np.tan(self.theta / 180.0 * np.pi) * cell[2][2]
or cell[2][1] > np.tan(self.theta / 180.0 * np.pi) * cell[2][2]
):
logger.warning("Inclined box")
return False
return True

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

import unittest

import dpdata
import numpy as np
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove unused imports.

The imports for numpy and .context.dpgen2 are not used in the file.

- import numpy as np
- from .context import (
-     dpgen2,
- )

Also applies to: 14-14

Tools
Ruff

5-5: numpy imported but unused

Remove unused import: numpy

(F401)

zjgemi added 2 commits August 20, 2024 15:19
Signed-off-by: zjgemi <[email protected]>
Signed-off-by: zjgemi <[email protected]>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Outside diff range, codebase verification and nitpick comments (1)
dpgen2/exploration/selector/distance_conf_filter.py (1)

1-14: Remove unused imports.

The following imports appear to be unused in the file and should be removed to clean up the code:

  • logging
  • copy
  • typing
Analysis chain

Remove unused imports.

The imports dargs, dpdata, and deepcopy are used, but ensure that numpy and specific imports from ase are necessary. Remove any unused imports to clean up the code.

- import numpy as np

Run the following script to verify the usage of imports:

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the usage of imports in the file.

# Test: Check for unused imports. Expect: No output if all imports are used.
rg --type python 'import' dpgen2/exploration/selector/distance_conf_filter.py | grep -v -E 'dargs|dpdata|deepcopy|ase'

Length of output: 149


Script:

#!/bin/bash
# Description: Verify the usage of imports in the file.

# Corrected Test: Check for unused imports by searching for import statements and verifying their usage.
rg 'import ' dpgen2/exploration/selector/distance_conf_filter.py | grep -v -E 'dargs|dpdata|deepcopy|ase|numpy'

Length of output: 184

@wanghan-iapcm wanghan-iapcm merged commit 44d3fd1 into deepmodeling:master Aug 21, 2024
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants