Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add DiffCSP as a new exploration engine #251

Merged
merged 41 commits into from
Aug 22, 2024

Conversation

zjgemi
Copy link
Collaborator

@zjgemi zjgemi commented Aug 19, 2024

Summary by CodeRabbit

  • New Features

    • Introduced variant_filter for configuration filtering in exploration tasks.
    • Added support for the new DiffCSP exploration style, enhancing exploration capabilities.
    • New classes and methods for managing and executing DiffCSP tasks, including DiffCSPTaskGroup.
    • CIF file conversion functionality added via the DiffCSPGen class.
    • New class DistanceConfFilter for validating atomic configurations based on distance criteria.
    • Implemented the RunRelax class for structured execution of relaxation tasks.
  • Bug Fixes

    • Enhanced configuration handling in multiple locations, addressing previously unsupported filters.
  • Tests

    • Comprehensive test suites for DistanceConfFilter, DiffCSPGen, RunRelax, and PrepRunDiffCSP to ensure functionality and integration.

Copy link

coderabbitai bot commented Aug 19, 2024

Walkthrough

Walkthrough

The recent modifications enhance the dpgen2 package by integrating the new "DiffCSP" exploration style, which includes the addition of various classes and functions for managing configuration filters, task groups, and relaxation processes. Comprehensive test frameworks have also been introduced, boosting the robustness of the codebase and improving the management of complex computational workflows.

Changes

Files Changed Change Summary
dpgen2/entrypoint/.../args.py Introduced variant_filter, updated lmp_args, caly_args, and added run_diffcsp_args and diffcsp_args for handling configuration filters in DiffCSP.
dpgen2/entrypoint/showkey.py Modified showkey function to include additional keys for enhanced output.
dpgen2/entrypoint/.../submit.py Added support for DiffCSP exploration style with new functions and classes, improving task management and configuration handling.
dpgen2/exploration/render/traj_render_lammps.py Updated get_confs to utilize conf_filters for filtering configurations.
dpgen2/exploration/selector/.../__init__.py Imported DistanceConfFilter and defined conf_filter_styles for distance-based filtering.
dpgen2/exploration/selector/.../conf_filter.py Modified check function to handle atom types by converting them to NumPy arrays for better processing.
dpgen2/exploration/selector/.../distance_conf_filter.py Implemented DistanceConfFilter class to validate configurations based on distance criteria.
dpgen2/exploration/task/.../__init__.py Added new imports for functionalities related to DiffCSP task groups.
dpgen2/exploration/task/.../diffcsp_task_group.py Defined DiffCSPTaskGroup class for managing DiffCSP-specific tasks.
dpgen2/exploration/task/.../make_task_group_from_config.py Introduced functions for creating and normalizing DiffCSP task groups.
dpgen2/op/.../__init__.py New imports for classes related to DiffCSP operations added.
dpgen2/op/.../diffcsp_gen.py Added functionality to convert model outputs to CIF format and defined the DiffCSPGen class.
dpgen2/op/.../prep_relax.py Introduced PrepRelax class for managing relaxation tasks.
dpgen2/op/run_relax.py Defined RunRelax class to execute relaxation operations effectively.
dpgen2/superop/.../__init__.py Imported PrepRunDiffCSP to support new workflows.
dpgen2/superop/.../prep_run_diffcsp.py Implemented PrepRunDiffCSP class for orchestrating multi-step workflows involving DiffCSP processes.
examples/diffcsp/dpgen.json Introduced a comprehensive configuration for managing DiffCSP computational tasks.
tests/exploration/.../test_distance_conf_filter.py Added unit tests for DistanceConfFilter to validate atomic configuration checks.
tests/op/.../test_diffcsp_gen.py Created tests for the DiffCSPGen class to ensure proper CIF generation from model outputs.
tests/op/.../test_prep_relax.py Implemented tests for PrepRelax, focusing on task generation from CIF files.
tests/op/.../test_run_relax.py Developed tests for RunRelax to validate relaxation task executions.
tests/.../test_prep_run_diffcsp.py Introduced tests for the PrepRunDiffCSP workflow to ensure integration of multiple components works as intended.

Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between dea89da and 4b42e38.

Files selected for processing (1)
  • dpgen2/op/diffcsp_gen.py (1 hunks)
Additional context used
Ruff
dpgen2/op/diffcsp_gen.py

92-92: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

Additional comments not posted (2)
dpgen2/op/diffcsp_gen.py (2)

92-92: Enhance exception handling with context.

Use raise ... from None to distinguish exceptions raised in the except block from those occurring elsewhere.

This comment has already been made in previous reviews.

Tools
Ruff

92-92: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


93-93: Verify the existence of model_path.

Ensure that model_path exists before proceeding with file operations.

Consider adding a check to confirm the existence of model_path and handle the case where it does not exist.


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share
Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai generate interesting stats about this repository and render them as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 22

Outside diff range, codebase verification and nitpick comments (7)
dpgen2/exploration/task/diffcsp_task_group.py (1)

47-52: Attributes Missing in ExplorationTask Class

The attributes trj_freq, fmax, steps, and timeout are not defined in the ExplorationTask class. Consider adding these attributes to the class definition to ensure proper functionality.

  • File: dpgen2/exploration/task/task.py
Analysis chain

Verify attributes of ExplorationTask.

The attributes trj_freq, fmax, steps, and timeout are being assigned to ExplorationTask. Verify that these attributes exist in ExplorationTask or modify the class to include them.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the attributes of `ExplorationTask` to ensure they exist.

# Test: Search for the `ExplorationTask` class and check for the attributes.
ast-grep --lang python --pattern $'class ExplorationTask {
  $$$
  trj_freq: $_
  fmax: $_
  steps: $_
  timeout: $_
  $$$
}'

Length of output: 638


Script:

#!/bin/bash
# Description: Verify the attributes of `ExplorationTask` to ensure they exist.

# Search for the definition of the `ExplorationTask` class and print surrounding lines to check for attributes.
rg -A 10 'class ExplorationTask' --glob '*.py'

Length of output: 1429


Script:

#!/bin/bash
# Description: Check for specific attributes in the `ExplorationTask` class.

# Search for the attributes `trj_freq`, `fmax`, `steps`, and `timeout` in the `dpgen2/exploration/task/task.py` file.
rg 'trj_freq|fmax|steps|timeout' dpgen2/exploration/task/task.py

Length of output: 66


Script:

#!/bin/bash
# Description: Check for inheritance or dynamic assignments related to `ExplorationTask`.

# Search for the `ExplorationTask` class definition and any inheritance.
rg 'class ExplorationTask' --glob '*.py' -A 5

# Search for dynamic assignments to `ExplorationTask` attributes in the codebase.
rg 'ExplorationTask' --glob '*.py' -A 5

Length of output: 56930

Tools
GitHub Check: pyright

[failure] 49-49:
Cannot assign member "trj_freq" for type "ExplorationTask"
  Member "trj_freq" is unknown (reportGeneralTypeIssues)


[failure] 50-50:
Cannot assign member "fmax" for type "ExplorationTask"
  Member "fmax" is unknown (reportGeneralTypeIssues)


[failure] 51-51:
Cannot assign member "steps" for type "ExplorationTask"
  Member "steps" is unknown (reportGeneralTypeIssues)


[failure] 52-52:
Cannot assign member "timeout" for type "ExplorationTask"
  Member "timeout" is unknown (reportGeneralTypeIssues)

dpgen2/op/diffcsp_gen.py (2)

1-7: Optimize imports for better readability.

Consider grouping imports from the same module together to enhance readability. For instance, Path and List can be imported on a single line.

from pathlib import Path
from typing import List

27-29: Consider using os.makedirs with exist_ok=True.

Instead of checking for the existence of a directory and then creating it, you can use os.makedirs(output_dir, exist_ok=True) to simplify the code.

- if not os.path.exists(output_dir):
-     os.makedirs(output_dir)
+ os.makedirs(output_dir, exist_ok=True)
tests/exploration/test_distance_conf_filter.py (1)

57-62: Ensure cleanup in setUp method.

If the setUp method is interrupted, files might not be cleaned up. Consider using try-finally or addCleanup to ensure cleanup.

def setUp(self):
    try:
        with open("POSCAR_1", "w") as f:
            f.write(POSCAR_1)
        with open("POSCAR_2", "w") as f:
            f.write(POSCAR_2)
    finally:
        self.addCleanup(self.tearDown)
dpgen2/exploration/selector/distance_conf_filter.py (3)

1-16: Organize imports for clarity.

Consider grouping related imports together and removing any unnecessary ones. The dargs import is used, but you might want to ensure all imported modules are necessary.

from copy import deepcopy
from typing import List

import numpy as np
import dargs
from dargs import Argument

from . import ConfFilter

133-140: Initialize class attributes with type hints.

Consider adding type hints for class attributes in DistanceConfFilter to improve code readability and maintainability.

class DistanceConfFilter(ConfFilter):
    def __init__(
        self, 
        custom_safe_dist: dict = None, 
        safe_dist_ratio: float = 1.0, 
        theta: float = 60.0, 
        length_ratio: float = 5.0
    ):

211-254: Ensure documentation consistency in args method.

The documentation strings for the args method should be consistent with the rest of the codebase, ensuring clarity and completeness.

- r"""The argument definition of the `ConfFilter`.
+ r"""The argument definition of the `DistanceConfFilter`.

Comment on lines +7 to +9
from .prep_run_diffcsp import (
PrepRunDiffCSP,
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using or exporting PrepRunDiffCSP.

The PrepRunDiffCSP class is imported but not used within this module. Consider using it, adding it to __all__, or removing the import if it's unnecessary.

- from .prep_run_diffcsp import (
-     PrepRunDiffCSP,
- )
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
from .prep_run_diffcsp import (
PrepRunDiffCSP,
)
Tools
Ruff

8-8: .prep_run_diffcsp.PrepRunDiffCSP imported but unused; consider removing, adding to __all__, or using a redundant alias

(F401)

Comment on lines +7 to +9
from .diffcsp_gen import (
DiffCSPGen,
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using or exporting DiffCSPGen.

The DiffCSPGen class is imported but not used within this module. Consider using it, adding it to __all__, or removing the import if it's unnecessary.

- from .diffcsp_gen import (
-     DiffCSPGen,
- )
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
from .diffcsp_gen import (
DiffCSPGen,
)
Tools
Ruff

8-8: .diffcsp_gen.DiffCSPGen imported but unused; consider removing, adding to __all__, or using a redundant alias

(F401)

Comment on lines +25 to +27
from .prep_relax import (
PrepRelax,
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using or exporting PrepRelax.

The PrepRelax class is imported but not used within this module. Consider using it, adding it to __all__, or removing the import if it's unnecessary.

- from .prep_relax import (
-     PrepRelax,
- )
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
from .prep_relax import (
PrepRelax,
)
Tools
Ruff

26-26: .prep_relax.PrepRelax imported but unused; consider removing, adding to __all__, or using a redundant alias

(F401)

Comment on lines +40 to +42
from .run_relax import (
RunRelax,
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using or exporting RunRelax.

The RunRelax class is imported but not used within this module. Consider using it, adding it to __all__, or removing the import if it's unnecessary.

- from .run_relax import (
-     RunRelax,
- )
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
from .run_relax import (
RunRelax,
)
Tools
Ruff

41-41: .run_relax.RunRelax imported but unused; consider removing, adding to __all__, or using a redundant alias

(F401)

from .lmp_template_task_group import (
LmpTemplateTaskGroup,
)
from .make_task_group_from_config import (
caly_normalize,
caly_task_group_args,
diffcsp_normalize,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using or exporting diffcsp_normalize.

The diffcsp_normalize function is imported but not used within this module. Consider using it, adding it to __all__, or removing the import if it's unnecessary.

- from .make_task_group_from_config import (
-     diffcsp_normalize,
- )

Committable suggestion was skipped due to low confidence.

Tools
Ruff

19-19: .make_task_group_from_config.diffcsp_normalize imported but unused; consider removing, adding to __all__, or using a redundant alias

(F401)

Comment on lines +21 to +32
OPTemplate,
OutputArtifact,
OutputParameter,
Outputs,
Step,
Steps,
Workflow,
argo_len,
argo_range,
argo_sequence,
download_artifact,
upload_artifact,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove unused imports from dflow.

These imports are not used in the file and can be removed to clean up the code.

- from dflow import (
-     OPTemplate,
-     OutputParameter,
-     Workflow,
-     argo_len,
-     argo_range,
-     download_artifact,
-     upload_artifact,
- )
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
OPTemplate,
OutputArtifact,
OutputParameter,
Outputs,
Step,
Steps,
Workflow,
argo_len,
argo_range,
argo_sequence,
download_artifact,
upload_artifact,
OutputArtifact,
Outputs,
Step,
Steps,
argo_sequence,
Tools
Ruff

21-21: dflow.OPTemplate imported but unused

Remove unused import

(F401)


23-23: dflow.OutputParameter imported but unused

Remove unused import

(F401)


27-27: dflow.Workflow imported but unused

Remove unused import

(F401)


28-28: dflow.argo_len imported but unused

Remove unused import

(F401)


29-29: dflow.argo_range imported but unused

Remove unused import

(F401)


31-31: dflow.download_artifact imported but unused

Remove unused import

(F401)


32-32: dflow.upload_artifact imported but unused

Remove unused import

(F401)

Comment on lines 18 to 116
safe_dist_dict = {
"H": 1.2255,
"He": 0.936,
"Li": 1.8,
"Be": 1.56,
"B": 1.32,
"C": 1.32,
"N": 1.32,
"O": 1.32,
"F": 1.26,
"Ne": 1.92,
"Na": 1.595,
"Mg": 1.87,
"Al": 1.87,
"Si": 1.76,
"P": 1.65,
"S": 1.65,
"Cl": 1.65,
"Ar": 2.09,
"K": 2.3,
"Ca": 2.3,
"Sc": 2.0,
"Ti": 2.0,
"V": 2.0,
"Cr": 1.9,
"Mn": 1.95,
"Fe": 1.9,
"Co": 1.9,
"Ni": 1.9,
"Cu": 1.9,
"Zn": 1.9,
"Ga": 2.0,
"Ge": 2.0,
"As": 2.0,
"Se": 2.1,
"Br": 2.1,
"Kr": 2.3,
"Rb": 2.5,
"Sr": 2.5,
"Y": 2.1,
"Zr": 2.1,
"Nb": 2.1,
"Mo": 2.1,
"Tc": 2.1,
"Ru": 2.1,
"Rh": 2.1,
"Pd": 2.1,
"Ag": 2.1,
"Cd": 2.1,
"In": 2.0,
"Sn": 2.0,
"Sb": 2.0,
"Te": 2.0,
"I": 2.0,
"Xe": 2.0,
"Cs": 2.5,
"Ba": 2.8,
"La": 2.5,
"Ce": 2.55,
"Pr": 2.7,
"Nd": 2.8,
"Pm": 2.8,
"Sm": 2.8,
"Eu": 2.8,
"Gd": 2.8,
"Tb": 2.8,
"Dy": 2.8,
"Ho": 2.8,
"Er": 2.6,
"Tm": 2.8,
"Yb": 2.8,
"Lu": 2.8,
"Hf": 2.4,
"Ta": 2.5,
"W": 2.3,
"Re": 2.3,
"Os": 2.3,
"Ir": 2.3,
"Pt": 2.3,
"Au": 2.3,
"Hg": 2.3,
"Tl": 2.3,
"Pb": 2.3,
"Bi": 2.3,
"Po": 2.3,
"At": 2.3,
"Rn": 2.3,
"Fr": 2.9,
"Ra": 2.9,
"Ac": 2.9,
"Th": 2.8,
"Pa": 2.8,
"U": 2.8,
"Np": 2.8,
"Pu": 2.8,
"Am": 2.8,
"Cm": 2.8,
"Cf": 2.3,
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider externalizing safe_dist_dict.

The safe_dist_dict is quite large and could be externalized to a configuration file or a separate module to improve readability and maintainability.

# Example of externalizing the dictionary
# safe_dist_dict.py
safe_dist_dict = {
    "H": 1.2255,
    "He": 0.936,
    # ... rest of the elements
}

# In your main file
from .safe_dist_dict import safe_dist_dict

Comment on lines 142 to 209
def check(
self,
coords: np.ndarray,
cell: np.ndarray,
atom_types: np.ndarray,
nopbc: bool,
):
from ase import (
Atoms,
)
from ase.build import (
make_supercell,
)

safe_dist = deepcopy(safe_dist_dict)
safe_dist.update(self.custom_safe_dist)
for k in safe_dist:
# bohr -> ang and multiply by a relaxation ratio
safe_dist[k] *= 0.529 / 1.2 * self.safe_dist_ratio

atom_names = list(safe_dist)
structure = Atoms(
positions=coords,
numbers=[atom_names.index(n) + 1 for n in atom_types],
cell=cell,
pbc=(not nopbc),
)

cell, _ = structure.get_cell().standard_form() # type: ignore

if (
cell[1][0] > np.tan(self.theta / 180.0 * np.pi) * cell[1][1]
or cell[2][0] > np.tan(self.theta / 180.0 * np.pi) * cell[2][2]
or cell[2][1] > np.tan(self.theta / 180.0 * np.pi) * cell[2][2]
):
print("Inclined box")
return False

a = cell[0][0]
b = cell[1][1]
c = cell[2][2]

if check_multiples(a, b, c, self.length_ratio):
print("One side is %s larger than another" % self.length_ratio)
return False

P = [[2, 0, 0], [0, 2, 0], [0, 0, 2]]
extended_structure = make_supercell(structure, P)

coords = extended_structure.positions
symbols = extended_structure.get_chemical_symbols()

num_atoms = len(coords)
for i in range(num_atoms):
for j in range(i + 1, num_atoms):
dist = extended_structure.get_distance(i, j, mic=True)
type_i = symbols[i]
type_j = symbols[j]
dr = safe_dist[type_i] + safe_dist[type_j]

if dist < dr:
print(
f"Dangerous close for {type_i} - {type_j}, {dist:.5f} less than {dr:.5f}"
)
return False

print("Valid structure")
return True
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Improve error handling and logging in check method.

The check method uses print statements for logging. Consider using a logging framework for better control over log levels and outputs.

import logging

# Configure logging at the start of your module
logging.basicConfig(level=logging.INFO)

# Replace print statements with logging
logging.info("Inclined box")
logging.info("One side is %s larger than another", self.length_ratio)
logging.info("Dangerous close for %s - %s, %.5f less than %.5f", type_i, type_j, dist, dr)
logging.info("Valid structure")

Comment on lines 119 to 130
def check_multiples(a, b, c, multiple):
values = [a, b, c]

for i in range(len(values)):
for j in range(len(values)):
if i != j:
if values[i] > multiple * values[j]:
print(
f"Value {values[i]} is {multiple} times greater than {values[j]}"
)
return True
return False
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Combine nested if statements.

The nested if statements in check_multiples can be combined for simplicity.

- if i != j:
-     if values[i] > multiple * values[j]:
+ if i != j and values[i] > multiple * values[j]:
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def check_multiples(a, b, c, multiple):
values = [a, b, c]
for i in range(len(values)):
for j in range(len(values)):
if i != j:
if values[i] > multiple * values[j]:
print(
f"Value {values[i]} is {multiple} times greater than {values[j]}"
)
return True
return False
def check_multiples(a, b, c, multiple):
values = [a, b, c]
for i in range(len(values)):
for j in range(len(values)):
if i != j and values[i] > multiple * values[j]:
print(
f"Value {values[i]} is {multiple} times greater than {values[j]}"
)
return True
return False
Tools
Ruff

124-125: Use a single if statement instead of nested if statements

Combine if statements using and

(SIM102)

def variant_filter():
doc = "the type of the configuration filter."
var_list = []
for kk in conf_filter_styles.keys():
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simplify dictionary key iteration.

Use for kk in conf_filter_styles: instead of for kk in conf_filter_styles.keys(): for better readability and performance.

-    for kk in conf_filter_styles.keys():
+    for kk in conf_filter_styles:
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
for kk in conf_filter_styles.keys():
for kk in conf_filter_styles:
Tools
Ruff

183-183: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)

@zjgemi zjgemi changed the title Add DiffCSP as a new exploration engine [WIP] Add DiffCSP as a new exploration engine Aug 19, 2024
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Outside diff range, codebase verification and nitpick comments (3)
dpgen2/op/run_relax.py (3)

118-118: Fix type issue with ase.calculators.

The SinglePointCalculator is not part of the ase package. It appears to be a custom class defined in tests/op/test_run_relax.py. Ensure that you are using the correct module or class in dpgen2/op/run_relax.py.

  • dpgen2/op/run_relax.py: Verify and correct the usage of SinglePointCalculator.
Analysis chain

Fix type issue with ase.calculators.

The module ase.calculators does not have a member singlepoint. Ensure that the correct module or class is being used.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the correct import path for `SinglePointCalculator`.

# Test: Search for `SinglePointCalculator` in the `ase` package.
rg --type python 'SinglePointCalculator'

Length of output: 71


Script:

#!/bin/bash
# Description: Verify the correct import path for `SinglePointCalculator`.

# Test: Search for `SinglePointCalculator` in the `ase` package.
rg 'SinglePointCalculator'

Length of output: 276

Tools
GitHub Check: pyright

[failure] 118-118:
"calculators" is not a known member of module "ase" (reportGeneralTypeIssues)


142-142: Potential Issue with dp Initialization

The dp variable is initialized from the graphs list, which includes a None element at the first position. If j is 0, dp will be None, leading to a NoneType error when calling eval(). Ensure that j is never 0 when accessing graphs[j] for dp.

  • File: dpgen2/op/run_relax.py
  • Line: 142
Analysis chain

Fix type issue with dp.eval().

The method eval is being called on a NoneType. Ensure that dp is correctly initialized.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify initialization of `dp` and usage of `eval`.

# Test: Search for `eval` method in the codebase.
rg --type python 'def eval'

Length of output: 60


Script:

#!/bin/bash
# Description: Verify initialization of `dp` and usage of `eval`.

# Search for the initialization of `dp`.
rg --type py 'dp\s*=' -A 5

# Search for the `eval` method definition.
ast-grep --lang python --pattern 'def eval($_$$$)'

Length of output: 3274


Script:

#!/bin/bash
# Description: Investigate the `graphs` variable to understand its type and ensure `dp` is not `None`.

# Search for the definition or initialization of `graphs`.
rg --type py 'graphs\s*=' -A 5

Length of output: 1352

Tools
GitHub Check: pyright

[failure] 142-142:
Cannot access member "eval" for type "None"
  Member "eval" is unknown (reportGeneralTypeIssues)


139-139: Fix type issue with dp.get_type_map(). The initialization of graphs with None as its first element can lead to dp being None if accessed with an index of 0. Ensure that j is never 0 when accessing graphs[j] to avoid calling get_type_map on a NoneType.

  • dpgen2/op/run_relax.py: Ensure j is not 0 when accessing graphs[j].
Analysis chain

Fix type issue with dp.get_type_map().

The method get_type_map is being called on a NoneType. Ensure that dp is correctly initialized.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify initialization of `dp` and usage of `get_type_map`.

# Test: Search for `get_type_map` method in the codebase.
rg --type python 'def get_type_map'

Length of output: 68


Script:

#!/bin/bash
# Corrected script to search for `get_type_map` method in Python files.

# Find all Python files and search for the definition of `get_type_map`.
rg 'def get_type_map' --glob '*.py' -A 5

# Search for the initialization of `dp` in the relevant file.
rg 'dp\s*=' --glob '*.py' -A 3

Length of output: 2221


Script:

#!/bin/bash
# Search for the initialization and usage of `graphs` in `dpgen2/op/run_relax.py`.

# Find where `graphs` is initialized or assigned.
rg 'graphs\s*=' --glob 'dpgen2/op/run_relax.py' -A 3

# Search for any function or method definitions in the file to understand the context better.
rg 'def ' --glob 'dpgen2/op/run_relax.py' -A 2

Length of output: 808

Tools
GitHub Check: pyright

[failure] 139-139:
Cannot access member "get_type_map" for type "None"
  Member "get_type_map" is unknown (reportGeneralTypeIssues)

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

dpgen2/op/run_relax.py Outdated Show resolved Hide resolved
Copy link

codecov bot commented Aug 19, 2024

Codecov Report

Attention: Patch coverage is 85.90604% with 42 lines in your changes missing coverage. Please review.

Project coverage is 84.05%. Comparing base (44d3fd1) to head (4b42e38).
Report is 17 commits behind head on master.

Files with missing lines Patch % Lines
dpgen2/entrypoint/submit.py 5.26% 18 Missing ⚠️
...n2/exploration/task/make_task_group_from_config.py 23.52% 13 Missing ⚠️
dpgen2/superop/prep_run_diffcsp.py 90.74% 5 Missing ⚠️
dpgen2/op/run_relax.py 95.12% 4 Missing ⚠️
dpgen2/op/diffcsp_gen.py 96.61% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #251      +/-   ##
==========================================
+ Coverage   83.87%   84.05%   +0.17%     
==========================================
  Files          98      103       +5     
  Lines        5433     5725     +292     
==========================================
+ Hits         4557     4812     +255     
- Misses        876      913      +37     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.


🚨 Try these New Features:

@zjgemi zjgemi changed the title [WIP] Add DiffCSP as a new exploration engine Add DiffCSP as a new exploration engine Aug 19, 2024
Copy link

@wanghan-iapcm wanghan-iapcm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would recommend cherry-picking the implementation of diffcsp, so this pr is decoupled from #250

@zjgemi
Copy link
Collaborator Author

zjgemi commented Aug 20, 2024

I would recommend cherry-picking the implementation of diffcsp, so this pr is decoupled from #250

There are conf-filter related modifications in diffcsp. We can handle this PR after #250.

dpgen2/entrypoint/submit.py Outdated Show resolved Hide resolved
examples/diffcsp/dpgen.json Outdated Show resolved Hide resolved
zjgemi and others added 9 commits August 22, 2024 09:55
…_exploration_scheduler; add conf_filters to example

Signed-off-by: zjgemi <[email protected]>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Comment on lines +28 to +61
data = torch.load(input_file, map_location=torch.device("cpu"))
if not os.path.exists(output_dir):
os.makedirs(output_dir)
lengths = data["lengths"]
angles = data["angles"]
num_atoms = data["num_atoms"]
frac_coors = data["frac_coords"]
atom_types = data["atom_types"]

lengths_list = lengths.numpy()
angles_list = angles.numpy()
num_atoms_list = num_atoms
frac_coors_list = frac_coors.numpy()
atom_types_list = atom_types
if len(atom_types_list.shape) > 1:
atom_types_list = np.argmax(atom_types_list, axis=-1) + 1

now_atom = 0
for i in range(len(num_atoms_list)):
length = lengths_list[i]
angle = angles_list[i]
atom_num = num_atoms_list[i]

atom_type = atom_types_list[now_atom : now_atom + atom_num]
frac_coord = frac_coors_list[now_atom : now_atom + atom_num]
lattice = Lattice.from_parameters(*(length.tolist() + angle.tolist()))
structure = Structure(
lattice, atom_type, frac_coord, coords_are_cartesian=False
)

filename = "%s.cif" % i
file_path = os.path.join(output_dir, filename)
structure.to(filename=file_path)
now_atom += atom_num
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add error handling for file operations.

Consider adding error handling for file operations to catch potential issues with file I/O.

You can wrap the file operations in a try-except block to handle exceptions like FileNotFoundError or IOError.

except ValueError:
raise RuntimeError("Path of DiffCSP model not provided.")
model_path = args[i + 1]
subprocess.run(cmd, shell=True, check=True)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Improve subprocess command execution.

Use subprocess.run with additional parameters for better error handling and output management.

Apply this diff to improve the subprocess execution:

- subprocess.run(cmd, shell=True, check=True)
+ subprocess.run(cmd, shell=True, check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

This change captures the command's output and errors, allowing for better debugging and logging.

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
subprocess.run(cmd, shell=True, check=True)
subprocess.run(cmd, shell=True, check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

Comment on lines +29 to +30
if not os.path.exists(output_dir):
os.makedirs(output_dir)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use os.makedirs with exist_ok=True.

To simplify the code and avoid checking for the directory's existence, use os.makedirs with exist_ok=True.

Apply this diff to improve the code:

 if not os.path.exists(output_dir):
-    os.makedirs(output_dir)
+    os.makedirs(output_dir, exist_ok=True)
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if not os.path.exists(output_dir):
os.makedirs(output_dir)
os.makedirs(output_dir, exist_ok=True)

@wanghan-iapcm wanghan-iapcm merged commit 4967951 into deepmodeling:master Aug 22, 2024
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants