[FDS-1797] input graph api #1481

Draft
wants to merge 137 commits into
base: develop

afwillia
@afwillia afwillia commented Aug 28, 2024

This PR adds a graph_url parameter to the manifest/generate, validate, and submit API endpoints. graph_url is a URL pointing to a pickled data model graph. Supplying it should make the request run faster. This PR depends on #1425 and #1396, both of which I have already merged into this branch to facilitate development.

graph_url will be added to other endpoints that also accept schema_url, but the three endpoints above are the most relevant to DCA and the current improvement sprint.
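
As a rough illustration of the idea, here is a minimal sketch of loading a pickled graph from a URL. It uses only the standard library; `load_graph_from_url` and the dict-based stand-in graph are assumptions made for this example and are not taken from the PR or from schematic's code.

```python
import pickle
import tempfile
from pathlib import Path
from urllib.request import urlopen


def load_graph_from_url(graph_url: str):
    """Fetch a pickled graph from graph_url and unpickle it (hypothetical helper).

    Only use this with URLs you trust: unpickling can execute arbitrary code.
    """
    with urlopen(graph_url) as response:
        return pickle.loads(response.read())


# Stand-in for a data model graph: a plain adjacency mapping.
# This is an assumption; the PR pickles its own graph object.
example_graph = {"Patient": ["Biospecimen"], "Biospecimen": []}

with tempfile.TemporaryDirectory() as tmp_dir:
    pickle_path = Path(tmp_dir) / "example.model.pickle"
    pickle_path.write_bytes(pickle.dumps(example_graph))
    # as_uri() yields a file:// URL, so the demo runs without network access.
    loaded_graph = load_graph_from_url(pickle_path.as_uri())
```

The point of the sketch is that the caller skips rebuilding the graph from the schema and only pays for a download plus an unpickle.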

Linking #1425 , #1396

afwillia added 30 commits March 25, 2024 12:33
…th is pickle. Also test both parameters are empty.
Comment on lines 61 to 142
def test_schema_convert_cli(self, runner, output_path, output_type):
model = "tests/data/example.model.csv"
label_type = "class_label"
expected = 0

output_path = helpers.get_data_path("example.model.jsonld")
resultOne = runner.invoke(schema, ["convert", model])

label_type = "class_label"
assert resultOne.exit_code == expected
# check output_path file is created then remove it
assert os.path.exists(output_path)

resultTwo = runner.invoke(
schema, ["convert", model, "--output_path", output_path]
)

assert resultTwo.exit_code == expected
# check output_path file is created then remove it
assert os.path.exists(output_path)

resultThree = runner.invoke(
schema, ["convert", model, "--output_type", output_type]
)

assert resultThree.exit_code == expected
# check output_path file is created then remove it
assert os.path.exists(output_path)

resultFour = runner.invoke(
schema,
[
"convert",
model,
"--output_type",
output_type,
"--output_jsonld",
output_path,
],
)

assert resultFour.exit_code == expected
# check output_path file is created then remove it
assert os.path.exists(output_path)

result = runner.invoke(
schema,
[
"convert",
data_model_csv_path,
model,
"--output_jsonld",
output_path,
"--data_model_labels",
label_type,
],
)

assert result.exit_code == 0
assert result.exit_code == expected
# check output_path file is created then remove it
assert os.path.exists(output_path)

expected_substr = (
"The Data Model was created and saved to " f"'{output_path}' location."
resultFive = runner.invoke(
schema,
[
"convert",
model,
"--output_jsonld",
"tests/data/example.model.pickle",
"--output_path",
"tests/data/example.model.pickle",
],
)

assert expected_substr in result.output
assert resultFive.exit_code == expected
# check output_path file is created then remove it
assert os.path.exists(output_path)

resultSix = runner.invoke(
schema, ["convert", model, "--output_jsonld", "", "--output_path", ""]
)

assert resultSix.exit_code == expected
# check output_path file is created then remove it
assert os.path.exists(output_path)

+1 to Tom's suggestion. Please split this up so that it is clearer what is being tested.
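
One possible shape for the split is sketched below: one scenario per test function, each in a fresh temporary directory so an existence check cannot pass because of a file left behind by an earlier invocation. `fake_convert` is a hypothetical stand-in for the real `runner.invoke(schema, ["convert", ...])` call and is not part of schematic.

```python
import tempfile
from pathlib import Path
from typing import Optional


def fake_convert(model: str, output_path: Optional[str] = None) -> int:
    """Hypothetical stand-in for the CLI call: writes a file, returns an exit code."""
    target = Path(output_path) if output_path else Path(model).with_suffix(".jsonld")
    target.write_text("{}")
    return 0


def test_convert_default_output() -> None:
    # Scenario 1: no output option, so the output name is derived from the model.
    with tempfile.TemporaryDirectory() as tmp_dir:
        model = str(Path(tmp_dir) / "example.model.csv")
        assert fake_convert(model) == 0
        assert (Path(tmp_dir) / "example.model.jsonld").exists()


def test_convert_explicit_output_path() -> None:
    # Scenario 2: an explicit output path is honored.
    with tempfile.TemporaryDirectory() as tmp_dir:
        model = str(Path(tmp_dir) / "example.model.csv")
        output_path = Path(tmp_dir) / "custom.jsonld"
        assert fake_convert(model, str(output_path)) == 0
        assert output_path.exists()
```

With the real CLI runner, each function would keep its own invoke call and assertions, so a failure names the scenario that broke.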

@linglp

linglp commented Sep 11, 2024

@afwillia Another concern I have about this PR is that I am seeing a lot of:

        if data_model_graph_pickle:
            self.graph_data_model = read_pickle(data_model_graph_pickle)

Also, a lot of classes take both a graph_data_model parameter and a data_model_graph_pickle parameter. If we always get graph_data_model from the pickle file, wouldn't it make sense to keep just one parameter?

Here's an example of what I mean: #1499
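
To make the single-parameter idea concrete, here is a minimal sketch that resolves the graph once at the boundary and passes only the graph object onward. `resolve_graph` and `ManifestGeneratorSketch` are hypothetical names for this illustration; they are not the classes in this PR or in #1499.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Any, Optional


def resolve_graph(graph: Optional[Any] = None, pickle_path: Optional[str] = None) -> Any:
    """Return a graph from exactly one of the two sources (hypothetical helper)."""
    if (graph is None) == (pickle_path is None):
        raise ValueError("Provide exactly one of graph or pickle_path")
    if pickle_path is not None:
        with open(pickle_path, "rb") as handle:
            return pickle.load(handle)
    return graph


class ManifestGeneratorSketch:
    """Hypothetical consumer that only knows about the graph object."""

    def __init__(self, graph_data_model: Any) -> None:
        self.graph_data_model = graph_data_model


# Usage: the pickle is read in one place, and the class keeps a single parameter.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "graph.pickle"
    path.write_bytes(pickle.dumps({"Patient": ["Biospecimen"]}))
    generator = ManifestGeneratorSketch(resolve_graph(pickle_path=str(path)))
```

This keeps the `if data_model_graph_pickle:` branch out of every class constructor.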

@linglp

linglp commented Sep 11, 2024

@afwillia Another point: if you change the parameters of create_manifests, you will most likely need to change test_api.py as well, because test_api.py tests the process of generating a manifest by hitting the manifest/generate endpoint. In this case, it needs to test generating a manifest from a pickle file, but I am not seeing any changes to test_api.py. Can changes be added to test_api.py too?

@thomasyu888

Just taking a note here: Sophia will reach out to discuss the PR and the state of this feature at large prior to merge. Thanks for all the work!

@thomasyu888 thomasyu888 changed the title Fds 1797 input graph api [FDS-1797] input graph api Nov 2, 2024
@thomasyu888 thomasyu888 marked this pull request as draft November 29, 2024 00:11
5 participants