MVP improvements to automated archive runs #357

e-belfer · 2024-06-17T22:22:20Z

Overview

Addresses a subset of #346 and #347, focusing on the lowest-hanging improvements.

What problem does this address?
There are two types of GHA failures we want to keep track of: run failures and validation test failures. Both should be reported in the Slackbot notifications. The archive run now also automatically opens an issue using the archive issue template, noting the date of the run and linking to the URL for the action. Finally, we can now easily configure which datasets to run on a manual workflow dispatch, whether to kick off the large runner for EPACEMS (and eventually maybe some other datasets), and whether to create a GitHub issue following the run.

What did you change in this PR?

Update make_slack_notification_message.py to incorporate validation test failures.
Auto-create a GitHub issue monthly with a link to the run results and the date of the run.
Add options for manual dispatch: run the workflow only on certain datasets, kick off the large runner, and create a GitHub issue.
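
A minimal sketch of what the manual-dispatch options could look like in the workflow trigger. The input names `small_runner`, `large_runner`, and `create_github_issue` come from the diff in this PR; the types, defaults, descriptions, and cron schedule are assumptions for illustration, not the merged configuration.

```yaml
# Sketch only: types, defaults, and the cron expression are assumed.
on:
  schedule:
    - cron: "0 8 1 * *"  # hypothetical: 08:00 UTC on the first of each month
  workflow_dispatch:
    inputs:
      small_runner:
        description: "Comma-separated, double-quoted datasets for the small runner"
        type: string
        default: '"eia176","eia191"'  # shortened for the example
      large_runner:
        description: "Kick off the large runner (EPACEMS)"
        type: boolean
        default: false
      create_github_issue:
        description: "Create a GitHub issue after the run"
        type: boolean
        default: false
```

On scheduled runs none of these inputs are set, so any job condition that reads them has to handle the empty case.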

Out of scope:

Figure out how to add other run failures to the Slack message with enough info from the error to avoid having to look at the run logs (e.g., "archive deleted")
Figure out how to add run summary to Github issue (e.g., these datasets passed, these failed validation, these are differently broken)
Figure out how to auto-populate tasklist with this run summary

Testing

How did you make sure this worked? How can a reviewer verify this?

Action testing one manually provided dataset, GitHub issue creation, and skipping large runners:
https://github.com/catalyst-cooperative/pudl-archiver/actions/runs/9569277213
#360

Action testing large runners, 2 datasets manually provided:
https://github.com/catalyst-cooperative/pudl-archiver/actions/runs/9568539141

Action testing skipping of Github issue creation:
https://github.com/catalyst-cooperative/pudl-archiver/actions/runs/9569300760

To-do list

Review the PR yourself and call out any questions or issues you have
Squash commits

e-belfer · 2024-06-18T16:19:22Z

.github/workflows/run-archiver.yml

-          - nrelatb
-          - phmsagas
-
+        dataset: ${{ fromJSON(format('[{0}]', inputs.small_runner || '"eia176","eia191","eia757a","eia860","eia860m","eia861","eia923","eia930","eiaaeo","eiawater","eia_bulk_elec","epacamd_eia","ferc1","ferc2","ferc6","ferc60","ferc714","mshamines","nrelatb","phmsagas"')) }}


If scheduled, should default to the full list.

jdangerx · 2024-06-18T19:49:45Z

blocking: What do you think of defining one list of "all the damn datasets" in env so we can access that everywhere we need to?

e-belfer · 2024-06-19

It's a bit tricky here because there should in fact be two variables: "small runner datasets" and "large runner datasets". The alternative is one variable with some kind of filtering, but I couldn't figure out how to do that neatly. But I can make a "small" and "large" dataset list.
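
For readers unfamiliar with the expression in the diff: `format('[{0}]', ...)` wraps the comma-separated string in brackets, and `fromJSON` parses the result into the list that the matrix iterates over. A shortened sketch, assuming a job named `archive-run-small` (a hypothetical name) and a trimmed default list:

```yaml
jobs:
  archive-run-small:  # hypothetical job name
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        # inputs.small_runner is empty on scheduled runs, so `||` falls back
        # to the default list; format() adds the brackets that make it JSON.
        dataset: ${{ fromJSON(format('[{0}]', inputs.small_runner || '"eia176","eia191","eia757a"')) }}
    steps:
      - run: echo "Archiving ${{ matrix.dataset }}"
```

The GitHub Actions docs list only `github`, `needs`, `vars`, and `inputs` as contexts available in `jobs.<job_id>.strategy`, so a list defined under `env` likely cannot be read at this point in the workflow.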

e-belfer · 2024-06-18T16:19:51Z

.github/workflows/run-archiver.yml

@@ -78,6 +73,7 @@ jobs:
          path: ${{ matrix.dataset }}_run_summary.json

  archive-run-large:
+    if: inputs.large_runner


If set as true in workflow dispatch, or triggered by scheduled run this should run.

jdangerx · 2024-06-18T19:49:20Z

Hm, if triggered by scheduled run, I would expect inputs.large_runner to be empty and thus archive-run-large to get skipped - am I missing something here?

e-belfer · 2024-06-19

My assumption was that an empty string would actually get evaluated as true, but I could be totally off-base here. I think your suggestion re: incorporating the type of run below is great and I'll incorporate it here.

jdangerx · 2024-06-19

I think an unset variable here would get treated as a "falsey" value: https://docs.github.com/en/actions/learn-github-actions/expressions#literals

> Note that in conditionals, falsy values (false, 0, -0, "", '', null) are coerced to false and truthy (true and other non-falsy values) are coerced to true.

And actually I bet an unset variable is null instead of '', now that I look at those docs.

e-belfer · 2024-06-19

Either way, making this more explicit seems wise.
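
One way to make the scheduled-run case explicit, as discussed above, is to test `github.event_name` directly instead of relying on how an unset input is coerced. A sketch, assuming `large_runner` is a boolean `workflow_dispatch` input; the runner label and the step are placeholders:

```yaml
jobs:
  archive-run-large:
    # Always run on the schedule; on manual dispatch, run only when the
    # large_runner input is true. An unset input is falsy, so the
    # event_name check is what lets scheduled runs through.
    if: ${{ github.event_name == 'schedule' || inputs.large_runner }}
    runs-on: ubuntu-latest  # placeholder; the real job uses a larger runner
    steps:
      - run: echo "Archiving epacems"
```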

e-belfer · 2024-06-18T17:10:09Z

.github/workflows/run-archiver.yml

@@ -160,3 +155,19 @@ jobs:
          payload: ${{ steps.all_summaries.outputs.SLACK_PAYLOAD }}
        env:
          SLACK_BOT_TOKEN: ${{ secrets.PUDL_DEPLOY_SLACK_TOKEN }}
+
+  make-github-issue:
+    if: always() && inputs.create_github_issue != false


See discussion here for use of always() actions/runner#491
Completely unhinged GHA behavior, but what can we do.

jdangerx · 2024-06-18T19:51:29Z

Whee! hey, if it works it works.

Is the idea behind inputs.create_github_issue != false that "" != false and then we will get the github issue in scheduled runs as well as the specified manual runs?

If so, what do you think of using github.event_name to differentiate between workflow_dispatch and scheduled runs? That is more explicitly "do X step if it's scheduled or if there's some specific workflow_dispatch input."

e-belfer · 2024-06-19

I think that's a great idea, can incorporate it here. But yes, that was my original idea.
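
The same pattern could apply to issue creation. A fragment, assuming a boolean `create_github_issue` input and hypothetical upstream job names in `needs`; `always()` keeps the job from being skipped when an upstream job fails or is skipped:

```yaml
jobs:
  make-github-issue:
    needs: [archive-run-small, archive-run-large]  # hypothetical job names
    # always() overrides the default behavior of skipping this job when a
    # needed job failed or was skipped; the second clause then decides
    # whether an issue is wanted for this trigger.
    if: ${{ always() && (github.event_name == 'schedule' || inputs.create_github_issue) }}
    runs-on: ubuntu-latest
    steps:
      - run: echo "Create the archive issue here"
```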

jdangerx

Broadly this looks good! I have a few small questions that I'd love to see addressed before we merge. If you think that the implied changes aren't useful, feel free to say so and re-request review :)

In terms of testing - pretty tricky to do more than just "trigger manual runs and see if things work the way you expect." I think that level of testing is fine, and if the scheduled runs fail we can always handle that manually.

jdangerx · 2024-06-18T19:43:33Z

.github/ISSUE_TEMPLATE/monthly-archive-update.md

 ```

 # Relevant logs
-[Link to logs from GHA run]( PLEASE FIND THE ACTUAL LINK AND FILL IN HERE )
+[Link to logs from GHA run]({{ env.RUN_URL }})


non-blocking: we could maybe strip this whole section if there's that summary of results section above.
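
For context on the `{{ env.RUN_URL }}` placeholder: this is the template syntax used by the JasonEtco/create-an-issue action, which fills `env.*` values from the step environment. Whether this PR uses that action is an assumption; the sketch below shows one way `RUN_URL` could be built from standard `github` context values and passed to the template.

```yaml
jobs:
  make-github-issue:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      issues: write  # needed to open the issue
    steps:
      - uses: actions/checkout@v4  # the template file must be on disk
      - uses: JasonEtco/create-an-issue@v2  # assumed action, not confirmed by the PR
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          # Link back to the workflow run that triggered the issue.
          RUN_URL: ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}
        with:
          filename: .github/ISSUE_TEMPLATE/monthly-archive-update.md
```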

e-belfer · 2024-06-19T14:55:15Z

Tested the large runner and GitHub issue creation here, all looks good. Issue is here. @jdangerx you're right that I haven't come up with a way to test the scheduled run behavior; we could either kick one off or debug this at the start of the month as needed.

jdangerx

🚢 - no need to futz around more in the tangled web of GHA if this works 😄 .

Sure there's some things that "could" be cleaner, but is the time saved down the line worth the upfront investment now? I don't think so at this point.

e-belfer added 3 commits June 17, 2024 17:14

Add validation failures to slackbot

7ca2750

Fix skipping failed tests

13302d9

Add format_message and reduce duplicated code

9425dd5

e-belfer changed the title ~~Add failures to slackbot~~ When GHA jobs fail or validation tests fail, report to Slackbot Jun 17, 2024

Merge branch 'main' into add-failures-to-slackbot

4cf7715

e-belfer self-assigned this Jun 17, 2024

e-belfer added 4 commits June 17, 2024 18:26

Test by running on some busted and non-busted archives

9610964

Fix issue in test run-archiver.yml

bf58f1c

Shrink test and flatten validation test lists

68c6edf

Update issue template, add template creation to workflow

83961e3

e-belfer changed the title ~~When GHA jobs fail or validation tests fail, report to Slackbot~~ MVP improvements to automated archive runs Jun 18, 2024

e-belfer added 2 commits June 18, 2024 10:04

Fix workflow format

60a76c5

Fix link formatting

dd270b4

e-belfer mentioned this pull request Jun 18, 2024

Publish June 18th 2024 archives #358

Closed

e-belfer and others added 12 commits June 18, 2024 10:28

Make slack validation failures more succinct

a2cb688

Attempt to add dataset selection in manual run

f304b4e

Try to fix inputs

ce9158e

[pre-commit.ci] auto fixes from pre-commit.com hooks

7cb879a

For more information, see https://pre-commit.ci

Try to fix matrix strategy

3d31147

Try to fix matrix strategy

53c12d4

Try to fix matrix strategy

6242c47

Fix syntax

8bbdf9c

Test syntax and aditional quotes

6094cb4

Remove epacems from large, try to get filtering to work

a7622b0

Add back large runner

ec4eea9

Remove epacems from default small runner list

ad7464c

e-belfer marked this pull request as ready for review June 18, 2024 16:16

e-belfer mentioned this pull request Jun 18, 2024

Publish June 18th 2024 archives #359

Closed

e-belfer added 10 commits June 18, 2024 12:23

Fix github issue creation

5883670

Deal with foolish boolean formats

71bd627

Appease the GHA formatting nightmare

4bf7d5c

More playing around with github issue creation

a1301a0

Even more tooling with github issue creation

4829753

Just try everything

0d51e24

Try different tack for boolean

0e015c6

Try different tack for boolean

6fdd347

Try false instead of false

1fe8ba0

Handle skips and irrational GHA format requirements

f179acc

e-belfer added the github_actions and automation labels Jun 18, 2024

e-belfer requested a review from jdangerx June 18, 2024 19:31

jdangerx requested changes Jun 18, 2024

e-belfer added 2 commits June 19, 2024 10:23

Make scheduled run workflow more explicit, remove redundant logs in issue template

02d0eb6

Workflow dispatch doesn't like env variables as input

844ecc2

Roll back env vars due to difficult GHA behavior

140473c

e-belfer mentioned this pull request Jun 19, 2024

Publish June 19th 2024 archives #361

Closed

e-belfer requested a review from jdangerx June 19, 2024 14:55

jdangerx approved these changes Jun 19, 2024

e-belfer merged commit e1d78b5 into main Jun 19, 2024
6 of 7 checks passed

e-belfer deleted the add-failures-to-slackbot branch June 19, 2024 15:14

This was linked to issues Jun 19, 2024

Better reporting & notification of archive creation and validation failures #347

Open

Automatically create archive approval checklist #346

Open
