added sample irida_next sample field option #140

Merged: 38 commits, Nov 26, 2024

mattheww95
Collaborator

Added support for the irida_next sample id.

github-actions bot commented Oct 24, 2024

nf-core pipelines lint overall result: Passed ✅ ⚠️

Posted for pipeline commit ae296a3

+| ✅ 229 tests passed       |+
#| ❔  32 tests were ignored |#
!| ❗   4 tests had warnings |!

❗ Test warnings:

  • files_exist - File not found: conf/igenomes_ignored.config
  • nextflow_config - nf-validation has been detected in the pipeline. Please migrate to nf-schema: https://nextflow-io.github.io/nf-schema/latest/migration_guide/
  • readme - README contains the placeholder zenodo.XXXXXXX. This should be replaced with the zenodo doi (after the first release).
  • schema_lint - Schema $id should be https://raw.githubusercontent.com/phac-nml/mikrokondo/master/nextflow_schema.json
    Found https://raw.githubusercontent.com/phac-nml/mikrokondo/main/nextflow_schema.json

❔ Tests ignored:

  • files_exist - File is ignored: CODE_OF_CONDUCT.md
  • files_exist - File is ignored: assets/nf-core-mikrokondo_logo_light.png
  • files_exist - File is ignored: docs/images/nf-core-mikrokondo_logo_light.png
  • files_exist - File is ignored: docs/images/nf-core-mikrokondo_logo_dark.png
  • files_exist - File is ignored: .github/ISSUE_TEMPLATE/config.yml
  • files_exist - File is ignored: .github/workflows/awstest.yml
  • files_exist - File is ignored: .github/workflows/awsfulltest.yml
  • files_exist - File is ignored: docs/output.md
  • files_exist - File is ignored: docs/README.md
  • files_exist - File is ignored: docs/usage.md
  • nextflow_config - Config variable ignored: manifest.name
  • nextflow_config - Config variable ignored: manifest.homePage
  • nextflow_config - Config variable ignored: params.max_cpus
  • files_unchanged - File does not exist: CODE_OF_CONDUCT.md
  • files_unchanged - File ignored due to lint config: LICENSE or LICENSE.md or LICENCE or LICENCE.md
  • files_unchanged - File ignored due to lint config: .github/CONTRIBUTING.md
  • files_unchanged - File ignored due to lint config: .github/ISSUE_TEMPLATE/bug_report.yml
  • files_unchanged - File does not exist: .github/ISSUE_TEMPLATE/config.yml
  • files_unchanged - File ignored due to lint config: .github/ISSUE_TEMPLATE/feature_request.yml
  • files_unchanged - File ignored due to lint config: .github/PULL_REQUEST_TEMPLATE.md
  • files_unchanged - File ignored due to lint config: .github/workflows/branch.yml
  • files_unchanged - File ignored due to lint config: .github/workflows/linting.yml
  • files_unchanged - File ignored due to lint config: assets/email_template.html
  • files_unchanged - File ignored due to lint config: assets/email_template.txt
  • files_unchanged - File ignored due to lint config: assets/sendmail_template.txt
  • files_unchanged - File does not exist: assets/nf-core-mikrokondo_logo_light.png
  • files_unchanged - File does not exist: docs/images/nf-core-mikrokondo_logo_light.png
  • files_unchanged - File does not exist: docs/images/nf-core-mikrokondo_logo_dark.png
  • files_unchanged - File does not exist: docs/README.md
  • files_unchanged - File ignored due to lint config: .gitignore or .prettierignore
  • actions_awstest - 'awstest.yml' workflow not found: /home/runner/work/mikrokondo/mikrokondo/.github/workflows/awstest.yml
  • multiqc_config - multiqc_config

✅ Tests passed:

Run details

  • nf-core/tools version 3.0.2
  • Run at 2024-11-26 16:31:57

@mattheww95
Collaborator Author

If these tests pass, the name .iridanext_output. should be passed as a sample name to verify that it is accepted as valid and that data passes through the pipeline.

@mattheww95 mattheww95 marked this pull request as ready for review October 29, 2024 18:42
@kylacochrane kylacochrane left a comment

Great work Matthew 😸
I don’t have any specific comments - this sample_name solution looks solid to me. I tried adding a helper function to simplify the inx_string_suffix extraction logic in updated_samples within main.nf, but it ended up making things more complicated than expected, haha!

tests/pipelines/main.from_assemblies.nf.test (resolved)
Member

@apetkau apetkau left a comment

This looks great Matthew. Thanks so much for your work on including sample names 😄

I have a few suggestions and comments for you (given in-line below).

assets/schema_input.json (outdated, resolved)
assets/schema_input.json (outdated, resolved)
bin/report_summaries.py (outdated, resolved)
@mattheww95 mattheww95 requested a review from apetkau November 1, 2024 19:27
Contributor

@sgsutcliffe sgsutcliffe left a comment

Following up on my comment: meta.id needs to be replaced with meta.external_id when no sample_name is provided. Otherwise meta.id is null, and the pipeline then tries to group everything together in the COMBINE_DATA() process. I tried using the map{} approach we have used in other pipelines, but it wasn't working. I can give it more of a try.
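
The grouping problem described above can be illustrated in plain Groovy. This is only a sketch under assumptions: the maps below stand in for per-sample meta maps, the sample values are made up, and groupBy is used as a stand-in for whatever grouping COMBINE_DATA() actually performs.

```groovy
// Illustrative only: plain maps standing in for per-sample meta maps.
def rows = [
    [id: null, external_id: 'E1'],
    [id: null, external_id: 'E2'],
    [id: 'sampleA', external_id: 'E3'],
]

// Grouping on a null id puts unrelated samples into the same bucket.
def byId = rows.groupBy { it.id }

// Falling back to external_id when id is missing keeps each sample separate.
def byIdOrExternal = rows.groupBy { it.id ?: it.external_id }
```

With the fallback, the two samples lacking a sample_name no longer share a key.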

What I tried doing was:

    // Track processed IDs
    def processedIDs = [] as Set

    input = Channel.fromSamplesheet("input")
    // and remove non-alphanumeric characters in sample_names (meta.id), whilst also correcting for duplicate sample_names (meta.id)
    .map { meta ->
            if (!meta.id) {
                meta.id = meta.external_id
            } else {
                // Non-alphanumeric characters (excluding _,-,.) will be replaced with "_"
                meta.id = meta.id.replaceAll(/[^A-Za-z0-9_.\-]/, '_')
            }
            // Ensure ID is unique by appending meta.external_id if needed
            while (processedIDs.contains(meta.id)) {
                meta.id = "${meta.id}_${meta.external_id}"
            }
            // Add the ID to the set of processed IDs
            processedIDs << meta.id

            tuple(meta)}.view()

in the input_check subworkflow but it tells me it cannot perform replaceAll because it is an ArrayList type.
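
A possible explanation, offered as an assumption rather than something confirmed in this thread: if each item emitted by Channel.fromSamplesheet is a list (the meta map followed by other columns), a single-parameter closure binds meta to the whole list, and meta.id then evaluates to a list instead of a string. The plain-Groovy sketch below sets Nextflow aside and exercises only the ID logic on ordinary maps. normalizeIds is a hypothetical helper name, not part of the pipeline. The sketch also converts the interpolated ID with toString(), because a GString and a String with the same text are not treated as equal by a Set.

```groovy
// Hypothetical helper (not from the PR): applies the ID rules to plain maps
// so the logic can be checked without running Nextflow.
def normalizeIds(List<Map> metas) {
    def processedIDs = [] as Set
    metas.collect { Map original ->
        def meta = new LinkedHashMap(original)  // copy so the input is not mutated
        if (!meta.id) {
            // No sample name provided: fall back to the external ID
            meta.id = meta.external_id
        } else {
            // Replace characters other than letters, digits, '_', '.', '-' with '_'
            meta.id = meta.id.replaceAll(/[^A-Za-z0-9_.\-]/, '_')
        }
        // Ensure uniqueness by appending external_id while the ID is already taken
        while (processedIDs.contains(meta.id)) {
            meta.id = "${meta.id}_${meta.external_id}".toString()
        }
        processedIDs << meta.id
        meta
    }
}

def out = normalizeIds([
    [id: 'a b', external_id: 'E1'],
    [id: null, external_id: 'E2'],
    [id: 'a_b', external_id: 'E3'],
    [id: '.iridanext_output.', external_id: 'E4'],
])
```

In the subworkflow itself, the closure would presumably need to pick the meta map out of the emitted row before applying this logic; the exact row shape depends on the samplesheet schema.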

@sgsutcliffe
Contributor

One last comment, I promise, and a suggestion: could we use meta.irida_id instead of meta.external_id? That way it will be consistent with the other phac-nml Nextflow pipelines.

Member

@apetkau apetkau left a comment

This looks great. Thanks so much for all your work, @mattheww95. A few inline comments follow.

CHANGELOG.md (outdated, resolved)
CHANGELOG.md (outdated, resolved)
CHANGELOG.md (outdated, resolved)
assets/schema_input.json (outdated, resolved)
assets/schema_input.json (outdated, resolved)
subworkflows/local/input_check.nf (outdated, resolved)
Contributor

@sgsutcliffe sgsutcliffe left a comment

Looks good! Thanks for implementing the changes. I will continue to do some testing (i.e. playing with the pipeline) but for the PR I think it looks good to merge. Thanks for working through this rather tedious PR!

Member

@apetkau apetkau left a comment

Thanks so much Matthew Wells for the amazing work you've done with this. And thanks so much Steven Sutcliffe for all your help reviewing 😄

I have tested this out in IRIDA Next starting from both assemblies. The output files are named properly, they are stored properly with the respective sample records, and metadata is written properly.

This all looks great. Approving this PR. I noted only two small typos/text fixes in-line.

assets/schema_input.json (outdated, resolved)
assets/schema_input.json (outdated, resolved)
@apetkau apetkau merged commit f853ce8 into dev Nov 26, 2024
5 checks passed
@apetkau apetkau deleted the inx_id branch November 26, 2024 18:26

4 participants