Sylph tool Wrapper #1518

Open
wants to merge 23 commits into
base: master
23 commits
027c8be
transfered sylph to galaxytools due to size of test database metadata…
tcollins2011 Oct 8, 2024
d7fc0e7
Update tools/sylph/sylph.xml
tcollins2011 Oct 9, 2024
6146fb0
Update tools/sylph/.shed.yml
tcollins2011 Oct 9, 2024
f1e4551
Update tools/sylph/.shed.yml
tcollins2011 Oct 9, 2024
6b9d9cb
Update tools/sylph/macros.xml
tcollins2011 Oct 9, 2024
55d54f1
Update tools/sylph/macros.xml
tcollins2011 Oct 9, 2024
3519a0b
Update tools/sylph/.shed.yml
tcollins2011 Oct 9, 2024
8944a8d
replaced all double quotes in the command section with single quotes …
tcollins2011 Oct 9, 2024
63ca10f
Update tools/sylph/sylph.xml
tcollins2011 Oct 9, 2024
46e5d68
updated database sylmlink to better reflect the name
tcollins2011 Oct 9, 2024
477e325
changed ouput name
tcollins2011 Oct 9, 2024
25d9bac
changed database path names and sample file
tcollins2011 Oct 10, 2024
1513faf
python linting and tab spacing
tcollins2011 Oct 10, 2024
691a915
Merge branch 'master' into sylph
tcollins2011 Oct 10, 2024
9221bb8
fixing flake8 linting problems
tcollins2011 Oct 10, 2024
d86bcf0
force adding the extra test files and fixing a spacing issue in python
tcollins2011 Oct 10, 2024
3c35f6f
actually remembring to add the correct whitespace file to my commit
tcollins2011 Oct 10, 2024
5566c61
changed profile and query to be different tools and upated the macros…
tcollins2011 Oct 25, 2024
21de493
some of the comments
Dec 14, 2024
3fae5b1
lint fix
Dec 15, 2024
da2821f
Merge branch 'bgruening:master' into sylph
tcollins2011 Dec 16, 2024
2dfa011
fixed database tabs
tcollins2011 Dec 16, 2024
bdc5e02
added history database test
tcollins2011 Dec 16, 2024
13 changes: 13 additions & 0 deletions tools/sylph/.shed.yml
@@ -0,0 +1,13 @@
name: sylph
owner: bgruening
description: sylph - fast and precise species-level metagenomic profiling with ANIs
long_description: sylph is a program that performs ultrafast (1) ANI querying or (2) metagenomic profiling for metagenomic shotgun samples.
homepage_url: https://github.com/bluenote-1577/sylph
remote_repository_url: https://github.com/bgruening/galaxytools/tree/master/tools/sylph
categories:
- Metagenomics
type: unrestricted
auto_tool_repositories:
name_template: "{{ tool_id }}"
description_template: "{{ tool_name }} from the sylph suite"

10 changes: 10 additions & 0 deletions tools/sylph/README.md
@@ -0,0 +1,10 @@
For Galaxy admins and local runs:

The databases for sylph have associated metadata files. Each metadata file MUST be paired with the database it belongs to for the tool to produce correct output. Databases and metadata files can be downloaded from the following locations:
For databases: https://github.com/bluenote-1577/sylph/wiki/Pre%E2%80%90built-databases
For metadata: https://github.com/bluenote-1577/sylph-utils

The tool assumes that the directory referenced by the data table has the following layout:
<name_of_organism>
- database.syldb
- metadata.tsv.gz
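
The layout above can be checked before a directory is registered in the data table. The following is a minimal sketch, not part of the tool: the two file names come from the layout above, while the helper name `missing_sylph_files` and its return value are assumptions made for illustration.

```python
from pathlib import Path

# File names taken from the layout described above.
REQUIRED_FILES = ("database.syldb", "metadata.tsv.gz")


def missing_sylph_files(db_dir):
    """Return the required file names that are absent from db_dir.

    Hypothetical helper: an empty list means the directory matches
    the layout the tool expects.
    """
    root = Path(db_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

An admin could call `missing_sylph_files('/path/to/<name_of_organism>')` and add the directory to the data table only when the result is empty.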
147 changes: 147 additions & 0 deletions tools/sylph/macros.xml
@@ -0,0 +1,147 @@
<macros>
<token name="@TOOL_VERSION@">0.6.1</token>
<token name="@VERSION_SUFFIX@">0</token>
<token name="@LICENSE@">MIT</token>
<token name="@DB_SELECTOR@"><![CDATA[
#if $database_select.select == 'cached':
ln -s '$database_select.sylph_database.fields.path/database.syldb' 'database.syldb' &&
#else:
ln -s '$database_select.sylph_database' 'database.syldb' &&
#end if
]]></token>
<token name="@SINGLE_INPUT@"><![CDATA[
#if $sketch.input.ext == 'fastqsanger'
#set $ext = 'fastq'
#else if $sketch.input.ext == 'fastqsanger.gz':
#set $ext = 'fastq.gz'
#else:
#set $ext = str($sketch.input.ext)
#end if
#if $sketch.input.element_identifier.endswith('.fastq') or $sketch.input.element_identifier.endswith('.fastq.gz'):
#set $input = re.sub(r'\s+', '_', $sketch.input.element_identifier)
#else:
#set $input = re.sub(r'\s+', '_', $sketch.input.element_identifier + '.' + str($ext))
#end if
ln -s '$sketch.input' '$input' &&
]]></token>
<token name="@SINGLE_GROUP@"><![CDATA[
#set input = ''
#for $number, $current_file in enumerate($sketch.input):
#if $current_file.ext == 'fastqsanger'
#set $ext = 'fastq'
#else if $current_file.ext == 'fastqsanger.gz':
#set $ext = 'fastq.gz'
#else:
#set $ext = str($current_file.ext)
#end if
#if $current_file.element_identifier.endswith('.fastq') or $current_file.element_identifier.endswith('.fastq.gz'):
#set $current_input = re.sub(r'\s+', '_', $current_file.element_identifier)
#else:
#set $current_input = re.sub(r'\s+', '_', $current_file.element_identifier + '.' + str($ext))
#end if
ln -s '${current_file}' '$current_input' &&
#set input = str($input) + ' ' + str($current_input)
#end for
]]></token>
<token name="@PAIRED@"><![CDATA[
#if $sketch.input_1.ext == 'fastqsanger'
#set $ext_1 = 'fastq'
#else if $sketch.input_1.ext == 'fastqsanger.gz':
#set $ext_1 = 'fastq.gz'
#else:
#set $ext_1 = str($sketch.input_1.ext)
#end if

#if $sketch.input_2.ext == 'fastqsanger'
#set $ext_2 = 'fastq'
#else if $sketch.input_2.ext == 'fastqsanger.gz':
#set $ext_2 = 'fastq.gz'
#else:
#set $ext_2 = str($sketch.input_2.ext)
#end if

#if $sketch.input_1.element_identifier.endswith('.fastq') or $sketch.input_1.element_identifier.endswith('.fastq.gz'):
#set $read1 = re.sub(r'\s+', '_', $sketch.input_1.element_identifier)
#else:
#set $read1 = re.sub(r'\s+', '_', str($sketch.input_1.element_identifier) + '.' + str($ext_1))
#end if
#if $sketch.input_2.element_identifier.endswith('.fastq') or $sketch.input_2.element_identifier.endswith('.fastq.gz'):
#set $read2 = re.sub(r'\s+', '_', $sketch.input_2.element_identifier)
#else:
#set $read2 = re.sub(r'\s+', '_', str($sketch.input_2.element_identifier) + '.' + str($ext_2))
#end if
ln -s '$sketch.input_1' '$read1' &&
ln -s '$sketch.input_2' '$read2' &&
]]></token>
<token name="@PAIRED_GROUP@"><![CDATA[
#if $sketch.input.forward.ext == 'fastqsanger'
#set $ext_1 = 'fastq'
#else if $sketch.input.forward.ext == 'fastqsanger.gz':
#set $ext_1 = 'fastq.gz'
#else:
#set $ext_1 = str($sketch.input.forward.ext)
#end if

#if $sketch.input.reverse.ext == 'fastqsanger'
#set $ext_2 = 'fastq'
#else if $sketch.input.reverse.ext == 'fastqsanger.gz':
#set $ext_2 = 'fastq.gz'
#else:
#set $ext_2 = str($sketch.input.reverse.ext)
#end if

#set $read1 = re.sub(r'\s+', '_', str($sketch.input.element_identifier) + '.' + str($ext_1))
#set $read2 = re.sub(r'\s+', '_', str($sketch.input.element_identifier) + '_r2.' + str($ext_2))
ln -s '$sketch.input.forward' '$read1' &&
ln -s '$sketch.input.reverse' '$read2' &&
]]></token>
<xml name="requirements">
<requirements>
<requirement type="package" version="@TOOL_VERSION@">sylph</requirement>
<requirement type="package" version="3.11">python</requirement>
<requirement type="package" version="2.2">pandas</requirement>
</requirements>
</xml>
<xml name="citation">
<citations>
<citation type="doi">10.1038/s41587-024-02412-y</citation>
</citations>
</xml>
<xml name="creator">
<creator>
<person givenName="Tyler" familyName="Collins"/>
<person givenName="Alexander" familyName="Ostrovsky"/>
</creator>
</xml>
<xml name="xrefs">
<xrefs>
<xref type="bio.tools">sylph</xref>
</xrefs>
</xml>
<xml name="input_database">
<conditional name="database_select">
<param name="select" type="select" label="Choose the source for databases and metadata">
<option value="cached">Cached data</option>
<option value="history">History</option>
</param>
<when value="cached">
<param label="Select a sylph database" name="sylph_database" type="select">
<options from_data_table="sylph_databases">
<!-- <filter type="sort_by" column="3"/> -->
<validator message="No Sylph databases are available" type="no_options" />
Owner

Add a filter here for version 1 or something like that, and then include version 1 in the test file.

Whenever the tool changes the DB layout, we increase this version.

Owner

ping, you are not filtering the DB here according to the version

</options>
</param>
</when>
<when value="history">
<param label="Select a history dataset" name="sylph_database" type="data" format="binary" />
<param label="Metadata file for metaphlan and krona outputs" name="metadata" type="data" format="tabular.gz" optional="true" help="The metadata file MUST be directly associated with the input database. For more information, view the help text of the tool."/>
</when>
</conditional>
</xml>
<xml name="output_format">
<param label="Additional output formats" name="outputs" type="select" display="checkboxes" multiple="true" help="In addition to Sylph's tabular output, you may output a file converted to these formats">
<option value="metaphlan">Sylph's MetaPhlAn-like output</option>
<option value="krona">Krona compatible</option>
</param>
</xml>
</macros>
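
For reviewers who want to check the symlink naming rule without running Galaxy, the logic shared by the `@SINGLE_INPUT@`, `@SINGLE_GROUP@` and `@PAIRED@` tokens can be restated in plain Python. This is only a sketch of the rule as it reads in the macros above; the function name `link_name` and the `EXT_MAP` constant are hypothetical and do not appear in the pull request.

```python
import re

# Galaxy datatype names mapped to the extensions used in the link names,
# as in the #if / #else if branches of the macros above.
EXT_MAP = {"fastqsanger": "fastq", "fastqsanger.gz": "fastq.gz"}


def link_name(element_identifier, galaxy_ext):
    """Hypothetical helper mirroring the naming rule in @SINGLE_INPUT@."""
    ext = EXT_MAP.get(galaxy_ext, galaxy_ext)
    if element_identifier.endswith((".fastq", ".fastq.gz")):
        # The identifier already carries a usable extension.
        name = element_identifier
    else:
        name = element_identifier + "." + ext
    # Each run of whitespace becomes a single underscore.
    return re.sub(r"\s+", "_", name)
```

For example, `link_name('sample 1', 'fastqsanger')` gives `sample_1.fastq`, while an identifier such as `reads.fq` is not recognised as already having an extension and becomes `reads.fq.fastq`.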