Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rebuild Augustus recipe #51641

Merged
merged 6 commits into from
Oct 24, 2024
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
47 changes: 25 additions & 22 deletions recipes/augustus/build.sh
Original file line number Diff line number Diff line change
Expand Up @@ -2,34 +2,38 @@

set -x -e

#export CXXFLAGS="${CXXFLAGS} -std=c++11 -stdlib=libstdc++ -stdlib=libc++ -DUSE_BOOST"
export CXXFLAGS="${CXXFLAGS} -std=c++11 -DUSE_BOOST -I${PREFIX}/include"
export INCLUDES="-I${PREFIX}/include"
export LIBPATH="-L${PREFIX}/lib"
export LDFLAGS="${LDFLAGS} -L${PREFIX}/lib"
export CXXFLAGS="${CXXFLAGS} -O3 -std=c++14 -DUSE_BOOST -I${PREFIX}/include"

mkdir -p ${PREFIX}/bin
mkdir -p ${PREFIX}/scripts
mkdir -p ${PREFIX}/config

## Make the software

if [ "$(uname)" = Darwin ] ; then
# SQLITE disabled due to compile issue, see: https://svn.boost.org/trac10/ticket/13501
sqlite=
if [[ "$(uname)" = Darwin ]]; then
# SQLITE disabled due to compile issue, see: https://svn.boost.org/trac10/ticket/13501
sqlite=
else
sqlite='SQLITE=true'
sqlite='SQLITE=true'
fi
make \
CC="${CC}" \
CXX="${CXX}" \
INCLUDE_PATH_BAMTOOLS="-I${PREFIX}/include/bamtools" \
LIBRARY_PATH_BAMTOOLS="-L${PREFIX}/lib" \
INCLUDE_PATH_LPSOLVE="-I${PREFIX}/include/lpsolve" \
LIBRARY_PATH_LPSOLVE="-L${PREFIX}/lib" \
INCLUDE_PATH_HTSLIB="-I${PREFIX}/include/htslib" \
LIBRARY_PATH_HTSLIB="-L${PREFIX}/lib" \
COMPGENPRED=true \
MYSQL=false \
$sqlite

make \
CC="${CC}" \
CXX="${CXX}" \
INCLUDE_PATH_BAMTOOLS="-I${PREFIX}/include/bamtools" \
LIBRARY_PATH_BAMTOOLS="-L${PREFIX}/lib" \
INCLUDE_PATH_LPSOLVE="-I${PREFIX}/include/lpsolve" \
LIBRARY_PATH_LPSOLVE="-L${PREFIX}/lib" \
INCLUDE_PATH_HTSLIB="-I${PREFIX}/include/htslib" \
LIBRARY_PATH_HTSLIB="-L${PREFIX}/lib" \
COMPGENPRED=true \
MYSQL=false \
"${sqlite}" -j"${CPU_COUNT}"
Comment on lines +22 to +33
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider limiting parallel build jobs.

While using ${CPU_COUNT} for parallel builds is good for performance, it might cause issues on machines with many cores due to memory constraints.

-	"${sqlite}" -j"${CPU_COUNT}"
+	"${sqlite}" -j"$(( CPU_COUNT < 4 ? CPU_COUNT : 4 ))"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
make \
CC="${CC}" \
CXX="${CXX}" \
INCLUDE_PATH_BAMTOOLS="-I${PREFIX}/include/bamtools" \
LIBRARY_PATH_BAMTOOLS="-L${PREFIX}/lib" \
INCLUDE_PATH_LPSOLVE="-I${PREFIX}/include/lpsolve" \
LIBRARY_PATH_LPSOLVE="-L${PREFIX}/lib" \
INCLUDE_PATH_HTSLIB="-I${PREFIX}/include/htslib" \
LIBRARY_PATH_HTSLIB="-L${PREFIX}/lib" \
COMPGENPRED=true \
MYSQL=false \
"${sqlite}" -j"${CPU_COUNT}"
make \
CC="${CC}" \
CXX="${CXX}" \
INCLUDE_PATH_BAMTOOLS="-I${PREFIX}/include/bamtools" \
LIBRARY_PATH_BAMTOOLS="-L${PREFIX}/lib" \
INCLUDE_PATH_LPSOLVE="-I${PREFIX}/include/lpsolve" \
LIBRARY_PATH_LPSOLVE="-L${PREFIX}/lib" \
INCLUDE_PATH_HTSLIB="-I${PREFIX}/include/htslib" \
LIBRARY_PATH_HTSLIB="-L${PREFIX}/lib" \
COMPGENPRED=true \
MYSQL=false \
"${sqlite}" -j"$(( CPU_COUNT < 4 ? CPU_COUNT : 4 ))"


# Fix perl shebang
sed -i.bak '1 s|^.*$|#!/usr/bin/env perl|g' ${SRC_DIR}/scripts/*.pl
rm -rf ${SRC_DIR}/scripts/*.bak
Comment on lines +35 to +37
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add error checking for Perl script modifications.

While the shebang fix is good, the operation could fail silently if no .pl files exist.

Consider adding error checking:

-sed -i.bak '1 s|^.*$|#!/usr/bin/env perl|g' ${SRC_DIR}/scripts/*.pl
-rm -rf ${SRC_DIR}/scripts/*.bak
+# Check if Perl scripts exist
+if ls ${SRC_DIR}/scripts/*.pl >/dev/null 2>&1; then
+    sed -i.bak '1 s|^.*$|#!/usr/bin/env perl|g' ${SRC_DIR}/scripts/*.pl
+    rm -rf ${SRC_DIR}/scripts/*.bak
+else
+    echo "Warning: No Perl scripts found in ${SRC_DIR}/scripts/"
+fi
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# Fix perl shebang
sed -i.bak '1 s|^.*$|#!/usr/bin/env perl|g' ${SRC_DIR}/scripts/*.pl
rm -rf ${SRC_DIR}/scripts/*.bak
# Fix perl shebang
# Check if Perl scripts exist
if ls ${SRC_DIR}/scripts/*.pl >/dev/null 2>&1; then
sed -i.bak '1 s|^.*$|#!/usr/bin/env perl|g' ${SRC_DIR}/scripts/*.pl
rm -rf ${SRC_DIR}/scripts/*.bak
else
echo "Warning: No Perl scripts found in ${SRC_DIR}/scripts/"
fi


## Build Perl

Expand All @@ -39,7 +43,7 @@ cd perl-build
# affects tests for Augustus 3.3:
# https://github.com/Gaius-Augustus/Augustus/commit/7ca3ab
#sed -i'' -e '1s/perl -w/perl/' *.pl
cp ${RECIPE_DIR}/Build.PL ./
cp -f ${RECIPE_DIR}/Build.PL ./
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Add error checking for Build.PL copy operation.

The force flag (-f) could mask issues if the source file doesn't exist.

-cp -f ${RECIPE_DIR}/Build.PL ./
+if [ -f "${RECIPE_DIR}/Build.PL" ]; then
+    cp -f "${RECIPE_DIR}/Build.PL" ./
+else
+    echo "Error: ${RECIPE_DIR}/Build.PL not found"
+    exit 1
+fi
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
cp -f ${RECIPE_DIR}/Build.PL ./
if [ -f "${RECIPE_DIR}/Build.PL" ]; then
cp -f "${RECIPE_DIR}/Build.PL" ./
else
echo "Error: ${RECIPE_DIR}/Build.PL not found"
exit 1
fi

perl ./Build.PL
perl ./Build manifest
perl ./Build install --installdirs site
Expand All @@ -49,7 +53,6 @@ cd ..

## End build perl

mv bin/* $PREFIX/bin/
mv scripts/* $PREFIX/bin/
mv config/* $PREFIX/config/

Expand Down
19 changes: 12 additions & 7 deletions recipes/augustus/meta.yaml
Original file line number Diff line number Diff line change
@@ -1,28 +1,31 @@
{% set name = "augustus" %}
{% set version = "3.5.0" %}
{% set sha256 = "5ed6ce6106303b800c5e91d37a250baff43b20824657b853ae04d11ad8bdd686" %}

package:
name: augustus
name: {{ name }}
version: {{ version }}

source:
url: https://github.com/Gaius-Augustus/Augustus/archive/v{{ version }}.tar.gz
sha256: {{ sha256 }}
patches:
- patches/0001-Makefile.patch
- patches/homGeneMapping.src.makefile.patch
- patches/utrrnaseq.patch

build:
number: 4
number: 5
run_exports:
- {{ pin_subpackage('augustus', max_pin="x") }}

requirements:
build:
- make
- {{ compiler('c') }}
- {{ compiler('cxx') }}
host:
- zlib
- samtools ==0.1.19
- samtools
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

Samtools version constraint removal requires attention

The codebase history shows that Augustus previously required samtools version 0.1.19 specifically. The build logs indicate direct integration with samtools through include paths and linking. Removing this strict version requirement could be problematic since:

  • Augustus has direct dependencies on samtools header files and libraries
  • The package previously used a very old version (0.1.19) of samtools
  • Newer versions of samtools may have incompatible API changes

Please:

  • Restore the version constraint to samtools ==0.1.19
  • Or verify compatibility with newer samtools versions by testing the build process
🔗 Analysis chain

Verify samtools compatibility after removing version constraint.

The version constraint for samtools has been removed. While this provides more flexibility, please ensure that the package is compatible with recent samtools versions to prevent potential runtime issues.


Let me gather more information about the augustus package and its relationship with samtools.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if there are any known compatibility issues with samtools in the codebase
rg -i "samtools" . --type cpp --type c --type perl -C 2

Length of output: 438


Script:

#!/bin/bash
# Check augustus source code and build files for samtools usage
rg -i "samtools" ./recipes/augustus --type yaml --type txt -C 3

# Check test files for samtools usage patterns
fd "test" ./recipes/augustus -t f -x rg -i "samtools" {} -C 2 \;

# Check if there are any version constraints in previous versions
git log -p ./recipes/augustus/meta.yaml | rg -i "samtools"

Length of output: 1906

- bamtools
- htslib
- gsl
Expand All @@ -35,7 +38,6 @@ requirements:
- perl
- perl-module-build
run:
- zlib
- boost-cpp
- gsl
- lp_solve
Expand Down Expand Up @@ -152,19 +154,22 @@ test:
- fix_in_frame_stop_codon_genes.py --help | grep -Fq 'fix_in_frame_stop_codon_genes.py'

about:
home: http://bioinf.uni-greifswald.de/augustus/
license: Artistic Licence
home: "https://bioinf.uni-greifswald.de/augustus"
license: "Artistic Licence"
license_family: Other
license_file: "src/LICENSE.TXT"
summary: 'AUGUSTUS is a gene prediction program for eukaryotes written by Mario
Stanke and Oliver Keller. It can be used as an ab initio program, which means
it bases its prediction purely on the sequence. AUGUSTUS may also incorporate
hints on the gene structure coming from extrinsic sources such as EST, MS/MS,
protein alignments and synthenic genomic alignments.'
dev_url: "https://github.com/Gaius-Augustus/Augustus"
doc_url: "https://github.com/Gaius-Augustus/Augustus/blob/v{{ version }}/docs/RUNNING-AUGUSTUS.md"

extra:
identifiers:
- biotools:augustus
- doi:10.1093/bioinformatics/btr010
- usegalaxy-eu:augustus
- usegalaxy-eu:augustus_training
notes: "Builds with sqlite support are currently only available on Linux due to compile issues with macOS."
39 changes: 39 additions & 0 deletions recipes/augustus/patches/0001-Makefile.patch
Original file line number Diff line number Diff line change
@@ -0,0 +1,39 @@
diff --git a/Makefile b/Makefile
index a26b40c..b061356 100644
--- a/Makefile
+++ b/Makefile
@@ -26,20 +26,26 @@ clean:
cd .. && ./pyclean.sh; \
fi

-PREFIX = /usr/local
+# DESTDIR is usually the empty string but can be set for staging
+prefix ?= $(PREFIX)
+bindir = $(DESTDIR)$(prefix)/bin
INSTALLDIR = /opt/augustus-$(AUGVERSION)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider making INSTALLDIR configurable.

The hardcoded /opt path in INSTALLDIR might cause issues in containerized or managed environments. Consider making it respect the prefix:

-INSTALLDIR = /opt/augustus-$(AUGVERSION)
+INSTALLDIR = $(prefix)/share/augustus-$(AUGVERSION)

This would better align with standard Unix filesystem hierarchy and package management practices.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
INSTALLDIR = /opt/augustus-$(AUGVERSION)
INSTALLDIR = $(prefix)/share/augustus-$(AUGVERSION)

install:
- if [ ! $(PWD) -ef $(INSTALLDIR) ] ; then \
+ @echo Installing augustus executables into $(bindir)
+# two main ways:
+# 1) make install is executed from anywhere but in INSTALLDIR. Then INSTALLDIR is created if it does not exist and config data is copied.
+# 2) make install is executed in INSTALLDIR. This is done by the singularity file.
+ if [ ! $(CURDIR) -ef $(INSTALLDIR) ] ; then \
install -d $(INSTALLDIR) && \
cp -a config bin scripts $(INSTALLDIR) ; \
fi
- ln -sf $(INSTALLDIR)/bin/augustus $(PREFIX)/bin/augustus
- ln -sf $(INSTALLDIR)/bin/etraining $(PREFIX)/bin/etraining
- ln -sf $(INSTALLDIR)/bin/prepareAlign $(PREFIX)/bin/prepareAlign
- ln -sf $(INSTALLDIR)/bin/fastBlockSearch $(PREFIX)/bin/fastBlockSearch
- if [ -f $(INSTALLDIR)/bin/load2db ] ; then ln -sf $(INSTALLDIR)/bin/load2db $(PREFIX)/bin/load2db ; fi
- if [ -f $(INSTALLDIR)/bin/getSeq ] ; then ln -sf $(INSTALLDIR)/bin/getSeq $(PREFIX)/bin/getSeq ; fi
+ ln -sf $(INSTALLDIR)/bin/augustus $(bindir)/augustus
+ ln -sf $(INSTALLDIR)/bin/etraining $(bindir)/etraining
+ ln -sf $(INSTALLDIR)/bin/prepareAlign $(bindir)/prepareAlign
+ ln -sf $(INSTALLDIR)/bin/fastBlockSearch $(bindir)/fastBlockSearch
+ if [ -f $(INSTALLDIR)/bin/load2db ] ; then ln -sf $(INSTALLDIR)/bin/load2db $(bindir)/load2db ; fi
+ if [ -f $(INSTALLDIR)/bin/getSeq ] ; then ln -sf $(INSTALLDIR)/bin/getSeq $(bindir)/getSeq ; fi
Comment on lines +26 to +37
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider using install command for symlinks.

While the symlink creation works, using the install command would be more idiomatic and provide better error handling:

-	ln -sf $(INSTALLDIR)/bin/augustus $(bindir)/augustus
-	ln -sf $(INSTALLDIR)/bin/etraining $(bindir)/etraining
-	ln -sf $(INSTALLDIR)/bin/prepareAlign $(bindir)/prepareAlign
-	ln -sf $(INSTALLDIR)/bin/fastBlockSearch $(bindir)/fastBlockSearch
+	install -d $(bindir)
+	install -m 755 -s $(INSTALLDIR)/bin/augustus $(bindir)/augustus
+	install -m 755 -s $(INSTALLDIR)/bin/etraining $(bindir)/etraining
+	install -m 755 -s $(INSTALLDIR)/bin/prepareAlign $(bindir)/prepareAlign
+	install -m 755 -s $(INSTALLDIR)/bin/fastBlockSearch $(bindir)/fastBlockSearch

For the conditional installations:

-	if [ -f $(INSTALLDIR)/bin/load2db ] ; then ln -sf $(INSTALLDIR)/bin/load2db $(bindir)/load2db ; fi
-	if [ -f $(INSTALLDIR)/bin/getSeq ] ; then ln -sf $(INSTALLDIR)/bin/getSeq $(bindir)/getSeq ; fi
+	if [ -f $(INSTALLDIR)/bin/load2db ] ; then install -m 755 -s $(INSTALLDIR)/bin/load2db $(bindir)/load2db ; fi
+	if [ -f $(INSTALLDIR)/bin/getSeq ] ; then install -m 755 -s $(INSTALLDIR)/bin/getSeq $(bindir)/getSeq ; fi

This approach:

  • Ensures the target directory exists
  • Sets proper permissions
  • Strips debugging symbols (-s flag)
  • Provides better error handling
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- ln -sf $(INSTALLDIR)/bin/augustus $(PREFIX)/bin/augustus
- ln -sf $(INSTALLDIR)/bin/etraining $(PREFIX)/bin/etraining
- ln -sf $(INSTALLDIR)/bin/prepareAlign $(PREFIX)/bin/prepareAlign
- ln -sf $(INSTALLDIR)/bin/fastBlockSearch $(PREFIX)/bin/fastBlockSearch
- if [ -f $(INSTALLDIR)/bin/load2db ] ; then ln -sf $(INSTALLDIR)/bin/load2db $(PREFIX)/bin/load2db ; fi
- if [ -f $(INSTALLDIR)/bin/getSeq ] ; then ln -sf $(INSTALLDIR)/bin/getSeq $(PREFIX)/bin/getSeq ; fi
+ ln -sf $(INSTALLDIR)/bin/augustus $(bindir)/augustus
+ ln -sf $(INSTALLDIR)/bin/etraining $(bindir)/etraining
+ ln -sf $(INSTALLDIR)/bin/prepareAlign $(bindir)/prepareAlign
+ ln -sf $(INSTALLDIR)/bin/fastBlockSearch $(bindir)/fastBlockSearch
+ if [ -f $(INSTALLDIR)/bin/load2db ] ; then ln -sf $(INSTALLDIR)/bin/load2db $(bindir)/load2db ; fi
+ if [ -f $(INSTALLDIR)/bin/getSeq ] ; then ln -sf $(INSTALLDIR)/bin/getSeq $(bindir)/getSeq ; fi
install -d $(bindir)
install -m 755 -s $(INSTALLDIR)/bin/augustus $(bindir)/augustus
install -m 755 -s $(INSTALLDIR)/bin/etraining $(bindir)/etraining
install -m 755 -s $(INSTALLDIR)/bin/prepareAlign $(bindir)/prepareAlign
install -m 755 -s $(INSTALLDIR)/bin/fastBlockSearch $(bindir)/fastBlockSearch
if [ -f $(INSTALLDIR)/bin/load2db ] ; then install -m 755 -s $(INSTALLDIR)/bin/load2db $(bindir)/load2db ; fi
if [ -f $(INSTALLDIR)/bin/getSeq ] ; then install -m 755 -s $(INSTALLDIR)/bin/getSeq $(bindir)/getSeq ; fi


# for internal purposes:
release:
Loading