Technical overview: how Viash works in practice

Viash is a command-line tool that transforms scripts into self-contained, reusable and reproducible components.

Keywords

Next-Generation Sequencing Data Analyses, Easy Pipeline running, Easy nextflow pipelines

Core functionality

Viash operates by combining two key elements:

  1. Your source script (written in Python, R, Bash, C#, or other languages)
  2. A YAML configuration file that specifies:
  • Component metadata
  • Input/output arguments
  • Software dependencies

[Figure: schema of a Viash component]

src/minimal_component/script.py

    import shutil

    # `par` is not defined in this file: Viash injects it with the argument values.
    # copy file
    print(
      f"Copying {par['input']} to {par['output']}."
    )
    shutil.copyfile(par['input'], par['output'])

src/minimal_component/config.vsh.yaml

    name: minimal_component
    description: A minimal example component.
    arguments:
      - type: file
        name: --input
      - type: file
        name: --output
        direction: output
    resources:
      - type: python_script
        path: script.py
    engines:
      - type: docker
        image: "python:3.10-slim"
      - type: native
    runners:
      - type: executable
      - type: nextflow
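
While developing, it can be convenient to run script.py directly, before any build step. The sketch below assumes the placeholder-block convention described in the Viash documentation, where the code between the `## VIASH START` and `## VIASH END` markers is replaced with the real argument values in the built component. The temporary directory and the file names are hypothetical and exist only so that the example runs on its own.

```python
import shutil
import tempfile
from pathlib import Path

## VIASH START
# Placeholder values for local development only (hypothetical paths).
# Assumption: Viash swaps this block for the real argument values when building.
workdir = Path(tempfile.mkdtemp())
par = {
    "input": str(workdir / "input.txt"),
    "output": str(workdir / "output.txt"),
}
Path(par["input"]).write_text("hello\n")
## VIASH END

# copy file
print(f"Copying {par['input']} to {par['output']}.")
shutil.copyfile(par["input"], par["output"])
```

Run this way, the script behaves like any plain Python file, which keeps the edit-and-run loop short.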
      

Automated component generation

When you run Viash to build your component based on the script and configuration file, it automatically:

  • Generates a standalone, self-contained executable that encapsulates your script and its required software dependencies
  • Creates the necessary containerization files (e.g. a Dockerfile) based on the specified dependencies
  • Produces Nextflow executables without requiring you to write any VDSL code by hand
  • Handles parameter validation and documentation generation for your component
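
To make the parameter-handling step more concrete, here is a conceptual sketch of how argument specifications could be turned into a command-line parser that validates input and produces a `par` dictionary. It is not Viash's generated code: the `ARGUMENTS` list, the `build_parser` helper, and the rule that every argument is required are all assumptions made for illustration.

```python
import argparse

# Hypothetical in-memory copy of the `arguments` section of a config file.
ARGUMENTS = [
    {"type": "file", "name": "--input"},
    {"type": "file", "name": "--output", "direction": "output"},
]


def build_parser(component_name, arguments):
    """Build a command-line parser from a list of argument specifications."""
    parser = argparse.ArgumentParser(prog=component_name)
    for arg in arguments:
        direction = arg.get("direction", "input")
        parser.add_argument(
            arg["name"],
            required=True,  # assumption for this sketch: every argument is mandatory
            help=f"{arg['type']} ({direction})",  # doubles as generated documentation
        )
    return parser


parser = build_parser("minimal_component", ARGUMENTS)
par = vars(parser.parse_args(["--input", "in.txt", "--output", "out.txt"]))
print(par)  # {'input': 'in.txt', 'output': 'out.txt'}
```

Calling `parser.parse_args(["--input", "in.txt"])` instead would exit with a usage error, which is the kind of validation a script author would otherwise have to write by hand.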

When Viash processes your script and configuration, it generates all the boilerplate code through a deterministic, rule-based system; no LLM or other AI is involved. This means:

  • The generation process is consistent and predictable
  • The same input always produces the same output
  • Generated code follows established patterns and best practices
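
The idea of rule-based, deterministic generation can be illustrated with a small sketch that renders Dockerfile text from an engine specification. This is only an illustration of the principle, not the Dockerfile Viash actually produces; the `render_dockerfile` helper and the `pip_packages` key are hypothetical.

```python
def render_dockerfile(engine):
    """Render Dockerfile text from an engine specification using fixed rules."""
    lines = [f"FROM {engine['image']}"]
    # Sort package names so the result does not depend on the input ordering.
    packages = sorted(engine.get("pip_packages", []))
    if packages:
        lines.append("RUN pip install --no-cache-dir " + " ".join(packages))
    return "\n".join(lines) + "\n"


# Hypothetical engine specification; `pip_packages` is not a real Viash key.
engine = {
    "type": "docker",
    "image": "python:3.10-slim",
    "pip_packages": ["pandas", "numpy"],
}

first = render_dockerfile(engine)
second = render_dockerfile(engine)
print(first, end="")
print("Identical output:", first == second)
```

Because the function is a pure mapping from specification to text, rendering the same specification twice yields byte-identical results.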

The generated components can run locally, in containers (e.g. Docker, Podman or Singularity), or as a standalone Nextflow executable. Support for other workflow systems (e.g. Snakemake or Argo Workflows) is planned.

Testing and debugging

Viash provides integrated testing and debugging features that work within your component’s defined environment. Unit tests can be placed alongside your script and configuration, so that Viash can detect them and run them in the correct environment.

In addition, built-in debugging commands allow you to troubleshoot issues in the same environment where your component will actually run, ensuring consistency between development and deployment.

src/minimal_component/test.sh

    #!/usr/bin/env bash
    # stop at the first failing command and echo each command as it runs
    set -ex

    # create an empty input file and run the component on it
    touch test_file.txt
    ./example_python \
      --input test_file.txt \
      --output output.txt

    # fail the test if the component did not write its output file
    if [[ ! -f output.txt ]]; then
      echo "It seems no output file is generated"
      exit 1
    fi
    exit 0

src/minimal_component/config.vsh.yaml

    name: example_python
    description: A minimal example component.
    arguments:
      - type: file
        name: --input
      - type: file
        name: --output
        direction: output
    resources:
      - type: python_script
        path: script.py
    test_resources:
      - type: bash_script
        path: test.sh
    engines:
      - type: docker
        image: "python:3.10-slim"
      - type: native
    runners:
      - type: executable
      - type: nextflow
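
The same check can be written in Python. The sketch below mirrors the logic of test.sh: create an input file, run the component, and verify that an output file appears. Since no built component is available here, a small stand-in script plays that role; the stand-in, the `check_component` helper, and the file names are hypothetical. In a test run by Viash, the command would instead point at the built executable (the Viash documentation describes a `meta["executable"]` entry for this purpose).

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def check_component(command, workdir):
    """Run the component command and verify that it writes the output file."""
    input_file = workdir / "test_file.txt"
    output_file = workdir / "output.txt"
    input_file.touch()
    subprocess.run(
        [*command, "--input", str(input_file), "--output", str(output_file)],
        check=True,  # raise if the component exits with a non-zero status
    )
    assert output_file.is_file(), "It seems no output file is generated"
    return output_file


# Hypothetical stand-in for the built component: copies --input to --output.
workdir = Path(tempfile.mkdtemp())
stub = workdir / "stub_component.py"
stub.write_text(
    "import shutil, sys\n"
    "args = dict(zip(sys.argv[1::2], sys.argv[2::2]))\n"
    "shutil.copyfile(args['--input'], args['--output'])\n"
)

result = check_component([sys.executable, str(stub)], workdir)
print(f"Test passed: {result.name} was created.")
```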
      

Building larger workflows

Viash components are designed as building blocks: each can be executed as a standalone entity, and they can also be combined into larger workflows. Each component has clearly defined inputs and outputs, making it straightforward to connect them in sequential or branching workflows. Viash automatically generates Nextflow-compatible components, allowing you to arrange them in any order as part of a larger Nextflow workflow and to scale from local testing to high-performance computing environments. Importantly, components can be added as dependencies from your local repo or from the extensive Viash Catalogue.

src/larger_workflow/main.nf

    workflow run_wf {
      take: input_ch
      main:
      output_ch = input_ch
      | minimal_component.run(
          // the output of the minimal_component
          // becomes the input of concat_text
          toState: ["input": "output"]
        )
      | concat_text
      emit: output_ch
    }

src/larger_workflow/config.vsh.yaml

    name: larger_workflow
    description: A larger example workflow.
    arguments:
      - type: file
        name: --input
      - type: file
        name: --output
        direction: output
      - type: boolean_true
        name: --gzip_output
    resources:
      - type: nextflow_script
        path: main.nf
        entrypoint: run_wf
    dependencies:
      - name: concat_text
        repository: craftbox
      - name: minimal_component
    repositories:
      - name: craftbox
        type: vsh
        repo: vsh/craftbox
        tag: v0.1.0
    runners:
      - type: nextflow
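
The `toState: ["input": "output"]` mapping in main.nf can be read as "set the `input` entry of the state to the `output` value produced by the component". A minimal Python sketch of that remapping follows. It is a reading of the comment in the example rather than Viash's implementation: the `apply_to_state` helper and the sample values are hypothetical, and keeping the other state entries unchanged is an assumption.

```python
def apply_to_state(state, output, to_state):
    """Return a new state where each state key is set from the named output key."""
    new_state = dict(state)  # assumption: existing state entries are kept
    for state_key, output_key in to_state.items():
        new_state[state_key] = output[output_key]
    return new_state


# Hypothetical state before the step and output produced by minimal_component.
state = {"input": "raw.txt", "gzip_output": False}
output = {"output": "copied.txt"}

next_state = apply_to_state(state, output, {"input": "output"})
print(next_state)  # {'input': 'copied.txt', 'gzip_output': False}
```

The next component in the chain then reads its `input` from this updated state, which is how one step's result feeds the following step.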
      

Reusability

By refining your configuration file and iteratively developing Viash modules, you create reusable source components that can be dynamically regenerated for various execution platforms. Viash currently supports multiple containerization technologies (Docker, Podman, Singularity) and workflow systems (Nextflow), with additional platforms in development. This architecture mitigates vendor lock-in risk, enabling seamless migration between technologies as organizational requirements evolve.

Interested in learning how Viash compares to traditional workflow development tools? See our blog article.
