Flexiprep
Introduction
Flexiprep is a quality control pipeline. This pipeline checks for possible barcode contamination, clips reads, trims reads and
runs FASTQC.
Adapter clipping is performed by Cutadapt.
For quality trimming we use Sickle.
Flexiprep only works on .fastq
files.
Example
To get the help menu:
biopet pipeline Flexiprep -h
Arguments for Flexiprep:
-R1,--input_r1 <input_r1> R1 fastq file (gzipped allowed)
-R2,--input_r2 <input_r2> R2 fastq file (gzipped allowed)
-sample,--sampleid <sampleid> Sample ID
-library,--libid <libid> Library ID
-config,--config_file <config_file> JSON config file(s)
-DSC,--disablescatter Disable all scatters
Note that the pipeline also works on unpaired reads where one should only provide R1.
To start the pipeline (remove -run
for a dry run):
biopet pipeline Flexiprep -run -outDir myDir \
-R1 myFirstReadPair -R2 mySecondReadPair -sample mySampleName \
-library myLibname -config mySettings.json
Configuration and flags
For technical reasons, single sample pipelines, such as this pipeline do not take a sample config. Input files are in stead given on the command line as a flag.
Command line flags for Flexiprep are:
Flag (short) | Flag (long) | Type | Function |
---|---|---|---|
-R1 | --inputR1 | Path (required) | Path to input fastq file |
-R2 | --inputR2 | Path (optional) | Path to second read pair fastq file. |
-sample | --sampleid | String (required) | Name of sample |
-library | --libid | String (required) | Name of library |
If -R2
is given, the pipeline will assume a paired-end setup.
Sample input extensions
Please refer to our mapping pipeline for information about how the input samples should be handled.
Config
All other values should be provided in the config. Specific config values towards the mapping pipeline are:
Name | Type | Function |
---|---|---|
skip_trim | Boolean | Default false, if true the trimming step is skipped |
skip_clip | Boolean | Default false, if true the clipping step is skipped |
Result files
The results from this pipeline will be a fastq file. The pipeline also outputs 2 Fastqc runs one before and one after quality control.
Example output
.
├── mySample_01.qc.summary.json
├── mySample_01.qc.summary.json.out
├── mySample_01.R1.contams.txt
├── mySample_01.R1.fastqc
│ ├── mySample_01.R1_fastqc
│ │ ├── fastqc_data.txt
│ │ ├── fastqc_report.html
│ │ ├── Icons
│ │ │ ├── error.png
│ │ │ ├── fastqc_icon.png
│ │ │ ├── tick.png
│ │ │ └── warning.png
│ │ ├── Images
│ │ │ └── warning.png
│ │ ├── Images
│ │ │ ├── duplication_levels.png
│ │ │ ├── kmer_profiles.png
│ │ │ ├── per_base_gc_content.png
│ │ │ ├── per_base_n_content.png
│ │ │ ├── per_base_quality.png
│ │ │ ├── per_base_sequence_content.png
│ │ │ ├── per_sequence_gc_content.png
│ │ │ ├── per_sequence_quality.png
│ │ │ └── sequence_length_distribution.png
│ │ └── summary.txt
│ └── mySample_01.R1.qc_fastqc.zip
├── mySample_01.R1.qc.fastq.gz
├── mySample_01.R1.qc.fastq.gz.md5
├── mySample_01.R2.contams.txt
├── mySample_01.R2.fastqc
│ ├── mySample_01.R2_fastqc
│ │ ├── fastqc_data.txt
│ │ ├── fastqc_report.html
│ │ ├── Icons
│ │ │ ├── error.png
│ │ │ ├── fastqc_icon.png
│ │ │ ├── tick.png
│ │ │ └── warning.png
│ │ ├── Images
│ │ │ ├── duplication_levels.png
│ │ │ ├── kmer_profiles.png
│ │ │ ├── per_base_gc_content.png
│ │ │ ├── per_base_n_content.png
│ │ │ ├── per_base_quality.png
│ │ │ ├── per_base_sequence_content.png
│ │ │ ├── per_sequence_gc_content.png
│ │ │ ├── per_sequence_quality.png
│ │ │ └── sequence_length_distribution.png
│ │ └── summary.txt
│ └── mySample_01.R2_fastqc.zip
├── mySample_01.R2.fastq.md5
├── mySample_01.R2.qc.fastqc
│ ├── mySample_01.R2.qc_fastqc
│ │ ├── fastqc_data.txt
│ │ ├── fastqc_report.html
│ │ ├── Icons
│ │ │ ├── error.png
│ │ │ ├── fastqc_icon.png
│ │ │ ├── tick.png
│ │ │ └── warning.png
│ │ ├── Images
│ │ │ ├── duplication_levels.png
│ │ │ ├── kmer_profiles.png
│ │ │ ├── per_base_gc_content.png
│ │ │ ├── per_base_n_content.png
│ │ │ ├── per_base_quality.png
│ │ │ ├── per_base_sequence_content.png
│ │ │ ├── per_sequence_gc_content.png
│ │ │ ├── per_sequence_quality.png
│ │ │ └── sequence_length_distribution.png
│ │ └── summary.txt
│ └── mySample_01.R2.qc_fastqc.zip
├── mySample_01.R2.qc.fastq.gz
├── mySample_01.R2.qc.fastq.gz.md5
└── report
Getting Help
If you have any questions on running Flexiprep, suggestions on how to improve the overall flow, or requests for your favorite Quality Control (QC) related program to be added, feel free to post an issue to our issue tracker at GitHub. Or contact us directly via: SASC email