Snakemake

Since the segmentation of several patients is time-consuming, we have provided a snakemake pipeline to automate the process. This pipeline also allows to train other set of centroids and use it for the segmentation. This file allows to customize the usage of the hardware resources, like the number of threads and the amount of memory.

As before, this examples will use he data previously downloaded from the public dataset

Segment Multiple Scan

First of all, you have to create two folders:

  • INPUT : contains all and only the CT scans to segment

  • OUTPUT : empty folder, will contain the segmented scans as nrrd.

Now simply execute from command line

snakemake --cores 1 --config input_path='./Examples/INPUT/' --output_path='./Examples/OUTPUT/'

Note

It will create a folder named LUNG inside the INPUT, which contains the results of the lung extraction step.

Train a Centroid Set

Prepare three folders:
  • INPUT: will contains all the scans to segment

  • OUTPUT: will contain the segmented scans

  • TRAIN: will contain all the scans of the training set.

Now run Snakemake with the following configuration parameters :

snakemake --cores 1 --config input_path='./Examples/INPUT/' --output_path='./Examples/OUTPUT/'
--train_path='./Examples/TRAIN/' --centroid_path='.Examples/centorid_set.pkl.npy'

This will train the centroid set and use them to segment the input scans.

Note

This will create a folder named LUNG inside INPUT and TRAIN which contains the scans after lung extraction.

Warning

The TRAIN folder cannot be the same of INPUT!

Configuration

We have provided a configuration file (config.yaml) which allows to manage the resources and the path, which we usually provide from command-line.

Threads:

  • threads_labelling : Set the number of threads to use for the labelling process (default = 8);

  • threads_lung_extraction : Set the number of threads to use for the lung_extraction (default = 8);

  • threads_train : Set the number of threads to use for the training process (default = 8).

Memory:

  • memory_labelling : 8

  • memory_lung_extraction : 8

  • memory_train : 8

Training Parameters:

It is possible to specify the parameters for the training step:

  • n_subsamples : number of subsamples in which the slice of the training set will be divided during the training;

  • centroid_initialization : technique to use for the initialization of the centroids during k-means (0 for random initialization, 1 for k-means++)