Data Engineering CLIs

Luna Pathology offers data ingestion and data processing CLIs.

The data ingestion CLIs organize your pathology images and annotations with appropriate metadata. The data processing CLIs convert the pathology annotations into formats compatible with other open-source tools such as QuPath.

Note that the data ingestion CLIs rely on APIs that retrieve annotations from internal data sources.

Whole Slide Image (WSI) CLI

This module captures the metadata of your slides in a table.

Annotation CLIs

This module downloads point-click nuclear annotations and regional annotations, then converts them to GeoJSON format.

luna.pathology.point_annotation.proxy_table.generate

This module generates a parquet table of point-click nuclear annotation JSON files.

The configuration files are copied to your project/configs/table_name folder to persist the metadata used to generate the proxy table.

INPUT PARAMETERS

app_config_file - path to yaml file containing application runtime parameters. See config.yaml.template

data_config_file - path to yaml file containing data input and output parameters. See data_config.yaml.template

  • ROOT_PATH: path to output data

  • DATA_TYPE: data type used in table name e.g. POINT_RAW_JSON

  • PROJECT: your project name, used in the table path

  • DATASET_NAME: optional, dataset name to version your table

  • PROJECT_ID: SlideViewer project id

  • USERS: list of users that provide expert annotations for this project

  • SLIDEVIEWER_CSV_FILE: optional path to a SlideViewer csv file that lists the names of the whole slide images for which the proxy table generator should download point annotations. If this field is left blank, the generator downloads this file from SlideViewer.
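As a rough illustration of the data_config_file parameters listed above, the sketch below holds them in a Python mapping and checks that the non-optional ones are filled in. Only the key names come from this page. Every value is a made-up placeholder, the flat layout is an assumption (the real data_config.yaml.template may nest these keys differently), and missing_required_keys is a hypothetical helper, not part of Luna.

```python
# Key names come from the parameter list above; every value is a placeholder.
example_data_config = {
    "ROOT_PATH": "/path/to/output",           # hypothetical output location
    "DATA_TYPE": "POINT_RAW_JSON",            # example value given in the docs
    "PROJECT": "my_project",                  # hypothetical project name
    "DATASET_NAME": "",                       # optional: versions the table
    "PROJECT_ID": 123,                        # hypothetical SlideViewer project id
    "USERS": ["annotator_a", "annotator_b"],  # hypothetical usernames
    "SLIDEVIEWER_CSV_FILE": "",               # optional: blank means "download it"
}

# Assumption: any key not described as optional must be present and non-empty.
OPTIONAL_KEYS = {"DATASET_NAME", "SLIDEVIEWER_CSV_FILE"}
EXPECTED_KEYS = (
    "ROOT_PATH",
    "DATA_TYPE",
    "PROJECT",
    "DATASET_NAME",
    "PROJECT_ID",
    "USERS",
    "SLIDEVIEWER_CSV_FILE",
)


def missing_required_keys(config):
    """Return the non-optional keys that are absent or empty, in listed order."""
    missing = []
    for key in EXPECTED_KEYS:
        if key in OPTIONAL_KEYS:
            continue
        value = config.get(key)
        if value is None or value == "" or value == []:
            missing.append(key)
    return missing
```

Calling missing_required_keys on a config before writing it out to YAML gives an early warning about blank fields; the CLI itself may apply different or stricter checks.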

TABLE SCHEMA

  • slideviewer_path: path to the original slide image in the SlideViewer platform

  • slide_id: id for the slide. synonymous with image_id

  • sv_project_id: same as the PROJECT_ID from data_config_file, refers to the SlideViewer project number.

  • sv_json: json annotation file downloaded from SlideViewer.

  • user: username of the annotator for a given annotation

  • sv_json_record_uuid: hash of the raw json annotation file from SlideViewer, format: SVPTJSON-{json_hash}
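To make the sv_json_record_uuid format concrete, here is a minimal sketch of how an identifier of the form SVPTJSON-{json_hash} could be derived from an annotation. The SVPTJSON- prefix is taken from the schema above. The choice of SHA-256, the canonical serialization, and the function name point_json_record_uuid are assumptions for illustration; this page does not state which hash Luna uses.

```python
import hashlib
import json


def point_json_record_uuid(sv_json):
    """Build an id shaped like SVPTJSON-{json_hash} for a parsed annotation.

    Assumptions: SHA-256 as the hash, and a canonical serialization (sorted
    keys, compact separators) so equal annotations always produce the same id.
    """
    payload = json.dumps(sv_json, sort_keys=True, separators=(",", ":"))
    json_hash = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return f"SVPTJSON-{json_hash}"


# Hypothetical annotation content, used only to exercise the function.
example_annotation = [{"x": 10, "y": 20, "class": 0}]
example_uuid = point_json_record_uuid(example_annotation)
```

Hashing a canonical serialization rather than the raw file bytes is a design choice of this sketch: it keeps the id stable when key order or whitespace changes, at the cost of differing from a hash of the file as downloaded.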

luna.pathology.point_annotation.proxy_table.generate [OPTIONS]

Options

-d, --data_config_file <data_config_file>

path to yaml file containing data input and output parameters. See data_config.yaml.template

-a, --app_config_file <app_config_file>

path to yaml file containing application runtime parameters. See config.yaml.template
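A sketch of how the command might be assembled from Python, for example inside a pipeline script. The -d and -a options are the ones documented above. Running the module through python -m is an assumption about how the package is installed, build_generate_command is a hypothetical helper, and the file names in the usage comment are placeholders.

```python
import sys

# Module path as documented above.
MODULE = "luna.pathology.point_annotation.proxy_table.generate"


def build_generate_command(data_config_file, app_config_file):
    """Return the argument list for one run of the point annotation generator.

    Assumption: the module is runnable with `python -m`; adjust the first
    three elements if your installation exposes the CLI differently.
    """
    return [
        sys.executable, "-m", MODULE,
        "-d", data_config_file,   # data input and output parameters
        "-a", app_config_file,    # application runtime parameters
    ]


# Usage (not executed here; requires Luna and access to SlideViewer):
#   import subprocess
#   subprocess.run(build_generate_command("data_config.yaml", "config.yaml"), check=True)
```

Passing the arguments as a list avoids shell quoting problems when the config paths contain spaces.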