{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# DSA Annotation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Digital Slide Archive (DSA) is an open-source web application where users can create regional and point annotations in a high-magnification slide viewer. Luna Pathology CLIs pull the different annotation types from DSA and save the annotations in GeoJSON format along with metadata. In this notebook, we will review:\n", "\n", "- Project setup on DSA\n", "- Create annotations on DSA\n", "- Run regional annotation ETL\n", "- Run point annotation ETL\n", "\n", "DSA provides an excellent [video tutorial](https://www.youtube.com/watch?v=HTvLMyKYyGs&ab_channel=DigitalSlideArchive%2FHistomicsTK) that covers platform features. For the first two topics, the information below is an abridged version of the tutorial for your reference." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Project setup on DSA" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Digital Slide Archive (DSA) is a platform that provides the ability to store, manage, visualize and annotate large imaging data sets. The DSA consists of an interface to visualize slides and manage annotations (HistomicsUI), and a web-server that provides a rich API and data management tools (using Girder). This system can:\n", "\n", "- Organize images from a variety of assetstores, such as local file systems and S3.\n", "- Provide user access controls.\n", "- Support image annotation and review.\n", "\n", "HistomicsUI is a web-based application for examining, annotating, and processing histology images to extract both low- and high-level features (e.g. cellular structure, feature types).\n", "\n", "**Concepts**\n", "\n", "- **Collections** correspond to a project. Collections are the top-level objects in the data organization hierarchy.\n", "- **Folders** help organize slides under a project, e.g. `hne_slides`.\n", "- **Items** correspond to a slide. 
An item can have metadata, annotations and files associated with it.\n", "- **Annotation** is a single rectangle, point, or polygon.\n", "- **Annotation Document** is a set of annotations created by the pathologist.\n", "- **Annotation Style** is a predefined set of labels (morphology such as tumor, stroma, necrosis, etc.) and colors." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create a collection for your project.\n", "Your images can be organized in a folder.\n", "In this example, we have a `pathology-tutorial` collection with a `slides` folder that holds the images.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Create annotations on DSA" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Please see this [video tutorial](https://youtu.be/HTvLMyKYyGs?t=369) for creating and viewing annotations. The information below is an abridged version of the tutorial for your reference.\n", "\n", "**1. To navigate to HistomicsUI, go to Actions → Open in HistomicsUI on the upper right side. HistomicsUI will open in a new tab in your browser.**\n", "\n", "**2. Create an annotation document**\n", "- Click on the + New button on the Annotation panel. This will bring up a Create annotation modal.\n", "- Name your annotation document **regional** or **point**. These are the two types of annotations we support. The annotation document name is used by the ETL, so it is important to standardize your document names so that the ETL can download all documents for the annotation type.\n", "- Optionally add a description, then click save.\n", "\n", "**3. Create annotations**\n", "\n", "- Select a label (e.g. regional_tumor)\n", "- Click on **Point** or **Polygon**. When an annotation shape is highlighted, your cursor on the slide area will look like a +\n", "- For Point annotation, zoom to an appropriate magnification and click on the cell. 
The annotation will appear as a circle.\n", "- For Polygon annotation, click and drag your mouse. As you drag, the area will be highlighted. Try to meet the starting point, or double click to close the polygon.\n", "\n", "**Note**: Using standardized annotation styles is recommended. A uniform annotation style JSON can be created and shared among the pathologists making annotations." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Run regional annotation ETL\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Once you have created annotations on DSA, you can run the annotation ETL CLI. This ETL downloads the annotations, converts them to GeoJSON format, and creates a parquet table that makes the annotations and metadata queryable.\n", "\n", "First, let's look at the CLI arguments by running the CLI with `--help`.\n", "\n", "- **data_config_file** contains details of the data source, project name, annotation name and path to save the annotations.\n", " To download regional annotations, set `DATA_TYPE` to `REGIONAL_METADATA_RESULTS`. For point annotations, set `DATA_TYPE` to `POINT_GEOJSON`.\n", "- **app_config_file** contains dask parameters. The default parameters in the example config should be sufficient for most cases.\n", "- **user** is an optional argument for authenticating to DSA. This can be set as the environment variable `DSA_USERNAME`.\n", "- **password** is an optional argument for authenticating to DSA. This can be set as the environment variable `DSA_PASSWORD`.\n", "\n", "For details of the data and app configuration, please refer to the example configurations.\n", "\n", "**Note**: The address of your DSA instance is specified as `DSA_URI` in `../conf/dsa_regional_annotation.yaml` and should be updated to reflect your DSA setup. 
If you are using Docker, replace `localhost` with the IP address returned by:\n", "\n", "```docker inspect -f '{{range.NetworkSettings.Networks}}{{.IPAddress}}{{end}}' luna_tutorial_girder_1```\n" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Usage: dsa_annotation [OPTIONS]\r\n", "\r\n", " This module generates parquets with regional annotation pathology data from\r\n", " DSA\r\n", "\r\n", " TABLE SCHEMA - project_name: name of DSA collection - slide_id: slide id.\r\n", " synonymous with image_id - user: username of the annotator for a given\r\n", " annotation. For all slides, we combine multiple annotations as CONCAT\r\n", " user - dsa_json_path: file path to json file downloaded from DSA -\r\n", " geojson_path: file path to geojson file converted from DSA json format -\r\n", " date_updated: annotation updated time - date_created: annotation creation\r\n", " time - labelset: name of the provided labelset - annotation_name: name of\r\n", " the annotation in DSA - annotation_type: annotation type\r\n", "\r\n", " Args: app_config_file (str): path to yaml file containing application\r\n", " runtime parameters. See config.yaml.template data_config_file\r\n", " (str): path to yaml file containing data input and output parameters.\r\n", " See dask_data_config.yaml.template user (str, optional): DSA username.\r\n", " This can be provided as an argument or as an environment variable\r\n", " DSA_USERNAME. password (str, optional): DSA password. This can be\r\n", " provided as an argument or as an environment variable DSA_PASSWORD.\r\n", "\r\n", " Returns: None\r\n", "\r\n", "Options:\r\n", " -d, --data_config_file PATH path to yaml file containing data input and\r\n", " output parameters. See\r\n", " dask_data_config.yaml.template [required]\r\n", " -a, --app_config_file PATH path to yaml file containing application\r\n", " runtime parameters. 
See config.yaml.template\r\n", " [required]\r\n", " -u, --user TEXT DSA username. This can be provided as an\r\n", " argument or as an environment variable\r\n", " DSA_USERNAME.\r\n", " -p, --password TEXT DSA password. This can be provided as an\r\n", " argument or as an environment variable\r\n", " DSA_USERNAME.\r\n", " --help Show this message and exit.\r\n" ] } ], "source": [ "!dsa_annotation --help" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2021-12-10 21:54:40,765 - INFO - root - FYI: Initalized logger, log file at: data-processing.log with handlers: [ (INFO)>, ]\n", "2021-12-10 21:54:40,767 - INFO - luna.common.config - loading config file ../conf/dsa_regional_annotation.yaml\n", "2021-12-10 21:54:40,772 - INFO - luna.common.config - loading config file ../conf/dsa_app_config.yaml\n", "2021-12-10 21:54:40,774 - INFO - root - data template: ../conf/dsa_regional_annotation.yaml\n", "2021-12-10 21:54:40,776 - INFO - root - config_file: ../conf/dsa_app_config.yaml\n", "2021-12-10 21:54:40,821 - INFO - root - config files copied to ../PRO_12-123/configs/REGIONAL_METADATA_RESULTS\n", "2021-12-10 21:54:40,919 - INFO - luna.pathology.cli.dsa.dsa_annotations - Table output directory: ../PRO_12-123/tables/REGIONAL_METADATA_RESULTS\n", "Successfully connected to DSA\n", "collection_id_dict {'_accessLevel': 2, '_id': '60edf904f398e364deb34dfa', '_modelType': 'collection', '_textScore': 15.0, 'created': '2021-07-13T20:35:16.195000+00:00', 'description': '', 'meta': {'stylesheet': [{'fillColor': 'rgb(255, 0, 0)', 'group': 'regional_tumor', 'id': 'regional_tumor', 'label': {'value': 'regional_tumor'}, 'lineColor': 'rgb(255, 0, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 255, 0)', 'group': 'regional_stroma', 'id': 'regional_stroma', 'label': {'value': 'regional_stroma'}, 'lineColor': 'rgb(0, 255, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 0, 255)', 'group': 
'regional_fat', 'id': 'regional_fat', 'label': {'value': 'regional_fat'}, 'lineColor': 'rgb(0, 0, 255)', 'lineWidth': 2}, {'fillColor': 'rgb(255, 0, 0)', 'group': 'point_lymphocytes', 'id': 'point_lymphocytes', 'label': {'value': 'point_lymphocytes'}, 'lineColor': 'rgb(255, 0, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 255, 0)', 'group': 'point_other', 'id': 'point_other', 'label': {'value': 'point_other'}, 'lineColor': 'rgb(0, 255, 0)', 'lineWidth': 2}]}, 'name': 'pathology-tutorial', 'public': False, 'publicFlags': [], 'size': 3161997775, 'updated': '2021-12-07T17:48:13.526000+00:00'}\n", "Collection pathology-tutorial found with id: 60edf904f398e364deb34dfa\n", "retreived collection uuid\n", "2021-12-10 21:54:41,528 - INFO - luna.pathology.cli.dsa.dsa_annotations - Retrieved collection metadata\n", "2021-12-10 21:54:42,955 - INFO - root - FYI: Initalized logger, log file at: data-processing.log with handlers: [ (INFO)>, ]\n", "2021-12-10 21:54:42,956 - INFO - root - FYI: Initalized logger, log file at: data-processing.log with handlers: [ (INFO)>, ]\n", "2021-12-10 21:54:42,960 - INFO - root - FYI: Initalized logger, log file at: data-processing.log with handlers: [ (INFO)>, ]\n", "2021-12-10 21:54:42,963 - INFO - luna.pathology.cli.dsa.dsa_annotations - \n", "collection_id_dict {'_accessLevel': 2, '_id': '60edf904f398e364deb34dfa', '_modelType': 'collection', '_textScore': 15.0, 'created': '2021-07-13T20:35:16.195000+00:00', 'description': '', 'meta': {'stylesheet': [{'fillColor': 'rgb(255, 0, 0)', 'group': 'regional_tumor', 'id': 'regional_tumor', 'label': {'value': 'regional_tumor'}, 'lineColor': 'rgb(255, 0, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 255, 0)', 'group': 'regional_stroma', 'id': 'regional_stroma', 'label': {'value': 'regional_stroma'}, 'lineColor': 'rgb(0, 255, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 0, 255)', 'group': 'regional_fat', 'id': 'regional_fat', 'label': {'value': 'regional_fat'}, 'lineColor': 'rgb(0, 0, 255)', 'lineWidth': 
2}, {'fillColor': 'rgb(255, 0, 0)', 'group': 'point_lymphocytes', 'id': 'point_lymphocytes', 'label': {'value': 'point_lymphocytes'}, 'lineColor': 'rgb(255, 0, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 255, 0)', 'group': 'point_other', 'id': 'point_other', 'label': {'value': 'point_other'}, 'lineColor': 'rgb(0, 255, 0)', 'lineWidth': 2}]}, 'name': 'pathology-tutorial', 'public': False, 'publicFlags': [], 'size': 3161997775, 'updated': '2021-12-07T17:48:13.526000+00:00'}\n", "Collection pathology-tutorial found with id: 60edf904f398e364deb34dfa\n", "collection_id_dict {'_accessLevel': 2, '_id': '60edf904f398e364deb34dfa', '_modelType': 'collection', '_textScore': 15.0, 'created': '2021-07-13T20:35:16.195000+00:00', 'description': '', 'meta': {'stylesheet': [{'fillColor': 'rgb(255, 0, 0)', 'group': 'regional_tumor', 'id': 'regional_tumor', 'label': {'value': 'regional_tumor'}, 'lineColor': 'rgb(255, 0, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 255, 0)', 'group': 'regional_stroma', 'id': 'regional_stroma', 'label': {'value': 'regional_stroma'}, 'lineColor': 'rgb(0, 255, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 0, 255)', 'group': 'regional_fat', 'id': 'regional_fat', 'label': {'value': 'regional_fat'}, 'lineColor': 'rgb(0, 0, 255)', 'lineWidth': 2}, {'fillColor': 'rgb(255, 0, 0)', 'group': 'point_lymphocytes', 'id': 'point_lymphocytes', 'label': {'value': 'point_lymphocytes'}, 'lineColor': 'rgb(255, 0, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 255, 0)', 'group': 'point_other', 'id': 'point_other', 'label': {'value': 'point_other'}, 'lineColor': 'rgb(0, 255, 0)', 'lineWidth': 2}]}, 'name': 'pathology-tutorial', 'public': False, 'publicFlags': [], 'size': 3161997775, 'updated': '2021-12-07T17:48:13.526000+00:00'}\n", "Collection pathology-tutorial found with id: 60edf904f398e364deb34dfa\n", "collection_id_dict {'_accessLevel': 2, '_id': '60edf904f398e364deb34dfa', '_modelType': 'collection', '_textScore': 15.0, 'created': '2021-07-13T20:35:16.195000+00:00', 
'description': '', 'meta': {'stylesheet': [{'fillColor': 'rgb(255, 0, 0)', 'group': 'regional_tumor', 'id': 'regional_tumor', 'label': {'value': 'regional_tumor'}, 'lineColor': 'rgb(255, 0, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 255, 0)', 'group': 'regional_stroma', 'id': 'regional_stroma', 'label': {'value': 'regional_stroma'}, 'lineColor': 'rgb(0, 255, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 0, 255)', 'group': 'regional_fat', 'id': 'regional_fat', 'label': {'value': 'regional_fat'}, 'lineColor': 'rgb(0, 0, 255)', 'lineWidth': 2}, {'fillColor': 'rgb(255, 0, 0)', 'group': 'point_lymphocytes', 'id': 'point_lymphocytes', 'label': {'value': 'point_lymphocytes'}, 'lineColor': 'rgb(255, 0, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 255, 0)', 'group': 'point_other', 'id': 'point_other', 'label': {'value': 'point_other'}, 'lineColor': 'rgb(0, 255, 0)', 'lineWidth': 2}]}, 'name': 'pathology-tutorial', 'public': False, 'publicFlags': [], 'size': 3161997775, 'updated': '2021-12-07T17:48:13.526000+00:00'}\n", "Collection pathology-tutorial found with id: 60edf904f398e364deb34dfa\n", "Image file 01OV002-bd8cdc70-3d46-40ae-99c4-90ef77.svs found with id: 61563001b8ba89a9c64590a1\n", "Starting request for annotation\n", "Image file 01OV008-308ad404-7079-4ff8-8232-12ee2e.svs found with id: 6156306db8ba89a9c64590a4\n", "Starting request for annotation\n", "Image file 01OV002-ed65cf94-8bc6-492b-9149-adc16f.svs found with id: 61562f90b8ba89a9c645909e\n", "Starting request for annotation\n", "collection_id_dict {'_accessLevel': 2, '_id': '60edf904f398e364deb34dfa', '_modelType': 'collection', '_textScore': 15.0, 'created': '2021-07-13T20:35:16.195000+00:00', 'description': '', 'meta': {'stylesheet': [{'fillColor': 'rgb(255, 0, 0)', 'group': 'regional_tumor', 'id': 'regional_tumor', 'label': {'value': 'regional_tumor'}, 'lineColor': 'rgb(255, 0, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 255, 0)', 'group': 'regional_stroma', 'id': 'regional_stroma', 'label': {'value': 
'regional_stroma'}, 'lineColor': 'rgb(0, 255, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 0, 255)', 'group': 'regional_fat', 'id': 'regional_fat', 'label': {'value': 'regional_fat'}, 'lineColor': 'rgb(0, 0, 255)', 'lineWidth': 2}, {'fillColor': 'rgb(255, 0, 0)', 'group': 'point_lymphocytes', 'id': 'point_lymphocytes', 'label': {'value': 'point_lymphocytes'}, 'lineColor': 'rgb(255, 0, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 255, 0)', 'group': 'point_other', 'id': 'point_other', 'label': {'value': 'point_other'}, 'lineColor': 'rgb(0, 255, 0)', 'lineWidth': 2}]}, 'name': 'pathology-tutorial', 'public': False, 'publicFlags': [], 'size': 3161997775, 'updated': '2021-12-07T17:48:13.526000+00:00'}\n", "Collection pathology-tutorial found with id: 60edf904f398e364deb34dfa\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "collection_id_dict {'_accessLevel': 2, '_id': '60edf904f398e364deb34dfa', '_modelType': 'collection', '_textScore': 15.0, 'created': '2021-07-13T20:35:16.195000+00:00', 'description': '', 'meta': {'stylesheet': [{'fillColor': 'rgb(255, 0, 0)', 'group': 'regional_tumor', 'id': 'regional_tumor', 'label': {'value': 'regional_tumor'}, 'lineColor': 'rgb(255, 0, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 255, 0)', 'group': 'regional_stroma', 'id': 'regional_stroma', 'label': {'value': 'regional_stroma'}, 'lineColor': 'rgb(0, 255, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 0, 255)', 'group': 'regional_fat', 'id': 'regional_fat', 'label': {'value': 'regional_fat'}, 'lineColor': 'rgb(0, 0, 255)', 'lineWidth': 2}, {'fillColor': 'rgb(255, 0, 0)', 'group': 'point_lymphocytes', 'id': 'point_lymphocytes', 'label': {'value': 'point_lymphocytes'}, 'lineColor': 'rgb(255, 0, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 255, 0)', 'group': 'point_other', 'id': 'point_other', 'label': {'value': 'point_other'}, 'lineColor': 'rgb(0, 255, 0)', 'lineWidth': 2}]}, 'name': 'pathology-tutorial', 'public': False, 'publicFlags': [], 'size': 3161997775, 'updated': 
'2021-12-07T17:48:13.526000+00:00'}\n", "Collection pathology-tutorial found with id: 60edf904f398e364deb34dfa\n", "collection_id_dict {'_accessLevel': 2, '_id': '60edf904f398e364deb34dfa', '_modelType': 'collection', '_textScore': 15.0, 'created': '2021-07-13T20:35:16.195000+00:00', 'description': '', 'meta': {'stylesheet': [{'fillColor': 'rgb(255, 0, 0)', 'group': 'regional_tumor', 'id': 'regional_tumor', 'label': {'value': 'regional_tumor'}, 'lineColor': 'rgb(255, 0, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 255, 0)', 'group': 'regional_stroma', 'id': 'regional_stroma', 'label': {'value': 'regional_stroma'}, 'lineColor': 'rgb(0, 255, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 0, 255)', 'group': 'regional_fat', 'id': 'regional_fat', 'label': {'value': 'regional_fat'}, 'lineColor': 'rgb(0, 0, 255)', 'lineWidth': 2}, {'fillColor': 'rgb(255, 0, 0)', 'group': 'point_lymphocytes', 'id': 'point_lymphocytes', 'label': {'value': 'point_lymphocytes'}, 'lineColor': 'rgb(255, 0, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 255, 0)', 'group': 'point_other', 'id': 'point_other', 'label': {'value': 'point_other'}, 'lineColor': 'rgb(0, 255, 0)', 'lineWidth': 2}]}, 'name': 'pathology-tutorial', 'public': False, 'publicFlags': [], 'size': 3161997775, 'updated': '2021-12-07T17:48:13.526000+00:00'}\n", "Collection pathology-tutorial found with id: 60edf904f398e364deb34dfa\n", "Image file 2551571.svs found with id: 60f1715cf398e364deb3befb\n", "Starting request for annotation\n", "Image file 01OV008-7579323e-2fae-43a9-b00f-a15c28.svs found with id: 61562e72b8ba89a9c6459098\n", "Starting request for annotation\n", "Image file 01OV007-9b90eb78-2f50-4aeb-b010-d642f9.svs found with id: 61562edeb8ba89a9c645909b\n", "Starting request for annotation\n", "Annotiaton not found for slide: 2551571.svs and annotation name: ov_regional\n", "2021-12-10 21:54:43,930 - INFO - luna.common.config - loading config file /home/rosed2/vmount/conf/datastore.cfg\n", "2021-12-10 21:54:43,936 - INFO - 
luna.common.DataStore - Configured datastore with {'GRAPH_STORE_ENABLED': False, 'GRAPH_URI': 'neo4j://localhost:7687', 'GRAPH_USER': 'neo4j', 'GRAPH_PASSWORD': 'password', 'OBJECT_STORE_ENABLED': False, 'MINIO_URI': 'localhost:8001', 'MINIO_USER': 'minio', 'MINIO_PASSWORD': 'password', 'DOC_STORE_ENABLED': False, 'MONGODB_URI': 'mongodb://localhost:27017/'}\n", "2021-12-10 21:54:43,940 - INFO - luna.common.DataStore - Datstore file backend= ../PRO_12-123/slides\n", "2021-12-10 21:54:43,948 - INFO - luna.common.DataStore - Save -> ../PRO_12-123/slides/01OV002-bd8cdc70-3d46-40ae-99c4-90ef77/admin/REGIONAL_METADATA_RESULTS_DSA_JSON/DEFAULT_LABELS\n", "2021-12-10 21:54:43,961 - INFO - luna.common.DataStore - Save -> ../PRO_12-123/slides/01OV002-bd8cdc70-3d46-40ae-99c4-90ef77/CONCAT/REGIONAL_METADATA_RESULTS/DEFAULT_LABELS\n", "2021-12-10 21:54:43,968 - INFO - luna.common.DataStore - Configured datastore with {'GRAPH_STORE_ENABLED': False, 'GRAPH_URI': 'neo4j://localhost:7687', 'GRAPH_USER': 'neo4j', 'GRAPH_PASSWORD': 'password', 'OBJECT_STORE_ENABLED': False, 'MINIO_URI': 'localhost:8001', 'MINIO_USER': 'minio', 'MINIO_PASSWORD': 'password', 'DOC_STORE_ENABLED': False, 'MONGODB_URI': 'mongodb://localhost:27017/'}\n", "2021-12-10 21:54:43,972 - INFO - luna.common.DataStore - Datstore file backend= ../PRO_12-123/slides\n", "2021-12-10 21:54:43,981 - INFO - luna.common.DataStore - Save -> ../PRO_12-123/slides/01OV002-ed65cf94-8bc6-492b-9149-adc16f/admin/REGIONAL_METADATA_RESULTS_DSA_JSON/DEFAULT_LABELS\n", "2021-12-10 21:54:43,995 - INFO - luna.common.DataStore - Save -> ../PRO_12-123/slides/01OV002-ed65cf94-8bc6-492b-9149-adc16f/CONCAT/REGIONAL_METADATA_RESULTS/DEFAULT_LABELS\n", "collection_id_dict {'_accessLevel': 2, '_id': '60edf904f398e364deb34dfa', '_modelType': 'collection', '_textScore': 15.0, 'created': '2021-07-13T20:35:16.195000+00:00', 'description': '', 'meta': {'stylesheet': [{'fillColor': 'rgb(255, 0, 0)', 'group': 'regional_tumor', 'id': 'regional_tumor', 
'label': {'value': 'regional_tumor'}, 'lineColor': 'rgb(255, 0, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 255, 0)', 'group': 'regional_stroma', 'id': 'regional_stroma', 'label': {'value': 'regional_stroma'}, 'lineColor': 'rgb(0, 255, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 0, 255)', 'group': 'regional_fat', 'id': 'regional_fat', 'label': {'value': 'regional_fat'}, 'lineColor': 'rgb(0, 0, 255)', 'lineWidth': 2}, {'fillColor': 'rgb(255, 0, 0)', 'group': 'point_lymphocytes', 'id': 'point_lymphocytes', 'label': {'value': 'point_lymphocytes'}, 'lineColor': 'rgb(255, 0, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 255, 0)', 'group': 'point_other', 'id': 'point_other', 'label': {'value': 'point_other'}, 'lineColor': 'rgb(0, 255, 0)', 'lineWidth': 2}]}, 'name': 'pathology-tutorial', 'public': False, 'publicFlags': [], 'size': 3161997775, 'updated': '2021-12-07T17:48:13.526000+00:00'}\n", "Collection pathology-tutorial found with id: 60edf904f398e364deb34dfa\n", "Image file 2551129.svs found with id: 60edf9d1f398e364deb34dfd\n", "Starting request for annotation\n", "2021-12-10 21:54:44,275 - INFO - luna.common.config - loading config file /home/rosed2/vmount/conf/datastore.cfg\n", "2021-12-10 21:54:44,283 - INFO - luna.common.DataStore - Configured datastore with {'GRAPH_STORE_ENABLED': False, 'GRAPH_URI': 'neo4j://localhost:7687', 'GRAPH_USER': 'neo4j', 'GRAPH_PASSWORD': 'password', 'OBJECT_STORE_ENABLED': False, 'MINIO_URI': 'localhost:8001', 'MINIO_USER': 'minio', 'MINIO_PASSWORD': 'password', 'DOC_STORE_ENABLED': False, 'MONGODB_URI': 'mongodb://localhost:27017/'}\n", "2021-12-10 21:54:44,287 - INFO - luna.common.DataStore - Datstore file backend= ../PRO_12-123/slides\n", "2021-12-10 21:54:44,297 - INFO - luna.common.DataStore - Save -> ../PRO_12-123/slides/01OV008-308ad404-7079-4ff8-8232-12ee2e/admin/REGIONAL_METADATA_RESULTS_DSA_JSON/DEFAULT_LABELS\n", "Annotiaton not found for slide: 2551129.svs and annotation name: ov_regional\n", "2021-12-10 21:54:44,318 - 
INFO - luna.common.DataStore - Save -> ../PRO_12-123/slides/01OV008-308ad404-7079-4ff8-8232-12ee2e/CONCAT/REGIONAL_METADATA_RESULTS/DEFAULT_LABELS\n", "2021-12-10 21:54:44,326 - INFO - luna.common.DataStore - Configured datastore with {'GRAPH_STORE_ENABLED': False, 'GRAPH_URI': 'neo4j://localhost:7687', 'GRAPH_USER': 'neo4j', 'GRAPH_PASSWORD': 'password', 'OBJECT_STORE_ENABLED': False, 'MINIO_URI': 'localhost:8001', 'MINIO_USER': 'minio', 'MINIO_PASSWORD': 'password', 'DOC_STORE_ENABLED': False, 'MONGODB_URI': 'mongodb://localhost:27017/'}\n", "2021-12-10 21:54:44,331 - INFO - luna.common.DataStore - Datstore file backend= ../PRO_12-123/slides\n", "2021-12-10 21:54:44,342 - INFO - luna.common.DataStore - Save -> ../PRO_12-123/slides/01OV008-7579323e-2fae-43a9-b00f-a15c28/admin/REGIONAL_METADATA_RESULTS_DSA_JSON/DEFAULT_LABELS\n", "2021-12-10 21:54:44,352 - INFO - luna.common.config - loading config file /home/rosed2/vmount/conf/datastore.cfg\n", "2021-12-10 21:54:44,360 - INFO - luna.common.DataStore - Configured datastore with {'GRAPH_STORE_ENABLED': False, 'GRAPH_URI': 'neo4j://localhost:7687', 'GRAPH_USER': 'neo4j', 'GRAPH_PASSWORD': 'password', 'OBJECT_STORE_ENABLED': False, 'MINIO_URI': 'localhost:8001', 'MINIO_USER': 'minio', 'MINIO_PASSWORD': 'password', 'DOC_STORE_ENABLED': False, 'MONGODB_URI': 'mongodb://localhost:27017/'}\n", "2021-12-10 21:54:44,361 - INFO - luna.common.DataStore - Save -> ../PRO_12-123/slides/01OV008-7579323e-2fae-43a9-b00f-a15c28/CONCAT/REGIONAL_METADATA_RESULTS/DEFAULT_LABELS\n", "2021-12-10 21:54:44,363 - INFO - luna.common.DataStore - Datstore file backend= ../PRO_12-123/slides\n", "2021-12-10 21:54:44,365 - INFO - luna.pathology.cli.dsa.dsa_annotations - Annotation for slide 01OV007-9b90eb78-2f50-4aeb-b010-d642f9.svs generated successfully\n", "2021-12-10 21:54:44,377 - INFO - luna.common.DataStore - Save -> ../PRO_12-123/slides/01OV007-9b90eb78-2f50-4aeb-b010-d642f9/admin/REGIONAL_METADATA_RESULTS_DSA_JSON/DEFAULT_LABELS\n", 
"2021-12-10 21:54:44,377 - INFO - luna.pathology.cli.dsa.dsa_annotations - Annotation for slide 01OV007-9b90eb78-2f50-4aeb-b010-d642f9.svs generated successfully\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "2021-12-10 21:54:44,388 - INFO - luna.pathology.cli.dsa.dsa_annotations - Annotation for slide 01OV007-9b90eb78-2f50-4aeb-b010-d642f9.svs generated successfully\n", "2021-12-10 21:54:44,401 - INFO - luna.common.DataStore - Save -> ../PRO_12-123/slides/01OV007-9b90eb78-2f50-4aeb-b010-d642f9/CONCAT/REGIONAL_METADATA_RESULTS/DEFAULT_LABELS\n", "2021-12-10 21:54:44,404 - INFO - luna.pathology.cli.dsa.dsa_annotations - Annotation for slide 01OV007-9b90eb78-2f50-4aeb-b010-d642f9.svs generated successfully\n", "2021-12-10 21:54:44,415 - INFO - luna.pathology.cli.dsa.dsa_annotations - Annotation for slide 01OV007-9b90eb78-2f50-4aeb-b010-d642f9.svs generated successfully\n", "2021-12-10 21:54:45,832 - INFO - root - Code block 'generate DSA annotation geojson table' took: 5.0578504719997s\n" ] } ], "source": [ "!dsa_annotation \\\n", "-d ../conf/dsa_regional_annotation.yaml \\\n", "-a ../conf/dsa_app_config.yaml \\\n", "-u admin -p password" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Once the ETL is done, you can check the results in the following locations:\n", "\n", "- Configurations are saved in `~/vmount/PRO_12-123/configs/REGIONAL_METADATA_RESULTS`\n", "- The parquet table is saved in `~/vmount/PRO_12-123/tables/REGIONAL_METADATA_RESULTS`\n", "- DSA JSON annotations and converted GeoJSONs are saved in `~/vmount/PRO_12-123/slides/`\n" ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "scrolled": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "total 0\n", "drwxr-xr-x 3 rosed2 rosed2 96 Dec 10 21:54 configs\n", "drwxr-xr-x 7 rosed2 rosed2 224 Dec 10 21:54 slides\n", "drwxr-xr-x 3 rosed2 rosed2 96 Dec 10 21:54 tables\n", "total 8.0K\n", "-rwxr-xr-x 1 rosed2 rosed2 86 Dec 10 21:54 app_config.yaml\n", 
"-rwxr-xr-x 1 rosed2 rosed2 740 Dec 10 21:54 data_config.yaml\n", "total 12K\n", "-rw-r--r-- 1 rosed2 rosed2 9.9K Dec 10 21:54 REGIONAL_METADATA_RESULTS.parquet\n", "total 0\n", "drwxr-xr-x 3 rosed2 rosed2 96 Dec 10 21:54 CONCAT\n", "drwxr-xr-x 3 rosed2 rosed2 96 Dec 10 21:54 admin\n" ] } ], "source": [ "!ls -lh ~/vmount/PRO_12-123\n", "!ls -lh ~/vmount/PRO_12-123/configs/REGIONAL_METADATA_RESULTS\n", "!ls -lh ~/vmount/PRO_12-123/tables/REGIONAL_METADATA_RESULTS\n", "!ls -lh ~/vmount/PRO_12-123/slides/01OV002-bd8cdc70-3d46-40ae-99c4-90ef77" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The resulting table catalogs annotation metadata: project name, annotation name, slide ID, the annotating user, creation and update times, annotation type, and paths to the DSA JSON and the converted GeoJSON.\n", "\n", "The parquet table can be read with pyarrow as seen below:" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>project_name</th><th>slide_id</th><th>user</th><th>dsa_json_path</th><th>geojson_path</th><th>date_updated</th><th>date_created</th><th>labelset</th><th>annotation_name</th><th>annotation_type</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>pathology-tutorial</td><td>01OV002-ed65cf94-8bc6-492b-9149-adc16f</td><td>CONCAT</td><td>../PRO_12-123/slides/01OV002-ed65cf94-8bc6-492...</td><td>../PRO_12-123/slides/01OV002-ed65cf94-8bc6-492...</td><td>2021-12-01T20:50:24.359000+00:00</td><td>2021-10-01T14:53:56.716000+00:00</td><td>DEFAULT_LABELS</td><td>ov_regional</td><td>RegionalAnnotationJSON</td></tr>\n",
"    <tr><th>1</th><td>pathology-tutorial</td><td>01OV008-308ad404-7079-4ff8-8232-12ee2e</td><td>CONCAT</td><td>../PRO_12-123/slides/01OV008-308ad404-7079-4ff...</td><td>../PRO_12-123/slides/01OV008-308ad404-7079-4ff...</td><td>2021-12-01T20:54:49.699000+00:00</td><td>2021-10-01T14:21:41.810000+00:00</td><td>DEFAULT_LABELS</td><td>ov_regional</td><td>RegionalAnnotationJSON</td></tr>\n",
"    <tr><th>2</th><td>pathology-tutorial</td><td>01OV002-bd8cdc70-3d46-40ae-99c4-90ef77</td><td>CONCAT</td><td>../PRO_12-123/slides/01OV002-bd8cdc70-3d46-40a...</td><td>../PRO_12-123/slides/01OV002-bd8cdc70-3d46-40a...</td><td>2021-12-01T20:47:11.551000+00:00</td><td>2021-10-01T14:59:02.979000+00:00</td><td>DEFAULT_LABELS</td><td>ov_regional</td><td>RegionalAnnotationJSON</td></tr>\n",
"    <tr><th>3</th><td>pathology-tutorial</td><td>01OV008-7579323e-2fae-43a9-b00f-a15c28</td><td>CONCAT</td><td>../PRO_12-123/slides/01OV008-7579323e-2fae-43a...</td><td>../PRO_12-123/slides/01OV008-7579323e-2fae-43a...</td><td>2021-12-01T20:55:54.516000+00:00</td><td>2021-10-01T13:59:56.123000+00:00</td><td>DEFAULT_LABELS</td><td>ov_regional</td><td>RegionalAnnotationJSON</td></tr>\n",
"    <tr><th>4</th><td>pathology-tutorial</td><td>01OV007-9b90eb78-2f50-4aeb-b010-d642f9</td><td>CONCAT</td><td>../PRO_12-123/slides/01OV007-9b90eb78-2f50-4ae...</td><td>../PRO_12-123/slides/01OV007-9b90eb78-2f50-4ae...</td><td>2021-12-01T20:52:39.769000+00:00</td><td>2021-10-01T14:44:44.514000+00:00</td><td>DEFAULT_LABELS</td><td>ov_regional</td><td>RegionalAnnotationJSON</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " project_name slide_id user \\\n", "0 pathology-tutorial 01OV002-ed65cf94-8bc6-492b-9149-adc16f CONCAT \n", "1 pathology-tutorial 01OV008-308ad404-7079-4ff8-8232-12ee2e CONCAT \n", "2 pathology-tutorial 01OV002-bd8cdc70-3d46-40ae-99c4-90ef77 CONCAT \n", "3 pathology-tutorial 01OV008-7579323e-2fae-43a9-b00f-a15c28 CONCAT \n", "4 pathology-tutorial 01OV007-9b90eb78-2f50-4aeb-b010-d642f9 CONCAT \n", "\n", " dsa_json_path \\\n", "0 ../PRO_12-123/slides/01OV002-ed65cf94-8bc6-492... \n", "1 ../PRO_12-123/slides/01OV008-308ad404-7079-4ff... \n", "2 ../PRO_12-123/slides/01OV002-bd8cdc70-3d46-40a... \n", "3 ../PRO_12-123/slides/01OV008-7579323e-2fae-43a... \n", "4 ../PRO_12-123/slides/01OV007-9b90eb78-2f50-4ae... \n", "\n", " geojson_path \\\n", "0 ../PRO_12-123/slides/01OV002-ed65cf94-8bc6-492... \n", "1 ../PRO_12-123/slides/01OV008-308ad404-7079-4ff... \n", "2 ../PRO_12-123/slides/01OV002-bd8cdc70-3d46-40a... \n", "3 ../PRO_12-123/slides/01OV008-7579323e-2fae-43a... \n", "4 ../PRO_12-123/slides/01OV007-9b90eb78-2f50-4ae... 
 \n", "\n", " date_updated date_created \\\n", "0 2021-12-01T20:50:24.359000+00:00 2021-10-01T14:53:56.716000+00:00 \n", "1 2021-12-01T20:54:49.699000+00:00 2021-10-01T14:21:41.810000+00:00 \n", "2 2021-12-01T20:47:11.551000+00:00 2021-10-01T14:59:02.979000+00:00 \n", "3 2021-12-01T20:55:54.516000+00:00 2021-10-01T13:59:56.123000+00:00 \n", "4 2021-12-01T20:52:39.769000+00:00 2021-10-01T14:44:44.514000+00:00 \n", "\n", " labelset annotation_name annotation_type \n", "0 DEFAULT_LABELS ov_regional RegionalAnnotationJSON \n", "1 DEFAULT_LABELS ov_regional RegionalAnnotationJSON \n", "2 DEFAULT_LABELS ov_regional RegionalAnnotationJSON \n", "3 DEFAULT_LABELS ov_regional RegionalAnnotationJSON \n", "4 DEFAULT_LABELS ov_regional RegionalAnnotationJSON " ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pyarrow.parquet as pq\n", "\n", "regional_annotation_table = pq.read_table('../PRO_12-123/tables/REGIONAL_METADATA_RESULTS').to_pandas()\n", "regional_annotation_table" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Run point annotation ETL" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We use the same CLI for downloading point annotations.\n", "\n", "Your data config file should use a different `ANNOTATION_NAME`, and `DATA_TYPE` should be set to `POINT_GEOJSON`." 
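, "\n", "\n",
"As a rough sketch, the two keys named above might look like this in the point annotation data config. The `ANNOTATION_NAME` value `ov_point` is a hypothetical placeholder, not taken from the tutorial data, and every other key is assumed to stay the same as in the regional config.\n",
"\n",
"```yaml\n",
"# Partial fragment only: keep the remaining keys from the regional config.\n",
"DATA_TYPE: POINT_GEOJSON\n",
"# Hypothetical name; use the annotation document name you created in DSA.\n",
"ANNOTATION_NAME: ov_point\n",
"```"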
] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2021-12-10 21:56:37,724 - INFO - root - FYI: Initalized logger, log file at: data-processing.log with handlers: [ (INFO)>, ]\n", "2021-12-10 21:56:37,726 - INFO - luna.common.config - loading config file ../conf/dsa_point_annotation.yaml\n", "2021-12-10 21:56:37,730 - INFO - luna.common.config - loading config file ../conf/dsa_app_config.yaml\n", "2021-12-10 21:56:37,732 - INFO - root - data template: ../conf/dsa_point_annotation.yaml\n", "2021-12-10 21:56:37,734 - INFO - root - config_file: ../conf/dsa_app_config.yaml\n", "2021-12-10 21:56:37,780 - INFO - root - config files copied to ../PRO_12-123/configs/POINT_GEOJSON\n", "2021-12-10 21:56:37,885 - INFO - luna.pathology.cli.dsa.dsa_annotations - Table output directory: ../PRO_12-123/tables/POINT_GEOJSON\n", "Successfully connected to DSA\n", "collection_id_dict {'_accessLevel': 2, '_id': '60edf904f398e364deb34dfa', '_modelType': 'collection', '_textScore': 15.0, 'created': '2021-07-13T20:35:16.195000+00:00', 'description': '', 'meta': {'stylesheet': [{'fillColor': 'rgb(255, 0, 0)', 'group': 'regional_tumor', 'id': 'regional_tumor', 'label': {'value': 'regional_tumor'}, 'lineColor': 'rgb(255, 0, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 255, 0)', 'group': 'regional_stroma', 'id': 'regional_stroma', 'label': {'value': 'regional_stroma'}, 'lineColor': 'rgb(0, 255, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 0, 255)', 'group': 'regional_fat', 'id': 'regional_fat', 'label': {'value': 'regional_fat'}, 'lineColor': 'rgb(0, 0, 255)', 'lineWidth': 2}, {'fillColor': 'rgb(255, 0, 0)', 'group': 'point_lymphocytes', 'id': 'point_lymphocytes', 'label': {'value': 'point_lymphocytes'}, 'lineColor': 'rgb(255, 0, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 255, 0)', 'group': 'point_other', 'id': 'point_other', 'label': {'value': 'point_other'}, 'lineColor': 'rgb(0, 255, 0)', 'lineWidth': 2}]}, 
'name': 'pathology-tutorial', 'public': False, 'publicFlags': [], 'size': 3161997775, 'updated': '2021-12-07T17:48:13.526000+00:00'}\n", "Collection pathology-tutorial found with id: 60edf904f398e364deb34dfa\n", "retreived collection uuid\n", "2021-12-10 21:56:38,567 - INFO - luna.pathology.cli.dsa.dsa_annotations - Retrieved collection metadata\n", "2021-12-10 21:56:40,056 - INFO - root - FYI: Initalized logger, log file at: data-processing.log with handlers: [ (INFO)>, ]\n", "2021-12-10 21:56:40,058 - INFO - root - FYI: Initalized logger, log file at: data-processing.log with handlers: [ (INFO)>, ]\n", "2021-12-10 21:56:40,059 - INFO - root - FYI: Initalized logger, log file at: data-processing.log with handlers: [ (INFO)>, ]\n", "2021-12-10 21:56:40,061 - INFO - luna.pathology.cli.dsa.dsa_annotations - \n", "collection_id_dict {'_accessLevel': 2, '_id': '60edf904f398e364deb34dfa', '_modelType': 'collection', '_textScore': 15.0, 'created': '2021-07-13T20:35:16.195000+00:00', 'description': '', 'meta': {'stylesheet': [{'fillColor': 'rgb(255, 0, 0)', 'group': 'regional_tumor', 'id': 'regional_tumor', 'label': {'value': 'regional_tumor'}, 'lineColor': 'rgb(255, 0, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 255, 0)', 'group': 'regional_stroma', 'id': 'regional_stroma', 'label': {'value': 'regional_stroma'}, 'lineColor': 'rgb(0, 255, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 0, 255)', 'group': 'regional_fat', 'id': 'regional_fat', 'label': {'value': 'regional_fat'}, 'lineColor': 'rgb(0, 0, 255)', 'lineWidth': 2}, {'fillColor': 'rgb(255, 0, 0)', 'group': 'point_lymphocytes', 'id': 'point_lymphocytes', 'label': {'value': 'point_lymphocytes'}, 'lineColor': 'rgb(255, 0, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 255, 0)', 'group': 'point_other', 'id': 'point_other', 'label': {'value': 'point_other'}, 'lineColor': 'rgb(0, 255, 0)', 'lineWidth': 2}]}, 'name': 'pathology-tutorial', 'public': False, 'publicFlags': [], 'size': 3161997775, 'updated': 
'2021-12-07T17:48:13.526000+00:00'}\n", "Collection pathology-tutorial found with id: 60edf904f398e364deb34dfa\n", "collection_id_dict {'_accessLevel': 2, '_id': '60edf904f398e364deb34dfa', '_modelType': 'collection', '_textScore': 15.0, 'created': '2021-07-13T20:35:16.195000+00:00', 'description': '', 'meta': {'stylesheet': [{'fillColor': 'rgb(255, 0, 0)', 'group': 'regional_tumor', 'id': 'regional_tumor', 'label': {'value': 'regional_tumor'}, 'lineColor': 'rgb(255, 0, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 255, 0)', 'group': 'regional_stroma', 'id': 'regional_stroma', 'label': {'value': 'regional_stroma'}, 'lineColor': 'rgb(0, 255, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 0, 255)', 'group': 'regional_fat', 'id': 'regional_fat', 'label': {'value': 'regional_fat'}, 'lineColor': 'rgb(0, 0, 255)', 'lineWidth': 2}, {'fillColor': 'rgb(255, 0, 0)', 'group': 'point_lymphocytes', 'id': 'point_lymphocytes', 'label': {'value': 'point_lymphocytes'}, 'lineColor': 'rgb(255, 0, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 255, 0)', 'group': 'point_other', 'id': 'point_other', 'label': {'value': 'point_other'}, 'lineColor': 'rgb(0, 255, 0)', 'lineWidth': 2}]}, 'name': 'pathology-tutorial', 'public': False, 'publicFlags': [], 'size': 3161997775, 'updated': '2021-12-07T17:48:13.526000+00:00'}\n", "Collection pathology-tutorial found with id: 60edf904f398e364deb34dfa\n", "collection_id_dict {'_accessLevel': 2, '_id': '60edf904f398e364deb34dfa', '_modelType': 'collection', '_textScore': 15.0, 'created': '2021-07-13T20:35:16.195000+00:00', 'description': '', 'meta': {'stylesheet': [{'fillColor': 'rgb(255, 0, 0)', 'group': 'regional_tumor', 'id': 'regional_tumor', 'label': {'value': 'regional_tumor'}, 'lineColor': 'rgb(255, 0, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 255, 0)', 'group': 'regional_stroma', 'id': 'regional_stroma', 'label': {'value': 'regional_stroma'}, 'lineColor': 'rgb(0, 255, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 0, 255)', 'group': 'regional_fat', 
'id': 'regional_fat', 'label': {'value': 'regional_fat'}, 'lineColor': 'rgb(0, 0, 255)', 'lineWidth': 2}, {'fillColor': 'rgb(255, 0, 0)', 'group': 'point_lymphocytes', 'id': 'point_lymphocytes', 'label': {'value': 'point_lymphocytes'}, 'lineColor': 'rgb(255, 0, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 255, 0)', 'group': 'point_other', 'id': 'point_other', 'label': {'value': 'point_other'}, 'lineColor': 'rgb(0, 255, 0)', 'lineWidth': 2}]}, 'name': 'pathology-tutorial', 'public': False, 'publicFlags': [], 'size': 3161997775, 'updated': '2021-12-07T17:48:13.526000+00:00'}\n", "Collection pathology-tutorial found with id: 60edf904f398e364deb34dfa\n", "Image file 01OV008-308ad404-7079-4ff8-8232-12ee2e.svs found with id: 6156306db8ba89a9c64590a4\n", "Starting request for annotation\n", "Image file 01OV002-ed65cf94-8bc6-492b-9149-adc16f.svs found with id: 61562f90b8ba89a9c645909e\n", "Image file 01OV002-bd8cdc70-3d46-40ae-99c4-90ef77.svs found with id: 61563001b8ba89a9c64590a1\n", "Starting request for annotation\n", "Starting request for annotation\n", "Annotiaton not found for slide: 01OV008-308ad404-7079-4ff8-8232-12ee2e.svs and annotation name: point\n", "collection_id_dict {'_accessLevel': 2, '_id': '60edf904f398e364deb34dfa', '_modelType': 'collection', '_textScore': 15.0, 'created': '2021-07-13T20:35:16.195000+00:00', 'description': '', 'meta': {'stylesheet': [{'fillColor': 'rgb(255, 0, 0)', 'group': 'regional_tumor', 'id': 'regional_tumor', 'label': {'value': 'regional_tumor'}, 'lineColor': 'rgb(255, 0, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 255, 0)', 'group': 'regional_stroma', 'id': 'regional_stroma', 'label': {'value': 'regional_stroma'}, 'lineColor': 'rgb(0, 255, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 0, 255)', 'group': 'regional_fat', 'id': 'regional_fat', 'label': {'value': 'regional_fat'}, 'lineColor': 'rgb(0, 0, 255)', 'lineWidth': 2}, {'fillColor': 'rgb(255, 0, 0)', 'group': 'point_lymphocytes', 'id': 'point_lymphocytes', 'label': {'value': 
'point_lymphocytes'}, 'lineColor': 'rgb(255, 0, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 255, 0)', 'group': 'point_other', 'id': 'point_other', 'label': {'value': 'point_other'}, 'lineColor': 'rgb(0, 255, 0)', 'lineWidth': 2}]}, 'name': 'pathology-tutorial', 'public': False, 'publicFlags': [], 'size': 3161997775, 'updated': '2021-12-07T17:48:13.526000+00:00'}\n", "Collection pathology-tutorial found with id: 60edf904f398e364deb34dfa\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Image file 01OV007-9b90eb78-2f50-4aeb-b010-d642f9.svs found with id: 61562edeb8ba89a9c645909b\n", "Starting request for annotation\n", "collection_id_dict {'_accessLevel': 2, '_id': '60edf904f398e364deb34dfa', '_modelType': 'collection', '_textScore': 15.0, 'created': '2021-07-13T20:35:16.195000+00:00', 'description': '', 'meta': {'stylesheet': [{'fillColor': 'rgb(255, 0, 0)', 'group': 'regional_tumor', 'id': 'regional_tumor', 'label': {'value': 'regional_tumor'}, 'lineColor': 'rgb(255, 0, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 255, 0)', 'group': 'regional_stroma', 'id': 'regional_stroma', 'label': {'value': 'regional_stroma'}, 'lineColor': 'rgb(0, 255, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 0, 255)', 'group': 'regional_fat', 'id': 'regional_fat', 'label': {'value': 'regional_fat'}, 'lineColor': 'rgb(0, 0, 255)', 'lineWidth': 2}, {'fillColor': 'rgb(255, 0, 0)', 'group': 'point_lymphocytes', 'id': 'point_lymphocytes', 'label': {'value': 'point_lymphocytes'}, 'lineColor': 'rgb(255, 0, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 255, 0)', 'group': 'point_other', 'id': 'point_other', 'label': {'value': 'point_other'}, 'lineColor': 'rgb(0, 255, 0)', 'lineWidth': 2}]}, 'name': 'pathology-tutorial', 'public': False, 'publicFlags': [], 'size': 3161997775, 'updated': '2021-12-07T17:48:13.526000+00:00'}\n", "Collection pathology-tutorial found with id: 60edf904f398e364deb34dfa\n", "collection_id_dict {'_accessLevel': 2, '_id': '60edf904f398e364deb34dfa', '_modelType': 
'collection', '_textScore': 15.0, 'created': '2021-07-13T20:35:16.195000+00:00', 'description': '', 'meta': {'stylesheet': [{'fillColor': 'rgb(255, 0, 0)', 'group': 'regional_tumor', 'id': 'regional_tumor', 'label': {'value': 'regional_tumor'}, 'lineColor': 'rgb(255, 0, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 255, 0)', 'group': 'regional_stroma', 'id': 'regional_stroma', 'label': {'value': 'regional_stroma'}, 'lineColor': 'rgb(0, 255, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 0, 255)', 'group': 'regional_fat', 'id': 'regional_fat', 'label': {'value': 'regional_fat'}, 'lineColor': 'rgb(0, 0, 255)', 'lineWidth': 2}, {'fillColor': 'rgb(255, 0, 0)', 'group': 'point_lymphocytes', 'id': 'point_lymphocytes', 'label': {'value': 'point_lymphocytes'}, 'lineColor': 'rgb(255, 0, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 255, 0)', 'group': 'point_other', 'id': 'point_other', 'label': {'value': 'point_other'}, 'lineColor': 'rgb(0, 255, 0)', 'lineWidth': 2}]}, 'name': 'pathology-tutorial', 'public': False, 'publicFlags': [], 'size': 3161997775, 'updated': '2021-12-07T17:48:13.526000+00:00'}\n", "Collection pathology-tutorial found with id: 60edf904f398e364deb34dfa\n", "Image file 01OV008-7579323e-2fae-43a9-b00f-a15c28.svs found with id: 61562e72b8ba89a9c6459098\n", "Starting request for annotation\n", "Image file 2551571.svs found with id: 60f1715cf398e364deb3befb\n", "Starting request for annotation\n", "Annotiaton not found for slide: 01OV008-7579323e-2fae-43a9-b00f-a15c28.svs and annotation name: point\n", "Annotiaton not found for slide: 2551571.svs and annotation name: point\n", "2021-12-10 21:56:40,933 - INFO - luna.common.config - loading config file /home/rosed2/vmount/conf/datastore.cfg\n", "2021-12-10 21:56:40,934 - INFO - luna.common.config - loading config file /home/rosed2/vmount/conf/datastore.cfg\n", "2021-12-10 21:56:40,940 - INFO - luna.common.DataStore - Configured datastore with {'GRAPH_STORE_ENABLED': False, 'GRAPH_URI': 'neo4j://localhost:7687', 
'GRAPH_USER': 'neo4j', 'GRAPH_PASSWORD': 'password', 'OBJECT_STORE_ENABLED': False, 'MINIO_URI': 'localhost:8001', 'MINIO_USER': 'minio', 'MINIO_PASSWORD': 'password', 'DOC_STORE_ENABLED': False, 'MONGODB_URI': 'mongodb://localhost:27017/'}\n", "2021-12-10 21:56:40,942 - INFO - luna.common.DataStore - Configured datastore with {'GRAPH_STORE_ENABLED': False, 'GRAPH_URI': 'neo4j://localhost:7687', 'GRAPH_USER': 'neo4j', 'GRAPH_PASSWORD': 'password', 'OBJECT_STORE_ENABLED': False, 'MINIO_URI': 'localhost:8001', 'MINIO_USER': 'minio', 'MINIO_PASSWORD': 'password', 'DOC_STORE_ENABLED': False, 'MONGODB_URI': 'mongodb://localhost:27017/'}\n", "2021-12-10 21:56:40,944 - INFO - luna.common.DataStore - Datstore file backend= ../PRO_12-123/slides\n", "2021-12-10 21:56:40,944 - INFO - luna.common.DataStore - Datstore file backend= ../PRO_12-123/slides\n", "2021-12-10 21:56:40,949 - INFO - luna.common.DataStore - Save -> ../PRO_12-123/slides/01OV002-bd8cdc70-3d46-40ae-99c4-90ef77/admin/POINT_GEOJSON_DSA_JSON/DEFAULT_LABELS\n", "2021-12-10 21:56:40,950 - INFO - luna.common.DataStore - Save -> ../PRO_12-123/slides/01OV002-ed65cf94-8bc6-492b-9149-adc16f/admin/POINT_GEOJSON_DSA_JSON/DEFAULT_LABELS\n", "2021-12-10 21:56:40,967 - INFO - luna.common.DataStore - Save -> ../PRO_12-123/slides/01OV002-bd8cdc70-3d46-40ae-99c4-90ef77/CONCAT/POINT_GEOJSON/DEFAULT_LABELS\n", "2021-12-10 21:56:40,967 - INFO - luna.common.DataStore - Save -> ../PRO_12-123/slides/01OV002-ed65cf94-8bc6-492b-9149-adc16f/CONCAT/POINT_GEOJSON/DEFAULT_LABELS\n", "2021-12-10 21:56:40,973 - INFO - luna.common.DataStore - Configured datastore with {'GRAPH_STORE_ENABLED': False, 'GRAPH_URI': 'neo4j://localhost:7687', 'GRAPH_USER': 'neo4j', 'GRAPH_PASSWORD': 'password', 'OBJECT_STORE_ENABLED': False, 'MINIO_URI': 'localhost:8001', 'MINIO_USER': 'minio', 'MINIO_PASSWORD': 'password', 'DOC_STORE_ENABLED': False, 'MONGODB_URI': 'mongodb://localhost:27017/'}\n", "2021-12-10 21:56:40,976 - INFO - luna.common.DataStore - 
Datstore file backend= ../PRO_12-123/slides\n", "2021-12-10 21:56:40,979 - INFO - luna.common.DataStore - Save -> ../PRO_12-123/slides/01OV007-9b90eb78-2f50-4aeb-b010-d642f9/admin/POINT_GEOJSON_DSA_JSON/DEFAULT_LABELS\n", "2021-12-10 21:56:40,988 - INFO - luna.common.DataStore - Save -> ../PRO_12-123/slides/01OV007-9b90eb78-2f50-4aeb-b010-d642f9/CONCAT/POINT_GEOJSON/DEFAULT_LABELS\n", "collection_id_dict {'_accessLevel': 2, '_id': '60edf904f398e364deb34dfa', '_modelType': 'collection', '_textScore': 15.0, 'created': '2021-07-13T20:35:16.195000+00:00', 'description': '', 'meta': {'stylesheet': [{'fillColor': 'rgb(255, 0, 0)', 'group': 'regional_tumor', 'id': 'regional_tumor', 'label': {'value': 'regional_tumor'}, 'lineColor': 'rgb(255, 0, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 255, 0)', 'group': 'regional_stroma', 'id': 'regional_stroma', 'label': {'value': 'regional_stroma'}, 'lineColor': 'rgb(0, 255, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 0, 255)', 'group': 'regional_fat', 'id': 'regional_fat', 'label': {'value': 'regional_fat'}, 'lineColor': 'rgb(0, 0, 255)', 'lineWidth': 2}, {'fillColor': 'rgb(255, 0, 0)', 'group': 'point_lymphocytes', 'id': 'point_lymphocytes', 'label': {'value': 'point_lymphocytes'}, 'lineColor': 'rgb(255, 0, 0)', 'lineWidth': 2}, {'fillColor': 'rgb(0, 255, 0)', 'group': 'point_other', 'id': 'point_other', 'label': {'value': 'point_other'}, 'lineColor': 'rgb(0, 255, 0)', 'lineWidth': 2}]}, 'name': 'pathology-tutorial', 'public': False, 'publicFlags': [], 'size': 3161997775, 'updated': '2021-12-07T17:48:13.526000+00:00'}\n", "Collection pathology-tutorial found with id: 60edf904f398e364deb34dfa\n", "Image file 2551129.svs found with id: 60edf9d1f398e364deb34dfd\n", "Starting request for annotation\n", "Annotiaton not found for slide: 2551129.svs and annotation name: point\n", "2021-12-10 21:56:41,199 - INFO - luna.pathology.cli.dsa.dsa_annotations - Annotation for slide 01OV007-9b90eb78-2f50-4aeb-b010-d642f9.svs generated 
successfully\n", "2021-12-10 21:56:41,205 - INFO - luna.pathology.cli.dsa.dsa_annotations - Annotation for slide 01OV007-9b90eb78-2f50-4aeb-b010-d642f9.svs generated successfully\n", "2021-12-10 21:56:41,211 - INFO - luna.pathology.cli.dsa.dsa_annotations - Annotation for slide 01OV007-9b90eb78-2f50-4aeb-b010-d642f9.svs generated successfully\n", "2021-12-10 21:56:42,625 - INFO - root - Code block 'generate DSA annotation geojson table' took: 4.892569079999703s\n" ] } ], "source": [ "# Will be replaced with the entrypoint (dsa_annotations)\n", "!dsa_annotation \\\n", "-d ../conf/dsa_point_annotation.yaml \\\n", "-a ../conf/dsa_app_config.yaml \\\n", "-u admin -p password" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Once the ETL is done, you can check the results in the following locations:\n", "\n", "- Configurations are saved in `~/vmount/PRO_12-123/configs/POINT_GEOJSON`\n", "- The Parquet table is saved in `~/vmount/PRO_12-123/tables/POINT_GEOJSON`\n", "- DSA JSON annotations and the converted GeoJSON files are saved in `~/vmount/PRO_12-123/slides/`\n", "\n", "Note that only 3 slides have point annotations, and that these annotations were made by a non-expert for demo purposes only.\n" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "total 8.0K\n", "-rwxr-xr-x 1 rosed2 rosed2 86 Dec 10 21:56 app_config.yaml\n", "-rwxr-xr-x 1 rosed2 rosed2 724 Dec 10 21:56 data_config.yaml\n", "total 12K\n", "-rw-r--r-- 1 rosed2 rosed2 9.4K Dec 10 21:56 POINT_GEOJSON.parquet\n", "total 0\n", "drwxr-xr-x 3 rosed2 rosed2 96 Dec 10 21:56 POINT_GEOJSON\n", "drwxr-xr-x 3 rosed2 rosed2 96 Dec 10 21:54 REGIONAL_METADATA_RESULTS\n" ] } ], "source": [ "!ls -lh ~/vmount/PRO_12-123/configs/POINT_GEOJSON\n", "!ls -lh ~/vmount/PRO_12-123/tables/POINT_GEOJSON\n", "!ls -lh ~/vmount/PRO_12-123/slides/01OV002-bd8cdc70-3d46-40ae-99c4-90ef77/CONCAT" ] }, { "cell_type": "markdown", "metadata": {}, 
"source": [ "As with the regional annotation table, this table records metadata for each point annotation, including the annotation type and the paths to the DSA JSON and the converted GeoJSON.\n", "\n", "The Parquet table can be read with pyarrow, as shown below:" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n", "    <tr style=\"text-align: right;\"><th></th><th>project_name</th><th>slide_id</th><th>user</th><th>dsa_json_path</th><th>geojson_path</th><th>date_updated</th><th>date_created</th><th>labelset</th><th>annotation_name</th><th>annotation_type</th></tr>\n", "  </thead>\n", "  <tbody>\n",
"    <tr><th>0</th><td>pathology-tutorial</td><td>01OV002-bd8cdc70-3d46-40ae-99c4-90ef77</td><td>CONCAT</td><td>../PRO_12-123/slides/01OV002-bd8cdc70-3d46-40a...</td><td>../PRO_12-123/slides/01OV002-bd8cdc70-3d46-40a...</td><td>2021-12-07T17:18:12.856000+00:00</td><td>2021-12-07T17:04:27.150000+00:00</td><td>DEFAULT_LABELS</td><td>point</td><td>PointAnnotationJSON</td></tr>\n",
"    <tr><th>1</th><td>pathology-tutorial</td><td>01OV002-ed65cf94-8bc6-492b-9149-adc16f</td><td>CONCAT</td><td>../PRO_12-123/slides/01OV002-ed65cf94-8bc6-492...</td><td>../PRO_12-123/slides/01OV002-ed65cf94-8bc6-492...</td><td>2021-12-07T17:46:28.147000+00:00</td><td>2021-12-07T16:56:07.014000+00:00</td><td>DEFAULT_LABELS</td><td>point</td><td>PointAnnotationJSON</td></tr>\n",
"    <tr><th>2</th><td>pathology-tutorial</td><td>01OV007-9b90eb78-2f50-4aeb-b010-d642f9</td><td>CONCAT</td><td>../PRO_12-123/slides/01OV007-9b90eb78-2f50-4ae...</td><td>../PRO_12-123/slides/01OV007-9b90eb78-2f50-4ae...</td><td>2021-12-07T17:22:18.185000+00:00</td><td>2021-12-07T17:18:49.703000+00:00</td><td>DEFAULT_LABELS</td><td>point</td><td>PointAnnotationJSON</td></tr>\n",
"  </tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " project_name slide_id user \\\n", "0 pathology-tutorial 01OV002-bd8cdc70-3d46-40ae-99c4-90ef77 CONCAT \n", "1 pathology-tutorial 01OV002-ed65cf94-8bc6-492b-9149-adc16f CONCAT \n", "2 pathology-tutorial 01OV007-9b90eb78-2f50-4aeb-b010-d642f9 CONCAT \n", "\n", " dsa_json_path \\\n", "0 ../PRO_12-123/slides/01OV002-bd8cdc70-3d46-40a... \n", "1 ../PRO_12-123/slides/01OV002-ed65cf94-8bc6-492... \n", "2 ../PRO_12-123/slides/01OV007-9b90eb78-2f50-4ae... \n", "\n", " geojson_path \\\n", "0 ../PRO_12-123/slides/01OV002-bd8cdc70-3d46-40a... \n", "1 ../PRO_12-123/slides/01OV002-ed65cf94-8bc6-492... \n", "2 ../PRO_12-123/slides/01OV007-9b90eb78-2f50-4ae... \n", "\n", " date_updated date_created \\\n", "0 2021-12-07T17:18:12.856000+00:00 2021-12-07T17:04:27.150000+00:00 \n", "1 2021-12-07T17:46:28.147000+00:00 2021-12-07T16:56:07.014000+00:00 \n", "2 2021-12-07T17:22:18.185000+00:00 2021-12-07T17:18:49.703000+00:00 \n", "\n", " labelset annotation_name annotation_type \n", "0 DEFAULT_LABELS point PointAnnotationJSON \n", "1 DEFAULT_LABELS point PointAnnotationJSON \n", "2 DEFAULT_LABELS point PointAnnotationJSON " ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pyarrow.parquet as pq\n", "\n", "point_annotation_table = pq.read_table('../PRO_12-123/tables/POINT_GEOJSON').to_pandas()\n", "point_annotation_table" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.9" } }, "nbformat": 4, "nbformat_minor": 4 }