pyluna-pathology.luna.pathology.common package

Submodules

pyluna-pathology.luna.pathology.common.EnsureByteContext module

Created on November 04, 2020

@author: aukermaa@mskcc.org

class luna.pathology.common.EnsureByteContext.EnsureByteContext[source]

Bases: object

pyluna-pathology.luna.pathology.common.annotation_utils module

pyluna-pathology.luna.pathology.common.build_geojson module

luna.pathology.common.build_geojson.add_contours_for_label(annotation_geojson: Dict[str, any], annotation: numpy.ndarray, label_num: int, mappings: dict, contour_level: float) Dict[str, any][source]

creates geoJSON feature dictionary for labels

Finds the contours for a label mask, builds a polygon and then converts the polygon to geoJSON feature dictionary

Parameters
  • annotation_geojson (dict[str, any]) – geoJSON result to populate

  • annotation (np.ndarray) – npy array of bitmap

  • label_num (int) – the integer corresponding to the annotated label

  • mappings (dict) – label map for specified label set

  • contour_level (float) – value along which to find contours in the array

Returns

geoJSON with label contours

Return type

dict[str, any]
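The last step described above, turning a polygon into a geoJSON feature dictionary, can be sketched as follows. This is a minimal illustration rather than the library's implementation: it assumes the contour has already been found (for instance with a routine such as skimage.measure.find_contours), and the helper names and the property keys label_num and label_name are hypothetical.

```python
from typing import Any, Dict, Sequence, Tuple


def contour_to_feature(
    contour: Sequence[Tuple[float, float]],
    label_num: int,
    mappings: Dict[int, str],
) -> Dict[str, Any]:
    """Convert one contour into a geoJSON Polygon feature (illustrative only)."""
    ring = [[float(x), float(y)] for x, y in contour]
    # geoJSON linear rings must repeat the first position at the end.
    if ring[0] != ring[-1]:
        ring.append(list(ring[0]))
    return {
        "type": "Feature",
        # Property keys are assumptions made for this sketch.
        "properties": {"label_num": label_num, "label_name": mappings[label_num]},
        "geometry": {"type": "Polygon", "coordinates": [ring]},
    }


def add_feature(
    annotation_geojson: Dict[str, Any],
    contour: Sequence[Tuple[float, float]],
    label_num: int,
    mappings: Dict[int, str],
) -> Dict[str, Any]:
    """Append the feature for one contour to the geoJSON being populated."""
    annotation_geojson["features"].append(
        contour_to_feature(contour, label_num, mappings)
    )
    return annotation_geojson


geojson = {"type": "FeatureCollection", "features": []}
square = [(0, 0), (4, 0), (4, 4), (0, 4)]
geojson = add_feature(geojson, square, 1, {1: "tumor"})
```

Closing the ring explicitly keeps the output valid even when the contour routine returns an open sequence of points.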

luna.pathology.common.build_geojson.build_all_geojsons_from_default(default_annotation_geojson: Dict[str, any], all_labelsets: List[dict], contour_level: float) dict[source]

builds geoJSON objects from a set of labels

wraps build_labelset_specific_geojson with logic to generate annotations from multiple labelsets

Parameters
  • default_annotation_geojson (dict[str, any]) – input geoJSON

  • all_labelsets (list[dict]) – a list of dictionaries containing label sets

  • contour_level (float) – value along which to find contours

Returns

a dictionary with labelset name and corresponding geoJSON as key, value pairs

Return type

dict

luna.pathology.common.build_geojson.build_default_geojson_from_annotation(annotation_npy_filepath: str, all_labelsets: dict, contour_level: float)[source]

builds geoJSONS from numpy annotation with default label set

Parameters
  • annotation_npy_filepath (str) – path to the numpy annotation file

  • all_labelsets (dict) – a dictionary of label sets

  • contour_level (float) – value along which to find contours

Returns

the default geoJSON annotation

Return type

dict[str, any]

luna.pathology.common.build_geojson.build_geojson_from_annotation(df: pandas.core.frame.DataFrame) pandas.core.frame.DataFrame[source]

Builds geoJSON for all annotation labels in the specified labelset.

Parameters

df (pandas.DataFrame) – input regional annotation table

Returns

dataframe with geoJSON field populated

Return type

pandas.DataFrame

luna.pathology.common.build_geojson.build_geojson_from_pointclick_json(labelsets: dict, labelset: str, sv_json: List[dict]) list[source]

Build geoJSON from slideviewer JSON

This method extracts point annotations from a slideviewer JSON object and converts them to a standardized geoJSON format

Parameters
  • labelsets (dict) – dictionary of label set as string (e.g. {labelset: {label_number: label_name}})

  • labelset (str) – the name of the labelset e.g. default_labels

  • sv_json (list[dict]) – annotations from slideviewer in the form of a list of dictionaries

Returns

a list of geoJSON annotation objects

Return type

list
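As a rough sketch of this conversion, the example below maps point annotations to geoJSON Point features. The labelsets layout follows the parameter description above. The slideviewer field names x, y and class, the output property keys, and the function name are assumptions made for illustration and may not match the real payload.

```python
from typing import Any, Dict, List


def points_to_geojson(
    labelsets: Dict[str, Dict[int, str]],
    labelset: str,
    sv_json: List[Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Convert point annotations to geoJSON Point features (illustrative only)."""
    label_map = labelsets[labelset]
    features = []
    for point in sv_json:
        label_num = int(point["class"])  # assumed field name
        features.append(
            {
                "type": "Feature",
                "geometry": {
                    "type": "Point",
                    "coordinates": [point["x"], point["y"]],  # assumed field names
                },
                "properties": {
                    "label_num": label_num,
                    "label_name": label_map[label_num],
                },
            }
        )
    return features


example_labelsets = {"default_labels": {0: "lymphocyte", 1: "tumor"}}
example_points = [{"x": 10, "y": 20, "class": 1}, {"x": 5, "y": 7, "class": 0}]
point_features = points_to_geojson(example_labelsets, "default_labels", example_points)
```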

luna.pathology.common.build_geojson.build_labelset_specific_geojson(default_annotation_geojson: Dict[str, any], labelset: dict) Dict[str, any][source]

builds geoJSON for labelset

Instead of working with a large geoJSON object, you can extract polygons that correspond to specific labels into a smaller object.

Parameters
  • default_annotation_geojson (dict[str, any]) – geoJSON annotation file

  • labelset (dict) – label set dictionary

Returns

geoJSON with only polygons from provided labelset

Return type

dict[str, any]
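The filtering idea can be sketched as below. The sketch assumes that each feature stores its label number under a hypothetical label_num property and that the labelset maps label numbers to names; neither detail is confirmed by this page.

```python
from typing import Any, Dict


def extract_labelset(
    default_geojson: Dict[str, Any], labelset: Dict[int, str]
) -> Dict[str, Any]:
    """Keep only the features whose label belongs to the labelset (illustrative)."""
    kept = []
    for feature in default_geojson["features"]:
        label_num = feature["properties"]["label_num"]  # assumed property key
        if label_num in labelset:
            # Copy the properties so the input geoJSON is not mutated.
            props = dict(feature["properties"], label_name=labelset[label_num])
            kept.append({**feature, "properties": props})
    return {"type": "FeatureCollection", "features": kept}


default_geojson = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"label_num": 1}, "geometry": None},
        {"type": "Feature", "properties": {"label_num": 2}, "geometry": None},
        {"type": "Feature", "properties": {"label_num": 3}, "geometry": None},
    ],
}
small_geojson = extract_labelset(default_geojson, {1: "tumor", 3: "stroma"})
```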

luna.pathology.common.build_geojson.concatenate_regional_geojsons(geojson_list: List[Dict[str, any]]) Dict[str, any][source]

concatenate regional annotations

Concatenates geoJSONs if there is more than one annotation for the labelset.

Parameters

geojson_list (list[dict[str, any]]) – list of geoJSON strings

Returns

a single concatenated geoJSON

Return type

dict[str, any]
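A minimal sketch of the concatenation, assuming each input is a geoJSON FeatureCollection dictionary with a features list. The function name is hypothetical and the real implementation may handle other fields as well.

```python
from typing import Any, Dict, List


def concatenate_geojsons(geojson_list: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Merge the features of several FeatureCollections into one (illustrative)."""
    merged: Dict[str, Any] = {"type": "FeatureCollection", "features": []}
    for geojson in geojson_list:
        merged["features"].extend(geojson.get("features", []))
    return merged


first = {"type": "FeatureCollection", "features": [{"type": "Feature", "id": "a"}]}
second = {
    "type": "FeatureCollection",
    "features": [{"type": "Feature", "id": "b"}, {"type": "Feature", "id": "c"}],
}
combined = concatenate_geojsons([first, second])
```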

luna.pathology.common.build_geojson.find_parents(polygons: list) list[source]

determines parent-child relationships of polygons

Returns a list of size n (where n is the number of polygons in the input list) in which the value at index i corresponds to the i-th polygon’s parent. In the case of no parent, -1 is used. For example, parent_nums[0] = 2 means that polygon 0’s parent is polygon 2.

Parameters

polygons (list) – a list of shapely polygon objects

Returns

a list of parent-child relationships for the polygon objects

Return type

list
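The parent list convention can be illustrated with a dependency-free sketch. The documented function takes shapely polygon objects; this sketch swaps in plain vertex lists and a ray-casting containment test. It also assumes that the parent is the smallest polygon that contains all of a child's vertices, which this page does not confirm.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def point_in_polygon(point: Point, polygon: Sequence[Point]) -> bool:
    """Ray-casting test: count edge crossings to the right of the point."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def polygon_area(polygon: Sequence[Point]) -> float:
    """Shoelace formula for the area of a simple polygon."""
    n = len(polygon)
    total = 0.0
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def find_parents_sketch(polygons: List[Sequence[Point]]) -> List[int]:
    """Return, for each polygon, the index of its smallest container or -1."""
    parents = [-1] * len(polygons)
    for child_idx, child in enumerate(polygons):
        best_area = None
        for cand_idx, cand in enumerate(polygons):
            if cand_idx == child_idx:
                continue
            if all(point_in_polygon(vertex, cand) for vertex in child):
                area = polygon_area(cand)
                if best_area is None or area < best_area:
                    best_area = area
                    parents[child_idx] = cand_idx
    return parents


outer = [(0, 0), (10, 0), (10, 10), (0, 10)]
middle = [(2, 2), (8, 2), (8, 8), (2, 8)]
inner = [(4, 4), (6, 4), (6, 6), (4, 6)]
separate = [(20, 20), (22, 20), (22, 22), (20, 22)]
parent_nums = find_parents_sketch([inner, outer, middle, separate])
# parent_nums == [2, -1, 1, -1]
```

Here parent_nums[0] = 2 because the inner square sits inside the middle square, while the outer and separate squares have no parent.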

luna.pathology.common.build_geojson.handler(signum: str, frame: str) None[source]

signal handler for geojson

Parameters
  • signum (str) – signal number

  • frame (str) – filename for which exception occurred

Returns

None

pyluna-pathology.luna.pathology.common.ml module

class luna.pathology.common.ml.BaseTorchTileClassifier(**kwargs)[source]

Bases: torch.nn.modules.module.Module

forward(index, tile_data)[source]

Forward pass for base classifier class

Loads a tile image from the tile manifest

Parameters
  • index (list[str]) – Tile address indices with length B

  • tile_data (torch.tensor) – Input tiles of shape (B, *)

Returns

Dataframe of output features

Return type

pd.DataFrame

predict(input_tiles: torch.tensor)[source]

predict method

Loads a tile image from the tile manifest. Must be manually implemented to pass the input tensor through the modules specified in setup()

Parameters

input_tiles (torch.tensor) – Input tiles of shape (B, *)

Returns

2D tensor with (B, C) where B is the batch dimension and C are output classes or features

Return type

torch.tensor

setup(**kwargs)[source]

Set classifier modules

Template/abstract method where individual modules that make up the forward pass are configured

Parameters

kwargs – Keyword arguments passed on to the subclass method

training: bool

class luna.pathology.common.ml.BaseTorchTileDataset(tile_manifest=None, tile_path=None, label_cols=[], **kwargs)[source]

Bases: torch.utils.data.dataset.Dataset

Base class for a tile dataset

Implements the usual torch dataset methods, and additionally provides a decoding of the binary tile data. PIL images can be further preprocessed before becoming torch tensors via an abstract preprocess method

Will send the tensors to the GPU if available, on the device specified by CUDA_VISIBLE_DEVICES="1"

preprocess(input_tile: PIL.Image)[source]

Preprocessing method called for each tile patch

Loads a tile image from the tile manifest. Must be manually implemented to accept a single PIL image and return a torch tensor.

Parameters

input_tile (Image) – Input tile as a PIL image

Returns

Output tile as preprocessed tensor

Return type

torch.tensor

setup(**kwargs)[source]

Set additional attributes for dataset class

Template/abstract method where a dataset is configured

Parameters

kwargs – Keyword arguments passed on to the subclass method

pyluna-pathology.luna.pathology.common.preprocess module

pyluna-pathology.luna.pathology.common.slideviewer_client module

Created on January 31, 2021

@author: pashaa@mskcc.org

Functions for downloading annotations from SlideViewer

luna.pathology.common.slideviewer_client.download_sv_point_annotation(url: str) Dict[str, any][source]

download slideviewer point annotation

Calls slideviewer API with the given url

Parameters

url (str) – slide viewer api to call

Returns

json response

Return type

dict[str, any]

luna.pathology.common.slideviewer_client.download_zip(url: str, dest_path: str, chunk_size: int = 128) bool[source]

Download zip file

Downloads zip from the specified URL and saves it to the specified file path. See https://stackoverflow.com/questions/9419162/download-returned-zip-file-from-url

Parameters
  • url (str) – slideviewer url to download zip from

  • dest_path (str) – file path where zipfile should be saved

  • chunk_size (int) – size in bytes of chunks to batch out during download

Returns

True if zipfile downloaded and saved successfully, else False

Return type

bool
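The chunked download described above can be sketched with the standard library. The sketch swaps in urllib.request in place of whatever HTTP client the library actually uses. The function name and the error handling are assumptions.

```python
import os
import pathlib
import tempfile
import urllib.request
import zipfile


def download_zip_sketch(url: str, dest_path: str, chunk_size: int = 128) -> bool:
    """Stream the response to disk chunk_size bytes at a time (illustrative)."""
    try:
        with urllib.request.urlopen(url) as response, open(dest_path, "wb") as out:
            while True:
                chunk = response.read(chunk_size)
                if not chunk:
                    break
                out.write(chunk)
        return True
    except (OSError, ValueError):
        # URLError is an OSError; malformed URLs raise ValueError.
        return False


# Usage without a network: serve a local zip through a file:// URL.
with tempfile.TemporaryDirectory() as tmp_dir:
    source = os.path.join(tmp_dir, "source.zip")
    with zipfile.ZipFile(source, "w") as archive:
        archive.writestr("labels.txt", "tumor\n" * 100)
    dest = os.path.join(tmp_dir, "copy.zip")
    downloaded = download_zip_sketch(pathlib.Path(source).as_uri(), dest)
    copy_is_zip = zipfile.is_zipfile(dest)
    failed = download_zip_sketch("not-a-url", os.path.join(tmp_dir, "other.zip"))
```

Reading in fixed-size chunks keeps memory use bounded regardless of the archive size; a larger chunk_size reduces the number of read calls.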

luna.pathology.common.slideviewer_client.fetch_slide_ids(url: str, project_id: int, dest_dir: str, csv_file: Optional[str] = None) list[source]

get slide ids

Fetch the list of slide ids from the slideviewer server for the project with the specified project id. Alternately, a slideviewer csv file may be provided to override download from server.

Parameters
  • url (str or None) – slideviewer url. url may be None if csv_file is specified.

  • project_id (int) – slideviewer project id from which to fetch slide ids

  • dest_dir (str) – directory where csv file should be downloaded

  • csv_file (str) – slideviewer csv file may be provided to override the need to download the file

Returns

list of (slideviewer_path, slide_id, sv_project_id)

Return type

list

luna.pathology.common.slideviewer_client.get_slide_id(full_filename: str) str[source]

get slide id

Get slide id from the slideviewer full file name. The full_filename in the slideviewer csv has the format year;HOBS_ID;slide_id.svs, for example: 2013;HobS13-283072057510;1435197.svs

Parameters

full_filename (str) – full filename of slide

Returns

numeric slide id

Return type

str
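Given the documented year;HOBS_ID;slide_id.svs format, the parsing can be sketched as follows. The function name is hypothetical, and the real implementation may validate its input differently.

```python
import os


def get_slide_id_sketch(full_filename: str) -> str:
    """Return the last ';'-separated field with the file extension removed."""
    stem, _extension = os.path.splitext(full_filename)
    return stem.split(";")[-1]


slide_id = get_slide_id_sketch("2013;HobS13-283072057510;1435197.svs")
# slide_id == "1435197"
```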

luna.pathology.common.slideviewer_client.unzip(zipfile_path: str) any[source]

unzip zip file

Parameters

zipfile_path (str) – path of zipfile to unzip

Returns

readfile pointer to unzipped file if successfully unzipped, else None
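One way to read the return contract above is sketched here with the standard zipfile module: an open archive handle on success, None on failure. Whether the library returns a zipfile.ZipFile object is an assumption, as is the function name.

```python
import os
import tempfile
import zipfile
from typing import Optional


def unzip_sketch(zipfile_path: str) -> Optional[zipfile.ZipFile]:
    """Open the archive for reading, or return None if it cannot be opened."""
    try:
        return zipfile.ZipFile(zipfile_path, "r")
    except (zipfile.BadZipFile, FileNotFoundError):
        return None


with tempfile.TemporaryDirectory() as tmp_dir:
    good_path = os.path.join(tmp_dir, "good.zip")
    with zipfile.ZipFile(good_path, "w") as archive:
        archive.writestr("annotation.txt", "hello")

    bad_path = os.path.join(tmp_dir, "bad.zip")
    with open(bad_path, "wb") as handle:
        handle.write(b"not a zip")

    opened = unzip_sketch(good_path)
    names = opened.namelist() if opened is not None else []
    if opened is not None:
        opened.close()  # the caller owns the returned handle
    bad_result = unzip_sketch(bad_path)
```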

pyluna-pathology.luna.pathology.common.utils module

Module contents