Tiler
The tiler module implements several strategies for tile extraction. The constructors of the three extractors RandomTiler, GridTiler, and ScoreTiler share a similar interface and common parameters that define the extraction design:
1. tile_size: the tile size, as a (width, height) tuple;
2. level: the extraction level, from 0 to the number of available levels; negative indexing is also possible, counting backward from the number of available levels to 0 (e.g. level=-1 selects the last available level);
3. check_tissue: True if a minimum percentage of tissue over the total area of the tile is required to save the tile, False otherwise;
4. tissue_percent: a number between 0.0 and 100.0 representing the minimum required percentage of tissue over the total area of the tile, considered only if check_tissue is True (default is 80.0);
5. prefix: a prefix added at the beginning of each tile’s filename (optional, default is the empty string);
6. suffix: a suffix added at the end of each tile’s filename (optional, default is .png).
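As an illustration of how level and tissue_percent can be read, the following minimal sketch resolves a negative level index and applies a tissue threshold to a boolean mask. The helper names resolve_level and has_enough_tissue are hypothetical and not part of the library. The sketch assumes that valid levels follow Python-style indexing (0 to n_levels - 1, or -n_levels to -1) and that the threshold comparison is inclusive.

```python
def resolve_level(level, n_levels):
    """Map a possibly negative level index to a non-negative one.

    Assumption: levels are indexed like a Python sequence of length n_levels.
    """
    if not -n_levels <= level < n_levels:
        raise ValueError(f"level {level} is not available ({n_levels} levels)")
    return level % n_levels


def has_enough_tissue(tissue_mask, tissue_percent=80.0):
    """Check a tile's tissue coverage against tissue_percent.

    tissue_mask is a non-empty 2D list of booleans (True marks a tissue pixel).
    Assumption: a tile exactly at the threshold is accepted.
    """
    total = sum(len(row) for row in tissue_mask)
    tissue = sum(sum(row) for row in tissue_mask)
    return 100.0 * tissue / total >= tissue_percent
```

For example, with four available levels, resolve_level(-1, 4) selects level 3, the last one.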
The general mechanism is to (i) create a tiler object, (ii) define a Slide object, used to identify the input image, and (iii) create a mask object to determine the area for tile extraction within the tissue. The extraction process starts when the tiler’s extract() method is called, with the slide and the mask passed as parameters.
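The three steps above can be sketched as follows. This is a sketch under stated assumptions rather than a reference implementation: it assumes that the classes are importable from histolab.tiler, histolab.slide, and histolab.masks, and that Slide accepts the image path and a processed_path output directory. The function name extract_grid_tiles and its arguments are hypothetical.

```python
def extract_grid_tiles(slide_path, output_dir):
    """Extract 512x512 grid tiles from the slide at slide_path into output_dir."""
    # Imported here so that the sketch can be loaded without histolab installed.
    # Assumed module paths: histolab.masks, histolab.slide, histolab.tiler.
    from histolab.masks import BiggestTissueBoxMask
    from histolab.slide import Slide
    from histolab.tiler import GridTiler

    # (i) Create a tiler object with the common parameters.
    tiler = GridTiler(
        tile_size=(512, 512),
        level=0,
        check_tissue=True,
        tissue_percent=80.0,
        prefix="",
        suffix=".png",
    )

    # (ii) Define the Slide object that identifies the input image.
    # Assumption: tiles are written under the slide's processed_path.
    slide = Slide(slide_path, processed_path=output_dir)

    # (iii) Create the mask that determines the extraction area.
    mask = BiggestTissueBoxMask()

    # Extraction starts when extract() is called with the slide and the mask.
    tiler.extract(slide, mask)
```

Swapping GridTiler for RandomTiler or ScoreTiler would follow the same pattern, with the extra constructor parameters described in the sections below.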
RandomTiler
The RandomTiler extractor extracts tiles picked at random within the regions defined by the binary mask object. Since there is no intrinsic upper bound on the number of tiles that could be extracted (no overlap check is performed), the number of wanted tiles must be specified.
In addition to parameters 1-6, the RandomTiler constructor requires two further parameters: the number of tiles requested (n_tiles) and the random seed (seed), which ensures reproducibility between different runs on the same WSI.
Note: n_tiles is interpreted as an upper bound on the number of tiles requested. Fewer than n_tiles tiles may be extracted from a slide with little tissue and a lot of background, because tiles without enough tissue are discarded when check_tissue is set to True.
The extraction procedure will (i) find the regions to extract tiles from, defined by the binary mask object; (ii) generate n_tiles random tiles; (iii) save only the tiles with enough tissue if the attribute check_tissue was set to True, save all the generated tiles otherwise.
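The procedure can be illustrated with a simplified, library-independent sketch. It assumes the mask is a plain 2D list of booleans (True marks tissue) and samples upper-left corners over the whole mask, which is a simplification of the region handling described above. The function name random_tiles and the max_iter default of 1000 are hypothetical.

```python
import random


def random_tiles(mask, tile_size, n_tiles, seed=7, check_tissue=True,
                 tissue_percent=80.0, max_iter=1000):
    """Return at most n_tiles (x, y) upper-left corners of random tiles.

    mask is a 2D list of booleans (True marks a tissue pixel).
    """
    w, h = tile_size
    rows, cols = len(mask), len(mask[0])
    if w > cols or h > rows:
        raise ValueError("tile size is larger than the mask")
    rng = random.Random(seed)  # a fixed seed makes runs reproducible
    tiles = []
    for _ in range(max_iter):
        if len(tiles) == n_tiles:
            break
        # Pick a random upper-left corner so that the tile fits in the mask.
        x = rng.randint(0, cols - w)
        y = rng.randint(0, rows - h)
        tissue = sum(mask[r][c] for r in range(y, y + h) for c in range(x, x + w))
        # Keep the tile if no check is requested or if it has enough tissue.
        if not check_tissue or 100.0 * tissue / (w * h) >= tissue_percent:
            tiles.append((x, y))
    return tiles
```

With a mask that contains no tissue and check_tissue=True, the sketch returns an empty list, which mirrors the role of n_tiles as an upper bound.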
GridTiler
A second basic approach consists of extracting all the tiles in the areas defined by the binary mask. This strategy is implemented in the GridTiler class. The additional pixel_overlap parameter specifies the number of overlapping pixels between two adjacent tiles, i.e. tiles are cropped by using a sliding window with stride s defined as:
\[s = \begin{cases} w - \mathrm{pixel\_overlap} & \text{along the horizontal axis} \\ h - \mathrm{pixel\_overlap} & \text{along the vertical axis} \end{cases}\]
where w and h are customizable parameters defining the width and the height of the resulting tiles.
Calling the extract method on the GridTiler instance will automatically (i) find the regions to extract tiles from, defined by the binary mask object; (ii) generate all the tiles according to the grid structure; (iii) save only the tiles with “enough tissue” if the attribute check_tissue was set to True, save all the generated tiles otherwise.
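The grid structure can be sketched with a short, library-independent helper that enumerates the upper-left corners of the tiles for a rectangular region. The function name grid_origins is hypothetical, and the sketch assumes that tiles which would extend past the region border are dropped.

```python
def grid_origins(region_size, tile_size, pixel_overlap=0):
    """Return the (x, y) upper-left corners of the grid tiles in a region.

    region_size and tile_size are (width, height) tuples in pixels.
    Assumption: only tiles that fit entirely inside the region are kept.
    """
    region_w, region_h = region_size
    w, h = tile_size
    # Stride of the sliding window: tile size minus the overlap.
    stride_x = w - pixel_overlap
    stride_y = h - pixel_overlap
    if stride_x <= 0 or stride_y <= 0:
        raise ValueError("pixel_overlap must be smaller than the tile size")
    return [
        (x, y)
        for y in range(0, region_h - h + 1, stride_y)
        for x in range(0, region_w - w + 1, stride_x)
    ]
```

A positive pixel_overlap shortens the stride and yields more tiles, while a negative value lengthens it and leaves a gap between adjacent tiles.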
ScoreTiler
Tiles extracted from the same WSI may not be equally informative; for example, if the goal is the detection of mitotic activity on H&E slides, tiles with no nuclei are of little interest. The ScoreTiler extractor ranks the tiles with respect to a scoring function, described in the scorer module. In particular, the ScoreTiler class extends the GridTiler extractor by sorting the extracted tiles in decreasing order of their computed score. Notably, the ScoreTiler is agnostic to the scoring function adopted, so a custom function can be implemented provided that it takes a Tile object as input and returns a number. The additional parameter n_tiles controls the number of highest-ranked tiles to save; if n_tiles=0, all the tiles are kept.
Similarly to the GridTiler extraction process, calling the extract method on the ScoreTiler instance will automatically (i) find the regions to extract tiles from, defined by the binary mask object (by default, the largest tissue area in the WSI); (ii) generate all the tiles according to the grid structure; (iii) retain only the tiles with enough tissue if the attribute check_tissue was set to True, all the generated tiles otherwise; (iv) sort the tiles in decreasing order according to the scoring function defined in the scorer parameter; (v) save only the highest-ranked n_tiles tiles, if n_tiles>0; (vi) write a summary of the saved tiles and their scores in a CSV file, if report_path is specified in the extract method. The summary reports for each tile t: (i) the tile filename; (ii) its raw score \(s_t\); (iii) the normalized score \(\hat{s}_t\), scaled in the interval [0,1], computed as:
\[\hat{s}_t = \frac{s_t - \min_{s \in S} s}{\max_{s \in S} s - \min_{s \in S} s}\]
where S is the set of the raw scores of all the extracted tiles.
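Steps (iv) to (vi) and the score normalization can be sketched without the library as follows. The function names rank_tiles and write_report and the CSV column names are hypothetical. The sketch assumes that scores are normalized over all scored tiles before the top n_tiles are selected, and that the normalized score is set to 0.0 when all raw scores are equal.

```python
import csv


def rank_tiles(scored_tiles, n_tiles=0):
    """Rank (filename, raw_score) pairs and attach min-max normalized scores.

    Returns (filename, raw_score, normalized_score) rows sorted by decreasing
    raw score; only the top n_tiles rows are kept when n_tiles > 0.
    """
    if n_tiles < 0:
        raise ValueError("n_tiles cannot be negative")
    if not scored_tiles:
        return []
    raw = [score for _, score in scored_tiles]
    lowest, span = min(raw), max(raw) - min(raw)
    rows = [
        # Assumption: identical raw scores all map to 0.0.
        (name, score, (score - lowest) / span if span else 0.0)
        for name, score in scored_tiles
    ]
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:n_tiles] if n_tiles > 0 else rows


def write_report(rows, stream):
    """Write the ranked rows as CSV to an open text stream."""
    writer = csv.writer(stream)
    writer.writerow(["filename", "score", "scaled_score"])  # hypothetical header
    writer.writerows(rows)
```

For instance, raw scores of 2.0, 6.0, and 4.0 are normalized to 0.0, 1.0, and 0.5 respectively, and with n_tiles=2 only the tiles scoring 6.0 and 4.0 are reported.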
- class GridTiler(*args, **kwds)
Extractor of tiles arranged in a grid, at the given level, with the given size.
- Parameters
tile_size (Tuple[int, int]) – (width, height) of the extracted tiles.
level (int, optional) – Level from which to extract the tiles. Default is 0. Superseded by mpp if the mpp argument is provided.
check_tissue (bool, optional) – Whether to check if the tile has enough tissue to be saved. Default is True.
tissue_percent (float, optional) – Number between 0.0 and 100.0 representing the minimum required percentage of tissue over the total area of the image. Default is 80.0. This is considered only if check_tissue is True.
pixel_overlap (int, optional) – Number of overlapping pixels (for both height and width) between two adjacent tiles. If negative, two adjacent tiles will be spaced apart by the absolute value of pixel_overlap. Default is 0.
prefix (str, optional) – Prefix to be added to the tile filename. Default is an empty string.
suffix (str, optional) – Suffix to be added to the tile filename. Default is '.png'.
mpp (float, optional) – Micron per pixel resolution of extracted tiles. Takes precedence over level. Default is None.
- extract(slide, extraction_mask=<histolab.masks.BiggestTissueBoxMask object>, log_level='INFO')
Extract tiles arranged in a grid and save them to disk, following this filename pattern: {prefix}tile_{tiles_counter}_level{level}_{x_ul_wsi}-{y_ul_wsi}-{x_br_wsi}-{y_br_wsi}{suffix}
- Parameters
slide (Slide) – Slide from which to extract the tiles
extraction_mask (BinaryMask, optional) – BinaryMask object defining how to compute a binary mask from a Slide. Default BiggestTissueBoxMask.
log_level (str, {"DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"}) – Threshold level for the log messages. Default “INFO”
- Raises
TileSizeError – If the tile size is larger than the slide size
LevelError – If the level is not available for the slide
- Return type
None
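The filename pattern used by extract can be illustrated with a small formatting helper. The function name tile_filename is hypothetical; the field names follow the pattern given above, with the coordinates read as the upper-left and bottom-right corners of the tile in WSI coordinates.

```python
def tile_filename(tiles_counter, level, coords, prefix="", suffix=".png"):
    """Build a tile filename following the documented pattern.

    coords is (x_ul_wsi, y_ul_wsi, x_br_wsi, y_br_wsi).
    """
    x_ul, y_ul, x_br, y_br = coords
    return (
        f"{prefix}tile_{tiles_counter}_level{level}_"
        f"{x_ul}-{y_ul}-{x_br}-{y_br}{suffix}"
    )
```

Under this reading, the first tile at level 2 covering (0, 0) to (512, 512) with prefix "grid_" would be named grid_tile_0_level2_0-0-512-512.png.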
- locate_tiles(slide, extraction_mask=<histolab.masks.BiggestTissueBoxMask object>, scale_factor=32, alpha=128, outline='red', linewidth=1, tiles=None)
Draw tile box references on a rescaled version of the slide
- Parameters
slide (Slide) – Slide on which the tile boxes are drawn
extraction_mask (BinaryMask, optional) – BinaryMask object defining how to compute a binary mask from a Slide. Default BiggestTissueBoxMask
scale_factor (int, optional) – Scaling factor for the returned image. Default is 32.
alpha (int, optional) – The alpha level to be applied to the rescaled slide. Default is 128.
outline (Union[str, Iterable[str], Iterable[Tuple[int]]], optional) – The outline color for the tile annotations. Default is 'red'. You can provide this as a string compatible with matplotlib, or you can provide a list of the same length as the tiles, where each color is your assigned color for the corresponding individual tile. This list can be a list of matplotlib-style string colors, or a list of tuples of ints in the [0, 255] range, each of length 3, representing the red, green and blue color for each tile. For example, if you have two tiles that you want to be colored yellow, you can pass this argument as any of the following: 'yellow', ['yellow', 'yellow'], or [(255, 255, 0), (255, 255, 0)].
linewidth (int, optional) – Thickness of line used to draw tiles. Default is 1.
tiles (Optional[Iterable[Tile]], optional) – Tiles to visualize. Will be extracted if None. Default is None. You may decide to provide this argument if you do not want the tiles to be re-extracted for visualization if you already have the tiles in hand.
- Returns
PIL Image of the rescaled slide with the extracted tiles outlined
- Return type
PIL.Image.Image
- property tile_size: Tuple[int, int]
(width, height) of the extracted tiles.
- class RandomTiler(*args, **kwds)
Extractor of random tiles from a Slide, at the given level, with the given size.
- Parameters
tile_size (Tuple[int, int]) – (width, height) of the extracted tiles.
n_tiles (int) – Maximum number of tiles to extract.
level (int, optional) – Level from which to extract the tiles. Default is 0. Superseded by mpp if the mpp argument is provided.
seed (int, optional) – Seed for RandomState. Must be convertible to 32 bit unsigned integers. Default is 7.
check_tissue (bool, optional) – Whether to check if the tile has enough tissue to be saved. Default is True.
tissue_percent (float, optional) – Number between 0.0 and 100.0 representing the minimum required percentage of tissue over the total area of the image. Default is 80.0. This is considered only if check_tissue is True.
prefix (str, optional) – Prefix to be added to the tile filename. Default is an empty string.
suffix (str, optional) – Suffix to be added to the tile filename. Default is '.png'.
max_iter (int, optional) – Maximum number of iterations performed when searching for eligible (if check_tissue=True) tiles. Must be greater than or equal to n_tiles.
mpp (float, optional) – Micron per pixel resolution. If provided, takes precedence over level. Default is None.
- extract(slide, extraction_mask=<histolab.masks.BiggestTissueBoxMask object>, log_level='INFO')
Extract random tiles and save them to disk, following this filename pattern: {prefix}tile_{tiles_counter}_level{level}_{x_ul_wsi}-{y_ul_wsi}-{x_br_wsi}-{y_br_wsi}{suffix}
- Parameters
slide (Slide) – Slide from which to extract the tiles
extraction_mask (BinaryMask, optional) – BinaryMask object defining how to compute a binary mask from a Slide. Default BiggestTissueBoxMask.
log_level (str, {"DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"}) – Threshold level for the log messages. Default “INFO”
- Raises
TileSizeError – If the tile size is larger than the slide size
LevelError – If the level is not available for the slide
- Return type
None
- locate_tiles(slide, extraction_mask=<histolab.masks.BiggestTissueBoxMask object>, scale_factor=32, alpha=128, outline='red', linewidth=1, tiles=None)
Draw tile box references on a rescaled version of the slide
- Parameters
slide (Slide) – Slide on which the tile boxes are drawn
extraction_mask (BinaryMask, optional) – BinaryMask object defining how to compute a binary mask from a Slide. Default BiggestTissueBoxMask
scale_factor (int, optional) – Scaling factor for the returned image. Default is 32.
alpha (int, optional) – The alpha level to be applied to the rescaled slide. Default is 128.
outline (Union[str, Iterable[str], Iterable[Tuple[int]]], optional) – The outline color for the tile annotations. Default is 'red'. You can provide this as a string compatible with matplotlib, or you can provide a list of the same length as the tiles, where each color is your assigned color for the corresponding individual tile. This list can be a list of matplotlib-style string colors, or a list of tuples of ints in the [0, 255] range, each of length 3, representing the red, green and blue color for each tile. For example, if you have two tiles that you want to be colored yellow, you can pass this argument as any of the following: 'yellow', ['yellow', 'yellow'], or [(255, 255, 0), (255, 255, 0)].
linewidth (int, optional) – Thickness of line used to draw tiles. Default is 1.
tiles (Optional[Iterable[Tile]], optional) – Tiles to visualize. Will be extracted if None. Default is None. You may decide to provide this argument if you do not want the tiles to be re-extracted for visualization if you already have the tiles in hand.
- Returns
PIL Image of the rescaled slide with the extracted tiles outlined
- Return type
PIL.Image.Image
- class ScoreTiler(*args, **kwds)
Extractor of tiles arranged in a grid according to a scoring function.
The extraction procedure is the same as the GridTiler extractor, but only the n_tiles tiles with the highest score are saved.
- Parameters
scorer (Scorer) – Scoring function used to score the tiles.
tile_size (Tuple[int, int]) – (width, height) of the extracted tiles.
n_tiles (int, optional) – The number of tiles to be saved. Default is 0, which means that all the tiles will be saved (same exact behaviour of a GridTiler). Cannot be negative.
level (int, optional) – Level from which to extract the tiles. Default is 0. Superseded by mpp if the mpp argument is provided.
check_tissue (bool, optional) – Whether to check if the tile has enough tissue to be saved. Default is True.
tissue_percent (float, optional) – Number between 0.0 and 100.0 representing the minimum required percentage of tissue over the total area of the image. Default is 80.0. This is considered only if check_tissue is True.
pixel_overlap (int, optional) – Number of overlapping pixels (for both height and width) between two adjacent tiles. If negative, two adjacent tiles will be spaced apart by the absolute value of pixel_overlap. Default is 0.
prefix (str, optional) – Prefix to be added to the tile filename. Default is an empty string.
suffix (str, optional) – Suffix to be added to the tile filename. Default is '.png'.
mpp (float, optional) – Micron per pixel resolution. If provided, takes precedence over level. Default is None.
- extract(slide, extraction_mask=<histolab.masks.BiggestTissueBoxMask object>, report_path=None, log_level='INFO')
Extract grid tiles and save them to disk, according to a scoring function and following this filename pattern: {prefix}tile_{tiles_counter}_level{level}_{x_ul_wsi}-{y_ul_wsi}-{x_br_wsi}-{y_br_wsi}{suffix}
Save a CSV report file with the saved tiles and the associated score.
- Parameters
slide (Slide) – Slide from which to extract the tiles
extraction_mask (BinaryMask, optional) – BinaryMask object defining how to compute a binary mask from a Slide. Default BiggestTissueBoxMask.
report_path (str, optional) – Path to the CSV report. If None, no report will be saved
log_level (str, {"DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"}) – Threshold level for the log messages. Default “INFO”
- Raises
TileSizeError – If the tile size is larger than the slide size
LevelError – If the level is not available for the slide
- Return type
None
- locate_tiles(slide, extraction_mask=<histolab.masks.BiggestTissueBoxMask object>, scale_factor=32, alpha=128, outline='red', linewidth=1, tiles=None)
Draw tile box references on a rescaled version of the slide
- Parameters
slide (Slide) – Slide on which the tile boxes are drawn
extraction_mask (BinaryMask, optional) – BinaryMask object defining how to compute a binary mask from a Slide. Default BiggestTissueBoxMask
scale_factor (int, optional) – Scaling factor for the returned image. Default is 32.
alpha (int, optional) – The alpha level to be applied to the rescaled slide. Default is 128.
outline (Union[str, Iterable[str], Iterable[Tuple[int]]], optional) – The outline color for the tile annotations. Default is 'red'. You can provide this as a string compatible with matplotlib, or you can provide a list of the same length as the tiles, where each color is your assigned color for the corresponding individual tile. This list can be a list of matplotlib-style string colors, or a list of tuples of ints in the [0, 255] range, each of length 3, representing the red, green and blue color for each tile. For example, if you have two tiles that you want to be colored yellow, you can pass this argument as any of the following: 'yellow', ['yellow', 'yellow'], or [(255, 255, 0), (255, 255, 0)].
linewidth (int, optional) – Thickness of line used to draw tiles. Default is 1.
tiles (Optional[Iterable[Tile]], optional) – Tiles to visualize. Will be extracted if None. Default is None. You may decide to provide this argument if you do not want the tiles to be re-extracted for visualization if you already have the tiles in hand.
- Returns
PIL Image of the rescaled slide with the extracted tiles outlined
- Return type
PIL.Image.Image
- property tile_size: Tuple[int, int]
(width, height) of the extracted tiles.