# Reading Microscopy Data in Python

Choosing a Python library for reading microscopy data affects both how fast your workflow runs and how much metadata you can recover. This guide surveys the available libraries so that you can choose according to your priorities: speed, versatility, or integration with other tools.

For opening microscopy data files in Python, you have several options, each with its own advantages. Here's a breakdown of the options and some considerations:

1. scikit-image (skimage.io.imread and skimage.io.imread_collection):
- Used for reading standard image formats.
- Provides simple and efficient functions for reading individual images or collections of images.

2. tifffile (tifffile.TiffFile and tifffile.TiffSequence):
- Specialized for working with TIFF files, including multi-dimensional arrays.
- TiffSequence is useful for handling sequences of TIFF files.

3. bioformats (bioformats.ImageReader):
- Supports a variety of microscopy formats, especially those using the OME data model.
- Handles multi-dimensional data and can read metadata.

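Regarding ImageJ hyperstack organization (TZCYXS):

- T: Time
- Z: Z-stack (slices)
- C: Channels
- Y: Height
- X: Width
- S: Samples (the color components of a pixel, e.g. RGB). This is not the same as a Bio-Formats series, which identifies one of several acquisitions stored in a file; the `S=` counts in the code comments below refer to series.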
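A minimal sketch of how such an axis order translates into NumPy indexing. The array is synthetic and the sizes are arbitrary assumptions, not taken from any file used in this guide; the trailing sample axis is omitted because the data is assumed to be grayscale.

```python
import numpy as np

# Synthetic hyperstack in TZCYX order (sizes are arbitrary assumptions).
sizes = {"T": 4, "Z": 5, "C": 3, "Y": 64, "X": 48}
axes = "".join(sizes)  # "TZCYX"
stack = np.zeros(tuple(sizes.values()), dtype=np.uint16)

# One 2D plane: time point 2, slice 1, channel 0.
plane = stack[2, 1, 0]

# Look up an axis by name instead of hard-coding its position.
max_projection = stack.max(axis=axes.index("Z"))  # remaining axes: TCYX

# Move the channel axis to the front, giving CTZYX.
channel_first = np.moveaxis(stack, axes.index("C"), 0)

print(plane.shape, max_projection.shape, channel_first.shape)
```

Keeping the axes string next to the array makes it harder to mix up positions when a reader returns a different order.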

For Holoviews:

- It is used for interactive visualization and does not read files itself. Whether it works with memory-mapped or disk-backed arrays still needs to be checked.

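Regarding Bioformats standard:

- Bioformats reads many microscopy formats and maps their metadata onto the OME (Open Microscopy Environment) data model. In OME-TIFF, the planes of a dataset can be stored in a single multi-page file or spread across several files (for example, one file per channel or time point); the OME-XML block stored in the TIFF ImageDescription tag records which plane lives where.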
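OME metadata is serialized as OME-XML. The sketch below reads the pixel dimensions from such a block using only the standard library. The XML fragment is hand-written for illustration (its names and sizes are assumptions) and is far smaller than what a real file carries.

```python
import xml.etree.ElementTree as ET

# Hand-written, minimal OME-XML fragment (illustrative values only).
OME_XML = """
<OME xmlns="http://www.openmicroscopy.org/Schemas/OME/2016-06">
  <Image ID="Image:0" Name="tile_0">
    <Pixels ID="Pixels:0" DimensionOrder="XYCZT" Type="uint16"
            SizeX="512" SizeY="512" SizeZ="1" SizeC="4" SizeT="3">
      <Channel ID="Channel:0:0" SamplesPerPixel="1"/>
      <TiffData IFD="0" FirstC="0" FirstZ="0" FirstT="0" PlaneCount="1"/>
    </Pixels>
  </Image>
</OME>
""".strip()

NS = {"ome": "http://www.openmicroscopy.org/Schemas/OME/2016-06"}

root = ET.fromstring(OME_XML)
pixels = root.find("ome:Image/ome:Pixels", NS)

# Collect the five sizes and the declared plane ordering.
dims = {axis: int(pixels.get(f"Size{axis}")) for axis in "XYZCT"}
order = pixels.get("DimensionOrder")
n_planes = dims["Z"] * dims["C"] * dims["T"]

print(order, dims, n_planes)
```

The namespace depends on the schema version written into the file, so a real reader should take it from the root element rather than hard-code it.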

About tiles and 5D in Bioformats:

- In the context of Bioformats, 5D typically refers to a dataset with dimensions T-Z-C-Y-X, where T is time, Z is the z-stack, C is the channel, and Y and X are spatial dimensions. Tiles may refer to sub-images or chunks of the larger image, which can be useful for efficiently working with large datasets.

The 6D, 7D, and 8D configurations in Bioformats add further dimensions specific to certain types of microscopy data (for example lifetime or spectral measurements); as far as I understand, these extra dimensions are folded into Z, C, or T rather than stored as separate axes.

For the exact definition of 5D and of the higher-dimensional configurations, refer to the Bioformats documentation and the OME data model specification.
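A 5D dataset is stored as a flat list of 2D planes, and the dimension order string (such as `XYCZT`) decides which of Z, C, and T varies fastest. The sketch below computes the linear plane index for that convention. The function name `plane_index` and the sizes are assumptions made for illustration; a Bioformats reader offers its own `getIndex(z, c, t)` for this purpose.

```python
def plane_index(order: str, size: dict, z: int, c: int, t: int) -> int:
    """Linear plane index for (z, c, t) under an order such as 'XYCZT'."""
    coords = {"Z": z, "C": c, "T": t}
    index, stride = 0, 1
    # Skip the leading "XY"; the first axis listed after it varies fastest.
    for axis in order[2:]:
        if not 0 <= coords[axis] < size[axis]:
            raise ValueError(f"{axis}={coords[axis]} out of range")
        index += coords[axis] * stride
        stride *= size[axis]
    return index


size = {"Z": 5, "C": 3, "T": 4}  # arbitrary example sizes

# With XYCZT the channel varies fastest, so the next slice is 3 planes away.
next_channel = plane_index("XYCZT", size, z=0, c=1, t=0)
next_slice = plane_index("XYCZT", size, z=1, c=0, t=0)

# With XYZCT the slice varies fastest instead.
next_slice_zfirst = plane_index("XYZCT", size, z=1, c=0, t=0)

last_plane = plane_index("XYCZT", size, z=4, c=2, t=3)
print(next_channel, next_slice, next_slice_zfirst, last_plane)
```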

## Data Path Assignment and Imports

To begin our analysis, we first need to import the necessary libraries and assign the path to our data files. This step ensures that we have access to the tools and resources required for the subsequent tasks.

Let's get started by executing the following code:

In [None]:
%load_ext autoreload
%autoreload 2

from pathlib import Path

import skimage.io
import tifffile

import nima_io.read as ir

tdata = Path("../../tests/data/")
lif = tdata / "2015Aug28_TransHXB2_50min+DMSO.lif"
img_tile = tdata / "t4_1.tif" # C=3 T=4 S=15
img_void_tile = tdata / "tile6_1.tif" # C=4 T=3 S=14 scattered
# imgsingle = tdata / "exp2_2.tif" # C=2 T=81
# mcts = tdata / "multi-channel-time-series.ome.tif" # C=3 T=7
# bigtiff = tdata / "LC26GFP_1.tf8" # bigtiff

slif = str(lif)
simg_tile = str(img_tile)
simg_void_tile = str(img_void_tile)
# simgsingle = str(imgsingle)
# smcts = str(mcts)
# sbigtiff = str(bigtiff)

## Skimage and Tifffile

`scikit-image` serves as a versatile option for general image reading, encompassing various formats, including TIFF.
Meanwhile, `tifffile` stands out for its capabilities in managing sequences, OME metadata, memory mapping, and Zarr arrays specifically for TIFF data files.

- Memory mapping (`memmap`) makes it possible to work with large files by mapping portions into memory as needed, without loading the entire file.

- `Zarr` is a storage format for chunked, compressed, n-dimensional arrays. `tifffile` can expose TIFF data through a Zarr interface, which helps with microscopy datasets that are large or have a complex structure.
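The idea behind memory mapping can be sketched with plain `numpy.memmap` on a raw binary file. This is an illustration of the mechanism under assumed sizes and file names, not of how `tifffile` lays out TIFF data.

```python
import tempfile
from pathlib import Path

import numpy as np

shape = (3, 4, 64, 64)  # arbitrary T, C, Y, X sizes

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "stack.raw"

    # Create a file-backed array and write a single plane to disk.
    writer = np.memmap(path, dtype=np.uint16, mode="w+", shape=shape)
    writer[2, 1] = 7
    writer.flush()
    del writer

    # Reopen read-only: nothing is loaded until a region is accessed.
    reader = np.memmap(path, dtype=np.uint16, mode="r", shape=shape)
    plane = np.array(reader[2, 1])  # copy just one plane into RAM
    plane_sum = int(plane.sum())
    del reader  # release the mapping before the directory is removed

print(plane.shape, plane_sum)
```

Only the requested plane is copied into memory; the rest of the file stays on disk.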

In [None]:
t1 = skimage.io.imread(img_tile, plugin="tifffile")
t2 = skimage.io.imread(img_void_tile, plugin="tifffile")
t1.shape, t2.shape

In [None]:
tf1 = tifffile.imread(img_tile)
tf2 = tifffile.imread(img_void_tile)
tf1.shape, tf2.shape
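To check how axes are reported without depending on the test files, a small round trip can help. The sketch below writes a synthetic ImageJ hyperstack with `tifffile.imwrite` and reads the axes back from the first series; the array sizes and the file name are assumptions.

```python
import tempfile
from pathlib import Path

import numpy as np
import tifffile

# Synthetic data in TZCYX order (arbitrary sizes).
data = np.arange(2 * 3 * 2 * 4 * 5, dtype=np.uint16).reshape(2, 3, 2, 4, 5)

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "hyperstack.tif"
    # imagej=True writes ImageJ hyperstack metadata alongside the pixels.
    tifffile.imwrite(path, data, imagej=True, metadata={"axes": "TZCYX"})

    with tifffile.TiffFile(path) as tif:
        series = tif.series[0]
        axes, shape = series.axes, series.shape
        back = series.asarray()

print(axes, shape)
```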

In [None]:
fp1glob = str(tdata / "im1s1z3c5t_?.ome.tif")

tifs = tifffile.TiffSequence(fp1glob)
d = tifs.asarray()
print(d.shape)
print(tifs.shape)

In [None]:
with tifffile.TiffFile(img_tile) as tif:
 tag = tif.pages[0].tags["ImageDescription"]

tag.value[:1000]

## nima_io
### read

In [None]:
md, wr = ir.read(simg_void_tile)
md.core, wr

In [None]:
md.core.voxel_size

In [None]:
root = wr.rdr.getMetadataStoreRoot()

In [None]:
ome_store = wr.rdr.getMetadataStore()
ome_store

In [None]:
get_power = ome_store.getArcPower(0, 4)
get_power

In [None]:
att = ome_store.getChannelLightSourceSettingsAttenuation(0, 0)

In [None]:
nmax = 7
# Use a distinct loop variable so that `md` is not shadowed inside the comprehension.
entries = [item for item in md.full.items() if len(item[1][0][0]) == nmax]
len(entries), entries

In [None]:
list(range(4))

In [None]:
[(0,) * n for n in range(3 + 1)]

In [None]:
ir.convert_java_numeric_field(att), ir.convert_java_numeric_field(get_power)

In [None]:
{md.full.get(k)[0][0] for k in md.full}

In [None]:
[(k, md.full.get(k)[0]) for k in md.full if not md.full.get(k)[0][1]]

In [None]:
ome_store.getRoot() == root

In [None]:
ome_store.getPlaneCount(4), ome_store.getPlaneTheC(4, 11), ome_store.getPixelsSizeZ(4)

In [None]:
wr.rdr.getDimensionOrder(), ir.read(slif)[1].rdr.getDimensionOrder()

Mind the difference in dimension order between the `img_void_tile` and `lif` files.

In [None]:
md.full["PixelsDimensionOrder"]

In [None]:
root.getImage(0)

In [None]:
root.getImage(13).getPixels().getPlane(11).getTheC().getValue()

In [None]:
(
 root.getImage(13).getName(),
 ir.read(slif)[1].rdr.getMetadataStoreRoot().getImage(2).getName(),
)

In [None]:
md.core.__dict__

In [None]:
vars(md.core)

### Stitch

In [None]:
f = ir.stitch(md.core, wr, c=2, t=2)
skimage.io.imshow(f)

In [None]:
md.core.stage_position[2]

## nima_io.read

| function | time (ms) | note |
|------------|------------|--------------------------|
| read | 169 | |
| read_pims | 195 | extra pims DIMS |

- Metadata is now uniform across different reading functions.

In the following sections, various bioformats implementations are explored. None of the explored libraries returns the extensive metadata linked to individual planes. Consequently, I have developed a small library to handle this additional (often neglected) metadata, such as the acquisition stage position (essential for reconstructing tiled images) and the illumination and emission settings.
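Why the stage position matters for tiled images can be shown with a small NumPy sketch. This is not the implementation of `ir.stitch`: the function name `stitch_tiles`, the tile values, and the positions are assumptions, and the sketch further assumes equally sized tiles, stage axes aligned with the image axes, and no overlap between tiles.

```python
import numpy as np


def stitch_tiles(tiles, positions_um, pixel_size_um):
    """Paste equally sized 2D tiles onto one canvas using stage positions."""
    pos = np.asarray(positions_um, dtype=float)
    # Convert (x, y) stage coordinates to pixel offsets from the top-left tile.
    offsets = np.rint((pos - pos.min(axis=0)) / pixel_size_um).astype(int)
    tile_h, tile_w = tiles[0].shape
    canvas = np.zeros(
        (offsets[:, 1].max() + tile_h, offsets[:, 0].max() + tile_w),
        dtype=tiles[0].dtype,
    )
    for tile, (dx, dy) in zip(tiles, offsets):
        canvas[dy : dy + tile_h, dx : dx + tile_w] = tile
    return canvas


# Three 2x2 tiles on a 2x2 grid; the fourth grid position was never acquired.
tiles = [np.full((2, 2), value, dtype=np.uint8) for value in (1, 2, 3)]
positions_um = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
mosaic = stitch_tiles(tiles, positions_um, pixel_size_um=0.5)
print(mosaic)
```

Positions that were never acquired simply stay at zero, which is how a scattered acquisition shows up in the mosaic.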


### PIMS

PIMS is currently unable to download loci_tools.jar.

**I really like the frame metadata `t_s`, `x_um`, `y_um` and `z_um`.
Every array (2D, 3D, ..., n-D) sharing those metadata is contained in a Frame object: a numpy array with `metadata` (dict) and `frame_no` (int).**
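The idea of an array that carries its own metadata can be sketched as a small `numpy.ndarray` subclass. This is not the PIMS implementation; the class name `MetaFrame` and the metadata values are hypothetical.

```python
import numpy as np


class MetaFrame(np.ndarray):
    """Hypothetical ndarray subclass with a metadata dict and a frame number."""

    def __new__(cls, data, frame_no=None, metadata=None):
        obj = np.asarray(data).view(cls)
        obj.frame_no = frame_no
        obj.metadata = dict(metadata or {})
        return obj

    def __array_finalize__(self, obj):
        # Called for views and slices: carry the attributes over from the parent.
        if obj is None:
            return
        self.frame_no = getattr(obj, "frame_no", None)
        self.metadata = getattr(obj, "metadata", {})


frame = MetaFrame(
    np.zeros((4, 6), dtype=np.uint16),
    frame_no=3,
    metadata={"t_s": 1.5, "x_um": 10.0, "y_um": -2.0, "z_um": 0.0},
)
crop = frame[:2, :3]  # slicing keeps the metadata attached
print(type(crop).__name__, crop.shape, crop.frame_no, crop.metadata["t_s"])
```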

Are fs.bundle_axes (fs.frame_shape), fs.iter_axes and fs.default_coords overcomplicated?

Anyway: iter=0 == iter=n which is at least unexpected.

In [None]:
md_pims, wr_pims = ir.read_pims(img_void_tile)
md_pims.core.__dict__

In [None]:
mdata = wr.rdr.getMetadataStore()

In [None]:
root = mdata.getRoot()

In [None]:
im0 = root.getImage(0)

In [None]:
pixels = im0.getPixels()

In [None]:
for idx in range(pixels.sizeOfTiffDataList()):
 tiffData = pixels.getTiffData(idx)
 c = tiffData.getFirstC().getValue().intValue()
 t = tiffData.getFirstT().getValue().intValue()
 print(f"TiffData: c={c}, t={t}")

## ImageIO

In [None]:
from imageio.v3 import imread

%timeit imread(img_void_tile, index=13)
i = imread(img_void_tile, index=13)
i.shape

In [None]:
i.nbytes, 512**2 * 3 * 4 * 2

ImageIO can read tif files, including BigTIFF (tf8). A series can be selected with `index`, but you need to know the series number in advance.
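The byte count compared above is the product of the array shape and the item size. A minimal sketch of that arithmetic, assuming one series holds T=3 time points and C=4 channels of 512 x 512 pixels stored as 16-bit unsigned integers (the axis order is an assumption and does not affect the total):

```python
import numpy as np

# Assumed layout of one series: T=3, C=4, 512 x 512 pixels, 16-bit unsigned.
shape = (3, 4, 512, 512)
dtype = np.dtype(np.uint16)

# Number of elements times bytes per element.
expected_nbytes = int(np.prod(shape)) * dtype.itemsize
actual_nbytes = np.zeros(shape, dtype=dtype).nbytes

print(expected_nbytes, actual_nbytes)
```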

## AICSImageIO

In [None]:
from aicsimageio import AICSImage

i = AICSImage(img_void_tile)
# i = AICSImage(img_void_tile, reconstruct_mosaic=True)
# i_lif = AICSImage(lif)

In [None]:
i.ome_metadata.instruments[0].arcs[0]

In [None]:
lif_aics = AICSImage(slif)

In [None]:
lif_aics.metadata

In [None]:
i.ome_metadata

In [None]:
i.metadata.images[0].pixels.channels[0].light_source_settings.attenuation

In [None]:
i.scenes

In [None]:
i.get_dask_stack()

Mosaic stitching is not supported for tif files, so I will use my own function, which relies on the PositionXYZ metadata.

## dask_image

In [None]:
from dask_image.imread import imread

i = imread(img_void_tile)

In [None]:
i

Somehow it uses bioformats and can handle lif files. There is no mosaic support and no metadata, though.

**Pycroscopy** https://pypi.org/project/pycroscopy/ reads neither lif nor ome-tif at the moment.

**large-image[all]** failed to install.

**pyimagej**: does it need conda?

## bioio-bioformats

import bioio_ome_tiled_tiff

bioio_ome_tiled_tiff.Reader(str(img_void_tile))

TypeError: tile6_1.tif is not a tiled tiff. The python backend of the BioReader only supports OME tiled tiffs.

In [None]:
import bioio_bioformats

im = bioio_bioformats.Reader(img_void_tile)

In [None]:
im.ome_metadata.images[0].pixels.channels[2].light_source_settings

In [None]:
lif_bioio = bioio_bioformats.Reader(lif)

In [None]:
lif_bioio.physical_pixel_sizes

In [None]:
im.get_dask_stack()

In [None]:
im.ome_metadata.plates[0].wells[0]

In [None]:
i = bioio_bioformats.Reader(img_tile)
i.data.shape, i.dims

In [None]:
i.xarray_dask_data.attrs["processed"]

In [None]:
unp = i.xarray_dask_data.attrs["unprocessed"]
unp[:1000]

In [None]:
stk = i.get_dask_stack()

In [None]:
stk.A

## bfio

In [None]:
import bfio

bfio.BioReader(img_void_tile)

In [None]:
rdr = bfio.BioReader(img_void_tile)
%timeit i = rdr.read()
i = rdr.read()
i.shape

In [None]:
rdr.metadata

In [None]:
rdr.ps_x

In [None]:
rdr.close()

## PIMS

In [None]:
import pims

# %timeit fs = pims.Bioformats(img_void_tile)
fs = pims.Bioformats(img_void_tile)
fs.sizes

## PyOMETiff

In [None]:
import pyometiff

%timeit rdr = pyometiff.OMETIFFReader(fpath=img_void_tile)
rdr = pyometiff.OMETIFFReader(fpath=img_void_tile)

In [None]:
%timeit r = rdr.read()
res = rdr.read()

In [None]:
res[1]

In [None]:
pyometiff.OMETIFFReader._get_metadata_template()

## Final Note

I will keep:

0. Read
1. stitch
2. md_grouping

- impy
- napari.read
- pycromanager
- microscope
- python-microscopy