PicoStream
PicoStream is a high-performance Python application for streaming data from a PicoScope to an HDF5 file, with an optional, decoupled live visualization. It is designed for robust, high-speed data logging where data integrity is critical.
Features
- Robust Architecture: Separates data acquisition from disk I/O using a producer-consumer pattern with a large, shared memory buffer pool to prevent data loss.
- Zero-Risk Live Plotting: The plotter reads from the HDF5 file, not the live data stream. This ensures that a slow or crashing GUI cannot interfere with data acquisition.
- Efficient Visualization: Uses pyqtgraph and a Numba-accelerated min-max decimation algorithm to display large datasets with minimal CPU impact (a minimal sketch of the idea follows this list).
- Live-Only Mode: Optional buffer size limiting for pure real-time monitoring without generating large files.
- Flexible Data Reading: The included PicoStreamReader class allows for easy post-processing, including on-the-fly decimation ('mean' or 'min_max').
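Min-max decimation keeps the minimum and maximum of each bin of samples, so narrow peaks remain visible after downsampling. The NumPy sketch below is illustrative only; the function name is made up and this is not the package's Numba-accelerated implementation.

import numpy as np

def minmax_decimate(samples: np.ndarray, factor: int) -> np.ndarray:
    # Keep the min and max of each group of `factor` samples so that
    # peaks survive downsampling. Sketch only; PicoStream's internal,
    # Numba-accelerated routine may differ in details.
    n = (samples.size // factor) * factor      # drop the ragged tail
    bins = samples[:n].reshape(-1, factor)     # one row per bin
    out = np.empty(2 * bins.shape[0], dtype=samples.dtype)
    out[0::2] = bins.min(axis=1)               # even slots hold bin minima
    out[1::2] = bins.max(axis=1)               # odd slots hold bin maxima
    return out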
Installation
- pip install -e .
- Install the official PicoSDK from Pico Technology. On Linux this can take some additional work; see the AUR wiki for details.
Usage
The primary way to use the package is through the picostream command-line tool.
Acquisition
The following commands show different acquisition modes:
# Standard acquisition (saves all data)
picostream -s 62.5 -o my_data.hdf5 --plot
# Live-only mode (limited buffer)
picostream -s 62.5 --plot --max-buff-sec 60
Run picostream --help for a full list of options.
Live-Only Mode
For pure real-time monitoring without saving large amounts of data, use the --max-buff-sec option:
picostream -s 62.5 --plot --max-buff-sec 60
This limits the HDF5 file to the most recent 60 seconds of data, automatically overwriting older data as new samples arrive. It is well suited to continuous monitoring applications where only recent data needs to be visible.
Benefits:
- Prevents disk space issues during long acquisitions
- Maintains full plotting functionality
- File size stays constant after the initial buffer fill (see the size estimate below)
- No performance impact on acquisition
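To gauge how much disk the rolling buffer occupies, you can estimate it from the sample rate and buffer length. The sketch below assumes raw 16-bit samples (2 bytes each), which is an assumption about the stored format rather than something documented here.

# Rough size of the rolling buffer for the command above.
sample_rate_sps = 62.5e6      # -s 62.5 (MS/s)
buffer_seconds = 60           # --max-buff-sec 60
bytes_per_sample = 2          # assumption: raw 16-bit ADC counts
buffer_bytes = sample_rate_sps * buffer_seconds * bytes_per_sample
print(f"Rolling buffer is roughly {buffer_bytes / 1e9:.1f} GB")  # about 7.5 GB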
Viewing an Existing File
The plotter can be run as a standalone tool to view any compatible HDF5 file.
python -m picostream.dfplot /path/to/your/data.hdf5
Data Analysis with PicoStreamReader
The output HDF5 file contains raw ADC counts and metadata. The PicoStreamReader class in picostream.reader is the recommended way to read and process this data.
Example: Processing a File in Chunks
Here is an example of how to use PicoStreamReader to iterate through a large file and perform analysis without loading the entire dataset into memory.
import numpy as np
from picostream.reader import PicoStreamReader

# Use the reader as a context manager
with PicoStreamReader('my_data.hdf5') as reader:
    # Metadata is available as attributes after opening
    sample_rate_sps = 1e9 / reader.sample_interval_ns
    print(f"File contains {reader.num_samples:,} samples.")
    print(f"Sample rate: {sample_rate_sps / 1e6:.2f} MS/s")

    # Example: Iterate through the file with 10x decimation
    print("\nProcessing data with 10x 'min_max' decimation...")
    for times, voltages_mv in reader.get_block_iter(
        chunk_size=10_000_000, decimation_factor=10, decimation_mode='min_max'
    ):
        # The 'times' and 'voltages_mv' arrays are now decimated.
        # Process the smaller data chunk here.
        print(f" - Processed a chunk of {voltages_mv.size} decimated points.")

print("\nFinished processing.")
API Reference: PicoStreamReader
The PicoStreamReader provides a simple and efficient interface for accessing data.
Initialization
__init__(self, hdf5_path: str)
Initializes the reader. The file is opened and metadata is read when the object is used as a context manager.
Metadata Attributes
These attributes are populated from the HDF5 file's metadata when the reader is opened.
- num_samples: int: Total number of raw samples in the dataset.
- sample_interval_ns: float: The time interval between samples in nanoseconds.
- voltage_range_v: float: The configured voltage range (e.g., 20.0 for ±20 V).
- max_adc_val: int: The maximum ADC count value (e.g., 32767).
- analog_offset_v: float: The configured analog offset in volts.
- downsample_mode: str: The hardware downsampling mode used during acquisition ('average' or 'aggregate').
- hardware_downsample_ratio: int: The hardware downsampling ratio used.
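These attributes are also enough to scale raw ADC counts yourself if you bypass the reader's built-in conversion. The snippet below uses the conventional linear PicoScope scaling and is a hedged sketch; verify the exact scaling (including the sign of the offset) against the voltages returned by PicoStreamReader.

import numpy as np

def adc_counts_to_volts(counts: np.ndarray, reader) -> np.ndarray:
    # Illustrative linear scaling using the reader's metadata attributes.
    # Assumption: volts = counts / max_adc_val * voltage_range_v + analog_offset_v.
    scale = reader.voltage_range_v / reader.max_adc_val
    return counts.astype(np.float64) * scale + reader.analog_offset_v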
Data Access Methods
get_block_iter(self, chunk_size: int = 1_000_000, decimation_factor: int = 1, decimation_mode: str = "mean") -> Generator
Returns a generator that yields data blocks as (times, voltages) tuples for the entire dataset. This is the recommended method for processing large files.
- chunk_size: The number of raw samples to read from the file for each chunk.
- decimation_factor: The factor by which to decimate the data.
- decimation_mode: The decimation method ('mean' or 'min_max').
get_next_block(self, chunk_size: int, decimation_factor: int = 1, decimation_mode: str = "mean") -> Tuple | None
Retrieves the next sequential block of data. Returns None when the end of the file is reached. Use reset() to start over.
get_block(self, size: int, start: int = 0, decimation_factor: int = 1, decimation_mode: str = "mean") -> Tuple
Retrieves a specific block of data from the file.
- size: The number of raw samples to retrieve.
- start: The starting sample index.
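For example, to read roughly one second of data starting ten seconds into a file (a usage sketch; the decimation settings are arbitrary):

from picostream.reader import PicoStreamReader

with PicoStreamReader('my_data.hdf5') as reader:
    samples_per_second = int(round(1e9 / reader.sample_interval_ns))
    times, voltages_mv = reader.get_block(
        size=samples_per_second,        # ~1 s of raw samples
        start=10 * samples_per_second,  # skip the first 10 s
        decimation_factor=100,          # keep the returned arrays small
        decimation_mode='min_max',
    )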
reset(self) -> None
Resets the internal counter for get_next_block() to the beginning of the file.
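A minimal sketch of sequential reading with get_next_block() and reset(); the chunk size and decimation factor are arbitrary:

from picostream.reader import PicoStreamReader

with PicoStreamReader('my_data.hdf5') as reader:
    # Walk the file block by block; None signals the end of the dataset.
    while (block := reader.get_next_block(chunk_size=5_000_000,
                                          decimation_factor=10)) is not None:
        times, voltages_mv = block
        # Analyze the block here.

    # Rewind so a second pass starts from the beginning of the file.
    reader.reset()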
Changelog
Version 0.2.0
- New Feature: Added live-only mode with the --max-buff-sec option for buffer size limiting
- Enhancement: Improved the plotter to handle buffer resets gracefully
- Enhancement: Added total sample count tracking across buffer resets
- Enhancement: Skip verification step in live-only mode for better performance
- Documentation: Updated README with live-only mode usage examples
Version 0.1.0
- Initial release with core streaming and plotting functionality
Acknowledgements
This package began as a fork of JoshHarris2108/pico_streaming (which is unlicensed). I acknowledge and appreciate Josh's original idea/architecture.
