# PicoStream

PicoStream is a high-performance Python application for streaming data from a PicoScope to an HDF5 file, with an optional, decoupled live visualization. It is designed for robust, high-speed data logging where data integrity is critical.

## Features

- **Robust Architecture**: Separates data acquisition from disk I/O using a producer-consumer pattern with a large, shared memory buffer pool to prevent data loss.
- **Zero-Risk Live Plotting**: The plotter reads from the HDF5 file, not the live data stream. This ensures that a slow or crashing GUI cannot interfere with data acquisition.
- **Efficient Visualization**: Uses `pyqtgraph` and a Numba-accelerated min-max decimation algorithm to display large datasets with minimal CPU impact.
- **Flexible Data Reading**: The included `PicoStreamReader` class allows for easy post-processing, including on-the-fly decimation (`'mean'` or `'min_max'`).

## Installation

1. `pip install -e .`
2. Install the official PicoSDK from Pico Technology. Installation on Linux can take some additional work; see the AUR wiki for details.

## Usage

The primary way to use the package is through the `picostream` command-line tool.

### Acquisition

The following command starts an acquisition at 62.5 MS/s and saves the data to `my_data.hdf5` with a live plot.

```bash
picostream -s 62.5 -o my_data.hdf5 --plot
```

Run `picostream --help` for a full list of options.

### Viewing an Existing File

The plotter can be run as a standalone tool to view any compatible HDF5 file.

```bash
python -m picostream.dfplot /path/to/your/data.hdf5
```

## Data Analysis with `PicoStreamReader`

The output HDF5 file contains raw ADC counts and metadata. The `PicoStreamReader` class in `picostream.reader` is the recommended way to read and process this data.

### Example: Processing a File in Chunks

Here is an example of how to use `PicoStreamReader` to iterate through a large file and perform analysis without loading the entire dataset into memory.
```python
from picostream.reader import PicoStreamReader

# Use the reader as a context manager
with PicoStreamReader('my_data.hdf5') as reader:
    # Metadata is available as attributes after opening
    sample_rate_sps = 1e9 / reader.sample_interval_ns
    print(f"File contains {reader.num_samples:,} samples.")
    print(f"Sample rate: {sample_rate_sps / 1e6:.2f} MS/s")

    # Example: Iterate through the file with 10x decimation
    print("\nProcessing data with 10x 'min_max' decimation...")
    for times, voltages_mv in reader.get_block_iter(
        chunk_size=10_000_000,
        decimation_factor=10,
        decimation_mode='min_max'
    ):
        # The 'times' and 'voltages_mv' arrays are now decimated.
        # Process the smaller data chunk here.
        print(f"  - Processed a chunk of {voltages_mv.size} decimated points.")

print("\nFinished processing.")
```

## API Reference: `PicoStreamReader`

The `PicoStreamReader` provides a simple and efficient interface for accessing data.

### Initialization

#### `__init__(self, hdf5_path: str)`

Initializes the reader. The file is opened and metadata is read when the object is used as a context manager.

### Metadata Attributes

These attributes are populated from the HDF5 file's metadata when the reader is opened.

- `num_samples: int`: Total number of raw samples in the dataset.
- `sample_interval_ns: float`: The time interval between samples, in nanoseconds.
- `voltage_range_v: float`: The configured voltage range (e.g., `20.0` for ±20 V).
- `max_adc_val: int`: The maximum ADC count value (e.g., 32767).
- `analog_offset_v: float`: The configured analog offset, in volts.
- `downsample_mode: str`: The hardware downsampling mode used during acquisition (`'average'` or `'aggregate'`).
- `hardware_downsample_ratio: int`: The hardware downsampling ratio used.
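The metadata attributes above are also enough to rescale raw ADC counts yourself if you bypass the reader (for example, when opening the file directly with `h5py`). The helper below is a minimal sketch assuming the conventional PicoScope scaling (`counts / max_adc_val * voltage_range_v + analog_offset_v`); it is an illustration, not the package's exact implementation, so verify it against `picostream.reader` before relying on it.

```python
import numpy as np

def adc_to_volts(adc_counts, voltage_range_v, max_adc_val, analog_offset_v=0.0):
    """Hypothetical helper: convert raw ADC counts to volts.

    Assumes the conventional PicoScope scaling, where the maximum ADC
    count corresponds to the top of the configured voltage range and
    the analog offset is added back afterwards.
    """
    counts = np.asarray(adc_counts, dtype=np.float64)
    return counts / max_adc_val * voltage_range_v + analog_offset_v

# A full-scale count maps to the top of the configured range:
volts = adc_to_volts([0, 32767], voltage_range_v=20.0, max_adc_val=32767)
```

Note that the datasets written by PicoStream already pass through this conversion when read via `PicoStreamReader`, which yields voltages rather than counts.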
### Data Access Methods

#### `get_block_iter(self, chunk_size: int = 1_000_000, decimation_factor: int = 1, decimation_mode: str = "mean") -> Generator`

Returns a generator that yields data blocks as `(times, voltages)` tuples for the entire dataset. This is the recommended method for processing large files.

- `chunk_size`: The number of *raw* samples to read from the file for each chunk.
- `decimation_factor`: The factor by which to decimate the data.
- `decimation_mode`: The decimation method (`'mean'` or `'min_max'`).

#### `get_next_block(self, chunk_size: int, decimation_factor: int = 1, decimation_mode: str = "mean") -> Tuple | None`

Retrieves the next sequential block of data. Returns `None` when the end of the file is reached. Use `reset()` to start over.

#### `get_block(self, size: int, start: int = 0, decimation_factor: int = 1, decimation_mode: str = "mean") -> Tuple`

Retrieves a specific block of data from the file.

- `size`: The number of *raw* samples to retrieve.
- `start`: The starting sample index.

#### `reset(self) -> None`

Resets the internal counter for `get_next_block()` to the beginning of the file.

## Acknowledgements

This package began as a fork of [JoshHarris2108/pico_streaming](https://github.com/JoshHarris2108/pico_streaming) (which is unlicensed). I acknowledge and appreciate Josh's original idea/architecture.