Diffstat (limited to 'README.md')
| -rw-r--r-- | README.md | 114 |
1 files changed, 114 insertions, 0 deletions
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..2bcca2e
--- /dev/null
+++ b/README.md
@@ -0,0 +1,114 @@
+# PicoStream
+
+PicoStream is a high-performance Python application for streaming data from a PicoScope to an HDF5 file, with an optional, decoupled live visualization. It is designed for robust, high-speed data logging where data integrity is critical.
+
+## Features
+
+- **Robust Architecture**: Separates data acquisition from disk I/O using a producer-consumer pattern with a large, shared memory buffer pool to prevent data loss.
+- **Zero-Risk Live Plotting**: The plotter reads from the HDF5 file, not the live data stream. This ensures that a slow or crashing GUI cannot interfere with data acquisition.
+- **Efficient Visualization**: Uses `pyqtgraph` and a Numba-accelerated min-max decimation algorithm to display large datasets with minimal CPU impact.
+- **Flexible Data Reading**: The included `PicoStreamReader` class allows for easy post-processing, including on-the-fly decimation ('mean' or 'min_max').
+
+## Installation
+
+1. `pip install -e .`
+
+2. Install the official PicoSDK from Pico Technology. Linux can take some additional work, see the AUR wiki for details.
+
+## Usage
+
+The primary way to use the package is through the `picostream` command-line tool.
+
+### Acquisition
+
+The following command starts an acquisition at 62.5 MS/s and saves the data to `my_data.hdf5` with a live plot.
+
+```bash
+picostream -s 62.5 -o my_data.hdf5 --plot
+```
+
+Run `picostream --help` for a full list of options.
+
+### Viewing an Existing File
+
+The plotter can be run as a standalone tool to view any compatible HDF5 file.
+
+```bash
+python -m picostream.dfplot /path/to/your/data.hdf5
+```
+
+## Data Analysis with `PicoStreamReader`
+
+The output HDF5 file contains raw ADC counts and metadata. The `PicoStreamReader` class in `picostream.reader` is the recommended way to read and process this data.
+
+### Example: Processing a File in Chunks
+
+Here is an example of how to use `PicoStreamReader` to iterate through a large file and perform analysis without loading the entire dataset into memory.
+
+```python
+import numpy as np
+from picostream.reader import PicoStreamReader
+
+# Use the reader as a context manager
+with PicoStreamReader('my_data.hdf5') as reader:
+    # Metadata is available as attributes after opening
+    sample_rate_sps = 1e9 / reader.sample_interval_ns
+    print(f"File contains {reader.num_samples:,} samples.")
+    print(f"Sample rate: {sample_rate_sps / 1e6:.2f} MS/s")
+
+    # Example: Iterate through the file with 10x decimation
+    print("\nProcessing data with 10x 'min_max' decimation...")
+    for times, voltages_mv in reader.get_block_iter(
+        chunk_size=10_000_000, decimation_factor=10, decimation_mode='min_max'
+    ):
+        # The 'times' and 'voltages_mv' arrays are now decimated.
+        # Process the smaller data chunk here.
+        print(f" - Processed a chunk of {voltages_mv.size} decimated points.")
+
+    print("\nFinished processing.")
+```
+
+## API Reference: `PicoStreamReader`
+
+The `PicoStreamReader` provides a simple and efficient interface for accessing data.
+
+### Initialization
+
+#### `__init__(self, hdf5_path: str)`
+Initializes the reader. The file is opened and metadata is read when the object is used as a context manager.
+
+### Metadata Attributes
+
+These attributes are populated from the HDF5 file's metadata when the reader is opened.
+
+- `num_samples: int`: Total number of raw samples in the dataset.
+- `sample_interval_ns: float`: The time interval between samples in nanoseconds.
+- `voltage_range_v: float`: The configured voltage range (e.g., `20.0` for ±20V).
+- `max_adc_val: int`: The maximum ADC count value (e.g., 32767).
+- `analog_offset_v: float`: The configured analog offset in Volts.
+- `downsample_mode: str`: The hardware downsampling mode used during acquisition (`'average'` or `'aggregate'`).
+- `hardware_downsample_ratio: int`: The hardware downsampling ratio used.
+
+### Data Access Methods
+
+#### `get_block_iter(self, chunk_size: int = 1_000_000, decimation_factor: int = 1, decimation_mode: str = "mean") -> Generator`
+Returns a generator that yields data blocks as `(times, voltages)` tuples for the entire dataset. This is the recommended method for processing large files.
+- `chunk_size`: The number of *raw* samples to read from the file for each chunk.
+- `decimation_factor`: The factor by which to decimate the data.
+- `decimation_mode`: The decimation method (`'mean'` or `'min_max'`).
+
+#### `get_next_block(self, chunk_size: int, decimation_factor: int = 1, decimation_mode: str = "mean") -> Tuple | None`
+Retrieves the next sequential block of data. Returns `None` when the end of the file is reached. Use `reset()` to start over.
+
+#### `get_block(self, size: int, start: int = 0, decimation_factor: int = 1, decimation_mode: str = "mean") -> Tuple`
+Retrieves a specific block of data from the file.
+- `size`: The number of *raw* samples to retrieve.
+- `start`: The starting sample index.
+
+#### `reset(self) -> None`
+Resets the internal counter for `get_next_block()` to the beginning of the file.
+
+
+## Acknowledgements
+
+This package began as a fork of [JoshHarris2108/pico_streaming](https://github.com/JoshHarris2108/pico_streaming) (which is unlicensed). I acknowledge and appreciate Josh's original idea/architecture.
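
Note (not part of the commit): the README names two decimation modes, `'mean'` and `'min_max'`, without showing what they compute. The sketch below illustrates the general technique in plain Python. It is not PicoStream's Numba implementation. The `decimate` helper, its signature, the dropping of a trailing partial window, and the min-then-max output order are all assumptions made for illustration.

```python
from statistics import fmean


def decimate(samples, factor, mode="mean"):
    """Hypothetical helper (not part of picostream): reduce `samples` by `factor`."""
    if factor < 1:
        raise ValueError("factor must be >= 1")
    # Assumption: a trailing partial window is dropped rather than padded.
    usable = (len(samples) // factor) * factor
    windows = [samples[i:i + factor] for i in range(0, usable, factor)]
    if mode == "mean":
        # One averaged point per window; narrow spikes are smoothed away.
        return [fmean(w) for w in windows]
    if mode == "min_max":
        # Two points per window (min, then max), so narrow spikes stay visible.
        out = []
        for w in windows:
            out.extend((min(w), max(w)))
        return out
    raise ValueError(f"unknown mode: {mode!r}")


# Nine raw ADC counts, factor 4: two full windows, the last sample is dropped.
raw = [0, 1, 9, 1, 0, 0, -7, 0, 5]
mean_out = decimate(raw, 4, "mean")        # [2.75, -1.75]
minmax_out = decimate(raw, 4, "min_max")   # [0, 9, -7, 0]
print(mean_out, minmax_out)
```

With `'mean'` the spike of 9 is averaged down to 2.75, while `'min_max'` keeps both extremes of each window at the cost of emitting two points per window. This is the usual reason a plotter prefers min-max decimation for display.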
