# PicoStream
PicoStream is a high-performance Python application for streaming data from a PicoScope to an HDF5 file, with an optional, decoupled live visualization. It is designed for robust, high-speed data logging where data integrity is critical.
## Features
- **Robust Architecture**: Separates data acquisition from disk I/O using a producer-consumer pattern with a large, shared memory buffer pool to prevent data loss.
- **Zero-Risk Live Plotting**: The plotter reads from the HDF5 file, not the live data stream. This ensures that a slow or crashing GUI cannot interfere with data acquisition.
- **Efficient Visualization**: Uses `pyqtgraph` and a Numba-accelerated min-max decimation algorithm to display large datasets with minimal CPU impact.
- **Flexible Data Reading**: The included `PicoStreamReader` class allows for easy post-processing, including on-the-fly decimation ('mean' or 'min_max').
## Installation
1. `pip install -e .`
2. Install the official PicoSDK from Pico Technology. Installation on Linux can take some additional work; see the AUR wiki for details.
## Usage
The primary way to use the package is through the `picostream` command-line tool.
### Acquisition
The following command starts an acquisition at 62.5 MS/s and saves the data to `my_data.hdf5` with a live plot.
```bash
picostream -s 62.5 -o my_data.hdf5 --plot
```
Run `picostream --help` for a full list of options.
### Viewing an Existing File
The plotter can be run as a standalone tool to view any compatible HDF5 file.
```bash
python -m picostream.dfplot /path/to/your/data.hdf5
```
## Data Analysis with `PicoStreamReader`
The output HDF5 file contains raw ADC counts and metadata. The `PicoStreamReader` class in `picostream.reader` is the recommended way to read and process this data.
### Example: Processing a File in Chunks
Here is an example of how to use `PicoStreamReader` to iterate through a large file and perform analysis without loading the entire dataset into memory.
```python
from picostream.reader import PicoStreamReader

# Use the reader as a context manager
with PicoStreamReader('my_data.hdf5') as reader:
    # Metadata is available as attributes after opening
    sample_rate_sps = 1e9 / reader.sample_interval_ns
    print(f"File contains {reader.num_samples:,} samples.")
    print(f"Sample rate: {sample_rate_sps / 1e6:.2f} MS/s")

    # Iterate through the file with 10x decimation
    print("\nProcessing data with 10x 'min_max' decimation...")
    for times, voltages_mv in reader.get_block_iter(
        chunk_size=10_000_000, decimation_factor=10, decimation_mode='min_max'
    ):
        # The 'times' and 'voltages_mv' arrays are now decimated.
        # Process the smaller data chunk here.
        print(f" - Processed a chunk of {voltages_mv.size} decimated points.")

print("\nFinished processing.")
```
## API Reference: `PicoStreamReader`
The `PicoStreamReader` provides a simple and efficient interface for accessing data.
### Initialization
#### `__init__(self, hdf5_path: str)`
Initializes the reader. The file is opened and metadata is read when the object is used as a context manager.
### Metadata Attributes
These attributes are populated from the HDF5 file's metadata when the reader is opened.
- `num_samples: int`: Total number of raw samples in the dataset.
- `sample_interval_ns: float`: The time interval between samples in nanoseconds.
- `voltage_range_v: float`: The configured voltage range (e.g., `20.0` for ±20V).
- `max_adc_val: int`: The maximum ADC count value (e.g., 32767).
- `analog_offset_v: float`: The configured analog offset in Volts.
- `downsample_mode: str`: The hardware downsampling mode used during acquisition (`'average'` or `'aggregate'`).
- `hardware_downsample_ratio: int`: The hardware downsampling ratio used.
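These attributes are enough to convert raw ADC counts to physical units yourself if you read the HDF5 dataset directly. The sketch below uses a typical PicoScope scaling (counts normalized by `max_adc_val`, scaled to the range, plus the analog offset); the exact formula `PicoStreamReader` applies internally may differ, and the metadata values shown are hypothetical.

```python
import numpy as np

def adc_to_volts(adc_counts, max_adc_val, voltage_range_v, analog_offset_v=0.0):
    # Typical PicoScope scaling: counts are a fraction of full scale,
    # mapped onto the configured range, then shifted by the analog offset.
    return adc_counts.astype(np.float64) / max_adc_val * voltage_range_v + analog_offset_v

# Hypothetical metadata values for illustration (±20 V range, 16-bit-style counts):
counts = np.array([0, 16384, 32767], dtype=np.int16)
volts = adc_to_volts(counts, max_adc_val=32767, voltage_range_v=20.0)
```

When using the reader's own methods you get voltages back directly, so this conversion is only needed for raw access.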
### Data Access Methods
#### `get_block_iter(self, chunk_size: int = 1_000_000, decimation_factor: int = 1, decimation_mode: str = "mean") -> Generator`
Returns a generator that yields data blocks as `(times, voltages)` tuples for the entire dataset. This is the recommended method for processing large files.
- `chunk_size`: The number of *raw* samples to read from the file for each chunk.
- `decimation_factor`: The factor by which to decimate the data.
- `decimation_mode`: The decimation method (`'mean'` or `'min_max'`).
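The two decimation modes can be sketched in plain NumPy. This is only a conceptual reference for what the modes compute, not the package's implementation (which is Numba-accelerated and may handle edge cases differently); note that `'min_max'` yields two points per group, so it halves the effective decimation factor.

```python
import numpy as np

def decimate(samples, factor, mode="mean"):
    # Drop trailing samples that don't fill a complete group, then
    # reduce each group of `factor` samples to one (mean) or two
    # (min + max, which preserves peaks) output points.
    n = (len(samples) // factor) * factor
    groups = samples[:n].reshape(-1, factor)
    if mode == "mean":
        return groups.mean(axis=1)
    out = np.empty(2 * groups.shape[0], dtype=samples.dtype)
    out[0::2] = groups.min(axis=1)  # even slots: group minima
    out[1::2] = groups.max(axis=1)  # odd slots: group maxima
    return out

x = np.array([1, 5, 3, 2, 8, 4], dtype=float)
means = decimate(x, 3, "mean")        # one average per group of 3
extrema = decimate(x, 3, "min_max")   # interleaved min/max per group
```

`'min_max'` is the better choice for plotting, since averaging can hide short spikes that min/max pairs retain.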
#### `get_next_block(self, chunk_size: int, decimation_factor: int = 1, decimation_mode: str = "mean") -> Tuple | None`
Retrieves the next sequential block of data. Returns `None` when the end of the file is reached. Use `reset()` to start over.
#### `get_block(self, size: int, start: int = 0, decimation_factor: int = 1, decimation_mode: str = "mean") -> Tuple`
Retrieves a specific block of data from the file.
- `size`: The number of *raw* samples to retrieve.
- `start`: The starting sample index.
#### `reset(self) -> None`
Resets the internal counter for `get_next_block()` to the beginning of the file.
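The `None` sentinel and `reset()` contract can be illustrated with a minimal in-memory stand-in. The `_ToyReader` class below is hypothetical and not part of the package; it only mirrors the sequential-read semantics described above.

```python
import numpy as np

class _ToyReader:
    """Stand-in mirroring the get_next_block()/reset() contract,
    backed by an in-memory array instead of an HDF5 file."""
    def __init__(self, data):
        self._data = data
        self._pos = 0

    def get_next_block(self, chunk_size):
        if self._pos >= len(self._data):
            return None  # end of data: mirrors the reader returning None
        block = self._data[self._pos:self._pos + chunk_size]
        self._pos += chunk_size
        return block

    def reset(self):
        self._pos = 0  # next get_next_block() starts from the beginning

reader = _ToyReader(np.arange(10))
block_sizes = []
while (block := reader.get_next_block(4)) is not None:
    block_sizes.append(len(block))
# block_sizes == [4, 4, 2]
reader.reset()
```

The same `while ... is not None` loop shape works with `PicoStreamReader.get_next_block()`, though `get_block_iter()` is usually more convenient for whole-file passes.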
## Acknowledgements
This package began as a fork of [JoshHarris2108/pico_streaming](https://github.com/JoshHarris2108/pico_streaming) (which is unlicensed). I acknowledge and appreciate Josh's original idea/architecture.