# PicoStream

PicoStream is a high-performance Python application for streaming data from a PicoScope to an HDF5 file, with an optional, decoupled live visualization. It is designed for robust, high-speed data logging where data integrity is critical.

## Features

- **Robust Architecture**: Separates data acquisition from disk I/O using a producer-consumer pattern with a large, shared memory buffer pool to prevent data loss.
- **Zero-Risk Live Plotting**: The plotter reads from the HDF5 file, not the live data stream. This ensures that a slow or crashing GUI cannot interfere with data acquisition.
- **Efficient Visualization**: Uses `pyqtgraph` and a Numba-accelerated min-max decimation algorithm to display large datasets with minimal CPU impact.
- **Live-Only Mode**: Optional buffer size limiting for pure real-time monitoring without generating large files.
- **Flexible Data Reading**: The included `PicoStreamReader` class allows for easy post-processing, including on-the-fly decimation ('mean' or 'min_max').
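
The producer-consumer pattern named in the first feature can be illustrated with a minimal sketch that uses only the standard library. This is not PicoStream's actual implementation: the names `acquire`, `write_to_disk`, `POOL_SIZE`, and `BUFFER_BYTES`, and the choice of two queues passing `bytearray` buffers between threads, are assumptions made for illustration.

```python
import queue
import threading

POOL_SIZE = 4        # hypothetical number of preallocated buffers
BUFFER_BYTES = 1024  # hypothetical size of each buffer
N_BLOCKS = 10        # number of blocks the fake producer emits

free_buffers = queue.Queue()    # buffers available to the producer
filled_buffers = queue.Queue()  # buffers waiting to be written
for _ in range(POOL_SIZE):
    free_buffers.put(bytearray(BUFFER_BYTES))

written = []  # stands in for the file on disk


def acquire():
    """Producer: fill free buffers and hand them to the consumer."""
    for i in range(N_BLOCKS):
        buf = free_buffers.get()  # waits here if the pool is exhausted
        buf[0] = i                # pretend the driver filled the buffer
        filled_buffers.put(buf)
    filled_buffers.put(None)      # sentinel: no more data


def write_to_disk():
    """Consumer: drain filled buffers, then return them to the pool."""
    while True:
        buf = filled_buffers.get()
        if buf is None:
            break
        written.append(buf[0])    # pretend to write the block
        free_buffers.put(buf)     # recycle the buffer


producer = threading.Thread(target=acquire)
consumer = threading.Thread(target=write_to_disk)
producer.start()
consumer.start()
producer.join()
consumer.join()
print(written)
```

In this sketch the producer simply waits when the pool is empty. Real acquisition hardware cannot wait, which is why a large pool matters: it absorbs temporary stalls in disk I/O before any data would have to be dropped.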

## Installation

1. From the repository root, install the package in editable mode: `pip install -e .`

2. Install the official PicoSDK from Pico Technology. Linux can take some additional work; see the AUR wiki for details.

## Usage

The primary way to use the package is through the `picostream` command-line tool.

### Acquisition

The following commands show different acquisition modes:

```bash
# Standard acquisition (saves all data)
picostream -s 62.5 -o my_data.hdf5 --plot

# Live-only mode (limited buffer)
picostream -s 62.5 --plot --max-buff-sec 60
```

Run `picostream --help` for a full list of options.

### Live-Only Mode

For pure real-time monitoring without saving large amounts of data, use the `--max-buff-sec` option:

```bash
picostream -s 62.5 --plot --max-buff-sec 60
```

This limits the HDF5 file to the last 60 seconds of data, automatically overwriting older data. It suits continuous monitoring applications where only recent data needs to be visible.

**Benefits:**
- Prevents disk space issues during long acquisitions
- Maintains full plotting functionality
- File size stays constant after initial buffer fill
- No performance impact on acquisition
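
To get a feel for how large the file becomes once the buffer has filled, the rough estimate below may help. It rests on two assumptions that are not confirmed here: that `-s 62.5` means 62.5 MS/s, and that each sample is stored as a 16-bit integer (suggested by the `max_adc_val` of 32767). It also ignores hardware downsampling and HDF5 overhead. The function `buffer_size_bytes` is hypothetical and not part of the package.

```python
def buffer_size_bytes(sample_rate_msps, max_buff_sec, bytes_per_sample=2):
    """Estimate the steady-state data size for a time-limited buffer.

    Assumes the rate is given in MS/s and samples are 2 bytes each.
    """
    samples = sample_rate_msps * 1e6 * max_buff_sec
    return int(samples * bytes_per_sample)


size = buffer_size_bytes(62.5, 60)
print(f"{size / 1e9:.1f} GB")  # 7.5 GB
```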

### Viewing an Existing File

The plotter can be run as a standalone tool to view any compatible HDF5 file.

```bash
python -m picostream.dfplot /path/to/your/data.hdf5
```

## Data Analysis with `PicoStreamReader`

The output HDF5 file contains raw ADC counts and metadata. The `PicoStreamReader` class in `picostream.reader` is the recommended way to read and process this data.

### Example: Processing a File in Chunks

Here is an example of how to use `PicoStreamReader` to iterate through a large file and perform analysis without loading the entire dataset into memory.

```python
import numpy as np
from picostream.reader import PicoStreamReader

# Use the reader as a context manager
with PicoStreamReader('my_data.hdf5') as reader:
    # Metadata is available as attributes after opening
    sample_rate_sps = 1e9 / reader.sample_interval_ns
    print(f"File contains {reader.num_samples:,} samples.")
    print(f"Sample rate: {sample_rate_sps / 1e6:.2f} MS/s")

    # Example: Iterate through the file with 10x decimation
    print("\nProcessing data with 10x 'min_max' decimation...")
    for times, voltages_mv in reader.get_block_iter(
        chunk_size=10_000_000, decimation_factor=10, decimation_mode='min_max'
    ):
        # The 'times' and 'voltages_mv' arrays are now decimated.
        # Process the smaller data chunk here.
        print(f"  - Processed a chunk of {voltages_mv.size} decimated points.")
        
    print("\nFinished processing.")
```
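
As a sketch of what "process the smaller data chunk" might look like, the helper below accumulates a running minimum, maximum, and mean across blocks without holding more than one block in memory. The function `running_stats` is hypothetical, and the synthetic `fake_blocks` list stands in for the `(times, voltages_mv)` tuples that `reader.get_block_iter(...)` yields.

```python
import numpy as np


def running_stats(blocks):
    """Accumulate min, max and mean over an iterable of (times, values) blocks."""
    total = 0.0
    count = 0
    vmin = np.inf
    vmax = -np.inf
    for _times, values in blocks:
        if values.size == 0:
            continue
        total += float(values.sum(dtype=np.float64))
        count += values.size
        vmin = min(vmin, float(values.min()))
        vmax = max(vmax, float(values.max()))
    mean = total / count if count else float("nan")
    return vmin, vmax, mean


# Synthetic blocks standing in for reader.get_block_iter(...)
fake_blocks = [
    (np.arange(3), np.array([1.0, 2.0, 3.0])),
    (np.arange(3, 5), np.array([-4.0, 8.0])),
]
print(running_stats(fake_blocks))
```

Note that a mean computed over `'min_max'` decimated data is not the mean of the raw signal, so for the mean it is probably safer to use `'mean'` decimation or no decimation at all.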

## API Reference: `PicoStreamReader`

The `PicoStreamReader` provides a simple and efficient interface for accessing data.

### Initialization

#### `__init__(self, hdf5_path: str)`
Initializes the reader. The file is opened and metadata is read when the object is used as a context manager.

### Metadata Attributes

These attributes are populated from the HDF5 file's metadata when the reader is opened.

-   `num_samples: int`: Total number of raw samples in the dataset.
-   `sample_interval_ns: float`: The time interval between samples in nanoseconds.
-   `voltage_range_v: float`: The configured voltage range (e.g., `20.0` for ±20V).
-   `max_adc_val: int`: The maximum ADC count value (e.g., 32767).
-   `analog_offset_v: float`: The configured analog offset in Volts.
-   `downsample_mode: str`: The hardware downsampling mode used during acquisition (`'average'` or `'aggregate'`).
-   `hardware_downsample_ratio: int`: The hardware downsampling ratio used.
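
The reader already returns voltages, but it can be useful to see how these attributes relate raw ADC counts to a voltage. The sketch below shows the conventional linear scaling. It is an assumption about the conversion, not a copy of the reader's code; `adc_to_mv` is a hypothetical helper, and it deliberately ignores `analog_offset_v` because the sign convention for the offset is not stated here.

```python
import numpy as np


def adc_to_mv(counts, voltage_range_v, max_adc_val):
    """Scale raw ADC counts linearly to millivolts (offset not applied)."""
    counts = np.asarray(counts, dtype=np.float64)
    return counts / max_adc_val * voltage_range_v * 1000.0


# Full-scale counts map to the ends of the configured range.
print(adc_to_mv([0, 32767, -32767], voltage_range_v=20.0, max_adc_val=32767))
```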

### Data Access Methods

#### `get_block_iter(self, chunk_size: int = 1_000_000, decimation_factor: int = 1, decimation_mode: str = "mean") -> Generator`
Returns a generator that yields data blocks as `(times, voltages)` tuples for the entire dataset. This is the recommended method for processing large files.
-   `chunk_size`: The number of *raw* samples to read from the file for each chunk.
-   `decimation_factor`: The factor by which to decimate the data.
-   `decimation_mode`: The decimation method (`'mean'` or `'min_max'`).
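
To make the difference from `'mean'` concrete, here is a minimal NumPy sketch of min-max decimation. It is one plausible formulation, not the package's Numba implementation: the output layout (a min followed by a max for each group) and the choice to drop trailing samples that do not fill a group are both assumptions, and `min_max_decimate` is a hypothetical name.

```python
import numpy as np


def min_max_decimate(values, factor):
    """Keep the min and max of each group of `factor` consecutive samples."""
    if factor < 1:
        raise ValueError("factor must be >= 1")
    values = np.asarray(values)
    n_groups = values.size // factor
    # Drop trailing samples that do not fill a complete group.
    trimmed = values[: n_groups * factor].reshape(n_groups, factor)
    out = np.empty(n_groups * 2, dtype=values.dtype)
    out[0::2] = trimmed.min(axis=1)
    out[1::2] = trimmed.max(axis=1)
    return out


print(min_max_decimate([1, 5, 2, 8, 0, 3, 9, 9], 4))  # [1 8 0 9]
```

Unlike averaging, this keeps short spikes visible in the decimated trace, which is why it suits plotting.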

#### `get_next_block(self, chunk_size: int, decimation_factor: int = 1, decimation_mode: str = "mean") -> Tuple | None`
Retrieves the next sequential block of data. Returns `None` when the end of the file is reached. Use `reset()` to start over.
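
A manual loop over `get_next_block()` might look like the sketch below. The helper `count_decimated_points` is hypothetical, and it assumes the returned tuple has the same `(times, voltages)` layout as the blocks from `get_block_iter()`.

```python
def count_decimated_points(reader, chunk_size=1_000_000, decimation_factor=1):
    """Drain a reader with get_next_block() and count the points returned."""
    reader.reset()  # start from the beginning of the file
    total = 0
    while True:
        block = reader.get_next_block(
            chunk_size, decimation_factor=decimation_factor
        )
        if block is None:  # end of file reached
            break
        _times, voltages = block
        total += len(voltages)
    return total
```

It would be called inside the context manager, for example `count_decimated_points(reader, chunk_size=10_000_000)` within a `with PicoStreamReader('my_data.hdf5') as reader:` block.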

#### `get_block(self, size: int, start: int = 0, decimation_factor: int = 1, decimation_mode: str = "mean") -> Tuple`
Retrieves a specific block of data from the file.
-   `size`: The number of *raw* samples to retrieve.
-   `start`: The starting sample index.
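
Because `size` and `start` are given in raw samples, selecting a time window means converting seconds to sample indices first. The sketch below does that with `sample_interval_ns`. The helper `window_to_indices` is hypothetical, and it assumes that sample index 0 corresponds to time zero.

```python
def window_to_indices(t_start_s, duration_s, sample_interval_ns):
    """Convert a time window in seconds to (start, size) in raw samples."""
    interval_s = sample_interval_ns * 1e-9
    start = int(round(t_start_s / interval_s))
    size = int(round(duration_s / interval_s))
    return start, size


# Example: 1 ms of data starting at 0.5 s, with a 16 ns sample interval.
start, size = window_to_indices(0.5, 0.001, 16.0)
print(start, size)
```

The result can then be passed on as `reader.get_block(size, start)`.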

#### `reset(self) -> None`
Resets the internal counter for `get_next_block()` to the beginning of the file.


## Changelog

### Version 0.2.0
- **New Feature**: Added live-only mode with `--max-buff-sec` option for buffer size limiting
- **Enhancement**: Improved plotter to handle buffer resets gracefully
- **Enhancement**: Added total sample count tracking across buffer resets
- **Enhancement**: Skip verification step in live-only mode for better performance
- **Documentation**: Updated README with live-only mode usage examples

### Version 0.1.0
- Initial release with core streaming and plotting functionality

## Acknowledgements

This package began as a fork of [JoshHarris2108/pico_streaming](https://github.com/JoshHarris2108/pico_streaming) (which is unlicensed). I acknowledge and appreciate Josh's original idea/architecture.