# Advanced data capture and sync configurations
Some data use cases require advanced configuration beyond the attributes accessible in the UI. You can use raw JSON to configure additional attributes for both data management and data capture. You can also configure data capture for remote parts.
## Cloud data retention

Configure how long your synced data remains stored in the cloud:

- Retain data up to a certain size (for example, 100GB) or for a specific length of time (for example, 14 days): set `retention_policy` at the resource level. See the `retention_policy` field in the data capture configuration attributes.
- Delete data captured by a machine when you delete the machine: control whether your cloud data is deleted when a machine or machine part is removed. See the `delete_data_on_part_deletion` field in the data management service configuration attributes.
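For illustration, the following sketch combines both retention options: a `retention_policy` on a camera component's data capture configuration and `delete_data_on_part_deletion` on the data management service. The component and service names, the 14-day window, and the capture method settings are all illustrative values, not defaults:

```json
{
  "components": [
    {
      "name": "cam",
      "api": "rdk:component:camera",
      "model": "webcam",
      "attributes": { "video_path": "video0" },
      "service_configs": [
        {
          "type": "data_manager",
          "attributes": {
            "capture_methods": [
              {
                "method": "GetImages",
                "capture_frequency_hz": 0.5,
                "additional_params": {}
              }
            ],
            "retention_policy": { "days": 14 }
          }
        }
      ]
    }
  ],
  "services": [
    {
      "name": "my-data-manager",
      "api": "rdk:service:data_manager",
      "model": "rdk:builtin:builtin",
      "attributes": {
        "sync_interval_mins": 5,
        "delete_data_on_part_deletion": true
      }
    }
  ]
}
```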
## Sync optimization

**Configurable sync threads:** You can control how many concurrent sync operations occur by adjusting the `maximum_num_sync_threads` setting. Higher values may improve throughput on more powerful hardware, but raising the value too high may introduce instability on resource-constrained devices.

**Wait time before syncing arbitrary files:** If you choose to sync arbitrary files (beyond those captured by the data management service), the `file_last_modified_millis` configuration attribute specifies how long a file must remain unmodified before the data manager considers it for syncing. The default is 10 seconds (10000 milliseconds).
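As a sketch, a data manager that syncs an arbitrary directory and waits 30 seconds before considering a file unmodified might look like this (the sync path and the 30000-millisecond value are illustrative):

```json
{
  "services": [
    {
      "name": "my-data-manager",
      "api": "rdk:service:data_manager",
      "model": "rdk:builtin:builtin",
      "attributes": {
        "additional_sync_paths": ["/home/user/data"],
        "file_last_modified_millis": 30000,
        "sync_interval_mins": 1
      }
    }
  ]
}
```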
## Advanced data management service configuration

To configure the data manager in JSON, see the following example configurations:

```json
{
  "components": [],
  "services": [
    {
      "name": "my-data-manager",
      "api": "rdk:service:data_manager",
      "model": "rdk:builtin:builtin",
      "attributes": {
        "sync_interval_mins": 1,
        "capture_dir": "",
        "tags": [],
        "capture_disabled": false,
        "sync_disabled": true,
        "delete_data_on_part_deletion": true,
        "delete_every_nth_when_disk_full": 5,
        "maximum_num_sync_threads": 250
      }
    }
  ]
}
```

```json
{
  "components": [],
  "services": [
    {
      "name": "my-data-manager",
      "api": "rdk:service:data_manager",
      "model": "rdk:builtin:builtin",
      "attributes": {
        "capture_dir": "",
        "tags": [],
        "additional_sync_paths": [],
        "sync_interval_mins": 3
      }
    }
  ]
}
```
The following attributes are available for the data management service:
You can edit the JSON directly by switching to JSON mode in the UI.
## Advanced data capture configuration

**Caution**

Avoid configuring data capture at higher rates than your hardware can handle, as this leads to performance degradation.
This example configuration captures data from the GetImages method of a camera:
```json
{
  "services": [
    ...,
    {
      "name": "data_manager",
      "api": "rdk:service:data_manager",
      "model": "rdk:builtin:builtin",
      "attributes": {
        "sync_interval_mins": 5,
        "capture_dir": "",
        "sync_disabled": false,
        "tags": []
      }
    }
  ],
  "remotes": [
    {
      ...
    }
  ],
  "components": [
    ...,
    {
      "service_configs": [
        {
          "type": "data_manager",
          "attributes": {
            "capture_methods": [
              {
                "capture_frequency_hz": 0.333,
                "disabled": false,
                "method": "GetImages",
                "additional_params": {
                  "reader_name": "cam1"
                }
              }
            ],
            "retention_policy": {
              "days": 5
            }
          }
        }
      ],
      "model": "webcam",
      "name": "cam",
      "api": "rdk:component:camera",
      "attributes": {
        "video_path": "video0"
      },
      "depends_on": [
        "local"
      ]
    },
    ...
  ]
}
```
This example configuration captures data from the `Readings` method of a temperature sensor and a Wi-Fi signal sensor:

```json
{
  "services": [
    {
      "attributes": {
        "capture_dir": "",
        "tags": [],
        "additional_sync_paths": [],
        "sync_interval_mins": 3
      },
      "name": "dm",
      "api": "rdk:service:data_manager",
      "model": "rdk:builtin:builtin"
    }
  ],
  "components": [
    {
      "api": "rdk:component:sensor",
      "model": "tmp36",
      "attributes": {
        "analog_reader": "temp",
        "num_readings": 15
      },
      "depends_on": [],
      "service_configs": [
        {
          "attributes": {
            "capture_methods": [
              {
                "capture_frequency_hz": 0.2,
                "cache_size_kb": 10,
                "additional_params": {},
                "method": "Readings"
              }
            ]
          },
          "type": "data_manager"
        }
      ],
      "name": "tmp36"
    },
    {
      "api": "rdk:component:sensor",
      "model": "wifi-rssi",
      "attributes": {},
      "service_configs": [
        {
          "type": "data_manager",
          "attributes": {
            "capture_methods": [
              {
                "additional_params": {},
                "method": "Readings",
                "capture_frequency_hz": 0.1,
                "cache_size_kb": 10
              }
            ]
          }
        }
      ],
      "name": "my-wifi-sensor"
    }
  ]
}
```
This example configuration captures data from the `CaptureAllFromCamera` method of the vision service:
```json
{
  "components": [
    {
      "name": "camera-1",
      "api": "rdk:component:camera",
      "model": "webcam",
      "attributes": {}
    }
  ],
  "services": [
    {
      "name": "vision-1",
      "api": "rdk:service:vision",
      "model": "mlmodel",
      "attributes": {
        "mlmodel_name": "my_mlmodel_service",
        "camera_name": "camera-1"
      },
      "service_configs": [
        {
          "type": "data_manager",
          "attributes": {
            "capture_methods": [
              {
                "method": "CaptureAllFromCamera",
                "capture_frequency_hz": 1,
                "additional_params": {
                  "mime_type": "image/jpeg",
                  "camera_name": "camera-1",
                  "min_confidence_score": "0.7"
                }
              }
            ]
          }
        }
      ]
    },
    {
      "name": "data_manager-1",
      "api": "rdk:service:data_manager",
      "model": "rdk:builtin:builtin",
      "attributes": {
        "sync_interval_mins": 0.1,
        "capture_dir": "",
        "tags": [],
        "additional_sync_paths": []
      }
    },
    {
      "name": "mlmodel-1",
      "api": "rdk:service:mlmodel",
      "model": "viam:mlmodel-tflite:tflite_cpu",
      "attributes": {}
    }
  ],
  "modules": [
    {
      "type": "registry",
      "name": "viam_tflite_cpu",
      "module_id": "viam:tflite_cpu",
      "version": "0.0.3"
    }
  ]
}
```
The following attributes are available for data capture configuration:
You can edit the JSON directly by switching to JSON mode in the UI.
## Capture data from remote parts

Viam supports data capture from resources on remote parts. For example, if a part does not run a Linux operating system, or lacks the storage or processing power to run `viam-server`, you can still process and capture data from that part's resources by adding it as a remote part.

Currently, you can only configure data capture from remote resources in your JSON configuration. To do so, you must explicitly add a `service_configs` entry for the data manager to the remote object in the `remotes` array. The `service_configs` array must contain an object with `"type": "data_manager"` and an `attributes` object with an array of `capture_methods`. Each capture method object contains the following fields:
| Key | Type | Description |
| --- | --- | --- |
| `name` | string | The fully qualified name of the part. Example: `"rdk:component:sensor/spacesensor"`. |
| `additional_params` | object | Varies based on the method. For example, `DoCommand` requires a `docommand_input` object containing the command to pass to `DoCommand`, and `GetImages` can optionally take a `filter_source_names` list of strings indicating which source names to return images from. |
| `disabled` | boolean | Whether data capture for the method is disabled. |
| `method` | string | Depends on the type of component or service. See Supported components and services. Note: for tabular data, Viam enforces a maximum size of 4MB for any single reading. |
| `capture_frequency_hz` | float | Frequency in hertz at which to capture data. For example, to capture a reading every 2 seconds, enter `0.5`. |
| `cache_size_kb` | float | `viam-micro-server` only. The maximum amount of storage (in kilobytes) allocated to a data collector. Default: 1 KB. |
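Putting these fields together, a remote with a data manager `service_configs` entry might look like the following sketch. The remote's name and address, and the capture method values, are illustrative; the `name` field reuses the fully qualified part name format from the table above:

```json
{
  "remotes": [
    {
      "name": "remote-1",
      "address": "remote-1.example.local:8080",
      "service_configs": [
        {
          "type": "data_manager",
          "attributes": {
            "capture_methods": [
              {
                "name": "rdk:component:sensor/spacesensor",
                "method": "Readings",
                "additional_params": {},
                "capture_frequency_hz": 1,
                "disabled": false
              }
            ]
          }
        }
      ]
    }
  ]
}
```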
## Capture directly to your own MongoDB cluster

You can configure direct capture of tabular data to a MongoDB instance alongside disk storage on your edge device. This can be useful for powering real-time dashboards before data is synced from the edge to the cloud. The MongoDB instance can be a locally running instance or a cluster in the cloud. Configure this using the `mongo_capture_config` attributes in your data manager service. You can configure data sync to a MongoDB instance separately from data sync to the Viam Cloud.

When `mongo_capture_config.uri` is configured, data capture attempts to connect to the configured MongoDB server and write captured tabular data to the configured `mongo_capture_config.database` and `mongo_capture_config.collection` (or their defaults if unconfigured) after enqueuing that data to be written to disk. If writes to MongoDB fail for any reason, data capture logs an error for each failed write and continues capturing. Failing to write to MongoDB doesn't affect capturing and syncing data to cloud storage, other than adding capture latency.
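As a sketch, a data manager writing captured tabular data to a local MongoDB instance might be configured as follows. The URI, database, and collection values are illustrative; only the `mongo_capture_config.uri`, `.database`, and `.collection` field names come from the description above:

```json
{
  "services": [
    {
      "name": "my-data-manager",
      "api": "rdk:service:data_manager",
      "model": "rdk:builtin:builtin",
      "attributes": {
        "capture_dir": "",
        "sync_interval_mins": 1,
        "mongo_capture_config": {
          "uri": "mongodb://127.0.0.1:27017",
          "database": "sensor_data",
          "collection": "readings"
        }
      }
    }
  ]
}
```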
**Caution**

- Capturing directly to MongoDB may write data to MongoDB that later fails to be written to disk (and therefore never gets synced to cloud storage).
- Capturing directly to MongoDB does not retry failed writes to MongoDB. As a consequence, it is NOT guaranteed that all captured data will be written to MongoDB. This can happen in cases such as MongoDB being inaccessible to `viam-server` or writes timing out.
- Capturing directly to MongoDB may reduce the maximum frequency at which data capture can capture data, due to the added latency of writing to MongoDB. If your use case needs to support very high capture rates, this feature may not be appropriate.