5.3. Forecasting Dataset Format
This guide explains how to format datasets for time series forecasting tasks.
5.3.1. Directory Structure
Forecasting uses the same structure as regression:
my_dataset/
├── files/ # MUST be named "files"
│ ├── sequence1.csv
│ ├── sequence2.csv
│ └── sequenceN.csv
└── annotations/ # Required
    ├── instances_train_list.txt
    └── instances_val_list.txt
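Before training, it can help to confirm that a dataset folder matches this layout. The following is a minimal sketch, not part of the toolchain: `check_dataset_layout` and `REQUIRED_LISTS` are hypothetical names, and the check only looks for the `files` directory, at least one CSV file in it, and the two annotation lists shown above.

```python
from pathlib import Path

# Annotation list names taken from the directory tree above.
REQUIRED_LISTS = ("instances_train_list.txt", "instances_val_list.txt")


def check_dataset_layout(root):
    """Return a list of problems found in the dataset folder (empty if none)."""
    root = Path(root)
    problems = []

    files_dir = root / "files"
    if not files_dir.is_dir():
        problems.append('missing "files" directory')
    elif not any(files_dir.glob("*.csv")):
        problems.append('no CSV files found in "files"')

    for name in REQUIRED_LISTS:
        if not (root / "annotations" / name).is_file():
            problems.append(f"missing annotations/{name}")

    return problems
```

Calling `check_dataset_layout('/path/to/my_dataset')` returns an empty list when the layout matches, and one message per missing item otherwise.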
5.3.2. Data File Format
Each CSV file stores one variable (feature) per column and one time step per row. The configuration specifies which columns are used as inputs and which are forecast.
Example: Temperature Forecasting
ambient,coolant,current,pm_temp
19.85,18.81,2.28,22.93
19.85,18.79,2.28,22.94
19.85,18.79,2.28,22.94
19.85,18.77,2.28,22.94
...
In this example:
Column 0 (ambient): Input feature
Column 1 (coolant): Input feature
Column 2 (current): Input feature
Column 3 (pm_temp): Target to forecast
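To make the column roles concrete, the sketch below parses the sample rows with Python's standard `csv` module and separates the input columns from the target column. `read_columns` is a hypothetical helper written for this illustration; the toolchain performs its own loading based on the configuration.

```python
import csv
import io

# The sample rows from the temperature forecasting example.
SAMPLE = """ambient,coolant,current,pm_temp
19.85,18.81,2.28,22.93
19.85,18.79,2.28,22.94
19.85,18.79,2.28,22.94
19.85,18.77,2.28,22.94
"""


def read_columns(text):
    """Parse CSV text with a header row into {column_name: [float, ...]}."""
    reader = csv.DictReader(io.StringIO(text))
    columns = {name: [] for name in reader.fieldnames}
    for row in reader:
        for name, value in row.items():
            columns[name].append(float(value))
    return columns


columns = read_columns(SAMPLE)
# Columns 0-2 are input features, column 3 is the variable to forecast.
inputs = {name: columns[name] for name in ("ambient", "coolant", "current")}
target = columns["pm_temp"]
```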
5.3.3. Key Difference from Regression
Note
In regression, the target is the current value of the last column, averaged
across the window. In forecasting, the target is a future value of the
specified variable at forecast_horizon steps ahead. This means regression
answers “what is the target value right now?” while forecasting answers “what
will this variable be in the future?”
Regression: Target is a separate value for each window (in last column)
Forecasting: Target is a future value of an existing variable
Regression: Input [t0...t3] → Predict separate_target_avg
Forecasting: Input [t0, t1, t2] → Predict variable[t3]
Unlike regression where the target column is always the last column, forecasting
requires you to explicitly specify which variable(s) to predict via the
target_variables configuration parameter.
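The difference between the two targets can be sketched in a few lines of Python. Both functions are hypothetical illustrations of the description above, not toolchain code. The forecasting version assumes the target is the single value `forecast_horizon` steps after the last input row, which matches the `[t0, t1, t2]` → `variable[t3]` example for a horizon of 1.

```python
def regression_target(target_column, start, frame_size):
    """Average of the target column over the rows of one window."""
    window = target_column[start:start + frame_size]
    return sum(window) / len(window)


def forecasting_target(variable, start, frame_size, forecast_horizon):
    """Value of the variable forecast_horizon steps after the window ends."""
    return variable[start + frame_size + forecast_horizon - 1]


series = [10.0, 11.0, 12.0, 13.0, 14.0]
print(regression_target(series, 0, 4))      # mean over rows t0..t3
print(forecasting_target(series, 0, 3, 1))  # value at t3, one step past t2
```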
5.3.4. Configuration
dataset:
  enable: True
  dataset_name: 'my_forecast_data'
  input_data_path: '/path/to/my_dataset'
data_processing_feature_extraction:
  data_proc_transforms: ['SimpleWindow']
  frame_size: 3  # Lookback (use 3 past values)
  forecast_horizon: 1  # Predict 1 step ahead
  stride_size: 0.4
  # Specify columns by index or name
  variables: [0, 3]  # Use columns 0 and 3 as inputs
  target_variables: [3]  # Forecast column 3
training:
  model_name: 'FCST_LSTM8'
  output_int: False  # Required for forecasting!
5.3.5. Variable Specification Options
Note
target_variables can be specified by column index (0-based, after any
time column is removed) or by column name (if the CSV has a header row).
You can also mix indices and names, though using one form consistently is
recommended. When using indices, remember that any column whose name contains
“time” is dropped first, so indices refer to the columns after that removal.
By Column Index (0-based, after time column removal):
variables: [0, 3] # Use columns 0 and 3
target_variables: [3] # Forecast column 3
By Column Name (requires CSV header):
variables: ['ambient', 'pm_temp']
target_variables: ['pm_temp']
Multiple Targets (forecast several variables):
variables: [0, 1, 2, 3]
target_variables: [2, 3] # Forecast columns 2 and 3
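The index rule is easiest to see with a file that has a time column. The sketch below is a hypothetical helper, `resolve_columns`, that mimics the behavior described in the note: it drops columns whose name contains "time", then maps each index or name to a column name. Matching "time" case-insensitively is an assumption of this sketch, as is the example header.

```python
def resolve_columns(header, spec):
    """Map a list of indices and/or names to column names.

    Assumption: columns whose name contains "time" (case-insensitive here)
    are dropped first, and indices count the remaining columns from 0.
    """
    kept = [name for name in header if "time" not in name.lower()]
    resolved = []
    for item in spec:
        if isinstance(item, int):
            resolved.append(kept[item])
        elif item in kept:
            resolved.append(item)
        else:
            raise KeyError(f"unknown column: {item!r}")
    return resolved


# Hypothetical header with a leading time column.
header = ["time", "ambient", "coolant", "current", "pm_temp"]
print(resolve_columns(header, [0, 3]))                  # ['ambient', 'pm_temp']
print(resolve_columns(header, ["ambient", "pm_temp"]))  # ['ambient', 'pm_temp']
```

Because the time column is removed before indexing, index 0 refers to `ambient` here, not to `time`.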
5.3.6. Windowing Behavior
With frame_size=3 and forecast_horizon=1:
Data: [v0, v1, v2, v3, v4, v5, v6, ...]
Window 1: Input [v0, v1, v2] → Output [v3]
Window 2: Input [v1, v2, v3] → Output [v4]
Window 3: Input [v2, v3, v4] → Output [v5]
...
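The illustration above can be reproduced with a short generator. This is a sketch under stated assumptions rather than the toolchain's implementation: `make_windows` and its `step` parameter are hypothetical, the window advances one row at a time as in the illustration, and the output is the single value `forecast_horizon` steps after the last input. How `stride_size` maps to rows is not covered here.

```python
def make_windows(values, frame_size, forecast_horizon, step=1):
    """Yield (input_window, target) pairs from one sequence.

    `step` (rows between window starts) is a hypothetical parameter; the
    illustration advances by one row per window.
    """
    span = frame_size + forecast_horizon
    for start in range(0, len(values) - span + 1, step):
        window = values[start:start + frame_size]
        target = values[start + span - 1]
        yield window, target


data = ["v0", "v1", "v2", "v3", "v4", "v5", "v6"]
for window, target in make_windows(data, frame_size=3, forecast_horizon=1):
    print(window, "->", target)
```

With seven values, a lookback of 3, and a horizon of 1, the loop produces four pairs, the first three of which match Windows 1 to 3 above.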
5.3.7. Complete Example
Dataset structure:
pmsm_temp_forecast/
├── files/
│ ├── profile_10.csv
│ ├── profile_11.csv
│ └── profile_12.csv
└── annotations/
    ├── instances_train_list.txt
    └── instances_val_list.txt
profile_10.csv:
ambient,coolant,u_d,u_q,i_a,pm
19.850,18.815,1.499,0.032,2.281,22.936
19.850,18.793,1.542,-0.092,2.281,22.941
19.850,18.790,1.456,0.081,2.281,22.944
...
config.yaml:
common:
  task_type: 'generic_timeseries_forecasting'
  target_device: 'F28P55'
dataset:
  dataset_name: 'pmsm_temp'
  input_data_path: '/data/pmsm_temp_forecast'
data_processing_feature_extraction:
  data_proc_transforms: ['SimpleWindow']
  frame_size: 3
  forecast_horizon: 1
  stride_size: 0.4
  variables: ['ambient', 'pm']  # Use ambient and pm as inputs
  target_variables: ['pm']  # Forecast pm temperature
training:
  model_name: 'FCST_LSTM8'
  output_int: False
5.3.8. Important Notes
Warning
output_int must be False for forecasting
Feature extraction (FFT, wavelets) is not supported
The target variable should typically be included in the input variables
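Two of these notes can be checked mechanically once the configuration has been loaded into nested dictionaries. The sketch below is a hypothetical pre-flight check, not a toolchain feature. It assumes the section and key names used in the configuration examples of this guide, treats a missing `output_int` as a problem, and compares targets to inputs literally, so it expects indices or names to be used consistently.

```python
def check_forecast_config(config):
    """Return a list of warnings for a config given as nested dicts."""
    warnings = []

    # output_int must be explicitly False; a missing key is also flagged here.
    training = config.get("training", {})
    if training.get("output_int") is not False:
        warnings.append("training.output_int must be False for forecasting")

    # Each target should typically also appear among the input variables.
    proc = config.get("data_processing_feature_extraction", {})
    variables = proc.get("variables", [])
    for target in proc.get("target_variables", []):
        if target not in variables:
            warnings.append(f"target {target!r} is not among the input variables")

    return warnings


config = {
    "data_processing_feature_extraction": {
        "variables": [0, 3],
        "target_variables": [3],
    },
    "training": {"model_name": "FCST_LSTM8", "output_int": False},
}
print(check_forecast_config(config))  # no warnings for this config
```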
5.3.9. Minimum Data Requirements
Each file must have at least:
frame_size + forecast_horizon
rows to generate at least one training sample.
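As a quick arithmetic check, the sketch below computes the minimum row count and the number of samples a file would yield. Both function names are hypothetical, and the sample count assumes the window advances one row at a time.

```python
def min_rows(frame_size, forecast_horizon):
    """Smallest number of rows that yields one training sample."""
    return frame_size + forecast_horizon


def count_windows(num_rows, frame_size, forecast_horizon):
    """Number of samples when the window advances one row at a time (assumption)."""
    return max(0, num_rows - min_rows(frame_size, forecast_horizon) + 1)


print(min_rows(3, 1))          # 4 rows needed for frame_size=3, forecast_horizon=1
print(count_windows(4, 3, 1))  # 1 sample from a 4-row file
print(count_windows(3, 3, 1))  # 0 samples: the file is too short
```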
5.3.10. Common Issues
“Insufficient sequence length” error
Files need at least frame_size + forecast_horizon rows.
Poor forecasting performance
Increase frame_size to capture more history
Try LSTM models for complex temporal patterns
Ensure sufficient training data