LUT University Energy Consumption/Production Dataset

Description

The data were collected at LUT University, Lappeenranta, Finland. Apart from measurement or API errors, all variables were sampled hourly and logged with UTC timestamps. The dataset comprises:

- The aggregated energy consumed by an entire building [kW].
- The aggregated electricity generated by the PV panel array [kW].
- Day-ahead (ELSPOT) prices for the Finnish market [€/MWh].
  - This information is made publicly available by [ENTSO-e](https://newtransparency.entsoe.eu/market/energyPrices). The version found in `raw/elspot.parquet` comprises only the relevant years and uses UTC timestamps instead of local time.
- Meteorological variables measured 6 km away from campus at the Lappeenranta airport by the [Finnish Meteorological Institute](https://en.ilmatieteenlaitos.fi/) (see table below).

| Variable | Unit |
| --- | --- |
| Air Temperature | °C |
| Cloud Amount | Okta |
| Dew Point Temperature | °C |
| Global/Diffuse Radiation | W/m² |
| Gust Speed | m/s |
| Horizontal Visibility | m |
| Pressure | hPa |
| Relative Humidity | % |
| Sunshine | % |
| Wind Direction | ° |
| Wind Speed | m/s |

The production and consumption columns are stored in two separate files (`raw/{consumption/production}.parquet`); for the sake of consistency, the datasets were therefore clipped to their overlapping period and joined into a single table. Discrepancies between duplicated columns arise from missing values in one of the two sources; a robust average (the mean of the non-null values) was set as the consensus value for the redundant measurements. Hourly timestamps were first enforced by upsampling to an hourly grid without interpolation, so that missing hours remain missing.

A graphical analysis of the raw data revealed that the measurements are naturally split by missing values into three segments:

1. From 30.09.2017 to 30.12.2017 (2208 samples, or 11.2% of the dataset), found in `partitioned/dataset_0.parquet`.
2. From 05.02.2018 to 06.10.2018 (5856 samples, or 29.7% of the dataset, after previous-day interpolation of the gap of missing data in the middle of the segment), found in `partitioned/dataset_1.parquet`.
3. From 16.11.2018 to 14.03.2020 (11640 samples, or 59.1% of the dataset), found in `partitioned/dataset_2.parquet`.

Finally, the script that transforms the raw data into the partitioned tables is provided as a Jupyter Notebook (`dataset_integration.ipynb`).
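The three cleaning steps named in the description (consensus of redundant columns, hourly grid without interpolation, previous-day filling) can be illustrated with a small sketch. It uses only the Python standard library and toy values; the function names `robust_average`, `enforce_hourly`, and `fill_from_previous_day` are hypothetical and are not taken from `dataset_integration.ipynb`, whose actual implementation may differ.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional

Series = Dict[datetime, Optional[float]]  # timestamp -> reading (None = missing)


def robust_average(a: Optional[float], b: Optional[float]) -> Optional[float]:
    """Average the readings that are present; None only if both are missing."""
    present = [v for v in (a, b) if v is not None]
    return sum(present) / len(present) if present else None


def enforce_hourly(series: Series) -> Series:
    """Place readings on a complete hourly grid; absent hours become None."""
    start, end = min(series), max(series)
    n_hours = int((end - start) / timedelta(hours=1))
    grid = [start + timedelta(hours=h) for h in range(n_hours + 1)]
    return {t: series.get(t) for t in grid}


def fill_from_previous_day(series: Series) -> Series:
    """Fill each missing hour with the value from 24 hours earlier, if any.

    Iterating chronologically lets a gap longer than one day reuse
    values that were themselves filled earlier in the loop.
    """
    filled = dict(series)
    for t in sorted(filled):
        if filled[t] is None:
            filled[t] = filled.get(t - timedelta(hours=24))
    return filled


# Toy data (made up for illustration): two sources with partial coverage.
t0 = datetime(2018, 1, 1, 0, tzinfo=timezone.utc)
source_a: Series = {t0: 1.0, t0 + timedelta(hours=1): None}
source_b: Series = {t0: 3.0, t0 + timedelta(hours=1): 5.0, t0 + timedelta(hours=3): 7.0}

timestamps = sorted(set(source_a) | set(source_b))
consensus = {t: robust_average(source_a.get(t), source_b.get(t)) for t in timestamps}
hourly = enforce_hourly(consensus)  # the 02:00 slot exists but stays None

# Toy series of 30 hours with one missing reading at hour 25.
day = {t0 + timedelta(hours=h): float(h) for h in range(30)}
day[t0 + timedelta(hours=25)] = None
filled = fill_from_previous_day(day)  # hour 25 takes the value of hour 1
```

Keeping the missing hours as explicit `None` entries, rather than interpolating them, is what makes the gaps between the three segments visible in the first place.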
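The segment sizes quoted in the description can be re-derived from the listed dates. The sketch below assumes that both boundary days are inclusive and that every day contributes 24 hourly samples; under those assumptions it reproduces the stated counts and percentages.

```python
from datetime import date

SAMPLES_PER_DAY = 24  # hourly sampling, as stated in the description

# Segment boundaries as listed in the description (both days assumed inclusive).
segments = {
    "partitioned/dataset_0.parquet": (date(2017, 9, 30), date(2017, 12, 30)),
    "partitioned/dataset_1.parquet": (date(2018, 2, 5), date(2018, 10, 6)),
    "partitioned/dataset_2.parquet": (date(2018, 11, 16), date(2020, 3, 14)),
}


def hourly_samples(first: date, last: date) -> int:
    """Number of hourly samples between two calendar days, inclusive."""
    return ((last - first).days + 1) * SAMPLES_PER_DAY


counts = {path: hourly_samples(*span) for path, span in segments.items()}
total = sum(counts.values())

for path, n in counts.items():
    print(f"{path}: {n} samples ({100 * n / total:.1f}%)")
# partitioned/dataset_0.parquet: 2208 samples (11.2%)
# partitioned/dataset_1.parquet: 5856 samples (29.7%)
# partitioned/dataset_2.parquet: 11640 samples (59.1%)
```

The percentages are therefore relative to the 19704 samples of the three partitions combined, not to the full span of the raw files.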

Publication year

2025

Type of data

Creators

Computational Engineering

Sergio Mauricio Vanegas Arias - Curator, Contributor, Publisher

Lasse Lensu - Contributor

Samuli Honkapuro - Contributor

Kimmo Huoman - Creator

Ville Tikka - Creator

Projects

Other information

Fields of science

Computer and information sciences; Environmental sciences; Electrical, automation and telecommunications engineering, electronics

Language

English

Open access

Open

License

Creative Commons Attribution ShareAlike 4.0 International (CC BY SA 4.0)

Keywords

Weather observations, weather, forecasting, time series, Solar Panels, Building, Electricity Production, PV Solar, Electricity Consumption

Subject headings

Temporal coverage


Related to this research data