diff --git a/.gitignore b/.gitignore
index 701031c..a733132 100644
--- a/.gitignore
+++ b/.gitignore
@@ -4,4 +4,5 @@
/venv/
.env
/data/output*/
-old_*
\ No newline at end of file
+old_*
+.DS_Store
\ No newline at end of file
diff --git a/README.md b/README.md
index d5c10d8..827bc63 100644
--- a/README.md
+++ b/README.md
@@ -1,5 +1,28 @@
# Anvendt mappe
+## Installation
+- To get started, create and activate a virtual environment (venv). This is done as follows:
+ 1. Create the venv by running one of these in the terminal; which one works depends on your machine and operating system:
+ - `python3 -m venv venv`
+ - `python -m venv venv`
+ 2. Activate the venv in one of the following ways:
+ **Mac/Linux**: `source venv/bin/activate`
+ **Windows**: `./venv/Scripts/activate`
+ 3. Install the required libraries with one of these:
+ - `pip3 install -r requirements.txt`
+ - `pip install -r requirements.txt`
+
+## Overview
+The project is structured as follows:
+- `data`: this folder contains the output data
+- `docs`: this folder contains documentation
+- `notebooks`: this folder contains the notebooks with all functionality
+- `resources`: this folder contains our sources
+- `src`: this folder contains the Python files
+- `tests`: this folder contains our unit tests
+
+More details can be found in each folder's own `README.md` file.
+
### Portfolio part 1
#### Our vision of the assignment
@@ -32,6 +55,10 @@ Ettersom ingen av de fra MET funket etter vårt ønske, søkte vi videre på net
- [OpenWeatherMap API](https://openweathermap.org/)
- It provides forecast data, but historical data can also be retrieved.
- With a student profile we get free access to a large amount of data, so we can request historical data from the API.
+ - It also offers several different APIs that will help us achieve our vision, including:
+ - [Current Data](https://openweathermap.org/current): fetches data for a chosen place at the current time.
+ - [History API](https://openweathermap.org/history): fetches data for a chosen place and time period (up to 7 days).
+ - [Statistic Historical Data](https://openweathermap.org/api/statistics-api): fetches statistical historical data that can be used for regression. It is based on all historical data and summarizes it for each day of the year.
##### Fetching data
To fetch data from the OpenWeatherMap API we have written a function that takes a place name, a start date and an end date. It inserts these values into the URL and sends the request for the chosen place and period, together with the API key, which is stored in an env file and imported.
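
The request URL described above could be assembled roughly as in the sketch below. The helper name `build_history_url`, the environment variable `API_KEY`, the `,NO` country suffix and the `units` value are assumptions made for illustration, and the endpoint is our reading of the History API documentation; the project's own function may look different.

```python
import os
from urllib.parse import urlencode

# Assumed endpoint for the OpenWeatherMap History API; verify against the official docs.
HISTORY_URL = "https://history.openweathermap.org/data/2.5/history/city"


def build_history_url(city_name, start_unix, end_unix, api_key, units="metric"):
    """Return the request URL for hourly history of one city (hypothetical helper)."""
    params = {
        "q": f"{city_name},NO",   # place name, restricted to Norway (assumption)
        "type": "hour",           # hourly resolution
        "start": start_unix,      # start of the period as a Unix timestamp
        "end": end_unix,          # end of the period as a Unix timestamp
        "units": units,           # metric gives temperatures in Celsius
        "appid": api_key,         # API key, normally read from the env file
    }
    return f"{HISTORY_URL}?{urlencode(params)}"


# "API_KEY" is an assumed variable name; a placeholder is used when it is not set.
api_key = os.getenv("API_KEY", "demo-key")
url = build_history_url("Trondheim", 1742202600, 1742548200, api_key)
print(url)
```

Building the query string with `urlencode` avoids hand-escaping place names and keeps the key out of the source code.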
@@ -47,7 +74,25 @@ Funksjonen returnerer en print setning når dataen er skrevet, og legger ved fil
##### Reading data from file
+To read data from a JSON file we use pandas' built-in function _read_json_ and then store the data in a pandas DataFrame.
+
+#### Task 3 - Data processing
+We have focused throughout on understanding the data we have, so we stored it in a JSON file to make it easier to see which values we have and which we may not need. We then removed the columns we consider unnecessary. After that we checked the data for errors and gaps: 'NaN' values, missing columns, and extreme values.
+##### Methods for identifying and handling missing data
+The methods we have used include `pd.json_normalize`, `df.drop_duplicates` and `df.drop(columns="name")`. With `pd.json_normalize` we convert the data into a table, a DataFrame, because that is easier to manipulate. We use `df.drop_duplicates` to handle duplicates in the dataset. Some columns hold information that is not relevant to what we want to find; we remove these with `df.drop(columns="name")`, putting the column name inside the parentheses, for example `df = df.drop(columns="base")` or `df = df.drop(columns="visibility")`. This keeps the dataset tidy and easier to work with.
+We have also used `missingno.matrix` to visualize which columns are missing data, and then either `fillna(0)` to replace 'NaN' values with 0 or `ffill()` to carry the previous recorded value forward.
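
The cleaning steps named above can be sketched on a small sample. The records below are made up and only mimic the shape of the API's `list` entries; which columns are dropped or filled is an assumption for illustration, not the notebooks' exact choices.

```python
import pandas as pd

# Made-up sample records shaped like the API's "list" entries (not real data).
records = [
    {"dt": 1742205600, "main": {"temp": 1.98}, "rain": {"1h": 0.36}, "visibility": 10000},
    {"dt": 1742205600, "main": {"temp": 1.98}, "rain": {"1h": 0.36}, "visibility": 10000},
    {"dt": 1742209200, "main": {"temp": None}, "visibility": 10000},
    {"dt": 1742212800, "main": {"temp": 3.60}, "visibility": 10000},
]

# Flatten the nested JSON into columns such as "main.temp" and "rain.1h"
df = pd.json_normalize(records)

# Each timestamp should appear only once
df = df.drop_duplicates(subset=["dt"])

# Remove a column that is not relevant for the analysis
df = df.drop(columns="visibility")

# Missing rain is treated as "no rain"; a missing temperature reuses the previous value
df["rain.1h"] = df["rain.1h"].fillna(0)
df["main.temp"] = df["main.temp"].ffill()

print(df)
```

Filling with 0 suits quantities where a missing value means "nothing measured", while forward fill suits slowly changing quantities such as temperature.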
-#### Oppgave 3 - Databehandling
+##### List comprehensions
+One of the code cells in statistic_data_notebook shows an example of where we use a list comprehension to manipulate data. We use it to convert the temperatures to Celsius and store the result in a new column, temp.mean_celsius. We did this because it is more efficient than, for example, an explicit for loop.
+
+A list comprehension is also used in statistic_data_notebook to create a column consisting of month and day.
+
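A minimal sketch of the two list comprehensions described in this section follows. It assumes the statistics API reports the mean temperature in Kelvin and that the DataFrame has `month` and `day` columns; the sample values and the `month_day` column name are made up for illustration.

```python
import pandas as pd

# Made-up sample; the Kelvin input and the month/day column names are assumptions.
df = pd.DataFrame({
    "month": [1, 1, 7],
    "day": [1, 2, 15],
    "temp.mean": [271.15, 273.15, 290.65],
})

# Convert every mean temperature from Kelvin to Celsius
df["temp.mean_celsius"] = [round(kelvin - 273.15, 2) for kelvin in df["temp.mean"]]

# Combine month and day into one zero-padded label per row
df["month_day"] = [f"{m:02d}-{d:02d}" for m, d in zip(df["month"], df["day"])]

print(df)
```

For a plain arithmetic conversion like this, a vectorized expression such as `df["temp.mean"] - 273.15` would also work; the list comprehension is shown because it is the technique the text describes.
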
+##### Pandas SQL vs traditional Pandas
+Pandas syntax can be somewhat complex, and tools such as sqldf make it possible to run SQL queries on pandas DataFrames. This can be a simpler way to filter, transform, and group data. SQL queries can also be easier to read and maintain than pandas operations when working with complex datasets, and they give access to familiar commands such as JOIN and GROUP BY.
+
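To make the comparison concrete, the sketch below computes the same grouped average twice, once with pandas and once with SQL. Note the swap: instead of sqldf it uses the standard-library `sqlite3` module together with `DataFrame.to_sql` and `pd.read_sql_query`, so it runs without extra packages. The table name `weather` and the sample values are made up.

```python
import sqlite3

import pandas as pd

# Made-up sample data
df = pd.DataFrame({
    "city": ["Oslo", "Oslo", "Bergen", "Bergen"],
    "temp": [2.0, 4.0, 5.0, 7.0],
})

# Traditional pandas: group by city and take the mean temperature
pandas_result = df.groupby("city", as_index=False)["temp"].mean()

# SQL: load the DataFrame into an in-memory SQLite table and query it
conn = sqlite3.connect(":memory:")
try:
    df.to_sql("weather", conn, index=False)
    sql_result = pd.read_sql_query(
        "SELECT city, AVG(temp) AS temp FROM weather GROUP BY city ORDER BY city",
        conn,
    )
finally:
    conn.close()

print(pandas_result)
print(sql_result)
```

Both results hold one row per city; which form reads better is largely a matter of what the team is used to.
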
+##### Irregularities in the data
+Irregularities we can expect include missing values. To handle these we can use `fillna()`, which fills missing values with a default value, or `dropna()`, which removes the rows with missing values. We may also encounter incomplete dates or dates in an unknown format. In that case we can use `pd.to_datetime()` to make sure the dates are converted correctly to datetime format.
+
+We may also encounter extreme values, which we can identify as outliers when they lie more than three standard deviations from the mean. We can then replace them with the previous value using `ffill()`, or use linear interpolation to get the smoothest transition across the missing values, since it estimates them from the neighbouring values.
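
The three-standard-deviation rule and the linear interpolation could look roughly like the sketch below. The series is made up, with one planted extreme value, and masking the outlier before interpolating is one possible way to combine the two steps rather than the notebooks' exact code.

```python
import pandas as pd

# Made-up temperature series with one planted extreme value at position 10
values = [5.0 + 0.1 * (i % 5) for i in range(30)]
values[10] = 60.0
temps = pd.Series(values)

# Z-score: how many standard deviations each value lies from the mean
z_scores = (temps - temps.mean()) / temps.std()

# Turn outliers into missing values, then fill them from their neighbours
cleaned = temps.mask(z_scores.abs() > 3).interpolate(method="linear")

print(cleaned.iloc[9:12])
```

Replacing `interpolate(method="linear")` with `ffill()` would instead repeat the last valid value, which gives a flat step rather than a smooth transition.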
\ No newline at end of file
diff --git a/data/README.md b/data/README.md
index 401b18a..2721585 100644
--- a/data/README.md
+++ b/data/README.md
@@ -1,13 +1,11 @@
# Data-description
-### Possible API
-- **API from openweathermap**
-[API_OPEN_WEATHER_MAP](https://openweathermap.org/)
+Folders are created here as the notebooks run and store their data.
-- **API from meterologisk institutt**
-[API_FROST](https://frost.met.no/index.html)
-
-### Possible dataset
-- **Natural Disasters:**
-[DATASET_1](https://www.kaggle.com/datasets/brsdincer/all-natural-disasters-19002021-eosdis)
+The function first checks whether the folder exists and creates it if it does not. All folders whose names start with output (that is, output data) are listed in `.gitignore`. This avoids uploading a lot of unnecessary files to GitHub and also keeps users from 'sharing' data: my runs stay mine, and yours only show up on your machine.
+These are some examples of the folders:
+- `output_current_data` stores the data for the chosen place, produced by `notebook_current_data.ipynb`
+- `output_fig` stores graphs, produced by `notebook_statistic_data.ipynb`
+- `output_record` stores record data for the chosen place, produced by `notebook_statistic_data.ipynb`
+- `output_statistikk` stores the data for the chosen place, produced by `notebook_statistic_data.ipynb`
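
The "check first, then create" behaviour described above can be sketched with `os.makedirs`. The helper name `ensure_output_folder` is hypothetical, and a temporary directory stands in for the `data` folder so the example leaves nothing behind.

```python
import os
import tempfile


def ensure_output_folder(base_dir, name):
    """Create base_dir/output_<name> if it is missing and return its path (hypothetical helper)."""
    folder = os.path.join(base_dir, f"output_{name}")
    # exist_ok=True makes the call a no-op when the folder already exists
    os.makedirs(folder, exist_ok=True)
    return folder


with tempfile.TemporaryDirectory() as base_dir:
    first = ensure_output_folder(base_dir, "current_data")
    second = ensure_output_folder(base_dir, "current_data")  # safe to call again
    created = os.path.isdir(first)
    print(first == second, created)
```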
\ No newline at end of file
diff --git a/notebooks/README.md b/notebooks/README.md
index 8c1d051..23f68fb 100644
--- a/notebooks/README.md
+++ b/notebooks/README.md
@@ -1 +1,14 @@
-# Notebook - description
\ No newline at end of file
+# Notebook - description
+
+Here you can find information about the notebooks and their contents.
+
+- [Current data](notebook_current_data.ipynb)
+ This notebook fetches, writes, and displays current data for a chosen location.
+- [One day data](notebook_one_day_data.ipynb)
+ This notebook fetches data for a chosen day and place and writes it to file. It visualizes missing values, fills them in, then plots the data and saves the plots.
+- [One week data](notebook_one_week_data.ipynb)
+ This notebook fetches data for a chosen period (up to 7 days) and place and writes it to file. It visualizes missing values, fills them in, then plots the data and saves the plots.
+- [Statistic year data](notebook_statistic_data.ipynb)
+ This notebook fetches data from an API that aggregates all historical data for a chosen place and computes statistical values for every day of the year. We remove unwanted columns, exclude extreme values, and visualize the data in plots.
+- [Test notebook](test_notebook.ipynb)
+ This is just a test notebook, used to check that the venv works and that functions can be imported from the packages.
\ No newline at end of file
diff --git a/notebooks/get_data_notebook.ipynb b/notebooks/get_data_notebook.ipynb
deleted file mode 100644
index c3b6ad0..0000000
--- a/notebooks/get_data_notebook.ipynb
+++ /dev/null
@@ -1,548 +0,0 @@
-{
- "cells": [
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### Choose start date and end date\n",
- "\n",
- "To fetch data and run an analysis, the program needs to know which period you want data for.\n",
- "\n",
- "Enter the date like this: (yyyy, mm, dd, hh, mm)\n",
- "Here is an example: \n",
- "|What|How|Example|\n",
- "|:---|:---:|:---:|\n",
- "|year|yyyy|2025|\n",
- "|month|mm|03| \n",
- "|day|dd|01| \n",
- "|hour|hh|12| \n",
- "|minute|mm|00| \n",
- "\n",
- "This date is then entered as follows: (2025, 03, 01, 12, 00)\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 56,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "Start date => unix timestamp: 1742202600\n",
- "End date => unix timestamp: 1742548200\n",
- "Unix timestamp => start date: 2025-03-17 10:10:00\n",
- "Unix timestamp => end date: 2025-03-21 10:10:00\n"
- ]
- }
- ],
- "source": [
- "import sys\n",
- "import os\n",
- "\n",
- "# Gets the absolute path to the src folder\n",
- "sys.path.append(os.path.abspath(\"../src\"))\n",
- "\n",
- "# Now we can import the fucntion from the module\n",
- "from my_package.date_to_unix import get_unix_timestamp\n",
- "from my_package.date_to_unix import from_unix_timestamp\n",
- "\n",
- "# Runs the function and store the data\n",
- "unix_start_date, unix_end_date = get_unix_timestamp()\n",
- "\n",
- "# Prints the unix_timestamp\n",
- "print(\"Start date => unix timestamp:\", unix_start_date)\n",
- "print(\"End date => unix timestamp:\", unix_end_date)\n",
- "\n",
- "# Run the function to convert from unix_timestamp to date, and store the variables\n",
- "start_date, end_date = from_unix_timestamp(unix_start_date, unix_end_date)\n",
- "\n",
- "# prints the date\n",
- "print(\"Unix timestamp => start date:\", start_date)\n",
- "print(\"Unix timestamp => end date:\", end_date)\n"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### Choose a place in Norway and get data\n",
- "\n",
- "Enter a place you want data for; for now it is limited to Norway\n",
- "\n",
- "The program will then fetch the data and store it in a JSON file"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 57,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "Data fetch: ok\n"
- ]
- }
- ],
- "source": [
- "import sys\n",
- "import os\n",
- "\n",
- "# Gets the absolute path to the src folder\n",
- "sys.path.append(os.path.abspath(\"../src\"))\n",
- "\n",
- "# Now we can import the fucntion from the module\n",
- "from my_package.fetch_data import fetch_data\n",
- "\n",
- "# User input the city, for the weather\n",
- "city_name = input(\"Enter a city in Norway: \")\n",
- "\n",
- "data = fetch_data(unix_start_date, unix_end_date, city_name)\n"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### Save data to a JSON file\n",
- "\n",
- "Enter a name for the file you want to save the data in.\n",
- "\n",
- "E.g. test\n",
- "The file will then be saved as data_**test**.json, at \"../data/output_stedsnavn/data_{filnavn}.json\"\n",
- "\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 58,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "Data has been written to /Users/toravestlund/Documents/ITBAITBEDR/TDT4114 - Anvendt programmering/anvendt_mappe/data/output_stedsdata/data_test6.json\n"
- ]
- }
- ],
- "source": [
- "# Gets the absolute path to the src folder\n",
- "sys.path.append(os.path.abspath(\"../src\"))\n",
- "\n",
- "from my_package.write_data import write_data\n",
- "\n",
- "filename = input(\"Write filename: \")\n",
- "\n",
- "write_data(data, filename)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### Read from file\n",
- "\n",
- "Loads the data stored in the file created above and prints it in a readable form using pandas"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 59,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- " message cod city_id calctime cnt \\\n",
- "0 Count: 96 200 3133880 0.021173 96 \n",
- "1 Count: 96 200 3133880 0.021173 96 \n",
- "2 Count: 96 200 3133880 0.021173 96 \n",
- "3 Count: 96 200 3133880 0.021173 96 \n",
- "4 Count: 96 200 3133880 0.021173 96 \n",
- ".. ... ... ... ... ... \n",
- "91 Count: 96 200 3133880 0.021173 96 \n",
- "92 Count: 96 200 3133880 0.021173 96 \n",
- "93 Count: 96 200 3133880 0.021173 96 \n",
- "94 Count: 96 200 3133880 0.021173 96 \n",
- "95 Count: 96 200 3133880 0.021173 96 \n",
- "\n",
- " list \n",
- "0 {'dt': 1742205600, 'main': {'temp': 1.98, 'fee... \n",
- "1 {'dt': 1742209200, 'main': {'temp': 3.05, 'fee... \n",
- "2 {'dt': 1742212800, 'main': {'temp': 3.6, 'feel... \n",
- "3 {'dt': 1742216400, 'main': {'temp': 4.16, 'fee... \n",
- "4 {'dt': 1742220000, 'main': {'temp': 4.11, 'fee... \n",
- ".. ... \n",
- "91 {'dt': 1742533200, 'main': {'temp': -0.24, 'fe... \n",
- "92 {'dt': 1742536800, 'main': {'temp': -0.24, 'fe... \n",
- "93 {'dt': 1742540400, 'main': {'temp': 0.62, 'fee... \n",
- "94 {'dt': 1742544000, 'main': {'temp': 2.18, 'fee... \n",
- "95 {'dt': 1742547600, 'main': {'temp': 5.03, 'fee... \n",
- "\n",
- "[96 rows x 6 columns]\n"
- ]
- }
- ],
- "source": [
- "import pandas as pd\n",
- "\n",
- "data = pd.read_json(f'../data/output_stedsdata/data_{filename}.json')\n",
- "\n",
- "print(data)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 62,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "<p>HTML table stripped during extraction: 96 rows × 11 columns, the same data as the text/plain output below.</p>"
- ],
- "text/plain": [
- " main.temp main.feels_like main.pressure main.humidity \\\n",
- "dt \n",
- "2025-03-17 10:00:00 1.98 0.11 1021 92 \n",
- "2025-03-17 11:00:00 3.05 0.84 1021 93 \n",
- "2025-03-17 12:00:00 3.60 1.49 1021 91 \n",
- "2025-03-17 13:00:00 4.16 1.75 1021 92 \n",
- "2025-03-17 14:00:00 4.11 0.75 1021 89 \n",
- "... ... ... ... ... \n",
- "2025-03-21 05:00:00 -0.24 -2.29 1024 91 \n",
- "2025-03-21 06:00:00 -0.24 -2.14 1024 90 \n",
- "2025-03-21 07:00:00 0.62 -0.97 1025 89 \n",
- "2025-03-21 08:00:00 2.18 0.78 1025 92 \n",
- "2025-03-21 09:00:00 5.03 3.85 1025 78 \n",
- "\n",
- " main.temp_min main.temp_max wind.speed wind.deg \\\n",
- "dt \n",
- "2025-03-17 10:00:00 1.07 2.77 1.79 203 \n",
- "2025-03-17 11:00:00 2.73 3.33 2.24 225 \n",
- "2025-03-17 12:00:00 3.03 3.88 2.24 248 \n",
- "2025-03-17 13:00:00 3.84 4.44 2.68 270 \n",
- "2025-03-17 14:00:00 3.88 5.03 4.02 293 \n",
- "... ... ... ... ... \n",
- "2025-03-21 05:00:00 -1.16 0.55 1.67 122 \n",
- "2025-03-21 06:00:00 -1.16 0.55 1.57 136 \n",
- "2025-03-21 07:00:00 -0.60 2.03 1.45 125 \n",
- "2025-03-21 08:00:00 2.18 3.03 1.47 94 \n",
- "2025-03-21 09:00:00 5.03 5.03 1.60 85 \n",
- "\n",
- " wind.gust clouds.all rain.1h \n",
- "dt \n",
- "2025-03-17 10:00:00 3.58 100 0.36 \n",
- "2025-03-17 11:00:00 5.36 100 0.79 \n",
- "2025-03-17 12:00:00 4.02 100 1.38 \n",
- "2025-03-17 13:00:00 8.05 100 0.16 \n",
- "2025-03-17 14:00:00 8.05 100 0.14 \n",
- "... ... ... ... \n",
- "2025-03-21 05:00:00 1.86 42 NaN \n",
- "2025-03-21 06:00:00 1.67 44 NaN \n",
- "2025-03-21 07:00:00 1.77 97 NaN \n",
- "2025-03-21 08:00:00 1.99 88 NaN \n",
- "2025-03-21 09:00:00 2.29 67 NaN \n",
- "\n",
- "[96 rows x 11 columns]"
- ]
- },
- "metadata": {},
- "output_type": "display_data"
- }
- ],
- "source": [
- "import pandas as pd\n",
- "\n",
- "data = pd.read_json(f'../data/output_stedsdata/data_{filename}.json')\n",
- "\n",
- "if 'list' in data:\n",
- " df = pd.json_normalize(data['list'])\n",
- "\n",
- " # Delete duplicates based on the dt row, all the other values can appear more than once, but the date should only appear once\n",
- " df = df.drop_duplicates(subset=['dt'])\n",
- "\n",
- " # The weather column dosnt have any releated information, therefor we delete it\n",
- " df = df.drop(columns=\"weather\")\n",
- "\n",
- " # Convert 'dt' column from Unix timestamp to datetime and set it as the index\n",
- " df['dt'] = pd.to_datetime(df['dt'], unit='s')\n",
- " df.set_index('dt', inplace=True)\n",
- " \n",
- "\n",
- " \n",
- "\n",
- " # Ensure the DataFrame is displayed correctly\n",
- " display(df)\n",
- "\n",
- " # # Extract main values\n",
- " # temp = df['main.temp']\n",
- " # humidity = df['main.humidity']\n",
- "\n",
- " # # Extract wind values\n",
- " # w_speed = df['wind.speed']\n",
- "\n",
- " # # Extract other variables\n",
- " # clouds = df['clouds.all']\n",
- "\n",
- " # try:\n",
- " # rain = df['rain.1h']\n",
- " # except KeyError:\n",
- " # print(\"'Rain' is not present in the JSON file.\")\n",
- "\n",
- " # try:\n",
- " # snow = df['snow.1h']\n",
- " # except KeyError:\n",
- " # print(\"'Snow' is not present in the JSON file.\")\n",
- "\n",
- " # # Print the average temperature\n",
- " # print('Gjennomsnitts temperatur: ', temp.mean().round(2))\n",
- "\n",
- " # Display the temperature column\n",
- " # display(temp)\n",
- "else:\n",
- " print(\"The 'list' key is not present in the JSON file.\")"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "# \"Condense the overview above\"\n",
- "# That is, find the average of all relevant data,\n",
- "# and the highest and lowest values (especially temp) in the given period"
- ]
- }
- ],
- "metadata": {
- "kernelspec": {
- "display_name": "venv",
- "language": "python",
- "name": "python3"
- },
- "language_info": {
- "codemirror_mode": {
- "name": "ipython",
- "version": 3
- },
- "file_extension": ".py",
- "mimetype": "text/x-python",
- "name": "python",
- "nbconvert_exporter": "python",
- "pygments_lexer": "ipython3",
- "version": "3.12.5"
- }
- },
- "nbformat": 4,
- "nbformat_minor": 2
-}
diff --git a/notebooks/notebook_current_data.ipynb b/notebooks/notebook_current_data.ipynb
new file mode 100644
index 0000000..c99efdd
--- /dev/null
+++ b/notebooks/notebook_current_data.ipynb
@@ -0,0 +1,188 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Notebook - Current Data\n",
+ "This notebook fetches, writes, and displays current data for a chosen location."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Choose a place and get current data\n",
+ "\n",
+ "Enter a place you want current data for; for now it is limited to Norway"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import sys\n",
+ "import os\n",
+ "\n",
+ "# Gets the absolute path to the src folder\n",
+ "sys.path.append(os.path.abspath(\"../src\"))\n",
+ "\n",
+ "# Now we can import the function from the module\n",
+ "from my_package.fetch_current_data import fetch_current_data\n",
+ "\n",
+ "# Import function to replace nordic (æøå)\n",
+ "from my_package.util import replace_nordic\n",
+ "\n",
+ "# User input the city, for the weather\n",
+ "city_name = input(\"Enter a city in Norway: \")\n",
+ "\n",
+ "city_name = replace_nordic(city_name)\n",
+ "\n",
+ "# Stores the return of the function\n",
+ "data, folder = fetch_current_data(city_name)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Save data to a JSON file\n",
+ "\n",
+ "Enter a name for the file you want to save the data in.\n",
+ "\n",
+ "E.g. test\n",
+ "The file will then be saved as data_**test**.json, at \"../data/output_stedsnavn/data_{filnavn}.json\"\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Gets the absolute path to the src folder\n",
+ "sys.path.append(os.path.abspath(\"../src\"))\n",
+ "\n",
+ "from my_package.write_data import write_data\n",
+ "\n",
+ "# The user chooses the filename\n",
+ "filename = input(\"Write filename: \")\n",
+ "\n",
+ "# Writes the data, using user input filename\n",
+ "write_data(data, folder, filename)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Read from file\n",
+ "\n",
+ "Loads the data stored in the file created above and prints it in a readable form using pandas"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import json\n",
+ "\n",
+ "# Read from the json-file\n",
+ "with open(f\"../data/output_current_data/data_{filename}.json\", \"r\") as file:\n",
+ " data = json.load(file)\n",
+ "\n",
+ "# Display data\n",
+ "display(data)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Cleaning the data\n",
+ "To make the data easier to read, we normalize the JSON file using pandas.\n",
+ "\n",
+ "We also remove irrelevant columns such as:\n",
+ "- weather: contains information about the weather (description, id, icon, etc.)\n",
+ "- coord.lon and coord.lat: we do not need the coordinates since we selected by place name\n",
+ "- sys.type, sys.id, base, cod: internal parameters\n",
+ "- temp_max and temp_min: the temperature does not change much within an hour\n",
+ "- visibility: visibility distance, for example in fog; we consider it irrelevant\n",
+ "\n",
+ "The datetime [dt] is then converted from a Unix timestamp to a regular time so it can be used as the index\n",
+ "\n",
+ "The sunrise and sunset times are also converted from Unix time to regular time, to make them easier to read and understand."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import pandas as pd\n",
+ "\n",
+ "# Normalize the json-structure, to add better readability\n",
+ "df = pd.json_normalize(data)\n",
+ "\n",
+ "# Delete duplicates based on the dt column; other values may repeat, but each timestamp should appear only once\n",
+ "df = df.drop_duplicates(subset=['dt'])\n",
+ "\n",
+ "# Delete columns that are not relevant\n",
+ "df = df.drop(columns=[\n",
+ "    'weather', 'base', 'visibility', 'timezone', 'id', 'cod',\n",
+ "    'coord.lon', 'coord.lat', 'wind.deg',\n",
+ "    'main.temp_min', 'main.temp_max', 'sys.type', 'sys.id',\n",
+ "])\n",
+ "\n",
+ "# Change from unix to datetime for sunrise and sunset\n",
+ "df['sys.sunrise'] = pd.to_datetime(df['sys.sunrise'], unit='s')\n",
+ "df['sys.sunset'] = pd.to_datetime(df['sys.sunset'], unit='s')\n",
+ "\n",
+ "# Convert 'dt' column from Unix timestamp to datetime and set it as the index\n",
+ "df['dt'] = pd.to_datetime(df['dt'], unit='s')\n",
+ "df.set_index('dt', inplace=True)\n",
+ "\n",
+ "# Drops a column entirely if all of its values are 'NaN'\n",
+ "df = df.dropna(axis='columns', how='all')\n",
+ "\n",
+ "# Display the df after changes\n",
+ "display(df)"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "venv",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.12.5"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
diff --git a/notebooks/notebook_one_day_data.ipynb b/notebooks/notebook_one_day_data.ipynb
new file mode 100644
index 0000000..1eb90fb
--- /dev/null
+++ b/notebooks/notebook_one_day_data.ipynb
@@ -0,0 +1,589 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Notebook - One day data\n",
+ "\n",
+ "This notebook fetches data for a chosen day and place and writes it to file. It visualizes missing values, fills them in, then plots the data and saves the plots."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Choose which day you want to check the weather for\n",
+ "\n",
+ "To fetch data and run an analysis, the program needs to know which day you want data for; all hours of that day are then printed. The program cannot fetch data for the current day or later dates, so you must choose a date in the past.\n",
+ "\n",
+ "Enter the date like this: (yyyy, mm, dd)\n",
+ "Here is an example: \n",
+ "|What|How|Example|\n",
+ "|:---|:---:|:---:|\n",
+ "|year|yyyy|2025|\n",
+ "|month|mm|03| \n",
+ "|day|dd|01| \n",
+ "\n",
+ "This date is then entered as follows: (2025, 03, 01)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import datetime\n",
+ "import time\n",
+ "\n",
+ "# Builds the unix timestamps for every hour of one chosen day, so the start and end date fall on the same date\n",
+ "def get_unix_timestamps_for_day():\n",
+ "    date_input = input(\"Choose a date (yyyy, mm, dd): \")\n",
+ "    # Accepts the date with or without surrounding parentheses\n",
+ "    date_components = date_input.strip(\"() \").split(\",\")\n",
+ "    year = int(date_components[0])\n",
+ "    month = int(date_components[1])\n",
+ "    day = int(date_components[2])\n",
+ "\n",
+ "    # Prevents fetching data for the current day or the future\n",
+ "    if datetime.date(year, month, day) >= datetime.date.today():\n",
+ "        raise ValueError(\"Failed, can't use the current day or future dates\")\n",
+ "\n",
+ "    # Goes through all hours of the day; strftime with %Y-%m-%d etc. converts each datetime into a readable string\n",
+ "    timestamps = []\n",
+ "    for hour in range(24):\n",
+ "        dt = datetime.datetime(year, month, day, hour, 0)\n",
+ "        unix_timestamp = int(time.mktime(dt.timetuple()))\n",
+ "        timestamps.append((unix_timestamp, dt.strftime('%Y-%m-%d %H:%M:%S')))\n",
+ "\n",
+ "    # Prints the date chosen\n",
+ "    print(f\"Selected date: {year}-{month:02d}-{day:02d}\")\n",
+ "\n",
+ "    # Prints each unix timestamp next to its readable date and hour\n",
+ "    for ts, readable in timestamps:\n",
+ "        print(f\"Unix Timestamp: {ts} -> {readable}\")\n",
+ "\n",
+ "    return date_input, [ts[0] for ts in timestamps]\n",
+ "\n",
+ "date, timestamps = get_unix_timestamps_for_day()\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Choose a place in Norway and get data\n",
+ "\n",
+ "Enter a place you want data for; for now it is limited to Norway\n",
+ "\n",
+ "The program will then fetch the data and store it in a JSON file"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import sys\n",
+ "import os\n",
+ "\n",
+ "# Gets the absolute path to the src folder\n",
+ "sys.path.append(os.path.abspath(\"../src\"))\n",
+ "\n",
+ "# Now we can import the function from the module\n",
+ "from my_package.fetch_data import fetch_data\n",
+ "\n",
+ "# Import function to replace nordic (æøå)\n",
+ "from my_package.util import replace_nordic\n",
+ "\n",
+ "# The user chooses the city they want weather data for\n",
+ "city_name = input(\"Enter city name: \")\n",
+ "\n",
+ "city_name = replace_nordic(city_name)\n",
+ "\n",
+ "# Start_date is the first timestamp, end_date is the last\n",
+ "start_date, end_date = timestamps[0], timestamps[-1]\n",
+ "\n",
+ "# Stores the values in the variables\n",
+ "weather_data, folder = fetch_data(start_date, end_date, city_name)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Save data to a JSON file\n",
+ "\n",
+ "Enter a name for the file you want to save the data in.\n",
+ "\n",
+ "E.g. test\n",
+ "The file will then be saved as data_**test**.json, at \"../data/output_stedsnavn/data_{filnavn}.json\""
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Gets the absolute path to the src folder\n",
+ "sys.path.append(os.path.abspath(\"../src\"))\n",
+ "\n",
+ "from my_package.write_data import write_data\n",
+ "\n",
+ "filename = input(\"Write filename: \")\n",
+ "\n",
+ "# Writes the data, with the chosen name\n",
+ "write_data(weather_data, folder, filename)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Read from file\n",
+ "\n",
+ "Loads the data stored in the file created above and displays it in a readable form using pandas."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import pandas as pd\n",
+ "\n",
+ "# Reads from file using pandas\n",
+ "weather_data = pd.read_json(f'../data/output_stedsnavn/data_{filename}.json')\n",
+ "\n",
+ "# Proceed only if the 'list' key is present, since it holds the actual weather data\n",
+ "if 'list' in weather_data:\n",
+ " # Normalize the json for better readability\n",
+ " df = pd.json_normalize(weather_data['list'])\n",
+ "\n",
+ " # Drop duplicates based on the dt column: other values may repeat, but each timestamp should appear only once\n",
+ " df = df.drop_duplicates(subset=['dt'])\n",
+ "\n",
+ " # The weather column does not hold any relevant information, so we drop it\n",
+ " df = df.drop(columns=\"weather\")\n",
+ "\n",
+ " # Convert 'dt' column from Unix timestamp to datetime and set it as the index\n",
+ " df['dt'] = pd.to_datetime(df['dt'], unit='s')\n",
+ " df.set_index('dt', inplace=True)\n",
+ "\n",
+ " # Ensure the DataFrame is displayed correctly \n",
+ " display(df)\n",
+ " \n",
+ "else:\n",
+ " print(\"The 'list' key is not present in the JSON file.\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Show the temperature\n",
+ "Computes the mean temperature using built-in functions, and also finds the highest and lowest measured temperatures.\n",
+ "\n",
+ "Plots the temperature using matplotlib."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import matplotlib.pyplot as plt\n",
+ "import matplotlib.dates as mdates\n",
+ "\n",
+ "# Stores the temperature values\n",
+ "temp = df['main.temp']\n",
+ "\n",
+ "temp_mean = temp.mean().round(2)\n",
+ "\n",
+ "# Print the average temperature\n",
+ "print(f'Mean temperature: {temp_mean}')\n",
+ "\n",
+ "# Find the highest and lowest temperatures\n",
+ "max_temp = df['main.temp'].max().round(2)\n",
+ "min_temp = df['main.temp'].min().round(2)\n",
+ "\n",
+ "print(\"Highest temperature:\", max_temp)\n",
+ "print(\"Lowest temperature:\", min_temp)\n",
+ "\n",
+ "# Set the x_axis to the index, which means the time\n",
+ "x_axis = df.index\n",
+ "\n",
+ "# Choose the width and height of the plot\n",
+ "plt.figure(figsize=(12, 6))\n",
+ "\n",
+ "# Plot the temperature\n",
+ "plt.plot(x_axis, temp, color='tab:red', label='Temperature')\n",
+ "\n",
+ "# Get the current axes and store them as ax\n",
+ "ax = plt.gca()\n",
+ "\n",
+ "# Customize the x-axis to show ticks for each hour\n",
+ "ax.xaxis.set_major_locator(mdates.HourLocator(interval=1)) # Tick marks for every hour\n",
+ "ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M')) # Format as \"Hour:Minute\"\n",
+ "\n",
+ "# Adjust layout\n",
+ "plt.tight_layout()\n",
+ "\n",
+ "# Add a title to the plot, with city_name and date\n",
+ "plt.title(f'Temperature {city_name}, ({date})')\n",
+ "\n",
+ "# Shows a grid\n",
+ "plt.grid()\n",
+ "\n",
+ "# Show the label-description\n",
+ "plt.legend(loc = 'upper right')\n",
+ "\n",
+ "# Show the plot\n",
+ "plt.show()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Visualize precipitation\n",
+ "Using matplotlib, we visualize the precipitation for the chosen day."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import matplotlib.pyplot as plt\n",
+ "import matplotlib.dates as mdates\n",
+ "import numpy as np\n",
+ "\n",
+ "x_axis = df.index\n",
+ "\n",
+ "# 'rain.1h' is absent when there was no rain; the lookup then raises a KeyError\n",
+ "try:\n",
+ " rain = df['rain.1h']\n",
+ "\n",
+ "# If no rain, make the rain column and fill it with NaN\n",
+ "except KeyError:\n",
+ " print(\"'Rain' is not present in the JSON file.\")\n",
+ " df['rain.1h'] = np.nan\n",
+ "\n",
+ "# 'snow.1h' is absent when there was no snow; the lookup then raises a KeyError\n",
+ "try:\n",
+ " snow = df['snow.1h']\n",
+ "\n",
+ "# If no snow, make the snow column and fill it with NaN\n",
+ "except KeyError:\n",
+ " print(\"'Snow' is not present in the JSON file.\")\n",
+ " df['snow.1h'] = np.nan\n",
+ "\n",
+ "# Choose the width and height of the plot\n",
+ "plt.figure(figsize=(15, 6))\n",
+ "\n",
+ "# Plot rain and snow; if a lookup above failed, the name is undefined and the NameError is skipped\n",
+ "try:\n",
+ " plt.bar(x_axis, rain, width=0.02, alpha=0.5, color='tab:blue', label='rain')\n",
+ "except NameError:\n",
+ " pass\n",
+ "try:\n",
+ " plt.bar(x_axis, snow, width=0.02, alpha=0.5, color='tab:grey', label='snow')\n",
+ "except NameError:\n",
+ " pass\n",
+ "\n",
+ "# Get the current axes and store them as ax\n",
+ "ax = plt.gca()\n",
+ "\n",
+ "# Use the current ax, to get a tick-mark on the x_axis for each hour, and print like \"HH:MM\"\n",
+ "ax.xaxis.set_major_locator(mdates.HourLocator())\n",
+ "ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))\n",
+ "\n",
+ "# Add the legend\n",
+ "plt.legend(loc = 'upper right')\n",
+ "\n",
+ "# Add title to the plot, with date\n",
+ "plt.title(f'Precipitation {city_name}, ({date})')\n",
+ "\n",
+ "# Shows the plot\n",
+ "plt.show()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Show the DataFrame with the new columns\n",
+ "If the DataFrame did not contain 'rain.1h' or 'snow.1h', they should now have been added with 'NaN' values."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Display df, to see whether 'rain.1h' and 'snow.1h' were added with NaN values\n",
+ "display(df)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Check for missing values\n",
+ "Missingno checks for and visualizes missing values, making it easier to see which columns the problem lies in.\n",
+ "\n",
+ "If there are \"gaps\" in a column, that indicates missing values."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import missingno as msno\n",
+ "\n",
+ "# Check for and display missing values\n",
+ "msno.matrix(df)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Replace missing values\n",
+ "In most cases the data appear to be nearly \"perfect\", but they only include snow and rain when there actually is snow or rain. We therefore get NaN values for the measurements where it has not rained or snowed.\n",
+ "\n",
+ "Below, we first check whether rain or snow is present in the measurements and, if so, replace NaN with 0.\n",
+ "\n",
+ "Next, we check whether all the values in a column are 'NaN'; if so, we remove the whole column. This does not include snow and rain because we plot these values later: a value of 0 causes no error, but a missing column would.\n",
+ "\n",
+ "Finally, we check the other values and replace 'NaN' either with 0 or with the preceding value. Snow, rain and wind are set to 0; the rest are set to the preceding value."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# If rain is stored, fill the NaN with 0\n",
+ "try: \n",
+ " df['rain.1h'] = df['rain.1h'].fillna(0)\n",
+ "except KeyError:\n",
+ " print(\"['rain.1h'], not in df\")\n",
+ "\n",
+ "# If snow is stored, fill the NaN with 0\n",
+ "try: \n",
+ " df['snow.1h'] = df['snow.1h'].fillna(0)\n",
+ "except KeyError:\n",
+ " print(\"['snow.1h'], not in df\")\n",
+ "\n",
+ "# Drop every column in which all values are NaN\n",
+ "df = df.dropna(axis='columns', how='all')\n",
+ "\n",
+ "# If wind_gust is stored, fill the NaN with 0\n",
+ "try: \n",
+ " df['wind.gust'] = df['wind.gust'].fillna(0)\n",
+ "except KeyError:\n",
+ " print(\"['wind.gust'], not in df\")\n",
+ "\n",
+ "# If wind_deg is stored, fill the NaN with 0\n",
+ "try: \n",
+ " df['wind.deg'] = df['wind.deg'].fillna(0)\n",
+ "except KeyError:\n",
+ " print(\"['wind.deg'], not in df\")\n",
+ "\n",
+ "# If wind_speed is stored, fill the NaN with 0\n",
+ "try: \n",
+ " df['wind.speed'] = df['wind.speed'].fillna(0)\n",
+ "except KeyError:\n",
+ " print(\"['wind.speed'], not in df\")\n",
+ "\n",
+ "# Forward fill missing temperature values with the previous value\n",
+ "df['main.temp'] = df['main.temp'].ffill()\n",
+ "\n",
+ "# Forward fill missing values in what the temperature feels like\n",
+ "df['main.feels_like'] = df['main.feels_like'].ffill()\n",
+ "\n",
+ "# Forward fill missing values in the pressure\n",
+ "df['main.pressure'] = df['main.pressure'].ffill()\n",
+ "\n",
+ "# Forward fill missing values in the humidity\n",
+ "df['main.humidity'] = df['main.humidity'].ffill()\n",
+ "\n",
+ "# Forward fill missing values in the lowest temperature\n",
+ "df['main.temp_min'] = df['main.temp_min'].ffill()\n",
+ "\n",
+ "# Forward fill missing values in the highest temperature\n",
+ "df['main.temp_max'] = df['main.temp_max'].ffill()\n",
+ "\n",
+ "# Forward fill missing values of clouds\n",
+ "df['clouds.all'] = df['clouds.all'].ffill()\n",
+ "\n",
+ "# Display the df, now without NaN\n",
+ "display(df)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Visualize the change in the data\n",
+ "We add a second missingno visualization to confirm that the missing data \"disappear\" after running the cell above. "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import missingno as msno\n",
+ "\n",
+ "# Visualize the same data again; there should now be no missing values (at least for rain and snow)\n",
+ "msno.matrix(df)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Visualize the data in a graph\n",
+ "Using Matplotlib we visualize the selected data, and with subplots, a function in matplotlib, we can plot several values in the same figure and get \"two y-axes\" on the same x-axis.\n",
+ "\n",
+ "Temperature and precipitation share one graph, where the temperature values are read on the left side and the precipitation values on the right side.\n",
+ "\n",
+ "The graph below it, on the same x-axis, shows information about the wind: both wind speed and wind gusts."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import matplotlib.pyplot as plt\n",
+ "import matplotlib.dates as mdates\n",
+ "import os\n",
+ "\n",
+ "# Where the figure should be saved when exported\n",
+ "output_folder = \"../data/output_fig\"\n",
+ "\n",
+ "# Creates the folder if it does not exist\n",
+ "os.makedirs(output_folder, exist_ok=True)\n",
+ "\n",
+ "# Use the index (the datetime) as the x-axis\n",
+ "x_axis = df.index\n",
+ "\n",
+ "# Gets the values\n",
+ "rain = df['rain.1h']\n",
+ "temp = df['main.temp']\n",
+ "snow = df['snow.1h']\n",
+ "wind_gust = df['wind.gust']\n",
+ "wind_speed = df['wind.speed']\n",
+ "temp_mean = temp.mean().round(2)\n",
+ "\n",
+ "# Two vertically stacked axes (2 rows, 1 column) with the given figure size, sharing the same x-axis\n",
+ "fig, (ax1, ax3) = plt.subplots(2, 1,figsize=(15, 8), sharex=True)\n",
+ "\n",
+ "# Set the title above the first axes, with city_name and date\n",
+ "ax1.set_title(f'Weather data for {city_name} ({date}) ')\n",
+ "\n",
+ "# Plot temperature on the primary y-axis\n",
+ "ax1.plot(x_axis, temp, color='tab:red', label='Temperature (°C)')\n",
+ "\n",
+ "# Style the y-axis for temperature\n",
+ "ax1.set_ylabel('Temperature (°C)', color='tab:red')\n",
+ "ax1.axhline(y=temp_mean, color='tab:red', linestyle='dashed', label='Mean temperature (°C)')\n",
+ "ax1.tick_params(axis='y', labelcolor='tab:red')\n",
+ "\n",
+ "# Plot Precipitation as bars on the secondary y-axis\n",
+ "ax2 = ax1.twinx()\n",
+ "\n",
+ "# Add rain\n",
+ "ax2.bar(x_axis, rain, color='tab:blue', alpha=0.5, width=0.02, label='Rain (mm)')\n",
+ "\n",
+ "# Add snow\n",
+ "ax2.bar(x_axis, snow, color='tab:grey', alpha=0.5, width=0.02, label='Snow (mm)')\n",
+ "\n",
+ "# Style the y-axis for precipitation\n",
+ "ax2.set_ylabel(\"Precipitation (mm)\", color='tab:blue')\n",
+ "ax2.tick_params(axis='y', labelcolor='tab:blue')\n",
+ "\n",
+ "# Format the x-axis to show all hours, in the format \"HH:MM\"\n",
+ "ax1.xaxis.set_major_locator(mdates.HourLocator()) \n",
+ "ax1.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))\n",
+ "\n",
+ "# Add label-description for both axis\n",
+ "ax1.legend(loc='upper left')\n",
+ "ax2.legend(loc='upper right')\n",
+ "\n",
+ "# Add grid, but only vertically\n",
+ "ax1.grid(axis = 'x')\n",
+ "\n",
+ "# Plot the wind on the second axes (the lower plot)\n",
+ "ax3.plot(x_axis, wind_gust, color='tab:purple', linestyle='dashed', label='Wind_gust')\n",
+ "ax3.plot(x_axis, wind_speed, color='tab:purple', label='Wind_speed')\n",
+ "ax3.set_ylabel('Wind (m/s)')\n",
+ "\n",
+ "# Add x_label visible for both x-axis\n",
+ "ax3.set_xlabel('Datetime')\n",
+ "\n",
+ "# Add label-description\n",
+ "ax3.legend(loc='upper right')\n",
+ "\n",
+ "# Format the x-axis to show all hours, in the format \"HH:MM\"\n",
+ "ax3.xaxis.set_major_locator(mdates.HourLocator())\n",
+ "ax3.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))\n",
+ "\n",
+ "# Add grid, but only vertically\n",
+ "ax3.grid(axis = 'x')\n",
+ "\n",
+ "# Adjust layout\n",
+ "plt.tight_layout()\n",
+ "\n",
+ "# Save the plot to the data/output_fig folder\n",
+ "plot_path = os.path.join(output_folder, f\"weather_data_plot{city_name}.png\")\n",
+ "plt.savefig(plot_path) # Save the plot as a PNG file\n",
+ "\n",
+ "# Show the plot\n",
+ "plt.show()"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "venv",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.12.5"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
diff --git a/notebooks/notebook_one_week_data.ipynb b/notebooks/notebook_one_week_data.ipynb
new file mode 100644
index 0000000..dcaef29
--- /dev/null
+++ b/notebooks/notebook_one_week_data.ipynb
@@ -0,0 +1,626 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Notebook - One week data\n",
+ "This notebook fetches data for the chosen period (up to 7 days) and location and writes it to a file. It visualizes missing values, fixes them, and then visualizes the data and saves the plot."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Choose start date and end date\n",
+ "\n",
+ "To fetch data and run an analysis, the program needs to know which period you want data for.\n",
+ "\n",
+ "The date is entered like this: (yyyy, mm, dd, hh, mm)\n",
+ "Here is an example: \n",
+ "|What|Format|Example|\n",
+ "|:---|:---:|:---:|\n",
+ "|year|yyyy|2025|\n",
+ "|month|mm|03|\n",
+ "|day|dd|01|\n",
+ "|hour|hh|12|\n",
+ "|minute|mm|00|\n",
+ "\n",
+ "This date is then entered as follows: (2025, 03, 01, 12, 00)\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import sys\n",
+ "import os\n",
+ "\n",
+ "# Add the absolute path of the src folder to sys.path\n",
+ "sys.path.append(os.path.abspath(\"../src\"))\n",
+ "\n",
+ "# Now we can import the functions from the module\n",
+ "from my_package.date_to_unix import get_unix_timestamp\n",
+ "from my_package.date_to_unix import from_unix_timestamp\n",
+ "\n",
+ "# Run the function and store the result\n",
+ "unix_start_date, unix_end_date = get_unix_timestamp()\n",
+ "\n",
+ "# Prints the unix_timestamp\n",
+ "print(\"Start date => unix timestamp:\", unix_start_date)\n",
+ "print(\"End date => unix timestamp:\", unix_end_date)\n",
+ "\n",
+ "# Run the function to convert from unix_timestamp to date, and store the variables\n",
+ "start_date, end_date = from_unix_timestamp(unix_start_date, unix_end_date)\n",
+ "\n",
+ "# Prints the date\n",
+ "print(\"Unix timestamp => start date:\", start_date)\n",
+ "print(\"Unix timestamp => end date:\", end_date)\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Choose a location in Norway and fetch data\n",
+ "\n",
+ "Enter the location you want data from; for now, only locations in Norway are supported.\n",
+ "\n",
+ "The program will then fetch the data and save it to a JSON file."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import sys\n",
+ "import os\n",
+ "\n",
+ "# Add the absolute path of the src folder to sys.path\n",
+ "sys.path.append(os.path.abspath(\"../src\"))\n",
+ "\n",
+ "# Now we can import the function from the module\n",
+ "from my_package.fetch_data import fetch_data\n",
+ "\n",
+ "# Import the function that replaces Nordic letters (æøå)\n",
+ "from my_package.util import replace_nordic\n",
+ "\n",
+ "# The user enters the city to fetch weather data for\n",
+ "city_name = input(\"Enter a city in Norway: \")\n",
+ "\n",
+ "city_name = replace_nordic(city_name)\n",
+ "\n",
+ "# Stores the values in the variables\n",
+ "data, folder = fetch_data(unix_start_date, unix_end_date, city_name)\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Save data to a JSON file\n",
+ "\n",
+ "Enter a name for the file in which the data will be saved.\n",
+ "\n",
+ "E.g. test\n",
+ "The file is then saved as data_**test**.json, at the path \"../data/output_stedsnavn/data_{filename}.json\"\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Add the absolute path of the src folder to sys.path\n",
+ "sys.path.append(os.path.abspath(\"../src\"))\n",
+ "\n",
+ "from my_package.write_data import write_data\n",
+ "\n",
+ "# The user chooses the name of the file\n",
+ "filename = input(\"Write filename: \")\n",
+ "\n",
+ "# Write the data, with the chosen filename\n",
+ "write_data(data, folder, filename)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Read from file\n",
+ "\n",
+ "Loads the data stored in the file created above and displays it in a readable form using pandas."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import pandas as pd\n",
+ "\n",
+ "# Read json-file using pandas\n",
+ "data = pd.read_json(f'../data/output_stedsnavn/data_{filename}.json')\n",
+ "\n",
+ "# Display the data\n",
+ "display(data)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Cleaning the relevant data\n",
+ "We go into 'list' to find the relevant information rather than just the metadata.\n",
+ "\n",
+ "We remove duplicates and other irrelevant columns, and set the time column as the index."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import numpy as np\n",
+ "\n",
+ "# Goes into the 'list' to get the needed and relevant information\n",
+ "if 'list' in data:\n",
+ " # Normalize the json, for better readability\n",
+ " df = pd.json_normalize(data['list'])\n",
+ "\n",
+ " # Drop duplicates based on the dt column: other values may repeat, but each timestamp should appear only once\n",
+ " df = df.drop_duplicates(subset=['dt'])\n",
+ "\n",
+ " # The weather column does not hold any relevant information, so we drop it\n",
+ " df = df.drop(columns=\"weather\")\n",
+ "\n",
+ " # Convert 'dt' column from Unix timestamp to datetime and set it as the index\n",
+ " df['dt'] = pd.to_datetime(df['dt'], unit='s')\n",
+ " df.set_index('dt', inplace=True)\n",
+ "\n",
+ " # 'rain.1h' is absent when there was no rain; the lookup then raises a KeyError\n",
+ " try:\n",
+ " rain = df['rain.1h']\n",
+ "\n",
+ " # If no rain, make the rain column and fill it with NaN\n",
+ " except KeyError:\n",
+ " print(\"'Rain' is not present in the JSON file.\")\n",
+ " df['rain.1h'] = np.nan\n",
+ "\n",
+ " # 'snow.1h' is absent when there was no snow; the lookup then raises a KeyError\n",
+ " try:\n",
+ " snow = df['snow.1h']\n",
+ "\n",
+ " # If no snow, make the snow column and fill it with NaN\n",
+ " except KeyError:\n",
+ " print(\"'Snow' is not present in the JSON file.\")\n",
+ " df['snow.1h'] = np.nan\n",
+ "\n",
+ " # Display the DataFrame with the changes\n",
+ " display(df)\n",
+ "else:\n",
+ " print(\"The 'list' key is not present in the JSON file.\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Show the temperature\n",
+ "Computes the mean temperature using built-in functions, and also finds the highest and lowest measured temperatures."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Extract main values\n",
+ "temp = df['main.temp']\n",
+ "temp_mean = temp.mean().round(2)\n",
+ "temp_max = temp.max().round(2)\n",
+ "temp_min = temp.min().round(2)\n",
+ "\n",
+ "# Print the mean, highest and lowest temperature\n",
+ "print(f'Mean temperature: {temp_mean}')\n",
+ "print(f'Highest temperature: {temp_max}')\n",
+ "print(f'Lowest temperature: {temp_min}')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Check for missing values\n",
+ "Missingno checks for and visualizes missing values, making it easier to see which columns the problem lies in.\n",
+ "\n",
+ "If there are \"gaps\" in a column, that indicates missing values."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import missingno as msno\n",
+ "\n",
+ "# Check for and display missing values\n",
+ "msno.matrix(df)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Replace missing values\n",
+ "In most cases the data appear to be nearly \"perfect\", but they only include snow and rain when there actually is snow or rain. We therefore get NaN values for the measurements where it has not rained or snowed.\n",
+ "\n",
+ "Below, we first check whether rain or snow is present in the measurements and, if so, replace NaN with 0."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# If rain is stored, fill the NaN with 0\n",
+ "try: \n",
+ " df['rain.1h'] = df['rain.1h'].fillna(0)\n",
+ "except KeyError:\n",
+ " print(\"['rain.1h'], not in df\")\n",
+ "\n",
+ "# If snow is stored, fill the NaN with 0\n",
+ "try: \n",
+ " df['snow.1h'] = df['snow.1h'].fillna(0)\n",
+ "except KeyError:\n",
+ " print(\"['snow.1h'], not in df\")\n",
+ "\n",
+ "# If wind_gust is stored, fill the NaN with 0\n",
+ "try: \n",
+ " df['wind.gust'] = df['wind.gust'].fillna(0)\n",
+ "except KeyError:\n",
+ " print(\"['wind.gust'], not in df\")\n",
+ "\n",
+ "# If wind_deg is stored, fill the NaN with 0\n",
+ "try: \n",
+ " df['wind.deg'] = df['wind.deg'].fillna(0)\n",
+ "except KeyError:\n",
+ " print(\"['wind.deg'], not in df\")\n",
+ "\n",
+ "# If wind_speed is stored, fill the NaN with 0\n",
+ "try: \n",
+ " df['wind.speed'] = df['wind.speed'].fillna(0)\n",
+ "except KeyError:\n",
+ " print(\"['wind.speed'], not in df\")\n",
+ "\n",
+ "# Forward fill missing temperature values with the previous value\n",
+ "df['main.temp'] = df['main.temp'].ffill()\n",
+ "\n",
+ "# Forward fill missing values in what the temperature feels like\n",
+ "df['main.feels_like'] = df['main.feels_like'].ffill()\n",
+ "\n",
+ "# Forward fill missing values in the pressure\n",
+ "df['main.pressure'] = df['main.pressure'].ffill()\n",
+ "\n",
+ "# Forward fill missing values in the humidity\n",
+ "df['main.humidity'] = df['main.humidity'].ffill()\n",
+ "\n",
+ "# Forward fill missing values in the lowest temperature\n",
+ "df['main.temp_min'] = df['main.temp_min'].ffill()\n",
+ "\n",
+ "# Forward fill missing values in the highest temperature\n",
+ "df['main.temp_max'] = df['main.temp_max'].ffill()\n",
+ "\n",
+ "# Forward fill missing values of clouds\n",
+ "df['clouds.all'] = df['clouds.all'].ffill()\n",
+ "\n",
+ "# Display the df, now without NaN\n",
+ "display(df)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Visualize the change in the data\n",
+ "We add a second missingno visualization to confirm that the missing data \"disappear\" after running the cell above. "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import missingno as msno\n",
+ "\n",
+ "# Visualize the same data again; there should now be no missing values (at least for rain and snow)\n",
+ "msno.matrix(df)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Visualize the data in a graph\n",
+ "Using Matplotlib we visualize the selected data, and with subplots, a function in matplotlib, we can plot several values in the same figure and get \"two y-axes\" on the same x-axis.\n",
+ "\n",
+ "Temperature and precipitation share one graph, where the temperature values are read on the left side and the precipitation values on the right side.\n",
+ "\n",
+ "The graph below it, on the same x-axis, shows information about the wind: both wind speed and wind gusts."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import matplotlib.pyplot as plt\n",
+ "import matplotlib.dates as mdates\n",
+ "import os\n",
+ "\n",
+ "# Where the figure should be saved when exported\n",
+ "output_folder = \"../data/output_fig\"\n",
+ "\n",
+ "# Creates the folder if it does not exist\n",
+ "os.makedirs(output_folder, exist_ok=True)\n",
+ "\n",
+ "# Use the index (the datetime) as the x-axis\n",
+ "x_axis = df.index\n",
+ "\n",
+ "# Gets the values\n",
+ "rain = df['rain.1h']\n",
+ "temp = df['main.temp']\n",
+ "snow = df['snow.1h']\n",
+ "wind_gust = df['wind.gust']\n",
+ "wind_speed = df['wind.speed']\n",
+ "temp_mean = temp.mean().round(2)\n",
+ "\n",
+ "# Two vertically stacked axes (2 rows, 1 column) with the given figure size, sharing the same x-axis\n",
+ "fig, (ax1, ax3) = plt.subplots(2, 1,figsize=(15, 8), sharex=True)\n",
+ "\n",
+ "\n",
+ "# Set the title above the first axes, with city_name and the start and end dates\n",
+ "ax1.set_title(f'Weather data for {city_name} ({start_date}) to ({end_date}) ')\n",
+ "\n",
+ "# Plot temperature on the primary y-axis\n",
+ "ax1.plot(x_axis, temp, color='tab:red', label='Temperature (°C)')\n",
+ "ax1.axhline(y=temp_mean, color='tab:red', linestyle='dashed', label='Mean temperature (°C)')\n",
+ "ax1.axhline(y=0, color='black', linewidth=1.5)\n",
+ "\n",
+ "# Style the y-axis for temperature\n",
+ "ax1.set_ylabel('Temperature (°C)', color='tab:red')\n",
+ "ax1.tick_params(axis='y', labelcolor='tab:red')\n",
+ "\n",
+ "# Plot Precipitation as bars on the secondary y-axis\n",
+ "ax2 = ax1.twinx()\n",
+ "\n",
+ "# Add rain\n",
+ "# ax2.bar(x_axis, rain, color='tab:blue', alpha=0.5, width=0.02, label='Rain (mm)')\n",
+ "ax2.hist(x_axis, bins=len(x_axis), weights=rain, color='tab:blue', alpha=0.5, label= 'Rain (mm)', bottom=snow)\n",
+ "\n",
+ "# Add snow\n",
+ "# ax2.bar(x_axis, snow, color='tab:grey', alpha=0.5, width=0.02, label='Snow (mm)')\n",
+ "ax2.hist(x_axis, bins=len(x_axis), weights=snow, color='tab:gray', alpha=0.5, label= 'Snow (mm)')\n",
+ "\n",
+ "# Style the y-axis for precipitation\n",
+ "ax2.set_ylabel(\"Precipitation (mm)\", color='tab:blue')\n",
+ "ax2.tick_params(axis='y', labelcolor='tab:blue')\n",
+ "\n",
+ "\n",
+ "# Customize the x-axis to show a tick every 12 hours\n",
+ "ax1.xaxis.set_major_locator(mdates.HourLocator(interval=12)) # Tick marks every 12 hours\n",
+ "ax1.xaxis.set_major_formatter(mdates.DateFormatter('%d %b %H')) # Format as \"Day Month Hour\"\n",
+ "\n",
+ "# Add label-description for both axis\n",
+ "ax1.legend(loc='upper left')\n",
+ "ax2.legend(loc='upper right')\n",
+ "\n",
+ "# Add grid, but only vertically\n",
+ "ax1.grid(axis = 'x')\n",
+ "\n",
+ "\n",
+ "# Plot the wind on the second axes (the lower plot)\n",
+ "ax3.plot(x_axis, wind_gust, color='tab:purple', linestyle='dashed', label='Wind_gust')\n",
+ "ax3.plot(x_axis, wind_speed, color='tab:purple', label='Wind_speed')\n",
+ "ax3.set_ylabel('Wind (m/s)')\n",
+ "\n",
+ "# Add x_label visible for both x-axis\n",
+ "ax3.set_xlabel('Datetime')\n",
+ "\n",
+ "# Add label-description\n",
+ "ax3.legend(loc='upper right')\n",
+ "\n",
+ "# Customize the x-axis to show a tick every 12 hours\n",
+ "ax3.xaxis.set_major_locator(mdates.HourLocator(interval=12)) # Tick marks every 12 hours\n",
+ "ax3.xaxis.set_major_formatter(mdates.DateFormatter('%d %b %H')) # Format as \"Day Month Hour\"\n",
+ "\n",
+ "# Add grid, but only vertically\n",
+ "ax3.grid(axis = 'x')\n",
+ "\n",
+ "# Adjust layout\n",
+ "plt.tight_layout()\n",
+ "\n",
+ "# Save the plot to the data/output_fig folder\n",
+ "plot_path = os.path.join(output_folder, f\"weather_data_plot{city_name}.png\")\n",
+ "plt.savefig(plot_path) # Save the plot as a PNG file\n",
+ "\n",
+ "\n",
+ "# Show the plot\n",
+ "plt.show()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import numpy as np\n",
+ "import statistics\n",
+ "\n",
+ "# Extract the temperature column\n",
+ "temp_mean = df['main.temp']\n",
+ "\n",
+ "# Calculate means\n",
+ "temp_mean_mean = temp_mean.mean()\n",
+ "\n",
+ "\n",
+ "# Calculate standard deviations\n",
+ "temp_mean_stdev = statistics.stdev(temp_mean)\n",
+ "\n",
+ "\n",
+ "# Calculate 3 standard deviation limits\n",
+ "mean_lower_limit = temp_mean_mean - (temp_mean_stdev * 3)\n",
+ "mean_upper_limit = temp_mean_mean + (temp_mean_stdev * 3)\n",
+ "\n",
+ "# Identify outliers\n",
+ "mean_outliers = df.loc[(df['main.temp'] > mean_upper_limit) | (df['main.temp'] < mean_lower_limit), 'main.temp']\n",
+ "\n",
+ "# Print the outliers\n",
+ "print(\"Outliers in main.temp:\")\n",
+ "print(mean_outliers)\n",
+ "\n",
+ "# Replace outliers with NaN\n",
+ "df.loc[(df['main.temp'] > mean_upper_limit) | (df['main.temp'] < mean_lower_limit), 'main.temp'] = np.nan\n",
+ "\n",
+ "# Interpolate to replace NaN values with linear interpolation\n",
+ "df['main.temp'] = df['main.temp'].interpolate(method='linear')"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import matplotlib.pyplot as plt\n",
+ "import matplotlib.dates as mdates\n",
+ "import os\n",
+ "\n",
+ "# Where the figure should be saved when exported\n",
+ "output_folder = \"../data/output_fig\"\n",
+ "\n",
+ "# Creates the folder if it does not exist\n",
+ "os.makedirs(output_folder, exist_ok=True)\n",
+ "\n",
+ "# Use the index (the datetime) as the x-axis\n",
+ "x_axis = df.index\n",
+ "\n",
+ "# Gets the values\n",
+ "rain = df['rain.1h']\n",
+ "temp = df['main.temp']\n",
+ "snow = df['snow.1h']\n",
+ "wind_gust = df['wind.gust']\n",
+ "wind_speed = df['wind.speed']\n",
+ "temp_mean = temp.mean().round(2)\n",
+ "\n",
+ "# Two vertically stacked axes (2 rows, 1 column) with the given figure size, sharing the same x-axis\n",
+ "fig, (ax1, ax3) = plt.subplots(2, 1,figsize=(15, 8), sharex=True)\n",
+ "\n",
+ "\n",
+ "# Set the title above the first axes, with city_name and the start and end dates\n",
+ "ax1.set_title(f'Weather data for {city_name} ({start_date}) to ({end_date}) ')\n",
+ "\n",
+ "# Plot temperature on the primary y-axis\n",
+ "ax1.plot(x_axis, temp, color='tab:red', label='Temperature (°C)')\n",
+ "ax1.axhline(y=temp_mean, color='tab:red', linestyle='dashed', label='Mean temperature (°C)')\n",
+ "ax1.axhline(y=0, color='black', linewidth=1.5)\n",
+ "\n",
+ "# Style the y-axis for temperature\n",
+ "ax1.set_ylabel('Temperature (°C)', color='tab:red')\n",
+ "ax1.tick_params(axis='y', labelcolor='tab:red')\n",
+ "\n",
+ "# Plot Precipitation as bars on the secondary y-axis\n",
+ "ax2 = ax1.twinx()\n",
+ "\n",
+ "# Add rain\n",
+ "# ax2.bar(x_axis, rain, color='tab:blue', alpha=0.5, width=0.02, label='Rain (mm)')\n",
+ "ax2.hist(x_axis, bins=len(x_axis), weights=rain, color='tab:blue', alpha=0.5, label= 'Rain (mm)', bottom=snow)\n",
+ "\n",
+ "# Add snow\n",
+ "# ax2.bar(x_axis, snow, color='tab:grey', alpha=0.5, width=0.02, label='Snow (mm)')\n",
+ "ax2.hist(x_axis, bins=len(x_axis), weights=snow, color='tab:gray', alpha=0.5, label= 'Snow (mm)')\n",
+ "\n",
+ "# Style the y-axis for precipitation\n",
+ "ax2.set_ylabel(\"Precipitation (mm)\", color='tab:blue')\n",
+ "ax2.tick_params(axis='y', labelcolor='tab:blue')\n",
+ "\n",
+ "\n",
+ "# Customize the x-axis to show a tick every 12 hours\n",
+ "ax1.xaxis.set_major_locator(mdates.HourLocator(interval=12)) # Tick mark every 12 hours\n",
+ "ax1.xaxis.set_major_formatter(mdates.DateFormatter('%d %b %H')) # Format as \"Day Month Hour\"\n",
+ "\n",
+ "# Add legends for both y-axes\n",
+ "ax1.legend(loc='upper left')\n",
+ "ax2.legend(loc='upper right')\n",
+ "\n",
+ "# Add grid, but only vertically\n",
+ "ax1.grid(axis = 'x')\n",
+ "\n",
+ "\n",
+ "# Plot the wind on the lower subplot (ax3)\n",
+ "ax3.plot(x_axis, wind_gust, color='tab:purple', linestyle='dashed', label='Wind_gust')\n",
+ "ax3.plot(x_axis, wind_speed, color='tab:purple', label='Wind_speed')\n",
+ "ax3.set_ylabel('Wind (m/s)')\n",
+ "\n",
+ "# Add the x-label on the lower subplot; the x-axis is shared by both subplots\n",
+ "ax3.set_xlabel('Datetime')\n",
+ "\n",
+ "# Add legend\n",
+ "ax3.legend(loc='upper right')\n",
+ "\n",
+ "# Customize the x-axis to show a tick every 12 hours\n",
+ "ax3.xaxis.set_major_locator(mdates.HourLocator(interval=12)) # Tick mark every 12 hours\n",
+ "ax3.xaxis.set_major_formatter(mdates.DateFormatter('%d %b %H')) # Format as \"Day Month Hour\"\n",
+ "\n",
+ "# Add grid, but only vertically\n",
+ "ax3.grid(axis = 'x')\n",
+ "\n",
+ "# Adjust layout\n",
+ "plt.tight_layout()\n",
+ "\n",
+ "# Save the plot to the data/output_fig folder\n",
+ "plot_path = os.path.join(output_folder, f\"weather_data_plot{city_name}.png\")\n",
+ "plt.savefig(plot_path) # Save the plot as a PNG file\n",
+ "\n",
+ "\n",
+ "# Show the plot\n",
+ "plt.show()"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "venv",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.12.5"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
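Review note: the outlier cell above marks values more than 3 standard deviations from the mean and fills them by linear interpolation. The same logic can be sketched as a small, dependency-free helper that is easy to unit test. The function name `replace_outliers`, the `n_std` parameter, and the handling of outliers at the very start or end of the series (copying the nearest valid value) are assumptions made for illustration, not part of the notebook.

```python
import statistics


def replace_outliers(values, n_std=3.0):
    """Replace values further than n_std sample standard deviations
    from the mean with linearly interpolated values.

    Hypothetical helper: a plain-list sketch of the notebook's
    NaN-then-interpolate step.
    """
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)  # sample standard deviation, as in the notebook
    lower = mean - n_std * stdev
    upper = mean + n_std * stdev

    # Mark outliers as None (the notebook uses NaN for this step)
    marked = [v if lower <= v <= upper else None for v in values]

    result = list(marked)
    n = len(marked)
    for i, v in enumerate(marked):
        if v is not None:
            continue
        # Nearest valid neighbours on each side of the gap
        left = next((j for j in range(i - 1, -1, -1) if marked[j] is not None), None)
        right = next((j for j in range(i + 1, n) if marked[j] is not None), None)
        if left is not None and right is not None:
            # Linear interpolation between the two neighbours
            frac = (i - left) / (right - left)
            result[i] = marked[left] + frac * (marked[right] - marked[left])
        elif left is not None:
            result[i] = marked[left]   # assumption: edge gaps copy the nearest value
        elif right is not None:
            result[i] = marked[right]  # assumption: edge gaps copy the nearest value
    return result


# Usage: one spike in an otherwise flat series
temperatures = [10.0] * 10 + [100.0] + [12.0] * 10
cleaned = replace_outliers(temperatures)
print(cleaned[10])
```

Keeping the limits and the interpolation in one function would also avoid computing the same boolean mask twice, as the cell currently does.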
diff --git a/notebooks/notebook_statistic_data.ipynb b/notebooks/notebook_statistic_data.ipynb
new file mode 100644
index 0000000..996d9cc
--- /dev/null
+++ b/notebooks/notebook_statistic_data.ipynb
@@ -0,0 +1,602 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Notebook - Statistic data\n",
+ "This notebook fetches data from an API that aggregates all historical data for the chosen location and computes statistical values for every day of the year. We remove unwanted columns, exclude extreme values, and visualize the data with plots."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Choose a location in Norway to get statistical data\n",
+ "\n",
+ "This API returns statistical historical data, that is, statistics computed from the historical data rather than the actual historical measurements.\n",
+ "\n",
+ "The statistics are based on the historical data as a whole, not on each individual year."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import sys\n",
+ "import os\n",
+ "\n",
+ "# Gets the absolute path to the src folder\n",
+ "sys.path.append(os.path.abspath(\"../src\"))\n",
+ "\n",
+ "# Now we can import the function from the module\n",
+ "from my_package.year_data import fetch_data\n",
+ "\n",
+ "# Import the function that replaces Nordic characters (æøå)\n",
+ "from my_package.util import replace_nordic\n",
+ "\n",
+ "# The user enters the city to fetch weather data for\n",
+ "city_name = input(\"Enter a city in Norway: \")\n",
+ "\n",
+ "city_name = replace_nordic(city_name)\n",
+ "\n",
+ "data, folder = fetch_data(city_name)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Save data to a JSON file\n",
+ "\n",
+ "Enter a name for the file the data should be saved to.\n",
+ "\n",
+ "E.g. test\n",
+ "The file is then saved as data_**test**.json, at the path \"../data/output_statistikk/data_{filename}.json\"\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Gets the absolute path to the src folder\n",
+ "sys.path.append(os.path.abspath(\"../src\"))\n",
+ "\n",
+ "from my_package.write_data import write_data\n",
+ "\n",
+ "filename = input(\"Write filename: \")\n",
+ "\n",
+ "write_data(data, folder, filename)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Read from file\n",
+ "\n",
+ "Loads the data saved to the file above and stores it in a variable."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import pandas as pd\n",
+ "\n",
+ "data = pd.read_json(f'../data/output_statistikk/data_{filename}.json')\n",
+ "\n",
+ "# Display data\n",
+ "display(data)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Readable data\n",
+ "Makes the data loaded above more readable."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import pandas as pd\n",
+ "\n",
+ "# Checks if the 'result' column is in the data\n",
+ "if 'result' in data:\n",
+ " # Normalize the json and store it as a dataframe for better readability\n",
+ " df = pd.json_normalize(data['result'])\n",
+ "\n",
+ " # Display the dataframe\n",
+ " display(df)\n",
+ "else:\n",
+ " print(\"'result' not in data\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Cleaning the data\n",
+ "Removes all columns we do not need, such as the per-day standard deviation for every category; we can compute a common one with the statistics module.\n",
+ "\n",
+ "Since all categories have the same structure and we want to remove the same values from each of them, we can use the filter function to select the columns whose names contain e.g. '.st_dev'. This removes those columns from all categories at once, so we do not have to list each one."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Drop all columns whose names contain the given substring, using the filter function\n",
+ "df = df.drop(columns=df.filter(like='.p25').columns)\n",
+ "df = df.drop(columns=df.filter(like='.p75').columns)\n",
+ "df = df.drop(columns=df.filter(like='.st_dev').columns)\n",
+ "df = df.drop(columns=df.filter(like='.num').columns)\n",
+ "\n",
+ "display(df)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Plotting temperature\n",
+ "This code plots the mean temperature through the year. To keep the results of different runs, the graph is saved to \"../data/output_fig/mean_temp_plot_{city_name}.png\"\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import matplotlib.pyplot as plt\n",
+ "import matplotlib.dates as mdates\n",
+ "import os\n",
+ "import sys\n",
+ "\n",
+ "# Gets the absolute path to the src folder\n",
+ "sys.path.append(os.path.abspath(\"../src\"))\n",
+ "\n",
+ "# Import the kelvin to celsius function\n",
+ "from my_package.util import kelvin_to_celsius\n",
+ "\n",
+ "output_folder = \"../data/output_fig\"\n",
+ "os.makedirs(output_folder, exist_ok=True) # Create the folder if it doesn't exist\n",
+ "\n",
+ "# Convert from Kelvin and store the Celsius temperature in a new column\n",
+ "df['temp.mean_celsius'] = kelvin_to_celsius(df['temp.mean'])\n",
+ "temp = df['temp.mean_celsius']\n",
+ "\n",
+ "# Convert from day and month, to datetime\n",
+ "# df['date'] = pd.to_datetime(df[['month', 'day']].assign(year=2024))\n",
+ "\n",
+ "# Create a new column that concatenates month and day (e.g., \"03-01\" for March 1)\n",
+ "df['month_day'] = df[['month', 'day']].apply(lambda x: f\"{x['month']:02d}-{x['day']:02d}\",axis=1)\n",
+ "\n",
+ "# Plot the graph of the mean temperature\n",
+ "plt.figure(figsize=(12, 6))\n",
+ "plt.plot(df['month_day'], temp)\n",
+ "\n",
+ "# Label for easier reading and understanding of the plot\n",
+ "plt.title(f\"Mean temp - statistic historical {city_name}\")\n",
+ "plt.xlabel(\"Date\")\n",
+ "plt.ylabel(\"Temperature (°C)\")\n",
+ "\n",
+ "# Customize the x-axis to show ticks and labels only at the start of each month\n",
+ "plt.gca().xaxis.set_major_locator(mdates.MonthLocator()) \n",
+ "# Format ticks to show abbreviated month names (e.g., Jan, Feb)\n",
+ "plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%b')) \n",
+ "\n",
+ "plt.xticks(rotation=45)\n",
+ "plt.yticks(range(-20, 30, 2))\n",
+ "plt.tight_layout()\n",
+ "plt.grid()\n",
+ "\n",
+ "# Save the plot to the data/output_fig folder\n",
+ "plot_path = os.path.join(output_folder, f\"mean_temp_plot_{city_name}.png\")\n",
+ "plt.savefig(plot_path) # Save the plot as a PNG file\n",
+ "\n",
+ "# Show the plot\n",
+ "plt.show()\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Plotting data\n",
+ "Here temperature and precipitation are plotted in the same subplot, with wind in a separate graph below; both share the same x-axis, month_day."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import matplotlib.pyplot as plt\n",
+ "import matplotlib.dates as mdates\n",
+ "import os\n",
+ "import sys\n",
+ "\n",
+ "# Gets the absolute path to the src folder\n",
+ "sys.path.append(os.path.abspath(\"../src\"))\n",
+ "\n",
+ "# Import the kelvin to celsius function\n",
+ "from my_package.util import kelvin_to_celsius\n",
+ "\n",
+ "# Define the output folder for the figure, and create it if it does not exist\n",
+ "output_folder = \"../data/output_fig\"\n",
+ "os.makedirs(output_folder, exist_ok=True) \n",
+ "\n",
+ "# Convert from Kelvin and store the Celsius temperature in a new column\n",
+ "df['temp.mean_celsius'] = kelvin_to_celsius(df['temp.mean'])\n",
+ "temp = df['temp.mean_celsius']\n",
+ "precipitation = df['precipitation.mean']\n",
+ "wind = df['wind.mean']\n",
+ "\n",
+ "# Create a new column that concatenates month and day (e.g., \"03-01\" for March 1)\n",
+ "df['month_day'] = df[['month', 'day']].apply(lambda x: f\"{x['month']:02d}-{x['day']:02d}\",axis=1)\n",
+ "\n",
+ "x_axis = df['month_day']\n",
+ "\n",
+ "fig, (ax1, ax3) = plt.subplots(2, 1, figsize = (15, 8), sharex=True)\n",
+ "\n",
+ "# Plot temperature on the primary y-axis\n",
+ "ax1.plot(x_axis, temp, color='tab:red', label='Temperature (°C)')\n",
+ "# ax1.set_xlabel('Datetime')\n",
+ "ax1.set_ylabel('Temperature (°C)', color='tab:red')\n",
+ "ax1.tick_params(axis='y', labelcolor='tab:red')\n",
+ "\n",
+ "# Plot precipitation as bars on the secondary y-axis\n",
+ "ax2 = ax1.twinx()\n",
+ "ax2.bar(x_axis, precipitation, color='tab:blue', alpha=0.5, width=1, label='Precipitation (mm)')\n",
+ "ax2.set_ylabel(\"Precipitation (mm)\", color='tab:blue')\n",
+ "ax2.tick_params(axis='y', labelcolor='tab:blue')\n",
+ "\n",
+ "ax1.grid(axis = 'x')\n",
+ "ax1.legend(loc='upper left')\n",
+ "ax2.legend(loc='upper right')\n",
+ "\n",
+ "ax3.plot(x_axis, wind, color='tab:purple', label='Wind (m/s)')\n",
+ "# ax3.plot(x_axis, wind_speed, color='tab:purple', linestyle='dashed', label='Wind_speed')\n",
+ "ax3.set_ylabel('Wind (m/s)')\n",
+ "ax3.set_xlabel('Datetime')\n",
+ "ax3.legend(loc='upper right')\n",
+ "\n",
+ "ax3.grid(axis = 'x')\n",
+ "\n",
+ "\n",
+ "# Customize the x-axis to show ticks and labels only at the start of each month\n",
+ "plt.gca().xaxis.set_major_locator(mdates.MonthLocator()) \n",
+ "# Format ticks to show abbreviated month names (e.g., Jan, Feb)\n",
+ "plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%b')) \n",
+ "\n",
+ "plt.tight_layout()\n",
+ "\n",
+ "# Show the plot\n",
+ "plt.show()\n",
+ "\n",
+ "print(df['precipitation.max'].max())"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Visualizing measured temperatures\n",
+ "\n",
+ "Using matplotlib, we visualize the temperature measured for each day.\n",
+ "\n",
+ "Legend for the graph:\n",
+ "- Gray line: mean of all measurements\n",
+ "- Red line: highest measured temperature\n",
+ "- Blue line: lowest measured temperature"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import matplotlib.pyplot as plt\n",
+ "import matplotlib.dates as mdates\n",
+ "import os\n",
+ "import sys\n",
+ "\n",
+ "# Gets the absolute path to the src folder\n",
+ "sys.path.append(os.path.abspath(\"../src\"))\n",
+ "\n",
+ "# Import the kelvin to celsius function\n",
+ "from my_package.util import kelvin_to_celsius\n",
+ "\n",
+ "# Convert from Kelvin and store the Celsius temperature in a new column\n",
+ "df['temp.mean_celsius'] = kelvin_to_celsius(df['temp.mean'])\n",
+ "temp_mean = df['temp.mean_celsius']\n",
+ "\n",
+ "df['temp.record_max_celsius'] = kelvin_to_celsius(df['temp.record_max'])\n",
+ "temp_record_max = df['temp.record_max_celsius']\n",
+ "\n",
+ "df['temp.record_min_celsius'] = kelvin_to_celsius(df['temp.record_min'])\n",
+ "temp_record_min = df['temp.record_min_celsius']\n",
+ "\n",
+ "# Create a new column that concatenates month and day (e.g., \"03-01\" for March 1)\n",
+ "df['month_day'] = df[['month', 'day']].apply(lambda x: f\"{x['month']:02d}-{x['day']:02d}\",axis=1)\n",
+ "\n",
+ "# Use the month_day column as values for the x-axis\n",
+ "x_axis = df['month_day']\n",
+ "\n",
+ "# Defines the height and width of the figure\n",
+ "plt.figure(figsize=(12, 6))\n",
+ "\n",
+ "# Plot the temperatures\n",
+ "plt.plot(x_axis, temp_mean, color='tab:gray', label='Mean temperature')\n",
+ "plt.plot(x_axis, temp_record_max, color='tab:red', label='Max temperature')\n",
+ "plt.plot(x_axis, temp_record_min, color='tab:blue', label='Min temperature')\n",
+ "\n",
+ "\n",
+ "# Customize the x-axis to show ticks and labels only at the start of each month\n",
+ "plt.gca().xaxis.set_major_locator(mdates.MonthLocator()) \n",
+ "# Format ticks to show abbreviated month names (e.g., Jan, Feb)\n",
+ "plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%b')) \n",
+ "\n",
+ "plt.tight_layout()\n",
+ "\n",
+ "# Plot title with city_name\n",
+ "plt.title(f'Temperature {city_name}')\n",
+ "\n",
+ "# Add grid\n",
+ "plt.grid()\n",
+ "\n",
+ "# Show the legend\n",
+ "plt.legend(loc = 'upper right')\n",
+ "\n",
+ "# Show the plot\n",
+ "plt.show()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Checking for outliers\n",
+ "This code checks for outliers in the temperature series, i.e. values that lie more than 3 standard deviations from the mean."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import numpy as np\n",
+ "import statistics\n",
+ "\n",
+ "# Ensure 'month_day' is set as the index\n",
+ "if 'month_day' in df.columns:\n",
+ " df.set_index('month_day', inplace=True)\n",
+ "else:\n",
+ "    print(\"'month_day' is not a column; it may already be the index\")\n",
+ "\n",
+ "# Extract temperature columns\n",
+ "temp_mean = df['temp.mean_celsius']\n",
+ "temp_record_min = df['temp.record_min_celsius']\n",
+ "temp_record_max = df['temp.record_max_celsius']\n",
+ "\n",
+ "# Calculate means\n",
+ "temp_mean_mean = temp_mean.mean()\n",
+ "temp_record_min_mean = temp_record_min.mean()\n",
+ "temp_record_max_mean = temp_record_max.mean()\n",
+ "\n",
+ "# Calculate standard deviations\n",
+ "temp_mean_stdev = statistics.stdev(temp_mean)\n",
+ "temp_record_min_stdev = statistics.stdev(temp_record_min)\n",
+ "temp_record_max_stdev = statistics.stdev(temp_record_max)\n",
+ "\n",
+ "# Calculate 3 standard deviation limits\n",
+ "mean_lower_limit = temp_mean_mean - (temp_mean_stdev * 3)\n",
+ "mean_upper_limit = temp_mean_mean + (temp_mean_stdev * 3)\n",
+ "\n",
+ "min_lower_limit = temp_record_min_mean - (temp_record_min_stdev * 3)\n",
+ "min_upper_limit = temp_record_min_mean + (temp_record_min_stdev * 3)\n",
+ "\n",
+ "max_lower_limit = temp_record_max_mean - (temp_record_max_stdev * 3)\n",
+ "max_upper_limit = temp_record_max_mean + (temp_record_max_stdev * 3)\n",
+ "\n",
+ "# Identify outliers\n",
+ "mean_outliers = df.loc[(df['temp.mean_celsius'] > mean_upper_limit) | (df['temp.mean_celsius'] < mean_lower_limit), 'temp.mean_celsius']\n",
+ "min_outliers = df.loc[(df['temp.record_min_celsius'] > min_upper_limit) | (df['temp.record_min_celsius'] < min_lower_limit), 'temp.record_min_celsius']\n",
+ "max_outliers = df.loc[(df['temp.record_max_celsius'] > max_upper_limit) | (df['temp.record_max_celsius'] < max_lower_limit), 'temp.record_max_celsius']\n",
+ "\n",
+ "# Print the outliers\n",
+ "print(\"Outliers in temp.mean_celsius:\")\n",
+ "print(mean_outliers)\n",
+ "\n",
+ "print(\"Outliers in temp.record_min_celsius:\")\n",
+ "print(min_outliers)\n",
+ "\n",
+ "print(\"Outliers in temp.record_max_celsius:\")\n",
+ "print(max_outliers)\n",
+ "\n",
+ "# Replace outliers with NaN\n",
+ "df.loc[(df['temp.mean_celsius'] > mean_upper_limit) | (df['temp.mean_celsius'] < mean_lower_limit), 'temp.mean_celsius'] = np.nan\n",
+ "df.loc[(df['temp.record_min_celsius'] > min_upper_limit) | (df['temp.record_min_celsius'] < min_lower_limit), 'temp.record_min_celsius'] = np.nan\n",
+ "df.loc[(df['temp.record_max_celsius'] > max_upper_limit) | (df['temp.record_max_celsius'] < max_lower_limit), 'temp.record_max_celsius'] = np.nan\n",
+ "\n",
+ "# Interpolate to replace NaN values with linear interpolation\n",
+ "df['temp.mean_celsius'] = df['temp.mean_celsius'].interpolate(method='linear')\n",
+ "df['temp.record_min_celsius'] = df['temp.record_min_celsius'].interpolate(method='linear')\n",
+ "df['temp.record_max_celsius'] = df['temp.record_max_celsius'].interpolate(method='linear')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Visualizing temperature after the changes\n",
+ "If the data contained outliers, which have now been replaced, this plot shows a more accurate, \"error-free\" picture."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import matplotlib.pyplot as plt\n",
+ "import matplotlib.dates as mdates\n",
+ "\n",
+ "# Ensure 'month_day' is set as the index for proper plotting\n",
+ "if 'month_day' in df.columns:\n",
+ " df.set_index('month_day', inplace=True)\n",
+ "\n",
+ "# Extract updated temperature columns\n",
+ "temp_mean = df['temp.mean_celsius']\n",
+ "temp_record_max = df['temp.record_max_celsius']\n",
+ "temp_record_min = df['temp.record_min_celsius']\n",
+ "\n",
+ "# Plot the updated temperature data\n",
+ "plt.figure(figsize=(12, 6))\n",
+ "\n",
+ "# Plot mean, max, and min temperatures\n",
+ "plt.plot(temp_mean.index, temp_mean, color='tab:gray', label='Mean Temperature')\n",
+ "plt.plot(temp_record_max.index, temp_record_max, color='tab:red', label='Max Temperature')\n",
+ "plt.plot(temp_record_min.index, temp_record_min, color='tab:blue', label='Min Temperature')\n",
+ "\n",
+ "# Customize the x-axis to show ticks and labels only at the start of each month\n",
+ "plt.gca().xaxis.set_major_locator(mdates.MonthLocator()) \n",
+ "plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%b')) # Format ticks to show abbreviated month names (e.g., Jan, Feb)\n",
+ "\n",
+ "# Add labels, title, and legend\n",
+ "plt.xlabel('Month-Day')\n",
+ "plt.ylabel('Temperature (°C)')\n",
+ "plt.title(f'Temperature Data for {city_name}')\n",
+ "plt.legend(loc='upper right')\n",
+ "\n",
+ "# Add grid for better readability\n",
+ "plt.grid()\n",
+ "\n",
+ "# Adjust layout to prevent overlap\n",
+ "plt.tight_layout()\n",
+ "\n",
+ "# Show the plot\n",
+ "plt.show()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Records\n",
+ "\n",
+ "This function computes various records for the year at the given location."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import sys\n",
+ "import os\n",
+ "\n",
+ "# Gets the absolute path to the src folder\n",
+ "sys.path.append(os.path.abspath(\"../src\"))\n",
+ "\n",
+ "from my_package.get_record import get_records\n",
+ "\n",
+ "summary_df, filename, folder = get_records(df, city_name)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Writing the data to file\n",
+ "Saves the record data to a file named after the location."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Gets the absolute path to the src folder\n",
+ "sys.path.append(os.path.abspath(\"../src\"))\n",
+ "\n",
+ "from my_package.write_data import write_data\n",
+ "# Convert the DataFrame to a JSON-compatible list of dicts\n",
+ "json_data = summary_df.to_dict(orient=\"records\")\n",
+ "\n",
+ "write_data(json_data, folder, filename)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Reading from file and displaying the data\n",
+ "This cell reads the records from the file they were written to and displays them as an easy-to-read table."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import pandas as pd\n",
+ "import json\n",
+ "\n",
+ "# Reads data from file and store it\n",
+ "with open(f\"../data/output_record/data_{filename}.json\", \"r\", encoding=\"utf-8\") as file:\n",
+ " data = json.load(file)\n",
+ "\n",
+ "# Normalize the data for better readability\n",
+ "df = pd.json_normalize(data)\n",
+ "\n",
+ "\n",
+ "# Displays the dataframe\n",
+ "display(df)"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "venv",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.12.5"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
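Review note: the plotting cells above build `"MM-DD"` strings for the x-axis and then apply `mdates.MonthLocator`, while a commented-out line hints at converting month and day to real dates with the year 2024. Date locators are designed for date values, so a conversion along those lines may be the more robust route. The sketch below shows the idea with the standard library only; `REFERENCE_YEAR`, `to_plot_date`, and `month_day_label` are hypothetical names, and the choice of 2024 is taken from the commented-out line (any leap year would do, so that 29 February stays valid).

```python
from datetime import date

# Assumption: a leap year is used so that (2, 29) maps to a valid date
REFERENCE_YEAR = 2024


def to_plot_date(month, day, year=REFERENCE_YEAR):
    """Turn a (month, day) pair from the statistics data into a real date."""
    return date(year, month, day)


def month_day_label(month, day):
    """Build the zero-padded 'MM-DD' label used in the notebook."""
    return f"{month:02d}-{day:02d}"


# Usage: a few (month, day) pairs as they appear in the 'month' and 'day' columns
pairs = [(1, 1), (2, 29), (12, 31)]
dates = [to_plot_date(m, d) for m, d in pairs]
labels = [month_day_label(m, d) for m, d in pairs]
print(dates, labels)
```

With real dates on the x-axis, `MonthLocator` and `DateFormatter('%b')` operate on the values they were designed for, and the `"MM-DD"` label remains available as an index or for table output.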
diff --git a/notebooks/test_notebook.ipynb b/notebooks/notebook_test.ipynb
similarity index 75%
rename from notebooks/test_notebook.ipynb
rename to notebooks/notebook_test.ipynb
index c887b20..d7ff57d 100644
--- a/notebooks/test_notebook.ipynb
+++ b/notebooks/notebook_test.ipynb
@@ -1,21 +1,18 @@
{
"cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Notebook - Test\n",
+ "This is just a test notebook, used to check that the venv works and that functions can be imported from packages."
+ ]
+ },
{
"cell_type": "code",
- "execution_count": 2,
+ "execution_count": null,
"metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "'Hello World!'"
- ]
- },
- "execution_count": 2,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
+ "outputs": [],
"source": [
"import sys\n",
"import os\n",
diff --git a/notebooks/statistic_data_notebook.ipynb b/notebooks/statistic_data_notebook.ipynb
deleted file mode 100644
index e4b10d1..0000000
--- a/notebooks/statistic_data_notebook.ipynb
+++ /dev/null
@@ -1,180 +0,0 @@
-{
- "cells": [
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### Velg et sted i Norge å få statistisk data\n",
- "\n",
- "Denne API-en henter statistisk historisk data, herunder, statistisk data basert på de historiske dataene, ikke reele statistisk historisk. \n",
- "\n",
- "Statistikken er basert på de historiske datane total sett, ikke for hvert år."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "import sys\n",
- "import os\n",
- "\n",
- "# Gets the absolute path to the src folder\n",
- "sys.path.append(os.path.abspath(\"../src\"))\n",
- "\n",
- "# Now we can import the fucntion from the module\n",
- "from my_package.year_data import fetch_data\n",
- "\n",
- "# User input the city, for the weather\n",
- "city_name = input(\"Enter a city in Norway: \")\n",
- "\n",
- "data, folder = fetch_data(city_name)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### Lagre data i json-fil\n",
- "\n",
- "Skriv inn navn for til filen du vil lagre med dataen.\n",
- "\n",
- "Eks. test\n",
- "Da vil filen lagres som data_**test**.json, i mappen \"../data/output_statistikk/data_{filnavn}.json\"\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "# Gets the absolute path to the src folder\n",
- "sys.path.append(os.path.abspath(\"../src\"))\n",
- "\n",
- "from my_package.write_data import write_data\n",
- "\n",
- "filename = input(\"Write filename: \")\n",
- "\n",
- "write_data(data, folder, filename)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### Lese fra fil\n",
- "\n",
- "Henter opp data lagret i filen, lagd over, og skriver ut lesbart ved hjelp av pandas"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "import pandas as pd\n",
- "\n",
- "data = pd.read_json(f'../data/output_statistikk/data_{filename}.json')\n",
- "\n",
- "display(data)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "import pandas as pd\n",
- "\n",
- "if 'result' in data:\n",
- " df = pd.json_normalize(data['result'])\n",
- "\n",
- " display(df)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### Plotter data\n",
- "Denne koden plotter data basert på gjennomsnitts temperatur gjennom året. For å sikre lagring av de ulike kjøringene, vil grafen bli lagret i mappen \"../data/output_fig/mean_temp_plot_{city_name}.json\"\n",
- "\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "import matplotlib.pyplot as plt\n",
- "import matplotlib.dates as mdates\n",
- "import os\n",
- "\n",
- "output_folder = \"../data/output_fig\"\n",
- "os.makedirs(output_folder, exist_ok=True) # Create the folder if it doesn't exist\n",
- "\n",
- "# Converts to and make a new column with celsius temp, and not kelvin\n",
- "df['temp.mean_celsius'] = df['temp.mean'] - 273.15\n",
- "temp = df['temp.mean_celsius']\n",
- "\n",
- "# Convert from day and month, to datetime\n",
- "# df['date'] = pd.to_datetime(df[['month', 'day']].assign(year=2024))\n",
- "\n",
- "# Create a new column that concatenates month and day (e.g., \"03-01\" for March 1)\n",
- "df['month_day'] = df[['month', 'day']].apply(lambda x: f\"{x['month']:02d}-{x['day']:02d}\",axis=1)\n",
- "\n",
- "# Plot the graph of the mean temperature\n",
- "plt.figure(figsize=(12, 6))\n",
- "plt.plot(df['month_day'], temp)\n",
- "\n",
- "# Label for easier reading and understanding of the plot\n",
- "plt.title(f\"Mean temp - statistic historical {city_name}\")\n",
- "plt.xlabel(\"Date\")\n",
- "plt.ylabel(\"Temperature (°C)\")\n",
- "\n",
- "# Customize the x-axis to show ticks and labels only at the start of each month\n",
- "plt.gca().xaxis.set_major_locator(mdates.MonthLocator()) \n",
- "# Format ticks to show abbreviated month names (e.g., Jan, Feb)\n",
- "plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%b')) \n",
- "\n",
- "plt.xticks(rotation=45)\n",
- "plt.yticks(range(-20, 30, 2))\n",
- "plt.tight_layout()\n",
- "plt.grid()\n",
- "\n",
- "# Save the plot to the data/output_fig folder\n",
- "plot_path = os.path.join(output_folder, f\"mean_temp_plot_{city_name}.png\")\n",
- "plt.savefig(plot_path) # Save the plot as a PNG file\n",
- "\n",
- "# Show the plot\n",
- "plt.show()\n"
- ]
- }
- ],
- "metadata": {
- "kernelspec": {
- "display_name": "venv",
- "language": "python",
- "name": "python3"
- },
- "language_info": {
- "codemirror_mode": {
- "name": "ipython",
- "version": 3
- },
- "file_extension": ".py",
- "mimetype": "text/x-python",
- "name": "python",
- "nbconvert_exporter": "python",
- "pygments_lexer": "ipython3",
- "version": "3.12.5"
- }
- },
- "nbformat": 4,
- "nbformat_minor": 2
-}
diff --git a/requirements.txt b/requirements.txt
index 78d1b7b..d671daa 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -13,7 +13,7 @@ certifi==2025.1.31
cffi==1.17.1
charset-normalizer==3.4.1
comm==0.2.2
-contourpy==1.3.1
+contourpy==1.2.0
cycler==0.12.1
debugpy==1.8.13
decorator==5.2.1
@@ -27,6 +27,7 @@ httpcore==1.0.7
httpx==0.28.1
idna==3.10
ipykernel==6.29.5
+ipympl==0.9.7
ipython==9.0.1
ipython_pygments_lexers==1.1.1
ipywidgets==8.1.5
@@ -54,6 +55,7 @@ kiwisolver==1.4.8
MarkupSafe==3.0.2
matplotlib==3.10.1
matplotlib-inline==0.1.7
+missingno==0.5.2
mistune==3.1.2
narwhals==1.29.0
nbclient==0.10.2
@@ -62,7 +64,7 @@ nbformat==5.10.4
nest-asyncio==1.6.0
notebook==7.3.2
notebook_shim==0.2.4
-numpy==2.2.3
+numpy==1.26.4
overrides==7.7.0
packaging==24.2
pandas==2.2.3
@@ -92,6 +94,7 @@ requests==2.32.3
rfc3339-validator==0.1.4
rfc3986-validator==0.1.1
rpds-py==0.23.1
+scipy==1.15.2
seaborn==0.13.2
Send2Trash==1.8.3
setuptools==75.8.2
diff --git a/resources/README.md b/resources/README.md
index 142fb2d..ec71c31 100644
--- a/resources/README.md
+++ b/resources/README.md
@@ -1 +1,28 @@
-# Resources - description
\ No newline at end of file
+# Resources - description
+
+The source of our APIs is: [Open Weather](https://openweathermap.org/)
+
+It provides APIs for:
+- Current Data (Now)
+- Historical Data (7 days)
+- Statistic Historical Data (A year)
+
+To use this API you need to create an account; as a student you get access to some "extra" resources for free. Here is how to create an account:
+1. Register an account [HERE](https://home.openweathermap.org/users/sign_up?student=true)
+2. After logging in, go to your profile and find the 'API keys' tab
+3. Copy the key
+4. Open `src/my_package/setup.py`, run the function, and paste your email and API key into the terminal
+5. Pick a notebook and run the code!
+6. You should now receive data from the API
+
+### Possible API
+- **API from openweathermap**
+[API_OPEN_WEATHER_MAP](https://openweathermap.org/)
+
+- **API from Meteorologisk institutt**
+[API_FROST](https://frost.met.no/index.html)
+
+### Possible dataset
+- **Natural Disasters:**
+[DATASET_1](https://www.kaggle.com/datasets/brsdincer/all-natural-disasters-19002021-eosdis)
+
diff --git a/src/README.md b/src/README.md
index 42a797b..fca3fd8 100644
--- a/src/README.md
+++ b/src/README.md
@@ -1 +1,16 @@
-# Src - description
\ No newline at end of file
+# Src - description
+
+Much of the functionality is written as functions in regular `.py` files, which are then imported into the notebooks and run there.
+
+`my_package` contains an `__init__.py`, which makes the files work as 'modules' whose functions can be imported for further use.
+
+Here is a quick explanation of the files and their functions:
+- `date_to_unix.py` uses built-in modules such as datetime and time to convert dates and times to Unix timestamps, i.e. seconds since 1 January 1970.
+- `fetch_current_data.py` function for fetching current data for the chosen location from the API. Reports the status code if the response status is not 200 (OK).
+- `fetch_data.py` fetches data for the chosen location, from the chosen start time to the chosen end time. Reports the status code if the response status is not 200 (OK).
+- `get_record.py` used in `notebook_statistic_data.ipynb` to find record measurements such as the highest and lowest measured temperature.
+- `setup.py` function that helps the user create a .env file for storing the API key and email.
+- `test_module.py` a test function for checking that the venv and the import into notebooks work as intended.
+- `util.py` contains functions for replacing Nordic characters (æøå) and converting temperatures from Kelvin to Celsius, i.e. small helper functions that are part of something larger.
+- `write_data.py` saves data in JSON format under the chosen filename, in a 'suitable' folder based on where the function is used.
+- `year_data.py` fetches statistical weather data based on history for the chosen location. Reports the status code if the response status is not 200 (OK).
\ No newline at end of file
diff --git a/src/my_package/fetch_current_data.py b/src/my_package/fetch_current_data.py
new file mode 100644
index 0000000..787f3c3
--- /dev/null
+++ b/src/my_package/fetch_current_data.py
@@ -0,0 +1,38 @@
+# Import of needed libraries
+import requests
+import os
+from dotenv import load_dotenv
+
+load_dotenv()
+
+# Gets the key, from my env file
+API_KEY = os.getenv("API_KEY")
+
+# city_name = "Trondheim"
+country_code = "NO"
+
+
+# Gets the data from the API - openweathermap.org
+def fetch_current_data(city_name):
+    """Return (data, folder) for city_name, or (None, None) if the request fails."""
+
+    # f-string url, to add the "custom" variables to the API-request
+    url = f"https://api.openweathermap.org/data/2.5/weather?q={city_name},{country_code}&units=metric&appid={API_KEY}"
+
+    # Sends the API-request and saves the response
+    response = requests.get(url)
+
+    # Checks if the status code is OK
+    if response.status_code == 200:
+
+        # Parses the JSON response into a dict
+        data = response.json()
+        folder = "../data/output_current_data"
+
+        print("Data fetch: ok")
+        return data, folder
+
+    else:
+        # If HTTP status code != 200, print the status code
+        print("Failed to fetch data from API. Status code:", response.status_code)
+        return None, None
\ No newline at end of file
diff --git a/src/my_package/get_record.py b/src/my_package/get_record.py
new file mode 100644
index 0000000..7454681
--- /dev/null
+++ b/src/my_package/get_record.py
@@ -0,0 +1,23 @@
+import pandas as pd
+
+def get_records(df, city_name):
+    if df.empty:
+        print("df is empty")
+        return None, None, None  # same shape as the normal return, so unpacking still works
+    else:
+        max_temp_mean = df['temp.mean_celsius'].max()
+        min_temp_mean = df['temp.mean_celsius'].min()
+
+        max_temp = df['temp.record_max_celsius'].max()
+        min_temp = df['temp.record_min_celsius'].min()
+
+        summary_data = {
+            "Metric": ["Max Temp mean (°C)", "Min Temp Mean (°C)", "Max Temp (°C)", "Min temp (°C)"],
+            "Values": [max_temp_mean, min_temp_mean, max_temp, min_temp]
+        }
+
+        summary_df = pd.DataFrame(summary_data)
+        folder = "../data/output_record"
+        filename = f"records_{city_name}"
+
+        return summary_df, filename, folder
\ No newline at end of file
diff --git a/src/my_package/setup.py b/src/my_package/setup.py
new file mode 100644
index 0000000..0f18581
--- /dev/null
+++ b/src/my_package/setup.py
@@ -0,0 +1,25 @@
+import os
+
+def set_up_API():
+    # Define the path to the .env file at the root of the project
+    env_filepath = os.path.join(os.path.dirname(__file__), "../../.env")
+
+    # Asks the user for the API_EMAIL and API_KEY
+    API_EMAIL = input("Write your API - email: ")
+    API_KEY = input("Write your API - key: ")
+
+    # Writes the API_EMAIL and API_KEY to the .env file
+    with open(env_filepath, "w") as env_file:
+        env_file.write(f'API_EMAIL = "{API_EMAIL}"\n')
+        env_file.write(f'API_KEY = "{API_KEY}"')
+
+    # Confirmation messages, printed only after the file has been written
+    print(f".env file created at: {env_filepath}")
+    print("Values are stored!")
+    print("You can now run the notebooks, and get data!")
+
+
+# Only prompt when this file is run as a script, not when it is imported
+if __name__ == "__main__":
+    print("Enter your OpenWeatherMap info, and the function will create the .env file and add the info to it.")
+    set_up_API()
\ No newline at end of file
diff --git a/src/my_package/util.py b/src/my_package/util.py
new file mode 100644
index 0000000..2eb13c3
--- /dev/null
+++ b/src/my_package/util.py
@@ -0,0 +1,12 @@
+def replace_nordic(city_name):
+    # str.replace substitutes every occurrence and leaves the string
+    # unchanged when the letter is absent, so no loop or check is needed
+    city_name = city_name.replace('æ', 'ae')
+    city_name = city_name.replace('ø', 'o')
+    city_name = city_name.replace('å', 'aa')
+    return city_name
+
+
+def kelvin_to_celsius(temp_in_kelvin):
+    temp_in_celsius = temp_in_kelvin - 273.15
+    return temp_in_celsius
\ No newline at end of file
diff --git a/tests/README.md b/tests/README.md
index aa1174a..69f95b0 100644
--- a/tests/README.md
+++ b/tests/README.md
@@ -1 +1,8 @@
-# Test - description
\ No newline at end of file
+# Test - description
+
+Here we have written some simple tests to check parts of the functionality. Note that the code itself also contains several 'checks', such as `try and except`, `if-else` and `raise Error`. These make it quicker to spot errors when running the code.
+
+Here is some info about the tests:
+- `test_letter_one_day.py` Tests that a location returns the same data whether its name starts with an upper-case or a lower-case letter.
+- `test_one_day.py` Tests that converting back from a Unix timestamp gives the same date as the input date.
+- `test_test.py` Just a first test to check that unittest works.
\ No newline at end of file
diff --git a/tests/unit/test_letter_one_day.py b/tests/unit/test_letter_one_day.py
new file mode 100644
index 0000000..53ad395
--- /dev/null
+++ b/tests/unit/test_letter_one_day.py
@@ -0,0 +1,26 @@
+import unittest
+import sys
+import os
+
+# Add the src folder to the Python path
+sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), '../../src')))
+
+from my_package.fetch_current_data import fetch_current_data
+
+class TestCityNameCase(unittest.TestCase):
+
+    def test_city_name_case_insensitive(self):
+        # Test the same city with an upper-case and a lower-case first letter
+        city_name_upper = "Oslo"
+        city_name_lower = "oslo"
+
+        # Fetch both; the underscore discards the folder, which we don't use here
+        data_upper, _ = fetch_current_data(city_name_upper)
+        data_lower, _ = fetch_current_data(city_name_lower)
+
+        # Use temperature as an example to see if the data is identical
+        self.assertEqual(data_upper["main"]["temp"], data_lower["main"]["temp"])
+
+if __name__ == "__main__":
+    unittest.main()
+
diff --git a/tests/unit/test_one_day.py b/tests/unit/test_one_day.py
new file mode 100644
index 0000000..ad6e05c
--- /dev/null
+++ b/tests/unit/test_one_day.py
@@ -0,0 +1,39 @@
+import unittest
+import sys
+import os
+from datetime import datetime
+
+# Add the src folder to the Python path using an absolute path, so the import below works from any directory
+sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), "../../src")))
+from my_package.date_to_unix import from_unix_timestamp
+
+
+class TestGetUnixTimestamp(unittest.TestCase):
+
+    def test_get_unix_timestamp(self):
+        # Example user input for start and end date
+        start_date_input = "2000, 03, 05, 11, 00"
+        end_date_input = "2000, 03, 05, 13, 00"
+
+        # Convert input string to datetime object
+        start_date = datetime.strptime(start_date_input, "%Y, %m, %d, %H, %M")
+        end_date = datetime.strptime(end_date_input, "%Y, %m, %d, %H, %M")
+
+        # Get the Unix timestamp by calling .timestamp()
+        expected_unix_start = int(start_date.timestamp())
+        expected_unix_end = int(end_date.timestamp())
+
+        # Call the function directly with test data
+        function_unix_start, function_unix_end = from_unix_timestamp(expected_unix_start, expected_unix_end)
+
+        # Assert that the converted values match the original input dates
+        self.assertEqual(function_unix_start, start_date)
+        self.assertEqual(function_unix_end, end_date)
+
+
+if __name__ == "__main__":
+    unittest.main()
+
+
+# This test checks that converting a timestamp back gives the date it was created from
+