add correlation matrix
toravest committed May 24, 2025
1 parent e37df08 commit 4dafb49
Showing 6 changed files with 372 additions and 6 deletions.
66 changes: 66 additions & 0 deletions notebooks/notebook_compare_one_day_data.ipynb
@@ -491,6 +491,72 @@
"# Show the plot\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Correlation matrix\n",
"Using Seaborn, we visualize the correlation coefficients between the variables. Each coefficient shows how strongly two variables vary together; it does not by itself show that one affects the other. The values are read as follows:\n",
"- +1 - Strong positive relationship\n",
"- 0 - No linear relationship\n",
"- -1 - Strong negative relationship\n",
"\n",
"To read the chart, pick one variable on the vertical axis and one on the horizontal axis. The cell where they meet shows the correlation coefficient between the two.\n",
"\n",
"Here we have made one correlation matrix for each location, so that one can compare whether a location has stronger or weaker relationships between the variables.\n",
"\n",
"The figure is saved in the folder `../data/figures/output_fig_compare_one_day`"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import seaborn as sns\n",
"import matplotlib.pyplot as plt\n",
"\n",
"# Where the figure should be saved when exported\n",
"output_folder = \"../data/figures/output_fig_compare_one_day\"\n",
"\n",
"# Creates the folder if it does not exist\n",
"os.makedirs(output_folder, exist_ok=True)\n",
"\n",
"cities = [\"city_1\", \"city_2\"]\n",
"\n",
"# The columns we want to include in the correlation matrix\n",
"columns_needed = ['main.temp', 'main.pressure', 'main.humidity', 'wind.speed', 'wind.gust', 'clouds.all', 'rain.1h', 'snow.1h']\n",
"\n",
"# Two side-by-side plots (1 row, 2 columns), with the width and height of the figure\n",
"fig, axes = plt.subplots(1, 2, figsize=(14, 6)) # Adjust figsize as needed\n",
"\n",
"# Loops through both cities; enumerate gives us both the index and the city\n",
"for i, city in enumerate(cities):\n",
" # Stores the data for the right city in cities\n",
" city_df = both_cities_df[both_cities_df[\"city\"] == city]\n",
" city_name = city_df['city_name'].iloc[0]\n",
"\n",
" df_selected = city_df[columns_needed]\n",
"\n",
" # Calculates the correlation\n",
" corr_matrix = df_selected.corr()\n",
"\n",
" # Makes a seaborn heatmap, with the value shown in each cell to 2 decimals\n",
" sns.heatmap(corr_matrix, annot=True, cmap=\"coolwarm\", fmt=\".2f\", ax=axes[i])\n",
"\n",
" # Add a title, with the city_name\n",
" axes[i].set_title(f\"Correlation Matrix - {city_name}\")\n",
"\n",
"# Save the plot to the 'data/figures/output_fig_compare_one_day' folder\n",
"plot_path = os.path.join(output_folder, f\"correlation_matrix_{city_1}_{city_2}.png\")\n",
"plt.savefig(plot_path) # Save the plot as a PNG file\n",
"\n",
"# Show the plot\n",
"plt.show()"
]
}
],
"metadata": {
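As a rough illustration of how the coefficients described in the markdown cell above are read, here is a minimal sketch on made-up data. The frame `toy_df` and its column names are assumptions for the example and do not come from the notebooks; only `DataFrame.corr()` is the call the notebooks themselves use.

```python
import pandas as pd

# Hypothetical toy data, not taken from the notebooks
toy_df = pd.DataFrame({
    "temp": [1.0, 2.0, 3.0, 4.0],
    "humidity": [8.0, 6.0, 4.0, 2.0],      # falls linearly as temp rises
    "pressure": [10.0, 20.0, 30.0, 40.0],  # rises linearly with temp
})

# Pearson correlation is the default method of DataFrame.corr()
corr_matrix = toy_df.corr()

# Each cell holds the coefficient for the row/column pair of variables
print(corr_matrix.round(2))
```

With these values, `temp` and `pressure` rise together, so their coefficient is close to +1, while `temp` and `humidity` move in opposite directions, so theirs is close to -1.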
68 changes: 68 additions & 0 deletions notebooks/notebook_compare_one_week_data.ipynb
@@ -523,6 +523,74 @@
"# Show the plot\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"id": "c5390c2d",
"metadata": {},
"source": [
"### Correlation matrix\n",
"Using Seaborn, we visualize the correlation coefficients between the variables. Each coefficient shows how strongly two variables vary together; it does not by itself show that one affects the other. The values are read as follows:\n",
"- +1 - Strong positive relationship\n",
"- 0 - No linear relationship\n",
"- -1 - Strong negative relationship\n",
"\n",
"To read the chart, pick one variable on the vertical axis and one on the horizontal axis. The cell where they meet shows the correlation coefficient between the two.\n",
"\n",
"Here we have made one correlation matrix for each location, so that one can compare whether a location has stronger or weaker relationships between the variables.\n",
"\n",
"The figure is saved in the folder `../data/figures/output_fig_compare_one_week`"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "55d18857",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import seaborn as sns\n",
"import matplotlib.pyplot as plt\n",
"\n",
"# Where the figure should be saved when exported\n",
"output_folder = \"../data/figures/output_fig_compare_one_week\"\n",
"\n",
"# Creates the folder if it does not exist\n",
"os.makedirs(output_folder, exist_ok=True)\n",
"\n",
"cities = [\"city_1\", \"city_2\"]\n",
"\n",
"# The columns we want to include in the correlation matrix\n",
"columns_needed = ['main.temp', 'main.pressure', 'main.humidity', 'wind.speed', 'wind.gust', 'clouds.all', 'rain.1h', 'snow.1h']\n",
"\n",
"# Two side-by-side plots (1 row, 2 columns), with the width and height of the figure\n",
"fig, axes = plt.subplots(1, 2, figsize=(14, 6)) # Adjust figsize as needed\n",
"\n",
"# Loops through both cities; enumerate gives us both the index and the city\n",
"for i, city in enumerate(cities):\n",
" # Stores the data for the right city in cities\n",
" city_df = both_cities_df[both_cities_df[\"city\"] == city]\n",
" city_name = city_df['city_name'].iloc[0]\n",
"\n",
" df_selected = city_df[columns_needed]\n",
"\n",
" # Calculates the correlation\n",
" corr_matrix = df_selected.corr()\n",
"\n",
" # Makes a seaborn heatmap, with the value shown in each cell to 2 decimals\n",
" sns.heatmap(corr_matrix, annot=True, cmap=\"coolwarm\", fmt=\".2f\", ax=axes[i])\n",
"\n",
" # Add a title, with the city_name\n",
" axes[i].set_title(f\"Correlation Matrix - {city_name}\")\n",
"\n",
"# Save the plot to the 'data/figures/output_fig_compare_one_week' folder\n",
"plot_path = os.path.join(output_folder, f\"correlation_matrix_{city_1}_{city_2}.png\")\n",
"plt.savefig(plot_path) # Save the plot as a PNG file\n",
"\n",
"# Show the plot\n",
"plt.show()"
]
}
],
"metadata": {
68 changes: 68 additions & 0 deletions notebooks/notebook_compare_statistic_data.ipynb
@@ -361,6 +361,74 @@
"# Show the plot\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"id": "04c13808",
"metadata": {},
"source": [
"### Correlation matrix\n",
"Using Seaborn, we visualize the correlation coefficients between the variables. Each coefficient shows how strongly two variables vary together; it does not by itself show that one affects the other. The values are read as follows:\n",
"- +1 - Strong positive relationship\n",
"- 0 - No linear relationship\n",
"- -1 - Strong negative relationship\n",
"\n",
"To read the chart, pick one variable on the vertical axis and one on the horizontal axis. The cell where they meet shows the correlation coefficient between the two.\n",
"\n",
"Here we have made one correlation matrix for each location, so that one can compare whether a location has stronger or weaker relationships between the variables.\n",
"\n",
"The figure is saved in the folder `../data/figures/output_fig_compare_statistic`"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ceb72c4e",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import seaborn as sns\n",
"import matplotlib.pyplot as plt\n",
"\n",
"# Where the figure should be saved when exported\n",
"output_folder = \"../data/figures/output_fig_compare_statistic\"\n",
"\n",
"# Creates the folder if it does not exist\n",
"os.makedirs(output_folder, exist_ok=True)\n",
"\n",
"cities = [\"city_1\", \"city_2\"]\n",
"\n",
"# The columns we want to include in the correlation matrix\n",
"columns_needed = ['temp.mean_celsius', 'pressure.mean', 'humidity.mean', 'wind.mean', 'clouds.mean', 'precipitation.mean']\n",
"\n",
"# Two side-by-side plots (1 row, 2 columns), with the width and height of the figure\n",
"fig, axes = plt.subplots(1, 2, figsize=(14, 6)) # Adjust figsize as needed\n",
"\n",
"# Loops through both cities; enumerate gives us both the index and the city\n",
"for i, city in enumerate(cities):\n",
" # Stores the data for the right city in cities\n",
" city_df = both_cities_df[both_cities_df[\"city\"] == city]\n",
" city_name = city_df['city_name'].iloc[0]\n",
"\n",
" df_selected = city_df[columns_needed]\n",
"\n",
" # Calculates the correlation\n",
" corr_matrix = df_selected.corr()\n",
"\n",
" # Makes a seaborn heatmap, with the value shown in each cell to 2 decimals\n",
" sns.heatmap(corr_matrix, annot=True, cmap=\"coolwarm\", fmt=\".2f\", ax=axes[i])\n",
"\n",
" # Add a title, with the city_name\n",
" axes[i].set_title(f\"Correlation Matrix - {city_name}\")\n",
"\n",
"# Save the plot to the 'data/figures/output_fig_compare_statistic' folder\n",
"plot_path = os.path.join(output_folder, f\"correlation_matrix_{city_1}_{city_2}.png\")\n",
"plt.savefig(plot_path) # Save the plot as a PNG file\n",
"\n",
"# Show the plot\n",
"plt.show()"
]
}
],
"metadata": {
63 changes: 58 additions & 5 deletions notebooks/notebook_one_day_data.ipynb
@@ -180,7 +180,7 @@
"x_axis = df.index\n",
"\n",
"# Choose the width and height of the plot\n",
"plt.figure(figsize=(12, 6))\n",
"plt.figure(figsize=(14, 6))\n",
"\n",
"# Scatter plot for each temperature reading\n",
"plt.scatter(x_axis, temp, color='tab:red', label='Temperaturmålinger', alpha=0.7)\n",
@@ -368,7 +368,7 @@
"temp_mean = temp.mean().round(2)\n",
"\n",
"# Two vertically stacked axes (2 rows, 1 column), with the width and height of the figure; the axes share the same x-axis\n",
"fig, (ax1, ax3) = plt.subplots(2, 1,figsize=(15, 8), sharex=True)\n",
"fig, (ax1, ax3) = plt.subplots(2, 1,figsize=(14, 6), sharex=True)\n",
"\n",
"# Set the title for the diagram, above the first axis, with city_name and input_date\n",
"ax1.set_title(f'Weather data for {city_name} ({date}) ')\n",
@@ -429,8 +429,61 @@
"# Adjust layout\n",
"plt.tight_layout()\n",
"\n",
"# Save the plot to the 'data/figures/output_one_day' folder\n",
"plot_path = os.path.join(output_folder, f\"weather_data_plot{city_name}.png\")\n",
"# Save the plot to the 'data/figures/output_fig_one_day' folder\n",
"plot_path = os.path.join(output_folder, f\"weather_data_plot_{city_name}.png\")\n",
"plt.savefig(plot_path) # Save the plot as a PNG file\n",
"\n",
"# Show the plot\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Correlation matrix\n",
"Using Seaborn, we visualize the correlation coefficients between the variables. Each coefficient shows how strongly two variables vary together; it does not by itself show that one affects the other. The values are read as follows:\n",
"- +1 - Strong positive relationship\n",
"- 0 - No linear relationship\n",
"- -1 - Strong negative relationship\n",
"\n",
"To read the chart, pick one variable on the vertical axis and one on the horizontal axis. The cell where they meet shows the correlation coefficient between the two.\n",
"\n",
"The figure is saved in the folder `../data/figures/output_fig_one_day`"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import seaborn as sns\n",
"\n",
"# Where the figure should be saved when exported\n",
"output_folder = \"../data/figures/output_fig_one_day\"\n",
"\n",
"# Creates the folder if it does not exist\n",
"os.makedirs(output_folder, exist_ok=True)\n",
"\n",
"# The columns we want to include in the correlation matrix\n",
"columns_needed = ['main.temp', 'main.pressure', 'main.humidity', 'wind.speed', 'wind.gust', 'clouds.all', 'rain.1h', 'snow.1h']\n",
"df_selected = df[columns_needed]\n",
"\n",
"# Calculates the correlation\n",
"corr_matrix = df_selected.corr()\n",
"\n",
"# Choose the width and height of the plot\n",
"plt.figure(figsize=(14, 6))\n",
"\n",
"# Makes a seaborn heatmap, with the value shown in each cell to 2 decimals\n",
"sns.heatmap(corr_matrix, annot=True, cmap=\"coolwarm\", fmt=\".2f\")\n",
"\n",
"# Add a title, with the city_name\n",
"plt.title(f\"Correlation Matrix - {city_name}\")\n",
"\n",
"# Save the plot to the 'data/figures/output_fig_one_day' folder\n",
"plot_path = os.path.join(output_folder, f\"correlation_matrix_{city_name}.png\")\n",
"plt.savefig(plot_path) # Save the plot as a PNG file\n",
"\n",
"# Show the plot\n",
@@ -513,7 +566,7 @@
"\n",
"x_axis = df.index\n",
"\n",
"plt.figure(figsize=(12, 6))\n",
"plt.figure(figsize=(14, 6))\n",
"\n",
"# Plot the original data\n",
"plt.scatter(x_axis, y, color='green', label='Original data', alpha=0.6)\n",
56 changes: 55 additions & 1 deletion notebooks/notebook_one_week_data.ipynb
@@ -448,13 +448,67 @@
"plt.tight_layout()\n",
"\n",
"# Save the plot to the 'data/figures/output_fig_one_week' folder\n",
"plot_path = os.path.join(output_folder, f\"weather_data_plot{city_name}.png\")\n",
"plot_path = os.path.join(output_folder, f\"weather_data_plot_{city_name}.png\")\n",
"plt.savefig(plot_path) # Save the plot as a PNG file\n",
"\n",
"\n",
"# Show the plot\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Correlation matrix\n",
"Using Seaborn, we visualize the correlation coefficients between the variables. Each coefficient shows how strongly two variables vary together; it does not by itself show that one affects the other. The values are read as follows:\n",
"- +1 - Strong positive relationship\n",
"- 0 - No linear relationship\n",
"- -1 - Strong negative relationship\n",
"\n",
"To read the chart, pick one variable on the vertical axis and one on the horizontal axis. The cell where they meet shows the correlation coefficient between the two.\n",
"\n",
"The figure is saved in the folder `../data/figures/output_fig_one_week`"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import seaborn as sns\n",
"import os\n",
"\n",
"# Where the figure should be saved when exported\n",
"output_folder = \"../data/figures/output_fig_one_week\"\n",
"\n",
"# Creates the folder if it does not exist\n",
"os.makedirs(output_folder, exist_ok=True)\n",
"\n",
"# The columns we want to include in the correlation matrix\n",
"columns_needed = ['main.temp', 'main.pressure', 'main.humidity', 'wind.speed', 'wind.gust', 'clouds.all', 'rain.1h', 'snow.1h']\n",
"selected_df = df[columns_needed]\n",
"\n",
"# Calculates the correlation\n",
"corr_matrix = selected_df.corr()\n",
"\n",
"# Choose the width and height of the plot\n",
"plt.figure(figsize=(14, 6))\n",
"\n",
"# Makes a seaborn heatmap, with the value shown in each cell to 2 decimals\n",
"sns.heatmap(corr_matrix, annot=True, cmap=\"coolwarm\", fmt=\".2f\")\n",
"\n",
"# Add a title, with the city_name\n",
"plt.title(f\"Correlation Matrix - {city_name}\")\n",
"\n",
"# Save the plot to the 'data/figures/output_fig_one_week' folder\n",
"plot_path = os.path.join(output_folder, f\"correlation_matrix_{city_name}.png\")\n",
"plt.savefig(plot_path) # Save the plot as a PNG file\n",
"\n",
"# Show the plot\n",
"plt.show()\n"
]
}
],
"metadata": {
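One thing that may be worth checking in the heatmaps above: columns such as `rain.1h` or `snow.1h` can be constant (for example all zeros) over a short period, and pandas cannot compute a correlation for a column without variation, so those cells typically come out as NaN and are left blank in the heatmap. A possible way to handle this, sketched on made-up data, is to keep only the columns that actually vary before calling `corr()`. The names `toy_df` and `varying_columns` are hypothetical and not part of the notebooks.

```python
import pandas as pd

# Hypothetical toy data; the column names mirror the notebooks, the values do not
toy_df = pd.DataFrame({
    "main.temp": [1.0, 2.0, 3.0, 4.0],
    "wind.speed": [2.0, 1.0, 4.0, 3.0],
    "snow.1h": [0.0, 0.0, 0.0, 0.0],  # constant column: no variation to correlate
})

# Keep only the columns that take more than one distinct (non-missing) value
varying_columns = [col for col in toy_df.columns if toy_df[col].nunique(dropna=True) > 1]

# Correlation matrix restricted to the columns that vary
corr_matrix = toy_df[varying_columns].corr()
print(corr_matrix.round(2))
```

Whether to drop such columns or keep the blank cells is a design choice: dropping them gives a cleaner figure, while keeping them makes it visible that a variable had no variation in the period.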