Analysis of public EV chargepoint data and car ownership by local authority district to understand supply and demand

JessArkesden/EV_charging

Summary

This analysis uses public EV chargepoint and plug-in vehicle (PiV) registration data to understand, by local authority district (LAD), which areas have the best and worst EV infrastructure.

What data is included?

First, four features are calculated to allow for initial exploration of the data. Details of these are as follows:

Count of PiVs

Source: DVLA and DfT dataset

  • A simple count of registered PiVs by LAD is calculated.
  • This includes all PiVs with Private keepership. This excludes PiVs where the keepership is Company to avoid skewing the analysis with large quantities of fleet vehicles registered to a single address.
  • Where data has been suppressed in the raw file, the [c] value has been replaced with NaN

Count of EV chargepoints

Source: Public EV chargepoint registry

  • A simple count of public EV chargepoints by LAD is calculated.
  • There has been no cleaning of this data beyond converting the latitude and longitude into a point. This is a public dataset and further accuracy has not been verified.

Ratio of PiVs to EV chargepoints

Source: Author's calculations

  • Taking the count of PiVs in each LAD and dividing this by the count of public EV chargepoints in each LAD.
  • A higher ratio means there are more PiVs to each chargepoint and therefore public infrastructure may be lacking.
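The ratio can be sketched in plain Python as follows. The district names and counts are made up for illustration, and returning None for a district with no chargepoints is an assumption of this sketch (the analysis code further down replaces the resulting infinite values with NaN instead).

```python
# Hypothetical counts for three made-up districts (not taken from the source data).
pivs = {"LAD_A": 1200, "LAD_B": 300, "LAD_C": 50}
chargepoints = {"LAD_A": 100, "LAD_B": 10}  # LAD_C has no public chargepoints


def piv_chargepoint_ratio(piv_count, chargepoint_count):
    """Return PiVs per chargepoint, or None when there are no chargepoints."""
    if chargepoint_count == 0:
        return None  # the ratio is undefined, so avoid dividing by zero
    return piv_count / chargepoint_count


ratios = {
    lad: piv_chargepoint_ratio(count, chargepoints.get(lad, 0))
    for lad, count in pivs.items()
}
print(ratios)
```

Here LAD_B has fewer PiVs than LAD_A but a higher ratio (30 PiVs per chargepoint against 12), which is why the ratio can tell a different story from the raw counts.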

Average distance to nearest EV chargepoint

Source: Geographies from ONS via Open Geography Portal

  • Using the population-weighted centroids (PWCs) for each output area (OA) within an LAD, the nearest public EV chargepoint is calculated. OA is the smallest census geography and contains between 40 and 250 households.
  • Then this is aggregated to LAD level by taking the average distance of all OAs within the LAD to obtain the average distance to the nearest public EV chargepoint by LAD.
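The two steps above can be sketched without any geospatial libraries. The coordinates, OA identifiers and LAD identifiers below are hypothetical, and the brute-force search is only for illustration; the analysis itself uses a spatial nearest join.

```python
import math
from collections import defaultdict

# Hypothetical chargepoint locations in metres (projected coordinates).
chargepoints = [(0.0, 0.0), (1000.0, 0.0)]

# Hypothetical output areas: (OA id, LAD id, population-weighted centroid).
oa_centroids = [
    ("OA_1", "LAD_A", (0.0, 300.0)),
    ("OA_2", "LAD_A", (900.0, 0.0)),
    ("OA_3", "LAD_B", (1000.0, 400.0)),
]


def nearest_distance(point, candidates):
    """Straight-line distance from point to the closest candidate."""
    return min(math.dist(point, candidate) for candidate in candidates)


# Step 1: distance from each OA centroid to its nearest chargepoint.
distances_by_lad = defaultdict(list)
for oa_id, lad_id, centroid in oa_centroids:
    distances_by_lad[lad_id].append(nearest_distance(centroid, chargepoints))

# Step 2: average those distances within each LAD.
avg_distance = {
    lad_id: sum(distances) / len(distances)
    for lad_id, distances in distances_by_lad.items()
}
print(avg_distance)
```

Note that the distances are straight-line, not road distances, and every OA counts equally in the LAD average regardless of its population.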

What do these features show?

The following choropleths map these four features. The absolute number of PiVs varies across England and Wales, with LADs across Wales, Lincolnshire and Cumbria having particularly low numbers. By contrast, the highest number of public EV chargepoints by LAD is concentrated in Greater London, with hotspots dotted elsewhere across the country. If we instead look at this as the ratio of PiVs to chargepoints, we again see the patterns change. Hotspots around the southeast of England highlight that there is potentially infrastructure lacking in these areas compared to PiV numbers. Lastly, we can see the average distance to the nearest chargepoint is perhaps unsurprisingly highest in rural areas, particularly in the north of England and Wales.

Code to create an interactive version of this map can be found below.

What is the relationship between the features?

The correlation matrix heatmap below shows the relationship between each pair of features. There is some positive correlation between chargepoints and PiVs, which is encouraging: where there are more PiVs there tend to be more chargepoints. There is also some positive correlation between average distance to the nearest chargepoint and the ratio of PiVs to chargepoints. LADs where the nearest chargepoint is further away also tend to have more PiVs per chargepoint, which may suggest these areas lack infrastructure.
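For reference, the coefficients in the heatmap are Pearson correlations (the default for pandas' corr()). A minimal sketch of that calculation, using made-up numbers, not values from the analysis:

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)


# Hypothetical PiV and chargepoint counts for four made-up districts.
pivs = [100, 200, 300, 400]
chargepoints = [10, 25, 28, 45]

r = pearson_r(pivs, chargepoints)
print(round(r, 2))
```

A value close to 1 means the two features rise together almost linearly, a value close to -1 means one falls as the other rises, and a value near 0 means there is little linear relationship.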

The following scatter plots illustrate this further, adding a regression line. We can see that there are outliers across both, with a cluster of LADs on the lower end.

So what does this mean for public EV infrastructure?

Taking the ratio of PiVs to public EV chargepoints and the average distance to the nearest chargepoint, a simple ranking is created combining the two. Areas with a high ratio of PiVs to chargepoints and a large distance to the nearest chargepoint could be lacking in public EV infrastructure.

Top 10

The best performing LADs were all in London, suggesting infrastructure in the capital is well established.

     LAD22CD    LAD22NM                 combined_rank
288  E09000013  Hammersmith and Fulham  1.0
308  E09000033  Westminster             2.0
303  E09000028  Southwark               3.0
276  E09000001  City of London          4.0
295  E09000020  Kensington and Chelsea  4.0
299  E09000024  Merton                  6.0
307  E09000032  Wandsworth              6.0
282  E09000007  Camden                  8.0
287  E09000012  Hackney                 9.0
294  E09000019  Islington               9.0

Bottom 10

The worst performing LADs are more dispersed. Most are more rural areas, or include only one or two larger towns.

     LAD22CD    LAD22NM                  combined_rank
99   E07000074  Maldon                   330.0
89   E07000064  Rother                   329.0
235  E07000242  East Hertfordshire       328.0
16   E06000017  Rutland                  327.0
100  E07000075  Rochford                 326.0
85   E07000047  West Devon               325.0
177  E07000169  Selby                    324.0
160  E07000139  North Kesteven           323.0
82   E07000044  South Hams               321.0
199  E07000198  Staffordshire Moorlands  321.0

This information could be used by public and private organisations alike to understand where public EV infrastructure is lacking and therefore where to incentivise installation of new infrastructure.

Considerations and future work

This analysis has demonstrated that current public EV infrastructure may be lacking in some areas based on the current number of PiVs. This does not include:

  • A temporal view of how PiV ownership has changed over time - there may be areas experiencing higher growth, which could arguably justify increased investment over other areas.
  • EV infrastructure on private property (e.g. at home) - areas with a higher number of terraced houses or flats may require more public infrastructure as it can often be more challenging to install chargepoints at these types of properties. Areas with higher rates of rented accommodation over private ownership could also face challenges as it would be the responsibility of the landlord to install EV chargepoints. Again, these areas may require more public infrastructure and so overlaying this data could be an interesting investigation to see how it affects the ranking.

The public EV chargepoint registry has also not been quality checked. Further investigation into this dataset and wrangling/cleaning as appropriate may impact results.

Coding

Import libraries

import pandas as pd
import geopandas as gpd
from shapely.geometry import Point
import numpy as np
import seaborn as sns
from scipy.stats import zscore
import matplotlib.pyplot as plt
import contextily as ctx
import folium

Data wrangling

EV Chargepoint data

# Import csv
chargepoints_RAW = pd.read_csv('Data/national-charge-point-registry_230524.csv', low_memory=False)
# Reduce to just required columns
chargepoints_TRIM = chargepoints_RAW[['chargeDeviceID','reference','name','latitude','longitude']]
# Check length of df
len(chargepoints_TRIM)

40580

# Visual check
chargepoints_TRIM.head()
chargeDeviceID reference name latitude longitude
0 b86a77a42bb68c81946ec50cfc95e89d 11172306P Network Rail Westwood Centre 1 52.386590 -1.587384
1 dc1c347d471f68e41ad2a9a1145941d6 APT-0296-0015/13P Brindley Drive Car Park Birmingham - 70524 52.480918 -1.907710
2 7d545ad9367ccb8a80c94a953314ae71 CM123 Renault Liverpool 53.383579 -2.977230
3 68c7fca1e3bba5e49ec90847dcdd456b CM164 NCP Portman Square 51.516201 -0.157996
4 ac7a21c48f5833b33a5b606b2089e6a9 CM167 NCP Prince Street Car Park 51.450340 -2.596704
# Build point geometries from lat and long (crs 4326), convert to 27700 (British National Grid), then drop lat and long
chargepoints_gdf = gpd.GeoDataFrame(chargepoints_TRIM,
                                    geometry=gpd.points_from_xy(chargepoints_TRIM.longitude, chargepoints_TRIM.latitude),
                                    crs='EPSG:4326'
                                   ).to_crs(epsg=27700).drop(columns=['latitude','longitude'])
# Visual check
chargepoints_gdf.head()
chargeDeviceID reference name geometry
0 b86a77a42bb68c81946ec50cfc95e89d 11172306P Network Rail Westwood Centre 1 POINT (428179.411 276586.797)
1 dc1c347d471f68e41ad2a9a1145941d6 APT-0296-0015/13P Brindley Drive Car Park Birmingham - 70524 POINT (406364.849 287003.294)
2 7d545ad9367ccb8a80c94a953314ae71 CM123 Renault Liverpool POINT (335096.815 387859.900)
3 68c7fca1e3bba5e49ec90847dcdd456b CM164 NCP Portman Square POINT (527908.652 181305.630)
4 ac7a21c48f5833b33a5b606b2089e6a9 CM167 NCP Prince Street Car Park POINT (358631.700 172541.813)

EV Cars data

# Import csv
PiVs_RAW = pd.read_csv('Data/df_VEH0145.csv', low_memory=False)
# Reduce to just required columns
PiVs_TRIM = PiVs_RAW[['LSOA11CD','LSOA11NM','Fuel','Keepership','2023 Q4']]
# Visual check
PiVs_TRIM.head()
LSOA11CD LSOA11NM Fuel Keepership 2023 Q4
0 95AA01S1 Aldergrove 1 Battery electric Company [c]
1 95AA01S2 Aldergrove 2 Battery electric Company 9
2 95AA01S3 Aldergrove 3 Battery electric Company [c]
3 95AA02W1 Balloo Battery electric Company 5
4 95AA03W1 Ballycraigy Battery electric Company [c]
# Keep the rows where Fuel is 'Total' (all plug-in fuel types combined) and Keepership is 'Private', then drop those columns
PiVs_FILTERED = PiVs_TRIM[(PiVs_TRIM['Fuel'] == 'Total') & 
                          (PiVs_TRIM['Keepership'] == 'Private')].drop(columns=['Fuel', 'Keepership'])
# Replace the suppressed [c] data with NaN
PiVs_FILTERED['2023 Q4'] = PiVs_FILTERED['2023 Q4'].replace('[c]', np.nan)

# Convert the '2023 Q4' column to numeric data type
PiVs_FILTERED['2023 Q4'] = pd.to_numeric(PiVs_FILTERED['2023 Q4'])
# Visual check
PiVs_FILTERED.head()
LSOA11CD LSOA11NM 2023 Q4
189862 95AA01S1 Aldergrove 1 NaN
189863 95AA01S2 Aldergrove 2 15.0
189864 95AA01S3 Aldergrove 3 19.0
189865 95AA02W1 Balloo 10.0
189866 95AA03W1 Ballycraigy NaN

LSOA lookup

The PiV data uses the LSOA codes from 2011, so a best-fit lookup is used to match these to the LSOA 2021 codes and the 2022 LAD codes.

# Import csv
LSOA_lookup = pd.read_csv(
    'Data/LSOA_(2011)_to_LSOA_(2021)_to_Local_Authority_District_(2022)_Best_Fit_Lookup_for_EW_(V2).csv',
    low_memory=False)
# Reduce to just required columns
LSOA_lookup = LSOA_lookup[['LSOA11CD','LSOA21CD','LAD22CD','LAD22NM']].copy()
# Match 2021 to 2011 codes using lookup
PiVs_FILTERED = pd.merge(PiVs_FILTERED, LSOA_lookup, on='LSOA11CD')
# Visual check
PiVs_FILTERED.head()
LSOA11CD LSOA11NM 2023 Q4 LSOA21CD LAD22CD LAD22NM
0 E01000001 City of London 001A 28.0 E01000001 E09000001 City of London
1 E01000002 City of London 001B 30.0 E01000002 E09000001 City of London
2 E01000003 City of London 001C 15.0 E01000003 E09000001 City of London
3 E01000005 City of London 001E NaN E01000005 E09000001 City of London
4 E01000006 Barking and Dagenham 016A 18.0 E01000006 E09000002 Barking and Dagenham
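One thing to be aware of is that pd.merge defaults to an inner join, so any PiV rows whose LSOA11CD is missing from the lookup are dropped. The lookup covers England and Wales only, so codes such as the 95AA... ones shown earlier would not be expected to match. A library-free sketch of that behaviour, using a tiny hypothetical lookup:

```python
# Hypothetical PiV rows: (LSOA11CD, PiV count).
piv_rows = [("E01000001", 28.0), ("E01000002", 30.0), ("95AA01S2", 15.0)]

# Hypothetical lookup from LSOA11CD to LAD22CD (England and Wales codes only).
lsoa_to_lad = {"E01000001": "E09000001", "E01000002": "E09000001"}

# Inner join: keep only the rows whose code appears in the lookup.
matched = [
    (code, count, lsoa_to_lad[code])
    for code, count in piv_rows
    if code in lsoa_to_lad
]

# Rows that silently disappear from the result of an inner join.
unmatched = [code for code, _ in piv_rows if code not in lsoa_to_lad]

print(len(matched), unmatched)
```

Comparing row counts before and after the merge, or listing the unmatched codes as above, is a quick way to confirm that only the expected rows were lost.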

LADs

# Import shp file
LADs = gpd.read_file('Data/Local_Authority_Districts_December_2022_UK_BFC_V2_-177113771882051469/LAD_DEC_2022_UK_BFC_V2.shp')
# Reduce to just required columns
LADs = LADs[['LAD22CD','LAD22NM','geometry']].copy()
# Visual check
LADs.head()
LAD22CD LAD22NM geometry
0 E06000001 Hartlepool MULTIPOLYGON (((450154.599 525938.201, 450140....
1 E06000002 Middlesbrough MULTIPOLYGON (((446854.700 517192.700, 446854....
2 E06000003 Redcar and Cleveland MULTIPOLYGON (((451747.397 520561.100, 451792....
3 E06000004 Stockton-on-Tees MULTIPOLYGON (((447177.704 517811.797, 447176....
4 E06000005 Darlington POLYGON ((423496.602 524724.299, 423497.204 52...
len(LADs)

374

OAs

This analysis will use the population-weighted centroids (PWCs) to find the nearest EV chargepoints. This will then be aggregated to LAD level, so the OA to LAD lookup is required.

OA PWCs

# Import shp file
OA_PWC = gpd.read_file('Data/Output_Areas_2021_PWC_V3_-1981902074309169314/PopCentroids_EW_2021_V3.shp')
# Reduce to just required columns
OA_PWC = OA_PWC[['OA21CD','geometry']].copy()

OA to LAD lookup

# Import csv
OA_lookup = pd.read_csv(
    'Data/Output_Area_to_Lower_layer_Super_Output_Area_to_Middle_layer_Super_Output_Area_to_Local_Authority_District_(December_2021)_Lookup_in_England_and_Wales_v3.csv',
    low_memory=False)
# Keep only required columns
OA_lookup = OA_lookup[['OA21CD','LSOA21CD','LAD22CD']].copy()
# Visual check
OA_lookup.head()
OA21CD LSOA21CD LAD22CD
0 E00060358 E01011968 E06000001
1 E00060359 E01011968 E06000001
2 E00060360 E01011968 E06000001
3 E00060361 E01011968 E06000001
4 E00060362 E01011970 E06000001

Feature engineering

Aggregate EV car ownership data to LAD

# Aggregate to LAD and calculate total number of PiVs
PiVs_LAD = PiVs_FILTERED.groupby('LAD22CD')['2023 Q4'].agg('sum').reset_index()

# Rename column to PiVs
PiVs_LAD.rename(columns={'2023 Q4': 'PiVs'}, inplace=True)
# Visual check
PiVs_LAD.head()
LAD22CD PiVs
0 E06000001 410.0
1 E06000002 391.0
2 E06000003 580.0
3 E06000004 1308.0
4 E06000005 719.0
len(PiVs_LAD)

331
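Because the pandas sum skips NaN by default, suppressed LSOA values effectively count as zero in the LAD totals, so the totals may slightly understate the true number of PiVs. The sketch below mimics that behaviour in plain Python with hypothetical rows, and also counts how many values were suppressed per LAD (the suppressed counter is an addition for illustration, not part of the analysis).

```python
import math
from collections import defaultdict

# Hypothetical rows: (LAD id, PiV count), with NaN standing in for suppressed values.
rows = [
    ("LAD_A", 28.0),
    ("LAD_A", float("nan")),
    ("LAD_B", 18.0),
    ("LAD_B", 12.0),
]

totals = defaultdict(float)
suppressed = defaultdict(int)
for lad_id, count in rows:
    if math.isnan(count):
        suppressed[lad_id] += 1  # skipped, as a NaN-skipping sum would do
    else:
        totals[lad_id] += count

print(dict(totals), dict(suppressed))
```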

Calculate distance to nearest chargepoint

  • Using the OA PWCs, calculate the nearest EV chargepoint.
  • Aggregate this to LAD level to get the average distance to the nearest chargepoint by LAD.
# Find the nearest chargepoint to each OA PWC and calculate distance in metres
OA_nearest_chargepoint = OA_PWC.sjoin_nearest(chargepoints_gdf, distance_col='distance', how='left')
# Merge on OA_lookup to get the LAD22CDs
OA_nearest_chargepoint = pd.merge(OA_nearest_chargepoint, OA_lookup, on='OA21CD', how='inner')
# Keep only required columns
OA_nearest_chargepoint = OA_nearest_chargepoint[['OA21CD','geometry','distance','LAD22CD']].copy()
# Aggregate to LAD and calculate average distance to nearest chargepoint
avg_dist_nearest_chargepoint_LAD = OA_nearest_chargepoint.groupby('LAD22CD')['distance'].agg('mean').reset_index()

# Rename column to avg_distance
avg_dist_nearest_chargepoint_LAD.rename(columns={'distance': 'avg_distance'}, inplace=True)
# Visual check
avg_dist_nearest_chargepoint_LAD.head()
LAD22CD avg_distance
0 E06000001 1649.498055
1 E06000002 1088.305461
2 E06000003 1454.585351
3 E06000004 911.259807
4 E06000005 1188.923394
len(avg_dist_nearest_chargepoint_LAD)

331

Calculate number of EV chargepoints in each LAD

  • Aggregate the EV chargepoint data to LAD level to get total number of chargepoints in each LAD.
# Spatially match
chargepoints_LAD = gpd.sjoin(LADs, chargepoints_gdf, predicate='intersects')
# Aggregate to LAD and calculate total number of chargepoints
chargepoints_LAD = chargepoints_LAD.groupby('LAD22CD')['index_right'].agg('count').reset_index()

# Rename column to total_chargepoints
chargepoints_LAD.rename(columns={'index_right': 'total_chargepoints'}, inplace=True)
# Visual check
chargepoints_LAD.head()
LAD22CD total_chargepoints
0 E06000001 42
1 E06000002 51
2 E06000003 46
3 E06000004 167
4 E06000005 76
len(chargepoints_LAD)

373

Calculate ratio of EV cars to chargepoints in each LAD

# Merge PiVs and chargepoint count dataframes
ratio_PiVs_to_chargepoints = pd.merge(PiVs_LAD, chargepoints_LAD, on='LAD22CD', how='left')

# Replace NaN values with 0 where there are no chargepoints in an LAD
ratio_PiVs_to_chargepoints.fillna({'total_chargepoints': 0}, inplace=True)
# Calculate ratio
ratio_PiVs_to_chargepoints['ratio_PiVs_to_chargepoints'] = (
    ratio_PiVs_to_chargepoints['PiVs'] / ratio_PiVs_to_chargepoints['total_chargepoints'])

# Replace 'inf' with NaN
ratio_PiVs_to_chargepoints['ratio_PiVs_to_chargepoints'] = ratio_PiVs_to_chargepoints[
    'ratio_PiVs_to_chargepoints'].replace([np.inf, -np.inf], np.nan)

Add average distance and LAD polygon to final dataframe

# Add nearest chargepoint
final_df = pd.merge(ratio_PiVs_to_chargepoints,avg_dist_nearest_chargepoint_LAD, on='LAD22CD')

# Add LAD polygons
final_df = pd.merge(LADs, final_df, on='LAD22CD')
# Visual check
final_df.head()
LAD22CD LAD22NM geometry PiVs total_chargepoints ratio_PiVs_to_chargepoints avg_distance
0 E06000001 Hartlepool MULTIPOLYGON (((450154.599 525938.201, 450140.... 410.0 42.0 9.761905 1649.498055
1 E06000002 Middlesbrough MULTIPOLYGON (((446854.700 517192.700, 446854.... 391.0 51.0 7.666667 1088.305461
2 E06000003 Redcar and Cleveland MULTIPOLYGON (((451747.397 520561.100, 451792.... 580.0 46.0 12.608696 1454.585351
3 E06000004 Stockton-on-Tees MULTIPOLYGON (((447177.704 517811.797, 447176.... 1308.0 167.0 7.832335 911.259807
4 E06000005 Darlington POLYGON ((423496.602 524724.299, 423497.204 52... 719.0 76.0 9.460526 1188.923394

Handle outliers

There are some outliers on the top end, so any value more than two standard deviations above the mean is capped at that threshold. Values at the lower end are left unchanged.

# Loop through each variable and calculate mean, std, and threshold
for variable in ['PiVs', 'total_chargepoints', 'ratio_PiVs_to_chargepoints', 'avg_distance']:
    mean_var = final_df[variable].mean()
    std_var = final_df[variable].std()
    threshold_var = mean_var + 2 * std_var
    
    # Create new column based on outlier condition
    final_df[f'amended_{variable}'] = final_df[variable].apply(lambda x: threshold_var if x > threshold_var else x)
# Visual check
final_df.head()
LAD22CD LAD22NM geometry PiVs total_chargepoints ratio_PiVs_to_chargepoints avg_distance amended_PiVs amended_total_chargepoints amended_ratio_PiVs_to_chargepoints amended_avg_distance
0 E06000001 Hartlepool MULTIPOLYGON (((450154.599 525938.201, 450140.... 410.0 42.0 9.761905 1649.498055 410.0 42.0 9.761905 1649.498055
1 E06000002 Middlesbrough MULTIPOLYGON (((446854.700 517192.700, 446854.... 391.0 51.0 7.666667 1088.305461 391.0 51.0 7.666667 1088.305461
2 E06000003 Redcar and Cleveland MULTIPOLYGON (((451747.397 520561.100, 451792.... 580.0 46.0 12.608696 1454.585351 580.0 46.0 12.608696 1454.585351
3 E06000004 Stockton-on-Tees MULTIPOLYGON (((447177.704 517811.797, 447176.... 1308.0 167.0 7.832335 911.259807 1308.0 167.0 7.832335 911.259807
4 E06000005 Darlington POLYGON ((423496.602 524724.299, 423497.204 52... 719.0 76.0 9.460526 1188.923394 719.0 76.0 9.460526 1188.923394
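None of the first five rows exceed the threshold, so their amended values are identical to the originals. To see the capping take effect, here is a small standard-library sketch of the same rule applied to made-up ratios containing one extreme value (the function name and numbers are hypothetical):

```python
import statistics


def cap_upper_outliers(values, n_sd=2):
    """Cap values above mean + n_sd * sd at that threshold; leave the rest as-is."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation, as pandas .std() uses by default
    threshold = mean + n_sd * sd
    return [min(value, threshold) for value in values], threshold


# Hypothetical ratios for ten made-up districts, with one extreme value.
ratios = [8.0, 9.0, 10.0, 11.0, 12.0, 9.0, 10.0, 11.0, 10.0, 60.0]

capped, threshold = cap_upper_outliers(ratios)
print(round(threshold, 1), capped[-1] == threshold)
```

Note that the threshold is calculated from data that still includes the outlier, so a single extreme value pulls the mean and standard deviation up and ends up capped at a fairly high value.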

Visualising data

The following code builds an interactive map of the features in final_df. The map is saved to the Outputs/ folder as an HTML file.

# Create the individual layers
m = final_df.explore(
    column="amended_PiVs",
    scheme="naturalbreaks",
    cmap='YlOrRd',
    legend=False,
    k=10,
    tooltip=["LAD22CD", "LAD22NM", "amended_PiVs"],
    style_kwds=dict(color='black', weight=0.5, fillOpacity=0.8),
    highlight_kwds=dict(fillOpacity=1),
    name="Plug-in Vehicles",
    show=True)

final_df.explore(
    m=m,
    column="amended_total_chargepoints",
    scheme="naturalbreaks",
    cmap='YlOrRd',
    legend=False,
    k=10,
    tooltip=["LAD22CD", "LAD22NM", "amended_total_chargepoints"],
    style_kwds=dict(color='black', weight=0.5, fillOpacity=0.8),
    highlight_kwds=dict(fillOpacity=1),
    name="Total EV Chargepoints",
    show=False)

final_df.explore(
    m=m,
    column="amended_ratio_PiVs_to_chargepoints",
    scheme="naturalbreaks",
    cmap='YlOrRd',
    legend=False,
    k=10,
    tooltip=["LAD22CD", "LAD22NM", "amended_ratio_PiVs_to_chargepoints"],
    style_kwds=dict(color='black', weight=0.5, fillOpacity=0.8),
    highlight_kwds=dict(fillOpacity=1),
    name="Ratio of Plug-in Vehicles to EV Chargepoints",
    show=False)

final_df.explore(
    m=m,
    column="amended_avg_distance",
    scheme="naturalbreaks",
    cmap='YlOrRd',
    legend=False,
    k=10,
    tooltip=["LAD22CD", "LAD22NM", "amended_avg_distance"],
    style_kwds=dict(color='black', weight=0.5, fillOpacity=0.8),
    highlight_kwds=dict(fillOpacity=1),
    name="Average distance to nearest EV Chargepoint by OA",
    show=False)

# Add the map base
folium.TileLayer("CartoDB positron", show=True).add_to(m)

# Add layer control to map
folium.LayerControl().add_to(m)

# Save the map to an HTML file
m.save("Outputs/Interactive_map.html")

The following code builds four choropleth maps of the features in final_df. The figure is saved to the Outputs/ folder as a .png file.

# Create 2x2 subplots
fig, axs = plt.subplots(2, 2, figsize=(12, 12))

# Define aliases for column names
column_aliases = {
    'amended_PiVs': 'Plug-in Vehicles (PiVs)',
    'amended_total_chargepoints': 'Total EV Chargepoints (EVCs)',
    'amended_ratio_PiVs_to_chargepoints': 'Ratio of PiVs to EVCs',
    'amended_avg_distance': 'Average distance to nearest EVC by OA (metres)'
}

# Loop through each variable and corresponding subplot
variables = list(column_aliases.keys())
for variable, ax in zip(variables, axs.flatten()):
    final_df.plot(column=variable, cmap='YlOrRd', ax=ax, legend=True)
    ax.set_title(f'{column_aliases[variable]}')  # Set the title using the alias
    ax.set_axis_off()  # Turn off axis
    
plt.subplots_adjust(wspace=0.05, hspace=0.05)  # Adjust space between subplots
plt.tight_layout()  # Adjust layout to prevent overlap

# Export to Outputs/ folder
plt.savefig('Outputs/Feature_choropleths.png', dpi=300)

# Close figure
plt.close(fig)

The following code creates two scatter plots with regression lines for selected features from final_df. The figure is saved to the Outputs/ folder as a .png file.

# Create a subplot grid with 1 row and 2 columns
fig, axs = plt.subplots(1, 2, figsize=(15, 6))

# Plot the first scatter plot
sns.scatterplot(x='amended_PiVs', y='amended_total_chargepoints', data=final_df, ax=axs[0])
sns.regplot(x='amended_PiVs', y='amended_total_chargepoints', data=final_df, scatter=False, ax=axs[0])
axs[0].set_xlabel(column_aliases['amended_PiVs'])  # Set x-axis label using alias
axs[0].set_ylabel(column_aliases['amended_total_chargepoints'])  # Set y-axis label using alias
axs[0].set_title('Plug-in Vehicles vs. Total EV Chargepoints')  # Set the title

# Plot the second scatter plot
sns.scatterplot(x='amended_ratio_PiVs_to_chargepoints', y='amended_avg_distance', data=final_df, ax=axs[1])
sns.regplot(x='amended_ratio_PiVs_to_chargepoints', y='amended_avg_distance', data=final_df, scatter=False, ax=axs[1])
axs[1].set_xlabel(column_aliases['amended_ratio_PiVs_to_chargepoints'])  # Set x-axis label using alias
axs[1].set_ylabel(column_aliases['amended_avg_distance'])  # Set y-axis label using alias
axs[1].set_title('Ratio vs. Distance')  # Set the title

# Adjust layout
plt.tight_layout()

# Export to Outputs/ folder
plt.savefig('Outputs/Scatter_Reg_Plots.png', dpi=300)

# Close figure
plt.close(fig)

The following code creates a correlation matrix heatmap of the features in final_df. The figure is saved to the Outputs/ folder as a .png file.

# Calculate the correlation matrix
correlation_matrix = final_df[[
    'amended_PiVs','amended_total_chargepoints',
    'amended_ratio_PiVs_to_chargepoints','amended_avg_distance']].corr()

# Generate a mask for the upper triangle (including the diagonal) so each pair of features is shown only once
mask = np.triu(np.ones_like(correlation_matrix, dtype=bool))

# Generate a heatmap
plt.figure(figsize=(8, 7))
sns.heatmap(correlation_matrix, cmap='RdYlGn', annot=True, 
            fmt=".2f", mask=mask, vmin=-1, vmax=1, cbar_kws={"shrink": 0.75}, linewidth=.5)

# Rotate axis labels
plt.xticks(rotation=45)
plt.yticks(rotation=45)

# Set axis tick labels using readable aliases
plt.xticks(ticks=range(len(correlation_matrix.columns)), 
           labels=['Plug-in Vehicles', 
                   'Total EV Chargepoints', 
                   'Ratio of PiVs to EVCs', 
                   'Avg dist to nearest EVC'])

plt.yticks(ticks=np.arange(len(correlation_matrix.columns))+0.5, 
           labels=['Plug-in Vehicles', 
                   'Total EV Chargepoints', 
                   'Ratio of PiVs to EVCs', 
                   'Avg dist to nearest EVC'])

plt.title('Correlation Matrix Heatmap', fontweight='bold')

plt.tight_layout()

# Export to Outputs/ folder
plt.savefig('Outputs/CorrelationMatrix.png', dpi=300)

# Close figure
plt.close()

Creating a ranking

The following code creates a simple ranking of LADs based on:

  • Average distance to the nearest EV chargepoint
  • Ratio of Plug-in vehicles to EV chargepoints

The idea is that if the average distance is high, and the ratio is high, then these areas may be lacking public EV infrastructure.

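# Create a new dataframe containing just the identifiers, geometry and the two amended features used for the ranking
ranking = final_df[['LAD22CD', 'LAD22NM', 'geometry',
                    'amended_ratio_PiVs_to_chargepoints', 'amended_avg_distance']].copy()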
# Rank the values in each column
ranking['rank_ratio'] = ranking['amended_ratio_PiVs_to_chargepoints'].rank(ascending=True, method='min')
ranking['rank_distance'] = ranking['amended_avg_distance'].rank(ascending=True, method='min')
# Compute the combined ranking
ranking['combined_rank'] = (ranking['rank_ratio'] + ranking['rank_distance']) / 2
ranking['combined_rank'] = ranking['combined_rank'].rank(ascending=True, method='min')
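With method='min', tied values share the lowest rank in their group and the next rank is skipped, which is why the results can contain sequences such as 4.0, 4.0, 6.0. The sketch below reproduces that logic in plain Python with made-up values; rank_min is a hypothetical helper written for this illustration, not a pandas function.

```python
def rank_min(values):
    """Rank ascending from 1; tied values share the lowest rank of their group."""
    ordered = sorted(values)
    return [ordered.index(value) + 1 for value in values]


# Hypothetical feature values for four made-up districts.
ratio = [5.0, 7.0, 9.0, 20.0]
distance = [300.0, 900.0, 500.0, 2000.0]

rank_ratio = rank_min(ratio)
rank_distance = rank_min(distance)

# Average the two ranks, then rank the averages to get the combined rank.
mean_rank = [(a + b) / 2 for a, b in zip(rank_ratio, rank_distance)]
combined_rank = rank_min(mean_rank)

print(mean_rank, combined_rank)
```

The second and third districts swap places between the two features, so their average ranks are equal and they share a combined rank of 2, with rank 3 left unused.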

The following code lists the top 10 LADs by combined_rank in ranking and creates a map of them. The map is saved to the Outputs/ folder as a .png file.

# Get top 10 areas
ranking[['LAD22CD','LAD22NM','combined_rank']].sort_values(by='combined_rank', ascending=True).head(10)
LAD22CD LAD22NM combined_rank
288 E09000013 Hammersmith and Fulham 1.0
308 E09000033 Westminster 2.0
303 E09000028 Southwark 3.0
276 E09000001 City of London 4.0
295 E09000020 Kensington and Chelsea 4.0
299 E09000024 Merton 6.0
307 E09000032 Wandsworth 6.0
282 E09000007 Camden 8.0
287 E09000012 Hackney 9.0
294 E09000019 Islington 9.0
# Sort and select the top 10 areas
top_10_areas = ranking.sort_values(by='combined_rank', ascending=True).head(10)

# Plot only the top 10 areas
fig, ax = plt.subplots(1, 1, figsize=(8, 10))
top_10_areas.plot(ax=ax, color='#88C88E', edgecolor='black', linewidth=1, legend=True, alpha=0.5)

# Annotate the top 10 areas with their rank
for idx, row in top_10_areas.iterrows():
    plt.annotate(text=int(row['combined_rank']), xy=(row.geometry.centroid.x, row.geometry.centroid.y),
                 horizontalalignment='center', fontsize=12, weight='bold', color='black')

# Plot basemap
ctx.add_basemap(ax, crs=top_10_areas.crs.to_string(), source=ctx.providers.CartoDB.Positron)

# Set title and remove axis
ax.set_title('Top 10 Areas by Combined Rank', fontsize=12, fontweight='bold')
ax.set_axis_off()

# Adjust layout
plt.tight_layout()

# Export to Outputs/ folder
plt.savefig('Outputs/Map_of_Top_10.png', dpi=300)

# Close figure
plt.close(fig)

The following code lists the bottom 10 LADs by combined_rank in ranking and creates a map of them. The map is saved to the Outputs/ folder as a .png file.

# Show bottom 10 areas
ranking[['LAD22CD','LAD22NM','combined_rank']].sort_values(by='combined_rank', ascending=False).head(10)
LAD22CD LAD22NM combined_rank
99 E07000074 Maldon 330.0
89 E07000064 Rother 329.0
235 E07000242 East Hertfordshire 328.0
16 E06000017 Rutland 327.0
100 E07000075 Rochford 326.0
85 E07000047 West Devon 325.0
177 E07000169 Selby 324.0
160 E07000139 North Kesteven 323.0
82 E07000044 South Hams 321.0
199 E07000198 Staffordshire Moorlands 321.0
# Sort and select the bottom 10 areas
bottom_10_areas = ranking.sort_values(by='combined_rank', ascending=False).head(10)

# Plot only the bottom 10 areas
fig, ax = plt.subplots(1, 1, figsize=(10, 10))
bottom_10_areas.plot(ax=ax, color='#DE6464', edgecolor='black', linewidth=1, legend=True, alpha=0.5)

# Annotate the bottom 10 areas with their rank
for idx, row in bottom_10_areas.iterrows():
    plt.annotate(text=int(row['combined_rank']), xy=(row.geometry.centroid.x, row.geometry.centroid.y),
                 horizontalalignment='center', fontsize=12, weight='bold', color='black')

# Plot basemap
ctx.add_basemap(ax, crs=bottom_10_areas.crs.to_string(), source=ctx.providers.CartoDB.Positron)

# Set title and remove axis
ax.set_title('Bottom 10 Areas by Combined Rank', fontsize=12, fontweight='bold')
ax.set_axis_off()

# Adjust layout
plt.tight_layout()

# Export to Outputs/ folder
plt.savefig('Outputs/Map_of_Bottom_10.png', dpi=300)

# Close figure
plt.close(fig)
