Masking a netcdf file with a shapefile using GeoPandas (Python)

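I have a netcdf file of the EDGAR emissions inventory and a shapefile of US Census data. I want to extract from the netcdf only the data that overlaps/intersects the New York region of the shapefile, so that I can calculate the total emissions for New York City.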

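I have never worked with shapefiles or GeoPandas before, so please bear with me. I am able to read the shapefile, filter it down to a specific region, and convert the netcdf to a GeoDataFrame. I want to keep only the netcdf data that falls inside the filtered region of the shapefile so that I can analyze it.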

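Update: I tried using sjoin and clip, but when I run either command the resulting dataframe has no data, and when I plot the sjoin result I get the message "The GeoDataFrame you are attempting to plot is empty. Nothing has been displayed."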

import geopandas as gpd
import matplotlib.pyplot as plt
import xarray as xr


# read in file path for shapefile
fp_shp = "C:/Users/cb_2018_us_ua10_500k/cb_2018_us_ua10_500k.shp"
# read in netcdf file path
ncs = "C:/Users/v50_N2O_2015.0.1x0.1.nc"

# Read in NETCDF as a pandas dataframe
# Xarray provides a simple method of opening netCDF files, and converting them to pandas dataframes
ds = xr.open_dataset(ncs)
edgar = ds.to_dataframe()

# the index in the df is a Pandas.MultiIndex. To reset it, use df.reset_index()
edgar = edgar.reset_index()

# Read shapefile using gpd.read_file()
shp = gpd.read_file(fp_shp)

# read the netcdf data file
#nc = netCDF4.Dataset(ncs,'r')

# quick check for shpfile plotting
shp.plot(figsize=(12, 8));

# filter out shapefile for SPECIFIC city/region

# how to filter rows in DataFrame that contains string
# extract NYC from shapefile dataframe
nyc_shp = shp[shp['NAME10'].str.contains("New York")]

# export shapefile
#nyc_shp.to_file('NYC.shp', driver ='ESRI Shapefile')

# use geopandas points_from_xy() to turn the lon/lat columns into shapely Point objects
# and use them as the geometry of a new GeoDataFrame
# assumption: the EDGAR grid is plain lon/lat in WGS84 (EPSG:4326)
edgar_gdf = gpd.GeoDataFrame(edgar, geometry=gpd.points_from_xy(edgar.lon, edgar.lat), crs="EPSG:4326")

print(edgar_gdf.head())

# check the CRS of each layer
print(nyc_shp.crs)    # shapefile
print(edgar_gdf.crs)  # netcdf points

# reproject the points into the shapefile's CRS so that both layers match
# (to_crs transforms the coordinates; assigning .crs directly would only relabel them)
edgar_gdf = edgar_gdf.to_crs(nyc_shp.crs)

# confirm that the CRS now matches the shapefile
print(edgar_gdf.crs)

# Clip points, lines, or polygon geometries to the mask extent.
mask = gpd.clip(edgar_gdf, nyc_shp)

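I figured it out! I needed to make sure my netcdf file used the same longitude convention as my shapefile. So I converted the netcdf longitudes from [0, 360] to [-180, 180] before converting to a GeoDataFrame, so that the two would match. With that change, the code above works!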
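For anyone hitting the same empty result, here is a minimal sketch of that longitude conversion with xarray. It uses a small synthetic grid instead of the real EDGAR file, and the variable name `emi_n2o` and the coordinate names `lat`/`lon` are assumptions for illustration; check `print(ds)` for the actual names in your file.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the netcdf grid, with longitudes on [0, 360).
# "emi_n2o", "lat" and "lon" are assumed names, not taken from the real file.
lon = np.arange(0.0, 360.0, 90.0)  # 0, 90, 180, 270
lat = np.array([-45.0, 45.0])
ds = xr.Dataset(
    {"emi_n2o": (("lat", "lon"), np.arange(8.0).reshape(2, 4))},
    coords={"lat": lat, "lon": lon},
)

# Wrap longitudes into [-180, 180), then sort so the axis is increasing again.
ds_wrapped = ds.assign_coords(lon=(((ds.lon + 180) % 360) - 180)).sortby("lon")
wrapped_lons = ds_wrapped.lon.values.tolist()

# Flatten to a dataframe with lon/lat columns, as in the question's code.
edgar = ds_wrapped.to_dataframe().reset_index()
print(wrapped_lons)
print(edgar.head())
```

With the real file, the same two steps would go right after `xr.open_dataset(ncs)` and before `to_dataframe()`. A quick sanity check is to compare `edgar_gdf.total_bounds` with `nyc_shp.total_bounds`: if the longitude ranges do not overlap, clip and sjoin will return an empty GeoDataFrame.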