Description
I got these errors when running exemplar_workflow.ipynb...
First this error...
This was the command that led to this error after I edited it from the original...
Adding quotes to the parameters seemed to solve it.
utils.to_geoparquet("0039648-250827131500795.csv", "EQDGC-Level-2.gpkg", leftID='eqdcellcode', rightID='cellCode', exportPath='./data/export.parquet')
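For anyone hitting the same thing: without the quotes, Python reads the file name as code rather than as a string, and the leading zeros in 0039648 are rejected as an integer literal. A minimal sketch that reproduces this using only the standard library (the helper name `parses` is made up for illustration):

```python
def parses(source: str) -> bool:
    """Return True if `source` compiles as a Python expression."""
    try:
        compile(source, "<cell>", "eval")
        return True
    except SyntaxError:
        return False


unquoted = "0039648-250827131500795.csv"    # what the cell contained
quoted = '"0039648-250827131500795.csv"'    # the same text as a string literal

print(parses(unquoted))  # False: the tokenizer rejects the leading zeros
print(parses(quoted))    # True: a plain string literal
```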
For reference, this was the SyntaxError from the unquoted call:
Cell In[4], line 3
    utils.to_geoparquet(0039648-250827131500795.csv, EQDGC-Level-2.gpkg, leftID='eqdcellcode', rightID='cellCode', exportPath='./data/export.parquet')
                        ^
SyntaxError: leading zeros in decimal integer literals are not permitted; use an 0o prefix for octal integers
With the quotes added, I then got this error...
C:\Users\quent\anaconda3\Lib\contextlib.py:141: RuntimeWarning: driver GPKG does not support open option CRS
  return next(self.gen)
C:\Users\quent\anaconda3\Lib\contextlib.py:141: RuntimeWarning: GPKG: unrecognized user_version=0x00000000 (0) on 'EQDGC-Level-2.gpkg'
  return next(self.gen)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[5], line 3
      1 from b3alien import utils
----> 3 utils.to_geoparquet("0039648-250827131500795.csv", "EQDGC-Level-2.gpkg", leftID='eqdcellcode', rightID='cellCode', exportPath='./data/export.parquet')
File ~\anaconda3\Lib\site-packages\b3alien\utils\geo.py:30, in to_geoparquet(csvFile, geoFile, leftID, rightID, exportPath)
     27 data = pd.read_csv(csvFile, sep='\t')
     28 geoRef = gpd.read_file(geoFile, engine='pyogrio', use_arrow=True, crs="EPSG:4326")
---> 30 test_merge = pd.merge(data, qdgc_ref, left_on=leftID, right_on=rightID)
     32 gdf = gpd.GeoDataFrame(test_merge, geometry='geometry')
     33 if gdf.crs is None:
NameError: name 'qdgc_ref' is not defined
This was solved by editing geo.py and changing this line...
merged = pd.merge(data, qdgc_ref, left_on=leftID, right_on=rightID, how="inner")
to
merged = pd.merge(data, geoRef, left_on=leftID, right_on=rightID, how="inner")
Followed by this error...
TypeError                                 Traceback (most recent call last)
Cell In[4], line 3
      1 from b3alien import utils
----> 3 utils.to_geoparquet("0039648-250827131500795.csv", "EQDGC-Level-2.gpkg", leftID='eqdcellcode', rightID='cellCode', exportPath='./data/export.parquet')
File ~\anaconda3\Lib\site-packages\b3alien\utils\geo.py:31, in to_geoparquet(csvFile, geoFile, leftID, rightID, exportPath)
     28 data[leftID] = data[leftID].str.strip()
     29 geoRef = gpd.read_file(geoFile, engine='pyogrio', use_arrow=True, crs="EPSG:4326")
---> 31 test_merge = pd.merge(data, geoFile, left_on=leftID, right_on=rightID)
     33 gdf = gpd.GeoDataFrame(test_merge, geometry='geometry')
     34 if gdf.crs is None:
File ~\anaconda3\Lib\site-packages\pandas\core\reshape\merge.py:153, in merge(left, right, how, on, left_on, right_on, left_index, right_index, sort, suffixes, copy, indicator, validate)
    135 @Substitution("\nleft : DataFrame or named Series")
    136 @Appender(_merge_doc, indents=0)
    137 def merge(
   (...)
    150     validate: str | None = None,
    151 ) -> DataFrame:
    152     left_df = _validate_operand(left)
--> 153     right_df = _validate_operand(right)
    154     if how == "cross":
    155         return _cross_merge(
    156             left_df,
    157             right_df,
   (...)
    167             copy=copy,
    168         )
File ~\anaconda3\Lib\site-packages\pandas\core\reshape\merge.py:2692, in _validate_operand(obj)
   2690     return obj.to_frame()
   2691 else:
-> 2692     raise TypeError(
   2693         f"Can only merge Series or DataFrame objects, a {type(obj)} was passed"
   2694     )
TypeError: Can only merge Series or DataFrame objects, a <class 'str'> was passed
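The TypeError comes from passing geoFile (the path, a str) to pd.merge instead of geoRef (the table read from that path). A minimal sketch that reproduces the same failure with plain pandas; the cell codes and the extra columns are made-up values, not taken from the real files:

```python
import pandas as pd

# Hypothetical stand-ins for the occurrence table and the grid reference
data = pd.DataFrame({"eqdcellcode": ["E018S33AB", "E018S33AC"], "count": [3, 5]})
geo_ref = pd.DataFrame({"cellCode": ["E018S33AB"], "area": [1.0]})
geo_file = "EQDGC-Level-2.gpkg"  # only the path, not the loaded table

try:
    pd.merge(data, geo_file, left_on="eqdcellcode", right_on="cellCode")
    error = None
except TypeError as exc:
    error = str(exc)

print(error)

# Merging against the loaded table works
merged = pd.merge(data, geo_ref, left_on="eqdcellcode", right_on="cellCode")
print(len(merged))
```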
I got beyond this error by using this code in geo.py...
# Read tab-separated CSV, keep key as string to preserve leading zeros
data = pd.read_csv(csvFile, sep="\t", dtype={leftID: "string"})
data[leftID] = data[leftID].str.strip()
# Read the GeoPackage (no crs= here)
geoRef = gpd.read_file(geoFile, engine="pyogrio", use_arrow=True)
# If the layer has no CRS and you KNOW it should be WGS84, set it explicitly
if geoRef.crs is None:
    geoRef = geoRef.set_crs(4326)
# Ensure join key is string and trimmed
geoRef[rightID] = geoRef[rightID].astype("string").str.strip()
# ✅ Merge DATAFRAMES, not strings
merged = pd.merge(data, geoRef, left_on=leftID, right_on=rightID, how="inner")
# Build GeoDataFrame and write Parquet
gdf = gpd.GeoDataFrame(merged, geometry="geometry", crs=geoRef.crs)
gdf.to_parquet(exportPath, index=False)
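One thing to watch with how="inner": rows whose key has no match in the grid file are dropped without any warning. A small sketch, in plain pandas with made-up cell codes and a hypothetical `species` column, of how the keys could be checked first using a left merge with indicator=True:

```python
import io

import pandas as pd

# Hypothetical tab-separated data with stray whitespace around two keys
tsv = "eqdcellcode\tspecies\n E018S33AB\tA\nE018S33AC \tB\nE018S33ZZ\tC\n"
data = pd.read_csv(io.StringIO(tsv), sep="\t", dtype={"eqdcellcode": "string"})
data["eqdcellcode"] = data["eqdcellcode"].str.strip()

# Hypothetical stand-in for the grid reference table
geo_ref = pd.DataFrame({"cellCode": ["E018S33AB", "E018S33AC"]})
geo_ref["cellCode"] = geo_ref["cellCode"].astype("string").str.strip()

# A left merge keeps every occurrence row; "_merge" records where each row matched
check = pd.merge(
    data, geo_ref,
    left_on="eqdcellcode", right_on="cellCode",
    how="left", indicator=True,
)
unmatched = check.loc[check["_merge"] == "left_only", "eqdcellcode"].tolist()
print(unmatched)
```

If `unmatched` is not empty, those are the rows the inner merge would drop.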