
several errors running exemplar_workflow.ipynb #14

@qgroom

Description


I got these errors when running exemplar_workflow.ipynb...

First this error...

This was the command that led to this error after I edited it from the original...

Adding quotes to the parameters seemed to solve it.

utils.to_geoparquet("0039648-250827131500795.csv", "EQDGC-Level-2.gpkg", leftID='eqdcellcode', rightID='cellCode', exportPath='./data/export.parquet')
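For anyone else hitting this: without quotes, Python tries to read the file names as expressions, and `0039648` is rejected as an integer literal with leading zeros. A minimal sketch of that check, using only the standard library (the helper name `parses` is made up for illustration, and the call is shortened to the two file arguments):

```python
import ast

# The call as first written, with unquoted file names (shortened from the traceback).
unquoted = "utils.to_geoparquet(0039648-250827131500795.csv, EQDGC-Level-2.gpkg)"
# The same call with the file names passed as string literals.
quoted = 'utils.to_geoparquet("0039648-250827131500795.csv", "EQDGC-Level-2.gpkg")'


def parses(source: str) -> bool:
    """Return True if `source` is syntactically valid Python (hypothetical helper)."""
    try:
        ast.parse(source)  # only parses the text, nothing is executed
        return True
    except SyntaxError:
        return False


print(parses(unquoted))  # False
print(parses(quoted))    # True
```

So the file paths need to be passed as strings.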
But then I got this error....

Cell In[4], line 3
    utils.to_geoparquet(0039648-250827131500795.csv, EQDGC-Level-2.gpkg, leftID='eqdcellcode', rightID='cellCode', exportPath='./data/export.parquet')
                        ^
SyntaxError: leading zeros in decimal integer literals are not permitted; use an 0o prefix for octal integers

C:\Users\quent\anaconda3\Lib\contextlib.py:141: RuntimeWarning: driver GPKG does not support open option CRS
  return next(self.gen)
C:\Users\quent\anaconda3\Lib\contextlib.py:141: RuntimeWarning: GPKG: unrecognized user_version=0x00000000 (0) on 'EQDGC-Level-2.gpkg'
  return next(self.gen)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[5], line 3
      1 from b3alien import utils
----> 3 utils.to_geoparquet("0039648-250827131500795.csv", "EQDGC-Level-2.gpkg", leftID='eqdcellcode', rightID='cellCode', exportPath='./data/export.parquet')

File ~\anaconda3\Lib\site-packages\b3alien\utils\geo.py:30, in to_geoparquet(csvFile, geoFile, leftID, rightID, exportPath)
     27 data = pd.read_csv(csvFile, sep='\t')
     28 geoRef = gpd.read_file(geoFile, engine='pyogrio', use_arrow=True, crs="EPSG:4326")
---> 30 test_merge = pd.merge(data, qdgc_ref, left_on=leftID, right_on=rightID)
     32 gdf = gpd.GeoDataFrame(test_merge, geometry='geometry')
     33 if gdf.crs is None:

NameError: name 'qdgc_ref' is not defined

This was solved by editing geo.py and changing this line...

merged = pd.merge(data, qdgc_ref, left_on=leftID, right_on=rightID, how="inner")

to

merged = pd.merge(data, geoRef, left_on=leftID, right_on=rightID, how="inner")

Followed by this error...

TypeError                                 Traceback (most recent call last)
Cell In[4], line 3
      1 from b3alien import utils
----> 3 utils.to_geoparquet("0039648-250827131500795.csv", "EQDGC-Level-2.gpkg", leftID='eqdcellcode', rightID='cellCode', exportPath='./data/export.parquet')

File ~\anaconda3\Lib\site-packages\b3alien\utils\geo.py:31, in to_geoparquet(csvFile, geoFile, leftID, rightID, exportPath)
     28 data[leftID] = data[leftID].str.strip()
     29 geoRef = gpd.read_file(geoFile, engine='pyogrio', use_arrow=True, crs="EPSG:4326")
---> 31 test_merge = pd.merge(data, geoFile, left_on=leftID, right_on=rightID)
     33 gdf = gpd.GeoDataFrame(test_merge, geometry='geometry')
     34 if gdf.crs is None:

File ~\anaconda3\Lib\site-packages\pandas\core\reshape\merge.py:153, in merge(left, right, how, on, left_on, right_on, left_index, right_index, sort, suffixes, copy, indicator, validate)
    135 @Substitution("\nleft : DataFrame or named Series")
    136 @Appender(_merge_doc, indents=0)
    137 def merge(
   (...)
    150     validate: str | None = None,
    151 ) -> DataFrame:
    152     left_df = _validate_operand(left)
--> 153     right_df = _validate_operand(right)
    154     if how == "cross":
    155         return _cross_merge(
    156             left_df,
    157             right_df,
   (...)
    167             copy=copy,
    168         )

File ~\anaconda3\Lib\site-packages\pandas\core\reshape\merge.py:2692, in _validate_operand(obj)
   2690     return obj.to_frame()
   2691 else:
-> 2692     raise TypeError(
   2693         f"Can only merge Series or DataFrame objects, a {type(obj)} was passed"
   2694     )

TypeError: Can only merge Series or DataFrame objects, a <class 'str'> was passed
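Looking at line 31 in that traceback, the merge received `geoFile` (the path string) rather than `geoRef` (the frame read from it). A small sketch that appears to reproduce the same TypeError with pandas alone; the one-row frame and its value are made up, and the .gpkg file is never opened:

```python
import pandas as pd

# Stand-in for the occurrence table; the value is a made-up placeholder.
data = pd.DataFrame({"eqdcellcode": ["A1"]})

# This is only the file name (a str), not the table read from the file.
geo_file = "EQDGC-Level-2.gpkg"

try:
    pd.merge(data, geo_file, left_on="eqdcellcode", right_on="cellCode")
except TypeError as exc:
    message = str(exc)
    print(message)
```

Passing the frame returned by `gpd.read_file(...)` as the right-hand operand avoids this.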

I got beyond this error by using this code in geo.py...

# Read tab-separated CSV, keep key as string to preserve leading zeros
data = pd.read_csv(csvFile, sep="\t", dtype={leftID: "string"})
data[leftID] = data[leftID].str.strip()

# Read the GeoPackage (no crs= here)
geoRef = gpd.read_file(geoFile, engine="pyogrio", use_arrow=True)

# If the layer has no CRS and you KNOW it should be WGS84, set it explicitly
if geoRef.crs is None:
    geoRef = geoRef.set_crs(4326)

# Ensure join key is string and trimmed
geoRef[rightID] = geoRef[rightID].astype("string").str.strip()

# ✅ Merge DATAFRAMES, not strings
merged = pd.merge(data, geoRef, left_on=leftID, right_on=rightID, how="inner")

# Build GeoDataFrame and write Parquet
gdf = gpd.GeoDataFrame(merged, geometry="geometry", crs=geoRef.crs)
gdf.to_parquet(exportPath, index=False)
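To illustrate why the key is read with `dtype="string"` above, here is a minimal sketch with pandas only. The cell codes, counts, and names are invented for illustration and are not taken from the real CSV or GeoPackage:

```python
import io

import pandas as pd

# Made-up tab-separated data with zero-padded keys (hypothetical values).
tsv = "eqdcellcode\tcount\n0012\t3\n0345\t5\n"

# Default parsing infers integers, so the padding is lost.
as_default = pd.read_csv(io.StringIO(tsv), sep="\t")
print(as_default["eqdcellcode"].tolist())  # [12, 345]

# Reading the key as a string keeps the codes exactly as written.
as_string = pd.read_csv(io.StringIO(tsv), sep="\t", dtype={"eqdcellcode": "string"})
as_string["eqdcellcode"] = as_string["eqdcellcode"].str.strip()
print(as_string["eqdcellcode"].tolist())  # ['0012', '0345']

# Stand-in for the reference layer, with the join key normalised the same way.
ref = pd.DataFrame({"cellCode": ["0012 ", "0345"], "name": ["a", "b"]})
ref["cellCode"] = ref["cellCode"].astype("string").str.strip()

merged = pd.merge(as_string, ref, left_on="eqdcellcode", right_on="cellCode", how="inner")
print(len(merged))  # 2
```

With both keys as trimmed strings, the inner join matches every row in this toy data. Whether the real cell codes are affected by padding or whitespace would need checking against the actual files.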
