Jan-02-2025, 08:29 PM
(This post was last modified: Jan-02-2025, 09:21 PM by Yoriz. Edit Reason: Added code tags)
How do I convert the following code to use the limit/offset parameters to fetch a large dataset in pieces, instead of doing a get_all()-style operation? The code I want to convert is below. There seems to be a way of paginating the download of the data and then creating the DataFrame once at the end from the accumulated records, rather than going through the single ds_records call you see below:
client = Socrata(nmdx_url, token, username, password, timeout=90)

item_count += 1

# Process metadata
ds_md = client.get_metadata(ds)
print('=[ ' + str(item_count) + ' ]==================================================================================================================')
print(f"ID: {ds_md['id']}")
print(f"NAME: {ds_md['name']}")
print(f"CATEGORY: {ds_md['category']}")
print(f"DESCRIPTION: {ds_md['description']}")
#print(f"rowsUpdatedAt: {ds_md['rowsUpdatedAt']} ({datetime.fromtimestamp(ds_md['rowsUpdatedAt'])})")

# Fetch records from the table
ds_cols = [colname['fieldName'] for colname in ds_md['columns']]
ds_records = client.get(ds, where='', limit=10)
#ds_records = client.get_all(ds, where='')
#ds_records = client.get_all(ds)
ds_df = pd.DataFrame.from_records(ds_records, columns=ds_cols)

How do I do this without the ds_records step here, while still resolving the columns?
Something like:
chunk = client.get(ds, limit=chunk_size, offset=offset)
if not chunk:  # Check for empty response
    break
all_data.extend(chunk)
offset += chunk_size

Again, I'm trying to call
client.get()
where the get happens many times with a limited amount of data per request, to keep the network traffic reasonable. How is this done? Thanks
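A minimal sketch of the loop described above, assuming sodapy's documented client.get(dataset_id, limit=..., offset=...) signature, which returns a list of dicts. The function name fetch_paginated and the chunk_size value are illustrative, not part of the original code; client, ds, and ds_cols are the variables from the snippet above:

```python
def fetch_paginated(client, dataset_id, chunk_size=2000, where=''):
    """Download all rows in chunks via repeated client.get() calls,
    advancing the offset until an empty response signals the end."""
    all_rows = []
    offset = 0
    while True:
        chunk = client.get(dataset_id, where=where,
                           limit=chunk_size, offset=offset)
        if not chunk:  # empty response -> no more rows
            break
        all_rows.extend(chunk)
        offset += chunk_size
    return all_rows

# Then build the DataFrame once at the end, e.g.:
# ds_records = fetch_paginated(client, ds, chunk_size=5000)
# ds_df = pd.DataFrame.from_records(ds_records, columns=ds_cols)
```

This keeps each network request bounded at chunk_size rows while still producing a single DataFrame at the end, and ds_cols can still be resolved from get_metadata() exactly as before.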