Python Forum

Full Version: bulk import in arangodb from a dataframe
Hello

I am trying to bulk import documents into an ArangoDB collection from pandas DataFrames. The code I use looks like this:
import pandas as pd
from arango import ArangoClient
import json

client = ArangoClient(hosts='http://127.0.0.1:8529')
db = client.db('DB', username='root', password='123456')

df = pd.read_excel("file.xlsx", sheet_name = "worksheet", header = 1)
df1 = df.to_json(orient="records")
collection = db.collection("collection_name")
collection.import_bulk(df1)
When I run this code I get the following error:
Error:
DocumentInsertError: [HTTP 400][ERR 1227] invalid document type
Reading the documentation I saw this:
Quote: DataFrame.to_json
Parameters:
path_or_buf : str or file handle, optional
File path or object. If not specified, the result is returned as a string.
If I run the above code like this:
df.to_json("filename.json", orient="records")  # writes the file; returns None when a path is given
with open("filename.json", 'r') as json_file:
    data = json.load(json_file)
collection = db.collection("collection_name")
collection.import_bulk(data)

then the data is imported into the database without any problems. Can anyone help and explain why this is happening? I am trying to import a huge amount of data into the database, and writing the JSON files takes too much time and sometimes causes other problems.
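
A likely explanation, going by the documentation quoted above: without a path, to_json returns a single JSON string, whereas the working version passes import_bulk the list of dicts produced by json.load. Parsing the string in memory with json.loads should give the same list without touching the disk. The sketch below only illustrates that idea. The helper names records_from_json and batches and the batch size are made up for this example, and the literal string stands in for the output of df.to_json(orient="records").

```python
import json
from typing import Any, Dict, Iterator, List


def records_from_json(json_text: str) -> List[Dict[str, Any]]:
    """Parse a JSON array string (as returned by to_json with orient="records")
    into a list of dicts. Hypothetical helper, not part of pandas or python-arango."""
    records = json.loads(json_text)
    if not isinstance(records, list):
        raise TypeError("expected a JSON array of records")
    return records


def batches(records: List[Dict[str, Any]], size: int = 10000) -> Iterator[List[Dict[str, Any]]]:
    """Yield consecutive slices of at most `size` records (size is an arbitrary choice)."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


# Stand-in for: json_text = df.to_json(orient="records")
json_text = '[{"name": "a", "value": 1}, {"name": "b", "value": 2}, {"name": "c", "value": 3}]'
documents = records_from_json(json_text)

# With the collection object from the post, the import would then be:
# for batch in batches(documents):
#     collection.import_bulk(batch)
```

df.to_dict(orient="records") also returns a list of dicts directly, but it keeps pandas values such as NaN and Timestamp as they are, so those may need converting before the import. Going through to_json and json.loads has the advantage that every value is already JSON-compatible.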