Python Forum
UserWarning: Could not infer format
#1
Hi,

I have this code:
import pandas as pd


def get_rework_flags(logs, clean_df):
    """ Gets the flags used for rework """
    logger = logs['logger']
    # logger.info('Rework: Setting index on clean df')
    # clean_df = clean_df.set_index('Filename')
    logger.info('Rework: Extracting sent items')
    rework_df = clean_df.set_index('Filename')['Clean'].astype(
        str,
    ).str.lower().str.extractall('sent:(.*)').reset_index()
    rework_df.columns = ['Filename', 'date_num', 'date_sent']
    logger.info('Rework: Converting to dates')
    rework_df['date_sent'] = pd.to_datetime(
        rework_df['date_sent'], errors='coerce', exact=False,
    )
And I'm getting this warning:
Output:
UserWarning: Could not infer format, so each element will be parsed individually, falling back to `dateutil`. To ensure parsing is consistent and as-expected, please specify a format.
  rework_df['date_sent'] = pd.to_datetime(
Please help.
#2
Please show the complete, unaltered error traceback, as it includes much useful background information about the app.
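Since the message is a warning rather than an exception, Python normally prints no traceback for it. One way to get one, as a rough sketch: temporarily promote UserWarning to an error with the standard `warnings` module around the call. The helper name `to_datetime_strict` is made up for illustration and is not part of pandas.

```python
import warnings

import pandas as pd


def to_datetime_strict(values):
    """Hypothetical helper: parse dates, raising instead of warning."""
    with warnings.catch_warnings():
        # Promote any UserWarning raised inside this block to an exception,
        # so Python reports a full traceback pointing at the offending call.
        warnings.simplefilter("error", UserWarning)
        return pd.to_datetime(values, errors="coerce")


# Consistent ISO dates parse without a warning, so nothing is raised here.
parsed = to_datetime_strict(pd.Series(["2023-06-01", "2023-06-02"]))
print(parsed)
```

Assuming a pandas version that emits "Could not infer format" as a UserWarning (as in the output above), passing the real `rework_df['date_sent']` column through such a helper should stop at the first call that triggers it and show the traceback.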
#3
And post an example of what is found in rework_df['date_sent'].

Do you need the .astype(str)? If you are searching for "sent:" can the column be anything other than a str?

I would print the dataframe after this step and see how many columns there are and whether one of them looks like a datetime string.
rework_df = (
    clean_df.set_index("Filename")["Clean"]
    .str.lower()
    .str.extractall("sent:(.*)")
    .reset_index()
)
print(rework_df)
The code works if it finds a matching string and the date_sent column holds some kind of date string.
import pandas as pd

clean_df = pd.DataFrame(
    {
        "Filename": ["A", "B", "C"],
        "Clean": [f"sent: June {x}, 2023" for x in range(1, 4)],
    }
)

rework_df = (
    clean_df.set_index("Filename")["Clean"]
    .str.lower()
    .str.extractall("sent:(.*)")
    .reset_index()
)
print(rework_df)

rework_df.columns = ["Filename", "date_num", "date_sent"]
rework_df["date_sent"] = pd.to_datetime(
    rework_df["date_sent"],
    errors="coerce",
    exact=False,
)
print(rework_df)
Output:
  Filename  date_num  date_sent
0        A         0 2023-06-01
1        B         0 2023-06-02
2        C         0 2023-06-03
But it doesn't work for a datetime string like 2023:6:1, or a blank, or None. For those I get the same warning you are getting.
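As the warning text itself suggests, giving `pd.to_datetime` an explicit `format` should avoid the format guessing. Here is a minimal sketch, assuming the real data looks like the "june 1, 2023" strings above; the sample values and the `%B %d, %Y` layout are assumptions, so adjust them to whatever `rework_df['date_sent']` actually contains. With `errors='coerce'`, values that do not match the format become NaT. Note also that `exact=False` only controls how a `format` is matched, so it has little use without one.

```python
import warnings

import pandas as pd

# Made-up sample values: three parseable dates plus the problem cases
# mentioned above (odd separators, a blank, and None).
date_sent = pd.Series(
    [" june 1, 2023", " june 2, 2023", " june 3, 2023", "2023:6:1", "", None]
)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # record every warning raised in the block
    parsed = pd.to_datetime(
        date_sent.str.strip(),  # drop the space left after "sent:"
        format="%B %d, %Y",     # assumed layout, e.g. "june 1, 2023"
        errors="coerce",        # anything that does not match becomes NaT
    )

# Keep only the "Could not infer format" warnings, if any were raised.
format_warnings = [w for w in caught if "infer format" in str(w.message)]

print(parsed)
print("format warnings raised:", len(format_warnings))
```

If the column mixes several layouts, one explicit format will turn the others into NaT, so it is worth checking `parsed.isna().sum()` against the number of rows you expected to parse.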