Dec-18-2020, 10:30 AM
Everyone,
I want to move this thread forward. bowlofred pointed me in the right direction with my problem.
I now have another problem with my code. Can you please have a look? I am pasting the code again along with the error I am receiving.
Thank you in advance.
YK
#Very good file. Third revision!
import os
import pandas as pd
pd.set_option('display.max_rows', 500)
import itertools
import datetime as dt
from matplotlib import pyplot as plt
import matplotlib as mpl
mpl.use('Agg')
import numpy as np
import matplotlib.pyplot as plt
#%matplotlib inline
#from IPython import get_ipython
#get_ipython().run_line_magic('matplotlib', 'inline')
import seaborn as sns
import re
from collections import Counter
import string
import emoji
import pickle
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import matplotlib.cbook as cbook
import pandas as pd
import time
import sys
from wordcloud import WordCloud, STOPWORDS
from PIL import Image

files_groups = os.listdir('data/')

def read_history(file,conv_type):
    f = open('data/{}/{}'.format(conv_type,file), 'r',)
    # Feed the file text into findall(); it returns a list of all the found strings
    messages = re.findall('\[(\d+-\d+-\d+, \d+:\d+:\d+ [A-Z]*)\] (.*?): (.*)', f.read())
    f.close()
    #Convert list to a dataframe and name columns
    history = pd.DataFrame(messages,columns =['date','name','msg'])
    history['date'] = pd.to_datetime(history['date'],format="%Y-%m-%d, %I:%M:%S %p")
    history['date1'] = history['date'].apply(lambda x: x.date())
    history['msg_len'] = history['msg'].str.len()
    history['conv_name'] = file[19:-4]
    history['conv_name'] = file[19:-4]
    # Get Media shared in the Message
    history['Media']=history['msg'].str.contains('omitted')
    return history

history['Media']

all = []
for file in files_groups:
    history = read_history(file,'')
    history['tipo'] = 'g'
    all.append(history)

history = pd.concat(all).reset_index()
history_clean = history[history['msg']!=' <Media omitted>'].sort_values(by=['conv_name','name','date1'])
history_clean.shape
history.columns
Error:
Matplotlib created a temporary config/cache directory at /tmp/matplotlib-4oa8sevy because the default path (/config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
Traceback (most recent call last):
  File "main.py", line 55, in <module>
    history['Media']
NameError: name 'history' is not defined
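For reference, the NameError above looks like a scoping issue. Here is a minimal sketch of it, not the actual script: it assumes the failing line sits at module level, after the function body and before anything has been assigned to `history` there. The sample lines and the list-of-dicts stand-in for the DataFrame are hypothetical and only keep the sketch free of pandas.

```python
def read_history(lines):
    # `history` is a local name here: it exists only while the function runs.
    # (Hypothetical stand-in for the DataFrame built in the real script.)
    history = [{"msg": line, "Media": "omitted" in line} for line in lines]
    return history


# Referring to `history` at module level before any module-level assignment
# raises NameError, which matches the traceback.
try:
    history
    name_error_raised = False
except NameError:
    name_error_raised = True

# Binding the function's return value to a module-level name makes it usable.
history = read_history(["hello", "<Media omitted>"])  # made-up sample messages
media_flags = [row["Media"] for row in history]

print(name_error_raised, media_flags)
```

If that assumption holds, moving the `history['Media']` line to after the loop and the `pd.concat` call (or deleting it) should get past this error.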
Please note, I have created a folder called Data in Repl. Inside this folder is the text file that contains the WhatsApp chat data. Thank you in advance.
YK