Issues with csv double quotes - Printable Version
Python Forum (https://python-forum.io), Forum: Data Science (https://python-forum.io/forum-44.html), Thread: /thread-25197.html
Issues with csv double quotes - bltzr75 - Mar-23-2020

Hi guys,

I wrote a BeautifulSoup script that exports a CSV file of numbers, but I can't get rid of the double quotes around the values. They stop me from manipulating the data later when I try to do any calculation.

Bitcoin data export:

```python
import requests
import bs4
import pandas as pd
import csv
from bs4 import BeautifulSoup as bs

dateList = []
openList = []
highList = []
lowList = []
closeList = []
volumeList = []
MCapList = []

r = requests.get('https://coinmarketcap.com/currencies/bitcoin/historical-data/?start=20130428&end=20200315')
soup = bs(r.text, 'lxml')

# Quick check that the first date cell can be located
soup.find('tr', {'class': 'cmc-table-row'}).find('td', {'class': 'cmc-table__cell cmc-table__cell--sticky cmc-table__cell--left'}).text

tr = soup.findAll('tr', {'class': 'cmc-table-row'})

# Collect the text of each cell, one list per column
for item in tr:
    dateList.append(item.find('td', {'class': 'cmc-table__cell cmc-table__cell--sticky cmc-table__cell--left'}).text)
    openList.append(item.find_all('td')[1].text)
    highList.append(item.find_all('td')[2].text)
    lowList.append(item.find_all('td')[3].text)
    closeList.append(item.find_all('td')[4].text)
    volumeList.append(item.find_all('td')[5].text)
    MCapList.append(item.find_all('td')[6].text)

row0 = ['Dates', 'Open', 'High', 'Low', 'Close', 'Volume', 'Market Capitalization']
rows = zip(dateList, openList, highList, lowList, closeList, volumeList, MCapList)

# Write the header row followed by one row per table row
with open('bitcoinHistoricalPrice.csv', 'w', encoding='utf-8', newline='') as csvfile:
    links_writer = csv.writer(csvfile)
    links_writer.writerow(row0)
    for row in rows:
        links_writer.writerow(row)

# dfTable = pd.DataFrame({'Dates': dateList, 'Open': openList, 'High': highList, 'Low': lowList, 'Close': closeList, 'Volume': volumeList, 'Market Capitalization': MCapList})
```

Trying to manipulate the data:

```python
%matplotlib inline
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

plt.rcParams['figure.figsize'] = (20.0, 10.0)

# Read the data
data = pd.read_csv('bitcoinHistoricalPrice.csv')
print(data.shape)
data.head()

# Can't use the values because they're in string format
# Collect X and Y
X = data['Low'].values
Y = data['High'].values

# Mean X and Y
mean_x = np.mean(X)
mean_y = np.mean(Y)
```

RE: Issues with csv double quotes - snippsat - Mar-23-2020

Pandas can read tables directly from a website with pd.read_html, so there is no need to parse the HTML yourself. Example:

```python
>>> import pandas as pd
...
... df = pd.read_html("https://coinmarketcap.com/currencies/bitcoin/historical-data/?start=20130428&end=20200315")
... df = df[2]

>>> df.head(10)
           Date    Open*     High      Low  Close**       Volume    Market Cap
0  Mar 16, 2020  5385.23  5385.23  4575.36  5014.48  45368026430   91633478850
1  Mar 15, 2020  5201.07  5836.65  5169.28  5392.31  33997889639   98530059890
2  Mar 14, 2020  5573.08  5625.23  5125.07  5200.37  36154506008   95014981944
3  Mar 13, 2020  5017.83  5838.11  4106.98  5563.71  74156772075  101644613038
4  Mar 12, 2020  7913.62  7929.12  4860.35  4970.79  53980357243   90804613601
5  Mar 11, 2020  7910.09  7950.81  7642.81  7911.43  38682762605  144508402671
6  Mar 10, 2020  7922.15  8136.95  7814.76  7909.73  42213940994  144465567734
7  Mar 09, 2020  8111.15  8177.79  7690.10  7923.64  46936995808  144706353758
8  Mar 08, 2020  8908.21  8914.34  8105.25  8108.12  39973102121  148060284561
9  Mar 07, 2020  9121.60  9163.22  8890.74  8909.95  36216930370  162684945903

>>> df.dtypes
Date           object
Open*         float64
High          float64
Low           float64
Close**       float64
Volume          int64
Market Cap      int64
dtype: object

>>> df['High'].max()
20089.0
```

All the types are correct except Date, so the fix for the date would be:

```python
>>> df['Date'] = pd.to_datetime(df['Date'])

>>> df.dtypes
Date          datetime64[ns]
Open*                float64
High                 float64
Low                  float64
Close**              float64
Volume                 int64
Market Cap             int64
dtype: object
```
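If you would rather keep the CSV that was already exported, the quotes themselves are probably not the problem. By default csv.writer only quotes a field when it contains the delimiter, so the quoting suggests the scraped cell text carries thousands separators such as 5,385.23. That is an assumption, since the exported file is not shown in the thread. Under that assumption, a minimal sketch is to pass `thousands=','` to pd.read_csv. The sample text below is hypothetical in its formatting; the numbers are the first two rows of the table above.

```python
import io

import pandas as pd

# Hypothetical sample of the exported file (assumed formatting): csv.writer
# wraps any field that contains a comma in double quotes.
sample_csv = io.StringIO(
    'Dates,Open,High,Low,Close,Volume,Market Capitalization\n'
    '"Mar 16, 2020","5,385.23","5,385.23","4,575.36","5,014.48",'
    '"45,368,026,430","91,633,478,850"\n'
    '"Mar 15, 2020","5,201.07","5,836.65","5,169.28","5,392.31",'
    '"33,997,889,639","98,530,059,890"\n'
)

# thousands=',' makes the parser strip the separators before converting,
# so the price columns load as numbers instead of strings.
data = pd.read_csv(sample_csv, thousands=',')

# The date column is still text; convert it the same way as in the reply.
data['Dates'] = pd.to_datetime(data['Dates'])

print(data.dtypes)

# The calculations from the original post now work on numeric columns.
mean_low = data['Low'].mean()
mean_high = data['High'].mean()
print(mean_low, mean_high)
```

With the real file, replace `sample_csv` with the path `'bitcoinHistoricalPrice.csv'`. If some cells contain other non-numeric characters, the affected columns will still load as strings and need further cleaning.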