Jul-29-2019, 02:17 AM
(Jul-12-2019, 06:26 AM)perfringo Wrote:(Jul-11-2019, 05:12 AM)ahmedwaqas92 Wrote: I would most probably export them to a CSV file
The link appears to be dead, so here is a generic example of how to write specific columns to a file:
>>> import pandas as pd
>>> d = {'ham': [1, 2, 3], 'spam': ['a', 'b', 'c'], 'bacon': ['1a', '2b', '3c']}
>>> df = pd.DataFrame(d)
>>> df
   ham spam bacon
0    1    a    1a
1    2    b    2b
2    3    c    3c
>>> df.to_csv('out.csv', columns=['ham', 'bacon'], index=False)
The content of out.csv is:
Output:
ham,bacon
1,1a
2,2b
3,3c
(Jul-12-2019, 10:48 AM)snippsat Wrote:(Jul-12-2019, 04:34 AM)ahmedwaqas92 Wrote: Do I have to restructure my code from scratch? Is there no way I can use my existing code to get the columns that I might need?
It's just a lot more work, and it's not so easy either, because you have to build the columns yourself as they are not clearly defined in the HTML.
You also need to clean up the data if you are going to plot it or do other things with it.
As shown by @perfringo, Pandas makes this a lot easier.
Here is a Notebook where I show some different things that may be needed, like cleaning up and taking out columns.
Apologies for the delayed response. I reviewed the detailed instructions in both of these examples and then read a few docs on using data frames in Python. Based on everything I gathered over the last week or so, I have managed to write a small Python script that captures the data the way I want it. It then removes the excess columns, strips the special characters and does some calculations as well. The results are finally presented in pie charts. See the code below:
import pandas as pd
import matplotlib.pyplot as plt

dataMain = pd.read_html('http://stats.espncricinfo.com/ci/engine/player/422108.html?class=3;spanmin1=07+Sep+2016;spanval1=span;template=results;type=batting;view=innings')
dataTabulated = dataMain[3]
columns = [0,2,3,4]
tempFrame = pd.DataFrame(dataTabulated)
dataFinal = tempFrame[tempFrame.columns[columns]]
# Drop the innings where the player did not bat
dataFinal = dataFinal[~dataFinal.Runs.str.contains("DNB")]
# Strip the not-out marker (*) and convert everything to numbers
dataFinal = dataFinal.replace(r'\*','',regex=True).astype(float)

totalRuns = dataFinal['Runs'].sum()
ballsFaced = dataFinal['BF'].sum()
fours = dataFinal['4s'].sum()
sixes = dataFinal['6s'].sum()
totalFours = fours * 4
totalSixes = sixes * 6

#This is Calculating the Total runs Scored in Boundaries and Rotation(1s,2s,3s)
boundaryRuns = totalFours + totalSixes
rotationRuns = totalRuns - boundaryRuns

#This is Calculating the Percentage runs in Rotation(1s,2s,3s) & in Boundaries (4s,6s)
rotationRuns_p = rotationRuns / totalRuns * 100
boundaryRuns_p = boundaryRuns / totalRuns * 100

#Calculating Approximate & Percentage Dot balls
#(approximation: every non-boundary run is counted as one scoring ball)
forBall = ballsFaced-(fours+sixes)
forRun = totalRuns-(totalFours+totalSixes)
approxDot = forBall-forRun
approxDot_p = approxDot/ballsFaced * 100
score_P = 100 - approxDot_p
print(round(approxDot_p,2))

#Plotting for Boundaries / Rotation Ratio
labels = 'Rotation', 'Boundaries'
sizes = [rotationRuns_p, boundaryRuns_p]
colors = ['yellowgreen', 'yellow']
explode = [0.1, 0] #Explode 1st Slice

#Plotting the Pie Chart
plt.pie(sizes, explode=explode, labels=labels, colors=colors, autopct='%1.1f%%', shadow=True, startangle=140)
plt.axis('equal')
plt.show()

#Plotting for Dot Ball Percentage
labels = 'Approx Dot %', 'Scoring %'
sizes = [approxDot_p, score_P]
colors = ['red', 'blue']
explode = [0.1, 0] #Explode 1st Slice

#Plotting the Pie Chart
plt.pie(sizes, explode=explode, labels=labels, colors=colors, autopct='%1.1f%%', shadow=True, startangle=140)
plt.axis('equal')
plt.show()
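For anyone who wants to check the cleaning and the boundary/rotation arithmetic without hitting the site, below is a minimal sketch on a small made-up frame. The sample rows and the function name `boundary_split` are hypothetical and only stand in for the scraped table; the column names (Runs, BF, 4s, 6s) are the ones used in the script.

```python
import pandas as pd

# Hypothetical sample rows standing in for the scraped innings table.
sample = pd.DataFrame({
    "Runs": ["45*", "DNB", "12", "30"],
    "BF":   ["30", "-", "10", "20"],
    "4s":   ["4", "-", "1", "3"],
    "6s":   ["2", "-", "0", "1"],
})


def boundary_split(frame):
    """Return (rotation %, boundary %) of the total runs in the frame."""
    # Drop the innings where the player did not bat.
    batted = frame[~frame["Runs"].str.contains("DNB")]
    # Strip the not-out marker (*) and convert every column to float.
    batted = batted.replace(r"\*", "", regex=True).astype(float)
    total_runs = batted["Runs"].sum()
    boundary_runs = batted["4s"].sum() * 4 + batted["6s"].sum() * 6
    boundary_pct = boundary_runs / total_runs * 100
    return 100 - boundary_pct, boundary_pct


rotation_pct, boundary_pct = boundary_split(sample)
print(round(rotation_pct, 2), round(boundary_pct, 2))
```

Keeping the calculation in a function like this should make it easy to point the same code at a different player's table later on.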
It seems my problem has been resolved, so you can mark this thread as solved. Your help on the matter, @perfringo & @snippsat, is much appreciated :)