Python Forum
Data Analysis
#1
Hello, I am taking a data analysis class and this is my first time using Python. I have done the first three problems of my homework assignment but am struggling with accessing and combining the tuples/data in problems 4 and 5. I can't seem to find how to access the right record, and I don't know how I would combine three data points.

Using Python, in this assignment we will analyze the HURDAT2 dataset for Atlantic hurricane data from 1851 through 2017. This dataset is provided by the National Hurricane Center and is documented here. You will do some analysis of this data to answer some questions about it. I have provided code to organize this data, but you may feel free to improve this rudimentary organization. I have also provided functions that allow you to check your work. Note that you may choose to organize the cells as you wish, but you must properly label each problem’s code and its solution. Use a Markdown cell to add a header denoting your work for a particular problem.
You should start with the provided Jupyter Notebook, http://www.cis.umassd.edu/~dkoop/dsc201-...1/a1.ipynb. Download this notebook (right-click to save the link) and upload it to your Jupyter workspace (on the JupyterHub server or your local notebook/lab). Make sure to execute the first two cells in the notebook (Shift+Enter). The second cell will download the data and define a variable records which consists of a list of tuples each with two entries:
a string with information about the hurricane and
a list of strings each of which is a tracking point for the hurricane
To access the fourth hurricane’s third tracking point, you would access records[3][1][2]. Remember indexing is zero-based! Thus [3] accesses the fourth hurricane, [1] accesses the list of tracking point strings, and [2] accesses the third tracking point.
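The indexing described above can be illustrated with a small stand-in list. This is only a sketch: `sample_records` and its placeholder strings are made up for illustration and are not the real HURDAT2 data.

```python
# Made-up stand-in for the notebook's `records` variable:
# a list of (header string, list of tracking-point strings) tuples.
sample_records = [
    ("header 1", ["1a", "1b", "1c"]),
    ("header 2", ["2a", "2b", "2c"]),
    ("header 3", ["3a", "3b", "3c"]),
    ("header 4", ["4a", "4b", "4c"]),
]

fourth_hurricane = sample_records[3]       # (header, tracking points) tuple
tracking_points = fourth_hurricane[1]      # the list of tracking-point strings
third_point = sample_records[3][1][2]      # same element, indexed in one step

print(fourth_hurricane[0])  # header 4
print(third_point)          # 4c
```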
In the provided file, I provided examples of how to check your work. For example, for Problem 1, you would call the check1 function with the number of hurricane names. After executing this function, you will see a message that indicates whether your answer is correct.
1. Number of Unique Hurricane Names (10 pts)
Write code that computes the number of unique hurricane names in the dataset. Note that UNNAMED is not a hurricane name.
Hints:
You will need to extract the name from the string in the first entry in the tuple
The split function for strings will be useful
The strip function will also be useful to trim whitespace
Consider using a set to keep track of all the names
2. Most Frequently Used Name (10 pts)
Write code that computes the most frequently used hurricane name. Again, UNNAMED does not count!
Hints:
collections.Counter() is a good structure to help with counting.
Clean up the strings in the same manner as in Problem 1.
3. Year with Most Hurricanes (10 pts)
Write code that computes the year with the most hurricanes.
Hints:
You can extract the year from the first entry in the tuple. It is the last four characters before the first comma.
4. Most Northerly Hurricane (10 pts)
Write code that computes the hurricane that went furthest north as measured by the greatest latitude. You need to find the name and the year of the hurricane.
Hints:
Check the documentation to find where the latitude is recorded.
You will need to go through the tracking points to check all of the latitude points recorded.
You need to keep track of three things: the maximum latitude seen so far plus the name of the corresponding hurricane and year
The latitude adds the N character to indicate the northern hemisphere. This needs to be removed to do numeric comparisons.
You can convert a string to a float or int by casting it. For example, float("81.5") returns a floating-point value of 81.5.
5. Hurricane with Maximum Sustained Wind (10 pts)
Write code that determines the hurricane with the highest sustained windspeed. You need to find the name, year, and wind speed for this hurricane.
Hints:
Check the documentation to find where the wind speed is recorded.
You will need to go through the tracking points to check all of the wind speeds recorded.
You can convert a string to a float or int by casting it. For example, float("81.5") returns a floating-point value of 81.5.

Problem 1:
names = set()
for record in records:
    # the name is the second comma-separated field of the header string
    first_entry = record[0].split(',')[1]
    # strip() returns a new string, so the result has to be assigned
    first_entry = first_entry.strip()
    # UNNAMED is not a hurricane name; the set keeps only unique names
    if first_entry != 'UNNAMED':
        names.add(first_entry)
# answer is the number of unique hurricane names
answer = len(names)
print(answer)
Problem 2:
from collections import Counter

names = []
for record in records:
    # the name is the second comma-separated field of the header string
    first_entry = record[0].split(',')[1]
    # strip() returns a new string, so the result has to be assigned
    first_entry = first_entry.strip()
    # if not UNNAMED, append to the names list (duplicates are kept for counting)
    if first_entry != 'UNNAMED':
        names.append(first_entry)
# most_common(1) returns [(name, count)]; take the name
answer = Counter(names).most_common(1)[0][0]
print(answer)
Problem 3:
from collections import Counter

years = []
for record in records:
    # the year is the last four characters before the first comma
    first_entry = record[0].split(',')[0]
    year = first_entry[-4:]
    # append the year to the years list
    years.append(year)
# most_common(1) returns [(year, count)]; take the year
answer = Counter(years).most_common(1)[0][0]
print(answer)
#2
Looking at the HURDAT2 data documentation, the records start with a header row for the hurricane (which appears to be what you were processing in the first three problems), followed by a series of data rows showing the course of the hurricane. The latitude is the fifth item in the data row.
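As a rough sketch of pulling the latitude out of one data row: the helper name `parse_latitude` and the sample row are made up for illustration (the row is shortened and only mimics the comma-separated layout), so check the field position against the documentation before relying on it.

```python
def parse_latitude(point):
    """Return the latitude of one tracking-point string as a float.

    Assumes comma-separated fields with the latitude as the fifth item
    (index 4), written like "28.0N". Southern latitudes become negative.
    """
    fields = [field.strip() for field in point.split(",")]
    lat_text = fields[4]
    value = float(lat_text[:-1])          # drop the trailing N or S
    return value if lat_text[-1] == "N" else -value


# Illustrative, shortened row (not copied from the dataset)
sample_point = "18510625, 0000,  , HU, 28.0N,  94.8W,  80, -999"
print(parse_latitude(sample_point))  # 28.0
```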

So you need to loop through the rows looking for the header row. Get the year and name from that, as you have already been doing. Then switch to checking the latitudes in the data rows, keeping track of which one is the highest. When you get to a new header row, first check the maximum latitude of the previous hurricane against the maximum of all the hurricanes seen so far. If it is higher, store that hurricane as the new best case. Then start again with the new header row.
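Here is a minimal sketch of that bookkeeping. It assumes the data is already grouped the way the assignment describes `records` (a header string paired with a list of tracking-point strings), so the switch between header rows and data rows is already done by the outer loop. The function name `most_northerly` and the `sample_records` values are made up for illustration, not real results.

```python
def most_northerly(records):
    """Return (name, year, latitude) for the greatest latitude seen.

    Assumes each record is (header, points), the name is the second
    comma-separated header field, the year is the last four characters
    before the first comma, and latitude is the fifth field of a point.
    """
    best_lat = float("-inf")
    best_name = None
    best_year = None
    for header, points in records:
        header_fields = header.split(",")
        year = header_fields[0].strip()[-4:]
        name = header_fields[1].strip()
        for point in points:
            lat_text = point.split(",")[4].strip()
            lat = float(lat_text[:-1])    # drop the trailing N or S
            if lat_text[-1] == "S":
                lat = -lat
            # keep all three values together whenever a new maximum appears
            if lat > best_lat:
                best_lat, best_name, best_year = lat, name, year
    return best_name, best_year, best_lat


# Made-up sample in the assumed layout (values are illustrative only)
sample_records = [
    ("AL011851,            UNNAMED,      2,",
     ["18510625, 0000,  , HU, 28.0N,  94.8W,  80, -999",
      "18510625, 0600,  , HU, 28.1N,  95.4W,  80, -999"]),
    ("AL092011,              IRENE,      2,",
     ["20110821, 0000,  , TS, 15.0N,  59.0W,  45, 1006",
      "20110829, 0000,  , EX, 44.2N,  72.1W,  40,  983"]),
]
print(most_northerly(sample_records))  # ('IRENE', '2011', 44.2)
```

The same pattern should carry over to the wind speed problem by reading a different field and converting it with int or float, once you have confirmed in the documentation which field holds it.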
Craig "Ichabod" O'Brien - xenomind.com
I wish you happiness.