Python Forum
Extracting data from tweets and saving it as CSV
#1
Hello! I am new to programming and am learning Python from scratch.

My goal is to (1) import tweets from a JSON file, (2) pull out the data of interest from the tweets, and (3) save the extracted data to a CSV. Building on the examples provided by other users online, I came up with the following code.

# import necessary modules
import json
from csv import writer

# import tweets from JSON 
with open('00.json') as in_file, \
     open('test.csv', 'w') as out_file:
    print >> out_file, 'ids, text, lang, geo, place'
    csv = writer(out_file)
    tweet_count = 0

    for line in in_file:
        tweet_count += 1
        tweet = json.loads(line)

        # Pull out various data from the tweets
        row = (
            tweet['id_str'],                # tweet_id
            tweet['text'],                  # tweet_text
            tweet['lang'],                  # tweet_language
            tweet['geo'],                   # tweet_geo
            tweet['place'],                 # tweet_place
        )
        values = [(value.encode('utf8') if hasattr(value, 'encode') else value) for value in row]
        csv.writerow(values)

# print the name of the file and number of tweets imported
print ("File Imported:"), str('00.json')
print ("# Tweets Imported:"), tweet_count
print ("File Exported:"), str('test.csv')
 

However, there seems to be a problem that is beyond my comprehension, and for that I am seeking your help. I include the error message below:

Error:
TypeError                                 Traceback (most recent call last)
<ipython-input-56-2e4d7aed55d0> in <module>()
      5 # import tweets from JSON
      6 with open('00.json') as in_file,      open('test.csv', 'w') as out_file:
----> 7     print >> out_file, 'ids, text, lang, geo, place'
      8     csv = writer(out_file)
      9     tweet_count = 0

TypeError: unsupported operand type(s) for >>: 'builtin_function_or_method' and '_io.TextIOWrapper'

Thank you in advance for your guidance!
Reply
#2
Hello!

.......
with open('00.json') as in_file, open('test.csv', 'w') as out_file:
    print('ids, text, lang, geo, place', file=out_file)
......  
"As they say in Mexico 'dosvidaniya'. That makes two vidaniyas."
https://freedns.afraid.org
Reply
#3
wavic, thank you for the prompt response. Indeed, I figured out that the error is in this part and that it has to do with "'builtin_function_or_method' and '_io.TextIOWrapper'". What I don't understand is why this is a problem and how to fix it.
Reply
#4
Hm! I have seen >> and << used with streams in C++ (<< writes to standard output or a file, >> reads from one).
See the print statement at line 8. It should be a function call: print(). In Python, >> is the bitwise right-shift operator.
"As they say in Mexico 'dosvidaniya'. That makes two vidaniyas."
https://freedns.afraid.org
Reply
#5
As wavic said.

">>" in Python is the bitwise right-shift operator, so from Python's point of view you are trying to apply a bitwise shift to the print function, with the shift size given by the file object (out_file). Neither of those types supports >> (among the built-in types it is only defined for integers), and that's why there is an error about unsupported operand types.
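
A quick sketch of what the interpreter sees, assuming Python 3. Here io.StringIO stands in for the real out_file, and the variable names are made up for illustration:

```python
import io

# On integers, >> is a right shift: 16 >> 2 drops the two lowest bits.
shifted = 16 >> 2

# A stand-in for out_file; a real file object fails the same way.
fake_file = io.StringIO()

try:
    # In Python 3, print is an ordinary function object, so this line is
    # parsed as the tuple ((print >> fake_file), 'header').
    print >> fake_file, 'header'
except TypeError as exc:
    # The shift is attempted first and fails, because neither operand
    # defines the >> operation.
    message = str(exc)
```

After this runs, shifted holds the integer result of the shift and message holds the same "unsupported operand type(s) for >>" text as in the traceback, with the stand-in's type name in place of _io.TextIOWrapper.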
Reply
#6
wavic, zivoni, I am thankful for your useful comments. Let me dig into this and try to find a solution.
Reply
#7
I am afraid that wavic already solved it for you.

If you are trying to print your header to out_file with the line
 print >> out_file, 'ids, text, lang, geo, place'
then replace it with wavic's
 print('ids, text, lang, geo, place', file=out_file)
The file=out_file parameter redirects the output to your file.

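Actually, I just noticed one more thing: if you are using Python 2, then print is a statement, not a function (unless you add from __future__ import print_function), and print >> out_file is valid there. Your traceback shows print being treated as a function, though, so you are most likely on Python 3. If you want something that works the same way in both versions, you can use
out_file.write('ids, text, lang, geo, place\n')
for "printing" to the file.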
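
To check that the two spellings produce the same output, here is a small sketch, assuming Python 3 and using io.StringIO as a stand-in for out_file (the buffer names are made up for illustration):

```python
import io

header = 'ids, text, lang, geo, place'

# print() with file= sends the text to the given file-like object
# and appends a newline by default.
via_print = io.StringIO()
print(header, file=via_print)

# write() adds nothing on its own, so the newline is spelled out.
via_write = io.StringIO()
via_write.write(header + '\n')

# Compare what each buffer received.
same_output = via_print.getvalue() == via_write.getvalue()
```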

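And if you are using Python 3, then your last three lines will not do what you expect. There is no print statement in Python 3, so print ("File Imported:"), str('00.json') raises no error but prints only the label; the part after the comma is silently discarded. Move the closing parenthesis to the end of each line so that both values are inside the call.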
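
Putting the pieces together, one possible Python 3 version of the whole script could look like the sketch below. It assumes the input has one JSON object per line and that every tweet has the five keys used in the original code. The function name tweets_to_csv, the explicit utf8 encoding, and the blank-line check are my own additions, not something from the original script.

```python
import csv
import json


def tweets_to_csv(in_path, out_path):
    """Copy selected tweet fields from a line-delimited JSON file to a CSV.

    Returns the number of tweets written.
    """
    fields = ['id_str', 'text', 'lang', 'geo', 'place']
    tweet_count = 0

    # newline='' lets the csv module control line endings itself.
    with open(in_path, encoding='utf8') as in_file, \
         open(out_path, 'w', newline='', encoding='utf8') as out_file:
        out = csv.writer(out_file)
        # Write the header through the csv writer so it is formatted
        # the same way as the data rows.
        out.writerow(['ids', 'text', 'lang', 'geo', 'place'])

        for line in in_file:
            if not line.strip():
                continue  # skip blank lines between records
            tweet = json.loads(line)
            # No .encode() call: a Python 3 text file takes str directly.
            out.writerow([tweet[name] for name in fields])
            tweet_count += 1

    return tweet_count
```

You would call it as count = tweets_to_csv('00.json', 'test.csv') and then report the result with print("# Tweets Imported:", count), keeping both values inside the parentheses.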
Reply
#8
zivoni, yeah I see what you are saying. I didn't realize that right away, my bad.
Reply

