Python Forum
Extracting data from tweets and saving it as CSV - Printable Version


Extracting data from tweets and saving it as CSV - kiton - Feb-24-2017

Hello! I am new to programming and am learning everything in Python from scratch.

My goal is to (1) import tweets from a JSON file, (2) pull out the data of interest from the tweets, and (3) save the extracted data to a CSV. Building on the examples provided by other users online, I came up with the following code.

# import necessary modules
import json
from csv import writer

# import tweets from JSON 
with open('00.json') as in_file, \
     open('test.csv', 'w') as out_file:
    print >> out_file, 'ids, text, lang, geo, place'
    csv = writer(out_file)
    tweet_count = 0

    for line in in_file:
        tweet_count += 1
        tweet = json.loads(line)

        # Pull out various data from the tweets
        row = (
            tweet['id_str'],                # tweet ID (as a string)
            tweet['text'],                  # tweet text
            tweet['lang'],                  # tweet language
            tweet['geo'],                   # tweet geo
            tweet['place'],                 # tweet place
        )
        values = [(value.encode('utf8') if hasattr(value, 'encode') else value) for value in row]
        csv.writerow(values)

# print the name of the file and number of tweets imported
print ("File Imported:"), str('00.json')
print ("# Tweets Imported:"), tweet_count
print ("File Exported:"), str('test.csv')
 

However, there seems to be a problem that is beyond my understanding, and for that I am seeking your help. I include the error message below:

Error:
TypeError                                 Traceback (most recent call last)
<ipython-input-56-2e4d7aed55d0> in <module>()
      5 # import tweets from JSON
      6 with open('00.json') as in_file,      open('test.csv', 'w') as out_file:
----> 7     print >> out_file, 'ids, text, lang, geo, place'
      8     csv = writer(out_file)
      9     tweet_count = 0

TypeError: unsupported operand type(s) for >>: 'builtin_function_or_method' and '_io.TextIOWrapper'
Thank you in advance for your guidance!


RE: Extracting data from tweets and saving it as CSV - wavic - Feb-24-2017

Hello!

.......
with open('00.json') as in_file, open('test.csv', 'w') as out_file:
    print('ids, text, lang, geo, place', file=out_file)
......  



RE: Extracting data from tweets and saving it as CSV - kiton - Feb-24-2017

wavic, thank you for the prompt response. Indeed, I figured out that the error is in this part and that it involves 'builtin_function_or_method' and '_io.TextIOWrapper'. What I don't understand is why this is a problem and how to fix it.


RE: Extracting data from tweets and saving it as CSV - wavic - Feb-24-2017

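Hm! I have seen operators like this in C++ stream I/O, where << writes to standard output or to a file and >> reads from a stream.
See the print statement at line 8. In Python 3, print is a function, so it should be called as print(). In Python, >> is the bitwise right-shift operator.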


RE: Extracting data from tweets and saving it as CSV - zivoni - Feb-24-2017

As wavic said.

">>" in Python is the bitwise right-shift operator, so from Python's point of view you are trying to apply a bitwise shift to the print function, with the shift size given by the file object (out_file). But >> works only with integers, which is why you get the error about unsupported operand types.
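
A minimal sketch of the point above, assuming Python 3: >> shifts integers, and the same operator applied to the print function and a file-like object raises the TypeError from the question. The io.StringIO object is only a stand-in for the real output file.

```python
import io

# On integers, >> is a bitwise right shift: 16 >> 2 drops the two lowest bits.
shifted = 16 >> 2

# io.StringIO stands in for the opened CSV file (an assumption for this demo).
out_file = io.StringIO()

message = None
try:
    # Python 3 parses this as a tuple whose first element is (print >> out_file).
    print >> out_file, 'ids, text, lang, geo, place'
except TypeError as exc:
    message = str(exc)

print(shifted)
print(message)
```

The exact wording of the message may differ between Python 3 versions, but it always names >> as the unsupported operation.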


RE: Extracting data from tweets and saving it as CSV - kiton - Feb-24-2017

wavic, zivoni, I am thankful for your useful comments. Let me dig into this and try to find a solution.


RE: Extracting data from tweets and saving it as CSV - zivoni - Feb-24-2017

I am afraid wavic has already solved it for you.

If you are trying to print your header to out_file with the line
 print >> out_file, 'ids, text, lang, geo, place'
then replace it with wavic's
 print('ids, text, lang, geo, place', file=out_file)
The file=out_file parameter redirects the output to your file.
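
To illustrate the file= parameter, here is a small sketch, assuming Python 3 and using io.StringIO objects in place of the real out_file. It shows that print() with file= and a plain write() produce the same header line.

```python
import io

header = 'ids, text, lang, geo, place'

# io.StringIO objects stand in for the file opened with open('test.csv', 'w').
via_print = io.StringIO()
via_write = io.StringIO()

# print() appends a newline by default and sends the text to file= instead of stdout.
print(header, file=via_print)

# write() adds nothing, so the newline has to be spelled out.
via_write.write(header + '\n')

same = via_print.getvalue() == via_write.getvalue()
print(same)
```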

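Actually, I now notice that if you are using an old Python (2.x), then print is not a function but a statement, and the file= form will not work there unless you add from __future__ import print_function (available from 2.6). In that case your original print >> out_file line is valid, or you can use
out_file.write('ids, text, lang, geo, place\n')
for "printing" to the file.

And if you are using Python 3 (which the _io.TextIOWrapper in your error message suggests), then your last three lines will not do what you expect: there is no print statement in Python 3, so print ("File Imported:"), str('00.json') prints only the first string and silently discards the second. Move the closing parenthesis so that it encloses both values.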
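
Putting the pieces together, here is one possible Python 3 version of the whole script. It is a sketch, not the only way to do it, and it makes a few assumptions that are not in the original post: the function name tweets_to_csv is made up, the files are treated as UTF-8, the header is written through csv.writer instead of print, and tweet.get() is used so that a missing key gives an empty cell instead of a KeyError. The two sample tweets are invented for the demo.

```python
import csv
import json
import os
import tempfile


def tweets_to_csv(in_path, out_path):
    """Copy selected fields from a one-JSON-tweet-per-line file into a CSV."""
    fields = ('id_str', 'text', 'lang', 'geo', 'place')
    tweet_count = 0
    # newline='' lets the csv module control line endings; utf-8 is an assumption.
    with open(in_path, encoding='utf-8') as in_file, \
            open(out_path, 'w', newline='', encoding='utf-8') as out_file:
        out = csv.writer(out_file)
        out.writerow(('ids', 'text', 'lang', 'geo', 'place'))  # header row
        for line in in_file:
            if not line.strip():
                continue  # skip blank lines
            tweet = json.loads(line)
            # .get() yields None for a missing key; csv writes None as an empty cell.
            out.writerow([tweet.get(name) for name in fields])
            tweet_count += 1
    return tweet_count


# Demo with two made-up tweets in a temporary directory (not real data).
sample = [
    {'id_str': '1', 'text': 'hello, world', 'lang': 'en', 'geo': None, 'place': None},
    {'id_str': '2', 'text': 'hola', 'lang': 'es', 'geo': None, 'place': None},
]
with tempfile.TemporaryDirectory() as tmp:
    in_path = os.path.join(tmp, '00.json')
    out_path = os.path.join(tmp, 'test.csv')
    with open(in_path, 'w', encoding='utf-8') as f:
        for tweet in sample:
            f.write(json.dumps(tweet) + '\n')
    count = tweets_to_csv(in_path, out_path)
    with open(out_path, newline='', encoding='utf-8') as f:
        rows = list(csv.reader(f))

# In Python 3 every value to print goes inside the parentheses.
print("File Imported:", '00.json')
print("# Tweets Imported:", count)
print("File Exported:", 'test.csv')
```

No .encode('utf8') step is needed here, because in Python 3 the text file does the encoding. If geo or place hold nested objects, csv.writer stores their str() form, so you may want to pick out the sub-fields you need first.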


RE: Extracting data from tweets and saving it as CSV - kiton - Feb-24-2017

zivoni, yeah I see what you are saying. I didn't realize that right away, my bad.