Python Forum
Twitter scraping exclude some data
#1
Hello everyone! I'm new here and I'm also new to Python. I'm eager to learn Python because the possibilities are immense. Currently I'm working on a Twitter streaming script, which I pasted in the code section below.
I'm wondering how I should exclude data from the streamer.
1. For instance, I want to check whether the 'status' or 'location' fields are not null.
2. I would like to exclude some fields, for instance 'retweets'.

If someone could explain how I'm supposed to program [1] and [2], I would be very happy :)

from tweepy import Stream
from tweepy import OAuthHandler
from tweepy.streaming import StreamListener
import json

# consumer key, consumer secret, access token, access secret.
consumer_key = "xxx"
consumer_secret = "xxx"
access_token = "xxxx"
access_token_secret = "xxxx"


class StdOutlistener(StreamListener):
    def on_data(self, data):
        json_data = json.loads(data)
        print(json_data)

        # Open json text file to save the tweets
        with open('tweets.json', 'a') as tf:
            tf.write(data)
        return True

    def on_error(self, status):
        # Print the HTTP status code reported by the stream.
        print(status)


auth = OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)

twitterStream = Stream(auth, StdOutlistener())
twitterStream.filter(track=["Test"])
#2
Can anyone help me?
Is my question unclear?
#3
Are those fields part of the JSON response you're receiving?
#4
(Aug-31-2017, 05:33 PM)nilamo Wrote: Are those fields part of the json response you're receiving?

Yes. The JSON output contains all the data that is available.
This tutorial: http://adilmoujahid.com/posts/2014/07/tw...analytics/ gives an overview of the data and the JSON output when no filters are applied. Literally everything passes through, and I would like to know whether it is possible to skip some fields. For instance, I'm not interested in the fact that someone has 20 followers or something.
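
One way to skip fields is to copy only the keys you care about out of the parsed dictionary before saving it. Below is a minimal sketch, assuming the Twitter v1.1 payload layout (top-level 'created_at' and 'text', plus a nested 'user' object). The helper name keep_fields and the chosen field lists are hypothetical and not part of tweepy.

```python
import json

# Fields to keep; everything else (follower counts etc.) is dropped.
# The names assume the Twitter v1.1 tweet layout.
WANTED = ("created_at", "text")
WANTED_USER = ("screen_name", "location")


def keep_fields(tweet):
    """Return a new dict holding only the wanted fields of a tweet dict."""
    slim = {key: tweet.get(key) for key in WANTED}
    user = tweet.get("user") or {}
    slim["user"] = {key: user.get(key) for key in WANTED_USER}
    return slim


# Hand-written sample message for illustration, not real API output.
raw = json.dumps({
    "created_at": "Thu Aug 31 17:00:00 +0000 2017",
    "text": "Test tweet",
    "user": {"screen_name": "someone", "location": "Amsterdam",
             "followers_count": 20},
})

slim = keep_fields(json.loads(raw))
print(json.dumps(slim))
```

In the listener, the result of keep_fields(json_data) could be written to tweets.json with json.dumps() instead of writing the raw data string.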
#5
Here's a direct link to the StreamListener class from the tweepy module: https://github.com/tweepy/tweepy/blob/v3...ing.py#L30

You're currently using on_data(), which fires off for every single message.  Have you tried using one of the more specific ones, like on_status()?
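
For context, the stream delivers more than tweets: delete notices, rate-limit notices and similar messages also arrive through on_data(). Here is a rough sketch of how such messages could be told apart by their top-level keys. The key names follow my reading of the tweepy 3.x source and should be treated as an assumption; message_kind is a hypothetical helper, not a tweepy function.

```python
import json


def message_kind(raw_data):
    """Classify a raw stream message by its top-level keys."""
    data = json.loads(raw_data)
    if "in_reply_to_status_id" in data:
        return "status"   # an ordinary tweet
    if "delete" in data:
        return "delete"   # a tweet was deleted
    if "limit" in data:
        return "limit"    # rate-limit notice
    return "other"


# Hand-written sample messages, not real API output.
print(message_kind('{"limit": {"track": 5}}'))
print(message_kind('{"text": "hi", "in_reply_to_status_id": null}'))
```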
#6
(Aug-31-2017, 08:55 PM)nilamo Wrote: Here's a direct link to the StreamListener class from the tweepy module: https://github.com/tweepy/tweepy/blob/v3...ing.py#L30

You're currently using on_data(), which fires off for every single message.  Have you tried using one of the more specific ones, like on_status()?

Thanks for your reply and suggestion. I will have a look at the webpage you mentioned.
No, I haven't tried on_status, which would probably be better. But I have no idea how to use on_status in this particular script.

Do you perhaps have a link for that too?
#7
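You currently use on_data. Rename the method to on_status and it should be called for tweets only. Note that in tweepy 3.x on_status receives an already parsed Status object rather than the raw JSON string, so the json.loads() call has to go; the underlying dictionary is available as status._json.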
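
To cover points [1] and [2] from the first post, here is a minimal sketch of a filter that works on the plain message dictionary. It assumes the Twitter v1.1 layout, where the tweet text lives in 'text', the location in user['location'], and a retweet carries a 'retweeted_status' key. The name should_keep is a hypothetical helper, not something tweepy provides.

```python
def should_keep(tweet):
    """Return True for tweets that have text and a location and are not retweets."""
    if "retweeted_status" in tweet:   # [2] skip retweets
        return False
    if not tweet.get("text"):         # [1] text missing, null or empty
        return False
    user = tweet.get("user") or {}
    if not user.get("location"):      # [1] location missing, null or empty
        return False
    return True


# Hand-written sample messages, not real API output.
samples = [
    {"text": "Test one", "user": {"location": "Amsterdam"}},
    {"text": "Test two", "user": {"location": None}},
    {"text": "RT Test", "user": {"location": "Paris"}, "retweeted_status": {}},
]

kept = [tweet for tweet in samples if should_keep(tweet)]
print([tweet["text"] for tweet in kept])
```

Inside the listener you would call should_keep() on the parsed message and only write to tweets.json when it returns True.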