CSV : Troubleshooting my script
Python Forum (https://python-forum.io) > Python Coding > Web Scraping & Web Development > Thread: CSV : Troubleshooting my script (/thread-1430.html)
CSV : Troubleshooting my script - scriptso - Jan-02-2017

Hello forum! First post, and I'm starting it off by making myself look like an extra noob! lol

So I just reached (...yesterday, in fact) my first year of completely diving into everything Python... If you check my ... I have a repo of 99.98% Python forks and personal projects, and it's about 150ish deep. I embarked on this journey to meet and surpass the requirements for various certifications I'll be completing/enrolling in within a month (WebSec heavy)... For the past 4 months it's been nothing but scraping and data analytics...

I'm self-taught, and there's no stock module I have issues with except the CSV module... I have a really hard time understanding the seemingly basic logic behind using it. Here's the flow of what I'm trying to do in an example scenario (I just made a quick and dirty scrapy script to get across what I mean):

    #scrapy script
    # -*- coding: utf-8 -*-
    import scrapy


    class PyvsrcSpider(scrapy.Spider):
        name = "pyVSrc"
        allowed_domains = ["pyvideo.org"]
        # URL was posted with spaces in it: I have not yet been proved
        # not to be a spammer on the forum
        start_urls = ['http://pyvideo.org/tag/tutorial/']

        def parse(self, response):
            # follow links to author pages
            # item = tagitems()
            # item['tag2'] = response.css('div.headline h1::text').re_first(r'\w+$')
            for href in response.css('div.thumb a::attr(href)').extract():
                yield scrapy.Request(response.urljoin(href),
                                     callback=self.parse_mainsrc)

        # The reason for the second def is that I use this as a template for
        # quick and dirty scraping; where this comment is, a pagination
        # script would go...
        #muchmess
        def parse_mainsrc(self, response):
            yield {
                'vidName': response.css('h2.entry-title a::text').extract(),
                'link2vid': response.css('iframe').xpath('@src').extract(),
                # 'tags': response.css('div.video-tags a::text').extract()
            }

    ##### csvrewrite.py
    __author__ = '...not me'
    import csv

    with open('pyvids_tuts.csv') as f:
        reader = csv.reader(f)
        new_data = []
        for row in reader:
            # skip the header row
            if reader.line_num == 1:
                continue
            print(row[1] + "\n" + row[0])

Outputting the results to CSV gives, as you can see, two columns, which I then run through a second script in order to parse all the items. The idea is to iterate over 'NAME,TITLE' so that it outputs as... (and excludes the headers).

My issue (please don't flame the noob) is that I have a hard time not only getting it to simply write the output, for which my own noob workarounds have been:

1) Having the scrapy script and csvrewrite.py in the same folder
2) Redirecting the output manually with ">" (I know I can probably use the os module to get it to just run in the current directory or whatever)

...but also doing extra editing on the output, which has become a 2-week wtf issue...

Extra edits?

1. Add "#EXTM3U" as the new header (all attempts have either repeated "#EXTM3U" on every row or errored out)
2. Add "#EXTINF:0," before every entry in "vidName" (the issue being the comma after the 0, though that's not the important part because it really doesn't need it to function as an M3U playlist)

DESIRED OUTPUT!

Oh yeah... LMAO, the whole point of this is to create streamable m3u/m3u8 playlists and such... ALSO, there has to be a way to put all this into one script, right?? Help a noob out?! Point me in the right direction?! lol I am googling and doing csv module tutorials but could use a hand.

RE: CSV : Troubleshooting my script - Larz60+ - Jan-02-2017

Please re-post with code tags. Thank you.

RE: CSV : Troubleshooting my script - scriptso - Jan-03-2017

(Jan-02-2017, 10:35 PM)Larz60+ Wrote: Please re-post with code tags

I'm sorry? Wait, did you add the tags for me? If not, then I blame being a new user, as I hadn't been declared (notspammer) yet....
When I wrote the post I did so... AHHH, except for that output nonsense. Let me know if my post is not in the correct format at the moment. THANKS!

RE: CSV : Troubleshooting my script - Larz60+ - Jan-03-2017

I attempted to, but someone else actually did it, not sure who. At any rate, it's done. Makes it a whole lot easier to read. Now let me go back and look at what's happening.

OK, I read your post and am having a very hard time understanding it. Here's a suggestion that may help: try writing your request in your native language and passing it through Google Translate (https://translate.google.com/).

RE: CSV : Troubleshooting my script - scriptso - Jan-03-2017

My second attempt (please excuse my messy post).

I'm having an issue understanding the CSV library logic, mostly the write portion. What I am trying to do is very basic editing. For the sake of keeping things simple in this example, I'll say that the CSV has only two columns, A and B, and the idea is to combine them both, not merge them, as well as replacing the header and adding a string to each row in one of the columns...

For example, my CSV file looks like this... and I want to edit it and write it out so that the output looks like...

Again, I apologize for the original post and my inability to convey this simple thing; I will remember to tackle one problem at a time. To reiterate something from the original post, I am totally not just posting on here hoping to be spoon-fed a solution. I have gone through the CSV library documentation and various tutorials on this matter with no real luck in getting to understand the CSV module; all I have really been able to do is use the read function properly.

Thank you readers and thank you Admin!
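The two-column rewrite described in the post above can be sketched with csv.reader and csv.writer. This is only an illustration under assumptions: the sample rows, the function name rewrite_csv, and the in-memory io.StringIO buffers are hypothetical stand-ins, since the actual file contents are not shown in the thread.

```python
import csv
import io

# Hypothetical two-column input; the real file's contents are not shown.
SAMPLE = "A,B\nname1,link1\nname2,link2\n"


def rewrite_csv(src, dst, new_header, prefix):
    """Copy rows from src to dst, replacing the header and prefixing column A."""
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    next(reader, None)           # discard the original header row
    writer.writerow(new_header)  # written before the loop, so it appears once
    for row in reader:
        if len(row) < 2:         # skip blank or short rows
            continue
        writer.writerow([prefix + row[0], row[1]])


out = io.StringIO()
rewrite_csv(io.StringIO(SAMPLE), out, ["#EXTM3U"], "#EXTINF:0,")
print(out.getvalue())
```

Note that csv.writer quotes any field containing the delimiter, so a prefix that ends in a comma comes out wrapped in double quotes. That is standard csv behaviour, and one reason a playlist may be easier to write as plain text than through writerow.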
RE: CSV : Troubleshooting my script - Larz60+ - Jan-03-2017

Would you change the following and post the results?

    import csv

    with open('pyvids_tuts.csv') as f:
        reader = csv.reader(f)
        new_data = []
        for row in reader:
            if reader.line_num == 1:
                continue
            print('row: {}\n{}\n{}\n'.format(row, row[0], row[1]))
            # print(row[1] + "\n" + row[0])

I am heading for bed, but will look at it first thing in the AM (or someone else will pick it up).

RE: CSV : Troubleshooting my script - scriptso - Jan-04-2017

Okay, so... the output comes out as ...

However, I did redo the CSV script that I posted. I use it for most of this current project, in which the regex never fails, but I guess in this instance, with this site, a small adjustment was needed...

            if reader.line_num == 1:
                continue
            # print('row: {}\n{}\n{}\n'.format(row, row[0], row[1]))
            print(row[0] + row[1])
            # used to be print(row[1] + "\n" + row[0])

The small edit outputs...

It seems pretty foolish of me to have forgotten about using the format option, but really it was never the big issue, since realistically I could have added it as a string, which you can't do when you're using the csv writerow option...

    import csv

    with open('pyvids_tuts.csv') as f:
        reader = csv.reader(f)
        new_data = []
        for row in reader:
            if reader.line_num == 1:
                continue
            print('#EXTINF:0, {}{}'.format(row[0], row[1]))
            # print(row[0] + row[1])

The real issue has been adding a new header. By using continue I know that I'm skipping over the first line, but after doing so I have tried using append to add a blank row at what would be A1, just to insert "#EXTM3U" as the header, and my attempts have made the desired header repeat on every pass through the loop...
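One way to keep "#EXTM3U" from repeating is to write it before the loop starts rather than inside it. A minimal sketch, assuming column 0 holds the video name and column 1 the link (swap the indexes if the export is ordered the other way); the function name csv_to_m3u and the sample rows are hypothetical.

```python
import csv
import io


def csv_to_m3u(src, dst):
    """Write an M3U playlist to dst from CSV rows of (name, link) in src."""
    reader = csv.reader(src)
    next(reader, None)      # skip the CSV header row
    dst.write("#EXTM3U\n")  # playlist header sits outside the loop: written once
    for row in reader:
        if len(row) < 2:    # ignore blank or incomplete rows
            continue
        name, link = row[0].strip(), row[1].strip()
        # plain string formatting, so the comma after the 0 is not quoted
        dst.write("#EXTINF:0,{}\n{}\n".format(name, link))


# Hypothetical sample data standing in for pyvids_tuts.csv
sample = io.StringIO(
    "vidName,link2vid\n"
    "Intro to CSV,https://example.com/embed/1\n"
    "Scrapy basics,https://example.com/embed/2\n"
)
playlist = io.StringIO()
csv_to_m3u(sample, playlist)
print(playlist.getvalue())
```

With real files, the same function could be called with open('pyvids_tuts.csv', newline='') as the source and an output file opened in 'w' mode as the destination, which would also remove the need to redirect the output with ">".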
RE: CSV : Troubleshooting my script - Larz60+ - Jan-04-2017

Try this (you should just get populated items):

    import csv

    with open('pyvids_tuts.csv') as f:
        reader = csv.reader(f)
        new_data = []
        for row in reader:
            for item in row:
                # each row is a list, so strip the individual fields
                item = item.strip()
                if item:
                    print('item: {}'.format(item))
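Each row that csv.reader yields is a list of strings, so str.strip() applies to the individual fields and not to the row itself. A small sketch of filtering a row down to its populated fields; the helper name populated_items and the sample data are assumptions made for illustration.

```python
import csv
import io


def populated_items(row):
    """Return the stripped, non-empty fields of one csv.reader row."""
    return [item.strip() for item in row if item.strip()]


# Hypothetical sample: one padded field and one row of empty fields
sample = io.StringIO(
    "vidName,link2vid\n"
    "  Intro to CSV ,https://example.com/embed/1\n"
    ",\n"
)
for row in csv.reader(sample):
    for item in populated_items(row):
        print("item: {}".format(item))
```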