Python Forum
CSV : Troubleshooting my script
#1
Hello forum!
  First post, and I'm starting it off by making myself look extra nubZ! lol. So I just reached (...yesterday, in fact) my first year of completely diving into everything Python... If you check my repos, 99.98% are Python forks and personal projects, about 150ish deep. I've embarked on this journey to meet and surpass the requirements for various certifications I'll be completing/enrolling in within a month (WebSec heavy)... the past 4 months have been nothing but scraping and data analytics...
#Just made a quick and dirty scrapy script I'll use for what I'm trying to get across

Self-taught. There's no stock module I have issues with except the csv module... I have a really hard time understanding the seemingly basic logic behind using it. Here's the flow of what I'm trying to do in an example scenario:
#scrapy script
# -*- coding: utf-8 -*-
import scrapy

class PyvsrcSpider(scrapy.Spider):
   name = "pyVSrc"
   allowed_domains = ["pyvideo.org"]
   start_urls = ['http://pyvideo.org/tag/tutorial/']

   def parse(self, response):
       # follow links to author pages
       # item = tagitems()
       # item['tag2'] = response.css('div.headline h1::text').re_first('\w+$');
       for href in response.css('div.thumb a::attr(href)').extract():
           yield scrapy.Request(response.urljoin(href),
                                callback=self.parse_mainsrc)
       # The reason for the second def is that I use this as a template for quick and dirty
       # scraping, and where this comment is would lie a pagination script... #muchmess
   def parse_mainsrc(self, response):
       yield {
           'vidName': response.css('h2.entry-title a::text').extract(),
           'link2vid': response.css('iframe').xpath('@src').extract(),

           # 'tags': response.css('div.video-tags a::text').extract()
       }


##### csvrewrite.py
_author_ = '...not me'

import csv

with open('pyvids_tuts.csv') as f:
    reader = csv.reader(f)
    new_data = []
    for row in reader:
        if reader.line_num == 1:
            continue
        print(row[1] + "\n" + row[0])
I write the Scrapy results to a CSV, which, as you can see, gives two columns. I then run the second script to parse all the items. The idea is to iterate over the 'NAME,TITLE' pairs, excluding the header row, so that it outputs as...
Output:
vidName1 link2vid1 vidName2 link2vid2 vidName2 link2vid2
My issue (please don't flame the nub) is that I have a hard time getting it to simply write the output. My own noob workaround has been to 1) keep the scrapy script and csvrewrite.py in the same folder and 2) redirect the output manually with ">"...
#Which I know I could probably do with the os module, so it just runs in the current directory or whatever
... but doing any extra editing of the output has also become a two-week WTF issue. The extra edits?

1. Add "#EXTM3U" as the new header #all attempts have either repeated "#EXTM3U" on every row or errored out
2. Add "#EXTINF:0," before every "vidName" entry #the issue being the comma after the 0, though that's not the important part because it doesn't really need it to function as an M3U playlist
DESIRED OUTPUT!
Output:
#EXTM3U
#EXTINF:0,vidName
link2vid1
#EXTINF:0,vidName2
link2vid2
#EXTINF:0,vidName3
link2vid3
Oh yeah... LMAO, the whole point of this is to create streamable m3u/m3u8 playlists and such...
ALSO, there has to be a way to include all this in one script, right?? Help a noob out?! Point me in the right direction?! lol. I am googling and doing csv module tutorials but could use a hand.
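
On the "one script" question, here is a minimal sketch of the idea, assuming the scraped items are already in memory as dicts shaped like the ones `parse_mainsrc` yields (each field is a list, because `.extract()` returns a list). The function name `items_to_m3u` and the sample item are hypothetical, used only for illustration, and the Scrapy side is left out so the example runs on its own.

```python
def items_to_m3u(items):
    """Build M3U playlist text from dicts shaped like the spider's output.

    Hypothetical helper: the name and signature are assumptions for this sketch.
    """
    lines = ["#EXTM3U"]  # the header is added once, before the loop
    for item in items:
        # .extract() returns lists, so use the first match of each field
        names = item.get("vidName") or []
        links = item.get("link2vid") or []
        if not names or not links:
            continue  # skip pages with no title or no embedded video
        lines.append("#EXTINF:0,{}".format(names[0].strip()))
        lines.append(links[0].strip())
    return "\n".join(lines) + "\n"


# Sample item for illustration only
sample = [
    {
        "vidName": ["Julia Tutorial\n      "],
        "link2vid": ["https://www.youtube.com/embed/AyvyVS6u8AM"],
    },
]
playlist = items_to_m3u(sample)
print(playlist, end="")
```

Building the playlist straight from the items would remove the intermediate CSV and the manual ">" redirect. Where the items come from (a Scrapy pipeline, a feed export, or a list collected some other way) is left open here.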
Reply
#2
Please re-post with code tags
Thank you
Reply
#3
(Jan-02-2017, 10:35 PM)Larz60+ Wrote: Please re-post with code tags
Thank you

I'm sorry? Wait, did you add the tags for me? If not, then I blame being a new user, as I hadn't been declared (notspammer) yet... When writing the post I did add them... AHHH, except for that output nonsense. Let me know if my post is not in the correct format at the moment. THANKS!
Reply
#4
I attempted to, but someone else actually did it, not sure who.
At any rate, it's done. Makes it a whole lot easier to read.
Now let me go back and look at what's happening.

OK, I read your post and am having a very hard time understanding it.
Here's a suggestion that may help.
Try writing your request in your native language and passing it through Google Translate:
https://translate.google.com/
Reply
#5
My second attempt (please excuse my messy post).
 I'm having an issue understanding the logic of the csv library, mostly the write portion. What I am trying to do is very basic editing. To keep things simple in this example, say the CSV has only two columns, A and B, and the idea is to combine them both into one column, not merge them, as well as replacing the header and adding a string to each entry in one of the columns. For example:

My csv file looks like this..


Output:
╔════════╦═════════╗
║ Header ║ Header2 ║
╠════════╬═════════╣
║ 123    ║ abc     ║
╠════════╬═════════╣
║ 456    ║ def     ║
╠════════╬═════════╣
║ 789    ║ ghi     ║
╚════════╩═════════╝
And I want to edit it and write it out so it looks like this...

Output:
╔════════════╗
║ newHeader  ║
╠════════════╣
║ $string123 ║
╠════════════╣
║ abc        ║
╠════════════╣
║ $string456 ║
╠════════════╣
║ def        ║
╠════════════╣
║ $string789 ║
╠════════════╣
║ ghi        ║
╚════════════╝
Again, I apologize for the original post and my inability to convey this simple thing. I will remember to tackle one problem at a time. To reiterate something from the original post, I am not just posting here hoping to be spoon-fed a solution. I have gone through the csv library documentation and various tutorials on this with no real luck understanding the csv module. All I have really been able to do is use the read function properly.

Thank you readers and thank you Admin!
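
For the two-column example above, a minimal sketch of one way to do it with `csv.reader` and `csv.writer`. The function name `stack_columns` and its parameters are hypothetical, and in-memory `io.StringIO` objects stand in for real files so the example runs on its own.

```python
import csv
import io


def stack_columns(src, dst, new_header, prefix):
    """Rewrite a two-column CSV as one column: prefix + A, then B, per row.

    Hypothetical helper: the name and parameters are assumptions for this sketch.
    """
    reader = csv.reader(src)
    writer = csv.writer(dst)
    next(reader, None)             # skip the old header row
    writer.writerow([new_header])  # write the new header once, outside the loop
    for row in reader:
        if len(row) < 2:
            continue  # ignore blank or short rows
        # writerow() takes a list of fields, so each value is wrapped in a list
        writer.writerow([prefix + row[0]])
        writer.writerow([row[1]])


# In-memory stand-ins for the input and output files
src = io.StringIO("Header,Header2\n123,abc\n456,def\n789,ghi\n")
dst = io.StringIO()
stack_columns(src, dst, "newHeader", "$string")
rows = dst.getvalue().splitlines()
print(rows)
```

With real files, the `csv` documentation recommends opening both with `newline=''`. The key point is that `writerow()` expects a sequence of fields, so a single string has to be wrapped in a list, otherwise each character is written as its own column.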
Reply
#6
Would you change the following and post results
with open('pyvids_tuts.csv') as f:
    reader = csv.reader(f)
    new_data = []
    for row in reader:
        if reader.line_num == 1:
            continue
        print('row: {}\n{}\n{}\n'.format(row, row[0], row[1]))
        # print(row[1] + "\n" + row[0])
I am heading for bed, but will look at it first thing in the AM (or someone else will pick up)
Reply
#7
okay so.... output comes out as

Output:
row: ['Julia Tutorial\n      ', 'https://www.youtube.com/embed/AyvyVS6u8AM']
Julia Tutorial
      
https://www.youtube.com/embed/AyvyVS6u8AM

row: ['Selecting the best model in scikit-learn using cross-validation\n      ', 'https://www.youtube.com/embed/6dbrR-WymjI']
Selecting the best model in scikit-learn using cross-validation
      
https://www.youtube.com/embed/6dbrR-WymjI

row: ['Setting up Python for machine learning: scikit-learn and IPython Notebook\n      ', 'https://www.youtube.com/embed/IsXXlYVBt1M']
Setting up Python for machine learning: scikit-learn and IPython Notebook
      
https://www.youtube.com/embed/IsXXlYVBt1M
However, I did redo the CSV script that I posted. I use it for most of this current project, where the regex never fails, but I guess this site needed a small adjustment.



...
if reader.line_num == 1:
   continue
# print('row: {}\n{}\n{}\n'.format(row, row[0], row[1]))
print(row[0] + row[1]) # used to be print(row[1] + "\n" + row[0])
 The small edit outputs...

Output:
Julia Tutorial
      https://www.youtube.com/embed/AyvyVS6u8AM
Selecting the best model in scikit-learn using cross-validation
      https://www.youtube.com/embed/6dbrR-WymjI
Setting up Python for machine learning: scikit-learn and IPython Notebook
      https://www.youtube.com/embed/IsXXlYVBt1M
It seems pretty foolish of me to forget about using the format option, but it was never really the big issue, since realistically I could have added it as a string, which you can't do when you're using the csv writerow option...

import csv

with open('pyvids_tuts.csv') as f:
   reader = csv.reader(f)
   new_data = []
   for row in reader:
       if reader.line_num == 1:
           continue
       print('#EXTINF:0, {}{}'.format(row[0], row[1]))
       # print(row[0] + row[1])
Output:
#EXTINF:0, Julia Tutorial
      https://www.youtube.com/embed/AyvyVS6u8AM
#EXTINF:0, Selecting the best model in scikit-learn using cross-validation
      https://www.youtube.com/embed/6dbrR-WymjI
#EXTINF:0, Setting up Python for machine
The real issue has been adding a new header. By using continue I know that I'm skipping over the first line, but after doing so I have tried using append to add a blank row at what would be A1, just to insert "#EXTM3U" as the header. My attempts have only yielded the desired header repeating on every pass through the loop...
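
A header that repeats on every row usually means it is being written inside the `for` loop. A minimal sketch of writing it once, before the loop, and stripping the stray newline and spaces from the title field. The function name `csv_to_m3u` is hypothetical, and `io.StringIO` objects stand in for `pyvids_tuts.csv` and the playlist file so the example runs on its own.

```python
import csv
import io


def csv_to_m3u(csv_file, m3u_file):
    """Convert rows of (title, link) into an extended M3U playlist.

    Hypothetical helper: the name and signature are assumptions for this sketch.
    """
    reader = csv.reader(csv_file)
    next(reader, None)           # skip the CSV header row
    m3u_file.write("#EXTM3U\n")  # playlist header: written once, before the loop
    for row in reader:
        if len(row) < 2:
            continue  # ignore blank or short rows
        # strip() removes the trailing newline and spaces seen in the title field
        title, link = row[0].strip(), row[1].strip()
        m3u_file.write("#EXTINF:0,{}\n{}\n".format(title, link))


# In-memory stand-in for the CSV; the title has a trailing newline and spaces
csv_text = (
    "vidName,link2vid\n"
    '"Julia Tutorial\n      ",https://www.youtube.com/embed/AyvyVS6u8AM\n'
)
out = io.StringIO()
csv_to_m3u(io.StringIO(csv_text), out)
result = out.getvalue()
print(result, end="")
```

With real files this could be called as `csv_to_m3u(open('pyvids_tuts.csv', newline=''), open('playlist.m3u', 'w'))`, ideally inside `with` blocks. The output filename `playlist.m3u` is an assumption. Plain `write()` is used instead of `csv.writer` because an M3U file is a line-based text file, not a CSV.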
Reply
#8
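Try this (you should just get the populated items):
import csv

with open('pyvids_tuts.csv') as f:
    reader = csv.reader(f)
    for row in reader:
        # row is a list of strings, so strip each field rather than the list itself
        for item in row:
            item = item.strip()
            if item:  # skip fields that are empty after stripping
                print('item: {}'.format(item))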
Reply

