Python Forum
How can I parse a text file?
#1
I have a file with the following content:
-----myfile.txt---------
Name - Mike
City - Moscow
Have - dog
something
Name - Dasha
City - Voronesh
Have - cat
something
Name - Sveta
Citi - Vologda
Have - mouse
something
-----------------------
How can I turn it into this?
1 Name Mike City Moscow Have dog
2 Name Dasha City Voronesh Have cat
3 Name Sveta City Vologda Have mouse
Reply
#2
What have you tried?
This also looks like a typical school task.
Reply
#3
No, this is not a school task; that file is just an example. I actually need to parse a different file: I run the command 'sudo iwlist scanning' and need to extract some formatted data from its output. That is why I decided to start with this file. I don't even know how to begin.
Reply
#4
Start with the basics, like reading in the file.
If this is all new to you, you need to study Python at a basic level first.
# Read file line by line
with open('myfile.txt') as f:
   for line in f:
       line = line.strip()
       print(line)
# Read to a list
with open('myfile.txt') as f:
   result = [i.strip() for i in f]
   print(result)
# Read to a string
with open('myfile.txt') as f:
    result = f.read()
    print(result)
Now you can start thinking about the options you have.
For example, what happens if you do:
if 'something' in line:
   print(line) 
Or what happens if you call split() on 'something' in the last example?
>>> with open('myfile.txt') as f:
...     result = f.read()
...     
>>> result.split('something')
['Name - Mike\nCity - Moscow\nHave - dog\n',
 '\nName - Dasha\nCity - Voronesh\nHave - cat\n',
 '\nName - Sveta\nCiti - Vologda\nHave - mouse\n',
 '']
Almost there, just too many \n.
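One way the split idea could be finished, as a rough sketch only: split on the 'something' separator, then flatten each chunk into one numbered line. The `text` string stands in for the contents of myfile.txt, and `format_records` is a name made up for this illustration.

```python
# Sample text standing in for the contents of myfile.txt (assumption).
text = """Name - Mike
City - Moscow
Have - dog
something
Name - Dasha
City - Voronesh
Have - cat
something
"""


def format_records(raw, separator="something"):
    """Return one numbered line per record (hypothetical helper)."""
    lines = []
    for chunk in raw.split(separator):
        # Drop the ' - ' delimiters; split() then swallows the newlines.
        words = chunk.replace(" - ", " ").split()
        if words:  # skip the empty chunk after the last separator
            lines.append(f"{len(lines) + 1} {' '.join(words)}")
    return lines


for line in format_records(text):
    print(line)
```

Calling split() with no arguments is what removes the extra \n characters: it splits on any run of whitespace, including newlines.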
Reply
#5
Thank you very much!

Here is what I have just coded:
import subprocess

def command_save(command):
   var = subprocess.check_output(command.split(), universal_newlines=True)
   output = open('scanning.txt', 'w')
   print(var, file=output)
   output.close()

def scannig():
   var = subprocess.check_output(['sudo', 'iwlist', 'scanning'], universal_newlines=True)
   output = open('scanning.txt', 'w')
   print(var, file=output)
   output.close()

address = []
channel = []
essid = []
f = open('scanning.txt', 'r').readlines()
for x in f:
   if "Address" in x:
       address.append(x)


for x in f:
   if "Channel" in x:
       channel.append(x)

for x in f:
   if "ESSID" in x:
       essid.append(x)

address = ''.join(address)
channel = ''.join(channel)
essid = ''.join(essid)

address = address.split()
channel = channel.split()
essid = essid.split()
Now I can't figure out how to join address, channel, and essid together.
Reply
#6
If you have 3 lists with the related items in order, then why not just use zip to walk through the 3 lists together?
Susan
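A small sketch of the zip suggestion, assuming the three lists are the same length and in matching order. The values are placeholders, not real scan results.

```python
# Placeholder values for illustration only (not real scan results).
address = ["AA:BB:CC:00:00:01", "AA:BB:CC:00:00:02"]
channel = ["1", "6"]
essid = ["home", "office"]

# zip yields one tuple per position, so related items travel together.
rows = [
    f"{number} {a} {c} {e}"
    for number, (a, c, e) in enumerate(zip(address, channel, essid), start=1)
]
for row in rows:
    print(row)
```

Note that zip stops at the shortest list, so a missing entry in one list silently drops the tail of the others.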
Reply
#7
If the file has this structure (name, city, have, something), you can print it the way you want in a single loop. What should happen with 'something'?
"As they say in Mexico 'dosvidaniya'. That makes two vidaniyas."
https://freedns.afraid.org
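The single-loop idea could look roughly like this, assuming every record ends with a 'something' line. `io.StringIO` stands in for the opened myfile.txt, and the variable names are made up for the sketch.

```python
import io

# Stand-in for open('myfile.txt'); a shortened copy of the example data.
sample = io.StringIO(
    "Name - Mike\nCity - Moscow\nHave - dog\nsomething\n"
    "Name - Dasha\nCity - Voronesh\nHave - cat\nsomething\n"
)

records = []
current = []
for line in sample:
    line = line.strip()
    if line == "something":
        # The separator line closes the current record.
        records.append(" ".join(current))
        current = []
    elif line:
        key, _, value = line.partition(" - ")
        current.extend([key, value])

for number, record in enumerate(records, start=1):
    print(number, record)
```

Here 'something' is treated purely as a record separator; a final record with no separator after it would be dropped by this sketch.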
Reply
#8
(May-30-2017, 10:57 PM)Mike Ru Wrote:
def command_save(command):
   var = subprocess.check_output(command.split(), universal_newlines=True)
   output = open('scanning.txt', 'w')
   print(var, file=output)
   output.close()

def scannig():
   var = subprocess.check_output(['sudo', 'iwlist', 'scanning'], universal_newlines=True)
   output = open('scanning.txt', 'w')
   print(var, file=output)
   output.close()

I have a couple of recommendations for you:
  1. Those 2 functions are identical except for the command string, which violates the DRY principle (Don't Repeat Yourself). You should write just 1 function and always pass the command line to it.
  2. You hard-code the name of your output file. That is bad practice, because you may unintentionally overwrite it. If you want to keep writing to the same file, open it in append mode ('a'). In any case, pass the file name as a parameter too.
  3. You read the output only to write it to a file; there is a more direct way (below).
  4. A plain split is not good for complex command lines (for example, quoted arguments containing spaces); use shlex.split.
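import shlex
import subprocess
def exec_command_store_out(command_line, out_file):
   # Append the command's stdout straight to out_file; the with block closes the file
   with open(out_file, 'a') as out:
       subprocess.Popen(shlex.split(command_line), stdout=out).wait()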
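To illustrate why shlex.split is recommended over a plain split, here is a short comparison. The command line is a hypothetical example, chosen only because it has a quoted argument with a space in it.

```python
import shlex

# Hypothetical command line with a quoted argument containing a space.
command_line = 'grep "hello world" notes.txt'

plain = command_line.split()         # breaks the quoted argument in two
quoted = shlex.split(command_line)   # follows shell-style quoting rules

print(plain)
print(quoted)
```

The plain split leaves the quote characters in the pieces and produces four arguments, while shlex.split yields the three arguments a shell would pass.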
(May-30-2017, 10:57 PM)Mike Ru Wrote:
f = open('scanning.txt', 'r').readlines()
for x in f:
  if "Address" in x:
      address.append(x)
.......
  • Too many empty lines
  • Each line read from the file ends with a newline character
  • There's a more Pythonic way to handle the file
  • You make 3 loops; one is enough
  • You will probably need zip to combine your data, but that is another story
  • On Python 3, process output is bytes unless you pass universal_newlines=True (as you do), so no decoding is needed here; without that flag you would have to decode it.
with open(<file name>) as in_file:
   for line in in_file:
       line = line.strip()
       if "Address" in line:
           address.append(line)
       elif "Channel" in line:
           channel.append(line)
...............
Why you are joining and then splitting your lines is beyond me.
Test everything in a Python shell (iPython, Azure Notebook, etc.)
  • Someone gave you advice you liked? Test it - maybe the advice was actually bad.
  • Someone gave you advice you think is bad? Test it before arguing - maybe it was good.
  • You posted a claim that something you did not test works? Be prepared to eat your hat.
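A fuller single-pass version might look like the sketch below. The sample text is made up and only loosely modeled on scan output, so the real format may differ; `io.StringIO` stands in for the opened file.

```python
import io

# Made-up text loosely modeled on scan output; real output may differ.
sample = io.StringIO(
    "Cell 01 - Address: AA:BB:CC:00:00:01\n"
    "    Channel:1\n"
    "    Frequency:2.412 GHz (Channel 1)\n"
    '    ESSID:"home"\n'
    "Cell 02 - Address: AA:BB:CC:00:00:02\n"
    "    Channel:6\n"
    '    ESSID:"office"\n'
)

address, channel, essid = [], [], []
for line in sample:
    line = line.strip()
    if "Address:" in line:
        address.append(line.split("Address:")[1].strip())
    elif line.startswith("Channel:"):
        # startswith avoids also matching "(Channel 1)" on the Frequency line.
        channel.append(line.split(":", 1)[1])
    elif line.startswith("ESSID:"):
        essid.append(line.split(":", 1)[1].strip('"'))

print(address, channel, essid)
```

Storing only the value instead of the whole line removes any need to join and re-split afterwards. A bare `"Channel" in line` test would also have matched the Frequency line in this sample, which is why the sketch uses startswith.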
Reply
#9
Using a regex is sometimes shorter, but not always the better choice.

import re

address_regex = re.compile(r'Address: ([A-Z0-9:]+)')
channel_regex = re.compile(r'Channel:(\d+)')
essid_regex = re.compile(r'ESSID:"(.+)"')

# var is the command output (a str) returned by subprocess.check_output above
addresses = address_regex.findall(var)
channels = channel_regex.findall(var)
essids = essid_regex.findall(var)

####
# Regardless of how you parsed the data, you can combine the lists with zip.
# a = [1,2,3]
# b = [4,5,6]
# c = [7,8,9]
# list(zip(a, b, c))
# => [(1, 4, 7), (2, 5, 8), (3, 6, 9)]
# It's similar to the transpose function in Excel, but more powerful.
# In Python 3, zip is a lazily evaluated iterator.
# That also lets you process data that would not fit in memory all at once.

# now make the data structure in the order you want:
access_points_list = list(zip(essids, addresses, channels))
# or as a dict, where the essids are the keys:
access_points_dict = {elements[0]: elements[1:] for elements in zip(essids, addresses, channels)}
# or with address as key:
access_points_dict = {elements[0]: elements[1:] for elements in zip(addresses, essids, channels)}
# or a list with a nested dict:
access_points_list_with_dicts = [{'essid': elements[0], 'address': elements[1], 'channel': int(elements[2])} for elements in access_points_list]
First you run your program, then you parse its output without saving it to disk, then you transform the data, and finally you write the data to disk.
Almost dead, but too lazy to die: https://sourceserver.info
All humans together. We don't need politicians!
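The run, parse, transform, write order could be sketched end to end as follows. The `var` string is made-up text standing in for the command output, and JSON is just one assumed output format; the thread does not prescribe one.

```python
import json
import re

# Made-up text standing in for the command output held in `var` (assumption).
var = (
    "Cell 01 - Address: AA:BB:CC:00:00:01\n"
    "    Channel:1\n"
    '    ESSID:"home"\n'
    "Cell 02 - Address: AA:BB:CC:00:00:02\n"
    "    Channel:6\n"
    '    ESSID:"office"\n'
)

# Parse in memory, without an intermediate file.
addresses = re.findall(r"Address: ([A-Z0-9:]+)", var)
channels = re.findall(r"Channel:(\d+)", var)
essids = re.findall(r'ESSID:"(.+)"', var)

# Transform the three flat lists into one record per access point.
access_points = [
    {"essid": e, "address": a, "channel": int(c)}
    for e, a, c in zip(essids, addresses, channels)
]

# Serialize only after the data has been transformed.
as_json = json.dumps(access_points, indent=2)
print(as_json)
```

The resulting string can then be written out in one step with a with open(...) block, so the disk is touched only once, at the end.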
Reply
#10
(May-31-2017, 12:19 PM)DeaD_EyE Wrote:
...{elements[0]: elements[1:] for elements in zip(...)}
Well, then, why not
{key: elements for key, *elements in zip(...)}
which is Python3-ic?

And I am not sure that the OP is ready for all that; he's still pretty much struggling with the basics.
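For comparison, the two dict comprehensions side by side, using placeholder values. They build the same keys, with one small difference in the value type.

```python
# Placeholder values for illustration only.
addresses = ["AA:BB:CC:00:00:01", "AA:BB:CC:00:00:02"]
essids = ["home", "office"]
channels = ["1", "6"]

# Slicing a tuple gives a tuple.
by_slice = {elements[0]: elements[1:] for elements in zip(addresses, essids, channels)}
# Starred unpacking always collects the rest into a list.
by_star = {key: elements for key, *elements in zip(addresses, essids, channels)}

print(by_slice)
print(by_star)
```

The starred form reads more clearly, but its values are lists rather than tuples, which matters only if later code depends on the type.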
Reply

