As this is the second thread about this problem, I suspect that it is an assignment. One should always learn, whether in class or on this forum, so I will try to describe one potential approach.
For starters, a word of caution:
Quote: Regex is a pretty big hammer. Of course, you should know how to use that hammer if you really need a big hammer, but nevertheless it's a hammer. And you know what they say about hammers - if all you got is a hammer, everything looks like a nail.
Jamie Zawinski is credited with the phrase "Some people, when confronted with a problem, think 'I know, I'll use regular expressions.' Now they have two problems."
Using regex is troublesome because you can quickly end up with something that works, but a week or a month later you have lost the intuition behind the expression and cannot tell what it is doing without diving into its intricacies all over again.
In this particular case I don't see a need for the hammer, as there are no nails.
What we need:
- read a file
- get actor names and their text from the file rows
- write each actor's text into file(s)
Read the file.
It's simple. We use open. It's good practice to indicate the mode ('r'). As a file object is iterable, we can iterate over the rows directly:

    with open('somefile.txt', 'r') as script:
        for row in script:
            pass  # do something useful

Get actor names and text from rows
If we observe the example file then we can see:
- actors are before :
- some actor names have a space between the name and :
- some actor names contain additional information which should be stripped: (the boss)
  NB! This is not actually in the assignment, I assigned it myself.
- some lines of text are without an actor name (continuation of text)
So how should we approach this? One way is to process every line, split the line on : and unpack the result into variables, something like this:

    actor, text = line.split(':')

If we run this we will find that it fails on rows where there is no actor name and therefore no :. Python will raise ValueError because (a) there is no character to split on, (b) the .split method returns a list with only one item in it, and (c) there is no value to assign to text:

    >>> 'abc'.split(':')
    ['abc']
    >>> actor, text = 'abc'.split(':')
    /../
    ValueError: not enough values to unpack (expected 2, got 1)

Are we stuck? No, on the contrary. This error enables us to use the pythonic EAFP style of coding. How? We process the first line and get the actor name and text; on the second line there is no actor name and therefore an error is raised. We can just assign the whole line to text without changing the actor name defined while processing the previous line:
    for line in script:
        try:
            actor, text = line.split(':')
        except ValueError:
            text = line

Now we have to think about the data structure in which we will keep the results. One obvious way is to use defaultdict, but I chose an ordinary dictionary with the .setdefault method. In spoken language: 'if we encounter a name for the first time, create a key from that name and assign an empty list as its value, and in any case (whether the key existed or not) append the text to this list'. This gives us a dictionary where the key is the actor name and the value is a list of text lines.
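To make the comparison between the two options concrete, here is a minimal sketch that groups the same data both ways. The names pairs, plain and grouped, as well as the sample actors, are made up for illustration and are not from the assignment.

```python
from collections import defaultdict

# Hypothetical (actor, text) pairs, standing in for parsed rows.
pairs = [('Alice', 'Hello.'), ('Bob', 'Hi.'), ('Alice', 'How are you?')]

# Ordinary dictionary: setdefault creates the empty list on first sight
# of a key and returns the existing list otherwise.
plain = {}
for actor, text in pairs:
    plain.setdefault(actor, []).append(text)

# defaultdict: a missing key is created automatically by calling list().
grouped = defaultdict(list)
for actor, text in pairs:
    grouped[actor].append(text)

print(plain)
print(dict(grouped) == plain)
```

Both give the same mapping. The ordinary dictionary has the small advantage that looking up a missing key later still raises KeyError instead of silently creating an empty list.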
As we observed earlier, some 'cleaning' is needed, and we should do it here; otherwise we may end up with different keys for the same actor. So we write this (assuming that we have a dictionary named d), stripping whitespace and the (the boss) part:
    d.setdefault(actor.split('(')[0].strip(), []).append(text.strip())

The whole code to read the file and save the data into a dictionary can be something like this:
    d = {}
    with open('sample.txt', 'r') as script:
        for line in script:
            try:
                actor, text = line.split(':')
            except ValueError:
                text = line
            d.setdefault(actor.split('(')[0].strip(), []).append(text.strip())

Write text by actors into file(s).
We have the dictionary and need to create files and write the needed text into them.
We have actor names as dictionary keys. We need to format these to get suitable file names (camelcase). An f-string is the obvious choice if using Python >= 3.6. For every key we should split it on spaces, join it back together without spaces and add an extension (k is the dictionary key):

    f"{''.join(k.split(' '))}.txt"

In every line we should write the actor name and one line of text. Here we can also take advantage of f-strings. File operations are already covered, so we combine our knowledge into code. In spoken language: 'iterate over keys and values, create a filename (and file) from the key, and for every item in values write a row into the file in the form key: value'
    for k, v in d.items():
        with open(f"{''.join(k.split(' '))}.txt", 'w') as outing:
            for value in v:
                outing.write(f'{k}: {value}\n')

So this is how the whole code could look:
    d = {}
    with open('sample.txt', 'r') as script:
        for line in script:
            try:
                actor, text = line.split(':')
            except ValueError:
                text = line
            d.setdefault(actor.split('(')[0].strip(), []).append(text.strip())

    for k, v in d.items():
        with open(f"{''.join(k.split(' '))}.txt", 'w') as outing:
            for value in v:
                outing.write(f'{k}: {value}\n')

As this is probably an assignment, I deliberately didn't address the issue that a line without a name should be appended to the previous line, and I slightly altered the conditions (stripping '(the boss)').
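A side note on robustness: split(':') also raises ValueError when the text itself contains a colon (too many values to unpack), so such a row would be treated as a continuation, and the code needs the very first row to contain an actor name. Below is a minimal sketch of one way around both, using str.partition, which splits on the first separator only. The function name parse_script and the sample rows are hypothetical, not taken from the assignment, and blank rows are skipped as an assumption of mine. A continuation row that happens to contain a colon would still be misread as a new actor.

```python
def parse_script(lines):
    """Group text rows by actor name (hypothetical helper for illustration)."""
    grouped = {}
    actor = None  # no actor seen yet
    for line in lines:
        # partition splits on the FIRST ':' only; sep is '' if none is found
        name, sep, rest = line.partition(':')
        if sep:
            # same cleaning as before: drop '(...)' and surrounding whitespace
            actor = name.split('(')[0].strip()
            text = rest
        else:
            # no colon: continuation of the previous actor's text
            text = line
        text = text.strip()
        if actor is None or not text:
            continue  # skip blank rows and text that precedes any actor
        grouped.setdefault(actor, []).append(text)
    return grouped


# Made-up rows covering the cases discussed above.
sample = [
    "Da Bishop: Meet me at 10:30.\n",
    "and bring the others.\n",
    "\n",
    "Boss (the boss) : Fine.\n",
]
result = parse_script(sample)
print(result)
```

Like the code above, this keeps continuation rows as separate list items rather than appending them to the previous line.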
I'm not 'in'-sane. Indeed, I am so far 'out' of sane that you appear a tiny blip on the distant coast of sanity. Bucky Katt, Get Fuzzy
Da Bishop: There's a dead bishop on the landing. I don't know who keeps bringing them in here. ....but society is to blame.