Python Forum
Adding new line in a one line txt file.
#1
I converted a PDF to txt. The result is a file in which the whole text is written on one line. To work with the file I need to add two newlines before each number and one after it. (By the way, I am using Python 3.6.)

For example:
Input:
Here is some text. It is written in one lines. 12.13. Here is some more text. 2.12.14. Here is even more text.

Output (what I want):
Here is some text. It is written in one lines.

12.13.
Here is some more text.

2.12.14.
Here is even more text.

This is my code. It runs, but unfortunately it returns an empty file. I would be glad for some editing advice.


in_file2 = 'work1-T1.txt'
out_file2 = 'work2-T1.txt'


start_rx = re.compile('|'.join(
    ['\d\d\.\d\d\.', '^\d\.\d\d\.\d\d']))


with open(in_file2,'r', encoding='utf-8') as fin2, open(out_file2, 'w', encoding='utf-8') as fout2:
    text_list = fin2.read().split()

    for line in in_file2:
        start = True
        if re.match(start_rx, line):
            line = line.replace(start_rx, '\n\n' + start_rx + '\n')

        if line == True:
            fout2.write(line)
#2
It should be for line in fin2: (iterate over the file object, not over the file name string in_file2).
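A small sketch of why the original loop cannot work: in_file2 is just the file name, so iterating over it yields single characters, not lines of the file. The variable name chars is made up for illustration.

```python
in_file2 = 'work1-T1.txt'

# Iterating over a str walks through its characters, one at a time.
# No file is opened or read here.
chars = [line for line in in_file2]

print(chars[:4])
print(len(chars) == len(in_file2))
```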
#3
Thanks for your reply. I edited it to fin2, but it still returns an empty file :(
#4
The if line == True is always false because line is a string. What's its purpose?
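To illustrate, here is a minimal sketch with a made-up sample string. Comparing a string to True with == gives False, even though a non-empty string counts as true in an if test.

```python
line = "12.13. Here is some more text."  # made-up sample line

# A str is never equal to the bool True, so this comparison is False
# and the write inside "if line == True:" never runs.
equals_true = (line == True)

# Truthiness is a different thing: a non-empty string is truthy.
is_truthy = bool(line)

print(equals_true, is_truthy)
```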
#5
Well, I took it from different code and adapted it, but you are right: here it makes no sense. I thought it would always be true, so it would not be a problem. I changed the variable to string, but it still returns an empty file.

Here is the code:
with open(in_file2,'r', encoding='utf-8') as fin2, open(out_file2, 'w', encoding='utf-8') as fout2:
    text_list = fin2.read().split()

    for string in fin2:
        if re.match(start_rx, string):
            string = str.replace(start_rx, '\n\n' + start_rx + '\n')

        fout2.write(string)
#6
Now the problem is that after fin2.read(), the file pointer is at the end of the file, so the loop has nothing left to read. You can go back to the beginning of the file by calling seek
fin2.seek(0)
before the for string in fin2 loop.
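A minimal sketch of this behaviour, using io.StringIO as a stand-in for the opened file (an assumption made only so the example runs without a file on disk; a text file opened with open() behaves the same way here). The sample contents are made up.

```python
import io

# In-memory stand-in for the opened text file.
fin = io.StringIO("first line\nsecond line\n")

whole_text = fin.read()          # consumes everything, pointer is now at the end
lines_before_seek = list(fin)    # nothing left to iterate over

fin.seek(0)                      # move the pointer back to the start
lines_after_seek = list(fin)     # now the lines are available again

print(lines_before_seek)
print(lines_after_seek)
```

Since the input file in this thread is a single line, it may be simpler to keep the result of read() and work on that string directly instead of looping over the file a second time.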
#7
Thank you, that fixed the output problem. But now it returns exactly the same text as the input. I guess something is wrong with my regex then?
#8
Use re.sub() to replace regex matches in a string:
def foo(match):
    return '\n\n' + match.group(0) + '\n'

...
line = start_rx.sub(foo, line)
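Putting the pieces of this thread together, a minimal sketch could look like the following. It assumes the numbers are always two or three groups of one or two digits, each followed by a dot. The pattern NUMBER_RX and the function name are made up for illustration, and the sample string is the input from the first post. It calls sub() on the whole text instead of testing with re.match, because re.match only looks at the very start of the string.

```python
import re

# Assumed number format: 2 or 3 groups of 1-2 digits, each ending in a dot,
# e.g. "12.13." or "2.12.14.". Surrounding whitespace is swallowed so the
# number ends up alone on its line.
NUMBER_RX = re.compile(r'\s*\b((?:\d{1,2}\.){2,3})\s*')


def put_numbers_on_own_line(text):
    """Insert a blank line before each number and a newline after it."""
    return NUMBER_RX.sub(lambda m: '\n\n' + m.group(1) + '\n', text)


sample = ("Here is some text. It is written in one lines. 12.13. "
          "Here is some more text. 2.12.14. Here is even more text.")

result = put_numbers_on_own_line(sample)
print(result)
```

For the real files, the same function could be applied to the string returned by fin2.read() and the result written once with fout2.write(). If the numbers in the actual PDF text look different, the pattern will need adjusting.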