Python Forum
docx insert words of previous paragraph on next paragraph in the same position
#1
Hi, using from docx import Document I want to convert a docx to XML (that part works: I build a string that becomes my output XML). The docx contains songs with chords (in red). The program I want to feed the XML to accepts chords only in a particular format: each chord must be placed inline, inside the word where it is played. For example, here is part of a song (see the attached image).
I want the code to produce output like this:

1. [D-]My most [G]most beautiful [A7]i years
I want to spend on [RE]Te,


I've tried using the spacing between the chords to work out their position in the lyrics, but to no avail. Could anyone help me? Any advice or suggestion is welcome.

P.S. If you can't see the picture, I have attached it.
   
#2
Is the chord just a letter? Or is it an image?

If it is just a letter, just put a marker where you want the chord, like XX.

If you make a Python list of the chords, like:

chordlist = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'F']

Then, if you have the text as a string, use re.sub to find and replace each instance of XX

import re

# mytext is your text as a string
mytext = """You took my dreams from me,
I put them with my own,
Can't make on my own,
I love you baby!"""

# a list of the words you want to put chords in, in order of appearance
chord_words = ['took', 'me', 'put', 'own', 'make', 'own', 'love', 'baby']

# put the marker XX after the first letter of each word
# \b ... \b matches whole words only, and count=1 changes just the first
# word that has not been marked yet
# each re.sub call builds a new string (strings are immutable),
# which is fine for a text of this size
for word in chord_words:
    marked = word[0] + 'XX' + word[1:]
    mytext = re.sub(r'\b' + re.escape(word) + r'\b', marked, mytext, count=1)

# mytext looks like this now
"""
You tXXook my dreams from mXXe,
I pXXut them with my oXXwn,
Can't mXXake on my oXXwn,
I lXXove you bXXaby!
"""

chordlist = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'F']
# now replace each XX in turn with the next chord in the list
for chord in chordlist:
    mytext = mytext.replace('XX', chord, 1)

# now write the new text where you want it
But if your chords are images, that will be difficult, getting the exact location on the page to place the image!!

Could be done with PIL I suppose!
#3
(Jun-19-2023, 05:05 AM)Pedroski55 Wrote: Is the chord just a letter? Or is it an image?


It is a docx, but there is a problem: you defined chord_words by hand, and I can't do that, because there are maybe 2000 songs. How can I make the code work out where to insert each chord? There is another "problem" in your script: the chords are always inserted after the first letter, while in my case it is important to follow the real structure: the chord may fall on the second or third letter, not after the first. I don't know if I have explained clearly what I mean. Anyway, thank you for the reply.
#4
Oh dear! Python can't think or read your mind!

You will have to tell it where you want the chords, in which word and where in that word.

I use something similar to make gapped texts for use in class.

I put each sentence of text on a new line.

Then I give my program a list of tuples like this:

mylist = [(3, 'woman'), (5, 'man'), (6, 'dog'), (7, 'cat')]

The number is the line number from data = mytext.readlines()

Then the program goes to each line and removes the word 1 time and replaces it with a number and _______

Then I put the text together again and write it to a .docx file, with the missing words in a table.

But I have to tell the program which word I want to remove and which line that word is on!

You could do a similar thing, but you must tell it which line and which word you want the chord letter in!

How else could the program know what and where??
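The gapped-text approach described above could look roughly like the following. This is only a minimal sketch: the function name make_gaps and the sample sentences are made up for illustration, and line numbers here are 0-based list indexes.

```python
import re

def make_gaps(lines, targets):
    """Replace one whole-word occurrence per (line index, word) pair
    with a running number and a gap."""
    gapped = list(lines)
    for number, (line_no, word) in enumerate(targets, start=1):
        # \b keeps 'man' from matching inside 'woman'
        pattern = r"\b" + re.escape(word) + r"\b"
        gapped[line_no] = re.sub(pattern, f"{number} _______", gapped[line_no], count=1)
    return gapped

# hypothetical sample text, one sentence per line
lines = ["A woman and a man.", "The dog sleeps."]
print(make_gaps(lines, [(0, "man"), (1, "dog")]))
```

The same idea carries over to chords: the tuple tells the program which line and which word to touch, and the replacement text is the chord instead of a gap.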
#5
(Jun-19-2023, 08:27 AM)Pedroski55 Wrote: Oh dear! Python can't think or read your mind!


Yes, I know that Python can't read my mind. The code I have in mind takes each chord, which is ALWAYS placed (when there is one) in the row (or paragraph) above the lyrics, and, following a vertical line straight down, puts it at the same position in the row below. I don't know if it is useful, but I also positioned all the chords with tabs. (I attached an example of my idea of the "imaginary ruler".)
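The "imaginary ruler" idea can be sketched on plain strings as below. This is a minimal sketch under strong assumptions: both lines are already available as text, one character counts as one column (as in a monospaced font), and tabs are expanded to 8-column stops, which may not match the tab stops in the real docx. The function name merge_chords and the chord spacing in the example are made up for illustration.

```python
import re

def merge_chords(chord_line, lyric_line):
    """Insert each chord from chord_line into lyric_line at the same column."""
    # assumption: 8-column tab stops, so a column means the same in both lines
    chord_line = chord_line.expandtabs(8)
    merged = lyric_line.expandtabs(8)
    # work from right to left so earlier insert positions stay valid
    for match in reversed(list(re.finditer(r"\S+", chord_line))):
        column = match.start()
        if column > len(merged):
            # chord sits beyond the end of the lyric: pad with spaces
            merged = merged.ljust(column)
        merged = merged[:column] + "[" + match.group() + "]" + merged[column:]
    return merged

chords = "D-" + " " * 6 + "G" + " " * 9 + "A7"
print(merge_chords(chords, "My most beautiful years"))
# prints: [D-]My most [G]beautiful [A7]years
```

With a proportional font the columns will not line up this neatly, so the result should be treated as a first approximation to be checked by eye.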
   
#6
Just a thought, but to achieve the correct positioning of the chords:

Take each line as a text string.
Get the length of that string.
Write another string of spaces, except where you want a chord.

But you still need a list like I showed above, a list of tuples: [(position in the string, which chord)]

I am sadly not musical, so I don't understand music. But Linux has software for writing music, Rosegarden is one I have heard of.

Perhaps it can do what you want! The words, the music and the chords!
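The "string of spaces, except where you want a chord" idea might be sketched like this. It is only an illustration: build_chord_line is a made-up name, the positions are 0-based character indexes, and one character is assumed to take one column.

```python
def build_chord_line(lyric, placements):
    """Return a line of spaces as long as lyric, with each chord
    written at its position. placements is [(position, chord), ...]."""
    row = [" "] * len(lyric)
    for position, chord in placements:
        end = position + len(chord)
        if end > len(row):
            # chord runs past the end of the lyric: lengthen the row
            row.extend(" " * (end - len(row)))
        row[position:end] = list(chord)
    return "".join(row).rstrip()

lyric = "My most beautiful years"
print(build_chord_line(lyric, [(0, "D-"), (8, "G"), (18, "A7")]))
print(lyric)
```

Printed one above the other in a monospaced font, the two lines show each chord over the character it belongs to.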
#7
(Jun-19-2023, 11:39 PM)Pedroski55 Wrote: Just a thought, but to achieve the correct positioning of the chords:


I tried a similar method: I took the position of the first letter of each chord and, once the word ended, copied the chord into the line below at that position. But there are two problems. First, spaces do not have the same width as text characters, so the result is not very precise. Second, when Python reads the docx as a string, something changes (maybe the characters), so the positions differ. I think the best solution could be code that reads the docx, makes the edit and generates another docx; maybe re-converting would put things in the right order, but there is still the problem of the spaces. Does anyone have any other tips?
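The width problem can be made concrete with a toy example. This is a sketch only: the widths in toy_width are invented numbers (a space half as wide as any other character), not real font metrics, and all function names are hypothetical. The idea is to turn the chord's column into a horizontal offset, then find which lyric character covers that offset.

```python
def offset_of(text, index, width_of):
    """Horizontal offset at which the character text[index] starts."""
    return sum(width_of(char) for char in text[:index])

def index_at_offset(text, offset, width_of):
    """Index of the character in text that covers the given offset."""
    position = 0.0
    for index, char in enumerate(text):
        position += width_of(char)
        if offset < position:
            return index
    return len(text)

def toy_width(char):
    # HYPOTHETICAL widths: real values would have to come from the font
    return 0.5 if char == " " else 1.0

chord_line = " " * 6 + "G"
x = offset_of(chord_line, 6, toy_width)               # where the G starts
print(index_at_offset("My most beautiful", x, toy_width))
```

Counting characters alone would put the G at index 6 (the "t" of "most"), while the width-based lookup puts it at index 3 (the "m"), which is why the plain column method drifts.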
#8
Yes, I thought about the fact that the characters will be different sizes. But if you can get close to what you want first, you can tweak it to be exact.

The position, size and style of everything in a Word document is set using xml I believe.

One time I renamed a Word document so that it ended in .zip instead of .docx. Try that.

You will find that it is a zip archive with lots of information in the form of .xml files and various folders.

Maybe you can find the exact position of each character there, I don't know. Somehow Word must be told how and where to display what.
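Looking inside the archive does not need renaming; the standard library can read it directly. The sketch below is an assumption-laden illustration: paragraph_runs is a made-up name, it only looks at word/document.xml, and it only reads the text and the w:color value of each run, which could be one way to tell red chord runs from black lyric runs. As far as I know, this XML stores text and formatting rather than the final on-page position of each character, which Word works out when it lays out the page.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the elements in word/document.xml
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

def paragraph_runs(docx_file):
    """Return one list per paragraph, holding a (text, colour) pair per run.
    colour is the hex string from w:color, or None if the run sets none."""
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W + "p"):
        runs = []
        for run in para.iter(W + "r"):
            text = "".join(t.text or "" for t in run.iter(W + "t"))
            colour = run.find(W + "rPr/" + W + "color")
            value = colour.get(W + "val") if colour is not None else None
            runs.append((text, value))
        paragraphs.append(runs)
    return paragraphs
```

Calling paragraph_runs("songs.docx") (a hypothetical file name) would give a list that can be scanned for runs whose colour is "FF0000", assuming the chords really are stored with that exact colour value.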

Then, if you take your whole page as an image, you could insert chord images over that image, using position information from the .xml data.

Pillow (PIL) can paste one image over another in Python, using Image.paste. It is imported with:

from PIL import Image
Or, if you can mine the exact position of each character you want from the .xml data of a .docx file, you could perhaps modify the .xml to paste your chord images over the text. There are Python modules for parsing .xml files I believe, people here will know that.

As I understand it, you only have do, re, mi, fa, so, la, ti, do. (Shades of Julie Andrews!)

Hope you can get this solved!