Python Forum

Removing leading whitespaces
Hi all

I am trying to build a simple (semi-)live dictation script with Whisper. As Whisper does not support commands out of the box (e.g. 'new paragraph'), I need to catch these myself. I am using the keyboard module to enter the text output by Whisper into various apps (such as my word processor or web browser) via its write function. My script then captures these commands from Whisper's output and translates them into the required action (in this example, pressing Enter twice when 'new paragraph' is detected). In this case I am using re.sub to substitute 'new paragraph' with the escape sequence '\n\n'.

The part I am struggling with is that, done this way, the new paragraph always starts with an indent, which I do not want. This is my script:

import keyboard
import re

# dictionary mapping spoken commands to the text substituted for them
command_dict = {'new paragraph': '\n\n', 'question mark': '\b?'}

def write(text):

   for k, v in command_dict.items():
       text = re.sub(k, v, text, flags=re.IGNORECASE)
       
   for line in text.splitlines():
       if not line.isspace():
           line = line.lstrip()
   
   keyboard.write(text)

#the output of Whisper will be fed into write()
write('I am a new paragraph old man question mark')
Desired output:
Output:
I am a

old man?
What I get:
Output:
I am a

 old man?
I also tried using '\b\n\n' or '\n\n\b' as the value of the 'new paragraph' key, mirroring the 'question mark' key, but neither worked (the '\b' does work as expected for the 'question mark' key).

Some help would be appreciated.
You have spaces on either side of the command phrase, and the substitution leaves them in place. I'm going to replace them with dots here just for visibility. Say the command is just "paragraph" and you start with the string A.paragraph.B. After substitution that becomes A.\n\n.B.

Printed, that would be:
Output:
A.

.B
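To see the leftover spaces directly, you can run the substitution on its own and look at the result with repr(). This is only a small sketch using the 'new paragraph' key from the question; it leaves the keyboard module out entirely.

```python
import re

text = 'I am a new paragraph old man'

# Only the words "new paragraph" are replaced; the space before and the
# space after the phrase are not part of the match, so they stay.
result = re.sub('new paragraph', '\n\n', text, flags=re.IGNORECASE)

print(repr(result))  # 'I am a \n\n old man'
```

The space right after the second newline is the "indent" that shows up at the start of the new paragraph.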
You could capture optional whitespace around the keywords as part of your regex and that might improve things. Or you could remove any whitespace following a newline if you never want any indents (intentional or not).
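Both suggestions could be sketched roughly like this. Treat it as one possible approach rather than a tested drop-in replacement: the names COMMANDS, apply_commands and strip_indents are made up for illustration, and swallowing whitespace only around commands that expand to pure whitespace (so the '\b?' entry keeps working as before) is an assumption about the desired behaviour. The functions return the text instead of calling keyboard.write, so they can be checked without sending keystrokes. Also note that in the original loop, line = line.lstrip() only rebinds the loop variable, so text itself is never changed.

```python
import re

# Replacement strings taken from the question.
COMMANDS = {'new paragraph': '\n\n', 'question mark': '\b?'}


def apply_commands(text):
    """Option 1: let the regex swallow whitespace around the command."""
    for phrase, replacement in COMMANDS.items():
        if replacement.isspace():
            # Line-break commands: also match any whitespace on both sides,
            # so nothing is left over before or after the new paragraph.
            pattern = r'\s*' + re.escape(phrase) + r'\s*'
        else:
            # Other commands are matched exactly as in the question.
            pattern = re.escape(phrase)
        text = re.sub(pattern, replacement, text, flags=re.IGNORECASE)
    return text


def strip_indents(text):
    """Option 2: remove spaces/tabs that directly follow a newline."""
    return re.sub(r'\n[ \t]+', '\n', text)


print(repr(apply_commands('I am a new paragraph old man question mark')))
print(repr(strip_indents('I am a \n\n old man')))
```

Either result can then be passed to keyboard.write() as in the original script. Option 2 is simpler, but it also removes indents you may have dictated on purpose.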