Python Forum
Invalid syntax on input string - Printable Version


Invalid syntax on input string - Callahan - Nov-01-2018

Hi,

Not an avid Python user so probably a simple solution to this but I've yet to find it. I have a string containing a mix of spaces, alphanumerics and special chars. I want to extract text between 2 key tags and assign it to a variable. I'm using re.search to accomplish this as shown below.

_This works_
import re

text = '<dc:title>Flames</dc:title>'

m = re.search('<dc:title>(.*)</dc:title>', text)
if m:
    title = m.group(1)
    print(title)
However, trying to search on a much larger, more complex string causes a syntax error as shown below.

import re

text = '<dc:title>{u'AbsTime': u'NOT_IMPLEMENTED', u'@xmlns:u': u'urn:schemas-upnp-org:service:AVTransport:1', u'Track': u'5', u'TrackDuration': u'0:03:15', u'TrackURI': u'x-sonosapi-hls-static:catalog%2ftracks%2fB07BGCPYBM%2f71090898-695d-4d13-b8ff-4028fccf0a05%2fbaf8f5ea-7f11-4cfe-b0cd-ce07704b31df%2fA3F2HSL9IRWOWF%2fn%2fPRIME%2f26ac853f-8dad-4dbe-b5aa-2bb2e22df554%2fPRIME_STATION%2f57637924-c963-4136-8d51-e0c887aab1f5%2f?sid=201&flags=0&sn=1', u'RelTime': u'0:00:02', u'TrackMetaData': u'<DIDL-Lite xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:upnp="urn:schemas-upnp-org:metadata-1-0/upnp/" xmlns:r="urn:schemas-rinconnetworks-com:metadata-1-0/" xmlns="urn:schemas-upnp-org:metadata-1-0/DIDL-Lite/"><item id="-1" parentID="-1" restricted="true"><res protocolInfo="sonos.com-http:*:application/x-mpegURL:*" duration="0:03:15">x-sonosapi-hls-static:catalog%2ftracks%2fB07BGCPYBM%2f71090898-695d-4d13-b8ff-4028fccf0a05%2fbaf8f5ea-7f11-4cfe-b0cd-ce07704b31df%2fA3F2HSL9IRWOWF%2fn%2fPRIME%2f26ac853f-8dad-4dbe-b5aa-2bb2e22df554%2fPRIME_STATION%2f57637924-c963-4136-8d51-e0c887aab1f5%2f?sid=201&amp;flags=0&amp;sn=1</res><r:streamContent></r:streamContent><upnp:albumArtURI>/getaa?s=1&amp;u=x-sonosapi-hls-static%3acatalog%252ftracks%252fB07BGCPYBM%252f71090898-695d-4d13-b8ff-4028fccf0a05%252fbaf8f5ea-7f11-4cfe-b0cd-ce07704b31df%252fA3F2HSL9IRWOWF%252fn%252fPRIME%252f26ac853f-8dad-4dbe-b5aa-2bb2e22df554%252fPRIME_STATION%252f57637924-c963-4136-8d51-e0c887aab1f5%252f%3fsid%3d201%26flags%3d0%26sn%3d1</upnp:albumArtURI><dc:title>Flames</dc:title><upnp:class>object.item.audioItem.musicTrack</upnp:class><dc:creator>David Guetta &amp; Sia</dc:creator><upnp:album>Flames</upnp:album></item></DIDL-Lite>', u'RelCount': u'2147483647', u'AbsCount': u'2147483647'}</dc:title>'

m = re.search('<dc:title>(.*)</dc:title>', text)
if m:
    title = m.group(1)
    print(title)
The above will return a:
SyntaxError: invalid syntax
on the input string. I'm assuming it's because the input string contains such a mix of characters, ' being a problem I imagine.

So I've tried to format the string on input to remove all the chars I think Python doesn't like as shown below:

import re

text = '{u'AbsTime': u'NOT_IMPLEMENTED', u'@xmlns:u': u'urn:schemas-upnp-org:service:AVTransport:1', u'Track': u'1', u'TrackDuration': u'0:03:34', u'TrackURI': u'x-sonosapi-hls-static:catalog%2ftracks%2fB07DLVN8GS%2f%3fplaylistAsin%3dB07JB7Z9YC%26playlistType%3dprimePlaylist?sid=201&flags=65536&sn=1', u'RelTime': u'0:00:18', u'TrackMetaData': u'<DIDL-Lite xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:upnp="urn:schemas-upnp-org:metadata-1-0/upnp/" xmlns:r="urn:schemas-rinconnetworks-com:metadata-1-0/" xmlns="urn:schemas-upnp-org:metadata-1-0/DIDL-Lite/"><item id="-1" parentID="-1" restricted="true"><res protocolInfo="sonos.com-http:*:application/x-mpegURL:*" duration="0:03:34">x-sonosapi-hls-static:catalog%2ftracks%2fB07DLVN8GS%2f%3fplaylistAsin%3dB07JB7Z9YC%26playlistType%3dprimePlaylist?sid=201&amp;flags=65536&amp;sn=1</res><r:streamContent></r:streamContent><upnp:albumArtURI>/getaa?s=1&amp;u=x-sonosapi-hls-static%3acatalog%252ftracks%252fB07DLVN8GS%252f%253fplaylistAsin%253dB07JB7Z9YC%2526playlistType%253dprimePlaylist%3fsid%3d201%26flags%3d65536%26sn%3d1</upnp:albumArtURI><dc:title>Girls Like You [Explicit]</dc:title><upnp:class>object.item.audioItem.musicTrack</upnp:class><dc:creator>Maroon 5</dc:creator><upnp:album>Best of Prime Music</upnp:album></item></DIDL-Lite>', u'RelCount': u'2147483647', u'AbsCount': u'2147483647'}'

# Strip the characters !, @, #, :, $ and ' from the text
new_text = re.sub('[!@#:$\']', '', text)

print(new_text)
However, I'm still not able to get past the error on the input string.

If anyone can point me in the right direction, I'd appreciate it.

Thanks.


RE: Invalid syntax on input string - nilamo - Nov-01-2018

(Nov-01-2018, 04:18 PM)Callahan Wrote:
text = '<dc:title>{u'AbsTime': u'NOT_IMPLEMENTED', u'@xmlns:u': u'urn:schemas-upnp-org:service:AVTransport:1', u'Track': u'5', u'TrackDuration': u'0:03:15', u'TrackURI': u'x-sonosapi-hls-static:catalog%2ftracks%2fB07BGCPYBM%2f71090898-695d-4d13-b8ff-4028fccf0a05%2fbaf8f5ea-7f11-4cfe-b0cd-ce07704b31df%2fA3F2HSL9IRWOWF%2fn%2fPRIME%2f26ac853f-8dad-4dbe-b5aa-2bb2e22df554%2fPRIME_STATION%2f57637924-c963-4136-8d51-e0c887aab1f5%2f?sid=201&flags=0&sn=1', u'RelTime': u'0:00:02', u'TrackMetaData': u'<DIDL-Lite xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:upnp="urn:schemas-upnp-org:metadata-1-0/upnp/" xmlns:r="urn:schemas-rinconnetworks-com:metadata-1-0/" xmlns="urn:schemas-upnp-org:metadata-1-0/DIDL-Lite/"><item id="-1" parentID="-1" restricted="true"><res protocolInfo="sonos.com-http:*:application/x-mpegURL:*" duration="0:03:15">x-sonosapi-hls-static:catalog%2ftracks%2fB07BGCPYBM%2f71090898-695d-4d13-b8ff-4028fccf0a05%2fbaf8f5ea-7f11-4cfe-b0cd-ce07704b31df%2fA3F2HSL9IRWOWF%2fn%2fPRIME%2f26ac853f-8dad-4dbe-b5aa-2bb2e22df554%2fPRIME_STATION%2f57637924-c963-4136-8d51-e0c887aab1f5%2f?sid=201&amp;flags=0&amp;sn=1</res><r:streamContent></r:streamContent><upnp:albumArtURI>/getaa?s=1&amp;u=x-sonosapi-hls-static%3acatalog%252ftracks%252fB07BGCPYBM%252f71090898-695d-4d13-b8ff-4028fccf0a05%252fbaf8f5ea-7f11-4cfe-b0cd-ce07704b31df%252fA3F2HSL9IRWOWF%252fn%252fPRIME%252f26ac853f-8dad-4dbe-b5aa-2bb2e22df554%252fPRIME_STATION%252f57637924-c963-4136-8d51-e0c887aab1f5%252f%3fsid%3d201%26flags%3d0%26sn%3d1</upnp:albumArtURI><dc:title>Flames</dc:title><upnp:class>object.item.audioItem.musicTrack</upnp:class><dc:creator>David Guetta &amp; Sia</dc:creator><upnp:album>Flames</upnp:album></item></DIDL-Lite>', u'RelCount': u'2147483647', u'AbsCount': u'2147483647'}</dc:title>'

Allow me to rewrite your string, as the interpreter sees it:
text = '<dc:title>{u'nonsense
Our syntax highlighter makes that clear, and the full syntax error you got probably pointed at the same spot. If you want to include quotes in your string, you can triple-quote it so that the embedded quotes don't end the string. Check this out:
text = '''<dc:title>{u'AbsTime': u'NOT_IMPLEMENTED', u'@xmlns:u': u'urn:schemas-upnp-org:service:AVTransport:1', u'Track': u'5', u'TrackDuration': u'0:03:15', u'TrackURI': u'x-sonosapi-hls-static:catalog%2ftracks%2fB07BGCPYBM%2f71090898-695d-4d13-b8ff-4028fccf0a05%2fbaf8f5ea-7f11-4cfe-b0cd-ce07704b31df%2fA3F2HSL9IRWOWF%2fn%2fPRIME%2f26ac853f-8dad-4dbe-b5aa-2bb2e22df554%2fPRIME_STATION%2f57637924-c963-4136-8d51-e0c887aab1f5%2f?sid=201&flags=0&sn=1', u'RelTime': u'0:00:02', u'TrackMetaData': u'<DIDL-Lite xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:upnp="urn:schemas-upnp-org:metadata-1-0/upnp/" xmlns:r="urn:schemas-rinconnetworks-com:metadata-1-0/" xmlns="urn:schemas-upnp-org:metadata-1-0/DIDL-Lite/"><item id="-1" parentID="-1" restricted="true"><res protocolInfo="sonos.com-http:*:application/x-mpegURL:*" duration="0:03:15">x-sonosapi-hls-static:catalog%2ftracks%2fB07BGCPYBM%2f71090898-695d-4d13-b8ff-4028fccf0a05%2fbaf8f5ea-7f11-4cfe-b0cd-ce07704b31df%2fA3F2HSL9IRWOWF%2fn%2fPRIME%2f26ac853f-8dad-4dbe-b5aa-2bb2e22df554%2fPRIME_STATION%2f57637924-c963-4136-8d51-e0c887aab1f5%2f?sid=201&amp;flags=0&amp;sn=1</res><r:streamContent></r:streamContent><upnp:albumArtURI>/getaa?s=1&amp;u=x-sonosapi-hls-static%3acatalog%252ftracks%252fB07BGCPYBM%252f71090898-695d-4d13-b8ff-4028fccf0a05%252fbaf8f5ea-7f11-4cfe-b0cd-ce07704b31df%252fA3F2HSL9IRWOWF%252fn%252fPRIME%252f26ac853f-8dad-4dbe-b5aa-2bb2e22df554%252fPRIME_STATION%252f57637924-c963-4136-8d51-e0c887aab1f5%252f%3fsid%3d201%26flags%3d0%26sn%3d1</upnp:albumArtURI><dc:title>Flames</dc:title><upnp:class>object.item.audioItem.musicTrack</upnp:class><dc:creator>David Guetta &amp; Sia</dc:creator><upnp:album>Flames</upnp:album></item></DIDL-Lite>', u'RelCount': u'2147483647', u'AbsCount': u'2147483647'}</dc:title>'''
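As a small illustration of the quoting point above, here is a sketch using a shortened, hypothetical stand-in for the long string (the variable name `escaped` is made up for this example). It shows two ways of keeping single quotes inside a literal: triple quoting, and backslash-escaping each inner quote. Both should produce the same string.

```python
# Shortened, hypothetical stand-in for the long string in the post.
text = '''<dc:title>{u'Track': u'5', u'TrackMetaData': u'<item id="-1">'}</dc:title>'''

# Alternative: keep single quotes as the delimiter and escape every inner one.
escaped = '<dc:title>{u\'Track\': u\'5\', u\'TrackMetaData\': u\'<item id="-1">\'}</dc:title>'

# Both literals describe the same string; only the source notation differs.
print(text == escaped)
```

Triple quoting is usually the less error-prone choice for pasted data, though it would still break if the data itself contained three consecutive quote characters of the same kind.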



RE: Invalid syntax on input string - buran - Nov-01-2018

Use triple quotes in order to fix the Invalid Syntax error.
import re
 
text = """<dc:title>{u'AbsTime': u'NOT_IMPLEMENTED', u'@xmlns:u': u'urn:schemas-upnp-org:service:AVTransport:1', u'Track': u'5', u'TrackDuration': u'0:03:15', u'TrackURI': u'x-sonosapi-hls-static:catalog%2ftracks%2fB07BGCPYBM%2f71090898-695d-4d13-b8ff-4028fccf0a05%2fbaf8f5ea-7f11-4cfe-b0cd-ce07704b31df%2fA3F2HSL9IRWOWF%2fn%2fPRIME%2f26ac853f-8dad-4dbe-b5aa-2bb2e22df554%2fPRIME_STATION%2f57637924-c963-4136-8d51-e0c887aab1f5%2f?sid=201&flags=0&sn=1', u'RelTime': u'0:00:02', u'TrackMetaData': u'<DIDL-Lite xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:upnp="urn:schemas-upnp-org:metadata-1-0/upnp/" xmlns:r="urn:schemas-rinconnetworks-com:metadata-1-0/" xmlns="urn:schemas-upnp-org:metadata-1-0/DIDL-Lite/"><item id="-1" parentID="-1" restricted="true"><res protocolInfo="sonos.com-http:*:application/x-mpegURL:*" duration="0:03:15">x-sonosapi-hls-static:catalog%2ftracks%2fB07BGCPYBM%2f71090898-695d-4d13-b8ff-4028fccf0a05%2fbaf8f5ea-7f11-4cfe-b0cd-ce07704b31df%2fA3F2HSL9IRWOWF%2fn%2fPRIME%2f26ac853f-8dad-4dbe-b5aa-2bb2e22df554%2fPRIME_STATION%2f57637924-c963-4136-8d51-e0c887aab1f5%2f?sid=201&amp;flags=0&amp;sn=1</res><r:streamContent></r:streamContent><upnp:albumArtURI>/getaa?s=1&amp;u=x-sonosapi-hls-static%3acatalog%252ftracks%252fB07BGCPYBM%252f71090898-695d-4d13-b8ff-4028fccf0a05%252fbaf8f5ea-7f11-4cfe-b0cd-ce07704b31df%252fA3F2HSL9IRWOWF%252fn%252fPRIME%252f26ac853f-8dad-4dbe-b5aa-2bb2e22df554%252fPRIME_STATION%252f57637924-c963-4136-8d51-e0c887aab1f5%252f%3fsid%3d201%26flags%3d0%26sn%3d1</upnp:albumArtURI><dc:title>Flames</dc:title><upnp:class>object.item.audioItem.musicTrack</upnp:class><dc:creator>David Guetta &amp; Sia</dc:creator><upnp:album>Flames</upnp:album></item></DIDL-Lite>', u'RelCount': u'2147483647', u'AbsCount': u'2147483647'}</dc:title>"""
 
m = re.search('<dc:title>(.*)</dc:title>', text)
if m:
    title = m.group(1)
    print(title)
Also, when posting a traceback, always post the full traceback, not just the last line. In this case the problem was easy to spot, but in more complex code the traceback contains valuable information.
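To show the kind of detail a full syntax error report carries, here is a minimal sketch that feeds a shortened, hypothetical version of the failing assignment to the built-in `compile()` and inspects the resulting `SyntaxError`. The names `source` and `error` and the file label `<example>` are made up for this example. It also suggests why cleaning the string with `re.sub` could not help: the error is raised while the source is parsed, before any statement runs.

```python
# Hypothetical, shortened version of the failing assignment, held as source text.
source = "text = '<dc:title>{u'AbsTime': u'NOT_IMPLEMENTED'}</dc:title>'\n"

error = None
try:
    # compile() only parses the source; nothing in it is executed.
    compile(source, "<example>", "exec")
except SyntaxError as exc:
    error = exc
    print("line:  ", exc.lineno)   # line number within the source
    print("column:", exc.offset)   # position the parser objected to
    print("source:", exc.text)     # the offending line itself
```

The exact column reported may differ between Python versions, so treat the printed position as a hint rather than a fixed value.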


RE: Invalid syntax on input string - Callahan - Nov-01-2018

Ouch, simple solution.

However, despite fixing the invalid syntax issue (thanks for that), it doesn't work the way the simpler example

import re
 
text = """<dc:title>Flames</dc:title>"""
 
m = re.search('<dc:title>(.*)</dc:title>', text)
if m:
    title = m.group(1)
    print(title)
does. The script just returns the complete input string rather than the string between the tags.


RE: Invalid syntax on input string - nilamo - Nov-01-2018

https://www.regexpal.com/ says your regex is fine (though I would have escaped the angle brackets, since that's regex syntax), so you should be getting results. What is the result of print(m.groups())?
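For reference, a short sketch of what `m.groups()` gives for the simple input from the first post, along with a check on the escaping question. As far as Python's `re` module is concerned, angle brackets are ordinary characters outside of named-group syntax such as `(?P<name>...)`, so the escaped and unescaped patterns should behave the same. The variable names here are made up for this example.

```python
import re

text = '<dc:title>Flames</dc:title>'

# groups() returns a tuple with one entry per capture group in the pattern.
m = re.search('<dc:title>(.*)</dc:title>', text)
groups = m.groups()
print(groups)

# Same pattern with the angle brackets escaped; the match is unchanged.
m_escaped = re.search(r'\<dc:title\>(.*)\</dc:title\>', text)
print(m_escaped.groups() == groups)
```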


RE: Invalid syntax on input string - buran - Nov-01-2018

I can also confirm your regex works the same in both cases.
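One possible reading of the "returns the complete input string" result, offered as an assumption rather than something confirmed in the thread: the long test string is itself wrapped in an outer `<dc:title>...</dc:title>` pair, and `.*` is greedy, so the capture group runs from the first opening tag to the last closing tag. The sketch below uses a shortened, hypothetical stand-in for that string and compares the original pattern with one that uses `[^<]*`, which cannot cross another tag. The names `outer` and `inner` are made up for this example.

```python
import re

# Shortened, hypothetical stand-in for the long string posted above.
text = """<dc:title>{u'Track': u'5', u'TrackMetaData': u'<item><dc:title>Flames</dc:title><dc:creator>David Guetta &amp; Sia</dc:creator></item>'}</dc:title>"""

# Greedy .* spans from the first opening tag to the last closing tag,
# so the group holds everything between the outer wrapper tags.
outer = re.search('<dc:title>(.*)</dc:title>', text).group(1)
print(outer)

# [^<]* stops at the next "<", so only a title with no nested tags matches.
inner = re.findall('<dc:title>([^<]*)</dc:title>', text)
print(inner)
```

If the outer wrapper tags were only added for testing, dropping them and keeping the original pattern may be enough; the `[^<]*` variant is just one way to make the pattern robust against them.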