Python Forum
How do I parse the string? - Printable Version

How do I parse the string? - anna17 - Apr-10-2024

Hello,

I have the string below:
FTP://21.105.28.15/abc/test1.txt
From this string I want to extract only the directory path:
/abc/

How do I parse the string?

import pymssql
conn = pymssql.connect('10.21.100.21', 'sa', 'abc', 'DBA', as_dict=False) 
cur = conn.cursor()

cur.execute('select path From table')
 
fetch = cur.fetchall()
 
for i in fetch:
    # prints e.g. FTP://21.105.28.15/abc/test1.txt
    # I just want to get the path part: /abc/
    print(i)

conn.close()



RE: How do I parse the string? - Gribouillis - Apr-10-2024

You could use urllib.parse.urlparse:
>>> from urllib.parse import urlparse
>>> urlparse('FTP://21.105.28.15/abc/test1.txt')
ParseResult(scheme='ftp', netloc='21.105.28.15', path='/abc/test1.txt', params='', query='', fragment='')
>>> result = urlparse('FTP://21.105.28.15/abc/test1.txt')
>>> result.path
'/abc/test1.txt'
>>> 
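The question asked for only the directory part (/abc/), so one more step is needed after urlparse. A minimal sketch, assuming every stored value has a file name after the last slash; the helper name url_dir is hypothetical and not part of urllib:

```python
import posixpath
from urllib.parse import urlparse


def url_dir(url):
    """Return the directory part of the URL path, with a trailing slash."""
    path = urlparse(url).path            # e.g. '/abc/test1.txt'
    directory = posixpath.dirname(path)  # e.g. '/abc'
    # dirname('/test1.txt') is already '/', so avoid doubling the slash
    return directory if directory.endswith('/') else directory + '/'


print(url_dir('FTP://21.105.28.15/abc/test1.txt'))
print(url_dir('FTP://21.105.28.15/abc/def/ghi/test1.txt'))
```

posixpath is used rather than os.path so that the result does not depend on the operating system the script runs on.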



RE: How do I parse the string? - Pedroski55 - Apr-10-2024

Maybe like this, without using any modules: find the third / and the last /

mystring = 'FTP://21.105.28.15/abc/test1.txt'
mystring2 = 'FTP://21.105.28.15/abc/def/ghi/test1.txt'

# get the index of the third /
def getFirst(text):
    count = 0
    for s in range(len(text)):
        # skip the first 2 /
        if text[s] == '/':
            count += 1
            # get the third /
            if count == 3:
                return s

# get the index just after the last /
def getLast(text):
    for s in range(len(text) - 1, -1, -1):
        if text[s] == '/':
            # + 1 so the slice keeps the trailing /
            return s + 1

# pass each string in, so the indexes are computed for that string
wanted = mystring[getFirst(mystring):getLast(mystring)]
wanted = mystring2[getFirst(mystring2):getLast(mystring2)]
Gives:

Output:
wanted '/abc/'
or

Output:
wanted '/abc/def/ghi/'



RE: How do I parse the string? - menator01 - Apr-10-2024

Another way:
astring = 'FTP://21.105.28.15/abc/test1.txt'
astring2 = 'FTP://21.105.28.15/abc/def/test1.txt'

def remover(url):
    # split gives ['FTP:', '', '21.105.28.15', 'abc', 'test1.txt']
    parts = url.split('/')
    # drop the scheme, the empty item between the two slashes, and the host;
    # slicing avoids removing items from a list while iterating over it
    return '/' + '/'.join(parts[3:])

print(remover(astring))
print(remover(astring2))
Output:
/abc/test1.txt
/abc/def/test1.txt
A small adjustment can remove the file name at the end as well.
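One possible version of that adjustment, sketched under the assumption that the string always has a scheme, a host, and a file name as its last segment. The name remove_file is hypothetical and does not come from the post:

```python
def remove_file(url):
    # Split at most three times: scheme, empty item, host, rest of the path
    rest = url.split('/', 3)[3]              # e.g. 'abc/test1.txt'
    # Keep everything before the last '/', which drops the file name
    folders, _, _filename = rest.rpartition('/')
    return '/' + folders + '/' if folders else '/'


print(remove_file('FTP://21.105.28.15/abc/test1.txt'))
print(remove_file('FTP://21.105.28.15/abc/def/test1.txt'))
```

A string with no path after the host would raise an IndexError here, so real data may need a check first.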


RE: How do I parse the string? - DeaD_EyE - Apr-10-2024

If you need a path, a Path object may be what you want.

The code is based on Gribouillis's example:
from urllib.parse import urlparse
from pathlib import PurePosixPath


def url2path(url: str) -> PurePosixPath:
    return PurePosixPath(urlparse(url).path)


path = url2path('FTP://21.105.28.15/abc/test1.txt')

print("path:", path)
print("path.parent", path.parent)

print("path.parts", path.parts)
print("path.parents", list(path.parents))

print("path.name:", path.name)
print("path.stem:", path.stem)
print("path.suffix", path.suffix)


# Path objects are printed like strings,
# but they are a different type.
# Here is how to convert explicitly:
print("Path as str:", str(path))

# not all 3rd-party libraries handle the conversion from `Path` to `str` implicitly
Output:
path: /abc/test1.txt
path.parent /abc
path.parts ('/', 'abc', 'test1.txt')
path.parents [PurePosixPath('/abc'), PurePosixPath('/')]
path.name: test1.txt
path.stem: test1
path.suffix .txt
Path as str: /abc/test1.txt
This also works with non-FTP URLs:
urls = (
    'FTP://21.105.28.15/abc/test1.txt',
    'FTPS://21.105.28.15/abc/test1.txt',
    'HTTP://21.105.28.15/abc/test1.txt',
    'HTTPS://21.105.28.15/abc/test1.txt',
    'MyOwnUselessURL://xyz.de/abc/test1.txt',
)


for url in urls:
    print(url2path(url))
Boring output:
Output:
/abc/test1.txt
/abc/test1.txt
/abc/test1.txt
/abc/test1.txt
/abc/test1.txt
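To get back to the '/abc/' form from the original question, the parent of the path can be converted to a string and given a trailing slash. A rough sketch, assuming the URL path always ends in a file name; url2dir is a hypothetical helper name:

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def url2dir(url: str) -> str:
    # .parent drops the file name: /abc/test1.txt -> /abc
    parent = PurePosixPath(urlparse(url).path).parent
    # Path objects never keep a trailing slash, so add it on the string
    text = str(parent)
    return text if text.endswith('/') else text + '/'


print(url2dir('FTP://21.105.28.15/abc/test1.txt'))
print(url2dir('HTTPS://21.105.28.15/abc/def/ghi/test1.txt'))
```

The trailing slash only exists in the string form, so this returns a str rather than a PurePosixPath.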
Documents you should read: