Hello,
I have the string below:
FTP://21.105.28.15/abc/test1.txt
I want to cut out just this part:
/abc/
That is, I only want the directory path.
How do I parse the string?
import pymssql

conn = pymssql.connect('10.21.100.21', 'sa', 'abc', 'DBA', as_dict=False)
cur = conn.cursor()
cur.execute('select path From table')
fetch = cur.fetchall()
for i in fetch:
    # each row holds FTP://21.105.28.15/abc/test1.txt
    # I just want to get the path: /abc/
    print(i)
conn.close()
You could use urllib.parse.urlparse:
>>> from urllib.parse import urlparse
>>> urlparse('FTP://21.105.28.15/abc/test1.txt')
ParseResult(scheme='ftp', netloc='21.105.28.15', path='/abc/test1.txt', params='', query='', fragment='')
>>> result = urlparse('FTP://21.105.28.15/abc/test1.txt')
>>> result.path
'/abc/test1.txt'
>>>
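The parsed path above still ends with the file name, while the question asks for '/abc/'. A minimal sketch of one way to keep only the directory part, assuming every URL ends in a file name; `directory_of` is a hypothetical helper name, not part of urllib:

```python
from urllib.parse import urlparse
import posixpath


def directory_of(url):
    """Return the directory part of the URL path, with a trailing slash."""
    path = urlparse(url).path            # e.g. '/abc/test1.txt'
    directory = posixpath.dirname(path)  # e.g. '/abc'
    # dirname drops the trailing slash (except for the root), so add it back
    return directory if directory.endswith('/') else directory + '/'


print(directory_of('FTP://21.105.28.15/abc/test1.txt'))  # /abc/
```

posixpath is used rather than os.path so the result does not depend on the operating system the script runs on.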
Maybe like this, without using any modules: find the third / and the last /
mystring = 'FTP://21.105.28.15/abc/test1.txt'
mystring2 = 'FTP://21.105.28.15/abc/def/ghi/test1.txt'

# get the index of the third /
def getFirst(text):
    count = 0
    for s in range(len(text)):
        if text[s] == '/':
            count += 1
            # the first 2 / belong to '://', the third one starts the path
            if count == 3:
                return s

# get the index just past the last /
def getLast(text):
    for s in range(len(text) - 1, -1, -1):
        if text[s] == '/':
            return s + 1

wanted = mystring[getFirst(mystring):getLast(mystring)]
# or, for the second string:
wanted = mystring2[getFirst(mystring2):getLast(mystring2)]
Gives:
Output:
wanted
'/abc/'
or, for mystring2:
Output:
wanted
'/abc/def/ghi/'
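The same idea (third / to last /) can probably be written more compactly with the built-in string methods str.find and str.rfind. This is only a sketch: `cut_path` is a hypothetical name, and it assumes the string always contains '://'.

```python
def cut_path(url):
    """Return the text from the third '/' up to and including the last '/'."""
    # start searching after '://' so the two slashes of the scheme are skipped
    host_start = url.find('://') + 3
    first = url.find('/', host_start)  # the slash that starts the path
    if first == -1:
        return ''                      # no path after the host
    last = url.rfind('/')              # the slash in front of the file name
    return url[first:last + 1]


print(cut_path('FTP://21.105.28.15/abc/test1.txt'))          # /abc/
print(cut_path('FTP://21.105.28.15/abc/def/ghi/test1.txt'))  # /abc/def/ghi/
```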
Another way, splitting on / and dropping the scheme and host:
astring = 'FTP://21.105.28.15/abc/test1.txt'
astring2 = 'FTP://21.105.28.15/abc/def/test1.txt'

def remover(astring):
    parts = astring.split('/')
    # parts[0] is 'FTP:', parts[1] is '' (between the two slashes),
    # parts[2] is the host; everything after that is the path
    return '/' + '/'.join(parts[3:])

print(remover(astring))
print(remover(astring2))
Output:
/abc/test1.txt
/abc/def/test1.txt
A small adjustment can remove the file name at the end as well.
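One possible version of that adjustment, as a sketch: slice off the last element of the split list too. The name `remover_without_file` is made up for illustration, and the code assumes the URL always ends in a file name.

```python
def remover_without_file(astring):
    """Drop scheme, host and file name; keep only the folders."""
    parts = astring.split('/')
    # parts[0:3] are 'FTP:', '' and the host; parts[-1] is the file name
    folders = parts[3:-1]
    if not folders:
        return '/'  # the file sits directly under the root
    return '/' + '/'.join(folders) + '/'


print(remover_without_file('FTP://21.105.28.15/abc/test1.txt'))      # /abc/
print(remover_without_file('FTP://21.105.28.15/abc/def/test1.txt'))  # /abc/def/
```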
If you need a path, a Path object may be what you want.
The code is based on Gribouillis' example:
from urllib.parse import urlparse
from pathlib import PurePosixPath

def url2path(url: str) -> PurePosixPath:
    return PurePosixPath(urlparse(url).path)

path = url2path('FTP://21.105.28.15/abc/test1.txt')
print("path:", path)
print("path.parent", path.parent)
print("path.parts", path.parts)
print("path.parents", list(path.parents))
print("path.name:", path.name)
print("path.stem:", path.stem)
print("path.suffix", path.suffix)
# Path objects print like strings, but they are a different type.
# Here is how to convert one explicitly:
print("Path as str:", str(path))
# Not all third-party libraries convert `Path` to `str` implicitly.
Output:
path: /abc/test1.txt
path.parent /abc
path.parts ('/', 'abc', 'test1.txt')
path.parents [PurePosixPath('/abc'), PurePosixPath('/')]
path.name: test1.txt
path.stem: test1
path.suffix .txt
Path as str: /abc/test1.txt
This also works with other URL schemes, not only FTP:
urls = (
    'FTP://21.105.28.15/abc/test1.txt',
    'FTPS://21.105.28.15/abc/test1.txt',
    'HTTP://21.105.28.15/abc/test1.txt',
    'HTTPS://21.105.28.15/abc/test1.txt',
    'MyOwnUselessURL://xyz.de/abc/test1.txt',
)
for url in urls:
    print(url2path(url))
Boring output:
Output:
/abc/test1.txt
/abc/test1.txt
/abc/test1.txt
/abc/test1.txt
/abc/test1.txt
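To tie this back to the original question, here is a rough sketch of applying the pathlib approach to rows coming from a database cursor. The `rows` list is a made-up stand-in for cur.fetchall() (with as_dict=False each row is a tuple), and `folder_of` is a hypothetical helper; it assumes every URL ends in a file name.

```python
from urllib.parse import urlparse
from pathlib import PurePosixPath

# stand-in for cur.fetchall(): one tuple per row, one column per tuple (assumption)
rows = [
    ('FTP://21.105.28.15/abc/test1.txt',),
    ('FTP://21.105.28.15/abc/def/ghi/test1.txt',),
]


def folder_of(url):
    """Return the folder part of the URL path, with a trailing slash."""
    parent = str(PurePosixPath(urlparse(url).path).parent)
    # .parent has no trailing slash unless it is the root, so add one
    return parent if parent.endswith('/') else parent + '/'


folders = [folder_of(row[0]) for row in rows]
print(folders)  # ['/abc/', '/abc/def/ghi/']
```

Note the row[0]: printing the row itself would show a tuple, not the bare string.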
Documents you should read: