Posts: 7
Threads: 7
Joined: Mar 2024
Apr-10-2024, 07:07 AM
(This post was last modified: Apr-10-2024, 07:32 AM by Gribouillis.)
Hello,
I have the string below:
FTP://21.105.28.15/abc/test1.txt
I want to cut it down to
/abc/
I just want to get the path.
How do I parse the string?
import pymssql

conn = pymssql.connect('10.21.100.21', 'sa', 'abc', 'DBA', as_dict=False)
cur = conn.cursor()
cur.execute('select path From table')
fetch = cur.fetchall()
for i in fetch:
    print(i)  # <==== FTP://21.105.28.15/abc/test1.txt (cut this string to /abc/; I just want to get the path)
conn.close()
Posts: 4,786
Threads: 76
Joined: Jan 2018
Apr-10-2024, 07:36 AM
(This post was last modified: Apr-10-2024, 07:37 AM by Gribouillis.)
You could use urllib.parse.urlparse
>>> from urllib.parse import urlparse
>>> result = urlparse('FTP://21.105.28.15/abc/test1.txt')
>>> result
ParseResult(scheme='ftp', netloc='21.105.28.15', path='/abc/test1.txt', params='', query='', fragment='')
>>> result.path
'/abc/test1.txt'
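For the directory part alone (the /abc/ asked for in the first post), one possible follow-up is to split the parsed path at its last slash. This is a minimal sketch, assuming the URL has a file name after the final slash; directory_of is a hypothetical helper name, not part of urllib.

```python
from urllib.parse import urlparse


def directory_of(url):
    """Return the directory part of the URL path, with a trailing slash."""
    path = urlparse(url).path          # e.g. '/abc/test1.txt'
    head, _, _ = path.rpartition('/')  # drop everything after the last '/'
    return head + '/'


print(directory_of('FTP://21.105.28.15/abc/test1.txt'))  # /abc/
```

Using rpartition keeps the leading slash and works for any directory depth, since only the text after the last slash is discarded.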
Posts: 1,093
Threads: 143
Joined: Jul 2017
Maybe like this, without using any modules: find the third / and the last /
mystring = 'FTP://21.105.28.15/abc/test1.txt'
# Assumed second example; its definition did not survive in the post
mystring2 = 'FTP://21.105.28.15/abc/def/ghi/test1.txt'


def getFirst(text):
    # Return the index of the third '/'
    count = 0
    for s in range(1, len(text)):
        if text[s] == '/':
            count += 1
            if count == 3:
                return s


def getLast(text):
    # Walk backwards to the last '/' and return the index just after it
    for s in range(1, len(text)):
        if text[-s] == '/':
            return len(text) - s + 1


wanted = mystring[getFirst(mystring):getLast(mystring)]
wanted = mystring2[getFirst(mystring2):getLast(mystring2)]
Gives:
Output: wanted
'/abc/'
or
Output: wanted
'/abc/def/ghi/'
Posts: 1,145
Threads: 114
Joined: Sep 2019
Apr-10-2024, 10:23 AM
(This post was last modified: Apr-10-2024, 10:23 AM by menator01.)
Another way
astring = 'FTP://21.105.28.15/abc/test1.txt'
# Assumed second example; only its path is known from the output
astring2 = 'FTP://21.105.28.15/abc/def/test1.txt'


def remover(astring):
    parts = astring.split('/')
    # parts[0] is the scheme, parts[1] is '' (between the two slashes),
    # parts[2] is the host; everything after that is the path
    return '/' + '/'.join(parts[3:])


print(remover(astring))
print(remover(astring2))
output
Output: /abc/test1.txt
/abc/def/test1.txt
A small adjustment can remove the file name at the end as well.
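The adjustment mentioned above could look like the following. This is only a sketch: remove_file is a hypothetical name, and it assumes the URL has the form scheme://host/dir/.../file, so that the first three pieces of the split are the scheme, an empty string and the host.

```python
def remove_file(url):
    """Return only the directory part of the URL, with surrounding slashes."""
    parts = url.split('/')
    # parts[:3] -> scheme, '', host; parts[-1] -> file name
    middle = parts[3:-1]
    if not middle:
        return '/'
    return '/' + '/'.join(middle) + '/'


print(remove_file('FTP://21.105.28.15/abc/test1.txt'))  # /abc/
```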
Posts: 2,125
Threads: 11
Joined: May 2017
Apr-10-2024, 10:26 AM
(This post was last modified: Apr-10-2024, 10:26 AM by DeaD_EyE.)
If you need a path, a Path object may be what you want.
The code is based on Gribouillis's example:
from urllib.parse import urlparse
from pathlib import PurePosixPath


def url2path(url: str) -> PurePosixPath:
    return PurePosixPath(urlparse(url).path)


path = url2path("FTP://21.105.28.15/abc/test1.txt")
print("path:", path)
print("path.parent", path.parent)
print("path.parts", path.parts)
print("path.parents", list(path.parents))
print("path.name:", path.name)
print("path.stem:", path.stem)
print("path.suffix", path.suffix)
print("Path as str:", str(path))
Output: path: /abc/test1.txt
path.parent /abc
path.parts ('/', 'abc', 'test1.txt')
path.parents [PurePosixPath('/abc'), PurePosixPath('/')]
path.name: test1.txt
path.stem: test1
path.suffix .txt
Path as str: /abc/test1.txt
This also works with other, non-FTP URLs:
urls = (
    # (the URLs from the original post were not preserved)
)
for url in urls:
    print(url2path(url))
Boring output:
Output: /abc/test1.txt
/abc/test1.txt
/abc/test1.txt
/abc/test1.txt
/abc/test1.txt
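To illustrate with concrete values, here is a small sketch using made-up URLs. The host example.com, the port and the query string are assumptions for illustration, not the URLs from the original post. It also prints parent, which is the directory the first post asked for.

```python
from urllib.parse import urlparse
from pathlib import PurePosixPath


def url2path(url: str) -> PurePosixPath:
    # Only the path component of the URL is kept
    return PurePosixPath(urlparse(url).path)


# Hypothetical URLs with different schemes but the same path
urls = (
    "ftp://example.com/abc/test1.txt",
    "http://example.com/abc/test1.txt",
    "https://example.com:8443/abc/test1.txt?x=1",
)
for url in urls:
    path = url2path(url)
    print(path, path.parent)
```

Each line should show the same path and parent, because the scheme, port and query string are split off by urlparse before the path is wrapped.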
Posts: 2
Threads: 0
Joined: Jan 2025
Jan-20-2025, 08:11 PM
(This post was last modified: Jan-21-2025, 08:58 AM by Larz60+.)
# The post does not define the inputs; these values are assumed from the thread
mystring = 'FTP://21.105.28.15/abc/test1.txt'
mystring2 = 'FTP://21.105.28.15/abc/def/ghi/test1.txt'


def getFirst(url):
    # Index of the third '/', or -1 if there are fewer than three
    count = 0
    for i, char in enumerate(url):
        if char == '/':
            count += 1
            if count == 3:
                return i
    return -1


def getLast(url):
    # Index of the last '/'
    return url.rfind('/')


def extract_substring(url):
    # Text between the third and the last '/', without the slashes themselves
    first_index = getFirst(url)
    last_index = getLast(url)
    if first_index != -1 and first_index < last_index:
        return url[first_index + 1:last_index]
    return ""


wanted1 = extract_substring(mystring)
wanted2 = extract_substring(mystring2)
print(wanted1)
print(wanted2)
I'm using this and it works for me.
Larz60+ wrote Jan-21-2025, 08:58 AM: Please post all code, output and errors (in their entirety) between their respective tags. Refer to the BBCode help topic on how to post. Use the "Preview Post" button to make sure the code is presented as you expect before hitting the "Post Reply/Thread" button.
Tags have been added. Please use BBCode tags on future posts.
Posts: 5
Threads: 0
Joined: Jan 2025
I define an "extract_path" function that uses a regular expression to find the part of the URL after "/abc/" and extract it.
import pymssql
import re


def extract_path(url):
    match = re.search(r'/abc/(.*)', url)
    if match:
        return match.group(1)
    return None


conn = pymssql.connect('10.21.100.21', 'sa', 'abc', 'DBA', as_dict=False)
cur = conn.cursor()
cur.execute('SELECT path FROM table')
fetch = cur.fetchall()
for row in fetch:
    ftp_url = row[0]
    extracted_path = extract_path(ftp_url)
    if extracted_path:
        print(f"Original URL: {ftp_url}")
        print(f"Extracted path: {extracted_path}")
        print("---")
conn.close()
Is this working at your end?
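Note that the pattern above is tied to the literal /abc/ and returns what follows it (the file name). If the goal is the directory part of any URL, a regular expression could instead anchor on the scheme and host. The following is a sketch only: the pattern and the name extract_dir are assumptions for illustration, and it ignores edge cases such as URLs without a host.

```python
import re

# scheme "://" host, then capture '/' plus every complete 'segment/' after it
_DIR_RE = re.compile(r'^[A-Za-z][A-Za-z0-9+.-]*://[^/?#]*(/(?:[^/?#]*/)*)')


def extract_dir(url):
    """Return the directory part of the URL path, or None if nothing matches."""
    match = _DIR_RE.match(url)
    return match.group(1) if match else None


print(extract_dir('FTP://21.105.28.15/abc/test1.txt'))  # /abc/
```

The capture group stops before the final segment because that segment has no trailing slash, so the file name is left out.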
Posts: 2,125
Threads: 11
Joined: May 2017
Solving this kind of problem with regex is brute-force. Sometimes it's good to take a step back and ask if you can solve the problem differently.
Posts: 1
Threads: 0
Joined: Feb 2025
Feb-13-2025, 07:08 AM
(This post was last modified: Feb-13-2025, 07:59 AM by buran.)
(Jan-21-2025, 12:48 PM)DeaD_EyE Wrote: Solving this kind of problem with regex is brute-force. Sometimes it's good to take a step back and ask if you can solve the problem differently.
If it really works, there is no reason to refuse.
buran wrote Feb-13-2025, 07:59 AM: Spam link removed