Posts: 7
Threads: 7
Joined: Mar 2024
Apr-10-2024, 07:07 AM
(This post was last modified: Apr-10-2024, 07:32 AM by Gribouillis.)
Hello,
I have the string below:
FTP://21.105.28.15/abc/test1.txt
From this string I want to extract only the directory path:
/abc/
How do I parse the string?
import pymssql

conn = pymssql.connect('10.21.100.21', 'sa', 'abc', 'DBA', as_dict=False)
cur = conn.cursor()
cur.execute('select path From table')
fetch = cur.fetchall()
for i in fetch:
    # each row holds a URL such as FTP://21.105.28.15/abc/test1.txt;
    # I only want to get the path /abc/
    print(i)
conn.close()
Posts: 4,790
Threads: 76
Joined: Jan 2018
Apr-10-2024, 07:36 AM
(This post was last modified: Apr-10-2024, 07:37 AM by Gribouillis.)
You could use urllib.parse.urlparse
>>> from urllib.parse import urlparse
>>> urlparse('FTP://21.105.28.15/abc/test1.txt')
ParseResult(scheme='ftp', netloc='21.105.28.15', path='/abc/test1.txt', params='', query='', fragment='')
>>> result = urlparse('FTP://21.105.28.15/abc/test1.txt')
>>> result.path
'/abc/test1.txt'
>>>
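The question asks for the directory part (/abc/) rather than the full path. A minimal sketch of one way to get there from the urlparse result, assuming the path always uses forward slashes; the helper name url_directory is made up for this illustration:

```python
import posixpath
from urllib.parse import urlparse

def url_directory(url):
    """Return the directory part of the URL path, with a trailing slash."""
    path = urlparse(url).path
    directory = posixpath.dirname(path)
    # dirname('/abc/test1.txt') is '/abc'; add the trailing slash back,
    # but avoid producing '//' when the directory is already the root.
    return directory if directory.endswith('/') else directory + '/'

print(url_directory('FTP://21.105.28.15/abc/test1.txt'))  # /abc/
```

posixpath is used instead of os.path so the result does not depend on the operating system the script runs on.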
Posts: 1,094
Threads: 143
Joined: Jul 2017
Maybe like this, without using any modules: find the third / and the last /
mystring = 'FTP://21.105.28.15/abc/test1.txt'
mystring2 = 'FTP://21.105.28.15/abc/def/ghi/test1.txt'

# get the index of the third /
def getFirst(url):
    count = 0
    for s in range(len(url)):
        # skip the first 2 /
        if url[s] == '/':
            count += 1
            # get the third /
            if count == 3:
                return s

# get the index just past the last /
def getLast(url):
    for s in range(1, len(url) + 1):
        if url[-s] == '/':
            return len(url) - s + 1

wanted = mystring[getFirst(mystring):getLast(mystring)]
# or, for the longer URL:
wanted = mystring2[getFirst(mystring2):getLast(mystring2)]
Gives:
Output: wanted
'/abc/'
or
Output: wanted
'/abc/def/ghi/'
Posts: 1,145
Threads: 114
Joined: Sep 2019
Apr-10-2024, 10:23 AM
(This post was last modified: Apr-10-2024, 10:23 AM by menator01.)
Another way
astring = 'FTP://21.105.28.15/abc/test1.txt'
astring2 = 'FTP://21.105.28.15/abc/def/test1.txt'

def remover(astring):
    # split on '/': ['FTP:', '', '21.105.28.15', 'abc', 'test1.txt']
    parts = astring.split('/')
    # drop the scheme, the empty field and the host by slicing,
    # rather than removing items from the list while iterating over it
    return '/' + '/'.join(parts[3:])

print(remover(astring))
print(remover(astring2))
output
Output: /abc/test1.txt
/abc/def/test1.txt
A small adjustment can remove the file name at the end as well.
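That adjustment could look roughly like the sketch below. It assumes the URL always has the form scheme://host/..., and the name directory_only is invented for this example:

```python
def directory_only(url):
    # 'FTP://21.105.28.15/abc/test1.txt'.split('/') gives
    # ['FTP:', '', '21.105.28.15', 'abc', 'test1.txt']
    parts = url.split('/')
    # parts[3:-1] skips scheme, empty field and host, and drops the file name
    middle = parts[3:-1]
    if not middle:
        return '/'
    return '/' + '/'.join(middle) + '/'

print(directory_only('FTP://21.105.28.15/abc/test1.txt'))      # /abc/
print(directory_only('FTP://21.105.28.15/abc/def/test1.txt'))  # /abc/def/
```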
Posts: 2,125
Threads: 11
Joined: May 2017
Apr-10-2024, 10:26 AM
(This post was last modified: Apr-10-2024, 10:26 AM by DeaD_EyE.)
If you need a path, a Path object may be what you want.
The code is based on Gribouillis's example:
from urllib.parse import urlparse
from pathlib import PurePosixPath

def url2path(url: str) -> PurePosixPath:
    return PurePosixPath(urlparse(url).path)

path = url2path('FTP://21.105.28.15/abc/test1.txt')
print("path:", path)
print("path.parent", path.parent)
print("path.parts", path.parts)
print("path.parents", list(path.parents))
print("path.name:", path.name)
print("path.stem:", path.stem)
print("path.suffix", path.suffix)
# Path objects are printed as strings,
# but they are a different type.
# Here is how to convert explicitly:
print("Path as str:", str(path))
# Not all third-party libraries handle the conversion from `Path` to `str` implicitly.
Output: path: /abc/test1.txt
path.parent /abc
path.parts ('/', 'abc', 'test1.txt')
path.parents [PurePosixPath('/abc'), PurePosixPath('/')]
path.name: test1.txt
path.stem: test1
path.suffix .txt
Path as str: /abc/test1.txt
This also works with other non-FTP urls:
urls = (
    'FTP://21.105.28.15/abc/test1.txt',
    'FTPS://21.105.28.15/abc/test1.txt',
    'HTTP://21.105.28.15/abc/test1.txt',
    'HTTPS://21.105.28.15/abc/test1.txt',
    'MyOwnUselessURL://xyz.de/abc/test1.txt',
)
for url in urls:
    print(url2path(url))
Boring output:
Output: /abc/test1.txt
/abc/test1.txt
/abc/test1.txt
/abc/test1.txt
/abc/test1.txt
Posts: 2
Threads: 0
Joined: Jan 2025
Jan-20-2025, 08:11 PM
(This post was last modified: Jan-21-2025, 08:58 AM by Larz60+.)
mystring = 'FTP://21.105.28.15/abc/test1.txt'
mystring2 = 'FTP://21.105.28.15/abc/def/ghi/test1.txt'

# Get the index of the third /
def getFirst(url):
    count = 0
    for i, char in enumerate(url):
        if char == '/':
            count += 1
            if count == 3:
                return i
    return -1  # Return -1 if there's no third '/'

# Get the index of the last /
def getLast(url):
    return url.rfind('/')  # Use rfind to get the last occurrence of '/'

# Extract the substring between the third and the last '/'
def extract_substring(url):
    first_index = getFirst(url)
    last_index = getLast(url)
    if first_index != -1 and first_index < last_index:
        return url[first_index + 1:last_index]  # Skip the '/' itself
    return ""

# Extract and print the portions of the URLs
wanted1 = extract_substring(mystring)
wanted2 = extract_substring(mystring2)
print(wanted1)  # Output: abc
print(wanted2)  # Output: abc/def/ghi
I'm using this and it works for me.
Posts: 5
Threads: 0
Joined: Jan 2025
I define an "extract_path" function that uses a regular expression to find the part of the URL after "/abc/" and extract it.
import pymssql
import re

def extract_path(url):
    # Use regex to find the part after "/abc/"
    match = re.search(r'/abc/(.*)', url)
    if match:
        return match.group(1)
    return None

# Database connection
conn = pymssql.connect('10.21.100.21', 'sa', 'abc', 'DBA', as_dict=False)
cur = conn.cursor()
# Execute the query
cur.execute('SELECT path FROM table')
# Fetch all results
fetch = cur.fetchall()
# Process and print the results
for row in fetch:
    ftp_url = row[0]
    extracted_path = extract_path(ftp_url)
    if extracted_path:
        print(f"Original URL: {ftp_url}")
        print(f"Extracted path: {extracted_path}")
        print("---")
# Close the connection
conn.close()
Is this working at your end?
Posts: 2,125
Threads: 11
Joined: May 2017
Solving this kind of problem with regex is brute force. Sometimes it's good to take a step back and ask whether you can solve the problem differently.
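One way such a different approach might look, reusing urlparse and PurePosixPath from earlier in the thread. This is only a sketch: the rows list is made-up sample data shaped like the tuples cur.fetchall() returns, and directory_of is a hypothetical helper name.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical sample rows, shaped like the tuples cur.fetchall() returns
rows = [
    ('FTP://21.105.28.15/abc/test1.txt',),
    ('FTP://21.105.28.15/abc/def/ghi/test1.txt',),
]

def directory_of(url):
    # Parse the URL, then let PurePosixPath split off the file name
    return str(PurePosixPath(urlparse(url).path).parent)

for (url,) in rows:
    print(url, '->', directory_of(url))
```

Unlike a pattern with "/abc/" written into it, this does not depend on the directory name, so it keeps working when the stored paths change.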
Posts: 1
Threads: 0
Joined: Feb 2025
Feb-13-2025, 07:08 AM
(This post was last modified: Feb-13-2025, 07:59 AM by buran.)
(Jan-21-2025, 12:48 PM)DeaD_EyE Wrote: Solving this kind of problem with regex is brute force. Sometimes it's good to take a step back and ask whether you can solve the problem differently.
If it really works, there is no reason to refuse.