Posts: 5
Threads: 1
Joined: Nov 2020
I have a Windows file server (Win Server 2003 Standard x64 Edition) that contains csv files.
On a Debian machine (9.13 Stretch) I mounted a Windows share pointing to the above-mentioned machine by adding this line to /etc/fstab:
//192.168.254.10/DATA /mnt/fsdata cifs uid=postgres,username=<*user*>,password=<*pwd*>,iocharset=utf8,sec=ntlm 0 0
I am running a Python script on the Debian machine to check whether the csv files exist on the Windows machine.
On the Windows machine, when I “cut” or “drag” all the files out of the folder and then run Path.exists() on /mnt/fsdata/IT/Servers/PostGres/CDR/, the result is always True (incorrect).
On the Windows machine, when I delete all the files and then run Path.exists() on /mnt/fsdata/IT/Servers/PostGres/CDR/, the result is False (correct).
It seems that a cut or drag is not recognized on the Debian side, and Python still “sees” the files as being there.
This is the code excerpt:
#!/usr/bin/env python3
import pathlib

src = "/mnt/fsdata/IT/Servers/PostGres/CDR/"
dest = src + "procd/"
srcpath = pathlib.Path(src)
for file in list(srcpath.glob('*.csv')):
    destpath = pathlib.Path(dest + file.name)
    # check if file already exists in /procd folder:
    if destpath.exists():
        # something happens...
        pass
What can be done to avoid this?
I searched a lot online but found no answers.
Thank you
Posts: 12,028
Threads: 485
Joined: Sep 2016
Nov-26-2020, 10:43 PM
(This post was last modified: Nov-26-2020, 10:44 PM by Larz60+.)
pathlib will return the 'address' of where you want a file to be.
To see if the file is there, use exists().
example (untested):
from pathlib import Path

homepath = Path('.')
datapath = homepath / 'data'
datapath.mkdir(exist_ok=True)  # make data directory (only if it is not already there)
myfile = datapath / 'myfile.txt'
# check if file exists:
if myfile.exists():
    with myfile.open() as fp:
        data = fp.read()  # read through the open file object, not the Path
else:
    print("myfile.txt does not exist")
Posts: 4,786
Threads: 76
Joined: Jan 2018
Nov-26-2020, 10:47 PM
(This post was last modified: Nov-26-2020, 10:48 PM by Gribouillis.)
I suspect it has something to do with the way samba servers work, or samba clients. Some answers on the web, such as this one indicate that it could be a cache problem. I'm not proficient in these samba issues, but you can perhaps either fine tune the configuration of the shared directory on the Windows side or find a way for the Linux client to force the server to reread the contents of the directory. This link could be helpful too.
Posts: 5
Threads: 1
Joined: Nov 2020
(Nov-26-2020, 10:47 PM)Gribouillis Wrote: I suspect it has something to do with the way samba servers work, or samba clients. Some answers on the web, such as this one indicate that it could be a cache problem. I'm not proficient in these samba issues, but you can perhaps either fine tune the configuration of the shared directory on the Windows side or find a way for the Linux client to force the server to reread the contents of the directory. This link could be helpful too.
I added cache=none in /etc/fstab and restarted the machine. Nothing changed.
My opinion is that the problem is on the Python side, because when I ls the folder it is empty. After that, running the script still reports destpath.exists() as True.
I added the following lines to try to open the files:
with destpath.open() as f:
    print("File name: " + str(destpath))
It returns an error:
Error: An error occurred: [Errno 2] No such file or directory: '/mnt/fsdata/IT/Servers/PostGres/CDR/procd/Trunks-2020-11-01.csv'
Could it be a bug in python? Who can I contact for that?
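One thing that can be checked from Python is whether the cache=none option from /etc/fstab is actually in effect on the mounted share. The following is a minimal sketch, assuming a Linux client where /proc/mounts is readable; the helper name mount_options is hypothetical, and the mount point /mnt/fsdata is taken from the fstab line in the first post.

```python
from pathlib import Path


def mount_options(mount_point, mounts_text):
    """Return the option list for mount_point from /proc/mounts-style text.

    Returns None if the mount point is not listed. If it is listed more
    than once, the last entry (the most recent mount) is returned.
    """
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, filesystem type, options, dump, pass
        if len(fields) >= 4 and fields[1] == mount_point:
            found = fields[3].split(",")
    return found


if __name__ == "__main__":
    mounts = Path("/proc/mounts")
    if mounts.exists():
        # Prints the effective options, or None if the share is not mounted
        print(mount_options("/mnt/fsdata", mounts.read_text()))
```

If cache=none is missing from the printed list, the fstab change did not reach the kernel; if it is present, caching controlled by that option is less likely to be the cause.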
Posts: 4,786
Threads: 76
Joined: Jan 2018
You could perhaps first perform a os.stat() call on the file and print all the fields of the returned stat_result object to see what it contains. A bug in Python is by far the least plausible explanation.
Posts: 5
Threads: 1
Joined: Nov 2020
The stats are as follows:
Output:
os.stat_result(st_mode=33261, st_ino=1407374883714381, st_dev=41, st_nlink=1, st_uid=118, st_gid=0, st_size=1621, st_atime=1606730312, st_mtime=1606730312, st_ctime=1606730608)
os.stat_result(st_mode=33261, st_ino=1125899907003726, st_dev=41, st_nlink=1, st_uid=118, st_gid=0, st_size=6249, st_atime=1606730312, st_mtime=1606730312, st_ctime=1606730608)
os.stat_result(st_mode=33261, st_ino=1125899907003737, st_dev=41, st_nlink=1, st_uid=118, st_gid=0, st_size=8594, st_atime=1606730312, st_mtime=1606730312, st_ctime=1606730608)
etc...
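For readers who want to interpret these numbers, here is a small sketch that decodes the mode and timestamps of the first stat_result posted above using only the standard library. The helper name utc is hypothetical; the values are copied from the output.

```python
import stat
from datetime import datetime, timezone

# Values copied from the first os.stat_result in the post
st_mode = 33261
st_mtime = 1606730312
st_ctime = 1606730608


def utc(timestamp):
    """Format a Unix timestamp as a UTC date and time."""
    moment = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


print(stat.S_ISREG(st_mode))   # True: the entry is reported as a regular file
print(stat.filemode(st_mode))  # -rwxr-xr-x
print(utc(st_mtime))           # 2020-11-30 09:58:32
print(utc(st_ctime))           # 2020-11-30 10:03:28
```

So the client reports a regular file with full metadata, and a change time about five minutes after the modification time. The output alone does not show whether these values came from the server or from a client-side cache.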
Posts: 2,125
Threads: 11
Joined: May 2017
Nov-30-2020, 12:28 PM
(This post was last modified: Nov-30-2020, 12:28 PM by DeaD_EyE.)
This kind of check very often leads to problems, more often than you might think.
# check if file exists:
if myfile.exists():  # <--- file could exist at this moment
    # <--- maybe the file is now deleted, has changed permission or something else.
    with myfile.open() as fp:  # <-- will definitely raise an Exception if there is a problem
        data = fp.read()  # <-- could also raise an Exception
Don't ask for permission, ask for forgiveness:
from pathlib import Path
...
myfile = Path("C:")
# try other not working Paths
...
try:
    with myfile.open() as fp:
        data = fp.read()
except FileNotFoundError:
    print("File not found")
except PermissionError:
    print("I do not have the permission to access", myfile)
except UnicodeDecodeError:
    print("Could not decode UTF-8. Binary file or wrong encoding?")
Output:
I do not have the permission to access C:
Try this with binary files and you'll get a UnicodeDecodeError.
Try this with your Samba share. It should raise FileNotFoundError, but maybe it's an OSError.
Just try it and observe which Exception you get.
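To observe which exception actually comes back, a catch-all on OSError that reports the concrete class and errno may be convenient. This is only a sketch: the helper name probe is hypothetical, and the file path is the one from the error message earlier in the thread. The file is opened in binary mode so that decoding problems cannot mask the result.

```python
import errno
from pathlib import Path


def probe(path):
    """Try to open path and describe the outcome instead of raising."""
    try:
        with Path(path).open("rb") as fp:
            fp.read(1)
        return "ok"
    except OSError as exc:
        # FileNotFoundError and PermissionError are subclasses of OSError,
        # so this reports whichever concrete class was raised.
        code = errno.errorcode.get(exc.errno, exc.errno)
        return "{} ({})".format(type(exc).__name__, code)


print(probe("/mnt/fsdata/IT/Servers/PostGres/CDR/procd/Trunks-2020-11-01.csv"))
```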
Posts: 4,786
Threads: 76
Joined: Jan 2018
What happens if you try os.listdir(...) or list(os.scandir(...)) and if you mix calls to subprocess.check_output(['ls', ...]) or subprocess.check_output("ls ...", shell=True) in between?
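A rough sketch of that comparison, assuming a Linux client with ls on the PATH. The function names compare_listings and exists_but_unlisted are hypothetical. On a local filesystem the three listings should agree and the second function should return an empty list; any difference on the share would point at the client's view of the directory rather than at pathlib.

```python
import os
import subprocess


def compare_listings(directory):
    """Return the directory's entries as seen by three different mechanisms."""
    via_listdir = sorted(os.listdir(directory))
    with os.scandir(directory) as entries:
        via_scandir = sorted(entry.name for entry in entries)
    # 'ls -1A' prints one entry per line, including dotfiles but not . and ..
    output = subprocess.check_output(
        ["ls", "-1A", directory], universal_newlines=True
    )
    via_ls = sorted(output.splitlines())
    return {"listdir": via_listdir, "scandir": via_scandir, "ls": via_ls}


def exists_but_unlisted(directory, names):
    """Names that os.path.exists() reports but the directory listing omits."""
    listed = set(os.listdir(directory))
    return [
        name
        for name in names
        if name not in listed and os.path.exists(os.path.join(directory, name))
    ]
```

For example, calling exists_but_unlisted("/mnt/fsdata/IT/Servers/PostGres/CDR/procd", ["Trunks-2020-11-01.csv"]) would show whether that file is one that exists() still reports although the listing no longer contains it.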
Posts: 5
Threads: 1
Joined: Nov 2020
I tried your suggestion:
from pathlib import Path

src = "/mnt/fsdata/IT/Servers/PostGres/CDR/"
dest = src + "procd"
# define the path
currentDirectory = Path(src)
currentPattern = "*.csv"
destDirectory = Path(dest)
for currentFile in currentDirectory.glob(currentPattern):
    destFile = destDirectory / currentFile.name
    if destFile.exists():
        try:
            with destFile.open() as fp:
                data = fp.read()
                print(data)
        except FileNotFoundError:
            print("File not found")
        except PermissionError:
            print("I do not have the permission to access", destFile)
        except UnicodeDecodeError:
            print("Could not decode UTF-8. Binary file or wrong encoding?")
I get File not found.
I am the only user manipulating files in these folders. It is a test environment.
What am I doing wrong?
Posts: 5
Threads: 1
Joined: Nov 2020
DeaD_EyE
OK, I tried your "Don't ask for permission, ask for forgiveness"-approach and rewrote my script:
#!/usr/bin/env python3.8
import pathlib
from pathlib import Path

src = "/mnt/fsdata/IT/Servers/PostGres/CDR/"
dest = src + "procd"
dup = src + "dup"
# define the paths:
currentDirectory = pathlib.Path(src)
destDirectory = pathlib.Path(dest)
dupDirectory = pathlib.Path(dup)
currentPattern = "*.csv"
for currentFile in currentDirectory.glob(currentPattern):
    destFile = destDirectory / currentFile.name
    try:
        with destFile.open() as fp:
            data = fp.read()
    except IOError:
        currentFile.rename(destFile)
        print('File moved to procd.')
        continue
    # Move file to dup folder:
    duppath = dupDirectory / currentFile.name
    currentFile.rename(duppath)
    print('File moved to dup.')
print("Finished !")
When I upload files to /mnt/fsdata/IT/Servers/PostGres/CDR everything works fine.
Files are moved to /procd.
Now, when I drag those files from /procd back to /mnt/fsdata/IT/Servers/PostGres/CDR, nothing happens. The files remain where they are. It is as if the drag and drop never happened! (Note that they are not moved to /dup either.)
What I basically want to achieve is this: I import files into the /CDR folder and check whether they are already in /procd. If they are, they are moved to /dup. Otherwise they are imported into a database and then moved to /procd.
What's wrong?
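The workflow described here could be sketched as follows. This is an illustration under stated assumptions, not a confirmed fix: the function name sort_incoming and the import_file hook are hypothetical, and the duplicate check uses a single listing of /procd instead of per-file exists() calls. Whether that avoids the stale results on the CIFS mount has not been tested.

```python
from pathlib import Path


def sort_incoming(src, procd, dup, pattern="*.csv", import_file=None):
    """Move new files to procd and already-processed ones to dup.

    Returns a dict mapping each file name to "procd" or "dup".
    """
    src, procd, dup = Path(src), Path(procd), Path(dup)
    procd.mkdir(exist_ok=True)
    dup.mkdir(exist_ok=True)
    # Snapshot the names already processed from one directory listing
    processed = {entry.name for entry in procd.iterdir()}
    moved = {}
    # sorted() materializes the glob, so renaming inside the loop is safe
    for current in sorted(src.glob(pattern)):
        if current.name in processed:
            target = dup / current.name
            moved[current.name] = "dup"
        else:
            if import_file is not None:
                import_file(current)  # hypothetical database import hook
            target = procd / current.name
            moved[current.name] = "procd"
        # On POSIX, rename silently replaces an existing file at target
        current.rename(target)
    return moved
```

With the paths from the script it would be called as sort_incoming(src, dest, dup). Passing a callable as import_file runs the database import before a new file is moved to /procd.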