Python Forum
Access my webpage and download files from Python
#1
I have a GoDaddy webpage. I don't use it much, mainly just for trying things out.

I have managed enough PHP to get files uploaded to my webhost from a webpage. The user just clicks a button to select a file, then clicks the upload button.

The PHP to validate the files, pack them in an email and send them seems quite complex. Too much for me! I need to install PHPMailer or Pear on the server and do all kinds of complicated stuff it seems. Python to the rescue!

I found I can get a file simply like this:

import requests

# URL of the file to be downloaded
file_url = "http://www.mywebpage.com/php/uploads/chineseYearAnimals.txt"

# Send an HTTP GET request; r is the response object
r = requests.get(file_url)
r.raise_for_status()  # stop here on a 4xx/5xx reply instead of saving an error page

# Write the body of the response (r.content) to a new file in binary mode.
# The with block closes the file on exit, so no f.close() is needed.
with open("/home/pedro/Downloads/download1.txt", "wb") as f:
    f.write(r.content)
Could you please help me modify this?

I want to

1. download all files in the path,
2. save them under whatever name they already have,
3. then delete them on the web server.

On my laptop, when I want to do something with all files in a directory, I use:

files = os.listdir(path)

and then:

for file in files:
    do tricky stuff
I need the web version of that.
#2
Have you managed to SSH into your server from the command line?
Once you have SSH access, it is normal to use commands like scp, rm or rsync to work with files and folders.

Pedroski55 Wrote:1. download all files in the path,
The -r option recursively copies entire directories:
scp -r [email protected]:/path/to/foo /home/user/Desktop/
If copying to Windows, /c is the C:\ drive:
scp -r [email protected]:/path/to/foo /c/tmp
Pedroski55 Wrote:3. then delete them on the web server.
rm /path/to/directory/*
These are examples of usage on a server over SSH. You can do the same kind of thing from Python
with a project like Fabric, Paramiko or Ansible.
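A minimal sketch of the Paramiko route, assuming SSH/SFTP login works and the remote directory holds only regular files (no sub-directories). The host name, user name, password and the function names fetch_and_delete and main are placeholders made up for illustration; the paths are the ones from the scp example. Each file is deleted only after its download has finished.

```python
import posixpath
from pathlib import Path


def fetch_and_delete(sftp, remote_dir, local_dir):
    """Download every file in remote_dir under its own name, then delete it remotely.

    sftp is expected to behave like paramiko.SFTPClient
    (only listdir, get and remove are used).
    """
    local_dir = Path(local_dir)
    local_dir.mkdir(parents=True, exist_ok=True)
    fetched = []
    for name in sftp.listdir(remote_dir):
        remote_path = posixpath.join(remote_dir, name)
        sftp.get(remote_path, str(local_dir / name))  # keep the original file name
        sftp.remove(remote_path)  # reached only if the download did not raise
        fetched.append(name)
    return fetched


def main():
    import paramiko  # third-party: pip install paramiko

    client = paramiko.SSHClient()
    client.load_system_host_keys()  # trust hosts already listed in known_hosts
    # Placeholder credentials: replace with your own host, user and password.
    client.connect("example.com", username="user", password="secret")
    try:
        sftp = client.open_sftp()
        print(fetch_and_delete(sftp, "/path/to/foo", "/home/user/Desktop/foo"))
    finally:
        client.close()
```

Calling main() runs the whole thing; fetch_and_delete is kept separate so it can be tried out against a test directory first.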
#3
No, I can't ssh into the webpage. I have 2 passwords.

1. The GoDaddy login password. If I use that, the connection is immediately closed.
2. The cPanel login password. If I use that, it just asks me to enter the password again.

I also tried the RSA key fingerprint, but that also is not accepted.

pedro@pedro-newssd:~$ ssh www.mywebpage.com
[email protected]'s password:
Permission denied, please try again.
[email protected]'s password:
Permission denied, please try again.
[email protected]'s password:

Do I really need ssh?
#4
Maybe you can help me with this. Using requests and BeautifulSoup I can get this text:

>>> soup = BeautifulSoup(requests.get(file_url).text)
>>> soup
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<html>
<head>
<title>Index of /php/uploads</title>
</head>
<body>
<h1>Index of /php/uploads</h1>
<ul><li><a href="/php/"> Parent Directory</a></li>
<li><a href="chineseYearAnimals.txt"> chineseYearAnimals.txt</a></li>
<li><a href="chineseYearAnimals_gapWords.xlsx"> chineseYearAnimals_gapWords.xlsx</a></li>
<li><a href="chineseYearAnimals_gapWords.xlsx.data"> chineseYearAnimals_gapWords.xlsx.data</a></li>
<li><a href="chineseYearAnimals_gapped.txt"> chineseYearAnimals_gapped.txt</a></li>
<li><a href="cloze2.txt"> cloze2.txt</a></li>
<li><a href="cloze3.txt"> cloze3.txt</a></li>
<li><a href="cloze4HiddenRules.txt"> cloze4HiddenRules.txt</a></li>
<li><a href="cloze4HiddenRules.txtnoPrepos"> cloze4HiddenRules.txtnoPrepos</a></li>
<li><a href="cloze4HiddenRules_again.txt"> cloze4HiddenRules_again.txt</a></li>
<li><a href="cloze4HiddenRules_gapped.txt"> cloze4HiddenRules_gapped.txt</a></li>
<li><a href="cloze4HiddenRules_gapped.txt.data"> cloze4HiddenRules_gapped.txt.data</a></li>
</ul>
</body></html>

>>>

So, I just need to get the file names from this and I have what I want.

Can I do this with Regex??
#5
(May-26-2019, 05:24 AM)Pedroski55 Wrote: Do I really need ssh?
You can do this from cPanel, but it's good to learn to use SSH from the command line, and the Python projects (links) I mentioned need an SSH login.
Enable SSH and SSH connect.
I would suggest a better host, in general and for Python, e.g. Digital Ocean. There is no cPanel (crap) there, and you need to use SSH.
#6
(May-26-2019, 08:18 AM)Pedroski55 Wrote: So, I just need to get the file names from this and I have what I want.

Can I do this with Regex??
You are already using BeautifulSoup, so there is no need for regex, which is a bad fit for HTML/XML.
from bs4 import BeautifulSoup

html = '''\
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<html>
<head>
  <title>Index of /php/uploads</title>
</head>
<body>
  <h1>Index of /php/uploads</h1>
  <ul>
    <li><a href="/php/"> Parent Directory</a></li>
    <li><a href="chineseYearAnimals.txt"> chineseYearAnimals.txt</a></li>
    <li><a href="chineseYearAnimals_gapWords.xlsx"> chineseYearAnimals_gapWords.xlsx</a></li>
    <li><a href="chineseYearAnimals_gapWords.xlsx.data"> chineseYearAnimals_gapWords.xlsx.data</a></li>
    <li><a href="chineseYearAnimals_gapped.txt"> chineseYearAnimals_gapped.txt</a></li>
    <li><a href="cloze2.txt"> cloze2.txt</a></li>
    <li><a href="cloze3.txt"> cloze3.txt</a></li>
    <li><a href="cloze4HiddenRules.txt"> cloze4HiddenRules.txt</a></li>
    <li><a href="cloze4HiddenRules.txtnoPrepos"> cloze4HiddenRules.txtnoPrepos</a></li>
    <li><a href="cloze4HiddenRules_again.txt"> cloze4HiddenRules_again.txt</a></li>
    <li><a href="cloze4HiddenRules_gapped.txt"> cloze4HiddenRules_gapped.txt</a></li>
    <li><a href="cloze4HiddenRules_gapped.txt.data"> cloze4HiddenRules_gapped.txt.data</a></li>
  </ul>
</body>
</html>'''

soup = BeautifulSoup(html, 'lxml')
link_tag = soup.select('li > a')
for link in link_tag:
    print(link.get('href'))
Output:
/php/
chineseYearAnimals.txt
chineseYearAnimals_gapWords.xlsx
chineseYearAnimals_gapWords.xlsx.data
chineseYearAnimals_gapped.txt
cloze2.txt
cloze3.txt
cloze4HiddenRules.txt
cloze4HiddenRules.txtnoPrepos
cloze4HiddenRules_again.txt
cloze4HiddenRules_gapped.txt
cloze4HiddenRules_gapped.txt.data
Here you are talking about files that are publicly available.
In your first post:
Pedroski55 Wrote:3. then delete them on the web server.
You can download files that are publicly available over HTTP,
but deleting files from the server needs more access, such as SSH and the Python projects linked earlier, e.g. Fabric.
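Putting the download part together, here is a rough sketch, not tested against the real site, that reads the index page, skips the "Parent Directory" link and saves each file under the name it already has. The function names file_urls and download_all are made up for illustration. It assumes index_url ends with "/" (otherwise urljoin replaces the last path segment) and uses the built-in 'html.parser' instead of 'lxml' only to avoid an extra install.

```python
import os
import posixpath
from urllib.parse import unquote, urljoin, urlsplit


def file_urls(hrefs, index_url):
    """Turn href values from the index page into (file name, absolute URL) pairs."""
    pairs = []
    for href in hrefs:
        # Skip the parent-directory link ("/php/"), sub-directories and query links.
        if href.startswith(("/", "?")) or href.endswith("/"):
            continue
        url = urljoin(index_url, href)
        name = posixpath.basename(unquote(urlsplit(url).path))
        pairs.append((name, url))
    return pairs


def download_all(index_url, dest_dir):
    """Download every file linked from the index page into dest_dir."""
    # Third-party packages, imported here so file_urls works without them.
    import requests
    from bs4 import BeautifulSoup

    page = requests.get(index_url, timeout=30)
    page.raise_for_status()
    soup = BeautifulSoup(page.text, "html.parser")
    hrefs = [a.get("href") for a in soup.select("li > a") if a.get("href")]

    os.makedirs(dest_dir, exist_ok=True)
    for name, url in file_urls(hrefs, index_url):
        r = requests.get(url, timeout=30)
        r.raise_for_status()
        with open(os.path.join(dest_dir, name), "wb") as f:
            f.write(r.content)
```

Usage would be something like download_all("http://www.mywebpage.com/php/uploads/", "/home/pedro/Downloads"). This only downloads; it cannot delete anything on the server.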
#7
Thank you very much! That was a great help!

I will look into ssh, it doesn't look too hard!

You recommend Digital Ocean as a good web hosting service?
#8
(May-26-2019, 11:12 AM)Pedroski55 Wrote: You recommend Digital Ocean as a good web hosting service?
Yes, better than GoDaddy for Python and in general.
Look at this post where I talk about hosts for Python.
Also, if you didn't know, we use Digital Ocean as the host for this forum.