Posts: 12,053
Threads: 488
Joined: Sep 2016
My code has already been posted. I have given you the link several times.
I gave you the complete directory setup in a previous post.
I'll have to go back and find it all, but will consolidate.
This time follow the steps exactly as given. Programming is a very unforgiving art.
This will take a while, but I'll get to it sometime today.
May-27-2018, 12:47 AM
(This post was last modified: May-28-2018, 02:34 AM by Larz60+.)
OK,
I'm running on fumes lately, so start with the following:
I won't be able to test, so if you have a problem, do not go on to the next step. Just let me know which step fails,
any error messages (verbatim), and what you think might be wrong.
Continue only after the problem has been fixed.
This post will be the go-to point for the entire process. As I add steps, they will be added here, so save the URL.
- From the Windows Start menu, click on Control Panel.
- If necessary, enlarge the window and choose view by Large Icons.
- Select Programs and Features.
- Scroll through the list until you find Python 3.6.5 (or your latest Python 3.6 version).
- Double click it and follow the uninstall instructions.
- Right click on the post #32 (or whatever number you see in the upper right corner of this post), click on copy link location, open Notepad or a similar window, click in the window and press ctrl-v. Save this URL where you can find it if you have to return to this post.
- Navigate to the following URL and follow snippsat's instructions for installing python (and cmder if you haven't already installed same):
Part1: https://python-forum.io/Thread-Basic-Par...er-Windows
Part2: https://python-forum.io/Thread-Basic-Par...ight=cmder
Install the latest version, which is 3.6.5. Make sure you install to C:\Python365, and make sure you check the option to add Python to PATH.
- Continue here
- On 'O' drive do the following:
- Use explorer or navigate to the 'O' drive.
- Create a directory named 'python'
- Important: Open PyCharm. If there are any projects named WellInfo, remove them from PyCharm by clicking on the 'X' in the upper right corner of the project icon.
- Click on Create New Project
- In the top location Box, enter: O:\python\WellInfo
- Expand the arrow to the left of Project Interpreter: New Virtualenv environment
- Click on Existing Interpreter
- under 'Existing Interpreter' Click on Gear far right
- Select 'Add Local'
- Click on System Interpreter
- If Interpreter window does not show C:\Python365\python.exe, either select that interpreter from the pull down list, or if not there use '...' button to navigate to C:\Python365\python.exe and click ok
- Click Create
- Click close on Tip of day window (if it shows up)
- With WellInfo Highlighted in Left Pane, from top menu select File-->Settings
- Expand sub menu of Project: WellInfo
- Select Project Interpreter and make sure it shows Python 36 C:\Python365\python.exe
- In the package list below, make sure the following packages are installed: beautifulsoup4, lxml, requests.
- If any are missing, click + on the right, type the package name (make sure it's highlighted in the left pane), and click Install Package.
- Repeat for all missing packages.
- Now in left pane, click on Project Structure
- Right click on O:\python\WellInfo
- Click on new folder and add data
- Right click on O:\python\WellInfo again
- Click on new folder and add src
- Now highlight data, right click and select new folder
- Add command_files
- highlight data again, right click and select new folder
- Add completions
- Highlight src and click on Sources button next to Mark as:
- Click OK
- Continue here. We need a few more directories that I missed:
- Right click on data directory
- Click New-->directory and add reports
- Right click on data directory
- Click New-->directory and add html
- Right click on src
- Choose New-->Python File
- Name it CheckInternet (don't type the .py, it's added for you)
- Copy (do not type) the following code by double clicking on any line of code and pressing ctrl-c
import socket


class CheckInternet:
    def __init__(self):
        self.internet_available = False

    def check_availability(self):
        self.internet_available = False
        if socket.gethostbyname(socket.gethostname()) != '127.0.0.1':
            self.internet_available = True
        return self.internet_available


def testit():
    ci = CheckInternet()
    print('Please turn internet OFF, then press Enter')
    input()
    ci.check_availability()
    print(f'ci.internet_available: {ci.internet_available}')
    if not ci.internet_available:
        print(' Off test successful')
    else:
        print(' Off test failed')

    print('Please turn internet ON, then press Enter')
    input()
    ci.check_availability()
    print(f'ci.internet_available: {ci.internet_available}')
    if ci.internet_available:
        print(' On test successful')
    else:
        print(' On test failed')


if __name__ == '__main__':
    testit()
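A side note on the check above (this is not part of CheckInternet.py, so don't paste it into that file): the code tests whether the machine's hostname resolves to something other than 127.0.0.1, which is not quite the same as being able to reach a remote site. A minimal alternative sketch is to try opening a TCP connection. The function name `can_reach`, the default host, port and timeout, and the `connector` parameter are all my assumptions for illustration, and I have not tried this on your setup.

```python
import socket


def can_reach(host="python-forum.io", port=443, timeout=3.0,
              connector=socket.create_connection):
    """Return True if a TCP connection to (host, port) can be opened.

    host, port and timeout are illustrative defaults, not values taken
    from the project. connector can be swapped out so the function can
    be exercised without a real network.
    """
    try:
        conn = connector((host, port), timeout)
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False
    conn.close()
    return True
```

Calling `can_reach()` with no arguments attempts a real connection and may take up to the timeout to return when the network is down.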
- Right click on src
- Choose New-->Python File
- Name it FetchCompletions (don't type the .py, it's added for you)
- Copy (do not type) the following code by double clicking on any line of code and pressing ctrl-c
import requests
from bs4 import BeautifulSoup
from pathlib import Path
import CheckInternet
import sys


class GetCompletions:
    def __init__(self, infile):
        self.check_network = CheckInternet.CheckInternet()
        self.homepath = Path('.')
        self.rootpath = self.homepath / '..'
        self.datapath = self.rootpath / 'data'
        self.commandpath = self.datapath / 'command_files'
        self.completionspath = self.datapath / 'completions'
        self.htmlpath = self.datapath / 'html'
        self.reportspath = self.datapath / 'reports'

        if self.check_network.check_availability():
            # use: Api_May_27_2018.txt for testing
            # self.infilename = 'Api_May_27_2018.txt'
            self.infilename = input('Please enter api filename: ')
            self.infile = self.commandpath / self.infilename
            self.api = []

            with self.infile.open() as f:
                for line in f:
                    self.api.append(line.strip())

            self.fields = ['Spud Date', 'Total Depth', 'IP Oil Bbls', 'Reservoir Class', 'Completion Date',
                           'Plug Back', 'IP Gas Mcf', 'TD Formation', 'Formation', 'IP Water Bbls']
            self.get_all_pages()
            self.parse_and_save(getpdfs=True)
        else:
            print('Internet access required, and not found.')
            print('Please make Internet available and try again')

    def get_url(self):
        for entry in self.api:
            print("http://wogcc.state.wy.us/wyocomp.cfm?nAPI={}".format(entry[3:10]))
            yield (entry, "http://wogcc.state.wy.us/wyocomp.cfm?nAPI={}".format(entry[3:10]))

    def get_all_pages(self):
        for entry, url in self.get_url():
            print('Fetching main page for entry: {}'.format(entry))
            response = requests.get(url)
            if response.status_code == 200:
                filename = self.htmlpath / 'api_{}.html'.format(entry)
                with filename.open('w') as f:
                    f.write(response.text)
            else:
                print('error downloading {}'.format(entry))

    def parse_and_save(self, getpdfs=False):
        filelist = [file for file in self.htmlpath.iterdir() if file.is_file()]
        for file in filelist:
            with file.open('r') as f:
                soup = BeautifulSoup(f.read(), 'lxml')
            if getpdfs:
                links = soup.find_all('a')
                for link in links:
                    url = link['href']
                    if 'www' in url:
                        continue
                    print('downloading pdf at: {}'.format(url))
                    p = url.index('=')
                    response = requests.get(url, stream=True, allow_redirects=False)
                    if response.status_code == 200:
                        try:
                            header_info = response.headers['Content-Disposition']
                            idx = header_info.index('filename')
                            filename = self.completionspath / header_info[idx+9:]
                        except ValueError:
                            filename = self.completionspath / 'comp{}.pdf'.format(url[p + 1:])
                            print("couldn't locate filename for {} will use: {}".format(file, filename))
                        except KeyError:
                            filename = self.completionspath / 'comp{}.pdf'.format(url[p + 1:])
                            print('got KeyError on {}, response.headers = {}'.format(file, response.headers))
                            print('will use name: {}'.format(filename))
                            print(response.headers)
                        with filename.open('wb') as f:
                            f.write(response.content)
            sfname = self.reportspath / 'summary_{}.txt'.format((file.name.split('_'))[1].split('.')[0][3:10])
            tds = soup.find_all('td')
            with sfname.open('w') as f:
                for td in tds:
                    if td.text:
                        if any(field in td.text for field in self.fields):
                            f.write('{}\n'.format(td.text))
            # Delete html file when finished
            file.unlink()


if __name__ == '__main__':
    GetCompletions('apis.txt')
- Right click on PyCharm code area for this module and click paste
- Click File-->Save-All
- On Left pane, expand data directory.
- Right click on command_files and select New-->File
- Type Api_May_27_2018.txt and click OK
- Add some api numbers (real ones) without quotes, one per line
- When done, click File-->Save-All
- Click on FetchCompletions.py tab
- With cursor anywhere in code window:
- On Top Menu click Run-->Run and select FetchCompletions
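As an aside on the input file from the steps above: FetchCompletions reads it one API number per line and builds each request URL from characters 3 through 9 of the entry (the `entry[3:10]` slice in `get_url`). A small standalone sketch of just that part follows. The helper names `read_api_numbers` and `build_url` and the sample value are made up for illustration, and nothing here touches the network.

```python
from pathlib import Path
import tempfile

# Same URL template as in FetchCompletions.
BASE_URL = "http://wogcc.state.wy.us/wyocomp.cfm?nAPI={}"


def read_api_numbers(path):
    """Return the stripped lines of the file, one entry per line.

    Unlike the loop in FetchCompletions, blank lines are skipped here
    (an assumption on my part).
    """
    with Path(path).open() as f:
        return [line.strip() for line in f if line.strip()]


def build_url(entry):
    """Build the URL from characters 3..9 of the entry (entry[3:10])."""
    return BASE_URL.format(entry[3:10])


if __name__ == "__main__":
    # "XXX1234567" is a placeholder, not a real API number.
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "Api_sample.txt"
        sample.write_text("XXX1234567\n")
        for entry in read_api_numbers(sample):
            print(entry, build_url(entry))
```

If the URLs printed during a real run look wrong, the first thing to check is that the entries in your file have at least ten characters, so the slice picks up all seven.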
When done, your pdf files will be in the completions directory, and
a simple run report will be in the reports directory.
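The folder layout built through PyCharm in the steps above can also be created, or simply verified, from Python. A minimal sketch, assuming you pass in the project root yourself (O:\python\WellInfo in this thread). The function name `ensure_layout` is mine; the folder names are the ones from the steps.

```python
from pathlib import Path

# Folder names from the steps above, relative to the project root.
LAYOUT = [
    "src",
    "data/command_files",
    "data/completions",
    "data/reports",
    "data/html",
]


def ensure_layout(root):
    """Create any missing project folders under root and return all folder paths."""
    root = Path(root)
    folders = []
    for rel in LAYOUT:
        folder = root / rel
        # parents=True also creates 'data'; exist_ok=True leaves existing folders alone.
        folder.mkdir(parents=True, exist_ok=True)
        folders.append(folder)
    return folders
```

For example, `ensure_layout(r"O:\python\WellInfo")` should leave existing folders untouched and only add the ones that are missing.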
This concludes the setup for the project.
Do these steps (8-40) and let me know when done.
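If a step fails, a quick environment check can help narrow down the cause before you report back. A rough sketch, assuming the import names `bs4`, `lxml` and `requests` for the three packages listed earlier; the function name `missing_modules` is hypothetical.

```python
import importlib.util
import sys

# Import names for beautifulsoup4, lxml and requests.
REQUIRED = ["bs4", "lxml", "requests"]


def missing_modules(names):
    """Return the names the current interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Shows which python.exe is actually running the script.
    print("Interpreter:", sys.executable)
    print("Version:", sys.version.split()[0])
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages found")
```

Run from PyCharm, the interpreter line should point at C:\Python365\python.exe if the project interpreter was set as described.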
This will need some tweaking later (after I finish my move and have internet reestablished at the new location, which is more remote, so I am hoping I can still get broadband).
Movers will be here Thursday. I have already moved most of the items that I didn't want them to break.
Posts: 126
Threads: 27
Joined: Feb 2018
May-27-2018, 02:12 PM
(This post was last modified: May-27-2018, 02:12 PM by tjnichols.)
I would be running low on steam too. I truly appreciate your help!
All steps worked. What's next?
May-27-2018, 02:21 PM
(This post was last modified: May-27-2018, 02:21 PM by Larz60+.)
Working on it, ... give me an hour
Before I get too far into it,
Question
Do you have another drive where you can isolate your code?
Awaiting your reply.
Posts: 126
Threads: 27
Joined: Feb 2018
The path is O:\Completions.
Posts: 12,053
Threads: 488
Joined: Sep 2016
Got it. I'm working on it (and watching the PGA Golf final in Ft Worth).
This will be resolved today.