Python Forum

Full Version: paths
Hi everyone,

I am having a lot of trouble setting up paths in my code to pull data from. My most recent problem is with the code below.

import os
import tarfile
from six.moves import urllib

DOWNLOAD_ROOT = "https://raw.githubusercontent.com/ageron/handson-ml/master/"
HOUSING_PATH = os.path.join("datasets", "housing")
HOUSING_URL = DOWNLOAD_ROOT + "datasets/housing/housing.tgz"

def fetch_housing_data(housing_url=HOUSING_URL, housing_path=HOUSING_PATH):
    if not os.path.isdir(housing_path):
        os.makedirs(housing_path)
    tgz_path = os.path.join(housing_path, "housing.tgz")
    urllib.request.urlretrieve(housing_url, tgz_path)
    housing_tgz = tarfile.open(tgz_path)
    housing_tgz.extractall(path=housing_path)
    housing_tgz.close()

import pandas as pd

def load_housing_data(housing_path=HOUSING_PATH):
    csv_path = os.path.join(housing_path, "housing.csv")
    return pd.read_csv(csv_path)
I am running the code in Spyder and then running the code below in a Jupyter notebook to try to view it.

housing = load_housing_data()
housing.head()
I get the error below:

FileNotFoundError: File b'datasets\\housing\\housing.csv' does not exist

Can anyone help me? I want to start modelling on the data but am getting stuck :(
What OS do you run the code on? Based on your profile info you are on Win10... Does datasets\housing\housing.csv look like a proper Windows path to you?
Hi Buran

Yes, it is Windows 10. No, it does not look like a proper path. What would be a proper path? The only paths I've used are from my C drive. Should I enter the below as the path?

C:\Users\SGrah\OneDrive\Documents\Python Scripts
I don't know where you want to save the files, but you need to specify either a full absolute path or a relative path (relative to the current working directory).
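To make the absolute/relative distinction concrete, here is a minimal sketch using only the standard library. The helper name describe_path is made up for illustration; it shows how a relative path such as datasets\housing is resolved against whatever the current working directory happens to be, which can differ between Spyder and Jupyter.

```python
import os

def describe_path(path):
    """Return (is_absolute, absolute_form) for a path string.

    describe_path is a hypothetical helper, not part of the original code.
    """
    return os.path.isabs(path), os.path.abspath(path)

# The same relative path the original code builds.
relative = os.path.join("datasets", "housing")

# A relative path is always interpreted from the current working directory.
print("Current working directory:", os.getcwd())
print("Is absolute:", describe_path(relative)[0])
print("Resolves to:", describe_path(relative)[1])
```

If the "Resolves to" folder is not where the data actually lives, either change the working directory or use an absolute path.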
I tried to change it to

HOUSING_PATH = os.path.join(r"C:\Users\SGrah\OneDrive\Documents\Python Scripts\Machine Learning with Skikit-learn and Tensorflow\Datasets")
but I got the same error that the path does not exist.

If I want to save it to folder 'C:\Users\SGrah\OneDrive\Documents\Python Scripts\Machine Learning with Skikit-learn and Tensorflow\Datasets' what should I change the code to?

Also what is a relative path?
Does the path C:\Users\SGrah\OneDrive\Documents\Python Scripts\Machine Learning with Skikit-learn and Tensorflow\Datasets\housing exist?
import os
import pandas as pd

base_path = r"C:\Users\SGrah\OneDrive\Documents\Python Scripts\Machine Learning with Skikit-learn and Tensorflow\Datasets"
HOUSING_PATH = os.path.join(base_path, "housing")
 
def load_housing_data(housing_path=HOUSING_PATH):
    csv_path = os.path.join(housing_path, "housing.csv")
    return pd.read_csv(csv_path)

A relative path is a path that is relative to some other path, e.g.

./housing refers to the folder housing that is a child of the current folder (denoted by the dot):
root/current/housing
../housing refers to a sibling folder: here .. is the parent folder (one level up) and housing is a child of that folder, i.e. both the current folder and housing are children of the same parent folder:
root/current
root/housing
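The two cases above can be checked with a short sketch. The resolve helper is hypothetical, and posixpath is used only so the output uses forward slashes like the examples; on Windows, os.path does the same thing with backslashes.

```python
import posixpath

def resolve(current_folder, relative_path):
    """Join a relative path onto a starting folder and collapse '.' and '..'.

    resolve is a hypothetical helper written for this illustration.
    """
    return posixpath.normpath(posixpath.join(current_folder, relative_path))

# './housing' is a child of the current folder.
print(resolve("root/current", "./housing"))   # root/current/housing

# '../housing' is a sibling of the current folder (child of the parent).
print(resolve("root/current", "../housing"))  # root/housing
```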
It does exist but my code does not think it does. I am still getting the same error.

import os
import tarfile
from six.moves import urllib

DOWNLOAD_ROOT = "https://raw.githubusercontent.com/ageron/handson-ml/master/"

base_path = r"C:\Users\SGrah\OneDrive\Documents\Python Scripts\Machine Learning with Skikit-learn and Tensorflow\Datasets"
HOUSING_PATH = os.path.join(base_path, "housing")
HOUSING_URL = DOWNLOAD_ROOT + "datasets/housing/housing.tgz"

def fetch_housing_data(housing_url=HOUSING_URL, housing_path=HOUSING_PATH):
    if not os.path.isdir(housing_path):
        os.makedirs(housing_path)
    tgz_path = os.path.join(housing_path, "housing.tgz")
    urllib.request.urlretrieve(housing_url, tgz_path)
    housing_tgz = tarfile.open(tgz_path)
    housing_tgz.extractall(path=housing_path)
    housing_tgz.close()

import pandas as pd

def load_housing_data(housing_path=HOUSING_PATH):
    csv_path = os.path.join(housing_path, "housing.csv")
    return pd.read_csv(csv_path)

housing = load_housing_data()
housing.head()
Please post the full traceback in error tags.
Full errors here:

Python 3.6.4 |Anaconda, Inc.| (default, Jan 16 2018, 10:22:32) [MSC v.1900 64 bit (AMD64)]
Type "copyright", "credits" or "license" for more information.

IPython 6.2.1 -- An enhanced Interactive Python.

import os
import tarfile
from six.moves import urllib

DOWNLOAD_ROOT = "https://raw.githubusercontent.com/ageron/handson-ml/master/"

base_path = r"C:\Users\SGrah\OneDrive\Documents\Python Scripts\Machine Learning with Skikit-learn and Tensorflow\Datasets"
HOUSING_PATH = os.path.join(base_path, "housing")
HOUSING_URL = DOWNLOAD_ROOT + "datasets/housing/housing.tgz"

def fetch_housing_data(housing_url=HOUSING_URL, housing_path=HOUSING_PATH):
    if not os.path.isdir(housing_path):
        os.makedirs(housing_path)
    tgz_path = os.path.join(housing_path, "housing.tgz")
    urllib.request.urlretrieve(housing_url, tgz_path)
    housing_tgz = tarfile.open(tgz_path)
    housing_tgz.extractall(path=housing_path)
    housing_tgz.close()


import pandas as pd

def load_housing_data(housing_path=HOUSING_PATH):
    csv_path = os.path.join(housing_path, "housing.csv")
    return pd.read_csv(csv_path)


housing = load_housing_data()
housing.head()
Error:
Traceback (most recent call last):
  File "<ipython-input-1-785dfbe39f75>", line 28, in <module>
    housing = load_housing_data()
  File "<ipython-input-1-785dfbe39f75>", line 25, in load_housing_data
    return pd.read_csv(csv_path)
  File "C:\Users\SGrah\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 709, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "C:\Users\SGrah\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 449, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "C:\Users\SGrah\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 818, in __init__
    self._make_engine(self.engine)
  File "C:\Users\SGrah\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 1049, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "C:\Users\SGrah\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 1695, in __init__
    self._reader = parsers.TextReader(src, **kwds)
  File "pandas/_libs/parsers.pyx", line 402, in pandas._libs.parsers.TextReader.__cinit__
  File "pandas/_libs/parsers.pyx", line 718, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: File b'C:\\Users\\SGrah\\OneDrive\\Documents\\Python Scripts\\Machine Learning with Skikit-learn and Tensorflow\\Datasets\\housing\\housing.csv' does not exist
Well, I'm pretty sure housing.csv does not exist at that location if the error says so.
Carefully check that you have that file at that location.
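One way to do that check from Python is sketched below. The report_missing helper is hypothetical and only uses os.path calls from the standard library. Note also that in the code posted above, fetch_housing_data() is defined but never called, so it may be that nothing has downloaded and extracted housing.csv yet; calling it once before load_housing_data() could be all that is missing.

```python
import os

def report_missing(csv_path):
    """Return a short diagnostic describing what exists along csv_path.

    report_missing is a hypothetical helper, not part of the original code.
    """
    folder = os.path.dirname(csv_path)
    if os.path.isfile(csv_path):
        return "found"
    if os.path.isdir(folder):
        # The folder is there, so show what it actually contains.
        return "folder exists but file is missing; folder contains: {}".format(
            sorted(os.listdir(folder))
        )
    return "folder does not exist: {}".format(folder)

# Same relative path as the first version of the code; swap in your own path.
csv_path = os.path.join("datasets", "housing", "housing.csv")
print(report_missing(csv_path))

# If the file is missing, fetch_housing_data() from the posted code would
# need to run first (it downloads and extracts the archive).
```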