How to open my files in data_folder with pandas using relative path?
Python Problem Overview
I'm working with pandas and need to read some CSV files. The directory structure is something like this:
folder/folder2/scripts_folder/script.py
folder/folder2/data_folder/data.csv
How can I open the data.csv file from the script in scripts_folder?
I've tried this:
absolute_path = os.path.abspath(os.path.dirname('data.csv'))
pandas.read_csv(absolute_path + '/data.csv')
I get this error:
File folder/folder2/data_folder/data.csv does not exist
Python Solutions
Solution 1 - Python
Try
import pandas as pd
pd.read_csv("../data_folder/data.csv")
Solution 2 - Python
Pandas resolves relative paths against the current working directory of the Python process. That is the folder containing your .py file only if you launch the script from there. From that directory you can move to where your data is located with '..'. For example:
pd.read_csv('../../../data_folder/data.csv')
This goes three levels up and then into data_folder (assuming it's there). Or:
pd.read_csv('data_folder/data.csv')
assuming data_folder is in the same directory as your .py file and you run the script from that directory.
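The effect of the working directory can be checked with a small sketch. It builds a throwaway copy of the question's layout in a temporary directory and resolves the same relative path from two different working directories. The helper name `resolve_from_cwd` is made up for this illustration, and only the standard library is used so that it runs without a real data set.

```python
import os
import tempfile
from pathlib import Path

def resolve_from_cwd(relative_path):
    """Return the absolute path a relative path points to right now."""
    return Path(relative_path).resolve()

original_cwd = os.getcwd()
with tempfile.TemporaryDirectory() as tmp:
    # Recreate folder/folder2/{scripts_folder,data_folder} from the question
    base = Path(tmp) / "folder" / "folder2"
    scripts = base / "scripts_folder"
    data = base / "data_folder"
    scripts.mkdir(parents=True)
    data.mkdir()
    (data / "data.csv").write_text("a,b\n1,2\n")

    # Launched from scripts_folder: "../data_folder/data.csv" is found
    os.chdir(scripts)
    found_from_scripts = resolve_from_cwd("../data_folder/data.csv").exists()

    # Launched from folder: the same relative path points somewhere else
    os.chdir(base.parent)
    found_from_parent = resolve_from_cwd("../data_folder/data.csv").exists()

    # Leave the temporary directory before it is deleted
    os.chdir(original_cwd)

print(found_from_scripts, found_from_parent)
```

The same string names a different file depending on where the process was started, which is one plausible reason the attempt in the question fails.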
Solution 3 - Python
You could use the __file__ attribute:
import os
import pandas as pd
df = pd.read_csv(os.path.join(os.path.dirname(__file__), "../data_folder/data.csv"))
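The joined path above still contains a literal `..`, which works but can look confusing in error messages. A minimal sketch, assuming you want a clean absolute path: `os.path.normpath` collapses the `..` segment. The function name `path_next_to` and the example script location are assumptions for illustration. In a real script you would pass `__file__`.

```python
import os

def path_next_to(script_file, relative_path):
    """Build a path relative to the folder that contains script_file."""
    script_dir = os.path.dirname(os.path.abspath(script_file))
    # normpath collapses "scripts_folder/../data_folder" into "data_folder"
    return os.path.normpath(os.path.join(script_dir, relative_path))

# Hypothetical script location, mirroring the layout in the question
example_script = os.path.join("folder", "folder2", "scripts_folder", "script.py")
csv_path = path_next_to(example_script, "../data_folder/data.csv")
print(csv_path)

# In script.py itself this would become:
# df = pd.read_csv(path_next_to(__file__, "../data_folder/data.csv"))
```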
Solution 4 - Python
For non-Windows users:
import pandas as pd
import os
os.chdir("../data_folder")
df = pd.read_csv("data.csv")
For Windows users:
import pandas as pd
df = pd.read_csv(r"C:\data_folder\data.csv")
The r prefix makes the path a raw string, so the backslashes in the Windows path are kept literally instead of being read as escape sequences.
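To see why the prefix matters, here is a small sketch with a hypothetical path chosen so that it contains the escape sequences `\t` and `\n`. Without the prefix, Python silently turns them into a tab and a newline, and the file cannot be found.

```python
# Hypothetical Windows path: \t and \n are valid escape sequences in Python
plain = "C:\temp\new_data.csv"     # \t becomes a tab, \n becomes a newline
raw = r"C:\temp\new_data.csv"      # backslashes are kept literally
forward = "C:/temp/new_data.csv"   # forward slashes avoid the problem entirely

print(repr(plain))
print(repr(raw))
print(repr(forward))
```

Python's file functions accept forward slashes on Windows as well, so the third form is a reasonable alternative to raw strings.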
Solution 5 - Python
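# script.py
import os
import pandas as pd
# Directory containing this script: folder/folder2/scripts_folder
current_dir = os.path.abspath(os.path.dirname(__file__))
# Path to the CSV file, relative to the script's directory
csv_filename = os.path.join(current_dir, '../data_folder/data.csv')
df = pd.read_csv(csv_filename)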
Solution 6 - Python
Keeping things tidy with f-strings:
import os
import pandas as pd
data_files = '../data_folder/'
csv_name = 'data.csv'
pd.read_csv(f"{data_files}{csv_name}")
Solution 7 - Python
Whether you call read_csv or pd.read_csv, pandas resolves a relative path against the current working directory, which by default is the directory where the Python process was started. So you can use the os module's chdir() to change it and take it from there.
import pandas as pd
import os
print(os.getcwd())
os.chdir("D:/01Coding/Python/data_sets/myowndata")
print(os.getcwd())
df = pd.read_csv('data.csv',nrows=10)
print(df.head())
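One caveat with `os.chdir` is that it changes the working directory for the whole process, so every later relative path is affected too. A hedged sketch of one way to limit that: a small context manager (the name `working_directory` is made up here) that switches directory and then restores the previous one. A temporary directory stands in for the data folder so the example runs anywhere.

```python
import os
import tempfile
from contextlib import contextmanager

@contextmanager
def working_directory(path):
    """Temporarily switch the current working directory, then restore it."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Runs even if the body raises, so the old directory always comes back
        os.chdir(previous)

before = os.getcwd()
with tempfile.TemporaryDirectory() as data_dir:
    with working_directory(data_dir):
        # Inside the block a relative path such as "data.csv" resolves here,
        # so pd.read_csv("data.csv") would look in data_dir
        inside = os.getcwd()
after = os.getcwd()

print(inside)
print(after == before)
```

Python 3.11 and later ship `contextlib.chdir`, which serves the same purpose.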
Solution 8 - Python
If you want to keep your code tidy, then I would suggest assigning the path and file name separately and then reading the file. Note the separator between the two parts:
import pandas as pd
path = 'C:/Users/username/Documents/folder'
file_name = 'file_name.xlsx'
file = pd.read_excel(f"{path}/{file_name}")
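Gluing a directory and a file name together with an f-string only works when exactly one separator ends up between them. A minimal sketch of a more forgiving alternative, using `pathlib` so that a trailing slash on the directory no longer matters. The helper name `join_dir_and_file` is hypothetical, and `PurePosixPath` is used only to make the printed result identical on every operating system.

```python
from pathlib import PurePosixPath

def join_dir_and_file(directory, file_name):
    """Join a directory and a file name with exactly one separator."""
    return str(PurePosixPath(directory) / file_name)

with_slash = join_dir_and_file("../data_folder/", "data.csv")
without_slash = join_dir_and_file("../data_folder", "data.csv")

print(with_slash)
print(without_slash)
```

In real code `pathlib.Path` would be the usual choice, and pandas readers accept `Path` objects directly.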
Solution 9 - Python
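I was also looking for the relative path version, and this works OK. Note that when run (Spyder 3.6) you will see "(unicode error) 'unicodeescape' codec can't decode bytes" reported at the closing triple quote. The cause is the backslash path in the docstring (C:\Users...), because \U starts a Unicode escape in a normal string. Remove the offending comment lines (14 and 15 in the original script), adjust the file names and location for your environment, and check the indentation.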
# -*- coding: utf-8 -*-
""" Created on Fri Jan 24 12:12:40 2020
Source: https://stackoverflow.com/questions/16952632/read-a-csv-into-pandas-from-f-drive-on-windows-7
Demonstrates: Load a csv not in the CWD by specifying relative path - windows version
@author: Doug
From CWD C:\Users\Doug\.spyder-py3\Data Camp\pandas
we will load file
C:/Users/Doug/.spyder-py3/Data Camp/Cleaning/g1803.csv
"""
import csv
trainData2 = []
with open(r'../Cleaning/g1803.csv', 'r') as train2Csv:
    trainReader2 = csv.reader(train2Csv, delimiter=',', quotechar='"')
    for row in trainReader2:
        trainData2.append(row)
print(trainData2)
Solution 10 - Python
You can always point to your home directory using ~ and then refer to your data folder from there.
import pandas as pd
df = pd.read_csv("~/mydata/data.csv")
For your case, assuming folder sits directly in your home directory, it should be like this:
import pandas as pd
df = pd.read_csv("~/folder/folder2/data_folder/data.csv")
You can also set your data directory as a prefix:
import pandas as pd
DATA_DIR = "~/folder/folder2/data_folder/"
df = pd.read_csv(DATA_DIR+"data.csv")
You can take advantage of f-strings as @nikos-tavoularis said
import pandas as pd
DATA_DIR = "~/folder/folder2/data_folder/"
FILE_NAME = "data.csv"
df = pd.read_csv(f"{DATA_DIR}{FILE_NAME}")
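Not every function that takes a path understands `~`. The built-in `open`, for example, treats it as a literal directory name. A short sketch, assuming the same hypothetical folder layout under the home directory, that expands it explicitly with `os.path.expanduser` so the resulting string can be passed to any function:

```python
import os

# Hypothetical location under the home directory, as in the examples above
DATA_DIR = "~/folder/folder2/data_folder"

# expanduser replaces the leading ~ with the current user's home directory
csv_path = os.path.expanduser(os.path.join(DATA_DIR, "data.csv"))
print(csv_path)
```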
Solution 11 - Python
import pandas as pd
df = pd.read_csv('C:/data_folder/data.csv')
Solution 12 - Python
This link answers it: https://stackoverflow.com/questions/40416072/reading-file-using-relative-path-in-python-project
Basically, using Path from pathlib, you'll do the following in script.py:
from pathlib import Path
import pandas as pd
path = Path(__file__).parent / "../data_folder/data.csv"
df = pd.read_csv(path)
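When the path is built from several parts, it can help to fail early with a message that shows the fully resolved location. The following is a sketch under assumptions rather than a definitive implementation: the function name `find_data_file` is made up, and the demonstration runs against a throwaway copy of the question's layout in a temporary directory.

```python
import tempfile
from pathlib import Path

def find_data_file(script_file, relative_path):
    """Resolve relative_path against script_file's folder and check it exists."""
    candidate = (Path(script_file).resolve().parent / relative_path).resolve()
    if not candidate.is_file():
        raise FileNotFoundError(f"Expected a data file at {candidate}")
    return candidate

# Demonstration on a temporary copy of scripts_folder and data_folder
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp).resolve()
    (base / "scripts_folder").mkdir()
    (base / "data_folder").mkdir()
    script = base / "scripts_folder" / "script.py"
    script.touch()
    expected = base / "data_folder" / "data.csv"
    expected.write_text("a,b\n1,2\n")

    found = find_data_file(script, "../data_folder/data.csv")

    try:
        find_data_file(script, "../data_folder/missing.csv")
        missing_raised = False
    except FileNotFoundError:
        missing_raised = True

print(found.name, missing_raised)
```

In script.py the call would be `pd.read_csv(find_data_file(__file__, "../data_folder/data.csv"))`.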
Solution 13 - Python
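You can try this. Use a raw string (the r prefix) so the backslashes are not treated as escape characters:
import pandas as pd
df = pd.read_csv(r"E:\working datasets\sales.csv")
print(df.head())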
Solution 14 - Python
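import os
# Absolute path of the running script (os.getcwd() would only give the working directory)
s_path = os.path.abspath(__file__)
# s_path = "...folder/folder2/scripts_folder/script.py"
s_path = s_path.split(os.sep)
print(s_path)
# [,..., 'folder', 'folder2', 'scripts_folder', 'script.py']
# Drop the last two parts (scripts_folder, script.py) and append the data location
d_path = s_path[:-2] + ['data_folder', 'data.csv']
print(os.sep.join(d_path))
# ...folder/folder2/data_folder/data.csv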
Solution 15 - Python
You can use . to represent the current working directory and .. to represent its parent.
# Linux
df = pd.read_csv("../data_folder/data.csv")
# Windows
df = pd.read_csv("..\\data_folder\\data.csv")
Solution 16 - Python
Try this: Open a new terminal window. Drag and drop the file (that you want Pandas to read) in that terminal window. This will return the full address of your file in a line. Copy and paste that line into read_csv command as shown here:
import pandas as pd
pd.read_csv("the path returned by terminal")
That's it.