How to create full compressed tar file using Python?
PythonCompressionZipTarfilePython Problem Overview
How can I create a .tar.gz file with compression in Python?
Python Solutions
Solution 1 - Python
To build a .tar.gz
(aka .tgz
) for an entire directory tree:
import tarfile
import os.path
def make_tarfile(output_filename, source_dir):
with tarfile.open(output_filename, "w:gz") as tar:
tar.add(source_dir, arcname=os.path.basename(source_dir))
This will create a gzipped tar archive containing a single top-level folder with the same name and contents as source_dir
.
Solution 2 - Python
import tarfile
tar = tarfile.open("sample.tar.gz", "w:gz")
for name in ["file1", "file2", "file3"]:
tar.add(name)
tar.close()
If you want to create a tar.bz2 compressed file, just replace file extension name with ".tar.bz2" and "w:gz" with "w:bz2".
Solution 3 - Python
You call tarfile.open with mode='w:gz'
, meaning "Open for gzip compressed writing."
You'll probably want to end the filename (the name
argument to open
) with .tar.gz
, but that doesn't affect compression abilities.
BTW, you usually get better compression with a mode of 'w:bz2'
, just like tar
can usually compress even better with bzip2
than it can compress with gzip
.
Solution 4 - Python
Previous answers advise using the tarfile
Python module for creating a .tar.gz
file in Python. That's obviously a good and Python-style solution, but it has serious drawback in speed of the archiving. This question mentions that tarfile
is approximately two times slower than the tar
utility in Linux. According to my experience this estimation is pretty correct.
So for faster archiving you can use the tar
command using subprocess
module:
subprocess.call(['tar', '-czf', output_filename, file_to_archive])
Solution 5 - Python
In addition to @Aleksandr Tukallo's answer, you could also obtain the output and error message (if occurs). Compressing a folder using tar
is explained pretty well on the following answer.
import traceback
import subprocess
try:
cmd = ['tar', 'czfj', output_filename, file_to_archive]
output = subprocess.check_output(cmd).decode("utf-8").strip()
print(output)
except Exception:
print(f"E: {traceback.format_exc()}")
Solution 6 - Python
In this tar.gz file compress in open view directory In solve use os.path.basename(file_directory)
import tarfile
with tarfile.open("save.tar.gz","w:gz") as tar:
for file in ["a.txt","b.log","c.png"]:
tar.add(os.path.basename(file))
its use in tar.gz file compress in directory
Solution 7 - Python
Minor correction to @THAVASI.T's answer which omits showing the import of the 'tarfile' library, and does not define the 'tar' object which is used in the third line.
import tarfile
with tarfile.open("save.tar.gz","w:gz") as tar:
for file in ["a.txt","b.log","c.png"]:
tar.add(os.path.basename(file))
Solution 8 - Python
Perfect answer
.
and ..
in compressed file!
best performance and without the > NOTICE (thanks MaxTruxa):
>
> this answer is vulnerable to shell injections. Please read the security considerations from the docs. Never pass unescaped strings to subprocess.run
, subprocess.call
, etc. if shell=True
. Use shlex.quote
to escape (Unix shells only).
>
> I'm using it locally - so it's good for my needs.
subprocess.call(f'tar -cvzf {output_filename} *', cwd=source_dir, shell=True)
the cwd
argument changes directory before compressing - which solves the issue with the dots.
the shell=True
allows wildcard usage (*
)
> WORKS also for a directory recursively