How can files be added to a tarfile with Python, without adding the directory hierarchy?

Tags: Python, Tar

Python Problem Overview


When I call add() on a TarFile object with a file path, the file is added to the tarball together with its directory hierarchy. In other words, when I extract the tarball, the directories of the original hierarchy are recreated.

Is there a way to add a plain file without its directory information, so that untarring the resulting tarball produces a flat list of files?

Python Solutions


Solution 1 - Python

The arcname argument of the TarFile.add() method is a convenient way to choose the name a file or directory gets inside the archive.

Example: you want to archive the directory repo/a.git/ to a tar.gz file, but you want the tree root in the archive to be a.git/ rather than repo/a.git/. You can do it as follows:

import tarfile

archive = tarfile.open("a.git.tar.gz", "w|gz")
# Store repo/a.git under the name a.git inside the archive.
archive.add("repo/a.git", arcname="a.git")
archive.close()
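To check the effect of arcname, you can list the member names of the finished archive. The following is a minimal sketch, not part of the original answer: the helper name archive_dir_as and the throwaway repo/a.git/HEAD layout are made up for illustration.

```python
import os
import tarfile
import tempfile


def archive_dir_as(src_dir, archive_path, arcname):
    """Archive src_dir so that its root inside the tarball is arcname.

    Returns the member names read back from the finished archive.
    """
    with tarfile.open(archive_path, "w:gz") as archive:
        archive.add(src_dir, arcname=arcname)
    with tarfile.open(archive_path, "r:gz") as archive:
        return archive.getnames()


# Demo in a temporary directory; the layout below is hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "repo", "a.git")
    os.makedirs(src)
    with open(os.path.join(src, "HEAD"), "wb") as f:
        f.write(b"ref: refs/heads/master\n")
    names = archive_dir_as(src, os.path.join(tmp, "a.git.tar.gz"), "a.git")
    print(names)  # every name starts with a.git, none with repo
```

Every member name is derived from arcname, so the leading repo/ component never reaches the archive.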

Solution 2 - Python

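You can use TarFile.addfile(). Its first parameter is a TarInfo object, in which you can specify a name that differs from that of the file you're adding. Note that a freshly constructed TarInfo has a size of 0, so either set its size or build it with TarFile.gettarinfo(); otherwise the member is stored empty.

This piece of code adds /path/to/filename.txt to the TAR file, but it will be extracted as myfilename.txt:

info = tar.gettarinfo("/path/to/filename.txt", arcname="myfilename.txt")  # fills in size, mtime, mode
tar.addfile(info, open("/path/to/filename.txt", "rb"))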
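A fuller sketch of the addfile() approach may help here, because addfile() copies only as many bytes as the TarInfo's size field says. The helper names add_renamed and add_bytes, and the file names in the demo, are assumptions made for illustration.

```python
import io
import os
import tarfile
import tempfile


def add_renamed(tar, src_path, name_in_archive):
    """Add the file at src_path to an open TarFile under name_in_archive."""
    # gettarinfo() reads size, mtime and mode from the file on disk.
    info = tar.gettarinfo(src_path, arcname=name_in_archive)
    with open(src_path, "rb") as f:
        tar.addfile(info, f)


def add_bytes(tar, data, name_in_archive):
    """Add in-memory bytes to an open TarFile under name_in_archive."""
    info = tarfile.TarInfo(name_in_archive)
    info.size = len(data)  # a new TarInfo has size 0; set it explicitly
    tar.addfile(info, io.BytesIO(data))


# Demo in a temporary directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "filename.txt")
    with open(src, "wb") as f:
        f.write(b"hello\n")
    archive_path = os.path.join(tmp, "demo.tar")
    with tarfile.open(archive_path, "w") as tar:
        add_renamed(tar, src, "myfilename.txt")
        add_bytes(tar, b"in memory\n", "note.txt")
    with tarfile.open(archive_path) as tar:
        names = tar.getnames()
        content = tar.extractfile("myfilename.txt").read()
```

For files that already exist on disk, TarFile.add() with arcname is usually shorter; addfile() is mainly useful when the data does not come from a file with that name.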

Solution 3 - Python

You can use the "arcname" argument of TarFile.add(name, arcname). It sets an alternative name that the file will have inside the archive; passing just the file's base name drops the directory hierarchy.
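Applied to the original question, arcname can be set to each file's base name, so the archive holds a flat list of files. This is a minimal sketch: the helper name add_flat and the duplicate-name check are assumptions added for illustration, since two files with the same base name would otherwise collide on extraction.

```python
import os
import tarfile
import tempfile


def add_flat(archive_path, file_paths):
    """Write file_paths into archive_path, keeping only each base name."""
    seen = set()
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in file_paths:
            name = os.path.basename(path)
            if name in seen:
                raise ValueError(f"duplicate name in flat archive: {name}")
            seen.add(name)
            # arcname replaces the full path stored in the archive.
            tar.add(path, arcname=name)


# Demo in a temporary directory with a made-up nested layout.
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for sub, fname in (("x/y", "a.txt"), ("z", "b.txt")):
        d = os.path.join(tmp, *sub.split("/"))
        os.makedirs(d)
        p = os.path.join(d, fname)
        with open(p, "wb") as f:
            f.write(b"data")
        paths.append(p)
    archive_path = os.path.join(tmp, "flat.tar.gz")
    add_flat(archive_path, paths)
    with tarfile.open(archive_path) as tar:
        flat_names = tar.getnames()
```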

Solution 4 - Python

Thanks to @diabloneo; here is a function that creates a tarball from selected items of a directory:

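import os
import tarfile


def compress(output_file="archive.tar.gz", output_dir="", root_dir=".", items=None):
    """Compress selected items of a directory into a gzipped tarball.

    Parameters
    ----------
    output_file : str, default "archive.tar.gz"
        name of the archive file
    output_dir : str, default ""
        absolute path to the output directory
    root_dir : str, default "."
        absolute path to the input root directory
    items : list of str, optional
        directories/files to add, relative to root_dir
    """
    with tarfile.open(os.path.join(output_dir, output_file), "w:gz") as tar:
        for item in items or []:
            # Read from root_dir but store only the relative name; this
            # avoids os.chdir(), which would change the caller's working directory.
            tar.add(os.path.join(root_dir, item), arcname=item)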


>>> root_dir = "/abs/pth/to/dir/"
>>> compress(output_file="archive.tar.gz", output_dir=root_dir,
...          root_dir=root_dir, items=["logs", "output"])

Solution 5 - Python

If you want to add a directory entry to a tarfile without its contents, you can do the following:

(1) Create an empty directory called empty.
(2) Call tf.add("empty", arcname=path_you_want_to_add).

That creates an empty directory named path_you_want_to_add inside the archive.
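Two alternatives avoid the placeholder directory. TarFile.add() accepts recursive=False, which stores a directory entry without descending into it, and a TarInfo of type tarfile.DIRTYPE can be added directly. This is a sketch only: the helper name add_empty_dir, the 0o755 mode, and the demo paths are assumptions.

```python
import os
import tarfile
import tempfile


def add_empty_dir(tar, name_in_archive, mode=0o755):
    """Add a directory entry that has no counterpart on disk."""
    info = tarfile.TarInfo(name_in_archive)
    info.type = tarfile.DIRTYPE  # mark the member as a directory
    info.mode = mode
    tar.addfile(info)  # directory members carry no data


# Demo in a temporary directory; "logs" and "output" are made-up names.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "logs")
    os.makedirs(src)
    with open(os.path.join(src, "app.log"), "wb") as f:
        f.write(b"x")
    archive_path = os.path.join(tmp, "dirs.tar")
    with tarfile.open(archive_path, "w") as tar:
        # recursive=False stores the directory itself but not app.log.
        tar.add(src, arcname="logs", recursive=False)
        add_empty_dir(tar, "output")
    with tarfile.open(archive_path) as tar:
        members = [(m.name, m.isdir()) for m in tar.getmembers()]
```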

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stack Overflow
Question | theactiveactor | View Question on Stack Overflow
Solution 1 - Python | diabloneo | View Answer on Stack Overflow
Solution 2 - Python | Wim | View Answer on Stack Overflow
Solution 3 - Python | Lauro Moura | View Answer on Stack Overflow
Solution 4 - Python | muon | View Answer on Stack Overflow
Solution 5 - Python | Steven R Brandt | View Answer on Stack Overflow