Directories are actually extracted in two distinct phases. Directories are created during
archive_write_header(), but final permissions are not set until
archive_write_close(). This separation is necessary to correctly handle borderline cases such as a non-writable directory containing files, but can cause unexpected results. In particular, directory permissions are not fully restored until the archive is closed. If you use
chdir(2) to change the current directory between calls to
archive_read_extract() or before calling
archive_read_close(), you may confuse the permission-setting logic, with the result that directory permissions are restored incorrectly.
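The two-phase approach described above can be sketched with plain POSIX calls. This is only an illustration of the idea, not the library's code; restore_readonly_dir() and its parameters are hypothetical names.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of libarchive: restore a non-writable
 * directory that contains one file, deferring the directory's final mode.
 * Returns 0 on success, -1 on failure.
 */
int restore_readonly_dir(const char *dir, const char *file_in_dir)
{
    /* Phase 1: create the directory writable by its owner. */
    if (mkdir(dir, 0700) != 0)
        return -1;

    /* The file can only be created while the directory is still writable. */
    int fd = open(file_in_dir, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd < 0)
        return -1;
    close(fd);

    /* Phase 2: apply the final, non-writable mode last. */
    return chmod(dir, 0555);
}
```

If relative paths are passed in, the final chmod() resolves them against whatever the current directory is at that moment, which is why a chdir(2) between the two phases can send the permission fixup to the wrong place.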
The library attempts to create objects with filenames longer than
PATH_MAX by creating prefixes of the full path and changing the current directory. Currently, this logic is limited in scope; the fixup pass does not work correctly for such objects and the symlink security check option disables the support for very long pathnames.
Restoring the path
aa/../bb does create each intermediate directory. In particular, the directory
aa is created as well as the final object
bb. In theory, this can be exploited to create an entire directory hierarchy with a single request. Of course, this does not work if the
ARCHIVE_EXTRACT_NODOTDOT option is specified.
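As a rough illustration of the kind of check that rejects such paths, the sketch below scans a pathname for a component that is exactly "..". The function path_has_dotdot() is a hypothetical helper written for this example and is not the library's implementation.

```c
#include <string.h>

/*
 * Hypothetical helper, not libarchive's implementation: return 1 if any
 * '/'-separated component of path is exactly "..", otherwise 0.
 */
int path_has_dotdot(const char *path)
{
    const char *p = path;

    while (*p != '\0') {
        /* Length of the current component, up to the next '/' or the end. */
        size_t len = strcspn(p, "/");

        if (len == 2 && p[0] == '.' && p[1] == '.')
            return 1;
        p += len;
        /* Skip the separator, if any. */
        if (*p == '/')
            p++;
    }
    return 0;
}
```

With this helper, "aa/../bb" is flagged, while names that merely contain two dots, such as "aa/..bb", are not.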
Implicit directories are always created obeying the current umask. Explicit objects are created obeying the current umask unless
ARCHIVE_EXTRACT_PERM is specified, in which case the current umask is ignored.
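The umask rule for explicit objects can be modelled as a small pure function. This is a simplified sketch that considers only the ordinary permission bits; creation_mode() and its restore_perm flag (standing in for ARCHIVE_EXTRACT_PERM) are hypothetical names for illustration.

```c
#include <sys/types.h>

/*
 * Hypothetical model of the rule above, limited to the low nine
 * permission bits: when restore_perm is zero the umask is obeyed,
 * otherwise the entry's mode is used unchanged.
 */
mode_t creation_mode(mode_t entry_mode, mode_t mask, int restore_perm)
{
    entry_mode &= 0777;
    return restore_perm ? entry_mode : (mode_t)(entry_mode & ~mask);
}
```

For example, an entry with mode 0666 under a umask of 022 would be created as 0644 in this model, and as 0666 when restore_perm is set.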
SGID and SUID bits are restored only if the correct user and group could be set. If
ARCHIVE_EXTRACT_OWNER is not specified, then no attempt is made to set the ownership. In this case, SGID and SUID bits are restored only if the user and group of the final object happen to match those specified in the entry.
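The SGID and SUID rule can be sketched as follows. The sketch takes the paragraph above literally and drops both bits unless both the user and the group match; safe_special_bits() and its parameter names are hypothetical, and the library's actual checks may be structured differently.

```c
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical helper: clear the SUID and SGID bits from entry_mode
 * unless the object's actual owner and group both match the entry.
 */
mode_t safe_special_bits(mode_t entry_mode,
                         uid_t entry_uid, gid_t entry_gid,
                         uid_t actual_uid, gid_t actual_gid)
{
    if (entry_uid != actual_uid || entry_gid != actual_gid)
        entry_mode &= ~(mode_t)(S_ISUID | S_ISGID);
    return entry_mode;
}
```

Here actual_uid and actual_gid would come from a stat(2) of the restored object, so a set-user-ID program extracted by an unprivileged user does not silently become set-user-ID to that user.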
The “standard” user-id and group-id lookup functions are not the defaults because
getgrnam(3) and
getpwnam(3) are sometimes too large for particular applications. The current design allows the application author to use a more compact implementation when appropriate.
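A more compact lookup might look like the sketch below, which resolves names from a small fixed table instead of calling getpwnam(3). The table contents and the name compact_user_lookup() are hypothetical. The callback shape (private data pointer, user name, numeric id, returning the id to use) is assumed to match what archive_write_disk_set_user_lookup() expects and should be checked against the installed headers.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed table; the entries are examples only. */
struct name_id {
    const char *name;
    int64_t id;
};

static const struct name_id known_users[] = {
    { "root", 0 },
    { "daemon", 1 },
};

/*
 * Return the id for uname if it is in the table; otherwise fall back
 * to the numeric uid recorded in the archive entry.
 */
int64_t compact_user_lookup(void *private_data, const char *uname, int64_t uid)
{
    size_t i;

    (void)private_data; /* unused in this sketch */
    if (uname != NULL) {
        for (i = 0; i < sizeof(known_users) / sizeof(known_users[0]); i++) {
            if (strcmp(uname, known_users[i].name) == 0)
                return known_users[i].id;
        }
    }
    return uid;
}
```

A group lookup could follow the same pattern with its own table.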
There should be a corresponding
archive_read_disk interface that walks a directory hierarchy and returns archive entry objects.