How To Extract Web Files, Databases, etc. From Plesk Backup Manually

One of our customers asked today whether one can access Plesk backups and extract just one directory or certain files from them. The answer is in the Parallels knowledge base, and now here as well:

Parallels KB: http://kb.parallels.com/en/1757

I. FIRST WAY:

If the dump file is not too big (for example, 100-200 MB), you can unzip it and open it in any local email client. The parts of the dump will be shown as attachments. Choose and save the one you need, then unpack it.
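Since the backup is just a gzipped MIME message, you can decompress it and give it a name your mail client will recognize (a minimal sketch; DUMP_FILE.gz stands for your actual backup file):

# zcat DUMP_FILE.gz > dump.eml

Then open dump.eml in the mail client and save the attachment you need.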

II. SECOND WAY:

It can be done using the mpack tools for working with MIME files. This package is included in Debian:

# apt-get install mpack

For other Linux systems you can try the RPM from ALT Linux:

ftp://ftp.pbone.net/mirror/ftp.altlinux.ru/pub/distributions/ALTLinux/Sisyphus/files/i586/RPMS/mpack-1.6-alt1.i586.rpm
or compile mpack from the sources: http://ftp.andrew.cmu.edu/pub/mpack/.
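If you build mpack from source, it is the usual download/configure/make sequence (a sketch only; the exact tarball name and build steps may differ, so check the FTP directory above):

# wget http://ftp.andrew.cmu.edu/pub/mpack/mpack-1.6.tar.gz
# tar xzf mpack-1.6.tar.gz
# cd mpack-1.6
# ./configure && make && make install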
- Create an empty directory to extract the backup file into:

# mkdir recover
# cd recover

and copy the backup into it. By default the Plesk backup is gzipped (if not, use cat instead), so run zcat to decompress it and pass the data to munpack, which extracts the contents of the directories from the backup file:

# zcat DUMP_FILE.gz > DUMP_FILE
# cat DUMP_FILE | munpack

As a result you get a set of tar and sql files that contain the domains’ directories and databases. Untar the directory you need. For example, if you need to restore the httpdocs folder for the DOMAIN.TLD domain:

# tar xvf DOMAIN.TLD.htdocs
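To avoid unpacking over the existing docroot, you can extract into a scratch directory first and copy over only the files you need (a sketch; the directory name is just an example):

# mkdir httpdocs_restore
# tar xvf DOMAIN.TLD.htdocs -C httpdocs_restore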

NOTE: the ‘munpack’ utility may not work with files greater than 2 GB, and during dump extraction you may receive an error like:

# cat DUMP_FILE | munpack
DOMAIN.TLD.httpdocs (application/octet-stream)
File size limit exceeded

In this case try the next way below.
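A quick way to see whether you are likely to hit this limit is to check the size of the decompressed dump first; if the whole dump is under 2 GB, no single part inside it can exceed the limit (the file name is just the example used above):

# ls -lh DUMP_FILE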

III. THIRD WAY:

First, check if the dump is compressed or not and unzip if needed:

# file testdom.com_2006.11.13_11.27
testdom.com_2006.11.13_11.27: gzip compressed data, from Unix

# zcat testdom.com_2006.11.13_11.27 > testdom.com_dump

The dump consists of an XML part that describes what is included in the dump, and the data itself. Every data piece can be found by the appropriate CID (Content ID), which is listed in the XML part.
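To read that XML description, you can simply page through the beginning of the decompressed dump (assuming, as is usual for these dumps, that the XML part comes before the data parts):

# less ./testdom.com_dump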

For example, if the domain has hosting, all parts that are included in the hosting are listed in the XML description.

If you need to extract the domain’s ‘httpdocs’, look for the value of the ‘cid_docroot’ parameter; it is ‘testdom.com.htdocs’ in our case.
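To locate that parameter without reading through the whole XML, you can grep for it (a simple sketch using the example dump name):

# grep -n 'cid_docroot' ./testdom.com_dump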

Next, cut the content of ‘httpdocs’ out of the whole dump using the CID you found. To do this, find the line number where the content begins and the line number where it ends, like:

# egrep -an '(^--_----------)|(testdom.com.htdocs)' ./testdom.com_dump | grep -A1 "Content-Type"
2023:Content-Type: application/octet-stream; name="testdom.com.htdocs"
3806:--_----------=_1163395694117660-----------------------------------------

Add 2 to the first line number and subtract 1 from the second line number, then run:

# head -n 3805 ./testdom.com_dump | tail -n +2025 > htdocs.tar
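The same arithmetic can be written with shell variables, which makes it easier to reuse for other CIDs (a sketch using the line numbers from the example above):

# START=2023; END=3806
# head -n $((END - 1)) ./testdom.com_dump | tail -n +$((START + 2)) > htdocs.tar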

As a result you get a tar archive of the ‘httpdocs’ directory.
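Before restoring, it is worth checking that the cut landed on the right lines and that the archive is readable (a quick check; tar will complain if the offsets were off):

# tar tvf htdocs.tar | head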

If you need to restore a database, the procedure is similar. Find the database’s XML description for the domain you need; it lists the database server (‘localhost’, port 3306 in our example) and the CID of the SQL dump, which is ‘mytest22.mysql.sql’ in our case.

Find the database content by CID:

# egrep -an '(^--_----------)|(mytest22.mysql.sql)' ./testdom.com_dump | grep -A1 "Content-Type"
1949:Content-Type: application/octet-stream; name="mytest22.mysql.sql"
1975:--_----------=_1163395694117660-----------------------------------------

Again, add 2 to the first line number and subtract 1 from the second line number, then run:

# head -n 1974 ./testdom.com_dump | tail -n +1951 > mytest22.sql

As a result you get the database dump in SQL format.
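If the goal is to load it back into MySQL, the file can be fed to the mysql client (a sketch; the ‘admin’ user is an assumption for a Plesk server, and the target database must already exist):

# mysql -u admin -p mytest22 < mytest22.sql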

Wayne Egerer
