Saturday, June 14, 2008

Save images in Microsoft Word documents as separate files

You can save all the pictures in a Word document to individual image files using the Save as Web page option (Word 2000, Word 2002/XP, and Word 2003) or by unzipping the .docx file (Word 2007).

Manually adding pictures to a Microsoft Office Word document is a simple process. Saving pictures already embedded in a Word document as separate image files can be a bit more difficult, unless you know these simple tricks.

Imagine the following scenario. Someone sent you a Word document loaded with pictures (30 or more). You need the pictures as individual image files, but for some reason the document’s creator can’t send you the images.

Now, you could open the document in Word, select a single image, copy it, paste the image into your favorite image-editing application, and then save the picture. But repeating this for every image would take too long. You could also create a script or macro to copy the images, but again, this is more work than necessary. The following simple tricks are faster.

Save as Web page

Use the following steps for Word 2000, Word 2002/XP, or Word 2003:

  1. Open the document in Word.
  2. Click File on the menu bar.
  3. Click Save As.
  4. Specify your Save in location.
  5. Select Web Page (*.htm; *.html) from the Save as type drop-down menu, as shown in Figure A.
  6. Click Save.


When you save the document as a Web page, Word creates an .htm file and folder containing the embedded images, as shown in Figure B.

Figure B


By default, Word saves supporting files to a subfolder in the same location as the main .htm file. From the Web Options settings window, you can instruct Word to save the supporting files directly in the .htm file’s location instead of in a subfolder.

The .htm file contains the document’s text, formatting information, properties, image references, and so forth. Open the .htm file with an HTML editor, and you can see the code Word generates. As I mentioned above, the folder contains the document’s embedded images and a filelist.xml file, as shown in Figure C.

Figure C


If the image has been resized within Word, the folder will contain both the original image and a resized copy. Word will preserve each file’s original format (.jpg, .png, etc.) but will not preserve the image’s original file name. Word renames the files in ascending order starting with the first image in the document. Each original image is immediately followed by the resized copy, if it exists.

Depending on the Web Options settings, Word may automatically create a resized image when you save the file as a Web page. Word may also convert the image to a .gif. For example, if you haven’t told Word to allow .png as a graphics format under Web Options and you insert a .png file into your document, the supporting-files folder will contain both the original image file and a resized, reformatted .gif copy.

You can now copy the file(s) to another location.
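
If you would rather not pick the images out of the supporting-files folder by hand, a short script can do the copying. The following is only a minimal sketch in Python: the function name `copy_web_page_images` and the list of image extensions are my own assumptions, not anything Word defines, and you may need to extend the list for other formats.

```python
import shutil
from pathlib import Path

# Assumed set of image extensions; add others if your document uses them.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".bmp"}


def copy_web_page_images(support_dir, out_dir):
    """Copy image files from Word's supporting-files folder to out_dir.

    Non-image files such as filelist.xml are skipped.
    Returns the copied paths in file-name order.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(Path(support_dir).iterdir()):
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS:
            # copy2 keeps the file's timestamps along with its contents.
            copied.append(Path(shutil.copy2(path, out / path.name)))
    return copied
```

For example, calling `copy_web_page_images("report_files", "extracted_images")` would copy the pictures into a new folder; both folder names here are placeholders for your own paths. Note that the sketch copies the resized copies as well as the originals, so you may still want to weed out duplicates afterward.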

Unzipping a .docx file

With Word 2007, Microsoft introduced the XML-based .docx file format. The new format is essentially a ZIP container, which contains a series of XML files and any embedded images. To access the embedded images in a .docx file, use the following steps:

  1. If it’s not already a .docx file, open the file in Word 2007 and save it as a Word Document (*.docx).
  2. Change the file extension on the original file from .docx to .zip, as shown in Figure D.

Figure D


  3. Open the file using a ZIP application. The image files should be listed at the top of the file list, as shown in Figure E.

Figure E


You can now copy the file(s) to another location.
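
Because a .docx file is a ZIP container, the same extraction can also be scripted without renaming the file. The following is a minimal sketch using Python's standard zipfile module. It assumes the embedded pictures sit under the word/media/ folder inside the container, which is where Word normally stores them; the function name `extract_docx_images` is hypothetical.

```python
import zipfile
from pathlib import Path, PurePosixPath


def extract_docx_images(docx_path, out_dir):
    """Copy every file stored under word/media/ in a .docx to out_dir.

    Returns the list of paths that were written.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    with zipfile.ZipFile(docx_path) as archive:
        for name in archive.namelist():
            # Assumption: embedded pictures live under word/media/.
            if name.startswith("word/media/") and not name.endswith("/"):
                # Keep only the file name, dropping the internal folder path.
                target = out / PurePosixPath(name).name
                target.write_bytes(archive.read(name))
                saved.append(target)
    return saved
```

For example, `extract_docx_images("report.docx", "extracted_images")` would write the pictures to a new folder; both names are placeholders. The script reads the .docx directly, so the original file is left unchanged.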
