How to export clean HTML from Google Docs

By default, the HTML exported from Google Docs is cluttered with generated classes and styles. This short guide shows how to export clean HTML free of both. This is particularly useful when the HTML is destined for WordPress or another CMS.

1

By default, exporting as HTML in Google Docs results in the following:

<html><head><meta content="text/html; charset=UTF-8" http-equiv="content-type"><style type="text/css">.lst-kix_8lvw1s5my11v-6>li:before{content:"\0025cf  "}ul.lst-kix_8lvw1s5my11v-8{list-style-type:none}ul.lst-kix_8lvw1s5my11v-7{list-style-type:none}ul.lst-kix_8lvw1s5my11v-6{list-style-type:none}ul.lst-kix_8lvw1s5my11v-5{list-style-type:none}.lst-kix_8lvw1s5my11v-3>li:before{content:"\0025cf  "}.lst-kix_8lvw1s5my11v-7>li:before{content:"\0025cb  "}.lst-kix_8lvw1s5my11v-2>li:before{content:"\0025a0  "}.lst-kix_8lvw1s5my11v-0>li:before{content:"\0025cf  "}.lst-kix_8lvw1s5my11v-8>li:before{content:"\0025a0  "}.lst-kix_8lvw1s5my11v-1>li:before{content:"\0025cb  "}ul.lst-kix_8lvw1s5my11v-0{list-style-type:none}.lst-kix_8lvw1s5my11v-4>li:before{content:"\0025cb  "}ul.lst-kix_8lvw1s5my11v-4{list-style-type:none}ul.lst-kix_8lvw1s5my11v-3{list-style-type:none}ul.lst-kix_8lvw1s5my11v-2{list-style-type:none}.lst-kix_8lvw1s5my11v-5>li:before{content:"\0025a0  "}ul.lst-kix_8lvw1s5my11v-1{list-style-type:none}ol{margin:0;padding:0}table td,table th{padding:0}.c1{color:#000000;font-weight:400;text-decoration:none;vertical-align:baseline;font-size:11pt;font-family:"Arial";font-style:normal}.c7{color:#000000;font-weight:400;text-decoration:none;vertical-align:baseline;font-size:16pt;font-family:"Arial";font-style:normal}.c9{color:#434343;font-weight:400;text-decoration:none;vertical-align:baseline;font-size:14pt;font-family:"Arial";font-style:normal}.c2{padding-top:0pt;padding-bottom:0pt;line-height:1.15;orphans:2;widows:2;text-align:left}.c4{padding-top:16pt;padding-bottom:4pt;line-height:1.15;page-break-after:avoid;text-align:left}.c0{padding-top:18pt;padding-bottom:6pt;line-height:1.15;page-break-after:avoid;text-align:left}.c6{background-color:#ffffff;max-width:468pt;padding:72pt 72pt 72pt 
72pt}.c8{margin-left:36pt;padding-left:0pt}.c5{padding:0;margin:0}.c3{height:11pt}.title{padding-top:0pt;color:#000000;font-size:26pt;padding-bottom:3pt;font-family:"Arial";line-height:1.15;page-break-after:avoid;orphans:2;widows:2;text-align:left}.subtitle{padding-top:0pt;color:#666666;font-size:15pt;padding-bottom:16pt;font-family:"Arial";line-height:1.15;page-break-after:avoid;orphans:2;widows:2;text-align:left}li{color:#000000;font-size:11pt;font-family:"Arial"}p{margin:0;color:#000000;font-size:11pt;font-family:"Arial"}h1{padding-top:20pt;color:#000000;font-size:20pt;padding-bottom:6pt;font-family:"Arial";line-height:1.15;page-break-after:avoid;orphans:2;widows:2;text-align:left}h2{padding-top:18pt;color:#000000;font-size:16pt;padding-bottom:6pt;font-family:"Arial";line-height:1.15;page-break-after:avoid;orphans:2;widows:2;text-align:left}h3{padding-top:16pt;color:#434343;font-size:14pt;padding-bottom:4pt;font-family:"Arial";line-height:1.15;page-break-after:avoid;orphans:2;widows:2;text-align:left}h4{padding-top:14pt;color:#666666;font-size:12pt;padding-bottom:4pt;font-family:"Arial";line-height:1.15;page-break-after:avoid;orphans:2;widows:2;text-align:left}h5{padding-top:12pt;color:#666666;font-size:11pt;padding-bottom:4pt;font-family:"Arial";line-height:1.15;page-break-after:avoid;orphans:2;widows:2;text-align:left}h6{padding-top:12pt;color:#666666;font-size:11pt;padding-bottom:4pt;font-family:"Arial";line-height:1.15;page-break-after:avoid;font-style:italic;orphans:2;widows:2;text-align:left}</style></head><body class="c6"><h2 class="c0" id="h.giptyn5l7kdr"><span class="c7">Nulla Facilisi. Duis</span></h2><p class="c2"><span class="c1">Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nullam in dui mauris. Vivamus hendrerit arcu sed erat molestie vehicula. Sed auctor neque eu tellus rhoncus ut eleifend nibh porttitor. Ut in nulla enim. 
Phasellus molestie magna non est.</span></p><h2 class="c0" id="h.y9dxotb9wolu"><span class="c7">Mauris Iaculis Porttitor</span></h2><p class="c2"><span class="c1">Bibendum non venenatis nisl tempor. Suspendisse dictum feugiat nisl ut dapibus. Mauris iaculis porttitor posuere. Praesent id metus massa, ut blandit odio. Proin quis tortor orci. Etiam at risus et justo dignissim congue. Donec congue lacinia dui, a porttitor lectus condimentum laoreet. Nunc eu ullamcorper orci.</span></p><h3 class="c4" id="h.9dj87f3q69vq"><span class="c9">Class Aptent Taciti Sociosqu</span></h3><p class="c2"><span class="c1">Quisque eget odio ac lectus vestibulum faucibus eget in metus. In pellentesque faucibus vestibulum. Nulla at nulla justo, eget luctus tortor. Nulla facilisi. Duis aliquet egestas purus in blandit. Curabitur vulputate, ligula lacinia scelerisque.</span></p><h3 class="c4" id="h.14dbm37w9ur7"><span class="c9">Lorem Ipsum Dolor Sit</span></h3><p class="c2"><span class="c1">Tempor, lacus lacus ornare ante, ac egestas est urna sit amet arcu. Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos himenaeos.</span></p><p class="c2 c3"><span class="c1"></span></p><ul class="c5 lst-kix_8lvw1s5my11v-0 start"><li class="c2 c8"><span class="c1">Sed auctor neque eu tellus rhoncus ut.</span></li><li class="c2 c8"><span class="c1">Lorem ipsum dolor sit amet, consectetur adipiscing elit.</span></li><li class="c2 c8"><span class="c1">Sed molestie augue sit amet leo.</span></li></ul><p class="c2 c3"><span class="c1"></span></p><p class="c2"><span class="c1">Sed molestie augue sit amet leo consequat posuere. Vestibulum ante ipsum.</span></p></body></html>

Even if we remove the head and body tags and fix the indentation, things are still terrible. Every element carries generated classes and a wrapper span:

<h2 class="c0" id="h.giptyn5l7kdr">
    <span class="c7">Nulla Facilisi. Duis
</span>
</h2>
<p class="c2">
    <span class="c1">Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nullam in dui mauris. Vivamus hendrerit arcu sed erat molestie vehicula. Sed auctor neque eu tellus rhoncus ut eleifend nibh porttitor. Ut in nulla enim. Phasellus molestie magna non est.
</span>
</p>
<h2 class="c0" id="h.y9dxotb9wolu">
    <span class="c7">Mauris Iaculis Porttitor
</span>
</h2>
<p class="c2">
    <span class="c1">Bibendum non venenatis nisl tempor. Suspendisse dictum feugiat nisl ut dapibus. Mauris iaculis porttitor posuere. Praesent id metus massa, ut blandit odio. Proin quis tortor orci. Etiam at risus et justo dignissim congue. Donec congue lacinia dui, a porttitor lectus condimentum laoreet. Nunc eu ullamcorper orci.
</span>
</p>
<h3 class="c4" id="h.9dj87f3q69vq">
    <span class="c9">Class Aptent Taciti Sociosqu
</span>
</h3>
<p class="c2">
    <span class="c1">Quisque eget odio ac lectus vestibulum faucibus eget in metus. In pellentesque faucibus vestibulum. Nulla at nulla justo, eget luctus tortor. Nulla facilisi. Duis aliquet egestas purus in blandit. Curabitur vulputate, ligula lacinia scelerisque.
</span>
</p>
<h3 class="c4" id="h.14dbm37w9ur7">
    <span class="c9">Lorem Ipsum Dolor Sit
</span>
</h3>
<p class="c2">
    <span class="c1">Tempor, lacus lacus ornare ante, ac egestas est urna sit amet arcu. Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos himenaeos.
</span>
</p>
<p class="c2 c3">
    <span class="c1">
</span>
</p>
<ul class="c5 lst-kix_8lvw1s5my11v-0 start">
    <li class="c2 c8">
    <span class="c1">Sed auctor neque eu tellus rhoncus ut.
</span>
</li>
<li class="c2 c8">
    <span class="c1">Lorem ipsum dolor sit amet, consectetur adipiscing elit.
</span>
</li>
<li class="c2 c8">
    <span class="c1">Sed molestie augue sit amet leo.
</span>
</li>
</ul>
<p class="c2 c3">
    <span class="c1">
</span>
</p>
<p class="c2">
    <span class="c1">Sed molestie augue sit amet leo consequat posuere. Vestibulum ante ipsum.
</span>
</p>

Finally, this is what our clean export script outputs:

<h2>Nulla Facilisi. Duis</h2>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nullam in dui mauris. Vivamus hendrerit arcu sed erat molestie vehicula. Sed auctor neque eu tellus rhoncus ut eleifend nibh porttitor. Ut in nulla enim. Phasellus molestie magna non est.</p>
<h2>Mauris Iaculis Porttitor</h2>
<p>Bibendum non venenatis nisl tempor. Suspendisse dictum feugiat nisl ut dapibus. Mauris iaculis porttitor posuere. Praesent id metus massa, ut blandit odio. Proin quis tortor orci. Etiam at risus et justo dignissim congue. Donec congue lacinia dui, a porttitor lectus condimentum laoreet. Nunc eu ullamcorper orci.</p>
<h3>Class Aptent Taciti Sociosqu</h3>
<p>Quisque eget odio ac lectus vestibulum faucibus eget in metus. In pellentesque faucibus vestibulum. Nulla at nulla justo, eget luctus tortor. Nulla facilisi. Duis aliquet egestas purus in blandit. Curabitur vulputate, ligula lacinia scelerisque.</p>
<h3>Lorem Ipsum Dolor Sit</h3>
<p>Tempor, lacus lacus ornare ante, ac egestas est urna sit amet arcu. Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos himenaeos.</p>
<ul>
    <li>Sed auctor neque eu tellus rhoncus ut.</li>
    <li>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</li>
    <li>Sed molestie augue sit amet leo.</li>
</ul>
<p>Sed molestie augue sit amet leo consequat posuere. Vestibulum ante ipsum.</p>

Much better. :)
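
To make the difference between the two versions concrete, here is a minimal sketch of the cleanup expressed as string replacements. It is only an illustration, not how GoogleDoc2Html works, and the function name stripDocsMarkup is hypothetical. It assumes markup shaped like the export above (double-quoted attributes, spans used only as wrappers) and would be too naive for arbitrary HTML.

```javascript
// Illustrative only: a regex-based cleaner for markup shaped like the Google Docs export.
// stripDocsMarkup is a hypothetical helper, not part of GoogleDoc2Html.
function stripDocsMarkup(html) {
  return html
    // Drop the <head>, including the large generated <style> block.
    .replace(/<head\b[\s\S]*?<\/head>/gi, "")
    // Remove the html and body wrappers but keep what is inside them.
    .replace(/<\/?(?:html|body)\b[^>]*>/gi, "")
    // Strip class, id, and style attributes from the remaining tags.
    .replace(/\s+(?:class|id|style)="[^"]*"/gi, "")
    // Unwrap the now attribute-free <span> elements, keeping their text.
    .replace(/<\/?span>/gi, "")
    // Delete paragraphs left empty (the export uses them as spacers).
    .replace(/<p>\s*<\/p>/gi, "")
    .trim();
}

const messy =
  '<html><head><style>.c1{color:#000000}</style></head><body class="c6">' +
  '<h2 class="c0" id="h.giptyn5l7kdr"><span class="c7">Nulla Facilisi. Duis</span></h2>' +
  '<p class="c2 c3"><span class="c1"></span></p>' +
  '<p class="c2"><span class="c1">Lorem ipsum dolor sit amet.</span></p>' +
  "</body></html>";

console.log(stripDocsMarkup(messy));
// <h2>Nulla Facilisi. Duis</h2><p>Lorem ipsum dolor sit amet.</p>
```

A sketch like this breaks down quickly on real documents (links, images, nested lists), which is one reason to use a script that reads the document itself instead.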

2

We're going to use a script called GoogleDoc2Html and install it into our Google Docs account. When we're ready to export some clean HTML, we'll run this script and it'll email us an attachment containing the clean HTML and any images in the document.

Download the script from the GoogleDoc2Html GitHub repo. You can also fork the repo if you want to customize the script for your organization's needs.

Then, open code.js from the downloaded repository.

3

Install the export script

Open the Google Doc you'd like to convert to HTML and navigate to Tools > Script editor. Then, select File > New > Script File. Give it a name and save it.

Copy the contents of code.js and paste them into the new script file, replacing everything already in it (including the sample myFunction declaration).

Save the file and close that window.

4

Run the export script

To run the script, open the Google Doc you'd like to convert to HTML and select Tools > Script editor. Then, run it by navigating to Run > ConvertGoogleDoc2Html.

Pro tips to maximize cleanliness:

  • Be sure to use the correct heading tags in your original Google Doc. For example, if a heading should be an h2, select "Heading 2" from the Style dropdown when writing the original document.
  • Don't add extraneous formatting. For example, don't bother floating images to the left in your document.
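
The first tip matters because a converter can only pick a tag from the paragraph's style, not from how the text looks. The sketch below shows the idea. The helper names tagForStyle and wrapParagraph are hypothetical, and the fallback to a plain paragraph for every non-heading style is an assumption, not a description of GoogleDoc2Html's actual behavior.

```javascript
// Hypothetical helper: maps a style name from the Docs Style dropdown to an HTML tag.
function tagForStyle(styleName) {
  const match = /^Heading ([1-6])$/.exec(styleName);
  // Assumption: anything other than "Heading 1" to "Heading 6" becomes a paragraph.
  return match ? "h" + match[1] : "p";
}

// Wraps text in the tag chosen for its style. Text is inserted as-is (no escaping).
function wrapParagraph(styleName, text) {
  const tag = tagForStyle(styleName);
  return "<" + tag + ">" + text + "</" + tag + ">";
}

console.log(wrapParagraph("Heading 2", "Mauris Iaculis Porttitor"));
// <h2>Mauris Iaculis Porttitor</h2>
console.log(wrapParagraph("Normal text", "Large bold text that only looks like a heading"));
// <p>Large bold text that only looks like a heading</p>
```

Text that is merely enlarged or bolded in "Normal text" comes out as a paragraph, so the heading structure has to be set in the document itself.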

Enjoy!

5

You're done! The converted HTML will be emailed to you, along with any images that were in the document.
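
For readers curious about the emailing step, here is a rough sketch of how an Apps Script function could mail an HTML string to the current user as an attachment. MailApp, Session, and Utilities are built-in Apps Script services, so this only runs inside the script editor. The function name emailCleanHtml, the subject line, and the file name are assumptions for illustration, and GoogleDoc2Html may do this differently.

```javascript
// Sketch for the Apps Script editor. MailApp, Session, and Utilities are
// Apps Script globals; emailCleanHtml itself is a hypothetical helper.
function emailCleanHtml(html, docName) {
  // Package the HTML string as a file attachment named after the document.
  const attachment = Utilities.newBlob(html, "text/html", docName + ".html");

  MailApp.sendEmail({
    // Send to whoever is running the script.
    to: Session.getActiveUser().getEmail(),
    subject: "Clean HTML export: " + docName,
    body: "The converted HTML is attached.",
    attachments: [attachment],
  });
}
```

Calling emailCleanHtml("<p>Hello</p>", "My Doc") from another function in the same project would send one message with My Doc.html attached, after Apps Script prompts for permission to send mail.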