How to Convert HTML to Plain Text in JavaScript

By Daag Alemayehu

One of the more useful things you can do with JavaScript is work with the Document Object Model (or "DOM") of an HTML document. The DOM represents the content of a document as a tree of objects. It gives web browsers and scripting languages such as JavaScript a way to read and change the elements that make up that document. Using JavaScript and the DOM, you can let the browser parse an HTML string and then read back only its text, which converts HTML to plain text.

Step 1

Add a SCRIPT declaration to the HEAD section of your HTML document. The SCRIPT tag defines a client-side script such as JavaScript. Set the tag's "type" attribute to "text/javascript" (in HTML5 this is the default, so the attribute is optional on modern pages). The entire SCRIPT declaration should read as follows: <script type="text/javascript"> </script> (the JavaScript code goes between the two tags).

Step 2

Define a JavaScript function that takes one string as a parameter. This string parameter contains the HTML that you will be converting to plain text.

Step 3

Create a temporary DIV element inside your JavaScript function using the "document.createElement()" method. The element does not need to be added to the page.

Step 4

Assign your function's string parameter to your temporary DIV's "innerHTML" property. The browser parses the string into DOM elements inside the DIV.

Step 5

Create a temporary string variable in your function.

Step 6

Read the plain text content of your temporary DIV from its "textContent" or "innerText" property and assign it to your temporary string variable. Current browsers define both properties, but some older browsers define only one of them, so check which one is available. At least one of the two is defined in every major browser.

Step 7

Return the value held by your temporary string using a "return" statement. This will return the plain text value of your converted HTML.
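
Steps 2 through 7 can be put together roughly as follows. This is a minimal sketch, not a definitive implementation: the function name "htmlToPlainText" is a hypothetical choice (the steps above do not name the function), and the code assumes it runs in a web browser where "document" is available.

```javascript
// Hypothetical function name; the steps above do not specify one.
// Step 2: the function takes the HTML to convert as a single string.
function htmlToPlainText(html) {
  // Step 3: create a temporary DIV that is never attached to the page.
  var tmpDiv = document.createElement("div");

  // Step 4: let the browser parse the HTML string into DOM nodes.
  tmpDiv.innerHTML = html;

  // Step 5: temporary string variable that will hold the result.
  var tmpString;

  // Step 6: read whichever text property this browser defines.
  if (typeof tmpDiv.textContent === "string") {
    tmpString = tmpDiv.textContent;
  } else {
    tmpString = tmpDiv.innerText;
  }

  // Step 7: return the plain text.
  return tmpString;
}
```

In a browser, calling htmlToPlainText("<p>Hello <b>world</b></p>") returns "Hello world". Because the browser actually parses the string, it is safest to pass only HTML that you trust.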

Tips & Warnings

  • Instead of using if statements to check cross-browser compatibility and decide whether to read "textContent" or "innerText" in your function, you can assign your temporary DIV element's plain text value to your temporary string variable in one line: var tmpString = tmpDiv.textContent || tmpDiv.innerText;
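
The one-line shorthand from the tip can be wrapped in a small helper, sketched below. The helper name "readPlainText" and the trailing "" fallback are assumptions added for illustration and are not part of the original tip. The fallback matters because the || operator treats an empty string like an undefined value, so without it an element with no text on a browser lacking "innerText" would yield undefined instead of an empty string.

```javascript
// Hypothetical helper; the name and the final "" fallback are assumptions.
function readPlainText(element) {
  // Use textContent when it holds text; otherwise try innerText.
  // The trailing "" keeps the result a string when both are empty or undefined.
  var tmpString = element.textContent || element.innerText || "";
  return tmpString;
}
```

The helper only reads two properties, so it works on any DOM element, including the temporary DIV created in Step 3.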