This project is to build an HTML5 parser, then use that to build a WYSIWYG HTML
editor for the browser.
+The code is written in CoffeeScript for modern browsers.
+
+
Status
------
Getting Started
---------------
+Recommended: see "Compiling" below
+
+(experimental) Alternate: see "Without Compiling"
+
+
+Compiling
+---------
+
You'll need CoffeeScript, which you can hopefully get with a command such as
this:
Then run the test suite by opening ``index.html`` in a modern browser.
+Without Compiling
+-----------------
+
+It is recommended to install CoffeeScript (see Compiling above), but you might
+be able to get it to compile directly in the browser, see here:
+
+ http://coffeescript.org/#scripts
+
+Please nudge Jason (see below) to make this easier.
+
+
Feedback, Questions, Etc
------------------------
# This code is a fairly direct implementation of the parsing algorithm
# described here:
-# http://www.w3.org/TR/html5/syntax.html#preprocessing-the-input-stream
#
-# Deviations from that spec:
+# http://www.w3.org/TR/html5/syntax.html
#
-# Purposeful: search this file for "WHATWG"
+# except for some places marked "WHATWG" that are implemented as described here:
#
-# Not finished yet: search this file for "fixfull", "TODO" and "FIXME"
+# https://html.spec.whatwg.org/multipage/syntax.html
+#
+# This code passes all of the tests in the .dat files at:
+#
+# https://github.com/JasonWoof/html5lib-tests/tree/patch-1/tree-construction
+
+
+##################################
+## how to use this code
+##################################
+#
+# See README.md for how to pre-compile this file, or compile it in the browser.
+#
+# This file exports a single useful function: parse_html
+#
+# Once you include this file in a page (see index.html for an example) you'll
+# have window.wheic
+#
+# Call it like this:
+#
+# wheic.parse_html({html: "<p><b>hi</p>"})
+#
+# Or, if you don't want <html><head><body>/etc, do this:
+#
+# wheic.parse_html({fragment: "body", html: "<p><b>hi</p>"})
+#
+# This code can _almost_ run outside the browser (e.g. under node.js). To get it
+# to run without the browser would require a native implementation of
+# decode_named_char_ref(). The current implementation of that function uses the
+# browser's DOM API to save space (the list of valid named characters is
+# massive).
+
+# This code is a work in progress; e.g. try searching this file for "fixfull",
+# "TODO" and "FIXME"
-# stacks/lists
+# Notes: stacks/lists
#
-# the spec uses a many different words do indicate which ends of lists/stacks
-# they are talking about (and relative movement within the lists/stacks). This
-# section splains. I'm implementing "lists" (afe and open_els) the same way
-# (both as stacks)
+# Jason was frequently confused by the terminology used to refer to different
+# parts of the stacks and lists in the spec, so he made this chart to help keep
+# his head straight:
#
# stacks grow downward (current element is index=0)
#
# 1: b
# 0: a "end of the list", "current node", "bottommost", "last"
-
-# browser
-# note: to get this to run outside a browser, you'll have to write a native
-# implementation of decode_named_char_ref()
unless module?.exports?
    window.wheic = {}
    module = exports: window.wheic