From 9ecc7f55f96de835055fa7c82f66d08b7b884a36 Mon Sep 17 00:00:00 2001
From: Jason Woofenden
Date: Thu, 24 Dec 2015 14:32:22 -0500
Subject: [PATCH] improve docs

---
 README.md         | 22 ++++++++++++++++++++++
 parse-html.coffee | 53 ++++++++++++++++++++++++++++++++++++++++-------------
 2 files changed, 62 insertions(+), 13 deletions(-)

diff --git a/README.md b/README.md
index 111cd5d..d2acc77 100644
--- a/README.md
+++ b/README.md
@@ -4,6 +4,9 @@ wheic
 This project is to build a HTML5 parser, then use that to build a WYSIWYG html
 editor for the browser.
 
+The code is written in coffeescript for modern browsers.
+
+
 Status
 ------
 
@@ -13,6 +16,14 @@ Under development
 Getting Started
 ---------------
 
+Recommended: see "Compiling" below
+
+(experimental) Alternate: see "Without Compiling"
+
+
+Compiling
+---------
+
 You'll need coffeescript, you can hopefully get that with a command such as
 this:
 
@@ -27,6 +38,17 @@ Then run ``make``
 Then run the test suite by opening ``index.html`` in a modern browser.
 
 
+Without Compiling
+-----------------
+
+It is recommended to install coffeescript (see Compiling above), but you might
+be able to get it to compile directly in the browser, see here:
+
+    http://coffeescript.org/#scripts
+
+Please nudge Jason (see below) to make this easier.
+
+
 Feedback, Questions, Etc
 ------------------------
 
diff --git a/parse-html.coffee b/parse-html.coffee
index 46195f9..a4beb3d 100644
--- a/parse-html.coffee
+++ b/parse-html.coffee
@@ -20,21 +20,52 @@
 
 # The implementation is a pretty direct implementation of the parsing algorithm
 # described here:
-# http://www.w3.org/TR/html5/syntax.html#preprocessing-the-input-stream
 #
-# Deviations from that spec:
+# http://www.w3.org/TR/html5/syntax.html
 #
-# Purposeful: search this file for "WHATWG"
+# except for some places marked "WHATWG" that are implemented as described here:
 #
-# Not finished yet: search this file for "fixfull", "TODO" and "FIXME"
+# https://html.spec.whatwg.org/multipage/syntax.html
+#
+# This code passes all of the tests in the .dat files at:
+#
+# https://github.com/JasonWoof/html5lib-tests/tree/patch-1/tree-construction
+
+
+##################################
+## how to use this code
+##################################
+#
+# See README.md for how to pre-compile this file, or compile it in the browser.
+#
+# This file exports a single useful function: parse_html
+#
+# Once you include this file in a page (see index.html for an example) you'll
+# have window.wheic
+#
+# Call it like this:
+#
+# wheic.parse_html({html: "hi"})
+#
+# Or, if you don't want /etc, do this:
+#
+# wheic.parse_html({fragment: "body", html: "hi"})
+#
+# This code can _almost_ run outside the browser (eg under node.js). To get it
+# to run without the browser would require native implementation of
+# decode_named_char_ref(). The current implementation of that function uses the
+# browser's DOM api, to save space (the list of valid named characters is
+# massive.)
+
+# This code is a work in progress, eg try searching this file for "fixfull",
+# "TODO" and "FIXME"
 
 
-# stacks/lists
+# Notes: stacks/lists
 #
-# the spec uses a many different words do indicate which ends of lists/stacks
-# they are talking about (and relative movement within the lists/stacks). This
-# section splains. I'm implementing "lists" (afe and open_els) the same way
-# (both as stacks)
+# Jason was frequently confused by the terminology used to refer to different
+# parts of the stacks and lists in the spec, so he made this chart to help keep
+# his head straight:
 #
 # stacks grow downward (current element is index=0)
 #
@@ -50,10 +81,6 @@
 # 1: b
 # 0: a "end of the list", "current node", "bottommost", "last"
 
-
-# browser
-# note: to get this to run outside a browser, you'll have to write a native
-# implementation of decode_named_char_ref()
 unless module?.exports?
 	window.wheic = {}
 	module = exports: window.wheic
-- 
1.7.10.4