    Reformatted with 4-space tab stops. · 48766512
    Authored by Guido van Rossum
    Allow '=' and '~' in unquoted attribute values.
    
    Added overridable methods handle_starttag(tag, method, attrs) and
    handle_endtag(tag, method) so subclasses can decide whether they
    really want to call the method (e.g. when suppressing some portion of
    the document).
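
A minimal sketch of how a subclass might use these hooks to suppress part of a document. Only the `handle_starttag(tag, method, attrs)` and `handle_endtag(tag, method)` signatures come from the message above; `MiniParser`, `SuppressingParser`, `dispatch_start`, `dispatch_end`, and the `hidden` tag are hypothetical stand-ins, and the default hooks are assumed to simply call the method they are given.

```python
class MiniParser:
    """Hypothetical stand-in for the parser's tag dispatch (not the real class)."""

    def __init__(self):
        self.log = []

    def dispatch_start(self, tag, attrs):
        # Look up start_<tag>; the hook decides whether it is really called.
        method = getattr(self, 'start_' + tag, None)
        if method is not None:
            self.handle_starttag(tag, method, attrs)

    def dispatch_end(self, tag):
        # Look up end_<tag>; the hook decides whether it is really called.
        method = getattr(self, 'end_' + tag, None)
        if method is not None:
            self.handle_endtag(tag, method)

    # Assumed default behaviour: always call the method.
    def handle_starttag(self, tag, method, attrs):
        method(attrs)

    def handle_endtag(self, tag, method):
        method()


class SuppressingParser(MiniParser):
    """Skips every tag method while inside a (hypothetical) <hidden> element."""

    def __init__(self):
        super().__init__()
        self.suppress = 0  # nesting depth of <hidden> elements

    def handle_starttag(self, tag, method, attrs):
        if tag == 'hidden':
            self.suppress += 1
        elif not self.suppress:
            method(attrs)

    def handle_endtag(self, tag, method):
        if tag == 'hidden':
            self.suppress -= 1
        elif not self.suppress:
            method()

    def start_b(self, attrs):
        self.log.append('start_b')

    def end_b(self):
        self.log.append('end_b')

    def start_hidden(self, attrs):
        self.log.append('start_hidden')

    def end_hidden(self):
        self.log.append('end_hidden')
```

Because the decision lives in the two hooks, the individual `start_*` and `end_*` methods need no knowledge of the suppression state.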
    
    Added support for a number of SGML shortcuts:
    
            shorthand               full notation
            <tag>...<>...           <tag>...<tag>...
            <tag>...</>             <tag>...</tag>
            <tag/.../               <tag>...</tag>
            <tag1<tag2>             <tag1><tag2>
            </tag1</tag2>           </tag1></tag2>
            </tag1<tag2>            </tag1><tag2>
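
To illustrate the first two rows of the table, here is a rough sketch that rewrites `<>` and `</>` into full notation. It is not the parser's own code: `expand_empty_tags` is a hypothetical helper, it handles only attribute-free tags, and it assumes `<>` repeats the last start tag seen while `</>` closes the most recently opened element.

```python
import re

# Matches <tag>, </tag>, <> and </> (attribute-free tags only).
TAG = re.compile(r'</?([A-Za-z][A-Za-z0-9]*)?>')


def expand_empty_tags(text):
    """Rewrite the <> and </> shorthands into full tags."""
    stack = []       # currently open elements, innermost last
    last_tag = None  # name of the last start tag seen

    def repl(match):
        nonlocal last_tag
        name = match.group(1)
        if match.group(0).startswith('</'):
            if name is None:
                if not stack:
                    return match.group(0)  # nothing open: leave as-is
                name = stack.pop()
            elif name in stack:
                # Drop the innermost open element with this name.
                idx = len(stack) - 1 - stack[::-1].index(name)
                del stack[idx]
            return '</%s>' % name
        if name is None:
            if last_tag is None:
                return match.group(0)  # no earlier tag to repeat
            name = last_tag
        stack.append(name)
        last_tag = name
        return '<%s>' % name

    return TAG.sub(repl, text)
```

For example, `expand_empty_tags('<b>x<>y')` yields `<b>x<b>y`, matching the first row of the table.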
    
    This required factoring out some common actions and rationalizing the
    interface to parse_endtag(), so as to make the code more readable.
    
    Fixed syntax for &entity and &#char references so the trailing
    semicolon is optional; removed explicit support for trailing period
    (which was a TBL mistake in HTML 0.0).
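
The optional-semicolon rule can be sketched with one regular expression. This is an illustration under stated assumptions rather than the patterns used in sgmllib.py: `REFERENCE`, `ENTITIES`, and `expand_refs` are hypothetical names, the entity table is deliberately tiny, and unknown entities are left untouched.

```python
import re

# '&name' or '&#digits', each followed by an optional ';'.
# A trailing period is not part of the reference.
REFERENCE = re.compile(r'&(?:#([0-9]+)|([a-zA-Z][a-zA-Z0-9]*));?')

# Small illustrative table; a real parser would know more entities.
ENTITIES = {'lt': '<', 'gt': '>', 'amp': '&', 'quot': '"'}


def expand_refs(text):
    """Replace entity and character references, with or without ';'."""
    def repl(match):
        number, name = match.groups()
        if number is not None:
            return chr(int(number))
        # Unknown entity names are returned unchanged.
        return ENTITIES.get(name, match.group(0))

    return REFERENCE.sub(repl, text)
```

With this pattern both `&lt` and `&lt;` expand to `<`, while in `&gt.` the period stays behind as ordinary text.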
    
    Generalized the test program.
    
    Tried to speed things up a little.  (More to come after the profile
    results are in.)
    
    Fix error recovery: call the end methods of the tags popped from the
    stack instead of the one for the tag that triggered the recovery.
    (Plus some complications because of the way HTML extensions are
    handled in Grail.)
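
The error-recovery rule might look roughly like the sketch below. It assumes the parser keeps a stack of open tag names; `close_through` and the `end_handlers` mapping are hypothetical, and the Grail-specific complications mentioned above are not modelled.

```python
def close_through(stack, tag, end_handlers):
    """Pop open elements up to and including `tag`.

    Each popped element gets its own end handler called, rather than
    the handler of the tag that triggered the recovery. Returns the
    names of the popped elements in the order they were closed.
    """
    if tag not in stack:
        return []  # stray end tag: nothing to unwind
    closed = []
    while stack:
        popped = stack.pop()
        handler = end_handlers.get(popped)
        if handler is not None:
            handler()
        closed.append(popped)
        if popped == tag:
            break
    return closed
```

For instance, with `['html', 'ul', 'li', 'b']` open, an incoming `</ul>` closes `b`, `li`, and `ul` in that order and leaves `html` on the stack.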
sgmllib.py 11.5 KB