From: SSTirlin at holnam.com (Scott Stirling)
Date: Thu, 29 Apr 1999 09:26:18 -0400
Subject: HTML "sanitizer" in Python
Message-ID: <s72825ca.087@holnam.com>
Content-Length: 1665
X-UID: 867

Thanks, Mark! That is a very cool tool. It will make a nice HTML editor for me here at work.

The only feature I immediately saw lacking (but maybe I missed it--I just downloaded it this AM) is the ability to record macros. For my Excel problem, I really need the ability to batch process the HTML files because there are 14 of them.

Anyway, this is a great reference. Thank you again.

Scott
>>> "Mark Nottingham" <mnot at pobox.com> 04/28 6:17 PM >>>
There's a better (albeit non-Python) way.

Check out http://www.w3.org/People/Raggett/tidy/

Tidy will do wonderful things in terms of making HTML compliant with the
spec (closing tags, cleaning up the crud that Word makes, etc.) As a big
bonus, it will remove all <FONT> tags, etc, and replace them with CSS1 style
sheets. Wow.

It's C, and is also available with a windows GUI (HTML-Kit) that makes a
pretty good HTML editor as well. On Unix, it's a command line utility, so
you can use it (clumsily) from a Python program.

I suppose an extension could also be written; will look into this (or if
anyone does it, please tell me!)

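The two points in this thread (Tidy as a command line utility driven from Python, and the need to batch process a set of HTML files) can be combined in a short script. What follows is only a rough sketch, not anything Mark or Scott posted. It assumes a `tidy` executable is on the PATH and that it accepts the `-q` (quiet), `-c` (replace FONT tags with CSS) and `-m` (modify the file in place) options; check `tidy -help` on your build before relying on them. The function names `tidy_command` and `tidy_directory` are made up for illustration.

```python
import subprocess
from pathlib import Path


def tidy_command(path, executable="tidy"):
    """Build the argument list for one in-place Tidy run (flags are assumptions)."""
    # -q: suppress nonessential output
    # -c: replace FONT and similar tags with CSS
    # -m: modify the input file in place
    return [executable, "-q", "-c", "-m", str(path)]


def tidy_directory(directory, pattern="*.html", run=subprocess.run):
    """Run Tidy on every file matching pattern; return {file name: exit code}."""
    results = {}
    for path in sorted(Path(directory).glob(pattern)):
        # No check=True here: Tidy exits non-zero for mere warnings as well.
        completed = run(tidy_command(path))
        results[path.name] = completed.returncode
    return results


if __name__ == "__main__":
    # Hypothetical usage: clean every .html file in the current directory.
    for name, code in tidy_directory(".").items():
        print(name, "exit code", code)
```

Because `-m` overwrites the files, it is probably wise to run this on a copy of the 14 files first. The `run` parameter is there only so the loop can be tried out without Tidy installed.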
__________________________________________________________________
| Scott M. Stirling |
| Visit the HOLNAM Year 2000 Web Site: http://web/y2k |
| Keane - Holnam Year 2000 Project |
| Office: 734/529-2411 ext. 2327 fax: 734/529-5066 email: sstirlin at holnam.com |
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~