If someone sends you an “HTML” mail from Outlook, even Tidy will run away screaming unless you strip out some of the gunk manually before trying to fix it.
If it’s Quoted-Printable, you have a bit more work to do first (maybe a web service or a sed script to decode it), though you probably have even more work to do if the original document used a non-Western encoding. Not tested:
sed -e 's/<o:p>/<p>/g' | sed -e 's/<\/o:p>/<\/p>/g' | /usr/local/bin/tidy -c
(Broken into two sed invocations for readability’s (hah!) sake…)
Of course, it’s all very brute-force, but usually good enough for government work.
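For the curious, here is a rough sketch of how the whole thing might be wrapped up as shell functions, Quoted-Printable step included. The function names are made up for this example, the decoding step assumes perl with its core MIME::QuotedPrint module is installed, and the Tidy path is simply the one used above. It does nothing about non-Western encodings.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, invented for this example.

# Decode Quoted-Printable text on stdin.
# Assumption: perl and its core MIME::QuotedPrint module are available.
decode_qp() {
    perl -MMIME::QuotedPrint -pe '$_ = decode_qp($_)'
}

# Turn Office's <o:p>...</o:p> into plain <p>...</p>.
# Same substitution as the pipeline in the post, in one sed process.
strip_office_tags() {
    sed -e 's/<o:p>/<p>/g' -e 's/<\/o:p>/<\/p>/g'
}

# Full pipeline: decode, strip, then hand the result to Tidy.
# -c asks Tidy to replace presentational tags with CSS.
fix_outlook_html() {
    decode_qp | strip_office_tags | /usr/local/bin/tidy -c
}
```

Something like `fix_outlook_html < message.html > fixed.html` would then run the lot (the file names are placeholders). If the mail isn’t Quoted-Printable, skip the first step and pipe straight into strip_office_tags.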
:: Dave Walker 12:13 (EST/EDT) [+]