NAME
vilistextum - HTML to ASCII converter
SYNOPSIS
vilistextum [OPTIONS] [inputfile | -] [outputfile | -]
DESCRIPTION
vilistextum is an HTML to ASCII converter written specifically to get the best possible output from incorrect HTML.
OPTIONS
- inputfile | -, outputfile | -
- Use '-' in place of inputfile to read from standard input, and '-' in place of outputfile to write to standard output.
- -a, --no-alt
- Don't output anything for IMG tags, even if they have an ALT attribute. Implies --no-image.
- -c, --convert-tags
- Some of the tags will be converted to special characters.
- -e, --errorlevel NUMBER
- Increase level of verbosity for error messages.
- -i, --defimage STRING
- IMG tags without an ALT attribute are output as [STRING]. Default: Image.
- -l, --links
- Numbers the links in the document and adds a footnote for each link at the end of the file, similar to 'lynx -dump'. Relative URIs are not resolved and won't be printed.
- -k, --links-inline
- Print each link directly after its HTML tag.
- -m, --dont-convert-characters
- The entities from Windows-1252 (the numeric entities &#128; to &#159;, which render as € through Ÿ, and their proper entity names) will not be converted.
- -n, --no-image
- Don't output [Image] for IMG tags that have no ALT attribute.
- -p, --palm
- Outputs text more suitable for reading on a PDA. Palm text readers do their own word wrapping, so the width is set to infinity and the program doesn't right-justify or center the text.
- -r, --remove-empty-alt
- If an IMG tag has an empty ALT attribute (e.g. <IMG src="..." alt="">), don't output '[]'.
- -s, --shrink-lines [NUMBER]
- If there are more than NUMBER consecutive empty lines, output only NUMBER of them. Default: 1.
- -t, --no-title
- Don't output the title of the HTML document.
- -w, --width NUMBER
- Maximum width of the output text. Default: 72.
- -h, --help
- Print a list of the command line options.
- -v, --version
- Output version information and exit.
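The effect described for --shrink-lines can be illustrated with a short Python sketch. This is not vilistextum's own code; the function name shrink_empty_lines and its limit parameter are hypothetical, and the sketch assumes that lines containing only whitespace count as empty.

```python
def shrink_empty_lines(text: str, limit: int = 1) -> str:
    """Collapse each run of empty lines to at most `limit` lines.

    Illustrative only: `limit` plays the role of NUMBER in --shrink-lines.
    Assumption: whitespace-only lines are treated as empty.
    """
    out = []
    run = 0  # length of the current run of empty lines
    for line in text.split("\n"):
        if line.strip() == "":
            run += 1
            if run > limit:
                continue  # drop empty lines beyond the limit
        else:
            run = 0  # a non-empty line ends the run
        out.append(line)
    return "\n".join(out)


sample = "title\n\n\n\nfirst paragraph\n\nsecond paragraph"
print(shrink_empty_lines(sample, limit=1))
```

With limit=1 the three empty lines after "title" collapse into one, while the single empty line between the two paragraphs is left alone.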
MULTIBYTE OPTIONS (Only available if compiled with multibyte support)
- -u, --output-utf-8
- Output everything as UTF-8 instead of using the character set of the HTML document.
- -x, --translit
- Use the //TRANSLIT feature of libiconv. Consult the iconv manual for details.
- -y, --charset CHARSET
- If the HTML document doesn't specify a character set in its meta tags, use CHARSET.
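The interplay between --charset and --output-utf-8 can be sketched in Python as follows. This is a simplified illustration, not how vilistextum is implemented: the helper names pick_charset and to_utf8 are hypothetical, the regular expression is a rough stand-in for real meta tag parsing, and Python's codecs are used here in place of libiconv.

```python
import re

# Rough pattern for "charset=..." inside a <meta> tag.
# Assumption: a simplification of real HTML charset detection.
META_CHARSET = re.compile(
    rb'<meta[^>]*charset\s*=\s*["\']?([A-Za-z0-9_.:-]+)', re.IGNORECASE
)


def pick_charset(raw: bytes, fallback: str) -> str:
    """Return the charset declared in a meta tag, else `fallback`.

    `fallback` plays the role of CHARSET in --charset.
    """
    match = META_CHARSET.search(raw)
    return match.group(1).decode("ascii") if match else fallback


def to_utf8(raw: bytes, fallback: str) -> bytes:
    """Decode with the declared or fallback charset, then re-encode as UTF-8.

    Raises LookupError if the charset name is unknown to Python.
    """
    return raw.decode(pick_charset(raw, fallback)).encode("utf-8")


declared = (b'<meta http-equiv="Content-Type" '
            b'content="text/html; charset=iso-8859-15"><p>\xa4</p>')
undeclared = b"<p>caf\xe9</p>"

print(pick_charset(declared, "windows-1252"))    # charset taken from the meta tag
print(pick_charset(undeclared, "windows-1252"))  # no declaration, fallback is used
```

The fallback is consulted only when no declaration is found, which mirrors the wording of --charset above.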
LIMITATIONS
The rendering of tables is not very good.
The handling of OL is incomplete: the program treats it as UL, and more than 10 nested lists confuse it.
Text is never justified.
REPORTING BUGS
Please report bugs to bhaak@gmx.net.
AUTHOR
Vilistextum was written by Patric Mueller <bhaak@gmx.net>.
It may be freely distributed under the terms of the GNU General Public License Version 2. There is ABSOLUTELY NO WARRANTY for this program.
SEE ALSO
iconv(3), lynx(1), links(1), w3m(1)