File Converters

department of social sciences
new methods for the analysis of media content

File Format Converters

Many software tools for data analysis require source files in a specific format. On this page you will find a number of tools with which you can convert the format of multiple files at the same time.

HTML to TXT Conversion

Software that processes textual data frequently cannot handle HTML files directly, although these are the most common files found on the web. The programs below convert HTML into plain text; many of them support batch conversion. To get an idea of the capabilities of these programs, you can view the plain-text version of the current HTML page, as created by each program, by clicking on the "Conversion Result" links. NB: These conversion results are usually much better than conversions done with Microsoft™ Word™ (Conversion Result of Word™).

* Web2Text
  Web2Text is a bare-bones program for batch converting HTML files into TXT. It lets you configure the most important options (such as line length) and yields decent results.
  (Conversion Result)
* Detagger
  Detagger offers a few more customization options than Web2Text, so check whether you need the additional options (such as restricting the output file to ASCII characters). You can test a fully functional version of this shareware, which currently costs US$20.
  (Conversion Result)
* HTMLtoTXT
  If you do not need paragraph marks in the HTML to be reproduced in the ASCII file, try the freeware HTMLtoTXT.
  (Conversion Result)
* Markup Remover
  This Windows 3.11 style tag remover is shareware and has some useful customization facilities. Most importantly, it can convert to ASCII, ISO 8859-1, and ANSI (for UNIX).
  (Conversion Result)
* Microblast HTML to TEXT
  Microblast's HTML to TEXT (shareware, US$10) features the most intuitive interface, but yields mediocre results at best. It is not customizable: not even the line breaks can be configured. Its "Open" and "Save" menus do not follow the Windows™ standard (there are no standard file type filters), and batch conversion is not implemented.
  (Conversion Result)
* NoteTab
  NoteTab is not a stand-alone detagger, but a full-fledged ASCII/HTML editor. The shareware fee of US$19.95 pays for itself quickly: NoteTab transforms HTML into plain text very effectively, its results are very clean, and it can batch convert many files.
  (Conversion Result)
* HTML Markdown
  HTML Markdown was written for the PowerMac.
* more HTML converters

Other Converters

* ABC Amber Textconverter
  This shareware conversion tool converts between many major file formats, namely:
  + ANSI (.txt)
  + Unicode (.txt)
  + Rich Text Format (.rtf)
  + Microsoft™ Word™ (.doc)
  + Corel™ WordPerfect™ (.doc)
  + Lotus™ AmiPro™ (.ami)
  + Microsoft™ Excel™ (.xls)
  + Lotus™ 1-2-3™
  + Adobe™ Portable Document Format (.pdf)
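
For readers who would rather script the HTML to TXT step than install one of the programs listed on this page, the following is a minimal sketch in Python, using only the standard library. It is not how any of the listed tools work internally. The names `TextExtractor`, `html_to_text`, and `convert_directory` are hypothetical, and the choice of block-level tags and the default line length of 72 are assumptions made for illustration.

```python
from html.parser import HTMLParser
from pathlib import Path
import textwrap


class TextExtractor(HTMLParser):
    """Collects text content and drops tags, scripts, and styles (illustrative helper)."""

    # Assumption: these tags are treated as paragraph boundaries.
    BLOCK_TAGS = {"p", "br", "div", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}
    SKIP_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)  # decode entities such as &amp;
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self.skip_depth += 1
        elif tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1
        elif tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.parts.append(data)


def html_to_text(html, line_length=72):
    """Return plain text with one blank line between paragraphs."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    paragraphs = []
    for chunk in "".join(parser.parts).split("\n"):
        words = " ".join(chunk.split())  # collapse runs of whitespace
        if words:
            paragraphs.append(textwrap.fill(words, width=line_length))
    return "\n\n".join(paragraphs) + "\n"


def convert_directory(src_dir, dst_dir, encoding="utf-8", line_length=72):
    """Batch convert every .htm/.html file in src_dir into a .txt file in dst_dir."""
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    written = []
    for html_file in sorted(src.glob("*.htm*")):
        html = html_file.read_text(encoding=encoding, errors="replace")
        out = dst / (html_file.stem + ".txt")
        out.write_text(html_to_text(html, line_length), encoding=encoding)
        written.append(out)
    return written
```

A call such as `convert_directory("pages", "plain")` (both directory names are placeholders) would write one `.txt` file per HTML file. The `line_length` parameter corresponds to the line-length option mentioned for Web2Text. Real-world pages with malformed markup or unusual encodings may need a more robust parser than this sketch provides.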