File encoding

This plugin helps you determine which encoding a file uses. This is useful, for example, when you need to set the correct encoding of an input file in a profile so that the file is parsed correctly.

Settings


(1) Enter search text: First, enter a search term that you know, or strongly suspect, occurs in the file you are investigating. If possible, use a term that contains umlauts or other special characters. Encodings usually differ in how they represent these characters, which makes a unique result more likely.

(2) Drag and drop file: Once a search term has been entered (1), upload the file whose encoding is to be checked, either via drag and drop or via the "Import" button.

(3) Read suggested encoding: Once the file has been uploaded (2), the file name and a suitable encoding are displayed. To find it, the search term (1) is encoded with one encoding and the result is searched for in the file. If no match is found, the next encoding is tried. The first encoding for which a match is found is displayed as the result. Note that several encodings may match the same search term; only the first match is shown.
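The search described in (3) can be sketched in a few lines of Python. This is only an illustration of the approach, not the plugin's implementation: the function name suggest_encoding, the candidate list, and its order are assumptions made for this example.

```python
from typing import Optional, Sequence

# Hypothetical candidate list; the encodings the plugin actually tries,
# and the order in which it tries them, are not documented here.
CANDIDATE_ENCODINGS = ("utf-8", "iso-8859-1", "utf-16-le", "utf-16-be")


def suggest_encoding(
    data: bytes,
    search_term: str,
    candidates: Sequence[str] = CANDIDATE_ENCODINGS,
) -> Optional[str]:
    """Return the first candidate encoding whose encoded search term occurs in data."""
    for encoding in candidates:
        try:
            needle = search_term.encode(encoding)
        except UnicodeEncodeError:
            # The term cannot be represented in this encoding; try the next one.
            continue
        if needle in data:
            return encoding
    # No candidate produced a byte sequence that occurs in the file.
    return None
```

For a file read in binary mode, for example data = open("Invoice.xml", "rb").read(), a call such as suggest_encoding(data, "M\u00fcller") returns the name of the first matching candidate, or None if no candidate matches.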

Example


The following file "Invoice.xml" produces the result shown in the screenshot above, assuming that the file is encoded with the encoding "8859_1". The example merely illustrates the approach.

Note: Instead of the search term "Müller", try the search term "Invoice". It is less conclusive because it contains no umlauts or special characters.


<Invoice>
<InvoiceNo>INV_12345</InvoiceNo>
<InvoiceDate>2014-05-21T15:25:11Z</InvoiceDate>
<Customer>
<CustomerNo>12345</CustomerNo>
<Vorname>Peter</Vorname>
<Nachname>Müller</Nachname>
<City>Munich</City>
<DOB>15.07.1981</DOB>
<Order>
<ArticleNo id="123689">GB459363258</ArticleNo>
<Quantity>1</Quantity>
<Price>17.98</Price>
</Order>
<Order>
<ArticleNo id="5896324">459363298</ArticleNo>
<Quantity>1</Quantity>
<Price>4.99</Price>
</Order>
</Customer>
</Invoice>
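To see why "Invoice" is a less conclusive search term than "Müller", it helps to compare the bytes each term produces in different encodings. The following Python sketch groups encodings that yield identical bytes for a term. The function name encodings_with_same_bytes and the three candidate encodings are assumptions chosen for illustration. Python spells the encoding from the example "iso-8859-1".

```python
from typing import Dict, List, Sequence


def encodings_with_same_bytes(
    term: str, candidates: Sequence[str]
) -> Dict[bytes, List[str]]:
    """Group candidate encodings by the byte sequence they produce for term."""
    groups: Dict[bytes, List[str]] = {}
    for encoding in candidates:
        try:
            encoded = term.encode(encoding)
        except UnicodeEncodeError:
            # The term cannot be represented in this encoding; skip it.
            continue
        groups.setdefault(encoded, []).append(encoding)
    return groups


# Hypothetical candidates, chosen only for this comparison.
CANDIDATES = ("utf-8", "iso-8859-1", "cp1252")

# Pure ASCII term: every candidate yields the same bytes (one group).
ascii_groups = encodings_with_same_bytes("Invoice", CANDIDATES)

# Term with an umlaut ("Müller"): UTF-8 yields different bytes than the others.
umlaut_groups = encodings_with_same_bytes("M\u00fcller", CANDIDATES)
```

Here ascii_groups contains a single group with all three encodings, so a match for "Invoice" says nothing about which of them was used. umlaut_groups contains two groups: UTF-8 on its own, and ISO-8859-1 together with Windows-1252, which encode "ü" as the same single byte. An umlaut therefore narrows the result but does not always make it unique.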