File Encoding
This plugin determines which encoding a file uses. This is useful, for example, when setting the encoding of the input file in a Lobster_data profile, which must be correct for the input file to be parsed correctly.
(1) First, enter a search term. Choose a term that you know, or strongly suspect, occurs in the file you are investigating. If possible, use a term that contains umlauts or special characters, because these characters are usually encoded differently in different encodings, which makes a unique result more likely.
(2) Once a search term (1) has been entered, upload the file whose encoding is to be checked here, either via drag and drop or via the Import button.
(3) Once the file has been uploaded (2), the file name and a matching encoding are displayed. The plugin encodes the search term (1) with one encoding and then searches the file for a match. If no match is found, it tries the next encoding. Several encodings may match a search term, but only the first encoding for which a match is found is displayed as the result.
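The search procedure described in (3) can be sketched in a few lines of Python. This is only an illustration of the idea, not the plugin's implementation: the function name guess_encoding, the candidate list and its order are assumptions made for this sketch, and the encodings the plugin actually tries are not documented here.

```python
from typing import Optional, Sequence

# Hypothetical candidate list and order, chosen only for this sketch.
CANDIDATE_ENCODINGS = ("utf-8", "iso-8859-1", "cp1252", "utf-16-le", "utf-16-be")


def guess_encoding(
    data: bytes,
    term: str,
    candidates: Sequence[str] = CANDIDATE_ENCODINGS,
) -> Optional[str]:
    """Return the first candidate encoding whose encoded form of `term` occurs in `data`."""
    if not term:
        # An empty term would match every file, so a search term is required.
        raise ValueError("a non-empty search term is required")
    for name in candidates:
        try:
            needle = term.encode(name)
        except UnicodeEncodeError:
            # The term cannot be represented in this encoding; try the next one.
            continue
        if needle in data:
            # First match wins; later candidates are not checked.
            return name
    return None


if __name__ == "__main__":
    # Simulate a file stored as ISO-8859-1 (the encoding this page calls 8859_1).
    file_bytes = "<Nachname>Müller</Nachname>".encode("iso-8859-1")
    print(guess_encoding(file_bytes, "Müller"))  # iso-8859-1
```

Because the first match wins, the result depends on the order of the candidates whenever the search term has the same byte representation in several encodings.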
Example
The following file Invoice.xml produces the result shown in the screenshot above, provided that the file is encoded as 8859_1. The example only serves to illustrate the approach.
Note: Instead of the search term Müller, try the search term Invoice, which is less conclusive because it contains no umlauts or special characters.
<Invoice>
  <InvoiceNo>INV_12345</InvoiceNo>
  <InvoiceDate>2014-05-21T15:25:11Z</InvoiceDate>
  <Customer>
    <CustomerNo>12345</CustomerNo>
    <Vorname>Peter</Vorname>
    <Nachname>Müller</Nachname>
    <City>Munich</City>
    <DOB>15.07.1981</DOB>
    <Order>
      <ArticleNo id="123689">GB459363258</ArticleNo>
      <Quantity>1</Quantity>
      <Price>17.98</Price>
    </Order>
    <Order>
      <ArticleNo id="5896324">459363298</ArticleNo>
      <Quantity>1</Quantity>
      <Price>4.99</Price>
    </Order>
  </Customer>
</Invoice>
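To see why Invoice is a less conclusive search term than Müller, it helps to compare the bytes each term produces in a few encodings. The following Python sketch groups encodings by the byte sequence they yield. The helper name byte_forms and the three encodings are assumptions chosen for illustration, with iso-8859-1 standing in for the 8859_1 mentioned above.

```python
from typing import Dict, List, Sequence

# Encodings chosen only for illustration.
ENCODINGS = ("iso-8859-1", "cp1252", "utf-8")


def byte_forms(term: str, encodings: Sequence[str] = ENCODINGS) -> Dict[bytes, List[str]]:
    """Group the given encodings by the byte sequence they produce for `term`."""
    forms: Dict[bytes, List[str]] = {}
    for name in encodings:
        forms.setdefault(term.encode(name), []).append(name)
    return forms


if __name__ == "__main__":
    for term in ("Müller", "Invoice"):
        for raw, names in byte_forms(term).items():
            # Show the bytes as hex next to the encodings that produce them.
            print(f"{term!r}: {raw.hex(' ')} <- {', '.join(names)}")
```

With these three encodings, Invoice yields a single byte sequence, so a match says nothing about which of them the file uses. Müller yields two sequences: one shared by iso-8859-1 and cp1252, and a different one for utf-8. Even a term with an umlaut therefore cannot separate every pair of encodings, which is why several encodings may match a search term.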