Classification of goods by textual description: an applied study

Every country engaged in foreign trade is likely to need automatic classification of goods: there are numerous commodities, different goods carry different tax rates, and countries are interested in the structure of trade. Is it possible to automate the assignment of commodity codes from the text of the commodity description, and if so, with what degree of accuracy?

Commissioned by the Ministry of Economic Affairs and Communications, Statistics Estonia conducted an applied study to find out whether, and with what accuracy, a machine learning system could derive a Combined Nomenclature code from the text of a description, or at least narrow down the search. The study also assessed whether developing such a system would be worthwhile in terms of the potential benefits: for how large a share of the data set could the system determine the correct commodity code, either independently or as a tool used by the operator?

Applied study (in Estonian)

Interesting facts:

  • From a list of 8,000 commodity codes, the tested machine learning model can suggest the 9 codes most likely to match the product.
  • While the machine learning model would not completely do away with the role of the declarant in assigning the correct code to their goods at customs, it would greatly simplify the task.
  • A fully automated machine learning model would be able to assign the correct commodity code to only two thirds of the goods passing through customs.
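
The idea of narrowing a long code list down to a handful of candidates can be illustrated with a minimal sketch. It is not the model used in the study, which this summary does not describe; it assumes a toy multinomial naive Bayes classifier over description words, and the class name `TopKCodeSuggester`, the training descriptions, and the placeholder codes are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class TopKCodeSuggester:
    """Toy multinomial naive Bayes over description words (illustrative only)."""

    def fit(self, descriptions, codes):
        self.word_counts = defaultdict(Counter)  # code -> word frequencies
        self.code_counts = Counter(codes)        # code -> number of examples
        self.vocabulary = set()
        for text, code in zip(descriptions, codes):
            tokens = tokenize(text)
            self.word_counts[code].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def suggest(self, description, k=9):
        """Return up to k codes, ordered from most to least likely."""
        tokens = tokenize(description)
        total = sum(self.code_counts.values())
        vocab_size = len(self.vocabulary)
        scores = {}
        for code, n_examples in self.code_counts.items():
            counts = self.word_counts[code]
            n_words = sum(counts.values())
            score = math.log(n_examples / total)  # log prior of the code
            for token in tokens:
                # Laplace smoothing keeps unseen words from zeroing the score.
                score += math.log((counts[token] + 1) / (n_words + vocab_size))
            scores[code] = score
        return sorted(scores, key=scores.get, reverse=True)[:k]


# Made-up training data; the codes are placeholders, not real commodity codes.
train_texts = [
    "fresh apples in crates",
    "apple juice unfermented",
    "cotton t-shirts for men",
    "knitted cotton shirts",
    "laptop computer portable",
    "desktop computer with monitor",
]
train_codes = ["A1", "A2", "B1", "B1", "C1", "C2"]

model = TopKCodeSuggester().fit(train_texts, train_codes)
suggestions = model.suggest("men's cotton shirts", k=3)
print(suggestions)
```

In this framing, a declarant would pick the right code from the short list returned by `suggest`, while full automation would correspond to accepting only the first suggestion (`k=1`).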

Read more in the news release (in Estonian). 

Commissioned by:

Ministry of Economic Affairs and Communications

Author:

Hans Hõrak