Features analysis of internet traffic classification using interpretable machine learning models

dc.contributor.author: HOUNDJI, Vinasetan Ratheil
dc.contributor.author: Adje, Erick
dc.contributor.author: DOSSOU, MICHEL
dc.date.accessioned: 2026-06-02T16:06:57Z
dc.date.available: 2026-06-02T16:06:57Z
dc.date.issued: 2022
dc.description.abstract: Internet traffic classification is a fundamental task for network services and management. Good machine learning models exist for identifying the class of traffic. However, finding the most discriminating features remains essential for building efficient models. In this paper, we use interpretable machine learning algorithms such as decision tree, random forest, and eXtreme gradient boosting (XGBoost) to find the most discriminating features for internet traffic classification. The dataset used contains 377,526 traffic flows, each described by 248 features. From these features, we propose a 12-feature model with an accuracy of up to 99.76%. We tested it on another dataset of 19,626 flows and obtained an accuracy of 98.40%, which shows the efficiency and stability of our model. We also identify a set of 14 important features for internet traffic classification, including two that are crucial: port number (server) and minimum segment size (client to server).
dc.identifier.doi: 10.11591/ijai.v11.i3.pp1175-1183
dc.identifier.other: BECDB-14628
dc.identifier.uri: https://dspace.uac.bj/handle/123456789/12455
dc.language.iso: fr
dc.relation.ispartof: International Journal of Artificial Intelligence
dc.subject: classification algorithm
dc.subject: internet traffic
dc.subject: machine learning
dc.subject: traffic classification
dc.subject: traffic internet discriminators
dc.title: Features analysis of internet traffic classification using interpretable machine learning models
dc.type: Article
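
The abstract describes ranking features by how well they discriminate between traffic classes, using tree-based models. The sketch below is not the authors' pipeline. It only illustrates, in plain Python, the Gini impurity decrease that decision trees and random forests commonly use to score a feature. The toy flows, the class labels, and the feature names (`server_port`, `min_segment_size`, `noise`) are hypothetical and chosen only to echo the two features the abstract calls crucial.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split_gain(values, labels):
    """Largest Gini impurity decrease from a single threshold on one feature."""
    parent = gini(labels)
    n = len(labels)
    best = 0.0
    # Candidate thresholds: every distinct value except the largest one.
    for threshold in sorted(set(values))[:-1]:
        left = [y for v, y in zip(values, labels) if v <= threshold]
        right = [y for v, y in zip(values, labels) if v > threshold]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - weighted)
    return best


def rank_features(rows, labels, names):
    """Return (feature name, gain) pairs sorted from most to least discriminating."""
    gains = {}
    for j, name in enumerate(names):
        column = [row[j] for row in rows]
        gains[name] = best_split_gain(column, labels)
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy flows: (server_port, min_segment_size, noise).
names = ["server_port", "min_segment_size", "noise"]
rows = [
    (80, 20, 1), (80, 20, 2), (80, 20, 1), (80, 32, 2),  # labelled WWW
    (25, 32, 1), (25, 32, 2), (25, 32, 1), (25, 20, 2),  # labelled MAIL
]
labels = ["WWW"] * 4 + ["MAIL"] * 4

ranking = rank_features(rows, labels, names)
for name, gain in ranking:
    print(f"{name}: {gain:.3f}")
```

On this toy data the port feature separates the two classes perfectly, the segment-size feature only partially, and the noise feature not at all. A real study would instead read feature importances from trained models on the full feature set and validate the reduced model on held-out flows.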
