TY - GEN
T1 - Using feature selection and classification to build effective and efficient firewalls
AU - Wald, Randall
AU - Villanustre, Flavio
AU - Khoshgoftaar, Taghi M.
AU - Zuech, Richard
AU - Robinson, Jarvis
AU - Muharemagic, Edin
N1 - Publisher Copyright: © 2014 IEEE.
PY - 2014/2/27
Y1 - 2014/2/27
N2 - Firewalls form an essential element of modern network security, detecting and discarding malicious packets before they can cause harm to the network being protected. However, these firewalls must process a large number of packets very quickly, and so can't always make decisions based on all of the packets' properties (features). Thus, it is important to understand which features are most relevant in determining if a packet is malicious, and whether a simple model built from these features can be as effective as a model which uses all information on each packet. We explore a dataset with real-world firewall data to answer these questions, ranking the features with 22 feature selection techniques and building classification models using four classifiers (learners). Our results show that the top two features are proto and dst (representing the network protocol and destination IP address, respectively), and that models built using these two features in combination with the Naive Bayes learner are highly effective while being minimally computationally expensive. Such models have the potential to replace conventional firewalls while lowering computational needs.
KW - Classification
KW - Feature selection
KW - Firewall
KW - Intrusion detection
UR - https://www.scopus.com/pages/publications/84946687474
U2 - 10.1109/IRI.2014.7051979
DO - 10.1109/IRI.2014.7051979
M3 - Conference contribution
AN - SCOPUS:84946687474
T3 - Proceedings of the 2014 IEEE 15th International Conference on Information Reuse and Integration, IEEE IRI 2014
SP - 850
EP - 854
BT - Proceedings of the 2014 IEEE 15th International Conference on Information Reuse and Integration, IEEE IRI 2014
A2 - Bertino, Elisa
A2 - Thuraisingham, Bhavani
A2 - Liu, Ling
A2 - Joshi, James
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 15th IEEE International Conference on Information Reuse and Integration, IEEE IRI 2014
Y2 - 13 August 2014 through 15 August 2014
ER -