Diyashree and Sherly proposed an intrusion detection system (IDS) that monitors
network traffic for suspicious activity. The IDS is built on an ensemble CVM
(Core Vector Machine) classifier, and the Chi-square method is used to select
the appropriate features. The datasets used include KDD Cup’99, which has 41
features divided into three categories: basic features, content features, and
traffic features. Training and testing used the basic and content features,
21 of the 41 in total, and the classifier was trained and tested on KDD Cup’99.
The outputs of the Core Vector Machine classifiers are combined with a weighted
function, and this merging yields the ensemble Core Vector Machine classifier.
The results of the proposed method were compared with existing methods,
including SVM (Support Vector Machine), NB (Naïve Bayesian classifier), DT
(Decision Tree), RF (Random Forest), and AdaBoost DT (AdaBoost classifier).
From this comparison, the authors concluded that the Core Vector Machine
classifier is the most appropriate for the IDS because it keeps both testing
time and training time reasonable. The method reports a 99% detection rate,
90.9175% accuracy, and a 27% false-positive rate.
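
As a rough illustration of Chi-square feature selection of the kind described
above, the following Python sketch scores categorical features against the
class label and keeps the k highest-scoring ones. It is a minimal sketch, not
the authors' implementation: the function names (chi_square_score,
select_top_k), the toy records, and the value of k are assumptions made for
illustration, and continuous features would first have to be discretized.

```python
from collections import Counter


def chi_square_score(feature_values, labels):
    """Chi-square statistic between one categorical feature and the class label."""
    n = len(labels)
    observed = Counter(zip(feature_values, labels))
    feature_totals = Counter(feature_values)
    label_totals = Counter(labels)
    score = 0.0
    for value, value_count in feature_totals.items():
        for label, label_count in label_totals.items():
            # Count expected in this cell if feature and label were independent.
            expected = value_count * label_count / n
            score += (observed[(value, label)] - expected) ** 2 / expected
    return score


def select_top_k(columns, labels, k):
    """Rank features by chi-square score and keep the k highest."""
    scores = {name: chi_square_score(values, labels)
              for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy records (hypothetical values, not taken from KDD Cup'99).
columns = {
    "protocol_type": ["tcp", "tcp", "udp", "udp", "tcp", "udp"],
    "flag": ["SF", "S0", "SF", "SF", "S0", "SF"],
}
labels = ["normal", "attack", "normal", "normal", "attack", "normal"]

selected = select_top_k(columns, labels, k=1)
print(selected)
```

In this toy data the "flag" feature separates the two classes perfectly, so it
receives the higher score and is the one retained.
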
Other scholars introduced a genetic algorithm called FWP-SVM. The method uses
19 of the 41 features in order to decrease classification time and increase
classification accuracy. The 19 features are selected on the basis of the
genetic algorithm and the SVM (Support Vector Machine). The dataset used to
train and test the algorithm is KDD Cup’99. The reported advantage of the
algorithm is a detection rate of 96.61% and an accuracy of 99.75%, with the
false-positive rate lowered to 3.39%.
Yusof et al. worked on reducing computational complexity so that data can be
analyzed easily when an attack occurs. They designed a system that merges two
schemes in the feature selection algorithm: CSE (Consistency Subset Evaluation)
and DCF (DDoS Characteristic Features). These schemes are used to identify the
most important and relevant features associated with DoS attacks. The proposed
scheme was trained on the NSL-KDD 2009 dataset and uses 17 of the 41 features.
The classifier used to test the scheme is the ELM (Extreme Learning Machine).
The test results showed that the scheme reaches an accuracy of 91.7%, which is
better than the other schemes and methods it was compared with.
Anwer et al. developed a framework to identify the minimum number of features
needed to reach the highest accuracy. The framework applies different
strategies that combine filter and wrapper feature selection techniques. It was
evaluated on the UNSW-NB15 dataset using two machine learning classifiers, J48
and Naïve Bayes. The results showed that only 18 of the 41 features were needed
to achieve 88% accuracy.
Taher et al. applied ANN (Artificial Neural Network) based machine learning,
along with the SVM, under different feature selection mechanisms in order to
find the classifier with the highest accuracy and success rate. Two feature
selection mechanisms were used: the filter mechanism and the wrapper mechanism.
In the wrapper mechanism, 17 of the 41 features are used, based on the
Correlation technique; in the filter mechanism, 35 of the 41 features are used,
based on the Chi-square technique. The dataset used to evaluate the mechanisms
is NSL-KDD. The results showed that the ANN with wrapper feature selection
performed best for the IDS, with a detection rate of 94.02%.
B and S introduced a new approach, FRFSA (Fuzzy Rule and Information Gain Ratio
based Feature Selection Approach), to select the most important features, which
are then used to separate attack records from normal records. Two algorithms,
SVM and LSSVM, are used to classify the IDS dataset into normal and attack
records. The approach uses 14 of the 41 features and its performance was
evaluated on the NSL-KDD dataset. The results showed that the proposed
mechanism outperforms existing mechanisms and systems in decision making. In
future work it can be used as an intelligent agent-based classifier to make
effective and appropriate decisions.
Al-Yaseen et al. designed a model that addresses the IDS problems that occur in
analyzing and segregating network data. They used a multi-level hybrid IDS
model that combines the SVM and ELM mechanisms to make the IDS more efficient
at detecting known and unknown attacks. A modified algorithm is also used to
build a high-quality training dataset that uses 22 of the 41 features. The
scheme was evaluated on the KDD Cup 1999 dataset, where it obtained an accuracy
of 95.71%, a detection rate of 95.75%, and a false alarm rate of 1.87%. The
authors aim to develop a more effective model, based on efficient classifiers,
that can segregate new attacks. The model also uses the characteristics of a
multi-agent system to speed up the analysis of the data and to increase the
overall efficiency of the system.
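
The works surveyed here report accuracy, detection rate, and false alarm rate
without defining them. The sketch below assumes the usual confusion-matrix
definitions, with attacks as the positive class; individual papers may compute
these figures differently. The function name ids_metrics and the counts are
hypothetical and are not taken from any of the cited studies.

```python
def ids_metrics(tp, tn, fp, fn):
    """Return (detection_rate, false_alarm_rate, accuracy) from confusion counts.

    tp: attacks flagged as attacks      fn: attacks missed
    tn: normal traffic passed as normal fp: normal traffic flagged as attack
    """
    detection_rate = tp / (tp + fn)      # share of attacks that are caught
    false_alarm_rate = fp / (fp + tn)    # share of normal records flagged
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return detection_rate, false_alarm_rate, accuracy


# Hypothetical counts for 100 attack records and 100 normal records.
dr, far, acc = ids_metrics(tp=90, tn=95, fp=5, fn=10)
print(f"detection rate={dr:.3f}, false alarm rate={far:.3f}, accuracy={acc:.3f}")
```

Because the three figures depend on different cells of the confusion matrix, a
high detection rate can coexist with a high false alarm rate, which is why the
studies report them separately.
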
In another work, a method was developed to optimize the feature selection
process in order to increase classifier accuracy. The new method is called the
IWD (Intelligent Water Drops) feature selection method, and it was evaluated on
the KDD Cup’99 dataset. It combines two processes, IWD and SVM, and its main
goal is to improve SVM performance through IWD-based feature reduction. The
scheme uses 9 of the 41 features, selected through the methods involved,
including SVM, GA, and IWD. The method achieved a detection rate of 99.4075%,
an accuracy of 99.0915%, and a low false alarm rate of 1.405%.
Pham et al. implemented an approach that improves IDS performance by applying
ensemble methods together with feature selection. The ensembles were built with
two methods, Boosting and Bagging, using tree-based algorithms as the base
classifiers. The scheme uses 35 of the 41 features, selected with the GR (Gain
Ratio) technique. The reported results are an accuracy of 84.25% and a false
alarm rate of 2.79%.
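
To make the Gain Ratio criterion concrete, the following Python sketch computes
it for a categorical feature as information gain divided by split information.
It is an illustrative sketch under stated assumptions, not the procedure of any
cited study: the helper names (entropy, gain_ratio) and the toy values are
hypothetical, and numeric features would need to be discretized first.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def gain_ratio(feature_values, labels):
    """Information gain of the feature divided by its split information."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Label entropy that remains after splitting on the feature.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature_values)
    # A constant feature has zero split information and carries no signal.
    return info_gain / split_info if split_info > 0 else 0.0


# Toy records (hypothetical values).
labels = ["normal", "normal", "attack", "attack"]
informative = ["http", "http", "ftp", "ftp"]
uninformative = ["a", "b", "a", "b"]
constant = ["tcp", "tcp", "tcp", "tcp"]

print(gain_ratio(informative, labels))
print(gain_ratio(uninformative, labels))
print(gain_ratio(constant, labels))
```

Features would then be ranked by this score and the highest-ranked ones kept.
Dividing by split information penalizes features with many distinct values,
which plain information gain tends to favor.
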
In another study, an effective IDS framework was developed on the basis of the
SVM with augmented features. The original features are transformed in order to
provide concise, high-quality training data for the Support Vector Machine
algorithm, and the Support Vector Machine classifier is trained on the newly
transformed data. The scheme was evaluated on the NSL-KDD dataset. The results
showed a detection rate of 99.85%, a low false alarm rate (FAR) of 2.69%, and
an accuracy of 99.18%.