Applying Two Feature Selection Methods (Chi-square and Symmetric Uncertainty) with the C4.5 Classification Algorithm on an Arabic Data Set
There is undoubtedly a huge amount of available data, and it is growing rapidly. Extracting useful information from this volume of data requires emerging tools and methods, and several extraction methods already exist. Text classification on Arabic data sets, however, still needs more investigation. This research applies the C4.5 algorithm with two feature selection methods, chi-square and symmetric uncertainty (SU), in experiments on an Arabic data set. The results show that C4.5 with chi-square performs slightly better than C4.5 with SU.
Keywords - Feature Selection, Chi-square, Symmetric Uncertainty, Arabic Text Classification, C4.5 Classification Method.
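
To make the two feature selection scores concrete, the following is a minimal sketch, not the implementation used in the experiments. It assumes each term is a discrete feature (for example, present or absent in a document) and scores it against the class labels using the standard definitions: chi-square as the sum of (observed - expected)^2 / expected over the term/class contingency table, and SU as 2 * IG(X; Y) / (H(X) + H(Y)). The function names and the toy data are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    counts = Counter(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def symmetric_uncertainty(x, y):
    """SU(X, Y) = 2 * IG(X; Y) / (H(X) + H(Y)), a value in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0  # both variables are constant, so there is nothing to share
    hxy = entropy(list(zip(x, y)))  # joint entropy H(X, Y)
    info_gain = hx + hy - hxy       # mutual information IG(X; Y)
    return 2.0 * info_gain / (hx + hy)


def chi_square(x, y):
    """Pearson chi-square statistic of the feature/class contingency table."""
    n = len(x)
    cx, cy, cxy = Counter(x), Counter(y), Counter(zip(x, y))
    score = 0.0
    for a in cx:
        for b in cy:
            expected = cx[a] * cy[b] / n   # count expected under independence
            observed = cxy.get((a, b), 0)
            score += (observed - expected) ** 2 / expected
    return score


# Hypothetical toy data: 1 means the term occurs in the document, 0 means it does not.
labels = ["sport", "sport", "economy", "economy"]
term_related = [1, 1, 0, 0]    # occurs only in "sport" documents
term_unrelated = [1, 0, 1, 0]  # occurs equally often in both classes

for name, term in [("related", term_related), ("unrelated", term_unrelated)]:
    print(name, chi_square(term, labels), symmetric_uncertainty(term, labels))
```

Under either measure, terms would be ranked by score and only the top-scoring ones passed on to the classifier. How many terms to keep is a separate choice that this sketch does not make.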