Optimizing and Verifying Model Effectiveness Using Random Forest and Logistic Regression

Nasteski, Vladimir and Manevska, Violeta and Savoska, Snezana (2017) Optimizing and Verifying Model Effectiveness Using Random Forest and Logistic Regression. In: ISGT 2017 Conference, 29 September 2017, Sofia.

Full text not available from this repository.

Abstract

Over the last decade, the volume, variety, velocity and veracity of information, along with the variety of devices that transmit it, have increased rapidly, producing a flood of data and large unstructured databases. Big Data refers to databases that vary by type, volume, velocity and variety. Standard techniques for data analysis and optimization fail to cope with these databases, so other methods and tools are needed. One such tool is Apache Spark, which has become one of the most popular tools for Big Data analytics and visualization. In this paper, MLlib, Spark's machine learning library, is used to create and analyze data models. Two models are created, using the Random Forest regression and Logistic Regression algorithms. Using these algorithms, the features that most strongly define the targets are presented and the models’ effectiveness is verified.

Item Type: Conference or Workshop Item (Paper)
Subjects: Scientific Fields (Frascati) > Engineering and Technology > Electrical engineering, electronic engineering, information engineering
Divisions: Faculty of Information and Communication Technologies
Depositing User: Prof. d-r. Snezana Savoska
Date Deposited: 18 Jun 2020 13:44
Last Modified: 25 Aug 2020 06:58
URI: https://eprints.uklo.edu.mk/id/eprint/5454
