Create a separate file for each issue you are solving, and always open an issue that details the process or method you will use to perform anomaly detection. Wait until the issue is assigned to you (this should take no more than 2-3 hours; we are passionate open-source developers).
A predictive model that divides customers into groups based on common characteristics, so companies can market to each group effectively and appropriately.
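A common way to implement this kind of segmentation is k-means clustering. The sketch below is illustrative only: the feature names (age, annual income, spending score), the synthetic data, and the choice of k=4 segments are assumptions, not details taken from the project.

```python
# Minimal customer-segmentation sketch via k-means (assumed approach).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Synthetic stand-in for real customer records.
customers = np.column_stack([
    rng.uniform(18, 70, 200),           # age
    rng.uniform(15_000, 140_000, 200),  # annual income
    rng.uniform(1, 100, 200),           # spending score
])

# Standardise features so no single scale dominates the distance metric.
X = StandardScaler().fit_transform(customers)

# Fit k-means with an assumed 4 segments; in practice k would be chosen
# with the elbow method or silhouette scores.
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)
print(kmeans.labels_[:10])  # segment assignments for the first 10 customers
```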
Exploratory data analysis was performed on a heart disease dataset, constructing various visualisations to gain useful insights about the data. Various machine learning algorithms were then trained to classify whether someone has heart disease based on various features; each algorithm was scored on its accuracy to decide which was best to use for future predictions.
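A minimal sketch of such a train-and-score loop follows. The file name `heart.csv`, the binary `target` column, and the particular models are assumptions for illustration; the project's actual algorithms and preprocessing are not specified here.

```python
# Assumed setup: heart.csv with a binary "target" label column.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier

df = pd.read_csv("heart.csv")
X, y = df.drop(columns="target"), df["target"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Illustrative model zoo; the project may have used different algorithms.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "KNN": KNeighborsClassifier(),
    "Random Forest": RandomForestClassifier(random_state=42),
}

# Fit each model and report held-out accuracy; the highest-scoring model
# would be kept for future predictions.
for name, model in models.items():
    model.fit(X_train, y_train)
    print(f"{name}: {model.score(X_test, y_test):.3f}")
```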
Classification of the IMDB Reviews and News Group datasets using Logistic Regression, Decision Trees, Support Vector Machines, AdaBoost, and Random Forest. The methods and accuracy of each model were compared and reported.
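A minimal sketch of one such comparison is below, assuming a TF-IDF bag-of-words representation. It uses scikit-learn's 20 Newsgroups loader as a stand-in for the News Group dataset and `LinearSVC` as the SVM; the project's actual preprocessing and hyperparameters are not specified here.

```python
# Assumed pipeline: TF-IDF features shared by all five classifiers.
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import LinearSVC
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier

train = fetch_20newsgroups(subset="train", remove=("headers", "footers", "quotes"))
test = fetch_20newsgroups(subset="test", remove=("headers", "footers", "quotes"))

# Turn raw text into TF-IDF features once, reused by every model.
vec = TfidfVectorizer(max_features=20_000, stop_words="english")
X_train, X_test = vec.fit_transform(train.data), vec.transform(test.data)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(),
    "Linear SVM": LinearSVC(),
    "AdaBoost": AdaBoostClassifier(),
    "Random Forest": RandomForestClassifier(n_jobs=-1),
}

# Fit each classifier and report test-set accuracy for comparison.
for name, model in models.items():
    model.fit(X_train, train.target)
    print(f"{name}: {model.score(X_test, test.target):.3f}")
```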
This repository contains the final report on a replication of the Caruana and Niculescu-Mizil paper comparing supervised learning algorithms. Here, I compare Logistic Regression, Random Forest, and Artificial Neural Networks over 4 different datasets, measured on 3 different metrics. This project was done for Cogs 118A. The datasets were taken from the UCI ML repository.
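A minimal sketch of scoring one of the models on three metrics at once is shown below. The metric choice (accuracy, F1, ROC AUC) and the use of scikit-learn's bundled breast cancer dataset as a UCI-style stand-in are assumptions; the report's actual datasets and metrics are not named here.

```python
# Assumed evaluation: 5-fold cross-validation over three metrics.
from sklearn.datasets import load_breast_cancer  # UCI-style stand-in dataset
from sklearn.model_selection import cross_validate
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)
scores = cross_validate(
    RandomForestClassifier(random_state=0), X, y,
    cv=5, scoring=("accuracy", "f1", "roc_auc"),
)

# Report the mean cross-validated score for each metric.
for metric in ("accuracy", "f1", "roc_auc"):
    print(f"{metric}: {scores[f'test_{metric}'].mean():.3f}")
```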