Implements an end-to-end machine learning pipeline that trains and evaluates a Random Forest classifier on labeled walking gait data. Data generated during the experiment yielded helpful insights into the problem domain.
Develop a scoring model for the probability of customer payment default, to support the decision of whether or not to grant a loan to a prospective customer.
Before training or feeding a model, the first priority is the data, not the model. The better the data is preprocessed and engineered, the more the model can learn. Feature selection is one of the methods for processing data before feeding it to the model. Various feature selection techniques are shown here.
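As a minimal sketch of one simple filter-style feature selection technique, the snippet below drops features whose variance does not exceed a threshold. The function name `drop_low_variance`, the toy rows, and the column names are assumptions made for illustration and do not come from the project itself.

```python
from statistics import pvariance


def drop_low_variance(rows, names, threshold=0.0):
    """Keep only the columns whose population variance exceeds `threshold`.

    `rows` is a list of equal-length numeric lists (one per sample) and
    `names` holds one label per column. Both are hypothetical inputs.
    """
    columns = list(zip(*rows))  # transpose: one tuple per feature
    keep = [i for i, col in enumerate(columns) if pvariance(col) > threshold]
    kept_names = [names[i] for i in keep]
    kept_rows = [[row[i] for i in keep] for row in rows]
    return kept_names, kept_rows


# Toy data: the first column is constant and carries no information.
rows = [
    [1.0, 5.0, 0.2],
    [1.0, 3.0, 0.9],
    [1.0, 8.0, 0.4],
    [1.0, 6.0, 0.7],
]
names = ["constant", "useful_a", "useful_b"]

kept_names, kept_rows = drop_low_variance(rows, names)
print(kept_names)
```

A variance filter only looks at each feature in isolation, so in practice it is usually a first pass before methods that take the target variable into account.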
We have retail transaction data covering two years. Apart from data analysis and visualization, a regression model is developed to predict the price of retail items belonging to different categories. Predicting retail prices can be a daunting task because the datasets are large and mix attribute types: text, numbers (floats and integers), and DateTime values. Outliers can also be a big problem when dealing with unit prices.
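One common way to handle unit-price outliers is the interquartile range (IQR) rule. The sketch below is an assumption about how such a filter could look, not the project's actual method. The price values, the helper names `iqr_bounds` and `remove_outliers`, and the multiplier `k=1.5` are illustrative.

```python
from statistics import quantiles


def iqr_bounds(values, k=1.5):
    """Return (low, high) fences at k * IQR outside the first and third quartiles."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def remove_outliers(values, k=1.5):
    """Keep only the values that fall inside the IQR fences."""
    low, high = iqr_bounds(values, k)
    return [v for v in values if low <= v <= high]


# Hypothetical unit prices with one extreme entry.
unit_prices = [1.25, 2.10, 2.55, 2.95, 3.75, 4.25, 4.95, 649.50]
cleaned = remove_outliers(unit_prices)
print(cleaned)
```

Whether to drop, cap, or keep such values depends on the data: an extreme unit price may be a data entry error or a genuinely expensive item.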
Explored data using data visualisation and exploratory data analysis. Used Logistic Regression to create a basic prediction model. Improved model using recursive feature elimination.
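The combination of Logistic Regression and recursive feature elimination can be sketched with scikit-learn as follows. The synthetic dataset, the choice of four features to keep, and all variable names are assumptions for illustration. The original project's data and settings are not known.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real dataset (assumption).
X, y = make_classification(
    n_samples=300,
    n_features=10,
    n_informative=4,
    n_redundant=2,
    random_state=0,
)

# Scale features so the logistic regression coefficients are comparable.
X_scaled = StandardScaler().fit_transform(X)

# RFE repeatedly fits the estimator and removes the weakest feature
# (smallest absolute coefficient) until the requested number remains.
selector = RFE(
    estimator=LogisticRegression(max_iter=1000),
    n_features_to_select=4,
    step=1,
)
selector.fit(X_scaled, y)

selected = [i for i, keep in enumerate(selector.support_) if keep]
X_reduced = selector.transform(X_scaled)
print("Selected feature indices:", selected)
```

The number of features to keep is a tuning choice. `RFECV` from the same module can pick it by cross-validation instead of fixing it up front.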
Hospitals maintain large databases. That data can be used to discover new, useful, and potentially life-saving knowledge. Here we use data mining to predict type 2 diabetes mellitus, estimating the percentage chance of its occurrence with low time complexity and high accuracy.