Focused crawls are collections of frequently-updated webcrawl data from narrow (as opposed to broad or wide) web crawls, often focused on a single domain or subdomain.
My ML-related material, including notebooks, code, and a curated list of useful resources such as books and software. Almost everything mentioned here is free (as in speech, not as in free food) or open source.
The objective of this project is to use the IMDB data set to generate meaningful and interesting insights, and then to create a movie rating model based on average IMDB ratings and a sentiment analysis score of user tweets. A second goal is to build an accurate machine learning model that predicts average movie ratings from a few key features.
📝 Text data analysis and machine learning on a supermarket's non-English social network comments, using limited language resources, with Java, Scala & Apache Spark.
A hello-world style notebook would be great to integrate into Snorkel, so you don't land in an empty Zeppelin.
As discussed with @a-pagano, it could also help people know where to put their data.