HackOn(Data) 2017 Workshop 2 - Detailed Analysis & Overview
Create an RDD and pair RDD - Counting words - Finding unique words and mean value - Reference to regular expressions ...
Review - Math review - NumPy and Spark - Lambda functions. Notebook usage - Intro to Spark and PySpark API - Using RDDs - Lambda functions - RDD actions, transformations, caching ...
Text Analysis - Text similarity of Entity Resolution - Weighted bag-of-words - Cosine similarity - Scalable Entity Resolution ...
Feature Hashing - One-Hot Encoding (OHE) - OHE Dictionary - Prediction and log loss evaluation - Feature reduction More info: ...
HackOn(Data) 2017 - Workshop 8th & Competition
Please complete the videos, exercise, and lab before the session on Thursday. Jul 14 - Virtual session - Intro ...