Cracking the Data Science Interview

A Collection of Cheatsheets, Books, Questions, and Portfolio For DS/ML Interview Prep

Here are the sections:

Data Science Cheatsheets

This section contains cheatsheets on basic data science concepts that commonly come up in interviews.

Data Science EBooks

This section contains books that I have read about data science and machine learning.

Data Science Question Bank

This section contains sample questions that were asked in actual data science interviews.

Data Science Case Studies

This section contains case study questions that concern designing machine learning systems to solve practical problems.

Data Science Portfolio

This section contains a portfolio of data science projects that I completed for academic, self-learning, and hobby purposes.

For a more visually pleasant browsing experience, check out jameskle.com/data-portfolio

  • Recommendation Systems

    • Transfer Rec: My ongoing research work at the intersection of deep learning and recommendation systems.

    • Movie Recommendation: Designed 4 different models that recommend items on the MovieLens dataset.

    Tools: PyTorch, TensorBoard, Keras, Pandas, NumPy, SciPy, Matplotlib, Seaborn, Scikit-Learn, Surprise, Wordcloud

  • Machine Learning

    • Trip Optimizer: Used XGBoost and evolutionary algorithms to optimize the travel time for taxi vehicles in New York City.

    • Instacart Market Basket Analysis: Tackled the Instacart Market Basket Analysis challenge to predict which products will be in a user's next order.

    Tools: Pandas, NumPy, Matplotlib, XGBoost, Geopy, Scikit-Learn

  • Computer Vision

    • Fashion Recommendation: Built a ResNet-based model that classifies and recommends fashion images in the DeepFashion database based on semantic similarity.

    • Fashion Classification: Developed 4 different Convolutional Neural Networks that classify images in the Fashion MNIST dataset.

    • Dog Breed Classification: Designed a Convolutional Neural Network that identifies dog breeds.

    • Road Segmentation: Implemented a Fully Convolutional Network for semantic segmentation on the KITTI Road dataset.

    Tools: TensorFlow, Keras, Pandas, NumPy, Matplotlib, Scikit-Learn, TensorBoard

  • Natural Language Processing

  • Data Analysis and Visualization

    • World Cup 2018 Team Analysis: Analysis and visualization of the FIFA 18 dataset to predict the best possible international squad lineups for 10 teams at the 2018 World Cup in Russia.

    • Spotify Artists Analysis: Analysis and visualization of musical styles from 50 different artists with a wide range of genres on Spotify.

    Tools: Pandas, NumPy, Matplotlib, Rspotify, httr, dplyr, tidyr, radarchart, ggplot2

Data Journalism Portfolio

This section contains a portfolio of data journalism articles that I completed for freelance clients and self-learning purposes.

For a more visually pleasant browsing experience, check out jameskle.com/data-journalism

Downloadable Cheatsheets

These PDF cheatsheets come from BecomingHuman.AI.

1 - Neural Network Basics

2 - Neural Network Graphs

3 - Machine Learning with Emojis

4 - Scikit-Learn With Python

5 - Python Basics

6 - NumPy Basics

7 - Pandas Basics

8 - Data Wrangling With Pandas

Data Wrangling With Pandas Part 1

Data Wrangling With Pandas Part 2

9 - SciPy Linear Algebra

10 - Matplotlib Basics

11 - Keras

12 - Big-O

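Key Metrics

Overview
Name and owner: khanhnamle1994/cracking-the-data-science-interview
Primary language: Jupyter Notebook
Languages: Jupyter Notebook (8 languages in total)
Owner Activity
Created: 2018-08-09 07:57:57
Pushed: 2024-08-31 11:22:32
Last commit: 2024-08-31 18:22:32
Releases: 0
User Engagement
Stars: 4.1k
Watchers: 77
Forks: 1.1k
Commits: 520
Issues: 2
Open issues: 1
Pull requests: 2
Open pull requests: 5
Closed pull requests: 2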