October 21

At this point we have been introduced to several basic concepts related to Pandas. These include

  • importing pandas
  • reading csv files
  • performing basic statistics with column data
  • grouping data
  • simple visualization
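
As a quick refresher, the concepts above can be sketched in a few lines. This is only an illustration: the data and the column names (city, year, temp_c) are made up, and the CSV is read from a string so the example runs without a file on disk.

```python
import io
import pandas as pd

# Hypothetical data standing in for a CSV file on disk.
csv_text = """city,year,temp_c
Austin,2020,21.0
Austin,2021,22.0
Boston,2020,11.0
Boston,2021,12.0
"""

# Reading CSV data (pass a file path instead of StringIO for a real file).
df = pd.read_csv(io.StringIO(csv_text))

# Basic statistics on a single column.
mean_temp = df["temp_c"].mean()
max_temp = df["temp_c"].max()

# Grouping: average temperature per city.
city_means = df.groupby("city")["temp_c"].mean()

print(mean_temp)
print(city_means)

# Simple visualization (requires matplotlib to be installed):
# city_means.plot(kind="bar")
```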

Now we will learn some more basic operations with Pandas so that we can perform more sophisticated exploratory data analysis.

Today, we will follow along with a presentation (on YouTube) and do some basic data exploration online. The platform that we will use is called Kaggle, and it is very similar to Jupyter Notebook.

Activity 1

  • Get a free account on Kaggle. According to Google:

“Kaggle is an online community and platform for data scientists and machine learning practitioners. It is known for its machine learning competitions but also provides a range of resources for learning and collaboration.”

  • Navigate to https://www.kaggle.com/
  • Follow the instructions to create an account and log in

Activity 2

** Take notes to upload on Moodle **

Activity 3: Pandas for data analysis

  • Follow along with the presentation “Learning Pandas for Data Analysis? Start Here”. Note: you may have trouble locating the data file used in the presentation. For now, just view the presentation and take notes on pandas capabilities. We will review this next class.

** Take notes to upload on Moodle **

Activity 4: Quick intro to correlation and regression

In preparation for a deeper dive into data analysis, please familiarize yourself with the basic ideas of correlation and regression by viewing:

** Take notes to upload on Moodle **
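
To make the two ideas concrete while taking notes, here is a minimal sketch of how correlation and a simple linear regression might be computed with pandas and NumPy. The data and the column names (hours, score) are invented for illustration. The Pearson correlation coefficient measures how strongly two columns move together (between -1 and 1), and the regression fits a straight line score = slope * hours + intercept by least squares.

```python
import numpy as np
import pandas as pd

# Hypothetical data: hours studied and quiz score for five students.
df = pd.DataFrame({
    "hours": [1, 2, 3, 4, 5],
    "score": [2, 4, 5, 4, 5],
})

# Pearson correlation between the two columns (pandas default method).
r = df["hours"].corr(df["score"])

# Least-squares straight-line fit; a degree-1 polyfit returns (slope, intercept).
slope, intercept = np.polyfit(df["hours"], df["score"], 1)

print(f"correlation r = {r:.3f}")
print(f"score = {slope:.2f} * hours + {intercept:.2f}")
```

A value of r close to 1 or -1 suggests a strong linear relationship, while a value near 0 suggests a weak one. The fitted slope says how much the score changes, on average, for each additional hour.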

Activity 5:

  • Upload your notes to Moodle under the class activity assignment for Oct 21 to get credit for today’s work