Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge branch 'main' of https://github.com/carmengg/eds-221-course-mat…
Showing 29 changed files with 5,698 additions and 265 deletions.
11 changes: 11 additions & 0 deletions
_freeze/lectures/lesson-1-python-review/execute-results/html.json
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
{ | ||
"hash": "b629e25d687f375e7a87308affe96248", | ||
"result": { | ||
"markdown": "# Python Review\n\n## About\nThis is a short review about some core concepts in Python exemplified by objects in the `numpy` library. \n\n## `numpy`\n\nNumPy is one of the core packages for numerical computing in Python. Many of the packages we will use in this course use NumPy's arrays as their building blocks. Additionally, numpy objects have been optimized for processing, so computations on them are really fast and use less memory than doing the equivalent using base Python. \n\nIn this lesson we will use `numpy` to review some core concepts in Python you're already familiar with. \n\nFirst, let's start by importing the library:\n\n::: {.cell execution_count=1}\n``` {.python .cell-code}\nimport numpy as np\n```\n:::\n\n\n## Variables\nWe can think of a **variable** as a name we assign to a particular object in Python. For example:\n\n::: {.cell execution_count=2}\n``` {.python .cell-code}\n# assign a small array to variable a\na = np.array([[1,1,2],[3,5,8]])\n```\n:::\n\n\nWhen we run the cell, we store the variable and its value. We can view a variable's value in two ways:\n\n1. running a cell with the variable name\n\n2. using the `print` function to print the value\n\n::: {.cell execution_count=3}\n``` {.python .cell-code}\n# show the value\na\n```\n\n::: {.cell-output .cell-output-display execution_count=22}\n```\narray([[1, 1, 2],\n [3, 5, 8]])\n```\n:::\n:::\n\n\n::: {.cell execution_count=4}\n``` {.python .cell-code}\n# print the value \nprint(a)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[[1 1 2]\n [3 5 8]]\n```\n:::\n:::\n\n\n:::{.callout-note}\n## R and Python\n\nIn Python we use the equal sign `=` to assign values to variables in the same way the left-arrow `<-` is used in R.\n:::\n\n::: {.callout-caution}\n## Naming Variables\n\nThere are many ways of constructing multi-word variable names. 
In this course we will name variables using **snake_case**, where words are all in lowercase and separated by underscores (ex: `my_variable`). This is the naming convention suggested by the [Style Guide for Python Code](https://peps.python.org/pep-0008/).\n:::\n\n## Variables and Objects\nYou will often encounter the word **object** in Python documentation and tutorials. Informally speaking, an object is a bundle of *properties* and *actions* about something specific. For example, an object could represent a data frame with properties such as number of rows, names of columns, and date created, and actions such as selecting a specific row or adding a new column. \n\nA variable is the name we give a specific object, and the same object can be referenced by different variables. An analogy for this is the following: the Sun (object) is called \"sol\" in Spanish and \"soleil\" in French, so two different names (variables) represent the same object. You can read more technical details about the [difference between objects and variables in Python here](https://realpython.com/python-variables/#object-references).\n\nIn practice, we can often use the words variable and object interchangeably. I want to bring up what objects are so you're not caught off-guard with vocabulary you'll often encounter in the documentation, StackExchange, etc. We'll often use the word object too (for example, in the next subsection!).\n\n## Types\n\n Every object in Python has a **type**; the type tells us what kind of object it is. We can also call the type of an object the **class** of an object (so class and type both mean what kind of object we have). 
\n \n We can see the type/class of a variable/object by using the `type` function:\n\n::: {.cell execution_count=5}\n``` {.python .cell-code}\nprint(a)\ntype(a)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[[1 1 2]\n [3 5 8]]\n```\n:::\n\n::: {.cell-output .cell-output-display execution_count=24}\n```\nnumpy.ndarray\n```\n:::\n:::\n\n\nThe `numpy.ndarray` is the core object/data type in the NumPy package. We can check the type of an entry in the array by indexing:\n\n::: {.cell execution_count=6}\n``` {.python .cell-code}\nprint(a[0,0])\ntype(a[0,0])\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n1\n```\n:::\n\n::: {.cell-output .cell-output-display execution_count=25}\n```\nnumpy.int64\n```\n:::\n:::\n\n\n::: {.callout-tip}\n## Check-in\nHow would you access the value 5 in the array `a`? **Remember indexing in Python starts from 0!**\n:::\n\nNotice the type of the value 1 in the array is `numpy.int64` and not just the core Python integer type `int`. The NumPy type `numpy.int64` is telling us 1 is an integer stored as a 64-bit number. NumPy has its own data types to deal with numbers depending on memory storage and floating point precision, [click here to see all the types](https://numpy.org/doc/stable/reference/arrays.scalars.html#sized-aliases). \n\nSince \"everything in Python is an object\" and every object has a class, we will interact with SO MANY classes in this course. Often, knowing the type of an object is the first step to finding information to code what you want!\n\n## Functions\n\n`print` was our first example of a Python **function**. Functions take in a set of **arguments**, separated by commas, and use those arguments to create some **output**. There are several built-in functions in Python; most of them are for interacting with the basic Python data types. 
You can see a [list of them here](https://realpython.com/python-data-types/#built-in-functions).\n\n::: {.callout-caution}\n## Argument or Parameter?\nWe can interchangeably say arguments or parameters. You will see argument more often in the documentation.\n:::\n\nWe can ask for information about a function by executing `?` followed by the function name:\n\n::: {.cell execution_count=7}\n``` {.python .cell-code}\n?print\n```\n:::\n\n\n![](/images/lesson-1/print_docstring.png)\n\nThe first line is always the function showing all of its arguments in parentheses. \nThen there is a short description of what the function does.\nAnd finally a list of the arguments and a brief explanation about each of them.\n\nYou can see there are different types of arguments inside the parentheses. Roughly speaking, a function has two types of arguments:\n\n- **non-optional arguments**: arguments *you* need to specify for the function to do something, and\n\n- **optional arguments**: arguments that are pre-filled with a default value by the function, but you can override them. Optional arguments appear inside the parentheses () in the form `optional_argument = default_value`. \n\n**Example:**\n\n`end` is an argument in `print` whose default value is a new line. We can change this argument so that `print` finishes the line with ` ^_^` instead:\n\n::: {.cell execution_count=8}\n``` {.python .cell-code}\n# notice we had always used print without specifying any value for the `end` argument\nprint('I am changing the default end argument of the print function', end=' ^_^')\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nI am changing the default end argument of the print function ^_^\n```\n:::\n:::\n\n\n## Attributes & Methods\n\nAn object in Python has attributes and methods. An **attribute** is a property of the object, some piece of information about it. A **method** is a procedure associated with an object, so it is an action where the main ingredient is the object. 
\n\nFor example, these could be some attributes and methods of a class `cat`:\n\n![. ](/images/lesson-1/cat_class.png){width=45%}\n\n\nMore formally, **a method is a function** that acts on the object it is part of.\n\nWe can access a variable's attributes and methods by adding a period `.` at the end of the variable's name. So we would write `variable.variable_method()` or `variable.variable_attribute`. \n\n:::{.callout-tip}\n## Check-in\nSuppose we have a class `fish`. Make a diagram similar to the `cat` class diagram showing 3 attributes for the class and 3 methods.\n:::\n\n**Example**\n\nNumPy arrays have many methods and attributes. Let's see some concrete examples.\n\n::: {.cell execution_count=9}\n``` {.python .cell-code}\n# define a 3x3 array\nvar = np.array([[1,2,3],[4,5,6],[7,8,9]])\nvar\n```\n\n::: {.cell-output .cell-output-display execution_count=28}\n```\narray([[1, 2, 3],\n [4, 5, 6],\n [7, 8, 9]])\n```\n:::\n:::\n\n\n::: {.cell execution_count=10}\n``` {.python .cell-code}\n# T is an example of an attribute, it returns the transpose of var\nprint(var.T)\nprint(type(var.T))\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[[1 4 7]\n [2 5 8]\n [3 6 9]]\n<class 'numpy.ndarray'>\n```\n:::\n:::\n\n\n::: {.cell execution_count=11}\n``` {.python .cell-code}\n# shape, another attribute, tells us the shape of the array (3x3)\nprint(var.shape)\nprint(type(var.shape))\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n(3, 3)\n<class 'tuple'>\n```\n:::\n:::\n\n\n::: {.cell execution_count=12}\n``` {.python .cell-code}\n# ndim is an attribute holding the number of array dimensions\nprint(var.ndim)\nprint(type(var.ndim))\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n2\n<class 'int'>\n```\n:::\n:::\n\n\nNotice these attributes can have many different data types. Here we saw tuples and int, two of the core Python classes, and also a numpy array as attributes of `var`. 
\n\nNow some examples of methods:\n\n::: {.cell execution_count=13}\n``` {.python .cell-code}\n# the tolist method returns the array as a nested list of scalars\nvar.tolist()\n```\n\n::: {.cell-output .cell-output-display execution_count=32}\n```\n[[1, 2, 3], [4, 5, 6], [7, 8, 9]]\n```\n:::\n:::\n\n\n::: {.cell execution_count=14}\n``` {.python .cell-code}\n# the min method returns the minimum value in the array along an axis\nvar.min(axis=0)\n```\n\n::: {.cell-output .cell-output-display execution_count=33}\n```\narray([1, 2, 3])\n```\n:::\n:::\n\n\n::: {.callout-tip}\n## Check-in\n\nWe can also call the `min` method without any parameters:\n\n::: {.cell execution_count=15}\n``` {.python .cell-code}\nvar.min()\n```\n\n::: {.cell-output .cell-output-display execution_count=34}\n```\n1\n```\n:::\n:::\n\n\nWhat kind of parameter is `axis` in our previous call of the `min` method?\n:::\n\nRemember, methods are functions associated with an object. We can check this!\n\n::: {.cell execution_count=16}\n``` {.python .cell-code}\ntype(var.tolist)\n```\n\n::: {.cell-output .cell-output-display execution_count=35}\n```\nbuiltin_function_or_method\n```\n:::\n:::\n\n\n::: {.cell execution_count=17}\n``` {.python .cell-code}\ntype(var.min)\n```\n\n::: {.cell-output .cell-output-display execution_count=36}\n```\nbuiltin_function_or_method\n```\n:::\n:::\n\n\nYou can see a complete list of [NumPy array's methods and attributes in the documentation](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.html).\n\n:::{.callout-note}\n## R and Python\n\nIn R we don't use methods within an object. Rather, functions are extrinsic to (outside) the objects they are acting on. 
In R, for example, there would be two separate items: the variable `var` and a separate function `min` that gets `var` as a parameter:\n\n``` R\n# this is R code\nvar <- array(c(1,4,7,2,5,8,3,6,9), dim =c(3,3))\nmin(var)\n```\n\nUsing the pipe operator `%>%` in R's tidyverse is closer to the dot `.` in Python:\n\n``` R\n# this is R code\nvar <- array(c(1,4,7,2,5,8,3,6,9), dim =c(3,3))\nvar %>% min()\n```\n\nWhat happens here is that the pipe `%>%` is passing `var` to the `min()` function as its first argument. This is essentially what happens in Python when a function is a method of a class:\n\n``` python\n# this is Python code\nvar = np.array([[1,2,3],[4,5,6],[7,8,9]])\nvar.min()\n```\n\nWhen working in Python, remember that *methods are functions that are part of an object* and a method uses the object it is part of to produce some information.\n:::\n\n<!--\n## Exercises\n\n::: {.callout-tip}\n## Exercise 2\nConsider the following code:\n\n```python\nimport numpy as np\n\n?np.ones\n```\n![](/images/lesson-1/np-ones-docstring.png)\n\n::: {.cell execution_count=18}\n``` {.python .cell-code}\nabc = np.ones([3,2], dtype=np.int8)\nprint(abc)\n\nx = abc.mean()\nprint(x)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[[1 1]\n [1 1]\n [1 1]]\n1.0\n```\n:::\n:::\n\n\nComplete the following paragraph using the given words:\n\n| . | . | . | . | . |\n|---|---|---|---|---|\n| class | function | object | method | |\n| variable | non-default | data-type | default | |\n| package | attribute | output | parameter(s) | optional |\n\n`abc` is a ________ assigned to the NumPy ________ `np.ones([3,2])`. We construct `np.ones([3,2])` by calling a ________ from the NumPy ________. `[3,2]` and `np.int8` are ________ we pass to the `np.ones` ________. `np.int8` is a ________ parameter of `np.ones`. `abc.mean` is an ________ of `abc` and `x` is its ________. \n\n\n:::\n\n::: {.callout-tip}\n## Exercise 3\n1. Read the `print` function help. What is the type of the argument `sep`? 
Is this a default or non-default argument? Why?\n\n2. Create two new variables, one with the integer value 77 and another one with the string 99.\n\n3. Use your variables to print `77%99%77` by changing the value of one of the default arguments in `print`.\n:::\n\nTO DO: add an exercise about coding numpy \n\n::: {.cell execution_count=19}\n``` {.python .cell-code}\nvar = np.array([2,3,5,7,11,13]).reshape([2,3])\nvar\n```\n\n::: {.cell-output .cell-output-display execution_count=38}\n```\narray([[ 2, 3, 5],\n [ 7, 11, 13]])\n```\n:::\n:::\n\n\n-->\n\n", | ||
"supporting": [ | ||
"lesson-1-python-review_files" | ||
], | ||
"filters": [], | ||
"includes": {} | ||
} | ||
} |
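The lesson text frozen in the `markdown` field above states that one object can be referenced by several variables, and that attributes are read while methods are called. The following is a minimal sketch of those two points, assuming NumPy is installed. The names `first_name`, `second_name`, and the result variables are hypothetical and do not appear in the lesson.

```python
import numpy as np

# Two variable names bound to the same array object (illustrative names).
first_name = np.array([[1, 1, 2], [3, 5, 8]])
second_name = first_name

# `is` checks object identity: both names refer to one object.
same_object = first_name is second_name

# A change made through one name is visible through the other.
second_name[0, 0] = 100
value_seen_by_first = int(first_name[0, 0])

# Attributes are read without parentheses.
array_shape = first_name.shape

# Methods are called with parentheses; `axis` is an optional argument.
column_minimums = first_name.min(axis=0).tolist()
overall_minimum = int(first_name.min())

print(same_object, value_seen_by_first)
print(array_shape, column_minimums, overall_minimum)
```

Because `second_name = first_name` only adds a second name for the same array, the assignment to `second_name[0, 0]` also changes what `first_name` shows. A separate object would require an explicit copy such as `first_name.copy()`.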
15 changes: 15 additions & 0 deletions
_freeze/lectures/lesson-2-pandas-basics/execute-results/html.json
Large diffs are not rendered by default.