Joshua's Docs - Python Notes & WIP Cheatsheet

Resources

| What & Link | Type |
| --- | --- |
| Programming with Mosh: Python Cheatsheet | Cheat Sheet |
| pythoncheatsheet.org | Cheat Sheet |
| Official Docs | Official Docs |
| Drew Learns Python | Excellent guide and cheatsheet, especially for JS devs. |
| Florian GOTO - "Python Essentials for Node.js Developers" (Medium version, currently down) | Intro Guide |
| PythonTips.com book | Cheat Sheet / Quick Reference |
| RealPython.com | Tutorial / Guide Site |
| Joshua's Docs: Python Types | My notes on Python Types |
| Using VSCode with Python | My guide on using VS Code for Python Development |
| Tushar Sadhwani's blog | Lots of detailed posts about Python |

The Module, Package, and Import System

https://docs.python.org/3/reference/import.html
https://docs.python.org/3/tutorial/modules.html
http://python-notes.curiousefficiency.org/en/latest/python_concepts/import_traps.html

Using Packages

The default package manager for Python is called "pip", and lets you install packages from the Python Package Index (PyPI).

For those with NodeJS experience, pip is similar to NPM or Yarn, and PyPI is similar to the NPM Package Repository.

Installing packages with pip is fairly straightforward, and the commands are similar to other package managers:

| Action | Command |
| --- | --- |
| Install from PyPI | pip install "PackageName". You can also use a version specifier (similar to a semver range): pip install "PackageName~=1.2.6" |
| Update package to latest | pip install --upgrade PackageName |
| Install from Requirements File | pip install -r requirements.txt |
| Install from a setup.py file | pip install . (or pip install -e . for an editable install) |
| Install from VCS / GitHub | pip install -e {VCS_Syntax_String} (*) |
| Install from local tarball | pip install {localTarballPath}.tgz |
| Check installed version | pip show {PackageName}. Use --verbose for extra info, like install path |

For the full list of commands, please see the pip reference guide.

🚨 Unlike NodeJS package managers, pip installs packages globally by default. You can pass --user to scope the install to the current user instead, but the best approach is to use virtual environments.

Requirements Files

If you are sharing your python script, or even just using it in different environments, you probably don't want to require that its packages (dependencies) be installed one-by-one, manually with pip. A Requirements File is a way to declare all the packages required by your project, down to the specific versions, and which pip can use to provide a repeatable installation experience.

To generate a requirements file, you can either hand-code it, or use the pip freeze command:

python -m pip freeze > requirements.txt

And, then to install from it:

python -m pip install -r requirements.txt

requirements.txt vs setup.py? The usual convention is that requirements.txt is for other developers working on the project / package, whereas setup.py is for consumers of the package.

NodeJS developers: Yes, this is similar to dependencies section of package.json, although pip's package resolution algorithms are not identical to NPM or Yarn.

Working on Nested Packages Locally

Install the package(s) with an editable install:

python3 -m pip install -e {PATH}

Virtual Environments

The default behavior of pip, installing packages globally, becomes a problem once you move beyond a single project; you can easily end up with version conflicts (project A requires lib v1.0, but project B requires lib v2.0) and other issues.

Luckily, Virtual Environments are a way in Python to have packages scoped to individual project directories, instead of the OS.

In general, if you are using Python 3.3 or newer, it is recommended to use the built-in venv module. Otherwise, the virtualenv tool (as a separate install) is the standard approach. This is a good guide that covers both. And this is a good guide for just venv. You can also find summarized step-by-step instructions for venv below.

Venv Instructions

For convenience, here are the minimal setup steps for venv (assuming you already have Python installed):

  1. Navigate to directory you want mapped / controlled by venv
  2. Create the virtual environment. Convention is to use either env/ or venv/ as the directory.
    • Unix / MacOS: python3 -m venv venv
    • Win: py -m venv venv
    • The folder name you use here usually (depends on shell) shows up in the CLI after activation. E.g. (venv) joshua@desktop-pc >

  3. Activate the virtual environment
    • Unix / MacOS: source venv/bin/activate
    • Win: .\venv\Scripts\activate
  4. Double check that Python is being loaded via the Virtual Environment
    • Unix / MacOS: which python
    • Win: where python
  5. (Optional) Install dependencies, based on requirements.txt
    • python3 -m pip install -r requirements.txt

⚠ REMINDER: Don't forget to activate the environment (step 3)!

💡 Don't forget to exclude the virtual environment directory from Git / Version Control.

When you need to leave the virtual environment, run deactivate. Note that this is not destroying or removing the virtual environment directory - you are just leaving the shell shim / environment. When you want to re-enter the environment, just run the activate script again (step 3 of the above section).
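As a complement to the which python / where python check in step 4, here is a minimal sketch of checking from inside Python whether the interpreter is running in a venv. It relies on sys.prefix differing from sys.base_prefix inside a venv-created environment; the in_virtualenv helper name is made up for illustration.

```python
import sys


def in_virtualenv() -> bool:
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix points at the base Python installation.
    return sys.prefix != sys.base_prefix


print(f"Interpreter: {sys.executable}")
print(f"Running inside a virtual environment: {in_virtualenv()}")
```

If this prints False after you thought you had activated the environment, the activate step probably did not take effect in the current shell.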

Virtual Environments - Troubleshooting

Venv - Wrong Python Path

I recently found that a venv environment that I had been using for days without issue, suddenly stopped working entirely. Although it would "activate", nothing seemed to function correctly, and calls to which python returned the wrong interpreter path!

The culprit ended up being exactly what this StackOverflow answer covers - I had renamed the parent folder. If you rename the parent directory of a venv, you might need to recreate the environment from scratch!

Venv - WSL Syntax Error

When trying to activate venv on WSL, you might encounter an error like this one, about syntax errors and unexpected tokens:

source ./venv/scripts/activate
	: command not found
	-bash: ./venv/scripts/activate: line 4: syntax error near unexpected token `$'{\r''
	'bash: ./venv/scripts/activate: line 4: `deactivate () {

This is because of a line-ending issue; Python creates activate on Windows with CRLF (\r\n) line endings, when the bash script needs plain LF (\n). To fix, you can patch the file with dos2unix, with tr, or manually in an editor (just do a search and replace for \r\n with \n). I don't believe this can be done in one pass, but I might be wrong (although it would probably require some real shell skills to get as a one-liner).
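If you would rather patch the file from Python than from the shell, something like the following sketch should do the same CRLF to LF replacement. The function names and the example path are hypothetical, and it rewrites the file in place, so treat it as a starting point rather than a vetted tool.

```python
from pathlib import Path


def crlf_to_lf(data: bytes) -> bytes:
    # Replace Windows line endings (\r\n) with Unix line endings (\n)
    return data.replace(b"\r\n", b"\n")


def fix_line_endings(script_path: str) -> None:
    # Read and write as bytes so Python does not translate newlines itself
    path = Path(script_path)
    path.write_bytes(crlf_to_lf(path.read_bytes()))


# Example (hypothetical path): fix_line_endings("venv/Scripts/activate")
print(crlf_to_lf(b"deactivate () {\r\n"))
```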

Paths

Python exposes a magic global, __file__, for each file that holds the filepath of that given file (this is analogous to __filename in NodeJS).

If you want the directory of the file, you can use a few different approaches:

import os
from pathlib import Path
file_dir = os.path.dirname(os.path.realpath(__file__))
file_dir = Path(__file__).resolve().parent
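A common follow-up is building a path to a file that sits next to the current script. Here is a small sketch using pathlib; the sibling_path helper and the file names are made up for illustration, and in a real script you would pass __file__ as the first argument.

```python
from pathlib import Path


def sibling_path(current_file: str, filename: str) -> Path:
    # Resolve the directory that contains current_file, then append filename
    return Path(current_file).resolve().parent / filename


# Typical usage inside a script: sibling_path(__file__, "config.json")
config_path = sibling_path("/tmp/project/main.py", "config.json")
print(config_path.name)  # config.json
```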

Strings

  • Strings can be enclosed in either single or double quotes
    • Triple quotes should be used for multi-line strings (aka heredocs)
  • The modern approach to string interpolation is with "f-strings".
    • E.g. f"My var = {my_var}".
    • These work with single quotes, double quotes, and even triple quotes
  • To join strings, you can use:
    • Plus operator ("alpha" + "bet")
    • Placing strings side-by-side ("alpha""bet")
    • The .join method
    • And more.
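The string options listed above can be sketched in a few lines. The variable names here are just placeholders for illustration.

```python
my_var = 42

# f-strings work with single, double, and triple quotes
single = f'My var = {my_var}'
triple = f"""My var:
{my_var}"""

# Three ways to join strings
plus_joined = "alpha" + "bet"
adjacent_joined = "alpha" "bet"  # adjacent literals are merged by the parser
method_joined = "-".join(["a", "b", "c"])

print(single)           # My var = 42
print(plus_joined)      # alphabet
print(adjacent_joined)  # alphabet
print(method_joined)    # a-b-c
```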

Classes

Syntax:

# Just used for example purposes
import datetime
from enum import Enum
from typing import List, Optional

class Media:
	# "class variables", which act (somewhat) like static properties
	# Shared among all class instances (*), so generally used for constants, etc.
	deprecated_physical_mediums = ["VHS", "DVD"]

	# You can also declare class properties without assigning a value
	# This is useful for strongly typing properties before assignment in `__init__` or elsewhere
	co_authors: Optional[List[str]]

	# Pretty much anything can be a class variable, including nested classes!
	class MediaType(Enum):
		Book = 1
		Movie = 2
		TV = 3
		Music = 4

	# Initializer (commonly called the constructor; runs after the instance is created)
	def __init__(self, title: str, author: str, published: datetime.date):
		# You can declare class properties belonging to the instance here
		self.title = title
		self.author = author
		self.published = published

	# Some methods. Note how `self` is *always* the first parameter
	def republish(self):
		self.published = datetime.date.today()

	def print_relative_age(self, date: datetime.date):
		delta = date - self.published
		print(f"{self.title} was published {delta.days} days from {date}")

	# For "getters", use `@property` decorator
	@property
	def age(self):
		return datetime.date.today() - self.published


# You can sub-class, mix-in, etc.
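The "shared among all class instances" note on class variables is worth a closer look, because it is easy to trip over with mutable values. Below is a small sketch with a made-up Playlist class (not part of the Media example) showing the difference between a class variable and an instance variable.

```python
class Playlist:
    # Class variable: a single list object shared by every instance
    shared_tags = []

    def __init__(self, name: str):
        # Instance variables: each instance gets its own values
        self.name = name
        self.tracks = []


a = Playlist("Road Trip")
b = Playlist("Focus")

a.shared_tags.append("favorite")  # mutates the one shared list
a.tracks.append("Song 1")         # only affects `a`

print(b.shared_tags)  # ['favorite']
print(b.tracks)       # []

# Assigning through an instance creates a new instance attribute instead
a.shared_tags = ["private"]
print(Playlist.shared_tags)  # ['favorite']
```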

Classes - Init

Is defining an __init__() method always necessary? What about on subclasses?

The answer is no - neither regular classes nor subclasses require an explicit __init__() method to be defined. Only declare the method if you actually want to change the behavior.

For subclasses, be aware that if you declare an __init__() method, you will need to manually call the parent init method if you want it to run; it will not be called automatically as you are overriding the method. You can use a pattern like this:

class Subclass(Superclass):
	def __init__(self):
		super(Subclass, self).__init__()
		print('Do something else')

Strictly speaking, __init__() is not a constructor; it is an initializer. A closer parallel to a constructor in Python would be __new__(), which creates the instance, and __init__() is called after it runs.
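To make the call order concrete, here is a minimal sketch that records each call in a list. The calls list and the class bodies are just for demonstration; it also uses the shorter Python 3 form super().__init__(), which behaves the same as super(Subclass, self).__init__() here.

```python
calls = []


class Superclass:
    def __new__(cls, *args, **kwargs):
        # __new__ creates and returns the instance
        calls.append("Superclass.__new__")
        return super().__new__(cls)

    def __init__(self):
        calls.append("Superclass.__init__")


class Subclass(Superclass):
    def __init__(self):
        # Without this call, Superclass.__init__ would never run
        super().__init__()
        calls.append("Subclass.__init__")


Subclass()
print(calls)
# ['Superclass.__new__', 'Superclass.__init__', 'Subclass.__init__']
```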

Dataclasses

The dataclasses module is available with Python v3.7+

Dataclasses are great for instances when you just want to wrap up a bunch of data in a nice clean, strongly-typed, class. Dataclasses cut down on boilerplate with data-holding classes as they automatically generate the __init__ method for you.

For those familiar, it can feel a lot like TypeScript interfaces + JS objects.

from dataclasses import dataclass
from typing import Optional

@dataclass
class Beverage:
	name: str
	size_in_oz: float
	iced: bool
	# You can define default values
	price: float = 3
	coupon_code: Optional[str] = None

my_beverage = Beverage('Large Iced Coffee', 32.0, True)
# kwargs work too, just like normal
my_beverage = Beverage(name='Large Iced Coffee', size_in_oz=32.0, iced=True)

There is much more to dataclasses than the above example, but often the above is all you need.
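As a taste of the "much more", here is a short sketch of two features I find myself reaching for: field(default_factory=...) for mutable defaults, and asdict() for converting back to a plain dictionary. The Order class is invented for this example.

```python
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Order:
    customer: str
    # Mutable defaults must go through default_factory
    items: List[str] = field(default_factory=list)
    tip: float = 0.0


order_a = Order("Fred")
order_b = Order("Betty", items=["Americano"])
order_a.items.append("Iced Mocha")

# __repr__ and __eq__ are generated along with __init__
print(order_a)  # Order(customer='Fred', items=['Iced Mocha'], tip=0.0)
print(order_a == Order("Fred", ["Iced Mocha"]))  # True
print(asdict(order_b))  # {'customer': 'Betty', 'items': ['Americano'], 'tip': 0.0}
```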

TypedDict

A TypedDict is really just a type-annotation layer over a standard dictionary.

The common way to use them is to extend the TypedDict class as a new class declaration:

from typing import TypedDict

class Beverage(TypedDict, total=True):
	name: str
	size_in_oz: float
	iced: bool

beverage: Beverage = {
	'name': 'Large Iced Coffee',
	'size_in_oz': 32.0,
	'iced': True
}

total is True by default, and is only included for illustrative purposes.

There is also an alternative syntax, which calls TypedDict:

from typing import TypedDict

Beverage = TypedDict('Beverage', {'name': str, 'size_in_oz': float, 'iced': bool}, total=True)

beverage: Beverage = {
	'name': 'Large Iced Coffee',
	'size_in_oz': 32.0,
	'iced': True
}

Note: With the alternative syntax for TypedDict declarations (type alias approach), the first argument must exactly match the variable name you are assigning to

Currently, inline / anonymous TypedDicts are not supported, although there is discussion around supporting them

With either declaration approach, you can instantiate / create / declare an instance of the typeddict through regular dictionary declaration (curly braces), or by calling the class constructor directly, with kwargs:

beverage: Beverage = {
	'name': 'Large Iced Coffee',
	'size_in_oz': 32.0,
	'iced': True
}

# or

beverage = Beverage(
	name='Large Iced Coffee',
	size_in_oz=32.0,
	iced=True
)
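One thing that is easy to forget: because a TypedDict is only an annotation layer, nothing is validated at runtime. The sketch below (assuming Python 3.8+, where TypedDict lives in typing) shows that the result is a plain dict and that a wrongly shaped value is only caught by a static checker.

```python
from typing import TypedDict


class Beverage(TypedDict):
    name: str
    size_in_oz: float
    iced: bool


beverage = Beverage(name="Large Iced Coffee", size_in_oz=32.0, iced=True)

# At runtime, the result is a plain dict
print(type(beverage) is dict)  # True
print(beverage["name"])        # Large Iced Coffee

# Types are NOT enforced at runtime; only a static checker would flag this
not_really_a_beverage: Beverage = {"name": 123}  # type: ignore
print(not_really_a_beverage)   # {'name': 123}
```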

SimpleNamespace

SimpleNamespace (a simple object subclass) is kind of an odd hybrid between a dataclass and a standard dictionary, but with weaker static typing than TypedDict. It lets you define an object instance using a single function call with keyword arguments, instead of having to define your own class.

This is kind of similar to PHP's (object) array() shorthand to convert an associative array into an object instance

Here is a basic example of a SimpleNamespace instantiation and usage

from types import SimpleNamespace

my_beverage = SimpleNamespace(
	name="Large Iced Coffee",
	size_in_oz=32.0,
	iced=True
)
# Notice that we can use dot notation here to look up values
print(my_beverage.name)

Dataclass vs TypedDict vs SimpleNamespace

  • Serialization / json.dumps()
    • TypedDict: Serialization works out of the box (since you are still using dictionaries)
    • SimpleNamespace: Serialization pretty much works out of the box (this is basically just a dict wrapper) - just use my_simplenamespace_obj.__dict__ instead of the object directly.
      • E.g., json.dumps(my_simplenamespace_obj.__dict__)
    • dataclass: Must use dataclasses.asdict(my_dataclass) first, and there are some limitations there
  • Accessor Syntax
    • TypedDict still requires that you use bracket notation to look up values, as this is really just a type annotation for a regular dictionary
      • E.g. my_typeddict["my_key"]
    • Both SimpleNamespace and dataclass let you use dot notation syntax to look up values, as they are classes
      • E.g. my_obj.my_key
  • Strong static-typing
    • TypedDict: As its name implies, very strong static-typing support :)
    • SimpleNamespace: Despite the import coming from types, this has very little static-typing support. Expect no inference with current-generation tooling (mypy, pyright, etc.)
    • dataclass: Since dataclass declaration requires you to use type annotations, it has very strong static-typing support.

Arrays and Lists

Iteration, Enumeration, and List Comprehension

Looping through items is, generally, one of the more syntactically unique parts of Python.

List Comprehension

List comprehension is a concise (and considered pythonic) way to build new lists from existing data.

Generally, it refers to any use of the [expression for element in iterable] syntax. For example:

users = [{"user_id": i, "name": f"user_{i}"} for i in range(1,11)]
print(users[1])
# {'user_id': 2, 'name': 'user_2'}

List comprehension can get much more complex than the above example. You can nest list comprehension, combine it with filtering, and much more.
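For instance, here is a quick sketch of nesting, combining a transform with a filter, and the closely related dict comprehension syntax. The sample data is arbitrary.

```python
# Nested comprehension: flatten a list of rows
grid = [[1, 2, 3], [4, 5, 6]]
flat = [cell for row in grid for cell in row]
print(flat)  # [1, 2, 3, 4, 5, 6]

# Transform and filter in one pass
even_squares = [n * n for n in flat if n % 2 == 0]
print(even_squares)  # [4, 16, 36]

# The same syntax also builds dicts (and sets)
name_lengths = {name: len(name) for name in ["Fred", "Betty"]}
print(name_lengths)  # {'Fred': 4, 'Betty': 5}
```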

Filtering Lists in Python

List comprehension is not just for iterating; you can filter with it too:

# Using list comprehension
[my_var for my_var in my_list if my_predicate]

# Example:
mixed = ['a', None, 2, 5, 'c', None]
strings = [e for e in mixed if isinstance(e, str)]

Another option is filter(), although it should be noted that this returns an iterator, not a list:

mixed = ['a', None, 2, 5, 'c', None]
strings_iter = filter(lambda e: isinstance(e, str), mixed)

filter() also tends to have worse static typing support than plain list comprehension, at least at the moment

Fast Checking For First Match of Condition

The any() function is the best (based on speed and conciseness) approach:

# any()
has_truthy = any(my_iter)

# if needing to compute truthy on the fly
admin_assigned = any(map(lambda e: e.is_admin == True, assigned_users))

# Above use of `any` is more readable than the `next()` approach...
# (note: without a default, `next()` raises StopIteration if nothing matches)
first_item = next(x for x in seq if predicate(x))

# But both are much shorter and faster than...
len(list(filter(predicate, seq))) > 0

any() is basically equivalent to lodash _.some()

If you need to actually retrieve / store the first item that matches the predicate, then use next() (and you'll often want to use a default, such as None, with it).
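A runnable sketch of both approaches, using a made-up list of user dictionaries: any() for a yes / no answer, and next() with a None default for retrieving the first match.

```python
users = [
    {"name": "fred", "is_admin": False},
    {"name": "betty", "is_admin": True},
]

# any() with a generator expression stops at the first truthy result
has_admin = any(user["is_admin"] for user in users)
print(has_admin)  # True

# next() with a default returns the first match, or None if there is none
first_admin = next((u for u in users if u["is_admin"]), None)
first_guest = next((u for u in users if u.get("is_guest")), None)
print(first_admin["name"])  # betty
print(first_guest)          # None
```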

Array / List Oddness in Python

  • For joining array elements, instead of my_arr.join(separator), Python uses separator.join(my_arr)
  • For filtering array elements, instead of my_arr.filter(fn), Python uses filter(fn, my_arr)
    • Example: list(filter(lambda elem: elem.startswith('b'), ['bear', 'bat', 'cat']))
  • For sorting arrays, Python uses sorted(my_arr, key=optional_sort_fn)
  • Python lists don't have a dedicated unshift / append-left method!
    • Quick solutions:
      • Concatenate: ['a'] + ['b', 'c']
      • Insert: my_arr.insert(0, 'a')

List Joining, Appending, and Prepending

  • Adding elements to the end of a list
    • Use .append() to add a single element
    • Use .extend() to add multiple elements.
      • This can be used to join lists: list_a.extend(list_b)
  • Adding elements to the beginning of a list / prepending
    • With concatenation: ['a'] + ['b', 'c']
    • With insert: my_arr.insert(0, 'a')
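If you find yourself prepending a lot, one alternative not covered above is collections.deque from the standard library, which does have an appendleft method. This is a swapped-in data structure rather than a list, so it is only a sketch of an option, not a drop-in replacement everywhere.

```python
from collections import deque

queue = deque(["b", "c"])
queue.appendleft("a")    # like unshift() in JS
queue.append("d")        # like push() in JS
print(list(queue))       # ['a', 'b', 'c', 'd']

first = queue.popleft()  # like shift() in JS
print(first)             # a
print(list(queue))       # ['b', 'c', 'd']
```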

Advanced Indexing and Slicing

For slicing arrays, there is a short syntax that is rather useful:

a[start:stop]  # items from start through stop-1
a[start:]      # items from start through the rest of the array
a[:stop]       # items from the beginning through stop-1
a[:]           # a copy of the whole array
a[-n]          # item n offset from end (a[-1] would be the last element)

Use my_list[0::2] for every even-indexed element, and my_list[1::2] for every odd-indexed element. You can use this to extract a list of x coordinates and y coordinates if they are mixed together in a single array.

If you are working with numpy arrays, such as NDArray, you have additional advanced indexing and slicing methods available to you. Some of this syntax carries over to assignment as well.
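Here is the coordinate-splitting trick as a runnable sketch on a plain list, along with a couple of other slices. The coords data is made up.

```python
coords = [10, 20, 11, 21, 12, 22]  # x0, y0, x1, y1, x2, y2

xs = coords[0::2]
ys = coords[1::2]
print(xs)  # [10, 11, 12]
print(ys)  # [20, 21, 22]

print(coords[1:3])   # [20, 11]  (stop index is excluded)
print(coords[-2:])   # [12, 22]  (last two items)
print(coords[::-1])  # [22, 12, 21, 11, 20, 10]  (reversed copy)
```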

Dictionaries and Object Patterns

Looking Up Values in Dictionaries or Instances

Coming from JavaScript, one of the rudest surprises learning Python was that trying to look up a value in a dict by a key that does not exist will throw an error, whereas in JavaScript it just returns undefined. For example:

// JavaScript
const myObj = {"myKey": "myVal"};
const lookedUpValue = myObj["this_key_does_not_exist"];
console.log(lookedUpValue); // prints `undefined`
# Python
my_dict = {"myKey": "myVal"}
looked_up_val = my_dict["this_key_does_not_exist"]
print(looked_up_val) # KeyError: 'this_key_does_not_exist'

# This also applies to objects / class instances
from dataclasses import dataclass

@dataclass
class MyClass:
	my_key: str

my_instance = MyClass(my_key="myVal")
# A type-checker, like `mypy`, should also complain about
# this lookup via a property that does not exist
looked_up_val = my_instance.this_key_does_not_exist
print(looked_up_val)
# AttributeError: 'MyClass' object has no attribute 'this_key_does_not_exist'

This is a bit annoying in Python, as it means you always need to be careful about dictionary and / or object lookups. Here are some common patterns:

  • Use attribute getter methods. This is my preferred option, because it allows you to specify a fallback value.
    • Dictionaries: use my_dict.get(key, OPT_fallback)
      • my_dict.get(key) is actually equivalent to my_dict.get(key, None); Python uses None as a fallback and won't throw an access error if the key does not exist
      • Use defaultdict when declaring dictionaries (more on this below)
    • Objects: Use getattr(my_obj, key, OPT_fallback)
      • Unlike my_dict.get(), for getattr if you don't specify a fallback, and try to lookup via a key / prop that does not exist, this will throw an error (AttributeError)
      # Fallback to a custom value - examples
      looked_up_val = getattr(my_obj, my_key, None)
      looked_up_val = getattr(my_obj, "user_group", "regular")
    • Using getattr(my_obj, key, None) or my_dict.get(key) makes Python act a lot more like JavaScript
      • For getattr, it even works when my_obj is None!
  • Check whether the key or property exists before accessing it: use in for dictionaries, and hasattr() for objects / class instances (in does not work on arbitrary instances).
    # if dictionary
    if my_key in my_dict:
    	looked_up_val = my_dict[my_key]
    # if object / class instance
    if hasattr(my_obj, my_key):
    	looked_up_val = getattr(my_obj, my_key)

Another way to avoid this issue for dictionaries is to use the defaultdict class, and make sure a value is passed for the default_factory parameter.

For example:

from collections import defaultdict

my_dict = defaultdict(lambda: None)

lookedUpValue = my_dict["this_key_does_not_exist"]
print(lookedUpValue) # None
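A slightly more practical sketch: defaultdict(list) for grouping values. It also shows one caveat to keep in mind, which is that a bracket lookup on a missing key inserts that key into the dictionary. The sample orders are invented.

```python
from collections import defaultdict

orders = [("fred", "Iced Mocha"), ("jennifer", "Americano"), ("fred", "Latte")]

# default_factory=list: missing keys start out as an empty list
orders_by_customer = defaultdict(list)
for customer, drink in orders:
    orders_by_customer[customer].append(drink)

print(dict(orders_by_customer))
# {'fred': ['Iced Mocha', 'Latte'], 'jennifer': ['Americano']}

# Caveat: a bracket lookup on a missing key also inserts that key
print("wilma" in orders_by_customer)  # False
orders_by_customer["wilma"]
print("wilma" in orders_by_customer)  # True

# .get() still returns None for missing keys, without inserting anything
print(orders_by_customer.get("barney"))  # None
```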

Miscellaneous Dictionary Notes

  • Checking if key exists in dict
    • if 'my_key' in my_dict
  • Dictionary filling shorthand
    • In JavaScript you can do this:
    // Javascript
    const quantity = 12;
    const items = 'eggs';
    const carton = {quantity, items}
  • defaultdict class: Use this to instantiate a dictionary with a default value for any key that does not exist (so lookups for non-existing keys return the default value instead of throwing a KeyError)
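On the dictionary filling shorthand: as far as I know, Python dict literals have no direct equivalent of the JavaScript {quantity, items} shorthand shown above. The closest options are spelling out the keys, or using the dict() constructor with keyword arguments, as sketched here.

```python
quantity = 12
items = "eggs"

# Spell out each key explicitly...
carton = {"quantity": quantity, "items": items}

# ...or use the dict() constructor with keyword arguments
carton_alt = dict(quantity=quantity, items=items)

print(carton)                # {'quantity': 12, 'items': 'eggs'}
print(carton == carton_alt)  # True
```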

Unpacking and Destructuring

Unpacking and Destructuring from Lists

For unpacking from an array, list, or tuple use:

drink_sizes = ['12oz', '16oz', '20oz']
tall, grande, venti = drink_sizes

If there are more elements in a list than you want to capture, you can use *my_var to capture leftovers:

tall, *_other_sizes = ['12oz', '16oz', '20oz', '24oz', '32oz']

You can unpack / spread into existing arrays as well:

drink_sizes_oz = ['12oz', '16oz', '20oz']
drink_sizes_all = ['small', 'medium', 'large', *drink_sizes_oz]

Destructuring / unpacking while iterating

for name, age in [("Fred", 50), ("Betty", 42)]:
	print(f"{name} is {age} years old")

Unpacking and Destructuring from Dicts / Objects

Unfortunately, destructuring from dictionaries in Python requires more boilerplate code than in JS, and without shorthand syntax.

Assuming this is our sample dictionary:

# sample dict
drink_orders = {
	'fred': 'Iced Mocha',
	'jennifer': 'Americano'
}

Option A: Use itemgetter or attrgetter:

from operator import itemgetter

# Use `attrgetter` if class
fred_order, jennifer_order = itemgetter('fred', 'jennifer')(drink_orders)

Option B: use dict.values()

If you are using Python version 3.7 or above, this method works with any dictionary (as all dictionaries will have guaranteed order), but if you are using 3.6 or below, you will need to use an OrderedDict with this approach.

fred_order, jennifer_order = drink_orders.values()
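For the class-instance case mentioned in Option A, here is a sketch of attrgetter on a made-up Order dataclass. It also shows why itemgetter can be the safer choice for dictionaries: it looks values up by key, so it does not depend on insertion order the way values() does.

```python
from dataclasses import dataclass
from operator import attrgetter, itemgetter


@dataclass
class Order:
    customer: str
    drink: str


order = Order(customer="fred", drink="Iced Mocha")

# attrgetter with multiple names returns a tuple of attribute values
customer, drink = attrgetter("customer", "drink")(order)
print(customer, drink)  # fred Iced Mocha

# itemgetter looks up by key, so dictionary ordering does not matter
drink_orders = {"jennifer": "Americano", "fred": "Iced Mocha"}
fred_order, jennifer_order = itemgetter("fred", "jennifer")(drink_orders)
print(fred_order)  # Iced Mocha
```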

Spreading Dicts / Objects

You can spread one dict into another with the ** (double asterisk) operator.

initial_dict = {
	'customer_name': 'Newhart',
	'item_qty': 5,
	'total': 24.32
}
updates = {
	'item_qty': 6,
	'total': 25.86
}
combined = {
	'title': 'Final Receipt',
	**initial_dict,
	**updates
}
print(combined)
# {'title': 'Final Receipt', 'customer_name': 'Newhart', 'item_qty': 6, 'total': 25.86}

This can also be used to spread a dictionary as keyword arguments (kwargs) to a function:

def greet(greeting: str, name: str):
	print(f'{greeting}, {name}')

my_args = {'name': 'Guido van Rossum', 'greeting': 'How are things'}
greet(**my_args)

However, there are multiple caveats with spreading dicts into function arguments. The main one is that passing the same key twice does not work directly with kwargs (Python will throw a TypeError: got multiple values for keyword argument), so some indirection has to be used. You need to combine the values before spreading into the function arguments:

# This does NOT work
my_args = {'name': 'Guido van Rossum', 'greeting': 'How are things'}
greet(name='Joshua', **my_args)
# ^ Error: got multiple values for keyword argument 'name'

# Workaround - combine
my_args = {'name': 'Guido van Rossum', 'greeting': 'How are things'}
overrides = {'name': 'Joshua'}
greet(**{**my_args, **overrides})

Spreading a dictionary as kwargs currently has some serious issues when it comes to static type-checking, even when using TypedDict; see mypy Issue #4441 and pylance discussion #3046. This may potentially be fixed by PEP-692 and the use of Unpack.


Formatting

VSCode has an entire docs page devoted to the topic of Python linting.

The PEP8 Style Guide is a very common standard to follow when it comes to Python formatting rules and styles.

Here is a summary of some of the most popular linters and formatters.

💡 For ignoring pep8 errors, you can add # nopep8 at the end of a line to ignore the error. Other solutions are discussed in this thread. For generally ignoring linting errors, # noqa is accepted by many linters, such as flake8. If you need to ignore issues within a giant triple-quoted string, you can add # noqa at the very end.

Formatting - Weird Edge Cases and Tips

Because Python is a language where whitespace and indenting actually alter how the program runs, there are some... interesting scenarios in which you have to be extra careful about how you format your code.

This is compounded by the fact that the recommended line length limit for pep8 is rather restrictive; just 79 characters!

Here are some special cases worth mentioning (WIP):

Long IF Statements / IF with Chaining

When writing a long IF statement, and especially if it uses chaining, you will want to wrap the whole condition in parentheses. I believe for short statements this would be anti-pythonic, but if you need to wrap due to line length, this will be necessary for parsing:

if (
	students.filter(
		degree='Business'
	)
	# You can leave comments mixed between chained methods this way too
	.exclude(
		drop_out_in_progress=True
	)
	.count() > 100
):
	print('Enrollment full!')
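The same parentheses trick works for plain boolean conditions, not just chained method calls. Here is a self-contained sketch with made-up variables that you can actually run.

```python
enrolled = 120
capacity = 100
waitlist_enabled = False

# Parentheses allow the condition to span several lines without backslashes
if (
    enrolled > capacity
    # Comments can sit between the individual conditions
    and not waitlist_enabled
):
    status = "Enrollment full!"
else:
    status = "Open"

print(status)  # Enrollment full!
```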

Import Statements

from mylib import Alfa, Bravo, Charlie, Delta, Echo, Foxtrot, Golf, Hotel

# To

from mylib import Alfa, Bravo, Charlie, Delta, \
	Echo, Foxtrot, Golf, Hotel
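An alternative to backslash continuations is wrapping the imported names in parentheses, which is the form PEP8 leans towards for long lines. Since mylib is a placeholder, this sketch uses os.path from the standard library so that it runs as-is.

```python
# Parenthesized imports can span lines without backslash continuations
from os.path import (
    basename,
    dirname,
    join,
    splitext,
)

path = join("src", "app", "main.py")
print(basename(path))               # main.py
print(splitext(basename(path))[1])  # .py
print(basename(dirname(path)))      # app
```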

Exceptions and Error Handling

Exceptions in Python can and should have distinct types.

There are a bunch of built-in exception types, such as FileNotFoundError and PermissionError, but you can also create your own by subclassing the Exception base class:

For example,

class InvalidEmailError(Exception):
	"""Raise for an invalid email address"""
	pass

if '@' not in email_input:
	raise InvalidEmailError(f'{email_input} is not a valid email')

Try / Catch With Exceptions

If you are expecting a specific exception type, you can specify it:

try:
	validate_email(email)
except InvalidEmailError:
	print(f"Invalid email: {email}")

If you are unsure, you can use a plain except clause, or capture with except Exception as my_var:

try:
	do_something()
except Exception as err:
	print(err)

What happens if a thrown exception does not match any of the listed except blocks? It becomes an unhandled exception and will halt the overall execution of the program; basically the same as if you didn't have a try / except section to start with.
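Putting the pieces together, here is a sketch of a full try / except / else / finally flow with the custom exception from earlier. The body of validate_email is my own deliberately naive stand-in, and check is a made-up wrapper, so treat both as illustrative only.

```python
class InvalidEmailError(Exception):
    """Raise for an invalid email address"""


def validate_email(email: str) -> str:
    # Deliberately naive check, just for demonstration
    if "@" not in email:
        raise InvalidEmailError(f"{email} is not a valid email")
    return email


def check(email: str) -> str:
    try:
        validate_email(email)
    except InvalidEmailError as err:
        # Most specific handler first
        return f"invalid: {err}"
    except Exception as err:
        # Catch-all for anything else derived from Exception
        return f"unexpected: {err}"
    else:
        # Runs only if the try block raised nothing
        return "ok"
    finally:
        # Always runs, whether or not an exception was raised
        print(f"checked {email!r}")


print(check("fred@example.com"))  # ok
print(check("not-an-email"))      # invalid: not-an-email is not a valid email
```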

Testing

Random PyTest Notes

📄 My blog post: Pytest Productivity – Tips I Wish I Had Known Sooner

  • How to filter without allowing implicit wildcard at the end?
    • If you want to run a single test file, use pytest {FILEPATH}
    • If you want to run a single test function (or method), use pytest "{FILEPATH}::{TEST_NAME}" instead of pytest -k "SEARCH_STRING"
  • List matching tests
    • Use --collect-only

How Do I...

  • Write a noop? (something like the equivalent of () => {} in JS)
    • Lambda: lambda *args: None (or lambda *args, **kwargs: None to also accept keyword arguments)
  • Ternary operator?
    • MY_VAL if CONDITION else OTHER_VAL
  • Short-circuit logical-OR assignment?
    • MY_VAL = VAL_A or VAL_B
  • Avoid key errors / object has no attribute errors?
    • For objects, use if hasattr(object, 'my_property'), or use try / except, or use getattr(object, 'my_property', 'fallback_value')
  • Post or pre-increment?
    • You don't. Use += or -=
  • Run a Python file?
    • python3 python_file.py (or python3 -m python_file to run it as a module, without the .py extension)
  • Have an empty indent block?
    • Use pass
    • Within functions, you can also use ... (commonly used in typedefs)
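Several of the answers above fit into one short runnable sketch. All of the variable and function names are placeholders.

```python
# No-op function that accepts (and ignores) any arguments
noop = lambda *args, **kwargs: None
print(noop(1, 2, key="value"))  # None

# Ternary expression
count = 3
label = "many" if count > 1 else "single"
print(label)  # many

# Short-circuit OR: falls back on ANY falsy value (None, "", 0, [], ...)
user_input = ""
name = user_input or "anonymous"
print(name)  # anonymous

# No ++ / -- operators; use augmented assignment
count += 1
print(count)  # 4


# `pass` and `...` both work as placeholder bodies
def not_implemented_yet():
    pass


def stub() -> None: ...
```

Note that the or fallback treats 0 and empty strings as missing too, which is not always what you want.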
Markdown Source Last Updated:
Sun May 28 2023 02:08:22 GMT+0000 (Coordinated Universal Time)
Markdown Source Created:
Wed Nov 11 2020 10:38:21 GMT+0000 (Coordinated Universal Time)
© 2023 Joshua Tzucker, Built with Gatsby