Resources
| What & Link | Type |
|---|---|
| Programming with Mosh: Python Cheatsheet | Cheat Sheet |
| pythoncheatsheet.org | Cheat Sheet |
| Official Docs | Official Docs |
| Drew Learns Python | Excellent guide and cheatsheet, especially for JS devs. |
| Florian GOTO - "Python Essentials for Node.js Developers" (Medium version, currently down) | Intro Guide |
| PythonTips.com book | Cheat Sheet / Quick Reference |
| RealPython.com | Tutorial / Guide Site |
| Joshua's Docs: Python Types | My notes on Python Types |
| Using VSCode with Python | My guide on using VS Code for Python Development |
| Tushar Sadhwani's blog | Lots of detailed posts about Python |
Here are some people that do some really cool things with the Python language itself:
- Tushar Sadhwani - GitHub (tusharsadhwani), Blog
- Frank Hoffman - GitHub (15r10nk)
- Alex Hall - GitHub (alexmojaki)
- Eric Traut - GitHub (erictraut)
- Sebastián Ramírez (@tiangolo) - GitHub (tiangolo)
The Module, Package, and Import System
- https://docs.python.org/3/reference/import.html
- https://docs.python.org/3/tutorial/modules.html
- http://python-notes.curiousefficiency.org/en/latest/python_concepts/import_traps.html
Using Packages
The default package manager for Python is called "pip", and it lets you install packages from the Python Package Index (PyPI).

For those with NodeJS experience, `pip` is similar to NPM or Yarn, and PyPI is similar to the NPM Package Repository.
Installing packages with pip is fairly straightforward, and the commands are similar to other package managers:
| Action | Command |
|---|---|
| Install from PyPI | `pip install "PackageName"`. You can also use semver: `pip install "PackageName~=1.2.6"` |
| Update package to latest | `pip install --upgrade PackageName` |
| Install from Requirements File | `pip install -r requirements.txt` |
| Install from a `setup.py` file | `pip install .` (or `pip install -e .` for an editable install) |
| Install from VCS / GitHub | `pip install -e {VCS_Syntax_String}` (*) |
| Install from local tarball | `pip install {localTarballPath}.tgz` |
| Check installed version | `pip show {PackageName}`. Use `--verbose` for extra info, like install path |
For the full list of commands, please see the pip reference guide.
🚨 Unlike NodeJS package managers, by default pip installs packages globally. You can pass `--user` to scope the install to the user instead, but the best approach is to use virtual environments.
Requirements Files
If you are sharing your python script, or even just using it in different environments, you probably don't want to require that its packages (dependencies) be installed one-by-one, manually with pip. A Requirements File is a way to declare all the packages required by your project, down to the specific versions, and which pip can use to provide a repeatable installation experience.
To generate a requirements file, you can either hand-code it, or use the `pip freeze` command:
python -m pip freeze > requirements.txt
And, then to install from it:
python -m pip install -r requirements.txt
`requirements.txt` vs `setup.py`? The usual convention is that `requirements.txt` is for other developers working on the project / package, whereas `setup.py` is for consumers of the package.
NodeJS developers: Yes, this is similar to the dependencies section of `package.json`, although pip's package resolution algorithms are not identical to NPM or Yarn's.
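For reference, a bare-bones requirements file might look something like this (the package names and versions here are just hypothetical examples of common version-specifier styles):

```text
requests==2.31.0          # exact pin
flask>=2.0,<3.0           # bounded range
some-internal-lib~=1.2.6  # "compatible release" specifier
```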
Working on Nested Packages Locally
Install the package(s) with an editable install:
python3 -m pip install -e {PATH}
Virtual Environments
The default behavior of pip, installing packages globally, is not ideal once you move beyond a single project; you can easily end up with version conflicts (Project A requires lib v1.0, but Project B requires lib v2.0) and other issues.
Luckily, Virtual Environments are a way in Python to have packages scoped to individual project directories, instead of the OS.
In general, if you are using Python 3.3 or newer, it is recommended to use the built-in `venv` module. Otherwise, the `virtualenv` tool (as a separate install) is the standard approach. This is a good guide that covers both. And this is a good guide for just `venv`. You can also find summarized step-by-step instructions for `venv` below.
Venv Instructions
For convenience, here are the minimal setup steps for `venv` (assuming you already have Python installed):

1. Navigate to the directory you want mapped / controlled by `venv`
2. Create the virtual environment. Convention is to use either `env/` or `venv/` as the directory.
    - Unix / MacOS: `python3 -m venv venv`
    - Win: `py -m venv venv`
    - The folder name you use here usually (depends on shell) shows up in the CLI after activation. E.g. `(venv) joshua@desktop-pc >`
3. Activate the virtual environment
    - Unix / MacOS: `source venv/bin/activate`
    - Win: `.\venv\Scripts\activate`
4. Double check that Python is being loaded via the Virtual Environment
    - Unix / MacOS: `which python`
    - Win: `where python`
5. (Optional) Install dependencies, based on `requirements.txt`: `python3 -m pip install -r requirements.txt`
⚠ REMINDER: Don't forget to activate the environment (step 3)!
💡 Don't forget to exclude the virtual environment directory from Git / Version Control.
When you need to leave the virtual environment, run `deactivate`. Note that this does not destroy or remove the virtual environment directory - you are just leaving the shell shim / environment. When you want to re-enter the environment, just run the `activate` script again (step 3 of the above section).
Virtual Environments - Troubleshooting
Venv - Wrong Python Path
I recently found that a venv environment I had been using for days without issue suddenly stopped working entirely. Although it would "activate", nothing seemed to function correctly, and calls to `which python` returned the wrong interpreter path!
The culprit ended up being exactly what this StackOverflow answer covers - I had renamed the parent folder. If you rename the parent directory of a venv, you might need to recreate the environment from scratch!
Venv - WSL Syntax Error
When trying to activate `venv` on WSL, you might encounter an error like this one, about syntax errors and unexpected tokens:
source ./venv/scripts/activate
: command not found
-bash: ./venv/scripts/activate: line 4: syntax error near unexpected token `$'{\r''
'bash: ./venv/scripts/activate: line 4: `deactivate () {
This is because of a line-ending issue; Python creates `activate` on Windows with CRLF line endings, when it should just be LF (aka `\n`). To fix, you can patch the file with `dos2unix`, with `tr`, or manually in an editor (just do a search and replace for `\r\n` with `\n`). I don't believe this can be done in one pass, but I might be wrong (although it would probably require some real shell skills to get as a one-liner).
Verifying the Installed Version of a Package
pip show {package_name}
Paths
Python exposes a magic global, `__file__`, for each file that holds the filepath of that given file (this is analogous to `__filename` in NodeJS).
If you want the directory of the file, you can use a few different approaches:
import os
from pathlib import Path
file_dir = os.path.dirname(os.path.realpath(__file__))
file_dir = Path(__file__).resolve().parent
The benefit to using `Path(__file__).resolve().parent` is that you can more easily chain `Path` operations onto the end - e.g. use `.parent.parent`
For getting the current working directory, you can use `os.getcwd()`:
import os
cwd = os.getcwd()
Within an interactive REPL that is launched outside of any given file, such as through the Django management shell, `__file__` will be undefined, but `os.getcwd()` should still be available.
Strings
- Strings can be enclosed in either single or double quotes
- Triple quotes should be used for multi-line strings (aka heredocs)
- The modern approach to string interpolation is with "f-strings".
    - E.g. `f"My var = {my_var}"`.
    - These work with double quotes, and even triple quotes
- To join strings, you can use (see the example below):
    - Plus operator (`"alpha" + "bet"`)
    - Placing strings side-by-side (`"alpha""bet"`)
    - The `.join` method
    - And more.
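A quick sketch tying the above together (the variable names are just for the example):

```python
my_var = 42
greeting = f"My var = {my_var}"   # f-string interpolation
multi_line = f"""Line one
my_var is {my_var}"""             # f-strings work in triple quotes too

joined = "alpha" + "bet"          # plus operator
side_by_side = "alpha" "bet"      # adjacent string literals
csv = ", ".join(["a", "b", "c"])  # str.join -> 'a, b, c'
```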
Regular Expressions / RegEx
🚨 WARNING: Python uses argument orders that (IMHO) are the opposite of most other languages. E.g., `re.sub` uses `re.sub(pattern, replacement, haystack)`, whereas most other languages use `(pattern, haystack, replacement)` or `haystack.replace(pattern, replacement)`
In general, using regular expressions is handled through the `re` standard library:
import re
Common use cases:

- Replace matches
    - `re.sub(regex_pattern, replacement, input_string)`
    - `re.sub(r'(^[Hh]ello|Goodbye)', "PLEASANTRY", "Hello. How are you? Goodbye.")`
    - Using back-references: `re.sub(r".+_(.+)_sku", r"item_\1", "current_book_sku")` -> `'item_book'`
- Use named capture groups, extract as variables (see the sketch below)
    - `match_info = re.match(regex_pattern, input_string)`
    - If the pattern did not match, `match_info` would be `None`
    - You can access matched groups via:
        - Standard dict key lookup - `match_info['my_group']`
        - Special accessor: `match_info.group('my_group')`
        - Get all groups, as key-value dict: `match_info.groupdict()`
        - Get all group values, as tuple: `match_info.groups()`
    - If a sub group did not match, and had optional matching, you will get `None` back for that group. E.g., in using `.groupdict()`, you could see something like: `{"named_group_a": "abc123", "named_group_b": None}`
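Here is a small end-to-end sketch of named capture groups (the pattern and input string are made up for the example):

```python
import re

match_info = re.match(r"(?P<area>\d{3})-(?P<line>\d{4})", "555-0123")

if match_info is not None:
    print(match_info["area"])        # '555'  (dict-style lookup)
    print(match_info.group("line"))  # '0123' (accessor method)
    print(match_info.groupdict())    # {'area': '555', 'line': '0123'}
    print(match_info.groups())       # ('555', '0123')
```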
Classes
Syntax:
# Just used for example purposes
import datetime
from enum import Enum
from typing import List, Optional
class Media:
# "class variables", which act (somewhat) like static properties
# Shared among all class instances (*), so generally used for constants, etc.
deprecated_physical_mediums = ["VHS", "DVD"]
# You can also declare class properties without assigning a value
# This is useful for strongly typing properties before assignment in `__init__` or elsewhere
co_authors: Optional[List[str]]
# Pretty much anything can be a class variable, including nested classes!
class MediaType(Enum):
Book = 1
Movie = 2
TV = 3
Music = 4
# Class constructor
def __init__(self, title: str, author: str, published: datetime.date):
# You can declare class properties belonging to the instance here
self.title = title
self.author = author
self.published = published
# Some methods. Note how `self` is *always* the first parameter
def republish(self):
self.published = datetime.date.today()
def print_relative_age(self, date: datetime.date):
delta = date - self.published
print(f"{self.title} was published {delta.days} days from {date}")
# For "getters", use `@property` decorator
@property
def age(self):
return datetime.date.today() - self.published
# You can sub-class, mix-in, etc.
Classes - Init
Is defining an `__init__()` method always necessary? What about on subclasses?

The answer is no - neither regular classes nor subclasses require an explicit `__init__()` method to be defined. Only declare the method if you actually want to change the behavior.
For subclasses, be aware that if you declare an `__init__()` method, you will need to manually call the parent init method if you want it to run; it will not be called automatically as you are overriding the method. You can use a pattern like this:
class Subclass(Superclass):
def __init__(self):
super(Subclass, self).__init__()
print('Do something else')
`__init__()` is not a constructor. A closer parallel in Python would be `__new__()`, and init is called after it runs.
Classes - How to Call Super
In looking at example code, you might see calls to super written as both `super()` and `super(SubClass, self)`. Both are valid (in modern Python), and in Python 3+, `super()` is actually equivalent to the more verbose form.
Classes - How to Check For Type and Inheritance
You can use `isinstance(my_instance, MyClass)` to check if an instantiated instance matches a certain class type.

This should not be confused with checking whether one class (the class definition itself, not an instance) is a subclass of another class - for that, use `issubclass(MyClass, ClassToCompareWith)`.

If you mix these up, `isinstance` won't throw an error, but will give very confusing results (it will never return true), and `issubclass` will raise a `TypeError` if you pass it an instance!
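A quick sketch of the difference (the class names are made up):

```python
class Animal:
    pass

class Dog(Animal):
    pass

fido = Dog()

print(isinstance(fido, Animal))   # True  - instance vs. class
print(issubclass(Dog, Animal))    # True  - class vs. class
print(isinstance(Dog, Animal))    # False - Dog is a class, not an instance of Animal
```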
Dataclasses
The `dataclasses` module is available with Python v3.7+
Dataclasses are great for instances when you just want to wrap up a bunch of data in a nice, clean, strongly-typed class. Dataclasses cut down on boilerplate with data-holding classes, as they automatically generate the `__init__` method for you.
For those familiar, it can feel a lot like TypeScript interfaces + JS objects.
from dataclasses import dataclass
from typing import Optional
@dataclass
class Beverage:
name: str
size_in_oz: float
iced: bool
# You can define default values
price: float = 3
coupon_code: Optional[str] = None
my_beverage = Beverage('Large Iced Coffee', 32.0, True)
# kwargs work too, just like normal
my_beverage = Beverage(name='Large Iced Coffee', size_in_oz=32.0, iced=True)
There is much more to dataclasses than the above example, but often the above is all you need.
TypedDict
A TypedDict is really just a type-annotation layer over a standard dictionary.
The common way to use them is to extend the `TypedDict` class as a new class declaration:
from typing import TypedDict
class Beverage(TypedDict, total=True):
name: str
size_in_oz: float
iced: bool
beverage: Beverage = {
'name': 'Large Iced Coffee',
'size_in_oz': 32.0,
'iced': True
}
`total` is `True` by default, and is only included for illustrative purposes.
There is also an alternative syntax, which calls `TypedDict` directly:
from typing import TypedDict
Beverage = TypedDict('Beverage', {'name': str, 'size_in_oz': float, 'iced': bool}, total=True)
beverage: Beverage = {
'name': 'Large Iced Coffee',
'size_in_oz': 32.0,
'iced': True
}
Note: With the alternative syntax for TypedDict declarations (type alias approach), the first argument must exactly match the variable name you are assigning to
Currently, inline / anonymous TypedDicts are not supported, although there is discussion around supporting them
With either declaration approach, you can instantiate / create / declare an instance of the typeddict through regular dictionary declaration (curly braces), or by calling the class constructor directly, with kwargs:
beverage: Beverage = {
'name': 'Large Iced Coffee',
'size_in_oz': 32.0,
'iced': True
}
# or
beverage = Beverage(
name='Large Iced Coffee',
size_in_oz=32.0,
iced=True
)
However, neither format supports default values for optional properties (by design).
SimpleNamespace
The `SimpleNamespace` object subclass is kind of an odd hybrid between a dataclass and a standard dictionary, but with weaker typing support than `TypedDict`. It lets you define an object instance using a single function call with keyword arguments, instead of having to define your own class.
This is kind of similar to PHP's `(object) array()` shorthand to convert an associative array into an object instance.
Here is a basic example of a SimpleNamespace instantiation and usage:
from types import SimpleNamespace
my_beverage = SimpleNamespace(
name="Large Iced Coffee",
size_in_oz=32.0,
iced=True
)
# Notice that we can use dot notation here to look up values
print(my_beverage.name)
You can use `SimpleNamespace` to convert existing dictionaries to objects, but by default this only works one layer deep - nested dictionaries are left as-is and not converted. If you want full recursive conversion, check out the `dotwiz` package.
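For example, a minimal sketch of that one-layer-deep behavior (the dict contents are made up):

```python
from types import SimpleNamespace

raw = {"name": "Large Iced Coffee", "details": {"size_in_oz": 32.0}}
obj = SimpleNamespace(**raw)

print(obj.name)                    # 'Large Iced Coffee' - dot notation works
print(obj.details)                 # {'size_in_oz': 32.0} - still a plain dict
print(obj.details["size_in_oz"])   # nested values still need bracket access
```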
Dataclass vs TypedDict vs SimpleNamespace
- Serialization / `json.dumps()`
    - `TypedDict`: Serialization works out of the box (since you are still using dictionaries)
    - `SimpleNamespace`: Serialization pretty much works out of the box (this is basically just a dict wrapper) - just use `my_simplenamespace_obj.__dict__` instead of the object directly.
        - E.g., `json.dumps(my_simplenamespace_obj.__dict__)`
    - `dataclass`: Must use `dataclasses.asdict(my_dataclass)` first, and there are some limitations there
- Accessor Syntax
    - `TypedDict` still requires that you use bracket notation to look up values, as this is really just a type annotation for a regular dictionary
        - E.g. `my_typeddict["my_key"]`
    - Both `SimpleNamespace` and `dataclass` let you use dot notation syntax to look up values, as they are classes
        - E.g. `my_obj.my_key`
- Strong static-typing
    - `TypedDict`: As its name implies, very strong static-typing support :)
    - `SimpleNamespace`: Despite the import coming from `types`, this has very little static-typing support. Expect no inference with current-generation tooling (`mypy`, `pyright`, etc.)
    - `dataclass`: Since dataclass declaration requires you to use type annotations, it has very strong static-typing support.
- Default values
    - Only dataclass: Default per-property values really only make sense for `dataclass` - dictionaries cannot have default per-property values (so the same holds for `TypedDict`), and although `SimpleNamespace` is a class, defaults don't really fit the use-case
        - WARNING: you might not want defaults anyway, considering how Python treats mutable defaults (see the sketch below)
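Related to that last warning, here is a minimal sketch of how mutable defaults are usually handled with dataclasses - a plain mutable default is rejected, and `field(default_factory=...)` is the typical workaround:

```python
from dataclasses import dataclass, field

@dataclass
class Order:
    customer: str
    # items: list = []  # <- raises ValueError: mutable default is not allowed
    items: list = field(default_factory=list)  # fresh list per instance

a = Order("Fred")
b = Order("Betty")
a.items.append("coffee")
print(b.items)  # [] - instances do not share the same list
```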
Docstrings
Other Special Class Types
Context Managers
- These should have an `__init__`, `__enter__`, and `__exit__` (see the sketch below)
- To use, call `with MyContextManager() as value:`
- You can have `__init__` take values, in which case you would pass arguments in the parentheses when calling, e.g. `with MyContextManager(args) as value:`
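A minimal sketch of a custom context manager, using a made-up class just for illustration:

```python
class ManagedResource:
    def __init__(self, name: str):
        self.name = name

    def __enter__(self):
        print(f"Opening {self.name}")
        return self  # this is what `as value` receives

    def __exit__(self, exc_type, exc_value, traceback):
        print(f"Closing {self.name}")
        return False  # False / None lets any exception propagate

with ManagedResource("demo") as resource:
    print(f"Using {resource.name}")
```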
Arrays and Lists
Iteration, Enumeration, and List Comprehension
Looping through items is, generally, one of the more syntactically unique parts of Python.
List Comprehension
List comprehension is a concise (and considered pythonic) way to build new lists from existing data.
Generally, it refers to any use of the `[expression for element in iterable]` syntax. For example:
users = [{"user_id": i, "name": f"user_{i}"} for i in range(1,11)]
print(users[1])
# {'user_id': 2, 'name': 'user_2'}
List comprehension can get much more complex than the above example. You can nest list comprehension, combine it with filtering, and much more.
Filtering Lists in Python
List comprehension is not just for iterating; you can filter with it too:
# Using list comprehension
[my_var for my_var in my_list if my_predicate]
# Example:
mixed = ['a', None, 2, 5, 'c', None]
strings = [e for e in mixed if isinstance(e, str)]
Another option is `filter()`, although it should be noted that this returns an iterator, not a list:
mixed = ['a', None, 2, 5, 'c', None]
strings_iter = filter(lambda e: isinstance(e, str), mixed)
`filter()` also tends to have worse static typing support than plain list comprehension, at least at the moment
Fast Checking For First Match of Condition
The `any()` function is the best (based on speed and conciseness) approach:
# any()
has_truthy = any(my_iter)
# if needing to compute truthy on the fly
admin_assigned = any(u.is_admin for u in assigned_users)
# Lambda works too
admin_assigned = any(map(lambda u: u.is_admin == True, assigned_users))
# Above use of `any` is more readable than the `next()` approach...
first_item = next(x for x in seq if predicate(x))
# But both are much shorter and faster than...
len(list(filter())) > 0
`any()` is basically equivalent to lodash `_.some()`
If you need to actually retrieve / store the first item that matches the predicate, then use `next()` (and you'll often want to use a default, such as `None`, with it).
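For example, a small sketch of `next()` with a default (the list and predicate are made up):

```python
numbers = [1, 3, 7, 8, 9]

first_even = next((n for n in numbers if n % 2 == 0), None)
print(first_even)  # 8

first_negative = next((n for n in numbers if n < 0), None)
print(first_negative)  # None - no match, so the default is returned
```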
File Walking / Directory Traversal / File Globs
One of the easiest ways to do nested file matching is with the `glob` package (built-in):
import glob

for jpg_file_path in glob.glob(f"{workdir}/**/*.jpg", recursive=True):
    print(f"JPG found at {jpg_file_path}")
Array / List Oddness in Python
- For joining array elements, instead of `my_arr.join(separator)`, Python uses `separator.join(my_arr)`
    - Although there might be historical reasons why this was chosen, this never ceases to frustrate me.
- For filtering array elements, instead of `my_arr.filter(fn)`, Python uses `filter(fn, my_arr)`
    - Example: `list(filter(lambda elem: elem.startswith('b'), ['bear', 'bat', 'cat']))`
- For sorting arrays, Python uses `sorted(my_arr, key=optional_sort_fn)`
- Python lists don't have `unshift` / append left functionality!
    - Quick solutions:
        - Concatenate: `['a'] + ['b', 'c']`
        - Insert: `my_arr.insert(0, 'a')`
List Joining, Appending, and Prepending
- Adding elements to the end of a list
    - Use `.append()` to add a single element
    - Use `.extend()` to add multiple elements
        - This can be used to join lists: `list_a.extend(list_b)`
- Adding elements to the beginning of a list / prepending
    - With concatenation: `['a'] + ['b', 'c']`
    - With `insert`: `my_arr.insert(0, 'a')`
Advanced Indexing and Slicing
For slicing arrays, there is a short syntax that is rather useful:
a[start:stop] # items start through stop
a[start:] # items start through the rest of the array
a[:stop] # items from the beginning through stop
a[:] # a copy of the whole array
a[-n] # item n offset from end (a[-1] would be the last element)
Use `my_list[0::2]` for every even-indexed element, and `my_list[1::2]` for every odd-indexed element. You can use this to extract a list of x coordinates and y coordinates if they are mixed together in a single array.
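For example, a quick sketch of that x / y extraction (the coordinate list is made up):

```python
coords = [1.0, 10.0, 2.0, 20.0, 3.0, 30.0]  # interleaved as x, y, x, y, ...

xs = coords[0::2]  # [1.0, 2.0, 3.0]
ys = coords[1::2]  # [10.0, 20.0, 30.0]
```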
If you are working with `numpy` arrays, such as `NDArray`, you have additional advanced indexing and slicing methods available to you. Some of this syntax carries over to assignment as well.
Get random list element
Since `random.choice` already supports structures like `list`, you can pass one directly to the method to get a random list element:
import random
random_pick = random.choice(my_list)
Dictionaries and Object Patterns
Looking Up Values in Dictionaries or Instances
Coming from JavaScript, one of the rudest surprises in learning Python was that trying to look up a value in a dict by a key that does not exist will throw an error, whereas in JavaScript it just returns `undefined`. For example:
// JavaScript
const myObj = {"myKey": "myVal"};
const lookedUpValue = myObj["this_key_does_not_exist"];
console.log(lookedUpValue); // prints `undefined`
# Python
from dataclasses import dataclass

my_dict = {"myKey": "myVal"}
looked_up_val = my_dict["this_key_does_not_exist"]
# ^ Raises KeyError: 'this_key_does_not_exist' before the print below is reached
print(looked_up_val)

# This also applies to objects / class instances
@dataclass
class MyClass:
    my_key: str

my_instance = MyClass(my_key="myVal")

# A type-checker, like `mypy`, should also complain about
# this lookup via a property that does not exist
looked_up_val = my_instance.this_key_does_not_exist
# ^ Raises AttributeError: 'MyClass' object has no attribute 'this_key_does_not_exist'
print(looked_up_val)
This is a bit annoying in Python, as it means you always need to be careful about dictionary and / or object lookups. Here are some common patterns:
- Use attribute getter methods. This is my preferred option, because it allows you to specify a fallback value.
    - Dictionaries: use `my_dict.get(key, OPT_fallback)`
        - `my_dict.get(key)` is actually equivalent to `my_dict.get(key, None)`; Python uses `None` as a fallback and won't throw an access error if the key does not exist
        - Use `defaultdict` when declaring dictionaries (more on this below)
    - Objects: Use `getattr(my_obj, key, OPT_fallback)`
        - Unlike `my_dict.get()`, if you don't specify a fallback for `getattr` and try to look up a key / prop that does not exist, it will throw an error (`AttributeError`)
        - Fallback to a custom value - examples: `looked_up_val = getattr(my_obj, my_key, None)`, `looked_up_val = getattr(my_obj, "user_group", "regular")`
    - Using `getattr(my_obj, key, None)` or `my_dict.get(key)` makes Python act a lot more like JavaScript
        - For `getattr`, it even works when `my_obj` is `None`!
- Use `in` to check if the key or property exists first, before accessing (the `hasattr()` function is a similar option, but only for non-dicts):

    ```python
    if my_key in my_obj:
        # if dictionary
        looked_up_val = my_obj[my_key]
        # if object / class instance
        looked_up_val = my_obj.my_key
    ```
Another way to avoid this issue for dictionaries is to use the `defaultdict` class, and make sure a value is passed for the `default_factory` parameter.

For example:
from collections import defaultdict
my_dict = defaultdict(lambda: None)
lookedUpValue = my_dict["this_key_does_not_exist"]
print(lookedUpValue) # None
Building Dictionaries
Dictionary with a Single Dynamic Key
In JavaScript, you can construct a dictionary with a key based on a variable, like so:
let dynamicKey = getKey(); // Example val: "name"
console.log({[dynamicKey]: 'test'});
// {'name': 'test'}
In Python, the same thing can be done by just directly putting the variable name as the key:
dynamic_key = get_key() # Example val: "name"
print({dynamic_key: 'test'})
# {'name': 'test'}
Removing Dictionary Values
- To delete a prop and capture the value, use `.pop()` (examples below)
    - `my_val = my_dict.pop(my_key)`
    - `my_val = my_dict.pop(my_key, my_default)`
- To delete a prop, use `del my_dict[my_key]`
    - Warning: This will throw a KeyError if the key does not exist in the dict
- To delete a prop that might exist, without throwing a KeyError if it doesn't, use the `.pop()` method with a default (`None`) that you ignore / discard: `my_dict.pop(my_key, None)`
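A quick sketch of those options (the dict contents are made up):

```python
my_dict = {"a": 1, "b": 2, "c": 3}

captured = my_dict.pop("a")          # 1 - removed and captured
missing = my_dict.pop("zzz", None)   # None - the default prevents a KeyError
del my_dict["b"]                     # removed; raises KeyError if absent

print(my_dict)  # {'c': 3}
```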
Miscellaneous Dictionary Notes
- Checking if key exists in dict
    - `if 'my_key' in my_dict`
- Dictionary filling shorthand
    - In JavaScript you can do this:

        ```javascript
        // Javascript
        const quantity = 12;
        const items = 'eggs';
        const carton = {quantity, items}
        ```

    - Unfortunately, there is no parallel in Python (not without some meta-programming magic)
- `defaultdict` class: Use this to instantiate a dictionary with a default value for any key that does not exist (so lookups for non-existing keys return the default value instead of throwing a `KeyError`)
- Deleting an object key-value pair / entry
    - Use the `del` keyword (e.g. `del my_dict[my_key]`) (acts like `delete` in JavaScript)
- Checking for an empty dict
    - Empty dicts are falsy, so `"not empty" if my_dict else "empty"` works just fine
Unpacking and Destructuring
Unpacking and Destructuring from Lists
For unpacking from an array, list, or tuple use:
drink_sizes = ['12oz', '16oz', '20oz']
tall, grande, venti = drink_sizes
If there are more elements in a list than you want to capture, you can use `*my_var` to capture the leftovers:
tall, *_other_sizes = ['12oz', '16oz', '20oz', '24oz', '32oz']
You can unpack / spread into existing arrays as well:
drink_sizes_oz = ['12oz', '16oz', '20oz']
drink_sizes_all = ['small', 'medium', 'large', *drink_sizes_oz]
Unpacking and Destructuring from Tuples
Unpacking and destructuring from tuples works the same as it does for lists.
Destructuring / unpacking while iterating
for name, age in [("Fred", 50), ("Betty", 42)]:
print(f"{name} is {age} years old")
Unpacking and Destructuring from Dicts / Objects
Unfortunately, destructuring from dictionaries in Python requires more boilerplate code than in JS, and there is no shorthand syntax.
Assuming this is our sample dictionary:
# sample dict
drink_orders = {
'fred': 'Iced Mocha',
'jennifer': 'Americano'
}
Option A: Use `itemgetter` or `attrgetter`:
from operator import itemgetter

# Use `attrgetter` if class
fred_order, jennifer_order = itemgetter('fred', 'jennifer')(drink_orders)
Option B: use `dict.values()`

If you are using Python version 3.7 or above, this method works with any dictionary (as all dictionaries will have guaranteed order), but if you are using 3.6 or below, you will need to use an `OrderedDict` with this approach.
fred_order, jennifer_order = drink_orders.values()
Spreading Dicts / Objects
You can spread one dict into another with the `**` (double asterisk) operator.
initial_dict = {
'customer_name': 'Newhart',
'item_qty': 5,
'total': 24.32
}
updates = {
'item_qty': 6,
'total': 25.86
}
combined = {
'title': 'Final Receipt',
**initial_dict,
**updates
}
print(combined)
# {'title': 'Final Receipt', 'customer_name': 'Newhart', 'item_qty': 6, 'total': 25.86}
This can also be used to spread a dictionary as keyword arguments (kwargs) to a function:
def greet(greeting: str, name: str):
print(f'{greeting}, {name}')
my_args = {'name': 'Guido van Rossum', 'greeting': 'How are things'}
greet(**my_args)
However, there are multiple caveats with spreading dicts into function arguments. The main one is that combining properties with the same key does not work directly with kwargs (it will throw `got multiple values for keyword argument`), and some indirection has to be used. You need to combine the values before spreading into the function arguments:
# This does NOT work
my_args = {'name': 'Guido van Rossum', 'greeting': 'How are things'}
greet(name='Joshua', **my_args)
# ^ Error: got multiple values for keyword argument 'name'
# Workaround - combine
my_args = {'name': 'Guido van Rossum', 'greeting': 'How are things'}
overrides = {'name': 'Joshua'}
greet(**{**my_args, **overrides})
Spreading a dictionary as kwargs currently has some serious issues when it comes to static type-checking, even when using `TypedDict`; see mypy Issue #4441 and pylance discussion #3046. This may potentially be fixed by PEP-692 and the use of `Unpack`.
Formatting
VSCode has an entire docs page devoted to the topic of Python linting.
The PEP8 Style Guide is a very common standard to follow when it comes to Python formatting rules and styles.
Here is a summary of some of the most popular linters and formatters.
💡 For ignoring `pep8` errors, you can add `# nopep8` at the end of a line to ignore the error. Other solutions are discussed in this thread. For generally ignoring linting errors, `# noqa` is accepted by many linters, such as `flake8`. If you need to ignore issues within a giant triple-quoted string, you can add `# noqa` at the very end.
Formatting - Weird Edge Cases and Tips
Because Python is a language where whitespace and indenting actually alter how the program runs, there are some... interesting scenarios in which you have to be extra careful about how you format your code.
This is compounded by the fact that the recommended line length limit for `pep8` is rather restrictive; just 79 characters!

Here are some special cases worth mentioning (WIP):
Long IF Statements / IF with Chaining
When writing a long IF statement, and especially if it uses chaining, you will want to wrap the whole statement in parentheses. I believe for short statements this would be anti-pythonic, but if you need to wrap due to line length, this will be necessary for parsing:
if (
    students.filter(
        degree='Business'
    )
    # You can leave comments mixed between chained methods this way too
    .exclude(
        drop_out_in_progress=True
    )
    .count() > 100
):
    print('Enrollment full!')
Import Statements
from mylib import Alpha, Bravo, Charlie, Delta, Echo, Foxtrot, Golf, Hotel

# To
from mylib import Alpha, Bravo, Charlie, Delta, \
    Echo, Foxtrot, Golf, Hotel
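As a related aside (not part of the original snippet above), PEP 8 generally prefers implicit line continuation inside parentheses over backslashes, so long imports are often wrapped like this instead:

```python
from mylib import (
    Alpha, Bravo, Charlie, Delta,
    Echo, Foxtrot, Golf, Hotel,
)
```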
Exceptions and Error Handling
Exceptions in Python can and should have distinct types.
There are a bunch of built-in exception types, such as `FileNotFoundError` and `PermissionError`, but you can also create your own by subclassing the `Exception` base class.
For example,
class InvalidEmailError(Exception):
"""Raise for an invalid email address"""
pass
if '@' not in email_input:
raise InvalidEmailError(f'{email_input} is not a valid email')
Try / Catch With Exceptions
If you are excepting a specific exception type, you can specify:
try:
validate_email(email)
except InvalidEmailError:
print(f"Invalid email: {email}")
If you are unsure, you can use a plain `except` clause, or capture with `except Exception as my_var`:
try:
    do_something()
except Exception as err:
    print(err)
What happens if a thrown exception does not match any of the listed `except` blocks? It becomes an unhandled exception and will halt the overall execution of the program; basically the same as if you didn't have a `try / except` section to start with.
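For illustration, a small sketch of that behavior (the function and exception types here are made up):

```python
def do_something():
    raise TypeError("unexpected type")

try:
    do_something()
except ValueError:
    print("bad value")  # never runs

# The TypeError doesn't match the ValueError clause, so it propagates up
# and halts the program - the same as having no try / except at all.
```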
Profiling
Generating a pstat profile:
Using cProfile:
# cProfile
python3 -m cProfile -o myapp.profile.pstats {YOUR_NORMAL_INVOKE_STUFF}
Using Yappi (better than `cProfile` for multi-process / parallel operations):
# yappi
yappi --clock-type=wall --output-format=pstat -o myapp.profile.pstats {INVOKE_PATH}
Analyze (using snakeviz):
snakeviz myapp.profile.pstats
Other tools:
Crafting Python CLIs
💡 I highly recommend not using `argparse`. There are so many better alternatives out there, my personal favorite probably being typer, which utilizes Python type-hints to speed up CLI development.
To detect when a file is being run from the CLI, use:
if __name__ == "__main__":
...
Accessing Command Line Arguments Directly
To access command line arguments directly, use `sys.argv`. You'll want to skip the first one, so you often see something like:
import sys
args = sys.argv[1:]
Testing
Random PyTest Notes
📄 My blog post: Pytest Productivity – Tips I Wish I Had Known Sooner
- How to filter without allowing implicit wildcard at the end?
    - If you want to run a single test file, use `pytest {FILEPATH}`
    - If you want to run a single test function (or method), use `pytest "MODULE::MY_TEST"` instead of `pytest -k "SEARCH_STRING"`
- List matching tests
    - Use `--collect-only`
How Do I...
- Write a noop? (something like the equivalent of `() => {}` in JS)
    - Lambda: `lambda *args: None`
    - Function: `def my_func(*args, **kwargs): pass`
- Ternary operator?
    - `MY_VAL if CONDITION else OTHER_VAL`
- Short-circuit logical-OR assignment?
    - `MY_VAL = VAL_A or VAL_B`
- Avoid key errors / `object has no attribute` errors?
    - For objects, `if hasattr(object, 'my_property')`, or use try / catch, or use `getattr(object, 'my_property', 'fallback_value')`
- Post or pre-increment?
    - You don't. Use `+=` or `-=`
- Run a Python file?
    - `python3 python_file.py` (or `python3 -m module_name` to run it as a module)
- Have an empty indent block?
    - Use `pass`
    - Within functions, you can also use `...` (commonly used in typedefs)
- Is there an equivalent to PHP's `(object) my_arr` casting, to convert a Python dictionary to a class instance?
    - Yes - see `SimpleNamespace`.
    - If you want a utility to recursively turn a python dictionary into a dot-notation accessible class, check out the `dotwiz` package
- Easiest / fastest way to execute a shell command and capture the output?
    - `result = subprocess.check_output(cmd_list)`
        - E.g.: `result = json.loads(subprocess.check_output(["node", "my_script.js", "hello"]))`
    - If you want success / check for zero exit code:
        - `file_exists = subprocess.run(["stat", "my_file"]).returncode == 0`
        - `file_exists = subprocess.call(["stat", "my_file"]) == 0`