Some Python basics, gotchas, idiosyncrasies, and links


This is just a reference page with some Python basics and some handy external links. The information is primarily for Python version ≥ 3.12.6.

Some key properties and idiosyncrasies of Python

Before searching on, say, Stack Overflow (or having Google send you there), it's often best to read the Python3: Programming FAQ first, which in fact answers most questions asked on Stack Overflow.

Each language has its own ways of doing familiar things. If you are coming to Python from another programming language, knowing the following may save you some time:

Immutable types include ...

int, float, bool, str, tuple, range, bytes, frozenset, MappingProxyType (a read-only view)

Mutable types include ...

list, dict, set

Note that relying on mutability breaks pure functional style!

When returning a processed list from a function consider converting it to an immutable tuple:

def immute_list(l):
    return tuple(map(lambda x: x**2 + 2 * x, l))

>>> immute_list([1,2,3])
(3, 8, 15)

When returning a processed set from a function consider converting it to an immutable frozenset:

>>> def immute_set(s):
...     return frozenset(map(str.upper, s))

>>> out = immute_set({'fe', 'fi', 'fo'})
>>> out
frozenset({'FE', 'FI', 'FO'})

>>> 'FE' in out
True
>>> 'FUM' in out
False

When returning a processed dict from a function consider converting it to an immutable types.MappingProxyType (the proxied dict can still be changed) or (better) use the frozendict or immutabledict external libraries to create truly immutable structures.
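
A minimal sketch of the MappingProxyType behaviour described above, using only the standard library. The helper name build_config and its contents are hypothetical, chosen purely for illustration:

```python
from types import MappingProxyType

def build_config():
    # Hypothetical helper: builds a dict and returns a read-only view of it
    # (the backing dict is returned too, only to demonstrate the caveat).
    data = {"host": "localhost", "port": 8080}
    return MappingProxyType(data), data

view, backing = build_config()

try:
    view["port"] = 9090        # writing through the proxy is rejected
except TypeError as e:
    print(f"TypeError: {e}")

backing["port"] = 9090         # but the proxied dict can still be changed...
print(view["port"])            # ...and the view reflects it: 9090
```

This is why the view is only as immutable as the access to the underlying dict; keeping that dict private to the function is what makes the pattern useful.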

When returning a processed Pandas DataFrame (which is mutable) from a function consider converting it to a StaticFrame.

A set (which is not ordered) can't be referenced by index:

>>> s = {'fe', 'fi', 'fo'}
>>> s[1]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'set' object is not subscriptable

To access the elements of a set in immutable, subscriptable form use a tuple (but the order is not guaranteed):

>>> tuple(s)
('fo', 'fi', 'fe')
>>> tuple(s)[0]
'fo'

For simple structured data consider using namedtuple instead of a tuple, or consider using a @dataclass, which may have methods, and may be made immutable using @dataclass(frozen=True).

Indexing and list operations

As in most coding languages, Python subscript indexes start at 0 for the 1st element. (Compare with Wolfram Mathematica which starts at [[1]], with [[0]] giving the Head.)

Visit this cheatsheet for a good basic guide to indexing and ranges for list, and some basic list operations.

Note that list.pop() removes (and returns) the last element, list.pop(i) removes (and returns) the element at index i, and list.remove('findme') searches for the first element equal to 'findme' and removes it (raising ValueError if it is not found).
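
A short sketch of the difference between pop() and remove(); the list contents are arbitrary:

```python
lst = ['a', 'b', 'c', 'b']

last = lst.pop()         # removes and returns the last element
print(last, lst)         # b ['a', 'b', 'c']

first = lst.pop(0)       # removes and returns the element at index 0
print(first, lst)        # a ['b', 'c']

lst.remove('c')          # removes the first element equal to 'c', returns None
print(lst)               # ['b']

try:
    lst.remove('findme')  # not present
except ValueError as e:
    print(f"ValueError: {e}")
```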

List index() searches for an element and returns the index if found:

>>> lst = ['a', 'b', 'c']
>>> print(lst.index('c'))
2

>>> print('x' in lst)
False

>>> lst.index('x')
ValueError: 'x' is not in list

Slicing (extracting data by range)

Python has powerful "slicing", but if you come from another language background the syntax and range conventions can be a bit confusing, especially that a selection [start:stop:step] means start <= x < stop, including the item at start but excluding the item at stop. Note how in the following the [2:4] only selects 2 items:

>>> lst = [1, 2, 3, 4, 5]
>>> lst[2:4]
[3, 4]

Another "gotcha" compared with most other coding languages is that no error occurs if a slice specifies a position that exceeds the number of items:

>>> lst = [1, 2, 3, 4, 5]
>>> lst[2:10]
[3, 4, 5]

For some good examples visit How to Slice a List.
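
A few more slice forms, sketched on the same list, including the step and negative positions. The contrast at the end is the important part: an out-of-range slice is quietly empty, while an out-of-range index raises.

```python
lst = [1, 2, 3, 4, 5]

print(lst[:2])      # [1, 2]            start defaults to 0
print(lst[-2:])     # [4, 5]            negative positions count from the end
print(lst[::2])     # [1, 3, 5]         every second item
print(lst[::-1])    # [5, 4, 3, 2, 1]   reversed copy
print(lst[10:])     # []                out-of-range slice is just empty

try:
    lst[10]          # but a plain index out of range is an error
except IndexError as e:
    print(f"IndexError: {e}")
```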

List + and +=

The + can add two lists (creates a new list).

>>> lst = [1, 2, 3]
>>> lst2 = [4, 5]
>>> lst + lst2
[1, 2, 3, 4, 5]

>>> lst  # unchanged
[1, 2, 3]

The += extends an existing list in place.

>>> lst += [9, 10]

>>> print(lst)
[1, 2, 3, 9, 10]
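
The practical consequence shows up when a second name refers to the same list. A small sketch (the variable names are illustrative):

```python
a = [1, 2, 3]
alias_a = a           # second name for the same list object
a = a + [4]           # + builds a NEW list and rebinds the name a
print(alias_a)        # [1, 2, 3]   the original object is untouched

b = [1, 2, 3]
alias_b = b
b += [4]              # += extends the existing list object
print(alias_b)        # [1, 2, 3, 4]
print(alias_b is b)   # True
```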

Basic loops: for, in, while

The in construct is widely used in Python, both in loops and as a smart comparator and selector:

fruits = ['apple', 'orange', 'peach']
if 'apple' in fruits:
    print('bake an apple pie')

for fruit in fruits:
    print(fruit)

Visit also this Google education guide.

List comprehension

The term list comprehension refers to algorithmic population of lists.

squares = [x**2 for x in range(10)]
print(squares)

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

With a condition:

fruits = ["apple", "banana", "cherry", "kiwi", "mango"]

newlist = [x for x in fruits if "a" in x]

print(newlist)

['apple', 'banana', 'mango']

Note how above a str is handled as a sequence of characters, so you can use in to test for a character (or any substring) in a string.

Use range to populate with integers.

Without comprehension (ugly):

lst = []
for i in range(10):
    lst.append(i)
print(lst)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

For some reason there are lots of online examples showing the above. Do NOT replicate that!

Could use range with list comprehension:

lst = [x for x in range(10)]
print(lst)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

But because range is an iterable, you can just use:

lst = list(range(10))
print(lst)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Don't assign lst = range(10) if you need a list: range returns a range object, not a list.

List vs iterable

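Many operations that work with list also work with range and other iterables, including sorted(iterable) and for elem in iterable. Sequences such as range, tuple and str also support len(iterable) and indexing iterable[0], but one-shot iterables such as generators do not.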

print(sorted(range(10),reverse=True))

[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

Use of lazy iterables such as range is often more memory efficient than building a list.
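
A rough illustration of the memory point, assuming CPython. sys.getsizeof reports only the container's own size, so treat the comparison as indicative rather than a precise measurement:

```python
import sys

n = 1_000_000
lazy = range(n)        # stores only start, stop and step
eager = list(lazy)     # stores a reference for every element

print(sys.getsizeof(lazy) < sys.getsizeof(eager))   # True

# range is still a sequence: indexing, len() and in all work without a list
print(lazy[10], len(lazy), 999_999 in lazy)         # 10 1000000 True

# A generator is lazier still, but it is not a sequence
gen = (x * x for x in range(5))
try:
    len(gen)
except TypeError as e:
    print(f"TypeError: {e}")
print(list(gen))                                    # [0, 1, 4, 9, 16]
```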

Some iterables: Sequences: list, tuple, str, bytes, bytearray, range; Collections: dict, set, frozenset; View objects: dict.keys(), dict.values(), dict.items(); Other: memoryview, generator objects (see yield).

NB: A list is an iterable but is NOT an iterator.

Iterator

Useful when you need to access elements in sequence outside a for loop.

Example: An iterator for a tuple:

fruits = ("apple", "banana", "cherry")
it = iter(fruits)

print(next(it))
apple

print(next(it))
banana

print(next(it))
cherry
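
What happens once the iterator runs out is worth knowing. The sketch below uses a shorter tuple and shows the StopIteration exception, the optional default argument of next(), and that the underlying tuple is unaffected:

```python
fruits = ("apple", "banana")
it = iter(fruits)

print(next(it), next(it))         # apple banana

try:
    next(it)                      # nothing left
except StopIteration:
    print("iterator exhausted")

print(next(it, "no more fruit"))  # a default avoids the exception
print(iter(it) is it)             # True: an iterator is its own iterator
print(list(fruits))               # ['apple', 'banana']  the tuple can be iterated again
```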

Some Python-specific operator "gotchas" compared with some other languages

The exponentiation (power) operator is ** not ^ (which performs a bitwise exclusive-or):

>>> 3**2
9

>>> 3^2
1

The // operator performs integer division with floor rounding:

>>> 5 // 2
2

>>> 6 // 2
3

>>> 6 // 2.0
3.0

>>> 6 // 2.9
2.0

Since Python3, float division is the default for / (in Python2, / on two ints performed integer floor division):

>>> 2/3
0.6666666666666666

>>> 4/2
2.0

Python's integer floor division operator // rounds towards negative infinity (rather than towards zero):

>>> -7 // 3
-3

In Python3 the following identity holds (for b not zero):

a == (a // b) * b + (a % b)

In Python3 x % m has the sign of m:

>>> (-7) % (-2)
-1
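
The identity and the sign rule can be checked over all four sign combinations with the built-in divmod(), which returns (a // b, a % b). The last line contrasts this with truncation towards zero:

```python
pairs = [(7, 3), (-7, 3), (7, -3), (-7, -3)]

for a, b in pairs:
    q, r = divmod(a, b)            # q == a // b, r == a % b
    assert a == q * b + r          # the identity holds in every case
    print(f"{a:>2} // {b:>2} = {q:>2}    {a:>2} % {b:>2} = {r:>2}")

# Truncation towards zero (as integer division behaves in C or Java) differs:
print(int(-7 / 3))                 # -2, whereas -7 // 3 is -3
```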

Numerical types in Python do not support the len operation:

>>> len(42)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: object of type 'int' has no len()

Compare with Wolfram Mathematica (which is closer to Lisp):

Length[42]
0

Head[42]
Integer

But Python len works with str (which is an iterable sequence):

>>> len("test")
4

From the Python3 FAQ:

A slash in the argument list of a function denotes that the parameters prior to it are positional-only. Positional-only parameters are the ones without an externally usable name. Upon calling a function that accepts positional-only parameters, arguments are mapped to parameters based solely on their position. For example, divmod() is a function that accepts positional-only parameters. Its documentation looks like this:

help(divmod)
Help on built-in function divmod in module builtins:

divmod(x, y, /)
    Return the tuple (x//y, x%y).  Invariant: div*y + mod == x.
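
The same slash can be used in your own function definitions (Python 3.8 and later). A minimal sketch; the function scale is hypothetical:

```python
def scale(value, factor, /):
    # Both parameters sit before the slash, so they are positional-only.
    return value * factor

print(scale(3, 2))                 # 6

try:
    scale(value=3, factor=2)       # keyword use is rejected
except TypeError as e:
    print(f"TypeError: {e}")
```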

You can access the global symbol table using the globals() function.

This facility should not be an invitation to abuse globals!

Since globals() is a dict you can access and change globals thus:

>>> bad = 'Global abuse'

>>> globals()['bad']
'Global abuse'

>>> globals()['bad'] = 'Even worse'
>>> globals()['bad']
'Even worse'

Use eval to interpret and evaluate strings as Python expressions:

>>> eval('1+2')
3

>>> eval('sum([1, 2, 3, 4])')
10

Using eval can get you out of some tight corners and is handy to have for some special scenarios but may indicate a code smell or insufficient use of functional support.

Indiscriminately offering eval of user input is a MAJOR security risk!

You can restrict the functions eval has access to; visit this Geeks For Geeks article.
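
One way to restrict eval is to pass explicit globals and locals, sketched below. The allowed mapping and the helper restricted_eval are hypothetical names, and this reduces the risk rather than removing it: it is not a sandbox. For plain literals, the standard library's ast.literal_eval avoids executing code at all.

```python
import ast

allowed = {"sum": sum, "abs": abs}      # hypothetical whitelist of callables

def restricted_eval(expr):
    # An empty __builtins__ hides open(), __import__() and friends.
    # This limits, but does not eliminate, what a hostile string can do.
    return eval(expr, {"__builtins__": {}}, allowed)

print(restricted_eval("sum([1, 2, 3, 4])"))    # 10

try:
    restricted_eval("open('secret.txt')")
except NameError as e:
    print(f"NameError: {e}")

# Literals only: numbers, strings, tuples, lists, dicts, sets, booleans, None
print(ast.literal_eval("[1, 2, {'a': 3}]"))    # [1, 2, {'a': 3}]
```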

Hashable

The hash() function (which only applies to immutable variants of collection types) creates an integer (that, amongst other things, is used to look up dictionary keys).

The hash() of a small int is (in CPython) just the integer value, with the exception of -1.

The hash() of a tuple is constructed from its elements.

The hash() of a mutable list or dict or set is NOT DEFINED (calling it raises TypeError).

A custom hash can be defined on an object.
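
A minimal sketch of a custom hash. The class Point is hypothetical; the rule it follows is that objects which compare equal must return the same hash, so __eq__ and __hash__ are defined together:

```python
class Point:
    # Hypothetical value-like class; treat instances as read-only after creation.
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Delegate to the hash of a tuple of the fields used in __eq__
        return hash((self.x, self.y))

seen = {Point(1, 2)}
print(Point(1, 2) in seen)        # True: equal points hash alike
print(Point(2, 1) in seen)        # False

try:
    hash([1, 2, 3])               # mutable built-ins refuse to be hashed
except TypeError as e:
    print(f"TypeError: {e}")
```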

The hash function enables quick comparison of collections:


>>> hash((1,2,3))
529344067295497451

>>> hash((3,2,1))
-925386691174542831

>>> hash((1,2,3)) == hash((3,2,1))
False

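A direct test gives the same result here, but note that == on tuples compares the elements themselves; the hash is what dict and set lookups use under the hood:

>>> (1,2,3) == (3,2,1)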
False

The range(start, stop) function excludes the 'stop' int itself:

>>> out = range(1, 3)
>>> type(out)
<class 'range'>
>>> out
range(1, 3)
>>> list(out)
[1, 2]

Exceptions

The ordering and logic of the try vs exception handling constructs in Python is slightly different from some other languages. The except (equivalent of Java catch) comes after the try, the non-exception case is handled in a following else, and the finally block always runs:

try:
    ...
except:
    ...
else:
    ...
finally:
    ...
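
To make the order of execution concrete, the following sketch records which blocks run in the success and failure cases. The function parse_int is a hypothetical example:

```python
def parse_int(text):
    steps = []                      # records which blocks were executed
    try:
        value = int(text)
        steps.append("try")
    except ValueError:
        value = None
        steps.append("except")
    else:
        steps.append("else")        # only when no exception was raised
    finally:
        steps.append("finally")     # always
    return value, steps

print(parse_int("42"))   # (42, ['try', 'else', 'finally'])
print(parse_int("x"))    # (None, ['except', 'finally'])
```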

Exception catching can be restricted to one or more types. A separate except block may be used for each type of error (affords per case handling):

try:
    result = 10 / 0 
except ZeroDivisionError:
    print("Cannot divide by zero!")
except TypeError:
    print("Operation on unsupported type!")

Multiple exceptions can be caught within a single except block as a tuple. Use the as e construct to bind the caught exception instance to a variable e:

try:
    data = {"key": "value"}
    item = data["non_existent_key"] 
except (KeyError, IndexError) as e:
    print(f"Caught an exception: {e}")

assert

Assertions in Python use the assert keyword. The message is optional:

def divide(a, b):
    try:
        assert b != 0, "Cannot divide by zero!"
        return a / b
    except AssertionError as e:
        print(f"Error: {e}")
        return None

print(divide(10, 2))
5.0

print(divide(10, 0)) 
Error: Cannot divide by zero!
None

To disable assertions run with python -O (capital letter O).

namedtuple

A collections.namedtuple offers a lightweight alternative to a full class for simple immutable data structures, with access to fields by name (as well as the usual tuple indexing):

from collections import namedtuple
Point = namedtuple('Point', 'x y')

p = Point(10, 20)

# Equivalent access
print(p[0]) 
print(p.x)
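
Continuing with the same Point, a short sketch of the immutability and of the helper methods _replace() and _asdict() that every namedtuple provides:

```python
from collections import namedtuple

Point = namedtuple('Point', 'x y')
p = Point(10, 20)

try:
    p.x = 99                      # fields cannot be reassigned
except AttributeError as e:
    print(f"AttributeError: {e}")

p2 = p._replace(x=99)             # returns a modified copy instead
print(p2)                         # Point(x=99, y=20)
print(p)                          # Point(x=10, y=20)
print(p._asdict())                # {'x': 10, 'y': 20}

x, y = p                          # unpacks like any tuple
print(x + y)                      # 30
```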

@dataclass

Since Python 3.7, dataclasses.dataclass offers a lightweight mutable data structure with a high degree of automation through the @dataclass decorator.

Data classes can define methods.

Data classes require type hints on defined fields (even though they aren't enforced on execution).

Data classes can be generated on-the-fly in code using make_dataclass.

Data classes offer value-based == equivalence comparison.

Data classes offer sorting support through @dataclass(order=True).

Data classes support inheritance (but don't play nicely with all patterns that use inheritance).

Data classes don’t allow mutable default values.

For flexible handling of defaults use field(default_factory=...), where default_factory accepts list, dict, set, etc. or any user-defined function that can be called with no arguments.

Data classes can be made immutable through @dataclass(frozen=True). One advantage of an immutable data class over a namedtuple is that it can carry methods.

Instantiating frozen dataclasses can be slightly slower than mutable ones.
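
Several of the points above can be sketched in a few lines. The classes Basket and Point and their fields are hypothetical examples, not taken from any library:

```python
from dataclasses import dataclass, field, FrozenInstanceError

@dataclass(order=True)
class Basket:
    owner: str
    items: list = field(default_factory=list)   # a fresh list per instance

    def add(self, item):                        # data classes can define methods
        self.items.append(item)

a, b = Basket("ann"), Basket("bob")
a.add("apple")
print(a)                                # Basket(owner='ann', items=['apple'])
print(b.items)                          # []  the default is not shared
print(a == Basket("ann", ["apple"]))    # True: value-based equality
print(a < b)                            # True: ordered field by field

@dataclass(frozen=True)
class Point:
    x: int
    y: int

    def norm1(self):
        return abs(self.x) + abs(self.y)

p = Point(3, -4)
print(p.norm1())                        # 7
try:
    p.x = 0                             # frozen instances reject assignment
except FrozenInstanceError as e:
    print(f"FrozenInstanceError: {e}")
```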

For some good explanations visit this datacamp article and this data quest article.

Data classes are not a panacea

defaultdict

A collections.defaultdict is a specialised dictionary that can handle missing keys:

from collections import defaultdict

# Default factory is int (default value for int is 0)
counts = defaultdict(int)

# Default factory is list (default value for list is an empty list)
grouped_items = defaultdict(list)

No KeyError is raised; note that merely reading a missing key inserts it with the default value:

print(counts["missing1"])
print(counts.keys())

0
dict_keys(['missing1'])

print(grouped_items["missing2"])
print(grouped_items.keys())

[]
dict_keys(['missing2'])
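
The two factories above are typically used for counting and for grouping. A small sketch with made-up data:

```python
from collections import defaultdict

words = ["apple", "avocado", "banana", "apple"]

counts = defaultdict(int)
by_letter = defaultdict(list)

for w in words:
    counts[w] += 1                 # missing keys start at 0
    by_letter[w[0]].append(w)      # missing keys start as []

print(dict(counts))                # {'apple': 2, 'avocado': 1, 'banana': 1}
print(dict(by_letter))             # {'a': ['apple', 'avocado', 'apple'], 'b': ['banana']}

# .get() and in do NOT trigger the default factory, so nothing is inserted
print(counts.get("cherry"))        # None
print("cherry" in counts)          # False
```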

`with` for resources

For resource management (such as opening files and database connections) it is usually best to use the concise with statement rather than try/finally. Not only is it more concise, the context manager associated with a with statement will ensure the resource is automatically cleaned up.

Instead of:

f = open('readme.txt', 'r')
try:
    print(f.read())
finally:
    f.close()

Use with:

with open('readme.txt', 'r') as f:
    print(f.read())

The open() above returns an io.TextIOWrapper object (a subclass of io.TextIOBase), which is also a context manager.

To write:

with open("writeme.txt", "w") as fw:
    fw.write("Opened!")

Or with a database connection:

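import sqlite3
from contextlib import closing

def db_fetch_table():
    # sqlite3's own "with conn:" only commits or rolls back; closing() also closes the connection
    with closing(sqlite3.connect('db/example.db')) as conn:
        cursor = conn.cursor()
        cursor.execute('SELECT * FROM "table"')  # quoted because table is an SQL keyword
        return cursor.fetchall()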

You can also create custom Context Managers.

with open("writeme.txt", "w") as fw:
    fw.write("Hello World!")
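
As for custom context managers, one lightweight route is the standard library's contextlib.contextmanager decorator. The sketch below uses a hypothetical tracked() resource that merely logs when it is opened and closed, to show that the cleanup runs even when the body raises:

```python
from contextlib import contextmanager

@contextmanager
def tracked(name, log):
    # Hypothetical resource: setup before the yield, cleanup in finally.
    log.append(f"open {name}")
    try:
        yield name.upper()          # the value bound by "as"
    finally:
        log.append(f"close {name}")

log = []
try:
    with tracked("db", log) as handle:
        log.append(f"use {handle}")
        raise RuntimeError("boom")  # simulate a failure inside the block
except RuntimeError:
    pass

print(log)    # ['open db', 'use DB', 'close db']
```

The alternative is a class that implements __enter__ and __exit__, which is more verbose but gives finer control over exception handling.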

Counter

Python's collections.Counter is a powerful way to do counting (frequency statistics) of hashable items such as str or tuple. It is a dict subclass and also supports some special arithmetic operations. Visit this RealPython guide and this GeeksForGeeks guide.
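
A brief sketch of counting and of the arithmetic operations, with arbitrary sample data:

```python
from collections import Counter

c = Counter("mississippi")         # counts each character
print(c.most_common(2))            # [('i', 4), ('s', 4)]
print(c["z"])                      # 0: missing items count as zero, no KeyError

stock = Counter(a=3, b=1)
order = Counter(a=1, b=2)
print(stock + order)               # Counter({'a': 4, 'b': 3})
print(stock - order)               # Counter({'a': 2})  non-positive counts are dropped
```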