7 Ways to Compute Averages in Python: Basic to Advanced

1. Basic Method for Calculating the Average in Python

The most fundamental way to calculate the average of data in Python is to combine the sum() function with the len() function. Since this uses Python’s built-in standard functions, no additional libraries are required, making it very easy for beginners to understand.

Basic Average Calculation Method

The average is obtained by dividing the sum of the data by the number of data points. For example, the code to calculate the average of numbers stored in a list looks like this:
# Sample data
numbers = [10, 20, 30, 40, 50]

# Calculate the sum
total = sum(numbers)

# Calculate the average
average = total / len(numbers)

print(f"Data: {numbers}")
print(f"Average: {average}")

Code Explanation

In this code, sum() adds up all the values in the list numbers, and len() returns the number of elements in the list. Dividing the sum by the number of elements gives the average, so the calculation works for a list of any non-zero length. Note that the / operator always returns a float, so the result here is 30.0 rather than 30. Because this method uses only built-in functions, it needs no imports and is a good fit for small datasets and quick scripts.
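If the same calculation is needed in several places, it can be wrapped in a small function. The following is a minimal sketch; the name mean_of is a hypothetical helper chosen for illustration and is not part of Python.

```python
def mean_of(values):
    """Return the arithmetic mean of a non-empty collection of numbers."""
    # Hypothetical helper for illustration only.
    values = list(values)  # accept any iterable, including generators
    return sum(values) / len(values)


numbers = [10, 20, 30, 40, 50]
average = mean_of(numbers)
print(f"Average: {average}")  # Average: 30.0
```

Converting the input to a list first means the function also works with generators, which do not support len().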

2. Calculating the Average Using the statistics Module

Python's standard library includes the statistics module, which makes calculating averages easier. The statistics.mean() function takes an iterable such as a list or tuple and computes the average of its values.

How to Calculate the Average Using statistics.mean()

import statistics

# Sample data
numbers = [10, 20, 30, 40, 50]

# Calculate the average
average = statistics.mean(numbers)

print(f"Data: {numbers}")
print(f"Average: {average}")

Code Explanation

This code uses the mean() function from the statistics module to obtain the average. The function accepts any sequence or iterable of numeric values, including int, float, Fraction, and Decimal. Because it is part of the standard library, no installation is required, and it can be used immediately. For ordinary numbers the result has the same value as the sum() / len() method above, but mean() sums the values more carefully to limit floating-point rounding error, and it raises statistics.StatisticsError when the data is empty. The code also becomes more concise and states its intent directly, which reduces the chance of errors.
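As a brief illustration of the supported numeric types, the sketch below passes integers, Fraction values, and Decimal values to statistics.mean(). The variable names are chosen here for illustration.

```python
import statistics
from decimal import Decimal
from fractions import Fraction

# Integers
int_mean = statistics.mean([10, 20, 30, 40, 50])

# Exact rational numbers
fraction_mean = statistics.mean([Fraction(1, 3), Fraction(2, 3)])

# Decimal values stay Decimal, avoiding binary floating-point rounding
decimal_mean = statistics.mean([Decimal("0.1"), Decimal("0.2"), Decimal("0.3")])

print(f"Integer mean: {int_mean}")
print(f"Fraction mean: {fraction_mean}")
print(f"Decimal mean: {decimal_mean}")
```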

3. Fast Average Calculation Using the numpy Module

When dealing with large amounts of data or frequently performing numerical calculations, Python’s numpy module is suitable. numpy is a library that enables efficient manipulation and computation of numeric data, especially excelling at processing large-scale datasets.

How to Compute the Average Using numpy.mean()

import numpy as np

# Sample data
numbers = np.array([10, 20, 30, 40, 50])

# Calculate the average
average = np.mean(numbers)

print(f"Data: {numbers}")
print(f"Average: {average}")

Code Explanation

Here, we use numpy.array to convert a list into a numpy array and calculate the average with the np.mean() function. Unlike lists, numpy arrays support direct element-wise numeric operations implemented in compiled code, which speeds up computation, especially when the data size is large. Additionally, the numpy.average() function can compute weighted averages. If you need to process large datasets quickly or apply weighting to data, numpy is a strong choice.
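np.mean() also accepts an axis argument, which is useful for multi-dimensional arrays. The following sketch uses a small two-dimensional array made up for illustration.

```python
import numpy as np

# Hypothetical 2 x 3 array: two rows, three columns
scores = np.array([[10, 20, 30],
                   [40, 50, 60]])

overall = np.mean(scores)             # mean of all six values
per_column = np.mean(scores, axis=0)  # one mean per column
per_row = np.mean(scores, axis=1)     # one mean per row

print(f"Overall: {overall}")        # Overall: 35.0
print(f"Per column: {per_column}")  # Per column: [25. 35. 45.]
print(f"Per row: {per_row}")        # Per row: [20. 50.]
```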

4. Calculating the Mean of Tabular Data Using pandas

Python’s data analysis library pandas excels at handling tabular data. Using the pandas.DataFrame object, you can easily compute the average of datasets that span multiple columns.

How to Compute the Mean Using DataFrame.mean()

import pandas as pd

# Create a sample DataFrame
data = {'A': [10, 20, 30, 40, 50],
        'B': [15, 25, 35, 45, 55]}
df = pd.DataFrame(data)

# Compute the mean of each column
average = df.mean()

print(f"DataFrame:\n{df}")
print(f"Mean of each column:\n{average}")

Code Explanation

In this example, we create a two-column DataFrame df and use the df.mean() method to calculate the mean of each column. DataFrames are well suited to tabular data processing and allow statistical calculations per column. For instance, if you need to aggregate by category, you can use the groupby method to compute the mean within each group. pandas is one of the most widely used libraries in data analysis and can handle complex data processing.
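The groupby aggregation mentioned above could look like the following sketch. The column names category and value and the sample rows are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical data: each row belongs to category "x" or "y"
df = pd.DataFrame({
    "category": ["x", "x", "y", "y", "y"],
    "value": [10, 20, 30, 40, 50],
})

# Compute the mean of "value" within each category
category_means = df.groupby("category")["value"].mean()

print(category_means)
```

The result is a Series indexed by category, so category_means["x"] gives the mean for category x (15.0 here) and category_means["y"] gives 40.0.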

5. Special Average Calculation Methods (Weighted, Harmonic Average, Geometric Average)

In addition to the ordinary arithmetic mean, there are averaging methods tailored to specific conditions. Below, we introduce special averaging techniques such as weighted average, harmonic average, and geometric average.

How to Calculate Weighted Average

Weighted average is used when you want to assign different importance (weights) to each data point. For example, the numpy.average() function can easily compute a weighted mean.
import numpy as np

# Sample data and weights
data = [10, 20, 30, 40, 50]
weights = [0.1, 0.2, 0.4, 0.2, 0.1]

# Calculate the weighted average
weighted_average = np.average(data, weights=weights)

print(f"Data: {data}")
print(f"Weights: {weights}")
print(f"Weighted average: {weighted_average}")
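To see what the weighting does, the same value can be computed by hand without numpy: multiply each value by its weight, add the products, and divide by the sum of the weights. This is a minimal sketch of that formula using only built-in functions.

```python
# Same sample data and weights as above
data = [10, 20, 30, 40, 50]
weights = [0.1, 0.2, 0.4, 0.2, 0.1]

# Sum of value * weight, divided by the sum of the weights
weighted_sum = sum(value * weight for value, weight in zip(data, weights))
weighted_average = weighted_sum / sum(weights)

print(f"Weighted average: {weighted_average}")
```

The result is approximately 30, with a possible small floating-point rounding difference in the last digits.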

How to Calculate Harmonic Average

The harmonic average is suitable for dealing with ratios or speeds, such as when calculating the average speed for traveling the same distance at different speeds. Using the statistics.harmonic_mean() function makes the calculation easy.
import statistics

# Sample data
data = [10, 20, 30]

# Calculate the harmonic mean
harmonic_mean = statistics.harmonic_mean(data)

print(f"Data: {data}")
print(f"Harmonic mean: {harmonic_mean}")
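The speed example can be checked directly. The sketch below assumes a hypothetical trip with two legs of 120 km each, driven at 40 km/h and 60 km/h, and compares total distance divided by total time with the harmonic mean of the speeds.

```python
import statistics

speeds = [40, 60]        # km/h on each leg
distance_per_leg = 120   # km, hypothetical value for illustration

# Total time = 120/40 + 120/60 = 3 + 2 = 5 hours
total_time = sum(distance_per_leg / speed for speed in speeds)
true_average_speed = distance_per_leg * len(speeds) / total_time

harmonic = statistics.harmonic_mean(speeds)
arithmetic = statistics.mean(speeds)

print(f"Distance / time: {true_average_speed}")  # Distance / time: 48.0
print(f"Harmonic mean: {harmonic}")
print(f"Arithmetic mean: {arithmetic}")
```

The harmonic mean agrees with the 48 km/h obtained from distance and time, while the arithmetic mean of 50 km/h overstates the average speed.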

How to Calculate Geometric Average

The geometric average is used for growth rates and compound interest calculations. It is the n-th root of the product of n values. In Python, you can compute it with the math module (math.prod()), and numpy's prod() function, used below, does the same job efficiently for arrays.
import numpy as np

# Sample data
data = [1.1, 1.3, 1.5]

# Calculate the geometric mean
geometric_mean = np.prod(data) ** (1 / len(data))

print(f"Data: {data}")
print(f"Geometric mean: {geometric_mean}")
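The standard library offers alternatives that need no third-party package. The following sketch assumes Python 3.8 or later, where both statistics.geometric_mean() and math.prod() are available.

```python
import math
import statistics

# Same sample growth factors as above
data = [1.1, 1.3, 1.5]

# Built-in geometric mean
geometric_mean_stats = statistics.geometric_mean(data)

# Manual version: n-th root of the product
geometric_mean_manual = math.prod(data) ** (1 / len(data))

print(f"statistics.geometric_mean: {geometric_mean_stats}")
print(f"math.prod version: {geometric_mean_manual}")
```

The two results agree up to floating-point rounding.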

6. Techniques for Error Handling and Accuracy Improvement

When calculating an average, you need to consider improving accuracy and handling errors. In particular, extra care is required when working with floating-point numbers. Python provides tools to address these accuracy issues.

Floating-Point Accuracy Issues

Calculations with floating-point numbers can contain small rounding errors, because most decimal fractions such as 0.1 cannot be represented exactly in binary. If accuracy is critical, using the Decimal class from the decimal module is effective.
from decimal import Decimal

# Sample data
data = [Decimal('0.1'), Decimal('0.2'), Decimal('0.3')]

# Calculate the average
average = sum(data) / len(data)

print(f"Data: {data}")
print(f"Average (improved precision): {average}")
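A short comparison makes the rounding issue visible. This sketch contrasts binary floats with Decimal values created from strings.

```python
from decimal import Decimal

# Binary floats: 0.1 + 0.2 is not exactly 0.3
float_total = 0.1 + 0.2
print(float_total == 0.3)  # False

# Decimal values built from strings are exact decimal quantities
decimal_total = Decimal("0.1") + Decimal("0.2")
print(decimal_total == Decimal("0.3"))  # True

# The average of the Decimal values is also exact
decimal_average = (decimal_total + Decimal("0.3")) / 3
print(f"Average: {decimal_average}")  # Average: 0.2
```

Note that Decimal values should be created from strings; Decimal(0.1) would copy the float's rounding error into the Decimal.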

Basics of Error Handling

For example, attempting to calculate an average with sum() / len() on an empty list raises a ZeroDivisionError, because len() returns 0. In such cases, it's important to include error handling that checks whether the list is non-empty before dividing.
# Sample data (including an empty list case)
data = []

# Error checking
if len(data) > 0:
    average = sum(data) / len(data)
    print(f"Average: {average}")
else:
    print("Error: Data is empty")
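An alternative to checking the length first is to catch the exception. The sketch below relies on statistics.mean() raising statistics.StatisticsError for empty data; the function name safe_mean and its default parameter are hypothetical, introduced here for illustration.

```python
import statistics


def safe_mean(values, default=None):
    """Return the mean of values, or default if values is empty."""
    # Hypothetical helper for illustration only.
    try:
        return statistics.mean(values)
    except statistics.StatisticsError:
        # Raised by statistics.mean() when there are no data points
        return default


empty_result = safe_mean([])
normal_result = safe_mean([10, 20, 30])

print(f"Empty list: {empty_result}")    # Empty list: None
print(f"Normal list: {normal_result}")
```

Returning a default keeps the calling code simple, but whether an empty dataset should be silently tolerated or reported depends on the application.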

7. Summary

In this article, we explained various ways to calculate the average in Python. We started with sum() and len() for basic calculations, then introduced more advanced methods that use the statistics module, numpy, and pandas, along with weighted, harmonic, and geometric averages. By selecting the approach that fits the data type and purpose, you can achieve more accurate and efficient data analysis. As a next step, learning how to visualize data, such as graphing the calculated averages using matplotlib, will give you a deeper understanding of the data.