
🎨 The Pipeline class

Pipelines in spamfilter let you build your own filtering models: a pipeline runs every string it is given through the filters wrapped into it, one after another.

Please note: Prior to spamfilter v2.0.0, the pipelines were called "machines". This is no longer the case, as we figured the term "machine" is not fully accurate for the way they work. The term "pipeline" is the new standard, as it describes the process of passing data through a series of filters. Thus, spamfilter v2.0.0 is a breaking change and you will need to update your code if you used machines before.

How to import the pipeline class

To import the Pipeline class, run:

from spamfilter.pipelines import Pipeline

Pipelines explained

Generally, all filters are stacked onto each other using a pipeline object, which then checks them one after another.

A pipeline also has a check(string: str) method which accepts a string as an input and will return a results.Result object. This object contains info about the filtering run.

Check the documentation for results.Result for more information.


spamfilter.pipelines.Pipeline

A pipeline is an object that accepts several filters and passes strings through these. It's the core mechanism to filter potential spam strings.

  • Pipeline.filters: this property is a list of all filters in a pipeline. The order is kept.
  • Pipeline.mode: can be "normal", "normal-quick", "tolerant" or "zero-tolerance".
    • "normal" lets filters change the string itself and fails a string if a filter says so.
    • "normal-quick" is like "normal", but stops execution as soon as a filter fails.
    • "tolerant" passes strings no matter what the filters say, and does not stop executing them on a fail.
    • "zero-tolerance" does not accept any changes to a string and fails it as soon as a filter registers something.
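
To make the difference between the four modes concrete, here is a minimal, self-contained sketch. It does not reproduce spamfilter's actual implementation: the toy filters, the `run` helper and the `(passed, text)` return shape are hypothetical stand-ins, and the choice to let "tolerant" still apply modifications is an assumption rather than documented behavior.

```python
from typing import Callable

# Hypothetical stand-in for a filter: returns (passed, possibly modified text).
ToyFilter = Callable[[str], tuple[bool, str]]


def lowercase(text: str) -> tuple[bool, str]:
    """Toy filter that never fails but rewrites the text to lowercase."""
    return True, text.lower()


def max_ten_chars(text: str) -> tuple[bool, str]:
    """Toy filter that fails any text longer than ten characters."""
    return len(text) <= 10, text


def run(filters: list[ToyFilter], text: str, mode: str = "normal") -> tuple[bool, str]:
    """Apply the toy filters in order, following the mode descriptions above."""
    passed = True
    for toy_filter in filters:
        ok, changed = toy_filter(text)
        if mode == "zero-tolerance":
            # Any failure or any attempted change fails the string at once.
            if not ok or changed != text:
                return False, text
            continue
        text = changed  # assumption: all other modes accept modifications
        if not ok:
            if mode == "tolerant":
                continue  # ignore the failure and keep going
            passed = False
            if mode == "normal-quick":
                break  # stop at the first failure
    return passed, text


if __name__ == "__main__":
    for mode in ("normal", "normal-quick", "tolerant", "zero-tolerance"):
        print(mode, run([max_ten_chars, lowercase], "THIS IS TOO LONG", mode))
```

In this sketch, "normal" and "normal-quick" both report a failure for the long string, but only "normal" goes on to run the remaining filter afterwards.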

Was called "Machine" prior to v2.0.0, which was a breaking change.

__init__(filters: Union[None, list[Type[Filter]]] = None, mode: str = 'normal')

Initializes the Pipeline object for later use. Filters do not need to be passed at this stage; they can be added later on.

Modes normal, normal-quick, tolerant and zero-tolerance are supported.

check(string: str) -> Result

Checks a given string against the filters inside the Pipeline. Returns a Result object.

Filters may modify the strings they get as input. The pipeline passes the modified string from each filter on to the next:

[THIS IS A CAPITAL STRING]

⬇

[Capitals filter]

⬇

[this is a capital string]

⬇

[other filters...]
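
The hand-off shown in the diagram can be sketched in plain Python. The two functions below are hypothetical stand-ins rather than spamfilter's own filters, and `trace` is an illustrative helper that feeds each function the output of the previous one and records every intermediate string.

```python
from typing import Callable


def capitals_to_lower(text: str) -> str:
    """Toy stand-in for a capitals filter: lowercases the text."""
    return text.lower()


def strip_exclamation(text: str) -> str:
    """Toy stand-in for a further filter: removes exclamation marks."""
    return text.replace("!", "")


def trace(filters: list[Callable[[str], str]], text: str) -> list[str]:
    """Return the input followed by the string as it looks after each filter."""
    stages = [text]
    for toy_filter in filters:
        text = toy_filter(text)  # the next filter sees the modified string
        stages.append(text)
    return stages


if __name__ == "__main__":
    for stage in trace([capitals_to_lower, strip_exclamation], "THIS IS A CAPITAL STRING!"):
        print(f"[{stage}]")
```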


An example

This example combines three filters in one pipeline, which then passes the given string through each of them in order.

from spamfilter.pipelines import Pipeline
from spamfilter.filters import (
    Capitals,
    SpecialChars,
    Length
)

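# Filters run in the order they are listed.
pipeline = Pipeline([
    Capitals(),
    SpecialChars(mode="crop"),
    Length(min_length=20, max_length=60)
])

# check() returns a Result object; print its "passed" attribute.
print(
    pipeline.check("Test string!").passed
)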