API Reference

This page documents the public Python API by hand. The API surface is small enough that hand-written reference text is clearer than generated output for the initial release.

moine.distance

moine.distance(left, right, *, lang=None, dictionary=None, score_cutoff=None, max_readings_per_segment=None, max_span_chars=None, max_paths=None, longest_only=None)

Returns the Levenshtein-style Lattice Path Edit Distance for one pair of strings. When both lang and dictionary are omitted, the function falls back to plain string edit distance.

left
The first string.
right
The second string.
lang
Optional language code. Use "ja" for Japanese or "zh" for Chinese.
dictionary
Optional loaded dictionary object. When omitted, mòine loads or reuses the default dictionary for lang.
score_cutoff
Optional integer threshold. When the computed distance is greater than the cutoff, the function returns score_cutoff + 1 instead of the exact distance.
max_readings_per_segment, max_span_chars, max_paths, longest_only
Optional dictionary expansion controls. These options require lang or dictionary; plain string distance rejects them.
>>> import moine
>>> moine.distance("weishiji", "威士忌", lang="zh")
0
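
The score_cutoff convention described above can be illustrated with a plain-Python Levenshtein function. This is only a sketch of the documented return rule for the plain string case; the levenshtein helper is hypothetical, is not mòine's implementation, and does not model lattice paths or dictionaries.

```python
def levenshtein(left, right, score_cutoff=None):
    """Plain Levenshtein distance using the documented cutoff convention."""
    previous = list(range(len(right) + 1))
    for i, left_char in enumerate(left, start=1):
        current = [i]
        for j, right_char in enumerate(right, start=1):
            current.append(min(
                previous[j] + 1,                             # deletion
                current[j - 1] + 1,                          # insertion
                previous[j - 1] + (left_char != right_char), # substitution or match
            ))
        previous = current
    dist = previous[-1]
    # Distances above the cutoff are reported as score_cutoff + 1.
    if score_cutoff is not None and dist > score_cutoff:
        return score_cutoff + 1
    return dist


print(levenshtein("kitten", "sitting"))                  # 3
print(levenshtein("kitten", "sitting", score_cutoff=1))  # 2, i.e. cutoff + 1
print(levenshtein("kitten", "sitting", score_cutoff=3))  # 3, within the cutoff
```

A real implementation would typically use the cutoff to stop early; this sketch applies it only to the final result.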

moine.damerau_distance

moine.damerau_distance(left, right, *, lang=None, dictionary=None, score_cutoff=None, max_readings_per_segment=None, max_span_chars=None, max_paths=None, longest_only=None)

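Returns the lattice-aware Damerau-Levenshtein distance for one pair of strings. Unlike moine.distance, it can count a swap of two adjacent characters as a single edit, including on lattice paths. The parameters are the same as for moine.distance.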

>>> import moine
>>> moine.damerau_distance("moine", "mione")
1
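
To show what counting an adjacent transposition as one edit means for plain strings, here is a minimal sketch of the restricted Damerau-Levenshtein variant (optimal string alignment). The osa_distance name is hypothetical, and whether mòine uses the restricted or the unrestricted variant is an assumption this page does not settle.

```python
def osa_distance(left, right):
    """Optimal string alignment distance: Levenshtein plus adjacent swaps."""
    rows, cols = len(left) + 1, len(right) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete every character of left[:i]
    for j in range(cols):
        d[0][j] = j  # insert every character of right[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = left[i - 1] != right[j - 1]
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Two adjacent characters swapped: count the swap as one edit.
            if (i > 1 and j > 1
                    and left[i - 1] == right[j - 2]
                    and left[i - 2] == right[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


print(osa_distance("moine", "mione"))  # 1: "oi" and "io" are one swap
print(osa_distance("ca", "abc"))       # 3 under this restricted variant
```

The unrestricted Damerau-Levenshtein distance would give 2 for the second pair, which is one way to tell the two variants apart.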

moine.normalized_distance

moine.normalized_distance(left, right, *, lang=None, dictionary=None, score_cutoff=None, max_readings_per_segment=None, max_span_chars=None, max_paths=None, longest_only=None)

Returns a normalized distance in the inclusive range 0.0 to 1.0, where smaller is better.

>>> import moine
>>> moine.normalized_distance("もいにゃ", "モイニャ", lang="ja")
0.0

moine.normalized_similarity

moine.normalized_similarity(left, right, *, lang=None, dictionary=None, score_cutoff=None, max_readings_per_segment=None, max_span_chars=None, max_paths=None, longest_only=None)

Returns a normalized similarity in the inclusive range 0.0 to 1.0, where larger is better.

>>> import moine
>>> moine.normalized_similarity("もいにゃ", "モイニャ", lang="ja")
1.0
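
The relationship between the two normalized scores can be sketched for plain strings as follows. The normalization by the longer string's length is an assumption, chosen because it is consistent with the plain string ratio example in the cdist section; this page does not state the formula, and all function names below are hypothetical helpers rather than mòine internals.

```python
def levenshtein(left, right):
    """Plain Levenshtein distance (no lattice, no dictionary)."""
    previous = list(range(len(right) + 1))
    for i, left_char in enumerate(left, start=1):
        current = [i]
        for j, right_char in enumerate(right, start=1):
            current.append(min(
                previous[j] + 1,
                current[j - 1] + 1,
                previous[j - 1] + (left_char != right_char),
            ))
        previous = current
    return previous[-1]


def normalized_distance(left, right):
    # Assumption: divide by the longer length so the result stays in [0.0, 1.0].
    longest = max(len(left), len(right))
    if longest == 0:
        return 0.0  # two empty strings are identical
    return levenshtein(left, right) / longest


def normalized_similarity(left, right):
    # Assumption: similarity is the complement of the normalized distance.
    longest = max(len(left), len(right))
    if longest == 0:
        return 1.0
    return (longest - levenshtein(left, right)) / longest


print(normalized_distance("abc", "adc"))    # one edit out of three characters
print(normalized_similarity("abc", "adc"))  # the remaining two thirds
```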

moine.ratio

moine.ratio(left, right, *, lang=None, dictionary=None, score_cutoff=None, max_readings_per_segment=None, max_span_chars=None, max_paths=None, longest_only=None)

Alias for normalized_similarity.

>>> import moine
>>> moine.ratio("ピィート", "ピート", lang="ja")
0.7142857142857143

moine.partial_ratio

moine.partial_ratio(query, text, *, lang=None, dictionary=None, score_cutoff=None, max_span_chars=None, max_reading_span_chars=None, max_readings_per_segment=None, max_paths=None, longest_only=None)

Returns the best normalized similarity between query and any span of text. The returned score is in the inclusive range 0.0 to 1.0, where larger is better. In the partial APIs, max_span_chars limits the length of the spans scanned in text, while max_reading_span_chars limits dictionary reading expansion. When max_span_chars is omitted, dictionary-backed matching also accounts for the longest reading path of query, so a short written form such as kanji or hanzi can still match a longer romanized span.

>>> import moine
>>> moine.partial_ratio("ウイスキー", "ういすきーをのんでいます", lang="ja")
1.0

moine.partial_distance

moine.partial_distance(query, text, *, lang=None, dictionary=None, score_cutoff=None, max_span_chars=None, max_reading_span_chars=None, max_readings_per_segment=None, max_paths=None, longest_only=None)

Returns the best (lowest) distance between query and any span of text. If dictionary-backed matching cannot score any span in text, the function returns len(query) when score_cutoff is omitted, or score_cutoff + 1 when it is set.

>>> import moine
>>> moine.partial_distance("ウイスキー", "ういすきーをのんでいます", lang="ja")
0

moine.partial_alignment

moine.partial_alignment(query, text, *, lang=None, dictionary=None, metric="ratio", score_cutoff=None, max_span_chars=None, max_reading_span_chars=None, max_readings_per_segment=None, max_paths=None, longest_only=None)

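Returns a PartialAlignment(score, src_start, src_end, dest_start, dest_end) for the best span, or None when no span can be scored or score_cutoff filters out every span. All offsets are Python character offsets: src_start and src_end index into query, and dest_start and dest_end index into text. End offsets are exclusive, so they can be used directly in slices. metric is "ratio" by default; use "distance" to rank spans by distance instead.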

>>> import moine
>>> text = "ういすきーをのんでいます"
>>> alignment = moine.partial_alignment("ウイスキー", text, lang="ja")
>>> alignment
PartialAlignment(score=1.0, src_start=0, src_end=5, dest_start=0, dest_end=5)
>>> text[alignment.dest_start:alignment.dest_end]
'ういすきー'
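
For intuition, the span search behind the partial APIs can be sketched for plain strings as a brute-force scan. This is only an illustration under stated assumptions: the Span type and best_span function are hypothetical, the score uses an assumed longer-length normalization, ties go to the first span found, and mòine's actual search strategy and dictionary handling are not shown here.

```python
from collections import namedtuple

# Hypothetical result type; it mirrors only part of PartialAlignment.
Span = namedtuple("Span", "score dest_start dest_end")


def levenshtein(left, right):
    """Plain Levenshtein distance (no lattice, no dictionary)."""
    previous = list(range(len(right) + 1))
    for i, left_char in enumerate(left, start=1):
        current = [i]
        for j, right_char in enumerate(right, start=1):
            current.append(min(
                previous[j] + 1,
                current[j - 1] + 1,
                previous[j - 1] + (left_char != right_char),
            ))
        previous = current
    return previous[-1]


def best_span(query, text, max_span_chars=None):
    """Return the span of text most similar to query, or None if text is empty."""
    limit = len(text) if max_span_chars is None else max_span_chars
    best = None
    for start in range(len(text)):
        # Only spans of at most `limit` characters are scanned.
        for end in range(start + 1, min(len(text), start + limit) + 1):
            span = text[start:end]
            longest = max(len(query), len(span))
            score = (longest - levenshtein(query, span)) / longest
            if best is None or score > best.score:
                best = Span(score, start, end)
    return best


text = "I drink whisky daily"
hit = best_span("whisky", text)
print(hit, text[hit.dest_start:hit.dest_end])
```

As in the example above, the end offset is exclusive, so slicing text with the returned offsets yields the matched span.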

moine.cdist

moine.cdist(queries, choices, *, lang=None, dictionary=None, metric="distance", score_cutoff=None, max_readings_per_segment=None, max_span_chars=None, max_paths=None, longest_only=None)

Returns a query-by-choice matrix of scores.

queries
Iterable of query strings.
choices
Iterable of candidate strings.
lang
Optional language code. Use "ja" or "zh" for dictionary-backed scoring. Omit it for plain string scoring.
dictionary
Optional loaded dictionary object. When supplied, cdist can run without lang.
metric
One of "distance", "damerau_distance", "normalized_distance", "normalized_similarity", or "ratio".
score_cutoff
Optional threshold. Use an integer for distance metrics and a float for normalized metrics.
max_readings_per_segment, max_span_chars, max_paths, longest_only
Optional dictionary expansion controls. These require lang or dictionary.
>>> import moine
>>> moine.cdist(["abc", "axc"], ["abc", "acb"])
[[0, 2], [1, 2]]
>>> moine.cdist(["abc"], ["abc", "adc"], metric="ratio")
[[1.0, 0.6666666666666666]]
>>> moine.cdist(
...     ["weishiji", "布納哈奔"],
...     ["威士忌", "布納哈本"],
...     lang="zh",
... )
[[0, 8], [8, 0]]
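
The shape of the returned matrix can be sketched in plain Python: one row per query and one column per choice. The plain_cdist and levenshtein names are hypothetical helpers for the plain string case only and do not reflect how mòine computes the matrix internally.

```python
def levenshtein(left, right):
    """Plain Levenshtein distance (no lattice, no dictionary)."""
    previous = list(range(len(right) + 1))
    for i, left_char in enumerate(left, start=1):
        current = [i]
        for j, right_char in enumerate(right, start=1):
            current.append(min(
                previous[j] + 1,
                current[j - 1] + 1,
                previous[j - 1] + (left_char != right_char),
            ))
        previous = current
    return previous[-1]


def plain_cdist(queries, choices, scorer=levenshtein):
    """Score every query against every choice; rows follow queries."""
    choices = list(choices)  # materialize once so any iterable can be reused per row
    return [[scorer(query, choice) for choice in choices] for query in queries]


matrix = plain_cdist(["abc", "axc"], ["abc", "acb"])
print(matrix)        # [[0, 2], [1, 2]]
print(matrix[1][0])  # 1: score of queries[1] against choices[0]
```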

Note

cdist is intentionally minimal in the first public API. It does not expose RapidFuzz-only options such as processor, score_hint, NumPy dtype options, or worker parallelism.

Dictionary Loading

moine.load_dict

moine.load_dict(*, lang, path=None)

Loads a dictionary artifact for one language. If path is omitted, mòine searches the configured cache, language-specific environment variables, and MOINE_DICTIONARIES_PATH.

>>> import moine
>>> dictionary = moine.load_dict(lang="ja")

moine.set_default_dictionary

moine.set_default_dictionary(dictionary)

Registers a loaded dictionary as the default dictionary for its language.

>>> import moine
>>> dictionary = moine.load_dict(lang="ja")
>>> moine.set_default_dictionary(dictionary)
>>> moine.distance("もいにゃ", "モイニャ", lang="ja")
0

moine.clear_default_dictionary

moine.clear_default_dictionary(*, lang)

Clears the configured default dictionary for a language.

moine.get_default_dictionary

moine.get_default_dictionary(*, lang)

Returns the configured default dictionary for a language, or None.

Language-Specific Modules

moine.ja
Japanese helpers and the UniDic-backed Dictionary alias.
moine.zh
Chinese helpers and the CC-CEDICT-backed Dictionary alias.

Rust users should use the crate documentation on docs.rs.