
About This Project

MarcXimiL is a free, flexible, fully standards-compliant and efficient bibliographic similarity analysis framework. It comes out of the box with generic predefined similarity strategies, but each strategy can be customized in several respects:

  • the method of comparison between or within collections, including ways to skip comparisons that are probably useless
  • for each field, selection of a parsing function (fields may be indexed in several ways [words, digrams, soundex, initials, shingles] and can be concatenated, regrouped, or conditionally extracted)
  • for each field, selection of a comparison function from a wide selection: vectorial (Dice, Jaccard, Salton's cosine), probabilistic (Okapi BM25), Levenshtein-based, Authors, Date, and others
  • the way field similarities are combined to obtain a record similarity (various weighted means and ad-hoc functions)
  • the output format (XML, spreadsheet)
  • thresholds at different levels
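
To make the idea of a strategy more concrete, here is a minimal Python sketch of the three per-field choices listed above: a parsing function, a comparison function, and a weighted mean that combines field similarities into a record similarity. It does not use MarcXimiL's actual API or configuration format; every name in it (words, digrams, jaccard, dice, cosine, record_similarity, the strategy dictionary and the sample records) is hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

# --- Parsing functions (hypothetical): turn a field value into a list of tokens ---

def words(text):
    """Index a field by lowercase words."""
    return text.lower().split()

def digrams(text):
    """Index a field by overlapping two-character sequences, ignoring whitespace."""
    s = "".join(text.lower().split())
    return [s[i:i + 2] for i in range(len(s) - 1)]

# --- Comparison functions (hypothetical): map two token lists to a score in [0, 1] ---

def jaccard(a, b):
    """Jaccard coefficient on token sets: |A ∩ B| / |A ∪ B|."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0  # assumption: two empty fields count as identical
    return len(sa & sb) / len(sa | sb)

def dice(a, b):
    """Dice coefficient on token sets: 2|A ∩ B| / (|A| + |B|)."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0  # assumption: two empty fields count as identical
    return 2 * len(sa & sb) / (len(sa) + len(sb))

def cosine(a, b):
    """Cosine similarity on raw term-frequency vectors."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0

# --- Combination: weighted mean of the field similarities ---

def record_similarity(rec_a, rec_b, strategy):
    """strategy maps field name -> (parse function, compare function, weight)."""
    total_weight = sum(weight for _, _, weight in strategy.values())
    score = 0.0
    for field, (parse, compare, weight) in strategy.items():
        tokens_a = parse(rec_a.get(field, ""))
        tokens_b = parse(rec_b.get(field, ""))
        score += weight * compare(tokens_a, tokens_b)
    return score / total_weight

# Illustrative strategy and records (made up for this sketch).
strategy = {
    "title": (words, jaccard, 2.0),
    "author": (digrams, dice, 1.0),
}
rec_1 = {"title": "Graph theory basics", "author": "Smith J"}
rec_2 = {"title": "Basics of graph theory", "author": "Smyth J"}

similarity = record_similarity(rec_1, rec_2, strategy)
is_candidate = similarity >= 0.5  # a record-level threshold; the value is arbitrary
```

A threshold such as the one on the last line is the simplest way to turn a similarity score into a decision; in a real strategy, thresholds could equally be applied per field before the combination step.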