Roadmap

The future of matchID

Consolidation

Method

  • recipes.py refactoring (split datasets, …)
  • git tags for versioning
  • build library
  • unit testing

Monitoring API

  • restart
  • job supervision

Code

  • migrate to Python 3

Automation / integration

  • CI testing
  • nginx path for dataviz (Kibana/Superset)
  • nginx modularity (remove Kibana)
  • add Superset support

Evolutions

Documentation

  • tutorial for duplicate detection
  • tutorial for data API-fication

Examples

  • sample for duplicate detection

Frontend

  • clique validation for duplicate detection
  • cost matrix charts
  • data loading helpers (e.g. CSV type pseudo-guessing)
  • data type display
  • transformation helpers (e.g. parsing dates)
  • editing data (and converting edits into recipes)

Backend

  • join with SQL
  • triggering recipes

Interoperability

Files

  • JSON
  • XML

Databases

  • Vertica
  • MySQL
  • MongoDB

Hadoop support

  • Spark
  • HDFS

Languages

  • SQL
  • R
  • PySpark

Other software

  • Dataiku/DSS
  • Luigi