classical-automl-model / README.md

Iris314

metadata

language:
  - en
tags:
  - automl
  - tabular-classification
  - autogluon
  - cmu-course
datasets:
  - aedupuga/lego-sizes
metrics:
  - accuracy
  - f1
model-index:
  - name: Lego Brick Classification (Classical AutoML)
    results:
      - task:
          type: tabular-classification
          name: Tabular Classification
        dataset:
          name: aedupuga/lego-sizes
          type: aedupuga/lego-sizes
          split: augmented
        metrics:
          - type: accuracy
            value: 0.97
          - type: f1
            value: 0.96
      - task:
          type: tabular-classification
          name: Tabular Classification
        dataset:
          name: aedupuga/lego-sizes
          type: aedupuga/lego-sizes
          split: original
        metrics:
          - type: accuracy
            value: 0.9
          - type: f1
            value: 0.89

Model Card for Lego Brick Classification (Classical AutoML)

This model classifies LEGO pieces into three types (Standard, Flat, and Sloped) from four geometric features: Length, Height, Width, and Studs.
It was trained with AutoGluon's tabular AutoML, which searched over classical ML models (LightGBM, XGBoost, CatBoost, Random Forest, k-NN, and a neural network) and selected the best performer, a LightGBM model.
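The training script is not included in this card, so the following is only a minimal sketch of how such a search might be launched with AutoGluon. The function name `train_lego_classifier`, the label column name `"Type"`, and the time limit are assumptions made for illustration; the output directory matches the path used in the inference example.

```python
def train_lego_classifier(train_df, label="Type", out_dir="autogluon_model/", time_limit=300):
    """Fit an AutoGluon TabularPredictor on a pandas DataFrame and return it.

    `label` is a hypothetical column name; replace it with the dataset's
    actual target column.
    """
    # Imported lazily so this module can be loaded without AutoGluon installed.
    from autogluon.tabular import TabularPredictor

    predictor = TabularPredictor(label=label, path=out_dir, eval_metric="accuracy")
    # fit() trains the candidate models and keeps the best one for prediction.
    predictor.fit(train_df, time_limit=time_limit)
    # The leaderboard lists every trained model with its validation score.
    print(predictor.leaderboard())
    return predictor
```

Calling `train_lego_classifier(df)` on a DataFrame that holds the four feature columns plus the label column would write the fitted predictor to `autogluon_model/`.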

Model Details

Model Description

Developed by: Xinxuan Tang (CMU)
Dataset curated by: Anuhya Edupuganti (CMU)
Model type: AutoML ensemble (best model = LightGBM)
Language(s): N/A (tabular data)
Finetuned from: Not applicable

Model Sources

Repository: Hugging Face Model Repo
Dataset: aedupuga/lego-sizes

Uses

Direct Use

Educational practice in tabular classification.
Experimenting with AutoML search and hyperparameter tuning.

Downstream Use

Could be used as a teaching example for AutoML pipelines on small tabular datasets.

Out-of-Scope Use

Not suitable for industrial LEGO quality control, because the dataset is small and synthetic.

Bias, Risks, and Limitations

Small dataset: only 30 original bricks, augmented to 300 synthetic samples.
Synthetic data bias: jitter augmentation may not reflect real-world LEGO variations.
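The exact augmentation procedure is not described here. As a rough illustration of what jitter augmentation means, the sketch below adds small Gaussian noise to the continuous dimensions of each brick. The ten copies per brick (which would turn 30 bricks into 300 samples), the 2% relative noise, the choice to leave Studs unchanged, and the label column name `"Type"` are all assumptions.

```python
import random

# Assumed continuous columns; Studs is a count, so it is left unchanged here.
FEATURES_TO_JITTER = ("Length", "Height", "Width")


def jitter_augment(rows, copies_per_row=10, rel_std=0.02, seed=0):
    """Return `copies_per_row` noisy copies of every row (originals excluded)."""
    rng = random.Random(seed)  # seeded for reproducibility
    augmented = []
    for row in rows:
        for _ in range(copies_per_row):
            noisy = dict(row)  # keeps the label and Studs as they are
            for name in FEATURES_TO_JITTER:
                # Multiplicative Gaussian noise with relative std `rel_std`.
                noisy[name] = row[name] * (1.0 + rng.gauss(0.0, rel_std))
            augmented.append(noisy)
    return augmented


# One hypothetical brick; "Type" is an assumed label column name.
originals = [{"Length": 4.0, "Height": 1.2, "Width": 2.0, "Studs": 4, "Type": "Standard"}]
samples = jitter_augment(originals)
print(len(samples))
```

Because every synthetic sample stays close to one of the originals, a model can score well on the augmented split while having seen very little real variation.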

Recommendations

Users should treat the results as a proof of concept and should not deploy the model in production.
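To see why a score on so few bricks is only indicative, the sketch below computes a Wilson score interval for an accuracy estimate. It assumes, purely for illustration, that the reported 0.9 accuracy on the original split corresponds to 27 of 30 bricks classified correctly; the actual evaluation set size is not stated in this card.

```python
import math


def wilson_interval(correct, total, z=1.96):
    """Wilson score interval for a proportion; z=1.96 gives roughly 95% coverage."""
    if total <= 0:
        raise ValueError("total must be positive")
    p_hat = correct / total
    denom = 1.0 + z * z / total
    center = (p_hat + z * z / (2 * total)) / denom
    half_width = z * math.sqrt(
        p_hat * (1 - p_hat) / total + z * z / (4 * total * total)
    ) / denom
    return center - half_width, center + half_width


# Assumption: 27 of 30 original bricks were classified correctly.
low, high = wilson_interval(27, 30)
print(f"95% interval for accuracy: {low:.3f} to {high:.3f}")
```

Under that assumption the interval runs from roughly 0.74 to 0.97, which is too wide to support production claims.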

How to Get Started with the Model

from autogluon.tabular import TabularPredictor
import pandas as pd

# Load the trained predictor from its saved directory
predictor = TabularPredictor.load("autogluon_model/")

# Run inference on one brick; column names must match the training features
test_data = pd.DataFrame([{"Length": 4, "Height": 1.2, "Width": 2, "Studs": 4}])
print(predictor.predict(test_data))