CS 561 Final Project

AI vs Human Writing Detection

Check whether your text was written by a human or generated by AI

Project Context

Background + Problem

This project uses NLP text classification to determine whether a text was written by a human or generated by an AI model, addressing concerns around education, academic integrity, and content authenticity.

Dataset

The data comes from the Kaggle ai-and-human-text-dataset, which contains 6,069 samples. Labels: 0 = Human, 1 = AI.

Methods

Text is lowercased, tokenized, and cleaned by removing stop words. TF-IDF then converts the cleaned text into numeric features for a baseline classifier, which is compared against BERT.
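
The preprocessing and TF-IDF steps can be sketched in plain Python as follows. This is a minimal illustration, not the project's actual code: the stop-word list, the regex tokenizer, the function names `preprocess` and `tfidf`, and the smoothed IDF formula are all assumptions chosen for the example.

```python
import math
import re
from collections import Counter

# Assumption: a tiny illustrative stop-word list; the project's real list is not specified.
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "of", "and", "to", "in", "on", "it", "this"}


def preprocess(text):
    """Lowercase the text, tokenize on runs of letters, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def tfidf(docs):
    """Return one {term: weight} dict per document.

    Assumed weighting: tf = count / document length,
    idf = ln((1 + N) / (1 + df)) + 1 (a common smoothed variant).
    """
    tokenized = [preprocess(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty documents
        vectors.append({
            term: (count / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


# Hypothetical toy documents, not taken from the dataset.
docs = [
    "The cat sat on the mat.",
    "The dog sat on the log.",
]
vectors = tfidf(docs)
```

In this toy example, "sat" appears in both documents, so it receives a lower weight than "cat", which appears in only one. The resulting vectors are the kind of features a baseline classifier would be trained on.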

Evaluation

We evaluate with accuracy, precision, recall, and F1 to compare the TF-IDF baseline against BERT.
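
For reference, the four metrics can be computed from the confusion counts as in the sketch below. It assumes AI (label 1) is treated as the positive class, which the page does not state; the helper name `binary_metrics` and the toy labels are hypothetical.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for binary labels.

    Assumption: `positive` is the AI class (label 1).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels for illustration only (0 = Human, 1 = AI).
y_true = [0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
```

Reporting precision and recall alongside accuracy matters here because the two error types differ in cost: a false positive flags human writing as AI, while a false negative lets AI text pass as human.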

Run Classifier

TF-IDF + Logistic Regression