CRAB Webapp (Code Review Automation Benchmark)

A research-driven platform for evaluating deep learning models on automated code review tasks. CRAB provides two core services:

  • Dataset download: Obtain high-quality, curated Java code review datasets for comment generation and code refinement tasks.
  • Result evaluation: Upload model-generated predictions to receive standardized evaluation metrics via a REST+WebSocket API.

Table of Contents

  • Features
  • Prerequisites
  • Installation & Setup
  • Environment Variables
  • Running the Application
  • Using the Webapp
  • API Endpoints
  • Project Structure
  • Contributing
  • License
  • Acknowledgements

Features

  • Static Frontend: Vanilla HTML/CSS/JS interface—no build toolchain required.
  • Dataset Delivery: ZIP archives of JSON files, with optional full repository context.
  • Submission Queue: Server-managed job queue with configurable parallelism (via MAX_WORKERS); see the sketch after this list.
  • Realtime Feedback: Progress updates over WebSockets (using Flask-SocketIO).
  • Robust Data Processing: Utilities for parsing, validating, and evaluating submissions in src/utils.
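
The queueing and concurrency logic lives in src/utils/queue_manager.py; the snippet below is only a minimal sketch of the idea, assuming a thread pool whose size is read from MAX_WORKERS (the names process_job and enqueue are hypothetical, not the real API):

    import os
    from concurrent.futures import ThreadPoolExecutor

    # Assumption: parallelism is capped by the MAX_WORKERS environment variable.
    MAX_WORKERS = int(os.getenv("MAX_WORKERS", "1"))
    executor = ThreadPoolExecutor(max_workers=MAX_WORKERS)

    def process_job(job_id: str, payload: dict) -> dict:
        # Hypothetical stand-in for the real evaluation logic in src/utils.
        return {"id": job_id, "status": "complete"}

    def enqueue(job_id: str, payload: dict):
        # Jobs beyond MAX_WORKERS wait in the executor's internal queue.
        return executor.submit(process_job, job_id, payload)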

Prerequisites

  • Python 3.8+
  • (Optional) Docker daemon if you wish to containerize the service

Installation & Setup

  1. Clone the repository:

    git clone https://github.com/yourusername/crab-webapp.git
    cd crab-webapp
    
  2. Install Python dependencies:

    pip install -r requirements.txt
    

Environment Variables

Defaults are set in src/utils/env_defaults.py (port 45003, data/ path, etc.). To override:

cp .env.example .env
# Edit .env to adjust:
# PORT=..., MAX_WORKERS=..., DATA_PATH=..., RESULTS_DIR=...
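
For reference, a rough sketch of how such defaults can be combined with a .env file using python-dotenv (the actual contents of src/utils/env_defaults.py may differ; the fallback values for MAX_WORKERS and RESULTS_DIR below are placeholders):

    import os
    from dotenv import load_dotenv

    # Load variables from .env into the environment (existing variables
    # are not overridden by default); missing keys fall back to defaults.
    load_dotenv()

    PORT = int(os.getenv("PORT", "45003"))
    DATA_PATH = os.getenv("DATA_PATH", "data/")
    MAX_WORKERS = int(os.getenv("MAX_WORKERS", "1"))   # placeholder default
    RESULTS_DIR = os.getenv("RESULTS_DIR", "results/")  # placeholder default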

Running the Application

From the project root:

python -m src.server

  • The Flask app serves static files from public/ at / and mounts API routes under /datasets and /answers via Blueprints (sketched below).
  • Then open your browser to http://localhost:45003/ (the default port).
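
A minimal sketch of what that entry point looks like under the usual Flask + Flask-SocketIO pattern (the blueprint below is an illustrative stand-in, not the real route code in src/routes/):

    from flask import Flask, Blueprint, jsonify
    from flask_socketio import SocketIO

    # Serve the static frontend from public/; SocketIO wraps the Flask app.
    app = Flask(__name__, static_folder="public", static_url_path="")
    socketio = SocketIO(app)

    @app.get("/")
    def index():
        # Equivalent in spirit to src/routes/index.py serving the UI.
        return app.send_static_file("index.html")

    # Illustrative stand-in for the real blueprints in src/routes/.
    datasets_bp = Blueprint("datasets", __name__)

    @datasets_bp.get("/download/<dataset>")
    def download(dataset):
        return jsonify({"dataset": dataset})

    app.register_blueprint(datasets_bp, url_prefix="/datasets")

    if __name__ == "__main__":
        socketio.run(app, host="0.0.0.0", port=45003)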

Using the Webapp

Download a Dataset

  1. Select Comment Generation or Code Refinement.
  2. (Optional) Check Include context to get full repo snapshots.
  3. Click Download to receive a ZIP of JSON files (see the schemas documented in public/index.html); a scripted example follows below.
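
The same download can also be scripted. A small example using the requests library (the saved file name is arbitrary):

    import requests

    BASE = "http://localhost:45003"

    # dataset is either "comment_generation" or "code_refinement";
    # withContext=true additionally includes the full repository snapshots.
    resp = requests.get(
        f"{BASE}/datasets/download/comment_generation",
        params={"withContext": "true"},
        timeout=300,
    )
    resp.raise_for_status()

    with open("comment_generation.zip", "wb") as f:
        f.write(resp.content)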

Upload Predictions

  1. Choose task type (comment or refinement).
  2. Select your JSON file (matching the dataset schema).
  3. Click Upload JSON.
  4. The server responds with a process ID, which you can use to track the submission (a scripted example follows below).
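
A scripted upload might look like the sketch below, using requests; the multipart field name "file" is an assumption, and the response is only known to include a process ID:

    import requests

    BASE = "http://localhost:45003"

    # Submit comment-generation predictions; use /answers/submit/refinement
    # for the code-refinement task. The form field name "file" is an assumption.
    with open("predictions.json", "rb") as f:
        resp = requests.post(
            f"{BASE}/answers/submit/comment",
            files={"file": ("predictions.json", f, "application/json")},
        )
    resp.raise_for_status()
    print(resp.json())  # expected to include the process ID for later polling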

Track Submission Status

  • A progress bar displays the real-time completion percentage via WebSocket events.
  • You can also poll GET /answers/status/<id> (requires the X-Socket-Id header) to retrieve the following (a scripted example is shown below):
    • status: created, waiting, processing, or complete
    • on completion: a { type, results } JSON payload
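
Putting the pieces together, a hedged sketch of tracking a submission from a script: connect a Socket.IO client (progress events arrive on that connection; their event names are not documented here) and poll the status route with the client's session id in the X-Socket-Id header, assuming that is the id the server expects:

    import time
    import requests
    import socketio  # python-socketio client

    BASE = "http://localhost:45003"

    # Connect a Socket.IO client so the server can associate progress events
    # with this session; its sid is sent as the X-Socket-Id header below.
    sio = socketio.Client()
    sio.connect(BASE)

    process_id = "..."  # the process ID returned by the submit endpoint

    while True:
        resp = requests.get(
            f"{BASE}/answers/status/{process_id}",
            headers={"X-Socket-Id": sio.sid},
        )
        body = resp.json()
        print(body.get("status"))  # created, waiting, processing, or complete
        if body.get("status") == "complete":
            print(body)  # { type, results } payload
            break
        time.sleep(5)

    sio.disconnect()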

API Endpoints

Method  Route                           Description
GET     /datasets/download/<dataset>    Download a ZIP of comment_generation or code_refinement (add ?withContext=true for the full repo).
POST    /answers/submit/comment         Submit comment-generation JSON.
POST    /answers/submit/refinement      Submit code-refinement JSON.
GET     /answers/status/<id>            Poll status or results (include the X-Socket-Id header).

Project Structure

├── data/                    # Dataset files: dataset.json, archives, etc.
├── public/                  # Static frontend
│   ├── css/style.css        # Styles
│   ├── img/crab.png         # Icon
│   ├── index.html           # UI with modals, schema docs
│   └── js/                  # Frontend scripts
│       ├── index.js         # UI logic, fetch & WebSocket handlers
│       ├── modal.js         # Modal dialogs
│       └── sorttable.js     # Table sorting
├── src/                     # Backend source
│   ├── server.py            # App entry: Flask + SocketIO
│   ├── routes/              # Blueprints
│   │   ├── index.py         # Root & health-check
│   │   ├── datasets.py      # File downloads
│   │   └── answers.py       # Submission & status endpoints
│   └── utils/               # Core logic & helpers
│       ├── env_defaults.py  # Default ENV vars
│       ├── dataset.py       # Load/validate dataset JSON
│       ├── process_data.py  # Evaluation functions
│       ├── observer.py      # WebSocket observer & queue cleanup
│       ├── queue_manager.py # Concurrency control
│       └── build_handlers.py # Build/test wrappers
├── requirements.txt         # Python libs: Flask, SocketIO, dotenv, etc.
├── TODO.md                  # Next steps and backlog
└── .env.example             # Template for environment variables

Contributing

Issues and PRs welcome! Please follow existing style, add tests for new features, and update documentation accordingly.

License

This project is licensed under [Your License Here].

Acknowledgements

  • Developed as part of a Master's thesis at USI.
  • Inspired by Dean Edwards' sortable tables (sorttable.js) and Flask-SocketIO examples.